Signing Kernel-mode Drivers with SHA-2/SHA-256

I hit a pretty frustrating problem the other day when I renewed my digital certificate for code signing and my drivers stopped working. When I renewed the certificate through VeriSign (Symantec), I was given the option of choosing a SHA-1 or a SHA-2 (with 256-bit digest) hashing algorithm for the certificate. After reading through the background information on these options, and understanding that the new and improved SHA-2 was supported on Vista and above, I determined that SHA-2 would meet my requirements and clicked the button.

I got the new cert, put it in my cert store, and reconfigured all my projects in Visual Studio to sign with the new cert. Everything seemed to be working perfectly: it installed and ran fine on my test machine. But then one of my co-workers complained that the driver wouldn’t load on his machine. After a short while I realized that I had the same problem on my machine; I just wasn’t seeing it because I had a kernel debugger attached. (Note that there are registry options that can enforce mandatory kernel-mode code signing even when a debugger is attached. Read more at MSDN.)

So now that I could reproduce the problem, I started troubleshooting the signing itself. Using “signtool.exe verify /kp”, I verified that the driver was signed, all certificates were valid, and the cross-certificate from Microsoft was also in the signing chain. Signtool claimed that everything was just fine, but the darn thing still wouldn’t load. I was getting the error ERROR_INVALID_IMAGE_HASH (0x80070241): “Windows cannot verify the digital signature for this file. A recent hardware or software change might have installed a file that is signed incorrectly or damaged, or that might be malicious software from an unknown source.” After searching the event logs for a bit, I found the Microsoft-Windows-CodeIntegrity log, which contained an Event 3004 with basically the same message as the error code above.

After exhausting my Google-Fu skills with no results, we started brainstorming about what was different between the build yesterday that worked and the one today. Somebody mentioned encryption algorithms, and the light bulbs went on in my head. Adding the specific case of SHA-2 to my searching yielded a couple of pages: Practical Windows Code and Driver Signing and PiXCL: Signing Windows 8 Drivers.

Some of my own testing showed that I couldn’t get a driver built with Visual Studio and a SHA-2 certificate to load on both Windows 7 and Windows 8. Theoretically you could sign it twice, once with each algorithm (using the /fd switch of signtool), but Visual Studio with the WDK is not really set up to do that, and I didn’t want to go monkeying with the MSBuild configuration too much.
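For the record, dual-signing by hand would look something like the following sketch (the certificate file, password, timestamp URLs, and driver name here are all placeholders; the /fd switch picks the file digest algorithm and the /as switch appends a second signature instead of replacing the first):

signtool sign /f MyCert.pfx /p MyPassword /fd sha1 /t http://sha1-timestamp-url MyDriver.sys
signtool sign /f MyCert.pfx /p MyPassword /fd sha256 /tr http://rfc3161-timestamp-url /td sha256 /as MyDriver.sys

Wiring both passes into the Visual Studio/WDK signing step is the part that requires the MSBuild monkeying I was trying to avoid.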

After calling Symantec support and describing the problem we were basically told, “oh yeah, the support for SHA-2 is not really there for a lot of systems.” You think? Maybe you could have mentioned this when we were renewing the certificate?

The moral of the story, dear reader, is to stick with the SHA-1 hash algorithm for signing your kernel-mode drivers, at least for the time being.

Finding Stuff in Modern C++

I recently had a co-worker ask me what the best way is to find an object matching certain criteria in an STL vector. We talked about the “old-school” ways of doing it, like a for loop from zero to the number of items, or a for loop using the begin and end iterators. Each of these would do the comparison and then break out of the loop if a match was found. Now, these ways work just fine, but there are some newer, more “modern” methods of accomplishing the same thing using STL algorithms. In case you haven’t seen them before, I thought I would share…

The first method uses a structure/class to wrap the comparison function; this is usually called a “functor”. Essentially it is just an object that implements operator(), which allows you to treat the object somewhat like a function. (And you can pass it to other functions, such as STL algorithms.)

The second method uses the more modern “lambda” function syntax. It allows you to just define the comparison function right inline with the code. This is the one that I prefer because it keeps the comparison logic with the calling of the find algorithm. I think one of the most important aspects of good code is that it’s easy to follow and understand: you shouldn’t have to go skipping all over the code to figure out what some piece of code is doing.

Of course at first glance, a programmer who is unfamiliar with either of these methods is going to respond “huh?” But once you get used to seeing a functor or a lambda expression, they become pretty easy to read and understand.

So without further ado, on to the code, which demonstrates a very simple example of each method:
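The original listing isn’t reproduced here, but a minimal sketch of both approaches might look something like this (the Widget type and the matching criteria are invented purely for illustration):

#include <algorithm>
#include <string>
#include <vector>

// A made-up record type to search for.
struct Widget
{
    int id;
    std::string name;
};

// Method 1: a functor, i.e. an object whose operator() does the comparison.
struct HasId
{
    explicit HasId(int id) : id_(id) {}
    bool operator()(const Widget& w) const { return w.id == id_; }
    int id_;
};

int main()
{
    std::vector<Widget> widgets = { { 1, "alpha" }, { 2, "beta" }, { 3, "gamma" } };

    // Find with the functor.
    auto byId = std::find_if(widgets.begin(), widgets.end(), HasId(2));

    // Method 2: a lambda keeps the comparison inline with the call.
    auto byName = std::find_if(widgets.begin(), widgets.end(),
                               [](const Widget& w) { return w.name == "gamma"; });

    // Either iterator equals widgets.end() when nothing matched.
    return (byId != widgets.end() && byName != widgets.end()) ? 0 : 1;
}

In both cases std::find_if returns an iterator to the first matching element, or end() if no element matches.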

Modern C++ in the Windows Kernel

In the past few months I have been playing around with Microsoft Visual Studio 2013 and trying out a bunch of the new C++ language features that it supports. Many of these were already available in Visual Studio 2012, but I have been making a focused effort to make sure I understand all of the new language features. I am loving the new stuff, and the way it makes my code easier to write.

Unfortunately, when it comes to using all these cool new features in daily work, I am a kernel developer, and most of this seems to be problematic in the kernel. Specifically, I wish there were a kernel-friendly Standard Template Library that we could use in kernel-land. The problems with using C++ in the Windows kernel have long been documented. The Windows compiler team did some work in the 2012 release to add a /kernel switch that is supposed to help with some of this, but really it seems to do no more than make sure you don’t use C++ exceptions.

What I really believe the Windows kernel community needs, however, is a concerted effort to make all the modern C++ language features, including the STL, available to kernel developers. I believe it really hurts the quality of the driver ecosystem to have every single developer writing their own lists and list-processing code, and trying to create their own wrappers for things like locking, IPC, etc.

One of the purposes of the new language features is to make it easy for developers to write good code, and to enable them to do the right thing. Kernel mode programming makes it difficult to do the right thing. And the worst part about that is that when you do the wrong thing in kernel mode, you crash the machine altogether (as opposed to user mode where you simply crash your own process and the system continues merrily on its way).

I understand that there are inherent difficulties in kernel mode programming, and things that you don’t have to worry about in normal C++ code, such as controlling which memory gets paged vs. non-paged, or dealing with code that can only run at certain IRQLs. So I believe that in addition to modern C++ and the STL, we would need some (probably Microsoft-specific) extensions to help deal with these extra little problems. For example, when you declare a template, the code gets generated where the template is actually used. If the template is used in both a non-paged function and a paged function, then we probably need a way to say deterministically how (and where) the template code should be generated.
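To make the problem concrete, here is a rough sketch of the kind of situation I mean (the record type and lookup helper are hypothetical, and this assumes a kernel-mode project compiled as C++ against the WDK headers):

#include <ntddk.h>

// Hypothetical list record.
typedef struct _MY_RECORD { LIST_ENTRY Link; ULONG Id; } MY_RECORD;

// The body of a template is emitted where it is instantiated, so it is not
// obvious which section the shared instantiation below will end up in.
template <typename TRecord>
TRecord* FindFirstMatch(LIST_ENTRY* Head, ULONG Id)
{
    for (LIST_ENTRY* entry = Head->Flink; entry != Head; entry = entry->Flink)
    {
        TRecord* record = CONTAINING_RECORD(entry, TRecord, Link);
        if (record->Id == Id)
        {
            return record;
        }
    }
    return nullptr;
}

#pragma code_seg("PAGE")
MY_RECORD* PagedLookup(LIST_ENTRY* Head, ULONG Id)      // PASSIVE_LEVEL only
{
    PAGED_CODE();
    return FindFirstMatch<MY_RECORD>(Head, Id);
}

#pragma code_seg()                                      // back to non-paged code
MY_RECORD* DispatchLookup(LIST_ENTRY* Head, ULONG Id)   // may run at DISPATCH_LEVEL
{
    return FindFirstMatch<MY_RECORD>(Head, Id);
}

If the instantiation of FindFirstMatch<MY_RECORD> happens to land in pageable code, calling DispatchLookup at elevated IRQL is a potential bugcheck, and that is exactly the kind of thing I would like to be able to express to the compiler directly.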

This is stuff that can all be done, and the Microsoft compiler guys are GOOD at it. They could make life so much easier: not just for programmers, but for consumers who are sick of having drivers that crash their machines. Come on guys… do it for freedom. Do it for the children. Just do it. Please?

Open Source Project for Testing Native API System Calls

At work I have recently been working on some system software that integrates closely with the registry. This means I require some fairly in-depth knowledge about the Registry APIs, and I also need a way to test their functionality (both with and without my software involved).

I have done this in the past by writing custom code, which then gets lost or out-of-date. This time around I decided to make a more interactive and reusable tool to help me accomplish this, and my company – FSLogix, Inc. – decided to release the tool as open source on CodePlex. I also wrote a blog article with more in-depth information about the tool over at the FSLogix Blog.

Assembly Language is Still Relevant

If you have read my blog at all, you already know that I enjoy poking around at the assembly-code level in a debugger. But I believe it’s really NOT just for the geek interested in the arcane. Understanding what your code looks like at the assembly level can help you figure out what’s really going on, and in some cases it can be a whole lot easier than doing the same task in a higher-level language such as C. (All you C# and Java guys are laughing at the idea that C is a high-level language, but the same idea applies to understanding the byte code generated from those languages as well.)

As a first example, a good defensive programming technique is to use asserts to make sure that your assumptions about the code are true. But in a case where an assumption turns out to be wrong, these asserts can make a debugging session a nightmare; in a complicated product, such an assert may make it impossible to step through code running on a different thread. Fortunately it is super-easy to disable such an assert right in the debugger by just changing an opcode: no re-compiling, no re-deploying, no getting the scenario set up to reproduce again. In the assembly code, you can find the test for your assertion, which will be followed by a jump instruction of some type. If you are testing the return value of a function for success, you may see something like:

test eax, eax    ; sets the zero flag (ZF) when eax is zero
jne <address>    ; jumps when ZF is clear, i.e. when eax is non-zero

If you look at the byte code for the jne (jump if not equal) instruction, it will likely begin with 0x75 (assuming an x86 instruction set and a short jump). Disabling that assert is as simple as changing that 0x75 to a 0x74, which changes the jne instruction to a je (jump if equal). Goodbye, assertion.

Another very closely related example is one I ran into the other day: I forgot to put a logical negation in a statement, so the if statement was testing for the exact opposite of what I intended. Setting up the scenario to test it was about a five-minute process, but using the same trick as with the assert, I was able to switch my code to do the right thing and continue debugging the remainder of the scenario.

One more example that will be more applicable to you C# guys out there. In the latest version of the C# language and .NET Framework, Microsoft added a couple of new keywords: async and await. The descriptions of the functionality and the explanations I read didn’t quite add up; being quite familiar with multi-threaded programming, I just didn’t feel like I was getting a full picture of what was really happening. (I actually asked about it at a developers conference, and the speaker said “I don’t know how they do it, it’s just magic.”) So I wrote a quick sample, fired up a debugger, and looked at the generated byte code to get the complete picture of what was happening. It works about as I expected it would, but without understanding assembly-level code I never would have been able to figure that out.

So the next time you’re tempted to think that assembly language programming is a dead art, think again.

Localization Made Awesome

I have had the opportunity to work for a number of different software companies, each of which has localized its product into different languages. These opportunities have led me to experience a number of different processes around localization, and a great deal of pain surrounding the whole issue. Localization sucks. However, I don’t believe that it has to suck, and I think there are processes that, if put in place, can make it relatively painless.

I believe the biggest problem I have seen in localization is the interweaving of responsibilities. Let me describe the process from one company I have worked for. The developers create their resources in an English-only version, and the sources for these resources are checked into source control. Then, at some predetermined point, a developer packages up all the resource source files and sends them off to the localization team. They have some database that the strings are loaded into, translated (often by a third party), and then new source files are generated for the various languages. The source files are then sent back to the developers, who have to check them in and test them to make sure that they compile. Net result: days of turnaround time for even trivial localization changes, and hours of wasted time.

The root issue here is that you have developers who are involved in localization, which is not their primary responsibility, and localizers who are involved in development (by producing source files that have to be compiled). I think that we can resolve these issues and streamline the process by simply separating the functions more distinctly.

First, we need to have a method whereby localization can take place during the build process without any manual intervention. My first thought here is that we could create a program, run as part of the build system, that localizes an English source file into another language’s source file. The data that this program consumes is a translation database that is checked into the source control system. The localization team is responsible simply for checking in a new copy of the database when they have updates. This system is a good improvement on the original, but it still leaves the problem that the files have to be compiled after the localization process changes them. One wrong string (with a misplaced quote character, etc.) and the build fails.

A better way to solve this problem is to have the build system build the initial binaries. Then the localization program is pointed at the binary file containing the English resources; it edits this binary file and produces a copy of it with updated resources. In this way, we have isolated the localization process from the development process as much as possible.

The program that is responsible for performing the localization of binaries should also produce, as part of its output, a file that shows which resources are new or updated; these resources need further localization work. There also needs to be a tool that the localizers use to edit the localization database, and it should be able to consume this build output to import the new strings into their database. In this way the localizers never have to deal directly with the product: they only deal with the inputs to a tool that they own, and the outputs from a tool that they own.

Another side improvement to the general localization process is the ability to do a large degree of localization testing before any translation takes place. This is what I have heard referred to as “pseudo-localization”. The localizing tool should be capable of producing a “pseudo-localized” binary that does not contain ANY real translations. Rather, it should take each existing string resource and generate enough gibberish to pad the value and make it long enough to simulate a translated string. (Usually strings grow by something like 30% or so when translated to certain languages.) The product can then be installed and tested with these pseudo-localized strings to find spacing issues, etc., long before any translation work is done.
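As a rough sketch of just the padding step (the function name and filler characters are made up, and a real tool would of course rewrite the string table inside the binary rather than operate on loose strings):

#include <string>

// Pad a string by roughly 30% with obvious gibberish and bracket it so
// hard-coded (untranslated) strings stand out during testing.
std::wstring PseudoLocalize(const std::wstring& original)
{
    std::wstring result = L"[" + original;

    const size_t padding = (original.size() * 30 + 99) / 100;  // ~30% growth, rounded up
    const std::wstring filler = L"!?#";
    for (size_t i = 0; i < padding; ++i)
    {
        result += filler[i % filler.size()];
    }

    return result + L"]";
}

Running the product with every resource passed through something like this makes truncated or clipped strings jump out immediately, long before a translator sees a single word.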

Programming Virtue: Complete Coding

One of the virtues of programming that I have found very useful and have been trying to discipline myself to use is what I like to call complete coding. This is the practice of writing shipping quality code ALL the time.

Most programmers try out an algorithm or put a thought into code very quickly to test the idea. There are two major problems with this practice. All too often the code ends up not being cleaned up and sits in the live project until somebody happens to find a bug in it. Or, the code gets tested in its quick and dirty form, then cleaned up, and checked in without further testing. Both of these practices put bugs into the code when they don’t need to be there. (The bugs are just a result of sloppy procedures and are fairly easy to avoid.)

So with these thoughts in mind, I have a few recommendations to make to programmers (myself included).

1- Always write error handling code as you are writing the main code paths. If you don’t, it probably won’t get done, and it will be a while before the bugs get found, since those error code paths by definition don’t get executed during normal operation.

2- Always write the code in such a way that you can understand it easily. If you have to think about it to write it, you will have to think about it to maintain it. And most importantly, it will be much more difficult to be sure you wrote it correctly the first time. (Thinking is just too darn hard, he he.)

3- Consider reuse of the code in the initial writing. Sometimes this means hiding implementation details behind an interface. Sometimes it means doing the refactoring work NOW, even though you desperately want to just finish the piece of code you’re working on.

4- Always run through the code in your head (or even better, in a debugger) with sample input to make sure the code is complete. Think about the what-ifs (the things you never think will happen, but sooner or later they will).

Well, there are probably many more, which is why there are entire books about this subject. But these are a few that I have noticed as being extremely useful. Let me know if you have other “complete coding” practices that others could benefit from.