Open Source Project for Testing Native API System Calls

At work I have recently been working on some system software that integrates closely with the registry. This means I require some fairly in-depth knowledge about the Registry APIs, and I also need a way to test their functionality (both with and without my software involved).

I have done this in the past by writing custom code, which then gets lost or out-of-date. This time around I decided to make a more interactive and reusable tool to help me accomplish this, and my company – FSLogix, Inc. – decided to release the tool as open source on CodePlex. I also wrote a blog article with more in-depth information about the tool over at the FSLogix Blog.

Assembly Language is Still Relevant

If you have read my blog at all, you already know that I enjoy poking around at the assembly-code level in a debugger. But I believe it’s really NOT just for the geek interested in the arcane. Understanding what your code looks like at the assembly level can help you figure out what’s really going on, and in some cases it can be a whole lot easier than doing the same task in a higher-level language, such as C. (All you C# and Java guys are laughing at the idea that C is a high-level language, but this idea can be applied to understanding the byte code generated from those languages as well.)

As a first example, a good defensive programming technique is to use asserts to make sure that your assumptions about the code are true. In a case where your assumption turns out to be wrong, these asserts can make a debugging session a nightmare. In a complicated product, such an assert may make it impossible to step through code running on a different thread. Fortunately it is super-easy to disable such an assert right in the debugger by just changing an opcode. No re-compiling, no re-deploying, no setting the scenario up to reproduce again. In the assembly code, you can find the test for your assertion, which will be followed by a jump instruction of some type. If you are testing the return value of a function for success, you may see something like:

test eax, eax
jne <address>

If you look at the encoded bytes for the jne (jump not equal) instruction, they will likely begin with 0x75 (assuming an x86 instruction set and a short jump). Disabling that assert is as simple as changing that 0x75 to a 0x74, which changes the jne instruction to a je (jump equal). Because the condition is now inverted, the case that used to fire the assert falls through quietly. Good bye assertion.
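To make the byte flip concrete, here is a minimal Python sketch that works on a copy of the instruction bytes rather than on live memory. It relies on the fact that the x86 short conditional jumps occupy opcodes 0x70 through 0x7F in pairs, so toggling the low bit inverts the condition. The function name and the sample buffer are hypothetical, made up only for illustration.

```python
def invert_short_jcc(code: bytearray, offset: int) -> int:
    """Invert the short conditional jump at `offset`; return the new opcode."""
    opcode = code[offset]
    # Short Jcc instructions are 0x70..0x7F; anything else is not handled here.
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError(f"not a short conditional jump: {opcode:#04x}")
    # A condition and its negation differ only in the low bit (jne=0x75, je=0x74).
    code[offset] = opcode ^ 0x01
    return code[offset]


# test eax, eax (85 C0) followed by jne +0x10 (75 10); illustrative bytes only.
sample = bytearray(b"\x85\xc0\x75\x10")
invert_short_jcc(sample, 2)
print(sample.hex())
```

In the debugger the equivalent is a one-byte memory edit at the address of the jump; the sketch only shows which byte changes and why.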

I ran into another closely related example the other day. I forgot to put a logical negation in a statement, so the if statement was testing for the exact opposite of what I intended. Setting up the scenario to test it was about a 5 minute process. But using the same trick as with the assert, I was able to switch my code to do the right thing and continue debugging the remainder of the scenario.

One more example that will be more applicable to you C# guys out there. In the latest version of the C# language and .NET Framework, Microsoft added a couple of new keywords: async and await. Well the descriptions of the functionality and the explanations I read didn’t quite add up. Being quite familiar with multi-threaded programming, I just didn’t feel like I was getting a full picture of what was really happening. (I actually asked about it at a developers conference, and the speaker said “I don’t know how they do it, it’s just magic.”) So I wrote a quick sample, fired up a debugger and looked at the generated byte code, to get the complete picture of what was happening. It works about as I expected it would, but without understanding assembly level code, I never would have been able to figure that out.

So the next time you’re tempted to think that assembly language programming is a dead art, think again.

IoGetCurrentIrpStackLocation in the debugger

Today I had occasion to debug a problem with some IRP handling code in my driver. In the particular debugger session I found myself in, I wanted to examine some of the Irp parameters, found in the current stack location. Unfortunately I had only a pointer to the Irp in this code, and therefore needed to figure out how to find the stack location pointer using the debugger. Fun!

The function that does this in code is IoGetCurrentIrpStackLocation, which I decided to disassemble. The first part of the function basically just checks the StackCount and CurrentLocation members of the Irp to make sure that everything is ok. (This fires an assert if it doesn’t check out.) Then near the bottom of the function, we find

drv!IoGetCurrentIrpStackLocation+0x42:
23318 9dc43ea2 8b4d08          mov     ecx,dword ptr [ebp+8]
23318 9dc43ea5 8b4160          mov     eax,dword ptr [ecx+60h]
23319 9dc43ea8 8be5            mov     esp,ebp
23319 9dc43eaa 5d              pop     ebp
23319 9dc43eab c20400          ret     4

So first this moves the Irp pointer into ecx, then reads the pointer stored at offset 0x60 within that structure (which happens to be outside the range of the documented structure) and puts it into eax for return to the caller. So I try that in my debugger and compare with the output of the !irp command.

kd> !irp 9f104f68
Irp is active with 1 stacks 1 is current (= 0x9f104fd8)
 No Mdl: System buffer=9f08cbf0: Thread 88e05558:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
>[  e, 0]   5  1 88dcbd18 88e05ab8 00000000-00000000    pending
	       \FileSystem\FSLX
			Args: 0000040c 00000000 94000004 00000000

kd> dd 9f104f68+60 L1
9f104fc8  9f104fd8

kd> db 9f104fd8
9f104fd8  0e 00 05 01 0c 04 00 00-00 00 00 00 04 00 00 94  ................
9f104fe8  00 00 00 00 18 bd dc 88-b8 5a e0 88 00 00 00 00  .........Z......
9f104ff8  00 00 00 00 15 15 15 15-?? ?? ?? ?? ?? ?? ?? ??  ........????????

At this point, I realized that the structure I needed to look at really doesn’t provide anything that’s not already provided in the !irp output. The Args output of that command corresponds to the members of the IO_STACK_LOCATION.Parameters union. In this case, I am looking at a device control Irp, so these parameters are OutputBufferLength (40c), InputBufferLength (0), IoControlCode (94000004), and Type3InputBuffer (0).
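As a cross-check, the raw bytes from the db output can be decoded by hand. The sketch below is only an illustration, assuming the 32-bit IO_STACK_LOCATION layout (four one-byte header fields, four DWORDs of Parameters, then the device object, file object, completion routine and context pointers). The variable names are mine, not something the debugger prints.

```python
import struct

# First 36 bytes of the `db 9f104fd8` output (one x86 IO_STACK_LOCATION).
raw = bytes.fromhex(
    "0e0005010c0400000000000004000094"
    "0000000018bddc88b85ae08800000000"
    "00000000"
)

# "<4B8I": 4 single bytes followed by 8 little-endian 32-bit values.
(major, minor, flags, control,
 out_len, in_len, ioctl, type3_input,
 device_object, file_object, completion, context) = struct.unpack("<4B8I", raw)

print(f"cmd=[{major:x}, {minor:x}] flg={flags:x} cl={control:x}")
print(f"Args: {out_len:08x} {in_len:08x} {ioctl:08x} {type3_input:08x}")
print(f"Device={device_object:08x} File={file_object:08x}")
```

Each decoded field lines up with a column of the !irp output, which is another way of seeing that !irp already shows everything stored in the stack location.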

So I guess the bottom line of this post is that !irp is cool and does just what you need it to. I just had to poke around a little bit before I realized it.

More debugging-geekness

So yesterday I was troubleshooting some window creation issues, and had to fool around in the kernel side of window creation, down in win32k.sys. Specifically I was looking at window class registration, which happens when you call RegisterClassEx from your Windows app. Down in the kernel, some magic happens with creating Atoms as part of the window class registration. I traced through a bunch of win32k.sys routines to figure out where in memory they were storing this, and then I wanted to dump the table. After dumping about 4 of the entries manually, I got bored and wrote this little gem:

r $t0=poi(poi(win32k!UserAtomTableHandle)+c)
.for ( r $t1=0; @$t1 < @$t0; r $t1 = @$t1 + 1 ) { du poi(poi(win32k!UserAtomTableHandle)+10+( @$t1 * 4))+c }

Basically, this uses the symbol win32k!UserAtomTableHandle to find the length of the table, and then uses a for loop to go through it, calculating the offset of each item and then dumping its string value. On my Windows 7 system it produced something like this:
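For readers who find the debugger syntax dense, the same walk can be modelled in Python against a made-up memory image. The offsets (count at +0xc, pointer array at +0x10, string at +0xc inside each entry) are taken straight from the command above. The helper names, the addresses and the two-entry table are fabricated for illustration, and `table` stands for the already dereferenced value of win32k!UserAtomTableHandle.

```python
import struct


def read_u32(mem, addr):
    """Equivalent of poi() for a 32-bit little-endian target."""
    return struct.unpack_from("<I", mem, addr)[0]


def read_wstr(mem, addr):
    """Equivalent of du: read UTF-16 text up to the terminating NUL."""
    end = addr
    while end + 2 <= len(mem) and mem[end:end + 2] != b"\x00\x00":
        end += 2
    return mem[addr:end].decode("utf-16-le")


def dump_table_names(mem, table):
    count = read_u32(mem, table + 0xC)               # r $t0 = poi(table + c)
    names = []
    for i in range(count):                           # .for loop over $t1
        entry = read_u32(mem, table + 0x10 + i * 4)  # poi(table + 10 + i*4)
        names.append(read_wstr(mem, entry + 0xC))    # du entry + c
    return names


# Fabricated memory image: a table at 0x20 holding two entries.
mem = bytearray(0x200)
TABLE = 0x20
struct.pack_into("<I", mem, TABLE + 0xC, 2)
for i, (entry, name) in enumerate([(0x100, "Static"), (0x140, "Edit")]):
    struct.pack_into("<I", mem, TABLE + 0x10 + i * 4, entry)
    data = name.encode("utf-16-le")
    mem[entry + 0xC:entry + 0xC + len(data)] = data

print(dump_table_names(mem, TABLE))
```

Like the one-liner, this sketch does not check for empty slots in the pointer array.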

8c2a3d1c  "Native"
878b0c9c  "ObjectLink"
87e1e18c  "AeroWizardInternalFrameButtonCli"
87e1e1cc  "cked"
878cb314  "Static"
878cb104  "DDEMLUnicodeClient"
9620faec  "DataObject"
8c2affa4  "ACTIVATESHELLWINDOW"
8c2afe34  "FlashWState"
9620fa84  "SysCH"
8c2b2ce4  "PBrush"
8c3b8f24  "MSUIM.Msg.RpcSendReceive"
878bb7b4  "SysIC"
878cb1ec  "DDEMLEvent"
878bb784  "SHELLHOOK"
8c2b2e0c  "Custom Link Source"
9159dc84  "AltTab_KeyHookWnd"
91529084  "Search Box"
878bb6f4  "SysDT"
8c2b2dd4  "Link Source"
9620fb8c  "FileName"
87e35b0c  "GDI+ Accessibility"
878bb664  "SysWNDO"
878bb854  "DDEMLAnsiServer"
87e0c0bc  "SysLink"
9620fb24  "NetworkName"
8c2cde3c  "USER32"
8c2b2d14  "OleDraw"
9620fb5c  "FileNameW"
8c2b2bec  "MoreOlePrivateData"
8c282434  "Edit"
9620fbbc  "Binary"
878cb374  "OleClipboardPersistOnFlush"
8c2a3d4c  "OwnerLink"
878cb2e4  "ListBox"
8c2b2e54  "Embed Source"
878bb634  "SysIMEL"
878cb224  "ComboLBox"

Neato Debugging Trick

I had to debug an annoying little problem today that I thought might be worth writing about. I was interested in walking through some code that was failing, but the same code was getting called in a recursive loop, so there were literally hundreds of successful runs that I was not interested in prior to the single failure I did care about.

Now a normal user-mode developer might just add some special code at the point of failure to detect the failure and call the failing function again. Nice and easy. But that’s really not any fun, and when you’re doing kernel debugging, writing some new code and getting it running on the machine is not quite as simple (it’s not hard, just more time consuming).

Enter this neato debugging trick…

bp address "j (dwo(status)!=0) 'r @rip=fffff880`02b5bd1f'; 'gc'"

Basically this executes a conditional test (the “j” command) each time the breakpoint is hit. If the DWORD value represented by the variable named ‘status’ is non-zero, then I know I’ve hit the failure condition. In that case, I just adjust the instruction pointer back up to before the failing function call, leaving me right where I am ready to trace into the function and see the failure. Otherwise, the breakpoint essentially just hits ‘Go’ to continue on to the next hit.

The syntax here is a bit rough, and would have to be modified if your program isn’t always at the same code location (since I hard-coded the rip register). It could be replaced with an offset from the current location to be a bit more elegant. But since I was working on a driver, it was always in memory and at the same place, so I was lazy. (A habit that always pays off immediately.)

A Better Way to Find an Allocation

Quite some time back I wrote a blog entry on how to make conditional breakpoints. In particular, I was looking for a way to find when a certain pool tag allocation occurred. Well, it turns out there is a MUCH better way of doing it than what I posted in that blog entry.

The latest issue of “The NT Insider” from OSR just came out, and has an article on debugging techniques. Apparently there is a global value that you can set to a given pool tag that you would like to break on allocation for. So if I were looking for the tag ‘Test’ I could set the following value in the debugger:

kd> ed nt!PoolHitTag 'tseT'

Note that this technique will only work if pool tagging is enabled on the system. But nevertheless, this technique is a lot more efficient and faster than the method I showed previously. (Although the technique could still be good for other scenarios.)
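The reversed spelling of the tag trips people up, so here is a small Python sketch of the byte-order reasoning. It assumes a little-endian target and that a four-character literal is packed with its first character in the most significant byte, the way the Microsoft C compiler packs multi-character constants. Both helper names are hypothetical.

```python
def tag_to_dword(tag: str) -> int:
    """DWORD whose in-memory bytes spell `tag` on a little-endian machine."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("pool tags are exactly four characters")
    return int.from_bytes(raw, "little")


def tag_to_literal(tag: str) -> str:
    """Spelling to type as a four-character literal so memory reads `tag`."""
    return tag[::-1]


value = tag_to_dword("Test")
literal = tag_to_literal("Test")
print(f"{value:#010x} is written as '{literal}'")
```

So the tag that pool tools display as Test is stored as the bytes 54 65 73 74, and the literal has to be typed backwards to produce that DWORD.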

Getting the Address of a Private Kernel Routine

When writing a driver, there are times when you may want to call a function if it is available on the version of the operating system you are running on, but it may not always be available. For example, I recently came across a need to use the ZwRenameKey function which was added in Windows XP. My driver also runs on Windows 2000 so I need to dynamically detect and use this routine if it is available. Enter the handy function MmGetSystemRoutineAddress. But wait… it doesn’t seem to work for ZwRenameKey, which is apparently not made public and therefore cannot be gotten using that routine.

But since I really need to use it (don’t ask why… long story) I’m going to have to find another way to get the address of the routine. The first step is to get the address of the service descriptor table.

kd> x nt!KeServiceDescriptorTable
8089f460 nt!KeServiceDescriptorTable = <no type information>

This table actually has four entries, the first of which is used for the Native API. (See Microsoft Windows Internals, Fourth Edition, page 122 for more information about these structures.) So we get the address from the first entry.

kd> dd 8089f460 L4
8089f460 80830bb4 00000000 00000128 80831058

Now we just need to dump this table with symbols so we can find the routine we’re interested in.

kd> dps 80830bb4 L120
80830bb4 80917510 nt!NtAcceptConnectPort
80830bb8 80962516 nt!NtAccessCheck
80830bbc 809667ce nt!NtAccessCheckAndAuditAlarm
80830bc0 80962548 nt!NtAccessCheckByType
80830bc4 80966808 nt!NtAccessCheckByTypeAndAuditAlarm
80830bc8 8096257e nt!NtAccessCheckByTypeResultList
80830bcc 8096684c nt!NtAccessCheckByTypeResultListAndAuditAlarm
80830bd0 80966890 nt!NtAccessCheckByTypeResultListAndAuditAlarmByHandle

80830ed4 808b0f88 nt!NtRenameKey

And then a little bit of math will tell us the offset. With this offset we can write some code to go to this offset and get the address of the routine we need.

kd> ? (80830ed4 - 80830bb4) / 4
Evaluate expression: 200 = 000000c8
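The same arithmetic, written out as a short Python sketch. The two addresses come from the dumps above and the slot size is 4 bytes because this is a 32-bit system. The function name is made up for illustration.

```python
TABLE_BASE = 0x80830BB4       # first entry of the service table (from dd)
RENAME_KEY_SLOT = 0x80830ED4  # slot that holds nt!NtRenameKey (from dps)
POINTER_SIZE = 4              # 32-bit kernel, so each slot is 4 bytes


def service_index(slot: int, base: int, pointer_size: int = POINTER_SIZE) -> int:
    """Index of the table slot located at address `slot`."""
    delta = slot - base
    if delta < 0 or delta % pointer_size:
        raise ValueError("slot is not an aligned entry of this table")
    return delta // pointer_size


index = service_index(RENAME_KEY_SLOT, TABLE_BASE)
print(f"{index} = {index:#010x}")
```

Going the other way, the slot to read at run time is the table base plus the index times the pointer size.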

Note that this is not a great thing to have to do. These offsets are not guaranteed to stay the same, and they are definitely different between versions of the operating system.

Windbg: Disabling ASSERTs

Sometimes you have an ASSERT in your code and for some reason it starts being hit all the time. It’s good to know about it the first time, but if it’s happening hundreds of times a second, it can make debugging (or just replacing the code with something that doesn’t ASSERT) very difficult. Enter windbg and the power to kill an ASSERT.

When the debugger breaks in on an ASSERT, look at your disassembly. You will probably see something similar to the following. Yours may look a little different, depending on what you’re ASSERTING on, but it will be similar.

f6bc9923 7414 jz driver!function+0x1d9 (f6bc9939)
f6bc9925 6819050000 push 0x519
f6bc992a 68c096bcf6 push 0xf6bc96c0
f6bc992f e874010300 call driver!DbgPrint (f6bf9aa8)
f6bc9934 83c408 add esp,0x8
f6bc9937 cd01 int 01

The key element here is the first line which is actually doing the test on the ASSERTion. If the ASSERT comparison evaluates to zero (i.e., the ASSERT succeeds), then it jumps over the call to output a debugger string and the int 01 which breaks into the debugger.

So what we want it to do is always skip that section. If we enter windbg into the assembler mode using the “a” command (of course telling it what address to assemble into), and then replace the jz instruction with a jmp instruction, that’s all we need.

kd> a f6bc9923
f6bc9923 jmp driver!function+0x1d9
jmp driver!function+0x1d9
f6bc9925

kd> g

Note that the only effect on the binary code is to change the first byte from a 74 to an EB. You could accomplish the same thing that way instead of using the assembler, which is as simple as:

kd> eb f6bc9923 eb
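The one-byte patch works because jz and the short form of jmp share the same layout: one opcode byte followed by a signed 8-bit displacement measured from the end of the instruction. The Python sketch below checks this against the listing above. The helper name is hypothetical and only two-byte short jumps are handled.

```python
def short_jump_target(address: int, insn: bytes) -> int:
    """Target of a 2-byte short jump: next instruction + signed displacement."""
    disp = int.from_bytes(insn[1:2], "little", signed=True)
    return (address + 2 + disp) & 0xFFFFFFFF


ADDRESS = 0xF6BC9923
original = bytes.fromhex("7414")        # jz  driver!function+0x1d9
patched = bytes([0xEB]) + original[1:]  # jmp to the same place

print(f"{short_jump_target(ADDRESS, original):08x}")
print(f"{short_jump_target(ADDRESS, patched):08x}")
```

Both forms resolve to the address shown in parentheses in the disassembly, so only the opcode byte needs to change.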

Windbg: Conditional Breakpoints with string pattern matching

Today I ran into a need to set a breakpoint that would only stop when a certain string was encountered. In the past I have just modified the code to test for the string, and then update my driver, reboot, etc. A very time consuming process. So today I decided that I wanted to figure out how to do it right in the debugger. I knew it was possible from comments, but didn’t know how to implement it.

First of all, since this runs a non-trivial set of commands each time the breakpoint is hit, I placed the commands in a secondary file. There may be a way to get this all on a single line breakpoint command, but I don’t see it. So the breakpoint we create is just going to run the commands from the secondary file. The command to create the breakpoint is something like this:

bp driver!functionName "$$< C:\debugCommands.txt"

Then comes the important part: the actual commands that get executed. We need to evaluate a string against a pattern, which the masm expression evaluator can handle using the “$spat” operator. The hard part is that at first glance it only appears to work with string literals. So $spat( "Big string", "*str*" ) will work, while $spat( poi(variableName), "*str*" ) just laughs mockingly at you.

The key here is to assign the string to an alias which will then allow it to be evaluated by the $spat command. So using our example comparison, the commands in the secondary file look like:

as /mu ${/v:MyAlias} poi(variableName);
.if ( $spat( "${MyAlias}", "*str*" ) == 0 ) { g }

The commands evaluate the string. If a match is not found, the g[o] command is executed; otherwise execution stops at the breakpoint where the pattern was found. Note that there are much more complicated pattern matching expressions available as well.
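The decision the breakpoint makes can be modelled in a few lines of Python. This is only an analogy: fnmatchcase from the standard library stands in for $spat, and its wildcard rules are not guaranteed to match the debugger’s in every detail. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def breakpoint_action(value: str, pattern: str) -> str:
    """Return "break" when the string matches the pattern, otherwise "go"."""
    return "break" if fnmatchcase(value, pattern) else "go"


for text in ("Big string", "Other text"):
    print(text, "->", breakpoint_action(text, "*str*"))
```

The point is the shape of the logic: resume on every non-matching hit and stop only when the string of interest shows up.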

Windbg: Conditional breakpoints

Last week I found a very strange bug in a kernel driver I do some work on. It only seems to appear when Google Desktop or MSN Desktop Search is installed. After some period of time the graphics display just starts doing some wacky stuff: fonts don’t display, repaints don’t work quite right, just general wackiness.

So since this is only happening after some period of time on a machine, we started looking at resources. Using the pooltag tool from sysinternals we found one particular tag that looked out of control. The tag was FSrN, which according to the lists I could find is “File-system runtime”. Helpful, no?

So that brings me to the meat of the story. I wanted to find out who was allocating that and what the call stack looked like at the time. In order to do this I needed to put a breakpoint on ExAllocatePoolWithTag, but that gets called ALL the time. I only wanted to break when we hit the right tag value. So I came up with the following command to set the breakpoint in windbg:

bp nt!ExAllocatePoolWithTag "j (Poi(ss:esp+c) = 'NrSF') 'kb; db ss:esp+c'; 'gc'"

You can check out the windbg help for more information on conditional breakpoints. It will explain what the above command means. I wanted to post it since I couldn’t find any good example of how to do this with a non-integer value.
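To show why the condition reads the DWORD at esp+c, here is a small Python model of the stack at the moment the breakpoint fires. On 32-bit x86 the function is entered with the return address at esp+0 and its three arguments (PoolType, NumberOfBytes, Tag) at esp+4, esp+8 and esp+c. The frame contents and helper names below are fabricated for illustration.

```python
import struct

# Tag shown as "FSrN" by pool tools; as a 4-character literal it reads 'NrSF'.
WANTED_TAG = int.from_bytes(b"FSrN", "little")


def tag_argument(stack: bytes, offset: int = 0xC) -> int:
    """Equivalent of Poi(ss:esp+c) when the function is first entered."""
    return struct.unpack_from("<I", stack, offset)[0]


def should_break(stack: bytes) -> bool:
    return tag_argument(stack) == WANTED_TAG


def make_frame(tag: bytes) -> bytes:
    # Fabricated frame: return address, PoolType, NumberOfBytes, Tag.
    return struct.pack("<4I", 0x80501234, 1, 0x40, int.from_bytes(tag, "little"))


print(should_break(make_frame(b"FSrN")), should_break(make_frame(b"Test")))
```

Only the frame carrying the wanted tag satisfies the condition; every other allocation takes the 'gc' branch and keeps running.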