Undocumented Query Directory Flags

Last week I ran into a strange Windows file system behavior that I couldn’t find any information on anywhere. Since it’s always extremely frustrating to try to figure things out when there’s no information available, I thought I would share what I found. The bug we were experiencing had to do with a directory query operation over the network (e.g., when you run 'dir \\localhost\c$\Windows' from a command window). If the directory doesn’t have many files in it, this works just fine, but if it is a large directory, as in the example above, then the IRPs that are issued to the file system are a bit strange. Our filter driver wasn’t handling these quite correctly, and the result was that if you queried the directory using the local name you’d get ~200 files, and if you used the UNC name you’d only get ~150 files.

After digging into this with a coworker, we found an unexpected style of IRP. When performing the directory query over the network, the SRV kernel component issues an IRP_MN_QUERY_DIRECTORY with an IrpSp->Parameters.QueryDirectory.FileName and IrpSp->Parameters.QueryDirectory.FileIndex combination that seems to essentially reset the point at which the enumeration continues. The sequence we were seeing goes something like this:
Continue reading “Undocumented Query Directory Flags”
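
(For reference, and not the sequence in question or our filter’s actual code: this is roughly where a legacy file-system filter sees those fields in its IRP_MJ_DIRECTORY_CONTROL path. The logging helper below is purely illustrative.)

#include <ntifs.h>

// Illustrative sketch: dump the query-directory parameters a legacy filter sees.
VOID LogQueryDirectoryParameters(PIRP Irp)
{
    PIO_STACK_LOCATION irpSp = IoGetCurrentIrpStackLocation(Irp);

    if (irpSp->MinorFunction == IRP_MN_QUERY_DIRECTORY) {
        // The search pattern; the cast covers older DDK headers that declare it differently.
        PUNICODE_STRING searchName =
            (PUNICODE_STRING)irpSp->Parameters.QueryDirectory.FileName;

        if (searchName != NULL) {
            DbgPrint("QueryDirectory: Name=%wZ Index=0x%x Flags=0x%x\n",
                     searchName,
                     irpSp->Parameters.QueryDirectory.FileIndex,
                     irpSp->Flags);  // SL_RESTART_SCAN, SL_INDEX_SPECIFIED, SL_RETURN_SINGLE_ENTRY
        } else {
            DbgPrint("QueryDirectory: Name=<none> Index=0x%x Flags=0x%x\n",
                     irpSp->Parameters.QueryDirectory.FileIndex,
                     irpSp->Flags);
        }
    }
}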

Directory Notification

I have had a number of situations over the last five years where I have had to write code that detects changes made in a directory. Some of it has been for testing my own directory notification code at the file-system level, and some has been for real implementation work up in user-mode code.

The documentation can be a lot to wade through, as there are a number of different ways that such notification can be accomplished. I won’t go into the reasons for using each (check the MSDN documentation for some information on that), but I wanted to post some simple samples of using each for anyone who might be interested.

The sample covers five different methods, all using Win32 APIs. None of them does anything special beyond printing basic information about the changes that are detected.
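
For a taste, one of the obvious candidates, FindFirstChangeNotification, looks roughly like this (a minimal sketch, not one of the actual samples; it only reports that something in the directory changed, not what):

#include <windows.h>
#include <stdio.h>

int wmain(int argc, wchar_t* argv[])
{
    if (argc < 2) {
        wprintf(L"Usage: %ls <directory>\n", argv[0]);
        return 1;
    }

    // Watch for file-name and last-write changes in the given directory only.
    HANDLE hChange = FindFirstChangeNotificationW(
        argv[1],
        FALSE,   // don't watch subdirectories
        FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_LAST_WRITE);
    if (hChange == INVALID_HANDLE_VALUE) {
        wprintf(L"FindFirstChangeNotification failed: %lu\n", GetLastError());
        return 1;
    }

    for (;;) {
        if (WaitForSingleObject(hChange, INFINITE) != WAIT_OBJECT_0) {
            break;
        }
        // This method only says that *something* changed, not what;
        // ReadDirectoryChangesW is the one that reports the specifics.
        wprintf(L"Change detected in %ls\n", argv[1]);
        if (!FindNextChangeNotification(hChange)) {
            break;
        }
    }

    FindCloseChangeNotification(hChange);
    return 0;
}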
Continue reading “Directory Notification”

More debugging-geekness

So yesterday I was troubleshooting some window creation issues, and had to fool around in the kernel side of window creation, down in win32k.sys. Specifically I was looking at window class registration, which happens when you call RegisterClassEx from your Windows app. Down in the kernel, some magic happens with creating Atoms as part of the window class registration. I traced through a bunch of win32k.sys routines to figure out where in memory they were storing this, and then I wanted to dump the table. After dumping about 4 of the entries manually, I got bored and wrote this little gem:

r $t0=poi(poi(win32k!UserAtomTableHandle)+c)
.for ( r $t1=0; @$t1 < @$t0; r $t1 = @$t1 + 1 ) { du poi(poi(win32k!UserAtomTableHandle)+10+( @$t1 * 4))+c }

Basically, this uses the symbol win32k!UserAtomTableHandle to find the length of the table, and then uses a for loop to go through, calculating the offset of each item and then dumping its string value. On my Windows 7 system it produced something like this:

8c2a3d1c  "Native"
878b0c9c  "ObjectLink"
87e1e18c  "AeroWizardInternalFrameButtonCli"
87e1e1cc  "cked"
878cb314  "Static"
878cb104  "DDEMLUnicodeClient"
9620faec  "DataObject"
8c2affa4  "ACTIVATESHELLWINDOW"
8c2afe34  "FlashWState"
9620fa84  "SysCH"
8c2b2ce4  "PBrush"
8c3b8f24  "MSUIM.Msg.RpcSendReceive"
878bb7b4  "SysIC"
878cb1ec  "DDEMLEvent"
878bb784  "SHELLHOOK"
8c2b2e0c  "Custom Link Source"
9159dc84  "AltTab_KeyHookWnd"
91529084  "Search Box"
878bb6f4  "SysDT"
8c2b2dd4  "Link Source"
9620fb8c  "FileName"
87e35b0c  "GDI+ Accessibility"
878bb664  "SysWNDO"
878bb854  "DDEMLAnsiServer"
87e0c0bc  "SysLink"
9620fb24  "NetworkName"
8c2cde3c  "USER32"
8c2b2d14  "OleDraw"
9620fb5c  "FileNameW"
8c2b2bec  "MoreOlePrivateData"
8c282434  "Edit"
9620fbbc  "Binary"
878cb374  "OleClipboardPersistOnFlush"
8c2a3d4c  "OwnerLink"
878cb2e4  "ListBox"
8c2b2e54  "Embed Source"
878bb634  "SysIMEL"
878cb224  "ComboLBox"
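
Breaking the one-liner down (the offsets here are just what I traced out on this particular x86 Windows 7 build, so don’t count on them being stable):

$$ poi(win32k!UserAtomTableHandle) is the atom table; the DWORD at offset +0xc is the entry count
r $t0=poi(poi(win32k!UserAtomTableHandle)+c)
$$ The array of entry pointers starts at offset +0x10 (4-byte entries on x86),
$$ and each entry's name string lives at offset +0xc, which is what du prints
.for ( r $t1=0; @$t1 < @$t0; r $t1 = @$t1 + 1 ) { du poi(poi(win32k!UserAtomTableHandle)+10+( @$t1 * 4))+c }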

Neato Debugging Trick

I had to debug an annoying little problem today that I thought might be worth writing about. I was interested in walking through some code that was failing, but the same code was getting called in a recursive loop, so there were literally hundreds of successful runs that I was not interested in prior to the single failure I did care about.

Now a normal user-mode developer might just add some special code at the point of failure to detect it and call the failing function again. Nice and easy. But that’s really not any fun, and when you’re doing kernel debugging, writing new code and getting it running on the machine is not quite as simple (it’s not hard, just more time-consuming).

Enter this neato debugging trick…

bp address "j (dwo(status)!=0) 'r @rip=fffff880`02b5bd1f'; 'gc'"

Basically this executes a conditional test (the “j” command) each time the breakpoint is hit. If the DWORD value represented by the variable named ‘status’ is non-zero, then I know I’ve hit the failure condition. In that case, I just move the instruction pointer back up to just before the failing function call, leaving me ready to trace into the function and watch the failure. Otherwise, the breakpoint essentially just hits ‘Go’ (the gc command) and continues on to the next hit.

The syntax here is a bit rough, and would have to be modified if your program isn’t always at the same code location (since I hard-coded the rip register). It could be replaced with an offset from the current location to be a bit more elegant. But since I was working on a driver, it was always in memory and at the same place, so I was lazy. (A habit that always pays off immediately.)
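
As a sketch of that more relocatable form, the same breakpoint could be written with a relative adjustment instead of an absolute address (the symbol, the status variable, and the 0x17 offset below are all made up for illustration):

bp driver!SomeCaller+0x52 "j (dwo(driver!status)!=0) 'r @rip = @rip - 0x17'; 'gc'"

Here 0x17 would be the distance from the breakpoint back to the call instruction you want to re-execute.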

A Better Way to Find an Allocation

Quite some time back I wrote a blog entry on how to make conditional breakpoints. In particular, I was looking for a way to find when a certain pool tag allocation occurred. Well, it turns out there is a MUCH better way of doing it than what I posted in that blog entry.

The latest issue of “The NT Insider” from OSR just came out and has an article on debugging techniques. Apparently there is a global value that you can set to the pool tag you would like to break on when it is allocated. So if I were looking for the tag ‘Test’ I could set the following value in the debugger:

kd> ed nt!PoolHitTag 'tseT'

Note that this technique will only work if pool tagging is enabled on the system. But nevertheless, it is a lot faster than the method I showed previously. (Although the old technique could still be useful for other scenarios.)
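
Pool tags are conventionally written reversed in driver source so that the bytes read correctly in pool dumps, which is also why the value above is entered as ‘tseT’. The allocation you end up breaking on typically originates from something shaped like this (names here are purely illustrative):

#include <ntddk.h>

// Tag literal is written reversed ('tseT') so the bytes appear as "Test" in
// poolmon / !pool output; nt!PoolHitTag above is set to the same 32-bit value.
#define EXAMPLE_POOL_TAG 'tseT'

NTSTATUS AllocateSomething(VOID)
{
    PVOID buffer = ExAllocatePoolWithTag(NonPagedPool, 256, EXAMPLE_POOL_TAG);
    if (buffer == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    // ... use the buffer ...

    ExFreePoolWithTag(buffer, EXAMPLE_POOL_TAG);
    return STATUS_SUCCESS;
}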

Localization Made Awesome

I have had the opportunity to work for a number of different software companies, each of which has localized its product into different languages. These opportunities have led me to experience a number of different processes around localization, and a great deal of pain surrounding the whole issue. Localization sucks. However, I don’t believe that it has to suck, and I think there are practices that, if put in place, can make it relatively painless.

I believe the biggest problem area that I have seen in localization is the interweaving of responsibilities. Let me describe the process from one company I have worked for. The developers create their resources in an English-only version, and the sources for these resources are checked into source control. Then at some predetermined point, a developer packages up all the resource source files and sends them off to the localization team. They have some database into which the strings are loaded, translated (often by a third party), and from which new source files are generated for the various languages. The source files are then sent back to the developers, who have to check them in and test them to make sure that they compile. Net result: days of turnaround time for even trivial localization changes, and hours of wasted time.

The root issue here is that you have developers who are involved in localization, which is not their primary responsibility, and localizers who are involved in development (by producing source files that have to be compiled). I think that we can resolve these issues and streamline the process by simply separating the functions more distinctly.

First, we need a method whereby localization can take place during the build process without any intervention. My first thought here is that we could create a program, as part of the build system, that localizes an English source file into another language’s source file. The data that this program would consume is a translation database that would be checked into the source control system. The localization team is responsible simply for checking in a new copy of the database when they have updates. This system is a good improvement on the original, but it still leaves the problem that the files have to be compiled after the localization process changes them. One wrong string (with a misplaced quote character, etc.) and the build fails.

A better way to solve this problem is to have the build system build the initial binaries first. The localization program is then pointed at the binary file containing the English resources. It edits this binary file and produces a copy of it with updated resources. In this way, we have isolated the localization process from the development process as much as possible.

The program responsible for localizing the binaries should also produce, as part of its output, a file that shows which resources are new or updated; these resources need further localization work. There also needs to be a tool that the localizers use to edit the localization database, and it should be able to consume this build output to import the new strings into their database. In this way the localizers never have to deal directly with the product. They only deal with the inputs to a tool that they own, and the outputs from a tool that they own.

Another side improvement to the general localization process is the ability to do a large degree of localization testing before any translation takes place. This is what I have heard referred to as “pseudo-localization”. The localizing tool should be capable of producing a “pseudo-localized” binary that does not contain ANY real translated strings. Rather, it should take each existing string resource and pad it with enough gibberish to make it as long as a translated string would be. (Usually strings grow by something like 30% or so when translated into certain languages.) The product can then be installed and tested with these pseudo-localized strings to find spacing issues, etc. long before any translation work is done.
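
As a rough sketch of that padding step (the function name and the filler characters are arbitrary choices for the example):

#include <string>

// Pad an English resource string by roughly 30% with obvious filler so that
// clipping and truncation show up in testing long before real translations exist.
std::wstring PseudoLocalize(const std::wstring& english)
{
    std::wstring result = L"[";                           // brackets make truncation easy to spot
    result += english;
    size_t padding = (english.length() * 30) / 100 + 1;   // simulate ~30% growth
    result.append(padding, L'~');                         // obvious filler character
    result += L"]";
    return result;
}

Running "OK" through this produces "[OK~]", and longer strings grow proportionally, which is usually enough to flush out dialogs and controls that clip.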

Getting the Address of a Private Kernel Routine

When writing a driver, there are times when you may want to call a function if it is available on the version of the operating system you are running on, but it may not always be available. For example, I recently came across a need to use the ZwRenameKey function, which was added in Windows XP. My driver also runs on Windows 2000, so I need to detect and use this routine dynamically if it is available. Enter the handy function MmGetSystemRoutineAddress. But wait… it doesn’t seem to work for ZwRenameKey, which is apparently not exported by the kernel and therefore can’t be obtained using that routine.

But since I really need to use it (don’t ask why… long story) I’m going to have to find another way to get the address of the routine. The first step is to get the address of the service descriptor table.

kd> x nt!KeServiceDescriptorTable
8089f460 nt!KeServiceDescriptorTable = <no type information>

This table actually has four entries, the first of which is used for the Native API. (See Microsoft Windows Internals, Fourth Edition, page 122 for more information about these structures.) So we get the address from the first entry.

kd> dd 8089f460 L4
8089f460 80830bb4 00000000 00000128 80831058

Now we just need to dump this table with symbols so we can find the routine we’re interested in.

kd> dps 80830bb4 L120
80830bb4 80917510 nt!NtAcceptConnectPort
80830bb8 80962516 nt!NtAccessCheck
80830bbc 809667ce nt!NtAccessCheckAndAuditAlarm
80830bc0 80962548 nt!NtAccessCheckByType
80830bc4 80966808 nt!NtAccessCheckByTypeAndAuditAlarm
80830bc8 8096257e nt!NtAccessCheckByTypeResultList
80830bcc 8096684c nt!NtAccessCheckByTypeResultListAndAuditAlarm
80830bd0 80966890 nt!NtAccessCheckByTypeResultListAndAuditAlarmByHandle

80830ed4 808b0f88 nt!NtRenameKey

A little bit of math then tells us the offset into the table (the service index). With that in hand we can write some code that reads this entry of the table and gets the address of the routine we need.

kd> ? (80830ed4 - 80830bb4) / 4
Evaluate expression: 200 = 000000c8
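
A sketch of what that code might look like on 32-bit Windows, where KeServiceDescriptorTable is exported from the kernel (the structure layout mirrors the four values dumped above, and the 0xc8 index is of course specific to this particular build):

#include <ntddk.h>

// Layout matches the four DWORDs dumped from nt!KeServiceDescriptorTable above.
typedef struct _SYSTEM_SERVICE_TABLE {
    PVOID  *ServiceTableBase;      // 80830bb4: array of service routine addresses
    PULONG  ServiceCounterTable;   // 00000000
    ULONG   NumberOfServices;      // 00000128
    PUCHAR  ParamTableBase;        // 80831058
} SYSTEM_SERVICE_TABLE, *PSYSTEM_SERVICE_TABLE;

// Exported by the 32-bit kernel; the first entry describes the Native API table.
__declspec(dllimport) SYSTEM_SERVICE_TABLE KeServiceDescriptorTable;

typedef NTSTATUS (NTAPI *NT_RENAME_KEY)(HANDLE KeyHandle, PUNICODE_STRING NewName);

// The index we computed above (0xc8); it is NOT stable across OS versions.
#define NT_RENAME_KEY_INDEX 0xC8

NT_RENAME_KEY GetNtRenameKeyAddress(VOID)
{
    if (NT_RENAME_KEY_INDEX >= KeServiceDescriptorTable.NumberOfServices) {
        return NULL;
    }
    // This is the address of nt!NtRenameKey, the entry the service table holds.
    return (NT_RENAME_KEY)KeServiceDescriptorTable.ServiceTableBase[NT_RENAME_KEY_INDEX];
}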

Note that this is not a great thing to have to do. These offsets are not guaranteed to stay the same, and they are definitely different between versions of the operating system.

Windbg: Disabling ASSERTs

Sometimes you have an ASSERT in your code and for some reason it starts being hit all the time. It’s good to know about it the first time, but if it’s happening hundreds of times a second, it can make debugging (or just replacing the code with something that doesn’t ASSERT) very difficult. Enter windbg and the power to kill an ASSERT.

When the debugger breaks in on an ASSERT, look at your disassembly. You will probably see something like the following. Yours may look a little different, depending on what you’re ASSERTing on, but the shape will be the same.

f6bc9923 7414 jz driver!function+0x1d9 (f6bc9939)
f6bc9925 6819050000 push 0x519
f6bc992a 68c096bcf6 push 0xf6bc96c0
f6bc992f e874010300 call driver!DbgPrint (f6bf9aa8)
f6bc9934 83c408 add esp,0x8
f6bc9937 cd01 int 01

The key element here is the first line, which is doing the actual test of the ASSERTion. If the ASSERT comparison evaluates to zero (i.e., the ASSERT succeeds), it jumps over the call that prints the debug string and over the int 01 that breaks into the debugger.

So what we want is for it to always skip that section. If we put windbg into assembly mode using the “a” command (of course telling it what address to assemble at), and then replace the jz instruction with an unconditional jmp, that’s all we need.

kd> a f6bc9923
f6bc9923 jmp driver!function+0x1d9
f6bc9925

kd> g

Note that the only effect on the binary code is to change the first byte from 74 to EB. You could accomplish the same thing by editing that byte directly instead of using the assembler, which is as simple as:

kd> eb f6bc9923 eb

Querying the name of an object from a handle

I was helping another developer with some work the other day and thought that what we came up with might be useful for others. We had a handle to a registry key that we got from somewhere (i.e., some other unrelated part of the code had opened it) and we needed to determine the name of the key.

In the past I have wrapped all my registry handling in some class objects that also maintain the name of the key. Each time a key is opened relative to another, it copies the key name from the parent key class to make up the key name of the new class. It certainly works, but it seems silly to maintain something that is already being stored somewhere down in the kernel.

So we created some code that uses the native API function NtQueryObject. It returns the object name in the form it’s stored in down in the kernel, which was just perfect for what we needed. The name comes back in the form “\REGISTRY\MACHINE\SOFTWARE\Microsoft”.

A few interesting notes about this technique. It mostly tells you what the REAL name of the object is, so if you’re on a 64-bit system and you open HKLM\Software\Microsoft from a 32-bit process and then query its name, you will see that it is \REGISTRY\MACHINE\SOFTWARE\Wow6432Node\Microsoft. However, if you open a file object with a short (8.3) name and then query it, you will STILL get the short name back.
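
The core of it looks roughly like this (a sketch rather than the full sample; the name-information structure and the information-class value are the usual hand-rolled definitions, since winternl.h doesn’t expose them):

#include <windows.h>
#include <winternl.h>
#include <stdio.h>

#ifndef NT_SUCCESS
#define NT_SUCCESS(Status) (((NTSTATUS)(Status)) >= 0)
#endif

// Not in winternl.h's public definitions, so declared here by hand.
#define ObjectNameInformation ((OBJECT_INFORMATION_CLASS)1)
typedef struct _MY_OBJECT_NAME_INFORMATION {
    UNICODE_STRING Name;
    WCHAR NameBuffer[1];
} MY_OBJECT_NAME_INFORMATION;

typedef NTSTATUS (NTAPI *PFN_NTQUERYOBJECT)(HANDLE, OBJECT_INFORMATION_CLASS,
                                             PVOID, ULONG, PULONG);

void PrintObjectName(HANDLE handle)
{
    // NtQueryObject lives in ntdll.dll; resolve it at run time.
    PFN_NTQUERYOBJECT pNtQueryObject = (PFN_NTQUERYOBJECT)
        GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQueryObject");
    if (pNtQueryObject == NULL) {
        return;
    }

    BYTE buffer[2048] = {0};
    ULONG returnLength = 0;
    NTSTATUS status = pNtQueryObject(handle, ObjectNameInformation,
                                     buffer, sizeof(buffer), &returnLength);
    if (NT_SUCCESS(status)) {
        MY_OBJECT_NAME_INFORMATION* info = (MY_OBJECT_NAME_INFORMATION*)buffer;
        // Name.Length is in bytes; prints e.g. \REGISTRY\MACHINE\SOFTWARE\Microsoft
        wprintf(L"%.*ls\n", (int)(info->Name.Length / sizeof(WCHAR)), info->Name.Buffer);
    }
}

Pass it any handle (the registry key handle in our case) and it prints the kernel’s name for the underlying object.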

For a sample program, continue reading…


Windbg: Conditional Breakpoints with string pattern matching

Today I ran into a need to set a breakpoint that would only stop when a certain string was encountered. In the past I have just modified the code to test for the string, then updated my driver, rebooted, etc. A very time-consuming process. So today I decided that I wanted to figure out how to do it right in the debugger. I knew it was possible from comments, but didn’t know how to implement it.

First of all, since this runs a non-trivial set of commands each time the breakpoint is hit, I placed the commands in a secondary file. There may be a way to get this all onto a single-line breakpoint command, but I don’t see it. So the breakpoint we create is just going to run the commands from the secondary file. The command to create the breakpoint is something like this:

bp driver!functionName "$$< C:\debugCommands.txt"

Then comes the important part: the actual commands that get executed. We need to evaluate a string against a pattern, which the MASM expression evaluator can handle using the "$spat" operator. The hard part is that at first glance it only appears to work with string literals. So $spat( "Big string", "*str*" ) will work, while $spat( poi(variableName), "*str*" ) just laughs mockingly at you.

The key here is to assign the string to an alias which will then allow it to be evaluated by the $spat command. So using our example comparison, the commands in the secondary file look like:

as /mu ${/v:MyAlias} poi(variableName);
.if ( $spat( "${MyAlias}", "*str*" ) == 0 ) { g }

The commands evaluate the string. If a match is not found, the g[o] command is executed, otherwise execution will stop at the point when the pattern is found. Note that there are much more complicated pattern matching expressions available as well.