I ran across a bug in a project just the other day that I thought others could find interesting. In this project, I had a main thread that was listening for connections and then serving back some data. I also had a timer that would periodically trigger an update of the data that was being served. This was using a System.Threading.Timer and therefore was running on a secondary thread from the thread pool.

The problem was that the timer would run two or three times (in fifteen minute intervals) and then it would just magically stop running. I initially thought perhaps locking issues between the threads, so I went through and locked everything that was shared, all to no avail.

And to make the problem even more frustrating, I couldn’t reproduce it in a debugger. I initially thought that this was perhaps because I was not patient enough to wait 45 minutes for it to happen. But it turned out to be a release vs. debug kind of problem: the release build had the problem, while the debug build didn’t seem to.

For research purposes, take as an example the follow little program. This program should have a main thread that just sleeps the day away, and a timer that prints out a debug message every 5 seconds. If I run the debug build of this, it works great, but running the release build on my machine, the timer thread didn’t even run a single time! Waahh?!?!

class Program
{
  static long i = 0;

  static void TimerCallback(object state)
  {
    Debug.WriteLine("{0:D5}: TimerCallback", i);
  }

  static void Main(string[] args)
  {
    // Trigger the callback every 5 seconds
    System.Threading.Timer t = new System.Threading.Timer(TimerCallback, null, 0, 5000); 

    while (true)
    {
      Thread.Sleep(2500);
    }
  }
}

It turns out what is going on here is that the system is happily garbage collecting my Timer object. According to the system, that t variable never gets used after it is initialized, so it’s safe to just throw it away. If you look at the MSIL using the ILDASM tool, you see the following for the release build. Notice that it does a newobj to create the Timer object, and then rather than storing it in a local with something like stloc.0, it just pops it off the stack and doesn’t keep any reference on it.

IL_0013:  newobj     instance void [mscorlib]System.Threading.Timer::.ctor(class [mscorlib]System.Threading.TimerCallback,
                                                                             object,
                                                                             int32,
                                                                             int32)
IL_0018:  pop

The debug version of the same code like the following, and note that it declares a local object, and then stores the reference to the Timer object in that local object.

.locals init ([0] class [mscorlib]System.Threading.Timer t,
           [1] bool CS$4$0000)
...
IL_0014:  newobj     instance void [mscorlib]System.Threading.Timer::.ctor(class [mscorlib]System.Threading.TimerCallback,
                                                                             object,
                                                                             int32,
                                                                             int32)
IL_0019:  stloc.0

Now once I figured out what was going on, fixing it was trivial. A using statement around the disposable Timer object keeps it in scope, and deterministically cleans it up when appropriate. (Of course, this is how the code should have been written in the first place, but look at the cool problem I got to figure out as a result of my lazy coding.)

I have noticed an increasing trend recently of people who aren’t professional programmers wanting to learn coding. There are a bunch of businesses and web sites that have sprung up around this: pluralsight.com, lynda.com, codeacademy.com, just to name a few. I think that this is a great trend, but I also think it deserves a measured approach.

Every worker can likely improve their productivity by learning some basic coding skills. Whether it’s automating some data entry, or being able to maintain a simple website, we all have tasks that we do with computers that could probably be made more efficient with a little coding know-how. For example, I know someone who had to do data entry on a web form, transcribing it from an excel spreadsheet, and they automated the process with a simple Excel macro. I have also had cases where I needed to rename a few hundred files with certain conventions, and shell scripts (batch files) fit the bill perfectly.

But not everyone can or should try to be a professional programmer on the side. Being a professional anything takes time and dedication. While you may learn how to fix a leak under the kitchen sink or even replace a garbage disposal, it probably doesn’t make much sense for you to learn how to be a professional plumber while maintaining your day job. The same applies to auto mechanics skills, or electrician skills, or to legal skills and business management. Everyone can improve their life by acquiring some basic skills in all of these areas, but as soon as you start putting too much effort, you will short your primary pursuits and end up being a master of nothing.

My advice is to absolutely spend some time learning to code. Find the ways that you can invest smaller amounts of time to get the largest benefit. Don’t try to be a professional programmer… unless, of course, you want to give up being a professional whatever-you-are-now. In that case, by all means, dive right in and make a career switch! Programming is awesome!

When I am kernel debugging, I use the !list command a LOT. It basically walks a doubly-linked list and dumps the memory of each list entry. You can also have it dump a more readable type, or have it run commands for each list entry. Today I ran across a list that was extremely large. I wasn’t really interested in the data so much as understanding just how large it was. So I had to figure out how to count the number of list entries and I figured I would share.

r $t0 = 0; !list " -x \"dd @$extret L0; r $t0 = @$t0 + 1\" 0xffffffff'0a1bcdef "; ? @$t0

Gibberish? Let’s dig in and analyze the statement. For starters, it’s a multi-part command, separated by semi-colons. The first statement sets a pseudo-register to 0, the next part uses the !list command to hit every list entry, and the last part just evaluates the pseudo-registry and dumps it’s value afterward.

r $t0 = 0
!list " -x \"dd @$extret L0; r $t0 = @$t0 + 1\" 0xffffffff'0a1bcdef "
? @$t0

Now the guts of this is in the !list command, which runs a multi-part command for each list entry. The first part of the command is just there to use up the @$extret parameter, so windbg doesn’t try to tack it onto the end of our second command, which is simply incrementing the value in the pseudo-register.

Voila! Now I know that my list has 21,000 or so entries in it, and I now understand why there just might be a performance problem. ;)

Happy debugging!

A recent project required me to call some REST apis on a web server from my client application, which was written in C++. What I had to do was very simple, to I first started looking at the WinINet and WinHTTP API families. This quickly turned into me wanting to harm myself or someone else, so I decided to continue searching for a library to help out. I looked at cURLpp and it seemed okay, but then I stumbled across the C++ REST SDK code named Casablanca. Since I have been on a modern C++ kick lately and have been enjoying some of the new things Microsoft has added to the compilers (from the C++11/14 standards), I decided to give this Microsoft-created open-source SDK a look.

My first impressions are that the syntax is a bit complicated and the documentation is very near to non-existent. I was reduced to combing through the provided sample apps to try to figure out what I needed to do, but it wasn’t horribly difficult. A few hours had me on the way to some simple REST calls. My first task was to call one of my APIs with a PUT verb, providing a JSON document in the request body. The following example illustrates this. One interesting thing here is the .get() call that comes on a lot of the SDK objects. This function is what waits for the asynchronous web call to complete, and then returns the result from the call.

http_client cli( U("http://localhost:8080/myservlet") );

//	PUT http://localhost:8080/myservlet/api/computerdata
//
ostringstream_t uri;
uri << U("/api/computerdata");

value body = value::object();
body[U("id")] = value::string( U("COMPUTER") );
body[U("version")] = value::string( U("1.1.1.3") );

http_response response = cli.request( methods::PUT, uri.str(), body.serialize(), U("application/json") ).get();
if ( response.status_code() == status_codes::OK &&
     response.headers().content_type() == U("application/json") )
{
   value json_response = response.extract_json().get();
   ucout << json_response.serialize() << endl;
}
else
{
   ucout << response.to_string() << endl;
}
}

The get call is even more simple, since it doesn't require creating a JSON request body...

http_client cli( U("http://localhost:8080/myservlet") );

//	GET http://localhost:8080/myservlet/api/computerdata/COMPUTER
//
ostringstream_t uri;
uri << U("/api/computerdata/COMPUTER");

http_response response = cli.request( methods::GET, uri.str() ).get();
if ( response.status_code() == status_codes::OK &&
     response.headers().content_type() == U("application/json") )
{
   value json_response = response.extract_json().get();
   ucout << json_response.serialize() << endl;
}
else
{
   ucout << response.to_string() << endl;
}
}

These examples are incredibly simple, but illustrate some of the most basic uses of this SDK. The SDK includes a lot more powerful and complex operations, such as PPL tasks, which is a model for creating asynchronous operations that is based on C++11 stuff. The SDK can be easily included in your package from Visual Studio by using the NuGet package manager to include Casablanca in your project. It will set up all the include paths, etc. for you. The code, samples, and what documentation there is can be found at casablanca.codeplex.com.

So I was recently writing some code to test some performance characteristics of lists and vectors. This was prompted by my watching Bjarne Stroustrup’s keynote from Going Native 2012, where he explains yet another reason why vector should be the favored data structure: it often performs better than list, even when computer science common sense tells us that it should not. (See Bjarne Stroustrup: Why you should avoid Linked Lists (Youtube) for more about that.)

So I was using the Windows performance counter APIs, QueryPerformanceFrequency and QueryPerformanceCounter, and I was using them a LOT, since I was trying to measure what kind of impact each part of the testing had on the system. (E.g., how much relative time did it take to find the point at which we wanted to insert or delete an item vs. how much relative time did the actual insertion or deletion take.)

Since I have also been boning up on new language features in C++11/14, I decided that I wanted to figure out how to write a function that would take a lambda expression to make this all easy to use. I wanted to be able to call something like the following (which is completely trivial, but shows how I might want to use this functionality):

auto time = my_timer_function([](){ Sleep(500); });

Now the way I have done something like this in the past is to declare a function prototype, and then a function for each thing I want to measure, and then pass them into a function that takes a parameter of the type of the first prototype. If that sounds like it’s a bit hard to follow, that’s just because it’s a bit hard to follow.

Well in C++ they added a few new features to make this much simpler and easy to understand: Lambda expressions and the std::function type. Instead of defining a function prototype (which is always confusing syntax and I almost never get it 100% right the first time), you can use a parameter of type std::function, which is a template that takes the function signature as a parameter. So the following block defines a my_timer_function that takes in a function that looks like void fn( void ), and measures how long that function takes to complete.

auto my_timer_function( std::function fn ) -> double
{
	LARGE_INTEGER countsPerS = { 0 };
	LARGE_INTEGER beginElapsedCounts = { 0 };
	LARGE_INTEGER endElapsedCounts = { 0 };

	VERIFY( QueryPerformanceFrequency( &countsPerS ) );
	VERIFY( QueryPerformanceCounter( &beginElapsedCounts ) );

	//	Call the fn we are supposed to measure
	fn();

	VERIFY( QueryPerformanceCounter( &endElapsedCounts ) );
	return ( double( endElapsedCounts.QuadPart - beginElapsedCounts.QuadPart ) * 1.0 * 1000 / double( countsPerS.QuadPart ) );
}

The magic of the compiler makes it so you can pass this an actual function, or a lambda expression, or even a functor (object that can look like a function). So any of the following will work just fine…

// Lambda expression
auto time = my_timer_function( [](){ Sleep( 2000 ); } );

// Function
void MyFn()
{
	Sleep( 2000 );
}
...
auto time = my_timer_function( MyFn );

// Functor
struct MyFunctor
{
	void operator()()
	{
		Sleep( 2000 );
	}
};
...
auto time = my_timer_function( MyFunctor() );

I was looking through some code examples last week and realized there was a gap in my understanding of the const keyword in C++. I have always written definitions like “const MyClass& c”, but in the code I was reading I was seeing a lot of “MyClass const & c”. Now intuitively I was thinking that this must be basically the same thing, but since I wasn’t entirely positive I decided to do a little research. It turns out that the use of the const keyword is a bit more complicated than I originally thought.

In the two examples I mentioned above, the declarations are functionally equivalent. No difference, just a matter of preference. For myself I find that putting the const keyword first is the most readable. But I understand that some developers prefer the second. I have read that people who prefer to read their declarations right-to-left particularly prefer this, as it reads like “c is a reference to a constant MyClass”, or something like that. I’m not really able to identify one as better than the other, so I will continue to use my own preference in my code, and if I am working with someone elses code who prefers the second form, I can conform to that as well.

But as I delved a bit further into this I started looking at pointer declarations. I have seen const dropped in seemingly random places in a declaration: “<const> int <const> * <const> x”. It turns out that the placement of the const keyword does make a different declaration in some of these cases. The const keyword modifies the item just to its left, unless it is the left-most thing in the expression, when it modifies the item to its right. So “const int * x” is identical to “int const * x”, but “int * const x” is a WHOLE different beast. The first two forms make the int a constant value, while the third makes the POINTER itself a constant value.

To illustrate, here is a sample function that declares the same kind of pointer (to an integer) but with varying styles of using const. In each case, the pointer is used to try to modify the pointed-to integer, and then the pointer itself is updated to point to a different underlying integer. The compiler fails some of these operations depending on the const keywords usage, each of the failures is noted with a comment.

int x = 1;
int x2 = 2;

//	Non-const pointer to an integer
int* p1 = &x;
*p1 = 42;
p1 = &x2;

//	Non-const Pointer to a const integer (two styles)
const int* p2a = &x;
*p2a = 42; // error C3892: 'p2a' : you cannot assign to a variable that is const
p2a = &x2;

int const * p2b = &x;
*p2b = 42; // error C3892: 'p2b' : you cannot assign to a variable that is const
p2b = &x2;

//	Const pointer to a non-const integer
int * const p3 = &x;
*p3 = 42;
p3 = &x2; // error C3892: 'p3' : you cannot assign to a variable that is const

//	Const pointer to a const integer
const int * const p4 = &x;
*p4 = 42; // error C3892: 'p4' : you cannot assign to a variable that is const
p4 = &x2; // error C3892: 'p4' : you cannot assign to a variable that is const

In this episode of the series, we are going to take a little side trip to explore contexts in a registry filter driver. There are two main types of context in a registry filter. The first is the most simple and is called a callback context. This is simply a pointer value that you can initialize during a pre-operation callback and it gets passed to you in your post-operation callback. The PVOID member is in all of the pre-op callback structures, such as REG_CREATE_KEY_INFORMATION, starting with Vista. The field name is CallContext, and you simply set it during pre-op and use it during post-op. Cm does not manage this object’s lifetime in any way for you. It doesn’t automatically behave one way if the call succeeds vs. fails, or anything like that. So you have to take care to handle those things yourself. Following is a very simple pre-create/open callback, all it does is allocate a structure and set the CallContext so that the post- callback will receive it.

NTSTATUS RfPreCreateOrOpenKeyEx( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_CREATE_KEY_INFORMATION CallbackData )
{
    UNREFERENCED_PARAMETER( CallbackContext );
    UNREFERENCED_PARAMETER( Argument1 );
    PAGED_CODE();

    //	Allocate a new callback context and have it passed back to us when the open key completes
    PMY_CONTEXT pContext = (PMY_CONTEXT) ExAllocatePoolWithTag( PagedPool, sizeof(MY_CONTEXT), 'xCyM' );
    if ( pContext )
    {
        //	Put some meaningless data in our context that we can verify later
        pContext->Data = 42;
        CallbackData->CallContext = pContext;
    }
    return STATUS_SUCCESS;
}

The code for the post- callback is almost as simple, but now we will examine the second kind of context: the object context. Registry filters can attach a bit of data to any registry object. This may be useful for creating some data during key-open that would be used during later operations such as a value enumerate. You can put anything in here, and Cm will pass it to you in the ObjectContext field of every subsequent callback for that object. Multiple registry filters can each store their own context, and the system differentiates between these by using the “Cookie” value that is returned when you register your filter with Cm.

For our example, we will take the context that was generated in the pre-create/open, and if the create/open was successful, we will attempt to attach this same context to the object. If the operation was not successful, or we were unable to set the object context, then we need to free the context ourselves.

NTSTATUS RfPostCreateOrOpenKeyEx( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_POST_OPERATION_INFORMATION CallbackData )
{
    UNREFERENCED_PARAMETER( CallbackContext );
    UNREFERENCED_PARAMETER( Argument1 );
    PAGED_CODE();

    if ( CallbackData->CallContext )
    {
        if ( STATUS_SUCCESS == CallbackData->Status )
        {
            //	Attach the context we allocated to the opened object
            NTSTATUS status;
            status = CmSetCallbackObjectContext( CallbackData->Object, &g_CmCookie, CallbackData->CallContext, NULL );
            if ( NT_SUCCESS( status ) )
            {
                InterlockedIncrement( &g_nObjectContexts );
            }
            else
            {
                ExFreePool( CallbackData->CallContext );
            }
        }
        else
        {
            ExFreePool( CallbackData->CallContext );
        }
    }
    return STATUS_SUCCESS;
}

Once the context is attached to the object, Cm now will manage it’s lifetime for us, giving us a callback whenever it needs the context to be cleaned up. (Cm makes this callback either when the object is closed, or when your filter is unloaded.)

NTSTATUS RfCallbackObjectContextCleanup( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_CALLBACK_CONTEXT_CLEANUP_INFORMATION CallbackData )
{
    UNREFERENCED_PARAMETER( CallbackContext );
    UNREFERENCED_PARAMETER( Argument1 );
    PAGED_CODE();

    if ( CallbackData->ObjectContext )
    {
        //	Make sure that our context looks like we expect it to
        PMY_CONTEXT pContext = (PMY_CONTEXT) CallbackData->ObjectContext;
        ASSERT( 42 == pContext->Data );
        ExFreePool( CallbackData->ObjectContext );
        InterlockedDecrement( &g_nObjectContexts );
    }
    return STATUS_SUCCESS;
}

The object context is a powerful way to help you generate information about an object that you will need later, and attach it to the object. It’s nothing you couldn’t have written yourself by managing your own storage of pointers to context, and linking the object pointer to each one, but it’s just so nice when Cm does it for you. ;)

One additional note should be made. Cm will also provide you the context of a root object when a relative create/open is done, in the RootObjectContext field.

Also note that I have put in some testing and debugging code, keeping track of how many contexts have been created vs. freed, and making sure that they all get cleaned up when the driver is unloaded. This test code is not strictly necessary, but helped to reassure me that the correct things were happening.

Registry Filter Driver Source (Part 4)

In Part 2 we looked at passively filtering a call to open a registry key (we were simply logging the name of keys opened), and we promised that next we would look at actively filtering the same kind of call. The most simple form of active filtering is simply completing the operation with some kind of error code. For example, if we wanted to prevent users from ever opening a certain registry key, we could just return access denied. This is as simple as returning the status code in your filter callback.

In the following example, we still need to get the full name of the registry key being opened as we did in Part 2, because we are determining whether to deny access based on the object name. If a caller tries to open HKLM\Software\MySecretTestKey they will receive an access denied error. There are some other side effects of putting this code in place. First, if we fail to get the name of the object being opened via our call to CmCallbackGetKeyObjectID, we will now be returning that failure code to the caller, with the result that the key will not be able to be opened even though it is not the key we are trying to block access to. This is a design decision, and probably has to do with whether a random failure should result in granting access.

NTSTATUS RfPreOpenKeyEx( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_OPEN_KEY_INFORMATION CallbackData )
{
    ...

    NTSTATUS status = STATUS_SUCCESS;

    ...

    PUNICODE_STRING pKeyNameBeingOpened = pLocalCompleteName ? pLocalCompleteName : CallbackData->CompleteName;

    //	Prevent callers from opening our secret registry key
    UNICODE_STRING TestKeyName;
    RtlInitUnicodeString( &TestKeyName, L"\\REGISTRY\\MACHINE\\SOFTWARE\\MySecretTestKey" );
    if ( !RtlCompareUnicodeString( pKeyNameBeingOpened, &TestKeyName, TRUE ) )
    {
        KdPrint( ( "RegFlt: PreOpenKeyEx for %wZ being denied!\n", pKeyNameBeingOpened ) );
        status = STATUS_ACCESS_DENIED;
        goto LExit;
    }

    ...

    return status;
}

The second side effect of our filter as written is that it handles open-key operations differently than create-key operations, which are two completely separate calls in the registry. This could result in a weird case where somebody could create the key, but then fail to open it in the future. (Similar to what can happen if you accidentally create a key with a security descriptor that doesn’t allow yourself access to the object.) In practice, I have found that it is generally best to have the create and open key callbacks perform all the same logic, and if possible to have them call the same code. I will change our sample to do this, but will not show the code here. I am going to being attaching the code file with each article so you can look there if you want to see how it’s done.

Now we put the driver onto a test system and sure enough we can pull up regedit and see that it will not let us open that key. But a strange thing is now happening on my Windows 7 test system. It doesn’t appear to prevent me from creating that key through regedit. If I create it through code, or from the command-line reg.exe, then the create fails as expected. What we are seeing here is that regedit actually creates a key named something like “New Key #1″, and then does a RENAME! (The rename-key functionality was added in Windows XP, and although very few applications use it, regedit is one that does.)

So what we are going to have to do now is filter the rename key call. This seems like it should be very simple. The REG_RENAME_KEY_INFORMATION structure provides a NewName member. We should just be able to test this against our secret key name and return access denied. But wait! Apparently, the NewName member only has the final path component (the rename key operation only allows the final component to be changed, i.e., renaming a key but not moving it to a different parent). So it looks like we are going to have to do our own name lookup and parsing to do our comparison; after all, we don’t want to block somebody from making a key named MySecretTestKey just anywhere!

Note that this is the kind of code that I really would like to avoid having to write. Pointer arithmetic and messing around with buffers directly is error prone, and great care must be taken, but sometimes it cannot be avoided. I decided to write a function that would take a UNICODE_STRING containing an object name and initialize two new UNICODE_STRING structures using the same buffer as the original, but only referring to the parent part of the name and the relative part of the name. I won’t say that the following is perfect code, but it is good enough to pass my cursory testing, and gives you an idea of what has to be done.

NTSTATUS _RfSplitParentAndRelativeNames( __in PCUNICODE_STRING pFullObjectName, __out_opt PUNICODE_STRING pParentObjectName, __out_opt PUNICODE_STRING pRelativeObjectName )
{
    //	Defensive programming. These names may come from somebody elses code, so be careful!
    if ( !pFullObjectName->Buffer || 0 == pFullObjectName->Length )
    {
        return STATUS_INVALID_PARAMETER;
    }

    //	Search backward through full object name for the path separator, taking care not to underflow!
    const wchar_t* pCh = &pFullObjectName->Buffer[ (pFullObjectName->Length / sizeof(wchar_t)) - 1 ];
    while ( pCh && pCh >= pFullObjectName->Buffer && *pCh != OBJ_NAME_PATH_SEPARATOR )
    {
        --pCh;
    }

    //	Again, be defensive. The string provided may not have had a backslash.
    if ( pCh <= pFullObjectName->Buffer )
    {
        return STATUS_INVALID_PARAMETER;
    }

    //	Everything before the character we stopped on is the parent, everything after is the relative name
    USHORT cbOffset = (USHORT) PtrDiff( pCh, pFullObjectName->Buffer );
    if ( pParentObjectName )
    {
        pParentObjectName->Length = pParentObjectName->MaximumLength = cbOffset;
        pParentObjectName->Buffer = pFullObjectName->Buffer;
    }

    if ( pRelativeObjectName )
    {
        cbOffset += sizeof( wchar_t );
        pRelativeObjectName->Length = pRelativeObjectName->MaximumLength = pFullObjectName->Length - cbOffset;
        pRelativeObjectName->Buffer = (PWCH) Add2Ptr( pFullObjectName->Buffer, cbOffset );
    }

    return STATUS_SUCCESS;
}

Given this new function we can split up the name of the object passed to us in the rename key callback, and pass the parent of the original object name, along with the new name, and test to see if it matches our secret key. So our rename key callback is fairly simple, as follows.

NTSTATUS RfPreRenameKey( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_RENAME_KEY_INFORMATION CallbackData )
{
    UNREFERENCED_PARAMETER( CallbackContext );
    UNREFERENCED_PARAMETER( Argument1 );

    NTSTATUS status = STATUS_SUCCESS;

    //	Get the full name of the new key
    PCUNICODE_STRING pObjectName;
    status = CmCallbackGetKeyObjectID( &g_CmCookie, CallbackData->Object, NULL, &pObjectName );
    if ( !NT_SUCCESS( status ) )
    {
        goto LExit;
    }

    //	Find the last path separator character and get a "parent object" key name
    UNICODE_STRING ParentObjectName;
    status = _RfSplitParentAndRelativeNames( pObjectName, &ParentObjectName, NULL );
    if ( !NT_SUCCESS( status ) )
    {
        goto LExit;
    }

    //	Prevent callers from opening our secret registry key
    if ( _RfIsKeyMySecretKey( &ParentObjectName, CallbackData->NewName ) )
    {
        KdPrint( ( "RegFlt: Rename of key to %wZ being denied!\n", CallbackData->NewName ) );
        status = STATUS_ACCESS_DENIED;
        goto LExit;
    }

LExit:
    return status;
}

Now, a few additional comments. First, as I was working on the code for this, I found a small bug in my code to get the key object name from part 2. The code attached with this post will contain a fix, and I will post a comment to the previous article with details.

Another strange behavior I noted while testing this is that when a key is renamed multiple times, the rename key callback often gets the ORIGINAL key name when it calls CmCallbackGetKeyObjectID. For example, in regedit you create a new key “New Key #1″ and rename it to “TestKey”. Then you rename it again to “TestKey2″. The rename key callback will still get something like “\REGISTRY\MACHINE\SOFTWARE\New Key #1″ for the object, and the NewName would contain TestKey2. So it appears that you may not be able to entirely rely on CmCallbackGetKeyObjectID to return the current name of the object. It may be cached for the handle/object, or it may be cached for a longer lifetime, I couldn’t tell for sure from my limited testing.

RegFlt Source Part 3

In Part 1 we created a basic registry filter driver. For a sample, the code that we created was fine, but for production code, there are some parts of it that could become a little unwieldy. In this article we will do a little cleanup on the code, and then start looking at simple filtering of the open key operation.

Referring back to Part 1, the RfRegistryCallback function, which actually receives all the callbacks from the CM, has a nice little switch statement with three cases. In practice, there may be a good ten to twenty cases that you need to handle in order to implement a functional product. This switch statement gets a little difficult to maintain, and then you break things out into separate functions for each operation callback type. A good way to do this is just to convert our switch statement into some code that uses an array of function pointers, and calls the right operation callback depending on the operation class provided. So after these changes the RfRegistryCallback looks as follows:

NTSTATUS RfRegistryCallback( __in PVOID CallbackContext, __in PVOID Argument1, __in PVOID Argument2 )
{
    REG_NOTIFY_CLASS Operation = (REG_NOTIFY_CLASS) (ULONG_PTR) Argument1;

    //	If we have no operation callback routine for this operation then just return to Cm
    if ( !g_RegistryCallbackTable[ Operation ] )
    {
        return STATUS_SUCCESS;
    }

    //	Call our operation callback routine
    return g_RegistryCallbackTable[ Operation ]( CallbackContext, Argument1, Argument2 );
}

We also have a global array of function pointers for operation callback routines, and we initialize that during DriverEntry (prior to registering our callback):

PEX_CALLBACK_FUNCTION g_RegistryCallbackTable[ MaxRegNtNotifyClass ] = { 0 };

extern "C"
NTSTATUS DriverEntry( __in PDRIVER_OBJECT pDriverObject, __in PUNICODE_STRING pRegistryPath )
{
    //	... Set up unload routine, etc.

    //	Set up our registry callback table
    g_RegistryCallbackTable[ RegNtPreCreateKeyEx ] = (PEX_CALLBACK_FUNCTION) RfPreCreateKeyEx;
    g_RegistryCallbackTable[ RegNtPreOpenKeyEx ] = (PEX_CALLBACK_FUNCTION) RfPreOpenKeyEx;
    g_RegistryCallbackTable[ RegNtKeyHandleClose ] = (PEX_CALLBACK_FUNCTION) RfKeyHandleClose;

    //	... Register callback with CM
}

And then we have a few operation callback routines that all look similar, mainly differing in the type of the CallbackData parameter:

NTSTATUS RfPreCreateKeyEx( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_CREATE_KEY_INFORMATION CallbackData );

Now that we have a callback routine that is a bit cleaner and easier to maintain, let’s dive into filtering the open key operation. One of the most common things to need to do while filtering an open or create key operation is to examine the name of the object being opened. Maybe you have a list of keys you want to filter (or a list NOT to filter), or some other similar requirement. If we look at the CallbackData structure, which is of type REG_OPEN_KEY_INFORMATION, there is a CompleteName field. This field is not very well named, because it may or may not actually be the “complete” name. If the name begins with a backslash then it has the complete name, otherwise it is a name relative to the RootObject field, and we have to figure it out for ourselves. (Note that there is an API call CmCallbackGetKeyObjectID that purports to get the name of a key object. This API will not work in pre-Open/Create calls, because it is designed to work on an object, and cannot build us the complete name from an object and relative name. We will use this API to get the string name of the RootObject.)

The following sample function shows how to construct the full object name if necessary, and use either the constructed name or the one provided by Cm to display what key is being opened.

NTSTATUS RfPreOpenKeyEx( __in PVOID CallbackContext, __in PVOID Argument1, __in PREG_OPEN_KEY_INFORMATION CallbackData )
{
    UNREFERENCED_PARAMETER( CallbackContext );
    UNREFERENCED_PARAMETER( Argument1 );

    NTSTATUS status;
    PUNICODE_STRING pLocalCompleteName = NULL;

    //	Get the complete name of the key being opened
    if ( CallbackData->CompleteName->Length > 0 && *CallbackData->CompleteName->Buffer != OBJ_NAME_PATH_SEPARATOR )
    {
        PCUNICODE_STRING pRootObjectName;
        status = CmCallbackGetKeyObjectID( &g_CmCookie, CallbackData->RootObject, NULL, &pRootObjectName );
        if ( NT_SUCCESS( status ) )
        {
            //	Build the new name
            USHORT cbBuffer = pRootObjectName->Length;
            cbBuffer += sizeof( wchar_t );
            cbBuffer += CallbackData->CompleteName->Length;
            ULONG cbUString = sizeof(UNICODE_STRING) + cbBuffer;

            pLocalCompleteName = (PUNICODE_STRING) ExAllocatePoolWithTag( PagedPool, cbUString, 'tlFR' );
            if ( pLocalCompleteName )
            {
                pLocalCompleteName->Length = 0;
                pLocalCompleteName->MaximumLength = cbBuffer;
                pLocalCompleteName->Buffer = (PWCH) ( (PCCH) pLocalCompleteName + sizeof( UNICODE_STRING ) );

                RtlCopyUnicodeString( pLocalCompleteName, pRootObjectName );
                RtlAppendUnicodeToString( pLocalCompleteName, L"\\" );
                RtlAppendUnicodeStringToString( pLocalCompleteName, CallbackData->CompleteName );
            }
        }
        else
        {
            //	Could not get the complete name since we have a relative name and were
            //	unable to get the name of the root object
            goto LExit;
        }
    }

    KdPrint( ( "RegFlt: PreOpenKeyEx for %wZ\n", pLocalCompleteName ? pLocalCompleteName : CallbackData->CompleteName ) );

LExit:
    if ( pLocalCompleteName )
    {
        ExFreePool( pLocalCompleteName );
    }

    return STATUS_SUCCESS;
}

We now have a filter that get called for every open-key on the system and prints out in the debugger what is being opened. These techniques could be useful if you were building a monitoring product of some kind, but the real fun work comes when you want to change the behavior. In the next part we will look at actively filtering a registry open operation.

For the past number of years I have been working on driver projects that are file system filter drivers. Many people are familiar with these kinds of drivers, and there is lots of information out on the internet about how to create them. My drivers have also been register filter drivers: they do much the same thing that file system filter drivers do, but for the registry. Unfortunately, there is not a lot of information out there about writing one of these. There is a brief article on osronline.com (written by a coworker of mine of one of these drivers), and there is a sample driver include as part of the Microsoft Windows Driver Kit samples. The lack of information out there leaves me feeling like I am the only developer foolish enough to be working on one of these drivers. I recently decided that I should help to remedy that lack of information available about these drivers, and hopefully entice a few more foolish developers into trying their hand at this.

This first article will be a very basic intro on building a registry filter driver that does nothing! But never fear, dear reader, there will be more in-depth information coming soon.

First, let’s talk generally about motivation. What is “filtering” the registry, and why in the world would I want to do that? A filter driver lets you see operating system calls for the registry before (and after) they actually get processed by the Windows registry code (Configuration Manager or CM) in the kernel. A filter driver lets you examine the calls passively, for example you might create a registry call logging utility like Sysinternals’ RegMon or Process Monitor. It also lets you actively filter the calls, changing the results if you feel inclined. For example, you might want to have a security product that prevents people from writing to certain keys. (Granted, you could do this with ACLs as well, but you get the idea.)

In the old days of yore (pre-Vista) this functionality was not really baked. In early operating systems it was not there at all, and you would be forced to hook into the system call table in order to try to write your filtering product. This could lead to all sorts of potential problems, of course, and Microsoft decided that to support these products they needed new APIs for registry filtering. These were added in Windows XP and later, but the initial releases only supported passive filtering. The ability to do active filtering came along in Vista and later, and continues to see improvements in later operating systems.

The basics of building a registry filter driver are that you need your driver to call CmRegisterCallbackEx during its DriverEntry routines, and then to call CmUnRegisterCallback during your DriverUnload routines. Registry filters are considered minifilters and are loaded based on Load Order Groups and Altitude, just as file system minifilters are. We won’t really discuss this much in this series, but to create your own you need to request an altitude from Microsoft which helps them determine in which order filters should be called.

So to build a bare-bones registry filter, first we call the registration function during our driver initialization as follows. Note that we have defined a callback routine (which we will implement later), and made up an altitude just for testing. We get back from the registry a “cookie” that we must use later on when we want to unregister our filter.

NTSTATUS RfRegistryCallback( __in PVOID CallbackContext, __in PVOID Argument1, __in PVOID Argument2 );
void RfUnload( __in PDRIVER_OBJECT pDriverObject );
LARGE_INTEGER g_CmCookie = { 0 };

extern "C"
NTSTATUS DriverEntry( __in PDRIVER_OBJECT pDriverObject, __in PUNICODE_STRING pRegistryPath )
{
    UNREFERENCED_PARAMETER( pRegistryPath );

    //	Set up our unload routine
    pDriverObject->DriverUnload = RfUnload;

    //	Register our callback with the system
    UNICODE_STRING AltitudeString = RTL_CONSTANT_STRING( L"360000" );
    NTSTATUS status = CmRegisterCallbackEx( RfRegistryCallback, &AltitudeString, pDriverObject, NULL, &g_CmCookie, NULL );
    if ( !NT_SUCCESS( status ) )
    {
        //	Failed to register - probably shouldn't succeed the driver intialization since this is critical
    }

    return status;
}

Next we need to unregister our registry callback when our driver unloads:

void RfUnload( __in PDRIVER_OBJECT pDriverObject )
{
    UNREFERENCED_PARAMETER( pDriverObject );
    PAGED_CODE();

    NTSTATUS status = CmUnRegisterCallback( g_CmCookie );
    if ( !NT_SUCCESS( status ) )
    {
        //	Failed to unregister - try to handle gracefully
    }
}

Finally, we need a routine for the registry to call in our driver when something happens. For this first article, this routine isn’t really going to do anything, but it exists as a sample of what the callback function looks like, and what some of the parameters look like and how to handle them. In a future article we will look at a more graceful way to handle this callback routine.

NTSTATUS RfRegistryCallback( __in PVOID CallbackContext, __in PVOID Argument1, __in PVOID Argument2 )
{
    UNREFERENCED_PARAMETER( CallbackContext );
    UNREFERENCED_PARAMETER( Argument2 );

    REG_NOTIFY_CLASS Operation = (REG_NOTIFY_CLASS) (ULONG_PTR) Argument1;
    switch ( Operation )
    {
    case RegNtPreCreateKeyEx:
        break;
    case RegNtPreOpenKeyEx:
        break;
    case RegNtKeyHandleClose:
        break;
    }

    return STATUS_SUCCESS;
}

That’s it! At this point you have a working registry filter driver. You can install it on a machine (I just do this with the command line “sc create” and regedit for testing purposes). If you connect a kernel debugger and put a breakpoint on the RfRegistryCallback routine, you will see it’s getting called for all kinds of registry operations (there are a LOT of these going on in the system).