Initialization Order in c++ Matters

So last week I ran into a bug in my code that exposed an aspect of the c++ language that I was familiar with intellectually, but had never run into before. The initialization order of member variables in a class is in the order of their declaration in the class. Basically this means that if your class declares a, then b, then c, they will always get initialized in that order, even you have a constructor that initializes them in a different order.

In my case I was making a class that would call a function every so often, basically duplicating a Win32 Timer object functionality. I didn’t want to pull Boost libraries in, which appears to be the best possible way to accomplish this. Instead, I wrote the following class (roughly). The idea was that it would create a thread that would call my function every 1 second. When the containing object gets destructed, it would signal the thread to end and then wait for it. This is not the best async code, but it was simple enough and functional for what I needed.

class Timer
{
	std::thread timerThread_;
	bool destructing_{ false };

public:
	int ticks_{ 0 };

	Timer() : timerThread_{ [this]() { this->timerThread(); } }
	{
	}

	~Timer()
	{
		destructing_ = true;
		timerThread_.join();
	}

	void timerThread()
	{
		while (!destructing_)
		{
			std::this_thread::sleep_for(std::chrono::seconds(1));
			tick();
		}
	}

	void tick()
	{
		++ticks_;
	}
};

So after writing this code I found that my tick() function was never being called. In a debugger I quickly found that destructing_ was set to true when I was expecting it to be false. It didn’t take me long to realize what had happened. Reordering the bool and the std::thread declarations in the class fixed the problem. This was actually a rule I didn’t know about until recently, and I was glad I had listened to a CppCon talk that mentioned it, because it would have taken me a lot longer to figure out otherwise.

The main reason that I didn’t notice it initially, even being aware of the ordering rules, was that in my non-example code, the constructor code was in a .cpp file, and the member initialized bool was declared in the .h file. So the ordering is somewhat unapparent when viewing the .cpp file.

Deriving Unique Key Container Name for RSACryptoServiceProvider

I ran across a problem this week where I needed to get the filename where an RSA encryption key was stored. These files are stored (for machine-scope keys) in C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys, and have a filename that looks like a hash value followed by a SID. This is easy to find if you have access to the key:

var csp = new CspParameters
{
    Flags = CspProviderFlags.NoPrompt | 
            CspProviderFlags.UseMachineKeyStore | 
            CspProviderFlags.UseExistingKey,
    KeyContainerName = "dev.dev.domo.com"
};

var crypto = new RSACryptoServiceProvider(csp);

Console.WriteLine(csp.KeyContainerName);
Console.WriteLine(crypto.CspKeyContainerInfo.UniqueKeyContainerName);

But in my case I didn’t have access to the keyfile, as it had been created by another user and ACLed. The algorithm for deriving these filenames is not too difficult… It turns out you can take the container name, convert it to lowercase, add an extra null byte, compute the MD5 hash, and then convert the MD5 hash to a string in DWORD-sized chunks. Then you append the machine guid, which can be found in the registry.

public static class RsaCryptoServiceProviderExtensions
{
    public static string GetUniqueKeyContainerName(string containerName)
    {
        using (var rk = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\Cryptography"))
        {
            if (rk == null)
            {
                throw new Exception("Unable to open registry key");
            }

            var machineGuid = (string)rk.GetValue("MachineGuid");

            using (var md5 = MD5.Create())
            {
                var containerNameArray = Encoding.ASCII.GetBytes(containerName.ToLower());
                var originalLength = containerNameArray.Length;
                Array.Resize(ref containerNameArray, originalLength + 1);

                var hash = md5.ComputeHash(containerNameArray);
                var stringBuilder = new StringBuilder(32);
                var binaryReader = new BinaryReader(new MemoryStream(hash));
                for (var i = 1; i <= 4; i++)
                {
                    stringBuilder.Append(binaryReader.ReadInt32().ToString("x8"));
                }

                stringBuilder.Append("_" + machineGuid);

                return stringBuilder.ToString();
            }
        }
    }
}

Using an enumeration from a runtime-loaded assembly

I recently had a co-worker who needed to instantiate and use a class from an assembly loaded at runtime. The code couldn’t reference the assembly directly for various reasons. This was accomplished relatively easily until he needed to assign the value of an enumerated type. So take the following class definition.

namespace DynamicAssembly
{
    public class MyClass
    {
        public enum MyEnum { ValueA, ValueB, ValueC }

        public MyEnum TheEnumValue { get; set; }
    }
}

From a project, the goal was to load the above assembly dynamically, instantiate a MyClass variable and then set TheEnumValue = MyEnum.ValueB. Really simple in normal code… a little more convoluted in dynamic runtime code. The solution I came up with is the following:

static void Main(string[] args)
{
    var p = Path.GetFullPath(@"..\..\..\DynamicAssembly\bin\Debug\DynamicAssembly.dll");
    var a = Assembly.LoadFile(p);

    var classType = a.GetType("DynamicAssembly.MyClass");
    dynamic classInstance = Activator.CreateInstance(classType);

    var enumType = a.GetType("DynamicAssembly.MyClass+MyEnum");
    var enumValues = enumType.GetEnumNames();
    var enumIndex = Array.IndexOf(enumValues, "ValueB");
    var enumValue = enumType.GetEnumValues().GetValue(enumIndex);

    classInstance.TheEnumValue = (dynamic)enumValue;
}

I would love to hear from you if you know of a better way to accomplish this.

C# Garbage Collection, or Why Did My Object Just Disappear?

I ran across a bug in a project just the other day that I thought others could find interesting. In this project, I had a main thread that was listening for connections and then serving back some data. I also had a timer that would periodically trigger an update of the data that was being served. This was using a System.Threading.Timer and therefore was running on a secondary thread from the thread pool.

The problem was that the timer would run two or three times (in fifteen minute intervals) and then it would just magically stop running. I initially thought perhaps locking issues between the threads, so I went through and locked everything that was shared, all to no avail.

And to make the problem even more frustrating, I couldn’t reproduce it in a debugger. I initially thought that this was perhaps because I was not patient enough to wait 45 minutes for it to happen. But it turned out to be a release vs. debug kind of problem: the release build had the problem, while the debug build didn’t seem to.

For research purposes, take as an example the follow little program. This program should have a main thread that just sleeps the day away, and a timer that prints out a debug message every 5 seconds. If I run the debug build of this, it works great, but running the release build on my machine, the timer thread didn’t even run a single time! Waahh?!?!

class Program
{
  static long i = 0;

  static void TimerCallback(object state)
  {
    Debug.WriteLine("{0:D5}: TimerCallback", i);
  }

  static void Main(string[] args)
  {
    // Trigger the callback every 5 seconds
    System.Threading.Timer t = new System.Threading.Timer(TimerCallback, null, 0, 5000); 

    while (true)
    {
      Thread.Sleep(2500);
    }
  }
}

It turns out what is going on here is that the system is happily garbage collecting my Timer object. According to the system, that t variable never gets used after it is initialized, so it’s safe to just throw it away. If you look at the MSIL using the ILDASM tool, you see the following for the release build. Notice that it does a newobj to create the Timer object, and then rather than storing it in a local with something like stloc.0, it just pops it off the stack and doesn’t keep any reference on it.

IL_0013:  newobj     instance void [mscorlib]System.Threading.Timer::.ctor(class [mscorlib]System.Threading.TimerCallback,
                                                                             object,
                                                                             int32,
                                                                             int32)
IL_0018:  pop

The debug version of the same code like the following, and note that it declares a local object, and then stores the reference to the Timer object in that local object.

.locals init ([0] class [mscorlib]System.Threading.Timer t,
           [1] bool CS$4$0000)
...
IL_0014:  newobj     instance void [mscorlib]System.Threading.Timer::.ctor(class [mscorlib]System.Threading.TimerCallback,
                                                                             object,
                                                                             int32,
                                                                             int32)
IL_0019:  stloc.0

Now once I figured out what was going on, fixing it was trivial. A using statement around the disposable Timer object keeps it in scope, and deterministically cleans it up when appropriate. (Of course, this is how the code should have been written in the first place, but look at the cool problem I got to figure out as a result of my lazy coding.)

REST Apis in C++

A recent project required me to call some REST apis on a web server from my client application, which was written in C++. What I had to do was very simple, to I first started looking at the WinINet and WinHTTP API families. This quickly turned into me wanting to harm myself or someone else, so I decided to continue searching for a library to help out. I looked at cURLpp and it seemed okay, but then I stumbled across the C++ REST SDK code named Casablanca. Since I have been on a modern C++ kick lately and have been enjoying some of the new things Microsoft has added to the compilers (from the C++11/14 standards), I decided to give this Microsoft-created open-source SDK a look.

My first impressions are that the syntax is a bit complicated and the documentation is very near to non-existent. I was reduced to combing through the provided sample apps to try to figure out what I needed to do, but it wasn’t horribly difficult. A few hours had me on the way to some simple REST calls. My first task was to call one of my APIs with a PUT verb, providing a JSON document in the request body. The following example illustrates this. One interesting thing here is the .get() call that comes on a lot of the SDK objects. This function is what waits for the asynchronous web call to complete, and then returns the result from the call.

http_client cli( U("http://localhost:8080/myservlet") );

//	PUT http://localhost:8080/myservlet/api/computerdata
//
ostringstream_t uri;
uri << U("/api/computerdata");

value body = value::object();
body[U("id")] = value::string( U("COMPUTER") );
body[U("version")] = value::string( U("1.1.1.3") );

http_response response = cli.request( methods::PUT, uri.str(), body.serialize(), U("application/json") ).get();
if ( response.status_code() == status_codes::OK &&
     response.headers().content_type() == U("application/json") )
{
   value json_response = response.extract_json().get();
   ucout << json_response.serialize() << endl;
}
else
{
   ucout << response.to_string() << endl;
}
}

The get call is even more simple, since it doesn't require creating a JSON request body...

http_client cli( U("http://localhost:8080/myservlet") );

//	GET http://localhost:8080/myservlet/api/computerdata/COMPUTER
//
ostringstream_t uri;
uri << U("/api/computerdata/COMPUTER");

http_response response = cli.request( methods::GET, uri.str() ).get();
if ( response.status_code() == status_codes::OK &&
     response.headers().content_type() == U("application/json") )
{
   value json_response = response.extract_json().get();
   ucout << json_response.serialize() << endl;
}
else
{
   ucout << response.to_string() << endl;
}
}

These examples are incredibly simple, but illustrate some of the most basic uses of this SDK. The SDK includes a lot more powerful and complex operations, such as PPL tasks, which is a model for creating asynchronous operations that is based on C++11 stuff. The SDK can be easily included in your package from Visual Studio by using the NuGet package manager to include Casablanca in your project. It will set up all the include paths, etc. for you. The code, samples, and what documentation there is can be found at casablanca.codeplex.com.

C++ Trivia: Writing functions that take a function as a parameter

So I was recently writing some code to test some performance characteristics of lists and vectors. This was prompted by my watching Bjarne Stroustrup’s keynote from Going Native 2012, where he explains yet another reason why vector should be the favored data structure: it often performs better than list, even when computer science common sense tells us that it should not. (See Bjarne Stroustrup: Why you should avoid Linked Lists (Youtube) for more about that.)

So I was using the Windows performance counter APIs, QueryPerformanceFrequency and QueryPerformanceCounter, and I was using them a LOT, since I was trying to measure what kind of impact each part of the testing had on the system. (E.g., how much relative time did it take to find the point at which we wanted to insert or delete an item vs. how much relative time did the actual insertion or deletion take.)

Since I have also been boning up on new language features in C++11/14, I decided that I wanted to figure out how to write a function that would take a lambda expression to make this all easy to use. I wanted to be able to call something like the following (which is completely trivial, but shows how I might want to use this functionality):

auto time = my_timer_function([](){ Sleep(500); });

Now the way I have done something like this in the past is to declare a function prototype, and then a function for each thing I want to measure, and then pass them into a function that takes a parameter of the type of the first prototype. If that sounds like it’s a bit hard to follow, that’s just because it’s a bit hard to follow.

Well in C++ they added a few new features to make this much simpler and easy to understand: Lambda expressions and the std::function type. Instead of defining a function prototype (which is always confusing syntax and I almost never get it 100% right the first time), you can use a parameter of type std::function, which is a template that takes the function signature as a parameter. So the following block defines a my_timer_function that takes in a function that looks like void fn( void ), and measures how long that function takes to complete.

auto my_timer_function( std::function fn ) -> double
{
	LARGE_INTEGER countsPerS = { 0 };
	LARGE_INTEGER beginElapsedCounts = { 0 };
	LARGE_INTEGER endElapsedCounts = { 0 };

	VERIFY( QueryPerformanceFrequency( &countsPerS ) );
	VERIFY( QueryPerformanceCounter( &beginElapsedCounts ) );

	//	Call the fn we are supposed to measure
	fn();

	VERIFY( QueryPerformanceCounter( &endElapsedCounts ) );
	return ( double( endElapsedCounts.QuadPart - beginElapsedCounts.QuadPart ) * 1.0 * 1000 / double( countsPerS.QuadPart ) );
}

The magic of the compiler makes it so you can pass this an actual function, or a lambda expression, or even a functor (object that can look like a function). So any of the following will work just fine…

// Lambda expression
auto time = my_timer_function( [](){ Sleep( 2000 ); } );

// Function
void MyFn()
{
	Sleep( 2000 );
}
...
auto time = my_timer_function( MyFn );

// Functor
struct MyFunctor
{
	void operator()()
	{
		Sleep( 2000 );
	}
};
...
auto time = my_timer_function( MyFunctor() );

C++ Trivia: Placement of const keyword

I was looking through some code examples last week and realized there was a gap in my understanding of the const keyword in C++. I have always written definitions like “const MyClass& c”, but in the code I was reading I was seeing a lot of “MyClass const & c”. Now intuitively I was thinking that this must be basically the same thing, but since I wasn’t entirely positive I decided to do a little research. It turns out that the use of the const keyword is a bit more complicated than I originally thought.

In the two examples I mentioned above, the declarations are functionally equivalent. No difference, just a matter of preference. For myself I find that putting the const keyword first is the most readable. But I understand that some developers prefer the second. I have read that people who prefer to read their declarations right-to-left particularly prefer this, as it reads like “c is a reference to a constant MyClass”, or something like that. I’m not really able to identify one as better than the other, so I will continue to use my own preference in my code, and if I am working with someone elses code who prefers the second form, I can conform to that as well.

But as I delved a bit further into this I started looking at pointer declarations. I have seen const dropped in seemingly random places in a declaration: “<const> int <const> * <const> x”. It turns out that the placement of the const keyword does make a different declaration in some of these cases. The const keyword modifies the item just to its left, unless it is the left-most thing in the expression, when it modifies the item to its right. So “const int * x” is identical to “int const * x”, but “int * const x” is a WHOLE different beast. The first two forms make the int a constant value, while the third makes the POINTER itself a constant value.

To illustrate, here is a sample function that declares the same kind of pointer (to an integer) but with varying styles of using const. In each case, the pointer is used to try to modify the pointed-to integer, and then the pointer itself is updated to point to a different underlying integer. The compiler fails some of these operations depending on the const keywords usage, each of the failures is noted with a comment.

int x = 1;
int x2 = 2;

//	Non-const pointer to an integer
int* p1 = &x;
*p1 = 42;
p1 = &x2;

//	Non-const Pointer to a const integer (two styles)
const int* p2a = &x;
*p2a = 42; // error C3892: 'p2a' : you cannot assign to a variable that is const
p2a = &x2;

int const * p2b = &x;
*p2b = 42; // error C3892: 'p2b' : you cannot assign to a variable that is const
p2b = &x2;

//	Const pointer to a non-const integer
int * const p3 = &x;
*p3 = 42;
p3 = &x2; // error C3892: 'p3' : you cannot assign to a variable that is const

//	Const pointer to a const integer
const int * const p4 = &x;
*p4 = 42; // error C3892: 'p4' : you cannot assign to a variable that is const
p4 = &x2; // error C3892: 'p4' : you cannot assign to a variable that is const

Finding Stuff in Modern C++

I recently had a co-worker ask me what the best way to find an object matching certain criteria in an STL vector. We talked about the “old-school” ways of doing that, like a for loop from zero to the number of items, or a for loop using the being and end iterators. Each of these would do the comparison and then break out of the loop if a match was found. Now these ways work just fine, but there are some newer, more “modern” methods of accomplishing the same things that use STL algorithms. In case you haven’t seen them before I thought I would share…

The first method uses a structure/class to wrap the comparison function and is usually called a “functor”. Essentially this is just an object that implements operator(), which allows you to treat the object somewhat like a function. (And you can pass it to other functions like STL algorithms.)

The second method uses the more modern “lambda” function syntax. It allows you to just define the comparison function right inline with the code. This is the one that I prefer because it keeps the comparison logic with the calling of the find algorithm. I think one of the most important aspects of good code is that it’s easy to follow and understand: you shouldn’t have to go skipping all over the code to figure out what some piece of code is doing.

Of course at first glance, a programmer who is unfamiliar with either of these methods is going to respond “huh?” But once you get used to seeing a functor or a lambda expression, they become pretty easy to read and understand.

So without further ado, on to the code, which demonstrates a very simple example of each method:
Continue reading “Finding Stuff in Modern C++”

Modern C++ in the Windows Kernel

In the past few months I have been playing around with Microsoft Visual Studio 2013 and trying out a bunch of the new C++ language features that are supported. Many of these were already in 2012, but I have been making a focused effort to make sure I understand all the new language features. I am loving the new stuff, and the way it makes my code easier to write.

Unfortunately, when it comes to using all these cool new features in daily work, I am a kernel developer, and most of this seems to be problematic in the kernel. Specifically, I wish there were a kernel-friendly Standard Template Library that we could use in kernel-land. There have long been problems documented with using C++ in the Windows kernel. The Windows compiler team has done some work in 2012 to add a /kernel switch that is supposed to help with some of this, but really it seems to do no more than make sure you don’t use C++ exceptions.

What I really believe the Windows kernel community needs, however, is a concerted effort to make all the modern C++ language features including STL available to kernel developers. I believe that it really impacts the quality of the driver ecosystem to have every single developer writing their own lists and list processing code, and trying to create their own wrappers for things like locking, IPC, etc.

One of the purposes of the new language features is to make it easy for developers to write good code, and to enable them to do the right thing. Kernel mode programming makes it difficult to do the right thing. And the worst part about that is that when you do the wrong thing in kernel mode, you crash the machine altogether (as opposed to user mode where you simply crash your own process and the system continues merrily on its way).

I understand that there are inherent difficulties in kernel mode programming, and things that you don’t have to worry about in normal C++ code, such as controlling what memory gets paged vs. non-paged, or worrying about code that can only run at certain IRQLs. So I believe that in addition to needing modern C++ and STL, we would need some (probably Microsoft-specific) extensions to help deal with these extra little problems. For example, when you declare a template, the code gets generated when the template is actually used. If the template is used in a non-paged function, and in a paged function, then we probably need a way to deterministically say how the template code should be generated.

This is stuff that can all be done, and Microsoft compiler guys are GOOD at it. They could make our lives so much easier: not just for programmers, but for consumers who are sick of having drivers that crash their machines. Come on guys… do it for freedom. Do it for the children. Just do it. Please?