Blog

How The Hell Things Work: How Windows Debugger Finds Symbols For Your Code

Every once in a while, when I am debugging something at work, I run into the problem of missing symbols. All of my products’ build symbols should be copied up to a symbol server, which makes everything “just work.” When it doesn’t just work, it always takes a manual process of figuring out how the hell symbols are supposed to work… again.

The basic scenario is that a debugger needs to use information it has available (e.g., in a crash dump file) to find the pdb file that matches the code you are debugging. Often, not a lot of information is available to the debugger, but where we are going to start is the header for the image. Let us first examine the module information…

0:017> lmvm mysvc
Browse full module list
start end module name
00007ff6`1a070000 00007ff6`1a2c2000 mysvc T (no symbols)
   Loaded symbol image file: mysvc.exe
   Image path: mysvc.exe
   Image name: mysvc.exe
   Browse all global symbols functions data
   Timestamp: Sun May 15 23:47:13 2022 (6281E561)
   CheckSum: 00254124
   ImageSize: 00252000
   File version: 2.9.8171.14983
   Product version: 2.9.8171.14983
   File flags: 0 (Mask 3F)
   File OS: 40004 NT Win32
   File type: 1.0 App
   File date: 00000000.00000000
   Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
   Information from resource tables:

Notice the timestamp and image size here. The debugger combines these two values into the key it uses to locate the indexed image file on the symbol server. To see this in action, we can configure the debugger to point to an empty symbol server path and turn on noisy symbol output…

0:017> .sympath SRV*D:\dust
DBGHELP: Symbol Search Path: srv*d:\dust
Symbol search path is: SRV*D:\dust
Expanded Symbol search path is: srv*d:\dust
************* Path validation summary **************
Response Time (ms) Location
Deferred SRV*D:\dust
0:017> !sym noisy
0:017> .reload /f mysvc.exe
SYMSRV: BYINDEX: 0x7
   d:\dust
   mysvc.exe
   6281E561252000
SYMSRV: UNC: d:\dust\mysvc.exe\6281E561252000\mysvc.exe - path not found
SYMSRV: UNC: d:\dust\mysvc.exe\6281E561252000\mysvc.ex_ - path not found
SYMSRV: UNC: d:\dust\mysvc.exe\6281E561252000\file.ptr - path not found
SYMSRV: RESULT: 0x80070003
DBGHELP: DBGENG: mysvc.exe - Image mapping disallowed by non-local path.
Unable to load image mysvc.exe, Win32 error 0n2
DBGENG: mysvc.exe - Partial symbol image load missing image info
DBGHELP: No header for mysvc.exe. Searching for dbg file
DBGHELP: .\mysvc.dbg - file not found
DBGHELP: .\exe\mysvc.dbg - path not found
DBGHELP: .\symbols\exe\mysvc.dbg - path not found
DBGHELP: mysvc.exe missing debug info. Searching for pdb anyway
DBGHELP: Can't use symbol server for mysvc.pdb - no header information available
DBGHELP: mysvc.pdb - file not found
*** WARNING: Unable to verify timestamp for mysvc.exe
DBGHELP: mysvc- no symbols loaded

We can see that the debugger tries to find the indexed binary file based on a concatenated image timestamp and image size. If it can’t find this (i.e., if the binary itself is not indexed), then the debugger is not going to be able to figure out the hash to look up your pdb. So next, let us manually add the binary to our symbol store and see what the next step looks like.

D:\dev> & symstore.exe add /f mysvc.exe /s D:\dust /t ProductName
Finding ID... 0000000001

SYMSTORE: Number of files stored = 1
SYMSTORE: Number of errors = 0
SYMSTORE: Number of files ignored = 0
0:017> .reload /f mysvc.exe
SYMSRV: BYINDEX: 0x8
   d:\dust
   mysvc.exe
   6281E561252000
SYMSRV: PATH: d:\dust\mysvc.exe\6281E561252000\mysvc.exe
SYMSRV: RESULT: 0x00000000
DBGHELP: d:\dust\mysvc.exe\6281E561252000\mysvc.exe - OK
DBGENG: d:\dust\mysvc.exe\6281E561252000\mysvc.exe - Mapped image memory
SYMSRV: BYINDEX: 0x9
   d:\dust
   mysvc.pdb
   B8FA9DF507C542C097244F72C28E50EE1
SYMSRV: UNC: d:\dust\mysvc.pdb\B8FA9DF507C542C097244F72C28E50EE1\mysvc.pdb - path not found
SYMSRV: UNC: d:\dust\mysvc.pdb\B8FA9DF507C542C097244F72C28E50EE1\mysvc.pd_ - path not found
SYMSRV: UNC: d:\dust\mysvc.pdb\B8FA9DF507C542C097244F72C28E50EE1\file.ptr - path not found
SYMSRV: RESULT: 0x80070003
DBGHELP: mysvc.pdb - file not found
DBGHELP: D:\a\_work\1\s\bld\x64\Release\mysvc.pdb - file not found
DBGHELP: mysvc- no symbols loaded
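
As a rough illustration of the image lookup shown in the log, the sketch below rebuilds the path from the two header values. The function names imageKey and storePath are made up for this example. The formatting (timestamp zero-padded to eight hex digits, size appended without padding, both uppercase) is inferred from the output above, not taken from a documented contract.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Builds the store key for a PE image: the header timestamp as eight hex
// digits followed by the image size in hex.
// Assumption: padding and letter case are inferred from the SYMSRV log lines.
std::string imageKey(std::uint32_t timeDateStamp, std::uint32_t sizeOfImage)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%08X%X",
                  static_cast<unsigned>(timeDateStamp),
                  static_cast<unsigned>(sizeOfImage));
    return buf;
}

// Joins store root, file name, and key in the layout the SYMSRV lines show:
// <root>\<file>\<key>\<file>
std::string storePath(const std::string& root,
                      const std::string& file,
                      const std::string& key)
{
    return root + "\\" + file + "\\" + key + "\\" + file;
}
```

Feeding in the timestamp 6281E561 and size 252000 from the lmvm output gives the same d:\dust\mysvc.exe\6281E561252000\mysvc.exe path that the debugger probed.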

Ok, so the debugger was able to find the indexed binary, and somehow got this magical hash, B8FA9DF507C542C097244F72C28E50EE1, that it is trying to use to find the pdb file. To see where this comes from, we can examine the headers of the binary file using dumpbin.exe. For brevity, I will filter the output to just find the section I am interested in…

dumpbin /headers mysvc.exe | Select-String "Format: RSDS"

6281E561 cv            42 001FAA10   1F9610    Format: RSDS, {B8FA9DF5-07C5-42C0-9724-4F72C28E50EE}, 1, D:\a\_work\1\s\bld\x64\Release\mysvc.pdb

Here the exe file has embedded a link to the debug information. The guid here is taken as a pure string of hex digits (strip out the dashes and braces), the age (the 1 that follows the guid) is appended, and the result is used as part of the file path to find the pdb. Again, to confirm, we will manually add the pdb file to our symbol store and retry the reload command.

D:\dev> & symstore.exe add /f mysvc.pdb /s D:\dust /t ProductName
Finding ID... 0000000002

SYMSTORE: Number of files stored = 1
SYMSTORE: Number of errors = 0
SYMSTORE: Number of files ignored = 0
0:017> .reload /f mysvc.exe
SYMSRV: BYINDEX: 0xA
   d:\dust
   mysvc.exe
   6281E561252000
SYMSRV: PATH: d:\dust\mysvc.exe\6281E561252000\mysvc.exe
SYMSRV: RESULT: 0x00000000
DBGHELP: d:\dust\mysvc.exe\6281E561252000\mysvc.exe - OK
DBGENG: d:\dust\mysvc.exe\6281E561252000\mysvc.exe - Mapped image memory
SYMSRV: BYINDEX: 0xB
   d:\dust
   mysvc.pdb
   B8FA9DF507C542C097244F72C28E50EE1
SYMSRV: PATH: d:\dust\mysvc.pdb\B8FA9DF507C542C097244F72C28E50EE1\mysvc.pdb
SYMSRV: RESULT: 0x00000000
DBGHELP: mysvc- private symbols & lines
d:\dust\mysvc.pdb\B8FA9DF507C542C097244F72C28E50EE1\mysvc.pdb
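
The pdb key can be sketched the same way. This is a minimal illustration, assuming the key is just the GUID's hex digits followed by the age in hex, which is what the log and the dumpbin line suggest. The name pdbKey is hypothetical.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <string>

// Turns the RSDS GUID text and age from the debug directory into the
// directory name used under <store>\<name>.pdb\.
// Assumption: uppercase hex digits, age appended in hex without padding.
std::string pdbKey(const std::string& guidText, std::uint32_t age)
{
    std::string key;
    for (unsigned char c : guidText)
    {
        // Keep only the hex digits; braces and dashes are dropped.
        if (std::isxdigit(c))
        {
            key += static_cast<char>(std::toupper(c));
        }
    }

    char ageHex[16];
    std::snprintf(ageHex, sizeof(ageHex), "%X", static_cast<unsigned>(age));
    return key + ageHex;
}
```

With the GUID {B8FA9DF5-07C5-42C0-9724-4F72C28E50EE} and age 1, this produces B8FA9DF507C542C097244F72C28E50EE1, matching the directory name in the SYMSRV output.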

Now remember in elementary school how they used to teach you all the basic ways to do things that mostly worked, but then after all that they would just show you the “standard algorithm” that just always works? Yeah, well apparently, I do the same thing, so here is a nifty little debugger command that just kind of does everything above…

0:017> !lmi mysvc
Loaded Module Info: [mysvc] 
         Module: mysvc
   Base Address: 00007ff61a070000
     Image Name: mysvc.exe
   Machine Type: 34404 (X64)
     Time Stamp: 6281e561 Sun May 15 23:47:13 2022
           Size: 252000
       CheckSum: 254124
Characteristics: 22  
Debug Data Dirs: Type  Size     VA  Pointer
             CODEVIEW    42, 1faa10,  1f9610 RSDS - GUID: {B8FA9DF5-07C5-42C0-9724-4F72C28E50EE}
               Age: 1, Pdb: D:\a\_work\1\s\bld\x64\Release\mysvc.pdb
           VC_FEATURE    14, 1faa54,  1f9654 [Data not mapped]
                 POGO   414, 1faa68,  1f9668 [Data not mapped]
    Symbol Type: DEFERRED - No error - symbol load deferred
    Load Report: no symbols loaded

Well, that about does it for the basics of how symbol resolution works. In a future article I will address some more advanced topics, such as indexing pointers to files and compressed files.

Debug Break on Access

I had to diagnose a crash in my driver code recently. Every time it happened, it was because a particular pointer value inside a structure had been decremented incorrectly. Decrementing the actual pointer is something that should never happen, and I couldn’t figure out how the hell it was happening. I had tried breakpoints in all the code that I thought was potentially problematic, and turned up nothing. Finally I came up with the following gem…

ba w 8 poi(frxdrvvt!ActiveRedirections)+0x78 ".if (poi(poi(frxdrvvt!ActiveRedirections)+0x78) % 2 == 0) { g }"

Basically, it says to break on write access to the pointer in question and, if the new value is even, just continue. (A valid pointer here is aligned on an 8-byte boundary, so a stray decrement leaves it odd.) It broke on the decrement I suspected was there, and it was immediately obvious what was happening. (It was my new code, but it was code I never suspected of being problematic because it was so simple, lol.)
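
The condition in that breakpoint boils down to a parity test. Here is a tiny sketch of the same check in C++, with hypothetical function names, plus a stricter variant that tests the full 8-byte alignment:

```cpp
#include <cstdint>

// Mirrors the breakpoint condition: an even value is treated as fine
// (the debugger resumes), an odd value stops execution.
bool shouldBreak(std::uintptr_t pointerValue)
{
    return pointerValue % 2 != 0;
}

// Stricter variant (an assumption, not what the original command used):
// also catches bad writes that leave the value even but not 8-byte aligned.
bool shouldBreakStrict(std::uintptr_t pointerValue)
{
    return pointerValue % 8 != 0;
}
```

The parity test was enough here because the bug was a decrement by one. A corruption that moved the pointer by two or four would slip past it, which is where the stricter check would help.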

Initialization Order in C++ Matters

So last week I ran into a bug in my code that exposed an aspect of the C++ language that I was familiar with intellectually, but had never run into before. The initialization order of member variables in a class is the order of their declaration in the class. Basically this means that if your class declares a, then b, then c, they will always get initialized in that order, even if you have a constructor that initializes them in a different order.
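
The rule is easy to observe with a small tracing type. This is a minimal sketch, and all of the names in it (initLog, Tracer, Widget) are invented for the illustration. One member is initialized in the constructor's initializer list and the other by a default member initializer in the class body, yet declaration order alone decides which runs first.

```cpp
#include <string>
#include <vector>

// Records the order in which members are constructed.
std::vector<std::string>& initLog()
{
    static std::vector<std::string> log;
    return log;
}

// Appends its name to the log when constructed.
struct Tracer
{
    explicit Tracer(const char* name) { initLog().push_back(name); }
};

struct Widget
{
    Tracer declaredFirst;                       // initialized in the constructor's list
    Tracer declaredSecond{ "declaredSecond" };  // default member initializer

    // declaredFirst is still constructed first, because it is declared first.
    Widget() : declaredFirst{ "declaredFirst" } {}
};
```

After constructing a Widget, initLog() holds "declaredFirst" followed by "declaredSecond".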

In my case I was making a class that would call a function every so often, basically duplicating the functionality of a Win32 timer object. I didn’t want to pull Boost libraries in, which appears to be the best possible way to accomplish this. Instead, I wrote the following class (roughly). The idea was that it would create a thread that would call my function every 1 second. When the containing object gets destructed, it would signal the thread to end and then wait for it. This is not the best async code, but it was simple enough and functional for what I needed.

#include <chrono>
#include <thread>

class Timer
{
	std::thread timerThread_;
	bool destructing_{ false };

public:
	int ticks_{ 0 };

	Timer() : timerThread_{ [this]() { this->timerThread(); } }
	{
	}

	~Timer()
	{
		destructing_ = true;
		timerThread_.join();
	}

	void timerThread()
	{
		while (!destructing_)
		{
			std::this_thread::sleep_for(std::chrono::seconds(1));
			tick();
		}
	}

	void tick()
	{
		++ticks_;
	}
};

So after writing this code I found that my tick() function was never being called. In a debugger I quickly found that destructing_ was set to true when I was expecting it to be false. It didn’t take me long to realize what had happened. Reordering the bool and the std::thread declarations in the class fixed the problem. This was actually a rule I didn’t know about until recently, and I was glad I had listened to a CppCon talk that mentioned it, because it would have taken me a lot longer to figure out otherwise.

The main reason that I didn’t notice it initially, even being aware of the ordering rules, was that in my non-example code, the constructor code was in a .cpp file, and the member initialized bool was declared in the .h file. So the ordering is somewhat unapparent when viewing the .cpp file.
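
For comparison, here is a sketch of what the reordered class might look like. It is not the exact code from my project. The name FixedTimer, the configurable interval, and the ticks() accessor are additions for the example, and the flag and counter are std::atomic because a plain bool or int shared between two threads is a data race.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

class FixedTimer
{
    // Declared before the thread, so these are initialized before it starts.
    std::atomic<bool> destructing_{ false };
    std::atomic<int> ticks_{ 0 };
    std::chrono::milliseconds interval_;
    std::thread timerThread_;  // declared last: starts only after the members above exist

public:
    // The interval parameter is hypothetical; it defaults to the 1 second used above.
    explicit FixedTimer(std::chrono::milliseconds interval = std::chrono::seconds(1))
        : interval_{ interval }, timerThread_{ [this]() { run(); } }
    {
    }

    ~FixedTimer()
    {
        // Signal the thread to stop, then wait for it to finish.
        destructing_ = true;
        timerThread_.join();
    }

    FixedTimer(const FixedTimer&) = delete;
    FixedTimer& operator=(const FixedTimer&) = delete;

    int ticks() const { return ticks_.load(); }

private:
    void run()
    {
        while (!destructing_)
        {
            std::this_thread::sleep_for(interval_);
            ++ticks_;
        }
    }
};
```

Keeping the std::thread as the last declared member is a simple convention that makes this class of bug hard to reintroduce.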

Enabling Privileges in a Thread-safe Manner

So in basically every job I have had in the last twenty years, I have had to write some code to enable Windows privileges, such as the backup or restore privileges, which are required to import or export registry keys. It’s a pretty common operation and there is a convenient example on MSDN that shows how to do it. I believe that every instance of writing this code I have seen followed the same pattern: Get a thread token, or a process token if there is no thread token, then adjust the privilege on that token. Then undo the whole thing when you no longer need the privilege.

But recently I encountered a situation where there is large-scale threading going on, and this technique really backfires. The threads quickly start to disable privileges that another thread needs, etc. I started to look into a better solution, and realized that I needed a thread token for each thread so that the privileges could be adjusted on that specific thread only. I knew that the way to get a thread token was to impersonate a user, but I didn’t need to be impersonating another user in this case.

Enter the ImpersonateSelf Windows API, which is specifically designed for this situation. The API creates a thread token for the current thread, which then means that the enable privilege code can be safely called against the thread token (NOT the process token). This is a pretty straightforward process, but based on my experience I don’t think it is commonly done correctly. The code that I have seen everywhere has definitely not been thread-safe.
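
A minimal sketch of that approach follows. The function name enablePrivilegeOnThisThread is made up, error handling is reduced to a bool, and the non-Windows branch exists only so the snippet compiles elsewhere. Treat it as an outline, not production code.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Enables one privilege (for example L"SeBackupPrivilege") on the calling
// thread only, leaving the process token untouched. Returns true on success.
bool enablePrivilegeOnThisThread(const wchar_t* privilegeName)
{
#ifdef _WIN32
    // Give this thread its own copy of the process token.
    if (!ImpersonateSelf(SecurityImpersonation))
    {
        return false;
    }

    HANDLE token = nullptr;
    if (!OpenThreadToken(GetCurrentThread(),
                         TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY,
                         TRUE, &token))
    {
        RevertToSelf();
        return false;
    }

    TOKEN_PRIVILEGES tp{};
    tp.PrivilegeCount = 1;
    tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

    // AdjustTokenPrivileges can succeed without assigning the privilege,
    // so the last error is checked as well.
    bool ok = LookupPrivilegeValueW(nullptr, privilegeName, &tp.Privileges[0].Luid)
           && AdjustTokenPrivileges(token, FALSE, &tp, sizeof(tp), nullptr, nullptr)
           && GetLastError() == ERROR_SUCCESS;

    CloseHandle(token);
    if (!ok)
    {
        RevertToSelf();
    }
    return ok;
#else
    // Not applicable outside Windows.
    (void)privilegeName;
    return false;
#endif
}
```

When the thread no longer needs the privilege, calling RevertToSelf() drops the thread token, so nothing has to be undone on a token that other threads share.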

Catching user-mode exceptions in a kernel-mode debugger

I learned a new favorite kernel debugger trick tonight. I regularly have a kernel debugger attached while working on my driver, but tonight experienced a crash in my user mode service. Not wanting to set up a new debugger inside the vm, I googled around and came up with the following:

!gflag +soe

This windbg command makes all exceptions go to the kernel mode debugger first. Voila!

Deriving Unique Key Container Name for RSACryptoServiceProvider

I ran across a problem this week where I needed to get the filename where an RSA encryption key was stored. These files are stored (for machine-scope keys) in C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys, and have a filename that looks like a hash value followed by a GUID. This is easy to find if you have access to the key:

var csp = new CspParameters
{
    Flags = CspProviderFlags.NoPrompt | 
            CspProviderFlags.UseMachineKeyStore | 
            CspProviderFlags.UseExistingKey,
    KeyContainerName = "dev.dev.domo.com"
};

var crypto = new RSACryptoServiceProvider(csp);

Console.WriteLine(csp.KeyContainerName);
Console.WriteLine(crypto.CspKeyContainerInfo.UniqueKeyContainerName);

But in my case I didn’t have access to the keyfile, as it had been created by another user and ACLed. The algorithm for deriving these filenames is not too difficult… It turns out you can take the container name, convert it to lowercase, add an extra null byte, compute the MD5 hash, and then convert the MD5 hash to a string in DWORD-sized chunks. Then you append the machine guid, which can be found in the registry.

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using Microsoft.Win32;

public static class RsaCryptoServiceProviderExtensions
{
    public static string GetUniqueKeyContainerName(string containerName)
    {
        using (var rk = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\Cryptography"))
        {
            if (rk == null)
            {
                throw new Exception("Unable to open registry key");
            }

            var machineGuid = (string)rk.GetValue("MachineGuid");

            using (var md5 = MD5.Create())
            {
                // Lowercase the name and append a single null byte before hashing.
                var containerNameArray = Encoding.ASCII.GetBytes(containerName.ToLowerInvariant());
                var originalLength = containerNameArray.Length;
                Array.Resize(ref containerNameArray, originalLength + 1);

                // Format the 16-byte hash as four DWORDs of eight hex digits each.
                var hash = md5.ComputeHash(containerNameArray);
                var stringBuilder = new StringBuilder(32);
                using (var binaryReader = new BinaryReader(new MemoryStream(hash)))
                {
                    for (var i = 0; i < 4; i++)
                    {
                        stringBuilder.Append(binaryReader.ReadInt32().ToString("x8"));
                    }
                }

                // The file name is the hash string, an underscore, then the machine GUID.
                stringBuilder.Append("_" + machineGuid);

                return stringBuilder.ToString();
            }
        }
    }
}

.NET AddIn Framework With Backwards Compatibility

We’ve been working on implementing some plugin functionality in our product and have been using the .NET AddIn Framework. Unfortunately it’s fairly complicated and there’s not a lot of great examples out there. But my co-worker Justin just posted an interesting blog about some of what we’ve been dealing with.

.NET AddIn Framework With Backwards Compatibility

Using an enumeration from a runtime-loaded assembly

I recently had a co-worker who needed to instantiate and use a class from an assembly loaded at runtime. The code couldn’t reference the assembly directly for various reasons. This was accomplished relatively easily until he needed to assign the value of an enumerated type. So take the following class definition.

namespace DynamicAssembly
{
    public class MyClass
    {
        public enum MyEnum { ValueA, ValueB, ValueC }

        public MyEnum TheEnumValue { get; set; }
    }
}

From a project, the goal was to load the above assembly dynamically, instantiate a MyClass variable and then set TheEnumValue = MyEnum.ValueB. Really simple in normal code… a little more convoluted in dynamic runtime code. The solution I came up with is the following:

static void Main(string[] args)
{
    var p = Path.GetFullPath(@"..\..\..\DynamicAssembly\bin\Debug\DynamicAssembly.dll");
    var a = Assembly.LoadFile(p);

    var classType = a.GetType("DynamicAssembly.MyClass");
    dynamic classInstance = Activator.CreateInstance(classType);

    var enumType = a.GetType("DynamicAssembly.MyClass+MyEnum");
    var enumValues = enumType.GetEnumNames();
    var enumIndex = Array.IndexOf(enumValues, "ValueB");
    var enumValue = enumType.GetEnumValues().GetValue(enumIndex);

    classInstance.TheEnumValue = (dynamic)enumValue;
}

I would love to hear from you if you know of a better way to accomplish this.

Moneyball for Hiring Software Developers

Yesterday at Domo’s Domopalooza conference I had the opportunity to listen to Billy Beane, the GM of the Oakland Athletics MLB organization, and the subject of the 2003 book and 2011 film Moneyball. Beane spoke about how their organization used hard data to produce winning baseball teams with a drastically lower budget than some of the other winning MLB teams. I found the session fascinating and it got me thinking about the applications in hiring great software developers.

I’ve often thought that the way we hire software developers is not very effective. We try to gauge a developer’s talent by giving them little problems to solve: sometimes these are puzzles and brain-teasers, though these seem to be growing less common, and other times they are little programming problems to be solved on a whiteboard. While these can be fun, and even instructive, I don’t think they result in a good hiring decision.

I am intrigued by the idea that instead of little exercises (similar to a scout watching a baseball player in high school, or a tryout), we need to have some hard data to base our decisions on. The big difference between professional sports and software development, though, is that athletes have their performance constantly measured and recorded. MLB is full of stats: wins, losses, on base hits, home runs, stolen bases, etc. With that data you can find out interesting things about the game, such as the fact that stolen bases contribute very little statistically to whether a game is won or lost.

While software developers don’t have performance measurements that are public, there are ways of measuring that performance: bugs written, lines of code produced, bugs resolved, etc. If we had statistics like these that were public, then maybe we would have a better way of finding and selecting the right developers for our projects. Even without public data, within our own organizations we could keep track of these statistics and use them to manage our employees after they were hired.

But many software developers are now starting to become involved in social programming. Many of us are participating in code retreat days, programmer meetup groups, hack nights, open source projects, etc. Quite a few of us are even putting our side projects out in the open on sites like GitHub or CodePlex. What if we could process all that data and get statistics about what good programmers look like, and find ways to measure a programmer’s talent in a real way?

Of course, there are a lot of intangibles that still probably need to be interviewed for. It’s not worth hiring somebody that no one gets along with just because they have some skills. But if you could weed out the people who don’t have what you’re looking for, wouldn’t you be miles ahead in the interview process?