Dealing with Modules, Assemblies and Types with CLR Profiling APIs

Introduction

In the first post of this series dedicated to the CLR Profiling API, you saw how to get a FunctionID each time a managed method is executed in a .NET application. As David Broman (the source of most of the profiling implementation details at Microsoft) explains, a FunctionID is a pointer to an internal data structure of the CLR called a MethodDesc. For us, it is just an opaque value that can be passed to different CLR APIs. So what if you would like to know the name of the method behind this FunctionID?

Contrary to what you might think, this first question is not an easy one, especially if you would like to get the complete signature of the method, like the one shown in the Visual Studio Call Stack panel.

You will have to get the module name (i.e. the assembly where the method type is defined), the type name, the method name, and the type and name of each of its parameters.

This post deals with the notions of module, assembly and type in addition to introducing the .NET Metadata API.

Identifying the module and assembly

I’m sure that most of you know what an assembly is: this is what gets generated when you compile a Class Library in Visual Studio. Easy answer. However, .NET (unlike Visual Studio) supports multi-module assemblies, where a single assembly is made of several “modules”. Each module can contain types and resources, and the assembly contains the manifest listing all the modules that make up the assembly.

This is why the profiling API allows you to get both the assembly and the module. Let’s start with ICorProfilerInfo::GetFunctionInfo, which returns the ClassID, the ModuleID and the metadata token for a given FunctionID.

Now that you have a ModuleID, you can call ICorProfilerInfo::GetModuleInfo to get its name, load address and assembly. The usage pattern of this API is common in COM: first you call it to get the size of the buffer to copy the name and then you call it a second time with the newly allocated buffer:
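The two-call pattern can be sketched without a running CLR. The code below is only an illustration under stated assumptions: NameGetter, ReadName and MakeFakeGetter are hypothetical names, the getter mimics just the name-related parameters of an API such as GetModuleInfo, and the reported length is assumed to include the terminating null.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical shape of the "name" part of an API such as GetModuleInfo:
// (buffer capacity, out: required length including the null, buffer)
// returning a negative value on failure, like an HRESULT.
using NameGetter = std::function<int32_t(uint32_t, uint32_t*, wchar_t*)>;

// Two-call pattern: ask for the size, allocate, then ask for the content.
inline bool ReadName(const NameGetter& getter, std::wstring& name)
{
    uint32_t needed = 0;
    if (getter(0, &needed, nullptr) < 0 || needed == 0)
        return false;

    std::vector<wchar_t> buffer(needed, L'\0');
    uint32_t written = 0;
    if (getter(needed, &written, buffer.data()) < 0)
        return false;

    buffer.back() = L'\0';        // defensive: guarantee termination
    name.assign(buffer.data());   // copies up to the first null
    return true;
}

// Stand-in for the runtime, only there to exercise ReadName without a CLR.
inline NameGetter MakeFakeGetter(const std::wstring& value)
{
    return [value](uint32_t capacity, uint32_t* required, wchar_t* buffer) -> int32_t {
        *required = static_cast<uint32_t>(value.size() + 1);
        if (buffer != nullptr && capacity > 0) {
            const std::size_t count = std::min<std::size_t>(capacity - 1, value.size());
            std::copy_n(value.begin(), count, buffer);
            buffer[count] = L'\0';
        }
        return 0;
    };
}
```

In a real profiler, the getter would presumably be a lambda that forwards its three parameters to GetModuleInfo along with the ModuleID, the load address and the AssemblyID output parameters. This assumes a platform such as Windows where WCHAR is wchar_t.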

Note that the module name is the full path name of the module file.

Here is the code that calls ICorProfilerInfo::GetAssemblyInfo to get the assembly name now that you have the AssemblyID:

The assembly name does not contain the file extension such as .dll or .so.

ID or Token: it depends on which profiling API to use

It is important to discuss what kind of information you get from the different profiling APIs. Like FunctionID, ClassID and ModuleID are opaque pointers to CLR internal data structures. They are used by the runtime to map into memory the metadata generated by the compiler. The metadata identifiers are usually referred to as “tokens” and the mdToken type simply stands for “metadata token”. Unlike the different xxxID types, whose values change each time the code runs, the metadata tokens stay the same because they come from the compiled assembly. While debugging, it is good to be able to compare the tokens you get against their corresponding values in an assembly. As an example, here is what you get with ILSpy while browsing the metadata:

The kind of metadata is encoded in the top byte of the token, shown as its first two hexadecimal digits, so it is easy to see what you are manipulating. The 06 prefix, for example, tells you that you are dealing with a method.
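To make the encoding concrete, here is a minimal sketch that splits a token into its table and row parts. The constant values are meant to mirror the CorTokenType enumeration of corhdr.h; the names kTypeDef, TableOf, RowOf and DescribeToken are hypothetical and only cover a few common kinds.

```cpp
#include <cstdint>
#include <string>

// Values mirroring the CorTokenType enumeration in corhdr.h (subset only).
constexpr uint32_t kTypeRef    = 0x01000000;
constexpr uint32_t kTypeDef    = 0x02000000;
constexpr uint32_t kFieldDef   = 0x04000000;
constexpr uint32_t kMethodDef  = 0x06000000;
constexpr uint32_t kParamDef   = 0x08000000;
constexpr uint32_t kMemberRef  = 0x0a000000;
constexpr uint32_t kTypeSpec   = 0x1b000000;
constexpr uint32_t kMethodSpec = 0x2b000000;

// The top byte identifies the metadata table...
constexpr uint32_t TableOf(uint32_t token) { return token & 0xff000000; }
// ...and the lower three bytes are the row index in that table.
constexpr uint32_t RowOf(uint32_t token) { return token & 0x00ffffff; }

inline std::string DescribeToken(uint32_t token)
{
    switch (TableOf(token)) {
        case kTypeRef:    return "TypeRef";
        case kTypeDef:    return "TypeDef";
        case kFieldDef:   return "FieldDef";
        case kMethodDef:  return "MethodDef";
        case kParamDef:   return "ParamDef";
        case kMemberRef:  return "MemberRef";
        case kTypeSpec:   return "TypeSpec";
        case kMethodSpec: return "MethodSpec";
        default:          return "other";
    }
}
```

With these helpers, a token such as 0x06000001 is reported as a MethodDef with row 1, which is the kind of value you can compare against what a metadata browser displays.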

Instead of ICorProfilerInfo, you need to use IMetaDataImport to access information behind the metadata tokens. Since the metadata is bound to a given module, you have to call ICorProfilerInfo::GetModuleMetaData to get the implementation corresponding to a given ModuleID.

For the rest of the series, I will do my best to present which profiling/metadata API to use for what purpose. And in some cases, you will need both.

Identifying the type

After the module details, let’s see what we can get for the type that implements a given FunctionID. For one of my tests, I defined the following C# generic type:

It can be used like the following:

Why am I starting with a generic type? Because it shows that generics were added after the initial profiling API shipped and are not that well integrated. Basically, the first iteration, ICorProfilerInfo, did not deal with generics but the second one, ICorProfilerInfo2, does.

But first, let’s summarize a few basics about generics. When you define a generic type and generic methods, as I did with GenericPublicClass, the C# compiler generates the metadata for the generic type definition, which acts as a template. The generic type parameters (K and V in my case) are placeholders that will be replaced by generic type arguments to produce a constructed generic type.

The important part to understand for our purpose is that the metadata only contains generic type definitions.

The name stored in the metadata ends with the ` character followed by the number of generic type parameters: with its two parameters K and V, GenericPublicClass is stored as GenericPublicClass`2. This is what you get when you call GetType().Name on a generic instance in C#.

As shown earlier, ICorProfilerInfo::GetFunctionInfo is used to get the ClassID of the type implementing the given FunctionID. Unfortunately, for a generic type, it returns S_OK but the ClassID you get is 0. In that case, you have to call ICorProfilerInfo2::GetFunctionInfo2 instead:

You have the FunctionID but not the COR_PRF_FRAME_INFO… You need to call ICorProfilerInfo3::GetFunctionEnter3Info to get it from the COR_PRF_ELT_INFO given by the enter stub. Here is the final code to get a ClassID for a generic type:

Here is a summary of the relationships between the different IDs with the corresponding APIs to call:

From a ClassID to a class name

It is time to enter a complicated part of the story: how to get the “name” of the type that hides behind a ClassID. As you might guess, the first step is to figure out whether it is a generic type and, if so, what its type arguments are. You have to call ICorProfilerInfo2::GetClassIDInfo2 with the ClassID to get the metadata token of the type, the number of type arguments and the ClassID of each type argument, if any. As usual with this kind of API, a first call is needed to get the number of type arguments so you can allocate a correctly sized array of ClassID. The second call fills the newly allocated array:

Since you obtained a metadata token, you will need the IMetaDataImport of the module where the type is defined to get details such as… its name. IMetaDataImport2 is required to enumerate the generic type parameters:

Getting the type “name” is done by a call to IMetaDataImport::GetTypeDefProps, passing the metadata token corresponding to the ClassID:

But before jumping into the name, you need to take care of the case where you are dealing with a nested type (i.e. a type defined in another type). Checking the flags parameter is exactly what you need:
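As a minimal sketch of that check: the visibility of a type lives in the low three bits of the flags returned by GetTypeDefProps, and every nested visibility has a value of 2 or more. The constants below are meant to mirror tdVisibilityMask and tdNestedPublic from corhdr.h, which also ships an IsTdNested macro for this purpose; IsNestedType is a hypothetical name used only here.

```cpp
#include <cstdint>

// Values mirroring the CorTypeAttr enumeration in corhdr.h.
constexpr uint32_t kVisibilityMask = 0x00000007; // tdVisibilityMask
constexpr uint32_t kNestedPublic   = 0x00000002; // tdNestedPublic, lowest nested value

// True when the visibility bits describe any of the nested visibilities
// (nested public, private, family, assembly, and the two combinations).
constexpr bool IsNestedType(uint32_t typeDefFlags)
{
    return (typeDefFlags & kVisibilityMask) >= kNestedPublic;
}
```

Masking first matters because the flags also carry unrelated bits (layout, semantics, and so on) above the visibility bits.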

A call to IMetaDataImport::GetNestedClassProps returns the metadata token of the enclosing type. You then recursively call the GetTypeName method that we are implementing, which also handles types nested several levels deep.
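The recursion itself can be illustrated without the metadata API. In the sketch below, FakeTypeDef, FakeMetadata and BuildTypeName are hypothetical: the map stands in for what GetTypeDefProps and GetNestedClassProps would return, and the "+" separator is an assumption borrowed from how reflection formats nested type names.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the data returned by the metadata API.
struct FakeTypeDef {
    std::string name;
    uint32_t enclosingToken;   // 0 when the type is not nested
};

using FakeMetadata = std::map<uint32_t, FakeTypeDef>;

// Builds "Outer+Middle+Inner" by walking up the chain of enclosing types.
inline std::string BuildTypeName(const FakeMetadata& metadata, uint32_t token)
{
    const auto it = metadata.find(token);
    if (it == metadata.end())
        return "?";   // unknown token: placeholder instead of failing

    const FakeTypeDef& type = it->second;
    if (type.enclosingToken == 0)
        return type.name;

    // Resolve the enclosing type first, then append the nested name.
    return BuildTypeName(metadata, type.enclosingToken) + "+" + type.name;
}
```

The recursion ends at the first type that is not nested, so the depth equals the nesting level of the type.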

If this is not a generic type, we are done. However, as already mentioned, the name of a generic type ends with the ` character followed by the number of type parameters. The following helper function swiftly gets rid of it:
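One possible shape for such a helper, as a sketch only: StripGenericArity is a hypothetical name, and the function assumes that everything after the last ` character is the arity suffix.

```cpp
#include <cstddef>
#include <string>

// Removes the trailing "`N" arity suffix of a generic type name, if any.
inline std::wstring StripGenericArity(const std::wstring& name)
{
    const std::size_t pos = name.rfind(L'`');
    if (pos == std::wstring::npos)
        return name;              // not a generic type: nothing to strip
    return name.substr(0, pos);   // keep everything before the ` character
}
```

Names without the ` character are returned unchanged, so the helper can be applied to every type name without first checking whether the type is generic.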

The next step is to rebuild the list of generic argument types using the array of ClassID returned by ICorProfilerInfo2::GetClassIDInfo2. The most complicated part of the loop is avoiding adding a “,” after the last argument type:

You call ICorProfilerInfo2::GetClassIDInfo2 on the ClassID of each type argument to obtain the ModuleID where that type is defined, and then call our GetTypeName helper method.
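The formatting part of that loop can be sketched on its own, assuming the argument names have already been resolved. FormatGenericName is a hypothetical helper, and the angle brackets and the ", " separator are presentation choices rather than anything imposed by the profiling API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Appends "<Arg1, Arg2>" to the base name; the separator is written before
// every argument except the first, so nothing trails the last one.
inline std::wstring FormatGenericName(const std::wstring& baseName,
                                      const std::vector<std::wstring>& typeArgs)
{
    if (typeArgs.empty())
        return baseName;   // not a generic type

    std::wstring result = baseName + L"<";
    for (std::size_t i = 0; i < typeArgs.size(); ++i) {
        if (i > 0)
            result += L", ";
        result += typeArgs[i];
    }
    result += L">";
    return result;
}
```

Writing the separator before each argument other than the first avoids a special case for the last element.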

The next episode will analyze method signatures.
