.NET Native Deep Dive: Debugging into Interop Code

This post was authored by Yi Zhang, a Senior Software Development Engineer on the .NET Native team.

.NET Native and MCG

At this point, you’ve probably seen the .NET Native announcement and some of our other .NET Native blog posts. We’ve talked about how .NET Native gives you the productivity of C# together with the performance of C++. However, the innovation of .NET Native doesn't stop there: we also see it as a great opportunity to reimagine our existing technologies and make them more accessible and transparent to .NET developers.

In the past, when .NET made interop calls, such as P/Invoke, COM interop, or WinRT, all the magic happened behind the scenes inside .NET and was inaccessible to .NET developers. It was very difficult to see what was going on. In .NET Native, we've made significant progress on .NET interop technology by making it more transparent to developers. For the first time, you can debug all the interop magic in C# code and see how interop data marshaling is performed between the managed world and the native world.

Before .NET Native, interop support code in .NET was typically emitted as IL and then JIT-compiled into machine code. So, if you wanted to debug that interop code, you were usually stuck debugging assembly code. The emitted IL was thrown away, and it was not easy to understand anyway. This has changed with .NET Native.

In .NET Native, interop code is emitted as C# at build time, so it is readable and debuggable. When the .NET Native tool chain runs, one of its very first stages executes a tool called MCG (Marshaling Code Generator). MCG scans your app to discover exactly which interop types and functions you might be using and generates the interop code for them. The result is saved in appname.interop.g.cs and then compiled by the C# compiler (CSC) into IL. Later, this IL is merged into the application, and the tool chain rewrites your P/Invoke calls to call MCG's generated code instead. The whole application is then processed by the rest of the .NET Native tool chain and eventually converted into a native binary by the C++ backend. For more information, you can refer to Shawn Farkas’s talk on .NET Native on Channel 9. Let's look at an example of how this helps .NET developers troubleshoot interop-related issues.

Troubleshooting a P/Invoke with .NET Native

Let's create a simple C# Windows Store app from the default Blank App template. Suppose we would like to call a simple API called LCIDToLocaleName, purely for demonstration purposes (most likely you won’t need this in Windows Store apps):

 int LCIDToLocaleName(
  _In_ LCID Locale,
  _Out_opt_ LPWSTR lpName,
  _In_ int cchName,
  _In_ DWORD dwFlags
);

So, we write the following P/Invoke signature to call the API:

 [DllImport("api-ms-win-core-localization-l1-2-1.dll", CharSet=CharSet.Unicode)]
internal static extern int LCIDToLocaleName(uint locale, out string name, int cchName, uint flags);

Of course, we still have to call the P/Invoke. We can do it like this:

 string name;
LCIDToLocaleName(0x10407, out name, 255, 0);

Unfortunately, if you run this app without .NET Native, you'll get an AccessViolationException upon calling the P/Invoke (make sure you turn on breaking on first-chance managed exceptions in Debug/Exceptions).

AccessViolationException is the CLR’s way of telling you that something went seriously wrong when reading or writing memory. It indicates that the app’s memory has become corrupted and the process must be terminated. A likely cause is that we declared the P/Invoke incorrectly, which eventually led to an access violation. This is actually a very common problem when using P/Invoke. .NET has no knowledge of the API you are calling and relies on you to specify how the function should be called from managed code. If you get the declaration wrong, .NET will call the API with incorrect parameters, and it can be tricky to get it exactly right. Before .NET Native, your best bet was probably to go to the web, do some research, and see whether anybody else had come up with the right managed function signature. Now, after enabling the .NET Native tool chain (refer to the announcement blog post for more details), you can actually step into LCIDToLocaleName and see for the first time exactly what is going on.

Most of the time, interop code isn’t interesting to look at, so you can’t step into it by default when Just My Code is turned on. Before we step into the P/Invoke, turn off Just My Code by opening the Tools/Options dialog and disabling Just My Code (otherwise you won’t be able to step into it).

Now it’s time to step into the code (press F11 in Visual Studio)!

 /// <summary>
/// P/Invoke class for module 'api-ms-win-core-localization-l1-2-1.dll'
/// </summary>
public static partial class api_ms_win_core_localization_l1_2_1_dll
{
  // Signature, LCIDToLocaleName, [fwd] [return] [Mcg.CodeGen.BlittableValueMarshaller] int, [fwd] [in] [Mcg.CodeGen.BlittableValueMarshaller] unsigned int, [fwd] [out] [managedbyref] [nativebyref] [Mcg.CodeGen.UnicodeStringMarshaller] wchar_t *, [fwd] [in] [Mcg.CodeGen.BlittableValueMarshaller] int, [fwd] [in] [Mcg.CodeGen.BlittableValueMarshaller] unsigned int,
  [McgGeneratedMarshallingCode]
  [McgPInvokeMarshalStub("App1, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null", "App1.App", "LCIDToLocaleName")]
  [MethodImpl(MethodImplOptions.NoInlining)]
  internal static int LCIDToLocaleName(
        uint locale,
        out string name,
        int cchName,
        uint flags)
  {
    // Setup
    short* unsafe_name;
    int unsafe___value;
    // Marshalling
    unsafe_name = default(short*);
    // Call to native method
    unsafe___value = McgNative.api_ms_win_core_localization_l1_2_1_dll_PInvokes.LCIDToLocaleName(
              locale,
              &(unsafe_name),
              cchName,
              flags
            );
    DebugAnnotations.PreviousCallContainsUserCode();
    name = ((unsafe_name != null) ? new string(((char*)unsafe_name)) : null);
    if (unsafe_name != null)
      ExternalInterop.CoTaskMemFree(unsafe_name);
    // Return
    return unsafe___value;
  }
}

The first thing you'll see is that the function you are calling is not actually LCIDToLocaleName, but rather a method defined in a special class in appname.interop.g.cs. This is the magic of the .NET Native tool chain at work: MCG sees that you are calling the LCIDToLocaleName P/Invoke and emits code that calls the Windows API LCIDToLocaleName for you. When you call the P/Invoke, you actually land in MCG's code. If this were .NET 4.5, you would land in CLR-generated IL stubs in JIT-compiled assembly form, which is not readable.

Next, you’ll notice that this function defines a bunch of variables whose names begin with safe or unsafe. This is MCG's jargon for managed code and native code. Whenever data comes from native code, it is inherently unsafe, because there is no type information available when you call into the APIs at runtime (C++ gives you some static verification at compile time, though). Meanwhile, in the .NET world, everything can be accessed in a type-safe manner. For Unicode strings, the unsafe native type is wchar_t*, which maps to short* in C# code. We don't use char* because char is an ambiguous type in terms of marshaling: in a P/Invoke it can be mapped to a Unicode character or an ANSI character, depending on various factors. short, on the other hand, is not ambiguous (it is always a two-byte integer), so it is used instead of char to avoid potential confusion.
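
To make the short* convention concrete, here is a minimal sketch (hand-written, not MCG output) that treats a short* buffer as a null-terminated UTF-16 string, which is the same conversion the generated stub performs with new string((char*)unsafe_name). The class and method names are hypothetical, the array merely stands in for memory that a native API would have filled, and the code has to be compiled with unsafe code enabled.

```csharp
using System;

internal static class NativeStringSketch
{
    // Interprets a native wchar_t* (seen from C# as short*) as a
    // null-terminated UTF-16 string; a null pointer yields a null string.
    internal static unsafe string FromNative(short* buffer)
    {
        return buffer != null ? new string((char*)buffer) : null;
    }

    // Simulates a buffer that a native API might have filled with "de-DE".
    internal static unsafe string Demo()
    {
        short[] simulated =
        {
            (short)'d', (short)'e', (short)'-', (short)'D', (short)'E', 0
        };

        // Pin the array so its address stays valid while we read from it.
        fixed (short* p = simulated)
        {
            return FromNative(p);
        }
    }
}
```

Because the element type is a plain two-byte integer, nothing in this signature invites the marshaler to guess between ANSI and Unicode; the cast to char* happens only at the point where the managed string is created.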

Then, MCG proceeds to call the API directly by calling another LCIDToLocaleName function. Note that the McgNative.api_ms_win_core_localization_l1_2_1_dll_PInvokes.LCIDToLocaleName declaration is also a P/Invoke, except that it is defined using only primitive types and pointers, so no data marshaling is involved. The C++ compiler backend in the .NET Native tool chain knows how to make native function calls, but it doesn't know how to translate data between managed strings and native strings. MCG bridges that gap by generating the marshaling code. In the end, though, MCG still needs the C++ backend's help to actually call into the API:

 public static partial class api_ms_win_core_localization_l1_2_1_dll_PInvokes
{
  [McgGeneratedNativeCallCode]
  [DllImport("api-ms-win-core-localization-l1-2-1.dll")]
  internal extern static int LCIDToLocaleName(
        uint locale,
        short** name,
        int cchName,
        uint flags);
}

One thing to keep in mind, though, is that the imported API is emitted directly into the application's native image as an entry in the import directory, the same as if you were calling the API from C++ (unless you use delay loading or LoadLibrary/GetProcAddress). This means that if the API cannot be found in the DLL, the application will fail to start/initialize.

OK, back to MCG’s code. If you look at the function call carefully, you'll notice that it passes the address of the unsafe_name variable, which makes the argument a short**. Compared with the LPWSTR in the actual API definition, there is one extra level of indirection (pointer). That’s probably what caused the AccessViolationException earlier: the API wrote the characters into the wrong memory location (the pointer variable itself), and those characters were then interpreted as a pointer address, causing the crash. The extra indirection is there because the out keyword in C# automatically adds a level of indirection for you. But we cannot simply remove the out keyword either, because we want the P/Invoke to pass a string back out, and a string is immutable. The right choice here is StringBuilder, since it lets the API fill in a string buffer without the extra indirection:

 [DllImport("api-ms-win-core-localization-l1-2-1.dll", CharSet = CharSet.Unicode)]
internal static extern int LCIDToLocaleName(uint locale, StringBuilder name, int cchName, uint flags);

Let's try again, step into the call, and see what MCG generates for us:

 internal static int LCIDToLocaleName(
      uint locale,
      global::System.Text.StringBuilder name,
      int cchName,
      uint flags)
{
  // Setup
  short* unsafe_name;
  int unsafe___value;
  // Marshalling
  unsafe_name = (short*)McgHelpers.CoTaskMemAllocAndZeroMemory(new IntPtr((name.Capacity * 2
            + 2)));
  if (unsafe_name == null)
    throw new OutOfMemoryException();
  McgMarshal.CopyStringBuilderTo(
            name,
            unsafe_name
          );
  // Call to native method
  unsafe___value = McgNative.api_ms_win_core_localization_l1_2_1_dll_PInvokes.LCIDToLocaleName(
            locale,
            unsafe_name,
            cchName,
            flags
          );
  DebugAnnotations.PreviousCallContainsUserCode();
  McgMarshal.CopyToStringBuilder(
            name,
            unsafe_name
          );
  if (unsafe_name != null)
    ExternalInterop.CoTaskMemFree(unsafe_name);
  // Return
  return unsafe___value;
}

This time, the code makes a lot more sense: it allocates a buffer, copies the contents of the StringBuilder into the buffer, passes the address of the buffer to the API, and then copies the contents of the buffer back into the StringBuilder. You might be wondering why the memory allocation is necessary in the first place. It is needed because a StringBuilder can hold its contents in a list of memory buffers that are not contiguous (this makes growing the StringBuilder cheaper). We might be able to optimize this further in the future by pinning the StringBuilder's buffer when there is only one (and we need to consider multi-threading as well). Anyway, you can easily verify that this method does the right thing by stepping out of this function and looking at the returned StringBuilder instance.
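
The allocation size in the stub, name.Capacity * 2 + 2, is Capacity UTF-16 code units at two bytes each plus two bytes for the terminating null character. A small sketch of that arithmetic follows. The helper name is hypothetical, and the checked context is an addition for illustration only (the stub shown above does not use one).

```csharp
using System;

internal static class NativeBufferSize
{
    // Number of bytes needed for a native UTF-16 buffer that can hold
    // 'capacity' characters plus the terminating null character.
    internal static int ForCapacity(int capacity)
    {
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("capacity");

        // checked turns a silent integer wrap-around into an OverflowException.
        return checked(capacity * sizeof(char) + sizeof(char));
    }
}
```

For a StringBuilder with a capacity of 255, this gives 512 bytes, that is, room for 256 two-byte characters. It also suggests a rule of thumb for the caller: the cchName value passed to the API should not describe more characters than the StringBuilder's capacity provides, because the native buffer is sized from Capacity rather than from cchName.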

However, if you look very closely, there is one step that is unnecessary: MCG copies the contents of the StringBuilder into the buffer, even though the API doesn't expect you to pass in any string value at all; it simply fills in the buffer. That copy is wasted work. Fortunately, .NET allows you to customize the marshaling behavior by specifying the [In] and [Out] attributes on the parameter (for more information, you can refer to this MSDN article). StringBuilder gets [In, Out] by default, and in this case we only want the [Out] direction, because we only need to receive data from the LCIDToLocaleName API:

 [DllImport("api-ms-win-core-localization-l1-2-1.dll", CharSet = CharSet.Unicode)]
internal static extern int LCIDToLocaleName(uint locale, [Out] StringBuilder name, int cchName, uint flags);

And let's see what we get this time:

 internal static int LCIDToLocaleName(
      uint locale,
      global::System.Text.StringBuilder name,
      int cchName,
      uint flags)
{
  // Setup
  short* unsafe_name;
  int unsafe___value;
  // Marshalling
  unsafe_name = (short*)McgHelpers.CoTaskMemAllocAndZeroMemory(new IntPtr((name.Capacity * 2
            + 2)));
  if (unsafe_name == null)
    throw new OutOfMemoryException();
  // Call to native method
  unsafe___value = McgNative.api_ms_win_core_localization_l1_2_1_dll_PInvokes.LCIDToLocaleName(
            locale,
            unsafe_name,
            cchName,
            flags
          );
  DebugAnnotations.PreviousCallContainsUserCode();
  McgMarshal.CopyToStringBuilder(
            name,
            unsafe_name
          );
  if (unsafe_name != null)
    ExternalInterop.CoTaskMemFree(unsafe_name);
  // Return
  return unsafe___value;
}

Much better!
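
For completeness, a call site for the final signature might look like the sketch below. The wrapper class, the method name, and the error handling are illustrative choices rather than part of the sample above. The constant 85 is LOCALE_NAME_MAX_LENGTH from the Windows headers, and SetLastError = true is an addition so that the Win32 error code can be read after a failed call. The code only runs on Windows.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

internal static class LocaleNames
{
    // LOCALE_NAME_MAX_LENGTH from the Windows SDK headers.
    private const int LocaleNameMaxLength = 85;

    // Same declaration as above, plus SetLastError (an addition for this sketch).
    [DllImport("api-ms-win-core-localization-l1-2-1.dll",
               CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int LCIDToLocaleName(
        uint locale, [Out] StringBuilder name, int cchName, uint flags);

    internal static string FromLcid(uint lcid)
    {
        var buffer = new StringBuilder(LocaleNameMaxLength);

        // Pass the capacity as cchName so the API never writes past the
        // native buffer that the marshaling stub sizes from Capacity.
        int written = LCIDToLocaleName(lcid, buffer, buffer.Capacity, 0);
        if (written == 0)
        {
            throw new InvalidOperationException(
                "LCIDToLocaleName failed with Win32 error " +
                Marshal.GetLastWin32Error() + ".");
        }

        return buffer.ToString();
    }
}
```

With this wrapper, the earlier call becomes string name = LocaleNames.FromLcid(0x10407); and a failure surfaces as an exception instead of an empty StringBuilder.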

Please note that I chose P/Invoke here because it is the easiest way to demonstrate this new capability in .NET Native. The same mechanism supports all kinds of interop calls: COM interop, WinRT, reverse P/Invoke, and so on. Please keep in mind that MCG is still in development and we are working hard to improve various aspects of it. Do let us know what you think about this new feature in .NET Native. Feel free to comment below or email us directly at dotnetnative@microsoft.com.

  • This is excellent news for someone like me who does quite a bit of interop between C# C++/CLI and native WIN32/64 c code, to bolt new capability to proprietary systems.

  • Thanks for the look under the covers!  I've argued with P/Invoke enough times to really appreciate the ability to see the marshaling code.

  • Isn't there a race condition where memory is leaked in case an asynchronous exception is thrown at an inopportune point?

  • Excellent, the sooner .NET Native comes to C++/CLI the better. Can't wait to use this!

  • @xor88 Remember that this is not normal C# code but will be translated to native, so theoretically the translator can deal with that problem like normally the JIT does. Asynchronous exceptions like ThreadAbort cannot be just thrown anywhere, but only at safe points. But you are right, we can't tell from this post, they might have bugs there ;)

  • - List the few remaining win32 api calls that do not have an equivalent .NET alternative.

    - Exclude obsolete win32 calls

    - Rewrite in manage C# win32 calls wrapped by .net such as the Bitmap class.  Bitmap in .net should not require a dispose call since it is used extensively outside of a win32 UI bitmap on the server side.

  • @xor88 - you're correct that the current developer previews of .NET Native emit interop code which is not fully exception safe.   Getting the exception safety in place turns out to be a difficult balancing act between keeping compatibility with the cleanup operations of the existing .NET Framework and emitting code that's at least somewhat understandable by humans :-)

    Exception safety is clearly important to get correct before we finish the product, so it was one of the highest prioritized features on our backlog.   In fact, we've got the code checked in to our development branch now, so you should see the exception safe code gen show up in our next developer preview.    (And, at that point if you still see exception unsafe code, we'd like to know about it since it would represent a bug in our codegen that we'd like to fix)!

  • @Zarat - one of the main goals of .NET Native is to keep the same execution model for programs as exists for the existing CLR.   Due to that, if you read code and it appears that it would only be functional with a special .NET Native trick, then it's probably a bug on our side  :-)  

    There are actually a small handful of places where we do take advantage of knowledge about how the tool chain will work in order to emit some tricky code, but those should be fairly obvious, and not some subtle bit of code generation magic.   For instance, you might see some calls to a magic StdCall<T> function, which our tool chain will convert into an IL calli instruction later on.

  • IMO, the name.Capacity * 2 + 2 calculation should be done in a checked block to avoid integer overflow (security) bugs.

  • @Jeroen: Great suggestion, thank you! We've entered a bug for this and will consider the fix for a future CTP.

  • @Calbert Macdonald, I've deleted your comment as spam. We hold a very high bar for deleting comments but as you were advertising a programming tool that only appears to work with "ANSI C++, PHP, JavaScript or Java source code" (from your web page) it didn't seem relevant to the .NET blog.  

  • We've just released the third developer preview of .NET Native, which includes fixes for several of the issues identified in the comments here.   Notably, the generated code should now be exception safe, fixing the issue that @xor88 identified.   Also, buffer size calculations are now done in checked code blocks, as @Jeroen called out.

  • let me get straight to the point

    I want .NET Native for Windows Forms
