Wednesday, November 30, 2011

Calling Win32 DLLs in C# with P/Invoke

BOOL MessageBeep(
  UINT uType   // beep type
);

[DllImport ("User32.dll")]
static extern Boolean MessageBeep (UInt32 beepType);

Note:
Static
1. The MessageBeep method is declared as static.

2. This is a requirement for P/Invoke methods because there is no consistent notion of an instance in the Windows API.

Extern
1. The declaration is not a method call, but an extern method definition.

2. This is your hint to the compiler that you mean for the method to be implemented by a function exported from a DLL, and therefore there is no need for you to supply a method body.

3. P/Invoke methods are nothing more than metadata that the just-in-time (JIT) compiler uses to wire managed code to an unmanaged DLL function at run time.

4. An important piece of information required to perform this wiring to the unmanaged world is the name of the DLL from which the unmanaged function is exported.

5. This information is provided by the DllImport custom attribute that precedes the MessageBeep method declaration. In this case, you can see that the MessageBeep unmanaged API is exported by User32.dll in Windows.

6. The call into the unmanaged MessageBeep function can be performed by any managed code that finds the extern MessageBeep declaration within scope.

7. The call is made like any other call to a static method. It is this commonality with any other managed method call that introduces the requirement of data marshaling.

8. One of the rules of C# is that its call syntax can only access CLR data types such as System.UInt32 and System.Boolean.

9. C# is expressly unaware of C-based data types used in the Windows API such as UINT and BOOL, which are just typedefs of the C language types.

10. The extern method must be defined using CLR types, as you saw in the preceding code snippet.

11. This requirement to use CLR types that are different from, but compatible with, the underlying API function types is one of the more difficult aspects of using P/Invoke.

Figure 1 MessageBeep, Interop Done Well
namespace Wintellect.Interop.Sound{
   using System;
   using System.Runtime.InteropServices;
   using System.ComponentModel;

   sealed class Sound{
      public static void MessageBeep(BeepTypes type){
         if(!MessageBeep((UInt32) type)){
            Int32 err = Marshal.GetLastWin32Error();
            throw new Win32Exception(err);
         }
      }

      [DllImport("User32.dll", SetLastError=true)]
      static extern Boolean MessageBeep(UInt32 beepType);

      private Sound(){}
   }

   enum BeepTypes{ 
      Simple = -1,
      Ok                = 0x00000000,
      IconHand          = 0x00000010,
      IconQuestion      = 0x00000020,
      IconExclamation   = 0x00000030,
      IconAsterisk      = 0x00000040
   }
}
Starting from the top, you will notice that an entire type named Sound is devoted to MessageBeep. If I need to add support for playing waves using the Windows API function PlaySound, I could reuse the Sound type. However, I am not offended by a type that exposes a single public static method. This is application code, after all. Notice also that Sound is sealed and defines an empty private constructor. These are just details to keep a user from mistakenly deriving from or creating an instance of Sound.

The next feature of the code in Figure 1 is that the actual extern method where P/Invoke occurs is a private method of Sound. This method is exposed only indirectly by the public MessageBeep method, which takes a parameter of type BeepTypes. This extra level of indirection is a critical detail that provides the following benefits. 

First, should a future managed method of beeping be introduced in the class library, you can re-tool your public MessageBeep method to use the managed API without having to change the rest of the code in your application.

A second benefit of the wrapper method is this: when you P/Invoke, you waive your right to the protection from access violations and other low-level catastrophes, normally provided by the CLR. A buffer method, even if it does nothing but pass parameters through, allows you to protect the rest of your application from access violations and the like. The buffer method localizes any potential bugs introduced by the P/Invoke call.

The third and final benefit of hiding your private extern methods behind a public wrapper is the opportunity to add some minimum CLR style to the method. For example, in Figure 1 I converted a Boolean failure returned by the Windows API function into a more CLR-like exception. I also defined an enumerated type named BeepTypes whose members correspond to the #define values used with the Windows API. Since C# doesn't support C-style #define constants, managed enumerated types are used to avoid scattering magic numbers throughout your application code.

This final benefit of a wrapper method is admittedly minor for a simple Windows API function like MessageBeep. But as you begin to call into more complex unmanaged functions, you will find a manual translation from the Windows API style to a more CLR-friendly approach increasingly beneficial. The more you plan to reuse your interop functionality throughout your applications, the more design thought you should put into the wrapper. Meanwhile I see no shame in non-object-oriented static wrapper methods with CLR-friendly parameters.

The DLL Import Attribute
The DllImportAttribute type plays an important part in the P/Invoke story for managed code. The DllImportAttribute's primary role is to indicate to the CLR which DLL exports the function that you want to call. The name of the DLL in question is passed as the single constructor parameter to the DllImportAttribute.

Optional DllImportAttribute Properties
In addition to indicating the host DLL, the DllImportAttribute also includes a handful of optional properties, four of which are particularly interesting: EntryPoint, CharSet, SetLastError, and CallingConvention.
EntryPoint: You can set this property to indicate the entry point name of the exported DLL function in cases where you do not want your extern managed method to have the same name as the DLL export. This is particularly useful when you are defining two extern methods that call into the same unmanaged function. Additionally, in Windows you can bind to exported DLL functions by their ordinal values. If you need to do this, an EntryPoint value such as "#1" or "#129" indicates the ordinal value of the unmanaged function in the DLL rather than a function name.
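As a rough sketch of the two-methods-one-export case, the declarations below bind two differently named extern methods to the MessageBeep export from earlier. The class and method names (EntryPointSketch, PlaySystemSound, PlaySystemSoundRaw) are made up for illustration; only the EntryPoint value has to match the DLL export.

```csharp
using System;
using System.Runtime.InteropServices;

static class EntryPointSketch
{
    // EntryPoint names the real export; the C# method name is arbitrary.
    [DllImport("User32.dll", EntryPoint = "MessageBeep", SetLastError = true)]
    internal static extern Boolean PlaySystemSound(UInt32 beepType);

    // A second managed signature over the same export. Taking Int32 lets a
    // caller pass -1 (the "simple beep" value) without a cast.
    [DllImport("User32.dll", EntryPoint = "MessageBeep", SetLastError = true)]
    internal static extern Boolean PlaySystemSoundRaw(Int32 beepType);
}
```

A caller could then write EntryPointSketch.PlaySystemSoundRaw(-1) or EntryPointSketch.PlaySystemSound(0x40) and reach the same unmanaged function either way.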
CharSet: When it comes to character sets, not all versions of Windows are created equal. The Windows 9x family of products lacks significant Unicode support, while the Windows NT and Windows CE flavors use Unicode natively. Sitting on top of these operating systems, the CLR uses Unicode for its internal representation of String and Char data. But never fear: the CLR automatically makes the necessary translations from Unicode to ANSI when calling into Windows 9x API functions.
If your DLL function doesn't deal with text in any way, then you can ignore the CharSet property of the DllImportAttribute. However, when Char or String data is part of the equation, set the CharSet property to CharSet.Auto. This causes the CLR to use the appropriate character set based on the host OS. If you don't explicitly set the CharSet property, then its default is CharSet.Ansi. This default is unfortunate because it negatively affects the performance of text parameter marshaling for interop calls made on Windows 2000, Windows XP, and Windows NT®.
The only time you should explicitly select a CharSet value of CharSet.Ansi or CharSet.Unicode, rather than going with CharSet.Auto, is when you are explicitly naming an exported function that is specific to one or the other of the two flavors of Win32 OS. An example of this is the ReadDirectoryChangesW API function, which exists only in Windows NT-based operating systems and supports Unicode only; in this case you should use CharSet.Unicode explicitly.
Sometimes it is unclear whether a Windows API has a character set affinity. A surefire way to find out is to check the C-language header file for the function in the Platform SDK. (If you are unsure which header file to look in, the Platform SDK documentation lists the header file for each API function.) If you find that the API function is really defined as a macro that maps to a function name ending in A or W, then character set matters for the function that you are trying to call. An example of a Windows API function that you might be surprised to learn has A and W versions is the GetMessage API declared in WinUser.h.
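To make the CharSet choices concrete, here is a minimal sketch using GetFileAttributes, a Kernel32 function that has A and W versions. The function is my choice for illustration and does not come from the discussion above; the class name and the managed method name GetFileAttributesUnicode are likewise hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

static class CharSetSketch
{
    // INVALID_FILE_ATTRIBUTES, the documented failure value.
    internal const UInt32 InvalidFileAttributes = 0xFFFFFFFF;

    // CharSet.Auto: the CLR binds to the A or W export to suit the host OS
    // and marshals the String parameter in the matching character set.
    [DllImport("Kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    internal static extern UInt32 GetFileAttributes(String fileName);

    // Naming the W export explicitly: say CharSet.Unicode so the String is
    // marshaled as wide characters, and ExactSpelling so the CLR does not
    // probe for an A/W suffix on its own.
    [DllImport("Kernel32.dll", EntryPoint = "GetFileAttributesW",
        CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    internal static extern UInt32 GetFileAttributesUnicode(String fileName);
}
```

Both declarations should return the same attribute bits for a given path on an NT-based system; the second simply removes the guesswork about which export is bound.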
SetLastError: Error handling is an important but frequently avoided aspect of programming. And when you are P/Invoking, you are faced with the additional challenge of dealing with the differences between Windows API error handling and exceptions in managed code. I have a few suggestions for you.
If you are using P/Invoke to call into a Windows API function for which you use GetLastError to find extended error information, then you should set the SetLastError property to true in the DllImportAttribute for your extern method. This applies to the majority of extern methods.
This causes the CLR to cache the error set by the API function after each call to the extern method. Then, in your wrapper method, you can retrieve the cached error value by calling the Marshal.GetLastWin32Error method defined on the System.Runtime.InteropServices.Marshal type in the class library. My advice is to check for error values that you expect from the API function and throw a sensible exception for these values. For all other failure cases, including those where failure wasn't expected at all, throw the Win32Exception defined in the System.ComponentModel namespace and pass it the value returned by Marshal.GetLastWin32Error. If you take a look back at the code in Figure 1, you will see that I took this approach in my public wrapper around the extern MessageBeep method.
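The "expected errors first, Win32Exception for the rest" advice might look like the following sketch. DeleteFile is my choice of example API, and the FileOps class, the Delete wrapper, and the two errors treated as expected are assumptions for illustration.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

static class FileOps
{
    const Int32 ErrorFileNotFound = 2;   // ERROR_FILE_NOT_FOUND
    const Int32 ErrorAccessDenied = 5;   // ERROR_ACCESS_DENIED

    [DllImport("Kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    static extern Boolean DeleteFile(String fileName);

    internal static void Delete(String fileName)
    {
        if (DeleteFile(fileName)) return;

        // Read the cached error right away, before making other calls.
        Int32 err = Marshal.GetLastWin32Error();
        switch (err)
        {
            // Errors this wrapper expects get a sensible managed exception.
            case ErrorFileNotFound:
                throw new FileNotFoundException("File not found.", fileName);
            case ErrorAccessDenied:
                throw new UnauthorizedAccessException("Access denied: " + fileName);
            // Everything else surfaces as a Win32Exception carrying the code.
            default:
                throw new Win32Exception(err);
        }
    }
}
```

Callers of FileOps.Delete then deal only with managed exceptions and never see a raw error code unless they ask the Win32Exception for its NativeErrorCode.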
CallingConvention: The last and probably least important of the DllImportAttribute properties that I will cover here is CallingConvention. This property lets you indicate to the CLR which function calling convention it should use for parameters on the stack. The default value of CallingConvention.Winapi is your best bet and will work in most cases. However, if the call isn't working, you might check the declaring header file in the Platform SDK to see if the API function you are calling is one of the odd APIs that bucks the calling convention standard.
In general, the calling convention of a native function, such as a Windows API function or a C-runtime DLL function, describes how the parameters are pushed onto and cleaned off the thread's stack. Most Windows API functions use the stdcall convention: the caller pushes the parameters from right to left, so the last parameter goes onto the stack first, and the called function cleans up the stack before it returns. Many of the C-runtime DLL functions use the cdecl convention instead: the parameters are pushed in the same right-to-left order, but stack cleanup is left to the caller, which is what makes variable-argument functions such as printf possible.
Fortunately you only need a peripheral understanding of calling conventions to get P/Invoke calls to work. Start with the default, CallingConvention.Winapi. Then, for C-runtime DLL functions and the few other functions that require it, change the convention to CallingConvention.Cdecl.
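For a C-runtime function, the override could be sketched as follows. This assumes the C runtime DLL msvcrt.dll is present and exports abs, as it does on typical Windows installations; the class name CRuntime is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

static class CRuntime
{
    // C-runtime exports use cdecl, so the caller cleans up the stack.
    // Without the override, the Winapi default (stdcall on 32-bit Windows)
    // would be assumed for this call.
    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    internal static extern Int32 abs(Int32 value);
}
```

A call such as CRuntime.abs(-5) then goes through the C runtime's own abs function.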

Data Marshaling
Data marshaling is a challenging aspect of P/Invoke. When passing data between the managed and unmanaged worlds, the CLR follows a number of rules that few developers will encounter often enough to memorize. Mastery of the details, though, is normally unnecessary unless you are a class library developer. Application developers who need to interop only occasionally should still understand some fundamentals of data marshaling to get the most out of P/Invoke on the CLR.

Marshaling Numerical and Logical Scalars
The majority of the Windows OS is written in C. As a result, the data types used with the Windows API are either C types or C types that are relabeled through a type definition or macro definition. Let's look at data marshaling without pointers. To keep things simple, I'll focus first on numbers and Boolean values.
When passing a parameter by value to a Windows API function, you need to know the answers to the following questions:
  • Is the data fundamentally integral or floating-point?
  • If the data is integral, is it signed or unsigned?
  • If the data is integral, how wide is it in bits?
  • If the data is floating-point, is it single or double precision?
Sometimes the answers are obvious, and other times they aren't. The Windows API redefines the fundamental C data types in a variety of ways. Figure 2 lists some common C and Win32 data types, along with their specifications and a common language runtime type with a matching specification.

Figure 2 Non-Pointer Data Types
Win32 Types | Specification | CLR Type
char, INT8, SBYTE, CHAR† | 8-bit signed integer | System.SByte
short, short int, INT16, SHORT | 16-bit signed integer | System.Int16
int, long, long int, INT32, LONG32, BOOL†, INT | 32-bit signed integer | System.Int32
__int64, INT64, LONGLONG | 64-bit signed integer | System.Int64
unsigned char, UINT8, UCHAR†, BYTE | 8-bit unsigned integer | System.Byte
unsigned short, UINT16, USHORT, WORD, ATOM, WCHAR†, __wchar_t | 16-bit unsigned integer | System.UInt16
unsigned, unsigned int, UINT32, ULONG32, DWORD32, ULONG, DWORD, UINT | 32-bit unsigned integer | System.UInt32
unsigned __int64, UINT64, DWORDLONG, ULONGLONG | 64-bit unsigned integer | System.UInt64
float, FLOAT | Single-precision floating point | System.Single
double, long double, DOUBLE | Double-precision floating point | System.Double
†In Win32 this type is an integer with a specially assigned meaning; in contrast, the CLR provides a specific type devoted to this meaning.

In general, as long as you select a CLR type whose specification matches that of the Win32 type for the parameter, your code will work. There are some special cases, however. For example, the BOOL type definition in the Windows API is a signed 32-bit integer. However, BOOL is used to indicate a Boolean value of true or false. While you could get away with marshaling a BOOL parameter as a System.Int32 value, you will get a more appropriate mapping if you use the System.Boolean type. Character type-mapping is similar to BOOL in the sense that there is a specific CLR type, System.Char, to address character meaning.

With this information, it might be helpful to step through an example. Sticking with the beep theme, let's try the low-level Beep function in Kernel32.dll, which makes a beep through the computer's speaker. The Platform SDK documents the function under the topic Beep. The native API is documented as follows:
BOOL Beep(
  DWORD dwFreq,      // Frequency
  DWORD dwDuration   // Duration in milliseconds
);
In terms of parameter marshaling, your job is to figure out what CLR data types are compatible with the DWORD and BOOL data types used with the Beep API function. Reviewing the chart in Figure 2, you'll see that DWORD is a 32-bit unsigned integer value, as is the CLR type System.UInt32. This means that you can use UInt32 values for the two parameters to Beep. The BOOL return value is an interesting case because the chart tells us that in Win32, BOOL is a 32-bit signed integer. Therefore, you could use a System.Int32 value for the return value from Beep. However, the CLR also defines the System.Boolean type for the semantic meaning of a Boolean value, so you should use that instead. The CLR will marshal the System.Boolean value as a 32-bit signed integer by default. The extern method definition shown here is the resulting P/Invoke method for Beep:
[DllImport("Kernel32.dll", SetLastError=true)]
static extern Boolean Beep(
   UInt32 frequency, UInt32 duration);
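
Following the wrapper pattern from Figure 1, a caller-facing method around this declaration might be sketched like this. The Speaker class and Play method names are hypothetical; the extern declaration is the one shown above.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class Speaker
{
    // DWORD maps to UInt32; BOOL maps to Boolean (marshaled as 32 bits).
    [DllImport("Kernel32.dll", SetLastError = true)]
    static extern Boolean Beep(UInt32 frequency, UInt32 duration);

    // Public-facing wrapper: turns a FALSE return into an exception.
    internal static void Play(UInt32 frequencyHz, UInt32 durationMs)
    {
        if (!Beep(frequencyHz, durationMs))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}
```

Speaker.Play(440, 250) would sound a 440 Hz tone for a quarter of a second, or throw if the API reports failure.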

Parameters that are Pointers
Many Windows API functions take a pointer as one or more of their parameters. Pointers increase the complexity of marshaling data because they add a level of indirection. Without pointers, you pass data by value on the thread's stack. With pointers, you pass data by reference, by pushing a memory address to the data onto the thread's stack. The function then accesses the data indirectly through the memory address. There are multiple ways to express this additional level of indirection using managed code.
In C#, if you define a method parameter as ref or out, then the data is passed by reference rather than by value. This is true, even if you are not using Interop, but are just calling from one managed method to another. For example, if you pass a System.Int32 parameter by ref, then you pass the address to the data on the thread's stack rather than the integer value itself. Here is an example of a method defined to receive an integer value by reference:
void FlipInt32(ref Int32 num){
   num = -num;
}
Here, the FlipInt32 method takes the address of an Int32 value, accesses the data, negates it, and assigns the negated value to the original variable. In the following code, the caller's variable x would have its value changed from 10 to -10 by the FlipInt32 method:
Int32 x = 10;
FlipInt32(ref x);
This ability, in managed code, can be reused to pass pointers to unmanaged code. For example, the FileEncryptionStatus API function returns file encryption status as a 32-bit unsigned bitmask. The API is documented as shown here:
BOOL FileEncryptionStatus(
  LPCTSTR lpFileName,  // file name
  LPDWORD lpStatus     // encryption status
);
Notice that the function doesn't return the status using its return value, but instead returns a Boolean value indicating whether the call succeeded. In the success case, the actual status value is returned through the second parameter. The way this works is that the caller passes the function a pointer to a DWORD variable, and the API function populates the pointed-to memory with the status value. The following snippet shows a possible extern method definition to call into the unmanaged FileEncryptionStatus function:
[DllImport("Advapi32.dll", CharSet=CharSet.Auto)]
static extern Boolean FileEncryptionStatus(String filename, 
   out UInt32 status);
The definition uses the out keyword to indicate a by-ref parameter for the UInt32 status value. I could have selected the ref keyword here as well, and in fact both result in the same machine code at run time. The out keyword is simply a specialization of a by-ref parameter that indicates to the C# compiler that the data being passed is only being passed out of the called function. In contrast, with the ref keyword the compiler assumes that data may flow both in and out of the called function.
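On the calling side, an out parameter needs no initialization before the call. The sketch below wraps the declaration in a hypothetical Encryption.GetStatus method; I have added SetLastError=true so the wrapper can report failures, which the declaration above does not include.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class Encryption
{
    [DllImport("Advapi32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    static extern Boolean FileEncryptionStatus(String filename,
        out UInt32 status);

    internal static UInt32 GetStatus(String filename)
    {
        UInt32 status;  // left unassigned: out promises the callee fills it
        if (!FileEncryptionStatus(filename, out status))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return status;  // the DWORD written through the LPDWORD pointer
    }
}
```

Had the parameter been declared with ref instead, the C# compiler would insist that status be assigned before the call.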
Another cool aspect of out and ref parameters in managed code is that the variable whose address you pass as the by-ref parameter may be a local variable on the thread's stack, an element of a class or structure, or it can be a reference to an element in an array of the appropriate data type. This caller flexibility makes by-ref parameters a good starting point for marshaling pointers to buffers, as well as to single data values. I would only consider marshaling a pointer as a more complex CLR type such as a class or an array object after I found that a ref or an out parameter did not meet my needs.
If you are unfamiliar with C syntax or making calls into the Windows API functions, then sometimes it can be difficult to know if a method parameter requires a pointer. A common indicator is if the parameter type starts with the letter P or the letters LP such as LPDWORD or PINT. In both of these examples the LP and P indicate that the parameter is a pointer, and the data type that they point to would be DWORD or INT, respectively. In some cases, however, the API function is defined as a pointer directly using the asterisk symbol (*) in C-language syntax. The following code snippet shows an example of this:
void TakesAPointer(DWORD* pNum);
As you can see, the preceding function's single parameter is a pointer to a DWORD variable.
When marshaling pointers through P/Invoke, ref and out are only used with value types in managed code. You can tell a parameter is a value type when its CLR type is defined using the struct keyword. Out and ref are used to marshal pointers to these data types because normally a value type variable is the object or data, and you don't have a reference to a value type in managed code. In contrast, the ref and out keywords are not necessary when marshaling reference type objects because the variable already is a reference to the object.
If you are unfamiliar with the difference between reference types and value types you can find more information on the topic in the .NET column in the December 2000 issue of MSDN® Magazine. Most CLR types are reference types; however, all of the primitive types such as System.Int32 and System.Boolean are value types with the exceptions of System.String and System.Object.

Marshaling Opaque Pointers: a Special Case
Sometimes in the Windows API a method takes or returns a pointer that is opaque, which means that the pointer value is technically a pointer but your code doesn't use it directly. Instead, your code passes the pointer back to Windows for subsequent reuse.
A very common example of this is the notion of a handle. In Windows, internal data structures ranging from files to buttons on the screen are represented to application code as handles. A handle is really an opaque pointer or pointer-width data value that your application uses to represent the internal OS construct.
Occasionally, API functions also define opaque pointers to be of the PVOID or LPVOID types. These types in the Windows API definitions mean that the pointer has no type.
When an opaque pointer is returned to (or expected from) your application, you should marshal the parameter or return value as a special type in the CLR called System.IntPtr. When you use the IntPtr type it is not common to use an out or ref parameter because an IntPtr is intended to hold a pointer directly. However, if you are marshaling a pointer to a pointer, then a by-ref parameter to an IntPtr is appropriate.
The System.IntPtr type has a special property in the CLR type system. Unlike the rest of the basic types in the system, IntPtr does not have a fixed size. Instead, its size at run time is based on the natural pointer size of the underlying operating system. This means that in 32-bit Windows IntPtr variables will be 32 bits in width, and in 64-bit Windows the just-in-time compiler emits code that treats IntPtr values as 64-bit values. This automatic resizing is very useful when marshaling opaque pointers between managed and unmanaged code.
Remember, any API function that returns or accepts a handle is really working with an opaque pointer. Your code should marshal handles in Windows as System.IntPtr values.
You can cast IntPtr values to and from 32-bit and 64-bit integer values in managed code. However, since the pointers are supposed to be opaque when used with Windows API functions, you should have no need to do anything with the values but store them and pass them to extern methods. Two exceptions to the store-and-pass-only rule are when you need to pass a null pointer value to an extern method and when you need to compare an IntPtr value with null. To do this, rather than cast zero to System.IntPtr, you should use the Zero static public field on the IntPtr type (IntPtr.Zero) to get the null value for comparing or assigning.
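Here is a small sketch of the store-and-pass pattern with window handles. GetDesktopWindow and IsWindow are User32 functions I picked for illustration; the Handles class and its method names are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

static class Handles
{
    // HWND is a handle, so it is marshaled as IntPtr in both directions.
    [DllImport("User32.dll")]
    static extern IntPtr GetDesktopWindow();

    [DllImport("User32.dll")]
    static extern Boolean IsWindow(IntPtr hWnd);

    internal static Boolean DesktopWindowIsValid()
    {
        IntPtr desktop = GetDesktopWindow();        // store the handle
        if (desktop == IntPtr.Zero) return false;   // compare against null
        return IsWindow(desktop);                   // pass it straight back
    }

    internal static Boolean NullHandleIsWindow()
    {
        return IsWindow(IntPtr.Zero);               // pass a null handle
    }
}
```

Nothing in the managed code looks inside the handle value; it is only stored, compared with IntPtr.Zero, and handed back to Windows.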

Marshaling Text
Working with textual data is common in programming. Text adds some wrinkles to the interop story for two reasons. First, the underlying operating system may use Unicode to represent strings or it may use ANSI. In some rare cases, such as with the MultiByteToWideChar API function, the two string parameters to the function disagree on character set.
The second reason that working with text when P/Invoking requires you to have some special understanding is that C and the CLR each deal with text differently. In C, a string is really only an array of char values, typically terminating in a null. Most of the Windows API functions deal with strings on these terms, either as an array of char values in the ANSI case or an array of wchar_t values in the case of Unicode.
Fortunately, the CLR is designed to be very flexible when marshaling text so that you can get the job done, regardless of what the Windows API function expects from your application. Here are the primary considerations to keep in mind:
  • Is your application passing text data to the API function or does string data pass back from the API function to your application? Or both?
  • What managed type should your extern method use?
  • What unmanaged string format does the API function expect?
Let's address this last concern first. Most of the Windows API functions take LPTSTR or LPCTSTR values. These are modifiable and non-modifiable buffers, respectively (from the function's point of view), which contain null-terminated character arrays. The "C" stands for constant and means that information will not be passing out of the function using that parameter. The "T" in LPTSTR indicates that this parameter can be either Unicode or ANSI depending on the character set you choose and depending on the character set of the underlying operating system. Since most string parameters are one of these two types in the Windows API, the CLR defaults work for you so long as you selected CharSet.Auto on your DllImportAttribute.
However, some API functions or custom DLL functions represent strings in different ways. If you run across one of these functions, you can decorate your extern method's string parameter with the MarshalAsAttribute and indicate a string format other than the default LPTSTR. For more information on the MarshalAsAttribute, see the Platform SDK documentation topic at MarshalAsAttribute Class.
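As an illustration of overriding the string format per parameter, the sketch below binds to lstrlenA and lstrlenW in Kernel32 and states the format of each string explicitly. The choice of functions and the names StringFormats, LengthAnsi, and LengthWide are mine.

```csharp
using System;
using System.Runtime.InteropServices;

static class StringFormats
{
    // lstrlenA takes a null-terminated ANSI string: marshal as LPStr.
    [DllImport("Kernel32.dll", EntryPoint = "lstrlenA", ExactSpelling = true)]
    internal static extern Int32 LengthAnsi(
        [MarshalAs(UnmanagedType.LPStr)] String text);

    // lstrlenW takes a null-terminated wide string: marshal as LPWStr.
    [DllImport("Kernel32.dll", EntryPoint = "lstrlenW", ExactSpelling = true)]
    internal static extern Int32 LengthWide(
        [MarshalAs(UnmanagedType.LPWStr)] String text);
}
```

With MarshalAs on the parameter, the string format no longer depends on the CharSet setting of the DllImportAttribute.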
Now let's look at the direction in which string information is being passed between your code and the unmanaged function. There are two ways that you can know which direction the information is being passed in when working with strings. The first and most reliable method is to understand the purpose of the parameter in the first place. For example, if you are calling a function with a name like CreateMutex and it takes a string, then you can imagine that the string information passes from your application to the API function. Meanwhile, if you call GetUserName, then the function name suggests that string information passes from the function to your application.
In addition to the rationalization approach, the second way to find out which direction the information flows in is to look for the letter "C" in the API parameter type. For example, the GetUserName API function's first parameter is defined as type LPTSTR, which stands for long-pointer to a Unicode or ANSI string buffer. But CreateMutex's name parameter is typed as LPCTSTR. Notice that here you have the same type definition, but with the addition of the letter "C" to indicate that the buffer is constant and will not be written to by the API function.
Once you have established whether the text parameter is input only or input/output, you can determine which CLR type to use for the parameter type. Here are the rules. If the string parameter is input only, use the System.String type. Strings are immutable in managed code, and are well suited to be used as buffers that will not be changed by the native API function.
If the string parameter can be input and/or output, then use the System.StringBuilder type. The StringBuilder type is a useful class library type that helps you build strings efficiently, and it happens to be great for passing buffers to native functions that the functions fill with string data on your behalf. Once the function call has returned, you need only call ToString on the StringBuilder object to get a String object.
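For the output case, GetUserName (mentioned above) makes a compact sketch. The extern signature reflects my reading of the Windows documentation, where the second parameter is an in/out character count, and the buffer size of 257 assumes the UNLEN constant of 256 plus a terminating null; the UserInfo class name is hypothetical.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

static class UserInfo
{
    // LPTSTR output buffer -> StringBuilder; LPDWORD in/out size -> ref.
    [DllImport("Advapi32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    static extern Boolean GetUserName(StringBuilder buffer, ref UInt32 size);

    internal static String GetCurrentUserName()
    {
        StringBuilder buffer = new StringBuilder(257);  // UNLEN + 1
        UInt32 size = (UInt32) buffer.Capacity;         // in: buffer size
        if (!GetUserName(buffer, ref size))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return buffer.ToString();   // copy the filled buffer into a String
    }
}
```

The StringBuilder's capacity is what the native function writes into, so it has to be sized before the call rather than after.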
The GetShortPathName API function is great for showing when to use String and when to use StringBuilder because it takes only three parameters: an input string, an output string, and a parameter that indicates the length in characters of the output buffer.
The commented documentation for the unmanaged GetShortPathName function in Figure 3 indicates both an input and output string parameter. This leads to the managed extern method definition, also shown in Figure 3. Notice that the first parameter is marshaled as a System.String because it is an input-only parameter. The second parameter represents an output buffer, and System.StringBuilder is used.
Figure 3 GetShortPathName Declarations
// ** Documentation for Win32 GetShortPathName() API Function
// DWORD GetShortPathName(
//   LPCTSTR lpszLongPath,      // file for which to get short path 
//   LPTSTR lpszShortPath,      // short path name (output)
//   DWORD cchBuffer            // size of output buffer
// );

[DllImport("Kernel32", CharSet = CharSet.Auto)]
static extern Int32 GetShortPathName(
   String path,                // input string
   StringBuilder shortPath,    // output string
   Int32 shortPathLength);     // StringBuilder.Capacity
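
A wrapper around this declaration might be sketched as follows. The ShortPaths class, the initial capacity of 260 (MAX_PATH), the retry when the buffer is too small, and the added SetLastError=true are assumptions of this sketch rather than part of Figure 3.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

static class ShortPaths
{
    [DllImport("Kernel32", CharSet = CharSet.Auto, SetLastError = true)]
    static extern Int32 GetShortPathName(
        String path,                // input string
        StringBuilder shortPath,    // output string
        Int32 shortPathLength);     // StringBuilder.Capacity

    internal static String GetShortPath(String path)
    {
        StringBuilder buffer = new StringBuilder(260);  // MAX_PATH
        Int32 length = GetShortPathName(path, buffer, buffer.Capacity);

        if (length > buffer.Capacity)
        {
            // Too small: the return value is the required size, so retry.
            buffer = new StringBuilder(length);
            length = GetShortPathName(path, buffer, buffer.Capacity);
        }
        if (length == 0)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return buffer.ToString();
    }
}
```

The input path goes in as an immutable String, and the result comes back by calling ToString on the StringBuilder once the function has filled it.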

Summing it Up
The P/Invoke features covered in this month's column are sufficient to call many of the API functions in Windows. However, if your interop needs are significant, you will eventually find yourself marshaling complex data structures and perhaps even needing access to memory directly through pointers in managed code. In fact, interop into native code can be a veritable Pandora's box of details and low-level bits. The CLR, C#, and managed C++ offer many features to help; perhaps in a later installment of this column I will cover advanced P/Invoke topics.
Meanwhile, whenever you find that the .NET Framework Class Library won't play a sound for you or perform some other bit of magic on your behalf, you know how to lean on the good old Windows API for some assistance.
