If you're looking to get into some p/invoke action, you'd be well-served to check out the pinvoke wiki to see if somebody else has done it too. If what you need isn't there, you may end up forced to write your own, and here are some gotchas I've seen people run into:
bool
BOOLEAN
System.Boolean
BOOL
VARIANT_BOOL
char
System.Char
long
System.Int64
CoTaskMemAlloc
CoTaskMemFree
That last one is particularly gnarly on 64-bit systems, where alignment requirements are less forgiving than on x86. The structure declarations on pinvoke.net tend to ignore 64-bit issues. For example, the declaration of the INPUT structure (as of this writing—it's a wiki so it's probably changed by the time you read this) reads as follows:
[StructLayout(LayoutKind.Explicit)]
struct INPUT
{
    [FieldOffset(0)] int type;
    [FieldOffset(4)] MOUSEINPUT mi;
    [FieldOffset(4)] KEYBDINPUT ki;
    [FieldOffset(4)] HARDWAREINPUT hi;
}
This structure layout is correct for 32-bit Windows, but it's incorrect for 64-bit Windows.
Let's take a look at that MOUSEINPUT structure, for starters.
typedef struct tagMOUSEINPUT {
    LONG      dx;
    LONG      dy;
    DWORD     mouseData;
    DWORD     dwFlags;
    DWORD     time;
    ULONG_PTR dwExtraInfo;
} MOUSEINPUT, *PMOUSEINPUT, FAR* LPMOUSEINPUT;
In 64-bit Windows, the LONG and DWORD members are four bytes, but the dwExtraInfo is a ULONG_PTR, which is eight bytes on a 64-bit machine. Since Windows assumes /Zp8 packing, the dwExtraInfo must be aligned on an 8-byte boundary, which forces four bytes of padding to be inserted after the time to get the dwExtraInfo to align properly. And in order for all this to work, the MOUSEINPUT structure itself must be 8-byte aligned.
Now let's look at that INPUT structure again. Since the MOUSEINPUT comes after the type, there also needs to be padding between the type and the MOUSEINPUT to get the MOUSEINPUT back to an 8-byte boundary. In other words, the offset of mi in the INPUT structure is 8 on 64-bit Windows, not 4.
Here's how I would've written it:
// This generates the anonymous union
[StructLayout(LayoutKind.Explicit)]
struct INPUT_UNION
{
    [FieldOffset(0)] MOUSEINPUT mi;
    [FieldOffset(0)] KEYBDINPUT ki;
    [FieldOffset(0)] HARDWAREINPUT hi;
};

[StructLayout(LayoutKind.Sequential)]
struct INPUT
{
    int type;
    INPUT_UNION u;
}
I introduce a helper structure to represent the anonymous union that is the second half of the Win32 INPUT structure. By doing it this way, I let somebody else worry about the alignment, and it'll be correct for both 32-bit and 64-bit Windows.
static public void Main()
{
    Console.WriteLine(Marshal.OffsetOf(typeof(INPUT), "u"));
}
On a 32-bit system, this prints 4, and on a 64-bit system, it prints 8. The downside is that you have to type an extra u. when you access the mi, ki or hi members.
INPUT i;
i.u.mi.dx = 0;
(I haven't checked what the PInvoke Interop Assistant comes up with for the INPUT structure.)
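To check the layout on whatever machine you happen to be running on, here is a minimal, self-contained sketch. The field types for KEYBDINPUT and HARDWAREINPUT are my transcription of the Win32 headers, and the LayoutCheck class and its method names are made up for illustration; the fields are public so that code outside the structures can reach them.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct MOUSEINPUT
{
    public int dx;
    public int dy;
    public uint mouseData;
    public uint dwFlags;
    public uint time;
    public UIntPtr dwExtraInfo;   // ULONG_PTR: 4 bytes on 32-bit, 8 bytes on 64-bit
}

[StructLayout(LayoutKind.Sequential)]
public struct KEYBDINPUT
{
    public ushort wVk;
    public ushort wScan;
    public uint dwFlags;
    public uint time;
    public UIntPtr dwExtraInfo;
}

[StructLayout(LayoutKind.Sequential)]
public struct HARDWAREINPUT
{
    public uint uMsg;
    public ushort wParamL;
    public ushort wParamH;
}

// Helper structure standing in for the anonymous union
[StructLayout(LayoutKind.Explicit)]
public struct INPUT_UNION
{
    [FieldOffset(0)] public MOUSEINPUT mi;
    [FieldOffset(0)] public KEYBDINPUT ki;
    [FieldOffset(0)] public HARDWAREINPUT hi;
}

[StructLayout(LayoutKind.Sequential)]
public struct INPUT
{
    public int type;
    public INPUT_UNION u;
}

// Hypothetical helper class, not part of any library
public static class LayoutCheck
{
    public static int OffsetOfUnion() =>
        Marshal.OffsetOf(typeof(INPUT), "u").ToInt32();

    public static int OffsetOfExtraInfo() =>
        Marshal.OffsetOf(typeof(MOUSEINPUT), "dwExtraInfo").ToInt32();

    public static int SizeOfInput() => Marshal.SizeOf(typeof(INPUT));

    public static void Main()
    {
        Console.WriteLine("pointer size:          " + IntPtr.Size);
        Console.WriteLine("offset of u:           " + OffsetOfUnion());
        Console.WriteLine("offset of dwExtraInfo: " + OffsetOfExtraInfo());
        Console.WriteLine("size of INPUT:         " + SizeOfInput());
    }
}
```

In a 32-bit process this should report offsets 4 and 20 and a size of 28; in a 64-bit process, 8, 24 and 40. The marshaler derives those numbers from the pointer size, so nothing in the declarations is hard-coded for either platform.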
Since I'm obviously a glutton for punishment, I also helped read eighth grade essays on the same topic: Describe the qualities you consider to be those which make someone an adult. As always, remember that these are just the funny sentences/excerpts.
Let me tell you about my parents
Entering a no fun zone
It's harder than I thought
Tautology corner
Assorted commentary
Misspelling corner. I've included more context; that may make the game a bit easier.
And just so you won't think all eighth graders are terrible writers:
"Why can't I pass a reference to a derived class to a function that takes a reference to a base class by reference?" That's a confusing question, but it's phrased that way because the simpler phrasing is wrong!
The misleading simplified phrasing of the question is "Why can't I pass a reference to a derived class to a function that takes a base class by reference?" And in fact the answer is "You can!"
class Base { }
class Derived : Base { }

class Program
{
    static void f(Base b) { }

    public static void Main()
    {
        Derived d = new Derived();
        f(d);
    }
}
Our call to f passes a reference to the derived class to a function that takes a reference to the base class. This is perfectly fine.
When people ask this question, they are typically wondering about passing a reference to the base class by reference. There is a double indirection here. You are passing a reference to a variable, and the variable is a reference to the base class. And it is this double reference that causes the problem.
class Base { }
class Derived : Base { }

class Program
{
    static void f(ref Base b) { }

    public static void Main()
    {
        Derived d = new Derived();
        f(ref d); // error
    }
}
Adding the ref keyword to the parameter results in a compiler error:
error CS1503: Argument '1': cannot convert from 'ref Derived' to 'ref Base'
The reason this is disallowed is that it would allow you to violate the type system. Consider:
static void f(ref Base b)
{
    b = new Base();
}
Now things get interesting. Your call to f(ref d) passes a reference to a Derived by reference. When the f function modifies its formal parameter b, it's actually modifying your variable d. What's worse, it's putting a Base in it! When f returns, your variable d, which is declared as being a reference to a Derived, is actually a reference to the base class Base.
At this point everything falls apart. Your program calls some method like d.OnlyInDerived(), and the CLR ends up executing a method on an object that doesn't even support that method.
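If you really do need to call such a function, one way to do it (a sketch, not the only option) is to go through a temporary variable of the base type and cast back afterward. The cast is where the type system gets its chance to object. The RefDemo class and the CastBackFails method are names invented for this illustration.

```csharp
using System;

class Base { }
class Derived : Base { public string OnlyInDerived() => "derived"; }

static class RefDemo
{
    // Same shape as f in the text: free to store any Base in the caller's variable
    public static void f(ref Base b) { b = new Base(); }

    // Returns true if the cast back to Derived is rejected at run time
    public static bool CastBackFails()
    {
        Derived d = new Derived();
        Base temp = d;          // widening is always safe
        f(ref temp);            // legal: temp really is a variable of type Base
        try
        {
            d = (Derived)temp;  // narrowing is checked at run time
            return false;
        }
        catch (InvalidCastException)
        {
            return true;        // f stored a plain Base, so d is left untouched
        }
    }

    public static void Main()
    {
        Console.WriteLine(CastBackFails()); // prints True
    }
}
```

The explicit cast turns what would have been a silent corruption of d into an exception at the point where the bad value tries to cross back into a Derived variable.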
You actually knew this; you just didn't know it. Let's start from the easier cases and work up. First, passing a reference into a function:
void f(SomeClass s);
...
T t = new T();
f(t);
The function f expects to receive a reference to a SomeClass, but you're passing a reference to a T. When is this legal?
"Duh. T must be SomeClass or a class derived from SomeClass."
What's good for the goose is good for the gander. When you pass a parameter as ref, it not only goes into the method, but it also comes out. (Not strictly true but close enough.) You can think of it as a bidirectional parameter to the function call. Therefore, the rule "If a function expects a reference to a class, you must provide a reference to that class or a derived class" applies in both directions. When the parameter goes in, you must provide a reference to that class or a derived class. And when the parameter comes out, it also must be a reference to that class or a derived class (because the function is "passing the parameter" back to you, the caller).
But the only time that S can be T or a subclass, while simultaneously having T be S or a subclass is when S and T are the same thing. This is just the law of antisymmetry for partially-ordered sets: "if a ≤ b and b ≤ a, then a = b."
I didn't participate in the reading of the seventh grade essays, but I did get some of the more entertaining sentences from that batch. As you may recall, the topic was to describe the qualities you consider to be those which make someone an adult. Students were given 90 minutes, plus one additional hour upon request, equipped only with paper and pencil.
Remember, these are just the funny sentences/excerpts. Do not assume that all students write like this.
Better get your butt in gear
How'd they get that way?
And just to show that seventh graders are also capable of good writing:
A few years ago, Abhinaba wondered why FlagsAttribute didn't also alter the way enumeration values are auto-assigned.
Because attributes don't change the language. They are instructions to the runtime environment or (in rarer cases) to the compiler. An attribute can instruct the runtime environment to treat the function or class in a particular way. For example, you can use an attribute to tell the runtime environment that you want the program entry point to run in a single-threaded apartment, to tell the runtime environment how to look up your p/invoke function, or to tell the compiler to suppress a particular class of warnings.
But changing how values for enumerations are assigned, well that actually changes the language. An attribute can't change the operator precedence tables. An attribute can't change the way overloaded functions are resolved. An attribute can't change the statement block tokens from curly braces to square braces. An attribute can't change the IL that gets generated. The code still compiles to the same IL; the attribute just controls the execution environment, such as how the JIT compiler chooses to lay out a structure in memory.
Attribute or not, enumerations follow the same rule for automatic assignment: An enumeration symbol receives the value one greater than the previous enumeration symbol.
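A small sketch makes the consequence concrete. The enumeration names and members below are invented for illustration; the point is only that the values come out the same with or without the attribute, so flag-style enumerations need explicit powers of two.

```csharp
using System;

[Flags]
enum AutoAssigned { Read, Write, Execute, Delete }   // still 0, 1, 2, 3

[Flags]
enum PowersOfTwo { None = 0, Read = 1, Write = 2, Execute = 4, Delete = 8 }

static class FlagsDemo
{
    public static void Main()
    {
        // The attribute did not change the numbering.
        Console.WriteLine((int)AutoAssigned.Delete);   // prints 3

        // So combining two "flags" collides with a third one.
        Console.WriteLine((AutoAssigned.Write | AutoAssigned.Execute)
                          == AutoAssigned.Delete);     // prints True

        // With explicit values, the combination is distinct.
        Console.WriteLine((int)(PowersOfTwo.Write | PowersOfTwo.Execute)); // prints 6
    }
}
```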
Welcome to CLR Week 2009. As always, we start with a warm-up.
The String.Format method doesn't throw a FormatException if you pass too many parameters, but it does if you pass too few. Why the asymmetry?
Well, this is the type of asymmetry you see in the world a lot. You need a ticket for each person that attends a concert. If you have too few tickets, they won't let you in. If you have too many, well, that's a bit wasteful, but you can still get in; the extras are ignored. If you create an array with 10 elements and use only the first five, nobody is going to raise an ArrayBiggerThanNecessary exception. Similarly, the String.Format method doesn't mind if you pass too many parameters; it just ignores the extras. There's nothing harmful about it, just a bit wasteful.
Besides, you probably don't want this to be an error:
if (verbose) {
    format = "{0} is not {1} (because of {2})";
} else {
    format = "{0} not {1}";
}
String.Format(format, "Zero", "One", "Two");
Think of the format string as a SELECT clause from the dataset provided by the remaining parameters. If your table has fields ID and NAME and you select just the ID, there's nothing wrong with that. But if you ask for DATE, then you have an error.
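Both halves of the asymmetry are easy to see in a short sketch. The FormatDemo class and its method names are invented for this example; the String.Format calls are the real thing.

```csharp
using System;

static class FormatDemo
{
    // Always passes three arguments; the terse format uses only two of them.
    public static string Describe(bool verbose)
    {
        string format = verbose
            ? "{0} is not {1} (because of {2})"
            : "{0} not {1}";
        return String.Format(format, "Zero", "One", "Two");
    }

    // Asks for {1} but supplies only one argument.
    public static bool TooFewThrows()
    {
        try
        {
            String.Format("{0} is not {1}", "Zero");
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(false));  // prints Zero not One
        Console.WriteLine(Describe(true));   // prints Zero is not One (because of Two)
        Console.WriteLine(TooFewThrows());   // prints True
    }
}
```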
I was out of town for the grading of the seventh grade essays, so I pitched in with the sixth grade essays instead. The students were asked to think of an adult and describe the qualities that make that person an adult. This topic was not very well received by the students, who deemed it uncreative and boring. While I understand their lack of enthusiasm, it's also true that for most of your life, you're going to have to write on topics that are uncreative and boring (and the stakes are going to be higher), so you'd better get good at it.
The difference in writing skill between sixth and seventh graders (between eleven year olds and twelve year olds) is quite noticeable. Many sixth graders could not get past the literal definition of the word adult, describing the qualities that make an adult purely in terms of biology: Age, height, strength, puberty, armpit hair. Many others focused on accomplishments or privileges that distinguish adults from children: Advanced education, having a job, knowing how to drive a car, and being able to stay up late without getting yelled at.
Remember, these are just the funny sentences/excerpts. Do not assume that all students write like this. The assignment is given under standardized test conditions: 90 minutes with nothing but pencil and paper, with one additional hour available upon request.
The easy life
Check your fun at the door
Responsible behavior
Concluding thoughts
Other remarks on student writing:
And that's why I read student essays.
The other night, I was playing a friendly game of Scrabble®, and I managed to play BEANIER* (meaning "with a stronger flavor of beans") onto a triple-word score, crossing the B with an open Y, scoring over 100 points in the process. This sufficiently demoralized the other players that the game turned into "play anything that vaguely resembles a word, with creative spelling encouraged."
It turns out that BEANIER* is not listed in the online versions of the SOWPODS or TWL Scrabble word lists, although I made the move in good faith. If the others had thought to challenge, they would've succeeded.
My brother and I play Scrabble with very different styles. I'm not so much concerned with scoring (although I certainly try to make high-scoring moves) as I am with having a pretty board with a lot of intersections and clever words. I treat Scrabble as a collaborative effort that happens to have a winner at the end, in the same spirit as shows like My Music or Says You. As a result, I don't pay too much attention to whether I'm opening easy access to a triple-word square, and I will forgo a higher-scoring play in favor of one that uses a funny word or which connects two parts of the board. If you look at my scoresheet at the end of the game, it consists of a lot of medium-scoring moves (and a few really pathetic ones), with maybe one "super-move" per game where I play a bingo or otherwise manage to rack up a lot of points at one go.
My brother's approach is much more methodical. He doesn't play a very flashy game; he just focuses on scoring twenty or more points per move. If you look at his scoresheet, it's just a slow, steady climb to the final tally.
This means that when we play, it's a competition between the tortoise and the hare. (I'm the hare.) Will my "super-move" be enough to hold off the steady erosion of my lead from the constant barrage of strong moves? Usually, the answer is No. Slow and steady wins the race. But I like to think I have more fun.
One of the flags you can pass to the IShellFolder::CompareIDs method is SHCIDS_CANONICALONLY. This flag means that the method should determine whether the two pointers refer to the same underlying object, and if they do not, then it should determine which one should come first by whatever mechanism it wants. It doesn't matter which one is declared as coming before the other one, as long as it is consistent.
I like to think of this as the moral equivalent of the Unicode ordinal comparison. In both cases, you use the comparison if you have two items that you wish to keep in sorted order, but you don't care what the ordering rules are, as long as they are consistent. In fact, all you care about is consistency, and you're perfectly happy to sacrifice readability for speed. The resulting sorted list won't be displayed to the user; all you're going to use it for is locating the item later.
You can think of this as the moral equivalent of the NTFS file name sorting algorithm. In both cases, the items are sorted not so that the user can find them, but so that the program can find them.
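The ordinal-comparison side of the analogy can be sketched in managed code. This is not a call to IShellFolder::CompareIDs; it only illustrates the "sorted so the program can find it" idea using strings, and the OrdinalIndex class and its methods are names invented for the example.

```csharp
using System;
using System.Collections.Generic;

static class OrdinalIndex
{
    // Sort by raw code unit values: fast and consistent, but not a
    // human-friendly order.
    public static List<string> Build(IEnumerable<string> names)
    {
        var list = new List<string>(names);
        list.Sort(StringComparer.Ordinal);
        return list;
    }

    // The lookup must use the same comparer that produced the order.
    public static int Find(List<string> sorted, string name) =>
        sorted.BinarySearch(name, StringComparer.Ordinal);

    public static void Main()
    {
        var index = Build(new[] { "banana", "Apple", "cherry", "apple" });
        Console.WriteLine(string.Join(", ", index)); // prints Apple, apple, banana, cherry
        Console.WriteLine(Find(index, "banana"));    // prints 2
        Console.WriteLine(Find(index, "durian") < 0); // prints True (not present)
    }
}
```

Nobody would show that list to a user in that order, but the program can locate any item in it with a binary search, which is all it was for.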
Back in the days before perl ruled the earth, regular expressions were one of those weird niche features, one of those things that everybody reimplements when they need it. If you look at the old unix tools, you'll see that even then, there were three different regular expression engines with different syntax. You had grep, egrep, and vi. Probably more.
The grep regular expression language supported character classes, the dot wildcard, the asterisk operator, the start and end anchors, and grouping. No plus operator, no question mark, no alternation, no repetition counts. The egrep program added support for plus, question mark, and alternation. Meanwhile, somebody went back and added repetition counts to grep but didn't add them to vi; somebody else added the \< and \> metacharacters to vi but didn't add them to sed. POSIX added repetition counts to awk but changed the notation from \{n,m\} to {n,m}. And so on.
No two programs use the same regular expression language, but they overlap sufficiently that you can often get by with the common subset and not have to worry about which particular flavor you're up against.
Until you wander into the places where they differ.
From: John Jones Subject: Problem with regular expression
I'm trying to write a regular expression to match blah blah blah.
From: Jane Smith Subject: RE: Problem with regular expression
I think this will match what you want: ^Z@1&*B*!34
I just ran my hand randomly over the keyboard to generate that fake regular expression. The scary thing is, at first glance, it is not obviously not a regular expression!
From: Chris Brown Subject: RE: Problem with regular expression
Try $)(#$C)*#
From: John Smith Subject: RE: Problem with regular expression
Thanks, everybody, for your suggestions, but I can't get any of them to work. For example, I can't get any of them to match against this string: blah blah blah blah.
At this point, people chimed in with other suggestions, confirming that John doubled the backslashes, that sort of thing. John posted his test program, and then the reason was obvious.
From: Jane Smith Subject: RE: Problem with regular expression
Oh, you're using CAtlRegExp. In that class, \w doesn't match a single character; it matches an entire word. You want to use \a instead.
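For contrast, here is a quick sketch of how one other flavor, the .NET Regex class, treats the same metacharacter. It shows only the .NET behavior; the CAtlRegExp behavior is as described in the thread above. The RegexFlavorDemo class and FirstMatch helper are names invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexFlavorDemo
{
    // Returns the text of the first match, or an empty string if none
    public static string FirstMatch(string pattern, string input) =>
        Regex.Match(input, pattern).Value;

    public static void Main()
    {
        // In .NET, \w is a single word character...
        Console.WriteLine(FirstMatch(@"\w", "blah blah"));    // prints b

        // ...so matching a whole word takes an explicit repetition.
        Console.WriteLine(FirstMatch(@"\w+", "blah blah"));   // prints blah

        // Repetition counts use unescaped braces in this flavor.
        Console.WriteLine(FirstMatch(@"a{2,3}", "baaaad"));   // prints aaa
    }
}
```

A pattern written against one of these engines and pasted into another can compile without complaint and still match something quite different, which is what tripped up John.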