Came across a post by Justin Etheredge discussing whether changing between languages is just a matter of syntax.
Or, to pick a specific example, can a Java programmer quickly and easily learn to write C# code?
The answer is obviously "yes". Development is about a way of thinking and approaching problems, and given the similarity between Java and C#, a good Java developer should take a minimal amount of time to learn how to write functional code in C#. The biggest barrier is libraries, which are more different than the languages are.
The answer is equally obviously "no". Sure, you can write functional code, but you will not be able to write idiomatic code. Like a high school senior with 4 years of French class on a trip to Paris, you can make yourself understood, but you aren't going to be mistaken for a native. You ask a question, somebody replies, "Ce ne sont pas vos oignons", and you just end up thinking of soup.
So, yeah, you can write C# code, but it's going to be Java written in C#. Given the closeness of the languages, it may be sufficient, but you're going to force some refactoring on any idiomatic C# speakers who inherit your code.
It can be worse - when I first started writing in Perl, I wrote C code in Perl, which just doesn't work very well. And over time, I became at least functional, though perhaps not idiomatic in Perl (though, because of TMTOWTDI, it's hard to judge that in Perl).
However, if you can become idiomatic in multiple languages, your toolset broadens, and you become more useful in all your languages.
Two posts (1 2) on C# loop optimization got me thinking recently.
Thinking about what I did when I first joined Microsoft.
Way back in the spring of 1995 or so (yes, we did have computers back then, but the Internet of the time really *was* just a series of tubes), I was on the C++ compiler test team, and had just picked up the responsibility for running benchmark tests on various C++ compilers. I would run compilation speed and execution speed tests in controlled environments, so that we could always know where we were.
We used a series of “standard” benchmarks – such as Spec – and a few of our own.
Because execution speed was one of the few ways (other than boxes with lots of checkmarks) that you could differentiate your compiler from the other guy’s, all the compiler companies invested resources in being faster at the benchmarks.
The starting point was to look at the benchmark source, the resultant IL, and the final machine code, and see if you could see any opportunity for improvement. Were you missing any optimization opportunities?
Sometimes, that wasn’t enough, so some compiler writers (*not* the ones I worked with) sometimes got creative.
You could, for example, identify the presence of a specific expression tree that just “happened to show up” in the hot part of a benchmark, and bypass your usual code generation with a bit of hand-tuned assembly that did things a lot faster.
Or, with a little more work, you could identify the entire benchmark, and substitute another bit of hand-tuned assembly.
Or, perhaps that hand-tuned assembly doesn’t really do *all* the work it needs to, but takes a few shortcuts and still manages to return the correct answer.
For some interesting accounts, please text “compiler benchmark cheating” to your preferred search engine.
As part of that work, I got involved a bit in the writing and evaluation of benchmarks, and I thought I’d share a few rules around writing and interpreting micro-benchmarks. I’ll speak a bit about the two posts – which are about looping optimizations in C# – along the way. Just be sure to listen closely, as I will be speaking softly (though not in the Rooseveltian sense…)
Rule 0: Don’t
There has always been a widespread assumption that the speed of individual language constructs matters. It doesn’t.
Okay, it does, but only in limited cases, and frankly people devote more time to it than it deserves.
The more productive thing is to follow the agile guideline and write the simplest thing that works. And note that “works” is a bit of a weasely word here – if you write scientific computing software, you may have foreknowledge about what operations need to be fast and can safely choose something more complicated, but for most development that is assuredly not true.
Rule 1: Do something useful
Consider the following:
    void DoLoop()
    {
        for (int x = 0; x < XMAX; x++)
        {
            for (int y = 0; y < YMAX; y++)
            {
            }
        }
    }

    void TimeLoop()
    {
        // start timer
        for (int count = 0; count < 1000; count++)
        {
            DoLoop();
        }
        // stop timer
    }

If XMAX is 1000, YMAX is 1000, and the total execution time is 0.01 seconds, what is the time spent per iteration? Answer: Unknown.
The average C++ optimizer is smarter than this. That nested loop has no effect on the result of the program, so the compiler is free to optimize it out (the .NET JIT may not have time to do this).
So, you modify the loop to be something like:
    void DoLoop()
    {
        // sum must be initialized, or C# rejects the += below
        int sum = 0;

        for (int x = 0; x < XMAX; x++)
        {
            for (int y = 0; y < YMAX; y++)
            {
                sum += y;
            }
        }
    }
The loop now has some work done inside of it, so the loop can’t be eliminated.
Rule 2: No, really. Do something useful
However, the numbers won’t change. The call to DoLoop() has no side effects, so the entire call can be safely eliminated.
To make sure your loop is really a loop, there needs to be a side effect. The best bet is to have a value returned from the method and write it out to the console. This has the added benefit of giving you a way of checking whether things are working correctly.
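As a rough sketch of that advice (not the code from either of the posts under discussion), the loop below returns its sum and the caller prints it, so the work has a visible result. The class name, the use of Stopwatch, and the constant values are assumptions made for illustration, and a sufficiently clever optimizer could still simplify a loop this trivial.

```csharp
using System;
using System.Diagnostics;

static class LoopBenchmark
{
    const int XMAX = 1000;
    const int YMAX = 1000;

    // Returns the sum so the loop's work is observable by the caller.
    public static long DoLoop()
    {
        long sum = 0;
        for (int x = 0; x < XMAX; x++)
        {
            for (int y = 0; y < YMAX; y++)
            {
                sum += y;
            }
        }
        return sum;
    }

    static void Main()
    {
        Stopwatch timer = Stopwatch.StartNew();
        long total = 0;
        for (int count = 0; count < 1000; count++)
        {
            total += DoLoop();
        }
        timer.Stop();

        // Writing the total out is the side effect; it also doubles as a sanity check.
        Console.WriteLine("total = {0}, elapsed = {1} ms", total, timer.ElapsedMilliseconds);
    }
}
```

If the printed total ever changes between runs or between variants of the loop, the benchmark is measuring something other than what you think it is.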
Rule 3: Benchmark != Real world
There are lurking effects that invalidate your results. Your benchmark is likely tiny and places very different memory demands on the system than your real program does.
Rule 4: Profile, don’t benchmark
C# loop optimization
If you are writing code that needs the utmost in speed, there is an improvement to be had using for rather than foreach. There is also improvement to be had using arrays rather than lists, and unsafe code and pointers rather than array indexing.
Whether this is worthwhile in a specific case depends exactly on what the code is doing. I don’t see a lot of point in spending time measuring loops when you could spend time measuring the actual code.
(Interestingly, I find myself writing more about agile and team stuff now that I'm not on a development team....)
This is in response to a question about how you balance individual empowerment with the collaborative approach on an agile team...
***
Agile is all about the team, and being on an agile team requires participants to give up some autonomy towards the team. The team is empowered to do what they need to do to reach their goal. If there are issues around how things should be done or what decision is right, the team needs to come to a decision, and I would encourage management to let the team try to do it. Further, the team needs “meta rules” around how to make decisions, and they also need to develop those.
This is very different than the “alpha geek” culture that exists in some groups, where a small number of developers are interested in wielding power. There are some individuals who just aren’t willing/able to work collaboratively – I’ve worked with a few, and if you are trying to run an agile team, they are likely better in a different position.
One of the teams I was on basically came to this agreement:
Developers are expected to use their best judgement when deciding what advice to seek when they are doing development. There are no rules around when you should seek advice, but as a rough guideline, extending functionality under existing patterns is something you can safely do on your own, and big refactorings or new components are areas where you should definitely seek advice. In between, think about the implications of any design choices you might make, and act accordingly.
The other approach is to adopt pair programming, which is a bigger cultural change, but generally if you get two people thinking about decisions they usually make the right decision about involving others.
There was a post on an internal alias about moving a team that has not been creating any developer-written tests to one that does TDD. I wrote a reply that I think may be of more general interest...
Developers are fluent in code. I think any time you are talking to developers about engineering practices, you need to be showing them code. I think showing TDD workflow is very important. I also have stopped using the term “Unit test” as it means 45 different things before 10am.
I’m a big fan of selling the problem rather than the solution. Here are some problems that I think resonate with the developers I know:
Time Savings
Nearly every group with existing code has a section of code that either is or has been a bug factory, and most developers can make a decent estimate of what parts of new code are “tricky”. If you use TDD (or just write tests for) that set of code, you can save yourself a huge amount of time debugging. You can also factor in the difficulty of debugging certain parts of code – there is a big benefit there.
Flexibility
People also tend to own code that a) needs changes and b) is brittle and c) nobody understands any more, and everybody knows how much of a pain that is.
Freedom to focus on new stuff
All devs like to work on new stuff rather than fixing old bugs. If you have good tests along the way, you won’t have to be dragged away from the new stuff to fix the old stuff as often.
Pride
Everybody likes to write code that works well, nobody likes to own the bug factory. In fact, I think people leave groups to get away from something they know is a bug factory but nobody else does.
Permission
Group dynamics often push devs toward meeting a team’s definition of code complete rather than spending the time writing tests. Now, I happen to think that TDD often gets you to code complete sooner (in a traditional milestone approach), but if you’re just learning it, that isn’t the case. You need an explicit declaration that it’s okay to be spending the time writing the tests.
Tests as examples
If you are creating a component that is consumed by somebody else, you can save a bunch of your time by having unit tests. Not only do you spend less time responding to email and doing bugfixes, the tests often provide very nice examples of how to use your component.
You may note that I don’t list “Design by example” there at all. I think that design by example is a much better way to create software, but it’s an experiential thing rather than something you can put on a powerpoint deck.
Hope that helps.
To actually get anything installed, we'll need a more reasonable WXS file.
    <?xml version="1.0"?>
    <Wix xmlns="http://schemas.microsoft.com/wix/2003/01/wi">
      <Product Name="Microsoft HealthVault Shortcut - Fabrikam WidgetTracker" Id="PUT-GUID-HERE" Language="1033" Codepage="1252" Version="1.0.0.0" Manufacturer="Fabrikam" UpgradeCode="PUT-GUID-HERE">
        <Package Id="PUT-GUID-HERE" Description="Microsoft HealthVault Shortcut - Fabrikam WidgetTracker" Manufacturer="Fabrikam" InstallerVersion="100" Compressed="yes"/>
        <Property Id="ARPNOMODIFY" Value="1" />
        <Property Id="ARPNOREPAIR" Value="1" />
        <Media Id="1" Cabinet="CCextend.cab" EmbedCab="yes"/>
        <Directory Id="TARGETDIR" Name="SourceDir">
          <!--***************************************************************************-->
          <!--These are files that get installed in the machines 'Documents and Settings'-->
          <!--subfolders. The files are places in a location that the ConnectionCenter app -->
          <!--knows to look in for MS and 3rd party components that should show up in the-->
          <!--ConnectionCenter UI. -->
          <!--The 1st Directory ID below is a pre-defined value that represents a -->
          <!--location on the user's Windows machine. -->
          <!--The files are copied into the 'all users' app data/settings directory -->
          <!--***************************************************************************-->
          <Directory Id="CommonAppDataFolder" Name="LAppsDir">
            <Directory Id="AppDataMSDir" Name="Msft" LongName="Microsoft">
              <Directory Id="AppDataHealthSolutionsDir" Name="HSDir" LongName="HealthVault">
                <Directory Id="AppDataHealthCenterDir" Name="HCDir" LongName="Connection Center" >
                  <Component Id="ConnectionCenterShortcut" Guid="PUT-GUID-HERE">
                    <File Id="shortcut_xml" Name="short_0.xml" LongName="WidgetTracker.xml" DiskId="1" Source="WidgetTracker.xml" Vital="yes"/>
                    <File Id="shortcut_icon" Name="short_0.ico" LongName="WidgetTracker.ico" DiskId="1" Source="WidgetTracker.ico" Vital="yes"/>
                    <File Id="shortcut_logo" Name="short_l.xml" LongName="WidgetTracker.png" DiskId="1" Source="WidgetTracker.png" Vital="yes"/>
                  </Component>
                </Directory>
              </Directory>
            </Directory>
          </Directory>
        </Directory>
        <Feature Id="Complete" Level="1">
          <ComponentRef Id="ConnectionCenterShortcut" />
        </Feature>
      </Product>
    </Wix>
The ARPNOMODIFY and ARPNOREPAIR tags mean that the shortcut doesn't show 'repair' or 'modify' UI in add/remove programs.
The directory tags define the folder hierarchy so that the files will get in the proper place.
The component tag defines the component. It needs a unique GUID, but AFAICT, you do not need to create a new one with each release.
The file tags are pretty self-explanatory. Source is the build location, and LongName is the name of the file in the target directory. Note that the filenames need to be unique to your application, so if you have a particular brand, it would be good to name them using that.
Finally, the feature tag defines which components are included in the product.
So, now we could run that and create a package, if only we knew what the files looked like. That's next.
I'm writing this specifically for developers who need to add links into HealthVault Connection Center, but I think the topic is of general interest to anyone who wants to create installer packages.
We're going to be using the WiX (Windows Installer XML) toolset to create msi files.
Note 1: While I find the mixed case "WiX" name quite enjoyable, I'm just going to use "wix" in the rest of the writeup
Note 2: There are some good wix tutorials out there that you may find useful.
Step 1: Get the tools
Find the Download section on the right side of the WIX home page. I'm basing this on the "Version 2.0 (stable)" bits. If there are newer bits, they might work, and then again, they might not.
Off of that link, you'll find a link to the binaries zip file. Unzip them, and put them in a directory (I found "g:\wix" to be an aesthetically pleasing location).
Step 2: Read the docs
I know that you're going to skip this step, just like you skipped the instructions on the new circular saw you bought last weekend.
If you want to actually read the docs, there are some in the "doc" directory, and some online. But I prefer skipping to the next step.
Step 3: Understand the tools
Wix files use the .wxs extension. If you run the wix compiler over them ("candle"), you end up with a .wixobj file. Running the wix linker ("light") generates a .msi file.
Step 4: Build your first MSI
Here are the bare-bones of a .wxs file:
    <?xml version="1.0"?>
    <Wix xmlns="http://schemas.microsoft.com/wix/2003/01/wi">
      <Product Name="Fabrikam WidgetTracker" Id="PUT-GUID-HERE" Language="1033" Codepage="1252" Version="1.0.0.0" Manufacturer="Fabrikam" UpgradeCode="PUT-GUID-HERE">
        <Package Id="PUT-GUID-HERE" Description="Fabrikam users can use this to track their widgets" Manufacturer="Fabrikam" InstallerVersion="100" Compressed="yes"/>
      </Product>
    </Wix>
Most of that is pretty self-explanatory, though I do want to talk about the IDs.
The ids are GUIDs that are used to keep track of things. I won't go into the details because they're sorta complicated, but basically, the "id" field uniquely identifies the product and the package within the product. Whenever you release an MSI with any different bits, you need to have a different product id or the installer will think you've already installed it.
Packages also need to be unique, and change when their contents change.
UpgradeCodes are also GUIDs, but should never change. They are the mechanism that the installer uses to figure out that you're doing an upgrade. If the upgrade scenario is important, you need to set it.
So, to recap, product and package ids need to be unique and generated for each release. UpgradeCode should be generated once, and you use that same UpgradeCode for the life of the product.
You do the generation with uuidgen.exe. You can download that with the Windows SDK, or use one of the online generators.
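If uuidgen.exe isn't handy, a few lines of C# can stand in for it. This is only a sketch of an alternative, not part of the wix toolset: the class and method names are made up for illustration, and upper-casing the result is an assumption about what reads well in a .wxs file rather than a documented requirement.

```csharp
using System;

static class GuidMaker
{
    // Returns a new GUID in hyphenated form, upper-cased (the casing is an assumption).
    public static string NewWixGuid()
    {
        return Guid.NewGuid().ToString("D").ToUpperInvariant();
    }

    static void Main()
    {
        // Generate fresh values for the ids that change with each release.
        Console.WriteLine("Product Id: " + NewWixGuid());
        Console.WriteLine("Package Id: " + NewWixGuid());
    }
}
```

The UpgradeCode would be generated once the same way and then pasted in and left alone.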
This package does very little, by the way, but you can install and uninstall it.
One of my readers asked whether there were any UI unit testing tools.
While I have seen some ASP.net tools like this, in general I'd expect that you would unit test a UI by making the UI a very thin layer (one that doesn't really need testing), and writing the unit tests to talk to the layer underneath.
Though I haven't had the opportunity to try it on a full project, I think that Presenter First has a lot going for it.
From Jim Newkirk, one of the original NUnit authors...
xunit.net
O'Reilly publishes Beautiful Code
Jonathan Edwards counters with a beautiful explanation.
Now, I haven't read the new book, but I have a strong resonance with what Edwards wrote. You should definitely read the whole thing, but a few sentences jumped out at me.
A lesson I have learned the hard way is that we aren’t smart enough. Even the most brilliant programmers routinely make stupid mistakes. Not just typos, but basic design errors that back the code into a corner, and in retrospect should have been obvious.
and
It seems that infatuation with a design inevitably leads to heartbreak, as overlooked ugly realities intrude.
Precisely.
If there's anything that agile says, it says that we should build things simply and with an eye to revision because not only are we "just not smart enough", there are too many unknowns when we start.
The problem with "beautiful code" as a concept is that it is closely related to "beautiful design", and I've mostly come to the conclusion that any design effort that takes more than, say, 30 minutes is a waste of time.
The concept also gets confused about what the goal of software is anyway. The goal is not to have beautiful, elegant, code. The goal is to have *useful* code that does what you need it to do.
Discuss and comment
A follow-on to the previous discussion about member names.
There were a variety of opinions, some of which argued for using no prefix at all.
For those of you who are in the group, I'm interested in how you manage things when you are doing UI work, and having to deal with your 3 member variables being swallowed by the 100 methods already defined in the base class.
Is this an issue for you? If so, how do you deal with it?
Thanks for your comments.
I decided to go ahead and write the unit tests for that layer, both because I knew what not writing them would be like, and I wanted to play with wrapping/mocking a system service.
I also decided - as some of you commented - to do the right thing and encapsulate it into a class. That would have happened long ago, but though I've written it several times, I don't think I've ever duplicated it within a single codebase - and the codebases where I did write it are pretty disparate. Now, I have something where I could at least move the source file around...
Writing tests for this was a bit weird, because in some sense what I needed to do was figure out what the system behavior was, break that down, write a test against my objects, and then write mocks that allowed me to simulate the underlying behavior.
So, for example, I created a test to enumerate a single file in a single directory, wrote a wrapper around DirectoryInfo, and then created a mock on that object so I could write GetFiles() to pass back what I wanted. And so on with multiple files, sub-directories, etc.
So, I did that, went to write the little bit of code that I needed in the real version (to use the real GetFiles() calls and package the data up), hooked it up to my real code, and it worked.
*But*, when I went back and looked at the code, I found that what I had really done was create two sets of code. There was the real code that called the system routines and shuffled the data into my wrapped classes. And then there was my mock code that let me control what files and directories got returned. But there wasn't any common code that was shared.
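To make the shape of the problem concrete, here is a minimal sketch of a wrapper-plus-fake arrangement like the one described. Every name in it (IDirectory, RealDirectory, FakeDirectory, FileFinder) is hypothetical and not taken from the actual utility; FileFinder stands in for the "next level up" code that would consume the wrapper.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical wrapper interface over a directory.
public interface IDirectory
{
    IEnumerable<string> GetFiles();
    IEnumerable<IDirectory> GetDirectories();
}

// "Real" side: nothing but hookup code around DirectoryInfo.
public class RealDirectory : IDirectory
{
    readonly DirectoryInfo m_info;

    public RealDirectory(string path) { m_info = new DirectoryInfo(path); }
    RealDirectory(DirectoryInfo info) { m_info = info; }

    public IEnumerable<string> GetFiles()
    {
        foreach (FileInfo file in m_info.GetFiles())
            yield return file.FullName;
    }

    public IEnumerable<IDirectory> GetDirectories()
    {
        foreach (DirectoryInfo sub in m_info.GetDirectories())
            yield return new RealDirectory(sub);
    }
}

// "Mock" side: a hand-rolled fake that hands back whatever the test sets up.
public class FakeDirectory : IDirectory
{
    public List<string> Files = new List<string>();
    public List<IDirectory> Directories = new List<IDirectory>();

    public IEnumerable<string> GetFiles() { return Files; }
    public IEnumerable<IDirectory> GetDirectories() { return Directories; }
}

// Hypothetical consumer: recursively collects files with a given name.
public static class FileFinder
{
    public static List<string> Find(IDirectory root, string fileName)
    {
        List<string> found = new List<string>();
        foreach (string file in root.GetFiles())
        {
            if (string.Equals(Path.GetFileName(file), fileName, StringComparison.OrdinalIgnoreCase))
                found.Add(file);
        }
        foreach (IDirectory sub in root.GetDirectories())
            found.AddRange(Find(sub, fileName));
        return found;
    }
}

static class Program
{
    static void Main()
    {
        FakeDirectory sub = new FakeDirectory();
        sub.Files.Add("root/sub/config.xml");

        FakeDirectory root = new FakeDirectory();
        root.Files.Add("root/readme.txt");
        root.Directories.Add(sub);

        foreach (string path in FileFinder.Find(root, "config.xml"))
            Console.WriteLine(path);
    }
}
```

Notice that a test written against FakeDirectory never executes a line of RealDirectory, which is exactly the gap described here: the wrapper's real half is the only code touching the file system, and it is the half the tests can't reach.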
So, my conclusion is that I really didn't get anything out of the tests I wrote, because the tests only tested the mocks that I wrote rather than the real code, because the only real code was the code that called the system functions.
In this case, TDD didn't make sense, and I will probably pull those tests out of the system. TDD may make sense the next level up, where I've written a new encapsulation around directory traversal, but it seems like the only code there is hookup code.
So, the result of my experiment was that, in this case, writing the tests was the wrong thing to do.
I've been writing a small utility to help us do some configuration setup for testing. It needs to walk a directory structure, find all instances of a specific xml file, and then make some modifications to the file.
I TDD'd the class that does the XML file stuff, and I'm confident that it's working well. I'm now going to do the class to walk the directory structure and find the files.
And there is my dilemma. I don't know if I'm going to do TDD on that.
I know exactly how to write it, I've written it before, and my experience is that that is code that never changes nor breaks. And, figuring out how to write tests for it is going to be somewhat complex because I'll have to isolate out the live file system parts.
So, I've already decided what I'm going to do, but I'm curious what you think. Does YAGNI apply to test code, or is that the first step to the dark side?
If scrum isn't to your liking, here are a few alternate methodologies that you might consider...
(circa November 15, 2001)
I think this column stands up pretty well without caveat.
I should note that I wrapped the image class to provide nicer pixel-based access in a class you can find here. I suggest basing your code on that rather than what I wrote in the column.
I should also note that the grey-scale code isn't what you want. The human eye is not equally sensitive to all colors, so you should use the following to determine the intensity:
0.299 * red + 0.587 * green + 0.114 * blue
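A minimal sketch of that weighting as a helper, using the coefficients quoted above. The class and method names are hypothetical, and rounding to the nearest byte is one reasonable choice among several.

```csharp
using System;

static class GreyScale
{
    // Weighted intensity: green counts most, blue least.
    public static byte Intensity(byte red, byte green, byte blue)
    {
        double value = 0.299 * red + 0.587 * green + 0.114 * blue;
        return (byte) Math.Round(value);
    }

    static void Main()
    {
        // Pure red, green, and blue map to quite different grey levels.
        Console.WriteLine(Intensity(255, 0, 0));
        Console.WriteLine(Intensity(0, 255, 0));
        Console.WriteLine(Intensity(0, 0, 255));
    }
}
```

In the column's code below, this would take the place of the simple (R + G + B) / 3 average.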
Finally, I'll note that the extreme speedup I get here is because of how the underlying unmanaged GDI+ code is structured.
******
Unsafe Image Processing
Last month, we talked a bit about what unsafe code was good for and worked through a few examples. If you looked at the associated source code, you may have noticed that there was an image-processing example. This month, we're going to work through that example.
Grey Scale Images
Last summer, I was writing a program to process some pictures from my digital camera and I needed to do some image processing in C#. My first task was to figure out how to do that using the .NET Frameworks. I did a bit of exploration, and found the Image and Bitmap classes in the System.Drawing namespace. These classes are thin covers over the GDI+ classes that encapsulate the image functions.

One task I wanted to do was to walk through an image and convert it from color to grayscale. To do this, I needed to modify each pixel in the bitmap. I started by writing a BitmapGrey class that would encapsulate the bitmap, and then writing a function in that class to do the image conversion. I came up with the following class:
    public class BitmapGrey
    {
        Bitmap bitmap;

        public BitmapGrey(Bitmap bitmap)
        {
            this.bitmap = bitmap;
        }

        public void Dispose()
        {
            bitmap.Dispose();
        }

        public Bitmap Bitmap
        {
            get { return(bitmap); }
        }

        public Point PixelSize
        {
            get
            {
                GraphicsUnit unit = GraphicsUnit.Pixel;
                RectangleF bounds = bitmap.GetBounds(ref unit);

                return new Point((int) bounds.Width, (int) bounds.Height);
            }
        }

        public void MakeGrey()
        {
            Point size = PixelSize;

            for (int x = 0; x < size.X; x++)
            {
                for (int y = 0; y < size.Y; y++)
                {
                    Color c = bitmap.GetPixel(x, y);
                    int value = (c.R + c.G + c.B) / 3;
                    bitmap.SetPixel(x, y, Color.FromArgb(value, value, value));
                }
            }
        }
    }
The meat of the class is in the MakeGrey() method. It gets the bounds of the bitmap, and then walks through each pixel in the bitmap. For each pixel, it fetches the color and determines what the average brightness of that pixel should be. It then creates a new color value for the brightness value, and stores it back into the pixel.
This code was simple to write, easy to understand, and worked the first time I wrote it. Unfortunately, it's pretty slow; it takes about 14 seconds to process a single image. That's pretty slow if I compare it to a commercial image processing program, such as Paint Shop Pro, which can do the same operation in under a second.
Accessing the Bitmap Data Directly
The majority of the processing time is being spent in the GetPixel() and SetPixel() functions, and to speed up the program, I needed a faster way to access the pixels in the image. There's an interesting method in the Bitmap class called LockBits(), which can be used – not surprisingly – to lock a bitmap in memory so that it doesn't move around. With the location locked, it's safe to deal with the memory directly, rather than using the GetPixel() and SetPixel() functions. When LockBits() is called, it returns a BitmapData class. The Scan0 property in the class is a pointer to the first line of bitmap data. We'll access this data to do our manipulation.

First, however, we need to understand a bit more about how the data in the image is arranged.

Bitmap Data Organization

The organization of the bitmap depends upon the type of data in the bitmap. By looking at the PixelFormat property in the BitmapData class, we can determine what data format is being used. In this case, I'm working with JPEG images, which use the Format24bppRgb (24 bits per pixel, red green blue) format. Since we're going to be looking at these pixels directly, we'll need a way to decode a pixel. We can do that with the PixelData struct:

    public struct PixelData
    {
        public byte blue;
        public byte green;
        public byte red;
    }
Now, we'll need a way to figure out what the offset is for a given pixel. Basically, we treat the bitmap data as an array of PixelData structures, and figure out what index we'd be looking for to reference a pixel at x, y. In memory, a bitmap is stored as one large chunk of memory, much like an array of bytes. We therefore need to figure out what the offset is of a given pixel from the beginning of the array. Here's what a 4x4 bitmap would look like, with the index and (y, x) location of each pixel.
For each line, the offset is simply the width of the line in pixels times the y value. This gives the index of the first element of the line, and the x value is simply added to that value to determine the actual index of an element. This is the way that multi-dimensional arrays are typically stored in memory.

So, we can figure out the offset as follows:

    Offset = x + y * width;

The code to access a given pixel is something like:

    PixelData *pPixel;
    pPixel = pBase + x + y * width;

Where pBase is the address of the first element.

If we go off and write some code, we'll find that this works great for some bitmaps, but doesn't work at all for other bitmaps. It turns out that the number of bytes in each line must be a multiple of 4 bytes, and since the pixels themselves are 3 bytes, there are some situations where there's some unused space at the end of each line. If we use the simple version of indexing above, we'll sometimes index into the unused space.

Here's an example of what this looks like in a bitmap, with each box now corresponding to a byte rather than a pixel.
This bitmap occupies 24 bytes, with 12 bytes per line. However, each line only has three pixels (9 bytes), so each line must be padded with an extra 3 bytes. To make our code work everywhere, we need to switch from working in terms of pixels to working in terms of bytes, at least for dealing with the line indexing. We also need to calculate the width of a line in bytes. We can write a function that handles all of this for us (here pBase is a byte pointer and width is the line width in bytes):

    public PixelData* PixelAt(int x, int y)
    {
        return (PixelData*) (pBase + y * width + x * sizeof(PixelData));
    }
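The width-in-bytes calculation mentioned above can be sketched on its own, without any unsafe code. This assumes 3-byte Format24bppRgb pixels and lines padded up to a multiple of 4 bytes; the class and method names are hypothetical, and in practice the Stride property of BitmapData reports the same figure, so treat this as an illustration of the arithmetic rather than something you'd need to ship.

```csharp
using System;

static class BitmapLayout
{
    const int BytesPerPixel = 3; // Format24bppRgb: blue, green, red

    // Bytes per line, rounded up to the next multiple of 4.
    public static int LineWidthInBytes(int widthInPixels)
    {
        return (widthInPixels * BytesPerPixel + 3) & ~3;
    }

    // Byte offset of pixel (x, y) from the start of the bitmap data.
    public static int ByteOffset(int x, int y, int widthInPixels)
    {
        return y * LineWidthInBytes(widthInPixels) + x * BytesPerPixel;
    }

    static void Main()
    {
        // The 3-pixel-wide example: 9 bytes of pixels padded to 12 per line.
        Console.WriteLine(LineWidthInBytes(3));
        // First pixel of the second line starts right after the padding.
        Console.WriteLine(ByteOffset(0, 1, 3));
    }
}
```

A bitmap whose pixel width is already a multiple of 4 has no padding at all, which is why the naive pixel-indexed version works for some images and not others.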
Once we've written that bit of code, we can finally write an unsafe version of our grey scale function:

    public void MakeGreyUnsafe()
    {
        Point size = PixelSize;
        LockBitmap();

        for (int x = 0; x < size.X; x++)
        {
            for (int y = 0; y < size.Y; y++)
            {
                PixelData* pPixel = PixelAt(x, y);

                int value = (pPixel->red + pPixel->green + pPixel->blue) / 3;
                pPixel->red = (byte) value;
                pPixel->green = (byte) value;
                pPixel->blue = (byte) value;
            }
        }
        UnlockBitmap();
    }
We start by calling a function to lock the bitmap. This function also figures out the width of the bitmap and stores the base address away. We then iterate through the pixels, pretty much just as before, except that for each pixel, we get a pointer to the pixel and then manipulate it through the pointer.

When this is tested, it only takes about 1.2 seconds to do this image. That's over 10 times faster. It's not clear where all the overhead is in GetPixel() / SetPixel(), but there's obviously some overhead in the transition from managed to unmanaged code, and when you have to do the transition 3.3 million times per image, that overhead adds up.

I originally stopped with this version, but a bit of reflection suggested another opportunity for improvement. We're calling the PixelAt() function for every pixel, even when we know that the pixels along a line are contiguous. We'll exploit that in the final version:

    public void MakeGreyUnsafeFaster()
    {
        Point size = PixelSize;
        LockBitmap();

        for (int y = 0; y < size.Y; y++)
        {
            PixelData* pPixel = PixelAt(0, y);
            for (int x = 0; x < size.X; x++)
            {
                byte value = (byte) ((pPixel->red + pPixel->green + pPixel->blue) / 3);
                pPixel->red = value;
                pPixel->green = value;
                pPixel->blue = value;
                pPixel++;
            }
        }
        UnlockBitmap();
    }
The loop in this version does the y values in the outer loop, so we can walk through the entire x line. At the beginning of each line, we use PixelAt() to find the address of the first element, and then use pointer arithmetic (the pPixel++ statement) to advance to the next element in the line.

My test bitmap is 1,536 pixels wide. With this modification, I'm replacing a call to PixelAt() with an additional operation in all but one of those pixels.

When this version is tested, the elapsed time is 0.52 seconds, which is over twice as fast as our previous version, and nearly 28 times faster than the original version. Sometimes unsafe code can be extremely useful, though gains of 28x are pretty rare.
I apologize for shocking your system by posting more than once a month - there are reasons for that, but I unfortunately can't get into them right now - but Keith added an interesting comment to my last post. He said:
Side Note: a bit disturbing you're using C++ naming conventions in C# though? :) No doubting your a ninja coder and I love your stuff, but seriously, bringing the m_ prefixing into C# is a bit of a "cant teach an old dog new tricks" thing.
This is pretty close to a "religious question", but since it's my blog, I'm always right (as codified in the "decree on Gunnerson infallibility of 1997"), so I'll take it on.
When I first started writing code, a lot of our samples were written without any way of indicating whether something was a field or not, and I wrote a considerable amount of code using that idiom. What I found was that when I went back and looked at my code later, I had to scroll around to find out where each variable came from, and that made understanding the code harder.
I toyed for a while with using just an underscore "_name", but I didn't like that. A single underline is a bit hard to pick up visually, and it seemed like I was inventing a different expression just to be different. So, I switched back to "m_", and I must say that I'm happy with the choice. The only place I don't like it is with events or other public fields, which are then named differently, but I'm willing to deal with that.
The only other place I use prefixes is on pointers, where I just use "p". Unsafe code is rare enough that I want to have an indicator of what's going on.
[Update: Another reason to use m_ is to make it easier to find your variable names in intellisense when you're working with controls, since there are roughly 4000 members in the average control class. I've also been using "c_" in the names of controls for the same reason]
So, what do you think, keeping in mind that if you disagree, you're wrong...
Today I was working with some sample code, and I came across a misspelling. Not a big deal - there was a field that was named "m_postion" rather than "m_position".
But that got me thinking...
In the past, that sort of thing wouldn't have happened. You would have written:
int m_postion;
but then, when you wrote your code, you would have written:
m_position = 5;
and the compiler would have complained. But with intellisense, you now just pick whatever looks right in the popup list, and the mistakes stay around a bit longer.
So, I wrote a bit of code to help with this - it reflects over an assembly, and produces a list of words.
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using System.Reflection;

namespace IdentifierExtract
{
    class GetIdentifiers
    {
        // Used as a set: keys are the lower-cased words, values are unused.
        Dictionary<string, object> m_identifiers = new Dictionary<string, object>();

        public GetIdentifiers()
        {
        }

        public void Process(string assemblyFilename)
        {
            Assembly assembly = Assembly.LoadFrom(assemblyFilename);
            foreach (Type type in assembly.GetTypes())
            {
                ProcessType(type);
            }
        }

        void ProcessType(Type type)
        {
            BreakIntoWordsAndAdd(type.Name);
            foreach (MemberInfo memberInfo in type.GetMembers(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.Static))
            {
                BreakIntoWordsAndAdd(memberInfo.Name);
                // Methods and constructors also contribute their parameter names.
                MethodBase methodBase = memberInfo as MethodBase;
                if (methodBase != null)
                {
                    foreach (ParameterInfo parameterInfo in methodBase.GetParameters())
                    {
                        BreakIntoWordsAndAdd(parameterInfo.Name);
                    }
                }
            }
        }

        void BreakIntoWordsAndAdd(string identifier)
        {
            List<string> words = BreakPascalCasing(identifier);
            foreach (string word in words)
            {
                AddIdentifier(word);
            }
        }

        List<string> BreakPascalCasing(string identifier)
        {
            // Any one character followed by a run of lowercase letters.
            Regex regex = new Regex(".[a-z]+");
            MatchCollection matches = regex.Matches(identifier);
            List<string> words = new List<string>();
            foreach (Match match in matches)
            {
                words.Add(match.Value);
            }
            return words;
        }

        private void AddIdentifier(string name)
        {
            name = name.ToLower();
            if (!m_identifiers.ContainsKey(name))
            {
                m_identifiers.Add(name, null);
            }
        }

        public void WriteToConsole()
        {
            List<string> identifiers = new List<string>(m_identifiers.Keys);
            identifiers.Sort();
            foreach (string identifier in identifiers)
            {
                Console.WriteLine(identifier);
            }
        }
    }
}
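It's worth seeing what the ".[a-z]+" pattern actually does to a few identifiers, since it decides which words end up in the list. This is a small standalone sketch that uses the same pattern; PascalCaseDemo and Break are names I've made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class PascalCaseDemo
{
    // Same pattern as BreakPascalCasing: any one character followed by
    // one or more lowercase ASCII letters.
    //   "BreakPascalCasing" gives Break, Pascal, Casing
    //   "m_postion"         gives _postion (the leading "m" is dropped)
    //   "GetHTMLText"       gives Get, Text (the all-caps run is dropped)
    public static List<string> Break(string identifier)
    {
        List<string> words = new List<string>();
        foreach (Match match in Regex.Matches(identifier, ".[a-z]+"))
        {
            words.Add(match.Value);
        }
        return words;
    }
}
```

So prefixed fields show up with their underscore attached and acronyms disappear entirely. That's fine for eyeballing a sorted list for typos, but a stricter tool would want a smarter splitter.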
As promised, I'm going to start republishing some of my columns that were eaten by MSDN.
I spent some time reading this one and deciding whether I would re-write it so that it was, like, correct. But it became clear that I didn't have a lot of enthusiasm towards that, so I've decided to post it as is (literally, as-is, with some ugly formatting because of how I used to do them in MSDN).
I also am not posting the source, though I might be tempted to put it someplace if there is a big desire for it.
So, on to the caveats...
The big caveat is that my understanding of how the probing rules work was incorrect. To get things to work the way I have them architected, you need to put them somewhere in the directory tree underneath where the exe lives, and if they aren't in the same directory, you need to add the directory where they live to the private bin path. I may also have the shadow directory stuff messed up.
So, without further ado, here's something that I wrote five years ago and has not been supplanted by more timely and more correct docs. AFAIK. If you know better references, *please* add them to the comments, and also comment on anything else that's wrong.
App Domains and dynamic loading
Eric Gunnerson
Microsoft Corporation
May 17, 2002
This month, I’m sitting in a departure lounge at Palm Springs airport, waiting to fly back to Seattle after an ASP.NET conference.
My original plan for this month – to the extent that I have a plan – was to do some work on the expression parsing part of the SuperGraph application. In the past few weeks, however, I’ve received several emails asking when I was going to get the loading and unloading of assemblies in app domains part done, so I’ve decided to focus on that instead.
Before I get into code, I’d like to talk a bit about what I’m trying to do. As you probably remember, SuperGraph lets you choose from a list of functions. I’d like to be able to put “add-in” assemblies in a specific directory, have SuperGraph detect them, load them, and find any functions contained in them.
Doing that by itself doesn’t require a separate AppDomain; Assembly.Load() usually works fine. The problem is when you want to provide a way for the user to update those assemblies when the program is running, which is really desirable if you’re writing something that runs on a server, and you don’t want to stop and start the server.
To do this, we’ll load all add-in assemblies in a separate AppDomain. When a file is added or modified, we’ll unload that AppDomain, create a new one, and load the current files into it. Then things will be great.
To make this a little clearer, I’ve created a diagram of a typical scenario:
In this diagram, the Loader class creates a new AppDomain named Functions. Once the AppDomain is created, Loader creates an instance of RemoteLoader inside that new AppDomain.
To load an assembly, a load function is called on the RemoteLoader. It opens up the new assembly, finds all the functions in it, packages them up into a FunctionList object, and then returns that object to the Loader. The Function objects in this FunctionList can then be used from the Graph function.
The first task is to create an AppDomain. To create it in the proper manner, we’ll need to pass it an AppDomainSetup object. The docs on this are useful enough once you understand how everything works, but aren’t much help if you’re trying to understand how things work. When a Google search on the subject turned up last month’s column as one of the higher matches, I suspected I might be in for a bit of trouble.
The basic problem has to do with how assemblies are loaded in the runtime. By default, the runtime will look either in the global assembly cache or in the current application directory tree. We’d like to load our add-in assemblies from a totally different directory.
When you look at the docs for AppDomainSetup, you’ll find that you can set the ApplicationBase property to the directory to search for assemblies. Unfortunately, we also need to reference the original program directory, because that’s where the RemoteLoader class lives.
The AppDomain writers understood this, so they’ve provided an additional location in which they’ll search for assemblies. We’ll use ApplicationBase to refer to our add-in directory, and then set PrivateBinPath to point to the main application directory.
Here’s the code from the Loader class that does this:
AppDomainSetup setup = new AppDomainSetup();
setup.ApplicationBase = functionDirectory;
setup.PrivateBinPath = AppDomain.CurrentDomain.BaseDirectory;
setup.ApplicationName = "Graph";
appDomain = AppDomain.CreateDomain("Functions", null, setup);
remoteLoader = (RemoteLoader)
appDomain.CreateInstanceFromAndUnwrap("SuperGraph.exe",
"SuperGraphInterface.RemoteLoader");
After the AppDomain is created, the CreateInstanceFromAndUnwrap() function is used to create an instance of the RemoteLoader class in the new app domain. Note that both the filename of the assembly that contains the class and the full name of the class are required.
When this call is executed, we get back an instance that looks just like a RemoteLoader. In fact, it’s actually a small proxy class that will forward any calls to the RemoteLoader instance in the other AppDomain. This is the same infrastructure that .NET remoting uses.
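The source for RemoteLoader isn't shown here, so the following is only a sketch of the one thing that makes the proxy behavior work: the class created in the other domain derives from MarshalByRefObject. RemoteLoaderSketch, GetTypeNames and ProxyDemo are hypothetical names, and the sketch returns plain strings rather than a FunctionList. Creating app domains this way only works on the .NET Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for RemoteLoader. Deriving from MarshalByRefObject
// is what lets the caller hold a proxy instead of a copy of the object.
public class RemoteLoaderSketch : MarshalByRefObject
{
    // Returns strings, which are serializable and so are copied back
    // across the AppDomain boundary.
    public string[] GetTypeNames(string assemblyName)
    {
        Assembly assembly = AppDomain.CurrentDomain.Load(assemblyName);
        List<string> names = new List<string>();
        foreach (Type type in assembly.GetTypes())
        {
            names.Add(type.FullName);
        }
        return names.ToArray();
    }
}

public static class ProxyDemo
{
    // .NET Framework only: AppDomain.CreateDomain is not supported on newer runtimes.
    public static string[] ListTypesInOtherDomain(string assemblyName)
    {
        AppDomain domain = AppDomain.CreateDomain("Functions");
        try
        {
            RemoteLoaderSketch loader = (RemoteLoaderSketch)domain.CreateInstanceAndUnwrap(
                typeof(RemoteLoaderSketch).Assembly.FullName,
                typeof(RemoteLoaderSketch).FullName);
            // This call runs in the "Functions" domain; only the result comes back.
            return loader.GetTypeNames(assemblyName);
        }
        finally
        {
            AppDomain.Unload(domain);
        }
    }
}
```

Anything returned across the boundary has to be either serializable (and is copied) or another MarshalByRefObject (and is proxied), which is worth keeping in mind when deciding what a loader method should return.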
When you write code to do this, you’re going to make mistakes. The documentation provides little advice on how to debug your app, but if you know who to ask, they’ll tell you about the Assembly Binding Log Viewer (named fuslogvw.exe, because the loading subsystem is known as “fusion”). When you run the viewer, you can tell it to log failures, and then when you run your app and it has problems loading an assembly, you can refresh the viewer and get details on what’s going on.
This is hugely useful to find out, for example, that Assembly.Load() doesn’t require “.dll” on the end of the filename. You can tell this in the log because it will tell you it tried to load “f.dll.dll”.
So, now that we’ve gotten the application domain created, it’s time to figure out how to load an assembly, and extract the functions from it. This requires code in two separate areas. The first finds the files in a directory, and loads each of them:
void LoadUserAssemblies()
{
    availableFunctions = new FunctionList();
    LoadBuiltInFunctions();
    DirectoryInfo d = new DirectoryInfo(functionAssemblyDirectory);
    foreach (FileInfo file in d.GetFiles("*.dll"))
    {
        // The loader wants the assembly name, without the ".dll" extension
        string filename = file.Name.Replace(file.Extension, "");
        FunctionList functionList = loader.LoadAssembly(filename);
        availableFunctions.Merge(functionList);
    }
}
This function in the Graph class finds all dll files in the add-in directory, removes the extension from them, and then tells the loader to load them. The returned list of functions is merged into the current list of functions.
The second bit of code is in the RemoteLoader class, to actually load the assembly and find the functions:
public FunctionList LoadAssembly(string filename)
{
    FunctionList functionList = new FunctionList();
    Assembly assembly = AppDomain.CurrentDomain.Load(filename);
    foreach (Type t in assembly.GetTypes())
    {
        functionList.AddAllFromType(t);
    }
    return functionList;
}
This code simply calls AppDomain.CurrentDomain.Load() on the filename (assembly name, really) passed in, and then loads all the useful functions into a FunctionList instance to return to the caller.
At this point, the application can start up, load in the add-in assemblies, and the user can refer to them.
The next task is to be able to reload these assemblies on demand. Eventually, we’ll want to be able to do this automatically, but for testing purposes, I added a Reload button to the form that will cause the assemblies to be reloaded. The handler for this button simply calls Graph.Reload(), which needs to perform the following actions:
1. Unload the app domain
2. Create a new app domain
3. Reload the assemblies in the new app domain
4. Hook up the graph lines to the newly created app domain
Step 4 is needed because the GraphLine objects contain Function objects that came from the old app domain. After that app domain is unloaded, the function objects can’t be used any longer.
To fix this, HookupFunctions() modifies the GraphLine objects so that they point to the correct functions from the current app domain.
Here’s the code:
loader.Unload();
loader = new Loader(functionAssemblyDirectory);
LoadUserAssemblies();
HookupFunctions();
reloadCount++;
if (this.ReloadCountChanged != null)
ReloadCountChanged(this, new ReloadEventArgs(reloadCount));
The last two lines fire an event whenever a reload operation is performed. This is used to update a reload counter on the form.
The next step is to be able to detect new or modified assemblies that show up in the add-in directory. The frameworks provide the FileSystemWatcher class to do this. Here’s the code I added to the Graph class constructor:
watcher = new FileSystemWatcher(functionAssemblyDirectory, "*.dll");
watcher.EnableRaisingEvents = true;
watcher.Changed += new FileSystemEventHandler(FunctionFileChanged);
watcher.Created += new FileSystemEventHandler(FunctionFileChanged);
watcher.Deleted += new FileSystemEventHandler(FunctionFileChanged);
When the FileSystemWatcher class is created, we tell it what directory to look in and what files to track. The EnableRaisingEvents property says whether we want it to send events when it detects changes, and the last 3 lines hook up the events to a function in our class. The function merely calls Reload() to reload the assemblies.
There is some inefficiency in this approach. When an assembly is updated, we have to unload the assembly to be able to load a new version, but that isn’t required when a file is added or deleted. In this case, the overhead of doing this for all changes isn’t very high, and it makes the code simpler.
After this code is built, we run the application, and then try copying a new assembly to the add-in directory. Just as we had hoped, we get a file changed event, and when the reload is done, the new functions are now available.
Unfortunately, when we try to update an existing assembly, we run into a problem. The runtime has locked the file, which means we can’t copy the new assembly into the add-in directory, and we get an error.
The designers of the AppDomain class knew this was a problem, so they provided a nice way to deal with it. When the ShadowCopyFiles property is set to “true” (the string “true”, not the boolean constant true. Don’t ask me why…), the runtime will copy the assembly to a cache directory, and then open that one. That leaves the original file unlocked, and gives us the ability to update an assembly that’s in use. ASP.NET uses this facility.
To enable this feature, I added the following line to the constructor for the Loader class:
setup.ShadowCopyFiles = "true";
I then rebuilt the application, and got the same error. I looked at the docs for the ShadowCopyDirectories property, which clearly state that all directories specified by PrivateBinPath, including the directory specified by ApplicationBase, are shadow copied if this property isn’t set. Remember how I said the docs weren’t very good in this area…
The docs for this property are just plain wrong. I haven’t verified what the exact behavior is, but I can tell you that the files in the ApplicationBase directory are not shadow copied by default. Explicitly specifying the directory fixes the problem:
setup.ShadowCopyDirectories = functionDirectory;
Figuring that out took me at least half an hour.
We can now update an existing file and have it correctly loaded in. Once I got this working, I ran into one more tiny problem. When we ran the reload function from the button on the form, the reload always happened on the same thread as the drawing, which means we were never trying to draw a line during the reload process.
Now that we’ve switched to file change events, it’s possible for the draw to happen after the app domain has been unloaded and before we’ve loaded the new one. If this happens, we’ll get an exception.
This is a traditional multi-threaded programming issue, and is easily handled using the C# lock statement. I added a lock in the drawing function and in the reload function, and this ensures that they can’t both happen at the same time. This fixed the problem, and adding an updated version of an assembly will cause the program to automatically switch to a new version of the function. That’s pretty cool.
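Here is a minimal sketch of that locking pattern. It isn't the actual Graph class: GraphSketch and its fields are hypothetical, and the string field just stands in for the objects that come from the add-in app domain.

```csharp
using System;

// Hypothetical shape of the Graph class; only the locking pattern matters.
class GraphSketch
{
    private readonly object m_reloadLock = new object();
    private string m_currentDomainName = "Functions#0";
    private int m_reloadCount;

    // Called from the UI thread when the graph is painted.
    public string Draw()
    {
        lock (m_reloadLock)
        {
            // While we hold the lock, a reload can't pull the domain out from under us.
            return "drawn with " + m_currentDomainName;
        }
    }

    // Called from the file watcher's thread when an add-in changes.
    public void Reload()
    {
        lock (m_reloadLock)
        {
            // Stand-in for: unload the domain, create a new one, reload, hook up.
            m_reloadCount++;
            m_currentDomainName = "Functions#" + m_reloadCount;
        }
    }
}
```

Both methods lock on the same private object, so a draw either sees the old domain in full or the new one in full, never the gap in between.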
There’s one other weird bit of behavior. It turns out that the Win32 functions that detect file changes are quite generous in the number of changes they send, so doing a single update of a file leads to five change events being sent, and the assemblies being reloaded five times. The fix is to make a smarter FileSystemWatcher that can group these together, but it’s not in this version.
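For what it's worth, one way such grouping could work is a simple "quiet period" timer: every change event restarts a countdown, and the reload only runs once the events stop arriving. This is a sketch of that idea and not the watcher I had in mind at the time; Debouncer, WatcherSetup and the quiet-period value are all made up for illustration.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper: collapses a burst of signals into one callback
// that fires after a quiet period with no further signals.
class Debouncer : IDisposable
{
    private readonly Timer m_timer;
    private readonly int m_quietMilliseconds;

    public Debouncer(int quietMilliseconds, Action callback)
    {
        m_quietMilliseconds = quietMilliseconds;
        // The timer starts disabled; each Signal() call (re)arms it.
        m_timer = new Timer(delegate { callback(); }, null, Timeout.Infinite, Timeout.Infinite);
    }

    public void Signal()
    {
        // Restart the countdown; only the last signal in a burst lets it expire.
        m_timer.Change(m_quietMilliseconds, Timeout.Infinite);
    }

    public void Dispose()
    {
        m_timer.Dispose();
    }
}

static class WatcherSetup
{
    // Routes all three file events through the debouncer instead of
    // calling the reload function directly.
    public static FileSystemWatcher Watch(string directory, Debouncer debouncer)
    {
        FileSystemWatcher watcher = new FileSystemWatcher(directory, "*.dll");
        FileSystemEventHandler handler = delegate { debouncer.Signal(); };
        watcher.Changed += handler;
        watcher.Created += handler;
        watcher.Deleted += handler;
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

The callback runs on a thread pool thread, so the lock described above is still needed. The watcher should also be disposed before the debouncer, so that a late event doesn't try to re-arm a disposed timer.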
Having to copy files to a directory wasn’t terribly convenient, so I decided to add drag and drop functionality to the app. The first step in doing this is setting the AllowDrop property of the form to true, which turns on the drag and drop support. Next, I hooked a routine to the DragEnter event. This is called when the cursor moves into an object during a drag and drop operation, and determines whether the current object is acceptable for drag and drop.
private void Form1_DragEnter(
    object sender, System.Windows.Forms.DragEventArgs e)
{
    object o = e.Data.GetData(DataFormats.FileDrop);
    if (o != null)
    {
        e.Effect = DragDropEffects.Copy;
    }
    string[] formats = e.Data.GetFormats();
}
In this handler, I check to see if there is FileDrop data available (i.e. a file is being dragged into the window). If this is true, I set the effect to Copy, which sets the cursor appropriately and causes the DragDrop event to be sent if the user releases the mouse button. The last line in the function is there purely for debugging, to see what information is available in the operation.
The next task is to write the handler for the DragDrop event:
private void Form1_DragDrop(
    object sender, System.Windows.Forms.DragEventArgs e)
{
    string[] filenames = (string[]) e.Data.GetData(DataFormats.FileDrop);
    graph.CopyFiles(filenames);
}
This routine gets the data associated with this operation – an array of filenames – and passes it off to a graph function, which copies the files to the add-in directory, which will then cause the file change events to reload them.
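The CopyFiles() function itself isn't shown, so here is a guess at what a minimal version could look like. AddInCopier and the addInDirectory parameter are assumptions for the sketch; in the app the directory would presumably come from the Graph class.

```csharp
using System.IO;

static class AddInCopier
{
    // Copies each dropped file into the add-in directory, keeping its
    // file name and overwriting any existing copy so updates work too.
    public static void CopyFiles(string[] filenames, string addInDirectory)
    {
        foreach (string filename in filenames)
        {
            string destination = Path.Combine(addInDirectory, Path.GetFileName(filename));
            File.Copy(filename, destination, true);
        }
    }
}
```

Overwriting only succeeds because shadow copying leaves the original file unlocked; without it, the copy of an assembly that's already loaded would fail.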
At this point, you can run the app, drag new assemblies onto it, and it will load them on the fly, and keep running. It’s pretty cool.
I’ve set up a Visual C# Community Newsletter, so that the C# product team has a better way to communicate with our users. I’m going to use it to announce when there’s new content on our community site at http://www.gotdotnet.com/team/csharp, and also to let you know if we’re going to be at a conference or user group meeting.
You can sign up for it at the site listed above.
This coming August, we're teaming up with developmentor to host C# Summer Camp. This is a chance to receive excellent C# training from developmentor instructors and to spend some time with the C# Product Team. There's more information at the developmentor site (http://www.developmentor.com/conferences/csharpsummer/csharpsummer.aspx).
If I do more work on SuperGraph, I’ll probably work on a version of FileSystemWatcher that doesn’t send lots of extraneous events, and possibly on the expression evaluation. I also have another small sample that I may talk about instead.
Eric Gunnerson is a Program Manager on the Visual C# team, a member of the C# design team, and the author of A Programmer's Introduction to C# (http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=1893115860&vm=c). He's been programming for long enough that he knows what 8-inch diskettes are and could once mount tapes with one hand.
I'm at least a month late in linking to this, but if you've been paying very little attention it might still be new that VS Pro will support unit testing in Orcas.
Which I think is great news.
I came up with a hacky way of doing "hover" and "press" buttons in WPF recently. There are a couple of nice examples on the web, but I was looking for a way to do it purely through XAML. If there's a nicer way to do it through XAML (or if there are big drawbacks to this approach) please let me know.
So, here's what I did.
Okay, so it's not very general in that it hardcodes the behavior for the specific button, but it was really cheap to do.
XAML looks something like this:
<ControlTemplate TargetType="{x:Type Button}">
    <Border x:Name="Border" Padding="{TemplateBinding Padding}" VerticalAlignment="Stretch" BorderThickness="2,2,2,2" CornerRadius="5,5,5,5">
        <Image HorizontalAlignment="Right" x:Name="image" Width="24" Height="26" Source="prev_rest.png"/>
    </Border>
    <ControlTemplate.Triggers>
        <Trigger Property="IsMouseOver" Value="true">
            <Setter Property="Source" TargetName="image" Value="prev_hover.png"/>
        </Trigger>
        <Trigger Property="IsPressed" Value="true">
            <Setter Property="Source" TargetName="image" Value="prev_down.png"/>
        </Trigger>
    </ControlTemplate.Triggers>
</ControlTemplate>
I've been doing a bit of prototyping work with the aforementioned products, and thought that since Blend has an RC1 release, it was a good time to share a few thoughts...
The short story is that both VS and Blend are pretty good tools for building in WPF, but together, they're very impressive.
Visual Studio has its usual strengths - intellisense is great, refactoring support is great, all that stuff. And, once you get over the hurdles of how it works (which I'm presuming are less for a designer than a developer), it's very simple to get the UI exactly the way that I want it.
Which is a bit of work. WPF has a layout system that does a whole lot for you, but it does up the object count a fair bit, especially since you compose controls to get what you want (Button doesn't support images and text - instead you do layout of other objects that are hosted within a button).
In the old world, I might have a window with 3 buttons in it. In the WPF world, I have a StackPanel that has 3 buttons in it, and then each button has as its content another StackPanel that has a rectangle (for the button part) and a text label.
Ordering and grouping those is nicely done in Blend.
It's also very easy to do things like control styles, so that the look of controls can be controlled through a style rather than by modifying the properties on the control.
The only place where I think Blend falls short is when dealing with animations. The way in which they're entered and represented is a bit hard to grok, though I think it's mostly just that it's hard to represent something that evolves over time. So I've been doing animations in code rather than in Blend.
I also don't like having the event hookup done in Blend. I'm invariably going to hook up to non-graphical events, and I like to be able to know what's going on in code rather than having to look two places.
I'm happy that I don't have to mention any problems with the two playing together. I have the same project open in VS and Blend - they both monitor the project and pick up changes made by the other one.
Recommended
Okay, not quite, but John W. Backus, the developer of FORTRAN, has died.
He brought us do loops, assignment statements, the much-debated goto, intrinsic data types, and started the software developer's obsession with code formatting (1). FORTRAN is the root of the computer language family tree, and he started it back in 1954.
He was also the inventor of the Backus-Naur notation.
1) For you younger pups out there, FORTRAN is a line-oriented language obsessed with column positioning. Columns 1-5 are used for numeric labels, column 6 is used to indicate continuation from the previous line, and columns 7-72 are used for statements. Columns higher than 72 are ignored.
Which led to many programmers - as they were called in those times - carrying around little rulers that you could put on top of a code printout - assuming you had a code printout and not a punched-card deck - to check that the columns were right.
Though I understand that they wimped out and relaxed these in later versions of the language.
I learned FORTRAN (my second, or perhaps third language) on a 300 baud modem connected to a DECWriter, back in the fall of 1979.
But you tell kids these days that, and they won't believe you.
If you are doing storyboard-based animation, the animation is hooked up to a specific element through a unique id. You write code something like this:
element.Name = "Fred";
window.RegisterName(element.Name, element);
The second call establishes the relationship between the name and the object.
If you're writing code where there are multiple elements, you need to create unique names. So, something like:
element.Name = "Fred_" + m_id.ToString();
m_id++;
works great. However, if you use:
element.Name = "Fred:" + m_id.ToString();
m_id++;
You'll get a not-very-helpful exception...
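As far as I can tell, the reason is that WPF element names have to look like identifiers: a letter or underscore first, then only letters, digits and underscores, which rules out the colon. If the names are being generated from arbitrary text, a small helper can keep them legal. This is a sketch under that assumption; ElementNames and MakeValid are names I've made up, not anything in WPF.

```csharp
using System.Text;

static class ElementNames
{
    // Builds a name from a base string and an id, replacing anything that
    // isn't a letter, digit or underscore, and making sure the result
    // doesn't start with a digit.
    //   MakeValid("Fred", 3)     gives "Fred_3"
    //   MakeValid("2nd line", 0) gives "_2nd_line_0"
    public static string MakeValid(string baseName, int id)
    {
        StringBuilder name = new StringBuilder();
        foreach (char c in baseName)
        {
            name.Append(char.IsLetterOrDigit(c) || c == '_' ? c : '_');
        }
        if (name.Length == 0 || char.IsDigit(name[0]))
        {
            name.Insert(0, '_');
        }
        name.Append('_').Append(id);
        return name.ToString();
    }
}
```

With that in place, the assignment becomes element.Name = ElementNames.MakeValid("Fred", m_id), and the separator can't accidentally be something the name rules reject.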
Software is always built on other software - your dependencies - and the dependencies that you choose have a considerable influence on your success. Choose the existing technology that you know, and you have good predictability, but you might not produce a great product, or it might take too long to finish. Choose a hot new technology, and it's harder to predict what will happen. Maybe the benefits will be great and you'll finish faster (ASP.NET vs old ASP...). Maybe things won't be as good as promised (insert the name of a technology that you were "disappointed with" in the past).
Or maybe it's not finished when you need it. Welcome to the wonderful world of co-development, where you are depending on features that aren't implemented yet. How do you reduce the risk of features/APIs not showing up, or being substantially different than you expected?
Well, the first (and best) way to reduce this risk is simply not to do it. If you only depend on features and APIs that are currently available, you know they are there.
If you can't wait a full release cycle, then perhaps you can take some sort of incremental approach, where you plan to use feature <X> but don't *commit* to using it until it's actually there. My preference would be an agile approach (such as Scrum), so that when feature <X> shows up, it's actually finished and working.
That's really just the same thing I said first - don't take on the dependency until something is done.
But what do you do if you really need that feature - if your plans would be derailed unless the other team finishes the feature? I have four things in mind that can help:
Accept the Risk
First, you have to accept that you are taking on risk. Software scheduling beyond a period of a month or two is not only an unsolved problem, I believe it isn't a tractable problem. Decades of project slippage have demonstrated that, and we should just embrace the uncertainty involved rather than trying to "do better".
Note that while there are teams out there that can give good estimates for tasks in the next month (and perhaps up to two months), you can't assume that you are dealing with such a team. There are many teams who are essentially unpredictable even in short timeframes.
Understand the Risk
Second, you need to understand the risk. This will require you to work with the team that's building whatever you are needing. You need to understand where your feature ranks in the things that they are doing. It might be a feature that they absolutely have to have to ship, or it might be a "nice to have" feature. You need to understand this. It's closely related to how close your requested feature is to their main charter. You do not want to be the outlier group amongst all their clients, the customer they don't want to have.
You also need to understand when they're building the feature. If it's very early in the cycle, then it's likely to get done. If it's late in the cycle, it's less likely to get done.
If they don't think of features in this way and/or are working on features in parallel, it's more risky.
It would also help to understand what development methodology they use, and their history of being done when they guess they will be done.
Plan for Mitigation
What are you going to do if things don't work out, if the feature is late or is cut? Even in the best organizations, people get married, are out for months on medical leave, have accidents, or leave to form their own companies.
What is your group going to do when this happens?
Track the Risk
In an ideal world, the group you depend on would give you regular updates about the feature you're waiting for. And some groups do do this, but it's your risk, and you're going to need to stay on top of it. The details of that depend on the groups involved.
Accept the Outcome
If things work, great. But if the feature doesn't show up, remember that you were the one who accepted the risk in the first place.