Posts
  • Eric Gunnerson's Compendium

    App Domains and dynamic loading (the lost columns)

    • 3 Comments

    As promised, I'm going to start republishing some of my columns that were eaten by MSDN.

     

    I spent some time reading this one and deciding whether I would re-write it so that it was, like, correct. But it became clear that I didn't have a lot of enthusiasm towards that, so I've decided to post it as is (literally, as-is, with some ugly formatting because of how I used to do them in MSDN).

     

    I also am not posting the source, though I might be tempted to put it someplace if there is a big desire for it.

     

    So, on to the caveats...

     

    The big caveat is that my understanding of how the probing rules work was incorrect. To get things to work the way I have them architected, you need to put them somewhere in the directory tree underneath where the exe lives, and if they aren't in the same directory, you need to add the directory where they live to the private bin path. I may also have the shadow directory stuff messed up.

     

    So, without further ado, here's something that I wrote five years ago and has not been supplanted by more timely and more correct docs. AFAIK. If you know better references, *please* add them to the comments, and also comment on anything else that's wrong.

     

    App Domains and dynamic loading

     

    Eric Gunnerson
    Microsoft Corporation

    May 17, 2002


    This month, I’m sitting in a departure lounge at Palm Springs airport, waiting to fly back to Seattle after an ASP.NET conference.

    My original plan for this month – to the extent that I have a plan – was to do some work on the expression parsing part of the SuperGraph application. In the past few weeks, however, I’ve received several emails asking when I was going to get the loading and unloading of assemblies in app domains part done, so I’ve decided to focus on that instead.

    Application Architecture

    Before I get into code, I’d like to talk a bit about what I’m trying to do. As you probably remember, SuperGraph lets you choose from a list of functions. I’d like to be able to put “add-in” assemblies in a specific directory, have SuperGraph detect them, load them, and find any functions contained in them.

    Doing that by itself doesn’t require a separate AppDomain; Assembly.Load() usually works fine. The problem is when you want to provide a way for the user to update those assemblies when the program is running, which is really desirable if you’re writing something that runs on a server, and you don’t want to stop and start the server.

    To do this, we’ll load all add-in assemblies in a separate AppDomain. When a file is added or modified, we’ll unload that AppDomain, create a new one, and load the current files into it. Then things will be great.

    To make this a little clearer, I’ve created a diagram of a typical scenario:

     

     

     

    In this diagram, the Loader class creates a new AppDomain named Functions. Once the AppDomain is created, Loader creates an instance of RemoteLoader inside that new AppDomain.

    To load an assembly, a load function is called on the RemoteLoader. It opens up the new assembly, finds all the functions in it, packages them up into a FunctionList object, and then returns that object to the Loader. The Function objects in this FunctionList can then be used from the Graph function.

    Creating an AppDomain

    The first task is to create an AppDomain. To create it in the proper manner, we’ll need to pass it an AppDomainSetup object. The docs on this are useful enough once you understand how everything works, but aren’t much help if you’re trying to understand how things work. When a Google search on the subject turned up last month’s column as one of the higher matches, I suspected I might be in for a bit of trouble.

    The basic problem has to do with how assemblies are loaded in the runtime. By default, the runtime will look either in the global assembly cache or in the current application directory tree. We’d like to load our add-in assemblies from a totally different directory.

    When you look at the docs for AppDomainSetup, you’ll find that you can set the ApplicationBase property to the directory to search for assemblies. Unfortunately, we also need to reference the original program directory, because that’s where the RemoteLoader class lives.

    The AppDomain writers understood this, so they’ve provided an additional location in which they’ll search for assemblies. We’ll use ApplicationBase to refer to our add-in directory, and then set PrivateBinPath to point to the main application directory.

    Here’s the code from the Loader class that does this:

    AppDomainSetup setup = new AppDomainSetup();
    setup.ApplicationBase = functionDirectory;
    setup.PrivateBinPath = AppDomain.CurrentDomain.BaseDirectory;
    setup.ApplicationName = "Graph";
    appDomain = AppDomain.CreateDomain("Functions", null, setup);

    remoteLoader = (RemoteLoader)
        appDomain.CreateInstanceFromAndUnwrap("SuperGraph.exe",
            "SuperGraphInterface.RemoteLoader");

    After the AppDomain is created, the CreateInstanceFromAndUnwrap() function is used to create an instance of the RemoteLoader class in the new app domain. Note that it requires the filename of the assembly the class lives in, along with the full name of the class.

    When this call is executed, we get back an instance that looks just like a RemoteLoader. In fact, it’s actually a small proxy class that will forward any calls to the RemoteLoader instance in the other AppDomain. This is the same infrastructure that .NET remoting uses.
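
    One detail that's easy to miss: for that proxying to work, RemoteLoader has to derive from MarshalByRefObject - otherwise the object would either be copied across the boundary or not be able to cross it at all. A minimal sketch of the declaration (the real class obviously has more members):

    public class RemoteLoader : MarshalByRefObject
    {
        // LoadAssembly() and friends live here; the LoadAssembly() body is
        // shown later in the article
    }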

    Assembly Binding Log Viewer

    When you write code to do this, you’re going to make mistakes. The documentation provides little advice on how to debug your app, but if you know who to ask, they’ll tell you about the Assembly Binding Log Viewer (named fuslogvw.exe, because the loading subsystem is known as “fusion”). When you run the viewer, you can tell it to log failures, and then when you run your app and it has problems loading an assembly, you can refresh the viewer and get details on what’s going on.

    This is hugely useful for finding out, for example, that Assembly.Load() doesn’t want ".dll" on the end of the assembly name. You can tell this in the log because it will tell you it tried to load "f.dll.dll".

    Dynamically Loading Assemblies

    So, now that we’ve gotten the application domain created, it’s time to figure out how to load an assembly, and extract the functions from it. This requires code in two separate areas. The first finds the files in a directory, and loads each of them:

     

    void LoadUserAssemblies()
    {
        availableFunctions = new FunctionList();
        LoadBuiltInFunctions();

        DirectoryInfo d = new DirectoryInfo(functionAssemblyDirectory);
        foreach (FileInfo file in d.GetFiles("*.dll"))
        {
            string filename = file.Name.Replace(file.Extension, "");
            FunctionList functionList = loader.LoadAssembly(filename);

            availableFunctions.Merge(functionList);
        }
    }

    This function in the Graph class finds all dll files in the add-in directory, removes the extension from them, and then tells the loader to load them. The returned list of functions is merged into the current list of functions.

    The second bit of code is in the RemoteLoader class, to actually load the assembly and find the functions:

    public FunctionList LoadAssembly(string filename)
    {
        FunctionList functionList = new FunctionList();
        Assembly assembly = AppDomain.CurrentDomain.Load(filename);

        foreach (Type t in assembly.GetTypes())
        {
            functionList.AddAllFromType(t);
        }
        return functionList;
    }

    This code simply calls Assembly.Load() on the filename (assembly name, really) passed in, and then loads all the useful functions into a FunctionList instance to return to the caller.

    At this point, the application can start up, load in the add-in assemblies, and the user can refer to them.

    Reloading Assemblies

    The next task is to be able to reload these assemblies on demand. Eventually, we’ll want to be able to do this automatically, but for testing purposes, I added a Reload button to the form that will cause the assemblies to be reloaded. The handler for this button simply calls Graph.Reload(), which needs to perform the following actions:

    1. Unload the app domain
    2. Create a new app domain
    3. Reload the assemblies in the new app domain
    4. Hook up the graph lines to the newly created app domain

    Step 4 is needed because the GraphLine objects contain Function objects that came from the old app domain. After that app domain is unloaded, the function objects can’t be used any longer.

    To fix this, HookupFunctions() modifies the GraphLine objects so that they point to the correct functions from the current app domain.
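
    HookupFunctions() itself isn't shown in this article; conceptually it's just a loop that re-resolves each line's function by name against the freshly loaded list. A sketch (the member names here are my guesses, not the real SuperGraph source):

    void HookupFunctions()
    {
        foreach (GraphLine line in graphLines)
        {
            // look the function up by name in the new app domain's list so that
            // no reference to the unloaded domain survives
            line.Function = availableFunctions.Find(line.FunctionName);
        }
    }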

    Here’s the code:

    loader.Unload();
    loader = new Loader(functionAssemblyDirectory);
    LoadUserAssemblies();
    HookupFunctions();
    reloadCount++;

    if (this.ReloadCountChanged != null)
        ReloadCountChanged(this, new ReloadEventArgs(reloadCount));

    The last two lines fire an event whenever a reload operation is performed. This is used to update a reload counter on the form.
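
    Loader.Unload() isn't shown either; presumably it's just a thin wrapper over AppDomain.Unload(), something like:

    public void Unload()
    {
        // tears down the "Functions" app domain created in the constructor
        AppDomain.Unload(appDomain);
        appDomain = null;
    }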

    Detecting new assemblies

    The next step is to be able to detect new or modified assemblies that show up in the add-in directory. The framework provides the FileSystemWatcher class to do this. Here’s the code I added to the Graph class constructor:

    watcher = new FileSystemWatcher(functionAssemblyDirectory, "*.dll");
    watcher.EnableRaisingEvents = true;
    watcher.Changed += new FileSystemEventHandler(FunctionFileChanged);
    watcher.Created += new FileSystemEventHandler(FunctionFileChanged);
    watcher.Deleted += new FileSystemEventHandler(FunctionFileChanged);

    When the FileSystemWatcher class is created, we tell it what directory to look in and what files to track. The EnableRaisingEvents property says whether we want it to send events when it detects changes, and the last 3 lines hook up the events to a function in our class. The function merely calls Reload() to reload the assemblies.
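
    For completeness, here's roughly what that handler looks like (the signature is dictated by FileSystemEventHandler; the body is a sketch):

    void FunctionFileChanged(object sender, FileSystemEventArgs e)
    {
        Reload();
    }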

    There is some inefficiency in this approach. When an assembly is updated, we have to unload the assembly to be able to load a new version, but that isn’t required when a file is added or deleted. In this case, the overhead of doing this for all changes isn’t very high, and it makes the code simpler.

    After this code is built, we run the application, and then try copying a new assembly to the add-in directory. Just as we had hoped, we get a file changed event, and when the reload is done, the new functions are now available.

    Unfortunately, when we try to update an existing assembly, we run into a problem. The runtime has locked the file, which means we can’t copy the new assembly into the add-in directory, and we get an error.

    The designers of the AppDomain class knew this was a problem, so they provided a nice way to deal with it. When the ShadowCopyFiles property is set to “true” (the string “true”, not the boolean constant true. Don’t ask me why…), the runtime will copy the assembly to a cache directory, and then open that one. That leaves the original file unlocked, and gives us the ability to update an assembly that’s in use. ASP.NET uses this facility.

    To enable this feature, I added the following line to the constructor for the Loader class:

    setup.ShadowCopyFiles = "true";

    I then rebuilt the application, and got the same error. I looked at the docs for the ShadowCopyDirectories property, which clearly state that all directories specified by PrivateBinPath, including the directory specified by ApplicationBase, are shadow copied if this property isn’t set. Remember how I said the docs weren’t very good in this area…

    The docs for this property are just plain wrong. I haven’t verified what the exact behavior is, but I can tell you that the files in the ApplicationBase directory are not shadow copied by default. Explicitly specifying the directory fixes the problem:

    setup.ShadowCopyDirectories = functionDirectory;

    Figuring that out took me at least half an hour.

    We can now update an existing file and have it correctly loaded in. Once I got this working, I ran into one more tiny problem. When we ran the reload function from the button on the form, the reload always happened on the same thread as the drawing, which means we were never trying to draw a line during the reload process.

    Now that we’ve switched to file change events, it’s now possible for the draw to happen after the app domain has been unloaded and before we’ve loaded the new one. If this happens, we’ll get an exception.

    This is a traditional multi-threaded programming issue, and is easily handled using the C# lock statement. I added a lock in the drawing function and in the reload function, which ensures that they can’t both run at the same time. This fixed the problem, and adding an updated version of an assembly will cause the program to automatically switch to the new version of the function. That’s pretty cool.
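
    Here's a sketch of the locking approach (the field and method names are mine, not the real SuperGraph source) - both paths take the same lock object, so a draw can never overlap an app domain swap:

    object reloadLock = new object();

    void DrawLines()
    {
        lock (reloadLock)
        {
            // evaluate the Function objects and draw the lines
        }
    }

    public void Reload()
    {
        lock (reloadLock)
        {
            loader.Unload();
            loader = new Loader(functionAssemblyDirectory);
            LoadUserAssemblies();
            HookupFunctions();
        }
    }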

    There’s one other weird bit of behavior. It turns out that the Win32 functions that detect file changes are quite generous in the number of changes they send, so doing a single update of a file leads to five change events being sent, and the assemblies being reloaded five times. The fix is to make a smarter FileSystemWatcher that can group these together, but it’s not in this version.

    Drag and Drop

    Having to copy files to a directory wasn’t terribly convenient, so I decided to add drag and drop functionality to the app. The first step in doing this is setting the AllowDrop property of the form to true, which turns on the drag and drop support. Next, I hooked a routine to the DragEnter event. This is called when the cursor moves into an object during a drag and drop operation, and determines whether the object being dragged is acceptable for a drop.

    private void Form1_DragEnter(
        object sender, System.Windows.Forms.DragEventArgs e)
    {
        object o = e.Data.GetData(DataFormats.FileDrop);
        if (o != null)
        {
            e.Effect = DragDropEffects.Copy;
        }
        string[] formats = e.Data.GetFormats();
    }

     In this handler, I check to see if there is FileDrop data available (ie a file is being dragged into the window). If this is true, I set the effect to Copy, which sets the cursor appropriately and causes the DragDrop event to be sent if the user releases the mouse button. The last line in the function is there purely for debugging, to see what information is available in the operation.

    The next task is to write the handler for the DragDrop event:

    private void Form1_DragDrop(
        object sender, System.Windows.Forms.DragEventArgs e)
    {
        string[] filenames = (string[]) e.Data.GetData(DataFormats.FileDrop);
        graph.CopyFiles(filenames);
    }

    This routine gets the data associated with this operation – an array of filenames – and passes it off to a graph function, which copies the files to the add-in directory, which will then cause the file change events to reload them.

    Status

    At this point, you can run the app, drag new assemblies onto it, and it will load them on the fly, and keep running. It’s pretty cool.

    Other Stuff

    C# Community Site

    I’ve set up a Visual C# Community Newsletter, so that the C# product team has a better way to communicate with our users. I’m going to use it to announce when there’s new content on our community site at http://www.gotdotnet.com/team/csharp!href(http://www.gotdotnet.com/team/csharp), and also to let you know if we’re going to be at a conference or user group meeting.

    You can sign up for it at the site listed above.

    C# Summer Camp

    This coming August, we're teaming up with developmentor to host C# Summer Camp. This is a chance to receive excellent C# training from developmentor instructors and to spend some time with the C# Product Team. There's more information at the developmentor site (http://www.developmentor.com/conferences/csharpsummer/csharpsummer.aspx).

    Next Month

    If I do more work on SuperGraph, I’ll probably work on a version of FileSystemWatcher that doesn’t send lots of extraneous events, and possibly on the expression evaluation. I also have another small sample that I may talk about instead.


    Eric Gunnerson is a Program Manager on the Visual C# team, a member of the C# design team, and the author of A Programmer's Introduction to C# (http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=1893115860&vm=c). He's been programming for long enough that he knows what 8-inch diskettes are and could once mount tapes with one hand.

     

  • Eric Gunnerson's Compendium

    The siren song of reuse...

    • 4 Comments

    We've been doing some planning 'round these parts - planning that I unfortunately can't talk about - but it's led to a fair amount of discussion about architecture, both inside the team and outside the team.

    Which has got me thinking about reuse.

    Reuse has been one of the Holy Grails of software development for a long time, along with... Well, work with me, I'm sure there are others. True AI! That's another.

    Anyway, reuse has been discussed since time immemorial (October 13th, 1953), for some pretty sound reasons:

    • Software is hard and expensive to develop
    • We already break things apart into "components" to simplify development
    • We've all written subroutines that are used in multiple places.

    It seems that if we did a little more planning, paid a little more attention, were just a little smarter, we could build our components in a more general way, and others could benefit from them.

    And yet, people have been trying to do this for a long time, and have mostly failed at it. There are successes - widely-used successes - but they're fairly small in number. Surprisingly, people are still optimistic about going down the reuse path, and since they are likely to fail anyway, I therefore present some rules that can help them get there faster.

    Eric's N rules for failing at reuse

    Authoring reusable components:

    • It really helps to have delusions of grandeur and importance. You are going to be the ones who succeed at doing what others have failed at.
    • Pick a wide/diverse scope. You're not building a good UI framework for client applications, your framework is going to work for both client and web applications.
    • Plenty of technical challenge. That's what makes it fun and keeps you interested.
    • No immediate clients. You are building a *Component*, and when you are done, clients can use it.

    In my experience, that's more than enough by itself, but it helps if you can throw in some obscure algorithms and quirky coding styles. I'm already assuming that you don't have any real tests.

    Consuming other people's components:

    • You absolutely *must* sign up to work with something as it is being developed. There is simply no better way to waste vast amounts of time. Unfinished implementations block you, regressions set you back, and even if you don't get those, you at least get hours of refactoring to switch to the new version. It's best if you do this on a "milestone" basis, roughly every 6 weeks or so. That's short enough that you're randomized often, but long enough that waiting for bug fixes is really painful.
    • Commit to using the other component early and deeply. If somebody asks, "what will you do if <x> is too buggy, too late, or doesn't do what you need?", you can't have an answer.
    • Make sure that your use of the component is a variant of what the component is really being used for. In other words, the creator of the component is building the thing to do <x>, and you want to do <y>, which is kind of like <x>.
    • A quick prototype should be enough to ensure that you can do it.
    • If you can't get the authoring group to commit to producing what you want, take a snapshot of their code and work on it yourself.
    • Don't plan any schedule time or impact to deal with issues that come up.
    • If possible, buy the component you're using. Because if you paid for it, the quality will be higher, and they'll have to give you good support.

    I hope these tips help you.

    If you're a bit leery of reuse, then good for you. I have only a few thoughts to offer:

    If you're thinking about doing something, it's always a build vs buy decision. Even the best general-purpose framework out there is just that - a general-purpose framework. It's not designed to do exactly what you want to do.

    In the abstract, there are three phases of using a component in your code:

    Phase 1 is great. The component is doing what you want, and it's quick and easy to do it. Let's say for sake of argument that this gets you to the 80% point in your project, and it gets you there quick.

    Phase 2 is harder. You're starting to reach the limits of the component, and it's tough to get it to do what you want. Tough enough that it's taking more time, and you're using up the time that you saved in phase 1. But you still feel like it was the right decision.

    Phase 3 is much harder. It's taken you as long to get here as a custom-written solution would have taken, and making further progress is considerably slower than if you had written everything. Worse, you can see the point where you'll reach a wall where you can't do anything more, and it's close.

    Different projects obviously reach different phases. Some never venture out of phase 1, and others are deep in phase 3. It's hard to tell where you'll end up, but if a given component is central to what you do, you are much more likely to end up in phase 3.  

    The obvious problem is that prototyping is always done in phase 1, and the rapid progress you make there is oh-so-tempting. The whole application UI is laid out in a week of work using Avalon. I got this demo with moving pictures done in 3 days using XNA. We all want to believe that it's really going to be that easy.

    Stay strong against the lure of the siren song.

  • Eric Gunnerson's Compendium

    Seven deadly sins of programming - Sin #1

    • 21 Comments

    So, the time has come for the worst sin.

    Just to recap - and so there is one post that lists them all - here are the ones that I've covered so far:

    Some people have remarked that all of these are judgement calls, and really more a matter of aesthetics than actual sins.

    That is true. I didn't include things like "naming your variables i, j, & k" as sins, because I don't think that's a real problem in most of the code I'm likely to have to deal with, and there really isn't much argument over whether it's a good idea or not.

    It perhaps would have been better to title this series, "Seven things Eric would really prefer that you don't do in code that he has to work with", but that is both ungainly and lacking the vitality of a post with the term "sin" in it.

    It's all marketing, you see - or you would if you were actually reading this post, but given my track record on the last six, it's probably a good idea to cut your losses now and spend your time more productively, like in switching your entire codebase from tabs to spaces (or spaces to tabs...)

    When I was a kid, I was fairly interested in WWII. I read a lot of books about it, from general histories about the war, to books on the warfare in the Pacific, to books about the ground war in Europe.

    One of the interesting features of the military during that time - one that I didn't appreciate until much later - was how they balanced the desire for advancement in their officer corps vs the need to only advance the most talented and capable. There were really two schools of thought at the time.

    The first school advocated an approach where a lower-ranked officer - say, a colonel - would be promoted to fill a vacancy directly, on the theory that it made the chain of command cleaner, and you'd quickly find out if he had "the right stuff".

    The second group advocated using "field promotions", in which a colonel would be temporarily promoted to see if he could perform in the job. The theory here was that the service would end up with only the best colonels promoted, and that it was much easier (and better for both the officer and the service) to let a field promotion expire rather than demote an officer already given a full promotion.

    Over time, the approach advocated by the second group was borne out as having far better results, and the danger of the first approach was recognized.

    Which brings us on our roundabout journey to our final sin:

    Sin #1 - Premature Generalization

    Last week I was debugging some code in a layout manager that we use. It originally came from another group, and is the kind of module that nobody wants to a) own or b) modify.

    As I was looking through it, I was musing on why that was the case. Not to minimize the difficulty in creating a good layout manager (something I did a bit of in a previous life), but what this module does really isn't that complex, and it has some behavior that we would really like to change.

    The problem is that there are at least three distinct layers in the layout manager. I write a line of code that says:

    toolbarTable.SetColMargin(0, 10);

    and when I step into it, I don't step into the appropriate TableFrame. I step into a wrapper class, which forwards the call on to another class, which forwards it on to another class, which finally does something.

    Unfortunately, the relation between the something that gets done and the TableFrame class isn't readily apparent, because of the multiple layers of indirection.

    Layers of indirection that, as far as I can tell (and remember that nobody wants to become the owner of this code by showing any interest in it or, god forbid, actually making a modification to it...), aren't used by the way we use the layout manager. They're just mucking things up...

    Why is this the #1 sin?

    Well, as I've been going through the sins, I've been musing on how I ranked them. One of the primary factors that I used is the permanence of the sin.

    And this one is pretty permanent. Once something is generalized, it's pretty rare that it ever gets de-generalized, and in this case, I think it would be very difficult to do so.

    <Agile aside>

    This might be slightly different if there were full method-level tests for the component - one could consider pulling out that layer. But even with that, it would be hard to approach in a stepwise fashion - it could easily turn into one of those 3-hour refactorings that makes you grateful that your source code control system has a "revert" feature.

    </Agile aside>

    Or, to put it another, fairly obvious way:

    Abstraction isn't free

    In one sense this seems obvious - when you develop a component that is used by multiple clients, you have to spend a little extra effort on design and implementation, but then you sit back and reap the benefits.

    Or do you?

    It turns out that you only reap the benefits if your clients are okay with the generalized solution.

    And there's a real tendency to say, "well, we already have the ListManager component, we can just extend it to deal with this situation".

    I've known teams where this snowballed - they ended up with a "swiss army knife" component that was used in a lot of different scenarios. And like many components that do a lot, it was big, complex, and had a lot of hard-to-understand behavior. But developing it was an interesting technical challenge for the developers involved (read that as "fun and good for their careers"...)

    The problem came when the team found that one operation took about 4 times as long as it should. But because of the generalized nature of the component doing the operation, there was no easy way to optimize it.

    If the operation had been developed from scratch without using the "uber-component", there would have been several easy optimization approaches to take. But none of those would work on the generalized component, because you couldn't just implement an optimization in one scenario - it would have to work for all scenarios. You couldn't afford the dev cost to make it work everywhere, and in this case, even if you could, it would cause performance to regress in other scenarios.

    (At this point, I'm going to have to have anybody thinking "make it an option" escorted out of this post by one of our friendly ushers. How do you think it got so complex in the first place?)

    At that point, you often have to think about abandoning the code and redeveloping in the next version. And in the next cycle, this group *did* learn from their mistakes - instead of the uber-component, they built a small and simple library that different scenarios could use effectively. And it turned out that, overall, they wrote less code than before.

    HaHA. I make joke.

    What they really did was build *another* uber-component that looked really promising early (they always do), but ultimately was more complex than the last version and traded a new set of problems for the old ones. But, building it was a technical challenge for the developers involved, and that's what's really important...

    How do you avoid this sin?

    Well, YAGNI is one obvious treatment, but I think a real treatment involves taking a larger view of the lifecycle costs of abstraction and componentization.

    Is that one general component really going to be better than two custom solutions?

    (if you didn't understand the story, look here and see what rank is above a colonel...)

  • Eric Gunnerson's Compendium

    Properties vs public fields redux...

    • 27 Comments

    My blog reader burped recently, and gave forth a post (and reply) that Rico wrote last September, but since I didn't comment on it then, I'll comment on it now. And, yes, I've written about this in the past...

    Basically, this revolves around the question of using properties, or using public fields. And I have a slightly different point to make than Rico is making, his point being that sometimes you need to break these guidelines for performance reasons (which I agree with, but which is not my point).

    But first, a bit of history...

    Back in the early days of .NET, we (and by "we", I mean "other people") were coming up with the .NET framework library coding guidelines and the CLS as a way of making sure that developers could use assemblies developed in multiple languages without severe cranial performance issues. Meetings were held, discussions ranged far and wide, and the CLS and coding guidelines came into being.

    And they were pretty good, though they only covered the public appearance of classes. You could use hungarian in the internals of a class, if you wanted to.

    We also discussed whether we would come out with a set of guidelines that talked about how things should be done inside your class. For C#, we decided not to do this, and if any of you have ever spent time talking about where braces belong or whether source code should use spaces or tabs, you understand why we decided not to do this.

    But I'm afraid that it unfortunately gave the impression that the library coding guidelines should be the drivers for all code that you write, and I think that's the wrong way to look at things. Others may differ inside MS, but they can write their own blog posts...

    On the specific subject of properties, the question can be boiled down to:

    If I have a field with no special behavior, should I write a "just in case" property (with trivial get/set), or should I expose a public field?
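
    To make the question concrete, here's the choice in code (a throwaway example of mine, not from any real library):

    class Customer
    {
        // option 1: a "just in case" property with a trivial get/set
        private string name;
        public string Name
        {
            get { return name; }
            set { name = value; }
        }

        // option 2: just expose the field
        public string Address;
    }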

    The reason that the library design guidelines suggest you write a property here is that it is important that libraries be easily versioned. If you put a property in there ahead of time, you can change the property implementation without requiring users to recompile their code.

    That, in my book, is generally a "good thing".

    Or, to put it another way, I rank the ability to version easily higher than the cost of the extra clutter in the class and increased size of the assembly that comes from the property.

    But, if the clients of a class are always compiled together with a class - or at least shipped together - then there is no versioning issue to consider. In that case, I think it's a bad idea to write the "trivial property" - all you've done is complicate your code without any benefit. If the public field you write needs to be a property, then just make the change and recompile.

    I've switched over to writing this code, and I have to say that when I have to work with old code with trivial properties, it bothers me.

    (Oh, and I also made the same choice as Rico when I had to do some graphical stuff...)

    Comments?

  • Eric Gunnerson's Compendium

    More on properties vs fields

    • 6 Comments

    Some nice comments on what I wrote.

    First, a non-controversial question. Robin asked whether you would capitalize them in that case, and I think you should, as having something accessed externally that isn't properly cased will be surprising to people.

    Dani - and others - have pointed out that properties leave your options open in case the software is used in ways that make the property more useful. This is certainly true, and I think it comes down to how you value that flexibility over the tax that you're paying to have properties. I would usually rate the simplicity higher, but it does depend on what you're writing, how likely the code is to get repurposed, and how likely it is that a property would ever be needed. I find as I get older I'm valuing simplicity more.

    Finally, Kristof points out that a field can be passed as an out or ref parameter while a property cannot, which means that adding a property to code can break the client code. This is a good point. I do think, however, that I'd want my code to break in that situation - if something is a property I should do some thinking about whether it's a good idea to be passing it as an out or ref (the same reason C# doesn't write helper code to make this work by default).
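
    Kristof's point is easy to see in a small example (the types here are just for illustration) - this compiles when Total is a public field, and stops compiling the moment Total becomes a property:

    class Counter
    {
        public int Total;            // change this to a property and the call below breaks
    }

    class Program
    {
        static void Bump(ref int value)
        {
            value++;
        }

        static void Main()
        {
            Counter c = new Counter();
            Bump(ref c.Total);       // fine for a field; a compile error for a property
        }
    }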

    Kristof's comment reminded me one more reason to use properties - some of the BCL infrastructure works with properties but not with fields.

     

  • Eric Gunnerson's Compendium

    Other views on programming sins...

    • 7 Comments

    At the beginning of the sin-tacular, I asked for people to come up with their own lists. And here they are:

    My original plan was to comment on some of the individual sins that people listed, but they're all great - you should go and read them all.

    I was a bit intrigued, however, by Chris' comment (or should that be "Chris' Comments' comment"?):

    Hey, Eric, what are the 7 Heavenly Virtues of Programmers?

    Hmm...

  • Eric Gunnerson's Compendium

    Best practices talk

    • 15 Comments

    I gave my Teched talk at 5PM today, entitled “C# Best Practices - What's wrong with this code?”. Rather than take a more lecture-based approach, I tried something more interactive. It was an interesting, if not fully successful talk. If you attended, please leave me some comments - what did you like, what didn't you like, etc. If you didn't attend, you should see slides on the dev center *eventually*.

    Good:

    • The warmup went well, with laughs where I expected to get laughs, and many of them better than I expected.
    • I got to jump off the stage and go out into the audience, something I also like to do.
    • I got lots of questions afterwards, which is always a good sign.

    Bad:

    • I didn't get as many answers from people as I expected. That might be because the material is too hard, or just not accessible in the time I spent on it.
    • I missed points on a few of my explanations.
    • I had one glaring error in an explanation that was brought to my attention later, which I should have caught. The guidance is still correct, but I hate those kinds of errors.
    • The snippets part was fairly dry, and could have been more interesting.

    Overall, I think it went fairly well, but I haven't talked to anyone in my group, so I don't know what they think.

    Oh, and the worst part was that I had to share the stage with two rack mounted servers that made it really hard for me to hear anything.

    Afterwards, one of the room supervisors told me that the talk was SRO, and that it seemed to be a “younger crowd” than many of the other talks. I'm not sure what to make of that.

    Overall, I'm not sure how happy I am yet.

  • Eric Gunnerson's Compendium

    Nullable types in C#

    • 107 Comments


    One of the "late breaking" features in C# 2.0 is what is known as "Nullable Types". The details can be found in the C# 2.0 language spec.

    Nullable types address the scenario where you want to be able to have a primitive type with a null (or unknown) value. This is common in database scenarios, but is also useful in other situations.

    In the past, there were several ways of doing this:

    • A boxed value type. This is not strongly-typed at compile-time, and involves doing a heap allocation for every value.
    • A class wrapper for the value type. This is strongly-typed, but still involves a heap allocation, and you have to write the wrapper.
    • A struct wrapper that supports the concept of nullability. This is a good solution, but you have to write it yourself.

    To make this easier, in VS 2005, we're introducing a new type named "Nullable", that looks something like this (it's actually more complex than this, but I want to keep the example simple):

    struct Nullable<T>
    {
        public bool HasValue;
        public T Value;
    }

    You can use this struct directly, but we've also added some shortcut syntax to make the resulting code much cleaner. The first is the introduction of a new syntax for declaring a nullable type. Rather than typing:

    Nullable<int> x = new Nullable<int>(125);

    I can write:

    int? x = 125;

    which is much simpler. Similarly, rather than needing to write a null test as:

    if (x.HasValue) {...}

    you can use a familiar comparison to null:

    if (x != null) {...}

    Finally, we have support to make writing expressions easier. If I wanted to add two nullable ints together and preserve null values, if I didn't have language support, I would need to write:

    Nullable<int> x = new Nullable<int>(125);
    Nullable<int> y = new Nullable<int>(33);
    Nullable<int> z = (x.HasValue && y.HasValue) ?
        new Nullable<int>(x.Value + y.Value) : Nullable<int>.NullValue;

    At least I think that's what I'd have to write - it's complex enough that I'm not sure this code works. This is ugly enough that it makes using Nullable without compiler support a whole lot of work. With the compiler support, you write:

    int? x = 125;
    int? y = 33;
    int? z = x + y;
    
  • Eric Gunnerson's Compendium

    C# vs C++

    • 48 Comments

    [Update: Changed a misplaced "C#" to a "C++" in the tools section. Thanks, Nick]

    At a class I took a while back, the instructor asked me to talk a little bit about the benefits of C++ vs the benefits of C#, since I had worked in C++, then in C#, and now in C++ again.

    I talked for about 5 minutes on why C# was better, in the following themes. Note that I'm speaking about the whole programming environment, not just the language.

    Automatic memory management

    There are several facets of this. When I read or write C++ code,  I keep part of my attention on the algorithmic aspects of the code, and another part on the memory aspects of the code. In C#, I usually don't have to worry about the memory aspects. Further, in C++ I need to figure out who owns objects and who cleans them up, and also figure out how to clean up in the presence of error checking.

    Exceptions

    Without a lot of dedication, using return codes is a bug farm waiting to happen. In the C++ world, some functions return HRESULT, some return a bool, some return another set of status codes, and some return a number and use an out-of-range value as an error indicator. Oh, and some are void. You not only have to write the correct code there, you have to successfully convert back and forth between the various kinds of error handling.

    You also lose the opportunity to write more functional code. I end up writing something like:

    CString name;
    RTN_ERROR_IF_FAILED(employee.FetchName(name));

    instead of writing:

    string name = employee.FetchName();

    And, of course, exceptions are fail-safe in that you get error reporting without doing anything, rather than having to do everything right to get error reporting.

    Coherent libraries

    The C++ code I'm writing uses at least 6 different libraries, all of which were written by different groups and have different philosophies in how you use them, how they're organized, how you handle errors, etc. Some are C libraries, some are C++ libraries that are pretty simple, some are C++ libraries that make heavy use of templates and/or other C++ features. They have various levels of docs, all the way from MSDN docs through "read the source" docs to "a single somewhat-outdated word doc".

    I've said in the past that the biggest improvement in productivity with .NET comes from the coherent library structure. Sure, it doesn't cover everything in Win32, but for the stuff it does cover, it's often much much easier to use than the C++ alternative.

    Compilation Model

    C++ inherited the C compilation model, which was designed in a day when machine constraints were a whole lot tighter, and in those days, it made sense.

    For today, however, separate compilation of files, separate header and source files, and linking slow things down quite a bit. The project that I'm working on now takes somewhere on the order of a minute to do a full build and link (that's to build both the output and my unit tests, which requires a redundant link). An analogous amount of C# code would take less than 5 seconds. Now, I can do an incremental make, but the dependency tracking on the build system I use isn't perfect, and this will sometimes get you into a bad state.

    Tools

    Reflection is a great feature, and enables doing a ton of things that are pretty hard to do in the C++ world. I've been using Source Insight recently, and while it's a pretty good IDE, it isn't able to fully parse templates, which means you don't get full autocompletion in that case.

    Code == Component

    In C#, you get a component automatically. In C++, you may have extra work to do - either with .IDL files or with setting up exports.

    Language Complexity

    C++ templates are really powerful. They can also be really, really obtuse, and I've seen a lot of those obtuse usages over the years. Additionally, things like full operator overloading are great if you want a smart pointer, but are often abused.

    And don't get me started on #define.

    The alternate view

    So, if what I say above is true, the conclusion looks pretty clear - use C#. Not really much surprise considering who came up with the list.

    But there are things that C++ has going for it, and it really depends on the kind of code you're writing which language is a better choice. In broad strokes (and in my opinion, of course), if you're doing user applications (either rich client or web), choosing C# is a no-brainer. When you start getting towards lower-level things like services or apps with lots of interop, the decision is less clear.

    So, what do I think you get in C++? A few things spring to mind.

    Absolute control over your memory

    I think you actually need this level of control much less than you may think you need it, but there are times when you do need it.

    Quick boot time

    Spinning up the CLR takes extra time, and it's likely to always take extra time (unless it's already spun up in some other manner). You don't pay this penalty in the C++ world.

    Smaller memory footprint

    Since you aren't paying for the CLR infrastructure, you don't pay for memory footprint.

    Fewer install dependencies

    You can just install and go, and your users don't have to install the proper version of the CLR.

     

    So, that's pretty much what I said in the class. What did I miss?

  • Eric Gunnerson's Compendium

    C# Coding Guidelines

    • 37 Comments

    I've been in a discussion today with one of our customers on good C# coding guidelines/standards, and I confess to not having any good references to give him.

    He was nice enough to send me a link to one that he thought was good, which happens to be by Juval Lowy at iDesign.

    If you have other coding guidelines that you think are good, feel free to add them to the comments, and I'll try to get them up on our C# dev center.

  • Eric Gunnerson's Compendium

    Grouping classes in an assembly

    • 23 Comments

    This useful bit of information crossed my desk today:

    When it comes to packaging in separate assemblies, remember that you pay a fairly large performance hit on an assembly load. An assembly should really be considered a unit of security control, independent versioning, or contribution from disparate sources. You might consider placing code in a separate assembly if it is used extremely rarely, but probably not.

    Here are some pointers from the "Designing .Net Class Libraries" course:

    Factor functionality into assemblies based on:

    - Performance - There is overhead in loading each assembly. All other things being equal, the fewer assemblies an application loads, the quicker the load time.

    - Versioning - All code in an assembly must version at the same rate.

    - Security - All code in an assembly has the same identity and is granted the same level of trust.

    Assemblies and Performance

    - Prefer single, large assemblies to multiple, smaller assemblies

    - Helps reduce working set of application

    - Large assemblies are easier for NGEN to optimize (better image layout, etc)

    - If you have several assemblies that are always loaded together, combine into a single assembly.

  • Eric Gunnerson's Compendium

    How do I protect my C# code against reverse engineering?

    • 27 Comments

    One of the questions that comes up often - usually after somebody comes across one of the C# decompilers, such as RemoteSoft's Salamander or Lutz Roeder's Reflector - is “how do I keep somebody from reverse-engineering my assemblies and stealing my code?”.

    While reverse engineering of code has been around for a long time, the combination of IL and rich metadata in systems such as Java and .NET make it easier. Reverse engineering optimized x86 code is harder, but it's still quite feasible, as many companies that used copy protection back in the 80s found out.

    One way to think about securing your code is to consider how you secure your house or business. There are some possessions that are not protected, some that are secured with simple locks, some are put in safes, and some warrant a vault with a 24-hour guard. I think it depends upon how valuable a specific piece of software is, and that obviously is something you'll have to decide for yourself.

    As for protection, there are a number of different schemes that customers have told me about.

    1. No protection. Some companies sell products that either ship with source code or have source code that can be bought separately, or have chosen not to protect certain parts of their application that have little IP.
    2. Some companies use the obfuscator that ships with VS 7.1 (a “Community Edition“ of PreEmptive Solution's Dotfuscator)
    3. Some companies use one of the commercial obfuscators out there. There's a list on the C# Developer Center on MSDN.
    4. Some companies write the sensitive parts of their code in C++, compile that to an unmanaged DLL, and then use interop to call into the unmanaged DLL.
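
    As a sketch of that last approach (the DLL and function names here are invented), the managed side ends up being a thin P/Invoke wrapper around the unmanaged implementation:

    using System.Runtime.InteropServices;

    class LicenseChecker
    {
        // hypothetical entry point exported from the unmanaged C++ DLL
        [DllImport("SecretNative.dll")]
        static extern int ComputeKey(int seed);

        public static int GetKey(int seed)
        {
            return ComputeKey(seed);
        }
    }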

    I should also point out that there are some products that claim to use encryption-based approaches to prevent reverse-engineering, but I don't have any credentials to evaluate such schemes, so I won't venture an opinion.

    If you know of any other schemes, please add them in the comments

  • Eric Gunnerson's Compendium

    DirectShow and C#

    • 19 Comments

    I'm currently in the "ramping up" phase of my job, and last week I spent a fair amount of time playing around with DirectShow from C#.

    DirectShow is an interesting technology that provides a sort of "building block" approach to multimedia. You create a DirectShow graph, populate it with different filters, hook them together, and start the whole process in motion. DirectShow handles the mechanics of getting the different parts talking to each other so data can flow through the system.

    If, for example, you wanted to capture some audio and save it to disk in WMA format, you'd start with an audio input filter, which has an output pin. You'd wire that output pin to the input pin of a WMA encoder, and then wire the output pin of the encoder to the input pin of the FileStream filter, and that would give you the basic graph. Select which input you want, tell the graph to play, and your app is up and savin' data.

    Want to add a little reverb? Hook in the reverb filter.

    Video is similar - you hook things up together, and DShow handles passing things around. There's a cool workbench app named GraphEdit (graphedt.exe) that lets you graphically play around, and is a great way to find out what works before you write code.

    All of this is explained quite well in "Programming Microsoft DirectShow for Digital Video and Television"

    There are some managed DirectX wrappers in the DX9 SDK, but there aren't any available for DirectShow. I think I know why, but I'll save that for later...

    I set out to try to use DirectShow from C#, to emulate some of the C++ samples. The first problem is defining the COM interfaces. I've spent some time trying to define interfaces by hand in the past, but that hasn't been very successful. I searched for a typelib, but didn't find one.

    I did, however, find something that is a little better - the .IDL files. IDL files provide the definition of COM interfaces, and in this case are used to generate the C++ header files. They can also be used as input to MIDL, which produces a type library, which can be consumed by tlbimp to produce managed wrappers, which allows us all to go home early.

    They can if they've been set up to do that. But this IDL wasn't authored to produce a typelib, so it doesn't generate one by default. To do that, you need a library statement, which tells MIDL what to put in the typelib. Here's what I put in an IDL file:

    import "devenum.idl";
    import "axcore.idl";
    import "axextend.idl";

    [
    uuid(22995cc9-e37e-4b96-9326-b418935ac4be),
    helpstring("DirectShow interfaces")
    ]
    library DirectShow
    {
        interface IFilterGraph;
        interface ICreateDevEnum;
        interface IGraphBuilder;
        interface ICaptureGraphBuilder2;
        interface IFileSinkFilter;
        interface IFileSinkFilter2;
        interface IAMAudioInputMixer;
    };

    You have to have a GUID there for it to work.

    When run through MIDL, this gave me the typelib, which gave me a wrapper with some of the types I needed. You'll also need to add quartz.dll from windows\system32, which does have type information, so that you can get some of the other types.

    Once you've got that set up, you can start writing code. The C++ COM code in the samples is like most C++ COM code - ugly and hard to understand. But it's not that bad in C#.

    To create a COM object, you need to write:

    Type t = Type.GetTypeFromCLSID(guid);
    IGraphBuilder graphBuilder = (IGraphBuilder) Activator.CreateInstance(t);

    The guid is a Guid instance, and you get the Guid to use by looking in the include files or copying it from GraphEdit. That's a bit ugly, so I wrote this little helper function:

    private T CreateComObject<T>(Guid guid) where T : class
    {
        Type comType = Type.GetTypeFromCLSID(guid);
        object o = Activator.CreateInstance(comType);
        if (o == null)
            return null;
        else
            return (T) o;
    }

    Which then allows me just to write:

    IGraphBuilder graphBuilder =
           CreateComObject<IGraphBuilder>(CLSID_FilterGraph);

    That's a nice little use of a generic method.

    Sometimes, you want an interface off an object rather than a new object. You can do that through a simple cast.
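
    For example, to get a different interface on the graph object created earlier (IFilterGraph is one of the interfaces imported in the IDL above):

    IFilterGraph filterGraph = (IFilterGraph) graphBuilder;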

    Up to this point, things are fairly easy, and it makes you wonder why there's no managed interface to this stuff. Some of the real power comes when you want to write a filter, which could do something like convert a picture to grayscale. While one could write a filter in C#, one would have to be pretty daft to try it, given that there are lots of helper classes in C++ that you'd have to rewrite in C# (ok, perhaps MC++ could help...). My guess is that that's why there are no managed wrappers.

     

  • Eric Gunnerson's Compendium

    Why doesn't C# have an 'inline' keyword?

    • 21 Comments

    I got this question in email today, and I thought I'd share my response.

    For those of you who don't know, the 'inline' keyword in the C language was designed to control whether the compiler inlined a function into the calling function. For example, if I had something like

    int square(int x)
    {
        return x * x;
    }

    and used it here:

    int y = square(x);

    I would be making a function call for a single expression, which would be pretty inefficient. Adding 'inline' to the definition would tell the compiler to inline it, effectively changing my use to:

    int y = x * x;

    and giving me the abstraction of a function call without the overhead.

    Inline had some good uses in early C compilers, but it was also prone to misuse - some programmers would decorate all their functions with 'inline', which would give them no function calls, very big code, and therefore slow execution.

    That led to the next phase, where programmers would only put 'inline' on functions where they needed the performance. That kept the code smaller, but missed cases like the one I have above where the inlined code is smaller than the function call code, where you want to inline every time. These cases only got caught if the programmer noticed them.

    Finally, as systems got better, compilers got into the act, and started doing enough analysis to make good decisions on their own without the 'inline' keyword. In fact, the compiler writers discovered that the heuristics that they put in the compiler made better choices than programmers did, and some compilers started ignoring 'inline' and just doing what they thought best. This was legal because inline was always defined as a hint, not a requirement.

    This worked really well for most cases, but there were some cases where the programmer really wanted something to be inline (ie override the heuristics). This sometimes came up as a request for the "inlinedammit!" keyword, which showed up in VC++ in V6.0 (IIRC) under the nicer name "__forceinline".

    So how does that relate to C#? Well, it doesn't, but it's an interesting story so I wanted to share it with you.

    For C#, inlining happens at the JIT level, and the JIT generally makes a decent decision. Rather than provide the ability to override that, I think the real long-term solution is to use profile-guided feedback so that the JIT can be smarter about what it inlines and what it doesn't.

     

  • Eric Gunnerson's Compendium

    Arrays inside of structures

    • 9 Comments

    Sometimes when doing interop, you want to have an array embedded inside of a struct. For example, something like:

    struct data
    {
        int header;
        int values[10];
    }

    that you either use in an interop call, or access with unsafe code to deal with an existing format - something like a network packet or a disk record.

    You can do this in current versions of C#, but it's ugly. The most obvious way is:

    struct Data
    {
        int header;
        int value0;
        int value1;
        int value2;
        int value3;
        int value4;
        int value5;
        int value6;
        int value7;
        int value8;
        int value9;
    }

    That's a bit ungainly, but you *can* take the address of value0 and then use pointer arithmetic to access the various "array elements". There's a somewhat shorter version:

    [StructLayout(LayoutKind.Sequential, Size=44)]
    struct data
    {
        int header;
        int values;
    }

    Here the Size is set to 44 bytes - 4 for the header plus 40 for the ten values. If you want to have more than one such array, or fields after the first one, you'll need to use LayoutKind.Explicit and put a FieldOffset attribute on every field.

    In either case, you access the "array" by getting a pointer to the first of the value fields and using pointer arithmetic.
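    Here's a rough sketch of that access pattern for the Size-based version (the names are mine, and it needs to be compiled with /unsafe):

    using System.Runtime.InteropServices;

    [StructLayout(LayoutKind.Sequential, Size=44)]
    struct PacketData
    {
        public int header;
        public int values;   // first int of the reserved 40-byte region
    }

    class PacketHelper
    {
        // Fill the ten "array elements" via pointer arithmetic.
        public static unsafe void Fill(ref PacketData d)
        {
            fixed (int* p = &d.values)
            {
                for (int i = 0; i < 10; i++)
                    p[i] = i;
            }
        }
    }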

    To make this a little cleaner and less error-prone, in C# 2.0 we've added a fixed array syntax, allowing you to write:

    unsafe struct data
    {
        int header;
        fixed int values[10];
    }

    and get exactly the same behavior. You can then treat values as if it were a "c-style" array, using indexing, etc.

    Because the underlying implementation is still using pointers and there's the possibility of going outside the allocated space, code generated using this feature requires using "unsafe", and therefore requires elevated privilege to execute.
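    For completeness, here's a small sketch of the new syntax in use (the surrounding code is mine, just for illustration; it also needs /unsafe):

    unsafe struct Data
    {
        public int header;
        public fixed int values[10];
    }

    class Program
    {
        static unsafe void Main()
        {
            Data d = new Data();
            d.header = 1;
            for (int i = 0; i < 10; i++)
                d.values[i] = i * i;              // indexed just like a C-style array

            System.Console.WriteLine(d.values[3]);   // prints 9
        }
    }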

  • Eric Gunnerson's Compendium

    Minus 100 points

    • 26 Comments

    When I switched over to the C# compiler team, I had hoped that I would be able to give some insight into how the design team works, what decisions we make, etc. Language design is a very esoteric field, and there's not a lot written about it (though "The Design and Evolution of C++" is a pretty good read). I had hoped that I would be able to do this with concrete examples, as that makes it much easier.

    I've been watching for candidate topics to write about, but haven't yet come up with any good ones. One of the problems is that features have a tendency to morph in design (and in whether they'll make it into Whidbey) as time goes by, and it would be bad for me to say, "we're talking about doing <x>", and then have us decide it wasn't a good idea. Or, for us to decide that <x> doesn't fit into our schedule, or it would break existing code, or any of the other reasons that might cause us to pull a feature. We're generally not comfortable really talking about those things until the feature is in somebody's hands.

    In lieu of such an example, I've decided to write something a bit more abstract about how we look at things.

    ******

    I've been spending a lot of time answering customer questions about why we do (or don't) have certain language features in C#. (as an aside, yes, we will be getting more of this online on the C# dev center).

    In some of those questions, the questioner asks why we either "took out" or "left out" a specific feature. That wording implies that we started with an existing language (C++ and Java are the popular choices here) and then removed features until we got to a point we liked. And, though it may be hard for some to believe, that's not how the language got designed.

    One of the big reasons we didn't do this is that it's really hard to remove complexity when you take a subtractive approach: removing a feature in one area doesn't let you revisit low-level design decisions, nor does it let you remove complexity elsewhere, in places that exist only to support the now-removed feature.

    So, we decided on the additive approach instead, and worked hard to keep the complexity down. One way to do that is through the concept of "minus 100 points". Every feature starts out in the hole by 100 points, which means that it has to have a significant net positive effect on the overall package for it to make it into the language. Some features would be okay for a language to have; they just aren't quite good enough to make it in.

    A feature like exception filters in VB is in this bucket. Yes, being able to put a condition on a catch is somewhat more convenient than having to write the test yourself, but it doesn't really enable you to do anything new.

    Being able to have property accessors with different accessibilities was another example of this. We had some experience with users being able to specify different levels of virtual-ness (virtuality?) on different accessors in an early version of the language, and it was a really confusing feature, so we elected to remove it. We felt similarly about differing accessibilities, and so we left it out of the 2002/2003 versions of the language, but we got *so much* feedback from customers that we re-evaluated the utility of the feature, and decided to add it to the language in the Whidbey version.

    One can argue that we should perhaps have come to that conclusion earlier, but I don't think that being cautious is a bad thing when doing language design. Once you add a feature, it's in the language forever.

    I should also probably note that this isn't the only reason features don't make it into the language. Some features provide enough utility in the abstract, but when we go to come up with a workable design for the feature, we find that we can't come up with a way to make it simple and understandable for the user. That means that in some cases we like a feature in the abstract, but it's not in the language.

    There is also another class - features that we don't want to get near the language. But that would be another post.

  • Eric Gunnerson's Compendium

    So, you want to write a computer book...

    • 12 Comments

    As some of you know, in the early years of the 21st century, I wrote a book on C#.

    Since then, I've had a number of people who are interested in writing a book themselves ask me about my experience, so I thought I'd spend some time to write something down.

    So, you want to write a computer book...

    There are many reasons that you might want to write a computer book - or some other kind of book, for that matter. Unfortunately, some of those reasons may not pan out the way you expected.

    Money

    Here's a quick lesson on the economics of book publishing, in the computer world. I believe that this also applies generally to the rest of the technical world. I know little about writing non-technical books - even less than I know about writing technical books - but I know enough to know that what I'm about to write doesn't apply there at all.

    Book royalties are calculated as a percentage of the wholesale price of a book, which is generally 50% of the cover price. In my understanding, royalties generally range between 10% and 15%, with the higher ones going to more established authors.

    Let's say that you write a $40 book with 10% royalties, and it sells 10,000 copies (a pretty decent sales figure, from what I'm told). That means that your royalties are:

    $40 / 2 * 0.1 * 10000 = $20,000

    That's pre-tax, of course - you'll need to pay income tax on your royalties, and you may also have to pay social security tax on it. I allocate about 40% to that, leaving you with $12000.

    Is that a good deal? Well, it's nothing to sneeze at, but you need to figure how much time you spent on it. On the first version of my book, I estimated that I spent at least 400 hours. If that figure is accurate (I have some reason to suspect I underestimated...), that means that I made about $30 per hour ($50 per hour pre-tax). That may or may not be a good deal for you.

    Books are a good example of non-linear return for the effort. If you sell only 3000 books, you get about $10/hour for the work. If you write a classic like Code Complete and sell a lot of copies, the return could be much better.

    Easy Money

    Even if you're a bestseller, writing a book is not easy, so you won't get easy money out of it.

    Fame/Glory

    If your book is successful, you may get some fame out of it (perhaps "notoriety" is a better term). While it's an interesting experience to be "famous" for your work, fame doesn't feed the iguana, with one specific exception (see Respect).

    "Chicks dig computer book authors..."

    You may rest assured that your author status is likely to have no noticeable effect in this area.

    Respect

    If you write a good book that is well-received, you will get the thanks and respect of others in the industry. While you can't spend this directly, being a "noted author" can open up opportunities on the employment side of things. Some consultants write books primarily to drive name recognition and respect so they can be more successful consulting.

    Enjoyment

    Today at lunch I had one of my friends ask whether I had fun writing my book. It's a hard question to answer.

    I do derive considerable enjoyment from writing something that explains a particular topic well, and there were a fair number of those, but there was also a considerable amount of hard work. To the extent that hard work is enjoyable (sometimes it is and sometimes it isn't), it was overall a pleasurable experience.

    Learning

    It is a reasonable expectation that you should be well-versed in your topic before you write a book, but no matter what your level, you will learn lots of new things. Not only are there 1001 details that you need to understand thoroughly before you write a book, there's nothing like having to commit your thoughts to paper to make you realize that you either don't know the details or are unsure of the details. I had to do a lot of research, and that meant I came out the other end with a lot more deep knowledge than I had before.

    Qualifications

    So, despite having read the earlier part of this article, you still think you want to write a book. The question now becomes "are you qualified?" There are two aspects of this that I think are important.

    Can you write?

    Or, to put it more succinctly, can you write well in a reasonable amount of time without driving yourself and the people around you crazy?

    Before you can get a signed contract, you need to be able to demonstrate this to your publisher (unless you're a big name draw, and the publisher is willing to pay for editing and/or a ghostwriter).

    To find out whether this is feasible for you, you need to do some writing, and then you need to have an audience read the writing and give you constructive feedback. Writing is a skill, and over time you should be able to develop techniques that work well with your target audience.

    Good ways to practice:

    • Write a blog. Book writing is not like blog writing, but it's a good, cheap way to practice, and a great way to get quick and easy feedback.
    • Write articles for an online programmer's site - something like CodeProject.
    • Write an article for MSDN
    • Answer questions on newsgroups or message boards

    Strangely, the most important of these - the last one - has the least to do with formal writing. But it's the most important, because to write a good book you need to have a deep understanding of what is hard (ie what is hard to understand, what is poorly documented, what is confusing, etc.) *and* you need to be able to explain things in ways that people understand.

    Both of those are cheap and easy to accomplish in a newsgroup. There's also an important side benefit to be had with community involvement, which I'll touch on later.

    Are you uniquely qualified?

    I know nothing about building and configuring Beowulf clusters. While I'm confident that, with a huge amount of effort, I could learn enough to write a book about them, getting to that point is far more effort than I want to invest, and I can't even tell whether there are already good books out there until I've put in a fair amount of effort.

    So, you need to be uniquely qualified. That means one or more of the following:

    • You invented/popularized/standardized a technology. If you're Anders or Don Box, you have a position that others don't.
    • You know a lot about a technology and about how people use it, what problems they have with it, etc. This is a typical MVP advantage.
    • You have the first-mover advantage. You've been involved deeply early, and there are currently few people who have your level of knowledge.

    That last one is really important. When C# was first released, all the publishers wanted to have C# books, and there were a number of "me too" books that weren't distinctive and didn't sell very well. But in most cases, their authors worked just as hard as the authors of the more successful books. You don't want to find yourself in that situation.

    I should note that if you're in the first group, you can safely ignore the first-mover consideration. You likely have enough unique insight - and likely, name recognition - that your book will sell even if it isn't first.

    Do you have the time and the desire?

    And more importantly, can your family/social life survive your book-writing effort?

    Deciding

    So, you've done all this, but perhaps you aren't really sure whether you want to do it or not. If you're in this situation, the best thing to do is to pick a chapter of the book, and just write it. That won't take a huge amount of time, and if it's a successful experience, you'll have something you can shop around to various publishers.

    Finding a Publisher

    Do some research on the various publishers. Who has a line where your book would fit well? Is there a book from another publisher that you could compete against? Are there authors you can talk to about their experiences?

    Narrow your choices, go to the publisher websites, and find their materials for new authors. And then contact somebody, and try to spend some time (on the phone or in person, ideally) talking about your book idea. If you go to conferences, publishers are often available to talk to there, and they may even buy you lunch.

    Helping Your Book Sell

    You should expect your publisher to do a reasonable amount of marketing around your book, but you can have a large effect on sales yourself. First, if you were already involved in the appropriate community, you may be able to get others to help you do a technical review of the book. This is great both for technical quality and to have somebody else make recommendations for you.

    If it doesn't seem like a conflict of interest, add a line that says "Author, "413 ways to write dangerous code" (Sams)" to the end of your signature. If people like your responses in the group, you're more likely to get a sale. Just make sure never to point people to your book instead of answering their question - that's a sure way not to get a sale.

    Co-Authors

    If you're going to co-author a book, you need to find somebody with a compatible writing style, compatible writing habits, and a compatible personality. You also need to decide how you will divide the royalties - is it 50/50, or is it pro-rated based on the number of pages or chapters? I haven't done this myself, but I do know of cases where there were a lot of bad feelings at the end.

    Oh, and #1, you must have a compatible vision and viewpoint on how books should be structured.

    For me, I find it really hard to write with somebody.

    Conclusion

    I hope this was useful. If there are areas you don't like, are unclear, or you still have questions, please let me know.

  • Eric Gunnerson's Compendium

    Future language features & scenarios

    • 119 Comments

    We're starting to think about the release after Whidbey in the C# compiler team.

    I know that it seems strange that we have just shipped the Whidbey beta, and we're already talking about what comes next, but we're fairly close to being “shut down” for the compiler right now. So, we're taking the “big picture“ perspective.

    Since we're in this mode, now's the time to suggest things that we should add to the language and/or scenarios where you think the language falls short.

    An ideal comment would be something like:

    I'd like to be able to support mixin-style programming in C#

    If it isn't obvious what you're suggesting, please provide a link.

    It would also be good to suggest through code - something like:

    I currently am forced to write code like

    <ugly code here>

    and it's really ugly.

    I won't be able to respond to all comments nor will we implement all of them (or perhaps any of them), but we will consider all of them.

    Thanks 

     

  • Eric Gunnerson's Compendium

    A lock statement with timeout...

    • 25 Comments

    Ian Griffiths comes up with an interesting way to use IDisposable and the "using" statement to get a version of lock with a timeout.

    I like the approach, but there are two ways to improve it:

    1) Define TimedLock as a struct instead of a class, so that there's no heap allocation involved.

    2) Implement Dispose() as a public method rather than as an explicit (private) interface implementation. If it's public, the compiler will call Dispose() directly; otherwise it will box the struct to the IDisposable interface before calling Dispose().
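    To make that concrete, here's a minimal sketch of the shape I have in mind - my own simplified version, not Ian's actual code:

    using System;
    using System.Threading;

    public struct TimedLock : IDisposable
    {
        private readonly object target;

        private TimedLock(object o)
        {
            target = o;
        }

        public static TimedLock Lock(object o, TimeSpan timeout)
        {
            if (!Monitor.TryEnter(o, timeout))
                throw new ApplicationException("Timed out waiting for lock");
            return new TimedLock(o);
        }

        // Public Dispose() so that 'using' on the struct calls it directly,
        // without boxing to IDisposable.
        public void Dispose()
        {
            Monitor.Exit(target);
        }
    }

    Used as:

    using (TimedLock.Lock(someObject, TimeSpan.FromSeconds(1)))
    {
        // protected code; someObject is whatever object you're locking on
    }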

     

  • Eric Gunnerson's Compendium

    List<Employee> or EmployeeList?

    • 35 Comments

    In current versions of C#, to get a strongly-typed collection, you need to define a separate type for that collection. So if you want to expose a collection of employees, you create an EmployeeCollection class.

    With generics, it's now possible to just use List<Employee>, and it's also possible to write a pretty simple version of an EmployeeCollection as well:

    class EmployeeCollection: List<Employee> {}

    So, which one is preferred?

    My advice is to prefer List<Employee>, as anybody who looks at such a definition will already know what operations are present on it, while it's not clear what operations an EmployeeCollection might provide.

    I would switch to EmployeeCollection when I want to add behavior beyond what List<T> provides (I obviously *have to* switch if I want to add behavior).

    This also has the added benefit of not cluttering up the program with types that really don't need to be there.
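    As a sketch of what I mean by "adding behavior" (Employee and TotalPayroll are just made-up names for illustration):

    using System.Collections.Generic;

    class Employee
    {
        public string Name;
        public decimal Salary;
    }

    // Worth defining only once you need something List<T> doesn't already give you.
    class EmployeeCollection : List<Employee>
    {
        public decimal TotalPayroll()
        {
            decimal total = 0;
            foreach (Employee e in this)
                total += e.Salary;
            return total;
        }
    }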

  • Eric Gunnerson's Compendium

    Regular Expression Workbench V2.0

    • 11 Comments

    I've finished a new version of my Regular Expression Workbench, and it's now available on gotdotnet. If you use regular expressions on .NET, or you've heard about them but haven't really tried them, this tool can help you a lot. If I do say so myself.

    As an old Perl guy (in both senses of the word "old"), I've spent a fair amount of time writing regular expressions. It's easy to try out a regex in Perl, but not so easy in a compiled language like C#. I wrote the first version of this utility a couple of years ago; all it did was let you type in a regex and a string to run it against, and execute it.

    Over time, it grew. The next version supported some fairly cool features:

    • A menu with all the regex language options, so you don't have to remember what the syntax is for zero-width positive lookaheads.
    • Automatic creation of C# or VB code based on your regex and the options you choose.
    • Interpretation of regexes. Hover over a part of the regex, and the popup will tell you what that part of the regex means. This is very useful if you're trying to learn regex, or you don't remember what "(?<=" means.
    • Support for calling Split()

    This version adds a few more features:

    • A nicer UI. Not a very high bar, given the previous design ("Who designed this UI? Vandals?") (5 points to anybody who knows who wrote that line...). A real menubar, a somewhat-pleasant grouping of controls, etc.
    • Library functionality. Give the regex you wrote a description, and save it away into a library, so you can open it up later, or show it off to your friends. Chicks dig a well-crafted regular expression.
    • Unit tests for the interpretation features. Found 3 or 4 good bugs when writing the unit tests. These tests will get better over time.
    • Support for calling Regex.Replace(). Specify the replacement string, and you'll see exactly what gets replaced.
    • Support for calling Regex.Replace() with a MatchEvaluator. For the cases where you can't do your replacement with a simple substitution string, the Regex class lets you write a delegate. The workbench now allows you to write the function, which it saves away, compiles, loads and then uses to call Replace (there's a small example of this scenario right after this list).

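    For reference, that last scenario - Replace with a MatchEvaluator - looks roughly like this in plain code (the pattern and delegate here are just a made-up example):

    using System;
    using System.Text.RegularExpressions;

    class ReplaceExample
    {
        static void Main()
        {
            // Double every number in the input string.
            string input = "widths: 10, 25, 4";
            string result = Regex.Replace(input, @"\d+",
                delegate(Match m) { return (int.Parse(m.Value) * 2).ToString(); });

            Console.WriteLine(result);   // widths: 20, 50, 8
        }
    }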
    Comments & suggestions are always welcome.

     

  • Eric Gunnerson's Compendium

    C# and Aspect Oriented Programming

    • 23 Comments

    One common question I've heard recently is “Are there any plans to add AOP support to C#?”

    We don't currently have any plans to add AOP into the language.

    My current thinking - and I'm not speaking for the whole design team now, though I think my position is at least somewhat representative - is that the scenarios that are mentioned around AOP don't justify the inclusion of new language features. One of my concerns is that AOP adds a whole new level of complexity to reading code, in that the code that is executed in a method isn't all in the method. The scenarios I've seen listed are around logging and monitoring.

    So, I think it's fair to say that we're in a “wait and see” attitude on this one. If there are compelling scenarios in this area, I'd like to know about them.

  • Eric Gunnerson's Compendium

    Extending existing classes

    • 15 Comments

    One other point that came up in the static import thread was extending existing classes. It's not uncommon to want to add specific methods to existing classes - or at least have the appearance of doing this. For example, I might want to add a new method to the string class, so I can call it with:

    string s = ...;

    string r = s.Permute(10, 15);

    rather than

    string r = Utils.Permute(s, 10, 15);

    We've discussed this a few times in the past, and we think this is an important scenario to consider. There are a number of reasons why you wouldn't want to actually modify the class, of which security is just one consideration. But one could think of a way of writing something like (very hypothetical syntax):

    class Utils
    {
        [AddTo(typeof(string))]
        public static string Permute(string s, int a, int b) {...}
    }

    and have the compiler then allow you to use it as if it were part of the string class. This is very useful, but perhaps not terribly understandable, and would certainly be open to abuse.

    Another option would be to allow the following definition (also hypothetical)

    class MyString<T>: T where T:string
    {
        public string Permute(int a, int b) {...}
    }

    Now, if you use a MyString, you can add methods onto the existing class. This would also be useful to add a specific implementation of something onto an existing class, somewhat in the way that Mixins work.

    We have no plans in this area, but will likely discuss the scenario more in the future.

  • Eric Gunnerson's Compendium

    Eric should write a blog post on...

    • 48 Comments

    I've been tracking topics I wanted to cover using various methods (Excel spreadsheets, Post-It (tm) notes, "post-it compatible" notes, writing on my hand, and graffiti in my office).

    It's been hard to keep all of those straight, so I'd like to centralize them in a blog post. If you have a question that you'd like me to answer, please add a comment to this post with your question.

    I'll put a link on the top-level web page of my blog, so that you can easily get to it. I'll try to put some of my own ideas there as well.

     

  • Eric Gunnerson's Compendium

    Adding Emptiness to the DateTime class

    • 29 Comments

    I got an interesting email from a customer today, asking for my opinion on how to deal with the concept of “Empty” in relation to DateTime values. They had decided to use the DateTime.MinValue value as an indication that the DateTime was empty.

    The two options they were considering were:

    1) Call static functions to determine whether a DateTime is empty
    2) Just compare the DateTime value to DateTime.MinValue to see if it is empty.

    After a bit of reflection, I decided that I didn't like either approach. The problem is that they're trying to add a new concept of “emptiness” (some might equate this to “null”) to a type without changing the type.

    A better approach is to define a new type that supports the concept of emptiness. In this case, we'll create a struct that encapsulates the DateTime value and lets us deal with empty in a more robust manner.

    Here's the struct I wrote for them (and, no, this is not an indication that I'm now writing classes when people ask me to). 

    public struct EmptyDateTime
    {
        DateTime dateTime;

        public EmptyDateTime(DateTime dateTime) { this.dateTime = dateTime; }

        public bool IsEmpty { get { return dateTime == DateTime.MinValue; } }

        public static explicit operator DateTime(EmptyDateTime emptyDateTime)
        {
            if (emptyDateTime.IsEmpty)
                throw new InvalidOperationException("DateTime is Empty");
            return emptyDateTime.dateTime;
        }

        public static implicit operator EmptyDateTime(DateTime dateTime)
        {
            return new EmptyDateTime(dateTime);
        }

        public static EmptyDateTime Empty { get { return new EmptyDateTime(DateTime.MinValue); } }
    }
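    And here's a quick sketch of how calling code might look (made-up usage, just to show the conversions at work):

    using System;

    class Example
    {
        static void Main()
        {
            EmptyDateTime scheduled = EmptyDateTime.Empty;

            if (scheduled.IsEmpty)
                Console.WriteLine("Not scheduled yet");

            scheduled = DateTime.Now;               // implicit conversion from DateTime
            DateTime when = (DateTime)scheduled;    // explicit conversion back out
            Console.WriteLine(when);
        }
    }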