March, 2006

Posts
  • Eric Gunnerson's Compendium

    Public readonly string vs. public readonly property?

    • 23 Comments

    I got an email asking me a question, and I thought it would make a good blog post. It has all the things that make a good Hollywood movie - conflict, tradeoffs, and a grisly result if you choose wrong.

    So, the question is, "Given the choice between a public readonly string or a public readonly property, which one is better and why?"

    Just to make sure we're all on the same page, the public readonly string looks like this:

        class ThingOne
        {
            public readonly string Value;
            public ThingOne(string value) { this.Value = value; }
        }

    and the public readonly property looks like this:

        class ThingTwo
        {
            private string value; 
            public ThingTwo(string value) { this.value = value; }
            public string Value { get { return this.value; } }
        }

    So, what are the good points of each of these implementations?

    The readonly one is simpler to read and understand and simpler to maintain (think of the differences in code with 10 of these in each class). It also has the advantage of being more precise, and by that I mean that the intent of the developer is crystal-clear; Value is something that is set when the instance is constructed, and never modified afterwards. That's not true with the property version - it could be modified by setting value in some other method, or adding a set accessor.

    The property one provides the isolation that properties do - the implementation can be changed underneath without requiring the caller to change anything.
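    To make that isolation concrete, here is a minimal sketch. ThingTwoV1 is the property version from above; ThingTwoV2 is a hypothetical later revision (the lazy trimming is invented purely for illustration) that changes what happens behind the getter while the public surface stays the same.

```csharp
using System;

// Version 1: the field-backed property from the post.
class ThingTwoV1
{
    private string value;
    public ThingTwoV1(string value) { this.value = value; }
    public string Value { get { return this.value; } }
}

// Hypothetical version 2: same public surface, different implementation.
// The trimmed value is computed lazily on first access and then cached.
class ThingTwoV2
{
    private readonly string raw;
    private string cached;

    public ThingTwoV2(string value) { this.raw = value; }

    public string Value
    {
        get
        {
            if (this.cached == null && this.raw != null)
            {
                this.cached = this.raw.Trim();
            }
            return this.cached;
        }
    }
}

static class PropertyIsolationDemo
{
    static void Main()
    {
        // The calling code is identical for both versions.
        Console.WriteLine(new ThingTwoV1("hello").Value);
        Console.WriteLine(new ThingTwoV2("  hello  ").Value);
    }
}
```

    Callers read Value the same way in both cases. Going from a public field to a property later is a different story: already-compiled callers were built against a field access, so they would need to be recompiled.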

    Those are the tradeoffs as I see them. Which one to use? Well, I think it depends on two factors:

    1) How sure are you that this thing is truly readonly.

    2) The pain if you're wrong.

    Being less than convinced about my prescient abilities - and such abilities of developers in general - I think that #2 is the factor that you really need to consider. If the pain will be bad, you go with the properties approach. If it won't be bad, you go with the readonly approach. But, if you're really really sure that it's readonly and will always be readonly, then you can forget about the pain.

    As for the pain potential, it depends mostly on what you're building and how your project is architected.

    The first question is whether the class is visible outside of your assembly. If it isn't, then the pain of making a change is minimal, and I would prefer readonly because of the advantages I listed.

    If it's visible outside the assembly, then the next question is "would we ever need to be able to update this assembly separately from its clients?"  In other words, do you have servicing requirements?

    My original question here was whether the class was used outside your group, but it really comes down to build and deployment scenarios. If your group is the sole user of the assembly or all referenced assemblies are shipped as part of an update, then I would continue to prefer readonly.

    If, however, this class is an externally-visible class and you think that it might need to be serviced (even if you don't do that sort of thing now), then you need the isolation that properties provide.

    My own thinking in this area has certainly evolved over time. The more code that I work with, the more I think that premature generalization ranks right up there with premature optimization in the seven deadly sins of programming. Code that is more complex than it needs to be takes longer to write, and you pay the complexity tax every time somebody needs to work with the code. The difference between a class with 10 public fields and one private field/property pair, and one with 11 properties, 10 of which are trivial, is substantial.

    You may have noticed that I didn't mention efficiency in this post until now. Because of the inlining of properties, there typically isn't any difference, but even if that wasn't the case, I wouldn't consider it a factor in my decision. I'm convinced that a group that writes clean and simple code, tracks perf, and optimizes as appropriate will end up with better perf than a group that tries to code performant code along the way. Simple code is quicker to write and easier to optimize later if necessary.

    So, that's what I think. Agree? Disagree? Want to know what the other 5 deadly sins are?

    Tom, thanks for the question.

  • Eric Gunnerson's Compendium

    When is a keyword not a keyword?

    • 16 Comments

    Many languages have a very strict definition of what a keyword is. The word "for" has a specific meaning, and you can't use it anywhere else in the program.

    C# does have this kind of keyword - you can find the usual list of keywords in the docs - but it also has what is known as a contextual keyword, where a word functions like a keyword, but only in specific situations.

    Contextual keywords are a way to expand the number of keywords you can use without infringing into the identifier space of the user.

    One of the best examples is the set accessor of a property, which typically looks something like this:

    string Name
    {
        set { name = value; }
    }

    In the set accessor, you need to have a way to indicate the value the user assigned to the property. This could (perhaps - I haven't thought about it deeply) have been done with some weird concatenation of symbols, such as "<|", but C# isn't Perl, so it made more sense to use a word, and "value" is easy to understand.

    But people use "value" as a variable name ALL THE TIME, and they're not going to be happy if you steal it, especially for a small feature.

    So, instead of making "value" a true keyword, you make it a contextual keyword. In this case, "value" is reserved only within a set accessor, but you can freely use it elsewhere.
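    A small sketch of what that looks like in practice. The Person class and the Twice method are made up for illustration; the point is only that "value" has its special meaning inside the set accessor and is an ordinary identifier everywhere else.

```csharp
using System;

class Person
{
    private string name;

    public string Name
    {
        get { return name; }
        // Inside the set accessor, "value" is the implicit parameter
        // holding whatever the caller assigned to the property.
        set { name = value; }
    }

    // Outside a set accessor, "value" is just a parameter name.
    public static int Twice(int value)
    {
        return value * 2;
    }
}

static class ContextualKeywordDemo
{
    static void Main()
    {
        int value = 21;              // a local named "value" is fine here
        Person p = new Person();
        p.Name = "Ada";
        Console.WriteLine(p.Name + " " + Person.Twice(value)); // prints "Ada 42"
    }
}
```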

  • Eric Gunnerson's Compendium

    Why "yield return" rather than "yield"?

    • 9 Comments

    This came up in another context, and I thought I'd share the story.

    In the first version of iterators, you would use "yield" when you wanted to "return" an iterator value back, and things worked great. So, that's the way it was for quite a long period.

    We knew at the start of the 2.0 effort that we were doing some major things to the language, and major things often require new keywords. It seemed likely that some of the generics work would lead to this.

    So, we planned on how to deal with it. Now, new keywords are bad. Bad, because developers have the annoying habit of using non-keywords as identifiers (someday I'll tell the story of the generated FORTRAN code I worked on when I was fresh out of college), and if their choice happens to align with a keyword, their used-to-be-working-perfectly code now breaks.

    Which is, as they say, bad.

    So, we came up with a mitigation strategy - we would provide a utility that you could run over your source code, and it would replace any identifier that had become a keyword with the escaped version. It does this by putting an "@" sign in front of it.

    I expect to see heavy usage of "@" in any obfuscated C# contests...
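    For reference, this is roughly what escaped identifiers look like. The Describe method and its parameter names are invented for the example; "class" and "for" are reserved words, and the "@" prefix lets them be used as identifiers anyway.

```csharp
using System;

static class EscapedIdentifierDemo
{
    // Without the "@" prefix, neither parameter name would compile.
    public static string Describe(string @class, int @for)
    {
        return @class + ":" + @for;
    }

    static void Main()
    {
        Console.WriteLine(Describe("ThingOne", 3)); // prints "ThingOne:3"
    }
}
```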

    So, anyway, we had this plan in place. But as we finished the work with generics, we found that we had managed to do it without any real keywords (see my next post for more information...).

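    This meant that the only new keyword in 2.0 would have been "yield". Switching to "yield return", where "yield" only has a special meaning directly in front of "return" (or "break"), meant that C# 2.0 would compile all existing C# code without change.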

    Which is good.
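    Here's a short sketch of the end result. CountTo and Sum are hypothetical names; the first uses "yield return" in an iterator, and the second uses "yield" as a plain local variable, which is exactly the kind of existing code that keeps compiling.

```csharp
using System;
using System.Collections.Generic;

static class YieldDemo
{
    // An iterator: each "yield return" hands one value back to the caller.
    public static IEnumerable<int> CountTo(int limit)
    {
        for (int i = 1; i <= limit; i++)
        {
            yield return i;
        }
    }

    // "yield" is not a reserved word, so it still works as an identifier.
    public static int Sum(int limit)
    {
        int yield = 0;
        foreach (int n in CountTo(limit))
        {
            yield += n;
        }
        return yield;
    }

    static void Main()
    {
        Console.WriteLine(Sum(3)); // prints 6
    }
}
```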

  • Eric Gunnerson's Compendium

    Regex 101 posts - continue or not?

    • 9 Comments

    I've been getting bored with the regex 101 exercises that I've been posting, as lots of them are simply variants of what I've posted in the past, and there's not really much value to add in the discussion.

    I have 9 more of the exercises remaining.

    Things I could do:

    1) Do all of the remaining 9

    2) Pick a few (3-5) that I feel like talking about, and do those.

    3) Give up now, and spend more time looking for the ultimate cordless butter warmer.

    Do you have a preference?

  • Eric Gunnerson's Compendium

    Regex 101 Exercise I10 - Extract repeating hex blocks from a string

    • 8 Comments

    Regex 101 Exercise I10 - Extract repeating hex blocks from a string

    Given the string:

    PCORR:BLOCK=V5CCH,IA=H'22EF&H'2354&H'4BD4&H'4C4B&H'4D52&H'4DC9;

    Extract all the hex numbers in the form "H'xxxx"

     

  • Eric Gunnerson's Compendium

    Worst... Code... Ever...

    • 8 Comments

    Like many developers, we enjoy making fun of other people's code. I've had lots of those discussions over the years. And when it comes to who has had to work with the worst code, I've never lost.

    Way back when I was just out of college - when 7-bit ASCII ruled the world, and people hadn't decided whether lowercase characters were a fad or not - I worked for a Large Seattle Aerospace Company, at Large Seattle Aerospace Company Computer Services, in a group that built graphics tools for engineers.

    One of the libraries that we owned was generated code.

    Now, I don't hate generated code. The Windows Forms Designer and I have a generally felicitous relationship. I even have a patent cube (which you get when an application is accepted at Microsoft) sitting on my windowsill that is related to an architecture for doing generated code.

    This particular library was special. It goes without saying that the source to the generator was lost in the card recycling bin of time, but that was not what made it unique.

    First of all, it was in FORTRAN. And not that namby-pamby FORTRAN where you can have modern control structures, this was FORTRAN 77, and you had better know how to count spaces and what it meant to put a character in column 6. Did you think that Python came up with the idea of significant whitespace? Pshaw.

    Secondly, the programmer who had written the code generator was a bit of a FORTRAN aficionado. There's a feature in FORTRAN 77 known as the "assigned goto". Here's an example:

          ASSIGN 30 TO LABEL
          NUM = 40
          GO TO LABEL
          NUM = 50                ! This statement is not executed
    30    ASSIGN 1000 TO IFMT
          PRINT IFMT, NUM         ! IFMT is the format specifier
    1000  FORMAT(1X,I4)
          END

    Now, to understand this, you have to remember that by default, any variable that starts with the letters I-N is implicitly an INTEGER variable (all others are implicitly REAL). So, you can assign 30 to LABEL (this is *not* the same thing as writing LABEL=30, which means what you expect it to mean), and then use "GO TO LABEL" to goto 30.

    Suffice it to say that the developer had never read Dijkstra.

    Now, my guess is that while the vast majority of you are thankful that you don't have to work in FORTRAN, there is a divide in opinion beyond that. Some of you are saying "Well, that's not really that bad". But the rest of you are shaking your heads, because you know what is coming.

    When you're doing code generation, you need to somehow come up with variable names. In this case, the developer took the easy way out, and all integer variables were something like "I1344".

    And line numbers also use the same range, starting at 1000 and going on up.

    So, it means that the code has lots of statements like:

    GO TO 11634

    and lots of statements like:

    GO TO I1655

    Did I mention that in the fonts of the day, the difference between "1" and "I" was fairly subtle? Even if you did notice the I in front, you had to cope with the fact that

    GO TO I1655

    really meant

    GO TO 3455

    At least, it meant that sometimes, but I1655 would be re-assigned when the program ran.

    IIRC, there were about 15K lines of this in the library.

    So, bother me not with tales of poorly-formatted C# code or 2000 line C functions. They are mere trifles. To snatch the rock from my palm will require something stronger. Are you up to the challenge?

    (I am a bit worried that there might be some LISP or APL programmers out there...)

  • Eric Gunnerson's Compendium

    Weird bug I had yesterday...

    • 7 Comments

    I came across a weird bug recently, and I thought the explanation might be interesting.

    But first, a brief digression.

    Windows has support for Bi-Directional text (usually abbreviated as "bidi" around here). Some languages - Arabic in this particular case - are written from right to left, so string layout is right to left (as is the Windows UI, which is pretty freaky the first time you see it).

    Which is pretty straightforward.

    But sometimes you need to render text that is in multiple languages, and if one of those is ordered RTL and the other is LTR, the engine has to be pretty smart, as it needs to handle both sections in the same string, even though they are written in two different directions (i.e. "bidi").

    Most of this support is hidden - if you're keeping your strings in resources and using placeholders, the right thing just happens for you. (If you venture into custom controls, you *do* need to worry about things like RTL layout).

    So, anyway, I got a bug on DVD Maker that said that the numbers in a certain string were wrong in Arabic Windows. I looked at the resource string, and it was:

    %1!d! of %2!d! minutes

    What was coming out in the UI was:

    of 90 minutes (*)

    where the (*) is an Arabic character that I didn't recognize (not that I recognize any Arabic characters...)

    So, is this a bug, or not?

    *****

    The answer is that this is by design. The string hadn't yet been localized (our localization happens in stages). When FormatMessage went to process it, it saw the %1!d!, went to format a number, and chose the default locale, which was Arabic, so it put in an Arabic number at the far right side (RTL, right?). But it then came to a chunk of English text, which meant it had to switch to LTR rendering, and it used the English locale to format the number in the English text.

    The good news is that when the string is localized, everything will work fine. The bad news? Well, it took me a few hours to figure this out.

  • Eric Gunnerson's Compendium

    TDD and design methodologies

    • 6 Comments

    (something I posted on our internal agile alias, in response to a question about how design works in TDD...)

    There's an underlying assumption in software engineering that "more design == better design", despite the fact that the vast majority of us have worked with baroque systems that answer a bunch of questions that nobody ever asked.
     
    The traditional theory is that if you don't do the up front design, your code will be poorly architected, inflexible, and you'll be in trouble when you try to maintain it. Which is true. But it's also true that up-front design - especially the "spend a milestone" type of up-front design - often leads to the same result.
     
    The ideal architecture is a minimalist one. It provides all the features that are needed and no features that aren't needed (I mean "features" in the class method sense, not the user-visible sense)
     
    The up-front approach attempts to do that without the data about which features are needed and which ones aren't, and that data always changes along the way.
     
    TDD says, "We're going to figure out what we need and how to put it together along the way. We know we're not going to get it exactly right the first time, but with our tests we can refactor as necessary"
     
    Comments?
  • Eric Gunnerson's Compendium

    Announcing the C# trivia test...

    • 5 Comments

    A longggggg time ago -  back right after we disclosed C# - I was in San Francisco for one of the early meetings of the bay area .net user group.

    My publisher (Apress) had given me some swag to give away, but I don't like the "survival of the fittest" mode, so I sat in the back during the first hour and wrote some trivia questions. The winners got their pick of the swag, and it was a great warmup for my talk.

    Anyway, I was thinking of that recently, and I thought it would be nice to do a version on my blog.

    But I need some more good questions. That is where you come in. Please send me some questions, I'll fold them in with the questions that I have, and then I'll present them in a "one per day" format.

    The questions should be something that can be determined through publicly available information, though entertainment is more important than accuracy, if you get my drift. What curious thing did you find out? What do you know that other people don't?

    I'll see if I can come up with some swag for the people who send in the best questions.

    Sound good? Okay.

    Send questions to me at EricGu@microsoft.com, and please put your full name in the email if you want credit.

     

     

  • Eric Gunnerson's Compendium

    Regex 101 Answer I10 - Extract repeating hex blocks from a string

    • 4 Comments

    Regex 101 Exercise I10 - Extract repeating hex blocks from a string

    Given the string:

    PCORR:BLOCK=V5CCH,IA=H'22EF&H'2354&H'4BD4&H'4C4B&H'4D52&H'4DC9;

    Extract all the hex numbers in the form "H'xxxx"

    *****

    You can match the hex digits with:

    H'(?<Values>[0-9a-fA-F]{4})

    Like our last example, you can call Match() multiple times, use Matches(), or do it in a single call with:

    (H'(?<Values>[0-9a-fA-F]{4})&?)+
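    Here's a sketch of both approaches in code. The class and method names are made up. In the single-call version, note that the "&" separator is optional in this sketch, because the last block in the sample string is followed by ";" rather than "&" and would otherwise be left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class HexBlockDemo
{
    public const string Input =
        "PCORR:BLOCK=V5CCH,IA=H'22EF&H'2354&H'4BD4&H'4C4B&H'4D52&H'4DC9;";

    // Approach 1: Matches() returns one Match per hex block.
    public static List<string> ViaMatches(string input)
    {
        List<string> values = new List<string>();
        foreach (Match m in Regex.Matches(input, @"H'(?<Values>[0-9a-fA-F]{4})"))
        {
            values.Add(m.Groups["Values"].Value);
        }
        return values;
    }

    // Approach 2: a single Match whose "Values" group was captured repeatedly.
    public static List<string> ViaCaptures(string input)
    {
        List<string> values = new List<string>();
        Match m = Regex.Match(input, @"(H'(?<Values>[0-9a-fA-F]{4})&?)+");
        foreach (Capture c in m.Groups["Values"].Captures)
        {
            values.Add(c.Value);
        }
        return values;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", ViaMatches(Input).ToArray()));
        Console.WriteLine(string.Join(",", ViaCaptures(Input).ToArray()));
    }
}
```

    Both methods return the same six values for the sample string.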

  • Eric Gunnerson's Compendium

    A few notes on the readonly field vs readonly property post...

    • 4 Comments

    First, Tom (not Ted) pointed out that I removed his use of [DebuggerStepThrough] on the get accessor. I did it to simplify things, but it is a useful thing to do.

    Maurits asked, "What would be the drawback of rearchitecturing C# to make public variables compile to public get-only properties?".

    Well, I guess I don't see a lot of advantage in doing that, so I guess the chief drawback I'd see is "insufficient utility versus the added complexity". We did play around with different syntaxes that could be used to write properties without having to write the accessors, but it turned out that they either 1) Didn't cover the important scenarios or 2) were as complex as writing the property. We elected not to do that.

    Bertrand Le Roy pointed out that there are some frameworks that work with properties but not with public fields. I find this pretty annoying, but this does mean that if you want to use data binding (for example), you'll have to define the properties.
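    A rough illustration of why that happens. CanBind below is a hypothetical stand-in for a binding framework, assuming it discovers members by looking for properties through reflection; real frameworks differ in the details, but a lookup like this simply never sees a public field.

```csharp
using System;
using System.Reflection;

public class FieldThing
{
    public readonly string Value;
    public FieldThing(string value) { this.Value = value; }
}

public class PropertyThing
{
    private readonly string value;
    public PropertyThing(string value) { this.value = value; }
    public string Value { get { return this.value; } }
}

static class BindingLookupDemo
{
    // Hypothetical stand-in for a framework that binds by property name.
    public static bool CanBind(Type type, string memberName)
    {
        PropertyInfo property = type.GetProperty(memberName);
        return property != null;
    }

    static void Main()
    {
        Console.WriteLine(CanBind(typeof(FieldThing), "Value"));    // prints False
        Console.WriteLine(CanBind(typeof(PropertyThing), "Value")); // prints True
    }
}
```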

    Bjorn Reppen wrote a whole post on his opinion, which I really won't comment on except to say that I'm a firm believer in YAGNI.

    Peter Ritchie wrote, "You appear to be contradicting the (framework desgin) Guidelines by suggesting readonly instance fields are a better choice under certain circumstances".

    When .NET first came out, there was some talk about writing some general coding guidelines for C#, but we realized that coding guidelines are very specific to teams and can lead to religious discussions, so we opted not to.

    Which meant that the only reference we had was the framework design guidelines, which was unfortunate.

    The Framework Design Guidelines are great - if you are building a framework. But most people aren't building frameworks - they're building applications, or they're building libraries that their group uses directly. In those cases, I think you have to weigh the benefits and the costs.

    I think I could have made it clearer that many frameworks fit into the "update this assembly separately" clause.

    Finally, there were a couple of comments about properties being more flexible. My point was that paying for flexibility that you don't need is the wrong place to put your money.

  • Eric Gunnerson's Compendium

    Fact of the day...

    • 3 Comments

    ... women's standard clown shoe length is normally 13".

     

     

  • Eric Gunnerson's Compendium

    C# on the XBOX 360

    • 3 Comments

    Announced at GDC.

  • Eric Gunnerson's Compendium

    Prevent the heartbreak of torn toast.

    • 3 Comments

    Are you a butter user?

    Have you ever torn your toast trying to spread cold butter onto it, and then gotten so mad about it that you threw your $300 Neiman-Marcus toaster through the stained glass window of your kitchen?

    I thought so.

    You need the ButterWizard...

    (From Mil's mailing list...)

    (I'm not a big butter guy, but I do like the tub kind that is mixed with Canola oil so you can actually spread it).

  • Eric Gunnerson's Compendium

    C# Trivia Test Format

    • 3 Comments

    I've got some good questions for the trivia test, though I am still looking for more. Anything that you've found surprising while writing in C# would be fair game.

    Rather than doing this in a series of posts, I'm thinking of doing it in one big post (or perhaps four smaller posts), so that it's more compact. What do you think? 1 post? 2 posts? red post? blue post? Do them all at once, or spread them out?

  • Eric Gunnerson's Compendium

    A new investment opportunity...

    • 3 Comments

    There's an internal discussion folder at Microsoft named "SOC Weirdness", and a few days ago I came across a mention of LifeGem.

    So, basically, their gig is that they extract the carbon from your beloved's remains (be they human, feline, canine, or, presumably, ursine), purify the carbon into graphite (they say "convert", but if I say that I *know* I'm going to hear from the spousal chemist), and then put it into a diamond press and after some time passes, you end up with a diamond between .2 and 1.25 carats.

    For a few thousand dollars, of course.

    Now, I'm not going to judge the people who choose to do this. Making a loved one into a diamond isn't any weirder than lots of other things people do. But for me, while I like the look of gems, they do lack basic utility, which is why I would like a different alternative for my remains.

    Rather than going to the expense of creating a gem from the graphite, it will be put to a more conventional use.

    As the LifePencil.

    Perhaps they could be handed out during the ceremony, with a suitable inscription...

  • Eric Gunnerson's Compendium

    Snippet enumeration utility

    • 2 Comments

    Snippet enumeration utility.

    Pretty cool.

  • Eric Gunnerson's Compendium

    Alfred made me...

    • 2 Comments

    Alfred made me post this.

  • Eric Gunnerson's Compendium

    New blog about video, MCE, gadgets, and Movie Maker...

    • 1 Comments
    Michael Patten, one of the PMs on the Movie Maker team, has started a new blog.
  • Eric Gunnerson's Compendium

    Compu-Promo - computer promotional photography of the 60s and 70s...

    • 1 Comments

    Compu-Promo

    Thankfully, I missed these machines by a few years...

    From The Institute of Official Cheer

  • Eric Gunnerson's Compendium

    Practical Tips For Boosting The Performance Of Windows Forms Apps

    • 1 Comments
    Practical Tips For Boosting The Performance Of Windows Forms Apps
  • Eric Gunnerson's Compendium

    Driving traffic to your blog...

    • 1 Comments
    From an internal discussion that we had, Jim posts a summary.
  • Eric Gunnerson's Compendium

    Mystery solved

    • 0 Comments

    I can't say that I'm surprised, though it did take me a while to put it all together.

    I think it started when I said that I was thinking about getting a coach. Emails that took a few days to come back. That sort of thing.

    And then when I signed up for a coach, it got a little more obvious. A Saturday ride got cancelled. And he got "busy".

    Yesterday, it all became so clear.

    Elden is moving back to Utah because he's afraid he won't be able to keep up with me this summer.

    Seriously, I'm going to miss you. I don't think any of my riding friends are quite as crazy.

  • Eric Gunnerson's Compendium

    Regex 101 Discussion I9 - Count the number of matches.

    • 0 Comments

    Regex 101 Exercise I9 - Count the number of matches

    Given a string like:

    # # 4 6 # # 7 # 45 # 43 # 65 56 2 # 4345 # # 23

     

    Count how many numbers there are in this string

    -----

    There are a few ways to approach this problem.

    In all of them, we need a way to capture the numeric parts of the string. We've done that before, using something like:

    \d+

    We can then apply that repeatedly, with code that looks something like this:

    Regex regex = new Regex(@"\d+");

    Match match = regex.Match(inputString);
    int count = 0;

    while (match.Success)
    {
       count++;
       match = match.NextMatch();

    }

    That's a bit ugly, however. There's a shortcut, using:

    MatchCollection matches = regex.Matches(inputString);
    int count = matches.Count;

    That gives the same result.

    There's another, more "advanced" approach we can use. The regex is more complex, and the code that you write is harder to understand, so I probably wouldn't prefer it over the last approach.

    To use it, you need to know a new piece of information about the Match class (where "new" means "something I haven't talked about").

    In earlier examples, we used the "Groups" indexer to pull out values that we had captured. So, if we wrote something like:

    (?<Digits>\d+)

    We would use:

    match.Groups["Digits"].Value

    to get the string.

    It is possible to write a regex in which a given capture is used more than one time, and therefore corresponds to multiple strings. If we write:

    (               # Start up repeating section
      (?<Digits>\d+)  # a sequence of digits

      (\D+|$)         # not digits or end

    )+              # repeat match

     

    We have a single regex that will match a repeating series of digits followed by non-digits, and each match is stored using the "Digits" capture.

    To get at these captures, we use:

    match.Groups["Digits"].Captures

    which is a collection of captures, with each one containing the value from one of the captures. To solve our problem, we'd be interested in the count, which is:

    match.Groups["Digits"].Captures.Count
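    Putting the pieces together, a sketch of the whole thing might look like the following. The class and method names are invented. In System.Text.RegularExpressions the repeated captures for a named group are reached through Groups[...].Captures, and the commented multi-line pattern needs RegexOptions.IgnorePatternWhitespace to be accepted.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaptureCountDemo
{
    public static int CountNumbers(string input)
    {
        Regex regex = new Regex(@"
            (                    # start of repeating section
              (?<Digits>\d+)     # a sequence of digits
              (\D+|$)            # non-digits or end of string
            )+                   # repeat match",
            RegexOptions.IgnorePatternWhitespace);

        // One Match, but the "Digits" group was captured once per number.
        Match match = regex.Match(input);
        return match.Groups["Digits"].Captures.Count;
    }

    static void Main()
    {
        string input = "# # 4 6 # # 7 # 45 # 43 # 65 56 2 # 4345 # # 23";
        Console.WriteLine(CountNumbers(input)); // prints 10
    }
}
```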

     

    If you want extra credit, you can also do this by using Regex.Split(), though I've found that Regex.Match() is easier to use for this sort of problem.
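    For the extra-credit version, one possible sketch (names invented) is to split on runs of non-digits and count what's left. Split() returns an empty string when the input starts or ends with a separator, so those have to be skipped.

```csharp
using System;
using System.Text.RegularExpressions;

static class SplitCountDemo
{
    public static int CountNumbers(string input)
    {
        int count = 0;
        // Every run of non-digit characters acts as a separator.
        foreach (string piece in Regex.Split(input, @"\D+"))
        {
            // Skip the empty pieces produced by leading or trailing separators.
            if (piece.Length > 0)
            {
                count++;
            }
        }
        return count;
    }

    static void Main()
    {
        string input = "# # 4 6 # # 7 # 45 # 43 # 65 56 2 # 4345 # # 23";
        Console.WriteLine(CountNumbers(input)); // prints 10
    }
}
```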

  • Eric Gunnerson's Compendium

    The capability immaturity model

    • 0 Comments
    I was reading the DailyWTF this morning - a great read, BTW - and came across a link to the Capability Im-Maturity Model, which I thought you might enjoy.