
ProjectProperties->Signing vs AssemblyInfo

Daniel Moth questioned the move from source attribute to project properties.  Here's my insight on 2 of the reasons.  First the disclaimer:

This posting is provided "AS IS" with no warranties, and confers no rights. The content of this posting contains my own personal opinions and does not represent my employer's view in any way.

The 2 reasons I was aware of fell into 2 categories: security and usability.

First, security.  You have to remember that assembly-level attributes are embedded into your assembly, so the final assembly that you ship contains all the information in those attributes.  That means the filename of your key is shipped out to all of your customers.  The security gurus decided this was too much information, even with the mitigating factor of delay signing (meaning, hopefully, the file only had your public key and the private key was stored someplace really secret).  It is especially bad if you put in a fully qualified path to the key file, which might contain other sensitive information like your username, project code name, machine name, etc.

Next comes usability.  Anybody coming from C/C++ is familiar with mucking with the include file path to get their #include directives to work, especially when they included partial paths.  Due to an implementation detail, the tool that processed the AssemblyKeyFileAttribute was not the compiler itself, but another tool invoked by the compiler.  That tool was not aware of the location of the attribute in source, so it could not give as useful error messages.  For the same reason it could not do a source-relative search for the file when given a partial path or filename.  Instead it searched in 2 places: output-file relative and current-directory relative.

The output-file-relative search was confusing (do you remember when everybody put "..\\..\\mykey.snk"?) not only because it was 2 directories down, but because for C# it was really the intermediate file location (obj\Debug instead of bin\Debug as most would expect).

The current directory was problematic because it kept changing.  For a command-line compiler like csc.exe it was logical and made sense.  But for a multi-threaded GUI app like devenv.exe it was a lost concept.  People would complain that sometimes their project would compile and sometimes they would get strange errors, and it all really hinged on which file was opened last (and thus set the current directory).  The project system at one point even tried changing the current directory before each build, but that didn't quite work because the current directory is per-process, not per-thread, so other threads running during the build would sporadically 'break' the build by changing the current directory!  Many people resorted to specifying a fully-qualified path name.  Too bad this totally breaks source code control systems and exacerbates the security issue.

It got even worse if you tried to build localized satellite assemblies that were also signed.  Since the project system couldn't easily read the key information (yes, this could be viewed as a product bug or an implementation detail), it relied on the same tool to read out the key information and then apply it to the satellite assembly.  It's the same pathing problems all over again, only this time in spades.
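
The point about the current directory being shared by every thread is easy to reproduce.  Here is a minimal sketch of it; the class and method names are made up for illustration, and it says nothing about how devenv.exe or the project system were actually written.  One thread changes the current directory, and the thread that started it sees the change:

```csharp
using System;
using System.IO;
using System.Threading;

static class CurrentDirectoryDemo
{
    // Hypothetical helper: returns true when a directory change made on a
    // worker thread is observed by the calling thread.
    public static bool ChangeLeaksAcrossThreads()
    {
        string original = Environment.CurrentDirectory;

        // A fresh scratch directory, so it is guaranteed to differ from 'original'.
        string scratch = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(scratch);
        try
        {
            // The worker stands in for "some other thread running during the build".
            Thread worker = new Thread(() => Environment.CurrentDirectory = scratch);
            worker.Start();
            worker.Join();

            // The current directory belongs to the process, so this thread no
            // longer sees the directory it started with.
            return Environment.CurrentDirectory != original;
        }
        finally
        {
            Environment.CurrentDirectory = original; // put things back first
            Directory.Delete(scratch);               // then clean up
        }
    }
}
```

Any tool that resolves a relative key file path against the current directory while something like this is going on in the same process gets a different answer depending on timing, which matches the "sometimes it compiles, sometimes it doesn't" complaints.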

The 'solution' was to make the project system manage the paths (something that it’s designed to do) and then pass the key file or key name in such a way that it is no longer embedded in the final PE.  This solves the security problem and the usability problem.  I’m personally miffed by our poor excuse for a project system.  I’m hoping that it will get significantly better when it’s no longer a v1 product.  They do have some ideas that potentially could help with project sharing, we’ll just have to see how it pans out.

--Grant

Posted by grantri | 2 Comments

Remodel Marches On

Well, my house is half torn apart and the new parts are about half-built.  There's not much to talk about here at work, as we're busy finishing off the last few bugs and getting everything up to snuff.  The question in my mind is which will finish first, my house or Whidbey...

--Grant

Posted by grantri | (Comments Off)

Home Remodeling and Software Maintenance

I just finished signing the final paperwork to begin remodeling our home.  Along with the paperwork came a fairly large budget.  What strikes me as interesting is that it actually costs more to add on to a house than it does to build things from scratch.  After thinking about this for a while, it kind of makes sense.  Being a programmer, I related it back to writing code: it's sometimes faster (cheaper) to write new code than it is to debug somebody else's code.  The poor contractor can make rough estimates about what is actually under the foundation, or in the walls.  Until he actually opens things up, it's just a guessing game.

So back to software a little bit.  If we always wrote code with the assumption that 10 years from now some poor person is going to have to read and debug the code to fix it, what would we do differently?  I'm a big fan of readability, but too often I think I fall into the trap of writing comments to remind me about why I did what I did, rather than writing to an unfamiliar audience.

Well, enough rambling.  Hopefully I'll come up with some better stuff (and more) to talk about next time.

--Grant

 

Posted by grantri | 0 Comments

Not-so-new C# Compiler Features

So a while back somebody asked what new compiler features were coming out for Whidbey that weren't part of new language features.  Well, you've already heard about Edit and Continue.  There's also the really cool Refactoring built into the IDE and built off of the compiler's source-code analysis.  As an attempt to improve the debugging experience for those of you with 500+ default.aspx files in your solution, they've added some hashes to the PDB files and improved file lookups in the debugger.  Partly to improve E&C but also to improve the overall debugging experience, the compiler has done some 'de-optimization' to the code (basically adding NOPs and storing more temporaries to locals).  Don't worry, none of this should impact the performance of your release code as long as you remember to use /optimize+.

This does bring me to another question that somebody asked: what optimizations does the C# compiler do?  The answer is very few.  Most of them would fall into the category of flow-graph optimizations: remove branches to next, short circuit branches to branches, remove dead code, invert conditionals to eliminate branches around unconditional branches, etc.  It also does very basic constant folding, although you can easily thwart this by reordering the math (e.g. "4 + 5 + a" folds to "9 + a", but "4 + a + 5" stays the same, because it parses as "(4 + a) + 5" and neither addition has two constant operands).
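
To make the folding example concrete, here is a small sketch.  The class and method names are invented for illustration, and the comments describe what the paragraph above says the compiler does; I have not checked the IL for any particular compiler version, so look at the output with ildasm before relying on it.

```csharp
static class FoldingDemo
{
    // Parsed as (4 + 5) + a: the left operand is built from constants only,
    // so the compiler can emit it as 9 + a.
    public static int Foldable(int a) { return 4 + 5 + a; }

    // Parsed as (4 + a) + 5: no sub-expression is made of constants only,
    // so (per the post) both additions stay in the IL.
    public static int NotFolded(int a) { return 4 + a + 5; }

    // Grouping the constants by hand gives the compiler something to fold again.
    public static int FoldableAgain(int a) { return a + (4 + 5); }
}
```

All three methods return the same value for any given input (ignoring which intermediate step overflows first); only the amount of arithmetic left for run time differs.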

You might ask why the compiler doesn't do more.  Well, I can see 2 reasons.  First, if the compiler optimizes too much it will make the JIT work harder to perform its own optimizations.  Second, any optimization that applies to the C# compiler should be written in a way that all compilers targeting MSIL could share.  Because of that second reason, most people prefer to put the optimization into the JIT.  My personal opinion is that this was a bad choice.  Why reoptimize stuff every time the code is run?  NGEN helps, but wouldn't it be nicer to just have a simple MSIL optimizer that performed classic dragon-book style machine-independent optimizations?

Mike Montwill wrote a nice piece about the exact optimizations with real examples here.

Well enough for this post.  I'm still looking for things to write about.

--Grant

[Edit: Added link to Mike's article]

Posted by grantri | 6 Comments

More on 64-vs-32

OK, this isn't a really meaty post, it's more of a collection of a few ideas that have been rattling around in my head for a while.  I kept hoping they'd develop into something bigger, or I'd have time to research/investigate them more, but nothing's happened, so I'll just dump them as is.

So when comparing AMD64 chips to normal x86 chips, there are two broad categories of differences:

  1. Registers and Pointers are now twice as big.
  2. Killer architecture that helps even x86 code run faster.

Most gamers out there are already aware of #2, so instead I'm going to focus on #1.  Basically I think of it this way, "I've already got this screaming system and I need to decide if it's worth it to compile my code for 64-bit or leave it as 32-bit."  There is pain involved in moving to 64-bit unless you happen to be the perfect developer that carefully uses size_t and HANDLE and int religiously.

The biggest downside I've run into lately (and this really isn't new; researchers ran into it almost 20 years ago on the first 64-bit RISC chips) also happens to be the biggest benefit of 64-bit: pointers are now twice as big!  If you have a classic tree structure that contains relatively trivial data, it has suddenly almost doubled in size.  Now if you really have a tree that holds enough data that it exceeds 2GB, then this probably is OK because you need that much address space.  Most apps don't fall into that category (and for the rest of this post I'm going to assume that the normal 32-bit 2GB of addressable memory is sufficient).  Thus I think my first criterion is simple: are your data structures pointer-laden?  If they are, you are going to take a hit in performance from increased memory usage.

The 'fix' is to change your data structures to use indexes or based pointers such that only a few real 64-bit pointers exist and most of your data structures use smaller (32 or 16 bit) offsets.  This is somewhat contrary to classic x86 style: why store a 32-bit integer array index when you could store a direct pointer instead, and they both take the same space?  Now that they aren't the same size, you need to carefully think about your pointers.  Do you really need a 64-bit pointer or can you get away with a 32-bit index somehow?
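
Here is a minimal sketch of the index idea, written in C# to match the rest of this blog even though the same trick applies to native code.  IndexedTree is a made-up type for illustration: a binary search tree whose nodes live in one array and point at each other with 32-bit indexes.  Each node is three 32-bit fields on any platform, and the only real pointer is the single reference to the array.

```csharp
using System;

// Hypothetical example type: nodes refer to each other by index, not by reference.
sealed class IndexedTree
{
    private struct Node
    {
        public int Value;
        public int Left;   // index into 'nodes', or -1 for "no child"
        public int Right;  // index into 'nodes', or -1 for "no child"
    }

    private Node[] nodes = new Node[16];
    private int count;     // number of nodes in use; node 0 is the root

    // Returns false if the value was already in the tree.
    public bool Add(int value)
    {
        if (count == 0) { Append(value); return true; }
        int current = 0;
        while (true)
        {
            if (value == nodes[current].Value) return false;
            bool goLeft = value < nodes[current].Value;
            int next = goLeft ? nodes[current].Left : nodes[current].Right;
            if (next < 0)
            {
                // Append may replace the array, so index into 'nodes' again afterwards.
                int added = Append(value);
                if (goLeft) nodes[current].Left = added; else nodes[current].Right = added;
                return true;
            }
            current = next;
        }
    }

    public bool Contains(int value)
    {
        int current = count == 0 ? -1 : 0;
        while (current >= 0)
        {
            if (value == nodes[current].Value) return true;
            current = value < nodes[current].Value ? nodes[current].Left : nodes[current].Right;
        }
        return false;
    }

    // Stores a new leaf at the end of the array and returns its index.
    private int Append(int value)
    {
        if (count == nodes.Length) Array.Resize(ref nodes, nodes.Length * 2);
        nodes[count].Value = value;
        nodes[count].Left = -1;
        nodes[count].Right = -1;
        return count++;
    }
}
```

The trade-off is that nodes can't be freed individually and every access goes through the array, so as with everything else in this post: measure before and after.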

The other advantage is that the registers are bigger, and you get more of them.  More registers means more things can get enregistered, but only if you don't do things like take their address, pass them by reference, or other such things.  Bigger registers mean that in the few places where you actually use 64-bit integers, the code is now more efficient!

So what's the final answer?  Well, my heuristic would be this: do you deal with BIG stuff?  If yes, then try compiling 64-bit; otherwise stick with 32-bit.  If performance really matters and you're willing to spend several months re-architecting to use fewer pointers, then try 64-bit, but make sure to measure everything.

--Grant

Posted by grantri | 5 Comments

Some of my opinions on Generics

Disclaimer: These are all my opinions, so don't take them to mean anything more than the futile thoughts of an insignificant bystander who happened to be fortunate enough to listen to a few of the C# language design meetings and occasionally interact with some of the designers.

John and a few others have compared the CLR's generics (hereafter referred to as simply 'Generics') to C++'s templates ('Templates'). In some ways I think this is beneficial, but in other ways I think it is like comparing apples and oranges. This is my attempt to compare and contrast them and explain why I think certain comparisons are invalid.

  • They both allow polymorphism in a second dimension. In addition to inheritance, you now also have instantiation.
  • They both facilitate code reuse by allowing the programmer to write type-agnostic algorithms, while in most cases preserving the type-safety and performance of type-specific algorithms.
  • Templates are a compile-time mechanism, while Generics are instantiated at runtime. Along this vein, I like to think of Templates as compile-time macros on steroids. All the C++ compilers I'm aware of implement Templates entirely in the front-end, very similarly to how the preprocessor implements macros. Because of this, Templates can be used and abused in many ways that Generics can't.
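
One way to see the "instantiated at runtime" part of that last bullet: the uninstantiated generic type is a real thing in metadata, and reflection can ask the runtime to instantiate it for a type argument that wasn't known when the code was compiled. A small sketch (the class and method names are invented for illustration):

```csharp
using System;
using System.Collections.Generic;

static class RuntimeInstantiationDemo
{
    // Builds a List<T> for an element type that is only known at run time.
    public static object MakeListOf(Type elementType)
    {
        Type open = typeof(List<>);                       // the uninstantiated generic
        Type closed = open.MakeGenericType(elementType);  // instantiated by the runtime
        return Activator.CreateInstance(closed);
    }
}
```

There is nothing comparable for a C++ template: once the compiler is done, only the instantiations that were actually used exist in the binary.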

I think most everybody will agree with me at this point. So the next major question is why did the CLR people decide to go with the 'limited' functionality of Generics instead of Templates? Well for starters, the CLR as a platform does not preclude C++ or any other language from incorporating Templates, since that all happens at compile time and requires no help from the CLR. Now granted there is no metadata to express templates that would make them cross-language, but as far as I know, there have never been any cross-language Templates.

John stated, "Generics seem to really answer the freshman CS collections problems while missing some of the more expressive power of C++ templates." I agree mostly with his statement. I disagree with the implied statement that Generics don't solve/answer the 'bigger real world problems that real programmers face'. First off we must remember from our CS theory classes that all (Turing-complete) computer languages are functionally equivalent. That is to say, if you can do it in one of them, you can do it in all the others; it might be significantly harder or require more work on the part of the programmer, but it is not impossible. So what are the real situations that real programmers face that Templates make easier but Generics don't (or at least not as easy)? I'm sure there are a few. The biggest one that springs to my mind is constants. I've written a few templates that take as a parameter the initial size or the size to grow by. With Generics you're forced to use a readonly field to store such information. Not quite as elegant, but close enough for me until a performance test proves otherwise (in which case I'll probably look at changing algorithms first).
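
For what it's worth, here is roughly what the readonly-field version looks like. GrowableList is a made-up class for illustration, not anything from the Framework. In C++ the growth increment could be a non-type template parameter baked into each instantiation; here it is an ordinary constructor argument stored in a readonly field.

```csharp
using System;

// Hypothetical collection, for illustration only.
sealed class GrowableList<T>
{
    private readonly int growBy;  // what a template would take as a constant parameter
    private T[] items;
    private int count;

    public GrowableList(int initialSize, int growBy)
    {
        if (initialSize < 0) throw new ArgumentOutOfRangeException("initialSize");
        if (growBy < 1) throw new ArgumentOutOfRangeException("growBy");
        this.growBy = growBy;
        this.items = new T[initialSize];
    }

    public int Count { get { return count; } }
    public int Capacity { get { return items.Length; } }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= count) throw new ArgumentOutOfRangeException("index");
            return items[index];
        }
    }

    public void Add(T item)
    {
        // Grow by the fixed increment chosen at construction time.
        if (count == items.Length) Array.Resize(ref items, items.Length + growBy);
        items[count++] = item;
    }
}
```

The cost compared to the template version is one field per instance and a memory read where the template would have had an immediate constant, which is the sort of thing I'd only worry about after a performance test said so.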

I personally would contend that being able to write type-agnostic collections and algorithms, that are still type-specific to the consumer really is the 90% case. Hence Generics that are really just trying to solve the freshman CS problem are solving most of the real problems. At least for now, I'm sure in a few years that will all change.

BTW John, you and I took many of the same freshman classes, and I don't remember any stuff about collections, type safety, templates or that sort of thing until at least my sophomore or junior year. Are there any colleges out there that teach that sort of stuff to first year CS students?

--Grant

Posted by grantri | 5 Comments

More Info on Base Addresses

Another Microsoftie, Josh Williams, follows my blog and pointed out another case where a base address matters: NGEN.  When you NGEN your assemblies, the new images are loaded at the same base address as the original binaries.  As Josh pointed out to me, NGEN images do have significantly more relocations (similar to a real native image), and on 64-bit platforms may end up using an extra jump stub when they get rebased far away from the call targets (i.e. when a 32-bit offset no longer works).  The relocations will affect load time, but the jump stubs will affect runtime performance.

--Grant

Posted by grantri | 2 Comments

More Q&A - Why No Generic Attributes?

Question from Wes Haggard:

A while back I posted http://weblogs.asp.net/whaggard/archive/2004/10/12/241476.aspx about having a generic type inherit from Attribute. Do you know why this is prohibited in C#?

My attempted response:

I used to think it was a CLI restriction, but now I've scoured the docs and can't find anything like that.  So, you'll have to settle for a big fat "I don't know".

Posted by grantri | 1 Comments

More Q&A - C# Project settings

Question from Eric Wilson:

Could you do a post on what the settings on the Advanced Tab in Visual Studio.Net for C# projects are for and when you should use them?

My lame response:

I've only occasionally had to deal with the project-system guys that more or less own that 'Tab', so I can't even remember what's on it.  Going off of the online MSDN: Visual C# Reference Advanced, Configuration Properties, <Projectname> Property Pages Dialog Box, I see 4 settings:

  • Incremental Build - don't use it unless you absolutely need to.  It's getting removed in future versions.  It was a buggy attempt at trying to only recompile changed code and thus speed up compilation.
  • Base Address - For Class Libraries (DLLs) you can get a small speedup in load times if the OS doesn't have to relocate your DLL from its preferred base address.  This is a big thing for native DLLs, which typically have hundreds of relocations.  For managed DLLs there's at most one: the import of mscoree.dll!_CorDllMain.  As such this isn't really a huge win, but possibly still measurable.  It will also impact working-set across multiple processes.  Of course most big software houses don't use this setting; instead they run a tool like rebase.exe as a post-build step that lays out all of the DLLs so they all get unique addresses. [More info in a new post]
  • File Alignment - For certain binaries, this enables you to save a little extra file space.  The 2 most common values are 4096 and 512.  I think the default is 4096.  By setting it to 512 you can potentially reduce the amount of file padding, but this can hurt performance in other scenarios because now each section is not aligned the same as a memory page.  As with all performance issues, the best advice is to try it out and measure.  I expect this would have a bigger impact on the Compact Frameworks than on desktop applications.
  • Do not Use Mscorlib - The C# compiler automatically includes a reference to the mscorlib.dll that is installed with the current runtime.  If you want to use a different mscorlib.dll, you need to use this setting to disable the automatic reference, otherwise the 2 will conflict.  The most common use would be the Compact Frameworks.

--Grant

[12/29/04 - added link to new post on base addresses]

Posted by grantri | 2 Comments

Why can't I do XYZ in C#?

First off I'm not a language lawyer, or an expert.  I am only sharing some of the impressions I've gotten from working with the real language designers.

Eric Wilson asked why C# doesn't allow you to call static methods using instance pointers.  My answer would be two-fold:

  1. C# is very explicit.  There's generally one and only one way to do a lot of things that in C/C++ had many ways.  Likewise whenever there was a C/C++ language construct that was often mis-used, misunderstood, or just confusing, the language guys tried very hard to make C# impossible to mis-use, misunderstand, or get confused.  This is such an example, see reason #2.
  2. This is confusing to someone who reads the code.  If they don't have the definition of the method nearby, they won't know it's static and thus doesn't need a valid instance pointer (which Eric's sample code lacks).  Likewise some readers might try and assume that some sort of polymorphic virtual call might happen if the instance pointer is actually some sub class.
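
To show what those two points look like in code, here is a small sketch.  Widget and StaticCallDemo are made-up names (this is not Eric's sample).  The commented-out line is the kind of call C# rejects; CS0176 is the error for accessing a static member through an instance reference.

```csharp
class Widget
{
    private static int created;

    public Widget() { created++; }

    // Static: belongs to the type and needs no instance at all.
    public static int CreatedCount() { return created; }
}

static class StaticCallDemo
{
    public static int CountAfterCreatingOne()
    {
        Widget w = new Widget();

        // int n = w.CreatedCount();   // rejected by C#: error CS0176.
        // A reader of that line could not tell that 'w' is irrelevant to the call,
        // or might expect a virtual dispatch on whatever subclass 'w' really is.

        return Widget.CreatedCount();  // the one spelling C# allows
    }
}
```

With the type name at the call site, there is no question about whether an instance is needed or whether the call is polymorphic.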

Orangy asked why he can't derive from Delegates.  The short answer here is that it's a runtime restriction.  The runtime deals very intimately with delegates, and as such has some heavy requirements on them.  However, you can accomplish the same semantics with a little extra coding.  Basically create an intermediate class that holds a weak reference (this is not new in v2, just new to me, thanks Dmitriy) to the real delegate, then pass the intermediate delegate to the real event:

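// warning: still only a sketch, not production code
using System;

sealed class WeakDelegate {
   // Static factory: returns a handler that holds realDelegate only weakly.
   // Something else must keep realDelegate alive, or the GC may collect it right away.
   public static EventHandler MakeWeakDelegate(EventHandler realDelegate) {
      return new EventHandler(new WeakDelegate(realDelegate).WeakInvoke);
   }
   // readonly: assigned once in the constructor, so it is safe to read from any thread
   private readonly WeakReference realDelegate;
   private WeakDelegate(EventHandler realDelegate) {
      this.realDelegate = new WeakReference(realDelegate);
   }
   private void WeakInvoke(object sender, EventArgs args) {
      // Target is null once the real delegate has been collected
      EventHandler eh = this.realDelegate.Target as EventHandler;
      if (eh != null) {
         eh(sender, args);   // invoke the delegate itself
      }
   }
}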

 

That's all for today.

--Grant

[Corrected the code and some comments in response to Dmitriy's criticism]

Posted by grantri | 7 Comments

I've run out of ideas again

In case you haven't noticed, I've run out of ideas to write about.  I'm sure there are still a few things I know that I haven't explained, but I can't remember them...

If you've been dying to know something about the C# compiler, ALink, CLR file formats, 64-bit JITs, please ask, and I'll see what I can do.

--Grant

Posted by grantri | 9 Comments

64-bit beta is official!

Finally I have something to write about!  There's a new section on MSDN that is definitely worth reading:

64-Bit .NET Framework (http://msdn.microsoft.com/netframework/programming/64bit/)

Now you only need a new 64-bit machine and OS to try it out on.

--Grant

Posted by grantri | (Comments Off)

Good post on running/writing/compiling managed binaries as 32-bit or 64-bit

How the OS Loader will force .Net v1.0/1.1 executables to run under WOW64 on a 64-Bit Machine (http://blogs.msdn.com/joshwil/archive/2004/10/15/243019.aspx)

In general, I think all of Josh's recent posts are worth a read.

--Grant
Posted by grantri | 1 Comments

Source code control (RCS, VSS, etc.)

So as part of professional development, I assume everybody uses some form of source code control and revision tracking.  This allows multiple developers to work together, and also a way of tracking changes.  Sometimes they're also used as a way of 'branching' off new features and then integrating them back in once the feature is stable enough.

My question is more for the hobbyist who works alone.  Generally there's no need for many of these features.  So do most hobbyists use some sort of source code system for their own projects?  If yes, why?

For myself I do it just as a way to track changes.  I never need to branch anything.  There's nobody else that I share stuff with.  I suppose I also use it in a limited way to track bugs.

So, do you use a source code control system?  Which one? Why?

Thanks for taking the time to answer.
--Grant

Posted by grantri | 34 Comments

Some Clarifications

In my previous post, The problem with being second, it seems like there was a lot of confusion.  I'm going to attempt to clarify some of that.  The heart of that post was meant to point out that people/developers don't always know how close or far they are from a spec, until somebody else tries to implement the spec and they compare the two implementations.

#1 Although I was referring to the ECMA spec and all that it allowed as far as runtime/JIT optimizations, that's not exactly the same as what Microsoft will ship.  As many of you pointed out, it would break way too much existing code.  I would personally argue that it was already broken and you just didn't know it yet, but that's an argument not worth having.  Microsoft really does value backwards compatibility and thus we won't ship until we've figured out a way to not break customers (think about how many Windows APIs maintain backwards compatibility so that old programs that relied on bugs still work; this will be just another example).

#2 A few people may have interpreted my statements about testers as somehow negative.  It was not meant that way.  This is my public apology to all testers.  I was using testers as an example because they're our 'first' customers.  They see our code and its results before anybody else.  In general I think testers on compilers, JITs, and runtimes are very smart and technically adept to be able to understand, test and break this stuff.  I can't speak for all teams, but on the ones I've had experience with, the testers are a vital part of the development cycle, and I enjoy a good 2-way relationship with them, where they are my peers.

--Grant

[corrected some mis-spellings]

Posted by grantri | 4 Comments