VSTO Security Model


I somehow came across a blog post in which a customer wondered how VSTO tightens up security. It's an interesting story.

As many of you will know, Office already has a security model for VBA and COM Add-Ins that is based on two types of evidence (digital signatures and "installed" code) and a fairly simple run / prompt / don't run policy system. You can read more about these security levels in one of Siew Moi's articles.

The CLR also has a very rich, extensible security model based around evidence, permissions and policy. You can read more about it on MSDN, or perhaps look at Don Box's article (which I haven't actually read... it just turned up in a search result and I trust him to write good stuff ;-) ).

Anyway, when it came to defining the model for VSTO we knew we didn't want to invent a third way of doing it, so we had to choose whether to use the Office model or the CLR model. Now in hindsight it may seem obvious to some readers that the .NET model was the only way to go, but we were very concerned about confusing existing Office developers with the new model and having them just turn off security altogether to "make it work," so we looked quite hard at integrating with the Office model. Whichever way we went, the model had to be implemented on top of the CLR policy system, and whilst this was perfectly doable (did I mention the system is very flexible?) mimicking the Office model would have been abusing the system somewhat...

So anyway, what we thought about was basically the following (you could even use this as a blueprint for your own model, if you really need to proxy trust decisions from a legacy system to .NET...):

1) Borrow from the digital-signature evidence that Office uses

2) Honour only the "High" mode (must be signed to run; never prompt for elevated privileges)

3) Use the Office trusted publishers list

4) Build our own TrustedOfficeMacro membership condition and evidence objects

Basically, how it would work was this: we'd get a request to load an assembly and check whether it had an Authenticode signature generated by a certificate in the Office Trusted Publishers list. If so, we'd load the macro with the TrustedOfficeMacro evidence (essentially "vouching" for the goodness of it); otherwise, we wouldn't load it. We'd also rely on you having a rule in your .NET policy that granted FullTrust to anything loaded with TrustedOfficeMacro evidence. Another thing we could have done (and probably would have done eventually, although it wasn't in the original spec) was to treat "installed" assemblies the same way: if the assembly was in the Templates directory and the user had the "Trust installed templates and add-ins" option checked, we'd load it with TrustedOfficeMacro evidence too.
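
For the curious, here's roughly what such a thing looks like in code. This is my own sketch against the .NET 1.x security APIs, not the actual VSTO sources; only the TrustedOfficeMacro name comes from the list above, everything else is illustrative:

  using System;
  using System.Reflection;
  using System.Security;
  using System.Security.Policy;

  // Evidence in the CLR is just a serializable object; what matters is
  // that a membership condition somewhere knows to look for it.
  [Serializable]
  public sealed class TrustedOfficeMacro { }

  // A membership condition that matches any assembly presented with
  // TrustedOfficeMacro evidence. (The XML round-tripping members are
  // stubbed for brevity; a real one must implement them so the code
  // group can be persisted in policy.)
  public sealed class TrustedOfficeMacroMembershipCondition : IMembershipCondition
  {
      public bool Check(Evidence evidence)
      {
          // Evidence is enumerable; scan it for our marker object.
          foreach (object item in evidence)
              if (item is TrustedOfficeMacro)
                  return true;
          return false;
      }

      public IMembershipCondition Copy()
      {
          return new TrustedOfficeMacroMembershipCondition();
      }

      public override bool Equals(object other)
      {
          return other is TrustedOfficeMacroMembershipCondition;
      }

      public override int GetHashCode()
      {
          return typeof(TrustedOfficeMacroMembershipCondition).GetHashCode();
      }

      public override string ToString()
      {
          return "Assembly was vouched for as a trusted Office macro";
      }

      public SecurityElement ToXml() { return ToXml(null); }
      public SecurityElement ToXml(PolicyLevel level)
      {
          return new SecurityElement("IMembershipCondition");
      }
      public void FromXml(SecurityElement e) { }
      public void FromXml(SecurityElement e, PolicyLevel level) { }
  }

The host, having verified the Authenticode signature itself, would then "vouch" for the assembly by loading it with the extra evidence (Assembly.Load has an overload that takes an Evidence parameter; the assembly name here is made up):

  Evidence evidence = new Evidence();
  evidence.AddHost(new TrustedOfficeMacro());
  Assembly assembly = Assembly.Load("SomeMacroAssembly", evidence);

Note that supplying evidence like this requires the caller to hold ControlEvidence permission, which is exactly the precondition discussed in the aside below.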

[Aside: When presenting this to people, many of them initially said "hey, that's a security hole -- anyone can claim to be loading a TrustedOfficeMacro and get arbitrary code to run!" But of course in order to present evidence to the CLR you have to be fully trusted yourself (or so close as makes no difference), so if you could load arbitrary code with bogus TrustedOfficeMacro evidence, not only could you also load it with bogus Microsoft Strongname evidence, you could also just go ahead and do whatever bad stuff you wanted all by yourself. The point being that whenever you're thinking about security problems, think about the preconditions required to execute the attack, and move on if it's already "game over" (defence-in-depth techniques notwithstanding... :-) )]

This kind of seems cool because if you have a lot invested in VBA or COM add-in solutions, and you already have a code-signing certificate that's trusted on all your machines, you could just move to .NET code and still have it run on all your users' machines without any configuration changes. Unfortunately it had some limitations, not least of which was that it didn't mesh at all with the new .NET stuff, so if you wanted to write both WinForms and Office code you'd have two security stories to deal with instead of just one, even if you never wrote a line of VBA in your life. Also, handing out evidence at assembly load time is something you want to avoid if possible, since in general you will not be in control of subsequent assembly loads in the same domain (e.g. if the assembly links to a second assembly, that assembly will be loaded automatically by the CLR as needed, and you don't get any say as to what evidence it should have).

So we decided 100% .NET was the way to go, but as I thought I'd mentioned before (though I can't find a link), we felt that the default .NET logic of "if it's on the local machine, you obviously installed it, and if you installed it you obviously trust it, so it should get FullTrust" didn't really work for Office documents, so we had to do something about it.

[This blog is already longer than I thought it would be, and I haven't even started to answer the question yet! Such is the way of the blogger]

The problem was how to get Office documents to ignore the default Machine-level rule that granted FullTrust to MyComputerZone, while leaving policy as-is for all other applications. Some ideas that were floated (and immediately shot down :-) ) were to simply modify Machine policy when you installed Office to lock down your machine, or to somehow cleverly modify policy just before you launched Office and then put it back again as soon as it had been cached in the app. But of course the first one would have broken every locally-installed app from here to next Tuesday, and the latter was a complete hack that was just not going to happen. Oh, and you need to be an admin to mess with non-User policy, and normal users had to be able to run (and even develop!) VSTO documents.

It was clear we couldn't mess with persistent policy, but luckily the CLR folks were very clever and had the foresight to include AppDomain-level policy, which lets AppDomain hosts (like Office) tinker with policy to their hearts' content. So there were some initial (again, very short-lived) plans to simply slap a static "MyComputerZone : Nothing" rule into AppDomain policy and be done with it. Great! No code would run from MyComputer. Unfortunately, this was an absolute statement -- no code would ever run from MyComputer -- EVER! Because .NET policy is intersected between levels (Enterprise, Machine, User, AppDomain), even if someone had added a rule at Machine or Enterprise level to say "Code signed with ACME certificate : FullTrust", we'd blow that away with our more restrictive "Nothing" rule at the AppDomain level.
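
To make that concrete, the short-lived plan would have amounted to something like this (again my own sketch, not shipped code; a FirstMatchCodeGroup is used so the first matching child wins):

  using System;
  using System.Security;
  using System.Security.Permissions;
  using System.Security.Policy;

  class NaiveDomainPolicy
  {
      // The rejected approach: a static AppDomain-level rule that grants
      // Nothing to MyComputer-zone code and FullTrust to everything else.
      // Because grants are intersected across levels, the Nothing wins
      // for local code no matter what the other levels say.
      static void Apply()
      {
          PolicyStatement nothing =
              new PolicyStatement(new PermissionSet(PermissionState.None));
          PolicyStatement fullTrust =
              new PolicyStatement(new PermissionSet(PermissionState.Unrestricted));

          // First matching child wins: MyComputer code hits the Nothing
          // group; all other code falls through to AllCode : FullTrust.
          FirstMatchCodeGroup root =
              new FirstMatchCodeGroup(new AllMembershipCondition(), nothing);
          root.AddChild(new UnionCodeGroup(
              new ZoneMembershipCondition(SecurityZone.MyComputer), nothing));
          root.AddChild(new UnionCodeGroup(
              new AllMembershipCondition(), fullTrust));

          PolicyLevel domainPolicy = PolicyLevel.CreateAppDomainLevel();
          domainPolicy.RootCodeGroup = root;

          // A host calls this once, before loading any user code.
          AppDomain.CurrentDomain.SetAppDomainPolicy(domainPolicy);
      }
  }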

Back to the drawing board.

We had to honour any and all changes that users might have made to their policy, at any level, with arbitrary degrees of complexity. And not only did we have to preserve the rules themselves (ACME gets FullTrust), we had to preserve the hierarchy of rules as well (trusting ACME in the Intranet Zone is very different to trusting them in all Zones!). The only things we would throw away would be "implicit" or "generic" rules based on Zones (MyComputer, Internet, etc) and AllCode (which matches everything in existence). And herein lies the beauty / simplicity / wackiness of it all.

We copy all the code groups from all the other policy levels into the AppDomain, and kill off the ones we don't like.

So let's say you have a "typical" machine at ACME Corp, with the following policy (simplified to mention only the MyComputer and LocalIntranet Zones):

Enterprise
  AllCode : FullTrust

Machine
  AllCode : Nothing
    MyComputerZone : FullTrust
      ECMA Strongname : FullTrust
      Microsoft Strongname : FullTrust
      ACME Corp Publisher : FullTrust
    LocalIntranetZone : LocalIntranet
      http://coolserver/ : LocalIntranet
        ACME Corp Publisher : FullTrust

User
  AllCode : FullTrust

This is basically the out-of-the-box policy, with some additional rules to allow code signed by ACME Corp to run off the http://coolserver/ machine and on the local machine. We then create an AppDomain policy that is the concatenation of those three levels, so it looks like this:

AppDomain
  AllCode : FullTrust
  AllCode : Nothing
    MyComputerZone : FullTrust
      ECMA Strongname : FullTrust
      Microsoft Strongname : FullTrust
      ACME Corp Publisher : FullTrust
    LocalIntranetZone : LocalIntranet
      http://coolserver/ : LocalIntranet
        ACME Corp Publisher : FullTrust
  AllCode : FullTrust

It looks funny because AllCode appears three times and can't make up its mind about whether to grant FullTrust or Nothing, but we don't worry about that. All the granted permissions at this level will eventually be union-ed together, so we can duplicate as many things as we want as randomly as we want.
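
(For reference, the resolution rule is: final grant = Enterprise ∩ Machine ∩ User ∩ AppDomain, where each level's contribution is the union of the policy statements of all the code groups that the assembly's evidence matches at that level -- which is why duplicates within a level are harmless.)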

Finally we "clean" out the AllCode and Zone groups so it looks like this (the changed grants are the ones that now read "Nothing"; we'll have colour one day, I promise!):

AppDomain
  AllCode : Nothing
  AllCode : Nothing
    MyComputerZone : Nothing
      ECMA Strongname : FullTrust
      Microsoft Strongname : FullTrust
      ACME Corp Publisher : FullTrust
    LocalIntranetZone : Nothing
      http://coolserver/ : LocalIntranet
        ACME Corp Publisher : FullTrust
  AllCode : Nothing

Now we have preserved 100% the semantics of the user's "explicit" policy changes (trusting URLs, publishers, sites, etc.) but we've thrown away the "implicit" default policy of giving FullTrust to the local machine. Because of the way policy resolution works, the AppDomain level can never grant more permissions to code than would otherwise have been granted (it can only take them away), although at first blush it sometimes seems that the AppDomain policy can "leak" permissions, especially when you have a LevelFinal codegroup. But it all comes out in the wash.
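
If you want to experiment with the technique yourself, here's a minimal sketch of the copy-and-clean step. This is my own reconstruction against the .NET 1.x policy APIs (the class and method names are mine, and the real VSTO implementation deals with more edge cases), but it shows the shape of the algorithm: graft copies of every persistent level's code groups under a new AppDomain level, then rewrite any AllCode or Zone group to grant Nothing while leaving its children intact:

  using System;
  using System.Collections;
  using System.Security;
  using System.Security.Permissions;
  using System.Security.Policy;

  class RestrictedPolicyBuilder
  {
      // Build an AppDomain-level policy that preserves "explicit" rules
      // (publisher, strong name, site, URL) from the persistent levels
      // but neuters the "implicit" AllCode and Zone grants.
      public static PolicyLevel Build()
      {
          PolicyStatement nothing =
              new PolicyStatement(new PermissionSet(PermissionState.None));

          PolicyLevel domain = PolicyLevel.CreateAppDomainLevel();
          CodeGroup root =
              new UnionCodeGroup(new AllMembershipCondition(), nothing);

          // PolicyHierarchy() enumerates Enterprise, Machine and User.
          IEnumerator levels = SecurityManager.PolicyHierarchy();
          while (levels.MoveNext())
          {
              PolicyLevel level = (PolicyLevel)levels.Current;
              root.AddChild(Clean(level.RootCodeGroup.Copy()));
          }

          domain.RootCodeGroup = root;
          return domain;
      }

      static CodeGroup Clean(CodeGroup group)
      {
          // "Implicit" groups lose their grant; everything else keeps it.
          if (group.MembershipCondition is AllMembershipCondition ||
              group.MembershipCondition is ZoneMembershipCondition)
          {
              group.PolicyStatement =
                  new PolicyStatement(new PermissionSet(PermissionState.None));
          }

          // Children returns a copy of the list, so rebuild it from
          // cleaned copies of the children.
          IList children = group.Children;
          group.Children = new ArrayList();
          foreach (CodeGroup child in children)
              group.AddChild(Clean(child));

          return group;
      }
  }

A host would then pass the result to AppDomain.SetAppDomainPolicy before loading any document code. Running the ACME example through this: an ACME-signed assembly on http://coolserver/ gets FullTrust from Enterprise, Machine and User, and the publisher rule survives the cleaning, so the AppDomain level also yields FullTrust; the intersection is FullTrust and the code runs. A random unsigned assembly on the local machine gets FullTrust from the three persistent levels but only Nothing from the AppDomain level, so the intersection is Nothing and it doesn't run. (Note too that the strong name groups under MyComputerZone survive the cleaning, so the Framework's own assemblies keep their FullTrust.)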

To this day people still say to me "I don't know how this system works, and it looks really crazy, but it seems to do the job." And it does (at least I hope it does!). I spent most of the last 18 months or so worrying about this (along with many other things :-) ) and it's been through a bunch of testing, modelling, and "desk checks" manually walking through the algorithm. Hopefully no-one finds any holes in it!

A final note on the development experience. The strict policy we made for Office documents was meant to protect end-users against accidentally running malicious software, so it made sense to lock everything down by default. But of course the developers building those applications probably do want to be able to run them, and in general they won't have access to (or won't want to use) code signing and won't want to mess with policy too much (because it's hard to get right, and they'd likely turn off security altogether to make their stuff work, which is a bad idea). So we needed a different story for developers.

We went round and round on possible solutions for this one for most of the product cycle. There were three main possibilities:

1) A machine configuration setting (a regkey) to disable our policy

2) Runtime checks between Office and VS to disable our policy

3) Design-time changes to policy to grant permissions to code

The first one sucked because it meant that a dev machine was wide open to attack. Just because the developer was doing VS development with Office and wanted to run their own code, that didn't mean we should leave their machine open to attack from malicious code. Devs sometimes open bad attachments too, you know ;-)

The second one was an improvement because it would only disable the policy if Office detected that a debugger was attached, and that the document being opened matched the one inside the debugger. This was cool because it meant that the dev machine was no longer wide open -- policy would be enforced for all documents except the ones the developer was actively debugging -- but it had another drawback. It worked if the developer hit F5 in VS (Run with Debugging), but would fail if the developer hit Ctrl+F5 (Run without Debugging) or if they ran their code from Explorer.

The third one (which we shipped with) ensured that developers could always run the code they built themselves, because we'd update their user-level policy to trust the output location of the VS project. This meant that you were still protected from random malicious code (unless you happened to save over the top of one of your own previously-trusted projects!) and it didn't rely on having the VS debugger attached. The main drawbacks were that it didn't work out-of-the-box if you developed on a network share (you can't trust network shares by modifying user-level policy unless an admin has already updated machine policy), it didn't work if you moved your solution after building it, and you could end up with hundreds of entries in policy over time as you built more and more solutions. <sigh> oh well, you can't have everything.
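
In policy terms, what the shipped approach boils down to is something like the following sketch (the method and the example URL are mine; the real VS designer is more careful about canonicalising paths and avoiding duplicate groups):

  using System;
  using System.Collections;
  using System.Security;
  using System.Security.Permissions;
  using System.Security.Policy;

  class TrustProjectOutput
  {
      // Grant FullTrust to everything in a project's output directory by
      // adding a URL code group to the User policy level.
      static void Trust(string outputUrl)  // e.g. "file://C:/Projects/MyDoc/bin/*"
      {
          // Find the User level in the persistent policy hierarchy.
          PolicyLevel user = null;
          IEnumerator levels = SecurityManager.PolicyHierarchy();
          while (levels.MoveNext())
          {
              PolicyLevel level = (PolicyLevel)levels.Current;
              if (level.Label == "User")
                  user = level;
          }

          PolicyStatement fullTrust =
              new PolicyStatement(new PermissionSet(PermissionState.Unrestricted));
          user.RootCodeGroup.AddChild(new UnionCodeGroup(
              new UrlMembershipCondition(outputUrl), fullTrust));

          // Persist the change to the user's security configuration.
          SecurityManager.SavePolicy();
      }
  }

This also makes the network-share limitation clear: policy levels can only intersect, so a FullTrust grant added at the User level can never exceed the LocalIntranet grant that Machine policy gives a UNC path; an admin has to raise the Machine-level grant first.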

So there you have it. Many people think that the stuff that comes out of Redmond is just rubbish with no real thought behind it, produced only to make as much money as possible, as quickly as possible, with no real concern for the user. Hopefully you'll see from my blog (and everyone else's) that this isn't the case. We think long and hard about all the features we ship, and we often have to trade off several different design ideas based on conflicting requirements, changing priorities, tight schedules, etc.

I'm the first person to admit that the VSTO policy is very draconian and, quite frankly, a pain in the neck for a lot of developers who just want their code to work. But I'll also be the first (and loudest) person to defend that design as being the only "right" choice for where we are today. (And if Siew Moi wasn't on vacation, I'm sure she'd be second in line :-) ).

Thoughts?

  • You bet. I might be on vacation, but how much I like and believe in the VSTO security model does not go on vacation ;-) The design is the right choice for where we are in Office today. In fact I think it's one of the coolest VSTO "features". I could hardly contain my excitement when I first came to know about the VSTO security model design. I've had to use it a lot and also show people how to use it. Never once did I think it was painful or tedious. I really like it! Great design Peter!
  • Ivan talks about how to do this kind of per-AppDomain policy updating in his blog at: http://blogs.gotdotnet.com/ivanmed/commentview.aspx/4104271e-dceb-466f-836a-8c791af63ea8
  • I finally found some time to read this. It is all fantastic! Developers in this space still need to be educated about .NET CAS, but it is definitely the way to go. Thanks! BTW, our VSTO sample project has been released by Microsoft Japan. I don't know how many folks out there have downloaded it, but I am sure it has some impact on them.
  • Congratulations on finishing the project. You should post the URL here ;-) Also don't miss the update to this blog at: http://blogs.gotdotnet.com/ptorr/PermaLink.aspx/f270eee6-7466-4f05-b7fc-b84bfd5934a0
  • Thanks to Brian Randell, who pointed me to this post. Before reading this post I was actually quite puzzled about how the things described in Brian's MSDN March '04 article could ever work.
    The missing piece was constructing the AppDomain policy as the union of the Enterprise/Machine/User levels (and then removing the permissions in the Zone code groups).

    Before reading it I thought one way to obtain a similar effect was to remove the zone evidence before calling AppDomain.CreateDomain.

    Why wouldn't this approach work?
  • Heh... I forwarded the URL to Brian :-)

    It won't work because you need the evidence in order to evaluate the child code groups correctly. I was going to blog about this at some stage... maybe I will soon.

  • Fabulous world of VSTO security still puzzles me after living in it for the past 3 years. Isn't it especially...