Software Engineering, Project Management, and Effectiveness
A few folks have asked me about Axiomatic design that I mentioned in my post on Time-boxes, Rhythm, and Incremental Value. I figure an example is a good entry point.
An associate first walked me through axiomatic design like this. You're designing a faucet where you have one knob for hot and one knob for cold. Why's it a bad design? He said because each knob controls both temperature and flow. He said a better design is one knob for temperature and one knob for flow. This allows for incremental changes in design because the two requirements (temperature and flow) have their independence. He then showed me a nifty matrix on the board that mathematically *proved* good design.
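To make the matrix idea concrete, here is a minimal sketch of how the faucet comparison might be written down. It assumes a square matrix whose rows are the requirements (temperature, flow) and whose columns are the knobs, with a nonzero entry meaning "this knob affects this requirement". The function name `classify` and the 0/1 entries are illustrative assumptions, not the matrix my associate drew. The labels uncoupled, decoupled, and coupled are the usual Axiomatic Design terms for diagonal, triangular, and everything else.

```python
def classify(matrix):
    """Classify a square design matrix as written (no row/column reordering).

    Rows are requirements, columns are design parameters (knobs).
    A nonzero entry means the knob affects that requirement.
    """
    n = len(matrix)
    off_diagonal = [(row, col) for row in range(n) for col in range(n)
                    if row != col and matrix[row][col] != 0]
    if not off_diagonal:
        return "uncoupled"   # diagonal: each knob drives exactly one requirement
    all_below = all(row > col for row, col in off_diagonal)
    all_above = all(row < col for row, col in off_diagonal)
    if all_below or all_above:
        return "decoupled"   # triangular: workable if knobs are set in order
    return "coupled"         # knobs interfere with each other


# Hot knob and cold knob: each one changes both temperature and flow.
hot_cold_faucet = [[1, 1],
                   [1, 1]]

# Temperature knob and flow knob: one knob per requirement.
temp_flow_faucet = [[1, 0],
                    [0, 1]]

print(classify(hot_cold_faucet))
print(classify(temp_flow_faucet))
```

The point of the check is the same as the whiteboard version: when the matrix is diagonal, you can change one requirement without touching the other.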
At the heart of Axiomatic Design are these two axioms (self-evident truths):
For an interesting walkthrough of Axiomatic Design, see "A Comparison of TRIZ and Axiomatic Design".
Around mid 2004, Randy Miller approached me with "I want to review MSF Agile with you with the idea of incorporating your work." I didn't know Randy or what to make of MSF Agile, but it sounded interesting.
Randy wanted a way to expose our security and performance guidance in MSF. Specifically he wanted to expose "Improving Web Application Security" and "Improving .NET Application Performance" through MSF. I was an immediate fan of the idea, because customers have always asked me to find more ways to integrate in the tools. I was also a fan because my two favorite mantras to use in the hallways are "bake security in the life cycle" and "bake performance in the life cycle". I saw this as a vehicle to bake security and performance in the life cycle and the tools.
We had several discussions over a period of time, which was a great forcing function. Ultimately, we had to figure out a pluggable channel for the guidance, the tools support and how to evolve over time. My key questions were:
These questions led to a ton of insights around meta-models for software development life cycles, context-precision, organizing engineering practices and several other concepts worth exploring in other posts.
My key philosophies were:
Randy agreed with the questions and the philosophies. We came up with some working models for pluggable guidance and integration. His job was to make the tools side happen. My job was to make the guidance side happen. I now had the challenge and opportunity of making guidance online and in the tools. This is how I ended up doing guidance modules for .NET Framework 2.0. This also drove exposing p&p Security Engineering, which is baked into MSF Agile by default.
Randy summarized our complementary relationship best ... “The Patterns and Practices group produces an important, complementary component to what we are building into MSF. In Visual Studio, our MSF processes can only go so deep on a topic. Our activities can provide the overview of the steps that a role should do but cannot provide all of the educational background necessary to accomplish the task. As many of the practices that we espouse in MSF (such as threat modeling) require this detailed understanding, we are building links into MSF to Patterns and Practice online material. Thus the activities in MSF and the Patterns and Practices enjoy a very complementary relationship. The Patterns and Practices group continues to be very helpful and our relationship is one of very open communication.”
Mike Kropp, GM of our patterns & practices team, liked the results ...“– it was great to see the progress you’ve made over the past couple of months. here’s my takeaway on what you’ve accomplished:
I remember asking Randy at one point, why did you bet on our security and performance work? He told me it was because he knew we vetted and proved our work with customers and industry experts. He also knew we vetted internally across our field, support and the product teams. I told him if anybody wondered who we worked with, have them scroll down to the contributors and reviewers list for the security work as an example.
We have more work ahead of us, but I think we've accomplished a lot of what we set out to do and for that I'm grateful to Randy Miller, David Anderson and their respective teams.
Today I had some interesting conversations with Loren Kohnfelder. Every now and then Loren and I play catch up. Loren is former Microsoft. If you don't know Loren, he designed the CLR security model and IE security zones. He created a model for more fine-grained control over security decisions and he's a constant advocate for simplifying security.
You might think two guys that do security stuff would talk about security. We didn't. We ended up talking about project management, blogging, social software, and where I think next generation guidance is headed. I'll share the project management piece.
I told Loren I changed my approach to projects. I use time boxes. Simply put, I set key dates and work backwards. I chunk up the work to deliver incremental value within the time boxes. This is a sharp contrast from the past where I'd design the end in mind and then do calculated guesstimates on when I'd be done, how much it would cost and the effort it would take.
I use rhythm for the time boxes. I use a set of questions to drive the rhythm … When do I need to see results? What would keep the team motivated with tangible results? When do customers need to see something? I realize that when some windows close, they are closed forever. The reality is, as a project stretches over time, risk goes up. People move, priorities change, you name it. When you deal with venture capitalists, a bird in hand today gets you funding for two more in the bush.
Loren asked me how do I know the chunks add up to a bigger value. I told him I start with the end in mind and I use a combination of scenarios and axiomatic design. Simply put, I create a matrix of scenarios and features, and I check dependencies across features among the scenarios. What's my minimum set of scenarios my customers want to have something useful? Can I incrementally add a scenario? Can I take away scenarios at later points and get back time or money without breaking my design? Sounds simple, but you'd be surprised how revealing that last test is. With a coupled design, if you cut the wrong scenario you have a cascading impact on your design that costs you time and money.
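As a rough illustration of that last test, here is a minimal sketch of a scenario/feature matrix and the "what happens if I cut this scenario?" check. The scenario names, feature names, and the `feature_deps` map are all hypothetical, made up for the example. The idea is only to show the shape of the check, not the actual matrix I use.

```python
# Hypothetical matrix: each scenario lists the features it needs.
scenarios = {
    "browse catalog": {"catalog", "search"},
    "place order": {"catalog", "cart", "payment"},
    "track order": {"order status"},
}

# Hypothetical feature-to-feature dependencies (feature -> features it relies on).
feature_deps = {
    "order status": {"payment"},
}


def features_freed_by_cutting(scenarios, cut):
    """Features only the cut scenario needs, so cutting it gives back their cost."""
    still_needed = set().union(
        *(features for name, features in scenarios.items() if name != cut)
    )
    return scenarios[cut] - still_needed


def broken_by_cut(scenarios, feature_deps, cut):
    """Remaining scenarios whose features depend on a feature that would be dropped."""
    freed = features_freed_by_cutting(scenarios, cut)
    broken = []
    for name, features in scenarios.items():
        if name == cut:
            continue
        relied_on = set().union(*(feature_deps.get(f, set()) for f in features))
        if relied_on & freed:
            broken.append(name)
    return broken


for candidate in scenarios:
    print(candidate,
          sorted(features_freed_by_cutting(scenarios, candidate)),
          broken_by_cut(scenarios, feature_deps, candidate))
```

In this made-up data, cutting "track order" is clean, while cutting "place order" cascades into "track order" because of the hidden dependency on payment. That cascade is the coupling the test is meant to reveal.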
We both agreed time-boxed projects have a lot of benefits, some of which are not obvious. Results breed motivation. By using a time box and rhythms, you change the game from estimating and promising very imprecise variables to a game of how much value you can deliver in a time box. Unfortunately sometimes contracts or cultures work against this, but I find if I walk folks through it, and share the success stories, they buy in.
Want to see some short training videos and labs by Keith Brown on some common security issues?
I thought it would be great to do a pilot around modular, self-paced training. By modular, I mean you get a video that's 10 minutes or less, and a lab that's 20 minutes or less.
Modular, Relevant, Real-World
To make the training valuable, I wanted to improve on a few things:
Scenarios
For some simple usage scenarios, I had the following in mind:
The idea was that the community could help point each other to more fine-grained training or big picture as needed. On my end, I could point to the training to help walk customers through our patterns & practices Security Guidance.
Training Layout
The modules are laid out as follows:
All pages are simple and bare by design (to render more as we learn more and based on feedback). The key to having a page per lab means we'll be able to provide fine-grained access and jumps from guidance.
For more information about the patterns & practices Security Training Modules, see About the patterns & practices Security Training Modules.
Today, I cleaned up my Security Wiki on Channel9 at http://channel9.msdn.com/Security
The purpose of this Wiki is to let me share information that may not have the complete fit and finish of what's on MSDN. This comes in handy for a few things:
What's in store going forward for Security Wiki? Well, potentially a lot. I'm still thinking through some of the possibilities. If you have things you'd really like to see more of, let me know, and I'll see what I can do.
The Web Application Security Frame is a set of categories you can use to scope security and improve your effectiveness. It consists of the following categories:
We created these categories during Improving Web Application Security to represent two things:
1. Where are the most common mistakes made
2. Where are the most actionable improvements
How do you use these to be more effective? You use these categories to focus and prioritize your security work. For example, if you know the most prevalent security issues occur in the input validation, authentication and authorization categories, you can start there.
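One simple way to put that into practice is to tag each finding with a category and count. The sketch below assumes you already have a list of findings tagged with frame categories. The findings themselves are invented for illustration, and `prioritize` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical findings from a review, each tagged with a frame category.
findings = [
    ("search box echoes raw query text", "input validation"),
    ("order ID accepted without range check", "input validation"),
    ("upload accepts any file type", "input validation"),
    ("password sent over plain HTTP", "authentication"),
    ("no lockout after repeated failed logins", "authentication"),
    ("admin page reachable by any signed-in user", "authorization"),
]


def prioritize(findings):
    """Return categories ordered from most findings to fewest."""
    counts = Counter(category for _description, category in findings)
    return [category for category, _count in counts.most_common()]


print(prioritize(findings))
```

Counting is only a starting point. You would still weigh impact, but it gives you a defensible place to begin.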
You can immediately put the Web Application Security Frame into action. When you perform Security Design Inspections or Security Code Inspections, you can use the frame to walk categories of common security issues. To do so, see the following:
For more information on the Web Application Security Frame, see Cheat Sheet: Web Application Security Frame.
As a software engineer, how do you cope with information overload? I suggest domain specific categories. If the basic idea of domain specific languages (DSL) is a software language targeted at a specific area of problems, then domain specific categories (DSC) are an idea to create categories specific to an area of problems.
Here's some practical usage for the categories:
Here are practical examples illustrating domain specific categories:
In the examples above, notice how the headings are carefully chosen categories. Each category that organizes recommendations is evaluated against both the problem space, such as security or performance, the application type, such as Web application, and then the specific technology at hand.
Also notice how the baseline categories in the Web Application Security Frame become more specific and relevant in two ways:
Are there downsides to not using a one-set-of-categories-fits-all approach? You bet ... but based on the results I've seen from practitioners, I'd bank on using more thoughtful and empirical categories that are tested against how actionable and relevant they are.
How do you know which techniques to use to shape your software throughout the life cycle? Start with the high Return On Investment (ROI) activities as a baseline set. You can always supplement or modify for your scenario.
Most development shops have some variations of the following activities:
· Design guidelines
· Architecture and design review
· Code review
· Testing
· Deployment review
Identifying design guidelines involves putting together a set of recommendations for the development team. The recommendations address key engineering decisions such as exception management and include input from software vendor recommendations, corporate policies, industry practices, and patterns. This is a high ROI activity because it helps create the scaffolding for the project.
Architecture and design review is an exercise in evaluating the design against functional requirements, non-functional requirements, technological requirements and constraints. It can be as simple as a whiteboard sketch or it can involve multiple documents, diagrams, and presentations. An architecture and design review is most valuable when it’s performed early enough to help shape your design. It’s less valuable when it’s performed late enough that its only function is to find “do overs.”
What better way to find problems than at the source? Code review is a practical and effective way for finding quality issues. While some issues can be found through static analysis tools, the advantage of a manual code review is contextual analysis (e.g. you know the usage scenario and likely conditions).
Testing is executable feedback for your software. It works or it doesn’t. Whether it works “right” is a bit trickier to establish. The ideal case is where you establish an executable contract between your code and the requirements, including functional requirements, non-functional requirements and constraints. It also helps to know what you’re looking for so you can test against it. While testing is usually optimized around functional behavior, you can tune it for quality attributes depending on your ability to set measurable objectives and to define what good looks like.
Deployment review is an evaluation of your application deployed to your infrastructure. It’s where the rubber meets the road. This is where you evaluate the impact of configuration settings against runtime behavior. A deployment review can be your last actionable checkpoint before going into production. Preferably you have a working build of software earlier vs. later in your development cycle so that early deployment reviews can help reduce surprises going into production.
What makes these activities high ROI activities is that, if applied properly, they directly impact the shape of the software throughout the process. I know these are high ROI activities because I’ve seen them transform my customers’ results for the better. These activities provide key feedback and action at high-leverage points in the development process.
While trying to create a threat model template for customers, I analyzed many threat models inside and outside Microsoft. It was insightful to see the patterns of what was useful across threat models and what was noise.
A good threat model has the following components:
A good threat model serves the following purposes:
By far, the most tangible output of the threat modeling activity is a prioritized list of vulnerabilities. These are action items for your developers and input for your testers. The developer makes a call on whether and how to fix, and the tester will test the fix.
This sample Template for a Web Applications Threat Model comes very close to showing what I've empirically seen to be useful, though there's always a gap between reality and real-time.
You can create effective security activities based on the high ROI engineering activities:
Rather than interspersing security in your existing activities, factor security into its own set of activities. Factoring security into its own workstream of quality control, keeps the activities lean and focused. Because you’re leveraging high ROI activities, you’re increasing the likelihood of influencing the shape of the software at strategic points. You create an engineering system that helps you address security throughout your software development vs. up front or after the fact. Using multiple activities vs. a single big bang effort up front or at the end creates an approach that scales up or down with project complexity and size.
The trick is to not over-invest at any one stage – stay leveraged. Rule out losing strategies early in the analysis but still cast a wide net. Progressively more costly analysis happens later and is much more likely to be on the correct path. Don’t spend a lot on costly late activities until you’ve passed muster on much less costly activities. Start with low-cost, high-ROI activities, learn along the way, and iteratively add more time and expense as you better understand what you are doing.
Simply factoring security into its own activities doesn’t produce effective security results. However, factoring security into focused activities does create a way to optimize your security efforts, as well as create a lean framework for improving your engineering as you learn and respond.
If it’s not broken, then don’t fix it ...
The problem is, you may have an approach that isn’t working, or it’s not as efficient as it could be, but you may not even know it. Let’s take a quick look at some broken approaches and get to the bottom of why they fail. If you understand why they fail, you can then take a look at your own approach and see what, if anything, you need to change. The more prevalent broken approaches include:
The Bolt on Approach
Make it work, and then make it right. This is probably the most common approach to security that I see, and it almost always results in failure or at least inefficiency. This approach results in a development process that ignores security until the end, usually the testing phase, and then tries to make up for mistakes made earlier in the development cycle. This is one way of addressing security. This is effectively the bolt on approach.
The assumption is that you can bolt on security at the end, just enough to get the job done. While the bolt on approach is a common practice, the prevalence of security issues you can find in Web applications that use this approach, is not a coincidence.
The real weakness in the bolt on approach is that some of the more important design decisions that impact the security of your application have a cascading impact on the rest of your application’s design. If you’ve made a poor decision early in design, later you will be faced with some unpleasant choices. You can either cut corners, further degrading security, or you can extend the development of your application missing project deadlines. If you make your most important security design decisions at the end, how can you be confident you’ve implemented and tested your application to meet your objectives?
The Do It All Up Front Approach
The opposite of the bolt on approach is the do it all up front approach. In this case, you attempt to address all of your security design up front. There are two typical failure scenarios:
While considering security up front is a wise choice, you can’t expect to do it all at once. For one thing, you can’t expect to know everything up front. More importantly, this approach is not as capable of dealing with decisions throughout the application development that affect security, as an approach that integrates security consideration throughout the life cycle.
The Big Bang Approach
This can be similar to the do it all up front approach. The big bang approach is where you depend on a single big effort, technique or activity to produce all of your security results. Depending on where you drop your hammer, you can certainly accomplish some security benefits, if some security effort is better than none. However, similar to the do it all up front approach, a small set of focused activities outshines the single big bang.
The typical scenario is a shop that ignores security (or pays it less than the optimal amount of attention) until the test phase. Then they spend a lot of time/money on a security test pass that tells them all the things that are wrong. They then make hard decisions on what to fix vs. what to leave and try to patch an architecture that wasn’t designed properly for security in the first place.
The Buckshot Approach
The buckshot approach is where you try a bunch of security techniques on your application, hoping you somehow manage to hit the right ones. For example, it’s common to hear, “we’re secure, we have a firewall”, or “we’re secure, we’re using SSL”. The hallmark of this approach is that you don’t know what your target is and the effort expended is random and without clear benefit. Beware the security zealot who is quick to apply everything in their tool belt without knowing what the actual issue is they are defending against. More security doesn’t necessarily equate to good security. In fact, you may well be trading off usability or maintainability or performance, without improving security at all.
You can’t get the security you want without a specific target. Firing all over the place (with good weapons even) isn’t likely to get you a specific result. Sure you’ll kill stuff but who knows what.
The All or Nothing Approach
With the all or nothing approach, you used to do nothing to address security and now you do it all. Over-reacting to a previous failure is one reason you might see a switch to “All”. Another is someone truly trying to do the best they can to solve a real problem and not realizing they are biting off more than they can chew.
While it can be a noble effort, it’s usually a path to disaster. There are multiple factors, aside from your own experience, that impact success, including maturity of your organization and buy in from team members. Injecting a dramatic change helps get initial momentum, but, if you take on too much at once, you may not create a lasting approach, and will eventually fall back to doing nothing.
A Web application is not a component is not a desktop application is not a Web service. If I gave you an approach to threat model a Web application, you can probably stretch the rubber band to fit Web services too. You could probably even bend it to work for components or mobile applications. The problem is that type and scenario really do matter and can sharply influence your technique. If you generalize the technique, you produce generalized results. If you increase the context and precision, and factor that into your technique, you can deepen your impact.
For example, if I'm threat modeling a Web application and I know the deployment scenario, I can whittle my way through there. If I'm threat modeling a reusable component that may be used in a variety of situations, I would actually start with the top 3-5 likely deployment scenarios and play those out. This sounds obvious but I've seen folks try to model all the possible variations of a component in a single messy model, or I've seen them give up right away saying there are just too many. The irony is that a quick 3-5 little models usually tell you very quickly what the dominant issues and themes are.
Categories for Context
Context-precision is simply a term I give to the concept of evaluating a given problem's context using a few categories that impact the approach:
Extend and refine the categories as you see fit.
For application type, you could focus on CRM or some other business vertical. I dumb it down to the architecturally significant set that I've seen have immediate impact on the activity. For example, while a Web application and Web service seem like you could just use the same pattern-based frame above for Web applications, I would argue you can create a better one optimized for Web services. For example, a Web service involves a proxy. Proxy is a great category to evaluate attacks, vulnerabilities, and countermeasures. For another example, take input validation. For a Web application, you're likely talking about the HTML stream. For a Web service, we're focused on XML validation.
On one extreme, you don't want to invent a new technique for every context. Instead, you want to pay attention to the context and ask whether or not the technique was actually designed for what you're doing. Could you further refine or optimize the technique for your product line or context? Asking these questions sets you down the path of improved software engineering.
I see a lot of confusion over terms when it comes to threat modeling. The terms matter because they shape focus. For example if you confuse threats with attacks, you've limited what you're looking for.
These are the terms we used when we created our How To Threat Model Web Applications:
Rather than get caught up in the definitions, you can focus on intent:
An example putting this all together would be, my asset is my customer information. My application faces the threat of injection attacks. My application's lack of input validation is a vulnerability. SQL Injection and Cross-site scripting would be attacks. Countermeasures would be validating input and keeping user input out of the control channel.
There's a couple of interesting points here:
What's important in all this is that your security objectives are the ultimate scoping tool and that by understanding the relationships between the terms, you produce more effective results when you threat model.
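To ground the customer information example, here is a minimal sketch of the two countermeasures for the SQL injection case only (cross-site scripting needs output encoding, which isn't shown). It uses Python's standard `sqlite3` module with an in-memory database. The `customers` table, the `find_customer` helper, and the allow-list rule for names are all assumptions made up for illustration.

```python
import re
import sqlite3

# Hypothetical allow-list: letters, spaces, periods, apostrophes, hyphens; 1-64 chars.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z .'-]{0,63}")


def make_demo_db():
    """Build a throwaway in-memory database holding the asset (customer rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customers (name) VALUES (?)", ("Alice",))
    return conn


def find_customer(conn, name):
    # Countermeasure 1: validate input against what good input looks like.
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("invalid customer name")
    # Countermeasure 2: keep user input out of the control channel by passing
    # it as a bound parameter instead of concatenating it into the SQL text.
    rows = conn.execute("SELECT id, name FROM customers WHERE name = ?", (name,))
    return rows.fetchall()


conn = make_demo_db()
print(find_customer(conn, "Alice"))

attack = "x' OR '1'='1"
try:
    find_customer(conn, attack)
except ValueError as error:
    print("rejected:", error)
```

Even without the validation step, the bound parameter means the attack string is compared as data rather than executed as SQL. The two countermeasures back each other up.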
Short-Cuts
You can append SecurityGuidance, SecurityEngineering, or ThreatModeling to http://msdn.com or http://microsoft.com .
I'm J.D. Meier, the PM for security and performance on the patterns & practices team. My manager refers to me as the "abilities" PM. I create guidance to help customers bake security and performance into their life cycle. Why performance and security? ... Who wants an insecure app that scales ... or a "secure" app that won't?
One of the tasks on my plate is to "change the world". I'm a fan of using principles, practices and patterns to do so.