Inside Architecture

Notes on Enterprise Architecture, Business Alignment, Interesting Trends, and anything else that interests me this week...

June, 2007

  • Inside Architecture

    Getting the Enterprise Canonical Data Model right


    What is the correct level of abstraction for the Enterprise Canonical Data Model (ECDM)?

    As I blogged before, the ECDM is used to decide what data should be passed through the integration infrastructure in the notifications that occur on business events.  The canonical schemas that define "things" are all subsets of the ECDM (or extensions, as we will see).

    In some organizations, there are fairly few variations in basic 'things' like order, product, and agreement.  In other organizations, including Microsoft, the need for independent variation is more apparent.  As we move more toward "Software as a service," the number and types of products will only grow.  And what exactly is an order if we are using click-stream billing for a service call?  This will be fun.  So we need lots of flexibility as the business grows and changes.  An ECDM that is too prescriptive or too large can end up constraining the business' ability to grow and change.

    There are basically two types of messages that need to rely on the ECDM: event notifications and full data entities.  Both are transitory, in that they state a fact at a particular point in time, but the event notifications are more transitory because they are only sent once across the infrastructure.  We need to be able to replay them, but (with the exception of BAM), we don't often query them.

    In general, I'd say the rule for event notifications should be:

    Communicate sparingly, communicate clearly, allow for questions.

    Communicate sparingly: Define your entities to the minimum level needed to share "concepts" and "relationships" across the enterprise.  If an order happens from "company ABC" for 10,000 licenses of "product XSP" under marketing program "VLR", then the canonical schema for that order needs to be pretty short, and the event notification even shorter, so that receiving systems can decide if they even care.  Remember that your event system will send a LOT of events.  Keep them small but provide enough information for the recipient to decide if they need to know more.  So, perhaps the "order placed" notification has things like order id, customer id, partner id, reseller id, program id (sales are made under marketing programs) and a list of product categories that the items in the order represent.  That's it.  The receiving system can decide if they need to know more.
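
    As a rough illustration, a sparse "order placed" notification could look like the sketch below.  The field names follow the list above, but the class name, the sample values, and the `cares_about` helper are my own assumptions, not part of any real schema.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical 'order placed' notification: IDs and categories only."""
    order_id: str
    customer_id: str
    partner_id: str
    reseller_id: str
    program_id: str
    product_categories: Tuple[str, ...]


def cares_about(event: OrderPlaced, watched_categories: Set[str]) -> bool:
    """A receiving system decides from the small event alone whether to follow up."""
    return any(c in watched_categories for c in event.product_categories)


# Illustrative values only; the partner and reseller IDs are made up.
event = OrderPlaced(
    order_id="1234",
    customer_id="ABC",
    partner_id="P-77",
    reseller_id="R-12",
    program_id="VLR",
    product_categories=("licenses",),
)

licensing_system_cares = cares_about(event, {"licenses", "subscriptions"})
hardware_system_cares = cares_about(event, {"hardware"})
```

    The notification carries no prices, quantities, or product names.  A subscriber that cares has to ask for them.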

    Communicate Clearly: The IDs must be generic and enterprise-wide.  If a receiving system gets a notification or a canonical element (like the full order), it has to be able to interpret it consistently.  That means that the systems listening for the events have to know what the IDs mean and how to get more information on an ID if they don't already have it.

    Allow for Questions: The infrastructure needs to provide a generic way to ask the question: I need to know more about order 1234 from customer ABC on program VLR.
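
    One way to picture a "generic way to ask" is a single lookup call keyed by entity type and enterprise-wide ID.  This is only a sketch: the `ask` and `register_resolver` names and the in-memory order data are hypothetical stand-ins for whatever the integration infrastructure really offers.

```python
from typing import Callable, Dict

# Hypothetical registry: entity type -> function that returns the full entity.
_resolvers: Dict[str, Callable[[str], dict]] = {}


def register_resolver(entity_type: str, resolver: Callable[[str], dict]) -> None:
    """Each system of record registers how to answer questions about its entities."""
    _resolvers[entity_type] = resolver


def ask(entity_type: str, entity_id: str) -> dict:
    """Generic question: 'tell me more about <entity_type> <entity_id>'."""
    if entity_type not in _resolvers:
        raise LookupError(f"no system of record registered for {entity_type!r}")
    return _resolvers[entity_type](entity_id)


# Stand-in for the order system of record (illustrative data only).
_orders = {"1234": {"order_id": "1234", "customer_id": "ABC", "quantity": 10000}}
register_resolver("order", lambda order_id: _orders[order_id])

details = ask("order", "1234")
```

    The caller never needs to know which system owns orders.  It only needs the entity type and the ID it saw in the notification.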

    So if the needs of the event notification are for brevity and consistency, what are the needs for full data entities?

    When a system gets an event notification, it will look at the event and decide if it cares.  Most of the time, it won't, and our use case ends.  Sometimes it will.  When it does, it needs to ask for full details of that data entity.  Perhaps it wants to store data.  Perhaps it wants to calculate something to append to the records for the customer, the partner, the reseller, the sales team that made the sale, or the product group that made the product.  There are lots of reasons why the system getting the message will need more data.  The ability to 'ask questions' listed above applies to full data entities as well.

    I'd say the rule for full data entities is:

    Provide a complete document, at a point in time, allow for questions

    Provide a complete document - the full data entity contains all of the data that the source system can share about it, including denormalized details about related entities.  For example, if I get an order as stated above, for 10,000 licenses for product XSP, we would provide the full "legal name" for the product and some attributes for the product (like the fact that it is a license, what country it is sold in, languages, product family id, etc). On the other hand, we don't want to constrain the business, so allow for optional fields in the semantics of the canonical object.  Allow a system that doesn't have a data element (like a price or even a quantity) to send the order anyway.  Also allow the system that is sending data to append 'system specific' data elements.  That way, a team can use the canonical model to send data to another closely related system in the same business stream, where those 'system specific details' can be understood and used.

    At a point in time - Recognize that your documents are not static.  Provide dates and version numbers for each and every document and allow a document to be called back up on the basis of those dates and version numbers.  This is key to being able to recreate a data stream later in time, an operational necessity that is often overlooked.  So, yes, your order has a version number. 

    Allow for questions: as complete as your order document is, it will still need to have codes in it referring to other things.  For example, each product may have a product family.  By including the product family code, you are stating this: "At the time this order was placed, product 'SharePoint' was part of the 'Office Family' of products."  For some products, this may not change much, but for others, it could.  So you include the product family, but there is no need to include attributes of the product family.  The receiving system can ask the same infrastructure for product family details if it needs to follow up.
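
    Pulling the three parts of the rule together, a full order document might be sketched like this.  Everything here is an assumption for illustration: the class and field names, the `DocumentStore` helper, the dates, and the "OFFICE" family code for product XSP.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional, Tuple


@dataclass
class OrderDocument:
    order_id: str
    version: int                      # every document carries a version number
    as_of: date                       # and the date it was true
    product_id: str
    product_family_code: str          # a code only; family attributes are looked up separately
    quantity: Optional[int] = None    # optional: a source system may not have it
    price: Optional[float] = None     # optional for the same reason
    extensions: Dict[str, str] = field(default_factory=dict)  # 'system specific' data


class DocumentStore:
    """Keeps every version so a data stream can be recreated later."""

    def __init__(self) -> None:
        self._docs: Dict[Tuple[str, int], OrderDocument] = {}

    def put(self, doc: OrderDocument) -> None:
        self._docs[(doc.order_id, doc.version)] = doc

    def get(self, order_id: str, version: int) -> OrderDocument:
        return self._docs[(order_id, version)]

    def latest_as_of(self, order_id: str, when: date) -> OrderDocument:
        """Return the highest version that existed on or before the given date."""
        candidates = [
            d for (oid, _), d in self._docs.items()
            if oid == order_id and d.as_of <= when
        ]
        if not candidates:
            raise LookupError(f"no version of order {order_id} as of {when}")
        return max(candidates, key=lambda d: d.version)


store = DocumentStore()
# Version 1 is sent without a price; the order goes out anyway.
store.put(OrderDocument("1234", 1, date(2007, 6, 1), "XSP", "OFFICE", quantity=10000))
# Version 2 appends a system-specific element that only a closely related system understands.
store.put(OrderDocument("1234", 2, date(2007, 6, 15), "XSP", "OFFICE",
                        quantity=10000, extensions={"oem_batch": "B-9"}))

replayed = store.latest_as_of("1234", date(2007, 6, 10))
```

    Asking for the order as of June 10 returns version 1, because version 2 did not exist yet.  That is the "recreate a data stream later" case.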

    Hopefully, with these simple guidelines, we can build the ECDM at the right level of abstraction.

  • Inside Architecture

    As the role changes...


    In my career, if I take any window of time that is two years long, regardless of start and end date, I cannot find a single period where I started and ended the period doing the same thing.  Not one.  Oh, I've worked at employers for longer than two years, but not doing the same job. 

    I'm about to begin my fourth job at Microsoft.  I've been here three years (this time).  It's a good job.  It's a different job.

    Started in one of the IT groups before moving to Enterprise Architecture.  Loved the people, and made the best of the job.  Then I moved to EA and became an Enterprise Application Architect embedded in the OEM division... which meant that it was my job to 'govern' the IT projects.  I'm not much for governing.  I'm a lot better at collaborating, and I really enjoyed collaborating with that team.  Some very smart people and I had a lot of fun working with them.  Since spring, I had been the Lead Systems Architect for a large distributed Enterprise-focused Service-Oriented Business Application (out of necessity, really).  I had a blast.  Just finished turning that gig over to an amazing architect who I have the utmost respect for so that I could move to Central Enterprise Architecture... this time to be 'Mr. SOA' for Microsoft IT.

    Of course, Microsoft IT has far more than one SOA architect.  My peers are probably better than I am in some pretty key ways.  We have many talented SOA architects working in different divisions.  What I'm hoping to do is take Microsoft IT to the next level of SOA maturity by driving the development of the Enterprise Canonical Data Model, Business Event Taxonomy, Enterprise Solution Domain Integration Model, and the Periodic Table of Services (a set of planned services that are needed to drive SOA forward).  This is one of the toughest jobs I've taken on in years (since co-founding a dot-com).

    I'm ready. 

    It's always a bit hard, and a bit sad, to leave the 'comfortable' and go to the 'new.'  There are a great many good people who I won't get to work with daily any more. I'll miss that daily contact.

    On the other hand, there are a great many good people who I haven't had the chance to work with, but will get that chance now.  Looking forward to that part.

    Microsoft IT is a great place.  If you are an IT professional, and you are the best darn architect or developer or tester or PM or operations specialist in your team, I encourage you to seriously consider joining this organization.  You can truly build a career here, if you are gutsy, and smart, and most importantly, passionate about being excellent at what you do.

    There is no way to go higher than when you are soaring with the eagles.

  • Inside Architecture

    What I like about Acropolis


    Just checking out the online resources on the new Orcas front-end development technology called Acropolis that builds MVC/MVP patterns into WPF software development.

    What I find promising: an Acropolis part can essentially consume a SOA service, allowing the composition of process and activity services to be as simple as snapping parts onto a surface.  This is not particularly new from a software development standpoint, but it's pretty new for the Microsoft stack.  Nice to see.

    We could theoretically get to the point where Mort himself can compose an application from services...  And there would be little or no code to maintain. (Thus solving the problem of unmaintainable code).  It also makes the creation of a Service Oriented Business App so fast as to provide real, useful, practical, business agility.

    Code is becoming free.

  • Inside Architecture

    The Unimportant SOA Catalog


    Have you ever woken up in the morning with an idea in your head that you simply have to write down?  I just did.  Here's the idea: Everyone talks about how important the catalog (or repository) is to Service Oriented Architecture.  It isn't.

    The reason everyone wants a catalog is simple: If I create a uniquely valuable service, and I want people to use it, I need a place to advertise it.  So I put it in a catalog.  The catalog contains useful information like what the service is, and what it does, and who made it, and how to call it.  Useful stuff.  Sears Roebuck, circa 1893.

    So how can that be unimportant?

    Because this is a case of 'doing a really good job of solving the wrong problem.'

    A friend of mine and fellow architect here in Microsoft IT named Mohamed El-Ghazali changed the way I think about service contracts.  Gartner changed the way I think about "what makes adoption work."  Together, they make a powerful brew.  It took me a while, because these ideas are "just different enough" to make me pause, but between them these two sources had the intended effect, and now I can say, without blinking, that the catalog is not the high order bit.

    Why? Because the catalog is not an IFaP.  It is a list of chaos. 

    If you have 20 services, or even 50 services, a catalog is really useful.  I'm looking at an architecture that will require something around 500 enterprise information, activity, and process services, about 200 infrastructure services, and countless 'point solution' services.  There is no way a list will do.  No human can remember it, or use it.  Duplication and overlap will prevail.  Face it, the catalog doesn't scale.

    So where does the solution lie? 

    How about looking to the past to find the future. 

    I call your attention to the history of the Periodic Table of Elements.

    If you are not familiar with the history of the creation of this simple yet extraordinarily powerful concept, you should read this page.  Two key concepts I'd like to pull out:

    First off, by creating the periodic table of elements, Mendeleev created a situation not only where elements could be classified, but where missing elements could be predicted.

    Between 1868 and 1870, in the process of writing his book, The Principles of Chemistry, Mendeleev created a table or chart that listed the known elements according to increasing order of atomic weights. When he organized the table into horizontal rows, a pattern became apparent--but only if he left blanks in the table. If he did so, elements with similar chemical properties appeared at regular intervals--periodically--in vertical columns on the table.

    Mendeleev was bold enough to suggest that new elements not yet discovered would be found to fill the blank places. He even went so far as to predict the properties of the missing elements. Although many scientists greeted Mendeleev's first table with skepticism, its predictive value soon became clear.

    This meant that not only did Mendeleev help to understand the list of 'needed domain knowledge', he actually created boundaries that empowered other people to focus their efforts and deliver incredibly quick innovation.  This innovation came from people he had never met.

    The second thing I'd like to highlight is that the original table was useful, but it was changed as knowledge increased, to match a more modern understanding of chemistry and modern techniques for measuring atoms that were not available when it was developed.  In other words, the concept is good, even if the implementation is iterative.  (19th century agility).  The boundaries remained, and the table stands today as a fundamental artifact in the understanding of our natural world.

    What does that have to do with SOA?

    I am creating a similar table of services based (loosely) on the layers defined by Shy Cohen, message exchange patterns defined by the W3C, the work on Solution Domains that my team in IT Enterprise Architecture has started, and the business behaviors that I see as necessary to accomplish a partitioned design.  The goal is to create an all-up IFaP of services based on multiple spanning layers. 

    Unlike the periodic table, this will not be bounded by physics.  Instead, it will be bounded by the data elements and solution elements defined by performing a Solution Domain mapping exercise against the enterprise.  Your organization will have different elements, but either way, there will be boundaries, and that will, I believe, foster organized and directed effort, creativity, and discoverability.

    I believe the value will be clear.

    1. We will know what services we need to develop to meet the needs of the enterprise.  We can even prioritize the list and create a roadmap showing the CIO when we will be "done."
    2. We will have basic patterns already established for how they will be called and what they will return.  This reduces a huge amount of churn and will give brave developers the ability to resist the "not invented here" plague.  The patterns can be designed to include all the needs of the test and support teams that are normally 'left out' of application specs but are ever more critical to the success of SOA.
    3. We will have generic test harnesses in place to test them before they are written, allowing test architects to build reusable test value, while at the same time relieving project teams from writing difficult and complex test software to support SOA.
    4. We will have sufficient information to estimate their cost by the team that must build and maintain them, providing some visibility to the cost of developing an integrated application.  This gives us the ability to separate out the incremental cost of SOA from the cost of application development in general.

    I'm pretty excited about doing this, and I think it is a strategy that can work. 

    So what part of this kills the catalog?

    The catalog helps a programmer to find the name of a service that performs a specific purpose. 

    However, if I know the purpose, and the list of activities is a constrained list (as is the list of data subject areas), then I can create the name of the service and just hit the infrastructure up for it.  If it exists, the service will respond with details.  If not, the infrastructure can respond with information on what is needed and where it should live.
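
    Here is a minimal sketch of that idea, assuming a `subject_area.activity` naming convention.  The vocabularies, the naming format, the placeholder endpoint, and the `resolve` function are all hypothetical; a real enterprise would derive its lists from its own Solution Domain mapping.

```python
# Hypothetical constrained vocabularies (assumed for illustration).
ACTIVITIES = {"get", "create", "update", "cancel"}
SUBJECT_AREAS = {"order", "invoice", "customer", "product"}

# Stand-in for the services that are actually deployed today.
DEPLOYED = {"order.get": "https://services.example.invalid/order/get"}


def service_name(subject_area: str, activity: str) -> str:
    """Build the service name from the two constrained lists; no catalog search needed."""
    if subject_area not in SUBJECT_AREAS:
        raise ValueError(f"unknown data subject area: {subject_area!r}")
    if activity not in ACTIVITIES:
        raise ValueError(f"unknown activity: {activity!r}")
    return f"{subject_area}.{activity}"


def resolve(subject_area: str, activity: str) -> dict:
    """Ask the infrastructure for a service by its predictable name."""
    name = service_name(subject_area, activity)
    if name in DEPLOYED:
        return {"name": name, "exists": True, "endpoint": DEPLOYED[name]}
    # The slot in the table exists even though the service does not (yet).
    return {"name": name, "exists": False, "planned_home": f"{subject_area} solution domain"}


found = resolve("order", "get")
missing = resolve("invoice", "cancel")
```

    A request for a service that has not been built still gets a useful answer: the name it should have and where it should live.  That is the "blank in the table" that tells a team what to build.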

    It really is that simple. 

    We go from this:

    The catalog describes the service infrastructure (bad)

    to this

    The catalog is the service infrastructure. (good)

    And in this world, the catalog is informative, but not required.

  • Inside Architecture

    Enterprise IT Integration and Data Security in the Canonical Data Model?


    One thing that I do is spend a lot of time staring at a single problem: how to make a large number of systems "speak" to one another without creating piles of spaghetti-links and buckets of operational complexity.

    So this past week, I've been thinking about security in the integration layer.

    In Microsoft, we have a lot of competing business interests.  One company may be a Microsoft Partner in one channel, a customer in another, and a competitor in a third.  (IBM is a perfect example, as is Hewlett-Packard.  We love these guys.  Honest.  We also compete against them).  To add to the fun, in a vast majority of cases, our Partners compete with each other, and we need to absolutely, positively, with no errors, maintain the confidentiality and trust that our partners have in us.  In order to protect the 'wall' between Microsoft and our partners, and between our partners and each other, in competitive spaces, while still allowing open communication in other spaces, we have some pretty complicated access rules that apply not only to customer access, but also to how the account managers in Microsoft, who work on their behalf, can access internal data.  For example, an account manager assigned to work with Dell as an OEM (a Microsoft employee) cannot see the products that Hewlett-Packard has licensed for their OEM division, because he or she may accidentally expose sensitive business information between these fierce competitors.

    In this space, we've developed a (patented) security model based on the execution of rules at the point of data access (Expression-Based Access Control, or EBAC).  This allows us to configure some fairly complicated rules to define what kind of data a customer may directly access (or an employee may access on behalf of their customers).  So I'm looking at the EBAC components as well as more traditional Role-based Access Control (RBAC) and thinking about integration.

    What right does any application have to see a particular data element?

    This gets sticky. 

    I can basically see two models. 

    Model 1: The automated components all trust one another to filter access at the service boundary, allowing them to share data amongst themselves freely. 

    Model 2: Every request through the system has to be traced to a credential and the data returned in a call depends heavily on the identity of the person instigating the request.

    Model 1 is usually considered less secure than model 2.

    I disagree. 

    I believe that we need a simple and consistent infrastructure for sharing automated data, and that we should move all "restriction" to the edge, where the users live.  This allows the internal systems to have consistently filled, and consistently correct, data elements, regardless of the person who triggered a process.

    In real life, we don't restrict data access to the person who initiated a request.  So why do it when we automate the real life processes?  For example, if I go to the bank and ask them to look into a questionable charge on my credit card, there is no doubt that the instigator of the request is me.  However, I do not have access to the financial systems.  A person, acting on my behalf, may begin an inquiry.  That person will have more access than I have.  If they run into a discrepancy, they may forward the request to their manager, or an investigator, who has totally different access rights.  If they find identity theft, they may decide to investigate the similarity between this transaction and a transaction on another account, requiring another set of access rights.

    Clearly, restricting this long-running process to the credentials of the person who initiated it would hobble the process. 

    So in a SOA infrastructure, what security level should an application have?

    Well, I'd say, it depends on how much you trust that application.  Not on how much you trust the people who use it.  Therefore, applications have to be granted a level of trust and have to earn that level somehow.  Perhaps it is through code reviews?  Perhaps through security hardening processes or network provisioning?  Regardless, the point is that the application, itself, is an actor. It needs its own level of security and access, based on its needs, separate from the people that it is acting on behalf of.

    And how do you manage that?  Do you assign an application access to a specific database?  Microsoft IT has thousands of databases, and thousands of applications.  The Cartesian product alone is enough to make your head spin.  Who wants to maintain a list of millions of data items?  Not me.

    No, I'd say that you grant access for an application against a Data Subject Area.  A Data Subject Area is an abstraction.  It is the notion of the data as an entity that exists "anywhere" in the enterprise in a generic sense.  For example: A data subject area may be "invoice" and it covers all the systems that create or manage invoices.  This is most clear in the Canonical Data Model, where the invoice entity only appears once.

    Since applications should only integrate and share information using the entities of the canonical data model, would it not, therefore, make sense to align security access to the canonical data elements as well?
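
    To make the arithmetic and the idea concrete, here is a small sketch.  The counts are illustrative placeholders (not real Microsoft IT figures), and the application names, grant table, and `may_access` function are hypothetical.

```python
# Illustrative counts only: "thousands" of apps and databases, a few dozen subject areas.
apps, databases, subject_areas = 2000, 3000, 50

pairs_by_database = apps * databases          # one possible grant per (app, database)
pairs_by_subject_area = apps * subject_areas  # one possible grant per (app, subject area)

# Hypothetical grants: the application itself is the actor, and access is
# granted against a Data Subject Area rather than a physical database.
GRANTS = {
    "billing-app": {"invoice": {"read", "write"}, "order": {"read"}},
    "reporting-app": {"invoice": {"read"}},
}


def may_access(app: str, subject_area: str, action: str) -> bool:
    """Check an application's trust level for one canonical entity."""
    return action in GRANTS.get(app, {}).get(subject_area, set())


billing_can_write_invoice = may_access("billing-app", "invoice", "write")
billing_can_write_order = may_access("billing-app", "order", "write")
```

    With these made-up numbers, the per-database list has 6,000,000 possible entries and the per-subject-area list has 100,000.  The second list is also stable when a database is split, merged, or retired, because the "invoice" entity still appears only once.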

    I'll continue to think on this, but this is the direction I'm heading with respect to "data security in the cloud."

    Your feedback is welcome and encouraged.

  • Inside Architecture

    Simple Lifecycle Agility Maturity Model


    How agile are you?  Can you measure your agility?

    My discussions over the past week, about who is and who isn't agile, started me wondering: if you want to improve your agility, you need to be able to measure it.  This idea is simple and repeatable.  It is used in most "continuous improvement" processes. 

    I created a simple model for measuring the agility of a software development process.  I call it the Simple Lifecycle Agility Maturity Model (SLAMM).  It is a single Excel spreadsheet (Office 97-2003 compatible, virus free), complete with instructions, measurements, and a chart you can use or share.  You can find it here.

    Using this model, the team follows a simple process:

    1. Write a simple story that describes the process you followed.  Examples are included in the spreadsheet.
    2. Rate your process on 12 criteria based on the Agile Alliance principles.
    3. Enter weights and view results.
    4. Create a list of steps to address deficiencies.  Follow the normal agile process to estimate these steps and add to the backlog.
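
    For readers who want a feel for steps 2 and 3 without opening the spreadsheet, a weighted rating can be computed as sketched below.  This is not the spreadsheet's actual formula: the 0 to 5 rating scale, the sample ratings and weights, and the 0 to 100 normalization are all assumptions made for illustration.

```python
from typing import Sequence


def weighted_agility_score(ratings: Sequence[float],
                           weights: Sequence[float],
                           max_rating: float = 5.0) -> float:
    """Weighted average of the ratings, scaled to 0..100 (assumed scoring scheme)."""
    if len(ratings) != len(weights):
        raise ValueError("need exactly one weight per rating")
    if any(not 0 <= r <= max_rating for r in ratings):
        raise ValueError(f"ratings must be between 0 and {max_rating}")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(r * w for r, w in zip(ratings, weights))
    return 100.0 * weighted_sum / (max_rating * total_weight)


# Made-up ratings and weights for 12 criteria.
ratings = [5, 4, 3, 5, 2, 4, 3, 5, 1, 4, 3, 2]
weights = [3, 2, 2, 1, 1, 2, 3, 1, 1, 2, 1, 1]

score = weighted_agility_score(ratings, weights)

# Candidates for step 4: the criteria losing the most weighted points.
shortfalls = sorted(range(len(ratings)),
                    key=lambda i: (5 - ratings[i]) * weights[i],
                    reverse=True)
```

    Whatever formula is used, the comparison between teams only holds if everyone uses the same weights.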

    I'd like to share this model with the community.  Please take a look.  If you like it, use it.  Completely open source.

    The weights came from careful reading of the principles on the Agile Alliance site (with a dash of my own experience).  I invite the community to discuss the weights and create a consensus to change them if you'd like.  Note that the biggest benefit of models like this is the ability to compare the agility of processes in DIFFERENT COMPANIES or organizations, so we need to stick to a single set of weights in order to have a standard for comparison.

    I hope this is the positive outcome of the blog flurry of late. 

    <3-30-2009: Link to SLAMM spreadsheet updated after CodePlex dropped the SLAMM project. >
