Inside Architecture

Notes on Enterprise Architecture, Business Alignment, Interesting Trends, and anything else that interests me this week...

May, 2007

  • Inside Architecture

    System Reliability requires Message Durability (immature WCF)


    WCF is a very cool technology.  Microsoft has moved the goalposts in the messaging space with this one, and I'm a huge fan.  However, there is a limitation that is painful to live with: the lack of a routable, intermediable, declared message durability option.

    Sure.  There's MSMQ.  Great.  If you (a) control both ends, (b) are willing to lose intermediability, (c) are happy with one-way communication, and (d) have no network firewall or NAT issues.  Zowie.

    What about the real promise of WCF: to put cross-cutting concerns into the infrastructure, to let developers focus on the message while the infrastructure focuses on the transport?

    Some folks are asking that WCF integrate with SSB (SQL Server Service Broker).  I think that is a very short-sighted answer.  Messaging sits above storage on the stack, so the message infrastructure needs to control the storage.  SSB is written from the other angle: with storage first, and messaging below it.  It is possible to prop this up, but since SSB is doing a lot of the work that WCF is doing, the best way would be to take a different approach altogether.

    In my opinion, we should be able to configure an adapter in WCF where we declare a durable endpoint and configure SQL Server (if we choose, or Oracle or MySQL) as the storage mechanism.  We can then rely on WCF not only to send the message but to send it in such a way that it won't be lost if the other side is down, or if I crash before getting an acknowledgement.  ACID transactions.  I know... I'm asking a lot.  Not more than others.  Consider me one more voice in the chorus.

    BTW: WCF does have reliable messaging... in memory.  It has durable messaging, in MSMQ.  The limitations of this approach were nicely described by Matevz Gacnik in a blog post from last February.  Matevz is an MVP and I like his writing style.


  • Inside Architecture

    Reliability in SOA is HUGE


    A colleague of mine, Dottie Shaw, blogged recently about why Durable Messaging matters.  I agree with everything she says.  Even more so, I'd add that of the system quality attributes, the one that is most endangered by the SOA approach, and therefore the one that we need to be the most aware of, is reliability.

    Reliability takes many forms, but the definition that I work from comes from the IEEE.  IEEE 610.12-1990 defines Reliability as "The ability of a system or component to perform its required functions under stated conditions for a specified period of time."

    This becomes a problem in SOA because the basic strength of SOA is the message, and the weakest link is the mechanism used to move the message.  If we create a message but we cannot be certain that it gets delivered, then we have created a point of failure that is difficult to overcome.

    One friend of mine, Harry Pierson, likes to point out that the normal notion of 'Reliable Messaging' is not sufficient to provide system reliability.  You need more.  You need durable messaging.  Durable messaging is more than reliable messaging, in his lexicon, because durable messages are stored and forwarded.  Therefore, if a system goes down, you can always rely on the storage mechanism to keep the message from being lost.  Reliable messages are kept in memory and simply retried until acknowledged, but they are lost if the sending system goes down during the process.

    Of course, Harry and Dottie are not alone in this.  In fact, when discussing reliability these days, web authors have started clubbing the terms together for clarity.  Just search on "reliable durable messages" to get a feel for how pervasive these linguistic gymnastics have become.  Clearly, messages have to be durable in order to improve system reliability.  Discussing one without the other has become passé.

    Note that I view durability as an attribute of the message.  I view reliability as a measurable condition of a system, usually measured in Mean Time Between Failures (MTBF).  What becomes clear from this thread is this: in order to increase system reliability, especially in a system based on messages, we need to ensure message delivery, and the best way to do this is through message durability.

    So, we need message durability to get system reliability.  Cool.

    Where do we get it from?

    Well, durability requires that a message be stored and that a mechanism exist to forward it.  (You heard me right... I just equated 'durability' to store-and-forward.  Prove me wrong.  Find a single durable system that doesn't, essentially, store the message and then forward it.)

    By separating storage from forwarding, we get durability.  The message is saved, and the time and place when it is forwarded is decoupled from the system that sends it.  Of course, the most demanding folks will ask for more than simple durability.  They will ask that messages be sent once and in order.  Not always needed, but nice when you can get it.
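
    To make "once and in order" concrete, here is a minimal sketch of one common way to get it on the receiving side: the sender numbers each message, and the receiver drops duplicates and holds early arrivals until the gap is filled.  The class name, the `seq` numbering starting at 1, and the `handle` callback are all assumptions made for illustration; none of this comes from WCF, MSMQ, or SSB.

```python
from typing import Callable, Dict


class InOrderReceiver:
    """Hypothetical receiver that hands each message to `handle` exactly once,
    in sequence-number order, even if the transport retries or reorders."""

    def __init__(self, handle: Callable[[str], None]) -> None:
        self._handle = handle
        self._next_seq = 1               # assumed: sender numbers messages from 1
        self._held: Dict[int, str] = {}  # early arrivals waiting for a gap to fill

    def receive(self, seq: int, body: str) -> None:
        # Already processed or already held: a duplicate, so ignore it.
        if seq < self._next_seq or seq in self._held:
            return
        self._held[seq] = body
        # Release every message that is now contiguous with what was processed.
        while self._next_seq in self._held:
            self._handle(self._held.pop(self._next_seq))
            self._next_seq += 1
```

    Feeding this receiver messages 2, 1, 2 (a retry), and 3 results in the handler seeing 1, 2, 3, each a single time.  Note that the sketch keeps its state in memory; a real receiver would have to store `_next_seq` durably as well, for the same reasons discussed above.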

    So, in your SOA architecture, consider this: if you are sending messages from one point to another, and you wish to increase the reliability of your system, you need to find a way to store your message first, and then forward it. 
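
    As a rough illustration of "store first, then forward," the sketch below uses SQLite as the durable store.  The table name `outbox`, the function names, and the `send` callback (which returns True when the receiver acknowledges) are assumptions for the example, not part of any Microsoft product.

```python
import sqlite3
from typing import Callable


def open_outbox(path: str = ":memory:") -> sqlite3.Connection:
    """Open the durable store. Pass a file path in practice so that pending
    messages survive a crash of the sending process (":memory:" does not)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " body TEXT NOT NULL,"
        " delivered INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def store(conn: sqlite3.Connection, body: str) -> int:
    """Step 1: persist the message before any attempt to send it."""
    cur = conn.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid


def forward_pending(conn: sqlite3.Connection, send: Callable[[str], bool]) -> int:
    """Step 2: forward stored messages in the order they were stored.
    A message is marked delivered only after `send` acknowledges it, so a
    failed attempt leaves it in the store for the next pass."""
    rows = conn.execute(
        "SELECT id, body FROM outbox WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for msg_id, body in rows:
        if not send(body):
            break  # receiver unavailable: stop here to preserve order
        conn.execute("UPDATE outbox SET delivered = 1 WHERE id = ?", (msg_id,))
        conn.commit()
        delivered += 1
    return delivered
```

    The caller invokes `store` inside its own unit of work and runs `forward_pending` on a timer or background job.  Because a crash between the send and the update causes a resend, this gives at-least-once delivery; the receiver still has to tolerate duplicates.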

    To build a quality system, however, you want to consider more than one System Quality Attribute.  Sure, reliability is important, but if I build a system that is reliable yet brittle, I'd be a poor architect indeed.

    We need to consider reliability... and... Agility, Flexibility, Scalability, Maintainability, and all the rest.  Just as SOA reliability requires durability, SOA flexibility and SOA agility both require the use of standard transport mechanisms.  SOA scalability and maintainability both require intermediability.  So we need a solution that doesn't sacrifice one for another.

    Unfortunately, our platform is lacking here.  To solve this problem, we need a mix of WCF, SSB, BizTalk, and good old fashioned code.  MSMQ should be able to do this, and it gets kinda close, but it sacrifices ease of operations, so no easy answer there.

    On the project I'm on, we are using BizTalk for transactional messages, and for data syndication, we wrote our own mechanism based on SQL Agent and a durable protocol that gives us reliability without sacrificing intermediability and standard protocols.

    Now if I could only get that out of the box...

  • Inside Architecture

    Go Build an Enterprise Architecture


    The word "architecture" is an odd one. It is used in many ways, including to describe the interrelationship of components within a system. 

    But does it apply to the enterprise?  Not sure. 

    Many times, the practice of Enterprise Architecture has been compared to city planning.  We've been compared to zoning boards, and planning councils and even electric utilities. 

    None of those organizations call their work "architecture."

    This is probably because the analogies to architecture, at the city level, fall apart.  Cities change constantly.  They grow organically.  The limits on a city's growth are not normally a result of the zoning process.  Limits are much more likely to come from geography, or even acts of nature like fire, flood, and earthquake, than from a group of planners in a city office.

    So when we talk about Architecture, at the enterprise level, are we mixing our metaphors?  Are we making an assumption about the nature of change, and the nature of the ecosystem, that doesn't make sense?  Worse, are we misleading our customers, and ourselves, by using this word?

    To most people, architecture is 'hard edged.'  Architects design things that you can touch.  Their buildings have boundaries and walls and light fixtures, and those things last for decades.  But in IT, at the enterprise level, this comparison doesn't make sense.  The boundaries of an enterprise IT infrastructure are like the boundaries of a community.  They change, sometimes very quickly, to respond to the needs of the business.

    So why do we call this "Enterprise Architecture?" 

    At the moment, I'm not sure. 

  • Inside Architecture

    Is an entity service an 'antipattern'?


    I've seen many folks who have come to the conclusion that CRUD services (aka 'Entity services') are an antipattern.  Most recently, Udi Dahan asked me if I felt this way.


    I have an interesting job at the moment.  While the Enterprise Architecture team in Microsoft has been primarily focused on Governance, a project in the team that I was 'assigned to oversee' really needed an architect, so the Vice President of IT personally drafted me into that project.  (I consider that an honor.)  Unfortunately, it means I have two jobs (at least for now, but movement is coming). 

    • One job: look at the Enterprise... what is right, what is composable, and where does business agility lie.
    • The other job: look at the problem space in an enterprise system: what is decoupled, what is interchangeable, and what is less costly to own.

    And that is why I am slow to dismiss Entity services. 

    Because, as long as we are not sharing an entity service outside of a fairly tightly constrained 'area', I have no problem with creating an entity service and composing two or three activity services out of it.  I wouldn't share the entity service beyond that tight space.  I may still want to intermediate, but primarily for operational things like end-to-end transaction tracing and uptime monitoring... not to add logic.
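
    To show what "intermediate for operational things... not to add logic" can look like, here is a minimal sketch of an intermediary that wraps a service call, carries a trace id end to end, and records success and timing.  The `Service` type, the `trace_id` message field, and the in-memory `trace_log` are assumptions for illustration only.

```python
import time
import uuid
from typing import Any, Callable, Dict, List

# Assumed shape of a service for this sketch: a message in, a reply out.
Service = Callable[[Dict[str, Any]], Dict[str, Any]]


def intermediate(service: Service, name: str,
                 trace_log: List[Dict[str, Any]]) -> Service:
    """Wrap `service` with tracing and uptime data. The wrapper never changes
    the business content of the request or the reply."""

    def wrapper(message: Dict[str, Any]) -> Dict[str, Any]:
        # Reuse the caller's trace id so one transaction can be followed
        # across services; mint one if this is the first hop.
        trace_id = message.get("trace_id") or str(uuid.uuid4())
        started = time.perf_counter()
        ok = False
        try:
            reply = service({**message, "trace_id": trace_id})
            ok = True
            return reply
        finally:
            # Recorded on success and on failure, which is what an uptime
            # monitor needs to see.
            trace_log.append({
                "service": name,
                "trace_id": trace_id,
                "ok": ok,
                "seconds": time.perf_counter() - started,
            })

    return wrapper
```

    An activity service would call `intermediate(order_entity_call, "order-entity", log)` in place of the raw entity call; if the entity service raises, the exception still reaches the caller, and the log entry records `ok` as False.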

    The benefits of an entity service are all about decoupling.  I can create an entity service on a single business entity off the data in one application, and then, as the business moves to another application, I can create another entity service with the same interface.  I will STILL have to change my activity services (I'm not a fool), but hopefully I can minimize the change and therefore reduce the cost of change.
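
    A small sketch of that decoupling, under invented names: two entity services over two hypothetical applications expose the same interface, and an activity service is written against the interface only.  The class names, field names, and data shapes are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class CustomerEntityService(Protocol):
    """The shared interface; activity services depend on this, not on an app."""

    def get(self, customer_id: str) -> Optional[Dict[str, str]]: ...


class LegacyCrmCustomers:
    """Entity service over the first (hypothetical) application's data."""

    def __init__(self, rows: Dict[str, Tuple[str, str]]) -> None:
        self._rows = rows  # customer id -> (name, email)

    def get(self, customer_id: str) -> Optional[Dict[str, str]]:
        row = self._rows.get(customer_id)
        if row is None:
            return None
        return {"id": customer_id, "name": row[0], "email": row[1]}


class NewCrmCustomers:
    """Entity service over the replacement application; same interface."""

    def __init__(self, docs: List[Dict[str, str]]) -> None:
        self._docs = docs  # records with customerId / fullName / emailAddress

    def get(self, customer_id: str) -> Optional[Dict[str, str]]:
        for doc in self._docs:
            if doc["customerId"] == customer_id:
                return {"id": customer_id, "name": doc["fullName"],
                        "email": doc["emailAddress"]}
        return None


def welcome_customer(customers: CustomerEntityService, customer_id: str) -> str:
    """Activity service composed on top of whichever entity service is passed."""
    customer = customers.get(customer_id)
    if customer is None:
        raise LookupError(f"unknown customer {customer_id}")
    return f"Welcome, {customer['name']} <{customer['email']}>"
```

    This shows the best case, where the activity service does not change at all.  In practice some differences between the two applications will leak through the interface, which is why the goal is to minimize the change rather than eliminate it.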

    I cannot come up with an example of using an Entity service based on 'integration' because I cringe at the notion.  If my order management system wants to create a customer in my relationship management system, I will NOT use a CRUD service.  On the other hand, if my order management system needs to present a couple of different 'activities' to the world, I might stand up an entity service 'under the covers.'  'Assert an Order' (idempotent create) and 'Cancel an Order' (idempotent delete) are both Activity services.  They may both use an 'Order Entity' service, for example, that is not visible to the enterprise but is still used to compose the activity services.
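
    Here is a minimal sketch of that arrangement: two idempotent activity services composed over a private order entity service.  The in-memory dictionary, the function names, and the returned status strings are assumptions for illustration; a real entity service would sit on the application's own data store.

```python
from typing import Dict, Optional


class OrderEntityService:
    """CRUD over orders. Kept 'under the covers': not visible to the enterprise."""

    def __init__(self) -> None:
        self._orders: Dict[str, Dict[str, str]] = {}

    def read(self, order_id: str) -> Optional[Dict[str, str]]:
        return self._orders.get(order_id)

    def create(self, order_id: str, data: Dict[str, str]) -> None:
        self._orders[order_id] = dict(data)

    def delete(self, order_id: str) -> None:
        del self._orders[order_id]


def assert_order(entity: OrderEntityService, order_id: str,
                 data: Dict[str, str]) -> str:
    """'Assert an Order': idempotent create. Repeating the same call is safe."""
    existing = entity.read(order_id)
    if existing is None:
        entity.create(order_id, data)
        return "created"
    if existing == data:
        return "unchanged"  # a retry of the same message: nothing to do
    raise ValueError(f"order {order_id} already exists with different content")


def cancel_order(entity: OrderEntityService, order_id: str) -> str:
    """'Cancel an Order': idempotent delete. Cancelling twice is not an error."""
    if entity.read(order_id) is None:
        return "absent"
    entity.delete(order_id)
    return "cancelled"
```

    Calling `assert_order` twice with the same order returns "created" and then "unchanged", and calling `cancel_order` twice returns "cancelled" and then "absent".  That is what makes these activities safe to use with a durable transport that may deliver a message more than once.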

    So, is an Entity service an antipattern?  From the enterprise perspective, yes.  From the application perspective, no.  (These are not in conflict.)  I guess it depends on where you stand.

  • Inside Architecture

    Layers and Layers of Services... yep... no confusion there


    Udi Dahan posted an excellent and thoughtful opinion about one of my posts on intermediation.  I really like his thinking.  One thing though, is that I don't see a lot of disagreement about concepts... it appears to be more about terminology. 

    It is so much more fun to disagree on the actual concepts.  Alas, terminology haunts us in this business.  That's why I really like the article by Shy Cohen in this month's Architecture Journal that goes into the differences between service types.

    From what I can tell, Udi was failing to find value in intermediation because his Process Services already handled the composition of other Activity and Capability services under the covers.  Therefore, intermediating between the composing application and the process service didn't make sense. 

    My statement was that there is value in intermediating between the process and the capability services and/or intermediating between the activity and capability services.  There is also value in intermediating between the capability and entity services.

    I agree with Udi in this: There is less visible value in intermediating between a composed application and a process service.

    IMHO, this is because there is usually a great deal of business context implied in a process service.  Therefore, the ability to get value out of intermediation of a service is inversely proportional to the amount of deep business context that the service encapsulates.  That doesn't mean that it is impossible.  It is not.  But it is more difficult.  So if you want your apps to call the top-level services using a non-interceptable protocol, that is probably not a major obstacle to agility.  Of course, in the world of SQL Server Service Broker, this is nearly never the place where a service call is being made.

    As far as differentiating composable vs. non-composable services: many entity services and a few bus services are not composable.  I don't think that all services need to be.  Some clearly do, and those need to be centrally managed.  Non-composable services have their place.

    I hope this clears up what appears to be a disagreement between Udi and me.  We are on the same page.  We are just using the same words in a different way.

  • Inside Architecture

    Why you need an Enterprise SOA Planning Governance Framework


    I'm looking at a problem that occurs in different enterprises, from Microsoft to Government, heavy manufacturing to healthcare, distribution to e-commerce.  It doesn't occur in every one, certainly not the smallest or the most distributed, but it occurs inevitably as companies grow, and it is in every major corporation.

    The Problem is too much overlapping, non-integrated code

    In many companies, the job function of IT is basically to write code.  This is a problem because sometimes, you shouldn't write code.  Sometimes, the problem is best solved by NOT writing code or replacing written code with configured code.  (Best solved = lower cost for the enterprise, better features for the customer, quicker turnaround on value).

    This is not the result of poor intent.  The problem is that we write code when we shouldn't.  We write code to get a feature when we should integrate with another app that has a similar feature.  We don't use what we have.  If it were a 'few incompetent people' causing this problem, then the problem wouldn't be widespread.  The problem is widespread in every IT group I've worked with in a wide variety of industries and government, both in consulting and as an employee.  It is in Microsoft as well.

    This problem is systemic.  We need a systemic solution.  In Microsoft we are solving the problem.  My goal is to share this solution with you (and improve it along the way).  Of course, to solve a problem, you need to look at the root causes.

    From my analysis, I believe that this problem is the result of a lack of visibility, misguided planning, and/or gaps in accountability.  I will cover each.

    (Call for feedback: After reading the section below, if you feel that there is a "cause" for this problem that I have missed, and which MUST be covered in order to get folks past the barrier of "no one reuses my reusable service," then let me know.  I make no claim that my experience represents everyone else's.  There are great companies out there.  If yours does not have this problem, look around... tell me if these three elements have been addressed.)

    Gaps in accountability

    A gap in accountability occurs when no one is responsible for looking for the situations when code should not be written.  Your organization can solve this by adding an oversight function SPECIFICALLY chartered to look for situations where integration and service composition were not considered, or not considered important.

    In MSIT, this responsibility falls to Enterprise Architecture, but in your organization, it could be in any group that is not chartered to build a project (to avoid conflict of interest).  Consider a central PMO or Central Planning group.

    Misguided Planning

    Composing a SOBA is a different animal than "adding a feature to app X."  It requires forethought and readiness.  You have to have a list of appropriate services to draw upon.  Even if those services don't (yet) exist, the project that wants to compose them can pay for them to come into existence, but someone still has to create the list. 

    Without having teams looking for the services that "should" be present, or "could" be present, there is no way to have a list of potential services for a SOBA project to compose from.  This is what I call "SOA Planning Governance."

    Lack of Visibility

    There is a lot going on in a Corporation, and no one wants to spend a lot of money on SOA Planning Governance, so it helps to have the framework in place to handle things like this.  A framework means that we can give people guidance on an area that no one has planned yet, on demand. 

    So when the request comes in to "plan for business need N", we can quickly find the relevant areas of the corporation's IT infrastructure and assign resources with a starting project plan.  Those people follow a well understood process to gather information, produce an analysis, and deliver value.  They then go on to other work.

    The analysis is used to guide project planning, but the people doing this work are not part of the normal project team.  They just do this planning work, over and over.  In Microsoft, I could keep four teams busy doing this for five years, easy.

    The Solution is an SOA Planning Governance Framework

    To address this, I believe that every large organization (facing this problem) needs a SOA Planning Governance Framework.  (That's a mouthful.  Say that 10 times fast ;-). 

    Composing the term SOA Planning Governance Framework 

    • A planning framework is a set of models and processes that conceptually create a "space" where planning can occur.  The effort of planning is constrained and guided, which allows it to be completed in a timely manner while producing good results.
    • Adding the term "Governance" means that the processes are designed to encourage alignment with business goals, and that misalignments will be captured, reported and visible.  The implicit effect is that consequences for alignment will be present as well.  Those consequences (good or bad) are outside the scope of the framework but essential to making the system work.  People do what you pay them to do.
    • A SOA planning governance framework is a framework that helps to plan SOA services when they are strategic and aligned to business goals.  It could, theoretically, be used to plan for a DLL or other reusable integration object as well, where it makes sense to create one.

    Many of you may have seen me blog about Solution Domain Architecture (SDA).  Solution Domain Architecture is a SOA Planning Governance Framework. 

    Benefits of Solution Domain Architecture

    With SDA, you get:

    • A well crafted and flexible mechanism for grouping the systems in your organization by features and data (commonality) while factoring out roles and business processes (variability) that should be cross cutting concerns for composable services.
    • A process and tools for creating and running a planning team that goes in, solves the problem, and comes out quickly with an analysis that is useful.
    • A list of services that may not exist, but should exist, for the enterprise to use.  When a project chooses to use a service that doesn't exist yet, then the project gets to pay for the service to come into existence (or a separate fund can be set up to feed this funding request).  The point is that the services can be created "on-demand" and not in advance of an actual customer.

    Note that SDA is part of the overall mix, but it doesn't solve every problem.  No technique solves every problem.  The following tasks are not solved in Solution Domain Architecture.

    • SDA doesn't "design" the services.  
    • SDA doesn't define what fields go into your canonical schema for 'order' or 'customer'.
    • SDA doesn't create spanning layers or MDM scenarios. 
    • SDA doesn't define when you would use ETL, SSB, or ESB to move data.

    Solution Domain Architecture helps to create the right list of services for the organization to build.  You still need people to design them, build them, and operate them. 

