One thing I've been thinking and talking about for the past few weeks is the relationship between four different concepts, a relationship that I didn't fully grasp at first but have become more convinced of as time wears on. Those terms are: Enterprise Canonical Data Model, Canonical Message Schema, Event Driven Architecture, and Business Event Ontology.
I understood a general relationship between them, but as time has passed and I've been placing my mind directly in the space of delivering service-oriented business applications, the meanings have crystallized and their relationship has become more important. First, some definitions from my viewpoint.
I guess what escaped me, until recently, was how closely related these concepts really are.
The way I'm approaching this starts from the business goal: use data to drive decisions. Therefore, we need good data. In order to have good data, we need to either integrate our applications or bring the data together at the end. Either way, if the data is used consistently along the way, we will have a good data set to report from at the end.
To create that consistency, we need the Enterprise Canonical Data Model. Creating this bird is not easy. It requires a lot of work and executive buy-in. Note that the process of creating this model can generate a lot of heated discussions, mostly about variations in business process. Usually the only way to mitigate these discussions is to create a data model that contains either none of the variations between processes, or contains them all. Neither direction is "more correct" than the other.
However, in order to integrate the applications, either along the way or at the end of the data-generation processes, we need to use a particularly constrained definition of Canonical Schema: the Enterprise Canonical Message Schema is a subset of the Enterprise Canonical Data Model that represents the data we will pass between systems, limited to the data that many people feel would be useful to share. Note that we added a constraint over the definition above. Not only are we sharing the data, but we are sharing the data from the Enterprise CDM.
By constraining our message schema to the elements in the Enterprise Canonical Data Model, we radically reduce the cost of producing good data "at the end" because we will not generate bad data along the way. The key word is "subset." If you create a canonical schema without a canonical data model, you are building a house on sand. The CDM provides the foundation for the schema, and creating the schema first is likely to cause problems later.
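To make the "subset" constraint concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the entity and attribute names, the dictionary representation of the model, and the helper names `schema_violations` and `is_subset_of_cdm` are all hypothetical. It only shows the idea that every element of a message schema must already exist, with the same type, in the canonical data model.

```python
# Hypothetical slice of an Enterprise Canonical Data Model:
# entity name -> attribute name -> type name. All names are illustrative.
CANONICAL_DATA_MODEL = {
    "Prospect": {
        "prospect_id": "string",
        "full_name": "string",
        "email": "string",
        "assigned_to": "string",
        "created_on": "date",
    },
    "Customer": {
        "customer_id": "string",
        "full_name": "string",
    },
}

# Hypothetical message schema that reuses only elements of the model above.
PROSPECT_ASSIGNED_SCHEMA = {
    "Prospect": {
        "prospect_id": "string",
        "assigned_to": "string",
    },
}


def schema_violations(cdm, message_schema):
    """Return (entity, attribute) pairs that are missing from the model
    or declared with a different type than the model uses."""
    violations = []
    for entity, attributes in message_schema.items():
        canonical_attributes = cdm.get(entity, {})
        for name, type_name in attributes.items():
            if canonical_attributes.get(name) != type_name:
                violations.append((entity, name))
    return violations


def is_subset_of_cdm(cdm, message_schema):
    """True when the message schema introduces nothing outside the model."""
    return not schema_violations(cdm, message_schema)
```

A check like this could run whenever a team proposes a new message schema, so that a schema drifting away from the model is caught before anyone depends on it.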
Therefore, for my friends still debating if we should do SOA as a "code first" or "schema first" approach, I will say this: if you want to actually share the service, you have no choice but to create the service "schema first" and even then, only AFTER a sufficiently well understood part of the canonical data model is described and understood.
And for my friends creating schemas that are not a subset of the overall model, time to resync with the overall model. Let's get a single model that we all agree on as a necessary foundation for data integration.
The next relationship is between the Canonical Message Schema and the Event Driven Architecture approach. If you build your application so that you are sending messages, and you want to create autonomy between the components (goodness), you need to send data that has a well understood interpretation and as little "business rule baggage" as you can get away with. What better place than the Canonical Data Model to get that understanding? Now, this is no longer an academic exercise. Creating the enterprise level data model provides common understanding, so that these messages can have clear and consistent meaning. That is imperative to the notion of Event Driven Architecture, where you are trying to keep the logic of one component from bleeding over into another.
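As a rough illustration of that autonomy, the sketch below uses a tiny in-process publish/subscribe bus. The `CanonicalMessage` and `EventBus` classes, the field names, and the two example subscribers are hypothetical, and a real system would use a message broker rather than an in-memory dictionary. The point is only that the publisher knows nothing about its consumers, and the consumers rely on nothing but the shared message.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalMessage:
    """A message whose payload uses only canonical field names (assumed)."""
    event: str
    payload: dict


class EventBus:
    """Minimal in-memory publish/subscribe bus, for illustration only."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, handler):
        # Register a callable to be invoked for every message of this event.
        self._subscribers[event].append(handler)

    def publish(self, message):
        # Deliver the message to each subscriber of its event, in order.
        for handler in self._subscribers.get(message.event, []):
            handler(message)


# Two independent components that share only the message definition.
marketing_inbox = []
reporting_inbox = []

bus = EventBus()
bus.subscribe("prospect archived", marketing_inbox.append)
bus.subscribe("prospect archived", reporting_inbox.append)

bus.publish(CanonicalMessage("prospect archived", {"prospect_id": "P-1"}))
```

Neither subscriber carries any of the publisher's business rules. Each one interprets the payload using the shared model and nothing else.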
The business event ontology defines the list of events that will occur that require you to send data. Creating an ontology requires that you understand the process well enough to generalize the process steps into common-held sharable events. To get this, the data shared at the point of an event should be in the form of an Enterprise Canonical Message Schema.
Therefore, to summarize the relationship:
Business Events occur in a business, causing an application to send a Canonical Message to another application. The Canonical Message Schema is a subset of the Canonical Data Model. Event Driven Architecture is most efficient when you send a Canonical Message Schema message between components. This provides you with more consistent data, which is better for creating a business intelligence data warehouse at the end.
Some agility notes:
The list of business events in a prospect ontology may include things like "receive prospect base information", "receive prospect extended information", "prospect questionnaire response received", "prospect (re)assigned", "prospect archived", "prospect matched to existing customer", "prospect assigned to marketing program," etc. It is not a list of process steps. Just the events that occur as inputs or outputs.
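One way to picture such a list is as an enumeration in which each event names the canonical fields its message must carry. This is only a sketch: the event labels come from the list above, but the enum member names, the required-field sets, and the `build_message` helper are assumptions made for illustration.

```python
from enum import Enum


class ProspectEvent(Enum):
    """Business events for the prospect domain (labels from the list above)."""
    BASE_INFO_RECEIVED = "receive prospect base information"
    EXTENDED_INFO_RECEIVED = "receive prospect extended information"
    QUESTIONNAIRE_RESPONSE_RECEIVED = "prospect questionnaire response received"
    REASSIGNED = "prospect (re)assigned"
    ARCHIVED = "prospect archived"
    MATCHED_TO_CUSTOMER = "prospect matched to existing customer"
    ASSIGNED_TO_MARKETING_PROGRAM = "prospect assigned to marketing program"


# Hypothetical canonical fields each event's message must carry.
# Events not listed here only require the prospect identifier.
REQUIRED_FIELDS = {
    ProspectEvent.REASSIGNED: {"prospect_id", "assigned_to"},
    ProspectEvent.MATCHED_TO_CUSTOMER: {"prospect_id", "customer_id"},
    ProspectEvent.ASSIGNED_TO_MARKETING_PROGRAM: {"prospect_id", "program_id"},
}


def build_message(event, payload):
    """Build a message for an event, rejecting payloads that omit required fields."""
    required = REQUIRED_FIELDS.get(event, {"prospect_id"})
    missing = sorted(required - set(payload))
    if missing:
        raise ValueError(f"{event.value}: missing fields {missing}")
    return {"event": event.value, "payload": dict(payload)}
```

Note that the enumeration lists inputs and outputs only. Nothing in it describes the steps a system takes between receiving one event and emitting the next.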
Clearly, this list can be created in iterations, but if it is, you need to make sure that you capture all of the events that surround a particular high level process rather than focusing on the technology. In other words, the business processes of "qualify prospect" or "validate order" may have many business events associated with them, and those events may need to touch many applications and people. If you decide to focus on "qualify prospect" first, then understand all of the events surrounding "qualify prospect" before moving on to "validate order," but if both processes hit your Customer Relationship Management system, focus on the process, not the system.
It is an efficient & Agile way to break down the Business Model/Architecture.
I agree with you that, as our SOA implementation reaches some maturity, we do tend to think along these lines. But keeping these in mind beforehand will save us a lot of time and confusion.
Thanks for relating these concepts..
Now you definitely are my friend, Nick...
From this viewpoint EDA might be seen as of a higher architectural magnitude than SOA as the eventing pattern puts constraints on the services and not vice versa: http://soa-eda.blogspot.com/2007/06/magical-of-soa-and-eda.html
I still don't believe in "one true schema", but we're on the same page wrt EDA and semantic covenants:
I don't think this is about "one true schema", but about a mechanism to map different schemas to be able to pass semantics across different environments:
What else would you suggest to accomplish this, Kjell-Sverre?
I could not have said it better myself.
The goal is to figure out how to communicate. Think of this like the diplomatic community. In a country, a lot goes on that the diplomatic community is not really worried about. However, when we want to talk between ourselves (to create an international treaty, for example), we need to have a common language to negotiate and sign the treaty in. That common language (not the content of the treaty) is the stuff defined by the Enterprise Canonical Data Model.
+1. Like Jack, I really like how you've laid out the concepts. Very nice. App-independent messages are a key decoupling mechanism.
"Creating this bird is not easy. It requires a lot of work and executive buy-in."
Indeed. The difficulty in creating ECDM and ECMS cannot be overstated, IMO. This can be really hard--especially when the diplomats in the community aren't all that interested in participating in the exercise--"just send me the data I need". "Didn't we just do this for data warehousing?"
Lastly, you touched on this a little, but this exercise should not lose sight of the business processes. Only in the context of the processes do the events make sense. Integration, IMO, is best served by a "process first" approach. Events and services come after.
I agree with a process first approach and I think that's one reason why the business event ontology is so important.
That said, processes have a hierarchy. We speak of Level 1 processes like Marketing, Sales, Fulfillment, etc. Level 2 processes would be under one of those top level ones. For example, under Marketing would be things like Create Market Strategy, Segment Market, Build Programs, Execute Programs, Capture Response.
That is the level where the business event ontology really hits home. This is because these Level 2 processes are the domains of large systems. You will tend to find a system that spans the process, largely from end to end. The business likes looking at data within these buckets, and cares a lot less about the individual data elements flowing between them.
So I agree, we start with process. On the other hand, I caution teams not to go all the way down to level 4 and level 5 before starting on Integration and Services. While there are likely to be services developed to support level 4 processes, they will be built in the context of a single system and don't need to be architected "from the center." They need to be architected in the project that builds or maintains the systems that serve those needs.
Process First, but stop before you go too deep.
It was the part about the CDM being like a shared relational database that led me to think about "one true schema" and other common data model approaches. Jack's post makes the distinction between the common data format and the message metadata CDM, and I recommend reading his post first :)
In a SOA, to be effective, we need to share both data and events. Events, as I have discussed before
In the practical world, a process may need to be put in place by SOA governance to make all this work effectively. I have been on consulting projects where the enterprise had the Canonical Model (not the database) in place. The process of subsetting was so cumbersome that we ended up creating our own "Rider" data model and schemas. Perhaps SOA platform vendors will start providing some kind of tool in the future to ease subsetting, transforming, and versioning from the Enterprise repository?
I'd love to hear more. Can you send me an e-mail directly? I'd like to know what, in the enterprise you worked in, made it so difficult to create a subset message from the Canonical Model?