-
There are many reasons, sure, and there are probably also reasons why plain text files can be better, but I would like to point out just one, because I am fighting with it right now:
Xml is human readable
Or at least, it should be.
I’m dealing with the HL7 standard for healthcare. HL7 files are text files with some strange delimiters such as ^ and |. Luckily we can use the BizTalk HL7 Accelerator, which allows us to abstract away from the HL7 details.
A sample of an HL7 file:
MSH|^~\&|REG|MCM|BTS||199601121005||ADT^A04|000001|P|2.2
EVN|A04|199601121005||01||199601121000
PID|||191919^^^MYHOS^MR~123-45-6789^^^USSSA^SS|253763|SMITH^JOHN^Q||19560129|M|||123MAIN^^BUFFALO^NY^98052^""||(123)555-0100||S|M|10199925^^^MYHOS^AN|123-45-6789
PD1|S|F|NormalString^A^+1^-1^ISO^simpletext&Test&HCD^GI^simpletext&NormalString&ISO^I|NormalString^Test&Test^Test^Test^Test^Test^AE^simpletext^simpletext&Test&ISO^P^NormalString^M10^MC^simpletext&NormalString&HCD^A|N|simpletext|I|I|N|NormalString^+1^M11^simpletext&NormalString&L,M,N^RRI^simpletext&NormalString&HCD|NOVALUE^NormalString^Test^Test^NormalString^Test|N
PV1|1|I|2000^2012^01^hey&test&DNS^test^test^test^test^test||||004777^MILLER^CONNIE^A.|||SUR||||2|A0
Where is the Patient Name? It is “the substring between the fifth and the sixth | (pipe), in the third line (the line starting with PID). And remember, the parts of the name are separated by ^ (strange little hat)”
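To make the pain concrete, here is a minimal Python sketch of that rule applied to the PID line of the sample. It assumes the default delimiters used in the sample (| between fields, ^ between the parts of a field); the function name patient_name is made up for illustration and the PID line is shortened.

```python
# Minimal sketch: pull the patient name out of a raw HL7 v2 PID segment.
# Assumption: default delimiters, as in the sample (| for fields, ^ for parts).
PID_SEGMENT = (
    "PID|||191919^^^MYHOS^MR~123-45-6789^^^USSSA^SS|253763|"
    "SMITH^JOHN^Q||19560129|M"
)


def patient_name(segment: str) -> dict:
    """Hypothetical helper: return the name parts found in PID field 5."""
    fields = segment.split("|")
    # fields[0] is the segment id ("PID"), so fields[5] is the text
    # between the fifth and the sixth pipe.
    parts = fields[5].split("^")
    # Pad so that missing trailing parts become empty strings.
    family, given, middle = (parts + ["", "", ""])[:3]
    return {"family": family, "given": given, "middle": middle}


print(patient_name(PID_SEGMENT))
```

It works, but every consumer of the file has to know the position rule by heart, which is exactly the problem.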
The HL7 Accelerator comes with Xsd schemas to map these flat files. A sample message of type ADT A04 (the one above) looks something like this (just a small piece):
<ns0:ADT_A04_22_GLO_DEF xmlns:ns0="http://microsoft.com/HealthCare/HL7/2X">
<EVN_EventType>
<EVN.1_EventTypeCode>A04</EVN.1_EventTypeCode>
<EVN.2_DateTimeOfEvent>199601121005</EVN.2_DateTimeOfEvent>
<EVN.3_DateTimePlannedEvent>199601121000</EVN.3_DateTimePlannedEvent>
<EVN.4_EventReasonCode>01</EVN.4_EventReasonCode>
</EVN_EventType>
<PID_PatientIdentification>
<PID.1_SetIdPatientId>191919</PID.1_SetIdPatientId>
<PID.2_PatientIdExternalId>
<PID.5_PatientName>
<PN.0_FamiliyName>Doe</PN.0_FamiliyName>
<PN.1_GivenName>John</PN.1_GivenName>
</PID.5_PatientName>
[…]
We still deal with HL7 codes and semantic structure, but it’s much easier to work with the Patient Name. It's located in “the FamilyName element under PatientIdentification” :-)
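For comparison, a small Python sketch of the same lookup against the Xml form. The fragment below is a trimmed, well-formed version of the piece shown above (that trimming is my assumption; the real message has many more elements), and first_text is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: a trimmed, well-formed version of the fragment shown above.
XML = """<ns0:ADT_A04_22_GLO_DEF xmlns:ns0="http://microsoft.com/HealthCare/HL7/2X">
  <PID_PatientIdentification>
    <PID.5_PatientName>
      <PN.0_FamiliyName>Doe</PN.0_FamiliyName>
      <PN.1_GivenName>John</PN.1_GivenName>
    </PID.5_PatientName>
  </PID_PatientIdentification>
</ns0:ADT_A04_22_GLO_DEF>"""


def first_text(root, tag):
    """Hypothetical helper: text of the first element with this tag, or None."""
    element = next(root.iter(tag), None)
    return element.text if element is not None else None


root = ET.fromstring(XML)
family = first_text(root, "PN.0_FamiliyName")
given = first_text(root, "PN.1_GivenName")
print(family, given)
```

No counting of pipes: the element name says what the value is.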
-
I've been playing around with a funny error message in ConfigFramework. It says "The Operation Completed Successfully", but it's a Retry/Cancel message box, where you get looped until you press Cancel... The details found in the log file talk about a failure to modify a COM+ Application. It also happened to a colleague, who has posted more details in his eXtreme.NET blog.
I solved it by changing the identity of MSDTC to the Network Service system identity and restarting MSDTC (I also relaxed the security to allow remote administration of MSDTC, but I'm not sure whether it helped). Anyway, I think you can change these settings back to the old values after running ConfigFramework, since the error is about creating and configuring the COM+ App, not about running it.
update: this "The operation completed successfully" translates to "Connection with the transaction manager was lost". So it is indeed a problem of servers communicating via MSDTC. My concrete case is due to using local accounts for the MSDTC identity in a workgroup environment, so the servers cannot authenticate each other. This case is documented in Bug: MSDTC Fails to Mutually Authenticate
Helpful articles, in case you want more documentation:
New DTC Functionality in Windows Server 2003 SP1
FIX: The SqlConnection.Open function generates a ComException error message if the Distributed Transaction Coordinator (DTC) service is restarted or failed over
Turning Off Remote Procedure Call Security on Windows 2003 Server
Here's the pic, just for curious ones:
-
Marty Wasznicky has posted a compiled BizTalk 2004 Help File (.chm); check it out on his blog (11 MB).
The most interesting thing is that it can be indexed using MSN Desktop Search (like any other .chm file), which means querying the BizTalk documentation from the Windows Taskbar. Cool, since I've tried to index the Visual Studio Combined Help files without success :-(
-
There is a new whitepaper on MSDN about messaging. It's called BizTalk Server 2004: A Messaging Engine Overview, but it's quite a deep dive into messaging. From my point of view, the most interesting stuff is how it explains the Pub/Sub internals in terms of MessageBox flow and stored procs involved in publishing and subscribing.
-
Abstract: You can create a BizTalk message from a custom .NET class, instead of using an Xsd schema. This practice has some pros and cons. Let’s see…
The usual process is to start by creating Xsd schemas and then use them in orchestrations to define BizTalk messages. But why would you want BizTalk messages based on Xml? It seems obvious that all messages should be Xml based, but in fact, sometimes Xsd/Xml is not necessary and adds complexity.
When is Xsd/Xml needed?
Xml messages are very nice if you need Messaging. That is, interoperability, external communications, schema publication. But surprisingly, I find many situations where people are defining Xsd schemas for Xml messages where Xml is not needed at all. Well, Xml is still fashionable, so…
If a message lives inside an orchestration (that is, it is only passed between orchestrations), and it’s never going to be published, sent or received externally, there’s no advantage in using Xml. In fact, there’s no reason to use Xml.
Xsd hell: how do you create a message from scratch?
It’s a well known issue that creating an empty instance of a message inside an orchestration is not easy. Having the xml string hardcoded is not very elegant. Pointing to a file is not elegant at all: you would need the path hardcoded. Well, you can put it in a .config file, but… isn’t that too complex just to create an empty message?
I’ve read some discussion about using Maps to create empty (or default valued) instances. While I agree it’s a good idea, it’s still too complex just to create an instance of a message.
This message-from-scratch problem is inherent to the use of Xml, since a message is not instantiable.
Using a .NET class
You can use a .NET class instead of an Xsd schema. Define your orchestration message as a .NET type, using your class. Reference the XLANGs Base Types assembly to mark your class properties as distinguished fields or promoted properties.
Pros:
Instantiate it. Use constructors, destructors, static members for instance creation or whatever you want.
Use rich properties. Mark public properties as promoted or distinguished fields. Use get and set methods.
Reuse as objects in other projects.
Cons:
External publication and interoperability. There is no Schema, so if you intend to publish it externally, what do you publish?
Dependency on XLANG assemblies, so limited reuse outside your BizTalk project.
Simple Sample:
Create a .NET class:
using System;

namespace STCEAI.Messages
{
    [Serializable]
    public class SimpleMessage
    {
        private string _id;

        public SimpleMessage()
        {
        }

        // Distinguished field: makes Id visible from within the orchestration.
        [Microsoft.XLANGs.BaseTypes.DistinguishedField]
        public string Id
        {
            get { return _id; }
            set { _id = value; }
        }

        // Factory method: returns a message with a freshly generated Id.
        public static SimpleMessage Create()
        {
            SimpleMessage msg = new SimpleMessage();
            msg.Id = System.Guid.NewGuid().ToString();
            return msg;
        }
    }
}
Note the property is marked as Distinguished Field in order to make it visible from within the orchestration. This attribute is in Microsoft.XLANGs.BaseTypes.dll. You can also mark it as a promoted property, assign a Namespace, and include all the Xml serialization attributes as needed.
Usage inside an orchestration, in a Message Assignment shape:
SimpleMessage_msg = new STCEAI.Messages.SimpleMessage();
SimpleMessage_msg.Id = myId;
or better:
SimpleMessage_msg = STCEAI.Messages.SimpleMessage.Create();
-
Calculated shot... well, I've played Scorched Earth a lot in my life... :-)
I am a Scorched Earth Tank.
When I have a mission, it consumes me; I will not be satisfied until the job is done. I have a strong sense of duty, and a strong sense of direction. Changes in the tide don't phase me - I always know which way the wind blows, and I know how to compensate for it. I get on poorly with people like myself. What Video Game Character Are You?
-
Abstract: In the asynchronous world, we can talk about Real async and Simulated async. Each one has its own pros and cons. Let’s see a simplified sample of each case.
Sample scenario: let’s assume two systems, A and B:
1.- A sends a message to B.
2.- B processes the request from A.
3.- B returns a response to A.
Constraint: The processing of the request (step 2) takes an unpredictable amount of time, so we cannot afford to have A keep a connection open while waiting for the response from B at step 3. We need an asynchronous model, which can be either real async or simulated async.
Real Async
In a Real async scenario all the communications are one-way, fire-and-forget. A sends the request to B and closes the connection. Once the processing is finished, B starts a new connection with A and sends a new message, the response.
The characteristics are:
- Each message goes in a true one-way communication.
- Both client and server must implement listeners --> from the communications point of view, both are clients and servers.
- Both A and B must be aware of the other system's endpoint.
- Bandwidth and CPU are optimized, and there are no blocking points.
Here is a sample picture. Arrows show who starts the communication:
Probably, the most important issue is the second point: both A and B are clients and servers (or consumers and providers). Both systems must implement a listener or message sink. This is not suitable in many cases when A is a client app. So we can go for a simulated async.
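A minimal in-process Python sketch of the real async model, just to show the shape of it. Everything here is an assumption for illustration: the class System, the function run_b, and the use of queues as stand-ins for network listeners.

```python
import queue
import threading


class System:
    """Hypothetical endpoint: a name plus an inbox that acts as its listener."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()

    def send(self, target, message):
        # One-way: drop the message in the target's inbox and return at once.
        target.inbox.put((self, message))


def run_b(b):
    # B's listener: take one request, process it, then start a new
    # one-way communication back to whoever sent it.
    sender, request = b.inbox.get()
    response = request.upper()  # stand-in for the slow processing
    b.send(sender, response)


a, b = System("A"), System("B")
worker = threading.Thread(target=run_b, args=(b,))
worker.start()

a.send(b, "search hotels")  # fire and forget: A does not wait here
worker.join()

# A also needs a listener: the response arrives in A's own inbox.
_, reply = a.inbox.get(timeout=1)
print(reply)
```

Note that A needs an inbox too, which is the point made above: both sides are clients and servers.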
Simulated Async
In the simulated approach, almost all the communications are still one-way, but B never starts a new connection; instead, B just makes the response available for A. It’s A's responsibility to get the response by polling.
The characteristics are:
- All the communications are started by A. B does not deal with communications issues.
- B is not even aware of A. If there are many As, B does not need to know.
- Easy to implement. Or at least, easier than Real Async.
- Bandwidth overhead, because of the polling, as well as some CPU consumption on A.
Here is the sample picture, modified to show simulated async via polling. Arrows show who starts the communication:
In this case, the response is not sent (PUT); it’s retrieved (GET), so it’s not a one-way communication.
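The polling variant can be sketched in Python in the same spirit. Again, the names (SystemB, submit, fetch, poll) and the ticket mechanism are assumptions made for this example, not part of any real API.

```python
import threading
import time
import uuid


class SystemB:
    """Hypothetical server: accepts requests and leaves results for pickup."""

    def __init__(self):
        self._results = {}
        self._lock = threading.Lock()

    def submit(self, request):
        # One-way request: hand back a ticket and process in the background.
        ticket = str(uuid.uuid4())
        threading.Thread(target=self._process, args=(ticket, request)).start()
        return ticket

    def _process(self, ticket, request):
        time.sleep(0.05)  # stand-in for the unpredictable processing time
        with self._lock:
            self._results[ticket] = request.upper()

    def fetch(self, ticket):
        # GET: return the response if it is ready, otherwise None.
        with self._lock:
            return self._results.pop(ticket, None)


def poll(server, ticket, interval=0.01, attempts=200):
    """A asks again and again; B never starts a communication."""
    for _ in range(attempts):
        response = server.fetch(ticket)
        if response is not None:
            return response
        time.sleep(interval)
    raise TimeoutError("no response for ticket " + ticket)


b = SystemB()
ticket = b.submit("book room")
answer = poll(b, ticket)
print(answer)
```

B knows nothing about A, but A pays for it with the polling loop.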
These two models can be extended with endless variations, but I think these are the two most basic ones.
-
Abstract: this seems an easy question, but I have not found a proper answer yet… feel free to give your answer and a little justification.
The complete question is:
What should I secure? the access to the resources, or the content of the resources?
Securing the access means controlling who sees the resources, in terms of who can read files, databases, etc.
Securing the content means encrypting the file content or the data inside databases, so everybody can read but only a few can understand.
While you think about an answer:
- There is an application, let’s say, an ASP.NET application.
- Then there is IIS authentication, let’s say, Windows Integrated Authentication.
- Then there is .NET Framework Code Access Security settings, let’s say, system administrators configuring execution permissions to the assembly.
- Then there is more Code Access Security (developer’s), so the .NET assembly asks declaratively for a read permission to a resource file.
- Then there is the resource file, encrypted, of course, to store a Connection String.
- Then there is a connection to a database engine, with user/password challenge for authentication.
- Then there is database authorization, giving access to a concrete database object.
- Then there is a query, that returns an encrypted column (lovely Yukon).
- Then there is a database-level encryption user key, using a password or passphrase provided by the user, so the executing assembly can read the column data.
- So the column data goes in clear to the assembly, which returns the information to the IIS to be returned to the human-being at the other side of the network. Of course IIS uses a HTTPS connection, beware of hackers…
Back to the question... Encrypt files, or protect them from being read? Or both? Most important: why?
-
Abstract: An exception has been thrown, what should I do? --> At least two actions: Fix the situation and Log it
Orchestrations tend to fail. Sure. This is because of the nature of integration: orchestrations deal with external applications, which do not always behave as expected.
Exception handlers should be everywhere where errors can happen. The rule of thumb is that “Suspended/Non Resumable state should never occur”.
Each exception handler should perform at least two actions: fix the situation and log it.
Fix the situation
Fixing means not allowing the orchestration to fall into a suspended state, and notifying the calling process. This is usually done by creating an error response message, or a valid response that contains no data but some detail on the error, so the flow can continue in some way.
Note that using the same response message or a different and specialized fault message usually depends on the message schemas and other requirements.
Log
A common practice when handling errors is to log the full message or context data (or both!) using the Windows Event Log or any other log framework. Typically, developers log data about the exception, the state of the orchestration, the message that raised the error, etc.
With this practice it is very easy to duplicate a lot of data, because BizTalk already stores all this data for you. This information can be found in the tracking database, using HAT. The problem with HAT is that it is difficult to find useful info if you are not sure what you are looking for.
So what kind of information should you log in case of error? -> Enough info to:
1. know what kind of error happened
2. know where it happened
3. be able to find the details using HAT
Nothing more. This is the tip: don't log data that is logged by BizTalk. If there is an exception, BizTalk has the details. If message tracking is enabled, do not log message contents. Learn to use HAT efficiently, and you’ll save time and logging code.
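Outside BizTalk, the same fix-and-log rule can be sketched in a few lines of Python. This is only an illustration of the idea, not orchestration code: call_partner, handle and the instance id are hypothetical names, and the failing call is faked.

```python
import logging

logger = logging.getLogger("orchestration")


def call_partner(request):
    """Hypothetical external call that fails, to exercise the handler."""
    raise TimeoutError("partner did not answer")


def handle(request, instance_id):
    try:
        return {"status": "ok", "body": call_partner(request)}
    except Exception as exc:
        # Log: what kind of error, where it happened, and the id needed to
        # look up the details elsewhere. No message contents.
        logger.error(
            "kind=%s where=%s instance=%s",
            type(exc).__name__, "call_partner", instance_id,
        )
        # Fix: answer with a valid but empty response that carries the
        # error detail, so the caller is notified and the flow continues.
        return {"status": "error", "error": type(exc).__name__, "body": None}


result = handle("<Request/>", "abc-123")
print(result)
```

The log line stays short on purpose: the instance id is the key to find everything else in the tracking data.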
Here is an easy sample of parallel Fix-and-Log:
-
...or at least that's what Clemens Vasters thinks. A little bit radical, just to explain that SOA is an Orientation, not an Architecture in itself... anyway, I agree with him, and I've added him to my blogroll...
It's true that there's too much hype around SO and SOA, so I'll try to explain my point of view on SO in 3 oversimplified steps:
I want to benefit from Service Orientation. What should I do?
- Design the different elements of the architecture to be independent of each other (stateless, autonomous)
- For each element, separate interface from implementation.
- For each interface, use a technology that provides platform-independence and location-independence (SOAP over HTTP is a good example, but it's not the only one)
...and that's all, you are Service Oriented!
Well, we can discuss whether platform-independence is always required, but that's another post...
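The first two steps (stateless elements, interface separated from implementation) can be sketched in Python. The third step, the wire technology, is deliberately left out. All the names below (CustomerLookup, InMemoryLookup, greet) are invented for this example.

```python
from typing import Protocol


class CustomerLookup(Protocol):
    """The interface: the only thing a consumer is allowed to know."""

    def find_name(self, customer_id: str) -> str: ...


class InMemoryLookup:
    """One possible implementation; it could be swapped for a remote one."""

    def __init__(self, names):
        self._names = dict(names)

    def find_name(self, customer_id: str) -> str:
        return self._names[customer_id]


def greet(service: CustomerLookup, customer_id: str) -> str:
    # The consumer depends on the interface only, and each call carries
    # everything it needs (no conversation state kept between calls).
    return "Hello, " + service.find_name(customer_id)


print(greet(InMemoryLookup({"1234": "Foo"}), "1234"))
```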
-
Abstract: the use of a hierarchical naming convention allows you to group messaging artifacts in a tree-like structure.
note on conventions
I’ve seen many different naming conventions for BizTalk, and I must admit that I haven’t found any suitable for me, especially on messaging ports. Some conventions rely on functional rules, some on technical rules. They add readability but, in my opinion, they do not solve the problem of having lots of different elements in the same plain list without any order or structure.
Here are some links to these conventions:
Scott Colestock
Scott Woodgate
MSDN Article
using prefixes to keep related ports together
One rule that I’ve always followed is the <BizApp> prefix, so the ports corresponding to the same application are all together in the BizTalk Explorer port list. In the case of one-to-one binding, an <OrchName> prefix is also useful. While these solutions add a simple level of grouping, they are not a panacea.
Brand New Hierarchical naming convention
During a re-reading session of Hohpe’s Enterprise Integration Patterns (some kind of bible for me :-), I found a nice idea about using a path-like naming convention for messaging channels and artifacts. The naming convention used in the book is something like MyCorp/Prod/OrderProcessing/NewOrders
I’ve been thinking about a useful pattern for naming BizTalk messaging ports using slashes, and I’ve decided to use this:
<BTSAppName_or_System>/<PortType>/<Partner_or_AppName>/<Operation>[/<(Request_or_Response) or (MessageType)>]
example:
So, in my current project (called, for some reason, CRS), I’m integrating two operations (Search and Booking) with two external partners. One operation uses the fire-and-forget style, and the other is an async request/response.
Send ports
CRS/SendPorts/PartnerOne/Search
CRS/SendPorts/PartnerTwo/Search
CRS/SendPorts/PartnerTwo/Booking/Request
Receive ports and locations
CRS/ReceivePorts/PartnerTwo/Booking/Response
CRS/ReceiveLocations/PartnerTwo/Booking/Response
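If you generate port names from a script or a binding file, the pattern is easy to enforce. Here is a small Python sketch; port_name is a hypothetical helper of mine and has nothing to do with the BizTalk API.

```python
def port_name(app, port_type, partner, operation, suffix=None):
    """Hypothetical helper: build a name following the pattern above.

    suffix is the optional last level (Request, Response or a message type).
    """
    parts = [app, port_type, partner, operation]
    if suffix is not None:
        parts.append(suffix)
    for part in parts:
        # A slash inside a level would break the tree structure.
        if not part or "/" in part:
            raise ValueError("invalid name level: %r" % (part,))
    return "/".join(parts)


names = [
    port_name("CRS", "SendPorts", "PartnerTwo", "Booking", "Request"),
    port_name("CRS", "SendPorts", "PartnerOne", "Search"),
    port_name("CRS", "ReceivePorts", "PartnerTwo", "Booking", "Response"),
]

# A plain alphabetical sort now groups the artifacts level by level.
for name in sorted(names):
    print(name)
```

Since most tools just sort names alphabetically, the hierarchy comes for free.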
Here is the BizTalk Explorer screenshot:
-
Abstract: although the concept of document message and command message is quite simple, it’s usual to see document messages used for everything.
At the highest level, integration is about making different applications communicate. This communication can be performed to exchange data or functionality.
If you are doing the integration via messaging, you’ll use messages for that communication.
Document messages
A document message is used if the purpose of the integration is to exchange data. The most typical document message is an Xml document representing a data structure.
Command messages
If the purpose is to exchange functionality (that is, calling a remote method asynchronously) a command message is used. Command messages are, for example, SOAP requests. A SOAP request contains the name of the method called and the parameters needed to call that method.
Why am I pointing to such an obvious description of integration/document messages/command messages? --> because I’ve seen in many places the misuse of this concept of document vs. command.
The concepts of integration, messaging and asynchronous communication are relatively new, or at least only recently popular. Most developers have learnt to do messaging in a data interchange fashion. As a result, document messages are being used for everything, including functionality integration.
Let’s see an example: messages to create and update records, for example, Customers:
Common practice, not very elegant:
Use a single document schema that contains the customer data. Send messages to a different endpoint for creation or update. The receiver application knows what to do with the message because it knows where it has been received.
<Customers>
  <Customer id="1234">
    <Name>Foo</Name>
    <!-- more data -->
  </Customer>
</Customers>
Why is this not very elegant? --> Because the message itself does not describe the purpose. The purpose is in the context. Without the context, the message has no meaning. (Agent Smith said, “Without a purpose, we would not exist” :-)
Another common practice, even worse:
Use two different document schemas, which are very similar. They contain the same data, but they are slightly different.
Document message for creation:
<CreateCustomers>
  <Customer id="1234">
    <Name>Foo</Name>
    <!-- more data -->
  </Customer>
</CreateCustomers>
Document message for update:
<UpdateCustomers>
  <Customer id="1234">
    <Name>Foo</Name>
    <!-- more data -->
  </Customer>
</UpdateCustomers>
Why is this a bad practice? --> Ok, you have two schemas for two purposes, but they contain the same data, the same structure. You can reuse schemas via Xsd Import or Xsd Include, but it’s a workaround, not a good solution.
I've even seen entire industry-oriented document specifications with lots of almost-equal schemas for many different purposes. So you have the same Customer Name field repeated every time the name of a customer is needed... you can imagine the Xsd Import dependency tree is 5-7 levels deep.
Good practice:
In my honest opinion, you should use document messages for data, and command messages for operations. Include parameters in your command message for the values. If your operation needs a bunch of data, include the document message inside the command message (just as SOAP does!). So the command message is just an envelope for the document message.
Document message for customers:
<Customer id="1234">
  <Name>Foo</Name>
  <!-- more data -->
</Customer>
Command message for operations with customers:
<CustomerActionEnvelope>
  <Action>CreateCustomer</Action>
  <Parameters>
    <!-- insert here the full customer document message -->
  </Parameters>
</CustomerActionEnvelope>
In this way, data is separated from meta-data (at schema level) and dependencies are reduced.
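A short Python sketch of the envelope idea, using the element names from the samples above. The function wrap_in_command is a made-up helper; the point is only that the same Customer document travels unchanged inside different commands.

```python
import xml.etree.ElementTree as ET


def wrap_in_command(action, document):
    """Hypothetical helper: put a document message inside a command envelope."""
    envelope = ET.Element("CustomerActionEnvelope")
    ET.SubElement(envelope, "Action").text = action
    # The document message goes in untouched, as the command's parameter.
    ET.SubElement(envelope, "Parameters").append(document)
    return envelope


customer = ET.fromstring('<Customer id="1234"><Name>Foo</Name></Customer>')
command = wrap_in_command("CreateCustomer", customer)
print(ET.tostring(command, encoding="unicode"))
```

Switching to an update means changing the Action value only; the Customer schema is not duplicated.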
If you think this is interesting, go to http://www.eaipatterns.com for a deep explanation of Command Message and Document Message from the patterns point of view.
-
A must-read for everyone working with BizTalk. It covers team development, best practices, approaches for versioning, etc.
Read it at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/bts_2004wp/html/ffda72df-5aec-4a1b-b97a-ac98635e81dc.asp
-
Abstract: Schema designs usually contain some boolean Error-or-Success code to handle the result of the process. In an EAI scenario, schemas should contain at least support for a new result type: Warning.
EAI makes different applications communicate. In a one-to-one integration, the result can simply be categorized as Success or Failure. But if we have more than two applications, there can be a wider range of situations. So we need to categorize the results.
The following is a simple example of the common error situation:
Let’s suppose that we are calling a web service and sending back its response to the initial caller (à la message router). What if the web service is down? --> We compose an empty response and put an error in the header. Something like this:
<Response>
  <Header>
    <Result type="error">
      <Errors>
        <Error code="1" source="WebService1">A timeout happened!</Error>
      </Errors>
    </Result>
  </Header>
  <Body/>
</Response>
Note that the body is empty.
But what if we must create one aggregated response from two or more web services and only one of them fails? We cannot say it’s an error (there is valid data), but we cannot say it’s a success either.
So let’s use a warning:
The process that aggregates many responses into a single big response can convert the error into a warning. Something like this:
<Response>
  <Header>
    <Result type="warning">
      <Warnings>
        <Warning code="2" source="MessageRouter">Some services did not respond as expected</Warning>
      </Warnings>
    </Result>
  </Header>
  <Body>
    <!-- data from WebService 2 -->
    <!-- data from WebService 3 -->
    <!-- data from WebService N -->
  </Body>
</Response>
And the initial caller is happy because it has a valid response, but it’s aware that it’s not a full success. We can also have a severity weight of the warning (low, medium, severe, catastrophic, etc).
Note that in the aggregated response I’ve removed references to which Web Service failed and its original message --> the Message Router & Aggregator hides the integration logic from the caller, of course, but this is another chapter…
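The decision between success, warning and error can be sketched in Python like this. The function aggregate and the shape of its input and output are assumptions for the example; a real aggregator would build the Xml response shown above.

```python
def aggregate(results):
    """Hypothetical aggregator.

    results is a list of (source, data) pairs, with data set to None when
    that source failed. Returns the result type plus the surviving data.
    """
    bodies = [data for _, data in results if data is not None]
    failed = len(results) - len(bodies)
    if failed == 0:
        kind = "success"
    elif bodies:
        kind = "warning"  # some valid data, but not a full success
    else:
        kind = "error"    # nothing to return at all
    # Which source failed is deliberately not exposed to the caller.
    return {"type": kind, "failed": failed, "body": bodies}


response = aggregate([
    ("WebService1", None),             # timed out
    ("WebService2", {"rooms": 3}),
    ("WebService3", {"rooms": 1}),
])
print(response["type"], response["body"])
```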
-
Abstract: A common question from my customers: ‘how much code should be in the orchestration, and how much code should be moved out into .NET components?’
In the BPEL standard, a business workflow is used only to coordinate the execution of components (web services). With XLANG, you can add .NET code, so the workflow has not only the orchestration, but also the implementation.
But the fact that you can code inside the orchestration does not mean that you should, of course. Why? Well, apart from all the stuff in my previous posts, from the technical point of view it’s difficult to maintain and difficult to debug. From my experience, the guideline is to put little code inside the orchestration, with the following rules:
The orchestration should contain:
- Control flow. That is, shapes and messages.
- Code needed to make the control flow run. Loop variables, initializations, helper calls, etc. --> if it’s simple.
The code moved out to components should be:
- Code dealing with technical aspects. Resources, Xml/XPath handling, conversion algorithms, anything that requires error handling, etc.
- Control flow code, if it is complex and/or requires technical stuff.
My justification is that moving code out to components has the following advantages:
- Easy to debug.
- Easy to maintain.
- Can have error handling.
- Can use any CLR feature that you want.
- Use the .NET language of your choice :-)
- Some other more, sure…
So, go on and use helper components, and refactor every day!
feedback?