As promised I've released some code that shows you how to create a reusable application host that will run your UCMA 2.0 Workflow Applications.
You can find the code, binaries and an initial "How To" here:
http://code.msdn.microsoft.com/ucmahost
The home page of the project site gives a basic walkthrough. One item I only briefly mentioned on the site is the ability to interface with the service via MSMQ. Instead of specifying an Inbound or Outbound workflow, you can just provide the name of a private queue. The service will watch that queue and send an alert via the default Outbound WF or the Outbound WF you provide.
To use this, simply write the string sipuri={0}&message={1}&type={2} to the queue, providing who you want to send the message to, the actual message and the type. The type can be 1 (Audio) or 2 (IM).
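As a rough illustration of that string format, here is a minimal C# sketch that builds and splits the message body. The names AlertType, AlertMessage, Build and Parse are hypothetical and not part of the host. The sketch writes the values in as-is, because the post does not say whether the host expects any escaping.

```csharp
using System;
using System.Collections.Generic;

// Modality codes taken from the post: 1 = Audio, 2 = IM.
public enum AlertType { Audio = 1, InstantMessage = 2 }

// Hypothetical helper, not part of the released host.
public static class AlertMessage
{
    // Builds "sipuri={0}&message={1}&type={2}" exactly as described in the post.
    // Assumption: values are written unescaped.
    public static string Build(string sipUri, string message, AlertType type)
    {
        if (string.IsNullOrEmpty(sipUri)) throw new ArgumentException("sipUri is required", "sipUri");
        return string.Format("sipuri={0}&message={1}&type={2}", sipUri, message, (int)type);
    }

    // Splits a body back into its named parts (key = text before the first '=').
    public static Dictionary<string, string> Parse(string body)
    {
        var parts = new Dictionary<string, string>();
        foreach (string pair in body.Split('&'))
        {
            int eq = pair.IndexOf('=');
            if (eq <= 0) continue; // skip anything that is not key=value
            parts[pair.Substring(0, eq)] = pair.Substring(eq + 1);
        }
        return parts;
    }
}
```

For example, AlertMessage.Build("sip:bob@contoso.com", "Server down", AlertType.InstantMessage) gives sipuri=sip:bob@contoso.com&message=Server down&type=2. How the string is actually written to the queue (formatter, label and so on) depends on the host and is not shown here.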
UCMA v2.0 includes a great abstraction layer, Communication Workflows, based on the Windows Workflow Foundation (WF).
These workflows are great for those needing to create simple yet powerful UC enabled applications. The workflow activities themselves are pretty self explanatory. I'll cover the activities themselves in another post.
The "difficult" portion of creating a UCMA workflow application isn't the actual workflow, but creating the host. Yes, the default template wraps your WF application in a nice little console application, but this is really only for debugging purposes. Are you really going to deploy your console application in production? Those familiar with Speech Server are probably scratching their heads: while Speech Server had a nice service and administrator console that allowed you to easily point it to your Speech WF application, UCMA does not. You need to build the host yourself, and build it for every UC application you create, and to truly do that you need to know at least a little bit about the UCMA Core API. Joe Calev has a great blog which walks you through the Signaling portion of the Core API. For the WF API you really should have some understanding of the Collaboration portion of the Core API.
First you will need to make sure that you have a certificate for your application, and you will need to provision the application on the OCS Front End. You can do this by using the Application Provisioner and following these instructions: http://msdn.microsoft.com/en-us/library/dd253360(office.13).aspx
This is the key information that our application will need:
- Application Name
- Application Port
- Application SIP URI
- Application Server FQDN
- OCS Pool FQDN
- OCS Pool Port
- GRUU
- Certificate Information
Note: You should never call the asynchronous methods synchronously. IE: object.End(object.Begin(...)). You'll see some of this in sample code and blog posts; do not use this in a production application!
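To make the note concrete, here is a minimal sketch of the callback style for a Begin/End pair. FakePlatform is a hypothetical stand-in invented for this example so that it runs without the UCMA SDK (it uses a Task only to fake the asynchronous work). StartupDemo is likewise hypothetical. The point is only that End is called inside the callback instead of being wrapped around Begin.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for any object that exposes a Begin/End pair.
public sealed class FakePlatform
{
    public bool Started { get; private set; }

    public IAsyncResult BeginStartup(AsyncCallback callback, object state)
    {
        var tcs = new TaskCompletionSource<bool>(state); // state becomes AsyncState
        Task.Run(() =>
        {
            Thread.Sleep(50);                 // simulate slow startup work
            tcs.SetResult(true);
            if (callback != null) callback(tcs.Task);
        });
        return tcs.Task;                      // Task implements IAsyncResult
    }

    public void EndStartup(IAsyncResult result)
    {
        ((Task<bool>)result).GetAwaiter().GetResult(); // rethrows any failure
        Started = true;
    }
}

public static class StartupDemo
{
    // Starts the platform without blocking the caller.
    // 'onStarted' runs once the operation has completed.
    public static void Start(FakePlatform platform, Action<FakePlatform> onStarted)
    {
        platform.BeginStartup(ar =>
        {
            var p = (FakePlatform)ar.AsyncState; // the state object passed to Begin
            p.EndStartup(ar);                    // End is called from the callback
            onStarted(p);
        }, platform);
    }
}
```

A caller would write something like StartupDemo.Start(platform, p => Console.WriteLine("started")) and carry on with the next step from inside the callback.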
Armed with the necessary information, we can start looking at some code:
First we need to create an instance of the CollaborationPlatform. For this post, the importance of this object is that it is used to describe our application. It has two constructors, one that takes a ClientPlatformSettings parameter and another that takes a ServerPlatformSettings.
The difference here is that ClientPlatformSettings will use an AD account, while ServerPlatformSettings will use a Contact Object. We reference the Contact Object via the GRUU provided by the Application Provisioning tool. There are some features lost if you use the ClientPlatformSettings, such as Impersonation. Here is where I typically draw the line: if the application is going to consume the Audio or Instant Message call and perform some logic based on the input, I use the ServerPlatformSettings. IE: Response Bots. If my application is just providing an abstraction for a user who already has credentials to sign in and message users, then I would use the ClientPlatformSettings. IE: Build your own authenticated Web Messaging.
For this post, we will use the ServerPlatformSettings.
ServerPlatformSettings platformSettings = new ServerPlatformSettings(appName, appServerFqdn, applicationPort, gruu, cert);
CollaborationPlatform collabPlatform = new CollaborationPlatform(platformSettings);
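// Blocking on End(Begin) keeps this sample short; per the note above, use a callback in production.
collabPlatform.EndStartup(collabPlatform.BeginStartup(null, null));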
Once the CollaborationPlatform object has completed starting up, we need to set up our endpoint. There are two choices, ApplicationEndpoint and UserEndpoint. If you are using the ClientPlatformSettings then you need to use the UserEndpoint; since we are using the ServerPlatformSettings, we need to use the ApplicationEndpoint.
ApplicationEndpointSettings settings = new ApplicationEndpointSettings(applicationUri, ocsFqdn, ocsTlsPort);
ApplicationEndpoint endpoint = new ApplicationEndpoint(collabPlatform, settings);
If our WF application is an Inbound application, we need to register for the type of calls the WF will handle, but remember that the WF can only accept AV and IM calls and can't accept conferencing calls. That’s not to say you couldn’t register to receive a conference invite, and handle the conferencing features using the Core API itself, but that is another post.
endpoint.RegisterForIncomingCall<AudioVideoCall>(AudioVideoCallReceived);
endpoint.RegisterForIncomingCall<InstantMessagingCall>(InstantMessagingCallReceived);
Once we have done that, we can establish our endpoint.
endpoint.EndEstablish(endpoint.BeginEstablish(null, endpoint));
Next we need to start the Workflow Runtime and add the UCMA WF services. This isn’t so much related to UCMA as it is to WF itself.
WorkflowRuntime workflowRuntime = new WorkflowRuntime();
workflowRuntime.AddService(new CommunicationsWorkflowRuntimeService());
workflowRuntime.AddService(new TrackingDataWorkflowRuntimeService());
workflowRuntime.StartRuntime();
Finally we need to route the call to our workflow, via the handler methods we registered for earlier.
AudioVideoCallReceived
InstantMessagingCallReceived
When a call comes in, we need to pass the call to our workflow application and from there the Workflow handles the rest:
WorkflowInstance workflowInstance = workflowRuntime.CreateWorkflow(WorkflowType);
CommunicationsWorkflowRuntimeService communicationsWorkflowRuntimeService = (CommunicationsWorkflowRuntimeService)workflowRuntime.GetService(typeof(CommunicationsWorkflowRuntimeService));
communicationsWorkflowRuntimeService.EnqueueCall(workflowInstance.InstanceId, call); //Call object is passed by the receiving handler
communicationsWorkflowRuntimeService.SetEndpoint(workflowInstance.InstanceId, endpoint); //Endpoint object is a local variable
communicationsWorkflowRuntimeService.SetWorkflowCulture(workflowInstance.InstanceId, new CultureInfo("en-US"));
workflowInstance.Start();
I was thinking about it, and all of this is pretty repetitive code from one workflow application to another; it all has to be done, every time, for every WF project. I’ve created a solution that I will release on CodePlex later this week. Basically the solution is a Windows Service that will take the parameters (GRUU, port, etc.), log into OCS and route calls to the workflow you provide as a compiled assembly. Part of this solution also utilizes MSMQ, in much the same way Speech Server did. You can associate an MSMQ name with an application; if a message is in that queue, it will fire off the workflow you specify or use the default built-in WF application.
The gist of it is, instead of worrying about writing all the code above, you can concentrate on just the workflow portion. Compile the WF as an Assembly instead of a Console application, add the application parameters to the App.Config of this UCMA Application Host service and you are done! A picture is worth a thousand words, so here is a high level picture of the solution.

I am also working on releasing the following solutions on CodePlex:
- Updated Web Chat sample. The currently released Web Chat uses UC AJAX; this updated one will use UCMA 2.0, WCF & Silverlight.
- Client Framework. An abstraction layer to both UC Client and Communicator Automation.
I've been MIA for a couple of months, mainly waiting for Office Communications Server 2007 R2 to RTM, but I've also been busy at work.
I promise I will get to the UCMA 2.0 Workflow post I promised, but for those of you wanting to get a taste of UCMA now, check out Joe Calev's blog; he currently has a full series on UCMA 2.0 Core. http://blogs.msdn.com/jcalev/default.aspx
Right now I am in Redmond for 3 weeks for the first rotation of the Microsoft Certified Master program for Office Communications Server 2007 R2. All I can say is that it will be very, very long days, including training on the weekend. As much as I am looking forward to it, I am also dreading it. 3 weeks away from home, actually 4 weeks as I will also be attending TechReady, our internal conference, and probably 11-12 hours a day in a classroom... This class probably is not for the faint of heart.
If you are not sure what MCM is, you can read more here: http://www.microsoft.com/learning/mcp/master/OCS/default.mspx
The long time away from home will actually give me some time to finally blog.
I am planning a series of blog posts that will show you how to create UCMA WF applications that can answer the telephone with speech recognition and speech synthesis abilities. However before I do that I HAVE to explain the differences between UCMA and Speech Server so that there is no confusion.
Speech Server vs UCMA
With OCS 2007 R2 there has been an update to UCMA (Unified Communications Managed API), simply named UCMA 2.0. This new and improved API has a lot of new features: support for presence, telephony, speech recognition, speech synthesis, etc. In R2 there is NO update for Speech Server; that being said, you can continue to run Speech Server as is, in tandem with your R2 environment.
On top of the Core API is a new WF API, which abstracts a lot of common "activities" that a UCMA application might do. At first glance it may appear that this WF API is a replacement for Speech Server, but there are a few major differences that you need to consider when deciding if an application should be a Speech Server application or a UCMA application. Here are what I consider the 4 major deciding factors when choosing between Speech Server or UCMA.
- Platform vs. API
The main difference between these two is that Speech Server is not only an API, but is also an enterprise grade IVR to host these applications. With Speech Server you only have to worry about developing the front end, SALT, VoiceXML or Speech WF application.
UCMA 2.0 WF is simply an API; you need to build the front end of the application as well as the host application. You can host these applications in a Windows Form, Console application, Windows Service, etc. Obviously a Windows Service makes the most sense.
Note: UCMA 2.0 does NOT have built-in activities for VXML nor SALT.
- Infrastructure
The big difference here is that Speech Server typically sits off as its own branch from your PBX or Media Gateway, while UCMA 2.0 sits behind the mediation server. At first this seems trivial and maybe even a good idea, but there is a reason a typical IVR application sits outside of the voice network that you use for day to day communications. IE: Do you want your IVR to take up internal bandwidth and/or the lines that you use for your telephone communications?
If you are only expecting a small number of calls, then deploying a UCMA 2.0 application shouldn’t be a big issue, but if you are building a UCMA 2.0 application that is going to continuously use 50+ ports, you need to do some planning and additional infrastructure work before deploying behind your existing OCS environment.
Note: 50 simultaneous ports is a relative number. The key takeaway here is don’t expect to simply throw a UCMA application behind your existing mediation server without considering the impact. You can add additional mediation servers and gateways to solve scaling problems.
- Developer Tools
Speech Server contains developer tools that abstract things like SRGS grammars with a nice visual grammar editor. UCMA does not yet have these tools, so be prepared to write some of your own SRGS by hand. It is a small feature, but it can save a lot of development effort. I’ve said it before and I’m saying it again: most of your development effort with any speech enabled solution will be spent on grammars. In a UCMA application, SRGS is not only used to recognize speech, but also text from instant messaging conversations.
UCMA also does not have the SIP Debugging Phone that Speech Server does. This means that in order to debug a UCMA application you need to call the application via Communicator and/or an actual telephone. Also, with the infrastructure requirement of having an actual OCS environment with a mediation server, you won’t be able to debug applications very easily “offline”.
Don’t expect to write and debug UCMA 2.0 applications sitting on a plane like you did with Speech Server. Yes, you can create a virtualized OCS 2007 R2 environment, but it is going to require Hyper-V and at least 8 GB of RAM. I’ve finally upgraded so I can do just that, but I realize that not everyone is going to be able to.
- Reporting & Tuning
Speech Server has the Tuning & Analysis tools, which come in handy for simple reports like “How many calls did my application get?” and “How many times did a grammar fail?”. This reporting is absolutely necessary for any enterprise class IVR solution. If you want some of this reporting for UCMA 2.0, you are going to have to build it yourself.
Beyond the reporting side, there are the tuning tools, which allow you to test grammar changes on actual callers' recorded audio before deploying.
I hope this post has helped you understand the differences between Speech Server and UCMA. UCMA 2.0 clearly is a step on the roadmap to include speech tightly into the UC platform and OCS. Expect that things like the speech tools will be available by the next release in the not too distant future.
Next post will talk about the features of UCMA…
News Article here: http://biz.yahoo.com/prnews/081014/aqtu055.html?.v=76
This announcement allows me to start blogging about R2 development features. Expect many UCMA v2.0 blog posts soon!
Side Note: I know I haven’t blogged much recently, but I’ve changed roles within Microsoft, been on parental leave and been busy working with customers. However, I am planning a series of posts about creating UC solutions.
I get this question a lot, "Does Office Communications Server integrate with CTI?" The Answer: It depends
Why do I say that? Well when most people ask about CTI, they are typically asking about CSTA applications. IE: ACD, Predictive Dialing, Screen Pops, etc.. And it depends on your OCS implementation.
Remote Call Control
You can run your current CSTA based CTI application in tandem with your OCS environment in a Remote Call Control scenario as your CTI server will sit as a branch off the PBX. However the Remote Call Control scenario has no real interaction between OCS and the CTI server. Therefore you can't call that real integration. Besides the Remote Call Control scenario doesn't utilize the full potential of our UC platform.
Enterprise Voice with PBX Integration
With an OCS deployment with Enterprise Voice, can it support your CSTA based CTI applications? No, because your internal telephones are no longer "directly" linked to the PBX; instead, all telephone calls are routed through your OCS infrastructure.
Does this mean OCS doesn't support CTI applications?
No, it just means that it doesn't support CSTA based CTI applications. CTI doesn't have to be based on CSTA. If you look at what CSTA is, it is essentially a schema. A schema that defines what a telephone call is, and from a software developer's perspective this is an antiquated approach to solving a common problem. Why don't we have predefined schemas for everything, such as customer data, financial transactions, etc.? All businesses have customers and money changing hands, yet we don't commonly see standardized schemas. Yes, having predefined schemas for everything would solve a lot of problems, but businesses are unique. Just because you run a bakery doesn't mean all bakeries care about the same data.
I know it seems like a Microsoft employee bashing "standards" but keep reading.
Let's take "Screen Pops" for example, when a call is transferred from the IVR application to an agent, the agent should get the customer's information from the data the IVR collected. In a traditional CSTA environment the call information and relevant data is written to the CTI server. When the call gets transferred to the agent, the agent has a piece of software sitting on the desktop which either queries or gets notified with the appropriate data from the CTI server. If you remove the context of the telephone, it simply is a client application that gets notification from a server application that an event has occurred and passes the data to the client.
.NET and/or Java developers create client/server applications every day! As a .NET developer I'd create a WCF service which the IVR application could call and pass in the data it collected, along with who it is transferring the call to. On the agent's desktop I would have a small application that would register its endpoint with the service and listen for any events that pertain to it. The client application could even integrate directly with a CRM application, provided that the CRM application has some sort of API.
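To show the shape of that idea without any WCF plumbing, here is a minimal in-process sketch. ScreenPopHub, Register and NotifyTransfer are hypothetical names invented for this example. In a real deployment the two methods would be service operations and the handler would be a callback to the agent's desktop application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process stand-in for the service described above.
public sealed class ScreenPopHub
{
    private readonly object gate = new object();
    private readonly Dictionary<string, Action<IDictionary<string, string>>> agents =
        new Dictionary<string, Action<IDictionary<string, string>>>(StringComparer.OrdinalIgnoreCase);

    // Called by the agent's desktop application to listen for its own events.
    public void Register(string agentUri, Action<IDictionary<string, string>> onScreenPop)
    {
        lock (gate) { agents[agentUri] = onScreenPop; }
    }

    // Called by the IVR application just before it transfers the call.
    // Returns false when the target agent has not registered.
    public bool NotifyTransfer(string agentUri, IDictionary<string, string> callerData)
    {
        Action<IDictionary<string, string>> handler;
        lock (gate)
        {
            if (!agents.TryGetValue(agentUri, out handler)) return false;
        }
        handler(callerData); // invoke outside the lock so a slow client cannot block others
        return true;
    }
}
```

Nothing in this sketch knows about telephony at all, which is the point being made: the IVR only has to say who the call is going to and what it collected.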
I know you are probably reading this and wondering what does this have to do with OCS? Well nothing and that's just the point…
In this Screen Pop solution, the actual Screen Pops have nothing to do with OCS. The OCS APIs could be used to create the front end IVR application and to give the business logic presence information for agents, so that the IVR knows who to route the call to. NOT to display Screen Pops. I know what I am saying goes against what others have suggested, such as the UC Client API - Screen Pop Sample. It is however just a sample and is only one way to solve a problem. In my opinion creating a custom OCS client to display “Screen Pops” does "lock" you down to OCS, not to mention the increased investment of time to develop the application.
You shouldn't try and reinvent the wheel using the UC APIs, but use them for something that doesn't already exist. My rule of thumb is that the UC APIs should be used to serve two purposes: 1.) Provide modalities to your applications. IE: Telephony and IM 2.) Provide your application with presence data.
Conclusion
CSTA defines a "standard" so that you aren't “locked” down to use proprietary extensions. By applying modern SOA techniques to your solution, you create a solution that is loosely coupled and could easily be adapted to any other OCS-like product out there, and therefore does not forever commit you to a single product and/or platform. Believe it or not, as a Microsoft employee I don’t want you to feel that you are “locked” into our products; I want you to buy our products because they are the best in the industry and will give you the greatest return on your investment.
A few new code samples have been released on various UC APIs. I haven't checked them all out yet, but I can speak a little to the "Integrating Web Chat Functionality" sample as I know the guy who wrote it...
The Web Chat Functionality sample shows you how you can have anonymous web users have an IM conversation with an OCS end user, without the web user providing any credentials.
Web Chat
Think about visiting a web site, for example an ecommerce website, where the customer has a question about a product. While they could email and/or call, the customer would have to leave the computer and make the extra effort. This Web Chat sample tries to solve that problem by allowing customers to simply click on a link and have an IM conversation via their browser with a representative and/or a bot from the ecommerce website.
This is one example of "Click to Chat".
The technology behind the solution is fairly simple: it wraps the UC AJAX API in a service, in this case a WCF service, which logs into CWA via a single UC enabled account. The WCF service is the only application that communicates with CWA. The browser application uses the .NET 3.5 WebHttpBinding and JSON to communicate directly with the WCF service.
The WCF service controls which browser endpoints get which of the messages it has received by assigning each web user a GUID. The browser application uses a polling technique to get its messages from the service.
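The GUID and polling idea can be sketched in a few lines. This is not the sample's actual code; WebChatRouter and its members are hypothetical names, and the CWA side is reduced to a Deliver call. It only illustrates how one shared account can fan messages out to many anonymous browser sessions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of per-session routing; not taken from the released sample.
public sealed class WebChatRouter
{
    private readonly object gate = new object();
    private readonly Dictionary<Guid, Queue<string>> inboxes = new Dictionary<Guid, Queue<string>>();

    // Called when an anonymous web user opens a chat; the GUID identifies the browser.
    public Guid StartSession()
    {
        Guid id = Guid.NewGuid();
        lock (gate) { inboxes[id] = new Queue<string>(); }
        return id;
    }

    // Called by the service when a message for this session arrives from the OCS user.
    public bool Deliver(Guid sessionId, string message)
    {
        lock (gate)
        {
            Queue<string> inbox;
            if (!inboxes.TryGetValue(sessionId, out inbox)) return false;
            inbox.Enqueue(message);
            return true;
        }
    }

    // Called by the browser on each poll; returns and clears the pending messages.
    public IList<string> Poll(Guid sessionId)
    {
        lock (gate)
        {
            Queue<string> inbox;
            if (!inboxes.TryGetValue(sessionId, out inbox)) return new List<string>();
            var pending = new List<string>(inbox);
            inbox.Clear();
            return pending;
        }
    }
}
```

Each browser keeps the GUID it was given and passes it on every poll, so it only ever sees its own messages.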
WPF Presence Controls
Another exciting sample is the WPF Presence Controls. While we had the WinForms controls, we didn't have WPF controls, and it was something requested by everyone who looked at the WinForms controls. Download them and check out George Durzi's blog post about these controls.
http://blogs.claritycon.com/blogs/george_durzi/archive/2008/09/08/wpf-presence-controls-for-microsoft-office-communicator.aspx
Integrating Web Chat Functionality - Microsoft Unified Communications AJAX API Sample
http://www.microsoft.com/downloads/details.aspx?FamilyId=C8C3F762-7BE4-4541-9B18-82499DB61293&displaylang=en
WPF Presence Controls for Microsoft Office Communicator 2007 - Microsoft Office Communicator 2007 SDK Sample
http://www.microsoft.com/downloads/details.aspx?FamilyId=5001D612-533A-4721-91EA-DA990D94FF0F&displaylang=en
Dynamics CRM Integration with Office Communications Server
http://www.microsoft.com/downloads/details.aspx?FamilyId=6E2EA762-A6C9-43BD-8C84-BF610073765C&displaylang=en
Customer Relationship Management (CRM) Activity - Microsoft Unified Communications Managed API 1.0 Sample
http://www.microsoft.com/downloads/details.aspx?FamilyId=16303459-DD75-451F-B7C0-FB2EB0D9A84A&displaylang=en
Communicator 2007 Custom Tabs - Microsoft Office Communicator 2007 Sample
http://www.microsoft.com/downloads/details.aspx?FamilyId=621C675C-46B7-4F68-ADDC-9F44E5594BFB&displaylang=en
If you are using PIC (Public IM Connectivity) in OCS, you can send SMS messages to any capable phone in North America via AOL's SMS Gateway, simply by sending a message to an E.164 normalized phone number (IE: +16128591899) with the AOL domain name, aol.com, appended.
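As a small illustration, the helper below normalizes a North American number to E.164 and appends the domain. SmsAddress and ForNorthAmerica are hypothetical names, and joining the number and domain with "@" is an assumption made here, since the text only says the domain is appended.

```csharp
using System;
using System.Text;

// Hypothetical helper for building the address described above.
public static class SmsAddress
{
    // Normalizes a North American number to E.164 ("+1" followed by ten digits)
    // and appends the domain. Assumption: number and domain are joined with '@'.
    public static string ForNorthAmerica(string phoneNumber, string domain = "aol.com")
    {
        if (phoneNumber == null) throw new ArgumentNullException("phoneNumber");

        var digits = new StringBuilder();
        foreach (char c in phoneNumber)
        {
            if (c >= '0' && c <= '9') digits.Append(c); // drop spaces, dashes, brackets, '+'
        }

        if (digits.Length == 10) digits.Insert(0, '1'); // add the country code if missing
        if (digits.Length != 11 || digits[0] != '1')
            throw new ArgumentException("Expected a 10-digit North American number.", "phoneNumber");

        return "+" + digits.ToString() + "@" + domain;
    }
}
```

Called with "(612) 859-1899", the helper returns +16128591899@aol.com.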

While this is fairly useful by itself, for it to be extremely useful you'd have to take advantage of it via one of the UC APIs. For example, you could use the UC AJAX API and Speech Server to send an SMS message to a caller with confirmation details of their call, OR maybe you create an IM bot with UCMA v1.0 that sends a transcript of the chat to the user’s SMS device. The possibilities are endless...
I was checking out the UC Developer Portal recently and it seems it is not up to date on the available Virtual Labs for UC Developers, as it only lists 5.
For a more up to date list, you can visit this URL: http://msevents.microsoft.com/CUI/AdvancedSearch.aspx?culture=en-US#culture=en-US;advanced=true;sortKey=;sortOrder=;pageEvent=false;startDate=7/21/2008;endDate=12/31/2008;kwdAny=;countryId=US;languageCode=en;audience=2;products=170;eventType=4;searchcontrol=yes;s=1
It looks like these Virtual Labs cover the UC AJAX and Communicator Automation APIs in depth. I think there is one lab on UCMA, but there doesn't appear to be one on the UC Client API. Overall it is very good material, and if you are new to using any of these APIs, take the time to go through the labs. Best of all, they are free to use.
Michiel van Oudheusden from e-office has taken up the challenge to actually hook up the WPF Presence bubble screensaver to Communicator to show the presence of your contacts!
http://unified-communications-development.blogspot.com/
The blog belongs to Joachim Farla and Marc Wetters of e-office. Make sure to check out some of the other posts.
A very interesting article on using WPF to display presence by Erik Klimczak from Clarity Consulting. It goes more into the WPF side than the actual presence, but you need to know how to make your presence bubbles look good too! I love the example of creating a screen saver out of it. It doesn't currently show actual users' presence, but the article challenges you to add that yourself. I'd also like to extend that challenge! Hint: I'd look into using the Communicator Automation API first.
Article: http://blogs.msdn.com/coding4fun/archive/2008/06/20/8626294.aspx
While Speech Server is a part of Office Communications Server, the two do not rely on each other and actually do not integrate with each other out of the box. However, that doesn't mean it cannot be done.
The two main scenarios which I am always asked about are:
1.) Communicator Calls to Speech Server
2.) Transferring Speech Server calls to Communicator
Calling Speech Server from Communicator
The first thing you need to do is setup a static route in OCS to Speech Server. Here you will need to assign a sub domain, something like ivr.domain.com. This tells OCS to route all calls where the domain contains ivr.domain.com to Speech Server.

Next in the Speech Server administrator console you will need to add the OCS Front End Server as a Trusted SIP Peer on non default ports, such as 5068 for TCP and 5069 for TLS. This is required as OCS doesn't handle the 302 Redirect Messages that Speech Server uses, by assigning non default ports we "turn off" these SIP messages. You will also need to enable Mutual TLS.

Note: This will be using TLS, OCS will already have a certificate installed, but Speech Server probably won't, now would be the time to install a certificate on Speech Server.
Next you will need to deploy your application and again assign non default ports; these should be the same ports as the Trusted SIP Peer. You can assign a "telephone number" to the application as well.

Now you can dial the Speech Server application from Communicator by dialing the static route, like 411@ivr.domain.com.
411 being the extension that you assigned to your application and ivr.domain.com being the sub domain you specified in the OCS routing tab.

Transferring Calls to OCS users
When trying to create a transfer type Speech Server application, you need to know one rule. You can only transfer via the SIP Peer from which the call originated.
Let's take the static route example we set up previously. If my Speech Server application does a transfer to another OCS user, it would transfer back to the OCS Front End Server, which would route the call appropriately via the specified SIP URI. However, you couldn't transfer the call to, say, the PSTN, as the call originated via OCS.
Note: To do this type of "internal" transfer, make sure to add Speech Server to the Host Authorization tab in the Front End Properties of OCS.
Back to the original scenario: a call comes in via the PSTN, probably to a VoIP Gateway, meaning that when we do a transfer it will be routed back to that same gateway. Depending on your VoIP Gateway, you need to have rules so that one or more numbers are forwarded to Speech Server and the rest of the numbers get routed to the Mediation Server.
Typically you do not want Speech Server to sit behind the Mediation Server but next to it, as shown in the diagram below.

In your Speech Server application when we do a transfer, instead of transferring to a SIP URI like: sip:midunn@microsoft.com;transport=tcp, you need to transfer to a TEL URI, like tel:+16128591899@microsoft.com;transport=tcp. Using the TEL URI, the gateway will correctly route it to the mediation server and the mediation server will in turn route it to the correct user.
Tip: Depending on your VoIP Gateway, SIP Proxy, PBX, whatever, specifying the transport parameter is a good idea. I've run into issues where some 3rd party SIP applications revert back to UDP if this isn't specified.
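The two URI shapes above can be produced by a small helper so the transport parameter is never forgotten. TransferTarget, Tel and Sip are hypothetical names for this sketch. The output simply mirrors the example strings in this post and is not validated against any SIP stack.

```csharp
using System;

// Hypothetical helper that mirrors the URI examples in the post.
public static class TransferTarget
{
    // Builds a TEL URI such as tel:+16128591899@microsoft.com;transport=tcp
    public static string Tel(string e164Number, string domain, string transport = "tcp")
    {
        if (string.IsNullOrEmpty(e164Number) || e164Number[0] != '+')
            throw new ArgumentException("Expected an E.164 number starting with '+'.", "e164Number");
        return "tel:" + e164Number + "@" + domain + ";transport=" + transport;
    }

    // Builds a SIP URI such as sip:midunn@microsoft.com;transport=tcp
    public static string Sip(string user, string domain, string transport = "tcp")
    {
        if (string.IsNullOrEmpty(user)) throw new ArgumentException("user is required", "user");
        return "sip:" + user + "@" + domain + ";transport=" + transport;
    }
}
```

The transport is always written explicitly, in line with the tip above.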
Bill Gates will be doing the Keynote and probably one of his last before he retires.
I'll be doing a couple of sessions at TechEd on Speech Server with Albert Kooiman. You can check out sessions and events, register, etc. at www.msteched.com
If you are attending drop me a line and let's make some time for a drink at one of the after hours events!