
Am I Done?

Unified Communications for Developers

News

  • Disclaimer These postings are provided "AS IS" with no warranties and confer no rights. They do not represent the views of Microsoft.







Server Side Speech Recognition

Last week I learned something. Well, I learned many things, but one was particularly interesting. Let's say you want to do server side speech recognition but you don't want the transport to be SIP. This would seem to rule out UCMA 2.0, since by default it relies on OCS and therefore SIP. Well, I learned that this isn't entirely true: you can actually use the Microsoft.Speech assembly that comes with the UCMA 2.0 SDK outside a UCMA application. This has been staring me in the face and it never crossed my mind to do it.

What kind of scenario does this open up? One example is an improved desktop experience. SAPI is the obvious first choice when creating a desktop application, but the SAPI experience is really dependent on the OS. For example, TTS voices differ between XP and Vista. The benefit of a server side solution is that you control the experience the end user has.

While looking to build something there are some problems to overcome. For me, building a simple sample solution, two main problems stick out.

1.) Streaming audio - I don't really want to pass around and process WAV files. You could, but the experience wouldn't be as nice because you couldn't do immediate speech recognition.
2.) Hardware access - As I won't be using System.Speech or SAPI, I am going to have to access hardware resources such as the microphone and speakers myself.

For streaming audio there are a lot of different options, but being a big fan of WCF, this is the first avenue I looked at, and sure enough you can stream content to/from WCF. Is there anything WCF can't do?

 http://msdn.microsoft.com/en-us/library/ms731913.aspx

Next, accessing the microphone and speakers via .NET. I don't want to write low level code; I want something quick and dirty. I found a very cool CodePlex project, NAudio. It abstracts a lot of the audio APIs into a single, easy to use API. This will work perfectly for my client side application.

http://www.codeplex.com/naudio

Here is a simple diagram of what I am building. I am actually very close to having something running.

Note: If this was something which I was planning to actually put in a production environment, I'd separate the Synthesizer and Recognizer. These will be independent calls and would benefit from being on their own servers.

A code sample will be available soon. I'm hoping to have it up on MSDN samples by November 16th.

 

Custom Disclaimer

A lot of businesses want to add disclaimers to external communications. Using MSPL on the Edge servers we can create something to simply add this disclaimer to an IM Conversation.

Since MSPL hasn't really changed from LCS to OCS, we can use the sample provided for LCS to implement a custom disclaimer for OCS.  You can download the sample from: http://www.microsoft.com/downloads/details.aspx?familyid=ee41345b-836c-4dcf-9810-32709ae9f5c4&displaylang=en&tm

You'll have to make some project modifications because it doesn't compile out of the box. These are easy fixes, like missing references.

If you want to first run the console application, for debugging and/or learning purposes, you'll notice that the application manifest, ".am", is missing. Just copy DisclaimerService.am from the service project, open it in Notepad and change the AppUri to http://www.microsoft.com/LC/Disclaimer/Console.

Finally, when you are ready to install the service portion of the solution, you'll have to change the ServicesDependedOn property on serviceInstaller1 in ProjectInstaller.cs, as it is currently set to "LCSProxy.exe". Just set this to be blank.

After these minor changes, simply continue following the instructions included with the sample.

UCMA WF Application Host

As promised I've released some code that shows you how to create a reusable application host that will run your UCMA 2.0 Workflow Applications.

You can find the code, binaries and an initial "How To":
http://code.msdn.microsoft.com/ucmahost

The home page of the project site gives a basic walk through. One item I only briefly mentioned on the site is the ability to interface with the service via MSMQ. Instead of specifying an Inbound or Outbound workflow, you can just provide the name of a private queue. The service will watch that queue and send an alert via a default Outbound WF or the Outbound WF you provide.

To use this, simply write a string in this format to the queue: sipuri={0}&message={1}&type={2}, providing who you want to send the message to, the actual message, and the type. The type can be 1 (Audio) or 2 (IM).
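The message format above can be sketched in a few lines of C#. This is only an illustration: the AlertType and AlertMessage names are made up for this example and are not part of the UCMA host, and the sketch does not escape the values because the post does not say whether the host expects that, so values containing "&" would be ambiguous.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; only the string format comes from the post.
public enum AlertType { Audio = 1, InstantMessage = 2 }

public static class AlertMessage
{
    // Builds the body in the documented form: sipuri={0}&message={1}&type={2}
    public static string Format(string sipUri, string message, AlertType type)
    {
        return string.Format("sipuri={0}&message={1}&type={2}", sipUri, message, (int)type);
    }

    // Splits a body back into its fields (what a queue reader might do; assumed, not taken from the host).
    public static Dictionary<string, string> Parse(string body)
    {
        var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string pair in body.Split('&'))
        {
            int separator = pair.IndexOf('=');
            if (separator <= 0) continue; // skip fragments without a key
            fields[pair.Substring(0, separator)] = pair.Substring(separator + 1);
        }
        return fields;
    }
}
```

Writing the string to the private queue itself would go through System.Messaging (for example MessageQueue.Send). That part is left out so the sketch runs anywhere, and which message formatter the host expects is not stated here.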

Unified Communications Managed API v2.0 Workflow Applications

UCMA v2.0 includes a great abstraction layer, Communication Workflows, based on the Windows Workflow Foundation (WF).

These workflows are great for those needing to create simple yet powerful UC enabled applications. The workflow activities themselves are pretty self explanatory. I'll cover the activities themselves in another post.

The "difficult" portion of creating a UCMA workflow application isn't the actual workflow, but creating the host. Yes, the default template wraps your WF application in a nice little console application, but this is really only for debugging purposes. Are you really going to deploy your console application in production? Those familiar with Speech Server are probably scratching their heads: while Speech Server had a nice service and administrator console that allowed you to easily point it to your Speech WF application, UCMA does not. You need to build the host yourself, and build it for every UC application you write, and to truly do that you need to know at least a little bit about the UCMA Core API. Joe Calev has a great blog which walks you through the Signaling portion of the Core API. For the WF API you really should have some understanding of the Collaboration portion of the Core API.

First you will need to make sure that you have a certificate for your application, and you will need to provision an application on the OCS Front End. You can do this by using the Application Provisioner and following these instructions: http://msdn.microsoft.com/en-us/library/dd253360(office.13).aspx

 This is the key information that our application will need:

  • Application Name
  • Application Port
  • Application SIP URI
  • Application Server FQDN
  • OCS Pool FQDN
  • OCS Pool Port
  • GRUU
  • Certificate Information

Note: You should never call the asynchronous methods synchronously, as in object.End(object.Begin(...)). You'll see some of this in sample code and blog posts (including the snippets in this post, which do it only to keep the code short). Do not do this in a production application!
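To make the note concrete, here is a minimal sketch of the callback style that the Begin/End pattern is meant for. It uses Stream.BeginRead and Stream.EndRead from the .NET Framework as a stand-in, since the UCMA types need an OCS environment to run; the ApmDemo class and its parameter names are invented for this example. The UCMA Begin/End pairs follow the same IAsyncResult pattern, so the shape should carry over.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; Stream stands in for a UCMA object.
public static class ApmDemo
{
    // Starts the read and returns immediately; End is called inside the callback,
    // so the calling thread is never blocked waiting for the operation.
    public static void ReadAsync(Stream source, byte[] buffer,
                                 Action<int> onCompleted, Action<Exception> onFailed)
    {
        source.BeginRead(buffer, 0, buffer.Length, asyncResult =>
        {
            try
            {
                // EndRead returns the result or rethrows any failure from the operation.
                int bytesRead = source.EndRead(asyncResult);
                onCompleted(bytesRead);
            }
            catch (Exception ex)
            {
                onFailed(ex);
            }
        }, null);
    }
}
```

The point of the design is that the End call and the error handling live in the callback, so a slow operation ties up no thread while it is in flight.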

Armed with the necessary information, we can start looking at some code:

First we need to create an instance of the CollaborationPlatform. For this post, the importance of this object is that it is used to describe our application. It has two constructors, one that takes a ClientPlatformSettings parameter and the other that takes a ServerPlatformSettings.

The difference here is that ClientPlatformSettings will use an AD account, while ServerPlatformSettings will use a Contact Object. We reference the Contact Object via the GRUU provided by the Application Provisioning tool. Some features are lost if you use ClientPlatformSettings, such as Impersonation. Here is where I typically draw the line: if the application is going to consume the Audio or Instant Message call and perform some logic based on the input, I use ServerPlatformSettings (for example, response bots). If my application is just providing an abstraction for a user who already has credentials to sign in and message other users, then I use ClientPlatformSettings (for example, building your own authenticated web messaging).

For this post, we will use the ServerPlatformSettings.

ServerPlatformSettings platformSettings = new ServerPlatformSettings(appName, appServerFqdn, applicationPort, gruu, cert);
CollaborationPlatform collabPlatform = new CollaborationPlatform(platformSettings);

collabPlatform.EndStartup(collabPlatform.BeginStartup(null, null));

Once the CollaborationPlatform object has completed starting up, we need to set up our endpoint. There are two choices, ApplicationEndpoint and UserEndpoint. If you are using ClientPlatformSettings then you need to use the UserEndpoint; since we are using ServerPlatformSettings we need to use the ApplicationEndpoint.

ApplicationEndpointSettings settings = new ApplicationEndpointSettings(applicationUri, ocsFqdn, ocsTlsPort);
ApplicationEndpoint endpoint = new ApplicationEndpoint(collabPlatform, settings);

If our WF application is an Inbound application, we need to register for the type of calls the WF will handle, but remember that the WF can only accept AV and IM calls and can't accept conferencing calls. That’s not to say you couldn’t register to receive a conference invite, and handle the conferencing features using the Core API itself, but that is another post.

endpoint.RegisterForIncomingCall<AudioVideoCall>(AudioVideoCallReceived);
endpoint.RegisterForIncomingCall<InstantMessagingCall>(InstantMessagingCallReceived);

Once we have done that, we can establish our endpoint.

endpoint.EndEstablish(endpoint.BeginEstablish(null, endpoint));

Next we need to start the Workflow Runtime and add the UCMA WF Services, this isn’t so much related to UCMA as it is to WF itself.

WorkflowRuntime workflowRuntime = new WorkflowRuntime();
workflowRuntime.AddService(new CommunicationsWorkflowRuntimeService());
workflowRuntime.AddService(new TrackingDataWorkflowRuntimeService());
workflowRuntime.StartRuntime();

Finally we need to route the call to our workflow, via the handler methods we registered for earlier.

AudioVideoCallReceived
InstantMessagingCallReceived

When a call comes in, we need to pass the call to our workflow application and from there the Workflow handles the rest:

WorkflowInstance workflowInstance = workflowRuntime.CreateWorkflow(WorkflowType);

CommunicationsWorkflowRuntimeService communicationsWorkflowRuntimeService = (CommunicationsWorkflowRuntimeService)workflowRuntime.GetService(typeof(CommunicationsWorkflowRuntimeService));

communicationsWorkflowRuntimeService.EnqueueCall(workflowInstance.InstanceId, call); //Call object is passed by the receiving handler
communicationsWorkflowRuntimeService.SetEndpoint(workflowInstance.InstanceId, endpoint); //Endpoint object is a local variable
communicationsWorkflowRuntimeService.SetWorkflowCulture(workflowInstance.InstanceId, new CultureInfo("en-US"));

workflowInstance.Start();

 

I was thinking about it, and all of this is pretty repetitive code from one workflow application to another. It all has to be done, every time, for every WF project. I've created a solution that I will release on CodePlex later this week. Basically the solution is a Windows Service that will take the parameters (GRUU, port, etc.), log into OCS and route calls to the workflow you provide as a compiled assembly. Part of this solution also utilizes MSMQ, in much the same way Speech Server did. You can associate an MSMQ queue name with an application; if a message arrives in that queue, it will fire off the workflow you specify or use the default built-in WF application.

The gist of it is, instead of worrying about writing all the code above, you can concentrate on just the workflow portion. Compile the WF as an assembly instead of a console application, add the application parameters to the App.Config of this UCMA Application Host service and you are done! A picture is worth a thousand words, so here is a high level picture of the solution.

 

 

I am also working on releasing the following solutions on CodePlex:

  • Updated Web Chat sample. The currently released Web Chat uses UC AJAX; the updated one will use UCMA 2.0, WCF and Silverlight.
  • Client Framework. An abstraction layer over both UC Client and Communicator Automation.

 

 

OCS 2007 R2 Virtual Launch

Sign up for the OCS 2007 R2 Virtual Launch on February 3rd.
http://www.microsoft.com/communicationsserver/virtualevent/languageselect.aspx

And just to keep you entertained until February 3rd, here are some R2 launch teaser videos:
http://www.youtube.com/profile?user=OCSR2Launch&view=videos

Windows Speech Recognition Macros Tools have Shipped!

I blogged about these earlier and they have finally shipped. You can download them here:

http://www.microsoft.com/downloads/details.aspx?FamilyID=fad62198-220c-4717-b044-829ae4f7c125&displaylang=en

Status Update

I've been MIA for a couple of months, mainly waiting for Office Communications Server 2007 R2 to RTM, but I've also been busy at work.

I promise I will get to the UCMA 2.0 Workflow post, but those of you wanting to get a taste of UCMA now should check out Joe Calev's blog. He currently has a full series on UCMA 2.0 Core. http://blogs.msdn.com/jcalev/default.aspx

Right now I am in Redmond for 3 weeks for the first rotation of the Microsoft Certified Master program for Office Communications Server 2007 R2. All I can say is that it will be very, very long days, including training on the weekend. As much as I am looking forward to it, I am also dreading it. 3 weeks away from home, actually 4 weeks as I will also be attending TechReady, our internal conference, and probably 11-12 hours a day in a classroom... This class is probably not for the faint of heart.

If you are not sure what MCM is, you can read more here: http://www.microsoft.com/learning/mcp/master/OCS/default.mspx

The long time away from home will actually give me some time to finally blog.

Speech Server 2007 vs UCMA v2.0 WF activities

I am planning a series of blog posts that will show you how to create UCMA WF applications that can answer the telephone with speech recognition and speech synthesis abilities. However before I do that I HAVE to explain the differences between UCMA and Speech Server so that there is no confusion.

 

Speech Server vs UCMA

 

With OCS 2007 R2 there has been an update to UCMA (Unified Communications Managed API), simply named UCMA 2.0. This new and improved API has a lot of new features: support for presence, telephony, speech recognition, speech synthesis, etc. In R2 there is NO update for Speech Server. That being said, you can continue to run Speech Server as is, in tandem with your R2 environment.

 

On top of the Core API is a new WF API, which abstracts a lot of common "activities" that a UCMA application might do. At first glance it may appear that this WF API is a replacement for Speech Server, but there are a few major differences that you need to consider when deciding if an application should be a Speech Server application or a UCMA application.  Here are what I consider the 4 major deciding factors when choosing between Speech Server or UCMA.

  1. Platform vs. API

The main difference between these two is that Speech Server is not only an API, but also an enterprise grade IVR platform to host these applications. With Speech Server you only have to worry about developing the front end: a SALT, VoiceXML or Speech WF application.

UCMA 2.0 WF is simply an API. You need to build the front end of the application as well as the host application. You can host these applications in a Windows Form, console application, Windows Service, etc. Obviously a Windows Service makes the most sense.

 

Note: UCMA 2.0 does NOT have built-in activities for VXML nor SALT.

  2. Infrastructure

The big difference here is that Speech Server typically sits off as its own branch from your PBX or Media Gateway, while UCMA 2.0 sits behind the mediation server. At first this seems trivial and maybe even a good idea, but there is a reason a typical IVR application sits outside of the voice network that you use for day to day communications. For example: do you want your IVR to take up internal bandwidth and/or the lines that you use for your telephone communications?

If you are only expecting a small number of calls, then deploying a UCMA 2.0 application shouldn’t be a big issue, but if you are building a UCMA 2.0 application that is going to continuously use 50+ ports, you need to do some planning and additional infrastructure work before deploying behind your existing OCS environment.

 

Note: 50 simultaneous ports is a relative number. The key takeaway here is don’t expect to simply throw a UCMA application behind your existing mediation server without considering the impact. You can add additional mediation servers and gateways to solve scaling problems.

  3. Developer Tools

Speech Server contains developer tools that abstract things like SRGS grammars with a nice visual grammar editor. UCMA does not yet have these tools, so be prepared to write some of your own SRGS by hand. It is a small feature, but it can save a lot of development effort. I’ve said it before and I’m saying it again: most of your development effort with any speech enabled solution will be spent on grammars. In a UCMA application, SRGS is not only used to recognize speech, but also text from instant messaging conversations.

 

UCMA also does not have the SIP Debugging Phone that Speech Server has. This means that in order to debug a UCMA application you need to call the application via Communicator and/or an actual telephone. Also, with the infrastructure requirement of an actual OCS environment with a mediation server, you won’t be able to debug an application very easily “offline”.

Don’t expect to write and debug UCMA 2.0 applications sitting on a plane like you did with Speech Server. Yes, you can create a virtualized OCS 2007 R2 environment, but it is going to require Hyper-V and at least 8 GB of RAM. I’ve finally upgraded so I can do just that, but I realize that not everyone is going to be able to.

  4. Reporting & Tuning

Speech Server has the Tuning & Analysis tools, which come in handy for simple reports like “How many calls did my application get?” and “How many times did a grammar fail?”. This reporting is absolutely necessary for any enterprise class IVR solution. If you want some of this reporting for UCMA 2.0, you are going to have to build it yourself.

Beyond the reporting side, there are the tuning tools, which allow you to test grammar changes against actual callers' recorded audio before deploying.

 

I hope this post has helped you understand the differences between Speech Server and UCMA. UCMA 2.0 clearly is a step on the roadmap to include speech tightly into the UC platform and OCS. Expect that things like the speech tools will be available by the next release in the not too distant future.

 

Next post will talk about the features of UCMA…

 

Office Communications Server 2007 R2 Unveiled at VoiceCon Amsterdam

News Article here: http://biz.yahoo.com/prnews/081014/aqtu055.html?.v=76

This announcement allows me to start blogging about R2 development features. Expect many UCMA v2.0 blog posts soon!

 

CTI (Computer Telephony Integration) w/ Office Communications Server

Side Note: I know I haven’t blogged much recently, but I’ve changed roles within Microsoft, been on parental leave and been busy working with customers. However, I am planning a series of posts about creating UC solutions.

I get this question a lot: "Does Office Communications Server integrate with CTI?" The answer: it depends.

Why do I say that? Well, when most people ask about CTI, they are typically asking about CSTA applications, such as ACD, Predictive Dialing, Screen Pops, etc. And it depends on your OCS implementation.

Remote Call Control

You can run your current CSTA based CTI application in tandem with your OCS environment in a Remote Call Control scenario, as your CTI server will sit as a branch off the PBX. However, the Remote Call Control scenario has no real interaction between OCS and the CTI server, so you can't call that real integration. Besides, the Remote Call Control scenario doesn't utilize the full potential of our UC platform.

 

Enterprise Voice with PBX Integration

Can an OCS deployment with Enterprise Voice support your CSTA based CTI applications? No, because your internal telephones are no longer "directly" linked to the PBX; instead all telephone calls are routed through your OCS infrastructure.


Does this mean OCS doesn't support CTI applications?

No, it just means that it doesn't support CSTA based CTI applications. CTI doesn't have to be based on CSTA. If you look at what CSTA is, it is essentially a schema: a schema that defines what a telephone call is. From a software developer's perspective this is an antiquated approach to solving a common problem. Why don't we have predefined schemas for everything, such as customer data, financial transactions, etc.? All businesses have customers and money changing hands, yet we don't commonly see standardized schemas. Yes, having predefined schemas for everything would solve a lot of problems, but businesses are unique. Just because you run a bakery doesn't mean all bakeries care about the same data.

I know it seems like a Microsoft employee bashing "standards" but keep reading.

Let's take "Screen Pops" for example, when a call is transferred from the IVR application to an agent, the agent should get the customer's information from the data the IVR collected. In a traditional CSTA environment the call information and relevant data is written to the CTI server. When the call gets transferred to the agent, the agent has a piece of software sitting on the desktop which either queries or gets notified with the appropriate data from the CTI server.  If you remove the context of the telephone, it simply is a client application that gets notification from a server application that an event has occurred and passes the data to the client.

.NET and/or Java developers create client/server applications every day! As a .NET developer I'd create a WCF service which the IVR application could call and pass in the data it collected, along with who it is transferring the call to. On the agent's desktop I would have a small application that would register its endpoint with the service and listen for any events that pertain to it. The client application could even integrate directly with a CRM application, provided that the CRM application has some sort of API.
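The register-and-notify idea can be sketched without any telephony at all. The ScreenPopHub class below is a hypothetical in-memory stand-in for the WCF service: in a real deployment Register would probably be a duplex callback subscription and NotifyTransfer a service operation, but those are assumptions and not something the post specifies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process stand-in for the screen pop service described above.
public sealed class ScreenPopHub
{
    private readonly object _sync = new object();
    private readonly Dictionary<string, Action<IDictionary<string, string>>> _agents =
        new Dictionary<string, Action<IDictionary<string, string>>>(StringComparer.OrdinalIgnoreCase);

    // Called by the agent's desktop application to start listening for its own events.
    public void Register(string agentUri, Action<IDictionary<string, string>> onCallData)
    {
        lock (_sync) { _agents[agentUri] = onCallData; }
    }

    public void Unregister(string agentUri)
    {
        lock (_sync) { _agents.Remove(agentUri); }
    }

    // Called by the IVR when it transfers a call; returns false if the agent is not listening.
    public bool NotifyTransfer(string agentUri, IDictionary<string, string> collectedData)
    {
        Action<IDictionary<string, string>> handler;
        lock (_sync)
        {
            if (!_agents.TryGetValue(agentUri, out handler)) return false;
        }
        handler(collectedData); // invoked outside the lock so a slow client cannot block others
        return true;
    }
}
```

Nothing in the sketch knows what a telephone call is; the IVR only hands over whatever data it collected along with the agent it is transferring to.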

I know you are probably reading this and wondering what does this have to do with OCS? Well nothing and that's just the point…

In this Screen Pop solution, the actual Screen Pops have nothing to do with OCS. The OCS APIs could be used to create the front end IVR application and to give the business logic presence information for agents, so that the IVR knows who to route the call to, NOT to display Screen Pops. I know what I am saying goes against what others have suggested, such as the UC Client API - Screen Pop Sample. It is, however, just a sample and is only one way to solve a problem. In my opinion, creating a custom OCS client to display “Screen Pops” does "lock" you down to OCS, not to mention the increased investment of time to develop the application.

You shouldn't try and reinvent the wheel using the UC APIs, but use them for something that doesn't already exist. My rule of thumb is that the UC APIs should be used to serve two purposes: 1.) Provide modalities to your applications. IE: Telephony and IM  2.) Provide your application with presence data. 

Conclusion

CSTA defines a "standard" so that you aren't “locked” down to proprietary extensions. By applying modern SOA techniques you create a solution that is loosely coupled and could easily be adapted to any other OCS-like product out there, and therefore does not commit you forever to a single product and/or platform. Believe it or not, as a Microsoft employee, I don’t want you to feel that you are “locked” into our products. I want you to buy our products because they are the best in the industry and will give you the greatest return on your investment.

 

New UC API Samples Released!

A few new code samples have been released on various UC APIs. I haven't checked them all out yet, but I can speak a little to the "Integrating Web Chat Functionality" sample as I know the guy who wrote it...

The Web Chat Functionality sample shows you how you can have anonymous web users have an IM conversation with an OCS end user, without the web user providing any credentials.

 Web Chat

Think about visiting a web site, for example an ecommerce website, where the customer has a question about a product. While they could email and/or call, the customer would have to leave the computer and make the extra effort. The Web Chat sample tries to solve that problem by allowing customers to simply click on a link and have an IM conversation via their browser with a representative and/or a bot from the ecommerce website.

This is one example of "Click to Chat".

The technology behind the solution is fairly simple: it wraps the UC AJAX API in a service, in this case a WCF service which logs into CWA via a single UC enabled account. The WCF service is the only application that communicates with CWA. The browser application uses the .NET 3.5 WebHttpBinding and JSON to communicate directly with the WCF service.

The WCF service controls which browser endpoints get which of the messages it has received, by assigning each web user a GUID. The browser application uses a polling technique to get the messages from the service.
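The GUID-plus-polling approach might look roughly like the sketch below. WebChatMailbox and its method names are invented for illustration and are not taken from the released sample; the real service would sit behind WCF operations and talk to CWA, which is left out here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-user mailbox; a sketch of the routing idea, not the sample's code.
public sealed class WebChatMailbox
{
    private readonly object _sync = new object();
    private readonly Dictionary<Guid, Queue<string>> _pending = new Dictionary<Guid, Queue<string>>();

    // Gives an anonymous web user an identity the service can route by.
    public Guid Join()
    {
        Guid webUserId = Guid.NewGuid();
        lock (_sync) { _pending[webUserId] = new Queue<string>(); }
        return webUserId;
    }

    // Called when the service receives a message meant for one web user.
    public bool Deliver(Guid webUserId, string message)
    {
        lock (_sync)
        {
            Queue<string> queue;
            if (!_pending.TryGetValue(webUserId, out queue)) return false;
            queue.Enqueue(message);
            return true;
        }
    }

    // Called by the browser on each poll; returns and clears whatever is waiting.
    public IList<string> Poll(Guid webUserId)
    {
        lock (_sync)
        {
            Queue<string> queue;
            if (!_pending.TryGetValue(webUserId, out queue)) return new List<string>();
            var drained = new List<string>(queue);
            queue.Clear();
            return drained;
        }
    }
}
```

Because each poll drains only the caller's own queue, two browsers sharing the single UC enabled account never see each other's messages.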

 WPF Presence Controls

Another exciting sample is the WPF Presence Controls. While we had the WinForms controls, we didn't have WPF controls, and they were requested by everyone who looked at the WinForms controls. Download them and check out George Durzi's blog post about these controls.
http://blogs.claritycon.com/blogs/george_durzi/archive/2008/09/08/wpf-presence-controls-for-microsoft-office-communicator.aspx

Integrating Web Chat Functionality - Microsoft Unified Communications AJAX API Sample
 http://www.microsoft.com/downloads/details.aspx?FamilyId=C8C3F762-7BE4-4541-9B18-82499DB61293&displaylang=en
 
WPF Presence Controls for Microsoft Office Communicator 2007 - Microsoft Office Communicator 2007 SDK Sample
 http://www.microsoft.com/downloads/details.aspx?FamilyId=5001D612-533A-4721-91EA-DA990D94FF0F&displaylang=en
 
Dynamics CRM Integration with Office Communications Server
 http://www.microsoft.com/downloads/details.aspx?FamilyId=6E2EA762-A6C9-43BD-8C84-BF610073765C&displaylang=en
 
Customer Relationship Management (CRM) Activity - Microsoft Unified Communications Managed API 1.0 Sample
 http://www.microsoft.com/downloads/details.aspx?FamilyId=16303459-DD75-451F-B7C0-FB2EB0D9A84A&displaylang=en
 
Communicator 2007 Custom Tabs - Microsoft Office Communicator 2007 Sample
 http://www.microsoft.com/downloads/details.aspx?FamilyId=621C675C-46B7-4F68-ADDC-9F44E5594BFB&displaylang=en
 

Sending SMS messages via Communicator

If you are using PIC (Public IM Connectivity) in OCS, you can send SMS messages to any capable phone in North America via AOL's SMS Gateway, simply by sending a message to an E.164 normalized phone number (for example +16128591899) with the AOL domain name, aol.com, appended.
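As a small illustration, the helper below turns a North American phone number into that kind of address. The SmsAddress class is made up for this example, and it assumes the address takes the usual user@domain form, which the post implies but does not spell out.

```csharp
using System;
using System.Text;

// Hypothetical helper; the "@aol.com" address form is an assumption based on the description above.
public static class SmsAddress
{
    public static string ForNorthAmerica(string phoneNumber)
    {
        // Keep only the digits, dropping spaces, dashes, parentheses and any leading "+".
        var digits = new StringBuilder();
        foreach (char c in phoneNumber)
        {
            if (c >= '0' && c <= '9') digits.Append(c);
        }

        // A 10 digit number is missing the North American country code.
        if (digits.Length == 10) digits.Insert(0, '1');

        if (digits.Length != 11 || digits[0] != '1')
        {
            throw new ArgumentException("Not a North American phone number.", "phoneNumber");
        }

        return "+" + digits.ToString() + "@aol.com";
    }
}
```

Calling SmsAddress.ForNorthAmerica("(612) 859-1899") produces +16128591899@aol.com, which can then be used as the recipient of an ordinary IM.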

While this is fairly useful by itself, for it to be extremely useful you'd have to take advantage of it via one of the UC APIs. For example, you could use the UC AJAX API and Speech Server to send a caller an SMS message with confirmation details of their call, OR maybe you create an IM bot with UCMA v1.0 that sends a transcript of the chat to the user’s SMS device. The possibilities are endless...

Unified Communications Developer Labs

I was checking out the UC Developer Portal recently and it seems it is not up to date on the available Virtual Labs for UC developers, as it only lists 5.

For a more up to date list, you can visit this URL: http://msevents.microsoft.com/CUI/AdvancedSearch.aspx?culture=en-US#culture=en-US;advanced=true;sortKey=;sortOrder=;pageEvent=false;startDate=7/21/2008;endDate=12/31/2008;kwdAny=;countryId=US;languageCode=en;audience=2;products=170;eventType=4;searchcontrol=yes;s=1

It looks like these Virtual Labs cover UC AJAX and Communicator Automation in depth. I think there is one lab on UCMA, but there doesn't appear to be one on the UC Client API. Overall it is very good material, and if you are new to any of these APIs, take the time to go through the labs. Best of all, they are free to use.

Re: Presence in WPF

Michiel van Oudheusden from e-office has taken up the challenge to actually hook up the WPF Presence bubble screensaver to Communicator to show the presence of your contacts!

http://unified-communications-development.blogspot.com/

The blog belongs to Joachim Farla and Marc Wetters of e-office. Make sure to check out some of the other posts.

Presence in WPF

A very interesting article on using WPF to display presence by Erik Klimczak from Clarity Consulting. It goes more into the WPF side than the actual presence, but you need to know how to make your presence bubbles look good too! I love the example of creating a screen saver out of it. It doesn't currently show actual users' presence, but the article challenges you to add that yourself. I'd also like to extend that challenge! Hint: I'd look into using the Communicator Automation API first.

Article: http://blogs.msdn.com/coding4fun/archive/2008/06/20/8626294.aspx
