One of the most serious limitations of the current implementation of the PollingDuplex mechanism in Silverlight 3 (the one that allows a server to talk to Silverlight applications in the browser) is its lack of support for multiple backends. In other words, if your environment includes multiple machines and does not provide intelligent routing of messages from one client to only one server in the backend cluster, your otherwise perfect Silverlight applications utilizing the PollingDuplex binding will simply not work - clients will poll servers that have no record of them and the duplex channel will fail.
This limitation, combined with the Azure load balancer's lack of session "stickiness", renders the only backend-to-frontend communication mechanism in Silverlight 3 unusable there - as soon as you start taking advantage of Azure scalability and provision two or more web roles, your Silverlight applications based on the PollingDuplex binding will start to fail.
The bad news is there is no easy solution to this - the PollingDuplex server-side library in SL3 keeps client sessions in memory and doesn't allow these sessions to be shared with other instances or machines.
The good news is it's really not that hard to implement a distributed flavor of the Long Polling pattern, which is what SL3 uses. Here's my solution:
1. Login / Silverlight XAP download
2. Client establishes communication with Duplex Service by sending a “connect” request over basicHttp
3. Duplex Service registers the client by creating a GUID for it and a corresponding queue for that GUID in AQS, and returns this GUID to the client in the HTTP response
4. Client remembers the GUID and considers itself connected
5. Client sends a long poll request containing the GUID to the server. If no message arrives, the web role waits for the requested timeout period and sends an empty response. Connection with web role ends.
6. Web role puts messages from the client into the outbound queue (one per app) in response to a custom operation call from the client
7. Worker role’s timer picks up messages one by one from the outbound queue and routes them to external destinations based on config/payload
8. Worker role receives a message from an external source, extracts the client GUID from config/payload and routes the message to that client's inbound queue
9. Web role’s timer picks up messages from the inbound queue and immediately passes them to the accessor on Duplex Service
10. Duplex Service sends the message as the response on the open connection with the client. Connection with web role ends.
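The steps above can be sketched in a few lines of Python. This is only an in-memory illustration, not the code from the modified sample: `QueueStore` stands in for the Azure queues, `WebRole` for a Duplex Service instance, and `worker_route_once` for the worker role's timer. All three names are hypothetical, and the routing rule (deliver to every client except the sender, as a chat would) is an assumption standing in for the "external destinations" of steps 7 and 8.

```python
import queue
import uuid


class QueueStore:
    """Stand-in for the shared queue service: state lives here, not in a web role."""

    def __init__(self):
        self.inbound = {}              # one inbound queue per client GUID (step 3)
        self.outbound = queue.Queue()  # one outbound queue per app (step 6)


class WebRole:
    """Stand-in for one Duplex Service instance; it holds no client state itself."""

    def __init__(self, store):
        self.store = store

    def connect(self):
        # Steps 2-3: create a GUID and its queue, return the GUID to the client.
        client_id = str(uuid.uuid4())
        self.store.inbound[client_id] = queue.Queue()
        return client_id

    def send(self, client_id, payload):
        # Step 6: a client operation call lands in the shared outbound queue.
        self.store.outbound.put((client_id, payload))

    def long_poll(self, client_id, timeout):
        # Steps 5, 9, 10: hold the request until a message arrives or the
        # timeout expires; the empty response is modeled as None.
        try:
            return self.store.inbound[client_id].get(timeout=timeout)
        except queue.Empty:
            return None


def worker_route_once(store):
    """Steps 7-8 (simplified): move one outbound message to the other clients."""
    try:
        sender, payload = store.outbound.get_nowait()
    except queue.Empty:
        return False
    for client_id, inbound_queue in store.inbound.items():
        if client_id != sender:
            inbound_queue.put(payload)
    return True
```

Because both the GUID registry and the messages live in the shared store, a client can connect through one `WebRole` instance and long-poll through another, which is exactly the case a non-sticky load balancer produces.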
I have modified the Silverlight Chat sample to follow this behavior and you can find it here.
The only thing you need for it is an active Azure account. Insert account name/key for storage (only queues are used) to web.config (used by web roles) and ServiceConfiguration.cscfg (used by worker roles), set the number of web roles you want and launch it along with two or more browsers pointing to SilverlightClientTestPage.html. Let me know if you have any feedback on it – it might be very useful for future Silverlight communication stack functionality planning.
Seems like a great approach, but there is a transaction cost of 1/10000 of a cent for each queue poll operation. Most duplex communications try to simulate real-time responses -- this would mean constant polling of the queue (e.g. every 25 ms). Taking this approach and scaling it out (which is what anyone using Azure intends to do, right?) would cause cost to scale unfavorably.
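To put rough numbers on that concern, here is a small back-of-the-envelope calculation in Python. The per-transaction price and the 25 ms interval are the figures quoted above; the function name and the idea of counting "pollers" (each timer that polls a queue) are assumptions made for illustration, and actual pricing may differ.

```python
# Price quoted in the comment: 1/10000 of a cent per queue transaction.
PRICE_PER_TRANSACTION_USD = 0.01 / 10_000

MS_PER_DAY = 24 * 60 * 60 * 1000


def daily_poll_cost(poll_interval_ms, pollers=1):
    """Estimated USD per day spent on queue polls alone (hypothetical helper)."""
    polls_per_day = (MS_PER_DAY / poll_interval_ms) * pollers
    return polls_per_day * PRICE_PER_TRANSACTION_USD


if __name__ == "__main__":
    for interval in (25, 250, 1000):
        print(f"{interval} ms interval: ${daily_poll_cost(interval):.3f} per day per poller")
```

At a 25 ms interval a single poller makes 3,456,000 polls a day, or roughly $3.46 per day, and the figure grows linearly with the number of queues being polled.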
I don't understand why the Azure team doesn't add session stickiness to the load balancer or provide a mechanism for clients to address roles without going through the load balancer. Without this capability, the example you provide above is very difficult to use for applications such as real-time streaming stock quotes, massively multiplayer games, and chat rooms.
Your concern is valid for near-real-time, mission-critical applications. Session stickiness is tough to build into the cloud LB, as it would put an unusually high burden on an otherwise very elastic and stateless infrastructure where every node can be brought down at any moment without sacrificing overall availability. I doubt it will be implemented in Azure any time soon.
Something else to consider here - using Bloom filters together with something like memcached or Velocity, relying on inter-node communication. Let me just say it has been done in a very scalable fashion without spending too much on transaction costs.
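The comment above does not say how Bloom filters were applied. One possible reading, and it is only an assumption, is that a node keeps a compact filter of client GUIDs that may have pending messages and consults it before spending a queue transaction. A minimal Bloom filter in Python, with hypothetical names and sizes, might look like this:

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A node could skip the queue poll whenever `might_contain(client_guid)` returns False. Note that a plain Bloom filter cannot remove entries, so in practice it would have to be rebuilt or replaced periodically.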