IdeaCloud is a tool that visualizes tweets and Instagram pictures by creating an animated mosaic on an HTML5 Canvas in real time. IdeaCloud creators Michael Liu and Raymond Tsang explain how they distributed the workload on Azure, and why.
In this post Michael and Raymond explain
As social media adoption grows rapidly, businesses have flocked to social media platforms to advertise their events, track potential attendees and improve their overall reach. But there hasn’t been a way to capitalize on social media during the events themselves. I recently went to a great tech conference and realized that I was very active before the event started: checking the event group, reading up on networking opportunities and interacting with my fellow attendees. But as soon as the event started, I lost that social media connection. The entire social media ‘pre-party’ ended abruptly, and the event didn’t give me a reason to tweet, message or connect my social media network to it.
I think it’s fair to say this event did not engage my social media presence. This is a recurring theme for many attendees. So far, event organizers are unable to:
One way to help solve these problems is with IdeaCloud. With IdeaCloud, event organizers can engage their audience by giving people a reason to tweet and send pictures. More importantly, people who normally wouldn’t use Twitter or Instagram WANT to participate and are engaged.
IdeaCloud is a tool to visualize tweets and Instagram pictures in one place in real time. IdeaCloud pulls tweets and Instagram images based on your search criteria, and forms an animated mosaic on an HTML5 canvas.
IdeaCloud is powered by HTML5, open source technologies and Azure. It uses the following technologies:
1. Azure VM and Worker Role
2. Azure Tables, Blob, Queues
3. HTML5 - Canvas and WebWorker Multi-threaded
4. SignalR (Websockets)
IdeaCloud’s social media processing and analysis requires intense CPU time and high-frequency I/O. To avoid straining the web servers, IdeaCloud needs to be highly scalable and must be able to distribute its workload. With scalability in mind, IdeaCloud was developed on the Microsoft Azure platform and fully leverages Azure VMs, Worker Roles, Queues, and Tables.
Distributed Workload Architecture
Let’s look at the overall architecture of IdeaCloud on Azure. IdeaCloud’s front-end website runs on ASP.NET MVC 5 and is deployed on Azure VMs. While the VMs serve web content to the users, they are also responsible for preparing the overall workload requests. For example, when a user is viewing IdeaCloud, the website prepares separate requests for pulling the Twitter, Instagram, Facebook, and Yammer feeds. Each of these requests is then saved into the Azure Queue.
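To make the request-preparation step concrete, here is a minimal sketch in Python. It is not IdeaCloud's actual code: the message fields (`wall_id`, `feed`, `query`), the `build_requests` helper, and the use of a plain `deque` in place of the Azure Queue are all assumptions made for illustration.

```python
import json
from collections import deque

# Feed names come from the paragraph above; the message layout is hypothetical.
FEEDS = ("twitter", "instagram", "facebook", "yammer")


def build_requests(wall_id, search_term):
    """Create one JSON work request per social network for a given wall."""
    return [
        json.dumps({"wall_id": wall_id, "feed": feed, "query": search_term})
        for feed in FEEDS
    ]


# A deque stands in for the Azure Queue in this sketch.
work_queue = deque()
for message in build_requests("wall-1", "#techconf"):
    work_queue.append(message)

print(f"{len(work_queue)} requests queued")
```

Splitting the work into one small message per feed means each request can be picked up by a different worker, which is what lets the back end spread the load.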
The Social Media Processor (SMP) is a Worker Role that reads the requests stored in the Azure Queue. Each SMP reads one message at a time, polls the social media feed, performs analytics, and then stores all data in an Azure Table. Once an SMP retrieves a message, the Azure Queue hides that message from the other SMPs for the duration of a visibility timeout, so two SMPs do not work on the same request at the same time and we avoid race conditions.
Finally, the Azure VM (website) consumes the data stored in Azure Tables and creates the HTML5 animated social feed for the users.
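The way a queue keeps two SMPs off the same message can be sketched with a small in-memory model. This is an illustration only, not the Azure SDK: the `InMemoryQueue` class, the `run_smp` function, and the 30-second timeout are hypothetical. The model mimics the visibility timeout that Azure Queue storage applies, where a retrieved message is hidden from other readers for a set period and is removed only when the worker deletes it.

```python
import uuid


class InMemoryQueue:
    """Toy queue that mimics a visibility timeout (not the Azure SDK)."""

    def __init__(self):
        self._messages = []  # each entry: {"id", "body", "visible_at"}

    def put(self, body):
        self._messages.append(
            {"id": str(uuid.uuid4()), "body": body, "visible_at": 0.0}
        )

    def get(self, visibility_timeout, now):
        """Return (id, body) of the first visible message and hide it."""
        for message in self._messages:
            if message["visible_at"] <= now:
                message["visible_at"] = now + visibility_timeout
                return message["id"], message["body"]
        return None  # nothing visible right now

    def delete(self, message_id):
        self._messages = [m for m in self._messages if m["id"] != message_id]

    def __len__(self):
        return len(self._messages)


def run_smp(queue, table, now):
    """One SMP iteration: take a message, process it, store it, delete it."""
    received = queue.get(visibility_timeout=30, now=now)
    if received is None:
        return False
    message_id, body = received
    table[message_id] = f"processed:{body}"  # stand-in for polling + analytics
    queue.delete(message_id)  # delete only after the work succeeded
    return True


queue, table = InMemoryQueue(), {}
queue.put("twitter:#techconf")
queue.put("instagram:#techconf")
while run_smp(queue, table, now=0.0):
    pass
print(len(table), "results stored,", len(queue), "messages left")
```

One consequence of this design: if a worker crashes before calling `delete`, the message becomes visible again after the timeout and another worker picks it up. Processing should therefore be safe to repeat.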
As a result of our architecture design, both the front end (Azure VMs) and the back-end SMPs (Worker Roles) can be scaled independently. Using Azure’s auto-scaling feature, the solution automatically turns VMs on and off in response to front-end traffic, and independently deploys more SMPs when the number of queued messages increases. This is very powerful and removes the need for an operator to control the scaling.
In the event of a very sudden spike in traffic, auto-scaling will not react quickly enough. This does not cause many problems for our system, because we have decoupled the front end and the SMPs with the Azure Queue. While the front end is busy adding more and more requests to the Azure Queue, the depth of the queue grows. Eventually, however, the SMPs are scaled out (more instances), messages are read faster, and the queue shrinks again.
Another benefit of this architecture is high availability. Because the VMs and SMPs work independently and communicate only via the Azure Queue, we can easily achieve redundancy by deploying multiple VMs and SMPs.
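The spike behaviour described above can be illustrated with a toy simulation. Everything in it is an assumption chosen for the example rather than an IdeaCloud or Azure setting: the scaling rule (`desired_smp_count`), the per-SMP processing rate, the instance limits, and the two-tick delay before a scaling decision takes effect.

```python
import math


def desired_smp_count(queue_depth, messages_per_smp=10, min_smps=1, max_smps=8):
    """Hypothetical scaling rule: one SMP per 10 queued messages, within limits."""
    needed = math.ceil(queue_depth / messages_per_smp)
    return max(min_smps, min(max_smps, needed))


def simulate(arrivals, rate_per_smp=5, scale_delay=2):
    """Track queue depth and SMP count when scaling lags behind demand."""
    depth, smps = 0, 1
    pending = []  # scaling decisions that have not taken effect yet
    history = []
    for tick, arriving in enumerate(arrivals):
        depth += arriving                         # front end enqueues requests
        depth -= min(depth, smps * rate_per_smp)  # SMPs drain what they can
        pending.append(desired_smp_count(depth))
        if len(pending) > scale_delay:
            smps = pending.pop(0)                 # an older decision lands now
        history.append((tick, depth, smps))
    return history


# Quiet traffic, a three-tick spike, then quiet again.
for tick, depth, smps in simulate([5, 5, 50, 50, 50, 5, 5, 5, 5, 5]):
    print(f"tick={tick} queue_depth={depth} smps={smps}")
```

In this model the queue absorbs the spike while the SMP count catches up, and the backlog drains once the extra instances are running. The front end never has to wait on the back end.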
The Microsoft Azure platform provides the tools and services that make it easy for web applications to achieve their scalability targets. As discussed in this post, IdeaCloud is a good example of how to achieve that. In general, a distributed workload solution should have these important characteristics:
Be sure to check out our recent features on IdeaCloud. So far we’ve provided a brief how-to tutorial to set up your IdeaCloud, explored both our Twitter Walls & Picture Wall, and published a demo about SignalR. If you want more information on IdeaCloud or have any questions, please visit our website or contact me at Raymond.firstname.lastname@example.org or http://ca.linkedin.com/in/tsang or email@example.com or ca.linkedin.com/in/siumichael/