Windows Azure, being a Platform as a Service (PaaS), abstracts away the OS, storage, and networking, and shares massive pools of physical resources across the virtual instances of applications running on the Azure infrastructure. The Windows Azure platform defines and enforces resource policies so that applications running on the virtualized infrastructure play nicely with each other. Awareness of these policies is important both for assessing the capacity needed for successful operations and for predicting operational expenses for planning purposes.


Bandwidth is one of the important resources governed tightly so that each service instance will get its fair share of network bandwidth. Each Azure role type gets its share of network bandwidth per the following table:

Azure Role    Network Bandwidth
XSmall        5 Mbps
Small         100 Mbps
Medium        200 Mbps
Large         400 Mbps
XLarge        800 Mbps

Table 1

Awareness of the above numbers is important for capacity assessment if your application is bandwidth bound. For example, if you need a throughput of 10K requests/sec, with each request carrying an ingress of 5 KBytes and an egress of 10 KBytes, the bandwidth required is (10000 * (5 + 10) * 1024 * 8) / 1000000 = 1228.8 Mbps (megabits per second). Rounding up to whole instances, this can be serviced by 13 Small, 7 Medium, 4 Large or 2 XLarge instances, based on arithmetic alone.
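The bandwidth arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than a sizing tool: the function names are made up for this example, the per-role figures are copied from Table 1, and instance counts are rounded up so that the provisioned bandwidth never falls below the requirement.

```python
import math

# Per-instance bandwidth from Table 1, in Mbps.
ROLE_BANDWIDTH_MBPS = {
    "XSmall": 5,
    "Small": 100,
    "Medium": 200,
    "Large": 400,
    "XLarge": 800,
}


def required_bandwidth_mbps(requests_per_sec, ingress_kbytes, egress_kbytes):
    """Return the total bandwidth in Mbps, treating 1 KByte as 1024 bytes."""
    bits_per_request = (ingress_kbytes + egress_kbytes) * 1024 * 8
    return requests_per_sec * bits_per_request / 1_000_000


def instances_needed(total_mbps):
    """Return, per role type, the instance count rounded up to a whole instance."""
    return {
        role: math.ceil(total_mbps / mbps)
        for role, mbps in ROLE_BANDWIDTH_MBPS.items()
    }


# The example from the text: 10K requests/sec, 5 KBytes in, 10 KBytes out.
total = required_bandwidth_mbps(10_000, 5, 10)
print(f"Required bandwidth: {total:.1f} Mbps")
for role, count in instances_needed(total).items():
    print(f"{role}: {count}")
```

The calculation considers bandwidth only; CPU, memory and disk needs may push the choice toward a different role type.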

If the request is network I/O bound, with less emphasis on CPU cycles per request, distributing the workload across multiple Small instances gives the benefit of isolation: if one of the role instances gets recycled, it takes down only the few requests that are in flight on that instance.


The CPU resource policies are implemented implicitly through the Azure Role types; each role comes with a specific number of CPU cores as shown in the table below:

Azure Role    Guaranteed CPU
XSmall        Shared Core
Small         1 Core
Medium        2 Cores
Large         4 Cores
XLarge        8 Cores

Table 2

Each core is equivalent to a 64-bit 1.6 GHz processor with a single core on the chip. If you have an existing application that maxes out a 2-processor server (2 cores each), you probably need to look at 4 Small instances, or use other role types based on simple arithmetic.

CPU-intensive workloads like fast Fourier transform (FFT), finite element analysis (FEA), and numerous other algorithms that aid simulations may benefit from a large number of cores. For a typical data-intensive application, one could start with a Small role and progressively change the role type through testing to arrive at the optimal one.


Each Azure role instance is provisioned with a preconfigured amount of memory, as shown in Table 3. Role instances get their memory allocations, based on the role type, from the memory remaining on the physical server after the Root OS takes its share. If your application is memory bound because of the way it is architected (e.g. extensive use of in-memory cache, or huge object graphs due to the nature of the application object model), either rearchitect the application to leverage Azure capabilities like AppFabric Cache, or select a role type that fits the application's memory requirements.

Azure Role    Guaranteed Memory
XSmall        0.768 GB
Small         1.750 GB
Medium        3.50 GB
Large         7.00 GB
XLarge        14.0 GB

Table 3


Table 4 shows the volatile disk storage allocated to each Azure role type. A typical stateless web application may not pay much attention to the local disk, but certain stateful applications, like full-text search engines, may store indexes on the local disk for performance reasons. For these indexes to survive role restarts (VM reboots), set cleanOnRoleRecycle="false" in the service definition; this preserves the contents between reboots. If the VM running the role is relocated to a different physical server due to runtime conditions like hardware failure, one has to plan for reconstructing the disk contents from durable storage. Select the appropriate role based on your local disk storage needs.

Azure Role    Disk Storage
XSmall        20 GB
Small         220 GB
Medium        490 GB
Large         1000 GB
XLarge        2040 GB

Table 4
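Taken together, Tables 2, 3 and 4 suggest a simple first-pass selection: pick the smallest role type whose guaranteed cores, memory and local disk all cover the needs of a single instance. The sketch below is one hypothetical way to express that. The function name is invented for this example, and modelling the XSmall shared core as zero dedicated cores is an assumption made here, not something the tables state.

```python
# Guaranteed resources per role type, copied from Tables 2, 3 and 4.
# ASSUMPTION: XSmall shares a core, so it is modelled as 0 dedicated cores.
ROLE_TYPES = [
    # (name, dedicated cores, memory in GB, local disk in GB), smallest first
    ("XSmall", 0, 0.768, 20),
    ("Small", 1, 1.75, 220),
    ("Medium", 2, 3.5, 490),
    ("Large", 4, 7.0, 1000),
    ("XLarge", 8, 14.0, 2040),
]


def smallest_fitting_role(cores=0, memory_gb=0.0, disk_gb=0.0):
    """Return the smallest role type meeting every per-instance need, or None."""
    for name, role_cores, role_memory_gb, role_disk_gb in ROLE_TYPES:
        if (role_cores >= cores
                and role_memory_gb >= memory_gb
                and role_disk_gb >= disk_gb):
            return name
    return None  # no single role type is large enough


# A search node needing 1 core, 3 GB of memory and a 300 GB local index.
print(smallest_fitting_role(cores=1, memory_gb=3.0, disk_gb=300))
```

A result of None means that no single instance can hold the workload, which is a hint that the application needs to be partitioned or rearchitected. The result is only a starting point for the testing described in this article.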

Concurrency and Capacity Assessment

Statelessness on the compute tier, and minimizing the surface area of the resources shared between requests (e.g. Azure Storage and/or SQL Azure), are the keys to building applications that have near-linear scalability on the compute tier. The Windows Azure Storage architecture already accommodates such near-linear scalability if the application is architected to leverage this durable storage appropriately. See the article How to get most out of Windows Azure Tables for best practices on scalable usage of Azure Tables.

If the application leverages SQL Azure, is multi-tenant, and the tenants are a few enterprise customers, then shared databases for reference data plus a database instance per tenant may not be a bad idea from the perspective of minimizing the surface area between tenants. This architecture helps with both isolation and scalability. On the other hand, if your solution addresses a large number of tenants, a shared database approach may be needed, which requires careful design of the database. An old article coauthored by one of my colleagues, Gianpaolo Carraro, Multi-Tenant Data Architecture, is still valid in this context. Of course, you need to combine the guidance from this article with the Windows Azure size limitations to arrive at an architecture that supports the multi-tenancy needs.

Once the usage of shared resources is properly architected for high concurrency, Windows Azure capacity assessment becomes a lot easier.

Capacity Assessment

In a traditional setting, where hardware needs to be procured before deployment, one has to assess the capacity and also put together plans for acquiring the resources much earlier in the project lifecycle. Considering the latencies incurred by the typical enterprise procurement process, one has to be extremely diligent in assessing the capacity needs even before the application architecture is completely baked.

This has to be complemented by a plan to acquire hardware and software, which will base its decisions on a less-than-accurate assessment as input. Because of this, the resource requirements are often overestimated to account for possible errors in the assessment process. Temporal unpredictability of the workloads also adds to the burden of the capacity assessment process.

In the case of cloud computing, and specifically Azure, one has to keep an eye on the architectural implications of the consumed Azure capacity and the ensuing cost of operations, but doesn't need as accurate a picture as in traditional deployments.

Once the system is architected for a scale-out model, capacity assessment becomes merely an exercise of doing a baseline analysis of the throughput per role instance and extrapolating the infrastructure for the target peak throughput. Application throughput needs, expressed in terms of bandwidth, CPU, memory and, to a lesser extent, local storage, will play a big role in the selection of the Azure role type. Even though one could architect for near-linear scalability, the implementation will often result in less-than-perfect solutions. So, baseline the throughput (either requests/sec or concurrent users) across various role types and pick the one that is optimal for the application. Also, load test with more than one role instance to make sure that near-linear scalability can be attained by adding more role instances.
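The baseline-and-extrapolate exercise can be sketched as follows. The throughput figures are hypothetical placeholders for your own load test results, the function names are invented for this example, and the extrapolation assumes that the scaling efficiency measured at a small instance count continues to hold at the target scale, which only further load testing can confirm.

```python
import math


def scaling_efficiency(single_instance_rps, multi_instance_rps, instance_count):
    """Ratio of measured to ideal linear throughput (1.0 means perfectly linear)."""
    return multi_instance_rps / (single_instance_rps * instance_count)


def instances_for_peak(target_peak_rps, single_instance_rps, efficiency=1.0):
    """Extrapolate the instance count for a target peak, rounded up."""
    effective_rps = single_instance_rps * efficiency
    return math.ceil(target_peak_rps / effective_rps)


# Hypothetical load test results for one role type:
# 1 instance sustains 800 requests/sec, 4 instances sustain 2880 requests/sec.
efficiency = scaling_efficiency(800, 2880, 4)
print(f"Scaling efficiency: {efficiency:.2f}")
print("Instances for 10K requests/sec:", instances_for_peak(10_000, 800, efficiency))
```

Repeating the measurement for each candidate role type gives a like-for-like comparison of how many instances each one needs for the same peak, which can then be weighed against the cost of operations.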