Windows Azure SQL Database Marketplace
In Telemetry Basics and Troubleshooting we introduced the basic principles of monitoring and application health by looking at the fundamental metrics, information sources, tools, and scripts that the Windows Azure platform provides. We showed how you can use these to troubleshoot a simple solution deployed on Windows Azure (a few compute node instances and a single Windows Azure SQL Database instance). In this post we expand on that entry and cover the application instrumentation aspects of the telemetry system that was implemented in the Cloud Service Fundamentals in Windows Azure (CSF) code project. In the detailed wiki entry that accompanies this blog post, we show how you can use the CSF instrumentation framework, which integrates with Windows Azure Diagnostics (WAD), to provide a consistent instrumentation experience for your application. The techniques we have implemented in the CSF application have been proven on large-scale Azure deployments.
The best source of information about your applications is the applications themselves. However, while good tools and a robust telemetry system make acquiring information easier, if you don’t instrument your application in the first place you cannot get at that information at all. In addition, if you don’t instrument consistently across all your application components, you are unlikely to achieve operational efficiency when you begin scaling in production. (Troubleshooting problems becomes far more complex than individuals, or even teams, can tackle in real time.) Consistent, application-wide instrumentation, together with a telemetry system that can consume it, is the only way to extract the information you need to keep your application running well, at scale, with relative efficiency and ease.
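To make "consistent instrumentation" concrete, here is a minimal sketch in Python. It is not the CSF framework or the WAD API; the names `instrumented`, `sink`, and the record fields are hypothetical and chosen only for illustration. The idea is that every component wraps its operations in the same helper, so every record carries the same fields (component, operation, success, error, duration) and a telemetry system can consume them uniformly.

```python
import json
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; a real system would route this to its telemetry pipeline.
logger = logging.getLogger("telemetry")


@contextmanager
def instrumented(component, operation, sink=None):
    """Time one operation and emit a single record with a fixed set of fields.

    `sink` is any callable that accepts the record dict. If omitted, the
    record is written as JSON to the standard logging module.
    """
    emit = sink if sink is not None else (lambda rec: logger.info(json.dumps(rec)))
    record = {
        "component": component,
        "operation": operation,
        "success": True,
        "error": None,
    }
    start = time.perf_counter()
    try:
        # The caller may attach extra context to the record inside the block.
        yield record
    except Exception as exc:
        # Record the failure, then let the exception propagate unchanged.
        record["success"] = False
        record["error"] = type(exc).__name__
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000.0, 3)
        emit(record)


# Usage: collect records in a list instead of logging them.
collected = []
with instrumented("web", "get_profile", sink=collected.append) as rec:
    rec["user_id"] = 7  # illustrative context field
```

Because failures are recorded in the same shape as successes, the same queries and alerts work across every component that uses the helper.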
CSF provides a number of components that you can use to quickly instrument your application and build an effective telemetry system:
By adopting these practices and using the components and configuration we have provided, you can help your system scale and gain the insight to target your development effort more precisely and improve your operational efficiency, which ultimately makes your customers happier while consuming fewer resources. This allows you to provide a high-quality user experience and to identify upcoming problems before your users do. A corresponding wiki article goes deeper into “Telemetry: Application Instrumentation”.
It’s very easy to read this and still feel too busy growing your user base and deploying new code features to act on it.
Distrust this feeling. Many, many companies have had a hot product or service that at some point couldn’t scale and experienced one or more extended outages. Users often have little loyalty to a system that is unreliable; they may simply move elsewhere, perhaps to the upstart that is hard on your heels and ready to capture your market.
Of course, some of you may already have built your own application instrumentation framework and implemented many of the best practices. For that reason we have provided the entire CSF application, including all the telemetry components, as source code on the MSDN Code Gallery. Some of the key things to remember as you implement instrumentation in your application:
Thank you for taking the time to read this blog post. To learn more about how to implement the CSF instrumentation components in your application, see the corresponding wiki article, which goes deeper into “Telemetry: Application Instrumentation”. In the next article in this series we will explore the data pipeline that we have implemented to provide a comprehensive view of the overall CSF application and its performance characteristics, including how we capture this information in a relational operational store and provide you with an overall view across the Azure platform.