By Andy Cross

Diagnostics are the entry point for investigating the inner workings of a complex system: they are the data and metadata that describe its current state. In a computer system this may be the state of nodes, the throughput of data, the capacity of limited resources or specific data regarding the actions undertaken by custom software.

Anyone who has written or architected software solutions knows that nothing runs without fault forever. The smoothest application deployment can have its performance degraded by the build-up of data in a database, its configuration altered by a misguided engineer or its operations made obsolete by the passage of time. The role of diagnostics is to provide the low-level, fine-grained data about the operation of a system that enables the inspector to determine the cause of performance, functional or behavioural problems within it. Without diagnostics, the inspector is left to infer how the system operates from its output alone, diagnosing a problem solely by its symptoms as exposed by a user interface.

In remote systems, such as those that run within a Windows Azure datacentre, these concerns are even more apparent given the closed nature of their firewalls and their security-oriented design.

We may therefore consider that the role of diagnostics is to build an interface into the inner workings of a system:

  • for the use of a domain expert
  • for the purpose of exposing operational data
  • in a known format
  • in a timely manner

In Windows Azure the toolset for diagnostics is aptly called Windows Azure Diagnostics. It has a few moving parts including a Diagnostics Store that acts as a centralised location for all the gathered diagnostics, a Diagnostics Configuration that controls whether and which diagnostics types are recorded and a Diagnostics Agent which runs on every Windows Azure instance.

The Diagnostics Store is a Windows Azure Storage Account that holds the Blobs and Tables containing your collected diagnostics information, together with the configuration information that specifies which diagnostics should be collected.

The Diagnostics Configuration is the way that the programmer specifies which types of diagnostics information they wish to capture from the running Role Instance. These instructions are typically provided during Role Instance start-up through a static configuration file in a known location (such as diagnostics.wadcfg in the root of a WebRole), through programmatic manipulation or through a PowerShell script.

It is interesting to note that these three techniques are ultimately equivalent. The configuration of the Diagnostics Agent is achieved by building the Diagnostics Configuration and storing it in the Diagnostics Store. The Diagnostics Agent is designed to load a configuration file from the Diagnostics Store and enact the behaviours configured within it. Fundamentally this configuration file is the output of each of the three configuration techniques above, which is why they are equivalent; the choice of technique is largely down to developer preference.

Whilst the static file and PowerShell approaches have merit in their own right, only programmatic manipulation provides a suitable mechanism for reactive logic, and so it will be the focus for the remainder of this article.

The Diagnostics Agent is a process provided by Microsoft that runs on every Windows Azure instance. It is possible to leave this Agent idle by not providing any instructions to it. The Agent can be seen running on any Windows Azure role by accessing the instance with remote desktop and viewing the running processes.


As previously noted, this Diagnostics Agent is configured by the developer to access a given Windows Azure Storage account. This Storage account contains a configuration file (diagnostics.wadcfg) which contains the instructions on how the agent should behave on the Role Instance, for example which Performance Counters to collect and how regularly.
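As a rough illustration of what such instructions look like, a minimal diagnostics.wadcfg might resemble the sketch below. The element and attribute names follow the DiagnosticMonitorConfiguration schema as I understand it from the Windows Azure SDK documentation; the intervals, quota and choice of counter are illustrative assumptions rather than recommendations, and should be checked against the SDK version in use.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative values only. Durations use the xs:duration format,
     e.g. PT1M = 1 minute, PT1H = 1 hour. -->
<DiagnosticMonitorConfiguration
    xmlns="http://schemas.microsoft.com/ServiceHosting/2010/10/DiagnosticsConfiguration"
    configurationChangePollInterval="PT1M"
    overallQuotaInMB="4096">
  <!-- Trace logs at Information level and above, transferred every 30 minutes -->
  <Logs scheduledTransferLogLevelFilter="Information"
        scheduledTransferPeriod="PT30M" />
  <!-- Sample the processor counter every minute, transfer the samples hourly -->
  <PerformanceCounters scheduledTransferPeriod="PT1H">
    <PerformanceCounterConfiguration
        counterSpecifier="\Processor(_Total)\% Processor Time"
        sampleRate="PT1M" />
  </PerformanceCounters>
  <!-- Application and System event logs, transferred every 30 minutes -->
  <WindowsEventLog scheduledTransferLogLevelFilter="Verbose"
                   scheduledTransferPeriod="PT30M">
    <DataSource name="Application!*" />
    <DataSource name="System!*" />
  </WindowsEventLog>
</DiagnosticMonitorConfiguration>
```

Each child element corresponds to one diagnostics type, and the transfer period on each controls how often the locally buffered data is copied to the Diagnostics Store.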

The types of information that can be collected by the Diagnostics Agent and stored in the Diagnostics Store are listed in this table:

Diagnostics Type                 Diagnostic Store location
Trace                            Table (WADLogsTable)
IIS Logs                         Blob
Windows Event Logs               Table (WADWindowsEventLogsTable)
Windows Performance Counters     Table (WADPerformanceCountersTable)
Custom Logs                      Blob
Crash Dump                       Blob
Diagnostic Infrastructure Log    Table (WADDiagnosticInfrastructureLogsTable)

When the Diagnostics Configuration is created programmatically, the vanilla Windows Azure approach involves writing code that closely follows the structure of the diagnostics.wadcfg file, which can be a little counterintuitive. Instead, I prefer to use a library written in the fluent programming style called Fluent Azure Diagnostics.
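For comparison, the vanilla approach might look something like the sketch below. It uses the DiagnosticMonitor API from the Microsoft.WindowsAzure.Diagnostics assembly as shipped with the 1.x SDKs. The role class name and the "DiagnosticsConnectionString" setting name are assumptions for illustration (the latter must exist in the service configuration), and the intervals are arbitrary.

```csharp
using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical role entry point; only OnStart is relevant here.
public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Start from the SDK defaults and adjust one buffer at a time,
        // mirroring the element structure of diagnostics.wadcfg.
        DiagnosticMonitorConfiguration config =
            DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Trace logs: keep Information and above, transfer every two hours.
        config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Information;
        config.Logs.ScheduledTransferPeriod = TimeSpan.FromHours(2);

        // Performance counters: sample every minute, transfer hourly.
        config.PerformanceCounters.DataSources.Add(
            new PerformanceCounterConfiguration
            {
                CounterSpecifier = @"\Processor(*)\% Processor Time",
                SampleRate = TimeSpan.FromMinutes(1)
            });
        config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromHours(1);

        // Windows Event Logs: Application and System channels.
        config.WindowsEventLog.DataSources.Add("Application!*");
        config.WindowsEventLog.DataSources.Add("System!*");
        config.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromHours(2);

        // How often the agent re-reads its configuration from the store.
        config.ConfigurationChangePollInterval = TimeSpan.FromDays(1);

        // Assumed setting name; it must hold a storage connection string.
        DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
        return base.OnStart();
    }
}
```

Each diagnostics type is reached through its own property and configured separately, so the intent of the whole configuration is spread across many assignments rather than read as one statement.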

As with any fluent library, it is written in such a way that its method signatures and constructs can be read aloud in a clear and understandable way. With this approach it is possible to quickly share details of how a system is configured or commanded, so it is an excellent fit for a system that you often want to examine at a glance to see how it is set up.

As an example of the fluent library, the following code configures a system to persist:

  • a custom log directory
  • the % Processor Time performance counter
  • System.Diagnostics Trace logs
  • the Windows Event Log Application and System logs

 

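    // Requires the Microsoft.WindowsAzure, Microsoft.WindowsAzure.Diagnostics and
    // Microsoft.WindowsAzure.ServiceRuntime namespaces, plus the Fluent Azure Diagnostics library.
    var diagnostics =
        new DiagnosticsCapture(CloudStorageAccount.FromConfigurationSetting("DiagnosticsConnectionString"));

    // Use the role's local resource when running in Windows Azure, otherwise a local folder.
    string logPath = RoleEnvironment.IsAvailable
                         ? RoleEnvironment.GetLocalResource("mylocallogs").RootPath
                         : @"C:\mylogs";

    diagnostics.Current()
        // Custom log directory, transferred every two hours
        .WithDirectory(logPath, "mylogs", 1024)
        .TransferDirectoriesEvery(TimeSpan.FromHours(2D))
        // System.Diagnostics Trace logs at Information level and above
        .WithLogs()
        .WhereLogsAreAbove(LogLevel.Information)
        // % Processor Time sampled every minute, transferred hourly
        .WithPerformanceCounter(@"\Processor(*)\% Processor Time", TimeSpan.FromMinutes(1D))
        .TransferPerformanceCountersEvery(TimeSpan.FromHours(1D))
        // Windows Event Log: Application and System logs
        .WithWindowsEventLog("Application!*")
        .WithWindowsEventLog("System!*")
        .TransferWindowsEventLogEvery(TimeSpan.FromHours(2D))
        .CheckForConfigurationChangesEvery(TimeSpan.FromDays(1D))
        .Commit();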

 

Summary

The Windows Azure Diagnostics library combined with the Fluent Azure Diagnostics library provides broad diagnostic coverage with a simple coding implementation. We have covered the moving parts from a high level, discussing the Diagnostics Agent, Configuration and Store.

For further diagnostics tooling, including visualisation and PowerShell cmdlets for configuration, we recommend the Cerebrata toolset, available at http://cerebrata.com/

You can find the Windows Azure Diagnostics library in the Windows Azure SDK and the Fluent Diagnostics library on our blog at http://blog.elastacloud.com

 

Andy Cross
Elastacloud Limited

www.elastacloud.com

Andy is a Director of Elastacloud Limited, co-founder of the UK Windows Azure User Group and author of Fluent Diagnostics, which is available for download via NuGet. Andy has 12 years of commercial experience across a variety of business areas, mainly in ecommerce. Andy is a published expert and advocate of the .NET Micro Framework and Fiddler. He has embraced Windows Azure since its inception in 2008 and has continued to be wholly single-minded about its introduction into all projects. Andy is available via @andybareweb or andy@elastacloud.com.