My name is Ratna Rekha Koukuntla, and I am a Senior SDET in Microsoft's Sales and Marketing IT (SMIT) organization. I work on a data warehousing application involving multiple ETL jobs and packages. To meet our customers' need for monthly releases, we moved to an Agile software development lifecycle, and since then we have been experimenting with ways to speed up test turnaround, particularly in the area of performance testing.
In this post I describe the approach we have taken to auto-baseline and monitor the performance of our production environment.
Below are some release challenges we were facing with our project.
To solve the challenges mentioned above, we took the approach of auto-baselining the performance of the production environment and monitoring the data over time.
Below is an outline of the approach we took to generate some informational charts related to performance:
To start, we stored information about ETL package and stored procedure execution times in the database. The stored information includes the object IDs, start/end times, volume information, and pass/fail details.
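As a rough illustration, a record like the following could hold that information (a minimal Python sketch; the field names are illustrative assumptions, not our actual schema):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ExecutionRecord:
    """One row of execution history for an ETL package or stored procedure."""
    object_id: int        # ID of the package / stored procedure
    object_name: str
    start_time: datetime
    end_time: datetime
    rows_processed: int   # volume information
    status: str           # "Pass" or "Fail"

    @property
    def duration_minutes(self) -> float:
        """Derived run time, used later for baselining and comparison."""
        return (self.end_time - self.start_time).total_seconds() / 60.0

# Example: a run that started at 2:00 AM and finished at 3:40 AM.
rec = ExecutionRecord(1, "ETL Process 1",
                      datetime(2015, 6, 1, 2, 0), datetime(2015, 6, 1, 3, 40),
                      rows_processed=1_250_000, status="Pass")
print(rec.duration_minutes)  # 100.0
```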
We designed a test engine that reads the above information and, for each object, auto-baselines the historical data (last month's averages, in our case) and compares the current data against that history. As a team, we defined thresholds that are used as an alerting mechanism. The engine also sends an email based on the information collected.
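The comparison at the heart of the engine can be sketched as follows (a minimal Python sketch; the 20% threshold and the helper names are illustrative assumptions, not our actual values):

```python
from statistics import mean

# Team-defined alerting threshold: flag any object whose current run is more
# than 20% slower than its baseline. (20% is an assumed example value.)
THRESHOLD = 0.20

def baseline(history_minutes):
    """Auto-baseline: average of last month's successful run durations."""
    return mean(history_minutes)

def needs_alert(history_minutes, current_minutes, threshold=THRESHOLD):
    """Compare the current run against the baseline.

    Returns (alert, variance_pct): alert is True when the slowdown exceeds
    the threshold; variance_pct is the percentage change from the baseline.
    """
    base = baseline(history_minutes)
    variance = (current_minutes - base) / base
    return variance > threshold, round(variance * 100, 1)

# Last month's runs averaged 100 minutes; the current run took 125 minutes.
alert, pct = needs_alert([95, 100, 105], 125)
print(alert, pct)  # True 25.0
```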
The email is scheduled to be sent weekly and contains the information below. (Note: the graphs in the email are plotted using the Excel API.)
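Setting aside the scheduling and the actual send, the body of such an email could be assembled along these lines (an illustrative Python sketch; the function and column names are assumptions, not our actual implementation):

```python
def build_email_body(results):
    """Render the weekly summary as a simple HTML table.

    `results` is a list of (object_name, baseline_min, current_min,
    variance_pct) tuples; the charts would be attached separately.
    """
    rows = "".join(
        f"<tr><td>{name}</td><td>{base:.0f} min</td>"
        f"<td>{current:.0f} min</td><td>{variance:+.0f}%</td></tr>"
        for name, base, current, variance in results
    )
    return ("<table><tr><th>Object</th><th>Baseline</th>"
            "<th>Current</th><th>Variance</th></tr>" + rows + "</table>")

body = build_email_body([("ETL Process 1", 100, 125, 25)])
print("ETL Process 1" in body and "+25%" in body)  # True
```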
Some benefits we could see with this approach:
When we did this work manually, it took at least 8 hours per release (once monthly). In addition, there was a chance of human error, which sometimes forced us to re-run the tests.
The 40 hours we invested in automating this effort are paying great dividends: we now know which stored procedures require optimization and need to be scoped into the next release. Manual effort is now focused on analyzing the data and logging bugs accordingly.
Additionally, the tool helps identify other issues, such as ETL failures occurring in production.
Sample ETL Analysis Email
Information related to ETL job success rate:

| ETL Job | Success Rate (Current Month) | Failed / Cancelled | Average Time in Previous Quarter, excludes failed runs (Month A – Month B) | Execution Time Variance |
|---|---|---|---|---|
| ETL Process 1 | – | – | – | 100 minutes to 125 minutes |
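The "Execution Time Variance" column is simply the percentage change from the baseline to the current figure; for example, going from 100 minutes to 125 minutes is a 25% slowdown. A minimal sketch of that calculation:

```python
def variance_pct(baseline_minutes, current_minutes):
    """Percentage change from the baseline run time to the current run time."""
    return round((current_minutes - baseline_minutes) / baseline_minutes * 100, 1)

print(variance_pct(100, 125))  # 25.0
```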
ETL job trend using execution timings (the graph compares current data with past months' benchmark data):
Stored procedures with improved performance:
Stored procedures that require attention: