TFS2010: Test Attachment Cleaner and why you should be using it

Grant Holliday’s blog

Senior Service Engineer, Microsoft Visual Studio Team Foundation Service


[Update: 1 Nov 2011] There is an update available for TFS 2010 SP1 that prevents many of the large test attachments from being stored in the database in the first place. You can download it from KB2608743 or read more about it in Anutthara's blog post: Reduce the size of test data in VS 2010 by avoiding publishing deployment binaries into TFS.

The execution of a Test Run in Team Foundation Server 2010 (whether automated or manual) generates a bunch of diagnostic data: for example, IntelliTrace logs (.iTrace), video log files, test result (.trx) and code coverage (.cov) files. This diagnostic data is critical for eliminating "no repro" bug scenarios between testers and developers.

However, the downside of these rich diagnostic captures is that the volume of data can grow rapidly over time. The Team Foundation Server administrator has little or no control over what data gets attached as part of Test Runs. There are no policy settings to limit the size of the data capture, and there is no retention policy to determine how long to hold this data before initiating a cleanup.

A few months ago, the Test Attachment Cleaner for Visual Studio Ultimate 2010 & Test Professional 2010 power tool was released. It is a command-line tool that addresses these shortcomings.

Over the last 18 months, the TFS 2010 Dogfood server that we call 'Pioneer' has been slowly accumulating test attachments. Recently, the size of the attachments (680GB) eclipsed the size of the version control content (563GB), and test attachments older than 6 months represented over 400GB of that storage.

After downloading and installing the tcmpt.exe tool, there are five sample settings files in C:\Program Files (x86)\Microsoft\Test Attachment Cleaner\SampleSettings (one of them is sketched after this list):

  1. Scenario#1: Identify list of attachments taking up database space of more than 1 GB per attachment
  2. Scenario#2: View/Delete IntelliTrace log files over a size of 500 MB
  3. Scenario#3: View/Delete all Video log files with Test run creation date older than specific date with no active bugs
  4. Scenario#4: View/Delete all TRX & COV log files for test runs that happened between 30 and 90 days in age & do not perform the Linked Bugs lookup query
  5. Scenario#5: View/Delete all custom/user attachments over size of 5 MB with no active or resolved bugs on test runs between 2 dates
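The settings files are plain XML. From memory, the scenario #2 file (IntelliTrace logs over 500 MB) looks roughly like the sketch below; treat the element names (SizeInMB, Extensions/Include) as approximations and verify them against the shipped samples in the SampleSettings folder:

<!-- Sketch from memory of Scenario #2: delete IntelliTrace logs larger than 500 MB -->
<DeletionCriteria>
  <TestRun />
  <Attachment>
    <SizeInMB GreaterThan="500" />
    <Extensions>
      <Include value="iTrace" />
    </Extensions>
  </Attachment>
</DeletionCriteria>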

After checking with the test managers and users of the system, we got approval to remove the old content and we ended up creating our own settings file:

<!-- View/Delete all attachments on test runs older than 6 months, that are not linked to active bugs -->
<DeletionCriteria>
  <TestRun>
    <AgeInDays OlderThan="180" />
  </TestRun>
  <Attachment />
  <LinkedBugs>
    <Exclude state="Active" />
  </LinkedBugs>
</DeletionCriteria>

The empty <Attachment /> element applies no attachment-level filter, so attachments of any type and size match. Then we ran the command-line tool with the following parameters:

TCMPT.exe attachmentCleanup /collection:http://localhost:8080/tfs/CollectionName /teamproject:TeamProjectName /settingsFile:OlderThan6Months.xml /mode:delete
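Since delete mode permanently removes attachments, it's worth doing a dry run first. If I recall the readme correctly, the /mode switch also accepts preview, which reports what would be deleted without removing anything:

TCMPT.exe attachmentCleanup /collection:http://localhost:8080/tfs/CollectionName /teamproject:TeamProjectName /settingsFile:OlderThan6Months.xml /mode:preview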

The following charts show the result after the tool ran for over 36 hours:

[two charts showing the result of the cleanup]

Warning: Running queries against the operational stores is not recommended; run them on a pre-production or backup server if possible. In this case, the query below is read-only and returns quickly, so the impact on the overall system is low. Also, since the schema is intentionally undocumented and unsupported, there are no promises that it won't change in a future release (in fact, this table has already changed as part of making TFS work on Azure).

To see the growth of your Test Case attachments over time, you can use the following SQL query on each of your collection databases:

-- Test attachment size per month
-- (the DATEADD/DATEDIFF expression truncates creationdate to the first day of its month)
SELECT
  DATEADD(month, DATEDIFF(month, 0, creationdate), 0) as [Month],
  SUM(CompressedLength) / 1024 / 1024 as AttachmentSizeMB
FROM tbl_Attachment WITH (NOLOCK)
GROUP BY DATEADD(month, DATEDIFF(month, 0, creationdate), 0)
ORDER BY DATEADD(month, DATEDIFF(month, 0, creationdate), 0)

The result will be something like this:

Month        Size (MB)
2/1/2010         1,790
3/1/2010         3,663
4/1/2010         5,193
5/1/2010         5,503
6/1/2010         4,701
7/1/2010         8,594
8/1/2010        11,313
9/1/2010        22,333
10/1/2010       18,597
11/1/2010       16,409
12/1/2010       20,720

To see the size of all the tables in your collection databases, you can use the following SQL query:

-- Table rows and data sizes
CREATE TABLE #t (
    [name] NVARCHAR(128),
    [rows] CHAR(11),
    reserved VARCHAR(18),
    data VARCHAR(18),
    index_size VARCHAR(18),
    unused VARCHAR(18)
)
GO

-- Capture sp_spaceused output for every table in the database
INSERT #t
EXEC [sys].[sp_MSforeachtable] 'EXEC sp_spaceused ''?'''
GO

SELECT
    name as TableName,
    Rows,
    ROUND(CAST(REPLACE(reserved, ' KB', '') as float) / 1024,2) as ReservedMB,
    ROUND(CAST(REPLACE(data, ' KB', '') as float) / 1024,2) as DataMB,
    ROUND(CAST(REPLACE(index_size, ' KB', '') as float) / 1024,2) as IndexMB,
    ROUND(CAST(REPLACE(unused, ' KB', '') as float) / 1024,2) as UnusedMB
FROM #t
ORDER BY CAST(REPLACE(reserved, ' KB', '') as float) DESC
GO

DROP TABLE #t
GO

The result will be something like the following table. The names can be a little confusing (which is unlikely to change, since the schema isn’t documented/supported). I’ve added a column that gives a brief description of what the tables are used for.

TableName                  Area                     Rows         DataMB   IndexMB
tbl_Content                Version Control            6,954,871  573,861       12
tbl_AttachmentContent      Test Attachments           1,827,941  275,244       47
Attachments                Work Item Attachments        218,738   50,303       33
tbl_LocalVersion           Version Control          381,589,040   25,006   23,354
WorkItemLongTexts          Work Item Tracking        10,732,588   10,733    3,638
WorkItemsWere              Work Item Tracking         5,701,208    9,693    4,534
tbl_Version                Version Control           54,614,825    3,184    6,923
tbl_PropertyValue          Properties (Code Churn)  152,825,451    4,056    3,676
tbl_BuildInformationField  Team Build                41,565,170    5,930      186

  • What kind of privileges do we need to run this tool? In our test environment, our TFS is maintained by another team, so we want to know what rights we need to run it.

  • @Vincent - to run the tool, you require normal contributor access to Test Runs. Backend DB or administrator access is not required.

  • Why is it that sometimes when I run the tool, it times out?  I haven't lost network connectivity, and there are no firewall issues or anything else.  It just hangs after running successfully for a long time.  If I try to run the same command and settings again, it times out again.  I have to break my scans down into different date ranges just to get around the timeout issues.  I am also only scanning 500GB of data in the attachment content table, so I don't think the size should be an issue. Thanks for any help you can provide.

  • @Gabe - can you contact me via blogs.msdn.com/.../contact.aspx and we'll collect some more information from you. We've had a couple of people hit a similar issue recently.

    Grant

  • Hi,

    We ran into this issue a few weeks ago and are in the process of using the tool to purge over 100G of old test run attachments.

    Going forward, is there a way to configure TFS to save only specific attachments? At this juncture, the main ones we want to keep are the videos; we have no interest in the DLLs, etc.

  • @Glenn - This is one approach to keep the test result attachments small: www.tfsblog.de/.../prevent-test-results-blowing-up-your-tfs-databases

    The other possibility is to delete the results when a build is deleted, by changing the Build Retention policy to include "delete of Test Results".  The downside is that this deletes both the test results (the pass/fail of tests) and the test attachments; there is no separate knob for just the test attachments.

  • I've run the Test Attachment Cleaner and the logs tell me it has deleted lots of stuff; however, when I look at the database, I see no change in its size. From SQL Server Management Studio I ran shrink jobs, but it reports that only 5GB can be freed, while the tool deleted ~35GB.

    Am I missing something here? Does the database now need to be re-indexed or something similar to free up the space? Any tips would be much appreciated.

  • I've tried running the tool with the sample settings files but all I ever get is "Unknown parameter settings".  Any suggestions on what I might be doing wrong?

  • @Grant and @Gabe, we are also having the same problem with tcmpt.exe.  Our TFS database has grown to over 250GB.  When I run a query, the utility will hang for about 10 minutes and then throw a timeout error.

  • @Keith & Others - Make sure to use the /settingsFile: and /teamProject: command-line options, as the readme document is inaccurate.  The document states /settings: and /project:.

  • We're running this right now, and I bet there is a lot of data that can be removed.

    And I guess it will have a positive impact on upgrade time when we upgrade to 2012.

    Will there be any changes to this in TFS 2012 or do we still need to run this tool?

  • @Kim - The only change so far in TFS 2012 is that most of the binaries aren't attached as part of the build, which eliminates a major source of the bloat.

    You should still run the Test Attachment Cleaner tool that comes with the TFS 2012 Power Tools.

  • This tool does not do anything; it just fills up the logs.
