Monitoring the TFS Data Warehouse - FAQ

Grant Holliday’s blog

Senior Service Engineer, Microsoft Visual Studio Team Foundation Service



This blog post describes how to interpret the Data Warehouse and Cube status reports included in the Administrative Report Pack for TFS 2010.

If these tips and the help topic for the error message do not answer your question, see the Microsoft Technical Forums for Visual Studio Team Foundation (http://go.microsoft.com/fwlink/?LinkId=54490). You can search these forums to find information about a variety of troubleshooting topics. In addition, the forums are monitored to provide quick responses to your questions.

Should I expect some processing jobs to fail?

Yes. Some failures are part of typical processing. Three jobs require exclusive access to the same warehouse resources:

  • Optimize Databases (runs at 1:00 AM by default)
  • Full Analysis Data Sync (runs at 2:00 AM by default)
  • Incremental Analysis Data Sync (runs every two hours by default)

None of these jobs can run in parallel with any other job on the list. If one job is already in progress when another job starts, the second job will fail quickly with error TF276000, as shown in the following illustration of the Cube Processing Details view of the Processing Times report:

[Illustration: Cube Processing Details view of the Processing Times report, showing a TF276000 failure]
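The fail-fast behavior can be sketched in a few lines. This is an illustrative model only, not TFS code; the job names and error string are stand-ins. A job that finds the shared warehouse resource already locked fails immediately instead of waiting:

```python
import threading

# Illustrative sketch (not TFS code): the three maintenance jobs contend
# for one exclusive warehouse resource. A job that finds the resource
# already locked fails immediately, which is what TF276000 reports.
warehouse_lock = threading.Lock()

def run_job(name, work):
    # blocking=False models the fail-fast behavior: no waiting, just fail
    if not warehouse_lock.acquire(blocking=False):
        return f"{name}: failed (TF276000: resource already locked)"
    try:
        work()
        return f"{name}: succeeded"
    finally:
        warehouse_lock.release()
```

Because the second job gives up immediately, these failures are cheap and expected; they do not indicate a problem with the warehouse itself.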

The following illustration shows a sample of typical processing failures:

[Illustration: sample of typical processing failures]

The previous illustration shows the results of the following events:

  1. An Incremental job was scheduled to run at 1:00 AM on June 25, 2010, but it failed because the Optimize Databases job had already started.
  2. An Incremental job was scheduled to run at 3:00 AM on the same morning and upgraded itself to Full Analysis Data Sync.
  3. An Incremental job was scheduled to run at 1:00 AM on the next morning, but it failed because the Optimize Databases job had already started.
  4. A Full Analysis Data Sync job started at 2:00 AM on June 26, 2010, and ran for one hour and 48 minutes. That job caused the Incremental job that was scheduled to run at 3:00 AM on the same morning to fail.

Why might most Cube processing jobs fail?

The Cube processing job requires exclusive access to some of the warehouse resources that data synchronization jobs use. The Cube processing job will wait for the release of the resources (normally for an hour) before it gives up. If a data synchronization job does not release the resource in time, the Cube processing job will fail with the following error:

ERROR: TF221033: Job failed to acquire a lock using lock mode Exclusive, resource DataSync: [ServerInstance].[TfsWarehouse] and timeout 3600.

The following illustration shows failures that occur if the Cube processing job cannot access one or more of the warehouse resources that it requires:

[Illustration: Cube processing failures caused by locked warehouse resources]
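A rough model of this wait-then-fail behavior, using an in-process lock as a stand-in for the DataSync warehouse resource (the function name and return value are illustrative, not part of TFS):

```python
import threading

DATASYNC_LOCK_TIMEOUT = 3600  # seconds; the "timeout 3600" in the TF221033 text

datasync_lock = threading.Lock()  # stand-in for the DataSync warehouse resource

def process_cube(timeout=DATASYNC_LOCK_TIMEOUT):
    # The cube job waits up to an hour for the DataSync resource to be
    # released; if a data synchronization job still holds it when the
    # timeout expires, the cube job gives up with a TF221033-style error.
    if not datasync_lock.acquire(timeout=timeout):
        raise TimeoutError("TF221033: failed to acquire Exclusive lock on "
                           "resource DataSync within timeout")
    try:
        return "cube processed"  # placeholder for the real Analysis Services work
    finally:
        datasync_lock.release()
```

The key difference from the TF276000 case above is the timeout: the cube job waits up to an hour before failing, rather than failing immediately.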

To troubleshoot this issue, you must determine which Warehouse Data Sync job is preventing the Cube processing job from accessing the resource or resources that it needs. This report pack does not provide an easy way to determine which Warehouse Data Sync job is causing the problem, but you can determine that information by examining the Warehouse Job Status view. As the following illustration shows, the warehouse data for the problematic job will be much older than the warehouse data for other jobs for the same team project collection:

[Illustration: Warehouse Job Status view, with the problematic job's warehouse data much older than the data for other jobs in the same team project collection]

To troubleshoot issues with individual Warehouse Data Sync jobs, first unblock the overall Warehouse Sync process by disabling the offending job so that the rest of the process can proceed, and then attempt to solve the issue with the individual job.
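As a sketch of the diagnosis step, the following hypothetical helper compares each job's most recent successful sync time, as you would read it from the Warehouse Job Status view, and flags jobs whose data is much older than the freshest job's data. The function name and the 12-hour threshold are arbitrary examples, not TFS defaults:

```python
from datetime import datetime, timedelta

# Hypothetical helper (not a TFS API): given each Warehouse Data Sync job's
# most recent successful sync time for one team project collection, flag
# the jobs whose warehouse data is much older than the freshest job's data.
def find_stale_jobs(last_sync, threshold=timedelta(hours=12)):
    newest = max(last_sync.values())
    return sorted(job for job, t in last_sync.items() if newest - t > threshold)
```

A job flagged this way is the likely candidate to disable while you investigate.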

Why might many Incremental jobs be upgraded to Full jobs?

According to the process for synchronizing the warehouse, the cube is processed incrementally throughout the day, and a full synchronization occurs every day at 2:00 AM. Full synchronization jobs usually run longer and consume more system resources than Incremental jobs. However, the system tries to correct itself if an Incremental job fails: the next Incremental job is upgraded to a full synchronization. If multiple Incremental jobs are upgraded to Full, as the following illustration shows, first check whether your network connectivity is reliable, then inspect the error that the failing job returned and address the issue.

[Illustration: multiple Incremental jobs upgraded to Full]
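The self-correction rule can be expressed as a small decision function (an illustrative sketch, not the actual TFS scheduler logic):

```python
# Sketch of the self-correction rule described above (illustrative only):
# if the previous Incremental run failed, the next scheduled Incremental
# job is upgraded to a Full synchronization.
def next_cube_job(scheduled, last_incremental_failed):
    if scheduled == "Incremental" and last_incremental_failed:
        return "Full"
    return scheduled
```

This is why a burst of Incremental failures, for example from flaky network connectivity, shows up in the report as a run of Full jobs.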

Why might a processing job run for a long time (~24 hours) before it fails?

If your network loses connectivity, the server-side execution of the Analysis processing job might finish but fail to report the job completion to the processing component for Team Foundation Server. Because of the same network failures, the resource lock might be released, but the Job Agent might not update the job's state. The following illustration shows a Full processing job that started on June 24, 2010, at 2:00 AM and ran for more than 24 hours. Because the processing lock had been released, the Incremental job ran in parallel with it.

[Illustration: a Full processing job that ran for more than 24 hours, with an Incremental job running in parallel]

The following illustration shows the worst case of the same problem. The Incremental job has run for more than nine hours, which means that no other jobs are scheduled and the cube is at least nine hours out of date. To mitigate this issue, set the AnalysisServicesProcessingTimeout setting for processing the cube for Team Foundation Server. The MSDN article Change a Process Control Setting for the Data Warehouse or Analysis Services Cube describes how to do this.

[Illustration: an Incremental job that has run for more than nine hours]

[Illustration]
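As a hedged sketch of applying that setting, the following builds (but does not send) a request to the ChangeSetting operation of the WarehouseControlService.asmx web service that the MSDN article uses. The URL assumes a default TFS 2010 installation (port 8080, "tfs" virtual directory); adjust the server name, port, and path for your deployment, and treat the exact endpoint shape as an assumption to verify against the article:

```python
from urllib.parse import urlencode

# Hedged sketch: TFS 2010 exposes a WarehouseControlService.asmx web service,
# and process control settings are changed through its ChangeSetting
# operation. The URL below assumes a default install (port 8080, "tfs"
# virtual directory); verify the path against your own deployment.
# This only constructs the request; it does not send it.
def change_setting_request(server, setting_id, new_value):
    url = (f"http://{server}:8080/tfs/TeamFoundation/Administration/"
           "v3.0/WarehouseControlService.asmx/ChangeSetting")
    body = urlencode({"settingId": setting_id, "newValue": new_value})
    return url, body

# Example: raise the Analysis Services processing timeout to two hours
url, body = change_setting_request(
    "localhost", "AnalysisServicesProcessingTimeout", "7200")
```

In practice you can also invoke the service interactively by browsing to the .asmx page on the application tier and filling in the ChangeSetting form, which is the approach the MSDN article walks through.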

How might I resolve a schema-merge conflict to unblock a team project collection?

When a team project collection is blocked by a schema-merge conflict, the Warehouse Job Status table shows the conflict with a link to a sub-report that displays details about the blocked fields. If you click the link that appears under the schema conflict error, a different report appears and shows the fields that are currently active and blocked for the blocked team project collection, as the illustrations below show. For additional help with resolving schema-merge conflicts, see Resolving Schema Conflicts That Are Occurring in the Data Warehouse.

[Illustration: Warehouse Job Status table showing a schema conflict with a link to the sub-report]

 

[Illustration: sub-report listing the active and blocked fields for the blocked team project collection]

  • Grant, all jobs OK except the Optimize Databases job, fails every night at 1:00 AM. I see no error in the event log or SQL Server log. Where does this "job" reside and how can I follow-up on the failure?

