I can't tell you how many times I've been in the situation of trying to figure out what has changed between two tables. Often this happens when you create copies of schemas. Maybe you have a dev, test, QA and production environment and they all contain a reference table (for example sales tax by state) but the table has been updated over time. How do you tell that the data between the environments is different? How do you tell which data has diverged? How do you bring it back in sync?

There is a really cool trick that can quickly determine which tables don't match. Using SQL Server 2005, all you have to do is configure transactional replication between the two databases, while being careful to specify that the subscriber already has the data. If you don't know how to setup transactional replication, you can find more details in the books online documentation, but basically you launch the new publication wizard from replicatin->publications folder in management studio and specify the tables you want to include. You then follow that up with the new subscription wizard from the publication node and specify the destination database. When creating the subscription, be sure and uncheck the option to initialize the subscriber. Warning: if you don't uncheck "initialize", replication will copy all of the data from the publication database to the subscrption database, wiping out all the data that used to be at the subscriber. Once replication is configured, right click on the subscription and select the option to validate. By default, replication will count the rows in each table while also generating a unique checksum based on the data in the table. This list of tables, rowcounts, and checksums will be sent to the destination subscription database and will be automatically calculated and compared. This is very efficient - much more efficient and simpler than trying to do a manual set of queries. To view the results, just launch the replication monitor (right click the replication node) and then select the "subscription watch list" and double click on the subscription in the right hand pane. The results will look something like this for tables that don't match:

Table 'test' might be out of synchronization. Rowcounts (actual: 0, expected 1). Checksum values (actual: 0, expected: 467540).

Pretty cool, huh? Now for the next step. For each table that you find doesn't match, you can use a little tool that I snuck into SQL Server 2005 called tablediff. Tablediff is a command line tool in the SQL Server\90\com directory that compares the rows in tables and produces a report showing what rows are different. Basically you specify the source and destination servers, instances, and tables. You can get more detail in the books online or via tablediff /? on the command line. So now you know which tables are different and which rows are different. Last but not least, you may choose to bring the two tables into sync with each other. You have two options at this point. If you have already have setup replication and just want to overwrite the subscriber with all the changes from the source publication, you can select the subscription and right click and select the re-initialize option. After the agents run, the data will be automatically loaded into the subscriber via parallel bcp. It's very fast. The second option to fix up the subscriber is to just use the tablediff tool. Tablediff has a "-F" option which is handy in this case. This option generates a SQL script that will fixup the subscriber. All you have to do is run the resulting script on the subscriber.

Hope that helps out, please feel free to comment or post questions and suggestions

-Matt