Tuesday 5 March 2013

SQL Server: compare tables

   


When we import, export or synchronise tables, it's a hard job if we have many records.
In the past, I've found some difficulties when I need to compare two tables in SQL Server.
The tables contained different records and I needed to consolidate them in the first table.
The first thing I wanted to do was to actually see which were the differences: I needed a list of records in order to understand the situation.
I've found a good solution for that, and let me say, a quite unexpected solution indeed.

The UNION operator works quite well if we need to compare two tables. It's quick and handles NULL values, while a join clause or a WHERE condition don't.
Assume we have two tables and we need to get all differences in both. The magic trick I've found is to GROUP the union query on all columns and count all columns. The count is working because for any not completely matched record in any GROUP BY clause column, the COUNT(*) will be 1. Exactly what we are looking for.

Ok, now let's see an example: two tables (tabA and tabB) containing three columns (ID, Col1 and Col2).
The UNION query will be:

SELECT MIN(TableName) as TableName, ID, Col1, Col2
FROM
(
  SELECT 'tabA' as TableName, tabA.ID, tabA.col1, tabA.col2
  FROM tabA
  UNION ALL
  SELECT 'tabB' as TableName, tabB.ID, tabB.col1, tabB.col2
  FROM tabB
) tmp
GROUP BY ID, col1, col2
HAVING COUNT(*) = 1
ORDER BY ID
The query will return all records from one table that don't completely match in the other. Moreover, it returns all records that don't exist in one of the two tables.
Just to be clear, the query returns records that
1) are not present in tabA, but are present in tabB;
2) are present in tabA, but are not present in tabB;
3) all rows that do not completely match all columns.

The above trick is quite different from a way to compare two table and insert the missing data from one table to another.
In this case we need to consider an example.
Two tables: tabA and tabB. Each table has two columns: ID and name.
We need to add to tabA, all the missing names that are in table tabB. That means, compare tabA with tabB, if the record exists in tabA, do nothing; if the record doesn't exist in tabA, add it.
The query will be:
INSERT INTO tabA (name)
SELECT name
FROM tabB
WHERE name NOT IN (SELECT name FROM tabA)
And that's it.

I hope you've found the above interesting enough. Just drop a line in the comments section below.

0 thoughts:

Post a Comment

Comments are moderated. I apologize if I don't publish comments immediately.

However, I do answer to all the comments.