Friday, October 8, 2010

How to Compare Two Spreadsheets in Excel

This time a contribution from Yoav Ezer the CEO of Cogniview.

Enjoy,
Diego

Being Microsoft Excel experts, the Cogniview team are often being emailed with tricky spreadsheet questions. A particularly interesting puzzle came through recently that we thought we would share with you here:



"Each month I have to compare two spreadsheets of products, each with the same header rows and same number of column. The number of rows are different each time.


Is it possible using Excel to compare these two lists and show only the differences? I would love it if we could create a macro to solve this??"



Seems like a cool challenge, and I would love to say I had a hand in coming up with the solution but I am afraid it was the programming boffins at Cogniview who solved it!


The Solution


Several ideas were suggested and considered, including going through each sheet line by line using VB code. The problem always came when either or both sheets were really long and contained tons of data. Even on the fastest machine it could be seen to grind to a halt.


The answer was to use Excel's built-in function RemoveDuplicates to handle this. Unfortunately that makes it a solution not available to users of older versions.


We need two source spreadsheets to compare, then a target spreadsheet for the results, as shown below.



Basically, this process requires our macro to perform two steps.


Step one, we copy the first sheet's data to a new sheet and put the second's data after it, then we use the RemoveDuplicates function which removes the items from the second sheet that appear in the first.



In the second step we do the same again in reverse order - copy the second sheet's data and then the first, and use RemoveDuplicates again. The left over data in each case is the difference we need to display.



All we need to do then is present the results.



For each row we state where the data was found.


The Macro


Most of the first routine sets everything up, copying the range of data as described above. The last two lines call our custom function LeaveOnlyDifferent.


Sub CompareSheets()
' Merge into the current sheet
Dim sheetResult As Worksheet, Sheet1 As Worksheet, Sheet2 As Worksheet
Set sheetResult = ActiveSheet

' Clean merge sheet so we can work with it
If (MsgBox("This will erase all the data on the current sheet." & vbCrLf & "Do you wish to continue?", vbYesNo Or vbQuestion) <> vbYes) Then
Exit Sub
End If
sheetResult.UsedRange.Delete

' Ask for two sheets to compare
SelectSheet.SheetNames.Clear
For i = 1 To Worksheets.Count
SelectSheet.SheetNames.AddItem Worksheets(i).Name
Next
sFirstSheet = AskSheet
If (sFirstSheet = "") Then
Exit Sub
End If
Set Sheet1 = Sheets(sFirstSheet)
SelectSheetStart:
sSecondSheet = AskSheet
If (sSecondSheet = "") Then
Exit Sub
End If
If (sSecondSheet = sFirstSheet) Then
MsgBox "Please select different first and second sheets"
GoTo SelectSheetStart
End If
Set Sheet2 = Sheets(sSecondSheet)

' Find the column to use for marking
Dim sFromColumn As String
Dim nLastColumn As Integer
sTemp = Sheet1.UsedRange.Offset(1, 1).Address(True, True, 1)
sTemp = Mid(sTemp, InStr(sTemp, ":") + 1)
sFromColumn = Mid(sTemp, 2, InStrRev(sTemp, "$") - 2)
nLastColumn = Sheet1.UsedRange.Columns.Count

' Copy header
Sheet1.Range("A1:" & sFromColumn & "1").Copy sheetResult.Range("A1")
sheetResult.Range(sFromColumn & "1").Formula = "From Sheet"

' Compare stuff
LeaveOnlyDifferent Sheet2, Sheet1, sheetResult, sFromColumn, nLastColumn, 2
LeaveOnlyDifferent Sheet1, Sheet2, sheetResult, sFromColumn, nLastColumn, sheetResult.UsedRange.Rows.Count
End Sub

LeaveOnlyDifferent Function


This function is where the real solution lies, comparing and presenting the result. It accepts the sheets to compare and the destination, the columns and the first row. After copying the cells and removing duplicates it copies the result to the top of the destination sheet.


Function LeaveOnlyDifferent(Sheet1 As Worksheet, Sheet2 As Worksheet, sheetResult As Worksheet, sFromColumn As String, nLastColumn As Integer, nFirstCompareRow As Integer)
' Copy first sheet data
nFirstDataRowCount = Sheet1.UsedRange.Rows.Count - 1
Sheet1.Range("A2:" & sFromColumn & (nFirstDataRowCount + 1)).Copy sheetResult.Range("A" & nFirstCompareRow)

' Copy second sheet data below the first
nStartOfSecondData = nFirstCompareRow + nFirstDataRowCount
Sheet2.Range("A2:" & sFromColumn & Sheet2.UsedRange.Rows.Count).Copy sheetResult.Range("A" & nStartOfSecondData)

' Remove duplicates
Dim arColumns() As Variant
ReDim arColumns(0 To nLastColumn - 1)
For i = 0 To nLastColumn - 1
arColumns(i) = CVar(i + 1)
Next
sheetResult.Range("A" & nFirstCompareRow & ":" & sFromColumn & sheetResult.UsedRange.Rows.Count).RemoveDuplicates arColumns, xlYes

' Mark the different data as coming from the proper sheet
nDiffRowCount = sheetResult.UsedRange.Rows.Count - nStartOfSecondData + 1
If (nDiffRowCount > 0) Then
sheetResult.Range(sFromColumn & nStartOfSecondData & ":" & sFromColumn & sheetResult.UsedRange.Rows.Count).Formula = "Found only on " & Sheet2.Name
' Copy it to the top
sheetResult.Range("A" & nStartOfSecondData & ":" & sFromColumn & sheetResult.UsedRange.Rows.Count).Copy sheetResult.Range("A" & nFirstCompareRow)
End If

' Delete all the rest
sheetResult.Range("A" & (nFirstCompareRow + nDiffRowCount) & ":" & sFromColumn & sheetResult.UsedRange.Rows.Count).Delete
End Function

Over to You


As mentioned earlier, there are many ways to solve this particular Excel challenge - how would you solve it? Could you see this solution as being useful? Please share your thoughts and experiences with us on Facebook or Twitter.


About the author


This article was written by Yoav Ezer, the CEO of a company that creates PDF to XLS conversion software, called Cogniview.


Prior to that, cwas the CEO of Nocturnus, a technology-centered software solution company.

7 comments:

  1. How do I create the macro & function? Googling only tells me to enable the Developer tab, and use the Macro Recorder.

    Baffled.......

    ReplyDelete
  2. I get an 'Object required' error on: SelectSheet.SheetNames.Clear
    in the first routine.

    ReplyDelete
  3. This looks very helpful and just what i need. However, I am getting the following error code on :

    Compile Error: Ambiguous name detected: LeaveOnlyDifferent

    Would appreciate your help

    ReplyDelete
  4. good idea but also getting object required error on SelectSheet.SheetNames.Clear

    ReplyDelete
  5. I copy both sheets into one new sheet & use the remove duplicates function on the combined sheet. All that remains are single entries for the non duplicated data.

    ReplyDelete
  6. The code is not complete: "AskSheet" is not defined and the whole 'Ask for two sheets to compare' is incomplete and does not work as posted. Don't waste your time with this one.

    ReplyDelete
  7. An easier way is to change the font color of one spreadsheet before copying and pasting it into spreadsheet two. Sort and remove duplicates. Not only do you get a quick comparison but you also identify the location of differences without an additional comparison.

    ReplyDelete