Table Join Operation in Excel
Tech Stuff August 16th, 2009The interactivity of Excel makes it a very handy tool for data analysis. But Excel excels only at numeric analysis. Unfortunately, tabular/set analysis is not inherently supported in Excel and you have to jump through hoops to do that. I’m posting the source code for a VBA subroutine (Excel Macro) that performs one of the most common tabular operations – a Table Join. The subroutine here takes 2 tables in the form of ranges, and the name of the join field as input and produces a new worksheet with the joined table as output.
Join Operation
To illustrate the join operation, let’s look at an example. Let’s say both I (Matt) and my friend (Heather) like movies. We have seen a handful of movies lately and want to figure out which movies we have both seen and how our preferences for compare against each other. So, here are 2 lists:

With a list of 14 movies, this problem is still tractable, but add another few movies and it becomes time consuming and undesirable. By running the code in this article, you get the 2 tables aligned and merged into this table instantly:

The rows in which the Movie column is empty under Heather’s list correspond to movies that Matt knows about but Heather doesn’t. Similarly, rows where Matt’s Movie column is empty indicate movies Matt doesn’t know about, but Heather does. The movies that both share on the list are the ones where the Movie column is filled for both. You can also easily compare the movie preferences. For example, it’s easy to see that Heather and Matt both liked Rocky but both disliked Code Name: The Cleaner. Heather liked Sound of Music, but Matt didn’t. And so on.
Running the Script
If you are copying and pasting the code into your worksheet, you have to do some (minimal) extra work to run the subroutine.
- Open the VBA Editor from Excel (Alt + F11)
- Record a new Macro (what you record doesn’t matter – you are going to replace the macro code)
- Open the Module1 module from the Project panel in VBA Editor.
- Copy the contents of the subroutine JoinTables as listed below.
- Select Tools->References in the VBA Editor window. Scroll down to Microsoft Scripting Runtime, and click the checkbox. Click OK to close the dialog box.
- Replace the contents of your newly created macro with this code:
JoinTables Range1, Range2, "MyJoinFieldName". - Replace ‘Range1′ with the range for the first table, Range 2 with range for 2nd table. e.g., in the example above, Range 1 is ‘Worksheets(”Sheet1″).Range(”A3:C17″)’, Range 2 is ‘Worksheets(”Sheet1″).Range(”E3:G17″)’
- Make sure you have a header row for the 2 tables and that the header row is part of the range.
- Replace ‘MyJoinFieldName’ with the name of the header column on which you are trying to join the table. In the example above, that column is called ‘Movie’. The 2 column names should match exactly (no extra space, no difference in capitalization).
- Run the newly created and replaced macro from Tools->Macro->Macros->(Select proper macro)->Run.
If you want to run the subroutine multiple times, make sure you rename the output sheet something other than ‘Joined Table’.
You can also transfer the subroutine and the helper macro into a module file and import the module whenever you need the functions.
Code
Sub TestRun() ' ' TestRun Macro ' Macro recorded 8/16/2009 by Rudrava Roy ' JoinTables Worksheets("Sheet1").Range("A3:C17"), _ Worksheets("Sheet1").Range("E3:G17"), "Movie" End Sub Public Sub JoinTables(rngTable1 As Range, rngTable2 As Range, _ strJoinField As String) ' Existing tables passed as range arguments ' Field names (headers) to join on passed as strings ' First row must be table headers ' Measure length of both tables choose long table (A) and short table (B) Dim rngA As Range, rngB As Range If rngTable1.Rows.Count >= rngTable2.Rows.Count Then Set rngA = rngTable1 Set rngB = rngTable2 Else Set rngA = rngTable2 Set rngB = rngTable1 End If ' Create new sheet with headers from longer table (A) ' followed by shorter table (B) Dim newSheet As Worksheet Set newSheet = Worksheets.Add newSheet.Name = "Joined Table" rngA.Rows(1).Copy (newSheet.Cells(1)) rngB.Rows(1).Copy (newSheet.Cells(1, rngA.Columns.Count + 1)) ' Locate join columns in both tables Dim cellB As Range Dim cellA As Range Dim rngJoinColB As Range Dim tableAJoinCol As Integer For Each cellA In rngA.Rows(1).Columns If cellA.Value = strJoinField Then tableAJoinCol = cellA.Column - rngA.Columns(1).Column + 1 End If Next For Each cellB In rngB.Rows(1).Columns If cellB.Value = strJoinField Then Set rngJoinColB = rngB.Columns( _ cellB.Column - rngB.Columns(1).Column + 1) End If Next ' Resize join column to exclude header row Set rngJoinColB = rngJoinColB.Offset(1, 0).Resize( _ rngJoinColB.Rows.Count - 1, _ rngJoinColB.Columns.Count) ' Create dictionary of table B on join column Dim dictTableB As Dictionary Set dictTableB = New Dictionary Dim rngTmp As Range For Each cellB In rngJoinColB.Rows Set rngTmp = rngB.Rows(cellB.Row - rngB.Rows(1).Row + 1) dictTableB.Add Trim(cellB.Value), rngTmp Next ' Resize ranges to exclude header rows Set rngA = rngA.Offset(1, 0).Resize(rngA.Rows.Count - 1, _ rngA.Columns.Count) Set rngB = rngB.Offset(1, 0).Resize(rngB.Rows.Count - 1, _ rngB.Columns.Count) ' Iterate through each row in A Dim i As Integer i = 2 ' row to start inserting at For Each cellA In rngA.Rows ' Copy row to new table (J) as new row cellA.Copy newSheet.Cells(i, 1) ' If join column of row from A has dictionary match, ' copy row from B & mark row 'done' If dictTableB.Exists(cellA.Columns(tableAJoinCol).Value) Then dictTableB.Item(cellA.Columns(tableAJoinCol).Value).Copy _ newSheet.Cells(i, rngA.Columns.Count + 1) dictTableB.Remove (cellA.Columns(tableAJoinCol).Value) End If i = i + 1 Next ' Iterate through each remaining row in B Dim tmpRow For Each tmpRow In dictTableB.Items ' Copy row to new row in J tmpRow.Copy newSheet.Cells(i, rngA.Columns.Count + 1) i = i + 1 Next ' Resize all columns in new sheet For i = 1 To (rngA.Columns.Count + rngB.Columns.Count) newSheet.Columns(i).EntireColumn.AutoFit Next End Sub
Recent Comments