The interactivity of Excel makes it a very handy tool for data analysis. But Excel excels only at numeric analysis. Unfortunately, tabular/set analysis is not inherently supported in Excel and you have to jump through hoops to do that. I’m posting the source code for a VBA subroutine (Excel Macro) that performs one of the most common tabular operations – a Table Join. The subroutine here takes 2 tables in the form of ranges, and the name of the join field as input and produces a new worksheet with the joined table as output.
To illustrate the join operation, let’s look at an example. Let’s say both I (Matt) and my friend (Heather) like movies. We have seen a handful of movies lately and want to figure out which movies we have both seen and how our preferences for compare against each other. So, here are 2 lists:
With a list of 14 movies, this problem is still tractable, but add another few movies and it becomes time consuming and undesirable. By running the code in this article, you get the 2 tables aligned and merged into this table instantly:
The rows in which the Movie column is empty under Heather’s list correspond to movies that Matt knows about but Heather doesn’t. Similarly, rows where Matt’s Movie column is empty indicate movies Matt doesn’t know about, but Heather does. The movies that both share on the list are the ones where the Movie column is filled for both. You can also easily compare the movie preferences. For example, it’s easy to see that Heather and Matt both liked Rocky but both disliked Code Name: The Cleaner. Heather liked Sound of Music, but Matt didn’t. And so on.
Running the Script
If you are copying and pasting the code into your worksheet, you have to do some (minimal) extra work to run the subroutine.
- Open the VBA Editor from Excel (Alt + F11)
- Record a new Macro (what you record doesn’t matter – you are going to replace the macro code)
- Open the Module1 module from the Project panel in VBA Editor.
- Copy the contents of the subroutine JoinTables as listed below.
- Select Tools->References in the VBA Editor window. Scroll down to Microsoft Scripting Runtime, and click the checkbox. Click OK to close the dialog box.
- Replace the contents of your newly created macro with this code:
JoinTables Range1, Range2, "MyJoinFieldName".
- Replace ‘Range1’ with the range for the first table, Range 2 with range for 2nd table. e.g., in the example above, Range 1 is ‘Worksheets(“Sheet1”).Range(“A3:C17”)’, Range 2 is ‘Worksheets(“Sheet1”).Range(“E3:G17”)’
- Make sure you have a header row for the 2 tables and that the header row is part of the range.
- Replace ‘MyJoinFieldName’ with the name of the header column on which you are trying to join the table. In the example above, that column is called ‘Movie’. The 2 column names should match exactly (no extra space, no difference in capitalization).
- Run the newly created and replaced macro from Tools->Macro->Macros->(Select proper macro)->Run.
If you want to run the subroutine multiple times, make sure you rename the output sheet something other than ‘Joined Table’.
You can also transfer the subroutine and the helper macro into a module file and import the module whenever you need the functions.
' TestRun Macro
' Macro recorded 8/16/2009 by Rudrava Roy
JoinTables Worksheets("Sheet1").Range("A3:C17"), _
Public Sub JoinTables(rngTable1 As Range, rngTable2 As Range, _
strJoinField As String)
' Existing tables passed as range arguments
' Field names (headers) to join on passed as strings
' First row must be table headers
' Measure length of both tables choose long table (A) and short table (B)
Dim rngA As Range, rngB As Range
If rngTable1.Rows.Count >= rngTable2.Rows.Count Then
Set rngA = rngTable1
Set rngB = rngTable2
Set rngA = rngTable2
Set rngB = rngTable1
' Create new sheet with headers from longer table (A)
' followed by shorter table (B)
Dim newSheet As Worksheet
Set newSheet = Worksheets.Add
newSheet.Name = "Joined Table"
rngB.Rows(1).Copy (newSheet.Cells(1, rngA.Columns.Count + 1))
' Locate join columns in both tables
Dim cellB As Range
Dim cellA As Range
Dim rngJoinColB As Range
Dim tableAJoinCol As Integer
For Each cellA In rngA.Rows(1).Columns
If cellA.Value = strJoinField Then
tableAJoinCol = cellA.Column - rngA.Columns(1).Column + 1
For Each cellB In rngB.Rows(1).Columns
If cellB.Value = strJoinField Then
Set rngJoinColB = rngB.Columns( _
cellB.Column - rngB.Columns(1).Column + 1)
' Resize join column to exclude header row
Set rngJoinColB = rngJoinColB.Offset(1, 0).Resize( _
rngJoinColB.Rows.Count - 1, _
' Create dictionary of table B on join column
Dim dictTableB As Dictionary
Set dictTableB = New Dictionary
Dim rngTmp As Range
For Each cellB In rngJoinColB.Rows
Set rngTmp = rngB.Rows(cellB.Row - rngB.Rows(1).Row + 1)
dictTableB.Add Trim(cellB.Value), rngTmp
' Resize ranges to exclude header rows
Set rngA = rngA.Offset(1, 0).Resize(rngA.Rows.Count - 1, _
Set rngB = rngB.Offset(1, 0).Resize(rngB.Rows.Count - 1, _
' Iterate through each row in A
Dim i As Integer
i = 2 ' row to start inserting at
For Each cellA In rngA.Rows
' Copy row to new table (J) as new row
cellA.Copy newSheet.Cells(i, 1)
' If join column of row from A has dictionary match,
' copy row from B & mark row 'done'
If dictTableB.Exists(cellA.Columns(tableAJoinCol).Value) Then
newSheet.Cells(i, rngA.Columns.Count + 1)
i = i + 1
' Iterate through each remaining row in B
For Each tmpRow In dictTableB.Items
' Copy row to new row in J
tmpRow.Copy newSheet.Cells(i, rngA.Columns.Count + 1)
i = i + 1
' Resize all columns in new sheet
For i = 1 To (rngA.Columns.Count + rngB.Columns.Count)