Compare Lists In Excel And Access
Imagine that you have a customer list from 2005 and a customer list from
2006. You’d like to know which customers were active in both years, which
customers you lost in 2006 and which were new in 2006. In Excel, you create
named ranges for your two customer lists and then use the COUNTIF formula to
check for the incidence of the customer ID or name in opposing lists.
In the screen shot here, you can see that a value of 0 means the customer
is not in the opposing list; a value of 1 shows that it is.
We still have to sort the data by 0 or 1 and then copy and paste the zero
values in 2005 to one list – Lost Customers – and the zero values from 2006
to another list – New Customers. This will get you the information you need,
but it’s clumsy. What’s more, automating this process would be very
difficult given the randomness of the results and the sizes of the two
lists.
Let’s look at how we’d perform this comparison in Access. In a
database, you create a query or queries to extract the data you want.
Especially for more recurrent list comparisons like orders placed vs. orders
shipped or a comparison of account codes from one month to the next, it's
best to use a database because you only have to write the queries once! A
query isn't a bunch of static formulas like a spreadsheet - it's a set of
instructions that is run fresh each time it's displayed. When the data in a
table changes, you don't have to change the query like you have to change
the range of formulas in a spreadsheet.
Tables are
compared to each other in a database based on the type of JOIN used. The
most common JOIN has a rule that says take each row of one table and link it
to the row or rows in another table where a common field matches.

A query that requests matching companies would display companies 5, 9 and 11.
|
Returning Customers |
|
2005ID |
2005Customer |
2006ID |
2006CustomerName |
|
5 |
EFG Company |
5 |
EFG Company |
|
9 |
IJK Company |
9 |
IJK Company |
|
11 |
KLM Company |
11 |
KLM Company |
What happens to all of the rest of the data? In a standard JOIN, the data
that has no match is thrown away. We are comparing list, though, and want to know what customers
did not return in 2006, so we’ll change the JOIN type.
In the query window, double-click on the line joining (JOIN’ing?) two
tables and you should see the following dialog box. Change the join type to
#2 and click OK.
Note that the line between tables becomes an arrow pointing from Cust2005
to Cust2006. That way you can tell immediately that the join type has been
changed. With this change we’ll see all of the customers from 2005.
|
Lost Customers |
|
2005ID |
2005Customer |
2006ID |
2006CustomerName |
|
1 |
ABC Company |
|
|
|
3 |
CDE Company |
|
|
|
5 |
EFG Company |
5 |
EFG Company |
|
7 |
GHI Company |
|
|
|
9 |
IJK Company |
9 |
IJK Company |
|
11 |
KLM Company |
11 |
KLM Company |
|
12 |
LMN Company |
|
|
|
14 |
NOP Company |
|
|
When we join the tables in this way, we can see “holes” in the table
where customers didn’t come back! Let’s go one step further in refining our
query. We can search for the holes in the data and get a list of only
LostCustomers. The hole in the data is referred to in a database as a NULL.
Go back into the query window and in the Criteria row under 2006 CustomerID
type “Is NULL”. To simplify the look of the resulting table, turn off all of
the CustomerID ‘Show’ checkboxes and the 2006 Customer Name field as shown
below:
|
Lost Customers |
|
Customer Name |
|
ABC Company |
|
CDE Company |
|
GHI Company |
|
LMN Company |
|
NOP Company |
We now have a clean list of only the data we were looking for. To get new
customers, just flip the join properties and search for NULL in the 2005
CustomerID as shown below.
If you save each of the three queries (with JOIN types 1, 2 and 3) you
can create one 'master' query that puts them all together for a clean report
showing Returning Customers, Lost Customers and New Customers. Each time the
data in the tables updates, you just run the master query and get your
report! You'll never need to play with the query again. See our article on
writing UNION
queries to piece the three types of data together, and our article on
other differences between
databases and
spreadsheets.