Pattern Matching In Microsoft Office
What is pattern matching? Suppose you want to find each customer whose last
name begins with C, or all products with “automatic” somewhere in the
product description.
You’re probably already familiar with the star (*) notation so you can
guess that C* would match any customer name that begins with C. You can also
guess that “*automatic*” would match any description that had automatic at
the beginning, in the middle, or at the end.
The star is one of many “wildcard” characters used in Microsoft Office
programs. Another one is the question mark (?). A star will replace zero
characters or many characters but a question mark will replace one and only one.
So b*d would match bad or band (or bad band for that matter!), but b?d would
match only bad or bid or bod. If you wanted to find all invoice numbers between
50 and 59, you could search for 5?. The results would not include number 5 or
anything in the 500’s.
For some real power, you can include an explicit list of characters to match.
For instance, consider the pattern
[a-f]* as applied to a list of company names.
Can you guess what this wildcard will match? Any company name that begins with A
through F! That’s pretty powerful. You can also specify patterns you don’t
want to match using the ! character. For instance, [!X-Z]* would find all words
that did not start with X, Y or Z.
The way you use pattern matching depends on the Microsoft Office application
you are using. Access queries are the simplest to implement pattern matching
since the criteria field in Access queries accepts the Like operator. Using our
example above, a query which had a criteria of Like [A-F]* would return a list
of companies whose names began with the letters A-F.
In Word, you would use the Find command with the “Use Wildcards” option
turned on. You can type in the pattern to match, or you can choose certain
patterns from the Find dialog box. Word has a few differences from the basic
Like operator so be careful. For instance, if you want to find words that start
with Ar, Br, or Cr you would use [a-c]r* but you need a “start of word”
character in front of the pattern. In Word, that’s <. If you don’t
specify the <, you’ll find all words with ar, br, or cr in them anywhere.
See the video for an example.
In other applications, you would use a Visual Basic program to separate out
entries based on pattern matching. The Like operator in Visual Basic follows all
of the pattern matching rules outlined above and is also capable of recognizing
digits as opposed to characters. For instance, 5## would match 500-599 but not
50K since K is not numeric. You can combine wildcards to find your items of
interest in Office documents. If you have a TL product line (TL-R-22, TL-S-33,
TL-R-35, TL-X-45) and you want to find only product numbers in the 30s, you
could search for TL*3?.
See our sample code page for an example which uses pattern matching to
extract a date from a description
field, even if the format of the date changes from transaction to
transaction (e.g. 051304 to 4/6/03).
Try it out and you’ll be an expert in no time.