The Fuzzy Match tool is handy for matching non-identical values. It can run in two modes: Purge, which compares all records within a single source, and Merge, which compares records from different sources.
Steps to perform before attaching the Fuzzy Match tool:
· Data cleanse
o Remove white space and punctuation, and convert all text to upper case.
· Give each row a unique ID field
· Build a single data set using the Union tool
· Add a column to each data set to give each source a unique identifier
· Remove any identical matches
Use the Unique tool to remove any exact matches before applying Fuzzy Match in Purge mode. When using Merge mode, use a Join tool to remove any exact matches.
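For readers who want to see the preparation logic outside of a workflow, here is a minimal Python sketch of the same steps. It is not how the tools are implemented; the function names (`cleanse`, `prepare`) and the field names (`RecordID`, `SourceID`, `Raw`, `Clean`) are made up for illustration, and it assumes that "remove white space" means removing all white space rather than only leading and trailing spaces. The de-duplication keeps the first occurrence of each cleansed value, which mirrors the Unique tool approach.

```python
import string

def cleanse(value: str) -> str:
    """Strip white space and punctuation, then upper-case the text."""
    drop = string.punctuation + string.whitespace
    return value.translate(str.maketrans("", "", drop)).upper()

def prepare(sources: dict[str, list[str]]) -> list[dict]:
    """Tag each value with its source, union the sources, number the rows,
    and drop values that are identical after cleansing."""
    records, seen = [], set()
    for source_id, values in sources.items():
        for raw in values:
            key = cleanse(raw)
            if key in seen:  # exact match after cleansing, so no fuzzy match needed
                continue
            seen.add(key)
            records.append({
                "RecordID": len(records) + 1,  # unique ID per surviving row
                "SourceID": source_id,         # which source the row came from
                "Raw": raw,
                "Clean": key,
            })
    return records

# Hypothetical sample data: two sources with one exact match after cleansing.
rows = prepare({
    "A": ["Acme Ltd.", "Smith & Sons"],
    "B": ["ACME LTD", "Smyth and Sons"],
})
```

"Acme Ltd." and "ACME LTD" cleanse to the same value, so only one of them survives; the two "Sons" records still differ and are left for the fuzzy match.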
Fuzzy Match
Step 1: Configuration
· Choose preferred Mode
· Source ID field (only for merge)
· Record ID field
· Match Threshold – the match score takes into account each specification set in the configuration properties. If the generated match score is less than the specified threshold, the pair will not qualify as a match.
· Match fields – field names and match style
o There are several premade match styles, such as Address and Name. For more control, there is a Custom option.
· Advanced options – output match score
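To make the threshold and the two modes more concrete, here is a rough Python sketch. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, which is an assumption for illustration and not the scoring the Fuzzy Match tool actually uses. The function names, the 80% default threshold, and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def match_score(a: str, b: str) -> float:
    """Similarity as a percentage (0-100); a stand-in for the tool's match score."""
    return 100 * SequenceMatcher(None, a, b).ratio()

def fuzzy_match(records, threshold=80.0, mode="purge"):
    """Return (RecordID, RecordID2, score) for pairs scoring at or above the threshold."""
    matches = []
    for left, right in combinations(records, 2):
        # Merge mode only compares records that come from different sources.
        if mode == "merge" and left["SourceID"] == right["SourceID"]:
            continue
        score = match_score(left["Clean"], right["Clean"])
        if score >= threshold:  # below the threshold does not qualify as a match
            matches.append((left["RecordID"], right["RecordID"], round(score, 1)))
    return matches

# Hypothetical, already-cleansed records from two sources.
records = [
    {"RecordID": 1, "SourceID": "A", "Clean": "JOHNSMITH"},
    {"RecordID": 2, "SourceID": "A", "Clean": "JONSMITH"},
    {"RecordID": 3, "SourceID": "B", "Clean": "JOHNSMYTH"},
    {"RecordID": 4, "SourceID": "B", "Clean": "MARYJONES"},
]
purge_pairs = fuzzy_match(records, mode="purge")
merge_pairs = fuzzy_match(records, mode="merge")
```

In this sketch, Purge mode compares every pair of records, while Merge mode never pairs records 1 and 2 because both come from source A.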
Step 2: Unique tool – select the record ID fields from both sources, as this will eliminate any duplicate matches.
Step 3: Join the original fields back onto the match results. In this case, join both the names and the addresses back in from the different sources.
Step 4: Union – the final step is to union the different sources back together, now that all the data has been put back in.
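Steps 2 to 4 can be sketched in plain Python as follows. This is only an illustration of the logic, not the tools themselves: the field names `RecordID2` and `MatchScore`, the helper names, and the sample rows are assumptions made for the example.

```python
def unique_pairs(matches):
    """Step 2: keep the first row for each (RecordID, RecordID2) combination."""
    seen, kept = set(), []
    for match in matches:
        key = (match["RecordID"], match["RecordID2"])
        if key not in seen:
            seen.add(key)
            kept.append(match)
    return kept

def join_back(matches, id_field, source_rows):
    """Step 3: look up each matched ID in its source to recover name and address."""
    by_id = {row["RecordID"]: row for row in source_rows}
    return [
        {
            "Pair": (m["RecordID"], m["RecordID2"]),  # ties the two sides together
            "MatchScore": m["MatchScore"],
            **by_id[m[id_field]],                     # original fields for this side
        }
        for m in matches
    ]

# Hypothetical source data and match output containing one duplicate row.
source_a = [{"RecordID": "A1", "Name": "John Smith", "Address": "1 High St"}]
source_b = [{"RecordID": "B7", "Name": "Jon Smith", "Address": "1 High Street"}]
matches = [
    {"RecordID": "A1", "RecordID2": "B7", "MatchScore": 91.0},
    {"RecordID": "A1", "RecordID2": "B7", "MatchScore": 91.0},
]

pairs = unique_pairs(matches)                     # Step 2
left = join_back(pairs, "RecordID", source_a)     # Step 3, first source
right = join_back(pairs, "RecordID2", source_b)   # Step 3, second source
final = left + right                              # Step 4: union both sides
```

The `Pair` value is kept on every output row so that, after the union, records belonging to the same match can still be grouped together.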