Meta-consulting is a service Database Pros offers to help other developers
over programming hurdles via GoToMeeting. Email John Mark Osborne or call John Mark at (909) 393-4664 to find out more about this service.
Tool Tip If you want to draw multiple graphic objects while in layout mode, you may have some difficulty. Every time you finish drawing the graphic, the tool selection changes to the arrow, presumably to allow you to edit the graphic. If you want to draw multiple objects, double-click on the desired graphic tool. The tool will turn a darker shade and will stay selected until you choose another tool. If you always want tools to be locked with just a single click, see the preference setting in FileMaker "to always lock layout tools".
A script that identifies duplicates and deletes them isn't always the best choice. Sure, it works great when you are importing new contacts from a marketing campaign but what if the duplicate comes from an overzealous or lazy employee? Sometimes you need a system for merging a new record with an old one. While this can be completely scripted, I truly believe a human needs to oversee this process in most cases.
Example I'll be using an example of a contacts solution to demonstrate this technique. The simple reason is it's easy for everyone to grasp and start following along. Not everyone will have a contacts solution so they'll need to modify the technique slightly to meet their needs. I've also included a simple related table of hobbies to demonstrate how related records can be merged. There's no need to get more complicated to demonstrate this technique but your real-life scenario is almost always going to be more intricate. I'm just here to show you the basic concepts. Once you understand, it will be easy to modify the techniques to meet your own unique needs.
Finding Duplicates The first thing you need to do is identify the duplicates. The calculation and script for creating a found set of duplicates was covered in the first article of this series titled "Deleting Duplicates":
Here's the basic calculation and script for finding duplicates:
name_last & name_first & LeftWords(address; 1)
If you've worked with scripts, calculations and the exclamation point find mode operator then this script is pretty easy to understand. You'll need to adjust the calculation to meet the needs of your data since identifying duplicates is specific to the content in each database.
Merging Once duplicates are identified in a found set, you can easily walk through them and decide which ones you want to merge. Again, I believe this feature shouldn't be completely automated and requires a human to decide what records and what fields on those records to merge. The scripts are just there to ease the process of merging records. To allow the users to pick and choose which records to merge, you will need some sort of interface that initiates the merging feature. I have chosen a portal that displays the records the solution thinks are duplicates. The additional relationship is a self-join to CONTACTS_Duplicates and has two sets of match fields:
FYI: A self-join relationship is one that relates a table to itself.
The first set of match fields uses the same calculation field from the script above, allowing the self-join relationship to see all other records that match the current record. The second set of fields eliminate the current record from the relationship by using the primary key and the not equals operator so you just see duplicate records.
Before merging, I'd start with a simple Go to Related Record step in the portal to allow users to simply look at the duplicate record in a new window. They may want to look at it before committing to a merge:
Before creating the merge script, we'll need a layout that assists the script. Attach the CONTACTS table occurrence (TO) to the new layout. I designed my layout to be smaller than the default window size for my solution so I can display it as a Card Window. This will allow the user to work with the merge feature without leaving the current record. It's an interface concept that puts the user at ease by leaving the original data in the background.
Place two columns of fields on this layout. Start with the fields you might want to merge from the current record using the CONTACTS table occurrence. The second column of fields is identical to the first except that they are from a new self-join relationship called CONTACTS_Merge. I like to call this a temporary relationship because it's based on a global field. The idea is to place the primary key from the duplicate record in a global field and use that global field in a relationship to display the duplicate next to the original.
This merge technique is going to allow merging at the field level. Therefore, you'll need one script for every field you want to merge. Since the script is a single Set Field, you don't need to clutter the Script Workspace. Simply attach the Set field step to a button.
I'd recommend creating the first button in it's entirety so you can quickly make the rest of the buttons with few modifications. Use Smart Duplicate or the Option (Mac) or Ctrl (Windows) along with a drag of the mouse to quick duplicate the button. You can find out more about Smart Duplicate in Stuff Ya Oughta Know.
Once you are done, your layout should look something like the following:
The Done button in the upper right corner simply closes the window. I prefer this over the close window icon FileMaker provides in a Card Window so I turn the option off in the New Window script step. Which reminds me! You probably wanna see the script on the button in the portal that displays this merge layout.
That's right, the script isn't that complicated. As discussed previously, the duplicate record primary key is grabbed from the portal where the merge button is clicked and then placed in a global field to complete the relationship for displaying the duplicate record next to the original. Then, a Card Window is displayed using the new layout we just designed.
Merging Related Records Merging duplicate records is almost never as simple as a single table. If you have a flat file then you don't need to read any more. If you have even one related table, you'll need to offer additional merging features. Fortunately, the extra programming isn't very difficult at all. All you need is an additional relationship and a lengthy but straightforward script:
The idea in the script above is to grab the primary key from the current non-duplicate record and the clicked hobby from the portal showing the duplicate hobbies. These values are placed in variables so they can be transported to a new record that will be related to the current non-duplicate hobbies relationship. More variables may be needed if there are multiple related values that need to be moved. Once the new record is created, Set Field steps populate the new record and relate it to the current non-duplicate contact record. The last steps simply select the new portal row so it's clear to the user the hobby has been merged.
Once you're done, place two portals on the merge layout. The first is from the HOBBIES TO and the second is from the HOBBIES_Merge TO. Place a button in the HOBBIES_Merge portal that runs the scripts discussed above. Here's what it might look like:
What's Left? That's pretty much everything I know about duplicates in a series of three articles. You can either prevent them before they are entered, delete them after they are entered or merge them if you want to keep information from both records. If you have any other cool techniques, please place them in the comments.