Duplicate Leads, Contacts, and Accounts are a major issue hampering the quality and productivity of Salesforce.com data. System Administrators would love to handle it with configuration changes such as making a field Unique and Required, but unfortunately it is not that easy. It is rare that a record is a duplicate based on a single custom field (the Unique and Required properties are not available for standard fields). Many organizations choose to allow duplicates and remove them later through some kind of deduping process.
The solution I am proposing is based on a real-life situation where the client company had millions of contacts and accounts. Uniqueness was based on a combination of multiple fields, for example First Name, Last Name, State, Country, and Zip Code. Their developers had written Apex code in a before trigger to select all records matching these fields during insert or update; if a match was found, the record was prevented from saving by throwing an error. This worked quite well in a test sandbox, but once deployed to production, performance took a heavy toll and the code was removed from the trigger logic. I was asked to look into the issue. The solution was simple and worked quite well without any noticeable performance degradation.
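The original approach looked roughly like the sketch below (object, field names, and the trigger name are my assumptions, not the client's actual code). The SOQL query inside the record loop is the problem: it runs once per record, cannot use a selective index across all five fields at production volume, and quickly runs up against governor limits.

```apex
// Sketch of the original, slow approach (assumed names throughout).
// Before insert/update on Contact: query for an existing match on all
// five fields and block the save. One SOQL query per record in the
// loop is both slow on millions of rows and a governor-limit hazard.
trigger ContactDedupe on Contact (before insert, before update) {
    for (Contact c : Trigger.new) {
        List<Contact> matches = [
            SELECT Id FROM Contact
            WHERE FirstName = :c.FirstName
              AND LastName = :c.LastName
              AND MailingState = :c.MailingState
              AND MailingCountry = :c.MailingCountry
              AND MailingPostalCode = :c.MailingPostalCode
              AND Id != :c.Id
            LIMIT 1
        ];
        if (!matches.isEmpty()) {
            c.addError('Duplicate contact detected.');
        }
    }
}
```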
A custom text field called UniqueKey, long enough to hold the concatenated string of First Name, Last Name, State, Country, and Zip Code, was defined and made Unique and Required. Trigger logic (before insert/update) was added to normalize these fields by converting each to lower case and removing any extra spaces and padding. The normalized fields were then concatenated and assigned to UniqueKey. And that's it: Salesforce's own unique-index enforcement prevented duplicates very efficiently and effectively.
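A minimal sketch of that trigger logic, assuming a custom field `UniqueKey__c` (Text, Unique, Required, case-insensitive) on Contact; the class and trigger names are illustrative. Note there is no SOQL at all: the trigger only builds the key, and the platform's unique index rejects any duplicate on save.

```apex
// Helper class: builds the normalized key (assumed names).
public class ContactKeyBuilder {
    // Lowercase, trim leading/trailing padding, collapse runs of
    // internal whitespace to a single space; null-safe.
    public static String normalize(String s) {
        if (s == null) return '';
        return s.trim().toLowerCase().replaceAll('\\s+', ' ');
    }

    // Concatenate the normalized fields with a separator that cannot
    // appear in the data, so field boundaries stay unambiguous.
    public static void setKeys(List<Contact> contacts) {
        for (Contact c : contacts) {
            c.UniqueKey__c = normalize(c.FirstName) + '|'
                           + normalize(c.LastName) + '|'
                           + normalize(c.MailingState) + '|'
                           + normalize(c.MailingCountry) + '|'
                           + normalize(c.MailingPostalCode);
        }
    }
}
```

```apex
// Before trigger: assign the key; Salesforce enforces uniqueness.
trigger ContactUniqueKey on Contact (before insert, before update) {
    ContactKeyBuilder.setKeys(Trigger.new);
}
```

The separator is worth the extra character: without it, "Ann" + "Marie Smith" and "Ann Marie" + "Smith" would collapse to the same key.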
Obviously, a one-time batch process was written to populate UniqueKey for the existing records.
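That backfill might look like the following Batch Apex sketch (class name assumed). It issues an empty update in chunks, which re-fires the before-update trigger so the key is computed by the same code path as new records; partial-success mode lets pre-existing duplicates fail individually for later cleanup instead of aborting the whole batch.

```apex
// One-time backfill sketch (assumed class name).
global class ContactKeyBackfill implements Database.Batchable<SObject> {
    global Database.QueryLocator start(Database.BatchableContext bc) {
        // Only records that do not yet have a key.
        return Database.getQueryLocator(
            'SELECT Id FROM Contact WHERE UniqueKey__c = null');
    }
    global void execute(Database.BatchableContext bc, List<Contact> scope) {
        // The empty update re-fires the before trigger, which sets
        // UniqueKey__c. allOrNone=false: rows that collide on the
        // unique index fail individually and can be deduped later.
        Database.update(scope, false);
    }
    global void finish(Database.BatchableContext bc) {}
}
// Kick off from Anonymous Apex, e.g.:
// Database.executeBatch(new ContactKeyBackfill(), 200);
```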