Follow these guidelines when using the recipe "Dedupe for Salesforce". Please read them before cloning the recipe.
Overview
This recipe provides the following:
- Jobs that dedupe records in Salesforce:
- Account-to-Account
- Contact-to-Contact
- Lead-to-Lead
- Lead-to-Contact
- Bots to run the above jobs. Note that the bots are created but not automated; the recipe assumes that your Account, Contact and Lead objects are already being imported on a regular basis by pre-existing bots.
- Manual data sources that provide metrics on what records have been deduped.
- Analytics reports
- App Factory Apps to view the analytics reports containing the results of deduping
Preparation
- Establish data sources: A data source must be created for all three Salesforce objects: Accounts, Contacts and Leads. If you already have these data sources, you do not need to create new ones. The data sources need to contain the following attributes, and may contain more attributes if they're needed for identifying duplicates and/or used to merge data from the non-surviving records into the surviving record:
- Accounts: all standard fields, plus BillingStreet, Description
- Contacts: all standard fields, plus Description, LeadSource, MailingStreet, EmailBouncedDate, EmailBouncedReason, HasOptedOutOfEmail, MobilePhone
- Leads: all standard fields, plus Description, Street, EmailBouncedDate, EmailBouncedReason, HasOptedOutOfEmail, MobilePhone
- Create new data targets: Create a new data target for each object with any attributes you'll be updating with data merged from the non-surviving records. You will need two data targets for contacts: one for the Contact-to-Contact (C2C) dedupe and one for the Lead-to-Contact (L2C) dedupe. Note: If you don’t know exactly which fields you want to use, just create a data target for each object containing only the object ID. The data targets can then be updated after the project recipe has been cloned and you have planned out which fields you want to update after merging.
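As a quick check before cloning, the attribute lists above can be compared against the fields each data source actually exposes. The following is a minimal Python sketch and not part of the recipe: the `REQUIRED` mapping copies only the non-standard fields named above (the standard fields are not enumerated here), and `missing_attributes` and the sample field list are hypothetical.

```python
# Non-standard attributes the guidelines list for each data source.
REQUIRED = {
    "Accounts": {"BillingStreet", "Description"},
    "Contacts": {"Description", "LeadSource", "MailingStreet", "EmailBouncedDate",
                 "EmailBouncedReason", "HasOptedOutOfEmail", "MobilePhone"},
    "Leads": {"Description", "Street", "EmailBouncedDate", "EmailBouncedReason",
              "HasOptedOutOfEmail", "MobilePhone"},
}

def missing_attributes(object_name, available_fields):
    """Return the required attributes absent from a data source, sorted by name."""
    return sorted(REQUIRED[object_name] - set(available_fields))

# Hypothetical field list for a Leads data source.
lead_fields = ["Id", "Email", "Description", "Street", "MobilePhone"]
print(missing_attributes("Leads", lead_fields))
```

An empty list means every listed attribute is present; anything returned should be added to the data source before the recipe is cloned.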
Clone
To clone the project recipe, navigate to Projects Recipes and select the project recipe to clone.
You will need to provide:
- Project name
- Prefix and/or suffix for all objects in the project
- Data sources to be used in the project
- Data targets to be used in the project
Examine the cloned project
We recommend you look through all the newly created project elements to familiarize yourself with what has been created.
The Lead-to-Contact dedupe jobs are designed with the assumption that both the lead AND contact objects have already been deduped. For this reason, we suggest you implement the Lead-to-Lead and Contact-to-Contact dedupe jobs before you begin working on the Lead-to-Contact dedupe jobs.
Look for:
- Any job that is marked with errors. These errors must be manually corrected.
Note: When cloning, the dedupe task in the C2C, L2L, and L2C jobs may show an error, typically because an attribute used in the task is missing from the data source. To clear the error, look at the task configuration to see where an attribute is missing, and fill it in where possible. Then click Save. This should clear the error.
- Export tasks that write back to Salesforce. Make changes by selecting any values that you want to write back to Salesforce.
Note: Every Export task to Salesforce has been bypassed for your protection. We recommend that you run the jobs with the bypass enabled so you can review any data changes before writing data back to Salesforce.
Review the dedupe logic and make changes as needed
These recipes will need to be modified to reflect the deduplication criteria for your organization. Below is a list of questions to help you develop your dedupe strategy:
- Are there any records that you want to leave out of deduplication?
- Which fields should be used to identify duplicates? e.g. Name and Country for Accounts? Email for Leads? Email and AccountId for Contacts?
- What criteria define the surviving record in a duplicate set?
- What fields get merged automatically if there is no value on the surviving record, but a value on one of the non-surviving records in a duplicate set? These fields need to be added to the jobs.
- Are there any fields that need special handling when merging records in a duplicate set? e.g. For Lead Source, you may want to keep the value from the earliest created lead record.
- Are there any fields that should not be merged? e.g. Deprecated fields, or fields maintained by a managed package, are often omitted from the merge logic.
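To make the questions above concrete, here is a minimal Python sketch of one possible strategy. It illustrates the concepts only and is not how the recipe's dedupe and merge tasks are implemented. It assumes Leads are matched on a normalized Email, that the earliest created record survives, and that blank fields on the survivor are filled from the non-surviving records. The function names and sample records are hypothetical.

```python
from collections import defaultdict

def find_duplicate_sets(records, key_fields):
    """Group records whose normalized key fields match; keep groups of 2+."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple((rec.get(f) or "").strip().lower() for f in key_fields)
        if all(key):  # leave out records with a blank key field
            groups[key].append(rec)
    return [g for g in groups.values() if len(g) > 1]

def merge_set(dup_set, fill_fields):
    """Pick the earliest-created record as survivor and fill its blank fields."""
    ordered = sorted(dup_set, key=lambda r: r["CreatedDate"])
    survivor = dict(ordered[0])
    for field in fill_fields:
        if not survivor.get(field):
            # Take the first non-blank value from the non-surviving records.
            survivor[field] = next((r[field] for r in ordered[1:] if r.get(field)), None)
    return survivor

# Hypothetical lead records; dates are ISO strings so they sort chronologically.
leads = [
    {"Id": "L1", "Email": "ann@example.com", "CreatedDate": "2021-03-01", "MobilePhone": ""},
    {"Id": "L2", "Email": "Ann@Example.com ", "CreatedDate": "2022-07-15", "MobilePhone": "555-0100"},
    {"Id": "L3", "Email": "bob@example.com", "CreatedDate": "2020-01-10", "MobilePhone": ""},
]
for dup_set in find_duplicate_sets(leads, ["Email"]):
    print(merge_set(dup_set, ["MobilePhone"]))
```

Each question maps to one decision in the sketch: the records skipped in `find_duplicate_sets` are the ones left out of deduplication, `key_fields` identifies duplicates, the sort key defines the survivor, and `fill_fields` lists the fields that are merged automatically.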
Run the bot
The first time you run the bots, we strongly suggest you keep the bypass option enabled on all export tasks.
Look at the output from each job, carefully examining your data to make sure it is being transformed correctly and to your standards. Review each export task to verify the mapped attributes are set up correctly.
When you are happy with the job outputs, complete a test run as follows:
- Modify the merge task by adding a filter to limit records to a single Duplicate Set Ids value.
- Modify the export task to set a maximum number of records updated. This option is in the Advanced Configuration section of the export task template.
- Remove the Bypass option on the merge and export tasks.
- Run the bot. When the bot is complete, review the records updated by looking at the job outputs and reviewing the attributes: op_merge_success, op_merge_errors, op_export_success and op_export_errors. Please also review the records in Salesforce to verify the changes were made correctly.
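If the job output is available outside the platform, part of this review can be scripted. The sketch below is only an illustration under stated assumptions: it assumes the output rows are available as dictionaries and that `op_merge_errors` and `op_export_errors` are blank when nothing went wrong. The actual value formats of these attributes are not specified here, and the `rows_with_errors` helper, the `Id` values and the error text are hypothetical.

```python
ERROR_ATTRIBUTES = ("op_merge_errors", "op_export_errors")

def rows_with_errors(rows):
    """Return (row, {attribute: value}) pairs for rows with a non-blank error attribute."""
    flagged = []
    for row in rows:
        errors = {a: row[a] for a in ERROR_ATTRIBUTES if str(row.get(a) or "").strip()}
        if errors:
            flagged.append((row, errors))
    return flagged

# Hypothetical job output rows; the Id values and error text are made up.
output = [
    {"Id": "003A", "op_merge_errors": "", "op_export_errors": ""},
    {"Id": "003B", "op_merge_errors": "", "op_export_errors": "example error text"},
]
for row, errors in rows_with_errors(output):
    print(row["Id"], errors)
```

Any flagged row is a candidate for a manual check in Salesforce before the test restrictions are lifted.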
Finalize the project
Final steps are:
- Edit the merge and export tasks to remove the restrictions implemented during testing.
- Edit the Dedupe MDS (manual data source) step to remove the filter that blocks writing records to the MDS. This will provide data for the analytics charts that are part of this recipe.
- Edit the Bots to schedule the frequency and enable automation.
Note: The Lead-to-Contact deduplication bot should be run after the Contact-to-Contact and Lead-to-Lead deduplication bots, and after a re-import of both the Lead and Contact data sources.
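The ordering in the note can be written down as a small dependency graph when planning the schedule. This Python sketch uses the standard library's `graphlib` (Python 3.9+). The step names are hypothetical labels, not bot names from the recipe, and it assumes the re-import runs after the two same-object dedupes so that the Lead-to-Contact bot sees the merged records.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps that must finish before it starts.
dependencies = {
    "contact_to_contact_dedupe": set(),
    "lead_to_lead_dedupe": set(),
    "reimport_leads_and_contacts": {"contact_to_contact_dedupe", "lead_to_lead_dedupe"},
    "lead_to_contact_dedupe": {"reimport_leads_and_contacts"},
}

# static_order() yields every step after all of its prerequisites.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

The two same-object dedupes have no dependency on each other, so either may run first; only the re-import and the Lead-to-Contact step are constrained.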