AI Fuzzy Matching – Help Center

Purpose

This task sends a prompt to an AI model to determine the best match from a group of candidate records.

Disclaimer: AI models can be unpredictable and are subject to variation in response, consistency, and reliability. AI models are not guaranteed to return accurate data.

Category Location: All, Data Services/Enrichment

Prerequisites

AI Data Service

You will need to create a Data Service that is connected to an AI model prior to configuring this task. Data Services that are compatible with this task template include:

Infer Value Task

You must configure an Infer Value task template prior to using the AI Fuzzy Matching task template. Use Infer Value to identify matched records between the input data source and the reference data source. See the Infer Value help article for more information.

The output attribute in the Infer Value task template must be configured as shown below. The output attribute stores multiple groups of matched records and will be used in the AI Fuzzy Matching task template:

If more than one match is found: Select "Write all matching values (up to 300)" from the dropdown.
Number of values to write: Enter the maximum number of matched values to write. The maximum is 300; however, AI Fuzzy Matching only considers the first 10 matched IDs.
Write outputs to: Create a new attribute to store multiple matched records. The output attribute must be configured as a multi-value text type.

The Infer Value task template can be configured in standard match mode or advanced match mode as long as the output attribute is a multi-value text type containing multiple matched records.
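As a rough illustration of the 10-ID limit described above, the following Python sketch shows which values from a multi-value output attribute would be considered. The function name and the sample IDs are hypothetical; the sketch only mirrors the documented behavior and is not the Openprise implementation.

```python
MAX_CANDIDATES = 10  # limit documented for AI Fuzzy Matching


def considered_candidates(matched_ids):
    """Return the IDs that would be considered: the first 10, in order."""
    return list(matched_ids)[:MAX_CANDIDATES]


# Hypothetical Infer Value output holding 12 matched record IDs.
matched = [f"ACC-{n}" for n in range(1, 13)]
kept = considered_candidates(matched)
print(len(kept))  # 10
print(kept[-1])   # ACC-10
```

Because anything after the tenth ID is ignored, writing more than 10 values in the Infer Value task does not give the AI model more candidates to rank.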

Field Description

Select Data Service: Select the AI Data Service you created in Openprise.
Input data source candidate records attribute: Select the multi-value text type attribute created in the Infer Value task, which contains the list of matched record IDs from the reference data source. If more than 10 IDs are included, AI Fuzzy Matching only uses the first 10.
- NOTE: The IDs in the candidate records attribute must correspond to the IDs in the key field attribute of the reference data source.
Reference data source candidate records attribute: Select the data source containing the records you want to match against the input data source records.
- NOTE: This data source must have a key field defined or you will be unable to save the task configuration.
Construct prompt string: Use text to describe how the AI model should determine the best match and rank levels from the provided data. Clear and specific instructions are likely to yield the best results. Use the '@' symbol to select attributes for inclusion in the prompt.
Input data source attributes: Use the '@' symbol to select attributes from the input data source whose record values you want to compare against the reference data source. Any additional text outside of the specified attributes will not be considered. List the attributes in the order in which they should be compared with the reference data source record attributes.
- A minimum of 1 attribute is required; the maximum is 10.
- The values will be presented to the AI data service as a single block of text (e.g. [Openprise, California, Tech], where each item corresponds to an attribute value for a given record).
Best match ID: A single-value text type attribute that contains the reference data source candidate record ID that is ranked as number 1 for a group of records.
Complete model output (optional): A single-value text type attribute that returns a comma-separated string containing the full ranking of the candidate record IDs in the order they were returned by the AI data service. Records are ranked from best to worst match, left to right.
Unmatched candidate records (optional): A single-value text type attribute that contains candidate record IDs that could not be found in the reference data source.
Token Limit: The number of tokens that can be used by the AI model to process and respond to a prompt. The minimum token value is 1, the maximum is 12,000, and the default is 8,000.
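To make the "single block of text" format for the input data source attributes concrete, here is a minimal Python sketch. The function name, the attribute names, and the sample record are assumptions made for illustration; the exact text Openprise sends to the AI data service may differ.

```python
def build_value_block(record, attributes):
    """Join the selected attribute values, in the given order, into one bracketed block."""
    # The task requires between 1 and 10 input attributes.
    if not 1 <= len(attributes) <= 10:
        raise ValueError("select between 1 and 10 attributes")
    values = [str(record.get(name, "")) for name in attributes]
    return "[" + ", ".join(values) + "]"


# Hypothetical input record and attribute selection.
record = {"Company": "Openprise", "State": "California", "Industry": "Tech"}
block = build_value_block(record, ["Company", "State", "Industry"])
print(block)  # [Openprise, California, Tech]
```

The position of each value in the block follows the order in which the attributes were selected, which is why the attribute order should line up with the reference data source attributes.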

Word Parsing (Optional)

Configure an optional task to parse multiple matched results using the Word Parsing task template. This task should be configured directly after the AI Fuzzy Matching task template.

Purpose of configuring Word Parsing:

Parse results by a specific rank from the complete model output attribute (e.g. parse the second matched record ID)

Parse unmatched candidate results by a specific rank
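The rank-based parsing described above can be sketched in Python as follows. This is only an illustration of the idea, assuming the complete model output is a comma-separated string of IDs ordered from best to worst match; the function name and sample IDs are hypothetical, and the Word Parsing task template itself is configured in Openprise rather than in code.

```python
def id_at_rank(complete_output, rank):
    """Return the ID at the given 1-based rank, or None if that rank is absent."""
    ids = [part.strip() for part in complete_output.split(",") if part.strip()]
    if rank < 1 or rank > len(ids):
        return None
    return ids[rank - 1]


# Hypothetical complete model output, best match first.
ranking = "ACC-7, ACC-2, ACC-9"
print(id_at_rank(ranking, 2))  # ACC-2
print(id_at_rank(ranking, 5))  # None
```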

Review Outputs

Create a filter to review how many records have a value in the unmatched candidate records attribute. If a significant number of records have unmatched candidates, consider modifying the prompt for better results.

Create a filter to review how many records have no best match ID. If a significant number of records do not contain a best match, consider modifying the prompt for better results.
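For exported data, the two filter checks above could be approximated with a short Python sketch like the one below. The field names best_match_id and unmatched_candidates and the sample records are hypothetical placeholders for whatever output attributes you configured.

```python
def review_counts(records, best_field="best_match_id",
                  unmatched_field="unmatched_candidates"):
    """Count records with no best match and records with unmatched candidates."""
    no_best = sum(1 for r in records if not r.get(best_field))
    has_unmatched = sum(1 for r in records if r.get(unmatched_field))
    return {
        "total": len(records),
        "no_best_match": no_best,
        "has_unmatched": has_unmatched,
    }


# Hypothetical task output for three input records.
records = [
    {"best_match_id": "ACC-7", "unmatched_candidates": ""},
    {"best_match_id": "", "unmatched_candidates": "ACC-4"},
    {"best_match_id": "ACC-2", "unmatched_candidates": "ACC-8"},
]
counts = review_counts(records)
print(counts)  # {'total': 3, 'no_best_match': 1, 'has_unmatched': 2}
```

What counts as a "significant" share of records is a judgment call; comparing these counts before and after a prompt change is one way to see whether the change helped.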

Use Openprise system fields to determine if the AI model produced an error, was successful at producing a result, and how many tokens were used. There are five system fields that begin with the naming convention "openai_compatible_XXXX".

Use Cases

Use AI Fuzzy Matching to rank duplicate records and select the best matched record from a duplicate set.
Use AI Fuzzy Matching to match leads to accounts in a Lead to Account Matching process.