Purpose
Replace or fill an attribute in Data Source A with a lookup value from Data Source B by matching on aliases. When an alias is found, replace it with the normalized value. This task is useful to replace one value with another value, or replace different variations with one standard value.
Category Location: All, Match and lookup
Field Description
- Attribute to normalize/clean up: Select the attribute from your input data source to normalize/clean up.
- Write normalized value to: Select the attribute that contains the normalized result.
- Normalized-alias mapping is stored in: Select the reference data source from the dropdown.
- Normalized values are stored in: Select the attribute in the reference data source that contains the normalized values.
- Alias values are stored in: Select the attribute in the reference data source that contains the alias values.
- Match method: Select the method to match data. Options are: Exact, Fuzzy, Begins with, Ends with, Contains.
- If more than one match is found: Two options: Write only a single value, Write all matching values (up to 100).
- Resolve conflicts using the priority in: If applicable, select the attribute that contains the priority.
- Copy unmatched input data to output: If checked, this option will write the original value to the output field when no match is found.
Tips
- This requires the use of a Reference Data Source that contains a list of normalized values and the mapping to their alias values. For example, in the Openprise reference Reference – States, Provinces, Regions, it contains State Name (e.g. California), Postal Abbreviation (e.g. CA), Aliases (e.g. California, CA, Calif).
- The aliases can also contain nicknames such as "The Golden State".
- You cannot normalize an empty attribute.
- Use matching method “fuzzy match” to catch misspellings and minor variations.
- Use matching method “contains” to search for alias' appearing among a sentence or a block of text.
- If you need to match more than one attribute to achieve normalization, consider using the Infer value task instead.
Examples
- Normalize state aliases such as California, Calif, Calif., and CA to "California". We recommend using the Openprise reference Openprise reference Reference – States, Provinces, Regions.
- Normalize country aliases such as United Sates, United States of America, US, U.S., USA, and U.S.A., to "USA". We recommend using the Openprise reference Reference – Countries – Multilingual.
- Normalize customer aliases such as Toyota USA, Toyota Corp., Toyoto Motor Sales, to "Toyota Motor Sales USA".
- Segment job title data into job level seniority and job function attributes. We recommend Openprise references Reference – Job Level Seniorities and Reference – Job Functions. For example:
- If Job Title = “VP, Enterprise Architecture”, then set Job Level Seniority = “Executive” and Job Function = “IT”.
- If Job Title = "Sales Engineer", then set Job Level Seniority = "Contributor" and Job Function = "Sales".