The Data Catalog is a library of curated reference data that you can use to help clean and normalize your data.
It is located under the Data Menu.
The Data Catalog has two sections: Open Data and Premium Data.
Open Data contains reference data for geography, industry, job function, job level, free email providers, junk names, and more.
Premium Data contains Department of Labor, ITA Screening List, and NPI Registry reference files.
In a job, Data Catalog reference sources are used with the Infer Task Template to populate or normalize values in your data source.
Create your own reference data source
1. Download an existing reference source by clicking the data source and selecting Download.
  - The download appears in the Download Library, located under the Data Menu.
  - Once the file is on your computer, edit the reference source to meet your requirements, then upload it as a data source.
2. Or, upload a reference source that you created independently as a data source within Openprise.
Publish New Data
Select the Publish Open Data option to create a reference source in the Open Data library. Before you publish, upload your reference as a data source within Openprise.
Note: This reference will only be available within your customer account in Openprise.
- Change Image: Select this option to change the image for your reference source
- Data set name: Name your reference source
- Data set description: Add a description for your reference source
- Data source: Select the data source you imported to Openprise to be used as your reference source
- Data set provider: Select your account in Openprise (company name)
- Data set administrators: Select the users that can edit this reference source
- Support information: Add support information related to this reference source
Examples
- If our data source were missing country values, we could use the Reference-Cities open data source to infer the country: match the city and state fields in our data source against the reference source and output the country name.
- If we wanted to tag non-business email addresses in the data source, we could use the Free & Disposable email providers open data source to identify those addresses and tag them with a value such as Free.
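The matching logic behind these two examples can be sketched in plain Python. This is only an illustration of the lookup idea, not Openprise's Infer Task Template or any Openprise API. The field names (`city`, `state`, `country`, `email`, `email_type`), the function names, and the sample rows are all hypothetical stand-ins for the Reference-Cities and Free & Disposable email providers reference sources.

```python
# Illustrative sketch only: plain Python, not an Openprise API.
# All field names, function names, and sample rows are hypothetical.

def normalize(value):
    """Trim and lower-case a value so matching ignores case and stray spaces."""
    return (value or "").strip().lower()


def infer_country(records, city_reference):
    """Fill an empty 'country' by matching (city, state) against a reference."""
    lookup = {
        (normalize(row["city"]), normalize(row["state"])): row["country"]
        for row in city_reference
    }
    for record in records:
        if not record.get("country"):
            key = (normalize(record.get("city")), normalize(record.get("state")))
            if key in lookup:
                record["country"] = lookup[key]
    return records


def tag_free_email(records, free_domains):
    """Set 'email_type' to 'Free' when the email domain is in the reference."""
    domains = {normalize(domain) for domain in free_domains}
    for record in records:
        email = normalize(record.get("email"))
        if "@" in email and email.rpartition("@")[2] in domains:
            record["email_type"] = "Free"
    return records


# Hypothetical stand-ins for the two reference sources.
city_reference = [
    {"city": "Austin", "state": "TX", "country": "United States"},
    {"city": "Toronto", "state": "ON", "country": "Canada"},
]
free_domains = ["gmail.com", "mailinator.com"]

# Hypothetical records from the data source being cleaned.
records = [
    {"city": "austin", "state": "tx", "country": "", "email": "pat@gmail.com"},
    {"city": "Toronto", "state": "ON", "country": None, "email": "lee@example.com"},
]

records = infer_country(records, city_reference)
records = tag_free_email(records, free_domains)
```

Both functions treat the reference data as a lookup table: the input fields form the match key, and the reference row supplies the output value. Records that have no match in the reference are left unchanged.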