BETA - This connector is currently in beta.
This connector utilizes Openprise’s internal AI models. AI models can be unpredictable and are subject to variation in response, consistency and reliability. AI models are not guaranteed to return accurate data. Any PDF file data that is ingested and parsed via these models stays within Openprise and is not shared or used for training in any capacity.
This article covers how to connect Openprise to Google Drive - PDF when setting up a data source. You will need a Google Account and credentials for Openprise. You can learn more about Data Sources HERE.
Info Step
When creating a new data source, select Google Drive - PDF from the source technology and data format picklist.
- Select the Add Account Information button to link your Google Drive account to Openprise.
- Click on the Add Account button to sign into your Google Drive account using your credentials.
- Select from an existing account if you've already linked your Google Drive account to Openprise.
Directory or entity - Select the folder in your Google Drive account containing the PDF you want to import as a new data source.
- Import 1 file at a time: Fill this checkbox if your directory or entity contains multiple PDF files and you want to import them one at a time.
- Move processed file after importing: If this checkbox is filled, once a PDF file’s data has been imported into Openprise, the file will be moved into a subfolder named “OPProcessed”. This new folder will be automatically created by Openprise within the same Google Drive folder containing the PDF file.
- Skip file import if the number of fields is less than the configured: This option is selected by default and cannot be deselected. If a file does not contain any of the information specified for extraction in the Parse step of data source creation, all of the attribute values for that file will be populated with the value “N/A”, indicating that the AI model was unable to extract any relevant data.
- Timezone: Select a timezone from the dropdown.
- Truncate text that exceeds maximum length instead of failing import: Fill this checkbox to truncate any text that exceeds the maximum character length specified in the next field, rather than failing the import.
- Maximum character length allowed per attribute value: Type the maximum character length per attribute value.
Select Next in the top right corner to move onto the Parse step.
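The interaction between the truncate checkbox and the maximum character length can be sketched as follows. This is only an illustration of the behavior described above, not Openprise code: the function name apply_length_limit is hypothetical, and the sketch assumes that truncation keeps the leading characters, which this article does not state.

```python
def apply_length_limit(value: str, max_length: int, truncate: bool) -> str:
    """Enforce a per-attribute character limit (hypothetical helper).

    Assumption: truncation keeps the first max_length characters.
    """
    if len(value) <= max_length:
        # Within the limit: the value is imported unchanged.
        return value
    if truncate:
        # Checkbox filled: shorten the value instead of failing.
        return value[:max_length]
    # Checkbox empty: an over-long value fails the import.
    raise ValueError(
        f"value has {len(value)} characters; the limit is {max_length}"
    )
```

With the checkbox filled, a 6-character value and a limit of 4 would be imported as its first 4 characters; with the checkbox empty, the same value would cause the import to fail.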
Parse Step
The configuration that you define and test in the Parse step will be referenced when extracting data from ALL of the PDF files in the specified folder during import. If a PDF file in the folder does not contain the information specified in the configuration, no data will be extracted and returned for that file during import.
NOTE: Only 1 row of data is extracted and returned per PDF file. The extraction of table data may be unreliable and is therefore not recommended at this time.
- File or entity: Select a PDF file from the dropdown that you would like to use when testing your attribute configuration. Only PDF-type files will be shown in the dropdown.
- NOTE: Files with extensions such as .docx cannot be imported at this time - all files MUST have the extension .pdf.
- Specify a prompt: Describe the file content and the data you would like to extract. For best results, be as specific as possible about what the files contain and which data you want extracted.
- Attribute name: Type the names of the attributes that you would like the specified file data to be extracted to in Openprise.
- Attribute description: Define the expected content of each attribute by providing a brief description.
- Sample output: Returns a sample of the data extracted.
- Test attributes: Select this button to see what sample output would be extracted from the PDF file once imported.
- Clear attributes: Clears all attribute names and definitions.
Select Continue in the top right corner to move onto the Map step.
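To make the shape of a Parse configuration concrete, the sketch below models it as plain data: one prompt, a list of attribute names with descriptions, and exactly one output row per PDF file. Everything here is hypothetical (the Attribute class, build_row, and the invoice example are illustrative and not part of Openprise). Filling an individual missing attribute with “N/A” is also an assumption; this article only states that all attributes are set to “N/A” when none of the requested information is found.

```python
from dataclasses import dataclass

PLACEHOLDER = "N/A"  # value the article says is used when nothing is extracted


@dataclass
class Attribute:
    name: str         # "Attribute name" field
    description: str  # "Attribute description" field


# Hypothetical configuration for a folder of invoice PDFs.
prompt = "Each file is a vendor invoice. Extract the invoice number and total."
attributes = [
    Attribute("invoice_number", "Identifier printed near the top of the invoice"),
    Attribute("total_amount", "Final amount due, including tax"),
]


def build_row(attrs: list[Attribute], extracted: dict[str, str]) -> dict[str, str]:
    """Return the single row produced for one PDF file.

    Every configured attribute appears in the row; attributes the model
    could not extract are filled with the placeholder (an assumption for
    partially matching files).
    """
    return {a.name: extracted.get(a.name, PLACEHOLDER) for a in attrs}
```

Calling build_row(attributes, {}) models a PDF that contains none of the requested information: every attribute in the resulting row holds “N/A”.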
Map Step
- Attributes selected for import: This section will list all attributes created in the Parse step that will be imported into the data source.
- Type: Manually select the data type of each attribute.
- NOTE: Due to the unpredictable nature of AI, we recommend configuring all attributes as single-value text type. You can change the attribute type in a job using THIS task template if necessary.
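The reason for importing everything as text first can be illustrated with a small sketch. AI-extracted values may arrive as “N/A”, with currency symbols, or with thousands separators, so a strict numeric type at import time could reject them. The to_number helper below is hypothetical and is not the Openprise task template; it only shows the kind of defensive conversion that is easier to apply after import.

```python
from typing import Optional


def to_number(text: str) -> Optional[float]:
    """Convert an extracted text value to a number, or None if it is not numeric.

    Hypothetical helper: strips surrounding whitespace, removes commas,
    and drops a leading dollar sign before parsing.
    """
    cleaned = text.strip().replace(",", "").lstrip("$")
    try:
        return float(cleaned)
    except ValueError:
        # Covers the "N/A" placeholder and any other non-numeric text.
        return None
```

Under these assumptions, a value such as " $1,234.50 " converts to 1234.5, while “N/A” converts to None instead of raising an error.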