This section explains how to use Astera North Star to extract data from files automatically and shares best practices for getting the best results.
The latest enhancement in Astera uses artificial intelligence (AI) to suggest report model templates, so you can create models for many source files at once. Once you define the layout and document type, Astera recommends the most appropriate model templates, which reduces the time and effort needed to build your data extraction processes and removes the need to extract data manually.
This document provides a set of recommended guidelines for achieving the best results in data extraction using AI.
1. Use this feature for single-page documents. Processing larger documents is time consuming and resource intensive.
2. Correctly select the document type, purchase order or invoice, in the Report Model Schema Wizard.
3. Provide a clear Data Layout based on the standards in the next point. The layout must contain appropriate regions and correctly named data fields. This ensures accuracy in the generated templates.
4. Ensure that the documents you use follow a layout in which the first data region contains key-value pairs, followed by a table, and the last region is also a key-value pair region. For example, the purchase order shown below follows such a layout.
Invoices can follow such a layout as well.
5. After template creation, thoroughly review and verify the generated templates for accuracy. Identify any missing fields or errors and adjust as needed. Once generated and in use, regularly validate and update the templates to reflect any changes in newer document layouts or data requirements.
6. Validate Extracted Data: Always validate the accuracy and completeness of the extracted data against the original documents. Perform regular quality checks to ensure the extracted information aligns with the intended data fields and meets your business requirements.
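The validation step in the last guideline can be partly automated once the extracted data has been loaded into a script. The following is a minimal sketch, assuming each extracted document is available as a Python dictionary; the field names (`OrderNo`, `OrderDate`, `Total`, `Items`, and so on) are hypothetical and should be replaced with the names used in your own layout. It checks completeness only and does not replace a comparison against the original documents.

```python
from typing import Any

# Hypothetical field names, chosen for illustration only.
REQUIRED_HEADER_FIELDS = ["OrderNo", "OrderDate", "Total"]
REQUIRED_ITEM_FIELDS = ["Item", "Quantity", "UnitPrice"]


def find_problems(record: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in one extracted record."""
    problems = []
    # Header and footer values must be present and non-empty.
    for name in REQUIRED_HEADER_FIELDS:
        if record.get(name) in (None, ""):
            problems.append(f"missing header field: {name}")
    # The table region should yield at least one line item.
    items = record.get("Items") or []
    if not items:
        problems.append("no line items extracted")
    for index, item in enumerate(items, start=1):
        for name in REQUIRED_ITEM_FIELDS:
            if item.get(name) in (None, ""):
                problems.append(f"line {index}: missing {name}")
    return problems
```

A record with no problems returns an empty list, so the function can be run over every extracted document and the non-empty results reviewed by hand.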
By following these best practices, you can maximize the effectiveness of AI-powered template creation in Astera and achieve accurate and efficient data extraction from diverse document layouts while leveraging the capabilities of ChatGPT for improved results.
Astera now uses AI to recommend report model templates, so you can generate models for multiple source files at once by specifying the layout and document type. The rest of this document shows how to use this feature to create report models.
In this case, we have a few purchase order PDFs that contain key-value pairs, such as Order No. and its value, at the top of the document, then a table, and then a key-value pair at the end, as shown in the document below.
Before using this feature, you must upgrade your repository. You only need to do this once per repository; afterwards, you can use the feature whenever needed.
Go to the Windows search bar and type "Repository Upgrade for Integration Server".
To use the Astera North Star feature, create a new project from Project > New > Integration project, or open an existing project in Astera.
To open an existing project, go to Project > Open and then browse for your project file.
If you have multiple source files, add the source file folder to the project.
Next, right-click on the source files folder in the project explorer and select AI-Powered Data Extraction > Auto Create Report Models using Astera North Star.
This will open the Astera North Star Schema Wizard.
Depending on the type of files you have, either purchase orders or invoices, select the type from the drop-down menu. In this case, we will be selecting purchase orders.
Provide the wizard with the layout of the data that you want to extract from these source files. You can do this either by importing an object with a defined layout from a dataflow or by importing a layout from JSON.
To build the layout from a dataflow/subflow, select the icon and then browse for your dataflow. The dataflow should contain an object with the layout defined, such as the Passthru object shown in the image below.
If you are importing a layout from JSON, paste the JSON in the window, where options to verify and export the JSON, among others, are available. Once you have pasted the JSON, click Generate, and the layout will be imported.
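If you keep layouts as JSON, it can help to draft and check them in a script before pasting. The sketch below is illustrative only: the structure, the `type` and `fields` keys, and all region and field names are hypothetical and are not the schema Astera expects, so use the wizard's own verify and export options to confirm the real format. It builds a layout with the recommended key-value, table, key-value region order and serializes it as JSON text.

```python
import json

# Hypothetical layout structure for illustration; NOT Astera's actual schema.
layout = {
    "Header": {"type": "key_value", "fields": ["OrderNo", "OrderDate", "Vendor"]},
    "Items": {"type": "table", "fields": ["Item", "Quantity", "UnitPrice"]},
    "Footer": {"type": "key_value", "fields": ["Total"]},
}


def follows_recommended_order(layout: dict) -> bool:
    """True when regions run key-value, table, key-value, in that order."""
    # Python dicts preserve insertion order, so region order is kept.
    kinds = [region["type"] for region in layout.values()]
    return kinds == ["key_value", "table", "key_value"]


def to_json(layout: dict) -> str:
    """Serialize the layout as indented JSON text."""
    return json.dumps(layout, indent=2)
```

Calling `follows_recommended_order(layout)` before `to_json(layout)` catches a layout whose regions are out of the recommended order before any templates are generated from it.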
Once you have built the layout, click OK in the Astera North Star Schema Wizard, and the report model generation will begin.
The automated report mining will generate report models for each file in the folder.
This window shows the time remaining for template generation across all files, along with the status of each file.
If the status of a file is shown as "Success", the generated template is stored in the AI Generated Report Models folder, which is created in the project.
If a file does not contain the required fields, its status is shown as "Erroneous" and the generated template is kept in the Erroneous Report Models folder, which is also created in the project.
You can also click the "Erroneous" status text to see its stack trace.
You can access these report models if you wish to verify and/or edit them.
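For larger batches, a short script can list which templates landed in which folder so that the erroneous ones can be reviewed first. This is a minimal sketch, assuming the two project folders named above exist as ordinary directories under the project directory on disk; that mapping is an assumption, and `project_dir` is a hypothetical path you would supply.

```python
from pathlib import Path

# Folder names as given in this document.
SUCCESS_DIR = "AI Generated Report Models"
ERROR_DIR = "Erroneous Report Models"


def summarize_run(project_dir: str) -> dict[str, list[str]]:
    """List the file names found in each output folder of one run."""
    summary = {}
    for folder in (SUCCESS_DIR, ERROR_DIR):
        path = Path(project_dir) / folder
        if path.is_dir():
            names = sorted(p.name for p in path.iterdir() if p.is_file())
        else:
            # A missing folder is treated as empty.
            names = []
        summary[folder] = names
    return summary
```

The result maps each folder name to a sorted list of file names, which makes it easy to compare the count of erroneous templates against the number of source files.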
Now that the report models have been created, they are ready to be used for data extraction.
For standalone report model layout creation, open a new report model with the file. Then select the Auto create Layout (AI) option from the toolbar.
The rest of the process is the same as for multiple source files. The file must also be part of a project for this feature to run.
In dataflows, the Report Source object has a new parameter for the report model itself, so you can parameterize report models to suit your data extraction needs.
This concludes our discussion on creating report models using Astera North Star.