PDF Form Source
Last updated
A PDF Form Source object lets users extract data from a fillable PDF document. A fillable PDF document contains data points, or digital fields, that a user can edit in any modern PDF viewer.
Fillable PDFs are often used on the web in place of official documents. The PDF Form Source object detects those fields, extracts the data entered in them, and creates a corresponding field for each one.
Select the PDF Form Source object from the Toolbox and drag and drop it onto the dataflow designer.
Right-click on the PDF Form Source object’s header and select the Properties option from the context menu.
A configuration window will open, as shown below.
Provide the File Path for the fillable PDF document.
Owner Password: If the file is protected, then enter the password that is configured by the owner of the fillable PDF document. If the file is not protected, this option can be left blank.
Use UTF-8 Encoding: Check this option if the file is encoded in UTF-8 (Unicode Transformation Format, 8-bit).
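To see why the encoding option matters, here is a minimal Python sketch. It is not Astera's implementation, and the sample value is made up. It decodes the same bytes two ways: reading UTF-8 bytes with a single-byte encoding such as Latin-1 garbles every non-ASCII character.

```python
# Illustrative only: a made-up field value containing non-ASCII characters.
raw = "Zoë Müller".encode("utf-8")

as_utf8 = raw.decode("utf-8")      # bytes interpreted with the correct encoding
as_latin1 = raw.decode("latin-1")  # same bytes interpreted with the wrong encoding

print(as_utf8)    # Zoë Müller
print(as_latin1)  # ZoÃ« MÃ¼ller
```

If previewed values show sequences like "Ã«" where accented letters are expected, an encoding mismatch of this kind is a likely cause.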
Click Next.
This is the Layout Builder window, where you can see the data fields extracted from the fillable PDF document. Click Next.
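For readers who want a feel for what field detection involves, the following Python sketch does something comparable outside Astera, using the open-source pypdf library. It is only an illustration under assumptions, not how Astera works internally. The helper names `field_values` and `read_form` are hypothetical; `PdfReader`, `is_encrypted`, `decrypt`, and `get_fields` are pypdf calls.

```python
from typing import Any, Mapping, Optional


def field_values(fields: Optional[Mapping[str, Any]]) -> dict:
    """Reduce a form-field mapping to {field name: filled-in value}.

    Each entry is expected to be dict-like, with the value stored under
    the PDF key "/V" (absent when the field was left blank).
    """
    if not fields:
        return {}
    return {name: field.get("/V") for name, field in fields.items()}


def read_form(path: str, password: Optional[str] = None) -> dict:
    """Open a fillable PDF and return its form data (hypothetical helper)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(path)
    if reader.is_encrypted:
        # decrypt() returns a falsy value when the password is rejected.
        if not reader.decrypt(password or ""):
            raise ValueError("The supplied password was not accepted.")
    # get_fields() returns None when the document has no form fields.
    return field_values(reader.get_fields())
```

Calling `read_form("application.pdf")`, where the file name is a placeholder, would return a dictionary with one key per form field, much like the list of fields shown in the Layout Builder.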
This is the Config Parameters window. Click Next.
This is the General Options window. Click OK.
Right-click on the PDF Form Source object’s header and select Preview Output from the context menu.
View the data through the Data Preview window.
The data is now available for mapping. For simplicity, we will remove the fields that are not required and store the output in a separate file. To store the data, we must write it to a destination file.
We are using a Delimited Destination object. Drag and drop the Delimited Destination object onto the dataflow designer and map the fields from the PDF Form Source object to the destination object.
Right-click on the fields that you do not want to store and select the Remove Element option.
Double-click the Delimited Destination object’s header, or right-click it and select the Properties option from the context menu. Specify the File Path where you want to store the destination file. Click OK.
To preview the data, right-click on the destination object’s header and select Preview Output from the context menu.
Here, you can see the data of the selected fields.
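The same idea, keeping only the required fields and writing them to a delimited file, can be sketched in plain Python with the standard `csv` module. This is an illustration only: the record, the field names, and the `write_delimited` helper are made up, and the output goes to an in-memory buffer rather than a file path.

```python
import csv
import io

# Made-up record standing in for the fields extracted from a form.
record = {"Name": "Ada Lovelace", "Email": "ada@example.com", "Signature": ""}

# Fields to keep; anything else is dropped, like Remove Element in the designer.
keep = ["Name", "Email"]


def write_delimited(rows, fieldnames, out, delimiter=","):
    """Write only the chosen fields of each row as delimited text."""
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter=delimiter,
        extrasaction="ignore",  # silently skip keys not listed in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_delimited([record], keep, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="", encoding="utf-8")` in place of the buffer.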
This is how a PDF Form Source object is used in Astera to extract data points (digital fields) from fillable PDF documents.