# PDF Form Source

The *PDF Form Source* object extracts data from a fillable PDF document. A fillable PDF document contains data fields that a user can edit in any modern PDF viewer.

Fillable PDF documents are often used on the web in place of official documents. The *PDF Form Source* object detects the fields in such a document, extracts the data entered in them, and creates a corresponding field for each one.
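
To make the idea of "detecting fields and extracting the entered data" concrete, here is a minimal sketch written outside Astera. It is not how the *PDF Form Source* object is implemented. It assumes the third-party `pypdf` library is installed, and the function names `fields_to_record` and `read_form_fields` are hypothetical names chosen for this illustration.

```python
from typing import Any, Mapping, Optional


def fields_to_record(fields: Optional[Mapping[str, Any]]) -> dict:
    """Flatten form fields into {field name: entered value}."""
    record = {}
    for name, field in (fields or {}).items():
        # "/V" is the key under which a PDF form field stores its value;
        # a field that was left empty has no "/V" entry, so we get None.
        record[name] = field.get("/V")
    return record


def read_form_fields(path: str, password: Optional[str] = None) -> dict:
    """Read a fillable PDF with pypdf (an assumption of this sketch)."""
    # Imported lazily so fields_to_record can be used without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(path)
    if reader.is_encrypted and password:
        # Only needed for password-protected documents.
        reader.decrypt(password)
    # get_fields() returns None when the document has no form fields.
    return fields_to_record(reader.get_fields())
```

Calling `read_form_fields("application.pdf")` (a hypothetical path) would return one entry per detected form field, which is roughly the kind of field list the object presents for mapping.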

## **Sample Use-Case**

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FkOJ6NaiOoXGD9PzTj4n2%2Fimage.png?alt=media&#x26;token=e7eb148c-5ec4-41f5-a2bb-4202b81023e3" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
**Note**: This is a *Scholarship Application Form* with fillable data fields for *Personal Information*, *Contact Details*, and *Education Qualifications.*
{% endhint %}

## **Utilizing the PDF Form Source Object**

1. Select the *PDF Form Source* object from the Toolbox and drag and drop it onto the dataflow designer.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2Fijs6OP4FANVncWDPan5C%2Fimage.png?alt=media&#x26;token=b17b6fa2-afa6-44ac-8c04-b6e777b78dfe" alt=""><figcaption></figcaption></figure>

2. Right-click on the *PDF Form Source* object’s header and select the *Properties* option from the context menu.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FrNtnMGy8ha1urMOy1FpE%2Fimage.png?alt=media&#x26;token=2b90eb87-c165-4184-b056-15bca2d17b5c" alt=""><figcaption></figcaption></figure>

A *configuration* window will open, as shown below.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2Fos3yy4aEujGLMnqZRlGT%2Fimage.png?alt=media&#x26;token=d5275f01-97bf-4379-a2a1-a58443bbece2" alt=""><figcaption></figcaption></figure>

3. Provide the *File Path* for the fillable PDF document.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FjwFekBARYVwGSGbEHQvQ%2Fimage.png?alt=media&#x26;token=fea5b276-699c-45ea-ae60-b2271c7365e8" alt=""><figcaption></figcaption></figure>

* *Owner Password*: If the file is password-protected, enter the password configured by the owner of the fillable PDF document. Otherwise, leave this option blank.
* *Use UTF-8 Encoding*: Check this option if the file is encoded in UTF-8 (Unicode Transformation Format, 8-bit).
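
As a general illustration of why the encoding option matters (this is plain Python, not Astera behavior), the sketch below decodes the same UTF-8 bytes twice: once correctly and once with a mismatched encoding. The sample name is made up for the example.

```python
# A made-up applicant name containing non-ASCII characters.
name = "Zoë Müller"

# UTF-8 stores each of "ë" and "ü" as two bytes.
raw = name.encode("utf-8")

# Decoding with the matching encoding restores the original text.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as Latin-1 turns each two-byte character
# into two unrelated characters, which garbles the name.
as_latin1 = raw.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

If extracted values look garbled in a similar way, a mismatch between the file's encoding and this setting is one thing worth checking.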

Click *Next*.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FMLowHcG36RZX8V9fro5F%2Fimage.png?alt=media&#x26;token=c5dcda8f-016f-4658-9abf-fb9215a0c0e5" alt=""><figcaption></figcaption></figure>

This is the *Layout Builder* window, where you can see the data fields extracted from the fillable PDF document. Click *Next*.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FWuaDFdvZnz7ayXKkOxPJ%2Fimage.png?alt=media&#x26;token=8431169e-39f5-4983-9389-ae39fd5c1edd" alt=""><figcaption></figcaption></figure>

This is the *Config Parameters* window. Click *Next*.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FlmyL4VxM6t1Bik3RnMyn%2Fimage.png?alt=media&#x26;token=074b00c9-5f93-4051-b4da-e43ff2718915" alt=""><figcaption></figcaption></figure>

This is the *General Options* window. Click *OK*.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FSpP0lCSlaWv33eatsNpp%2Fimage.png?alt=media&#x26;token=c0fe8adc-c10c-4c2c-b3d0-eea3a8f4f015" alt=""><figcaption></figcaption></figure>

4. Right-click on the *PDF Form Source* object’s header and select *Preview Output* from the context menu.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2F3dL6EMsbomdPCuCNgJ1Y%2Fimage.png?alt=media&#x26;token=63aeb647-55fd-410c-97d1-ff918800b74c" alt=""><figcaption></figcaption></figure>

View the data in the *Data Preview* window.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FLZG9ICTrQe50IoBLi2iV%2Fimage.png?alt=media&#x26;token=9e2bf090-210a-4838-8eb2-caa51533ab38" alt=""><figcaption></figcaption></figure>

The data is now available for mapping. For simplicity, we will remove the fields we do not need and write the remaining data to a destination file.

5. This example uses a *Delimited Destination* object. Drag and drop the *Delimited Destination* object onto the dataflow designer and map the fields from the *PDF Form Source* object to it.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FAxokVRqBH2yGOSpX9FBt%2Fimage.png?alt=media&#x26;token=e1b343ac-9fbe-487b-8b34-3dfb7d9dd9cd" alt=""><figcaption></figcaption></figure>

In the destination object, right-click on each field that you do not want to store and select the *Remove Element* option.

{% hint style="info" %}
**Note:**

* Do not delete data fields from the *PDF Form Source* object, as doing so will disturb the layout generated for the detected data fields.
* You can also delete the data fields in the destination object by using the *Layout Builder*, or map only the relevant fields onto the nodes of the destination object. Refer to this [article](https://documentation.astera.com/v/astera-data-stack-v8/dataflows/destinations/delimited-file-destination) to learn more about the *Delimited Destination* object.
  {% endhint %}

6. Double-click the *Delimited Destination* object’s header, or right-click it and select the *Properties* option from the context menu. Specify the *File Path* where you want to store the destination file. Click *OK*.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2F4rTZmmOblVxAUdxhJ8pk%2Fimage.png?alt=media&#x26;token=672d3de6-8a1a-4997-bfd8-3377dfd22e30" alt=""><figcaption></figcaption></figure>

7. To preview the data, right-click on the destination object’s header and select *Preview Output* from the context menu.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FPASfD5UUOOR77TrqprCd%2Fimage.png?alt=media&#x26;token=520ff04c-3c8e-49f9-9d72-8c28f99439b9" alt=""><figcaption></figcaption></figure>

Here, you can see the data of the selected fields.

<figure><img src="https://750977703-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FqHxyVNGb7tSdIWecl6Ru%2Fuploads%2FXrR5bq29peOOjmBHNORa%2Fimage.png?alt=media&#x26;token=e0c65f5f-46af-4e58-9151-b5c48cf924ea" alt=""><figcaption></figcaption></figure>
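
For readers who want to picture what "store only the selected fields in a delimited file" amounts to, here is a minimal sketch using Python's standard `csv` module. It only mirrors the idea of steps 5 to 7 and is not what Astera runs internally. The field names and values are hypothetical, and `write_selected_fields` is a name invented for this example.

```python
import csv
import io


def write_selected_fields(record, keep, delimiter=","):
    """Return delimited text containing only the fields listed in `keep`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=keep,
        delimiter=delimiter,
        extrasaction="ignore",  # silently skip fields not listed in `keep`
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerow(record)
    return buffer.getvalue()


# Hypothetical values extracted from a scholarship application form.
record = {
    "FirstName": "Jane",
    "LastName": "Doe",
    "Email": "jane@example.com",
    "Degree": "BSc",
}

# Keep only two fields; the source record itself is left untouched.
print(write_selected_fields(record, keep=["FirstName", "Email"]))
```

The source record keeps all of its fields and only the output is narrowed, which parallels the note above about removing fields in the destination rather than in the *PDF Form Source* object.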

This concludes using the *PDF Form Source* object in Astera to extract data fields from fillable PDF documents.
