Delimited Parser
The Delimited Parser in Astera reads and processes a single stream of text in the delimited format as input and returns its elements as parsed output. It enables users to transform otherwise semi-structured data into a structured format.
Use Case
In this case, we are using the Delimited File Source to extract our source data. You can download this sample data here.
The source file contains customers’ contact information, including name, address, postal code, and phone number.
Upon previewing the data, you can see that it is difficult to tell individual fields and records apart, because the data arrives as a single text stream in which fields and records are separated only by delimiters. To make sense of this data, each record needs to be parsed into its elements, with each element placed in its own field.
To do this, we will use the Delimited Parser object.
Using the Delimited Parser object
To get the Delimited Parser object, go to Toolbox > Text Processors > Delimited Parser and drag and drop the object onto the designer.
You can see that the dragged object contains a single Text field.
Map the Customer_Info field inside the source object onto the Text field inside the DelimitedParser object.
Right-click on the object’s header and select Properties.
A configuration window will open as shown below.
Let’s look at the properties on this window.
Parse Data Pattern – Contains three patterns in which the dataset can be parsed:
Single Record – Data is parsed into a single record with multiple fields. Users need to provide a field delimiter and, if necessary, a text qualifier.
Multiple Records – Data is parsed into multiple records, each with one or more fields. Users need to provide a field delimiter as well as a record delimiter.
Field Arrays – Data is parsed into an array of records and fields. Users need to provide a field value delimiter and an array separator.
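To make the difference between the first two patterns concrete, here is a minimal sketch in plain Python. It is not Astera's implementation; the function names, the comma field delimiter, and the pipe record delimiter are assumptions chosen only for illustration.

```python
import csv
import io

def parse_single_record(text, field_delim=",", qualifier='"'):
    """Single Record: the whole text is one record with many fields."""
    reader = csv.reader(io.StringIO(text), delimiter=field_delim, quotechar=qualifier)
    return next(reader)

def parse_multiple_records(text, field_delim=",", record_delim="|"):
    """Multiple Records: split into records first, then split each record into fields."""
    return [rec.split(field_delim) for rec in text.split(record_delim) if rec]

# Hypothetical sample values, not taken from the sample file
single = parse_single_record('Ana,"12 Elm St, Apt 4",90210')
multiple = parse_multiple_records("Ana,90210|Ben,10001")
print(single)
print(multiple)
```

In the Single Record call, the text qualifier keeps the comma inside the address from being treated as a field delimiter.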
The source data in this case contains multiple records with many different fields. Therefore, we will set the Parse Data Pattern option to Multiple Records.
Provide a Field Delimiter and a Record Delimiter. The source file also uses a text qualifier, so specify the Text Qualifier as well.
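The interplay of the three settings can be sketched in plain Python as follows. This is an illustration under stated assumptions, not how Astera parses internally: the helper names are hypothetical, delimiters and the qualifier are assumed to be single characters, and escaped (doubled) qualifiers are not handled.

```python
def split_outside_qualifier(text, delimiter, qualifier='"', keep_qualifier=False):
    """Split on a one-character delimiter, ignoring delimiters inside qualified spans."""
    parts, buf, inside = [], [], False
    for ch in text:
        if ch == qualifier:
            inside = not inside          # entering or leaving a qualified span
            if keep_qualifier:
                buf.append(ch)           # keep it so a later pass can still see the span
        elif ch == delimiter and not inside:
            parts.append("".join(buf))   # delimiter outside a span ends the current part
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts

def parse_records(text, field_delim=",", record_delim="|", qualifier='"'):
    """Split the stream into records, then each record into fields."""
    records = split_outside_qualifier(text, record_delim, qualifier, keep_qualifier=True)
    return [split_outside_qualifier(r, field_delim, qualifier) for r in records if r]

# Hypothetical stream: the second address contains the record delimiter inside qualifiers
stream = 'Ana,"12 Elm St, Apt 4",90210|Ben,"5 Oak | Pine Rd",10001'
parsed = parse_records(stream)
print(parsed)
```

Without the text qualifier, the comma in the first address and the pipe in the second would each be misread as a delimiter.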
Click Next. This is the Layout Builder screen.
Here, write the names of the fields that you want to create.
Click OK. The Delimited Parser object now has new fields in the Output node.
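Conceptually, the layout step attaches a name to each parsed position. A minimal Python sketch of that idea, assuming hypothetical field names (Name, Address, PostalCode) and a hypothetical helper `apply_layout`, might look like this:

```python
def apply_layout(records, field_names):
    """Pair each parsed value with the field name at the same position."""
    return [dict(zip(field_names, rec)) for rec in records]

# Hypothetical parsed records and layout, for illustration only
parsed_records = [
    ["Ana", "12 Elm St, Apt 4", "90210"],
    ["Ben", "5 Oak Rd", "10001"],
]
layout = ["Name", "Address", "PostalCode"]
named = apply_layout(parsed_records, layout)
print(named[0]["Address"])
```

Note that `zip` stops at the shorter sequence, so in this sketch a record with fewer values than the layout simply lacks the trailing fields.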
To preview data, right-click on the object’s header and select Preview Output from the context menu.
A Data Preview window will open. Upon expanding the records, you can view the parsed output.
To store this parsed output, you can write it to a destination file or use it for some transformation further in the dataflow.
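Outside the dataflow designer, the equivalent of writing parsed records to a delimited destination could be sketched with Python's standard csv module. The function name `write_records` and the sample row are assumptions; an in-memory buffer stands in for a destination file.

```python
import csv
import io

def write_records(rows, field_names, out):
    """Write named records to a file-like object as delimited text with a header row."""
    writer = csv.DictWriter(out, fieldnames=field_names)
    writer.writeheader()
    writer.writerows(rows)

# Hypothetical parsed row; replace the buffer with open(path, "w", newline="") for a real file
buffer = io.StringIO()
write_records([{"Name": "Ana", "Address": "12 Elm St, Apt 4"}], ["Name", "Address"], buffer)
output_text = buffer.getvalue()
print(output_text)
```

The writer adds a text qualifier around the address automatically because the value contains the field delimiter.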