The Fixed Length Parser processes a stream of fixed-length data as input and returns its elements as parsed output. This is useful when the source data is in text format (all information is contained in a single field) and needs to be split into multiple fields for further integration.
In this document, we will learn how you can use Fixed Length Parser in Astera.
The sample data that we are using is a fixed length text file containing customers’ information.
Observe that all the information such as name, street number, street type, city, state, and zip code are concatenated in a single field. We want to parse the Contact_Details field into Name, Street Number, Street Type, and other respective fields.
To parse the data into separate fields, we will use the Fixed Length Parser object in Astera.
To parse the data, we first have to extract it. Since the data is stored in a continuous stream, we cannot extract it directly using a Fixed Length File Source object. Therefore, we will use a combination of transformations to get the desired outcome. The diagram below explains the process that we’ll be following, from extracting to parsing the data.
Let’s begin with extracting the data.
Go to Toolbox > Transformations > Constant Value and drag-and-drop the Constant Value object onto the designer.
Right-click on the object’s header and select Properties from the context-menu.
In the Constant Value field, provide the file path of the source file (the file containing customers’ data in a stream). Click OK.
Next, go to Toolbox > Function Transformations > Files > ReadFileText(String filePath) - String, and drag-and-drop the object onto the designer.
You can see that the dragged transformation object has two sub-nodes: Input and Output.
Expand the Input node and map the Value field from the Constant Value Transformation object to the filePath field inside the Read File Text object.
This will redirect Astera to read the data from the given file path. Now we can use the Fixed Length Parser object to parse the text data into separate fields.
To get the Fixed Length Parser object, go to Toolbox > Text Processors > Fixed Length Parser and drag-and-drop the object onto the designer. Map the Value field, under the Read File Text object, onto the Text field inside the Fixed Length Parser object.
You can see that the dragged object also contains an Output sub-node, which is currently empty.
Configure the Fixed Length Parser object by right-clicking on its header and selecting Properties.
A Properties window will open. Here, you will see three options; make sure the first two are checked.
Click Next, and you will be directed to a Source Fields window.
Here, you have to provide the name of each field that you want to parse the source data into, as shown below.
Click OK. Now, expand the Output node inside the Fixed Length Parser object. You will see all the fields that you have created in the previous step.
Right-click on the object’s header and select Preview Output from the context menu.
A Data Preview window will open. Expand the nodes, and you will see a parsed output of each record.
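Conceptually, what the Fixed Length Parser does with each record can be sketched in a few lines of Python: slice the text at fixed offsets and trim padding. The field names, start positions, and widths below are illustrative, not taken from the sample file.

```python
# Hypothetical sketch of fixed-length parsing: slice one record of text
# into named fields using (name, start, length) specifications, the same
# information the parser's Source Fields screen collects.

def parse_fixed_length(text, layout):
    """Slice a fixed-width record into a dict of named, trimmed fields."""
    return {name: text[start:start + length].strip()
            for name, start, length in layout}

# Illustrative layout: widths chosen for this example only.
layout = [
    ("Name", 0, 15),
    ("StreetNumber", 15, 5),
    ("StreetType", 20, 10),
]

record = "Maria Anders   4557 Avenue    "
parsed = parse_fixed_length(record, layout)
print(parsed["Name"])          # Maria Anders
print(parsed["StreetNumber"])  # 4557
```

Each record in the stream would be sliced this way, one output row per fixed-width record.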
To store this parsed output, write it to a destination file.
Right-click on the Output sub-node inside the Fixed Length Parser object, and go to Write to > Excel Workbook Destination. An Excel Workbook Destination object will be added to the dataflow designer with auto-mapped fields.
Configure settings for the Excel Workbook Destination object.
Learn how to configure settings for Excel Workbook Destination from here.
Click on the Start Dataflow icon, located in the toolbar at the top, to create this destination file.
We have added Language Parser functionality to the Expression Builder to enable advanced expressions such as interpolated and verbatim strings. This enables users to compile different values from incoming datasets into an expression and present them as an interpolated string in the output.
In this article, we will look at a use case and understand how the Language Parser works in Astera.
We have some EmployeeData from a fictitious organization stored in an Excel spreadsheet. We will work with that data to create:
An interpolated string
A verbatim string
We will be using the fields in the EmployeeData Excel sheet and interpolate the data into an EmployeeInfo string. After string interpolation, we will create a new field to apply escape characters inside an expression transformation object through a verbatim string.
Astera also provides extensive and detailed information about errors that may occur during data parsing and string interpolation. We will use some examples to see how error information works in Astera.
Follow through the steps below to see how it works:
Retrieve the source data.
Note: In this case, the source data of Employees is stored in an Excel file. We will extract it using the Excel File Source object in a dataflow.
Next, we will drag-and-drop the Expression Transformation object to the designer, to create string expressions. Map fields from the Excel source object (EmployeeData).
Right-click on the Expression transformation object and select Properties. A Layout Builder will open, where you can use the Expression Editor to define the expressions for each field.
Let’s discuss how to create an interpolated string in an Expression transformation object.
Create a new field in the Layout Builder. Since this field will only return an output, we will check the Output box and set the data type to String. In this case, we have named the new field EmployeeInfo.
In the Expression Editor, we will define the Interpolated String Expression for the new field. You can either write the expression in the given field or click the ellipses in the left corner of the field to go to the Expression Editor.
The interpolated string expression will be built in the Expression box.
It is important to note that an interpolated string expression starts with a dollar ($) sign. The $ notation denotes the parameters or the variables in the string which will be parsed dynamically during the preview and runtime for each record in the source file.
Here, we will create an interpolated string using the EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, HireDate, and City fields from our source data.
The interpolated string expression in this case will be written as:
During both preview and runtime, the variables or parameters enclosed in {} will take input from the source data and return the output for each record in the new EmployeeInfo field. Here’s an example of the parameter and the source value for the first record:
Hence, the output of the interpolated string for the first record will return as: “Ms. Nancy Davolio from Seattle was hired on 01/04/06 and works as Sales Representative.”
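Astera’s $"…" interpolated string behaves much like an f-string in Python. As a rough analogue (the dictionary below simply hard-codes the first record’s source values), the same output could be produced like this:

```python
# Illustrative analogue of an interpolated string expression, using a
# Python f-string. The record values are the first row of the source data.
record = {
    "TitleOfCourtesy": "Ms.",
    "FirstName": "Nancy",
    "LastName": "Davolio",
    "City": "Seattle",
    "HireDate": "01/04/06",
    "Title": "Sales Representative",
}

# Each {placeholder} is resolved per record, just as {FieldName} is
# resolved dynamically in the interpolated string expression.
employee_info = (
    f"{record['TitleOfCourtesy']} {record['FirstName']} {record['LastName']} "
    f"from {record['City']} was hired on {record['HireDate']} "
    f"and works as {record['Title']}."
)
print(employee_info)
# Ms. Nancy Davolio from Seattle was hired on 01/04/06 and works as Sales Representative.
```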
Now, if we look at the preview, you can see that a new field ‘EmployeeInfo’ has been created and the interpolated string output for each record has been returned in that field.
Let’s go over how to use a Verbatim String with a line break in an Expression transformation object.
Create a new field in the Layout Builder for the verbatim string. Since this field will only give an output, we will check the Output box and set the data type as String. In this case, we will name the new field as Office.
In the Expression field of the Layout Editor, we will define the verbatim string expression for that field. You can either write the expression in the corresponding expression field or click on the ellipses in the left corner of the field to enter the Expression Editor.
The verbatim string expression will be built in the Expression box on the Expression Builder screen.
It is important to note that a verbatim string works only with an at sign (@) before the expression, and it returns the value exactly as written in the output field.
In this example, we want the following output to be returned in the Office field that we created:
Output:
Office
North Street, 301, San Francisco
The verbatim string expression to achieve this output will be written as:
@"Office
North Street, 301, San Francisco"
The output preview will show the verbatim string upon hovering over values in the verbatim field, or you can export it to any destination to view the data.
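A rough Python analogue of a verbatim string is a triple-quoted string: the embedded line break is preserved exactly as typed, rather than needing an escape sequence. This sketch is only an illustration of the behavior, not Astera syntax:

```python
# Illustrative analogue of a verbatim (@"...") string: a Python
# triple-quoted string keeps the line break exactly as written.
office = """Office
North Street, 301, San Francisco"""

print(office)
# Office
# North Street, 301, San Francisco
```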
The error information feature in Astera has now been improved to provide more comprehensive information about errors to the users in real time. The information that it now provides is more precise and specific which helps in debugging errors as you build expression logics.
Let’s work on a few examples to explore how the error information functionality has been improved and how it can help in resolving errors and successfully compiling expressions in the Expression Builder.
Note: Three common error messages are discussed below; other error messages follow a similar pattern.
For an incomplete expression, the user is instantly notified that the expression status is Not Compiled, along with a message explaining that there is an unexpected token and that a further expression is expected for successful compilation.
‘Unterminated String Literal’ means that a string variable is not closed properly, either because of an unescaped character in it or a line break. Hence, the expression won’t compile until the string is closed.
Successful Compilation after Fixing the Error:
The string has now been closed, and the Compile Status has been updated to ‘Successful’.
The ‘Invalid Identifier’ error message appears when a field name is not valid. If you look at the given object, there is no field by the name TitleCourtesy; hence the invalid identifier error. The available field is TitleOfCourtesy.
This concludes using the Language Parser functionality in Astera.
The Delimited Parser in Astera reads and processes a single stream of text in delimited format as input and returns its elements as parsed output. It enables users to transform otherwise semi-structured data into a structured format.
In this document, we will learn to use the Delimited Parser to parse an incoming text stream in Astera.
In this case, we are using the Delimited File Source to extract our source data. You can download this sample data from the following link:
The source file contains customers’ contact information including their name, address, postal code, phone number, etc.
Upon previewing the data, you can see that it is difficult to decipher fields and elements, since the data is in a single text stream with fields and records separated by delimiters. To make sense of this data, each record needs to be parsed into its elements in respective fields.
To do this, we will use the Delimited Parser object.
To get the Delimited Parser object, go to Toolbox > Text Processors > Delimited Parser and drag-and-drop the object onto the designer.
You can see that the dragged object contains a single Text field.
Map the Customer_Info field inside the source object onto the Text field inside the Delimited Parser object.
Right-click on the object’s header and select Properties.
A configuration window will open as shown below.
Let’s look at the properties on this window.
Parse Data Pattern – Contains three patterns in which the dataset can be parsed:
Single Record – Data is parsed into a single record with multiple fields. Users need to provide a field delimiter, and a text qualifier, if necessary.
Multiple Records – Data is parsed into multiple records with a single or multiple fields. Users need to provide a field delimiter as well as a record delimiter.
Field Arrays – Data is parsed into an array of records and fields. Users need to provide a field value delimiter and an array separator.
The source data in this case contains multiple records with many different fields. Therefore, we will set the Parse Data Pattern option to Multiple Records.
Provide a Field Delimiter and a Record Delimiter. Since the source file also contains a Text Qualifier, specify that as well.
Click Next. This is the Layout Builder screen.
Here, write the names of the fields that you want to create.
Click OK. The Delimited Parser object now has new fields in the Output node.
To preview data, right-click on the object’s header and select Preview Output from the context menu.
A Data Preview window will open. Upon expanding the records, you can view the parsed output.
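What the Delimited Parser does in Multiple Records mode can be sketched with Python’s csv module: split the stream on the record delimiter, split each record on the field delimiter, and honor the text qualifier. The sample stream, delimiters, and qualifier below are illustrative:

```python
# Hypothetical sketch of delimited parsing in "Multiple Records" mode.
# Field delimiter: comma; record delimiter: newline; text qualifier: ".
import csv
import io

stream = (
    '"Maria Anders","Obere Str. 57","12209","030-0074321"\n'
    '"Ana Trujillo","Avda. 2222","05021","(5) 555-4729"\n'
)

reader = csv.reader(io.StringIO(stream), delimiter=",", quotechar='"')
records = [row for row in reader if row]

print(records[0][0])  # Maria Anders
print(records[1][3])  # (5) 555-4729
```

Each inner list corresponds to one parsed record, with one element per output field.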
To store this parsed output, you can write it to a destination file or use it for some transformation further in the dataflow.
This concludes using the Delimited Parser in Astera.
An XML/JSON Parser reads and processes a single text stream of hierarchical data and gives elements as parsed output in XML/JSON format.
In this case, we are using an .
The source file contains details of customers’ orders.
To preview data, right-click on the source object’s header and select Preview Output from the context menu.
A Data Preview window will open, displaying the source data.
Observe that it is difficult to decipher fields and elements when the data is in a text stream. To make sense of this data, it needs to be structured in a proper format, with fields and collection regions clearly identified.
To parse this source data, we will use XML/JSON Parser in Astera.
To get the XML/JSON Parser object, go to Toolbox > Text Processors > Parser and drag-and-drop the object onto the designer.
You can see that the dragged object contains a Text field.
Map the Text field in Customer_Orders object to the Text field in the XMLJSONParser object.
Right-click on the object’s header and select Properties.
A configuration window will open, as shown below.
Provide the XML schema path. Astera will parse the text file based on the structure defined in the schema file.
Click Next. This is the Layout Builder screen where the final layout of the incoming fields and the collection regions can be seen.
Click OK. The data stream is parsed into regions and fields which can be seen on the XMLJSON Parser object.
Right-click on the object’s header and select Preview Output from the context menu.
A Data Preview window will open. Expand the nodes, and you will see the parsed output for each record.
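Under the hood, this step turns a flat text stream back into a hierarchy. A minimal Python sketch of the same idea, using the standard library’s XML parser on a hypothetical snippet (the element names here are illustrative, not the sample file’s schema):

```python
# Illustrative sketch: parse a single text stream of hierarchical XML
# into regions (elements) and fields, as the XML/JSON Parser does.
import xml.etree.ElementTree as ET

text = (
    "<Customer_Orders>"
    "<Customer><Name>Maria Anders</Name>"
    "<Order><OrderID>10308</OrderID></Order>"
    "</Customer>"
    "</Customer_Orders>"
)

root = ET.fromstring(text)
name = root.find("Customer/Name").text
order_id = root.find("Customer/Order/OrderID").text
print(name, order_id)  # Maria Anders 10308
```

In Astera, the schema (XSD) file plays the role the element paths play here: it tells the parser where each field and collection region sits in the hierarchy.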
To store this parsed output, write it to a destination file.
Right-click on the Customer_Orders node under XMLJSON Parser object, go to Write to > Database Table Destination. A Database Table Destination object is added to the dataflow designer with all fields auto-mapped to it.
Click on the Start Dataflow button placed in the toolbar at the top to create the destination file.
A database table containing the parsed output will be successfully created.
The Delimited Serializer converts a structured dataset into a single text stream, with fields and records separated by delimiters or identified by text qualifiers. Serializing data with delimiters allows it to be shared or stored in a form that allows recovery of its original structure.
In this document, we will learn how to use the Delimited Serializer to serialize structured data in Astera.
In this case, we are using the Customers table from the Northwind database. You can download this sample data from the following link:
The source contains customers’ contact information, including their ContactName, Address, PostalCode, Phone, etc., in a structured format.
We want to convert the information contained in multiple fields into a single text stream separated by a delimiter.
To perform this task, we will use the Delimited Serializer object in Astera.
To get the Delimited Serializer object, go to Toolbox > Text Processors > Delimited Serializer and drag-and-drop the object onto the designer.
You can see that the dragged object contains a Text field with an output port and an Input sub-node which is currently empty.
Auto-map source fields by dragging and dropping the top node of the source object, that is Customers, onto the Input node of the transformation object – Delimited Serializer.
Right-click on the object’s header and select Properties.
A configuration window will open as shown below.
Let’s look at the properties on this window.
Field Delimiter – Allows users to specify a delimiter for the source fields from the dropdown list.
Text Qualifier – Allows users to specify qualifiers at the start and end of a text stream. In most cases, the text qualifier encloses an entire record.
Build Operation Type – Contains two options in which a dataset can be serialized:
One Record Per Input – creates a single text record separated by delimiters for the entire data set.
One Record Per Transaction – creates as many text records as there are records in the source, separated by the field delimiter only.
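The difference between the two build operation types can be sketched in a few lines of Python. The rows and delimiters below are illustrative (comma as the field delimiter, newline as the record delimiter):

```python
# Hypothetical sketch of the two serialization modes.
rows = [
    ["Maria Anders", "Berlin", "12209"],
    ["Ana Trujillo", "Mexico D.F.", "05021"],
]

# One Record Per Transaction: one delimited text value per source row.
per_transaction = [",".join(row) for row in rows]

# One Record Per Input: the entire dataset collapsed into a single
# text record, with rows joined by the record delimiter.
per_input = "\n".join(per_transaction)

print(per_transaction[0])  # Maria Anders,Berlin,12209
```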
Let’s leave the properties as default, and click OK. The data has been serialized.
To preview the data, right-click on the Delimited Serializer object’s header and select Preview Output from the context menu.
A Data Preview window will open, showing the serialized data with field delimiters.
To store this serialized output, write it to a destination file or you can use this data further in the dataflow.
This concludes using the Delimited Serializer in Astera.
The Fixed Length Serializer is useful when source data is stored in multiple fields and needs to be converted to text format (the entire data stored in a single field). This is helpful in scenarios where you have to:
Store large datasets, so you compress the information in a single field
Transport volumes of data over a network
In this document, we will learn how to use Fixed Length Serializer in Astera.
In this case, we are using an .
The source file contains customers’ ContactDetails.
To preview this data, right-click on the source object’s header, and select Preview Output from the context menu.
A Data Preview window will open, displaying the source data.
Now, we want to convert the information contained in multiple fields into a single field in text format.
To perform this task, we will use the Fixed Length Serializer object in Astera.
To get the Fixed Length Serializer object, go to Toolbox > Text Processors > Fixed Length Serializer, and drag-and-drop the object onto the designer.
You can see that the dragged object contains a Text field and an Input sub-node which is currently empty.
Auto-map source fields by dragging and dropping the top node of the source object, ContactDetails, onto the Input node of the transformation object – Fixed Length Serializer.
Right-click on the object’s header and select Properties.
A configuration window will open, as shown below.
Check the options according to the contents and format of your source file.
In this example, both options have been checked, as the first row in the source file contains a header and the data also contains multiple records.
Specify the Record Delimiter of the source data. In this case, it is <CR><LF>.
You can specify any delimiter based on the format of your source file.
Under the Builder Options, select the relevant type from the drop-down list. For this dataset, we will use One Record Per Input.
Click Next. This will open the Destination Fields screen, where you can see all the incoming fields from the source object.
Here you can delete fields, change their Header or Data Type, modify the string’s Start Position, define field Length, and customize the fields according to your requirements.
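The Start Position and Length settings above drive the serialization: each field value is padded out to its defined width so every record becomes one fixed-width text value. A minimal Python sketch of that padding, with illustrative values and widths:

```python
# Hypothetical sketch of fixed-length serialization: pad each field to
# its defined Length so the record collapses into one fixed-width string.
fields = [
    ("Maria Anders", 15),  # (value, field length) - widths are illustrative
    ("4557", 5),
    ("Avenue", 10),
]

text = "".join(value.ljust(width) for value, width in fields)
print(repr(text))  # 'Maria Anders   4557 Avenue    '
```

This is the inverse of the Fixed Length Parser shown earlier: the same start positions and lengths let a consumer recover the original fields from the serialized stream.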
Click OK.
Right-click on the Fixed Length Serializer object, and select Preview Output.
A Data Preview window will open, displaying the serialized data.
To store this serialized output, write it to a destination file.
Right-click on the FixedLengthSerializer node, and go to Write to > Fixed Length File Destination. A Fixed Length File Destination object is added to the dataflow designer with a Text field auto-mapped to it.
Click on the Start Dataflow icon, located in the toolbar at the top, to create the destination file.
A Fixed Length File Destination file will successfully be created. You can find its link in the Job progress window.
The XML/JSON Serializer reads hierarchical data and converts it into a text stream. In essence, it converts data from a structured format into a single stream of text.
In this case, we are using an .
The XML/JSON File Source contains the hierarchical data of customers, their orders, and the order details.
Right-click on the XML/JSON File Source object’s header and select the Preview Output to visualize the hierarchical data.
View the source data from the Data Preview window.
Now, to convert this hierarchical data into a single text stream, we will use the XML/JSON Serializer object.
Select the XML/JSON Serializer object from the Toolbox window and drag-and-drop it onto the dataflow designer.
Right-click on the XML/JSON Serializer object’s header and select Properties from the context menu.
A configuration window will open, as shown below.
Check the JSON Format option if your file is a JSON.
Click Next. This is the Config Parameters window, which allows us to configure and define certain parameters for the XML Serializer source file.
Using parameters eliminates hardcoded values and simplifies the deployment of flows. If the parameters are left empty, the default values defined in the Properties window are used.
Now, click OK, to proceed further.
Map the fields of the XML/JSON File Source object onto the XML/JSON Serializer object’s fields.
Right-click on the object’s header and select Preview Output.
You can view the data through Data Preview window.
Hover over the Text field and you will see the serialized output, i.e., a stream of text for each hierarchical record of the XML/JSON file.
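A rough analogue of this serialization, for the JSON case, is flattening one hierarchical record into a single text value. The record structure below is hypothetical, not the sample file’s layout:

```python
# Illustrative sketch: one hierarchical record collapsed into a single
# serialized text value, as the XML/JSON Serializer's Text output holds.
import json

record = {
    "Customer": {
        "Name": "Maria Anders",
        "Orders": [{"OrderID": 10308}, {"OrderID": 10625}],
    }
}

text = json.dumps(record)  # one text stream per hierarchical record
print(text)
```

The destination then receives this single Text field per record, instead of the original tree of fields and collection regions.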
To store the Serialized Output, we must write it to a destination file.
Right-click on the XmlJsonSerializer object, select Write To, and select your desired destination object. A destination object is created on the dataflow designer with the Text field auto-mapped.
Right-click on the destination object’s header and provide the File Path for the destination file. Select the relevant options and click OK.
By running this dataflow, you can now create an Excel file containing the serialized XML output.
This is how the XML/JSON Serializer object can be used to format hierarchical data into a single stream of text in Astera.
Field Name         Source Value
TitleOfCourtesy    Ms.
FirstName          Nancy
LastName           Davolio
City               Seattle
HireDate           01/04/06
Title              Sales Representative