Pattern Count
© Copyright 2023, Astera Software
Pattern Count is the number of patterns that Astera matches against your file to capture a data region. It is useful when more than one pattern is required to identify the beginning of a data region. You can specify up to five patterns in a report model at a time.
In this document, we will explore how the Pattern Count feature helps with the selection of a data region.
Open a Report Model in Astera by going to File > New > Report Model.
Provide the File Path for the unstructured file from your directory. Download the sample data text here.
Astera supports the extraction of unstructured data from text, EDI, Excel, PRN, and PDF files. In this case, we are extracting data from a text file.
Click OK. A text file containing information about customers’ dues will open in the Astera designer.
Now that the file is open in Astera, we will create an extraction template.
Right-click on the Record node in the Model layout under the Report Browser panel and select Add Data Region from the context menu.
A pattern-matching bar and Region Properties panel will appear. A subnode “Data” is added to the Record node in the Model Layout panel.
Specify the pattern that Astera should match in your file to capture data. You can define a pattern using letters, characters, numbers, words, wild cards, or any combination of these.
Astera has built-in wild cards to facilitate region selection.
| Wild Card | Description |
| --- | --- |
| à | Matches any letter in the file. |
| Ñ | Matches any digit in the file. |
| Æ | Matches any letter or digit in the file. |
| __ | Matches any non-blank character in the file. |
| [ ] | Matches any blank character in the file, such as a space, tab, or line break. |
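To build intuition for how these wild cards behave, the following minimal Python sketch translates the three single-character wild cards into rough regular-expression equivalents. The mapping, the `to_regex` helper, and the sample date format are assumptions made for illustration only; they are not Astera's internal implementation, and the `__` and `[ ]` wild cards are left out for brevity.

```python
import re

# Assumed regex equivalents for the single-character wild cards above.
# This mapping is an illustration, not Astera's actual matching engine.
WILDCARD_TO_REGEX = {
    "à": "[A-Za-z]",     # any letter
    "Ñ": "[0-9]",        # any digit
    "Æ": "[A-Za-z0-9]",  # any letter or digit
}


def to_regex(pattern: str) -> str:
    """Translate a wild card pattern into a regular expression string.

    Characters that are not wild cards are matched literally.
    """
    return "".join(WILDCARD_TO_REGEX.get(ch, re.escape(ch)) for ch in pattern)


# Hypothetical date pattern: two digits, slash, two digits, slash, four digits.
date_regex = re.compile(to_regex("ÑÑ/ÑÑ/ÑÑÑÑ"))

print(bool(date_regex.fullmatch("12/05/2021")))  # True: two-digit day
print(bool(date_regex.fullmatch("3/05/2021")))   # False: single-digit day
```

The second call shows why a pattern written for a two-digit day does not match a date whose day has only one digit.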
In this example, we want to capture the data highlighted in yellow. Notice that the date on Line 2 follows a consistent pattern: each data block below Line 2 contains a date in the same fixed format.
We will specify this pattern using the Ñ wild card in the pattern-matching bar. Notice that this pattern alone will not capture the complete data region containing the date column, because the dates on Line 5, Line 8, and Line 14 are not aligned with the specified pattern.
To handle this issue, select the Floating Pattern and Float Fields options in the Pattern Properties panel.
Observe that the region containing the date on Line 8 is still not captured. The date on this line is not aligned with the pattern and, unlike the dates in the other regions, contains only a single digit for the day.
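The difference between a position-bound pattern and a floating one can be sketched in Python with regular expressions. This is only an analogy under stated assumptions: the sample lines, the date format, and the `fixed_match` and `floating_match` helpers are hypothetical, and Astera's own matching logic may differ.

```python
import re

# Hypothetical sample lines; the real sample file is not reproduced here.
lines = [
    "12/05/2021  Invoice 1001",      # date starts at column 0
    "    14/05/2021  Invoice 1002",  # same format, shifted to the right
    "    3/06/2021  Invoice 1003",   # single-digit day
]

# Assumed date format: two-digit day, two-digit month, four-digit year.
two_digit_day = re.compile(r"\d\d/\d\d/\d{4}")


def fixed_match(line: str, column: int = 0) -> bool:
    """Match only when the pattern starts exactly at the given column."""
    return two_digit_day.match(line, column) is not None


def floating_match(line: str) -> bool:
    """Match when the pattern occurs anywhere in the line."""
    return two_digit_day.search(line) is not None


fixed = [fixed_match(line) for line in lines]
floating = [floating_match(line) for line in lines]
print(fixed)     # [True, False, False]
print(floating)  # [True, True, False]
```

In this sketch, the floating match picks up the shifted date, but the line with a single-digit day still fails because it does not fit the two-digit pattern at any position.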
To capture the data on this line, increase Pattern Count to 2. Observe that a second pattern-matching bar has been added below the first one.
Now specify the pattern for this line using the Ñ wild card in the second pattern box. Make sure to check the Floating Pattern and Float Fields options for this data region as well.
Increase the Line count to 3.
Repeat the same process to capture the remaining data as shown in the screenshot below.
Once the data region is defined, the next step is to create data fields. To do so, highlight each field area, right-click, and select Add Data Field.
Repeat the process to create more data fields and name them as shown below.
Preview the data by clicking the Preview Data icon in the toolbar at the top of the designer window.
A Data Preview window will open, displaying a preview of the extracted data.
This concludes working with a Pattern Count greater than one in Astera.