© Copyright 2023, Astera Software
Pattern Count is the number of patterns that Astera matches against your file to capture a data region. It is useful when more than one pattern is needed to identify the beginning of your data region. You can specify up to five patterns in a single data region.
In this document, we will explore how the Pattern Count feature helps with the selection of a data region.
Open a Report Model in Astera by going to File > New > Report Model.
Provide the File Path for the unstructured file from your directory.
Astera supports extraction of unstructured data from Excel, CSV, text, PRN, PDF, Word, RTF, and XLS files. In this case, we are extracting data from a text file.
Click Open. A text file containing order information for a fictitious furniture store will open in the report model.
Now that the file is open, we will create an extraction template.
1. Right-click the Record node in the Model Layout tab under the Report Browser panel and select Add Data Region from the context menu.
A pattern-matching bar and the Region Properties panel will appear, and a subnode named "Data" is added to the Record node in the Model Layout tab.
2. Specify the pattern that the report model should look for and match in your file to capture data. You can use a letter, character, number, word, wildcard, or any combination of these to define your pattern.
Astera has built-in wild cards to facilitate region selection.
Wild Cards
Description
It matches any letter in the file.
It matches any digit in the file.
It matches any letter or digit in the file.
It matches any non-blank character in the file.
It matches any blank character in the file, such as a line break, space, or tab.
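For readers who think in regular expressions, the five wildcard classes described above can be approximated as character classes. This is only an analogy: the class names below are hypothetical labels, and the regex equivalents are assumptions rather than Astera's own matching rules.

```python
import re

# Hypothetical labels for the five wildcard classes described above.
# The regex character classes are rough stand-ins, not Astera's engine.
WILDCARD_CLASSES = {
    "letter": r"[A-Za-z]",
    "digit": r"[0-9]",
    "letter_or_digit": r"[A-Za-z0-9]",
    "non_blank": r"\S",
    "blank": r"\s",
}


def matches(wildcard: str, char: str) -> bool:
    """Return True if a single character belongs to the named wildcard class."""
    return re.fullmatch(WILDCARD_CLASSES[wildcard], char) is not None


# List which classes each sample character falls into.
for ch in ["A", "7", "-", " "]:
    hits = [name for name in WILDCARD_CLASSES if matches(name, ch)]
    print(repr(ch), hits)
```

Note that a hyphen belongs only to the non-blank class in this sketch, which is why a literal hyphen is typed into the pattern rather than expressed with a letter or digit wildcard.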
In this example, we want to capture the data highlighted in yellow. Notice that each item has a specific item code, which we can use as a pattern to extract all the item details.
3. The pattern is a combination of three letters, a hyphen, and five digits. You can use the relevant wildcards to specify the pattern. Notice, however, that some item codes do not follow this pattern: their digits appear before the letters. As a result, RUGS has not been captured in the data region.
4. To capture the region completely, we'll specify a second pattern. You can specify up to five patterns in a single data region. We'll go to the Pattern Properties panel and increase the Pattern Count to 2. Another pattern bar appears.
5. On the second pattern bar, we'll specify another pattern in which the five digits come before the three letters, separated by a hyphen. Now, all the lines with item details have been captured completely in the data region.
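The effect of raising the Pattern Count can be sketched outside the designer with ordinary regular expressions. The sample lines, the `capture_region` helper, and the assumption that each pattern sits at the start of a line are all hypothetical; the real report's contents and Astera's matching logic are not reproduced here.

```python
import re

# Hypothetical sample lines standing in for the order report.
LINES = [
    "ORDER ID: 10248",
    "CHR-10021  Office Chair",
    "TBL-20450  Coffee Table",
    "30877-RUG  Area Rug",
    "TOTAL",
]

# Regex stand-ins for the two pattern bars (assumed equivalents of the wildcards).
LETTERS_FIRST = r"[A-Za-z]{3}-[0-9]{5}"  # three letters, hyphen, five digits
DIGITS_FIRST = r"[0-9]{5}-[A-Za-z]{3}"   # five digits, hyphen, three letters

MAX_PATTERNS = 5  # limit on patterns per data region stated in the text


def capture_region(lines, patterns):
    """Return the lines that begin with any of the given patterns."""
    if not 1 <= len(patterns) <= MAX_PATTERNS:
        raise ValueError("pattern count must be between 1 and 5")
    compiled = [re.compile(p) for p in patterns]
    return [line for line in lines if any(c.match(line) for c in compiled)]


one = capture_region(LINES, [LETTERS_FIRST])                # Pattern Count = 1
two = capture_region(LINES, [LETTERS_FIRST, DIGITS_FIRST])  # Pattern Count = 2
print(len(one), len(two))  # prints: 2 3
```

With one pattern, the line whose code starts with digits is missed; adding the second pattern captures it, which mirrors what the second pattern bar does for the region.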
6. Once the data region is defined, the next step is to create data fields. To do so, highlight each field area, right-click, and select Add Data Field.
7. Repeat the process to create more data fields and name them as shown below.
8. Preview the data by clicking the Preview Data icon in the toolbar at the top of the designer window.
9. A window will open, asking you to save the file before proceeding. Save the report model to the desired path.
10. Once saved, a Data Preview window will open, displaying a preview of the extracted data.
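Conceptually, highlighting a field area assigns a column range on each captured line to a named field, and the preview shows one record per captured line. A minimal sketch of that idea follows; the field names, column offsets, and sample values are illustrative assumptions, not the ones used in this tutorial.

```python
def make_line(code, desc, qty, price):
    """Build a hypothetical fixed-width report line (widths 11, 20, 3, 8)."""
    return f"{code:<11}{desc:<20}{qty:>3}{price:>8.2f}"


# Hypothetical lines already captured by the data region.
REGION_LINES = [
    make_line("CHR-10021", "Office Chair", 2, 120.0),
    make_line("30877-RUG", "Area Rug", 3, 89.5),
]

# Hypothetical field layout: name -> (start column, end column).
FIELDS = {
    "ItemCode": (0, 11),
    "Description": (11, 31),
    "Quantity": (31, 34),
    "Price": (34, 42),
}


def extract_fields(line, fields):
    """Slice one captured line into named fields and trim the padding."""
    return {name: line[start:end].strip() for name, (start, end) in fields.items()}


# One record per captured line, similar in spirit to a data preview.
records = [extract_fields(line, FIELDS) for line in REGION_LINES]
for record in records:
    print(record)
```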
This concludes our discussion on working with an increased Pattern Count in Astera.