Astera Data Stack
Version 11
Report Model Tutorial


In this tutorial, we will explore the new and improved features of Astera's Report Model component. To extract data from a document, you need to create a report model, customize it using the available properties and options, and then select a destination of your choice to write the extracted data to, for instance, an Excel sheet or a database table.

Once you have designed a report model, test it by previewing the data and collecting statistical information.

The extracted data can be massaged further to conform to downstream needs, verified for quality, and sent to the destination of your choice. The created project can be deployed to automate the entire process of data extraction and data validation from documents that have a similar layout.

This tutorial will demonstrate how Astera creates and automates the data extraction process and expedites data preparation with features such as Data Exporting, Workflow Orchestration, Email/FTP/Folder Integration, Data Verification and Scheduling Extraction.

The process involves two important steps as shown in the figure below:

Astera uses a template-based extraction model to extract data from unstructured file sources. The template, designed by the user, directs and guides Astera's data extraction process. We refer to this template as a Report Model or an Extraction Template, and we will use these terms interchangeably throughout this tutorial.

In the next section, we will discuss the anatomy of a report model.

Anatomy of a Report Model

A report model comprises seven main components that appear in the Model Layout tab, as shown below:

A brief description of each of these components is given below:

  • Data Region – A section (in an unstructured file) that contains the data points to be extracted. A data region can cover any number of lines in a source report and is identified by one or more repeating patterns.

  • Header and Footer Regions – The data regions at the top and bottom of a page are referred to as the header and footer regions, respectively. Add these regions if you are extracting information that repeats on each page in the header or footer sections.

  • Single Instance Data Region – A sub-region that extracts a single set of data points within a data region. In the data preview, it is shown as a single record.

  • Collection Data Region – A sub-region that extracts multiple sets of data points within a data region. It is a collection of records captured in a hierarchical structure, for example, a collection of product items under a single Order ID.

  • Append Data Region – A region you can add to a report model to capture content that would otherwise be left out of a data region.

  • Data Fields – The areas within a data region that contain the information to be extracted.

Report Model Interface

Astera's user-friendly interface enables business users to easily accomplish a wide range of data extraction tasks without relying on or employing expensive IT resources. With its easy-to-use, visual interface, the tool walks you through the process of identifying your desired data points, building the extraction logic, and sending it to the destination of your choice. The screenshot below displays important panels/windows in a Report Model.

The windows and panels shown in the screenshot above are discussed in the subsequent sections.

Report Model Designer

The unstructured file is loaded onto the Report Model designer where data regions are defined by identifying and specifying repeated patterns in the source report. Data fields are then captured from the defined data regions.

Pattern Box

Astera extracts data based on repeating patterns in the source report. You must identify and specify those patterns in a report model to create data regions.

You write the pattern in the Pattern Box (the orange region in the screenshot) to define a data region. A pattern can be any combination of letters, words, or numeric or alphanumeric characters. Report models in Astera also provide built-in wildcards for defining patterns.

Sometimes a single pattern cannot fully define a data region. In such cases, you can combine multiple patterns to define the extraction logic.

Note: Astera supports a maximum of five different patterns for a single data/collection region in a report model.
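As a minimal illustration (the report lines shown here are hypothetical), a literal pattern such as ACCOUNT: typed into the Pattern Box matches every report line where that text occurs at the same character position, and each matching line starts a new instance of the data region:

    Pattern box:  ACCOUNT:

    Report text:  ACCOUNT: 1001          <- matched, starts a data region
                  ... order lines ...
                  ACCOUNT: 1002          <- matched, starts the next data region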

Toolbar

The toolbar has various options that facilitate the data extraction process. The purpose and functionality of the icons present in the toolbar are discussed below:

The toolbar can be repositioned to any of the sides of the designer as preferred.

Report Browser

The Report Browser contains the features and layout panels for building extraction models and exporting extracted data.

There are two main tabs in the Report Browser panel:

Model Layout

The Model Layout panel displays the layout of your report model or extraction template. It contains data regions and fields built according to a custom extraction logic.

You can add and delete regions and fields, edit their properties, and export data directly to an Excel sheet, a CSV file, or a database table using the options available in this window.

Data Export Settings

Extracted data can be directly exported to an Excel sheet, a delimited file, or a database table in providers such as Microsoft SQL Server, Access, PostgreSQL, MySQL, or any ODBC-compatible database.

This exported data can then be used in various flow documents (a dataflow, a workflow, or a subflow).

Region Properties Panel

  • Region Name – Allows you to change the name of the data region.

  • Region Type – Tells the type of your region.

Region Details

  • The Region Details section lets you further customize your data region.

  • Region End Type – With the options available in the Region End Type drop-down list, you can specify where you want your data region to end. The options available are as follows:

  • Line Count – Ends your region after a specified number of lines.

  • Overlapping Container – Used when there are multiple data regions with overlapping lines.

  • Container Region – Used when a data region contains a sub-region within its boundaries.

Pattern Properties Panel

Let’s discuss the options available on this panel:

  • Case Sensitive Pattern Match – This option matches the data on a case-sensitive basis. For example, ‘Account’ and ‘account’ will be treated as two different patterns by Astera if the Case Sensitive Pattern Match option is selected.

  • Pattern is a Regular Expression – When this option is selected, Astera reads the specified pattern as a regular expression. A regular expression is a special text string used to describe a search pattern; you can think of regular expressions as a more powerful form of wildcards. For example, wildcard notations such as *.txt are used to find all text files in a file manager (see the example after this list).

  • Floating Pattern – The Floating Pattern option within the report model component allows you to capture each data field that matches the specified pattern no matter where it is located on the report model’s designer.

  • Float Fields – The Float Fields option will automatically be highlighted to the right of the Floating Pattern option when it is checked. The Float Fields option ensures that the line spacing also floats and is based on the line used to capture the first field. This option is selected by default but can also be unselected if you would like the field position to remain fixed.

  • Pattern Count - This option helps you increase the pattern count.

  • Apply Pattern to Line – This option is useful when the specified pattern does not capture the first line of the desired data region. For instance, when there is some information above the pattern keyword, increase Apply Pattern to Line from its default value of 0.

  • Multi-Column - This option is used when you have data residing in multiple columns.
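For instance, with the Pattern is a Regular Expression option enabled, a single pattern can match text that varies from line to line. The example below is illustrative only and assumes a report whose order lines begin with the word ORDER followed by a five-digit number:

    Regular expression pattern:  ORDER\s+\d{5}

    Matches:         ORDER 10248
                     ORDER 10302
    Does not match:  ORDERS SUMMARY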

Field Properties Panel

This panel appears when you define a data field within a data region.

The Field Properties panel allows users to customize the defined fields with the help of the following options:

General

  • Field Name – Allows users to assign a name to a data field.

  • Data Type – Provides the option to set the data type to string, real, date, etc.

  • Composite Type – Resolves a composite field, such as a full address or full name, into its components.

  • Format – Allows users to change the format of a date field.

  • Value If Null – Specifies what to do when the field value is null (see the illustration after this list).

    • None: This is the default setting. If ‘None’ is selected, the field will remain the same. For example, if the field in question is an empty address field, the cell will be displayed as empty in the preview.

    • Apply specified default: A string can be typed in for use here, such as ‘N/A.’ When the program finds a null value, the specified value will appear in the previewed cell instead of an empty cell.

    • Use from previous record: This returns the value of the preceding record in the same field.
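As a quick illustration of these three settings, consider a hypothetical Address field whose second record is null in the source report:

    Source values:                  12 Oak St,   (null),     9 Elm Rd
    None:                           12 Oak St,   (empty),    9 Elm Rd
    Apply specified default (N/A):  12 Oak St,   N/A,        9 Elm Rd
    Use from previous record:       12 Oak St,   12 Oak St,  9 Elm Rd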

Remove

Size and Position

Here, you can specify the size and position of a data field.

  • Start Position - Allows users to manually specify the start position of a data field.

    • Line/Column - Allows users to define coordinates that specify the starting position of the data field.

  • Length - Allows users to set the length of a data field.

  • Height - Allows users to set the height of a data field.

Working With Report Models in Astera

Astera provides a complete solution for automated extraction of data. Its user-friendly interface enables business users with little or no programming knowledge to easily accomplish a wide range of data extraction tasks without employing expensive IT resources.

In the following sections, we will learn how to build an extraction template, verify it against sample data, and export the extracted data to a dedicated destination.

Creating a Report Model

Step 1: Loading Unstructured File

  • Open a Report Model in Astera by going to File > New > Report Model.

  • Provide the File Path for the unstructured file from your local or shared directory.

  • Click OK. The text file containing Orders invoice data will be displayed on the report model designer.

Astera will use this file to create a report model. Astera supports extraction of unstructured data from text, Excel, RTF, PRN, EDI, or PDF files.

There are many options available on the Report Options panel to configure how you want Astera to read your file. The reading options depend on the file type and content type of your data. For example, if you have a PDF file, you can specify the Scaling Factor, Font, Tab Size, Passwords and Pages to Read.

In this example, we are extracting Orders invoice data from a text file.

Step 2: Creating a Report Model

1. Header Region

Let’s take a look at the report document we have opened in the report model editor. At the top of the document is some general information, including company name and report dates. Then we have some account information, followed by order information including individual order items. Notice that this document also has a repeating header on each page. To extract the data from the header, we will need to add a Header to our report model.

  • Highlight the top region, right-click on it and select Add Page Header Region… from the context menu.

This area will now be highlighted in grey and the header region will show up in the Model Layout panel.

Note: A header region is only required in documents that have identical text at the top of each page, for example, a report title, date, etc. It is not mandatory to have a header in all report models.

The next step is to create the data fields that make up the header.

  • Highlight the field area, right-click on it and select Add Data Field… from the context menu.

A data field will be added to the report model.

  • Rename the data field from the Field Properties panel.

  • Repeat the process to create more data fields and name them, as shown below. You can now see the layout of the report model in the Model Layout panel.

The header region has been created for this report, so let’s move on to creating a data region.

2. Data Region

  • Right-click on the Record node, inside the Model Layout panel, and select Add Data Region from the context menu.

A Data node will be added to the report model, appearing in the Model Layout tab.

Also notice that a pattern-matching box and the Region Properties panel appear in the report model designer window.

  • Rename the data region to Account_Info from the Region Properties panel.

Next, specify a pattern that Astera can match in your file to capture data. You can use a letter, character, number, word, or wildcard, or a combination of these, to define your pattern.

In this case, it’s easy to separate the account information from the surrounding data as they always start with ‘ACCOUNT:’ at the same character position.

  • In the pattern-matching box, right above the account information, type ACCOUNT:

The report model editor now highlights all occurrences of the account region in the report.

  • Increase the Line Count to 3 to capture all three lines containing account information.

  • Highlight the field area, right-click and select Add Data Field from the context menu.

Repeat the process to create more data fields for the Account_Info region and name them, as shown below. You can see the layout of the report model in the Model Layout panel.

The data region has been created for this report. Let’s move on to creating a collection data region.

3. Collection Region

Our sample document has a hierarchical layout. That is, there may be multiple order records for each customer, and each order may have a number of order items in it. To represent this relationship in a report model, we can assign a region as a collection region. This section will demonstrate how to create a Collection region in a report model.

  • Right-click on the Account_Info node inside the Model Layout panel and select Add Collection Data Region from the context menu.

A Data node is added to the report model under Account_Info node, appearing in the Model Layout panel in a hierarchical structure. Rename this node to Items_Info.

Note: When a region has a collection of items in it, we need to enable its Collection Region property from the Region Properties panel. Notice that the icon for the Items_Info is different, to help identify this node as a ‘collection’. When we add a collection data region via the context menu, the Collection Region property is set automatically.

We want to capture the items information from this file. Astera provides the option to select a data region by placing markers next to the line number.

  • Move your cursor to the gray bar on the left. Place green markers by clicking on this bar, to the left of sample lines 14 and 15. Observe that a pattern has automatically been identified in the pattern-matching bar. It is using the wildcard ‘Ñ’ as the matching pattern and has automatically created a data region.

  • Click on the Auto Create Fields icon placed in the Region Properties panel toolbar.

This will automatically create 6 fields in the Model Layout panel.

Now, the collection data region has been created for this report.

You can create the Single Instance Data Region, Append Data Region and Footer Region in the same way in a report model.

4. Adding a Formula Field

Astera provides the option to add a formula field in the report model.

  • Right-click on the Items_Info node inside the Model Layout panel and select Add a Formula Field from the context menu.

A Calculated Field Properties window will open. This is where you have to specify the formula.

You can see that it contains an expression box with built-in functions. Expand the Items_Info node to access the fields of the extracted data.

  • In the expression builder, type: Price - (0.2 * Price).

  • Rename this formula field as Discounted_Price.

  • Click on the Compile button to verify the expression. If it is successful, click OK.

You can add multiple formula fields to your report model in the same way.
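As a quick check of this formula, for a record with a hypothetical Price of 100.00, the Discounted_Price field evaluates to:

    Discounted_Price = Price - (0.2 * Price)
                     = 100.00 - (0.2 * 100.00)
                     = 80.00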

Step 3: Testing Report Model With Sample Data

Data Statistics and Summary

Let’s learn in detail how to export data in Astera to an Excel destination.

Export to an Excel Destination

After selecting the Record node, click the Create New Export Setting option on the toolbar at the top of the Model Layout panel and select Excel.

When exporting data to an Excel file, you can either choose a new destination to save your file to (a folder or directory) or append the data to an existing Excel file.

First you need to provide the destination file path, and then you can specify the other options on the export setting window.

  • First Row Contains Header – Check this option if you want to include the data field names as headers in the output file.

  • Worksheet – You can specify the title of the worksheet the data is being exported to within the Excel file.

  • Append to File (If Exists) – Adds the exported data to an existing Excel file without overwriting the data it contains.

  • Write to Multiple Files – Saves the exported data to multiple Excel files instead of one single file. Specify the same file in multiple Excel destinations and Astera will create a single file with multiple worksheets and write all data to that file.

  • Rules for Filtering – Specifies a criterion to export only filtered records. Upon expanding, users can see an expression box with built-in functions to facilitate rule-based filtering. This feature is explained in detail in the next section.

After configuring the export settings, click Next. This takes us to the Layout Builder screen where you can customize data fields in the layout.

  • Source Field Path – Shows the name of the fields that are being exported.

  • Name – The name of the field within the program, whereas Header is the text that will be exported as a header in the destination.

  • Data Type – Specifies the type of data being exported e.g. integer, real, Boolean etc.

Click Next. This will take you to the General Options window where you can select different options as per your requirements.

Click OK and your file is ready to be exported.

Rule-Based Filtered Export

While exporting data from Astera, you have the option to send only filtered data to your export destination.

Select the export destination of your choice. A Configuration window will open where you need to provide the destination file path. On the same window, you will see the option to specify Rule for Filtering Data (in blue). Expand it.

An expression box with built-in functions will open. Expand the Record_Export node to access the fields of the extracted data.

In this example, we only want to export the data that contains ‘SOFA’ in the Items field.

For this, we will write Contains("SOFA", Items) in the expression builder. Click the Compile button to verify your expression.

When the Compile Status is ‘Successful,’ click OK to close the export settings window. Your export will now begin. You can observe the progress through job trace in the Job Progress window.

At the end of a successful run, a clickable link for the exported data file will be generated in the job trace. Click on this link and your exported data will open in an Excel file.
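Other filter rules can be written in the same way using the built-in functions available in the expression builder. The examples below are illustrative only and assume the field is named Items:

    Contains("SOFA", Items)       keeps records whose Items field contains SOFA
    StartsWith(Items, "SOFA")     keeps records whose Items field starts with SOFA
    IsNotNull(Items)              keeps records whose Items field is not null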

Using Report Models via Dataflows

Astera enables users to create and run dataflows for the extracted data. A dataflow is a graphical representation of the journey of data and includes sources, destinations, transformations, and object maps. It may also include a set of transformations mapped in a user-defined sequence. Generally, the data is retrieved from one or more data sources, is processed through a series of transformations, and then the transformed data is written to one or more destinations.

A report model can be used as a source in dataflows to leverage the advanced transformation and integration features in Astera.

There are two ways to create and use dataflows in Astera.

Creating Dataflows From Data Export Settings

Astera enables users to create a dataflow directly from the Data Export Settings panel.

Creating a New Dataflow

To open a new dataflow in Astera, go to File > New > Dataflow.

A new dataflow will open in Astera. You can see that a Toolbox and a secondary toolbar have been added to the designer.

Expand the Sources section in the Toolbox; here you will find the Report Source object. Drag-and-drop the Report Source object onto the dataflow designer.

The Report Source object is currently empty and needs to be configured.

Once the source object has been configured, the Report Source object on the dataflow designer will show the fields laid out according to the report model you have created.

Using PDF Form Source

The PDF Form Source in Astera enables users to extract data from a PDF file directly, without creating an extraction template. This eliminates the need to create a report model, since Astera reads the layout of the PDF form automatically, including checkboxes and radio buttons.

In this example, the data that we want Astera to read is contained within a PDF form as shown below. This form has radio buttons as well as text boxes.

It is not feasible to create a report model to extract this data because:

  1. The report model designer does not pick up radio buttons and check boxes; and

  2. If you want Astera to read a large number of similar forms, a single model cannot be applied to all of them, since each form will contain different answers.

To cater to these issues, we will use a PDF Form Source object in a dataflow.

Under the Sources section in the Toolbox, you will find the PDF Form Source object.

Drag-and-drop the PDF Form Source object onto the dataflow designer.

You can see that the object is currently empty and needs to be configured.

Right-click on the header of the PDF Form Source object to configure its properties and retrieve data into the dataflow.

Once the source object has been configured, the PDF Form Source object on the dataflow designer will show the fields with layout defined in the PDF Form.

You can preview the extracted data by right-clicking on the source object and going to Preview Output.

A Data Preview window will open and show you the extracted data.

Using Email Source

The Email Source in Astera enables users to retrieve data from emails and process incoming email attachments.

Users can watch any email folder for incoming emails and then process the attachments through a report model.

Under the Sources section in the Toolbox, you will find the Email Source object.

Drag-and-drop the Email Source object onto the dataflow designer.

You can see some built-in fields and an Attachments node in the object layout.

Right-click on the Email Source object to configure its properties and retrieve data into the dataflow.

On the next screen, enable the Download Attachments option and specify the email folder for Astera to watch for incoming emails. You can also apply various filters to process only specific emails in the folder.

Applying Data Quality Rules

To apply data quality rules, go to the Data Profiling section of the Toolbox and drag-and-drop the Data Quality Rules object onto the dataflow designer.

Map the dataset from the Report Source object to the Data Quality Rules object.

In the Properties window of the Data Quality Rules object, define a rule against which your data will be validated. You can apply as many rules as you want within a single Data Quality Rules object.

In this example, a Data Quality Rule is applied to identify null records in the Items field as errors with an error message: “Item not found”.
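As a sketch only, such a rule could be built from the logical functions available in the expression builder, assuming the field is named Items; whether the expression describes the valid or the erroneous condition depends on how the rule is configured in the Data Quality Rules properties:

    Rule expression:  IsNotNull(Items)
    Error message:    Item not found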

Preview the data to check how the erroneous records are displayed. If you move the cursor over the red warning sign, it will show the error message in the tooltip.

Workflow Orchestration

A workflow is designed to orchestrate an automated and iterative execution of ordered tasks. These tasks are performed according to some predefined custom logic. A workflow makes it easy to visualize and implement complex tasks in a sequential manner.

Astera automates the data extraction process and expedites data preparation with features such as email/FTP/folder integration, a job scheduler, automated name & address parsing, and auto-creation of data extraction patterns.

Creating a New Workflow

To open a new workflow, go to File > New > Workflow.

A new workflow will open in Astera. You can see that there is an added Workflow Task section in the Toolbox panel. Click here to learn more about the features and functionalities available in a workflow.

Email/FTP/Folder Integration

With Astera, users can automatically pull files from an email or an FTP source by setting up custom job frequencies, such as every few minutes or on an hourly basis. This eliminates the need to manually download an incoming file from locations such as an email inbox or folder and send it to Astera for further processing.

In this example, we will illustrate how to use the folder watching feature while designing a workflow. The screenshot below shows the workflow designer. In this example, we will deploy the same dataflow on every file that arrives in a folder and orchestrate it with a workflow.

The Context Information object is placed under the Resources section of the Toolbox. Using Context Information, you can define dynamic parameters that take on values at dataflow run time.

This object is useful in directing Astera to watch a folder for every file arrival.

The Run Dataflow object is used to call and initiate a dataflow in a workflow, and it orchestrates the data extraction and data validation processes.

There are some other Run tasks including Run Workflow Task and Run Program Task.

In a single workflow, you can use a single workflow task or a combination of workflow tasks.

A Send Mail object sends an email to the administrator at defined junctions in your workflow. You can place this object anywhere in the workflow as per your requirements.

Scheduling an Extraction Process

Jobs can be scheduled to run in a batch mode or real-time mode using a built-in Scheduler in Astera. In other words, the entire extraction process of data and its validation can be scheduled for a real-time run at the arrival of every new unstructured file.

The Scheduler functionality comes with various options for job frequencies, such as hourly, daily, weekly, or monthly. For real-time processing, the built-in job manager watches for file drops, single or batch, at any specified location and proceeds to execute a workflow automatically.

To open the Scheduler in Astera, go to Server > Job Schedules.

The main screen of the Scheduler provides options to customize a repetitive task, as shown in the screenshot below.

Configure the settings of the new scheduled task by going to the Deployed Job tab. Add the status, name and schedule type. Then, define a file path, server and the frequency of the scheduled task. It further provides an option to run the scheduled dataflow in Pushdown Mode.

Users can also track the workflow’s progress with Astera's notification option. Emails can be configured to send alerts about instances such as the beginning of a task, abnormal terminations, completion of workflows, or errors.

This concludes the Astera Report Model Tutorial article.

For more information on the Report Browser, refer to this article.

For more information on the Region Properties panel, refer to this article.

For more information on the Pattern Properties panel, refer to this article.

For more information on the Field Properties panel, refer to this article.

You can read more about these options here.

Save this report model by clicking on the Save icon in the menu bar. We can verify the model by previewing the data. This will give an idea of how the report document is processed using the report model we created and how the data has been extracted.

To test the model and preview the extracted data, click on the Preview Data icon placed in the toolbar. This will open the Data Preview window, showing the entire report structure with the actual values for all the defined fields.

In Astera, you can capture a summary of extracted data fields, including aggregated values such as sum, average, and count. To view detailed statistics of extracted data, click on the statistics icon in the toolbar. The Quick Profile window will open, displaying detailed statistics of the extracted data as shown in the figure below.

You can find more information on other data exporting options in Astera, such as exporting to a CSV file or a Database Table Destination, in this article – Exporting a Report Model.

There is also an option to preview your export setting to see what the data will look like in the output file before running the export. You can do this by going to the Data Export Settings tab and clicking the Preview Selected Export icon.

You can learn more about dataflows and the tools they offer here.

Once you have exported the extracted data to an Excel sheet, a delimited file, or a database table, go to the Data Export Settings panel and click on the Create Dataflow and Open icon.

A new dataflow, typically containing a Report Source object, a destination object, and a Record Level Log object, will open in Astera.

In this example, the dataflow that has been created contains a Report Source object, a Filter Transformation object (since we filtered the exported data on the basis of Items), an Excel Workbook Destination object (for the Excel file we exported the data to), and a Record Level Log object.

To learn more about how this option works in Astera, refer to the article – Exporting a Report Model to a Dataflow.

You can add more transformations to this dataflow, apply Data Quality Rules, or write the data to a new destination.

Let’s add this report model to a dataflow by using a Report Source object.

Right-click on the header of the Report Source object and configure the properties to retrieve the report model data into the dataflow. Click here to learn more about configuring the properties of the Report Source object.

The data is now ready to be transformed. Astera offers 26 built-in transformations to facilitate any kind of data transformation – from basics such as Aggregate, Sort, and Filter to advanced transformations such as Tree Join, Switch, and Normalize.

Click here to learn more about the purpose and functionality of the different transformations you can use in a dataflow.

After transformation, this data can be loaded to five different destinations: Database Table Destination, Excel Workbook Destination, Delimited File Destination, Fixed Length File Destination, and XML/JSON File Destination.

Click here to learn more about configuring the properties of the PDF Form Source object in a dataflow.

Click here to learn more about using the Email Source object in a dataflow.

To apply a report model to the attached unstructured files, use Report Source as a Transformation, as shown below.

Astera can perform validation checks on the incoming records by applying custom business rules anywhere in the data integration process. Records that fail to match the specified rules are returned as errors. These validation checks are carried out by the Data Quality Rules object, found under the Data Profiling section in the Toolbox.

Astera offers workflow orchestration functionality so that users can automate the entire process from the time data enters an organization to when it is stored: from conversion to validation, to loading the data into the preferred destination. For an in-depth understanding of workflows, refer to this article.

We can also use a File System Items Source here as an alternative. It provides the metadata about files found in a particular folder.

The Run Dataflow Task is placed under the Workflow Tasks section of the Toolbox.

You can also find some process tasks, such as the Decision and SendMail tasks, in a workflow. The Decision object invokes one of the two paths in a workflow, depending on whether the logical expression inside the Decision object returns a Yes (True) or a No (False) flag.

Other action objects, such as File System Action and File Transfer Task, can be used at the end of a process chain to direct Astera as to what to do with the data after a successful extraction task.

For more information about the built-in Scheduler, refer to the article on Scheduling and Running Jobs on a Server.

Add a new Scheduler task by clicking on the add schedule icon.

Click on Save, located above the Deployed Job tab, to save this scheduled task. It will now be added to the list of active scheduled tasks. Now, upon the arrival of every new unstructured file in the specified folder, the process of data extraction, validation, and conversion will be triggered.
