Astera Data Stack
Version 9

© Copyright 2025, Astera Software

Defining the Start Position of Data Fields


Last updated 11 months ago

Start Position options are useful for defining the start position of a selected data field. They appear in the Size and Position group box in the Field Properties panel. There are three options available in the drop-down menu of the Start Position option:

  1. Fixed

  2. Follows String in Current Line

  3. Follows String in Previous Line

In this document, we will discuss how to work with the Follows String In Current Line and Follows String In Previous Line options to define the start position of a data field.

Loading An Unstructured File

The Report Options panel provides the configuration options for loading the unstructured file. You can change the source file by specifying its Path in the Data File Location group box.

Creating the Report Model

Here, we have defined the pattern as ‘ACCOUNT:’ and set the Region End Type to Another Region Starts, which means that the current data region ends when another one starts.
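As a rough illustration of what this region definition does, the Python sketch below groups the lines of a text file into regions that begin at a line containing the pattern and end when the next such line appears. This is not Astera's implementation: the function name `split_into_regions` and the sample lines are hypothetical, and the sketch assumes the pattern may appear anywhere in a line.

```python
from typing import List, Optional


def split_into_regions(lines: List[str], pattern: str = "ACCOUNT:") -> List[List[str]]:
    """Group lines into regions (hypothetical helper, not an Astera API).

    A region starts at a line containing `pattern` and ends just before the
    next line that contains it, mimicking "Another Region Starts".
    """
    regions: List[List[str]] = []
    current: Optional[List[str]] = None
    for line in lines:
        if pattern in line:
            # A new region starts here, so close the one in progress.
            if current is not None:
                regions.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
        # Lines before the first match belong to no region and are skipped.
    if current is not None:
        regions.append(current)
    return regions


# Invented sample data, for illustration only.
sample = [
    "REPORT HEADER",
    "ACCOUNT: 1001   CONTACT: Jane Roe",
    "ADDRESS:",
    "         12 Example Street",
    "ACCOUNT: 1002   CONTACT: John Doe",
    "ADDRESS:",
    "         34 Sample Avenue",
]
regions = split_into_regions(sample)
for region in regions:
    print(region)
```

Each inner list corresponds to one data region; the header line falls outside every region because it precedes the first match.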

Using Start Position Options to Capture Data Fields

We have captured the data region of interest in the report model. Let’s extract the relevant data points by adding data fields.

  1. To create the data field, highlight the desired field area, right-click on it, and select the Add Data Field option from the context menu.

  2. The data is misaligned and is therefore not being captured correctly.

  3. To solve this problem, use the Follows String in Current Line option from the Start Position drop-down menu in the Field Properties panel to specify a string that defines the start position of this field.

Here, we have defined ‘contact:’ as the string in the textbox to set the start position, and used Length Till End Of Line for the field length. Now, all the data points in this field are captured completely.

  4. Notice that two checkboxes, Case Sensitive and Regular Expression, have appeared in the Field Properties panel.

  • Case Sensitive: Matches the specified string on a case-sensitive basis.

  • Regular Expression: Allows users to use a regular expression to match the string that precedes the data field.

  5. Let’s go ahead and select the Case Sensitive option and see what happens.

The fields are no longer being captured, as shown by the fact that they are no longer highlighted in blue. This is because the comparison is now case-sensitive, and the two strings ‘CONTACT:’ and ‘contact:’ do not match because they differ in case. Uncheck the Case Sensitive option.

  6. Select the Regular Expression checkbox and define a regular expression in the textbox to capture the data points.

The data is now being captured correctly.

  7. Next, capture the data points for the field Address. This time, define the Start Position as Follows String in Previous Line, because the address starts on the line after the string ‘ADDRESS:’.

Notice that we have matched the case of the string specified in the textbox with the case of the string in the document, since the Case Sensitive checkbox is selected. Therefore, the data is being captured correctly.

  8. Lastly, capture the data in the field Account by using the Follows String in Current Line option.

  9. You can rename the fields in the General section of the Field Properties panel. This is what our Model Layout looks like:

  10. Preview the data by clicking on the Preview Data icon to check if all the fields are being extracted correctly from the unstructured document.

The Data Preview window shows the data extracted from the unstructured document.
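To make the behavior of the two Start Position options more concrete, here is a minimal Python sketch of the idea, not a description of how Astera works internally. The names `find_start` and `capture`, the sample lines, and the regular expression are all hypothetical. The sketch also assumes that the field begins at the column where the matched string ends, and that a missing length means "till end of line"; the product's exact rules may differ.

```python
import re
from typing import List, Optional


def find_start(line: str, marker: str, case_sensitive: bool = False,
               is_regex: bool = False) -> Optional[int]:
    """Return the column just past the first match of `marker`, or None."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # A plain string is escaped so that it is matched literally.
    pattern = marker if is_regex else re.escape(marker)
    match = re.search(pattern, line, flags)
    return match.end() if match else None


def capture(lines: List[str], index: int, marker: str,
            previous_line: bool = False, length: Optional[int] = None,
            **match_options: bool) -> Optional[str]:
    """Capture a field from lines[index] (hypothetical helper).

    The marker is searched in the same line, or in the line above when
    `previous_line` is True. `length=None` reads to the end of the line.
    """
    marker_index = index - 1 if previous_line else index
    if marker_index < 0:
        return None
    start = find_start(lines[marker_index], marker, **match_options)
    if start is None:
        return None  # the marker was not found, so nothing is captured
    end = None if length is None else start + length
    return lines[index][start:end].strip()


# Invented sample region, for illustration only.
region = [
    "ACCOUNT: 1001   CONTACT: Jane Roe",
    "ADDRESS:",
    "         12 Example Street",
]

# Follows String in Current Line, case-insensitive match.
contact = capture(region, 0, "contact:")
# Same string with Case Sensitive enabled: 'contact:' does not match 'CONTACT:'.
contact_strict = capture(region, 0, "contact:", case_sensitive=True)
# A regular expression used as the preceding string.
contact_regex = capture(region, 0, r"CONTACT\s*:", is_regex=True)
# Follows String in Previous Line: the marker is on the line above the field.
address = capture(region, 2, "ADDRESS:", previous_line=True, case_sensitive=True)
# Follows String in Current Line with a fixed length instead of till end of line.
account = capture(region, 0, "ACCOUNT:", length=6)

print(contact, contact_strict, contact_regex, address, account)
```

Keeping the string search separate from the slicing mirrors the panel, where the Case Sensitive and Regular Expression checkboxes only affect how the string is found, not how much text is captured after it.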

This is how we can extract data from an unstructured file by specifying Start Position options in Astera.

Before creating an extraction template, we need to import the unstructured source file that we want to extract data from in Astera. To learn how to load an unstructured document in a report model, click here.

There are also some other configuration options. To learn more about the Report Options, click here.

Add a new data region and specify an appropriate pattern to capture all the lines in that region.