Astera Data Stack
Version 8
Version 8
  • Welcome to Astera Data Stack Documentation
  • Release Notes
    • Astera 8.0 - What's New, What's Fixed, and What's Improved
    • Astera 8.0 - Known Issues
    • Astera 8.1 - Release Notes
    • Astera 8.2 Release Notes
    • Astera 8.3 Release Notes
    • Astera 8.4 Release Notes
    • Astera 8.5 Release Notes
  • Getting Started
    • Astera 8 - Important Considerations
    • Astera 8 - System Requirements
    • Configuring the Server
    • Connecting to a Different Astera Server from the Lean Client
    • Connecting to an Astera Server using Lean Client
    • How to Build a Cluster Database and Create a Repository
    • How to Login from Lean Client
    • Setting up a Server Certificate (.pfx) File in a New Environment
    • Installing Client and Server Applications
    • Licensing Model in Astera 8
    • Migrating from Astera 7.x to Astera 8
    • UI Walkthrough - Astera 8.0
    • User Roles and Access Control
  • Dataflows
    • Sources
      • Data Providers and File Formats Supported in Astera
      • Setting Up Sources
      • COBOL File Source
      • Database Table Source
      • Data Model Query Source
      • Delimited File Source
      • Email Source
      • Excel Workbook Source
      • File Systems Item Source
      • Fixed Length File Source
      • PDF Form Source
      • Report Source
      • SQL Query Source
      • XML/JSON File Source
    • Transformations
      • Introducing Transformations
      • Aggregate Transformation
      • Constant Value Transformation
      • Data Cleanse Transformation
      • Denormalize Transformation
      • Distinct Transformation
      • Database Lookup Transformation
      • Expression Transformation
      • File Lookup Transformation
      • Filter Transformation
      • Join Transformation
      • List Lookup Transformation
      • Merge Transformation
      • Normalize Transformation
      • Passthru Transformation
      • Reconcile Transformation
      • Route Transformation
      • Sequence Generator Transformation
      • Sort Transformation
      • Sources as Transformations
      • Subflow Transformation
      • SQL Statement Lookup Transformation
      • Switch Transformation
      • Tree Join Transformation
      • Tree Transform
      • Union Transformation
    • Destinations
      • Setting Up Destinations
      • Database Table Destination
      • Delimited File Destination
      • Excel Workbook Destination
      • Fixed Length File Destination
      • SQL Statement Destination
      • XML/JSON File Destination
    • Data Logging and Profiling
      • Creating Data Profile
      • Creating Field Profile
      • Data Quality Mode
      • Record Level Log
      • Using Data Quality Rules in Astera
    • Database Write Strategies
      • Database Diff Processor
      • Data Driven Write Strategy
      • Dimension Loader - Database Write
      • Source Diff Processor
    • Text Processors
      • Delimited Parser
      • Delimited Serializer
      • Fixed Length Parser
      • Fixed Length Serializer
      • Language Parser
      • XML JSON Parser
      • XML JSON Serializer
    • Data Warehouse
      • Fact Table Loader
      • Dimension Table Loader
  • WORKFLOWS
    • What Are Workflows?
    • Using the Workflow Designer
    • Creating Workflows in Astera
    • Decision Task
    • EDI Acknowledgement Task
    • File System Task
    • File Transfer Task
    • OR Task
    • Run Dataflow Task
    • Run Program Task
    • Run SQL File Task
    • Run SQL Script Task
    • Run Workflow Task
    • Send Mail Task
    • Workflows with a Dynamic Destination Path
    • Customizing Workflows with Parameters
    • GPG-Integrated File Decryption in Astera
  • Subflows
    • Using Subflows in Astera
  • Functions
    • Introducing Function Transformations
    • Custom Functions
    • Logical
      • Coalesce (Any value1, Any value2)
      • IsNotNull (AnyValue)
      • IsRealNumber (AnyValue)
      • IsValidSqlDate (Date)
      • IsDate (AnyValue)
      • If (Boolean)
      • If (DateTime)
      • If (Double)
      • Exists
      • If (Int64)
      • If (String)
      • IsDate (str, strformat)
      • IsInteger (AnyValue)
      • IsNullOrWhitespace (StringValue)
      • IsNullorEmpty (StringValue)
      • IsNull (AnyValue)
      • IsNumeric (AnyValue)
    • Conversion
      • GetDateComponents (DateWithOffset)
      • ParseDate (Formats, Str)
      • GetDateComponents (Date)
      • HexToInteger (Any Value)
      • ToInteger (Any value)
      • ToDecimal (Any value)
      • ToReal (Any value)
      • ToDate (String dateStr)
      • TryParseDate (String, UnknownDate)
      • ToString (Any value)
      • ToString (DateValue)
      • ToString (Any data, String format)
    • Math
      • Abs (Double)
      • Abs (Decimal)
      • Ceiling (Real)
      • Ceiling(Decimal)
      • Floor (Decimal)
      • Floor (Real)
      • Max (Decimal)
      • Max (Date)
      • Min (Decimal)
      • Min (Date)
      • Max (Real)
      • Max (Integer)
      • Min (Real)
      • Pow (BaseExponent)
      • Min (Integer)
      • RandomReal (Int)
      • Round (Real)
      • Round (Real Integer)
      • Round (Decimal Integer)
      • Round (Decimal)
    • Financial
      • DDB
      • FV
      • IPmt
      • IPmt (FV)
      • Pmt
      • Pmt (FV)
      • PPmt
      • PPmt (FV)
      • PV (FV)
      • Rate
      • Rate (FV)
      • SLN
      • SYD
    • String
      • Center (String)
      • Chr (IntAscii)
      • Asc (String)
      • AddCDATAEnvelope
      • Concatenate (String)
      • ContainsAnyChar (String)
      • Contains (String)
      • Compact (String)
      • Find (Int64)
      • EndsWith (String)
      • FindIntStart (Int32)
      • Extract (String)
      • GetFindCount (Int64)
      • FindLast (Int64)
      • GetDigits (String)
      • GetLineFeed
      • Insert (String)
      • IsAlpha
      • GetToken
      • IndexOf
      • IsBlank
      • IsLower
      • IsUpper
      • IsSubstringOf
      • Length (String)
      • LeftOf (String)
      • Left (String)
      • IsValidName
      • Mid (String)
      • PadLeft
      • Mid (String Chars)
      • LSplit (String)
      • PadRight
      • ReplaceAllSpecialCharsWithSpace
      • RemoveChars (String str, StringCharsToRemove)
      • ReplaceLast
      • RightAlign
      • Reverse
      • Right (String)
      • RSplit (String)
      • SplitStringMultipleRecords
      • SplitStringMultipleRecords (2 Separators)
      • SplitString (3 separators)
      • SplitString
      • SplitStringMultipleRecords (3 Separators)
      • Trim
      • SubString (NoOfChars)
      • StripHtml
      • Trim (Start)
      • TrimExtraMiddleSpace
      • TrimEnd
      • PascalCaseWithSpace (String str)
      • Trim (String str)
      • ToLower(String str)
      • ToProper(String str)
      • ToUpper (String str)
      • Substring (String str, Integer startAt)
      • StartsWith (String str, String value)
      • RemoveAt (String str, Integer startAt, Integer noofChars)
      • Proper (String str)
      • Repeat (String str, Integer count)
      • ReplaceAll (String str, String lookFor, String replaceWith)
      • ReplaceFirst (String str, String lookFor, String replaceWith)
      • RightOf (String str, String lookFor)
      • RemoveChars (String str, String charsToRemove)
      • SplitString (String str, String separator1, String separator2)
    • Date Time
      • AddMinutes (DateTime)
      • AddDays (DateTimeOffset)
      • AddDays (DateTime)
      • AddHours (DateTime)
      • AddSeconds (DateTime)
      • AddMonths (DateTime)
      • AddMonths (DateTimeOffset)
      • AddMinutes (DateTimeOffset)
      • AddSeconds (DateTimeOffset)
      • AddYears (DateTimeOffset)
      • AddYears (DateTime)
      • Age (DateTime)
      • Age (DateTimeOffset)
      • CharToSeconds (Str)
      • DateDifferenceDays (DateTimeOffset)
      • DateDifferenceDays (DateTime)
      • DateDifferenceHours (DateTimeOffset)
      • DateDifferenceHours (DateTime)
      • DateDifferenceMonths (DateTimeOffset)
      • DateDifferenceMonths (DateTime)
      • DatePart (DateTimeOffset)
      • DatePart (DateTime)
      • DateDifferenceYears (DateTimeOffset)
      • DateDifferenceYears (DateTime)
      • Month (DateTime)
      • Month (DateTimeOffset)
      • Now
      • Quarter (DateTime)
      • Quarter (DateTimeOffset)
      • Second (DateTime)
      • Second (DateTimeOffset)
      • SecondsToChar (String)
      • TimeToInteger (DateTime)
      • TimeToInteger (DateTimeOffset)
      • ToDate Date (DateTime)
      • ToDate DateTime (DateTime)
      • ToDateString (DateTime)
      • ToDateTimeOffset-Date (DateTimeOffset)
      • ToDate DateTime (DateTimeOffset)
      • ToDateString (DateTimeOffset)
      • Today
      • ToLocal (DateTime)
      • ToJulianDate (DateTime)
      • ToJulianDayNumber (DateTime)
      • ToTicks (Date dateTime)
      • ToTicks (DateTimeWithOffset dateTime)
      • ToUnixEpoc (Date dateTime)
      • ToUtc (Date dateTime)
      • UnixTimeStampToDateTime (Real unixTimeStamp)
      • UtcNow ()
      • Week (Date dateTime)
      • Week (DateTimeWithOffset dateTime)
      • Year (Date dateTime)
      • Year (DateTimeWithOffset dateTime)
      • DateToJulian (Date dateTime, Integer length)
      • DateTimeOffsetUtcNow ()
      • DateTimeOffsetNow ()
      • Day (DateTimeWithOffset dateTime)
      • Day (Date dateTime)
      • DayOfWeekStr (DateTimeWithOffset dateTime)
      • DayOfWeek (DateTimeWithOffset dateTime)
      • DayOfWeek (Date dateTime)
      • DateToJulian (DateTimeWithOffset dateTime, Integer length)
      • DayOfWeekStr (Date dateTime)
      • FromJulianDate (Real julianDate)
      • DayOfYear (Date dateTime)
      • DaysInMonth(Integer year, Integer month)
      • DayOfYear (DateTimeWithOffset dateTime)
      • FromUnixEpoc
      • FromJulianDayNumber (Integer julianDayNumber)
      • FromTicksUtc(Integer ticks)
      • FromTicksLocal(Integer ticks)
      • Hour (Date dateTime)
      • Hour (DateTimeWithOffset dateTime)
      • Minute (Date dateTime)
      • JulianToDate (String julianDate)
      • Minute (DateTimeWithOffset dateTime)
      • DateToIntegerYYYYMMDD (DateTimeWithOffset dateTime)
      • DateToIntegerYYYYMMDD (Date dateTime)
    • Files
      • AppendTextToFile (String filePath, String text)
      • CopyFile (String sourceFilePath, String destFilePath, Boolean overWrite)
      • CreateDateTime (String filePath)
      • DeleteFile (String filePath)
      • DirectoryExists (String filePath)
      • FileExists (String filePath)
      • FileLength (String filePath)
      • FileLineCount (String filePath)
      • GetDirectory (String filePath)
      • GetEDIFileMetaData (String filePath)
      • GetExcelWorksheets (String excelFilePath)
      • GetFileExtension (String filePath)
      • GetFileInfo (String filePath)
      • GetFileName (String filePath)
      • GetFileNameWithoutExtension (String filePath)
      • LastUpdateDateTime (String filePath)
      • MoveFile (String filePath, String newDirectory)
      • ReadFileBytes (String filePath)
      • ReadFileFirstLine (String filePath)
      • ReadFileText (String filePath)
      • ReadFileText (String filePath, String codePage)
      • WriteBytesToFile (String filePath, ByteArray bytes)
      • WriteTextToFile (String filePath, String text)
    • Date Time With Offset
      • ToDateTimeOffsetFromDateTime (dateTime String)
      • ToUtc (DateTimeWithOffset)
      • ToDateTimeOffsetFromDateTime
      • ToDateTimeOffset (String dateTimeOffsetStr)
      • ToDateTimeFromDateTimeOffset
    • GUID
      • NewGuid
    • Encoding
      • ToBytes
      • FromBytes
      • UrlEncode
      • UrlDecode
    • Regular Expressions
      • ReplaceRegEx
      • ReplaceRegEx (Integer StartAt)
    • TimeSpan
      • Minutes
      • Hours
      • Days
      • Milliseconds
    • Matching
      • Soundex
      • DoubleMetaphone
      • RefinedSoundex
  • Report Model
    • User Guide
      • Report Model Tutorial
    • Report Model Interface
      • Field Properties Panel
      • Region Properties Panel
      • Report Browser
      • Report Options
    • Use Cases
      • Applying Pattern to Line
      • Auto Creating Data Regions and Fields
      • Auto-Parsing
      • Creating Multi-Column Data Regions
      • Floating Patterns and Floating Fields
      • How To Work With PDF Scaling Factor in a Report Model
      • Line Count
      • Pattern Count
      • Pattern is a Regular Expression
    • Exporting Options
      • Exporting a Report Model
      • Exporting Report Model to a Dataflow
    • Miscellaneous
      • Importing Monarch Models
      • Microsoft Word and Rich Text Format Support
      • Working With Problematic PDF Files
  • API Flows
    • API Consumption
      • Consume
        • REST API Browser
        • Making HTTP Requests Through REST API Browser
        • Using REST Client Outside of the Scope of the Project
      • Authorize
        • Authorizing ActiveCampaign API in Astera
        • Authorizing Astera Server APIs
        • Authorizing Avaza APIs in Astera
        • Authorizing Facebook APIs in Astera
        • Authorizing QuickBooks API in Astera
        • Authorizing Square API in Astera
        • Open APIs - Configuration Details
  • Project Management
    • Project Management
      • Astera's Project Explorer
      • Connecting to Source Control
      • Deployment
      • Server Monitoring and Job Management
    • Job Scheduling
      • Scheduling Jobs on the Server
      • Job Monitor
  • Use Cases
    • End-to-End Use Cases
      • Data Integration
        • Using Astera to Create and Orchestrate an ETL Process for Partner Onboarding
      • Data Warehousing
        • Building a Data Warehouse - A Step-By-Step Approach
      • Data Extraction
        • Reusing The Extraction Template for Similar Layout Files
  • Connectors
    • Connecting to Amazon Aurora Database
    • Connecting to Amazon RDS Databases
    • Connecting to Amazon Redshift Database
    • Connecting to Cloud Storage
    • Connecting to Google Cloud SQL in Astera
    • Connecting to MariaDB Database
    • Connecting to Microsoft Azure Databases
    • Connecting to MySQL Database
    • Connecting to Netezza Database
    • Connecting to Oracle Database
    • Connecting to PostgreSQL in Astera
    • Connecting to Salesforce Database
    • Connecting to Salesforce - Legacy Database
    • Connecting to SAP HANA Database
    • Connecting to Snowflake Database
    • Connecting to Vertica Database
    • Setting Up IBM DB2 iSeries Connectivity in Astera
  • Miscellaneous
    • Cloud Deployment
      • Deploying Astera on Amazon Web Services
      • Deploying Astera on Microsoft Azure Cloud
      • Deploying Astera on Oracle Cloud
    • Context Information
    • Pushdown Mode
    • Role Based Access Control in Astera
    • Safe Mode
    • Server Command Line Utility
    • SmartMatch Feature
    • Synonym Dictionary File
    • Updating Your License in Astera
    • Using Dynamic Layout/Template Mapping in Astera
    • Using Output Variables in Astera
    • Using the Data Source Browser in Astera
  • Best Practices
    • Astera Best Practices - Dataflows
    • Overview of Cardinality in Data Modeling
    • Cardinality Errors FAQs
Powered by GitBook

© Copyright 2025, Astera Software

On this page
  • Flat File Sources
  • Delimited File
  • Fixed-Length File
  • Excel File
  • Tree File Sources
  • COBOL File
  • XML/JSON File
  • Database Sources
  • Database Table
  • SQL Query
  • Source/Destination File Options
  • Advanced Flat-File Reading Options
  • Managing Differences between Source Layout and Source File
  • Creating Field Layout
  • Using Data Formats
  1. Dataflows
  2. Sources

Setting Up Sources

PreviousData Providers and File Formats Supported in AsteraNextCOBOL File Source

Last updated 10 months ago

Each source on the dataflow is represented as a source object. You can have any number of sources of the dataflow, and they can feed into zero or more destinations.

The following source types are supported by the dataflow engine:

Flat File Sources:

Tree File Sources:

Database Sources:

All sources can be added to the dataflow by picking a source type on the Flow toolbox and dropping it on the dataflow. File sources can also be added by dragging and dropping a file from an Explorer window. Database sources can be drag-and-dropped from the Data Source Browser. For more details on adding sources to the dataflow, see Introducing Dataflows.

Flat File Sources

Delimited File

Adding a Delimited File Source object allows you to transfer data from a delimited file. An example of what a delimited file source object looks like is shown below.

To configure the properties of a Delimited File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

Fixed-Length File

Adding a Fixed-Length File Source object allows you to transfer data from a fixed-length file. An example of what a Fixed-Length File Source object looks like is shown below.

To configure the properties of a Fixed-Length File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

Excel File

Adding an Excel Workbook Source object allows you to transfer data from an Excel file. An example of what an Excel Workbook Source object looks like is shown below.

To configure the properties of an Excel Workbook Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

Tree File Sources

COBOL File

Adding a COBOL File Source object allows you to transfer data from a COBOL file. An example of what a COBOL File Source object looks like is shown below.

To configure the properties of a COBOL File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

XML/JSON File

Adding an XML/JSON File Source object allows you to transfer data from an XML file. An example of what an XML/JSON File Source object looks like is shown below.

To configure the properties of an XML/JSON File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu. The following properties are available:

General Properties window:

File Path – Specifies the location of the source XML file. Using UNC paths is recommended if running the dataflow on a server.

Schema File Path – Specifies the location of the XSD file controlling the layout of the XML source file.

Note: Astera can generate a schema based on the content of the source XML file. The data types will be assigned based on the source file’s content.

Optional Record Filter Expression – All This allows you to enter an expression to selectively filter incoming records according to your criteria. You can use the Expression Builder to help you create your filter expression. For more information on using Expression Builder, see Expression Builder.

Note: To ensure your dataflow is runnable on a remote server, please avoid using local paths for the source. Using UNC paths is recommended.

Database Sources

Database Table

Adding a Database Table Source object allows you to transfer data from a database table. An example of what a Database Table Source object looks like is shown below.

To configure the properties of a Database Table Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu. The following properties are available:

Source Connection window – Allows you to enter the connection information for your source, such as Server Name, Database, and Schema, andto credentials for connecting to the selected source.

Pick Source Table window:

Select a source table using the Pick Table dropdown.

  • Select Full Load if you want to read the entire table.

  • Select Incremental Load Based on Audit Fields to perform an incremental read starting at a record where the previous read left off.

Incremental load based on Audit Fields is based around the concept of Change Data Capture (CDC), which is a set of reading and writing patterns designed to optimize large-scale data transfers by minimizing database writing in order to improve performance. CDC is implemented in Astera using the Audit Fields pattern. The Audit Fields pattern uses create time or last update time to determine the records that have been inserted or updated since the last transfer and transfers only those records.

Advantages

  • Most efficient of CDC patterns. Only records that were modified since the last transfer are retrieved by the query thereby putting little stress on the source database and network bandwidth

Disadvantages

  • Requires update date time and/or create date time fields to be present and correctly populated

  • Does not capture deletes

  • Requires index on the audit field(s) for efficient performance

To use the Audit Fields strategy, select the Audit Field and an optional Alternate Audit Field from the appropriate dropdown menus. Also, specify the path to the file that will store incremental transfer information.

Where Clause window:

You can enter an optional SQL expression serving as a filter for the incoming records. The expression should start with the WHERE word followed by the filter you wish to apply.

For example, WHERE CreatedDtTm >= ‘2001/01/05’

General Options window:

The Comments input allows you to enter comments associated with this object.

SQL Query

Adding a SQL Query Source object allows you to transfer data returned by the SQL query. An example of what an SQL Query Source object looks like is shown below.

To configure the properties of a SQL Query Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu. The following properties are available:

Source Connection window – Allows you to enter the connection information for your SQL Query, such as Server Name, Database, and Schema, and credentials for connecting to the selected database.

SQL Query Source window:

Enter the SQL expression controlling which records should be returned by this source. The expression should follow SQL syntax conventions for the chosen database provider.

For example, select OrderId, OrderName, and CreatedDtTm from Orders.

Source/Destination File Options

If your source or destination is a Delimited File, you can set the following properties

  • First Row Contains Header - Check this option if you want the first row of your file to display the column headers. In the case of the Source file, this indicates if the source contains headers.

  • Field Delimiter - Allows you to select the delimiter for the fields. The available choices are "," and ".". You can also type the delimiter of your choice instead of choosing the available options.

  • Record Delimiter - Allows you to select the delimiter for the records in the fields. The choices available are "carriage-return line-feed combination", "carriage-return", and "line-feed". You can also type the record delimiter of your choice instead of choosing the available options. For more information on Record Delimiters, please refer to the Glossary.

  • Encoding - Allows you to choose the encoding scheme for the delimited file from a list of choices. The default value is Unicode (UTF-8)

  • Quote Char - Allows you to select the type of quote character to be used in the delimited file. This quote character tells the system to overlook any special characters inside the specified quotation marks. The options available are ” and ’.

You can also use the Build fields from an existing file feature to help you build destination fields based on an existing file instead of manually typing the layout.

If the Source and/or the Destination chosen is a Microsoft Excel Worksheet, you can set the following properties:

  • First Row Contains Header - Check this option if you want the first row of your file to display the column headers. In the case of the Source file, this indicates if the source contains headers.

  • Worksheet - Allows you to select a specific worksheet from the selected Microsoft Excel file.

You can also use the Build fields from an existing file feature to help you build destination fields based on an existing file instead of manually typing the layout.

If the Source and/or the Destination chosen is a Fixed Length File, you can set the following properties:

  • First Row Contains Header - Check this option if you want the first row of your file to display the column headers. In the case of the Source file, this indicates if the source contains headers.

  • Record Delimiter - Allows you to select the delimiter for the records in the fields. The choices available are "carriage-return line-feed combination", "carriage-return", and "line-feed". You can also type the record delimiter of your choice instead of choosing the available options. For more information on Record Delimiters, please refer to the Glossary.

  • Encoding - Allows you to choose the encoding scheme for the delimited file from a list of choices. The default value is Unicode (UTF-8)

You can also use the Build fields from an existing file feature to help you build destination fields based on an existing file instead of manually typing the layout.

Using the Length Markers window, you can create the layout of your fixed-length file, The Length Markers window has a ruled marker placed at the top of the window. To insert a field length marker, you can click on the window at a particular point. For example, if you want to set the length of a field to contain five characters and the field starts at five, then you need to click at the marker position nine.

In case the records don’t have a delimiter and you rely on knowing the size of a record, the number in the RecordLength box is used to specify the character length for a single record.

You can delete a field length marker by clicking the marker.

If the source is an XML file, you can set the following options:

  • Source File Path specifies the file path of the source XML file.

  • Schema File Path specifies the file path of the XML schema (XSD file) that applies to the selected source XML file.

  • Record Filter Expression allows you to optionally specify an expression used as a filter for incoming source records from the selected source XML file. The filter can refer to a field or fields inside any node inside the XML hierarchy.

The following options are available for destination XML files.

  • Destination File Path specifies the file path of the destination XML file.

  • Encoding - Allows you to choose the encoding scheme for the XML file from a list of choices. The default value is Unicode (UTF-8).

  • Format XML Output instructs Astera to add line breaks to the destination XML file for improved readability.

  • Read From Schema File specifies the file path of the XML schema (XSD file) that will be used to generate the destination XML file.

  • Root Element specifies the root element from the list of the available elements in the selected schema file.

  • Generate Destination XML Schema Based on Source Layout creates the destination XML layout to mirror the layout of the source.

  • Root Element specifies the name of the root element for the destination XML file.

  • Generate Fields as XML Attributes specifies that fields will be written as XML attributes (as opposed to XML elements) in the destination XML file.

  • Record Node specifies the name of the node that will contain each record transferred.

Note: To ensure your dataflow is runnable on a remote server, please avoid using local paths for the source. Using UNC paths is recommended.

Advanced Flat-File Reading Options

When importing from a fixed width, delimited, or Excel file, you can specify the following advanced reading options:

  • Header Spans x Rows - If your source file has a header that spans more than 1 row, select the number of rows for the header using this control.

  • Skip Initial Records - Sets the number of records that you want skipped at the beginning of the file. This option can be set whether or not your source file has a header. If your source file has a header, the first record after the specified number of rows to skip will be used as the header row.

  • Raw Text Filter - Only records starting with the filter string will be imported. The rest of the records will be filtered.

You can optionally use regular expressions to specify your filter. For example, the regular expression ^[12][4] will only include records starting with 1 or 2, and whose second character is 4.

Note: Astera supports Regular Expressions implemented with the Microsoft .NET Framework and uses the Microsoft version of named captures for regular expressions.

Raw Text Filter setting is not available for Excel source files.

Managing Differences between Source Layout and Source File

If your source is a fixed-length file, delimited file, or Excel spreadsheet, it may contain an optional header row. A header row is the first record in the file that specifies field names and, in the case of a fixed-length file, the positioning of fields in the record.

If your source file has a header row, you can specify how you want the system to handle the differences between your actual source file, and the source layout specified in the setting. Differences may arise due to the fact that the source file has a different field order from what is specified in the source layout, or it may have extra fields compared to the source layout. Conversely, the source file may have fewer fields than what is defined in the source layout, and the field names may also differ, or may have changed since the time the layout was created.

By selecting from the available options, you can have Astera handle those differences exactly as required by your situation. These options are described in more detail below:

  • Enforce exact header match – Let Astera proceed with the transfer only if the source file’s layout matches the source layout defined in the setting exactly. This includes checking for the same number and order of fields and field names.

  • Columns order in the file may be different from the layout – Let Astera ignore the sequence of fields in the source file and match them to the source layout using the field names.

  • Column headers in the file may be different from the layout – This mode is used by default whenever the source file does not have a header row. You can also enable it manually if you want to match the first field in the layout with the first field in the source file, the second field in the layout with the second field in the source file, and so on. This option will match the fields using their order as described above even if the field names are not matched successfully. We recommend that you use this mode only if you are sure that the source file has the same field sequence as what is defined in the source layout.

Creating Field Layout

The Field Layout window is available in the properties of most objects on the dataflow to help you specify the fields making up the object. The table below explains the attributes you can set in the Field Layout window.

Attribute

Description

Name

The system pre-fills this item for you based on the field header. Field names do not allow spaces. Field names are used to refer to the fields in the Expression Builder or tools where a field is used in a calculation formula.

Header

Represents the field name specified in the header row of the file. Field headers may contain spaces.

Data Type

Specifies the data type of a field, such as Integer, Real, String, Date, or Boolean.

Format

Specifies the format of the values stored in that field, depending on the field’s data type. For example, for dates you can choose between DD-MM-YY, YYYY-MM-DD, or other available formats.

Start Position

Specifies the position of the field’s first character relative to the beginning of the record.

Note: This option is only available for the fixed-length layout type.

Length

Specifies the maximum number of characters allotted for a value in the field. The actual value may be shorter than what is allowed by the Length attribute.

Note: This option is only available for fixed length and database layout types.

Column Name

Specifies the column name of the database table.

Note: This option is only available in database layout.

DB Type

Specifies the database-specific data type that the system assigns to the field based on the field's data. Each database (Oracle, SQL, Sybase, etc.) has its own DB types. For example, Long is only available in Oracle for data type strings.

Note: This option is only available in database layout.

Decimal Places

Specifies the number of decimal places for a data type specified as real.

Note: This option is only available in database layout.

Allows Null

Controls whether the field allows blank or NULL values in it.

Default Value

Specifies the value that is assigned to the field in any one of the following cases:- The source field does not have a value - The field is not found in the source layout- The destination field is not mapped to a source field.

Note: This option is only available in the destination layout.

Sequence

Represents the column order in the source file. You can change the column order of the data being imported by simply changing the number in the sequence field. The other fields in the layout will then be reordered accordingly.

Description

Contains information about the field to help you remember its purpose.

Alignment

Specifies the positioning of the field’s value relative to the start position of the field. Available alignment modes are LEFT, CENTER, and RIGHT.

Note: This option is only available for the fixed-length layout type.

Primary Key

Denotes the primary key field (or part of a composite primary key) for the table.

Note: This option is only available in database layout.

System Generated

Indicates that the field will be automatically assigned an increasing Integer number during the transfer.

Note: This option is only available in database layout.

The table below provides a list of all the attributes available for a particular layout type.

Layout Type

Attributes Available

Source Delimited file and Excel worksheet

Name, Header, Data type, Format

Source Fixed Length file

Name, Header, Data type, Format, Start position, Length

Source Database Table and SQL query

Column name, Name, Data type, DB type, Length, Decimal places, Allows null

Destination Delimited file and Excel worksheet

Name, Header, Data type, Format, Allows null, Default value

Destination Fixed Length file

Sequence, Name, Header, Description, Data type, Format, Start position, Length, Allows null, Default value, Alignment

Destination Database Table

Column name, Name, Data type, DB type, Length, Decimal places, Allows null, Primary key, System generated

Using Data Formats

Astera supports a variety of formats for each data type. For example, for Dates, you can specify the date as “April 12” or “12-Apr-08”. Data Formats can be configured independently for source and for destination, giving you the flexibility to correctly read source data and change its format as it is transferred to the destination.

If you are transferring from a flat file (for example, Delimited or Fixed-Width), you can specify the format of a field so that the system can correctly read the data from that field.

If you do not specify a data format, the system will try to guess the correct format for the field. For example, Astera is able to correctly interpret any of the following as a Date:

  • April 12

  • 12-Apr-08

  • 04-12-2008

  • Saturday, 12 April 2008

and so on.

Astera comes with a variety of pre-configured formats for each supported data type. These formats are listed in the Sample Formats section below. You can also create and save your own data formats.

To select a data format for a source field, go to Source Fields and expand the Format dropdown menu next to the appropriate field.

Dates:

Format

Sample Value

dd-MMM-yyyy

12-Apr-2008

yyyy-MM-dd

2008-04-12

dd-MM-yy

12-04-08

MM-dd-yyyy

04-12-2008

MM/dd/yyyy

04/12/2008

MM/dd/yy

04/12/08

dd-MMM-yy

12-Apr-08

M

April 12

D

12 April 2008

mm-dd-yyyy hh:mm:ss tt

04-12-2008 11:04:53 PM

M/d/yyyy hh:mm:ss tt

4/12/2008 11:04:53 PM

Booleans:

Format

Sample Value

Y/N

Y/N

1/0

1/0

T/F

T/F

True/False

True/False

Integers:

Format

Sample Value

######

123456

####

1234

####;0;(####)

-1234

.##%;0;(.##%)

123456789000%

.##%;(.##%)

1234567800%

$###,###,###,###

$1,234,567,890,000

$###,###,###,##0

$1,234,567,890,000

###,###

123450

#,#

1,000

##.00

35

Real Numbers:

Format

Sample Value

###,###.##

12,345.67

##.##

12.34

$###,###,###,###

$1,234,567,890,000

$###,###,###,##0

$1,234,567,890,000

.##%;(.##%);

.1234567800%

.##%;0;(.##%)

.12345678900%

Numeric Format Specifiers:

Format specifier

Name

Description

0

Zero placeholder

If the value being formatted has a digit in the position where the '0' appears in the format string, then that digit is copied to the result string; otherwise, a '0' appears in the result string. The position of the leftmost '0' before the decimal point and the rightmost '0' after the decimal point determines the range of digits that are always present in the result string. The "00" specifier causes the value to be rounded to the nearest digit preceding the decimal, where rounding away from zero is always used. For example, formatting 34.5 with "00" would result in the value 35.

#

Digit placeholder

If the value being formatted has a digit in the position where the '#' appears in the format string, then that digit is copied to the result string. Otherwise, nothing is stored in that position in the result string. Note that this specifier never displays the '0' character if it is not a significant digit, even if '0' is the only digit in the string. It will display the '0' character if it is a significant digit in the number being displayed. The "##" format string causes the value to be rounded to the nearest digit preceding the decimal, where rounding away from zero is always used. For example, formatting 34.5 with "##" would result in the value 35.

.

Decimal Point

The first '.' character in the format string determines the location of the decimal separator in the formatted value; any additional '.' characters are ignored.

,

Thousand separator and number scaling

The ',' character serves as both a thousand separator specifier and a number scaling specifier. Thousand separator specifier: If one or more ',' characters is specified between two digit placeholders (0 or #) that format the integral digits of a number, a group separator character is inserted between each number group in the integral part of the output. Number scaling specifier: If one or more ',' characters is specified immediately to the left of the explicit or implicit decimal point, the number to be formatted is divided by 1000 each time a number scaling specifier occurs. For example, if the string "0,," is used to format the number 100 million, the output is "100".

%

Percentage placeholder

The presence of a '%' character in a format string causes a number to be multiplied by 100 before it is formatted. The appropriate symbol is inserted in the number itself at the location where the '%' appears in the format string.

E0E+0E-0e0e+0e-0

Scientific notation

If any of the strings "E", "E+", "E-", "e", "e+", or "e-" are present in the format string and are followed immediately by at least one '0' character, then the number is formatted using scientific notation with an 'E' or 'e' inserted between the number and the exponent. The number of '0' characters following the scientific notation indicator determines the minimum number of digits to output for the exponent. The "E+" and "e+" formats indicate that a sign character (plus or minus) should always precede the exponent. The "E", "E-", "e", or "e-" formats indicate that a sign character should only precede negative exponents.

'ABC'"ABC"

Literal string

Characters enclosed in single or double quotes are copied to the result string, and do not affect formatting.

;

Section separator

The ';' character is used to separate sections for positive, negative, and zero numbers in the format string. If there are two sections in the custom format string, the leftmost section defines the formatting of positive and zero numbers, while the rightmost section defines the formatting of negative numbers. If there are three sections, the leftmost section defines the formatting of positive numbers, the middle section defines the formatting of zero numbers, and the rightmost section defines the formatting of negative numbers.

Other

All other characters

Any other character is copied to the result string and does not affect formatting.

Note: To open the source file for editing in a new tab, click the icon next to the File Path input, and select Edit File.

To generate the schema, click the icon next to the Schema File Path input, and select Generate.

To edit an existing schema, click the icon next to the Schema File Path input, and select Edit File. The schema will open for editing in a new tab.

Note: Astera makes it possible to generate an XSD file from the layout of the selected source XML file. This feature is useful when you don’t have the XSD file available. Note that all fields are assigned the data type of String in the generated schema. To use this feature, expand the control and select Generate.

To open the Data Formats window, click the icon located in the Toolbar at the top of the designer.

Delimited File
Excel File
Fixed Length File
COBOL
XML File
Data Model
Database Table
SQL Query