Astera Data Stack
Version 10
Version 10
  • Welcome to Astera Data Stack Documentation
  • RELEASE NOTES
    • Astera 10.5 - Release Notes
    • Astera 10.4 - Release Notes
    • Astera 10.3 - Release Notes
    • Astera 10.2 – Release Notes
    • Astera 10.1 - Additional Notes
    • Astera 10.1 - Release Notes
    • Astera 10.0 - Release Notes
  • SETTING UP
    • System Requirements
    • Product Architecture
    • Migrating from Astera 9 to Astera 10
    • Migrating from Astera 7.x to Astera 10
    • Installing Client and Server Applications
    • Connecting to an Astera Server using the Client
    • How to Connect to a Different Astera Server from the Client
    • How to Build a Cluster Database and Create Repository
    • Repository Upgrade Utility in Astera
    • How to Login from the Client
    • How to Verify Admin Email
    • Licensing in Astera
    • How to Supply a License Key Without Prompting the User
    • Install Manager
    • User Roles and Access Control
      • Windows Authentication
      • Azure Authentication
    • Offline Activation of Astera
    • Setting Up R in Astera
    • Silent Installation
  • DATAFLOWS
    • What are Dataflows?
    • Sources
      • Data Providers and File Formats Supported in Astera Data Stack
      • Setting Up Sources
      • Excel Workbook Source
      • COBOL File Source
      • Database Table Source
      • Delimited File Source
      • File System Items Source
      • Fixed Length File Source
      • Email Source
      • Report Source
      • SQL Query Source
      • XML/JSON File Source
      • PDF Form Source
      • Parquet File Source (Beta)
      • MongoDB Source (Beta)
      • Data Model Query
    • Transformations
      • Introducing Transformations
      • Aggregate Transformation
      • Constant Value Transformation
      • Denormalize Transformation
      • Distinct Transformation
      • Expression Transformation
      • Filter Transformation
      • Join Transformation
      • List Lookup Transformation
      • Merge Transformation
      • Normalize Transformation
      • Passthru Transformation
      • Reconcile Transformation
      • Route Transformation
      • Sequence Generator
      • Sort Transformation
      • Sources as Transformations
      • Subflow Transformation
      • Switch Transformation
      • Tree Join Transformation
      • Tree Transform
      • Union Transformation
      • Data Cleanse Transformation
      • File Lookup Transformation
      • SQL Statement Lookup
      • Database Lookup
      • AI Match Transformation
    • Destinations
      • Setting Up Destinations
      • Database Table Destination
      • Delimited File Destination
      • Excel Workbook Destination
      • Fixed Length File Destination
      • SQL Statement Destination
      • XML File Destination
      • Parquet File Destination (Beta)
      • Excel Workbook Report
      • MongoDB Destination
    • Data Logging and Profiling
      • Creating Data Profile
      • Creating Field Profile
      • Data Quality Mode
      • Using Data Quality Rules in Astera
      • Record Level Log
      • Quick Profile
    • Database Write Strategies
      • Data Driven
      • Source Diff Processor
      • Database Diff Processor
    • Text Processors
      • Delimited Parser
      • Delimited Serializer
      • Language Parser
      • Fixed Length Parser
      • Fixed Length Serializer
      • XML/JSON Parser
      • XML/JSON Serializer
    • Data Warehouse
      • Fact Table Loader
      • Dimension Loader
      • Data Vault Loader
    • Testing and Diagnostics
      • Correlation Analysis
    • Visualization
      • Basic Plots
      • Distribution Plots
    • EDI
      • EDI Source File
      • EDI Message Parser
      • EDI Message Serializer
      • EDI Destination File
  • WORKFLOWS
    • What are Workflows?
    • Creating Workflows in Astera
    • Decision
    • EDI Acknowledgment
    • File System
    • File Transfer
      • FTP
      • SFTP
    • Or
    • Run Dataflow
    • Run Program
    • Run SQL File
    • Run SQL Script
    • Run Workflow
    • Send Mail
    • Workflows with a Dynamic Destination Path
    • Customizing Workflows With Parameters
    • GPG-Integrated File Decryption in Astera
    • AS2
      • Setting up an AS2 Server
      • Adding an AS2 Partner
      • AS2 Workflow Task
  • Subflows
    • Using Subflows in Astera
  • DATA MODEL
    • Creating a Data Warehousing Project
    • Data Models
      • Introducing Data Models
      • Opening a New Data Model
      • Data Modeler - UI Walkthrough
      • Reverse Engineering an Existing Database
      • Creating a Data Model from Scratch
      • General Entity Properties
      • Creating and Editing Relationships
      • Relationship Manager
      • Virtual Primary Key
      • Virtual Relationship
      • Change Field Properties
      • Forward Engineering
      • Verifying a Data Model
    • Dimensional Modelling
      • Introducing Dimensional Models
      • Converting a Data Model to a Dimensional Model
      • Build Dimensional Model
      • Fact Entities
      • Dimension Entities
      • Placeholder Dimension for Early Arriving Facts and Late Arriving Dimensions
      • Date and Time Dimension
      • Aggregates in Dimensional Modeling
      • Verifying a Dimensional Model
    • Data Vaults
      • Introducing Data Vaults
      • Data Vault Automation
      • Raw Vault Entities
      • Bridge Tables
      • Point-In-Time Tables
    • Documentation
      • Generating Technical and Business Documentation for Data Models
      • Lineage and Impact Analysis
    • Deployment and Usage
      • Deploying a Data Model
      • View Based Deployment
      • Validate Metadata and Data Integrity
      • Using Astera Data Models in ETL Pipelines
      • Connecting an Astera Data Model to a Third-Party Visualization Tool
  • REPORT MODEL
    • User Guide
      • Report Model Tutorial
    • Report Model Interface
      • Report Options
      • Report Browser
      • Data Regions in Report Models
      • Region Properties Panel
      • Pattern Properties
      • Field Properties Panel
    • Use Cases
      • Auto-Creating Data Regions and Fields
      • Line Count
      • Auto-Parsing
      • Pattern Count
      • Applying Pattern to Line
      • Regular Expression
      • Floating Patterns and Floating Fields
      • Creating Multi-Column Data Regions
      • Defining the Start Position of Data Fields
      • Data Field Verification
      • Using Comma Separated Values to Define Start Position
      • Defining Region End Type as Specific Text and Regular Expression
      • How To Work With PDF Scaling Factor in a Report Model
      • Connecting to Cloud Storage
    • Auto Generate Layout
      • Setting Up AGL in Astera
      • UI Walkthrough - Auto Generation of Layout, Fields and Table
      • Using Auto Generation Layout, Auto Create Fields and Auto Create Table (Preview)
    • AI Powered Data Extraction
      • AI Powered Data Extraction Using Astera North Star
      • Best Practices for AI-Powered Template Creation in Astera
    • Optical Character Recognition
      • Loading PDFs with OCR
      • Best Practices for OCR Usage
    • Exporting Options
      • Exporting a Report Model
      • Exporting Report Model to a Dataflow
    • Miscellaneous
      • Importing Monarch Models
      • Microsoft Word and Rich Text Format Support
      • Working With Problematic PDF Files
  • API Flow
    • API Publishing
      • Develop
        • Designing an API Flow
        • Request Context Parameters
        • Configuring Sorting and Filtering in API Flows
        • Enable Pagination
        • Asynchronous API Request
        • Multiple Responses using Conditional Route
        • Workflow Tasks in an API Flow
        • Enable File Download-Upload Through APIs
        • Database CRUD APIs Auto-Generation
        • Pre-deployment Testing and Verification of API flows
        • Multipart/Form-Data
        • Certificate Store
      • Publish
        • API Deployment
        • Test Flow Generation
      • Manage
        • Server Browser Functionalities for API Publishing
          • Swagger UI for API Deployments
        • API Monitoring
        • Logging and Tracing
    • API Consumption
      • Consume
        • API Connection
        • Making API Calls with the API Client
        • API Browser
          • Type 1 – JSON/XML File
          • Type 2 – JSON/XML URL
          • Type 3 – Import Postman API Collections
          • Type 4 - Create or customize API collection
          • Pre-built Custom Connectors
        • Request Service Options - eTags
        • HTTP Redirect Calls
        • Method Operations
        • Pagination
        • Raw Preview And Copy Curl Command
        • Support for text/XML and SOAP Protocol
        • API Logging
        • Making Multipart/Form-Data API Calls
      • Authorize
        • Open APIs - Configuration Details
        • Authorizing Facebook APIs
        • Authorizing Astera’s Server APIs
        • Authorizing Avaza APIs
        • Authorizing the Square API
        • Authorizing the ActiveCampaign API
        • Authorizing the QuickBooks’ API
        • Astera’s Server API Documentation
        • NTLM Authentication
        • AWS Signature Authentication
        • Accessing Astera’s Server APIs Through a Third-Party Tool
          • Workflow Use Case
  • Project Management and Scheduling
    • Project Management
      • Deployment
      • Server Monitoring and Job Management
      • Cluster Monitor and Settings
      • Connecting to Source Control
      • Astera Project and Project Explorer
      • CAR Convert Utility Guide
    • Job Scheduling
      • Scheduling Jobs on the Server
      • Job Monitor
    • Configuring Multiple Servers to the Same Repository (Load Balancing)
    • Purging the Database Repository
  • Data Governance
    • Deployment of Assets in Astera Data Stack
    • Logging In
    • Tags
    • Modifying Asset Details
    • Data Discoverability
    • Data Profile
    • Data Quality
    • Scheduler
    • Access Management
  • Functions
    • Introducing Function Transformations
    • Custom Functions
    • Logical
      • Coalesce (Any value1, Any value2)
      • IsNotNull (AnyValue)
      • IsRealNumber (AnyValue)
      • IsValidSqlDate (Date)
      • IsDate (AnyValue)
      • If (Boolean)
      • If (DateTime)
      • If (Double)
      • Exists
      • If (Int64)
      • If (String)
      • IsDate (str, strformat)
      • IsInteger (AnyValue)
      • IsNullOrWhitespace (StringValue)
      • IsNullorEmpty (StringValue)
      • IsNull (AnyValue)
      • IsNumeric (AnyValue)
    • Conversion
      • GetDateComponents (DateWithOffset)
      • ParseDate (Formats, Str)
      • GetDateComponents (Date)
      • HexToInteger (Any Value)
      • ToInteger (Any value)
      • ToDecimal (Any value)
      • ToReal (Any value)
      • ToDate (String dateStr)
      • TryParseDate (String, UnknownDate)
      • ToString (Any value)
      • ToString (DateValue)
      • ToString (Any data, String format)
    • Math
      • Abs (Double)
      • Abs (Decimal)
      • Ceiling (Real)
      • Ceiling(Decimal)
      • Floor (Decimal)
      • Floor (Real)
      • Max (Decimal)
      • Max (Date)
      • Min (Decimal)
      • Min (Date)
      • Max (Real)
      • Max (Integer)
      • Min (Real)
      • Pow (BaseExponent)
      • Min (Integer)
      • RandomReal (Int)
      • Round (Real)
      • Round (Real Integer)
      • Round (Decimal Integer)
      • Round (Decimal)
    • Financial
      • DDB
      • FV
      • IPmt
      • IPmt (FV)
      • Pmt
      • Pmt (FV)
      • PPmt
      • PPmt (FV)
      • PV (FV)
      • Rate
      • Rate (FV)
      • SLN
      • SYD
    • String
      • Center (String)
      • Chr (IntAscii)
      • Asc (String)
      • AddCDATAEnvelope
      • Concatenate (String)
      • ContainsAnyChar (String)
      • Contains (String)
      • Compact (String)
      • Find (Int64)
      • EndsWith (String)
      • FindIntStart (Int32)
      • Extract (String)
      • GetFindCount (Int64)
      • FindLast (Int64)
      • GetDigits (String)
      • GetLineFeed
      • Insert (String)
      • IsAlpha
      • GetToken
      • IndexOf
      • IsBlank
      • IsLower
      • IsUpper
      • IsSubstringOf
      • Length (String)
      • LeftOf (String)
      • Left (String)
      • IsValidName
      • Mid (String)
      • PadLeft
      • Mid (String Chars)
      • LSplit (String)
      • PadRight
      • ReplaceAllSpecialCharsWithSpace
      • RemoveChars (String str, StringCharsToRemove)
      • ReplaceLast
      • RightAlign
      • Reverse
      • Right (String)
      • RSplit (String)
      • SplitStringMultipleRecords
      • SplitStringMultipleRecords (2 Separators)
      • SplitString (3 separators)
      • SplitString
      • SplitStringMultipleRecords (3 Separators)
      • Trim
      • SubString (NoOfChars)
      • StripHtml
      • Trim (Start)
      • TrimExtraMiddleSpace
      • TrimEnd
      • PascalCaseWithSpace (String str)
      • Trim (String str)
      • ToLower(String str)
      • ToProper(String str)
      • ToUpper (String str)
      • Substring (String str, Integer startAt)
      • StartsWith (String str, String value)
      • RemoveAt (String str, Integer startAt, Integer noofChars)
      • Proper (String str)
      • Repeat (String str, Integer count)
      • ReplaceAll (String str, String lookFor, String replaceWith)
      • ReplaceFirst (String str, String lookFor, String replaceWith)
      • RightOf (String str, String lookFor)
      • RemoveChars (String str, String charsToRemove)
      • SplitString (String str, String separator1, String separator2)
    • Date Time
      • AddMinutes (DateTime)
      • AddDays (DateTimeOffset)
      • AddDays (DateTime)
      • AddHours (DateTime)
      • AddSeconds (DateTime)
      • AddMonths (DateTime)
      • AddMonths (DateTimeOffset)
      • AddMinutes (DateTimeOffset)
      • AddSeconds (DateTimeOffset)
      • AddYears (DateTimeOffset)
      • AddYears (DateTime)
      • Age (DateTime)
      • Age (DateTimeOffset)
      • CharToSeconds (Str)
      • DateDifferenceDays (DateTimeOffset)
      • DateDifferenceDays (DateTime)
      • DateDifferenceHours (DateTimeOffset)
      • DateDifferenceHours (DateTime)
      • DateDifferenceMonths (DateTimeOffset)
      • DateDifferenceMonths (DateTime)
      • DatePart (DateTimeOffset)
      • DatePart (DateTime)
      • DateDifferenceYears (DateTimeOffset)
      • DateDifferenceYears (DateTime)
      • Month (DateTime)
      • Month (DateTimeOffset)
      • Now
      • Quarter (DateTime)
      • Quarter (DateTimeOffset)
      • Second (DateTime)
      • Second (DateTimeOffset)
      • SecondsToChar (String)
      • TimeToInteger (DateTime)
      • TimeToInteger (DateTimeOffset)
      • ToDate Date (DateTime)
      • ToDate DateTime (DateTime)
      • ToDateString (DateTime)
      • ToDateTimeOffset-Date (DateTimeOffset)
      • ToDate DateTime (DateTimeOffset)
      • ToDateString (DateTimeOffset)
      • Today
      • ToLocal (DateTime)
      • ToJulianDate (DateTime)
      • ToJulianDayNumber (DateTime)
      • ToTicks (Date dateTime)
      • ToTicks (DateTimeWithOffset dateTime)
      • ToUnixEpoc (Date dateTime)
      • ToUtc (Date dateTime)
      • UnixTimeStampToDateTime (Real unixTimeStamp)
      • UtcNow ()
      • Week (Date dateTime)
      • Week (DateTimeWithOffset dateTime)
      • Year (Date dateTime)
      • Year (DateTimeWithOffset dateTime)
      • DateToJulian (Date dateTime, Integer length)
      • DateTimeOffsetUtcNow ()
      • DateTimeOffsetNow ()
      • Day (DateTimeWithOffset dateTime)
      • Day (Date dateTime)
      • DayOfWeekStr (DateTimeWithOffset dateTime)
      • DayOfWeek (DateTimeWithOffset dateTime)
      • DayOfWeek (Date dateTime)
      • DateToJulian (DateTimeWithOffset dateTime, Integer length)
      • DayOfWeekStr (Date dateTime)
      • FromJulianDate (Real julianDate)
      • DayOfYear (Date dateTime)
      • DaysInMonth(Integer year, Integer month)
      • DayOfYear (DateTimeWithOffset dateTime)
      • FromUnixEpoc
      • FromJulianDayNumber (Integer julianDayNumber)
      • FromTicksUtc(Integer ticks)
      • FromTicksLocal(Integer ticks)
      • Hour (Date dateTime)
      • Hour (DateTimeWithOffset dateTime)
      • Minute (Date dateTime)
      • JulianToDate (String julianDate)
      • Minute (DateTimeWithOffset dateTime)
      • DateToIntegerYYYYMMDD (DateTimeWithOffset dateTime)
      • DateToIntegerYYYYMMDD (Date dateTime)
    • Files
      • AppendTextToFile (String filePath, String text)
      • CopyFile (String sourceFilePath, String destFilePath, Boolean overWrite)
      • CreateDateTime (String filePath)
      • DeleteFile (String filePath)
      • DirectoryExists (String filePath)
      • FileExists (String filePath)
      • FileLength (String filePath)
      • FileLineCount (String filePath)
      • GetDirectory (String filePath)
      • GetEDIFileMetaData (String filePath)
      • GetExcelWorksheets (String excelFilePath)
      • GetFileExtension (String filePath)
      • GetFileInfo (String filePath)
      • GetFileName (String filePath)
      • GetFileNameWithoutExtension (String filePath)
      • LastUpdateDateTime (String filePath)
      • MoveFile (String filePath, String newDirectory)
      • ReadFileBytes (String filePath)
      • ReadFileFirstLine (String filePath)
      • ReadFileText (String filePath)
      • ReadFileText (String filePath, String codePage)
      • WriteBytesToFile (String filePath, ByteArray bytes)
      • WriteTextToFile (String filePath, String text)
    • Date Time With Offset
      • ToDateTimeOffsetFromDateTime (dateTime String)
      • ToUtc (DateTimeWithOffset)
      • ToDateTimeOffsetFromDateTime
      • ToDateTimeOffset (String dateTimeOffsetStr)
      • ToDateTimeFromDateTimeOffset
    • GUID
      • NewGuid
    • Encoding
      • ToBytes
      • FromBytes
      • UrlEncode
      • UrlDecode
      • ComputeSHA256
      • ComputeMD5
      • ComputeHash (Str, Key)
      • ComputeHash (Str, Key, hex)
      • ConvertEncoding
    • Regular Expressions
      • ReplaceRegEx
      • ReplaceRegEx (Integer StartAt)
      • IsMatchRegEx (StartAt)
      • IsMatchRegEx
      • IsUSPhone
      • IsUSZipCode
      • GetMatchRegEx
      • GetMatchRegEx (StartAt)
    • TimeSpan
      • Minutes
      • Hours
      • Days
      • Milliseconds
      • TotalMilliseconds
      • TimeSpanFromTicks
      • Ticks
      • TotalHours
      • Seconds
      • TotalDays
      • ToTimeSpan (Hours, Min, Sec)
      • ToTimeSpan (Milli)
      • ToTimeSpan
      • TotalSeconds
      • TotalMinutes
    • Matching
      • Soundex
      • DoubleMetaphone
      • RefinedSoundex
    • Processes
      • TerminateProcess
      • IsProcessRunning
  • USE CASES
    • End-to-End Use Cases
      • Data Integration
        • Using Astera Data Stack to Create and Orchestrate an ETL Process for Partner Onboarding
      • Data Warehousing
        • Building a Data Warehouse – A Step by Step Approach
      • Data Extraction
        • Reusing The Extraction Template for Similar Layout Files
  • CONNECTORS
    • Setting Up IBM DB2/iSeries Connectivity in Astera
    • Connecting to SAP HANA Database
    • Connecting to MariaDB Database
    • Connecting to Salesforce Database
    • Connecting to Salesforce – Legacy Database
    • Connecting to Vertica Database
    • Connecting to Snowflake Database
    • Connecting to Amazon Redshift Database
    • Connecting to Amazon Aurora Database
    • Connecting to Google Cloud SQL in Astera
    • Connecting to MySQL Database
    • Connecting to PostgreSQL in Astera
    • Connecting to Netezza Database
    • Connecting to Oracle Database
    • Connecting to Microsoft Azure Databases
    • Amazon S3 Bucket Storage in Astera
    • Connecting to Amazon RDS Databases
    • Microsoft Azure Blob Storage in Astera
    • ODBC Connector
    • Microsoft Dynamics CRM
    • Connection Details for Azure Data Lake Gen 2 and Azure Blob Storage
    • Configuring Azure Data Lake Gen 2
    • Connecting to Microsoft Message Queue
    • Connecting to Google BigQuery
    • Azure SQL Server Configuration Prerequisites
    • Connecting to Microsoft Azure SQL Server
    • Connecting to Microsoft SharePoint in Astera
  • Incremental Loading
    • Trigger Based CDC
    • Incremental CDC
  • MISCELLANEOUS
    • Using Dynamic Layout & Template Mapping in Astera
    • Synonym Dictionary File
    • SmartMatch Feature
    • Role-Based Access Control in Astera
    • Updating Your License in Astera
    • Using Output Variables in Astera
    • Parameterization
    • Connection Vault
    • Safe Mode
    • Context Information
    • Using the Data Source Browser in Astera
    • Pushdown Mode
    • Optimization Scenarios
    • Using Microsoft’s Modern Authentication Method in Email Source Object
    • Shared Actions
    • Data Formats
    • AI Automapper
    • Resource Catalog
    • Cloud Deployment
      • Deploying Astera Data Stack on Microsoft Azure Cloud
      • Deploying Astera Data Stack on Oracle Cloud
      • Deploying Astera Data Stack on Amazon Web Services
      • Setting up the Astera Server on AKS
    • GIT In Astera Data Stack
      • GIT Repositories in Astera Data Stack
      • Moving a Repository to a Remote Server
      • Git Conflicts in Astera Data Stack
    • Astera Best Practices
  • FAQs
    • Installation
      • Why do we need to make two installations for Astera?
      • What’s the difference between Custom and Complete installation?
      • What’s the difference between 32-bit and 64-bit Astera?
      • Can we use a single license for multiple users?
      • Does Astera client work when it’s not connected to the server?
      • Why do we need to build a cluster database and set up a repository while working with Astera?
      • How do we set up multiple servers for load balancing?
      • How do we maintain schedules when migrating server or upgrading version?
      • Which database providers does Astera support for setting up a cluster database?
      • How many Astera clients can be connected to a single server?
      • Why is Astera not able to access my source file or create a new one?
    • Sources
      • Can I use data from unstructured documents in dataflows?
      • Can I extract data from fillable PDF forms in Astera?
      • Does Astera support extraction of data residing in online sources?
      • How do I process multiple files in a directory with a single execution of a flow?
      • Can I write information from the File System Items Source to the destination?
      • Can I split a source file into multiple files based on record count?
      • Does Astera support data extraction from unstructured docs or text files?
      • What is the difference between full and incremental loading in database sources?
      • How is the File System Items Source used in a Dataflow?
      • How does the PDF Form Source differ from the Report Source in Astera?
      • Does Astera support extraction of data from EDI files?
      • How does the Raw Text Filter option work in file sources in Astera?
    • Destinations
      • If I want to have a different field delimiter, say a pipe (“|”), is there an option to export with a
      • Tools Menu > Data Format has different date formats, but it doesn’t seem to do anything.
      • Can we export the Object Path column present in the Data Preview window?
      • I want to change the output format of a column.
      • What will be the outcome if we write files multiple times to the same Excel Destination?
    • Transformations
      • How is the Aggregate Transformation different from the Expression Transformation?
      • Can we omit duplicate records using the Aggregate Transformation in Astera?
      • How many datasets can a single Aggregate object take input from?
      • How is Expression Transformation different from the Function Transformation?
    • Workflows
      • What is a Workflow in Astera?
      • How do I trigger a task if at least one of a set of tasks fails?
      • Can I perform an action based on whether a file has data?
    • Scheduler
      • How can I schedule a job to run every x hours?
Powered by GitBook
On this page
  • Flat File Sources
  • Tree File Sources
  • Database Sources
  • Source/Destination File Options
  • Advanced Flat-File Reading Options
  • Managing Differences between Source Layout and Source File
  • Creating Field Layout
  • Using Data Formats

Was this helpful?

Export as PDF
  1. DATAFLOWS
  2. Sources

Setting Up Sources

PreviousData Providers and File Formats Supported in Astera Data StackNextExcel Workbook Source

Last updated 1 year ago

Was this helpful?

© Copyright 2025, Astera Software

Each source on the dataflow is represented as a source object. You can have any number of sources in the dataflow, and they can feed into zero or more destinations.

The following source types are supported by the dataflow engine:

Flat File Sources:

Tree File Sources:

Database Sources:

  • Data Model

All sources can be added to the dataflow by picking a source type on the Toolbox and dropping it on the dataflow. File sources can also be added by dragging-and-dropping a file from an Explorer window. Database sources can be drag-and-dropped from the Data Source Browser. For more details on adding sources to the dataflow, see Introducing Dataflows.

Flat File Sources

Delimited File

Adding a Delimited File Source object allows you to transfer data from a delimited file. An example of what a delimited file source object looks like is shown below.

To configure the properties of a Delimited File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

Fixed-Length File

Adding a Fixed-Length File Source object allows you to transfer data from a fixed-length file. An example of what a Fixed-Length File Source object looks like is shown below.

To configure the properties of a Fixed-Length File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

Excel File

Adding an Excel Workbook Source object allows you to transfer data from an Excel file. An example of what an Excel Workbook Source object looks like is shown below.

To configure the properties of an Excel Workbook Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

Tree File Sources

COBOL File

Adding a COBOL File Source object allows you to transfer data from a COBOL file. An example of what a COBOL File Source object looks like is shown below.

To configure the properties of a COBOL File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu.

XML/JSON File

Adding an XML/JSON File Source object allows you to transfer data from an XML file. An example of what an XML/JSON File Source object looks like is shown below.

To configure the properties of an XML/JSON File Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu. The following properties are available:

General Properties window:

File Path – Specifies the location of the source XML file. Using UNC paths is recommended if running the dataflow on a server.

Schema File Path – Specifies the location of the XSD file controlling the layout of the XML source file.

Note: Astera can generate a schema based on the content of the source XML file. The data types will be assigned based on the source file’s content.

Optional Record Filter Expression – Allows you to enter an expression to selectively filter incoming records according to your criteria. You can use the Expression Builder to help you create your filter expression. For more information on using Expression Builder, see Expression Builder.

Note: To ensure that your dataflow is runnable on a remote server, please avoid using local paths for the source. Using UNC paths is recommended.

Database Sources

Database Table

Adding a Database Table Source object allows you to transfer data from a database table. An example of what a Database Table Source object looks like is shown below.

To configure the properties of a Database Table Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu. The following properties are available:

Source Connection window – Allows you to enter the connection information for your source, such as Server Name, Database, and Schema, as well as credentials for connecting to the selected source.

Pick Source Table window:

Select a source table using the Pick Table dropdown.

  • Select Full Load if you want to read the entire table.

  • Select Incremental Load Based on Audit Fields to perform an incremental read starting at a record where the previous read left off.

Incremental load based on Audit Fields is based around the concept of Change Data Capture (CDC), which is a set of reading and writing patterns designed to optimize large-scale data transfers by minimizing database writing in order to improve performance. CDC is implemented in Astera using Audit Fields pattern. The Audit Fields pattern uses create time or last update time to determine the records that have been inserted or updated since the last transfer and transfers only those records.

Advantages

  • Most efficient of CDC patterns. Only records that were modified since the last transfer are retrieved by the query thereby putting little stress on the source database and network bandwidth

Disadvantages

  • Requires update date time and/or create date time fields to be present and correctly populated

  • Does not capture deletes

  • Requires index on the audit field(s) for efficient performance

To use the Audit Fields strategy, select the Audit Field and an optional Alternate Audit Field from the appropriate dropdown menus. Also, specify the path to the file that will store incremental transfer information.

Where Clause window:

You can enter an optional SQL expression serving as a filter for the incoming records. The expression should start with the WHERE word followed by the filter you wish to apply.

For example, WHERE CreatedDtTm >= ‘2001/01/05’

General Options window:

The Comments input allows you to enter comments associated with this object.

SQL Query

Adding a SQL Query Source object allows you to transfer data returned by the SQL query. An example of what an SQL Query Source object looks like is shown below.

To configure the properties of a SQL Query Source object after it is added to the dataflow, right-click on its header and select Properties from the context menu. The following properties are available:

Source Connection window – Allows you to enter the connection information for your SQL Query, such as Server Name, Database, and Schema, as well as credentials for connecting to the selected database.

SQL Query Source window:

Enter the SQL expression controlling which records should be returned by this source. The expression should follow SQL syntax conventions for the chosen database provider.

For example, select OrderId, OrderName, CreatedDtTm from Orders.

Source/Destination File Options

Source or Destination is a Delimited File

If your source or destination is a Delimited File, you can set the following properties

  • First Row Contains Header - Check this option if you want the first row of your file to display the column headers. In the case of Source file, this indicates if the source contains headers.

  • Field Delimiter - Allows you to select the delimiter for the fields. The available choices are , and . You can also type the delimiter of your choice instead of choosing the available options.

  • Record Delimiter - Allows you to select the delimiter for the records in the fields. The choices available are carriage-return line-feed combination , carriage-return and line-feed . You can also type the record delimiter of your choice instead of choosing the available options. For more information on Record Delimiters, please refer to the Glossary.

  • Encoding - Allows you to choose the encoding scheme for the delimited file from a list of choices. The default value is Unicode (UTF-8)

  • Quote Char - Allows you to select the type of quote character to be used in the delimited file. This quote character tells the system to overlook any special characters inside the specified quotation marks. The options available are ” and ’.

You can also use the Build fields from an existing file feature to help you build a destination fields based on an existing file instead of manually typing the layout.

Source or Destination is a Microsoft Excel Worksheet

If the Source and/or the Destination chosen is a Microsoft Excel Worksheet, you can set the following properties:

  • First Row Contains Header - Check this option if you want the first row of your file to display the column headers. In the case of Source file, this indicates if the source contains headers.

  • Worksheet - Allows you to select a specific worksheet from the selected Microsoft Excel file.

You can also use the Build fields from an existing file feature to help you build a destination fields based on an existing file instead of manually typing the layout.

Source or Destination is a Fixed Length File

If the Source and/or the Destination chosen is a Fixed Length File, you can set the following properties:

  • First Row Contains Header - Check this option if you want the first row of your file to display the column headers. In the case of Source file, this indicates if the source contains headers.

  • Record Delimiter - Allows you to select the delimiter for the records in the fields. The choices available are carriage-return line-feed combination , carriage-return and line-feed . You can also type the record delimiter of your choice instead of choosing the available options. For more information on Record Delimiters, please refer to the Glossary.

  • Encoding - Allows you to choose the encoding scheme for the delimited file from a list of choices. The default value is Unicode (UTF-8)

You can also use the Build fields from an existing file feature to help you build a destination fields based on an existing file instead of manually typing the layout.

Using the Length Markers window, you can create the layout of your fixed-length file, The Length Markers window has a ruled marker placed at the top of the window. To insert a field length marker, you can click in the window at a particular point. For example, if you want to set the length of a field to contain five characters and the field starts at five, then you need to click at the marker position nine.

In case the records don’t have a delimiter and you rely on knowing the size of a record, the number in the RecordLength box is used to specify the character length for a single record.

You can delete a field length marker by clicking the marker.

Source or Destination is an XML file

If the source is an XML file, you can set the following options:

  • Source File Path specifies the file path of the source XML file.

  • Schema File Path specifies the file path of the XML schema (XSD file) that applies to the selected source XML file.

  • Record Filter Expression allows you to optionally specify an expression used as a filter for incoming source records from the selected source XML file. The filter can refer to a field or fields inside any node inside the XML hierarchy.

The following options are available for destination XML files.

  • Destination File Path specifies the file path of the destination XML file.

  • Encoding - Allows you to choose the encoding scheme for the XML file from a list of choices. The default value is Unicode (UTF-8).

  • Format XML Output instructs Astera to add line breaks to the destination XML file for improved readability.

  • Read From Schema File specifies the file path of the XML schema (XSD file) that will be used to generate the destination XML file.

  • Root Element specifies the root element from the list of the available elements in the selected schema file.

  • Generate Destination XML Schema Based on Source Layout creates the destination XML layout to mirror the layout of the source.

  • Root Element specifies the name of the root element for the destination XML file.

  • Generate Fields as XML Attributes specifies that fields will be written as XML attributes (as opposed to XML elements) in the destination XML file.

  • Record Node specifies the name of the node that will contain each record transferred.

Note: To ensure that your dataflow is runnable on a remote server, please avoid using local paths for the source. Using UNC paths is recommended.

Advanced Flat-File Reading Options

When importing from a fixed-width, delimited, or Excel file, you can specify the following advanced reading options:

Header Spans x Rows - If your source file has a header that spans more than 1 row, select the number of rows for the header using this control.

Skip Initial Records - Sets the number of records which you want skipped at the beginning of the file. This option can be set whether or not your source file has a header. If your source file has a header, the first record after the specified number of rows to skip will be used as the header row.

Raw Text Filter - Only records starting with the filter string will be imported. The rest of the records will be filtered.

You can optionally use regular expressions to specify your filter. For example, the regular expression ^[12][4] will only include records starting with 1 or 2, and whose second character is 4.

Note: Astera supports Regular Expressions implemented with the Microsoft .NET Framework and uses the Microsoft version of named captures for regular expressions.

Raw Text Filter setting is not available for Excel source files.

Managing Differences between Source Layout and Source File

If your source is a fixed-length file, delimited file, or Excel spreadsheet, it may contain an optional header row. A header row is the first record in the file that specifies field names and, in the case of a fixed-length file, the positioning of fields in the record.

If your source file has a header row, you can specify how you want the system to handle the differences between your actual source file, and the source layout specified in the setting. Differences may arise due to the fact that the source file has a different field order from what is specified in the source layout, or it may have extra fields compared to the source layout. Conversely, the source file may have fewer fields than what is defined in the source layout, and the field names may also differ, or may have changed since the time the layout was created.

By selecting from the available options, you can have Astera handle those differences exactly as required by your situation. These options are described in more detail below:

Enforce exact header match – Lets Astera Data Stack proceed with the transfer only if the source file’s layout matches the source layout defined in the setting exactly. This includes checking for the same number and order of fields and field names.

Columns order in file may be different from the layout – Lets Astera Data Stack ignore the sequence of fields in the source file, and match them to the source layout using the field names.

Column headers in file may be different from the layout – This mode is used by default whenever the source file does not have a header row. You can also enable it manually if you want to match the first field in the layout with the first field in the source file, the second field in the layout with the second field in the source file, and so on. This option will match the fields using their order as described above even if the field names are not matched successfully. We recommend that you use this mode only if you are sure that the source file has the same field sequence as what is defined in the source layout.

Creating Field Layout

The Field Layout window is available in the properties of most objects on the dataflow to help you specify the fields making up the object. The table below explains the attributes you can set in the Field Layout window.

Attribute
Description

Name

The system pre-fills this item for you based on the field header. Field names do not allow spaces. Field names are used to refer to the fields in the Expression Builder or tools where a field is used in a calculation formula.

Header

Represents the field name specified in the header row of the file. Field headers may contain spaces.

Data Type

Specifies the data type of a field, such as Integer, Real, String, Date, or Boolean.

Format

Specifies the format of the values stored in that field, depending on the field’s data type. For example, for dates you can choose between DD-MM-YY, YYYY-MM-DD, or other available formats.

Start Position

Specifies the position of the field’s first character relative to the beginning of the record. Note: This option is only available for fixed length layout type.

Length

Specifies the maximum number of characters allotted for a value in the field. The actual value may be shorter than what is allowed by the Length attribute. Note: This option is only available for fixed length and database layout types.

Column Name

Specifies the column name of the database table. Note: This option is only available in database layout.

DB Type

Specifies the database specific data type that the system assigns to the field based on the field's data. Each database (Oracle, SQL, Sybase, etc) has its own DB types. For example, Long is only available in Oracle for data type string. Note: This option is only available in database layout.

Decimal Places

Specifies the number of decimal places for a data type specified as real. Note: This option is only available in database layout.

Allows Null

Controls whether the field allows blank or NULL values in it.

Default Value

Specifies the value that is assigned to the field in any one of the following cases:- The source field does not have a value - The field is not found in the source layout- The destination field is not mapped to a source field. Note: This option is only available in destination layout.

Sequence

Represents the column order in the source file. You can change the column order of the data being imported by simply changing the number in the sequence field. The other fields in the layout will then be reordered accordingly.

Description

Contains information about the field to help you remember its purpose.

Alignment

Specifies the positioning of the field’s value relative to the start position of the field. Available alignment modes are LEFT, CENTER, and RIGHT. Note: This option is only available for fixed length layout type.

Primary Key

Denotes the primary key field (or part of a composite primary key) for the table. Note: This option is only available in database layout.

System Generated

Indicates that the field will be automatically assigned an increasing Integer number during the transfer. Note: This option is only available in database layout.

The table below provides a list of all the attributes available for a particular layout type.

Layout Type
Attributes Available

Source Delimited file and Excel worksheet

Name, Header, Data type, Format

Source Fixed Length file

Name, Header, Data type, Format, Start position, Length

Source Database Table and SQL query

Column name, Name, Data type, DB type, Length, Decimal places, Allows null

Destination Delimited file and Excel worksheet

Name, Header, Data type, Format, Allows null, Default value

Destination Fixed Length file

Sequence, Name, Header, Description, Data type, Format, Start position, Length, Allows null, Default value, Alignment

Destination Database Table

Column name, Name, Data type, DB type, Length, Decimal places, Allows null, Primary key, System generated

Using Data Formats

Astera supports a variety of formats for each data type. For example, for Dates, you can specify the date as “April 12” or “12-Apr-08”. Data Formats can be configured independently for source and for destination, giving you the flexibility to correctly read source data and change its format as it is transferred to destination.

If you are transferring from a flat file (for example, Delimited or Fixed-Width), you can specify the format of a field so that the system can correctly read the data from that field.

If you do not specify a data format, the system will try to guess the correct format for the field. For example, Astera is able to correctly interpret any of the following as a Date:

April 12

12-Apr-08

04-12-2008

Saturday, 12 April 2008

and so on

Astera comes with a variety of pre-configured formats for each supported data type. These formats are listed in the Sample Formats section below. You can also create and save your own data formats.

To select a data format for a source field, go to Source Fields and expand the Format dropdown menu next to the appropriate field.

Sample Formats

Dates:

Format
Sample Value

dd-MMM-yyyy

12-Apr-2008

yyyy-MM-dd

2008-04-12

dd-MM-yy

12-04-08

MM-dd-yyyy

04-12-2008

MM/dd/yyyy

04/12/2008

MM/dd/yy

04/12/08

dd-MMM-yy

12-Apr-08

M

April 12

D

12 April 2008

mm-dd-yyyy hh:mm:ss tt

04-12-2008 11:04:53 PM

M/d/yyyy hh:mm:ss tt

4/12/2008 11:04:53 PM

Booleans:

Format
Sample Value

Y/N

Y/N

1/0

1/0

T/F

T/F

True/False

True/False

Integers:

Format
Sample Value

######

123456

####

1234

####;0;(####)

-1234

.##%;0;(.##%)

123456789000%

.##%;(.##%)

1234567800%

$###,###,###,###

$1,234,567,890,000

$###,###,###,##0

$1,234,567,890,000

###,###

123450

#,#

1,000

##.00

35

Real Numbers:

Format
Sample Value

###,###.##

12,345.67

##.##

12.34

$###,###,###,###

$1,234,567,890,000

$###,###,###,##0

$1,234,567,890,000

.##%;(.##%);

.1234567800%

.##%;0;(.##%)

.12345678900%

Numeric Format Specifiers:

Format specifier
Name
Description

0

Zero placeholder

If the value being formatted has a digit in the position where the '0' appears in the format string, then that digit is copied to the result string; otherwise, a '0' appears in the result string. The position of the leftmost '0' before the decimal point and the rightmost '0' after the decimal point determines the range of digits that are always present in the result string. The "00" specifier causes the value to be rounded to the nearest digit preceding the decimal, where rounding away from zero is always used. For example, formatting 34.5 with "00" would result in the value 35.

#

Digit placeholder

If the value being formatted has a digit in the position where the '#' appears in the format string, then that digit is copied to the result string. Otherwise, nothing is stored in that position in the result string. Note that this specifier never displays the '0' character if it is not a significant digit, even if '0' is the only digit in the string. It will display the '0' character if it is a significant digit in the number being displayed. The "##" format string causes the value to be rounded to the nearest digit preceding the decimal, where rounding away from zero is always used. For example, formatting 34.5 with "##" would result in the value 35.

.

Decimal Point

The first '.' character in the format string determines the location of the decimal separator in the formatted value; any additional '.' characters are ignored.

,

Thousand separator and number scaling

The ',' character serves as both a thousand separator specifier and a number scaling specifier. Thousand separator specifier: If one or more ',' characters is specified between two digit placeholders (0 or #) that format the integral digits of a number, a group separator character is inserted between each number group in the integral part of the output. Number scaling specifier: If one or more ',' characters is specified immediately to the left of the explicit or implicit decimal point, the number to be formatted is divided by 1000 each time a number scaling specifier occurs. For example, if the string "0,," is used to format the number 100 million, the output is "100".

%

Percentage placeholder

The presence of a '%' character in a format string causes a number to be multiplied by 100 before it is formatted. The appropriate symbol is inserted in the number itself at the location where the '%' appears in the format string.

E0E+0E-0e0e+0e-0

Scientific notation

If any of the strings "E", "E+", "E-", "e", "e+", or "e-" are present in the format string and are followed immediately by at least one '0' character, then the number is formatted using scientific notation with an 'E' or 'e' inserted between the number and the exponent. The number of '0' characters following the scientific notation indicator determines the minimum number of digits to output for the exponent. The "E+" and "e+" formats indicate that a sign character (plus or minus) should always precede the exponent. The "E", "E-", "e", or "e-" formats indicate that a sign character should only precede negative exponents.

'ABC'"ABC"

Literal string

Characters enclosed in single or double quotes are copied to the result string, and do not affect formatting.

;

Section separator

The ';' character is used to separate sections for positive, negative, and zero numbers in the format string. If there are two sections in the custom format string, the leftmost section defines the formatting of positive and zero numbers, while the rightmost section defines the formatting of negative numbers. If there are three sections, the leftmost section defines the formatting of positive numbers, the middle section defines the formatting of zero numbers, and the rightmost section defines the formatting of negative numbers.

Other

All other characters

Any other character is copied to the result string, and does not affect formatting.

Note: To open the source file for editing in a new tab, click icon next to the File Path input, and select Edit File.

To generate the schema, click icon next to the Schema File Path input, and select Generate.

To edit an existing schema, click icon next to the Schema File Path input, and select Edit File. The schema will open for editing in a new tab.

Note: Astera makes it possible to generate an XSD file from the layout of the selected source XML file. This feature is useful when you don’t have the XSD file available. Note that all fields are assigned the data type of String in the generated schema. To use this feature, expand the control and select Generate.

To open the Data Formats window, click icon located in the Toolbar at the top of the designer.

Delimited File
Excel File
Fixed Length File
COBOL
XML File
Database Table
SQL Query
../_images/811.png