Using Astera Data Stack to Create and Orchestrate an ETL Process for Partner Onboarding

The secret to enterprise success lies in expanding business networks in a way that maximizes operational efficiency. These networks include suppliers, resellers, vendors, and countless other business partners that open the door to vast markets inaccessible through in-house logistics alone. However, to secure these advantages, the external data coming in from these partners must be integrated through an efficient, reliable, and scalable process. With Astera Data Stack, you can achieve exactly that.

In this article, we will be looking at a detailed use case that showcases the effectiveness of Astera Data Stack in Partner Onboarding.

Use Case

Wheel Dealers, an online automotive marketplace, will be the subject of our discussion. The Wheel Dealers portal provides customers with a comprehensive list of available cars from multiple dealerships, with added options to compare features, prices, and other services. The company currently serves over one million customers and acquires data from over 200 dealerships across New England and the Mid-Atlantic, with tentative plans to add a thousand new vendors from the Midwest and Northeast.

At the moment, Wheel Dealers runs a manual ETL process that consumes considerable time and resources. Moreover, with major expansion plans in the pipeline, this approach will not suffice in the near future. Using Astera Data Stack, we will design an automated data integration solution that extracts data from disparate sources with minimal manual intervention, providing Wheel Dealers with a seamless Partner Onboarding experience.

This implementation of an ETL process will employ numerous objects and features available in Astera Data Stack, including the following:

  • Dataflow

    • Excel Workbook Source

    • Variables

    • Data Cleanse Transformation

    • Data Quality Rules

    • Database Table Destination

    • Record Level Log

  • Workflows

    • File Transfer Action

    • List FTP Directory Contents

    • Expression

    • Run Workflow

    • Context Information

    • Run Dataflow

  • Job Scheduler

A Dataflow will be used to create the ETL process, whereas two Workflows and the Job Scheduler will be used to orchestrate it.

Let’s take a detailed look at the methodologies being used.

Creating a Project

All of the items being used will be consolidated in one Project. To open a new Integration Project in Astera Data Stack, go to Project > New > Integration Project.

You can add new items to the Project by right-clicking the Project tab and selecting Add New Item.

Creating the ETL Process

As mentioned earlier, we will be creating this ETL process in one Dataflow. The Astera Data Stack Toolbox is packed with source, destination, and transformation objects that serve a wide range of purposes. For Wheel Dealers, we want to design a system that captures data, refines it, and uploads it to the dealer database.

Capturing Data

For this demonstration, we will be using an Excel Workbook Source object to capture the partner data. We are assuming that all of the incoming data is in Excel format; however, any other source object could be used, depending on the nature of the incoming files.

  1. While the actual source files will be coming in from an FTP server, we will configure the source object through a template file. This will populate the object with all of the required fields.

  2. To ensure that the source object picks up the files that are received from the FTP server, we will define a different file path at runtime by using the Config Parameters screen in the configuration window.

  • First, create a file path variable using the Variables object from the Toolbox.

  • Enter this variable as a parameterized value in the FilePath section of the Config Parameters screen.

For more information on how you can configure an Excel Workbook Source object, click here.
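
Although this configuration is done entirely through Astera's UI, the underlying idea, reading a layout from a template and substituting the real file path at runtime, can be illustrated with a short Python sketch. The paths and the load_partner_file helper below are hypothetical.

```python
import pandas as pd

TEMPLATE_PATH = "templates/partner_template.xlsx"  # hypothetical template file

def load_partner_file(file_path: str) -> pd.DataFrame:
    """Load a partner workbook using the field layout taken from the template."""
    expected_columns = pd.read_excel(TEMPLATE_PATH, nrows=0).columns
    df = pd.read_excel(file_path)
    # Keep only the fields defined in the template, in template order
    return df.reindex(columns=expected_columns)

# At runtime, the FilePath parameter is supplied per incoming partner file
df = load_partner_file("incoming/dealer_042.xlsx")  # hypothetical runtime value
```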

Cleansing Data

To refine and cleanse Wheel Dealers’ data, we will be using two separate transformations.

  1. Sort Transformation will be used to sort the data in ascending order.

A simple sort will make the data easier to query and analyze, which will consequently reduce the burden on the integration server.

For more information on how you can configure and use the Sort Transformation object, click here.

  2. Data Cleanse Transformation will be used to further refine the data by removing unwanted characters and spaces.

The data received by Wheel Dealers is usually well-formatted. Therefore, we will only be using this transformation to remove leading and trailing whitespaces.

For more information on how you can use the Data Cleanse Transformation object, click here.
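
As a rough equivalent of these two transformations in code, here is a minimal pandas sketch; the Dealer_ID sort key is an assumption.

```python
import pandas as pd

def cleanse_and_sort(df: pd.DataFrame) -> pd.DataFrame:
    """Sort in ascending order and trim whitespace from string fields."""
    for col in df.select_dtypes(include="object").columns:
        # Data Cleanse step: remove leading and trailing whitespaces only
        df[col] = df[col].str.strip()
    # Sort step: ascending order on an assumed key column
    return df.sort_values(by="Dealer_ID", ascending=True)
```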

Resolving Errors & Uploading to Database

Wheel Dealers needs a method to fix the errors present in the partner data and integration flows. For this purpose, we will be using the Data Quality Rules and Record Level Log objects. Moreover, the final data will be uploaded to the dealer database by using a Database Table Destination object.

  1. The Data Quality Rules object can be used to define conditions, known as data quality rules, for incoming records. Records that do not meet these conditions are flagged as errors or warnings.

In this case, we will be applying two data quality rules.

  • Rule 1: The mileage of a used car cannot be null.

  • Rule 2: The transmission cannot be null.

For more information on how to use the Data Quality Rules object, click here.
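
In code, these rules amount to null checks that flag records rather than dropping them. A sketch, assuming the columns are named Mileage and Transmission:

```python
import pandas as pd

def apply_quality_rules(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rule violations per record instead of discarding the record."""
    errors = []
    for _, row in df.iterrows():
        messages = []
        if pd.isna(row.get("Mileage")):        # Rule 1
            messages.append("mileage cannot be null")
        if pd.isna(row.get("Transmission")):   # Rule 2
            messages.append("transmission cannot be null")
        errors.append("; ".join(messages))
    df["QualityErrors"] = errors               # empty string means a clean record
    return df
```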

Notice that some of the element names in the Data Quality Rules object are inconsistent with those in the database. With a large volume of partner data coming in from disparate sources, such discrepancies are bound to occur. Despite them, Astera Data Stack has still managed to match some of the elements with their counterparts due to naming similarities, for example, ID and Dealer_ID. However, a few elements, such as MethodOfPayment, SP, DateOfOrder, and DateOfDelivery, remain unmatched. For these, we will be using the Synonym Dictionary.

  • Add a Synonym Dictionary file to the Project and create alternatives for element names, as needed.

  • Press Shift and map the parent node of the Data Quality Rules object to that of the Database Destination object.

For more information on the Synonym Dictionary feature, click here.
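
Conceptually, the Synonym Dictionary is a lookup from source element names to their database counterparts. A sketch, applied to the cleansed frame from the earlier snippets; the database-side names are assumptions:

```python
import pandas as pd

# Source element name -> database column name (right-hand names are assumed)
SYNONYMS = {
    "MethodOfPayment": "Payment_Method",
    "SP": "Sale_Price",
    "DateOfOrder": "Order_Date",
    "DateOfDelivery": "Delivery_Date",
}

df = df.rename(columns=SYNONYMS)  # unmatched elements now line up by name
```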

  2. Next, we will use a Database Table Destination object to access the dealer database and upload the dataset to it.

For more information on how to configure and use a Database Table Destination object, click here.

  3. Finally, a Record Level Log will be used to flag the records that do not meet the data quality rules defined earlier.

Once you preview the output, you can see that the erroneous records have been flagged.

Wheel Dealers’ business analysts can investigate this further and follow up with the dealership concerned.

For more information on how you can use the Record Level Log object, click here.

Final Dataflow

The ETL process has now been created. This is what the final Dataflow should look like:

Orchestrating the Process for Automation

The orchestration of this ETL process requires two main Workflows, an inner Workflow, and the Job Scheduler. Wheel Dealers wants to implement self-operating and self-regulating processes wherever possible, so the purpose here is to automate the entire system. Workflows will be used to coordinate how the integration flows are executed: Workflow 1 will collect newly uploaded documents from Wheel Dealers’ FTP server and download them to the local network, whereas Workflow 2 will dynamically pick up the input files and feed them to the Dataflow we have designed. The Job Scheduler will be used to schedule Workflow runs.

Workflow 1

Workflow 1 will consist of a main Workflow and another inner Workflow. The former’s job will be to fetch new files from the FTP server, whereas the latter will download this new data to the local directory while deleting older files to ensure that only fresh partner files are downloaded during each run.

Inner Workflow

  1. First, we will create a file path variable using the Variables object from the Toolbox. This will be used to specify the path of the files coming in from the FTP server.

  2. The File Transfer object in the Workflow Tasks section of the Toolbox will be used to download and delete the files. Two objects, one for each purpose, will be added to the Workflow designer.

  • File Transfer Task 1: Downloading files.

  • File Transfer Task 2: Deleting files.

  • Connect the File Transfer objects to run sequentially, so that the older files are deleted once the new files have been downloaded.

  • Map the file path variable onto the RemotePath input element of both File Transfer Task objects. As a result, this is what the final inner Workflow should look like:
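
For reference, here is a minimal ftplib sketch of the inner Workflow's two tasks, assuming the delete applies to the remote copy once it has been downloaded; paths are hypothetical.

```python
import os
from ftplib import FTP

def transfer_and_purge(ftp: FTP, remote_path: str, local_dir: str) -> None:
    """Download one remote file, then delete it so only fresh files remain."""
    local_path = os.path.join(local_dir, os.path.basename(remote_path))
    with open(local_path, "wb") as f:
        ftp.retrbinary(f"RETR {remote_path}", f.write)  # File Transfer Task 1
    ftp.delete(remote_path)                             # File Transfer Task 2
```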

Main Workflow

  1. First, we will use the List FTP Directory Contents object, present in the Sources section of the Toolbox, to extract data from the FTP server. The object will be used to specify the FTP Connection and the remote path where our required files are present.

Notice the S on the top-left of the object header. This denotes ‘Singleton’, which means that the object will only work in a single instance and will thus only retrieve one file from the FTP server. However, our purpose is to retrieve all of the new files present. For this to happen, right-click on the object header and select Loop from the context menu.

A Loop icon will now appear on the top-left of the object header.

  2. The Expression object will be used to concatenate the file names coming in from the FTP server with the path of the directory, thus creating a complete path for each file in every iteration of the loop.

  • First, map the FileName element from the List FTP Directory Contents object to the FileName element in the Expression object.

  • A new element by the name of FilePath will be created in the Expression object. Using the Expression Builder in the Layout Builder screen, concatenate the directory path with the file name. This will dynamically update the file path as new files are fetched from the FTP server.

  3. The next step is to load the fetched files into the inner Workflow so that they may be downloaded or deleted. For this, we will be using the Run Workflow object from the Workflow Tasks section of the Toolbox.

  • Configure the Run Workflow object by providing the file Path for the inner Workflow.

  • The FP element in the Run Workflow object denotes the file path variable that we had defined in the inner Workflow. Map the FilePath element from the Expression object onto FP in the Run Workflow object.

In the inner Workflow, this variable has been mapped onto two File Transfer objects. As a result, their respective actions will take place on the files that are being fetched from the FTP server in a loop.

This is what the final main Workflow should look like:
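
Tying the three steps together, here is a sketch of the main Workflow's loop, reusing the transfer_and_purge helper from the inner-Workflow sketch; the host, credentials, and directories are hypothetical.

```python
from ftplib import FTP

REMOTE_DIR = "/partner-uploads"            # hypothetical FTP directory
LOCAL_DIR = "C:/WheelDealers/incoming"     # hypothetical local drop folder

with FTP("ftp.example.com") as ftp:        # hypothetical host
    ftp.login("user", "password")          # hypothetical credentials
    ftp.cwd(REMOTE_DIR)
    # Loop: process every file in the listing, not just a single instance
    for file_name in ftp.nlst():
        # Expression step: concatenate directory path and file name
        file_path = f"{REMOTE_DIR}/{file_name}"
        # Run Workflow step: pass the path to the inner Workflow's FP variable
        transfer_and_purge(ftp, file_path, LOCAL_DIR)
```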

Workflow 2

Workflow 2 will feed the downloaded partner files to the integration flow that we had designed earlier for cleansing, validating, and consolidating data.

  1. The Context Information object in the Resources section of the Toolbox will be used to dynamically provide our Dataflow with the dropped file path, i.e., the complete path of each file that is downloaded from the FTP server and dropped into a particular folder.

This object, paired with the Job Scheduler, will automate the process of running this Workflow whenever a new partner file is downloaded.

  2. The Run Dataflow object in the Workflow Tasks section of the Toolbox will be used to run our Dataflow whenever a new partner file is downloaded.

  • Configure the Run Dataflow object by providing the file Path to our Dataflow.

  • The FilePath input element in the Run Dataflow object denotes the file path variable that we had defined in the Dataflow. Map the DroppedFilePath parameter from the Context Information object onto FilePath in the Run Dataflow object. As a result, this is what Workflow 2 should look like:

The FilePath variable has been used to dynamically provide the source object in our Dataflow with a file path. Consequently, whenever a new file is downloaded from the FTP server, a new file path will be fed to the source object through this mapping.
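
Astera's file-drop trigger handles this natively (see the Job Scheduler below). As a plain-Python illustration, a polling loop over the drop folder might look like this; the path and the run_dataflow stub are hypothetical.

```python
import os
import time

WATCH_DIR = "C:/WheelDealers/incoming"    # hypothetical drop folder

def run_dataflow(file_path: str) -> None:
    # Stand-in for the Run Dataflow task: FilePath drives the Excel source
    print(f"Running dataflow for {file_path}")

seen = set(os.listdir(WATCH_DIR))
while True:
    for new_file in set(os.listdir(WATCH_DIR)) - seen:
        # Context Information: the dropped file's complete path
        run_dataflow(os.path.join(WATCH_DIR, new_file))
        seen.add(new_file)
    time.sleep(10)                        # poll the folder every ten seconds
```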

Job Scheduler

As mentioned earlier, Wheel Dealers wants to automate its data integration process wherever possible. The Job Scheduler in Astera Data Stack can be used to schedule any job to run at specific intervals or whenever a file is dropped into a folder, without any manual intervention. In this case, we will be making two schedules, one for each Workflow.

For detailed guidelines on how to open and use the Job Scheduler, click here.

Scheduling Workflow 1

For Workflow 1, we will assume that partner files come in twice a week, on Mondays and Wednesdays. Thus, Workflow 1 will be scheduled to run every Monday and Wednesday.

Scheduling Workflow 2

The files being fetched from the FTP server are added to a specific folder. Therefore, we will schedule Workflow 2 to run whenever a new file is dropped into that folder.
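
Expressed outside Astera, the two schedules correspond to a weekly timer and the file-drop watcher sketched earlier. Here is a sketch using the third-party schedule package purely for illustration; the 06:00 run time is an assumption.

```python
import time

import schedule  # third-party "schedule" package, illustrative only

def run_workflow_1() -> None:
    """Stand-in for triggering Workflow 1 on the Astera server."""
    print("Fetching new partner files from the FTP server")

# Partner files arrive twice a week, so run on Mondays and Wednesdays
schedule.every().monday.at("06:00").do(run_workflow_1)
schedule.every().wednesday.at("06:00").do(run_workflow_1)

while True:
    schedule.run_pending()
    time.sleep(60)
```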

Everything is now in place, and Wheel Dealers can rely on this automated ETL Process to handle their Partner Onboarding needs. With Astera Data Stack, most of the tasks associated with Partner Onboarding can be automated, significantly cutting down on the time and resources required.

This concludes creating and orchestrating an ETL process for Partner Onboarding in Astera Data Stack.
