This section talks about the various database write strategies offered within Astera.
Data Driven Write Strategy is a set level functionality, which means that the entire incoming dataset must flow through it. It allows a user to stamp a database directive on the record so that when it reaches its destination, it will be loaded according to that directive to perform the specified write action. You can specify multiple rules within the properties of a Data Driven write strategy object. These rules are tried against each record from top to bottom. If a record passes the first rule, it will not be tried against the next rule(s).
Assume a scenario in which Orders data from a Database Table Source is written to a Database Table Destination. We want to DELETE those records where ShippedDate is before the year 2000 and declare those records where Freight is less than 10 as ERROR. We will use the Data Driven write strategy object to achieve this task.
Drag-and-drop the Data Driven object from Toolbox > Database Write Strategy > Data Driven onto the dataflow designer and map the source data to the Data Driven object.
Right-click on the header of the Data Driven object and select Properties.
A Layout Builder window will open where you can modify your layout. Click Next.
The next window is the Data Driven Write Strategy Conditions window, where you can specify rules to route your data. Click on the fx option to enable the Expression Builder.
Once you select this option, the Expression Builder will be enabled.
Now, specify the following rules in the Expression Builder and select Database Action Type as ERROR for the Freight rule and DELETE for the Date rule.
Year(ShippedDate) < 2000
Freight < 10
There are five Database actions: Insert, Update, Delete, Skip and Error. From these, you can select the action you want to be taken for a certain rule.
Once you are done specifying the rule(s), click OK.
You can now write your data to a Database Table Destination.
It is important to note here that while working with Database Write Strategies, Single Port is selected. Once you check the Single Port option in the Database Table Destination object, a box will appear in which you will have to specify a field for matching database record. In our case, we will select OrderID.
We have successfully configured the settings and built the layout.
Let’s preview the output.
Data Driven output:
You can see that Astera has appended an error message with the records where Freight is less than 10. You can create an error log of these records or load them into a database if you want.
Note: Since, the Freight rule is our second rule, only those records which have failed against the first rule would be declared as an error. Which is why the error message only appears with records where both, ShippedDate is greater than 2000, and Freight is less than 10.
Now, whenever you access the same database table, you will see that the records where ShippedDate is before the year 2000 would be deleted.
This concludes using the Data Driven Write Strategy in Astera.
The Source Diff Processor object is one of the Database Write Strategies offered in Astera. It works like the Database Diff Processor, however, unlike the Database Diff Processor, it is used to perform write actions (such as Insert, Update and Delete) on file destinations. It stores a snapshot of your data processed in the first run in a CDC file. So, the next time you run it, it will only import the new records.
We have a sample Employees dataset coming in from an Excel Workbook Source. Initially, we had records of 10 employees but later on, 2 more were added in the source dataset. We wish to apply a database write strategy that can read the data incrementally from file sources. To achieve this, we will use the Source Diff Processor in Astera.
Drag-and-drop the Source Diff Processor object from Toolbox > Database Write Strategy > Source Diff Processor onto the dataflow designer and map the source data to it.
Right-click on the Source Diff Processor object’s header and select Properties.
A Layout Builder window will open where you can modify your layout. Click Next.
The next window is the Incremental Write Options window.
Here, you have to specify the Record Matching field. This field is used to match and compare the incoming and existing records. We will select EmployeeID as the Record Matching field.
Case Sensitive – Check this option if you want to compare records on a case sensitive basis.
Sort Input – Check this option if you want to sort the incoming data.
Now, if the incoming dataset has a new record with a new EmployeeID i.e. the ID is not present in the existing file which is being compared against the incoming file, Astera will perform the INSERT action.
If the EmployeeID is already present in the existing file, Astera will compare the records against that ID and will perform UPDATE action in the fields where the information has updated.
If the EmployeeID is there in the existing file, but not present in the incoming file, it means that the particular record has been deleted. In this case, Astera will perform the DELETE action.
In the Output Options section, you can either select the Single Output option or One Port for Each Action.
The Single Output option is selected if you wish to load your data into the destination without modifying it further on the basis of individual write actions. If you select Single Output, the database action such as INSERT, UPDATE, SKIP or ERROR will be chosen by the database write strategy’s logic rather than being specified by the user. Using a Single Output is recommended when a database write strategy is applied.
One Port for Each Action is used when you want to further transform or log your data. If you select One Port for Each Action, you will get separate nodes for each Diff action in the Source Diff Processor’s object.
In this example, we will select Single Output.
The third section in the Incremental Write Options window is the Incremental Transfer Information File Path option. Here, you must specify the file path where you want to store information related to the last run.
Now, if you have worked with Excel Workbook and Database table Sources in Astera, you would have noticed that the Database Table Source object gives you the option to read incremental changes. However, no such option is available in Excel or other file source objects. This option in the Source Diff Processor enables you to read incrementally from different file formats such as Excel, Delimited, and Fixed Length.
Click OK.
Now, right-click on the Source Diff Processor object’s header and select Preview Output.
Output preview for Single Output:
Output preview if One Port for Each Action is selected:
You can now write your data to any destination or perform any transformation on the dataset.
This concludes using the Source Diff Processor write strategy in Astera.
Database Diff Processor is one of the four Database Write Strategies offered in Astera. Its purpose is to synchronize the data present in two separate datasets. The object compares the two datasets and performs write actions (insert and update) on the destination table so that both tables contain the same information.
In this use case, we have a sample dataset of customers that is stored in a database table. Currently, this dataset contains 10 records, but two more customer records are to be added later on. Furthermore, updated phone numbers are to be added for two other customers.
We want to write the initial dataset to another database table and ensure that whenever the aforementioned changes are made, they are applied to both tables. To achieve this, we will be using the Database Diff Processor object in Astera.
1. Drag and drop the Database Table Source object from Toolbox > Sources > Database Table Source onto the dataflow designer. Configure this object so that it reads data from the source table.
To learn how you can configure a Database Table Source object, click here.
Drag-and-drop the Database Diff Processor object from Toolbox > Database Write Strategies > Database Diff Processor onto the dataflow designer. Auto-map all of the elements from the source object to the Database Diff Processor object.
Right-click on the header of the Database Diff Processor object and select Properties.
This will open the Database Connection window. Here, you will have to enter the credentials for the database you want to connect to. Alternatively, you can connect to a recently used database by selecting it from the Recently Used dropdown list.
In this case, we will connect to a test database that contains an empty Customers table.
Once you have entered the required credentials, click Next.
On the next window, you will have to pick a destination table, where the write actions (Insert and Update) will be performed.
In this case, we will pick the empty Customers table that has already been created in this database.
There are a couple of options related to Record Matching at the bottom of the screen:
Select a field to be used for matching records between the source table and the destination table. In this case, we will select CustomerID because it cannot change for each customer.
Check the Case Sensitive option if you want the comparison to be case sensitive. In this case, we will leave this option unchecked.
Now that the required table and Record Matching options have been selected, click OK to close the configuration window.
Run the dataflow to write your source data to the destination table. To preview the source data, right-click on the Database Table Source object and select Preview Output.
This is what the source data looks like:
To check whether this data has been written to the destination table, right-click on the Database Diff Processor object and go to Database Table > View Table Data.
The destination table will open in a separate tab within Astera.
The data present in the destination table is the same as that in the source table, showing that we have successfully written the data by running the dataflow.
The source dataset has been updated to include two more customer records. Moreover, two other customers have updated their phone numbers. This is what the source data looks like after the changes have been implemented:
Run the dataflow again to apply these changes to the destination table. This is what the destination table should look like when you open it in Astera after running the dataflow again:
The changes that were made to the source table have automatically been applied to the destination table as well, showing that the Database Diff Processor object has achieved its task.
This concludes using the Database Diff Processor write strategy in Astera.