Excel Workbook Source
Overview
The Excel File Source object in Astera supports all formats of Excel. In this article, we will be discussing:
Various ways to get the Excel Workbook Source object on the dataflow designer.
Configuring the Excel Workbook Source object according to our required layout and settings.
Video
Getting Excel Workbook Source Object
In this section, we will cover the various ways to get an Excel Workbook Source object on the dataflow designer.
From the Toolbox
To get an Excel File Source object from the Toolbox, go to Toolbox > Sources > Excel Workbook Source. If you are unable to see the Toolbox, go to View > Toolbox or press Ctrl + Alt + X.
Drag-and-drop the Excel Workbook Source object onto the designer.
You can see that the dragged source object is empty right now. This is because we have not configured the object yet. We will discuss the configuration properties for the Excel Workbook Source object in the next section.
From the Project Explorer
If you already have a project defined and excel source files are a part of that project, you can directly drag-and-drop the excel file sources from the project tree onto the dataflow designer. The Excel File Source objects in this case will already be configured. Astera Data Stack detects the connectivity and layout information from the source file itself.
Note: In this case we are using an excel file with Customers data. The file is a part of an existing project folder.
To get an Excel File Source object from the Project Explorer, go to the Project Explorer window and expand the project tree.
Select the Excel file you want to bring in as the source and drag-and-drop it on the designer. In this case, we are working with Customers -Excel Source.xls file so we will drag-and-drop it onto the designer.
If you expand the dropped object, you will see that the layout for the source file is already built. You can even preview the output at this stage.
From the File Location
To get an Excel Workbook Source directly from the file location, open the folder containing the Excel file.
Drag-and-drop the Excel file from the folder onto the designer in Astera.
If you expand the dropped object, you will see that the layout for the source file is already built. You can even preview the output at this stage.
Configuring the Excel Workbook Source Object
To configure the Excel Workbook Source object, right-click on its header and select Properties from the context menu.
As soon as you have selected the Properties option from the context menu, a dialog box will open.
This is where you can configure your properties for the Excel Workbook Source object.
The first step is to provide the File Path for the excel source. By providing the file path, you are building the connectivity to the source dataset.
Note: In this case we are going to be using an excel file with sample Customers data.
The dialog box has some other configuration options:
If your source file contains headers and you want your Astera source layout to read headers from the source file, check the First Row Contains Header box.
If you have blank rows in your file, you can use the Consecutive Blank Rows to Indicate End of File option to specify the number of blank rows that will indicate the end of the file.
Use the Worksheet option to specify if you want to read data from a specific worksheet in your excel file.
In the Start Address option, you can indicate the cell value from where you want Astera to start reading the data.
Check the Make All Fields As String In Build Layout option when you want to set all the fields in the layout builder as String.
Dynamic Layout
Checking the Dynamic Layout option enables the two following options, Delete Fields in Subsequent Objects, and Add Fields in Subsequent Objects. These options can also be unchecked by users.
Add Fields in Subsequent Objects: Checking this option ensures that a field is definitely added in subsequent objects in a flow in case additional fields are manually added in the source database by the user or need to be added into the source database.
Delete Fields in Subsequent Objects: Checking this option ensures that a field is definitely deleted from subsequent objects in a flow in case of deletion from the source database.
Advanced File Options
In the Header spans over option, give the number of rows that your header takes. Refer to this option when your header spans over multiple rows.
Check the Enforce exact header match option if you want the header to be read as it is.
Check the Column order in file may be different from the layout option if the field order in your source layout is different from the field order in Astera layout.
Check on Column headers in file may be different from the layout if you want to use alternate header values for your fields. The Layout Builder lets you specify alternate header values for the fields in the layout.
Check the Use SmartMatch with Synonym Dictionary option when the header values vary in the source layout and Astera layout. You can create a Synonym Dictionary file to store the values for alternate headers. You can also use Synonym Dictionary file to facilitate automapping between objects that use alternate names in field layouts.
String Processing
String processing options come in use when you are reading data from a file system and writing it to a database destination.
Check the Treat empty string as null value option when you have empty cells in the source file and want those to be treated as null objects in the database destination that you are writing to, otherwise Astera will omit those accordingly in the output.
Check the Trim strings option when you want to omit any extra spaces in the field value.
Once you have specified the data reading options on this screen, click Next.
The next window is the Layout Builder. On this window, you can modify the layout of your Excel source file.
If you want to add a new field to your layout, go to the last row of your layout (Name column) and double-click on it. A blinking text cursor will appear. Type in the name of the field you want to add and select subsequent properties for it. A new field will be added to the source layout.
If you want to delete a field from your dataset, click on the serial column of the row that you want to delete. The selected row will be highlighted in blue.
Right-click on the highlighted line and select Delete from the context menu.
This will Delete the entire row from the layout.
Note: Modifying the layout (adding or deleting fields) in the Layout Builder window in Astera will not make any changes to the actual source file. The layout is specific to Astera only.
If you want to change the position of any field and want to move it below or above another field in the layout, you can do this by selecting the row and using Move up/move down keys.
Note: You will find the Move up/Move down icons on the top left of the Layout Builder.
For example: To move the Country field right below the Region field, we will select the row and use the Move up key to move this field from the 9th row to the 8th.
Other options that the Layout Builder provides:
Column Name
Description
Alternate Header
Assigns an alternate header value to the field.
Data Type
Specifies the data type of a field, such as Integer, Real, String, Date, or Boolean.
Allows Null
Controls whether the field allows blank or NULL values in it.
Output
The output checkbox allows you to choose whether or not you want to enable data from a particular field to flow through further in the dataflow pipeline.
Calculation
Defines functions through expressions for any field in your data.
After you are done customizing the Object Builder, click Next. You will be taken to a new window,Config Parameters. Here, you can further configure and define parameters for the Excel source.
Parameters can provide easier deployment of flows by eliminating hardcoded values and provide an easier way of changing multiple configurations with a simple value change.
Note: Parameters left blank will use their default values assigned on the properties page.
Once you have been through all the configuration options, click OK.
The ExcelSource object is now configured according to the changes made.
You have successfully configured your Excel Workbook Source object. The fields from the source object can now be mapped to other objects in the dataflow.
Last updated