Loading...
Loading...
Loading...
Loading...
Loading...
In Astera Data Stack, you can deploy or publish a data model to the server for data consumption. Once a model has been deployed, it is available to be used in the creation of ETL pipelines via dataflows or for data visualization and analysis through industry-leading visualization tools such as Power BI, and Domo using a built-in OData service.
Before you can deploy data models in Astera Data Stack, you need to specify the deployment and staging directories in the cluster settings.
To access the cluster settings, right-click on the default cluster node in the Server Explorer and select Cluster Settings from the context menu.
A new tab titled Cluster Settings will open in the client.
Here, you can provide folder paths for the staging and deployment directories.
Save your directories by pressing CTRL+S. Alternatively, you can right-click on the Cluster Settings tab and select the appropriate option.
There is a Deploy Data Model option on the data model toolbar.
Before you can deploy a model, you need to make sure that it does not contain any verification errors. To learn how you can verify a data model, click here.
Once the model has been verified successfully, click on the Deploy Data Model option. Alternatively, you can right-click on the data model from within the Project Explorer and select Quick Deploy from the context menu.
If you haven’t saved the model beforehand, a pop-up window will open and give you the option to save it.
Click Yes. The Data Model Deployment window will appear.
The following options are available on this window:
Database Name: In this text box, you can define a name for the virtual database representing the deployed data model.
Source Config Path: Here, you can provide the path to a config file that can be used to modify data model parameters.
Once you’ve filled these boxes, click OK.
The Job Progress window will show the deployment being processed at the server.
Note: The archive file for direct deployment is generated automatically.
To access the ADM Deployments tab, go to Server > ADM Deployments.
A new tab for Data Model Deployments will appear on the client.
Here, you can view all of your deployed data models and add new deployments as well.
Before adding a new deployment, you will have to generate an archive file for the data model manually. To do that, right-click on the data model in the Project Explorer and select Build Archive (*.Car) for the item.
A pop-up window will allow you to save the archive file at your specified location.
To add a new deployment in the Data Model Deployments, click on the Add Deployment icon.
Once a new deployment has been added to the list, you must provide the following details:
Database Name: A name for the virtual database representing the deployed data model.
Source Archive Path: The path to the archive file for the model.
Data Model: From this dropdown menu, you must select the data model that is being deployed.
Once you’ve filled in all of these details, click on the Save Deployment icon to save the deployment.
The deployment will be processed at the server, as indicated by the Job Progress window.
This concludes our discussion on deploying a data model.
The Validate Metadata and Data Integrity option in the Astera Data Stack allows you to check and verify the validity of the data present in dimension and fact tables. It gives users the option to check if these tables contain any invalid or ‘bad’ data. For instance, in a dataset containing employee records, the effective and expiration date ranges for two records of the same employee may be overlapping with each other. Moreover, the expiration date for a record could lie before its effective date, or two records of the same employee could be marked as ‘Active’ at the same time.
All of these examples constitute data that does not make logical sense. Hence, it is considered invalid or ‘bad’. The purpose of the Validate Metadata and Data Integrity option is to notify the user that these issues exist in the data present within the entities of a deployed dimensional model.
In this article, we’ll learn how to use this option. Additionally, we’ll take a look at some basic examples of its usage.
Before you can use the Validate Metadata and Data Integrity option on the entities of a dimensional model, you need to deploy the model. To learn about data model deployment, click here.
For this demonstration, we’ll deploy the following model:
You can view your deployments within the Data Source Bowser by establishing a connection with the Astera Data Model data provider. To learn about opening and using the Data Source Browser, click here.
Right-click on the deployment in the Data Model Browser and select the Validate Metadata and Data Integrity option from the context menu.
A pop-up window titled Model Validation Options will appear on your screen.
This window contains the following options:
Minimum Effective Date: The minimum date value that would be considered a valid entry in the Effective Date record identifier column/field.
Maximum Expiry Date: The maximum date value that would be considered a valid entry in the Expiration Date record designator column/field.
Message Level: This dropdown menu allows you choose the message level that will appear in the Job Progress window when validating metadata and data integrity. There are two options:
Failure Only: Will show only the messages that indicated failure in validating data.
All: Will show all the messages.
Check Referential Integrity: When this option is checked, it verifies whether the parent table in an entity relationship contains a primary key corresponding to the foreign key in the child table.
Once you’ve made the appropriate selections from this window, click OK.
You’ll be able to see the data validation process take place within the Job Progress window.
In this case, the data within this deployment does not contain any validation errors.
Now, let’s take a look at a few common validation errors that you may encounter.
In this section, we’ll examine the following dimensional model with different validation errors.
In this case, the Employee table contains two or more active records for one employee. More specifically, two or more records for the same employee have been marked as ‘Active’ in the Current Record Designator field.
This is the error message that appears in the Job Progress window when you verify metadata and data integrity for this deployment:
In this case, there is an overlap between the effective and expiration date ranges for two records of the same employee in the Employee table.
This is the error message that appears in the Job Progress window when you verify metadata and data integrity for this deployment:
In this case, two records of the same employee in the Employee table have an identical expiration date.
Here, we’ve selected All under the Message Level option in the Model Validation Options window.
These are the messages that appears in the Job Progress window when you verify metadata and data integrity for this deployment:
This concludes our discussion on using the Validate Metadata and Data Integrity option in Astera Data Stack.
A deployed data model is available as a virtual database that can be used in ETL pipelines. In Astera Data Stack, you can use the tables in these virtual databases as sources or destinations in dataflows, which represent the ETL or ELT functionality of the product. To learn how you can get started with dataflows, click here.
The dataflow toolbox contains an extensive variety of objects that can be used as part of an ETL pipeline via the drag and drop interface of the dataflow designer. These include sources, destinations, transformations, and numerous other object types.
You’ll notice a section titled Data Warehouse in the toolbox. This contains the Fact Loader and Dimension Loader objects, which can be used to load data into fact and dimension tables, respectively. We will discuss these objects in further detail at a later stage in the article.
Here is an example of a simple ETL pipeline designed in a dataflow:
In this dataflow, we’ve done the following:
Extracted data from a database table using the Database Table Source object.
Transformed the incoming dataset by filtering out certain records using the Filter Transformation object.
Loaded the transformed dataset into an Excel worksheet using the Excel Workbook Destination object.
Multiple objects in the dataflow toolbox can be used to access data models that have been deployed on the server. These include:
Database Table Source
Data Model Query Source
SQL Query Source
Fact Loader
Dimension Loader
Of these, the first three can be used as source objects and the other two can be used as destination objects. However, the Database Connection screen in the Properties configuration window is common in all of these objects. Let’s take a look at this Database Table Source object as an example.
To access the properties of an object, right-click on the object header and select Properties from the context menu.
A configuration window will appear on your screen.
The Database Connection screen on this window can be used to access the virtual databases representing the deployed data models.
The Data Provider dropdown menu features a comprehensive list of data providers available from within Astera DWB. Among these, you’ll notice an option by the name of Astera Data Model.
On selecting the Astera Data Model option, a list of required credentials will appear on the screen.
User: Here, you must provide your Astera DWB username. The default username is admin.
Password: Here, you must provide your Astera DWB password. The default password is Admin123.
Server: Here, you must provide the server connection in the given format. This is the server where you’ve deployed your data models.
Once you’ve filled in the required credentials, the Database dropdown menu will provide a list of deployments that are running on the server.
Choose the deployment that you wish to access. In this case, we’ll select Demo_Source. You can test the connection by clicking on the Test icon.
A pop-up window will show you the test results.
This Database Table Source object now has access to the tables present in Demo_Source. Similarly, you can connect to a deployed data model from the Database Connection screen in the Data Model Query Source, SQL Query Source, Fact Loader, and Dimension Loader objects.
Click Next to move to the Pick Source Table and Reading Option screen.
This screen is specific to the Database Table Source object and will vary according to the object you’re using.
Here, you can pick a table from the virtual database representing the data model.
For reference, here is a look at the Demo_Source data model.
In this case, we’ll select the Customers table from the dropdown menu.
Click OK to close the configuration window. You’ll notice that the layout of the Customers table is now visible in the Database Table Source object.
You’ve successfully accessed an Astera Data Model table in a dataflow and can use it as part of an ETL pipeline.
As mentioned earlier, the Fact Loader and Dimension Loader objects can be used as destination objects to load data into fact and dimension tables, respectively. These tables are represented by fact and dimension entities in a dimensional model, which denotes the schema of a data warehouse. Collectively, the Fact Loader and Dimension Loader objects are used to populate a data warehouse.
You can use the Database Connection screen in the properties of both objects to access the virtual databases representing deployed data models, in the same way as we saw earlier in the Database Table Source object. Once you’ve done so, click next to move to the Pick Table screen, where you can pick a table from the virtual database.
Let’s assume that we’ve deployed the following dimensional model:
The Sale entity represents the fact table, and the rest of the entities represent dimension tables.
On expanding the Pick Table dropdown menu, you’ll notice that it only shows tables that are marked as facts in the dimensional model. In this case, the only table which fits that criterion is the Sale table.
Click Next to move to the Layout Builder screen. Here, you can view the layout of the fact table.
Click OK to the close the configuration window. The layout will now be visible in the Fact Loader object.
You can now make the required mappings from a relevant source object and run the dataflow in pushdown mode to populate the fact table.
The Pick Table dropdown menu in the Dimension Loader object only shows tables that are marked as dimensions in the dimensional model.
In this case, we’ll select the Customer table.
Click Next to move to the Layout Builder screen.
Here, you can view the layout of the dimension table.
Click OK to close the configuration window. The layout will now be visible in the Dimension Loader object.
You can now make the required mappings from a relevant source object and run the dataflow to load data into the dimensional table.
This concludes our discussion on using Astera Data Models in ETL pipelines.
Astera Data Stack gives its users the functionality to access and use deployed data models in third-party visualization and analytics tools such as Power BI, Tableau, Domo, etc. This is enabled through a built-in OData module that offers seamless connectivity to any application that provides OData support. Once the connection is established, the data present within the data warehouse can be used to create comprehensive dashboards for visualization and analysis.
In this article, we’ll take a look at how you can connect to a deployed model in Microsoft Power BI via OData.
First, you need to obtain a security certificate (.pfx extension) that contains an encryption key and some other information needed to encrypt data. This certificate is specific to an organization’s domain and can be generated by the networking and security team.
Once you’ve obtained this certificate, paste the file to the installation directory of the Astera Integration Server.
Note: The path to the installation directory is usually C:\Program Files\Astera Software\Astera Integration Server 9.
Next, open the Astera Data Stack client application and open the Server Properties tab from here: Server Explorer > Server Connection Name > Server Properties.
In the Server Properties tab, go to the Certificate Settings section and enter the File Path (path to the .pfx file) and Password (the certificate password).
Once done, click on the Apply Settings button. Right-click on the Server Properties tab and click on Save and Close to save the changes and close the tab.
Next, you need import the security certificate to your local machine. Double-click on the .pfx file. An import wizard will appear on your screen.
On the first page, you can select the system area where the certificate is kept. Click Next.
Here, specify the .pfx file that your want to import. Click Next.
Here, enter the password for the private encryption key. You can get this password from the team that generated the certificate. Click Next.
Here, specify a location for the certificate by selecting Place all certificates in the following store and clicking on Browse. You can also opt for windows to automatically choose a location for your certificate by selecting the first option. Click Next.
Finally, click Finish to import the certificate.
The Windows hosts file is a system file that maps servers or hostnames to IP addresses. To access this file, go to the following path on your Windows machine:
C:\Windows\System32\drivers\etc
Open the hosts file and add the following line at the end:
127.0.0.1 [domain name]
Here, astcloudtest.astera.com represents the domain used to generate the security certificate. By adding this line, you have mapped the DNS resolution for your domain to your local host.
Once done, save and close the hosts file. At this point, you will have to restart the Astera Integration Server for all of these changes to take effect in Astera Data Stack. In Windows, you can access the service for the Astera Integration Server via this path:
C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Administrative Tools > Services > Astera Integration Server
Note: When you’re saving the hosts file, please ensure that the file extension has not changed.
The process we have followed up to this point is valid for any visualization tool that supports OData. From here onwards, we will take a look at how you can connect Power BI to the OData service for a deployed data model. To learn about data model deployment in Astera Data Stack, click here.
In Power BI, select the Get Data dropdown in the Home ribbon at the top.
In the dropdown menu, you’ll be able to see some common data sources. From this list, click on OData Feed.
A new window titled OData Feed will give you the option to add a URL for your desired OData feed.
Here, enter the URL in the following format:
[domain name]:9261/api/odata/[deployment name]/
Click OK to connect to the deployed model.
You can now use the data within these tables to create dashboards for visualization and analysis.
This concludes our discussion on connecting an Astera Data Model to a third-party visualization tool via OData.
Data models can be deployed to the server to provide accessibility to users. Depending on business needs such as controlled access of data, data consumption in ETL pipelines, analysis, etc. users may need to deploy data models separately. Astera Data Stack lets you deploy multiple subsets of a data model without creating separate data models and performing redundant tasks for each deployment using view-based deployments.
Views in a data model help you focus on a smaller group of the entities without considering the whole data model, thus enabling users to work incrementally. Deployment of these views lets us verify, deploy, and allow access to only those entities of the data model that are present in a view.
Consider an example of a dimensional model consisting of multiple fact tables. To work with fact tables individually, we can use views in the data model.
Create a View by clicking the Add New View icon located in the bottom left of the data model designer. A new view will be created.
Rename this View as per the requirements by either double-clicking it or by right-clicking the View tab and selecting the Rename View option from the context menu.
Now, add the desired entities to the view using any of the following methods:
We can add entities to the View directly from the main model. To do this, right-click the entity that must be added to the view, use the Add to View option in the context menu of entities and select the view.
We can also add entities to the View by selecting any entity from the data model browser and dragging-and-dropping it onto the designer. Press the shift key while dragging-and-dropping to ensure that the selected entity and all the entities that are connected to it are added to the View on the designer.
Having added the desired entities onto the designer, the Views can now be used for incremental designing.
After finalizing the Views, they can be deployed to the server.
To deploy a view, Use the Deploy Data Model icon from the toolbar or the short cut key, Shift+Alt+M.
A wizard will open. Here, name the deployment and select the Views that need to be deployed. Once done, click OK. If the data model is verified successfully, the Views will be deployed and available on the server.
Note: When deploying Views, each view will be verified for read and write deployment individually without taking into account other Views. Separate deployments for each View will also be created.