Astera Data Stack allows its users to perform lineage and impact analysis on certain data items (tables and fields). Lineage traces the origin of a data item and the transformations it has undergone. Impact analysis, on the other hand, shows how the data within an item is consumed, used, and modified by the pipelines that process it. It also shows which data items or pipelines (i.e., workflows, dataflows, schedules, or deployments) are affected by a change to the item, which helps you understand the potential risks and dependencies associated with that change.
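As a conceptual illustration only, the two analyses can be thought of as walking a directed graph of data flow in opposite directions: lineage follows edges upstream from an item, and impact follows them downstream. The sketch below is not how Astera Data Stack implements the feature; the item names in `EDGES` and the functions `lineage` and `impact` are hypothetical.

```python
from collections import deque

# Hypothetical edges: data flows from the first item to the second.
EDGES = [
    ("Orders.csv", "CleanOrders (dataflow)"),
    ("CleanOrders (dataflow)", "dbo.Orders"),
    ("dbo.Orders", "SalesReport (dataflow)"),
    ("SalesReport (dataflow)", "dbo.SalesSummary"),
]


def _reach(start, neighbors):
    """Breadth-first walk; returns every item reachable from start, in visit order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def lineage(item, edges=EDGES):
    """Items that feed into `item`, directly or indirectly (upstream walk)."""
    upstream = {}
    for src, dst in edges:
        upstream.setdefault(dst, []).append(src)
    return _reach(item, upstream)


def impact(item, edges=EDGES):
    """Items that consume `item`, directly or indirectly (downstream walk)."""
    downstream = {}
    for src, dst in edges:
        downstream.setdefault(src, []).append(dst)
    return _reach(item, downstream)


print(lineage("dbo.Orders"))
print(impact("dbo.Orders"))
```

For the hypothetical table `dbo.Orders`, the lineage walk returns the dataflow and file that populate it, while the impact walk returns the dataflow and table that depend on it.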
To generate lineage and impact graphs for data items that are used or processed within dataflows, those dataflows must be part of a project that is deployed on the Astera Integration Server. To learn how you can deploy a project, click here.
The next step is to generate lineage and impact for all the active deployments on the server.
Go to the Server Explorer. If you can’t see this panel, go to View > Server Explorer in the main menu.
On the Server Explorer panel, right-click on the server name and select Generate Lineage and Impact from the context menu.
When you select this option, the product will automatically generate lineage and impact for the items being processed/used in all the projects that have been deployed on the server.
The Data Source Browser can be used to access all the tables (and their fields) from a particular database. To learn about how you can use the Data Source Browser, click here.
To display a lineage graph for a particular table or field from within the Data Source Browser, right-click on that table/field and select Show Lineage from the context menu.
Similarly, to display an impact graph for a particular table or field from within the Data Source Browser, right-click on that table/field and select Show Impact from the context menu.
A relevant lineage/impact graph will be displayed within the client application when you select one of these options. However, for that to happen, the table/field that you’ve chosen should have been used in at least one of the deployed projects for which you’ve already generated lineage and impact analysis (refer to the previous step).
There are two types of views available for lineage and impact: the Graph View and the Grid View.
The Graph View presents the generated lineage or impact as a diagram. Here is a sample lineage graph for a database table, as displayed in the client application. It lets you trace the origins of the destination database table for which the lineage has been generated.
Let’s explore the different options provided in the ‘Graph View’.
The ‘Show Level’ option can be used to show different levels of detail of lineage and impact analysis. For instance, when the Level is changed from ‘All’ to ‘1’, a less detailed view of the lineage and impact analysis is shown in the Graph View, showing only the most immediate transformations affecting the destination.
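One way to picture the level setting is as a depth limit on an upstream walk: level ‘1’ keeps only the items that feed the destination directly, while ‘All’ keeps walking until no further sources are found. The following is a minimal sketch under that assumption, not the product's actual logic; the `UPSTREAM` map and the function `lineage_up_to_level` are hypothetical.

```python
# Hypothetical upstream map: each item lists the items that feed it directly.
UPSTREAM = {
    "dbo.SalesSummary": ["Aggregate"],
    "Aggregate": ["Join"],
    "Join": ["dbo.Orders", "dbo.Customers"],
}


def lineage_up_to_level(item, level=None, upstream=UPSTREAM):
    """Return {upstream item: level at which it was found}.

    level=None plays the role of 'All'; level=1 keeps only direct sources.
    """
    found = {}
    frontier = [item]
    depth = 0
    while frontier and (level is None or depth < level):
        depth += 1
        next_frontier = []
        for node in frontier:
            for parent in upstream.get(node, []):
                if parent not in found:
                    found[parent] = depth
                    next_frontier.append(parent)
        frontier = next_frontier
    return found


print(lineage_up_to_level("dbo.SalesSummary", level=1))
print(lineage_up_to_level("dbo.SalesSummary"))
```

With the hypothetical data above, level 1 yields only the `Aggregate` step, and the unrestricted call also yields the `Join` step and both source tables.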
The ‘Apply Filters’ options provide multiple filters that can be applied to customize the Lineage and Impact analysis view. The filter options provided include:
· Set Transformations: Show/Hide set level transformations in the Graph View
· All Transformations: Show/Hide all transformation items in the Graph View
· Paths: Show/Hide full file path of items
· Action Name: Show/Hide action item names
· Action Alias: Show/Hide alternate names for action items
· Server and Database Information: Show/Hide Database information
· Server Information: Show/Hide Server information
The ‘Export Graph to Image’ option allows you to save your lineage or impact analysis graph as an image file at the desired location.
Similarly, here’s a sample impact graph for the same database table. It shows which components are impacted if the database table for which the impact has been generated is changed or modified.
You can switch from the Graph View to Grid View using the tab at the top left of the graph.
The Grid View shows all the details of the impacted documents for the lineage and impact of a table or field in a tabular format.
Here’s the Grid View for the lineage of a database table. It shows a table of all the flows, deployments, and schedules that affect the database table destination for which the lineage has been generated.
The ‘Dependency level’ option allows you to specify the level of dependency for flows in the lineage and impact analysis grid. For instance, here we have changed the Dependency level from ‘1’ to ‘All’ to show all the dependencies within the lineage analysis grid. The Grid View now shows a more detailed listing that includes the details of the parent items.
Similarly, here’s a sample impact grid view for the same table. It shows a list of all the flows, deployments and schedules that will be impacted if the database table destination for which the impact has been generated is changed/modified.
Again, the Dependency level has been changed from ‘1’ to ‘All’ to view all levels of dependencies within the impact grid view.
Within the Grid View, you can see a list of all the impacted items, the parent items for each of the listed items, the path of the deployed items, and a list of all the deployments and schedules that are impacted by the data item.
You can also export the Grid View table to an Excel sheet using the ‘Export Data to Excel’ option provided.
This concludes our discussion on how to generate graphs for lineage and impact analysis in Astera Data Stack.
Astera Data Stack gives its users the option to generate technical and business documentation for data models (.mdl files). The resulting document (.html file) provides some details on the data model(s), including an overview of the underlying database, the tables/entities in each model, the columns in these entities, the relationships between these entities, etc.
When you generate documentation for a project, the document contains information on all of the data models present within the project.
In the top-most toolbar of the Project Explorer tab, there’s an option titled Generate Technical Documentation on the right side.
Note: The Project Explorer tab will automatically appear on the screen when you open an existing project or create a new one. If you can’t see the Project Explorer tab, click on View > Project Explorer in the main menu.
When you expand the Generate Technical Documentation option by clicking on the downward-facing arrow, you’ll see the following sub-options: Generate Technical Documentation and Generate Business Documentation.
You can choose the appropriate option here, based on the kind of documentation you want to generate. Please refer to the next section of this article to learn about the difference between technical and business documentation.
When you click on either of these two sub-options, the Browse For Folder window will allow you to choose a location for the generated document.
Once you’ve chosen a location, click OK. A new folder titled ‘BusinessDocument’ or ‘TechnicalDocument’ will be generated in the chosen directory, depending on the kind of documentation you’ve generated. From this folder, you can access the HTML document and its contents.
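If you want to locate the output from a script afterwards, a minimal sketch such as the one below could list the HTML files in the generated folder. The folder names come from the step above; the function name `find_generated_html` is hypothetical, and the names of the HTML files inside the folder are not assumed, so the sketch simply searches for any `.html` file.

```python
from pathlib import Path

# Folder names created by the product, as described above.
FOLDER_NAMES = {"business": "BusinessDocument", "technical": "TechnicalDocument"}


def find_generated_html(output_dir, kind):
    """Return the .html files under the generated documentation folder.

    output_dir is the location chosen in the Browse For Folder window;
    kind is "business" or "technical". Returns an empty list if the
    folder has not been generated yet.
    """
    folder = Path(output_dir) / FOLDER_NAMES[kind]
    if not folder.is_dir():
        return []
    return sorted(folder.rglob("*.html"))
```

For example, `find_generated_html(chosen_location, "technical")` returns the paths of the HTML files under the ‘TechnicalDocument’ folder in the location you chose.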
Here’s a look at the generated document:
On the left, you can see an expandable node representing each data model in the project. The entities in each model are arranged based on the type and schema they lie under, which forms a tree-layout under each node.
Each of the links provided in this layout, whether for the model itself or for an entity within the model, is clickable and leads to a detailed overview of the item you’ve clicked on. For example, here’s what appears when you click on a model (in this case, DW_SaleNOrder, which is a dimensional model containing two facts and four dimensions):
Similarly, here’s what appears when you click on an entity in the list (in this case, City, which is a dimension entity):
You can also generate documentation for deployed models via the Data Model Deployment tab. To access this tab, go to Server > ADM Deployment in the main menu.
To learn about data model deployment, click here.
In this tab, you’ll notice the same option that we saw in the Project Explorer.
From here, you can generate a technical or business document for a deployed data model.
Business documentation provides only the information that is relevant to business users. For an entity, this includes basic details like database information, object overview, columns and properties, etc.
Technical documentation is designed for technical users and goes a bit deeper in terms of details for each entity. These details include the child and parent relationships of an entity, and the indexes created within an entity. This extra information can be very useful when trying to understand a data model and its components from a logical and technical perspective.
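The difference can be summarized as a small lookup of which details each document type covers for an entity. This sketch assumes that a technical document contains everything in a business document plus the extra details named above; the section labels and the function `entity_sections` are illustrative, and the lists are not exhaustive.

```python
# Details named above for business documentation (not an exhaustive list).
BUSINESS_SECTIONS = [
    "database information",
    "object overview",
    "columns and properties",
]

# Extra details named above for technical documentation.
TECHNICAL_EXTRAS = [
    "child relationships",
    "parent relationships",
    "indexes",
]


def entity_sections(kind):
    """Return the entity details covered by the given documentation type.

    Assumption: technical documentation is a superset of business
    documentation.
    """
    if kind == "business":
        return list(BUSINESS_SECTIONS)
    if kind == "technical":
        return BUSINESS_SECTIONS + TECHNICAL_EXTRAS
    raise ValueError(f"unknown documentation type: {kind!r}")


print(entity_sections("technical"))
```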
Here’s what the entity information looks like in a technical document:
This concludes our discussion on generating technical and business documentation for data models in Astera Data Stack.