Join Using a Source

In this document, you’ll learn how to use the Join function in Astera Dataprep to combine a dataset from a project source with an existing dataset in your Dataprep Recipe.

Use Case

In this use case, we have a Dataprep Recipe in which a company’s Customers dataset has already been cleansed. The company now wants to join it with its Orders dataset, which is available in a shared project source.

  1. To begin, click the Join option in the toolbar and select Source from the drop-down.

  2. Alternatively, you can drag and drop the project source from the Data Source Browser panel onto the Join object in the Recipe canvas.

  3. Either action opens the Recipe Configuration – Join panel.

  4. In this panel, configure the following options:

  • Filter Source: Choose the type of source you want to use. You can filter by type or simply select All.

  • Shared Source: From the drop-down, select the project source dataset you want to join.

  • Join Dataset: You can provide a custom name for the joined dataset or keep the default name. In this example, we’ll keep the default name.

  • Join Type: Choose the type of join you want to perform:

    • Inner: Keeps only the records that have matching values in both datasets.

    • Left Outer: Keeps all records from the current dataset and adds matching data from the project source. Where a current-dataset record has no match, the project source fields are filled with nulls.

    • Right Outer: Keeps all records from the project source and adds matching data from the current dataset. Where a project source record has no match, the current-dataset fields are filled with nulls.

    • Full Outer: Keeps all records from both datasets. Where a record has no match in the other dataset, the other dataset’s fields are filled with nulls.

    In our example, we’ll use an Inner join to include only matching records.

  • Keys: Specify the key fields that the join will be based on. Astera will auto-detect matching fields, but you can modify them as needed.

    • Left Field: Field from the current dataset.

    • Right Field: Field from the project source dataset.

    In this case, we’ll keep the default key fields selected.

  5. Once you’re done, click Apply. The project source dataset is joined with the current dataset, and the result appears in the grid.
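
To make the difference between the four join types concrete, here is a minimal sketch in plain Python. It is not Astera's implementation or API: the `join_rows` function, the field names, and the sample Customers and Orders records are all hypothetical and made up for illustration. It also assumes that the two datasets share no field names, so the key fields are named differently on each side.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def join_rows(left: List[Row], right: List[Row],
              left_key: str, right_key: str, how: str = "inner") -> List[Row]:
    """Join two lists of records on left_key == right_key (illustrative only)."""
    if how not in ("inner", "left", "right", "full"):
        raise ValueError(f"unsupported join type: {how}")

    # Collect field names so unmatched records can be padded with None (null).
    left_cols = list(dict.fromkeys(c for row in left for c in row))
    right_cols = list(dict.fromkeys(c for row in right for c in row))

    # Index the right-hand records by their key value.
    right_index: Dict[Any, List[int]] = {}
    for i, r in enumerate(right):
        right_index.setdefault(r[right_key], []).append(i)

    result: List[Row] = []
    matched_right = set()
    for l in left:
        hits = right_index.get(l[left_key], [])
        for i in hits:
            matched_right.add(i)
            result.append({**l, **right[i]})
        # Left and full outer joins keep unmatched left records.
        if not hits and how in ("left", "full"):
            result.append({**l, **{c: None for c in right_cols}})

    # Right and full outer joins keep unmatched right records.
    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                result.append({**{c: None for c in left_cols}, **r})
    return result


# Hypothetical sample data: the "current dataset" and the "project source".
customers = [
    {"CustomerID": 1, "Name": "Ana"},
    {"CustomerID": 2, "Name": "Ben"},
    {"CustomerID": 3, "Name": "Cy"},
]
orders = [
    {"OrderID": 100, "OrderCustomerID": 1},
    {"OrderID": 101, "OrderCustomerID": 1},
    {"OrderID": 102, "OrderCustomerID": 4},
]

for how in ("inner", "left", "right", "full"):
    rows = join_rows(customers, orders, "CustomerID", "OrderCustomerID", how)
    print(how, len(rows))
# inner 2, left 4, right 3, full 5
```

With this sample data, the Inner join returns only the two orders placed by customer 1, while the Left Outer join also keeps Ben and Cy with null order fields. The Right Outer join instead keeps order 102 with null customer fields, and the Full Outer join keeps all of these.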
