Joining Data
Joins allow you to combine tables side-by-side and require a key column that is present in both tables. This column is used to match rows between tables.
In the following example, we will join two tables. Table 1 contains employee information, while Table 2 contains payroll information.
Table 1 containing employee information.
Table 2 containing payroll information.
Prerequisite
-
An Execution Profile with an active session linked to the workflow
Steps
-
Produce two Data Frame nodes by importing two data tables.
-
From the Data Preparation group of the Nodes tabbed page, drag and drop the Join node from the Palette to the Workflow Canvas. The Join node has two input sockets and one output socket. Connect the output socket of each of the Data Frame nodes produced in Step 1 to the input sockets of the Join node.
-
Configure the Join node by double-clicking on the node or using the Open option provided in the node menu.
-
Specify the following settings in the Join Node Viewer:
-
Standard Joins, including Inner Join, Left Outer Join, Right Outer Join, or Full Outer Join
-
Filtering Joins, including Left Semi Join, Left Anti Join, Right Semi Join, or Right Anti Join.
-
Check the code that will be executed for your specified Join configuration by saving your current configuration and then clicking on the Code tab of the Join Node Viewer. You can use the Code Editor to refine the code further.
-
Complete the Join Node configuration by clicking Save. To cancel your changes and return to the Workflow Canvas, press Discard or simply close the Join Node Viewer. To execute the Join, click the Run button.
Property |
Description |
Input Properties |
Table Names - The name of the tables to be joined. This field is automatically populated from the Data Frame node and cannot be edited. |
Output Properties |
Table - The table name.
|
Join Type |
You can create:
Descriptions of these join types display when you select a join type. |
Key Columns |
A key column is a column found in both tables containing values to be used to match records for the joined table. Select a common column between your data inputs
by clicking the Key
When a key column is selected from the Columns to Include section of the Configuration panel, it is added to the Key Columns section and grayed out in the Columns to Include section. |
Columns to Include |
Specify which columns from the data inputs should be included in the join output. A search option is also provided to address inputs with numerous columns.
|
Included Column Names |
Set prefixes or suffixes for columns in your data inputs if you wish. These prefixes/suffixes will display in the data output.
|
When you have finished configuring your join table, a preview of your data displays in the right-hand side of the Node Viewer.