Join Datastream

This guide explains how to use the Join Datastream enrichment to combine two datastreams.

Introduction

The Join Datastream enrichment lets you include data from one datastream in another datastream. These datastreams must have at least one column in common, as this is the column by which the data is joined. This enrichment performs a left join: every row of the datastream to which you assign the enrichment is kept, and matching data from the selected datastream is added to it. To join datastreams in other ways, create a custom script enrichment that uses the join instruction.

In this process, no data is deleted from the datastreams. Data is copied from a selected datastream (the "right datastream") into the datastream to which you assign the enrichment (the "left datastream"). The original datastream remains unchanged.
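The left join behavior described above can be sketched in plain Python. This is an illustration of left join semantics only, not the enrichment's actual implementation. The function name left_join, the sample rows, and the column names ad_id, clicks, and campaign are hypothetical, and the sketch assumes that key values are unique in the right datastream and that unmatched rows receive an empty value (None).

```python
def left_join(left_rows, right_rows, key):
    """Return new rows: every left row, plus right-side columns where `key` matches."""
    # Index the right rows by key value.
    # Assumption: each key value appears at most once on the right.
    right_by_key = {row[key]: row for row in right_rows}

    # Columns that the right side contributes (everything except the join key).
    right_columns = []
    for row in right_rows:
        for column in row:
            if column != key and column not in right_columns:
                right_columns.append(column)

    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        merged = dict(row)  # copy, so the input rows are not modified
        for column in right_columns:
            # Assumption: rows without a match get an empty value (None).
            merged[column] = match.get(column) if match else None
        joined.append(merged)
    return joined


# Hypothetical sample data: "left" is the datastream the enrichment is assigned to.
left = [{"ad_id": "a1", "clicks": 10}, {"ad_id": "a2", "clicks": 4}]
right = [{"ad_id": "a1", "campaign": "Spring"}]

result = left_join(left, right, "ad_id")
```

In this sketch, the row with ad_id a2 has no match on the right, so it is kept and its campaign value is empty. No row is dropped from either input.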

Limitations

The Join Datastream enrichment joins data extracts from selected datastreams by including data from "Datastream B" (the datastream you select when configuring the enrichment) in "Datastream A" (the datastream to which you assign the enrichment). The destination settings of Datastream B determine which data extract is included in Datastream A.

  • If Datastream B does not have a destination enabled, then the latest data extract with a status of either Collected or Loaded can be joined into Datastream A.

  • If Datastream B does have a destination enabled, then only the latest data extract with a status of Loaded can be joined into Datastream A.

The latest data extract is determined based on the datastream_extract_scheduled metadata value of a data extract. For manual fetches, the time in datastream_extract_scheduled is set to 00:00:00. If both scheduled and manual fetches were performed on the same day, the data extract from the scheduled fetch will be taken as the latest data extract even if there is a more recent data extract from a manual fetch.
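The selection rules in this section can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the platform's actual logic: the function name latest_joinable_extract, the dictionary layout, the name field, and the sample dates are hypothetical. Only the status values Collected and Loaded and the datastream_extract_scheduled metadata value come from this guide.

```python
from datetime import datetime


def latest_joinable_extract(extracts, destination_enabled):
    """Pick the data extract of the selected datastream that would be joined."""
    # With a destination enabled, only Loaded extracts qualify.
    # Without a destination, Collected and Loaded extracts both qualify.
    allowed = {"Loaded"} if destination_enabled else {"Collected", "Loaded"}
    candidates = [e for e in extracts if e["status"] in allowed]
    if not candidates:
        return None
    # "Latest" is decided by datastream_extract_scheduled, not by when the
    # fetch actually ran.
    return max(candidates, key=lambda e: e["datastream_extract_scheduled"])


# Hypothetical extracts. The manual fetch ran in the afternoon of 1 May, but
# its datastream_extract_scheduled time is recorded as 00:00:00.
extracts = [
    {"name": "scheduled", "status": "Loaded",
     "datastream_extract_scheduled": datetime(2024, 5, 1, 6, 0, 0)},
    {"name": "manual", "status": "Loaded",
     "datastream_extract_scheduled": datetime(2024, 5, 1, 0, 0, 0)},
    {"name": "next_day", "status": "Collected",
     "datastream_extract_scheduled": datetime(2024, 5, 2, 6, 0, 0)},
]

with_destination = latest_joinable_extract(extracts, destination_enabled=True)
without_destination = latest_joinable_extract(extracts, destination_enabled=False)
```

With a destination enabled, the sketch selects the scheduled extract from 1 May, because the manual extract from the same day carries the time 00:00:00 and the extract from 2 May is only Collected. Without a destination, the Collected extract from 2 May is selected.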

Prerequisites

Before you complete the procedure in this guide, perform the following action:

  • Ensure both datastreams share at least one column with corresponding data. For example, both datastreams contain an Ad ID column or a Stock Keeping Unit (SKU) column.

Configuring the Join Datastream enrichment

This section explains how to configure the Join Datastream enrichment. Before you configure the enrichment, start creating an enrichment by following the instructions in Using standard enrichments. To configure this enrichment, follow these steps:

  1. In the Datastream to join drop-down menu, select the datastream that contains the columns to be joined to the datastream or datastreams to which you assigned the enrichment in the Assignments step.

  2. In the Select the common columns to join drop-down menus, select the columns that contain the corresponding data in both datastreams. These columns may have different names.

  3. In Select additional columns to include from {datastream-name}, choose which additional columns to join.

To preview your enrichment at this step, click the Table preview tab. Here, you can see how your current enrichment instructions will transform the data you have previously collected using this datastream.

If the preview shows that the enrichment is not working in the way you want it to, click the Instructions tab to adjust the enrichment settings.

Repeat these steps until the preview shows that the enrichment will have the effect you want.

For more information about the preview, see Previewing enrichments.

Example configuration

This example demonstrates how to join data from two specific datastreams.

This example assumes the following conditions:

  • The user has two Shopify datastreams. One datastream uses the Orders report type, and the other uses the Customers report type.

  • The user has fetched data using both datastreams and both data extracts contain an ID field.

    In the Orders datastream this is called id, and in the Customers datastream this is called last_order_id.

  • The user wants to add data from the Orders datastream to the Customers datastream. The data the user wants to include is the total cost of the customer's last order.

To join these datastreams as described above, the user must follow these steps:

  1. In Assign to, select Individual datastreams.

  2. Type in and select the Shopify datastream using the Customers report type.

  3. Click Next.

  4. In Datastream to join, type in and select the Shopify datastream using the Orders report type.

  5. Under Select the common columns to join, in the left drop-down menu, select last_order_id.

  6. In the right drop-down menu, select id.

  7. Under Select additional columns to include, in Select columns, select total_price.

As a result, the values in the id column of the data extract collected using the Orders report type are matched to the values in the last_order_id column of the data extract collected using the Customers report type. For each match, the total_price value from the Orders data is added as a new column to the data collected using the Customers report type.
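The outcome of this example can be sketched in plain Python. The column names last_order_id, id, and total_price come from the example. All rows, the customer and currency columns, and the empty value (None) for customers without a matching order are made up for illustration and are assumptions rather than documented behavior.

```python
# Hypothetical rows collected with the Customers report type (left datastream).
customers = [
    {"customer": "Ana", "last_order_id": 1001},
    {"customer": "Ben", "last_order_id": 1003},
    {"customer": "Cy", "last_order_id": None},  # no order placed yet
]

# Hypothetical rows collected with the Orders report type (right datastream).
orders = [
    {"id": 1001, "total_price": "25.00", "currency": "EUR"},
    {"id": 1002, "total_price": "12.50", "currency": "EUR"},
    {"id": 1003, "total_price": "99.90", "currency": "EUR"},
]

# The common columns have different names: last_order_id (left) and id (right).
left_key, right_key = "last_order_id", "id"
# Only the selected additional column is copied; currency is left out.
additional_columns = ["total_price"]

orders_by_id = {order[right_key]: order for order in orders}

enriched_customers = []
for customer in customers:
    order = orders_by_id.get(customer[left_key])
    row = dict(customer)  # keep every customer row, matched or not
    for column in additional_columns:
        # Assumption: customers without a matching order get an empty value.
        row[column] = order[column] if order else None
    enriched_customers.append(row)
```

Each customer row is kept and gains a total_price column holding the total of that customer's last order. Order 1002 is not the last order of any customer, so it does not appear in the result.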

Video guide: How to use the Join Datastream enrichment

This video guide explains how to create and configure a Join Datastream enrichment.