split
Split a data extract into multiple extracts.
This guide explains how to configure the split instruction. To learn about another instruction, go back to the Available custom script instructions overview.
Introduction
Use the split instruction to split the data extract into multiple extracts based on given criteria. You can split a data extract using a given number of rows or based on distinct values in a given column.
Creating a custom script transformation using the split instruction
To create and configure a custom script using the split instruction, follow these steps:
- In the Instructions step, select the split instruction.
- To configure the custom script instruction, fill in the following fields. Required fields are marked with an asterisk (*).
- Key*: Enter the name of the column to search for distinct values. The transformation splits the data extract into a new extract for each unique value in the column.
- Transformers: Enter the names of the transformations to apply to the new data extracts created after the split transformation is complete. For more information, see Using the split custom script instruction with other instructions.
- Tags: Enter the names of the tags to apply to the data extracts created after the split transformation is complete.
- Update Metadata From Key: Select this field to add the key to the metadata of the new data extracts created.
- Name Pattern: Enter a name pattern from which the names of the split data extracts are created. You can use the following placeholders in the name pattern:
  - {datastream_slug}: This is an identifier for the datastream that combines the data source, the ID of the datastream, and the name assigned to the datastream.
  - {id}: This is the ID of the datastream.
  - {app_label}: This is the data source.
  - {split_counter}: This is a count of how many data extracts have been created. Counting starts at 0.
  - {key}: This is the key used to split the data extract.
  - {uuid}: This is the unique file ID.
  - {meta[value]}: Use a value from the data extract metadata.
- Rowcount: Enter the number of rows that each split data extract will contain. For example, enter 10 to split a data extract of 120 rows into twelve separate data extracts, each containing 10 rows.
- Subtable: Enter the name for a subtable that you want to use within this custom script.
A subtable is a temporary table that only exists for this custom script. You can apply additional instructions within the same custom script to the subtable. However, the subtable cannot be used in any other custom scripts.
If a subtable does not exist for the current custom script, the transformation is applied to the data extract, and the enriched data is output into the subtable. If the subtable already exists for the custom script, the subtable is used as the input for the transformation and optionally as the output.
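The Name Pattern and Rowcount fields described above can be illustrated with a short Python sketch. This is not the product's implementation: the functions `render_name` and `split_by_rowcount`, the sample values, and the metadata key `region` are all hypothetical. It assumes that the placeholders behave like Python `str.format` fields, which they resemble, and that a final chunk may be shorter when the row total is not a multiple of the row count.

```python
def render_name(pattern, **values):
    """Fill a name pattern, treating placeholders as str.format fields (an assumption)."""
    return pattern.format(**values)


def split_by_rowcount(rows, rowcount):
    """Cut a list of rows into consecutive chunks of at most `rowcount` rows."""
    if rowcount < 1:
        raise ValueError("rowcount must be a positive integer")
    return [rows[i:i + rowcount] for i in range(0, len(rows), rowcount)]


# Hypothetical extract of 120 rows, split with Rowcount = 10.
rows = [{"Clicks": n} for n in range(120)]
chunks = split_by_rowcount(rows, 10)

# Hypothetical placeholder values; only the placeholder names come from the guide.
names = [
    render_name(
        "{app_label}_{id}_{split_counter}_{meta[region]}",
        app_label="example_source",
        id=42,
        split_counter=counter,  # counting starts at 0
        meta={"region": "emea"},
    )
    for counter in range(len(chunks))
]

print(len(chunks), len(chunks[0]))  # 12 10
print(names[0])                     # example_source_42_0_emea
print(names[-1])                    # example_source_42_11_emea
```

In this sketch, `{meta[region]}` looks up the `region` entry of the metadata dictionary, mirroring the `{meta[value]}` placeholder.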
Example
Transformation configuration
- Key: Campaign
- Name Pattern: {key}_{split_counter}
Data table before transformation
Campaign | Ad Group | Clicks |
---|---|---|
Brand | media | 7 |
Brand | ecommerce | 3 |
Brand | festivals | 18 |
Dashboard | ecommerce | 4 |
Dashboard | media\|social | 5 |
Dashboard | media | 11 |
Data table after transformation
Table 1
This table is called Campaign_0.
Campaign | Ad Group | Clicks |
---|---|---|
Brand | media | 7 |
Brand | ecommerce | 3 |
Brand | festivals | 18 |
Table 2
This table is called Campaign_1.
Campaign | Ad Group | Clicks |
---|---|---|
Dashboard | ecommerce | 4 |
Dashboard | media\|social | 5 |
Dashboard | media | 11 |
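The example above can be reproduced with a minimal Python sketch of splitting by distinct key values. The function `split_by_key` is hypothetical and not part of the product. The sketch assumes that extracts are numbered in the order in which each distinct value first appears, and that `{key}` resolves to the column name, as the table names in the example suggest.

```python
def split_by_key(rows, key, name_pattern):
    """Group rows by the distinct values in column `key` and name each group."""
    groups = {}
    for row in rows:
        # dicts preserve insertion order, so groups follow first appearance.
        groups.setdefault(row[key], []).append(row)
    return {
        name_pattern.format(key=key, split_counter=counter): group
        for counter, group in enumerate(groups.values())
    }


rows = [
    {"Campaign": "Brand", "Ad Group": "media", "Clicks": 7},
    {"Campaign": "Brand", "Ad Group": "ecommerce", "Clicks": 3},
    {"Campaign": "Brand", "Ad Group": "festivals", "Clicks": 18},
    {"Campaign": "Dashboard", "Ad Group": "ecommerce", "Clicks": 4},
    {"Campaign": "Dashboard", "Ad Group": "media|social", "Clicks": 5},
    {"Campaign": "Dashboard", "Ad Group": "media", "Clicks": 11},
]

extracts = split_by_key(rows, "Campaign", "{key}_{split_counter}")
for name, table in extracts.items():
    print(name, [row["Clicks"] for row in table])
# Campaign_0 [7, 3, 18]
# Campaign_1 [4, 5, 11]
```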
Using the split custom script instruction with other instructions
To apply multiple custom script instructions, including the split instruction, to a single datastream, make sure that the instructions are performed in the correct order. The following steps ensure that your transformations are applied correctly.
If you want to transform your data extract and apply the split instruction last:
- If you have multiple instructions in a custom script transformation, make sure that the split instruction is performed last (at the bottom of the list of instructions).
- If you have multiple transformations assigned to your datastream, make sure that the transformation containing the split instruction is performed last (at the bottom of the list of transformations).
If you want to apply the split instruction and then apply other transformations:
- In the Transformers field of the instruction configuration step as described above, enter the name of the transformation you want to perform after the split instruction.
The name of the transformation entered in this field must exactly match the name of an existing transformation.
Click + to add more transformations in the Transformers field. These transformations will be applied in the order they are added.
If you add a transformation in the Transformers field, you do not need to assign this transformation to the datastream.
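The ordering rule for the Transformers field can be pictured with a small Python sketch. All names here (`apply_transformers`, `drop_zero_clicks`, `add_total`) are hypothetical stand-ins, not product features. The sketch only shows the idea that each split extract passes through the listed transformations in the order they were added.

```python
def apply_transformers(extracts, transformers):
    """Run each transformation, in list order, on every split extract."""
    result = []
    for extract in extracts:
        for transform in transformers:
            extract = transform(extract)
        result.append(extract)
    return result


# Hypothetical transformations standing in for named transformation scripts.
def drop_zero_clicks(rows):
    return [row for row in rows if row["Clicks"] > 0]


def add_total(rows):
    total = sum(row["Clicks"] for row in rows)
    return [{**row, "Total": total} for row in rows]


# Two hypothetical extracts produced by an earlier split.
extracts = [
    [{"Clicks": 7}, {"Clicks": 0}],
    [{"Clicks": 4}, {"Clicks": 5}],
]

transformed = apply_transformers(extracts, [drop_zero_clicks, add_total])
print(transformed[0])  # [{'Clicks': 7, 'Total': 7}]
```

Because `add_total` runs after `drop_zero_clicks`, it only sees the rows that survived the first transformation. Reversing the list would run the two steps in the opposite order.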