split
Split a data extract into multiple extracts.
This guide explains how to configure the split instruction. To learn about another instruction, go back to the Available custom script instructions overview.
Introduction
Use the split instruction to split the data extract into multiple extracts based on given criteria. You can split a data extract using a given number of rows or based on distinct values in a given column.
Creating a custom script enrichment using the split instruction
To create and configure a custom script using the split instruction, follow these steps:
- In the Instructions step, select the split instruction.
- To configure the custom script instruction, fill in the following fields. Required fields are marked with an asterisk (*).
  - Key*: Enter the name of the column to search for distinct values. The enrichment splits the data extract into a new extract for each unique value in the column.
  - Transformers: Enter the names of the enrichments to apply to the new data extracts created after the split enrichment is complete. For more information, see Using the split custom script instruction with other instructions.
  - Tags: Enter the names of the tags to apply to the data extracts created after the split enrichment is complete.
  - Update Metadata From Key: Select this field to add the key to the metadata of the new data extracts.
  - Name Pattern: Enter a name pattern from which the names of the split data extracts are created. You can use the following placeholders in the name pattern:
    - {datastream_slug}: This is an identifier for the datastream that combines the data source, the ID of the datastream and the name assigned to the datastream.
    - {id}: This is the ID of the datastream.
    - {app_label}: This is the data source.
    - {split_counter}: This is a count of how many data extracts have been created. Counting starts at 0.
    - {key}: This is the key used to split the data extract.
    - {uuid}: This is the unique file ID.
    - {meta[value]}: Use a value from the data extract metadata.
Rowcount
-
Enter the number of rows that each split data extract will contain. For example, enter
10
, to split a data extract of 120 rows into twelve separate data extracts, each containing 10 rows.
  - Subtable: Enter the name for a subtable that you want to use within this custom script.
    A subtable is a temporary table that only exists for this custom script. You can apply additional instructions within the same custom script to the subtable. However, the subtable cannot be used in any other custom scripts.
    If a subtable does not exist for the current custom script, the enrichment is applied to the data extract, and the enriched data is output into the subtable. If the subtable already exists for the custom script, the subtable is used as the input for the enrichment and optionally as the output.
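The Rowcount and Name Pattern fields can be illustrated with a small Python sketch. This is not the platform's implementation: the helper names `split_by_rowcount` and `render_name`, the metadata key `client`, and the use of Python's `str.format` to fill placeholders are all assumptions made for illustration only.

```python
from itertools import islice


def split_by_rowcount(rows, rowcount):
    """Yield consecutive chunks containing at most `rowcount` rows each."""
    if rowcount < 1:
        raise ValueError("rowcount must be at least 1")
    iterator = iter(rows)
    # islice takes the next `rowcount` rows; an empty list ends the loop.
    while chunk := list(islice(iterator, rowcount)):
        yield chunk


def render_name(pattern, split_counter, key="", meta=None):
    """Fill a name pattern (hypothetical helper, placeholders as in the list above)."""
    return pattern.format(key=key, split_counter=split_counter, meta=meta or {})


# A data extract of 120 rows split with Rowcount = 10.
rows = [{"row": i} for i in range(120)]
chunks = list(split_by_rowcount(rows, 10))

# Name each chunk; counting starts at 0, and {meta[client]} reads a metadata value.
names = [
    render_name("{meta[client]}_{split_counter}", counter, meta={"client": "acme"})
    for counter in range(len(chunks))
]

print(len(chunks))          # 12
print(names[0], names[-1])  # acme_0 acme_11
```

If the number of rows is not a multiple of the row count, this sketch leaves the remainder in a final, shorter chunk. How the product handles that case is not stated here.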
Example
Enrichment configuration
- Key: Campaign
- Name Pattern: {key}_{split_counter}
Data table before enrichment
Campaign | Ad Group | Clicks |
---|---|---|
Brand | media | 7 |
Brand | ecommerce | 3 |
Brand | festivals | 18 |
Dashboard | ecommerce | 4 |
Dashboard | media\|social | 5 |
Dashboard | media | 11 |
Data table after enrichment
Table 1
This table is called Campaign_0.
Campaign | Ad Group | Clicks |
---|---|---|
Brand | media | 7 |
Brand | ecommerce | 3 |
Brand | festivals | 18 |
Table 2
This table is called Campaign_1.
Campaign | Ad Group | Clicks |
---|---|---|
Dashboard | ecommerce | 4 |
Dashboard | media\|social | 5 |
Dashboard | media | 11 |
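The example above can be reproduced with a short Python sketch. It only illustrates the grouping logic and is not how the platform implements the instruction. The function name `split_by_key` is hypothetical, and the sketch follows the example in treating `{key}` as the name of the key column.

```python
def split_by_key(rows, key_column):
    """Group rows by the distinct values of `key_column`, in first-seen order."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key_column], []).append(row)
    return list(groups.values())


rows = [
    {"Campaign": "Brand", "Ad Group": "media", "Clicks": 7},
    {"Campaign": "Brand", "Ad Group": "ecommerce", "Clicks": 3},
    {"Campaign": "Brand", "Ad Group": "festivals", "Clicks": 18},
    {"Campaign": "Dashboard", "Ad Group": "ecommerce", "Clicks": 4},
    {"Campaign": "Dashboard", "Ad Group": "media|social", "Clicks": 5},
    {"Campaign": "Dashboard", "Ad Group": "media", "Clicks": 11},
]

name_pattern = "{key}_{split_counter}"

# One extract per distinct Campaign value, named from the pattern.
extracts = {
    name_pattern.format(key="Campaign", split_counter=counter): group
    for counter, group in enumerate(split_by_key(rows, "Campaign"))
}

for name, group in extracts.items():
    print(name, [row["Clicks"] for row in group])
# Campaign_0 [7, 3, 18]
# Campaign_1 [4, 5, 11]
```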
Using the split custom script instruction with other instructions
To apply multiple custom script instructions, including the split instruction, to a single datastream, you need to make sure that the enrichment instructions are performed in the correct order. The following steps will make sure that your enrichments are applied correctly.
If you want to enrich your data extract and apply the split instruction last:
- If you have multiple instructions in a custom script enrichment, make sure that the split instruction is performed last (at the bottom of the list of instructions).
- If you have multiple enrichments assigned to your datastream, make sure that the enrichment containing the split instruction is performed last (at the bottom of the list of enrichments).
If you want to apply the split instruction and then apply other enrichments:
- In the Transformers field of the instruction configuration step as described above, enter the name of the enrichment you want to perform after the split instruction.
  The name of the enrichment entered in this field must exactly match the name of an existing enrichment.
  Click + to add more enrichments in the Transformers field. These enrichments will be applied in the order they are added.
  If you add an enrichment in the Transformers field, you do not need to assign this enrichment to the datastream.
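Because the enrichments in the Transformers field are applied in the order they are added, the same enrichments in a different order can produce different results. The sketch below illustrates this with two made-up transformations. The names `apply_transformers`, `add_one` and `double` are hypothetical and do not correspond to real enrichments.

```python
def apply_transformers(extract, transformers):
    """Apply each transformer to the extract in the order given."""
    for transform in transformers:
        extract = transform(extract)
    return extract


def add_one(rows):
    # Hypothetical enrichment: add 1 to every Clicks value.
    return [{**row, "Clicks": row["Clicks"] + 1} for row in rows]


def double(rows):
    # Hypothetical enrichment: double every Clicks value.
    return [{**row, "Clicks": row["Clicks"] * 2} for row in rows]


split_extract = [{"Campaign": "Brand", "Clicks": 3}]

first = apply_transformers(split_extract, [add_one, double])
second = apply_transformers(split_extract, [double, add_one])

print(first[0]["Clicks"])   # 8, because (3 + 1) * 2
print(second[0]["Clicks"])  # 7, because (3 * 2) + 1
```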