split

Split a data extract into multiple extracts.

This guide explains how to configure the split instruction. To learn about another instruction, go back to the Available custom script instructions overview.

Introduction

Use the split instruction to split a data extract into multiple extracts. You can split a data extract either into chunks of a given number of rows or by the distinct values in a given column.

Creating a custom script enrichment using the split instruction

To create and configure a custom script using the split instruction, follow these steps:

  1. Create a custom script enrichment.

  2. In the Instructions step, select the split instruction.

  3. To configure the custom script instruction, fill in the following fields. Required fields are marked with an asterisk (*).

Key*

Enter the name of the column whose distinct values determine the split. The enrichment creates a new data extract for each unique value in the column.

Transformers

Enter the names of the enrichments to apply to the new data extracts created by the split. For more information, see Using the split custom script instruction with other instructions.

Tags

Enter the tags to apply to the data extracts created by the split.

Update Metadata From Key

Select this field to add the key to the metadata of each new data extract.

Name Pattern

Enter a name pattern from which the names of the split data extracts are created. You can use the following placeholders in the name pattern:

  • {datastream_slug} - This is an identifier for the datastream that combines the data source, the ID of the datastream and the name assigned to the datastream.

  • {id} - This is the ID of the datastream.

  • {app_label} - This is the data source.

  • {split_counter} - This is a count of how many data extracts have been created. Counting starts at 0.

  • {key} - This is the key used to split the data extract.

  • {uuid} - This is the unique file ID.

  • {meta[value]} - This is a value taken from the data extract metadata.
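
The placeholders above happen to follow the same syntax as Python's str.format, so the way a pattern expands can be illustrated with a short sketch. This is only an illustration, not the platform's implementation: the helper render_name, the metadata field region, and all sample values are hypothetical.

```python
def render_name(pattern: str, **values) -> str:
    """Fill a name pattern using Python's str.format placeholder syntax."""
    return pattern.format(**values)


# Hypothetical values for a single split data extract.
example = render_name(
    "{app_label}_{id}_{key}_{split_counter}_{meta[region]}",
    app_label="google_ads",
    id=42,
    key="Campaign",
    split_counter=0,           # counting starts at 0
    meta={"region": "emea"},   # assumed metadata field, for illustration only
)
print(example)  # google_ads_42_Campaign_0_emea
```

Here {meta[region]} looks up the region entry of the metadata, in the same way that {meta[value]} refers to a metadata field of your choice.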

Rowcount

Enter the number of rows that each split data extract will contain. For example, enter 10 to split a data extract of 120 rows into twelve separate data extracts, each containing 10 rows.
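
The arithmetic of a row-count split can be sketched in plain Python. This is a minimal illustration under assumptions, not the platform's code: split_by_rowcount is a hypothetical helper, and the handling of a final, shorter chunk when the row total is not a multiple of the row count is an assumption of this sketch.

```python
def split_by_rowcount(rows, rowcount):
    """Return consecutive chunks of at most `rowcount` rows each."""
    if rowcount < 1:
        raise ValueError("rowcount must be a positive integer")
    return [rows[i:i + rowcount] for i in range(0, len(rows), rowcount)]


# 120 placeholder rows, split into chunks of 10 rows.
rows = [{"row": n} for n in range(120)]
chunks = split_by_rowcount(rows, 10)
print(len(chunks))     # 12
print(len(chunks[0]))  # 10
```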

Subtable

Enter the name for a subtable that you want to use within this custom script.

A subtable is a temporary table that only exists for this custom script. You can apply additional instructions within the same custom script to the subtable. However, the subtable cannot be used in any other custom scripts.

If a subtable does not exist for the current custom script, the enrichment is applied to the data extract, and the enriched data is output into the subtable. If the subtable already exists for the custom script, the subtable is used as the input for the enrichment and optionally as the output.

Example

Enrichment configuration

Key            Campaign
Name Pattern   {key}_{split_counter}

Data table before enrichment

Campaign    Ad Group       Clicks
Brand       media          7
Brand       ecommerce      3
Brand       festivals      18
Dashboard   ecommerce      4
Dashboard   media|social   5
Dashboard   media          11

Data table after enrichment

Table 1

This table is called Campaign_0.

Campaign    Ad Group       Clicks
Brand       media          7
Brand       ecommerce      3
Brand       festivals      18

Table 2

This table is called Campaign_1.

Campaign    Ad Group       Clicks
Dashboard   ecommerce      4
Dashboard   media|social   5
Dashboard   media          11
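
The example above can be reproduced with a short Python sketch that groups rows by the distinct values in the Campaign column and names each group with the {key}_{split_counter} pattern. This is a conceptual model only: split_by_key is a hypothetical helper, and numbering the groups in order of first appearance is an assumption made so that the names match the tables above.

```python
def split_by_key(rows, key, name_pattern="{key}_{split_counter}"):
    """Group rows by the distinct values in column `key`, in order of first appearance."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    # As in the example above, {key} expands to the column name
    # and {split_counter} starts at 0.
    return {
        name_pattern.format(key=key, split_counter=counter): group
        for counter, group in enumerate(groups.values())
    }


rows = [
    {"Campaign": "Brand", "Ad Group": "media", "Clicks": 7},
    {"Campaign": "Brand", "Ad Group": "ecommerce", "Clicks": 3},
    {"Campaign": "Brand", "Ad Group": "festivals", "Clicks": 18},
    {"Campaign": "Dashboard", "Ad Group": "ecommerce", "Clicks": 4},
    {"Campaign": "Dashboard", "Ad Group": "media|social", "Clicks": 5},
    {"Campaign": "Dashboard", "Ad Group": "media", "Clicks": 11},
]

extracts = split_by_key(rows, "Campaign")
for name, group in extracts.items():
    print(name, [row["Ad Group"] for row in group])
# Campaign_0 ['media', 'ecommerce', 'festivals']
# Campaign_1 ['ecommerce', 'media|social', 'media']
```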

Using the split custom script instruction with other instructions

To apply multiple custom script instructions, including the split instruction, to a single datastream, make sure that the enrichment instructions are performed in the correct order. Follow the steps below that match your use case.

If you want to enrich your data extract and apply the split instruction last:

  • If you have multiple instructions in a custom script enrichment, make sure that the split instruction is performed last (at the bottom of the list of instructions).

  • If you have multiple enrichments assigned to your datastream, make sure that the enrichment containing the split instruction is performed last (at the bottom of the list of enrichments).

If you want to apply the split instruction and then apply other enrichments:

  • In the Transformers field described above, enter the name of the enrichment you want to perform after the split instruction.

    The name of the enrichment entered in this field must exactly match the name of an existing enrichment.

    Click + to add more enrichments in the Transformers field. These enrichments are applied in the order in which they are added.

    If you add an enrichment in the Transformers field, you do not need to assign this enrichment to the datastream.
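
Why the order of the Transformers entries matters can be shown with a small conceptual sketch. It models each enrichment as a plain Python function applied to every split extract in list order. The names apply_transformers, double_clicks, and keep_over_ten are hypothetical and do not correspond to real enrichments.

```python
def apply_transformers(extracts, transformers):
    """Apply each transformer, in list order, to every split extract."""
    result = {}
    for name, rows in extracts.items():
        for transform in transformers:
            rows = transform(rows)
        result[name] = rows
    return result


def double_clicks(rows):
    # Hypothetical enrichment: multiply Clicks by two.
    return [{**row, "Clicks": row["Clicks"] * 2} for row in rows]


def keep_over_ten(rows):
    # Hypothetical enrichment: keep only rows with more than 10 clicks.
    return [row for row in rows if row["Clicks"] > 10]


extracts = {"Campaign_0": [{"Clicks": 7}, {"Clicks": 3}, {"Clicks": 18}]}

first = apply_transformers(extracts, [double_clicks, keep_over_ten])
second = apply_transformers(extracts, [keep_over_ten, double_clicks])
print([row["Clicks"] for row in first["Campaign_0"]])   # [14, 36]
print([row["Clicks"] for row in second["Campaign_0"]])  # [36]
```

The same two steps produce different results depending on their order, which is why the order of entries in the Transformers field should be chosen deliberately.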