parquet

Load data from a Parquet file.

This guide explains how to configure the parquet instruction. To learn about another instruction, go back to the Available custom script instructions overview.

Introduction

Use the parquet instruction to load data from a Parquet file. Parquet is a popular columnar storage file format used by tools in the Hadoop ecosystem, such as Pig, Spark, and Hive.
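To illustrate what column-oriented storage means, here is a minimal sketch in plain Python. It is not the actual Parquet on-disk layout (which also involves row groups, encodings, and compression), and the sample records and the `to_columns` helper are hypothetical names made up for this illustration. It only shows the basic idea that the values of one column are stored together.

```python
# Hypothetical sample records, stored row by row.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 12.5},
    {"id": 3, "city": "Oslo", "amount": 7.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column touches only that column's values,
# not every field of every row.
total = sum(columns["amount"])
print(columns["city"])
print(total)
```

Grouping values by column is what lets column-oriented formats read only the columns a query needs.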

Creating a custom script enrichment using the parquet instruction

To create and configure a custom script using the parquet instruction, follow these steps:

  1. Create a custom script enrichment.

  2. In the Instructions step, select the parquet instruction.

  3. To configure the custom script instruction, fill in the following fields. Required fields are marked with an asterisk (*).

Subtable

Enter the name of the subtable that you want to use within this custom script.

A subtable is a temporary table that only exists for this custom script. You can apply additional instructions within the same custom script to the subtable. However, the subtable cannot be used in any other custom scripts.

If the subtable does not yet exist in the current custom script, the enrichment is applied to the data extract, and the enriched data is written to the subtable. If the subtable already exists in the custom script, it is used as the input for the enrichment and, optionally, as the output.
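The input and output rule for subtables can be sketched in plain Python. This is only an illustration of the behavior described above, not the product's implementation: the `run_instruction` function, its parameters, and the `write_back` flag (standing in for the "optionally as the output" case) are all assumptions made for this example.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = List[Row]


def run_instruction(
    subtables: Dict[str, Table],
    data_extract: Table,
    subtable_name: str,
    enrich: Callable[[Table], Table],
    write_back: bool = True,
) -> Table:
    """Apply `enrich` following the subtable rule (hypothetical helper)."""
    if subtable_name not in subtables:
        # Subtable does not exist yet: the data extract is the input,
        # and the enriched data is always written to the new subtable.
        result = enrich(data_extract)
        subtables[subtable_name] = result
        return result

    # Subtable already exists: it is the input, and optionally the output.
    result = enrich(subtables[subtable_name])
    if write_back:
        subtables[subtable_name] = result
    return result


# Subtables live only for the duration of one custom script.
script_subtables: Dict[str, Table] = {}
extract = [{"id": 1}, {"id": 2}]

# First instruction: creates the subtable "tmp" from the data extract.
run_instruction(script_subtables, extract, "tmp",
                lambda rows: [dict(r, seen=True) for r in rows])

# Second instruction: reads from and writes back to the existing subtable.
run_instruction(script_subtables, extract, "tmp",
                lambda rows: [r for r in rows if r["id"] > 1])
```

Because `script_subtables` is created per script in this sketch, nothing stored in it is visible to other custom scripts, which mirrors the scoping described above.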