parquet
Load data from a Parquet file.
This guide explains how to configure the parquet instruction. To learn about another instruction, go back to the Available custom script instructions overview.
Introduction
Use the parquet instruction to load data from a Parquet file. Parquet is a columnar storage file format that is widely used in the Hadoop ecosystem, for example by Pig, Spark, and Hive.
Creating a custom script transformation using the parquet instruction
To create and configure a custom script using the parquet instruction, follow these steps:
- In the Instructions step, select the parquet instruction.
- To configure the custom script instruction, fill in the following fields. Required fields are marked with an asterisk (*).
- Subtable: Enter the name of the subtable that you want to use within this custom script.
A subtable is a temporary table that exists only within this custom script. You can apply additional instructions in the same custom script to the subtable, but you cannot use the subtable in any other custom script.
If a subtable does not exist for the current custom script, the transformation is applied to the data extract, and the enriched data is output into the subtable. If the subtable already exists for the custom script, the subtable is used as the input for the transformation and optionally as the output.
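The subtable rule above can be illustrated with a minimal Python sketch. This is not the product's implementation: the function apply_instruction, the subtables dictionary, and the sample rows are hypothetical names chosen only to show which table serves as input and where the result is stored. As a simplification, the sketch always writes the result back to the subtable, whereas the guide describes that output as optional when the subtable already exists.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Table = List[Row]


def apply_instruction(
    data_extract: Table,
    subtables: Dict[str, Table],
    subtable_name: str,
    transform: Callable[[Table], Table],
) -> Table:
    """Hypothetical helper: apply `transform` using the subtable rule."""
    if subtable_name in subtables:
        # The subtable already exists in this script, so it is the input.
        source = subtables[subtable_name]
    else:
        # No subtable yet, so the data extract is the input.
        source = data_extract
    result = transform(source)
    # Simplification: the result is always stored in the subtable.
    subtables[subtable_name] = result
    return result


# Illustrative data only; these rows are not taken from the guide.
extract: Table = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]
subtables: Dict[str, Table] = {}

# First instruction: "orders" does not exist, so the extract is the input.
apply_instruction(
    extract, subtables, "orders",
    lambda rows: [r for r in rows if r["amount"] > 15],
)

# Second instruction: "orders" now exists, so it is the input.
apply_instruction(
    extract, subtables, "orders",
    lambda rows: [{**r, "flag": True} for r in rows],
)
```

In this sketch the second instruction sees only the rows kept by the first one, and the data extract itself is never modified.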