Using Python expressions to process JSON columns#
This guide explains how to transform JSON columns in a data extract.
Introduction#
Use Python expressions in custom scripts to transform your data extract. For a list of all available custom script transformations, see Available custom script instructions.
When using Python expressions in custom scripts, you must follow certain rules - these rules are covered in more detail in Using Python expressions in custom scripts.
JSON elements#
It is important to understand elements of JSON structure to extract data from it. JSON organizes data in the following structures:
- Key-value pairs
Key-value pairs are a fundamental element of JSON. They define the values that are stored in the JSON. Keys and values are separated by a colon, and key-value pairs are separated by a comma.
"key": "value",
- Objects
JSON objects consist of multiple key-value pairs. The objects are surrounded by curly brackets.
"object_name": {"key1": "value1", "key2": "value2", "key3": "value3"}
- Arrays
JSON arrays can consist of simple values or have objects nested in them. In order to access parts of the array, you need to specify the position within the array you want to access, the first one being 0. JSON arrays are surrounded by square brackets.
"array_name": [ { "key1": "value1", "key2": "value2" }, "value3" ]
Extracting a value from a JSON key-value pair#
To extract data from a simple JSON key-value pair which is not nested in an object or array, enter the following Python expression into the transformation:
__import__('json').loads({json_column_name})['target_key']
To configure the Python expression, change the following parameters:
json_column_name
- This is the name of the column that contains JSON data.target_key
- This is the name of the key for which you want to extract the value.
Extracting from a nested array or object in JSON#
To extract values from a key-value pair nested in an array or an object from a JSON column, specify the whole tree pattern under which the respective value is nested. Enter the following Python expression into the transformation:
__import__('json').loads({json_column_name})['array_name'][position_in_array]['object_name']['target_key']
To configure the Python expression, change the following parameters:
json_column_name
- This is the name of the column that contains JSON data.array_name
- (Optional) This is the name of the array that contains the value you want to extract. When extracting a value from an array, specify theposition_in_array
index of the value. The count starts at 0.object_name
- (Optional) This is the name of the JSON object in that contains the value you want to extract.target_key
- This is the name of the key for which you want to extract the value.
Examples#
JSON column content:
{
"line_items":
[
{
"id": 9212925837502,
"price_set":
{
"shop_money":
{
"amount": "69.00",
"currency_code": "USD"
}
}
},
{
"presentment_money":
{
"amount": "70.00",
"currency_code": "USD"
}
}
]
}
Transformation examples:
To extract the amount
from the price_set
object, use the following Python expression:
__import__('json').loads({json_column_name})['line_items'][0]['price_set']['shop_money']['amount']
To extract the amount
from the presentment_money
object, use the following Python expression:
__import__('json').loads({json_column_name})['line_items'][1]['presentment_money']['amount']