Working with data extracts
This guide explains how to view your data extracts and perform actions on them.
A data extract is a file that contains the data collected during a fetch. A fetch can create one or more data extracts. Data extracts are saved in CSV format.
After you have collected your data, you can view the data extract on the datastream overview page. The data extract shows the data after transformations have been applied. To view the data in its raw state, disable any transformations applied to the datastream and run the fetch again.
Viewing data extracts
Viewing the data extracts collected in a specific fetch
To view the data you have collected in a specific fetch, follow these steps:
-
Go to the Activity page.
-
In the All tasks tab, in the list of tasks, find the data fetch whose data you want to see. The task shows a preview of your data extract.
-
To view the full data extract, click on the task overview.
-
The task's details open, with the collected data displayed in a table.
Viewing all data extracts collected in a specific datastream
To view the data extracts collected in a specific datastream, follow these steps:
-
Go to the Datastreams page.
-
For the chosen datastream, click Data extracts.
The Data extracts tab contains a table of all the data extracts fetched using the chosen datastream. Here you can filter the data extracts by status and date, and perform actions on the selected data extracts. For more information about the actions you can perform on the data extracts, see Performing actions on data extracts.
-
To open a data extract, click on its name in the table.
Viewing all data extracts collected in your current workspace
To view all data extracts collected in the current workspace, follow these steps:
-
Go to the Activity page.
-
Below the page header, click All data extracts.
This page contains a table with all data extracts fetched in the current workspace and its child workspaces, if allowed. Here you can filter the data extracts and perform actions on the selected data extracts. For more information about the actions you can perform on the data extracts, see Performing actions on data extracts.
-
To open a data extract, click on its name in the table.
Understanding the header of a data extract
The header row of a data extract table shows the field names contained in the data extract. These names are displayed with a gray background.
If a source field is mapped to a target field, the name of the mapped target field is also displayed in the header row. Dimensions have a blue background and metrics have a green background.
In the header row, you can map an unmapped source field to a target field, or change the target field to which a source field is mapped. To do so, see Mapping source fields to target fields in the data extract preview.
Data extract statuses
When you view a data extract, the status of the data extract is displayed next to its name.
When viewing all data extracts in your current workspace, the table containing the data extracts displays the status of each data extract in the Status column.
The table below explains the meaning of each data extract status:
Status | Meaning |
---|---|
Collected | The data in this data extract has been successfully fetched from your data source. |
Loaded | The data in this data extract has been successfully loaded into the assigned external destinations and/or Adverity Data Storage. |
Raw | The data in this data extract is in its raw state. This means that it is identical to the data in your data source. |
Overwritten* | There is a newer version of the data in this data extract in another task. Restarting a task will overwrite the old version of the task. |
Processing | This task is currently in progress. |
No data collected | This task was completed successfully but there is no data in the data extract. |
Deleted | This data extract has been deleted or scheduled for deletion. |
Processing error | An error occurred when processing this task. |
Transformation error | An error occurred when enriching the data in this data extract. |
Load error | An error occurred when loading the data in this data extract into any external destinations and/or Adverity Data Storage. |
*The Overwritten status may be more frequent if you use the Unique by day setting in Local Data Retention. If Unique by day is enabled for your datastream, data from each day of the date range is entered into a separate data extract. If you refetch data for the same date, the data on the overlapping days is overwritten. To configure a Unique by day data collection, see Configuring unique by day data collection.
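If you track extract statuses outside the platform, for example in a spreadsheet export or your own monitoring notes, the table above can be captured as a small lookup. The following is a minimal sketch only: the `ExtractStatus` enum and the `needs_attention` helper are hypothetical names, not part of any Adverity API, and grouping the three error statuses together is an assumption about what you might want to review first.

```python
from enum import Enum


class ExtractStatus(Enum):
    """Status labels copied from the status table in this guide."""
    COLLECTED = "Collected"
    LOADED = "Loaded"
    RAW = "Raw"
    OVERWRITTEN = "Overwritten"
    PROCESSING = "Processing"
    NO_DATA_COLLECTED = "No data collected"
    DELETED = "Deleted"
    PROCESSING_ERROR = "Processing error"
    TRANSFORMATION_ERROR = "Transformation error"
    LOAD_ERROR = "Load error"


# The three statuses that the table describes as errors.
ERROR_STATUSES = {
    ExtractStatus.PROCESSING_ERROR,
    ExtractStatus.TRANSFORMATION_ERROR,
    ExtractStatus.LOAD_ERROR,
}


def needs_attention(status_label: str) -> bool:
    """Return True if the label is one of the three error statuses.

    Raises ValueError for a label that is not in the table.
    """
    return ExtractStatus(status_label) in ERROR_STATUSES
```

For example, `needs_attention("Load error")` returns `True`, while `needs_attention("Loaded")` returns `False`.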
Downloading data extracts
To download a data extract, follow these steps:
-
Go to the Activity page.
-
In the All tasks tab, find the task whose data you want to download, and click on the task's overview.
-
In the row of options above the data extract, click Download.
-
Choose one of the following options:
-
Download Raw File - Click this option to download the file containing the data as it was collected.
-
Convert to CSV (.csv)
-
Convert to JSON (.json)
-
Convert to Excel (.xlsx)
-
As a result, you have downloaded a copy of your data extract in the selected format.
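Once a data extract is downloaded as CSV, it can be inspected with standard tooling. The sketch below uses only Python's built-in `csv` module. The sample content and the column names (`date`, `campaign`, `clicks`) are hypothetical; a real data extract contains whatever fields your datastream collects.

```python
import csv
import io


def read_extract(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text into a list of rows keyed by the header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical sample standing in for the contents of a downloaded file.
sample = (
    "date,campaign,clicks\n"
    "2024-01-01,Spring,12\n"
    "2024-01-02,Spring,7\n"
)

rows = read_extract(sample)

# Values are read as strings, so convert before doing arithmetic.
total_clicks = sum(int(row["clicks"]) for row in rows)
```

To read an actual file instead of the sample string, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `csv.DictReader` directly.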
Performing actions on data extracts
To perform an action on your data extracts, follow these steps:
-
Go to the Datastreams page.
-
For the chosen datastream, click Data extracts.
-
In the table, select the checkboxes for the data extracts to which you want to apply certain actions.
-
In the Select an action drop-down menu above the data extracts, select one of the following options:
-
Delete extracts
-
Select this option to schedule the deletion of the selected data extracts. The selected data extracts will be deleted when there are no tasks running for this datastream.
-
Deleted files are kept for the grace period configured for the current workspace. You can undelete the deleted files within this grace period. To view or change your workspace's grace period, see Grace Period.
-
Undelete extracts
-
Select this option to schedule the restoration of the selected data extracts. The selected data extracts will be restored when there are no tasks running for this datastream.
-
You can undelete the deleted files within the grace period configured for the current workspace. To view or change your workspace's grace period, see Grace Period.
-
Re-load into Adverity Data Storage and destinations
-
Select this option to re-load the selected data extracts into Adverity Data Storage and external destinations if these are enabled.
-
Re-load into Adverity Data Storage
-
Select this option to re-load the selected data extracts into Adverity Data Storage if it is enabled.
-
Re-load into [destination]
-
Select this option to re-load the selected data extracts into the named destination if it is enabled.
-
Re-load sequentially
-
Select this option to re-load the selected data extracts one at a time, from newest to oldest, into Adverity Data Storage and external destinations if these are enabled.
By default, data extracts are re-loaded simultaneously. Re-loading them sequentially ensures that your data is not overwritten.
-
Update metadata
-
Select this option to update the metadata for the selected data extracts.
-
Transform extracts and re-load into Adverity Data Storage and destinations
-
Select this option to transform and re-load the selected data extracts into Adverity Data Storage and external destinations if these are enabled. The data extracts will be enriched and re-loaded simultaneously.
-
Transform extracts sequentially and re-load into Adverity Data Storage and destinations
-
Select this option to transform and re-load the selected data extracts one at a time, from newest to oldest, into Adverity Data Storage and external destinations if these are enabled.
By default, data extracts are enriched and loaded simultaneously. Transforming and loading them sequentially ensures that your data is not overwritten.
-
Select and apply transformations
-
Select this option to select which transformations to apply to the selected data extracts. You can also choose to load the enriched data extracts into Adverity Data Storage and external destinations if these are enabled.
-
As a result, the selected action will be applied to the selected data extracts.
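The sequential options above process data extracts from newest to oldest. The following sketch illustrates that ordering only. The guide does not state which timestamp determines "newest", so sorting by creation time is an assumption, and `reload_order` is a hypothetical helper rather than platform behavior.

```python
from datetime import datetime


def reload_order(extracts: list[tuple[str, datetime]]) -> list[str]:
    """Return extract names sorted from newest to oldest.

    Each item is (name, created_at); created_at is an assumed sort key.
    """
    ordered = sorted(extracts, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ordered]


# Hypothetical extract names and creation times.
selected = [
    ("extract-a.csv", datetime(2024, 1, 1, 8, 0)),
    ("extract-c.csv", datetime(2024, 1, 3, 8, 0)),
    ("extract-b.csv", datetime(2024, 1, 2, 8, 0)),
]

order = reload_order(selected)
```

Here `order` lists `extract-c.csv` first and `extract-a.csv` last.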
Configuring unique by day data collection
To fetch data uniquely by day, follow these steps:
-
Go to the Datastreams page.
-
For the chosen datastream, click Local Data Retention.
-
In the Extract Filenames section, select Unique by day.
-
Click Save.
As a result, when you next collect data, a separate data extract is created for each day of the date range of the fetch.
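To see why Unique by day can make the Overwritten status more frequent, it may help to model the behavior described in this guide: one data extract per day, with a refetch replacing the extract for any day it covers again. This is a simplified illustration, not the platform's implementation. The `fetch_unique_by_day` function, the dictionary used as storage, and the fetch identifiers are all hypothetical.

```python
from datetime import date, timedelta


def fetch_unique_by_day(
    store: dict[date, str], start: date, end: date, fetch_id: str
) -> list[date]:
    """Record one extract per day in [start, end].

    Returns the days whose earlier extract was overwritten by this fetch.
    """
    overwritten = []
    day = start
    while day <= end:
        if day in store:
            # A previous fetch already produced an extract for this day.
            overwritten.append(day)
        store[day] = fetch_id
        day += timedelta(days=1)
    return overwritten


store: dict[date, str] = {}

# First fetch covers 1 to 3 January; nothing exists yet, so nothing is overwritten.
first = fetch_unique_by_day(store, date(2024, 1, 1), date(2024, 1, 3), "fetch-1")

# Second fetch covers 3 to 4 January; 3 January overlaps and is overwritten.
second = fetch_unique_by_day(store, date(2024, 1, 3), date(2024, 1, 4), "fetch-2")
```

After both fetches, the model holds four daily extracts, and only the extract for 3 January has been replaced.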