Working with data extracts

This guide explains how to view your data extracts and perform actions on them.

A data extract is a file that contains the data collected during a fetch. A fetch can create one or more data extracts. Data extracts are saved in CSV format.

When you have collected your data, you can view the data extract on the datastream overview page. The data extract shows the data after enrichments have been applied. To view the data in its raw state, disable any enrichments applied to the datastream and run the fetch again.

Viewing data extracts

To view the data you have collected in a data extract, follow these steps:

  1. Select the workspace you work with in Adverity and then, in the platform navigation menu, click Activity.

  2. In the All tasks tab, in the list of tasks, find the data fetch whose data you want to preview, and click Show extracts.

  3. In the list, find the data extract you want to preview, and click the hyperlinked element.

  4. The data extract opens in a table containing the data that you have fetched.

The header row of the table shows the field names contained in the data extract. These names are displayed with a gray background. If a source field is mapped to a target field, the name of the mapped target field is also displayed in the header row. Dimensions have a blue background and metrics have a green background. In the header row, you can map an unmapped source field to a target field, or change the target field to which a source field is mapped. To do so, see Mapping target fields in the Data Extract preview.

Data extract statuses

When you click Show extracts to view the data extracts created by a task, two columns are shown. The Extract column contains the hyperlinked elements that you can click to view the data in each data extract, and the Status column shows the status of each data extract.

The table below explains the meaning of each data extract status:

| Status | Meaning |
|--------|---------|
| Collected | The data in this data extract has been successfully fetched from your data source. |
| Loaded | The data in this data extract has been successfully loaded into the assigned external destinations and/or Adverity Data Storage. |
| Raw | The data in this data extract is in its raw state. This means that it is identical to the data in your data source. |
| Overwritten* | There is a newer version of the data in this data extract in another task. Refetching a task will overwrite the old version of the task. |
| Processing | This task is currently in progress. |
| No data collected | This task was completed successfully but there is no data in the data extract. |
| Deleted | This data extract has been deleted or scheduled for deletion. |
| Processing error | An error occurred when processing this task. |
| Enrichment error | An error occurred when enriching the data in this data extract. |
| Load error | An error occurred when loading the data in this data extract into any external destinations and/or Adverity Data Storage. |

*The Overwritten status may appear more often if you use the Unique by day setting in Local Data Retention. If Unique by day is enabled for your datastream, data from each day of the date range is entered into a separate data extract. If you refetch data for the same dates, the data for the overlapping days is overwritten. To configure Unique by day data collection, see Configuring unique by day data collection.
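If you record extract statuses outside the platform, for example in a log of task results, it can help to group them. The following is a minimal sketch in Python. The three groups and the function name `classify_status` are assumptions made for illustration and are not part of Adverity; only the status names come from the table above.

```python
# Illustrative grouping of the data extract statuses listed above.
# The group names and this function are assumptions made for this sketch.
ERROR_STATUSES = {"Processing error", "Enrichment error", "Load error"}
IN_PROGRESS_STATUSES = {"Processing"}
OTHER_STATUSES = {
    "Collected", "Loaded", "Raw", "Overwritten", "No data collected", "Deleted",
}


def classify_status(status: str) -> str:
    """Return 'error', 'in progress', or 'other' for a status name."""
    # Drop surrounding whitespace and the footnote asterisk, if present.
    name = status.strip().rstrip("*").strip()
    if name in ERROR_STATUSES:
        return "error"
    if name in IN_PROGRESS_STATUSES:
        return "in progress"
    if name in OTHER_STATUSES:
        return "other"
    raise ValueError(f"Unknown data extract status: {status!r}")


print(classify_status("Load error"))
```

A status that is not in the table raises a `ValueError` rather than being silently placed in a group.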

Downloading data extracts

To download a data extract, follow these steps:

  1. Select the workspace you work with in Adverity and then, in the platform navigation menu, click Activity.

  2. In the All tasks tab, find the task whose data you want to download, and click Show extracts.

    Reminder: A task is a data fetch.

  3. In the list, find the data extract you want to download, and click the hyperlinked element.

  4. In the row of options above the data extract, click Download.

  5. Choose one of the following options:

    • Download Raw File - Click this option to download the file containing the data as it was collected.

    • Convert to CSV (.csv)

    • Convert to JSON (.json)

    • Convert to Excel (.xlsx)

As a result, you have downloaded a copy of your data extract in the selected format.
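After downloading, you may want to check quickly what a CSV copy contains. The following is a minimal sketch in Python using only the standard library. It assumes the downloaded file has a header row of field names; the function name `summarize_extract` and the sample data are made up for illustration.

```python
import csv
import io


def summarize_extract(csv_file):
    """Return (field_names, row_count) for an open CSV file object."""
    reader = csv.DictReader(csv_file)
    row_count = sum(1 for _ in reader)
    # fieldnames is taken from the header row; it is None for an empty file.
    return list(reader.fieldnames or []), row_count


# Made-up sample standing in for a downloaded data extract.
sample = io.StringIO(
    "date,campaign,clicks\n"
    "2024-01-01,brand,10\n"
    "2024-01-02,brand,12\n"
)
fields, rows = summarize_extract(sample)
print(fields, rows)
```

For a real download, pass a file opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample, where `path` points to the file you downloaded.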

Performing actions on data extracts

To perform an action on your data extracts, follow these steps:

  1. Select the workspace you work with in Adverity and then, in the platform navigation menu, click Datastreams.

  2. Open the chosen datastream by clicking on its name.

  3. In the top navigation panel, click Data extracts.

  4. In the table, select the checkboxes for the data extracts to which you want to apply an action.

  5. In the Select an action drop-down menu above the data extracts, select one of the following options:

    Delete extracts

    Select this option to schedule the deletion of the selected data extracts. The selected data extracts will be deleted when there are no tasks running for this datastream.

    Deleted data extracts are kept for the grace period configured for the current workspace. You can undelete them within this grace period. To view or change your workspace's grace period, see Grace Period.

    Undelete extracts

    Select this option to schedule the restoration of the selected data extracts. The selected data extracts will be restored when there are no tasks running for this datastream.

    You can undelete deleted data extracts within the grace period configured for the current workspace. To view or change your workspace's grace period, see Grace Period.

    Re-load into Adverity Data Storage and destinations

    Select this option to re-load the selected data extracts into Adverity Data Storage and external destinations if these are enabled.

    Re-load into Adverity Data Storage

    Select this option to re-load the selected data extracts into Adverity Data Storage if it is enabled.

    Re-load into [destination]

    Select this option to re-load the selected data extracts into the named destination if it is enabled.

    Re-load sequentially

    Select this option to re-load the selected data extracts one at a time, from newest to oldest, into Adverity Data Storage and external destinations if these are enabled.

    By default, data extracts are re-loaded simultaneously. Re-loading them sequentially ensures that your data is not overwritten.

    Update metadata

    Select this option to update the metadata for the selected data extracts.

    Enrich extracts and re-load into Adverity Data Storage and destinations

    Select this option to enrich and re-load the selected data extracts into Adverity Data Storage and external destinations if these are enabled. The data extracts will be enriched and re-loaded simultaneously.

    Enrich extracts sequentially and re-load into Adverity Data Storage and destinations

    Select this option to enrich and re-load the selected data extracts one at a time, from newest to oldest, into Adverity Data Storage and external destinations if these are enabled.

    By default, data extracts are enriched and re-loaded simultaneously. Enriching and re-loading them sequentially ensures that your data is not overwritten.

    Select and apply enrichments

    Select this option to select which enrichments to apply to the selected data extracts. You can also choose to load the enriched data extracts into Adverity Data Storage and external destinations if these are enabled.

As a result, the selected action will be applied to the selected data extracts.

Configuring unique by day data collection

To fetch data uniquely by day, follow these steps:

  1. Select the workspace you work with in Adverity and then, in the platform navigation menu, click Datastreams.

  2. Open the chosen datastream by clicking on its name.

  3. In the top navigation panel, click Local Data Retention.

  4. In the Extract Filenames section, select Unique by day.

  5. Click Save.

As a result, the next time you collect data, a data extract is created for each day of the date range of the data fetch.
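The effect of Unique by day on refetches can be pictured with a small model. The following Python sketch is only an illustration of the behavior described in this guide, not Adverity's implementation: it keeps one entry per day and reports which days are replaced when a later fetch covers overlapping dates. The function names and the fetch labels are hypothetical.

```python
from datetime import date, timedelta


def days_in_range(start: date, end: date) -> list[date]:
    """Return every day from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def apply_fetch(extracts: dict[date, str], fetch_id: str,
                start: date, end: date) -> list[date]:
    """Store one extract per day; return the days whose earlier extract was replaced."""
    overwritten = []
    for day in days_in_range(start, end):
        if day in extracts:
            overwritten.append(day)
        # The newer fetch replaces any earlier entry for the same day.
        extracts[day] = fetch_id
    return overwritten


extracts: dict[date, str] = {}
apply_fetch(extracts, "fetch-1", date(2024, 1, 1), date(2024, 1, 3))
replaced = apply_fetch(extracts, "fetch-2", date(2024, 1, 3), date(2024, 1, 4))
print(len(extracts), replaced)
```

In this model, the first fetch covers 1 to 3 January and the second covers 3 to 4 January, so only 3 January overlaps and is replaced, which corresponds to the Overwritten status on the earlier extract for that day.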