Working with universal monitors#

This guide explains how to use universal monitors.

Introduction#

What are monitors and what can they do?

A data monitor is an automated data quality check that is performed each time you fetch data. With data monitors, you can find anomalies in your data more easily.

For an overview of your data monitors, see Introduction to the Data Quality page.

What types of monitors are there in Adverity?

There are two types of data monitors in Adverity:

  • Universal monitors

    Universal monitors detect data anomalies and flag potential issues across all data sources. They perform general data quality checks, such as finding duplicate rows in your data.

    By default, universal monitors raise a warning when an anomaly is detected. You can edit a universal monitor to trigger an error for specific datastreams.

  • Custom monitors

    Custom monitors allow you to define custom rules that your data must meet.

    For more information, see Working with custom monitors.

How do I use universal monitors?

Universal monitors are enabled by default. When enabled, universal monitors are applied to all datastreams on your Adverity instance. You can turn universal monitors off for all datastreams from the Administration page of the root workspace. Additionally, you can change the universal monitor configuration at the datastream level.

Universal monitor types#

Detecting duplicate data#

Duplicate data in your data extracts can cause inaccurate data analysis. The duplication monitor detects duplicate data in the monitored datastreams and alerts you so that you can remove it before any problems occur.

If Adverity detects one or more rows containing identical data in a data extract that you have collected, a Data Quality warning will appear in the Enrich & Monitor section of the task in the datastream overview.

To check the details about the detected duplicates, see Viewing details of the data quality issues in your data extract. If necessary, remove the duplicate data from your data extract before loading the data into Adverity Data Storage or external destinations.
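
As an illustration of what a duplicate check looks for, the following minimal Python sketch counts rows that are identical across all columns. It is not Adverity's implementation. The function name `find_duplicate_rows` and the sample rows are hypothetical, and treating a duplicate as a full-row match is an assumption based on the description above.

```python
from collections import Counter


def find_duplicate_rows(rows):
    """Return a dict mapping each repeated row to the number of times it occurs.

    Assumption: two rows are duplicates only if every column value is identical.
    """
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical data extract: (date, campaign, clicks)
extract = [
    ("2024-01-01", "campaign_a", 120),
    ("2024-01-01", "campaign_b", 95),
    ("2024-01-01", "campaign_a", 120),  # identical to the first row
]

duplicates = find_duplicate_rows(extract)
print(duplicates)
```

An empty result means that no row occurs more than once in the extract.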


Detecting fetch volume anomalies#

The volume monitor detects unexpected changes in the data volume collected by the monitored datastreams. It tracks the total row count of each fetch (both scheduled and manual) and notifies you of any statistical outliers compared to the previous fetches of the same datastream. For each monitored datastream, Adverity builds a model from rolling statistics of the row counts of the latest fetches. If a new value differs significantly from the trend, the monitor triggers an issue. To detect outliers, Adverity needs at least 10 fetches from the monitored datastream.

The volume monitor dynamically calculates the bounds of acceptable row counts from previous data. It uses the rolling median and the rolling standard deviation to set the bounds, and a sensitivity parameter (σ=2.6) to detect outliers. In other words, if the row count of the latest fetch differs from the current median by more than 2.6 times the current standard deviation, the monitor triggers an issue. This approach was selected to minimize false positives and detect only significant outliers.
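
The rule described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Adverity's implementation: the function name `is_volume_outlier` is hypothetical, the window covers all previous row counts passed in (the actual window size is not documented here), the population standard deviation is used, and the minimum of 10 fetches is applied to the history only.

```python
import statistics

SENSITIVITY = 2.6  # sensitivity parameter quoted in the text
MIN_FETCHES = 10   # minimum number of fetches quoted in the text


def is_volume_outlier(previous_counts, new_count, sensitivity=SENSITIVITY):
    """Return True if new_count lies outside median +/- sensitivity * std.

    Assumptions: previous_counts is the whole rolling window, and the
    population standard deviation (pstdev) is the spread measure.
    """
    if len(previous_counts) < MIN_FETCHES:
        return False  # not enough history to judge
    median = statistics.median(previous_counts)
    spread = statistics.pstdev(previous_counts)
    return abs(new_count - median) > sensitivity * spread


# Hypothetical row counts of the last ten fetches
history = [1000, 1020, 980, 1010, 990, 1005, 995, 1015, 985, 1000]

print(is_volume_outlier(history, 1025))  # within the bounds
print(is_volume_outlier(history, 400))   # far below the bounds
```

With this history the median is 1000 and the standard deviation is roughly 12.2, so row counts more than about 32 away from 1000 are flagged.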

The volume monitor compares only data extracts with the same date range. For example, if your datastream has two schedules, one daily and one weekly, the volume monitor groups data extracts with the same date range and detects volume outliers for each group separately.
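
The grouping step can be pictured with the following Python sketch. It assumes that "the same date range" means the same number of days covered by the extract, which matches the daily and weekly example but is not confirmed by the text. The function name `group_by_range_length` and the sample extracts are hypothetical.

```python
from collections import defaultdict
from datetime import date


def group_by_range_length(extracts):
    """Group row counts by the number of days each extract covers.

    Each extract is a (start_date, end_date, row_count) tuple.
    Assumption: extracts are comparable when their ranges have equal length.
    """
    groups = defaultdict(list)
    for start, end, row_count in extracts:
        days = (end - start).days + 1  # inclusive date range
        groups[days].append(row_count)
    return dict(groups)


# Hypothetical extracts from one datastream with a daily and a weekly schedule
extracts = [
    (date(2024, 1, 1), date(2024, 1, 1), 1000),   # daily
    (date(2024, 1, 2), date(2024, 1, 2), 1010),   # daily
    (date(2024, 1, 1), date(2024, 1, 7), 7100),   # weekly
    (date(2024, 1, 3), date(2024, 1, 3), 990),    # daily
    (date(2024, 1, 8), date(2024, 1, 14), 6950),  # weekly
]

groups = group_by_range_length(extracts)
print(groups)
```

Each group would then be checked for outliers on its own, so a weekly extract with about 7000 rows is never compared with daily extracts of about 1000 rows.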

Data volume can change for legitimate reasons, such as new campaign launches, authorization changes, or updated configuration settings. However, if your data pipeline is stable and you are not expecting any changes, the volume monitor helps you detect data volume issues as soon as they appear.

If Adverity detects an anomalous change in the row count of the latest fetch, a Data Quality warning will appear in the Enrich & Monitor section of the task in the datastream overview.

To check the details about the detected volume issues, see Viewing details of the data quality issues in your data extract.


Enabling a universal monitor#

Note

This setting is available to users with Administrator permissions in the root workspace of your organization, and will be applied to all child workspaces of the root workspace.

To switch a universal monitor on or off for all datastreams, follow these steps:

  1. Select the root workspace and then, in the platform navigation menu, click Administration.

  2. In the Data Quality Monitors section of the workspace settings, select or deselect the checkbox for the monitor that you want to enable or disable:

    • Duplication Monitor

    • Volume Monitor

    You can later disable or edit a universal monitor for a specific datastream.

Managing universal monitors assigned to a datastream#

You can perform the following actions with universal monitors at the datastream level.

Editing a universal monitor#

By default, universal monitors trigger a warning when an issue is detected. To change this setting for a specific datastream so that it triggers an error instead, follow these steps:

  1. Go to the Datastreams page.

  2. Select the datastream you want to configure.

  3. In the Monitor subsection of the datastream overview, find the universal monitors box.

  4. Click Edit universal monitors.

  5. From the following options, select the issue type that the monitor triggers when it detects an issue:

    Trigger an error

    Select this option to raise an error and stop processing the data.

    Trigger a warning

    Select this option to raise a warning and continue processing the data.

  6. Click Apply.
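
The difference between the two issue types can be illustrated with a short Python sketch. It only models the behavior described in the options above (an error stops processing, a warning lets it continue) and does not reflect Adverity's internals. The names `DataQualityError` and `handle_issue` are hypothetical.

```python
import warnings


class DataQualityError(Exception):
    """Hypothetical error raised when a monitor is set to trigger an error."""


def handle_issue(issue_detected, severity="warning"):
    """Return True if data processing may continue.

    severity "error": raise DataQualityError, which stops processing.
    severity "warning": emit a warning and continue.
    """
    if severity not in ("warning", "error"):
        raise ValueError(f"unknown severity: {severity!r}")
    if not issue_detected:
        return True
    if severity == "error":
        raise DataQualityError("monitor detected an issue; processing stopped")
    warnings.warn("monitor detected an issue; processing continues")
    return True


# With the default severity, the issue is reported and processing continues.
print(handle_issue(issue_detected=True))

# With severity "error", the same issue stops processing.
try:
    handle_issue(issue_detected=True, severity="error")
except DataQualityError as exc:
    print(f"stopped: {exc}")
```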

Deactivating a universal monitor#

To deactivate a universal monitor for a datastream, follow these steps:

  1. Go to the Datastreams page.

  2. Open the chosen datastream by clicking on its name.

  3. In the Monitor subsection of the datastream overview, find the universal monitors box.

  4. Turn off the toggle next to the monitor you want to deactivate.

Including the monitor’s status in the data extract#

To include the status of the monitor assigned to the datastream in the data extract, update the datastream settings. Only the status of the monitors with severity set to Warning can be included. For more information, see Configuring advanced datastream settings.