Working with universal monitors
This guide explains how to use universal monitors.
Introduction
What are monitors and what can they do?
A data monitor is an automated data quality check that is performed each time you fetch data. With data monitors, you can find anomalies in your data more easily.
For an overview of your data monitors, see Introduction to the Data Quality page.
What types of monitors are there in Adverity?
There are two types of data monitors in Adverity:
- Universal monitors
  Universal monitors detect data anomalies and flag potential issues across all data sources. They perform general data quality checks, such as finding duplicate rows in your data.
  By default, universal monitors raise a warning when an anomaly is detected. You can edit a universal monitor to trigger an error for specific datastreams.
- Custom monitors
  Custom monitors allow you to define custom rules that your data must meet.
  For more information, see Working with custom monitors.
How do I use universal monitors?
Universal monitors are enabled by default. When enabled, they are applied to all datastreams in your Adverity instance. You can turn universal monitors off for all datastreams from the Administration page of the root workspace. You can also change the configuration of a universal monitor at the datastream level.
Using universal monitors
Currently, you can use universal data monitors to detect duplicate rows in your data.
Detecting duplicate data
Duplicate data in your data extracts can cause inaccurate data analysis. When the duplication monitor is enabled for a datastream, Adverity will automatically detect duplicate data in this datastream and alert you so that you can remove it before any problems occur.
If Adverity detects one or more rows containing identical data in a data extract that you have collected, a Data Quality warning will appear in the Enrich & Monitor section of the task in the datastream overview.
To check the details about the detected duplicates, see Viewing details of the data quality issues in your data extract. If necessary, remove the duplicate data from your data extract before loading the data into Adverity Data Storage or external destinations.
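As an illustration of what "rows containing identical data" means, the following is a minimal Python sketch that finds and removes exact duplicate rows in a CSV extract. It is not Adverity's implementation and uses no Adverity API; the function names, the sample columns, and the assumption that the extract is a CSV file with a header row are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample extract: the first and third data rows are identical.
SAMPLE_EXTRACT = """date,campaign,clicks
2024-01-01,brand,10
2024-01-01,generic,7
2024-01-01,brand,10
"""


def find_duplicate_rows(csv_text: str) -> dict[tuple[str, ...], int]:
    """Return every data row that occurs more than once, with its count."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    counts = Counter(tuple(row) for row in reader)
    return {row: n for row, n in counts.items() if n > 1}


def drop_duplicate_rows(csv_text: str) -> str:
    """Keep the first occurrence of each data row, preserving row order."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = next(reader, None)
    if header is not None:
        writer.writerow(header)
    seen = set()
    for row in reader:
        key = tuple(row)
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    print(find_duplicate_rows(SAMPLE_EXTRACT))
    print(drop_duplicate_rows(SAMPLE_EXTRACT))
```

A check like this compares whole rows only. Rows that differ in any column, even by whitespace, are treated as distinct.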
Enabling the duplication monitor
To switch the duplication monitor on or off for all datastreams, follow these steps:
- Select the root workspace, and then click Administration.
- Select or deselect the Monitor all datastreams in the workspace for duplicate rows checkbox.
This setting is available to users with Administrator permissions in the root workspace of your organization, and will be applied to all child workspaces of the root workspace.
Managing the duplication monitor for a specific datastream
You can perform the following actions with the duplication monitor at the datastream level:
- Edit the duplication monitor
  By default, the duplication monitor triggers a warning when duplicates are detected. To change this setting for a specific datastream so that it triggers an error instead, follow these steps:
  - Go to the Datastreams page.
  - In the Monitors subsection of the datastream overview, find the duplication monitor.
  - Click Edit duplication monitor.
  - Select the issue type that should be triggered if the monitor detects duplicates from the following options:
    - Trigger an error
      Select this option to raise an error and stop processing the data.
    - Trigger a warning
      Select this option to raise a warning and continue processing the data.
  - Click Apply.
- Deactivate the duplication monitor
  To deactivate the duplication monitor for a datastream, follow these steps:
  - Go to the Datastreams page.
  - In the Monitors subsection of the datastream overview, find the duplication monitor.
  - Disable the toggle next to the duplication monitor.
- Include the monitor's status in the data extract
  To include the status of the monitor assigned to the datastream in the data extract, update the datastream settings. Only the status of monitors with severity set to Warning can be included. For more information, see Configuring advanced datastream settings.
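The difference between the two issue types and the per-datastream toggle described above can be pictured with a short Python sketch. This is only a conceptual model under stated assumptions, not Adverity's code or API: the `DuplicationMonitor` class, the `IssueType` enum, the `DuplicateRowsError` exception, and the `process_extract` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class IssueType(Enum):
    WARNING = "warning"  # raise a warning and continue processing
    ERROR = "error"      # raise an error and stop processing


@dataclass
class DuplicationMonitor:
    """Hypothetical per-datastream monitor settings."""
    enabled: bool = True                       # monitors are on by default
    issue_type: IssueType = IssueType.WARNING  # default issue type is a warning


class DuplicateRowsError(Exception):
    """Raised when a monitor set to ERROR detects duplicate rows."""


def process_extract(rows, monitor):
    """Return (rows, warnings), or raise if the monitor triggers an error.

    `rows` is a list of tuples, one tuple per row of the extract.
    """
    warnings = []
    if monitor.enabled:
        duplicates = len(rows) - len(set(rows))
        if duplicates:
            message = f"{duplicates} duplicate row(s) detected"
            if monitor.issue_type is IssueType.ERROR:
                raise DuplicateRowsError(message)  # processing stops here
            warnings.append(message)               # processing continues
    return rows, warnings


if __name__ == "__main__":
    sample = [("a", 1), ("b", 2), ("a", 1)]
    print(process_extract(sample, DuplicationMonitor()))
```

In this model, a deactivated monitor performs no check at all, a monitor set to warning returns the data together with the warning, and a monitor set to error returns no data.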