Collecting data using Web Connect

This reference documents the configuration options of Web Connect in Adverity.

Authorization methods

When you connect to an API using Web Connect, you may need to create an authorization.

Depending on the API requirements, you may need to enter your authentication credentials in the request URL, header or body instead of creating an authorization.

The following authorization methods are available for Web Connect:

  • Use Web Connect for APIs with basic authorization (username and password), static bearer tokens, cookie sessions, and header authentication.

  • Use Web Connect (Bearer) for APIs that require you to obtain a bearer token by calling an authorization endpoint.

  • Use Web Connect (OAuth) for APIs that use the OAuth (Open Authorization) protocol.

  • Use Web Connect (OAuth2 Authorization Code) for APIs that use the OAuth 2.0 (Open Authorization 2.0) protocol.
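As a rough illustration of two of the header-based schemes named above, the following sketch builds an HTTP Basic header and a static bearer header with the Python standard library. The user name, password, and token are placeholder values, and the helper names are hypothetical. The sketch only shows what such headers generally look like; it does not describe how Adverity stores or sends credentials.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header: base64 of 'username:password'."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def bearer_auth_header(token: str) -> dict:
    """Build an Authorization header that carries a static bearer token."""
    return {"Authorization": f"Bearer {token}"}


# Placeholder credentials, for illustration only.
basic = basic_auth_header("user", "secret")
bearer = bearer_auth_header("abcdef")
```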

Configuring data collection with Web Connect

This section explains how to configure data collection with Web Connect.

Depending on the API and your use case, you may not need to configure all fields.

Request

Configure the fields in this section to define the request sent to the API.

URL

Enter the URL of the API, including the required parameters and values.

You can add the following parameters to the URL:

  • Start and end date parameters

    You can set start and end date parameters to define the fetch date range in the URL or request body. You can set the date parameters in the following two ways:

    • {start} and {end} placeholders

      When using the {start} and {end} placeholders, you can change the fetch date range using the date range selector as in other connectors.

      For example, if your API uses the date_from parameter to define the start of the date range and the date_to parameter to define the end of the date range, add the following expression to the URL to configure the fetch date range:

      date_from={start}&date_to={end}

      Your API may use a different date format than Adverity. To use a custom date format supported by your API, configure the Date format field below.

    • Static values

      Alternatively, you can use static values to define the date range for which to collect data.

      For example, to collect data from October 1, 2023 to October 5, 2023 with the date_from and date_to API parameters, add the following expression to the URL:

      date_from="2023-10-01"&date_to="2023-10-05"

      Make sure to enter the dates in the date format supported by your API.

  • Authorization token

    If you have an authorization token for the API, you can add it to the request URL.

    If you are using an authorization to generate a token, you can add it to the request URL using the {YOUR_TOKEN} placeholder.

    For example, if the URL uses the token parameter, add the following expression to the URL:

    token={YOUR_TOKEN}

  • Value tables

    To send multiple requests for a list of parameter values, use a value table in the request URL. This is typically used for lists of IDs, where you would need to collect data for multiple IDs. Adverity sends a separate request for each value in the value table.

    For example, to use the value table with the name valuetablename, add the following expression to the URL:

    {vt(valuetablename)}

    In this expression, vt signifies that the placeholder in curly brackets is a value table.

    For more information on automatically populating value tables, see Automatically populate a value table.

  • Other static parameters

    You can add any other static parameters to the URL required by the API.

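For example, to collect data from the https://api.application.com endpoint, including an authentication token generated by your authorization and the date range selected in the datastream interface, and assuming that the API uses the token, date_from, and date_to parameters from the examples above, configure the URL in the following way:

https://api.application.com?token={YOUR_TOKEN}&date_from={start}&date_to={end}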
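The placeholders described in this section can be pictured as text substitution in a URL template. The following is a minimal sketch of that idea, not Adverity's implementation: the function name expand_url, the account parameter, the token value, and the value table contents are all assumptions made for illustration.

```python
from datetime import date


def expand_url(template, start, end, token, value_tables, date_format="%Y-%m-%d"):
    """Return one URL per value table entry, with every placeholder filled in."""
    url = (
        template.replace("{start}", start.strftime(date_format))
        .replace("{end}", end.strftime(date_format))
        .replace("{YOUR_TOKEN}", token)
    )
    urls = [url]
    for name, values in value_tables.items():
        placeholder = "{vt(" + name + ")}"
        if placeholder in url:
            # One request per value: each URL is repeated for every table entry.
            urls = [u.replace(placeholder, str(v)) for u in urls for v in values]
    return urls


# Hypothetical template: 'account' is an assumed parameter name.
template = (
    "https://api.application.com?account={vt(valuetablename)}"
    "&token={YOUR_TOKEN}&date_from={start}&date_to={end}"
)
urls = expand_url(
    template,
    start=date(2023, 10, 1),
    end=date(2023, 10, 5),
    token="abc123",
    value_tables={"valuetablename": [101, 102]},
)
```

With two entries in the value table, the sketch produces two URLs that differ only in the account value, which mirrors the separate request sent for each value.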

Request method

Select either the GET or POST HTTP request method, and configure the request.

For the GET request method, you can configure the header in JSON format.

For the POST request method, you can select the request content type, configure the header in JSON format, and enter an HTTP request body in the selected content type.

  • Request content type

    Select the content type for the POST request required by your API.

  • Request headers

    Enter the parameters required in the header in JSON format.

    For example, your request header may look like this:

    {
    	"api-key":"abcdef",
    	"Accept":"application/stream+json"
    }
  • Request body

    If the API requires it, define in the request body which data to collect.

    You can set start and end date parameters to define the fetch date range. The parameters can be set to static values or use the {start} and {end} placeholders. For more information, see Start and end date parameters.

    For example, your request body may look like this:

    {
    	"categories":[
    		"brandName",
    		"eventCode"
    	],
    	"metrics":[
    		"clickCount",
    		"impressionCount"
    	],
    	"currency":"USD",
    	"dates":{
    		"eventDateStart":"{start}",
    		"eventDateEnd":"{end}"
    	},
    	"sort":{
    		"eventCode":"ASC"
    	},
    	"summaryOnly":false
    }
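To show how the header and body fit together in a POST request, the following sketch builds (but does not send) a request with the Python standard library. The /report path, the Content-Type header, and the shortened body are assumptions for illustration, and the {start} and {end} substitution is a simplified model of what the placeholders do.

```python
import json
import urllib.request

# Header values taken from the example above, plus an assumed Content-Type.
headers = {
    "api-key": "abcdef",
    "Accept": "application/stream+json",
    "Content-Type": "application/json",
}

# Shortened version of the request body example, with date placeholders.
body_template = {
    "metrics": ["clickCount", "impressionCount"],
    "dates": {"eventDateStart": "{start}", "eventDateEnd": "{end}"},
    "summaryOnly": False,
}


def build_post_request(url, headers, body_template, start, end):
    """Fill in the date placeholders and wrap everything in a POST request."""
    body = json.dumps(body_template).replace("{start}", start).replace("{end}", end)
    return urllib.request.Request(
        url, data=body.encode("utf-8"), headers=headers, method="POST"
    )


# Hypothetical endpoint path; the request object is only built, never sent.
request = build_post_request(
    "https://api.application.com/report", headers, body_template,
    start="2023-10-01", end="2023-10-05",
)
payload = json.loads(request.data)
```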

Date format

Configure the date format for the {start} and {end} placeholders to match the API requirements.

Most APIs use the ISO 8601 format (YYYY-MM-DD). If your API uses a different format, for example DD/MM/YYYY or MMDDYYYY, configure it in this field.

To define the date format, use the following placeholders:

  • %m - a month as a two-digit number, from 01 to 12. This matches the mm format.

  • %d - a day as a two-digit number, from 01 to 31. This matches the dd format.

  • %Y - a year as a four-digit number. This matches the yyyy format.

For example, to set the date format to DD/MM/YYYY, enter %d/%m/%Y into the Date format field.

For more information on available formats, see Unix reference.
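The %d, %m, and %Y placeholders are also the directives used by Python's strftime, so a quick way to preview a format string is to try it there. The sketch below assumes that Adverity renders these three directives the same way Python does. The sample date is arbitrary.

```python
from datetime import date

start = date(2023, 10, 1)

# Format strings for the three date layouts mentioned in this section.
formats = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "DD/MM/YYYY": "%d/%m/%Y",
    "MMDDYYYY": "%m%d%Y",
}

rendered = {label: start.strftime(fmt) for label, fmt in formats.items()}
# rendered["DD/MM/YYYY"] is "01/10/2023"
```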

Navigation URLs

Include a list of URLs that are called before fetching the data. Use this option if you need to perform a request before you fetch data.

Enter the list as "request method - URL" pairs.

Pagination Options

Configure the fields in this section to set the pagination parameters supported or required by the API. Pagination defines how the API sends the requested data.

What is API pagination?

API pagination divides the responses into several pages to send large amounts of data more efficiently. For example, instead of 10,000 lines of JSON, the API sends 10 pages with 1,000 lines of JSON on each page. To retrieve all data from an API, all pages are fetched one after another. These pages are then combined into one data extract. Adverity's Web Connect can fetch all pages for various API pagination options.

For a more general explanation of pagination options, see Everything You Need to Know About API Pagination.

Configuring pagination with Web Connect

Select a pagination option supported by your API from the following options:

Page & PageSize Pagination

With this pagination type, the API splits your data into pages and sends them one by one. You can define the page size.

To retrieve all of your data with this pagination type, follow these steps:

  1. In Paging Type, select Page & PageSize Pagination.

  2. In Offset Parameter Keyword, enter the name of the offset parameter specified by the API, for example, page, offset, or skip.

    For GET requests, you may need to include this parameter in the request URL as well.

    For POST requests, you may need to include this parameter in the request URL or HTTP request body.

  3. In Limit Parameter Keyword, enter the name of the parameter that limits how many items you retrieve on each page, for example, PageSize, limit, or take.

    For GET requests, you may need to include this parameter in the request URL as well.

    For POST requests, you may need to include this parameter in the request URL or HTTP request body.

  4. In Increment, set how many items to retrieve on each page. This is used as the value for the Limit Parameter.

    We recommend that you set the increment to the highest value supported by the API.
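The following sketch shows the general shape of a client loop for this pagination type. It is not Adverity's implementation: fake_api stands in for a hypothetical API, the page numbering is assumed to start at 1, and the loop assumes that a page with fewer items than the increment is the last one.

```python
ITEMS = list(range(25))  # stand-in for the records held by a hypothetical API


def fake_api(params):
    """Simulate an API that understands 'page' (1-based) and 'PageSize'."""
    page, size = params["page"], params["PageSize"]
    return ITEMS[(page - 1) * size : page * size]


def fetch_all_pages(offset_keyword, limit_keyword, increment, first_page=1):
    """Request page after page until a short or empty page comes back."""
    rows, page = [], first_page
    while True:
        batch = fake_api({offset_keyword: page, limit_keyword: increment})
        rows.extend(batch)
        if len(batch) < increment:  # assumed stop condition: last page reached
            return rows
        page += 1  # the page number advances by one per request


rows = fetch_all_pages("page", "PageSize", increment=10)
```

Here the offset keyword carries the page number and the limit keyword carries the increment, so 25 records are collected in three requests.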

Page Pagination

With this pagination type, the API splits your data into pages and sends them one by one. The page size is defined by the API automatically.

To retrieve all of your data with this pagination type, follow these steps:

  1. In Paging Type, select Page Pagination.

  2. In Offset Parameter Keyword, enter the name of the offset parameter specified by the API, for example, page, offset, or skip.

    For GET requests, you may need to include this parameter in the request URL as well.

    For POST requests, you may need to include this parameter in the request URL or HTTP request body.

Offset & Limit Paging

With this pagination type, the API uses offset and page size parameters to split the requested data. You can define the page size.

To retrieve all of your data with this pagination type, follow these steps:

  1. In Paging Type, select Offset & Limit Paging.

  2. In Offset Parameter Keyword, enter the name of the offset parameter specified by the API, for example, page, offset, or skip.

    For GET requests, you may need to include this parameter in the request URL as well.

    For POST requests, you may need to include this parameter in the request URL or HTTP request body.

  3. In Limit Parameter Keyword, enter the name of the parameter that limits how many items you retrieve on each page, for example, PageSize, limit, or take.

    For GET requests, you may need to include this parameter in the request URL as well.

    For POST requests, you may need to include this parameter in the request URL or HTTP request body.

  4. In Increment, set how many items to retrieve on each page. This is used as the value for the Limit Parameter.

    We recommend that you set the increment to the highest value supported by the API.
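With offset and limit parameters, the offset typically counts items rather than pages, so it grows by the increment on each request. The sketch below illustrates this under assumptions: fake_api simulates a hypothetical API, the offset starts at 0, and a short page is treated as the end of the data. It is not a description of Adverity's internals.

```python
ITEMS = [f"row-{i}" for i in range(7)]  # records held by a hypothetical API


def fake_api(params):
    """Simulate an API that understands 'offset' (items to skip) and 'limit'."""
    offset, limit = params["offset"], params["limit"]
    return ITEMS[offset : offset + limit]


def fetch_all(offset_keyword, limit_keyword, increment):
    """Collect all rows and record the parameters of every request sent."""
    rows, requests_sent, offset = [], [], 0
    while True:
        params = {offset_keyword: offset, limit_keyword: increment}
        requests_sent.append(params)
        batch = fake_api(params)
        rows.extend(batch)
        if len(batch) < increment:  # assumed stop condition: last page reached
            return rows, requests_sent
        offset += increment  # skip the items that were already retrieved


rows, requests_sent = fetch_all("offset", "limit", increment=3)
offsets = [params["offset"] for params in requests_sent]
```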

JSON:API Cursor Paging

With this pagination type, the API returns a cursor, which is a pointer to a specific result within the retrieved data that marks how much has already been sent. The cursor is a string value.

To retrieve all of your data with this pagination type, follow these steps:

  1. In Paging Type, select JSON:API Cursor Paging.

  2. In JSON:API Cursor Parameter Keyword, enter the name of the parameter that contains the cursor pointer in the request, for example, cursor.

  3. In JSONPath, enter the JSONPath to the cursor returned in the response. For more information, see Using JSONPath.
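A cursor loop can be sketched as follows. The response layout (data, meta.next_cursor), the cursor values, and fake_api are all hypothetical, and the path is walked as a simple tuple of keys instead of a real JSONPath expression such as $.meta.next_cursor. The loop assumes that a missing or empty cursor marks the last page.

```python
# Hypothetical responses, keyed by the cursor sent in the request.
PAGES = {
    None: {"data": [1, 2], "meta": {"next_cursor": "abc"}},
    "abc": {"data": [3, 4], "meta": {"next_cursor": "def"}},
    "def": {"data": [5], "meta": {"next_cursor": None}},
}


def fake_api(params):
    """Return the page that belongs to the 'cursor' request parameter."""
    return PAGES[params.get("cursor")]


def fetch_all(cursor_keyword, cursor_path):
    """Follow the cursor from response to request until none is returned."""
    rows, params = [], {}
    while True:
        response = fake_api(params)
        rows.extend(response["data"])
        cursor = response
        for key in cursor_path:  # walk the keys, e.g. meta -> next_cursor
            cursor = cursor.get(key) if isinstance(cursor, dict) else None
        if not cursor:  # assumed stop condition: no further cursor
            return rows
        params = {cursor_keyword: cursor}


rows = fetch_all("cursor", ("meta", "next_cursor"))
```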

JSON:API Link Paging

With this pagination type, the API returns a URL to the next page of data in the response. Adverity uses this URL to fetch the next page.

To retrieve all of your data with this pagination type, follow these steps:

  1. In Paging Type, select JSON:API Link Paging.

  2. In JSONPath, enter the JSONPath to the URL returned in the response. For more information, see Using JSONPath.
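Link paging amounts to following the next URL until the response no longer provides one. In the sketch below, the /items path, the links.next field, and the in-memory PAGES dictionary are assumptions that stand in for a real API. A JSONPath such as $.links.next would point to the same field.

```python
# Hypothetical responses, keyed by the URL that was requested.
PAGES = {
    "https://api.application.com/items": {
        "data": ["a", "b"],
        "links": {"next": "https://api.application.com/items?page=2"},
    },
    "https://api.application.com/items?page=2": {
        "data": ["c"],
        "links": {"next": None},
    },
}


def fetch_all(first_url):
    """Follow the 'next' link of each response until it is missing or empty."""
    rows, visited, url = [], [], first_url
    while url:
        visited.append(url)
        response = PAGES[url]  # stands in for an HTTP request to this URL
        rows.extend(response["data"])
        url = response["links"].get("next")
    return rows, visited


rows, visited = fetch_all("https://api.application.com/items")
```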

Response

Configure the fields in this section to define how Adverity processes the response from the API.

Zip match

If you are fetching data in a compressed file format, for example, zip, that contains multiple CSV files, you can define a regular expression to select which file within the compressed file is fetched. The fetch then retrieves only the file that you define, uncompressed.

For example, your compressed file contains two folders: September and October. If you want to extract all CSV files from the October folder, enter the following expression in this field:

October/*.csv

Archive Password

If your data is in an encrypted zip file, provide the password in the Archive Password field to decrypt it automatically.

Parser

Select the format in which the API sends the response.

Adverity currently supports the following parsers:

  • CSV (with comma delimiter)

  • CSV (with semicolon delimiter)

  • TSV (tabs)

  • Excel

  • JSON (strict)

  • HTML

  • XML

  • JSON

Difference between JSON and JSON (strict): A valid response (HTTP status code 200) can be empty or contain no data for the specified JSONPath. Whether Adverity raises an error in these cases depends on which JSON parser you select:

  • JSON ignores these cases and does not raise an error.

  • JSON (strict) raises an error when the specified JSONPath does not contain data or a valid response (HTTP status code 200) is empty.
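The difference between the two JSON parsers can be modelled with a small function. This is a simplified sketch, not Adverity's parser: parse_rows is a hypothetical name, a plain top-level key stands in for a JSONPath, and the error text is made up for the example.

```python
import json


def parse_rows(response_text, data_key, strict):
    """Return the rows under data_key; in strict mode, fail if there are none."""
    payload = json.loads(response_text) if response_text.strip() else {}
    rows = payload.get(data_key) if isinstance(payload, dict) else None
    if not rows:
        if strict:
            raise ValueError(f"No data found for key {data_key!r}")
        return []  # lenient mode: an empty result is not an error
    return rows


# An empty but otherwise valid response body.
lenient_rows = parse_rows("", "data", strict=False)

try:
    parse_rows("", "data", strict=True)
    strict_failed = False
except ValueError:
    strict_failed = True
```

In this model, the lenient parser returns an empty result and the fetch continues, while the strict parser surfaces the missing data as an error.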

Encoding

Select the encoding in which the API sends the response.

Adverity supports the following encoding types:

  • Auto-detect

  • UTF-8

  • UTF-8 (with BOM)

  • UTF-16

  • UTF-16 (Little Endian)

  • ISO-8859-1

  • ISO-8859-9

  • ISO-8859-15

  • Shift-JIS

  • Code page 437

  • Windows-1250

  • Windows-1251

  • Windows-1252

  • UTF-8 (with embedded garbage)

  • ISO-8859-1 (with embedded garbage)

Data key

Select the part of the response containing data that you want to collect.

  • For JSON responses, enter a JSONPath. For more information, see Using JSONPath.

  • For XML and HTML parsers, enter an XPath for rows. For more information, see the W3 XPath documentation.

  • For the Excel parser, enter the name of the sheet that you are importing.

To select only specific columns from this data, use the Column mapping field described below.

Column mapping

Select only specific columns from the part of the response defined by the Data key.

To define which columns of a JSON or XML response are extracted with column mapping, use one of the following:

  • For JSON responses, enter a JSONPath. For more information, see Using JSONPath.

  • For XML responses, enter "column-name": "XPath" pairs, where column-name is the column name in the data extract and XPath is the path to the data that is fetched for this column.

    For more information, see the W3 XPath documentation.

    For example, this field may be configured like this:

    {
    	"Product":"product",
    	"CPT":"audiences/audience/cpt",
    	"Cash Donation Volume":"sales/entry[key='Cash Donation']/value"
    }

    This will create a data extract with three columns: Product, CPT, and Cash Donation Volume containing the specified data.
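To see how a row XPath and a column mapping work together, the following sketch applies the mapping from the example above to a small XML document with Python's xml.etree.ElementTree. The XML structure, its values, and the row XPath "row" are invented for illustration, and ElementTree supports only a subset of XPath, so this is an approximation of the behavior, not Adverity's parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical response; the real structure depends on your API.
SAMPLE_XML = """
<rows>
  <row>
    <product>Widget</product>
    <audiences><audience><cpt>1.50</cpt></audience></audiences>
    <sales>
      <entry><key>Cash Donation</key><value>200</value></entry>
      <entry><key>Online</key><value>50</value></entry>
    </sales>
  </row>
</rows>
""".strip()

# Column mapping from the example above: column name -> XPath within a row.
COLUMN_MAPPING = {
    "Product": "product",
    "CPT": "audiences/audience/cpt",
    "Cash Donation Volume": "sales/entry[key='Cash Donation']/value",
}


def extract_rows(xml_text, row_xpath, mapping):
    """Select rows with row_xpath, then read one value per mapped column."""
    root = ET.fromstring(xml_text)
    return [
        {column: row.findtext(path) for column, path in mapping.items()}
        for row in root.findall(row_xpath)
    ]


rows = extract_rows(SAMPLE_XML, "row", COLUMN_MAPPING)
```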

Other

Sleep

Enter how long you want to delay the fetch start, in seconds.

Appendix

Using JSONPath

JSONPath enables you to specify where a particular parameter or its value is located. This means that you can filter for one specific value in a JSON response, or for specific parameters and their values. JSONPath works with any text in valid JSON format. The following example illustrates the concept:

  1. Visit JSONPath.com.

  2. Paste the following code snippet into the left window displayed on the JSONPath.com website:

    { "store": {
        "book": [ 
          { "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
          },
          { "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
          },
          { "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
          },
          { "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
          }
        ],
        "bicycle": {
          "color": "red",
          "price": 19.95
        }
      }
    }

  3. To filter for all book titles in the response, enter the following expression in the JSONPath Syntax field at the top of the page:

    $.store.book[*].title

You have now used JSONPath to filter for a specific part of the response. For more information, see JSON with JSONPath from REST API Tutorials.
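If you prefer to reason about a path without an online tool, plain Python dictionary access shows what a JSONPath selects. The sketch below uses a shortened copy of the sample document and no JSONPath library, so the comments describe equivalent selections and do not evaluate JSONPath expressions.

```python
import json

# Shortened copy of the sample document used in the steps above.
document = json.loads("""
{ "store": {
    "book": [
      { "category": "reference", "title": "Sayings of the Century", "price": 8.95 },
      { "category": "fiction", "title": "Moby Dick", "price": 8.99 }
    ],
    "bicycle": { "color": "red", "price": 19.95 }
  }
}
""")

# Equivalent of $.store.book[*]: every book object in the array.
books = document["store"]["book"]

# Equivalent of $.store.book[*].title: only the title of each book.
titles = [book["title"] for book in books]
```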

Troubleshooting Web Connect

I see an error message when fetching data from Web Connect

The No rows match the given jsonpath error message appears because the syntax of a JSONPath in your datastream settings is incorrect.

To resolve this issue, check the JSONPath expressions used in your datastream settings and correct any syntax errors.

I see an error message when fetching data from Web Connect

The Adverity cannot determine the character encoding error message appears when Adverity is not sure whether the automatically detected encoding is correct.

To resolve this issue, we recommend checking the automatically detected encoding, and changing the setting if it is incorrect. If it is correct, you do not need to take any further action.

I see an error message when fetching data from Web Connect

The 'ascii'/'utf-8'/etc. codec can't decode byte 0xYY in position xxxxxx error message can appear for the following reasons:

  • The encoding for your Web Connect datastream is defined incorrectly.

  • The datastream is set to parse an incorrect file type.

  • The source file contains characters with multiple encoding types.

  • The source file contains invalid or corrupted characters.

To resolve this issue, address the cause that applies: correct the encoding or parser in your datastream settings, ensure that the source file only contains characters with one encoding type, or remove any invalid or corrupted characters from the source file.
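The first cause, an incorrectly defined encoding, is easy to reproduce in isolation. The following Python sketch decodes the same bytes with a wrong and a right encoding. The sample text is arbitrary, and the sketch only illustrates the general mechanism, not how Adverity decodes responses.

```python
# Bytes as a source encoded in ISO-8859-1 would send them: b'caf\xe9'.
raw = "caf\u00e9".encode("iso-8859-1")

# Decoding with the wrong encoding fails on the byte 0xe9.
try:
    raw.decode("utf-8")
    wrong_encoding_error = None
except UnicodeDecodeError as exc:
    wrong_encoding_error = str(exc)

# Decoding with the encoding that was actually used succeeds.
decoded = raw.decode("iso-8859-1")
```

The captured message names the codec, the offending byte, and its position, which can help you find the character that causes the problem in the source file.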

I see an error message when fetching data from Web Connect

The Request method 'POST'/'GET' is not supported error message appears when you are trying to use an unsupported request method.

To resolve this issue, change the request method:

  • If you are trying to use the POST request method, change it to GET.

  • If you are trying to use the GET request method, change it to POST.