Google Analytics Data (GA4): Tips and best practices#

Data accuracy and estimates#

Google Analytics Data (GA4) uses estimates in some data calculations. When comparing data from Google Analytics Data (GA4) with data in Adverity, compare data at the row level rather than totals for greater accuracy.

Metric inflation with dimensions#

Metric values inflate when you add more dimensions to Google Analytics Data (GA4) data collection. This is common in web analytics tools because individual sessions and users can be counted multiple times.

This occurs for two reasons:

  1. Non-aggregatable metrics use estimates. The total row in a Google Analytics Data (GA4) UI report may not match the sum of all rows. For example, a report with 3 rows of 10 sessions each might show a total of 25 instead of 30.

  2. Values are counted multiple times, such as when a session spans more than one day or a user visits multiple pages in a single session.

The totals in your Google Analytics Data (GA4) UI report may not match the data in Adverity.
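
The second cause can be illustrated with a minimal sketch. The user IDs and page paths below are made up for illustration. A user who views two pages is counted once per page, so the per-page user counts add up to more than the deduplicated total.

```python
from collections import defaultdict

# Hypothetical page-view events: (user_id, page_path)
events = [
    ("u1", "/home"),
    ("u1", "/pricing"),
    ("u2", "/home"),
    ("u3", "/pricing"),
    ("u3", "/home"),
]

# Unique users with no dimension applied
total_users = len({user for user, _ in events})

# Unique users broken down by page
users_by_page = defaultdict(set)
for user, page in events:
    users_by_page[page].add(user)

per_page_counts = {page: len(users) for page, users in users_by_page.items()}
summed_users = sum(per_page_counts.values())

print(total_users)      # 3
print(per_page_counts)  # {'/home': 3, '/pricing': 2}
print(summed_users)     # 5, higher than the true total of 3
```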

Field compatibility and datastream strategy#

Not all fields are compatible with each other. Use a GA4 demo account to test field compatibility before configuring production datastreams.

Separate user metrics into dedicated datastreams. Consider your end report requirements before creating datastreams. Many clients reduce user metric requirements once they understand each breakdown requires a separate datastream.

To report events as metrics, select the Events field and recast it using custom transformation scripts.

Filter visualization widgets by datastream to avoid double counting metrics.

Use datastream filters primarily for reporting unique users for specific segments (e.g., users with engagement).

Conversions use generic source fields rather than prefix fields such as firstUserSource or sessionSource.

Historical data and user reporting#

GA4 uses an event-based model while Universal Analytics (UA) uses a session-based model. Do not directly combine historic UA data with GA4 data due to different methodologies.

The GA4 API does not report new and returning users as metrics. Instead, it reports totalUsers and newUsers. Some analysts use the firstSessionDate dimension to calculate new versus returning users.

Alternatively, use the newVsReturning dimension to classify whether users are new or returning.
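
One possible way to apply the firstSessionDate approach is sketched below. It assumes a report with the dimensions date and firstSessionDate and the metric totalUsers, with both dates returned as YYYYMMDD strings. The rows and the split_new_returning helper are hypothetical, and treating a user as new when the two dates match is an assumption rather than an official definition.

```python
# Hypothetical rows for a single report date, broken down by firstSessionDate.
rows = [
    {"date": "20240102", "firstSessionDate": "20240102", "totalUsers": 40},
    {"date": "20240102", "firstSessionDate": "20240101", "totalUsers": 15},
    {"date": "20240102", "firstSessionDate": "20231220", "totalUsers": 5},
]

def split_new_returning(rows):
    """Sum users whose first session fell on the report date (new)
    versus on an earlier date (returning)."""
    counts = {"new": 0, "returning": 0}
    for row in rows:
        key = "new" if row["firstSessionDate"] == row["date"] else "returning"
        counts[key] += row["totalUsers"]
    return counts

print(split_new_returning(rows))  # {'new': 40, 'returning': 20}
```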

Source field variations#

The GA4 API separates common dimensions (source/medium/channel grouping, campaign) into multiple dimensions (sessionSource, sessionMedium, sessionChannelGrouping).

Dimensions without session or first user prefixes (source, medium, channelGrouping, campaign) are tied to conversion events and only return data for metrics associated with conversions.

To report on sessions, bounces, and page views at a total level, use dimensions prefixed with session or first user.
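
The naming pattern can be sketched as a small helper. The function scoped_dimension is hypothetical and only builds a name from a prefix. It does not check that the resulting dimension exists in the API, so verify each name before using it in a datastream.

```python
def scoped_dimension(base: str, scope: str) -> str:
    """Build a dimension name from a base field and a scope.

    scope is "session", "firstUser", or "event". The "event" scope
    returns the unprefixed name, which is tied to conversion events.
    """
    if scope == "event":
        return base
    if scope not in ("session", "firstUser"):
        raise ValueError(f"unknown scope: {scope}")
    # Prefix plus the base name with its first letter capitalized
    return scope + base[0].upper() + base[1:]

print(scoped_dimension("source", "session"))    # sessionSource
print(scoped_dimension("source", "firstUser"))  # firstUserSource
print(scoped_dimension("source", "event"))      # source
```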

Advanced filtering configuration#

Datastream filters are not commonly used. For aggregate metrics, filter data using enrichments after fetching. However, for specific segments of the totalUsers metric, filter the API call because totalUsers is non-aggregatable.

Example filter:

{
  "filter":{
    "fieldName":"eventName",
    "stringFilter":{
      "matchType":"CONTAINS",
      "value":"user_engagement",
      "caseSensitive":false
    }
  }
}
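
A filter with the same shape as the example above can be generated programmatically. The sketch below is illustrative only: build_string_filter and row_matches are hypothetical helpers, and row_matches merely imitates a case-insensitive CONTAINS match locally. It is not how the API evaluates filters.

```python
import json

def build_string_filter(field_name, value, match_type="CONTAINS", case_sensitive=False):
    """Build a filter object in the same shape as the example above."""
    return {
        "filter": {
            "fieldName": field_name,
            "stringFilter": {
                "matchType": match_type,
                "value": value,
                "caseSensitive": case_sensitive,
            },
        }
    }

def row_matches(filter_spec, row):
    """Local imitation of CONTAINS matching, for illustration only."""
    spec = filter_spec["filter"]
    string_filter = spec["stringFilter"]
    if string_filter["matchType"] != "CONTAINS":
        raise NotImplementedError("this sketch only handles CONTAINS")
    actual = row.get(spec["fieldName"], "")
    expected = string_filter["value"]
    if not string_filter["caseSensitive"]:
        actual, expected = actual.lower(), expected.lower()
    return expected in actual

engagement_filter = build_string_filter("eventName", "user_engagement")
print(json.dumps(engagement_filter, indent=2))
print(row_matches(engagement_filter, {"eventName": "User_Engagement"}))  # True
print(row_matches(engagement_filter, {"eventName": "page_view"}))        # False
```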

Attribution and engagement metrics#

Attribution models are not available via the API.

The Force daily fetch setting queries each day separately, so that the API returns complete data instead of sampled data. This feature is currently not available in GA4.

Skip profile if quota exceeded: when the hourly quota limit (1,250 tokens/hour) is reached, Adverity waits for the quota to reset and then continues the fetch to avoid incomplete reports.

API quota limitations#

A GA4 property can make the following API requests:

  • 10 concurrent requests (total requests in flight at any time)

  • 1,250 tokens/hour

  • 25,000 tokens/day

GA4 has no project-level quota (previously 50,000 tokens per project per day).

Enterprise clients: Consider internal GA quotas that might influence data retrieval from the API.
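
To get a rough sense of how the hourly limit stretches a large fetch, the sketch below groups requests into hourly batches. The per-request token costs are invented for illustration, and plan_hourly_batches is a hypothetical helper. It assumes the quota resets in fixed hourly windows and ignores the daily and concurrency limits.

```python
HOURLY_TOKEN_LIMIT = 1250  # hourly limit stated above

def plan_hourly_batches(token_costs, hourly_limit=HOURLY_TOKEN_LIMIT):
    """Group request costs into consecutive batches that each fit one hourly window."""
    batches = [[]]
    used = 0
    for cost in token_costs:
        if cost > hourly_limit:
            raise ValueError("a single request exceeds the hourly limit")
        if used + cost > hourly_limit:
            # Quota would be exceeded: wait for the next window
            batches.append([])
            used = 0
        batches[-1].append(cost)
        used += cost
    return batches

# Hypothetical token cost per request
costs = [500, 500, 400, 900]
batches = plan_hourly_batches(costs)
print(batches)       # [[500, 500], [400], [900]]
print(len(batches))  # 3 hourly windows needed
```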

Engaged sessions and bounce rate#

An engaged session is one that meets at least one of the following criteria:

  • Lasts 10 seconds or longer

  • Has one or more conversion events

  • Has two or more page or screen views

Bounce rate: (sessions - engagedSessions) / sessions

Engagement rate: engagedSessions / sessions
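
The criteria and formulas above can be expressed as a short sketch. It treats the three engagement criteria as alternatives, so meeting any one of them makes a session engaged. The function names and sample sessions are hypothetical.

```python
def is_engaged(duration_seconds, conversion_events, views):
    """A session is engaged if it meets any one of the three criteria."""
    return duration_seconds >= 10 or conversion_events >= 1 or views >= 2

def engagement_rate(sessions, engaged_sessions):
    return engaged_sessions / sessions if sessions else 0.0

def bounce_rate(sessions, engaged_sessions):
    return (sessions - engaged_sessions) / sessions if sessions else 0.0

# Hypothetical sessions: (duration in seconds, conversion events, page or screen views)
sample = [(4, 0, 1), (12, 0, 1), (3, 1, 1), (5, 0, 2)]
sessions = len(sample)
engaged = sum(is_engaged(*s) for s in sample)

print(engaged)                             # 3
print(engagement_rate(sessions, engaged))  # 0.75
print(bounce_rate(sessions, engaged))      # 0.25
```

Bounce rate and engagement rate always sum to 1, so only one of them needs to be stored.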

Sessions calculation differences#

In Universal Analytics, a session represents the time a user actively engages with your site.

In GA4, the session_start event generates a session ID that is associated with all subsequent events during the session. As in UA, sessions end after 30 minutes of inactivity. Unlike UA, GA4 sessions can carry over past midnight and are not restarted by new campaign parameters. For sites with global audiences, this can cause discrepancies between UA and GA4 session figures.
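
The effect of the midnight rule can be sketched with a simple session counter. This is an illustration under simplified assumptions, not Google's implementation: count_sessions is a hypothetical helper, timestamps belong to a single user, and campaign changes are ignored.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)

def count_sessions(timestamps, split_at_midnight=False):
    """Count sessions for one user's event timestamps.

    split_at_midnight=True imitates the UA behaviour of ending
    sessions at midnight; False imitates GA4.
    """
    sessions = 0
    previous = None
    for ts in sorted(timestamps):
        new_session = (
            previous is None
            or ts - previous >= SESSION_TIMEOUT
            or (split_at_midnight and ts.date() != previous.date())
        )
        if new_session:
            sessions += 1
        previous = ts
    return sessions

events = [
    datetime(2024, 1, 1, 23, 50),
    datetime(2024, 1, 2, 0, 5),   # 15 minutes later, after midnight
    datetime(2024, 1, 2, 1, 0),   # 55 minutes later
]

print(count_sessions(events))                          # 2 (GA4-style)
print(count_sessions(events, split_at_midnight=True))  # 3 (UA-style)
```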

Data reconciliation and validation#

Session figures for the previous day continue to reconcile until 5am UTC of the current day. To match the UI for yesterday, schedule GA4 datastreams after 5am UTC.

When comparing total session numbers with breakdowns (e.g., Channel Grouping), totals may differ because GA4 uses algorithms to calculate session numbers for breakdowns rather than summing them.

GA4 UI applies thresholds to certain views when Google Signals is enabled. Data thresholds prevent inferring individual user identity based on demographics, interests, or other signals. These cannot be changed in the UI.

Page and URL field differences#

pageLocation is the full URL including all parameters. pagePath includes only the portion following the hostname (excluding URL parameters). landingPagePlusQueryString shows the portion following the hostname and includes query parameters.

For landing page reports, use landingPagePlusQueryString and group by pagePath.
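
The relationship between these fields can be illustrated with Python's standard URL parser. The URL is made up, and page_fields is a hypothetical helper that only approximates how the fields relate to each other. It does not reproduce any normalization GA4 may apply.

```python
from urllib.parse import urlsplit

def page_fields(page_location):
    """Derive the path and path-plus-query portions from a full URL."""
    parts = urlsplit(page_location)
    path_plus_query = parts.path + ("?" + parts.query if parts.query else "")
    return {
        "pageLocation": page_location,           # full URL with parameters
        "pagePath": parts.path,                  # portion after the hostname, no parameters
        "pathPlusQueryString": path_plus_query,  # portion after the hostname, with parameters
    }

fields = page_fields("https://example.com/blog/post?utm_source=news&id=7")
print(fields["pagePath"])             # /blog/post
print(fields["pathPlusQueryString"])  # /blog/post?utm_source=news&id=7
```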

Session duration and time metrics#

The Time on Page metric is not available in GA4 and is replaced by average engagement time in the UI Pages and Screens report.

Average engagement time is more accurate than Time on Page, which was not calculated for exit pages or bounced pages.

To report average session duration, calculate total_duration as a target field using an enrichment:

total_duration = averageSessionDuration x sessions

Create a calculated KPI: averageSessionDuration = total_duration / sessions
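
The reason for the intermediate total_duration field is that averages cannot be summed or averaged across rows without weighting. A minimal sketch with made-up daily rows shows the difference between the weighted result and a naive mean of the daily averages.

```python
# Hypothetical daily rows as fetched from GA4
rows = [
    {"date": "2024-01-01", "sessions": 100, "averageSessionDuration": 60.0},
    {"date": "2024-01-02", "sessions": 300, "averageSessionDuration": 120.0},
]

# Enrichment step: total_duration = averageSessionDuration x sessions
for row in rows:
    row["total_duration"] = row["averageSessionDuration"] * row["sessions"]

# Calculated KPI over any aggregation: total_duration / sessions
total_duration = sum(row["total_duration"] for row in rows)
total_sessions = sum(row["sessions"] for row in rows)
weighted_average = total_duration / total_sessions

# Naive mean of the daily averages, shown for comparison only
naive_average = sum(row["averageSessionDuration"] for row in rows) / len(rows)

print(weighted_average)  # 105.0
print(naive_average)     # 90.0
```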

Cross-channel attribution#

Google supports cross-channel attribution as part of its data-driven models. These models identify whether users reach your website through multiple channels (paid and organic). As a result, you will see multiple networks in conversion paths.

Setup best practices#

Visualization considerations#

Filter visualization widgets by datastream.

Schema clarity#

Use clear schema mapping (e.g., users_daily_by_page).

Account management#

Smaller clients: Fetch multiple accounts in one datastream.

Larger clients: Split accounts into separate datastreams to optimize performance for large datasets.