
Adobe Analytics Data Feeds provide flexible access to raw data, enabling complex manipulations and integration with other sources. They complement tools like Adobe Analysis Workspace by filling analytical gaps. Data Feeds store each server call as a row, delivered in batches, and include key metrics. Aggregated tables and global filters help streamline analysis and ensure consistency. This guide introduces their potential and sets the stage for further exploration.

Introduction

Data Feeds are a powerful way to tap into the raw, granular data you collect with Adobe. Ingesting this into a database creates endless interesting use cases, including:

While Data Feeds should not be the sole method for analyzing Adobe Analytics data, they can fill in the gaps that Adobe Analysis Workspace, Data Warehouse, or Report Builder may leave. Consider Data Feeds one of the many tools in your Adobe Analytics toolkit!

Comparing Data Feeds vs. other Adobe Analytics tools:

Use Case | Workspace | Data Warehouse | Data Feeds
General performance reporting | Best | Possible | Possible
Join SSOT data with web analytics and bring in to BI tools (e.g. orders from your SSOT finance sources vs visits from Adobe) | Difficult | Possible | Possible
Change definitions of eVars for persistence and attribution methods | Possible | Not possible | Possible
Leverage modeling outside of Workspace | Difficult | Possible | Possible

Note: This playbook will not walk through how to ingest data feeds into a database. It will assume data feeds are already easily accessible for querying.

Understanding the Data Feed
What are Data Feeds?

Every Adobe Analytics server call is stored as a single row in a Data Feed. A separate Data Feed needs to be set up for each report suite. Data is delivered in hourly batches after each hour, or in daily batches at the conclusion of each day.

Across each implementation, the format of the raw Data Feed will stay the same. A comprehensive list of all the columns available is here: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-reference. This section of the playbook will highlight a few important columns.

Occurrence (Hit), Visit, and Visitor Identifiers

To replicate out-of-the-box (OOTB) occurrences, visits, and unique visitors, you’ll need a combination of four columns: post_visid_high, post_visid_low, visit_num, and visit_page_num.

Here’s starter SQL code to get a count of visitors, visits, and occurrences:

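-- The '-' delimiter keeps different ID combinations from concatenating to the same string.
SELECT
COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW)) AS VISITORS,
COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW, '-', VISIT_NUM)) AS VISITS,
COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW, '-', VISIT_NUM, '-', VISIT_PAGE_NUM)) AS OCCURRENCES
FROM DATA_FEEDS
-- Keep hits with HIT_SOURCE 1 that are not flagged for exclusion.
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'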

TIP: In Adobe Workspace, the “Visit Depth” dimension is equivalent to the visit_num Data Feed column. The “Hit Depth” dimension in Workspace is equivalent to the visit_page_num Data Feed column.
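
To try the visitor, visit, and occurrence counts without a warehouse, here is a minimal sketch that runs the same logic on a toy table. It assumes SQLite through Python's built-in sqlite3 module as a stand-in for your database, so concatenation uses `||` rather than CONCAT, and the sample rows are invented for illustration. The '-' delimiter is this example's choice: it keeps the pairs ("1", "23") and ("12", "3") from collapsing into the same string.

```python
import sqlite3

# Toy stand-in for the raw feed; a real Data Feed has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM TEXT, VISIT_PAGE_NUM TEXT, HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
rows = [
    ("1", "23", "1", "1", "1", "0"),  # visitor A, visit 1, hit 1
    ("1", "23", "1", "2", "1", "0"),  # visitor A, visit 1, hit 2
    ("1", "23", "2", "1", "1", "0"),  # visitor A, visit 2, hit 1
    ("12", "3", "1", "1", "1", "0"),  # visitor B; "12"+"3" equals "1"+"23" without a delimiter
    ("9", "9", "1", "1", "1", "1"),   # excluded hit, must not be counted
]
conn.executemany("INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT
  COUNT(DISTINCT POST_VISID_HIGH || '-' || POST_VISID_LOW) AS VISITORS,
  COUNT(DISTINCT POST_VISID_HIGH || '-' || POST_VISID_LOW || '-' || VISIT_NUM) AS VISITS,
  COUNT(DISTINCT POST_VISID_HIGH || '-' || POST_VISID_LOW || '-' || VISIT_NUM
        || '-' || VISIT_PAGE_NUM) AS OCCURRENCES
FROM DATA_FEEDS
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
"""
visitors, visits, occurrences = conn.execute(query).fetchone()
print(visitors, visits, occurrences)
```

With these five rows the query reports two visitors, three visits, and four occurrences; dropping the delimiter would merge the two visitors into one.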

Notes on the visit_page_num column:

TIP: If you want to see the exact actions a user has made, you can select a visitor ID, order by visit_num, and then by visit_page_num to get a hit-by-hit recount of their actions. This is helpful when debugging customer journeys - test it out on your own actions as you navigate your digital property!
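
The hit-by-hit recount described in this tip could look like the sketch below. It assumes SQLite through Python's sqlite3 module as a stand-in for your database, a PAGE_NAME column holding the page name, and invented sample rows; swap in your own visitor ID pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM INTEGER, VISIT_PAGE_NUM INTEGER, PAGE_NAME TEXT, "
    "HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
# Rows are deliberately inserted out of order, as raw feed rows may arrive.
conn.executemany(
    "INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("1", "23", 2, 1, "home", "1", "0"),
        ("1", "23", 1, 2, "product", "1", "0"),
        ("1", "23", 1, 1, "home", "1", "0"),
        ("1", "23", 1, 3, "cart", "1", "0"),
        ("7", "7", 1, 1, "home", "1", "0"),  # a different visitor
    ],
)

# Select one visitor, then order by visit and by hit within the visit.
journey = conn.execute(
    """
    SELECT VISIT_NUM, VISIT_PAGE_NUM, PAGE_NAME
    FROM DATA_FEEDS
    WHERE POST_VISID_HIGH = ? AND POST_VISID_LOW = ?
      AND HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    ORDER BY VISIT_NUM, VISIT_PAGE_NUM
    """,
    ("1", "23"),
).fetchall()

for visit_num, visit_page_num, page_name in journey:
    print(visit_num, visit_page_num, page_name)
```

If your feed stores visit_num and visit_page_num as text, cast them to integers before ordering so that hit 10 does not sort ahead of hit 2.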

eVars and props

Each eVar and prop has a dedicated column in the Data Feed, regardless of whether it is enabled or has any data populated. Each variable also has two columns: one for the pre-processed value and one for the post-processed value. The post-processed column has eVar attribution and expiration logic applied.

Refer to this flowchart to understand when processing happens: https://experienceleague.adobe.com/docs/analytics/technotes/processing-order.html?lang=en

In total there will be 500 columns of eVars (i.e. 2 sets of 250 possible eVars that can be created for an implementation) and 150 columns of props (i.e. 2 sets of 75 possible props that can be set up).

TIP: Ever wish the eVar you captured had the properties of a prop? The non-post eVar columns behave like a prop because no eVar attribution or persistence logic has been applied to them. Use the non-post columns to filter for hits where that eVar was actually set, exactly as you would with a prop!
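
As a rough illustration of the difference, the sketch below counts hits using the pre-processed and post-processed columns of one eVar. It assumes SQLite through Python's sqlite3 module as a stand-in for your database, and the sample rows are invented: the value is set on the first hit and persists into POST_EVAR1 on the following hits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM INTEGER, VISIT_PAGE_NUM INTEGER, EVAR1 TEXT, POST_EVAR1 TEXT, "
    "HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
conn.executemany(
    "INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("1", "23", 1, 1, "promo", "promo", "1", "0"),  # eVar set on this hit
        ("1", "23", 1, 2, "", "promo", "1", "0"),       # value persisted only
        ("1", "23", 1, 3, "", "promo", "1", "0"),       # value persisted only
    ],
)

# EVAR1 behaves like a prop (present only where set);
# POST_EVAR1 carries the persisted value.
set_on_hit, persisted = conn.execute(
    """
    SELECT
      SUM(CASE WHEN EVAR1 <> '' THEN 1 ELSE 0 END),
      SUM(CASE WHEN POST_EVAR1 <> '' THEN 1 ELSE 0 END)
    FROM DATA_FEEDS
    WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    """
).fetchone()
print(set_on_hit, persisted)
```

Here the non-post column matches one hit while the post column matches all three, which is the gap the tip describes.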

Segments and Calculated Metrics

Segments, calculated metrics, and Adobe’s out-of-the-box (OOTB) metrics are not part of the raw Data Feed, so you need to define each of them yourself in your queries. For OOTB metrics and segments, refer to Adobe’s technical documentation.

The logic to replicate some common metrics is found here: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-calculate

In the next section, sample code is provided to replicate Adobe’s OOTB bounces metric as an example. An upcoming Experience League post will provide additional examples of how segments and metrics can be recreated with Data Feeds.

Querying Data Feeds
The Basics

Our first query counts visits while excluding any unnecessary hits. To limit it to a specific day, add a filter on a date column such as date_time.

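-- The '-' delimiter keeps different ID combinations from concatenating to the same string.
SELECT
COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW, '-', VISIT_NUM)) AS VISITS
FROM DATA_FEEDS
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'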

Next, we’ll count visits across their entry pages.

-- Find the first hit of each visit.
WITH MIN_VISIT_PAGE_NUM_TABLE AS (
SELECT
POST_VISID_HIGH,
POST_VISID_LOW,
VISIT_NUM,
MIN(VISIT_PAGE_NUM) AS MIN_VISIT_PAGE_NUM
FROM DATA_FEEDS
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
GROUP BY 1,2,3)

-- The inner join keeps only the first hit of each visit; count visits per entry page.
SELECT
A.PAGE_NAME,
COUNT(DISTINCT CONCAT(A.POST_VISID_HIGH, '-', A.POST_VISID_LOW, '-', A.VISIT_NUM)) AS ENTRIES
FROM DATA_FEEDS A
INNER JOIN MIN_VISIT_PAGE_NUM_TABLE B
ON A.POST_VISID_HIGH = B.POST_VISID_HIGH
AND A.POST_VISID_LOW = B.POST_VISID_LOW
AND A.VISIT_NUM = B.VISIT_NUM
AND A.VISIT_PAGE_NUM = B.MIN_VISIT_PAGE_NUM
WHERE A.HIT_SOURCE = '1' AND A.EXCLUDE_HIT = '0'
GROUP BY 1

TIP: To replicate the above for exits, find the MAX visit_page_num instead of min.
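
The exits variant described in this tip could look like the sketch below. It assumes SQLite through Python's sqlite3 module as a stand-in for your database (hence `||` for concatenation), a PAGE_NAME column, and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM INTEGER, VISIT_PAGE_NUM INTEGER, PAGE_NAME TEXT, "
    "HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
conn.executemany(
    "INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("1", "23", 1, 1, "home", "1", "0"),
        ("1", "23", 1, 2, "product", "1", "0"),
        ("1", "23", 1, 3, "cart", "1", "0"),   # visit exits on cart
        ("1", "23", 2, 1, "home", "1", "0"),   # single-hit visit exits on home
        ("12", "3", 1, 1, "home", "1", "0"),
        ("12", "3", 1, 2, "cart", "1", "0"),   # visit exits on cart
    ],
)

exits = conn.execute(
    """
    -- Last hit of each visit.
    WITH MAX_VISIT_PAGE_NUM_TABLE AS (
      SELECT POST_VISID_HIGH, POST_VISID_LOW, VISIT_NUM,
             MAX(VISIT_PAGE_NUM) AS MAX_VISIT_PAGE_NUM
      FROM DATA_FEEDS
      WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
      GROUP BY 1, 2, 3
    )
    -- Keep only each visit's last hit and count visits per exit page.
    SELECT A.PAGE_NAME,
           COUNT(DISTINCT A.POST_VISID_HIGH || '-' || A.POST_VISID_LOW
                 || '-' || A.VISIT_NUM) AS EXITS
    FROM DATA_FEEDS A
    INNER JOIN MAX_VISIT_PAGE_NUM_TABLE B
      ON A.POST_VISID_HIGH = B.POST_VISID_HIGH
     AND A.POST_VISID_LOW = B.POST_VISID_LOW
     AND A.VISIT_NUM = B.VISIT_NUM
     AND A.VISIT_PAGE_NUM = B.MAX_VISIT_PAGE_NUM
    WHERE A.HIT_SOURCE = '1' AND A.EXCLUDE_HIT = '0'
    GROUP BY A.PAGE_NAME
    ORDER BY A.PAGE_NAME
    """
).fetchall()
print(exits)
```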

Finally, we’ll replicate the OOTB bounces metric.

-- Last hit of each visit.
WITH MAX_VISIT_PAGE_NUM_TABLE AS (
SELECT
POST_VISID_HIGH,
POST_VISID_LOW,
VISIT_NUM,
MAX(VISIT_PAGE_NUM) AS MAX_VISIT_PAGE_NUM
FROM DATA_FEEDS
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
GROUP BY 1,2,3),

-- First hit of each visit.
MIN_VISIT_PAGE_NUM_TABLE AS (
SELECT
POST_VISID_HIGH,
POST_VISID_LOW,
VISIT_NUM,
MIN(VISIT_PAGE_NUM) AS MIN_VISIT_PAGE_NUM
FROM DATA_FEEDS
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
GROUP BY 1,2,3)

-- A bounce is a visit whose first and last hit are the same hit.
SELECT
COUNT(DISTINCT CONCAT(A.POST_VISID_HIGH, '-', A.POST_VISID_LOW, '-', A.VISIT_NUM)) AS BOUNCES
FROM MAX_VISIT_PAGE_NUM_TABLE A
INNER JOIN MIN_VISIT_PAGE_NUM_TABLE B
ON A.POST_VISID_HIGH = B.POST_VISID_HIGH
AND A.POST_VISID_LOW = B.POST_VISID_LOW
AND A.VISIT_NUM = B.VISIT_NUM
WHERE A.MAX_VISIT_PAGE_NUM = B.MIN_VISIT_PAGE_NUM

Data Feed Strategy
Build Aggregated Tables

It can be beneficial to build aggregated views of the raw Data Feed. Benefits include:

The types of aggregated tables needed will be dependent on each organization and your stakeholders, but here are some examples to get you started:

Keep in mind how the end user will need to join these tables to get their desired results. For example, it may be easiest to have each aggregated table include post_visid_high, post_visid_low, visit_num, and visit_page_num so you can join across any level of granularity.
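
As one possible shape for such a table, the sketch below builds a visit-level summary that keeps the identifier columns as join keys. It assumes SQLite through Python's sqlite3 module as a stand-in for your database; the table name VISIT_SUMMARY, its column names, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM INTEGER, VISIT_PAGE_NUM INTEGER, HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
conn.executemany(
    "INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("1", "23", 1, 1, "1", "0"),
        ("1", "23", 1, 2, "1", "0"),
        ("1", "23", 1, 3, "1", "0"),
        ("1", "23", 2, 1, "1", "0"),  # single-hit visit
        ("12", "3", 1, 1, "1", "0"),
        ("12", "3", 1, 2, "1", "0"),
    ],
)

# Hypothetical aggregated table: one row per visit, keyed on the ID columns
# so it can be joined back to the raw feed or to other aggregates.
conn.execute(
    """
    CREATE TABLE VISIT_SUMMARY AS
    SELECT
      POST_VISID_HIGH,
      POST_VISID_LOW,
      VISIT_NUM,
      MIN(VISIT_PAGE_NUM) AS FIRST_VISIT_PAGE_NUM,
      MAX(VISIT_PAGE_NUM) AS LAST_VISIT_PAGE_NUM,
      COUNT(*) AS HITS
    FROM DATA_FEEDS
    WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    GROUP BY 1, 2, 3
    """
)

# Downstream questions become simple queries against the summary.
visits, bounces = conn.execute(
    """
    SELECT
      COUNT(*),
      SUM(CASE WHEN FIRST_VISIT_PAGE_NUM = LAST_VISIT_PAGE_NUM THEN 1 ELSE 0 END)
    FROM VISIT_SUMMARY
    """
).fetchone()
print(visits, bounces)
```

Keeping the first and last visit_page_num in the summary lets you join back to the exact entry or exit hit when you need hit-level detail.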

Global Filters

Some organizations have global filters applied to all reporting (e.g. to exclude bots, take out fraudulent traffic, etc.). Consider creating an aggregated table that replicates this filter and joining it to any queries against the raw Data Feed.

Having a centralized table means the filter logic is maintained in one place as it changes over time, instead of in every individual query.
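
A minimal sketch of this pattern follows. It assumes SQLite through Python's sqlite3 module as a stand-in for your database; the GLOBAL_FILTER table name, the user-agent rule, and the sample rows are all hypothetical, and your organization's real filter will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM INTEGER, VISIT_PAGE_NUM INTEGER, USER_AGENT TEXT, "
    "HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
conn.executemany(
    "INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("1", "23", 1, 1, "Mozilla/5.0", "1", "0"),
        ("1", "23", 2, 1, "Mozilla/5.0", "1", "0"),
        ("12", "3", 1, 1, "ExampleBot/1.0", "1", "0"),  # traffic to exclude
    ],
)

# Hypothetical filter rule, defined once: visitors that pass the global filter.
conn.execute(
    """
    CREATE TABLE GLOBAL_FILTER AS
    SELECT DISTINCT POST_VISID_HIGH, POST_VISID_LOW
    FROM DATA_FEEDS
    WHERE LOWER(USER_AGENT) NOT LIKE '%bot%'
    """
)

# Any query can apply the filter with the same inner join.
(filtered_visits,) = conn.execute(
    """
    SELECT COUNT(DISTINCT A.POST_VISID_HIGH || '-' || A.POST_VISID_LOW
                 || '-' || A.VISIT_NUM)
    FROM DATA_FEEDS A
    INNER JOIN GLOBAL_FILTER F
      ON A.POST_VISID_HIGH = F.POST_VISID_HIGH
     AND A.POST_VISID_LOW = F.POST_VISID_LOW
    WHERE A.HIT_SOURCE = '1' AND A.EXCLUDE_HIT = '0'
    """
).fetchone()
print(filtered_visits)
```

When the filter rule changes, only the statement that builds GLOBAL_FILTER needs to be updated; the queries that join to it stay the same.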

Discrepancy Monitoring with Workspace

At the start of your Data Feed journey, run a quick count of visits by day and make sure each day lines up with Workspace. Even with the Data Feed enabled, files can be missing: your internal Data Engineering pipeline may have failed to process some of them, or Adobe may have skipped sending them.

Regardless, it’s handy to set up regular validations that Workspace continues to line up with your Data Feeds and any aggregated tables.
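
A recurring validation could be as simple as the sketch below. It assumes SQLite through Python's sqlite3 module as a stand-in for your database, a DATE_TIME column in 'YYYY-MM-DD HH:MM:SS' format, and Workspace daily visit counts that you have exported yourself; the workspace_visits numbers, the 2% tolerance, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_FEEDS (POST_VISID_HIGH TEXT, POST_VISID_LOW TEXT, "
    "VISIT_NUM INTEGER, VISIT_PAGE_NUM INTEGER, DATE_TIME TEXT, "
    "HIT_SOURCE TEXT, EXCLUDE_HIT TEXT)"
)
conn.executemany(
    "INSERT INTO DATA_FEEDS VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("1", "23", 1, 1, "2024-01-01 09:00:00", "1", "0"),
        ("1", "23", 1, 2, "2024-01-01 09:05:00", "1", "0"),
        ("12", "3", 1, 1, "2024-01-01 14:00:00", "1", "0"),
        ("1", "23", 2, 1, "2024-01-02 10:00:00", "1", "0"),
    ],
)

# Daily visit counts from the feed.
feed_visits = dict(
    conn.execute(
        """
        SELECT DATE(DATE_TIME) AS DAY,
               COUNT(DISTINCT POST_VISID_HIGH || '-' || POST_VISID_LOW
                     || '-' || VISIT_NUM) AS VISITS
        FROM DATA_FEEDS
        WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
        GROUP BY 1
        """
    ).fetchall()
)

# Hypothetical daily visit counts exported from Workspace.
workspace_visits = {"2024-01-01": 2, "2024-01-02": 2}
TOLERANCE = 0.02  # flag days that differ by more than 2%

# Map each mismatching day to (Workspace count, feed count).
mismatches = {
    day: (expected, feed_visits.get(day, 0))
    for day, expected in workspace_visits.items()
    if abs(feed_visits.get(day, 0) - expected) > TOLERANCE * expected
}
print(mismatches)
```

In this toy data, 2024-01-02 is flagged because the feed holds one visit where Workspace reports two, the kind of gap a missing file would cause.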

Conclusion

Wrangling your Adobe Data Feeds can feel like a daunting project. But once you’ve got a handle on them, there are endless possibilities to customize your data and serve your specific use cases.

This article sets the foundation for understanding these Feeds and just scratches the surface. Stay tuned for more Experience League articles to help you dive deeper into this rich data!

Additional Resources