Adobe Analytics Data Feeds provide flexible access to raw data, enabling complex manipulations and integration with other sources. They complement tools like Adobe Analysis Workspace by filling analytical gaps. Data Feeds store each server call as a row, delivered in batches, and include key metrics. Aggregated tables and global filters help streamline analysis and ensure consistency. This guide introduces their potential and sets the stage for further exploration.
Introduction
Data Feeds are a powerful way to tap into the raw, granular data you collect with Adobe. Ingesting this into a database creates endless interesting use cases, including:
- Unparalleled flexibility to manipulate the data exactly how you need it (e.g. changing eVar persistence, complex sequential logic)
- Joining Adobe Analytics data with other data sources to enable 360 reporting or as inputs to models and decision drivers
While Data Feeds should not be the sole method for analyzing Adobe Analytics data, they can fill the gaps that Adobe Analysis Workspace, Data Warehouse, or Report Builder may leave. Consider Data Feeds one of the many tools in your Adobe Analytics toolkit!
Comparing Data Feeds vs Other Adobe Analytics tools:
Use Case | Workspace | Data Warehouse | Data Feeds
Note: This playbook will not walk through how to ingest Data Feeds into a database. It assumes Data Feeds are already easily accessible for querying.
- For guidance on setting up and ingesting Data Feeds, refer to: https://experienceleague.adobe.com/docs/analytics/export/analytics-data-feed/data-feed-overview.html?lang=en
- For an overview of what files are included in the Data Feed: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-contents
Understanding the Data Feed
What are Data Feeds?
Every Adobe Analytics server call is stored as a single row in a Data Feed. A separate Data Feed needs to be set up for each report suite. Data is delivered in hourly batches after each hour, or in daily batches at the conclusion of each day.
Across each implementation, the format of the raw Data Feed will stay the same. A comprehensive list of all the columns available is here: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-reference. This section of the playbook will highlight a few important columns.
Occurrence (Hit), Visit, and Visitor Identifiers
To replicate out-of-the-box (OOTB) occurrences, visits, and unique visitors, you’ll need a combination of four columns: post_visid_high, post_visid_low, visit_num, and visit_page_num.
- Unique Visitors: A concatenation of post_visid_high and post_visid_low is used to get a visitor ID. Distinct counts of this visitor ID will replicate unique visitors.
- Visits: A concatenation of post_visid_high, post_visid_low, and visit_num is used to get a visit ID. Distinct counts of this visit ID will replicate visits. In rare cases the same visit_num can end up attached to more than one visit, so you may also need to concatenate visit_start_time_gmt.
- Occurrences: A concatenation of post_visid_high, post_visid_low, visit_num, and visit_page_num is used to get a hit ID. Distinct counts of this hit ID will replicate occurrences.
Here’s starter SQL code to get a count of visitors, visits, and occurrences:
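SELECT
    -- The '-' delimiter keeps ID pairs such as (1, 23) and (12, 3) from collapsing into the same string
    COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW)) AS VISITORS,
    COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW, '-', VISIT_NUM)) AS VISITS,
    COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW, '-', VISIT_NUM, '-', VISIT_PAGE_NUM)) AS OCCURRENCES
FROM DATA_FEEDS
-- Keep only hits with hit_source 1 that Adobe has not flagged for exclusion
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'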
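To sanity-check the counting logic before pointing it at a large table, it can help to run it on a handful of made-up rows. The sketch below uses Python's built-in sqlite3 module; the table layout and sample values are hypothetical, SQLite's `||` operator stands in for CONCAT, and a '-' delimiter is added so that the IDs (1, 23) and (12, 3) stay distinct.

```python
import sqlite3

# Hypothetical rows: (post_visid_high, post_visid_low, visit_num,
# visit_page_num, hit_source, exclude_hit). Values are made up.
ROWS = [
    ("1", "23", 1, 1, "1", "0"),
    ("1", "23", 1, 2, "1", "0"),
    ("1", "23", 2, 1, "1", "0"),
    ("12", "3", 1, 1, "1", "0"),  # a second visitor
    ("12", "3", 1, 2, "1", "1"),  # excluded hit, filtered out below
]

QUERY = """
SELECT
    COUNT(DISTINCT post_visid_high || '-' || post_visid_low),
    COUNT(DISTINCT post_visid_high || '-' || post_visid_low || '-' || visit_num),
    COUNT(DISTINCT post_visid_high || '-' || post_visid_low || '-' || visit_num
          || '-' || visit_page_num)
FROM data_feeds
WHERE hit_source = '1' AND exclude_hit = '0'
"""


def count_metrics(rows):
    """Load the sample rows into an in-memory table and return
    (visitors, visits, occurrences)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_feeds (post_visid_high TEXT, post_visid_low TEXT, "
        "visit_num INTEGER, visit_page_num INTEGER, hit_source TEXT, exclude_hit TEXT)"
    )
    conn.executemany("INSERT INTO data_feeds VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchone()
    conn.close()
    return result


visitors, visits, occurrences = count_metrics(ROWS)
print(visitors, visits, occurrences)  # prints: 2 3 4
```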
TIP: In Adobe Workspace, the “Visit Depth” dimension is equivalent to the visit_num Data Feed column. The “Hit Depth” dimension in Workspace is equivalent to the visit_page_num Data Feed column.
Notes on the visit_page_num column:
- Don’t let the “page” in the visit_page_num column name mislead you! It does not increment only on page views; it increments on every analytics server call.
- Visits do not always start on visit_page_num = 1, though most of the time they do. For precise reporting, if you need to pull the first hit of a visit, calculate the minimum visit_page_num for each visit ID.
TIP: If you want to see the exact actions a user has taken, you can select a visitor ID, order by visit_num, and then by visit_page_num to get a hit-by-hit account of their actions. This is helpful when debugging customer journeys. Test it out on your own actions as you navigate your digital property!
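The tip above can be sketched as follows, again with sqlite3 and made-up rows. The page_name column and the sample visitor IDs are assumptions for illustration; swap in whichever columns describe the action in your own table.

```python
import sqlite3

# Hypothetical hits: (post_visid_high, post_visid_low, visit_num,
# visit_page_num, page_name), deliberately inserted out of order.
HITS = [
    ("1", "23", 2, 1, "home"),
    ("1", "23", 1, 2, "search results"),
    ("1", "23", 1, 1, "home"),
    ("1", "23", 1, 3, "product detail"),
    ("9", "99", 1, 1, "home"),  # a different visitor
]


def visitor_journey(rows, visid_high, visid_low):
    """Return one visitor's hits ordered by visit, then by hit within the visit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_feeds (post_visid_high TEXT, post_visid_low TEXT, "
        "visit_num INTEGER, visit_page_num INTEGER, page_name TEXT)"
    )
    conn.executemany("INSERT INTO data_feeds VALUES (?, ?, ?, ?, ?)", rows)
    journey = conn.execute(
        "SELECT visit_num, visit_page_num, page_name FROM data_feeds "
        "WHERE post_visid_high = ? AND post_visid_low = ? "
        "ORDER BY visit_num, visit_page_num",
        (visid_high, visid_low),
    ).fetchall()
    conn.close()
    return journey


journey = visitor_journey(HITS, "1", "23")
for visit_num, hit_num, page in journey:
    print(visit_num, hit_num, page)
```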
eVars and props
Each eVar and prop has a dedicated column in the Data Feed, whether or not it is enabled or has data populated. There are also separate columns for the pre-processed and post-processed values. The post-processed column has eVar attribution and expiration logic applied.
Refer to this flowchart to understand when processing happens: https://experienceleague.adobe.com/docs/analytics/technotes/processing-order.html?lang=en
In total there will be 500 columns of eVars (i.e. 2 sets of 250 possible eVars that can be created for an implementation) and 150 columns of props (i.e. 2 sets of 75 possible props that can be set up).
TIP: Ever wish the eVar you captured had the properties of a prop? The non-post eVar columns behave like a prop. Because no eVar attribution or persistence logic is applied to them, you can use the non-post columns to filter for analytics hits where that eVar was actually set on the hit, exactly like a prop!
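A small illustration of the difference, using made-up rows: the eVar is set on the first hit only, and the post column carries the persisted value onto later hits. This sketch assumes unset values arrive as empty strings; in your database they may be NULL instead, in which case the filter would need adjusting.

```python
import sqlite3

# Hypothetical hits of one visit: (visit_page_num, evar1, post_evar1).
# evar1 is set on hit 1 only; post_evar1 persists for the rest of the visit.
HITS = [
    (1, "promo-a", "promo-a"),
    (2, "", "promo-a"),
    (3, "", "promo-a"),
]


def count_hits_with_value(rows, column):
    """Count hits where the chosen eVar column is populated."""
    if column not in ("evar1", "post_evar1"):
        raise ValueError(f"unexpected column: {column}")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_feeds (visit_page_num INTEGER, evar1 TEXT, post_evar1 TEXT)"
    )
    conn.executemany("INSERT INTO data_feeds VALUES (?, ?, ?)", rows)
    # The column name is checked against a fixed list above, so it is safe to
    # interpolate into the query text here.
    (count,) = conn.execute(
        f"SELECT COUNT(*) FROM data_feeds WHERE {column} <> ''"
    ).fetchone()
    conn.close()
    return count


prop_like = count_hits_with_value(HITS, "evar1")       # hits where the eVar was set
persisted = count_hits_with_value(HITS, "post_evar1")  # hits credited via persistence
print(prop_like, persisted)  # prints: 1 3
```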
Segments and Calculated Metrics
Segments, custom calculated metrics, and Adobe’s out-of-the-box metrics are not included in the Data Feed and need to be defined in your own queries. For OOTB metrics and segments, refer to Adobe’s technical documentation.
The logic to replicate some common metrics is found here: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-calculate
In the next section, sample code is provided to replicate Adobe’s OOTB bounces metric as an example. An upcoming Experience League post will provide additional examples of how segments and metrics can be recreated with Data Feeds.
Querying Data Feeds
The Basics
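Our first query will count visits for a specific day while excluding any unnecessary hits.
SELECT
    COUNT(DISTINCT CONCAT(POST_VISID_HIGH, '-', POST_VISID_LOW, '-', VISIT_NUM)) AS VISITS
FROM DATA_FEEDS
WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    -- Placeholder date; DATE_TIME is the hit timestamp, and the cast syntax may differ in your database
    AND CAST(DATE_TIME AS DATE) = '2024-01-15'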
Next, we’ll count visits across their entry pages.
WITH MIN_VISIT_PAGE_NUM_TABLE AS (
    -- First hit number of each visit
    SELECT
        POST_VISID_HIGH,
        POST_VISID_LOW,
        VISIT_NUM,
        MIN(VISIT_PAGE_NUM) AS MIN_VISIT_PAGE_NUM
    FROM DATA_FEEDS
    WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    GROUP BY 1, 2, 3
)
SELECT
    A.PAGE_NAME,
    -- Build the visit ID inline; the raw feed has no VISIT_ID column
    COUNT(DISTINCT CONCAT(A.POST_VISID_HIGH, '-', A.POST_VISID_LOW, '-', A.VISIT_NUM)) AS ENTRIES
FROM DATA_FEEDS A
-- INNER JOIN keeps only the first hit of each visit; a LEFT JOIN would keep every hit
INNER JOIN MIN_VISIT_PAGE_NUM_TABLE B
    ON A.POST_VISID_HIGH = B.POST_VISID_HIGH
    AND A.POST_VISID_LOW = B.POST_VISID_LOW
    AND A.VISIT_NUM = B.VISIT_NUM
    AND A.VISIT_PAGE_NUM = B.MIN_VISIT_PAGE_NUM
WHERE A.HIT_SOURCE = '1' AND A.EXCLUDE_HIT = '0'
GROUP BY 1
TIP: To replicate the above for exits, find the MAX visit_page_num instead of min.
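As a rough sketch of the exit variant, the snippet below runs the MAX-based version on a few made-up hits with sqlite3. The table layout and page names are hypothetical, and the hit_source and exclude_hit filters are left out only because the sample table does not carry those columns.

```python
import sqlite3

# Hypothetical hits: (post_visid_high, post_visid_low, visit_num,
# visit_page_num, page_name)
HITS = [
    ("1", "23", 1, 1, "home"),
    ("1", "23", 1, 2, "search"),
    ("1", "23", 1, 3, "checkout"),
    ("4", "56", 1, 1, "home"),
    ("4", "56", 1, 2, "search"),
]

EXIT_QUERY = """
WITH max_visit_page_num_table AS (
    -- Last hit number of each visit
    SELECT post_visid_high, post_visid_low, visit_num,
           MAX(visit_page_num) AS max_visit_page_num
    FROM data_feeds
    GROUP BY post_visid_high, post_visid_low, visit_num
)
SELECT a.page_name,
       COUNT(DISTINCT a.post_visid_high || '-' || a.post_visid_low
             || '-' || a.visit_num) AS exits
FROM data_feeds a
INNER JOIN max_visit_page_num_table b
    ON a.post_visid_high = b.post_visid_high
   AND a.post_visid_low = b.post_visid_low
   AND a.visit_num = b.visit_num
   AND a.visit_page_num = b.max_visit_page_num
GROUP BY a.page_name
ORDER BY a.page_name
"""


def exit_pages(rows):
    """Return (page_name, exits) pairs for the sample hits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_feeds (post_visid_high TEXT, post_visid_low TEXT, "
        "visit_num INTEGER, visit_page_num INTEGER, page_name TEXT)"
    )
    conn.executemany("INSERT INTO data_feeds VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(EXIT_QUERY).fetchall()
    conn.close()
    return result


exits = exit_pages(HITS)
print(exits)  # prints: [('checkout', 1), ('search', 1)]
```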
Finally, we’ll replicate the OOTB bounces metric.
WITH MAX_VISIT_PAGE_NUM_TABLE AS (
    -- Last hit number of each visit
    SELECT
        POST_VISID_HIGH,
        POST_VISID_LOW,
        VISIT_NUM,
        MAX(VISIT_PAGE_NUM) AS MAX_VISIT_PAGE_NUM
    FROM DATA_FEEDS
    WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    GROUP BY 1, 2, 3
),
MIN_VISIT_PAGE_NUM_TABLE AS (
    -- First hit number of each visit
    SELECT
        POST_VISID_HIGH,
        POST_VISID_LOW,
        VISIT_NUM,
        MIN(VISIT_PAGE_NUM) AS MIN_VISIT_PAGE_NUM
    FROM DATA_FEEDS
    WHERE HIT_SOURCE = '1' AND EXCLUDE_HIT = '0'
    GROUP BY 1, 2, 3
)
-- A bounce is a visit whose first and last hit are the same hit
SELECT
    COUNT(DISTINCT CONCAT(A.POST_VISID_HIGH, '-', A.POST_VISID_LOW, '-', A.VISIT_NUM)) AS BOUNCES
FROM MAX_VISIT_PAGE_NUM_TABLE A
INNER JOIN MIN_VISIT_PAGE_NUM_TABLE B
    ON A.POST_VISID_HIGH = B.POST_VISID_HIGH
    AND A.POST_VISID_LOW = B.POST_VISID_LOW
    AND A.VISIT_NUM = B.VISIT_NUM
WHERE A.MAX_VISIT_PAGE_NUM = B.MIN_VISIT_PAGE_NUM
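The same bounce definition (a visit whose first and last hit are the same) can also be expressed in a single pass with GROUP BY and HAVING. This is an alternative sketch rather than the query above, run on made-up rows with sqlite3; the sample table omits hit_source and exclude_hit for brevity.

```python
import sqlite3

# Hypothetical hits: (post_visid_high, post_visid_low, visit_num, visit_page_num)
HITS = [
    ("1", "23", 1, 1),
    ("1", "23", 1, 2),  # two hits in this visit: not a bounce
    ("4", "56", 1, 1),  # single-hit visit: bounce
    ("7", "89", 3, 3),  # single-hit visit that does not start at 1: bounce
]

BOUNCE_QUERY = """
SELECT COUNT(*) FROM (
    SELECT post_visid_high, post_visid_low, visit_num
    FROM data_feeds
    GROUP BY post_visid_high, post_visid_low, visit_num
    -- First and last hit number match, so the visit has exactly one hit
    HAVING MIN(visit_page_num) = MAX(visit_page_num)
)
"""


def count_bounces(rows):
    """Count visits whose minimum and maximum visit_page_num are equal."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_feeds (post_visid_high TEXT, post_visid_low TEXT, "
        "visit_num INTEGER, visit_page_num INTEGER)"
    )
    conn.executemany("INSERT INTO data_feeds VALUES (?, ?, ?, ?)", rows)
    (bounces,) = conn.execute(BOUNCE_QUERY).fetchone()
    conn.close()
    return bounces


bounces = count_bounces(HITS)
print(bounces)  # prints: 2
```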
Data Feed Strategy
Build Aggregated Tables
It can be beneficial to build aggregated views of the raw Data Feed. Benefits include:
- Easier for end users: Your average Data Scientist who would like to leverage this data won’t need to worry about which merchandising eVar to use, or how to parse the event list.
- Reduced querying load: The raw Data Feed can be terabytes in size. Your Data Engineering team will thank you for reducing queries against the raw dataset and pointing end users to smaller, more manageable tables.
The types of aggregated tables needed will be dependent on each organization and your stakeholders, but here are some examples to get you started:
- User Details Table: Contains all hit and page data (e.g. pagename, last touch channel, mobile device)
- Key Events Table: Contains a parsed product_list and all the events that occurred in that hit
- Search Terms Table: Contains all search terms made and search-related metrics
Keep in mind how the end user will need to join these tables to get their desired results. For example, it may be easiest to have each aggregated table include post_visid_high, post_visid_low, visit_num, and visit_page_num so you can join across any level of granularity.
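As one possible starting point for a Key Events table, the sketch below splits a comma-separated event list into one row per event, keyed by the four identifier columns so it can be joined at any level. The EVENT_LOOKUP mapping is only an illustrative stand-in for the event lookup delivered with your feed, and the "id=value" format for events that carry a value is an assumption to verify against your own data.

```python
# Illustrative stand-in for the event lookup delivered with the feed;
# replace it with the mapping from your own feed.
EVENT_LOOKUP = {"1": "purchase", "12": "scAdd", "200": "event1"}


def parse_event_list(event_list):
    """Split a comma-separated event list into (event_name, value) pairs.

    Assumes entries look like "12" or "200=3.5"; an entry without "="
    is treated as a counter with value 1.
    """
    parsed = []
    for entry in event_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        event_id, _, raw_value = entry.partition("=")
        name = EVENT_LOOKUP.get(event_id, f"unknown_{event_id}")
        value = float(raw_value) if raw_value else 1.0
        parsed.append((name, value))
    return parsed


def build_key_events(hits):
    """Return one row per (hit, event), keyed by the four ID columns."""
    rows = []
    for hit in hits:
        key = (
            hit["post_visid_high"],
            hit["post_visid_low"],
            hit["visit_num"],
            hit["visit_page_num"],
        )
        for name, value in parse_event_list(hit["post_event_list"]):
            rows.append(key + (name, value))
    return rows


# Hypothetical hits for illustration
sample_hits = [
    {"post_visid_high": "1", "post_visid_low": "23", "visit_num": 1,
     "visit_page_num": 4, "post_event_list": "12,200=3.5"},
    {"post_visid_high": "1", "post_visid_low": "23", "visit_num": 1,
     "visit_page_num": 5, "post_event_list": ""},
]
key_events = build_key_events(sample_hits)
for row in key_events:
    print(row)
```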
Global Filters
Some organizations apply global filters to all reporting (e.g. to exclude bots or remove fraudulent traffic). Consider creating an aggregated table that replicates this filter and joining it to any query against the raw Data Feed.
Having a centralized table eliminates the need to maintain this filter logic as it changes over time.
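A minimal sketch of the idea, assuming the filter is stored as a table of visitor IDs that pass it. The table name global_filter_visitors and the sample rows are hypothetical; your filter may just as well be keyed by visit or hit.

```python
import sqlite3

# Hypothetical hits: (post_visid_high, post_visid_low, visit_num, visit_page_num)
HITS = [
    ("1", "23", 1, 1),
    ("1", "23", 2, 1),
    ("7", "77", 1, 1),  # bot traffic, absent from the filter table
]
# Hypothetical centralized filter: visitors that pass the global filter
ALLOWED_VISITORS = [("1", "23")]

FILTERED_VISITS_QUERY = """
SELECT COUNT(DISTINCT a.post_visid_high || '-' || a.post_visid_low
             || '-' || a.visit_num)
FROM data_feeds a
-- INNER JOIN drops every hit whose visitor is not in the filter table
INNER JOIN global_filter_visitors f
    ON a.post_visid_high = f.post_visid_high
   AND a.post_visid_low = f.post_visid_low
"""


def filtered_visits(hits, allowed):
    """Count visits after joining the raw hits to the global filter table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_feeds (post_visid_high TEXT, post_visid_low TEXT, "
        "visit_num INTEGER, visit_page_num INTEGER)"
    )
    conn.execute(
        "CREATE TABLE global_filter_visitors (post_visid_high TEXT, post_visid_low TEXT)"
    )
    conn.executemany("INSERT INTO data_feeds VALUES (?, ?, ?, ?)", hits)
    conn.executemany("INSERT INTO global_filter_visitors VALUES (?, ?)", allowed)
    (visits,) = conn.execute(FILTERED_VISITS_QUERY).fetchone()
    conn.close()
    return visits


visits_after_filter = filtered_visits(HITS, ALLOWED_VISITORS)
print(visits_after_filter)  # prints: 2
```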
Discrepancy Monitoring with Workspace
At the start of your Data Feed journey, run a quick count of visits by day and make sure each day lines up with Workspace. Even with the Data Feed enabled, your internal Data Engineering pipeline may have failed to process some files, or Adobe may have skipped sending them.
Regardless, it’s handy to set up regular validations that Workspace continues to line up with your Data Feeds and any aggregated tables.
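One way such a validation could look is sketched below: daily visit counts from the Data Feed are compared against the same counts exported from Workspace, flagging missing days and days that differ by more than a tolerance. The function name, the 1% tolerance, and the sample numbers are all hypothetical; choose a threshold that suits your own data.

```python
def find_discrepancies(feed_visits, workspace_visits, tolerance=0.01):
    """Compare daily visit counts and return a list of (day, issue) pairs.

    feed_visits / workspace_visits: dicts mapping a date string to a visit count.
    tolerance: largest relative difference accepted before a day is flagged.
    """
    issues = []
    for day in sorted(workspace_visits):
        expected = workspace_visits[day]
        actual = feed_visits.get(day)
        if actual is None:
            # No rows at all for this day: files may not have been delivered or loaded
            issues.append((day, "missing from Data Feed"))
        elif expected and abs(actual - expected) / expected > tolerance:
            issues.append((day, f"off by {actual - expected:+d} visits"))
    return issues


# Hypothetical daily counts for illustration
workspace = {"2024-01-01": 1000, "2024-01-02": 1200, "2024-01-03": 900}
feed = {"2024-01-01": 1003, "2024-01-02": 1100}

issues = find_discrepancies(feed, workspace)
for day, issue in issues:
    print(day, issue)
```

With these sample numbers, the first day passes (a 0.3% difference), the second is flagged, and the third is reported as missing.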
Conclusion
Wrangling your Adobe Data Feeds can feel like a daunting project. But once you’ve got a handle on them, there are endless possibilities to customize your data and serve your specific use cases.
This article sets the foundation for understanding these Feeds and just scratches the surface. Stay tuned for more Experience League articles to help you dive deeper into this rich data!
Additional Resources
- How to set up Data Feeds: https://experienceleague.adobe.com/docs/analytics/export/analytics-data-feed/data-feed-overview.html?lang=en
- Overview of what files are included in the Data Feed: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-contents
- Adobe Data Feed Columns: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-reference
- Adobe Data Processing Flow Chart: https://experienceleague.adobe.com/docs/analytics/technotes/processing-order.html?lang=en
- How to replicate common calculated metrics with Data Feeds: https://experienceleague.adobe.com/en/docs/analytics/export/analytics-data-feed/data-feed-contents/datafeeds-calculate