
When data disappears between AEP and a CJA report, it rarely vanishes at random — it gets dropped, transformed, or misconfigured at one of five distinct pipeline stages. This guide walks practitioners through each stage, the failure modes to watch for, and the diagnostic tools to find and fix missing data.

You built the connection. You configured the Data View. You opened Analysis Workspace — and the numbers are wrong. A dimension is blank. A metric is lower than it should be. You know the data is there. So where did it go?

This is one of the most common and frustrating experiences for Adobe Experience Platform and Customer Journey Analytics practitioners. The pipeline from data source to Customer Journey Analytics report passes through multiple stages, and data can quietly disappear — or be silently transformed — at any one of them.

This article walks you through each stage of the Adobe Experience Platform → Customer Journey Analytics pipeline, explains what can go wrong at each layer, and gives you the diagnostic questions and actions to find your data and fix your implementation. Think of it as a practitioner’s field guide for data forensics.

When data goes missing in Customer Journey Analytics, it rarely vanishes at random. Each pipeline stage leaves clues. Your job is to know where to look.

The Adobe Experience Platform → Customer Journey Analytics pipeline at a glance

Before diagnosing a problem, it helps to hold the full architecture in your head. Data travels through five distinct stages before appearing in a Customer Journey Analytics Workspace report: Source, Ingestion, Transformation, Connection, and Data View.

At each stage, data can be filtered, dropped, mismatched, or misconfigured. The good news: every stage leaves diagnostic signals. Let’s go through them one by one.

Stage 1: Source

The source is everything upstream of Adobe Experience Platform: your web or mobile SDKs, CRM, marketing automation platform (MAP), point-of-sale (POS) software, ad platforms, call center records, and so on. Adobe Experience Platform comes with data connectors to most major platforms, which you can find in the Source Catalog.


Source of truth

Data originating from the source system is considered the authoritative “source of truth” for any downstream analytics or reporting.

By referencing the raw payloads and original records, you can make sure that what appears in Customer Journey Analytics Workspace reports accurately reflects all key events and attributes as captured at the source. This practice allows you to validate report accuracy and trace discrepancies back to their origin, helping maintain data integrity throughout the architecture.

Stage 2: Ingestion

Once you know the source data you need for reporting, it’s time to set up dataflows that ingest the data into the data lake as queryable datasets.

How ingestion works

Adobe Experience Platform supports two ingestion modes: batch and streaming.


Ingestion errors are the most common cause of missing records — and they are frequently overlooked because they don’t surface as loud alerts.

Adobe Experience Platform validates every record against its XDM schema at ingestion time. Records that don’t conform are either rejected or have offending fields nullified and silently dropped. This means your dataset may contain far fewer records than your source system — without any obvious error message at the dataset level.
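The effect can be pictured with a toy validator. This is a minimal sketch, not Adobe's validation logic: the `SCHEMA` dictionary, the field names, and the `validate` helper are all hypothetical. It only illustrates how a dataset can end up with fewer records than the source, or with the same records but nulled fields, without any error being raised.

```python
# Toy illustration only: a hypothetical schema and validator, not real XDM validation.
SCHEMA = {"person_id": str, "order_total": float, "timestamp": str}
REQUIRED = {"person_id", "timestamp"}

def validate(record):
    """Return (cleaned_record, nullified_fields), or (None, []) if the record is rejected."""
    cleaned, nullified = {}, []
    for field, expected_type in SCHEMA.items():
        value = record.get(field)
        if value is None or not isinstance(value, expected_type):
            if field in REQUIRED:
                return None, []          # missing or invalid required field: reject
            if value is not None:
                nullified.append(field)  # wrong type on an optional field: null it out
            cleaned[field] = None
        else:
            cleaned[field] = value
    return cleaned, nullified

source = [
    {"person_id": "a1", "order_total": 19.99, "timestamp": "2024-05-01T10:00:00Z"},
    {"person_id": "a2", "order_total": "19.99", "timestamp": "2024-05-01T10:05:00Z"},
    {"order_total": 5.0, "timestamp": "2024-05-01T10:07:00Z"},
]

results = [validate(r) for r in source]
ingested = [rec for rec, _ in results if rec is not None]
altered = [rec for rec, nulled in results if rec is not None and nulled]

print(len(source), len(ingested), len(altered))  # 3 2 1
```

Three records go in, two come out, and one of those two has lost its `order_total`. Nothing in the run signals a failure.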

Things to look for

Ingestion errors in the dataset UI

Navigate to the Datasets section in the Adobe Experience Platform UI and click into your dataset. Scroll down to the batch list. Each batch shows a status (Success, Failed, or Partial) and a record count. If you see failures:

TIP
DCVS and MAPPER errors are deceptively quiet. Your batch may show “Success” while thousands of individual records were skipped or altered. Always review the record counts in the batch detail, not just the top-level status.
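A record-count check of this kind is easy to script. The sketch below assumes you have already collected per-batch counts into plain dictionaries; the field names (`records_read`, `records_ingested`) and the `quiet_losses` helper are hypothetical and do not represent an Adobe API response.

```python
# Hypothetical batch summaries; field names are assumptions, not an Adobe API response.
batches = [
    {"id": "batch-001", "status": "success", "records_read": 10000, "records_ingested": 10000},
    {"id": "batch-002", "status": "success", "records_read": 10000, "records_ingested": 7400},
    {"id": "batch-003", "status": "failed", "records_read": 5000, "records_ingested": 0},
]

def quiet_losses(batches, tolerance=0.0):
    """Return (batch id, loss ratio) for batches that report success but lost records."""
    flagged = []
    for b in batches:
        if b["status"] != "success" or b["records_read"] == 0:
            continue  # loud failures are handled elsewhere; skip empty batches
        loss = 1 - b["records_ingested"] / b["records_read"]
        if loss > tolerance:
            flagged.append((b["id"], round(loss, 4)))
    return flagged

print(quiet_losses(batches))  # [('batch-002', 0.26)]
```

The failed batch is deliberately ignored here because it is already visible in the UI. The point is to surface the batch that looks healthy but is not.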

Calculated fields in Data Prep

Adobe Experience Platform Data Prep is the mapping and transformation layer applied during ingestion. When you configure a source connector, you define field mappings from your source schema to your XDM target schema. Data Prep also supports calculated fields — inline transformations applied during ingestion, such as string concatenation, date format conversion, or conditional logic.

Calculated fields are powerful but fragile. If a transformation function encounters unexpected input (a null value, a malformed string, a type mismatch), the resulting attribute is set to NULL rather than throwing an error — and the record is still ingested with a missing field.
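The following sketch mimics that behavior in Python rather than in Data Prep's own function syntax. The `to_iso_date` helper and the sample values are hypothetical; the point is that bad input produces an empty field instead of an error, so the only visible symptom is a rising null rate.

```python
from datetime import datetime

def to_iso_date(raw):
    """Hypothetical calculated field: convert MM/DD/YYYY to YYYY-MM-DD, or None on bad input."""
    try:
        return datetime.strptime(raw, "%m/%d/%Y").date().isoformat()
    except (TypeError, ValueError):
        return None  # no error is raised; the field is simply left empty

source_values = ["05/01/2024", "2024-05-02", None, "13/40/2024", "12/31/2023"]
mapped = [to_iso_date(v) for v in source_values]
null_rate = mapped.count(None) / len(mapped)

print(mapped)     # ['2024-05-01', None, None, None, '2023-12-31']
print(null_rate)  # 0.6
```

Only one of the five inputs was null at the source, yet three of the five outputs are null after the transformation.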

The ANALYZE TABLE query in dataset statistics

One of the most underused diagnostic tools in Adobe Experience Platform is the Query Service “ANALYZE TABLE” command (also accessible as Dataset Statistics in the Query Service UI). Running this against a dataset gives you column-level statistics: record counts, null rates, distinct value counts, and min/max values.


This is invaluable for post-ingestion validation:

TIP
Make ANALYZE TABLE part of your post-ingestion validation runbook. Run it after major data loads and after any Data Prep mapping changes. Catching data quality issues here is far easier than troubleshooting them in Customer Journey Analytics reports.
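To make the column-level statistics concrete, here is a small local approximation of the kind of numbers to look at. It is not the command's output format. The `column_stats` helper and the sample rows are hypothetical, and in practice the statistics would come from Query Service rather than from Python.

```python
def column_stats(rows, column):
    """Compute count, null rate, distinct count, min, and max for one column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "null_rate": (len(values) - len(present)) / len(values) if values else 0.0,
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }

rows = [
    {"person_id": "a1", "order_total": 19.99},
    {"person_id": "a2", "order_total": None},
    {"person_id": "a1", "order_total": 5.0},
    {"person_id": None, "order_total": 42.5},
]

stats = {col: column_stats(rows, col) for col in ("person_id", "order_total")}
for col, s in stats.items():
    print(col, s)
```

A null rate that jumps after a mapping change, or a distinct count far below what the source system reports, is usually the first sign that something was lost during ingestion.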

Stage 3: Transformation

Transformation refers to work done on your data inside Adobe Experience Platform after ingestion — reshaping, enriching, aggregating, or deriving new datasets before they flow into Customer Journey Analytics. This stage is where many B2B teams and advanced implementations do a lot of work, and it’s also where subtle data modeling decisions can create unexpected downstream effects.

Transformation use cases

Customer Journey Analytics B2B Edition

If your organization uses Customer Journey Analytics B2B Edition, you may need to transform data in Adobe Experience Platform due to its B2B model with accounts, opportunities, buying groups, and person-level events. At a minimum, you need to stitch account IDs onto all profile and event datasets you intend to use.

Manual stitching

If your team is not on Graph-Based Stitching, you may need to stitch the Person ID onto all profile and event datasets yourself using SQL queries.
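In practice this is a SQL join, but the logic is simple enough to sketch in Python. The identity map, the field names, and the fallback to the device ID are all assumptions made for illustration, not a description of how any Adobe stitching feature works.

```python
# Hypothetical identity map: device ID -> known Person ID.
identity_map = {"dev-1": "crm-100", "dev-2": "crm-200"}

events = [
    {"device_id": "dev-1", "event": "page_view"},
    {"device_id": "dev-2", "event": "purchase"},
    {"device_id": "dev-9", "event": "page_view"},
]

def stitch(events, identity_map):
    """Add a person_id to each event, falling back to the device ID when unknown."""
    stitched = []
    for e in events:
        person_id = identity_map.get(e["device_id"], e["device_id"])
        stitched.append({**e, "person_id": person_id})
    return stitched

stitched = stitch(events, identity_map)
unstitched = [e for e in stitched if e["person_id"] == e["device_id"]]

print([e["person_id"] for e in stitched])  # ['crm-100', 'crm-200', 'dev-9']
print(len(unstitched))                     # 1
```

Tracking the count of unstitched events over time is a cheap way to notice when the identity map has stopped keeping up with new devices.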

Record-level filtering

In some cases, you don’t need all dataset records inside of Customer Journey Analytics. This could be due to BU or department isolation, governance requirements, brand-specific reporting needs, etc.

Calculated fields

You may need to create calculated fields using CASE statement logic. Marketing Channels, EmailHash, and Custom Unique Identifiers are examples of these.
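As an illustration of that kind of top-to-bottom rule logic, here is a minimal Python sketch of a marketing channel classifier. The rules, the parameter names, and the channel labels are hypothetical; a real implementation would typically be a SQL CASE expression with your organization's own definitions.

```python
def marketing_channel(utm_medium, referrer):
    """Hypothetical channel rules, evaluated top to bottom like a SQL CASE expression."""
    medium = (utm_medium or "").strip().lower()
    ref = (referrer or "").strip().lower()
    if medium in ("cpc", "ppc"):
        return "Paid Search"
    if medium == "email":
        return "Email"
    if "google." in ref or "bing." in ref:
        return "Organic Search"
    if ref:
        return "Referral"
    return "Direct"  # the final fallback, like ELSE in a CASE expression

print(marketing_channel("CPC ", None))                       # Paid Search
print(marketing_channel(None, "https://www.google.com/"))    # Organic Search
print(marketing_channel(None, None))                         # Direct
```

Note the explicit normalization of nulls, casing, and whitespace before any rule runs. Without it, a value like `"CPC "` would fall through every branch and land in the fallback.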

Data Distiller

The majority of Adobe Experience Platform data transformation happens with Data Distiller, allowing organizations to clean, enrich, aggregate, and model data directly within the platform before it is activated or analyzed in downstream applications like Real-Time CDP and Customer Journey Analytics.

Scheduled Data Distiller templates

Adobe Data Distiller is the licensed add-on to Adobe Experience Platform Query Service that enables scheduled, persistent SQL-based transformations. Where Data Prep handles transformation at ingestion time, Data Distiller handles transformation after data is in the lake — creating derived datasets, enriching profiles, building aggregated tables, and preparing curated datasets specifically for Customer Journey Analytics.


Scheduled Data Distiller queries are powerful but require ongoing maintenance. Common failure modes include:

If Customer Journey Analytics data looks complete for some time periods but empty for others, especially when the data comes from a derived dataset, a failed or stale Data Distiller schedule or faulty SQL logic is often the culprit.

TIP
Treat your Data Distiller scheduled queries like production pipelines. Monitor their run history, set up alerting where possible, and document the upstream dependencies of each query. A transformation that worked last month can silently break when a schema field changes.
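One simple form of monitoring is a gap check on the run history. The sketch below assumes a daily schedule and a list of dates on which the query completed successfully; how you obtain that list is left open, and the `missing_days` helper is hypothetical.

```python
from datetime import date, timedelta

def missing_days(successful_runs, start, end):
    """Return every date in [start, end] that has no successful run."""
    done = set(successful_runs)
    day, gaps = start, []
    while day <= end:
        if day not in done:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps

# Hypothetical run history for a daily scheduled query.
runs = [date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 5)]
gaps = missing_days(runs, date(2024, 5, 1), date(2024, 5, 6))

print([d.isoformat() for d in gaps])  # ['2024-05-03', '2024-05-04', '2024-05-06']
```

The dates returned here are the same dates you would expect to look empty in a report built on the derived dataset.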

Stage 4: Connection

The Customer Journey Analytics Connection is where Adobe Experience Platform datasets are assembled into a unified data model for analysis. The Connection defines which datasets are included, how they are typed (event, profile, or lookup), which field serves as the Person ID, and how data is stitched across datasets. This stage is where some of the most counterintuitive data loss can occur.

The “inner join” behavior

This is one of the most important concepts to understand about Customer Journey Analytics Connections, and the one that most frequently surprises practitioners.

When a profile dataset and an event dataset share the same Person ID field, Customer Journey Analytics applies what is effectively an inner join: only Person IDs that appear in both the event dataset AND the profile dataset are counted as persons in reports. If a Person ID exists in your profile dataset but has no corresponding events in your event dataset, that person will not appear in Customer Journey Analytics reports at all.


To illustrate: if the profile dataset in your Connection contains three Person IDs (1, 2, and 3), but Person ID 3 has no events in the event dataset, Customer Journey Analytics will count only 2 persons. Person ID 3’s profile attributes exist in Adobe Experience Platform, and you can see them in dataset preview, but they will not surface in Analysis Workspace.

This behavior is by design, not a bug. Customer Journey Analytics is an event-driven analytics platform. Profile data enriches events — it does not surface independently without an associated event. The practical implication: if you’re expecting to report on “all customers in your CRM,” including those with zero activity, Customer Journey Analytics is not the right tool for that query without engineering a synthetic event for each person.
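The three-person example can be reproduced in a few lines. This is a simplified model of the behavior described above, not Customer Journey Analytics's actual engine; the dataset contents and variable names are made up for illustration.

```python
# Profile dataset: three persons. Event dataset: events for persons 1 and 2 only.
profiles = {1: {"tier": "gold"}, 2: {"tier": "silver"}, 3: {"tier": "gold"}}
events = [
    {"person_id": 1, "event": "page_view"},
    {"person_id": 2, "event": "purchase"},
    {"person_id": 1, "event": "purchase"},
]

# Reporting starts from events; profile attributes are looked up per event.
enriched = [{**e, **profiles.get(e["person_id"], {})} for e in events]
reported_persons = {e["person_id"] for e in enriched}
missing_from_reports = set(profiles) - reported_persons

print(len(reported_persons))   # 2
print(missing_from_reports)    # {3}
```

Because the starting point is the event list, Person ID 3 never enters the result, no matter how complete its profile record is.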

TIP
If a stakeholder reports that “Customer Journey Analytics is missing customers,” the first question to ask is whether those customers have any events in the event dataset for the reporting date range. They may exist in Adobe Experience Platform — they just have no events, so Customer Journey Analytics won’t surface them.

Other things to look for

Consistent Person ID across profile and event datasets

The Person ID field in your Connection must use the same identity namespace and the same value format across all datasets. This sounds obvious, but in practice it breaks frequently.
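A quick way to test for a format mismatch is to measure how many event Person IDs find a match in the profile dataset, before and after normalization. The sketch below is illustrative only: the sample IDs are invented, and casing and trailing whitespace are assumed examples of mismatches, not a list taken from Adobe documentation.

```python
def overlap_rate(event_ids, profile_ids, normalize=lambda x: x):
    """Share of distinct event Person IDs that also appear in the profile dataset."""
    ev = {normalize(i) for i in event_ids}
    pr = {normalize(i) for i in profile_ids}
    return len(ev & pr) / len(ev) if ev else 0.0

# Hypothetical IDs exported from an event dataset and a profile dataset.
event_ids = ["ABC123", "def456 ", "ghi789"]
profile_ids = ["abc123", "def456", "zzz000"]

raw = overlap_rate(event_ids, profile_ids)
cleaned = overlap_rate(event_ids, profile_ids, normalize=lambda x: x.strip().lower())

print(raw)                 # 0.0
print(round(cleaned, 2))   # 0.67
```

A large gap between the raw and cleaned rates suggests the IDs refer to the same people but are formatted differently, which is best fixed upstream in Data Prep or in a transformation query rather than in the Connection.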

Know your data model

Customer Journey Analytics supports three dataset types in a Connection: event, profile, and lookup. Understanding the differences is critical:

Putting data in the wrong dataset type is a common mistake. CRM enrichment data belongs in a profile dataset. Campaign metadata belongs in a lookup dataset. Putting lookup data in a profile dataset (or vice versa) will produce unexpected join behavior.

Stage 5: Data View

The Data View is Customer Journey Analytics’s reporting configuration layer — the lens through which your Connection data is interpreted. It defines which fields are exposed as dimensions and metrics, how persistence is configured, what attribution models are applied, and which derived fields or calculated fields exist. If your data made it through ingestion, transformation, and the Connection correctly but still doesn’t appear in a report, the Data View is where to look next.

Things to look for

Did you include the component? The correct one?

The Data View does not automatically expose every field from your Connection. You must explicitly add dimensions and metrics as components. If a field doesn’t appear in Analysis Workspace, the most common reason is simply that it hasn’t been added to the Data View.
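If you keep an inventory of schema fields and Data View components, this check reduces to a set comparison. The sketch below assumes both lists have been exported by hand; the field paths and component names are illustrative, and no Adobe API is called.

```python
# Illustrative inventory: schema fields available in the Connection.
connection_fields = {
    "web.webPageDetails.name",
    "commerce.order.priceTotal",
    "marketing.trackingCode",
}

# Illustrative inventory: Data View component name -> schema field it is built on.
data_view_components = {
    "Page Name": "web.webPageDetails.name",
    "Revenue": "commerce.order.priceTotal",
    "Campaign": "marketing.campaignCode",
}

# Fields that exist in the Connection but have no component built on them.
not_exposed = connection_fields - set(data_view_components.values())

# Components that point at a field the Connection does not contain.
unknown_sources = {
    name: field
    for name, field in data_view_components.items()
    if field not in connection_fields
}

print(not_exposed)      # {'marketing.trackingCode'}
print(unknown_sources)  # {'Campaign': 'marketing.campaignCode'}
```

The first set answers "did you include the component?" and the second answers "the correct one?".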

Check:

Component configurations

Customer Journey Analytics Data View component settings are significantly more flexible than Adobe Analytics, and that flexibility introduces new ways for data to look wrong:

Derived fields and calculated field logic

Derived Fields are one of Customer Journey Analytics’s most powerful features — and one of the most common sources of unexpected data behavior. A derived field applies a rule-based transformation to raw schema data retroactively, without modifying the underlying dataset. This is incredibly useful for standardizing values, mapping codes to friendly labels, or building complex classification logic.

However, derived fields apply their logic at query time, which means:


When a derived field returns unexpected nulls or incorrect values, work backwards through the rule logic in the derived field builder. Test each condition individually using the preview function and compare against values you can verify in the underlying dataset via Query Service.

TIP
For complex derived field logic, validate the expected output using a Query Service query against the raw dataset first. Knowing exactly what values exist in the field before writing your derived field rules will save significant debugging time.
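The same idea can be rehearsed offline: list the distinct raw values, apply the intended rules, and see what falls through. The rules and sample values below are hypothetical, and the exact-match lookup is a stand-in for whatever logic your derived field actually uses.

```python
from collections import Counter

# Hypothetical raw values pulled from the underlying dataset.
raw_values = ["US", "us", "United States", "CA", None, "U.S.", "CA"]

# Hypothetical rules: exact-match mapping from raw code to friendly label.
rules = {"US": "United States", "CA": "Canada"}

def apply_rules(value):
    """Return the mapped label, or None when no rule matches."""
    return rules.get(value)

# Count every raw value that no rule handles.
unmatched = Counter(v for v in raw_values if apply_rules(v) is None)

for value, count in unmatched.items():
    print(repr(value), count)
```

Here four of the seven rows fall through the rules, including the null and three spelling variants of the same country. Each unmatched value is either a rule you still need to write or a data quality issue to push back upstream.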

Putting it together: a diagnostic framework

When data is missing or wrong in a Customer Journey Analytics report, work through the pipeline in order. Start at the source and move downstream. Jumping straight to the Data View when the issue is actually in ingestion wastes time.
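The discipline of checking stages in order can be written down as a tiny harness. This is a sketch of the idea only: the `first_failing_stage` helper is hypothetical, and each lambda stands in for a real validation such as a batch count comparison or a run-history gap check.

```python
STAGES = ["Source", "Ingestion", "Transformation", "Connection", "Data View"]

def first_failing_stage(checks):
    """Run checks in pipeline order and return the first stage whose check fails."""
    for stage in STAGES:
        check = checks.get(stage)
        if check is not None and not check():
            return stage
    return None  # every check passed

# Hypothetical checks; each would wrap a real validation in practice.
checks = {
    "Source": lambda: True,
    "Ingestion": lambda: True,
    "Transformation": lambda: False,
    "Connection": lambda: False,
    "Data View": lambda: True,
}

print(first_failing_stage(checks))  # Transformation
```

Stopping at the first failure matters: the Connection check also fails in this example, but that is likely a downstream symptom of the broken transformation rather than a separate problem.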


Conclusion

The Adobe Experience Platform → Customer Journey Analytics pipeline is powerful precisely because it is layered. Each stage — Source, Ingestion, Transformation, Connection, Data View — adds flexibility, but also adds a place where data can silently disappear or be unexpectedly altered.

The practitioners who debug fastest are the ones who resist the urge to assume. They don’t assume the source is correct. They don’t assume ingestion succeeded. They don’t assume the Connection is joined the way they think it is. They move through the pipeline stage by stage, using the diagnostic tools at each layer, until they find the gap.

Build that discipline into your team’s standard operating procedure — post-ingestion validation checklists, scheduled query monitoring, Data View QA before publishing to stakeholders — and you’ll spend far less time answering “where did my data go?”

The pipeline is only as strong as your understanding of each layer. Know the stages, know the failure modes, and you’ll find your data every time.