Best practices for setting up connections in CJA


Customer Journey Analytics (CJA) Connections are the key to unified, cross-channel insights in Adobe Experience Platform. This article shares best practices—from planning and identity stitching to schema alignment and maintenance—to help you build connections that grow with your organization. With a solid foundation, you’ll be ready to answer complex customer questions and unlock the full power of your data.

Customer Journey Analytics (CJA) connections are the foundation for unified cross-channel analysis in Adobe Experience Platform. Setting up a connection properly ensures that your various datasets work together seamlessly in the Analysis Workspace. In this blog, we explore best practices for planning and configuring CJA connections - from upfront planning and identity stitching to schema alignment and ongoing maintenance - all in a conversational, accessible way. Whether you’re new to CJA or a seasoned user, these tips help you get the most out of your Connections.

Planning your connection: Start with goals and data needs

Before configuring, pause to plan. What business questions are you trying to answer? Clear goals help determine which data is truly needed.

Identify required datasets: List out all the data sources you might integrate, like web analytics, mobile app data, CRM records, call center logs, etc. Then prioritize them. It’s often better to start with a few key datasets than to pull in everything at once. For example, you might begin with web and mobile app data this quarter, and plan to add CRM or support data later. This phased approach ensures that you focus on the most important data first and don’t overwhelm your team (or the system) with unnecessary complexity.

Check sandbox and access: Remember that CJA connections are sandbox-specific. Make sure the datasets that you need reside in the same Adobe Experience Platform sandbox, as a connection can only pull from one sandbox at a time. Also, ensure you have the right permissions (Product Admin in CJA and dataset access in AEP) to create connections. It’s wise to limit who can create or edit connections - treat it as an admin task for a core team.

Define the connection’s scope: Give your connection a clear name and description that reflects its purpose (for example, “Retail Web + Mobile Journey Data”). This helps everyone understand what’s included. In CJA’s Create Connection wizard, you are prompted to name and describe the connection - use that to document its goal. Only include datasets that serve that analytical goal. If a dataset doesn’t contribute to your use case, consider leaving it out to keep the connection efficient.

By planning with end goals in mind, you set a strong foundation. You know exactly which datasets and settings to configure, making the setup process in CJA much smoother.

Identity configuration: Choosing the right linking field

One of the most important steps in setting up a CJA connection is telling it how to recognize the same person across multiple datasets.

NOTE

CJA doesn’t do identity stitching—that happens in Adobe Experience Platform (AEP). Instead, it relies on stitched identities from AEP’s Identity Service. Your job is to make sure all datasets in the connection share a common ID (like a hashed email, CRM ID, or ECID) and that it’s properly declared as an identity in AEP with the correct namespace.

CJA can seamlessly link data across channels using the resolved identity map from AEP, enabling you to analyze cohesive customer journeys across web, mobile, offline, and more.

Pick a consistent ID: Choose an identifier that reliably appears in each dataset. For example, if both your website and mobile app collect a hashed email address for logged-in users, that can serve as the common ID. The key is to pick an identity that is populated, stable, and usable across all connected datasets. Without this, CJA can’t associate user behavior across sources.
Use the right namespace in AEP: In Adobe Experience Platform, make sure that your chosen identity field is properly marked as an identity field and assigned to the correct namespace (for example, Email, ECID, CRM_ID). This tells AEP how to resolve identities and allows CJA to interpret them correctly. Skipping this step is one of the most common reasons connections don’t behave as expected.
Set the primary ID in CJA connections: During connection setup in CJA, you are prompted to select a primary ID. This defines the grain at which customer journeys are stitched together, typically "Person" for B2C or "Account" for B2B. Choose the one that best fits your business model and reporting needs, and map it to the identity field you configured in AEP.
Avoid mismatched or missing IDs: If even one dataset lacks the configured primary ID, or contains it inconsistently, CJA won’t be able to link that dataset’s records with others. Those events are treated as anonymous, which limits their value in journey analysis. Always verify that your ID field is present, populated, and valid across all datasets before building the connection.

Think of your identity field like a backstage pass—everyone with the same badge gets recognized, no matter where they are. But if someone shows up without it (a dataset missing the primary ID), CJA won’t know who they are.

By aligning your primary ID across datasets, you let CJA do what it does best: analyze unified, identity-aware journeys. You don’t need to know the stitching logic; just make sure your datasets use the same ID “language” and that AEP is set up right. CJA handles the rest.
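To make the "present, populated, and valid" check concrete, here is a minimal Python sketch that measures how many rows in each dataset actually carry the chosen ID. It assumes you have exported a sample of each dataset as a list of dictionaries; the field name `hashed_email`, the sample rows, and the `id_coverage` helper are all hypothetical and not part of any Adobe API.

```python
from typing import Iterable, Mapping


def id_coverage(records: Iterable[Mapping[str, object]], id_field: str) -> float:
    """Return the fraction of records whose id_field is present and non-empty."""
    total = 0
    populated = 0
    for record in records:
        total += 1
        value = record.get(id_field)
        if value is not None and str(value).strip() != "":
            populated += 1
    return populated / total if total else 0.0


# Hypothetical sample rows standing in for exported dataset extracts.
web_events = [
    {"hashed_email": "a1b2", "page": "/home"},
    {"hashed_email": "", "page": "/cart"},  # ID field present but empty
]
app_events = [
    {"hashed_email": "a1b2", "screen": "checkout"},
    {"screen": "settings"},  # ID field missing entirely
]

for name, rows in {"web": web_events, "app": app_events}.items():
    coverage = id_coverage(rows, "hashed_email")
    print(f"{name}: {coverage:.0%} of rows carry the ID")
```

A low coverage number for any one dataset is a hint to fix the data collection upstream before adding that dataset to the connection.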

Schema alignment: Speak the same data language

Each dataset in Adobe Experience Platform is defined by a schema (its set of fields and definitions). When you bring multiple datasets into one CJA connection, having aligned schemas ensures that those fields integrate nicely. In other words, consistent field naming and definitions act like a common language for your data.

Use consistent field names: If two datasets contain the same info, use the same field name and, if possible, the same XDM field group or data type to ensure compatibility in Workspace. For example, if both reference an order ID, name it the same (for example, orderID). If one says order_id and the other OrderNumber, CJA treats them as separate, making cross-source analysis harder. It’s like two teams using the same language—alignment avoids confusion.
Event vs lookup datasets: Event datasets hold timestamped records like page views or purchases. Lookup datasets store static data like product catalogs to enrich events. Profile datasets contain customer attributes, and summary datasets hold aggregated data. In CJA, you can combine one event dataset with multiple lookup or profile datasets. A best practice: store static info (like product details) in a lookup dataset instead of repeating it in every event. CJA can join them on the fly using a shared key like Product ID, keeping your event data lean and efficient.
Ensure keys and types match: If you add a lookup dataset, CJA asks you to specify the key field (in the lookup) and the matching field in the event dataset that links to it. For example, if your event data has a field productID and you have a product lookup dataset with _id as the product key, you would configure _id (lookup key) = productID (event field) in the connection setup. Also, ensure the data types match (for example, both are strings or both are numbers), as mismatches can break joins or cause unexpected behavior.

Plan schemas before ingesting: It’s much easier to ensure naming consistency during schema design. If your web and mobile data are similar, model them on the same XDM schema for uniform fields. If they differ, use separate schemas — but still align key dimensions and metrics. A little upfront schema planning avoids duplicate or conflicting fields later in the Analysis Workspace.

In short, treat schema alignment as ensuring all your data sources “speak” in unison. This avoids confusion and makes your connected data analysis far more intuitive, scalable, and future-proof, especially as new datasets or teams come into the picture.
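The lookup-key advice above can be illustrated with a small Python sketch. It is only a conceptual model of joining events to a lookup on a shared key, not a description of how CJA performs the join internally. The field names `productID` and `_id` come from the example in the text; the sample rows and the helper names `key_types_match` and `join_lookup` are assumptions made for illustration.

```python
def key_types_match(events, lookup_rows, event_key, lookup_key):
    """Return True when the join key uses the same Python type on both sides."""
    event_types = {type(e[event_key]) for e in events if event_key in e}
    lookup_types = {type(r[lookup_key]) for r in lookup_rows if lookup_key in r}
    return event_types == lookup_types


def join_lookup(events, lookup_rows, event_key, lookup_key):
    """Enrich each event with the lookup attributes that share its key."""
    index = {row[lookup_key]: row for row in lookup_rows}
    enriched = []
    for event in events:
        match = index.get(event.get(event_key), {})
        # Copy lookup attributes onto the event, leaving out the key itself.
        extra = {k: v for k, v in match.items() if k != lookup_key}
        enriched.append({**event, **extra})
    return enriched


# Hypothetical sample data.
events = [
    {"productID": "p1", "revenue": 40},
    {"productID": "p9", "revenue": 15},  # no matching lookup row
]
products = [{"_id": "p1", "name": "Trail Shoe", "category": "Footwear"}]

print(key_types_match(events, products, "productID", "_id"))
for row in join_lookup(events, products, "productID", "_id"):
    print(row)

# A numeric key on one side and a string key on the other never match.
print(key_types_match([{"productID": 1}], [{"_id": "1"}], "productID", "_id"))
```

Events without a matching lookup row pass through unenriched, which is why it pays to confirm key types and values line up before you configure the connection.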

Combining datasets smartly: Unify vs. separate data sources

When connecting multiple data sources in CJA, you have to decide how to organize them: do you unify data into one dataset or keep them separate and combine at the connection stage? The answer depends on your data and use cases, and there are best practices for each approach.

Unifying data into one dataset: If your data sources share a similar structure and identity, consider merging them into a single dataset, like combining web and app events into an “All Digital Interactions” stream. It simplifies setup, with fewer moving parts and just one primary event dataset per data view (a limit in most CJA editions). Unified datasets also make cross-channel analysis easier, but you need to align schemas and possibly add a source field (for example, channel = "web" or "mobile").
Separating data into multiple datasets: Sometimes it’s better to keep datasets separate, like web and mobile, so you can manage them independently. This can help with team ownership, data latency, or retention needs. Each dataset can have its own TTL (time to live), be updated on its own, or be used selectively in connections. The downside? More complexity. You need consistent schemas and IDs so CJA can merge them properly. Also, CJA Foundation only allows one event dataset per connection, so separating events may require separate connections, limiting unified analysis. Check your edition for multi-event support.
Best of both worlds - careful combination: A common best practice is to combine datasets that naturally belong together for analysis, and separate those that don’t. For example, unify digital channels (web, app) into one dataset since a user journey often flows between them. But you might keep offline sales or call center data in separate datasets if they are different in structure or update frequency. You would then bring them into the connection as additional datasets (profile or lookup data) to enrich the online events (using a common customer ID to join). This way, you respect the one-event-dataset rule while still combining multiple sources in the analysis.
Consider data volume and performance: Another reason to be thoughtful about unifying vs. separating is performance and contract limits. A single massive dataset with all data might be slower to query or push you over quota, whereas splitting into logical parts could help manage load. On the other hand, too many separate pieces can increase overhead. Strike a balance based on what you need to analyze together. When in doubt, start with a lean connection (fewer datasets) and add more later if needed, rather than pulling in everything by default.
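If you go the unified route, the source field mentioned above is simple to picture. The sketch below tags each row with a `channel` value and merges the two streams in timestamp order. The row shape, the `timestamp` field (ISO 8601 strings in UTC, which sort correctly as plain text when they share one format), and the helper names are assumptions for illustration; in practice this step would happen in whatever pipeline prepares your data for AEP.

```python
def tag_channel(rows, channel):
    """Return copies of the rows with a channel field added."""
    return [{**row, "channel": channel} for row in rows]


def unify(web_rows, app_rows):
    """Merge web and app events into one stream ordered by timestamp."""
    combined = tag_channel(web_rows, "web") + tag_channel(app_rows, "mobile")
    return sorted(combined, key=lambda row: row["timestamp"])


# Hypothetical sample events.
web_rows = [{"timestamp": "2024-05-01T10:05:00Z", "event": "page_view"}]
app_rows = [
    {"timestamp": "2024-05-01T10:00:00Z", "event": "app_open"},
    {"timestamp": "2024-05-01T10:10:00Z", "event": "purchase"},
]

for row in unify(web_rows, app_rows):
    print(row["timestamp"], row["channel"], row["event"])
```

Keeping the channel as its own field means analysts can still break the unified stream back out by source when they need to.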

Backfill and validation: Load history carefully and verify results

Once you’ve set up your connection with the desired datasets, CJA gives you the option to backfill historical data. Backfilling means bringing in past data (that existed in Adobe Experience Platform before the connection was created) so that your reports have historical context from day one. It’s a powerful feature, but you want to approach it carefully. Equally important is validating that your data is accurate once it’s in CJA.

Plan your initial backfill: During connection creation, you can enable backfill for each dataset. Think about how much historical data is truly needed. It might be tempting to backfill two years of data, but if you only need the last 6 months for your analyses or if older data quality is questionable, you might choose a shorter window. Keep in mind that by default, if you don’t enable a rolling window (discussed later), CJA tries to ingest all data available in the Platform dataset. So, if your AEP dataset has 25 months of data, it brings all 25 months unless you limit it. Be mindful of your organization’s data usage limits and the time that it might take to process a huge backfill.
Start small, then scale: A best practice is to backfill a small period first to test, before pulling in your entire data history. For example, you might initially request to backfill just the last 7 days or 1 month of data. Once that small backfill completes, go to Analysis Workspace and check the data. Do the counts of visits, orders, etc., match your expectations or match source systems for that period? Are the datasets merging correctly (for example, does your common ID unify users across data)? This trial run can reveal any misconfigurations early. Adobe even suggests testing your connection with a limited backfill, and if everything looks good, then “backfill all the remaining data with ease.”
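One way to structure the "do the counts match?" check after a test backfill is a tolerance comparison, sketched below. It assumes you have collected the same metrics for the same period from your source system and from a Workspace report by hand; the metric names, the sample numbers, and the 2% tolerance are placeholders to adapt, not recommendations from Adobe.

```python
def compare_counts(source_counts, cja_counts, tolerance=0.02):
    """Return the metrics whose relative difference exceeds the tolerance.

    Each entry maps a metric name to (expected, actual, relative_difference).
    """
    mismatches = {}
    for metric, expected in source_counts.items():
        actual = cja_counts.get(metric, 0)
        if expected == 0:
            diff = 0.0 if actual == 0 else float("inf")
        else:
            diff = abs(actual - expected) / expected
        if diff > tolerance:
            mismatches[metric] = (expected, actual, diff)
    return mismatches


# Hypothetical numbers for a 7-day test backfill.
source_counts = {"orders": 1000, "visits": 50000}
cja_counts = {"orders": 995, "visits": 46000}

for metric, (expected, actual, diff) in compare_counts(source_counts, cja_counts).items():
    print(f"{metric}: expected {expected}, got {actual} ({diff:.1%} off)")
```

Here orders are within tolerance, while visits are 8% off and would be worth investigating before you backfill the full history.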

Data retention strategy: Use rolling windows to manage historical data

Data can pile up quickly. While it’s great to have rich historical information, you might not need all the data forever in CJA. That’s where a data retention (rolling window) strategy comes in. CJA allows you to set a rolling data window on your connection, meaning it retains data only for a defined recent period (for example, the last 12 months) and discards anything older than that window. This is a crucial best practice to control data volume and stay within contractual limits.

Understand rolling window benefits: Enabling a rolling window means you’re telling CJA, for example, “only keep the most recent 6 months of data available for analysis, and continuously drop anything older.” The main benefit is that you store and report only on data that’s relevant and within your analysis timeframe, and automatically delete older data. This helps prevent the accumulation of huge data volumes you no longer need, which can improve performance and avoid overage costs. If your business usually focuses on the past year’s trends, there’s no need to keep five years of data in CJA—older data can be archived and brought in only when needed.
Set it up during connection creation: When creating the connection, there’s a checkbox for “Enable rolling data retention window.” If you check it, you can then specify a number of months for retention (the UI usually provides options like 1, 3, 6, 12, 24 months, etc.).

Choose a window that suits your needs; many companies opt for 12 or 24 months as a balance between historical context and data manageability. Keep in mind that this window applies to event datasets (which have timestamps). Lookup or profile datasets don’t have timestamps, so they piggyback on event data retention—if related events are removed, unreferenced lookup data may drop out of the analysis too.

Default vs. rolling: If you don’t enable a rolling window, CJA ingests all available data from AEP and keeps adding new data indefinitely, unless AEP enforces its own retention limits. For example, if AEP retains 25 months of data and you haven’t set a rolling window, the initial backfill might bring in all 25 months, and as time goes on, it could grow (if AEP keeps more). In contrast, with a 13-month rolling window, CJA would only ever keep 13 months at a time. When a new month’s data comes in, the month falling off the back end is dropped. Think of it like a moving time window.
Be aware of contract limits: Adobe often licenses CJA based on a certain number of events or volume. Having a rolling window aligned with what you actually use in reporting can help you stay within those limits. For instance, if your contract allows 13 months of data, setting a 13-month rolling window ensures you won’t accidentally build up more data than allowed. It’s a safety net as well as a housekeeping tool.
Review and adjust over time: Your retention needs might change. Maybe at first you only needed 6 months, but next year you realize you want to do a year-over-year analysis, so 13 months would be better. You can edit the connection to adjust the rolling window as needed. Just remember that if you extend it, you might need to backfill the newly included period (if those older months were previously dropped). And if you shorten it, data older than the new window will get removed. Always communicate changes to your users so they understand, for example, why data prior to a certain date might no longer appear.

Having a clear data retention strategy via rolling windows keeps your CJA connections lean and focused. It’s like cleaning out the closet regularly - you make space for what’s current and avoid hoarding data “just because.” This not only helps with system performance and limits but also ensures that your analyses don’t accidentally include stale data that no one is acting on.
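The "moving time window" idea can be sketched in a few lines of Python. This only illustrates the concept: the article does not specify exactly how CJA computes its cutoff, so the calendar-month arithmetic and the helper names `months_back` and `in_window` are assumptions.

```python
import calendar
from datetime import date


def months_back(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`.

    The day of month is clamped so that, for example, one month before
    March 31 is the last day of February.
    """
    month_index = day.year * 12 + (day.month - 1) - months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def in_window(event_day: date, today: date, window_months: int) -> bool:
    """Return True when the event falls inside the rolling window."""
    return event_day >= months_back(today, window_months)


today = date(2025, 1, 15)
print(months_back(today, 13))                   # cutoff for a 13-month window
print(in_window(date(2024, 1, 10), today, 13))  # inside the window
print(in_window(date(2023, 11, 30), today, 13)) # older than the window
```

Running the same check with a 6-month and a 13-month window against your typical report date ranges is a quick way to see which setting your year-over-year analyses would need.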

Monitoring and maintenance: Keep an eye on your connections

Setting up a connection is not a “set it and forget it” task. Ongoing monitoring and maintenance ensure that your connections continue to deliver reliable insights as data flows and business needs evolve. Here are some best practices for caring for your CJA connections over time:

Monitor data flows regularly: Make it a habit to check that data is updating as expected. In CJA’s Connections dashboard (the Connections manager), you can see information like the last time data was ingested for each dataset (“Last updated” timestamp). If you notice that one dataset hasn’t updated in a while (for example, no new data in 2 days when it should be daily), that’s a red flag to investigate. Perhaps an upstream data ingestion pipeline failed or a source system had an outage. Catching these issues early helps maintain data continuity.
Validate periodically: Just as you validated after initial setup, continue to spot-check the data on a schedule (monthly, quarterly, or after any major data source change). Verify key metrics and dimensions to ensure that nothing has drifted. For example, if a new marketing channel was added to your website, did those events start showing up in CJA properly? If your common ID capture logic changed (maybe your mobile app now collects email differently), are identities still stitching correctly? Regular validation might include running a known report (like total sales last week) and comparing it to a source of truth. This ongoing quality control ensures confidence in the data.
Watch for anomalies: Use CJA or AEP tools to monitor trends in the data. A sudden drop in event counts or unique IDs, or a spike in null values, could indicate an issue. Many teams set up automated alerts or reports for basic data health indicators, for instance, a Workspace freeform table or an Insight alert that checks if daily events fall to zero. If you have important fields (like the common ID), monitoring the count of records missing that field can be valuable. As the saying goes, “you are what you ingest,” and any data quality issues upstream will manifest in CJA. So, being vigilant will help you catch problems that often originate outside CJA but affect your analysis.
Manage changes carefully: Over time, you may need to update your connections, maybe add a new dataset (like bringing in a new data source), or remove one that’s no longer needed, or change the retention window. Be careful with such changes. Adding a dataset initiates a backfill for that dataset if you request it, which could bring in a flood of new data. Removing a dataset impacts any data views or reports that were using fields from that dataset. Always communicate with your analytics team before making changes, and ideally test significant changes in a non-production sandbox or during off-peak times. CJA allows editing connections (like requesting another backfill or toggling data import on/off) fairly easily, but with great power comes great responsibility!
Access control and audit: As mentioned, restrict who can edit connections. You don’t want just anyone adding datasets or changing settings without oversight. Use Adobe Admin Console to limit this to admins or a governance group. It’s also smart to document connection settings (in a wiki or the description field) so future admins understand the setup. Since Adobe doesn’t offer a detailed change log for connections, keep your own record of changes. This helps with troubleshooting, like knowing why data spiked after a backfill.
Stay informed on updates: Adobe is always improving CJA. New features, like more dataset support or identity resolution upgrades, are announced in Experience League docs and forums. Keep an eye on release notes and the community for updates. For example, future support for multiple event datasets in one connection could change how you set things up. Staying proactive helps you get the most from CJA and follow best practices.

By actively monitoring and maintaining your connections, you can ensure that your hard work in setting them up continues to pay off. Think of it like maintaining a car: regular oil changes and check-ups prevent bigger problems down the road. Similarly, a little ongoing attention to your CJA connections keeps your customer journey data running smoothly and reliably.
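As a starting point for the freshness and anomaly checks described above, here is a small Python sketch. It assumes you have noted each dataset's "Last updated" timestamp and recent daily event counts yourself (for example, from the Connections manager and a Workspace table); no Adobe API is called, and the thresholds, dataset names, and helper names are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def stale_datasets(last_updated, now, max_age):
    """Return the names of datasets not updated within max_age, sorted."""
    return sorted(name for name, ts in last_updated.items() if now - ts > max_age)


def sudden_drop(daily_counts, threshold=0.5):
    """Return True if the latest day is below threshold times the prior average."""
    if len(daily_counts) < 2:
        return False
    *history, latest = daily_counts
    baseline = sum(history) / len(history)
    return latest < baseline * threshold


# Hypothetical readings.
now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
last_updated = {
    "web_events": datetime(2025, 1, 15, 6, 0, tzinfo=timezone.utc),
    "app_events": datetime(2025, 1, 12, 6, 0, tzinfo=timezone.utc),
}

print(stale_datasets(last_updated, now, max_age=timedelta(days=2)))
print(sudden_drop([10200, 9800, 10100, 3100]))
```

In this sample, `app_events` has gone more than two days without an update and the latest daily count has fallen well below half of the recent average, so both would be worth a closer look.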

Conclusion: Setting up for success in CJA

Building robust connections in Customer Journey Analytics is a journey in itself; you plan it, build it, and then nurture it. By planning thoughtfully, aligning schemas, selecting the right identity strategy, and combining datasets with intention, you lay the foundation for a unified customer view.

Then, through careful backfill, smart data retention settings, and ongoing monitoring, your connections don't just launch, they evolve and scale with your organization.

With these best practices in place, setting up connections in CJA becomes more than a task; it becomes a strategic enabler of insight. You are ready to answer powerful cross-channel questions like “How do in-store purchases correlate with mobile app usage?” or “Did that email campaign influence repeat visits?” because your data is speaking the same language.

Every organization’s data landscape is unique, so adapt these recommendations to your context. Start lean, iterate often, and build with purpose.

In no time, you will have connections in CJA that deliver a truly unified customer journey view, the very outcome CJA was built to deliver.

Happy connecting!
