Understanding How Customer Journey Analytics Uses Identity
This video is not a technical deep dive, but rather a practical look at how identity affects your analysis in Customer Journey Analytics, including a look at cross-channel visualizations made possible by stitching visitor IDs.
Hi, my name is Matt Thomas from the Adobe Analytics Product Management team.
Today let’s talk about identity, why it matters in CJA, and how it impacts your analysis.
Identity is also critical for stitching, which helps address identity gaps in your data.
Most organizations want person-based reporting. It provides a more accurate view of individuals across devices and channels, and improves downstream activation efforts.
So what is an identity? In this context, it’s an alphanumeric value representing a person, device, or even a transaction during interactions across various channels.
Identities fall into two categories: non-durable, meaning they’re short-lived, like a cookie ID, and durable, meaning they’re persistent across channels, like an email address, loyalty ID, or phone number.
Connecting these identities together gives us a complete view of a person’s interactions. For example, McKay’s profile may include email, phone, loyalty ID, CRM ID, or many others. Depending on the channel in which she’s engaging, you might see one or several identities in your data.
When bringing this data into AEP, make sure identity fields are recognized so that they can be used across AEP apps, including CJA.
This typically involves two steps.
The first step is defining identity namespaces in AEP for all of your identity types. There are a lot of standard ones, but you can also define custom ones. The second step is marking identity fields in your schema.
Whether these values live in standalone fields or in the XDM identity map, they need to be marked as identities. This is essential for CJA connections and for populating the identity graph.
You’re probably asking yourself, why is all of this so important? As we have discussed, most people have multiple devices and touchpoints with any given brand. CJA makes this kind of holistic, person-based reporting possible by allowing you to commingle data from various channels and sources.
One key tenet of CJA is that each dataset contains a single field that is to be used as the person identifier.
As you can see, each of these three datasets has a color-coded column signifying that it is an identity column.
Any other dataset that is to be used alongside them in CJA needs to share the same person identifier so that the datasets can be joined properly and show the true customer journey. As mentioned, the person ID is the glue that connects datasets in CJA. On the connection screen, you select the identity field for each dataset.
As you can see here, under the label of person ID, you will see a dropdown of all identities that are marked in the schema for that dataset.
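To make the "glue" idea concrete, here is a minimal sketch in Python. It is not how CJA works internally. It only illustrates how events from separate datasets line up into one journey when they share a person ID. The dataset names, field names, and sample values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample events from three datasets; field names are illustrative only.
web_events = [
    {"person_id": "cust-001", "timestamp": 1, "event": "web: product view"},
    {"person_id": "cust-002", "timestamp": 2, "event": "web: search"},
]
call_center_events = [
    {"person_id": "cust-001", "timestamp": 3, "event": "call: billing question"},
]
store_events = [
    {"person_id": "cust-001", "timestamp": 5, "event": "store: purchase"},
    {"person_id": "cust-002", "timestamp": 4, "event": "store: return"},
]


def build_journeys(*datasets):
    """Group events from all datasets by person ID, ordered by timestamp."""
    journeys = defaultdict(list)
    for dataset in datasets:
        for row in dataset:
            journeys[row["person_id"]].append(row)
    for events in journeys.values():
        events.sort(key=lambda row: row["timestamp"])
    return dict(journeys)


journeys = build_journeys(web_events, call_center_events, store_events)
for person, events in journeys.items():
    print(person, [e["event"] for e in events])
```

Because all three datasets use the same person identifier, each person's web, call center, and store events fall into a single ordered journey.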
One thing to remember here is that when the data is being ingested, if a row of data does not have a value in the selected person ID field, that row will be dropped from CJA.
Let’s see how that plays out with an example.
In this table, if you select the device ID as the person ID, this leads to device-centric analysis, and you lose the ability to connect it with other channels, since this is the only dataset that has the ECID identity.
On a positive note, no rows of data are dropped, since they all have values in the device ID column. If we switch, though, to using the customer ID as the person ID, this leads to person-centric analysis, allowing you to connect it with other channels. But since not all rows have a value, the unauthenticated events will be dropped, leaving a gap in your analysis.
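The trade-off in this example can be sketched in a few lines of Python. This is only an illustration of the rule described above (rows without a value in the selected person ID field are dropped), not CJA's actual ingestion logic. The field names `device_id` and `customer_id` and the sample rows are assumptions.

```python
# Hypothetical web events: every row has a device ID (ECID),
# but only authenticated rows have a customer ID.
events = [
    {"device_id": "ecid-A", "customer_id": None, "page": "home"},
    {"device_id": "ecid-A", "customer_id": "cust-001", "page": "account"},
    {"device_id": "ecid-B", "customer_id": None, "page": "search"},
    {"device_id": "ecid-B", "customer_id": None, "page": "product"},
    {"device_id": "ecid-C", "customer_id": "cust-002", "page": "checkout"},
]


def rows_kept(rows, person_id_field):
    """Keep only rows that have a non-empty value in the chosen person ID field."""
    return [row for row in rows if row.get(person_id_field)]


by_device = rows_kept(events, "device_id")
by_customer = rows_kept(events, "customer_id")

print(f"device_id as person ID:   {len(by_device)} of {len(events)} rows kept")
print(f"customer_id as person ID: {len(by_customer)} of {len(events)} rows kept")
```

With this sample data, choosing the device ID keeps all five rows, while choosing the customer ID keeps only the two authenticated rows.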
So which one do you choose? Most likely, you will pick the identity that’s most common in your data or the one your organization prioritizes. To help in that decision-making process, it’s important to measure identity coverage.
You can do this by creating a connection to your data, and in the Data View configuration, drag all identity dimensions into the Metric section to count occurrences.
Then create a calculated metric for identification rates, such as email presence over total events, and repeat this process for any other identities you have in your data.
Identities in a web dataset like ECID should approach 100%, but others will certainly vary.
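If you want to sanity-check coverage outside of the Data View, the same calculation can be sketched in Python: rows where the identity is present, divided by total rows. The field names (`ecid`, `email`, `loyalty_id`) and the sample rows below are hypothetical, and in CJA itself you would build this as a calculated metric rather than run code.

```python
# Hypothetical web events with several identity fields; None means the identity is absent.
events = [
    {"ecid": "ecid-A", "email": None, "loyalty_id": None},
    {"ecid": "ecid-A", "email": "a@example.com", "loyalty_id": None},
    {"ecid": "ecid-B", "email": None, "loyalty_id": None},
    {"ecid": "ecid-C", "email": "c@example.com", "loyalty_id": "L-42"},
]


def identification_rate(rows, identity_field):
    """Share of rows with a non-empty value in the given identity field."""
    if not rows:
        return 0.0
    identified = sum(1 for row in rows if row.get(identity_field))
    return identified / len(rows)


rates = {field: identification_rate(events, field)
         for field in ("ecid", "email", "loyalty_id")}

for field, rate in rates.items():
    print(f"{field}: {rate:.0%}")
```

In this sample, ECID coverage is 100%, email is 50%, and loyalty ID is 25%, which mirrors the pattern described above: the web identity is nearly always present, and the durable identities vary.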
This insight can help shape your identity strategy.
If coverage is low or you need both authenticated and unauthenticated events, stitching can help.
Check out our other videos on field-based stitching and graph-based stitching to learn more. Thank you.
For more information, visit the documentation.