CJA stitching enablement and validation
This video walks you through enabling stitching on any event dataset. Once stitching is enabled on a dataset, the video also showcases metrics and dimensions that you can use to validate that it is working and to see the value it brings.
Hi, this is Matt Thomas from the Adobe Analytics Product Management team.
Today let’s talk about how you can enable identity stitching on your datasets and how you can validate that it is working properly.
This video assumes you have a base understanding of how identities are configured to work in both AEP and CJA and have a base understanding of how CJA works.
Let’s dive into the product. After logging into CJA, assuming that you are an admin and have the ability to make changes to connections, let’s create a new connection.
This also works if you are editing an existing connection. After creating the connection, you can start to add datasets that you want as part of this connection. Let’s do so now. You will notice that underneath the person ID, you have a new box here that you can tick called Enable Identity Stitching.
Once you do that, you’ll be presented with a warning making sure that you’re complying with both local laws and regulations. Assuming that you hit Continue, the person ID field will be disabled because we are no longer able to use the field from the dataset itself. Instead, we will use one that is derived through the stitching process.
You have multiple options here for persistent ID, person ID, and lookback window. Let’s start with the persistent ID. This ID can be selected by hitting the dropdown here, and you will see any fields in this list that are marked as an identity in the schema. I’m going to go ahead and select Identity Map here and I’m going to use the ECID namespace.
When I get down to person ID, I will have multiple options here. Again, these are the fields in the schema that are marked as an identity, along with the identity map and, if you’re entitled to it, Identity Graph.
Selecting Identity Graph essentially becomes graph-based stitching. When you select it, you are again shown a warning asking you to confirm that you have properly set up the graph so that it can be used by stitching.
After doing that, you pick the namespace that you want to use as part of graph-based stitching, which in this case is actually a hashed email address.
Otherwise, you can choose a variable or a field directly from the dataset itself or the identity map. In this case, I’m choosing an email from the identity map. This essentially becomes field-based stitching.
You specify your lookback period. Again, this field is driven by what you’re entitled to. I’m going to select seven days and then I’m going to set this as a web dataset.
One thing that I will encourage you to do is request a backfill that’s short in nature. There are two benefits to this. One, I can quickly get data into CJA and validate that it is working correctly. Two, if there are any problems or any setup configurations that are wrong, you can quickly get rid of this connection, or at least this dataset, and start a new one without a lot of ingested rows.
So I go ahead and hit add datasets to my connection. And now you will see a column here that says stitched true. Once you pick the number of daily events and save your connection, the stitching process will begin.
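To make the idea of field-based stitching with a lookback window more concrete, here is a minimal Python sketch. It is a toy model only and not the logic the product actually runs, which is not described in this video. The `Event` class, the `stitch` function, and the "nearest authenticated event within the window" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Event:
    timestamp: datetime
    ecid: str              # persistent ID, assumed to be present on every event
    email: Optional[str]   # person ID, only present on authenticated events


def stitch(events: List[Event],
           lookback: timedelta = timedelta(days=7)) -> List[Tuple[Event, str, str]]:
    """Toy field-based stitching: returns (event, stitched_id, namespace) tuples.

    Assumption for illustration: an unauthenticated event borrows the email from
    the closest-in-time authenticated event with the same ECID, provided that
    event falls within the lookback window. Otherwise it falls back to the ECID.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    result = []
    for event in ordered:
        if event.email:
            # The event already carries the person ID.
            result.append((event, event.email, "email"))
            continue
        candidates = [
            other for other in ordered
            if other.ecid == event.ecid
            and other.email
            and abs(other.timestamp - event.timestamp) <= lookback
        ]
        if candidates:
            nearest = min(candidates, key=lambda o: abs(o.timestamp - event.timestamp))
            result.append((event, nearest.email, "email"))
        else:
            # No known person for this device in the window: keep the persistent ID.
            result.append((event, event.ecid, "ecid"))
    return result


start = datetime(2024, 1, 1)
sample = [
    Event(start, "A", None),
    Event(start + timedelta(days=1), "A", "x@example.com"),
    Event(start + timedelta(days=2), "B", None),
    Event(start + timedelta(days=20), "A", None),  # outside the 7-day window
]
for event, stitched_id, namespace in stitch(sample):
    print(event.ecid, stitched_id, namespace)
```

In this sample, the first event on device A picks up the email from the login one day later, while the event 20 days later stays on the ECID because it is outside the seven-day window.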
Let’s switch over now to data views. Let’s go ahead and create a new data view so that we can validate that the stitching is working correctly. So I’m going to go ahead and select the connection that I just created. I’m going to name this stitching validation data view.
And I’m going to go over to the components section. From there, we’re going to go over to the metrics and dimensions and make sure that we have some of the default metrics and dimensions that are available, one of which is the identity namespace. This holds the namespace of the identity that’s used in the CJA connection. We can bring this in as a dimension. We can also bring it in as a metric if we want.
We’ll also come over here and look up the email, which is going to be in the email ID, and bring it in as a dimension. We’re also going to bring it in as a metric, and we’re going to call it has email set.
This metric will only count when there’s a value in this dimension. We’re going to go ahead and save and continue and leave all the other settings the same. Now that we have our data view created, let’s create our first project.
It’s going to be a blank workspace project. Let’s make sure that we have the right data view selected. First, let’s look at how many events already have the email set on the event before even any stitching process was involved. You can simply drag the has email set over into the metric area of your table. And you can see that pretty easily.
Secondly, let’s create a calculated metric to see how many events now have the email on them after the stitching process. So coming over here to the plus, you can create a calculated metric. We’ll call it email stitched namespace.
And let’s add in the identity namespace. We’re going to set it equal to the desired namespace, which in this case is email address, and we’re going to bring in events as the metric here. So that’s basically going to count how many events had the namespace set to email. We’ll go ahead and save that. Now we have it over here as email stitched namespace, and we can add it as a metric to our table to see how many events now contain email versus how many had it before.
Now let’s determine the authentication rates against all events by creating two additional calculated metrics. The first one, which we’re going to call email authentication rate, is simply going to be has email set divided by total events.
We’re going to go ahead and save that one. And we’re going to create a second one called stitched authentication rate. This one is going to be events where the identity namespace is equal to, again, the desired namespace, divided by all of the events. We’re going to make this a percent and we’re going to go to two decimal places.
Now let’s add both of these in as metrics to our table, starting with email authentication rate and then stitched authentication rate. You can see there is a pretty dramatic jump from the email authentication rate, which is the number of events that had the email set out of total events, to what we have now after the stitching process. We can also create other calculated metrics, such as percent increase and lift, if we want to see, again, the tremendous value that stitching is bringing to this dataset.
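The arithmetic behind these calculated metrics can be sketched in a few lines of Python. This is only an illustration of the ratios described in the video, not how Workspace computes them. The field names `email` and `identity_namespace`, the namespace value `"email"`, and the percent increase formula (change relative to the pre-stitching count) are assumptions.

```python
from typing import Dict, List, Optional


def safe_divide(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def stitching_metrics(events: List[Dict[str, Optional[str]]],
                      desired_namespace: str = "email") -> Dict[str, float]:
    """Compute the validation metrics from a list of event records.

    Each event is a dict with hypothetical keys:
      'email'              - email present on the raw event, or None
      'identity_namespace' - namespace of the identity after stitching
    """
    total_events = len(events)
    # "has email set": events that carried an email before stitching.
    has_email_set = sum(1 for e in events if e.get("email"))
    # "email stitched namespace": events whose identity namespace is email.
    email_stitched = sum(
        1 for e in events if e.get("identity_namespace") == desired_namespace
    )
    return {
        "has_email_set": has_email_set,
        "email_stitched_namespace": email_stitched,
        "email_authentication_rate": safe_divide(has_email_set, total_events),
        "stitched_authentication_rate": safe_divide(email_stitched, total_events),
        # Assumed definition: relative change from the pre-stitching count.
        "percent_increase": safe_divide(email_stitched - has_email_set, has_email_set),
    }


# Ten toy events: 2 carried an email originally, 6 resolve to email after stitching.
toy_events = (
    [{"email": "x@example.com", "identity_namespace": "email"}] * 2
    + [{"email": None, "identity_namespace": "email"}] * 4
    + [{"email": None, "identity_namespace": "ecid"}] * 4
)
for name, value in stitching_metrics(toy_events).items():
    print(f"{name}: {value:.2f}")
```

With this toy data, the email authentication rate is 2 out of 10 events and the stitched authentication rate is 6 out of 10, which is the kind of jump the table in the project makes visible.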
I hope this helps. Thank you so much for your time.