[Beta]{class="badge informative"}

Troubleshooting guide for identity graph linking rules

AVAILABILITY
The Identity graph linking rules feature is currently in beta. Contact your Adobe account team for information on the participation criteria. The feature and documentation are subject to change.

As you test and validate identity graph linking rules, you may run into some issues related to data ingestion and graph behavior. Read this document to learn how to troubleshoot some common issues that you might encounter when working with identity graph linking rules.

Data ingestion flow overview data-ingestion-flow-overview

The following diagram is a simplified representation of how data flows into Adobe Experience Platform and Applications. Use this diagram as reference to help you get a better understanding of the contents of this page.

A diagram of how data ingestion flows in Identity Service.

It is important to note the following factors:

  • For streaming data, Real-Time Customer Profile, Identity Service, and data lake will start processing the data when the data is sent. However, the latency to complete the processing of the data is dependent on the service. Usually, data lake will take a longer time to process, compared to Profile and Identity.
    • If the data does not appear when running a query against a dataset even after a couple of hours, then it is likely that the data did not get ingested into Experience Platform.
  • For batch data, all data will flow into data lake first, then the data will be propagated to Profile and Identity if the dataset is enabled for Profile and Identity.
  • For ingestion-related issues, it is important that the issue is isolated at a service-level for accurate debugging and troubleshooting. There are three potential issue types to consider:
Ingestion issue type
Does the data get ingested in data lake?
Does the data get ingested in Profile?
Does the data get ingested in Identity Service?
General ingestion issue
No
No
No
Graph issue
Yes
Yes
No
Profile fragment issue
Yes
No
Yes

Data ingestion issues data-ingestion-issues

NOTE
  • This section assumes that the data has been successfully ingested into data lake and that there were no syntax or other errors that would prevent the data from being ingested into Experience Platform in the first place.

  • The examples use ECID as the cookie namespace and CRMID as the person namespace.

My identities are not getting ingested into Identity Service my-identities-are-not-getting-ingested-into-identity-service

There are various reasons for why this could happen, including, but not limited to the following:

Within the context of identity graph linking rules, a record may be rejected from Identity Service because the incoming event has two or more identities with the same unique namespace but different identity value. This scenario usually happens due to implementation errors.

Consider the following event with two assumptions:

  • The field name CRMID is marked as an identity with the namespace CRMID.
  • The namespace CRMID is defined as a unique namespace.

The following event will return an error message indicating that ingestion has failed.

{
  "_id": "random_string",
  "eventType": "web browsing event",
  "identityMap": {
    "ECID": [
      {
        "id": "11111111111111111111111111111111111111",
        "primary": false
      }
    ],
    "CRMID": [
      {
        "id": "Alice",
        "primary": true
      }
    ]
  },
  "CRMID": "Bob",
  "timestamp": "2024-08-17T15:22:51+00:00",
  "web": {
    "webPageDetails": {
      "URL": "https://www.adobe.com/acrobat.html",
      "name": "Adobe Acrobat"
    }
  }
}

Troubleshooting steps

To resolve this error, you must first collect the following information:

  • The identity value (identity_value) you expected to be ingested in the identity graph.
  • The dataset (dataset_name) in which the event was sent in.

Next, use Adobe Experience Platform Query Service and run the following query:

TIP
Replace dataset_name and identity_value with the information that you collected.
  SELECT key, col.id as identityValue, timestamp, _id, identityMap, *
  FROM (SELECT key, explode(value), *
  FROM (SELECT explode(identityMap), *
  FROM dataset_name)) WHERE col.id = 'identity_value'

After running your query, find the event record that you expected to generate a graph, and then validate that the identity values are different in the same row. View the following image for an example:

An untitled query that resulted in duplicated namespaces.

NOTE
If the two identities are exactly the same, and if the event is ingested via streaming, then both Identity and Profile will deduplicate the identity.

My experience event fragments are not getting ingested into Profile my-experience-event-fragments-are-not-getting-ingested-into-profile

There are various reasons that contribute as to why your experience event fragments are not getting ingested into Profile, including but not limited to:

In the context of namespace priority, Profile will reject:

  • Any event that contains two or more identities with the highest namespace priority. For example, if GAID is not marked as a unique namespace and two identities both with a GAID namespace and different identity values came in, then Profile will not store any of the events.
  • Any event where the namespace with the highest priority is an empty string.

Troubleshooting steps

If your data is sent to data lake, but not Profile, and you believe that this is due to sending two or more identities with the highest namespace priority in a single event, then you may run the following query to validate that there are two different identity values sent against the same namespace:

TIP
In the following queries, you must:
  • Replace _testimsorg.identification.core.email with the path sending the identity.
  • Replace Email with the namespace with the highest priority. This is the same namespace that is not being ingested.
  • Replace dataset_name with the dataset that you wish to query.
  SELECT identityMap, key, col.id as identityValue, _testimsorg.identification.core.email, _id, timestamp
  FROM (SELECT key, explode(value), *
  FROM (SELECT explode(identityMap), *
  FROM dataset_name)) WHERE col.id != _testimsorg.identification.core.email and key = 'Email'

You can also run the following query to check if ingestion to Profile is not happening due to the highest namespace having an empty string:

  SELECT identityMap, key, col.id as identityValue, _testimsorg.identification.core.email, _id, timestamp
  FROM (SELECT key, explode(value), *
  FROM (SELECT explode(identityMap), *
  FROM dataset_name)) WHERE (col.id = '' or _testimsorg.identification.core.email = '') and key = 'Email'

These two queries assume that one identity is sent from the identityMap and another identity is sent from an identity descriptor. NOTE: In Experience Data Model (XDM) schemas, the identity descriptor is the field marked as an identity.

My experience event fragments are ingested, but have the “wrong” primary identity in Profile

Namespace priority plays an important role in how event fragments determine primary identity.

  • Once you have configured and saved your identity settings for a given sandbox, Profile will then use namespace priority to determine the primary identity. In the case of identityMap, Profile will then no longer use the primary=true flag.
  • While Profile will no longer refer to this flag, other services on Experience Platform may continue to use the primary=true flag.

In order for authenticated user events to be tied to the person namespace, all authenticated events must contain the person namespace (CRMID). This means that even after a user logs in, the person namespace must still be present on every authenticated event.

You may continue to see primary=true ‘events’ flag when looking up a profile in profile viewer. However, this is ignored and will not be used by Profile.

AAIDs are blocked by default. Therefore, if you are using the Adobe Analytics source connector, you must ensure that the ECID is prioritized higher than the ECID so that the unauthenticated events will have a primary identity of ECID.

Troubleshooting steps

  • To validate that authenticated events contain both the person and cookie namespace, read the steps outlined in the section on troubleshooting errors regarding data not being ingested to Identity Service.
  • To validate that authenticated events have the primary identity of the person namespace (e.g. CRMID), search the person namespace on profile viewer using no-stitch merge policy (this is the merge policy that does not use private graph). This search will only return events associated to the person namespace.

This section outlines common issues you may encounter regarding how the identity graph behaves.

The identity is getting linked to the ‘wrong’ person

The identity optimization algorithm will honor the most recently established links and remove the oldest links. Therefore, it is possible that once this feature is enabled, ECIDs could be reassigned (re-linked) from one person to another. To understand the history of how an identity gets linked over time, follow the steps below:

Troubleshooting steps

NOTE
The following steps will retrieve information under the following assumptions:

First, you must collect the following information:

  • The identity symbol (namespaceCode) of the cookie namespace (e.g. ECID) and the person namespace (e.g. CRMID) that were sent.

    • For Web SDK implementations, these are usually the namespaces included in the identityMap.
    • For Analytics source connector implementations, these are the cookie identifier included in the identityMap. The person identifier is an eVar field marked as an identity.
  • The dataset in which the event was sent in (dataset_name).

  • The identity value of the cookie namespace to look up (identity_value).

Identity symbols (namespaceCode) are case sensitive. To retrieve all identity symbols for a given dataset in the identityMap, run the following query:

SELECT distinct explode(*)FROM (SELECT map_keys(identityMap) FROM dataset_name)

If you do not know the identity value of your cookie identifier and you would like to search for a cookie ID that would have been linked to multiple person identifiers, then you must run the following query. This query assumes ECID as the cookie namespace and CRMID as the person namespace.

Web SDK implementation
code language-sql
  SELECT identityMap['ECID'][0]['id'], count(distinct identityMap['CRMID'][0]['id']) as crmidCount FROM dataset_name GROUP BY identityMap['ECID'][0]['id'] ORDER BY crmidCount desc
Analytics source connector implementation
code language-sql
  SELECT identityMap['ECID'][0]['id'], count(distinct personID) as crmidCount FROM dataset_name group by identityMap['ECID'][0]['id'] ORDER BY crmidCount desc

Note: personID refers to the path of the descriptor. You can find this information under schemas.

Next, examine the association of the cookie namespace in order of timestamp by running the following query:

Web SDK implementation
code language-sql
  SELECT identityMap['CRMID'][0]['id'] as personEntity, *
  FROM dataset_name
  WHERE identitymap['ECID'][0].id ='identity_value'
  ORDER BY timestamp desc
Analytics source connector implementation
code language-sql
SELECT _experience.analytics.customDimensions.eVars.eVar10 as personEntity, *
FROM dataset_name
WHERE identitymap['ECID'][0].id ='identity_value'
ORDER BY timestamp desc

Note: This example assumes that eVar10 is marked as an identity. For your configurations, you must change the eVar based on your own organization’s implementation.

The identity optimization algorithm is not ‘working’ as expected

Troubleshooting steps

Refer to the documentation on identity optimization algorithm, as well as the types of graph structures that are supported.

  • Read the graph configuration guide for examples of supported graph structures.

  • You can also read the implementation guide for examples of unsupported graph structures. There are two scenarios that could happen:

    • No single namespace across all your profiles.
    • A “dangling ID” scenario occurs. In this scenario, Identity Service is unable to determine if the dangling ID is associated with any of the person entities in the graphs.

You can also use the graph simulation tool in the UI to simulate events and configure your own unique namespace and namespace priority settings. Doing so can help give you a baseline understanding of how the identity optimization algorithm should behave.

If your simulation results match your graph behavior expectations, then you can check and see if your identity settings matches the settings that you have configured in your simulation.

I still see collapsed graphs in my sandbox even after configuring identity settings

Identity graphs will adhere to your configured unique namespace and namespace priority after the settings have been saved. Any “collapsed” graphs that exist before you save your new settings will not be affected, until new data is ingested such that the collapsed graph is updated. The primary identity of event fragments on Real-Time Customer Profile will not be updated even after namespace priority changes.

Troubleshooting steps

You can use the identity graph viewer to check whether your graph was ingested before or after your settings. Examine the last updated timestamp under Link properties to see when Identity Service ingested the graph. If the timestamp is before configuration, then that suggests that the “collapsed” graph was created before enabling the feature.

The identity graph viewer with an example graph.

I want to know how many “collapsed” graphs exist in my sandbox

Use the identity dashboard for insights on the state of your identity graph, such as the count of identities and graphs. Refer to the metric, “Graph count with multiple namespaces” for a count of graphs that have collapsed - these are graphs that contain two or more identities with the same namespace. Assuming that the sandbox has no data, and you have configured a namespace (e.g. CRMID) to be unique, the expectation is that there should be zero graphs that have two or more CRMIDs. In the example below, there are two graphs that contain two or more email addresses.

The identity dashboard with metrics on identity count, graph count, count by namespace, graph count by size and graph count of graphs more than two namespaces.

You can find a detailed breakdown in the profile snapshot export dataset in data lake by running the query below:

NOTE
  • Replace dataset_name with the actual name of your dataset.

  • The counts may not exactly match. The identity dashboard is based on the identity graph count and the following query is based on profile count with two or more identities. The data is independently processed and updated by the service.

  SELECT key, identityCountInGraph, count(identityCountInGraph) as graphCount
  FROM (SELECT key, cardinality(value) as identityCountInGraph
  FROM (SELECT explode(identityMap)
  FROM dataset_name
  WHERE cardinality(identityMap) > 1)) /* by definition, graphs have 2 or more identities */
  WHERE key not in ('ecid', 'aaid', 'idfa', 'gaid') /* filter out common device/cookie namespaces */
  GROUP BY 1, 2
  ORDER BY 1, 2 asc

You can use the following query in profile snapshot export dataset to obtain sample identities from “collapsed” graphs.

  SELECT identityMap
  FROM dataset_name
  WHERE cardinality(identityMap['CRMID'])>1 /* any graphs with 2+ CRMID. Change CRMID namespace if needed */
TIP
The two queries listed above will yield expected results if the sandbox is not enabled for the shared device interim approach and will behave differently from identity graph linking rules.

Frequently asked questions faq

This section outlines a list of answers to frequently asked questions about identity graph linking rules.

Identity optimization algorithm identity-optimization-algorithm

Read this section for answers to frequently asked questions about the identity optimization algorithm.

I have a CRMID for each of my business unites (B2C CRMID, B2B CRMID), but I don’t have a unique namespace across all of my profiles. What will happen if I mark B2C CRMID and B2B CRMID as unique, and enable my identity settings?

This scenario is unsupported. Therefore, you may see graphs collapse in cases where a user uses their B2C CRMID to login, and another user uses their B2B CRMID to login. For more information, read the section on single person namespace requirement in the implementation page.

Does identity optimization algorithm ‘fix’ existing collapsed graphs?

Existing collapsed graphs will be affected (‘fixed’) by the graph algorithm only if these graphs get updated after you save your new settings.

If two people log in and out using the same device, what happens to the events? Will all events transfer over to the last authenticated user?

  • Anonymous events (events with ECID as primary identity on Real-Time Customer Profile) will transfer to the last authenticated user. This is because the ECID will be linked to the CRMID of the last authenticated user (on Identity Service).
  • All authenticated events (events with CRMID defined as primary identity) will remain with the person.

For more information, read the guide on determining the primary identity for experience events.

How will journeys in Adobe Journey Optimizer be impacted when the ECID is transferring from one person to another?

The CRMID of the last authenticated user will be linked to the ECID (shared device). ECIDs can be reassigned from one person to another based on user behavior. The impact will depend on how the journey is constructed, so it is important that customers test out the journey in a development sandbox environment to validate the behavior.

The key points to highlight are as follows:

  • Once a profile enters a journey, ECID re-assignment does not result in the profile exiting in the middle of a journey.

    • Journey exits are not triggered by graph changes.
  • If a profile is no longer associated with an ECID, then this may result in changing the journey path if there is a condition that uses audience qualification.

    • ECID removal may change events associated to a profile, which could result in changes in audience qualification.
  • Re-entry of a journey is dependent on journey properties.

    • If you disable re-entry of a journey, once a profile exits from that journey, the same profile will not re-enter for 91 days (based on global journey timeout).
  • If a journey starts with an ECID namespace, the profile that enters and the profile that receives the action (ex. email, offer) may be different depending on how the journey is designed.

    • For example, if there is a wait condition between actions, and the ECID transfers during the waiting period, a different profile may be targeted.
    • With this feature, ECID are no longer always associated with one profile.
    • The recommendation is to start journeys with person namespaces (CRMID).

Namespace priority

Read this section for answers to frequently asked questions about namespace priority.

I’ve enabled my identity settings. What happens to my settings if I want to add a custom namespace after the settings has been enabled?

There are two ‘buckets’ of namespaces: person namespaces and device/cookie namespaces. The newly created custom namespace will have the lowest priority in each ‘bucket’ so that this new custom namespace does not impact existing data ingestion.

If Real-Time Customer Profile is no longer using the ‘primary’ flag on identityMap, does this value still need to be sent?

Yes, the ‘primary’ flag on identityMap is used by other services. For more information, read the guide on the implications of namespace priority on other Experience Platform services.

Will namespace priority apply to Profile record datasets in Real-Time Customer Profile?

No. Namespace priority will only apply to Experience Event datasets using the XDM ExperienceEvent Class.

How does this feature work in tandem with the identity graph guardrails of 50 identities per graph? Does namespace priority affect this system defined guardrail?

The identity optimization algorithm will be applied first to ensure person entity representation. Afterwards, if the graph tries to exceed the identity graph guardrail (50 identities per graph), then this logic will be applied. Namespace priority does not affect the deletion logic of the 50 identity/graph guardrail.

Testing

Read this section for answers to frequently asked questions about testing and debugging features in identity graph linking rules.

What are some of the scenarios I should be testing in a development sandbox environment?

Generally speaking, testing on a development sandbox should mimic the use cases you intend to execute on your production sandbox. Refer to the following table for some key areas to validate, when conducting comprehensive testing:

Test case
Test steps
Expected outcome
Accurate person entity representation
  • Mimic anonymous browsing
  • Mimic two people (John, Jane) logging in using the same device
  • Both John and Jane should be associated to their attributes and authenticated events.
  • The last authenticated user should be associated to the anonymous browsing events.
Segmentation

Create four segment definitions (NOTE: Each pair of segment definition should have one evaluated using batch and the other streaming.)

  • Segment definition A: Segment qualification based on John’s authenticated events.
  • Segment definition B: Segment qualification based on Jane’s authenticated events.
Regardless of shared device scenarios, John and Jane should always qualify for their respective segments.
Audience qualification / unitary journeys on Adobe Journey Optimizer
  • Create a journey starting with an audience qualification activity (such as the streaming segmentation created above).
  • Create a journey starting with a unitary event. This unitary event should be an authenticated event.
  • You must disable re-entry when creating these journeys.
  • Regardless of shared device scenarios, John and Jane should trigger the respective journeys that they should enter.
  • John and Jane should not re-enter the journey when the ECID is transferred back to them.

How do I validate that this feature is working as expected?

Use the graph simulation tool to validate that the feature is working at an individual graph level.

To validate the feature at a sandbox level, refer to the Graph count with multiple namespaces section in the identity dashboard.

recommendation-more-help
64963e2a-9d60-4eec-9930-af5aa025f5ea