Experience Data Model (XDM) is the core framework that standardizes customer experience data by providing common structures and definitions for use in downstream Adobe Experience Platform services. By adhering to XDM standards, all customer experience data can be incorporated into a common representation that allows you to gain valuable insights from customer actions, define customer audiences, and express customer attributes for personalization purposes.
Because XDM is extremely versatile and customizable by design, it is important to follow best practices for data modeling when designing your schemas. This document covers the key decisions and considerations you must make when mapping your customer experience data to XDM.
Before reading this guide, please review the XDM System overview for a high-level introduction to XDM and its role within Experience Platform.
Additionally, this guide focuses exclusively on key considerations regarding schema design. It is therefore strongly recommended that you refer to the basics of schema composition for detailed explanations of the individual schema elements mentioned throughout this guide.
The recommended approach for designing your data model for use in Experience Platform can be summarized as follows:
The steps for identifying the data sources required to carry out your business use cases vary from organization to organization. While the remaining sections of this document focus on the latter steps of organizing and constructing an entity relationship diagram (ERD) after the data sources have been identified, the explanations of the diagram’s various components may inform your decisions as to which of your data sources should be migrated to Platform.
Once you have determined the data sources you wish to bring into Platform, create a high-level ERD to help guide the process of mapping your data to XDM schemas.
The example below represents a simplified ERD for a company that wants to bring data into Platform. The diagram highlights the essential entities that should be sorted into XDM classes, including customer accounts, hotels, addresses, and several common e-commerce events.
Once you have created an ERD to identify the essential entities you would like to bring into Platform, these entities must be sorted into profile, lookup, and event categories:
|Category||Description|
|Profile entities||Profile entities represent attributes relating to an individual person, typically a customer. Entities that fall under this category should be represented by schemas based on the XDM Individual Profile class.|
|Lookup entities||Lookup entities represent concepts that can relate to an individual person, but cannot be directly used to identify the individual. Entities that fall under this category should be represented by schemas based on custom classes, and are linked to profiles and events through schema relationships.|
|Event entities||Event entities represent concepts related to actions a customer can take, system events, or any other concept where you may want to track changes over time. Entities that fall under this category should be represented by schemas based on the XDM ExperienceEvent class.|
The sections below provide further guidance for how to sort your entities into the above categories.
A primary way to distinguish between entity categories is whether the data being captured is mutable.
Attributes belonging to profiles or lookup entities are typically mutable. For example, a customer’s preferences might change over time, and the parameters of a subscription plan can be updated depending on market trends.
By contrast, event data is typically immutable. Since events are attached to a specific timestamp, the “system snapshot” that an event provides does not change. For example, an event can capture a customer’s preferences when they check out a cart, and that record does not change even if the customer’s preferences change later on. Event data cannot be changed after it has been recorded.
To summarize, profiles and lookup entities contain mutable attributes and represent the most current information about the subjects they capture, while events are immutable records of the system at a specific time.
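The distinction can be illustrated with a minimal Python sketch. This is not how Platform stores data; the class names, field names, and the timestamp year are hypothetical and chosen only to show a mutable profile next to an immutable event snapshot.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from datetime import datetime, timezone


@dataclass
class CustomerProfile:
    """Profile entity: holds the most current attribute values (mutable)."""
    customer_id: str
    preferences: dict = field(default_factory=dict)


@dataclass(frozen=True)
class CheckoutEvent:
    """Event entity: a timestamped snapshot that cannot change once recorded."""
    customer_id: str
    timestamp: datetime
    preferences_at_checkout: tuple  # immutable copy of the preferences


profile = CustomerProfile("1234567", {"newsletter": True})

# The event copies the preferences as they were at checkout time.
event = CheckoutEvent(
    customer_id=profile.customer_id,
    timestamp=datetime(2023, 10, 1, 10, 32, tzinfo=timezone.utc),  # arbitrary example time
    preferences_at_checkout=tuple(sorted(profile.preferences.items())),
)

# Updating the profile later does not affect the recorded event.
profile.preferences["newsletter"] = False

try:
    event.customer_id = "other"  # a frozen dataclass rejects mutation
    event_was_mutable = True
except FrozenInstanceError:
    event_was_mutable = False
```

After the update, the profile reflects the latest preference while the event still carries the value captured at checkout.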
If an entity contains any attributes related to an individual customer, it is most likely a profile entity. Examples of customer attributes include:
If you want to analyze how certain attributes within an entity change over time, it is most likely an event entity. For example, adding product items to a cart can be tracked as add-to-cart events in Platform:
|Customer ID||Type||Product ID||Quantity||Timestamp|
|1234567||Add||275098||2||Oct 1, 10:32 AM|
|1234567||Remove||275098||1||Oct 1, 10:33 AM|
|1234567||Add||486502||1||Oct 1, 10:41 AM|
|1234567||Add||910482||5||Oct 3, 2:15 PM|
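As a rough illustration of why tracking these changes as events is useful, the sketch below replays the rows from the table above to derive the current cart contents. The dictionary keys and the `cart_contents` helper are hypothetical names used only for this example.

```python
from collections import defaultdict

# The add-to-cart events from the table above, one dictionary per row.
events = [
    {"customer_id": "1234567", "type": "Add", "product_id": "275098", "quantity": 2, "timestamp": "Oct 1, 10:32 AM"},
    {"customer_id": "1234567", "type": "Remove", "product_id": "275098", "quantity": 1, "timestamp": "Oct 1, 10:33 AM"},
    {"customer_id": "1234567", "type": "Add", "product_id": "486502", "quantity": 1, "timestamp": "Oct 1, 10:41 AM"},
    {"customer_id": "1234567", "type": "Add", "product_id": "910482", "quantity": 5, "timestamp": "Oct 3, 2:15 PM"},
]


def cart_contents(events, customer_id):
    """Sum Add and Remove quantities per product for one customer."""
    totals = defaultdict(int)
    for event in events:
        if event["customer_id"] != customer_id:
            continue
        sign = 1 if event["type"] == "Add" else -1
        totals[event["product_id"]] += sign * event["quantity"]
    # Keep only products that are still in the cart.
    return {product: qty for product, qty in totals.items() if qty > 0}


current_cart = cart_contents(events, "1234567")
```

Because every change is kept as its own record, the same events can answer both "what is in the cart now" and "how did the cart change over time".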
When categorizing your entities, it is important to think about the audiences you may want to build to address your particular business use cases.
For example, a company wants to know all of the “Gold” or “Platinum” members of their loyalty program that have made more than five purchases in the last year. Based on this segmentation logic, the following conclusions can be made regarding how relevant entities should be represented:
In addition to considerations regarding segmentation use cases, you should also review the activation use cases for those audiences in order to identify additional relevant attributes.
For example, a company has built an audience based on the rule that country = US. Then, when activating that audience to certain downstream targets, the company wants to filter all exported profiles based on home state. Therefore, a state attribute should also be captured in the applicable profile entity.
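The activation scenario above can be sketched in Python as follows. The profile records and the helper names `build_audience` and `filter_for_export` are hypothetical; the point is only that the export filter cannot work for profiles where the state attribute was never captured.

```python
# Hypothetical profile records; profile "d" was ingested without a state.
profiles = [
    {"id": "a", "country": "US", "state": "CA"},
    {"id": "b", "country": "US", "state": "NY"},
    {"id": "c", "country": "CA", "state": "ON"},
    {"id": "d", "country": "US"},
]


def build_audience(profiles):
    """Segmentation rule: country = US."""
    return [p for p in profiles if p.get("country") == "US"]


def filter_for_export(audience, state):
    """Activation-time filter on home state."""
    return [p for p in audience if p.get("state") == state]


audience = build_audience(profiles)
california_export = filter_for_export(audience, "CA")
```

Profile "d" qualifies for the audience but can never match a state filter, which is why the attribute needs to be part of the profile entity from the start.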
Based on the use case and granularity of your data, you should decide whether certain values need to be pre-aggregated before being included in a profile or event entity.
For example, a company wants to build an audience based on the number of cart purchases. You can choose to incorporate this data at the lowest granularity by including each timestamped purchase event as its own entity. However, this can sometimes greatly increase the number of recorded events. To reduce the number of ingested events, you can choose to create an aggregate value numberOfPurchases over a weeklong or monthlong period. Other aggregate functions like MIN and MAX can also apply to these situations.
Experience Platform does not currently perform automatic value aggregation, although this is planned for future releases. If you choose to use aggregated values, you must perform the calculations externally before sending the data to Platform.
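One way such an external calculation might look is sketched below, assuming purchases are available as (customer ID, date) pairs and that ISO calendar weeks are an acceptable definition of "weeklong". The record layout and every field name other than numberOfPurchases are assumptions made for this example.

```python
from collections import Counter
from datetime import date

# Hypothetical raw purchase events: (customer ID, purchase date).
purchases = [
    ("1234567", date(2023, 10, 2)),
    ("1234567", date(2023, 10, 4)),
    ("1234567", date(2023, 10, 9)),
    ("7654321", date(2023, 10, 3)),
]


def weekly_purchase_counts(purchases):
    """Collapse individual purchases into one record per customer and ISO week."""
    counts = Counter()
    for customer_id, day in purchases:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(customer_id, iso[0], iso[1])] += 1
    return [
        {"customerId": c, "isoYear": y, "isoWeek": w, "numberOfPurchases": n}
        for (c, y, w), n in sorted(counts.items())
    ]


weekly = weekly_purchase_counts(purchases)
```

Here four purchase events collapse into three weekly records. The trade-off is that the individual purchase timestamps are no longer available once only the aggregate is ingested.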
The cardinalities established in your ERD can also provide some clues as to how to categorize your entities. If there is a one-to-many relationship between two entities, the entity that represents the “many” will likely be an event entity. However, there are also cases where the “many” is a set of lookup entities that are provided as an array within a profile entity.
Since there is no universal approach to fit all use cases, it is important to consider the pros and cons of each situation when categorizing entities based on cardinality. See the next section for more information.
The following table outlines some common entity relationships and the categories that can be derived from them:
|Relationship||Cardinality||Entity categories|
|Customers and Cart Checkouts||One to many||A single customer may have many cart checkouts, which are events that can be tracked over time. Customers would therefore be a profile entity, while Cart Checkouts would be an event entity.|
|Customers and Loyalty Accounts||One to one||A single customer can only have one loyalty account, and vice versa. Since the relationship is one-to-one, both Customers and Loyalty Accounts represent profile entities.|
|Customers and Subscriptions||One to many||A single customer may have many subscriptions. Since the company is only concerned with a customer’s current subscriptions, Customers is a profile entity, while Subscriptions is a lookup entity.|
While the previous section provided some general guidelines for deciding how to categorize your entities, it is important to understand that there can often be pros and cons for choosing one entity category over another. The following case study is intended to illustrate how you might consider your options in these situations.
A company tracks active subscriptions for their customers, where one customer can have many subscriptions. The company also wants to include subscriptions for segmentation use cases, such as finding all users with active subscriptions.
In this scenario, the company has two potential options for representing a customer’s subscriptions in their data model:
The first approach would be to include an array of subscriptions as attributes within the profile entity for Customers. Objects in this array would contain fields for
The second approach would be to use event schemas to represent subscriptions. This entails ingesting the same subscription fields as the first approach, with the addition of a subscription ID, a customer ID, and a timestamp of when the subscription event occurred.
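To make the two options more concrete, the following Python sketch models both and answers the same question ("does this customer have an active subscription?") against each. All field names (planName, status, and so on) and sample values are hypothetical and are not taken from any XDM field group.

```python
# Approach 1: subscriptions stored as an array inside the profile entity.
customer_profile = {
    "customerId": "1234567",
    "subscriptions": [
        {"planName": "basic", "status": "active"},
        {"planName": "premium", "status": "cancelled"},
    ],
}


def has_active_subscription(profile):
    """Check the array of subscription objects directly on the profile."""
    return any(s["status"] == "active" for s in profile.get("subscriptions", []))


# Approach 2: each subscription change recorded as a timestamped event.
subscription_events = [
    {"subscriptionId": "s1", "customerId": "1234567", "timestamp": "2023-01-05T00:00:00Z", "planName": "basic", "status": "active"},
    {"subscriptionId": "s2", "customerId": "1234567", "timestamp": "2023-02-01T00:00:00Z", "planName": "premium", "status": "active"},
    {"subscriptionId": "s2", "customerId": "1234567", "timestamp": "2023-03-01T00:00:00Z", "planName": "premium", "status": "cancelled"},
]


def active_subscription_ids(events, customer_id):
    """Replay events in time order and keep the latest status per subscription."""
    latest = {}
    # ISO 8601 strings in the same format sort chronologically.
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["customerId"] == customer_id:
            latest[event["subscriptionId"]] = event["status"]
    return sorted(sid for sid, status in latest.items() if status == "active")


profile_answer = has_active_subscription(customer_profile)
event_answer = active_subscription_ids(subscription_events, "1234567")
```

The profile array gives a direct answer about the current state, while the event representation keeps the history but requires the latest status to be derived.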
Once you have sorted your entities into profile, lookup, and event categories, you can start converting your data model into XDM schemas. For demonstration purposes, the example data model shown earlier has been sorted into appropriate categories in the following diagram:
The category that an entity has been sorted under should determine the XDM class you base its schema on. To reiterate:
While event entities will almost always be represented by separate schemas, entities in the profile or lookup categories may be combined together in a single XDM schema, depending on their cardinality.
For example, since the Customers entity has a one-to-one relationship with the LoyaltyAccounts entity, the schema for the Customers entity could also include a LoyaltyAccount object to contain the appropriate loyalty fields for each customer. If the relationship is one to many, however, the entity that represents the “many” could be represented by a separate schema or an array of profile attributes, depending on its complexity.
The sections below provide general guidance on constructing schemas based on your ERD.
The rules of schema evolution dictate that only non-destructive changes can be made to schemas once they have been implemented. In other words, once you add a field to a schema and data has been ingested against that field, the field can no longer be removed. It is therefore essential to adopt an iterative modeling approach when you are first creating your schemas, starting with a simplified implementation which progressively gains complexity over time.
If you are not sure whether a particular field is necessary to include in a schema, the best practice is to leave it out. If it is later determined that the field is necessary, it can always be added in the next iteration of the schema.
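The iterative, additive approach can be illustrated with a small check that compares two versions of a schema. This is a simplified sketch, not the actual schema evolution rules: it assumes a schema can be reduced to a mapping of field name to type, and it treats any removed or retyped field as destructive. The field names are hypothetical.

```python
def is_non_destructive(old_fields, new_fields):
    """Return True if every existing field is kept with the same type."""
    return all(new_fields.get(name) == ftype for name, ftype in old_fields.items())


# First iteration: start small.
schema_v1 = {"customerId": "string", "email": "string"}

# Next iteration: adding a field is allowed.
schema_v2 = {"customerId": "string", "email": "string", "loyaltyLevel": "string"}

# Removing a field that may already hold ingested data is not.
schema_bad = {"customerId": "string"}

additive_ok = is_non_destructive(schema_v1, schema_v2)
removal_ok = is_non_destructive(schema_v1, schema_bad)
```

Starting with fewer fields keeps every later iteration on the additive side of this check.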
In Experience Platform, XDM fields marked as identities are used to stitch together information about individual customers coming from multiple data sources. Although a schema can have multiple fields marked as identities, a single primary identity must be defined in order for the schema to be enabled for use in Real-Time Customer Profile. See the section on identity fields in the basics of schema composition for more detailed information on the use case of these fields.
When designing your schemas, any primary keys in your relational database tables will be likely candidates for primary identities. Other examples of applicable identity fields are customer email addresses, phone numbers, account IDs, and ECID.
Experience Platform provides several out-of-the-box XDM schema field groups for capturing data related to the following Adobe applications:
For example, the Adobe Analytics ExperienceEvent Template field group allows you to map Analytics-specific fields to your XDM schemas. Depending on the Adobe applications you are working with, you should use these Adobe-provided field groups in your schemas.
Adobe application field groups automatically assign a default primary identity through the use of the identityMap field, which is a system-generated, read-only object that maps standard identity values for an individual customer.
For Adobe Analytics, ECID is the default primary identity. If an ECID value is not provided by a customer, the primary identity will instead default to AAID.
When using Adobe application field groups, no other fields should be marked as the primary identity. If there are additional properties that need to be marked as identities, these fields need to be assigned as secondary identities instead.
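The sketch below shows one way to reason about the "single primary identity" rule. It assumes an identityMap shaped as a mapping from identity namespace to a list of identity items, each with an id and a primary flag; treat that shape, the sample values, and the helper names as illustrative assumptions rather than a definitive description of the field.

```python
# Assumed shape: namespace -> list of identity items.
identity_map = {
    "ECID": [{"id": "example-ecid-value", "primary": True}],
    "Email": [{"id": "user@example.com", "primary": False}],
}


def primary_identities(identity_map):
    """Collect every (namespace, id) pair flagged as primary."""
    return [
        (namespace, item["id"])
        for namespace, items in identity_map.items()
        for item in items
        if item.get("primary")
    ]


def has_single_primary(identity_map):
    """A record should carry exactly one primary identity."""
    return len(primary_identities(identity_map)) == 1


single_primary = has_single_primary(identity_map)
```

In this example the email address acts as a secondary identity, while the ECID remains the only primary one.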
To prevent bad data from being ingested into Platform, it is recommended that you define the criteria for field-level validation when creating your schemas. To set constraints on a particular field, select the field from the Schema Editor to open the Field properties sidebar. See the documentation on type-specific field properties for exact descriptions of the available fields.
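To show the kind of checks that field-level constraints express, here is a minimal Python sketch of validating a record before it is sent to Platform. The constraint keywords (enum, pattern, maxLength, minimum, maximum) follow common JSON Schema naming and are an assumption, as are the field names and allowed values; consult the field properties documentation for the constraints that are actually available.

```python
import re

# Hypothetical constraints, keyed by field name.
constraints = {
    "loyaltyLevel": {"enum": ["Silver", "Gold", "Platinum"]},
    "postalCode": {"pattern": r"\d{5}", "maxLength": 5},
    "quantity": {"minimum": 1, "maximum": 100},
}


def validate(record, constraints):
    """Return the names of fields in the record that violate a constraint."""
    failed = []
    for name, rules in constraints.items():
        if name not in record:
            continue  # this sketch does not enforce required fields
        value = record[name]
        ok = True
        if "enum" in rules and value not in rules["enum"]:
            ok = False
        if "pattern" in rules and not re.fullmatch(rules["pattern"], str(value)):
            ok = False
        if "maxLength" in rules and len(str(value)) > rules["maxLength"]:
            ok = False
        if "minimum" in rules and value < rules["minimum"]:
            ok = False
        if "maximum" in rules and value > rules["maximum"]:
            ok = False
        if not ok:
            failed.append(name)
    return failed


good = validate({"loyaltyLevel": "Gold", "postalCode": "94105", "quantity": 2}, constraints)
bad = validate({"loyaltyLevel": "Bronze", "postalCode": "9410", "quantity": 0}, constraints)
```

Running a similar check in your own pipeline can surface problems before ingestion, alongside the constraints defined on the schema itself.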
The following is a collection of suggestions for data modeling when creating a schema:
The identityMap field often serves as the primary identity. Avoid designating additional fields as primary identities for that schema.
Avoid _id as an identity: Avoid using the _id field in ExperienceEvent schemas as an identity. It is meant for record uniqueness, not for use as an identity.
This document covered the general guidelines and best practices for designing your data model for Experience Platform. To summarize:
Once you are ready, see the tutorial on creating a schema in the UI for step-by-step instructions on how to create a schema, assign the appropriate class for the entity, and add fields to map your data to.