Plan your data model

This video reviews what to do before you start building your schemas in Adobe Experience Platform. Document your business use cases, understand your Platform license, know the product guardrails, and identify what data to ingest before finalizing your data model. For more information, please visit the schemas documentation.

In this video, I want to talk about things you should do before you start building your schemas. Schemas are a critical foundation of a Platform implementation, so you’ll want to build them with consideration. I’ll cover documenting your business use cases, knowing what you’ve licensed, knowing the constraints of your license and the product, identifying what data to ingest, and creating a Platform-centric entity relationship diagram, or ERD. That last topic is pretty large, and I’m going to save it for another video.

Before you start building your data models, it’s imperative that you work with your business stakeholders to understand and document their key business use cases. I can’t emphasize this enough: as with all Adobe digital experience products, your business use cases should be the driver of the technical implementation. You can document these in whatever format or system works for your company, but do document them. Some of them can definitely impact the design decisions you make as the data architect. So document them, and then review them again with your stakeholders so there aren’t any surprises later on when the implementation is up and running. The good news is, these use cases should have already been fleshed out during the sales process, and they informed which parts of Platform were purchased, including any app services and other applications.

One resource we use in both the sales and delivery process is digital experience blueprints. Blueprints are repeatable implementations that can help you solve established business problems. They are available on Experience League, and each one explains the use cases it addresses, lists the products involved, contains architecture diagrams, and links to relevant enablement content. So find out which blueprints your stakeholders are expecting to use.

Your company’s segmentation goals are one of the most important topics to discuss with your stakeholders before defining your data models.
Let me illustrate why this is important to do upfront. As you probably know, there are two types of data in Platform: record and time series. Record data is for the current attributes of a customer; time series is for the customer’s actions. Loyalty system data is typically modeled as record data. A customer might have 30,000 loyalty points, and if you built a segment for customers with 30,000 points, they would be in it. But what if your marketers wanted to be able to personalize content based on loyalty point transactions, say customers who earned 10,000 loyalty points in the last month or spent 100,000 points in the last year? You might be able to patch your record schema with some additional fields in a pinch, but that’s starting to sound like the data should have been modeled as time series data. The modeling decisions you make impact every downstream user of Platform, so it’s important to know what those users expect to be able to do.
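To make the distinction concrete, here is a minimal sketch in plain Python. The field names and figures are illustrative only, not real XDM schemas; the point is that a record snapshot holds only the current total, while time-series events keep the timestamped transactions a "points earned in the last month" segment would need.

```python
from datetime import datetime

# Record data: one row per customer, current attributes only.
# (Illustrative field names, not an actual XDM schema.)
record = {"customerId": "C123", "loyaltyPoints": 30000}

# Time-series data: one event per loyalty transaction, each with a timestamp.
events = [
    {"customerId": "C123", "pointsDelta": 12000, "timestamp": datetime(2024, 5, 10)},
    {"customerId": "C123", "pointsDelta": -2000, "timestamp": datetime(2024, 5, 20)},
    {"customerId": "C123", "pointsDelta": 5000,  "timestamp": datetime(2024, 3, 1)},
]

def points_earned_since(events, customer_id, since):
    """Sum positive point transactions for a customer on or after a cutoff date."""
    return sum(
        e["pointsDelta"]
        for e in events
        if e["customerId"] == customer_id
        and e["pointsDelta"] > 0
        and e["timestamp"] >= since
    )

# A segment like "earned 10,000+ points since May 1" needs the events;
# the record snapshot alone cannot answer it.
print(points_earned_since(events, "C123", datetime(2024, 5, 1)))  # 12000
```

If the source system only ever sends you the current balance, no amount of patching the record schema recovers the transaction history, which is why this decision is hard to reverse later.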
Be sure to know which Platform package or packages you purchased, which application or intelligence services were purchased, and which other Adobe applications or third-party applications you intend to use. For example, while many Platform packages and use cases include real-time customer profile, some don’t, and there are modeling considerations for real-time customer profile that may or may not apply to you. Specific field groups are required for intelligence services like Customer AI and Attribution AI, which is also critical to understand at the outset. Adobe applications and services that send data to Platform typically have their own specific schemas that are created in your account when they’re provisioned. As the data architect, you should be aware of those.
In addition to which Platform features you have access to, your contract might call out some key parameters that could impact what data you ingest into Platform and how you use it. For example, say your contract limits you to 10 million profiles. You wouldn’t want to ingest data from a system that would automatically create 20 million profiles. It becomes even more important to understand these limitations as you move beyond your primary use cases and data sources and consider expanding your use of Experience Platform.
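The arithmetic is simple, but it’s worth doing explicitly for every candidate source before you ingest it. A back-of-the-envelope check might look like this; the limit and per-source counts are made-up numbers, not real license terms:

```python
PROFILE_LIMIT = 10_000_000  # hypothetical licensed profile count

# Estimated distinct profiles each candidate source would create
# (illustrative numbers only).
candidate_sources = {
    "crm": 4_000_000,
    "loyalty_system": 3_000_000,
    "prospect_list": 20_000_000,  # would exceed the limit on its own
}

def fits_within_license(sources, limit):
    """Check the worst-case total profile count against the licensed limit.

    Worst case assumes no identity overlap between sources; in practice,
    identity stitching across shared identifiers lowers the real count.
    """
    return sum(sources.values()) <= limit

print(fits_within_license(candidate_sources, PROFILE_LIMIT))  # False
```

Running this kind of estimate per source, rather than after the fact, is what lets you decide to keep a large dataset in the lake instead of enabling it for profile.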
Also, Platform has some guardrails. These change, so I don’t want to go into a lot of detail, but there is documentation of these at the moment covering things like how many datasets you can use with real-time customer profile, how many relationships between schemas we recommend, and the number of segments in a sandbox. Some of these are hard limits, while others are recommendations to keep the system performing quickly. Review this documentation before and while you’re designing your data model.

The last important thing I want to cover in this video, since we’ll save the entity relationship diagram for a separate one, is to understand your data sources. What data should be brought into Platform, and why? Are you ingesting data into the lake to run machine learning models and use Customer Journey Analytics? Do you need data in real-time customer profile for marketing activation? Maybe you’re doing both. You should be more selective about which data you ingest into profile because of the license and performance considerations I mentioned earlier. Focus first on the primary data sources needed to address your key use cases, but also start thinking about how you would model data from secondary sources so your model is ready to scale in the future.

So those are some things to think about before you start building your schemas. Good luck!