Relational Store basics in Journey Optimizer
Learn the foundational concepts of the relational store used in Journey Optimizer’s campaign orchestration—covering schema design, data ingestion, supported sources, and key differences from the real-time profile store.
Welcome! In this video, I will talk about the relational store basics. This is a foundational pillar of Journey Optimizer’s campaign orchestration functionality. Let’s dive right in. The relational store is a relational database, built on primary and foreign key relationships. Because it’s a relational database, it can support any entity type: essentially, any table of data you can imagine, such as customer reservations, properties, or inventory. It doesn’t have to be person-based.
This database type allows you to describe almost anything. However, the relational store doesn’t support complex objects like arrays or maps; only scalar fields such as text, numbers, and booleans are supported. From a data engineering perspective, for data ingestion into the store via API, only selected sources from the current sources catalog are available. Only batch data connectors are available, as no real-time data is stored in the relational store. The data in the store is mutable, meaning records can be updated and deleted in place. For example, if you model a reservation, you don’t need every permutation of the reservation; you just keep its latest state. However, when updating data, you must do a full record reinstatement: partial record updates are not possible.

Now let’s talk about modeling data. There is always a manual approach, where you come in and build your schema yourself. If you are familiar with data modeling in the XDM UI, it looks the same, except that the structure is flat rather than hierarchical. You don’t have access to field groups and data types, so you design each field yourself, but since it’s just a flat schema, it is not complex.
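To make the flat, scalar-only constraint concrete, here is a minimal sketch in Python. The field names and the validation helper are hypothetical illustrations, not a Journey Optimizer API; the point is that every field must be a scalar, with no nested arrays or maps.

```python
# Illustrative flat schema for the manual modeling approach: scalar fields
# only, one primary key, one last-updated timestamp. Names are hypothetical.
reservation_schema = {
    "reservation_id": "string",    # primary key
    "guest_email":    "string",
    "nights":         "integer",
    "is_cancelled":   "boolean",
    "updated_at":     "datetime",  # version descriptor
}

SCALAR_TYPES = {"string", "integer", "number", "boolean", "datetime"}

def is_valid(schema: dict) -> bool:
    """A flat relational schema may only use scalar field types."""
    return all(t in SCALAR_TYPES for t in schema.values())

print(is_valid(reservation_schema))            # True
print(is_valid({"tags": "array"}))             # False: complex types rejected
```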
The other option is to upload a DDL file, which allows bulk creation. A DDL file looks like SQL. Using this method, you could create 15 tables at once and then quickly define the relationships between them. So these are your two options when thinking about how to model your data. Bulk creation is a great way to jumpstart your implementation, especially if your data is already coming from a relational system.

So how does data get in? What you see on the left are the sources supported today; they all use batch ingestion, and additional sources will be supported in the future. You can ingest data from Snowflake, Amazon S3, Google BigQuery, Azure Databricks, the Data Landing Zone, SFTP, or as a file upload. Before you can ingest data for the first time, you need to set up your account and then build the source data flow, which goes through data prep before landing in the data lake. The data lake is where all data goes first when it is ingested into Experience Platform. From there, the data is pushed down into the relational store by a scheduled job that runs behind the scenes every 15 minutes, processing data from the data sets. If there is a large load of data, greater than a gigabyte, the process kicks off immediately rather than waiting for the 15-minute interval.

Within the data lake, you can create your own data sets and schemas. You also have the out-of-the-box AJO system data sets: the delivery and tracking logs that record deliveries, bounces, clicks, opens, and so on. All this data is pushed down into the relational store and is then available for segmentation and orchestration with orchestrated campaigns.

Now let’s look at the guardrails you should be aware of. First, you can’t just take data from the data lake and drop it into the relational store if it uses standard XDM formats like experience events or individual profiles.
That’s because this kind of data is organized hierarchically, not relationally like a table. So before you move it, you need to clean it up and reshape it a bit. Just remember: XDM data is made for the profile data store, not the relational store. You’ll learn more about how to handle that in the next slide.

Secondly, whenever you create a schema for the relational store, you always need two things: a primary key and a version descriptor. The primary key is the unique ID for each record; it lets the system tell records apart. The version descriptor is a date-time field that records when the record was last updated. This is important because it lets the system figure out what’s new and what hasn’t changed when you update data.

Also, the relational store can support up to 200 schemas. There is one data set per relational schema, a strict one-to-one mapping; you can never share a schema across multiple data sets. This is specific to the relational store, as opposed to working with profile data. And you can only use one source connector and data flow per data set; you cannot map multiple sources to the same data set. Again, this differs from how the profile store works.
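The primary key and version descriptor work together whenever records are updated. Here is a minimal sketch, with hypothetical field names, of how a last-updated timestamp lets a store decide which incoming version of a record wins:

```python
from datetime import datetime

# Sketch only: "reservation_id" acts as the primary key, "updated_at" as the
# version descriptor. The newest timestamp for a given key wins.
incoming = [
    {"reservation_id": "R1", "status": "booked",     "updated_at": datetime(2024, 5, 1)},
    {"reservation_id": "R1", "status": "checked_in", "updated_at": datetime(2024, 5, 3)},
    {"reservation_id": "R2", "status": "booked",     "updated_at": datetime(2024, 5, 2)},
]

store = {}
for rec in incoming:
    key = rec["reservation_id"]
    # Keep the incoming record only if it is newer than what we already have.
    if key not in store or rec["updated_at"] > store[key]["updated_at"]:
        store[key] = rec

print(store["R1"]["status"])  # checked_in
```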
Lastly, every file or batch that is ingested must have a column called _change_request_type. This column tells the system whether to treat each record as an update (upsert) or a delete within the store.

Since Journey Optimizer uses two different data stores, let’s look at when you might want to put data into the relational store versus the profile store. We’re going to look at it purely from a data perspective; the business case you are implementing might prescribe which store you need to use. Consider the data you want to ingest into Journey Optimizer. Is the source of the data relational already? Then it’s probably a good candidate for the relational store. Do you want the data as-is, meaning you don’t want to do a lot of data engineering work? Are you bringing in more than 12 months of event-based data because you might need it later for flexibility? In those cases, you’re probably leaning toward the relational store.
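The change-request column described above can be sketched in a few lines of Python. I am assuming the column is named _change_request_type with values "u" (upsert) and "d" (delete); treat the names, values, and helper logic as illustrative rather than the actual ingestion mechanics.

```python
import csv
import io

# A hypothetical batch file carrying the required change-type column.
batch = """reservation_id,status,updated_at,_change_request_type
R1,checked_in,2024-05-03T10:00:00Z,u
R2,cancelled,2024-05-02T09:00:00Z,d
"""

# Existing state of the (illustrative) store before the batch arrives.
store = {"R2": {"reservation_id": "R2", "status": "booked"}}

for row in csv.DictReader(io.StringIO(batch)):
    change = row.pop("_change_request_type")
    if change == "d":
        store.pop(row["reservation_id"], None)  # delete the record
    else:
        store[row["reservation_id"]] = row      # full-record upsert

print(sorted(store))  # ['R1']  (R1 upserted, R2 deleted)
```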
If you need the capacity for ad hoc audience creation, evaluation, and activation, you’ll probably want the data in the relational store. When would data go into the real-time customer profile? Do you have streaming data? Are you trying to use that streaming data to drive in-the-moment experiences? Is data freshness a requirement, meaning you want the most recent, up-to-date data? In that case, you probably want the real-time XDM profile, because it is the faster store and supports streaming.
Consider if you need to make many in-the-moment decisions. For example, your customer is on the website and you need to figure something out and react.
Ask yourself whether you are okay with your behavioral data being limited to potentially 90 days or less, and whether you are willing to work with pre-computed aggregates, either custom-built or out-of-the-box. Is there a need for personalizing channels in real time? For example, your customers arrive on the website from an in-app experience and you are trying to make a decision about them.
Now you should have a good understanding of the relational store, how the data is ingested, and what type of data is stored in that data store. Thank you for watching.