Data replication

Understand which databases Adobe Campaign v8 uses, why data is replicated, which data is replicated, and how the replication process works.

In this tutorial, I will explain the concept of Data Replication in Adobe Campaign v8.
At the end of this video, you will understand which databases Campaign v8 uses, why data is replicated between these databases, which data is replicated, and how it is replicated.
In Adobe Campaign v8, we have both a cloud database, Snowflake, and a local database, PostgreSQL, also referred to as Postgres. Snowflake is really fast and can process a large amount of data much faster than Postgres. However, no query, no matter how big or small, completes in under half a second. That is okay for batch processing, but it is too slow for real-time, unitary calls, like real-time messaging via Message Center or UI calls. You don’t want to wait half a second every time you click somewhere in the application. For this, Postgres is better, as it answers in milliseconds when it is asked to, for example, display a delivery or edit an object.
To strike the perfect balance, Campaign v8 customer data, batch campaign execution, workflow and batch data ingestion, and reporting are handled on Snowflake.
And the Client Console, the user interface, real-time messaging, and the API for unitary lookups and rights must remain on Postgres.
Depending on the use case, some of the built-in schemas need to live on both sides, Postgres and Snowflake. For example, if we create a folder in the Client Console, we need the schema on Postgres, as we need to be able to access it within milliseconds. But we also need it on Snowflake because, for one, you can apply access restrictions based on folders. And the folder data can be included when querying recipient data, for example.
This applies to the delivery schema as well.
The recipient schema, on the other hand, is not replicated and lives only on Snowflake.
To ensure data consistency, Campaign v8 uses an out-of-the-box mechanism of data replication. A technical workflow called Replicate Reference tables runs every hour. Some tables are replicated incrementally if the last modified field exists. Otherwise, the whole table is replicated. It relies on a built-in JavaScript library. But the replication workflow is just the tip of the iceberg. The replication mechanism is a lot more refined than just an hourly replication. There are several replications in place based on the size of the tables: small, medium, and large tables. Some tables will be replicated in real time. Others will be replicated on an hourly basis. Some tables will have incremental updates and others will be fully replaced.
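To make the incremental-versus-full rule a bit more concrete, here is a minimal Python sketch. It is not Campaign's actual JavaScript library. The function name, the `lastModified` key, and the in-memory rows are all assumptions made up for illustration. It only shows the decision described above: if every row carries a last modified value and a previous run time is known, copy just the rows changed since then; otherwise copy the whole table.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


def select_rows_to_replicate(
    rows: List[Row],
    last_run: Optional[datetime],
    modified_field: str = "lastModified",  # hypothetical field name
) -> Tuple[str, List[Row]]:
    """Return the replication mode and the rows that would be copied."""
    # Incremental replication is only possible when the table has the field.
    has_field = bool(rows) and all(modified_field in row for row in rows)
    if not has_field or last_run is None:
        # No last modified information (or no previous run): copy everything.
        return "full", list(rows)
    # Otherwise copy only the rows modified after the previous run.
    changed = [row for row in rows if row[modified_field] > last_run]
    return "incremental", changed


previous_run = datetime(2024, 1, 1, 12, 0)
deliveries = [
    {"id": 1, "lastModified": datetime(2024, 1, 1, 11, 30)},
    {"id": 2, "lastModified": datetime(2024, 1, 1, 12, 30)},
]
mode, to_copy = select_rows_to_replicate(deliveries, previous_run)
print(mode, [row["id"] for row in to_copy])
```

In this sketch, a table whose rows have no last modified value always falls back to a full copy, which mirrors the "otherwise, the whole table is replicated" behavior.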
What is important for you to understand is that this is really out of the box. You do not have to do anything. I still would like to show you where these workflows are situated and which data we’re actually talking about in the system. So, let’s go into Campaign v8.
Let’s take a look at two examples. The first example is the recipients. That’s on Snowflake and it’s not replicated. With recipients, we have the existing recipient schema, which you might already know from v7, on the local Postgres database, under the NMS namespace. And we have an out-of-the-box schema on Snowflake under the XXL namespace.
You can see here that this is a schema extension, which means it won’t create a separate table. It just modifies the existing recipient table. This modification changes the data source. So, it will pull the data from Postgres and move the table to Snowflake.
Let’s see what happens with data that is replicated. Let’s take a look at the operator, which is the list of users with their rights.
We have the operator schema on the local database.
And we also have an XXL schema on Snowflake.
You can see in this case, it is not an extension, but a brand-new schema. So, the data source is on the cloud database, but it creates a separate table on Snowflake.
This one is named XTK Operator on Snowflake. And the other one is the XTK Operator table on the local Postgres database.
Those two tables are linked through the replication process. The replication workflow can be found under the technical workflows, full FDA replications.
Here, you can see that there are multiple workflows that are linked to the data replication, as well as the staging mechanism, which we’ll be covering in a separate video. The Replicate Reference tables workflow can be found here as well. Let’s take a look. The workflow runs every hour, every day.
What it does is take all schemas that have replication enabled, for instance, NMS delivery, and replicate them from Postgres to Snowflake. Let’s create a new operator and call it Test User One.
I’ll just add a password, and I will make it an Administrator.
And now you can see that creating it takes a little bit longer, because the system is actually creating two operators, one on Postgres and one on Snowflake.
Now, let’s take a look at the data.
Now, we have this new user, Test User One, on the local Postgres database.
And if we look at Snowflake, you can see that the user has been created as well.
If this ad hoc replication doesn’t work for any reason, the hourly workflow is the backup.
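The relationship between the ad hoc replication and the hourly backup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ReplicatedTable` class, its `pending` set, and the `cloud_available` flag are hypothetical and do not reflect how Campaign implements this internally. The idea is simply that a record is written locally, copied to the cloud side right away when possible, and otherwise picked up by the next scheduled run.

```python
from typing import Dict, Set


class ReplicatedTable:
    """Toy model of one table that exists on both the local and cloud side."""

    def __init__(self) -> None:
        self.local: Dict[str, dict] = {}   # stands in for Postgres
        self.cloud: Dict[str, dict] = {}   # stands in for Snowflake
        self.pending: Set[str] = set()     # keys not yet copied to the cloud

    def create(self, key: str, record: dict, cloud_available: bool = True) -> None:
        # The local write always happens first.
        self.local[key] = record
        if cloud_available:
            # Ad hoc replication: copy immediately.
            self.cloud[key] = dict(record)
        else:
            # Leave the record for the scheduled catch-up.
            self.pending.add(key)

    def hourly_catch_up(self) -> int:
        # Backup path: copy everything the ad hoc replication missed.
        copied = 0
        for key in sorted(self.pending):
            self.cloud[key] = dict(self.local[key])
            copied += 1
        self.pending.clear()
        return copied


operators = ReplicatedTable()
operators.create("testuser1", {"label": "Test User One"}, cloud_available=False)
print("testuser1" in operators.cloud)   # not replicated yet
operators.hourly_catch_up()
print("testuser1" in operators.cloud)   # replicated by the backup run
```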
Now, you should understand the local and the cloud-based databases Adobe Campaign v8 uses, why the data needs to be replicated between the PostgreSQL and Snowflake databases, which data is replicated, and how it is being replicated. Thank you for watching.