Create datasets and ingest data

Learn how to create a dataset, map it to a schema, add data to it, and confirm that the data has been ingested.

In this video, we will show you how to create a dataset, map it to a schema, ingest data into that dataset, and confirm that the data has been ingested. We’ll continue with our example from previous videos and ingest loyalty system data for our fictional brand Luma. In a previous video, we created a schema to represent our loyalty data. To create a dataset based on that schema, we’ll select datasets from the left navigation in Journey Optimizer. Here, you’ll find a list of existing datasets, and this is where you can view and manage all of the datasets for your organization. To create a new dataset, we’ll click on create dataset in the upper left. You’ll see two options here: create dataset from schema or create dataset from a CSV file. We have our Luma loyalty schema that we created in a previous video, and we’ll select the first option to create a dataset based on that schema. So next, we’ll search for the schema that we created previously, Luma loyalty, select that schema and click next. We’ll give a name to our dataset. We’ll call it Luma Loyalty Data and then click finish.
And with that, we’ve created an empty dataset to hold our loyalty system data. There are a number of different options for ingesting data into that dataset. These include file upload, batch and streaming data ingestion APIs, and built-in connectors for a wide range of popular data sources. You’ll ultimately want to use the source connectors or data ingestion APIs to set up a continuous flow of data from your various sources into Journey Optimizer. But when starting out with a new schema and dataset, you’ll often want to upload a sample data file to make sure sample data from your source maps properly into the schema and dataset you’ve created. So, we’ll explore some of the other data ingestion options further in other videos, but for now we’ll do a simple file upload to ingest a CSV file of sample loyalty system data into our dataset. One last thing before we ingest the data: you’ll notice on the right-hand side, there’s a toggle button to enable the dataset for profile.
That means the data in this dataset will be stitched together with any other profile-enabled datasets as part of the unified profile. Journey Optimizer works off this unified profile data, so if you want the data from this dataset to be exposed in Journey Optimizer for targeting and personalization, you need to make sure that you’ve enabled this dataset and the underlying schema for profile. We cover this in more detail in a separate video on managing identities. Further down, you’ll also see a place to add data to the dataset. Now, if you have a JSON or CSV formatted file that perfectly matches the schema structure and field names, then you can drop it here and ingest the file directly into the dataset this way. But in most cases, you’ll need to do some mapping to make sure the columns in your data file go into the corresponding fields in your dataset schema. So, to do that, I’ll go to sources, and under file ingestion, I’ll click on map file to schema.
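Before choosing between a direct drop and the mapping workflow, it can help to check locally whether the file's column names already match the schema's field names. The following is a minimal sketch of that check, not part of Journey Optimizer. The field names in `SCHEMA_FIELDS`, the sample CSV, and the helper `header_mismatch` are all hypothetical and stand in for whatever your own schema and file contain.

```python
import csv
import io

# Hypothetical field names for illustration only; not the real Luma loyalty schema.
SCHEMA_FIELDS = {"loyaltyId", "email", "points", "tier"}


def header_mismatch(csv_text, schema_fields):
    """Return (missing, unexpected) column names relative to the schema fields."""
    reader = csv.reader(io.StringIO(csv_text))
    header = set(next(reader))  # first row holds the column names
    missing = sorted(schema_fields - header)      # schema fields with no column
    unexpected = sorted(header - schema_fields)   # columns with no schema field
    return missing, unexpected


# Made-up sample file: the "level" column does not match the "tier" field.
sample = "loyaltyId,email,points,level\n1001,a@example.com,250,gold\n"
missing, unexpected = header_mismatch(sample, SCHEMA_FIELDS)
print("Missing from file:", missing)
print("Not in schema:", unexpected)
```

If both lists come back empty, the file is a candidate for a direct drop. Anything else suggests going through the mapping step.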
Here, I’ll select the Luma Loyalty Data dataset that we just created, which is where we want the data to land. Then I’ll click next and I can drag and drop our CSV file containing the sample data from my loyalty system.
Next is where we map columns from the CSV file to target fields in our Luma loyalty schema that we created previously.
On the left, you’ll see the columns from the CSV file, and on the right, our intelligent mapper service has already suggested target fields from our Luma loyalty schema based on the CSV column names. So, you’ll want to go through these and select the ones that are correct and correct any that need to be adjusted.
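To give a feel for what name-based mapping suggestions look like, here is a rough sketch using fuzzy string matching from Python's standard library. It is an illustration only and makes no claim about how the intelligent mapper service actually works. The column names, field names, and the helper `suggest_mappings` are all hypothetical.

```python
import difflib


def suggest_mappings(csv_columns, schema_fields, cutoff=0.6):
    """Suggest a schema field for each CSV column by name similarity.

    Columns with no sufficiently similar field map to None and need manual review.
    """
    # Compare in lowercase, but return the field's original spelling.
    normalized = {field.lower(): field for field in schema_fields}
    suggestions = {}
    for column in csv_columns:
        key = column.replace("_", "").replace(" ", "").lower()
        match = difflib.get_close_matches(key, list(normalized), n=1, cutoff=cutoff)
        suggestions[column] = normalized[match[0]] if match else None
    return suggestions


# Hypothetical column and field names, for illustration only.
columns = ["loyalty_id", "Email", "points", "favorite color"]
fields = ["loyaltyId", "email", "points", "tier"]
for column, field in suggest_mappings(columns, fields).items():
    print(column, "->", field)
```

As in the mapping screen, suggestions like these are only a starting point. Each one still needs to be reviewed, and unmatched columns need to be mapped by hand.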
In our case, it looks like it mapped everything correctly, so we can go ahead and select them all. And once I verify all of the target schema fields, I’ll click finish and the data from my CSV file will be ingested into the dataset. You’ll be taken to a dataset overview page that shows a new batch that’s processing.
Once the batch is complete, you can click on it to see details of the batch. If there were any failures or warnings in the ingestion process, you’ll see those here.
You can also preview the dataset to make sure that everything looks good.
You can see here all the names and the data that we expected from our dataset.
And now, you know how to create a dataset, add data into it, and confirm that the data has been ingested.