Data ingestion overview

Last update: February 24, 2025

Topics:
Data Ingestion

CREATED FOR:

Beginner
Developer

Adobe Experience Platform’s data ingestion capabilities let you bring your data together into one open and scalable platform for the purpose of managing real-time customer profiles. For more information, please visit the Data Ingestion documentation.

Transcript

Hi there, I’m going to give you a quick overview of how to ingest data into Adobe Experience Platform. Data ingestion is a fundamental step to getting your data into Experience Platform so you can use it to build 360-degree real-time customer profiles and use them to provide meaningful experiences. Adobe Experience Platform allows data to be ingested from various external sources while giving you the ability to structure, label and enhance incoming data using Platform services. You can ingest data from various sources such as Adobe applications, enterprise sources, databases, stream data using a web or mobile SDK and many others. Platform is API friendly and lets you ingest data using batch and streaming APIs. Experience Platform provides tools to ensure that the ingested data is XDM compliant and helps prepare that data for real-time customer profiles and other services.
You can ingest data into Platform using various sources. You can configure a streaming source connector in Platform that provides an HTTP API endpoint, and then you can either do a batch ingestion or stream data into Platform using the endpoint. You can drag and drop files into the UI and ingest them in batch mode. You can also configure a source connector in the UI that will ingest data from the origin system using the most appropriate mode for that system. Source connectors ingest data using either batch ingestion or streaming ingestion.
Platform provides you with state-of-the-art streaming infrastructure to collect, enrich and activate data in real time. Streaming ingestion APIs make it easy for customers to ingest data from real-time messaging systems, other first-party systems and partners. When data is streamed to Platform, it is verified to ensure that it’s coming from trusted sources and that it’s in the XDM format. The data is then placed on the Experience Platform pipeline for consumption by other services as fast as possible.
Different services within Platform then consume the data from the pipeline. In the next step, the data is stored in the data lake as a dataset, comprised of batches and files that can be accessed by various Platform components. All datasets contain a reference to the XDM schema that constrains the format and structure of the data that they can store. Attempting to upload data to a dataset that does not conform to the dataset’s XDM schema will cause the ingestion to fail. Any data that is configured to be processed into the profile gets flagged for immediate processing into the identity graph and profile store.
With real-time customer profile, you can see a holistic view of each individual customer by combining data from multiple channels, including online, offline, CRM and third-party. Profile allows you to consolidate your customer data into a unified view, offering an actionable, timestamped account of every customer interaction. With Adobe Experience Platform Query Service, you can bring all your stored customer datasets, including behavioral, CRM, point-of-sale data and more, into one place and run fast petabyte-scale SQL queries to discover the story behind customer behavior and generate impactful insights using a BI tool of your choice.
Although your real-time customer profile requires real-time data ingestion and activation, there are still many use cases where batch ingestion is needed. Many first-party and third-party systems do not support streaming ingestion yet. Plus, you might want to completely refresh the data in Platform with an updated version from your own data lake, such as a monthly refresh of your product catalog. In addition, if you want to upload large volumes of data, batch ingestion is still the optimal method to load terabytes of data into Platform. To support these use cases, Platform provides batch data ingestion pipelines that allow you to ingest data from any system.
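The schema check mentioned earlier, where data that does not conform to a dataset’s schema causes the ingestion to fail, can be pictured with a small Python sketch. This is only an illustration under stated assumptions: TOY_SCHEMA, schema_errors and ingest_batch are hypothetical names, the "schema" is a plain mapping of field names to Python types rather than real XDM, and nothing here calls an actual Platform API.

```python
from typing import Any

# Hypothetical, heavily simplified stand-in for an XDM schema:
# field name -> expected Python type. Real XDM schemas are far richer.
TOY_SCHEMA: dict[str, type] = {
    "email": str,
    "firstName": str,
    "loyaltyPoints": int,
}


def schema_errors(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems found in one record (empty if it conforms)."""
    errors = []
    for field, value in record.items():
        if field not in schema:
            errors.append(f"unexpected field: {field}")
        elif not isinstance(value, schema[field]):
            expected = schema[field].__name__
            actual = type(value).__name__
            errors.append(f"{field}: expected {expected}, got {actual}")
    return errors


def ingest_batch(records: list[dict[str, Any]], schema: dict[str, type]) -> int:
    """Accept the batch only if every record conforms; otherwise fail it."""
    for index, record in enumerate(records):
        errors = schema_errors(record, schema)
        if errors:
            raise ValueError(f"record {index} rejected: {errors}")
    return len(records)
```

In this sketch a single non-conforming record fails the whole batch, which mirrors the strict behavior described above; a real pipeline would report far more detail.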
The batch pipeline validates, transforms and partitions data before it’s stored in the data lake. This ensures that the data is stored in the most optimized format to support easy access at petabyte scale. Let’s take a quick look at the source connector example to get a better understanding. When you log in to Platform, you will see Sources in the left navigation. Clicking Sources will take you to the source catalog screen, where you can see all of the source connectors currently available in Platform. For our video, let’s use the Amazon S3 cloud storage to perform a batch ingestion. Click on the add data option and choose an existing Amazon account and then move to the next step. In this step, we choose the source file for data ingestion and verify the file data format. Note that the ingested file data can be formatted as XDM JSON, XDM Parquet or delimited. Currently for delimited files, you have an option to preview sample data of the source file. You can also choose a custom delimiter for your source data. For streaming and batch ingestion, Adobe Experience Platform currently supports the following file formats.
For data ingestion, another requirement is to have a dataset to store the incoming data. A dataset is a storage and management construct for a collection of data, typically a table, that contains columns derived from a schema, and the ingested data gets stored as rows. All datasets are based on existing XDM schemas, which provide constraints for what the ingested data should contain and how it should be structured. Experience Platform uses schemas to describe the structure of data in a consistent and reusable way. Before ingesting data into Platform, a schema must be composed to describe the data structure and provide constraints on the type of data that can be contained within each field, so data can be validated as it moves between systems. A schema consists of a base class and zero or more mix-ins. First you assign a class that defines what a schema is.
For example, an individual profile or an experience event. Next you can add mix-ins, which are reusable components defining fields like personal details, preferences or addresses. Adobe Experience Platform provides standard classes and mix-ins related to these classes. If there is a need, you can also define a custom class or a custom mix-in for your use case.
Data Prep allows data engineers to map, transform and validate source data to and from Experience Data Model. Data Prep appears as a mapping step in the data ingestion process. Data engineers can use Data Prep to perform data manipulation during ingestion. You can define simple pass-through mappings to assign source input attributes to XDM target attributes, or create calculated fields to perform in-row calculations that can be assigned to XDM attributes. In this example, you can combine the first name and the last name source fields to populate the full name field in the target schema using a concatenation operation. Similarly, you can also transform a particular field by applying string, numeric or date manipulation functions provided by Platform.
Let’s select a frequency for this batch ingestion and move to the next step. With the help of error diagnostics, Platform allows users to generate error reports for newly ingested batches. Error diagnostics for failed records can be downloaded using the API. Partial ingestion enables the ingestion of valid records of new batch data, within a specified error threshold. The error threshold enables the configuration of the percentage of acceptable errors before the entire batch fails. Let’s review the changes and save your configuration. At this step, we have successfully configured a data ingestion flow from a source location to Platform. Adobe Experience Platform allows data to be ingested from various external sources while giving you the ability to structure, label and enhance incoming data using Platform services.
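To make the mapping step and the partial-ingestion threshold from the transcript more concrete, here is a rough Python sketch. It is an illustration only and does not use Data Prep or any Platform API: MAPPING, map_row and ingest_with_threshold are hypothetical names, the source column names (email, first_name, last_name) and target paths are invented for the example, and the threshold is assumed to be a percentage of failed rows.

```python
from typing import Any, Callable

Record = dict[str, Any]

# Hypothetical mapping set: target attribute -> function of the source row.
MAPPING: dict[str, Callable[[Record], Any]] = {
    # Pass-through mapping: copy a source attribute unchanged.
    "person.email": lambda row: row["email"],
    # Calculated field: concatenate first and last name into a full name.
    "person.fullName": lambda row: f'{row["first_name"]} {row["last_name"]}',
}


def map_row(row: Record, mapping: dict[str, Callable[[Record], Any]]) -> Record:
    """Apply every mapping to one source row and return the target record."""
    return {target: fn(row) for target, fn in mapping.items()}


def ingest_with_threshold(
    rows: list[Record],
    mapping: dict[str, Callable[[Record], Any]],
    error_threshold_pct: float,
) -> tuple[list[Record], list[tuple[int, str]]]:
    """Map all rows; keep the valid ones unless too many rows fail.

    Returns (mapped_rows, failed_rows). Raises ValueError when the share of
    failed rows exceeds error_threshold_pct, i.e. the whole batch fails.
    """
    mapped: list[Record] = []
    failed: list[tuple[int, str]] = []
    for index, row in enumerate(rows):
        try:
            mapped.append(map_row(row, mapping))
        except KeyError as exc:
            failed.append((index, f"missing source field {exc}"))
    error_pct = 100.0 * len(failed) / len(rows) if rows else 0.0
    if error_pct > error_threshold_pct:
        raise ValueError(f"batch failed: {error_pct:.1f}% of rows had errors")
    return mapped, failed


rows = [
    {"email": "ada@example.com", "first_name": "Ada", "last_name": "Lovelace"},
    {"email": "bob@example.com", "first_name": "Bob"},  # last_name is missing
]
good, bad = ingest_with_threshold(rows, MAPPING, error_threshold_pct=50)
```

With a 50 percent threshold the valid row is kept and the failed row is reported; lowering the threshold to, say, 10 percent would make the same call raise and reject the whole batch.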

Methods of data ingestion

Sources overview

Learn how to easily ingest data from Adobe, first-party, and third-party applications into Platform's Real-Time Customer Profile and data lake.

Learn more

Adobe Experience Platform Web SDK and Edge Network overview

Learn how Adobe Experience Platform Web SDK and Edge Network allow customers to use one JavaScript library and one beacon to send data to Adobe applications and third-party destinations.

Learn more

Speeds of data ingestion

Batch Data Ingestion Overview

This video gives an overview of batch ingestion in Adobe Experience Platform and shows how to ingest batch data using the API.

Learn more

Streaming data ingestion overview

Using Experience Platform's streaming ingestion you can be sure that any data you send will be available in the Real-Time Customer Profile. This data can be captured from CRM and ERP systems or from any other source which is able to communicate over HTTP or public cloud streaming infrastructure.

Learn more

Adobe Experience Platform Web SDK and Edge Network overview

Learn how Adobe Experience Platform Web SDK and Edge Network allow customers to use one JavaScript library and one beacon to send data to Adobe applications and third-party destinations.

Learn more

Ingesting data from common third party sources

Ingest Data using CRM Source Connectors

Learn how to easily batch ingest data from CRM sources into Adobe Experience Platform's Real-Time Customer Profile and data lake.

Learn more

Ingest Data using Cloud Storage Source Connectors

This video shows how to easily batch ingest data from cloud storage services into Adobe Experience Platform's Real-Time Customer Profile and data lake, in a seamless and scalable manner.

Learn more

Stream data using Source Connectors

Learn how to stream data in real time from a cloud storage source to Platform and use the data in real time for customer engagement.

Learn more

Ingest Data using Streaming Connection HTTP API endpoint

This video shows how to stream data to Adobe Experience Platform in real time using the HTTP API endpoint.

Learn more

Ingesting data from common Adobe sources

Ingest data using the Adobe Analytics source connector

The Adobe Analytics source connector allows you to easily stream, map, and filter data from Adobe Analytics into Adobe Experience Platform's Real-Time Customer Profile and data lake.

Learn more

Ingest data from Marketo Engage

Learn how to ingest data from Marketo Engage with the source connector, using the standard and template workflows.

Learn more

Ingest data using the Adobe Audience Manager data connector

Learn how to use the Audience Manager data connector to bring traits and segments from Audience Manager into Platform and combine them with other rich data.

Learn more
