Ingest data

IMPORTANT
Adobe Experience Platform enforces a strict one-to-one relationship between dataflows and datasets, which keeps the source and the dataset synchronized for accurate incremental ingestion. To change the data source for a dataset, you must therefore delete the existing dataflow before creating a new one that references the same dataset and the new source.

Adobe Experience Platform allows data to be ingested from external sources while providing you with the ability to structure, label, and enhance incoming data using Experience Platform services. You can ingest data from a variety of sources, such as Adobe applications, cloud-based storage, databases, and more.

A dataset is a storage and management construct for a collection of data, typically a table, that contains a schema (columns) and fields (rows). Data that is successfully ingested into Experience Platform is stored within the data lake as datasets.

Supported sources for Orchestrated campaigns

The following sources are supported for use with Orchestrated campaigns:

  • Cloud Storage: Amazon S3, Google Cloud Storage, SFTP
  • Cloud Data Warehouses: Snowflake, Google BigQuery, Data Landing Zone, Azure Databricks
  • File-Based Uploads: Local File Upload

Guidelines for relational schema data hygiene

For datasets enabled with Change data capture, all data changes, including deletions, are automatically mirrored from the source system into Adobe Experience Platform.

Since Adobe Journey Optimizer Campaigns require all onboarded datasets to be enabled with Change data capture, it is the customer’s responsibility to manage deletions at the source. Any record deleted from the source system will automatically be removed from the corresponding dataset in Adobe Experience Platform.

To delete records via file-based ingestion, the data file should flag each record to delete with a D value in the _change_request_type field. This indicates that the record should be deleted in Adobe Experience Platform, mirroring the source system.
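
For illustration, a minimal file sketch could look like the following. The email and loyalty_tier columns are hypothetical placeholders for your own schema fields; only the _change_request_type values U (upsert) and D (delete) carry meaning for Experience Platform.

    email,loyalty_tier,_change_request_type
    alex@example.com,gold,U
    sam@example.com,silver,D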

To delete records only from Adobe Experience Platform without affecting the original source data, the following options are available:

  • Proxy or Sanitized Table for Change data capture Replication

    Customers can create a proxy or sanitized source table to control which records are replicated into Adobe Experience Platform. Deletions can then be managed selectively from this intermediary table (see the SQL sketch after this list).

  • Deletion via Data Distiller

    If licensed, Data Distiller can be used to support deletion operations directly within Adobe Experience Platform, independent of the source system.

    Learn more about Data Distiller
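
As an illustration of the proxy-table option above, here is a minimal SQL sketch. The table and column names (crm_profiles, crm_profiles_proxy, customer_id) are hypothetical, and the exact syntax depends on your source system (for example, Snowflake or Google BigQuery). Point the Experience Platform dataflow at the proxy table rather than at the original source table.

    -- Hypothetical proxy table that the Experience Platform dataflow reads from;
    -- it starts as a copy of the real source table.
    CREATE TABLE crm_profiles_proxy AS
    SELECT customer_id, email, loyalty_tier, updated_at
    FROM crm_profiles;

    -- Deleting from the proxy removes the record from the Experience Platform
    -- dataset on the next Change data capture sync; crm_profiles itself is untouched.
    DELETE FROM crm_profiles_proxy
    WHERE customer_id = '12345';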

Configure a dataflow

This example demonstrates how to configure a dataflow that ingests structured data into Adobe Experience Platform. The configured dataflow supports automated, scheduled ingestion and enables real-time updates.

  1. From the Connections menu, access the Sources menu.

  2. Choose your source from the supported sources for Orchestrated campaigns listed above.

  3. If you chose a cloud-based source, connect the corresponding account (for example, Amazon S3 or Google Cloud Storage).

  4. Choose the data to ingest into Adobe Experience Platform.

  5. From the Dataset details page, check Enable Change data capture to display only datasets that are mapped to relational schemas and include both a primary key and a version descriptor.

    Learn more about the guidelines for relational schema data hygiene

    IMPORTANT
    For file-based sources only, each row in the data file must include a _change_request_type column with values U (upsert) or D (delete). Without this column, the system will not recognize the data as supporting change tracking, and the Orchestrated Campaign toggle will not appear, preventing the dataset from being selected for targeting.

  6. Select your previously created Dataset and click Next.

  7. If you are using a file-based source, upload your local files from the Select data window and preview their structure and contents.

    Note that the maximum supported file size is 100 MB.

  8. In the Mapping window, verify that each source file attribute is correctly mapped to the corresponding field in the target schema. Learn more about targeting dimensions.

    Click Next once done.

  9. Configure the dataflow Schedule based on your desired frequency.

  10. Click Finish to create the dataflow. It will execute automatically according to the defined schedule.

  11. From the Connections menu, select Sources and access the Dataflows tab to track flow execution, review ingested records, and troubleshoot any errors.
