Ingesting data into datasets
Adobe Experience Platform Data Ingestion represents the multiple methods by which Platform ingests data from various sources. Regardless of the method of ingestion, all successfully ingested data is converted to batch files. Batches are units of data that consist of one or more files to be ingested as a single unit. These batch files are then added to dedicated datasets and persisted within the Data Lake.
See the Data Ingestion overview for more information.
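The relationship between files, batches, and datasets described above can be pictured with a small data model. This is only a conceptual sketch: the `Batch` and `Dataset` classes, their fields, and the file names are hypothetical illustrations, not Platform API objects.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Batch:
    """Hypothetical model: one or more files ingested as a single unit."""
    batch_id: str
    files: Tuple[str, ...]

    def __post_init__(self):
        # A batch with no files would not represent any ingested data.
        if not self.files:
            raise ValueError("a batch must contain at least one file")


@dataclass
class Dataset:
    """Hypothetical model: a dataset accumulates the batches added to it."""
    name: str
    batches: List[Batch] = field(default_factory=list)

    def add_batch(self, batch: Batch) -> None:
        self.batches.append(batch)

    def file_count(self) -> int:
        # Total number of files across every batch in the dataset.
        return sum(len(b.files) for b in self.batches)


# Illustrative names only.
dataset = Dataset("loyalty_members")
dataset.add_batch(Batch("batch-001", ("part-0.parquet", "part-1.parquet")))
dataset.add_batch(Batch("batch-002", ("part-0.parquet",)))
print(dataset.file_count())
```

In this model a dataset never holds loose files: every file arrives as part of a batch, which mirrors the "single unit" behavior described above.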
Labels applied to datasets from schemas
Adobe Experience Platform Data Governance allows you to manage customer data in order to ensure compliance with regulations, restrictions, and policies applicable to data use. The Data Governance framework allows you to apply usage labels to categorize data according to the usage policies that apply to that data. Labels can be applied to individual schemas, fields within those schemas, and entire individual datasets. When labels are applied directly to a schema, those labels are propagated to all existing and future datasets that are based on that schema.
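The propagation rule in the previous paragraph can be sketched as follows. This is an illustrative model only, assuming a dataset's effective labels are the union of its schema's labels and its own; the `Schema` and `Dataset` classes are hypothetical, and the label strings are used purely as example values.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Schema:
    """Hypothetical model of a schema carrying usage labels."""
    name: str
    labels: Set[str] = field(default_factory=set)


@dataclass
class Dataset:
    """Hypothetical model of a dataset based on a schema."""
    name: str
    schema: Schema
    own_labels: Set[str] = field(default_factory=set)

    def effective_labels(self) -> Set[str]:
        # Schema labels are read at call time, so a label added to the
        # schema later is visible on datasets that already exist.
        return self.schema.labels | self.own_labels


schema = Schema("Loyalty Members")
existing = Dataset("loyalty_2023", schema)            # created before labeling
schema.labels.add("C2")                               # label applied to the schema
future = Dataset("loyalty_2024", schema, own_labels={"I1"})  # created afterwards

print(sorted(existing.effective_labels()))
print(sorted(future.effective_labels()))
```

Because labels are resolved through the schema reference rather than copied at creation time, both the existing and the future dataset pick up the schema-level label, while the dataset-level label stays local to the dataset it was applied to.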
See the Data Governance overview for more information on the service. For steps on how to work with usage labels in Platform, refer to the Data Governance documentation.
Datasets in downstream Platform services
Once datasets contain ingested data, downstream Platform services use them to update customer profiles, gain insights through machine learning, and more.
The following is a list of downstream services that use datasets for various operations. Please review the documentation for each service for more information.
- Data Access API: Allows you to access and download the contents of files stored within datasets.
- Adobe Experience Platform Identity Service: Bridges identities across devices and systems, linking datasets together based on the identity fields defined by the XDM schemas they conform to.
- Real-Time Customer Profile: Leverages Identity Service to create detailed customer profiles from your datasets in real time. Real-Time Customer Profile pulls data from the Data Lake and persists customer profiles in its own separate data store.
- Adobe Experience Platform Segmentation Service: Allows you to build segments and generate audiences from your Real-Time Customer Profile data. These audiences can then be exported to their own datasets within the Data Lake.
- Adobe Experience Platform Data Science Workspace: Uses machine learning and artificial intelligence to uncover insights in large datasets.
- Adobe Experience Platform Query Service: Allows you to use standard SQL to query data in Experience Platform, joining any datasets within the Data Lake and capturing query results as a new dataset for use in reporting, Data Science Workspace, or Real-Time Customer Profile.
- Adobe Experience Platform Destinations Service: Allows you to export datasets to your desired cloud storage or email marketing destinations, for reporting or data science activities.
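The Query Service pattern in the list above (joining datasets with standard SQL and capturing the result as a new dataset) can be illustrated with a generic SQL sketch. The example below uses an in-memory SQLite database through Python's built-in `sqlite3` module purely as a stand-in; it does not connect to Query Service, and the table names, columns, and values are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for two datasets (assumption:
# this is NOT Query Service, only a generic SQL illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profiles (customer_id TEXT PRIMARY KEY, loyalty_tier TEXT);
    CREATE TABLE purchases (customer_id TEXT, amount REAL);
    INSERT INTO profiles VALUES ('c1', 'gold'), ('c2', 'silver');
    INSERT INTO purchases VALUES ('c1', 40.0), ('c1', 60.0), ('c2', 25.0);
""")

# Join the two tables and capture the result as a new table,
# analogous to saving query output as a new dataset.
conn.execute("""
    CREATE TABLE spend_by_tier AS
    SELECT p.loyalty_tier, SUM(x.amount) AS total_spend
    FROM profiles AS p
    JOIN purchases AS x ON x.customer_id = p.customer_id
    GROUP BY p.loyalty_tier
""")

rows = conn.execute(
    "SELECT loyalty_tier, total_spend FROM spend_by_tier ORDER BY loyalty_tier"
).fetchall()
print(rows)
```

The derived table can then be queried like any other, which is the same idea as reusing a query-generated dataset in reporting or other downstream services.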
Next steps
By reading this document, you have been introduced to the core uses of datasets in Experience Platform, as well as the various Platform services that utilize datasets. For more details on the many ways datasets are used in Platform, please review the service documentation linked throughout this overview.
For steps on how to interact with datasets within the Experience Platform UI, see the datasets user guide.