This article explains the workflow required to export datasets from Adobe Experience Platform to your preferred cloud storage location, such as Amazon S3, SFTP locations, or Google Cloud Storage, by using the Experience Platform UI.
You can also use the Experience Platform APIs to export datasets. Read the export datasets API tutorial for more information.
The datasets that you can export vary based on the Experience Platform application (Real-Time CDP, Adobe Journey Optimizer), the tier (Prime or Ultimate), and any add-ons that you purchased (for example: Data Distiller).
The table below lists the dataset types that you can export, depending on your application, product tier, and any add-ons purchased:
|Datasets available for export
|Profile and Experience Event datasets created in the Experience Platform UI after ingesting or collecting data through Sources, Web SDK, Mobile SDK, Analytics Data Connector, and Audience Manager.
|Adobe Journey Optimizer
|Refer to the Adobe Journey Optimizer documentation.
|Refer to the Adobe Journey Optimizer documentation.
|Customer Journey Analytics
| Profile and Experience Event datasets created in the Experience Platform UI after ingesting or collecting data through Sources, Web SDK, Mobile SDK, Analytics Data Connector, and Audience Manager.
Note on availability: The ability to export datasets to the cloud is in the Limited Testing phase of release and might not be available yet in your environment. This note will be removed when the functionality is generally available. For information about the Customer Journey Analytics release process, see Customer Journey Analytics feature releases.
|Data Distiller (Add-on)
|Derived datasets created through Query Service.
Watch the video below for an end-to-end explanation of the workflow described on this page, benefits of using the export dataset functionality, and some suggested use cases.
Currently, you can export datasets to the cloud storage destinations highlighted in the screenshot and listed below.
Some file-based destinations in the Experience Platform catalog support both audience activation and dataset export.
This document contains all the information necessary to export datasets. If you want to activate audiences to cloud storage or email marketing destinations, read Activate audience data to batch profile export destinations.
To export datasets to cloud storage destinations, you must have successfully connected to a destination. If you haven’t done so already, go to the destinations catalog, browse the supported destinations, and configure the destination that you want to use.
To export datasets, you need the View Destinations, View Datasets, and Manage and Activate Dataset Destinations access control permissions. Read the access control overview or contact your product administrator to obtain the required permissions.
To ensure that you have the necessary permissions to export datasets and that the destination supports exporting datasets, browse the destinations catalog. If a destination has an Activate or an Export datasets control, then you have the appropriate permissions.
Follow the steps below to select a destination where you can export your datasets:
Go to Connections > Destinations, and select the Catalog tab.
Select Activate or Export datasets on the card corresponding to the destination that you want to export datasets to.
Select Data type Datasets and select the destination connection that you want to export datasets to, then select Next.
If you want to set up a new destination to export datasets, select Configure new destination to trigger the Connect to destination workflow.
Use the check boxes to the left of the dataset names to select the datasets that you want to export to the destination, then select Next.
In the Scheduling step, you can set a start date and an export cadence for your dataset exports.
The Export incremental files option is automatically selected. This triggers an export of one or multiple files representing a full snapshot of the dataset. Subsequent files are incremental additions to the dataset since the previous export.
The first incremental file export includes all existing data in the dataset, functioning as a backfill. The export can contain one or multiple files.
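The relationship between the first export and later ones can be sketched as follows. This is a minimal illustration, assuming that each incremental export only adds records and that exports are processed in chronological order. The `rebuild_dataset` helper and the sample records are hypothetical and are not part of any Experience Platform API.

```python
from typing import Dict, List

Record = Dict[str, str]


def rebuild_dataset(exports: List[List[Record]]) -> List[Record]:
    """Concatenate exports in chronological order.

    exports[0] stands for the first export (the full snapshot, or backfill).
    Every later entry holds only the records added since the previous export.
    """
    rebuilt: List[Record] = []
    for batch in exports:
        rebuilt.extend(batch)
    return rebuilt


# Hypothetical sample data, not real exported records.
first_export = [{"id": "a"}, {"id": "b"}]  # backfill of all existing data
second_export = [{"id": "c"}]              # additions since the first export
third_export: List[Record] = []            # nothing new since the second export

dataset = rebuild_dataset([first_export, second_export, third_export])
print([record["id"] for record in dataset])  # ['a', 'b', 'c']
```

Under these assumptions, a downstream consumer that misses the first export would never see the records that existed before scheduling began, which is why the backfill matters.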
Use the Frequency selector to select the export frequency:
Use the Time selector to choose the time of day, in UTC format, when the export should take place.
Use the Date selector to choose the interval when the export should take place. Note that you currently cannot set an end date for the exports. For more information, view the known limitations section.
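Because the Time selector expects UTC, it can help to convert a preferred local time before you fill in the schedule. The snippet below is a small sketch that uses only the Python standard library. The `to_utc_time_of_day` helper and the UTC-05:00 offset are illustrative assumptions, not values taken from the product.

```python
from datetime import datetime, timedelta, timezone


def to_utc_time_of_day(local_dt: datetime) -> str:
    """Return the UTC time of day (HH:MM) for a timezone-aware datetime."""
    if local_dt.tzinfo is None:
        raise ValueError("local_dt must be timezone-aware")
    return local_dt.astimezone(timezone.utc).strftime("%H:%M")


# Hypothetical example: 09:30 in a zone five hours behind UTC (UTC-05:00).
local_zone = timezone(timedelta(hours=-5))
desired_local_time = datetime(2024, 3, 1, 9, 30, tzinfo=local_zone)
print(to_utc_time_of_day(desired_local_time))  # 14:30
```

A fixed offset keeps the example self-contained. For zones with daylight saving time, the offset changes during the year, so the UTC value would need to be recomputed for the relevant date.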
Select Next to save the schedule and proceed to the Review step.
For dataset exports, the file names have a preset, default format, which cannot be modified. See the section Verify successful dataset export for more information and examples of exported files.
On the Review page, you can see a summary of your selection. Select Cancel to abandon the flow, Back to modify your settings, or Finish to confirm your selection and start exporting datasets to the destination.
When exporting datasets, Experience Platform creates one or multiple .parquet files in the storage location that you provided. Expect new files to be deposited in your storage location according to the export schedule you provided.
Experience Platform creates a folder structure in the storage location you specified, where it deposits the exported dataset files. A new folder is created for each export time, following the pattern below:
The default file name is randomly generated, which ensures that exported file names are unique.
The presence of these files in your storage location is confirmation of a successful export. To understand how the exported files are structured, you can download a sample .parquet file or .json file.
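If you copy the exported files to a local folder, a quick sanity check is possible without any third-party library. The sketch below relies on the fact that Parquet files begin and end with the four-byte marker `PAR1`. The `looks_like_parquet` and `summarize_exports` helpers are hypothetical names, and the check assumes unencrypted Parquet files that have already been downloaded. It does not connect to any cloud storage and does not validate file contents.

```python
from pathlib import Path
from typing import Dict, List

PARQUET_MAGIC = b"PAR1"  # Parquet files start and end with these 4 bytes


def looks_like_parquet(path: Path) -> bool:
    """Return True if the file starts and ends with the Parquet magic bytes."""
    if path.stat().st_size < 2 * len(PARQUET_MAGIC):
        return False  # too small to hold both markers
    with path.open("rb") as handle:
        head = handle.read(len(PARQUET_MAGIC))
        handle.seek(-len(PARQUET_MAGIC), 2)  # 2 = seek relative to end of file
        tail = handle.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def summarize_exports(root: Path) -> Dict[str, List[str]]:
    """Group plausible .parquet files by their folder, relative to root."""
    summary: Dict[str, List[str]] = {}
    for file in sorted(root.rglob("*.parquet")):
        if looks_like_parquet(file):
            folder = file.parent.relative_to(root).as_posix()
            summary.setdefault(folder, []).append(file.name)
    return summary
```

Calling `summarize_exports(Path("downloads/my-export"))`, where the path is a placeholder for your own local copy, returns one entry per folder, so each export time should show up as its own key.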
In the Connect to destination workflow, you can choose to compress the exported dataset files, as shown below:
Note the difference in file format between the two file types, when compressed:
To remove datasets from an existing dataflow, follow the steps below:
Log in to the Experience Platform UI and select Destinations from the left navigation bar. Select Browse from the top header to view your existing destination dataflows.
Select the filter icon on the top left to launch the sort panel. The sort panel provides a list of all your destinations. You can select more than one destination from the list to see a filtered selection of dataflows associated with the selected destinations.
From the Activation data column, select the datasets control to view all datasets mapped to this export dataflow.
The Activation data page for the destination appears. Select the dataset that you want to remove, then select Remove dataset in the right rail to trigger the dataset removal confirmation dialog.
In the confirmation dialog, select Remove to immediately remove the dataset from exports to the destination.
Refer to the product description documents to understand how much data you are entitled to export for each Experience Platform application, per year. For example, see the Real-Time CDP Product Description.
Note that the data export entitlements for different applications are not additive. For example, if you purchase Real-Time CDP Ultimate and Adobe Journey Optimizer Ultimate, the profile export entitlement is the larger of the two entitlements, as per the product descriptions. Your volume entitlement is calculated by multiplying your total number of licensed profiles by 500 KB for Real-Time CDP Prime or 700 KB for Real-Time CDP Ultimate.
On the other hand, if you purchased add-ons such as Data Distiller, the data export limit that you are entitled to represents the sum of the product tier and the add-on.
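The entitlement rules above can be worked through with a short calculation. This is an illustrative sketch only: the per-profile figures for Real-Time CDP come from the text, while the function names, the second application's figure, and the add-on figure are hypothetical placeholders. Your product description documents remain the authoritative source.

```python
from typing import Iterable

# Per-profile figures for Real-Time CDP, as stated in the text above.
KB_PER_PROFILE = {"prime": 500, "ultimate": 700}


def rtcdp_entitlement_kb(licensed_profiles: int, tier: str) -> int:
    """Licensed profiles multiplied by the per-profile figure for the tier."""
    return licensed_profiles * KB_PER_PROFILE[tier.lower()]


def combined_entitlement_kb(
    app_entitlements_kb: Iterable[int],
    addon_entitlements_kb: Iterable[int] = (),
) -> int:
    """Application entitlements are not additive, so take the largest one.

    Add-on entitlements are then summed on top of that value.
    """
    return max(app_entitlements_kb, default=0) + sum(addon_entitlements_kb)


rtcdp = rtcdp_entitlement_kb(1_000_000, "ultimate")
print(rtcdp)  # 700000000

# Hypothetical figures for a second application and an add-on.
other_application_kb = 400_000_000
addon_kb = 100_000_000
print(combined_entitlement_kb([rtcdp, other_application_kb], [addon_kb]))  # 800000000
```

In this example, the second application does not increase the total because only the larger application entitlement counts, whereas the add-on figure is added to it.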
You can view and track your profile exports against your contractual limits in the licensing dashboard.
Keep in mind the following limitations for the general availability release of dataset exports: