(Beta) Export datasets to cloud storage destinations

IMPORTANT
  • The functionality to export datasets is currently in Beta and is not available to all users. The documentation and the functionality are subject to change.
  • This beta functionality supports the export of first generation data, as defined in the Real-Time Customer Data Platform product description.
  • This functionality is available to customers who have purchased the Real-Time CDP Prime or Ultimate package. Please contact your Adobe representative for more information.

This article explains the workflow required to export datasets from Adobe Experience Platform to your preferred cloud storage location, such as Amazon S3, SFTP locations, or Google Cloud Storage.

When to activate segments or export datasets

Some file-based destinations in the Experience Platform catalog support both segment activation and dataset export.

  • Consider activating segments when you want your data structured into profiles grouped by audience interests or qualifications.
  • Alternatively, consider dataset exports when you want to export raw datasets, which are not grouped or structured by audience interests or qualifications. You could use this data for reporting, data science workflows, compliance requirements, and many other use cases.

This document contains all the information necessary to export datasets. If you want to activate segments to cloud storage or email marketing destinations, read Activate audience data to batch profile export destinations.

Prerequisites

To export datasets to cloud storage destinations, you must have successfully connected to a destination. If you haven’t done so already, go to the destinations catalog, browse the supported destinations, and configure the destination that you want to use.

Required permissions

To export datasets, you need the Manage Destinations, View Destinations, Activate Destinations, and Manage and Activate Dataset Destinations access control permissions. Read the access control overview or contact your product administrator to obtain the required permissions.

To ensure that you have the necessary permissions to export datasets and that the destination supports exporting datasets, browse the destinations catalog. If a destination has an Activate or an Export datasets control, then you have the appropriate permissions.
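If you track role assignments outside the UI, a quick set comparison can show which of the four permissions named above are still missing. The sketch below is illustrative only: the `granted` list is a hypothetical input that you would fill in by hand, and nothing here calls an Experience Platform API.

```python
# Permission names as listed in this article.
REQUIRED_PERMISSIONS = frozenset({
    "Manage Destinations",
    "View Destinations",
    "Activate Destinations",
    "Manage and Activate Dataset Destinations",
})


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Hypothetical input: permissions copied by hand from a role definition.
granted = ["View Destinations", "Manage Destinations"]
print(missing_permissions(granted))
```

An empty list means that all four permissions are present.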

Select your destination

Follow the steps below to select a destination to which you can export your datasets:

  1. Go to Connections > Destinations, and select the Catalog tab.

    Destination catalog tab with Catalog control highlighted.

  2. Select Activate or Export datasets on the card corresponding to the destination that you want to export datasets to.

    Destination catalog tab with Activate control highlighted.

  3. Select Data type Datasets and select the destination connection that you want to export datasets to, then select Next.

TIP

If you want to set up a new destination to export datasets, select Configure new destination to trigger the Connect to destination workflow.

Destination activation workflow with Datasets control highlighted.

  4. The Select datasets view appears. Proceed to the next section to select your datasets for export.

Select your datasets

Use the check boxes to the left of the dataset names to select the datasets that you want to export to the destination, then select Next.

Dataset export workflow showing the Select datasets step where you can select which datasets to export.

Schedule dataset export

In the Scheduling step, you can set a start date as well as an export cadence for your dataset exports.

The Export incremental files option is automatically selected. This triggers an export where the first file is a full snapshot of the dataset and subsequent files are incremental additions to the dataset since the previous export.

IMPORTANT

The first exported incremental file includes all existing data in the dataset, functioning as a backfill.
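Because the first file is a full snapshot and each later file holds only additions, a downstream consumer can rebuild the dataset by appending the exports in chronological order. The following is a minimal sketch of that idea, assuming the rows of each export have already been loaded into memory and keyed by an export timestamp in the YYYYMMDDHHMM form used in the export folder names. The function name and the sample rows are hypothetical.

```python
from typing import Dict, List


def rebuild_dataset(exports: Dict[str, List[dict]]) -> List[dict]:
    """Concatenate exported batches in chronological order.

    `exports` maps an export timestamp (YYYYMMDDHHMM) to the rows found in
    that export. The earliest entry is the full snapshot; later entries hold
    only rows added since the previous export.
    """
    rows: List[dict] = []
    # Fixed-width YYYYMMDDHHMM strings sort chronologically as plain text.
    for export_time in sorted(exports):
        rows.extend(exports[export_time])
    return rows


# Hypothetical rows, for illustration only.
exports = {
    "202401020000": [{"id": 3}],
    "202401010000": [{"id": 1}, {"id": 2}],
}
print(rebuild_dataset(exports))
```

This sketch assumes that incremental files only ever add rows. It does not handle updates to or deletions of existing rows.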

Dataset export workflow showing the scheduling step.

  1. Use the Frequency selector to select the export frequency:

    • Daily: Schedule incremental file exports once a day, every day, at the time you specify.
    • Hourly: Schedule incremental file exports every 3, 6, 8, or 12 hours.
  2. Use the Time selector to choose the time of day, in UTC, when the export should take place.

  3. Use the Date selector to choose the interval when the export should take place. Note that in the beta version of the feature, it is not possible to set an end date for the exports. For more information, view the known limitations section.

  4. Select Next to save the schedule and proceed to the Review step.
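Because the Time selector expects UTC, it can help to convert a local start time first and preview when the following exports might run. The sketch below assumes that exports recur at a fixed interval counted from the start time; the scheduling logic that Experience Platform actually uses is not documented here and may differ. The function name `upcoming_exports` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List

# Intervals offered by the Hourly option, plus 24 for the Daily option.
ALLOWED_INTERVALS_HOURS = (3, 6, 8, 12, 24)


def upcoming_exports(start: datetime, interval_hours: int, count: int) -> List[datetime]:
    """Return the first `count` export times in UTC, beginning at `start`."""
    if interval_hours not in ALLOWED_INTERVALS_HOURS:
        raise ValueError(f"unsupported interval: {interval_hours} hours")
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware so it can be converted to UTC")
    start_utc = start.astimezone(timezone.utc)
    # Assumption: runs repeat at a fixed interval from the start time.
    return [start_utc + timedelta(hours=interval_hours * n) for n in range(count)]


# Example: 22:00 at UTC-5 corresponds to 03:00 UTC on the following day.
local_start = datetime(2024, 1, 1, 22, 0, tzinfo=timezone(timedelta(hours=-5)))
for run in upcoming_exports(local_start, 8, 3):
    print(run.isoformat())
```

The value to enter in the Time selector is the time of day of the first printed entry.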

NOTE

For dataset exports, the file names have a preset, default format, which cannot be modified. See the section Verify successful dataset export for more information and examples of exported files.

Review

On the Review page, you can see a summary of your selection. Select Cancel to abandon the flow, Back to modify your settings, or Finish to confirm your selection and start exporting datasets to the destination.

Dataset export workflow showing the review step.

Verify successful dataset export

When exporting datasets, Experience Platform creates a .json or .parquet file in the storage location that you provided. Expect a new file to be deposited in your storage location according to the export schedule you provided.

Experience Platform creates a folder structure in the storage location you specified, where it deposits the exported dataset files. A new folder is created for each export time, following the pattern below:

folder-name-you-provided/datasetID/exportTime=YYYYMMDDHHMM

The default file name is randomly generated and ensures that exported file names are unique.
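When you process exported files downstream, you can recover the dataset ID and export time from the folder pattern above. The following is a minimal sketch under two assumptions: that the `exportTime` value is expressed in UTC (this article states UTC only for the schedule), and that the path may end with a file name. The file name in the example is hypothetical, since actual file names are randomly generated.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ExportLocation(NamedTuple):
    folder: str            # folder name provided when configuring the destination
    dataset_id: str        # ID of the exported dataset
    export_time: datetime  # parsed from the exportTime= segment


def parse_export_path(path: str) -> ExportLocation:
    """Split 'folder/datasetID/exportTime=YYYYMMDDHHMM[/file]' into its parts."""
    parts = path.strip("/").split("/")
    # Locate the exportTime= segment; the folder name may itself contain slashes.
    for i, part in enumerate(parts):
        if part.startswith("exportTime="):
            break
    else:
        raise ValueError(f"no exportTime= segment in {path!r}")
    if i < 2:
        raise ValueError(f"expected folder and dataset ID before exportTime in {path!r}")
    stamp = part[len("exportTime="):]
    # Assumption: the timestamp is in UTC.
    export_time = datetime.strptime(stamp, "%Y%m%d%H%M").replace(tzinfo=timezone.utc)
    return ExportLocation("/".join(parts[:i - 1]), parts[i - 1], export_time)


# Hypothetical path, for illustration only.
loc = parse_export_path("my-folder/abc123/exportTime=202301151200/part-0000.parquet")
print(loc.dataset_id, loc.export_time.isoformat())
```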

Sample dataset files

The presence of these files in your storage location is confirmation of a successful export. To understand how the exported files are structured, you can download a sample .parquet file or .json file.

Remove dataset from destination

To remove a dataset from an existing dataflow, follow the steps below:

  1. Log in to the Experience Platform UI and select Destinations from the left navigation bar. Select Browse from the top header to view your existing destination dataflows.

    Destination browse view with a destination connection shown and the rest blurred out.

    TIP

    Select the filter icon on the top left to launch the sort panel. The sort panel provides a list of all your destinations. You can select more than one destination from the list to see a filtered selection of dataflows associated with the selected destinations.

  2. From the Activation data column, select the datasets control to view all datasets mapped to this export dataflow.

    The available datasets navigation option highlighted in the Activation data column.

  3. The Activation data page for the destination appears. Select Remove dataset in the right rail to trigger the remove dataset confirmation dialog.

    Remove dataset dialog showing the Remove dataset control in the right rail.

  4. In the confirmation dialog, select Remove to immediately remove the dataset from exports to the destination.

    Dialog showing the Confirm dataset removal option from the dataflow.

Known limitations

Keep in mind the following limitations for the beta release of dataset exports:

  • There is currently a single permission (Manage and Activate Dataset Destinations) that includes manage and activate permissions on dataset destinations. These controls will be split up in the future into more granular permissions. Review the required permissions section for a complete list of permissions that you need to export datasets.
  • Currently, you can only export incremental files, and you cannot select an end date for your dataset exports.
  • Exported filenames are currently not customizable.
  • The UI does not currently block you from deleting a dataset that is being exported to a destination. Do not delete any datasets that are being exported to destinations. Remove the dataset from a destination dataflow before deleting it.
  • Monitoring metrics for dataset exports are currently mixed with numbers for profile exports, so they do not reflect the true export numbers.
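Because the UI does not block deletion of a dataset that is still being exported, you may want to run your own check before deleting one. The sketch below assumes that you keep a hand-maintained mapping of dataflow names to the dataset IDs they export. The mapping, its names, and the function are all hypothetical, and nothing here queries Experience Platform.

```python
def dataflows_exporting(dataset_id, dataflows):
    """Return the names of dataflows that still export `dataset_id`, sorted."""
    return sorted(name for name, ids in dataflows.items() if dataset_id in ids)


# Hypothetical inventory of export dataflows and the datasets mapped to them.
dataflows = {
    "s3-daily": {"ds-001", "ds-002"},
    "sftp-reports": {"ds-002"},
}

blocking = dataflows_exporting("ds-002", dataflows)
if blocking:
    print("Remove the dataset from these dataflows before deleting it:", blocking)
```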
