Export datasets by using the Flow Service API
- Topics:
- Destinations
CREATED FOR:
- Admin
- User
- This functionality is available to customers who have purchased the Real-Time CDP Prime and Ultimate package, Adobe Journey Optimizer, or Customer Journey Analytics. Contact your Adobe representative for more information.
Starting with the September 2024 release, you can set an `endTime` date for export dataset dataflows. Adobe has also introduced a default end date of May 1st 2025 for all dataset export dataflows created prior to the September 2024 release. For dataflows created after that release, if you do not set an `endTime` date, they default to an end time six months from the time they are created.
This article explains the workflow required to use the Flow Service API to export datasets from Adobe Experience Platform to your preferred cloud storage location, such as Amazon S3, SFTP locations, or Google Cloud Storage.
Datasets available for exporting
The datasets that you can export depend on the Experience Platform application (Real-Time CDP, Adobe Journey Optimizer), the tier (Prime or Ultimate), and any add-ons that you purchased (for example: Data Distiller).
Refer to the table on the UI tutorial page to understand which datasets you can export.
Supported destinations
Currently, you can export datasets to the following cloud storage destinations: Amazon S3, Azure Blob Storage, Azure Data Lake Storage Gen2, Data Landing Zone, Google Cloud Storage, and SFTP.
Getting started
This guide requires a working understanding of the following components of Adobe Experience Platform:
- Experience Platform datasets: All data that is successfully ingested into Adobe Experience Platform is persisted within the Data Lake as datasets. A dataset is a storage and management construct for a collection of data, typically a table, that contains a schema (columns) and fields (rows). Datasets also contain metadata that describes various aspects of the data they store.
- Sandboxes: Experience Platform provides virtual sandboxes which partition a single Experience Platform instance into separate virtual environments to help develop and evolve digital experience applications.
The following sections provide additional information that you must know in order to export datasets to cloud storage destinations in Experience Platform.
Required permissions
To export datasets, you need the View Destinations, View Datasets, and Manage and Activate Dataset Destinations access control permissions. Read the access control overview or contact your product administrator to obtain the required permissions.
To ensure that you have the necessary permissions to export datasets and that the destination supports exporting datasets, browse the destinations catalog. If a destination has an Activate or an Export datasets control, then you have the appropriate permissions.
Reading sample API calls
This tutorial provides example API calls to demonstrate how to format your requests. These include paths, required headers, and properly formatted request payloads. Sample JSON returned in API responses is also provided. For information on the conventions used in documentation for sample API calls, see the section on how to read example API calls in the Experience Platform troubleshooting guide.
Gather values for required and optional headers
In order to make calls to Experience Platform APIs, you must first complete the Experience Platform authentication tutorial. Completing the authentication tutorial provides the values for each of the required headers in all Experience Platform API calls, as shown below:
- Authorization: Bearer {ACCESS_TOKEN}
- x-api-key: {API_KEY}
- x-gw-ims-org-id: {ORG_ID}
Resources in Experience Platform can be isolated to specific virtual sandboxes. In requests to Experience Platform APIs, you can specify the name and ID of the sandbox that the operation will take place in. These are optional parameters.
- x-sandbox-name: {SANDBOX_NAME}
All requests that contain a payload (POST, PUT, PATCH) require an additional media type header:
- Content-Type: application/json
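For convenience, you can store these values in shell variables and reference them in every subsequent request. The sketch below is illustrative only: the variable names are not required by the API, and the GET /connectionSpecs listing call is simply one example of a request that reuses the same headers.
# Illustrative shell variables for the values obtained from the authentication tutorial
export ACCESS_TOKEN="<your access token>"
export API_KEY="<your API key / client ID>"
export ORG_ID="<your organization ID>"
export SANDBOX_NAME="<your sandbox name>"
# Any Experience Platform API call can then reuse the same header values
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs' \
--header 'accept: application/json' \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header "x-api-key: ${API_KEY}" \
--header "x-gw-ims-org-id: ${ORG_ID}" \
--header "x-sandbox-name: ${SANDBOX_NAME}"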
API reference documentation
You can find accompanying reference documentation for all the API operations in this tutorial. Refer to the Flow Service - Destinations API documentation on the Adobe Developer website. We recommend that you use this tutorial and the API reference documentation in parallel.
Glossary
For descriptions of the terms that you will be encountering in this API tutorial, read the glossary section of the API reference documentation.
Gather connection specs and flow specs for your desired destination
Before starting the workflow to export a dataset, identify the connection spec and flow spec IDs of the destination to which you intend to export datasets. Use the table below for reference.
| Destination | Connection spec ID | Flow spec ID |
|---|---|---|
| Amazon S3 | 4fce964d-3f37-408f-9778-e597338a21ee | 269ba276-16fc-47db-92b0-c1049a3c131f |
| Azure Blob Storage | 6d6b59bf-fb58-4107-9064-4d246c0e5bb2 | 95bd8965-fc8a-4119-b9c3-944c2c2df6d2 |
| Azure Data Lake Gen2 | be2c3209-53bc-47e7-ab25-145db8b873e1 | 17be2013-2549-41ce-96e7-a70363bec293 |
| Data Landing Zone | 10440537-2a7b-4583-ac39-ed38d4b848e8 | cd2fc47e-e838-4f38-a581-8fff2f99b63a |
| Google Cloud Storage | c5d93acb-ea8b-4b14-8f53-02138444ae99 | 585c15c4-6cbf-4126-8f87-e26bff78b657 |
| SFTP | 36965a81-b1c6-401b-99f8-22508f1e6a26 | 354d6aad-4754-46e4-a576-1b384561c440 |
You need these IDs to construct various Flow Service entities. You also need to refer to parts of the connection spec itself to set up certain entities, so you must first retrieve the connection spec from the Flow Service API. See the examples below for retrieving the connection specs of all the destinations in the table:
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/4fce964d-3f37-408f-9778-e597338a21ee' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "4fce964d-3f37-408f-9778-e597338a21ee",
"name": "Amazon S3",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/6d6b59bf-fb58-4107-9064-4d246c0e5bb2' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2",
"name": "Azure Blob Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/be2c3209-53bc-47e7-ab25-145db8b873e1' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1",
"name": "Azure Data Lake Gen2",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/10440537-2a7b-4583-ac39-ed38d4b848e8' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8",
"name": "Data Landing Zone",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/c5d93acb-ea8b-4b14-8f53-02138444ae99' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99",
"name": "Google Cloud Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/36965a81-b1c6-401b-99f8-22508f1e6a26' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26",
"name": "SFTP",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Follow the steps below to set up a dataset dataflow to a cloud storage destination. For some steps, the requests and responses differ between the various cloud storage destinations. In those cases, use the tabs on the page to retrieve the requests and responses specific to the destination that you want to connect and export datasets to. Be sure to use the correct connection spec and flow spec for the destination you are configuring.
Retrieve a list of datasets
To retrieve a list of datasets eligible for activation, start by making an API call to the endpoint below.
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/23598e46-f560-407b-88d5-ea6207e49db0/configs?outputType=activationDatasets&outputField=datasets&start=0&limit=20&properties=name,state' \
--header 'accept: application/json' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-api-key: {API_KEY}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Note that to retrieve eligible datasets, the connection spec ID used in the request URL must be the data lake source connection spec ID, `23598e46-f560-407b-88d5-ea6207e49db0`, and the two query parameters `outputField=datasets` and `outputType=activationDatasets` must be specified. All other query parameters are the standard ones supported by the Catalog Service API.
Response
{
"items": [
{
"id": "5ef3e324052581191aa6a466",
"name": "AAM Authenticated Profiles Meta Data",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e3259ad2a1191ab7dd7d",
"name": "AAM Devices Data",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e325582424191b1beb42",
"name": "AAM Devices Profile Meta Data",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e328582424191b1beb44",
"name": "AAM Realtime",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e328fe742a191b2b3ea5",
"name": "AAM Realtime Profile Updates",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
}
],
"pageInfo": {
"start": 0,
"end": 4,
"total": 149,
"hasNext": true
}
}
A successful response contains a list of datasets eligible for activation. These datasets can be used when constructing the source connection in the next step.
For information about the various response parameters for each returned dataset, refer to the Datasets API developer documentation.
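The sample response above also includes a pageInfo object with hasNext set to true, indicating that there are more eligible datasets than the requested limit. To page through the full list, increment the start query parameter by the limit value on each call. A sketch, assuming the same request as above with limit=20:
# Retrieve the next page of eligible datasets (items 20-39)
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/23598e46-f560-407b-88d5-ea6207e49db0/configs?outputType=activationDatasets&outputField=datasets&start=20&limit=20&properties=name,state' \
--header 'accept: application/json' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-api-key: {API_KEY}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'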
Create a source connection
After retrieving the list of datasets that you want to export, you can create a source connection using those dataset IDs.
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/sourceConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Connecting to Data Lake",
"description": "Data Lake source connection to export datasets",
"connectionSpec": {
"id": "23598e46-f560-407b-88d5-ea6207e49db0", // this connection spec ID is always the same for Source Connections
"version": "1.0"
},
"params": {
"datasets": [ // datasets to activate
{
"dataSetId": "5ef3e3259ad2a1191ab7dd7d",
"name": "AAM Devices Data"
}
]
}
}'
Response
{
"id": "900df191-b983-45cd-90d5-4c7a0326d650",
"etag": "\"0500ebe1-0000-0200-0000-63e28d060000\""
}
A successful response returns the ID (`id`) of the newly created source connection and an `etag`. Note down the source connection ID as you will need it later when creating the dataflow.
Please also remember that:
- The source connection created in this step needs to be linked to a dataflow for its datasets to be activated to a destination. See the create a dataflow section for information on how to link a source connection to a dataflow.
- The dataset IDs of a source connection cannot be modified after creation. If you need to add or remove datasets from a source connection, you must create a new source connection and link the ID of the new source connection to the dataflow.
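Because the dataset list cannot be changed later, include every dataset that you want this dataflow to export when you create the source connection. The datasets array accepts multiple entries; the sketch below reuses two dataset IDs from the sample response earlier in this tutorial and is otherwise identical to the request above.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/sourceConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Connecting to Data Lake",
"description": "Data Lake source connection to export two datasets",
"connectionSpec": {
"id": "23598e46-f560-407b-88d5-ea6207e49db0",
"version": "1.0"
},
"params": {
"datasets": [
{
"dataSetId": "5ef3e3259ad2a1191ab7dd7d",
"name": "AAM Devices Data"
},
{
"dataSetId": "5ef3e324052581191aa6a466",
"name": "AAM Authenticated Profiles Meta Data"
}
]
}
}'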
Create a (target) base connection
A base connection securely stores the credentials to your destination. Depending on the destination type, the credentials needed to authenticate against that destination can vary. To find these authentication parameters, first retrieve the connection spec for your desired destination as described in the section Gather connection specs and flow specs, and then look at the `authSpec` of the response. Reference the tabs below for the `authSpec` properties of all supported destinations.
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "4fce964d-3f37-408f-9778-e597338a21ee",
"name": "Amazon S3",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "Access Key",
"type": "KeyBased",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "Defines auth params required for connecting to amazon-s3",
"type": "object",
"properties": {
"s3AccessKey": {
"description": "Access key id",
"type": "string",
"pattern": "^[A-Z2-7]{20}$"
},
"s3SecretKey": {
"description": "Secret access key for the user account",
"type": "string",
"format": "password",
"pattern": "^[A-Za-z0-9\/+]{40}$"
}
},
"required": [
"s3SecretKey",
"s3AccessKey"
]
}
}
],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2",
"name": "Azure Blob Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "ConnectionString",
"type": "ConnectionString",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "Connection String for Azure Blob based destinations",
"type": "object",
"properties": {
"connectionString": {
"description": "connection string for login",
"type": "string",
"format": "password"
}
},
"required": [
"connectionString"
]
}
}
],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1",
"name": "Azure Data Lake Gen2",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "Azure Service Principal Auth",
"type": "AzureServicePrincipal",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to adlsgen2 using service principal",
"type": "object",
"properties": {
"url": {
"description": "Endpoint for Azure Data Lake Storage Gen2.",
"type": "string"
},
"servicePrincipalId": {
"description": "Service Principal Id to connect to ADLSGen2.",
"type": "string"
},
"servicePrincipalKey": {
"description": "Service Principal Key to connect to ADLSGen2.",
"type": "string",
"format": "password"
},
"tenant": {
"description": "Tenant information(domain name or tenant ID).",
"type": "string"
}
},
"required": [
"servicePrincipalKey",
"url",
"tenant",
"servicePrincipalId"
]
}
}
],
//...
{
"items": [
{
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8",
"name": "Data Landing Zone",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99",
"name": "Google Cloud Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "Google Cloud Storage authentication credentials",
"type": "GoogleCloudStorageAuth",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to google cloud storage connector.",
"type": "object",
"properties": {
"accessKeyId": {
"description": "Access Key Id for the user account",
"type": "string"
},
"secretAccessKey": {
"description": "Secret Access Key for the user account",
"type": "string",
"format": "password"
}
},
"required": [
"accessKeyId",
"secretAccessKey"
]
}
}
],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26",
"name": "SFTP",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "SFTP with Password",
"type": "SFTP",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to sftp locations with a password",
"type": "object",
"properties": {
"domain": {
"description": "Domain of server",
"type": "string"
},
"username": {
"description": "Username",
"type": "string"
},
"password": {
"description": "Password",
"type": "string",
"format": "password"
}
},
"required": [
"password",
"domain",
"username"
]
}
},
{
"name": "SFTP with SSH Key",
"type": "SFTP",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to sftp locations using SSH Key",
"type": "object",
"properties": {
"domain": {
"description": "Domain of server",
"type": "string"
},
"username": {
"description": "Username",
"type": "string"
},
"sshKey": {
"description": "Base64 string of the private SSH key",
"type": "string",
"format": "password",
"contentEncoding": "base64",
"uiAttributes": {
"tooltip": {
"id": "platform_destinations_connect_sftp_ssh",
"fallbackUrl": "http://www.adobe.com/go/destinations-sftp-connection-parameters-en "
}
}
}
},
"required": [
"sshKey",
"domain",
"username"
]
}
}
],
//...
Using the properties specified in the authentication spec (i.e. `authSpec` from the response), you can create a base connection with the required credentials, specific to each destination type, as shown in the examples below:
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Amazon S3 Base Connection",
"auth": {
"specName": "Access Key",
"params": {
"s3SecretKey": "<Add secret key>",
"s3AccessKey": "<Add access key>"
}
},
"connectionSpec": {
"id": "4fce964d-3f37-408f-9778-e597338a21ee", // Amazon S3 connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Azure Blob Storage Base Connection",
"auth": {
"specName": "ConnectionString",
"params": {
"connectionString": "<Add Azure Blob connection string>"
}
},
"connectionSpec": {
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2", // Azure Blob Storage connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Azure Data Lake Gen 2(ADLS Gen2) Base Connection",
"auth": {
"specName": "Azure Service Principal Auth",
"params": {
"servicePrincipalKey": "<Add servicePrincipalKey>",
"url": "<Add url>",
"tenant": "<Add tenant>",
"servicePrincipalId": "<Add servicePrincipalId>"
}
},
"connectionSpec": {
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1", // Azure Data Lake Gen 2(ADLS Gen2) connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Data Landing Zone Base Connection",
"connectionSpec": {
"id": "3567r537-2a7b-4583-ac39-ed38d4b848e8",
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Google Cloud Storage Base Connection",
"auth": {
"specName": "Google Cloud Storage authentication credentials",
"params": {
"accessKeyId": "<Add accessKeyId>",
"secretAccessKey": "<Add secret Access Key>"
}
},
"connectionSpec": {
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99", // Google Cloud Storage connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "SFTP with password Base Connection",
"auth": {
"specName": "SFTP with Password",
"params": {
"domain": "<Add domain>",
"username": "<Add username>",
"password": "<Add password>"
}
},
"connectionSpec": {
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26", // SFTP connection spec
"version": "1.0"
}
}'
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "SFTP with SSH key Base Connection",
"auth": {
"specName": "SFTP with SSH Key",
"params": {
"domain": "<Add domain>",
"username": "<Add username>",
"sshKey": "<Add SSH key>"
}
},
"connectionSpec": {
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26", // SFTP connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Note the connection ID from the response. This ID will be required in the next step when creating the target connection.
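Optionally, you can verify that the base connection was created by retrieving it by ID with a GET call to the /connections endpoint; the connection ID below is the sample value from the responses above. (This lookup is a sketch; credentials marked as passwords are not returned in the response.)
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connections/12401496-2573-4ca7-8137-fef1aeb9dd4c' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'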
Create a target connection
Next, you need to create a target connection which stores the export parameters for your datasets. Export parameters include location, file format, compression, and other details. Refer to the `targetSpec` properties provided in the destination's connection spec to understand the supported properties for each destination type. Reference the tabs below for the `targetSpec` properties of all supported destinations.
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "4fce964d-3f37-408f-9778-e597338a21ee",
"name": "Amazon S3",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"bucketName": {
"title": "Bucket name",
"description": "Bucket name",
"type": "string",
"pattern": "(?=^.{3,63}$)(?!^(\\d+\\.)+\\d+$)(^(([a-z0-9]|[a-z0-9][a-z0-9\\-]*[a-z0-9])\\.)*([a-z0-9]|[a-z0-9][a-z0-9\\-]*[a-z0-9])$)",
"uiAttributes": {
"tooltip": {
"id": "platform_destinations_connect_s3_bucket",
"fallbackUrl": "http://www.adobe.com/go/destinations-amazon-s3-connection-parameters-en"
}
}
},
"path": {
"title": "Folder path",
"description": "Output path for copying files",
"type": "string",
"pattern": "^[0-9a-zA-Z\/\\!\\-_\\.\\*\\''\\(\\)]*((\\%SEGMENT_(NAME|ID)\\%)?\/?)+$",
"uiAttributes": {
"tooltip": {
"id": "platform_destinations_connect_s3_folderpath",
"fallbackUrl": "http://www.adobe.com/go/destinations-amazon-s3-connection-parameters-en"
}
}
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"bucketName",
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2",
"name": "Azure Blob Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"path": {
"title": "Folder path",
"description": "Output path (relative) indicating where to upload the data",
"type": "string",
"pattern": "^[0-9a-zA-Z\/\\!\\-_\\.\\*\\'\\(\\)]+$"
},
"container": {
"title": "Container",
"description": "Container within the storage where to upload the data",
"type": "string",
"pattern": "^[a-z0-9](?!.*--)[a-z0-9-]{1,61}[a-z0-9]$"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"container",
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1",
"name": "Azure Data Lake Gen2",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"path": {
"title": "Folder path",
"description": "Enter the path to your Azure Data Lake Storage folder",
"type": "string"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions":{...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
"items": [
{
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8",
"name": "Data Landing Zone",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [],
"encryptionSpecs": [],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"path": {
"title": "Folder path",
"description": "Enter the path to your Azure Data Lake Storage folder",
"type": "string"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99",
"name": "Google Cloud Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"bucketName": {
"title": "Bucket name",
"description": "Bucket name",
"type": "string",
"pattern": "(?!^goog.*$)(?!^.*g(o|0)(o|0)gle.*$)(((?=^.{3,63}$)(^([a-z0-9]|[a-z0-9][a-z0-9\\-_]*)[a-z0-9]$))|((?=^.{3,222}$)(?!^(\\d+\\.)+\\d+$)(^(([a-z0-9]{1,63}|[a-z0-9][a-z0-9\\-_]{1,61}[a-z0-9])\\.)*([a-z0-9]{1,63}|[a-z0-9][a-z0-9\\-_]{1,61}[a-z0-9])$)))"
},
"path": {
"title": "Folder path",
"description": "Output path for copying files",
"type": "string",
"pattern": "^[0-9a-zA-Z\/\\!\\-_\\.\\*\\''\\(\\)]*((\\%SEGMENT_(NAME|ID)\\%)?\/?)+$"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"bucketName",
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26",
"name": "SFTP",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"remotePath": {
"title": "Folder path",
"description": "Enter your folder path",
"type": "string"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"GZIP",
"NONE"
]
}
},
"required": [
"remotePath",
"datasetFileType",
"compression",
"fileType"
]
},
//...
By using the above spec, you can construct a target connection request specific to your desired cloud storage destination, as shown in the tabs below.
Request
For other supported values of `datasetFileType`, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Amazon S3 Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"bucketName": "your-bucket-name",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "4fce964d-3f37-408f-9778-e597338a21ee", // Amazon S3 connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For other supported values of `datasetFileType`, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Azure Blob Storage Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"container": "your-container-name",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2", // Azure Blob Storage connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For other supported values of `datasetFileType`, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Azure Data Lake Gen 2(ADLS Gen2) Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1", // Azure Data Lake Gen 2(ADLS Gen2) connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For other supported values of `datasetFileType`, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Data Landing Zone Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8", // Data Landing Zone connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For other supported values of `datasetFileType`, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Google Cloud Storage Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"bucketName": "your-bucket-name",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99", // Google Cloud Storage connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For other supported values of `datasetFileType`, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "SFTP Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"remotePath": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26", // SFTP connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Note the Target Connection ID from the response. This ID will be required in the next step when creating the dataflow to export datasets.
Create a dataflow
The final step in the destination configuration is to set up a dataflow. A dataflow ties together previously created entities and also provides options for configuring the dataset export schedule. To create the dataflow, use the payloads below, depending on your desired cloud storage destination, and replace the entity IDs from previous steps.
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an Amazon S3 cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an Amazon S3 cloud storage destination",
"flowSpec": {
"id": "269ba276-16fc-47db-92b0-c1049a3c131f", // Amazon S3 flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the `scheduleParams` section, which allows you to customize export times, frequency, location, and more for your dataset exports.

| Parameter | Description |
|---|---|
| `exportMode` | Select `"DAILY_FULL_EXPORT"` or `"FIRST_FULL_THEN_INCREMENTAL"`. For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: <br> **Full file - Once**: `"DAILY_FULL_EXPORT"` can only be used in combination with `timeUnit`:`day` and `interval`:`0` for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. <br> **Incremental daily exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`day`, and `interval`:`1` for daily incremental exports. <br> **Incremental hourly exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`hour`, and `interval`:`3`, `6`, `9`, or `12` for hourly incremental exports. |
| `timeUnit` | Select `day` or `hour` depending on the frequency with which you want to export dataset files. |
| `interval` | Select `1` when the `timeUnit` is `day` and `3`, `6`, `9`, or `12` when the time unit is `hour`. |
| `startTime` | The UNIX timestamp, in seconds, when dataset exports should start. |
| `endTime` | The UNIX timestamp, in seconds, when dataset exports should end. |
| `foldernameTemplate` | Specify the expected folder name structure in your storage location where the exported files will be deposited. <br> `DATASET_ID` = A unique identifier for the dataset. <br> `DESTINATION` = The name of the destination. <br> `DATETIME` = The date and time formatted as yyyyMMdd_HHmmss. <br> `EXPORT_TIME` = The scheduled time for data export formatted as `exportTime=YYYYMMDDHHMM`. <br> `DESTINATION_INSTANCE_NAME` = The name of the specific instance of the destination. <br> `DESTINATION_INSTANCE_ID` = A unique identifier for the destination instance. <br> `SANDBOX_NAME` = The name of the sandbox environment. <br> `ORGANIZATION_NAME` = The name of the organization. |
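The startTime and endTime values are UNIX timestamps expressed in seconds. A minimal sketch of generating them from a shell, assuming GNU date is available (the syntax differs on macOS/BSD):
# Start exports at midnight UTC on a chosen date
date -u -d '2023-02-09 00:00:00' +%s   # prints 1675900800
# Or compute a start time one hour from now and an end time one year out
date -u -d '+1 hour' +%s
date -u -d '+1 year' +%s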
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an Azure Blob Storage cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an Azure Blob Storage cloud storage destination",
"flowSpec": {
"id": "95bd8965-fc8a-4119-b9c3-944c2c2df6d2", // Azure Blob Storage flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the `scheduleParams` section, which allows you to customize export times, frequency, location, and more for your dataset exports.

| Parameter | Description |
|---|---|
| `exportMode` | Select `"DAILY_FULL_EXPORT"` or `"FIRST_FULL_THEN_INCREMENTAL"`. For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: <br> **Full file - Once**: `"DAILY_FULL_EXPORT"` can only be used in combination with `timeUnit`:`day` and `interval`:`0` for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. <br> **Incremental daily exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`day`, and `interval`:`1` for daily incremental exports. <br> **Incremental hourly exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`hour`, and `interval`:`3`, `6`, `9`, or `12` for hourly incremental exports. |
| `timeUnit` | Select `day` or `hour` depending on the frequency with which you want to export dataset files. |
| `interval` | Select `1` when the `timeUnit` is `day` and `3`, `6`, `9`, or `12` when the time unit is `hour`. |
| `startTime` | The UNIX timestamp, in seconds, when dataset exports should start. |
| `endTime` | The UNIX timestamp, in seconds, when dataset exports should end. |
| `foldernameTemplate` | Specify the expected folder name structure in your storage location where the exported files will be deposited. <br> `DATASET_ID` = A unique identifier for the dataset. <br> `DESTINATION` = The name of the destination. <br> `DATETIME` = The date and time formatted as yyyyMMdd_HHmmss. <br> `EXPORT_TIME` = The scheduled time for data export formatted as `exportTime=YYYYMMDDHHMM`. <br> `DESTINATION_INSTANCE_NAME` = The name of the specific instance of the destination. <br> `DESTINATION_INSTANCE_ID` = A unique identifier for the destination instance. <br> `SANDBOX_NAME` = The name of the sandbox environment. <br> `ORGANIZATION_NAME` = The name of the organization. |
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an Azure Data Lake Gen 2(ADLS Gen2) cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an Azure Data Lake Gen 2(ADLS Gen2) cloud storage destination",
"flowSpec": {
"id": "17be2013-2549-41ce-96e7-a70363bec293", // Azure Data Lake Gen 2(ADLS Gen2) flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the `scheduleParams` section, which allows you to customize export times, frequency, location, and more for your dataset exports.

| Parameter | Description |
|---|---|
| `exportMode` | Select `"DAILY_FULL_EXPORT"` or `"FIRST_FULL_THEN_INCREMENTAL"`. For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: <br> **Full file - Once**: `"DAILY_FULL_EXPORT"` can only be used in combination with `timeUnit`:`day` and `interval`:`0` for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. <br> **Incremental daily exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`day`, and `interval`:`1` for daily incremental exports. <br> **Incremental hourly exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`hour`, and `interval`:`3`, `6`, `9`, or `12` for hourly incremental exports. |
| `timeUnit` | Select `day` or `hour` depending on the frequency with which you want to export dataset files. |
| `interval` | Select `1` when the `timeUnit` is `day` and `3`, `6`, `9`, or `12` when the time unit is `hour`. |
| `startTime` | The UNIX timestamp, in seconds, when dataset exports should start. |
| `endTime` | The UNIX timestamp, in seconds, when dataset exports should end. |
| `foldernameTemplate` | Specify the expected folder name structure in your storage location where the exported files will be deposited. <br> `DATASET_ID` = A unique identifier for the dataset. <br> `DESTINATION` = The name of the destination. <br> `DATETIME` = The date and time formatted as yyyyMMdd_HHmmss. <br> `EXPORT_TIME` = The scheduled time for data export formatted as `exportTime=YYYYMMDDHHMM`. <br> `DESTINATION_INSTANCE_NAME` = The name of the specific instance of the destination. <br> `DESTINATION_INSTANCE_ID` = A unique identifier for the destination instance. <br> `SANDBOX_NAME` = The name of the sandbox environment. <br> `ORGANIZATION_NAME` = The name of the organization. |
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to a Data Landing Zone cloud storage destination",
"description": "This operation creates a dataflow to export datasets to a Data Landing Zone cloud storage destination",
"flowSpec": {
"id": "cd2fc47e-e838-4f38-a581-8fff2f99b63a", // Data Landing Zone flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the `scheduleParams` section, which allows you to customize export times, frequency, location, and more for your dataset exports.

| Parameter | Description |
|---|---|
| `exportMode` | Select `"DAILY_FULL_EXPORT"` or `"FIRST_FULL_THEN_INCREMENTAL"`. For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: <br> **Full file - Once**: `"DAILY_FULL_EXPORT"` can only be used in combination with `timeUnit`:`day` and `interval`:`0` for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. <br> **Incremental daily exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`day`, and `interval`:`1` for daily incremental exports. <br> **Incremental hourly exports**: Select `"FIRST_FULL_THEN_INCREMENTAL"`, `timeUnit`:`hour`, and `interval`:`3`, `6`, `9`, or `12` for hourly incremental exports. |
| `timeUnit` | Select `day` or `hour` depending on the frequency with which you want to export dataset files. |
| `interval` | Select `1` when the `timeUnit` is `day` and `3`, `6`, `9`, or `12` when the time unit is `hour`. |
| `startTime` | The UNIX timestamp, in seconds, when dataset exports should start. |
| `endTime` | The UNIX timestamp, in seconds, when dataset exports should end. |
| `foldernameTemplate` | Specify the expected folder name structure in your storage location where the exported files will be deposited. <br> `DATASET_ID` = A unique identifier for the dataset. <br> `DESTINATION` = The name of the destination. <br> `DATETIME` = The date and time formatted as yyyyMMdd_HHmmss. <br> `EXPORT_TIME` = The scheduled time for data export formatted as `exportTime=YYYYMMDDHHMM`. <br> `DESTINATION_INSTANCE_NAME` = The name of the specific instance of the destination. <br> `DESTINATION_INSTANCE_ID` = A unique identifier for the destination instance. <br> `SANDBOX_NAME` = The name of the sandbox environment. <br> `ORGANIZATION_NAME` = The name of the organization. |
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the inline comments in the request example below, which provide additional information about specific lines. Remove the inline comments when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to a Google Cloud Storage cloud storage destination",
"description": "This operation creates a dataflow to export datasets to a Google Cloud Storage destination",
"flowSpec": {
"id": "585c15c4-6cbf-4126-8f87-e26bff78b657", // Google Cloud Storage flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the inline comments in the request example below, which provide additional information about specific lines. Remove the inline comments when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an SFTP cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an SFTP cloud storage destination",
"flowSpec": {
"id": "354d6aad-4754-46e4-a576-1b384561c440", // SFTP flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Note the Dataflow ID from the response. This ID is required in the next step, when retrieving the dataflow runs to validate the successful dataset exports.
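If you are scripting these calls, one way to capture the dataflow ID is to save the response body to a file and read the id field with a JSON tool such as jq. The sketch below assumes the response was saved to a file named create_flow_response.json (a placeholder name):
# Sketch: read the new dataflow ID from a saved create-flow response.
# create_flow_response.json is a placeholder; use wherever you stored the response body.
DATAFLOW_ID=$(jq -r '.id' create_flow_response.json)
echo "${DATAFLOW_ID}"   # for the sample response above, this prints eb54b3b3-3949-4f12-89c8-64eafaba858f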
Get the dataflow runs
To check the executions of a dataflow, use the Dataflow Runs API:
Request
In the request to retrieve dataflow runs, add the dataflow ID that you obtained in the previous step, when creating the dataflow, as a query parameter.
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/runs?property=flowId==eb54b3b3-3949-4f12-89c8-64eafaba858f' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "4b7728dd-83c9-4c38-95a4-24ddab545404",
"createdAt": 1675807718296,
"updatedAt": 1675807731834,
"createdBy": "aep_activation_batch@AdobeID",
"updatedBy": "acp_foundation_connectors@AdobeID",
"createdClient": "aep_activation_batch",
"updatedClient": "acp_foundation_connectors",
"sandboxId": "7dfdcd30-0a09-11ea-8ea6-7bf93ce86c28",
"sandboxName": "sand-1",
"imsOrgId": "5555467B5D8013E50A494220@AdobeOrg",
"flowId": "aae5ec63-b0ac-4808-9a44-abf2ea67bd5a",
"flowSpec": {
"id": "615d3489-36d2-4671-9467-4ae1129facd3",
"version": "1.0"
},
"providerRefId": "ba56f98e0c49b572adb249980c39b1c7",
"etag": "\"08005e9e-0000-0200-0000-63e2cbf30000\"",
"metrics": {
"durationSummary": {
"startedAtUTC": 1675807719411,
"completedAtUTC": 1675807719416
},
"sizeSummary": {
"inputBytes": 0
},
"recordSummary": {
"inputRecordCount": 0,
"skippedRecordCount": 0,
"sourceSummaries": [
{
"id": "ea2b1205-4692-49de-b448-ebf75b1d188a",
"inputRecordCount": 0,
"skippedRecordCount": 0,
"entitySummaries": [
{
//...
You can find information about the various parameters returned by the Dataflow runs API in the API reference documentation.
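If you prefer to check dataflow runs from a script, the sketch below lists each run's ID together with its start and completion timestamps. It assumes jq is installed and that the standard Experience Platform header values are available as environment variables; adjust the flow ID to your own dataflow.
# Sketch: list dataflow runs for a given flow ID and print each run's ID and duration summary.
# API_KEY, ORG_ID, SANDBOX_NAME, and ACCESS_TOKEN are assumed to be exported as environment variables.
FLOW_ID="eb54b3b3-3949-4f12-89c8-64eafaba858f"
curl --silent --request GET \
  "https://platform.adobe.io/data/foundation/flowservice/runs?property=flowId==${FLOW_ID}" \
  --header 'accept: application/json' \
  --header "x-api-key: ${API_KEY}" \
  --header "x-gw-ims-org-id: ${ORG_ID}" \
  --header "x-sandbox-name: ${SANDBOX_NAME}" \
  --header "Authorization: Bearer ${ACCESS_TOKEN}" \
  | jq '.items[] | {id: .id, startedAtUTC: .metrics.durationSummary.startedAtUTC, completedAtUTC: .metrics.durationSummary.completedAtUTC}'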
Verify successful dataset export
When exporting datasets, Experience Platform creates a .json
or .parquet
file in the storage location that you provided. Expect a new file to be deposited in your storage location according to the export schedule you provided when creating a dataflow.
Experience Platform creates a folder structure in the storage location you specified, where it deposits the exported dataset files. A new folder is created for each export time, following the pattern below:
folder-name-you-provided/datasetID/exportTime=YYYYMMDDHHMM
The default file name is randomly generated and ensures that exported file names are unique.
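As a purely illustrative example, assuming your foldernameTemplate resolves to a folder named exports and a hypothetical dataset ID, an export scheduled for 8 February 2023 at 12:00 could deposit files at a path similar to the following (the file name itself is randomly generated):
exports/63e1abc123456789abcdef01/exportTime=202302081200/<randomly-generated-file-name>.parquet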
Sample dataset files
The presence of these files in your storage location is confirmation of a successful export. To understand how the exported files are structured, you can download a sample .parquet file or .json file.
Compressed dataset files
In the step to create a target connection, you can select whether the exported dataset files should be compressed.
Note the difference in the exported file format between the two file types when compression is enabled:
- When exporting compressed JSON files, the exported file format is
json.gz
- When exporting compressed parquet files, the exported file format is
gz.parquet
- JSON files can be exported in a compressed mode only.
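If you want to spot-check a compressed JSON export after downloading it from your storage location, a minimal sketch is shown below (the file name is a placeholder for your actual exported file):
# Sketch: preview the first few records of a downloaded, compressed JSON export.
# part-00000-example.json.gz is a placeholder file name.
gunzip -c part-00000-example.json.gz | head -n 5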
API error handling
The API endpoints in this tutorial follow the general Experience Platform API error message principles. Refer to API status codes and request header errors in the Experience Platform troubleshooting guide for more information on interpreting error responses.
Known limitations
View known limitations about dataset exports.
Frequently Asked Questions
View a list of frequently asked questions about dataset exports.
Next steps
By following this tutorial, you have successfully connected Experience Platform to one of your preferred batch cloud storage destinations and set up a dataflow to the respective destination to export datasets. See the following pages for more details, such as how to edit existing dataflows using the Flow Service API: