Guardrails for Data Ingestion

Guardrails are thresholds that provide guidance for data and system usage, performance optimization, and avoidance of errors or unexpected results in Adobe Experience Platform. Guardrails can refer to your usage or consumption of data and processing in relation to your licensing entitlements.

This document provides guidance on guardrails for data ingestion in Adobe Experience Platform.

Guardrails for batch ingestion

The following table outlines guardrails to consider when using the batch ingestion API or sources:

Each entry lists the type of ingestion, its guidelines, and any notes.
Data lake ingestion using the batch ingestion API
  • You can ingest up to 20 GB of data per hour to data lake using the batch ingestion API.
  • The maximum number of files per batch is 1500.
  • The maximum batch size is 100 GB.
  • The maximum number of properties or fields per row is 10000.
  • The maximum number of batches per user, per minute is 138.
Data lake ingestion using batch sources
  • You can ingest up to 200 GB of data per hour to data lake using batch ingestion sources such as Azure Blob, Amazon S3, and SFTP.
  • A batch size should be between 256 MB and 100 GB.
  • The maximum number of files per batch is 1500.
Note: See the sources overview for a catalog of sources you can use for data ingestion.
Batch ingestion to Profile
  • You can ingest up to 120 GB of data per hour.
  • The maximum size of a record class is 100 KB (soft limit).
  • The maximum size of an ExperienceEvent class is 10 KB (soft limit).
  • The maximum size of a single record is 1 MB.
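
The batch ingestion API guardrails above can be checked before a batch is uploaded. The following is a minimal Python sketch, not an official Adobe utility: the `BatchPlan` class and both function names are hypothetical, and it assumes that 1 GB means 10^9 bytes, which this document does not specify.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the batch ingestion API guardrails listed above.
MAX_FILES_PER_BATCH = 1500
MAX_BATCH_SIZE_GB = 100
MAX_FIELDS_PER_ROW = 10000
MAX_GB_PER_HOUR = 20

# Assumption: decimal gigabytes. The document does not define the unit.
BYTES_PER_GB = 10**9


@dataclass
class BatchPlan:
    """Hypothetical description of a batch before it is uploaded."""
    file_count: int
    total_bytes: int
    max_fields_per_row: int


def check_batch_plan(plan: BatchPlan) -> List[str]:
    """Return guardrail violations; an empty list means none were found."""
    problems = []
    if plan.file_count > MAX_FILES_PER_BATCH:
        problems.append(
            f"{plan.file_count} files exceeds the limit of {MAX_FILES_PER_BATCH}"
        )
    if plan.total_bytes > MAX_BATCH_SIZE_GB * BYTES_PER_GB:
        problems.append(
            f"batch size exceeds the limit of {MAX_BATCH_SIZE_GB} GB"
        )
    if plan.max_fields_per_row > MAX_FIELDS_PER_ROW:
        problems.append(
            f"{plan.max_fields_per_row} fields per row exceeds the limit of "
            f"{MAX_FIELDS_PER_ROW}"
        )
    return problems


def min_hours_to_ingest(total_gb: float, gb_per_hour: float = MAX_GB_PER_HOUR) -> float:
    """Lower bound on ingestion time at the hourly throughput guardrail."""
    return total_gb / gb_per_hour
```

For example, `min_hours_to_ingest(60)` gives a lower bound of three hours for 60 GB at the 20 GB per hour guardrail. Actual ingestion time depends on factors this sketch does not model.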

Guardrails for streaming ingestion

The following table outlines guardrails to consider when using the streaming ingestion API or streaming sources:

Each entry lists the type of ingestion, its guidelines, and any notes.
Streaming ingestion
  • The maximum record size is 1 MB, with the recommended size being 10 KB.
  • You can send up to 20000 requests per second to Profile, with processing completing in under one minute.
  • You can send up to 20000 requests per second to data lake, with processing completing in under 15 minutes.
Note: Use the batch ingestion API if you require a higher data throughput.
Streaming sources
  • The maximum record size is 1 MB, with the recommended size being 10 KB.
  • Streaming sources support between 4000 and 5000 requests per second upon the creation of a new source connection. Note: It can take up to 30 minutes for streaming data to be completely processed to data lake.
  • You can process between 4000 and 5000 requests per second to data lake. Note: It can take up to 30 minutes for streaming data to be completely processed to data lake.
Note: Streaming sources such as Kafka, Azure Event Hubs, and Amazon Kinesis do not use the Data Collection Core Service (DCCS) route and can have different throughput limits. See the sources overview for a catalog of sources you can use for data ingestion.
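
The record size guardrails for streaming (1 MB maximum, 10 KB recommended) can be checked on the client before a record is sent. The following is a minimal Python sketch under stated assumptions: the function name `record_size_status` is hypothetical, size is measured on the UTF-8 encoded JSON form of the record, and 1 KB and 1 MB are taken as 10^3 and 10^6 bytes. None of these choices are confirmed by this document.

```python
import json

# Limits copied from the streaming guardrails listed above.
# Assumption: decimal units (1 KB = 1000 bytes, 1 MB = 1,000,000 bytes).
MAX_RECORD_BYTES = 1_000_000
RECOMMENDED_RECORD_BYTES = 10_000


def record_size_status(record: dict) -> str:
    """Classify a record by the size of its UTF-8 encoded JSON form.

    Returns "too_large" above the 1 MB maximum, "above_recommended"
    above the 10 KB recommended size, and "ok" otherwise.
    """
    size = len(json.dumps(record).encode("utf-8"))
    if size > MAX_RECORD_BYTES:
        return "too_large"
    if size > RECOMMENDED_RECORD_BYTES:
        return "above_recommended"
    return "ok"
```

A caller might reject records classified as `"too_large"` and log those classified as `"above_recommended"`, since only the 1 MB value is stated as a maximum.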

Next steps

See the following documentation for more information on data and processing guardrails in Experience Platform:
