Monitor dataflows for sources in the UI

IMPORTANT
Streaming sources, such as the HTTP API source, are not currently supported by the monitoring dashboard. At this time, you can only use the dashboard to monitor batch sources.

Read this document to learn how to monitor your sources dataflows using the monitoring dashboard in the Experience Platform UI.

Get started

This tutorial requires a working understanding of the following components of Adobe Experience Platform:

  • Dataflows: Dataflows are a representation of data jobs that move data across Platform. Dataflows are configured across different services, helping move data from source connectors to target datasets, to Identity and Profile, and to Destinations.
    • Dataflow runs: Dataflow runs are the recurring scheduled jobs based on the frequency configuration of selected dataflows.
  • Sources: Experience Platform allows data to be ingested from various sources while providing you with the ability to structure, label, and enhance incoming data using Platform services.
  • Identity Service: Gain a better view of individual customers and their behavior by bridging identities across devices and systems.
  • Real-Time Customer Profile: Provides a unified, real-time consumer profile based on aggregated data from multiple sources.
  • Sandboxes: Experience Platform provides virtual sandboxes which partition a single Platform instance into separate virtual environments to help develop and evolve digital experience applications.

Monitor your sources data using the monitoring dashboard

In the monitoring dashboard, select Sources from the main header to update your dashboard with a display of your sources dataflow ingestion rate.

The monitoring dashboard with the sources card selected.

The Ingestion rate graph displays your data ingestion rate based on your configured time frame. By default, the monitoring dashboard displays the ingestion rate for the last 24 hours. For steps on how to configure your time frame, read the guide on configuring your monitoring time frame.

The graph is displayed by default. To hide the graph, select the Metrics and graphs toggle to disable it.

The ingestion rate metrics graph.

The lower part of the dashboard displays a table that outlines the current metrics report for all existing sources dataflows.

The monitoring dashboard metrics table.

  • Records received: The total number of records received from a given source.
  • Records ingested: The total number of records ingested into the data lake.
  • Records skipped: The total number of records skipped. Records are skipped when they contain errors in fields that are not required for ingestion. For example, if you create a sources dataflow with partial ingestion enabled, you can configure an acceptable error rate threshold for your dataflow. During ingestion, records with errors in non-required fields, such as identity fields, are skipped, so long as they remain within the error threshold.
  • Records failed: The total number of records that could not be ingested due to errors.
  • Ingested rate: The percentage of records ingested out of the total number of records received.
  • Total failed dataflows: The total number of dataflows that failed.
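As a point of reference, the Ingested rate metric can be understood as the ratio of records ingested to records received. The following sketch illustrates the calculation with hypothetical metric values; it is illustrative only, and is not how the dashboard itself computes the metric.

```python
# Hypothetical metric values for a single sources dataflow.
records_received = 10_000
records_ingested = 9_700
records_skipped = 200
records_failed = 100

# Ingested rate: percentage of received records that were successfully ingested.
ingested_rate = records_ingested / records_received * 100
print(f"Ingested rate: {ingested_rate:.1f}%")  # Ingested rate: 97.0%

# Assumption for this sketch: every received record is accounted for
# as either ingested, skipped, or failed.
assert records_ingested + records_skipped + records_failed == records_received
```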

You can further filter your data using the options provided above the metrics table:

  • Search: Use the search bar to filter your view to a single source type.
  • Sources: Select Sources to filter your view and display metric data per source type. This is the default display of the monitoring dashboard.
  • Dataflows: Select Dataflows to filter your view and display metric data per dataflow.
  • Show failures only: Select Show failures only to display only the dataflows that reported ingestion failures.
  • My sources: Use the My sources dropdown menu to further filter your view by category. Select All sources to display metrics for all sources, or select My sources to display only the sources for which you have a corresponding account.

To monitor the data that is being ingested in a specific dataflow, select the filter icon beside a source.

Monitor a specific dataflow by selecting the filter icon beside a given source.

The metrics table updates to a table of active dataflows that correspond to the source that you selected. From here, you can view additional information about your dataflows, including their corresponding dataset and data type, as well as a timestamp that indicates when they were last active.

To further inspect a dataflow, select the filter icon beside a dataflow.

The dataflows table in the monitoring dashboard.

Next, you are taken to an interface that lists all dataflow run iterations of the dataflow that you selected.

A dataflow run represents a single instance of dataflow execution. For example, if a dataflow is scheduled to run hourly at 9:00 AM, 10:00 AM, and 11:00 AM, then you would have three flow run instances. Flow runs are specific to your organization.
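To make the relationship between a schedule and its run instances concrete, the following sketch enumerates the expected run start times for a dataflow with an hourly frequency. The schedule values are hypothetical and for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical schedule for a sources dataflow: hourly runs starting at 9:00 AM.
schedule_start = datetime(2023, 1, 1, 9, 0)
frequency = timedelta(hours=1)

# Each scheduled interval that has elapsed corresponds to one dataflow run instance.
now = datetime(2023, 1, 1, 11, 30)
run_starts = []
next_run = schedule_start
while next_run <= now:
    run_starts.append(next_run)
    next_run += frequency

for i, start in enumerate(run_starts, start=1):
    print(f"Flow run {i} started at {start:%I:%M %p}")
# Flow run 1 started at 09:00 AM
# Flow run 2 started at 10:00 AM
# Flow run 3 started at 11:00 AM
```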

To inspect the metrics of a specific dataflow run iteration, select the filter icon beside your dataflow.

The dataflow run metric page.

Use the dataflow run details page to view metrics and information of your selected run iteration.

The dataflow run details page.

  • Records ingested: The total number of records ingested by the dataflow run.
  • Records failed: The total number of records that were not ingested due to errors in the dataflow run.
  • Total files: The total number of files in the dataflow run.
  • Size of data: The total size of the data contained in the dataflow run.
  • Dataflow run ID: The ID of the dataflow run iteration.
  • Org ID: The ID of the organization in which the dataflow run was created.
  • Status: The status of the dataflow run.
  • Dataflow run start: A timestamp that indicates when the dataflow run started.
  • Dataflow run end: A timestamp that indicates when the dataflow run ended.
  • Dataset: The dataset used to create the dataflow.
  • Data type: The type of data in the dataflow.
  • Partial ingestion: Partial batch ingestion is the ability to ingest data containing errors, up to a configurable threshold. This feature lets you successfully ingest all of your accurate data into Experience Platform, while your incorrect data is batched separately with information on why it is invalid. You can enable partial ingestion during the dataflow creation process.
  • Error diagnostics: Error diagnostics instructs the source to produce error diagnostics that you can later reference when monitoring your dataset activity and dataflow status. You can enable error diagnostics during the dataflow creation process.
  • Error summary: For a failed dataflow run, the error summary displays an error code and description that summarize why the run iteration failed.
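If you prefer to retrieve run details programmatically rather than through the UI, dataflow runs can also be listed through the Flow Service API. The following is a minimal sketch, assuming a valid access token and the standard Experience Platform gateway headers; the dataflow ID and credential values are placeholders that you must substitute with your own.

```python
import requests

# Placeholder credentials and IDs; substitute your own values.
ACCESS_TOKEN = "{ACCESS_TOKEN}"
API_KEY = "{API_KEY}"
ORG_ID = "{ORG_ID}"
SANDBOX_NAME = "{SANDBOX_NAME}"
FLOW_ID = "{FLOW_ID}"

# List run iterations for a given dataflow using the Flow Service API.
response = requests.get(
    "https://platform.adobe.io/data/foundation/flowservice/runs",
    params={"property": f"flowId=={FLOW_ID}"},
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "x-api-key": API_KEY,
        "x-gw-ims-org-id": ORG_ID,
        "x-sandbox-name": SANDBOX_NAME,
    },
)
response.raise_for_status()

for run in response.json().get("items", []):
    # Each item describes one run iteration, including its status and metrics.
    print(run["id"], run.get("metrics", {}))
```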

If your dataflow run reports errors, you can scroll down to the bottom of the page to use the Dataflow run errors interface.

Use the Records failed section to view metrics on records that were not ingested due to errors. To view a comprehensive error report, select Preview error diagnostics. To download a copy of your error diagnostics and file manifest, select Download, and then copy the example API call to use with the Data Access API.
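For reference, downloading error diagnostics with the Data Access API generally follows the pattern below. This is a minimal sketch assuming the failed-records endpoint for a batch; the batch ID and credential values are placeholders, and the authoritative call to use is the example API call copied from the Download dialog.

```python
import requests

# Placeholder values; prefer the example API call copied from the Download dialog.
ACCESS_TOKEN = "{ACCESS_TOKEN}"
API_KEY = "{API_KEY}"
ORG_ID = "{ORG_ID}"
SANDBOX_NAME = "{SANDBOX_NAME}"
BATCH_ID = "{BATCH_ID}"

headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "x-api-key": API_KEY,
    "x-gw-ims-org-id": ORG_ID,
    "x-sandbox-name": SANDBOX_NAME,
}

# Retrieve the manifest of failed records for the batch that reported errors.
manifest = requests.get(
    f"https://platform.adobe.io/data/foundation/export/batches/{BATCH_ID}/failed",
    headers=headers,
)
manifest.raise_for_status()

# Each entry in the manifest points to a downloadable file of failed records.
for item in manifest.json().get("data", []):
    print(item.get("name"), item.get("length"))
```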

NOTE
You may only use error diagnostics if the feature was enabled during the source connection creation process.

The dataflow run errors panel.

Next steps

By following this tutorial, you have monitored the ingestion of your sources dataflows at the source level using the monitoring dashboard, and identified the errors that contributed to dataflow failures during the ingestion process.
