# Monitor data lake ingestion
Read this document to learn how to use the monitoring dashboard in the Experience Platform UI to monitor data lake ingestion.
## Get started {#get-started}
This tutorial requires a working understanding of the following components of Adobe Experience Platform:
- Dataflows: Dataflows are a representation of data jobs that move data across Experience Platform. Dataflows are configured across different services, helping move data from source connectors to target datasets, to Identity and Profile, and to Destinations.
- Dataflow runs: Dataflow runs are the recurring scheduled jobs based on the frequency configuration of selected dataflows.
- Sources: Experience Platform allows data to be ingested from various sources while providing you with the ability to structure, label, and enhance incoming data using Experience Platform services.
- Identity Service: Gain a better view of individual customers and their behavior by bridging identities across devices and systems.
- Real-Time Customer Profile: Provides a unified, real-time consumer profile based on aggregated data from multiple sources.
- Sandboxes: Experience Platform provides virtual sandboxes which partition a single Experience Platform instance into separate virtual environments to help develop and evolve digital experience applications.
## Use the monitoring dashboard for data lake ingestion
Select Data lake from the main header in the monitoring dashboard to view your data lake ingestion rate.
The Ingestion rate graph displays your data ingestion rate based on your configured time frame. By default, the monitoring dashboard displays ingestion rates from the last 24 hours. For steps on how to configure your time frame, read the guide on configuring monitoring time frame.
The graph is displayed by default. To hide it, select Metrics and graphs and disable the toggle.
The lower part of the dashboard displays a table that outlines the current metrics report for all existing source dataflows.
You can further filter your data using the options provided above the metrics table:
To monitor the data that is being ingested through a specific source, select the filter icon next to that source.
The metrics table updates to a table of active dataflows that correspond to the source that you selected. During this step, you can view additional information on your dataflows, including their corresponding dataset and data type, as well as a time stamp to indicate when they were last active.
To further inspect a dataflow, select the filter icon next to the dataflow.
Next, you are taken to an interface that lists all dataflow run iterations of the dataflow that you selected.
Dataflow runs represent an instance of dataflow execution. For example, if a dataflow is scheduled to run hourly at 9:00 AM, 10:00 AM, and 11:00 AM, then you would have three flow run instances. Flow runs are specific to your organization.
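Dataflow runs can also be retrieved programmatically through the Flow Service API. The sketch below shows how such a request might be assembled; the `{FLOW_ID}` and credential values are placeholders, and the query format is an assumption based on the standard Experience Platform API headers, not a definitive call.

```python
# Hypothetical sketch: build a "list dataflow runs" request for the
# Flow Service API. All credential values below are placeholders.
PLATFORM_HOST = "https://platform.adobe.io"

def build_runs_request(flow_id: str, access_token: str, api_key: str,
                       org_id: str, sandbox: str) -> tuple[str, dict]:
    """Return the URL and headers for listing runs of one dataflow."""
    # Filter runs to the selected dataflow via a property query.
    url = f"{PLATFORM_HOST}/data/foundation/flowservice/runs?property=flowId=={flow_id}"
    headers = {
        "Authorization": f"Bearer {access_token}",  # IMS access token
        "x-api-key": api_key,                       # client ID
        "x-gw-ims-org-id": org_id,                  # organization ID
        "x-sandbox-name": sandbox,                  # sandbox to query
    }
    return url, headers

# Example usage with placeholder values:
url, headers = build_runs_request(
    "{FLOW_ID}", "{ACCESS_TOKEN}", "{API_KEY}", "{ORG_ID}", "prod")
```

Each run returned by such a request corresponds to one iteration shown in the monitoring UI.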
To inspect the metrics of a specific dataflow run iteration, select the filter icon next to that run.
Use the dataflow run details page to view metrics and information of your selected run iteration.
If your dataflow run reports errors, you can scroll down to the bottom of the page to use the Dataflow run errors interface.
Use the Records failed section to view metrics on records that were not ingested due to errors. To view a comprehensive error report, select Preview error diagnostics. To download a copy of your error diagnostics and file manifest, select Download, then copy the example API call to use with the Data Access API.
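A minimal sketch of how a Data Access API request for error diagnostics might be assembled is shown below. The `{BATCH_ID}` and credential values are placeholders, and the `meta?path=row_errors` endpoint is an assumption for illustration; copy the actual API call provided in the Download dialog.

```python
# Hypothetical sketch: build a Data Access API request for the error
# diagnostics of a failed batch. All values below are placeholders.
PLATFORM_HOST = "https://platform.adobe.io"

def build_error_diagnostics_request(batch_id: str, access_token: str,
                                    api_key: str, org_id: str,
                                    sandbox: str) -> tuple[str, dict]:
    """Return the URL and headers for fetching batch error diagnostics."""
    # Assumed endpoint shape; the UI supplies the exact call to copy.
    url = (f"{PLATFORM_HOST}/data/foundation/export/batches/"
           f"{batch_id}/meta?path=row_errors")
    headers = {
        "Authorization": f"Bearer {access_token}",  # IMS access token
        "x-api-key": api_key,                       # client ID
        "x-gw-ims-org-id": org_id,                  # organization ID
        "x-sandbox-name": sandbox,                  # sandbox to query
    }
    return url, headers

# Example usage with placeholder values:
url, headers = build_error_diagnostics_request(
    "{BATCH_ID}", "{ACCESS_TOKEN}", "{API_KEY}", "{ORG_ID}", "prod")
```

The same header set applies to the file-manifest download; only the URL copied from the dialog differs.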
## Next steps {#next-steps}
By following this tutorial, you learned how to monitor the data lake ingestion rate using the Monitoring dashboard. You also learned to identify errors that cause dataflow failures during ingestion. See the following documents for more details: