Monitor Dataflows for Streaming Sources in the UI

This tutorial covers the steps for monitoring dataflows for streaming sources using the Sources workspace.

Getting started

This tutorial requires a working understanding of the following components of Adobe Experience Platform:

  • Dataflows: Dataflows are a representation of data jobs that move data across Platform. Dataflows are configured across different services, helping move data from source connectors to target datasets, to Identity and Profile, and to Destinations.
    • Dataflow runs: Dataflow runs are the recurring scheduled jobs based on the frequency configuration of selected dataflows.
  • Sources: Experience Platform allows data to be ingested from various sources while providing you with the ability to structure, label, and enhance incoming data using Platform services.
  • Sandboxes: Experience Platform provides virtual sandboxes which partition a single Platform instance into separate virtual environments to help develop and evolve digital experience applications.

Monitor dataflows for streaming sources

In the Platform UI, select Sources from the left navigation bar to access the Sources workspace. The Catalog screen displays a variety of sources for which you can create an account.

To view existing dataflows for streaming sources, select Dataflows from the top header.

catalog

The Dataflows page contains a list of all existing dataflows in your organization, including information about their source data, account name, and dataflow run status.

Select the name of the dataflow you want to view.

dataflows

The following statuses may appear for dataflow runs:

  • Completed: Indicates that all records for the corresponding dataflow run were processed within the one-hour period. A Completed status can still contain errors in dataflow runs.
  • Processing: Indicates that a dataflow is not yet active. This status is often encountered immediately after a new dataflow is created.
  • Error: Indicates that the activation process of a dataflow has been disrupted.

The Dataflow Activity page displays specific information on your streaming dataflow. The top banner contains the cumulative number of records ingested and records failed for all of your streaming dataflow runs in your selected date range.

The lower half of the page displays information on the number of records received, ingested, and failed, per flow run. Each flow run is recorded within an hourly window.

dataflow-activity

Each individual dataflow run shows the following details:

  • Dataflow run start: The time at which the dataflow run started.
  • Processing time: The amount of time the dataflow run took to process.
  • Records Received: The total number of records received in the dataflow from a source connector.
  • Records Ingested: The total count of records ingested into Data Lake.
  • Records Failed: The number of records that were not ingested into Data Lake due to errors in the data.
  • Ingestion Rate: The success rate of records ingested into Data Lake. This metric is applicable when Partial Ingestion is enabled.
  • Status: Represents the state the dataflow is in: either Completed or Processing. Completed means that all records for the corresponding dataflow run were processed within the one-hour period. Processing means that the dataflow run has not yet finished.
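The relationships between these per-run metrics can be sketched in a few lines of code. This is a minimal illustration, not part of any Adobe SDK: the FlowRunMetrics type and its field names are hypothetical, and it assumes the ingestion rate is the share of received records that were ingested and that a run is Completed once every received record has been either ingested or failed.

```python
from dataclasses import dataclass

@dataclass
class FlowRunMetrics:
    """Hypothetical summary of one hourly dataflow run (not an official SDK type)."""
    start_time: str
    records_received: int
    records_ingested: int
    records_failed: int

    @property
    def ingestion_rate(self) -> float:
        """Success rate of records ingested into Data Lake, as a percentage."""
        if self.records_received == 0:
            return 0.0
        return 100.0 * self.records_ingested / self.records_received

    @property
    def status(self) -> str:
        """Completed once every received record was processed; otherwise Processing."""
        processed = self.records_ingested + self.records_failed
        return "Completed" if processed >= self.records_received else "Processing"

run = FlowRunMetrics("2023-01-01T09:00:00Z", 1000, 950, 50)
print(run.ingestion_rate)  # 95.0
print(run.status)          # Completed
```

Note that, as in the status table above, a Completed run can still include failed records; the status only means the run finished processing within its window.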

By default, the data displayed contains ingestion rates from the last seven days. Select Last 7 days to adjust the time frame of records displayed.

change-time

A calendar pop-up window appears, providing you options for alternative ingestion time frames. Select Last 30 days and then select Apply.

calendar
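The time-frame selection above amounts to filtering runs by their start time against a cutoff. The sketch below mirrors the UI's Last 7 days and Last 30 days options; the run dictionaries and their "start" key are illustrative assumptions, not a real API payload.

```python
from datetime import datetime, timedelta, timezone

def filter_runs_by_window(runs, days):
    """Keep only runs whose start time falls within the last `days` days,
    mirroring the UI's 'Last 7 days' / 'Last 30 days' options."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return [r for r in runs if r["start"] >= cutoff]

now = datetime.now(timezone.utc)
runs = [
    {"start": now - timedelta(days=2)},   # inside both windows
    {"start": now - timedelta(days=10)},  # inside 30 days only
    {"start": now - timedelta(days=40)},  # outside both windows
]
print(len(filter_runs_by_window(runs, 7)))   # 1
print(len(filter_runs_by_window(runs, 30)))  # 2
```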

To view the details of a specific dataflow run, including its errors, select the run’s start time from the list.

select-fail

The Dataflow run overview page contains additional information on your dataflow, such as its corresponding dataflow run ID, target dataset, and IMS organization ID.

A flow run with errors also contains the Dataflow run errors panel, which displays the particular error that led to the failure of the run, as well as the total count of records that failed.

failure
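Conceptually, the Dataflow run errors panel tallies failed records by the error that caused them. The sketch below shows that kind of summary over a hypothetical list of failed records; the record shape and the error codes are made up for illustration and do not correspond to documented Platform error codes.

```python
from collections import Counter

def summarize_run_errors(failed_records):
    """Tally failed records by error code, similar to what the Dataflow run
    errors panel surfaces. Assumes a hypothetical record shape:
    [{"errorCode": ..., "message": ...}, ...]."""
    counts = Counter(rec["errorCode"] for rec in failed_records)
    return sum(counts.values()), dict(counts)

failed = [
    {"errorCode": "ERROR-201", "message": "schema validation failed"},
    {"errorCode": "ERROR-201", "message": "schema validation failed"},
    {"errorCode": "ERROR-118", "message": "malformed timestamp"},
]
total, by_code = summarize_run_errors(failed)
print(total)    # 3
print(by_code)  # {'ERROR-201': 2, 'ERROR-118': 1}
```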

Next steps

By following this tutorial, you have successfully used the Sources workspace to monitor your streaming dataflows and identify the errors that led to any failed dataflows. See the following documents for more information:
