Monitor data ingestion

Learn how to monitor and track data that is ingested into Adobe Experience Platform using the monitoring dashboard. The monitoring dashboard provides a top-down view of how source data is processed through the data lake into the Identity and Profile services, along with actionable insights. For more information, see the documentation.

Transcript
Hi, this is Daniel. In this video, I’m going to show you two different ways to monitor the success of your data ingestion into Adobe Experience Platform. As a data engineer, you want to be able to see that your data is ingested into the data lake, identity graphs, and Real-Time Customer Profile. And if it’s not successful, you need to know why so you can respond appropriately. Here I am on the monitoring dashboard, which you can get to from Monitoring in the left navigation, in the Data Management section. You can see the three areas I just mentioned at the top of the dashboard. You can also monitor activation flows out of Platform, such as segment jobs running and being sent to destinations. But those are typically concerns for marketers, so we’ll cover them in a separate video.
Oh, and if you don’t see this monitoring screen, you might need an administrator at your company to grant you the Data Monitoring permission item. But before we dig into the monitoring dashboard, I want to show you the other way to see what’s going on with your data ingestion, which is by using the alerts feature. When you configure a data flow in a source connector, you’re given the option to subscribe to events related to a run starting, completing, failing, or not ingesting any records within a certain period of time.
For existing data flows, you can click the three-dot icon and subscribe to alerts from the modal.
You can also go to Alerts and subscribe to different alert types, although this will apply to all sources in this sandbox.
Now, how will you be alerted? Well, these alerts will appear in the notifications area of Experience Cloud, but only if you have Experience Platform toggled on in your notification preferences and also have them enabled for Platform alerts. There are additional preferences here, including the ability to receive email alerts, which can be configured to send immediately or in periodic digests.
Okay, so let’s go back to the dashboard. What is this telling me? I can see that of the records that were received, most were successful but some failed. All of the records ingested into profile-enabled datasets were ingested into Identity Service. They also made it to Profile Service, creating and then updating a number of profile fragments. In this 30-day time window, data ingestion occurred on two of the days, and I can see that all of the data ingestion came from this one type of source. To drill into this source, I can click the filter icon, which will then expose all of the data flows and runs.
So on May 26th, three data flows ran. One failed entirely, the next was partially ingested, and the third was completely successful. And note that I can switch this view to see what happened in these data flow runs with Identity Service and Profile Service by selecting those at the top.
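If you want to check the same data flow run status from a script rather than the UI, the Flow Service API exposes run details. Below is a minimal Python sketch; the flow ID, credentials, and sandbox name are placeholders from your own Adobe Developer Console project, and the exact fields returned for each run can vary by source type.

```python
# Minimal sketch: list the runs for one data flow with the Flow Service API.
# All credential values and FLOW_ID are placeholders.
import requests

ACCESS_TOKEN = "<ACCESS_TOKEN>"   # from your Adobe Developer Console project
API_KEY = "<API_KEY>"
ORG_ID = "<IMS_ORG_ID>"
SANDBOX = "prod"                  # sandbox you are monitoring
FLOW_ID = "<DATAFLOW_ID>"

headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "x-api-key": API_KEY,
    "x-gw-ims-org-id": ORG_ID,
    "x-sandbox-name": SANDBOX,
}

# Ask Flow Service for only the runs belonging to this data flow.
resp = requests.get(
    "https://platform.adobe.io/data/foundation/flowservice/runs",
    headers=headers,
    params={"property": f"flowId=={FLOW_ID}"},
)
resp.raise_for_status()

for run in resp.json().get("items", []):
    # Print the run ID plus whatever status/metrics fields the run reports.
    print(run.get("id"), run.get("status"), run.get("metrics"))
```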
So what happened with these failed ingestions? Well, let’s look at the one that was partially ingested. I can drill into that by selecting the filter icon.
Here I can see that there was just one run. I’ll click the filter again to drill into that specific run, and now I can see the error codes related to that batch. It looks like a required field was missing. I can preview the error diagnostics to get a sample with more details. It looks like the loyalty ID field was missing in 21% of the records. But since this data flow had a 50% error threshold, the valid records were still ingested. Full diagnostics can be downloaded too. This opens a dialog with a curl command you can run to download everything. And again, I can see what happened with the Identity Service and Profile Service portions of the ingestion by selecting those at the top.
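As a rough sketch of how you might read that same diagnostic data outside the UI, you can list a batch’s error metadata with the Data Access API. The curl command in the download dialog is the source of truth for your environment, so treat the batch ID, credentials, and the row_errors path below as assumptions to verify against it.

```python
# Minimal sketch: list the error diagnostic files attached to a batch
# (for example, rows that failed validation) via the Data Access API.
# BATCH_ID and credentials are placeholders; verify the path against the
# curl command shown in the monitoring UI.
import requests

ACCESS_TOKEN = "<ACCESS_TOKEN>"
API_KEY = "<API_KEY>"
ORG_ID = "<IMS_ORG_ID>"
SANDBOX = "prod"
BATCH_ID = "<BATCH_ID>"

headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "x-api-key": API_KEY,
    "x-gw-ims-org-id": ORG_ID,
    "x-sandbox-name": SANDBOX,
}

resp = requests.get(
    f"https://platform.adobe.io/data/foundation/export/batches/{BATCH_ID}/meta",
    headers=headers,
    params={"path": "row_errors"},
)
resp.raise_for_status()

# The response lists the error files for the batch; each entry points to a
# downloadable file containing the rejected rows and their error messages.
print(resp.json())
```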
Remember that to drill down to this level, I kept adding filters. So I’ll remove those filters to go back up.
Another way I can look at this is via these end-to-end tabs at the top. The local file upload source uses batch ingestion, so I know the same data would be in Batch end-to-end. The date filters default to the last 24 hours, so I’m going to reset that to the last 30 days. There’s another default filter to show me only the failed batches, which is fine. Note that this view doesn’t show me the source or data flow information; it’s oriented to the dataset receiving the data. If I click into the batch, it takes me to the Datasets area and the batch overview. I can see error messages here, although I can’t easily access the error diagnostics preview, which is what pointed me to the specific field that had the issue. So I prefer troubleshooting via the dashboard tab.
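If you prefer a programmatic version of this failed-batches view, a similar list can be pulled from the Catalog Service API. Here’s a minimal sketch under the assumption that the standard status and createdAfter filters cover what the UI shows; the credentials and time window are placeholders.

```python
# Minimal sketch: list failed batches from the last 30 days with the
# Catalog Service API. Credentials are placeholders.
import time
import requests

ACCESS_TOKEN = "<ACCESS_TOKEN>"
API_KEY = "<API_KEY>"
ORG_ID = "<IMS_ORG_ID>"
SANDBOX = "prod"

headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "x-api-key": API_KEY,
    "x-gw-ims-org-id": ORG_ID,
    "x-sandbox-name": SANDBOX,
}

# Catalog expects timestamps in epoch milliseconds.
thirty_days_ago_ms = int((time.time() - 30 * 24 * 60 * 60) * 1000)

resp = requests.get(
    "https://platform.adobe.io/data/foundation/catalog/batches",
    headers=headers,
    params={
        "status": "failed",
        "createdAfter": thirty_days_ago_ms,
        "orderBy": "desc:created",
    },
)
resp.raise_for_status()

# Catalog returns an object keyed by batch ID.
for batch_id, batch in resp.json().items():
    print(batch_id, batch.get("status"), batch.get("errors"))
```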
So hopefully this gives you a good overview of two different ways, using the alerts and the dashboard, to monitor the success of your data ingestion into Platform. And if there are any issues with the ingestion, you can see what’s going on and take whatever steps you need to fix things so you can get more high-quality data into Platform.