Guidelines for stitching data

It is recommended that you stitch a user's events under a common ID whenever possible. For example, you may have user data with “id1” across 10 events. Later, the same user deletes the cookie ID and is recorded as “id2” across the next 20 events. If you know that id1 and id2 correspond to the same user, the best practice is to stitch all 30 events under a common ID.

If this is not possible, you should treat each set of events as a different user when creating your model input data. This ensures the best results during model training and scoring.
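
As a minimal sketch of the guideline above, the following Python snippet rewrites each event's ID to a common ID when a mapping is known, and leaves the ID untouched otherwise so that each set of events is treated as a different user. The function `stitch_events`, the `user_id` key, the `id_map` lookup, and the common ID `user_a` are hypothetical names used only for illustration; they are not part of any Adobe API.

```python
from collections import Counter


def stitch_events(events, id_map):
    """Return copies of the events with user_id replaced by a common ID when one is known."""
    stitched = []
    for event in events:
        # Fall back to the original ID when no mapping exists for it.
        common_id = id_map.get(event["user_id"], event["user_id"])
        stitched.append({**event, "user_id": common_id})
    return stitched


# 10 events recorded under "id1", then 20 under "id2" after the cookie was deleted.
events = [{"user_id": "id1", "event": i} for i in range(10)]
events += [{"user_id": "id2", "event": i} for i in range(10, 30)]

# Hypothetical mapping: only available if you know both IDs belong to one user.
id_map = {"id1": "user_a", "id2": "user_a"}

# With the mapping, all 30 events share one ID.
events_per_user = Counter(e["user_id"] for e in stitch_events(events, id_map))

# Without a mapping, the two sets of events remain two separate users.
unstitched_per_user = Counter(e["user_id"] for e in stitch_events(events, {}))
```

How the mapping between IDs is obtained depends on your own identity data and is outside the scope of this sketch.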

Workflow summary

The preparation process varies depending on whether your data is stored in Adobe Experience Platform or externally. This section summarizes the steps you need to take in each scenario.

External data preparation

If your data is stored outside of Experience Platform, you need to map your data to the required and relevant fields in a Consumer ExperienceEvent (CEE) schema. This schema can be augmented with custom field groups to better capture your customer data. Once mapped, you can create a dataset using your CEE schema and ingest your data into Platform. The CEE dataset can then be selected when configuring an Intelligent Service.

Depending on the Intelligent Service you wish to use, different fields may be required. Note that it is a best practice to add data to a field if you have the data available. To learn more about the required fields, visit the Attribution AI or Customer AI data requirements guide.

Adobe Analytics data preparation

Customer AI and Attribution AI natively support Adobe Analytics data. To use Adobe Analytics data, follow the steps outlined in the documentation to set up an Analytics source connector.

Once the source connector is streaming your data into Experience Platform, you can select Adobe Analytics as a data source, followed by a dataset, during your instance configuration. All of the required schema field groups and individual fields are automatically created during the connection setup. You do not need to ETL (Extract, Transform, Load) the datasets into the CEE format.

If you compare the data that flows through the Adobe Analytics source connector into Adobe Experience Platform with the data in Adobe Analytics, you may notice some discrepancies. The Analytics source connector might drop rows during the transformation to an Experience Data Model (XDM) schema. A whole row can be unfit for transformation for several reasons, including missing timestamps, missing person IDs, invalid or large person IDs, and invalid Analytics values.
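
To get a rough sense of how many rows might be affected before you compare counts, you could pre-check an export of your data for two of the reasons listed above. The sketch below is only an illustration: the function `find_unfit_rows` and the `timestamp` and `person_id` keys are hypothetical, and the connector's actual validation rules (for example, what counts as an invalid or large person ID) are not reproduced here.

```python
def find_unfit_rows(rows):
    """Group row indexes by the first reason a row might be unfit for transformation."""
    reasons = {"missing_timestamp": [], "missing_person_id": []}
    for index, row in enumerate(rows):
        # A missing key, None, or an empty string all count as "missing" here.
        if not row.get("timestamp"):
            reasons["missing_timestamp"].append(index)
        elif not row.get("person_id"):
            reasons["missing_person_id"].append(index)
    return reasons


# Hypothetical sample rows; real exports will have many more fields.
rows = [
    {"timestamp": "2021-05-15T00:00:00Z", "person_id": "abc"},
    {"timestamp": None, "person_id": "abc"},
    {"timestamp": "2021-05-16T00:00:00Z", "person_id": ""},
    {"person_id": "def"},
]

unfit = find_unfit_rows(rows)
```

In this sample, rows 1 and 3 are flagged for a missing timestamp and row 2 for a missing person ID.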

For more information and examples, visit the documentation for comparing Adobe Analytics and Customer Journey Analytics data. That article is designed to help you diagnose and resolve those differences so that you and your team can use Adobe Experience Platform data for Intelligent Services unimpeded by concerns about data integrity.

In Adobe Experience Platform Query Service, run the following query, "Total Records between start and end timestamp by channel.typeAtSource", to find the record count for each marketing channel.

SELECT channel.typeAtSource,
       Count(_id) AS Records
FROM  df_hotel
WHERE timestamp>=from_utc_timestamp('2021-05-15','UTC')
        AND timestamp<from_utc_timestamp('2022-01-10','UTC')
        AND timestamp IS NOT NULL
        AND enduserids._experience.aaid.id IS NOT NULL
GROUP BY channel.typeAtSource

IMPORTANT
The Adobe Analytics connector takes up to four weeks to backfill data. If you recently set up a connection, you should verify that the dataset has the minimum length of data required for Customer AI or Attribution AI. Please review the historical data sections in the Customer AI or Attribution AI documentation, and verify that you have enough data for your prediction goal.
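
One simple way to check the length of data in a dataset is to compare the span between the earliest and latest event timestamps with the amount of history you need. The following sketch assumes you have already retrieved the timestamps; the function `has_enough_history` and the `required_days` values are hypothetical placeholders, so take the actual minimum from the historical data section for your service and prediction goal.

```python
from datetime import datetime, timedelta


def has_enough_history(timestamps, required_days):
    """Return True if the timestamps span at least required_days."""
    if not timestamps:
        return False
    span = max(timestamps) - min(timestamps)
    return span >= timedelta(days=required_days)


# Hypothetical event timestamps taken from a backfilled dataset.
timestamps = [
    datetime(2021, 5, 15),
    datetime(2021, 9, 1),
    datetime(2022, 1, 9),
]

# The sample spans 239 days; 180 and 365 are placeholder requirements.
enough_for_180 = has_enough_history(timestamps, required_days=180)
enough_for_365 = has_enough_history(timestamps, required_days=365)
```

A span check like this only confirms the date range. It does not confirm that data is present throughout the range, so a backfill that is still in progress may pass the check while leaving gaps.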