AI/ML feature pipelines
Data Distiller enables data scientists and engineers to enrich their machine learning pipelines with high-value customer experience data that has been collected and curated in Adobe Experience Platform. From a Python notebook in any environment, you can interactively explore customer data in the Experience Platform, define and compute features from the data, and read the computed features into your machine learning environment for modeling.
- With Data Distiller’s powerful query capabilities, you can extract meaningful features from the rich behavioral data available in the Experience Platform. You can then bring the distilled feature data into your machine learning environment without the need to copy large volumes of event data outside of the Experience Platform.
- Read the prepared feature dataset into your preferred machine learning tools and combine it with other features derived from enterprise data to train, experiment with, tune, and deploy custom models tailored to your business.
- Generate scores, predictions, or recommendations from your models and return the output to the Experience Platform to optimize customer experiences through Real-Time Customer Data Platform and Adobe Journey Optimizer.
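The first two steps above can be sketched in a few lines of Python. This is an illustrative sketch only: it uses an in-memory SQLite database as a stand-in for Data Distiller, and the `events` table, its columns, and the feature names are hypothetical, not an Experience Platform schema. The point it shows is that the aggregation runs where the event data lives, so only one compact feature row per customer reaches the modeling environment.

```python
import sqlite3

# Hypothetical raw events: (customer_id, event_type, revenue).
# In a real pipeline these rows would stay inside Experience Platform.
EVENTS = [
    ("c1", "page_view", 0.0),
    ("c1", "purchase", 40.0),
    ("c1", "purchase", 10.0),
    ("c2", "page_view", 0.0),
    ("c2", "page_view", 0.0),
]

# Feature query: collapse many event rows into one row per customer.
FEATURE_SQL = """
SELECT customer_id,
       COUNT(*) AS event_count,
       SUM(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) AS purchase_count,
       SUM(revenue) AS total_revenue
FROM events
GROUP BY customer_id
ORDER BY customer_id
"""


def compute_features(rows):
    """Load events into a temporary table and return per-customer features."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE events (customer_id TEXT, event_type TEXT, revenue REAL)"
        )
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        conn.row_factory = sqlite3.Row  # rows become addressable by column name
        cursor = conn.execute(FEATURE_SQL)
        return [dict(row) for row in cursor.fetchall()]
    finally:
        conn.close()


# Only this small feature table is handed to the modeling environment.
features = compute_features(EVENTS)
for row in features:
    print(row)
```

In practice the same kind of `GROUP BY` query would be submitted to Data Distiller rather than SQLite, and the resulting feature rows would be loaded into whichever modeling framework you use.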
Prerequisites
This workflow requires a working understanding of the various aspects of Adobe Experience Platform. Before beginning this tutorial, please review the documentation for the following concepts:
- How to authenticate and access Experience Platform APIs.
- Sandboxes: Attribute-based access control permissions, including how to create and manage roles and assign the desired resource permissions to those roles.
- Data Governance: How to apply data usage labels to datasets and fields, categorizing each according to related data governance policies and access control policies.
Next steps
By reading this document, you have been introduced to the important concepts behind using your preferred machine learning tools to build custom models that support your marketing use cases.
The documents in this series of guides describe the basic steps for creating feature pipelines from Experience Platform to feed custom models in your machine learning environment. You are now ready to establish a connection between Data Distiller and your Jupyter Notebook.
The documentation linked below corresponds with the steps indicated on the infographic above.
- Step 1: Explore and analyze datasets
- Step 2: Engineer features for machine learning
- Step 3: Export feature datasets
Additional resources
- aepp: an Adobe-managed open-source Python library for making requests to Data Distiller and other Experience Platform services from Python code.