Train and evaluate a model in the Data Science Workspace UI

Last update: August 5, 2024

Topics:
Data Science Workspace

CREATED FOR:

User
Developer

NOTE

Data Science Workspace is no longer available for purchase.

This documentation is intended for existing customers with prior entitlements to Data Science Workspace.

In Adobe Experience Platform Data Science Workspace, a machine learning Model is created by incorporating an existing Recipe that is appropriate for the Model’s intent. The Model is then trained and evaluated to optimize its operating efficiency and efficacy by fine-tuning its associated Hyperparameters. Recipes are reusable, meaning that multiple Models can be created and tailored to specific purposes with a single Recipe.

This tutorial walks through the steps to create, train, and evaluate a Model.

Getting started

In order to complete this tutorial, you must have access to Experience Platform. If you do not have access to an organization in Experience Platform, please speak to your system administrator before proceeding.

This tutorial requires an existing Recipe. If you do not have a Recipe, follow the Import a packaged Recipe in the UI tutorial before continuing.

Create a Model

In Experience Platform, select the Models tab located in the left navigation, then select the Browse tab to view your existing Models. Select Create Model near the top right of the page to begin creating a Model.

Browse through the list of existing Recipes, select the Recipe to be used to create the Model, then select Next.

Select an appropriate input dataset and select Next. This will set the default input training dataset for the Model.

Provide a name for the Model and review the default Model configurations. Default configurations were applied during Recipe creation. To modify a configuration value, double-click it.

To provide a new set of configurations, select Upload New Config and drag a JSON file containing Model configurations into the browser window. Select Finish to create the Model.

NOTE

Configurations are unique and specific to their intended Recipe. This means that configurations for the Retail Sales Recipe will not work for the Product Recommendations Recipe. See the reference section for a list of Retail Sales Recipe configurations.
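
As a rough illustration, a configuration file for upload could be produced with a short script. The sketch below uses Python and is not an official format definition: the keys come from the Retail Sales Recipe reference section, but the flat key/value layout, the value types, and the feature names are assumptions made for illustration. Check an existing configuration for your Recipe before relying on this shape.

```python
import json

# Keys are taken from the reference section of this tutorial.
# ASSUMPTION: the flat key/value layout and the value types are illustrative
# only; the Recipe may expect a different JSON structure.
# ASSUMPTION: "feature_a", "feature_b", and "target_a" are placeholder names.
model_config = {
    "learning_rate": 0.1,
    "n_estimators": 100,
    "ACP_DSW_INPUT_FEATURES": "feature_a,feature_b",
    "ACP_DSW_TARGET_FEATURES": "target_a",
    "ACP_DSW_FEATURE_UPDATE_SUPPORT": False,
}


def to_config_json(config):
    """Serialize a configuration dictionary to indented JSON text."""
    return json.dumps(config, indent=2)


# Print the JSON so it can be saved to a file and uploaded.
print(to_config_json(model_config))
```

Saving the printed text to a `.json` file gives a file that can be dragged into the Upload New Config dialog, provided its structure matches what the Recipe expects.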

Create a training Run

In Experience Platform, select the Models tab located in the left navigation, then select the browse tab to view your existing Models. Find and select the hyperlink attached to the name of the Model you wish to train.

All existing training runs with their current training statuses are listed. For Models created using the Data Science Workspace user interface, a training run is automatically generated and executed using the default configurations and input training dataset.

Create a new training run by selecting Train near the top-right of the Model overview page.

Select the training input dataset for the training run, then select Next.

The default configurations provided during the Model’s creation are shown. Modify them as needed by double-clicking the values. Select Finish to create and execute the training run.

Evaluate the Model

All existing training runs with their current training statuses are listed. With multiple completed training runs, evaluation metrics can be compared across different training runs in the Model evaluation chart. Select an evaluation metric using the dropdown list above the graph.

The Mean Absolute Percent Error (MAPE) metric expresses prediction error as a percentage of the actual values. It is used to identify the top-performing Experiment: the lower the MAPE, the better.
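
To make the metric concrete, here is a minimal Python sketch of the standard MAPE definition: the mean of the absolute errors, each divided by the actual value, times 100. The function name and the sample numbers are illustrative assumptions, and the exact computation used by Data Science Workspace may differ in details such as how zero actual values are handled.

```python
def mean_absolute_percent_error(actual, predicted):
    """Return MAPE as a percentage, using the standard textbook definition."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            # The percentage error is undefined when the actual value is zero.
            raise ValueError("actual values must be non-zero")
        total += abs((a - p) / a)
    return 100.0 * total / len(actual)


# Illustrative numbers only: two forecasts are off by 10%, one is exact.
example_mape = mean_absolute_percent_error([100, 200, 400], [110, 180, 400])
print(round(example_mape, 2))
```

A training run whose predictions track the actual values more closely yields a smaller result, which is why the run with the lowest MAPE is treated as the best.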

The “Precision” metric describes the percentage of relevant Instances compared with the total retrieved Instances. Precision can be seen as the probability that a randomly selected outcome is correct.
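
The definition above can be sketched in a few lines of Python. This is an illustration of the general precision formula (correctly retrieved instances divided by all retrieved instances), not the platform's implementation. The function name, the label encoding, and the sample labels are assumptions.

```python
def precision(actual, predicted, positive=1):
    """Return the fraction of retrieved (predicted positive) instances that are relevant."""
    true_positives = 0
    false_positives = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                true_positives += 1
            else:
                false_positives += 1
    retrieved = true_positives + false_positives
    if retrieved == 0:
        # Nothing was retrieved, so precision is reported as 0.0 by convention here.
        return 0.0
    return true_positives / retrieved


# Illustrative labels only: three instances are retrieved, two of them are relevant.
example_precision = precision([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
print(round(example_precision, 3))
```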

Selecting a specific training run provides the details of that run by opening the evaluation page. This can be done even before the run has been completed. On the evaluation page, you are able to see other evaluation metrics, configuration parameters, and visualizations specific to the training run.

You can also download activity logs to see the details of the run. Logs are particularly useful for failed runs to see what went wrong.

Hyperparameters are not learned during training, so a Model must be optimized by testing different combinations of Hyperparameters. Repeat this Model training and evaluation process until you have arrived at an optimized Model.
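
One way to organize this trial-and-error process is to list the combinations up front and record the evaluation metric of each training run. The Python sketch below only does the bookkeeping; the training runs themselves are still created in the UI. The candidate values and helper names are hypothetical, and the MAPE figures passed to `best_run` are placeholders, not real results.

```python
from itertools import product

# ASSUMPTION: candidate values are arbitrary examples, not recommendations.
search_space = {
    "learning_rate": [0.05, 0.1],
    "n_estimators": [100, 200],
    "max_depth": [3, 5],
}


def enumerate_configurations(space):
    """Return every combination of the candidate values as a list of dictionaries."""
    keys = list(space)
    value_lists = [space[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


def best_run(runs, metric="MAPE"):
    """Return the recorded run with the lowest value for the given metric."""
    return min(runs, key=lambda run: run[metric])


configurations = enumerate_configurations(search_space)
for configuration in configurations:
    print(configuration)

# Placeholder metric values, entered by hand after each run completes.
recorded_runs = [
    {"config": configurations[0], "MAPE": 12.0},
    {"config": configurations[1], "MAPE": 9.5},
]
print(best_run(recorded_runs)["config"])
```

Choosing the minimum is appropriate for error metrics such as MAPE. For metrics where higher is better, such as Precision, the selection would use `max` instead.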

Next steps

This tutorial walked you through creating, training, and evaluating a Model in Data Science Workspace. Once you have arrived at an optimized Model, you can use the trained Model to generate insights by following the Score a Model in the UI tutorial.

Reference

Retail Sales Recipe configurations

Hyperparameters determine the Model’s training behavior. Modifying Hyperparameters affects the Model’s accuracy and precision:

| Hyperparameter | Description | Recommended Range |
| --- | --- | --- |
| learning_rate | Learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators. | 0.1 |
| n_estimators | The number of boosting stages to perform. Gradient boosting is fairly robust to over-fitting so a large number usually results in better performance. | 100 |
| max_depth | Maximum depth of the individual regression estimators. The maximum depth limits the number of nodes in the tree. Tune this parameter for best performance; the best value depends on the interaction of the input variables. | |
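
The trade-off between learning_rate and n_estimators can be seen in a toy example. The Python sketch below is a deliberately simplified gradient boosting loop for squared error on a single feature, using depth-1 trees (stumps). It is not the Retail Sales Recipe's implementation, and the data and function names are made up for illustration. Each stage fits a stump to the current residuals and adds its output scaled by learning_rate, so a small learning_rate needs more stages to reach the same training error.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (a depth-1 tree)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(xs, ys, learning_rate, n_estimators):
    """Run n_estimators boosting stages and return the training predictions."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        # Shrink each stump's contribution by learning_rate.
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return predictions


def mean_squared_error(ys, predictions):
    """Return the average squared difference between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Toy data, invented for this illustration.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0, 8.0, 8.5]

for learning_rate, n_estimators in [(0.1, 10), (0.1, 100), (1.0, 10)]:
    error = mean_squared_error(ys, boost(xs, ys, learning_rate, n_estimators))
    print(learning_rate, n_estimators, round(error, 4))
```

The comparison here uses training error only. On real data, the combination should be judged by evaluation metrics on held-out data, as described in the Evaluate the Model section.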

Additional parameters determine the Model’s technical properties:

| Parameter key | Type | Description |
| --- | --- | --- |
| ACP_DSW_INPUT_FEATURES | String | List of comma separated input schema attributes. |
| ACP_DSW_TARGET_FEATURES | String | List of comma separated output schema attributes. |
| ACP_DSW_FEATURE_UPDATE_SUPPORT | Boolean | Determines whether input and output features are modifiable. |
| tenantId | String | This ID ensures resources you create are namespaced properly and contained within your organization. Follow the steps here to find your tenant ID. |
| ACP_DSW_TRAINING_XDM_SCHEMA | String | The input schema used for training a Model. |
| evaluation.labelColumn | String | Column label for evaluation visualizations. |
| evaluation.metrics | String | Comma separated list of evaluation metrics to be used for evaluating a Model. |
| ACP_DSW_SCORING_RESULTS_XDM_SCHEMA | String | The output schema used for scoring a Model. |
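
Several of these parameters hold a comma separated list inside a single string. As a small illustration, the Python sketch below splits such a value into individual names, which can help when checking a configuration before upload. The helper name and the sample attribute names are hypothetical.

```python
def split_csv_setting(value):
    """Split a comma separated configuration string into a list of trimmed names."""
    return [item.strip() for item in value.split(",") if item.strip()]


# ASSUMPTION: placeholder attribute names, not taken from any real schema.
input_features = split_csv_setting("feature_a, feature_b,feature_c")
print(input_features)
```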
