An in-depth description of the algorithms used in Adobe Target Recommendations, including the logic and mathematical details of model training and the process of model serving.
Model training is the process by which the Adobe Target learning algorithms generate recommendations. Model serving is how Target delivers recommendations to your site visitors (also known as content delivery).
Target includes the following broad types of algorithms in Recommendations:
Item-Based algorithms: Include algorithms that follow the logic “People who viewed/bought this item also viewed/bought these items,” which are grouped under the umbrella term item-item collaborative filtering, as well as the Items with Similar Attributes algorithms.
User-Based algorithms: Include the Recently Viewed and Recommended for You algorithms.
Popularity-Based algorithms: Include algorithms that return the top-viewed or top-purchased items across the website, or top-viewed or top-purchased by category or item attribute.
Cart-Based algorithms: Include multi-item based recommendations with the logic “people who viewed/bought these items, also viewed/bought those items.”
Custom Criteria: Include recommendations based on custom files uploaded to Target.
For more general information about each algorithm type and the individual algorithms, see Base the recommendation on a recommendation key.
Many of the algorithms listed above are predicated on the presence of one or more keys. These keys are used to retrieve similar items at content delivery time (when recommendations are made). Customer-specified keys can include the current item someone is viewing, last item viewed or purchased, top-viewed item, current category, or favorite category for that visitor. Other algorithms, such as cart-based or user-based recommendations, use implicit keys that the customer cannot configure. For more information, see Recommendation keys, in Base the recommendation on a recommendation key. Note, however, that these keys are relevant at model serving time only (content delivery). They do not affect the “offline” model training logic.
The following sections group algorithms in a slightly different manner than the algorithm types described above. The following grouping is based on the similarity of model training logic.
Item-Item collaborative filtering recommendation algorithms are based on the idea that you should use the behavioral patterns of many users (hence collaborative) to provide useful recommendations for a given item (hence filtering, that is, narrowing the catalog of possible items to recommend). Although there are many different algorithms that fall under the general umbrella of collaborative filtering, these algorithms universally use behavioral data sources as inputs. In Target Recommendations, these inputs are the unique views and purchases of items by users.
For the “people who viewed/purchased this item also viewed/purchased these items” algorithm, the goal is to calculate a similarity s(A,B) between all pairs of items. For a given item A, the top recommendations are then ordered by their similarity s(A,B).
One example of such a similarity is the co-occurrence between items: a simple count of the number of users who purchased both items. Although intuitive, such a metric is naive in that it is biased towards recommending popular items. For example, if at a grocery retailer most people purchase bread, bread will have a high co-occurrence with all items, but it is not necessarily a good recommendation. Target instead uses a more sophisticated similarity metric known as the log likelihood ratio (LLR). This quantity is large when the probability of two items, A and B, co-occurring is very different to the probability of them not co-occurring. For concreteness, consider a case of the People Who Viewed This, Bought That algorithm. The LLR similarity is large when the probability that B was purchased is not independent of whether someone viewed A.
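For example, if

$$P(B \text{ purchased} \mid A \text{ viewed}) = P(B \text{ purchased} \mid A \text{ not viewed}),$$

then item B should not be recommended with item A. Full details of this log likelihood ratio similarity calculation are provided in this PDF.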
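The contrast between raw co-occurrence and LLR can be illustrated with a small sketch. It uses the standard entropy-based form of the log likelihood ratio for a 2×2 contingency table. The function names and the grocery counts are made up for illustration, and Target's production calculation may differ in detail.

```python
import math


def x_log_x(x):
    """Return x * ln(x), with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log likelihood ratio for a 2x2 table of user counts.

    k11: viewed A and bought B      k12: viewed A, did not buy B
    k21: did not view A, bought B   k22: neither
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


# Hypothetical counts for 1,000 shoppers, 100 of whom viewed item A (chips).
# Bread: bought by 90% of shoppers whether or not they viewed A.
bread = llr(90, 10, 810, 90)
# Salsa: bought by 40% of A viewers but only about 2% of everyone else.
salsa = llr(40, 60, 20, 880)

print(f"bread: co-occurrence=90, LLR={bread:.2f}")
print(f"salsa: co-occurrence=40, LLR={salsa:.2f}")
```

Bread co-occurs with item A more often than salsa does, yet its LLR is essentially zero because buying bread is independent of viewing A. Salsa scores far higher, so it would rank first under LLR.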
The logical flow of the actual algorithm implementation is shown in the following schematic diagram:
Details of these steps are as follows:
Model serving: Recommendations content is delivered from Target’s global “Edge” network. When mbox requests are made to Target and it is determined that recommendations content should be delivered to the page, the appropriate item key for the recommendations algorithm is either parsed from the request or looked up from the user profile, and then used to retrieve the recommendations computed in the previous steps. Further dynamic filters are applied at this time, before the appropriate design is rendered.
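The serving flow described above can be sketched as a key lookup followed by a filter. Everything in this sketch is hypothetical, including the request and profile shapes, the `PRECOMPUTED` table, and the in-stock filter. It only illustrates the order of operations, not Target's Edge implementation.

```python
# Hypothetical output of offline training: item key -> ranked recommendations.
PRECOMPUTED = {
    "sku-1": ["sku-7", "sku-3", "sku-9"],
    "sku-2": ["sku-1", "sku-4"],
}


def resolve_key(request, profile):
    """Take the item key from the request if present, else from the profile."""
    return request.get("entity.id") or profile.get("last_viewed_item")


def serve(request, profile, passes_filters, limit=2):
    key = resolve_key(request, profile)
    candidates = PRECOMPUTED.get(key, [])
    # Dynamic filters run at request time, after retrieval.
    return [item for item in candidates if passes_filters(item)][:limit]


in_stock = {"sku-3", "sku-9", "sku-1"}
recs = serve({}, {"last_viewed_item": "sku-1"}, lambda item: item in in_stock)
print(recs)  # ['sku-3', 'sku-9']
```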
In this type of algorithm, two items are considered to be related if their names and textual descriptions are semantically similar. Unlike most recommendations algorithms in which behavioral data sources must be used, content similarity algorithms use metadata from product catalogs to derive the similarity between items. Target is therefore able to drive recommendations in so-called “cold-start” scenarios, where no behavioral data has been collected (for example, at the beginning of a Target activity).
Although the model serving and content delivery aspects of Target’s content similarity algorithms are identical to other item-based algorithms, the model training steps are drastically different and involve a series of natural language processing and preprocessing steps as depicted in the following diagram. The core of the similarity calculation is the use of the cosine similarity of modified tf-idf vectors that represent each item in the catalog.
Details of these steps are as follows:
Input data: As described before, this algorithm is based purely on catalog data (ingested to Target via a Catalog Feed, the Entities API, or from on-page updates).
Attribute extraction: After the application of regular static filters, catalog rules, and global exclusions, this algorithm extracts relevant textual fields from the entity schema. Target automatically uses the name, message, and category fields from the entity attributes and attempts to extract any string fields from custom entity attributes. A custom attribute is treated as a string field when the majority of its values cannot be parsed as a number, date, or boolean.
Stemming and stop-word removal: For more accurate text similarity matching, it is prudent to remove very common “stop” words that do not significantly alter the meaning of an item (for example, “was,” “is,” “and,” and so forth). Similarly, stemming refers to the process of reducing words with different suffixes to their root word, which has an identical meaning (for example, “connect,” “connecting,” and “connection” all have the same root word: “connect”). Target uses the Snowball stemmer. Target performs automatic language detection first, and can do stop word removal for up to 50 languages and stemming for 18 languages.
n-gram creation: After the previous steps, each word is treated as a token. The process of combining contiguous sequences of tokens into a single token is referred to as n-gram creation. Target’s algorithms consider up to 2-grams.
tf-idf computation: The next step involves the creation of tf-idf vectors to reflect the relative importance of tokens in the item description. For each token/term t in an item i, in a catalog D with |D| items, the term frequency TF(t, i) is computed first (the number of times the term appears in the item i), as well as the document frequency DF(t, D) (in essence, the number of items in which the token t appears). The tf-idf measure, in the smoothed form that Apache Spark computes, is then

$$\mathrm{TFIDF}(t, i, D) = \mathrm{TF}(t, i) \cdot \log\frac{|D| + 1}{\mathrm{DF}(t, D) + 1}$$
Target uses Apache Spark’s tf-idf featurization implementation, which under the hood hashes each token to a space of 2^18 (262,144) tokens. In this step, customer-specified attribute boosting and burying is also applied by adjusting the term frequencies in each vector based on settings specified in the criteria.
Item similarity computation: The final item similarity computation is done using an approximate cosine similarity. For two items, A and B, with vectors tA and tB, the cosine similarity is defined as:

$$s(A, B) = \frac{t_A \cdot t_B}{\lVert t_A \rVert \, \lVert t_B \rVert}$$
To avoid significant complexity in computing similarities between all N × N pairs of items, Target truncates each tf-idf vector to contain only its largest 500 entries, and then computes cosine similarities between items using this truncated vector representation. This approach proves to be more robust for sparse vector similarity computations than other approximate nearest neighbor (ANN) techniques, such as locality sensitive hashing.
Model serving: This process is identical to item-item collaborative filtering techniques described in the previous section.
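The training steps above can be sketched end to end in plain Python. This is a simplified illustration under several assumptions. The tiny stop-word list, the catalog, and the function names are invented. Stemming, language detection, feature hashing, and attribute boosting are omitted. The idf term uses the smoothed form log((|D| + 1) / (DF + 1)) that Apache Spark documents.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "and", "is", "the", "was", "with"}  # illustrative only


def tokenize(text):
    """Lowercase, drop stop words, then add 2-grams of adjacent tokens."""
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def tfidf_vectors(catalog, top_k=500):
    """Map item id -> sparse tf-idf vector truncated to its top_k entries."""
    tokens = {item: Counter(tokenize(text)) for item, text in catalog.items()}
    df = Counter(t for counts in tokens.values() for t in counts)
    n = len(catalog)
    vectors = {}
    for item, counts in tokens.items():
        vec = {t: tf * math.log((n + 1) / (df[t] + 1)) for t, tf in counts.items()}
        largest = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        vectors[item] = dict(largest)
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


catalog = {
    "tent": "Lightweight camping tent with rain cover",
    "bag": "Lightweight camping sleeping bag",
    "mug": "Ceramic coffee mug",
}
vectors = tfidf_vectors(catalog)
print(cosine(vectors["tent"], vectors["bag"]))  # shares "lightweight camping"
print(cosine(vectors["tent"], vectors["mug"]))  # no shared tokens
```

Because the vectors are sparse dictionaries, the dot product only touches tokens that both items share, which is what makes truncating each vector to its largest entries an effective way to bound the cost.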
The most recent additions to the Target suite of recommendations algorithms are Recommended For You and a series of Cart-Based recommendations algorithms. Both types of algorithms use collaborative filtering techniques to form individual item-based recommendations. Then, at serve-time, multiple items in the user’s browsing history (for Recommended For You), or the user’s current cart (for Cart-based recommendations) are used to retrieve these item-based recommendations, which are then merged to form the final list of recommendations. Note that many flavors of personalized recommendation algorithms exist. The choice of a multi-key algorithm means that recommendations are immediately available after a visitor has any browsing history and recommendations can update to respond to the latest visitor behavior.
These algorithms build on the foundational collaborative filtering techniques described in the item-based recommendations section, but also incorporate hyperparameter tuning to determine the optimal similarity metric between items. The algorithm performs a chronological split of behavioral data for each user, and trains recommendation models on the earlier data while attempting to predict the items that a user views or purchases later. The similarity metric that produces the optimal [Mean Average Precision](https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)) is then chosen.
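The selection step can be illustrated with a small sketch of mean average precision (MAP) over held-out data. The candidate models, their names, the split fraction, and the visitor histories below are hypothetical stand-ins. The sketch only shows how a chronological split and MAP could be combined to pick a similarity metric.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of held-out items."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(recommend, histories, split=0.5):
    """Recommend from each user's earlier events; score against later ones."""
    scores = []
    for events in histories.values():  # events are in time order
        cut = max(1, int(len(events) * split))
        earlier, later = events[:cut], set(events[cut:])
        if later:
            scores.append(average_precision(recommend(earlier), later))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical candidate models, one per similarity metric.
candidates = {
    "llr": lambda seen: ["c", "d", "x"],
    "cosine": lambda seen: ["x", "c", "y"],
}
histories = {"u1": ["a", "b", "c", "d"], "u2": ["a", "c"]}

results = {name: mean_average_precision(fn, histories) for name, fn in candidates.items()}
best = max(results, key=results.get)
print(best, results)
```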
The logic of the model training and scoring steps is shown in the following diagram:
Details of these steps are as follows:
Input data: This is identical to item-item collaborative filtering (CF) methods. Both Recommended For You and Cart-Based algorithms use behavioral data, in the form of views and purchases of users collected when you implement Target or from Adobe Analytics.
The training step computes several types of vector similarities: LLR similarity (discussed here), cosine similarity (defined previously), and a normalized L2 similarity, defined as:
Model serving: Unlike previous algorithms, in which serving recommendations involves specifying a single key for retrieval followed by the application of business rules, the Recommended For You and Cart-Based algorithms employ a more complex runtime process.
These processes are illustrated in the following image, where a visitor has viewed item A and purchased item B. Individual recommendations are retrieved with the offline similarity scores shown beneath each item label. After retrieval, the recommendations are merged with weighted similarity scores summed. Finally, in a scenario where the customer has specified that previously viewed and purchased items must be filtered out, the filtering step removes items A and B from the list of recommendations.
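A minimal sketch of the retrieve, merge, and filter sequence follows. The similarity scores and the weights given to viewed versus purchased items are invented for illustration; the text does not state the weighting scheme that Target uses.

```python
from collections import defaultdict

# Hypothetical offline results: key item -> {recommended item: similarity}.
ITEM_RECS = {
    "A": {"B": 0.9, "C": 0.6, "D": 0.3},
    "B": {"C": 0.8, "E": 0.5, "A": 0.4},
}


def recommend(history, exclude_seen=True, limit=3):
    """history maps each key item to an assumed weight for that interaction."""
    merged = defaultdict(float)
    for key, weight in history.items():
        for item, score in ITEM_RECS.get(key, {}).items():
            merged[item] += weight * score  # sum weighted scores per item
    if exclude_seen:
        for key in history:
            merged.pop(key, None)  # drop previously viewed/purchased items
    ranked = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:limit]]


# Visitor viewed A (assumed weight 1.0) and purchased B (assumed weight 2.0).
print(recommend({"A": 1.0, "B": 2.0}))  # ['C', 'E', 'D']
```

Item C rises to the top because it is recommended for both A and B, so its weighted scores add up.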
Target provides popularity-based algorithms for both the most viewed items and the top-selling items, either across a website or broken down by an item attribute or category. Popularity-based algorithms rank items based on the number of sessions in which that item was viewed or purchased in a given time frame.
All these algorithms rely on aggregated behavioral data, in which the total number of sessions where each item was viewed or purchased is recorded at both hourly and daily resolutions. Individual algorithms then find the most viewed or most purchased items for the customer-configured lookback window.
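As an illustration of the aggregation, the sketch below sums hourly session counts over a lookback window and ranks items. The record layout, the function name, and the sample data are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical hourly aggregates: (hour bucket, item, sessions with a view).
HOURLY_VIEWS = [
    (datetime(2024, 5, 1, 9), "shoes", 40),
    (datetime(2024, 5, 1, 10), "hat", 25),
    (datetime(2024, 5, 2, 9), "hat", 30),
    (datetime(2024, 5, 2, 10), "scarf", 10),
]


def most_viewed(records, now, lookback, limit=2):
    """Rank items by total sessions inside the lookback window ending at now."""
    totals = Counter()
    for hour, item, sessions in records:
        if now - lookback <= hour <= now:
            totals[item] += sessions
    return [item for item, _ in totals.most_common(limit)]


now = datetime(2024, 5, 2, 12)
print(most_viewed(HOURLY_VIEWS, now, timedelta(days=1)))  # ['hat', 'scarf']
print(most_viewed(HOURLY_VIEWS, now, timedelta(days=7)))  # ['hat', 'shoes']
```

The same item set ranks differently depending on the lookback window, which is why the window is a per-criteria setting rather than a global one.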
Individual algorithm nuances are as follows:
The “recently viewed” recommendations algorithm allows for in-session personalization of recommendations. This algorithm requires no offline “model training.” Instead, Target uses the unique Visitor Profile to maintain a running list of items that have been viewed in a given session and can surface these items in recommendations activities. This allows for real-time updates to recommendations and next-page personalization.
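The running list can be pictured as a small bounded structure that is updated on each page view. The class below is a hypothetical stand-in for the relevant part of the visitor profile, and the capacity of 3 is arbitrary.

```python
from collections import deque


class RecentlyViewed:
    """Most-recent-first list of distinct items viewed in a session."""

    def __init__(self, capacity=5):
        self._items = deque(maxlen=capacity)

    def record_view(self, item):
        if item in self._items:
            self._items.remove(item)  # move a repeat view to the front
        self._items.appendleft(item)  # oldest entry falls off when full

    def recommendations(self, exclude=None):
        return [item for item in self._items if item != exclude]


profile = RecentlyViewed(capacity=3)
for item in ["tent", "mug", "tent", "bag", "lamp"]:
    profile.record_view(item)
print(profile.recommendations(exclude="lamp"))  # ['bag', 'tent']
```

No training step is involved: each view changes the list immediately, so the next page can already reflect it.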
Custom criteria allow customers to upload their own recommendations to Target, giving important flexibility and allowing “bring your own model” capabilities. Custom criteria replace the “offline training” portion of Item-Based recommendations, but behave similarly to Item-Based recommendation algorithms during the online content delivery phase, in that a single key is used for retrieval of recommendations and business rules/filters are then applied.