Configure bot detection for datastreams
Nonhuman traffic from automated programs, web scrapers, spiders, and scripted scanners can make it difficult to identify events from human visitors. This type of traffic can negatively affect important business metrics, leading to incorrect traffic reporting.
Bot detection lets you flag events sent through the Web SDK, Mobile SDK, and Server API as originating from known spiders and bots.
By configuring bot detection for your datastreams, you can identify specific IP addresses, IP ranges, and request headers to classify as bot events. This helps provide a more accurate measurement of user activity on your site or mobile application.
When a request to the Edge Network matches any of the bot detection rules, the XDM schema is updated with a bot score (always set to 1), as shown below:
{
  "botDetection": {
    "score": 1
  }
}
This bot scoring helps the solutions receiving the request correctly identify bot traffic.
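As an illustration of how a downstream consumer might read this score, here is a minimal Python sketch. The function name `is_bot_event` is hypothetical, and the sketch assumes the `botDetection` object sits at the top level of the event payload, as in the snippet above. It is not an Adobe API.

```python
from typing import Any, Mapping


def is_bot_event(xdm: Mapping[str, Any]) -> bool:
    """Return True when the payload carries a bot score of 1.

    Assumption: botDetection is a top-level object, as in the snippet above.
    """
    bot_detection = xdm.get("botDetection") or {}
    return bot_detection.get("score") == 1


flagged = {"botDetection": {"score": 1}}
unflagged = {"eventType": "example"}  # hypothetical event without a bot score

print(is_bot_event(flagged))    # True
print(is_bot_event(unflagged))  # False
```

A consumer could use a check like this to exclude flagged events before computing traffic metrics.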
Bot detection rules can take up to 15 minutes to propagate across the Edge Network after being created.
Prerequisites
For bot detection to work on your datastream, you must add the Bot Detection Information field group to your schema. See the XDM schema documentation to learn how to add field groups to a schema.
Configure bot detection for datastreams
You can configure bot detection after creating a datastream configuration. See the documentation on how to create and configure a datastream, then follow the instructions below to add bot detection capabilities to your datastream.
Go to the datastreams list and select the datastream to which you want to add bot detection.
In the datastream details page, select the Bot Detection option on the right rail.
The Bot Detection Rules page is shown.
From the Bot Detection Rules page, you can configure bot detection in the following ways:
- Using the IAB/ABC International Spiders and Bots List.
- Creating your own bot detection rules.
Use the IAB/ABC International Spiders and Bots List
The IAB/ABC International Spiders and Bots List is a third-party, industry-standard list of internet spiders and bots. This list helps you identify automated traffic such as search engine crawlers, monitoring tools, and other nonhuman traffic that you may not want to include in your analytics counts.
To configure your datastream to use the IAB/ABC International Spiders and Bots List:
- Toggle the Use IAB/ABC International Spiders and Bots List for bot detection on this datastream option.
- Select Save to apply the bot detection settings to your datastream.
Create bot detection rules
In addition to using the IAB/ABC International Spiders and Bots List, you can define your own bot detection rules for each datastream.
You can create bot detection rules based on IP addresses and IP address ranges.
If you need more granular bot detection rules, you can combine the IP conditions with request header conditions. Bot detection rules can use the following headers:
- user-agent
- content-type
- referer
- sec-ch-ua
- sec-ch-ua-mobile
- sec-ch-ua-platform
- sec-ch-ua-platform-version
- sec-ch-ua-arch
- sec-ch-ua-model
- sec-ch-ua-bitness
- sec-ch-ua-wow64
To create a bot detection rule, follow the steps below:
- Select Add New Rule.
- Type a name for the rule in the Rule Name field.
- Select Add new IP condition to add a new IP-based rule. You can define the rule by IP address or by IP address range.
  TIP: The IP conditions are based on a logical OR operation. A request is marked as originating from a bot if it matches any of the IP conditions that you defined.
- If you want to add header conditions to your rule, select Add header conditions group, and then select the headers that you want the rule to use. Then, add the conditions to be used for the selected header.
- After configuring the desired bot detection rules, select Save to apply the rules to your datastream.
Bot detection rule examples
To help you get started with bot detection, you can use the examples detailed below to create bot detection rules.
Bot detection based on one IP address
To mark all requests originating from a specific IP address as bot traffic, create a new bot detection rule which evaluates a single IP address, as shown in the image below.
Bot detection based on two IP addresses
To mark all requests originating from either of two specific IP addresses as bot traffic, create a new bot detection rule which evaluates two IP addresses, as shown in the image below.
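The OR semantics of multiple IP conditions can be sketched in Python as follows. This only illustrates the matching logic. It is not how the Edge Network implements it, and the addresses (taken from the reserved documentation ranges) and the names `BOT_IPS` and `matches_any_ip` are hypothetical.

```python
import ipaddress

# Hypothetical rule: two IP conditions, combined with a logical OR.
BOT_IPS = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("198.51.100.7"),
}


def matches_any_ip(client_ip: str) -> bool:
    """Return True if the client IP equals any address in the rule."""
    return ipaddress.ip_address(client_ip) in BOT_IPS


print(matches_any_ip("198.51.100.7"))  # True
print(matches_any_ip("203.0.113.5"))   # False
```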
Bot detection based on a range of IP addresses
To mark all requests originating from any IP address in a specific range as bot traffic, create a new bot detection rule which evaluates an entire IP address range, as shown in the image below.
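A range condition amounts to an inclusive bounds check on the client address. The sketch below assumes the range is given as a first and a last address. The format that the datastream UI accepts is not stated here, and the function name `in_ip_range` is hypothetical.

```python
import ipaddress


def in_ip_range(client_ip: str, first: str, last: str) -> bool:
    """Return True if client_ip lies in the inclusive range [first, last]."""
    ip = ipaddress.ip_address(client_ip)
    low = ipaddress.ip_address(first)
    high = ipaddress.ip_address(last)
    # IPv4 and IPv6 addresses cannot be ordered against each other,
    # so check the versions before comparing.
    return ip.version == low.version == high.version and low <= ip <= high


print(in_ip_range("203.0.113.42", "203.0.113.0", "203.0.113.255"))  # True
print(in_ip_range("203.0.114.1", "203.0.113.0", "203.0.113.255"))   # False
```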
Bot detection based on an IP address and a request header
To mark all requests originating from a specific IP address and containing a specific request header as bot traffic, create a new bot detection rule as shown in the image below.
This rule checks if the request originates from a specific IP address and if the referer request header starts with www.adobe.com.
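The combined check (IP condition AND header condition) might look like the following sketch. The IP address, the constant names, and `is_bot_request` are hypothetical. The prefix comparison mirrors the rule text literally, since how the Edge Network normalizes the header value is not described here.

```python
import ipaddress
from typing import Mapping

RULE_IP = ipaddress.ip_address("192.0.2.10")  # hypothetical rule address
REFERER_PREFIX = "www.adobe.com"              # prefix taken from the rule above


def is_bot_request(client_ip: str, headers: Mapping[str, str]) -> bool:
    """Return True only when both the IP and the referer condition match."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    ip_matches = ipaddress.ip_address(client_ip) == RULE_IP
    referer_matches = normalized.get("referer", "").startswith(REFERER_PREFIX)
    return ip_matches and referer_matches


print(is_bot_request("192.0.2.10", {"Referer": "www.adobe.com/products"}))  # True
print(is_bot_request("192.0.2.10", {"Referer": "example.com"}))             # False
```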
Bot detection based on multiple conditions
You can create bot detection rules based on:
- Multiple different conditions: Different conditions are evaluated as a logical AND operation, meaning that the conditions need to be met simultaneously in order for the request to be identified as originating from a bot.
- Multiple conditions of the same type: Conditions of the same type are evaluated as a logical OR operation, meaning that if any of the conditions are met, the request is identified as originating from a bot.
The rule shown in the image below identifies a bot-originating request if the following conditions are met:
The request originates from either one of the two IP addresses, the referer header starts with www.adobe.com, and the sec-ch-ua-mobile header identifies the request as originating from a desktop browser.
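The AND/OR evaluation described above can be sketched as a small Python evaluator. Everything here is illustrative: `BotRule`, `rule_matches`, and the addresses are hypothetical, and the sketch assumes that conditions on the same header count as "the same type". It also assumes the desktop check corresponds to a `sec-ch-ua-mobile` value of `?0`, the client-hint value for a non-mobile browser.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Mapping


@dataclass
class BotRule:
    """Hypothetical in-memory model of a bot detection rule."""
    name: str
    ips: List[str] = field(default_factory=list)
    # Header name -> predicates; predicates for one header are OR-ed (assumption).
    header_conditions: Dict[str, List[Callable[[str], bool]]] = field(default_factory=dict)


def rule_matches(rule: BotRule, client_ip: str, headers: Mapping[str, str]) -> bool:
    normalized = {name.lower(): value for name, value in headers.items()}
    ip = ipaddress.ip_address(client_ip)
    # Same type (IP conditions): logical OR.
    if rule.ips and not any(ip == ipaddress.ip_address(c) for c in rule.ips):
        return False
    # Different types (IP group, each header group): logical AND.
    for header, predicates in rule.header_conditions.items():
        value = normalized.get(header)
        if value is None or not any(check(value) for check in predicates):
            return False
    return True


rule = BotRule(
    name="example rule",
    ips=["192.0.2.10", "198.51.100.7"],
    header_conditions={
        "referer": [lambda v: v.startswith("www.adobe.com")],
        "sec-ch-ua-mobile": [lambda v: v == "?0"],  # ?0 means not a mobile device
    },
)

desktop = {"Referer": "www.adobe.com/products", "Sec-CH-UA-Mobile": "?0"}
mobile = {"Referer": "www.adobe.com/products", "Sec-CH-UA-Mobile": "?1"}

print(rule_matches(rule, "198.51.100.7", desktop))  # True
print(rule_matches(rule, "198.51.100.7", mobile))   # False
```

Modeling each condition group as a list keeps the OR within a group and the AND across groups visible in the code.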