Examples of request.log analysis | AEM
The request.log of Adobe Experience Manager (AEM) is a rich source of information, including response times, which is crucial for analyzing performance issues. This article provides several examples to illustrate this point.
Description description
Environment
Experience Manager 6.5
Issue/Symptoms
The Adobe Experience Manager (AEM) request.log
contains various useful information, such as response time, for analyzing the performance issues. Here is a list of analysis examples using Linux commands (including some external commands like ruby [
1]
and datamash [
2]
).
Installation guides
[
1]
https://www.ruby-lang.org/en/documentation/installation/
[
2]
https://www.gnu.org/software/datamash/download/
Resolution resolution
Table of Contents
A. Introduction
- Format of
request.log
B. Preparation Steps
- Data cleansing
- Restarted time
- Number of accesses per hour
- Maximum concurrent processing
- Split a log file
- Merge request records and response records
C. Examples of Analysis
- The heaviest accesses
- Accesses missing response
- Slow accesses
- Time-series data of response time
- Minimum, mean (average), median, maximum response time
- Number of accesses per a period
- Number of response statuses per a period
- Most frequent URLs
access.log
records for arequest.log
record
D. Conclusion
A. Introduction
The format of request.log:
AEM 6.5 generates request.log
in the following format by default. Due to a system limitation, the command lines in this article are shown as images instead of plain text.
Example of request.log
:
In this article, a line with "->
" is referred as a “request record.” A line with “<
-” is a “response record.”
Request Record:
When a request is received by AEM, a request record is logged. It contains the date and time of receipt, the request ID, the request method, and the URL.
Response Record:
When AEM responds to a request, a response record is logged. It contains the date and time of response, the request ID, the status code, Content-Type, and the response time (in milliseconds).
Find the corresponding manual on Interpreting the request.log.
B. Preparation Steps
Step 1. Data cleansing
Before diving into the analysis of request.log
, it is important to standardize the log records.
The first sed
command removes an extra space in Content-Type of response records, to prevent wrong field separation with white space. The ruby command (see [
1]
above to install Ruby) converts the date format to ISO 8601. The ruby command also separates date and time with white space instead of a colon.
Step 2. Restarted time
Restarting AEM and a service pack installation resets the request ID of request.log
. The request records with request ID = 0
indicates there might be these kinds of operation.
In the above example, request IDs were reset to 0 at 13:08:49 and 13:26:13.
Step 3. Number of accesses per hour
Count the number of accesses per hour and the time range of the request.log
.
Step 4. Maximum concurrent processing
The number of concurrent processing helps to guess the server load of AEM.
By default, the maximum number of concurrent connections for Jetty in AEM is set to 200. There is a delay in releasing the socket after completing the response. When the number of concurrent processing exceeds approximately 170, it becomes unable to accept new requests.
Step 5. Split a log file
The request ID of request.log
is reset when AEM restarts or a Service Pack is installed. Due to this behavior, the analysis may be incorrect when a request.log
contains such operations. To perform accurate analysis and to reduce the file size being handled at once, split the request.log
using request records with the request ID = 0
.
Step 6. Merge request records and response records
Merging request and response records by request ID makes it easier to spot when performance issues started. This merged log file will be used in the later examples.
The last sed
command adds a dummy response to request records that do not have a corresponding response record. There may also be response records without request records. But they are ignorable as they are typically not an issue for investigation.
The merged log file should look like this:
C. Examples of analysis
Example 1. The heaviest accesses
Sort the merged log file by response time in descending order, including accesses without response.
Example 2. Accesses missing response
Extract accesses missing their corresponding response records using the dummy response time.
If the timing of receiving accesses without response is correlated with an increase in server load, these accesses may have triggered performance issues.
Example 3. Slow accesses
Extract accesses that took more than 10 seconds.
When the number of hits is too high, replace [ 0-9] {5}
with [ 0-9] {6}
in the grep
command to narrow down to accesses that took more than 100 seconds.
Example 4. Time-series data of response time
Extracting only the timestamp and response time from the data is useful to create graphs.
Omitting accesses that responded immediately makes the data more efficient. The following example extracts access that took more than one second.
Example 5. Minimum, mean (average), median, maximum response time
The above example uses datamash command (https://www.gnu.org/software/datamash/) for statistical processing. If the log contains accesses without response, the dummy value will influence the result.
Example 6. Number of accesses per a period
Count the number of accesses per ten minutes. The result helps to determine if large traffic caused a performance issue.
The following example narrows the data down to only POST requests. A typical use case is to determine if there is a concentration of content authoring or replication to the Publish tier.
Example 7. Number of response statuses per a period
Create a table of the number of each response status per ten minutes with datamash command.
Example 8. Most frequent URLs
Print the top three URLs which were accessed most frequently per ten minutes.
Example 9. access.log records for a request.log record
Search access.log
for records that correspond to a particular request ID.
If multiple accesses to the same URL at the same time happened, the result shows multiple access.log
records for a single request ID.
D. Conclusion
The examples in this article should help you analyze your performance issues.
The examples listed have been tested on CentOS 7.5 and Ubuntu 22.04LTS, but they may not work as expected depending on your environment, such as different versions or variations of the commands. Please adjust them accordingly to the commands installed in your environment.