Troubleshooting AEM as a Cloud Service
Learn how to troubleshoot and debug the AEM SDK, AEM as a Cloud Service, and the build and deploy process.
Transcript
Hey everyone, my name is Varsaluja and welcome to this video. Today we’ll talk about how to troubleshoot various aspects of AEM as a Cloud Service. After completing this video, you should have a general idea of how to debug the local AEM SDK, debug AEM as a Cloud Service environments, and debug build and deployment failures in your Cloud Manager executions.

Debugging the AEM SDK. As we know, the AEM SDK is the primary development environment used by developers. It supports multiple ways to debug AEM and the deployed applications. We’ll go through some common debugging tools and consoles that should help you in troubleshooting. Logs generated by the AEM SDK can provide key insights when debugging AEM applications. They are available under the crx-quickstart/logs folder. Logs act as the front line of debugging AEM applications and can help resolve complex issues with AEM, but they depend on adequate logging levels being set in the deployed AEM application. Adobe recommends keeping the logging levels of the local development environment and the AEM as a Cloud Service development environment as similar as possible, as this normalizes log visibility and avoids discrepancies between the two environments. If the default logging is insufficient for local development, custom logging can be configured through the OSGi web console.

The AEM SDK also provides an OSGi web console that offers a variety of information and introspection into the local AEM runtime, which is useful for understanding how your application is recognized and functions within AEM. AEM provides many OSGi console views, each giving key insights into different aspects of AEM, such as bundle information, configuration statuses, and more. The OSGi console can be accessed from within AEM under Tools > Operations > Web Console, or directly at /system/console.
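As a concrete illustration of custom logging, a logger can be registered as an OSGi factory configuration for the Sling Commons Log service. The sketch below is a minimal example; the config path, factory-config suffix, and the `com.example` logger name are placeholder assumptions, and note that AEM as a Cloud Service expects custom logs to go to error.log. Saved, for example, as /apps/my-app/osgiconfig/config.dev/org.apache.sling.commons.log.LogManager.factory.config~com-example.cfg.json:

```json
{
  "org.apache.sling.commons.log.level": "debug",
  "org.apache.sling.commons.log.names": ["com.example"],
  "org.apache.sling.commons.log.file": "logs/error.log"
}
```

With this in place, classes under the `com.example` package log at debug level into the standard error log, which keeps the local setup aligned with what Cloud Manager exposes.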
The OSGi console provides details for different debugging scenarios, such as validating that an OSGi bundle is present, validating that an OSGi bundle is active, determining whether an OSGi bundle has unsatisfied imports preventing it from starting, and identifying whether OSGi property values are bound to any active OSGi configurations. Other areas of the console help with functions such as servlet resolution, resource resolution, package dependency resolution, and more, which can ease the debugging effort in AEM.

Another commonly used tool is CRXDE Lite. CRXDE Lite is a web-based interface for interacting with AEM’s content repository. It provides complete visibility into the content repository, including nodes, properties, property values, and permissions, which can be helpful in troubleshooting various types of issues related to code and content. For any content- or permissions-related issues on your local environment, this tool comes in very handy. CRXDE Lite can be accessed within AEM under Tools > General > CRXDE Lite, or directly at the URL /crx/de/index.jsp.

The dispatcher tools are now part of the AEM SDK. They provide a containerized Apache web server environment that can be used to simulate the AEM as a Cloud Service dispatcher setup locally. The latest AEM SDK can be downloaded from the Software Distribution portal. Setting up the dispatcher tools and reviewing the logs and the cached content can be vital for verifying end-to-end AEM application functionality and for checking whether all the security configurations are correct. As these dispatcher tools run in a containerized environment, you can access the logs and the cache using the Docker shell. It is fairly easy to set up and can ease debugging of content delivery through the dispatcher in front of the publish service. So, in summary: install Docker, validate and run your dispatcher configuration with the local dispatcher SDK, and test end-to-end delivery.
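Those steps can be sketched as a short command sequence. Script names below are taken from the dispatcher tools bundled with the AEM SDK, but exact file names, versions, and log paths vary by SDK release, so treat this as an assumption-laden outline rather than exact commands:

```
# Extract the dispatcher tools shipped with the AEM SDK (file name varies by version)
chmod +x aem-sdk-dispatcher-tools-*-unix.sh && ./aem-sdk-dispatcher-tools-*-unix.sh

# Validate your project's dispatcher configuration (assumed to live in ./src)
./bin/validate.sh src

# Run the containerized Apache/dispatcher, pointing at a local AEM publish on port 4503
./bin/docker_run.sh src host.docker.internal:4503 8080

# Tail logs from inside the running container via the Docker shell
docker ps                                  # find the dispatcher container ID
docker exec -it <container-id> tail -f /var/log/apache2/httpd_error.log
```

Once the container is serving on port 8080, requests can be made end to end through the dispatcher to verify filters, rewrites, and caching behavior.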
In case of any problems, dig deeper using the dispatcher log and the content cache generated for your configuration to troubleshoot the problem at hand. I’ll share a reference link at the end of the session that will help you set up the dispatcher SDK.

Debugging an AEM as a Cloud Service environment. As we know, AEM as a Cloud Service is the cloud-native way to run AEM applications. It runs on self-service, scalable cloud infrastructure, which requires AEM developers to understand and debug various facets of AEM as a Cloud Service, from the build phase to the deployment phase, to getting all the details of the running AEM applications. Like the AEM SDK, logs provide details on how your application is functioning in AEM as a Cloud Service, and they also provide insight into deployment issues. All the logs generated by the author and publish services are exposed via Cloud Manager and are available for download. All log activity for a given environment’s AEM service is consolidated into a single log file, even if different pods within the service generate the log statements.

The AEM author service provides the AEM runtime server logs, which include the error, access, and request logs. The AEM error log is the Java error log for AEM, similar to the one under the crx-quickstart folder in the local SDK. The AEM access log records all HTTP requests made to the AEM service. The AEM request log records all HTTP requests made to the AEM service along with the corresponding HTTP responses. For the AEM publish service, in addition to the error, access, and request logs, the Apache web server and dispatcher logs are also available for download. These include the HTTP access, HTTP error, and AEM dispatcher logs. The HTTP access log records all HTTP requests made to AEM’s web server and dispatcher. The HTTP error log contains all the messages from the Apache web server, and helps with debugging supported Apache modules such as mod_rewrite, mod_security, et cetera.
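Once downloaded, all of these logs are plain text and can be sliced with standard command-line tools. The runnable sketch below fabricates a few lines in an SLF4J-style format (the exact AEM log line format is an assumption here) and filters for errors:

```shell
# Create a tiny sample log in an SLF4J-like format (the real aemerror format may differ)
cat > /tmp/aemerror.log <<'EOF'
01.06.2024 10:00:00.000 *INFO*  [qtp-1] com.example.MyService Service started
01.06.2024 10:00:01.250 *ERROR* [qtp-2] com.example.MyService Failed to resolve resource /content/site
01.06.2024 10:00:02.400 *WARN*  [qtp-3] org.apache.sling.engine Slow request detected
EOF

# Show only ERROR entries, then count them
grep '\*ERROR\*' /tmp/aemerror.log
grep -c '\*ERROR\*' /tmp/aemerror.log
```

The same pattern works on the downloaded Cloud Manager logs, for example to isolate errors around the timestamp of a failed deployment.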
The AEM dispatcher log contains all the messages from the dispatcher module, including filtering and serving-from-cache messages. Adobe recommends setting the logging level for the dispatcher log in the dev environment to debug or trace mode, so that if an issue arises, we can do an in-depth analysis and understand its cause. AEM as a Cloud Service supports custom logging, but currently does not support custom log files. So all custom or project-specific logging must be configured to write to error.log, which is available in Cloud Manager.

Adobe Cloud Manager also supports accessing AEM as a Cloud Service logs via the Cloud Manager plugin for the Adobe I/O CLI (aio CLI). With this, you can download and tail the logs for all your environments from the command-line interface. You can also use Cloud Manager environment variables to parameterize the log level. This allows log levels to be changed dynamically, and it can be done using the aio CLI plugin. Things to know about environment variables: they are limited in number, there is no UI for them, and they can only be set using the Cloud Manager APIs or the aio CLI plugin.

Customers who have Splunk accounts have the option to request that their AEM as a Cloud Service logs be forwarded to the appropriate index. The logging data is equivalent to what is available through the Cloud Manager logs, but customers may find it convenient to leverage the query features available in the Splunk product. Splunk forwarding is not supported for sandbox program environments. To enable Splunk forwarding, customers should submit a support request including the Splunk HEC endpoint, the Splunk index, the Splunk port, and the Splunk HEC token, to enable the integration between the customer’s Splunk account and AEM as a Cloud Service.

Each AEM as a Cloud Service environment has its own Developer Console, which exposes information similar to what the OSGi console exposes in the local AEM SDK.
Access to the Developer Console can be enabled by assigning the user to the Cloud Manager Developer product profile, and aligning this user with the AEM Users or AEM Administrators product profile in the Admin Console. CRXDE Lite is also available in your AEM as a Cloud Service development environment, and provides direct access to the JCR. It is really helpful in debugging content and access-control related issues. For debugging OSGi configurations, it is recommended to use the Developer Console instead of CRXDE Lite. Note that content paths like /apps, /libs, or /oak:index are visible in CRXDE Lite but are immutable, meaning they cannot be changed at runtime by any user. These locations in the JCR can only be modified via code deployment. You can also use the CRX Package Manager, available for all AEM author environments, to extract the content in the form of packages and analyze it offline.

Let’s do a quick demo to showcase the Developer Console. As mentioned earlier, access to the Developer Console is through the Cloud Manager UI. Once we log in to the Cloud Manager UI, you can select any of your AEM environments and access its Developer Console. You should be prompted to log in using your Adobe credentials. Once you’re logged in, you can view all the available tabs in the Developer Console. Using the pod dropdown, you can select any of the pods from your author service or publish service. You can capture and download status information about bundles, components, configurations, Oak indexes, Sling jobs, and more. You can also use the package resolution or servlet resolution tab to capture information on how your package or servlet is being resolved in AEM. The Queries tab takes you back into AEM to help you analyze query performance. The Integrations tab provides you with a local development token or the service credentials that you can use to authenticate to AEM from third-party applications.

Debugging build and deployment failures.
As we know, code build and deployment activities for AEM as a Cloud Service are done through Cloud Manager’s pipeline executions. Failures may occur during the steps of this process, which might require action to resolve. We’ll go through some common failures in this deployment cycle and how you can best approach them.

The first step in a Cloud Manager execution is the validation step, which simply ensures that the basic Cloud Manager configurations are valid. There can be a scenario where, once you start the execution, it fails in the validation step, indicating that the pipeline execution cannot be started and that the environment is in an invalid state, which is reflected in the Cloud Manager UI. The reason is that the target environment of the pipeline is in a transitional state and cannot accept new builds; for example, the environment is being created or deleted. One should wait for the state to resolve and then retry starting the execution. As we know, deployment pipeline executions are tied to an environment and a Git branch, so there can also be situations where the pipeline execution cannot be started because the environment or the Git branch the pipeline is configured against has been marked as deleted. One needs to edit the pipeline configuration and reconfigure the target environment or select a Git branch before retrying the execution.

The next step in Cloud Manager is the build and unit testing phase, which performs a Maven build, that is, the mvn clean package command, for the project checked out from the pipeline’s configured Git branch. Errors identified in this phase should be reproducible by building the project locally, and can be fixed on the local dev environment; once fixed, the changes can be pushed back to Git and the step re-run. A local build won’t be able to reproduce issues arising from Maven dependencies that are unavailable or unreachable from Cloud Manager, such as the project code using a private internal Maven repository that Cloud Manager cannot access.
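One quick local check for that last failure mode is to look for repository declarations in the POM that point at hosts Cloud Manager cannot reach. The sketch below is runnable as-is; the POM fragment and hostname are fabricated for illustration:

```shell
# Fabricated pom fragment with a private repository declaration
cat > /tmp/pom-fragment.xml <<'EOF'
<repositories>
  <repository>
    <id>internal-releases</id>
    <url>https://maven.internal.example.com/releases</url>
  </repository>
</repositories>
EOF

# List declared repository URLs; any internal-only host will fail in Cloud Manager
grep -o '<url>[^<]*</url>' /tmp/pom-fragment.xml
```

Running the same grep against your real pom.xml files surfaces every repository the build depends on, so you can confirm each one is publicly reachable before pushing.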
A local build also won’t surface issues with unsupported Maven plugins used in the project.

The next step is the code scanning phase, which performs static code analysis using a mix of Java and AEM-specific best practices. Code scanning results in a build failure if critical security vulnerabilities exist in the code. Lesser violations can be overridden, but it is recommended that they be fixed. Note that code scanning is imperfect and can produce false positives, which should be analyzed further. To resolve code scanning issues, review the summary, download the CSV report provided by Cloud Manager under the Download Details section, and take action accordingly. All activity in the code scanning phase is logged and available for download from the Cloud Manager UI.

Next is the build image phase, which is responsible for combining the code artifacts created in the build and unit testing phase with the AEM release, forming a single deployable artifact. While code build and compilation issues are found during the build and unit testing phase, configuration or structural issues can surface when attempting to combine the custom build artifacts. So there can be situations where the pipeline execution fails in the build image step, for example due to malformed repoinit scripts, usage of multiple versions of Core Components, and more. These issues are logged in the build image log, which is available for download from Cloud Manager. For a malformed repoinit script, we need to ensure that the directives in the script are defined correctly; this can be reproduced on your local AEM SDK and should be reflected in its error log. Another common case is the usage of a Core Components version greater than the deployed version. AEM as a Cloud Service automatically includes the latest Core Components version in every AEM release.
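For reference, repoinit scripts are delivered as an OSGi factory configuration, and a syntax error in any statement can fail the build image step. A minimal well-formed sketch follows; the config path, service user, and content path are made-up examples. Saved, for instance, as /apps/my-app/osgiconfig/config/org.apache.sling.jcr.repoinit.RepositoryInitializer~my-app.cfg.json:

```json
{
  "scripts": [
    "create path (sling:Folder) /conf/my-app",
    "create service user my-app-reader with path system/my-app",
    "set ACL for my-app-reader\n    allow jcr:read on /conf/my-app\nend"
  ]
}
```

Deploying the same script to the local AEM SDK first is a cheap way to catch malformed directives before the pipeline does.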
This means that after an AEM as a Cloud Service environment is automatically or manually updated, the latest version of Core Components is deployed to it. To prevent this failure, whenever an update for an AEM as a Cloud Service environment is available, it should be included as part of the next build and deploy, and always ensure the update is applied before referencing the newer Core Components version in the application’s code base.

The deploy step is responsible for taking the artifact generated in the build image step, starting up new AEM author and publish services, and, upon success, deploying it in the form of a blue/green deployment. Note that the log available via the download log button in the deploy step is not the AEM error log and does not contain detailed information about your application’s startup; it only contains logs for the process that deploys the artifact. The AEM error log, available for download in Cloud Manager, contains all the information about the startup and shutdown of the AEM service, which may be applicable to issues in the deploy step.

Let’s discuss a few common reasons why an execution fails in the deploy step. There are situations where the Cloud Manager pipeline holds an older version of AEM than what is deployed on the target environment. This may cause the execution to fail due to known product or infrastructure issues that might already be fixed in the latest AEM release. The best course forward, if the target environment shows an update available option, is to select Update from the environment actions and then re-run the build. There can also be situations where code running during the startup of the newly deployed AEM service takes so long that Cloud Manager times out before the deploy can complete successfully. In these cases, the deployment may eventually succeed even though the Cloud Manager status reports a failure.
This may be caused by query traversals or by a delayed startup due to the OSGi lifecycle of the custom bundles being deployed. The best course here is to review the implementation of code that runs early in the OSGi bundle lifecycle, and to review the AEM error logs of the AEM author and publish services around the time of the failure, looking for log messages that indicate custom code running slowly during startup. Although most code and configuration violations are caught earlier in the build cycle, it is possible for an incompatible configuration or some code to go undetected until it executes on the newly deployed AEM. If this happens, we need to review the AEM error logs of the AEM author and publish services around the time of the failure as shown in the Cloud Manager UI. The best bet is to review the logs for any errors thrown by the Java classes of the custom application. If any issues are found, resolve them, push the fixed code, and re-run the pipeline.

Help and resources. If the above troubleshooting approaches do not resolve the issue at hand, customers and partners can reach out to Adobe Support for guidance. This can be done by creating a support ticket via the Admin Console: access the Support tab and create a support ticket with all the details of the problem, so the respective teams can help qualify the issue and expedite the resolution. A quick note: if you are a member of multiple Adobe orgs, ensure the correct org is selected in the org switcher prior to creating the case. Tutorials, documentation, and discussion threads related to AEM as a Cloud Service are available on Experience League and its forums, with plenty of information to help partners and customers learn more and grow with AEM as a Cloud Service. Thank you for watching this video.