Search and indexing

Learn about AEM as a Cloud Service’s search indexes, how to convert AEM 6 index definitions to be AEM as a Cloud Service compatible, and how to deploy indexes to AEM as a Cloud Service.

Transcript
This is search and indexing. My name is Darren Kuntze. I am a Senior Cloud Architect. Today’s agenda covers the notable changes versus legacy indexes, followed by an explanation of blue-green deployment, then transforming legacy indexes using the index converter, deploying those indexes to your Cloud Service instance, and a few troubleshooting tips to wrap it up.

So first let’s look at notable changes. Lucene is the only index type that’s supported. So if you have any old property indexes or Solr indexes, you need to convert them to Lucene for them to be supported. We also use the blue-green deployment model, which allows for fast rollback and zero-downtime deployments. And the index definitions themselves have a versioned naming scheme that always increments, always goes forward. There’s no index manager available like you had on the previous 6.5 versions, and adding indexes must go through the CI/CD pipeline with the path printed here on the screen. Subdirectories are not supported, so you can’t have a nested directory structure for the index definitions.

So let’s look at some examples of index definitions and how they impact what you’re deploying. Take an out-of-the-box index like damAssetLucene, as you can see in the different columns here. If there are multiple versions deployed and there’s no additional custom index deployed on top of that, then in any version that’s deployed on AEM, the damAssetLucene index will be used. But if you customize the damAssetLucene index by adding -custom-1 to it, then you can see it’s still an out-of-the-box index. It says yes, customized. In the case of the -custom-1 index, you can see version one would use the original version, so version two would now use your custom index. In the case of a fully custom index, not a customization of an out-of-the-box one, like this ACME product index, it’s not an out-of-the-box index.
You can see the first few columns say no, but -custom-1, the original version, of course, will be used in version one. If you deploy another version of your index, the first version will no longer be used and -custom-2 will be used. In the case of an out-of-the-box index like cqPageLucene, similar to the damAssetLucene example at the top, it will be used in all versions.

So, changing and customizing an index. The best practice here, and I can’t emphasize this enough, is that you should copy the latest version from your Cloud Service instance, not the local SDK. So go up to your Cloud Service instance, poke around, find the one that you want to customize, and take that. Go to your dev environment where you have access to CRXDE, look at the index definition, and copy that precisely. Then you can add your customizations on top of that, along with any sub-nodes. So if there’s a Tika configuration, you need to bring that along too. If you disable that, then various bits and pieces of the index will not work.

And the example naming: if damAssetLucene already has a version six, it’d be named damAssetLucene-6. And if you wanted to customize that, you’d add -custom-1 to the end of it. And then you’re good to go.

So, removing and undoing an index. If you’ve deployed something that you either don’t need or you screwed something up, then you can only remove custom indexes. So what you would do in this case, if we did screw up the -custom-1 index, is deploy a -custom-2 index and basically roll it back to the previous functioning version, the one that either was out of the box or takes out the features that you added. And to emphasize what’s here on the slide: you cannot roll back directly to another version.
So if there was a custom three and four and you wanted to roll back to three, you can’t just get rid of index definition four and roll back to three. You’d have to create an index definition five with the attributes and traits of index definition three. Removing an index is a matter of creating a new version that is empty and has some dummy traits on it so that it will never be used as an index. So you would just deploy another version with attributes that make it so it doesn’t get picked up.

So here are a few visual examples of what we were just talking about. In the case of damAssetLucene-custom-1, the out-of-the-box index is used in version two, but not three, because in the next example we have damAssetLucene-2, a new version of the damAssetLucene index. So that one obviously works in version three. Then we have the -custom-1 on top of that. It is an out-of-the-box index, but it’s merged to function in version three. cqPageLucene is an out-of-the-box one. Consider that the first version of it. So it’s out of the box and it would work in version two. Now, if version three comes out and you have cqPageLucene-2, it is out of the box, but it will not be picked up in version two and will be used in version three.

It should be noted here that whenever new versions of the out-of-the-box indexes are released, you will have to pay attention and add an additional customized version. Currently, work is underway to do this automatically for you, so that when a new version comes out you don’t have to continuously update your custom indexes; it will be merged automatically. But as of right now, that feature has not been released. I’m assuming it will be at a future date.

So let’s talk about blue-green deployment. Blue-green deployment gives you the ability to do zero-downtime updates and upgrades. It also gives you the capability of a fast rollback.
So if you have multiple versions, say version one of your release and you want to release version 1.1, you always have the capability to roll back if something were to go wrong. In the context of indexing with the blue-green deployment model, the content is indexed before becoming green. So there is no additional step after the fact. When it becomes active, it will not index, because we’re doing the indexing before. It does happen during the deployment step and it can time out. So if you have a lot of content or the environment is under a fairly heavy load, there is a slight possibility that it could time out. The solution to that is just to rerun the pipeline, and it should resolve itself, because multiple timeouts are fairly rare. Timeouts themselves are rare, and having it happen multiple times is extremely rare.

You can see from the diagrams here a typical blue-green environment. The blue environment is active on the far left, and traffic is going to that blue environment. In the middle part, the DevOps pipeline is running. It deploys your code and your indexes to the green environment. You can see that the traffic from the site is still going to the blue environment. Then, whenever the indexing and deployment and the images are all built and everything’s done, eventually, on the right here, the traffic at the load balancer is switched over to the green environment. But again, like we said before, the fast rollback is there because there is still a release 1.0 right there. So if something were to happen in the deploy step, say an index didn’t fit, or let’s use the example of the timeout: if re-indexing occurs and a timeout happens, it’s not going to release the green environment. The load balancer is not going to switch over, so you will stick with 1.0. We got pretty close in that instance, but not close enough. It still didn’t meet all the metrics for a release.

Let’s look at transforming legacy indexes.
Transforming legacy indexes can be done manually, or you can use the index converter tool. There’s a QR code that you can use, or if you want to pause this video and type in that long URL, you’re welcome to. We’re going to have a demo of that here in a moment, but the index converter tool only transforms Lucene-type index definitions, which are present under /apps or /oak:index. It’s not that it couldn’t transform other types of indexes, but Lucene index definitions are the only type supported on Cloud Service.

The Ensure Oak Index definitions from the ACS Commons project are not supported on Cloud Service, therefore it will not migrate those. So what you should do is migrate those to regular Oak index definitions first, and then run the index converter tool. So it’s just a matter of, if you have a local environment, deploying the Ensure Oak Index definitions that aren’t supported, copying the resulting definitions out to your project, and then running the migration tool on that.

And once again, I will emphasize from a troubleshooting standpoint: always copy the out-of-the-box index definition from Cloud Service and not the SDK before making any customizations, because there are certain cases where the SDK might not be completely up to date with the Cloud Service instance. So always grab them from the latest version that you have in your Cloud Service development environment.

Let’s look at a demo. Here you have a project that you’re probably familiar with from some of the other modules that you’ve seen. What I’ve done is gone ahead and installed the index converter tool as part of the migration tools that you’ve probably already used, the Repository Modernizer and so on, and created my configuration file up here at the base of my project. It’s in the .aio-cli directory. It’s called aem-migration-config.yaml.
In that file, if you have Ensure Index definitions, you add the path to those. It’ll do some conversion to a traditional index and then create the custom index definitions for you. Make sure you have the correct AEM version that you’re using. The key here, for the demo I’m going to show today, is that you have the actual path to the custom Oak index directory, as well as the filter XML path. You can see I’m using an absolute path all the way to my definitions.

If you look at the project itself, there’s an Oak indexes directory at the base of the project. It’s its own module in this case, a Maven module. And if we drill down through the directories and arrive at our content XML, you can see inside here there is a damAssetLucene customization. Ideally this customization would be up to date with the latest on Cloud Service, so that we’re converting it to an actual, accurate definition. There are also a couple of other definitions here at the bottom. You can see these are property indexes and an ordered property index: weekend ID and weekend termination date. So if we run the tool on these, we can expect that the Lucene index, which the converter supports, will be converted, and for these other two it will suggest some alternatives for converting them.

So if we jump into the command itself and just run that, it’ll read the configuration file and should output some useful information. You can see that it’s running. It found the indexes, created a report, has a log, output a content XML with the actual converted index, and updated a filter XML. It didn’t overwrite the existing one, so you have to go into the target here and expand it out. And you can see there is the content XML, which has the index definition for the index that it converted. You can see quite a bit of information in there. It also named it appropriately with the new index definition.
The name, as you can see, is damAssetLucene-6-custom-1. And if you look at the filter XML, it added that to the filter XML. Comparing it to the original one, it’s pretty close to the same, but note the weekend ID index definition in the filter XML. If you look at the report, it’ll go through the changes that it made. So here it is: it converted the damAssetLucene index and created a custom one for that. As for the other indexes that it did not transform, you can see down here at the bottom weekend ID and termination date, which are property indexes. And you can see there’s some additional help here in the link to take those indexes and convert them into proper Lucene indexes. So now you can take this content XML, move it to your apps directory, and then deploy it, which we’ll be doing in the next section on deploying.

Next up, deploying. Some of the best practices for deploying are to deploy the definitions in their own module or as part of your ui.apps module. The sample project that we were working on when we were doing the index conversion had it in its own module, but the best practice overall is to put it in your ui.apps module if you don’t want to trouble yourself with creating your own module just for indexes. Bring that down into the jcr_root, and in the Oak index node itself there’ll be a content XML. You can take the definition that you did the conversion with and drop it in there. Or if you have existing ones, merge the two together. There should not be subfolders, so you can’t break them down into further directories like you could in previous versions. And content packages with index definitions, as we’ll show here, must have noIntermediateSaves set to true in the properties XML.
This makes it so that when an index definition is getting installed, due to the nature of the FileVault mechanism, it will not do intermediate saves during the install, just to save time. So it does a big bang right at the end. And be conscious about the amount of time that it will take to index all the content that you have in your repository. Large volumes of content may time out during the indexing step. As a preview of what we’re going to talk about in troubleshooting: the indexing step might take extra long, and the Cloud Manager API has a timeout. So if it’s taking too long, it may time out and cause a failure in the pipeline, when in actuality, if you have a lot of content, the indexing step is still going on in the background and may eventually complete. The proper step to mitigate this is basically to run the pipeline again, because if you have a correct definition and everything’s going well, the pipeline will run again, see that the content did get indexed, with no timeout this time because the index definition didn’t change, and everything should be good.

That leads us into our final section on troubleshooting. Number one is to always stay up to date with the out-of-the-box indexes that you may have customized. Taking the example of the damAssetLucene index, if you have customized version six, then you want to make sure that when version seven comes out, if and when it does, you change your custom index to match that out-of-the-box index. And that includes the sub-nodes. In the case of something like the damAssetLucene index, you should have the Tika nodes and some of those sub-nodes in your customization as well. Those can change too, so there might be some additional configuration that needs to be done. There is an auto-merge feature coming in the future, which should remedy this, so you won’t have to worry so much about that.
It will also auto-merge your changes with any new things that come out. There’s also some validation from Cloud Manager: a rule within the pipeline itself that will validate the index definition, so that you have some sort of programmatic sanity check based on actual real-world rules, for example if you were customizing some out-of-the-box indexes and something didn’t match up.

I’ve emphasized this a few times during this presentation, but always, always copy the latest index definition from Cloud Service before customizing, not from the local SDK but from the development Cloud Service instance where you can actually access CRXDE Lite. So go in there, compare them, and make sure everything is synced up. Why would you want to do that? Again, features of the out-of-the-box functionality of AEM as an application might depend on what’s in there. There may be some additional properties or whatnot, and if you don’t copy those over, you might be breaking some out-of-the-box functionality by forgetting or leaving them out.

One of the tips for troubleshooting is to monitor the Cloud Manager deployment logs. As the deployment is going, in real time, you can monitor those logs either through the UI or the CLI, and when it reaches the indexing step, you can get a jump on mitigating any issues that come up. Those indexing issues mainly come down to the timeout, or you might be indexing the wrong thing. You may have touched something that doesn’t need to be indexed. Those are the types of things that you would see in there. The timeout that I mentioned on the previous slide would also show up there.

Some of the performance and explain tools that you’re familiar with from previous versions of AEM are available in the Developer Console as well. So you’d go to the Developer Console for the environment.
And there are a few clicks in there: under queries, for query performance and explain query, go in there and you’ll see the familiar slow queries and common queries type of interface. There’s also, on the Developer Console main page, the ability to query and download the Oak index stats, or status, and that’ll show you all the configured indexes and their attributes. With that, we’ll wrap it up. Thank you for watching.
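
The versioned naming scheme described in the video (a base name, an optional product version, and an optional custom version) can be sketched in a few lines of Python. This is only an illustration: the helper names `parse`, `newest`, and `next_custom_name` are hypothetical, and the "highest version pair wins" ordering is an assumption drawn from the examples in the talk, not a description of how AEM selects indexes internally.

```python
import re
from typing import NamedTuple

# Assumed pattern, based on the names used in the video:
#   <base>[-<productVersion>][-custom-<customVersion>]
_NAME = re.compile(
    r"^(?P<base>.+?)(?:-(?P<product>\d+))?(?:-custom-(?P<custom>\d+))?$"
)


class IndexName(NamedTuple):
    base: str
    product: int  # 0 when the name has no product version suffix
    custom: int   # 0 when the index is not customized


def parse(name: str) -> IndexName:
    """Split an index name into base name, product version, custom version."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"not a recognizable index name: {name!r}")
    return IndexName(
        m.group("base"),
        int(m.group("product") or 0),
        int(m.group("custom") or 0),
    )


def newest(names: list) -> str:
    """Return the name with the highest (product, custom) version pair."""
    return max(names, key=lambda n: parse(n)[1:])


def next_custom_name(ootb_name: str, deployed: list) -> str:
    """Name for the next customization of ootb_name.

    Versions only move forward, so undoing a change also means
    deploying a higher custom version rather than deleting one.
    """
    target = parse(ootb_name)
    customs = [
        parse(n).custom
        for n in deployed
        if parse(n).base == target.base and parse(n).product == target.product
    ]
    suffix = f"-{target.product}" if target.product else ""
    return f"{target.base}{suffix}-custom-{max(customs, default=0) + 1}"


if __name__ == "__main__":
    deployed = ["damAssetLucene-6", "damAssetLucene-6-custom-1"]
    print(parse("damAssetLucene-6-custom-1"))
    print(next_custom_name("damAssetLucene-6", deployed))
```

For example, with `damAssetLucene-6-custom-1` already deployed, a fix or rollback would be shipped under the name that `next_custom_name` returns, a new `-custom-2` definition carrying the desired content.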

Index Converter Tool

As part of refactoring your code base, use the Index converter tool to convert custom Oak index definitions to AEM as a Cloud Service compatible index definitions.

Review the index converter documentation for the complete and current set of Index Converter capabilities.

Key activities

  • Use the Adobe I/O Workflow Migrator tool to migrate asset processing workflows to use the Asset Compute microservices.
  • Set up a local development environment and deploy the customized indices. Ensure that the updated indices are up to date.
  • Deploy the updated code base to an AEM as a Cloud Service development environment and continue to validate.
  • If modifying an out-of-the-box index, ALWAYS copy the latest index definition from an AEM as a Cloud Service environment running on the latest release. Modify the copied index definition to fit your needs.
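
Before deploying, it can help to check locally that the package carrying the index definitions has the noIntermediateSaves property mentioned in the video. The sketch below is a hedged illustration, not an official validation: it assumes the package properties file uses the Java XML properties layout (`<entry key="...">` elements), and the function name `missing_properties` and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Property named in the video; the expected value is an assumption
# taken from the transcript ("set to true").
REQUIRED = {"noIntermediateSaves": "true"}

# Hypothetical sample of a package properties file.
SAMPLE = """<properties>
  <entry key="name">acme.indexes</entry>
  <entry key="noIntermediateSaves">true</entry>
</properties>"""


def missing_properties(properties_xml: str, required: dict = REQUIRED) -> list:
    """Return the required keys that are absent or have a different value."""
    root = ET.fromstring(properties_xml)
    found = {
        entry.get("key"): (entry.text or "").strip()
        for entry in root.iter("entry")
    }
    return sorted(k for k, v in required.items() if found.get(k) != v)


if __name__ == "__main__":
    problems = missing_properties(SAMPLE)
    print("OK" if not problems else f"Missing or wrong: {problems}")
```

An empty list means the property is present with the expected value; any returned key is worth fixing before running the pipeline.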

Hands-on exercise

Apply your knowledge by trying out what you learned with this hands-on exercise.

Prior to trying the hands-on exercise, make sure you’ve watched and understood the video above, and the following materials:

Also, make sure you have completed the previous hands-on exercise:

Hands-on exercise GitHub repository

Hands-on with indexes

Explore defining and deploying Oak indexes to AEM as a Cloud Service.

Try out indexing
