Search and indexing

Learn about AEM as a Cloud Service’s search indexes, how to convert AEM 6 index definitions to be AEM as a Cloud Service compatible, and how to deploy indexes to AEM as a Cloud Service.

This is Search & Indexing. My name’s Darin Kuntze, and I am a Senior Cloud Architect. On today’s agenda, we’ll go over the notable changes versus legacy indexes, followed by an explanation of blue-green deployment, then transforming legacy indexes using the index converter and deploying those indexes to your Cloud Service instance, and a few troubleshooting tips to wrap it up.
So first, let’s look at the notable changes. Lucene is the only index type that is supported. So if you have any old property indexes or Solr indexes, you need to convert them over to Lucene for them to be supported. We also use the blue-green deployment model, which allows for fast rollback and zero-downtime deployments, and the index definitions themselves have a versionable naming scheme that always increments, always goes forward. There’s no Index Manager available like you would have on the previous 6.5 versions, and adding indexes must go through the CI/CD pipeline, with the path printed here on the screen. Subdirectories are not supported, so you can’t have a nested directory structure for the index definitions.
So let’s look at some examples of index definitions and how they impact what you’re deploying. Take an out-of-the-box index like damAssetLucene, as you can see in the different columns here. If there are multiple versions deployed and no additional custom index is deployed on top of it, then the damAssetLucene index will be used in any version that’s deployed on AEM. But if you customize the damAssetLucene index by adding -custom-1 to it, you can see it’s still an out-of-the-box index; it says yes, customized. In the case of -custom-1, version one would still use the original index, and version two would now use your custom index. In the case of a plain custom index, not a customized out-of-the-box one, like this acme product index, it’s not an out-of-the-box index.
So the first few columns say no, but -custom-1, the original version, will of course be used in version one. If you deploy another version of your index, -custom-1 will not be used and -custom-2 will be used. An out-of-the-box index such as cqPageLucene, similar to damAssetLucene at the top, will be used in all versions.
Next, changing and customizing an index. The best practice here, and I can’t emphasize this enough, is that you should copy the latest version from your Cloud Service instance, not the local SDK. So go up to your Cloud Service instance, poke around, find the one that you want to customize, and take that one. Go to your dev environment where you have access to CRXDE Lite, look at the index definition, and copy it precisely. Then you can add your customizations on top of it, along with any sub-nodes. So if there’s a Tika configuration, you need to bring that along too. If you leave that out, then various bits and pieces of the index will not work. As for example naming: if damAssetLucene is at version six, it’d be named damAssetLucene-6.
And if you wanted to customize that, then you’d add -custom-1 to the end of it. And then you’re good to go.
Next, removing and undoing an index. If you’ve deployed something that you either don’t need, or you screwed something up, keep in mind that you can only remove custom indexes. So if we did screw up the custom-1 index, what we would do is deploy a custom-2 index and basically roll it back to the previous functioning version, the one that either was out of the box or takes out the features that you added. As emphasized here on the slide, you cannot roll back directly to another version. So if there were a custom-3 and a custom-4, and you wanted to roll back to three, you can’t just get rid of index definition four and roll back to three. You’d have to create an index definition five with the attributes and traits of index definition three. Removing an index is a matter of creating a new version that is empty and has some dummy traits on it so that it will never be used as an index. So you would just deploy another version with attributes that make it so it doesn’t get picked up.
Here are a few visual examples of what we were just talking about. In the case of damAssetLucene-custom-1, the customized out-of-the-box index is used in version two but not in version three, because in the next example we have damAssetLucene-2, a new version of the damAssetLucene index. For that one, which works in version three, we have -custom-1 on top of it. So it is out of the box, but it’s merged to function in version three. cqPageLucene is an out-of-the-box index, the first version of it. Since it’s out of the box, it would work in version two. Now, if version three comes out, you have cqPageLucene-2, which is out of the box but will not be picked up in version two and will be used in version three.
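
The naming and roll-forward rules described above can be sketched in a few lines of Python. This is a minimal illustration, not part of AEM or the index converter: the helper names `parse` and `next_custom_name` are hypothetical, and the pattern assumes names shaped like damAssetLucene-6-custom-1, where the out-of-the-box version number is optional.

```python
import re
from typing import NamedTuple, Optional

# Assumed name shape: <base>[-<product version>][-custom-<custom version>]
_NAME = re.compile(r"^(?P<base>.+?)(?:-(?P<product>\d+))?(?:-custom-(?P<custom>\d+))?$")


class IndexName(NamedTuple):
    base: str
    product: Optional[int]  # out-of-the-box version, None when the name has none
    custom: int             # customization version, 0 when not customized


def parse(name: str) -> IndexName:
    """Split an index name into base name, product version and custom version."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognizable index name: {name!r}")
    product = match["product"]
    return IndexName(
        match["base"],
        int(product) if product is not None else None,
        int(match["custom"] or 0),
    )


def next_custom_name(name: str) -> str:
    """Return the name of the next custom version; versions only move forward."""
    base, product, custom = parse(name)
    prefix = base if product is None else f"{base}-{product}"
    return f"{prefix}-custom-{custom + 1}"


if __name__ == "__main__":
    print(next_custom_name("damAssetLucene-6"))           # first customization
    print(next_custom_name("damAssetLucene-6-custom-1"))  # roll forward to undo custom-1
```

To undo a faulty damAssetLucene-6-custom-1, for example, you would deploy a definition under the name returned by `next_custom_name("damAssetLucene-6-custom-1")` that carries the traits of the version you want to return to.
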
It should be noted here that whenever new versions of the out-of-the-box indexes are released, you will have to pay attention and add an additional customized version. Work is currently under way to do this automatically, so that when a new version comes out you don’t have to continuously update your custom indexes; they will be merged automatically. But as of right now, that feature has not been released. I’m assuming it will arrive at a future date.
Let’s talk about blue-green deployment. Blue-green deployment gives you the ability to do zero-downtime updates and upgrades. It also gives you the capability of fast rollback. So if you have multiple versions, say version 1.0 of your release and version 1.1, you always have the capability to roll back if something were to go wrong. In the context of indexing with the blue-green deployment model, the content is indexed before the new environment becomes green. So there is no additional step after the fact: when it becomes active, you will not need to index it, because we’re doing the indexing beforehand. Indexing happens during the deployment step, and it can time out. So if you have a lot of content, or the environment is under a fairly heavy load, there is a slight possibility that it could time out. The solution is just to rerun the pipeline, and it should resolve itself, because timeouts themselves are rare and having one happen multiple times is extremely rare.
You can see from the diagrams here what happens in a typical blue-green environment. On the far left, the blue environment is active, and traffic is going to that blue environment. In the middle part, the DevOps pipeline is running. It deploys your code and your indexes to the green environment. You can see that the traffic from the site is still going to the blue environment. Then, whenever the indexing and deployment are done and the images are all built, the traffic at the load balancer is eventually switched over to the green environment, shown on the right here. But again, like we said before, fast rollback is possible because release 1.0 is still right there. So suppose something were to happen in the deploy step: an index didn’t fit, or let’s use the example of the timeout. If re-indexing occurs and a timeout happens, it’s not going to release the green environment. So the load balancer is not going to switch over.
So you will stick with 1.0. We got pretty close in that instance, but not close enough; it still didn’t meet all the metrics for a release.
Let’s look at transforming legacy indexes. You can do it manually, or you can use the index converter tool. There’s a QR code that you can use, or if you want to pause this video and type in that long URL, you’re welcome to. We’re going to have a demo of that in a moment. The index converter tool only transforms custom Oak index definitions of type Lucene, which are present under apps or Oak Index. It’s not that there aren’t other types of indexes, but Lucene index definitions are the only type supported under Cloud Service. The Ensure Oak Index definitions from the ACS AEM Commons project are not supported on Cloud Service, so the tool will not migrate those. What you should do is migrate those to regular Oak index definitions first and then run the index converter tool. So if you have a local environment, you could deploy the Ensure Oak Index definitions there, copy the resulting index definitions out to your project, and then run the migration tool on that. And once again, I will emphasize from a troubleshooting standpoint: always copy the out-of-the-box index definition from Cloud Service and not the SDK before making any customizations, because there are certain cases where the SDK might not be completely up to date with the Cloud Service instance. So always grab them from the latest version that you have in your Cloud Service development environment. Let’s look at a demo.
So here you have a project that you’re probably familiar with from some of the other modules that you’ve seen. What I’ve done is gone ahead and installed the index converter tool as part of the migration tools that you’ve probably already used for the repository modernizer and so on, and created my configuration file up here at the base of my project. It’s in the .aio.cli directory. It’s called aem-migration-config.yaml.
As for the contents of the file: if you have Ensure Index definitions, you add the path to those, and it’ll do some conversion to a traditional index and then create the custom index definitions for you. Make sure you have the correct AEM version of what you’re using. The key here, for the demo I’m going to show today, is that you have the actual path to the custom Oak Index directory, as well as the filter XML path. You can see I’m using an absolute path all the way to my definitions.
If you look at the project itself, there’s an Oak Index directory at the base of the project. I guess it’s a module in this case, a Maven module. If we drill down through the directories and arrive at our content XML, you can see that inside there is a damAssetLucene customization. Ideally this customization would be up to date with the latest in Cloud Service, so that we’re converting it to an accurate definition. There are also a couple of other definitions here at the bottom. You can see these are property indexes and an ordered property index: WKNDId and weTerminationDate. So if we run the tool on these, we can expect that the Lucene index, which the converter supports, will be converted, and for the other two it will suggest some alternatives for converting them to Lucene indexes.
If we jump into the command itself and just run it, it’ll read the configuration file and should output some useful information. You can see it’s running. It found the indexes, created a report, and produced a log, an output content XML with the actual converted index, and an updated filter XML. It didn’t overwrite the existing one, so you have to go into the target here and expand it out. You can see there is the content XML, which has the index definition for the index that it converted. There is quite a bit of information there. It also named it appropriately with the new index definition name: you see damAssetLucene-6-custom-1.
And then if you look at the filter XML, it added that index to the filter XML. Comparing it to the original one, it’s pretty close to the same, but the WKNDId index definition should be noted in the filter XML. If you look at the report, it’ll go through the changes that it did make. Here it converted the damAssetLucene index and created a -custom-1 for it. As for the indexes that it did not transform, you can see down here at the bottom WKNDId and TerminationDate, which are property indexes. And you can see there’s some additional help here in the link on taking those indexes and converting them into proper Lucene indexes. So now you can take this content XML, move it to your apps directory, and then deploy it, which we will be doing in the next section on deploying.
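
What the demo shows, with the Lucene definition converted and the property indexes only reported, can be illustrated with a short Python sketch. This is not the index converter itself: the sample XML is a simplified, hypothetical stand-in for an Oak index content XML, the node names `wkndId` and `terminationDate` are placeholders, and `classify` is an assumed helper name. It only sorts definitions by their `type` property.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified index definitions for illustration only.
SAMPLE = """<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0"
          xmlns:oak="http://jackrabbit.apache.org/oak/ns/1.0"
          jcr:primaryType="nt:unstructured">
    <damAssetLucene jcr:primaryType="oak:QueryIndexDefinition" type="lucene"/>
    <wkndId jcr:primaryType="oak:QueryIndexDefinition" type="property"/>
    <terminationDate jcr:primaryType="oak:QueryIndexDefinition" type="ordered"/>
</jcr:root>
"""


def classify(xml_text):
    """Split index definitions into Lucene ones and ones needing manual rework."""
    root = ET.fromstring(xml_text)
    lucene, needs_rework = [], []
    for node in root:
        # Only type="lucene" definitions are candidates for conversion.
        if node.get("type") == "lucene":
            lucene.append(node.tag)
        else:
            needs_rework.append(node.tag)
    return lucene, needs_rework


if __name__ == "__main__":
    lucene, needs_rework = classify(SAMPLE)
    print("candidates for conversion:", lucene)
    print("rewrite as Lucene by hand:", needs_rework)
```

A check like this can be useful before running the converter, to get a quick list of the definitions that will need to be rewritten as Lucene indexes by hand.
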
Next up, deploying. One of the best practices for deploying is to deploy the index definitions in their own module, or as part of your UI apps module. The sample project that we were working on during the index conversion had them in their own module, but the best practice overall is to put them in your UI apps module if you don’t want to trouble yourself with creating a module just for indexes. Drill down into the JCR root, and in the Oak Index node itself there’ll be a content XML. You can take the definition that you converted and drop it in there, or if you have existing ones, merge the two together. There should not be subfolders, so you can’t break the definitions down into further directories like you could in previous versions. And content packages with index definitions, as we show here, must have no intermediate saves set to true in the properties XML. Due to the nature of the FileVault mechanism, this makes it so that an index definition will not do intermediate saves while it’s being installed, just to save time. It does a big bang right at the end.
Also be conscious of the amount of time it will take to index all the content that you have in your repository. Large volumes of content may time out during the indexing step. As a preview of what we’re going to talk about in troubleshooting: the indexing step might take an extra long time, and the Cloud Manager API has a timeout. So if indexing is taking too long, it may time out and cause a failure in the pipeline, even though, in actuality, if you have a lot of content, the indexing step is still going on in the background and may eventually complete. The proper step to mitigate this is basically to run the pipeline again, because if in fact you have a correct definition, everything is going well.
Then the pipeline will run again, see that the content did get indexed, with no timeout this time because the index definition didn’t change, and everything should be good.
That leads us into our final section on troubleshooting. Number one is to always stay up to date with the out-of-the-box indexes that you may have customized. Taking the example of the damAssetLucene index, if you have customized version six, then you want to make sure that when version seven comes out, if and when it does, you change your custom index to match that out-of-the-box index. And that includes the sub-nodes. In the case of something like the damAssetLucene index, you should have the Tika nodes and some of those other sub-nodes in your customization as well. Those can change too, so there might be some additional configuration that needs to be done. There is an auto-merge feature coming in the future, which should remedy this, so you won’t have to worry so much about it. It’ll auto-merge your changes with any new things that come out. There’s also some validation from Cloud Manager: a rule within the pipeline itself that will validate the index definition, so that you have a programmatic sanity check based on actual real-world rules, for example if you were customizing some out-of-the-box indexes and something didn’t match up.
I’ve emphasized this a few times during this presentation, but always, always copy the latest index definition from Cloud Service before customizing. Not from the local SDK, but always from the development Cloud Service instance where you can actually access CRXDE Lite. So go in there, compare, and make sure everything is all synced up. That leads me to: why would you want to do that?
It’s because, again, the out-of-the-box functionality of AEM as an application might rely on some additional properties or whatnot in there, and if you don’t copy those over, you might be breaking some out-of-the-box functionality by forgetting them or leaving them out.
One of the tips for troubleshooting is to monitor the Cloud Manager deployment logs. As the deployment is going on in real time, you can monitor those logs either through the UI or the CLI. When it reaches the indexing step, you can get a jump on mitigating any issues that come up. Those indexing issues are mainly down to the timeout, or it might be indexing the wrong index; you may have touched something that doesn’t need to be indexed. Those are the types of things that you would see in there. The timeout that I mentioned on the previous slide would also show up there.
Some of the performance and explain tools that you’re familiar with from previous versions of AEM are available in the Developer Console as well. If we go to the Developer Console for the environment, there are a few clicks in there under queries for query performance and explain query. Go in there and you’ll see the familiar slow queries and common queries type of interface. On the Developer Console main page, there’s also the ability to query and download the Oak index stats, or status. That’ll show you all the configured indexes and their attributes. With that, we’ll wrap it up, and thank you for watching.
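
The deployment requirement mentioned in the transcript, that the package’s properties XML sets no intermediate saves to true, lends itself to a quick local check. The sketch below is an assumption-laden illustration: it assumes the property key is spelled `noIntermediateSaves` and that the file uses the Java properties XML layout with `entry` elements, and both the sample content and the helper name `no_intermediate_saves` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a content package properties XML (assumed layout).
SAMPLE_PROPERTIES = """<properties>
    <entry key="name">my-project.ui.apps</entry>
    <entry key="noIntermediateSaves">true</entry>
</properties>
"""


def no_intermediate_saves(properties_xml: str) -> bool:
    """Return True when the noIntermediateSaves entry is present and set to true."""
    root = ET.fromstring(properties_xml)
    for entry in root.iter("entry"):
        if entry.get("key") == "noIntermediateSaves":
            return (entry.text or "").strip().lower() == "true"
    return False  # a missing entry is treated as not set


if __name__ == "__main__":
    print(no_intermediate_saves(SAMPLE_PROPERTIES))
```

Running a check like this against the built package before pushing to the pipeline is one way to catch a missing setting early, rather than discovering it during deployment.
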

Index Converter Tool

As part of refactoring your code base, use the Index converter tool to convert custom Oak index definitions to AEM as a Cloud Service compatible index definitions.

Review the index converter documentation for the complete and current set of Index Converter capabilities.

Key activities

  • Use the Adobe I/O Workflow Migrator tool to migrate asset processing workflows to use the Asset Compute microservices.
  • Set up a local development environment and deploy the customized indices. Ensure that the updated indices are up to date.
  • Deploy the updated code base to an AEM as a Cloud Service development environment and continue to validate.
  • If modifying an out of the box index ALWAYS copy the latest index definition from an AEM as a Cloud Service environment running on the latest release. Modify the copied index definition to fit your needs.
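
The last activity, copying the latest out-of-the-box definition and customizing on top of it, can be supported by a simple comparison that flags anything the custom definition has dropped. The sketch below is a minimal illustration under stated assumptions: index definitions are represented as plain nested dictionaries, the sample data is invented, and `missing_from_custom` is a hypothetical helper, not an AEM or Oak API.

```python
def missing_from_custom(base: dict, custom: dict, path: str = "") -> list:
    """List properties and child nodes that exist in base but not in custom."""
    missing = []
    for key, value in base.items():
        here = f"{path}/{key}" if path else key
        if key not in custom:
            missing.append(here)
        elif isinstance(value, dict) and isinstance(custom[key], dict):
            # Recurse into child nodes that both definitions share.
            missing.extend(missing_from_custom(value, custom[key], here))
    return missing


# Invented, heavily simplified definitions for illustration only.
BASE = {
    "type": "lucene",
    "tika": {"config.xml": {}},
    "indexRules": {"dam:Asset": {}},
}
CUSTOM = {
    "type": "lucene",
    "indexRules": {"dam:Asset": {"properties": {}}},
}

if __name__ == "__main__":
    print(missing_from_custom(BASE, CUSTOM))
```

In this invented example the custom definition has lost the `tika` sub-node, which is the kind of omission the video warns can break out-of-the-box functionality.
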

Hands-on exercise

Apply your knowledge by trying out what you learned with this hands-on exercise.

Prior to trying the hands-on exercise, make sure you’ve watched and understood the video above, and reviewed the following materials:

Also, make sure you have completed the previous hands-on exercise:

Hands-on exercise GitHub repository

Hands-on with indexes

Explore defining and deploying Oak indexes to AEM as a Cloud Service.

Try out indexing