Bringing Intelligence on Content in Adobe Experience Manager with Content AI
Adobe Experience Manager Content AI transforms digital experiences by utilizing existing content for semantic search, generative discovery, and automatic content variations. Learn how Content AI’s architecture and enrichment pipelines deliver intelligent digital experiences. Discover its capabilities to efficiently manage content and unlock AI opportunities, enhancing your business’s digital transformation.
Welcome to our session today about Content AI and how you can bring intelligence to content in AEM. I have with me here Fabrizio Fortino and Nitin Gupta, who is also covering the chat. We are from the search and indexing team, and we want to showcase the value of Content AI today. First, we will cover a quick introduction to Content AI, the reasoning behind it, and a quick overview. Then Fabrizio will cover the architecture and the services and components behind Content AI. We will cover some use cases built on top of Content AI, and then we will tell you what is next in the pipeline.

All right, before I get started with Content AI, let's cover very quickly the reasoning behind it. In the last couple of years, LLMs have changed how content is being discovered and how it is being utilised. There are more and more AI tools out there, but our customers have one challenge: they have a lot of segregated content across various sources. Because of this segregated content, they are not able to unlock all of the AI opportunities they have, and they can't use them to their full extent. This also leads to an inconsistent ranking experience. So, having heard this problem statement, what can we do next? First, we have to understand that content is key. But not only is content key, it is also key how we efficiently manage a content layer. And this is exactly where Content AI comes into the picture.
Basically, in a nutshell, Content AI uses the content that you have in AEM. We try to use that content in an intelligent manner and unlock opportunities for it. On top of that, we don't only have the content in AEM; we also build functionality to acquire content from external sources. This all ends up in a pool of content where we can run transformations and enrichment, and we get a semantic, intelligent content layer. Of course, as you can imagine, this also powers multiple AEM solutions and creates the basis for new functionality. In the spirit of AEM, this is open for developers and for all of you as partners, and you can use it for further customization and enrichment.

So now we have heard the reasoning behind Content AI and what Content AI is in a nutshell. Now we will quickly cover the layered structure of Content AI. As I mentioned before, at the very center of Content AI we have this vectorized pool of content. This includes both the AEM content as well as the externally ingested content. On top of that, we are able to build the first AI use case: what we call AI search. AI search offers a couple of search functionalities, including hybrid search, full-text search, and other search techniques. On top of AI search, we are building the out-of-the-box RAG-based system, what we call AI answers. It basically generates a response based on your content. Here we also have the opportunity to fine-tune the system: you can provide feedback and the system will learn. Last but not least, we have the admin layer. In the admin layer you can ingest content, onboard websites, and efficiently manage configurations. We also have the ML layer, where we have both the embedding models and the LLMs that we use for AI answers. With that, I will hand it over to Fabrizio to cover the next section.

Thank you, Julia. Hello everyone. Let me start by mentioning a report that Gartner published in April about rethinking the role of enterprise search for AI agents and assistants. I think it's an excellent report, and I suggest you have a look if you're interested in the union of search and AI; the insights go well beyond the enterprise search space. What struck me was that some of the insights and conclusions match the pillars we used while building Content AI. First of all, the central message, the thing that we believe, is that if you want to be successful in building AI agents and assistants on top of your content, you need a robust search backbone. As Julia said, in many organizations search is fragmented and content is fragmented, with different silos having different search capabilities. Another thing is that in many organizations, search is seen as a UI on top of a thin layer that is good at keywords. We think we need to change this. We need to treat search as infrastructure, as a platform. We need a system of control. We need a system of delivery. We need connectors. We need ways to evaluate content quality together with retrieval quality. We need to go beyond keywords. To make this concrete, let's have a look at the architectural journey that brought us here. All right. We haven't built a new platform from scratch. We decided to build it on top of the scalable search and indexing infrastructure that is already powering AEM as a Cloud Service at scale.
So, as you know, most of the indexes in AEM are based on Lucene, and they are embedded in the instances. This is okay in an on-premise world, but in the cloud, where elasticity is key, this can become a problem. Think about a burst of traffic on your website: we need to quickly scale the publish tier and create more publish instances. If you have large indexes, the new publish instances need to wait until the indexes are fully downloaded. So what we have done in the past years is build what we call remote search, which is powered by Elasticsearch. Right now we have, I would say, a hybrid solution: some of the indexes in AEM still live in Lucene because it makes sense, but other indexes, mainly large content indexes, are in Elasticsearch. The key here is that this is completely transparent to you: you still use the same index definitions and the same queries, and we use this separation to improve elasticity.
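To make the transparency point concrete, here is a minimal sketch of what an Oak-style index definition looks like, expressed as a Python dict for illustration; the node names and properties are simplified, and real definitions live as JSON node structures in the repository:

```python
# A minimal, simplified sketch of an Oak-style index definition.
# Whether the index is served by the embedded Lucene engine or the
# remote Elasticsearch tier, the definition shape stays the same;
# only the backing implementation changes.
index_definition = {
    "jcr:primaryType": "oak:QueryIndexDefinition",
    "type": "lucene",        # the remote tier can serve the same definition
    "async": "async",        # indexed asynchronously, off the critical path
    "compatVersion": 2,
    "indexRules": {
        "dam:Asset": {
            "properties": {
                "title": {
                    "name": "jcr:content/metadata/dc:title",
                    "analyzed": True,  # enables full-text matching on this field
                }
            }
        }
    },
}
```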
This change actually opens new scenarios. First of all, indexes are first-class citizens and not black boxes anymore: we can look into them and easily troubleshoot if there is an issue. And, as I said, Content AI was one of the things born out of this change. On top of that infrastructure we built asynchronous enrichment pipelines, a layer to enable semantic search, and generative search, plus a bunch of tools that you will see in a second. So this is the platform we are looking at. The diagram is not exhaustive, but it shows the major components and how they fit together.

And now let's zoom into some of them. Let's talk about how we actually get content into the system. Most of the time, the AEM content is what we care about, right? But what if our AI workflows need content that lives somewhere else? Maybe a partner website, a public data source, a legacy system, and so on. This is where the acquisition component comes in. You can configure it through the admin UI and API, and it supports scraping, crawling, and API feeds, with more connectors on the way. The point here is simple: we don't want to limit the intelligence to the content in the AEM repository. We think that the combination of AEM content plus external sources, all lifted together into the same intelligent layer, is what unlocks the next generation of search and AI-driven workflows.

Now let's have a look at what happens once the content is in the system. Indexing and ingestion handle the basic stuff: normalization of content, structural updates of index configurations, and so on. But where things really get interesting is the enrichment. The enrichment pipelines might look a bit like the index definitions you know from AEM, but there is a key difference: in Content AI, enrichment is index-centric. That means that if you want to change something in your workflow or adjust some configuration, you don't have to go back to the original source, the AEM repository or the external source. We can apply transformations inside the indexing layer, and this makes the system flexible, efficient, and easy to evolve as the needs of your search and AI workflows change over time.

Now let's have a look at the actual configuration of enrichment. A pipeline is basically a set of steps that transform the data, and here we introduce the concept of spaces. Vector spaces are steps with the goal of producing embeddings that will enrich your content. In this example, we are enriching the same field, the description, twice: the first space uses the MiniLM model, and the second uses OpenAI Ada-002. Why would we want to do that? It might be for several reasons: perhaps we want to run an A/B test, or different use cases might benefit from different models. For example, one model can be good at short-form queries, while the other might capture broader semantics. We can also define lexical spaces, and here is where the similarities with the traditional AEM index definitions are closest, let's say. We can use a JQ expression to, for example, concatenate the title and the description, but we also support advanced configurations; for instance, you can configure language analyzers to improve full-text matching. The power here comes from keeping these spaces separate: it means that, as you will see, we will be able to run hybrid searches or model-specific searches without having to reprocess the content. This brings flexibility and makes experimentation cheap.
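As a rough illustration of what such a pipeline could look like, here is a hedged sketch expressed as a Python dict; the field names, space names, and configuration schema are assumptions for illustration, not the actual Content AI format:

```python
# Hypothetical enrichment pipeline configuration, for illustration only.
# It enriches the same "description" field twice with two different
# embedding models (two vector spaces) and defines one lexical space
# built from a JQ expression.
pipeline = {
    "spaces": [
        {
            "name": "desc-minilm",
            "type": "vector",
            "source": ".description",            # JQ: take the description field
            "model": "all-MiniLM-L6-v2",         # small model, good for short queries
        },
        {
            "name": "desc-ada",
            "type": "vector",
            "source": ".description",
            "model": "text-embedding-ada-002",   # broader semantics, e.g. for A/B tests
        },
        {
            "name": "title-desc-lexical",
            "type": "lexical",
            "source": '.title + " " + .description',  # JQ concatenation of two fields
            "analyzer": "english",               # language analyzer for full-text matching
        },
    ]
}
```

Keeping the two vector spaces separate is what later lets a query target one model or the other without reindexing the content.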
So now let's turn to the search service. This is the service that actually makes the intelligence usable. At its core, it supports multiple styles of search through a composable query API. You can do hybrid searches, combining KNN (k-nearest neighbor) semantic retrieval with the good old-fashioned full-text and filtering capabilities we have been relying on for years. This means that you get relevance based on meaning, plus the precision of structured filters, together in a single query.
Another aspect which is quite important is that we support cross-index queries. That means you can bring data from different sources and get results back in a single result set. But where things get really powerful is in this composability that I mentioned before. You can compose your queries, as we will see, to use this model or that vector space. And again, this makes A/B testing trivial: you can run different configurations in production, monitor performance, gather feedback, and then switch to the best setup with confidence, without re-architecting the system. In other words, this is not just a retrieval engine; we can see it as a controlled experimentation platform for content intelligence.

Now, what I want to show here is the flexibility of the search API. As I said, it's composable: you can mix and match the query types depending on the problem you are trying to solve. In this example, we have a hybrid query where, in red, you can see the vector part that uses the MiniLM space we configured before, and here you have the lexical query that targets the lexical space. That's the sort of setup you want in a site search scenario, where you want both semantic relevance and the precision of filters and keywords. Here, instead, we have a pure vector query that uses the Ada-002 model. An important thing to notice is that we are not interested in the original documents anymore: we are asking for chunks, and that's exactly the pattern you would expect in a RAG workflow, which we will dig into later. The takeaway is that the same content can be lifted into multiple representations, and those representations can serve very different use cases, all from a single indexing layer.
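To make this composability tangible, here is a hedged sketch of what the two queries just described could look like over HTTP; the endpoint, payload shape, and parameter names are assumptions for illustration, not the documented API (the real experimental endpoints are linked later in the session):

```python
import requests

# Hypothetical search endpoint, for illustration only.
SEARCH_URL = "https://example.com/api/contentai/search"

# 1) Hybrid query for a site-search scenario: KNN retrieval on the
#    MiniLM vector space combined with lexical matching and a filter.
hybrid_query = {
    "index": "site-content",
    "query": {
        "hybrid": {
            "knn": {"space": "desc-minilm", "text": "big passenger ship", "k": 10},
            "lexical": {"space": "title-desc-lexical", "text": "big passenger ship"},
            "filter": {"term": {"language": "en"}},
        }
    },
}

# 2) Pure vector query on the Ada-002 space that asks for chunks instead
#    of whole documents -- the retrieval pattern you would feed into a
#    RAG workflow.
chunk_query = {
    "index": "site-content",
    "query": {"knn": {"space": "desc-ada", "text": "where can I park?", "k": 5}},
    "return": "chunks",
}

for payload in (hybrid_query, chunk_query):
    response = requests.post(SEARCH_URL, json=payload, timeout=30)
    response.raise_for_status()
    for hit in response.json().get("hits", []):
        print(hit)
```

The same indexed content serves both queries; only the space referenced in the payload changes, which is what makes A/B testing different models a configuration change rather than a reindexing job.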
And now, let's have a look at something concrete: searching for assets. As we all know, AEM search is good at finding words, but not at reading. So here is an environment with images, and I start searching for "big passenger ship". All right, the system tells me we have three images. What if I extend this query and search for "big passenger ship on water"? Bummer. It seems there are no images, right? But the problem here is that the metadata on these assets doesn't match. We can improve things: we can set language analyzers, standard tokenizers, synonyms, and so on. But it gets quite complicated, especially if we want to support multiple languages. What if we enable semantic search? Well, we get back exactly what we were thinking of: large vessels, ships, cruises, and so on. We moved the search from term-based to meaning-based; we are looking for the meaning. And what's very important here is that we are not replacing the existing search. All the stuff that you have here, ACLs, filtering, still works; we are extending it. We are basically running a hybrid search query with the old full-text part plus the vectorized part. And it gets even better. We are in Basel, so let's search in German. Standard search: no results. But if we enable semantic search, we get relevant results. This is something that is built into the foundation, and that's why we are now able to roll it out to all our AEM as a Cloud Service customers.
This is actually happening right now. And we support multiple languages, around 100 of them. I would say this is an important step forward in moving people closer to the right content. But this is just the first step. Here we have seen search helping humans, but we can reuse the platform we have built to also power other applications and AI workflows, as Julia will show now.
Yes. So I am going to run through AI answers now. You have just heard what AI search is; now let's recap what AI answers is. AI answers is the out-of-the-box RAG-based tool to generate responses for your customers' queries. A lot of capabilities come out of the box. For instance, you can do prompt tuning to ensure that you are matching the voice of the customer and the brand guidelines. You can also provide contextual data; we will show that in a demo in a second. On top of that, you can provide citations and follow-up questions, and you can provide feedback to the system to further fine-tune it.

So let's see if the technology is supporting us and we are able to bring up the demo. Yes. All right. What you can see here is Inside Adobe, the internal site that Adobe employees use in their day-to-day work, and I want to leverage its search now. The search you see here is, in the traditional sense, a keyword-based search. A keyword-based search works very well with keywords: if you have a very simple keyword-based query, it works very well. But if you have a very complex query, it will not understand it, and it will not understand the user intent. For that reason, we have AI answers. So what I will do now is search for parking spaces, because when we visit the office here, we want to make sure we park in a valid parking space; otherwise it gets very expensive. If I scroll down through the results from the keyword-based search, you can see that the first result matches the user intent, but everything else does not. Let's check out what the AI answer looks like. Here I can see, in a summarized format, exactly what I was searching for. This is the whole beauty of AI answers. On top of that, as I mentioned before, you can see the sources here, and you can see follow-up questions.

Now, what is interesting about this is that we get back parking spaces in Basel. Why is this the case? Because we are providing additional information in the context of this query: the location of the currently logged-in user. As I am logged in from Basel, it gives me back the parking spaces in Basel. So what I will do now is select, let's say, San Francisco, and make the query a bit more complex: where are parking spaces? And there we go: one more time we receive, in a summarized format, all of the parking spaces in San Francisco. So basically, you can provide additional context to a specific query, and the same query will respond differently based on the context provided. Last but not least, I am from Austria; we speak German, even though not everyone believes that. So I search, in German, for where I can park. And there we go: it even understands my Austrian German, and I also get back the results in German, which is very helpful. In total, we are able to support 100 different languages, and this is supported without any additional developer time from the Inside Adobe side, which is already amazing.

Five minutes? Okay, I will hurry up. I have one more question: I want to know how to access the VPN. You can also make spelling mistakes, because the tool... well, no. Usually you should be able to make spelling mistakes. Let me check. There you go, it's working now. So you can see here a very generic query.
But what is interesting about this is that the result you get back is very specific. Why is this the case? Because the result is grounded in the information on the website you see here. So you won't get back a random result from some public website; you will get back exactly what you want to see. All right, with that, let's head back to the presentation. Of course, you are all interested to see how this looks under the hood, so feel free to scan the QR code here: you will see a list of all of the API endpoints available in experimental mode, and you can just start testing it yourself.
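For anyone who wants to experiment, here is a hedged sketch of how the contextual behavior from the demo could be exercised against an answers-style endpoint; the URL, parameter names, and response shape are assumptions for illustration, and the actual experimental endpoints are the ones behind the QR code above:

```python
import requests

# Hypothetical AI answers endpoint, for illustration only.
ANSWERS_URL = "https://example.com/api/contentai/answers"

# The same question asked with different user context. As in the demo,
# the location supplied as context steers the grounded answer (Basel vs.
# San Francisco), and the language of the query drives the language of
# the response.
for location in ("Basel", "San Francisco"):
    payload = {
        "query": "Where are parking spaces?",
        "context": {"location": location},  # extra context attached to the query
        "citations": True,                  # ask for the sources behind the answer
        "followUpQuestions": True,          # ask for suggested follow-up questions
    }
    response = requests.post(ANSWERS_URL, json=payload, timeout=30)
    response.raise_for_status()
    answer = response.json()
    print(location, "->", answer.get("answer"))
    print("sources:", answer.get("citations"))
```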
So what is next in the pipeline? What you have seen, what Fabrizio has shown, is semantic search for assets; underneath the hood, we are using the assets index for that. On top of that, we want to bring the same semantic search capabilities to Content Fragments, to Sites page content, to everything we have in here. Last but not least, one more time to recap: at the bottom, you see all of the Content AI foundational APIs, and on top of these foundational services, you see all of the use cases that are powered by this foundation. We have AI search and AI answers, which we presented today. On top of that, we will also support Content Advisor and Content Hub, we are supporting Elmo, and many more use cases are to come. All right, thank you for your attention today.

This session — Bringing Intelligence on Content in Adobe Experience Manager with Content AI — features Fabrizio Fortino, Senior Cloud Software Engineer, and Julia Daurer, Manager of Software Development at Adobe. Recorded live from Basel, this presentation explores how Adobe Experience Manager Content AI uses existing customer content to power semantic search, generative discovery, and automatic content variations through agentic workflows. Learn about Content AI’s architecture, enrichment pipelines, and A/B testing capabilities designed to deliver intelligent digital experiences.
Special thanks to our sponsors Algolia and Ensemble for supporting Adobe Developers Live 2025.
Next Steps
- Continue the conversation on Experience League
- Discover upcoming events