Content Transfer Tool

Learn how Content Transfer Tool helps you migrate content to AEM as a Cloud Service from AEM 6.3+.

Transcript
Hi, my name is Kiran Margula. I work on the AEM Cloud and Customer Solutions Engineering team as a Senior Cloud Architect. In this session, we are going to learn about the different methods of migrating content into AEM as a Cloud Service, either from on-premise or from Adobe Managed Services. We are also going to learn how to choose the right method, and touch upon best practices and guidelines to plan and execute a successful content migration. As we see here, there are three different methods. First, the Content Transfer Tool, for which I will provide an overview of the tool, the prerequisites for its usage, and planning the migration using CTT, and end that section with a short demo. Second, the Bulk Import Service. The Bulk Import Service is an out-of-the-box AEM as a Cloud Service feature available to all Cloud Service customers; it lets you pull assets from external cloud storage such as AWS or Azure. Third, Package Manager. There are some specific things to keep in mind while using Package Manager in AEM as a Cloud Service, which I will touch upon in that section. Let’s dive in. Content Transfer Tool, or CTT for short. This tool is developed and managed by the Adobe Engineering team. It is the primary tool for migrating AEM to AEM as a Cloud Service, and has been used successfully on multiple migration projects. The tool is distributed as a standard AEM package through the Software Distribution portal. The package contains two components, frontend and backend. The frontend, which I will demo later, consists of Touch UI-based user interfaces to create connections to the target AEM as a Cloud Service environment and to create migration sets. Each migration set contains one or more content paths from which the content is to be extracted, options to include or exclude versions, and the ability to perform either initial or top-up migrations. The CTT user interface also contains actions to stop, pause, and monitor the migration process.
The CTT migration process itself is divided into two major steps: one is extraction and the other is ingestion. During the extraction phase, the content and the referenced blobs stored in the blob store are extracted and written to a temporary space on disk on the source system. From there, the content and data are uploaded into a cloud storage area, which becomes the source for the next step, ingestion. This middle cloud storage layer is Azure Blob Storage as of recording this session. During the ingestion process, the content available in the Azure staging container is ingested into the target AEM as a Cloud Service environment and then indexed, so that it is available for consumption by content authors if the ingestion happens on an Author instance, and available for delivery if it is ingested into the AEM as a Cloud Service Publish tier. Please note that CTT is compatible with AEM 6.3 and onwards on the source system. So if the source system is anything less than AEM 6.3, such as 6.2, 6.1, or 6.0, then the repository must be upgraded to 6.3 or later. Note that the customizations and other related custom code do not need to be upgraded, as long as the CTT user interface loads and CTT actions can be performed. Another thing to note: while the CTT package installed on the AEM source system contains both the frontend and the backend that facilitate the extraction, there are companion libraries on AEM as a Cloud Service that facilitate the ingestion process. One of the primary features of the Content Transfer Tool is its ability to perform top-up migrations. Now let’s talk about how to approach, plan, and execute a successful content migration. There are four crucial steps for a successful content migration.
Number one is to identify whether the source repository statistics are within the supported limits for using the Content Transfer Tool, and for storage in AEM as a Cloud Service. To do that, you need to gather information from the source system and review it against the numbers outlined in the public CTT prerequisites documentation. If any of these numbers exceed the published limits, it is recommended to create an Adobe support ticket before moving on to the migration. There are two ways this kind of information can be collected from the source AEM system. The recommended one is to collect a Best Practices Analyzer (BPA) report and import it into Cloud Acceleration Manager. Cloud Acceleration Manager also has an estimation feature. It is not a perfect estimate, but it at least gives an indication of how long the CTT extraction and ingestion are going to take; again, those are indicative numbers. To gather the segment store size, index store size, and similar information, you can use standard Linux commands. Once that information is gathered, reviewed, and determined to be within bounds, the next crucial step in the process is the proof of migration. You can think of the proof of migration as a proof of concept, but there are certain things that must be considered carefully during it. Number one: try to migrate a production copy of the content, so that you are dealing with the content that is actually on production. It is recommended to get a clone at this stage for the proof of migration, and to place the clone in the same network zone as production so that you can simulate the network connectivity. Also identify a good subset of content to migrate, and try to migrate all the users and groups with user mapping as well, to identify any issues.
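The repository sizing mentioned above can be sketched with standard Linux commands. A minimal sketch, assuming a default crx-quickstart layout; the `AEM_HOME` default and paths are illustrative, not from the video:

```shell
#!/bin/sh
# Gather rough repository size figures on the source AEM system.
# AEM_HOME is a hypothetical default; point it at your actual install.
AEM_HOME="${AEM_HOME:-/opt/aem/crx-quickstart}"

for dir in "$AEM_HOME/repository/segmentstore" "$AEM_HOME/repository/datastore"; do
  if [ -d "$dir" ]; then
    du -sh "$dir"            # human-readable total size of the store
  else
    echo "not found: $dir"   # directory absent on this machine
  fi
done
```

These figures are what you compare against the published CTT prerequisite limits before deciding whether a support ticket is needed.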
A good rule of thumb is to identify at least 25% of the content, or at least one terabyte. The reason one terabyte is mentioned is to get an estimate of how much time it takes to extract and transfer one terabyte; that number can then be extrapolated later to plan the initial migration and to feed the estimates from the proof of migration into the overall project plan. So overall, the intention behind the proof of migration is to identify any issues very early on, fix them, and be prepared for the initial migration. It also gives a near-realistic estimate, which gives a clear idea of how long the content migration itself is going to take within the overall project plan. Once the proof of migration is all clear, move on to the initial migration. During the initial migration, the best practice is to always migrate from Author to Author and from Publish to Publish. This is primarily to make sure you are replicating the state of the content as-is from the source to the destination system. Also note that when content is being ingested into AEM as a Cloud Service, the Author instances are going to be down: they are scaled down, and scaled back up once the content ingestion completes. That is different on AEM Publish, though. When content ingestion is happening on Publish, the Publish instances are not going to be down. That is something to keep in mind. Once the initial migration, which is the crucial step, is complete, plan for the incremental top-ups, that is, the top-up migrations. For planning the top-up migrations, one of the crucial data points is how much content is effectively being added; edits are fine, because edits are most of the time at a property level.
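The extrapolation described above is simple arithmetic. The sample numbers below are illustrative, not measurements from the video:

```shell
#!/bin/sh
# Proof-of-migration sample: suppose 1 TB took 6 hours to extract and
# transfer (hypothetical figures). Extrapolate linearly to the full size.
SAMPLE_TB=1
SAMPLE_HOURS=6
TOTAL_TB=4

EST_HOURS=$(( SAMPLE_HOURS * TOTAL_TB / SAMPLE_TB ))
echo "estimated transfer time for ${TOTAL_TB} TB: ${EST_HOURS} hours"
```

In practice, transfer time does not scale perfectly linearly (network conditions and content mix vary), so treat the result as an indicative number for the project plan, as the video suggests.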
So it’s mostly textual data, but if a heavy number of assets are being added, then it’s a good idea to measure how much content is being added over a certain period of time, say a week, two weeks, or a month. Based on the volume of assets or content being added, schedule frequent incremental top-ups accordingly. The idea behind frequent incremental top-ups is to keep the target AEM as a Cloud Service instance caught up with the latest content being pumped into live production, so that the final content freeze before go-live is very short. There are certain things to keep in mind or be aware of. Number one, from a process standpoint, make sure the proof of migration is planned very early in the migration project timeline. When an extraction is started, CTT spawns its own Java process. In the Unix world, this Java process is owned by the same user who owns the AEM process, and the CTT Java process will take up to four gigabytes of heap. If CTT is executing on a live production system, this is something that has to be taken into account. If there is a requirement to add that four gigabytes and upsize the servers, the time to do it is before starting CTT. Apart from using four gigabytes of heap, CTT also uses two other infrastructure elements: disk I/O and disk usage, and network bandwidth. When CTT extracts the content from the blob store and segment store, it temporarily writes it into a temp space relative to the crx-quickstart folder, which is the AEM installation folder. It then uses network bandwidth to upload the extracted content into an Azure container in the cloud. This particular Azure container is properly secured; no other customer can access it.
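The resource points just described (the extra Java process, disk space, and the temp area) can be watched from a shell on the source server. A sketch under assumptions: the `cloud-migration` process pattern and the `AEM_HOME` default are illustrative, based on the description above:

```shell
#!/bin/sh
# Free space on the volume holding crx-quickstart (CTT writes temp data there).
df -h "${AEM_HOME:-/tmp}" | tail -n 1

# Look for a separate CTT extraction Java process; it runs as the same user
# as AEM. The 'cloud-migration' match pattern is an assumption.
ps -eo user,pid,rss,args | grep -i '[c]loud-migration' \
  || echo "no CTT extraction process found"
```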
Only this particular source system can access it; access is protected by secret keys. Because CTT uses the network, there are two considerations: one, it consumes network bandwidth; two, the network connectivity has to be established. If there are any firewall allowlists or similar restrictions, they have to be opened up, and this is the time to plan for that as well. All in all, CTT uses its own additional Java heap of up to four GB, plus disk I/O, disk space, and network bandwidth on the source system. Those are the things to consider while using CTT. With that, let’s jump into the Content Transfer Tool demo so that I can show you the user interface and other aspects of it. As a first step, let’s see how to download the Content Transfer Tool from the Software Distribution portal. Once you log in to the Software Distribution portal, navigate into the AEM as a Cloud Service section, and under Software Type choose Tooling; that’s the easiest way to locate it, and you will find the latest version of the Content Transfer Tool there. Once you download the package, you can go into Package Manager on your source system. For demonstration purposes, I am using a 6.4 instance; I have uploaded the Content Transfer Tool here, and then I went ahead and installed the package. As I mentioned before, the package installs the tool. You can access it by going into AEM, then Operations. Under Operations, you’ll find Content Migration, and under Content Migration there is a Content Transfer section. Once you are here, let’s create a content migration set so that I can demonstrate what is involved. Give a name for your migration set; this is the name for the entire migration set. Then you provide your Cloud Service URL here, primarily the Cloud Service Author URL. And this is another note:
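The firewall/allowlist planning mentioned above can be verified with a quick connectivity probe. The hostname below is only an example Azure Blob endpoint; consult the current CTT prerequisites documentation for the exact endpoints your environment must allowlist:

```shell
#!/bin/sh
# Probe outbound HTTPS from the source server to an Azure Blob endpoint.
# HOST is an example value, not an official endpoint list.
HOST="casstorageprod.blob.core.windows.net"

if curl --silent --output /dev/null --max-time 10 "https://${HOST}/"; then
  echo "outbound HTTPS to ${HOST}: OK"
else
  echo "outbound HTTPS to ${HOST}: blocked (check firewall/proxy allowlists)"
fi
```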
Whenever you are doing either an Author-to-Author or a Publish-to-Publish migration, always use the same Cloud Service Author URL and access token. Once you click Open Access Token, it shows the access token of the Cloud Service instance. Here you can toggle whether you intend to include versions or not, and you can also enable mapping of IMS users and groups with the associated ACLs. The way user mapping works is, say, for a specific asset on your source system, for example an a.jpeg under the content DAM: if the ACLs on a.jpeg are granted to Joe Smith, and Joe Smith is part of the DAM administrators group, then when CTT migrates the content, it takes the content along with Joe Smith, because he is assigned ACLs inherited from DAM administrators, and it brings the group and the users into AEM as a Cloud Service. But because AEM as a Cloud Service is provisioned through IMS, we have to make sure the identity of that user is available in IMS and mapped from IMS to AEM. That is user mapping at a high level; more information is available in the public documentation. Now, this is the crucial step where you select what you are going to extract. I am configuring it to extract content and assets: I’m picking We.Retail under English, and I’m going to pick the Activities assets. Now I’m going to save this. Once you save, this is what you see. There are some actions you can perform here; for example, you can click Extract. Let’s see what Extract does in the backend.
So I’m going to click Extract. The “Overwrite staging container during extraction” option means that if the contents of this path are already in the Azure staging container, they will be overwritten; this is typically turned off during top-up migrations so that the existing content in the staging container is not overwritten. Then I click Extract. Once the extraction is running, it shows Running. Now let’s quickly go and see whether we have a new Java process running. If you look under cloud-migration, there is a new extraction folder that has been created. Inside that extraction folder is where the temporary content is written to disk, and the output log is the extraction log. You can watch the log contents either from here or from the CTT user interface. If you want to watch the log files, click on Logs and then on the extraction log, which shows the extraction process. Earlier, before we started the extraction, there was no such Java process. We can also look at the ingestion log when the ingestion is happening; I’m not going to ingest right now because it takes a little bit of time. But overall, that’s how you download the CTT package, install the latest version onto your source system, create a migration set, and initiate an extraction. If you are an AEM administrator or system admin who has SSH access to the source, you can go into crx-quickstart and monitor the disk utilization that way, and you can also watch the output log. Or, if you are watching the logs from the Touch UI, you can just read the log there. I hope that is useful. Thanks for watching.
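The disk-level monitoring described at the end of the demo can be sketched like this. The cloud-migration folder and output log come from the demo; the `AEM_HOME` default and exact file names are assumptions:

```shell
#!/bin/sh
# Inspect the temp disk usage and log of the latest CTT extraction.
AEM_HOME="${AEM_HOME:-/opt/aem/crx-quickstart}"

# Find the most recently created extraction folder under cloud-migration.
LATEST=$(ls -dt "$AEM_HOME"/cloud-migration/*/ 2>/dev/null | head -n 1)

if [ -n "$LATEST" ]; then
  du -sh "$LATEST"                 # temp disk the extraction is using
  tail -n 20 "$LATEST/output.log"  # last lines of the extraction log
else
  echo "no extraction folder found under $AEM_HOME/cloud-migration"
fi
```

Using `tail -f` instead of `tail -n 20` gives a live view of the log while the extraction runs.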

Using the Content Transfer Tool

Content Transfer Tool lifecycle

The Content Transfer Tool is installed on AEM 6.3+ and transfers content to AEM as a Cloud Service.

Key activities

  • Download the latest Content Transfer Tool.

  • Transfer AEM Author 6.3+ final content to AEM as a Cloud Service Author service.

    • Install the Content Transfer Tool on AEM 6.3+ Author containing the final content to transfer.
    • Run the Content Transfer Tool in batches, transferring sets of content.
  • Transfer AEM Publish 6.3+ final content to AEM as a Cloud Service Publish service.

    • Install the Content Transfer Tool on AEM 6.3+ Publish containing the final content to transfer.
    • Run the Content Transfer Tool in batches, transferring sets of content.
  • Optionally, “top-up” content on AEM as a Cloud Service by transferring new content added since the last content transfer.

Hands-on exercise

Apply your knowledge by trying out what you learned with this hands-on exercise.

Prior to trying the hands-on exercise, make sure you’ve watched and understood the video above, and reviewed the following materials:

Also, make sure you have completed the previous hands-on exercise:

Hands-on exercise GitHub repository

Hands-on with Content Transfer Tool

Explore how the Content Transfer Tool can automatically move content from AEM 6 to AEM as a Cloud Service.

Try out Content Transfer Tool

Other resources
