Bulk Import Service

Last update: 2024-01-25
  • Created for: Experienced, Developer

Learn how AEM as a Cloud Service’s Bulk Import Service can be used to import assets from non-AEM sources.

 Transcript

In this section, I’m going to talk about the Bulk Import Service. As I mentioned before, the Bulk Import Service is an out-of-the-box feature that comes with AEM as a Cloud Service.

This method or service is useful for importing assets, especially from non-AEM sources.

The Content Transfer Tool is useful for migrating content from AEM on-premise to AEM as a Cloud Service, whereas the Bulk Import Service is very useful for Assets customers, specifically when they are trying to migrate digital assets from non-AEM sources by way of cloud storage such as Azure or S3. So the Bulk Import Service is available out of the box, and it all runs based on configuration. As we see here, someone has to set up a source, which is going to be a cloud storage, either Azure or AWS, then provide the credentials, and then pull the assets from that particular source into AEM as a Cloud Service.

There are some very useful features. For example, when assets are being selected from the cloud storage for ingestion into AEM as a Cloud Service, you can configure filters to get only a subset of assets, filtered either by file size or by specific MIME type. And when the assets are being ingested into AEM as a Cloud Service, there is a configuration for dealing with existing assets. Let’s say, for example, an asset is coming in and the asset is already there; the Import Mode dictates what has to be done. Should it be skipped, meaning it is not going to be imported or ingested, or should it be replaced, or should a version of the previous asset be created so that the currently ingested asset becomes the latest version?

And another cool part of this feature is that it can be used either for migrating a huge chunk of assets in one single shot, or for doing the migrations periodically. So this is pretty handy for any solution that periodically wants to ingest assets. All they have to do is make sure those assets are coming into the Azure or S3 account, configure the connections and whatnot, and then schedule the pull. The scheduled pulls will happen as jobs, so they are going to be ingesting frequently. And there are some really useful features such as, once the connection is configured, checking whether the connection is okay. And also the Dry Run, which I found very useful: it will tell you the estimated time that the assets will take to be completely ingested into AEM as a Cloud Service.

So, with that, I will move on to a quick demo of the Bulk Import Service. So let’s switch to the demo. Here, I am on my AEM as a Cloud Service instance, and you can access the Bulk Import Service via Tools, Assets, and then Bulk Import. I have already created a connection here. Let me quickly show you how to check the connection. So it’s a success. So, let’s jump into the connection. Initially, you have to create the connection. So all you have to do is give a Title, which is required, and then the type of the storage, which could be Azure or Amazon S3, and then the Storage Account details, the Access Key, and the Source Folder. So for example, right now I’m showing the view for my Azure Blob Storage. I have opened it in the Azure Explorer. I have created a container, and within the container I have a folder and I have an asset. So what I can do is start pulling assets either directly from this level or from any of the levels beneath it. So the nice part is, let’s say, for example, if I set up level1 as my source folder, then anything beneath level1, which is level11 and level111, is going to be ingested.
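The source folder behavior described above can be pictured as a prefix match over the blob names in the container. The following is a minimal sketch of that idea only, not the service's actual implementation; the function name `select_blobs` and the file names `a.jpg`, `b.jpg`, and `c.jpg` are hypothetical, while `index.jpg` and the `level1`/`level11`/`level111` folders come from the demo.

```python
def select_blobs(blob_names, source_folder=""):
    """Return the blob names located beneath source_folder.

    An empty source_folder means "use the container root", so every blob
    is selected. This is an illustrative assumption about the behavior
    shown in the demo, not Adobe's code.
    """
    prefix = source_folder.strip("/")
    if not prefix:
        return list(blob_names)
    # The trailing slash keeps "level1" from matching a sibling such as "level10/...".
    prefix += "/"
    return [name for name in blob_names if name.startswith(prefix)]


# Hypothetical container listing modeled on the demo's folder layout.
blobs = [
    "index.jpg",
    "level1/a.jpg",
    "level1/level11/b.jpg",
    "level1/level11/level111/c.jpg",
]

if __name__ == "__main__":
    print(select_blobs(blobs, "level1"))  # everything beneath level1
    print(select_blobs(blobs))            # empty source folder: the whole container
```

With `level1` as the source folder, the nested `level11` and `level111` files are included and `index.jpg` at the root is not; with an empty source folder, everything is selected.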

But what I’m going to do for now is leave the source folder empty, so that it will take the root as the source. And as I mentioned, I can filter which assets are going to be picked up by MIME type or by size, and then you can configure the Import Mode. And if you want to clean up the Azure or S3 cloud storage, you can do that as well. And also we can map where these things are going to come into AEM. There’s Metadata File support as well. So, this is basically to apply the metadata after ingestion using a CSV file. So for now, what I’d like to do is run a lightweight demonstration of how all these things work. So I’m going to go ahead and select this, and let me show you the Dry Run feature. The Dry Run will tell you how many assets are coming in and the estimated time that it’s going to take. And if you have already executed this before, then there is going to be a Job History where you can see how many times it has been executed; here I have executed it twice. You can go and take a look into the details of what happened behind that particular job. We can either close it or, if we want, we can delete those instances. So, with that, let me go ahead and either Schedule or Run. Right now, I’m going to click on Run, and then the job is going to be queued up. And while it completes, we can go and see the Job History. It’s still processing. I can open it up and see the logs and whatnot. So, now it says succeeded. Now, let’s go and see whether the assets have been successfully onboarded or not. Okay! So, Assets, Files.

Here, I have the index.jpg, and then level1 and level111 and all the levels, with all the assets that were pulled from the Azure Blob Store. As you see, they are all new, and they’ve been processed. This method is super-fast as well when you are trying to ingest assets. We have seen great numbers in terms of moving assets this way. Thanks for watching! And let’s move on to the next section: Package Manager.

The third option is the Package Manager. As I have mentioned before, Package Manager is still available in the cloud service, but there are specific things that we have to consider when we’re using Package Manager in the cloud service. So, one interesting fact that everybody has to know is that even though consoles such as CRXDE and the system console are blocked in the cloud service, Package Manager is available across all the authoring instances, whether it is Dev, Stage, or Production. Please make a note of that. And then, this is ideal for moving smaller sets of content, because there are perceived package size restrictions due to the nature of the cloud and the way AEM is set up in the cloud. So, what we have observed so far, from different experiences, is that packages of up to 50 MB are reliably uploaded and installed. But if the package size is larger, sometimes the packages either fail to upload or fail to install, and that is because of the underlying infrastructure that we’re using in the cloud. This may improve in the future. But for now, these are some things that we have to keep in mind. Even though we are using packages, these packages have to have filters that install the content only in the mutable paths. So, for example, if you have a package which has a filter to install something into apps, then that package cannot be installed. So, no matter how you try to do it, it won’t install. It has to go through Cloud Manager only.

So, the package contents have to be installed into the mutable paths only. And another thing to note is that if you want to push some content into the Publish tier, because CRX Package Manager is not available in the Publish tier, one way you can achieve this is to upload the package into the Author corresponding to the Publish tier, and then replicate, activate, or distribute it in the cloud service, so that the content is going to be replicated to the publish instances available in the Publish tier. So, the key summary for the Package Manager is that it is available across all the author instances, there is a perceived limit on the size of the packages, and if you want to get some content into the Publish tier, you can upload the package into Author and then activate it. Thanks for watching!
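The two Package Manager constraints mentioned in the transcript (mutable paths only, and roughly 50 MB as the size that uploads reliably) lend themselves to a quick pre-flight check before uploading. The following is a hedged sketch, not an Adobe tool: `check_package` and `is_immutable` are hypothetical helpers, the 50 MB figure is the speaker's observation and not a documented hard limit, and `/libs` is added alongside the transcript's `/apps` on the assumption that both are immutable areas in AEM as a Cloud Service.

```python
# Assumption: /apps (named in the transcript) and /libs are the immutable roots.
IMMUTABLE_ROOTS = ("/apps", "/libs")

# The size the speaker reports as reliably uploadable; an observation, not a hard limit.
RELIABLE_SIZE_BYTES = 50 * 1024 * 1024


def is_immutable(path):
    """True when path is an immutable root or lies beneath one."""
    return any(path == root or path.startswith(root + "/") for root in IMMUTABLE_ROOTS)


def check_package(filter_roots, size_bytes):
    """Return a list of warnings for a package; an empty list means no issue found."""
    problems = []
    for root in filter_roots:
        if is_immutable(root):
            problems.append(
                f"filter root {root} is immutable; deploy it through Cloud Manager"
            )
    if size_bytes > RELIABLE_SIZE_BYTES:
        problems.append("package is larger than 50 MB; upload or install may fail")
    return problems


if __name__ == "__main__":
    # A content-only package under the size guideline raises no warnings.
    print(check_package(["/content/dam/site"], 10 * 1024 * 1024))
    # A package touching /apps and exceeding the guideline raises two warnings.
    for warning in check_package(["/apps/mysite"], 60 * 1024 * 1024):
        print(warning)
```

The filter roots would come from the package's filter definition; reading them from the package file is left out to keep the sketch self-contained.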

Using the Bulk Import Service

Bulk Import Service lifecycle

The Bulk Import Service is used to transfer files stored in Azure Blob Storage or Amazon S3 storage into AEM as a Cloud Service as assets.

TIP

The video shows only Azure Blob Storage and Amazon S3 as input sources; however, the available sources continue to grow over time. For a complete list of supported input sources, refer to the options available in the product or to the documentation.

Key Activities

  • Upload the files-to-import to your cloud storage provider.
  • Configure and run the Bulk Import Service from AEM as a Cloud Service Author service.
  • Run the Bulk Import Service as a one-time import or schedule a periodic import.
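
The configuration choices described in the video (filtering by size or MIME type, and the Import Mode for assets that already exist) can be sketched as a small planning function. This is an illustrative model under stated assumptions, not the service's API: `ImportMode`, `SourceFile`, `ImportFilter`, and `plan_import` are hypothetical names, and the plan only lists what would happen, in the spirit of a dry run.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ImportMode(Enum):
    """The three behaviors for existing assets named in the video."""
    SKIP = "skip"
    REPLACE = "replace"
    CREATE_VERSION = "create_version"


@dataclass
class SourceFile:
    path: str
    size_bytes: int
    mime_type: str


@dataclass
class ImportFilter:
    min_size_bytes: int = 0
    max_size_bytes: Optional[int] = None  # None means "no upper bound"
    mime_types: Optional[set] = None      # None means "accept any MIME type"

    def accepts(self, f: SourceFile) -> bool:
        if f.size_bytes < self.min_size_bytes:
            return False
        if self.max_size_bytes is not None and f.size_bytes > self.max_size_bytes:
            return False
        return self.mime_types is None or f.mime_type in self.mime_types


def plan_import(files, existing_paths, flt, mode):
    """Return (path, action) pairs without changing anything."""
    plan = []
    for f in files:
        if not flt.accepts(f):
            continue  # filtered out before the import mode is considered
        if f.path not in existing_paths:
            plan.append((f.path, "create"))
        elif mode is ImportMode.SKIP:
            plan.append((f.path, "skip"))
        elif mode is ImportMode.REPLACE:
            plan.append((f.path, "replace"))
        else:
            plan.append((f.path, "new_version"))
    return plan


if __name__ == "__main__":
    files = [
        SourceFile("index.jpg", 2000, "image/jpeg"),
        SourceFile("level1/a.jpg", 5000, "image/jpeg"),
        SourceFile("doc.pdf", 3000, "application/pdf"),
    ]
    only_jpeg = ImportFilter(min_size_bytes=1000, mime_types={"image/jpeg"})
    for path, action in plan_import(files, {"index.jpg"}, only_jpeg, ImportMode.CREATE_VERSION):
        print(path, action)
```

Applying the filter before the import mode means a filtered-out file never touches an existing asset, which is the ordering this sketch assumes.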
