Bulk Import Service
Last update: March 23, 2025
- Topics:
- Migration
CREATED FOR:
- Experienced
- Developer
Learn how AEM as a Cloud Service's Bulk Import Service can be used to import assets from non-AEM sources.
Transcript
In this section, I'm going to talk about the Bulk Import Service. As I mentioned before, the Bulk Import Service is an out-of-the-box feature that comes with AEM as a Cloud Service. This service is useful for importing assets, especially from non-AEM sources. The Content Transfer Tool is useful for migrating content from AEM on-premise to AEM as a Cloud Service, whereas the Bulk Import Service is very useful for Assets customers, specifically when they are trying to migrate digital assets from non-AEM sources staged in cloud storage layers such as Azure or S3. The Bulk Import Service runs entirely based on its configuration. As we see here, someone has to set up a source, which is going to be cloud storage, either Azure or AWS, provide the credentials, and then pull the assets from that source into AEM as a Cloud Service. There are some very useful features. When assets are being selected for ingestion from the cloud storage, you can configure filters, for example to pull only a subset of assets filtered either by file size or by specific MIME type. Also, when assets are being ingested into AEM as a Cloud Service, there is a configuration for dealing with existing assets. Let's say, for example, an asset is coming in and that asset is already there; the import mode dictates what has to be done. Should it be skipped, meaning it is not going to be imported, or should it be replaced, or should a version of the previous asset be created so that the newly ingested asset becomes the latest version? Another cool part of this feature is that it can be used either for migrating a huge chunk of assets in one single shot or for doing the migration periodically. So this is pretty handy for any solution that wants to ingest assets periodically.
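The filter and import-mode behavior described above can be sketched in a few lines. This is an illustrative model only, not the actual AEM implementation; the function names and the `Asset` fields are hypothetical, while the mode names (skip, replace, version) mirror the options shown in the product UI.

```python
from dataclasses import dataclass

@dataclass
class Asset:
    path: str
    mime_type: str
    size_mb: float

def passes_filters(asset, allowed_mime_types=None,
                   min_size_mb=0, max_size_mb=float("inf")):
    """Only assets matching the MIME-type and size filters are imported."""
    if allowed_mime_types and asset.mime_type not in allowed_mime_types:
        return False
    return min_size_mb <= asset.size_mb <= max_size_mb

def resolve_import_mode(asset_exists, mode):
    """Import mode dictates what happens when the asset already exists:
    skip    -> existing asset untouched, nothing imported
    replace -> existing asset overwritten
    version -> old asset kept as a version, import becomes the latest
    """
    if not asset_exists:
        return "import as new"
    return {"skip": "not imported",
            "replace": "overwritten",
            "version": "new version created"}[mode]

source = [Asset("/level1/photo.jpg", "image/jpeg", 2.5),
          Asset("/level1/clip.mp4", "video/mp4", 120.0)]
selected = [a for a in source
            if passes_filters(a, allowed_mime_types={"image/jpeg"})]
print([a.path for a in selected])            # only the JPEG passes the filter
print(resolve_import_mode(True, "version"))  # existing asset gets versioned
```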
All they have to do is make sure those assets are landing in the Azure or S3 account, configure the connection, and then schedule the pull. The scheduled pulls run as jobs, so they will ingest on a regular cadence. And there are some really useful features, such as checking, once the connection is configured, whether the connection is okay, and also the dry run, which I found very useful: it tells you how long, as an estimated time, the assets will take to be completely ingested into AEM as a Cloud Service. With that, I will move on to a quick demo of the Bulk Import Service. Let's switch to the demo. Here I am on my AEM as a Cloud Service instance, and you can access the Bulk Import Service via Tools, Assets, and then Bulk Import. I have already created a connection here. Let me quickly show you how to check the connection. It's a success. Let's jump into the connection. Initially, you have to create the connection. All you have to do is give a title, which is required, then the type of the storage, whether Azure or Amazon S3, and then the storage account details, the access key, and the source folder. For example, right now I'm showing the view of my Azure Blob Storage, which I have opened in Azure Storage Explorer. I have created a container, and within the container I have a folder and an asset. I can start pulling assets either directly from this level or from any of the levels beneath it. The nice part is, let's say I set up level one as my source folder; then anything beneath level one, which is level 11 and level 111, is going to be ingested. But what I'm going to do for now is set the source folder as empty, so that it takes the root as the source. And as I mentioned, I can filter which assets are going to be picked up, by MIME type or by size, and then configure the import mode.
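The dry run's estimate can be thought of as simple arithmetic over what the service finds at the source. This is a made-up illustration, not how AEM computes it server-side; the throughput figure is an assumption purely for the example.

```python
def dry_run_estimate(asset_sizes_mb, throughput_mb_per_s=50):
    """Return (asset_count, total_mb, estimated_seconds) for a planned import.

    A dry run reports roughly this kind of information: how many assets
    will come in and about how long ingestion should take. The assumed
    throughput here (50 MB/s) is arbitrary for illustration.
    """
    total_mb = sum(asset_sizes_mb)
    return len(asset_sizes_mb), total_mb, total_mb / throughput_mb_per_s

count, total, seconds = dry_run_estimate([10, 250, 40])
print(f"{count} assets, {total} MB, ~{seconds:.0f}s at 50 MB/s")
```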
And if you want to clean up the Azure or S3 cloud storage after the import, you can do that as well. We can also map where these assets are going to land in AEM. There is metadata file support as well; this is basically for applying metadata after ingestion using a CSV file. For now, I'd like to run a lightweight demonstration of how all of this works. I'm going to go ahead and select this, and let me show you the dry run feature. The dry run tells how many assets are coming in and the estimated time it's going to take. And if you have already executed the import before, there is going to be a job history; you can see here that I have executed this twice. You can take a look into the details of what happened behind a particular job, close them out, or delete those entries. With that, let me go ahead and either schedule or run. Right now I'm going to run this, so I click Run, and the job is queued up. Once it starts, we can go and see the job history; it's still processing, and you can open it up and see the logs. Now it says Succeeded. Let's go and see whether the assets have been successfully imported. Okay, so Assets, Files. Here I have the index.jpg, then level one, level one-one, and all the levels, with all the assets that have been pulled from Azure Blob Storage. As you see, they are all new, and they've been processed. This method is super fast as well when you are ingesting assets; we have seen great numbers in terms of moving assets this way. Thanks for watching, and let's move on to the next section: Package Manager. The third option is Package Manager. As I mentioned before, Package Manager is still available in Cloud Service, but there are specific things that we have to consider when using Package Manager in Cloud Service.
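The metadata CSV mentioned above might look something like the following. The column names here (`assetPath` plus metadata property columns such as `dc:title`) are assumptions for illustration only; consult the product documentation for the exact schema the Bulk Import Service expects.

```csv
assetPath,dc:title,dc:description
/content/dam/level1/index.jpg,Homepage hero,Imported via Bulk Import Service
/content/dam/level1/level11/banner.jpg,Campaign banner,Imported via Bulk Import Service
```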
So one interesting fact that everybody has to know: even though CRXDE and the system console are blocked in Cloud Service, Package Manager is available across all the authoring instances, whether dev, stage, or production. Please make a note of that. Package Manager is ideal for moving smaller sets of content, because there are practical package restrictions due to the nature of the cloud and the way AEM is set up there. What we have observed so far, from different experiences, is that packages up to about 50 MB in size are reliably uploaded and installed, but with larger packages, uploads or installs sometimes fail because of the underlying infrastructure used in the cloud. This may improve in a future state, but for now these are things we have to keep in mind. Even when using packages, the packages have to have filters that install content only into the mutable parts of the repository. If, for example, you have a package with a filter that installs something into /apps, then that package cannot be installed; no matter how you try, that content has to go through Cloud Manager. So package content has to install into the mutable parts only. Another thing to note: if you want to push some content to the publish tier, because CRX Package Manager is not available on the publish tier, one way to achieve this is to upload the package to the author tier corresponding to that publish tier, and then replicate (activate, or distribute in Cloud Service) so that the content is replicated to the publish instances on the publish tier. So the key summary for Package Manager is that it is available across all author instances, and there is a practical limit on the size of the packages.
And if you want to get some content onto the publish tier, you can upload the package to author and then activate it. Thanks for watching.
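The mutable-path rule above can be illustrated with a FileVault `filter.xml` (the file inside a content package that declares what the package installs). The project paths below are hypothetical; the `workspaceFilter` format itself is standard FileVault.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- META-INF/vault/filter.xml -->
<workspaceFilter version="1.0">
    <!-- OK: /content is mutable at runtime, so this can be installed
         via Package Manager on AEM as a Cloud Service -->
    <filter root="/content/dam/my-project"/>
    <!-- NOT allowed via Package Manager: /apps is immutable in
         Cloud Service; code must be deployed through Cloud Manager.
    <filter root="/apps/my-project"/>
    -->
</workspaceFilter>
```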
Using the Bulk Import Service
The Bulk Import Service is used to transfer files stored in Azure Blob Storage or Amazon S3 storage into AEM as a Cloud Service as assets.
The input sources shown in this video are only Azure Blob Storage and Amazon S3; however, the available sources continue to grow over time. For a complete list of supported input sources, refer to the available options in the product or in the documentation.
Key Activities
- Upload the files-to-import to your cloud storage provider.
- Configure and run the Bulk Import Service from AEM as a Cloud Service Author service.
- Run the Bulk Import Service as a one-time import or schedule a periodic import.
Other Resources