Bulk importing using the Importer

[For publishing from AEM Sites using Edge Delivery Services, click here.]{class="badge positive" title="Publish from AEM to Edge Delivery Services"}

Learn how to bulk import web pages using the AEM Importer tool during site migration.

Transcript
Once you are satisfied with the transformation and have individually tested one or more files, you will likely need to bulk import many more. The Import Bulk tool works nearly the same as the Import Workbench tool, with a few minor differences. You provide a list of URLs instead of a single one: simply paste the list with one URL per line. The import.js file is not automatically reloaded as it is for one-off imports, because if you are in the middle of importing 1000 URLs, you probably do not want the process to restart when you change the code. Otherwise, the options are the same whether you import one page or perform a bulk import.

The number of URLs you can import varies mainly based on the memory each page consumes. For example, a heavy SPA page usually does not release memory, and the browser tends to crash between 60 and 100 pages. In such situations, if you only need information that is in the markup, you can disable JavaScript execution in the options and you will be able to import many more pages. You can also split the set of URLs into batches if the number is still manually manageable. If you have a lot of URLs to import (10k+), contact the AEM team. There are several ways to automate the process without using a browser, which you can discuss with them.

During the process, you can download an Excel report with the list of pages imported and some process information (import success, 404, 301, etc.). At the end of the process, this report file contains everything the importer has done and can be used for further analysis, such as finding pages with errors, or for page processing, such as previewing and publishing.

If you are importing pages from a website and you do not have the full list of URLs to import, you can use the Crawl tool to build the list based on the sitemap or by crawling the site. You can use the filter path name option to only crawl URLs under a certain path. The provided URLs must then match this filter.
This can be really useful to only crawl a subset of a large site and save time and resources. After you provide a hostname and click get from robots.txt or sitemap, the tool first tries to find sitemaps in the /robots.txt file. If no robots.txt is found, it tries the /sitemap.xml file. The default file name to search can be changed in the options. If it finds a sitemap, it collects all the URLs referenced in the sitemap and recursively follows any other sitemap files it references. When the crawl is complete, you can use the download report button to get the list of all unique URLs found.

The eyedropper tool allows you to capture the logo and some of the key CSS information of a website. You just need to provide a URL and click eyedrop. Clicking the copy CSS to clipboard button copies all gathered information in a CSS format that is ready to be pasted into your AEM CSS for further testing and customization.
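
For readers who want to see the sitemap discovery described in the transcript spelled out, the following is a minimal Python sketch of that flow. It is not the Importer's actual code. The function names (`crawl_sitemaps`, `sitemaps_from_robots`, `parse_sitemap`, `batches`) and the `path_prefix` parameter are hypothetical, chosen only for illustration. It also assumes that a robots.txt without any `Sitemap:` line should fall back to /sitemap.xml, which the transcript does not state.

```python
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
import xml.etree.ElementTree as ET


def fetch_text(url):
    """Download a URL and return its body as text, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def sitemaps_from_robots(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def parse_sitemap(xml_text):
    """Return (root tag name, list of <loc> values) for a sitemap document."""
    def local(tag):
        return tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix

    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "", []
    urls = []
    for entry in root:          # each <sitemap> or <url> entry
        for child in entry:     # only direct children, so nested image locs are skipped
            if local(child.tag) == "loc" and child.text:
                urls.append(child.text.strip())
    return local(root.tag), urls


def crawl_sitemaps(origin, fetch=fetch_text, path_prefix="/"):
    """Collect unique page URLs from the sitemaps of `origin`.

    `path_prefix` imitates the path filter: only URLs whose path starts
    with it are kept. `fetch` can be swapped for a stub in tests.
    """
    robots = fetch(urljoin(origin, "/robots.txt"))
    queue = sitemaps_from_robots(robots) if robots else []
    if not queue:
        # Assumption: also fall back when robots.txt lists no sitemap.
        queue = [urljoin(origin, "/sitemap.xml")]

    seen, pages = set(), set()
    while queue:
        sitemap_url = queue.pop()
        if sitemap_url in seen:
            continue
        seen.add(sitemap_url)
        body = fetch(sitemap_url)
        if body is None:
            continue
        kind, urls = parse_sitemap(body)
        if kind == "sitemapindex":
            queue.extend(urls)  # follow referenced sitemap files
        else:
            pages.update(u for u in urls if urlparse(u).path.startswith(path_prefix))
    return sorted(pages)


def batches(urls, size):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```

Under these assumptions, calling `crawl_sitemaps("https://www.example.com", path_prefix="/blog/")` would return the unique URLs under /blog/, and `batches(urls, 50)` would split them into groups small enough to paste into the bulk import one at a time. The batch size of 50 is only an illustrative value below the 60 to 100 page range mentioned in the transcript.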