Manual setup of your Sitemap configuration
If you need more customization than your generated sitemap configuration file provides, you can copy its contents and paste it into a file named helix-sitemap.yaml
in the root folder of your project.
Note: When using a manually configured index and sitemap (e.g. your code repo includes a helix-query.yaml and helix-sitemap.yaml file) your index definition must include the robots property to ensure the sitemap excludes pages with robots: noindex
metadata. When using auto-generated index definitions, simply follow the recommendations in the indexing documentation so those pages are excluded from the index.
The following sections contain the supported types of sitemaps.
Simple Sitemap
The following is a simple helix-sitemap.yaml
. It assumes a single index containing all the pages that need to appear in the sitemap.
sitemaps:
example:
source: /query-index.json
destination: /sitemap-en.xml
If you want last modification dates to be included in the URLs of your sitemap, add a lastmod
property including a format to your configuration.
sitemaps:
example:
source: /query-index.json
destination: /sitemap-en.xml
lastmod: YYYY-MM-DD
Multiple Sitemaps
It is common to have sitemaps per section of the sites and/or per country or language. AEM supports sitemaps including the corresponding hreflang
references. In the following example we assume that there is a one to one mapping between the indexes and the sitemaps XML files.
sitemaps:
example:
lastmod: YYYY-MM-DD
languages:
en:
source: /en/query-index.json
destination: /sitemap-en.xml
hreflang: en
fr:
source: /fr/query-index.json
destination: /sitemap-fr.xml
hreflang: fr
alternate: /fr/{path}
If there are two pages in the english and french section that share a common suffix, they will be related, so e.g. if you have a page /welcome
in the english section and a page /fr/welcome
in the french section, the resulting entry in the /sitemap-en.xml
will look like this:
<url>
<loc>https://wwww.mysite.com/welcome</loc>
<xhtml:link rel="alternate" hreflang="en" href="https://wwww.mysite.com/welcome"/>
<xhtml:link rel="alternate" hreflang="fr" href="https://wwww.mysite.com/fr/welcome"/>
</url>
A similar entry will be available in /sitemap-fr.xml
.
Specifying the primary language manually
There might be situations where you have alternate versions of a page, but you’re unable to use a common suffix to identify them, possibly because you’re porting a legacy website that should not have its paths changed. In that situation, you can specify a primary-language-url
for the alternate location, in the metadata of the document.
Let’s assume our primary language is english, we have a page /welcome
in the english section and /fr/bienvenu
in the french section, and the latter is an alternate version of the former.
First, we add that information to the document at /fr/bienvenu
in its metadata:
This can also be added to a global metadata
sheet, as shown in Bulk Metadata.
Then, we add an indexed property primary-language-url
to the french index:
primary-language-url:
select: head > meta[name="primary-language-url"]
value: attribute(el, "content")
Finally, we re-publish the french page, and rebuild the sitemap.
Specifying the default language
Another common requirement is to specify the default language for a sitemap with multiple languages. This can be achieved by adding a property default
in the sitemap:
sitemaps:
example:
default: en
lastmod: YYYY-MM-DD
languages:
en:
source: /en/query-index.json
destination: /sitemap-en.xml
hreflang: en
fr:
source: /fr/query-index.json
destination: /sitemap-fr.xml
hreflang: fr
alternate: /fr/{path}
In the resulting sitemap, all entries from the english subtree will have an extra alternate entry with hreflang x-default
.
Specifying multiple hreflangs for one subtree
Sometimes, it is required to map multiple hreflangs to only one language subtree, e.g. consider we want the following to appear in the resulting sitemap:
<url>
<loc>https://myhost/la/page</loc>
<xhtml:link rel="alternate" hreflang="es-VE" href="https://myhost/la/page"/>
<xhtml:link rel="alternate" hreflang="es-SV" href="https://myhost/la/page"/>
<xhtml:link rel="alternate" hreflang="es-PA" href="https://myhost/la/page"/>
</url>
Every page in our sitemap source should appear exactly once, but have multiple alternate hreflangs associated with it. In order to achieve this, you should specify an array of languages in the hreflang
property:
sitemaps:
example:
lastmod: YYYY-MM-DD
languages:
la:
source: /la/query-index.json
destination: /sitemap-la.xml
hreflang:
- es-VE
- es-SV
- es-PA
Multiple Indexes Aggregated Into One Sitemap
There are cases where it is easier to have a single larger sitemap than fragmented small sitemaps, especially as there is a limit of sitemaps that can be submitted to search engines per site.
The following example shows how to aggregate a number of separate indexes into a single sitemap.
sitemaps:
example:
lastmod: YYYY-MM-DD
languages:
dk:
source: /dk/query-index.json
destination: /sitemap.xml
hreflang: dk
alternate: /dk/{path}
no:
source: /no/query-index.json
destination: /sitemap.xml
hreflang: no
alternate: /no/{path}
Using the same destination it is possible to combine multiple small sitemaps into one larger sitemap.
Including other sitemaps as input
In a mixed scenario, where not all languages in a sitemap are managed in AEM, you can include sitemaps from other language trees by specifying an XML path as source, as in:
sitemaps:
example:
lastmod: YYYY-MM-DD
languages:
en:
source: /en/query-index.json
destination: /sitemaps/sitemap-en.xml
hreflang: en
fr:
source: https://www.mysite.com/legacy/sitemap-fr.xml
destination: /sitemaps/sitemap-fr.xml
hreflang: fr
alternate: /fr/{path}
In this example, we use an external french sitemap to calculate all sitemap locations. AEM will determine alternates for english sitemap URLs by deconstructing the french counterparts in external sitemap using the alternate
definition.
Adding an extension to all locations in the sitemap
In a scenario, where you want all your locations to have an extension, e.g. .html
, and you’re unable to generate a helix-sitemap
sheet in your query index to derive a formula, you can add an extension to all languages or an individual language using an extension
property:
sitemaps:
example:
lastmod: YYYY-MM-DD
extension: .html
languages:
en:
source: /en/query-index.json
destination: /en/sitemap.xml
hreflang: en
fr:
source: /fr/query-index.json
destination: /fr/sitemap.xml
hreflang: fr
alternate: /fr/{path}