Extracting Strings for Translating

Last update: Wed May 03 2023 00:00:00 GMT+0000 (Coordinated Universal Time)

Topics:
Developing

CREATED FOR:

Developer

CAUTION

AEM 6.4 has reached the end of extended support and this documentation is no longer updated. For further details, see our technical support periods. Find the supported versions here.

Use xgettext-maven-plugin to extract the strings that need translating from your source code. The Maven plugin writes the strings to an XLIFF file that you send for translation. Strings are extracted from the following locations:

Java source files
JavaScript source files
XML representations of SVN resources (JCR nodes)

Configuring String Extraction

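Configure how the xgettext-maven-plugin tool extracts strings for your project. The extraction rules live in a rules file (named i18n.any in the examples on this page) with the following structure: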

/filter { }
/parsers {
   /vaultxml { }
   /javascript { }
   /regexp {
      /files {
         /java { }
         /jsp { }
         /extjstemplate { }
      }
   }
}
/potentials { }

Section: Description
/filter: Identifies the files that are parsed.
/parsers/vaultxml: Configures the parsing of Vault files. Identifies the JCR nodes that contain externalized strings and localization hints. Also identifies JCR nodes to ignore.
/parsers/javascript: Identifies the JavaScript functions that externalize strings. You do not need to change this section.
/parsers/regexp: Configures the parsing of Java, JSP, and ExtJS Template files. You do not need to change this section.
/potentials: The formula for detecting strings to internationalize.

Identifying the Files to Parse

The /filter section of the i18n.any file identifies the files that the xgettext-maven-plugin tool parses. Add several include and exclude rules that identify files that are parsed and ignored, respectively. You should include all files and then exclude the files that you do not want to parse. Typically, you exclude file types that do not contribute to the UI, or files that define UI but are not being translated. The include and exclude rules have the following format:

{ /include "pattern" }
{ /exclude "pattern" }

The pattern part of a rule is matched against the names of the files to include or exclude. The pattern prefix indicates whether you are matching a JCR node (its representation in Vault) or a file on the file system.

Prefix: Effect
/: Indicates a JCR path. Therefore, this prefix matches files below the jcr_root directory.
*: Indicates a regular file on the file system.
none: No prefix, or a pattern that begins with a folder or file name, indicates a regular file on the file system.

When used within a pattern, the / character indicates a subdirectory and the * character matches all. The following table lists several example rules.

Example rule: Effect
{ /include "*" }: Include all files.
{ /exclude "*.pdf" }: Exclude all PDF files.
{ /exclude "*/pom.xml" }: Exclude POM files.
{ /exclude "/content/*" }: Exclude all files below the /content node.
{ /include "/content/catalogs/geometrixx/templatepages" }: Include the /content/catalogs/geometrixx/templatepages node.
{ /include "/content/catalogs/geometrixx/templatepages/*" }: Include all child nodes of /content/catalogs/geometrixx/templatepages.
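
The example rules might be combined into a single /filter section along the following lines. This is only a sketch: it assumes that rules are nested directly inside the braces of /filter, and that a later rule overrides an earlier one for the files it matches, which is what the /content example suggests. The fragment carries no inline comments because this page does not describe a comment syntax for the file.

```
/filter {
    { /include "*" }
    { /exclude "*.pdf" }
    { /exclude "*/pom.xml" }
    { /exclude "/content/*" }
    { /include "/content/catalogs/geometrixx/templatepages" }
    { /include "/content/catalogs/geometrixx/templatepages/*" }
}
```

Read top to bottom under those assumptions, the section includes everything, drops PDF and POM files, drops everything below /content, and then re-includes the templatepages node and its children.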

Extracting the Strings

Without a POM, run the following command:

mvn -N com.adobe.granite.maven:xgettext-maven-plugin:1.2.2:extract -Dxgettext.verbose=true -Dxgettext.target=out -Dxgettext.rules=i18n.any -Dxgettext.root=.

With a POM, add the following plugin configuration to the POM:

<build>
    <plugins>
        <plugin>
            <groupId>com.adobe.granite.maven</groupId>
            <artifactId>xgettext-maven-plugin</artifactId>
            <version>1.1</version>
            <configuration>
                <rules>i18n.any</rules>
                <root>jcr_root</root>
                <xliff>cq.xliff</xliff>
                <verbose>true</verbose>
            </configuration>
        </plugin>
    </plugins>
</build>

Then run the following command:

mvn xgettext:extract

Output Files

The extraction produces the following files:

raw.xliff: the extracted strings.
warn.log: warnings, if any, about incorrect use of the CQ.I18n.getMessage() API. Always fix these issues and then re-run the extraction.
parserwarn.log: parser warnings, if any, for example JavaScript parser issues.
potentials.xliff: “potential” candidates that are not extracted but might be human-readable strings that need translation. You can ignore this file; it still produces a large number of false positives.
strings.xliff: the flattened XLIFF file, to be imported into ALF.
backrefs.txt: allows quick lookup of the source code locations for a given string.
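
Because warn.log entries always call for a fix and a re-run, a small check after each extraction can be handy. The following is a minimal sketch, not part of the plugin: the function name check_extraction_output is hypothetical, and it assumes that all output files land in one directory (for example, the out directory passed as -Dxgettext.target in the command-line invocation). Treating a missing raw.xliff as an error is also an assumption made for illustration.

```shell
# Hypothetical helper: inspect an extraction output directory and report
# whether the run needs attention. Not provided by xgettext-maven-plugin.
check_extraction_output() {
    out_dir="$1"

    # raw.xliff holds the extracted strings; assume a run without it failed.
    if [ ! -f "$out_dir/raw.xliff" ]; then
        echo "missing raw.xliff"
        return 2
    fi

    # -s is true only when the file exists and is non-empty, so an absent
    # or empty warn.log counts as "no warnings".
    if [ -s "$out_dir/warn.log" ]; then
        echo "fix warnings and re-run"
        return 1
    fi

    echo "ok"
    return 0
}
```

After an extraction you might call check_extraction_output out and stop a build script when the return status is nonzero. parserwarn.log and potentials.xliff are deliberately not checked, since the descriptions above treat them as informational.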
