Converting Word Documents to PDF Documents

This section describes how you can use the Generate PDF API to programmatically convert a Microsoft Word document to a PDF document.

NOTE
For more information about additional file formats, see Adding Support for Additional Native File Formats.
NOTE
For more information about the Generate PDF service, see Services Reference for AEM Forms.

Summary of steps

To convert a Microsoft Word document to a PDF document, perform the following tasks:

  1. Include project files.
  2. Create a Generate PDF client.
  3. Retrieve the file to convert to a PDF document.
  4. Convert the file to a PDF document.
  5. Retrieve the results.

Include project files

Include the necessary files in your development project. If you are creating a client application using Java, include the necessary JAR files. If you are using web services, ensure that you include the proxy files.

Create a Generate PDF client

Before you can programmatically perform a Generate PDF operation, create a Generate PDF service client. If you are using the Java API, create a GeneratePdfServiceClient object. If you are using the web service API, create a GeneratePDFServiceService object.

Retrieve the file to convert to a PDF document

Retrieve the Microsoft Word document to convert to a PDF document.

Convert the file to a PDF document

After you create the Generate PDF service client, you can invoke the createPDF2 method. This method needs information about the document to convert, including the file extension.
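
The file extension can be derived from the file location instead of being hard-coded. The following is a minimal sketch, not part of the AEM Forms SDK: the FileExtensions class and its extensionOf method are hypothetical names, and returning the extension without a leading dot is an assumption. Check the Services Reference for the exact form that the service expects.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the AEM Forms SDK. */
final class FileExtensions {
    private FileExtensions() {}

    /**
     * Returns the lower-cased extension of the file at the given location,
     * without the dot, or an empty string if the file name has no extension.
     * Both '/' and '\' are treated as path separators.
     */
    static String extensionOf(String location) {
        int separator = Math.max(location.lastIndexOf('/'), location.lastIndexOf('\\'));
        String name = location.substring(separator + 1);
        int dot = name.lastIndexOf('.');
        // dot <= 0 covers names without a dot and dot-files such as ".profile";
        // a trailing dot means there is no extension either.
        if (dot <= 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("C:\\Adobe\\Test.doc")); // prints "doc"
    }
}
```

An empty result signals that the extension must be supplied some other way before the conversion request is made.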

Retrieve the results

After the file is converted to a PDF document, you can retrieve the results. For example, after you convert a Word file to a PDF document, you can retrieve and save the PDF document.

See also

Convert Word documents to PDF documents using the Java API

Convert Word documents to PDF documents using the web service API

Including AEM Forms Java library files

Setting connection properties

Generate PDF Service API Quick Starts

Convert Word documents to PDF documents using the Java API

Convert a Microsoft Word document to a PDF document by using the Generate PDF API (Java):

  1. Include project files.

    Include client JAR files, such as adobe-generatepdf-client.jar, in your Java project’s class path.

  2. Create a Generate PDF client.

    • Create a ServiceClientFactory object that contains connection properties.
    • Create a GeneratePdfServiceClient object by using its constructor and passing the ServiceClientFactory object.
  3. Retrieve the file to convert to a PDF document.

    • Create a java.io.FileInputStream object that represents the Word file to convert by using its constructor. Pass a string value that specifies the file location.
    • Create a com.adobe.idp.Document object by using its constructor and passing the java.io.FileInputStream object.
  4. Convert the file to a PDF document.

    Convert the file to a PDF document by invoking the GeneratePdfServiceClient object’s createPDF2 method and passing the following values:

    • A com.adobe.idp.Document object that represents the file to convert.
    • A java.lang.String object that contains the file extension.
    • A java.lang.String object that contains the file type settings to be used in the conversion. File type settings provide conversion settings for different file types, such as .doc or .xls.
    • A java.lang.String object that contains the name of the PDF settings to be used. For example, you can specify Standard.
    • A java.lang.String object that contains the name of the security settings to be used.
    • An optional com.adobe.idp.Document object that contains settings to be applied while generating the PDF document.
    • An optional com.adobe.idp.Document object that contains metadata information to be applied to the PDF document.

    The createPDF2 method returns a CreatePDFResult object that contains the new PDF document and a log document. The log document typically contains error or warning messages generated by the conversion request.

  5. Retrieve the results.

    To obtain the PDF document, perform the following actions:

    • Invoke the CreatePDFResult object’s getCreatedDocument method, which returns a com.adobe.idp.Document object.
    • Invoke the com.adobe.idp.Document object’s copyToFile method to extract the PDF document from the object created in the previous step.

    If you used the createPDF2 method to obtain the log document (not applicable to HTML conversions), perform the following actions:

    • Invoke the CreatePDFResult object’s getLogDocument method. This returns a com.adobe.idp.Document object.
    • Invoke the com.adobe.idp.Document object’s copyToFile method to extract the log document.

Convert Word documents to PDF documents using the web service API

Convert a Microsoft Word document to a PDF document by using the Generate PDF API (web service):

  1. Include project files.

    Create a Microsoft .NET project that uses MTOM. Ensure that you use the following WSDL definition: http://localhost:8080/soap/services/GeneratePDFService?WSDL&lc_version=9.0.1.

    NOTE
    Replace localhost with the IP address of the server hosting AEM Forms.
  2. Create a Generate PDF client.

    • Create a GeneratePDFServiceClient object by using its default constructor.

    • Create a GeneratePDFServiceClient.Endpoint.Address object by using the System.ServiceModel.EndpointAddress constructor. Pass a string value that specifies the WSDL to the AEM Forms service (for example, http://localhost:8080/soap/services/GeneratePDFService?blob=mtom). You do not need to use the lc_version attribute. However, specify ?blob=mtom.

    • Create a System.ServiceModel.BasicHttpBinding object by getting the value of the GeneratePDFServiceClient.Endpoint.Binding field. Cast the return value to BasicHttpBinding.

    • Set the System.ServiceModel.BasicHttpBinding object’s MessageEncoding field to WSMessageEncoding.Mtom. This value ensures that MTOM is used.

    • Enable basic HTTP authentication by performing the following tasks:

      • Assign the AEM Forms user name to the field GeneratePDFServiceClient.ClientCredentials.UserName.UserName.
      • Assign the corresponding password value to the field GeneratePDFServiceClient.ClientCredentials.UserName.Password.
      • Assign the constant value HttpClientCredentialType.Basic to the field BasicHttpBindingSecurity.Transport.ClientCredentialType.
      • Assign the constant value BasicHttpSecurityMode.TransportCredentialOnly to the field BasicHttpBindingSecurity.Security.Mode.
  3. Retrieve the file to convert to a PDF document.

    • Create a BLOB object by using its constructor. The BLOB object is used to store the file that you want to convert to a PDF document.
    • Create a System.IO.FileStream object by invoking its constructor. Pass a string value that represents the file location of the file to convert and the mode in which to open the file.
    • Create a byte array that stores the content of the System.IO.FileStream object. You can determine the size of the byte array by getting the System.IO.FileStream object’s Length property.
    • Populate the byte array with stream data by invoking the System.IO.FileStream object’s Read method and passing the byte array, the starting position, and the stream length to read.
    • Populate the BLOB object by assigning to its MTOM property the contents of the byte array.
  4. Convert the file to a PDF document.

    Convert the file to a PDF document by invoking the GeneratePDFServiceClient object’s CreatePDF2 method and passing the following values:

    • A BLOB object that represents the file to be converted.
    • A string that contains the file extension.
    • A string object that contains the file type settings to be used in the conversion. File type settings provide conversion settings for different file types, such as .doc or .xls.
    • A string object that contains the PDF settings to be used. You can specify Standard.
    • A string object that contains the security settings to be used. You can specify No Security.
    • An optional BLOB object that contains settings to be applied while generating the PDF document.
    • An optional BLOB object that contains metadata information to be applied to the PDF document.
    • An output parameter of type BLOB that the CreatePDF2 method populates with the converted document. (This parameter is required only for web service invocation.)
    • An output parameter of type BLOB that the CreatePDF2 method populates with the log document. (This parameter is required only for web service invocation.)
  5. Retrieve the results.

    • Retrieve the converted PDF document by assigning the BLOB object’s MTOM field to a byte array. The byte array represents the converted PDF document. Ensure you use the BLOB object that is used as the output parameter for the CreatePDF2 method.
    • Create a System.IO.FileStream object by invoking its constructor and passing a string value that represents the file location of the converted PDF document.
    • Create a System.IO.BinaryWriter object by invoking its constructor and passing the System.IO.FileStream object.
    • Write the contents of the byte array to a PDF file by invoking the System.IO.BinaryWriter object’s Write method and passing the byte array.
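
The steps above use two URLs that differ only in their query string: the WSDL definition (with lc_version) used to generate the proxy, and the service endpoint (with ?blob=mtom) used at run time. The following minimal sketch builds both from a host and port so that localhost is replaced in one place. It is written in Java for illustration only; the GeneratePdfEndpoints class and its method names are hypothetical, and the use of plain http follows the example URLs above.

```java
import java.net.URI;

/** Hypothetical helper; only the URL shapes come from the steps above. */
final class GeneratePdfEndpoints {
    private static final String SERVICE_PATH = "/soap/services/GeneratePDFService";

    private GeneratePdfEndpoints() {}

    /** WSDL definition used when generating the proxy classes. */
    static URI wsdlUri(String host, int port, String lcVersion) {
        return URI.create("http://" + host + ":" + port + SERVICE_PATH
                + "?WSDL&lc_version=" + lcVersion);
    }

    /** Run-time endpoint address; lc_version is not needed, blob=mtom is. */
    static URI mtomEndpointUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + SERVICE_PATH + "?blob=mtom");
    }

    public static void main(String[] args) {
        System.out.println(wsdlUri("localhost", 8080, "9.0.1"));
        System.out.println(mtomEndpointUri("localhost", 8080));
    }
}
```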

Converting HTML Documents to PDF Documents

This section describes how you can use the Generate PDF API to programmatically convert HTML documents to PDF documents.

NOTE
For more information about the Generate PDF service, see Services Reference for AEM Forms.

Summary of steps

To convert an HTML document to a PDF document, perform the following tasks:

  1. Include project files.
  2. Create a Generate PDF client.
  3. Retrieve the HTML content to convert to a PDF document.
  4. Convert the HTML content to a PDF document.
  5. Retrieve the results.

Include project files

Include the necessary files in your development project. If you are creating a client application using Java, include the necessary JAR files. If you are using web services, ensure that you include the proxy files.

Create a Generate PDF client

Before you can programmatically perform a Generate PDF operation, you must create a Generate PDF service client. If you are using the Java API, create a GeneratePdfServiceClient object. If you are using the web service API, create a GeneratePDFServiceService object.

Retrieve the HTML content to convert to a PDF document

Reference HTML content that you want to convert to a PDF document. You can reference HTML content such as an HTML file or HTML content that is accessible using a URL.

Convert the HTML content to a PDF document

After you create the service client, you can invoke the appropriate PDF creation operation. This operation needs information about the document to be converted, including the path to the target document.

Retrieve the results

After the HTML content is converted to a PDF document, you can retrieve the results and save the PDF document.

See also

Convert HTML content to a PDF document using the Java API

Convert HTML content to a PDF document using the web service API

Including AEM Forms Java library files

Setting connection properties

Generate PDF Service API Quick Starts

Convert HTML content to a PDF document using the Java API

Convert an HTML document to a PDF document using the Generate PDF API (Java):

  1. Include project files.

    Include client JAR files, such as adobe-generatepdf-client.jar, in your Java project’s class path.

  2. Create a Generate PDF client.

    Create a GeneratePdfServiceClient object by using its constructor and passing a ServiceClientFactory object that contains connection properties.

  3. Retrieve the HTML content to convert to a PDF document.

    Retrieve HTML content by creating a string variable and assigning a URL that points to HTML content.

  4. Convert the HTML content to a PDF document.

    Invoke the GeneratePdfServiceClient object’s htmlToPDF2 method and pass the following values:

    • A java.lang.String object that contains the URL of the HTML file to be converted.
    • A java.lang.String object that contains the file type settings to be used in the conversion. File type settings can include spidering levels.
    • A java.lang.String object that contains the name of the security settings to be used.
    • An optional com.adobe.idp.Document object that contains settings to be applied while generating the PDF document. If this information is not supplied, the settings are automatically chosen based on the previous three parameters.
    • An optional com.adobe.idp.Document object that contains metadata information to be applied to the PDF document.
  5. Retrieve the results.

    The htmlToPDF2 method returns an HtmlToPdfResult object that contains the new PDF document that was generated. To obtain the newly created PDF document, perform the following actions:

    • Invoke the HtmlToPdfResult object’s getCreatedDocument method. This returns a com.adobe.idp.Document object.
    • Invoke the com.adobe.idp.Document object’s copyToFile method to extract the PDF document from the object created in the previous step.
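
Because the HTML source is passed as a plain string, a malformed URL is only detected when the conversion request fails. The following minimal sketch checks the string before it is passed on. The HtmlSourceUrls class is a hypothetical helper, not part of the AEM Forms SDK, and limiting the check to absolute http and https URLs is an assumption; relax it if your server accepts other schemes.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper; not part of the AEM Forms SDK. */
final class HtmlSourceUrls {
    private HtmlSourceUrls() {}

    /**
     * Returns true if the value parses as an absolute http or https URL
     * with a host component. (Assumption: only these schemes are wanted.)
     */
    static boolean isAbsoluteHttpUrl(String value) {
        if (value == null) {
            return false;
        }
        try {
            URI uri = new URI(value.trim());
            String scheme = uri.getScheme();
            boolean httpScheme = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            return httpScheme && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Illegal characters, such as unescaped spaces, end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAbsoluteHttpUrl("https://www.adobe.com")); // prints "true"
    }
}
```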

Convert HTML content to a PDF document using the web service API

Convert HTML content to a PDF document by using the Generate PDF API (web service):

  1. Include project files.

    Create a Microsoft .NET project that uses MTOM. Ensure that you use the following WSDL definition: http://localhost:8080/soap/services/GeneratePDFService?WSDL&lc_version=9.0.1.

    NOTE
    Replace localhost with the IP address of the server hosting AEM Forms.
  2. Create a Generate PDF client.

    • Create a GeneratePDFServiceClient object by using its default constructor.

    • Create a GeneratePDFServiceClient.Endpoint.Address object by using the System.ServiceModel.EndpointAddress constructor. Pass a string value that specifies the WSDL to the AEM Forms service (for example, http://localhost:8080/soap/services/GeneratePDFService?blob=mtom). You do not need to use the lc_version attribute. However, specify ?blob=mtom.

    • Create a System.ServiceModel.BasicHttpBinding object by getting the value of the GeneratePDFServiceClient.Endpoint.Binding field. Cast the return value to BasicHttpBinding.

    • Set the System.ServiceModel.BasicHttpBinding object’s MessageEncoding field to WSMessageEncoding.Mtom. This value ensures that MTOM is used.

    • Enable basic HTTP authentication by performing the following tasks:

      • Assign the AEM Forms user name to the field GeneratePDFServiceClient.ClientCredentials.UserName.UserName.
      • Assign the corresponding password value to the field GeneratePDFServiceClient.ClientCredentials.UserName.Password.
      • Assign the constant value HttpClientCredentialType.Basic to the field BasicHttpBindingSecurity.Transport.ClientCredentialType.
      • Assign the constant value BasicHttpSecurityMode.TransportCredentialOnly to the field BasicHttpBindingSecurity.Security.Mode.
  3. Retrieve the HTML content to convert to a PDF document.

    Retrieve HTML content by creating a string variable and assigning a URL that points to HTML content.

  4. Convert the HTML content to a PDF document.

    Convert the HTML content to a PDF document by invoking the GeneratePDFServiceClient object’s HtmlToPDF2 method and passing the following values:

    • A string that contains the URL of the HTML content to convert.
    • A string object that contains the file type settings to be used in the conversion.
    • A string object that contains the security settings to be used.
    • An optional BLOB object that contains settings to be applied while generating the PDF document.
    • An optional BLOB object that contains metadata information to be applied to the PDF document.
    • An output parameter of type BLOB that the HtmlToPDF2 method populates with the converted document. (This parameter is required only for web service invocation.)
  5. Retrieve the results.

    • Retrieve the converted PDF document by assigning the BLOB object’s MTOM field to a byte array. The byte array represents the converted PDF document. Ensure you use the BLOB object that is used as the output parameter for the HtmlToPDF2 method.
    • Create a System.IO.FileStream object by invoking its constructor and passing a string value that represents the file location of the converted PDF document.
    • Create a System.IO.BinaryWriter object by invoking its constructor and passing the System.IO.FileStream object.
    • Write the contents of the byte array to a PDF file by invoking the System.IO.BinaryWriter object’s Write method and passing the byte array.

Converting PDF Documents to Non-image Formats

This section describes how you can use the Generate PDF Java API and web service API to programmatically convert a PDF document to an RTF file, which is an example of a non-image format. Other non-image formats include HTML, text, DOC, and EPS. When converting a PDF document to RTF, ensure that the PDF document does not contain form elements, such as a submit button. Form elements are not converted.

NOTE
For more information about the Generate PDF service, see Services Reference for AEM Forms.

Summary of steps

To convert a PDF document to any of the supported types, perform the following steps:

  1. Include project files.
  2. Create a Generate PDF client.
  3. Retrieve the PDF document to convert.
  4. Convert the PDF document.
  5. Save the converted file.

Include project files

Include the necessary files in your development project. If you are creating a client application using Java, include the necessary JAR files. If you are using web services, ensure that you include the proxy files.

Create a Generate PDF client

Before you can programmatically perform a Generate PDF operation, you must create a Generate PDF service client. If you are using the Java API, create a GeneratePdfServiceClient object. If you are using the web service API, create a GeneratePDFServiceService object.

Retrieve the PDF document to convert

Retrieve the PDF document to convert to a non-image format.

Convert the PDF document

After you create the service client, you can invoke the PDF export operation. This operation needs information about the document to be converted, including the path to the target document.

Save the converted file

Save the converted file. For example, if you convert a PDF document to an RTF file, save the converted document to an RTF file.

See also

Convert a PDF document to an RTF file using the Java API

Convert a PDF document to an RTF file using the web service API

Including AEM Forms Java library files

Setting connection properties

Generate PDF Service API Quick Starts

Convert a PDF document to an RTF file using the Java API

Convert a PDF document to an RTF file by using the Generate PDF API (Java):

  1. Include project files.

    Include client JAR files, such as adobe-generatepdf-client.jar, in your Java project’s class path.

  2. Create a Generate PDF client.

    Create a GeneratePdfServiceClient object by using its constructor and passing a ServiceClientFactory object that contains connection properties.

  3. Retrieve the PDF document to convert.

    • Create a java.io.FileInputStream object that represents the PDF document to convert by using its constructor. Pass a string value that specifies the location of the PDF document.
    • Create a com.adobe.idp.Document object by using its constructor and passing the java.io.FileInputStream object.
  4. Convert the PDF document.

    Invoke the GeneratePdfServiceClient object’s exportPDF2 method and pass the following values:

    • A com.adobe.idp.Document object that represents the PDF file to convert.
    • A java.lang.String object that contains the name of the file to convert.
    • A java.lang.String object that contains the name of the Adobe PDF settings.
    • A ConvertPDFFormatType object that specifies the target file type for the conversion.
    • An optional com.adobe.idp.Document object that contains settings to be applied while generating the PDF document.

    The exportPDF2 method returns an ExportPDFResult object that contains the converted file.

  5. Save the converted file.

    To obtain the newly created file, perform the following actions:

    • Invoke the ExportPDFResult object’s getConvertedDocument method. This returns a com.adobe.idp.Document object.
    • Invoke the com.adobe.idp.Document object’s copyToFile method to extract the new document.
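
When saving the converted file, the output file name usually mirrors the input name with an extension that matches the target format. The following minimal sketch derives that name. The ConvertedFileNames class and its TargetFormat enum are hypothetical stand-ins that cover only the formats named in this section; TargetFormat is not the SDK's ConvertPDFFormatType, and the extensions chosen are assumptions.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the AEM Forms SDK. */
final class ConvertedFileNames {
    private ConvertedFileNames() {}

    /** Local stand-in for the target formats; NOT the SDK's ConvertPDFFormatType. */
    enum TargetFormat {
        RTF("rtf"), HTML("html"), TEXT("txt"), DOC("doc"), EPS("eps");

        final String extension;

        TargetFormat(String extension) {
            this.extension = extension;
        }
    }

    /** Replaces a trailing ".pdf" (any case) with the target format's extension. */
    static String outputNameFor(String pdfFileName, TargetFormat format) {
        String base = pdfFileName;
        if (pdfFileName.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            base = pdfFileName.substring(0, pdfFileName.length() - ".pdf".length());
        }
        return base + "." + format.extension;
    }

    public static void main(String[] args) {
        System.out.println(outputNameFor("Loan.pdf", TargetFormat.RTF)); // prints "Loan.rtf"
    }
}
```

The resulting name can be wrapped in a java.io.File and passed to the copyToFile method described in the last step.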

Convert a PDF document to an RTF file using the web service API

Convert a PDF document to an RTF file by using the Generate PDF API (web service):

  1. Include project files.

    Create a Microsoft .NET project that uses MTOM. Ensure that you use the following WSDL definition: http://localhost:8080/soap/services/GeneratePDFService?WSDL&lc_version=9.0.1.

    NOTE
    Replace localhost with the IP address of the server hosting AEM Forms.
  2. Create a Generate PDF client.

    • Create a GeneratePDFServiceClient object by using its default constructor.

    • Create a GeneratePDFServiceClient.Endpoint.Address object by using the System.ServiceModel.EndpointAddress constructor. Pass a string value that specifies the WSDL to the AEM Forms service (for example, http://localhost:8080/soap/services/GeneratePDFService?blob=mtom). You do not need to use the lc_version attribute. However, specify ?blob=mtom.

    • Create a System.ServiceModel.BasicHttpBinding object by getting the value of the GeneratePDFServiceClient.Endpoint.Binding field. Cast the return value to BasicHttpBinding.

    • Set the System.ServiceModel.BasicHttpBinding object’s MessageEncoding field to WSMessageEncoding.Mtom. This value ensures that MTOM is used.

    • Enable basic HTTP authentication by performing the following tasks:

      • Assign the AEM Forms user name to the field GeneratePDFServiceClient.ClientCredentials.UserName.UserName.
      • Assign the corresponding password value to the field GeneratePDFServiceClient.ClientCredentials.UserName.Password.
      • Assign the constant value HttpClientCredentialType.Basic to the field BasicHttpBindingSecurity.Transport.ClientCredentialType.
      • Assign the constant value BasicHttpSecurityMode.TransportCredentialOnly to the field BasicHttpBindingSecurity.Security.Mode.
  3. Retrieve the PDF document to convert.

    • Create a BLOB object by using its constructor. The BLOB object is used to store the PDF document that you want to convert.
    • Create a System.IO.FileStream object by invoking its constructor and passing a string value that represents the file location of the PDF document and the mode in which to open the file.
    • Create a byte array that stores the content of the System.IO.FileStream object. You can determine the size of the byte array by getting the System.IO.FileStream object’s Length property.
    • Populate the byte array with stream data by invoking the System.IO.FileStream object’s Read method and passing the byte array, the starting position, and the stream length to read.
    • Populate the BLOB object by assigning to its MTOM property the contents of the byte array.
  4. Convert the PDF document.

    Invoke the GeneratePDFServiceClient object’s ExportPDF2 method and pass the following values:

    • A BLOB object that represents the PDF file to convert.
    • A string that contains the path name of the file to convert.
    • A string object that specifies the file location.
    • A string object that specifies the target file type for the conversion. Specify RTF.
    • An optional BLOB object that contains settings to be applied while generating the PDF document.
    • An output parameter of type BLOB that the ExportPDF2 method populates with the converted document. (This parameter is required only for web service invocation.)
  5. Save the converted file.

    • Retrieve the converted RTF document by assigning the BLOB object’s MTOM field to a byte array. The byte array represents the converted RTF document. Ensure you use the BLOB object that is used as the output parameter for the ExportPDF2 method.
    • Create a System.IO.FileStream object by invoking its constructor. Pass a string value that represents the location of the RTF file.
    • Create a System.IO.BinaryWriter object by invoking its constructor and passing the System.IO.FileStream object.
    • Write the contents of the byte array to an RTF file by invoking the System.IO.BinaryWriter object’s Write method and passing the byte array.
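
One detail worth noting about the byte-array step: a single read call on a stream is not guaranteed to return all of the requested bytes, so it is safer to loop until the array is full. The following minimal sketch shows that loop in Java for illustration only (the steps above use .NET, where System.IO.FileStream.Read has the same contract). The StreamBytes class and its readFully method are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper that illustrates the read-until-full pattern. */
final class StreamBytes {
    private StreamBytes() {}

    /**
     * Reads exactly 'length' bytes from the stream, looping because a single
     * read call may return fewer bytes than requested.
     */
    static byte[] readFully(InputStream in, int length) {
        byte[] buffer = new byte[length];
        int offset = 0;
        try {
            while (offset < length) {
                int read = in.read(buffer, offset, length - offset);
                if (read < 0) {
                    throw new IllegalStateException(
                            "Stream ended after " + offset + " of " + length + " bytes");
                }
                offset += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer;
    }
}
```

The filled array corresponds to the content that is assigned to the BLOB object's MTOM property before the conversion request is made.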