v7

Configure access to Hadoop

Last update: 2023-07-31

Use the Campaign Federated Data Access (FDA) option to process information stored in an external database. Follow the steps below to configure access to Hadoop.

  1. Configure Hadoop database
  2. Configure the Hadoop external account in Campaign

Configuring Hadoop 3.0

Connecting to a Hadoop external database in FDA requires the following configurations on the Adobe Campaign server. Note that this configuration is available for both Windows and Linux.

  1. Download the ODBC drivers for Hadoop depending on your OS version. Drivers can be found on this page.

  2. Install the ODBC drivers and create a DSN for your Hive connection. Instructions can be found on this page.

  3. After downloading and installing the ODBC drivers, restart Campaign Classic. To do so on Linux, run the following commands:

    systemctl stop nlserver.service
    systemctl start nlserver.service
    
  4. In Campaign Classic, you can then configure your Hadoop external account. For more on how to configure your external account, refer to this section.
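
The restart in step 3 can be wrapped in a small helper that also confirms the service came back up. This is a minimal sketch, assuming a systemd-based Linux host and the nlserver.service unit name used above. The restart_campaign function name is hypothetical, and the commands need root privileges.

```shell
#!/bin/sh
# Sketch: restart Campaign and confirm that the unit is active again.
# restart_campaign is a hypothetical helper name; the unit name defaults
# to the one used in the commands above.
restart_campaign() {
    unit="${1:-nlserver.service}"
    systemctl stop "$unit" || return 1
    systemctl start "$unit" || return 1
    # "is-active --quiet" exits with 0 only when the unit is running.
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit failed to start" >&2
        return 1
    fi
}

# Usage (as root): restart_campaign
```

Checking the unit state right after the restart surfaces a failed start immediately, before you move on to the external account configuration.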

Hadoop external account

The Hadoop external account allows you to connect your Campaign instance to your Hadoop external database.

  1. In Campaign Classic, configure your Hadoop external account. From the Explorer, click Administration / Platform / External accounts.

  2. Click New.

  3. Select External database as your external account’s Type.

  4. Configure the Hadoop external account. You must specify:

    • Type: ODBC (Sybase ASE, Sybase IQ)

    • Server: Name of the DSN

    • Account: Name of the user

    • Password: User account password

    • Database: Name of your database. It can be left empty if the database is specified in the DSN.

    • Time zone: Server time zone

The connector supports the following ODBC options:

Name | Value
ODBCMgr | iODBC
warehouse | 1/2/4

The connector also supports the following Hive options:

Name | Value | Description
bulkKey | Azure blob or DataLake access key | For wasb:// or wasbs:// bulk loaders (i.e. if the bulk load tool starts with wasb:// or wasbs://). It is the access key for the blob or DataLake bucket used for bulk load.
hdfsPort | Port number (set by default to 8020) | For HDFS bulk load (i.e. if the bulk load tool starts with webhdfs:// or webhdfss://).
bucketsNumber | 20 | Number of buckets when creating a clustered table.
fileFormat | PARQUET | Default file format for work tables.
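
Which of the Hive options above applies depends on the prefix of the bulk load tool. The following sketch only illustrates that mapping and is not part of Campaign. The hive_option_for function name and the sample URLs are hypothetical.

```shell
#!/bin/sh
# Illustration only: map a bulk load tool URL to the Hive option that
# the table above associates with its prefix.
# hive_option_for is a hypothetical helper name.
hive_option_for() {
    case "$1" in
        wasb://*|wasbs://*)       echo "bulkKey" ;;
        webhdfs://*|webhdfss://*) echo "hdfsPort" ;;
        *)                        echo "none" ;;
    esac
}

# Sample URLs (hypothetical hosts and paths).
hive_option_for "wasbs://container@account.blob.core.windows.net/tmp"
hive_option_for "webhdfs://namenode.example.com/tmp"
```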

Configuring Hadoop 2.1

If you need to connect to Hadoop 2.1, follow the steps described below for Windows or Linux.

Hadoop 2.1 for Windows

  1. Install ODBC and Azure HD Insight drivers for Windows.

  2. Create the DSN (Data Source Name) by running the ODBC DataSource Administrator tool. A System DSN sample for Hive is provided for you to modify.

    Description: vorac (or any name you like)
    Host: vorac.azurehdinsight.net
    Port: 443
    Database: sm_tst611 (or your database name)
    Mechanism: Azure HDInsight Service
    User/Password: admin/<your password here>
    
  3. Create the Hadoop external account, as detailed in this section.

Hadoop 2.1 for Linux

  1. Install unixodbc for Linux.

    apt-get install unixodbc
    
  2. Download and install ODBC drivers for Apache Hive from Hortonworks (now part of Cloudera): https://www.cloudera.com/downloads.html.

    dpkg -i hive-odbc-native_2.1.10.1014-2_amd64.deb
    
  3. Check ODBC files location.

    
    root@campadpac71:/tmp# odbcinst -j
    unixODBC 2.3.1
    DRIVERS............: /etc/odbcinst.ini
    SYSTEM DATA SOURCES: /etc/odbc.ini
    FILE DATA SOURCES..: /etc/ODBCDataSources
    USER DATA SOURCES..: /root/.odbc.ini
    SQLULEN Size.......: 8
    SQLLEN Size........: 8
    SQLSETPOSIROW Size.: 8
    
  4. Edit the odbc.ini file to create a DSN (Data Source Name) for your Hive connection.

    Here is an example for HDInsight that sets up a connection called “vorac”:

    [ODBC Data Sources]
    vorac
    
    [vorac]
    Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
    HOST=vorac.azurehdinsight.net
    PORT=443
    Schema=sm_tst611
    HiveServerType=2
    AuthMech=6
    UID=admin
    PWD=<your password here>
    HTTPPath=
    UseNativeQuery=1
    
    NOTE

    The UseNativeQuery parameter here is very important. Campaign is Hive-aware and will not work correctly unless UseNativeQuery is set. Without it, the driver or Hive SQL Connector typically rewrites queries and tampers with the column ordering.

    The authentication setup depends on the Hive/Hadoop configuration. For instance, for HD Insight, use AuthMech=6 for user/password authentication, as described here.

  5. Export the ODBCINI and ODBCSYSINI variables so that they point to your ODBC configuration files.

    export ODBCINI=/etc/myodbc.ini
    export ODBCSYSINI=/etc/myodbcinst.ini
    
  6. Set up the Hortonworks drivers via /usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini.

    Set DriverManagerEncoding to UTF-16 so that the driver can connect with Campaign and unixODBC (libodbcinst).

    [Driver]
    
    DriverManagerEncoding=UTF-16
    ErrorMessagesPath=/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/
    LogLevel=0
    LogPath=/tmp/hive
    SwapFilePath=/tmp
    
    ODBCInstLib=libodbcinst.so
    
  7. You can now test your connection using isql.

    isql vorac
    isql vorac -v
    
  8. Create the Hadoop external account, as detailed in this section.
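
Before testing with isql, it can help to check that the DSN entry carries the settings the steps above rely on, in particular a Driver path and UseNativeQuery=1. The following is a minimal sketch, assuming an odbc.ini laid out like the example in step 4. The dsn_value and check_hive_dsn helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: read one key from one section of an odbc.ini-style file.
# dsn_value and check_hive_dsn are hypothetical helper names.
dsn_value() {
    # $1 = ini file, $2 = DSN (section) name, $3 = key
    awk -F= -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section && $1 == key { print substr($0, length($1) + 2); exit }
    ' "$1"
}

# Sketch: verify the settings that the steps above depend on.
check_hive_dsn() {
    # $1 = ini file, $2 = DSN name
    if [ -z "$(dsn_value "$1" "$2" Driver)" ]; then
        echo "DSN $2: no Driver entry found" >&2
        return 1
    fi
    if [ "$(dsn_value "$1" "$2" UseNativeQuery)" != "1" ]; then
        echo "DSN $2: UseNativeQuery must be set to 1" >&2
        return 1
    fi
    echo "DSN $2 looks consistent"
}

# Usage: check_hive_dsn "${ODBCINI:-/etc/odbc.ini}" vorac
```

The check only reads the file, so it is safe to run at any point. It does not replace the isql connection test in step 7.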
