Identify minimum requirements and recommendations for Data Workbench server components before planning and implementing your system.
The server Data Processing Unit (DPU) is the main data processing component of Data Workbench. It listens for network connections from Data Workbench, reads raw source data from the File Server Unit (FSU) and uses substantial computational and storage resources.
Please refer to the Services Description in the Adobe Data Workbench (Insight) Service Agreement for license capacity information.
For MS System Center Endpoint Protection in Windows 2012 Servers, these executables need to be added to the Excluded Processes:
Adobe provides recommendations regarding a Data Workbench design that meets your business needs. However, the following guidelines are useful when selecting the operating system (OS) and hardware, because the optimized nature of the DPU software places specific requirements on the OS/hardware platform.
If a single dataset is limited by the capacity or speed of a single DPU, you can cluster multiple DPUs. For example, suppose you have three licensed copies of the DPU software that work together to run a larger dataset more quickly. Because the data is divided evenly between the machines, the licensed capacity of the dataset is multiplied by three. In addition, rows are processed three times faster than on a single DPU.
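The scaling in this example is plain multiplication. The following is a minimal sketch of that arithmetic, assuming the idealized even split described above; real clusters may scale less than linearly. The function name and the per-DPU figures are hypothetical, chosen only for illustration, and are not licensed values.

```python
def cluster_estimate(dpu_count, rows_per_dpu, rows_per_second_per_dpu):
    """Estimate cluster capacity and throughput under ideal linear scaling.

    Assumes data is divided evenly between identical DPUs, as in the
    three-DPU example. All inputs are illustrative, not licensed values.
    """
    if dpu_count < 1:
        raise ValueError("a cluster needs at least one DPU")
    return {
        "capacity_rows": dpu_count * rows_per_dpu,
        "rows_per_second": dpu_count * rows_per_second_per_dpu,
    }


# Hypothetical per-DPU figures, used only to show the multiplication.
single = cluster_estimate(1, 500_000_000, 20_000)
cluster = cluster_estimate(3, 500_000_000, 20_000)
print(cluster["capacity_rows"] // single["capacity_rows"])      # capacity multiplier
print(cluster["rows_per_second"] // single["rows_per_second"])  # speed multiplier
```

Both multipliers equal the number of DPUs in the cluster, which is three in the example above.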
To achieve the best performance from your DPU investment, Adobe recommends the high-performance components described in the following table:
| Component | Required | Recommended |
|---|---|---|
| Operating System | Microsoft Windows Server 2008 x64 | Microsoft Windows Server 2012 x64; Microsoft Windows Server 2016 x64 |
| CPU | See recommendations. | Latest-generation 4-core+ processors from Intel or AMD are recommended. For optimal performance, 8 cores; for a trade-off between speed and cost, 4 cores are recommended. |
| RAM | 8 GB | 12 GB |
| Working Data Storage | 1 TB+ of total logical temp storage. Low-latency access to the disk subsystem. | For temporary storage Adobe recommends either: These should be configured in a JBOD array. Alternatively, when gross disk capacity exceeds 2 TB, an array of 2-disk RAID 1 volumes can be used. For example, configure six disks as 3 × (2 × 750 GB RAID 1 pair). |
| System Data Storage | Additionally, Adobe requires high-availability storage of a modest size (20 GB) for the OS, DPU software, and other system software. | |
| Clustering Hardware | See recommendations. | Use a homogenous set of servers. In a DPU cluster, the slowest server reduces the performance of the whole dataset. |
| Clustering Network Performance | A switched-gigabit Ethernet connection or greater. | |
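The RAID 1 example in the Working Data Storage row can be worked through numerically. The following is a minimal sketch, assuming each RAID 1 pair mirrors two equal-sized disks (so a pair exposes the capacity of one disk) and that 1 TB is counted as 1000 GB. The function name is hypothetical.

```python
def raid1_pairs_layout(disk_count, disk_size_gb):
    """Return (pairs, gross_gb, usable_gb) for an array of 2-disk RAID 1 volumes.

    Assumption: each pair mirrors two equal disks, so its usable
    capacity equals the size of a single disk.
    """
    if disk_count < 2 or disk_count % 2 != 0:
        raise ValueError("disk_count must be an even number of at least 2")
    pairs = disk_count // 2
    gross_gb = disk_count * disk_size_gb
    usable_gb = pairs * disk_size_gb
    return pairs, gross_gb, usable_gb


# The table's example: six 750 GB disks as three mirrored pairs.
pairs, gross_gb, usable_gb = raid1_pairs_layout(6, 750)
print(pairs, gross_gb, usable_gb)

# Assumption: the 1 TB+ requirement is read as 1000 GB of usable temp storage.
print(usable_gb >= 1000)
```

With six 750 GB disks, the gross capacity is 4500 GB (above the 2 TB threshold at which the table suggests RAID 1 pairs) and the usable capacity is 2250 GB, which clears the 1 TB requirement.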
When considering alternative disk subsystems for temp storage, consider the following factors and guidelines:
Adobe cannot provide a warranty or representation concerning the speed at which data is processed by a configured Data Workbench, because a variety of factors impact the data processing speed, including but not limited to the following:
The server’s File Serving Unit (FSU) is the main data storage and management component of Data Workbench. The FSU acts as a file server for raw source data to the DPU, and, when appropriate, coordinates the clustering of DPUs. Each FSU is licensed to supply source data to up to five (5) DPUs.
| FSU Components | Recommendations |
|---|---|
| Operating System, CPU, RAM | These requirements are the same as those of the DPU. However, for the FSU, Adobe recommends using the minimum requirements rather than following the recommendations. |
| Disk System | The FSU requires highly available, redundant storage for large volumes of data. Adobe will work with you to determine your exact requirements. Adobe recommends: As the FSU holds the raw source data, any loss would be unrecoverable, and Adobe suggests backing up this data on a regular basis. |
| Network Performance | Adobe requires switched-gigabit Ethernet connections between FSUs and DPUs working together. |
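Because each FSU is licensed to supply source data to up to five DPUs, the minimum number of FSUs for a deployment follows from a ceiling division. The following is a small sketch of that calculation; the function and constant names are hypothetical, and actual sizing should be confirmed with Adobe.

```python
import math

# From the licensing statement above: one FSU serves at most five DPUs.
DPUS_PER_FSU = 5


def fsus_needed(dpu_count):
    """Return the minimum number of FSUs needed to serve dpu_count DPUs."""
    if dpu_count < 0:
        raise ValueError("dpu_count cannot be negative")
    return math.ceil(dpu_count / DPUS_PER_FSU)


for dpus in (3, 5, 6, 12):
    print(dpus, "DPUs ->", fsus_needed(dpus), "FSU(s)")
```

This covers only the licensing ratio. Storage capacity and network throughput may call for more FSUs than this minimum.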
Data Workbench Sensor collects event data from web, application, and data collection servers to be transmitted to any server. Sensor’s instrumentation ensures consistently accurate measurement of events that occur in your Internet channel. Sensor supports many combinations of Web server software and operating system.
The following table describes system recommendations for Sensor:
| Features | Recommended |
|---|---|
| Disk Storage | 512 MB minimum. |
| RAM | 32 MB of RAM must be available to Sensor on the HTTP or other server computer that is its host. |
| Network Performance | 1 Mbps or greater network connection to a repeater server or data workbench server. Sensor typically consumes far less bandwidth than one (1) Mbps. Your Adobe consultants will help you estimate the actual amount of bandwidth that would be required on a routine basis. |
| Network Ports and Firewalls | Sensor connects to the data workbench server using HTTPS (typically port 443, though this is configurable) or HTTP (typically port 80, though this is configurable). Before beginning the Sensor installation process, open the appropriate port on any firewall that resides between a Sensor and the target data workbench server or repeater server. Open it only between the respective Sensor hosting computer and the data workbench server or repeater server. Sensor makes a uni-directional HTTPS or HTTP connection to a data workbench server or repeater server. |
| Network Management Systems | Existing network management systems should monitor the health of the underlying computer hardware (for example, disk space, network service) and network connectivity as well as the Windows Event Log or UNIX syslog. |
| Server Time Synchronization | Ensure that the computer system time is continuously synchronized across every computer that hosts a Sensor. The web server applications and computers that are monitored by Sensor must have synchronized system times for the event data collected from them to be accurate. Please refer to your operating system's documentation for steps to synchronize system times on an ongoing basis with NTP or another time synchronization facility. |
| DNS Name Usage | Adobe recommends that Sensors use a DNS name (instead of an IP address) to resolve the network address of a data workbench server or repeater server. When a Sensor uses a DNS name, the host web server's DNS or local hosts file needs to be configured to resolve the name of the data workbench server or repeater server. |
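The port and DNS recommendations above can be verified from a Sensor host before installation. The following is a minimal sketch using only Python's standard `socket` module. It is not an Adobe tool, and the function name is hypothetical. It resolves the server name through the host's DNS or hosts file, then attempts an outbound TCP connection on the configured port.

```python
import socket


def check_server_reachable(hostname, port=443, timeout=5.0):
    """Resolve hostname and try an outbound TCP connection to the given port.

    Returns (ok, message). This only checks name resolution and that the
    port is reachable; it does not speak HTTP or HTTPS.
    """
    try:
        # Uses the system resolver, which honors the local hosts file and DNS.
        addresses = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, f"name resolution failed: {exc}"

    last_error = None
    for family, socktype, proto, _canonname, sockaddr in addresses:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return True, f"connected to {sockaddr[0]} on port {port}"
        except OSError as exc:
            last_error = exc
    return False, f"connection failed: {last_error}"
```

For example, calling `check_server_reachable("dwb-server.example.com", 443)` (a placeholder hostname) from the web server that will host Sensor returns `(True, ...)` if the name resolves and the firewall allows the connection, and `(False, ...)` with the reason otherwise.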
The following table lists the most common combinations that Sensor supports:
| Web Server Software | Operating System |
|---|---|
| Apache Server / IBM HTTP Server 2.2 | Microsoft Windows Server 2003 or later; Red Hat Enterprise Linux 6.x or later; Sun Solaris 8.x or later; IBM AIX 5.1x or later. |
| Apache Server 2.4 | Red Hat Enterprise Linux 6.x or later |
| Microsoft IIS | Microsoft Windows Server 2003 or later |
| Java Application Servers (Tomcat, JBoss, iPlanet, WebLogic) | Microsoft Windows Server 2003 or later; Red Hat Enterprise Linux 6.x or later; Sun Solaris 8.x or later; IBM AIX 5.1x or later. |
For other server and operating system combinations, please consult Adobe regarding availability. Not all features of Sensor are available with all combinations of web/application server and operating system. For more information about particular Sensor releases, please contact Adobe Support.
Data Workbench report server is the component that produces the output of scheduled reporting. Reports can be output as .PNG images or .XLS spreadsheets placed in a file system, or sent as emails. Its hardware requirements are identical to those of the Data Workbench Client.
The following requirements exist for report server:
Adobe recommends that existing network management systems monitor the hardware and network that the Data Workbench platform relies on.
In addition, Adobe recommends monitoring the Windows event logs of the FSUs and DPUs, which are written to when an error occurs.
Any networked storage system hosting log files needs to provide at least 10MB per DPU of sustained bandwidth.
It is a normal and required practice for a server DPU to process and re-process data into a new or refreshed dataset.
This may occur because of configuration changes, data source changes, hardware changes, inappropriate configuration, hardware failure, software failure, power failure, and so forth. When such processing or re-processing occurs, all dataset and system data is required to be immediately available to the DPU and FSU components. Failure to adhere to this requirement can lead to significant and unnecessary system down time.
Considerations to keep in mind when working with DPU and FSU networks.