In this blog post, we’ll provide an overview of five SAP tools that can be used for data integration.
We’ll start by introducing SAP Data Services, possibly the most comprehensive tool for data integration. Next, we’ll take a look at SAP Landscape Transformation Replication Server, the tool for integrating SAP systems with each other. We’ll then introduce you to the SAP HANA smart data integration solution. This tool is available by default in all SAP HANA systems. We’ll also cover SAP Cloud Integration for data services. Finally, we’ll describe SAP Replication Server, which is primarily used in system administration but still offers options for data integration.
In our view, SAP Data Services is one of the best-known SAP tools for data integration. In 2007, SAP acquired the company Business Objects and further developed its product BusinessObjects Data Integrator into the current solution, SAP Data Services. Put simply, SAP Data Services is a classic ETL product and thus also has a corresponding range of functions to offer. The figure below shows a rough functional overview of SAP Data Services.
With SAP Data Services, you can extract data from different sources (e.g., from SAP systems, databases, and cloud systems). For this purpose, different connectors are provided, similar to SAP Cloud Integration.
After the data has been extracted, it can be transformed. You can combine different datasets, perform structural transformations, or enrich the data.
One strength of SAP Data Services is its set of options for data profiling (i.e., analyzing data to identify data quality problems). In some cases, however, additional products (e.g., SAP Information Steward) are required to exploit your data’s full potential.
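To make the idea of data profiling more concrete, here is a minimal sketch in plain Python. It does not use any SAP Data Services API; the function `profile_column` and the sample records are hypothetical and only illustrate the kind of checks (missing values, distinct values, duplicates) a profiling step typically performs.

```python
from collections import Counter


def profile_column(rows, column):
    """Return simple quality metrics for one column of a list of dict records."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        # Number of surplus occurrences of values that appear more than once.
        "duplicates": sum(c - 1 for c in counts.values() if c > 1),
    }


# Hypothetical customer records with typical quality problems:
# an empty country and a repeated ID.
customers = [
    {"id": 1, "country": "DE"},
    {"id": 2, "country": ""},
    {"id": 3, "country": "DE"},
    {"id": 3, "country": "US"},
]

country_profile = profile_column(customers, "country")
id_profile = profile_column(customers, "id")
print(country_profile)
print(id_profile)
```

A duplicate count above zero for a column that should be a key, such as `id` here, is the kind of finding you would then address in the transformation step.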
Loading is the last step in the classic ETL process. The previously extracted and transformed data is now passed on to the target. In this case, too, SAP Data Services uses different connectors that allow you to send the data in different formats to the appropriate recipients.
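The three ETL steps described above can be sketched in a few lines of Python. This is only an illustration of the principle, not of how SAP Data Services is implemented: the CSV text stands in for a source connector, an in-memory SQLite database stands in for the target, and the conversion rates are made-up values.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from a connector.
SOURCE_CSV = """order_id,customer,amount,currency
1,ACME,100.50,EUR
2,Globex,200.00,USD
3,ACME,50.25,EUR
"""

# Made-up conversion rates, used only to show an enrichment step.
RATES_TO_EUR = {"EUR": 1.0, "USD": 0.5}


def extract(text):
    """Extract: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert types and enrich each record with an amount in EUR."""
    result = []
    for row in rows:
        amount_eur = float(row["amount"]) * RATES_TO_EUR[row["currency"]]
        result.append((int(row["order_id"]), row["customer"], amount_eur))
    return result


def load(conn, rows):
    """Load: pass the transformed records on to the target."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount_eur REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


target = sqlite3.connect(":memory:")
load(target, transform(extract(SOURCE_CSV)))
loaded = target.execute(
    "SELECT order_id, customer, amount_eur FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping the three steps as separate functions mirrors the way an ETL job separates source connectors, transformations, and target connectors.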
SAP Data Services is often used by BI or analytics teams to fill an SAP Business Warehouse (SAP BW) system. As part of batch processing, it extracts and prepares data from transactional systems (often SAP systems) and loads it into SAP BW.
In recent years, we’ve seen an increasing demand to use SAP Data Services as a tool for migration (e.g., to SAP S/4HANA). SAP itself recommends the use of SAP Data Services for complex migrations. With the help of SAP Information Steward, you can use the functions in SAP Data Services to analyze and improve your data quality.
SAP Landscape Transformation Replication Server is SAP’s ETL tool for the real-time replication of data between two systems. The source is either an SAP system or another database system. The target must always be an SAP HANA database. The figure below shows how SAP Landscape Transformation Replication Server works.
SAP Landscape Transformation Replication Server was designed from the outset to perform real-time replication. In contrast, SAP Data Services typically runs job-driven batch processing (even though SAP Data Services now also enables real-time processing, at least to a limited extent). To enable real-time replication, SAP Landscape Transformation Replication Server uses database triggers on the source system. These triggers fire with every change and write entries into log tables, which are then used to determine the delta data.
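The trigger and log table mechanism can be illustrated with a small, self-contained sketch. It uses Python’s built-in `sqlite3` module rather than an SAP system, and all table, trigger, and function names (`source_orders`, `change_log`, `replicate_delta`, and so on) are hypothetical. The sketch only shows the general technique; it makes no claim about how SAP Landscape Transformation Replication Server is implemented internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE target_orders (id INTEGER PRIMARY KEY, status TEXT);

-- Log table: one entry per change to the source table.
CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER, op TEXT);

-- Triggers fire on every change and record the affected key.
CREATE TRIGGER log_insert AFTER INSERT ON source_orders
BEGIN INSERT INTO change_log (id, op) VALUES (NEW.id, 'I'); END;
CREATE TRIGGER log_update AFTER UPDATE ON source_orders
BEGIN INSERT INTO change_log (id, op) VALUES (NEW.id, 'U'); END;
CREATE TRIGGER log_delete AFTER DELETE ON source_orders
BEGIN INSERT INTO change_log (id, op) VALUES (OLD.id, 'D'); END;
""")


def replicate_delta(conn):
    """Apply the logged changes to the target and clear the processed entries."""
    entries = conn.execute("SELECT seq, id, op FROM change_log ORDER BY seq").fetchall()
    for seq, row_id, op in entries:
        if op == "D":
            conn.execute("DELETE FROM target_orders WHERE id = ?", (row_id,))
        else:
            # Read the current state of the changed row from the source.
            row = conn.execute(
                "SELECT id, status FROM source_orders WHERE id = ?", (row_id,)
            ).fetchone()
            if row is not None:  # the row may have been deleted in the meantime
                conn.execute("INSERT OR REPLACE INTO target_orders VALUES (?, ?)", row)
        conn.execute("DELETE FROM change_log WHERE seq = ?", (seq,))
    conn.commit()
    return len(entries)


# Changes on the source are captured by the triggers ...
conn.execute("INSERT INTO source_orders VALUES (1, 'NEW')")
conn.execute("INSERT INTO source_orders VALUES (2, 'NEW')")
conn.execute("UPDATE source_orders SET status = 'SHIPPED' WHERE id = 1")
conn.execute("DELETE FROM source_orders WHERE id = 2")

# ... and only the logged delta is transferred to the target.
applied = replicate_delta(conn)
target_rows = conn.execute("SELECT id, status FROM target_orders ORDER BY id").fetchall()
print(applied, target_rows)
```

Because only the keys of changed rows are logged, the replication step never has to compare the full source and target tables to find the delta.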
In principle, SAP Landscape Transformation Replication Server can be installed on-premise on any ABAP system. A distinction can be made as to whether SAP Landscape Transformation Replication Server is installed on the source system or on a separate ABAP system.
SAP Landscape Transformation Replication Server also works according to the ETL principle. It differs from SAP Data Services in that it replicates data in real time. Furthermore, the options in the transformation step are more limited in SAP Landscape Transformation Replication Server than in SAP Data Services. For example, no additional data can be read from other sources during the transformation. In addition, SAP Landscape Transformation Replication Server lacks dedicated functions for analyzing data quality and cleansing data.
[Figure: SAP application, SAP LT Replication Server, and SAP HANA system]
For example, you would use SAP Landscape Transformation Replication Server if you want to integrate data into your SAP S/4HANA system in real time. A well-known use case for this approach is SAP S/4HANA Central Finance.
SAP HANA smart data integration is the standard ETL tool available in SAP HANA. In simple terms, SAP HANA smart data integration is a hybrid of SAP Data Services and SAP Landscape Transformation Replication Server, enabling both job-based batch processing and real-time integration of data. However, SAP HANA smart data integration is an integral part of the SAP HANA platform and therefore only available on these systems. An overview of the architecture of SAP HANA smart data integration is shown here.
SAP HANA smart data integration also works according to the ETL principle, but with the restriction that SAP HANA smart data integration can only be used unidirectionally. Data can only be integrated into an SAP HANA system, and no data can be exported.
Another special feature of SAP HANA smart data integration is the ability to integrate data into an SAP HANA system in two different ways. In data replication, the information is actually stored in the SAP HANA database. In contrast, data virtualization does not store information in the SAP HANA database. Instead, as soon as the information is needed by the application, it is read from the source in real time via SAP HANA smart data integration. SAP HANA smart data access (as a part of SAP HANA smart data integration) is used for this purpose.
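The difference between the two approaches can be sketched with two small Python classes. This is a conceptual illustration only: `Source`, `ReplicatedTable`, and `VirtualTable` are hypothetical names, not SAP HANA objects, and the replica is refreshed on demand here for simplicity rather than in real time.

```python
class Source:
    """Hypothetical remote source system that counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def read_all(self):
        self.reads += 1
        return dict(self.rows)


class ReplicatedTable:
    """Data replication: a physical copy of the source data is stored in the target."""

    def __init__(self, source):
        self._source = source
        self._copy = {}

    def refresh(self):
        self._copy = self._source.read_all()

    def select(self):
        # Served from the local copy; the source is not accessed.
        return dict(self._copy)


class VirtualTable:
    """Data virtualization: nothing is stored; every query goes to the source."""

    def __init__(self, source):
        self._source = source

    def select(self):
        return self._source.read_all()


source = Source({"M-100": 10})
replica = ReplicatedTable(source)
replica.refresh()
virtual = VirtualTable(source)

# The stock level changes in the source after the last refresh.
source.rows["M-100"] = 7

replica_view = replica.select()
virtual_view = virtual.select()
print(replica_view, virtual_view, source.reads)
```

The replica answers queries without touching the source but can lag behind it, while the virtual table always reflects the current source state at the cost of one source access per query.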
The data provisioning agent is used to extract data from source systems. The data provisioning agent is a flexible framework in which you can install different connectors for diverse sources. SAP already delivers a large number of connectors; additional connectors are provided by third-party vendors (for a fee).
Note: Functionality depends on the connector. The possible range of functions depends on the connector in question. For example, not all connectors support data virtualization with SAP HANA smart data access.
The SAP HANA platform also includes a dedicated component for analyzing and improving data quality: SAP HANA smart data quality gives you the ability to explore, cleanse, and enrich your data.
SAP Cloud Integration for data services is a cloud-based solution for data integration. SAP Cloud Integration for data services supports only a small number of SAP applications.
You may be familiar with SAP Cloud Integration for data services from the integration of SAP Integrated Business Planning (SAP IBP) for supply chain with other applications.
The following SAP applications are also supported by SAP Cloud Integration for data services:
Similar to SAP HANA smart data integration, SAP Cloud Integration for data services fundamentally consists of two components:
Access to these applications is defined via what are called data stores. SAP provides predefined data stores for the SAP cloud applications mentioned above. However, you can also create your own data stores. In addition, you can integrate data from applications using other communication types. You can create data stores with the following communication types:
You define the processing of data in SAP Cloud Integration for data services using so-called data flows. Within data flows, you can combine different work steps. Note that SAP Cloud Integration for data services is a lightweight system for data integration; therefore, the options for data flows are limited. Only basic functions for filtering, mapping, aggregation, and calling data stores are available.
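To give a feel for what such a data flow does, the following sketch chains a filter, a mapping, and an aggregation step in plain Python. It is not the SAP Cloud Integration for data services modeling environment; the step functions and the sample records, including the field names, are hypothetical.

```python
from collections import defaultdict


def filter_step(rows, predicate):
    """Keep only the records that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def mapping_step(rows, field_map):
    """Map source field names to target field names, dropping unmapped fields."""
    return [{target: row[source] for source, target in field_map.items()} for row in rows]


def aggregation_step(rows, key, value):
    """Sum the value field per key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Hypothetical records read from a source data store.
sales = [
    {"MATNR": "M-100", "WERKS": "1000", "MENGE": 5.0},
    {"MATNR": "M-100", "WERKS": "2000", "MENGE": 3.0},
    {"MATNR": "M-200", "WERKS": "1000", "MENGE": 0.0},
    {"MATNR": "M-200", "WERKS": "1000", "MENGE": 4.0},
]

# A data flow combines the individual work steps in sequence.
non_empty = filter_step(sales, lambda r: r["MENGE"] > 0)
mapped = mapping_step(non_empty, {"MATNR": "product", "MENGE": "quantity"})
demand = aggregation_step(mapped, "product", "quantity")
print(demand)
```

Each step takes a list of records and returns a new one, so steps can be combined in any order that makes sense for the target structure.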
SAP Replication Server is another SAP tool for data integration. However, this tool is classified differently than the four tools already discussed. SAP Replication Server (formerly known as Sybase Replication Server) was added to the SAP product portfolio in 2010 through the acquisition of Sybase and is still being developed today.
SAP Replication Server is mainly used by system administrators to implement high-availability system landscapes. The aim is to synchronize the complete data stack between two databases. This synchronization occurs in real time with SAP Replication Server. The synchronization makes it possible to switch to an inactive system if the active system fails. The permanent synchronization of the databases minimizes data loss during the changeover.
Editor’s note: This post has been adapted from a section of the book SAP Interface Management Guide by Adam Kiwon, Mark Lehmann, Manuel Männle, and Martin Tieves.