Learn SAP from the Experts | The SAP PRESS Blog

An Introduction to Data Tiers in SAP

Written by SAP PRESS | Apr 24, 2026 1:00:01 PM

Modern enterprises live and breathe data. Every purchase, click, sensor reading, or account update leaves a digital trace.

 

But raw transactions alone rarely provide the clarity leaders need to make decisions. The journey from a transaction in an operational system to a strategic insight in a decision support environment is both technical and conceptual. This post follows that journey step by step, tracing the flow of data through various data tiers such as transactional systems, operational data stores (ODSs), enterprise data warehouses (EDWs), and beyond, into the realm of analytics.

 

The figure below shows the tiers in the data lifecycle, from data creation through the analytics that create business value from enterprise data.

 

 

We’ll cover the data tiers in the data lifecycle from business applications to analytics. We also cover aspects of data quality, archiving, and security.

 

Transactional System

The data journey begins with transactional systems, also known as online transaction processing (OLTP) systems. These are the systems of record that capture the day-to-day operations of a business, such as point-of-sale (POS) terminals, ERP modules, CRM and HCM systems, banking applications, booking applications, and so on.

 

Data in these systems is highly normalized. For example, a POS database links product IDs to product catalogs, customers to loyalty profiles, and transactions to payment records. The design ensures speed and accuracy but makes it difficult to run broad, analytical queries without slowing down the checkout line.
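To make this concrete, here's a minimal Python sketch using invented product, customer, and transaction records. It shows how a normalized POS-style design stores each entity once and links records by ID, so even describing a single sale requires lookups (joins) across tables:

```python
# Minimal sketch of normalized point-of-sale data: each table holds one
# kind of entity, and transactions reference the others only by ID.
products = {101: {"name": "Running Shoe", "price": 89.99}}
customers = {7: {"name": "A. Jones", "loyalty_tier": "gold"}}
transactions = [
    {"txn_id": 1, "customer_id": 7, "product_id": 101, "qty": 2},
]

def describe(txn):
    """Resolving a single transaction already needs two lookups (joins)."""
    p = products[txn["product_id"]]
    c = customers[txn["customer_id"]]
    return f'{c["name"]} bought {txn["qty"]} x {p["name"]}'

print(describe(transactions[0]))  # → A. Jones bought 2 x Running Shoe
```

This keeps writes fast and consistent, but an analytical query touching millions of transactions would repeat those joins at enormous scale.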

 

While these systems are ideal for running the business, they aren’t well-suited for running analytics. Queries that join dozens of normalized tables or aggregate millions of rows would compete with operational workloads, slowing down mission-critical transactions. Hence, data needs to move elsewhere for deeper analysis.

 

Operational Data Store

The operational data store (ODS) is a bridge between OLTP systems and online analytical processing (OLAP) systems. The ODS data tier brings system of record data from multiple OLTP systems into one place. This enables operational reporting using transactional data from multiple systems within the enterprise.

 

The ODS plays a key role in consolidating data from multiple transaction systems into a common, integrated repository. This is where data is maintained in near real time or frequent batches, providing a current but not historical view. Data structures in the ODS remain closer to normalized forms in the transactional systems to retain detail and consistency.

 

The ODS is particularly valuable for operational reporting because it allows you to build dashboards that require a unified view of the most recent activity. For example, a logistics company can track packages across multiple tracking systems in near real time without burdening the transaction databases themselves. A dashboard can show the top-selling items by region this afternoon or alert users when stock for a high-demand product is running low across multiple stores.
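As a rough illustration of the ODS idea, the following Python sketch consolidates two hypothetical transaction feeds into one in-memory store and runs an operational report (top-selling item in a region) against the consolidated copy rather than the source systems. All names and numbers are invented:

```python
from collections import Counter

# Two hypothetical feeds from separate transaction systems, landed in an
# ODS-style consolidated store in near real time.
feed_a = [("shoes", "east"), ("hats", "east"), ("shoes", "east")]
feed_b = [("shoes", "west"), ("coats", "west")]

ods = feed_a + feed_b  # consolidation: one integrated, current view

# Operational report: top-selling item in the east region right now.
counts = Counter(item for item, region in ods if region == "east")
top_item, top_qty = counts.most_common(1)[0]
print(top_item, top_qty)  # → shoes 2
```

The report reads only the consolidated copy, so the source OLTP systems never see the analytical workload.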

 

Still, the ODS isn’t designed for long-term storage, historical analysis, or complex data mining. For that, enterprises turn to the next data tier.

 

Enterprise Data Warehouse

An enterprise data warehouse (EDW) is the enterprise’s analytical backbone. It’s a subject-oriented, integrated, time-variant, and nonvolatile repository of data that is designed to answer questions rather than process transactions.

 

EDWs are subject-oriented, organized around business domains such as sales, inventory, and finance. These domains integrate with systems across the enterprise and bring data from those systems into the EDW, where it is unified, harmonized, and transformed using shared dimensions and standard definitions. Data in an EDW is time variant and nonvolatile: it’s stored across time in snapshots, enabling historical trend analysis, and isn’t overwritten, which preserves history.

 

Data is denormalized within the warehouse and organized into star schemas. A star schema centers on a fact table (e.g., sales transactions) that tracks quantities and revenues. The fact table is linked to dimension tables (e.g., customer, product, store, geographic location, time). This structure sacrifices storage efficiency for query performance and analytical clarity, making it easy to run queries such as the following: “What were shoe sales by region last November, broken down by customer age group?”
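A star schema can be sketched with an in-memory SQLite database. The table and column names below are illustrative, not taken from any SAP product; the point is that an analytical question resolves to one fact-table scan plus small dimension joins:

```python
import sqlite3

# Star schema sketch: one fact table keyed to small dimension tables.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE fact_sales  (product_id INTEGER, store_id INTEGER,
                          qty INTEGER, revenue REAL);
INSERT INTO dim_product VALUES (1, 'shoes'), (2, 'hats');
INSERT INTO dim_store   VALUES (10, 'midwest'), (20, 'south');
INSERT INTO fact_sales  VALUES (1, 10, 2, 180.0), (1, 20, 1, 90.0),
                               (2, 10, 3, 60.0);
""")

# "Shoe revenue by region": scan the fact table, join two dimensions.
rows = db.execute("""
    SELECT s.region, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_store   s ON s.store_id   = f.store_id
    WHERE p.category = 'shoes'
    GROUP BY s.region ORDER BY s.region
""").fetchall()
print(rows)  # → [('midwest', 180.0), ('south', 90.0)]
```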

 

Retailers, for example, can analyze seasonal sales patterns by running queries such as these: “Which products spike before holidays?” and “Which discounts actually drive margin growth?” These questions require history, integration, and denormalization, which are key characteristics of an EDW.

 

Depending on the scope of the data analytics, many EDW designs use a snowflake schema instead. A snowflake schema is a type of dimensional data modeling used in EDWs in which dimension tables are normalized into multiple related tables, forming a structure that resembles a snowflake. For example, a product dimension table is normalized into Brand and Category tables, which are joined with the Product table and accessed only when you need to report on the product’s Brand and/or Category.
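Here’s a small, hypothetical Python sketch of the snowflake idea: the product dimension references separate brand and category lookup tables, and the extra join happens only when a report needs those attributes:

```python
# Snowflake sketch: the product dimension is normalized further into
# brand and category lookup tables (all names are illustrative).
brands = {1: "Acme"}
categories = {5: "Footwear"}
dim_product = {101: {"name": "Running Shoe", "brand_id": 1, "category_id": 5}}

def product_with_brand(product_id):
    """The extra join to `brands` happens only when a report needs it."""
    p = dim_product[product_id]
    return p["name"], brands[p["brand_id"]]

print(product_with_brand(101))  # → ('Running Shoe', 'Acme')
```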

 

Enterprise Analytics

Analytics systems sit on top of the data warehouse, translating structured data into actionable insights. Performing analytics involves examining datasets to find patterns and draw conclusions about the information they contain. Four analytics types are used to derive insights and inform decision-making: descriptive, diagnostic, predictive, and prescriptive analytics. Here’s a brief overview of each.

Descriptive Analytics

Descriptive analytics reports on what happened by summarizing historical data. Methods include data aggregation, data mining, reporting, and statistical analysis that provide insights into past performance. Dashboards and reports are typical tools used to support descriptive analytics. For example, you can ask the following: “What was the revenue by region and product in the last quarter?”

Diagnostic Analytics

Diagnostic analytics is about exploring and explaining why something happened by identifying patterns and correlations in the data. It requires deeper analysis, often using techniques such as data mining and correlation analysis to identify patterns or anomalies. Diagnostic analytics helps in understanding the causes behind trends observed in descriptive analytics. For example, you can query the following: “What caused shoe sales to fall in the Midwest?”

Predictive Analytics

Predictive analytics is about forecasting what could happen. It uses historical data and statistical algorithms to forecast future outcomes, and machine learning techniques identify trends in past data to predict future ones. Predictive analytics is commonly used in risk assessment, sales forecasting, and customer behavior analysis. For example, you can ask the following: “Which products will likely surge before Black Friday sales, based on trends and external data such as weather or local events?”

Prescriptive Analytics

Prescriptive analytics is about recommending what to do based on predictive analytics. It uses optimization and simulation algorithms to suggest the best course of action. Prescriptive analytics helps organizations make informed decisions by evaluating potential outcomes of different strategies. For example, you can ask the following: “Which stores should increase orders of winter coats in October to maximize sales without overstocking?”

 

Organizations need a combination of these analytics types to enhance their decision-making processes. Each type builds on the previous one, creating a comprehensive approach to data analysis. Without the layers of consolidation, integration, and denormalization, such systems would be impossible to operate efficiently at scale.

 

Data Quality

Data is something you encounter every day, even if you don’t realize it. Think of it like this: every app, website, and digital service you use is built on data. Just like a building needs a strong foundation to stand tall, these digital services need high-quality data to work correctly and provide value. If the data is bad, everything that’s built on it will be unreliable.

 

Data quality isn’t just one thing; it’s a combination of several key characteristics. You can think of these as the pillars of data quality. If any of the following pillars are weak, the whole structure of your data can be compromised.

Accuracy

Accuracy means the data is correct and reflects the real world. Imagine you’re using a GPS app to find a friend’s house. If the app shows their address as 123 Main Street when it’s actually 321 Main Street, that’s inaccurate data. It’s a simple mistake, but it leads to a big problem: you end up at the wrong place.

 

Accuracy is crucial in the business world. A bank needs accurate data on customer accounts to make sure people are charged the right amount and money goes to the right place. A hospital needs accurate patient records to ensure patients receive the correct medication and treatment.

Completeness

Completeness means the data is whole and nothing is missing. Think about filling out an online form. If you’re required to provide your name, email, and phone number, but you leave the phone number field blank, that data is incomplete.

 

Incomplete data can lead to missed opportunities and mistakes for a company. If a marketing team wants to send a promotional text message to customers but their phone number data is incomplete, they can’t reach everyone. Or, if a social media app has incomplete user profiles, it can’t suggest new friends or content as effectively.

Conformity

Conformity means the data conforms to a specific format or set of rules. For example, a valid date should have a month, day, and year. A valid email address must have an @ symbol and a domain name. If you’re asked to enter your age on a form, and you type “twenty” instead of “20”, the data is invalid because it’s not in the expected format (a number).

 

Businesses use validation rules to keep their data clean. If a form is programmed to only accept numbers for a US-based postal code, it prevents a user from accidentally entering nonnumeric characters, which would make the data invalid.
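Validation rules like these can be sketched in a few lines of Python. The rules below (age, US postal code, email) are simplified illustrations, not complete validators:

```python
import re

# Simple conformity rules: each field must match an expected format
# before it is accepted. Patterns are deliberately simplified.
RULES = {
    "age":   re.compile(r"\d{1,3}"),            # a number, not "twenty"
    "zip":   re.compile(r"\d{5}"),              # 5-digit US postal code
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}

def conforms(field, value):
    """True if the whole value matches the field's format rule."""
    return RULES[field].fullmatch(value) is not None

print(conforms("age", "20"), conforms("age", "twenty"))  # → True False
```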

Consistency

Consistency means the data is the same everywhere it appears. Let’s say your name is Alex and you have a favorite online store. In one part of the store’s database, your name is spelled “Alex”, but in another part, it’s “Alec”. That’s a data consistency issue.

 

Inconsistency is a huge problem for a business because it can lead to confusion and a poor customer experience. For example, if your address is saved differently in an online store’s billing and shipping systems, you might get charged for a purchase, but the item never arrives because it was sent to the wrong address.

Uniqueness

Uniqueness means that each real-world entity (e.g., organization, supplier, product) is represented by exactly one record, with no duplicates based on the dataset’s identifying attributes or the record key. This helps to prevent double-counting and inconsistent reporting.

 

If the same organization exists in an SAP S/4HANA system under two business partner IDs, you may issue duplicate invoices, have an incomplete view of the organization’s history, and misstate revenue, degrading both the business partner’s experience and the accuracy of your analytics.
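A simple way to surface duplicate candidates is to compare records on a normalized matching key. The following Python sketch uses an invented key (lowercased name plus postal code); real duplicate detection in master data management is considerably more sophisticated:

```python
# Detect duplicate business-partner candidates via a normalized
# matching key. All IDs, names, and postal codes are invented.
records = [
    {"bp_id": "BP100", "name": "Acme Corp",  "zip": "60601"},
    {"bp_id": "BP245", "name": "ACME CORP.", "zip": "60601"},
    {"bp_id": "BP300", "name": "Globex",     "zip": "94105"},
]

def match_key(rec):
    """Normalize name (case, trailing dots) and pair with postal code."""
    return (rec["name"].lower().rstrip("."), rec["zip"])

seen, duplicates = {}, []
for rec in records:
    key = match_key(rec)
    if key in seen:
        duplicates.append((seen[key], rec["bp_id"]))
    else:
        seen[key] = rec["bp_id"]

print(duplicates)  # → [('BP100', 'BP245')]
```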

Integrity

Integrity is the degree to which data values adhere to defined business rules, relationships, and valid value sets, ensuring keys, references, and hierarchies are correct and consistent across tables and systems. It prevents orphan records and rule violations that break transactions and distort analytics.

 

For example, let’s look at a business rule where the organization’s total credit exposure (open orders, deliveries, and receivables) must not exceed the approved credit limit. If the organization’s credit block is reset by a periodic batch job, then it’s possible for the organization’s current exposure and credit block status to be out of sync. This would be classified as a data integrity error.
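The credit-exposure rule described above can be expressed as a small consistency check. The following Python sketch uses invented field names and amounts; it flags records where the exposure and the credit block status disagree:

```python
# Integrity check for the credit-exposure rule: total exposure
# (open orders + deliveries + receivables) must not exceed the limit,
# and the credit block flag must agree. Field names are invented.
def integrity_errors(partner):
    exposure = (partner["open_orders"] + partner["deliveries"]
                + partner["receivables"])
    errors = []
    if exposure > partner["credit_limit"] and not partner["credit_block"]:
        errors.append("over limit but not blocked")
    if exposure <= partner["credit_limit"] and partner["credit_block"]:
        errors.append("blocked although within limit")
    return errors

# Exposure is 110,000 against a 100,000 limit, but the block flag was
# reset by a batch job: the record is internally inconsistent.
p = {"open_orders": 40_000, "deliveries": 20_000, "receivables": 50_000,
     "credit_limit": 100_000, "credit_block": False}
print(integrity_errors(p))  # → ['over limit but not blocked']
```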

Timeliness

Timeliness means the data is up-to-date and available when you need it. Imagine you’re checking the score of a live sports game online. If the score you see is from an hour ago, that data isn’t timely. You’re looking for real-time information, and outdated data is useless to you.

 

Timely data is critical for things that depend on the present moment, such as stock market trading, weather forecasts, or ride-sharing apps that show where your driver is right now. Without timely data, a company might make a decision based on old information that’s no longer relevant, leading to significant financial losses or operational issues.

 

The consequences of poor data quality can be severe. If you put bad data in, you’re going to get bad results out, which can lead to bad decisions. Companies often use data to decide what products to sell, where to open new stores, or how to market to customers. If the data they’re using is inaccurate or incomplete, they might make a costly mistake, such as building a store in the wrong location or promoting the wrong products. When data is messy and unreliable, people have to spend time and resources cleaning it up. This takes away from the time that could be spent on more productive tasks.

 

If a company has inconsistent or inaccurate customer data, it might not be able to resolve customer issues quickly. This can frustrate customers and damage the company’s reputation. In some industries, such as healthcare and finance, having high-quality data isn’t just important—it’s a legal requirement. Inaccurate patient data can lead to medical errors, and inconsistent financial data can result in legal penalties and fines.

 

Ultimately, data quality is the backbone of the digital world. The apps, games, and services you rely on are only as good as the data they’re built on. By understanding and valuing data quality, you can better appreciate the complex digital systems that power our lives and recognize the importance of keeping that data clean, consistent, and correct.

 

Data Archiving

As we’ve seen, data flows from transactional systems to analytics, and the most current and actively accessed data resides in the transactional system. Data value declines over time; data in the EDW is accessed less frequently than data in the ODS, for example.

 

Let’s use the temperature analogy to categorize data based on operational usefulness, performance requirements, frequency of access, and security requirements. Hot data is frequently accessed, has high business value, and requires high query performance. Warm data is less frequently accessed, has less business value, and has reasonable query performance expectations. Cold data is rarely accessed, has low value, and has low query performance expectations.

 

The data archiving process stores cold data for long-term retention. Archiving is essential for managing large volumes of data, remaining compliant, and keeping data for future use. Generally speaking, data archiving is about correctly sizing the current production data, reducing its size in the system, and still keeping the data available for audit, legal, and compliance purposes if and when needed.
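The temperature tiers can be sketched as a simple classification by last-access age. The 30-day and 365-day thresholds below are illustrative policy choices, not SAP defaults:

```python
from datetime import date

# Classify records into hot/warm/cold tiers by last-access age.
# Thresholds are illustrative policy choices.
def tier(last_access, today):
    age = (today - last_access).days
    if age <= 30:
        return "hot"      # frequent access, high query performance
    if age <= 365:
        return "warm"     # occasional access
    return "cold"         # archive candidate

today = date(2026, 4, 24)
print(tier(date(2026, 4, 1), today),   # recently touched
      tier(date(2025, 6, 1), today),   # months old
      tier(date(2020, 1, 1), today))   # → hot warm cold
```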

 

When you extract data from the productive system, you need to consider why you’re doing so: What is the purpose, and how will the data be used in the future? You may consider extracting data from a productive system for several reasons:

  • To make the data available for enterprise analytics
  • To decrease the data volume because it’s negatively impacting the productive system’s performance
  • To free up space in the production system
  • To prepare the system for decommissioning

The intended future use of data determines where it resides and when to archive it. If the data isn’t needed for business transactions and analytics but is required for auditing and regulatory compliance purposes, then that data needs to be archived.

 

Archiving helps manage the productive system’s data volume. Extracting data from the source system frees up the space needed to keep and manage current data, right-sizes the database, and reduces the associated storage and management costs. It’s also useful when a legacy system needs to be decommissioned. In a merger or acquisition scenario, data from one system that can’t be migrated must be carved out; because you may still want to examine it in the future, you take a snapshot before shutting down the old system.

 

Archiving also helps with keeping data that isn’t required for business operations but may be needed for audit, legal, and regulatory compliance purposes. Archiving data ensures that it’s accessible when needed.

 

SAP offers a few archiving, storage, automation, and lifecycle management solutions.

SAP Information Lifecycle Management

The SAP Information Lifecycle Management (SAP ILM) solution is used to manage data aging, archiving, and data retention, and it ensures efficient data volume management. SAP ILM enables you to define retention policy rules and automate the data archiving process. Administrators can place holds on retained information until dispositioning (deletion according to policy). It supports retention policy–driven archiving and data dispositioning for audit, legal, and regulatory compliance readiness while adhering to country-specific legal regulations.

SAP Archiving and Document Access Core by OpenText

SAP Archiving and Document Access Core by OpenText provides a managed repository that is secure, long-term, and device-independent. It makes archived data and document attachments accessible from within SAP business applications.

Data ASSIST by Auritas

Data ASSIST by Auritas performs an accurate analysis of archive objects in the database. It supports archiving automation, including sizing, scheduling, and ongoing access. Data ASSIST is used with SAP ILM and OpenText or SAP BTP services for archiving and storage.

 

Security

Businesses run on SAP S/4HANA systems that consolidate all critical business data into a single system, including financial records, customer data, intellectual property, and proprietary business processes. This data is the business’s most sensitive and valuable asset. Unauthorized employee access, whether intentional or accidental, can lead to data theft, fraud, or sabotage. Therefore, the SAP S/4HANA system must be secured to protect that asset. The security mechanism that controls who has access to the system, what data they can see, and what they can do with it rests on two key pillars.

Authentication

This is the process of verifying a user’s identity. Methods include a password, a biometric scan, or a token from multifactor authentication (MFA). A secure ERP system uses robust authentication to ensure that only verified individuals can gain initial entry into the system. Combined with audit logging, authentication establishes who has accessed or changed data, creating a clear and detailed audit trail that simplifies regulatory reporting and helps prove compliance during audits.

Authorization

This determines what an authenticated user is permitted to do and see once they are in the system. It involves granting or denying access based on the user’s role and permissions. A strong authorization policy, such as role-based access control (RBAC), limits employee access to only the data and functions necessary for their job, following the principle of “least privilege.” For example, a finance clerk can create invoices but can’t approve them, while a manager can.
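Role-based access control with least privilege can be sketched as a mapping from roles to permitted actions. The role and action names below are invented for illustration; real SAP authorization concepts are far richer than this simplified model:

```python
# RBAC sketch: each role grants only the actions that job requires
# (least privilege). Role and action names are illustrative.
ROLE_PERMISSIONS = {
    "finance_clerk":   {"invoice:create", "invoice:view"},
    "finance_manager": {"invoice:create", "invoice:view",
                        "invoice:approve"},
}

def authorized(role, action):
    """True only if the role explicitly grants the action."""
    return action in ROLE_PERMISSIONS.get(role, set())

# The clerk can create invoices but not approve them; the manager can.
print(authorized("finance_clerk", "invoice:approve"),
      authorized("finance_manager", "invoice:approve"))  # → False True
```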

 

Security breaches can lead to fraud, data theft, and operational disruptions, which can cause significant financial losses. Strong controls reduce these risks and protect a company’s bottom line. By restricting who can modify certain data, authorization ensures the accuracy and reliability of information. For example, implementing segregation of duties (SoD) prevents a single person from initiating and approving a fraudulent transaction.

 

Conclusion

From transactional systems to enterprise analytics, data passes through multiple tiers of consolidation, transformation, and refinement before it can meaningfully inform business decisions. Each layer—the ODS, EDW, and analytics systems built on top—adds structure and context, while data quality, archiving, and security practices ensure that the insights produced are trustworthy, compliant, and protected. Organizations that understand and invest in this full data lifecycle are best positioned to turn everyday transactions into lasting strategic value.

 

Editor’s note: This post has been adapted from a section of the book Data Management and Analytics with SAP BTP by Dhirendra Gehlot, Jeff Gericke, Shibajee Dutta Gupta, Antony Isacc, Homiar Kalwachwala, Rick Markham, Asim Munshi, and Chris Sam. The authors of this book bring together decades of collective expertise across SAP data management, analytics, enterprise architecture, and information management. Their experience spans data integration, master data management, business intelligence, data migration, and AI-driven transformation, with each contributor having guided organizations through complex modernization efforts. United by a shared passion for turning data into actionable insight, they draw on more than 100 combined years of experience to offer practical, real-world perspectives on building trusted, scalable data foundations.

 

This post was originally published 4/2026.