In this article we will discuss about the architecture overview of the
In-Memory Computing Engine of SAP HANA. The SAP HANA database is
developed in C++ and runs on SUSE Linux Enterpise Server. SAP HANA
database consists of multiple servers and the most important component
is the Index Server. SAP HANA database consists of Index Server, Name Server, Statistics Server, Preprocessor Server and XS Engine.
- Index Server contains the actual data and the engines for processing the data. It also coordinates and uses all the other servers.
- Name Server holds information about the SAP HANA databse topology. This is used in a distributed system with instances of HANA database on different hosts. The name server knows where the components are running and which data is located on which server.
- Statistics Server collects information about Status, Performance and Resource Consumption from all the other server components. From the SAP HANA Studio we can access the Statistics Server to get status of various alert monitors.
- Preprocessor Server is used for Analysing Text Data and extracting the information on which the text search capabilities are based .
- XS Engine is an optional component. Using XS Engine clients can connect to SAP HANA database to fetch data via HTTP.
SAP HANA Index Server Architecture:
- Connection and Session Management component is responsible for creating and managing sessions and connections for the database clients. Once a session is established, clients can communicate with the SAP HANA database using SQL statements. For each session a set of parameters are maintained like, auto-commit, current transaction isolation level etc. Users are Authenticated either by the SAP HANA database itself (login with user and password) or authentication can be delegated to an external authentication providers such as an LDAP directory.
- The client requests are analyzed and executed by the set of components summarized as Request Processing And Execution Control.
The Request Parser analyses the client request and dispatches it to the
responsible component. The Execution Layer acts as the controller that
invokes the different engines and routes intermediate results to the
next execution step.
For example, Transaction Control statements are forwarded to the
Transaction Manager. Data Definition statements are dispatched to the
Metadata Manager and Object invocations are forwarded to Object Store.
Data Manipulation statements are forwarded to the Optimizer which
creates an Optimized Execution Plan that is subsequently forwarded to
the execution layer.
- The SQL Parser checks the syntax and semantics of the client SQL statements and generates the Logical Execution Plan. Standard SQL statements are processed directly by DB engine.
- The SAP HANA database has its own scripting language named SQLScript that is designed to enable optimizations and parallelization. SQLScript is a collection of extensions to SQL. SQLScript is based on side effect free functions that operate on tables using SQL queries for set processing. The motivation for SQLScript is to offload data-intensive application logic into the database.
- Multidimensional Expressions (MDX) is a language for querying and manipulating the multidimensional data stored in OLAP cubes.
- The SAP HANA database also contains a component called the Planning Engine that allows financial planning applications to execute basic planning operations in the database layer. One such basic operation is to create a new version of a dataset as a copy of an existing one while applying filters and transformations. For example: Planning data for a new year is created as a copy of the data from the previous year. This requires filtering by year and updating the time dimension. Another example for a planning operation is the disaggregation operation that distributes target values from higher to lower aggregation levels based on a distribution function.
- The SAP HANA database also has built-in support for domain-specific models (such as for financial planning) and it offers scripting capabilities that allow application-specific calculations to run inside the database.
- In HANA database, each SQL statement is processed in the context of a transaction. New sessions are implicitly assigned to a new transaction. The Transaction Manager coordinates database transactions, controls transactional isolation and keeps track of running and closed transactions. When a transaction is committed or rolled back, the transaction manager informs the involved engines about this event so they can execute necessary actions. The transaction manager also cooperates with the persistence layer to achieve atomic and durable transactions.
- Metadata can be accessed via the Metadata Manager. The SAP HANA database metadata comprises of a variety of objects, such as definitions of relational tables, columns, views, and indexes, definitions of SQLScript functions and object store metadata. Metadata of all these types is stored in one common catalog for all SAP HANA database stores (in-memory row store, in-memory column store, object store, disk-based). Metadata is stored in tables in row store. The SAP HANA database features such as transaction support, multi-version concurrency control, are also used for metadata management. In distributed database systems central metadata is shared across servers. How metadata is actually stored and shared is hidden from the components that use the metadata manager.
- The Authorization Manager is invoked by other SAP HANA
database components to check whether the user has the required
privileges to execute the requested operations. SAP HANA allows granting
of privileges to users or roles. A privilege grants the right to
perform a specified operation (such as create, update, select, execute,
and so on) on a specified object (for example a table, view, SQLScript
function, and so on).
The SAP HANA database supports Analytic Privileges that represent
filters or hierarchy drilldown limitations for analytic queries.
Analytic privileges grant access to values with a certain combination of
dimension attributes. This is used to restrict access to a cube with
some values of the dimensional attributes.
- Database Optimizer gets the Logical Execution Plan from the SQL Parser or the Calc Engine as input and generates the optimised Physical Execution Plan based on the database Statistics. The database optimizer which will determine the best plan for accessing row or column stores.
- Database Executor basically executes the Physical Execution Plan to access the row and column stores and also process all the intermediate results.
- The Row Store is the SAP HANA database row-based
in-memory relational data engine. Optimized for high performance of
write operation, Interfaced from calculation / execution layer.
Optimised Write and Read operation is possible due to Storage separation
i.e. Transactional Version Memory & Persisted Segment.
- Transactional Version Memory contains temporary versions i.e. Recent versions of changed records. This is required for Multi-Version Concurrency Control (MVCC). Write Operations mainly go into Transactional Version Memory. INSERT statement also writes to the Persisted Segment.
- Persisted Segment contains data that may be seen by any ongoing active transactions. Data that has been committed before any active transaction was started.
- Version Memory Consoliation moves the recent version of changed records from Transaction Version Memory to Persisted Segment based on Commit ID. It also clears outdated record versions from Transactional Version Memory. It can be considered as garbage collector for MVCC.
- Segments contain the actual data (content of row-store tables) in pages. Row store tables are linked list of memory pages. Pages are grouped in segments. Typical Page size is 16 KB.
- Page Manager is responsible for Memory allocation. It also keeps track of free/used pages.
- The Column Store is the SAP HANA database column-based in-memory relational data engine. Parts of it originate from TREX
(Text Retrieval and Extraction) i.e SAP NetWeaver Search and
Classification. For the SAP HANA database this proven technology was
further developed into a full relational column-based data store.
Efficient data compression and optimized for high performance of read
operation, Interfaced from calculation / execution layer. Optimised Read
and Write operation is possible due to Storage separation i.e. Main
& Delta.
- Main Storage contains the compressed data in memory for fast read.
- Delta Storage is meant for fast write operation. The update is performed by inserting a new entry into the delta storage.
- Delta Merge is an asynchronous process to move changes in delta storage into the compressed and read optimized main storage. Even during the merge operation the columnar table will be still available for read and write operations. To fulfil this requirement, a second delta and main storage are used internally.
- During Read Operation data is always read from both main & delta storages and result set is merged. Engine uses multi version concurrency control (MVCC) to ensure consistent read operations.
- As row tables and columnar tables can be combined in one SQL statement, the corresponding engines must be able to consume intermediate results created by each other. A main difference between the two engines is the way they process data: Row store operators process data in a row-at-a-time fashion using iterators. Column store operations require that the entire column is available in contiguous memory locations. To exchange intermediate results, row store can provide results to column store materialized as complete rows in memory while column store can expose results using the iterator interface needed by row store.
- The Persistence Layer is responsible for durability and atomicity of transactions. It ensures that the database is restored to the most recent committed state after a restart and that transactions are either completely executed or completely undone. To achieve this goal in an efficient way the per-sistence layer uses a combination of write-ahead logs, shadow paging and savepoints. The persistence layer offers interfaces for writing and reading data. It also contains SAP HANA 's logger that manages the transaction log. Log entries can be written implicitly by the persistence layer when data is written via the persistence interface or explicitly by using a log interface.
Distributed System and High Availability
The SAP HANA Appliance software supports High Availability. SAP HANA scales systems beyond one server and can remove the possibility of single point of failure. So a typical Distributed Scale out Cluster Landscape will have many server instances in a cluster. Therefore Large tables can also be distributed across multiple servers. Again Queries can also be executed across servers. SAP HANA Distributed System also ensures transaction safety. Features- N Active Servers or Worker hosts in the cluster.
- M Standby Server(s) in the cluster.
- Shared file system for all Servers. Serveral instances of SAP HANA share the same metadata.
- Each Server hosts an Index Server & Name Server.
- Only one Active Server hosts the Statistics Server.
- During startup one server gets elected as Active Master.
- The Active Master assigns a volume to each starting Index Server or no volume in case of cold Standby Servers.
- Upto 3 Master Name Servers can be defined or configured.
- Maximum of 16 nodes is supported in High Availability configurations.
Name Server Configured Role | Name Server Actual Role | Index Server Configured Role | Index Server Actual Role |
Master 1 | Master | Worker | Master |
Master 2 | Slave | Worker | Slave |
Master 3 | Slave | Worker | Slave |
Slave | Slave | Standby | Standby |
Failover
- High Availability enables the failover of a node within one distributed SAP HANA appliance. Failover uses a cold Standby node and gets triggered automatically. So when a Active Server X fails, Standby Server N+1 reads indexes from the shared storage and connects to logical connection of failed server X.
- If the SAP HANA system detects a failover situation, the work of the services on the failed server is reassigned to the services running on the standby host. The failed volume and all the included tables are reassigned and loaded into memory in accordance with the failover strategy defined for the system. This reassignment can be performed without moving any data, because all the persistency of the servers is stored on a shared disk. Data and logs are stored on shared storage, where every server has access to the same disks.
- The Master Name Server detects an Index Server failure and executes the failover. During the failover the Master Name Server assigns the volume of the failed Index Server to the cold Standby Server. In case of a Master Name Server failure, another of the remaining Name Servers will become Active Master.
- Before a failover is performed, the system waits for a few seconds to determine whether the service can be restarted. Standby node can take over the role of a failing master or failing slave node