What is the procedure of implementation of statement routing?
The following picture shows how statement routing is implemented. During preparation table information is collected and sent back to the client where it is cached. The client library then chooses an optimal host (containing the requested table T1) and executes the database request on that host:
Looking for the advantages of statement routing?
Statement routing minimizes the network traffic between nodes of a scale-out solution, because database requests are executed on nodes holding (parts of the) accessed tables.
Related: Top 15 SAP HANA Interview Questions and Answers for Freshers
How can we know that which limitations exist for statement routing?
The following limitations exist for statement routing:
Limitation | Details | |||||||||||||||
Limited to prepared statements | Table locations are determined during preparation, so statement routing only works for prepared SQL statements. If statements are not prepared, you can't take advantage of statement routing. Typical problem scenarios are:
| |||||||||||||||
Strategy when tables are located on several nodes | If accessed tables or partitions are located on several different nodes, SAP HANA traditionally (up to SPS 08) follows a round-robin strategy regardless of the actual table sizes. So it can happen that a high amount of data has to be sent between the SAP HANA nodes if only a rather small table is located on the selected node. Starting with SPS 09 the strategy is improved step by step, so that typically nodes with the largest tables are selected. | |||||||||||||||
SAP HANA bugs | The following SAP HANA bugs related to statement routing and client distribution mode exist with specific release levels:
|
Is there any other approaches to distribute load to the SAP HANA nodes?
Statement routing is the most popular implementation of the SAP HANA client distribution mode, but also other algorithms exist. The global client distribution mode is defined with the following parameter:
indexserver.ini -> [distribution] -> client_distribution_mode
Possible values are:
Client distribution mode | Details |
off | No client distribution |
connection | Round-robin execution of database requests |
statement | Statement routing based on table locations (default) |
all | Concurrent activation of all distribution features |
Can connection load balancing be configured individually on client side?
Connection load balancing can be configured based on the client connection property DISTRIBUTION (see SAP 2079372). The default values are marked bold.
DISTRIBUTION (client SAP HANA SPS <= 10) | DISTRIBUTION (client SAP HANA SPS >= 11) | Details |
OFF | OFF | No client distribution |
CONNECTION | CONNECTION | Round-robin execution of database requests |
STATEMENT_ONLY | STATEMENT | Statement routing based on table locations |
STATEMENT | ALL | Concurrent activation of all distribution features |
In SAP ABAP environments this setting can be implemented with the following SAP profile parameter:
dbs/hdb/connect_property = DISTRIBUTION=
Multiple connection properties can be specified in a comma separated list, e.g.:
dbs/hdb/connect_property = DISTRIBUTION=STATEMENT, ENCRYPT=TRUE
How can we configured statement routing on SQL statement level?
Statement routing can be configured via SQL hints (see SAP 2142945):
DISTRIBUTION | Details |
ROUTE_BY('', ..., '') | Route SQL statement to hosts related to the specified table(s) |
ROUTE_BY_CARDINALITY('', ..., '') | Route SQL statement to host related to the specified table with the highest amount of records |
ROUTE_TO(, ..., ) | Route SQL statement to hosts related to the specified volume ID(s) |
NO_ROUTE_TO(, ..., ) | Exclude routing of SQL statement to host related to the specific volume ID(s) |
Is there any way to determine the statement routing details for connections?
You can use the following approaches to check for the mapping of connections to SAP HANA nodes:
- Monitoring view M_CONNECTIONS (CONNECTION_ID, HOST)
- SQL: "HANA_Sessions" (SAP 1969700)
- SAP HANA Studio -> Administration -> Performance -> Sessions
Furthermore the following columns exist in monitoring views in this context:
Monitoring View | Column | Details |
M_CONNECTIONS | CLIENT_DISTRIBUTION_MODE | Client distribution mode (OFF, STATEMENT ROUTING, CONNECTION_DISTRIBUTION, ALL) |
LOGICAL_CONNECTION_ID | Anchor connection ID (i.e. the connection used to connect to the initial node before routing is done) | |
M_CONNECTION_STATISTICS | EXECUTION_COUNT_BY_ROUTING | Number of executions based on client routing |
M_SQL_PLAN_CACHE | EXECUTION_COUNT_BY_ROUTING | Number of executions based on client routing |
IS_DISTRIBUTED_EXECUTION | TRUE if tables are located on different nodes / services, otherwise FALSE | |
TABLE_LOCATIONS | Locations of involved tables |