This Blog provides information about running SAP Applications on the Microsoft Platform. The Blog is written by people who are working with SAP on the Microsoft Platform for decades.
In the following we explain, how database statistics should be updated for SAP BW running on Microsoft Server. In particular, we describe the new feature “Lightweight Update Statistics”, which speeds up by factors SAP BW process chains containing Update Statistics.
The implementation of Update Statistics in Microsoft SQL Server as well as the implementation of Update Statistics within SAP BW (running on SQL Server) has been improved a few times within the last 10 years. We will discuss here only SAP BW 7.00 and newer and SQL Server 2005 and newer. All older releases of SQL Server or SAP BW should not even be considered for running a productive system. Some of the features we describe below require the newest SAP Support Packages as listed at the end of this BLOG.
The BW user interface is independent from the underlying RDBMS. As a result, it looks the same for ORACLE, DB2, Sybase and SQL Server. The BW user interface does not take any specifics of the RDBMS into account. This causes some confusion for SAP BW customers.
The improvements we implemented resulted in a changed system behavior in the past. This still causes some confusion for SAP BW customers, in particular when running age-old SAP BW support packages. All improvements were driven by two goals:
The cost based query optimizer of SQL Server relies on actual index and column statistics. SAP strongly recommends turning on the database options AUTO_UPDATE_STATISTICS and AUTO_CREATE_STATISTICS. Under no circumstances you should ever turn off one of these database options. As a result, statistics are updated automatically by Microsoft SQL Server (and additional column statistics are created when needed).
Parameterized SQL statements benefit from the database option AUTO_UPDATE_STATISTICS_ASYNC. Therefore SAP strongly recommends, to turn it on for all SAP ERP systems.
SAP BW queries do not use parameterized SQL statements. They always use literals. However, besides BW queries there are also normal Open SQL queries running on an SAP BW system. These queries are typically running against administration tables. When running your system with actual SAP support packages, you will hardly run into a situation, where the setting of AUTO_UPDATE_STATISTICS_ASYNC matters.
Therefore you should turn on AUTO_UPDATE_STATISTICS_ASYNC also for SAP BW systems. However, there may be special scenarios, were SAP support recommends turning it off for a specific customer system. You can find a discussion about this in http://blogs.msdn.com/b/saponsqlserver/archive/2012/01/13/auto-update-statistics-async-in-sap-netweaver.aspx.
Automatic Update Statistics runs fine for almost all ERP tables. It is described in detail at http://msdn.microsoft.com/en-us/library/dd535534(SQL.100).aspx. In the past we fully relied on Automatic Update Statistics in SAP BW, too. However, for small tables Automatic Update Statistics is not sufficient in SAP BW:
The Automatic Update Statistics of Microsoft SQL Server is not running for tables, which contain less than 500 rows. For a typical ERP system this is not an issue, because for small tables the execution plan does not matter anyway. However, in SAP BW we typically join small tables (which potentially have less than 500 rows) with very large fact tables (having hundreds of million rows). We have seen really bad execution plans when executing these JOINs, if the small table did not have any database statistics (or a histogram containing only a handful of rows). We did not observe any difference between using the default sample rate and FULL Update Statistics (except for SQL Server 2000 – that’s the reason why we still use the WITH FULLSCAN option for small tables in SAP BW).
The first example are temporary tables (for example /BI0/06* tables) used in SAP BW queries. In newer SAP BW support packages, the statistics of these tables are automatically updated by SAP BW (means: SAP BW is explicitly executing a manual UPDATE STATISTICS command on SQL Server when running the BW query).
The second example are small dimension tables. They are filled when loading data into an InfoCube. We strongly recommend using BW process chains including the process type "Construct Database Statistics". In this case you will never see an issue with the statistics of a dimension table. When not doing so, you may run into the following issue:
When a new cube is created, the dimension tables are filled with one row having dimension ID 0 (column DIMID = 0). The first SELECT on this table causes an Automatic Update Statistics, which creates a histogram on DIMID (which is the primary key for dimension tables). Further Automatic Update Statistics are not triggered as long as the table has less than 500 rows.
When loading a few rows of master data (but less than 499 rows) into the dimension table, the histogram still wrongly contains only one value for the DIMID column (DIMID = 0). This is the worst case when joining a dimension table with a fact table, because fact tables never contain rows with dimension ID 0.
Once you run into this issue (because you did not run update statistics in a BW process chain), you can fix it by running a manual update statistics once. If you load further data into the cube, the number of rows in the dimension table can increase. However, in many cases an additional update statistics is not necessary any more, since the histogram is good enough (it already contains several valid values for DIMID besides 0). Once a dimension table reaches the 500 rows limit, the automatic update statistics runs anyway.
In the following we describe how Update Statistics is implemented in SAP BW for Microsoft SQL Server:
You can set a property for an InfoCube to automatically update the database statistics after each data load when using 3.x DataSources:
This property still causes a lot of confusion, because it is not used for Data Transfer Processes (DTP) or within Process Chains. You typically do not need to take care about this property, when considering an Update Statistics Strategy for SAP BW 7.0 or newer. That’s all, you need to know about it.
Why is it still available in SAP BW 7.30? It is still a relic from SAP BW 3.5 and older releases: For 3.x InfoSources you can create transfer and update rules, which allow direct staging from a 3.x DataSource into an InfoCube. When executing an InfoPackage (outside a Process Chain), which loads data from a 3.x DataSource into an InfoCube, the database statistics of the InfoCube can be automatically updated.
However, this is a very rare scenario. As of SAP BW 7, all data is supposed to be loaded using DTPs within Process Chains. There is explicitly a process type "Construct Database Statistics", which can be added to a Process Chain. The InfoCube properties "Refresh DB statistics after each data load" and "Also refresh statistics after delta upload" do not have any impact when using DTPs!
You can manually update the database statistics of all tables belonging to an InfoCube in transaction RSA1 => (select) InfoProvider => (right click) Manage => (choose tab) Performance.
There are two buttons: "Refresh Statistics" and "Create Statistics (Btch)". Both do exactly the same: They start an SAP batch job, which calls function module RSDU_ANALYZE_TABLE_MSS for each table belonging to the InfoCube. When pressing "Refresh Statistics", the batch job starts immediately. When pressing "Create Statistics (Btch)", you can define the name of the batch job and schedule it (periodically):
Here you can also change the InfoCube properties for 3.x DataSources. As already mentioned, they have no impact when using DTPs or Process Chains. "Create Statistics (Btch)" does not create any additional column-statistics on Microsoft SQL Server. Column-statistics are automatically created as required by Microsoft SQL Server during query execution.
The parameter "Percentage of IC Data Used to Create Statistics” is not used for Microsoft SQL Server. A configured sample rate in table DBSTATC is also not taken into account (any more). Instead the sample rate for UPDATE STATISTICS is calculated as follows:
In the newest* Support Packages, SAP BW never executes the SQL command UPDATE STATISTICS for a table, if the automatic update statistics was turned off (by using sp_autostats) for any INDEX STATISTICS or COLUMN STATISTICS of this table. We implemented this behavior, because a manual update statistics would change the sp_autostats setting implicitly. To clarify: we do not recommend changing the autostats options for SAP BW tables.
In SAP Transaction DBACOCKPIT you can manually update the database statistics for any INDEX or COLUMN STATISTICS of any table (not only for SAP BW tables). Here you can choose any sample rate you want.
However, we do not recommend using DBACOCKPIT to update the statistics for SAP BW tables. RSA1 is much more sophisticated, since you can update the statistics of all database tables belonging to an InfoCube at the same point in time. Furthermore, you can schedule update statistics as a batch job in RSA1.
There are two ways to schedule Update Statistics for all tables of an InfoCube:
However we do not recommend to schedule Update Statistics at all. You should rather update the database statistics at the end of the Process Chains, which contain the DTPs. In addition, you can leverage the Update Statistics included in the cube compression, as described below.
You can use the BW process type "Construct Database Statistics" within a BW Process Chain. The given sample rate in the process variant is not taken into account. It is calculated the same way as described above for Manual Update Statistics in RSA1.
You can drop and re-create the secondary database indexes of the f-fact table during data load by using the following BW process types within a Process Chain (“Delete Index” has no impact on the e-fact table):
The DTP is faster, if there are no secondary indexes on the f-fact table. On the other hand, you have the additional overhead of re-creating these indexes. The benefit depends on the ratio between transferred data and data already contained in the f-fact table. Therefore you may not want include the “Delete Index” process type in all Process Chains which perform DTPs.
However, including “Generate Index” in a Process Chain is always a good idea. Indeed, this process type repairs all indexes of the fact-tables. The runtime is negligible, if all indexes already exist.
We recommend to add the BW process type "Construct Database Statistics" at the end of all process chains, which use DTPs to InfoCubes. Hereby you make sure that the database statistics for small dimension tables are always up-to-date.
There is one tricky thing you should know about "Construct Database Statistics": In the variant of the processs type you can either define the InfoCubes directly or those InfoCubes, which are the target of a Data Transfer Process. We recommend using the latter:
By doing so, only the statistics of the dimension tables and the f-fact table are updated. The statistics of the e-fact table is not updated, because the e-fact table is not modified by the DTP (when specifying the InfoCube instead, an UPDATE STATISTICS is running for all tables belonging to the InfoCube. This also includes the e-fact table and the globally used master data tables of all InfoObjects contained in the InfoCube).
You can further reduce the overhead of the BW process type "Construct Database Statistics" by leveraging Lightweight Update Statistics, a new feature of SAP BW running on Microsoft SQL Server. As a result, UPDATE STATISTICS is actually only performed on two types of tables:
Lightweight Update Statistics can only be used within Process Chains and only when specifying the InfoCubes indirectly as a target of a DTP. Since there are typically no missing statistics (except for empty tables), Lightweight Update Statistics is very fast: We update only the database statistics of those tables, which may cause performance issues in BW queries: small (dimension) tables.
You can configure Lightweight Update Statistics by setting the RSADMIN parameter MSS_LEIGHTWEIGHT_UPDSTATS using SAP report SAP_RSADMIN_MAINTAIN. You have the following options:
There are a few scenarios where SAP BW automatically executes an UPDATE STATISTICS on particular tables. The most important scenario is the usage of temporary tables within SAP BW queries. These tables (for example /BI0/06* tables) are created and filled with data. Before using them for JOINs in BW queries, SAP BW executes an UPDATE STATISTICS on these temporary tables.
Another scenario is the BW cube compression (aka Condense or Collapse): In the newest* BW support packages, Update Statistics is running on all tables of the InfoCube as last step of the cube compression. You can modify this behavior by setting the RSADMIN parameter MSS_COND_UPDSTATS using SAP report SAP_RSADMIN_MAINTAIN. You have the following options:
*) When we talk about the newest SAP support package stacks in this BLOG, we mean the following SPS + applying SAP note 1755488 (Lightweight UPDATE STATISTICS in SAP BW) – or any of the subsequent support packages.
In order to use Lightweight Update Statistics or Update Statistics during BW cube compression you have to apply the newest* BW support packages. Additional benefit of these support packages is described in http://blogs.msdn.com/b/saponsqlserver/archive/2012/08/17/sap-bw-improvements-in-new-support-package.aspx.
Even when not applying the newest* support packages, we do not expect any issues with database statistics on small tables any more, if you
The following strategy keeps database statistics up-to-date, while reducing the overhead of running UPDATE STATISTICS to a minimum. It is automatically used, when applying the newest* BW support packages, without the need to set any RSADMIN parameter. You only have to include the process type "Construct Database Statistics" (using the object type “Data Transfer Process” in its variant) at the end of each DTP Process Chain. The Update Statistics is then spread over 3 places:
The runtime of Update Statistics depends on many parameters: Hardware (CPU, RAM, I/O), cube size (number of rows in fact tables), cube design (number of dimensions, number of characteristics, number of rows in dimension table). We have performed some measurements on a test system (running inside a virtual machine). The absolute runtime may not be typical, however it gives you an idea about the performance impact of Lightweight Update Statistics. The cube had 38 characteristics in 11 dimensions, 8 key figures, 1 million rows in the f-fact table and 10 million rows in the e-fact table. Since most of the dimension tables already contained more than 500 rows, the Lightweight Update Statistics was very fast (229 times faster compared with Normal Update Statistics):