The DL785 has been in constant demand since it arrived in September, 2008 - mostly doing SQL work of course. The 8-socket form factor for an x64 system is still not widely known about among the customers that come to the lab. The fact remains that most SQL Server workloads run on 4 socket or smaller platforms. With the number of cores per socket continuing to climb, you can get a lot of RDBMS work out of a 4 socket - quad or six core system. One area where the bigger 8-socket platforms are in high demand is in data warehousing.
The Fast Track SQL Server SMP data warehouse materials are out. You can check out all the details here (http://www.microsoft.com/sqlserver/2008/en/us/fasttrack.aspx). Fast Track consists of a set of guidelines for building "balanced systems". The Fast Track materials are specifically designed to describe CPU core-balanced configurations to support sequential I/O data access workloads. The materials are very specific about the optimizations for sequential I/O workloads. Of course there is no real world workload that is 100% sequential I/O all the time. You will have to determine where you have a good fit to your particular workload before adopting this approach.
In addition to describing an approach for configuring CPU core-balanced systems for sequential I/O, the Fast Track team applied the methodology to a wide array of currently shipping server and storage options at various levels of scale. The link above has all the details.
My current interest is in how to apply the techniques of the Fast Track approach to assessing hardware performance at the component level. We have been advocating the use of tools like IOMeter and SQLIOStress to validate disk subsystem performance for years. The Fast Track documents give me a standardized way to look at overall system balance in the context of workload and system setup that utilize the SQL binaries directly.
The first thing I'm going to do is run a CPU Consumption Rate analysis on the DL785. The server current has the Opteron 8354 2.4 Ghz processors (Barcelona)that came with it last fall. It has been in such high demand we haven't taken time to upgrade them. I'm getting newer ones soon so I want to try running the CPU Consumption Rate both before and after the upgrade. Nothing official, just an attempt to learn more about the Fast Track approach by getting my hands dirty. Not like dirt under the finger nails dirty though. Computer techs stay much cleaner these days. More on CPU consumption rates soon...
Thanks to the joint marketing efforts of HP and AMD we just unpacked and racked a new HP DL785 G5 in the data center. It is hard to miss at 7U’s but there is an awful lot of power packed into this machine. The standard configurations have 4 sockets populated but ours has 8 AMD Opteron 8354 quad-core processors running at 2.2 GHz. That’s right – 32 cores in 7U’s. You can read more about the Quad-Core AMD Opteron™ Processor here including a description of the new HyperTransport™ enhancements of the 8000 series that support 8 socket server designs.
Each processor module is equipped with 8 DIMM slots that slide into the front of the server for easy access and hot swapping. We will try this out with SQL Server 2008 under load. You can get a good look at the physical system design at the HP site on the DL785 Quickspecs page. Our machine has all 64 DIMM slots populated with 2GB PC2-5300 DDR SDRAM for a total of 128 GB.
If you looked at the physical diagram from the link above you’ll see two local storage media bays at the top right on the front panel view. Our system has two Smart Array P400i Controllers, one for each bay, each with the optional 512MB of battery backed write cache. Each bay can hold up to 8 SAS drives and we are fully populated with 16 - 72GB, 10K drives. That is a lot of local storage. The P400 controllers will do RAID stripping across the disks in one bay only so the largest disk group I can have is 8 spindles. I haven’t decided how take advantage of that much direct attached storage given the horsepower of the server so let me know if you have any creative ideas. One thought is to put one or more transaction logs on their own local RAID group. That would simplify SAN layout if I could get enough IOPS locally. SAN layout negotiations get complicated in a hurry if you request a separate transaction log disk group per database to preserve the sequential write efficiency of the log writes. Something else to ponder.
I am also tracking down some more information on the power supply redundancy for this server. We have 3 power supply bays populated each with a 1200W unit. There are three empty power supply bays and the online documentation says I need all 6 bays populated to achieve “full redundancy”. I need to understand what the power consumption vs redundancy trade-offs are with this machine.
The DL785 chassis includes 11 PCI slots. They were all open when we took the lid off but we now have 4 populated with Emulex LPe11000 4Gb/s HBA’s Mark, our lab manager, found a block diagram for the DL785 on the HP site and he decided what slots to use to spread out the I/O traffic to the all the processor nodes. More on this later when we start to do some bigger workload testing.
So, it is in the rack with Windows Enterprise Server 2008 installed. I started to run some SQLIOSim loads against the local disks and one SAN drive. This is about the third time I have tried to run SQLIOSim on Server 2008 without specifying “Run as administrator”. SQLIOSim does not prompt for elevation on its own and is not happy running with non-admin rights. Next time I won’t forget. There’s a lot more to do so I’m going back to the lab. We have the 785 hooked up the HP EVA 8000 with 144 spindles and I’m curious to see if we can flood the EVA with IOPS. Wish me luck.
Phil Hummel – Mountain View, CA
At the MTC’s we’ve worked with a few customers interested in discussing how they could migrate existing cubes from Essbase to SQL Server Analysis Services. In one particular instance we did a proof of concept that involved manually converting one of the customer’s more complex cubes.
It’s worth noting that there is at least one tool on the market from a Microsoft partner called Exologic that will automate the migration process and it seems to do a good job of converting objects over. However, in this particular case the customer wanted to understand the complexities involved and see how difficult the process would be to rebuild the cube from the ground up in Analysis Services.
One of the difficulties that we initially encountered was in implementing the shared members from Essbase in Analysis Services. A shared member essentially means that the dimension member needs to appear in more than one place within the hierarchy. One example of this would be an organization where employees have one manager but may have a “dotted-line” relationship to another manager as well.
The traditional way to accomplish this has been to use the Custom Rollup feature in SSAS. In this particular POC though, we discovered that due to the design of the cube, amount of data, and number of shared members the performance of the custom rollup was not acceptable. It was taking nearly a minute to run one particular query that used the custom rollup dimension.
Instead, we ended up using a many-to-many dimension to implement the shared members. That improved performance of the query in question significantly - with the new design it ran in less than one second.
The SQL Customer Advisory Team has done a detailed write-up describing the approach from a more general standpoint. It’s a good article and should be helpful if you have a situation where you need to migrate Essbase shared members to SSAS.
-Scott Hulke, Microsoft Technology Center – Dallas
Last year we worked with a customer on a proof of concept that involved populating and running some performance tests against a 20TB database. The POC was conducted using the following software and hardware:
- Windows Server 2003 Datacenter sp1
- SQL Server 2005 Enterprise sp2
- Unisys ES7000 (64 x64 processor cores, 256GB RAM)
- HP XP12000 SAN with 24TB of storage spread across 860+ spindles
As it turned out, the bulk of the data was contained within one partitioned table in particular and we ended up having to re-load that table a few times as we optimized the system and made changes to the disk layout.
During the POC, we documented a few tips and lessons along the way in a list that we've been maintaining internally. Now that we've started this blog, I thought it may be useful to publish this information for others that are building or working on very large database systems with SQL Server. Note this is not intended to be a complete list of VLDB considerations but just a list of configuration settings, commands and tips that were helpful in our POC.
Configuration
- Use the -E startup option to enable SQL Server to allocate 4 extents per file instead of the default of 1. This must be done prior to loading data into the database.
- Grant the "Lock pages in Memory" privilege to the SQL service account in order to keep the process memory in physical RAM.
- Turn on Trace Flag 834 to enable large page support for buffer pool. Note this requires the "Lock pages in Memory" privilege.
- Make sure that the SQL Server service account has "manage volume" (SE_MANAGE_VOLUME_NAME) rights in order to enable instant file initialization.
- Set CPU affinity mask (and affinity64 mask if the system has more than 32 processors) on a NUMA system to bind schedulers and threads to specific CPU's. In our case, we set both of these configuration items to 0xFFFFFFFF in order to use all 64 CPU cores.
- Do not rely on autogrow to size the data or transaction log files. Pre-allocate space so that SQL can grab contiguous space on disk.
Partitioning
- ALTER TABLE SWITCH can invalidate statistics causing update stats to run.
- Update statistics on a very large table with several indexes can take a long time to run. In addition to executing a query, it can be triggered by trying to get an estimated execution plan if there have been sufficient data changes.
- In our case, we were doing a lot of partition switching but not changing the data so we turned off auto update statistics so that the stats updates did not affect query performance. In a production environment it would most likely be more appropriate to set the AUTO_UPDATE_STATISTICS_ASYNC option.
- To split a partition and move data, it is more efficient to switch the partition out and move the data manually using an INSERT-SELECT statement rather than using ALTER PARTITION FUNCTION SPLIT RANGE. Then ALTER PARTITION SPLIT RANGE can be used on the partitioned table to split an empty partition into two. In our case that made a difference of several hours when splitting a 200GB partition.
- It is a good practice to always keep an empty partition at either end of the table so it can be split before loading new data.
Parallelism / Performance
- When doing a parallel CREATE INDEX operation, use the SORT_IN_TEMPDB option to get less fragmentation. Be sure to size tempdb adequately so the data files don't have to grow.
- If you have long-running queries or CREATE INDEX commands using large amounts of memory, you can monitor their memory usage with the DMV sys.dm_exec_query_memory_grants.
- CREATE INDEX with MAXDOP larger than the number of processor cores in a single NUMA node will result in SQL Server spreading the tasks across all NUMA nodes. This can cause excessive overhead in the form of CXPACKET wait times as the threads communicate across nodes.
- When running several long-running operations in parallel, you may need to pay attention to NUMA node affinity to ensure that tasks are being evenly distributed across nodes. We used the following query to monitor this:
select t.session_id, r.command, s.parent_node_id, count(*) as tasks, sum(pending_io_count) as [IO],
sum(case t.task_state
when 'RUNNING' then 1
else 0
end) as [RunningTasks]
from sys.dm_os_tasks t
join sys.dm_exec_requests r on t.session_id = r.session_id
join sys.dm_os_schedulers s on t.scheduler_id = s.scheduler_id
where t.session_id > 50
group by s.parent_node_id, t.session_id, r.command
order by 3
- We used SQL LiteSpeed for backups and were able to achieve backup speeds of over 700MB/sec using all 64 CPU cores. Of course, your mileage will vary based on compression settings and data. We have recently published some impressive numbers regarding the new SQL Server 2008 compressed backups.
-Scott Hulke, Microsoft Technology Center – Dallas
On March 5th, we announced the Microsoft SQL Server Data Service offering at the mix conference...here's some basic information on the topic.
Q: What is Microsoft SQL Server Data Services?
A: Microsoft® SQL Server™ Data Service (SSDS) is a highly scalable and cost-effective on-demand data storage and query processing web service. It is built on robust SQL Server technologies and helps guarantees a business-ready SLA covering high availability, performance and security features. Microsoft SSDS is accessible using standards based protocols (SOAP, REST) for quick provisioning of on-demand data-driven & mashup applications.
Q: How does Microsoft SQL Server Data Services differ from traditional on-premise relational SQL Server Database?
A: Microsoft SSDS is a storage and query processing utility providing mega scale, high availability, reliability, and geo-distributed data services in the Cloud. Customers use the service on-demand, with no up-front cost. It eliminates the initial investment in hardware and software and the on-going cost for storage administration, scale maintenance. Developers and Service providers can quickly run their on-demand applications with minimal infrastructure cost.
Q: What can I do with SQL Server Data Services?
A: Customers can use Microsoft SSDS to store virtually any amount of data in the Cloud. They can query and modify data as required by the specific business scenarios. SSDS supports standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit. The primary wire format is XML. Developers and service providers can quickly run on-demand applications with ease. The data has flexible schema which can be modified dynamically by the application. The data is provided with high availability and reliability virtually anywhere, anytime.
Want to try SSDS out...please register for the free Beta program at http://www.microsoft.com/sql/dataservices/default.mspx. Once registered, you'll receive an invitation code to join the beta program where you will have access to detailed documentation and sample code examples on how to consume the services.
For Additional Information please go to:
http://www.microsoft.com/sql/dataservices/default.mspx
http://www.microsoft.com/sql/dataservices/faq.mspx
Wondering how to implement "sparkline" type reports with Reporting Services, well take a look at this post from an MTC architect in New York.
http://blogs.msdn.com/jbasilico/archive/2008/01/09/implementation-of-sparkline-type-reports-within-reporting-services.aspx
Read about the experiences of an MTC architect out in New York who's tested Teradata v12 against SQL Server 2005 Analysis Services.
http://blogs.msdn.com/jbasilico/archive/2008/02/26/functionality-testing-of-teradata-v12-with-sql-server-2005-analysis-services.aspx