SQL Server deployment and Performance on VM issues and the Solutions

The most successful deployments of virtual SQL Servers have a few things in common.  When a company is doing all of these things, odds are they’re going to be very happy with their virtual SQL Servers:

10. Use vSphere 4.1, not earlier versions. vSphere 4.1 introduced major improvements in multi-CPU scheduling that vastly improve performance.  SQL Server will break queries out across multiple virtual CPUs, and it expects all of those vCPUs to perform identically.  vSphere 4.1 helps make that happen.  

Lacie IAmAKey Drive9. When using blades, avoid overcommitting switch uplinks. Virtualization works great with blades – but only when the SAN and network teams monitor utilization rates between the blade chassis switches and the rest of the networks.  All too frequently, the SQL Server is thrown into a blade chassis with dozens of other servers all competing for the same small amount of storage and network bandwidth leaving the blade chassis.  Everyone looks at their monitoring systems and can’t understand why the backups, big queries, and maintenance jobs run so slow, and the culprit is the hidden small uplink.

8. Avoid 1Gb iSCSI for storage. While this cheap and easy-to-use storage works great for most servers, it doesn’t work as well for SQL Server.  If you’re happy with the speed of SQL Server running on local storage, you can be in for a rude surprise with 1Gb iSCSI.

7. Test storage performance before deployment. I’m partial to SQLIO, which might be the worst-named tool in history.  It has absolutely nothing to do with SQL Server – it doesn’t require SQL Server, and it doesn’t mimic SQL Server.  It just flat out hammers IO using whatever parameters you pass in.  Using a test file of the same size as the expected database, the storage should consistently exceed 100MB/sec whether it’s random or sequential, reads or writes.  Higher numbers than that may be required due to the application design or end user expectations, but that’s a good foot-in-the-door number.

6. Compensate for slow storage with memory. If you can’t get the storage performance you need, you can help by caching as much of the database as possible in memory.  If the projected database size is 50GB but the virtual server only has 16GB of memory, then it’s not going to be able to cache the entire database.  Perhaps the users won’t be querying old data, in which case you might be able to get by with less.

5. Ensure CPU power saving is turned off. While Intel’s latest Xeon processors provide impressive power savings, they won’t ramp up to full processor speed unless the CPU is under heavy load.  SQL Server will rarely push the processors that hard, which means they stay slow – sounds good in theory, but in reality, every query takes 70-100% longer than it did pre-virtualization.  If you care about query performance, turn this setting off in the hardware BIOS.  

4. Coordinate VMware reservations and shares with SQL Server settings. VMware vSphere has a great set of options to ensure that a guest OS gets the right amount of resources for its needs.  These settings need to be set in concert with SQL Server’s min and max memory settings.  There’s plenty of bad advice out on the web saying things like, “Just disable the balloon driver and give SQL Server all the memory” – that’s not right either.  There’s a happy medium in letting SQL Server and VMware cooperate to balance resources across multiple guests, but it only works when these settings are aware of each other.

3. Use Resource Pools to track SQL Server licensing. Microsoft’s virtualization licensing for SQL Server is notoriously complex.  Starting with SQL Server 2008R2, only Datacenter Edition ($60k/cpu socket) provides unlimited virtualization rights.  Enterprise Edition ($30k/cpu socket) provides just four VMs.  Tracking these closely with VMware Resource Pools can result in huge cost savings.  If SQL Servers are allowed to move to any host in the VMware cluster, then a licensing audit can produce staggering costs.

2. Use VMware HA for high availability. If your users can tolerate a SQL Server outage of 2-3 minutes, VMware HA is much easier to manage than a SQL Server cluster.  If your users require less than 30 seconds of downtime, consider implementing a physical SQL Server cluster instead.  SQL Server clusters are tricky enough to manage on their own, and doing them inside VMware adds an impractical level of management on servers that can’t be down for more than 30 seconds.

1. Virtualize small SQL Servers first. Start by gaining experience with 1-2 vCPU servers with under 16GB of memory.  As the company’s sysadmins grow accustomed to how SQL Server uses CPU and memory in a virtual environment, they’ll be more confident virtualizing larger servers.  If the DBAs and sysadmins don’t get along when trying to pin down performance problems on smaller servers, they’re going to be very adversarial when dealing with larger servers.

WOW – THAT’S A DEMANDING LIST!

I know what you’re thinking: “This outside consultant is greedy and expects everybody to dump tons of money into their SQL Servers.  He’s asking for the moon.  Nobody does this in real life.”

Let’s rewind back to the beginning of the recommendations where I said, “When a company is doing all of these things, odds are they’re going to be very happy with their virtual SQL Servers.” You can certainly skimp on some of these items and still stand a pretty good chance of being happy with your SQL Server performance.

The more items you skimp on, the worse your chances become.  If you implement a virtual SQL Server with 32GB of memory trying to cache 250GB of databases on RAID 5 storage, hooked up via 1GB iSCSI, with no prior experience virtualizing SQL Servers of that size, odds are you’re going to be miserable.  Users will scream and complain, and you’ll bring in an outsider who will track the problems back to these types of recommendations. 

__________________________________________________________________________

Tips for configuring Microsoft SQL Server in a virtual machine

Solution

RDBMS performance characteristics:
  • Traditionally CPU and Disk I/O intensive.
  • Require adequate amounts of CPU power to prevent SQL scheduler thread queuing.
  • Require fast I/O throughput to write logs without affecting query performance.
From a performance perspective, these issues arise:
  • CPU contention with other processes and virtual machines.
  • Disk I/O contention when writing logs or dealing with write-intensive database activity.
Allocate adequate CPU resources: 
  • Database performance suffers if CPU resources are constrained.
  • During levels of high database activity, run the DBCC SQLPERF(UMSSTATS) command against your database.

    This returns scheduler statistics for each CPU. If the num runnable metric, that is the Scheduler Queue Length is greater than 2, there may be CPU constraints, causing SQL Server threads to queue. 

  • Consider increasing minimum and maximum CPU resource allocations to reduce this queue length.
Use uni-processor virtual machines. Upgrade to VSMP virtual machines only if necessary.  
  • It is easy to move from a UP HAL to a SMP HAL, but not vice-versa.
  • Upgrading from UP to SMP mode requires a compulsive HAL change which can result in HLT related issues after the upgrade when using Win2k virtual machines.
  • Re-installation is required only when moving between ACPI and non-ACPI configurations. Virtual machines with two VCPUs require two PCPUs to be available to service a request. Systems with few PCPUs or many VSMP.
  • Virtual machines may see poor performance.
Separate data and logs on different physical disks:
  • A standard RDBMS best practice is to place SQL data and logs onto separate physical disks. This ensures that I/O intensive logging does not interfere with data queries.
  • In a virtual machine, create one (or more) .vmdk(s) for data and another for logs. Pre-allocate disk space when you create the virtual disks.

    Note: An auto-growing .vmdk sacrifices some performance for the added flexibility.

  • Put the .vmdk on a different LUN to spread activity across multiple spindles.

    This allows sequential writes to occur as fast as possible, helping to boost I/O performance.

Consider RAID 10 for write intensive databases:
  • RAID 10 is more efficient than RAID 5 for writes, but requires more disks.
  • Calculate your read/write ratio to determine if your database is write intensive.
  • After your database has been running for a few days, run Windows Task Manager and view the total number of I/O Read Bytes and I/O Write Bytes for the sqlservr.exe process.

    This shows you the ratio of SQL Server reads to writes since the SQL Server service was last restarted.

Disable screen savers and unused devices:
  • Screen savers and devices like floppy drives, CD-ROM drives, serial ports, etc. consume resources, even when they are not in use.
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s