Friday, 14 August 2015

DBA Training Week 8 - What Pages Are In Memory?






What Objects Are In Cache?

We like to think SQL Server is using all of our memory to cache data, but that's just part of it. SQL Server uses memory for lots of things:
  • Caching database objects
  • Sorting data for your query results
  • Caching execution plans
  • Performing system tasks
Often, we're surprised by how little data is being cached for each database.

Last week, we looked at the 8KB pages in our database. Those pages are the same whether they're on disk or in memory - they include the database ID and the object ID, so if we looked at all of the pages in memory, we could figure out which tables are being cached in memory right now. The below query gives us the magic answers, but be aware that the more memory you have, the longer this will take. It won't block other users, but it could take a minute or two if you've got >64GB memory, several minutes if you've got a terabyte or more:
 
SELECT CAST(COUNT(*) * 8 / 1024.0 AS NUMERIC(10, 2)) AS CachedDataMB , 
CASE database_id WHEN 32767 THEN 'ResourceDb' ELSE DB_NAME(database_id) END AS DatabaseName 
FROM sys.dm_os_buffer_descriptors 
GROUP BY DB_NAME(database_id) , database_id 
ORDER BY 1 DESC

Compare the size of each database versus how much is being cached. Often in the field, I'll see 100GB databases that just have 8-12GB of data cached in memory. That might be completely okay - if you only regularly query just that amount of data - but what if we constantly need all 100GB, and we're constantly pulling it from disk?

This Leads to Cool QuestionsThis DMV query leads to so many cool performance tuning questions! I get so excited by these concepts.

How fast are the cached pages changing? From the moment we read an 8KB page off disk, how long does it stay in memory before we have to flush it out of the cache to make room for something else we're reading off disk? This concept is Page Life Expectancy, a Perfmon counter that measures in seconds how long things stay in RAM. The longer, the better, as I explain in my Perfmon tutorial.

Do the results change based on time of day? This is a one-time snapshot of what's in memory at the moment, but it can change in a heartbeat. If you have automated processes that run a bunch of reports in a single database at 2AM, then the memory picture will look completely different then.

Are we caching low-value data? If you mix vendor apps and in-house-written apps on the server, you'll often find that the worst-written application will use the most memory. Thing is, that might not be the most important application. Unfortunately, we don't have a way of capping how much memory gets used by each database. This is why most shops prefer to run vendor applications on separate virtual machines or servers - this way, they don't hog all the memory on a SQL Server that needs to serve other applications.

Do we have enough memory? If you're running SQL Server 2008/R2/12 Standard Edition, you're limited to just 64GB of physical RAM. If you're running SQL Server on bare metal (not a VM) and you've got any less than 64GB, go buy enough to get to 64GB. It's the safest, easiest performance tuning change you can make. If you're in a VM or running Enterprise Edition, the memory question gets a lot tougher. To learn more, read A Sysadmin's Guide to SQL Server Memory.

Are we using memory for anything other than SQL Server? If we've got Integration Services, Analysis Services, Reporting Services, or any other applications installed on our server, these are robbing us of precious memory that we might need to cache data. Don't remote desktop into your SQL Server and run SSMS, either - it's a memory pig. Put your management tools on a virtual machine in the data center, and remote desktop into that instead.

Can we reduce memory needs with indexes? If we've got a really wide table (lots of fields) or a really wide index, and we're not actively querying most of those fields, then we're caching a whole bunch of data we don't need. Remember, SQL Server is caching at the page level, not at the field level. A nonclustered index is a narrower copy of the table with just the fields/columns we want. The less fields, the more data we can pack in per page. The more we can pack in, the more data we're caching, and the less we need to hit disk.

When I tune indexes on a server I've never seen before, sys.dm_os_buffer_descriptors is one of the first places I look. The database with the most stuff cached here is likely to be the one that needs the most index help, and we'll talk indexes next.

Brent says: when I was a junior DBA, I focused a lot on the storage. I kept complaining to my SAN administrators because my storage didn't respond fast enough - my drives were taking 50ms, 100ms, or even 200ms in order to deliver data for my queries.

The SAN admin kept saying, "It's okay. The SAN has a cache." Thing is, the size of the SAN's cache is typically 32GB-128GB - which at first sounds like a lot - but divide it between all of the servers connected to the SAN. Often, we find that an individual SQL Server might get only a couple of gigabytes of SAN cache. That's way less than what the SQL Server has in memory. What are the odds that, when we need data for a query, it's not going to be in SQL Server's 64GB of memory, but it IS going to be in the SAN's miserly 2GB of cache? Not gonna happen.

SAN caching is still great for writes, especially for the transaction log, but don't count on it helping for performance for SELECT queries.

To learn more, check out my training course on Storage, Virtualization, and Hardware for DBAs. As a thank-you for reading these and investing in your own skills, you can use coupon code HardwareTraining to knock off $100 this week.



No comments:

Post a Comment