Saturday 25 March 2017

Some good reasons to upgrade to SQL Server 2016 - GDPR



The business might not like the cost of upgrading from an older version of SQL Server, plus the cost of a possibly new operating system and maybe new hardware.

Just explain to them that they need to consider data security, how business-critical the data is, and that the data is their main asset.

GDPR is on its way - that's another reason to seriously consider upgrading to SQL Server 2016.

Some features that will sway the argument your way are:

Note: the features below are available in all editions from SQL Server 2016 SP1 onwards.

Always Encrypted

Unlike TDE (Transparent Data Encryption), which only encrypts data at rest, AE encrypts data both at rest and in transit. The data is only decrypted at the application layer. Even DBAs with sysadmin rights cannot see the data unless they have the certificate key.
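As a rough sketch of what this looks like in T-SQL - the table, column, and key names here are hypothetical, and it assumes a column encryption key named CEK_Demo has already been provisioned (normally done through the SSMS Always Encrypted wizard, which also sets up the column master key in a certificate store or Azure Key Vault):

```sql
-- Hypothetical sketch: a column protected with Always Encrypted (SQL 2016+).
-- Assumes column encryption key CEK_Demo already exists in this database.
CREATE TABLE dbo.Patients
(
    PatientID  INT IDENTITY(1,1) PRIMARY KEY,
    NationalID CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Demo,
            ENCRYPTION_TYPE = DETERMINISTIC,  -- deterministic allows equality lookups
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        )
);
```

The application then connects with Column Encryption Setting=Enabled in its connection string, and the client driver does the encrypting and decrypting - the server, and the DBA, only ever see ciphertext.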

Dynamic Data Masking

Basically, this allows server-side masking of sensitive data. You can restrict what an end user sees based on the privileges they have. This could be very good where general users need to see part of a credit card number, but not all of it. Developers and testers using live data will only see the masked, non-sensitive version.
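A masking rule is declared right on the column. In this hypothetical sketch (table, column, and role names made up), most users see only the last four digits of the card number:

```sql
-- Sketch: mask a credit card column so most users see only the last 4 digits.
ALTER TABLE dbo.Payments
    ALTER COLUMN CreditCardNo
    ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)');

-- Results look like XXXX-XXXX-XXXX-1234 unless the user holds UNMASK:
GRANT UNMASK TO FinanceManager;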

Row Level Security

Another great feature that filters out or hides certain rows based on either the context of the SQL query being run or the privileges of the user. The end user will never know that certain rows are being hidden from the result set.
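A minimal sketch of how this is wired up - an inline predicate function plus a security policy. The table, column, and user names here are hypothetical:

```sql
-- Sketch: Row Level Security - each sales rep sees only their own rows.
CREATE FUNCTION dbo.fn_SalesRepFilter (@SalesRep AS sysname)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_result
    WHERE @SalesRep = USER_NAME()        -- reps see their own rows
       OR USER_NAME() = 'SalesManager';  -- the manager sees everything
GO

CREATE SECURITY POLICY SalesFilter
ADD FILTER PREDICATE dbo.fn_SalesRepFilter(SalesRep) ON dbo.Orders
WITH (STATE = ON);
```

From then on, every query against dbo.Orders is silently filtered - the predicate is applied by the engine, not by the application.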

Conclusion

The three security features mentioned above can be used together to create a secure and flexible policy for protecting sensitive data. Considering that GDPR arrives in 2018 and the fines can be high, spending money now on SQL Server 2016 could very well be worth the investment. Think of the purchase as an insurance policy against data loss and the subsequent fines, loss of corporate prestige and so on.

Friday 2 October 2015

SQL Server 2012 AlwaysOn Guide


SQL Server 2012 AlwaysOn

15 May 2012

SQL Server AlwaysOn provides a high-availability and Disaster-recovery solution for SQL Server 2012. It makes use of existing SQL Server features, particularly Failover Clustering, and provides new capabilities such as availability groups. Warwick Rudd explains the basics and shows you how to implement it.

https://www.simple-talk.com/sql/database-administration/sql-server-2012-alwayson/

Friday 14 August 2015

DBA Training Week 8 - What Pages Are In Memory?






What Objects Are In Cache?

We like to think SQL Server is using all of our memory to cache data, but that's just part of it. SQL Server uses memory for lots of things:
  • Caching database objects
  • Sorting data for your query results
  • Caching execution plans
  • Performing system tasks
Often, we're surprised by how little data is being cached for each database.

Last week, we looked at the 8KB pages in our database. Those pages are the same whether they're on disk or in memory - they include the database ID and the object ID, so if we looked at all of the pages in memory, we could figure out which tables are being cached in memory right now. The query below gives us the magic answers, but be aware that the more memory you have, the longer this will take. It won't block other users, but it could take a minute or two if you've got >64GB memory, several minutes if you've got a terabyte or more:
 
SELECT CAST(COUNT(*) * 8 / 1024.0 AS NUMERIC(10, 2)) AS CachedDataMB,
       CASE database_id
            WHEN 32767 THEN 'ResourceDb'
            ELSE DB_NAME(database_id)
       END AS DatabaseName
FROM sys.dm_os_buffer_descriptors
GROUP BY DB_NAME(database_id), database_id
ORDER BY 1 DESC;

Compare the size of each database versus how much is being cached. Often in the field, I'll see 100GB databases that just have 8-12GB of data cached in memory. That might be completely okay - if you only regularly query just that amount of data - but what if we constantly need all 100GB, and we're constantly pulling it from disk?

This Leads to Cool Questions

This DMV query leads to so many cool performance tuning questions! I get so excited by these concepts.

How fast are the cached pages changing? From the moment we read an 8KB page off disk, how long does it stay in memory before we have to flush it out of the cache to make room for something else we're reading off disk? This concept is Page Life Expectancy, a Perfmon counter that measures in seconds how long things stay in RAM. The longer, the better, as I explain in my Perfmon tutorial.
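If you'd rather grab that counter from inside SQL Server instead of Perfmon, it's exposed through a DMV as well:

```sql
-- Page Life Expectancy straight from the DMVs (in seconds; higher is better).
SELECT counter_name, cntr_value AS PLE_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';
```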

Do the results change based on time of day? This is a one-time snapshot of what's in memory at the moment, but it can change in a heartbeat. If you have automated processes that run a bunch of reports in a single database at 2AM, then the memory picture will look completely different at that time.

Are we caching low-value data? If you mix vendor apps and in-house-written apps on the server, you'll often find that the worst-written application will use the most memory. Thing is, that might not be the most important application. Unfortunately, we don't have a way of capping how much memory gets used by each database. This is why most shops prefer to run vendor applications on separate virtual machines or servers - this way, they don't hog all the memory on a SQL Server that needs to serve other applications.

Do we have enough memory? If you're running SQL Server 2008, 2008 R2, or 2012 Standard Edition, you're limited to just 64GB of physical RAM. If you're running SQL Server on bare metal (not a VM) and you've got any less than 64GB, go buy enough to get to 64GB. It's the safest, easiest performance tuning change you can make. If you're in a VM or running Enterprise Edition, the memory question gets a lot tougher. To learn more, read A Sysadmin's Guide to SQL Server Memory.

Are we using memory for anything other than SQL Server? If we've got Integration Services, Analysis Services, Reporting Services, or any other applications installed on our server, these are robbing us of precious memory that we might need to cache data. Don't remote desktop into your SQL Server and run SSMS, either - it's a memory pig. Put your management tools on a virtual machine in the data center, and remote desktop into that instead.

Can we reduce memory needs with indexes? If we've got a really wide table (lots of fields) or a really wide index, and we're not actively querying most of those fields, then we're caching a whole bunch of data we don't need. Remember, SQL Server is caching at the page level, not at the field level. A nonclustered index is a narrower copy of the table with just the fields/columns we want. The fewer fields, the more data we can pack in per page. The more we can pack in, the more data we're caching, and the less we need to hit disk.

When I tune indexes on a server I've never seen before, sys.dm_os_buffer_descriptors is one of the first places I look. The database with the most stuff cached here is likely to be the one that needs the most index help, and we'll talk indexes next.

Brent says: when I was a junior DBA, I focused a lot on the storage. I kept complaining to my SAN administrators because my storage didn't respond fast enough - my drives were taking 50ms, 100ms, or even 200ms in order to deliver data for my queries.

The SAN admin kept saying, "It's okay. The SAN has a cache." Thing is, the size of the SAN's cache is typically 32GB-128GB - which at first sounds like a lot - but divide it between all of the servers connected to the SAN. Often, we find that an individual SQL Server might get only a couple of gigabytes of SAN cache. That's way less than what the SQL Server has in memory. What are the odds that, when we need data for a query, it's not going to be in SQL Server's 64GB of memory, but it IS going to be in the SAN's miserly 2GB of cache? Not gonna happen.

SAN caching is still great for writes, especially for the transaction log, but don't count on it helping for performance for SELECT queries.

To learn more, check out my training course on Storage, Virtualization, and Hardware for DBAs. As a thank-you for reading these and investing in your own skills, you can use coupon code HardwareTraining to knock off $100 this week.



DBA Training Week 7 - Managing Space Inside Databases




Pages: What SQL Puts in Files

Inside each of your database data files (MDFs), SQL Server stores your data in 8KB pages. That's kilobytes - not megabytes, not gigabytes, but just mere kilobytes.

Say that we create a table:

CREATE TABLE dbo.Employees
(EmployeeID INT IDENTITY(1,1),
 EmployeeName VARCHAR(200))

First off, yes, I understand, I shouldn't do EmployeeName in one field - I should have FirstName, LastName, MiddleName, Suffix, Prefix, yadda yadda yadda, but I'm trying to keep this email short. Now see what you did? It's long again. Doggone it, it's hard teaching this stuff in an email.

Anyhoo, in this table, each record takes up just a little bit of space. EmployeeID is an INTeger, which takes 4 bytes. It's the same 4 bytes whether that number is 1 or 1,000,000. EmployeeName is a VARCHAR(200), which means we can store up to 200 characters in here, and each character takes a byte. If we insert 'BRENT OZAR', that's 10 characters (and boy, am I a character), so we need 10 bytes to store it.
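You can verify those byte counts yourself with DATALENGTH - note the integer takes the same four bytes no matter how big the number gets:

```sql
SELECT DATALENGTH(CAST(1 AS INT))       AS OneAsInt,      -- 4 bytes
       DATALENGTH(CAST(1000000 AS INT)) AS MillionAsInt,  -- still 4 bytes
       DATALENGTH('BRENT OZAR')         AS NameBytes;     -- 10 bytes
```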

If all of our employees average about 10 characters in their name, that means we could fit about 500-600 records per 8KB database page. (In reality, there's some overhead because SQL Server also needs to use some parts of the page to store metadata, and we'll talk about that later in the training.)

Brent Ozar Unlimited is a small company, so we can keep all of our employees on a single 8KB page. As we insert, update, and delete employees, SQL Server fetches that 8KB page off disk, brings it into memory, makes the necessary changes, and then writes that data page back to disk. The 8KB page itself is the smallest unit that SQL Server will cache - it doesn't cache individual rows/records - and each page belongs exclusively to just one object.

A Word About Objects

You'll notice that I avoid using the word "table". Tables are cool, but as we start to dig into what SQL Server's doing under the hood, we want to start thinking about these three object types:

Heap - a table with no clustered index. In my dbo.Employees table, I didn't specify in what order SQL Server should store my data, so it's just going to slap the data down on my 8KB page in any old order.

Clustered Index - what we normally think of as a table. If I'd have created my table like this:

CREATE TABLE dbo.Employees
(EmployeeID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
 EmployeeName VARCHAR(200))

Then SQL Server would store my data in order of EmployeeID. That way, when I search for EmployeeID #42, it can go directly to that number without scanning through all of my employees. The clustered index is sorted in the order of the EmployeeID field, but it's actually the full copy of our table, including all of our fields - in this case, just EmployeeName.

Nonclustered index - If I tell SQL Server to:

CREATE NONCLUSTERED INDEX IX_Name
ON dbo.Employees(EmployeeName)

Then SQL Server will create a second copy of my table sorted by EmployeeName. This copy of my table will only include the fields specified in my index (EmployeeName), plus whatever fields it needs to get back to the clustered index (in this case, my clustering key, EmployeeID).

All three of these objects - heaps, clustered indexes, and nonclustered indexes - will be stored on separate sets of pages. We won't have the clustered index and nonclustered index for the same table on the same page - they're split. That's why when we're doing space analysis, we have to think in terms of indexes, not tables. To learn more index basics, read Jes Borland's SQL Server Index Terms.

Pages & Rows on Disk

The sys.dm_db_index_physical_stats Dynamic Management Function (DMF) returns the number of rows and pages stored in each database object. It takes parameters for database ID and object ID, or you can pass in NULLs to get information across all of your database objects. Scroll down to the examples link in that Books Online page, and you'll get queries to examine a single table - I'd strongly recommend starting with a small table, because some of the parameters for this DMF will cause SQL Server to actually look at every page in the object. That means if all of the pages for that object aren't cached in memory, SQL Server will go pull those pages off disk, and that can slow down your running SQL Server.
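Here's the shape of query I mean, run against the little dbo.Employees table from above. Note the mode parameter: SAMPLED or DETAILED is needed to get the record counts and sizes (they come back NULL in LIMITED mode), and those modes are exactly the ones that touch more pages:

```sql
-- Pages, rows, and record sizes for one small table.
SELECT OBJECT_NAME(ips.object_id)   AS TableName,
       ips.index_id,
       ips.page_count,
       ips.record_count,
       ips.avg_record_size_in_bytes,
       ips.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('dbo.Employees'), NULL, NULL, 'SAMPLED') AS ips;
```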

This DMF also includes average record size and max record size. This makes for fun spelunking: how big is each record, really? Just because we make everything a VARCHAR(8000) doesn't mean we're actually storing 8,000 characters in each field. Now, don't go changing your database structure just yet - you can easily break applications when datatypes change. Let's leave that for later.

You can get similar metadata much faster by using sp_spaceused, but it doesn't get the cool info about average record size, and I wanted to encourage you to go spelunking here.
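For example:

```sql
EXEC sp_spaceused 'dbo.Employees';
-- Returns rows, reserved, data, index_size, and unused space for the table.
```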

Learning More About Pages

In my 90-minute session How to Think Like the Engine, I explain pages, indexes, joins, SARGability, and more. I use real database pages from the StackOverflow.com database for demos, and you actually get PDFs to print out and follow along as we go.

Your Homework

Let's start thinking about what objects are taking up space in our databases. Check out our free sp_BlitzIndex™ stored procedure that analyzes the indexes in your database from a psychologist's point of view. Is your database a hoarder, clutching on to a bunch of nonclustered indexes that aren't getting used, and are just taking up space? The details columns in the results will show how big each index is, and whether it's getting used or not.

This is important because the more indexes you have:
  • The longer your backups take
  • The longer index rebuilds take
  • The more memory you need to cache everything
  • And most importantly, the slower your inserts/updates/deletes go, because SQL Server has to maintain more copies of your table
As we start to move up the Hierarchy of Needs from capacity into performance, you'll start to see how these foundational items are so important. See you next week! 

Brent says: Wanna see what a page looks like? Check out the DBCC PAGE command. You pass in a database name, file number, and page number, and SQL Server will return the nearly-raw contents of that page along with diagnostic information. It's a fun way to get a peek under the hood.

This week's topic is a good example of how knowing the basics of database internals can come in handy when you step back and think about database server performance. You don't have to use DBCC PAGE as part of your job as a DBA, but just knowing what an 8KB page is helps you understand the output of various Dynamic Management Views (DMVs) when they report back units in pages.

For more, watch How to Think Like The Engine. Use coupon code Engine19 and it's just $19 this week.

Free related videos:

How Does SQL Server Store Data?
Back to Index Basics
Heaps in SQL Server

How to Tune Indexes
5 Hours, $299

Want to become an index tuning expert? Kendra Little shows you how. Use coupon code InsideIndexes and get $100 off.


DBA Training Week 6 - How to Monitor Drive Space

How to Monitor Drive Space

Don't.

It's not your job.

No, seriously, hang on a second. I'm not saying that you should ever say things like "It's not my job," but I do want you to understand when you should avoid reinventing the wheel. Your Windows administrators should be monitoring every Windows server you have, making sure it doesn't run out of drive space. This isn't some wild, unique need that only SQL Server has - it's everywhere in your data center.

(You might actually even BE a Windows admin, just here because you need to learn about working with SQL Server. It might actually BE your job to monitor this stuff. That's cool - that's how I got started too.)

In our journey from the base of my Hierarchy of Database Needs to the top, we do indeed need to talk about capacity - but I don't want to monitor drive space from inside SQL Server, and I don't want to buy SQL-specific tools in order to pull it off. Yes, you can technically use commands like xp_fixeddrives to list all of the SQL Server's drive letters and how much free space they have, but that doesn't work everywhere. For example, you might have mount points or databases on UNC paths, neither of which will show up in xp_fixeddrives. So leave the drive space monitoring to the pros.
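For what it's worth, sys.dm_os_volume_stats (available from SQL Server 2008 R2 SP1 onwards) does understand mount points, unlike xp_fixeddrives - though it still only reports volumes that actually host database files, which is another reason this belongs with the Windows monitoring tools:

```sql
-- Free space on every volume hosting a database file, mount points included.
SELECT DISTINCT
       vs.volume_mount_point,
       vs.total_bytes / 1024 / 1024 / 1024     AS TotalGB,
       vs.available_bytes / 1024 / 1024 / 1024 AS FreeGB
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs;
```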

Why You Shouldn't Build a Monitoring Tool

If you want to be a professional developer, you should build things. You should learn what exceptions could happen, how to trap them, and how to fail elegantly. It's hard work, but if you get good at it - really good - you can build amazing things and make a killer living for yourself.

But if you want to be a professional DBA, you should leave the developing to the developers.

I'm not saying you shouldn't learn the dynamic management views (DMVs), how to dig through them to find the evidence you need, and how to turn that data into actions. Hopefully, I'm going to teach you how to do a lot of that over the course of the next six months. Take another look at the Hierarchy of Database Needs again, and think for a second about all the things we're going to be learning over the next six months. Just in the last five weeks, I've had you build an inventory of your servers, automate your database restores, start locking down security, and enumerate your database risks. The next few months are chock full of things that will help you become a hero in the eyes of your users and your developers.

Building a crappy monitoring tool in your spare time will not give you that same level of respect. (And yes, if you've only got your spare time at work, your monitoring tool is going to be crappy. If you've got so much time that you can build a great monitoring tool, you need to focus on providing results to the business fast, because you're about to be laid off as pure overhead.)

How to Pick a Monitoring Tool

There are basically two kinds of monitoring tools out there for SQL Server.

Up/down Alerting - these tools make sure the SQL Server service is running, and that it has all the basic needs taken care of. If the server runs out of drive space, if the service stops, if the event log starts filling up with errors, or zombies attack, these tools will let you know. The most popular software in this category is Dell Spotlight, Idera SQL Diagnostic Manager, and Red Gate SQL Monitor.

Performance Diagnostics - these tools try to derive insight from SQL Server's dynamic management views (DMVs) to help performance tuners wring the most throughput out of their hardware. The most popular software here is SQL Sentry Performance Advisor, Dell Foglight Performance Analysis, and to some extent, the tools in the above category too.

If I was you, I'd start by asking the Windows team if they've got any software that handles the up/down alerting for services, drive capacity monitoring, etc. If so, get them to start monitoring your servers. I'm being selfish here - my real reason is that I want to save my monitoring budget for tools in the Performance Diagnostics category. Surprisingly, all of these tools are around the same price - around $1,000-$2,000 USD per monitored instance.

Then, no matter which category of tool you're buying, make a list of the last 4-5 things that have caused your phone to ring after hours. Here's some of my personal favorites:
  • Deadlocks
  • Out-of-control query running for hours
  • Long-running job, like a backup going too slow
  • Queries that desperately need an index
Figure out how to recreate those same symptoms in your development environment, and then get a free trial of a couple of the tools I mentioned. (All of them provide free 10-14 day trials.) Reproduce the problem, and watch how the monitoring software reacts. Does it lead you to the root cause quickly, or does it just have a bunch of flashing red lights on the console? The best ones will save you time by getting you right to the solution.

After you've done your evaluation and picked a favorite, get a quote from them - and get a quote from the other vendors as well. Share the quotes with the competing salespeople. You'd be surprised how willing they are to negotiate, and you might be able to use the cheapest quote to get the best tool at a price that works for you.

Your Homework This Week

I don't expect you to magically get budget approval for a tool this week, but I want you to start making a list of day-to-day problems as they strike. If you waste four hours troubleshooting a deadlock problem, make a note of the date, the time required, and a few words about the root cause. In a month or two, you'll be able to present this list to your boss as proof of why the right monitoring tool could save you money.

I'm not selling you vendor tools, by the way. I often link to vendor tools, but I don't receive any compensation. I'm just a huge believer in using the right tool to look like a hero fast - instead of building your own hammer every time you change jobs.

If you're still convinced you should build your own tool, go check out the open source project Opserver. It's a free performance monitoring tool from the guys at Stack Exchange, home of Stack Overflow. They use SQL Server (and a lot of other cool tools), and theirs is the best open source performance dashboard I've ever seen. (And that's not just because I helped with some of the DMV queries.)

See you next week!
Brent 

Brent Says: People often ask me what's my favorite monitoring tool. Because I jump around from one client to the next, and they all use different tools, I don't get the luxury of knowing one tool well.

When I do see folks that are happy with their monitoring tool, these are the tools they're usually using - listed in alphabetical order:

Dell Spotlight on SQL Server Enterprise

Idera SQL Diagnostic Manager

Red Gate SQL Monitor

SQL Sentry Performance Advisor



DBA Training Week 5 - Risk, Governance, and Compliance

Risk isn't just a fun board game.

Download the SQL Server Compliance Guide. It's an old 92-page whitepaper circa 2008, but it might just be the best (and most timeless) technical document that Microsoft has ever produced.

I only need you to read pages 7-13. In those six pages, you'll understand the differences between risk management, governance, and compliance. Risk management means knowing what risks the company is taking with the data, governance means the actions taken to address the risks, and compliance means someone is double-checking that we're actually doing the governance stuff we say we're doing.

Your job as a DBA involves all three, but when you're just getting started with compliance, focus on risk management. We need to get a quick idea of the different ways we could lose data, or that supposedly secure data could get into the wrong hands. (Yes, your developers are probably the wrong hands, but that's a different story.)

Homework: Look for Risky Business

Odds are, nobody in the company has an inventory of the data we're storing in ridiculously unsafe ways. Here's a quick way to check your own database for dangerous fields:

SELECT * FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%password%'
OR COLUMN_NAME LIKE '%social%'
OR COLUMN_NAME LIKE '%credit%'

Feel free to change out those keywords for other terms that are relevant to your business - fields that are sensitive, and that would hurt if they got out into the wild. Then look at the contents of those tables - is the data being stored unencrypted? Who has access to it?

If we're storing unencrypted passwords in the database, for example, then every database backup we've ever done is dangerous. If anybody gets access to any backup file, like our offsite tape backups, then we could be on the front page of the news tomorrow.

When you find it, email the developers and the business contact for that database. Explain that the data is not encrypted, and use examples from the SQL Server Compliance Guide to show how this is a risk for the business.

Compliance: Knowing Who's Accessing What

It's unusual for me to assign homework first in the email, and then go on to talk about other things, but I want you to be aware of other things that companies usually do around their secure data.

Often management will say, "We need to audit everyone who changes or accesses secure data." Technically, SQL Server has features that can accomplish this goal - things like SQL Server Auditing. In practice, though, this is a problem for hard-core security teams because the very same DBA who manages the servers also has permissions to change the SQL Server audits. A greedy DBA could easily disable auditing, get the necessary data, and then enable it again.
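For reference, here's a minimal sketch of what such an audit looks like - the audit name, database, table, and file path are all made up - along with the loophole that worries the security teams:

```sql
-- Hypothetical sketch: audit reads against one sensitive table.
CREATE SERVER AUDIT SensitiveDataAudit
TO FILE (FILEPATH = 'D:\Audits\');
GO
ALTER SERVER AUDIT SensitiveDataAudit WITH (STATE = ON);
GO
USE Payroll;
GO
CREATE DATABASE AUDIT SPECIFICATION PayrollSelectAudit
FOR SERVER AUDIT SensitiveDataAudit
ADD (SELECT ON OBJECT::dbo.Salaries BY public)
WITH (STATE = ON);
GO
-- The problem: a sysadmin can simply switch it off, do their reading,
-- and switch it back on:
-- ALTER SERVER AUDIT SensitiveDataAudit WITH (STATE = OFF);
```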

With seriously ambitious work, you can lock auditing down so that can't happen, but ... it's really, really hard.

As a result, most seriously secure companies end up with a completely different auditing solution that lives outside of the DBA's realm. The security team buys third party hardware appliances like IBM Guardium or Imperva that act like a network firewall between everyone and the SQL Server. The appliance logs everything that happens in the SQL Server, and then the security team can review those logs without you (the DBA) knowing about it.

These solutions are typically six figures and up. They're expensive because they have to be absolutely bulletproof - if they fail, you can't have the access to the database suddenly stop. Plus, you'll need them in place for not just your production environment, but also your disaster recovery environment.

If you just want to check a box and make the auditors think you're secure, that's easy and cheap, but seriously good security is seriously expensive.

Let me know what you think of the SQL Server Compliance Guide. Enjoy, and see you next week! 


DBA Training Week 4 - Knowing Who Has Access

Missed last week's email? Read it here.

Who Can Get You Fired?

Run our free sp_Blitz™ stored procedure on one of your production servers and pay particular attention to the Security section of the results. It lists the logins who have been granted the sysadmin or security admin roles.

Don't think of them as logins.

Think of them as people who can get you fired.

These people can drop databases, drop tables, change stored procedures, edit data, or even change SQL Server configuration settings like max server memory or maxdop. You're probably not getting alerted when any of these things change - we just can't afford to monitor every single thing in SQL Server and send alerts on it. At some point, we have to be able to trust certain logins, and that's where the sysadmin and security admin roles come in.
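If you'd rather query that role membership directly, without sp_Blitz, the server-level catalog views will give you the same list:

```sql
-- Who holds the sysadmin and securityadmin server roles?
SELECT r.name AS server_role,
       m.name AS login_name
FROM sys.server_role_members AS rm
JOIN sys.server_principals   AS r ON rm.role_principal_id   = r.principal_id
JOIN sys.server_principals   AS m ON rm.member_principal_id = m.principal_id
WHERE r.name IN ('sysadmin', 'securityadmin')
ORDER BY r.name, m.name;
```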

Except when we first get started learning database administration, it's usually because we're the only DBA in the shop, and the server is a mess. The front door is unlocked, the key is under the floor mat, and everybody knows we've got a big screen TV in the living room.

How to Get Started Locking Things DownBefore you start removing people's SA rights, be aware that there can be political backlash. In one shop, the head developer's SA rights were removed, and he stormed into the DBA's office the next morning screaming. Turns out one of his apps automatically created a processing database every night, did a bunch of work in it, and then dropped the database. Nobody knew because it was only around for 30-45 minutes. The problem could have been avoided by communicating the security changes ahead of time, and that's where we need to start.

Take the sp_Blitz™ output to your manager - just the security portions - and say something like this:

"Here's the list of people who can do anything on the production server - delete data, drop databases, or change performance settings. If they do, here's the list of applications that will be affected, including accounting and payroll. I don't want to take away all of their permissions - I just want to start by giving them full permissions over their database, but not in any other databases, and not at the server level. Can I talk to them about doing that?"

Note that we're only going to TALK to them, not actually do it, because we need to communicate with them first. Then, go to the end users or developers involved and say:

"We're starting to lock down the production server, but I want to make sure you have all the permissions you need. I'm going to make you a complete database owner inside your database so you can do anything you want in there, but I'm going to take away your rights to the other databases (like accounting and payroll), and I'm going to remove your permissions to change server-level settings like how much memory the server can use. I'm planning on doing it next weekend, and I want you to have my email and phone number so that if anything breaks on that date, you can call me and I can audit what's happening on the server to see if it's related to the permissions change."
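When the weekend comes, the change itself is only a couple of statements (SQL Server 2012+ syntax; the login and database names here are hypothetical):

```sql
-- Give them full control inside their own database...
USE SalesApp;
ALTER ROLE db_owner ADD MEMBER DevLead;
GO
-- ...and take away the server-wide rights.
ALTER SERVER ROLE sysadmin DROP MEMBER DevLead;
```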

When You Get Resistance

If you get any pushback from developers or users, go to the project managers or business people who have a large stake in the database. For example, if the accounting database is on the server, go to the CFO and say:

"Here's the list of people who can take down the accounting system. They have the permissions to drop the database at any given time, and there's nothing I can do to stop it. I'd like to get that changed - can I schedule a short meeting with you and the development manager to get everyone on the same page?"

You want to turn it into a business problem, not a technology problem, and the CFO will very much be on your side. She can't afford to have her entire department go down just because some developer didn't put a WHERE clause on a T-SQL statement.

I Know, This Week Isn't Fun

Database administration isn't all bacon and roses. Sometimes it's boring politics and paperwork, and this is one of those weeks.

In Week 1: Where My Servers At?, we built a spreadsheet inventory of our servers, and now it's time to fill in a few more details. Since we're analyzing security, we need to know which applications live on each server, and who's in charge of each of those applications. You don't have to fill in the specifics of who has read or write permissions in each database, but we want to at least know the business purpose and the business contact.

The business contact is the one who really helps us get the database locked down because their job is on the line if this data is lost or unavailable. (Come to think of it, you're someone who can get THEM fired!) In the coming weeks, you'll be working more with them on reliability and performance, too, so now is a good time to start fleshing out that part of the spreadsheet. 

Brent says: Database administration is politics.

Sure, we like to think we're the police, here to protect and to serve, but most of what we do involves sitting in meetings, convincing people to do what we want, the way we want it.

It's made more challenging because we often don't have any real authority. Sometimes the DBAs report to completely different managers than the developers - and sometimes, it's even different companies! We might be administering a database server that houses a third-party application, and the vendor's support team demands to have SA access.

Consulting Lines: After doing consulting for a while, I wrote a series of blog posts explaining lines you can use as a DBA:
"Sounds like you've got it all under control."
"What happens if that doesn't work?"
"Would you mind driving?"
"SQL Server needs a dog."

Missed a week? Here's what we covered so far: