T-SQL Tuesday: What were you thinking?
This month's T-SQL Tuesday marks the 120th installment, hosted no other than MCM Wayne Sheffield (@DBAWayne). Here is the Invitation. It also happens the first time I’m not only a monthly reader but a contributor, hopefully I can keep it going and help the T-SQL Tuesday community with valuable content. The guidance to participate this month is to write about some situation in which you couldn’t figure out why something was done the way it was done. Or, in Wayne’s words:
I want to know about things that you’ve seen others do in SQL Server that have left you wondering “What were you thinking” (maybe even with a few #$%^& thrown in)? Tell us what you saw, why you thought it was so crazy, and what you did about it (if anything). And please… just tell us what you saw, not who you saw doing it.
Before approaching this topic, I let it simmer for a day or two, first because I couldn’t come up with a story that was good enough or relevant enough. There is no point in blogging if what you are writing about can’t translate into experiences others can relate to.
Then suddenly this idea of telling my experience of switching employers and working for a new company started to make sense, especially in the context of “what were you thinking” (or more precisely “what is EVERYBODY thinking….” I should say).
My story is not entirely SQL related (although many times it involved SQL configurations I have not seen in the past) but I think it’s an interesting experience because I’m pretty sure many of us go through these frustrations, only sometimes we don’t recognize them as such. Let me explain so it makes more sense.
Rewind back to 3 years ago (more or less). I happened to finish a long time working relationship with the now defunct Hewlett Packard Enterprise Services (after a Spin-Merge that saw it re-birth as TXC) and starting a new and exciting journey with a local mid-size company here in Winnipeg called RapidRTC (where I still work today as full time DBA).
First few days into the company I knew something was odd, I couldn’t explain exactly what or why, but something was feeling strange. It took me a few days and a few “this is how we do it here” meetings to realize that I was having a tough time adjusting to new processes and methodologies after so many years of working for the same customer where I knew how to approach everything almost without reading any procedure manual. Also, I was seeing SQL Server’s configured in ways I had never seen before and I couldn’t help to think “these guys may not know what they are doing here”, turns out I was not entirely right… let me make it more tangible with 2 examples.
Change Release Management
The old customer for which I administered databases unfortunately had no mature processes for releasing changes, but their risk tolerance was high, way higher than RapidRTC (which has some severe risk allergy) so they tipped the balance in favor of “a hiccup here and there is OK” in order to deploy without any checks and balances.
Looking that from hindsight the approach was not only risky, but also un-scalable, but hey… who said every organization needs to consider scalability? especially in the public sector, some organizations do not care whether their solution will “scale well” or not, because they are NOT going to grow, there is no such thing as more customers, more traffic, more clicks. I know that for some people (especially those who read architecture books for breakfast with 2 or 3 chapters dedicated to scalability every morning) it is hard to wrap your head around the fact that, out in the field, these organizations exist… but believe me, they do exist and once you find them, you have to look at things under a different light.
My journey was moving OUT of one of this companies and after some time I realized that I had to understand the rules of the game had changed, there is nothing like deploying some changes without the appropriate pull request, integration test, pre-deployment (to integration servers/staging servers/etc.) and finally signing up for a time slot in our deployment calendar to make it happen. Granted, automation and tooling can make all this happen more or less fast, but due diligence is not something you do because you have time to kill. It has to happen or nothing gets deployed, period.
SQL Server Configuration Settings “max server memory”
I go to one very critical SQL server in my new company and I find 128GB RAM on one box, then I look into server settings and find that Max Server Memory was something like 60GB RAM there.
No other services run there, a single SQL Server 2008R2 Enterprise instance that was leaving more than half their memory in the parking lot.
I told myself: Breath deep… count to 10… breath again… don’t start swearing right away and ask the question that can open a few doors: “Why?”
Turns out this box was one of two servers participating on a Microsoft Failover Cluster, and this clusters had two SQL Server instances, one living permanently in one server (let’s call it ServerA) and the other in the other (ServerB) server.
Their rationale was that they would rather cap the Max Server Memory at less than half the total size of physical memory just in case a failover occurs and both SQL Server instances end up running on the same node.
Again, initially it made no sense to me to me that they were doing that, so I did what most of us do when stumble upon something we just don’t understand, we turn to our fellow DBAs around cyberspace.
Remember I said this company DOES not tolerate risk? Well, to them having a script that adjust the server memory if a failover occurs is considered risky.
The script can fail
It may not trigger appropriately we it should
It needs to be tested and re-tested if something changes
It is yet another potential point of failure
And here is another thing: they know that the DB server runs FINE on 60GB of RAM, so to them capping the RAM made more sense than solving a problem that for them doesn’t exist. I’m not saying this is right or wrong, I’m just stating the way this was considered and why it was solved like this.
My DBA Stack Exchange Post helped me to realize that:
Other people were asking the same thing and I’m not alone (no wonder…)
Aaron Bertrand is a Rockstar, I knew he was a top-notch DBA SQL Server for quite some time. He is a well-known author and respected in the community, yet he still makes time to answer questions for free….
He already had faced this issue in the past and decided to blog about it. It just happened I didn’t see his post and only by asking on Stack Exchange somebody pointed it to me.
The Microsoft Community blog had yet another approach to this problem, this time using PowerShell.
So what did I learn with all of this?
If something doesn’t make sense, don’t rush into conclusions, make sure you understand the eco-system where these decisions are taken.
If you have never seen something configured in certain way, research it and give whoever did it the benefit of the doubt. It may not follow theory, but in practice it could work.
Look in the community other people who may have used the approach that you think is not the right one and try to understand why they do it, get out of your filter bubble, challenge your assumptions. You may be surprised by what you find.
If after all these, you still feel something “process/configuration” makes no sense, don’t expect to change people’s minds by just saying “it makes no sense”, use your research and find other people’s blogs that you respect and know are subject matter experts to support your evidence-based discussion of why you think something needs to change.
Last but not least, be nice, no one wins an argument (technical or not) by being an asshole. Learn to recognize there are always other ways of seeing things.
That's all for me today... I hope I can keep this SQL Tuesday thing going. It is a great source of Blogging inspiration!