TLDR; Applying DevOps to database deployments tips the scales in Sql Server's favor with simpler sql scripts and DDL rollback.
The Bias of a Microsoft Developer...
I've flipped back and forth between Sql Server and Oracle my entire career, but as a Microsoft stack developer, I've tended to favor Sql Server. Oracle, however, was the primary RDBMS at my current company for many years so this Microsoft guy was forced to soften his bias. As i got past the differences in Oracle, I eventually settled into the idea that it was nearly as good as Sql Server, just different.
.Net Likes Sql Server
The "nearly" stemmed from a few quirks that while forgivable, still made my job harder. If you're a .net shop, you'll forever be chasing some issue with ODP.net. In the the days of the unmanaged version, you'll be fighting with oracle homes and mismatched clients. You'll have to install publisher policy files so you don't have to upgrade your application everytime you make a minor client upgrade. In the days of the managed version you'll struggle with bitness of the MTS dll. You'll have collisions with your local config vs the machine config that somebody accidently applied with an oracle batch file. Once you understand it it's easy enough to work thru and maybe some of it is a decent exercise for testing the way .net works. But you're always going to have new developers where this is all foreign, yet none of these cycles will be wasted with Sql Server as the Sql Server assemblies are part of the .net framework. Call it home team advantage.
Programmability
But then there's the purely Oracle quirks like the lack of a "read uncommited" transaction isolation level, doing updates based on a join (the syntax sucks and doesn't work most of the time since Oracle decided to protect you with "key preserved" updates), or the lack of a date type bind variable. It's important that these aren't labled as a "second language" problem where the loss is simply in translation from what I know to what I don't. I don't really see these this way - this is missing functionality that makes me take a longer route. Nothing huge, but still things that slow me down.
How about two different programming contexts? Sql Server just has T-Sql where Oracle has both SQL Plus and PL-SQL. If you do a query and nothing else, it's hard to tell them apart. But lets say you need to do some preprocessing or execute a different query based on some condition. In T-Sql it was no problem - it doesn't matter where the select happens - after an "if", inside of a proc, it doesn't matter - it all works exactly the same and the .net setup doesn't change. But not in Oracle - logic is only available in PL-SQL (anonymous blocks or procs), yet you can't return a recordset from pl-sql without a refcursor. This makes debugging your sql code harder and puts you the place of making more changes that you wouldn't need to make for Sql Server.
Tooling
Once you get out of the programming space, you can make the other usual comparisons. Sql server seems to be cheaper, but I've also heard that nobody pays full price for Oracle so maybe there is some wiggle room.
Sql Server also won in the tooling section. Profiler is much simpler to TKProf and oracle traces can't see variable values. The execution plan visualizer in Sql Server Management Studio was also leagues ahead IMO of both the oracle web-based tools and those in Sql Developer.
Definitive Oracle Wins
Oracle does seem to "win" in two places. One is clustering. Oracle's RAC is true hot/hot load balancing for the same DB, which Sql Server does not have at this time. I'm on the fence with the value of this considering that with either product, you need to size resources based on failure. In other words, is there really a difference between a Sql Server running at 80% capacity and two RAC'd servers running at 40%? Does it matter when the disk subsystem is the more likely bottle neck anyway. Still, I'm sure it has it's advantages in some cases.
The other Oracle victory is a mindset victory and not really technological. Perhaps this is anecdotal, but Oracle installations always seem to be taken more seriously. Rarely will you find Oracle without a DBA, but it seems to happen all the time with Sql Server. The consequence is that Oracle is better maintained and managed than its Sql Server equivalent. I honestly think this stems from Oracle is actually harder to admin. But in the end, that widens the gap that a poor server admin must cross before absorbing another system and it ends up making him think twice for Oracle.
DevOps Redefines The Battle
So with all this, my previous stance was that while Sql Server was more friendly to development in a number of ways, it wasn't enough to overcome the current advantages of our solid Oracle installation vs the hodgepodge of Sql Servers we had.
But that changed this week.
We are on a big push to create an agile environment and with that, a DevOps mindset in our post-coding procedures. Our builds are becoming single-click PowerShell scripts and our front-end deployments are being done to a purposefully built environment set via Octopus Deploy. And this week I started tackling the database side of that.
DevOps Requirements For Databases
Cumulative Deployments
I started with some criteria for database deployments. First they should be cumulative, at least to a major revision. The development environment might get 50 deployments, QA might get 5, and production might get one. While it might seem easier to simply roll everything, what about the case when a development deployment accidently drops a column. Do you make a second deployment to try to repair that if it's even possible? Is it worth the risk to run such a thing in production? I think it's safer just to remove that from next "good" version of the deployment.
Repetitive Application
So with that, the idea of idempotence - repetitive application without change beyond the initial deployment - is also a requirement. In order to apply roughly the same patch over and over to the same database, the installation process must have some intelligence and/or be written with this in mind. Sometimes it means writing a merge statement rather than an insert statement. But other times we need to only conditionally apply the update.
Oracle Falls Behind
SQL Plus Doesn't Do Logic
In T-SQL, this is relatively simple. A simple “if column exists” is all you need before your “alter table add column” statement. But the SQLPlus/PLSQL conundrum complicates this problem. There is no “if” statement in SQLPlus – which is the context in which DDL is executed. The only way to do this is to convert the statement into a string and execute it with “execute immediate” inside of PLSQL anonymous block or procedure. This isn’t unsurmountable, but we now have to escape single quotes and we’ve lost syntax highlighting and error checking.
A Word on DB Migration Frameworks
I considered a database migration framework to help here. I gave Flyway an honest attempt but I ran into too many issues: It only ran the first statement in a whole sql script and it didn’t support Oracle specific statements (it does now, but only in the paid versions). I still plan on trying Liquibase, but in both cases the bar for ROI is high; this is not THAT big of a problem in my organization – database developers have little problem putting together a script that a DBA generally doesn’t mind executing. It’s only the occasional problem we are trying to avoid and only a few hours a week that we are trying to save. If I introduce something with a bit of a learning curve but with marginal improvement or its own set of problems, it’s not going to deliver any value to the business.
No Transactions - The Final Straw
So it is here I just created a simple checkpoint table to track executed changes and a simple PowerShell function to check and set that checkpoint. It worked well enough but testing illuminated the largest strike against Oracle.
Oracle, unlike Sql Server does not support DDL within a transaction. If an active transaction is in progress when DDL is encountered it is automatically committed. There is no way to group a bunch of DDL changes together. To get around this I was dropping and recreating the schema in order to run new versions of the script after I fixed bugs. But dropping an entire schema is not really an option in many of our legacy systems. They have too much data to restore each time. But even if that wasn’t the case in dev or QA, what about production? Isn’t it a huge advantage to wrap something in a transaction and know everything will be as it was if it fails rather than running for a restore?
Liquibase and Flyway both have options to write undo scripts. But frankly, I can’t see how this helps. Undo for individual migrations still might leave you with a half applied major revision since it’s only going to undo the last script that failed, and I fail to see how you can structure a major update into a single file (at least in Flyway’s case). Even if that’s not the case, we’re right back to the SQLPlus/PLSQL logic problem above as your undo is certainly going to need to investigate what it actually needs to undo.
Oracle has a potential fix in the form of “Flashback”. But at this time you can only do this at the database level and not at the schema level. We have hundreds of schemas in two Oracle databases – so data loss could be a consequence of this type of restore.
Closing
So while once I could have turned a blind eye in favor of slightly better upkeep, DevOps has brought me to a point where I’m reconsidering my stance in this battle. There is too much running on Oracle here to consider replacement, but I wonder if we should consider growing our Sql Server environment and begin giving it preferential treatment.
I still owe Oracle the chance for some work arounds. Maybe we can make schema-level backup and restore something that is easy to do from within the deployment, but again this isn’t a translation problem. Oracle is definitely missing the ability to run DDL in a transaction and there is no simple solution to work around the issues it causes.