If All You Have Is a Database, Everything Looks Like a Nail
Databases are fabulous resources. Of course, we’ll abuse them in ways that cause challenges.
Back in 1978 when I took my first job programming, it was to build a database system. At age 22, I didn’t know what that was but it didn’t matter. I needed the job. That startup company eventually became a shutdown company and in 1982, I went to work at Tandem Computers working on the infrastructure that a few years later became Tandem’s NonStop SQL.
At Tandem, I learned that a database supporting a company’s business was a complex and sophisticate creature. Not only did it need to provide access to the customer’s data, it needed online DDL, high availability, archive, offsite tape rotation and management, online backups, access control, monitoring and operational support, and even replication to a remote data center. I called this last feature geologic fault tolerance.
Databases were big, powerful, and expensive. You were lucky to have one and only the largest conglomerate company could afford more than one. Furthermore, the large mainframe computer system and its database were SO expensive, it was quite practical to hire dozens of programmers to build an application that did exactly what you want.
This led to a few recurring patterns at most companies owning shiny new expensive databases:
Every application wanted to share the same database, and
Lots of flotsam and jetsam would get tossed into the database.
The Tragedy of the Commons remains endemic to database deployments.
This post considers the causes and consequences of:
Sharing databases across applications that ought to be encapsulating different data, and
Stuffing non-relational immutable documents, photos, audio files, movies, and other inanimate objects into your innocent and unprepared relational database.
Neither of these mattered much unless your company was successful.
Monogamy versus the Temptation to Share
Once your company acquired a database, one or more DBAs (Data Base Administrators), and a setup in a datacenter, each new application at your company wanted to use your database, too! This was reasonable since the company really couldn’t afford ANOTHER mainframe. You really should share the one you have.
Soon, the database was surrounded by High Priests of Database who constantly cultivated and curated the schema for the tables and more as new functionality and new applications were added to the shared system. As the database became more central to the operations of the company, it was mission critical. Greater and greater care was placed on all the aspects needed to keep that database running 24 by 7 or as close as possible. When the database was down, the company was down and that was not a good thing.
All the care and feeding of corporate data was soon focused on the core enterprise database. Between the database providers and the enterprise database owners, the stuff needed to ensure the safety of the data got pretty damned good.
At first, each application used its very own tables and made sure to mind its own business. Eventually, it just seemed easier to read another application’s tables than to negotiate how to invoke the other app using messages or interfaces. What’s the harm?
Soon, there was an established trend that increased the entropy and intertwining of applications and tables. It became common to have transactional updates across tables for different apps.
“Hygiene” was interpreted as “Hi, Jean!” as you asked your database administrator for access to another application’s table.
Crap! Where Can I Keep This Other Stuff?
When other kinds of data needed a home, the owners of that data quite reasonably want it to receive the same loving care and protection that was offered by the database.
These precious artifacts were sometimes large in size like scans of documents, photographs, and occasionally even movies! Since SQL can accept blobs, they could stash lots of data in dusty corners of the database. This had big advantages for the people stashing the stuff and big disadvantages for the folks operating the database.
It’s great for the consumers of the database! Easy and clean storage of your random data with backup, high availability, and storage capacity just magically working. In addition, the data can be updated transactionally with the other stuff you have in the database. No worries and the best care you can get for your data! Life is GREAT!
It’s a nightmare for the administrators of the database! All of a sudden, your database starts getting fatter. Soon, you are purchasing more of the most expensive storage solutions allowable by law. Immutable data like photos and images of documents are coexisting on the SANs (Storage Area Networks) supporting your database. This is inefficient because SANs are extremely expensive high-end hardware designed for update in place. Using them as a repository for immutable stuff is like using a finely crafted Rolex watch as a paperweight.
Extricating these precious and bulky data items from the database is not as easy as it may seem. The immutable nature of scanned paper documents and other media is of tremendous help. You can allocate a 128-bit UUID for the document and store the document somewhere else. This identifier can then be kept in the database associated with some relational record.
Quickly, you realize that while this can offload the storage capacity to a less expensive storage medium, there are challenges:
Updates to the system are non-transactional. In general, you need to update the new unified system by:
Step-1: In transaction-T1 on the database, update the relational system saying that you intend to insert an object with UUID-X into a column in a table understood by the application. Another column manages the state of the object in store holding immutable objects.
Step-2: You copy the immutable object into your new store with UUID-X.
Step-3: Finally, in transaction-T2, update state of the object in the additional column so you know the object can be consumed by the application.
What could be wrong with that?
Oops… Later you learn that a failure in the multi-step insert may cause problems. A failure after step-1 and before step-2 causes abandoned and incomplete insertion as seen in the application when it looks at the table. A failure after step-2 and before step-3 leaves abandoned and useless objects in the immutable store.
To deal with this, you need a side table of in-progress inserts and deletes. At least, this can be transactional with the insert into the application’s table.
Next, you realize you’d better timestamp the ongoing work, too, so you can notice when something is old enough to warrant cleanup due to a failed insert. Now, your system is reasonably stable and you have a team party.
A few months or years later, the immutable store for the blobs is reaching capacity. Now, you need another crash program to update all the tables in the relational system to include a store ID. That allows you to keep blobs in many stores.
That’s when you realize you don’t know WHERE all the tables in the database are that hold immutable blobs. Applications have been adding blobs to the immutable store without coordinating that effort. Crap!
A new level of indirection must be added pushing management of the blobs into dedicated tables and encapsulating updates into specialized modules. Mostly, applications abide by this but not always. Old usages referencing the first immutable store must be grandfathered into the system since it’s nigh on impossible to track them down and kill them.
Using this new level of indirection, you can more easily manage replication across datacenters and even migration of the immutable blob store to a new datacenter.
Cleaving Systems Apart
I’m quite fond of the word “cleave” since it is its own antonym. You can’t be ALL bad if you’re self-contradicting!
One of the BIGGEST problems in software engineering is disentangling the big ball of mud. These intertwined and interdependent massive pieces of code can sometimes contain tens of millions of lines of code being curated by many thousands of software engineers.
Unfortunately, many times the social and economic pressures of sharing the precious database caused the creation of your very own big ball of mud. Once the application culture encouraged unfettered access to any table, it’s VERY hard to change. It’s hard culturally and its hard to do the software engineering to accomplish the change.
To have a prayer to change this, it must be incremental. You must drive both carrots and sticks. This can evolve into new ways to accomplish the disentanglement by:
Creating plumbing to make asynchronous work across applications easy, and
Snipping the access to the tables across applications.
Sometimes people stage read-only copies of tables. These are asynchronously updated from the authoritative owning application. Other applications then “own” the read-only copy in their application set of tables.
In this way, sometimes there’s success over many years of investment to disentangle.
Typically, these efforts are driven by the armchair enterprise architects who haven’t coded in decades, spout philosophical adages, and draw pretty pictures. Oh… wait… that’s ME! I haven’t coded since 1988, never shut up, and love to have my laptop on my lap in a big easy chair. And I DO draw pretty pictures! Sheesh!
Many times these architectural projects are hobbled by the natural cultural clash between these ivory tower thinkers (like me) and the folks that need to do the real work. Most successful efforts to disentangle huge systems are only partially successful. When they are, it’s because they are led with lots of attention to the cultural challenges, fair and thoughtful compensation for success when accomplishing goals, prioritizing disentanglement against new features, and avoiding them versus us team dynamics. That requires true leadership. I wish you the best of luck.
In some ways, it is MUCH easier to cleave apart a huge application than to cleave together two applications with disparate heritage. When you acquire another company, their application is based on a dramatically different corporate culture, different understandings of even the simplest concepts like “customer”, and a massive mess as their metadata is combined with your metadata.
Cleaving apart huge application is like straightening the streets of downtown Boston, notorious for having a city map created based on cow paths in the 1630s. Straightening the streets would have been easier without the skyscrapers in the way.
Maybe We’ll Just Make the Database Bigger!
So, heaven forbid your company is successful and your database keeps growing.
You’ve pulled out the immutable blobs and only had modest success disentangling the different applications. Hence, they still all need to be on the SAME database.
Scale up: You try very hard to scale up the database. Buying bigger and better hardware can go a long way.
Scale out: Next, you try to scale out the database hoping to use multiple computers in a cluster to get a bigger database.
Bigger comes in at least two dimensions. Is it more important to:
Hold more data? Or
Process more transactions?
Of course, your answer is Yes !
Applications are always developed hand-in-glove with their platform. In this case, a major part of that platform is the database. Moving an existing large application to a new database can be more traumatic than concurrently getting a new job, moving to a new town, divorcing your spouse, and marrying someone in Las Vegas you met the night before. It usually has about the same success rate.
There are many subtle aspects to your database that are SURE to impact many, many parts of your multi-million lines of code application. Some of these are:
Usage of non-standard database features. Every database vendor has cool and wonderful non-standard stuff. Most of the time, esoteric use of these has been deeply entrenched into the code base since they were added by a programmer now long gone.
Concurrency semantics. The subtleties of MVCC (Multi-Version Concurrency Control) versus Serializability versus Repeatable Reads versus Read Committed are not for the faint of heart. I guarantee they are very rarely good topics for discussion on a first date! I also guarantee that sliding different concurrency semantics under an existing large application is a fool’s errand.
Optimistic versus Pessimistic Concurrency Control. In optimistic concurrency control, the work of the transaction is done BEFORE ensuring you don’t bump into other transactional updates. If you do collide, the transaction is aborted. It is frequently (but not always) more efficient. In pessimistic concurrency control, resources are locked to ensure you don’t fail when completing. While typically more expensive, it can prevent floundering when optimistic concurrency repeatedly fails to commit due to conflicts. While the choice between optimistic and pessimistic does not actually affect the correctness of changes made by the application, changing the databases behavior away from the application’s expectations can EASILY cause so many performance problems the application is useless.
Taking an application from one set of expectations to another is like taking an aged grandparent in their 90s from one country to another one. It ain’t easy at best!
Furthermore, even if you take the same database from a centralized system to a distributed system, the application performance will almost certainly suffer in some ways while you may gain some scaling. Coordinating locks and concurrency will have different performance challenges and the transition will be a rough road at best. Tuning an application to a distributed database requires patience and insight.
Applications are deeply aware of the peculiarities and quirks of their database. Just getting one to run scaled out distributed database past a modest number of nodes is rarely easy and frequently impossible. Getting it to run fast with lower latency is even harder.
Databases are special and curious critters. I love them and definitely think they are a pain-in-the-back. Maybe even lower.
Having a lot of databases in your company is annoying and expensive, especially the challenges of operating them. The best advice I can give to a user of a database is to use it carefully and sparingly. If at all possible:
Manage blobs. Create a formalized mechanism to handle blobs, even if you keep them in the database in the short run and move them elsewhere later,
Isolate applications. Keep different applications away from other applications’ tables,
Use messaging across applications. Connect the applications with some form of asynchronous messaging even if it’s simply implemented as rows in a database table enqueued and dequeued in separate transactions. That, combined with isolating the applications can make the move to multiple databases practical.
Of course, that requires discipline. Discipline is something I am woefully short on so I’m not going to point fingers.
Do the virtues of a database intrinsically lead to these challenges using databases?
What aspects of a relational database make it so much more industrial strength and capable of supporting mission critical applications? Is this intrinsic to the relational model or is it simply a consequence of the large management and operational investments in database systems to date?
Are these technological challenges or social challenges? Is there a difference? How can you tell the difference?
Are the problems faced by large database installations simply a matter of cultural expectations?
Are these problems simply due to earlier (very reasonable) assumptions that have moved away from being applicable? Is this simply the way we do things around here that gets us into this pickle?