Guest Post: Your Database Deserves Better
Databases play a strange role in software development. The vast majority of complex applications use one, but many developers don’t pay much attention to it in their daily work. They treat it like the SSD in their laptop and simply assume that it will reliably save the data they want to persist while they are busy figuring out the right business logic, making the UI pretty, writing new automated tests, or hunting down a troublesome memory leak. The database is often not seen as part of the application the team is building and is therefore only considered in passing when making design or implementation decisions. Commonly, it is also a source of frustration, either because it does not fit with the rest of the system (which is where things like object-relational mappers come into play) or because it makes automated testing harder. On top of that, the query language used by the database management system (DBMS) is often very different from the programming language used in the rest of the application, which further widens the gap. All these factors combine to make developers pay minimal attention to the database once it is up and running.
This mindset is understandable: developers have a lot on their plate, and since they don’t actively develop the database themselves – at least in the vast majority of cases; if you are working on a DBMS right now, thank you for your service – they treat it as something someone else must have set up correctly. This is similar to build tools, which are also critical parts of a software project that most team members happily ignore. Interestingly, this mindset can get you pretty far, as it is difficult to get the database setup so completely wrong that everyone immediately notices the problem.
Unfortunately, you can only ignore the database until the day comes when you can’t anymore. If your system is successful and is used by a lot of people, it is only a matter of time until some of the problems of a bad database setup start showing themselves. In most cases, performance will be the first thing to suffer, but you could also get a very unpleasant call from your controller asking you why the database suddenly became a major cost factor (this is especially likely if you are using a database as a service). And, of course, there are risks of downtime, data loss and data corruption, which can be fatal for an otherwise thriving application.
As a result, this article focuses on this often-overlooked part of your system and gives you some questions to ask yourself about your database.
Do you know the contract with your database?
DBMS developers pour a lot of thought into their code to ensure good performance and data consistency. However, all these features rely on certain assumptions and can only offer limited guarantees since it is impossible to create fool-proof, invariably fast systems. These assumptions and limitations are the contract that users of a database have to be aware of. In practice, however, my experience is that few developers know much about how their database operates and which guarantees for speed and consistency it offers under which circumstances.
As an example, consider making a simple write to the database. Virtually every DBMS offers guarantees to ensure data consistency, but they all have limits. In almost all commercial applications, the database and the application form a distributed system, so even if the write succeeds, the application might never know, since the acknowledgement packet could get dropped by the network on the way back. Or you could be unlucky, and the database goes down the very moment it processes the write, and the change is lost. Your application has to be prepared to deal with these situations, as they are unlikely to occur for any single write but unavoidable in the long run.
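One common way to prepare for a lost acknowledgement is to make the write idempotent, so that blindly retrying it cannot create duplicates. Here is a minimal sketch of that idea in Python with SQLite; the orders table, its columns and the retry count are purely hypothetical and only meant to illustrate the pattern.

```python
import sqlite3
import uuid

# Minimal sketch: make a write safe to retry by attaching a client-generated
# idempotency key. If the first attempt succeeded but the acknowledgement was
# lost, retrying simply hits the primary-key constraint and changes nothing.
conn = sqlite3.connect("orders.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        request_id TEXT PRIMARY KEY,   -- client-generated idempotency key
        user_id    INTEGER NOT NULL,
        amount     REAL NOT NULL
    )
""")

def record_order(user_id, amount, retries=3):
    request_id = str(uuid.uuid4())  # generated once, reused on every retry
    for attempt in range(retries):
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT OR IGNORE INTO orders (request_id, user_id, amount) "
                    "VALUES (?, ?, ?)",
                    (request_id, user_id, amount),
                )
            return request_id
        except sqlite3.OperationalError:
            # e.g. the database was briefly unavailable; safe to try again
            if attempt == retries - 1:
                raise

record_order(user_id=42, amount=19.99)
```

The important part is that the key is generated on the client before the first attempt, so every retry refers to the same logical write.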
Unfortunately, many applications do not properly account for this. Developers used to relational databases in particular might think that transactions are enough to keep their data consistent – they are not, as distributed transactions are very difficult to get right – and fail to consider the relevant edge cases. Awareness is usually higher among users of the more modern NoSQL databases, as these put less emphasis on transactions and make their constraints more transparent. Still, vigilance is required.
How bad these effects are strongly depends on the kind of data you are dealing with. It might be acceptable to lose a single data point of a sensor in an internet of things application that creates new data every second anyway, but it spells disaster to corrupt user files or mess up financial transactions. The stakes for some systems are extremely high, and a strong knowledge of the guarantees offered by your database is essential.
Are you correctly optimizing your database usage?
Despite the general lack of interest outlined in the previous sections, code dealing with the database has a strange tendency to be prematurely optimized. The same developers who don’t think much about the database in their daily work often seem to care deeply about its performance and try to squeeze the most speed out of it in the rare cases where they are actively writing queries or other code that interacts with it. This is laudable in principle; however, it is very difficult to fine-tune database performance without a deep understanding of the DBMS and of the application’s usage patterns. As both are often lacking, the rules used for these optimizations tend to be too simplistic, leading to less-than-ideal results.
For example, because many developers know that going over the network to the database is less efficient than just working with the main memory, CPU and registers of the machine the application runs on, they start to avoid calls to the database or to group them together so the network hop happens less often. This idea is correct in principle but has to be balanced against other factors – like whether this network hop really makes a difference for total system performance – otherwise avoiding the hop can have unintended consequences. This pattern often leads to caching data on the application side, but as regular readers of this blog know, caching should be avoided to prevent consistency issues.
I have also seen developers consciously read too much data from the database to avoid future calls to it. This is most likely a mistake since sending the additional data to the application probably consumes more time (and certainly more memory) than an additional call to the database would. One developer making this optimization was not fazed by this criticism, though, and argued that the additional data would also make the application more future-proof as it might be needed by future features (as far as I know, this has not happened yet).
There is also the common pattern of grouping calls together, i.e., reading the data for 100 users in one query rather than executing 100 queries for the same number of users. In general, this is good practice as it avoids the overhead of repeated network hops to the database; however, it is not a panacea and should not be applied blindly. If the multi-user query is much more complex than the single-user query, you should carefully consider whether the additional performance is worth the harder-to-maintain code. Performance is not everything, and there are often trade-offs to make. You should also think about which aspect of performance (speed, memory usage, latency, throughput, load on the database, etc.) you are trying to optimize. Not having a clear understanding of what you are trying to achieve is a sure way to get bad results.
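To make the trade-off concrete, here is a small sketch in Python with SQLite that contrasts 100 single-row queries with one batched query that also reads only the columns it needs; the users table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, name, email, bio) VALUES (?, ?, ?, ?)",
    [(i, f"user{i}", f"user{i}@example.com", "...") for i in range(1, 201)],
)

user_ids = list(range(1, 101))

# Naive version: one round trip per user (100 separate queries).
names_slow = [
    conn.execute("SELECT name, email FROM users WHERE id = ?", (uid,)).fetchone()
    for uid in user_ids
]

# Batched version: a single query for all 100 users,
# reading only the columns the feature actually needs.
placeholders = ", ".join("?" * len(user_ids))
names_fast = conn.execute(
    f"SELECT name, email FROM users WHERE id IN ({placeholders})", user_ids
).fetchall()
```

With an in-memory SQLite database the difference is negligible; over a real network connection, the batched version saves 99 round trips, which is exactly the kind of thing a benchmark should confirm before you commit to the more complex query.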
Another example of premature optimization was a colleague who insisted that we needed a database index for every column of a table because queries would otherwise be slow, and who was very frustrated that our DBMS only supported fifty indexes per table (a limit the vendor imposed because update performance would otherwise suffer). Luckily, no user ever complained, and the read speed of the database was sufficient even for the columns that did not have an index. This is a common problem when trying to optimize without a reliable benchmark showing the bottleneck: the chance is very high that you will tweak the wrong part of your system.
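A reliable benchmark does not have to be elaborate. The following sketch, again using Python and SQLite with a made-up readings table, simply times the same query before and after adding an index on the filtered column – usually enough to tell whether an index actually earns its keep.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
)
# Populate the table with enough rows for the difference to be measurable.
conn.executemany(
    "INSERT INTO readings (sensor, value) VALUES (?, ?)",
    [(f"sensor{i % 500}", float(i)) for i in range(200_000)],
)

def benchmark(label):
    start = time.perf_counter()
    for _ in range(100):
        conn.execute(
            "SELECT value FROM readings WHERE sensor = ?", ("sensor42",)
        ).fetchall()
    print(f"{label}: {time.perf_counter() - start:.3f}s")

benchmark("without index")
conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor)")
benchmark("with index")
```

If the numbers barely move for your real queries, the index is probably not worth the write overhead it adds.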
The most stunning example of database optimization gone wrong was a colleague who configured a column as a variable-length string (it held text that could be chosen freely by the user, but the UI to enter it only supported 20 characters) instead of a fixed-length character field, because he wanted to save some storage space in case the user entered fewer than 20 characters (please note that this was in 2010, not 1980). While a nice idea, this clashed badly with the SQL dialect in use, which did not support WHERE conditions on string fields. As a result, if you were looking for a row with a specific value in this column – which happened frequently – you either had to take an indirection through another table or load all the records of the table into your application and then loop through them to find the correct value. The latter was actually done in one of the UI backends but was so inefficient that finding the right entry took over a minute once the table exceeded 10,000 rows. Eventually, the column type was changed to a fixed length, which solved the issue, but by then this well-intentioned premature optimization had caused years of headaches.
Finally, there are applications where the database is considered a single point of failure and a scalability problem. In these scenarios, developers will try to avoid putting load on the database, using the common patterns of grouping calls, caching as much data as possible on the application side and sometimes even sorting through data on the application server, even though the database would be better suited for this task. If this happens in your team, something is seriously wrong, and you should have a long talk with your architect or manager about the state of the system.
In this situation, the first step is to get reliable data on the real load on the database. This will help you find the expensive operations to focus your optimization efforts on and help you get a realistic picture of the situation. In my experience, the fears of these teams are often overblown since databases are hardy beasts. It is true that some of them are difficult to scale (especially if they only provide vertical scaling), but these problems usually take time to manifest and cannot be solved by micro-optimization on the application side. If your database really is the problem, you most likely need to adjust your architecture or use a different DBMS.
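What that data looks like depends on your DBMS. As one example, if you happen to run PostgreSQL, the pg_stat_statements extension already records which statements consume the most time; the sketch below pulls the top offenders so you know where optimization would actually pay off. The connection string is hypothetical, and the extension has to be enabled on your server for this to work.

```python
import psycopg2  # assumes PostgreSQL with the pg_stat_statements extension enabled

conn = psycopg2.connect("dbname=app user=app_readonly")  # hypothetical connection details
with conn.cursor() as cur:
    # Top ten statements by total time spent in the database.
    # (On PostgreSQL versions before 13 the column is total_time instead of total_exec_time.)
    cur.execute("""
        SELECT query, calls, total_exec_time
        FROM pg_stat_statements
        ORDER BY total_exec_time DESC
        LIMIT 10
    """)
    for query, calls, total_ms in cur.fetchall():
        print(f"{total_ms:10.1f} ms  {calls:8d} calls  {query[:80]}")
```

Other systems have comparable tooling (slow-query logs, built-in monitoring dashboards); the point is to let measured load, not gut feeling, decide where to spend your effort.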
In summary, be sure not to prematurely optimize your database usage. There are very few rules that are always true – the only one I would state with confidence is that you should read only the data you actually need – while most others depend on the concrete scenario. Check the documentation of your database for best practices, think about what you want to optimize for, and benchmark before and after making relevant changes. This should keep your efforts focused and help you avoid unpleasant surprises.
Are you considering all options when choosing a database?
The preceding sections dealt with using an existing database correctly, but choosing the right one for an application is just as crucial. Unfortunately, this decision is often made prematurely, with many decision-makers always reaching for the same technology regardless of the requirements of the application. There also still seems to be a strong bias towards traditional relational databases.
Relational databases are very capable and handle a lot of use cases very well, but they are not the best solution for every scenario. I have seen quite a few applications over the years that use patterns like optional columns or columns with varying data types that would have been easier to realize with a NoSQL database like MongoDB, but this option was never considered during the design process. Questions like performance, scalability and cost are considered even more rarely, even though they will almost invariably become important should the application be a success.
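To illustrate the kind of data that fits a document store more naturally than a table full of nullable columns, here is a small sketch using the pymongo driver against a local MongoDB instance; the database, collection and field names are made up for the example.

```python
from pymongo import MongoClient  # assumes a locally running MongoDB and the pymongo driver

client = MongoClient("mongodb://localhost:27017")
products = client["shop"]["products"]  # hypothetical database and collection names

# Documents in the same collection can carry different optional fields,
# which in a relational schema would become sparsely filled nullable columns.
products.insert_many([
    {"name": "T-shirt", "price": 19.99, "sizes": ["S", "M", "L"]},
    {"name": "E-book",  "price": 9.99,  "download_url": "https://example.com/book.pdf"},
])

# Querying still works uniformly across the differing document shapes.
for doc in products.find({"price": {"$lt": 15}}):
    print(doc["name"])
```

None of this makes a document store automatically the right choice; it simply shows why such options deserve a place in the design discussion.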
The most extreme example I have seen was a microservice whose only job was to record events and store them for later processing. The database chosen for this service was an in-memory database with a strong focus on analytics (the reason given was that it was supposedly the cheapest one to operate). Unsurprisingly, nobody ever ran any analytics on these events, the lower latency of the in-memory database was not needed, and the cost of operating it quickly became too high for such a small service. As a result, the microservice was changed to use a shared database of the same type, which violated a core principle of microservice architecture but brought the costs down. To this day, it does not use most of the database’s features.
To avoid costly architecture mistakes like this, you should consider all options when choosing a persistence layer for your application. Relational databases, NoSQL databases, more modern variants like SurrealDB, or even non-database solutions like persistent event queues and key-value stores can all be great choices. Every application is unique and deserves the best solution for its data. Also, make sure to re-evaluate the choice once the application has started to grow; the best database setup often only reveals itself after the initial shipment of the application.
Conclusion
Databases are often not considered enough when designing or developing a system. Focusing on them a bit more can help avoid many problems.
If you liked this blog post, please share it with somebody. You should also have a look at the many blog posts from the main author. As for me, you can find me on Twitter/X.