Cost analysis and cost-effectiveness calculations are hardly ever performed for databases, even though it makes a lot of sense to do so. The easiest way to gain insight is a total cost of ownership (TCO) calculation. You might have a theory about your greatest cost factor, but do you really know for sure?
Why would you perform such a TCO analysis? As with most research: to prove your theory about the highest cost factor wrong. A TCO calculation is a great tool for a precise cost analysis, and it may give you some surprising insights!
Cost Factors for Databases
The cost factors for databases can be divided into two separate groups: capital expenses (CAPEX) and operational expenses (OPEX). Both cost factors are part of the infrastructure lifecycle.
Capital expenses are the costs you pay upfront during the acquire phase: hardware purchases, (non-recurring) licensing costs and any other one-time costs such as replacement parts. These expenses are a constant factor in the TCO and are spread out over the lifetime of your database servers. Most of these costs occur in the acquire phase; replacement parts, obviously, fall in the maintenance phase.
Operational expenses are the costs of running the database servers. As these costs are recurring, you pay them at a regular (e.g., yearly) interval, mostly during the maintenance phase. They include datacenter/rack rental, power consumption, network usage and operational costs such as (remote) hands and personnel. The last item covers sysops, DBAs and everything needed to facilitate them, like desks, office space and training. Since these expenses recur, they keep growing over the lifetime of your database servers: the longer you operate the servers, the higher the operational expenses (OPEX) will be.
This means that the longer you use your database servers, the more the balance between CAPEX and OPEX shifts toward OPEX. The one-time hardware purchase may seem like a high upfront cost, but given that you will probably use the hardware for more than three years, the investment is justified.
For cloud hosting, the calculation is similar. However, since you don't purchase hardware upfront, the CAPEX will be a lot lower, while the recurring monthly fees make the OPEX higher. In some cases your cloud provider may charge some (setup) costs upfront, and these should be treated as CAPEX.
Example Calculation for Hardware
In this example we will make a calculation for a small company (under 100 employees) that hosts on its own hardware in rented racks in a data center. The company has two dedicated sysops and one experienced DBA (1-4 years); the DBA manages around 20 databases and each sysop around 200 hosts. The average DBA salary at this experience level is $65,000, so the annual support cost per database is $3,250. The sysops average around $50,000 for the same experience, which comes down to $250 per host per year. The sysops are also the people who manage the data center. We will not factor in facilitation costs, as that would overcomplicate things.
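The per-database and per-host support costs follow directly from dividing salary by the number of systems managed; a quick sketch using the figures above:

```python
# Support cost per managed system, using the salary figures from the example.
dba_salary = 65_000       # experienced DBA (1-4 years)
sysop_salary = 50_000     # sysop with similar experience
dba_databases = 20        # databases managed by the DBA
sysop_hosts = 200         # hosts managed by each sysop

dba_cost_per_db = dba_salary / dba_databases      # $3,250 per database per year
sysop_cost_per_host = sysop_salary / sysop_hosts  # $250 per host per year

print(dba_cost_per_db, sysop_cost_per_host)
```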
For our example cluster, we will use a three-node MySQL replication setup: one master and two slave nodes. The hardware is based on the Dell R730 with 64GB of memory and six 400GB SSDs, as this is a very popular model for this purpose. The price of an R730 in this configuration is currently $7,655.
Rental cost of a full rack is nowadays around $350 per month, so colocation costs roughly $8 per U per month. Since the R730 is a 2U unit, the total for our three database servers comes to $48 per month.
Modern colocation contracts bill power consumption separately, as power is a variable factor. Prices for power with colocation vary a lot, but currently average around $0.20 per kWh. An average database server consumes around 200 watts, which results in roughly 144 kWh per month per server. For our three database servers this comes to about $86 per month.
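The power figure can be verified from the same inputs (200 W average draw, $0.20 per kWh, a 30-day month):

```python
# Monthly power cost for the three database servers in the example.
watts = 200                                       # average draw per server
hours_per_month = 24 * 30                         # 720 hours in a 30-day month
kwh_per_month = watts * hours_per_month / 1000    # 144 kWh per server
price_per_kwh = 0.20
servers = 3

monthly_power_cost = kwh_per_month * price_per_kwh * servers  # ~$86 per month
print(round(monthly_power_cost, 2))
```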
This results in the following TCO:
| Cost item | CAPEX | OPEX (per year) | TCO (3 years) |
|---|---|---|---|
| Hardware (3x Dell R730) | $22,965 | - | $22,965 |
| Colocation (3x 2U) | - | $576 | $1,728 |
| Power cost (200W) | - | $1,032 | $3,096 |
| Professional support (DBA / Sysop) | - | $10,500 | $31,500 |
| Total | $22,965 | $12,108 | $59,289 |
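The table can be reproduced from the numbers in the text; a sketch of the full calculation over a three-year lifetime:

```python
# Hardware TCO over 3 years, using the figures from the example.
years = 3
servers = 3

capex = servers * 7_655            # Dell R730 purchase price per server
support = 3 * 3_250 + 3 * 250      # DBA per database + sysop per host, per year
colocation = 48 * 12               # $48/month for three 2U servers
power = 86 * 12                    # ~$86/month for three servers

opex_per_year = support + colocation + power
tco = capex + years * opex_per_year
print(capex, opex_per_year, tco)
```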
There are a couple of conclusions we can draw from this calculation. The costs for colocation, power and replacement parts are negligible compared to the other cost factors. Also, over the lifetime of a database server, the support costs make up more than half of the total, and are far higher than the original purchase price of the servers.
Example Calculation Cloud Hosting
In this example we will make a calculation for a company that hosts in the cloud. To compare fairly, we will again make use of a three node MySQL replication setup on EC2. Amazon provides a nice TCO calculator for these purposes, so we made use of this as input for the calculations below.
To make the database servers comparable, we chose the i3.2xlarge, which (currently) has 8 vCPUs, 61GB of RAM and 1,900GB of SSD storage. It currently costs $0.624 per hour, which is slightly below $15 per day, or about $5,466 per year.
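The yearly figure follows directly from the hourly rate:

```python
# Yearly on-demand cost of one i3.2xlarge at the rate quoted above.
hourly = 0.624            # price per hour at the time of writing
daily = hourly * 24       # just under $15 per day
yearly = daily * 365      # about $5,466 per year

print(round(daily, 2), round(yearly, 2))
```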
In the cloud, upfront investments (CAPEX) are normally not necessary. This is true in most cases, unless you use reserved instances, as in AWS. With reserved instances, you pay Amazon upfront to reserve capacity that you can then use at will; that upfront payment would count as CAPEX. In our calculation, we will not use reserved instances. Next to the lower CAPEX, our OPEX should also be lower, since our sysops don't have to travel to the data center or install these servers.
This results in the following TCO:
| Cost item | CAPEX | OPEX (per year) | TCO (3 years) |
|---|---|---|---|
| AWS 3x i3.2xlarge | - | $16,399 | $49,197 |
| Professional support (DBA only) | - | $9,750 | $29,250 |
| Total | - | $26,149 | $78,447 |
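Putting the two examples side by side over the same three-year period, using only the figures from the text:

```python
# Three-year TCO comparison: own hardware vs. AWS, figures from the examples.
years = 3

# Hardware example: upfront purchase plus yearly operational costs.
hw_capex = 3 * 7_655
hw_opex = 10_500 + 1_032 + 576    # support + power + colocation, per year
hw_tco = hw_capex + years * hw_opex

# Cloud example: no upfront cost, everything is recurring.
cloud_opex = 9_750 + 16_399       # DBA support + 3x i3.2xlarge, per year
cloud_tco = years * cloud_opex

print(hw_tco, cloud_tco)          # cloud comes out higher over 3 years
```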
Even though we have eliminated our upfront costs and capital investments (CAPEX), the OPEX is very high due to the premium we pay for high-performance instances in AWS. Over a three-year period, your TCO will be higher than with your own hardware.
OPEX has a Large Influence
As you can see from these calculations, the influence of the operational costs (OPEX) over the lifetime of the servers is far greater than the initial large CAPEX investment. This is mostly due to running (and owning) these servers for multiple years.
In the case of owning your own hardware, we have shown that the operational costs even outweigh the initial purchase price of the servers. In the AWS example, the total cost of "owning" these servers is higher still. That is the premium paid for flexibility: in a cloud environment you are free to upgrade to a newer instance type every year.
In both examples, the professional support for running these databases is a relatively large cost. The sysops look far more efficient, managing around 200 hosts each, but they don't have to take on the additional tasks that the DBA performs. If only you could make the DBA more efficient.
Making the DBA more Efficient
Luckily, there are a few ways to make your DBA more efficient and able to handle more database servers: either relieve the DBA of various tasks (others perform them) or have the DBA perform fewer tasks through automation. The low-hanging fruit is to automate the most time-consuming or most error-prone tasks.
The most time-consuming DBA tasks are provisioning, deployments, performance tuning, troubleshooting, backups and scaling clusters. Provisioning and software installation are repetitive tasks that can easily be automated, just like copying data and setting up replication when scaling out with read slaves. Backups can be automated as well, and with the help of a backup manager, so can the restore process.
Among the most error-prone actions are setting up replication, failover and schema management. In these tasks a single typo can have disastrous consequences, where the only way to recover is to restore from an earlier backup.
Automating repetitive tasks and chores is tedious but useful work. It takes time to automate each and every one of these tasks, and your DBA will most certainly be busier with automation than with the other (daily) tasks. This automation may actually interfere with the normal day-to-day work, especially if the DBA isn't the developer type and struggles with the automation jobs. Wouldn't it be more productive to offer the DBA a ready-made toolset to work with instead? Or perhaps to give the sysops a way to perform these tasks themselves?
Do you even Need a DBA?
Not all companies have full-time DBAs, at least not those with a small number of databases. A DBA is a very specialized role, in which a single person is dedicated to all database-related tasks. It requires specialized knowledge of specific hardware, specific software, operating systems, and in-depth knowledge of SQL. Putting such a specialist on a small number of databases means the person will not have a lot to do, while still costing money.
A sysop (or system administrator) is more of a generalist and has to do a bit of everything: hardware, operating systems, networking, security, applications, databases and storage. They may have specific knowledge of one or more of these areas, but they can't be specialized in all of them. The knowledge gap becomes more apparent with distributed setups (replication or clusters) and the need for high availability.
As shown in the example calculations, the difference in salary reflects this picture as well. A DBA costs more than a sysop and is harder to find, as there aren't many of them around. That means you will probably have to do without a DBA. The challenge for the sysop is then to find enough time to keep the databases performing well, troubleshoot any issues, monitor for anomalies, maintain high availability, ensure data integrity and make sure data is backed up (and backups are verified to be good).
ClusterControl Saving Costs
In ClusterControl the full database lifecycle has already been automated for the most popular open source databases. This means the DBA no longer necessarily has to automate his or her own job, as most of the tasks have already been automated in ClusterControl. Reliability increases as well: the automation in ClusterControl has been tested through and through, while whatever the DBA produces would only be tested directly in production.
Implementing a complete lifecycle management tool like ClusterControl means the DBA can spend more time on useful things and manage more database servers. It can also mean that people with less database knowledge can perform the same tasks.