Common Mistakes Made When Building an On-Premises Private Cloud Infrastructure

Ashraf Sharif

A private cloud is a scalable hosting solution that provides agility and redundancy of IT infrastructure for your enterprise. Private cloud solutions are suitable for enterprises that have large compute and storage requirements, need to promptly respond to innovation, or have requirements related to control, security and compliance. 

Building an on-premises infrastructure is not easy, let alone building a cloud infrastructure on top of it. Cloud infrastructure offers a better approach to computing and storage resources, with better reliability, higher availability, dynamic scalability and resource elasticity. It is of course not easily built and comes with its own set of challenges and obstacles. In this blog post, we are going to touch on mistakes that commonly happen when building an on-premises private cloud infrastructure.

Weak Security

While moving to the cloud is an advantage, data security and integrity remain concerns for enterprises. This is why enterprises are keen to implement private clouds. Apart from better control of the infrastructure, running a private cloud gives you full control over security, privacy and data sensitivity. However, many tend to miss that on-premises security comes at a bigger price: you now have to deal with the security risks by yourself.

Having a dedicated security team or system is very important to safeguard your cloud infrastructure. The security policy must be well-written and documented, covering every tier from the software to the infrastructure, to protect against physical theft, cyber-attacks, malware, hacking, backdoors and all sorts of threats coming from people around the premises as well as from the Internet. On the bright side, the single-tenancy approach of the private cloud should make the privacy policy less complex compared to the public cloud's multi-tenancy approach.

Public-facing services like HTTP, database and DNS must at least run over encrypted connections, which commonly carry a performance cost compared to plain connections. A demilitarized zone (DMZ) is necessary to sit between the public network and the company’s private network. When public networks are involved, you need a range of security devices and systems in place, such as a honeypot, intrusion detection system (IDS), firewall, identity management system, key vault, anti-virus, spam filter, VPN, proxy and so on. Many early adopters of private cloud infrastructure do not focus on these aspects, due to budget constraints and a lack of awareness that stems from the idea that the private cloud is already isolated from the public network.

Indirect and Hidden Costs

Running a private cloud is costly. Although many open-source Infrastructure as a Service (IaaS) cloud platforms like OpenStack, OpenNebula and Apache CloudStack are available for free, you still have to count the direct costs of hardware leasing or ownership, data center rental, bandwidth and availability zones for redundancy, just to get started. Once everything is in place, you also have to count the indirect costs from the security and operational perspective to make sure the infrastructure is secure and well-maintained.

Cost analysis is critical and must be done extensively at an early stage, covering all direct, indirect and hidden costs, the total cost of ownership (TCO) and the total cost of acquisition (TCA), through to the return on investment (ROI). For on-premises cloud infrastructure, most of the important hardware, such as bare-metal servers, storage equipment, switches, cables and power management equipment, must be in place before you can convert the bare-metal infrastructure into a cloud infrastructure. It is similar to building a complete mini data center just for your organization; hence, the upfront cost is substantial even when you start small.

Another option is the hosted or managed private cloud. It can also be costly even though it is not wholly owned by the organization. The service provider takes care of basic maintenance and configuration in a hosted deployment, which means the user needs to subscribe and pay regularly for that service. In the long run, this can end up being more expensive than the upfront cost of complete ownership, and it sacrifices some of the control over maintenance that complete ownership guarantees. Despite offering a single-tenant environment, service providers are likely serving multiple clients and promising each of them a custom isolated environment. If an incident occurs on the service provider's end, users may find themselves facing the very same problems the public cloud presents: a lack of control and reliability issues.
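To make the comparison between a hosted subscription and full ownership concrete, here is a minimal Python sketch of a break-even estimate. The function name, its parameters and all figures are hypothetical and only illustrate the arithmetic; a real analysis would also include the indirect and hidden costs discussed above.

```python
import math
from typing import Optional


def break_even_month(upfront_cost: float,
                     owned_monthly_cost: float,
                     hosted_monthly_fee: float) -> Optional[int]:
    """Return the first whole month at which cumulative hosted fees
    reach or exceed the cumulative cost of owning the infrastructure.

    Returns None when the hosted fee is not higher than the monthly
    running cost of ownership, in which case ownership never catches up.
    All inputs are hypothetical planning figures in the same currency.
    """
    monthly_saving = hosted_monthly_fee - owned_monthly_cost
    if monthly_saving <= 0:
        return None
    # upfront + owned * m <= hosted * m  <=>  m >= upfront / saving
    return math.ceil(upfront_cost / monthly_saving)


if __name__ == "__main__":
    # Made-up example figures, not taken from any real quote.
    months = break_even_month(upfront_cost=120_000,
                              owned_monthly_cost=2_000,
                              hosted_monthly_fee=7_000)
    print(f"Ownership pays off after {months} months")
```

With these example figures, the hosted option becomes the more expensive one after two years, which is the kind of long-run effect the paragraph above describes.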

Lack of Automation and Orchestration

Cloud automation is a broad term that refers to the processes and tools an organization uses to reduce the manual efforts associated with provisioning and managing cloud computing workloads. Cloud automation can be done using scripts, but the industry has realized that due to the complexity of cloud environments and the need for orchestration of many day-to-day tasks, it is better to rely on a mature automation platform. 

There are open-source tools commonly used to automate and orchestrate tasks in the cloud. Tools such as Puppet, Ansible, Terraform and Kubernetes should be incorporated into deployment and management to simplify daily maintenance and operations. For OpenStack infrastructure, there are dedicated Puppet modules, Ansible roles, a Terraform OpenStack provider and Chef cookbooks to help you build the foundations of a successful private cloud that can then be fully automated through code.

Running and maintaining a cloud infrastructure is not a straightforward task, so try to automate as much as possible at every tier. For example, for all of your open-source database needs, use ClusterControl to automate the full lifecycle of your database instances. ClusterControl can be configured to manage the lifecycle of virtual machines running on the LXC platform (s9s server), and it can also be integrated with configuration management tools like Puppet or Ansible via the ClusterControl command-line interface, called s9s.

Take full advantage of cloud flexibility and programmable infrastructure.

Underutilized Resources

There is no pay-per-use charging model in an on-premises private cloud infrastructure. The organization owns the cloud infrastructure on its own premises and must be capable of utilizing the resources fully and wisely. The direct cost has now moved to hardware maintenance, bandwidth, electricity, and HVAC (heating, ventilation and air-conditioning), which highly depend on the usage of the cloud instances.

A common mistake is leaving under-utilized or dormant instances running indefinitely even though they deliver no significant value in the long run. Always tag instances with a meaningful description, together with the details of the person in charge and a termination policy (for example, delete one week after creation). Enforce stricter rules for underutilized instances to be shut down or terminated to save bandwidth, CPU processing and electricity. Schedule regular audits of your instances and hardware to make sure you are neither over- nor underutilizing the resources.
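The tagging and audit idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Instance` record, its fields, the `audit` function and the 5% CPU threshold are all hypothetical, and in practice the data would come from your cloud platform's inventory and monitoring.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class Instance:
    """Hypothetical record built from an instance's tags and metrics."""
    name: str
    owner: str               # person in charge
    description: str         # meaningful description of the workload
    created_at: datetime
    ttl_days: int            # termination policy: delete N days after creation
    avg_cpu_percent: float   # average CPU usage over the audit window


def audit(instances: List[Instance], now: datetime,
          cpu_threshold: float = 5.0) -> List[Tuple[str, str]]:
    """Return (instance name, reason) pairs for instances needing action.

    An instance past its termination policy is reported as "expired";
    otherwise, one below the (assumed) CPU threshold is "underutilized".
    """
    findings = []
    for inst in instances:
        if now - inst.created_at > timedelta(days=inst.ttl_days):
            findings.append((inst.name, "expired"))
        elif inst.avg_cpu_percent < cpu_threshold:
            findings.append((inst.name, "underutilized"))
    return findings


if __name__ == "__main__":
    now = datetime(2024, 1, 15)
    fleet = [
        Instance("test-db", "alice", "load test", datetime(2024, 1, 1), 7, 40.0),
        Instance("demo-web", "bob", "sales demo", datetime(2024, 1, 12), 7, 1.0),
    ]
    for name, reason in audit(fleet, now):
        print(f"{name}: {reason}")
```

Running such a report on a schedule gives the regular audit a concrete output that owners can act on.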

OpenStack has a feature called shelve and unshelve, for instances that you are not using but would like to retain in your list of servers. For example, you can stop an instance at the end of a workweek and resume work again at the start of the next week. All associated data and resources are kept; however, anything still in memory is not retained. If you no longer need the instance, you can remove it from the server, whereby its data and associated resources are deleted. Moving an instance that is no longer needed off the hypervisor also minimizes resource usage.
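As an illustration of the workweek example, the following Python sketch decides when to shelve or unshelve and builds the matching `openstack server shelve` and `openstack server unshelve` command lines without executing them. The schedule (Friday from 18:00, Monday before 09:00), the function names and the server names are assumptions made for this example only.

```python
from datetime import datetime
from typing import List, Optional


def weekly_action(now: datetime) -> Optional[str]:
    """Pick an action based on an assumed workweek schedule.

    Returns "shelve" on Friday from 18:00, "unshelve" on Monday
    before 09:00, and None at any other time.
    """
    if now.weekday() == 4 and now.hour >= 18:   # Friday evening
        return "shelve"
    if now.weekday() == 0 and now.hour < 9:     # Monday morning
        return "unshelve"
    return None


def build_commands(action: str, servers: List[str]) -> List[List[str]]:
    """Build OpenStack CLI argument lists; nothing is executed here."""
    if action not in ("shelve", "unshelve"):
        raise ValueError(f"unsupported action: {action}")
    return [["openstack", "server", action, server] for server in servers]


if __name__ == "__main__":
    action = weekly_action(datetime(2024, 1, 5, 19, 0))  # a Friday evening
    if action is not None:
        # "dev-box-1" and "dev-box-2" are placeholder server names.
        for cmd in build_commands(action, ["dev-box-1", "dev-box-2"]):
            print(" ".join(cmd))
```

Keeping the command building separate from execution makes it easy to review the plan as a dry run before handing the lists to a job scheduler or to `subprocess.run`.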

In-House Expertise

Although tools like Metal-as-a-Service (MaaS), OpenStack and Kubernetes help improve operational workflow and efficiency, upskilling operations teams on those tools may take months. As a result, many organizations are forced to outsource private cloud infrastructure management to a specialist company to accelerate the initial deployment and reduce ongoing operational costs. Some organizations make the mistake of fully outsourcing infrastructure management for the long term, and miss the opportunity to build an in-house team organically.

A cloud specialist is a mix of developer and system administrator. We can easily find experts in either field, but not both. Cloud specialists commonly emerge from a deep understanding of system infrastructure, software development, architecture design and virtualization, which is not easy to acquire. They are very likely in high demand and command a high salary at this point in time. Hiring a team of experts might be too costly.

Taking advantage of the infrastructure that is already in place, we need to at least hire a cloud leader and build a team of in-house experts around that person. Building a team of experts takes time for training, knowledge sharing and skill development, through which the team eventually gains the experience and confidence to reach a professional level. This is the best long-term investment for the company: a strong team to support the easily scalable cloud infrastructure for years or even decades. Besides, this will keep the cloud infrastructure private and reduce the risks of external hands handling the sensitive on-premises data.

Conclusion

There are always challenges and obstacles when moving to a new technology like cloud computing. We just need to understand the tradeoffs and know how to make use of all the resources we have so we don't make the same mistakes again.
