Databases in the Cloud
Technology leaders are being inundated with new cloud architectures, strategies and products – all guaranteed by vendors and various industry pundits to solve all of our database challenges. The seemingly endless array of public cloud based DBMS offerings can quickly become bewildering. This article is intended to lift the veil on cloud based DBMS offerings by sharing our experiences with cloud based database architectures.
One of the benefits of working for a remote DBA services provider is that our knowledge is not constrained by any one organization’s technology implementation. We have customers that have technology strategies that range from “bleeding edge” to “yesterday’s technology tomorrow.” We know which products work and which ones don’t, which tech stack combinations play well together and which database features and technologies provide the most benefit for a given business or technical need. Our experience includes supporting several of the public cloud based DBMS offerings discussed in this article.
DBaaS and DBPaaS Defined
Like any other term that is associated with the cloud, the definition of DBaaS and DBPaaS architectures often depends upon the vendor or industry expert you are talking to.
DBaaS and DBPaaS architectures are multi-tenant systems that, depending on the vendor chosen, provide customers with varying degrees of scalability, elasticity and administrative self service. Before we continue, a couple of definitions are in order:
- Elasticity – The ability of a system to dynamically increase or decrease computing resources based on current, real-time workload changes. An elastic system is able to automatically provision computing resources to meet the current workload demands placed upon it. The elastic nature of cloud systems is the foundation for the “Only Pay for What You Use” model that is an attractive benefit to many organizations considering public cloud based DBMS architectures.
- Scalability – The ability of a system to be easily scaled to meet forecasted future demands, as in capacity planning.
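The difference between the two definitions above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the load figures, the per-instance capacity, the growth rate and all function names are hypothetical and do not describe any vendor's provisioning logic. Elasticity resizes for each hour's observed load, while scalability (capacity planning) sizes once for a forecast peak.

```python
import math


def instances_needed(load_rps: float, capacity_per_instance_rps: float) -> int:
    """Smallest number of instances able to serve the load (never fewer than one)."""
    return max(1, math.ceil(load_rps / capacity_per_instance_rps))


def elastic_allocation(hourly_load_rps, capacity_per_instance_rps):
    """Elasticity: resize for every hour's observed workload."""
    return [instances_needed(load, capacity_per_instance_rps) for load in hourly_load_rps]


def planned_allocation(current_peak_rps, annual_growth, years, capacity_per_instance_rps):
    """Scalability: size once for the peak forecast by capacity planning."""
    forecast_peak = current_peak_rps * (1 + annual_growth) ** years
    return instances_needed(forecast_peak, capacity_per_instance_rps)


# Hypothetical figures: requests per second observed over five hours.
hourly_load = [120, 80, 450, 900, 300]
elastic = elastic_allocation(hourly_load, capacity_per_instance_rps=250)
planned = planned_allocation(max(hourly_load), annual_growth=0.5, years=2,
                             capacity_per_instance_rps=250)

print("elastic, per hour:", elastic)
print("planned, fixed:", planned)
```

In this sketch the elastic system pays for one to four instances depending on the hour, while the capacity-planned system carries nine instances at all times to cover the load forecast two years out.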
Database as a Service (DBaaS) – DBAs have access to the database instance and, depending on the provider, are able to perform most or a subset of instance administration activities. The vendor providing the service takes care of the hardware and OS layers as well as the database software. In addition, DBaaS vendors offer varying levels of features that may include monitoring, patching, event notifications, backups and geo-replication for availability.
Database Platform as a Service (DBPaaS) – DBPaaS combines Database as a Service (DBaaS) with Platform as a Service (PaaS). DBPaaS providers raise their level of ownership to include the database instance, so DBPaaS DBAs perform very limited or no database instance administration activities.
The challenge of classifying a vendor offering as DBPaaS or DBaaS is the fluid feature set and ever-changing nature of these architectures. Just like their Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) cloud counterparts, the lines of differentiation between the architectures are becoming increasingly blurred. Depending on the vendor chosen, the architectures provide varying degrees of features, elasticity, scalability and self-service activities.
Hybrid DBMS Clouds – Overcoming a Lack of Consistency Between Public and Private Implementations
Hybrid clouds also have many interpretations. To some, a hybrid cloud means one cloud architecture interacting with another. To others, it is a multi-tier application with a public cloud service interacting with private cloud systems. For this discussion, hybrid clouds are the DBMS vendors’ attempts to overcome the lack of consistency between public and private cloud DBMS systems. The majority of database vendors’ on-premises database offerings differ from their public cloud counterparts. In addition, public cloud implementations of the same database product may differ from each other. Oracle, Amazon, and Google all offer cloud versions of MySQL and, although the offerings are very much alike in many areas, they also have key differences.
The environments often differ in database features and functionality, data access mechanisms, administrative processes and interfaces, maintenance utilities, monitoring, security controls, backup/recovery, disaster recovery and tuning and performance.
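One way to make these differences visible before committing to an environment is to list what an application relies on and check it against what each environment provides. The sketch below assumes the features can be captured as simple labels. The feature names and both environment inventories are hypothetical and are not statements about any real product.

```python
def missing_features(required: set[str], available: set[str]) -> list[str]:
    """Features the application relies on that the environment does not provide."""
    return sorted(required - available)


# Hypothetical inventory of what one application depends on.
app_requirements = {"stored procedures", "full-text search", "cross-database queries"}

# Hypothetical feature sets for two deployments of the same database product.
environments = {
    "on-premises": {"stored procedures", "full-text search",
                    "cross-database queries", "OS-level access"},
    "public cloud": {"stored procedures", "full-text search",
                     "automated backups"},
}

gaps = {name: missing_features(app_requirements, features)
        for name, features in environments.items()}

for name, missing in gaps.items():
    print(f"{name}: {'no gaps' if not missing else 'missing ' + ', '.join(missing)}")
```

The same comparison can be repeated for the other dimensions listed above, such as monitoring, security controls or backup and recovery, by keeping one inventory per dimension.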
A utopian hybrid DBMS cloud would be an environment that has a combination of public and private cloud DBMS architectures that are totally transparent and seamless to administrators and developers. For developers, it would be an environment that allows 100% code compatibility between private and public clouds. For DBAs, it would be an environment that is monitored and administered exactly the same way, regardless of whether that system is running on a server in the shop’s data center or in the public cloud.
This utopian vision is coming closer to reality. Oracle’s Database Cloud and Microsoft’s Azure Stack are offerings that have the goal of making public and private cloud implementations as seamless as possible. There is no doubt that both vendors understand the importance of seamless public and private implementations to their customer base and are expending significant resources to achieve that goal.
Considering Cloud Based DBMS Systems? – It’s Not Just the Database Your Shop Needs to Evaluate
Remember the “good old days” when product choices were simple? The number of database offerings available was limited and the selection was easy. You couldn’t go wrong if you chose Oracle, Microsoft or IBM. If you preferred open source products, MySQL and PostgreSQL were excellent alternatives.
The database product landscape no longer consists of a handful of traditional database vendor offerings. The database market has exploded with dozens of new products from organizations that span the spectrum from NoSQL vendors specializing in niche solutions to providers of a myriad of cloud based database architectures.
I’ve written a series of articles on NoSQL vs relational database systems, but for this discussion, we’ll focus on cloud offerings. As I stated, technology leaders now have more choices available to them than ever before. In order to thoroughly evaluate public cloud based DBMS offerings, we can no longer constrain ourselves to analyzing database products alone; we must also examine the entire database cloud “ecosystem.” For the purposes of this article, I define the database cloud ecosystem as the vendor’s costing models, administrative and monitoring interfaces, security controls, provisioning mechanisms, server hardware, storage architecture, operating system, database and edge technologies and products. The cloud vendor’s entire ecosystem must be evaluated and compared to competing offerings.
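A weighted scoring matrix is one common way to compare ecosystems rather than databases alone. The sketch below is a minimal example, not a recommended methodology: the criteria are a subset of the ecosystem components named above, and the weights, the 1 to 5 scores and the vendor names are all hypothetical placeholders that each shop would replace with its own.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the per-criterion scores, using the given weights."""
    unscored = set(weights) - set(scores)
    if unscored:
        raise ValueError(f"unscored criteria: {sorted(unscored)}")
    total_weight = sum(weights.values())
    return sum(scores[criterion] * weight for criterion, weight in weights.items()) / total_weight


# Hypothetical weights: higher means more important to this shop.
weights = {
    "costing model": 3,
    "administration and monitoring": 2,
    "security controls": 3,
    "provisioning": 1,
    "database features": 3,
}

# Hypothetical scores on a 1 (poor) to 5 (excellent) scale.
offerings = {
    "Vendor A": {"costing model": 4, "administration and monitoring": 3,
                 "security controls": 5, "provisioning": 4, "database features": 3},
    "Vendor B": {"costing model": 3, "administration and monitoring": 5,
                 "security controls": 3, "provisioning": 5, "database features": 4},
}

ranking = sorted(offerings, key=lambda name: weighted_score(offerings[name], weights),
                 reverse=True)
for name in ranking:
    print(f"{name}: {weighted_score(offerings[name], weights):.2f}")
```

The value of the exercise lies less in the final number than in forcing every ecosystem component to be scored for every candidate.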
Can DBMS Cloud Offerings Reduce Database Driven Application Total Cost of Ownership?
Software licensing commands a premium price in the marketplace. Hardware, operating system and database licenses often result in significant startup costs. Ongoing staffing costs must also be evaluated when comparing public and private cloud based DBMS architectures.
The rapid growth of cloud DBMS offerings is providing organizations with cost effective alternatives to on-premises systems. When calculating TCO and return on their database investment, savvy decision makers are now considering cloud systems as viable alternatives to more traditional on-premises databases.
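A simple cumulative cost comparison shows the shape of the TCO question: on-premises systems carry startup costs plus ongoing costs, while a subscription has little or no startup cost but a recurring fee. The sketch below is deliberately simplified, and every figure in it is hypothetical. It ignores discounting, hardware refresh cycles and usage-based price changes, all of which a real analysis would include.

```python
def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after the given number of months."""
    return upfront + monthly * months


def break_even_month(onprem_upfront, onprem_monthly, cloud_monthly, horizon_months):
    """First month in which cumulative cloud spend exceeds on-premises spend.

    Returns None if the cloud option stays cheaper (or equal) over the whole horizon.
    """
    for month in range(1, horizon_months + 1):
        cloud = cumulative_cost(0, cloud_monthly, month)
        onprem = cumulative_cost(onprem_upfront, onprem_monthly, month)
        if cloud > onprem:
            return month
    return None


# Hypothetical figures: licenses and hardware up front, staffing and
# maintenance folded into the monthly numbers.
crossover = break_even_month(onprem_upfront=120_000, onprem_monthly=4_000,
                             cloud_monthly=7_000, horizon_months=60)
cheaper_cloud = break_even_month(onprem_upfront=120_000, onprem_monthly=4_000,
                                 cloud_monthly=3_500, horizon_months=60)

print("cloud overtakes on-premises in month:", crossover)
print("with a lower subscription fee:", cheaper_cloud)
```

With these made-up inputs the subscription is cheaper for the first 40 months and more expensive afterwards, which is why the planning horizon matters as much as the monthly price.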
Here’s a quick list of what we and our customers have experienced. Please note these are generalities, and the topics below are highly dependent upon the cloud provider chosen, the workloads placed upon the database system, and its level of interaction with other systems.
- Reduces up-front capital expenditures – Customers are not required to provide all of the hardware and software infrastructure components required to ensure high availability, reliability and redundancy. Since the architecture is provided by the cloud vendor, shops don’t have to spend the money required to build the computing systems.
- Evaluate vendor pricing models very closely – Vendor pricing will vary dramatically, and many of the subscription models and tiers will be VERY confusing. Describing them as confusing is like describing the Titanic as “springing a small leak.” You’ll need to analyze all system resources including disk storage, CPU and memory consumption costs when comparing the alternatives. In addition, subscription based pricing may not be the most cost effective alternative for all database driven applications.
- Database licensing – Just as with subscription pricing, you’ll need to evaluate database licensing costs closely. Amazon, for example, currently charges $0.04 per hour to customers that need Oracle database licenses and $0.0025 per hour for customers that provide their own. It’s not a significant cost savings, but if you have an Oracle license, you may want to consider Amazon’s Bring Your Own License (BYOL) option. Amazon also provides the BYOL option for Microsoft SQL Server.
- Reduces ongoing administration costs? – Most of the time, but not always. The following issues can increase administrative effort:
- Consistency – Customers that have already chosen a DBMS cloud provider before coming to us are sometimes affected by the consistency problems I described previously, especially when the client uses the cloud DBMS system for test and QA and a private cloud for production (or vice versa). Others ask us to help migrate an existing database driven application from an on-premises architecture to the cloud. They find out that not all of the database features their applications rely upon are available in the cloud, or that the data access mechanisms they use don’t work exactly the same way. The most dramatic impacts are those related to differences in database features and code transportability. Developers and administrators are then required to spend additional time working around the issues generated by the lack of consistency between public and private cloud DBMS architectures.
- The cloud system becomes a “black box” – The monitoring tools provided by the cloud vendors often do not match the robustness of their private cloud counterparts. Administrators don’t have access to the instrumentation or the analysis tools necessary to debug performance and availability issues. They must rely upon the cloud vendor to partner with them in problem analysis, identification and resolution, which often complicates and lengthens the problem resolution process. Remember that your cloud database is just one of thousands (and thousands) of systems the vendor supports.
- No database is an island – Most database systems are not standalone. They take data feeds from other databases and are loaded using flat files. They may generate and refine data that is ingested by other applications. Performance becomes an issue when large files need to be transferred to the cloud database for ingestion, or the cloud database sends data to other systems for processing. The performance problem is magnified when SQL queries join data from the cloud environment and on-premises databases. When selecting databases for cloud implementation, evaluate their interactions with other systems. If you don’t, you may find that your staff will spend a significant amount of time wrestling with the tasks of getting data into and out of the cloud database.
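The data movement concern in the last item above is easy to estimate before a migration. The sketch below is a back-of-the-envelope calculation only: the file size, link speeds and the 80% efficiency factor are hypothetical assumptions, and real transfers are also affected by latency, compression and throttling.

```python
def transfer_seconds(size_gb: float, bandwidth_mbps: float, efficiency: float = 0.8) -> float:
    """Estimated seconds to move size_gb over a link rated at bandwidth_mbps.

    bandwidth_mbps is in megabits per second; efficiency is the assumed
    fraction of the rated bandwidth that the transfer actually achieves.
    """
    if bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    size_megabits = size_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return size_megabits / (bandwidth_mbps * efficiency)


# Hypothetical nightly feed: a 50 GB flat file sent to the cloud database.
for link_mbps in (100, 1000):
    minutes = transfer_seconds(50, link_mbps) / 60
    print(f"50 GB over {link_mbps} Mbps: about {minutes:.0f} minutes")
```

Running the numbers for each feed into and out of the candidate database gives an early indication of whether the load window is realistic.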
What Else Do I Need to Consider?
- Security – Is the data you are storing in the cloud regulated by internal or external security policies or protection laws? Exactly how sensitive is the data? You also need to consider whether the cloud based database is standalone or if it interacts with other systems. Regardless of what the vendors and industry pundits tell you, you are increasing the security risk when you store data in the public cloud. More people are becoming involved in the administration of your environment. You are sharing the responsibility of securing your data with a third-party provider and are relying upon the quality of their security controls. This sharing of security does not mean that you turn the responsibility of securing your data over to the vendor. Organizations choosing to implement cloud based DBMS systems need to increase their level of scrutiny. Are you encrypting data transfers? Work files? Backups? Reports?
- Vendor lock-in – Customers are often required to tailor their database deployments to their selected vendor’s cloud based architecture. The more tailoring required, the greater the complexity and effort of switching cloud DBMS providers.
- Training time and costs – It will take time for your developers and administrators to learn cloud based database management systems, whose architectures can range from simple to complex. Don’t expect your staff to quickly become experts in public cloud DBMS platforms. Most vendor offerings have interfaces that allow customers to configure their cloud environment, and those interfaces can range from simple to very sophisticated. Amazon, for example, provides customers with its AWS CloudFormation utility, which is a very robust, template driven configuration service. It’s robust enough that your personnel will need to dedicate time to learn it. You don’t need to use the utility to deploy databases, but it does allow you to build AWS application architectures. Depending on the services you choose and the architectures you are creating, deploying databases on Amazon can range from fairly simple to complex. Oracle’s Database Cloud Service and Microsoft’s Azure configuration utilities are also extremely powerful services.
- You become reliant upon a third-party vendor for system functionality, availability and performance – With most on-premises systems, your organization is able to control the entire environment – hardware, software and ongoing administration and maintenance procedures. With public DBMS cloud implementations, your system’s availability, performance and security are now also dependent upon a third-party cloud services provider.
- Staffing changes – Cloud DBMS architectures may require changes to your support team’s organizational structure. Database and application architects play an important role in the selection, configuration and implementation of public cloud based DBMS platforms. Multi-tier architecture design is critical. The more complex and multi-tier the database driven application is, the more important design becomes. Application, presentation, session, transport, network, data link and physical layers must be investigated, analyzed, designed and implemented. Depending on the architecture chosen, you will need to dedicate personnel to manage the cloud DBMS deployments. The vendors provide user interfaces that allow customers to configure the cloud DBMS environments, but there is a significant learning curve. If you want to fully leverage the benefits of cloud DBMS deployments, a staff member should be dedicated to understanding how the chosen vendor’s configuration and provisioning services are used to configure and administer the environments.
- Changes to existing technologies and products – Application development, monitoring, administration and security tools that are standards for your shop may or may not work with the cloud based architecture. Your shop will need to evaluate the impact that the new cloud based architectures have on your existing toolsets.
- Policy and procedure changes – Public cloud DBMS systems are monitored and administered differently than their private cloud counterparts. New policies and procedures will need to be created and changes to existing documentation will be required. Change management, security administration, repeatable administration processes, application documentation, etc. will all need to be reviewed.
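To give a feel for the template driven configuration mentioned under training, and for the encryption questions raised under security, here is a minimal sketch that builds a CloudFormation-style template for a single encrypted database instance and prints it as JSON. It is an illustration, not a deployable recommendation: the logical names, instance class, storage size and user name are placeholder assumptions, the template has not been validated against a live AWS account, and a real deployment would also need networking, backup and access settings.

```python
import json


def rds_template(engine: str, instance_class: str, storage_gb: int) -> dict:
    """Build a minimal template declaring one encrypted RDS database instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative sketch: one encrypted database instance",
        "Parameters": {
            # NoEcho keeps the password out of console and API output.
            "DbPassword": {"Type": "String", "NoEcho": True},
        },
        "Resources": {
            "ExampleDatabase": {  # hypothetical logical name
                "Type": "AWS::RDS::DBInstance",
                "Properties": {
                    "Engine": engine,
                    "DBInstanceClass": instance_class,
                    "AllocatedStorage": str(storage_gb),
                    "MasterUsername": "dbadmin",  # placeholder user name
                    "MasterUserPassword": {"Ref": "DbPassword"},
                    "StorageEncrypted": True,  # encrypt data at rest
                },
            }
        },
    }


# Placeholder values chosen only for the example.
template = rds_template("mysql", "db.t3.medium", 20)
print(json.dumps(template, indent=2))
```

Even a template this small touches parameters, resource types and security properties, which hints at why staff need dedicated time to learn the provisioning services of whichever vendor is chosen.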
An increasingly competitive market arena will result in continued advancements in cloud based database architectures, technologies and features. Choosing the correct DBaaS or DBPaaS environment is critical to the success of any cloud based DBMS system. This decision was simple when the number of alternatives available was limited. With the seemingly endless array of public cloud based architectures available, that choice is no longer as clear cut. Organizations looking to the cloud now have more architecture choices available to them than ever before. In order to correctly select and implement the most appropriate cloud based DBMS architecture for their organizations, technology pros must create and execute a well thought out, detailed analysis of the competing offerings.
A correctly chosen architecture will reduce database driven application total cost of ownership while allowing the application to perform to expectations, have the desired functionality and be easily monitored and administered.