counter stats

QoS for Databases

 

People

Bianca Schroeder
Mor Harchol-Balter
Arun Iyengar
Erich Nahum
Adam Wierman

 

Motivation

There is a strong economic motivation for online retailers (e.g. Amazon) to provide lower response times to their more important, ``big-spending'' clients, so as not to lose these clients. It is likewise financially desirable to offer certain clients Service Level Agreements (SLAs) to guarantee them certain QoS goals, such as high throughput or low mean response time. Because the dominant time associated with serving a dynamic web request is the time spent at the backend database/storage system, it is important that the QoS be applied to the backend database/storage system to control the time spent there. Unfortunately, most commercial database management systems (DBMS), which lie at the heart of e-commerce applications, do not provide effective transaction prioritization for transactional, web-driven workloads.  In this project we investigate the possibility of achieving service differentiation for database transactions using external scheduling.


Figure 1.  The EQMS sits between the web server and the database backend.  The diamonds represent high priority (important) clients with high value purchases.  The books represent low priority clients with low value purchases.

 

Results and Impact

The goal of this project is to design a new approach for service differentiation in backend servers which is portable across different DBMS, storage devices, caches, and other backend servers. We develop an External QoS Management System (EQMS), that sits between the web server (or the application server) and the backend server. The key idea of the EQMS is to maintain an upper limit on the number of transactions executing simultaneously within the DBMS (called the Multi-Programming Limit, or ``MPL''). If a transaction arrives and finds MPL number of transactions already in the DBMS, the arriving transaction is held back in an external queue (no transactions are dropped). QoS targets are then achieved by scheduling the external queue. Response time for a transaction includes both waiting time in the external queue (queuing time) and time spent within the DBMS (execution time).

This approach of encapsulating QoS functionality in the EQMS outside the database has many advantages: it doesn't require any changes to the DBMS and hence is portable across DBMS and is marketable as an independent black-box device. The approach is also effective across different workloads and changing workloads, since the scheduling isn't tied to a particular bottleneck resource inside the DBMS. 

Building the EQMS involves two key challenges.

1) Choosing the MPL: The performance of the EQMS depends critically on the particular multiprogramming limit used (the MPL), i.e. the number of transactions allowed into the database. If the MPL is too low, throughput will suffer, since not all DBMS resources will be utilized. On the other hand, if the MPL is too high, there is insufficient control on scheduling. The question of how to adjust the MPL to achieve both goals simultaneously is an open problem, not just for databases but in system design in general.

In our work, we first investigate the question of whether there exists an MPL that is low enough to provide good service differentiation, while not harming throughput or overall mean response time. We find, using benchmark driven workloads, that such an MPL does in fact exist. Then we develop a feedback based controller, augmented by queueing theoretic models for automatically adjusting the MPL.

2) Scheduling to achieve QoS targets: A second key component of the EQMS is the scheduler which decides on the order in which transactions are dispatched to the DBMS such that the associated QoS targets are met. The scheduler allows for an arbitrary number of different QoS classes, where the classes can differ in their arrival rates, transaction types, etc. Each class is associated with one or more QoS targets for (per-class) mean response time, percentiles of response time, variability in response time, best effort, or any combination thereof.

The core idea behind the scheduler is that, by maintaining a low MPL, we obtain a better estimate of a transaction's execution time within the DBMS, and hence we are able to maintain accurate estimates of the per-class mean execution times. This in turn gives us an upper bound on the queueing time for a transaction, which can be used by the scheduler in order to ensure that QoS targets are met.

Figure 2.  The optimal MPL for external scheduling is high enough to ensure good throughput and low enough to provide scheduling control.


Publications