Research Theme
"Better design through modeling and measurement"
My research focuses on using mathematical models to provide insight into the design of computer systems. I apply analytic models and tools that are traditionally used in the operations research community in order to evaluate the impact of design decisions in systems such as web servers, routers, databases, and beyond. My work applies and often extends techniques in stochastic modeling, queueing theory, scheduling theory, and game theory.
Current Projects
The following links give a brief overview of some of the projects I work on. If you are interested in finding out more about any of these projects don't hesitate to email me.
SQUID: Scheduling and QUeueing analysis for Improving Design
Measuring fairness and predictability
Modern designs for web servers, routers, and other applications improve user response times by giving priority to small job sizes. But, this leads to worries about whether large job sizes receive fair performance. So, we need to ask: Does giving priority to requests for small files starve large files?-
Scheduling for today's systems: Beyond idealized policies
The policies used in real systems often differ significantly from the idealized policies studied analytically as a result of a number of implementation concerns. Thus, the results proven about idealized policies cannot be immediately applied to real systems. So, we need to understand: how close is the performance of the policies used in practice to the performance of the idealized policies studied analytically? -
Scheduling with inexact job size information?
Modern system designs have improved improved user response times by giving priority to "small" requests. However, real systems do not have exact job size information, they only have estimates. Thus, when designing a new system one needs to understand: How good must job size estimates be before prioritizing jobs that are (estimated to be) small will improve performance? How to schedule when you're blind
In many applications, the scheduler does not even have estimates of job sizes. For instance, in routers and OSs the scheduler is completely blind to job size information. In these settings, one cannot hope to attain the same performance as possible when using job size information to prioritize. However, how good is the performance that is possible to achieve?
BID: Behavioral Influences on Design
-
Why designers need to worry about closed vs. open system models
Workload generators are an invaluable resource for evaluating the performance of new designs, and systems researchers are well aware of the importance of setting up the workload generator so that the system being tested is "accurately represented." However, system designers rarely pay attention to whether the workload generator is open or closed. In this project, we ask: How big an impact does the choice of open versus closed have? And, how can a researcher decide whether to use an open, closed, or partly-open model? Queueing network games
Queueing theory has long served as a fundamental tool for understanding the dynamics of systems from computer networks, to production systems, and beyond. However, traditional results view customer behavior as an exogenous parameter, unaffected by the details of the model, and thus cannot capture the impact of pricing and competition within the models. In recent years, game theoretic techniques have been applied to the same range of problems in order to characterize the impact of customer behavior/reactions. However, using game theoretic techniques alone ignores the queueing dynamics inherent in these applications. So, there is a mutual need for studying the interactions of game-theoretic dynamics and queueing models.-
The impact of impatience (coming soon)
Nearly 20% of internet traffic is from aborted service requests. This high level of user abandonment (typically due to impatience) has a large, negative impact on the behavior of the time-sharing policies that are used in many computer systems because resources are wasted on users that later abandon due to impatience. However, since users with small requests tend to become impatient quickly, while users with large requests tend to be more patient, applying the heuristic of prioritizing small jobs can limit the amount of wasted service. But, the benefits of such an approach are only beginning to be understood.
MAID: Measurment and Analysis for Improving Design
-
A non-cooperative approach to distributed control
We study an alternative to traditional forms of distributed control based on game-theoretic techniques. That is, we view the agents as self-interested players in a game. This style of control offers many advantages, such as robustness to agent failures, scalability, reduced communication costs, etc. However, it also leaves a number of interesting design issues to study, such as how can utility functions be designed so that they are guaranteed to reach an efficient equilibrium? To illustrate the ideas, we focus on resource allocation problems such as sensor coverage. Power management via speed scaling (coming soon)
When designing new systems it is no longer possible to maximize performance without taking into account power usage. In fact, soon, power is likely to surpass performance as the key design criteria. In this project, we are looking at the interaction of power management (via speed scaling) with scheduling decisions.-
Capacity planning for server farms: How many servers are optimal?
In almost all areas of computer systems, multiprocessor designs are now commonplace. From multi-core chips, to multi-channel wireless, to server farms, we are now in a multiserver world. However, the analysis of multiserver systems is notoriously difficult, and many core questions are still unanswered. Here, we address one such fundamental question: Given a fixed server capacity, how many servers are optimal? -
A unified framework for modeling TCP variants
TCP has long been the dominant protocol used for internet traffic. As a result, there has been an enormous amount of research toward understanding and improving the protocol. The result is an wide variety of TCP variants and each analyzed using a different set of models. In this project, we present an framework that can be used to provide models for any existing (or new) variant of TCP in a plug-and-play manner. In addition, the model is simple and fast, and can thus be used to aid in the design phase as well as the evaluation phase of development. QoS for databases
There is a strong economic motivation for online retailers (e.g. Amazon) to provide lower response times and Service Level Agreements (SLAs) to their more important, ``big-spending'' clients, so as not to lose these clients. Because the dominant time associated with serving a dynamic web request is the time spent at the backend database/storage system, it is important that the QoS necessary to supply these SLAs be applied to the backend database/storage system. Unfortunately, most commercial database management systems (DBMS), do not provide effective prioritization for transactional, web-driven workloads. In this project we investigate the possibility of using external scheduling to achieve service differentiation for database transactions.