Rigorous and Scalable Optimized Information Gathering
My research is in optimized information gathering,
utilizing
powerful tools from
decision theory, information theory
and sensor networks to create smarter
reasoning systems and enable monitoring
of spatio-temporal
phenomena. In the SEPIA project (Sensor Placement
and Information Acquisition), together with my collaborators, I've been
developing efficient
algorithms for choosing important observations which have
strong
provable quality guarantees and scale
to large problems. We applied our
algorithms to
several
important practical problems such as predicting
traffic,
building automation, activity
recognition, environmental monitoring and deciding which
blogs to read on the Internet.
My collaborators are Carlos
Guestrin, Anupam
Gupta, Jon
Kleinberg, Jure
Leskovec, Bilge
Mutlu, Ajit
Singh, Amarjeet
Singh, Vipul
Singhvi, William
Kaiser, Jeanne
VanBriesen, Christos
Faloutsos, Brendan
McMahan, Natalie
Glance.
Near-optimal observation selection using submodularity
We
identify a key
structural property of many observation selection
problems: submodularity, an
intuitive diminishing returns property. Exploiting this property allows
to obtain efficient approximation algorithms with
provable guarantees,
as well as drastically speeding up algorithms. Here's a short
overview paper on this topic.
In [UAI'05],
we show that, for a large class of graphical models, the problem of
selecting a set of informative attributes (features) A maximizing
the information
gain I(U;A) with respect to target variables
U, can be approximated efficiently for a large
class of graphical
models. Using the concept of submodularity, we
show that the simple greedy algorithm provides a constant
factor (1-1/e) approximation.
Moreover, we show that this guarantee is essentially the best
we can
hope for. We apply our algorithms to select sensors for best
predicting
the Bay Area traffic. Here, A would be the
selected sensors, and U the minimum traffic flow in certain regions.
Spatial monitoring. For
monitoring spatial phenomena,
such as
temperature in a building, which can frequently be modeled using
Gaussian Processes, mutual information, I(A;V\A)
is an effective criterion for quantifying the informativeness
of a sensor placement, avoiding the wasted
information effect of the commonly used maximum entropy
sampling. In [JMLR'07]
we use submodularity to
show that the greedy algorithm generates near-optimal
placements under this criterion. We also show how we can exploit
structure
in the kernel to make this algorithm very efficient.
Outbreak detection. In [KDD'07],
we study two seemingly very different problems: placing sensors in a
municipal water
distribution network in order to
detect possible contaminations, and selecting informative
weblogs to read on the Internet.
We show that both problems share essential structure: Information
cascades spreading over the blogosphere can be modeled similarly as
contaminations spreading through water networks. We show that many
objectives, such as optimizing the
detection time or likelihood, satisfy submodularity,
making the
problem amenable for greedy (and hence highly scalable) optimization.
We participated in the Battle of
the Water Sensor Networks competition, which required to
optimize for sensor placements in a real metropolitan-area
network with 20,000 nodes. Our approach achieved the
highest number of non-dominated solutions. In the blog selection
problem, our approach outperforms existing ranking criteria. Here's
a link to the CASCADES project page.
Posture
recognition. In
[UIST'07],
we apply our sensor placement methods to instrument a chair
with pressure sensors, in order to enable seating posture
recognition.
Our approach reduces the sensor hardware cost by a factor of 30
compared to previous solutions, while achieving comparable
classification accuracy. We test our approach in a user study on a
real
deployment.
Nonmyopic algorithms
Communication
constraints.
When placing
wireless sensor networks, the
sensors should not only be
informative, but also communicate well. In this
more complex setting, the greedy algorithm
performs arbitrarily badly. In [IPSN'06]
we
present an efficient randomized algorithm, pSPIEL,
which near-optimally trades off information and
communication cost. We implement our approach on a real deployment of 46
TMote Sky motes, and optimize a deployment of light sensors
in the CMU Architecture department. We also demonstrated
our algorithm live at NIPS 2005 and IPSN 2006.

Path planning. In
[IJCAI'07],
we consider the problem, where we want to plan
informative paths in order to monitor the algal growth in
Lake Fulmor, CA, using robotic boats. In this case, the greedy
algorithm performs
badly as well. We extend an
approximation algorithm by Chekuri and Pal for submodular
orienteering to the setting of optimizing paths for multiple
robots; we also make the (originally super-polynomial)
algorithm practical by devising a spatial
partitioning as well as branch & bound
schemes. In [AAAI'07]
we show how this approach can be extended to the spatio-temporal
planning setting.
Robustness. Many observation selection problems require robustness, e.g., against uncertainty in the model parameters. In [NIPS'07] we show, how many such problems can be modeled as selecting a set of observations A which maximizes the minimum over a collection of submodular functions. Examples include robust (nonlinear) experimental design problems, minimizing predictive variance in Gaussian Processes, and adversarial outbreak detection. We develop Saturate, a nonmyopic algorithm which is best possible for this problem (matching lower bounds) under reasonable complexity assumptions.
Active learning
Gaussian processes.
The
above problems address a priori (open-loop)
selection problems,
where observations are chosen according to a model, before any measured
values are obtained. In some applications, it is feasible to implement
a sequential (closed-loop) strategy, which
decides on the next observation based on previously obtained values. An
important question is, how much better a
sequential algorithm can perform compared to the best a priori
selection. In [ICML'07],
we partially answer this question for the
case of sequentially optimizing mutual information in Gaussian
Processes. We develop a theoretical bound
quantifying the sequential advantage, and use it to guide an exploration-exploitation
approach. Our approach has sample complexity
guarantees, and performs well on environmental monitoring problems.
Chain models. I
n
[IJCAI'05]
we prove that for chain graphical models
(such as HMMs), one can efficiently
compute both the best subset of
observations, and even the (exponentially large) best
sequential plan,
maximizing arbitrary value of information
objectives. We also show that even for tree
graphical models (for
which efficient inference is tractable), this problem is wildly
intractable (NP^PP complete).
In [SenSys'05],
we consider a building automation
problem, where
we want to
select sensors in order to intelligently manage the light
controls in a building, satisfying user preferences
and at the same time conserve power. We
formalize and solve this problem using our approach for maximizing
value of information.