## Research

My main interests lie in the study of rigorous methods for
making good decisions based on our observations of the world. In
computer science, this kind of analysis is formalized in part by
the field of *reinforcement learning*, which is considered part
of *machine learning*, which in turn is considered part of the
field of artificial intelligence. The motivation for studying
this problem in computing came from the desire to construct
intelligent agents that observe, deliberate, and act in the
world; however, this type of analysis is applicable to a wide
variety of problems. Methods for solving reinforcement learning
problems draw on techniques from statistics, computer science,
engineering, operations research, and related fields.

I am also interested more generally in developing machine learning and statistical techniques and applying them to problems in healthcare; this is often termed *health informatics*. Below are some projects I have worked on, along with selected publications.

### Ongoing Projects

#### Dynamic Treatment Regimes (Treatment Policies) with Competing Outcomes

##### With: Dr. Eric Laber (NCSU Statistics)

[ICML Paper|JMLR Paper|Biometrics Paper]

#### Estimating Causal Effects for Mechanical Ventilation

##### With: Chengbo Li, members of Children's Hospital Los Angeles

[MMath Thesis]

#### Predictive Models that Accommodate Patient Heterogeneity

##### With: Rhiannon Rose (UW CS MMath, UWO Biostat PhD), Drs. Richard Kim, Ute Schwarz, Rommel Tirona (UWO Clinical Pharmacology & Toxicology), Previously Drs. Feng Chang and Tejal Patel (UW Pharmacy), members of Vanderbilt U. Medical Center

[MMath Thesis]

#### Evaluating the Validity of Latent Class Models

##### With: Dr. Brian Flaherty (U. Washington Psychology)

### Previous Projects

#### Classification Methods for Quantitative Electromyography

##### With: Tameem Hesham (Systems Design Engineering), Dr. Ruth Urner (Computer Science), Dr. Daniel Stashuk (Systems Design Engineering)

[UAI Paper]

#### Pattern Recognition and Characterization for Surface Self-Assembly Imaging

##### With: Robert Suderman, Dr. Nasser Abukhdeir

[Paper (preprint; appearing in Physical Review E)]

#### Evaluating Time-Of-Use Electricity Pricing

##### With: Adedamola Adepetu, Elnaz Rezaei, Dr. Srinivasan Keshav

[IEEE SmartGridComm Paper]

#### Tracking People Over Time in 19th-Century Canada

##### With: Dr. Luiza Antonie (Guelph Economics), Dr. Kris Inwood (Guelph Econ and History), Dr. J. Andrew Ross (Guelph History), Laura Richards (Guelph Computer Science)

[MLJ Paper]

#### Reconstructing 3D Galaxies from 2D Imaging

##### With: Michael Cormier, Dr. Richard Mann

[MMath Thesis]

#### Predicting Household Electricity Loads

##### With: Rayman Preet, Peter Xiang Gao, Dr. Srinivasan Keshav

[IEEE SmartGridComm Paper]

#### Bayesian Models for Contact Tracing

##### With: Ayman Shalaby Anshour

[MMath Thesis]

### General Topics

#### Adaptive Treatment Strategies

I am interested in the development of *adaptive treatment
strategies* from medical data. These are strategies for treating
patients that take into account observations about the patients,
like age or level of depression or treatment history, as well as
knowledge of past and future treatment options. Choosing a treatment
based on all of this information allows us to reason about both the
therapeutic and the diagnostic effects of any given treatment, which
in turn allows us to be more forward-thinking in our treatment
choices. For example, it may make sense to administer "treatment A"
not because it is highly effective in all cases, but because it
divides patients into groups of responders and nonresponders, each
of which can be more effectively treated using this information. In
this case, both the therapeutic effects and the diagnostic knowledge
resulting from treatment A are used to our advantage.

I am interested in constructing these strategies using patient data. This process raises many questions once we look beyond an idealized, abstracted view of the problem.

Research questions I am most interested in include:

- What do we do when there are different and conflicting measures of "effectiveness" that must be traded off?
- How confident can we be that the treatment strategy we discover is effective?
- How do we decide which patient variables are relevant?
- What do we do when we only have partial observations of patient data?
- How can we leverage knowledge about how patients are similar or different?
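
The responder/nonresponder example above can be made concrete with a small simulation. The sketch below is illustrative only: the data, treatments, and payoffs are invented, not drawn from any study mentioned here. It estimates a two-stage treatment strategy by backward induction, first finding the best stage-2 decision for each observed state, then valuing each stage-1 action by the best future it enables.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical setup: "treatment A" (a1 = 1) has only a small direct
# therapeutic effect, but it is diagnostic: after giving it, we observe
# whether the patient is a responder, and the stage-2 choice can exploit that.

def simulate(n=20000):
    rows = []
    for _ in range(n):
        responder = random.random() < 0.5      # latent patient type
        a1 = random.randrange(2)               # randomized stage-1 treatment
        obs = responder if a1 == 1 else None   # status revealed only by A
        a2 = random.randrange(2)               # randomized stage-2 treatment
        y = 0.2 if a1 == 1 else 0.0            # small direct effect of A
        if a2 == 1:                            # stage-2 drug helps responders,
            y += 2.0 if responder else -1.0    # harms non-responders
        rows.append((a1, obs, a2, y))
    return rows

rows = simulate()

# Empirical mean outcome for each (stage-1 action, observation, stage-2 action).
totals, counts = defaultdict(float), defaultdict(int)
for a1, obs, a2, y in rows:
    totals[(a1, obs, a2)] += y
    counts[(a1, obs, a2)] += 1
q2 = {key: totals[key] / counts[key] for key in totals}

# Backward induction: value of each post-stage-1 state under the best a2.
states = {(a1, obs) for (a1, obs, _) in q2}
v2 = {s: max(q2[s + (a2,)] for a2 in (0, 1)) for s in states}

# Stage-1 value: average the best achievable future value over the states
# that each stage-1 action leads to.
n1, vsum = defaultdict(int), defaultdict(float)
for a1, obs, _, _ in rows:
    n1[a1] += 1
    vsum[a1] += v2[(a1, obs)]
v1 = {a1: vsum[a1] / n1[a1] for a1 in (0, 1)}
```

In this toy example, treatment A wins at stage 1 not because of its small direct effect but because the information it reveals makes the stage-2 choice far more effective: `v1[1]` comes out near 1.2, versus roughly 0.5 for `v1[0]`.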

#### Bayesian Global Optimization

My PhD research focused on Bayesian approaches to non-convex global optimization problems.


"Global optimization" refers to the problem of finding the minimum of a real-valued function that is either non-convex itself or constrained to a non-convex domain. One class of algorithms for solving global optimization problems is the class of "response surface" methods. These methods construct a probabilistic model of the underlying objective function from a small number of observed data points, and then use the model to decide which point to evaluate next in the search for the function's optimum.

My doctoral dissertation examined the state of the art in response surface methods for global optimization and offered improvements in the areas of model selection, acquisition criteria, and experimental evaluation. Traditional response surface methods can be among the most efficient optimization approaches in terms of the number of function evaluations required, but they have significant drawbacks: they are very sensitive to the initial choice of function evaluations, they are not invariant to shifting and scaling of the objective function, they do not make use of gradient information, and their experimental evaluation to date has been limited to a small number of artificial test functions. My dissertation work addresses each of these problems with the goal of making response surface methods more widely applicable.
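
As a concrete, heavily simplified illustration of the response-surface idea (a generic sketch, not the dissertation's specific contributions): the code below fits a Gaussian-process model to three evaluations of a toy one-dimensional objective and scores candidate points with the classic expected-improvement acquisition criterion to choose the next evaluation.

```python
import math

def k(a, b, length=0.5):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Cholesky factor L (lower triangular) of a small SPD matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[j][i] * x[j] for j in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x):
    """Posterior mean and standard deviation of the GP at x given (xs, ys)."""
    K = [[k(a, b) + (1e-9 if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve(L, ys)
    ks = [k(a, x) for a in xs]
    mu = sum(w * v for w, v in zip(ks, alpha))
    var = k(x, x) - sum(w * v for w, v in zip(ks, solve(L, ks)))
    return mu, math.sqrt(max(var, 1e-12))

def expected_improvement(mu, sd, best):
    """E[max(best - Y, 0)] for Y ~ N(mu, sd^2); larger is better when minimizing."""
    z = (best - mu) / sd
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return (best - mu) * cdf + sd * pdf

f = lambda x: (x - 0.7) ** 2              # toy objective to minimize
xs = [0.0, 0.5, 1.0]                      # initial design
ys = [f(x) for x in xs]
best = min(ys)

# Score a grid of candidates and pick the next point to evaluate.
candidates = [i / 100 for i in range(101)]
scores = [expected_improvement(*gp_posterior(xs, ys, c), best) for c in candidates]
x_next = candidates[scores.index(max(scores))]
```

The criterion trades off exploitation (points where the model's mean is low) against exploration (points where the model is uncertain); here it selects a point between the current best observation at 0.5 and the evaluation at 1.0.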

To achieve this goal, I introduced a dimension-independent measure of problem difficulty based on the expected Euler characteristic of excursion sets of a Gaussian process, and presented an efficient method for computing this measure. The measure allows us to express a preference for simple functions when estimating parameters, and to generate test problems with different dimensionality but similar difficulty. Using the idea of a prior on problem difficulty, I showed how a maximum a posteriori objective can be used to robustly estimate reasonable kernel parameters, eliminating the need for initial evaluation points and reducing the need for user-selected tuning parameters.

I also developed new acquisition criteria (i.e., infill sample criteria) that are invariant to vertical shifting and scaling of the objective function, which reduces the need to tune the optimization algorithm for use with different objective functions. I illustrated the use of gradient information with response surface methods, showing that the methods we developed work well when given this information; empirical results showed that response-surface-based optimization with gradient information can outperform a quasi-Newton method with random restarts.

Finally, I introduced a new methodology for evaluating global optimization techniques based on generating many (i.e., tens of thousands of) test functions and measuring performance on each. This enabled me to examine an important but unverified tacit assumption in the field, namely that an improved response surface model results in better optimization performance.

I believe that further research in this area should be driven by concerns that arise when these methods are used in practice. To date, I have applied the response surface approach to a robotics optimization problem. I am currently investigating an application in computer vision, and in the future I intend to investigate an application in computational linguistics.

#### Budgeted Learning

My master's thesis examined classifier learning in a setting where observing the value of a feature of a training example has an associated cost, and the total cost of all feature values acquired during training must stay within a fixed budget. It compared methods for sequentially choosing which feature value to purchase next, given the remaining budget and the user's current knowledge of the naive Bayes model parameters. This problem is similar to "active learning," but active learning scenarios assume the ability to purchase class labels, whereas we were interested in purchasing feature values. Optimal budgeted learning in a Bayesian framework has been proven computationally intractable. The most successful heuristics for budgeted learning rely on the ability to compute the expected change in an estimate of classifier performance upon purchasing one or more features, and then purchase the features that are expected to improve performance the most.
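
The setting can be sketched in a few lines. The code below is schematic and not the thesis's method: the purchase-scoring rule here is a cheap posterior-variance heuristic, a stand-in for the expected-performance-change criteria described above, and the data-generating model is invented for illustration.

```python
import math
import random

random.seed(1)

# Budgeted learning toy: class labels are free, but revealing one binary
# feature value of a training example costs one unit of budget. The learner
# keeps a Beta(1, 1) posterior over each P(feature_j = 1 | class) and
# greedily buys the feature value whose posterior is currently widest.

N_FEATURES, BUDGET = 4, 60
TRUE_P = {0: [0.9, 0.8, 0.2, 0.1],      # P(feature_j = 1 | class 0)
          1: [0.1, 0.2, 0.8, 0.9]}      # P(feature_j = 1 | class 1)

def draw(c):
    """Sample a feature vector for a class-c example from the true model."""
    return [int(random.random() < p) for p in TRUE_P[c]]

# Beta posterior counts: alpha counts observed ones, beta observed zeros.
alpha = {(c, j): 1 for c in (0, 1) for j in range(N_FEATURES)}
beta = {(c, j): 1 for c in (0, 1) for j in range(N_FEATURES)}

def posterior_var(c, j):
    a, b = alpha[(c, j)], beta[(c, j)]
    return a * b / ((a + b) ** 2 * (a + b + 1))

for _ in range(BUDGET):
    # Greedy purchase: the (class, feature) cell with the widest posterior.
    c, j = max(((c, j) for c in (0, 1) for j in range(N_FEATURES)),
               key=lambda cj: posterior_var(*cj))
    value = draw(c)[j]                  # reveal that feature of a class-c example
    if value:
        alpha[(c, j)] += 1
    else:
        beta[(c, j)] += 1

def classify(x):
    """Naive Bayes using posterior-mean feature probabilities, uniform prior."""
    def loglik(c):
        s = 0.0
        for j, v in enumerate(x):
            p = alpha[(c, j)] / (alpha[(c, j)] + beta[(c, j)])
            s += math.log(p if v else 1 - p)
        return s
    return max((0, 1), key=loglik)

# Accuracy of the purchased-data classifier on fresh examples.
test = [(c, draw(c)) for c in (0, 1) for _ in range(500)]
acc = sum(classify(x) == c for c, x in test) / len(test)
```

Replacing `posterior_var` with an estimate of the expected change in classifier performance after a purchase recovers the flavor of the more successful heuristics the thesis compares.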

Recently, advances have been made in constructing non-trivial confidence intervals for classifier performance in small-sample settings. These confidence intervals could be used both for developing effective feature-purchasing heuristics and for providing performance guarantees, since the resulting purchased dataset is frequently too small to admit cross-validation or resampling-based approaches. I am exploring this avenue of research with Eric Laber at North Carolina State University.