We derive confidence sequences for sequentially estimating quantiles of any distribution, with coverage guarantees holding non-asymptotically and uniformly over all sample sizes. We also present non-asymptotic, time-uniform concentration bounds for the empirical distribution function, a quantile best-arm identification algorithm, sequential Kolmogorov-Smirnov tests, and procedures for sequential quantile A/B tests.
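As a toy illustration of the fixed-sample building block (not the paper's sequential construction), the classical distribution-free confidence interval for a quantile brackets it between two order statistics, with coverage computed from the binomial distribution. The function name and index-expansion strategy here are illustrative choices, not from the paper:

```python
from math import comb

def quantile_ci(xs, q, alpha):
    """Distribution-free fixed-sample CI for the q-quantile via order statistics.

    Coverage of [X_(l), X_(u)] (1-indexed order statistics) is at least
    P(l <= B <= u-1) for B ~ Binomial(n, q), for continuous distributions.
    """
    xs = sorted(xs)
    n = len(xs)

    def coverage(l, u):
        # P(l <= Binomial(n, q) <= u - 1)
        return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(l, u))

    # Start near the empirical quantile and expand symmetrically until
    # the binomial coverage reaches the target level.
    l = max(1, int(q * n))
    u = min(n, l + 1)
    while coverage(l, u) < 1 - alpha and (l > 1 or u < n):
        if l > 1:
            l -= 1
        if u < n:
            u += 1
    return xs[l - 1], xs[u - 1]
```

The sequential methods in the paper make intervals like this valid uniformly over time, rather than at a single fixed sample size.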
A sensitivity analysis in an observational study tests whether the qualitative conclusions of an analysis would change if we were to allow for the possibility of limited bias due to confounding. We propose a new class of non-asymptotic, distribution-free tests, the uniform general signed rank tests, for observational studies with paired data. This class includes uniform versions of the sign test and the Wilcoxon signed-rank test. Our tests adaptively choose a subpopulation which exhibits the strongest treatment effect, controlling for this adaptivity using a uniform martingale concentration argument, and achieve greater power than existing tests in a sensitivity analysis, especially with large samples.
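For context, the classical fixed-sample sign test that the uniform version generalizes is a short computation: count positive paired differences and compare against a Binomial(n, 1/2) null. This sketch shows the non-uniform building block only, not the paper's adaptive, time-uniform test:

```python
from math import comb

def sign_test_pvalue(diffs):
    """One-sided fixed-sample sign test for paired differences.

    H0: the median difference is <= 0. Ties (zero differences) are
    discarded, and the p-value is the upper Binomial(n, 1/2) tail at
    the observed number of positive differences.
    """
    n = sum(1 for d in diffs if d != 0)
    s = sum(1 for d in diffs if d > 0)
    return sum(comb(n, k) for k in range(s, n + 1)) / 2**n
```

For example, nine positive differences out of ten non-ties gives p = 11/1024, roughly 0.011.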
A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon; they are fundamental tools for sequential experimentation. We develop confidence sequences whose widths go to zero, with non-asymptotic coverage guarantees under nonparametric conditions. In particular, we derive an empirical-Bernstein bound growing at an iterated-logarithm rate which allows for non-asymptotic, sequential estimation of the mean of any bounded distribution, with confidence interval widths scaling appropriately with the estimated standard deviation. We apply our methods to covariance matrix estimation and to estimation of sample average treatment effect under the Neyman-Rubin potential outcomes model.
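To make "widths scaling with the estimated standard deviation" concrete, here is the fixed-sample empirical-Bernstein interval of Maurer and Pontil (2009) for a [0, 1]-valued sample; the paper's contribution is a sequential, time-uniform analogue of this style of bound. The two-sided union-bound adjustment (ln(4/δ)) is my choice for this sketch:

```python
from math import log, sqrt

def empirical_bernstein_ci(xs, delta):
    """Fixed-sample empirical-Bernstein CI for the mean of a [0,1]-valued
    sample (Maurer & Pontil, 2009), two-sided via a union bound.

    The width is driven by the *sample* variance, so it adapts to the
    data's spread rather than the worst-case range.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    half = sqrt(2 * var * log(4 / delta) / n) + 7 * log(4 / delta) / (3 * (n - 1))
    return max(0.0, mean - half), min(1.0, mean + half)
```

When the sample variance is small, the sqrt term vanishes and only the lower-order 1/n term remains, which is the behavior the abstract refers to.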
We give a powerful, general formulation of the Cramér-Chernoff method for exponential concentration inequalities which unifies and strengthens dozens of inequalities from the literature. These include classical results for concentration of scalar sums and martingales, more recent matrix concentration results, self-normalized inequalities, and results for Banach-space valued martingales and continuous-time processes. Such results are of broad theoretical interest, but are particularly useful as the foundation of a variety of methods for the analysis of sequential experiments.
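The starting point being generalized is the basic Cramér-Chernoff bound: for a real-valued random variable S and any λ > 0, Markov's inequality applied to e^{λS} gives

```latex
\mathbb{P}(S \ge x)
  \;\le\; \inf_{\lambda > 0} e^{-\lambda x}\, \mathbb{E}\!\left[e^{\lambda S}\right]
  \;=\; \exp\!\Big(-\sup_{\lambda > 0}\big\{\lambda x - \log \mathbb{E}\!\left[e^{\lambda S}\right]\big\}\Big).
```

Classical scalar, matrix, self-normalized, and martingale inequalities all arise from different ways of controlling the moment generating function term.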
A library implementing many techniques from my research for computing confidence sequences for means, quantiles, and cumulative distribution functions. Written in C++ with Python and R wrappers.
This repository contains all code to reproduce the simulations and plots from the paper "Sequential estimation of quantiles with applications to A/B-testing and best-arm identification". Of particular interest is an efficient C++ simulation harness for quantile best-arm identification.
This is where I cut my teeth on dependency injection and TDD, refactoring the download provider and preparing it for release as a public API in Gingerbread.
A confidence sequence is an infinite sequence of confidence intervals, progressively updated as new data arrive, satisfying the property that, with a desired coverage probability, all confidence intervals cover the parameter of interest uniformly over time. Confidence sequences allow an experimenter to “peek” at an updated confidence interval as often as desired and adaptively stop at any time, without breaking coverage guarantees. I will describe a framework for constructing confidence sequences with non-asymptotic guarantees in a variety of nonparametric settings, with emphasis on applications to average treatment effect estimation in a potential outcomes framework, quantile estimation based on i.i.d. samples from any distribution, and best-arm identification in a multi-armed bandit.
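The defining property can be stated in one line: for a parameter θ, confidence level 1 − α, and intervals CI_t computed from the first t observations,

```latex
\mathbb{P}\big(\theta \in \mathrm{CI}_t \ \text{for all } t \ge 1\big) \;\ge\; 1 - \alpha,
```

in contrast to a fixed-sample interval, which guarantees coverage only at a single, pre-specified t.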
We develop confidence sequences for sample average treatment effect estimation in randomized experiments, including experiments with adaptive randomization mechanisms such as biased coin designs. These confidence sequences enable an investigator to continuously monitor an experiment and adaptively stop based on observed results, without invalidating inference. We build upon uniform exponential concentration results to obtain a non-asymptotic coverage guarantee justified by the randomization mechanism. The confidence interval width is based on an empirical estimate of variance and achieves arbitrary precision for sufficiently large sample sizes.
It is now commonplace for organizations with websites or mobile apps to run randomized controlled experiments, or “A/B tests” as they’re often called in industry. Such experiments provide a reliable way to determine which product changes lead to the most successful user interactions. In this lecture we discuss why randomized experiments are so important, talk about some of the key design choices that go into A/B tests, and get a brief introduction to sequential monitoring of experimental results.
The management of dependencies among classes is one of the most important (and underappreciated) aspects of object-oriented programming. In this talk I make a case for composition over inheritance and give a brief introduction to dependency injection. I then spend the rest of the talk outlining a step-by-step, Fowler-style method for migrating incrementally from a system built upon static binding of global state to one that uses dependency injection exclusively.
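The core move in that migration can be shown in a few lines. This hypothetical example (names are mine, not from the talk) contrasts static binding to a global singleton with constructor injection, which lets a test substitute a fake collaborator:

```python
# Before: static binding to global state; hard to test in isolation.
#
#   class ReportGenerator:
#       def generate(self):
#           rows = Database.get_instance().query("SELECT name FROM users")
#           ...

# After: the collaborator is injected through the constructor.
class ReportGenerator:
    def __init__(self, database):
        self.database = database  # dependency injected, not looked up globally

    def generate(self):
        return "\n".join(self.database.query("SELECT name FROM users"))

# A test can now pass a fake in place of the real database.
class FakeDatabase:
    def query(self, sql):
        return ["alice", "bob"]

report = ReportGenerator(FakeDatabase()).generate()
```

The incremental migration works by introducing such constructor parameters one seam at a time, defaulting to the old global lookup until all callers are converted.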
In this talk I share some of my experiences developing a large-scale web crawler using Python and AWS. I give an overview of the Mercator web crawler, share some tips and hard-earned wisdom on implementing Mercator with Python and AWS, and end with some real-world results from our crawls.
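At its core, a Mercator-style crawler is a frontier queue plus a seen-set for URL deduplication. This minimal sketch (function names and the pluggable `fetch` callback are illustrative, not from the talk) omits the per-host politeness queues and persistence that a real Mercator implementation adds:

```python
from collections import deque

def crawl(seed_urls, fetch, max_pages=100):
    """Minimal crawl loop: FIFO frontier plus a seen-set for deduplication.

    `fetch(url)` should download a page and return its outlinks; here it
    is injected so the loop itself stays testable without the network.
    """
    frontier = deque(seed_urls)
    seen = set(seed_urls)
    pages = []
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        pages.append(url)
        for link in fetch(url):
            if link not in seen:  # never enqueue a URL twice
                seen.add(link)
                frontier.append(link)
    return pages
```

In production, the frontier becomes a set of per-host queues with crawl-delay scheduling, and the seen-set is typically a disk-backed structure, since it quickly outgrows memory.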
Recorded in late 2013/early 2014 while I was playing with a couple of awesome musicians in Little Heart. I played the lead guitar parts, wrote some of them, and coiled and uncoiled a lot of microphone cables.