I am currently a postdoc with the Geometrica group at INRIA in France. I finished my Ph.D in Computer Science at CMU, advised by Gary Miller. My research is in geometric algorithms. I am most interested in the intersection of geometric algorithms and topological data analysis.

I am co-organizing a workshop at NIPS on Algebraic Topology and Machine Learning.

Watch the video of my recent talk on Mesh Generation and Topological Data Analysis.

Recently, I developed a new undergraduate course in Computational Geometry.

# Research

A New Approach to Output-Sensitive Voronoi Diagrams
Gary L. Miller and Donald R. Sheehy
SOCG: ACM Symposium on Computational Geometry. (to appear) 2013

We describe a new algorithm for computing the Voronoi diagram of a set of $n$ points in constant-dimensional Euclidean space. The running time of our algorithm is $O(f \log n \log \Delta)$ where $f$ is the output complexity of the Voronoi diagram and $\Delta$ is the spread of the input, the ratio of largest to smallest pairwise distances. Despite the simplicity of the algorithm and its analysis, it improves on the state of the art for all inputs with polynomial spread and near-linear output size. The key idea is to first build the Voronoi diagram of a superset of the input points using ideas from Voronoi refinement mesh generation. Then, the extra points are removed in a straightforward way that allows the total work to be bounded in terms of the output complexity, yielding the output sensitive bound. The removal only involves local flips and is inspired by kinetic data structures.

@inproceedings{miller13new,
Title = {A New Approach to Output-Sensitive Voronoi Diagrams and Delaunay Triangulations},
Author = {Gary L. Miller and Donald R. Sheehy},
Booktitle = {SOCG: Proceedings of the 2th ACM Symposium on Computational Geometry},
Year = {2013}}
A Fast Algorithm for Well-Spaced Points and Approximate Delaunay Graphs
Gary L. Miller, Donald R. Sheehy and Ameya Velingker
SOCG: ACM Symposium on Computational Geometry. (to appear) 2013

We present a new algorithm that produces a well-spaced superset of points conforming to a given input set in any dimension with guaranteed optimal output size. We also provide an approximate Delaunay graph on the output points. Our algorithm runs in expected time $O(2^{O(d)}(n\log n + m))$, where $n$ is the input size, $m$ is the output point set size, and $d$ is the ambient dimension. The constants only depend on the desired element quality bounds.

To gain this new efficiency, the algorithm approximately maintains the Voronoi diagram of the current set of points by storing a superset of the Delaunay neighbors of each point. By retaining quality of the Voronoi diagram and avoiding the storage of the full Voronoi diagram, a simple exponential dependence on $d$ is obtained in the running time. Thus, if one only wants the approximate neighbors structure of a refined Delaunay mesh conforming to a set of input points, the algorithm will return a size $2^{O(d)}m$ graph in $2^{O(d)}(n\log n + m)$ expected time. If $m$ is superlinear in $n$, then we can produce a hierarchically well-spaced superset of size $2^{O(d)}n$ in $2^{O(d)}n\log n$ expected time.

@inproceedings{miller13fast,
Title = {A Fast Algorithm for Well-Spaced Points and Approximate Delaunay Graphs},
Author = {Gary L. Miller and Donald R. Sheehy and Ameya Velingker},
Booktitle = {SOCG: Proceedings of the 29th ACM Symposium on Computational Geometry},
Year = {2013}}
Zigzag Zoology: Rips Zigzags for Homology Inference
Steve Y. Oudot and Donald R. Sheehy
SOCG: ACM Symposium on Computational Geometry. (to appear) 2013

For points sampled near a compact set $X$, the persistence barcode of the Rips filtration built from the sample contains information about the homology of $X$ as long as $X$ satisfies some geometric assumptions. The Rips filtration is prohibitively large, however zigzag persistence can be used to keep the size linear. We present several species of Rips-like zigzags and compare them with respect to the signal-to-noise ratio, a measure of how well the underlying homology is represented in the persistence barcode relative to the noise in the barcode at the relevant scales. Some of these Rips-like zigzags have been available as part of the Dionysus library for several years while others are new. Interestingly, we show that some species of Rips zigzags will exhibit less noise than the (non-zigzag) Rips filtration itself. Thus, the Rips zigzag can offer improvements in both size complexity and signal-to-noise ratio.

Along the way, we develop new techniques for manipulating and comparing persistence barcodes from zigzag modules. We give methods for reversing arrows and removing spaces from a zigzag. We also discuss factoring zigzags and a kind of interleaving of two zigzags that allows their barcodes to be compared. These techniques were developed to provide our theoretical analysis of the signal-to-noise ratio of Rips-like zigzags, but they are of independent interest as they apply to zigzag modules generally.

@inproceedings{oudot13zigzag,
Title = {Zigzag Zoology: Rips Zigzags for Homology Inference},
Author = {Steve Y. Oudot and Donald R. Sheehy},
Booktitle = {SOCG: Proceedings of the 29th ACM Symposium on Computational Geometry},
Year = {2013}}
Linear-Size Approximations to the Vietoris-Rips Filtration
Donald R. Sheehy
Discrete Comput Geom (to appear) 2013
Previously appeared in SOCG 2012

The Vietoris-Rips filtration is a versatile tool in topological data analysis. Unfortunately, it is often too large to construct in full. We show how to construct an $O(n)$-size filtered simplicial complex on an $n$-point metric space such that the persistence diagram is a good approximation to that of the Vietoris-Rips filtration. The filtration can be constructed in $O(n\log n)$ time. The constants depend only on the doubling dimension of the metric space and the desired tightness of the approximation. For the first time, this makes it computationally tractable to approximate the persistence diagram of the Vietoris-Rips filtration across all scales for large data sets.

Our approach uses a hierarchical net-tree to sparsify the filtration. We can either sparsify the data by throwing out points at larger scales to give a zigzag filtration, or sparsify the underlying graph by throwing out edges at larger scales to give a standard filtration. Both methods yield the same guarantees.

@article{sheehy13linear,
Title = {Linear-Size Approximations to the {V}ietoris-{R}ips Filtration},
Author = {Donald R. Sheehy},
Journal = {Discrete \& Computational Geometry},
Note = {to appear},
Year = {2013}}
A Multicover Nerve for Geometric Inference
Donald R. Sheehy
CCCG: The Canadian Conference in Computational Geometry 2012

We show that filtering the barycentric decomposition of a \v Cech complex by the cardinality of the vertices captures precisely the topology of $k$-covered regions among a collection of balls for all values of $k$. Moreover, we relate this result to the Vietoris-Rips complex to get an approximation in terms of the persistent homology.

@inproceedings{sheehy12multicover,
Title = {A Multicover Nerve for Geometric Inference},
Author = {Donald R. Sheehy},
Booktitle = {CCCG: Canadian Conference in Computational Geometry},
Year = {2012}}
New Bounds on the Size of Optimal Meshes
Donald R. Sheehy
Computer Graphics Forum, 31:5, 1627-1635 2012

The theory of optimal size meshes gives a method for analyzing the output size (number of simplices) of a Delaunay refinement mesh in terms of the integral of a sizing function over the input domain. The input points define a maximal such sizing function called the feature size. This paper presents a way to bound the feature size integral in terms of an easy to compute property of a suitable ordering of the point set. The key idea is to consider the pacing of an ordered point set, a measure of the rate of change in the feature size as points are added one at a time. In previous work, Miller et al.\ showed that if an ordered point set has pacing $\phi$, then the number of vertices in an optimal mesh will be $O(\phi^dn)$, where $d$ is the input dimension. We give a new analysis of this integral showing that the output size is only $\Theta(n + n\log \phi)$. The new analysis tightens bounds from several previous results and provides matching lower bounds. Moreover, it precisely characterizes inputs that yield outputs of size $O(n)$.

@article{sheehy12new,
Title = {{New Bounds on the Size of Optimal Meshes}},
Author = {Donald R. Sheehy},
Journal = {Computer Graphics Forum},
Volume= {31},
Number= {5},
Pages = {1627--1635},
Year = {2012}}
Minimax Rates for Homology Inference
Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh, Donald R. Sheehy and Larry Wasserman
AISTATS: AI and Statistics 2012

Often, high dimensional data lie close to a low-dimensional submanifold and it is of interest to understand the geometry of these submanifolds. The homology groups of a manifold are important topological invariants that provide an algebraic summary of the manifold. These groups contain rich topological information, for instance, about the connected components, holes, tunnels and sometimes the dimension of the manifold. In this paper, we consider the statistical problem of estimating the homology of a manifold from noisy samples under several different noise models. We derive upper and lower bounds on the minimax risk for this problem. Our upper bounds are based on estimators which are constructed from a union of balls of appropriate radius around carefully selected points. In each case we establish complementary lower bounds using Le Cam's lemma.

@article{balakrishnan12minimax,
Title     = {Minimax rates for homology inference},
Author    = {Sivaraman Balakrishnan and
Alessandro Rinaldo and
Don Sheehy and
Aarti Singh and
Larry A. Wasserman},
Journal   = {Journal of Machine Learning Research - Proceedings Track},
Volume    = {22},
Pages     = {64--72},
Year      = {2012}}
Beating the Spread: Time-Optimal Point Meshing
Gary L. Miller, Todd Phillips and Donald R. Sheehy
SOCG: ACM Symposium on Computational Geometry 2011

We present NetMesh, a new algorithm that produces a conforming Delaunay mesh for point sets in any fixed dimension with guaranteed optimal mesh size and quality. Our comparison based algorithm runs in time $O(n\log n + m)$, where $n$ is the input size and $m$ is the output size, and with constants depending only on the dimension and the desired element quality bounds. It can terminate early in $O(n\log n)$ time returning a $O(n)$ size Voronoi diagram of a superset of $P$ with a relaxed quality bound, which again matches the known lower bounds.

The previous best results in the comparison model depended on the log of the spread of the input, the ratio of the largest to smallest pairwise distance among input points. We reduce this dependence to $O(\log n)$ by using a sequence of $\epsilon$-nets to determine input insertion order in an incremental Voronoi diagram. We generate a hierarchy of well-spaced meshes and use these to show that the complexity of the Voronoi diagram stays linear in the number of points throughout the construction.

@inproceedings{miller11beating,
Title = {Beating the Spread: Time-Optimal Point Meshing},
Author = {Gary L. Miller and Todd Phillips and Donald R. Sheehy},
Booktitle = {SOCG: Proceedings of the 27th ACM Symposium on Computational Geometry},
Year = {2011}}
Topological Inference via Meshing
Benoit Hudson, Gary L. Miller, Steve Y. Oudot and Donald R. Sheehy
SOCG: ACM Symposium on Computational Geometry 2010

We apply ideas from mesh generation to improve the time and space complexities of computing the full persistent homological information associated with a point cloud $P$ in Euclidean space $R^d$. Classical approaches rely on the Cech, Rips, $\alpha$-complex, or witness complex filtrations of $P$, whose complexities scale badly with $d$. For instance, the $\alpha$-complex filtration incurs the $n^{\Omega(d)}$ size of the Delaunay triangulation, where $n$ is the size of $P$. The common alternative is to truncate the filtrations when the sizes of the complexes become prohibitive, possibly before discovering the most relevant topological features. In this paper we propose a new collection of filtrations, based on the Delaunay triangulation of a carefully-chosen superset of $P$, whose sizes are reduced to $2^{O(d^2)}n$. A nice property of these filtrations is to be interleaved multiplicatively with the family of offsets of $P$, so that the persistence diagram of $P$ can be approximated in $2^{O(d^2)}n^3$ time in theory, with a near-linear observed running time in practice. Thus, our approach remains tractable in medium dimensions, say 4 to 10.

@inproceedings{hudson10topological,
Title = {Topological Inference via Meshing},
Author = {Beno\^{i}t Hudson and Steve Y. Oudot and Gary L. Miller and Donald R. Sheehy},
Booktitle = {SOCG: Proceedings of the 26th ACM Symposium on Computational Geometry},
Year = {2010}}
Approximate Centerpoints with Proofs
Gary L. Miller and Donald R. Sheehy
Computational Geometry: Theory and Applications, 43(8): 647-654 2010
Previously appeared in SOCG 2009

We present the IteratedTverberg algorithm, the first deterministic algorithm for computing an approximate centerpoint of a set $S \subset R^d$ with running time subexponential in d. The algorithm is a derandomization of the IteratedRadon algorithm of Clarkson et al. (International Journal of Computational Geometry and Applications 6 (3) (1996) 357–377) and is guaranteed to terminate with an Ω(1/d2)-center. Moreover, it returns a polynomial-time checkable proof of the approximation guarantee, despite the coNP-completeness of testing centerpoints in general. We also explore the use of higher order Tverberg partitions to improve the running time of the deterministic algorithm and improve the approximation guarantee for the randomized algorithm. In particular, we show how to improve the $\Omega(1/d^2)$-center of the IteratedRadon algorithm to $\Omega(1/d^{\frac{r}{r-1}})$ for a cost of $O((rd)^d)$ in time for any integer r > 1.

@article{miller10approximate,
Title     = {Approximate centerpoints with proofs},
author    = {Gary L. Miller and Donald Sheehy},
journal   = {Computational Geometry: Theory and Applications},
volume    = {43},
number    = {8},
year      = {2010},
pages     = {647--654},
Year = {2010}}
The Centervertex Theorem for Wedge Depth
Gary L. Miller, Todd Phillips and Donald R. Sheehy
CCCG: The Canadian Conference in Computational Geometry 2009

There are many depth measures on point sets that yield centerpoint theorems. These theorems guarantee the existence of points of a specified depth, a kind of geometric median. However, the deep point guaranteed to exist is not guaranteed to be among the input, and often, it is not. The alpha-wedge depth of a point with respect to a point set is a natural generalization of halfspace depth that replaces halfspaces with wedges (cones or cocones) of angle $\alpha$. We introduce the notion of a centervertex, a point with depth at least $n/(d+1)$ among the set $S$. We prove that for any finite subset $S$ of $R^d$, a centervertex exists. We also present a simple algorithm for computing an approximate centervertex.

@inproceedings{miller09centervertex,
Title = {The Centervertex Theorem for Wedge Depth},
Author = {Gary L. Miller and Todd Phillips and Donald R. Sheehy},
Booktitle = {CCCG: Canadian Conference in Computational Geometry},
Year = {2009}}
Size Complexity of Volume Meshes vs. Surface Meshes
Benoit Hudson, Gary L. Miller, Todd Phillips and Donald R. Sheehy
SODA: ACM-SIAM Symposium on Discrete Algorithms 2009

Typical volume meshes in three dimensions are designed to conform to an underlying two-dimensional surface mesh, with volume mesh element size growing larger away from the surface. The surface mesh may be uniformly spaced or highly graded, and may have fine resolution due to extrinsic mesh size concerns. When we desire that such a volume mesh have good aspect ratio, we require that some space-filling {\it scaffold} vertices be inserted off the surface. We analyze the number of scaffold vertices in a setting that encompasses many existing volume meshing algorithms. We show that under simple preconditions, the number of scaffold vertices will be linear in the number of surface vertices.

@inproceedings{hudson09size,
Author = {Beno\^{i}t Hudson and Gary L. Miller and Todd Phillips and Donald R. Sheehy},
Booktitle = {SODA: ACM-SIAM Symposium on Discrete Algorithms},
Title = {Size Complexity of Volume Meshes vs. Surface Meshes},
Year = {2009}}
Shape Deformation in Continuous Map Generalization
Jeff Danciger, Satyan L. Devadoss, John Mugno, Donald R. Sheehy and Rachel Ward
GeoInformatica 13: 2, 203-221 2009

Given a collection of regions on a map, we seek a method of continuously altering the regions as the scale is varied. This is formalized and brought to rigor as well-defined problems in homotopic deformation. We ask the regions to preserve topology, area-ratios, and relative position as they change over time. A solution is presented using differential methods and computational geometric techniques. Most notably, an application of this method is used to provide an algorithm to obtain cartograms.

@article{danciger09shape,
Author = {Jeff Danciger and Satyan Devadoss and John Mugno and Donald R. Sheehy and Rachel Ward},
Journal = {GeoInformatica},
Number = {2},
Pages = {203--221},
Title = {Shape Deformation in Continuous Map Generalization},
Volume = {13},
Year = {2009}}
Achieving Spatial Adaptivity while Finding Approximate Nearest Neighbors
Jonathan Derryberry, Daniel D. Sleator, Donald R. Sheehy and Maverick Woo
CCCG: The Canadian Conference in Computational Geometry 2008

We present the first spatially adaptive data structure that answers approximate nearest neighbor (ANN) queries to points that reside in a geometric space of any constant dimension $d$.

The $L_t$-norm approximation ratio is $O(d^{1 + 1/t})$, and the running time for a query $q$ is $O(d^2 \lg \delta(p, q))$, where $p$ is the result of the preceding query and $\delta(p, q)$ is the number of input points in a suitably-sized box containing $p$ and $q$.

Our data structure has $O(d n)$ size and requires $O(d^2 n \lg n)$ preprocessing time, where $n$ is the number of points in the data structure.

The size of the bounding box for $\delta$ depends on $d$, and our results rely on the Random Access Machine (RAM) model with word size $\Theta(\lg n)$.

@inproceedings{derryberry08achieving,
Author = {John Derryberry and Daniel D. Sleator and Donald Sheehy and Maverick Woo},
Booktitle = {CCCG: Canadian Conference in Computational Geometry},
Title = {Achieving Spatial Adaptivity while Finding Approximate Nearest Neighbors},
Year = {2008}}
Linear-size meshes
Gary L. Miller, Todd Phillips and Donald R. Sheehy
CCCG: The Canadian Conference in Computational Geometry 2008

Most modern meshing algorithms produce asymptotically optimal size output. However, the size of the optimal mesh may not be bounded by any function of $n$. In this paper, we introduce \emph{well-paced} point sets and prove that these will produce linear size outputs when meshed with any size-optimal'' meshing algorithm. This work generalizes all previous work on the linear cost of balancing quadtrees. We also present an algorithm that uses well-paced points to produce a linear size Delaunay mesh of a point set in $R^d$.

@inproceedings{miller08linear,
Author = {Gary L. Miller and Todd Phillips and Donald R. Sheehy},
Booktitle = {CCCG: Canadian Conference in Computational Geometry},
Title = {Linear-size meshes},
Year = {2008}}
Size Competitive Meshing without Large Angles
Gary L. Miller, Todd Phillips and Donald R. Sheehy
ICALP: 34th International Colloquium on Automata, Languages and Programming 2007

We present a new meshing algorithm for the plane, Overlay Stitch Meshing (OSM), accepting as input an arbitrary Planar Straight Line Graph and producing a triangulation with all angles smaller than $170^\circ$. The output triangulation has competitive size with any optimal size mesh having equally bounded largest angle. The competitive ratio is $O(\log(L/s))$ where $L$ and $s$ are respectively the largest and smallest features in the input. OSM runs in $O(n \log (L/s) + m)$ time/work where $n$ is the input size and $m$ is the output size. The algorithm first uses Sparse Voronoi Refinement to compute a quality overlay mesh of the input points alone. This triangulation is then combined with the input edges to give the final mesh.

@inproceedings{miller07size,
Author = {Gary L. Miller and Todd Phillips and Donald R. Sheehy},
Booktitle = {34th International Colloquium on Automata, Languages and Programming},
Title = {Size Competitive Meshing without Large Angles},
Year = {2007}}
Compatible Triangulations and Point Partitions by Series Triangular Graphs
Jeff Danciger, Satyan L. Devadoss and Donald R. Sheehy
Computational Geometry: Theory and Applications 34, 195-202 2006

We introduce series-triangular graph embeddings and show how to partition point sets with them. This result is then used to prove an upper bound on the number of Steiner points needed to obtain compatible triangulations of point sets. The problem is generalized to finding compatible triangulations for more than two point sets and we show that such triangulations can be constructed with only a linear number of Steiner points added to each point set. Moreover, the compatible triangulations constructed by these methods are regular triangulations.

@article{danciger06compatible,
Author = {Jeff Danciger and Satyan Devadoss and Donald R. Sheehy},
Journal = {Computational Geometry: Theory and Applications},
Pages = {195--202},
Title = {Compatible Triangulations and Point Partitions by Series Triangular Graphs},
Volume = {34},
Year = {2006}}