# Preprints

## P8. A Survey of Vectorization Methods in Topological Data Analysis

Attempts to incorporate topological information in supervised learning tasks have resulted in the creation of several techniques for vectorizing persistent homology barcodes. In this paper, we study thirteen such methods. Besides describing an organizational framework for these methods, we comprehensively benchmark them against three well-known classification tasks. Surprisingly, we discover that the best-performing method is a simple vectorization, which consists only of a few elementary summary statistics. Finally, we provide a convenient web application which has been designed to facilitate exploration and experimentation with various vectorization methods.

Status: Preprint (2022), the arxiv version is available here.

## P7. Harder-Narasimhan Filtrations and Zigzag Persistence

With Marc Fersztand and Ulrike Tillmann.

We introduce a sheaf-theoretic stability condition for finite acyclic quivers. Our main result establishes that for representations of affine type A quivers, there is a precise relationship between the associated Harder-Narasimhan filtration and the barcode of the periodic zigzag persistence module obtained by unwinding the underlying quiver.

Status: Preprint (2022), the arxiv version is available here.

## P6. Topological Inference of the Conley Index

With Ka Man Yim.

The Conley index of an isolated invariant set is a fundamental object in the study of dynamical systems. Here we consider smooth functions on closed submanifolds of Euclidean space and describe a framework for inferring the Conley index of any compact, connected isolated critical set of such a function with high confidence from a sufficiently large finite point sample. The main construction of this paper is a specific index pair which is local to the critical set in question. We establish that these index pairs have positive reach and hence admit a sampling theory for robust homology inference. This allows us to estimate the Conley index, and as a direct consequence, we are also able to estimate the Morse index of any critical point of a Morse function using finitely many local evaluations.

Status: Preprint (2022), the arxiv version is available here.

## P5. Morse Theory for Complexes of Groups

With Naya Yerolemou.

We construct an equivariant version of discrete Morse theory for simplicial complexes endowed with group actions. The key ingredient is a 2-categorical criterion for making acyclic partial matchings on the quotient space compatible with an overlaid complex of groups. We use the discrete flow category of any such compatible matching to build the corresponding Morse complex of groups. Our main result establishes that the development of the Morse complex of groups recovers the original simplicial complex up to equivariant homotopy equivalence.

Status: Preprint (2022), the arxiv version is available here.

## P4. A Topological Approach to Mapping Space Signatures

A common approach for describing classes of functions and probability measures on a topological space $$\mathcal{X}$$ is to construct a suitable map $$\Phi$$ from $$\mathcal{X}$$ into a vector space, where linear methods can be applied to address both problems. The case where $$\mathcal{X}$$ is a space of paths $$[0,1] \to \mathbb{R}^n$$ and $$\Phi$$ is the path signature map has received much attention in stochastic analysis and related fields. In this article we develop a generalized $$\Phi$$ for the case where $$\mathcal{X}$$ is a space of maps $$[0,1]^d \to \mathbb{R}^n$$ for any $$d \in \mathbb{N}$$, and show that the map $$\Phi$$ generalizes many of the desirable algebraic and analytic properties of the path signature to $$d \ge 2$$. The key ingredient to our approach is topological; in particular, our starting point is a generalisation of K-T Chen's path space cochain construction to the setting of cubical mapping spaces.

Status: Preprint (2022), the arxiv version is located here.

## P3. Multivariate Central Limit Theorems for Random Clique Complexes

With Tadas Temčinas and Gesine Reinert.

Motivated by open problems in applied and computational algebraic topology, we establish multivariate normal approximation theorems for three random vectors which arise organically in the study of random clique complexes. These are:

1. the vector of critical simplex counts attained by a lexicographical Morse matching,
2. the vector of simplex counts in the link of a fixed simplex, and
3. the vector of total simplex counts.

The first of these random vectors forms a cornerstone of modern homology algorithms, while the second one provides a natural generalisation for the notion of vertex degree, and the third one may be viewed from the perspective of $$U$$-statistics. To obtain distributional approximations for these random vectors, we extend the notion of dissociated sums to a multivariate setting and prove a new central limit theorem for such sums using Stein's method.

Status: Preprint (2021); here is the arxiv version.

## P2. Tangent Space and Dimension Estimation with the Wasserstein Distance

With Uzu Lim and Harald Oberhauser.

We provide explicit bounds on the number of sample points required to estimate tangent spaces and intrinsic dimensions of (smooth, compact) Euclidean submanifolds via local principal component analysis. Our approach directly estimates local covariance matrices, which in turn allows us to simultaneously estimate both the tangent spaces and the intrinsic dimension of a manifold. The key arguments involve a matrix concentration inequality, a Wasserstein bound for flattening a manifold, and a Lipschitz relation for the covariance matrix with respect to the Wasserstein distance.

Status: Preprint (2021), the arxiv version is here.

## P1. Complex Links and Hilbert-Samuel Multiplicities

With Martin Helmer.

We describe a framework for estimating Hilbert-Samuel multiplicities $$e_XY$$ for pairs of projective varieties $$X < Y$$ from finite point samples rather than defining equations. The first step involves proving that this multiplicity remains invariant under certain hyperplane sections which reduce $$X$$ to a point $$p$$ and $$Y$$ to a curve $$C$$. Next, we establish that $$e_pC$$ equals the Euler characteristic (and hence, the cardinality) of the complex link of $$p$$ in $$C$$. Finally, we provide explicit bounds on the number of uniform point samples needed (in an annular neighborhood of $$p$$ in $$C$$) to determine this Euler characteristic with high confidence.

Status: Preprint (2020), the arxiv version can be found here.

# (2022)

## 20. The Space of Barcode Bases for Persistence Modules

With Emile Jacquard and Ulrike Tillmann.

The barcode of a persistence module serves as a complete combinatorial invariant of its isomorphism class. Barcodes are typically extracted by performing changes of basis on a persistence module until the constituent matrices have a special form. Here we describe a new algorithm for computing barcodes which also keeps track of, and outputs, such a change of basis. Our main result is an explicit characterisation of the group of transformations that sends one barcode basis to another. Armed with knowledge of the entire space of barcode bases, we are able to show that any map of persistence modules can be represented via a partial matching between bars provided that neither source nor target admits nested bars in its barcode. We also generalise the algorithm and results described above to work for zizag modules.

Status: Appeared in Journal of Applied and Computational Topology, available here; the arxiv version is located here.

Emile Jacquard, Vidit Nanda and Ulrike Tillmann. The space of barcode bases for persistence modules. Journal of Applied and Computational Topology., DOI: 10.1007/s41468-022-00094-6, July 2022.

## 19. Conormal Spaces and Whitney Stratifications

With Martin Helmer.

We describe a new algorithm for computing Whitney stratifications of complex projective varieties. The main ingredients are (a) an algebraic criterion, due to Lê and Teissier, which reformulates Whitney regularity in terms of conormal spaces and maps, and (b) a new interpretation of this conormal criterion via ideal saturations, which can be practically implemented on a computer. We show that this algorithm improves upon the existing state of the art by several orders of magnitude, even for relatively small input varieties. En route, we introduce related algorithms for efficiently stratifying affine varieties, flags on a given variety, and algebraic maps.

Status: Accepted at Foundations of Computational Mathematics. The official version is here and the arxiv version is here.

Martin Helmer and Vidit Nanda. Conormal spaces and Whitney stratifications. Foundations of Computational Mathematics., DOI: 10.1007/s10208-022-09574-8, April 2022.

## 18. Principal Components along Quiver Representations

With Anna Seigal and Heather Harrington.

Quiver representations arise naturally in many areas across mathematics. Here we describe an algorithm for calculating the vector space of sections, or compatible assignments of vectors to vertices, of any finite-dimensional representation of a finite quiver. Consequently, we are able to define and compute principal components with respect to quiver representations. These principal components are solutions to constrained optimisation problems defined over the space of sections, and are eigenvectors of an associated matrix pencil.

Status: Accepted at Foundations of Computational Mathematics. The official version is here and the arxiv version is here.

Anna Seigal, Heather Harrington and Vidit Nanda. Principal components along quiver representations. Foundations of Computational Mathematics., DOI: 10.1007/s10208-022-09563-x, April 2022.

## 17. Dist2Cycle: A Simplicial Neural Network for Homology Localization

With Alex Keros and Kartic Subr.

Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations between vertices at different resolutions, all at once. This concept is central towards detection of higher dimensional topological features of data, features to which graphs, encoding only pairwise relationships, remain oblivious. While attempts have been made to extend graph neural networks to a simplicial complex setting, the methods do not inherently exploit, or reason about, the underlying topological structure of the network. We propose a graph convolutional model for learning functions parametrized by the $$k$$-homological features of simplicial complexes. By spectrally manipulating their combinatorial $$k$$-dimensional Hodge Laplacians, the proposed model enables learning topological features of the underlying simplicial complexes, specifically, the distance of each $$k$$-simplex from the nearest optimal" $$k$$-th homology generator, effectively providing an alternative to homology localization.

Status: Accepted at the AAAI 2022 conference; the official version is here, and the the arxiv version is here.

Alexandros Keros, Vidit Nanda and Kartic Subr. Dist2Cycle: a simplicial neural network for homology localization. Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence. Vancouver, Canada., February 2022.

# (2020)

## 16. Equivariant Simplicial Reconstruction

With Lisa Carbone and Yusra Naqvi.

We introduce and analyze parallelizable algorithms to compress and accurately reconstruct finite simplicial complexes that have non-trivial automorphisms. The compressed data – called a complex of groups – amounts to a functor from (the poset of simplices in) the orbit space to the 2-category of groups, whose higher structure is prescribed by isomorphisms arising from conjugation. Using this functor, we show how to algorithmically recover the original complex up to equivariant simplicial isomorphism. Our algorithms are derived from generalizations (by Bridson-Haefliger, Carbone-Rips and Corson, among others) of the classical Bass-Serre theory for reconstructing group actions on trees.

Status: Appeared in SIAGA; the published version is here and the arxiv version is here.

Lisa Carbone, Vidit Nanda and Yusra Naqvi. Equivariant simplicial reconstuction. SIAM Journal on Applied Algebra and Geometry (SIAGA), Volume 4, Issue 4, pp 532–552, December 2020.

## 15. Geometric Anomaly Detection in Data

This paper describes the systematic application of local topological methods for detecting interfaces and related anomalies in complicated high-dimensional data. By examining the topology of small regions around each point, one can optimally stratify a given dataset into clusters, each of which is in turn well-approximable by a suitable submanifold of the ambient space. Since these approximating submanifolds might have different dimensions, we are able to detect non-manifold like singular regions in data even when none of the data points have been sampled from those singularities. We showcase this method by identifying the intersection of two surfaces in the 24-dimensional space of cyclo-octane conformations, and by locating all the self-intersections of a Henneberg minimal surface immersed in 3-dimensional space. Due to the local nature of the required topological computations, the algorithmic burden of performing such data stratification is readily distributable across several processors.

Status: Appeared in the Proceedings of the National Academy of Scienes. PNAS's version is here, while the slightly older arxiv version is here.

Bernadette J Stolz, Heather A Harrington, Jared Tanner and Vidit Nanda. Geometric anomaly detection in data. Proceedings of the National Academy of Sciences, Volume 117, Issue 33, pp 19664-19669, August 2020.

## 14. Canonical Stratifications along Bisheaves

With Amit Patel.

A theory of bisheaves has been recently introduced to measure the homological stability of fibers of maps to manifolds. A bisheaf over a topological space is a triple consisting of a sheaf, a cosheaf, and compatible maps from the stalks of the sheaf to the stalks of the cosheaf. In this note we describe how, given a bisheaf constructible (i.e., locally constant) with respect to a triangulation of its underlying space, one can explicitly determine the coarsest stratification of that space for which the bisheaf remains constructible.

Status: Appeared in the Proceedings of the 15th Abel Symposium. The official version is here, and the arxiv version is here.

Vidit Nanda and Amit Patel. Canonical stratifications along bisheaves. Topological Data Analysis, Proceedings of the 18th Abel Symposium, editors NA Baas, GE Carlsson, G Quick, M Szymik, and M Thaule, pp 391 - 403, Springer June 2020.

# (2019)

## 13. Local Cohomology and Stratification

We outline an algorithm to recover the canonical (or, coarsest) stratification of a given finite-dimensional regular CW complex into cohomology manifolds, each of which is a union of cells. The construction proceeds by iteratively localizing the poset of cells about a family of subposets; these subposets are in turn determined by a collection of cosheaves which capture variations in cohomology of cellular neighborhoods across the underlying complex. The result is a nested sequence of categories, each containing all the cells as its set of objects, with the property that two cells are isomorphic in the last category if and only if they lie in the same canonical stratum. The entire process is amenable to efficient distributed computation.

Status: Appeared in Foundations of Computational Mathematics --- the online version is here and the arxiv preprint is here.

Vidit Nanda. Local cohomology and stratification. Foundations of Computational Mathematics, Volume 20, Issue 2, pp 195 - 222, March 2020.

## 12. Discrete Morse Theory and Localization

Incidence relations among the cells of a regular CW complex produce a 2-category of entrance paths whose classifying space is homotopy-equivalent to that complex. We show here that each acyclic partial matching on (the cells of) such a complex corresponds precisely to a homotopy-preserving localization of the associated entrance path category. Restricting attention further to the full localized subcategory spanned by critical cells, we obtain the discrete flow category whose classifying space is also shown to lie in the homotopy class of the original CW complex. This flow category forms a combinatorial and computable counterpart to the one described here by Cohen, Jones and Segal in the context of smooth Morse theory.

Status: Appeared in the Journal of Pure and Applied Algebra; the official manuscript is available here (and the arxiv copy is here).

Vidit Nanda. Discrete Morse theory and localization, Journal of Pure and Applied Algebra, Volume 223, Issue 2, pp 459 - 488, February 2019.

# (2018)

## 11. Persistence Paths and Signature Features in Topological Data Analysis

With Ilya Chevyrev and Harald Oberhauser.

We introduce a new feature map for barcodes that arise in persistent homology computation. The main idea is to first realize each barcode as a path in a convenient vector space, and to then compute its path signature which takes values in the tensor algebra of that vector space. The composition of these two operations — barcode to path, path to tensor series — results in a feature map that has several desirable properties for statistical learning, such as universality and characteristicness, and achieves state-of-the-art results on common classification benchmarks.

Status: Appeared in the IEEE Transactions on Pattern Analysis and Machine Intelligence, usually abbreviated TPAMI. Here is the official version, (requires an IEEE login), and here is the identical arxiv version.

Ilya Chevyrev, Vidit Nanda and Harald Oberhauser. Persistence paths and signature features in topological data analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 42, Issue 1, pp 192-202, (Published: December 2018, Printed: January 2020).

## 10. Discrete Morse Theory and Classifying Spaces

With Dai Tamaki and Kohei Tanaka.

The aim of this paper is to develop a refinement of Forman’s discrete Morse theory. To an acyclic partial matching µ on a finite regular CW complex X, Forman introduced a discrete analogue of gradient flows. Although Forman’s gradient flow has been proved to be useful in practical computations of homology groups, it is not sufficient to recover the homotopy type of X. Forman also proved the existence of a CW complex which is homotopy equivalent to X and whose cells are in one-to-one correspondence with the critical cells of µ, but the construction is ad hoc and does not have a combinatorial description. By relaxing the definition of Forman’s gradient flows, we introduce the notion of flow paths, which contains enough information to reconstruct the homotopy type of X, while retaining a combinatorial description. The critical difference from Forman’s gradient flows is the existence of a partial order on the set of flow paths, from which a 2-category C(µ) is constructed. It is shown that the classifying space of C(µ) is homotopy equivalent to X by using homotopy theory of 2-categories. This result may be also regarded as a discrete analogue of the unpublished work of Cohen, Jones, and Segal on Morse theory from the early 90’s.

Status: Appeared in Advances in Mathematics, whose version is available here. Of course, you can always access the arxiv preprint here.

Vidit Nanda, Dai Tamaki and Kohei Tanaka. Discrete Morse theory and classifying spaces, Advances in Mathematics, Volume 340, pp 723 - 790, October 2018.

# (2017)

## 09. Topological Signals of Singularities in Ricci Flow

We implement methods from computational homology to obtain a topological signal of singularity formation in a selection of geometries evolved numerically by Ricci flow. Our approach, based on persistent homology, produces precise, quantitative measures describing the behavior of an entire collection of data across a discrete sample of times. We analyze the topological signals of geometric criticality obtained numerically from the application of persistent homology to models manifesting singularities under Ricci flow. The results we obtain for these numerical models suggest that the topological signals distinguish global singularity formation (collapse to a round point) from local singularity formation (neckpinch). Finally, we discuss the interpretation and implication of these results and future applications.

Status: Appeared in the open-source MDPI journal Axioms, official publisher version here.

Paul Alsing, Howard Blair, Matthew Corne, Gordon Jones, Konstantin Mischaikow and Vidit Nanda. Topological signals of singularities in Ricci flow, Axioms, Volume 6, Issue 3, Article 24, August 2017.

## 08. Higher Interpolation and Extension for Persistence Modules

With Peter Bubenik and Vin de Silva.

The use of topological persistence in contemporary data analysis has provided considerable impetus for investigations into the geometric and functional-analytic structure of the space of persistence modules. In this paper, we isolate a coherence criterion which guarantees the extensibility of non-expansive maps into this space across embeddings of the domain to larger ambient metric spaces. Our coherence criterion is category-theoretic, allowing Kan extensions to provide the desired extensions. As a consequence of such “higher interpolation”, it becomes possible to compare Vietoris-Rips and Cech complexes built within the space of persistence modules.

Status: Appeared in the first issue of the SIAM Journal on Applied Algebra and Geometry, affectionately (but confusingly?) called SIAGA. Here is the official version of the published article.

Peter Bubenik, Vin de Silva and Vidit Nanda. Higher interpolation and extension for persistence modules, SIAM Journal on Applied Algebra and Geometry, Volume 1, Issue 1, pp 272-284, June 2017.

# (2016)

## 07. Discrete Morse Theory for Computing Cellular Sheaf Cohomology

With Justin Curry and Robert Ghrist.

Sheaves and sheaf cohomology are powerful tools in computational topology, greatly generalizing persistent homology. We develop an algorithm for simplifying the computation of cellular sheaf cohomology via (discrete) Morse-theoretic techniques. As a consequence, we derive efficient techniques for computation of (ordinary) cohomology of a cell complex.

Status: Appeared in Foundations of Computational Mathematics --- the online version is here and the arxiv preprint is here.

Justin Curry, Robert Ghrist and Vidit Nanda. Discrete Morse Theory for Computing Cellular Sheaf Cohomology. Foundations of Computational Mathematics Volume 16, Issue 4, pp 875 - 897, August 2016.

# (2015)

## 06. A Topological Measurement of Protein Compressibility

Given X-ray crystallography data of a protein molecule from the PDB, we build a van der Waal weighted alpha shape representation of that protein molecule by constructing cells around each atom center. Thus, to each protein we assoicate a set of persistence diagrams (one for each dimension). Using elementary physical principles, we identify certain structural features of molecules that are conjectured to impact compressibility. A simple parameter search through the persistence diagrams isolates these features and provides a robust measure which exhibits remarkable linear correlation with experimentally computed protein compressibility.

Status: Appeared in the Japan Journal of Industrial and Applied Mathematics. Springer maintains an official version of the accepted article here.

Marcio Gameiro, Yasuaki Hiraoka, Shunsuke Izumi, Miroslav Kramar, Konstantin Mischaikow, Vidit Nanda, A Topological Measurement of Protein Compressibility, Japan Journal of Industrial and Applied Mathematics, Volume 32, Issue 1, pp 1-17, March 2015.

# (2014)

## 05. Reconstructing Functions from Random Samples

With Steve Ferry and Konstantin Mischaikow.

From a sufficiently large point sample lying on a compact Riemannian submanifold of Euclidean space, one can construct a simplicial complex which is homotopy-equivalent to that manifold with high confidence. We describe a corresponding result for a Lipschitz-continuous function between two such manifolds. That is, we outline the construction of a simplicial map which recovers the induced maps on homotopy and homology groups with high confidence using only finite sampled data from the domain and range, as well as knowledge of the image of every point sampled from the domain. We provide explicit bounds on the size of the point samples required for such reconstruction in terms of intrinsic properties of the domain, the co-domain and the function. This reconstruction is robust to certain types of bounded sampling and evaluation noise.

Status: Appeared in the Journal of Computational Dynamics (official version here), and also available here on the Arxiv.

Steve Ferry, Konstantin Mischaikow and Vidit Nanda. Reconstructing Functions from Random Samples. Journal of Computational Dynamics Volume 1, Issue 2, pp 233-248, December 2014.

## 04. Discrete Morse Theoretic Algorithms for Computing Homology of Complexes and Maps

We provide explicit and efficient algorithms based on discrete Morse theory to compute homology of a very general class of complexes. A set-valued map of top-dimensional cells between such complexes is a natural discrete approximation of an underlying (and possibly unknown) continuous function, especially when the evaluation of that underlying function is subject to measurement errors. We introduce a new Morse theoretic algorithm for deriving chain maps from these set-valued maps, and hence an effective scheme for computing the map induced on homology by the approximated continuous function.

Status: Appeared in Foundations of Computational Mathematics. The official online version of the published article is here.

Shaun Harker, Konstantin Mischaikow, Marian Mrozek and Vidit Nanda. Discrete Morse Theoretic Algorithms for Computing Homology of Complexes and Maps. Foundations of Computational Mathematics Volume 14, Issue 1, pp 151-184, February 2014.

## 03. Simplicial Models and Topological Inference in Biological Systems

With Radmila Sazdanovic.

This article is a user's guide to algebraic topological methods for data analysis with a particular focus on applications to datasets arising in experimental biology. We begin with the combinatorics and geometry of simplicial complexes and outline the standard techniques for imposing filtered simplicial structures on a general class of datasets. From these structures, one computes topological statistics of the original data via the algebraic theory of (persistent) homology. These statistics are shown to be computable and robust invariants of the shape underlying a dataset. Finally, we showcase some appealing instances of topology-driven inference in biological settings -- from the detection of a new type of breast cancer to the analysis of various neural structures.

Status: Apeared as Chapter 6 of the 2014 Springer book Discrete and Topological Models in Molecular Biology, editors N. Jonoska and M. Saito. Springer also offers e-prints of the article here.

Vidit Nanda and Radmila Sazdanovic, Simplicial Models and Topological Inference in Biological Systems, Discrete and Topological Models in Molecular Biology, Natural Computing Series (Springer), pp 109-141, January 2014.

# (2013)

## 02. Geometry in the Space of Persistence Modules

With Vin de Silva.

We study the geometry of the space of persistence modules and diagrams, with special attention to Cech and Rips complexes. The metric structures are determined in terms of interleaving maps (of modules) and matchings (between diagrams). We show that the relationship between the Cech and Rips complexes is governed by the relationship between the corresponding interleavings and matchings.

Status: Appeared in the Proceedings of the 29-th Symposium on Computational Geometry. Here is the official version of the published article.

Vin de Silva and Vidit Nanda. Geometry in the Space of Persistence Modules, Proceedings of the 29th Annual Symposuim on Computational Geometry, ACM, pp 397-404, June 2013.

## 01. Morse Theory for Filtrations and Efficient Computation of Persistent Homology

We introduce an efficient preprocessing algorithm to reduce the number of cells in a filtered cell complex while preserving its persistent homology groups. The technique is based on an extension of combinatorial Morse theory from complexes to filtrations of cell complexes.

This paper provides the theoretical basis for the Perseus software project designed to compute persistent homology of various types of filtrations.

Status: Appeared in Discrete and Computational Geometry. The official online version of the published article is available here.

Konstantin Mischaikow and Vidit Nanda. Morse Theory for Filtrations and Efficient Computation of Persistent Homology. Discrete & Computational Geometry, Volume 50, Issue 2, pp 330-353, September 2013.

# (2012)

## 00. Discrete Morse Theory for Filtrations

My Ph.D. dissertation from October 2012. The main results presented here are from the paper titled "Morse theory for filtrations and efficient computation of persistent homology". The final chapter outlines a bonus application: the filtered Morse theory may be used towards simplifying the construction of long exact sequences via the Zigzag lemma. Here is Rutgers' official copy of the document, which has been formatted with awful (but mandatory) double-spacing, to say nothing of the gigantic margins into which many marvelous proofs would fit.