This page contains all my publications; for more details, see my Google Scholar profile.

Published Papers

Emile Mathieu, Maximilian Nickel. Riemannian Continuous Normalizing Flows, in Advances in Neural Information Processing Systems 33, 2020.

Normalizing flows have shown great promise for modelling flexible probability distributions in a computationally tractable way. However, whilst data is often naturally described on Riemannian manifolds such as spheres, tori, and hyperbolic spaces, most normalizing flows implicitly assume a flat geometry, making them either misspecified or ill-suited in these situations. To overcome this problem, we introduce Riemannian continuous normalizing flows, a model which admits the parametrization of flexible probability measures on smooth manifolds by defining flows as the solution to ordinary differential equations. We show that this approach can lead to substantial improvements on both synthetic and real-world data when compared to standard flows or previously introduced projected flows.

@inproceedings{mathieu2019Riemannian,
title = {Riemannian Continuous Normalizing Flows},
author = {Mathieu, Emile and Nickel, Maximilian},
booktitle = {Advances in Neural Information Processing Systems 33},
year = {2020},
publisher = {Curran Associates, Inc.}
}

Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, Yee Whye Teh. Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders, in Advances in Neural Information Processing Systems 32, 2019, 12565–12576.

The Variational Auto-Encoder (VAE) is a popular method for learning a generative model and embeddings of the data. Many real datasets are hierarchically structured. However, traditional VAEs map data into a Euclidean latent space, which cannot efficiently embed tree-like structures. Hyperbolic spaces with negative curvature can. We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space. We empirically show better generalisation to unseen data than the Euclidean counterpart, and show that the model can qualitatively and quantitatively better recover hierarchical structures.
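As a rough illustration of the geometry mentioned in this abstract, the sketch below computes the geodesic distance on the unit Poincaré ball using the standard closed-form expression. It assumes curvature -1 and plain Python sequences as points; the function name `poincare_distance` is hypothetical and this is not code from the paper.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincare ball.

    Assumes curvature -1 and that both points lie strictly inside the ball.
    """
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    # d(x, y) = arcosh(1 + 2 |x - y|^2 / ((1 - |x|^2) (1 - |y|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y)))
```

The same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary of the ball than near the origin. This growth of available room with radius is the property usually credited for letting tree-like structures embed with low distortion.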

@inproceedings{mathieu2019Continuous,
title = {Continuous Hierarchical Representations with Poincar\'{e} Variational Auto-Encoders},
author = {Mathieu, Emile and Le Lan, Charline and Maddison, Chris J. and Tomioka, Ryota and Teh, Yee Whye},
booktitle = {Advances in Neural Information Processing Systems 32},
pages = {12565--12576},
year = {2019},
publisher = {Curran Associates, Inc.}
}

Emile Mathieu, Tom Rainforth, N Siddharth, Yee Whye Teh. Disentangling Disentanglement in Variational Autoencoders (oral: https://icml.cc/media/Slides/icml/2019/halla(12-11-00)-12-11-35-4811-disentangling_d.pdf), in Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, USA, 2019, vol. 97, 4402–4412.

We develop a generalisation of disentanglement in VAEs—decomposition of the latent representation—characterising it as the fulfilment of two factors: a) the latent encodings of the data having an appropriate level of overlap, and b) the aggregate encoding of the data conforming to a desired structure, represented through the prior. Decomposition permits disentanglement, i.e. explicit independence between latents, as a special case, but also allows for a much richer class of properties to be imposed on the learnt representation, such as sparsity, clustering, independent subspaces, or even intricate hierarchical dependency relationships. We show that the β-VAE varies from the standard VAE predominantly in its control of latent overlap and that for the standard choice of an isotropic Gaussian prior, its objective is invariant to rotations of the latent representation. Viewed from the decomposition perspective, breaking this invariance with simple manipulations of the prior can yield better disentanglement with little or no detriment to reconstructions. We further demonstrate how other choices of prior can assist in producing different decompositions and introduce an alternative training objective that allows the control of both decomposition factors in a principled manner.

@inproceedings{mathieu2019Disentaling,
title = {Disentangling Disentanglement in Variational Autoencoders},
author = {Mathieu, Emile and Rainforth, Tom and Siddharth, N and Teh, Yee Whye},
booktitle = {Proceedings of the 36th International Conference on Machine Learning},
pages = {4402--4412},
year = {2019},
volume = {97},
series = {Proceedings of Machine Learning Research},
address = {Long Beach, California, USA},
month = {09--15 Jun},
publisher = {PMLR},
oral = {https://icml.cc/media/Slides/icml/2019/halla(12-11-00)-12-11-35-4811-disentangling_d.pdf}
}

Benjamin Bloem-Reddy, Adam Foster, Emile Mathieu, Yee Whye Teh. Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks (oral: https://www.youtube.com/watch?v=0PlIFXBpIgU), in Conference on Uncertainty in Artificial Intelligence, 2018.

Empirical evidence suggests that heavy-tailed degree distributions occurring in many real networks are well-approximated by power laws with exponents η that may take values either less than or greater than two. Models based on various forms of exchangeability are able to capture power laws with η<2, and admit tractable inference algorithms; we draw on previous results to show that η>2 cannot be generated by the forms of exchangeability used in existing random graph models. Preferential attachment models generate power law exponents greater than two, but have been of limited use as statistical models due to the inherent difficulty of performing inference in non-exchangeable models. Motivated by this gap, we design and implement inference algorithms for a recently proposed class of models that generates η of all possible values. We show that although they are not exchangeable, these models have probabilistic structure amenable to inference. Our methods make a large class of previously intractable models useful for statistical inference.

@inproceedings{BloemReddy2018Sampling,
author = {Bloem-Reddy, Benjamin and Foster, Adam and Mathieu, Emile and Teh, Yee Whye},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
title = {Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks},
month = aug,
year = {2018},
oral = {https://www.youtube.com/watch?v=0PlIFXBpIgU}
}

Benjamin Bloem-Reddy, Emile Mathieu, Adam Foster, Tom Rainforth, Hong Ge, María Lomelí, Zoubin Ghahramani, Yee Whye Teh. Sampling and inference for discrete random probability measures in probabilistic programs, in NIPS Workshop on Advances in Approximate Bayesian Inference, 2017.

We consider the problem of sampling a sequence from a discrete random probability measure (RPM) with countable support, under (probabilistic) constraints of finite memory and computation. A canonical example is sampling from the Dirichlet Process, which can be accomplished using its stick-breaking representation and lazy initialization of its atoms. We show that efficient lazy initialization is possible if and only if a size-biased representation of the discrete RPM is used. For models constructed from such discrete RPMs, we consider the implications for generic particle-based inference methods in probabilistic programming systems. To demonstrate, we implement SMC for Normalized Inverse Gaussian Process mixture models in Turing.
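The canonical example in this abstract, sampling from a Dirichlet Process through its stick-breaking representation with lazily initialized atoms, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's Turing implementation: the class name `LazyDirichletProcess`, the `base_sampler` callback, and the numerical cut-off are all hypothetical choices made for the example.

```python
import random


class LazyDirichletProcess:
    """Draws from one realisation of DP(alpha, base), creating atoms on demand."""

    def __init__(self, alpha, base_sampler, rng=None):
        self.alpha = alpha
        self.base_sampler = base_sampler  # hypothetical callback: rng -> atom
        self.rng = rng or random.Random()
        self.weights = []     # stick weights instantiated so far
        self.atoms = []       # atom locations instantiated so far
        self.remaining = 1.0  # stick mass not yet broken off

    def sample(self):
        u = self.rng.random()
        cumulative = 0.0
        # First look for u among the atoms that already exist.
        for weight, atom in zip(self.weights, self.atoms):
            cumulative += weight
            if u < cumulative:
                return atom
        # Otherwise u lies in the unbroken stick: break new pieces until covered.
        while True:
            fraction = self.rng.betavariate(1.0, self.alpha)
            weight = fraction * self.remaining
            self.remaining -= weight
            self.weights.append(weight)
            self.atoms.append(self.base_sampler(self.rng))
            cumulative += weight
            # The cut-off on the remaining mass is a numerical safeguard only.
            if u < cumulative or self.remaining < 1e-12:
                return self.atoms[-1]
```

For instance, `LazyDirichletProcess(2.0, lambda rng: rng.gauss(0.0, 1.0))` gives an object whose repeated `sample()` calls reuse earlier atoms with positive probability, while memory grows only with the number of atoms actually visited.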

@inproceedings{bloemreddy2017Sampling,
title = {Sampling and inference for discrete random probability measures in probabilistic programs},
author = {Bloem-Reddy, Benjamin and Mathieu, Emile and Foster, Adam and Rainforth, Tom and Ge, Hong and Lomel\'{i}, Mar\'{i}a and Ghahramani, Zoubin and Teh, Yee Whye},
booktitle = {NIPS Workshop on Advances in Approximate Bayesian Inference},
year = {2017}
}