Matrix-Vector Queries Trace Estimation with Hutch++
Published in University Research Project, 2026
This project studies efficient trace estimation of large matrices using matrix–vector queries through randomized numerical linear algebra methods.
The work focuses on the problem where a matrix is too large to compute or store explicitly, but matrix–vector products can still be evaluated efficiently. In such settings, estimating quantities like the matrix trace becomes an important computational challenge.
To address this, the project implements and analyzes three variants of the Hutch++ algorithm, a modern stochastic trace estimation method that significantly improves upon the classical Hutchinson estimator.
A key application explored in this project is triangle counting in large graphs, using the Wiki-Vote social network dataset. In graph theory, the number of triangles in a graph can be computed through
\[\text{triangles} = \frac{1}{6}\operatorname{tr}(B^3)\]where $B$ is the adjacency matrix of the graph. Instead of explicitly computing $B^3$, the project uses matrix-vector oracles to estimate the trace efficiently.
Key components of the project include:
- Implementation of Hutch++ stochastic trace estimation
- Development of NA-Hutch++ (non-adaptive) and Gaussian-Hutch++ variants
- Construction of matrix–vector oracle representations for large matrices
- Application to triangle counting in large-scale graph datasets
- Performance evaluation comparing different estimators and query complexities
The project demonstrates how randomized algorithms and matrix sketching techniques enable scalable analysis of large datasets where traditional linear algebra methods become computationally infeasible.
This work highlights the power of matrix-vector query models and randomized numerical linear algebra for solving large-scale problems in graph analytics and scientific computing.
