Research

My research interests lie in machine learning, scientific computing, and randomized algorithms for large-scale data analysis. I am particularly interested in connecting mathematical structure with real-world sensing systems and computational modeling.

Before transitioning to data science and machine learning, I conducted research in mechanical engineering and materials science, focusing on optimization, energy systems, and materials performance.

Publications


Current Research


Matrix-Vector Trace Estimation with Hutch++

Research on scalable trace estimation for large matrices using randomized numerical linear algebra. The project studies the matrix–vector query model, where matrices are too large to form explicitly but matrix–vector products can be computed efficiently.

The work implements several variants of the Hutch++ stochastic trace estimator, including adaptive Hutch++, non-adaptive NA-Hutch++, and Gaussian-Hutch++. For positive semidefinite matrices, these algorithms reduce the number of matrix–vector queries needed for a (1 ± ε) relative-error estimate from the O(1/ε²) of Hutchinson's estimator to O(1/ε) by combining low-rank approximation with randomized probing.
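As a rough illustration of how low-rank approximation and randomized probing fit together, here is a minimal sketch of the basic Hutch++ scheme in Python with NumPy. It is not the project's implementation: the function name `hutchpp`, the `matvec` callback, the even three-way split of the query budget, and the use of random sign vectors are all assumptions made for this example.

```python
import numpy as np


def hutchpp(matvec, n, num_queries, seed=None):
    """Estimate tr(A) with about num_queries matrix-vector products.

    matvec(X) is assumed to return A @ X for an (n, k) array X.
    """
    rng = np.random.default_rng(seed)
    k = num_queries // 3  # a third of the budget for each stage
    # Random sign matrices: S for the sketch, G for the probing step.
    S = rng.choice([-1.0, 1.0], size=(n, k))
    G = rng.choice([-1.0, 1.0], size=(n, k))
    # Orthonormal basis Q for the range of A @ S (k products).
    Q, _ = np.linalg.qr(matvec(S))
    # Trace of A on span(Q), computed exactly (k products).
    trace_low_rank = np.trace(Q.T @ matvec(Q))
    # Hutchinson estimate on the part orthogonal to Q (k products).
    G_perp = G - Q @ (Q.T @ G)
    trace_residual = np.trace(G_perp.T @ matvec(G_perp)) / k
    return trace_low_rank + trace_residual


# Usage on a small dense PSD matrix; in practice matvec would wrap an
# implicit operator rather than an explicit array.
rng = np.random.default_rng(0)
M = rng.standard_normal((200, 200))
A = M @ M.T
estimate = hutchpp(lambda X: A @ X, n=200, num_queries=60, seed=1)
```

The low-rank stage captures the largest part of the trace exactly, so the random probes only have to estimate what remains, which is where the variance reduction comes from.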

The methods are applied to large graph datasets to estimate structural properties such as triangle counts using the identity triangles = tr(B³) / 6, where B is the adjacency matrix of an undirected graph. Instead of computing B³ explicitly, the project uses efficient matrix–vector operations and stochastic trace estimation to scale to large networks.
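The identity above can be checked on a small graph with a short Python sketch. The helper names `cube_matvec` and `estimate_triangles` are hypothetical, and the estimator shown is plain Hutchinson probing with random sign vectors, chosen here for brevity rather than taken from the project.

```python
import numpy as np


def cube_matvec(B, X):
    """Apply B^3 to X using three products with B; B^3 is never formed."""
    return B @ (B @ (B @ X))


def estimate_triangles(B, num_probes, seed=0):
    """Estimate tr(B^3) / 6 with random sign probe vectors."""
    rng = np.random.default_rng(seed)
    n = B.shape[0]
    G = rng.choice([-1.0, 1.0], size=(n, num_probes))
    # The average of g^T B^3 g over the probes estimates tr(B^3).
    trace_estimate = np.sum(G * cube_matvec(B, G)) / num_probes
    return trace_estimate / 6.0


# Complete graph on 4 vertices, which contains exactly 4 triangles.
B = np.ones((4, 4)) - np.eye(4)
exact = np.trace(cube_matvec(B, np.eye(4))) / 6.0
approx = estimate_triangles(B, num_probes=20000)
```

For a large network, B would typically be stored as a sparse matrix so that each product costs time proportional to the number of edges.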

Weather-Aware Urban Travel Itinerary Optimization

Independent research project at the University of Minnesota focused on developing a data-driven itinerary optimization framework for urban tourism planning. The project integrates machine learning and operations research to recommend optimal travel routes under real-world constraints such as travel time, weather conditions, congestion, and limited time budgets.

The system combines attraction utility modeling using Yelp data, weather-aware congestion estimation, and integer optimization for route planning. Core components include utility scoring based on ratings and review counts, predictive waiting-time models, and an orienteering-problem formulation that selects and sequences attractions to maximize total visitor utility within the available time budget.
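To make the orienteering idea concrete, the following is a toy Python sketch that enumerates every feasible visiting order for a handful of attractions and keeps the one with the highest total utility. It stands in for the integer program only on very small instances. The function name `best_itinerary`, the input layout, and the choice not to require a return to the start are assumptions for illustration; predicted waiting times could be folded into `visit_time`.

```python
from itertools import permutations


def best_itinerary(utilities, travel_time, visit_time, budget, start=0):
    """Return (total_utility, route) for the best route within the budget.

    Node `start` is the departure point and is not scored. All times
    share one unit (minutes here).
    """
    others = [i for i in range(len(utilities)) if i != start]
    best = (0, ())
    for size in range(1, len(others) + 1):
        for route in permutations(others, size):
            elapsed, prev = 0, start
            for stop in route:
                elapsed += travel_time[prev][stop] + visit_time[stop]
                prev = stop
            if elapsed > budget:
                continue  # route does not fit in the time budget
            total = sum(utilities[s] for s in route)
            if total > best[0]:
                best = (total, route)
    return best


# Toy instance: node 0 is the start, nodes 1 to 3 are attractions.
utilities = [0, 5, 4, 3]
visit_time = [0, 60, 30, 30]
travel_time = [
    [0, 30, 10, 10],
    [30, 0, 30, 30],
    [10, 30, 0, 10],
    [10, 30, 10, 0],
]
total, route = best_itinerary(utilities, travel_time, visit_time, budget=100)
```

Enumeration grows factorially with the number of attractions, which is why realistic instances call for an integer programming solver or a heuristic.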

Randomized Sketching Algorithms for Large-Scale Linear Algebra

Independent study at the University of Minnesota exploring randomized sketching and sampling techniques for scalable machine learning and numerical linear algebra. Current work includes leverage score sampling, CountSketch, subspace embeddings, and the Hutch++ algorithm for fast trace estimation in large matrices.
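As one example of these techniques, a CountSketch transform can be applied in a single pass over the rows of a matrix. The Python sketch below is a minimal illustration rather than the study's code: the function name `countsketch` and its parameters are assumptions for this example.

```python
import numpy as np


def countsketch(A, sketch_rows, seed=0):
    """Return S @ A for a CountSketch matrix S, without forming S.

    Each row of A is multiplied by a random sign and added to one
    randomly chosen row of the output.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    buckets = rng.integers(0, sketch_rows, size=n)  # target row per input row
    signs = rng.choice([-1.0, 1.0], size=n)         # random sign per input row
    SA = np.zeros((sketch_rows, A.shape[1]))
    # Unbuffered add, so rows that share a bucket accumulate correctly.
    np.add.at(SA, buckets, signs[:, None] * A)
    return SA


# Compress a tall 2000 x 5 matrix down to 200 rows.
rng = np.random.default_rng(2)
A = rng.standard_normal((2000, 5))
SA = countsketch(A, sketch_rows=200, seed=7)
```

Because every input row touches exactly one output row, the cost is proportional to the number of nonzeros in A, which is the main appeal of CountSketch over dense Gaussian sketches.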

Magnetic Pose Estimation Using Distributed Dipole Models

Research on estimating the pose of flexible permanent magnets using magnetic sensor arrays. The work focuses on solving nonlinear inverse problems, distributed dipole modeling, and real-time sensor data processing.
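The forward model behind such an inverse problem can be sketched in Python using the standard point-dipole field formula, with the magnet approximated as a sum of small dipoles. The function names, the argument layout, and the way the magnet is discretized are assumptions for illustration, not the project's actual model.

```python
import numpy as np

MU0_OVER_4PI = 1e-7  # approximate value of mu_0 / (4 * pi) in T*m/A


def dipole_field(sensor_pos, dipole_pos, moment):
    """Flux density (tesla) at sensor_pos from one point dipole.

    Positions are in metres, the moment in A*m^2.
    """
    r = sensor_pos - dipole_pos
    dist = np.linalg.norm(r)
    r_hat = r / dist
    return MU0_OVER_4PI * (3.0 * np.dot(moment, r_hat) * r_hat - moment) / dist**3


def distributed_dipole_field(sensor_pos, dipole_positions, moments):
    """Sum the fields of several dipoles that together model one magnet."""
    return sum(dipole_field(sensor_pos, p, m)
               for p, m in zip(dipole_positions, moments))


# A magnet modelled as three dipoles along the x-axis, all pointing in +z.
positions = [np.array([x, 0.0, 0.0]) for x in (-0.01, 0.0, 0.01)]
moments = [np.array([0.0, 0.0, 0.1])] * 3
field = distributed_dipole_field(np.array([0.0, 0.0, 0.05]), positions, moments)
```

Pose estimation then amounts to finding the dipole positions and orientations whose predicted fields best match the sensor array readings, for instance by nonlinear least squares.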

Wenzhounese Input Method and Language Technology

Development of a digital input method for the Wenzhounese (溫州話) dialect aimed at supporting computational access to underrepresented languages. The project involves designing a Rime-based input schema, phonetic representation of dialectal pronunciation, and exploration of language technologies and LLM-based tools for dialect preservation.