Machine Learning for Science

Machine learning and artificial intelligence, combined with scientific data analysis, have become a powerful means of accelerating discovery across scientific disciplines, particularly astronomy, astrophysics, and cosmology. This research area uses these algorithms to extract insights from large, complex datasets, address computational challenges, and extend scientific understanding. Key applications include the automated classification and characterization of celestial objects, the modeling of fundamental physical processes, and the development of new tools for data acquisition and interpretation.

Specifically, machine learning is instrumental in handling the enormous data volumes generated by modern astronomical surveys, enabling tasks such as anomaly detection, object identification, and the precise measurement of astrophysical properties. Techniques span supervised learning for predictive modeling, unsupervised learning for uncovering hidden structures and patterns, and generative models for data synthesis and anomaly identification. The emphasis extends beyond prediction to model interpretability, robust uncertainty quantification, and physical benchmarking, so that AI-driven discoveries are scientifically rigorous and trustworthy.

My research contributes to this domain by developing and applying machine learning methods to pressing challenges in astronomy, cosmology, and broader engineering applications. I have pioneered the use of multi-task modeling to address sparse data problems in engineering, and I have created extensive astronomical datasets, such as a photometric sample of 2.6 million Red Clump stars that is critical for Galactic structure studies. My work uses generative adversarial networks (GANs) for anomaly detection in astronomical images, from Hyper Suprime-Cam galaxies to the cosmic web, and for generating physically consistent synthetic data that can be rigorously benchmarked against theoretical predictions. I also explore galaxy morphology with unsupervised machine learning and predict new concept-object associations by mining the astronomical literature.

I have also focused on enhancing the interpretability and reliability of AI models in science. This includes developing methods for statistically disentangled latent spaces guided by generative factors, providing interpretable uncertainty quantification in high-energy physics applications, and optimizing galaxy selection to reduce model error in weak lensing cluster mass estimation. My contributions also span probabilistic modeling for high-dimensional stress fields, neural network-based point spread function deconvolution, the creation of synthetic spectra for probabilistic redshift estimation, and a modular deep learning pipeline for strong gravitational lens detection and modeling. By combining advanced deep learning, probabilistic methods, and rigorous physical validation, my work aims to unlock new discoveries, improve data quality, and provide trustworthy AI solutions for the scientific community, particularly for future missions such as Rubin LSST and SPHEREx and for applications such as estimating peculiar velocities from the kinetic Sunyaev-Zel'dovich (SZ) effect.

Figure from "From the Inner to Outer Milky Way: A Photometric Sample of 2.6 Million Red Clump Stars"