We introduce a semiparametric approach for forecasting Value-at-Risk (VaR) and Expected Shortfall (ES) by modeling the conditional scale of financial returns, defined as the difference between two spe...
We study the problem of selecting optimal two-block partitions to accelerate the mixing of finite Markov chains under group-averaging transformations. The main objectives considered are the Kullback-L...
Simulation plays a central role in scientific discovery. In many applications, the bottleneck is no longer running a simulator; it is choosing among large families of plausible simulators, each corres...
State space models (SSMs) have recently achieved strong performance on long-sequence modeling tasks while offering improved memory and computational efficiency compared to transformer-based architectu...
Frogeye Leaf Spot (FLS), caused by Cercospora sojina, poses a significant threat to soybean production, with yield losses of 30-60%. Traditional mass-action models assume homogeneous mixing, which rar...
Despite the computational efficiency of MoE models, the excessive memory footprint and I/O overhead inherent in multi-expert architectures pose formidable challenges for real-time inference on resourc...
The difference-in-differences (DiD) design is a quasi-experimental method for estimating treatment effects. In staggered DiD with multiple treatment groups and periods, estimation based on the two-way...
Collecting multiple types of data on the same set of subjects is common in modern scientific applications, including genomics, metabolomics, and neuroimaging. Joint and Individual Variance Explained (...
Striking an optimal balance between predictive performance and fairness continues to be a fundamental challenge in machine learning. In this work, we propose a post-processing framework that facilitat...
Inferring the sign of social relationships from online interactions is a fundamental challenge in social network analysis. Existing approaches typically rely on sentiment analysis to label individual ...
Latent diffusion models (LDMs) enable high-fidelity synthesis by operating in learned latent spaces. However, training state-of-the-art LDMs requires complex staging: a tokenizer must be trained first...
We develop a nonparametric approach to identify and estimate consumer preferences and unobserved heterogeneity under nonlinear price schedules. Leveraging variation across multiple price schedules, we...
mmid (Multi-Modal Integration and Downstream analyses for healthcare analytics) is a Python package that offers multi-modal fusion and imputation, classification, time-to-event prediction and clusteri...
Overall survival (OS) is the gold standard for assessing patient benefit and cost-effectiveness of new cancer drugs. However, it is often difficult to use OS as the primary endpoint in randomized clin...
The advent of Transformer and Mamba-based architectures has significantly advanced 3D medical image segmentation by enabling global contextual modeling, a capability traditionally limited in Convoluti...
The problem of optimal dosage estimation arises in diverse scientific domains, from pharmacology and toxicology to aquaculture and environmental studies. Statistical modeling of nonlinear dose-respons...
In many oncology clinical trials where overall survival is a key endpoint, patients are permitted to switch from the control arm to the experimental treatment arm or other suitable therapies. Switchin...
Modern table formats such as Apache Iceberg compute and store metadata (commit timestamps, record counts, and column-level statistics such as null counts and value bounds) at write time as part of file ...
Stochastic models of diffusion are routinely used to study dispersal of populations, including populations of animals, plants, seeds and cells. Advances in imaging and field measurement technologies m...
Ultrasound imaging is an essential first-line tool for assessing hepatic steatosis. While conventional B-mode ultrasound imaging has limitations in providing detailed tissue characterization, ultrasou...