Grasping the semantics of rare constructions (form-meaning pairings) has been shown to be a challenging problem that so far only the largest LLMs have solved. It remains an open question whether ...
We propose a score test for dependence predictability in conditional copulas that is robust to temporal instabilities. Our semiparametric procedure accommodates flexible dynamics in the marginal proce...
We present a regression-adjustment framework designed for the estimation of longitudinal treatment effects in randomized experiments under static regimes. While regression-adjustment methods are usefu...
Seagrass meadows of Thalassia testudinum are key components of blue carbon ecosystems and effective bioindicators of environmental contamination due to their ability to incorporate both inorganic and ...
This study aimed to investigate the acute and chronic toxic effects of two thermodynamic inhibitors (methanol and ethylene glycol) widely used in deep-sea oil and gas operations on two typical marine ...
Cystic fibrosis (CF)-associated lung infections caused by Pseudomonas aeruginosa (P. aeruginosa) and Staphylococcus aureus (S. aureus) remain difficult to treat due to multidrug resistance and the red...
The conventional Ekman model of the tropical boundary layer neglects nonlinear momentum advection and breaks down near the equator, where Coriolis effects are weak. During South Asian monsoon onset, w...
We study global inference for regression coefficients in high-dimensional linear models under potentially heavy-tailed errors. While sum-type tests are powerful for dense alternatives and max-type tes...
The ability of spouses to commit to future behavior has important implications for the allocation of resources between them and over time. Using a lifecycle collective model for household behavior, we...
Mixed-effects logistic regression is widely used for binary outcomes in hierarchical data, yet formal goodness-of-fit tests remain limited to random-intercept models and do not address sparse cluster ...
Background: Days Alive and at Home (DAH) over a pre-defined follow-up period is a novel post-intervention composite outcome that combines data from at least three components: (i) initial length of hos...
Whether language models can systematically generalize remains actively debated. Yet empirical performance is jointly shaped by multiple factors such as training data, training paradigms, and inference...
Generalized linear mixed models (GLMMs) are widely used for analyzing correlated data, such as longitudinal and multilevel data. With over 15 $\texttt{R}$ packages available on $\texttt{CRAN}$ for fit...
Equivalence testing compares the hypothesis that an effect $\mu$ is large against the alternative that it is negligible. Here, `large' is classically expressed as being larger than some `equivalence mar...
Recent work on chain-of-thought (CoT) faithfulness reports single aggregate numbers (e.g., DeepSeek-R1 acknowledges hints 39% of the time), implying that faithfulness is an objective, measurable prope...
Predicting customers' long-term revenue from sparse and irregular transaction data is central to marketing resource allocation in non-contractual settings, yet existing approaches face a trade-off. Tr...
5-Fluorouracil (5-FU) is a first-line chemotherapeutic agent for solid tumors, but its clinical application is severely limited by dose-dependent intestinal injury that impairs patient quality of life...
Large language models (LLMs) have become increasingly useful computational models of human language processing, but it remains unclear whether vision-language learning makes text representations more ...
The escalating threat of antimicrobial resistance (AMR) has created an urgent need for new antimicrobial agents. Antimicrobial peptides (AMPs) are promising alternatives to conventional antibiotics du...
Translational medicine turns underspecified development goals into evidence synthesis that must combine literature, trials, patents, and quantitative multi-omics analysis while preserving identifiers,...