Online advertising platforms host hundreds of thousands of A/B tests, but the platform's delivery algorithm routes each creative to the audience it predicts will engage. Every two-arm test therefore c...
Do LLMs talk like us? This question intrigues a multitude of scholars and is relevant in many fields, from education to academia. This work presents an interpretable statistical feature for distingu...
Scientific experimentation is largely driven by statistical hypothesis testing to determine significant differences in interventions. Traditionally, experimenters allocate samples uniformly between ea...
How a vision-language model internally solves the task of describing an image is far from obvious. We find that the model develops a specific mechanism for this: a small set of attention heads in its ...
Protein-ligand docking is widely used in structure-based discovery, but routine studies often fail at the workflow level rather than at the scoring level. Receptor cleaning, ligand preparation, file ...
We present MobileGym, a browser-hosted, lightweight, fully controllable environment for everyday mobile use, targeting interaction fidelity without replicating proprietary backends. It enables two cap...
Skill scores, which measure the relative improvement of a forecasting method over a benchmark via consistent scoring functions and proper scoring rules, are a standard tool in forecast evaluation, yet...
Scaling test-time compute by iteratively updating a latent state has emerged as a powerful paradigm for reasoning. Yet the internal mechanisms that enable these iterative models to generalize beyond m...
This technical note revisits the relationship between RaBitQ and TurboQuant under a unified comparison framework. We compare the two methods in terms of methodology, theoretical guarantees, and empiri...
Copper-containing steel is widely used in ship plates and other marine engineering fields due to its excellent mechanical properties and good weldability. However, in hydrogen-containing media environ...
This study investigated the short-term effects of polyethylene microplastics (PE-MPs) on the marine mussel Mytilus edulis using a suite of cellular and subcellular biomarkers. A total of 225 mussels w...
Instrumental variable (IV) methods rely critically on the exclusion restriction, which is untestable in exactly-identified models under standard assumptions. We propose a framework combining IV analys...
As a huge reservoir of economically valuable metallic elements, oceanic polymetallic nodules are strategically important and are one of the main research subjects in marine geology, especially their forma...
This paper evaluates five sensor network architectures for coastal marine pollution monitoring using a fuzzy multi-criteria decision-making framework. Thirty domain experts assessed the alternatives a...
Determining the number of change-points is a fundamental first step in change-point detection, as it lays the groundwork for subsequent change-point position estimation. While the ex...
Understanding the interplay between high-dimensional data from different views is essential in biomedical research, particularly in fields such as genomics, neuroimaging and biobank-scale studies invo...
We introduce PACE, a backpropagation-free continual test-time adaptation system that directly optimizes the affine parameters of normalization layers. Existing derivative-free approaches struggle to b...
Human genetics offers a promising route to therapeutic discovery, yet practical frameworks translating genotype-derived signal into ranked target and drug hypotheses remain limited, particularly when ...
The pursuit of reducing the memory footprint of the self-attention mechanism in multi-head self-attention (MHA) has spawned a rich portfolio of methods, e.g., grouped-query attention (GQA) and multi-head ...
This paper introduces a new hybrid framework that combines Reinforcement Learning (RL) and Large Language Models (LLMs) to improve robotic manipulation tasks. By utilizing RL for accurate low-level co...