Incomplete reporting of diagnostic accuracy data remains a persistent problem in medical research. In many studies, only part of the 2x2 diagnostic table is reported, leaving denominators for diseased...
Gaussian process (GP) surrogates are the default tool for emulating expensive computer experiments, but cubic cost, stationarity assumptions, and Gaussian predictive distributions limit their reach. W...
Modern imaging instruments can produce terabytes to petabytes of data for a single experiment. The biggest barrier to processing big image datasets has been computational, where image analysis algorit...
In a dataset of 423 patients who had undergone radical prostatectomy for localised prostate cancer, we estimated the apparent Shannon information (ASI) about time to biochemical recurrence in various subsets...
Building virtual cells with generative models to simulate cellular behavior in silico is emerging as a promising paradigm for accelerating drug discovery. However, prior image-based generative approac...
A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each...
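The definition above can be illustrated with a small numerical check. The sketch below is an assumption-laden illustration, not the paper's procedure: it bins predictions into equal-width bins (the hypothetical parameter `n_bins`), and for each group weight vector reports the weighted average of `label - prediction` within each bin. The function names `weighted_calibration_gaps` and `max_multicalibration_gap` are made up for this example.

```python
import numpy as np


def weighted_calibration_gaps(preds, labels, weights, n_bins=10):
    """Weighted bias per prediction bin.

    For each bin, returns sum_i w_i * (y_i - p_i) / sum_i w_i over the points
    whose prediction falls in that bin. Bins with zero total weight are skipped.
    """
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Equal-width bins on [0, 1]; a prediction of exactly 1.0 goes in the last bin.
    bins = np.minimum((preds * n_bins).astype(int), n_bins - 1)
    gaps = {}
    for b in np.unique(bins):
        mask = bins == b
        total = weights[mask].sum()
        if total > 0:
            bias = np.sum(weights[mask] * (labels[mask] - preds[mask])) / total
            gaps[int(b)] = float(bias)
    return gaps


def max_multicalibration_gap(preds, labels, group_weights, n_bins=10):
    """Largest absolute per-bin weighted bias across all group weight vectors."""
    return max(
        abs(gap)
        for w in group_weights
        for gap in weighted_calibration_gaps(preds, labels, w, n_bins).values()
    )


# Toy data: calibrated overall, but biased once reweighted by the group g.
preds = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
labels = [1, 0, 0, 0, 1, 1, 1, 0]
everyone = np.ones(8)
g = np.array([1, 1, 0, 0, 1, 1, 0, 0], dtype=float)

overall_gap = max_multicalibration_gap(preds, labels, [everyone])
multi_gap = max_multicalibration_gap(preds, labels, [everyone, g])
```

In this toy data the predictor looks calibrated when every context has weight one, yet the gap becomes nonzero after reweighting by `g`, which is the distinction the definition draws between overall calibration and multicalibration.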
What proportion of treated units actually benefited from an experimental intervention? What is the median or the largest individual treatment effect? This paper develops methods for answering such que...
Marine fungal polysaccharides have recently attracted increasing attention; however, most reported sources originate from shallow or coastal environments, and knowledge regarding those de...
Cryopreservation of marine species is a key tool for biodiversity conservation, population management, and the advancement of aquaculture and biological research. Significant progress has been made in t...
With the intensified exploration of marine resources, marine bioactive peptides have become one of the research focuses in biomedicine, food science, and materials science because of their structural ...
We develop horizon-aware anytime-valid tests and confidence sequences for bounded means under a strict deadline $N$. Using the betting/e-process framework, we cast horizon-aware betting as a finite-ho...
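For readers unfamiliar with betting-based tests, the following minimal sketch shows the generic idea for observations in [0, 1]: a bettor's wealth is multiplied by `1 + lam * (x - m)` after each observation, and the null hypothesis that the mean equals `m` is rejected once wealth reaches `1 / alpha`. It assumes a fixed bet size `lam` and makes no attempt to use the deadline $N$, so it is not the horizon-aware strategy this abstract describes. The names `betting_wealth` and `rejects` are hypothetical.

```python
def betting_wealth(xs, m, lam):
    """Wealth path of a fixed-fraction bettor testing H0: mean == m.

    Observations xs are assumed to lie in [0, 1] and m in (0, 1). The bet lam
    must keep every factor 1 + lam * (x - m) positive, so it is restricted to
    the open interval (-1 / (1 - m), 1 / m).
    """
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie strictly between 0 and 1")
    if not -1.0 / (1.0 - m) < lam < 1.0 / m:
        raise ValueError("lam is outside the range that keeps wealth positive")
    wealth = 1.0
    path = []
    for x in xs:
        wealth *= 1.0 + lam * (x - m)
        path.append(wealth)
    return path


def rejects(xs, m, lam, alpha=0.05):
    """True if the wealth ever reaches 1 / alpha along the sequence."""
    return any(w >= 1.0 / alpha for w in betting_wealth(xs, m, lam))


# Ten observations equal to 1.0 quickly contradict a hypothesised mean of 0.5.
strong_evidence = rejects([1.0] * 10, m=0.5, lam=1.0)
# Observations equal to the hypothesised mean leave the wealth at 1.
no_evidence = rejects([0.5] * 10, m=0.5, lam=1.0)
```

Under the null the wealth is a nonnegative martingale starting at one, so by Ville's inequality the chance it ever reaches `1 / alpha` is at most `alpha`; this is what makes such a test valid at any stopping time.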
Microplastic (MP) contamination is a notable environmental challenge affecting marine ecosystems. However, its repercussions on the reproductive success of sea turtles remain inadequately elucidated. ...
Habitat condition and area shape global species distributions, with shallow-water reefs hosting a disproportionate share of marine biodiversity. Although reef area is a well-established predictor of ma...
Large Reasoning Models (LRMs) achieve strong accuracy on challenging tasks by generating long Chain-of-Thought traces, but suffer from overthinking. Even after reaching the correct answer, they contin...
Accurate motion prediction of floating platforms is critical for ensuring operational safety in offshore engineering applications or marine equipment testing. However, the strong nonlinearity and non-...
Social media can reveal patient experiences with glucagon-like peptide-1 receptor agonists (GLP-1 RAs) that extend beyond clinical trial data. We analyzed 410,198 Reddit posts (May 2019-June 2025) men...
Penetration testing, the practice of simulating cyberattacks to identify vulnerabilities, is a complex sequential decision-making task that is inherently partially observable and features large action...
The key-value (KV) cache is widely treated as essential state in transformer inference, and a large body of work engineers policies to compress, evict, or approximate its entries. We prove that this s...
Large language models (LLMs) are increasingly used to answer natural-language questions over structured data. However, when a table contains familiar real-world facts, it is unclear whether the model ...
Protein sequence optimization under tight oracle budgets requires methods that explore vast combinatorial spaces while making each evaluation informative. Existing reinforcement learning and off-polic...