Ocean Research - International Top-Journal Literature Statistics Platform

Why Empirical p-Values Are Not Uniform: Reference Samples, Dependence, and PIT Backtesting

Probability integral transforms (PITs) and empirical $p$-values are widely used to assess the calibration of predictive distributions. While exact PIT values are uniformly distributed under correct mo...

👤 Jakub Lis 📰 arXiv 📅 2026 👁 95 📚 23

Unlocking the Working Memory of Large Language Models for Latent Reasoning

To improve the reasoning capabilities of large language models, test-time compute is typically scaled by generating intermediate tokens before the final answer. However, this couples reasoning to auto...

👤 Lukas Aichberger | Sepp Hochreiter 📰 arXiv 📅 2026 👁 93 📚 23

Fast Spatial Memory with Elastic Test-Time Training

Large Chunk Test-Time Training (LaCT) has shown strong performance on long-context 3D reconstruction, but its fully plastic inference-time updates remain vulnerable to catastrophic forgetting and over...

👤 Ziqiao Ma | Xueyang Yu | Haoyu Zhen | Yu... 📰 arXiv 📅 2026 👁 43 📚 23

Causally Evaluating the Learnability of Formal Language Tasks

Language models, as multi-task learners, acquire a wide range of abilities during training. A fundamental question is how much task-specific data is needed to learn a given task. Answering this for na...

👤 Vésteinn Snæbjarnarson | Anej Svete | Jo... 📰 arXiv 📅 2026 👁 19 📚 23

Extraction of tabulated statistical results with tableParser

Tabulated content is omnipresent in scientific literature. This work presents the R package *tableParser*, designed to extract and postprocess tables from NISO-JATS-encoded XML, HTML, DOCX, and, with ...

👤 Ingmar Böschen 📰 arXiv 📅 2026 👁 308 📚 22

Statistical Testing Framework for Clustering Pipelines by Selective Inference

A data analysis pipeline is a structured sequence of steps that transforms raw data into meaningful insights by integrating multiple analysis algorithms. In many practical applications, analytical find...

👤 Yugo Miyata | Tomohiro Shiraishi | Shunich... 📰 arXiv 📅 2026 👁 155 📚 22

Universal YOCO for Efficient Depth Scaling

The rise of test-time scaling has remarkably boosted the reasoning and agentic proficiency of Large Language Models (LLMs). Yet, standard Transformers struggle to scale inference-time compute efficien...

👤 Yutao Sun | Li Dong | Tianzhu Ye | Shaoh... 📰 arXiv 📅 2026 👁 155 📚 22

Pareto frontier of portfolio investment under volatility uncertainty and short-sale constraints market

In this paper, we investigate a portfolio investment problem under volatility uncertainty and short-sale constraints market via sublinear expectation which is used to model volatility uncertainty. We ...

👤 Jing He | Shuzhen Yang 📰 arXiv 📅 2026 👁 104 📚 22

Comparison of probabilistic nowcasts and forecasts of SARS-CoV-2 variant proportions made by hierarchical multinomial linear regression models

Nowcasting and forecasting of infectious diseases have become increasingly important since the SARS-CoV-2 pandemic. In particular, methods for modeling the composition of circulating variants at a giv...

👤 Isaac MacArthur | Thomas Robacker | Evan... 📰 arXiv 📅 2026 👁 76 📚 22

Statistical and Numerical Convergence in Stochastic Equilibrium

This paper sets out the most general computational and econometric implications of the rigorous stochastic equilibrium theory from SELCKE (Staines (2024a)) arXiv:2312.16214. The analytical backbone is...

👤 David Staines 📰 arXiv 📅 2026 👁 63 📚 22

Asymptotically faster algorithms for recognizing $(k,\ell)$-sparse graphs

The family of $(k,\ell)$-sparse graphs, introduced by Lorea, plays a central role in combinatorial optimization and has a wide range of applications, particularly in rigidity theory. A key algorithmic...

👤 Bence Deák | Péter Madarasi 📰 arXiv 📅 2026 👁 60 📚 22

West Nile virus outbreak in Italy modelled with the quantum Game of Life

In recent years, an anomalously high spread of West Nile virus (WNV) has been observed in Italy, with particularly high peaks of infections in southern Lazio, Campania and Veneto regions. The mai...

👤 Andrea Fontana | Simone Tambascia | Ciro... 📰 arXiv 📅 2026 👁 27 📚 22

Regret Bounds for Competitive Resource Allocation with Endogenous Costs

We study online resource allocation among N interacting modules over T rounds. Unlike standard online optimization, costs are endogenous: they depend on the full allocation vector through an interacti...

👤 Rui Chai 📰 arXiv 📅 2026 👁 409 📚 21

Theory Discovery in Social Networks: Automating ERGM Specification with Large Language Models

Understanding how social networks form, whether through reciprocity, shared attributes, or triadic closure, is central to computational social science. Exponential Random Graph Models (ERGMs) offer a ...

👤 Yidan Sun | Mayank Kejriwal 📰 arXiv 📅 2026 👁 382 📚 21

Modeling dependency between operational risk losses and macroeconomic variables using Hidden Markov Models

Predicting future operational risk losses gives rise to a significant challenge due to the heterogeneous and time-dependent structures present in real-world data. Furthermore, stress test exercises re...

👤 Nikeethan Selvaratnam | Dorinel Bastide ... 📰 arXiv 📅 2026 👁 223 📚 21

Understanding Truncated Positional Encodings for Graph Neural Networks

Positional encodings (PEs) enhance the power of graph neural networks (GNNs), both theoretically and empirically. Two of the most popular families of PEs - spectral (e.g., Laplacian eigenspaces, effec...

👤 James Flora | Mitchell Black | Weng-Keen... 📰 arXiv 📅 2026 👁 157 📚 21

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

Deploying multimodal foundation models as closed-loop policies increasingly requires conditioning actions on observations that are no longer visible. However, existing benchmarks either expose the ful...

👤 Shengyuan Ding | Xilin Wei | Xinyu Fang ... 📰 arXiv 📅 2026 👁 155 📚 21

Neural Negative Binomial Regression for Weekly Seismicity Forecasting: Per-Cell Dispersion Estimation and Tail Risk Assessment

Standard approaches to forecasting the weekly number of earthquakes on a spatial grid rely on the Poisson distribution with a single global dispersion assumption. We show that this assumption is syste...

👤 Alim Igilik 📰 arXiv 📅 2026 👁 146 📚 21

MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval

Mathematical problem solving remains a challenging test of reasoning for large language and multimodal models, yet existing benchmarks are limited in size, language coverage, and task diversity. We in...

👤 Shaden Alshammari | Kevin Wen | Abrar Za... 📰 arXiv 📅 2026 👁 140 📚 21

Spurious Predictability in Financial Machine Learning

Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against syn...

👤 Sotirios D. Nikolopoulos 📰 arXiv 📅 2026 👁 123 📚 21
