The integration of artificial intelligence (AI) technologies into judicial decision-making - particularly in pretrial, sentencing, and parole contexts - has generated substantial concerns about transp...
Background: A core aspect of epidemiology is determining the impacts of potential public health interventions over time. With long follow-up periods, epidemiologists may need to consider semi-competin...
Capturing the structured mixing within a population is key to the reliable projection of infectious disease dynamics and hence informed control. Both heterogeneity in the number of contacts and age-st...
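As a concrete illustration of the age-structured mixing this abstract refers to, here is a minimal two-group SIR sketch in which transmission is driven by a contact matrix. The group sizes, contact rates, and epidemiological parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal sketch: two-group SIR with a contact matrix (assumed parameters).
beta = 0.05                      # transmission probability per contact
gamma = 0.1                      # recovery rate
C = np.array([[10.0, 3.0],       # mean daily contacts: young-young, young-old
              [3.0, 4.0]])       # old-young, old-old
N = np.array([6e6, 4e6])         # group sizes

S = N.copy()
I = np.array([10.0, 0.0])        # seed infections in the young group
R = np.zeros(2)

dt = 0.1
for _ in range(int(200 / dt)):
    # force of infection on group i: beta * sum_j C[i, j] * I_j / N_j
    lam = beta * C @ (I / N)
    new_inf = lam * S * dt
    new_rec = gamma * I * dt
    S -= new_inf
    I += new_inf - new_rec
    R += new_rec

print("final attack rate by group:", R / N)
```

Replacing C with an identity-like matrix recovers homogeneous within-group mixing, which makes the effect of structured contacts on the projected attack rates easy to see.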
Cryo-electron microscopy (cryo-EM) has emerged as a powerful technique for determining the three-dimensional structures of biological molecules at near-atomic resolution. However, reconstructing helic...
Since the release of the ICH E9(R1) addendum on estimands, its application in non-inferiority trials has received far less attention than in superiority settings. A key conclusion from Lynggaard et al...
At the core of modern generative modeling frameworks, including diffusion models, score-based models, and flow matching, is the task of transforming a simple prior distribution into a complex target d...
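The transport task described above can be made concrete with a minimal conditional flow matching sketch: a small network is trained to predict the velocity of a linear path from prior samples to target samples, then integrated to generate. The toy target, network size, and training settings below are illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

# Velocity field v(x, t): input is (x, t) concatenated, output is dx/dt.
net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def sample_target(n):
    # toy target: mixture of two Gaussian blobs (illustrative)
    centers = torch.tensor([[-2.0, 0.0], [2.0, 0.0]])
    idx = torch.randint(0, 2, (n,))
    return centers[idx] + 0.3 * torch.randn(n, 2)

for step in range(2000):
    x0 = torch.randn(256, 2)            # prior sample
    x1 = sample_target(256)             # target sample
    t = torch.rand(256, 1)
    xt = (1 - t) * x0 + t * x1          # linear probability path
    v_target = x1 - x0                  # conditional velocity along the path
    v_pred = net(torch.cat([xt, t], dim=1))
    loss = ((v_pred - v_target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Sampling: integrate dx/dt = v(x, t) from t=0 (prior) to t=1 (target).
x = torch.randn(512, 2)
with torch.no_grad():
    for k in range(100):
        t = torch.full((512, 1), k / 100)
        x = x + net(torch.cat([x, t], dim=1)) / 100
```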
Alkaptonuria (AKU) is an ultra-rare autosomal recessive metabolic disorder caused by mutations in the HGD (Homogentisate 1,2-Dioxygenase) gene, leading to a pathological accumulation of homogentisic a...
Many downstream tasks in graph data mining, signal processing, and machine learning rely on information related to the eigenvectors of the associated adjacency or Laplacian matrix. Classical eigendeco...
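For a concrete starting point, the sketch below extracts the smallest Laplacian eigenpairs of a sparse graph with SciPy, the classical eigendecomposition route such downstream tasks build on; the random graph is an illustrative stand-in for real data.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 500
A = sp.random(n, n, density=0.02, random_state=0)
A = ((A + A.T) > 0).astype(float).tocsr()   # symmetric 0/1 adjacency
deg = np.asarray(A.sum(axis=1)).ravel()
L = sp.diags(deg) - A                        # combinatorial Laplacian

# 8 smallest eigenpairs; the smallest eigenvalue is ~0 per connected component
vals, vecs = eigsh(L, k=8, which='SM')
print(vals)
```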
Understanding non-genetic determinants of cell fate is critical for developing and improving cancer therapies, as genetically identical cells can exhibit divergent outcomes under the same treatment co...
Tweedie's formula is a cornerstone of measurement-error analysis and empirical Bayes. In the Gaussian location model, it recovers posterior means directly from the observed marginal density, bypassing...
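For reference, Tweedie's formula in the Gaussian location model can be stated as follows (standard form; the notation here is ours, not necessarily the paper's):

```latex
% With \theta \sim g unknown and X \mid \theta \sim \mathcal{N}(\theta, \sigma^2),
% the marginal density f of X determines the posterior mean:
\[
  \mathbb{E}[\theta \mid X = x] \;=\; x + \sigma^2 \, \frac{d}{dx} \log f(x),
\]
% so the posterior mean is recovered from f alone, without knowing the prior g.
```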
Source localization in complex networks is a rapidly advancing field with numerous real-world applications, including determining the source of misinformation. In this work, we model information sprea...
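One classical baseline for the source-localization problem sketched above is the Jordan center heuristic: given the infected set, pick the infected node of minimum eccentricity in the induced subgraph. The simulation below is illustrative and is not claimed to be this paper's method.

```python
import networkx as nx
import random

random.seed(1)
G = nx.barabasi_albert_graph(300, 3, seed=1)
source = 42

# SI simulation: each infected node infects each susceptible neighbor w.p. p
infected = {source}
p = 0.3
for _ in range(5):
    new = set()
    for u in infected:
        for v in G.neighbors(u):
            if v not in infected and random.random() < p:
                new.add(v)
    infected |= new

# Jordan center: infected node minimizing eccentricity in the infected subgraph
# (connected by construction, since infection spreads along paths from the source)
H = G.subgraph(infected)
ecc = nx.eccentricity(H)
estimate = min(ecc, key=ecc.get)
print(f"true source: {source}, Jordan-center estimate: {estimate}")
```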
Hypergraphs are an effective and widely adopted tool for characterizing higher-order interactions in complex systems. The most intuitive and commonly used mathematical instrument for representing a hype...
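The truncation cuts off which mathematical instrument the abstract means; a common candidate is the incidence matrix, sketched below under that assumption.

```python
import numpy as np

# Incidence matrix of a toy hypergraph: rows index vertices, columns index
# hyperedges; H[i, j] = 1 iff vertex i belongs to hyperedge j.
vertices = ["a", "b", "c", "d"]
hyperedges = [{"a", "b", "c"}, {"b", "d"}, {"a", "d"}]

H = np.zeros((len(vertices), len(hyperedges)), dtype=int)
for j, e in enumerate(hyperedges):
    for i, v in enumerate(vertices):
        if v in e:
            H[i, j] = 1

print(H)
# vertex degrees and hyperedge sizes fall out as row/column sums
print("degrees:", H.sum(axis=1), "edge sizes:", H.sum(axis=0))
```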
We document the rise of negative earnings between 1980 and 2019: a secular increase in the percent of firms reporting losses, both among public firms and in the broader universe of US corporations, an...
Language models achieve impressive performance on a variety of knowledge, language, and reasoning tasks due to the scale and diversity of available pretraining data. The standard training recipe is a ...
Predicting future operational risk losses poses a significant challenge due to the heterogeneous and time-dependent structures present in real-world data. Furthermore, stress test exercises re...
Identifying critical nodes in complex networks is a fundamental task in graph mining. Yet, methods addressing all-or-nothing coverage mechanics in a bipartite dependency network, a graph with two t...
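To make the all-or-nothing coverage mechanics concrete: in a bipartite dependency network, a top-side task is covered only if every bottom-side resource it depends on is present, so removing one resource can disable many tasks at once. The toy criticality score below (coverage lost per removed resource) is an illustrative assumption, not the paper's measure.

```python
import itertools

tasks = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r3"},
}
resources = set(itertools.chain.from_iterable(tasks.values()))

def coverage(available):
    # all-or-nothing: a task counts only if every dependency is available
    return sum(deps <= available for deps in tasks.values())

base = coverage(resources)
for r in sorted(resources):
    lost = base - coverage(resources - {r})
    print(f"removing {r} disables {lost} task(s)")
```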
The Gaussian mixture model is widely used in unsupervised learning, owing to its simplicity and interpretability. However, a fundamental limitation of the classical Gaussian mixture model is that it f...
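As a baseline for the classical model under discussion, a minimal Gaussian mixture fit with scikit-learn looks as follows; the synthetic two-cluster data is illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([-2, 0], 0.5, size=(200, 2)),   # tight cluster
    rng.normal([2, 1], 1.0, size=(200, 2)),    # diffuse cluster
])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)
print("means:\n", gmm.means_)
print("cluster assignments:", gmm.predict(X[:5]))
```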
Autocorrelation is a defining characteristic of time-series data, where each observation is statistically dependent on its predecessors. In the context of deep time-series forecasting, autocorrelation...
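The autocorrelation structure described above is straightforward to estimate empirically; the sketch below computes the sample autocorrelation function of a synthetic AR(1) series (the series and lag range are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi = 1000, 0.8
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()   # AR(1) process

def acf(x, max_lag):
    # sample autocorrelation: normalized lagged dot products of the centered series
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k or None], x[k:]) / denom
                     for k in range(max_lag + 1)])

print(acf(x, 5))  # for AR(1), the ACF decays roughly like phi**k
```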
While the optimal sample complexity of binary classification in terms of the VC dimension is well-established, determining the optimal sample complexity of multiclass classification has remained open....
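For context, the binary case the abstract contrasts against is the known optimal realizable PAC sample complexity in terms of the VC dimension d (Hanneke, 2016):

```latex
% Optimal sample complexity of realizable binary PAC learning:
\[
  m(\varepsilon, \delta) \;=\; \Theta\!\left( \frac{d + \log(1/\delta)}{\varepsilon} \right),
\]
% i.e., this many samples suffice (and are necessary) to guarantee error at
% most \varepsilon with probability at least 1 - \delta.
```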
Generating realistic synthetic citation, patent, or component dependency networks is essential for benchmarking community detection, graph visualisation, and network data mining algorithms. We present...