
Health System Scale Semantic Search Across Unstructured Clinical Notes


Abstract

Introduction: Semantic search, which retrieves documents based on conceptual similarity rather than keyword matching, offers substantial advantages for retrieval of clinical information. However, deploying semantic search across entire health systems, comprising hundreds of millions of clinical notes, presents formidable engineering, cost, and governance challenges that have prevented adoption.

Methods: We deployed a semantic search system at a large children's hospital indexing 166 million clinical notes (484 million vectors) from 1.68 million patients. The system uses instruction-tuned qwen3-embedding-0.6B embeddings, stores vectors in a managed database with storage-optimized indexing, maintains full-text metadata in a low-latency key-value store, and operates within a HIPAA-compliant governance framework. We evaluated the system through three experiments: optimization of embedding model and chunking strategy using a physician-authored benchmark dataset, characterization of full-scale performance (cost, latency, retrieval quality), and clinical utility assessment via comparison of chart abstraction efficiency across three tasks.

Results: The system delivers sub-second query latency (median 237 ms single-user, 451 ms 20-user concurrency) with monthly costs of approximately USD 4,000. Qwen3 embeddings with 300-token chunk size achieved 94.6% accuracy on a clinical question-answering benchmark. In clinical utility evaluation across three abstraction tasks, semantic search reduced time-to-completion by 24 to 89% compared to clinician-performed chart review while maintaining comparable inter-rater agreement.

Conclusion: Health-system-scale semantic search is both technically and operationally feasible. The system provides infrastructure supporting interactive search, cohort generation, and downstream LLM-powered clinical applications without requiring specialized informatics expertise.
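The pipeline the abstract outlines (chunk notes, embed each chunk, keep vectors in one store and full text in a key-value store, then rank chunks by similarity to a query) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: whitespace tokens stand in for model tokens, `toy_embed` is a hashed bag-of-words stand-in for the Qwen3 embedding model (so it captures word overlap, not semantic similarity), a Python list stands in for the managed vector database, and a dict stands in for the key-value store. Only the 300-token chunk size comes from the abstract; all function names are hypothetical.

```python
import hashlib
import math
from collections import Counter

CHUNK_TOKENS = 300  # chunk size reported in the abstract
DIM = 256           # toy embedding width (assumption, not from the paper)


def chunk_tokens(text, size=CHUNK_TOKENS):
    """Split text into consecutive, non-overlapping chunks of at most `size` whitespace tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def toy_embed(text, dim=DIM):
    """Stand-in for a real embedding model: an L2-normalized hashed bag of words."""
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(notes):
    """Build an in-memory vector list and a key-value store from {note_id: text}."""
    vector_index = []  # (chunk_id, vector) pairs; stands in for the vector database
    kv_store = {}      # chunk_id -> text and metadata; stands in for the key-value store
    for note_id, text in notes.items():
        for n, chunk in enumerate(chunk_tokens(text)):
            chunk_id = f"{note_id}#{n}"
            vector_index.append((chunk_id, toy_embed(chunk)))
            kv_store[chunk_id] = {"note_id": note_id, "text": chunk}
    return vector_index, kv_store


def search(query, vector_index, kv_store, k=3):
    """Brute-force cosine ranking; vectors are unit length, so the dot product is the cosine."""
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk_id)
              for chunk_id, vec in vector_index]
    scored.sort(reverse=True)
    return [(score, kv_store[chunk_id]) for score, chunk_id in scored[:k]]


if __name__ == "__main__":
    notes = {
        "n1": "patient has fever and cough",
        "n2": "fracture of left wrist after fall",
    }
    index, store = build_index(notes)
    for score, record in search("fracture of left wrist after fall", index, store, k=1):
        print(round(score, 3), record["note_id"], record["text"])
```

A system at the scale the abstract reports would replace the exhaustive scan in `search` with an approximate nearest-neighbor index. The separation into two stores mirrors the abstract's description of vectors in a managed database and full text in a key-value store.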

