90+ Most Popular Machine Learning Datasets in Clinical Medicine

Series: Curated Industry AI Datasets

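This article curates 90+ of the most popular machine learning datasets in clinical medicine. They come from influential societies, conferences, databases, and journals, from organizers of AI competitions in China and abroad, and from dataset hosts such as GitHub and Kaggle.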

I. Medical Organizations

Information on the many influential societies, conferences, databases, and journals in the healthcare industry.

II. Machine Learning Datasets in Clinical Medicine

1. Pima Indians Diabetes Database
2. Breast Cancer Wisconsin (Diagnostic) Data Set
3. Stroke Prediction Dataset
4. Medical Cost Personal Datasets
  • Stars: ⭐ 3,159
  • Description: Insurance forecast using linear regression
  • Topics: healthcare, education, finance, health, insurance
  • License: Database: Open Database, Contents: Database Contents
  • Owner: Miri Choi
  • Link: https://kaggle.com/datasets/mirichoi0218/insurance
5. Heart Failure Prediction Dataset
6. Heart Failure Prediction
7. Cardiovascular Disease dataset
8. Medical Appointment No Shows
9. COVID-19 Dataset
10. mimic3-benchmarks
  • Stars: ⭐ 876
  • Description: A Python suite for building benchmark machine learning datasets from the MIMIC-III clinical database. 💊
  • Topics: benchmark, clinical-data, deep-learning, machine-learning
  • License: MIT License
  • Owner: YerevaNN
  • Link: https://github.com/YerevaNN/mimic3-benchmarks
11. Diabetes prediction dataset
12. Health Insurance Marketplace
  • Stars: ⭐ 716
  • Description: Explore data on health and dental plans in the US Health Insurance Marketplace
  • Topics: healthcare, dentistry, earth and nature, business, economics
  • License: CC0: Public Domain
  • Owner: US Department of Health and Human Services
  • Link: https://kaggle.com/datasets/hhs/health-insurance-marketplace
13. Fetal Health Classification
14. Breast Cancer Dataset
15. Heartbeat Sounds
16. Cervical Cancer Risk Classification
17. Respiratory Sound Database
18. Logistic regression To predict heart disease
19. Disease Symptom Prediction
20. Diagnosis of COVID-19 and its clinical spectrum
  • Stars: ⭐ 495
  • Description: AI and data science supporting clinical decisions (March 28 to April 3)
  • Topics: healthcare, public health, earth and nature, health, classification
  • License: Unknown
  • Owner: Einstein Data4u
  • Link: https://kaggle.com/datasets/einsteindata4u/covid19
21. Student Stress Monitoring Datasets
22. Indian Liver Patient Records
23. Heart Attack Prediction
24. Polycystic ovary syndrome (PCOS)
25. Hospital Beds Management
26. Medicare Data
  • Stars: ⭐ 399
  • Description: Medicare data (BigQuery dataset)
  • Topics: healthcare, health, bigquery, drugs and medications
  • License: CC0: Public Domain
  • Owner: Centers for Medicare & Medicaid Services
  • Link: https://kaggle.com/datasets/cms/cms-medicare
27. UCI Heart Disease Data
28. Pfizer Vaccine Tweets
29. Breast Cancer Proteomes
30. Disease Symptoms and Patient Profile Dataset
31. awesome-cancer-variant-resources
  • Stars: ⭐ 325
  • Description: A community-maintained collection of cancer clinical knowledge bases and databases, focused on cancer variant research.
  • Topics: awesome-list, bioinformatics, cancer, cancer-genomics, cancer-variants
  • License: MIT License
  • Owner: seandavi
  • Link: https://github.com/seandavi/awesome-cancer-variant-resources
32. HEALTHCARE PROVIDER FRAUD DETECTION ANALYSIS
33. awesome-healthcare-ai
  • Stars: ⭐ 314
  • Description: A curated list of quality open-source healthcare tools, algorithms, datasets, and research papers.
  • Topics: awesome-list, awesome-lists, healthcare, healthcare-application, healthcare-datasets
  • License: Creative Commons Zero v1.0 Universal
  • Owner: medtorch
  • Link: https://github.com/medtorch/awesome-healthcare-ai
34. AV : Healthcare Analytics
35. AV : Healthcare Analytics II
36. Genetic Variant Classifications
37. Heart Attack Risk Prediction Dataset
38. Lower Back Pain Symptoms Dataset
39. Diabetes 130 US hospitals for years 1999-2008
40. MIAS Mammography
41. Chronic illness: symptoms, treatments and triggers
42. U.S. Healthcare Data
43. TSDB
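  • Stars: ⭐ 233
  • Description: A Python toolbox that loads 172 public time-series datasets with a single line of code, for machine learning and deep learning. The datasets span healthcare, finance, power, transportation, weather, and other domains.
  • Topics: classification, data-mining, database, deep-learning, forecasting
  • License: BSD 3-Clause “New” or “Revised” License
  • Owner: WenjieDu
  • Link: https://github.com/WenjieDu/TSDB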
44. Predict Diabetes
45. Hepatitis C Prediction Dataset
46. Body Fat Prediction Dataset
47. american-healthcare-conundrum
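  • Stars: ⭐ 211
  • Description: Investigative data journalism that quantifies avoidable waste in the US healthcare system item by item. Open-source analysis based on CMS, OECD, and federal datasets. It has identified $98.6 billion in potential savings so far.
  • Topics: cms-data, data-journalism, drug-pricing, health-policy, healthcare
  • License: MIT License
  • Owner: rexrodeo
  • Link: https://github.com/rexrodeo/american-healthcare-conundrum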
48. Covid-19 Case Surveillance Public Use Dataset
49. COVID-19 patient pre-condition dataset
50. Cirrhosis Prediction Dataset
51. U.S. Opiate Prescriptions/Overdoses
52. Heart Attack Dataset
53. COVID-19 – Clinical Data to assess diagnosis
  • Stars: ⭐ 177
  • Description: Sírio-Libanês AI and analytics data, provided by the Data Intelligence Team
  • Topics: business, health, social science, medicine, classification
  • License: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
  • Owner: Hospital Sírio-Libanês
  • Link: https://kaggle.com/datasets/Sírio-Libanes/covid19
54. Cuff-Less Blood Pressure Estimation
55. Healthcare Insurance
  • Stars: ⭐ 170
  • Description: My dataset concerns insecurity in the global healthcare sector and is still under development.
  • Topics: exploratory data analysis, data visualization, neural networks, health conditions, numpy
  • License: CC0: Public Domain
  • Owner: willian oliveira
  • Link: https://kaggle.com/datasets/willianoliveiragibin/healthcare-insurance
56. Cannabis Strains
57. Diabetes Health Indicators Dataset
58. Breast Cancer Gene Expression Profiles (METABRIC)
59. Anxiety and Depression Psychological Therapies
60. Autism Screening
61. COVID-19 Clinical Trials dataset
62. Thyroid Disease Data
  • Stars: ⭐ 151
  • Description: Patient demographics and blood test results, along with thyroid disease diagnoses.
  • Topics: health, medicine, classification, tabular, cancer
  • License: Attribution 4.0 International (CC BY 4.0)
  • Owner: jaina
  • Link: https://kaggle.com/datasets/jainaru/thyroid-disease-data
63. clinical-trial-outcome-prediction
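  • Stars: ⭐ 149
  • Description: Benchmark dataset and deep learning method (Hierarchical Interaction Network, HINT) for predicting clinical trial approval probability, published in Cell Patterns, 2022.
  • Topics: benchmark, benchmark-datasets, clinical-data, clinical-research, clinical-research-data-warehouse
  • License: Not provided
  • Owner: futianfan
  • Link: https://github.com/futianfan/clinical-trial-outcome-prediction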
64. Employee Attrition for Healthcare
65. Hospital ratings
66. Hospitals and beds in India (Statewise)
67. Cirrhosis Patient Survival Prediction
68. Global Hospital Beds Capacity (for covid-19)
69. heart failure clinical records
70. Thyroid Disease Data
  • Stars: ⭐ 130
  • Description: Patient demographics and blood test results related to thyroid disease diagnosis
  • Topics: medicine, exploratory data analysis, data cleaning, data visualization, classification
  • License: CC0: Public Domain
  • Owner: Emmanuel F. Werr
  • Link: https://kaggle.com/datasets/emmanuelfwerr/thyroid-disease-data
71. Lung Cancer Detection
72. Adverse Food Events
  • Stars: ⭐ 128
  • Description: 90,000 user-reported, product-related adverse medical events
  • Topics: healthcare, government, medicine, software
  • License: CC0: Public Domain
  • Owner: Food and Drug Administration
  • Link: https://kaggle.com/datasets/fda/adverse-food-events
73. cardiobot
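  • Stars: ⭐ 127
  • Description: A heart health chatbot trained on carefully curated datasets about cardiovascular disease. It gives context-aware, medically relevant answers to user queries, helping patients and healthcare practitioners understand symptoms, treatment options, and preventive measures. The model is fine-tuned so that its responses stay within the domain of cardiovascular health.
  • Topics: cardio, chatbot, python
  • License: MIT License
  • Owner: stellarloop
  • Link: https://github.com/stellarloop/cardiobot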
74. Breast Cancer Diagnosis Dataset – Wisconsin State
75. Obesity Classification Dataset
76. AIDS Virus Infection Prediction 💉
77. awesome-healthcare-datasets
78. Lung Cancer Dataset
79. awesome-healthcare-datasets
  • Stars: ⭐ 116
  • Description: Healthcare and biomedical datasets for AI/ML
  • Topics: awesome-list, biomedical, clinical, datasets, healthcare
  • License: Creative Commons Zero v1.0 Universal
  • Owner: geniusrise
  • Link: https://github.com/geniusrise/awesome-healthcare-datasets
80. COVID19 Daily Updates
81. Pathogen Detection | Salmonella Enterica
82. Real Breast Cancer Data
83. Cancer Risk Factors Data
84. Healthcare Diabetes Dataset
85. Medical Insurance Cost Prediction
86. Coronavirus Records Dataset: 2021
87. Predict survival of patients with heart failure
88. SAS-Clinical-Trials-Toolkit
89. ChEMBL EBI Small Molecules Database
  • Stars: ⭐ 109
  • Description: Large-scale bioactivity database for drug discovery (BigQuery)
  • Topics: healthcare, earth and nature, biology, chemistry, business
  • License: CC BY-SA 4.0
  • Owner: Google BigQuery
  • Link: https://kaggle.com/datasets/bigquery/ebi-chembl
90. HeartHealthPrediction
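  • Stars: ⭐ 107
  • Description: Worldwide, in developed and less developed countries alike, heart disease is a leading cause of death. Data scientists use distinct machine learning techniques to model health conditions efficiently and accurately from real datasets. Medical analysts urgently need models or systems that can predict a patient's disease risk before onset. High cholesterol, unhealthy diet, harmful alcohol use, high blood sugar, high blood pressure, and smoking are major signs of heart disease risk…
  • Topics: data-science, decision-trees, healthcare, heart-health-prediction, meachinelearning
  • License: Not provided
  • Owner: ammarmahmood1999
  • Link: https://github.com/ammarmahmood1999/HeartHealthPrediction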
91. Global Health,Mortality & Disease Trend Since 2000
92. Health Care Analytics
93. Diabetes_Dataset_With_18_Features
94. Clinical Dataset
95. Laryngeal Voice Disorder Classification
