
Commit 242dbdc

KshitizGIT, rads-b, and agsfer authored
Medical LLM on Prem instructions (#1761)
* added a column for LLMs and Terminology Server on the Docs landing page
* Add docs for Medical LLM On Prem Deployment
* Update support.md: Changed content
* Update support.md
* Update Release_notes.md
* Update Release_notes.md
* terminology_server
* Update nlp_products.html
* restructuring and small edits
* Update medical_llm.md: added more content
* Add files via upload
* Update medical_llm.md: added links to new images
* Update medical_llm.md
* Update medical_llm.md
* Update medical_llm.md
* Add files via upload
* Update medical_llm.md
* Update getting_started.md
* Update medical_llm.md
* Update getting_started.md
* Add on-prem and cloud deploy steps Med LLM
* Add on-prem and cloud deploy links
* updated and restructured documentation

---------

Co-authored-by: diatrambitas <JSL.Git2018>
Co-authored-by: rads-b <52103218+rads-b@users.noreply.github.com>
Co-authored-by: Lev <agsfer@gmail.com>
1 parent 891f012 commit 242dbdc

20 files changed: +503 -12 lines changed

docs/_data/navigation.yml

Lines changed: 22 additions & 0 deletions

```diff
@@ -90,6 +90,28 @@ healthcare-gpt:
     - title: Release Notes
       url: /docs/en/chatbot/releases/release_notes

+medical-llm:
+  - title: Medical LLMs
+    children:
+      - title: Home
+        url: /docs/en/LLMs/medical_llm
+      - title: Deploy on Premise
+        url: /docs/en/LLMs/on_prem_deploy
+      - title: Deploy on AWS
+        url: /docs/en/LLMs/on_aws
+      - title: Deploy on Snowflake
+        url: /docs/en/LLMs/on_snowflake
+      - title: Support
+        url: /docs/en/LLMs/support
+      - title: Release Notes
+        url: /docs/en/LLMs/releases/release_notes
+
+term-server:
+  - title: Terminology Server
+    children:
+      - title: Home
+        url: /docs/en/terminology_server/term_server
+
 annotation-lab:
   - title: Generative AI Lab
     children:
```

docs/_layouts/nlp_products.html

Lines changed: 17 additions & 2 deletions

```diff
@@ -15,7 +15,7 @@
       <a href="/docs/en/jsl/install" class="nlp_content_item">
         <div class="nlp_c">
           <figure><img src="/assets/images/jsl_nlp.svg" alt=""></figure>
-          <p class="mont">John Snow Labs NLP</p>
+          <p class="mont">John Snow Labs</p>
         </div>
       </a>
       <a href="/docs/en/licensed_install" class="nlp_content_item">
@@ -27,19 +27,34 @@
       </a>
     </div>
     <div class="nlp_content_col">
-      <div class="nlp_content_title df">No-Code Products</div>
+      <div class="nlp_content_title df">Large Language Models</div>
+      <a href="/docs/en/LLMs/medical_llm" class="nlp_content_item">
+        <div class="nlp_c">
+          <figure><img src="/assets/images/Medical_LLMs.svg" alt=""></figure>
+          <p class="mont">Medical LLMs</p>
+        </div>
+      </a>
       <a href="/docs/en/chatbot/healthcare_gpt" class="nlp_content_item">
         <div class="nlp_c">
           <figure><img src="/assets/images/healthcare_gpt.svg" alt=""></figure>
           <p class="mont">Medical Chatbot</p>
         </div>
       </a>
+    </div>
+    <div class="nlp_content_col">
+      <div class="nlp_content_title df">Applications</div>
       <a href="/docs/en/alab/quickstart" class="nlp_content_item">
         <div class="nlp_c">
           <figure><img src="/assets/images/nlp_lab.svg" alt=""></figure>
           <p class="mont">Generative AI Lab</p>
         </div>
       </a>
+      <a href="/docs/en/terminology_server/term_server" class="nlp_content_item">
+        <div class="nlp_c">
+          <figure><img src="/assets/images/Terminology_Server.svg" alt=""></figure>
+          <p class="mont">Terminology Server</p>
+        </div>
+      </a>
     </div>
     <div class="nlp_content_col">
       <div class="nlp_content_title df">Open-Source Libraries</div>
```

docs/_sass/custom.scss

Lines changed: 28 additions & 8 deletions

```diff
@@ -2622,7 +2622,7 @@ a.btn1 {
 }

 .nlp_content_inner {
-  width: 980px;
+  width: 1200px;
   margin: 0 auto;
   display: flex;
   padding: 0 15px;
@@ -2631,9 +2631,9 @@ a.btn1 {
 }

 .nlp_content_col {
-  width: 304px;
+  width: 280px;
   display: flex;
-  flex: 0 0 auto;
+  flex: 0 0 280px;
   background: rgba(255, 255, 255, 0.15);
   flex-wrap: wrap;
   border-radius: 10px;
@@ -3837,6 +3837,13 @@ code {
     }
   }
 }
+  .nlp_content_inner {
+    width: 1000px;
+  }
+  .nlp_content_col {
+    width: 234px;
+    flex: 0 0 234px;
+  }
 }

 @media (max-width: 1199px) {
@@ -4078,6 +4085,14 @@ code {
   .input_output_modal {
     width: 90%;
   }
+  .nlp_content_inner {
+    width: 764px;
+  }
+  .nlp_content_col {
+    width: 49%;
+    flex: 0 0 49%;
+    margin-bottom: 20px;
+  }
   .learn-hub-inner {
     background: #FFFFFF;
     box-shadow: 0px 5px 15px rgba(158, 222, 250, 0.25);
@@ -4174,7 +4189,8 @@ code {
   }
 }
 .nlp_content_col {
-  width: 32%;
+  width: 44%;
+  flex: 0 0 44%;
 }
 .ecosystem-section .tab-python-scala-li {
   min-width: 175px;
@@ -4280,9 +4296,6 @@ code {
 .nlp_content_item {
   height: 250px;
 }
-.nlp_content_inner {
-  width: 100%;
-}
 .docs-wrapper {
   width: 100%;
 }
@@ -4403,7 +4416,6 @@ code {
   width: 100%;
 }
 .nlp_content_col {
-  width: 28%;
   margin: 0 15px;
 }
 .disp_platform_wrapper {
@@ -4412,6 +4424,9 @@ code {
   margin: 0 10px;
 }
 }
+.nlp_content_inner {
+  width: 100%;
+}
 .nlp_content_item {
   font-size: 18px;
   line-height: 24px;
@@ -4885,6 +4900,10 @@ code {
 .edit-on-github {
   right: 15px;
 }
+.nlp_content_col {
+  width: 100%;
+  flex: 0 0 100%;
+}
 }

 @media (max-width: 660px) {
@@ -4935,6 +4954,7 @@ code {
 .model-wrap .btn-box {
   padding: 40px 0 20px;
 }
+
 }

 @media (max-width: 499px) {
```

docs/assets/images/Medical_LLMs.svg

New file (62.2 KB)

docs/assets/images/graph_med_llm.png

New file (406 KB), plus four additional new image files (121 KB, 130 KB, 85.9 KB, 98.3 KB)

docs/en/LLMs/medical_llm.md

Lines changed: 171 additions & 0 deletions

---
layout: docs
header: true
seotitle: Medical LLMs | John Snow Labs
title: Medical LLMs
permalink: /docs/en/LLMs/medical_llm
key: docs-medical-llm
modify_date: "2025-03-31"
show_nav: true
sidebar:
  nav: medical-llm
---

There is overwhelming evidence from both academic research and industry benchmarks that domain-specific, task-optimized large language models consistently outperform general-purpose LLMs in healthcare. At John Snow Labs, we've developed a suite of Medical LLMs purpose-built for clinical, biomedical, and life sciences applications.

Our models are designed to deliver best-in-class performance across a wide range of medical tasks, from clinical reasoning and diagnostics to medical research comprehension and genetic analysis.

## Medical LLMs Offering

| **Model Name** | **Parameters** | **Recommended GPU Memory** | **Max Sequence Length** | **Model Size** | **Max KV-Cache** | **Tensor Parallel Sizes** |
|---|---|---|---|---|---|---|
| Medical-LLM-7B | 7B | ~25GB | 32K | 14GB | 10.50GB | 1, 2, 4 |
| Medical-LLM-10B | 10B | ~35GB | 32K | 19GB | 15.00GB | 1, 2, 4 |
| Medical-LLM-14B | 14B | ~40GB | 16K | 28GB | 12.50GB | 1, 2 |
| Medical-LLM-24B | 24B | ~70GB | 32K | 44GB | 25GB | 1, 2, 4, 8 |
| Medical-LLM-Small | 14B | ~58GB | 32K | 28GB | 30GB | 1, 2, 4, 8 |
| Medical-LLM-Medium | 70B | ~452GB | 128K | 132GB | 320GB | 4, 8 |

*Note: All memory calculations are based on half-precision (fp16/bf16) weights. Recommended GPU Memory accounts for the model size plus the maximum key-value cache at the model's maximum sequence length. These calculations follow the guidelines from [DJL's LMI Deployment Guide](https://docs.djl.ai/master/docs/serving/serving/docs/lmi/deployment_guide/instance-type-selection.html).*
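The arithmetic behind that note can be sketched in a few lines: fp16/bf16 weights take roughly 2 bytes per parameter, and the recommended figure is approximately the model size plus the maximum KV cache. A minimal illustrative sketch (the helper names are ours, not part of any product API):

```python
def fp16_model_size_gb(params_billion: float) -> float:
    """Rough fp16/bf16 weight footprint: about 2 bytes per parameter."""
    return params_billion * 2


def recommended_gpu_memory_gb(model_size_gb: float, max_kv_cache_gb: float) -> float:
    """Recommended serving memory: weights plus max KV cache, per the note above."""
    return model_size_gb + max_kv_cache_gb


# Medical-LLM-7B: 14GB weights + 10.50GB KV cache
print(recommended_gpu_memory_gb(14, 10.5))   # 24.5, i.e. the ~25GB column
# Medical-LLM-Medium: 132GB weights + 320GB KV cache
print(recommended_gpu_memory_gb(132, 320))   # 452.0
```

The same addition reproduces each row of the table; actual requirements also include activation memory and framework overhead, which is why the recommended figures are approximate.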
## Introduction

John Snow Labs' latest 2025 release of its Medical Large Language Models advances healthcare AI by setting new state-of-the-art accuracy on medical LLM benchmarks. This expands what's achievable in a variety of real-world use cases, including clinical assessment, medical question answering, biomedical research synthesis, and diagnostic decision support.

Leading the pack is the largest 70B model, which can read and understand up to 32,000 words at once – that's roughly 64 pages of medical text. The model is specially trained to work with medical information, from patient records to research papers, making it highly accurate for healthcare tasks. What makes this release special is how well the models perform while still being practical enough for everyday use in hospitals and clinics, thanks to a suite of models in different sizes that balance accuracy with speed, cost, and privacy.

## OpenMed Benchmark Performance

The comprehensive evaluation of John Snow Labs' Medical LLM suite encompasses multiple standardized benchmarks, providing a thorough assessment of their capabilities across various medical domains. These evaluations demonstrate not only the models' proficiency in medical knowledge but also their practical applicability in real-world healthcare scenarios.

The OpenMed evaluation framework represents one of the most rigorous testing environments for medical AI models, covering a broad spectrum of medical knowledge and clinical reasoning capabilities. Our models have undergone extensive testing across multiple categories, achieving remarkable results that validate their exceptional performance.

## Model Performance Matrix

**Large (70B+) Models Comparison**

![Medical LLM by John Snow Labs](/assets/images/large_llm_comparison.png)

**Smaller Models Comparison**

![Medical LLM by John Snow Labs](/assets/images/small_llm_comparison.png)

All scores are presented as percentages (%).

![Medical LLM by John Snow Labs](/assets/images/all_llm_model_comparison.png)

## Open Medical Leaderboard Performance Analysis

John Snow Labs' Medical LLMs have been rigorously evaluated against leading general-purpose and medical-specific models, including GPT-4 and Med-PaLM-2. Here's a detailed breakdown of their performance across key medical domains:

1. **Clinical Knowledge**
   - Outperforms GPT-4 in clinical knowledge assessment (89.43% vs 86.04%)
   - Shows stronger diagnostic and treatment planning capabilities
2. **Medical Genetics**
   - Exceeds both GPT-4 and Med-PaLM-2 in genetic analysis (95% vs 91% and 90%)
   - Demonstrates advanced understanding of genetic disorders and inheritance patterns
3. **Medical Knowledge: Anatomy**
   - Superior anatomical knowledge compared to both alternatives (85.19% vs 80% and 77.8%)
   - Shows stronger grasp of structural and functional relationships
4. **Clinical Reasoning: Professional Practice**
   - Surpasses GPT-4 in professional medical scenarios (94.85% vs 93.01%)
   - Better understanding of medical protocols and clinical guidelines
5. **Cross-Domain Capability: Life Sciences**
   - Slightly lower than GPT-4 but comparable to Med-PaLM-2 (93.75% vs 95.14% and 94.4%)
   - Strong foundation in biological sciences and medical principles
6. **Medical Knowledge: Core Concepts**
   - Significantly outperforms both models (83.24% vs 76.88% and 80.9%)
   - Better understanding of fundamental medical concepts
7. **Clinical Case Analysis**
   - Slightly better performance in clinical case scenarios (79.81% vs 78.87% and 79.7%)
   - More accurate in diagnostic decision-making
8. **Medical Research Comprehension**
   - Notable improvement over GPT-4 in research analysis (79.4% vs 75.2%)
   - Better at interpreting medical literature and research findings
9. **Clinical Assessment**
   - Substantially higher performance in clinical assessments (75.45% vs 69.52% and 71.3%)
   - Superior ability in evaluating clinical scenarios and treatment options

## Small Yet Powerful: Efficiency Meets Performance

One of the standout features of John Snow Labs' Medical LLMs is their efficiency at scale. These models deliver exceptional performance without requiring massive infrastructure:

- Designed to run efficiently on a range of GPU configurations
- Available in multiple sizes (7B, 10B, 14B, 24B, 70B) to suit different deployment needs
- Optimized for both on-premise and private cloud deployments

💡 You can achieve cutting-edge performance in clinical NLP without the costs and risks of using massive general-purpose models.
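The Tensor Parallel Sizes column in the table above indicates how each model can be sharded across multiple GPUs; dividing the recommended memory by the tensor-parallel degree gives a rough per-GPU requirement. A hypothetical sizing helper, for illustration only (the function name is ours, and real deployments should follow the DJL LMI guide):

```python
def smallest_tp_size(recommended_gb, gpu_memory_gb, supported_tp_sizes):
    """Return the smallest supported tensor-parallel degree whose per-GPU
    share of the recommended memory fits on a single GPU, or None if none fit.
    Rough sizing only: ignores per-GPU replication and runtime overhead."""
    for tp in sorted(supported_tp_sizes):
        if recommended_gb / tp <= gpu_memory_gb:
            return tp
    return None


# Medical-LLM-Medium (~452GB recommended, TP sizes 4 and 8) on 80 GB GPUs:
# 452/4 = 113 GB per GPU (too big), 452/8 = 56.5 GB per GPU (fits)
print(smallest_tp_size(452, 80, [4, 8]))  # 8
```

This kind of back-of-the-envelope check is useful for picking instance types before running a proper load test.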
![Medical LLM by John Snow Labs](/assets/images/graph_med_llm.png)

💡 The figures demonstrate the comparative performance metrics of our models across key medical benchmarks and clinical reasoning tasks.

![Medical LLM by John Snow Labs](/assets/images/web1_llm_model_comparison.png)

![Medical LLM by John Snow Labs](/assets/images/web2_llm_model_comparison.png)

**Medical-LLM – 14B**

- Achieves an 81.42% average score vs GPT-4's 82.85% and Med-PaLM-2's 84.08%
- Clinical knowledge score of 92.36% vs Med-PaLM-2's 88.3%
- Medical reasoning at 90% matches Med-PaLM-2's performance
- Higher accuracy than Meditron-70B while using 5x fewer parameters
- Suitable for deployment scenarios with compute constraints

**Medical-LLM – 10B**

- Average score of 75.19% across medical benchmarks
- Clinical analysis score of 88.19% vs Med-PaLM-1's 83.8%
- Medical Genetics score of 82% vs Med-PaLM-1's 75%
- Comparable performance to models requiring 7x more parameters
- Balanced option for resource-conscious implementations

**Medical-LLM – 7B**

- Clinical reasoning score of 86.81% vs Med-PaLM-1's 83.8%
- Average score of 71.70% on the OpenMed benchmark suite
- PubMedQA score of 75.6%, higher than other 7B models
- Matches GPT-4's accuracy on medical QA with 100x fewer parameters
- Efficient choice for high-throughput clinical applications

## Performance-to-Size Comparison

![Medical LLM by John Snow Labs](/assets/images/perftosize_llm_model_comparison.png)

These models are available for on-premise deployment as well as through leading cloud marketplaces, making deployment and integration straightforward for healthcare organizations. Marketplace availability ensures scalable access to these state-of-the-art medical AI capabilities, with enterprise-grade security and compliance features built in. Organizations can leverage these models through flexible consumption-based pricing, enabling both small-scale implementations and large enterprise deployments.

## Partner With Us

We're committed to helping you stay at the cutting edge of medical AI. Whether you're building decision support tools, clinical chatbots, or research platforms, our team is here to help.

[Book a call with our experts](https://www.johnsnowlabs.com/schedule-a-demo/) to:

- Discuss your specific use case
- Get a live demo of the Medical LLMs
- Explore tailored deployment options
