| 1 | 2019/01 | BioBERT | BERT | MLM, NSP | 110M | GitHub |
| 2 | 2019/02 | BERT-MIMIC | BERT | MLM, NSP | 110M, 340M | N/A |
| 3 | 2019/04 | BioELMo | ELMo | Bi-LM | 93.6M | GitHub |
| 4 | 2019/04 | Clinical BERT (Emily) | BERT | MLM, NSP | 110M | GitHub |
| 5 | 2019/04 | ClinicalBERT (Kexin) | BERT | MLM, NSP | 110M | GitHub |
| 6 | 2019/06 | BlueBERT | BERT | MLM, NSP | 110M, 340M | GitHub |
| 7 | 2019/06 | G-BERT | GNN + BERT | Self-Prediction, Dual-Prediction | 3M | GitHub |
| 8 | 2019/07 | BEHRT | BERT | MLM, NSP | N/A | GitHub |
| 9 | 2019/08 | BioFLAIR | FLAIR | Bi-LM | N/A | GitHub |
| 10 | 2019/09 | EhrBERT | BERT | MLM, NSP | 110M | GitHub |
| 11 | 2019/12 | Clinical XLNet | XLNet | Generalized Autoregressive Pretraining | 110M | GitHub |
| 12 | 2020/04 | GreenBioBERT | BERT | CBOW Word2Vec, Word Vector Space Alignment | 110M | GitHub |
| 13 | 2020/05 | BERT-XML | BERT | MLM, NSP | N/A | N/A |
| 14 | 2020/05 | Bio-ELECTRA | ELECTRA | Replaced Token Prediction | 14M | GitHub |
| 15 | 2020/05 | Med-BERT | BERT | MLM, Prolonged LOS Prediction | 110M | GitHub |
| 16 | 2020/05 | ouBioBERT | BERT | MLM, NSP | 110M | GitHub |
| 17 | 2020/07 | PubMedBERT | BERT | MLM, NSP, Whole-Word Masking | 110M | HuggingFace |
| 18 | 2020/08 | MCBERT | BERT | MLM, NSP | 110M, 340M | GitHub |
| 19 | 2020/09 | BioALBERT | ALBERT | MLM, SOP | 12M, 18M | GitHub |
| 20 | 2020/09 | BRLTM | BERT | MLM | N/A | GitHub |
| 21 | 2020/10 | BioMegatron | Megatron | MLM, NSP | 345M, 800M, 1.2B | GitHub |
| 22 | 2020/10 | CharacterBERT | BERT + Character-CNN | MLM, NSP | 105M | GitHub |
| 23 | 2020/10 | ClinicalTransformer | BERT - ALBERT - RoBERTa - ELECTRA | MLM, NSP - MLM, SOP - MLM - Replaced Token Prediction | 110M - 12M - 125M - 110M | GitHub |
| 24 | 2020/10 | SapBERT | BERT | Multi-Similarity Loss | 110M | GitHub |
| 25 | 2020/10 | UmlsBERT | BERT | MLM | 110M | GitHub |
| 26 | 2020/11 | bert-for-radiology | BERT | MLM, NSP | 110M | GitHub |
| 27 | 2020/11 | Bio-LM | RoBERTa | MLM | 125M, 355M | GitHub |
| 28 | 2020/11 | CODER | PubMedBERT - mBERT | Contrastive Learning | 110M - 110M | GitHub |
| 29 | 2020/11 | exBERT | BERT | MLM, NSP | N/A | GitHub |
| 30 | 2020/12 | BioMedBERT | BERT | MLM, NSP | 340M | GitHub |
| 31 | 2020/12 | LBERT | BERT | MLM, NSP | 110M | GitHub |
| 32 | 2021/04 | CovidBERT | BioBERT | MLM, NSP | 110M | N/A |
| 33 | 2021/04 | ELECTRAMed | ELECTRA | Replaced Token Prediction | N/A | GitHub |
| 34 | 2021/04 | KeBioLM | PubMedBERT | MLM, Entity Detection, Entity Linking | 110M | GitHub |
| 35 | 2021/04 | SINA-BERT | BERT | MLM | 110M | N/A |
| 36 | 2021/05 | ProteinBERT | BERT | Corrupted Token, Annotation Prediction | 16M | GitHub |
| 37 | 2021/05 | SciFive | T5 | Span Corruption Prediction | 220M, 770M | GitHub |
| 38 | 2021/06 | BioELECTRA | ELECTRA | Replaced Token Prediction | 110M | GitHub |
| 39 | 2021/06 | EntityBERT | BERT | Entity-centric MLM | 110M | N/A |
| 40 | 2021/07 | MedGPT | GPT-2 + GLU + RotaryEmbed | LM | N/A | N/A |
| 41 | 2021/08 | SMedBERT | SMedBERT | Masked Neighbor Modeling, Masked Mention Modeling, SOP, MLM | N/A | GitHub |
| 42 | 2021/09 | Bio-cli | RoBERTa | MLM, Subword Masking or Whole Word Masking | 125M | GitHub |
| 43 | 2021/11 | UTH-BERT | BERT | MLM, NSP | 110M | GitHub |
| 44 | 2021/12 | ChestXRayBERT | BERT | MLM, NSP | 110M | N/A |
| 45 | 2021/12 | MedRoBERTa.nl | RoBERTa | MLM | 123M | GitHub |
| 46 | 2021/12 | PubMedELECTRA | ELECTRA | Replaced Token Prediction | 110M, 335M | HuggingFace |
| 47 | 2022/01 | Clinical-BigBird | BigBird | MLM | 166M | GitHub |
| 48 | 2022/01 | Clinical-Longformer | Longformer | MLM | 149M | GitHub |
| 49 | 2022/03 | BioLinkBERT | BERT | MLM, Document Relation Prediction | 110M, 340M | GitHub |
| 50 | 2022/04 | BioBART | BART | Text Infilling, Sentence Permutation | 140M, 400M | GitHub |
| 51 | 2022/05 | bsc-bio-ehr-es | RoBERTa | MLM | 125M | GitHub |
| 52 | 2022/05 | PathologyBERT | BERT | MLM, NSP | 110M | HuggingFace |
| 53 | 2022/06 | RadBERT | RoBERTa | MLM | 110M | GitHub |
| 54 | 2022/06 | ViHealthBERT | BERT | MLM, NSP, Capitalized Prediction | 110M | GitHub |
| 55 | 2022/07 | Clinical Flair | Flair | Character-level Bi-LM | N/A | GitHub |
| 56 | 2022/08 | KM-BERT | BERT | MLM, NSP | 99M | GitHub |
| 57 | 2022/09 | BioGPT | GPT | Autoregressive Language Model | 347M, 1.5B | GitHub |
| 58 | 2022/10 | Bioberturk | BERT | MLM, NSP | N/A | GitHub |
| 59 | 2022/10 | DRAGON | GreaseLM | MLM, KG Link Prediction | 360M | GitHub |
| 60 | 2022/10 | UCSF-BERT | BERT | MLM, NSP | 135M | N/A |
| 61 | 2022/10 | ViPubmedT5 | ViT5 | Spans-masking learning | 220M | GitHub |
| 62 | 2022/12 | ALIBERT | BERT | MLM | 110M | N/A |
| 63 | 2022/12 | BioMedLM | GPT2 | Autoregressive Language Model | 2.7B | GitHub |
| 64 | 2022/12 | BioReader | T5 & RETRO | MLM | 229.5M | GitHub |
| 65 | 2022/12 | clinicalT5 | T5 | Span-mask Denoising Objective | 220M, 770M | N/A |
| 66 | 2022/12 | Gatortron | BERT | MLM | 8.9B | GitHub |
| 67 | 2022/12 | Med-PaLM | Flan-PaLM | Instruction Prompt Tuning | 540B | Official Site |
| 68 | 2023/01 | clinical-T5 | T5 | Fill-in-the-blank-style denoising objective | 220M, 770M | PhysioNet |
| 69 | 2023/01 | CPT-BigBird | BigBird | MLM | 166M | N/A |
| 70 | 2023/01 | CPT-Longformer | Longformer | MLM | 149M | N/A |
| 71 | 2023/02 | Bioformer | Bioformer | MLM, NSP | 43M | GitHub |
| 72 | 2023/02 | Lightweight | DistilBERT | MLM, Knowledge Distillation | 65M, 25M, 18M, 15M | GitHub |
| 73 | 2023/03 | RAMM | PubmedBERT | MLM, Contrastive Learning, Image-Text Matching | N/A | GitHub |
| 74 | 2023/04 | DrBERT | RoBERTa | MLM | 110M | GitHub |
| 75 | 2023/04 | MOTOR | BLIP | MLM, Contrastive Learning, Image-Text Matching | N/A | GitHub |
| 76 | 2023/05 | BiomedGPT | BART backbone + BERT-encoder + GPT-decoder | MLM | 33M, 93M, 182M | GitHub |
| 77 | 2023/05 | TurkRadBERT | BERT | MLM, NSP | 110M | N/A |
| 78 | 2023/06 | CamemBERT-bio | BERT | Whole Word MLM | 111M | HuggingFace |
| 79 | 2023/06 | ClinicalGPT | T5 | Supervised Fine Tuning, Rank-based Training | N/A | N/A |
| 80 | 2023/06 | EriBERTa | RoBERTa | MLM | 125M | N/A |
| 81 | 2023/06 | PharmBERT | BERT | MLM | 110M | GitHub |
| 82 | 2023/07 | BioNART | BERT | Non-AutoRegressive Model | 110M | GitHub |
| 83 | 2023/07 | BIOptimus | BERT | MLM | 110M | GitHub |
| 84 | 2023/07 | KEBLM | BERT | MLM, Contrastive Learning, Ranking Objective | N/A | N/A |
| 85 | 2023/09 | CPLLM | Llama2 | Autoregressive Language Model, Supervised Fine Tuning | 13B | GitHub |
| 86 | 2023/11 | MedCPT | BERT | Contrastive Learning, Ranking Objective | 110M | GitHub |