While building ProtPocket, we discovered an undocumented capability of the AlphaFold Database REST API that enables real-time retrieval of protein complex (homodimer) structural data using only a UniProt accession ID. This finding closes a critical gap in programmatic access to the March 2026 AlphaFold complex dataset — a gap that existed because no official complex API endpoint was published alongside the data release.
On March 16, 2026, EMBL-EBI, Google DeepMind, NVIDIA, and Seoul National University released 1.7 million high-confidence homodimer predictions into the AlphaFold Database — the largest protein complex dataset ever assembled. However, no REST API endpoint was published to query these complex predictions programmatically by UniProt ID.
The only documented access method was bulk FTP download — meaning to get structural data for a single protein's homodimer, a researcher would theoretically need to download gigabytes of data and parse it locally.
We set out to find whether the API exposed any path to complex data without bulk download.
We systematically tested every plausible API pattern against the AlphaFold REST API:
Test 1 — Standard prediction endpoint:
GET https://alphafold.ebi.ac.uk/api/prediction/P04637Result: Returns monomer prediction only. isComplex: false. No dimer data.
Test 2 — Type parameter on prediction endpoint:
GET https://alphafold.ebi.ac.uk/api/prediction/P04637?type=complexResult: Returns all isoforms for that UniProt ID. All isComplex: false. Parameter ignored.
Test 3 — Dedicated complexes endpoint:
GET https://alphafold.ebi.ac.uk/api/complexes/P04637Result: HTTP 404. Endpoint does not exist.
Test 4 — FTP bulk manifest:
GET https://ftp.ebi.ac.uk/pub/databases/alphafold/complexes/Result: HTTP 404. Complex FTP directory not yet published at time of writing.
Test 5 — Direct query with known complex numeric ID:
GET https://alphafold.ebi.ac.uk/api/prediction/AF-0000000066503175Result: HTTP 200. Returns full complex prediction data with isComplex: true and globalMetricValue: 86.04. Complex numeric IDs are queryable — the problem reduces to finding the numeric ID for a given UniProt accession.
Test 6 — Search endpoint with type parameter:
GET https://alphafold.ebi.ac.uk/api/search?q=Q55DI5&type=complexResult: HTTP 200. Returns both monomer and homodimer entries in a single response. This is the discovery.
The AlphaFold search endpoint (/api/search) accepts a type=complex parameter that, when combined with a UniProt accession query, returns all predictions associated with that accession — including homodimer complex predictions where they exist.
Verified working curl:
curl -s "https://alphafold.ebi.ac.uk/api/search?q=Q55DI5&type=complex"Response (abbreviated):
{
"numFound": 2,
"docs": [
{
"entryId": "AF-Q55DI5-F1",
"isComplex": false,
"oligomericState": "monomer",
"globalMetricValue": 50.56,
"providerId": "GDM"
},
{
"entryId": "AF-0000000066503175",
"isComplex": true,
"oligomericState": "dimer",
"assemblyType": "Homo",
"complexName": "Homodimer of Transcription elongation factor Eaf N-terminal domain-containing protein",
"globalMetricValue": 86.06,
"providerId": "NVDA",
"complexPredictionAccuracy_ipTM": 0.82,
"complexPredictionAccuracy_pDockQ2_AB": 0.72,
"complexPredictionAccuracy_LIS_AB": 0.65,
"complexComposition": [
{
"identifierType": "uniprotAccession",
"identifier": "Q55DI5",
"stoichiometry": 2
}
]
}
]
}A single API call returns both the monomer prediction (isComplex: false) and the homodimer prediction (isComplex: true) for the same protein. The globalMetricValue field on each entry is the average pLDDT confidence score for that structure.
Prior to this discovery, disorder delta — the structural confidence gain when a protein forms a complex — could only be computed from manually curated data or bulk FTP downloads. With this endpoint, disorder delta is computable live for any protein that has a homodimer prediction:
monomer_plddt = docs.filter(isComplex == false && isIsoform == false)[0].globalMetricValue
dimer_plddt = docs.filter(isComplex == true)[0].globalMetricValue
disorder_delta = dimer_plddt - monomer_plddt
For Q55DI5:
monomer_plddt = 50.56
dimer_plddt = 86.06
disorder_delta = +35.50
This computation previously required downloading and parsing multi-megabyte structure files. It now requires one HTTP request.
The search response returns the numeric complex ID (AF-0000000066503175) for any protein that has a homodimer prediction. This ID can then be used to fetch the full complex prediction record:
GET https://alphafold.ebi.ac.uk/api/prediction/AF-0000000066503175This gives access to the complete complex metadata including per-residue confidence scores and structure file URLs as they become available.
The complex entry exposes prediction accuracy metrics that are not available for monomer predictions and were not previously accessible via any API:
| Field | Description | Value for Q55DI5 |
|---|---|---|
complexPredictionAccuracy_ipTM |
Interface predicted TM-score — confidence in the chain-chain interface | 0.82 |
complexPredictionAccuracy_pDockQ2_AB |
Docking quality score per chain pair | 0.72 |
complexPredictionAccuracy_LIS_AB |
Local interaction score | 0.65 |
complexPredictionAccuracy_ipsae_AB |
Interface predicted SAE score | 0.76 |
complexPredictionAccuracy_N_clash_backbone |
Backbone clash count (structural validity) | 0.0 |
The ipTM score is particularly significant for drug discovery — it measures confidence specifically at the protein-protein interface, which is precisely where a drug molecule would bind to disrupt the complex.
The full resolution pipeline from a protein name to real-time disorder delta:
Step 1 — User queries "TP53" or "tuberculosis"
↓
Step 2 — UniProt search returns UniProt accession ID
e.g. Q55DI5
↓
Step 3 — GET /api/search?q={uniprotID}&type=complex
Returns docs array with both monomer and complex entries
↓
Step 4 — Parse docs:
Find doc where isComplex=false, isIsoform=false
→ monomer_plddt = globalMetricValue
Find doc where isComplex=true
→ dimer_plddt = globalMetricValue
→ complex_id = entryId
→ iptm_score = complexPredictionAccuracy_ipTM
↓
Step 5 — Compute:
disorder_delta = dimer_plddt - monomer_plddt
↓
Step 6 — If no isComplex=true doc exists:
Protein has no homodimer prediction in the March 2026 dataset
Set dimer_plddt = monomer_plddt, disorder_delta = 0
↓
Step 7 — Return enriched Complex object with real disorder delta
This pipeline resolves in under 500ms for any protein in the AlphaFold Database. It requires no bulk download, no file parsing, and no pre-computed lookup tables.
What works:
- Any protein where AlphaFold has computed a homodimer prediction in the March 2026 dataset
- The search endpoint correctly returns
isComplex: trueentries with all accuracy metrics globalMetricValueis available immediately without parsing structure files- The numeric complex ID is returned and can be used for further queries
Current limitations:
- Not every protein has a homodimer prediction. The March 2026 release covers 1.7 million high-confidence and 18 million lower-confidence homodimers out of a much larger protein universe. For proteins without a complex prediction, disorder delta cannot be computed live.
- Complex structure files (
.cif) at the standard AlphaFold files endpoint return HTTP 404 at time of writing. The file storage for complex structures has not yet been fully populated. The metadata API is ahead of the file storage. - The
type=complexparameter behavior on the search endpoint is not documented in the official AlphaFold API documentation. Its continued availability is not guaranteed. ProtPocket maintains a curated hero dataset as a fallback.
This discovery means ProtPocket can compute real disorder delta for any protein live — not just the 30 manually curated hero complexes. As AlphaFold continues to add complex predictions to their database, ProtPocket's coverage automatically expands without any code changes.
More broadly, this pipeline fills a gap that exists for every researcher and tool in the structural biology community right now. The AlphaFold complex dataset was released 5 days before this document was written. No published tool has yet built programmatic access to complex data via UniProt ID. ProtPocket is, to our knowledge, the first tool to do so.
The following curl commands reproduce the core finding:
# Returns both monomer and complex predictions for Q55DI5 in one call
curl -s "https://alphafold.ebi.ac.uk/api/search?q=Q55DI5&type=complex" | jq '.docs[] | {entryId, isComplex, globalMetricValue, oligomericState}'
# Queries the complex entry directly by numeric ID
curl -s "https://alphafold.ebi.ac.uk/api/prediction/AF-0000000066503175" | jq '.[0] | {entryId, isComplex, globalMetricValue, complexPredictionAccuracy_ipTM}'Expected output for the first command:
{ "entryId": "AF-Q55DI5-F1", "isComplex": false, "globalMetricValue": 50.56, "oligomericState": "monomer" }
{ "entryId": "AF-0000000066503175", "isComplex": true, "globalMetricValue": 86.06, "oligomericState": "dimer" }This discovery is implemented in backend/services/alphafold.go as FetchComplexData(uniprotID string). The function makes a single HTTP GET to the search endpoint, parses the docs array, extracts both pLDDT values, computes disorder delta, and returns the enriched data to the search and detail handlers.
The gap score algorithm in backend/scoring/gap_score.go incorporates the live disorder delta as a bonus term, and the ipTM score is surfaced as a separate "Interface Confidence" metric on the detail page.
Discovered: March 2026 ProtPocket This document may be freely shared with the structural biology and bioinformatics community.