AIRI-Institute/GENA_Web_service

GENA_Web_service

Backend inference services for GENA-Web: a DNA language model–based platform for sequence annotation and interpretation.

This repository contains the task-specific backend services used to run GENA-Web models. In the accompanying paper, GENA-Web is presented as a web platform for promoter annotation, splice-site annotation, epigenetic/chromatin profiling, and enhancer activity scoring from raw DNA sequence. The deployment described in the paper combines a React/Redux/igv.js frontend with multiple Flask-based model containers; this repository contains the service-side model code, not the full web UI.

What is in this repository

At the top level, the repository is organized around separate service directories under src/, with both GENA-LM and DNABERT variants for several tasks:

  • gena-promoters_2000 — promoter annotation with GENA-LM
  • gena-spliceai — splice donor / acceptor annotation with GENA-LM
  • gena-deepsea — epigenetic / chromatin feature prediction with GENA-LM
  • gena-deepstarr — enhancer activity scoring with GENA-LM

And similarly for DNABERT:

  • DNABERT-Promoters_2000, DNABERT-Promoters_original
  • DNABERT-SpliceAI
  • DNABERT-DeepSea
  • DNABERT-DeepSTARR

The repository is best understood as a collection of independent inference backends, rather than a single polished Python package.

Quick start

Option 1: Run a single service with Docker

Each task directory contains its own Dockerfile. For example, to run the GENA-LM DeepSEA-like service:

docker build -t gena-deepsea ./src/gena-deepsea
docker run --rm -p 3000:3000 gena-deepsea

The included Dockerfile uses Python 3.10, installs the task-specific requirements.txt, copies the service directory into the image, and starts the Flask app with:

python server.py

You can apply the same pattern to the other service directories.
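
As a rough sketch of applying that pattern to every directory listed above, the following Python helper prints the build and run commands per service. It assumes the directories sit under src/ exactly as named in this README and that every service listens on container port 3000 like gena-deepsea does; neither assumption is confirmed for the other services. The image tag is lowercased because Docker rejects image names that contain uppercase letters, which matters for the DNABERT directories.

```python
import shlex

# Service directories as named in this README (assumed to live under src/).
SERVICES = [
    "gena-promoters_2000",
    "gena-spliceai",
    "gena-deepsea",
    "gena-deepstarr",
    "DNABERT-Promoters_2000",
    "DNABERT-Promoters_original",
    "DNABERT-SpliceAI",
    "DNABERT-DeepSea",
    "DNABERT-DeepSTARR",
]


def docker_commands(service: str, host_port: int = 3000, container_port: int = 3000) -> list[str]:
    """Return the docker build and docker run commands for one service directory."""
    # Docker image names must be lowercase, so derive the tag with .lower().
    tag = service.lower()
    build = ["docker", "build", "-t", tag, f"./src/{service}"]
    # container_port=3000 is taken from the gena-deepsea example; other services may differ.
    run = ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", tag]
    return [shlex.join(build), shlex.join(run)]


if __name__ == "__main__":
    for name in SERVICES:
        for command in docker_commands(name):
            print(command)
```

To run several services side by side, pass a different host_port for each one so the published ports do not collide.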

Option 2: Run locally for development

From a chosen service directory, install the local requirements and start the server:

cd src/gena-deepsea
pip install -r requirements.txt
python server.py

This assumes that the required model assets already exist under the service’s data/ directory, including:

  • data/checkpoints/
  • data/configs/
  • data/tokenizers/

Some services also vendor a local gena_lm/ package directly inside the task directory.
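
Before starting a server locally, it can help to confirm that the asset directories listed above are in place. The check below is a minimal sketch, not part of the repository: the function name is hypothetical, and treating an empty directory as missing is an assumption about what the server needs.

```python
from pathlib import Path

# Sub-directories this README says a service expects under data/.
REQUIRED_ASSET_DIRS = ("checkpoints", "configs", "tokenizers")


def missing_asset_dirs(service_dir: str) -> list[str]:
    """Return the expected data/ sub-directories that are absent or empty."""
    data_dir = Path(service_dir) / "data"
    missing = []
    for name in REQUIRED_ASSET_DIRS:
        path = data_dir / name
        # Assumption: an empty directory is as unusable as a missing one.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(f"data/{name}")
    return missing


if __name__ == "__main__":
    # Run from inside a service directory, e.g. src/gena-deepsea.
    problems = missing_asset_dirs(".")
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All expected asset directories are present.")
```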

Example API usage

DeepSEA-like epigenetic profiling

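# FIELD and INPUT_FILE are placeholders: substitute the form field name the
# service expects and the path to your sequence file.
curl -X POST \
  -F "FIELD=@INPUT_FILE" \
  http://localhost:3000/api/gena-deepsea/upload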
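
The same multipart upload can be sketched in Python using only the standard library. This is an illustrative client, not code from the repository: the helper names are hypothetical, the form field name is left as a required argument because this README does not state it, and the 600-second timeout is an arbitrary allowance for slow inference.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field_name: str, filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload(url: str, field_name: str, path: str, timeout: float = 600.0) -> bytes:
    """POST a local file to an upload endpoint and return the raw response body."""
    file_path = Path(path)
    body, content_type = build_multipart(field_name, file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a service running locally, a call would look like upload("http://localhost:3000/api/gena-deepsea/upload", field_name, path_to_sequences), where both arguments after the URL are supplied by the caller.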

Response format

The exact outputs depend on the task, but the services generally return JSON with paths to generated files, for example:

{
  "bed": [
    "/generated/gena-deepsea/request_..._track1.bed",
    "/generated/gena-deepsea/request_..._track2.bed"
  ],
  "fasta_file": "/generated/gena-deepsea/request_... .fa",
  "fai_file": "/generated/gena-deepsea/request_... .fa.fai",
  "archive": "/generated/gena-deepsea/request_..._archive.zip"
}
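
A client will usually want to turn those paths into downloadable URLs. The sketch below assumes the /generated/... paths are served from the same host that handled the upload, which this README does not confirm. The function name and the request_EXAMPLE identifiers are hypothetical stand-ins for the elided real values.

```python
import json
from urllib.parse import urljoin


def collect_download_urls(base_url: str, response_text: str) -> dict[str, list[str]]:
    """Map each key of the JSON response to a list of absolute URLs."""
    payload = json.loads(response_text)
    urls = {}
    for key, value in payload.items():
        # "bed" holds a list of paths; the other keys hold a single path.
        paths = value if isinstance(value, list) else [value]
        urls[key] = [urljoin(base_url, path) for path in paths]
    return urls


if __name__ == "__main__":
    # Hypothetical response; real request identifiers will differ.
    sample = json.dumps({
        "bed": ["/generated/gena-deepsea/request_EXAMPLE_track1.bed"],
        "archive": "/generated/gena-deepsea/request_EXAMPLE_archive.zip",
    })
    for key, links in collect_download_urls("http://localhost:3000", sample).items():
        print(key, links)
```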

For enhancer scoring, the returned files are bedGraph tracks even though the response key is still named bed in the current implementation.
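
Because the key name does not reveal the format, a consumer of the enhancer tracks has to parse them as bedGraph instead of BED. The reader below is a minimal sketch that assumes the service writes the standard four-column bedGraph layout (chrom, start, end, value); that layout has not been verified against the service output, and the names used are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class BedGraphInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    value: float


def parse_bedgraph(lines: Iterable[str]) -> Iterator[BedGraphInterval]:
    """Yield intervals from bedGraph text, skipping header and comment lines."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # "track" and "browser" lines are display headers, not data.
        if fields[0] in ("track", "browser"):
            continue
        if len(fields) != 4:
            raise ValueError(f"expected 4 bedGraph columns, got {len(fields)}: {line!r}")
        chrom, start, end, value = fields
        yield BedGraphInterval(chrom, int(start), int(end), float(value))
```

Passing an open file object works directly, since iterating over it yields one line at a time.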

Citation

If you use this code or build on the GENA-Web system, please cite the GENA-Web paper:

Shmelev A, Petrov M, Penzar D, Akhmetyanov N, Tavritskiy M, Mamontov S, Kuratov Y, Burtsev M, Kardymon O, Fishman V. GENA-Web - GENomic Annotations Web Inference using DNA language models. bioRxiv, 2024. DOI: 10.1101/2024.04.26.591391

You may also want to cite the broader GENA-LM paper for the underlying DNA language model family.
