hackbio-ca/tbd-project-24

patho-detection-plm

Pathogenicity detection using protein language models

Abstract

This project presents a machine learning model that analyzes human protein sequences and classifies single amino acid polymorphisms as either benign or pathogenic. Trained on curated datasets of known benign and pathogenic variants, the model is intended as a tool for R&D researchers, medical geneticists, and clinicians engaged in embryo screening or infant genetic testing. Given an amino acid sequence, it detects mutations and predicts their potential pathogenic impact. The model uses the transformers library (built on PyTorch) for sequence analysis and Weights & Biases for performance tracking and visualization. This approach offers a scalable, data-driven method to assist in genetic screening and early diagnosis, with potential applications in precision medicine and healthcare research.
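To make the notion of a single amino acid polymorphism concrete, the following minimal sketch compares a reference sequence with a variant and reports the one substituted position, if there is exactly one. The function name `find_single_substitution` and the example sequences are hypothetical and chosen for illustration only; this is not the project's own preprocessing code.

```python
from typing import Optional, Tuple


def find_single_substitution(
    reference: str, variant: str
) -> Optional[Tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) when the
    two sequences differ by exactly one substitution, otherwise None.

    Hypothetical helper for illustration; not taken from this repository.
    """
    if len(reference) != len(variant):
        # Insertions and deletions are outside the scope of this sketch.
        return None
    diffs = [
        (i + 1, ref_res, var_res)
        for i, (ref_res, var_res) in enumerate(zip(reference, variant))
        if ref_res != var_res
    ]
    return diffs[0] if len(diffs) == 1 else None


# Made-up sequences: isoleucine at position 6 is replaced by valine.
print(find_single_substitution("MKTAYIAK", "MKTAYVAK"))  # (6, 'I', 'V')
```

A classifier like the one described above would then score the variant sequence; the sketch only covers locating the substitution.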

Installation

To install with conda, run:

conda create -n demo python
conda activate demo
pip install -r Demo/requirements.txt

Quick Start and Usage

# To see predictions for a pre-selected set of protein sequences, run:
python ./Demo/Demo.py 

# To predict pathogenicity for your own sequences, enter them at the console prompt, separated by commas:
python ./Demo/Demo.py --interact

# A trained model can be selected with --model_type (one of esm-fz, esm-ft, esm-fz+mf, or esm-ft+mf), for example:
python ./Demo/Demo.py --model_type esm-ft

# Inference can be run on an available GPU with:
python ./Demo/Demo.py --device gpu
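When supplying your own sequences in interactive mode, it can help to check the comma-separated input before passing it to a model. The sketch below shows one way to do that, assuming input is restricted to the 20 standard amino acid one-letter codes. The helper `parse_sequences` is hypothetical; how Demo.py actually parses and validates input is not shown here.

```python
from typing import List

# The 20 standard amino acids, one-letter codes.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_sequences(raw: str) -> List[str]:
    """Split comma-separated input into upper-case protein sequences.

    Hypothetical helper for illustration; not the parsing used by Demo.py.
    Raises ValueError if a sequence contains a non-standard residue code.
    """
    sequences = []
    for chunk in raw.split(","):
        seq = chunk.strip().upper()
        if not seq:
            continue  # ignore empty entries such as a trailing comma
        invalid = sorted(set(seq) - VALID_RESIDUES)
        if invalid:
            raise ValueError(f"Invalid residue codes in {seq!r}: {invalid}")
        sequences.append(seq)
    return sequences


print(parse_sequences("MKTAYIAK, mktayvak"))  # ['MKTAYIAK', 'MKTAYVAK']
```

Rejecting unknown characters early gives a clearer error than a failure deep inside tokenization or inference.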

Organization

This repo is organized into:

  • Data, datasets used to train the model.
  • Demo, a notebook and CLI to run existing models.
  • Models, the actual models being run.
  • Training, the script defining model classes and training procedure.

Contribute

Contributions are welcome! If you'd like to contribute, please open an issue or submit a pull request. See the contribution guidelines for more information.

Support

If you have any issues or need help, please open an issue or contact the project maintainers.

License

This project is licensed under the MIT License.
