Add `CMPNN` model by devanshamin · Pull Request #9223 · pyg-team/pytorch_geometric

devanshamin · 2024-04-21T19:46:31Z

Paper Summary

Introduces a new model, CMPNN, for Communicative Representation Learning on Attributed Molecular Graphs
CMPNN improves molecular embedding by:
- Following the edge-based message passing in DMPNN
- Introducing node-edge message communication modules such as Inner Product Kernel, Gated Graph Kernel, and Multilayer Perception
- Updating both bond and atom embeddings during training
- Including a message booster to enrich the message generation process

Motivation

Adding an alternative to DMPNN (Chemprop uses this GNN architecture) that improves DMPNN by adding:
- Node-edge message communication modules
- Message booster
- Updating bond embeddings
Adding a GNN model that updates edge embeddings during training

Benchmark Results

I have benchmarked CMPNN on a subset of datasets from the TDC ADMET Benchmark Group. I have used LitGNN to perform this benchmark and results can be found in a W&B report.

Task	Classification (AUROC ↑)		Regression (MAE ↓)
Dataset	BBB_Martins	AMES	Solubility_AqSolDB	Lipophilicity_AstraZeneca	LD50_Zhu
AttentiveFP	0.855 ± 0.011	0.814 ± 0.008	0.776 ± 0.008	0.572 ± 0.007	0.678 ± 0.012
Chemprop	0.821 ± 0.112	0.842 ± 0.014	0.829 ± 0.022	0.470 ± 0.009	0.606 ± 0.024
Chemprop-RDKit*	0.869 ± 0.027	0.850 ± 0.004	0.761 ± 0.025	0.467 ± 0.006	0.625 ± 0.022
CMPNN	0.89 ± 0.016 _CMPNN-GRU	0.843 ± 0.009 _CMPNN-MLP	0.796 ± 0.038 _CMPNN-GRU	0.515 ± 0.008 _CMPNN-MLP	0.631 ± 0.021 _{CMPNN-Additive}

Table: Prediction results of CMPNN on five chemical graph datasets. The datasets were used from the TDC ADMET Benchmark group that provides train_val/test scaffold splits. The model was trained and tested for each task for five times, and reported the mean and standard deviation of AUROC or MAE values. *Chemprop-RDKIT utilizes a hybrid approach where it combines the learned molecule embeddings with 200 global molecule features (descriptors).

Implementation Details

Note

Here is a fork of the original code with some cleanups, addition of poetry for dependency management etc.

Below are the places where improvements have been made:

Node-edge message communication modules
- In the paper, the authors mention different communicators such as Inner Product Kernel, Gated Graph Kernel and Multilayer Perception. However, within their code, they don't use such different communicators. Instead, they only use an additive communicator (not mentioned in the paper; I came up with the name 'additive').
- I have implemented 4 communicators proposed in the paper and their code:
  - Additive
  - Inner product
  - GRU
  - MLP
- The user can choose different communicator according to their dataset.
- These communication modules are applied during the convolution layers.
Final communication module
- In the paper, the authors mention applying the same communication module to the message from incoming bonds [m(v)], current atom's representation [hK(v)] and atom's initial representation [x(v)].
- In the section 3.3 of the paper where they describe the different node-edge message communication modules, all the communication modules operate on m(v) and hK-1(v).
- In their code, they use MLP communication module.
- I have kept this part the same and hardcoded the MLP communication module.

Checklist

Note

For CMPNN PyG implementation, I have used AttentiveFP as a template.

Add torch_geometric/nn/models/cmpnn.py
Add CMPNN to torch_geometric/nn/models/__init__.py
Add test/nn/models/test_cmpnn.py
Add message_booster mode to the torch_geometric/nn/aggr/multi.py:MultiAggregation class
Add a test for message_booster mode in test/nn/aggr/test_multi.py
Add examples/cmpnn.py
Add CMPNN to the 'Implemented GNN Models' section of the README
Update CHANGELOG

Thank you! Please let me know if any changes are required.

for more information, see https://pre-commit.ci

…_geometric into models/cmpnn

AzureLeon1 · 2024-05-13T14:53:07Z

Thank you for your efforts in implementing CMPNN. Based on my tests, this pull request correctly implements the key modules described in the original paper. One question I have is whether it would be more appropriate to separate BatchGRU from CMPNN, considering that BatchGRU serves the role of a readout function.

devanshamin · 2024-05-14T01:36:00Z

Thank you @AzureLeon1 for taking the time to review the PR. Regarding the BatchGRU, I have few thoughts -

If you look at the forward method of the AttentiveFP, it utilizes a PyTorch GRUCell while CMPNN implements BatchGRU using the PyTorch GRU.
Another option would be to move it under nn.pool.

I feel BatchGRU is specific to CMPNN, but if you (or the maintainers of PyG) see a utility of having it in a separate file, I'm happy to make the necessary changes.

devanshamin added 5 commits April 21, 2024 15:36

Add CMPNN model

603b918

Add message_booster mode

343ae1a

Add test for message_booster mode

a513251

Add test for CMPNN model

097411b

Add example for CMPNN model

b0568a3

devanshamin requested review from EdisonLeeeee and wsad1 as code owners April 21, 2024 19:46

github-actions bot added nn example labels Apr 21, 2024

pre-commit-ci bot and others added 3 commits April 21, 2024 19:47

[pre-commit.ci] auto fixes from pre-commit.com hooks

9375271

for more information, see https://pre-commit.ci

Add CMPNN model

521e624

Merge branch 'models/cmpnn' of https://github.com/devanshamin/pytorch…

36ee8e0

…_geometric into models/cmpnn

rusty1s assigned devanshamin May 2, 2024

rusty1s added feature 1 - Priority P1 labels May 2, 2024

Merge branch 'pyg-team:master' into models/cmpnn

9e725be

rusty1s and others added 3 commits May 23, 2024 08:31

Merge branch 'master' into models/cmpnn

4ca9777

Merge branch 'master' into models/cmpnn

becb830

Merge branch 'master' into models/cmpnn

d302b6e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `CMPNN` model#9223

Add `CMPNN` model#9223
devanshamin wants to merge 12 commits intopyg-team:masterfrom
devanshamin:models/cmpnn

devanshamin commented Apr 21, 2024 •

edited

Loading

Uh oh!

AzureLeon1 commented May 13, 2024

Uh oh!

devanshamin commented May 14, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

devanshamin commented Apr 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Paper Summary

Motivation

Benchmark Results

Implementation Details

Checklist

Uh oh!

AzureLeon1 commented May 13, 2024

Uh oh!

devanshamin commented May 14, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

devanshamin commented Apr 21, 2024 •

edited

Loading