Capturing Molecular Interactions in Graph Neural Networks: A Case Study in Multi-Component Phase Equilibrium (SolvGNN)
Here, we propose a solvent GNN (SolvGNN) architecture that captures molecular interactions in an explicit manner by combining atomic-level (local) graph convolution and molecular-level (global) message passing through a molecular interaction network. SolvGNN uses a graph representation wherein each node represents a molecule and each edge represents a potential intermolecular interaction (e.g., hydrogen bonding). We tested SolvGNN via a case study on activity coefficient predictions for binary and ternary mixtures using large data sets obtained from COnductor-like Screening MOdel for Real Solvation (COSMO-RS) calculations. We show that the proposed SolvGNN can predict composition-dependent activity coefficients with high accuracy. To interpret the trained model, we performed counterfactual analysis on SolvGNN to highlight the physical implications of chemical structures and compositions on the solvation behavior. Finally, we built a SolvGNN-based framework that takes a given mixture (binary or ternary) as input and generates the corresponding phase diagrams (P-x-y) using predicted activity coefficients coupled with phase equilibrium calculations.
SolvGNN is an open-source computational framework and tool. It can be directly used as tools to predict activity coefficients and/or generate phase diagrams; it can also be used for researchers and developers to train their own data sets and initiate future research.
Features
Model training and validation frameworkfor SolvGNNActivity coefficient predictionsat 298K for binary or ternary mixtures given their SMILES stringsphase diagrams (P-x-y) visualizationalong withphase equilibrium calculationsCounterfactual analysisto obtain physical insightsLarge data sets availabilityfrom COSMO-RS, containing 200,000 data records for binary mixtures and 160,000 data recordds for ternary mixtures
Publication
P-x-y Phase Diagram (VLE) at 298K
Conterfactual
$ git clone -b SolvGNN --single-branch https://github.com/zavalab/ML.git
$ cd ML
$ conda create -f environment.ymlFor detailed usage, navigate to the notebook directory.


