penm package

Julian Echave 26/09/2022

The penm package contains functions to build Elastic Network Models (ENM) of proteins and to perturb them; penm stands for Perturbing Elastic Network Models.

Overview

In short:

bio3d::read.pdb(): Set up a pdb protein object.
set_enm(): Set up a penm prot object, containing protein and its ENM info.
amrs()and smrs(): perform single-mutation scans to calculate mutation-response matrices.
admrs()and sdmrs(): perform double-mutation scans to calculate compensation matrices.

Installation

Install packages penm (this package) and jefuns (miscelaneous functions, some of which penm uses).

# install.packages("devtools")

devtools::install_github("jechave/jefuns")
devtools::install_github("jechave/penm")

Getting started

Before using the package, it needs to be loaded. To run this file, also load the following packages: tidyverse, patchwork, and jefuns.

library(tidyverse)  
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
#> ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
#> ✔ tibble  3.1.8      ✔ dplyr   1.0.10
#> ✔ tidyr   1.2.1      ✔ stringr 1.4.0 
#> ✔ readr   2.1.2      ✔ forcats 0.5.1
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
library(patchwork)  
library(jefuns)
library(penm)

Set up the ENM for a protein

First, read a pdb file using bio3d::read.pdb to generate a pdb object for a protein. Then, create the prot object, that contains the full ENM analysis.

pdb <-  bio3d::read.pdb("data-raw/2XWRa.pdb") # read a pdb file
#>    PDB has ALT records, taking A only, rm.alt=TRUE
wt <- set_enm(pdb, node = "calpha", model = "anm", d_max = 10.5, frustrated = FALSE)

wt created here by set_enm() is an object of class prot. In this example, network nodes are placed at $C_\alpha$ coordinates, the model used is Bahar’s Anisotropic Network Model (model = "anm") with a cut-off distance to define contacts of d_max = 10.5. frustrated indicates whether to add frustrations to the model (it defaults to FALSE, it may be ommited from the list of arguments).

The object wt created above is a list that contains several components:

str(wt) # show structure of the object created
#> List of 6
#>  $ param:List of 4
#>   ..$ node      : chr "calpha"
#>   ..$ model     : chr "anm"
#>   ..$ d_max     : num 10.5
#>   ..$ frustrated: logi FALSE
#>  $ nodes:List of 5
#>   ..$ nsites  : int 199
#>   ..$ site    : int [1:199] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ pdb_site: int [1:199] 91 92 93 94 95 96 97 98 99 100 ...
#>   ..$ bfactor : num [1:199] 24.1 26.2 19.7 19.8 19 ...
#>   ..$ xyz     : num [1:597] -25.3 24.7 -48.8 -27 25.1 ...
#>  $ graph: tibble [1,970 × 8] (S3: tbl_df/tbl/data.frame)
#>   ..$ edge: chr [1:1970] "1-2" "1-3" "1-4" "1-81" ...
#>   ..$ i   : int [1:1970] 1 1 1 1 1 1 1 1 2 2 ...
#>   ..$ j   : int [1:1970] 2 3 4 81 82 83 84 122 3 4 ...
#>   ..$ v0ij: num [1:1970] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ sdij: int [1:1970] 1 2 3 80 81 82 83 121 1 2 ...
#>   ..$ lij : num [1:1970] 3.82 6.48 9.49 8.31 8.11 ...
#>   ..$ kij : num [1:1970] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ dij : num [1:1970] 3.82 6.48 9.49 8.31 8.11 ...
#>  $ eij  : num [1:1970, 1:3] -0.447 -0.383 -0.635 -0.293 -0.628 ...
#>  $ kmat : num [1:597, 1:597] 2.6679 1.942 -1.1157 -0.1998 0.0439 ...
#>  $ nma  :List of 4
#>   ..$ mode  : int [1:591] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ evalue: num [1:591] 0.088 0.0928 0.1167 0.1268 0.1427 ...
#>   ..$ cmat  : num [1:597, 1:597] 1.513 -0.987 0.368 0.393 -0.226 ...
#>   ..$ umat  : num [1:597, 1:591] 0.02607 -0.00897 -0.02868 0.03198 -0.00398 ...
#>  - attr(*, "class")= chr [1:2] "prot" "list"

wt$param is a list of model parameters
wt$nodes has information on number of sites, site-indexes, B-factors, and cartesian coordinates of the nodes
wt$graph is a graph representation of the elastic network (used internally)
wt$eij is a matrix of unit vectors directed along contacts (used internally)
wt$kmat is the network’s matrix (also called the Hessian, Laplacian, or Kirchhoff matrix)
wt$nma has various properties obtained from a so called “normal-mode analysis”: mode index, eigenvalues (evalue), matrix of eigenvectors (umat), and the ENM’s variance covariance matrix (cmat).

Single-site mutation-response scanning

There’re two methods of scanning mutations to obtain mutation-response matrices, an analytical method, amrs(), and a simulation-based method smrs. The analytical method uses a theoretical formual to calculate the response averaged over mutations. The simulation method introduces a given number of mutations, calculates the responses, then averages.

# Calculate analytical mutation-response matrix for deformations (response = "dr2")

dr2ij_analytical <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "site", response = "dr2")

# Calculate simulation mutation-response matrix for deformations (response = "dr2")

dr2ij_simulation <- smrs(wt, nmut = 10, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "site", response = "dr2", seed = 1024)

# Plot both matrices side by side

p_analytical <- plot_matrix(log10(dr2ij_analytical), value_name = "log10(dr2)") + 
  ggtitle("analytical")

p_simulation <- plot_matrix(log10(dr2ij_simulation), value_name = "log10(dr2)") + 
  ggtitle("simulation")

p_analytical + p_simulation

Three responses: structure, energy, and force

There are three possible responses that can be calculated:

response = "dr2" (structural deformations): Matrix element m[i,j] represents the square distance between site i of the wild-type protein and site i of the mutant, averaged over mutations at site j.
response = "de2" (deformation energy): m[i,j] is the mechanical energy needed to movoe site i back to its unperturbed position, averaged over mutations at site j).
response = "df2" (force): m[i,j] is the square-length of the mechanical force vector acting on site i, averaged over mutations at site j).

# Calculate analytical mutation-response matrix for deformations (response = "dr2")

dr2ij <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "site", response = "dr2")
de2ij <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "site", response = "de2")
df2ij <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "site", response = "df2")

# Plot matrices 

p_dr2 <- plot_matrix(log10(dr2ij), value_name = "log10(dr2)") +  ggtitle("structure") +
  theme(legend.position = "none")
p_de2 <- plot_matrix(log10(de2ij), value_name = "log10(de2)") +  ggtitle("energy") +
  theme(legend.position = "none")
p_df2 <- plot_matrix(log10(df2ij), value_name = "log10(df2)") +  ggtitle("force") +
  theme(legend.position = "none")

p_dr2 + p_de2 + p_df2

Response in normal-mode coordinates

Above, each row represents a site. It is also possible to calculate responses (structure, energy, or force), in the direction of the normal modes, using option = "mode":

# Calculate analytical mutation-response matrix for deformations (response = "dr2")

dr2ij <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mode", response = "dr2")
de2ij <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mode", response = "de2")
df2ij <- amrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mode", response = "df2")

# Plot matrices 

p_dr2 <- plot_matrix(log10(dr2ij), row_name = "mode", value_name = "log10(dr2)") +  
  ggtitle("structure") + theme(legend.position = "none")

p_de2 <- plot_matrix(log10(de2ij), row_name = "mode", value_name = "log10(de2)") + 
  ggtitle("energy") + theme(legend.position = "none")
p_df2 <- plot_matrix(log10(df2ij), row_name = "mode", value_name = "log10(df2)") +  
  ggtitle("force") + theme(legend.position = "none")

p_dr2 + p_de2 + p_df2

Double-site mutation-response scanning

Double mutation-response scanning produces a so-called “compensation matrix”. This matrix can be obtained using admrs() (analytical method), or sdmrs() (simulation method). An element d[i,j] of the matrix obtained represents the maximum compensation due to mutations at j of an original mutation at i. If option = "mean_max"is used, the average over i is returned. If option =max_max, then the maximum over mutations ati` is returned.

# Calculate analytical double mutation-response matrix for deformations (response = "dr2")

D_analytical <- admrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mean_max", response = "dr2")

# Calculate simulation mutation-response matrix for deformations (response = "dr2")

D_simulation <- sdmrs(wt, nmut = 5, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mean_max", response = "dr2", seed = 1024)

# Plot both matrices side by side

p_analytical <- plot_matrix(log10(D_analytical), value_name = "log10(Dij)") + 
  ggtitle("analytical")

p_simulation <- plot_matrix(log10(D_simulation), value_name = "log10(Dij)") + 
  ggtitle("simulation")

p_analytical + p_simulation

### Compensation matrices averaged or maximized over initial mutation

Site j (columns of the compensaton matrix) are candidate “rescue” sites, therefore the compensation matrix looks for the maximum over possible mutations. Site i is the originally mutated site. To eliminate the specific mutation at i, there are two ways: maximize (i.e. pick the mutation at i that is most easily compensated by a mutation at j) or average. These two options are implemented:

# Calculate analytical mutation-response matrix for deformations (response = "dr2")

Dij_mean_max <- admrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mean_max", response = "dr2")
Dij_max_max <- admrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "max_max", response = "dr2")

# Plot matrices 

p_mean_max <- plot_matrix(log10(Dij_mean_max), value_name = "log10(Dij)") +  ggtitle("mean_max") 
p_max_max <- plot_matrix(log10(Dij_max_max), value_name = "log10(Dij)") +  ggtitle("max_max") 

p_mean_max + p_max_max

Compensation matrices for structure, energy, and force

As above, three compensation matrices can be calculated.

# Calculate analytical mutation-response matrix for deformations (response = "dr2")

dr2ij <- admrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mean_max", response = "dr2")
de2ij <- admrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mean_max", response = "de2")
df2ij <- admrs(wt, mut_dl_sigma = 0.3, mut_sd_min = 1, option = "mean_max", response = "df2")

# Plot matrices 

p_dr2 <- plot_matrix(log10(dr2ij), value_name = "log10(dr2)") +  ggtitle("structure") +
  theme(legend.position = "none")
p_de2 <- plot_matrix(log10(de2ij), value_name = "log10(de2)") +  ggtitle("energy") +
  theme(legend.position = "none")
p_df2 <- plot_matrix(log10(df2ij), value_name = "log10(df2)") +  ggtitle("force") +
  theme(legend.position = "none")

p_dr2 + p_de2 + p_df2

Name		Name	Last commit message	Last commit date
Latest commit History 143 Commits
R		R
Rmd		Rmd
data-raw		data-raw
docs		docs
man		man
tests		tests
to_do		to_do
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
NEWS.md		NEWS.md
README.Rmd		README.Rmd
README.md		README.md
penm.Rproj		penm.Rproj
penm_0.2.0.9000.pdf		penm_0.2.0.9000.pdf
penm_0.2.0.pdf		penm_0.2.0.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

penm package

Overview

Installation

Getting started

Set up the ENM for a protein

Single-site mutation-response scanning

Three responses: structure, energy, and force

Response in normal-mode coordinates

Double-site mutation-response scanning

Compensation matrices for structure, energy, and force

About

Licenses found

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

penm package

Overview

Installation

Getting started

Set up the ENM for a protein

Single-site mutation-response scanning

Three responses: structure, energy, and force

Response in normal-mode coordinates

Double-site mutation-response scanning

Compensation matrices for structure, energy, and force

About

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages