
tbata/PLS_DFE


Set up for Regression model

Goals & Design

We want to check how sensitive our conclusions are to both the type of distribution of fitness effects (DFE) estimates used as input for the response variables (the proportions of strongly versus mildly deleterious mutations) and, possibly, the assumptions underlying the regression. In particular, we want to rerun the same analysis with DFEs estimated under different parametrizations, to check the robustness of our conclusions on how DFE summaries covary with Ne and generation time:

  • So-called $\Gamma$ or $\Gamma$ + exponential parametrizations.

  • Parametrizations that simply assign a probability to a mutation falling within each of a set of ranges (bins) of $N_e s$ values (e.g. $[-\infty, -10]$, $[-1, 0]$, etc.).
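To compare the two families, a continuous DFE can be summarised as probability mass in the same kind of $N_e s$ bins. The following is a minimal sketch in base R, assuming deleterious effects follow a Gamma distribution on $|N_e s|$. The helper name `gamma_dfe_bins`, the shape, the mean, and the bin edges are illustrative assumptions, not estimates or settings from this repository.

```r
# Hypothetical helper: probability mass of a Gamma DFE in bins of N_e * s.
# shape, mean_S and edges are illustrative values, not project estimates.
gamma_dfe_bins <- function(shape, mean_S, edges = c(0, 1, 10, Inf)) {
  # S = |N_e * s| ~ Gamma(shape, scale), with mean = shape * scale
  scale <- mean_S / shape
  # Mass between consecutive edges of |N_e * s|
  mass <- diff(pgamma(edges, shape = shape, scale = scale))
  # Label each bin on the (negative) N_e * s scale
  names(mass) <- sprintf("[%s, %s]", -edges[-1], -edges[-length(edges)])
  mass
}

bins <- gamma_dfe_bins(shape = 0.3, mean_S = 500)
print(round(bins, 3))
```

The bin masses sum to one by construction, so proportions of mildly and strongly deleterious mutations obtained this way can be compared with those from a binned parametrization.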

We use a single template Rmarkdown file, Report.Rmd, that generates the reports, and a rendering R script that controls the parameters, so that as few decisions as possible are hard-coded without overly complicating the code:

  • Report.Rmd contains the template for the statistics and visuals. It expects a tree file for the underlying phylogeny and a so-called regression (csv) file that stores the DFE parameters along with the Ne and generation time estimates for each species.

Note that the lists of species names in the tree file and the DFE files need not match perfectly (the lists are intersected). The same goes for generation time estimates (these may not always be available, in which case the corresponding species are dropped from the analysis).
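The matching logic described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used in Report.Rmd: the species labels, the column names `species`, `Ne` and `gen_time`, and all values are made-up toy inputs.

```r
# Toy inputs; labels, column names and values are made up for illustration.
tree_tips <- c("sp_A", "sp_B", "sp_C", "sp_D")
reg <- data.frame(
  species  = c("sp_A", "sp_B", "sp_D", "sp_E"),
  Ne       = c(1, 2, 3, 4),
  gen_time = c(10, 20, NA, 5),
  stringsAsFactors = FALSE
)

# Keep only species present in both the tree and the regression table.
shared <- intersect(tree_tips, reg$species)
reg <- reg[reg$species %in% shared, ]

# Drop species without a generation time estimate.
reg <- reg[!is.na(reg$gen_time), ]
kept <- reg$species

# With a real phylogeny, the tree would then be pruned to `kept`,
# e.g. with ape::keep.tip(tree, kept) (assumes the ape package).
print(kept)
```

Here `sp_C` and `sp_E` are dropped because they appear in only one of the two inputs, and `sp_D` is dropped because its generation time is missing.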

The default paths to these two files are stored as parameters in the YAML header of Report.Rmd:

---
params:
  csv_file: "scratch/reg_vars.csv"  # default can be changed with render; 
  mytree_file: "scratch/science.abn7829_data_s4.nex.tree" # default;
  • render_reports_regression_models.R contains the list of csv files (one file per fastDFE inference) and will generate standardized reports, one for each DFE summary input.
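A rendering loop of this kind could look roughly like the sketch below. It is a hypothetical illustration, not the contents of render_reports_regression_models.R: the helper `build_render_jobs`, the output naming scheme, and the single csv path passed in are assumptions. The only external call is `rmarkdown::render()`, used with its `input`, `params`, `output_file` and `output_dir` arguments.

```r
# Hypothetical sketch of a rendering loop; not the repository's actual script.
build_render_jobs <- function(csv_files,
                              tree_file = "scratch/science.abn7829_data_s4.nex.tree",
                              out_dir = "reports_renderized") {
  lapply(csv_files, function(f) {
    list(
      input       = "Report.Rmd",
      # Override the defaults declared in the YAML header of the template.
      params      = list(csv_file = f, mytree_file = tree_file),
      # One report per csv file; the extension is left to the output format.
      output_file = paste0("Report_", tools::file_path_sans_ext(basename(f))),
      output_dir  = out_dir
    )
  })
}

jobs <- build_render_jobs(c("scratch/reg_vars.csv"))

# Render only when rmarkdown and the template are actually available.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists("Report.Rmd")) {
  for (job in jobs) do.call(rmarkdown::render, job)
}
```

Building the argument lists first keeps the list of inputs in one place and makes it easy to inspect what will be rendered before running anything.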

How to run the analysis.

We assume:

  • Input files (csv_file and mytree_file) are in the scratch/ subfolder.

  • All reports will be placed in the /reports_renderized subfolder.

In the R console, type:
source("render_reports_regression_models.R")

History

Started on Feb 14, 2026 by TB

About

Phylogenetic least squares models exploring the effects of effective population size (Ne) and generation time on variation in the DFE across primate species
