Phenotype Simulator
Our software package also includes a command-line simulator that generates phenotypes with a wide range of genetic architectures. In brief, the simulator assumes a linear-additive model with four contributions: the effect of a randomly selected (causal) genetic region for the set component, polygenic background effects from all remaining genome-wide variants, a contribution from unmeasured factors, and i.i.d. observation noise. For a detailed description of the simulation procedure, we refer to the Supplementary Methods.
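To make the four contributions concrete, the following is a minimal NumPy sketch of a linear-additive simulation. It is an illustration only and not the simulator's actual procedure (see the Supplementary Methods for that). The function name `simulate_phenotypes`, its arguments, the random genotype matrices, and the choice to normalize the total variance to 1 are all assumptions made for this example.

```python
import numpy as np

def simulate_phenotypes(G_region, G_background, n_traits=4, v_region=0.05,
                        v_bg=0.4, p_hidden=0.6, n_hidden=5, seed=0):
    """Toy linear-additive simulation (hypothetical helper, not part of mtSet).

    Returns an (n_samples, n_traits) array whose variance is roughly 1 per trait.
    """
    rng = np.random.default_rng(seed)
    n_samples = G_region.shape[0]

    def scaled(component, variance):
        # Center each trait column and rescale it to the requested variance.
        component = component - component.mean(axis=0)
        return component * np.sqrt(variance / component.var(axis=0))

    # Causal-region term: region genotypes times random effect sizes.
    region = G_region @ rng.standard_normal((G_region.shape[1], n_traits))
    # Polygenic background: random effects from all remaining variants.
    background = G_background @ rng.standard_normal((G_background.shape[1], n_traits))
    # Unmeasured factors: a few hidden covariates with trait-specific weights.
    hidden = rng.standard_normal((n_samples, n_hidden)) @ rng.standard_normal((n_hidden, n_traits))
    # i.i.d. observation noise.
    noise = rng.standard_normal((n_samples, n_traits))

    # Assumption: whatever the genetic terms do not explain is residual variance.
    v_res = 1.0 - v_region - v_bg
    return (scaled(region, v_region) + scaled(background, v_bg)
            + scaled(hidden, p_hidden * v_res)
            + scaled(noise, (1.0 - p_hidden) * v_res))

# Random stand-in genotypes coded 0/1/2 (500 samples), not real data.
rng = np.random.default_rng(1)
G_region = rng.integers(0, 3, size=(500, 10)).astype(float)
G_background = rng.integers(0, 3, size=(500, 200)).astype(float)
Y = simulate_phenotypes(G_region, G_background)
print(Y.shape)  # (500, 4)
```

Each component is rescaled separately, so the per-trait variance is only approximately 1: the components are independent in expectation but show small sample correlations.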
The simulator requires as input the genotypes and the relatedness component:
./mtSet_simPheno --bfile bfile --cfile cfile --pfile pfile
where
- bfile is the name of the binary bed file (bfile.bed, bfile.bim, and bfile.fam are required).
- cfile is the name of the covariance matrix file (cfile.cov and cfile.cov.id are required). If none is specified, the covariance matrix is expected to be in the current folder, with the same filename as the bed file.
- pfile is the name of the output file (pfile.phe, pfile.region). The file pfile.phe contains the phenotypic values (one sample per row, one trait per column). The file pfile.region contains the randomly selected causal region (chromosome, start position, end position). If pfile is not specified, the files are saved in the current folder under an automatically generated filename that contains the bed filename and the values of all simulation parameters.
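As an illustration of the output layout described above (one sample per row, one trait per column), the sketch below reads such a table with the Python standard library. The helper name `read_phe` is hypothetical, and the assumption that the file is whitespace-delimited without a header line is ours; check an actual pfile.phe before relying on it. The demo runs on a small stand-in file rather than on real simulator output.

```python
import os
import tempfile

def read_phe(path):
    """Read a phenotype table: one row per sample, one column per trait.

    Assumes whitespace-delimited numeric values and no header line.
    """
    with open(path) as handle:
        # str.split() with no argument accepts any run of spaces or tabs.
        return [[float(value) for value in line.split()]
                for line in handle if line.strip()]

# Demo on a stand-in file (3 samples, 2 traits), not real simulator output.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.phe")
    with open(demo_path, "w") as handle:
        handle.write("0.12 -1.30\n-0.45 0.88\n1.07 0.02\n")
    phenotypes = read_phe(demo_path)

n_samples, n_traits = len(phenotypes), len(phenotypes[0])
print(n_samples, n_traits)  # 3 2
```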
By changing the following parameters, different genetic architectures can be simulated. In particular, the simulation experiments of our paper can be reproduced.
| Option | Default | Datatype | Explanation |
|---|---|---|---|
| --seed | 0 | int | seed for random number generator |
| --nTraits | 4 | int | number of simulated phenotypes |
| --windowSize | 1.5e4 | int | size of causal region |
| --vTotR | 0.05 | float | variance explained by the causal region |
| --nCausalR | 10 | int | number of causal variants in the region |
| --pCommonR | 0.8 | float | fraction of causal variants in the region that are shared across traits |
| --vTotBg | 0.4 | float | variance explained by the polygenic background effects |
| --pHidden | 0.6 | float | fraction of the residual variance explained by hidden confounders |
| --pCommon | 0.8 | float | fraction of the background and residual signal that is shared across traits |
| --chrom | None | int | specifies the chromosome of the causal region |
| --minPos | None | int | specifies the min. chromosomal position of the causal region (in basepairs) |
| --maxPos | None | int | specifies the max. chromosomal position of the causal region (in basepairs) |
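The sketch below shows how the default variance parameters in the table might combine, and how a call with non-default options could be assembled. Two assumptions are made here and are not confirmed by this page: that the total phenotypic variance is normalized to 1, so the residual is what remains after `--vTotR` and `--vTotBg`, and that every option is passed as `--name value`, following the usage line above. The helpers `variance_budget` and `build_command` are hypothetical and only construct values; nothing is executed.

```python
def variance_budget(v_tot_r=0.05, v_tot_bg=0.4, p_hidden=0.6):
    """Split a total variance of 1 into four parts (assumed normalization)."""
    v_res = 1.0 - v_tot_r - v_tot_bg  # residual left after the genetic terms
    return {
        "region": v_tot_r,
        "background": v_tot_bg,
        "hidden": p_hidden * v_res,         # hidden confounders
        "noise": (1.0 - p_hidden) * v_res,  # i.i.d. observation noise
    }

def build_command(bfile, cfile, pfile, **options):
    """Assemble an argument list, assuming each option is passed as --name value."""
    command = ["./mtSet_simPheno", "--bfile", bfile, "--cfile", cfile, "--pfile", pfile]
    for name, value in options.items():
        command += ["--" + name, str(value)]
    return command

budget = variance_budget()
for part, value in budget.items():
    print(part, round(value, 3))

command = build_command("bfile", "cfile", "pfile", seed=1, nTraits=4, vTotR=0.05)
print(" ".join(command))
```

Under these assumptions, the defaults leave a residual of about 0.55, of which roughly 0.33 goes to hidden confounders and roughly 0.22 to noise.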
- [Installation Instructions](https://github.com/PMBio/mtSet/wiki/Installation-Instructions)
- Example Usage
- Tutorial
- Contact