Clone the repository:
git clone https://github.com/PsychoinformaticsLab/neurostuff.git
Then cd into the created folder:
cd neurostuff
Open VSCode:
code .
Click on the green section in the lower left-hand corner of the VSCode bottom banner.

That will open a menu where you will select
Remote-Containers: Add Development Container Configuration Files...

which progresses the menu to the next choice of which file
to use to build the container.
You will select From 'docker-compose.yml'.

The next menu will ask which service you wish to create;
select neurostuff.

Following those choices should result in a folder named
.devcontainer with two files:
devcontainer.json
docker-compose.yml
There are several edits needed to set up devcontainer.json
for the neurostuff repository.
The first field to edit in devcontainer.json is dockerComposeFile:

Within neurostuff, there is an additional docker-compose.dev.yml
file to add.

The second field to edit in devcontainer.json is workspaceFolder,
which by default is /workspace.

Change /workspace to /neurostuff

Finally, the third field we want to edit is extensions.
neurostuff is written in python, so we want the python extension to
be installed with vscode on this remote container.
Add ms-python.python to extensions.
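
As a rough sketch, the three edits above can be summarized as a Python dict that is dumped to JSON. The starting values, the relative paths in dockerComposeFile, and the "service" key are assumptions about what the generated file contains, so check them against your own devcontainer.json.

```python
import json

# Hypothetical starting point; the generated file may contain other keys.
devcontainer = {
    "dockerComposeFile": ["../docker-compose.yml"],  # assumed relative path
    "service": "neurostuff",
    "workspaceFolder": "/workspace",
    "extensions": [],
}

# 1. add the additional compose file used for development (assumed path)
devcontainer["dockerComposeFile"].append("../docker-compose.dev.yml")
# 2. point the workspace at /neurostuff instead of /workspace
devcontainer["workspaceFolder"] = "/neurostuff"
# 3. install the python extension inside the container
devcontainer["extensions"].append("ms-python.python")

print(json.dumps(devcontainer, indent=4))
```

The printed JSON is only meant for comparison with the file VSCode generated, not as a replacement for it.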

Within docker-compose.yml we will change the
volumes mount from /workspace to /neurostuff

Click on the left corner in the green section of the bottom banner again.

Select Remote-Containers: Reopen in Container

Once the images are built and the containers
are created, you should be working from within
the neurostuff container, yay!

Press Ctrl+Shift+P and type into the
menu bar: Python: Configure Tests

neurostuff uses pytest so we will select pytest as our test framework.

When the menu progresses, select neurostuff
as the base directory.

The menu should close and VSCode will find
all tests written for neurostuff and you
will see a new icon that looks like a beaker.

To try debugging a test, create a breakpoint in one of the test files.

Then click on Debug Test above the test function.

Clicking Debug Test should run the test until it reaches the breakpoint, at which point
you have total control to inspect variables and test your understanding.
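
If you want something small to practice on first, a test file along these lines would work. This is a made-up example, not a test from neurostuff; add_offsets and test_add_offsets are hypothetical names.

```python
def add_offsets(values, offset):
    """Hypothetical function under test."""
    shifted = [v + offset for v in values]  # set a breakpoint on this line
    return shifted


def test_add_offsets():
    # pytest collects functions whose names start with "test_"
    result = add_offsets([1, 2, 3], offset=10)
    assert result == [11, 12, 13]
```

With the breakpoint set, Debug Test should pause inside add_offsets, where values, offset, and shifted can be inspected.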

To test fmriprep, we use test data modified from openneuro datasets. We
will download the data in a downloads folder at the project root:
$ mkdir -p ./downloads && cd ./downloads
Then using a script similar to the one below, we download all necessary data into the downloads folder:
mkdir -p data/reports
# regression data for pytest
if [[ ! -d data/fmriprep_bold_truncated ]]; then
wget --retry-connrefused --waitretry=5 --read-timeout=20 --timeout=15 -t 0 -q \
-O fmriprep_bold_truncated.tar.gz "https://osf.io/286yr/download"
tar xvzf fmriprep_bold_truncated.tar.gz -C data
else
echo "Truncated BOLD series were already downloaded"
fi
if [[ ! -d data/fmriprep_bold_mask ]]; then
wget --retry-connrefused --waitretry=5 --read-timeout=20 --timeout=15 -t 0 -q \
-O fmriprep_bold_mask.tar.gz "https://osf.io/s4f7b/download"
tar xvzf fmriprep_bold_mask.tar.gz -C data
else
echo "Pre-computed masks were already downloaded"
fi
# data for test fmriprep runs
if [[ ! -d data/ds005 ]]; then
wget --retry-connrefused --waitretry=5 --read-timeout=20 --timeout=15 -t 0 -q \
-O ds005_downsampled.tar.gz "https://files.osf.io/v1/resources/fvuh8/providers/osfstorage/57f32a429ad5a101f977eb75"
tar xvzf ds005_downsampled.tar.gz -C data
else
echo "Dataset ds000005 was already downloaded"
fi
if [[ ! -d data/ds054 ]]; then
wget --retry-connrefused --waitretry=5 --read-timeout=20 --timeout=15 -t 0 -q \
-O ds054_downsampled.tar.gz "https://files.osf.io/v1/resources/fvuh8/providers/osfstorage/57f32c22594d9001ef91bf9e"
tar xvzf ds054_downsampled.tar.gz -C data
else
echo "Dataset ds000054 was already downloaded"
fi
if [[ ! -d data/ds210 ]]; then
wget --retry-connrefused --waitretry=5 --read-timeout=20 --timeout=15 -t 0 -q \
-O ds210_downsampled.tar.gz "https://files.osf.io/v1/resources/fvuh8/providers/osfstorage/5ae9e37b9a64d7000ce66c21"
tar xvzf ds210_downsampled.tar.gz -C data
else
echo "Dataset ds000210 was already downloaded"
fi
if [[ ! -d ds005/derivatives/freesurfer ]]; then
mkdir -p ds005/derivatives
wget --retry-connrefused --waitretry=5 --read-timeout=20 --timeout=15 -t 0 -q \
-O ds005_derivatives_freesurfer.tar.gz "https://files.osf.io/v1/resources/fvuh8/providers/osfstorage/58fe59eb594d900250960180"
tar xvzf ds005_derivatives_freesurfer.tar.gz -C ds005/derivatives
else
echo "FreeSurfer derivatives of ds000005 were already downloaded"
fi
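
Every block in the script above follows the same pattern: skip the download if the target directory exists, otherwise fetch the archive and unpack it. A minimal Python sketch of that pattern is below. The function name fetch_if_missing and its download parameter are my own, and it does not reproduce the retry behavior of the wget flags.

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch_if_missing(target_dir, url, archive, extract_to,
                     download=urllib.request.urlretrieve):
    """Download and unpack `archive` unless `target_dir` already exists.

    Returns True when a download happened, False when it was skipped.
    """
    target_dir = Path(target_dir)
    if target_dir.is_dir():
        print(f"{target_dir} was already downloaded")
        return False
    Path(extract_to).mkdir(parents=True, exist_ok=True)
    download(url, archive)  # no retry logic here, unlike the wget call
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(extract_to)
    return True
```

For example, fetch_if_missing("data/ds005", url, "ds005_downsampled.tar.gz", "data") would mirror the ds005 block, with url set to the address used in the script.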
code in the terminal to open VS Code.
You can see the documentation for installing VS Code extensions
.devcontainer.json specifies how we build our development environment.
The file lives in the root (i.e. top) directory of the fmriprep
repository at the same level as the Dockerfile.
We are following the directions given by VS Code to create a
development environment with a Dockerfile.
To create the .devcontainer.json we will open VS Code in the root of
the fmriprep project:
$ cd $HOME/projects/fmriprep
$ code .
Once VS Code is open, we will press Ctrl+Shift+P on the keyboard and
type Remote-Containers: Create Container Configuration File. Selecting
that command will create a .devcontainer.json for us, but we will
change the json file to meet our needs.
The contents of .devcontainer.json will look like the following:
// See https://aka.ms/vscode-remote/devcontainer.json for format details.
{
"name": "fmriprep_dev",
"image": "fmriprep:dev",
"dockerFile": "Dockerfile",
"workspaceMount": "src=${env:PWD},dst=/src/fmriprep,type=bind",
"workspaceFolder": "/src/fmriprep",
"extensions": [
"ms-python.python",
"visualstudioexptteam.vscodeintellicode"
],
"runArgs": ["--entrypoint", "",
"-v", "${env:PWD}/downloads:/tmp",
"-e", "FMRIPREP_REGRESSION_SOURCE=/tmp/data/fmriprep_bold_truncated",
"-e", "FMRIPREP_REGRESSION_TARGETS=/tmp/data/fmriprep_bold_mask",
"-e", "FMRIPREP_REGRESSION_REPORTS=/tmp/data/reports",
"-e", "FS_LICENSE=/tmp/license.txt",
"-e", "FMRIPREP_DEV=1"],
"postCreateCommand": "pip uninstall -y fmriprep && python setup.py develop && conda install -y flake8 && cd /tmp && echo 'cHJpbnRmICJrcnp5c3p0b2YuZ29yZ29sZXdza2lAZ21haWwuY29tXG41MTcyXG4gKkN2dW12RVYzelRmZ1xuRlM1Si8yYzFhZ2c0RVxuIiA+IGxpY2Vuc2UudHh0Cg==' | base64 -d | sh"
}
The keys are documented so we will not re-hash them here, but we will explain the motivation for the values referenced by some of the keys.
workspaceMount to bind our local fmriprep
repository to the container.
workspaceFolder to open VS Code in the
correct directory where our local fmriprep repository is now bound.
ms-python.python and visualstudioexptteam.vscodeintellicode.
ms-python.python helps with debugging, linting, intellisense,
etc. for python.
visualstudioexptteam.vscodeintellicode helps with code
completion based on common code patterns.
runArgs removes the entrypoint for the container (it was set
to only run fmriprep), mounts data downloaded from step 1, and sets
environment variables for the container to know where the test data
should be and that we are in a testing environment.
The postCreateCommand performs three miscellaneous tasks:
/src/fmriprep make a difference in the call to fmriprep
(as opposed to patching as we've seen in the above
sections).
/tmp, which is
necessary to run fmriprep with freesurfer.
Press Ctrl+Shift+P on the keyboard and type/select
Remote-Containers: Open Folder in Container. This should open a folder
browser. Navigate to your fmriprep project folder and select open. The
build process should begin.
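
Once the container is running, one way to confirm that runArgs took effect is to check the environment variables from a Python session inside the container. This is only a sketch; the variable names are copied from the .devcontainer.json above, and missing_variables is a hypothetical helper.

```python
import os

# Names taken from the "-e" entries in runArgs
EXPECTED = (
    "FMRIPREP_REGRESSION_SOURCE",
    "FMRIPREP_REGRESSION_TARGETS",
    "FMRIPREP_REGRESSION_REPORTS",
    "FS_LICENSE",
    "FMRIPREP_DEV",
)


def missing_variables(environ=os.environ):
    """Return the expected variable names that are not set in `environ`."""
    return [name for name in EXPECTED if name not in environ]


if __name__ == "__main__":
    missing = missing_variables()
    print("all variables set" if not missing else f"missing: {missing}")
```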
.vscode/launch.json is a file that helps VS Code run debugging
sessions. Go to the debug view in the activity bar on
the side of VS Code. From the debug view, select the configure gear
icon on the Debug view top bar.
.vscode/launch.json should now exist and have a couple of default
entries. We will remove those entries and replace them with the
following:
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "python: ds005-anat",
"type": "python",
"request": "launch",
"program": "/usr/local/miniconda/bin/fmriprep",
"args": [
"-w", "/tmp/ds005/work",
"/tmp/data/ds005",
"/tmp/ds005/derivatives",
"participant",
"--skull-strip-template", "OASIS30ANTs:res-1",
"--output-spaces", "MNI152NLin2009cAsym", "MNI152NLin6Asym",
"--sloppy", "--write-graph",
"--anat-only", "-vv", "--notrack"
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "python: ds005-full",
"type": "python",
"request": "launch",
"program": "/usr/local/miniconda/bin/fmriprep",
"args": [
"-w", "/tmp/ds005/work",
"/tmp/data/ds005",
"/tmp/ds005/derivatives",
"participant",
"--sloppy", "--write-graph",
"--use-aroma",
"--skull-strip-template", "OASIS30ANTs:res-1",
"--output-space", "T1w", "template", "fsaverage5", "fsnative",
"--template-resampling-grid", "native",
"--use-plugin", "/src/fmriprep/.circleci/legacy.yml",
"--cifti-output", "-vv", "--notrack"
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "python: ds054",
"type": "python",
"request": "launch",
"program": "/usr/local/miniconda/bin/fmriprep",
"args": [
"-w", "/tmp/ds054/work",
"/tmp/data/ds054",
"/tmp/ds054/derivatives",
"participant",
"--fs-no-reconall", "--sloppy",
"--output-spaces", "MNI152NLin2009cAsym:res-2", "anat", "func",
"-vv",
"--notrack"
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "python: ds210-anat",
"type": "python",
"request": "launch",
"program": "/usr/local/miniconda/bin/fmriprep",
"args": [
"-w", "/tmp/ds210/work",
"/tmp/data/ds210",
"/tmp/ds210/derivatives",
"participant",
"--fs-no-reconall", "--sloppy", "--write-graph",
"--anat-only", "-vv", "--notrack"
],
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "python: ds210-full",
"type": "python",
"request": "launch",
"program": "/usr/local/miniconda/bin/fmriprep",
"args": [
"-w", "/tmp/ds210/work",
"/tmp/data/ds210",
"/tmp/ds210/derivatives",
"participant",
"--t2s-coreg", "--use-syn-sdc",
"--template-resampling-grid", "native",
"--dummy-scans", "1",
"--fs-no-reconall", "--sloppy", "--write-graph",
"--anat-only", "-vv", "--notrack"
],
"console": "integratedTerminal",
"justMyCode": false
}
]
}
After adding those entries, you should be able to hit the green arrow and debug any changes you made to fmriprep.
You can edit this file to test on your own data or some other configuration. Please see python debugging in VS Code to learn more about the configurations.
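
If you plan to add several entries for your own data, a small helper can keep them consistent. The sketch below builds one entry in the same shape as the configurations above; make_launch_entry and the /tmp/data/mydata paths are hypothetical.

```python
import json


def make_launch_entry(name, bids_dir, out_dir, work_dir, extra_args=()):
    """Build one launch.json configuration shaped like the entries above."""
    return {
        "name": name,
        "type": "python",
        "request": "launch",
        "program": "/usr/local/miniconda/bin/fmriprep",
        "args": ["-w", work_dir, bids_dir, out_dir, "participant", *extra_args],
        "console": "integratedTerminal",
        "justMyCode": False,
    }


# Hypothetical dataset mounted under /tmp/data/mydata
entry = make_launch_entry(
    "python: mydata-anat",
    bids_dir="/tmp/data/mydata",
    out_dir="/tmp/mydata/derivatives",
    work_dir="/tmp/mydata/work",
    extra_args=["--sloppy", "--anat-only", "-vv", "--notrack"],
)
print(json.dumps(entry, indent=4))
```

The printed block can be pasted into the "configurations" list of .vscode/launch.json.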
In addition to debugging, you can also interactively run pytest. Please see the VS Code directions to get the testing framework set up.
Create a minimal dataset to test the code
Learn to use a debugger
Step through the code line by line
So in order to be maximally effective in my graduate work (and use the right model for the job), I should at least have a passing knowledge of R in addition to python. I am using this blog format to chronicle my adventures and missteps while learning R.
This is the assumption I’ve operated under with python. When I make a list and print out the results, I get something like this:
list((1, 2, 3, 4, 5))
[1, 2, 3, 4, 5]
I explicitly see the structure of the list, and the square brackets tell me this is a list as opposed to parens, which would indicate a tuple.
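
A quick way to see this for yourself (the variable names are just for illustration):

```python
numbers_list = [1, 2, 3, 4, 5]
numbers_tuple = (1, 2, 3, 4, 5)

print(numbers_list)         # [1, 2, 3, 4, 5]
print(numbers_tuple)        # (1, 2, 3, 4, 5)
print(type(numbers_list))   # <class 'list'>
print(type(numbers_tuple))  # <class 'tuple'>
```

The printed form is nearly the same as the code that created the object, which is why the structure feels explicit.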
However in R, I may get something like:
c(1, 2, 3, 4, 5)
[1] 1 2 3 4 5
It took me a little while to get comfortable that the prepended [1] is a convenience showing the index of the first element printed on that line.
R gets more confusing when trying to show a more complex object.
However, I’m beginning to appreciate R as an interactive programming language, and the seemingly strange way to print data structures is great for the interactive user who does not need to concern themselves with the underlying data structures R is using.
But if you are interested in the structure, then you should use the str() function, which brings the output closer to what I’m used to seeing in python and helps me understand the data types I use in R.
The next post will probably be about non-standard evaluation (this still blows my mind)
I still think the best/simplest scenario is to use psychopy for stimulus presentation and use other python utilities to generate a figure from the data.
Alas, we are stuck with eprime and I am struck with inspiration to make things much more complicated to practice using utilities that are not directly designed for this use-case.
Before I dive in, here is a list of tools/services/utilities I will be using to set up the quality assurance service. They each link to a tutorial/explanation.
We’ve already written/borrowed code to generate an svg file from the output of an
eprime task, so I will not be covering that.
We will assume there is a script that generates some form of figure
output (in a BIDS organized fashion).
You can look at the end of the guide for what the code looks like for an example script.
Sign up for circleci and connect your github account.
Our QA code for this example is written in python, and currently a good way to share the environment necessary to run/reproduce the code is anaconda.
If you developed the qa code while working in a conda environment, great! Otherwise you will create a conda environment with:
conda create -n eprime_convert python=3.6
where eprime_convert is the name of the environment (you can make this be
anything you want) and python=3.6 is selecting the specific version
of python (we currently use 3.6).
To activate the newly created environment:
source activate eprime_convert
Now you will look at the import statements at the top of your script
and conda install the necessary packages.
from convert_eprime import convert
import pandas as pd
import seaborn as sns
from argparse import ArgumentParser
import os
from matplotlib import pyplot as plt
from glob import glob
import shutil
import re
From this, it appears I need to install: convert_eprime, pandas, seaborn, and matplotlib. All the other imports are from packages built into python, so they are available by default (you will notice which packages are default with practice). My first pass to install everything would be:
conda install convert_eprime pandas seaborn matplotlib
This would install everything if I didn’t include convert_eprime.
convert_eprime is not tracked by anaconda, and isn’t even tracked by pypi.
It’s a pet project from another graduate student that was fed up with e-merge.
To install convert_eprime I need to know how to install a github repo.
Luckily, stackoverflow has an answer for everything.
So the real commands to install everything are:
conda install pandas seaborn matplotlib
pip install git+https://github.com/tsalo/convert-eprime.git
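
If you would rather not sort the imports by eye, the same check can be sketched in Python. This is my own helper (third_party_imports), not part of the QA script, and it relies on sys.stdlib_module_names, which only exists in Python 3.10 and later; the fallback set covers just this example.

```python
import ast
import sys

# sys.stdlib_module_names exists in Python 3.10+; the fallback only covers this example.
STDLIB = getattr(sys, "stdlib_module_names",
                 frozenset({"argparse", "os", "glob", "shutil", "re"}))


def third_party_imports(source):
    """Return top-level imported module names that are not in the standard library."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return sorted(modules - set(STDLIB))


script = """
from convert_eprime import convert
import pandas as pd
import seaborn as sns
from argparse import ArgumentParser
import os
from matplotlib import pyplot as plt
from glob import glob
import shutil
import re
"""
print(third_party_imports(script))
# ['convert_eprime', 'matplotlib', 'pandas', 'seaborn']
```

Keep in mind that the import name is not always the install name, as convert_eprime (installed from the convert-eprime repository) shows.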
Test your script to make sure it works with these installs. If it complains that you are missing something, install it. Now you can export your environment to a file so it can be reproduced.
conda env export > environment.yml
Open up that environment.yml because we need to edit it. It may look something like this:
name: eprime_convert
channels:
- defaults
dependencies:
- blas=1.0=mkl
- ca-certificates=2018.03.07=0
- certifi=2018.10.15=py36_0
- cycler=0.10.0=py36_0
- dbus=1.13.2=h714fa37_1
- expat=2.2.6=he6710b0_0
- fontconfig=2.13.0=h9420a91_0
- freetype=2.9.1=h8a8886c_1
- glib=2.56.2=hd408876_0
- gst-plugins-base=1.14.0=hbbd80ab_1
- gstreamer=1.14.0=hb453b48_1
- icu=58.2=h9c2bf20_1
- intel-openmp=2019.0=118
- jpeg=9b=h024ee3a_2
- kiwisolver=1.0.1=py36hf484d3e_0
- libedit=3.1.20170329=h6b74fdf_2
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=8.2.0=hdf63c60_1
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.35=hbc83047_0
- libstdcxx-ng=8.2.0=hdf63c60_1
- libuuid=1.0.3=h1bed415_2
- libxcb=1.13=h1bed415_1
- libxml2=2.9.8=h26e45fe_1
- matplotlib=3.0.1=py36h5429711_0
- mkl=2019.0=118
- mkl_fft=1.0.6=py36h7dd41cf_0
- mkl_random=1.0.1=py36h4414c95_1
- ncurses=6.1=hf484d3e_0
- numpy=1.15.4=py36h1d66e8a_0
- numpy-base=1.15.4=py36h81de0dd_0
- openssl=1.0.2p=h14c3975_0
- pandas=0.23.4=py36h04863e7_0
- patsy=0.5.1=py36_0
- pcre=8.42=h439df22_0
- pip=18.1=py36_0
- pyparsing=2.3.0=py36_0
- pyqt=5.9.2=py36h05f1152_2
- python=3.6.6=h6e4f718_2
- python-dateutil=2.7.5=py36_0
- pytz=2018.7=py36_0
- qt=5.9.6=h8703b6f_2
- readline=7.0=h7b6447c_5
- scipy=1.1.0=py36hfa4b5c9_1
- seaborn=0.9.0=py36_0
- setuptools=40.5.0=py36_0
- sip=4.19.8=py36hf484d3e_0
- six=1.11.0=py36_1
- sqlite=3.25.2=h7b6447c_0
- statsmodels=0.9.0=py36h035aef0_0
- tk=8.6.8=hbc83047_0
- tornado=5.1.1=py36h7b6447c_0
- wheel=0.32.2=py36_0
- xz=5.2.4=h14c3975_4
- zlib=1.2.11=ha838bed_2
- pip:
  - convert-eprime==0.0.1a0
  - future==0.17.1
prefix: /home/james/.conda/envs/eprime_convert
If we were only going to run this environment on identical (or near identical)
hardware, then this is fine, but if we want a more flexible yml, then we
need to start editing.
A few things to do:
After editing, the file should look something like this:
name: eprime_convert
channels:
- defaults
dependencies:
- matplotlib=3.0.1
- numpy=1.15.4
- pandas=0.23.4
- seaborn=0.9.0
- python=3.6
- pip:
  - git+https://github.com/tsalo/convert-eprime.git
Much cleaner (I kept numpy as its own install just to be explicit, I don’t believe it’s actually necessary to include).
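
Comparing the two files, one of the edits is dropping the build string (the part after the second =) from each dependency. A small sketch of that step is below; strip_build is a hypothetical helper, and the remaining edits (removing packages I did not ask for, the prefix line, and pointing pip at the github repo) were done by hand.

```python
def strip_build(spec):
    """Turn 'name=version=build' into 'name=version'; leave shorter specs alone."""
    parts = spec.split("=")
    return "=".join(parts[:2])


exported = [
    "matplotlib=3.0.1=py36h5429711_0",
    "numpy=1.15.4=py36h1d66e8a_0",
    "pandas=0.23.4=py36h04863e7_0",
    "seaborn=0.9.0=py36_0",
]
for spec in exported:
    print(strip_build(spec))
```

Build strings tie a package to a specific platform build, which is why they get in the way when the environment is recreated elsewhere.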
We have created the yml to basically build the same environment that we want to
use/build our code with.
This will be good for deploying/sharing the code in multiple contexts.
However, we are going to lock down the environment in which the code runs even further using docker.
Basically, we are going to build a docker container that has our conda environment installed on it.
We can do that by making a Dockerfile that could look like this:
# https://medium.com/@chadlagore/conda-environments-with-docker-82cdc9d25754
FROM continuumio/miniconda3:4.5.11
COPY eprime_convert.yml /env/
RUN conda env create -f /env/eprime_convert.yml &&\
conda clean --all
# Pull the environment name out of the yml file so new shells activate it
RUN echo "source activate $(head -1 /env/eprime_convert.yml | cut -d' ' -f2)" > ~/.bashrc
# ENV does not run shell commands, so the environment name is written out here
ENV PATH /opt/conda/envs/eprime_convert/bin:$PATH
ENTRYPOINT [ "/bin/bash", "-c" ]
and we can build the Dockerfile with this command:
docker build -t jdkent/eprime_convert .
The tag is linked to my dockerhub account, so when I push the container to dockerhub it will go to the correct location. I will push the container to dockerhub with the following command:
docker push jdkent/eprime_convert
The container can be seen on dockerhub.
Excellent!
With this in place we can move on to setting up circleci
circleci is an online service that can run arbitrary code whenever something
happens in a github repository.
The vagueness of the description hides the power behind this service.
Essentially, your imagination is the limit for what you can do.
Follow the official circleci docs to add the repository to circleci so that circleci will begin triggering builds when commits appear in that repository.
Inside your git repository add a .circleci folder and make a config.yml
inside that folder; that is what circleci will read.
Here is a full example config.yml for circleci; I will break it down afterwards.
# Python CircleCI 2.0 configuration file
#
# Check https://circleci.com/docs/2.0/language-python/ for more details
#
version: 2
jobs:
  build:
    docker:
      # specify the version you desire here
      - image: jdkent/eprime_convert:latest
    working_directory: ~/repo
    steps:
      - run:
          name: clone github repo
          command: |
            git clone https://${GITHUB_TOKEN}@github.com/HBClab/BetterTaskSwitch.git
      - run:
          name: check if data QA should be skipped
          command: |
            cd ~/repo/BetterTaskSwitch
            if [[ "$( git log --format=oneline -n 1 $CIRCLE_SHA1 | grep -i -E '\[skip[ _]?ci\]' )" != "" ]]; then
              echo "Skipping Data QA"
              circleci step halt
            fi
      - run:
          name: run eprime convert
          command: |
            source activate eprime_convert
            ~/repo/BetterTaskSwitch/code/eprime_convert.py \
              -b ~/repo/BetterTaskSwitch/bids \
              -r ~/repo/BetterTaskSwitch/task-full_resp-srbox \
              -c ~/repo/BetterTaskSwitch/code/config_file/task_switch.json \
              -a mri \
              --sub-prefix GE120
      - run:
          name: add and commit files
          command: |
            cd ~/repo/BetterTaskSwitch
            git config credential.helper 'cache --timeout=120'
            git config user.email "[email protected]"
            git config user.name "QA Bot"
            # Push quietly to prevent showing the token in log
            git add .
            git commit -m "[skip ci] $(date)"
            git push -q https://${GITHUB_TOKEN}@github.com/HBClab/BetterTaskSwitch.git master
version: 2: the overall version of circleci to use; they are deprecating version one, so all configs should use version 2.
jobs: the list of things I want circleci to run.
build:: this provides the option to choose what machinery I want circleci to run on.
docker:: I want to use docker to select the environment my jobs are run in.
- image: jdkent/eprime_convert:latest: this selects the docker image stored on dockerhub that we just made in the last step.
working_directory: ~/repo: where the command-line interface will drop me when I’m running commands in the docker container we selected (I don’t really take advantage of this option).
steps:: the steps we will take to run the job.
- run:: instantiation of a step to take in the job.
name: clone github repo: the name of the step we are taking.
command: |: the actual command we will be running in the docker container (the | (pipe) allows us to type the command on a separate line so the line of code does not look crowded).
- run:
name: check if data QA should be skipped
command: |: this command checks if [skip ci] or [skip_ci] is in the most recent commit message and will stop the circleci build if it is.
- run:
name: run eprime convert
command: |: this command activates the conda environment and runs our data qa script with the appropriate inputs, generating the figure output.
- run:
name: add and commit files
command: |: this command creates a git identity so the bot can push the new data to the github repository (importantly, the commit message contains [skip ci]; what would happen if that wasn’t there?).
One important detail I’ve left out is what’s up with ${GITHUB_TOKEN}.
That is a special variable I’ve defined using circleci’s environment variable settings.
This is great for storing variables that represent some type of authentication (e.g. passwords) that you don’t want everyone to be able to see.
In this instance I’m using a github token.
You can make your own github token by going to your github profile, clicking on settings, clicking on developer settings, and then creating a new token.
See the github announcement about tokens.
Warning: you will only have explicit access to your token when you create it, so make sure you copy the token somewhere safe on your computer.
Once you have circleci set up and the config file inside your repository, you are ready to add the files and push the changes back up to github, and observe your first circleci build. The steps would look something like this:
git add .circleci/config.yml
git commit -m 'add circleci build configuration'
git push origin master
Note: the error I ran into when doing this was incorrect permissions of eprime_convert.py in my repository.
I gave the file executable permissions with the following command:
git update-index --chmod=+x eprime_convert.py
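
Before relying on the skip check in config.yml, it can help to try the pattern against a few commit messages. The sketch below uses the same regular expression as the grep call in Python; should_skip is a hypothetical helper and is not used by the circleci build.

```python
import re

# Same pattern as the grep call: [skip ci], [skipci] or [skip_ci], any case
SKIP_PATTERN = re.compile(r"\[skip[ _]?ci\]", re.IGNORECASE)


def should_skip(commit_message):
    """Return True when the commit message asks CI to skip the QA run."""
    return SKIP_PATTERN.search(commit_message) is not None


print(should_skip("[skip ci] Mon Nov 12 10:00:00 UTC 2018"))  # True
print(should_skip("add new participant data"))                 # False
```

The first message has the same shape as the commits the QA bot makes, so the bot's own pushes do not run the QA step again.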
We have created a reproducible environment and set up circleci to run every time we push a new commit to the repository. The next step is to easily visualize all the figures we have created. We will do this using github-pages.
Follow the github instructions to have github start hosting your repository as a static webpage (using github-pages).
I’m using the minimal theme and I suggest that you use that theme too.
Pull the changes to your repository.
You will have an _config.yml file in your base directory.
Change the file to look something like this:
theme: jekyll-theme-minimal
plugins:
- jekyll-relative-links
title: [BetterTaskSwitch]
description: [Monitoring BetterTaskSwitch Data]
logo: https://avatars0.githubusercontent.com/u/24659915?s=400&u=12a4f626488fe0f692d77f355d9dd9f3e4e63f7a&v=4
baseurl: /BetterTaskSwitch
You will change the title, description, and baseurl to match the repository you are working on. The logo is pointing towards our (HBClab) github logo.
Next we will add liquid syntax to display all the swarmplots that are in our
repository.
You will place this code in your README.md file located at the
base of your repository.
{% assign my_files = site.static_files | where:"extname",".svg" | sort:"modified_time" | reverse %}
{% capture sevendays %}{{'now' | date: "%s" | minus : 604800 }}{% endcapture %}
{% for taskswitch in my_files %}
{% if taskswitch.name contains "swarmplot" %}
{% capture file_mod %}{{taskswitch.modified_time | date: "%s"}}{% endcapture %}
{% if file_mod > sevendays %}
### Recent
{% else %}
### Older
{% endif %}
**{{taskswitch.name}}**
![{{taskswitch.name}}]({{site.baseurl}}{{taskswitch.path}})

{% endif %}
{% endfor %}
Note: This Stack Overflow answer helped me with how to parse and compare dates.
I will explain important bits of this code:
{% assign my_files = site.static_files | where:"extname",".svg" | sort:"modified_time" | reverse %}
This line creates a variable
called my_files that collects all static files
where the extension of the file is .svg.
Next, the resulting array is piped to sort, which orders it
by the date each file was last modified (from oldest -> newest).
Finally, the result is reversed so that the array is sorted from
newest -> oldest.
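
For readers more at home in python than liquid, the same filter-then-sort idea could be sketched like this. It is an analogy, not what jekyll runs; newest_svgs is a hypothetical function that walks a local folder instead of site.static_files.

```python
from pathlib import Path


def newest_svgs(root):
    """Return .svg files under `root`, most recently modified first."""
    svgs = [p for p in Path(root).rglob("*") if p.suffix == ".svg"]
    return sorted(svgs, key=lambda p: p.stat().st_mtime, reverse=True)
```

Here reverse=True plays the role of the reverse filter at the end of the liquid line.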
{% capture sevendays %}{{'now' | date: "%s" | minus : 604800 }}{% endcapture %}
This line creates a variable called sevendays which measures the current
time using seconds %s and then subtracts seven days worth of
seconds (7 * 24 * 60 * 60 = 604800).
This will be used to tell whether an image is seven days old or not.
{% capture file_mod %}{{taskswitch.modified_time | date: "%s"}}{% endcapture %}
This line creates the variable file_mod. file_mod is the date (in seconds) when the file was last modified. This means we can directly compare file_mod and sevendays to test whether the file is older or newer than seven days.
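
The comparison between file_mod and sevendays is the same arithmetic as this small python sketch. SEVEN_DAYS and is_recent are names I made up for the illustration, and the sketch compares numbers, whereas capture in liquid stores its result as a string.

```python
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # 604800 seconds


def is_recent(modified_time, now=None):
    """Return True when `modified_time` (epoch seconds) is within the last seven days."""
    now = time.time() if now is None else now
    return modified_time > now - SEVEN_DAYS
```

A file modified one day ago would count as recent, and a file modified eight days ago would not.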

![{{taskswitch.name}}]({{site.baseurl}}{{taskswitch.path}})
This is the last line I will explain since it may look confusing.
It combines both markdown syntax and liquid syntax.
Here is the markdown portion: ![]().
That markdown syntax displays an inline image.
The double curly brackets are liquid syntax.
These return strings that can be interpreted by markdown.
taskswitch.path is the path to the file relative to the top
directory of the repository (e.g. /some/dir/file.svg).
However, with how github parses the url, we also need to
include the website basename as well, so we prepend the site’s
baseurl.
If you look back, you can see we defined the baseurl variable in
_config.yml.
This is the difference between searching for a file using this
https://hbclab.github.io as our baseurl and this
https://hbclab.github.io/BetterTaskSwitch (we want this one)
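
To make the difference concrete, here is a tiny sketch of how the image address is assembled with and without the baseurl. The image_url function and the file path are hypothetical; only the host and baseurl come from the site described above.

```python
def image_url(host, baseurl, path):
    """Join the site host, the baseurl from _config.yml, and a static file path."""
    return host.rstrip("/") + baseurl + path


# Hypothetical file path inside the repository
path = "/bids/derivatives/sub-01_swarmplot.svg"
print(image_url("https://hbclab.github.io", "", path))
print(image_url("https://hbclab.github.io", "/BetterTaskSwitch", path))
```

Only the second address points inside the BetterTaskSwitch site, which is where the figures are served from.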
Next we want to check to make sure we did everything correctly. We can do this by serving the jekyll website we made locally. Please follow the github instructions to do this.
Once we are satisfied with how the website looks, we can add/commit/push the changes to github.
git add _config.yml Gemfile README.md
git commit -m 'add website functionality'
git push origin master
That’s it! Once you’ve done all that, you can reap the benefits of having an automated system that generates figures and makes them visible via a website.
This code was written to work, not to be beautiful; it may not represent best (or even recommended) practices.
#!/usr/bin/env python
# generate pipelines that read in the eprime txt files and output a
# machine readable summary and a useful figure for quality assurance.
from convert_eprime import convert
import pandas as pd
import numpy as np
from argparse import ArgumentParser
import os
from glob import glob
import shutil
import re
from matplotlib import pyplot as plt
plt.style.use('ggplot')
import seaborn as sns
sns.set_palette("bright")
# expressions
session_dict = {1: 'pre', 2: 'post'}
def get_parser():
    """Build parser object for cmdline processing"""
    parser = ArgumentParser(description='betterVTSM.py: converts '
                                        'eprime output to tsv in BIDS format')
    parser.add_argument('-b', '--bids', action='store',
                        help='root folder of a BIDS valid dataset')
    parser.add_argument('-r', '--raw-dir', action='store',
                        help='directory where edat and txt files live')
    parser.add_argument('-p', '--participant-label', action='store', nargs='+',
                        help='participant label(s) to process')
    # sessions are parsed as integers so they match the keys of session_dict
    parser.add_argument('-s', '--session-label', action='store', nargs='+', type=int,
                        help='session label(s) to process (either 1 or 2)')
    parser.add_argument('-c', '--config', action='store', required=True,
                        help='config file to process the eprime txt. '
                             'see convert_eprime for details')
    parser.add_argument('--sub-prefix', action='store',
                        help='add additional characters to the prefix of the participant label')
    return parser


def copy_eprime_files(src, dest):
    """Copy edat2 and txt files from src to dest and return how many were copied"""
    # collect edat2 and txt files
    patterns = ('*.edat2', '*.txt')
    raw_files = []
    for pattern in patterns:
        raw_files.extend(glob(os.path.join(src, pattern)))
    # copy all files into sourcedata (if not already there)
    copied_files = 0
    for raw_file in raw_files:
        out_file = os.path.join(dest, os.path.basename(raw_file))
        if not os.path.isfile(out_file):
            shutil.copy(raw_file, dest)
            copied_files += 1
    return copied_files


def main():
"""Entry point"""
opts = get_parser().parse_args()
# set input/output directories
bids_dir = os.path.abspath(opts.bids)
# ensure bids directory exists
os.makedirs(bids_dir, exist_ok=True)
sourcedata = os.path.join(bids_dir, 'sourcedata', 'VSTM')
derivatives = os.path.join(bids_dir, 'derivatives')
# ensure sourcedata and derivatives exist
os.makedirs(sourcedata, exist_ok=True)
os.makedirs(derivatives, exist_ok=True)
# assume data is already copied over if raw_dir isn't specified
if opts.raw_dir:
raw_dir = os.path.abspath(opts.raw_dir)
# output is only the number of copied files, throwing away
files_copied = copy_eprime_files(raw_dir, sourcedata)
print('{num} file(s) copied'.format(num=files_copied))
else:
print('-r not specified, assuming data are in the correct location: '
'{dir}'.format(dir=sourcedata))
# collect participant labels
if opts.participant_label:
participants = opts.participant_label
else:
participant_files = glob(os.path.join(sourcedata, 'VSTM_*.txt'))
sub_expr = re.compile(r'^.*VSTM_PACR-(?P<sub_id>[0-9]{3})-(?P<ses_id>[1-2]).txt')
participants = []
for participant_file in participant_files:
print(participant_file)
sub_dict = sub_expr.search(participant_file).groupdict()
participants.append(sub_dict['sub_id'])
# collect sessions
if opts.session_label:
sessions = opts.session_label
else:
sessions = [1, 2]
filename_template = 'VSTM_PACR-{sub}-{ses}.{ext}'
participant_dict = {}
for participant in participants:
participant_dict[participant] = {}
for session in sessions:
# initialize sub/ses dictionary
participant_dict[participant][session] = {'edat': None, 'txt': None}
# get the edat file (if it exists)
edat_file = filename_template.format(sub=participant,
ses=session,
ext='edat2')
if os.path.isfile(os.path.join(sourcedata, edat_file)):
participant_dict[participant][session]['edat'] = os.path.join(
sourcedata, edat_file
)
else:
print('{edat} missing!'.format(edat=edat_file))
participant_dict[participant].pop(session)
continue
# get the txt file (if it exists)
txt_file = filename_template.format(sub=participant,
ses=session,
ext='txt')
if os.path.isfile(os.path.join(sourcedata, txt_file)):
participant_dict[participant][session]['txt'] = os.path.join(
sourcedata, txt_file
)
else:
print('{txt} missing!'.format(txt=txt_file))
participant_dict[participant].pop(session)
continue
    # process the data per session
    for participant in participant_dict.keys():
        if opts.sub_prefix:
            participant_label = opts.sub_prefix + participant
        else:
            participant_label = participant
        for session in participant_dict[participant].keys():
            # look up the files with the original key, and only coerce to an
            # integer for the session_dict lookup (the key may be a string)
            edat_file = participant_dict[participant][session]['edat']
            txt_file = participant_dict[participant][session]['txt']
            session_label = session_dict[int(session)]
            config = os.path.abspath(opts.config)
            folder = 'beh'
            work_file = os.path.join(
                sourcedata, 'work', 'sub-' + participant_label,
                'ses-' + session_label, 'beh',
                'sub-{sub}_ses-{ses}_task-VSTM_raw.csv'.format(
                    sub=participant_label, ses=session_label))
            # ensure directory exists
            os.makedirs(os.path.dirname(work_file), exist_ok=True)
            # conversion to csv
            convert.text_to_rcsv(txt_file, edat_file, config, work_file)
            # create dataframe
            df = pd.read_csv(work_file)
            # drop practice trials
            practice_blocks = ['ColorPractice', 'ShapePractice', 'PracticeBoth']
            df.drop(df[df.Running.isin(practice_blocks)].index, inplace=True)
            # drop rows that are entirely NaN, i.e. trials where no response
            # was desired (at the beginning of all VSTM blocks)
            df.dropna(how='all', inplace=True)
            # rename column headers
            df.rename(index=str, columns={"Running": "trial_type",
                                          "Probe.ACC": "correct",
                                          "Probe.RT": "response_time",
                                          "Probe.CRESP": "probe_novelty"},
                      inplace=True)
            # convert response_time from milliseconds into seconds
            df['response_time'] = df['response_time'] / 1000
            # change 'correct' column from float to int
            df.correct = df.correct.astype(int)
            # keep the original block name in a new column
            df['block'] = df['trial_type']
            # replace trial_type elements with simpler description
            # (assign the result back: an in-place replace on a selected
            # column may not modify df in recent pandas versions)
            df['trial_type'] = df['trial_type'].replace(
                {'SimColour': 'color', 'SimShape': 'shape',
                 'SimBoth': 'color_and_shape'})
            # replace probe_novelty elements with a more sensible set
            # {/} -> novel -> 1
            # z -> repeat -> 0
            df['probe_novelty'] = df['probe_novelty'].replace({'{/}': 1, 'z': 0})
            # write processed data to file
            base_file = 'sub-{sub}_ses-{ses}_task-VSTM_events.tsv'
            bids_file = os.path.join(bids_dir,
                                     'sub-' + participant_label,
                                     'ses-' + session_label,
                                     folder,
                                     base_file.format(
                                         sub=participant_label,
                                         ses=session_label)
                                     )
            # make sure the directory exists
            os.makedirs(os.path.dirname(bids_file), exist_ok=True)
            df.to_csv(bids_file, sep='\t', index=False)
            # Do some quality assurance
            derivatives_dir = os.path.join(derivatives, 'VSTMQA')
            os.makedirs(derivatives_dir, exist_ok=True)
            base_json = 'sub-{sub}_ses-{ses}_task-VSTM_averages.json'
            out_json = os.path.join(derivatives_dir,
                                    'sub-' + participant_label,
                                    'ses-' + session_label,
                                    folder,
                                    base_json.format(
                                        sub=participant_label,
                                        ses=session_label)
                                    )
            base_fig = 'sub-{sub}_ses-{ses}_task-VSTM_swarmplot.svg'
            out_fig = os.path.join(derivatives_dir,
                                   'sub-' + participant_label,
                                   'ses-' + session_label,
                                   folder,
                                   base_fig.format(
                                       sub=participant_label,
                                       ses=session_label)
                                   )
            # make the derivatives directory for the participant/session in VSTMQA
            os.makedirs(os.path.dirname(out_json), exist_ok=True)
            # get average response time (correct trials only) and average correct
            json_dict = {'response_time': None, 'correct': None}
            json_dict['response_time'] = df['response_time'].where(df['correct'] == 1).mean()
            json_dict['correct'] = df['correct'].mean()
            ave_res = pd.Series(json_dict)
            ave_res.to_json(out_json)
            if not os.path.isfile(out_fig):
                # make a swarmplot; fix the category order so that it
                # matches the tick labels set below
                myplot = sns.swarmplot(x="trial_type", y="response_time",
                                       hue="correct", data=df, size=6,
                                       order=['color', 'shape',
                                              'color_and_shape'])
                # set the y range larger to fit the legend
                myplot.set_ylim(0, 10.0)
                # remove the title of the legend
                myplot.legend(title=None)
                # rename the xticks
                myplot.set_xticklabels(['Color', 'Shape', 'Shape and Color'])
                # rename xlabel
                myplot.set_xlabel('trial type')
                myplot.set_ylabel('response time (seconds)')
                # rename the legend labels (assumes both 0 and 1 are present)
                new_labels = ['incorrect', 'correct']
                for text, label in zip(myplot.legend_.texts, new_labels):
                    text.set_text(label)
                # save the figure
                myplot.figure.savefig(out_fig, dpi=72)
                # remove all plot features from memory
                plt.clf()
if __name__ == '__main__':
main()
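The participant lookup in the script above depends on the E-Prime files following the `VSTM_PACR-<sub>-<ses>.txt` naming scheme. Here is a minimal standalone sketch of just that parsing step, assuming the same scheme. The helper names `parse_vstm_filename` and `unique_participants` and the sample paths are hypothetical and not part of the original script.

```python
import re

# Same naming scheme as the script:
# VSTM_PACR-<3-digit subject>-<session 1 or 2>.txt
SUB_EXPR = re.compile(
    r'VSTM_PACR-(?P<sub_id>[0-9]{3})-(?P<ses_id>[1-2])\.txt$')


def parse_vstm_filename(path):
    """Return (sub_id, ses_id) for a matching path, or None otherwise."""
    match = SUB_EXPR.search(path)
    if match is None:
        return None
    return match.group('sub_id'), match.group('ses_id')


def unique_participants(paths):
    """Collect each participant ID once, preserving first-seen order."""
    seen = []
    for path in paths:
        parsed = parse_vstm_filename(path)
        if parsed is not None and parsed[0] not in seen:
            seen.append(parsed[0])
    return seen


# hypothetical file names, for illustration only
example_paths = [
    'sourcedata/VSTM_PACR-001-1.txt',
    'sourcedata/VSTM_PACR-001-2.txt',
    'sourcedata/VSTM_PACR-002-1.txt',
    'sourcedata/VSTM_notes.txt',
]
print(unique_participants(example_paths))
```

Returning `None` for files that do not match keeps a stray text file in the source folder from crashing the participant scan.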
bare, I can’t.
To get around that, I had to use this command (which I stole from Stack Overflow):
git config --local receive.denyCurrentBranch updateInstead
And now I can push to the server git repo. We can also benefit from branches if multiple people are working on the data at the same time.
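If you would rather script that setting than type it, here is a rough sketch using Python's `subprocess`. It assumes `git` is installed on the server; the function names and the placeholder path are mine, purely for illustration.

```python
import subprocess


def deny_current_branch_cmd(repo_dir):
    """Build the git command that lets a push update a checked-out branch."""
    return ['git', '-C', repo_dir, 'config', '--local',
            'receive.denyCurrentBranch', 'updateInstead']


def allow_push_to_checked_out_branch(repo_dir):
    """Run the command inside repo_dir (requires git to be installed)."""
    subprocess.run(deny_current_branch_cmd(repo_dir), check=True)


# '/path/to/server/repo' is a placeholder, not a real location
print(' '.join(deny_current_branch_cmd('/path/to/server/repo')))
```

One caveat: with `updateInstead`, git refuses the push if the working tree on the server has uncommitted changes, so it works best when nobody edits files directly in the server copy.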
In addition, I can make a Python environment with conda and wrap up any Python notebooks I make with a yml file, which means anyone else can (ostensibly) replicate my environment and run the code (reproducible!)
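To give a feel for what such a yml file holds, here is a small sketch that writes out a minimal conda environment file by hand. The environment name, channel, and package list are made up for illustration; in practice `conda env export > environment.yml` produces the real thing from an existing environment.

```python
def render_environment_yml(name, channels, dependencies):
    """Return the text of a minimal conda environment file."""
    lines = ['name: {}'.format(name), 'channels:']
    lines += ['  - {}'.format(channel) for channel in channels]
    lines.append('dependencies:')
    lines += ['  - {}'.format(dep) for dep in dependencies]
    return '\n'.join(lines) + '\n'


# hypothetical environment contents, for illustration only
yml_text = render_environment_yml(
    name='vstm-analysis',
    channels=['conda-forge'],
    dependencies=['python=3.9', 'pandas', 'seaborn', 'jupyter'],
)
print(yml_text)
```

Someone else can then rebuild the environment with `conda env create -f environment.yml` after saving that text to a file.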
]]>