- Create `.env` file according to `.env.example`.
- Run `docker-compose -f docker-compose.dev.yml up --build -d` to build and start the development containers.
- (optional) [PyCharm] Set up a remote interpreter (not recommended; use a local interpreter instead):
    - Go to `File -> Settings -> Project -> Python Interpreter`.
    - Click `Add interpreter -> On SSH...`.
    - Fill in the fields (double-check the values in `docker-compose.dev.yml`):
        - Host: `localhost`.
        - Port: `2222`.
        - Username: `root`.
        - Password: no password.
        - `Use System installed interpreter`: selected.
        - `Automatically sync project files`: checked.
        - Path mappings: `<Project root> → /app`.
    - Click `OK`.
    - Local project files might get wiped out; just don't commit them, and unstage any changes.
- To connect to SSH via terminal, run `ssh root@localhost -p 2222`.
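The `.env` file created earlier supplies configuration such as the PostgreSQL credentials. As a rough illustration of how such a file is consumed (this is not the project's actual loader — real projects typically use `python-dotenv`), a minimal `KEY=VALUE` parser might look like:

```python
from pathlib import Path


def load_env(path: str) -> dict:
    """Parse a simple KEY=VALUE .env file, skipping blank lines and # comments.

    Illustrative sketch only; variable names below are assumptions,
    check .env.example for the real ones.
    """
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env
```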
- Create a volume for language models: `docker volume create barometrs-language_models_volume`
- Run `download_models.py` to download language models from HuggingFace inside the `web` container: `docker exec -it web bash -c "python3 /app/download_models.py"`
- Credentials to connect to PostgreSQL are in the `.env` file.
- Run `init_db.py` to create tables: `docker exec -it web bash -c "python3 /app/init_db.py"`
- Extract the raw data (comments) inside the `data` folder. See the expected paths per news outlet inside `core/data_import.py`.
  As of now, the following folders are expected:
    - Delfi - `data/delfi/`
    - Delfi-new - `data/delfi-new/`
    - Apollo - `data/apollo/`
    - TVNET - `data/tvnet/`
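Before running the import, it can help to verify that the raw data actually sits in the expected folders. A hypothetical pre-flight check (the folder names come from the list above, but this helper is not part of the repo — `core/data_import.py` remains the authoritative source for expected paths):

```python
from pathlib import Path

# Outlet -> expected comment folder, per the list above.
EXPECTED_FOLDERS = {
    "Delfi": "data/delfi/",
    "Delfi-new": "data/delfi-new/",
    "Apollo": "data/apollo/",
    "TVNET": "data/tvnet/",
}


def missing_folders(root: str = ".") -> list[str]:
    """Return the expected data folders that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_FOLDERS.values() if not (base / rel).is_dir()]
```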
- Run `data_import.py` to import articles and comments into the database: `docker exec -it -w /app web python3 -m core.data_import`
- Run `predict_comments.py` to predict emotions for the imported comments: `docker exec -it -w /app web python3 -m core.predict_comments`
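Since the comment table can be large, the prediction step is naturally run in batches. The sketch below only illustrates that batching flow — the `classify` callable is a placeholder standing in for the actual HuggingFace model call, not the project's implementation:

```python
from typing import Callable


def predict_in_batches(comments: list[str],
                       classify: Callable[[list[str]], list[str]],
                       batch_size: int = 32) -> list[str]:
    """Run an emotion classifier over comments in fixed-size batches.

    `classify` is a stand-in for the real model inference; batching keeps
    memory bounded when processing a large comment table.
    """
    labels: list[str] = []
    for start in range(0, len(comments), batch_size):
        labels.extend(classify(comments[start:start + batch_size]))
    return labels
```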
- Run `extract_keywords_by_day.py` to extract keywords: `docker exec -it -w /app web python3 -m core.extract_keywords_by_day`
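Conceptually, per-day keyword extraction groups comments by date and keeps the most salient terms for each day. A toy frequency-based sketch of that idea (the real script very likely uses a proper keyword-extraction model, and the stopword list here is a made-up placeholder):

```python
from collections import Counter, defaultdict
from datetime import date

# Placeholder stopword list for illustration only.
STOPWORDS = frozenset({"the", "a", "is", "and"})


def keywords_by_day(comments: list[tuple[date, str]],
                    top_n: int = 3) -> dict[date, list[str]]:
    """Group (day, text) pairs by day and keep the most frequent tokens.

    A simple term-frequency stand-in for whatever extraction method
    core/extract_keywords_by_day.py actually implements.
    """
    buckets: dict[date, Counter] = defaultdict(Counter)
    for day, text in comments:
        buckets[day].update(
            tok for tok in text.lower().split() if tok not in STOPWORDS
        )
    return {day: [word for word, _ in counts.most_common(top_n)]
            for day, counts in buckets.items()}
```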
Create a database dump in plain-text format (preferred):
`docker exec db pg_dump -U barometrs -d barometrs -f /tmp/barometrs.sql`
`docker cp db:/tmp/barometrs.sql ./barometrs.sql`
Create a database dump in custom (compressed) format (`-Fc`; note that directory format would be `-Fd`):
`docker exec barometrs-db pg_dump -U emotion_classification -d emotion_classification -Fc -Z 9 -f /tmp/emotion_classification.dump`
`docker cp barometrs-db:/tmp/emotion_classification.dump ./emotion_classification.dump`
- Wipe out the database:
  `docker exec -u root barometrs-db bash -c "rm -rf /var/lib/postgresql/data/*"`
- Copy data:
  `docker cp path_to/var/lib/postgresql/data/. barometrs-db:/var/lib/postgresql/data`
- Build containers in development mode:
  `docker-compose -f docker-compose.dev.yml up --build -d`
- Download language models:
  `docker exec -it web bash -c "python3 /app/download_models.py"`
- Clear existing keywords from the database. In the DB console, run:
  `truncate table emotion_keywords_by_day;`
- Run the keyword extraction script:
  `docker exec -it -w /app web python3 -m core.extract_keywords_by_day`
- Switch back to production containers:
  `docker-compose -f docker-compose.prod.yml up --build -d`