Commit 7241d08

variables to change in README
1 parent 44fb392 commit 7241d08

2 files changed: 25 additions & 16 deletions

docker/data-review-tool/README.md

Lines changed: 12 additions & 8 deletions
````diff
@@ -2,7 +2,7 @@
 
 This docker image contains `Finding Fossils`, a data review tool built using Dash, Python. It is used to visualize the outputs of the models and verify the extracted entities for inclusion in the Neotoma Database.
 
-The expected inputs are mounted onto the newly created container as volumes and can be dumped in the `data/data-review-tool` folder. It assumes the following:
+The expected inputs are mounted onto the newly created container as volumes and can be dumped in any folder. An environment variable is set up to provide the path to this folder. It assumes the following:
 1. A parquet file containing the outputs from the article relevance prediction component.
 2. A zipped file containing the outputs from the named entity extraction component.
 3. Once the articles have been verified, we update the same parquet file referenced by the environment variable `ARTICLE_RELEVANCE_BATCH` with the entities verified by the steward and the review status for the article.
@@ -15,13 +15,9 @@ The following environment variables can be set to change the behavior of the pip
 
 ## Sample Docker Compose Setup
 
-Update the environment variables defined under the `data-review-tool` service in the `docker-compose.yml` file under the root directory. Then build and run the docker image to install the required dependencies using `docker-compose` as follows:
-```bash
-docker-compose build
-docker-compose up data-review-tool
-```
+Update the environment variables and the volume paths defined under the `data-review-tool` service in the `docker-compose.yml` file under the root directory. The volume paths are:
 
-This is the basic docker compose configuration for running the image.
+- `INPUT_PATH`: The path to the directory where the data is dumped, e.g. `./data/data-review-tool` (recommended)
 
 ```yaml
 version: "3.9"
@@ -32,5 +28,13 @@ services:
     ports:
       - "8050:8050"
     volumes:
-      - ./data/data-review-tool:/MetaExtractor/inputs
+      - {INPUT_PATH}:/MetaExtractor/inputs
+    environment:
+      - ARTICLE_RELEVANCE_BATCH=sample_parquet_output.parquet
+      - ENTITY_EXTRACTION_BATCH=sample_ner_output.zip
+```
+Then build and run the docker image to install the required dependencies using `docker-compose` as follows:
+```bash
+docker-compose build
+docker-compose up data-review-tool
 ```
````
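The commit leaves `{INPUT_PATH}` as a placeholder without showing how it is filled in. A minimal sketch of one way to supply it, assuming Compose's `${VAR}` substitution from a `.env` file (the `.env` mechanism is an assumption, not part of this commit):

```shell
# Sketch only: assumes the compose file uses ${INPUT_PATH} and that
# Compose reads it from a .env file next to docker-compose.yml.
cat > .env <<'EOF'
INPUT_PATH=./data/data-review-tool
EOF

# With docker-compose.yml containing:
#   volumes:
#     - ${INPUT_PATH}:/MetaExtractor/inputs
# the workflow from the README would then be:
#   docker-compose build
#   docker-compose up data-review-tool
grep 'INPUT_PATH' .env
```

Any mechanism that sets `INPUT_PATH` in the Compose environment (an exported shell variable, a `.env` file, or editing the path directly) would work equally well.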

docker/entity-extraction-pipeline/README.md

Lines changed: 13 additions & 8 deletions
````diff
@@ -19,11 +19,10 @@ The following environment variables can be set to change the behavior of the pip
 
 ## Sample Docker Compose Setup
 
-Update the environment variables defined under the `entity-extraction-pipeline` service in the `docker-compose.yml` file under the root directory. Then build and run the docker image to install the required dependencies using `docker-compose` as follows:
-```bash
-docker-compose build
-docker-compose up entity-extraction-pipeline
-```
+Update the environment variables and the volume paths defined under the `entity-extraction-pipeline` service in the `docker-compose.yml` file under the root directory. The volume paths are:
+
+- `INPUT_PATH`: The folder containing the raw text `nlp352` TSV file, e.g. `./data/entity-extraction/raw/original_files/` (recommended)
+- `OUTPUT_PATH`: The folder to dump the final JSON files, e.g. `./data/entity-extraction/processed/processed_articles/` (recommended)
 
 Below is a sample docker compose configuration for running the image:
 ```yaml
@@ -36,13 +35,19 @@ services:
     ports:
       - "5000:5000"
     volumes:
-      - ./data/raw/:/app/inputs/
-      - ./data/processed/:/app/outputs/
+      - {INPUT_PATH}:/app/inputs/
+      - {OUTPUT_PATH}:/app/outputs/
     environment:
       - HF_NER_MODEL_NAME=finding-fossils/metaextractor
       - SPACY_NER_MODEL_NAME=en_metaextractor_spacy
       - USE_NER_MODEL_TYPE=huggingface
       - LOG_OUTPUT_DIR=/app/outputs/
       - MAX_SENTENCES=20
       - MAX_ARTICLES=1
-```
+```
+Then build and run the docker image to install the required dependencies using `docker-compose` as follows:
+```bash
+docker-compose build
+docker-compose up entity-extraction-pipeline
+```
+
````
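As with the data review tool, `{INPUT_PATH}` and `{OUTPUT_PATH}` are placeholders in this commit. A hedged sketch of supplying both via exported shell variables, assuming Compose `${VAR}` substitution (not shown in the diff):

```shell
# Sketch only: assumes the compose file references ${INPUT_PATH} and
# ${OUTPUT_PATH}; the commit shows {INPUT_PATH}/{OUTPUT_PATH} as
# placeholders without specifying the substitution mechanism.
export INPUT_PATH=./data/entity-extraction/raw/original_files/
export OUTPUT_PATH=./data/entity-extraction/processed/processed_articles/

# docker-compose.yml would then map:
#   - ${INPUT_PATH}:/app/inputs/
#   - ${OUTPUT_PATH}:/app/outputs/
echo "${INPUT_PATH} -> /app/inputs/"
echo "${OUTPUT_PATH} -> /app/outputs/"
```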
