Commit 2ffc78f

docs: updated calls and examples
1 parent 07f7caf commit 2ffc78f

1 file changed

+41
-6
lines changed

docker/entity-extraction-pipeline/README.md

Lines changed: 41 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -17,9 +17,23 @@ The following environment variables can be set to change the behavior of the pip
1717
- `MAX_ARTICLES`: This variable can be set to a number to limit the number of articles processed, which is useful for testing and debugging. The default is `-1`, which means no limit.
1818
- `LOG_OUTPUT_DIR`: This variable sets the path of the output folder where the log file is written. The default is the directory from which the docker container is run.
1919

20+
## Testing the Docker Image to Run on xDD
21+
22+
The Docker image must be runnable without root permissions. To test that this is set up correctly, run the following command and confirm that it completes without error.
23+
24+
```bash
25+
docker run -u $(id -u) -p 5000:5000 -v /${PWD}/data/entity-extraction/raw/original_files/:/inputs/ -v /${PWD}/data/entity-extraction/processed/processed_articles/:/outputs/ --env LOG_OUTPUT_DIR="../outputs/" metaextractor-entity-extraction-pipeline:v0.0.3
26+
```
27+
28+
**Details**:
29+
- `-u $(id -u)` runs the docker container as the current user, so that the output files are not owned by root.
30+
- `LOG_OUTPUT_DIR="../outputs/"` differs from the value in the docker compose file because it is relative to the working directory, which under `docker run` starts in the `app` folder.
31+
- In Git Bash on Windows, `/${PWD}` is used to get the current directory; the leading forward slash is needed to get the correct path.
32+
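One way to check that `-u $(id -u)` had the intended effect is to look at who owns the files written to the mounted output folder after the run. The sketch below is an illustration only and not part of the pipeline: the function name `files_not_owned_by_me` is hypothetical, and it assumes a `find` that accepts a numeric user ID for `-user` (GNU and BSD `find` both do).

```bash
# Hypothetical helper (not part of the pipeline): print every file or folder
# under the given directory that is NOT owned by the user running this script.
files_not_owned_by_me() {
    # `id -u` prints the numeric ID of the current user;
    # `! -user <id>` matches entries owned by anyone else (for example root).
    find "$1" ! -user "$(id -u)" -print
}
```

Calling `files_not_owned_by_me ./data/entity-extraction/processed/processed_articles/` on the host after the container exits should print nothing if all outputs belong to the current user.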
2033
## Sample Docker Compose Setup
2134

2235
Update the environment variables defined under the `entity-extraction-pipeline` service in the `docker-compose.yml` file in the root directory. Then use `docker-compose` to build the Docker image, which installs the required dependencies, and run it as follows:
36+
2337
```bash
2438
docker-compose build
2539
docker-compose up entity-extraction-pipeline
@@ -30,19 +44,40 @@ Below is a sample docker compose configuration for running the image:
3044
version: "0.0.1"
3145
services:
3246
entity-extraction-pipeline:
33-
image: metaextractor-entity-extraction-pipeline:v0.0.1
47+
image: metaextractor-entity-extraction-pipeline:v0.0.3
3448
build:
3549
...
3650
ports:
3751
- "5000:5000"
3852
volumes:
39-
- ./data/raw/:/app/inputs/
40-
- ./data/processed/:/app/outputs/
53+
- ./data/raw/:/inputs/
54+
- ./data/processed/:/outputs/
4155
environment:
42-
- HF_NER_MODEL_NAME=finding-fossils/metaextractor
43-
- SPACY_NER_MODEL_NAME=en_metaextractor_spacy
4456
- USE_NER_MODEL_TYPE=huggingface
45-
- LOG_OUTPUT_DIR=/app/outputs/
57+
- LOG_OUTPUT_DIR=/outputs/
4658
- MAX_SENTENCES=20
4759
- MAX_ARTICLES=1
60+
```
61+
62+
## Pushing the Docker Image to Docker Hub
63+
64+
To push the Docker image to Docker Hub, first log in to Docker Hub using the following command:
65+
66+
```bash
67+
docker login
68+
```
69+
70+
Then tag the docker image with the following two commands:
71+
72+
```bash
73+
# tag the image as "latest" (the default when no tag is given)
74+
docker tag metaextractor-entity-extraction-pipeline:v<VERSION NUMBER> <DOCKER HUB USER ID>/metaextractor-entity-extraction-pipeline
75+
# tag the image with a specific version number
76+
docker tag metaextractor-entity-extraction-pipeline:v<VERSION NUMBER> <DOCKER HUB USER ID>/metaextractor-entity-extraction-pipeline:v<VERSION NUMBER>
77+
```
78+
79+
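Finally, push the Docker image to Docker Hub using the following command. With no tag given, recent Docker versions push only the `latest` tag, so repeat the command with `:v<VERSION NUMBER>` appended to also publish the version-tagged image: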
80+
81+
```bash
82+
docker push <DOCKER HUB USER ID>/metaextractor-entity-extraction-pipeline
4883
```
