
Error in running pretrain because of torch.distributed #26

@tinaboya2023

Description


Hi,
I installed the environment with the following configuration:
python: 3.8
PyTorch/CUDA install command: conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
GPU: 1× GeForce RTX 3090 (24 GB VRAM)

I'm trying to run pretraining with the following command:
python -m torch.distributed.launch --nproc_per_node 1 tools/run.py --pretrain --tasks vqa --datasets m4c_textvqa --model m4c_split --seed 13 --config configs/vqa/m4c_textvqa/tap_base_pretrain.yml --save_dir save/m4c_split_pretrain_test training_parameters.distributed True

but I encounter an error.

Could you help me resolve this problem?
Is this error caused by using only 1 GPU?
Do I need to change the initial value of some parameter (like local_rank)?
Could the error be due to a lack of GPU memory?
It is very important to me to solve this problem.
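As context for my local_rank question: my understanding (an assumption on my part, not something from the TAP repo) is that torch.distributed.launch mainly exports rank/world-size environment variables and passes --local_rank to the script, which then calls init_process_group. A minimal single-process sanity check of that handshake, runnable even on CPU with the gloo backend, would be:

```python
# Minimal single-process torch.distributed sanity check (my sketch, not the
# repo's code). Emulates what torch.distributed.launch exports for one process.
import os
import torch.distributed as dist

# The launcher normally sets these; with nproc_per_node=1 they are trivial.
os.environ["MASTER_ADDR"] = "127.0.0.1"
os.environ["MASTER_PORT"] = "29512"
os.environ["RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"

# "gloo" works without a GPU; the real run would use "nccl" on CUDA.
dist.init_process_group(backend="gloo", rank=0, world_size=1)
print(dist.get_world_size())  # expect 1 for a single-process group
dist.destroy_process_group()
```

If this snippet initializes cleanly but the actual run still fails, the problem is presumably in the NCCL/CUDA side or in the training config rather than in the process-group setup itself.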
