Not a gzipped file #84

@lordkev

Hi,

I received the following error when trying to read a standard bgzipped VCF file in kage:

  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/kage/command_line_interface.py", line 52, in main
    run_argument_parser(sys.argv[1:])
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/kage/command_line_interface.py", line 559, in run_argument_parser
    args.func(args)
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/kage/indexing/main.py", line 305, in make_index_cli
    r = make_index(args.reference, args.vcf, args.out_base_name,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/kage/indexing/main.py", line 74, in make_index
    validate_input_vcf(vcf_file_name)
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/kage/indexing/main.py", line 40, in validate_input_vcf
    with bnp.open(vcf_file_name) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/bionumpy/io/files.py", line 183, in bnp_open
    return _get_buffered_file(filename, suffix, mode, is_gzip=is_gzip, buffer_type=buffer_type, lazy=lazy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/bionumpy/io/files.py", line 68, in _get_buffered_file
    file_reader = NumpyFileReader(open_func(filename, "rb"), buffer_type) # , **kwargs2)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/bionumpy/io/parser.py", line 68, in __init__
    self._header_data = self._buffer_type.read_header(self._file_obj)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/site-packages/bionumpy/io/file_buffers.py", line 163, in read_header
    file_object.seek(-len(line), 1)
  File "/home/kevin/miniforge3/envs/kage/lib/python3.12/gzip.py", line 421, in seek
    return self._buffer.seek(offset, whence)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gzip.BadGzipFile: Not a gzipped file (b'\x00\x00')

It took me a while to track this down, but it seems to be caused by the use of the isal igzip module. As soon as I replaced that import with the standard gzip module, everything worked fine. Let me know if any other details would be helpful.
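
For reference, here is a minimal sketch of the access pattern that fails in the traceback (read header lines, then `seek(-len(line), 1)` back over the first record), run against the standard `gzip` module. It is only an illustration under assumptions: the helper names `make_multi_member_gzip` and `read_header_then_rewind` are hypothetical and not part of bionumpy or kage, and the test data is a plain concatenation of gzip members, which only loosely mimics the block structure of a bgzipped file rather than being real BGZF.

```python
import gzip
import io


def make_multi_member_gzip(chunks):
    """Concatenate independently compressed gzip members.

    Hypothetical helper: this loosely imitates the multi-block layout of a
    bgzipped file, but it is NOT real BGZF (no BGZF extra field is written).
    """
    return b"".join(gzip.compress(chunk) for chunk in chunks)


def read_header_then_rewind(file_obj):
    """Collect '#'-prefixed header lines, then step back over the first record.

    Hypothetical helper mirroring the relative backwards seek seen in the
    traceback: file_object.seek(-len(line), 1).
    """
    header = []
    while True:
        line = file_obj.readline()
        if not line.startswith(b"#"):
            # Rewind so the first non-header line can be read again.
            file_obj.seek(-len(line), 1)
            break
        header.append(line)
    return header


# Build a small VCF-like payload split across several gzip members.
data = make_multi_member_gzip([
    b"##fileformat=VCFv4.2\n#CHROM\tPOS\n",
    b"chr1\t100\n",
    b"chr1\t200\n",
])

# The standard gzip module accepts a file object and supports the rewind.
with gzip.open(io.BytesIO(data), "rb") as f:
    header = read_header_then_rewind(f)
    first_record = f.readline()

print(header)
print(first_record)
```

Running the same two helpers against a reader opened with a different gzip implementation, on the actual VCF file, might help narrow down whether the backwards seek is where the behavior diverges.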
