Question about size of the resulting fasta file of mitochondrial genome sequencing #22

Dear Changwei,

I used autoMito to generate the two GFA files (raw and master) for the assembled Malus domestica genome (as in Demo2), with the following command:

~/PMAT-1.5.3/bin/PMAT autoMito -i Malus_domestica.540Mb.fa -o ./out.all -st hifi -g 703m -mm -tp all -cpu 20

Listing the output with the ll command shows that both files are around 500,000 bytes in size:

-rw-rw-r-- 1 526388 2024-07-02 21:06:48 PMAT_mt_master.gfa
-rw-rw-r-- 1 557590 2024-07-02 21:06:48 PMAT_mt_raw.gfa

The contigs included in PMAT_mt_raw.gfa are:
>1
>2
>3
>2159
>4834
>15388
>1233
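
For reference, the contig names can be listed together with their sequence lengths, which is a more direct measure than the file size in bytes. The following is only a minimal sketch: it assumes the file uses standard tab-separated GFA1 "S" lines with the sequence stored in the third column, and the function name gfa_segment_lengths is a hypothetical helper, not part of PMAT.

```shell
# Print each segment (contig) name and its sequence length, then the total.
# Assumption: GFA1 segment lines of the form  S <name> <sequence> [tags...]
gfa_segment_lengths() {
  awk -F'\t' '
    $1 == "S" {
      len = length($3)          # length of the sequence column
      total += len
      printf "%s\t%d\n", $2, len
    }
    END { printf "total\t%d\n", total }
  ' "$1"
}

# Example call (file name taken from the listing above):
# gfa_segment_lengths PMAT_mt_raw.gfa
```

If the S lines stored "*" instead of the sequence, the reported lengths would all be 1, so the output should be checked against the file before relying on it.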

However, the reference mitochondrial sequence for apple that I downloaded from NCBI (https://www.ncbi.nlm.nih.gov/nuccore/NC_018554.1/) is only about 403000b in size, and the raw fasta file I obtained contains many contigs that are not part of the reference sequence, e.g. contig 4834. When I pasted that contig into NCBI's search engine, I found that it actually belongs to the apple chloroplast genome. In other words, the autoMito command I used appears to have included chloroplast sequences in the mt GFA file, which is supposed to contain only the mitochondrial genome.
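
To check a single contig such as 4834 against NCBI, its sequence can be pulled out of the GFA as a FASTA record instead of being copied by hand. This is a rough sketch under the same assumption as before (tab-separated GFA1 "S" lines with the sequence in the third column); gfa_segment_to_fasta and the output file name are hypothetical.

```shell
# Write one GFA segment as a FASTA record to standard output.
# $1 = GFA file, $2 = segment (contig) name
gfa_segment_to_fasta() {
  awk -F'\t' -v id="$2" '
    # Appending "" forces a string comparison, so "01" does not match "1".
    $1 == "S" && ($2 "") == (id "") { printf ">%s\n%s\n", $2, $3 }
  ' "$1"
}

# Example call (contig name taken from the list above):
# gfa_segment_to_fasta PMAT_mt_raw.gfa 4834 > contig_4834.fa
```

The resulting file can then be used as the query for a BLAST search. An empty output means no segment with that name was found.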

Do you have any idea why this happens?

Thank you very much!
