empty output of chroder #182

Open
kendaida opened this issue Mar 1, 2023 · 6 comments

Comments

@kendaida

kendaida commented Mar 1, 2023

Hi, thank you for developing this cool tool.
I know that there are several posts about this, but I cannot solve my problem.

I am trying to compare a scaffold-level de novo assembly with a reference genome, using pseudo-genome assembly.
It runs, but the chroder output "out_ref.fasta" is an empty file.
"output.qry.fasta" looks fine and contains the same number of chromosomes as the reference genome.

I tried syri with either "out_ref.fasta" or the reference FASTA, and got the error "Chromosomes IDs do not match."

Thanks for your help.

@mnshgl0110
Member

Hi. Could you please share the chromosome IDs in the reference and the chroder_qry genome? If the number of chromosomes is the same, then this error should not happen. Also, please run syri with --log DEBUG and share the log file.

Additionally, if there is a chromosome-level reference genome available, then you can also try using ragtag, as that generates better pseudo-chromosomes than chroder.

@kendaida
Author

kendaida commented Mar 2, 2023

Thanks for your help.

I made a de novo assembly of human chr6.
I am using hg38 as the reference, with chr6 extracted by samtools.

The output from chroder output.qry.fasta includes

chr6

And the reference includes

chr6

The log file for syri:
2023-03-01 23:27:54,823 - Reading Coords - DEBUG - syri:135 - T
2023-03-01 23:27:54,824 - Reading Coords - INFO - syri:135 - Reading input from .tsv file
2023-03-01 23:27:54,841 - Reading Coords - WARNING - syri:135 - Chromosomes IDs do not match.
2023-03-01 23:27:54,841 - Reading Coords - ERROR - syri:135 - Unequal number of chromosomes in the genomes. Exiting

I'll try ragtag!

@mnshgl0110
Member

Does your reference have only chr6, or other chromosomes as well? Make sure that the reference and query genomes have the same number of chromosomes.

@kendaida
Author

kendaida commented Mar 2, 2023

Thanks for your help.

The reference only includes chr6.

My commands are below.

GRCh38_chr6.fa includes only "chr6".
$A_fa is the FASTA for chromosome 6 assembled by Shasta.

nucmer --maxmatch -c 500 -b 500 -l 100  GRCh38_chr6.fa $A_fa
delta-filter -m -i 90 -l 100 out.delta > out_filtered.delta
show-coords -THrd out_filtered.delta > out_filtered.coords
chroder -o output out_filtered.coords $BASE_DIR/GRCh38_chr6.fa $A_fa

The chroder output output.ref.fasta is empty,
so I used the reference itself for syri.

I confirmed that the syri inputs have the same chromosome.

grep ">" GRCh38_chr6.fa
>chr6

grep ">" output.qry.fasta 
>chr6

Then I ran syri.
However, the error messages below were printed.

syri -c out_filtered.coords -d out_filtered.delta -r GRCh38_chr6.fa -q output.qry.fasta

2023-03-02 09:58:17,701 - Reading Coords - INFO - syri:135 - Reading input from .tsv file
2023-03-02 09:58:17,721 - Reading Coords - WARNING - syri:135 - Chromosomes IDs do not match.
2023-03-02 09:58:17,722 - Reading Coords - ERROR - syri:135 - Unequal number of chromosomes in the genomes. Exiting

I also tried ragtag, but it did not resolve the problem.
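
One way to narrow down a "Chromosomes IDs do not match" error like the one above is to compare the query IDs recorded in the coords file with the headers of the FASTA passed to syri. The following is only a sketch: it assumes the coords file comes from show-coords -THrd (tab-separated, query ID in the last column), and the function names are made up for illustration.

```shell
# List the query sequence IDs recorded in a show-coords -THrd file.
# Assumption: tab-separated rows with the query ID in the last column.
coords_query_ids() {
    awk -F'\t' '{ print $NF }' "$1" | sort -u
}

# List the sequence IDs in a FASTA file (header text up to the first space).
fasta_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u
}

# Succeed only if the coords query IDs and the FASTA IDs are identical.
ids_match() {
    [ "$(coords_query_ids "$1")" = "$(fasta_ids "$2")" ]
}
```

For example, `ids_match out_filtered.coords output.qry.fasta || echo "IDs differ"` would flag a coords file that was computed against a different query FASTA.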

@mnshgl0110
Member

If I understand correctly, in the following command:
syri -c out_filtered.coords -d out_filtered.delta -r GRCh38_chr6.fa -q output.qry.fasta
you are using the alignments generated between GRCh38_chr6.fa and A_fa.
To compare GRCh38_chr6.fa and output.qry.fasta, you need to generate genome alignments between those two files and use them with syri.
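
The re-alignment described here could look roughly like the sketch below. It reuses the nucmer, delta-filter and show-coords parameters quoted earlier in this thread; the function name and the output prefix are hypothetical, and the parameters may need tuning for other data.

```shell
# Align the reference against chroder's reordered query, then run syri
# on those new alignments instead of the original ref-vs-contigs ones.
# Note: ref, qry and prefix are plain (global) shell variables.
realign_and_syri() {
    ref=$1      # e.g. GRCh38_chr6.fa
    qry=$2      # e.g. output.qry.fasta
    prefix=$3   # hypothetical output prefix, e.g. ref_vs_chroder

    nucmer --maxmatch -c 500 -b 500 -l 100 -p "$prefix" "$ref" "$qry" &&
    delta-filter -m -i 90 -l 100 "$prefix.delta" > "$prefix.filtered.delta" &&
    show-coords -THrd "$prefix.filtered.delta" > "$prefix.filtered.coords" &&
    syri -c "$prefix.filtered.coords" -d "$prefix.filtered.delta" -r "$ref" -q "$qry"
}
```

It would be called as `realign_and_syri GRCh38_chr6.fa output.qry.fasta ref_vs_chroder`, so that the coords and delta files given to syri describe the same two FASTA files passed with -r and -q.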

@kendaida
Author

kendaida commented Mar 2, 2023

Oh, thank you!
I re-ran mummer with the two FASTA files and finally got the output of plotsr!

Is it usual to get an empty reference FASTA ("output.ref.fasta") from chroder if I use a chromosome-level assembly as the reference?


2 participants