Error when running simulate.py #40

Open
asmmahmoud opened this issue May 8, 2022 · 4 comments

Comments

@asmmahmoud

Hi,
I am trying to simulate long reads with the simulate.py script. This is the command I used:
./simulate.py --fasta /home/asamy/scratch/ensemble_ref_hg38/Homo_sapiens.GRCh38.dna.chromosome.1.fa --movie_id ONT --read_type fastq --coverage 15 --min_frag 600 --max_frag 140000
It failed with:
AssertionError: /project/6032807/asamy/longislnd-0.9.5/run is not a directory
So I created the run directory, added an error profile to it, and ran the command again, but then got this error:
AssertionError: failed to find models in directory /project/6032807/asamy/longislnd-0.9.5/run
Could you help me solve this issue?

@yunfeiguo

Hi @AsmaaSamyMohamedMahmoud
/project/6032807/asamy/longislnd-0.9.5/run is the default path where LongISLND looks for a model. Please create a model with sample.py first, then pass the model's location to simulate.py with the --model_dir option.
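
As a rough illustration of the second step, here is a minimal Python sketch that assembles the simulate.py command from this thread with --model_dir added. The helper name build_simulate_command and the directory checks are hypothetical and are not part of LongISLND; only the flags themselves come from the messages above. The non-empty check is just a cheap sanity test, it cannot tell whether the directory holds a valid model produced by sample.py.

```python
from pathlib import Path


def build_simulate_command(fasta, model_dir, coverage, read_type="fastq",
                           movie_id="ONT", min_frag=600, max_frag=140000,
                           script="./simulate.py"):
    """Return the argument list for simulate.py (hypothetical helper).

    Only flags quoted in this thread are used. The checks below mirror the
    two assertion errors reported above, but they are an assumption and do
    not validate the model files themselves.
    """
    model_path = Path(model_dir)
    if not model_path.is_dir():
        raise NotADirectoryError(f"{model_path} is not a directory")
    if not any(model_path.iterdir()):
        raise FileNotFoundError(
            f"{model_path} is empty; build a model with sample.py first"
        )
    return [
        script,
        "--fasta", str(fasta),
        "--model_dir", str(model_path),
        "--movie_id", movie_id,
        "--read_type", read_type,
        "--coverage", str(coverage),
        "--min_frag", str(min_frag),
        "--max_frag", str(max_frag),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) once a model directory built by sample.py exists.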

@asmmahmoud
Author

Hi @yunfeiguo,
Thank you for your reply. I still have a problem: I don't have an alignment file to use as input for sample.py. I only have a reference genome, from which I want to simulate long reads.

@yunfeiguo

LongISLND's simulation relies on an error model, which is built from an alignment file. If you don't have any real data to generate the alignment file, one solution is to use public data, e.g. PacBio or Oxford Nanopore data on E. coli. Note that the alignment file used to build the error model can be based on any reference genome (it does not have to be the genome used for simulation), as long as that genome contains enough k-mers. The E. coli genome contains all possible 7-mers, so its 7-mer error model can be used to simulate any other genome.
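
To check the "enough k-mers" condition for a candidate genome yourself, a minimal Python sketch like the one below counts how many of the 4^k possible k-mers appear in a set of sequences. This is not LongISLND code; the function names are made up for illustration, and it assumes only the forward strand matters and that windows containing N or other ambiguous bases are skipped.

```python
from itertools import product


def distinct_kmers(sequence, k):
    """Return the set of k-mers in one sequence that use only A, C, G, T."""
    seq = sequence.upper()
    allowed = set("ACGT")
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= allowed:  # skip windows with N or other symbols
            found.add(kmer)
    return found


def kmer_coverage(sequences, k):
    """Return (number of distinct k-mers seen, number of possible k-mers)."""
    found = set()
    for seq in sequences:  # one entry per contig, so k-mers never span contigs
        found |= distinct_kmers(seq, k)
    return len(found), 4 ** k


def missing_kmers(sequences, k):
    """Return the k-mers over A, C, G, T that never occur in the sequences."""
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    found = set()
    for seq in sequences:
        found |= distinct_kmers(seq, k)
    return all_kmers - found
```

For a 7-mer model, kmer_coverage(contigs, 7) would need to report 16384 of 16384 (4^7) for the genome to contain every possible 7-mer, where contigs is a list of sequence strings you have loaded from the FASTA file.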

@asmmahmoud
Author

Thank you for your clarification.
