Lastz error: terminated with an error exit status (1) #70

Open
Hannah1746 opened this issue Nov 8, 2024 · 39 comments

Comments

@Hannah1746

Hey, I have been trying to run make_chains.py locally, but the jobs keep failing with "terminated with an error exit status (1)".

(TOGA) hwaterma@cast-bio540ws02:~/Documents/MX_TOGA$ ../make_lastz_chains/make_chains.py MX DR MX_HiC_50chr_debris.fasta Danio.fa --pd test_out --chaining_memory 20 --executor local -f

Make Lastz Chains

Version 2.0.8
Commit: 88f4f39
Branch: main

  • found run_lastz.py at /home/hwaterma/Documents/make_lastz_chains/standalone_scripts/run_lastz.py
  • found run_lastz_intermediate_layer.py at /home/hwaterma/Documents/make_lastz_chains/standalone_scripts/run_lastz_intermediate_layer.py
  • found chain_gap_filler.py at /home/hwaterma/Documents/make_lastz_chains/standalone_scripts/chain_gap_filler.py
  • found faToTwoBit at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/faToTwoBit
  • found twoBitToFa at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/twoBitToFa
  • found pslSortAcc at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/pslSortAcc
  • found axtChain at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/axtChain
  • found axtToPsl at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/axtToPsl
  • found chainAntiRepeat at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainAntiRepeat
  • found chainMergeSort at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainMergeSort
  • found chainCleaner at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainCleaner
  • found chainSort at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainSort
  • found chainScore at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainScore
  • found chainNet at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainNet
  • found chainFilter at /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/chainFilter
  • found lastz at /data/krablab/miniconda2/envs/TOGA/bin/lastz
  • found nextflow at /data/krablab/miniconda2/envs/TOGA/bin/nextflow
    All necessary executables found.
    Making chains for MX_HiC_50chr_debris.fasta and Danio.fa files, saving results to /home/hwaterma/Documents/MX_TOGA/test_out
    Pipeline started at 2024-11-08 11:24:48.221388
  • Setting up genome sequences for target
    genomeID: MX
    input sequence file: MX_HiC_50chr_debris.fasta
    is 2bit: False
    planned genome dir location: /home/hwaterma/Documents/MX_TOGA/test_out/target.2bit
    Initial fasta file MX_HiC_50chr_debris.fasta saved to /home/hwaterma/Documents/MX_TOGA/test_out/target.2bit
    For MX (target) sequence file: /home/hwaterma/Documents/MX_TOGA/test_out/target.2bit; chrom sizes saved to: /home/hwaterma/Documents/MX_TOGA/test_out/target.chrom.sizes
  • Setting up genome sequences for query
    genomeID: DR
    input sequence file: Danio.fa
    is 2bit: False
    planned genome dir location: /home/hwaterma/Documents/MX_TOGA/test_out/query.2bit
    Initial fasta file Danio.fa saved to /home/hwaterma/Documents/MX_TOGA/test_out/query.2bit
    For DR (query) sequence file: /home/hwaterma/Documents/MX_TOGA/test_out/query.2bit; chrom sizes saved to: /home/hwaterma/Documents/MX_TOGA/test_out/query.chrom.sizes

Partition Step

Partitioning for target

Saving partitions and creating 26 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 26 buckets for smaller scaffolds
Saving target partitions to: /home/hwaterma/Documents/MX_TOGA/test_out/target_partitions.txt

Partitioning for query

Saving partitions and creating 59 buckets for lastz output
In particular, 40 partitions for bigger chromosomes
And 19 buckets for smaller scaffolds
Saving query partitions to: /home/hwaterma/Documents/MX_TOGA/test_out/query_partitions.txt
Num. target partitions: 0
Num. query partitions: 40
Num. lastz jobs: 0

Lastz Alignment Step

LASTZ: making jobs
LASTZ: saved 1534 jobs to /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /data/krablab/miniconda2/envs/TOGA/bin/nextflow /home/hwaterma/Documents/make_lastz_chains/parallelization/execute_joblist.nf --joblist /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_run/lastz_joblist.txt -c /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_run/lastz_config.nf

N E X T F L O W ~ version 24.10.0
Launching /home/hwaterma/Documents/make_lastz_chains/parallelization/execute_joblist.nf [deadly_ekeblad] DSL2 - revision: 0432e25129

executor > local (27)
[e2/21114c] execute_jobs (34) [ 2%] 17 of 769, failed: 17, retries: 17
[80/3a5e0e] NOTE: Process execute_jobs (7) terminated with an error exit status (1) -- Execution is retried (1)
executor > local (27)
[70/2f69ec] execute_jobs (12) [ 2%] 18 of 824, failed: 18, retries: 18
[80/3a5e0e] NOTE: Process execute_jobs (7) terminated with an error exit status (1) -- Execution is retried (1)
[cb/61e398] NOTE: Process execute_jobs (26) terminated with an error exit status (1) -- Execution is retried (1)
[0f/1cc334] NOTE: Process execute_jobs (39) terminated with an error exit status (1) -- Execution is retried (1)
[7c/1aa0fe] NOTE: Process execute_jobs (16) terminated with an error exit status (1) -- Execution is retried (1)
[aa/6ec182] NOTE: Process execute_jobs (20) terminated with an error exit status (1) -- Execution is retried (1)
[b5/7cb776] NOTE: Process execute_jobs (21) terminated with an error exit status (1) -- Execution is retried (1)
[71/1845b5] NOTE: Process execute_jobs (27) terminated with an error exit status (1) -- Execution is retried (1)
[34/8005d1] NOTE: Process execute_jobs (25) terminated with an error exit status (1) -- Execution is retried (1)
[df/64091e] NOTE: Process execute_jobs (22) terminated with an error exit status (1) -- Execution is retried (1)
[f6/8f96ed] NOTE: Process execute_jobs (33) terminated with an error exit status (1) -- Execution is retried (1)
[04/06b5b4] NOTE: Process execute_jobs (10) terminated with an error exit status (1) -- Execution is retried (1)
[98/051c7e] NOTE: Process execute_jobs (5) terminated with an error exit status (1) -- Execution is retried (1)
[09/e4de98] NOTE: Process execute_jobs (3) terminated with an error exit status (1) -- Execution is retried (1)
[c2/153283] NOTE: Process execute_jobs (37) terminated with an error exit status (1) -- Execution is retried (1)
[36/f2dec6] NOTE: Process execute_jobs (29) terminated with an error exit status (1) -- Execution is retried (1)
[33/2420be] NOTE: Process execute_jobs (40) terminated with an error exit status (1) -- Execution is retried (1)
[e2/21114c] NOTE: Process execute_jobs (34) terminated with an error exit status (1) -- Execution is retried (1)

When I check run.log to understand the error, it only shows:

Lastz Alignment Step

LASTZ: making jobs
LASTZ: saved 1534 jobs to /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /data/krablab/miniconda2/envs/TOGA/bin/nextflow /home/hwaterma/Documents/make_lastz_chains/parallelization/execute_joblist.nf --joblist /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_run/lastz_joblist.txt -c /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_run/lastz_config.nf

and no .nextflow directory or .nextflow.log file gets created.

I have been trying to troubleshoot this on my own, but I just can't work out where the code is getting stuck.

@MichaelHiller
Collaborator

Could you pls try to run a single lastz job on the command line? There is something systematically wrong. Likely some input files are not found or something similar.
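A minimal sketch of that suggestion: take one line from lastz_joblist.txt and run it on its own so that the exit status and stderr are visible. It assumes the joblist holds one shell command per line; the helper names `read_jobs` and `run_job` are hypothetical and not part of make_lastz_chains.

```python
import shlex
import subprocess
from pathlib import Path


def read_jobs(joblist_path):
    """Return the non-empty lines of a joblist file (assumed: one command per line)."""
    lines = Path(joblist_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def run_job(command_line):
    """Run one job and return (exit_code, stderr_text) without raising on failure."""
    result = subprocess.run(
        shlex.split(command_line),  # split like a POSIX shell would
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr
```

For example, `run_job(read_jobs("test_out/temp_lastz_run/lastz_joblist.txt")[0])` would run the first job and return whatever it wrote to stderr, which is usually where the reason for an exit status of 1 shows up.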

@kirilenkobm Could you pls have a look? There must be log files that indicate why the lastz jobs die.
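On the log files: Nextflow keeps a directory per task under its work directory, named after the bracketed tag in the console output (for example `[e2/21114c]`), and the task's stderr is stored there as `.command.err`. The sketch below collects those files for one tag. Where the work directory lives for this pipeline is an assumption the caller has to supply, and `task_stderr` is a hypothetical helper name.

```python
from pathlib import Path


def task_stderr(work_root, task_tag):
    """Map each matching Nextflow task directory to the text of its .command.err.

    work_root: the Nextflow work directory (its location is an assumption here).
    task_tag:  the bracketed prefix from the console, e.g. "e2/21114c".
    """
    prefix, partial_hash = task_tag.split("/")
    reports = {}
    for task_dir in sorted(Path(work_root, prefix).glob(partial_hash + "*")):
        err_file = task_dir / ".command.err"
        if err_file.is_file():
            reports[str(task_dir)] = err_file.read_text(errors="replace")
    return reports
```

The same directories also hold `.command.sh` (the script that was run) and `.command.log`, which can be read the same way.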

@Hannah1746
Author

I ran one of the jobs from lastz_joblist.txt:

/home/hwaterma/Documents/make_lastz_chains/standalone_scripts/run_lastz_intermediate_layer.py BULK_1:/home/hwaterma/Documents/MX_TOGA/test_out/target.2bit:MYX_Chr1 /home/hwaterma/Documents/MX_TOGA/test_out/query.2bit:NC_007112_7:0-50000000 /home/hwaterma/Documents/MX_TOGA/test_out/pipeline_parameters.json /home/hwaterma/Documents/MX_TOGA/test_out/temp_lastz_psl_output/bucket_ref_bulk_1/BULK_1_NC_007112_7__1.psl /home/hwaterma/Documents/make_lastz_chains/standalone_scripts/run_lastz.py --output_format psl --axt_to_psl /home/hwaterma/Documents/make_lastz_chains/HL_kent_binaries/axtToPsl

That has been running for a few hours now with no errors.

@MichaelHiller
Collaborator

Well that is promising.
You align a whole chr1 (how large is that??) against a 50Mb chunk.
If you want to run the test faster, maybe align it to a 5 or 10 Mb chunk

@Hannah1746
Author

Our first chromosome is 69,199,620 bp.

The command did run to completion too.

I just do not understand why the larger script is not working.

@MichaelHiller
Collaborator

I have no idea. Could you pls check how much memory the job required and how long it ran?
If you run
/usr/bin/time -v lastz ....
this gives you run time and max memory peak consumption.

Maybe your jobs get killed by Slurm on the cluster because they don't get enough memory or runtime?

Did you test running lastz on the actual compute nodes that run the Slurm jobs? Maybe something is not correctly configured there.

Do the jobs die immediately or only after running for a few hours?

Also, maybe test splitting the genomes into much smaller chunks to see if these jobs succeed.
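To read the two numbers asked for here out of a `/usr/bin/time -v` report, a small parser along these lines may help. It relies only on the field labels that appear in the reports pasted in this thread; the function names are hypothetical.

```python
def parse_time_v(report):
    """Turn the "Key: value" lines of a `/usr/bin/time -v` report into a dict."""
    fields = {}
    for line in report.splitlines():
        # Split on the last ": " so keys like "(h:mm:ss or m:ss)" stay intact.
        key, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[key] = value
    return fields


def elapsed_seconds(value):
    """Convert "h:mm:ss" or "m:ss.ss" into seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize(report):
    """Return (peak memory in MiB, wall-clock seconds) from a report."""
    fields = parse_time_v(report)
    peak_kb = int(fields["Maximum resident set size (kbytes)"])
    wall = elapsed_seconds(fields["Elapsed (wall clock) time (h:mm:ss or m:ss)"])
    return peak_kb / 1024, wall
```

Note that `/usr/bin/time` writes its report to stderr, so when capturing it from a script, stderr is the stream to save.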

@Hannah1746
Author

hwaterma@cast-bio540ws02:~/Documents/MX_TOGA$ /usr/bin/time -v lastz

You must specify a target file
lastz-- Local Alignment Search Tool, blastZ-like
(version 1.04.15 released 20210827)
usage: lastz target [query] [options]
(common options; use --help for a more extensive list)
target, query specifiers or files, containing sequences to align
(use --help=files for more details)
--seed= set seed pattern (12of19, 14of22, or general pattern)
(default is 1110100110010101111)
--[no]transition allow (or don't) one transition in a seed hit
(by default a transition is allowed)
--[no]chain perform chaining
(by default no chaining is performed)
--[no]gapped perform gapped alignment (instead of gap-free)
(by default gapped alignment is performed)
--step= set step length (default is 1)
--strand=both search both strands
--strand=plus search + strand only (matching strand of query spec)
(by default both strands are searched)
--scores= read substitution and gap scores from a file
--xdrop= set x-drop threshold (default is 10sub[A][A])
--ydrop= set y-drop threshold (default is open+300extend)
--infer[=] infer scores from the sequences, then use them
all inference options are read from the control file
--hspthresh= set threshold for high scoring pairs (default is 3000)
ungapped extensions scoring lower are discarded
can also be a percentage or base count
--gappedthresh= set threshold for gapped alignments
gapped extensions scoring lower are discarded
can also be a percentage or base count
(default is to use same value as --hspthresh)
--include= read command line arguments from a text file
--help list "all" options (but the online documentation is
more complete)
--help=files list information about file specifiers
--help=shortcuts list blastz-compatible shortcuts
--help=defaults list scoring defaults for your current settings
--help=yasra list yasra-specific shortcuts

See the online documentation at http://www.bx.psu.edu/~rsharris/lastz for
the most up-to-date information.
Command exited with non-zero status 1
Command being timed: "lastz"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 33%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2240
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 97
Voluntary context switches: 1
Involuntary context switches: 0
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1

The job dies immediately on our local computer. This computer does not have a time window.

I have gotten it to run locally on our computer cluster, but it does not finish within the 3-day window we are allowed. It does not run on Slurm, but I am working with the cluster IT on that issue.

@MichaelHiller
Collaborator

After /usr/bin/time -v
which is just the command to track time and mem consumption,
pls provide the full lastz command with all parameters

@Hannah1746
Author

I don't think I am doing this right. I'm sorry for being a little slow with this computer coding, but this is what I ran and got:
(TOGA) hwaterma@cast-bio540ws02:~/Documents/MX_TOGA$ /usr/bin/time -v lastz /home/hwaterma/Documents/MX_TOGA/test_out/target.2bit:MYX_Chr1 /home/hwaterma/Documents/MX_TOGA/test_out/query.2bit:NC_007112_7:0-50000000
FAILURE: fopen_or_die failed to open "/home/hwaterma/Documents/MX_TOGA/test_out/target.2bit:MYX_Chr1" for "rb"
Command exited with non-zero status 1
Command being timed: "lastz /home/hwaterma/Documents/MX_TOGA/test_out/target.2bit:MYX_Chr1 /home/hwaterma/Documents/MX_TOGA/test_out/query.2bit:NC_007112_7:0-50000000"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 100%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2880
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 227
Voluntary context switches: 1
Involuntary context switches: 0
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1

It seems that it can not read the target file.
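One possible reading of the `fopen_or_die` message is that lastz took the whole argument, including the `:MYX_Chr1` suffix, as a file name, since that is the string it reports failing to open. The colon-separated form comes from the pipeline's joblist, whereas lastz itself, as far as its documentation describes (see `lastz --help=files`), selects a sequence from a 2bit file as `file.2bit/name` with an optional `[start..end]` subrange. The sketch below converts one form to the other. It is an illustration under assumptions, not the pipeline's own conversion: the function name is hypothetical, the joblist range is assumed to be 0-based and half-open, and the `BULK_...` multi-sequence form is not handled.

```python
def to_lastz_spec(pipeline_spec):
    """Convert "file.2bit:seq" or "file.2bit:seq:start-end" to a lastz-style specifier.

    Assumes the file path itself contains no colon.
    """
    parts = pipeline_spec.split(":")
    if len(parts) == 2:
        path, seq = parts
        return f"{path}/{seq}"
    if len(parts) == 3:
        path, seq, span = parts
        start, end = (int(x) for x in span.split("-"))
        # Assumption: joblist ranges are 0-based, half-open;
        # lastz's [a..b] subrange is 1-based, inclusive.
        return f"{path}/{seq}[{start + 1}..{end}]"
    raise ValueError(f"unexpected specifier: {pipeline_spec!r}")
```

With this, `to_lastz_spec("test_out/target.2bit:MYX_Chr1")` gives `test_out/target.2bit/MYX_Chr1`. On the command line the bracketed form needs quoting so the shell does not interpret the brackets.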

@MichaelHiller
Collaborator

your command /usr/bin/time -v lastz etc is all correct, but it outputs an error
FAILURE: fopen_or_die failed to open "/home/hwaterma/Documents/MX_TOGA/test_out/target.2bit:MYX_Chr1" for "rb"

which means the lastz command without measuring time and mem consumption (via /usr/bin/time -v ) should also crash immediately.

Is the lastz command exactly the command that was running for a long time?

Maybe try this to see if your /usr/bin/time -v works.

/usr/bin/time -v find . | wc
Command being timed: "find ."
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 1%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.53
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3032
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 217
Voluntary context switches: 66
Involuntary context switches: 0
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

@Hannah1746
Author

when I run that I get:
/usr/bin/time -v find . | wc Command being timed: "find ." User time (seconds): 0.00 System time (seconds): 0.00 Percent of CPU this job got: 1% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.53 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 3032 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 217 Voluntary context switches: 66 Involuntary context switches: 0 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0
-bash: syntax error near unexpected token `('

@MichaelHiller
Collaborator

Exit status: 0
means that this test works.

Where is this coming from? -bash: syntax error near unexpected token `('

@Hannah1746
Author

when I run /usr/bin/time -v find . | wc Command being timed: "find ." User time (seconds): 0.00 System time (seconds): 0.00 Percent of CPU this job got: 1% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.53 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 3032 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 217 Voluntary context switches: 66 Involuntary context switches: 0 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0

the output is:
-bash: syntax error near unexpected token `('

@MichaelHiller
Collaborator

Really weird. Here is what I get.
[image: screenshot of the command output]

@MichaelHiller
Collaborator

Do you have a proper Linux system?

@Hannah1746
Author

Oh, I am sorry, I was pasting the full text, output included, as the command.

Here is just /usr/bin/time -v find . | wc:
(TOGA) hwaterma@cast-bio540ws02:~/Documents/MX_TOGA$ /usr/bin/time -v find . | wc
Command being timed: "find ."
User time (seconds): 0.04
System time (seconds): 0.11
Percent of CPU this job got: 36%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.43
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2560
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 136
Voluntary context switches: 1299
Involuntary context switches: 1
Swaps: 0
File system inputs: 10816
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
4824 4824 316858

@MichaelHiller
Collaborator

Well then everything works.
Pls fix the FAILURE: fopen_or_die failed to open "/home/hwaterma/Documents/MX_TOGA/test_out/target.2bit:MYX_Chr1" for "rb"
This indicates the file is not readable or accessible.
Maybe check the permissions.

Afterwards, the lastz run should also run.
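A quick way to run that check from Python, as a sketch: it tests whether a given path exists, is a regular file, and can be opened in "rb" mode by the current user. `check_readable` is a hypothetical helper name.

```python
import os


def check_readable(path):
    """Report whether path is a regular file the current user can open for reading."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a regular file"
    if not os.access(path, os.R_OK):
        return "no read permission"
    try:
        # Mirror the "rb" open that the error message refers to.
        with open(path, "rb") as handle:
            handle.read(1)
    except OSError as error:
        return f"open failed: {error}"
    return "ok"
```

Running it on both the plain `target.2bit` path and the exact string quoted in the FAILURE message shows whether the problem is the file itself or the name being passed.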

@Hannah1746
Author

(TOGA) hwaterma@cast-bio540ws02:~/Documents/MX_TOGA/test_out$ ls -lh
total 1.1G
-rw-rw-r-- 1 hwaterma biograd 1.5K Nov 8 11:28 pipeline_parameters.json
-rw-rw-r-- 1 hwaterma biograd 483M Nov 8 11:28 query.2bit
-rw-rw-r-- 1 hwaterma biograd 41K Nov 8 11:28 query.chrom.sizes
-rw-rw-r-- 1 hwaterma biograd 32K Nov 8 11:28 query_partitions.txt
-rw-rw-r-- 1 hwaterma biograd 3.9K Nov 8 11:28 run.log
-rw-rw-r-- 1 hwaterma biograd 217 Nov 8 11:28 steps.json
-rw-rw-r-- 1 hwaterma biograd 559M Nov 8 11:28 target.2bit
-rw-rw-r-- 1 hwaterma biograd 6.4K Nov 8 11:28 target.chrom.sizes
-rw-rw-r-- 1 hwaterma biograd 5.6K Nov 8 11:28 target_partitions.txt
drwxrwxr-x 5 hwaterma biograd 4.0K Nov 8 11:28 temp_chain_run
drwxrwxr-x 2 hwaterma biograd 4.0K Nov 8 11:28 temp_concat_lastz_output
drwxrwxr-x 4 hwaterma biograd 4.0K Nov 8 11:28 temp_fill_chain
drwxrwxr-x 2 hwaterma biograd 4.0K Nov 8 11:28 temp_kent
drwxrwxr-x 28 hwaterma biograd 4.0K Nov 8 11:28 temp_lastz_psl_output
drwxrwxr-x 4 hwaterma biograd 4.0K Nov 8 11:29 temp_lastz_run

The file is created by the pipeline and is both readable and writable. I am not sure how to fix that.

@MichaelHiller
Collaborator

No idea. Pls send me this folder gzipped for download somewhere and I'll run this lastz locally on my system.
Let's see if that works.

@Hannah1746
Author

I tarred the files I am using. Could I get an email address to send them to?

Thank you so much again for your time on this!

@MichaelHiller
Collaborator

The 2bit files will be too large for email.
Pls also gzip it.
Then maybe put it on google drive for me to download.

@Hannah1746
Author

Hannah1746 commented Nov 19, 2024

@MichaelHiller
Collaborator

pls use my senckenberg email https://tbg.senckenberg.de/hillerlab/contact-2/

@MichaelHiller
Collaborator

I downloaded the files, but I would need the exact command you are running.
The files are Danio and not target.2bit.

Also lastz takes parameters.

@MichaelHiller
Collaborator

I think I found the problem. While 48.4% of the danio assembly is lower case = softmasked, you don't have any masking for the query.
twoBitToFa MX.2bit stdout | faSize stdin
2340801162 bases (65000 N's 2340736162 real 2340736162 upper 0 lower) in 384 sequences in 1 files

Pls run RepeatModeler2 on it, and use the resulting lib for repeatMasking.
Then the lastz pipe will likely work.

Not (properly) masking is the #1 issue that people have when the pipe is not running smoothly :-)
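For a quick masking check without the Kent tools, the same counts that faSize reports (upper, lower, N) can be tallied from a FASTA file. This is a minimal sketch with hypothetical function names; a 2bit file would first need converting to FASTA, for example with twoBitToFa as above.

```python
def mask_stats(fasta_lines):
    """Count upper-case, lower-case (soft-masked) and N bases in FASTA lines."""
    counts = {"upper": 0, "lower": 0, "n": 0}
    for line in fasta_lines:
        if line.startswith(">"):
            continue  # skip sequence headers
        for base in line.strip():
            if base in "Nn":
                counts["n"] += 1
            elif base.islower():
                counts["lower"] += 1
            elif base.isupper():
                counts["upper"] += 1
    return counts


def soft_masked_fraction(counts):
    """Fraction of non-N bases that are lower case."""
    real = counts["upper"] + counts["lower"]
    return counts["lower"] / real if real else 0.0
```

Used as `with open("MX.fa") as fh: print(soft_masked_fraction(mask_stats(fh)))`, a result of 0.0 would correspond to the "0 lower" seen in the faSize output above, meaning the assembly carries no soft-masking.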

@Hannah1746
Author

I masked my genome and it is still not running. I have reached out to our school tech support. I am hoping to get this figured out and will let you know if we ever get to the bottom of it.

@MauriAndresMU1313

Hi!
Were you able to run the command? I checked some posts related to the same issue: NOTE: Process execute_jobs (34) terminated with an error exit status (1) -- Execution is retried (1).
Even though I was using a masked genome downloaded from ensembl, I still have the same issue. In one post, one of the authors mentioned that it is a Nextflow issue and that the solution is to change the version. However, even when I use another version, the issue persists.
So, I would like to understand if this issue is related to memory or to Nextflow. This issue looks like a communication problem between Nextflow and Slurm, but it might be more complicated than that.
Is there another alternative to get the chain files?

@MichaelHiller
Collaborator

Pls send me the masked genome files such that we test the same thing.
Pls let me know if any of your lastz jobs finish or run at all (if not, that is most likely a technical problem).
And pls send me the exact lastz command that you are running

@MauriAndresMU1313

Great, thank you, Michael!
I just sent you the Google Drive link. So far, none of my commands work and I do not have any .nextflow.log file to check.
This is the current command that I'm running:

./make_chains.py dog ferret run_01/c_familiaris.2bit run_01/m_putorius_furo.2bit --pd run-01_out -f --chaining_memory 16 --seq1_limit 175 --seq2_limit 50 --cluster_executor slurm --cluster_queue togaslurm*

When you mentioned the lastz command, I first checked run.log:

LASTZ: making jobs
LASTZ: saved 3996 jobs to /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /home/mmora30/.local/bin/nextflow /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/parallelization/execute_joblist.nf --joblist /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/temp_lastz_run/lastz_joblist.txt -c /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/temp_lastz_run/lastz_config.nf

then the individual command:

/localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/standalone_scripts/run_lastz_intermediate_layer.py BULK_20:/localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/target.2bit:JAAUVH010000216:JAAUVH010000171:JAAUVH010000025:JAAUVH010000271:JAAUVH010000217:JAAUVH010000165:JAAUVH010000019:JAAUVH010000275:JAAUVH010000230:JAAUVH010000016:JAAUVH010000078:JAAUVH010000221:JAAUVH010000282:JAAUVH010000014:JAAUVH010000013:JAAUVH010000326:JAAUVH010000117:JAAUVH010000101:JAAUVH010000128:JAAUVH010000325:JAAUVH010000018:JAAUVH010000072:JAAUVH010000021:JAAUVH010000181:JAAUVH010000094:JAAUVH010000135:JAAUVH010000126:JAAUVH010000169:JAAUVH010000273:JAAUVH010000077:JAAUVH010000350:JAAUVH010000254:JAAUVH010000152:JAAUVH010000227:JAAUVH010000263:JAAUVH010000341:JAAUVH010000145:JAAUVH010000112:JAAUVH010000001:JAAUVH010000156 BULK_134:/localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/query.2bit:AEYP01117455:AEYP01117479:AEYP01117478:AEYP01117477:AEYP01117476:AEYP01117475:AEYP01117474:AEYP01117473:AEYP01117472:AEYP01117471:AEYP01117470:AEYP01117469 /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/pipeline_parameters.json /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/run-01_out/temp_lastz_psl_output/bucket_ref_bulk_20/BULK_20_BULK_134__3996.psl /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/standalone_scripts/run_lastz.py --output_format psl --axt_to_psl /localData/workspace_mm/temporal_tasks/lastz-chains/make_lastz_chains/HL_kent_binaries/axtToPsl

Any comments?

@MichaelHiller
Collaborator

Actually, it could be easier if you tell me the NCBI accession of the dog and the ferret genome.
I'll try to run the alignments on our system, and if it works, I can send you the chains.

@MichaelHiller
Collaborator

Seems like your ferret is a very old assembly. I would recommend GCF_011764305.1.

For dog, I would recommend GCA_011100685.1 = canFam4, which is better than the more recent assemblies.
I have both assemblies ready and repeat-masked.
Let me know if that would help you

@MauriAndresMU1313

Sounds great!
Actually, I'm using Ensembl genomes, and I tried to use the latest version available for three options:
Human-ferret, mouse-ferret and dog-ferret. I know that there are results available for the first two; however, those are from 2020-21, so I intend to run them again with release 113 from Ensembl.
Here are the download links:

ferret
https://ftp.ensembl.org/pub/release-113/fasta/mustela_putorius_furo/dna/Mustela_putorius_furo.MusPutFur1.0.dna_rm.toplevel.fa.gz
dog
https://ftp.ensembl.org/pub/release-113/fasta/canis_lupus_familiaris/dna/Canis_lupus_familiaris.ROS_Cfam_1.0.dna_rm.toplevel.fa.gz
mouse
https://ftp.ensembl.org/pub/release-113/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna_rm.toplevel.fa.gz
human
https://ftp.ensembl.org/pub/release-113/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_rm.toplevel.fa.gz

Please, let me know if I can help in something with the pairwise alignments to obtain the chains!
Thank you for your help!

@MauriAndresMU1313

@MichaelHiller Please let me know if I can help with something, I will be attentive to any comment! Thank you again

@MichaelHiller
Collaborator

Dog to ferret chains are done. I'll send you everything via a download link

@MauriAndresMU1313

Sounds great, I answered you by mail!
Thank you again for the help!

@Hannah1746
Author

I still have not gotten this to run on any of the computers I tried. My genomes are both on NCBI:

https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_019703515.2/
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000002035.6/

Would you also have time to run these and generate the chain file?

@MichaelHiller
Collaborator

I can try. Which is the reference? danRer11?

@Hannah1746
Author

Yes please :)

@MichaelHiller
Collaborator

Is running now. Pls send me a quick email and I'll share the chains once they are done.

@MichaelHiller
Collaborator

Pls find the chains and nets on https://genome.senckenberg.de/download/forHannahFish/
and send me a ping when you downloaded it.
