
PBSpro support #60

Open
mictadlo opened this issue May 9, 2024 · 7 comments

@mictadlo

mictadlo commented May 9, 2024

Hi,
Our HPC uses PBSpro. Does make_lastz_chains support PBSpro?

Best wishes,

Michal

@MichaelHiller
Collaborator

Likely not, but my understanding is that we use Nextflow to schedule the jobs. So if Nextflow can communicate with PBSpro, it may work.
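
For reference, Nextflow selects the scheduler through the process executor. A minimal config sketch (the queue name and project code below are hypothetical placeholders, and make_lastz_chains generates its own Nextflow config, so this only illustrates what the scheduler side needs):

process {
    executor = 'pbspro'               // submit jobs via PBS Professional's qsub
    queue = 'workq'                   // hypothetical queue name
    clusterOptions = '-P my_project'  // site-specific extras, e.g. a project code
}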

@mictadlo
Author

mictadlo commented May 9, 2024

According to the Nextflow documentation, PBSpro is supported. However, I failed to get it running as follows:

> ./make_chains.py target query test_data/test_reference.fa test_data/test_query.fa --pd test_out -f --chaining_memory 16 --cluster_executor pbspro --cluster_queue test
# Make Lastz Chains #
Version 2.0.8
Commit: 187e313afc10382fe44c96e47f27c4466d63e114
Branch: main

* found run_lastz.py at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/standalone_scripts/run_lastz.py
* found run_lastz_intermediate_layer.py at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/standalone_scripts/run_lastz_intermediate_layer.py
* found chain_gap_filler.py at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/standalone_scripts/chain_gap_filler.py
* found faToTwoBit at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/faToTwoBit
* found twoBitToFa at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/twoBitToFa
* found pslSortAcc at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/pslSortAcc
* found axtChain at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/axtChain
* found axtToPsl at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/axtToPsl
* found chainAntiRepeat at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainAntiRepeat
* found chainMergeSort at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainMergeSort
* found chainCleaner at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainCleaner
* found chainSort at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainSort
* found chainScore at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainScore
* found chainNet at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainNet
* found chainFilter at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainFilter
* found lastz at /work/waterhouse_team/miniconda2/envs/makeLastzChains/bin/lastz
* found nextflow at /home/lorencm/bin/nextflow
All necessary executables found.
Making chains for test_data/test_reference.fa and test_data/test_query.fa files, saving results to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out
Pipeline started at 2024-05-10 08:46:21.499906
* Setting up genome sequences for target
genomeID: target
input sequence file: test_data/test_reference.fa
is 2bit: False
planned genome dir location: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.2bit
Initial fasta file test_data/test_reference.fa saved to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.2bit
For target (target) sequence file: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.2bit; chrom sizes saved to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.chrom.sizes
* Setting up genome sequences for query
genomeID: query
input sequence file: test_data/test_query.fa
is 2bit: False
planned genome dir location: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.2bit
Initial fasta file test_data/test_query.fa saved to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.2bit
For query (query) sequence file: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.2bit; chrom sizes saved to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.chrom.sizes

### Partition Step ###

# Partitioning for target
Saving partitions and creating 1 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 1 buckets for smaller scaffolds
Saving target partitions to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target_partitions.txt
# Partitioning for query
Saving partitions and creating 1 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 1 buckets for smaller scaffolds
Saving query partitions to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query_partitions.txt
Num. target partitions: 0
Num. query partitions: 0
Num. lastz jobs: 0

### Lastz Alignment Step ###

LASTZ: making jobs
LASTZ: saved 1 jobs to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /home/lorencm/bin/nextflow /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/execute_joblist.nf --joblist /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt -c /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_config.nf
N E X T F L O W  ~  version 23.10.1
Launching `/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/execute_joblist.nf` [gigantic_lorenz] DSL2 - revision: 0483b29723
[12/2b01f3] process > execute_jobs (1) [100%] 4 of 4, failed: 4, retries: 3
[1c/6ff42d] NOTE: Error submitting process 'execute_jobs (1)' for execution -- Execution is retried (1)
[46/bab438] NOTE: Error submitting process 'execute_jobs (1)' for execution -- Execution is retried (2)
[4b/caee10] NOTE: Error submitting process 'execute_jobs (1)' for execution -- Execution is retried (3)
ERROR ~ Error executing process > 'execute_jobs (1)'

Caused by:
  Failed to submit process to grid scheduler for execution

Command executed:

  qsub -N nf-execute_jobs .command.run

Command exit status:
  159

Command output:
  qsub: Unauthorized Request 

Work dir:
  /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/work/12/2b01f39c7ef951786a32513d22ccc9

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details


### Error! The nextflow process lastz crashed!
Please look at the logs in the /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run
An error occurred while executing lastz: Jobs for lastz at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt died
Traceback (most recent call last):
  File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/modules/step_manager.py", line 70, in execute_steps
    step_result = step_to_function[step](params, project_paths, step_executables)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/modules/pipeline_steps.py", line 52, in lastz_step
    do_lastz(params, project_paths,  executables)
  File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/steps_implementations/lastz_step.py", line 99, in do_lastz
    execute_nextflow_step(
  File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/nextflow_wrapper.py", line 157, in execute_nextflow_step
    nextflow_manager.check_failed()
  File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/nextflow_wrapper.py", line 109, in check_failed
    raise NextflowProcessError(f"Jobs for {self.label} at {self.joblist_path} died")
modules.error_classes.NextflowProcessError: Jobs for lastz at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt died
> less test_out/.nextflow.log
test_out/.nextflow.log: No such file or directory

What did I do wrong?

Best wishes,

Michal

@MichaelHiller
Collaborator

Sorry, I don't know. I have zero experience with PBSpro.

@ohdongha

ohdongha commented May 10, 2024

Hi! Sorry to piggyback on this thread.

I also had trouble running make_lastz_chains on an HPC that runs PBS, likely due to some internal configuration of the HPC. After some trial and error, I ended up running make_lastz_chains (the original v1.0.0) by submitting the entire job to a single compute node with multiple (N) cores, using --executor local --executor_queuesize $N (--executor local can be omitted since that is the default).

In my case, a node with N=32 was good enough for aligning mammalian-sized genomes (or any genome <16 Gb), and there are steps where RAM appears to matter more than the number of threads.

If you have compute nodes with a reasonable number of cores, perhaps this approach would work?
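
A minimal sketch of such a single-node PBS job script; the resource values, walltime, and file names are assumptions to adjust for your site and genomes:

#!/bin/bash
#PBS -N make_lastz_chains
#PBS -l select=1:ncpus=32:mem=360gb    # one full node: 32 cores, generous RAM
#PBS -l walltime=96:00:00              # large genomes can take days

cd "$PBS_O_WORKDIR"                    # run from the submission directory
./make_chains.py target query target.fa query.fa --pd out_dir -f \
    --executor local --executor_queuesize 32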

Cheers,
Dong-Ha

@MichaelHiller
Collaborator

Thanks for the feedback. Of course, running it on a single node may work; these days CPUs have 128 or 192 cores. It will take a few days to finish, though.

Maybe @kirilenkobm has insights into PBSpro or how to fix the problem?

@mictadlo
Author

Hi @ohdongha, how much memory did you need for your mammalian-sized genomes? I want to run it on a 3 Gb allotetraploid plant genome.

Best wishes,

Michal

@ohdongha

ohdongha commented May 11, 2024

> Hi @ohdongha, how much memory did you need for your mammalian-sized genomes? I want to run it on a 3 Gb allotetraploid plant genome.

I typically ask for 360 GB and 32 cores, to be on the safe side. In most cases, max_vmem does not exceed 200 GB. I think the key, which @MichaelHiller also always emphasizes, is to soft-mask the repeats as much as possible.
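
As a quick sanity check, the soft-masked (lowercase a/c/g/t) fraction of a FASTA can be estimated before running the pipeline; a rough sketch, with the file name as a placeholder:

awk '!/^>/ {total += length($0); gsub(/[^acgt]/, ""); masked += length($0)}
     END {printf "soft-masked: %.1f%% of %d bp\n", 100 * masked / total, total}' query.fa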

Cheers,
Dong-Ha
