Issues with obsconf #1546

Open
Koketso01 opened this issue Oct 23, 2023 · 6 comments
@Koketso01

I keep hitting the following error with the latest version of CARACal. With the earlier version of CARACal I don't get it (I think it comes from the obsconf worker):

Successful readonly open of default-locked table /stimela_mount/msdir/j025740-220946_1_h.ms::POLARIZATION: 4 columns, 1 rows

2023-10-23 13:56:28 CARACal.Stimela.summary_json-ms0 ERROR: /usr/local/bin/singularity run --workdir /home/mophahlane/caracal_data/MACj0247/.stimela_workdir-16980621375234609 --containall --userns returns error code 1
2023-10-23 13:56:28 CARACal.Stimela.summary_json-ms0 ERROR: job failed at 2023-10-23 13:56:28.823588 after 0:00:28.855668
2023-10-23 13:56:28 CARACal ERROR: Job 'summary_json-ms0:: Get observation information as a json file ms=j025740-220946_1_h.ms' failed: /usr/local/bin/singularity run --workdir /home/mophahlane/caracal_data/MACj0247/.stimela_workdir-16980621375234609 --containall --userns returns error code 1 [PipelineException]
2023-10-23 13:56:28 CARACal INFO: More information can be found in the logfile at /home/mophahlane/caracal_data/MACj0247/output/logs-20231023-135536/log-caracal.txt
2023-10-23 13:56:28 CARACal INFO: exiting with error code 1
log-caracal.txt
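
For anyone triaging a long log like the one attached, a minimal sketch of pulling out only the ERROR lines may help. The line layout (`<date> <time> <logger> <LEVEL>: <message>`) is inferred from the excerpt above and is an assumption, and `error_lines` is a hypothetical helper, not part of CARACal or Stimela.

```python
import re

# Assumed layout, inferred from the log excerpt in this thread:
# "<date> <time> <logger> <LEVEL>: <message>"
LOG_LINE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logger>\S+) (?P<level>[A-Z]+): (?P<message>.*)$"
)


def error_lines(lines):
    """Return (timestamp, logger, message) for every ERROR line."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip())
        if match and match.group("level") == "ERROR":
            found.append(
                (match.group("stamp"), match.group("logger"), match.group("message"))
            )
    return found


# Three lines copied from the report above; only the second is an ERROR line.
sample = [
    "Successful readonly open of default-locked table /stimela_mount/msdir/j025740-220946_1_h.ms::POLARIZATION: 4 columns, 1 rows",
    "2023-10-23 13:56:28 CARACal.Stimela.summary_json-ms0 ERROR: job failed at 2023-10-23 13:56:28.823588 after 0:00:28.855668",
    "2023-10-23 13:56:28 CARACal INFO: exiting with error code 1",
]

for stamp, logger, message in error_lines(sample):
    print(stamp, logger, message)
```

Lines that do not start with a timestamp (such as the casacore table messages) are skipped rather than treated as errors.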

@SpaceMeerkat

SpaceMeerkat commented Oct 24, 2023

I am also seeing this issue in CARACal runs. I rolled back to version 1.0.6 because I was seeing similar issues with the latest version a while back. For some reason they now occur with the older version too.

@o-smirnov
Member

@Athanaseus could you take a look please? @Koketso01, @SpaceMeerkat, which machines are you running on, which working directories, which virtual environments?

@SpaceMeerkat

Machine: Janis
Working directories (venv, singularity images, output directory):

  • /net/sinatra/vault-janis/dawsonj5/caracal-new
  • /net/sinatra/vault-janis/dawsonj5/sin-images
  • /net/sinatra/vault-janis/dawsonj5/MGCLS/

@Koketso01
Author

Koketso01 commented Oct 24, 2023 via email

@SpaceMeerkat

Just adding to this: the same rolled-back version 1.0.6 works on Ike, so it must be something to do with Janis rather than the CARACal distribution.

@o-smirnov
Member

I am stumped. Looking at @SpaceMeerkat's case in particular -- the transform worker runs, splits out the MS, then there's the last stage of the recipe which generates a summary.json file for the new MS:

2023-10-26 17:52:44 CARACal.Stimela INFO: Parameters validated and saved to /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862/stimela_parameter_files/summary_json_ms0_0-14062544364164816983324599434216.json
2023-10-26 17:52:44 CARACal.Stimela.summary_json-ms0-0 INFO: Starting container [summary_json_ms0_0-14062544364164816983324599434216]. Timeout set to -1. The container ID is printed below.                     
# running cd /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862 && singularity run --workdir /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862 --containall  --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862/stimela_parameter_files/summary_json_ms0_0-14062544364164816983324599434216.json:/stimela_mount/configfile:ro --bind /net/sinatra/vault-janis/dawsonj5/caracal-new/lib/python3.8/site-packages/stimela/cargo/cab/msutils/src:/stimela_mount/code:ro --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862/passwd:/etc/passwd:rw --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862/group:/etc/group:rw --bind /net/sinatra/vault-janis/dawsonj5/caracal-new/bin/stimela_runscript:/singularity:ro --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/msdir:/stimela_mount/msdir:rw --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/input:/stimela_mount/input:ro --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/output/obsinfo:/stimela_mount/output:rw --bind /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/output/obsinfo/tmp:/stimela_mount/output/tmp:rw /net/sinatra/vault-janis/dawsonj5/sin-images/stimela_msutils_1.4.6.sif /singularity
# WARNING: Overriding HOME environment variable with SINGULARITYENV_HOME is not permitted                                                                                                                        
# Successful readonly open of default-locked table /stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms: 26 columns, 6177897 rows                                                                 
# Successful readonly open of default-locked table /stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms::FIELD: 9 columns, 1 rows                                                                 
# Successful readonly open of default-locked table /stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms::SPECTRAL_WINDOW: 14 columns, 1 rows                                                      
# Successful readonly open of default-locked table /stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms::ANTENNA: 8 columns, 61 rows                                                              
# Successful readonly open of default-locked table /stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms::STATE: 7 columns, 4 rows                                                                 
# Successful readonly open of default-locked table /stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms::POLARIZATION: 4 columns, 1 rows                                                          
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR: cd /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862 && singularity run --workdir /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862 --containall returns error code 1
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR: job failed at 2023-10-26 17:52:57.040826 after 0:00:12.051965                                                                                      
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR: Traceback (most recent call last):                                                                                                                 
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:   File "/net/sinatra/vault-janis/dawsonj5/caracal-new/lib/python3.8/site-packages/stimela/recipe.py", line 713, in run                             
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:     job.run_job()                                                                                                                                  
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:   File "/net/sinatra/vault-janis/dawsonj5/caracal-new/lib/python3.8/site-packages/stimela/recipe.py", line 425, in run_job                         
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:     self.job.run(output_wrangler=self.apply_output_wranglers)                                                                                      
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:   File "/net/sinatra/vault-janis/dawsonj5/caracal-new/lib/python3.8/site-packages/stimela/singularity.py", line 123, in run                        
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:     utils.xrun(f"cd {self.execdir} && singularity run --workdir {self.execdir} --containall",                                                      
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:   File "/net/sinatra/vault-janis/dawsonj5/caracal-new/lib/python3.8/site-packages/stimela/utils/xrun_poll.py", line 227, in xrun                   
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR:     raise StimelaCabRuntimeError("{} returns error code {}".format(command_name, status))                                                          
2023-10-26 17:52:57 CARACal.Stimela.summary_json-ms0-0 ERROR: stimela.utils.StimelaCabRuntimeError: cd /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862 && singularity run --workdir /net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/.stimela_workdir-16983324577896862 --containall returns error code 1
2023-10-26 17:52:57 CARACal.Stimela.transform__avg INFO: Completed jobs : ['split_field-ms0-0', 'save-caracal_legacy-ms0', 'listobs-ms0-0']                                                                      
2023-10-26 17:52:57 CARACal.Stimela.transform__avg INFO: Remaining jobs : []                                                                                                                                     
2023-10-26 17:52:57 CARACal.Stimela.transform__avg INFO: Saving pipeline information in .last_transform__avg.json                                                                                                

The JSON file is generated and appears to be fine, and there are no additional error messages -- just that exit code of 1, seemingly out of nowhere.

I have repeated the appropriate msutils.summary() call by hand (outside of the container), and that works fine as well.
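
When rerunning a failing step by hand like this, it can help to capture the exit status and stderr separately, since the Stimela log above only reports the status. The following is a minimal sketch using the standard library only; `run_and_report` is a hypothetical helper, and the demo command is a stand-in that deliberately exits with 1, not the real `singularity run` line.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return (exit code, stdout, stderr) without raising."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Stand-in command: writes to stderr and exits with status 1.
# Replace it with the argument list of the command you want to investigate.
demo = [sys.executable, "-c", "import sys; print('boom', file=sys.stderr); sys.exit(1)"]

code, out, err = run_and_report(demo)
print("exit code:", code)
print("stderr:", err.strip())
```

If stderr is empty while the status is still non-zero, as seems to be the case here, the failure probably happens somewhere that does not print a message, which at least narrows the search.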

The junk field of the parameters is empty, so the cleanup here should be a no-op.

Here's the complete stimela parameter file for reference:

{
    'task': 'msutils',
    'base': 'stimela/msutils',
    'binary': 'msutils',
    'msdir': '/net/sinatra/vault-janis/dawsonj5/MGCLS/new/J0600.8-5835/msdir',
    'description': 'Tools for manipulating measurement sets (MSs)',
    'prefix': ' ',
    'tag': ['1.4.6'],
    'version': ['1.0.1'],
    'junk': [],
    'wranglers': [],
    'parameters': [
        {'name': 'command', 'dtype': 'str', 'info': 'MSUtils command to execute', 'required': True, 'positional': False, 'check_io': True, 'value': 'summary'},
        {'name': 'msname', 'dtype': 'file', 'info': 'MS name', 'required': False, 'positional': False, 'check_io': True, 'value': '/stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg.ms'},
        {'name': 'colname', 'dtype': 'str', 'info': 'Column name', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {
            'name': 'outfile',
            'dtype': 'file',
            'info': 'Output file for MS summary (json format)',
            'required': False,
            'positional': False,
            'check_io': False,
            'value': '/stimela_mount/msdir/j060048-583514_0_h-J0600_8_5835-corrfreqavg-summary.json'
        },
        {'name': 'display', 'dtype': 'bool', 'info': 'Display MS summary to stdout', 'required': False, 'positional': False, 'check_io': True, 'value': False},
        {'name': 'shape', 'dtype': 'str', 'info': 'Shape of column to add to MS', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'valuetype', 'dtype': 'str', 'info': 'Column data type', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'data_desc_type', 'dtype': 'str', 'info': 'Data description type for data in column to be added', 'required': False, 'positional': False, 'check_io': True, 'value': 'array'},
        {'name': 'init_with', 'dtype': 'float', 'info': 'Value to initialize new data column with', 'required': False, 'positional': False, 'check_io': True, 'value': True},
        {'name': 'col1', 'dtype': 'str', 'info': 'First column to add/subtract', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'col2', 'dtype': 'str', 'info': 'Second column to add/subtract', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'cols', 'dtype': 'list:str', 'info': 'Columns to sum', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'subtract', 'dtype': 'bool', 'info': "Subtract 'col2' from 'col1' ", 'required': False, 'positional': False, 'check_io': True, 'value': False},
        {'name': 'fromcol', 'dtype': 'str', 'info': 'Column to copy data from', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'tocol', 'dtype': 'str', 'info': 'Column to copy data to', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'addnoise', 'dtype': 'bool', 'info': "Add noise to MS. Will add to 'column/colname'", 'required': False, 'positional': False, 'check_io': True, 'value': False},
        {
            'name': 'sefd',
            'dtype': 'float',
            'info': 'System Equivalent Flux Density, in Jy. The noise will be calculated using this value',
            'required': False,
            'positional': False,
            'check_io': True,
            'value': 0
        },
        {'name': 'addToCol', 'dtype': 'str', 'info': 'Add noise to data in this column', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'noise', 'dtype': 'float', 'info': "Noise in Jy to 'column/colname' data in Jy", 'required': False, 'positional': False, 'check_io': True, 'value': 0},
        {'name': 'spw_id', 'dtype': 'int', 'info': 'SPW ID', 'required': False, 'positional': False, 'check_io': True, 'value': 0},
        {
            'name': 'verify',
            'dtype': 'bool',
            'info': 'Verifies antenna Y positions in MS. If Y coordinate convention is wrong, either fixes the positions (fix=True) or raises an error. hemisphere=-1 makes it assume that the observatory is in the Western hemisphere, hemisphere=1 in the Eastern, or else tries to find observatory name using MS and pyrap.measure',
            'required': False,
            'positional': False,
            'check_io': True,
            'value': True
        },
        {
            'name': 'mode',
            'dtype': 'str',
            'info': 'Mode when estimating spectral weights. If mode=specs, then the weights will be based on the instrument spec sensitivity that is provided via the stats_data option',
            'required': False,
            'positional': False,
            'check_io': True,
            'value': 'specs'
        },
        {'name': 'fit_order', 'dtype': 'int', 'info': 'Fit order for function used to smooth noise/weights', 'required': False, 'positional': False, 'check_io': True, 'value': 9},
        {'name': 'smooth', 'dtype': 'str', 'info': 'Function to use for smoothing the noise/weights', 'required': False, 'positional': False, 'check_io': True, 'value': 'polyn'},
        {
            'name': 'stats_data',
            'dtype': 'list/file/str',
            'info': "File or array containing information about sensitivity as a function of frequency (in Hz). For MeerKAT use the string 'use_package_meerkat_spec' unless you have your own (updated) specs",
            'required': False,
            'positional': False,
            'check_io': False,
            'value': 'use_package_meekat_spec'
        },
        {'name': 'plot_stats', 'dtype': 'file', 'info': 'Plot of estimated spectral noise/weights', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'write_to_ms', 'dtype': 'bool', 'info': 'Save estimated noise/weights in MS', 'required': False, 'positional': False, 'check_io': True, 'value': True},
        {
            'name': 'noise_columns',
            'dtype': 'list:str',
            'info': 'columns to save noise and corresponding noise spectrum',
            'required': False,
            'positional': False,
            'check_io': True,
            'value': ['SIGMA', 'SIGMA_SPECTRUM']
        },
        {
            'name': 'weight_columns',
            'dtype': 'list:str',
            'info': 'columns to save noise and corresponding noise spectrum',
            'required': False,
            'positional': False,
            'check_io': True,
            'value': ['WEIGHT', 'WEIGHT_SPECTRUM']
        },
        {'name': 'ctable', 'dtype': 'file', 'info': 'Calibration table to plot', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'tabtype', 'dtype': 'str', 'info': 'Type of the calibration table', 'required': False, 'positional': False, 'check_io': True, 'value': None},
        {'name': 'plot_dpi', 'dtype': 'int', 'info': 'DPI for the gain plot', 'required': False, 'positional': False, 'check_io': True, 'value': 600},
        {'name': 'subplot_scale', 'dtype': 'int', 'info': 'Scale for the subplots in the gain plot', 'required': False, 'positional': False, 'check_io': True, 'value': 6},
        {'name': 'plot_file', 'dtype': 'str', 'info': 'Filename for gain plot', 'required': False, 'positional': False, 'check_io': True, 'value': 'meerkathi-gai-plot'}
    ]
}
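
Most entries in that dump are unset defaults, so a short sketch for listing only the parameters that carry a value, together with the `junk` list, may make it easier to compare runs. It assumes the text is pasted exactly as shown above, i.e. as a Python literal (single quotes, `True`/`None`) rather than strict JSON, which is why `ast.literal_eval` is used instead of `json.loads`. `explicit_settings` is a hypothetical helper, and the sample is a shortened stand-in for the full dump.

```python
import ast


def explicit_settings(dump_text):
    """Return (junk list, {name: value}) for parameters whose value is not None."""
    params = ast.literal_eval(dump_text)  # safe parsing of a Python-literal dump
    set_values = {
        entry["name"]: entry["value"]
        for entry in params["parameters"]
        if entry["value"] is not None
    }
    return params["junk"], set_values


# Shortened stand-in for the dump above (same structure, fewer entries).
sample_dump = """{
    'task': 'msutils',
    'junk': [],
    'parameters': [
        {'name': 'command', 'dtype': 'str', 'value': 'summary'},
        {'name': 'colname', 'dtype': 'str', 'value': None},
        {'name': 'display', 'dtype': 'bool', 'value': False},
    ]
}"""

junk, values = explicit_settings(sample_dump)
print("junk:", junk)
for name, value in values.items():
    print(f"{name} = {value!r}")
```

Note that `False` and `0` count as set here; only `None` is treated as "not provided".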

@SpheMakh appealing to you now, since msutils and Stimela classic are both your babies -- can you spot something I can't?


5 participants