Calling a pipeline without specifying a method should use call() #8130

stscijgbot-jp · 2023-12-14T10:33:05Z

Issue JP-3490 was created on JIRA by Bryan Hilbert:

Currently, when calling a pipeline (or step?) without specifying a method e.g.:

from jwst.pipeline.calwebb_image2 import Image2Pipeline
p = Image2Pipeline('my_file_asn.json')

the pipeline will run using the run() method. Given that the run() method is not the preferred way to run the pipeline, it might be better/safer to have the pipeline in this situation default to using the call() method instead.

I'm not sure if the situation is the same for individual steps, but the same logic applies there I think.

stscijgbot-jp · 2024-01-11T15:58:05Z

Comment by Melanie Clarke on JIRA:

We recently discussed this on the NIRSpec team as well. Most of our internal users were surprised to hear that the .run method is not advised, since many demonstration notebooks use that syntax. It is also the more natural syntax – most users preferred setting parameters in a Step or Pipeline object via attributes to setting them in a dictionary to provide to the .call method. It was very surprising to users both that the run method works without warning, and that the call method quietly ignores any parameters previously set in Step or Pipeline attributes.

I think this is a design flaw in the pipeline infrastructure that can't be addressed with documentation. It is already documented in several places that .call should be used in place of .run, but having both available continues to confuse users and cause subtle problems with data reductions.

I can think of several possible approaches to fix this:

add a user warning to the .run method when it is called directly
remove the .run method from the public API entirely
find a way for the .run method to check CRDS for parameter defaults, perhaps on object instantiation

I think the last approach is preferable, since it would be least disruptive and least surprising to users.

bhilbert4 · 2024-01-11T16:11:16Z

There is a way to do the last approach. David Law's MIRI MRS notebook gets the CRDS information, and then instantiates the pipeline, sets the parameters as attributes, and then calls the run method. He has this wrapped up in a function called rundet1().

    crds_config = Detector1Pipeline.get_config_from_reference(filename)
    det1 = Detector1Pipeline.from_config_section(crds_config)
    
    det1.output_dir = outdir # Specify where the output should go
    
    det1.jump.maximum_cores='half'
    det1.ramp_fit.maximum_cores='half'
    # This next parameter helps with very bright objects and/or very short ramps
    det1.jump.three_group_rejection_threshold=100
    
    # Enable detection of large cosmic ray showers (currently only works for FASTR1 data)
    det1.jump.find_showers=True
        
    det1.save_results = True # Save the final resulting _rate.fits files
    det1(filename) # Run the pipeline on an input list of files

stscijgbot-jp · 2024-01-11T16:33:07Z

Comment by David Law on JIRA:

While the get_config_from_reference function works for our own notebooks, I'd be interested to see if there's a way to make Melanie Clarke 's third approach work more generally. Is it possible and/or desirable for both .call() and .run() to check for parameter defaults automatically?

stscijgbot-jp · 2024-01-12T14:33:13Z

Comment by Howard Bushouse on JIRA:

Theoretically pretty much anything is possible, e.g. updating the .run() method to look for pars ref files, but I think that would then be defeating the original purpose of having 2 different methods in the first place and would essentially make them redundant with one another. The .run method is preferable when you want to execute something multiple times using the same param settings, while .call starts over from scratch each time it's called. IMHO, it would be nice to simply update .call so that you can give it param/args as easily as you can with .run (i.e. not having to create the cumbersome dictionary structure).

Sounds like a good topic for discussion in a DMS WG meeting.

stscijgbot-jp · 2024-07-15T17:48:07Z

Comment by David Law on JIRA:

Adding a note from the July 10 JP meeting that the cleanest solution may be to deprecate usage that doesn't specify either .run() or .call(), adding a deprecation note now and actually removing that functionality some time next year. Potentially add a warning note for .run() as well.

stscijgbot-jp · 2024-09-04T16:18:10Z

Comment by David Law on JIRA:

Followup discussion at the Sep 3 JP meeting; we should print a deprecation warning now (https://jira.stsci.edu/browse/JP-3728) that in a future TBD build usage without specifying either .call() or .run() will be deprecated.

stscijgbot-jp · 2024-12-05T19:33:06Z

Comment by Melanie Clarke on JIRA:

Just to close the loop here... I know that the current request is just to deprecate the call function, so that you have to choose either call or run, but in case this comes back around again, we did explore a couple other alternative solutions to make the run method easier or clearer to use.

This PR would warn the user if run was called without retrieving CRDS defaults:

spacetelescope/stpipe#198

This PR would allow the run method to check CRDS for default values for any parameters not directly set by the user, to make it behave more call:

spacetelescope/stpipe#192

stscijgbot-jp added the team-coffee label Dec 14, 2023

stscijgbot-jp added team-chartreuse and removed team-coffee labels Mar 25, 2024

braingram mentioned this issue Jul 11, 2024

call and run changes spacetelescope/stpipe#165

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Calling a pipeline without specifying a method should use call() #8130

Calling a pipeline without specifying a method should use call() #8130

stscijgbot-jp commented Dec 14, 2023

stscijgbot-jp commented Jan 11, 2024

bhilbert4 commented Jan 11, 2024

stscijgbot-jp commented Jan 11, 2024

stscijgbot-jp commented Jan 12, 2024

stscijgbot-jp commented Jul 15, 2024

stscijgbot-jp commented Sep 4, 2024

stscijgbot-jp commented Dec 5, 2024 •

edited

Loading

Calling a pipeline without specifying a method should use call() #8130

Calling a pipeline without specifying a method should use call() #8130

Comments

stscijgbot-jp commented Dec 14, 2023

stscijgbot-jp commented Jan 11, 2024

bhilbert4 commented Jan 11, 2024

stscijgbot-jp commented Jan 11, 2024

stscijgbot-jp commented Jan 12, 2024

stscijgbot-jp commented Jul 15, 2024

stscijgbot-jp commented Sep 4, 2024

stscijgbot-jp commented Dec 5, 2024 • edited Loading

stscijgbot-jp commented Dec 5, 2024 •

edited

Loading