Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for operation hooks #28

Closed
csadorf opened this issue Jun 12, 2018 · 10 comments
Closed

Proposal for operation hooks #28

csadorf opened this issue Jun 12, 2018 · 10 comments
Labels
enhancement New feature or request expertise needed Extra attention is needed hooks
Milestone

Comments

@csadorf
Copy link
Contributor

csadorf commented Jun 12, 2018

Original report by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


About

This is a proposal for a feature that would allow users to specify hook functions that are executed either before or after the execution of an operation.
The purpose is to simplify the execution of specific functions before and after the execution of a job-operation, such as logging or data upload functions.

A hook function takes an instance of JobOperation as first argument which allows to access the name of the operation, the job it operates on, the command, and optional execution directives.
For example:

#!python
def log_operation(op):
    log("Operation {} started.".format(op))

This log function could then be specified to be executed on operation start:

#!python

@Project.operation
@Project.hook.on_start(log_operation)
def hello(job):
    print('hello', job)

API

Hooks are in general defined as lists of functions, as part of the flow.hooks.Hooks class.

  • on_start: Executed before a operation is executed.
  • on_finish: Executed after a operation has completed (with or without error).
  • on_success: Executed after a operation has completed without error.
  • on_fail: Executed after a operation has completed with error.

Low-level API

The low-level API allows the installation of hooks on the project instance level via the manipulation of the hooks instance variable:

#!python
project = Project()
project.hooks.on_start.append(my_log_function)

And during the adding of operations, again, via a dictionary:

#!python
project.add_operation(
    name='hello',
    cmd='echo hello {job._id}',
    hooks={'on_start': [my_log_function]})

The hooks are executed in order of definition, where project-wide hooks are executed prior to operation-wide hooks.

Decorator API

Alternatively, hooks can be installed via the decorator API:

#!python
from flow import hook

@Project.hook.on_start(log_operation)
def my_op(job):
    pass

Note, that the hooks are only executed if the operation is executed through the FlowProject interface.

Configurational API

Finally, hooks can be installed directly in the signac configuration by specifying a function that installs the hooks during project initialization.

#!bash
# signac.rc

[flow]
hooks = my_hooks.install_hooks

The hooks config key is interpreted as list, multiple modules would be provided as a comma-separated list.
The my_hooks module must be directly importable and should have a install_hooks() function in the global namespace similar to this:

#!python
# my_hooks.py

def log_operation(op):
    log("Operation {} started.".format(op))

def install_hooks(project):
    project.hooks.on_start.append(log_operation)
    return project

The install_hooks() function should return the project argument so that we can use the following idiom: install_hooks(Project()).main().

The default hook installation function is called install_hooks so in this case the following configuration would be equivalent:

#!bash
[flow]
hooks = my_hooks

As a final example, assuming that the flow package had a module that would install hooks for the automatic generation of snapshots after execution, it would likely be installed like this:

#!bash
[flow]
hooks = flow.snapshots
@csadorf
Copy link
Contributor Author

csadorf commented Jun 12, 2018

Original comment by Bradley Dice (Bitbucket: bdice, GitHub: bdice).


Good idea, and the API looks reasonable. It seems like we can usually do most customizations in flow by altering the FlowProject class, but in this case we have to work on an instance of the class. I'd have to think more about it to determine if this is a problem, and if so, whether it can be solved.

@csadorf
Copy link
Contributor Author

csadorf commented Jun 12, 2018

Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


Yes, that is true, but that is actually intentional. My reasoning is that the class defines a specific workflow, while the instance is more strongly related to the actual execution. I see the hooks execution-related, less workflow related.

One edge case is the decorator API which behaves more like a class definition...

@csadorf
Copy link
Contributor Author

csadorf commented Jun 13, 2018

Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


I think one thing we need to decide is whether those hooks are expected to always be executed, i.e., also when we just call the function, (that would be problematic for @cmd operations) or just when using FlowProject.run*().

@csadorf
Copy link
Contributor Author

csadorf commented Jun 13, 2018

Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


Slightly updated the description and the API definition.

@csadorf
Copy link
Contributor Author

csadorf commented Jun 13, 2018

Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


A prototype for the proposed API is implemented in the execution-hooks branch.

@csadorf
Copy link
Contributor Author

csadorf commented Jun 13, 2018

Original comment by Bradley Dice (Bitbucket: bdice, GitHub: bdice).


If I define a hook, I want it to run any time I call the function, no matter if that's in a Python script or the shell interface for flow. This is why I'm confused by its separation from the class workflow into a specific instance. That sounds kind of annoying to me, I'd rather it stays as part of the class. Can you explain why that would be problematic for @cmd operations? Do you mean if the user called the operation outside of signac, just on the command line?

@csadorf
Copy link
Contributor Author

csadorf commented Jun 13, 2018

Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


Well, right now, if you call a function foo() that happens to be a FlowProject operation it will just call it as a "normal function". So if you had a @cmd function like this:

@FlowProject.operaton
@flow.cmd
def foo(job):
    return 'echo foo {job._id}'

it will just print echo foo abc....

@csadorf
Copy link
Contributor Author

csadorf commented Jul 27, 2018

Original comment by Vyas Ramasubramani (Bitbucket: vramasub, GitHub: vyasr).


I had the same reaction as Bradley; everything looks good to me aside from the class/instance question. I'll think on it a bit more.

@csadorf
Copy link
Contributor Author

csadorf commented Jul 30, 2018

Original comment by Carl Simon Adorf (Bitbucket: csadorf, GitHub: csadorf).


Project-wide hooks

As per offline discussion with @vyasr and @bdice, I'm going to update this proposal to promote a decorator-based API for project-wide hooks, similar to:

class Project(FlowProject):
    pass


@Project.hooks.on_start
def log_operation(op):
    print(op)

Hooks and cmd-operations

Furthermore, we will need to account for @cmd-decorated operations, like:

@Project.operation
@flow.cmd
def foo(job):
    return "echo hello {}".format(job)

by padding the execution in (submission-)scripts, with something like:

$ flow _exec_pre_hooks foo abc123
$ echo hello abc123
$ flow _exec_post_hooks foo abc123

Alternatively, we just disable hooks for cmd-operations or require them to be executed through the flow exec command.

@mikemhenry mikemhenry added this to the v0.7 milestone Feb 16, 2019
@csadorf csadorf modified the milestones: v0.7, v0.8 Feb 26, 2019
@csadorf csadorf modified the milestones: v0.8, v0.9 May 24, 2019
@csadorf csadorf removed the proposal label Jul 5, 2019
@vyasr vyasr added the enhancement New feature or request label Jul 5, 2019
@vyasr vyasr modified the milestones: v1.0, v2.0 Jul 5, 2019
@vyasr vyasr added the expertise needed Extra attention is needed label Jul 5, 2019
@csadorf csadorf modified the milestones: v2.0, v1.0 Sep 6, 2019
csadorf added a commit that referenced this issue Dec 5, 2019
This is a prototype implementation for the API proposed in issue #28.
csadorf added a commit that referenced this issue Dec 6, 2019
This is a prototype implementation for the API proposed in issue #28.
bdice added a commit that referenced this issue Dec 29, 2021
* Prototype implementation for execution hooks.

This is a prototype implementation for the API proposed in issue #28.

* Fix issue that occurs when the add_operation 'hooks' argument is None.

* Properly handle exceptions for on_start hooks.

* Reduce verbosity of hook module on errors.

* Implement unit tests for run execution hooks.

* Revised hooks implementation and provide basic hook modules.

* Fix docstrings in snapshots module (typo/style).

* Add option to exclude files/directories based on regular expression.

In the snapshot hook module.

* Rename the 'track-operations' executable to 'flow-track-operations'.

* Add fork execution directive.

* Remove fork execution directive.

* Fix rebase error.

* Install hooks through config as callable.

To avoid the need to implement a class and also avoid forcing of a
specific function name.

* Use hidden files for the operations log and tracking log.

This makes it easier to integrate those hook systems with git
tracking.

* Rename '_metadata_schema_version' to '_schema_version'.

* Remove left-over print statement.

* Use git-notes for automated git-tracking.

Instead of storing metadata in the commit message.

* Fix style error.

* Revert changes to base submission script.

* Update changelog.

* Remove obselete internal function.

* Update metadata schema.

* Remove print_function from __future__ import.

No longer needed since the drop of support for Python 2.7.

* fixup! Update metadata schema.

* Rename '.snapshots' directory to '.signac_snapshots'.

* Refactor git repot initialization and git-ignore handling.

Also just ignore the hidden signac snapshots directory instead of
all hidden files and directories.

* Use improved storage layout for snapshot system.

* changes I failed to commit at the time

* Fix issue with test_context unit test.

* Fix error message.

* Fixed some errors made during merge conflict resolution.

* Style changes from pre-commit

* remove a temp todo comment

* Use more descriptive name _hook_triggers

* pre commit formatting

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo in docstring link

* Remove flow-track-operations.

* Remove non-core hook features.

* Update changelog.

* Use explicit keyword names.

* Remove from_dict constructor, fix user API to use (name, *jobs) instead of_JobOperation, fix test class name.

* Rename class to _InstallHook.

* Fix comment.

* Remove _install_config_hooks.

* Revert changes to logging configuration. Not needed until the LogOperations hook is introduced.

* Added base hooks test, no error.

* Used a function get get values from dictionary instead of typing the whole thing out. Need to figure out how to run with error and still trigger hook.

* Added test for when hook.on_fail is triggered plus changes to test to make it compatible with this test.

* Added base tests for a cmd function.

* Less clutter and neater in define_hooks_test_project.

* Added hooks install.

* Removed test length of hooks.

* Added cmd tests.

* Added test install test for cmd functions.

* Ignore fail test when exception not raised.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add docstrings to Hooks class.

* flake8 errors.

* Flake8 formatting.

* Update flow/hooks/__init__.py

Co-authored-by: Bradley Dice <[email protected]>

* Moved to changelog for 0.17

* Reference class in docs

Co-authored-by: Bradley Dice <[email protected]>

* Consistent language in the docstring

Co-authored-by: Bradley Dice <[email protected]>

* Input is operation, not operation_name

Co-authored-by: Bradley Dice <[email protected]>

* Language/wording

Co-authored-by: Bradley Dice <[email protected]>

* Missing parenthesis

Co-authored-by: Bradley Dice <[email protected]>

* More accurate description of parameters

Co-authored-by: Bradley Dice <[email protected]>

* Grammar

Co-authored-by: Bradley Dice <[email protected]>

* Directly return the function

Co-authored-by: Bradley Dice <[email protected]>

* Remove unneeded list

Co-authored-by: Bradley Dice <[email protected]>

* Return function directly instead of lambda function

Co-authored-by: Bradley Dice <[email protected]>

* Fix typo (on_failure to on_fail)

Co-authored-by: Bradley Dice <[email protected]>

* Change documentation wording to be more user friendly

Co-authored-by: Bradley Dice <[email protected]>

* Swap order so variable can be used

Co-authored-by: Bradley Dice <[email protected]>

* Use raise instead of raise error for more readable outputs

Co-authored-by: Bradley Dice <[email protected]>

* Various edits to code.

* Added tests for invalid hooks.

* set_job_doc_with_error needed an argument.

* Rename for clairty.

* Renamed for clarity and accuracy.

* Important distinction in docs

Co-authored-by: Carl Simon Adorf <[email protected]>

* More important distinctions in docs.

Co-authored-by: Carl Simon Adorf <[email protected]>

* Correct version number in changelog.

* Renamed _HooksDecorator and added more descriptive docstring.

* Remove unnecessary __call__.

* More changes for clarity on tests.

* Corrected docstring for accuracy.

* Edited string for clarity.

* Fixed typo in test.

* Check for invalid hook attributes before instantiating the project.

* Added test for exceptions in hook.

* Make tests more readable.

* Fixed typo in registry.

* Only test one thing at a time.

* Only test one thing at a time.

* Moved _HookRegistry to class method.

* Moved Hooks and _HooksList class out of __init__ file.

* Added message to describe why.

* Changed from hook to add_hook for clarity.

* Removed unneeded changelong subheadings.

* Raise error with message.

* Renamed from HooksRegistry to HooksRegister because naming consistency with OperationRegister.

* Import hooks without flake8 error

Co-authored-by: Carl Simon Adorf <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Removed unneccessary assert lines.

* Better docs

Co-authored-by: Brandon Butler <[email protected]>

* Used more explicit naming for project vs operation hooks.

* Apply suggestions from code review

Co-authored-by: Bradley Dice <[email protected]>

* Apply suggestions to docstrings.

* Revise docstring.

* Remove metaclass initialization of _operation_hooks.

* Fix code examples for operation hook.

* Rename _hooks to _project_hooks for clarity and symmetry with _operation_hooks.

* Update docs.

* Document operation_hook in API docs.

* Improvements to FlowProject.operation_hook documentation

* Changed to plural project hooks.

* Swapped hook and operation decorators in docstring and tests.

* Forgot to commit change in docstring for operation and hook decorator order.

* Change Hooks to _Hooks

* Update docs with operation_hooks name change

* Flip order of test directives to make operation first

Co-authored-by: Carl Simon Adorf <[email protected]>
Co-authored-by: Kelly Wang <[email protected]>
Co-authored-by: Vyas Ramasubramani <[email protected]>
Co-authored-by: Corwin Kerr <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kelly Wang <[email protected]>
Co-authored-by: Carl Simon Adorf <[email protected]>
Co-authored-by: Brandon Butler <[email protected]>
@bdice bdice added the hooks label Dec 29, 2021
@bdice
Copy link
Member

bdice commented Dec 29, 2021

This is resolved by #508.

@bdice bdice closed this as completed Dec 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request expertise needed Extra attention is needed hooks
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants