Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1508 - the needs() function should propagate calling task's params/args down to needed tasks #1509

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions ChangeLog
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ Revision history for Rex

[BUG FIXES]
- Detect invalid hostgroup expressions
- Propagate args/params down to "needed" tasks launched by needs()

[DOCUMENTATION]

Expand Down
8 changes: 2 additions & 6 deletions lib/Rex/Commands.pm
Original file line number Diff line number Diff line change
Expand Up @@ -1064,13 +1064,9 @@ sub needs {
my $task_o = $tl->get_task($task);
my $task_name = $task_o->name;
my $suffix = $self . ":";
if ( @args && grep ( /^\Q$task_name\E$/, @args ) ) {
if ( !@args || grep ( /^\Q$task_name\E$/, @args ) ) {
Rex::Logger::debug( "Calling " . $task_o->name );
$task_o->run( "<func>", params => \@task_args, args => \%task_opts );
}
elsif ( !@args ) {
Rex::Logger::debug( "Calling " . $task_o->name );
$task_o->run( "<func>", params => \@task_args, args => \%task_opts );
$task_o->run( "<func>", params => \%task_opts, args => \@task_args );
}
}

Expand Down
165 changes: 165 additions & 0 deletions t/issue/1508.t
Original file line number Diff line number Diff line change
@@ -0,0 +1,165 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit I have a hard time to follow what exactly is happening in this test file. Could you walk me through it, please?

I see it was inspired by t/needs.t. Is there a simpler way to test the behavior perhaps? Or to fold the new checks into t/needs.t directly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned, and despite how it may look at first, this test script was heavily inspired from t/needs.t.

It's basically a refactored version of it, with one notable functional difference: unlike t/needs.t, this one also checks the run-time task options and params.

So, my initial -joking- response would be :

sure, please walk us through t/needs.t and I will do the same for this one :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joking aside, here's a brief run-through :

Just like in t/needs.t, the main package does the following :

  • first, define a bunch of tasks :
task test => sub {...}
task test2 => sub {...}
...
task test6 => sub {...}
  • Then, run each task and check that it has returned successfully (which constitutes the actual "test")

Just like in t/needs.t, some of those tasks invoke the needs() function, either with other tasks in the main package, or tasks defined in other packages (Rex::Module or Nested::Module) which also reside in the same test script.

1) Why do we have the two other packages (Rex::Module and Nested::Module) ?

Same reason as we have them in t/needs.t, i.e. we want to check that the needs() function works across package namespaces.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2) Why do we have an extra helper package (T), unlike in t/needs.t ?

Well, that's where the DRY refactoring happens...

Notice how almost the same code is repeated again and again in the task subs in t/needs.t.

Here, that boilerplate is replaced with calls to track_taskrun and check_needed which sensibly do the same, i.e. :

  • track_taskrun : When a task gets run, mark it as such by writing to a file with that name (and also save the its arguments there).

  • check_needed :

    • Make sure a "needed" task has already been run, i.e. :
      • die unless a file with that name exists
      • die unless the arguments that were saved in that file correspond to the expected arguments
    • Also do some clean up, by deleting the said file.
      Doing this here might not be considered very orthodox, but that's also what's happening in t/needs.t)

Notice how failure is signaled with die, just like in t/needs.t.

This also partly explains why we employ the private _deep_check() routine from Test::More:: here.

Just like you, my first reflex was employing is_deeply, but that doesn't work because "Test::More" chokes as it is then unable to keep track of the actual number of tests.

The cleaner alternative would have been to rely on is_deeply in the main test loop for my $task ( $run_list->tasks ) in the test script's package main, but that won't work either, because the summary info (from $task_list->get_summary) does not contain task params/arguments... Well it probably should... :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3) Why do we have to do file IO in the first place ?

I was also suprised by this at first. I guess the author of t/needs.t could have the best answer here.

The explanation I came up with has to do with the task execution model in Rex (each task possibly having separate SSH connection, possibly running on a separate thread/process, ...).

A no-brainer method for eliminating the hairiness that would get involved is to go through file IO.

t/needs.t does this quite simply by writing/truncating a file that has the same name of the needed task when its being run, and then merely checking the existence of the file from the calling task.

Here, we also keep track of the params/arguments, so we use Storable to do that, which also eliviates the need for manually opening and reading/writing to files.

I chose Storable because of its simplicity and popularity. Besides, Rex already has a runtime dependency on Storable, so using it in a test did not entail adding a dependency.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why the unpacking logic ?

At one point you ask if we could get rid of the unpacking logic in helper routines, which was actually neatly placed on a single line (before perl tidy messed it up :-) :

    my %opts = ref $_[-1] eq 'HASH' ? %{; pop } : ();

Yes, we could; by directly assuming a HASH reference as the last argument, like such :

    my %opts = %{; pop };

But, if you know of a way of preventing perl-tidy doing its thing, I think it would be preferable to keep the unpacking logic in there.

It gives the caller more freedom. That way, if desired, the current tests in t/needs.t can easily be implemented with those two refactored routines (track_taskrun and check_needed).

The above also gives away my thinking about you suggestion of merging these tests into t/needs.t.

Imho, the easiest and most maintainable way of doing that would be retrofit t/needs.t tests into this -refactored- model.

Otherwise, if we try to it the non-DRY way t/needs.t does this (by copying and pasting the file IO code in all tasks), we could easily end up with a mess of several hundred lines (if not more) of hard-to-maintain code...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, please walk us through t/needs.t and I will do the same for this one :-)

Yeah, later I realized it is based on t/needs.t. My take is that t/needs.t is bad too, but that's part of the historical ballast from 10 years ago. That's a considerable amount of legacy to clean up, so that's why we use Test::Perl::Critic::Progressive to make sure we at least don't add more violations on top of the current pile :) These tests were unknowingly broken for PRs coming from forks, which was my mistake and I fixed that now (so it should properly complain about t/issue/1508.t after a rebase).

In other words, the policy is "we accept our legacy, but we don't want to add more ballast; therefore new code should be clean, or clean up old mess".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation!


  1. Why do we have the two other packages (Rex::Module and Nested::Module) ?

Rex chops off the leading Rex:: part for task names for brevity. We have these two modules defined in t/needs.t to make sure needs work with namespaces both with and without the Rex:: prefix.


  1. Why do we have an extra helper package (T), unlike in t/needs.t ?

This also partly explains why we employ the private _deep_check() routine from Test::More:: here.

Just like you, my first reflex was employing is_deeply, but that doesn't work because "Test::More" chokes as it is then unable to keep track of the actual number of tests.

That sounds like a good indicator to invent a cleaner approach than the 10-year-old legacy t/needs.t - or even better, fix t/needs.t first :)

I wonder if we could split t/needs.t up into real modules under e.g t/lib first as a pure refactor, then modify it to support running the same test tasks both with and without arguments 🤔


  1. Why do we have to do file IO in the first place ?

Yup, I see we use files to keep track of some execution data, and I don't have a much better way for that currently. Storable also sounds good for dumping/restoring the extra data needed by the argument tests. It is also a Perl core module, so it wouldn't be a new dependency anyway 👍


  1. Why the unpacking logic ?

was actually neatly placed on a single line (before perl tidy messed it up :-)

That might also be an indicator of weirdness :) I'm happy to tune perltidy rules and reformat the codebase with that.

In this case though perlcritic complains about the null statement (and in the exploded formatting, about the missing final semicolon in the block). Perhaps it would be more readable as fully unpacking @_ first, and then do this transformation logic on the last element of the passed array?

=head1 NAME

issue/1508.t - Check that the `needs()` function correctly propogates run-time task arguments

=head1 DESCRIPTION

Check that the `needs()` function does indeed correctly propogate
run-time task arguments (%params and @args) from the calling task down to the "needed" tasks.

=head1 DETAILS

* AUTHOR / DATE : [tabulon]@[2021-09-26]
* RELATES-TO : [github issue #1508](https://github.com/RexOps/Rex/issues/1508#issue-1007457392)
* INSPIRED from : t/needs.t

=cut

Comment on lines +1 to +18
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rex tests currently do not include POD sections (maybe they should, though?! 🤔), but they aimed to be treated as standalone perl scripts with shebang, $VERSION, etc. (see e.g. t/cmdb_path.t).

You can squash this change away into the original commit if you want, or I can change it upon merge as a follow-up commit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ferki,

Thank you for your through review -- of this PR as well as the related issue.

I will try to briefly address your remarks in several chunks.

POD in test files

Since the .t files are valid perl scripts, putting POD is technically allowed.

For some kinds of content, I find it nicer than block/line comments. It renders better and folds neatly in the text editor. Whether or not it is socially allowed is up to the maintainer (you), so go ahead and yank it if you wish.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were the tests marked as TODO ?

The cherry-picking thing might have played a role in there, yes.
But that's also what I usually try to do, because I find it way cleaner way doing TDD.

If Rex policy is as you describe, I will try to remember next time. It's true that this has become quite popular among many languages lately.

Meanwhile, here are my reasons, in a nutshell, in an attempt to convince you of the added value:

  • Marking tests that are expected to fail (as such) is an extra piece of valuable information.

  • Perl (with its excellent Test::* modules) had the ability to do TDD way before other languages. TODO tests were invented as part of this scheme, and they have quite interesting characteristics.

  • When you do a normal prove, TODO tests still get executed (and their results are briefly reported), but their failure won't cause the whole test suite to fail!

  • When you do a prove -v (verbose mode), TODO tests behave as as normal tests : they will get detailed reporting

  • If TODO tests actually pass (instead of failing), that's also reported as a surprise (so that you can suspect something fishy)

  • Also, the cherry-picking of solutions can then happen across PRs.

Basically, it tends to be a superior form of TDD, in my humble opinion:

  • You can start with writing TODO tests for a given issue (without necessarily committing any code that satisfies them) ; keep them around, even ship them as such.
  • Months later, you (or someone else) can come up with one or more solutions which can easily be checked with prove -v.
  • Once satisfied with a solution, you unmark the TODO tests within the PR that commits the solution.

And, of course, all of that can happen within the same PR (as was the case here).

But you probably already know all this... And you probably have good reasons to prefer otherwise.

In the end, whatever the maintainer of the project decides, of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POD in tests

In general I'm not opposed to add POD to test files, provided we can realistically commit to establish it as the new norm, and also to start adding them retroactively for consistency. I feel that would involve a considerable overhead, so I'd wish to discuss that topic outside of the context of this PR (and focus on passing arguments to needs here).


TODO tests

Having new tests that first pass, which then start to fail when I successfully implement their logic feels backward for my brain 🙃 I can accept it works better for you, and I appreciate your detailed reasoning ❤️

So far I used TODO for tests that are expected to fail, and not intend to fix in the same PR that introduces them. So I'm fine shipping TODO tests (e.g. there's one in t/file.t), and I'd even prefer it over things like "skip this on Windows" whenever possible (the current test suite mostly SKIPs though :/ ).

In the usual PR context, I like to have it demonstrated by the CI run that the new tests are actually failing before the related change gets implemented. So I push the "Add tests for X" commits early, and watch them fail ("red"). Then I also like to demonstrate that the new commit (or sometimes series of commits) I push to implement/fix the failing behavior actually makes the test suite to pass ("green"). Then if I need to simplify the code, I can make that at the end, and demonstrate the tests still pass ("refactor")

If my PR would be about adding new tests that I don't intend to fix in the same PR, I think I'd still push a commit with the failing tests first, then make the CI pass with a second commit that marks them as TODO.

I'd be happy to further discuss these methodology details via GitHub discussions or chat.

use Test::More;
use Rex::Commands;

{

package T; # Helper package (for cutting down boilerplate in tests)
use Storable;

sub track_taskrun {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly this subroutine saves the arguments passed to it in a file via Storable. If that's true, would it be better to call it something like save_arguments?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wish so...

It also saved the arguments, yes.
But its principal function is to register the fact that a given task has just been executed (also saving the run-time arguments, in our case).

Copy link
Member

@ferki ferki Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think by now I better understand the original name, thanks to your explanations. So this is a subroutine that tracks task execution (~run).

my %opts = ref $_[-1] eq 'HASH'
? %{
;
pop
}
: ();
Comment on lines +28 to +33
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the calls in this file, it seems that the first argument is always a scalar value, and the last argument is always a hash reference. Does it make sense to still keep this unpack logic around?

Maybe this works too:

my ($called_task, %opts) = @_;

That way we don't need the for loop either. What do you think?

(Same goes for check_needed below)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps...

The intention was the ability to do the following, when you have more than one needed task :

needs ("task1", "task2", "task3")
check_needed("task1", "task2", "task3", \%opts) 

Currently, it looks like we don't have that case (with several needed tasks) in the tests. Maybe we should ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For track_taskrun, it's an easier call; yes we could get rid of some of the logic, as you suggest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, it looks like we don't have that case (with several needed tasks) in the tests. Maybe we should ?

I think that's what confused me initially. We don't have the multiple-tasks case tested now, so it might be premature optimizing.

But your're right, in the long run we probably should cover that use case too 👍

my $argv = $opts{argv};
my @prereqs = (@_);
for (@prereqs) {
my $file = "${_}.txt";
Storable::store( $argv, $file );
}
}

sub check_needed {
my %opts = ref $_[-1] eq 'HASH'
? %{
;
pop
}
: ();
my $argv = $opts{argv};
my $do_unlink = delete $opts{unlink} // 1;
my @prereqs = (@_);

for (@prereqs) {
my $file = "${_}.txt";

-f "$file" or die;
my $propagated_argv = Storable::retrieve("$file");
Test::More::_deep_check( $propagated_argv, $argv ) or die;
ferki marked this conversation as resolved.
Show resolved Hide resolved

unlink("$file") if ($do_unlink);
}
}
}

{

package MyTest;
use strict;
use warnings;
use Rex::Commands;

$::QUIET = 1;

task test1 => sub {
T::track_taskrun( test1 => { argv => \@_ } );
};

task test2 => sub {
T::track_taskrun( test2 => { argv => \@_ } );
};

1;
}

{

package Nested::Module;

use strict;
use warnings;

use Rex::Commands;

task test => sub {
T::track_taskrun( test => { argv => \@_ } );
};
}

{

package Rex::Module;

use strict;
use warnings;

use Rex::Commands;

task test => sub {
T::track_taskrun( test => { argv => \@_ } );
};
}

task test => sub {
needs MyTest;

T::check_needed( $_, { argv => \@_ } ) for (qw/test1 test2/);
};

task test2 => sub {
needs MyTest "test2";

T::check_needed( $_, { argv => \@_ } ) for (qw/test2/);
};

task test3 => sub {
needs "test4";

T::check_needed( $_, { argv => \@_ } ) for (qw/test4/);
};

task test4 => sub {
T::track_taskrun( test4 => { argv => \@_ } );
};

task test5 => sub {
needs Nested::Module test;

T::check_needed( $_, { argv => \@_ } ) for (qw/test/);
};

task test6 => sub {
needs Rex::Module "test";

T::check_needed( $_, { argv => \@_ } ) for (qw/test /);
};

{
my $task_list = Rex::TaskList->create;
my $run_list = Rex::RunList->instance;
$run_list->parse_opts(qw/test test2 test3 test5 test6/);

for my $task ( $run_list->tasks ) {
my $name = $task->name;
my %prms = ( "${name}_greet" => "Hello ${name}" );
my @args = ( "${name}.arg.0", "${name}.arg.1", "${name}.arg.2" );

$task_list->run($task);

my @summary = $task_list->get_summary;
is_deeply $summary[-1]->{exit_code}, 0, $task->name;
$run_list->increment_current_index;
}
}

done_testing;