fix(api): Error instead of infinite-looping when disposal_volume is too high to make progress #6754

SyntaxColoring · 2020-10-13T15:42:52Z

Overview

This PR turns code like this into an error:

# disposal_volume too high.
# There would be no room left in the tip to move liquid to dests.
p300.distribute(123, source, dests, disposal_volume=300)

The protocol API's current behavior for code like this depends on the exact circumstances. For example, sometimes, it hangs in an infinite loop (as reported in #6170). Other times, it seems to treat the distribute() step as a no-op.

This PR turns one specific variant into an error, and moves towards characterizing other variants.
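To make the failure mode concrete, here is a minimal toy sketch of a volume-splitting generator that never makes progress when the usable volume is zero, along with the kind of early guard this PR adds. The name `split_for_max_volume`, its `guard` flag, and its exact splitting rule are assumptions for illustration; this is not the actual body of `_expand_for_volume_constraints()`.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple


def split_for_max_volume(
    volumes: Iterable[float],
    targets: Iterable[str],
    max_volume: float,
    guard: bool = True,
) -> Iterator[Tuple[float, str]]:
    """Toy splitter: break each transfer into chunks of at most max_volume."""
    # Because this is a generator, the check runs when iteration starts,
    # not when the function is called.
    if guard and max_volume <= 0:
        raise ValueError(f"max_volume must be greater than 0. (Got {max_volume}.)")
    for volume, target in zip(volumes, targets):
        # With max_volume == 0, volume never shrinks, so this loop never ends.
        while volume > max_volume:
            yield max_volume, target
            volume -= max_volume
        if volume > 0:
            yield volume, target


# Normal case: 250 is split into chunks no larger than 100.
normal = list(split_for_max_volume([250], ["A1"], 100))

# Without the guard, the generator keeps yielding zero-volume steps forever.
# islice() bounds the demonstration so that it terminates.
stuck = list(islice(split_for_max_volume([123], ["A1"], 0, guard=False), 5))
```

In this sketch, `normal` is `[(100, "A1"), (100, "A1"), (50, "A1")]`, and `stuck` is five copies of `(0, "A1")`. With the guard enabled, exhausting the generator raises `ValueError` instead.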

Changelog

Fix bug: Inappropriate disposal or air_gap volume causes hang in transfers.py #6170. Before, the test protocol in that ticket would infinite-loop. Now, it raises a ValueError internally.
Leave these related bugs unfixed for now:
- When the pipette tip has a lower maximum volume than the pipette itself, it seems like an invalid disposal volume can make distribute() effectively a no-op. It picks up a tip and then drops it, without any sub-steps.
  
  This PR's unit tests caught this by chance. They're xfail'd.
  
  If we merge this PR without a fix for this bug, I'll ticket this bug separately.
- Even with this PR, another variant of bug: Inappropriate disposal or air_gap volume causes hang in transfers.py #6170, calling transfer() instead of distribute(), still hangs.
```
metadata = {"apiLevel": "2.7"}

def run(protocol):
    labware = protocol.load_labware('nest_12_reservoir_15ml', 1)
    tip_rack = protocol.load_labware('opentrons_96_filtertiprack_200ul', 2)
    pipette = protocol.load_instrument('p300_single', mount='left', tip_racks=[tip_rack])
    pipette.transfer(10, labware.wells()[0], labware.wells()[1], disposal_volume=200)
```
  I haven't looked into why.
  
  I found this by manual fiddling, so it's not currently represented in the unit tests.
  
  If we merge this PR without a fix for this bug, I'll ticket this bug separately.

Risk assessment

Medium.

There's a risk that this PR accidentally changes edge-case behavior in runnable protocols without an apiLevel bump.

This PR is intended to maintain the exact current behavior of all runnable protocols. It should merely improve how the error is presented for certain non-runnable protocols. (Instead of infinite-looping, there should be an error message.)

But suppose someone's written a "wrong," but runnable, protocol. The protocol provides an invalid value to one of our API functions, but the function currently happens not to raise any error. If this PR accidentally does make it raise an error, that would break our API versioning policy.

It's possible that this is the case, because:

I'm not confident I fully understand how _expand_for_volume_constraints() is called.
Although the test that this PR introduces is a good start, the current edge-case behaviors of transfer(), distribute(), and consolidate() don't seem well-characterized.
- Evidently, there are variants of this bug other than exactly what's reported in bug: Inappropriate disposal or air_gap volume causes hang in transfers.py #6170. This test catches some variants, but not all.
- I didn't have a great way of telling Pytest to expect a test to time out. I'm assuming that, without the fix, all the non-xfail'd test parametrizations would time out if they had a chance to run. If this assumption is wrong, this PR could be changing the behavior of runnable protocols.
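One possible way to express "this call should not finish" using only the standard library is sketched below. The helper name `finishes_within` is hypothetical and is not part of Pytest or of this PR. A dedicated timeout plugin is probably the more robust choice in a real suite, because the thread here is abandoned rather than killed.

```python
import threading
from typing import Callable


def finishes_within(func: Callable[[], object], timeout_s: float) -> bool:
    """Return True if func() returns or raises within timeout_s seconds."""
    done = threading.Event()

    def runner() -> None:
        try:
            func()
        except Exception:
            pass  # Raising counts as finishing; only a hang is of interest here.
        finally:
            done.set()

    # Daemon thread: a hung call will not block interpreter exit.
    threading.Thread(target=runner, daemon=True).start()
    return done.wait(timeout_s)
```

A test could then assert `not finishes_within(call_under_test, 1.0)` for parametrizations that are expected to hang, where `call_under_test` stands in for whatever zero-argument callable wraps the transfer being planned.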

Review requests

Given the deferred work that I mentioned in "Changelog," and the risks I mentioned in "Risk assessment," are we comfortable merging this as-is? Or should I try to understand this code more before I poke it with a stick?

codecov · 2020-10-13T15:42:59Z

Codecov Report

❗ No coverage uploaded for pull request base (edge@f4b6efe).
The diff coverage is n/a.

@@           Coverage Diff           @@
##             edge    #6754   +/-   ##
=======================================
  Coverage        ?   93.55%           
=======================================
  Files           ?      110           
  Lines           ?     4812           
  Branches        ?        0           
=======================================
  Hits            ?     4502           
  Misses          ?      310           
  Partials        ?        0

Last update f4b6efe...364b00f.

amitlissack · 2021-03-10T21:33:30Z

api/tests/opentrons/protocols/advanced_control/test_transfers.py

+        ('p300_single', 'opentrons_96_tiprack_300ul', 300, 300),
+
+        # pipette max != tip max, disposal_volume == tip max
+        # todo(mm, 2021-03-10): These fail unexpectedly, apparently a bug.


Where do these skipped tests fail?

These tests fail because the function under test returns without raising the expected ValueError.

Here's what it returns, as printed by this test's print() statement:

--- Captured stdout call ---
{'method': 'pick_up_tip', 'args': [], 'kwargs': {}}
{'method': 'drop_tip', 'args': [], 'kwargs': {}}

I think it would be nice to explain this behavior now rather than leave the tests marked as skip.

While it's good that you added these tests, the general testing strategy for this module is tricky. I would have preferred to see individual tests for all the steps. I'm not suggesting you do that now for all the functions. But perhaps just for _expand_for_volume_constraints?

I would have preferred to see individual tests for all the steps.

Meaning:

One test function to make sure distribute() errors when asked to use a p20_single_gen2 with 20 µL tips to distribute with a disposal volume of 20 µL.

One test function to make sure distribute() errors when asked to use a p20_single_gen2 with 10 µL tips to distribute with a disposal volume of 20 µL.

etc.

Or do I misunderstand?

Sorry for not responding sooner.

I meant that I'd want to see tests for the individual functions (_create_volume_gradient, _check_valid_well_list, _create_volume_list, _expand_for_volume_constraints, etc.) rather than just the full transfer plan.

amitlissack · 2021-03-10T21:37:53Z

api/tests/opentrons/protocols/advanced_control/test_transfers.py

+    # or air_gap + disposal_volume, is too high.
+
+    # Boilerplate: TransferPlan wants an InstrumentContext.
+    context = papi.ProtocolContext(


You can skip this boilerplate by using the ctx fixture defined in conftest.py.

mcous · 2021-03-11T14:35:20Z

api/src/opentrons/protocols/advanced_control/transfers.py

@@ -600,6 +605,11 @@ def _expand_for_volume_constraints(
        """ Split a sequence of proposed transfers if necessary to keep each
        transfer under the given max volume.
        """
+
+        if max_volume <= 0:
+            raise ValueError(


Should we take this opportunity to define a more specific error type for transfer build errors?

Yeah, I'm thinking about this too.

If you think just swapping the type of ValueError to something more specific would help, I can certainly do that.

I'm also trying to figure out how to get more user-friendly errors for people using the outer protocol API, like:

ValueError: Can't distribute with a disposal_volume of 400 µL when the tip can only hold 300 µL.

As opposed to either of these:

ValueError: max_volume must be greater than 0. (Got 0.)
UnplannableTransferError: max_volume must be greater than 0. (Got 0.)

The tension is:

_expand_for_volume_constraints() is central, so if we validate its arguments here, that validation can cover a lot of ground.

But on the other hand, it's too low-level for that validation to raise user-friendly diagnostics. It can't know why max_volume is 0, so error messages will necessarily be opaque and jargony.
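One way to resolve that tension might be to validate at the outer layer, where the cause is known, and to raise a more specific error type there. The sketch below is only an illustration under assumptions: `check_distribute_volumes` is a hypothetical helper, and `UnplannableTransferError` is the name floated above rather than an existing class.

```python
class UnplannableTransferError(ValueError):
    """Hypothetical error type for transfers that cannot be planned.

    Subclassing ValueError keeps any existing `except ValueError` working.
    """


def check_distribute_volumes(disposal_volume: float, tip_max_volume: float) -> float:
    """Return the volume left over for liquid, or raise a user-facing error."""
    usable = tip_max_volume - disposal_volume
    if usable <= 0:
        # This layer knows *why* there is no room, so it can say so plainly.
        raise UnplannableTransferError(
            f"Can't distribute with a disposal_volume of {disposal_volume:g} µL "
            f"when the tip can only hold {tip_max_volume:g} µL."
        )
    return usable
```

The low-level `max_volume <= 0` check could stay in place as a backstop, with the outer check catching the common cases first.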

mcous · 2021-03-11T14:36:09Z

api/src/opentrons/protocols/advanced_control/transfers.py

@@ -600,6 +605,11 @@ def _expand_for_volume_constraints(
        """ Split a sequence of proposed transfers if necessary to keep each
        transfer under the given max volume.
        """
+
+        if max_volume <= 0:


I think this might make our future rework a little bit harder, but it's probably worth guarding this in an apiVersion check, right?

The original hope was that this PR would be narrowly scoped to just the variants of this bug whose fixes wouldn't need an apiVersion check. See "Risk assessment" in my OP.

Maybe that's an exercise in futility. Maybe we should expand the scope of this PR to fix more variants of the bug, and wrap them all in an apiVersion check?
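For reference, a version-gated guard could look roughly like the sketch below. `APIVersion` here is a local stand-in type, and `VALIDATION_INTRODUCED_IN` with its value of 2.10 is purely hypothetical; no apiLevel has been chosen for this.

```python
from typing import NamedTuple


class APIVersion(NamedTuple):
    """Local stand-in for a (major, minor) protocol API version."""

    major: int
    minor: int


# Hypothetical: the first apiLevel at which the stricter validation applies.
VALIDATION_INTRODUCED_IN = APIVersion(2, 10)


def validate_max_volume(max_volume: float, api_version: APIVersion) -> None:
    """Raise on a non-positive max_volume, but only for new-enough protocols."""
    if api_version < VALIDATION_INTRODUCED_IN:
        # Older protocols keep their existing behavior, bugs included.
        return
    if max_volume <= 0:
        raise ValueError(f"max_volume must be greater than 0. (Got {max_volume}.)")
```

The cost of this approach is that the old, uncharacterized behavior stays reachable for every protocol below the cutoff, so it still has to be preserved through any later rework.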

SyntaxColoring · 2021-03-16T15:55:11Z

Based on review comments and out-of-band conversations, this bug doesn't seem like a good thing to attempt to solve incrementally.

The backwards compatibility concerns that make this such a slog to work on are ticketed for research in #7477.

Meanwhile, I'll re-draft this PR, try to understand the variants of this bug that I had deferred investigating, and see if we can fix them all in one apiLevel-bumping chunk.

amitlissack · 2021-03-16T17:59:15Z

Based on review comments and out-of-band conversations, this bug doesn't seem like a good thing to attempt to solve incrementally.

The backwards compatibility concerns that make this such a slog to work on are ticketed for research in #7477.

Meanwhile, I'll re-draft this PR, try to understand the variants of this bug that I had deferred investigating, and see if we can fix them all in one apiLevel-bumping chunk.

I think that it would be good enough to dive into the potential other bug(s) and create tickets. The most important thing is to fix the infinite loop, either by raising an exception or by doing nothing.

SyntaxColoring · 2021-03-23T00:42:45Z

I'm dropping this, for now. A combination of things makes this bug deceptively difficult to solve ~~correctly~~ safely.

This particular area of code:

Has an enormous parameter surface area. (Think of all the possible ways of combining disposal_volume, air_gap, distribute vs. consolidate vs. transfer, Single-Channel vs. 8-Channel pipettes, transfer volume, pipette volume, tip volume, and apiLevel backwards compatibility.)
Has never had parameter validation.
Generally only has test coverage of its happy paths.

So, there's currently a big space of uncharacterized, undefined behavior that you can accidentally fall into. #6170, the bug that kicked this off, originally reported an infinite loop—but depending on the exact test case, instead of that, you can get other confusing results, like a big transfer breaking down into a pick_up_tip with no liquid handling steps.

Fully characterizing this space of undefined behavior turns out to be a lot of work. The work so far in this PR only partially accomplishes it.
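As a rough illustration of how quickly that space grows, the sketch below enumerates a small grid of cases for characterization tests. All of the specific values are assumptions chosen for the example, and the `no_room_for_liquid` rule (disposal volume plus air gap meets or exceeds the tip's capacity) is a simplification, not the planner's actual logic.

```python
from itertools import product
from typing import Dict, Iterator, Union

# Illustrative values only; a real grid would also cover pipette models,
# channel counts, transfer volumes, and apiLevels.
MODES = ["transfer", "distribute", "consolidate"]
TIP_MAX_VOLUMES = [20, 300]
DISPOSAL_VOLUMES = [0, 10, 300, 400]
AIR_GAPS = [0, 10]

Case = Dict[str, Union[str, int, bool]]


def characterization_cases() -> Iterator[Case]:
    """Yield every combination, flagging those that leave no room for liquid."""
    for mode, tip_max, disposal, air_gap in product(
        MODES, TIP_MAX_VOLUMES, DISPOSAL_VOLUMES, AIR_GAPS
    ):
        yield {
            "mode": mode,
            "tip_max": tip_max,
            "disposal_volume": disposal,
            "air_gap": air_gap,
            "no_room_for_liquid": disposal + air_gap >= tip_max,
        }


cases = list(characterization_cases())
```

Even this reduced grid produces 48 cases, each of which would need its current behavior recorded before anything changes.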

But, by my understanding of our Python Protocol API's versioning policy, we must characterize and preserve that behavior in fixing these bugs. Even though it was undocumented, and didn't make sense, and only happened in incorrectly written protocols.

In short, I can't be confident that, if I fix these bugs, I won't unintentionally change behavior in some place where I'm not specifically looking.

The more general versioning problem is ticketed for research as #7477. Depending on the outcome of that research, this could get a lot easier, and we could revisit it.

If/when we come back to this, I left #todo and #fixme comments in this branch to record the next steps.

And, anyway, we think we'll rewrite this code in Protocol Engine (the ongoing overhaul of our protocol-running back-end).

SyntaxColoring added labels api (Affects the `api` project), WIP, fix (PR fixes a bug) Oct 13, 2020

SyntaxColoring self-assigned this Oct 13, 2020

SyntaxColoring commented Oct 13, 2020

SyntaxColoring linked an issue Oct 13, 2020 that may be closed by this pull request

bug: Inappropriate disposal or air_gap volume causes hang in transfers.py #6170

Closed

SyntaxColoring added 8 commits March 9, 2021 17:43

Add failing (hanging, really) test case.

a4d8384

Refactor and expand unit test.

91f300a

Improve test readability.

b83572f

Add todo comment for TransferPlan refactor.

ff19cd3

First pass at a solution, with some todo comments.

09a427a

xfail out-of-scope bugs.

79d3603

Format.

4900539

Remove todo comment for out-of-scope bug.

3cc63eb

This is covered by another todo comment in the test itself, but should also probably get its own ticket.

SyntaxColoring force-pushed the fix_api_disposal_volume_infinite_loop branch from 15037c3 to 3cc63eb Compare March 10, 2021 17:55

SyntaxColoring changed the title ~~fix(api): Don't infinite-loop when reserved volume is too high to make progress~~ fix(api): Error early when disposal_volume is too high to make progress Mar 10, 2021

SyntaxColoring changed the title ~~fix(api): Error early when disposal_volume is too high to make progress~~ fix(api): Error instead of infinite-looping when disposal_volume is too high to make progress Mar 10, 2021

Single-space error string.

fde3cd4

SyntaxColoring marked this pull request as ready for review March 10, 2021 20:53

SyntaxColoring requested a review from a team as a code owner March 10, 2021 20:53

amitlissack reviewed Mar 10, 2021

SyntaxColoring added 2 commits March 10, 2021 17:52

Exhaust iterator with list() instead of a loop.

f00b9d6

Clarify xfail comment.

44ed13e

mcous reviewed Mar 11, 2021

SyntaxColoring mentioned this pull request Mar 12, 2021

Spike: Bug compatibility in Python Protocol API versioning #7477

Closed

SyntaxColoring marked this pull request as draft March 16, 2021 15:55

Add dev dependency on pytest-timeout.

fd7787c

pipenv install --dev --keep-outdated pytest-timeout

SyntaxColoring added 4 commits March 22, 2021 14:05

Add even more hanging and failing tests.

3e9c3ca

Strictly expect failing tests to fail.

35568d5

If these unexpectedly pass, it means we changed protocol API behavior without an apiLevel bump.

Parametrize which mode (transfer/distribute/consolidate).

f9e6831

Leave todo notes.

04259e1

Elaborate on todo notes.

364b00f

SyntaxColoring closed this Mar 23, 2021

SyntaxColoring mentioned this pull request Mar 25, 2021

refactor(api): Add todo comments calling out some TransferPlan questions #7540

Merged

SyntaxColoring deleted the fix_api_disposal_volume_infinite_loop branch April 9, 2021 14:05

SyntaxColoring mentioned this pull request Sep 27, 2023

fix(api): Addition of distribution error to prevent invalid disposal values #13659

Merged
