Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Session.get handler only calls done callback in v0.11.0-rc.2 #204

Closed
fsteff opened this issue May 6, 2024 · 10 comments
Closed

[Bug] Session.get handler only calls done callback in v0.11.0-rc.2 #204

fsteff opened this issue May 6, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@fsteff
Copy link

fsteff commented May 6, 2024

Describe the bug

Session.get behaves differently in 0.11.0-rc.2 compared to 0.10.1rc0.
When using the Closure reply handler (reply and done callbacks), the reply callback is not called.

To reproduce

Considering the following example:

import time
import zenoh

session1 = zenoh.open()
session2 = zenoh.open()

qable = session1.declare_queryable('test', lambda req: req.reply(zenoh.Sample('test', 'ok')))


session2.get('test', 
             (lambda reply: print(f'reply "{reply.ok.payload.decode()}"'),
                      lambda: print('done')))

time.sleep(1)
qable.undeclare()
session1.close()
session2.close()

With version 0.10.1rc0 the console output is as expected:

reply "ok"
done

With version 0.11.0-rc.2 only the done callback is called:

done

System info

  • Platform Win11
  • Python version 3.10.9
  • eclipse-zenoh version 0.11.0-rc.2
@fsteff fsteff added the bug Something isn't working label May 6, 2024
@fsteff
Copy link
Author

fsteff commented May 6, 2024

Could this be related to eclipse-zenoh/zenoh#816?

@Mallets
Copy link
Member

Mallets commented May 6, 2024

@fsteff the PR you linked is not merge into main yet os it can't be related.
@wyfo can you look into it?

@wyfo
Copy link
Contributor

wyfo commented May 6, 2024

I can reproduce the issue, I'm on it.

@wyfo
Copy link
Contributor

wyfo commented May 6, 2024

It seems that Session.declare_queryable changed in 0.11; the operation is no more taken in account instantly. In fact, adding time.sleep(1) after session1.declare_queryable(...) fix the issue.
@Mallets Do you remind about declare_queryable behavior modification?

@fsteff
Copy link
Author

fsteff commented May 7, 2024

I can confirm that a short sleep resolves the issue, also in more complex test cases. Actually even time.sleep(.000000001) is enough.
Sounds like some sort of concurrency problem (OS context switch is enough?).

@fsteff
Copy link
Author

fsteff commented May 7, 2024

Looks like this won't be a problem in most cases. But it breaks most of our unit tests...

@OlivierHecart
Copy link
Contributor

OlivierHecart commented May 7, 2024

Hello @fsteff!
The example you provided has never been guaranteed to work. It always has been a race condition between the propagation of the queryable declaration to session2 through the net stack on one side and the execution of the main thread on the second side. In 0.10.0-rc the scheduling was so that it almost always worked in python. In 0.11.0-rcX we ported zenoh from async-std to tokio which greatly impacted the scheduling.

@fsteff
Copy link
Author

fsteff commented May 7, 2024

@OlivierHecart I suspected as much. Already fixed most unit tests and upgraded our stack :)
It was convenient that this "just worked" - is there a good way to verify that queryables have been propagated? Calling sleep(...) works most of the time, but that's not really a clean solution.
The only method I know of utilizes a router's admin space, but out tests work purely p2p.

@Mallets
Copy link
Member

Mallets commented May 7, 2024

Arguably a new entity, let's say it a querier, should be introduce to serve a similar to a publisher. In that way, it may become easier to know when there is at least one matching queryable. But my comment applies more generic to Zenoh as whole and not only to the Python API.

@fsteff
Copy link
Author

fsteff commented Oct 18, 2024

i guess this can be closed?

@fsteff fsteff closed this as completed Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants