Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add UUID to queue store #264

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

shaileshahuja
Copy link

@shaileshahuja shaileshahuja commented Aug 22, 2024

Changelog

Added

  • Introduced UUID support with github.com/google/uuid dependency
  • Implemented UUID generation for queue entries in file and memory-based queues

Changed

  • Modified Enqueue method to return (uuid.UUID, error) instead of just error
  • Updated Reader method to return (uuid.UUID, io.Reader, error)
  • Enhanced file naming in file-based queue to use UUID-based names
  • Refactored memory-based queue to store both message and UUID
  • Modified all queue-related tests to accommodate UUID changes
  • Adjusted autopaho/auto.go to handle new queue interface

Dependencies

  • Added github.com/google/uuid v1.6.0

Performance Considerations

  • Increased memory usage in memory-based queue implementation to store UUIDs alongside messages
  • Added UUID parsing operations in file-based

@MattBrittan
Copy link
Contributor

Sorry - I have been busy on other things. Will try to review this over the next week or so.

@MattBrittan
Copy link
Contributor

Really sorry that its taken me so long to look at this. The changes look good - my main concern comes back to the use of uuid.UUID in the interfaces:

type Entry interface {
	Reader() (uuid.UUID, io.Reader, error) // Provides access to the id, file contents, subsequent calls may return the same reader
	Leave() error                          // Leave the entry in the queue (same entry will be returned on subsequent calls to Peek).
	Remove() error                         // Remove this entry from the queue. Returns queue.ErrEmpty if queue is empty after operation
	Quarantine() error                     // Flag that this entry has an error (remove from queue, potentially retaining data with error flagged)
}

and

type Queue interface {
	// Wait returns a channel that is closed when there is something in the queue (will return a closed channel if the
	// queue is empty at the time of the call)
	Wait() chan struct{}

	// Enqueue add item to the queue, returns the id of the entry
	Enqueue(p io.Reader) (uuid.UUID, error)

	// Peek retrieves the oldest item from the queue without removing it
	// Users must call one of Close, Remove, or Quarantine when done with the entry, and before calling Peek again.
	// Warning: Peek is not safe for concurrent use (it may return the same Entry leading to unpredictable results)
	Peek() (Entry, error)
}

Doing this seems a bit prescriptive. If we use a database as a queue then an automatically generated field (i.e. int) would be a better IT than a UUID. The same probably applies to memory where an int would probably also be more appropropriate (could just add 1 each time an item is added to the queue). Using github.com/google/uuid here also ties us to that specific implementation of uuid (as these interfaces are public others may implement queues so changes to the interface would be breaking).

I'm not really sure what is best to return; my first thought was []byte but you can't use a slice as a map key, so I wonder if string is appropriate (the Go spec does not require that a string is valid UTF-8 (more here). However, I'm not set on this.

The benefits would be:

  • Not tied to github.com/google/uuid
  • Key type can depend upon the implementation (so an int could be used in a database/memory store, avoiding the need to generate/store a UUID where there is a better alternative)
  • UUID could still be used interanally (so your existing implementations would only need a small change).

Thoughts?

@shaileshahuja
Copy link
Author

shaileshahuja commented Oct 13, 2024 via email

@MattBrittan
Copy link
Contributor

Do you expect many users to require a custom queue implementation as a DB?

I really don't know. One thing I've learnt working on the Paho libraries is that users do a lot of unexpected things, I could see a situation where a user has a server, running under Kubernetes, that is publishing a lot of messages and they want the queue to be stored in a persisted location (i.e. database) - the queue might outlive a particular instance of the server. As a result I'd rather keep things as flexible as possible (technically a UUID would work in this situation, it's just not optimal).

Personally I just use the file store...

Using a string seems a bit hacky, but if it's worth the benefit to the users, I am open to it

I agree, however I also feel that using a UUID everywhere is a bit hacky too (particulary for the memory store where a counter would be more performant and have zero risk of collisions).

My initial thought was to use a []byte, but thought that would be an issue when used in a map. However queue does not actually use a map so I may have been overthinking that, and we could just use []byte. I think my thought process had moved on to the store and whether the ID used in the queue would be available there too (but I guess we can replace the queue ID with a MID when the packet is sent).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants