Make generated WidebandModulationsDataset samples independent of call order #250

Open
dustinlagoy opened this issue Oct 8, 2024 · 2 comments
Labels
enhancement New feature or request

@dustinlagoy

First, thanks for all the work on this project. It is a great resource!!!

Is your feature request related to a problem? Please describe.
If I use the WidebandModulationsDataset to generate samples without using a DataLoader (or DatasetLoader), the samples depend on the order in which they are generated. For example:

data = WidebandModulationsDataset(...)
assert data[0] == data[0]

will fail because the generated sample at index 0 (or any index) changes each time you call data[0]. Both the number and characteristics of the generated signals and the added noise change on each subsequent call. This makes using on-the-fly generation of samples difficult unless one can ensure they are always generated in the same order.
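
To illustrate the failure mode described above, here is a minimal sketch using a toy stand-in. `SharedRngDataset` is a hypothetical class invented for this example and is not TorchSig code; it only assumes that a single random generator is shared across all `__getitem__` calls, which is one common way this kind of order dependence arises.

```python
import numpy as np


class SharedRngDataset:
    """Hypothetical toy dataset for illustration; NOT the TorchSig implementation."""

    def __init__(self, num_samples: int, seed: int = 0):
        self.num_samples = num_samples
        # One generator shared by every call: its state advances on each draw.
        self.rng = np.random.default_rng(seed)

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, index: int) -> np.ndarray:
        if not 0 <= index < self.num_samples:
            raise IndexError(index)
        # The index plays no part in the random draw, so the result depends
        # only on how many samples were generated before this call.
        return self.rng.standard_normal(8)


data = SharedRngDataset(num_samples=4)
first = data[0]
second = data[0]
# Compare with np.array_equal: `==` on arrays is elementwise, not a single bool.
same = np.array_equal(first, second)
```

Here `same` comes out false, because the second call to `data[0]` draws from a generator whose state has already advanced.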

Describe the solution you'd like
I think the dataset should generate the exact same sample for a given index regardless of any previous sample generation.
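
One way to get this behavior, sketched here under assumptions, is to build a fresh random generator inside `__getitem__` that is keyed on both a base seed and the requested index. `PerIndexSeededDataset` is a hypothetical toy class and does not reflect how TorchSig or the referenced pull request implements it.

```python
import numpy as np


class PerIndexSeededDataset:
    """Hypothetical toy dataset for illustration; NOT the TorchSig implementation."""

    def __init__(self, num_samples: int, seed: int = 0):
        self.num_samples = num_samples
        self.seed = seed

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, index: int) -> np.ndarray:
        if not 0 <= index < self.num_samples:
            raise IndexError(index)
        # A new generator per call, seeded from (base seed, index), so the
        # output for an index never depends on earlier calls.
        rng = np.random.default_rng([self.seed, index])
        return rng.standard_normal(8)


data = PerIndexSeededDataset(num_samples=4)
a = data[0]
_ = data[3]  # an unrelated call in between
b = data[0]
repeatable = np.array_equal(a, b)
```

With this scheme `repeatable` is true regardless of which other indices were requested in between. In a real dataset, every source of randomness used to build a sample (signal parameters, noise, transforms) would need to draw from the per-index generator for the guarantee to hold.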

Describe alternatives you've considered
When generating samples to be written to disk, or training with on-the-fly generation of samples, the data loader may ensure (it does at least for writing samples to disk) that the order of calls to WidebandModulationsDataset.__getitem__ is consistent, which works around this issue. This may be sufficient for all practical use cases.

Additional context
I opened a pull request (#249) with sufficient changes to fix this issue. I understand if this feature is not desirable. In that case it may be nice to make this behavior clear in the documentation somewhere.

@MattCarrickPL
Collaborator

Thanks for submitting this. We are discussing internally.

@ereoh ereoh added this to the v0.6.1 milestone Nov 6, 2024
@TorchDSP TorchDSP added this to TorchSig Nov 6, 2024
@ereoh ereoh added the enhancement New feature or request label Nov 6, 2024
@ereoh
Collaborator

ereoh commented Nov 6, 2024

Hello! Just providing some updates.

We are currently in the process of doing a major overhaul and rewrite of our code for v1.0.0, which will hopefully be released by early next year. Our main goal for this rewrite is to allow infinite datasets or on-the-fly generation as you've described (with determinism).

Until then, your PR only seems to work for clean versions of wideband. If you are fine with that, I can merge it. Otherwise, you can wait for the rewrite.
