Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix copy-view issue in epochs #12121

Merged
merged 16 commits into from
Nov 9, 2023
Merged

Fix copy-view issue in epochs #12121

merged 16 commits into from
Nov 9, 2023

Conversation

pablomainar
Copy link
Contributor

@pablomainar pablomainar commented Oct 21, 2023

Closes #12114

  • Add deprecation
  • Get pytest runs green
  • Get [circle full] green

@agramfort
Copy link
Member

@pablomainar thx for the PR. you'll need to add tests. thx

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be different from #12117 in some way? Trying to figure out why that one was closed and this one opened in its place... the diff looks the same though.

mne/epochs.py Outdated
Comment on lines 1676 to 1677
if copy:
data = data.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest moving this up about 25 lines, inside the if branch of the if self.preload conditional. The else branch defines data = np.empty(...) so that will never be a view of the object's data and thus the copy will never be necessary.

mne/epochs.py Outdated
@@ -1588,6 +1588,7 @@ def _get_data(
units=None,
tmin=None,
tmax=None,
copy=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this private method we should probably default to copy=False so that we're not making unnecessary copies in internal calls that use the private API.

mne/epochs.py Outdated Show resolved Hide resolved
@pablomainar
Copy link
Contributor Author

@drammock thanks for the suggestions, I updated the PR accordingly

@drammock
Copy link
Member

drammock commented Oct 30, 2023

@pablomainar there are several failing CIs. Can you look into them? LMK if you need help or guidance. Here's a link to two failures that look related: https://github.com/mne-tools/mne-python/actions/runs/6639926213/job/18137375032?pr=12121#step:17:4209

I recommend running those two tests locally and seeing if you can figure out what is causing the failure. pytest mne/tests/test_epochs.py -k baseline_basic and pytest mne/viz/tests/test_epochs.py -k plot_psd_epochs

@pablomainar
Copy link
Contributor Author

Done. I changed the test slightly because the two failing ones were assuming that a view was returned from get_data(), which is no longer the case by default. Should I also write other tests to check that a copy is returned instead of a view when the copy argument is set to True?

@drammock
Copy link
Member

drammock commented Nov 2, 2023

Should I also write other tests to check that a copy is returned instead of a view when the copy argument is set to True?

yes, a small test to verify that a copy is returned would be good. Using np.array.base would be a good way to test it: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.base.html

@drammock
Copy link
Member

drammock commented Nov 2, 2023

also heads-up that I merged in main here to get the tests to run (and pass, there were some CI failures here that should be fixed on main). So be sure to git pull before working on the new tests.

@pablomainar
Copy link
Contributor Author

Should I also write other tests to check that a copy is returned instead of a view when the copy argument is set to True?

yes, a small test to verify that a copy is returned would be good. Using np.array.base would be a good way to test it: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.base.html

Well, I'm not sure if using base would work because get_data(copy=True) returns a view of a copy of the underlying _data. Therefore calling base on it, will return an array, not a None value.
Instead, I have modified the copy and then called again get_data() to make sure that result has not been modified. I have also added a second test for get_data(copy=False) with the same structure. I am not sure if this second test is needed, specially because it is changing the underlying _data and therefore corrupting the epochs. Might be confusing for future tests. Let me know what you think :)

@drammock
Copy link
Member

drammock commented Nov 2, 2023

I'm not sure if using base would work because get_data(copy=True) returns a view of a copy of the underlying _data.

Ah OK, that's because we always do data[select] at some point right? I wonder how important it is for data.base and data.flags.owndata to look "correct" here. @larsoner do you think it's a problem that epochs.get_data(copy=True).flags.owndata will always be (unexpectedly) False if epochs is preloaded? Right now on this PR .flags.owndata is 100% determined by whether the data are preloaded (assuming no picks passed):

import mne

sample_data_folder = mne.datasets.sample.data_path()
sample_data_raw_file = sample_data_folder / "MEG" / "sample" / "sample_audvis_raw.fif"
raw = mne.io.read_raw_fif(sample_data_raw_file, verbose=False, preload=False)
events = mne.find_events(raw, stim_channel="STI 014")
event_dict = {"auditory/left": 1}
kwargs = dict(event_id=event_dict, proj=False)
ondisk_epochs = mne.Epochs(raw, events, preload=False, **kwargs)
preloaded_epochs = mne.Epochs(raw, events, preload=True, **kwargs)

print(
    preloaded_epochs.get_data(copy=False).flags.owndata,
    preloaded_epochs.get_data(copy=True).flags.owndata,
)
# False False

print(
    ondisk_epochs.get_data(copy=False).flags.owndata,
    ondisk_epochs.get_data(copy=True).flags.owndata,
)
# True True

Ideally for preloaded epochs we should get owndata == False when copy=False and owndata == True when copy=True (again, as long as no picks are passed)

@larsoner
Copy link
Member

larsoner commented Nov 3, 2023

Instead of checking owndata directly in the preloaded case we can use np.shares_memory(out_data, epochs._data). It should be True for copy=False and False for copy=True. np.may_share_memory probably would also work and be faster

https://numpy.org/doc/stable/reference/generated/numpy.may_share_memory.html#numpy.may_share_memory

@pablomainar
Copy link
Contributor Author

Good idea @larsoner, I changed the test to use np.shares_memory. Because it is just a test with a small dataset, it won't be much slower than np.may_share_memory.
However the issue raised by @drammock is still valid: a user might be confused if they want to check if data is copy and data.flags.owndata equals False. Is this a problem?

@larsoner
Copy link
Member

larsoner commented Nov 3, 2023

I think that's a pretty advanced use case / user who would do this so it's okay

@larsoner
Copy link
Member

larsoner commented Nov 3, 2023

FYI I'm going to merge in main to get all tests to run. For the future if you click the CircleCI stotus / Details and log into GitHub there it will let those build on your PRs

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last we need a changelog entry. We have two options:

  1. Make the default copy=None which means "use copy=False and if it matters (epochs preloaded and no fancy indexing so it's a view), emit a warning that the default will change to copy=True in 1.6". This is the safest but most annoying option to code. Then you would add an entry in the API section because it's a deprecation. I think I'm +0.5 on this one. (I can help with this step with more explanation or a quick commit if it's not clear.)
  2. Make the default copy=True now, which you've done. This is the simplest, and would then require an entry in the Bug section saying that this kwarg was added with default copy=True to avoid silent, possibly errant / dangerous behavior from 1.5.

@drammock any opinion on which we should do?

mne/epochs.py Outdated Show resolved Hide resolved
mne/epochs.py Outdated Show resolved Hide resolved
@drammock
Copy link
Member

drammock commented Nov 3, 2023

@drammock any opinion on which we should do?

I vote for option 2. I still think people should not have been relying on the fact that we returned a view, so if those (presumably rare) cases break I'm OK with it. It's outweighed by the potential good of eliminating silent bugs. see below

@cbrnr
Copy link
Contributor

cbrnr commented Nov 3, 2023

This sounds a bit bold in the light of many previous discussions on deprecations and not breaking people's code, given that this behavior is (was) clearly documented:

Returns
data : numpy.ndarray of shape (n_epochs, n_channels, n_times)
A view on epochs data.

Why do you now say "people should not have been relying on the fact that we returned a view"?

@drammock
Copy link
Member

drammock commented Nov 3, 2023

This sounds a bit bold in the light of many previous discussions on deprecations and not breaking people's code, given that this behavior is (was) clearly documented:

argh, you're right, I forgot (again) that this was documented behavior. Sorry. I switch my vote to doing a proper deprecation.

@larsoner larsoner added this to the 1.6 milestone Nov 7, 2023
@larsoner
Copy link
Member

larsoner commented Nov 7, 2023

@pablomainar do you need help with the deprecation? If so I can push some quick commits. We'll probably release 1.6 in the next week or so and it would be great to get this in!

@pablomainar
Copy link
Contributor Author

Sorry @larsoner, I've been quite busy this week. I'm not sure I'll find time to finish this in the next week, it would be very helpful if you could help me out. Thanks a lot!

* upstream/main:
  BUG: Fix bug with spectrum warning (mne-tools#12186)
  Add argument splash to disable splash-screen from Qt-browser (mne-tools#12185)
  BUG: Fix bug with logging and n_jobs>1 (mne-tools#12154)
  Use gray logo (works in light and dark modes) (mne-tools#12184)
  Tweak logo for dark mode (mne-tools#12176)
  ENH: Improve Covariance.__repr__ (mne-tools#12181)
  ENH: Enable sensor-specific OPM coregistration in mne coreg (mne-tools#11405)
  Tweak README.rst (mne-tools#12166)
  [pre-commit.ci] pre-commit autoupdate (mne-tools#12177)
  MAINT: Add branch coverage (mne-tools#12174)
  OpenSSF (mne-tools#12175)
  fix docstring in 60_sleep.py (mne-tools#12171)
  FIX: skip empty lines in read_raw_eyelink (mne-tools#12172)
  FIX: Fix bug with coreg scalars (mne-tools#12164)
  Changed casting rule in np.clip to allow reading of raw GDF files (mne-tools#12168)
  [DOC] Add documentation for setting montage order (mne-tools#12160)
  Fix inferring fiducials from EEGLAB (mne-tools#12165)
@larsoner
Copy link
Member

larsoner commented Nov 9, 2023

Deprecation complete, ready for review/merge @drammock

I ran [circle full] earlier and modified all the failing examples in this PR so we should be good once this lands in main

@drammock drammock merged commit 7b3e3c9 into mne-tools:main Nov 9, 2023
28 checks passed
Copy link

welcome bot commented Nov 9, 2023

🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪

larsoner added a commit to hoechenberger/mne-python that referenced this pull request Nov 13, 2023
…o-pyproject.toml

* upstream/main:
  Fix copy-view issue in epochs (mne-tools#12121)
@sappelhoff
Copy link
Member

image

@drammock, tagging you as you seem to have stumbled over my issue in mne-bids (and autoreject) --> shouldn't the new "copy" parameter for the Epochs' get_data method have been marked with "New in v1.6"?

@larsoner
Copy link
Member

Yes it should have that, too, I'll add it in the soon-to-be-merged #12178

scott-huberty added a commit to lina-usc/pylossless that referenced this pull request Jan 18, 2024
* MAINT: Update documentation and logging

- Nice new landing page
- Added covenience makefile commands
- improved logging of flagged epochs

* FIX, MAINT: Bump iclabel version and ignore mne warning

We already throw an error if the user has mne iclabel version less than .5, so added a pin for minimum version of .5 in the requirements file. MNE is also throwing a future warnign for an API kwarg change in epochs.get_data, where the current default copy=False is changing to copy=True. See mne-tools/mne-python#12121 . I'm a little worried about the additional copies we will incur every single time that we call that method, but I dont have a strong reason to stray from what MNE believes is a bug fix so lets go with it and see how it plays.

* FIX, MAINT: Bumpg GH actions checkout versions

- This should resolve the core issue of the runner not being able to find the most up to date
mne_iclabel wheel.

* FIX, TST: Install torch in doc building CI

Since MNE iclabel 0.5, torch is NOT installed by default.
snwnde pushed a commit to snwnde/mne-python that referenced this pull request Mar 20, 2024
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel McCloy <[email protected]>
Co-authored-by: Eric Larson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calling get_data() on preloaded Epochs returns wrong units
6 participants