-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix intan one-file-per-channel
#1601
Fix intan one-file-per-channel
#1601
Conversation
Love the use of the dicts instead of the lists. I'm busy for the next couple days, but since this ~200 lines of change I'll try to carefully read this at some point this week so we can add this in. Unless Sam has the bandwidth before me :) |
Thanks, @zm711 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only question is how do I know which stream index to use (ie if I as a previous user < 14.0 if this PR makes it....)? Like if I were stream_index 15 in my code previously would this be a breaking change? I can test if needed.
Otherwise I think this looks really great. A lot better--definitely more robust.
|
||
# we need time to be the last value | ||
raw_file_paths_dict[15] = [Path(dirname / "time.dat")] | ||
raw_file_paths_dict["timestamp"] = [Path(dirname / "time.dat")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment can be deleted above here since last value means nothing now :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
Thanks @zm711 . I implemented your feedback.
Good point. My impression is that there is no contract in neo that stream indices have a fixed meaning. Ids and Names should be fixed but the stream indices depend on the specific instance of the format. Some context:
All that said, it is unfortunate that the raw API for neo exposes the stream_index as the primary access method. My impression is that the normal io of neo and spikeinterface were the means to access the raw API and that the raw.io API was not designed to be relied on directly. My opinion is that either exposing stream_index directly or coupling it to the signal_streams structure was a mistake that was done because segment index and block index have natural order semantics. If this had been done with def _get_analogsignal_chunk(
self,
block_index: int,
seg_index: int,
i_start: int | None,
i_stop: int | None,
stream_index: int,
channel_indexes: list[int] | None,
):
stream_id = self.header["signal_streams"][stream_index]["id"]
buffer_id = self.header["signal_streams"][stream_index]["buffer_id"] def channel_id_to_index(self, stream_index: int, channel_ids: list[str]):
"""
Inside a stream, transform channel_ids to channel_indexes.
Based on self.header['signal_channels']
channel_indexes are zero-based offsets within the stream
Parameters
----------
stream_index: int
the stream index in which to convert the channel_ids to channel_indexes
channel_ids: list[str]
the list of channel_ids to convert to channel_indexes
Returns
-------
channel_indexes: np.array[int]
the channel_indexes associated with the given channel_ids
"""
# unique ids is already checked in _check_stream_signal_channel_characteristics
stream_id = self.header["signal_streams"][stream_index]["id"]
mask = self.header["signal_channels"]["stream_id"] == stream_id Yes, maybe something to discuss with the team. The thorny issue is that code like this is commonly use to get the stream_index: def test_sync(self):
rawio_with_sync = SpikeGLXRawIO(self.get_local_path("spikeglx/NP2_with_sync"), load_sync_channel=True)
rawio_with_sync.parse_header()
stream_index = list(rawio_with_sync.header["signal_streams"]["name"]).index("imec0.ap") It is hard to square that use with a variable number of streams (as in Plexon). |
Thanks yeah. I agree the design is a little tricky since the I guess to summarize my problem is that we don't actually have enough constraints on how to build a rawio and so different ios were build differently. I think this is a great improvement overall and will reduce mistakes. Could you run the intan test suite from spikeinterface on this branch and double check that it converts everything appropriately to make sure at the SI level we didn't make any assumptions :) |
Sure, but we can also just run the index on master right now and re-order the signal stream header so it matches what we had before. Could you check that? I can check the Spikeinterface but I think that over there we don't rely on stream index. |
In fact, let's do just the re-ordering and then we don't need to check anything else as that would provide backward compatibility with what you had before. I will take care of it. |
Oh, in fact we are already sorting so this does should keep that behavior: python-neo/neo/rawio/intanrawio.py Lines 246 to 265 in 2576037
I also double-checked the order of the streams on the signals header (which is what determines stream_index) and found no change between master and this PR. Here is the code and output: from pathlib import Path
import subprocess
from neo.rawio import IntanRawIO
from pprint import pprint
import time
neo_test_dir = Path("/home/heberto/ephy_testing_data/intan")
intan_test_cases_folder_paths = [p for p in neo_test_dir.iterdir() if p.is_dir()]
for test_case_folder_path in intan_test_cases_folder_paths:
info_rhd_path = test_case_folder_path / "info.rhd"
info_rhs_path = test_case_folder_path / "info.rhs"
if info_rhd_path.exists() or info_rhs_path.exists():
filename = info_rhd_path if info_rhd_path.exists() else info_rhs_path
print("--------------------")
print(test_case_folder_path.name)
print(filename.suffix)
# Switch to the master branch
subprocess.run(["git", "switch", "master"], check=True)
time.sleep(1) # Just so the print works
reader = IntanRawIO(filename=filename)
reader.parse_header()
pprint(reader.header["signal_streams"])
subprocess.run(["git", "switch", "fix_intan_one_file_per_channel"], check=True)
time.sleep(1) # Just so the print works
reader = IntanRawIO(filename=filename)
reader.parse_header()
pprint(reader.header["signal_streams"])
--------------------
rhd_fpc_multistim_240514_082044
.rhd
Your branch is up to date with 'origin/master'.
Already on 'master'
array([('RHD2000 amplifier channel', '0', ''),
('RHD2000 auxiliary input channel', '1', ''),
('USB board ADC input channel', '3', ''),
('USB board digital input channel', '4', ''),
('USB board digital output channel', '5', '')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
Your branch is up to date with 'origin/fix_intan_one_file_per_channel'.
Switched to branch 'fix_intan_one_file_per_channel'
array([('RHD2000 amplifier channel', '0', ''),
('RHD2000 auxiliary input channel', '1', ''),
('USB board ADC input channel', '3', ''),
('USB board digital input channel', '4', ''),
('USB board digital output channel', '5', '')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
--------------------
intan_fps_test_231117_052500
.rhd
Your branch is up to date with 'origin/master'.
Switched to branch 'master'
array([('RHD2000 amplifier channel', '0', '0'),
('RHD2000 auxiliary input channel', '1', '1'),
('USB board ADC input channel', '3', '3'),
('USB board digital input channel', '4', '4')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
Your branch is up to date with 'origin/fix_intan_one_file_per_channel'.
Switched to branch 'fix_intan_one_file_per_channel'
array([('RHD2000 amplifier channel', '0', '0'),
('RHD2000 auxiliary input channel', '1', '1'),
('USB board ADC input channel', '3', '3'),
('USB board digital input channel', '4', '4')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
--------------------
intan_fpc_test_231117_052630
.rhd
Your branch is up to date with 'origin/master'.
Switched to branch 'master'
array([('RHD2000 amplifier channel', '0', ''),
('RHD2000 auxiliary input channel', '1', ''),
('USB board ADC input channel', '3', ''),
('USB board digital input channel', '4', '')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
Your branch is up to date with 'origin/fix_intan_one_file_per_channel'.
Switched to branch 'fix_intan_one_file_per_channel'
array([('RHD2000 amplifier channel', '0', ''),
('RHD2000 auxiliary input channel', '1', ''),
('USB board ADC input channel', '3', ''),
('USB board digital input channel', '4', '')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
--------------------
intan_fps_rhs_test_240329_091536
.rhs
Your branch is up to date with 'origin/master'.
Switched to branch 'master'
array([('RHS2000 amplifier channel', '0', '0'),
('USB board ADC input channel', '3', '3'),
('USB board ADC output channel', '4', '4'),
('USB board digital input channel', '5', '5'),
('USB board digital output channel', '6', '6'),
('DC Amplifier channel', '10', '10'), ('Stim channel', '11', '11')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
Your branch is up to date with 'origin/fix_intan_one_file_per_channel'.
Switched to branch 'fix_intan_one_file_per_channel'
array([('RHS2000 amplifier channel', '0', '0'),
('USB board ADC input channel', '3', '3'),
('USB board ADC output channel', '4', '4'),
('USB board digital input channel', '5', '5'),
('USB board digital output channel', '6', '6'),
('DC Amplifier channel', '10', '10'), ('Stim channel', '11', '11')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
--------------------
intan_fpc_rhs_test_240329_091637
.rhs
Your branch is up to date with 'origin/master'.
Switched to branch 'master'
/home/heberto/development/python-neo/neo/rawio/intanrawio.py:826: UserWarning: Stim not implemented for `one-file-per-channel` due to lack of test files
warnings.warn("Stim not implemented for `one-file-per-channel` due to lack of test files")
Switched to branch 'fix_intan_one_file_per_channel'
array([('RHS2000 amplifier channel', '0', ''),
('USB board ADC input channel', '3', ''),
('USB board ADC output channel', '4', ''),
('USB board digital input channel', '5', ''),
('USB board digital output channel', '6', ''),
('DC Amplifier channel', '10', '')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
Your branch is up to date with 'origin/fix_intan_one_file_per_channel'.
array([('RHS2000 amplifier channel', '0', ''),
('USB board ADC input channel', '3', ''),
('USB board ADC output channel', '4', ''),
('USB board digital input channel', '5', ''),
('USB board digital output channel', '6', ''),
('DC Amplifier channel', '10', '')],
dtype=[('name', '<U64'), ('id', '<U64'), ('buffer_id', '<U64')])
/home/heberto/development/python-neo/neo/rawio/intanrawio.py:826: UserWarning: Stim not implemented for `one-file-per-channel` due to lack of test files
warnings.warn("Stim not implemented for `one-file-per-channel` due to lack of test files") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this then. I'll ping Sam to look at our discussion, but as long as this maintains behavior then the discussion is more for future development than this PR. Thanks Heberto for putting up with me :)
I branched the index discussion as a separate issue because it seems like an important thing: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last comment then ready to merge @h-mayorquin .
channel_names = [p.name[4:-4] for p in amplifier_file_paths] | ||
|
||
for channel_name, amplifier_file_path in zip(channel_names, amplifier_file_paths): | ||
data_raw = np.fromfile(amplifier_file_path, dtype=np.int16).squeeze() | ||
data_from_neo = intan_reader.get_analogsignal_chunk(channel_ids=[channel_name], stream_index=0).squeeze() | ||
np.testing.assert_allclose(data_raw, data_from_neo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get a note in comment about why this works...
so p.name[4:-4] means nothing to someone who doesn't know the file format. So I would explain why 4:-4. And if you're feeling ambitious maybe a better name than p (since I know a certain developer who hates one letter variable names....).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I was lazy here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Sorry I merged main into the branch, but this bug that is fixed on main is now popping up on RTD here. It appeared on the PR we just merged. Sorry let me try to track it down. |
git log revert? |
It broke your other PR too. So I would love to forget it out. You want to look at the log and see if you see anything ? To me it looks like the code should be fine, so I don't know why it is using an old version of the plexon code.... I've been googling to see if there is a cache I can empty but seems like RTD doesn't make it easy. |
Okay I found another Plexon overflow that looked the same as the last one so I didn't even realize it was a different one which had messed with me. Once we test and merge #1612 we can merge main into your PRs and that should fix everything :) |
I guessed that Pleon will bite you, one way or the other. |
Hi Ramon |
Fix #1599
The fix is composed of two steps:
stream_index
one-file-per-channel
case based on their channel names.The first point also has the advantage that it removes the need to keep yet another numeric reference (that is not stream_id or stream_index) for the different data structures.