Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix trialization #130

Merged
merged 35 commits into from
Mar 3, 2022
Merged

fix trialization #130

merged 35 commits into from
Mar 3, 2022

Conversation

jihyunbak
Copy link
Member

@jihyunbak jihyunbak commented Feb 22, 2022

Description and related issues

This grew into a large PR, but major changes are the following:

  • Stimulus metadata are now integrated into the metadata/resources/list_of_stimuli.yaml file in this package. The <stim_name>.yaml files in the metadata repo are no longer used.
  • Trialization-related values in the stimulus metadata are fixed.
  • All tokenizers fixed.
  • StimValueExtractor is deleted; instead stimulus parameters are loaded directly in ToneTokenizer and TIMITTokenizer.
  • Introduced RecManager, which keeps the TDT reader. Previously we were constructing TDTReader twice, once for NeuralDataOriginator and once for StimulusOriginator/MarkManager ; this was time consuming. Now RecManager is initialized only once and is passed to the neural/stimulus originators.

This PR closes these issues:

Checklist:

Did not run the tests in the test folder, but this branch successfully generates nwb for wn2 (RVG06_B03), tone (RVG06_B05), timit (RVG06_B08) and dmr (RVG06_B09) blocks.

  • All tests pass on catscan: run pytest --basetemp=tmp -sv -n 8 tests on catscan from the root directory
  • If needed, docs have been update: docs/source has been updated for any added, moved, or removed files
  • Docs build with no errors: run make clean & make html from the docs folder
  • No python formatting errors: run flake8 nsds_lab_to_nwb tests from the root directory

@jthermiz
Copy link
Member

jthermiz commented Feb 25, 2022

@jihyunbak

  1. I tried to run this PR on RVG16_B01 (wn2) and I'm getting 182 trials = 60 stimuli + 120 baseline + 2 extra baseline in the beginning

It looks like there is a baseline period that starts right after the stimulus. I think we want to space out the baseline farther away from the stimulus offset to ensure it's a true baseline.

I have not yet inspect whether the marks are at the correct times.

  1. I also tried running this PR on a RVG16_B06 (tone) and got a similar issue -- ~50% more trials than expected due to extra baseline trials.

  2. Re: "Stimulus metadata are now integrated into the metadata/resources/list_of_stimuli.yaml file in this package. The <stim_name>.yaml files in the metadata repo are no longer used"

Is this just a one off change or are we moving away from the metadata repo completely and integrating everthing to the nwb repo?

@jihyunbak
Copy link
Member Author

jihyunbak commented Feb 25, 2022

@jthermiz Thanks for reviewing!

Re 1, 2) splitted baseline trials, that's just because I worked in terms of between-marks intervals. So when the stimulus is located in the middle of the between-marks interval, the current code creates two baseline trials before/after the stimulus, such that the post-prev-stim baseline and the pre-next-stim baseline are actually connected. But if this is inconvenient, it is simple to fix the code to create only (1 stim trial + 1 baseline trial) per mark, plus 2 optional extra baseline trials in the beginning/end of recording. I will implement this.

Re 2) the baseline trial being too close to the stimulus:

It looks like there is a baseline period that starts right after the stimulus. I think we want to space out the baseline farther away from the stimulus offset to ensure it's a true baseline.

This is a really good point, I just don't have a good sense about this issue to make decisions on how far away from the stimulus the baseline should start. Let's discuss more in the Slack channel!

Re 3) Location of stimulus metadata: For now, I think I will keep the stimulus metadata file list_of_stimuli.yaml in this code repository and deprecate the old stimulus metadata in the metadata repo, for two reasons:

  • It makes sense to merge the information in the previous list_of_stimuli.yaml (with the audio/mark/parameter paths) and the information in the previous <stim_name>.yaml in the metadata repo.
  • We are using specialized tokenizers for each stimulus type anyways, so the stimulus metadata (unlike the probe metadata) are actually more tightly coupled with the codebase. Also, in the developing phase like now, it helps to see all changes together in the code repo. Perhaps we could decide to move list_of_stimuli.yaml back to the metadata repo later.

But the metadata repo still has the probe metadata (electrode coordinates etc.), which can stay away from the codebase.

@jthermiz
Copy link
Member

jthermiz commented Mar 1, 2022

Sounds good @jihyunbak.

Closing the loop on 2). For the discrete stimuli like white noise and tone, there appears to be consensus for this:

The baseline trial centered about this point:

baseline center = stim offset + (next stim onset - stim offset)/2
And that baseline duration = stimulus duration

@JesseLivezey also said:

  • timit would probably need different logic
  • For DMR we would probably need to have baseline times at the beginning and end

@jihyunbak
Copy link
Member Author

Thanks @jthermiz! Follow-up:

Baseline fixes:

  • For tone and wn, now there are exactly (N+2) baseline trials with N stimulus trials. The two extra baselines are at the beginning/end of the recording. Also implemented @jthermiz 's suggested baseline structure (centered between two stimulus periods) for tone and wn stimuli. See list_of_stimuli.yaml for baseline_start and baseline_end values.
  • For timit and dmr, there should be N stimulus trials (N=998 for timit, N=1 for dmr) plus 2 baseline trials at the beginning/end of the recording. The last baseline period starts baseline_start = 3.0 s after the end of the last stimulus trial. This value of 3.0s is an arbitrary choice and is set by list_of_stimuli.yaml.

Additional changes:

  • Introduced RecManager, which keeps the TDT reader. Previously we were constructing TDTReader twice, once for NeuralDataOriginator and once for StimulusOriginator/MarkManager ; this was time consuming. Now RecManager is initialized only once and is passed to the neural/stimulus originators.

@jihyunbak jihyunbak merged commit 3e1a0f8 into main Mar 3, 2022
@jihyunbak jihyunbak deleted the fix-trialization branch March 5, 2022 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

incorrect metadata values for stim trial structures Incorrect number of trials for RVG16_B01
2 participants