Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Treat directory formats like other data files #2064

Merged
merged 2 commits into from
Aug 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions bids-validator/src/files/ignore.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,6 @@ const defaultIgnores = [
'code/',
'stimuli/',
'log/',
'**/meg/*.ds/**',
'**/micr/*.zarr/**',
]

/**
Expand Down
19 changes: 18 additions & 1 deletion bids-validator/src/schema/context.ts
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,10 @@ export class BIDSContextDataset implements ContextDataset {
sidecarKeyValidated: Set<string>
options?: ValidatorOptions
schema: Schema
pseudofileExtensions: Set<string>

constructor(
args: Partial<BIDSContextDataset>
args: Partial<BIDSContextDataset>,
) {
this.schema = args.schema || {} as unknown as Schema
this.dataset_description = args.dataset_description || {}
Expand All @@ -46,6 +47,13 @@ export class BIDSContextDataset implements ContextDataset {
this.options = args.options
}
this.issues = args.issues || new DatasetIssues()
this.pseudofileExtensions = new Set<string>(
args.schema
? Object.values(this.schema.objects.extensions)
?.map((ext) => ext.value)
?.filter((ext) => ext.endsWith('/'))
: [],
)
}

get dataset_description(): Record<string, unknown> {
Expand All @@ -59,6 +67,15 @@ export class BIDSContextDataset implements ContextDataset {
: 'raw'
}
}

isPseudoFile(file: FileTree): boolean {
const { suffix, extension, entities } = readEntities(file.name)
return (
suffix !== '' &&
Object.keys(entities).length > 0 &&
this.pseudofileExtensions.has(`${extension}/`)
)
}
}

export class BIDSContextDatasetSubjects implements ContextDatasetSubjects {
Expand Down
34 changes: 32 additions & 2 deletions bids-validator/src/schema/walk.ts
Original file line number Diff line number Diff line change
@@ -1,8 +1,34 @@
import { BIDSContext, BIDSContextDataset } from './context.ts'
import { FileTree } from '../types/filetree.ts'
import { BIDSFile, FileTree } from '../types/filetree.ts'
import { DatasetIssues } from '../issues/datasetIssues.ts'
import { loadTSV } from '../files/tsv.ts'

function* quickWalk(dir: FileTree): Generator<BIDSFile> {
for (const file of dir.files) {
yield file
}
for (const subdir of dir.directories) {
yield* quickWalk(subdir)
}
}

const nullFile = {
stream: new ReadableStream(),
text: async () => '',
readBytes: async (size: number, offset?: number) => new Uint8Array(),

Check warning on line 18 in bids-validator/src/schema/walk.ts

View check run for this annotation

Codecov / codecov/patch

bids-validator/src/schema/walk.ts#L17-L18

Added lines #L17 - L18 were not covered by tests
}

function pseudoFile(dir: FileTree): BIDSFile {
return {
name: `${dir.name}/`,
path: `${dir.path}/`,
size: [...quickWalk(dir)].reduce((acc, file) => acc + file.size, 0),
ignored: dir.ignored,
parent: dir.parent as FileTree,
...nullFile,
}
}

/** Recursive algorithm for visiting each file in the dataset, creating a context */
export async function* _walkFileTree(
fileTree: FileTree,
Expand All @@ -12,7 +38,11 @@
yield new BIDSContext(file, dsContext)
}
for (const dir of fileTree.directories) {
yield* _walkFileTree(dir, dsContext)
if (dsContext.isPseudoFile(dir)) {
yield new BIDSContext(pseudoFile(dir), dsContext)
} else {
yield* _walkFileTree(dir, dsContext)
}
}
loadTSV.cache.delete(fileTree.path)
}
Expand Down
18 changes: 10 additions & 8 deletions bids-validator/src/tests/local/empty_files.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -31,22 +31,24 @@ Deno.test('empty_files dataset', async (t) => {
'EMPTY_FILES error is thrown for only sub-0001_task-AEF_run-01_meg.meg4',
() => {
const issue = result.issues.get('EMPTY_FILE')
assertEquals(issue, undefined, 'EMPTY_FILES was not thrown as expected')
/*
assert(
issue.files.get(
const file = assert(
issue?.files.get('/sub-0001/meg/sub-0001_task-AEF_run-01_meg.ds/'),
'sub-0001_task-AEF_run-01_meg.ds/ is empty but not present in EMPTY_FILE issue',
)
assertEquals(
issue?.files.get(
'/sub-0001/meg/sub-0001_task-AEF_run-01_meg.ds/sub-0001_task-AEF_run-01_meg.meg4',
),
'sub-0001_task-AEF_run-01_meg.meg4 is empty but not present in EMPTY_FILE issue',
undefined,
'Contents of pseudo files should not be included in EMPTY_FILES error',
)
assertEquals(
issue.files.get(
issue?.files.get(
'tests/data/empty_files/sub-0001/meg/sub-0001_task-AEF_run-01_meg.ds/BadChannels',
),
undefined,
'BadChannels should not be included in EMPTY_FILES error',
'Contents of pseudo files should not be included in EMPTY_FILES error',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true that directories and files that are children of psuedofile candidates should never show up in any error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the BadChannels and meg4 were only under discussion because the validator was given access to the directory contents, not the directory-as-file. Thus it makes sense to me to just back off from this micromanagement.

That said, I think it would be reasonable to add a reference to the original tree to the pseudofile, so that if we wanted to inspect contents to provide access to some metadata (e.g., OME-Zarr), it would be easy to do.

)
*/
},
)
})
4 changes: 3 additions & 1 deletion bids-validator/src/types/filetree.ts
Original file line number Diff line number Diff line change
Expand Up @@ -26,14 +26,16 @@ export class FileTree {
name: string
files: BIDSFile[]
directories: FileTree[]
ignored: boolean
parent?: FileTree

constructor(path: string, name: string, parent?: FileTree) {
constructor(path: string, name: string, parent?: FileTree, ignored?: boolean) {
this.path = path
this.files = []
this.directories = []
this.name = name
this.parent = parent
this.ignored = ignored || false
}

contains(parts: string[]): boolean {
Expand Down
10 changes: 9 additions & 1 deletion bids-validator/src/types/schema.ts
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,18 @@ export interface Entity {
format: string
}

export interface Value {
value: string
}

export interface SchemaObjects {
datatypes: Record<string, Value>
enums: Record<string, Value>
entities: Record<string, Entity>
extensions: Record<string, Value>
files: Record<string, unknown>
formats: Record<string, Format>
entities: Record<string, Entity>
suffixes: Record<string, Value>
}

export interface SchemaRules {
Expand Down