Recursive get does not put the files in the right subdirs #1741

Open
MitchellAcoustics opened this issue Oct 28, 2024 · 2 comments

@MitchellAcoustics

Thanks for the package, it's very helpful!

I'm having an issue using the get command to recursively copy a directory, its subdirs, and the files they contain from a GitHub repo to a local directory. Although fsspec finds all of the files with their correct paths and lists them with find, running get creates the relevant subdirs but then copies the files into the parent directory.

I'm trying to copy the contents and structure of this folder from GitHub to a local folder: https://github.com/MitchellAcoustics/JASAEL-HowToAnalyseQuantiativeSoundscapeData/tree/main/_freeze/paper

Running:

import fsspec

fs = fsspec.filesystem("github", org="MitchellAcoustics", repo="JASAEL-HowToAnalyseQuantiativeSoundscapeData", ref="main")
fs.find("_freeze/paper", withdirs=True)

finds everything just fine:

['_freeze/paper',
 '_freeze/paper/execute-results',
 '_freeze/paper/execute-results/html.json',
 '_freeze/paper/execute-results/tex.json',
 '_freeze/paper/figure-html',
 '_freeze/paper/figure-html/fig-circ-output-1.png',
 '_freeze/paper/figure-html/fig-circ-output-2.png',
 '_freeze/paper/figure-html/fig-circ-output-3.png',
 '_freeze/paper/figure-html/fig-circ-output-4.png',
 '_freeze/paper/figure-pdf',
 '_freeze/paper/figure-pdf/fig-circ-output-1.pdf',
 '_freeze/paper/figure-pdf/fig-circ-output-2.pdf',
 '_freeze/paper/figure-pdf/fig-circ-output-3.pdf',
 '_freeze/paper/figure-pdf/fig-circ-output-4.pdf']

But running get:

fs.get(fs.ls("_freeze/paper"), "~/Documents/Trials/embedded_paper/", recursive=True) # or with fs.get(fs.find(..., withdirs=True), ...)

results in this:

embedded_paper
├── execute-results
├── fig-circ-output-1.pdf
├── fig-circ-output-1.png
├── fig-circ-output-2.pdf
├── fig-circ-output-2.png
├── fig-circ-output-3.pdf
├── fig-circ-output-3.png
├── fig-circ-output-4.pdf
├── fig-circ-output-4.png
├── figure-html
├── figure-pdf
├── html.json
└── tex.json

Where the subdirs (execute-results, figure-html, figure-pdf) are created, but left empty, and what should have been put in them are just copied into the main directory.

Is this a bug, or should I be doing something else to make sure the files are placed in the correct subdirectories?

@martindurant
Member

This is functioning correctly, with behaviour copied from command-line cp. If you supply a list of concrete paths (files), then they will all appear in the target directory at the root level.
To copy the directory tree, supply the root path name with recursive, and fsspec will find all the paths for you:

fs.get("_freeze/paper", "~/Documents/Trials/embedded_paper/", recursive=True)
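
To make the difference concrete, here is a minimal sketch of the two path mappings described above. It is a simplified model for illustration only, not fsspec's actual implementation; the helper names plan_flat_copy and plan_tree_copy are hypothetical, and the model ignores details such as trailing slashes and whether the target already exists.

```python
from pathlib import PurePosixPath


def plan_flat_copy(paths, target):
    """Model a list of concrete paths: each file lands at the target root."""
    return {p: str(PurePosixPath(target) / PurePosixPath(p).name) for p in paths}


def plan_tree_copy(root, files, target):
    """Model a single root path with recursive=True: the relative layout is kept."""
    return {
        p: str(PurePosixPath(target) / PurePosixPath(p).relative_to(root))
        for p in files
    }


# Two of the remote files listed by fs.find in the original post.
files = [
    "_freeze/paper/execute-results/html.json",
    "_freeze/paper/figure-html/fig-circ-output-1.png",
]

flat = plan_flat_copy(files, "embedded_paper")
tree = plan_tree_copy("_freeze/paper", files, "embedded_paper")

for src in files:
    print(f"{src}\n  list of paths -> {flat[src]}\n  root path     -> {tree[src]}")
```

In the flat mapping only the file name survives, which matches the layout reported in the original post; in the tree mapping the path relative to the root is preserved under the target.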

> what should have been put in them are just copied into the main directory

This does sound odd.

(It's worth noting that git clone or the ZIP download may be faster for this particular operation.)

@MitchellAcoustics
Author

Ah, thank you! Both of your suggestions were very helpful. I think I was including the fs.ls(...) because that was included in the original suggestion I saw to use fsspec. The documentation for .get confused me a bit for how to include the path, but of course it's very simple!

But yes, as you said, directly using git was much faster.
