app-editors/vscode-9999: fails to build; yarn not supported #377

Open
PF4Public opened this issue Sep 27, 2024 · 35 comments · May be fixed by #384
Labels
bug Something isn't working

Comments

@PF4Public
Owner

PF4Public commented Sep 27, 2024

They've returned to maintaining their dependencies via npm instead of yarn, which is a bad thing since IIRC npm does not provide [useful] offline functionality.

@PF4Public PF4Public added the bug Something isn't working label Sep 27, 2024
@JohnFlowerful

You could pack and dist the npm-cache directory like the www-apps/vaultwarden-web GURU package does.

There's also NixOS's method to fetch the individual nodejs dep tarballs and generate an npm-cache directory.
I half-heartedly wrote a replacement script and eclass to do just this for the npm based packages I use. YMMV of course :)

@PF4Public
Owner Author

You could pack and dist the npm-cache directory like the www-apps/vaultwarden-web GURU package does.

Which would imply I have to store this tarball somewhere. This is also non-reusable between versions :(

There's also NixOS's method to fetch the individual nodejs dep tarballs and generate an npm-cache directory.

This one looks promising, thanks for the hint. I'll try investigating their approach at some point.
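
A minimal sketch of the first step of that approach, collecting the tarball URLs to fetch, assuming the one-key-per-line layout that npm writes to package-lock.json. The helper name `lock_uris` is hypothetical, and git-based or otherwise unusual `resolved` entries are not treated specially here.

```shell
# Hypothetical helper: print every distinct "resolved" URL found in a lockfile.
# Assumes each "resolved" key sits on its own line, as npm writes it.
lock_uris() {
    sed -n 's/^[[:space:]]*"resolved":[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" |
        LC_ALL=C sort -u
}
```

Calling `lock_uris package-lock.json` would then give a list that could be turned into SRC_URI entries; the same package resolved in several places is listed only once.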

@Kawanaao
Contributor

They've returned to maintaining their dependencies via npm instead of yarn, which is a bad thing since IIRC npm does not provide [useful] offline functionality.

There is another option that offers full compatibility with the Yarn cache:
https://www.npmjs.com/package/offline-mirror-registry

@PF4Public
Owner Author

https://www.npmjs.com/package/offline-mirror-registry

Does portage allow starting network services during the build?

@Kawanaao
Contributor

Does portage allow starting network services during the build?

Yes, as far as I know, Portage implements its sandbox through LD_PRELOAD, so it should not block daemons started from the script. I tested it with build-online and everything works. I have modified the code of this library, and all that remains is to add all the necessary libraries to SRC_URI.

@PF4Public
Owner Author

With build-online you could also run normal npm without doing anything extra.

@Kawanaao
Contributor

With build-online you could also run normal npm without doing anything extra.

Oops, got confused, with -build-online

@Kawanaao
Contributor

Great! VSCode is now compiling, and all the necessary packages have been added. How should I distribute it now? Should I create a repository and publish it on npm, or is there a more convenient option? I was considering using patches, but I would need to update the libraries there too.

@PF4Public
Owner Author

How should I distribute it now?

Distribute what exactly?

@Kawanaao
Contributor

Distribute what exactly?

Modification of offline-mirror-registry, maybe you can create a repository for it, for npm cases?

@Kawanaao
Contributor

Online build is functional, but in offline mode, nvmrc tries to load https://electronjs.org/headers/v30.5.1/node-v30.5.1-headers.tar.gz


@PF4Public
Owner Author

Modification of offline-mirror-registry, maybe you can create a repository for it

Are your modifications too Gentoo-specific and make no sense upstream? One of the options could be to create an ebuild where you apply your changes and install it system-wide … maybe … don't know if it is reasonable and/or doable at all.

nvmrc tries to load

But why?

@JohnFlowerful

Couldn't you download and npm install all the nodejs dep tarballs without offline-mirror-registry? It does mean manually installing the dependencies (and maintaining the required deps somehow) but it should work.

@Kawanaao
Contributor

Kawanaao commented Oct 18, 2024

Are your modifications too Gentoo-specific and make no sense upstream? One of the options could be to create an ebuild where you apply your changes and install it system-wide … maybe … don't know if it is reasonable and/or doable at all.

No, these modifications are not specific to Gentoo. The upstream implementation of semver is quite poor, and the glob pattern does not function as expected. Additionally, there are issues with nested paths, such as @types/node. Overall, the upstream project seems to be abandoned.

Couldn't you download and npm install all the nodejs dep tarballs without offline-mirror-registry? It does mean manually installing the dependencies (and maintaining the required deps somehow) but it should work.

It’s quite possible, but it would be easier to create a fork under PF4Public that compiles it into a single JS file with the modules embedded, eliminating the need to load node modules at all. For now, as a quick solution, I simply placed it in files and compiled it into a single JS file, which works great. Maybe it would be better to move it to the repository? Maybe do it with patches, npm i, and manual unpacking of the tarball? Maybe move it into the ebuild altogether? Or maybe leave it as a binary JS file? I don’t really know.

@Kawanaao Kawanaao linked a pull request Oct 18, 2024 that will close this issue
4 tasks
@PF4Public
Owner Author

Overall, the upstream project seems to be abandoned.

Why haven't you considered other options mentioned here?

Another option might be to ship a modified version of offline-mirror-registry with electron, similar to dev-dependencies. I wonder if npm would use it from there though.

@JohnFlowerful

Couldn't you download and npm install all the nodejs dep tarballs without offline-mirror-registry? It does mean manually installing the dependencies (and maintaining the required deps somehow) but it should work.

It’s quite possible, but it would be easier to create a fork under PF4Public that compiles it into a single JS file with the modules embedded, eliminating the need to load node modules at all. For now, as a quick solution, I simply placed it in files and compiled it into a single JS file, which works great. Maybe it would be better to move it to the repository? Maybe do it with patches, npm i, and manual unpacking of the tarball? Maybe move it into the ebuild altogether? Or maybe leave it as a binary JS file? I don’t really know.

I meant to npm install the modules required for vscode, not offline-mirror-registry. This is essentially doing what npm clean-install does but 'manually' (read: programmatically) and without package-lock.json. CPU- and OS-specific packages will be a hassle to maintain, I'd guess? I'm not sure how many packages vscode has with these variants, but a simple maintenance utility should solve this.

I've seen your pr and actually kinda prefer the npm clean-install route myself. Having offline-mirror-registry dangling around in files isn't really ideal though ☹️. I'd prefer making it a system package.

I should note that I haven't looked over the vscode ebuild entirely, only enough to see a lot of registry uris... So again: YMMV :)

@PF4Public
Owner Author

Having offline-mirror-registry dangling around in files isn't really ideal though ☹️. I'd prefer making it a system package.

Agreed, to me it also looks a bit complicated and fragile.

only enough to see a lot of registry uris

Code collapse to the rescue :D

@Kawanaao
Contributor

Kawanaao commented Oct 21, 2024

Agreed, to me it also looks a bit complicated and fragile.

The mirror looks like a more promising option. I've already started rewriting it and integrating functionality that will show the missing packages in package.json. I think this will significantly simplify the process, as I spent about 8 hours searching for missing dependencies in the NPM logs.

As for installing from tar files, it's quite complicated. Sometimes, installing from tar requires certain dependencies, and among 2000 dependencies, there may be post-installation scripts that won't execute without the necessary libraries. This can lead to errors or incorrect execution due to conditional statements in the scripts.

Additionally, many libraries require multiple versions of the same dependency, which should be isolated in folders for sub-dependencies. In the end, how do you keep track of all this? In VSCode, there's support for subprojects, and it might be necessary to write a parser for package-lock.json. Overall, installing from tar files can become a real nightmare. "Why do the work of a package manager?" (c)

@Kawanaao
Contributor

I've seen your pr and actually kinda prefer the npm clean-install route myself. Having offline-mirror-registry dangling around in files isn't really ideal though ☹️. I'd prefer making it a system package.

I would like to create something like a system package, but we are using esbuild as well, loading it as a tar. So right now, I'm just transferring fixes to the NPM registry. The presence of the module in files doesn't make me happy either :0

@PF4Public
Owner Author

In the end, how do you keep track of all this?

A Perl script ;)

but we are using esbuild as well

IIRC element-desktop was rewritten to pack everything manually :) i.e. without esbuild :)

@Kawanaao
Contributor

A Perl script ;)

So I spent several hours picking dependencies for nothing 🔢 Then do you think it's worth trying to install them locally using a Perl script and npm install tarball?

IIRC element-desktop was rewritten to pack everything manually :) i.e. without esbuild :)

As far as I can see in their repository, they have completely removed esbuild. However, in other overlays, I’ve encountered esbuild as a Gentoo package ¯_(ツ)_/¯

@JohnFlowerful

The mirror looks like a more promising option. I've already started rewriting it and integrating functionality that will show the missing packages in package.json. I think this will significantly simplify the process, as I spent about 8 hours searching for missing dependencies in the NPM logs.

As for installing from tar files, it's quite complicated. Sometimes, installing from tar requires certain dependencies, and among 2000 dependencies, there may be post-installation scripts that won't execute without the necessary libraries. This can lead to errors or incorrect execution due to conditional statements in the scripts.

Additionally, many libraries require multiple versions of the same dependency, which should be isolated in folders for sub-dependencies. In the end, how do you keep track of all this? In VSCode, there's support for subprojects, and it might be necessary to write a parser for package-lock.json. Overall, installing from tar files can become a real nightmare. "Why do the work of a package manager?" (c)

I had a quick fiddle with my theory and most of what you said is right :)

In the end I was trying

	export npm_config_nodedir="/usr/include/electron-${ELECTRON_SLOT}/node"

	if ! use build-online; then
		npm config set cache npm-cache || die
		npm config set offline true || die

		# split VS_NPM_URIS on newlines only, one SRC_URI entry per line
		_IFS=${IFS}
		IFS=$'\n'
		for uri in $VS_NPM_URIS; do
			tarball=$(basename "$uri")
			# handle cases where the scope has meant renaming the file
			tarball=${tarball#* -\> }
			echo "caching ${tarball}" # REMOVE ME
			npm cache add "${DISTDIR}/${tarball}" || die
		done
		IFS=${_IFS}
	fi

	# install scripts are skipped here and run afterwards by npm rebuild
	npm clean-install --ignore-scripts \
		--arch="${VSCODE_ARCH}" --no-progress || die
	# --ignore-optional
	# --ignore-engines
	# --production=true
	# --no-progress
	# --skip-integrity-check
	# --verbose

	npm rebuild || die

But I grew tired of waiting... npm cache add is too slow.
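
The file-name handling in that loop can be isolated into a small function. This is only a sketch: the name `distfile_name` is hypothetical, and it assumes each entry is either a bare URL or a URL followed by ` -> ` and the renamed file, as in the snippet above.

```shell
# Hypothetical helper: given one SRC_URI entry, print the file name expected
# in DISTDIR: the name after "->" if present, otherwise the URL basename.
distfile_name() {
    case "$1" in
        *" -> "*) printf '%s\n' "${1##* -> }" ;;
        *)        printf '%s\n' "${1##*/}" ;;
    esac
}
```

Using parameter expansion instead of `basename` avoids spawning an extra process per entry, which adds up with a dependency list of this size.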

@PF4Public
Owner Author

A Perl script ;)

So I spent several hours picking dependencies for nothing 🔢 Then do you think it's worth trying to install them locally using a Perl script and npm install tarball?

Not sure I understand you correctly. I meant that the list of dependencies you see in the ebuild is essentially a concatenation of all yarn.lock's via a Perl script (which is not committed into overlay obviously).

As far as I can see in their repository, they have completely removed esbuild.

Oh that's new, I may be missing something there. I'll have to update the ebuild.

However, in other overlays, I’ve encountered esbuild as a Gentoo package ¯_(ツ)_/¯

Right, but it is a Go package. I doubt I want to pull in Go while building electron-apps :(

@Kawanaao
Contributor

Not sure I understand you correctly. I meant that the list of dependencies you see in the ebuild is essentially a concatenation of all yarn.lock's via a Perl script (which is not committed into overlay obviously).

Ah, I thought you were talking about full control via perl script huh

Right, but it is a Go package. I doubt I want to pull in Go while building electron-apps :(

It would be fun to use both Rust and Golang and Zig and Java as mandatory for building electron apps :)

@Kawanaao
Contributor

Kawanaao commented Oct 23, 2024

But I grew tired of waiting... npm cache add is too slow.

I was just experimenting with this today. The final version is now in the ebuild. Although the npm cache add command works with multiple arguments, I had to split the packages into several chunks. NPM, by its nature, works with tar in a semi-multithreaded manner, using 2-3 threads per process.

if ! use build-online; then
	ebegin "Hydrating npm cache"
	# use half the cores, but never fewer than one job
	local JOBS=$(( $(nproc) / 2 ))
	(( JOBS >= 1 )) || JOBS=1
	local TAR_FILES=$(ls "${DISTDIR}"/*.tgz 2>/dev/null)
	local TAR_COUNT=$(echo "$TAR_FILES" | wc -l)
	# ceiling division, so that at most JOBS chunks are created
	local CHUNK_SIZE=$(( (TAR_COUNT + JOBS - 1) / JOBS ))
	echo "$TAR_FILES" | xargs -n "$CHUNK_SIZE" -P "$JOBS" npm cache add --no-progress
	eend $? || die
fi
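
The chunking arithmetic can be checked on its own. A minimal sketch, with a hypothetical helper name `chunk_size` and with clamping added as an assumption, so that neither a zero job count (for example `nproc` returning 1) nor an empty file list produces an unusable value for `xargs -n`:

```shell
# Hypothetical helper: files per chunk for a given file count and job count.
# Ceiling division; both the job count and the result are clamped to >= 1.
chunk_size() {
    total=$1
    njobs=$2
    [ "$njobs" -ge 1 ] || njobs=1
    size=$(( (total + njobs - 1) / njobs ))
    [ "$size" -ge 1 ] || size=1
    echo "$size"
}
```

With 2000 tarballs and 4 jobs this gives 500 files per `npm cache add` invocation; with 10 tarballs and 3 jobs it gives 4, so the last chunk is simply shorter.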

@PF4Public
Owner Author

the npm cache add command

is it slow as @JohnFlowerful mentioned?

It would be fun to use both Rust and Golang and Zig and Java as mandatory for building electron apps :)

As if needing the whole browser with js engine with additional js engine wasn't funny enough.

@Kawanaao
Contributor

is it slow as @JohnFlowerful mentioned?

In @JohnFlowerful's case, a new npm command is invoked for each .tgz file, which involves starting both the JIT runtime and the bash interpreter and loading various resources. Fortunately, npm cache add can handle multiple .tgz files at once, and it appears that parallel execution is unnecessary, since npm is limited by filesystem performance. Currently, two threads complete this operation in about 23 seconds, using 4.5 GB of memory at an I/O speed of 300 MB/s (using portage zram with zstd).

Also would be great to regenerate the dependencies with your script, as there are many outdated and unnecessary ones

As if needing the whole browser with js engine with additional js engine wasn't funny enough.

Fair enough 🥲
Btw, recently looked at a project similar to Electron from Firefox. It's a pity that the project failed, it made me think that gecko could have been better as an embedded engine, F, positron

@JohnFlowerful

In @JohnFlowerful's case, a new npm command is invoked for each .tgz file, which involves starting both the JIT runtime and the bash interpreter and loading various resources. Fortunately, npm cache add can handle multiple .tgz files at once, and it appears that parallel execution is unnecessary, since npm is limited by filesystem performance. Currently, two threads complete this operation in about 23 seconds, using 4.5 GB of memory at an I/O speed of 300 MB/s (using portage zram with zstd).

This is right. I would have gotten there if time was on my side, I swear :(

Also would be great to regenerate the dependencies with your script, as there are many outdated and unnecessary ones

Which might also prove that relying on the files having a .tgz extension is problematic.
I have no idea about how vscode releases are finalised. Maybe they must use packages from the registry? No idea. I guess @PF4Public would have more knowledge :)

@Kawanaao
Contributor

This is right. I would have gotten there if time was on my side, I swear :(

Well, 2000 packages to install is really a lot, especially when you look at other editors that don't even take up 1/40 of VSCode :(

Which might also prove that relying on the files having a .tgz extension is problematic. I have no idea about how vscode releases are finalised. Maybe they must use packages from the registry? No idea. I guess @PF4Public would have more knowledge :)

In the case of Gentoo, it is common practice to align package installation processes with the ebuild pipeline. For dependencies, there is an offline mode because the same package may be used by other programs that rely on npm. Moreover, emerge is very similar to CI: having to download dependencies in online mode with every update might not be the best idea. Instead, you essentially only need to update the list of dependencies. Difficult, but we know what we're fighting for!! ;)

@JohnFlowerful

JohnFlowerful commented Oct 23, 2024

I meant the npm registry. Packages can be git based as well. As I said however: I don't know if vscode has had or will ever have git based dependencies. I do know that it doesn't have any in 1.94.2 since I ran my own script to get the uris for testing.
It's more of a problem for @PF4Public to (potentially) address with the url grabbing script.

@PF4Public
Owner Author

if vscode has had

Hold my beer! microsoft/vscode#149291 :D

@Kawanaao
Contributor

I meant the npm registry. Packages can be git based as well. As I said however: I don't know if vscode has had or will ever have git based dependencies. I do know that it doesn't have any in 1.94.2 since I ran my own script to get the uris for testing. It's more of a problem for @PF4Public to (potentially) address with the url grabbing script.

Ah, I see, VSCode has Git dependencies, and they are described just like regular packages and can be loaded in the same way (although I’m not sure if "1ca1b5cc18" is the correct full package name, but in any case, it seems there is only one such Git package). This works with Yarn, but npm cannot fetch them from the cache because Git packages are dynamic: the package-lock.json entry for such a package does not contain checksums, licenses, or anything else, not even versions.

@Kawanaao
Contributor

I mean that yes, you can use npm cache add https://codeload...../branch-or-commit/, but downloading a tar.gz file won’t allow you to use it without modifying package.json. npm has a slightly different mechanism for handling Git links: it simply intercepts them.

@JohnFlowerful

Ah, I see, VSCode has Git dependencies, and they are described just like regular packages and can be loaded in the same way (although I’m not sure if "1ca1b5cc18" is the correct full package name, but in any case, it seems there is only one such Git package). This works with Yarn, but npm cannot fetch them from the cache because Git packages are dynamic: the package-lock.json entry for such a package does not contain checksums, licenses, or anything else, not even versions.

Git based packages are denoted with (git|git+ssh|git+https|ssh) at the beginning of the resolved url. They shouldn't have an integrity checksum. If they do (it is known to happen in other projects) you'd have to patch the integrity checksum out of package-lock.json to be able to use the resulting tarball offline.

Again, the only reason I'm even mentioning this is because of

if ! use build-online; then
	ebegin "Hydrating npm cache"
	local TAR_FILES=$(ls "${DISTDIR}"/*.tgz 2>/dev/null)
	npm cache add "${NPM_DEFAULT_FLAGS}" $TAR_FILES || die
	eend $? || die
fi
where you're expecting all npm packages to be *.tgz, which may not be true if a package came from, say, github's codeload. In this case it will be .tar.gz unless the url grabbing script is adjusted accordingly.
Also again: I don't think this is something for you to worry about, and more an issue for @PF4Public. But it seems like this kind of fuckery might already be known and possibly already accounted for.
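
A rough sketch of how a URL-grabbing script might tell these cases apart, based only on the prefixes and extensions named in this comment. The helper name `resolved_kind` and the three category labels are hypothetical.

```shell
# Hypothetical helper: classify a "resolved" value from package-lock.json.
resolved_kind() {
    case "$1" in
        git:*|git+ssh:*|git+https:*|ssh:*) echo git ;;
        *.tgz|*.tar.gz)                    echo tarball ;;
        *)                                 echo other ;;
    esac
}
```

Anything reported as `git` or `other` would need individual handling (renaming in SRC_URI, or patching package-lock.json) rather than being picked up by a `*.tgz` glob.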

@Kawanaao
Contributor

Git based packages are denoted with (git|git+ssh|git+https|ssh) at the beginning of the resolved url. They shouldn't have an integrity checksum. If they do (it is known to happen in other projects) you'd have to patch the integrity checksum out of package-lock.json to be able to use the resulting tarball offline.

The issue is that Git commits do not have checksums. npm uses the commit ID to fill in the cache data. However, when we add a .tar.gz to the cache, we use its checksum instead. This means there are fundamentally two different ways of using the cache: one for Git and another for .tar.gz. You cannot replace Git with a .tar.gz because npm treats them as separate entities.

./npm-cache/_cacache/index-v5/71/b7/a839243f980559669cc39952b9703527a14d437bd0a71279915ca880ae5d
./npm-cache/_cacache/index-v5/ec/b3/383111a66cd624c7cdff893e75cb632c086e7f8963ec3a3367b9d2145523

./npm-cache/_cacache/content-v2/sha512/a9/ce:
bd592e5a0858e4c26581ad34c0500b9f7184410885471c6b527988d36f427b49ddb63e208c07f6c58ac6b37edb467d21a1d25b13774485b501e7a88609cb

Both index entries refer to the same content: 71/b7 was created using npm cache add git, while ec/b3 was created using npm cache add tar. However, in the index, these are treated as completely separate entities. We cannot use npm cache add git in offline mode, as npm will attempt to access 71/b7 but will not interact with the .tar.gz in ec/b3. Therefore, it is much simpler to use npm install tar.gz, which modifies both package.json and package-lock.json, or to ADD the SHA-512 checksum that directly references content-v2/sha512/a9/ce/....
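
For illustration, the content path in the listing above appears to be derived from the hex SHA-512 digest by splitting off the first two byte pairs as directories. A minimal sketch under that assumption; the helper name `cacache_content_path` is hypothetical, and the layout is inferred from the listing rather than from npm's documentation.

```shell
# Hypothetical helper: map a hex SHA-512 digest to its assumed location
# below the npm cache directory (layout inferred from the listing above).
cacache_content_path() {
    hex=$1
    first=$(printf '%s' "$hex" | cut -c1-2)
    second=$(printf '%s' "$hex" | cut -c3-4)
    rest=$(printf '%s' "$hex" | cut -c5-)
    printf '_cacache/content-v2/sha512/%s/%s/%s\n' "$first" "$second" "$rest"
}
```

Feeding it the first field of `sha512sum` output for a tarball should show where that tarball's content would be expected once it has been added to the cache.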

where you're expecting all npm packages to be *.tgz, which may not be true if a package came from, say, github's codeload. In this case it will be .tar.gz unless the url grabbing script is adjusted accordingly.

There is no point in NOT adjusting them. Because of the URI structure tar.gz/commit, SRC_URI will only save the commit name without even adding .tar.gz, so what's the difference? You still need to use the -> notation.
