Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.1.9 seems to break the Shell app due to CSRF token issue #3894

Open
jjackzhn opened this issue Oct 22, 2024 · 35 comments
Open

3.1.9 seems to break the Shell app due to CSRF token issue #3894

jjackzhn opened this issue Oct 22, 2024 · 35 comments
Labels
bug Existing functionality not working as expected
Milestone

Comments

@jjackzhn
Copy link

jjackzhn commented Oct 22, 2024

After updating from 3.1.7 to 3.1.9, the shell app immediately crashes with the message:

Your connection to the remote server has been terminated.

Looking at the logs I found

App 4042244 output: Listening on 3000
App 4042244 output: Connection established
App 4042244 output: /var/www/ood/apps/sys/shell/app.js:161
App 4042244 output:   token = req.url.match(/csrf=([^&]*)/)[1];
App 4042244 output:                                        ^
App 4042244 output:
App 4042244 output: TypeError: Cannot read properties of null (reading '1')
App 4042244 output:     at WebSocketServer.connection (/var/www/ood/apps/sys/shell/app.js:161:40)
App 4042244 output:     at WebSocketServer.emit (node:events:517:28)
App 4042244 output:     at done (/var/www/ood/apps/sys/shell/app.js:233:9)
App 4042244 output:     at WebSocketServer.completeUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:435:5)
App 4042244 output:     at WebSocketServer.handleUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:343:10)
App 4042244 output:     at Server.upgrade (/var/www/ood/apps/sys/shell/app.js:232:7)
App 4042244 output:     at Server.emit (node:events:517:28)
App 4042244 output:     at onParserExecuteCommon (node:_http_server:939:14)
App 4042244 output:     at onParserExecute (node:_http_server:825:3) 

I made it print out req.url and it's the bare URL to the shell app without a CSRF token.

3.1.7 does not have this issue since downgrading fixes it.

@osc-bot osc-bot added this to the Backlog milestone Oct 22, 2024
@johrstrom
Copy link
Contributor

I can't replicate. Note that it's the javascript making this network call with a CSRF token, not the original URL you put in your browser.

Here's an image of my console tab for the same. You'll note that the first request is what you enter in the URL (and the response is an html page), then later the browser makes another request to basically the same URL only with wss protocol instead of http.

image

@johrstrom johrstrom reopened this Oct 22, 2024
@johrstrom
Copy link
Contributor

Didn't mean to close the ticket, only comment on it. Can you show a similar network tab?

I double checked the diff between the two versions this morning and can't find/see how the ping pong change changed/broke functionality to that point.

I'd also ask if you can try 3.1.9 with a private window or after clearing your cache to see if that's any different.

v3.1.7...v3.1.9

@johrstrom
Copy link
Contributor

Given the changelog, I'd pin this on updates we had to make to our lua code to support the latest version of httpd (or apache2 as the case may be). What apache2/httpd version are you using?

@jjackzhn
Copy link
Author

jjackzhn commented Oct 22, 2024

Here's a screenshot - I do see both the original request and the follow-up wss request.
ood-csrf

I'm seeing the same behavior with incognito after clearing cache and doing a hard reload.

We are using httpd 2.4.37 from RHEL 8. I'm viewing it with Chrome on Windows 10 22H2.

@jjackzhn
Copy link
Author

jjackzhn commented Oct 22, 2024

Quite a few reds and oranges in the console...
ood-console

Edit: The websocket error is the only one that doesn't show up in 3.1.7.
Detail:
ood-error

@johrstrom
Copy link
Contributor

Unfortunately I can't replicate on Rocky 8.8. Even 8.8 has httpd-2.4.37-65. I'll try to track down a system that has httpd-2.4.37-56, but I'm guessing the fixes we put in place for 2.4.62 somehow broke httpd-2.4.37-56 (assuming that's the version you're on).

I'll also have to look into the difference between those 56 and 65 releases.

@jjackzhn
Copy link
Author

Yes, I have 2.4.37-56. I'll also try to find something different to test on.

@johrstrom
Copy link
Contributor

Is it possible to update httpd? I'm unable to get an OS with that version. Everything on appstream has been updated.

I assume you're on some EL/8.8 variant?

@treydock
Copy link
Contributor

@johrstrom You can enable the Vault for Rocky to get older packages, ie: https://dl.rockylinux.org/vault/rocky/8.8/AppStream/x86_64/os/Packages/h/. Has https://dl.rockylinux.org/vault/rocky/8.8/AppStream/x86_64/os/Packages/h/httpd-2.4.37-56.module%2Bel8.8.0%2B1456%2Bd0a01c5e.7.x86_64.rpm.

You can inject this URL into something like /etc/yum.repos.d/Rocky-AppStream.repo and comment out mirrorlist and set the Vault URL at https://dl.rockylinux.org/vault/rocky/8.8/AppStream/x86_64/os/ to the value for baseurl.

@johrstrom
Copy link
Contributor

@johrstrom You can enable the Vault for Rocky to get older packages

Thanks, I'll take a look at it.

@johrstrom
Copy link
Contributor

OK I can replicate if I downgrade back to 2.4.37-56. It is infact coming from the lua updates we made to support 2.4.62...

@johrstrom
Copy link
Contributor

But that brings another question - If I have to install the version from the vault should we even support it? Is there a way to safely require httpd >= 2.4.37-56? The patch numbers could vary from distro to distro in theory, but do they in practice?

@treydock
Copy link
Contributor

Latest versions for EL8:

Rocky 8 - 2.4.37-65
Alma 8 - 2.4.37-65
RHEL 8 - 2.4.37-65

The checksum and other parts of the appstream release do differ but I think for RHEL8 (and 9 most likely) we could require a minimum release based on what works with OnDemand.

Where this could run into issues is folks on RHEL 8 can pin to like RHEL 8.8 and if we were to require something from RHEL 8.10, that would break for them but also if that requirement is based on what works for OnDemand, they have no choice but to upgrade.

If 2.4.37-56 is the issue I think we'd have to say:

Requires: > 2.4.37-56 < 2.5

https://github.com/OSC/ondemand-packaging/blob/f777543e6818c47063ab1c1f4287493b32b634cb/packages/ondemand-runtime/rpm/ondemand-runtime.spec#L108

@johrstrom
Copy link
Contributor

Where this could run into issues is folks on RHEL 8 can pin to like RHEL 8.8 and if we were to require something from RHEL 8.10

Surprisingly enough, dnf update updates from 8.8 to 8.10. But getting a 8.8 container this morning without running dnf update installed httpd-2.4.37-65 from appstream even though it was an 8.10 module.

How can I test pinning the OS to 8.8?

image

@johrstrom
Copy link
Contributor

I found these 2 advisories, so I'm a bit convinced we should just require that -65 patch version. I can't imagine someone saying yes I need pin httpd to this version with known security advisories against it - and if so, then they can just stay on 3.1.7?

https://access.redhat.com/errata/RHSA-2024:5193
https://access.redhat.com/errata/RHSA-2024:4197

@jjackzhn
Copy link
Author

Sounds like a plan. I'll update to RHEL 8.10.

@johrstrom
Copy link
Contributor

Sounds like a plan. I'll update to RHEL 8.10.

You don't need to fully upgrade to 8.10. Just an update to httpd should suffice.

@jjackzhn
Copy link
Author

We have a mixture of RHEL 8.8 EUS, RHEL 8.10 and RHEL/Rocky 9.4; this would be a good reason to move things off 8.8 EUS anyway :)

@treydock
Copy link
Contributor

How can I test pinning the OS to 8.8?

This is done with subscription manager on RHEL systems as RHEL provides minor release repos. The only way to stay on a minor release on Rocky that is not the latest minor release would be to start with 8.8 container then replace all repos with 8.8 Vault URLs.

On RHEL you'd do

sudo subscription-manager release --set=8.8

Which would update the repo URLs in /etc/yum.repos.d/redhat.repo. I don't think this really works with Rocky/Alma since subscription-manager is designed to talk to RedHat systems about the system's valid subscription with RedHat.

@johrstrom
Copy link
Contributor

I see. Yea then a locked RHEL 8.8 system won't support 3.1.9. Hrrrrm. OK that may just be the best we can do. I can't figure out how to get the apache version in lua without shelling out a rpm -q httpd query and parsing it or similar which seems very complicated and moreover would slow down basically every request. So much so that it seems untenable.

@treydock
Copy link
Contributor

get the apache version in lua without shelling out a rpm -q httpd query and parsing it

Ya that sounds not great. The release version only exists inside the RPM metadata, I don't think Apache itself knows its 2.4.x-y it only knows it's 2.4.x.

@mrobbert
Copy link

I'm not sure if it matters, but I wanted to note that I hit this bug on a Rocky 9.4 host with httpd-2.4.57-8.el9.x86_64. We have a development host also running Rocky 9.4 that updated OnDemand at the same time, but it has httpd-2.4.57-11.el9_4.1.x86_64 and did not exhibit the problem.
We fixed it by downgrading the OnDemand package since that upgrade was inadvertent, but just want to make sure you knew that this problem existed on Rocky/RHEL 9 as well as 8.

@johrstrom
Copy link
Contributor

Rocky 9.4 host with httpd-2.4.57-8.el9.x86_64

It certainly does matter because that means specifying Requires: > 2.4.37-56 < 2.5 isn't a silver bullet I thought it may be.

@treydock
Copy link
Contributor

It certainly does matter because that means specifying Requires: > 2.4.37-56 < 2.5 isn't a silver bullet I thought it may be.

We can have RHEL8 and RHEL9 specific requirements. We already do that for things like NodeJS because of different versions available:

https://github.com/OSC/ondemand-packaging/blob/8abf784f09b65b1392651fd0694155d62b8d0e29/packages/ondemand-runtime/rpm/ondemand-runtime.spec#L88-L96

My guess is whatever patch RedHat backported, it was applied to 2.4.37 for RHEL8 and 2.4.57 for RHEL9 as RHEL tends to pin their software versions though less so with AppStream

@treydock
Copy link
Contributor

So we could do

%if 0%{?rhel} == 8
Requires: %{apache} > 2.4.37-56, %{apache} < 2.5
...
%endif
%if 0%{?rhel} == 9
Requires: %{apache} > 2.4.57-8, %{apache} < 2.5
...
%endif

@johrstrom johrstrom added the bug Existing functionality not working as expected label Oct 28, 2024
@Xaraxia
Copy link

Xaraxia commented Nov 7, 2024

I am having this issue, with the following installation:

httpd-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64

$ cat /etc/redhat-release
Rocky Linux release 8.10 (Green Obsidian)

App 2203442 output: /var/www/ood/apps/sys/shell/app.js:161
App 2203442 output: token = req.url.match(/csrf=([^&]*)/)[1];
App 2203442 output: ^
App 2203442 output:
App 2203442 output: TypeError: Cannot read properties of null (reading '1')
App 2203442 output: at WebSocketServer.connection (/var/www/ood/apps/sys/shell/app.js:161:40)
App 2203442 output: at WebSocketServer.emit (node:events:517:28)
App 2203442 output: at done (/var/www/ood/apps/sys/shell/app.js:233:9)
App 2203442 output: at WebSocketServer.completeUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:435:5)
App 2203442 output: at WebSocketServer.handleUpgrade (/var/www/ood/apps/sys/shell/node_modules/ws/lib/websocket-server.js:343:10)
App 2203442 output: at Server.upgrade (/var/www/ood/apps/sys/shell/app.js:232:7)
App 2203442 output: at Server.emit (node:events:517:28)
App 2203442 output: at onParserExecuteCommon (node:_http_server:939:14)
App 2203442 output: at onParserExecute (node:_http_server:825:3)
App 2203442 output:
App 2203442 output: Node.js v18.20.2

@Xaraxia
Copy link

Xaraxia commented Nov 7, 2024

From the browser it looks fine:

token = window.urlWithToken().match(/csrf=([^&]*)/)[1];
'AVPiWh3i-OEzDDpVlDgafakedatafakedata'
window.urlWithToken()
'wss://bunya-ondemand.rcc.uq.edu.au/pun/sys/shell/ssh/bunya.rcc.uq.edu.au?csrf=AVPiWh3i-OEzDDpVlDgafakedatafakedata'

So, potentially, something is preventing the token from being passed back to nodejs or otherwise mangling that URL in such a way that it is not being detected correctly.

@johrstrom
Copy link
Contributor

So, potentially, something is preventing the token from being passed back to nodejs or otherwise mangling that URL in such a way that it is not being detected correctly.

The update to apache should have resolved this... Did you bounce apache after updating to ensure that the new version is running? That's the only thing I can think of. I know for sure this issue is caused by the 3.1.9 update using a version of apache that's lower than httpd-2.4.37-65 (what you've reported to have).

@Xaraxia
Copy link

Xaraxia commented Nov 12, 2024

Yes, and I just did a systemctl restart again to be on the safe side.

11:40 sbw@bunya-ondemand:~$ rpm -qa | grep httpd
httpd-filesystem-2.4.37-65.module+el8.10.0+1830+22f0c9e0.noarch
httpd-devel-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64
httpd-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64
httpd-tools-2.4.37-65.module+el8.10.0+1830+22f0c9e0.x86_64
rocky-logos-httpd-86.3-1.el8.noarch

11:41 sbw@bunya-ondemand:~$ sudo /usr/sbin/httpd -v
Server version: Apache/2.4.37 (Rocky Linux)
Server built: Jul 1 2024 14:01:13

@johrstrom
Copy link
Contributor

I'm so sorry, but I can't replicate our systems or in a container with Rocky 8. httpd-2.4.37-65 (at least in a container when I test) should work well for you.

The updates we made (that apparently broke older httpds) were to support higher versions of httpd. So we can't really go back on them/revert them. I'm not really sure what to do because it really should be working for you, I'm just not sure why. You haven't edited those lua files have you? Just trying to cover what else could be the issue.

@Xaraxia
Copy link

Xaraxia commented Nov 19, 2024 via email

@johrstrom
Copy link
Contributor

No issues in delay. I'm not at SC personally, but many people from OSC are there at the OSC booth.

@Xaraxia
Copy link

Xaraxia commented Dec 1, 2024

Is it possible that there is an issue with the gems? We've got multiple packages installed for some reason, instead of just the latest. Should I remove them?

09:18 sbw@bunya-ondemand:/etc/yum.repos.d$ rpm -qa | grep ondemand
ondemand-3.1.9-1.el8.x86_64
ondemand-passenger-6.0.20-1.ood3.1.5.el8.x86_64
ondemand-release-web-3.0-1.noarch
ondemand-runtime-3.1.5-1.el8.x86_64
ondemand-ruby-3.1.5-1.el8.x86_64
ondemand-nginx-1.24.0-1.p6.0.20.ood3.1.5.el8.x86_64
ondemand-nodejs-3.1.5-1.el8.x86_64
ondemand-apache-3.1.5-1.el8.x86_64
ondemand-gems-3.1.9-1-3.1.9-1.el8.x86_64
ondemand-gems-3.0.3-1-3.0.3-1.el8.x86_64
ondemand-selinux-3.1.9-1.el8.x86_64
ondemand-gems-3.1.1-1-3.1.1-1.el8.x86_64
ondemand-gems-3.0.1-1-3.0.1-1.el8.x86_64

@treydock
Copy link
Contributor

treydock commented Dec 2, 2024

Those duplicate ondemand-gems RPMs are expected. It's something we do with packaging to ensure that upgrade from say 3.1.0 to 3.1.9 won't remove gems you may depend on for local apps. You're welcome to remove the RPMs that don't match the version of OnDemand you have installed so long as you don't have apps referencing any of those gems that may not be present in the newer versions.

@johrstrom
Copy link
Contributor

Yea I don't think the gems are the issue here. What's the checksum of the httpd binary? Something like md5sum $(which httpd). There has to be something obvious here we're missing because obviously the httpd I can get in a Rocky 8.4 container is somehow different than your Rocky 8.4 machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Existing functionality not working as expected
Projects
None yet
Development

No branches or pull requests

6 participants