Example file in GUI format doesn't work #246

Open
maxkfranz opened this issue Feb 22, 2024 · 3 comments
Labels: bug (Something isn't working)

@maxkfranz
Member

Describe the bug

The GSE129943_rsem_counts_2016 example file fails to upload when it is formatted the way the UI describes: a single Gene column of gene names, with no Ensembl IDs.

To Reproduce

  1. Download the attached file. It has a single Gene column with the gene names, as the UI describes, and no Ensembl IDs.
  2. Upload it in the UI.
  3. The UI reports an error.

GSE129943_rsem_counts_2016.csv

The tab-delimited version of the file fails as well:
GSE129943_rsem_counts_2016.txt

Expected behavior

The file should be accepted. Not every user will have Ensembl IDs.

Screenshots

Screenshot 2024-02-22 at 08 26 57

Desktop:

  • OS: macOS 14.3.1
  • Browser: Safari
  • Version: 17.3.1

It doesn't work in Chrome either.

maxkfranz added the bug label Feb 22, 2024
@mikekucera
Contributor

mikekucera commented Feb 23, 2024

Several rows in the file have no gene name, and several others have the gene name "NA".

This causes an error from FGSEA.

<simpleError in data.frame(logFC = logFC, logCPM = glmfit$AveLogCPM, LR = LR, PValue = LRT.pvalue, row.names = rn): row names contain missing values>

Screenshot 2024-02-23 at 9 24 10 AM

@mikekucera
Contributor

We can add validation logic on the client to check for cases like this and provide better error messages. But in this case, if the user had opened the file in any editor, they would have been able to see the obvious formatting errors. I don't think automatically filtering out the bad rows is a good idea: the user should know that there are problems with their data.
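
A minimal sketch of what such a client-side check might look like, in TypeScript. Everything here is an assumption for illustration: the names `findGeneNameIssues` and `describeIssues` are hypothetical, the gene column is assumed to have the header "Gene", and the delimiter is guessed from the header line. It uses a plain split and does not handle quoted fields that contain the delimiter. It reports the offending rows rather than filtering them out.

```typescript
// Hypothetical client-side validation sketch; not the project's actual code.
interface GeneColumnIssue {
  row: number;                // 1-based line number in the file (the header is line 1)
  reason: "missing" | "NA";   // empty gene name, or the literal string "NA"
}

function findGeneNameIssues(text: string, geneHeader: string = "Gene"): GeneColumnIssue[] {
  const lines = text.split(/\r?\n/);
  // Ignore trailing empty lines so a final newline is not reported as a bad row.
  while (lines.length > 0 && lines[lines.length - 1].trim() === "") {
    lines.pop();
  }
  if (lines.length === 0) {
    return [];
  }

  // Assumption: the file is tab-delimited if the header contains a tab, otherwise comma-delimited.
  const delimiter = lines[0].indexOf("\t") >= 0 ? "\t" : ",";
  const header = lines[0].split(delimiter).map(h => h.trim().toLowerCase());
  const geneIndex = header.indexOf(geneHeader.toLowerCase());
  if (geneIndex < 0) {
    throw new Error(`No "${geneHeader}" column found in the header row`);
  }

  const issues: GeneColumnIssue[] = [];
  for (let i = 1; i < lines.length; i++) {
    const raw = lines[i].split(delimiter)[geneIndex] ?? "";
    const cell = raw.trim().replace(/^"|"$/g, ""); // strip one pair of surrounding quotes
    if (cell === "") {
      issues.push({ row: i + 1, reason: "missing" });
    } else if (cell === "NA") {
      issues.push({ row: i + 1, reason: "NA" });
    }
  }
  return issues;
}

// Build a short message that points at the first few bad lines,
// so the user does not have to scan the whole file to find them.
function describeIssues(issues: GeneColumnIssue[], maxShown: number = 5): string {
  if (issues.length === 0) {
    return "";
  }
  const shown = issues
    .slice(0, maxShown)
    .map(issue => `line ${issue.row} (${issue.reason})`)
    .join(", ");
  const more = issues.length > maxShown ? `, and ${issues.length - maxShown} more` : "";
  return `${issues.length} row(s) have an unusable gene name: ${shown}${more}.`;
}
```

With a check along these lines, the upload could be rejected before the file reaches the server, with a message that names specific line numbers instead of the raw R error.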

What do you guys think?

@maxkfranz
Member Author

Real data has far too many rows to just scroll through in Excel. People give up and conclude that "the site is broken".
