Allow commas in state of history download #20088

Merged
merged 1 commit into from
Mar 25, 2024

Conversation

potelux
Contributor

@potelux potelux commented Mar 15, 2024

Proposed change

When exporting history data to a CSV, surround the state value in quotes so that any internal commas do not create new columns in the output file. Additionally, escape any quotes in the state value.
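A minimal sketch of the quoting described above, for illustration only. The helper names `quoteCsvField` and `toCsvRow` are hypothetical and are not taken from the merged commit; the column order (entity id, state, timestamp) follows the snippets posted later in this conversation.

```typescript
// Hypothetical helper: always wrap the value in double quotes and
// double any double quote inside it, so the whole value stays one cell.
function quoteCsvField(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}

// Hypothetical row builder: entity id, quoted state, preformatted timestamp.
function toCsvRow(entityId: string, state: string, lastChanged: string): string {
  return `${entityId},${quoteCsvField(state)},${lastChanged}\n`;
}

// A state containing both a comma and quotes still yields three columns.
console.log(toCsvRow("sensor.note", 'Door "A", open', "2024-03-15 14:54:00"));
```

Without the quoting, the comma inside the state would split the row into four columns.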

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (thank you!)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • There is no commented out code in this PR.
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:


@home-assistant home-assistant bot left a comment


Hi @potelux

It seems you haven't yet signed a CLA. Please do so here.

Once you do that we will be able to review and accept this pull request.

Thanks!

@home-assistant

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

@home-assistant home-assistant bot marked this pull request as draft March 15, 2024 14:54
@potelux potelux force-pushed the commas-in-history-download branch 2 times, most recently from 86185ac to 1ac5710 Compare March 15, 2024 15:04
@potelux potelux marked this pull request as ready for review March 15, 2024 15:05
@piitaya
Member

piitaya commented Mar 18, 2024

Should we escape only when it's needed? In most cases, the state will be a number or a string without commas, so we would add quotes to every state even when they are not needed.

@potelux
Contributor Author

potelux commented Mar 18, 2024

> Should we escape only when it's needed? In most cases, the state will be a number or a string without commas, so we would add quotes to every state even when they are not needed.

Checking whether the escape is needed would require more overhead than just replacing. You would have to iterate over every string to search for commas and/or quotes, then iterate again over the strings that contain commas and/or quotes to replace any internal quotes.

Also, the state (TimelineState.state) will always be a string, so we don't need to check if it is of a different type. The surrounding quotes won't show up in any system meant to handle CSV as it is part of the CSV spec. It simply means that everything between quotes is a single cell. This may even catch edge-cases we have not considered.
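To illustrate the point that the surrounding quotes are not part of the cell value, here is a rough sketch of how a CSV reader might parse one line. `parseCsvLine` is a hypothetical function written for this example; it is deliberately lenient and does not handle fields that span multiple lines.

```typescript
// Hypothetical single-line CSV parser, for illustration only.
function parseCsvLine(line: string, delimiter = ","): string[] {
  const cells: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"') {
        if (line[i + 1] === '"') {
          current += '"'; // doubled quote is an escaped literal quote
          i++;
        } else {
          inQuotes = false; // closing quote, not part of the value
        }
      } else {
        current += ch; // delimiters inside quotes are ordinary characters
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote, not part of the value
    } else if (ch === delimiter) {
      cells.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  cells.push(current);
  return cells;
}

// Both lines produce exactly three cells; the quotes around "on" disappear.
console.log(parseCsvLine('sensor.x,"on",t'));
console.log(parseCsvLine('sensor.note,"Door ""A"", open",t'));
```

A reader built along these lines returns `on` for both `on` and `"on"`, which is why always quoting is harmless to consumers even when it is not strictly required.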

@potelux potelux force-pushed the commas-in-history-download branch from 1ac5710 to 52414e0 Compare March 18, 2024 14:34
@potelux
Contributor Author

potelux commented Mar 18, 2024

This change would bring us into compliance with points 7 and 8 of https://csv-spec.org/

@potelux
Contributor Author

potelux commented Mar 18, 2024

> This change would bring us into compliance with points 7 and 8 of https://csv-spec.org/

Point 10 does suggest only using quotes when required, but permits the usage even if it is not required.

@piitaya
Member

piitaya commented Mar 18, 2024

We can do something like this. WDYT?

const safeState = s.state.includes(",") ? `"${s.state.replaceAll('"', '""')}"` : s.state;
csv.push(`${entityId},${safeState},${formatDate(s.last_changed)}\n`);

@potelux
Contributor Author

potelux commented Mar 18, 2024

> We can do something like this. WDYT?
>
> const safeState = s.state.includes(",") ? `"${s.state.replaceAll('"', '""')}"` : s.state;
> csv.push(`${entityId},${safeState},${formatDate(s.last_changed)}\n`);

That would be better since it is the recommendation of Point 10 in the CSV standard. We would need to check for a " as well, though.

We could extract the "," into a const, maybe CSV_DELIMITER = ",". This delimiter is used each time data is added to the CSV, but technically it could be ",", ";", or "\t". We only need to check whether the string contains a comma if the comma is the current delimiter. If a semicolon is the delimiter, then that is what we need to check the string for.

EDIT:
So, something like:

const CSV_DELIMITER = ",";
// A regex literal does not interpolate template strings, so build the pattern with RegExp.
const safeState = new RegExp(`${CSV_DELIMITER}|"`).test(s.state) ? `"${s.state.replaceAll('"', '""')}"` : s.state;
csv.push(`${entityId}${CSV_DELIMITER}${safeState}${CSV_DELIMITER}${formatDate(s.last_changed)}\n`);

If regex is frowned upon, we can do:

const CSV_DELIMITER = ",";
const safeState = (s.state.includes(CSV_DELIMITER) || s.state.includes('"')) ? `"${s.state.replaceAll('"', '""')}"` : s.state;
csv.push(`${entityId}${CSV_DELIMITER}${safeState}${CSV_DELIMITER}${formatDate(s.last_changed)}\n`);

Where would be the best place for that const? Would it make sense in src/data/history.ts?
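For reference, the conditional approach discussed above could be wrapped in a small function along these lines. This is only a sketch: `escapeCsvField` is a hypothetical name, the delimiter parameter is the idea floated in this comment rather than anything confirmed to be in the merged code, and the line-break check is an extra assumption that goes beyond what was proposed in this thread (CSV fields containing line breaks also need quoting).

```typescript
// Hypothetical helper: quote a field only when it contains the delimiter,
// a double quote, or a line break; otherwise return it unchanged.
function escapeCsvField(value: string, delimiter = ","): string {
  const needsQuotes =
    value.includes(delimiter) ||
    value.includes('"') ||
    value.includes("\n") || // assumption: not part of the original proposal
    value.includes("\r");
  // Inner quotes are doubled, then the whole value is wrapped in quotes.
  return needsQuotes ? `"${value.replace(/"/g, '""')}"` : value;
}

console.log(escapeCsvField("on"));            // plain value, left as-is
console.log(escapeCsvField("a,b"));           // contains the delimiter
console.log(escapeCsvField('6" pipe'));       // contains a quote but no comma
console.log(escapeCsvField("a,b", ";"));      // comma is harmless with ";"
```

The third call shows why checking for the comma alone is not enough: a bare quote in an unquoted field would produce a malformed row.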

@potelux potelux force-pushed the commas-in-history-download branch from 52414e0 to 3ba80cd Compare March 20, 2024 00:44
@piitaya
Member

piitaya commented Mar 20, 2024

I don't think we need a delimiter const for now. If we want to extend the CSV export in the future, I think we should rely on a library that does it properly to avoid reinventing the wheel.
For now, I think we should keep it simple 🙂

@potelux potelux force-pushed the commas-in-history-download branch 2 times, most recently from ad0c718 to 40720a6 Compare March 20, 2024 16:29
@potelux potelux force-pushed the commas-in-history-download branch from 40720a6 to 60b65a1 Compare March 21, 2024 15:13
Member

@balloob balloob left a comment


Thank you ❤️

@balloob balloob merged commit 869ace7 into home-assistant:dev Mar 25, 2024
13 checks passed
Development

Successfully merging this pull request may close these issues.

History CSV Export does not support Text Sensors containing commas
3 participants