This repository has been archived by the owner on Dec 10, 2021. It is now read-only.

fix(plugin-chart-word-cloud): ensure top results are always displayed #841

Merged: 2 commits into apache-superset:master on Jan 12, 2021

agatapst
Contributor

@agatapst agatapst commented Nov 26, 2020

While resizing the chart, top results were sometimes filtered out because their sizes were too big.
This solution makes sure that the top 10% of results are always displayed by gradually scaling down
the chart if needed.

fix apache/superset#11784

🐛 Bug Fix
Before:
[screenshot: before_resizing]

After:
[screenshot: resizing_after]

Issues associated with this bug have been reported against the D3 library and are not completely solved yet:
- most important words (highest counts) missing if they don't fit
- missing words in final layout

The cost of my workaround may be that the chart renders more slowly while resizing.
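
For readers without the diff at hand, here is a minimal sketch of the retry idea described above. It is not the code from this PR: the names `layoutWithTopResults` and `LayoutFn`, the step of 1.1 and the cap of 3 are all assumptions made for illustration. It also assumes the layout is computed on a canvas enlarged by `scaleFactor` and then drawn scaled down, so a larger factor gives big words more room.

```typescript
// Hypothetical sketch: none of these names or constants come from the PR.
interface Word {
  text: string;
  size: number;
}

// Stand-in for one layout run (e.g. a d3-cloud pass): takes the words and a
// scale factor, returns only the words it managed to place.
type LayoutFn = (words: Word[], scaleFactor: number) => Word[];

function layoutWithTopResults(
  words: Word[],
  layout: LayoutFn,
  topFraction = 0.1, // share of results that must always be shown
  scaleStep = 1.1, // assumed growth per retry
  maxScaleFactor = 3, // assumed upper bound, to stop retrying eventually
): { placed: Word[]; scaleFactor: number } {
  // The words that must survive: the top fraction by size.
  const sorted = [...words].sort((a, b) => b.size - a.size);
  const required = sorted.slice(0, Math.ceil(sorted.length * topFraction));

  const allRequiredPlaced = (result: Word[]): boolean => {
    const texts = new Set(result.map(w => w.text));
    return required.every(w => texts.has(w.text));
  };

  let scaleFactor = 1;
  let placed = layout(words, scaleFactor);
  // Retry with a progressively larger factor until every required word is
  // placed or the cap is reached. Each retry is a full layout pass.
  while (!allRequiredPlaced(placed) && scaleFactor * scaleStep <= maxScaleFactor) {
    scaleFactor *= scaleStep;
    placed = layout(words, scaleFactor);
  }
  return { placed, scaleFactor };
}
```

Because every retry repeats the whole layout, the number of passes grows with how far the chart has to be scaled down, which is where the slowdown while resizing would come from.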

cc @villebro

@agatapst agatapst requested a review from a team as a code owner November 26, 2020 12:10
@vercel

vercel bot commented Nov 26, 2020

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/superset/superset-ui/j6yybjmdb
✅ Preview: https://superset-ui-git-fork-agatapst-fix-word-cloud-resizing.superset.vercel.app

@codecov

codecov bot commented Nov 26, 2020

Codecov Report

Merging #841 (7ce156a) into master (cc11fdb) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #841      +/-   ##
==========================================
- Coverage   26.73%   26.67%   -0.06%     
==========================================
  Files         405      405              
  Lines        8248     8266      +18     
  Branches     1126     1128       +2     
==========================================
  Hits         2205     2205              
- Misses       5914     5932      +18     
  Partials      129      129              

| Impacted Files | Coverage Δ |
|---|---|
| ...ns/plugin-chart-word-cloud/src/chart/WordCloud.tsx | 0.00% <0.00%> (ø) |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@junlincc
Contributor

junlincc commented Nov 26, 2020

Thanks for the quick fix @agatapst! This is awesome!
A bit of context for you: we are in the process of migrating from D3 charts to Echarts, with Echarts becoming Superset's main charting library. Luckily, Echarts offers a word cloud as well; see https://github.com/ecomfe/echarts-wordcloud.
Due to D3's limitations in wordcloud, instead of investing further in improving it, the next step will be testing the Echarts wordcloud to see if it reaches full feature parity, and then migrating. This is not a priority yet, but I would like you to get very familiar with data viz in Superset, if you are interested. Meet @villebro here; he will be your go-to person for any questions. :)

Contributor

@villebro villebro left a comment

Thanks @agatapst, looks really good! While this will have a performance impact, I think it's worth it given how misleading the results are without this fix.

A few suggestions:

  • I've noticed that Word Cloud is especially poor at rendering long "words". For instance, the value "Latin America and Caribbean Islands" will tend not to render at all if it's the only value, unless the chart is really big. So having a story in the storybook with a few really long words would be great.
  • The logic for ensuring that at least the first 10% of values are present is slightly difficult to follow. Do you think we could break out the relevant utility function(s) and add a few unit tests to make sure it works with e.g. 1, 5, 20 and 100 values, just to make sure the rounding works correctly?
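
As an illustration of the second suggestion, a broken-out helper could be as small as the sketch below. The name `minTopResultsCount` and the choice to round up (so that even a single value is kept) are assumptions; the PR's actual rounding is not shown in this thread.

```typescript
// Hypothetical helper, not taken from the PR. Returns how many of the
// highest-ranked values must always be displayed.
function minTopResultsCount(total: number, fraction = 0.1): number {
  if (total <= 0) {
    return 0;
  }
  // Round up so small inputs still keep at least one value, and never ask
  // for more values than exist.
  return Math.min(total, Math.max(1, Math.ceil(total * fraction)));
}

// The cases named in the suggestion above.
const exampleCounts = [1, 5, 20, 100].map(n => minTopResultsCount(n));
```

With round-up semantics, 1, 5, 20 and 100 values give 1, 1, 2 and 10 required results respectively. If the real implementation rounds down or to nearest instead, the 5-value case is the one where the expected result would differ.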

@agatapst
Contributor Author

agatapst commented Dec 1, 2020

@villebro Thanks for your suggestions, I will think that through and come back with some ideas soon. 🙂

@villebro
Contributor

@agatapst this would be great to get merged, but needs a rebase.

@agatapst
Contributor Author

agatapst commented Jan 11, 2021

@villebro sorry for the delay! It's rebased now.
Update: I can see some checks failing, working on them now.

@villebro
Contributor

Thanks @agatapst!

@agatapst
Contributor Author

agatapst commented Jan 12, 2021

@villebro all checks have passed. If the PR is OK without any changes, it's ready to merge.
I can come back to your suggestions later, because tbh there have always been higher-priority tasks to work on, but if you think this PR should be improved asap, I can work on it today/tomorrow.

Contributor

@ktmud ktmud left a comment

Seems like a reasonable solution.

@ktmud ktmud merged commit 286bee7 into apache-superset:master Jan 12, 2021
Successfully merging this pull request may close these issues.

[wordcloud]mis-interpretation of data while resizing dashboard window
4 participants