Created by:
inventandchill#7140
sidcode#1729
Source code: https://nbviewer.org/github/inventandchill/TE_research/blob/main/Praise_charts.ipynb
Thanks to Teodoro teodoro.criscione#0572 for sharing the Python parser for TE praise data
Thanks to Shawn ygg_anderson#4998 for the idea to exclude iviangita from the dataset
Thanks to 🐙 octopus#5508 for the idea to use the log of the data
Thanks to Angela akrtws (TE Academy)#4246 for the idea to check the relationship between the amounts Praised and Received
Thanks to Livia liviade#1387 for positive and useful feedback
Batch 1: 2020-Sep-29 to 2021-May-07
Batch 3: Mostly from 2021-May-08 to 2021-Jul-11
Batch 2: A small amount of data from 2020-Sep-23 to 2020-Nov-20
To measure the health of the system in depth, we need to visualize the data in several different ways.
- Is there any imbalance or whale behaviour in the distribution of Praise and IH flows?
- How is IH distributed in general and for the largest Praise sender? Is it close to a normal distribution, or skewed?
- Is there a relationship between the amount praised and the amount received, both for IH and for the count of events?
We use Python to create flow diagrams (Sankey) for 2 types of data:
- Number of Praises (count of events)
- Total sum of IH
"Rest From" combines all Senders who are not in the Top 10 by total IH contributed
"Rest To" combines all Receivers who are not in the Top 15 by total IH received
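The grouping into "Rest From" and "Rest To" can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: the helper `group_top_n` and the sample totals are hypothetical, and a small `n` is used so the effect is visible.

```python
def group_top_n(totals, n, rest_label):
    """Keep the n largest entries of `totals`; fold all others into `rest_label`."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    grouped = dict(ranked[:n])
    rest = sum(value for _, value in ranked[n:])
    if rest:
        grouped[rest_label] = rest
    return grouped


# Hypothetical IH totals per sender (illustrative numbers only)
sender_totals = {"s1": 50, "s2": 30, "s3": 10, "s4": 5, "s5": 5}

# The charts use Top 10 senders and Top 15 receivers; n=3 keeps the example short
top_senders = group_top_n(sender_totals, n=3, rest_label="Rest From")
print(top_senders)  # {'s1': 50, 's2': 30, 's3': 10, 'Rest From': 10}
```

The same helper would be called with `rest_label="Rest To"` on the receiver totals.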
Example for the flow charts:
- A_sender praises C_receiver with IH=1
- B_sender praises C_receiver with IH=3
- A_sender praises D_receiver with IH=5
Count of praise (events) for A_sender: 1+1=2. Sum of praise for A_sender: 1+5=6.
Count of praise (events) for C_receiver: 1+1=2. Sum of praise for C_receiver: 1+3=4.
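The same arithmetic can be reproduced in a few lines of Python. This is only a sketch of the counting logic using the three example praises; the variable names are ours and the notebook may aggregate differently.

```python
from collections import defaultdict

# (sender, receiver, IH) rows taken from the example above
praises = [
    ("A_sender", "C_receiver", 1),
    ("B_sender", "C_receiver", 3),
    ("A_sender", "D_receiver", 5),
]

sent_count, sent_sum = defaultdict(int), defaultdict(int)
recv_count, recv_sum = defaultdict(int), defaultdict(int)

for sender, receiver, ih in praises:
    sent_count[sender] += 1      # count of praise events sent
    sent_sum[sender] += ih       # sum of IH sent
    recv_count[receiver] += 1    # count of praise events received
    recv_sum[receiver] += ih     # sum of IH received

print(sent_count["A_sender"], sent_sum["A_sender"])      # 2 6
print(recv_count["C_receiver"], recv_sum["C_receiver"])  # 2 4
```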
For the histograms, the X axis is the size of a praise (IH) and the Y axis is the number of praises of that size. Example: if we have 5 praises with IH = 7, then for x = 7 we get y = 5.
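As a small illustration of how the histogram values are obtained, the praise sizes can simply be tallied. The list `ih_values` below is made up for the example; only the "5 praises with IH = 7" part comes from the text.

```python
from collections import Counter

# Hypothetical praise sizes: five praises with IH = 7, plus a few others
ih_values = [7, 7, 7, 7, 7, 1, 1, 13]

# x = praise size, y = number of praises with that size
histogram = Counter(ih_values)

for size in sorted(histogram):
    print(size, histogram[size])
# 1 2
# 7 5
# 13 1
```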
Praise flow for Batch 1 - Count of Praise
Praise flow for Batch 2 - Count of Praise
Praise flow for Batch 3 - Count of Praise
Praise flow for Batch 1 - Sum of Praise
Praise flow for Batch 2 - Sum of Praise
Praise flow for Batch 3 - Sum of Praise
Praise flow for Batch 1 - Sum of Praise, exclude iviangita
Praise flow for Batch 3 - Sum of Praise, exclude iviangita
Praise amount histogram for all 3 batches. Simple and zoomed
Histogram for iviangita
Relationship between IH Praised and IH received. Relationship between amount of events Praised and amount of events received.
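One possible way to put a number on such a relationship is a Pearson correlation on log-transformed totals. This is a sketch under our own assumptions: the write-up does not say which measure was used, the per-participant totals below are invented, and participants with a zero on either side would need to be dropped or offset before taking the log.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-participant totals (same participant at the same index)
praised = [1, 10, 100, 1000]
received = [3, 8, 150, 700]

# Log scale, since the totals span several orders of magnitude
log_praised = [math.log10(v) for v in praised]
log_received = [math.log10(v) for v in received]

r = pearson(log_praised, log_received)
print(round(r, 2))
```

A value of `r` close to 1 would indicate that participants who praise more also tend to receive more.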
- The flow charts for praise count (events) and praise IH sum are quite similar. This is consistent with the praise IH values having a close-to-skewed distribution
- On average, the Top 10 praise senders account for more than 75% of IH
- On average, the Top 15 praise receivers account for around 70% of IH
- Except for iviangita, the other praise senders show no big outliers on either metric (praise count (events) and praise IH sum)
- The distribution of IH of the largest praise sender (iviangita) corresponds to the distribution of IH of all praise senders, except in Batch 2
- IH Praised and IH Received per participant are strongly related
- The number of events Praised and the number of events Received per participant are strongly related
The analysis suggests that the system is relatively healthy, without many whales controlling the whole praise distribution, but there is always room for improvement.
As an indicator of system diversification, it is possible to track the Gini index based on two key metrics:
- Praise count (events)
- Praise IH sum
Both metrics can be used for praise senders and receivers
We can calculate this indicator for any configuration: all time, or batches of specific time periods such as 1 week.
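A minimal sketch of the Gini index for one batch is shown below, assuming the input is a list of per-participant totals (praise count or praise IH sum). The function name `gini` and the sample values are ours; this uses the standard formula on sorted values and is not taken from the notebook.

```python
def gini(values):
    """Gini index of non-negative values: 0 means perfectly equal,
    values approaching 1 mean the total is concentrated in few participants."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula on ascending data: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-participant IH totals for two batches
print(gini([5, 5, 5, 5]))    # 0.0  (everyone contributes equally)
print(gini([0, 0, 0, 10]))   # 0.75 (one participant holds everything)
```

Tracking this value per batch or per week would show whether praise is becoming more or less concentrated over time.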