Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage of non-needed social network data #1061

Closed
Changaco opened this issue Apr 10, 2018 · 2 comments
Closed

Storage of non-needed social network data #1061

Changaco opened this issue Apr 10, 2018 · 2 comments
Assignees
Labels
linked accounts integration with other platforms
Milestone

Comments

@Changaco
Copy link
Member

The elsewhere table has two columns for user data that we don't actually use: email and extra_info. We should probably drop them, especially since the GDPR is looming (liberapay/liberapay.org#35).

Any thoughts?

@Changaco Changaco added discussion issues that are undecided or for which feedback is invited linked accounts integration with other platforms labels Apr 10, 2018
@Changaco
Copy link
Member Author

Changaco commented Apr 10, 2018

I just remembered that we get account descriptions from extra_info (source code). We need to extract that into a new column.

Edit: Instead of dropping extra_info we could change it from a column for "everything else" to a column for "other public information" (white listed keys only).

@Changaco Changaco added this to the GDPR milestone Apr 25, 2018
@Changaco Changaco removed the discussion issues that are undecided or for which feedback is invited label May 7, 2018
@Changaco Changaco self-assigned this May 7, 2018
@Changaco
Copy link
Member Author

Changaco commented May 17, 2018

Here's an overview of what we have in the extra_info column:

select platform, key, count(*) from (select platform, json_object_keys(extra_info) as key from elsewhere where json_typeof(extra_info) = 'object') x group by platform, key order by platform, key;
   platform    |                key                 | count 
---------------+------------------------------------+-------
 bitbucket     | account_id                         |   331
 bitbucket     | account_status                     |   252
 bitbucket     | created_on                         |   443
 bitbucket     | has_2fa_enabled                    |   116
 bitbucket     | is_staff                           |   440
 bitbucket     | links                              |   452
 bitbucket     | location                           |   342
 bitbucket     | nickname                           |   254
 bitbucket     | properties                         |    30
 bitbucket     | website                            |   342
 facebook      | bio                                |     1
 facebook      | education                          |     1
 facebook      | first_name                         |     7
 facebook      | gender                             |     7
 facebook      | hometown                           |     2
 facebook      | languages                          |     2
 facebook      | last_name                          |     7
 facebook      | link                               |     7
 facebook      | locale                             |     7
 facebook      | location                           |     1
 facebook      | middle_name                        |     1
 facebook      | quotes                             |     2
 facebook      | timezone                           |     7
 facebook      | updated_time                       |     7
 facebook      | verified                           |     7
 facebook      | work                               |     4
 github        | bio                                |   272
 github        | blog                               |  5320
 github        | company                            |  5320
 github        | created_at                         |  5320
 github        | followers                          |  5320
 github        | following                          |  5320
 github        | gravatar_id                        | 13873
 github        | hireable                           |  5320
 github        | location                           |  5320
 github        | node_id                            |  9528
 github        | public_gists                       |  5320
 github        | public_repos                       |  5320
 github        | site_admin                         | 13873
 github        | twitter_username                   |  1422
 github        | updated_at                         |  5320
 github        | url                                | 13873
 gitlab        | bio                                |    15
 gitlab        | can_create_group                   |   958
 gitlab        | can_create_project                 |   958
 gitlab        | color_scheme_id                    |   958
 gitlab        | confirmed_at                       |   958
 gitlab        | created_at                         |   971
 gitlab        | current_sign_in_at                 |   958
 gitlab        | external                           |   958
 gitlab        | extra_shared_runners_minutes_limit |   436
 gitlab        | identities                         |   958
 gitlab        | job_title                          |   106
 gitlab        | last_activity_on                   |   958
 gitlab        | last_sign_in_at                    |   958
 gitlab        | linkedin                           |   971
 gitlab        | location                           |   971
 gitlab        | organization                       |   971
 gitlab        | private_profile                    |   941
 gitlab        | projects_limit                     |   958
 gitlab        | public_email                       |   787
 gitlab        | shared_runners_minutes_limit       |   958
 gitlab        | skype                              |   971
 gitlab        | state                              |   985
 gitlab        | theme_id                           |   957
 gitlab        | twitter                            |   971
 gitlab        | two_factor_enabled                 |   958
 gitlab        | website_url                        |   971
 gitlab        | web_url                            |   985
 gitlab        | work_information                   |    61
 google        | aboutMe                            |    42
 google        | ageRange                           |   240
 google        | birthday                           |    75
 google        | braggingRights                     |    70
 google        | circledByCount                     |   363
 google        | cover                              |   254
 google        | domain                             |    30
 google        | etag                               |  1603
 google        | gender                             |   315
 google        | isPlusUser                         |   428
 google        | kind                               |   625
 google        | language                           |   428
 google        | name                               |   427
 google        | names                              |   939
 google        | nickname                           |    95
 google        | nicknames                          |   168
 google        | objectType                         |   625
 google        | occupation                         |   165
 google        | organizations                      |   187
 google        | photos                             |   437
 google        | placesLived                        |   192
 google        | relationshipStatus                 |    58
 google        | skills                             |    81
 google        | tagline                            |   193
 google        | taglines                           |    28
 google        | url                                |   591
 google        | urls                               |   232
 google        | verified                           |   428
 linuxfr       | created_at                         |   105
 mastodon      | acct                               |  6433
 mastodon      | all_emojis                         |     2
 mastodon      | avatar                             |  6433
 mastodon      | bot                                |  4237
 mastodon      | bot_identified                     |     1
 mastodon      | created_at                         |  6433
 mastodon      | discoverable                       |   330
 mastodon      | emojis                             |  4290
 mastodon      | fields                             |  4289
 mastodon      | followers_count                    |  6433
 mastodon      | following_count                    |  6433
 mastodon      | group                              |   331
 mastodon      | header                             |  6433
 mastodon      | header_static                      |  6433
 mastodon      | id                                 |  3411
 mastodon      | last_status_at                     |   663
 mastodon      | locked                             |  6433
 mastodon      | moved                              |   143
 mastodon      | mute_expires_at                    |     1
 mastodon      | note                               |  2114
 mastodon      | oauth_authentications              |     1
 mastodon      | pleroma                            |    20
 mastodon      | profile_emojis                     |     2
 mastodon      | source                             |  1863
 mastodon      | statuses_count                     |  6433
 mastodon      | subscribing_count                  |     3
 mastodon      | supporter                          |     1
 openstreetmap | osm                                |   363
 pleroma       | acct                               |    65
 pleroma       | avatar                             |    65
 pleroma       | bot                                |    65
 pleroma       | created_at                         |    65
 pleroma       | emojis                             |    65
 pleroma       | fields                             |    65
 pleroma       | followers_count                    |    65
 pleroma       | following_count                    |    65
 pleroma       | follow_requests_count              |    24
 pleroma       | header                             |    65
 pleroma       | header_static                      |    65
 pleroma       | id                                 |    20
 pleroma       | locked                             |    65
 pleroma       | pleroma                            |    65
 pleroma       | source                             |    65
 pleroma       | statuses_count                     |    65
 twitch        | broadcaster_language               |   192
 twitch        | broadcaster_type                   |   684
 twitch        | created_at                         |   192
 twitch        | description                        |   193
 twitch        | followers                          |   192
 twitch        | game                               |   192
 twitch        | language                           |   192
 twitch        | mature                             |   192
 twitch        | offline_image_url                  |   492
 twitch        | partner                            |   192
 twitch        | profile_banner                     |   192
 twitch        | profile_banner_background_color    |   192
 twitch        | status                             |   192
 twitch        | type                               |   492
 twitch        | updated_at                         |   192
 twitch        | url                                |   192
 twitch        | video_banner                       |   192
 twitch        | view_count                         |   492
 twitch        | views                              |   192
 twitter       | blocked_by                         |  9229
 twitter       | blocking                           |  9229
 twitter       | can_media_tag                      |   468
 twitter       | contributors_enabled               | 14582
 twitter       | created_at                         | 14582
 twitter       | default_profile                    | 14582
 twitter       | default_profile_image              | 14582
 twitter       | description                        |  7836
 twitter       | entities                           |  2553
 twitter       | favourites_count                   | 14582
 twitter       | followed_by                        |   468
 twitter       | followers_count                    | 14582
 twitter       | following                          | 14582
 twitter       | follow_request_sent                | 14582
 twitter       | friends_count                      | 14582
 twitter       | geo_enabled                        | 14582
 twitter       | has_extended_profile               | 14582
 twitter       | is_translation_enabled             | 14582
 twitter       | is_translator                      | 14582
 twitter       | lang                               | 14582
 twitter       | listed_count                       | 14582
 twitter       | live_following                     |  9046
 twitter       | location                           | 14582
 twitter       | muting                             |  9229
 twitter       | needs_phone_verification           |  2594
 twitter       | notifications                      | 14582
 twitter       | profile_background_color           | 14582
 twitter       | profile_background_image_url       | 14582
 twitter       | profile_background_image_url_https | 14582
 twitter       | profile_background_tile            | 14582
 twitter       | profile_banner_url                 | 10660
 twitter       | profile_image_url                  | 14582
 twitter       | profile_link_color                 | 14582
 twitter       | profile_location                   |  2723
 twitter       | profile_sidebar_border_color       | 14582
 twitter       | profile_sidebar_fill_color         | 14582
 twitter       | profile_text_color                 | 14582
 twitter       | profile_use_background_image       | 14582
 twitter       | protected                          | 14582
 twitter       | status                             |   712
 twitter       | statuses_count                     | 14582
 twitter       | suspended                          |  2594
 twitter       | time_zone                          | 14582
 twitter       | translator_type                    | 14246
 twitter       | url                                | 14582
 twitter       | utc_offset                         | 14582
 twitter       | verified                           | 14582
 youtube       | etag                               |  2214
 youtube       | id                                 |  1162
 youtube       | kind                               |  2214
 youtube       | snippet                            |  2214
(213 rows)

select key, count(*) from (select json_object_keys(extra_info->'osm') as key from elsewhere where platform = 'openstreetmap' and json_typeof(extra_info) = 'object') x group by key order by key;
     key      | count 
--------------+-------
 @attribution |   164
 @copyright   |   164
 @generator   |   363
 @license     |   164
 user         |   363
 @version     |   363
(6 rows)

select key, count(*) from (select json_object_keys(extra_info->'osm'->'user') as key from elsewhere where platform = 'openstreetmap' and json_typeof(extra_info) = 'object') x group by key order by key;
        key        | count 
-------------------+-------
 @account_created  |   363
 blocks            |   363
 changesets        |   363
 contributor-terms |   363
 description       |   363
 @display_name     |   363
 home              |   208
 @id               |   363
 img               |   261
 languages         |   358
 messages          |   358
 roles             |   363
 traces            |   363
(13 rows)

select key, count(*) from (select json_object_keys(extra_info->'snippet') as key from elsewhere where platform = 'youtube' and json_typeof(extra_info) = 'object') x group by key order by key;
       key       | count 
-----------------+-------
 channelId       |  1162
 country         |   501
 customUrl       |   581
 defaultLanguage |    36
 description     |  1181
 localized       |  1052
 publishedAt     |  2214
 resourceId      |  1162
 thumbnails      |  2214
(9 rows)

Changaco added a commit that referenced this issue Sep 12, 2021
In addition to modifying the database queries used by the exploration pages, this commit also closes #1061 by dropping the `extra_info` columns, because fetching all this extraneous data from the DB significantly degrades the performance of two of the exploration pages.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linked accounts integration with other platforms
Development

No branches or pull requests

1 participant