forked from eusporg/alphaicon
0 parents
commit 3b419ba
Showing 68 changed files with 10,809 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
Copyright © 2021 European University at St. Petersburg and Skolkovo Institute of Science and Technology | ||
|
||
Moral rights: | ||
Kirill Polovnikov | ||
Nikita Pospelov | ||
Dmitriy Skougarevskiy | ||
|
||
The version control system provides attribution for specific lines of code. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
cff-version: 1.2.0 | ||
message: "If you use this algorithm, please cite it as below." | ||
authors: | ||
- family-names: "Polovnikov" | ||
given-names: "Kirill" | ||
orcid: "https://orcid.org/0000-0001-9903-9623" | ||
- family-names: "Pospelov" | ||
given-names: "Nikita" | ||
- family-names: "Skougarevskiy" | ||
given-names: "Dmitriy" | ||
orcid: "https://orcid.org/0000-0002-4022-6210" | ||
title: "α-Indirect Control in Onion-like Networks" | ||
date-released: 2021-09-16 | ||
year: 2021 | ||
url: "https://github.com/eusporg/alphaicon" | ||
preferred-citation: | ||
type: unpublished | ||
authors: | ||
- family-names: "Polovnikov" | ||
given-names: "Kirill" | ||
orcid: "https://orcid.org/0000-0001-9903-9623" | ||
- family-names: "Pospelov" | ||
given-names: "Nikita" | ||
- family-names: "Skougarevskiy" | ||
given-names: "Dmitriy" | ||
orcid: "https://orcid.org/0000-0002-4022-6210" | ||
url: "https://arxiv.org/abs/2109.07181" | ||
title: "α-Indirect Control in Onion-like Networks" | ||
year: 2021 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
Imports: | ||
data.table (>= 1.13.2), | ||
stringi (>= 1.4.4), | ||
stringr (>= 1.3.1), | ||
lubridate (>= 1.7.10), | ||
remotes (>= 2.3.0), | ||
usethis (>= 2.0.1), | ||
ndjson (>= 0.8.0), | ||
igraph (>= 1.2.6), | ||
Matrix (>= 1.3-3), | ||
matrixStats (>= 0.59.0), | ||
stargazer (>= 5.2.1), | ||
fastDummies (>= 1.6.3), | ||
ggplot2 (>= 3.3.3), | ||
ggthemes (>= 4.2.4), | ||
ggrepel (>= 0.9.1), | ||
ggnetwork (>= 0.5.9), | ||
showtext (>= 0.9-2) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,102 @@ | ||
# List of R dependencies for the project | ||
DEPENDENCIES: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# Helper function to install the dependencies | ||
code/helper_functions/install_dependencies.r: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# People with Significant Control snapshot from Companies House | ||
data/uk/persons-with-significant-control-snapshot-2021-08-02.txt: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# Company Data Product snapshot from Companies House | ||
data/uk/BasicCompanyDataAsOneFile-2021-08-01.csv: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# Industry sector names | ||
data/uk/sic_2007_code_list.csv: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# CorpWatch SEC 10-K filings data: company name-id mapping | ||
data/corpwatch_api_tables_csv_14aug21/cik_name_lookup.csv: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# CorpWatch SEC 10-K filings data: basic company information | ||
data/corpwatch_api_tables_csv_14aug21/company_info.csv: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# CorpWatch SEC 10-K filings data: company locations | ||
data/corpwatch_api_tables_csv_14aug21/company_locations.csv: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# Process the PSC snapshot | ||
data/uk/psc_snapshot_2021-08-02.rdata: data/uk/persons-with-significant-control-snapshot-2021-08-02.txt | ||
Rscript code/data_preparation/uk/1a_process_psc_snapshot.r | ||
|
||
# Process the live snapshot of companies data | ||
data/uk/uk_basic_companies_data_2021-08-01.rdata: data/uk/BasicCompanyDataAsOneFile-2021-08-01.csv data/uk/sic_2007_code_list.csv | ||
Rscript code/data_preparation/uk/1b_process_companies_data.r | ||
|
||
# Convert the PSC snapshot to clean company-participant data | ||
output/uk/uk_organisations_participants_2021_long_2aug21.csv: data/uk/psc_snapshot_2021-08-02.rdata data/uk/uk_basic_companies_data_2021-08-01.rdata | ||
Rscript code/data_preparation/uk/2_psc_snapshot_to_participants_panel.r | ||
|
||
# Create SEC 10-K Exhibit 21 company-participant evaluation set matched to PSC and live companies | ||
data/uk/uk_parent_subsidiary_mapping_2020_2021_sec_filers_exhibit21.csv: data/corpwatch_api_tables_csv_14aug21/company_info.csv data/corpwatch_api_tables_csv_14aug21/cik_name_lookup.csv data/corpwatch_api_tables_csv_14aug21/company_locations.csv data/uk/psc_snapshot_2021-08-02.rdata data/uk/uk_basic_companies_data_2021-08-01.rdata | ||
Rscript code/data_preparation/uk/3_prepare_affiliated_entities_evaluation_data.r | ||
|
||
# Classify the network into SH, ST, C, and I entities | ||
output/uk/uk_organisations_participation_graph_core_periphery_membership_6aug21.csv: output/uk/uk_organisations_participants_2021_long_2aug21.csv | ||
jupyter nbconvert --ExecutePreprocessor.timeout=-1 --execute code/alphaicon_paper/1_compute_alphaicon.ipynb | ||
|
||
# Compute the shares by transitivity | ||
transitiveshares := $(wildcard output/uk/transitive/uk_organisations_transitive_ownership_alpha*.csv) | ||
$(transitiveshares): output/uk/uk_organisations_participants_2021_long_2aug21.csv output/uk/uk_organisations_participation_graph_core_periphery_membership_6aug21.csv | ||
jupyter nbconvert --ExecutePreprocessor.timeout=-1 --execute code/alphaicon_paper/1_compute_alphaicon.ipynb | ||
|
||
# Helper function implementing NPI/DPI computation | ||
code/helper_functions/compute_power_index.r: | ||
# Do nothing, the file is created outside the repo | ||
noop | ||
|
||
# Compute the DPI shares | ||
output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_dpi_10000iter.csv: code/helper_functions/compute_power_index.r | ||
Rscript code/alphaicon_paper/2_compute_npi_dpi.r | ||
|
||
# Compute the NPI shares | ||
output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_npi_10000iter.csv: code/helper_functions/compute_power_index.r | ||
Rscript code/alphaicon_paper/2_compute_npi_dpi.r | ||
|
||
# Perform the evaluation of algorithms at different k | ||
output/alphaicon_paper/uk_orgs_algorithm_evaluation_recall.csv: output/uk/uk_organisations_participants_2021_long_2aug21.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_dpi_10000iter.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_npi_10000iter.csv $(transitiveshares) output/uk/uk_organisations_participation_graph_core_periphery_membership_6aug21.csv data/uk/uk_parent_subsidiary_mapping_2020_2021_sec_filers_exhibit21.csv | ||
Rscript code/alphaicon_paper/5_algorithm_evaluation.r | ||
|
||
# Perform the evaluation of algorithms at different path length | ||
output/alphaicon_paper/uk_orgs_algorithm_evaluation_recall_by_pathlength.csv: output/uk/uk_organisations_participants_2021_long_2aug21.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_dpi_10000iter.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_npi_10000iter.csv $(transitiveshares) output/uk/uk_organisations_participation_graph_core_periphery_membership_6aug21.csv data/uk/uk_parent_subsidiary_mapping_2020_2021_sec_filers_exhibit21.csv | ||
Rscript code/alphaicon_paper/5_algorithm_evaluation.r | ||
|
||
# Create the ranking of top-100 holders by each method | ||
output/alphaicon_paper/uk_organisations_top100_holders_2021_long_2aug21.csv: output/uk/uk_organisations_participants_2021_long_2aug21.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_dpi_10000iter.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_npi_10000iter.csv $(transitiveshares) output/uk/uk_organisations_participation_graph_core_periphery_membership_6aug21.csv data/uk/uk_parent_subsidiary_mapping_2020_2021_sec_filers_exhibit21.csv | ||
Rscript code/alphaicon_paper/6_rank_top_holders.r | ||
|
||
# Compute Kendall's tau-b rank correlation of per-company participants for different methods | ||
output/alphaicon_paper/kendall_taus_participant_ranks_dpi_npi_transitive_uk_organisations_participants_2021_7sep21.csv: output/uk/uk_organisations_participants_2021_long_2aug21.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_dpi_10000iter.csv output/uk/npi_dpi/10000iter/uk_organisations_participants_2021_long_7sep21_npi_10000iter.csv $(transitiveshares) output/uk/uk_organisations_participation_graph_core_periphery_membership_6aug21.csv data/uk/uk_parent_subsidiary_mapping_2020_2021_sec_filers_exhibit21.csv | ||
Rscript code/alphaicon_paper/6_rank_top_holders.r | ||
|
||
# α-ICON paper | ||
alphaicon_paper: output/alphaicon_paper/uk_organisations_top100_holders_2021_long_2aug21.csv output/alphaicon_paper/uk_orgs_algorithm_evaluation_recall.csv output/alphaicon_paper/uk_orgs_algorithm_evaluation_recall_by_pathlength.csv DEPENDENCIES | ||
Rscript code/helper_functions/install_dependencies.r | ||
Rscript code/helper_functions/compute_power_index.r | ||
Rscript code/alphaicon_paper/3_summary_stat_by_node_type.r | ||
Rscript code/alphaicon_paper/4_illustrate_algorithm.r | ||
Rscript code/alphaicon_paper/5_algorithm_evaluation.r | ||
Rscript code/alphaicon_paper/6_rank_top_holders.r |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
|
||
<table> | ||
<tbody> | ||
<tr> | ||
<td valign="top" width=200><img src="https://user-images.githubusercontent.com/3776887/133301237-145e43f0-d4b3-4ae5-bf15-113efc2ad189.png"></td> | ||
<td valign="top"><h1>α-Indirect Control in Onion-like Networks</h1> | ||
We propose a fast, accurate, and scalable algorithm to detect ultimate controlling entities in global corporate networks. α-ICON uses company-participant links to identify super-holders who exert control in networks with millions of nodes.<br><br> | ||
By exploiting onion-like properties of such networks we iteratively peel off the hanging vertices until a dense core remains. This procedure allows for a dramatic speed-up, uncovers meaningful structures, and handles circular ownership by design.<br><br> | ||
Read our <a href="https://arxiv.org/abs/2109.07181" target="_blank">paper</a> for the applications. As a toy example, consider the corporate network below, where α-ICON designates Mr Philip Mactaggart (in green) as the super-holder exerting control over all other entities, whether held directly or indirectly: | ||
</td> | ||
</tr> | ||
</tbody> | ||
</table> | ||
|
||
<img src="https://user-images.githubusercontent.com/3776887/133299028-152f030a-e1c7-428b-83ef-e5f4e92414bc.png"> | ||
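
The peeling idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that a "hanging" vertex is one with no remaining holders or no remaining holdings, and the function name `peel_to_core` and the toy edge list are hypothetical.

```python
def peel_to_core(edges):
    """Iteratively remove hanging vertices from a holder -> company graph.

    `edges` is an iterable of (holder, company) pairs. A vertex is treated
    as hanging (an assumption made for this sketch) when it has no incoming
    or no outgoing edge among the vertices that are still present.
    Returns (layers, core): the sets peeled off in each round, and the
    vertices that survive every round.
    """
    remaining = set(edges)
    nodes = {n for edge in remaining for n in edge}
    layers = []
    while True:
        has_out = {holder for holder, _ in remaining}
        has_in = {company for _, company in remaining}
        hanging = {n for n in nodes if n not in has_out or n not in has_in}
        if not hanging:
            break
        layers.append(hanging)
        nodes -= hanging
        # Keep only the links whose both ends are still in the graph
        remaining = {(h, c) for h, c in remaining if h in nodes and c in nodes}
    return layers, nodes


# Toy network: A holds B, B and C hold each other (circular ownership), C holds D
layers, core = peel_to_core([("A", "B"), ("B", "C"), ("C", "B"), ("C", "D")])
print(layers, core)
```

In this toy network the mutually owned pair `B` and `C` is what remains after `A` and `D` are peeled off, which illustrates why circular ownership ends up in the core rather than stalling the procedure.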
|
||
## Installation | ||
|
||
To replicate the analysis, clone this repository to your local machine, then install the required versions of the R dependencies listed in `DEPENDENCIES`. `code/helper_functions/install_dependencies.r` automates this step, but you may still need to install the underlying system libraries manually with [Homebrew](https://brew.sh) or `apt-get`, depending on your platform. Finally, declare the environment variable `ALPHAICON_PATH` in bash so that it points to the repository or, better yet, add it to your `.Renviron` with | ||
```console | ||
user:~$ echo 'ALPHAICON_PATH="path_to_cloned_repository"' >> ~/.Renviron | ||
``` | ||
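
Before running the pipeline it can help to confirm that the variable is visible and points to a real directory. The following is a small sketch of such a check in Python; the helper name `resolve_repo_path` is hypothetical and is not part of the repository. Note that a value stored in `.Renviron` is read by R sessions, so this check only sees the variable if it is also exported in the shell.

```python
import os
from pathlib import Path


def resolve_repo_path(env=os.environ):
    """Return the repository path stored in ALPHAICON_PATH.

    Raises RuntimeError when the variable is missing or does not point
    to an existing directory. `env` is any mapping of variable names to
    values and defaults to the process environment.
    """
    raw = env.get("ALPHAICON_PATH")
    if not raw:
        raise RuntimeError("ALPHAICON_PATH is not set")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"ALPHAICON_PATH is not a directory: {path}")
    return path


if __name__ == "__main__":
    try:
        print(resolve_repo_path())
    except RuntimeError as err:
        print(err)
```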
|
||
The repository does not contain any data due to its size (10+ GB unpacked); most files in `data/` and `output/` folders are zero-byte placeholders. We provide a <a href="https://drive.google.com/drive/folders/10Tq-b4BVsG3gmq2JVa026Nilzj8eojNB" target="_blank">public Google Drive folder</a> with the populated `data/` and `output/` directories. You may still need to unzip them manually. | ||
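
Because the placeholders are zero-byte files, one way to see what still needs to be copied from the shared folder is to list the empty files under `data/` and `output/`. The sketch below assumes only the folder names mentioned above; the function name `find_placeholders` is hypothetical.

```python
from pathlib import Path


def find_placeholders(repo_root, subdirs=("data", "output")):
    """List zero-byte files under the given subfolders of the repository.

    Paths are returned relative to `repo_root` and sorted, so the output
    can be compared across runs. Missing subfolders are skipped.
    """
    root = Path(repo_root)
    empty = []
    for sub in subdirs:
        folder = root / sub
        if not folder.is_dir():
            continue
        for path in folder.rglob("*"):
            if path.is_file() and path.stat().st_size == 0:
                empty.append(path.relative_to(root))
    return sorted(empty)


if __name__ == "__main__":
    for placeholder in find_placeholders("."):
        print(placeholder)
```

An empty result after unpacking suggests that every placeholder has been replaced with real data.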
|
||
A self-contained example of α-ICON is also available in <a href="https://colab.research.google.com/drive/1AvO8hJzwj2LoKsyxk5LfSWK7LW1U02Mc" target="_blank">Google Colaboratory</a>. | ||
|
||
## Repository structure | ||
|
||
``` | ||
data/ | ||
├─uk/ # Data on UK companies and participants | ||
| ├ persons-with-significant-control-snapshot-2021-08-02.txt # Source PSC data | ||
| ├ BasicCompanyDataAsOneFile-2021-08-01.csv # Source data on live companies in UK | ||
| ├ sic_2007_code_list.csv # Standard Industrial Classification codes | ||
| ├ psc_snapshot_2021-08-02.rdata # Processed People with Significant Control data | ||
| └ uk_basic_companies_data_2021-08-01.rdata # Processed Basic Company data | ||
| | ||
├─corpwatch_api_tables_csv_14aug21/ # Data from CorpWatch Dump | ||
| ├ company_info.csv # Source companies data from SEC filings | ||
| ├ cik_name_lookup.csv # Company name variants in SEC filings | ||
| └ company_locations.csv # Company locations in SEC filings | ||
| | ||
code/ | ||
├─helper_functions/ | ||
| ├ install_dependencies.r # Installs R dependencies used in the project | ||
| └ compute_power_index.r # Computes Mizuno et al. (2020) DPI and NPI | ||
| | ||
├─data_preparation/ | ||
| └─uk/ | ||
| ├ 1a_process_psc_snapshot.r # Prepare source PSC data | ||
| ├ 1b_process_companies_data.r # Prepare source data on live companies | ||
| ├ 2_psc_snapshot_to_participants_panel.r # PSC data to entity-participant info | ||
| └ 3_prepare_affiliated_entities_evaluation_data.r # Process CorpWatch data | ||
| | ||
├─alphaicon_paper/ | ||
| ├ 1_compute_alphaicon.ipynb # Jupyter Notebook w. α-ICON (also on Google Colab) | ||
| ├ 2_compute_npi_dpi.r # Computation of Direct and Network Power Indices | ||
| ├ 3_summary_stat_by_node_type.r # UK PSC network statistics by core/SH/ST/I | ||
| ├ 4_illustrate_algorithm.r # Visualise selected networks | ||
| ├ 5_algorithm_evaluation.r # Compute recall @ k and l for various algorithms | ||
| └ 6_rank_top_holders.r # Examine the rankings of super-holders & Kendall's tau | ||
| | ||
output/ | ||
├─uk/ | ||
| ├ uk_organisations_participants_2021_long_2aug21.csv # Primary ownership data | ||
| ├ uk_organisations_participation_graph_core_periphery_membership_6aug21.csv | ||
| ├─npi_dpi/ # Mizuno et al. (2020) computation results on UK PSC data | ||
| | └─10000iter/ | ||
| | ├ uk_organisations_participants_2021_long_7sep21_dpi_10000iter.csv # DPI | ||
| | └ uk_organisations_participants_2021_long_7sep21_npi_10000iter.csv # NPI | ||
| | | ||
| ├─transitive/ # Computed α-ICON shares on equity shares or DPI weights | ||
| | ├ uk_organisations_transitive_ownership_alpha*_2021_long_2aug21.csv # α = * | ||
| | └ uk_organisations_transitive_ownership_alpha*_2021_long_7sep21_dpi_....csv | ||
| | | ||
└─alphaicon_paper/ | ||
├ uk_orgs_algorithm_evaluation_recall.csv # Algorithm recall by k | ||
├ uk_orgs_algorithm_evaluation_recall_by_pathlength.csv # Algorithm recall by l | ||
├ uk_organisations_top100_holders_2021_long_2aug21.csv # Top SH in PSC network | ||
├ uk_organisations_top100_holders_diff_npi_dpi_2021_long_2aug21.csv # Top-100 SH | ||
| # with the largest difference betw. total DPI and NPI | ||
├ uk_organisations_top100_holders_diff_transitive_dpi_2021_long_2aug21.csv | ||
| # Top-100 SH with the largest difference betw. total DPI and α-ICON (α=0.999) | ||
├ uk_organisations_top100_holders_diff_transitive_npi_2021_long_2aug21.csv | ||
| # Top-100 SH with the largest difference betw. total NPI and α-ICON (α=0.999) | ||
└ network_examples/ # Visualisations of selected networks | ||
``` | ||
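
The listing mentions recall @ k for `5_algorithm_evaluation.r`. The exact definition used by that script is not shown here; as a hedged illustration, the sketch below uses the common definition, under which a known parent counts as recovered when it appears among the top k holders that an algorithm ranks for the subsidiary. The function name, argument names, and toy data are all hypothetical.

```python
def recall_at_k(ranked_holders, true_parents, k):
    """Share of known parent-subsidiary pairs recovered within the top k.

    ranked_holders: dict mapping a company to its holders, ordered from
        the highest to the lowest score assigned by some algorithm.
    true_parents: dict mapping a company to its known parent (for example,
        taken from an evaluation set).
    """
    if not true_parents:
        return 0.0
    hits = sum(
        1
        for company, parent in true_parents.items()
        if parent in ranked_holders.get(company, [])[:k]
    )
    return hits / len(true_parents)


# Toy data: three subsidiaries with known parents P1, P2, P3
ranked = {"sub1": ["X", "P1"], "sub2": ["P2"], "sub3": ["Y"]}
truth = {"sub1": "P1", "sub2": "P2", "sub3": "P3"}
for k in (1, 2):
    print(k, recall_at_k(ranked, truth, k))
```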
|
||
We provide an annotated `Makefile` that documents the data analysis in our papers. | ||
|
||
To build the ‘<a href="https://arxiv.org/abs/2109.07181" target="_blank">α-Indirect Control in Onion-like Networks</a>’ paper run `make alphaicon_paper` when in the repository folder. | ||
|
||
Please note that this command will not produce any publication-ready output files (e.g. tables or figures): the export statements are commented out in the code. Our intention is to make the analysis pipeline transparent to the readers with the aid of `make`: | ||
|
||
![alphaicon_dependencies](https://user-images.githubusercontent.com/3776887/133301812-87f25078-de5a-4bea-b9b0-0e6addb51b2b.png) | ||
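
The same dependency logic can be explored programmatically. The sketch below copies a few target-prerequisite pairs from the `Makefile` (with directories dropped for brevity) and derives one valid build order using Python's standard `graphlib` module (Python 3.9+). It is an illustration of how `make` orders the steps, not a replacement for it; the `rules` dictionary and `build_order` helper are hypothetical names.

```python
from graphlib import TopologicalSorter

# target -> prerequisites, taken from a subset of the Makefile rules
# (directory prefixes removed to keep the example short)
rules = {
    "psc_snapshot_2021-08-02.rdata": [
        "persons-with-significant-control-snapshot-2021-08-02.txt",
    ],
    "uk_basic_companies_data_2021-08-01.rdata": [
        "BasicCompanyDataAsOneFile-2021-08-01.csv",
        "sic_2007_code_list.csv",
    ],
    "uk_organisations_participants_2021_long_2aug21.csv": [
        "psc_snapshot_2021-08-02.rdata",
        "uk_basic_companies_data_2021-08-01.rdata",
    ],
    "uk_organisations_participation_graph_core_periphery_membership_6aug21.csv": [
        "uk_organisations_participants_2021_long_2aug21.csv",
    ],
}


def build_order(rules):
    """Return the files in an order where prerequisites precede targets."""
    return list(TopologicalSorter(rules).static_order())


if __name__ == "__main__":
    for step, name in enumerate(build_order(rules), start=1):
        print(step, name)
```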
|
||
|
||
## Licence | ||
<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a><br /> | ||
Creative Commons License Attribution 4.0 International (CC BY 4.0). | ||
|
||
Copyright © the respective contributors, as shown by the `AUTHORS` file. | ||
|
||
People with Significant Control data is <a href="http://download.companieshouse.gov.uk/en_pscdata.html">distributed</a> by Companies House under <a href="https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/">Open Government Licence v3.0</a>. | ||
|
||
Free Company Data Product is <a href="http://download.companieshouse.gov.uk/en_output.html">distributed</a> by Companies House under <a href="https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/">Open Government Licence v3.0</a>. | ||
|
||
|
||
## Contacts | ||
Dmitriy Skougarevskiy, Ph.D. | ||
|
||
[email protected] |