This directory contains the source code for a parameter search of the urn-model using a genetic algorithm (GA).
Parameter search by genetic algorithm is carried out in the following steps:
1. Randomly generate a population in which each individual is a vector of the parameters `(rho, nu, recentness, frequency)`.
2. For each individual, run the urn-model with its parameters, calculate the goodness of fit from the results, and record it.
3. Evolve the population based on the goodness of fit:
   a. Select individuals with high goodness of fit and perform crossover.
   b. Perform mutations with a certain probability.
   c. Generate the next generation of the population.
4. Return to step 2. The process ends after a certain number of generations.
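The loop described above can be sketched as follows. This is a minimal illustration and not the code in `main.py`: the function names, the encoding of each parameter as a float in [0, 1), the choice of the fitter half as parents, single-point crossover, and elitism are all assumptions made for this example. `toy_fitness` stands in for the urn-model run.

```python
import random

PARAM_NAMES = ("rho", "nu", "recentness", "frequency")


def random_individual(rng):
    # Hypothetical encoding: every parameter is a float in [0, 1).
    return [rng.random() for _ in PARAM_NAMES]


def toy_fitness(individual):
    # Stand-in for "run the urn-model and score it": closer to 0.5 is better.
    return -sum(abs(gene - 0.5) for gene in individual)


def evolve(fitness, population_size, mutation_rate, cross_rate, generations, seed=0):
    if population_size < 2:
        raise ValueError("population_size must be at least 2")
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [random_individual(rng) for _ in range(population_size)]
    best_score, best = float("-inf"), None
    for _ in range(generations):
        # Step 2: evaluate every individual once and rank by goodness of fit.
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        # Step 3a: the fitter half become parents (assumed selection scheme).
        parents = [ind for _, ind in scored[: max(2, population_size // 2)]]
        next_population = [list(best)]  # elitism: carry the best individual over
        while len(next_population) < population_size:
            a, b = rng.sample(parents, 2)
            child = list(a)
            if rng.random() < cross_rate:
                point = rng.randrange(1, len(a))  # single-point crossover
                child = a[:point] + b[point:]
            # Step 3b: each gene is resampled with probability mutation_rate.
            child = [rng.random() if rng.random() < mutation_rate else gene
                     for gene in child]
            next_population.append(child)
        # Step 3c: the children form the next generation (step 4: loop again).
        population = next_population
    # Best individual among all evaluated generations, with its score.
    return best, best_score
```

For example, `evolve(toy_fitness, 10, 0.1, 0.8, 30)` returns a four-element parameter vector together with its goodness of fit.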
The goodness of fit required in step 2 is calculated for each individual as follows:
1. Run the urn-model with the parameters (`rho`, `nu`, `recentness`, `frequency`).
2. Record the "history of interactions" generated by the run.
3. Compute 10 metrics that characterise the network using the history recorded in step 2.
4. Calculate the goodness of fit as the difference between the metrics of the target network and those calculated in step 3, multiplied by -1.
   a. If $m_i$ is a metric of the target network and $m_i'$ is the corresponding metric calculated in step 3, the difference $d$ is $d = \sum_{i=1}^{10} |m_i - m_i'|$.
   b. The goodness of fit $f$ is $f = -d$.
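As a small worked example of the formula $f = -\sum_{i=1}^{10} |m_i - m_i'|$, the sketch below computes the goodness of fit from two lists of metrics. The function name and the representation of metrics as plain lists of floats are assumptions for illustration; the metric values are made up.

```python
def goodness_of_fit(target_metrics, simulated_metrics):
    """Return f = -d, where d is the sum of absolute metric differences."""
    if len(target_metrics) != len(simulated_metrics):
        raise ValueError("both metric lists must have the same length")
    d = sum(abs(m - m_sim) for m, m_sim in zip(target_metrics, simulated_metrics))
    return -d


# Ten made-up metrics: each simulated value is off by 0.5, so d = 5.0 and f = -5.0.
target = [0.0] * 10
simulated = [0.5] * 10
print(goodness_of_fit(target, simulated))
```

A perfect match gives a difference of 0, which is the maximum possible goodness of fit; larger differences give more negative values.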
Move to this directory (`/ga`) and run `main.py`.
$ cd ga
$ pwd # => /path/to/ga
$ python main.py <population_size> <mutation_rate> <cross_rate> <target_dataset>
You can check the details of each argument with `python main.py -h` or `python main.py --help`.
When you run it, the results are saved in `./results/<target_data>`. More precisely, a directory is created under `./results/<target_data>`, and the archive data for each generation and the final results (the data of the individual with the highest goodness of fit and the metrics of the network generated by that individual) are saved in it.
To perform a grid search for a suitable population size, mutation rate, and crossover rate for each target dataset, run `grid_search.py`.
$ pwd # => /path/to/ga
$ python grid_search.py <target_data> [rho] [nu] [s]
The results are saved in JSON format under `results/grid_search/`. Note that running `grid_search.py` for a single target dataset takes about one full day on a MacBook Pro (M1 Pro, 32GB).
To find the best population size, mutation rate, and crossover rate from the grid search results, run `search_best.py`.
$ pwd # => /path/to/ga
$ python search_best.py <target_data> [rho] [nu] [s]
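The idea behind the two scripts, trying every combination of hyperparameters and then picking the best-scoring one, can be sketched as below. This is only an illustration and not the code in `grid_search.py` or `search_best.py`: the function names, the record keys, the candidate values, and the `run_ga` callback are all hypothetical, and the actual JSON layout of the saved results is not shown here.

```python
import itertools


def grid_search(run_ga, population_sizes, mutation_rates, cross_rates):
    """Run run_ga for every combination and collect one record per run."""
    results = []
    for population_size, mutation_rate, cross_rate in itertools.product(
            population_sizes, mutation_rates, cross_rates):
        # run_ga is a hypothetical callback returning the best goodness of fit.
        score = run_ga(population_size, mutation_rate, cross_rate)
        results.append({
            "population_size": population_size,
            "mutation_rate": mutation_rate,
            "cross_rate": cross_rate,
            "fitness": score,
        })
    return results


def best_setting(results):
    """Return the record with the highest goodness of fit."""
    return max(results, key=lambda record: record["fitness"])


def dummy_run_ga(population_size, mutation_rate, cross_rate):
    # Made-up scoring function used only to exercise the sketch.
    return -(abs(population_size - 20) + abs(mutation_rate - 0.1) + abs(cross_rate - 0.8))


records = grid_search(dummy_run_ga, [10, 20], [0.1, 0.2], [0.6, 0.8])
print(best_setting(records))
```

Because the number of runs is the product of the three candidate lists, the cost grows quickly with the size of the grid, which is consistent with the long runtime noted above.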