something with alexa.. #3

Open
ibarkay opened this issue Jul 20, 2019 · 1 comment


ibarkay commented Jul 20, 2019

Traceback (most recent call last):
  File "dga_detection.py", line 314, in <module>
    load_data()
  File "dga_detection.py", line 93, in load_data
    training_data = alexa.top_list(1000000)
  File "/Users/_______/Documents/PycharmProjects//venv/src/alexa-top-sites/alexa/__init__.py", line 32, in top_list
    return [a.next() for x in xrange(num)]
StopIteration

Thanks :)


helb commented Feb 24, 2020

This happens because the code tries to get a million domains from Alexa:

dga_detection.py:93:

training_data = alexa.top_list(1000000)

alexa/__init__.py:32:

return [a.next() for x in xrange(num)]

…but the top-1m.csv.zip downloaded by alexa/__init__.py now contains only ~576k domains for some reason, so a.next() raises StopIteration once the list runs out:

$ wc -l top-1m.csv
576602 top-1m.csv

A proper fix would be to change alexa/__init__.py to use the actual line count from the file, but if you just want a quick one, change the number in dga_detection.py:93:

training_data = alexa.top_list(576602)

(Count the lines yourself; the number probably changes often.)
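
For the proper fix, one possible approach is to stop asking the iterator for more items than it has, so the hard-coded count no longer matters. The sketch below is only an illustration, written for Python 3: take_up_to is a hypothetical helper (not part of the alexa package), and the short domain list stands in for the iterator that alexa/__init__.py builds from top-1m.csv. It relies on itertools.islice, which simply ends early when the underlying iterator is exhausted instead of raising StopIteration.

```python
from itertools import islice


def take_up_to(iterator, num):
    """Return at most `num` items; stop quietly if the iterator runs out first."""
    # islice ends when either `num` items were taken or the iterator is exhausted.
    return list(islice(iterator, num))


# Hypothetical stand-in for the iterator over domains read from top-1m.csv.
domains = iter(["google.com", "youtube.com", "facebook.com"])

# Asking for a million entries no longer fails when fewer are available.
top = take_up_to(domains, 1000000)
print(len(top))
```

With something along these lines inside top_list, alexa.top_list(1000000) would return however many domains the file actually contains, at the cost of silently returning fewer than requested.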
