Fix tests - streamline testing + multiprocessing backup #179
base: master
I think there is a bug in the multiprocessing code (and I don't have time to debug it). Please just wrap the original code in the try/except rather than the new one with func_settings and func, especially as this will eventually go away with #182 anyway.
Otherwise we're back at 3-hour tests.
pynsee/geodata/_get_geodata.py
Outdated
length = len(list_bbox)
irange = range(length)

data_all = pd.concat(list_data).reset_index(drop=True)
func_settings = _set_global_var
func = _get_data_with_bbox2

try:
    with multiprocessing.Pool(
        initializer=func_settings, initargs=(args,), processes=Nprocesses
    ) as pool:
        list_output = list(
            tqdm.tqdm(
                pool.imap(func, irange),
                total=length
            )
        )
except Exception:
    func_settings(args)
    list_output = []

    for p in tqdm.trange(length):
        list_output.append(func(p))

    msg = """
    Multiprocessing failed in the geodata collection,
    a traditional loop was used instead
    """
    logger.warning(msg)

data_all = pd.concat(list_output).reset_index(drop=True)
Do not include these changes; there is a bug somewhere that leads to the 4-hour tests.
# df = _build_series_list()
# test = isinstance(df, pd.DataFrame)
# os.environ['pynsee_use_sdmx'] = "False"
# self.assertTrue(test)
uncomment or remove
#python test_pynsee_macrodata.py TestFunction.test_get_column_title_1
remove
@tfardet, shall I merge?
@hadrilec yes. The tests still took 3h, for reasons that I don't understand, but that seems to have already been the case for the last 3 commits on master, and checking the coverage showed that the multiprocessing code was used, not the fallback loop. Could you first merge master into it and fix the conflicts? (Since you are using a branch on the pynsee account, I can't do it myself.)
No description provided.