It is not possible to use the multiprocessing module from within a data function.
from spotfire import data_function as df

outs = [df.AnalyticOutput("out", "/tmp/out.sbdf")]
spec = df.AnalyticSpec("script", [], outs, """
import pandas as pd
import multiprocessing as mp
import time

def _task(n):
    time.sleep(0.5)
    return f'task {n}'

if __name__ == '__main__':
    with mp.Pool(2) as pool:
        args = [(i,) for i in range(20)]
        results = [pool.apply_async(_task, arg) for arg in args]
        out = pd.DataFrame({'x': [x.get() for x in results]})
""")
result = spec.evaluate()
print(result.summary)
The above code is expected to result in a data frame with one column (x) with twenty rows (from "task 0" through to "task 19"), but is instead failing on both Windows and Linux hosts with an error from pickle:
Error executing Python script:
_pickle.PicklingError: Can't pickle <function _task at 0x0000024C4DAD7380>: attribute lookup _task on __main__ failed
Traceback (most recent call last):
  File "data_function.py", line 417, in _execute_script
    exec(compiled_script, self.globals)
  File "<data_function>", line 14, in <module>
  File "pool.py", line 774, in get
    raise self._value
  File "pool.py", line 540, in _handle_tasks
    put(task)
  File "connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
Pickle is failing because the data function is executed via compile and exec (in the AnalyticSpec class), so pickle is not able to find the definition of the function it needs to marshal to the multiprocessing pool interpreters. (An interesting discussion of the underlying issue can be found at https://stackoverflow.com/questions/31191947/pickle-and-exec-in-python, but the accepted solution appears to be based on the imp module, which has been deprecated since Python 3.4.)
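The lookup failure can be reproduced without spotfire. The following is a minimal sketch, not the package's actual code path: it assumes the script is run with compile and exec into a globals dict whose `__name__` is `"__main__"` (which is what the error message and the `if __name__ == '__main__'` block suggest). For contrast, it also execs the same script into a module object registered in `sys.modules`; the module name `df_tasks` is hypothetical and chosen only for illustration.

```python
import pickle
import sys
import types

# Stand-in for the script text passed to AnalyticSpec (no spotfire needed).
SCRIPT = """
def _task(n):
    return f'task {n}'
"""
code = compile(SCRIPT, "<data_function>", "exec")

# Case 1: exec into a plain dict that claims to be __main__ (assumption).
# pickle stores functions by reference (module name + qualified name), so it
# looks for `_task` on the real __main__ module, where it was never defined.
script_globals = {"__name__": "__main__"}
exec(code, script_globals)
task = script_globals["_task"]

try:
    pickle.dumps(task)
    failed = False
except pickle.PicklingError as exc:
    failed = True
    print(f"pickling failed: {exc}")

# Case 2: exec into a module object registered in sys.modules.
# "df_tasks" is a hypothetical name; the lookup by module name now succeeds.
module = types.ModuleType("df_tasks")
exec(code, module.__dict__)
sys.modules[module.__name__] = module

restored = pickle.loads(pickle.dumps(module._task))
print(restored(3))  # task 3
```

The second case only shows that pickling round-trips within a single process. Pool workers started with the spawn method (the default on Windows) are fresh interpreters that would not have `df_tasks` registered, so this is not a verified workaround for data functions.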