Dask program fails with "Task exception was never retrieved"
Starting a local dask cluster failed with the following error:
Task exception was never retrieved
future: <Task finished coro=<_wrap_awaitable() done, defined at D:\Program Files\Python3.7\lib\asyncio\tasks.py:592> exception=RuntimeError('\n An attempt has been made to start a new process before the\n current process has finished its bootstrapping phase.\n\n This probably means that you are not using fork to start your\n child processes and you have forgotten to use the proper idiom\n in the main module:\n\n if __name__ == \'__main__\':\n freeze_support()\n ...\n\n The "freeze_support()" line can be omitted if the program\n is not going to be frozen to produce an executable.')>
Traceback (most recent call last):
File "D:\Program Files\Python3.7\lib\asyncio\tasks.py", line 599, in _wrap_awaitable
return (yield from awaitable.__await__())
File "D:\Program Files\Python3.7\lib\site-packages\distributed\nanny.py", line 291, in start
response = await self.instantiate()
File "D:\Program Files\Python3.7\lib\site-packages\distributed\nanny.py", line 374, in instantiate
result = await self.process.start()
File "D:\Program Files\Python3.7\lib\site-packages\distributed\nanny.py", line 567, in start
await self.process.start()
File "D:\Program Files\Python3.7\lib\site-packages\distributed\process.py", line 34, in _call_and_set_future
res = func(*args, **kwargs)
File "D:\Program Files\Python3.7\lib\site-packages\distributed\process.py", line 202, in _start
process.start()
File "D:\Program Files\Python3.7\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "D:\Program Files\Python3.7\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "D:\Program Files\Python3.7\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "D:\Program Files\Python3.7\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "D:\Program Files\Python3.7\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "D:\Program Files\Python3.7\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The top-level message is "Task exception was never retrieved".
The code that triggers it is as follows:
import dask.dataframe as dd
from dask.distributed import Client, LocalCluster
import pandas as pd
cluster = LocalCluster(dashboard_address=None)
client = Client(cluster)
df = pd.DataFrame([[1, 1, 1], [2, 2, 2]])
df = dd.from_pandas(df, npartitions=1)
print(df.head())
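This problem is discussed on GitHub in the modin project's issue tracker ( https://github.com/modin-project/modin/issues/843 ), and judging from that thread it may be related to modin or ray. The RuntimeError text itself describes the mechanism, though: on Windows, multiprocessing starts child processes with spawn rather than fork, so every worker process re-imports the main script. When LocalCluster is created at module level, that re-import tries to start the cluster again before the child process has finished bootstrapping. The fix is simple: move the cluster setup under an if __name__ == '__main__': guard: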
import dask.dataframe as dd
from dask.distributed import Client, LocalCluster
import pandas as pd

if __name__ == '__main__':
    # Only the original process runs this block; spawned workers that
    # re-import this file skip it.
    cluster = LocalCluster(dashboard_address=None)
    client = Client(cluster)
    df = pd.DataFrame([[1, 1, 1], [2, 2, 2]])
    df = dd.from_pandas(df, npartitions=1)
    print(df.head())
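To see whether a given machine is affected, it can help to check which start method multiprocessing uses there. The snippet below is a minimal sketch that uses only the standard library; the helper name needs_main_guard is made up for illustration and is not part of dask or multiprocessing. It assumes that any start method other than "fork" re-imports the main script in child processes, which is the situation the RuntimeError above describes.

```python
import multiprocessing


def needs_main_guard() -> bool:
    """Hypothetical helper: report whether child processes re-import the main script.

    With any start method other than "fork", a child process imports the
    main module again, so process-starting code at module level would run
    a second time inside the child.
    """
    method = multiprocessing.get_start_method()
    return method != "fork"


if __name__ == "__main__":
    # Windows always reports "spawn"; other platforms vary by OS and Python version.
    print("start method:", multiprocessing.get_start_method())
    print("main guard required:", needs_main_guard())
```

Even where the start method is "fork", keeping the guard is harmless and makes the script portable to Windows.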
