Setting Up a Dask Cluster from the Command Line

Tags: #dask##python##distributed programming# Date: 2020/05/06 11:41:09 Author: 小木

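Starting a Dask cluster is straightforward, and there are several ways to do it. The simplest is to use the dask-scheduler and dask-worker command-line tools that ship with Dask. This article describes how to set up a Dask cluster from the command line.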

[TOC]

I. Overview

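The official documentation describes this approach as follows:

This is the most fundamental way to deploy Dask on multiple machines. In production environments, this process is often automated by some other resource manager, so it is rare for people to need to follow these instructions explicitly. Instead, these instructions are useful for IT professionals who may want to set up automated services to deploy Dask within their institution.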

The main steps are:
1. Start dask-scheduler
2. Register the workers

Suppose we have two hosts: one acts as the scheduler and also runs a worker, and the other runs only a worker. The procedure is as follows:

II. Starting the scheduler

Simply run dask-scheduler. Once Dask is installed, the dask-scheduler command is available on the system. On the server that will act as the scheduler, run the following command:

dask-scheduler

Output like the following indicates that the scheduler started successfully:

$ dask-scheduler
Start scheduler at 127.0.0.1:8786
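
Before registering workers, it can help to confirm that the scheduler port is actually reachable from the machine that will run the worker. The following is a minimal sketch using only the Python standard library; the helper name `scheduler_is_reachable` is hypothetical (it is not part of Dask), and the default port 8786 is taken from the output above. It only checks that something accepts TCP connections on that port, not that it is a Dask scheduler.

```python
import socket


def scheduler_is_reachable(host: str, port: int = 8786, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration: it does not speak the Dask
    protocol, it only tests whether the port accepts connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:8786 is the address shown in the scheduler output above.
    print(scheduler_is_reachable("127.0.0.1", 8786))
```

On the second host, the same check would be run against the scheduler machine's network address rather than 127.0.0.1.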

III. Registering workers

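As before, simply run dask-worker and pass it the scheduler's address; the worker then starts with the default configuration. Note that 127.0.0.1 is only reachable from the scheduler's own host, so the worker on the second host needs the scheduler machine's network-reachable IP address or hostname instead.

On success, output like the following appears: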

$ dask-worker 127.0.0.1:8786
Start worker at:               127.0.0.1:1234
Registered with scheduler at:  127.0.0.1:8786
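
Once a worker has registered, one way to confirm it from Python is to connect a client to the scheduler and list the workers it knows about. The sketch below assumes the `distributed` package is installed and a scheduler is running at the address used in this article; `tcp_address` and `threads_per_worker` are hypothetical helper names, while `Client` and `Client.nthreads()` come from `dask.distributed`.

```python
def tcp_address(host: str, port: int = 8786) -> str:
    """Format a scheduler address in the tcp://host:port form."""
    return f"tcp://{host}:{port}"


def threads_per_worker(address: str, timeout: float = 10.0) -> dict:
    """Connect to a running scheduler and return {worker address: thread count}.

    Assumes a scheduler is already listening at `address`.
    """
    # Imported lazily so tcp_address() works even where Dask is not installed.
    from dask.distributed import Client

    with Client(address, timeout=timeout) as client:
        return dict(client.nthreads())


if __name__ == "__main__":
    address = tcp_address("127.0.0.1")
    print(address)
    # Uncomment once a scheduler and at least one worker are running:
    # print(threads_per_worker(address))
```

An empty dictionary from `threads_per_worker` would suggest that the scheduler is up but no worker has registered yet.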

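Workers have many configuration options. The most commonly used are the number of threads per worker process (--nthreads) and the number of worker processes (--nprocs), as shown below. In newer releases of distributed, --nprocs is deprecated in favor of --nworkers, so check dask-worker --help for the version you have installed.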

$ dask-worker 127.0.0.1:8786 --nthreads 1 --nprocs 2
distributed.nanny - INFO -         Start Nanny at: 'tcp://127.0.0.1:37543'
distributed.nanny - INFO -         Start Nanny at: 'tcp://127.0.0.1:35610'
distributed.dashboard.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
distributed.dashboard.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
distributed.worker - INFO -       Start worker at: tcp://127.0.0.1:36678
distributed.worker - INFO -          Listening to: tcp://127.0.0.1:36678
distributed.worker - INFO -          dashboard at:       127.0.0.1:42342
distributed.worker - INFO -       Start worker at: tcp://127.0.0.1:46164
distributed.worker - INFO - Waiting to connect to:    tcp://127.0.0.1:8786
distributed.worker - INFO -          Listening to: tcp://127.0.0.1:46164
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -          dashboard at:       127.0.0.1:42046
distributed.worker - INFO -               Threads:                          1
distributed.worker - INFO - Waiting to connect to:    tcp://127.0.0.1:8786
distributed.worker - INFO -                Memory:                    4.19 GB
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -       Local Directory: /home/lodap/data/dask-worker-space/worker-a01ixxg6
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -               Threads:                          1
distributed.worker - INFO -                Memory:                    4.19 GB
distributed.worker - INFO -       Local Directory: /home/lodap/data/dask-worker-space/worker-ly7gp0_h
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -         Registered to:    tcp://127.0.0.1:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
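
The log above shows two Nanny processes and two workers with one thread each, which is what `--nthreads 1 --nprocs 2` asks for: the total number of threads is the product of the two values. The sketch below illustrates that arithmetic and builds the same command line from Python; `dask_worker_command` and `total_threads` are hypothetical helpers written for this article, and they use the `--nprocs` flag exactly as it appears in the command above.

```python
import shlex
from typing import List


def dask_worker_command(scheduler: str, nthreads: int, nprocs: int) -> List[str]:
    """Build the dask-worker argument list used in this article."""
    if nthreads < 1 or nprocs < 1:
        raise ValueError("nthreads and nprocs must both be at least 1")
    return [
        "dask-worker", scheduler,
        "--nthreads", str(nthreads),
        "--nprocs", str(nprocs),
    ]


def total_threads(nthreads: int, nprocs: int) -> int:
    """Each of the nprocs worker processes runs nthreads threads."""
    return nthreads * nprocs


if __name__ == "__main__":
    cmd = dask_worker_command("127.0.0.1:8786", nthreads=1, nprocs=2)
    print(shlex.join(cmd))      # dask-worker 127.0.0.1:8786 --nthreads 1 --nprocs 2
    print(total_threads(1, 2))  # 2
```

The argument list could be passed to `subprocess.Popen` on a machine where dask-worker is installed; that step is left out here so the sketch has no side effects.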