Building a Hadoop cluster with Docker

1. Create a custom Docker network

docker network create --subnet=172.20.0.0/16 hadoop-cluster

2. Create a custom image (based on CentOS)

        Pull the CentOS image:

 docker pull centos:latest

        Run the container:

docker run -d --privileged --name cluster-master -h cluster-master --net hadoop-cluster --ip 172.20.0.2 centos:latest /usr/sbin/init

To keep the sshd service runnable after the CentOS container starts, the container must be started with the extra command /usr/sbin/init.

Install the OpenSSH service and make ssh accept new host keys automatically

First, enter the container:

docker exec -it 37fbcc24b0e3 /bin/bash
[root@cluster-master /]# yum -y install openssh openssh-server openssh-clients
[root@cluster-master /]# systemctl start sshd

Make ssh accept new host keys automatically

On the master, make ssh logins add entries to known_hosts automatically:
edit the /etc/ssh/ssh_config configuration file with vi
and set StrictHostKeyChecking to no
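A minimal sketch of the corresponding ssh_config fragment (the `Host *` pattern is an assumption; the option can also be set globally):

```
Host *
    StrictHostKeyChecking no
```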

Install the clear command

yum install ncurses

Set the root password (use the same password on every node in the cluster)

yum install -y passwd
passwd root

Copy the Linux JDK tarball into the Docker container (run this on the host, not inside the container)

docker cp jdk-8u231-linux-x64.tar.gz cluster-master:/soft

Install the JDK (inside the container)

tar -zxvf jdk-8u231-linux-x64.tar.gz
mv jdk1.8.0_231 /opt

Set the JDK environment variables (inside the container)

vim /etc/profile
export JAVA_HOME=/opt/jdk1.8.0_231
export PATH=$PATH:$JAVA_HOME/bin

Make the container refresh environment variables automatically: go to the /root directory

and add source /etc/profile to the .bash_profile file
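One idempotent way to sketch this step ($HOME resolves to /root inside the container; the guard against duplicate lines is an addition, not part of the original text):

```shell
# Append "source /etc/profile" to .bash_profile only if it is not there yet
PROFILE="$HOME/.bash_profile"
grep -qxF 'source /etc/profile' "$PROFILE" 2>/dev/null \
    || echo 'source /etc/profile' >> "$PROFILE"
```

Running the snippet twice leaves a single entry, so it is safe to re-run when rebuilding the image.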

Save the CentOS container with the configured JDK as an image

docker export -o centos-jdk.tar cluster-master

3. Import the image saved above

docker import centos-jdk.tar centos-jdk:v3

Start three containers from this image, named cluster-master, cluster-slave01, and cluster-slave02

docker run -d --privileged  --name cluster-master -h cluster-master --net hadoop-cluster --ip 172.20.0.2 centos-jdk:v3 /usr/sbin/init
docker run -d --privileged  --name cluster-slave01 -h cluster-slave01 --net hadoop-cluster --ip 172.20.0.3 centos-jdk:v3 /usr/sbin/init
docker run -d --privileged  --name cluster-slave02 -h cluster-slave02 --net hadoop-cluster --ip 172.20.0.4 centos-jdk:v3 /usr/sbin/init

Enter each of the three containers and set up mutual SSH trust; finally, copy each container's .ssh directory out of the container so it can be mounted later.
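The per-container part of that trust setup might be sketched as follows (run inside each container, where $HOME is /root and matches the .ssh directory the text mounts; distributing the public keys across nodes is still a manual step):

```shell
# Prepare root's .ssh directory with the permissions sshd expects
SSH_DIR="$HOME/.ssh"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
# Generate a key pair once, with an empty passphrase
[ -f "$SSH_DIR/id_rsa" ] || ssh-keygen -t rsa -N '' -q -f "$SSH_DIR/id_rsa"
# Trust this node's own key; append the other nodes' id_rsa.pub the same way
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

After every node's id_rsa.pub has been appended to every node's authorized_keys, passwordless ssh works in all directions.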

hosts file: configure the cluster nodes in /etc/hosts

vim /etc/hosts

172.20.0.2 cluster-master
172.20.0.3 cluster-slave01
172.20.0.4 cluster-slave02

All three containers then mount this single hosts file.
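Generating the shared file on the host can be sketched like this (the output filename `hosts` and the localhost line are assumptions; the three cluster entries come from the text above):

```shell
# Write the hosts file once; all three containers mount this same file
cat > hosts <<'EOF'
127.0.0.1 localhost
172.20.0.2 cluster-master
172.20.0.3 cluster-slave01
172.20.0.4 cluster-slave02
EOF
```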

hostname file: set the hostname value in each container's /etc/hostname file

cluster-master container: cluster-master
cluster-slave01 container: cluster-slave01
cluster-slave02 container: cluster-slave02

Each container mounts its own hostname file.

sshd_config: this file is identical on every node, so all three containers can mount the same copy.

Enable the following two options:

PubkeyAuthentication yes
RSAAuthentication yes

4. Install Hadoop

Mount the Hadoop distribution into each container. Every container must mount its own copy of Hadoop; the three containers must not share a single Hadoop directory.

Then edit /etc/profile and add the Hadoop environment variables

export HADOOP_HOME=/opt/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

Start the cluster with a script

#!/bin/bash

echo "Creating master node"
docker run -d --privileged --name cluster-master -h cluster-master \
        -p 22:22 -p 50070:50070 -p 50075:50075 \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/soft:/soft \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/ssh/cluster-master:/root/.ssh \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-master/hosts:/etc/hosts \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-master/profile:/etc/profile \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/ssh/sshd_config:/etc/ssh/sshd_config \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/opt/cluster-master/hadoop-2.9.2:/opt/hadoop-2.9.2 \
        --net hadoop-cluster --ip 172.20.0.2 centos-jdk:v3 /usr/sbin/init

# Fix ownership of the /var/lib/hadoop-hdfs directory
docker exec cluster-master bash -c "mkdir -p /var/lib/hadoop-hdfs && chown root /var/lib/hadoop-hdfs"

echo "Creating slave01 node"
docker run -d --privileged --name cluster-slave01 -h cluster-slave01 \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/ssh/cluster-slave01:/root/.ssh \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/soft:/soft \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave01/hosts:/etc/hosts \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave01/profile:/etc/profile \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/ssh/sshd_config:/etc/ssh/sshd_config \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/opt/cluster-slave01/hadoop-2.9.2:/opt/hadoop-2.9.2 \
        --net hadoop-cluster --ip 172.20.0.3 centos-jdk:v3 /usr/sbin/init

# Fix ownership of the /var/lib/hadoop-hdfs directory
docker exec cluster-slave01 bash -c "mkdir -p /var/lib/hadoop-hdfs && chown root /var/lib/hadoop-hdfs"


echo "Creating slave02 node"
docker run -d --privileged --name cluster-slave02 -h cluster-slave02 \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/ssh/cluster-slave02:/root/.ssh \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/soft:/soft \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave02/hosts:/etc/hosts \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave02/profile:/etc/profile \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/etc/ssh/sshd_config:/etc/ssh/sshd_config \
        -v /Users/caochuanhong/develop/docker/hadoop/volume/opt/cluster-slave02/hadoop-2.9.2:/opt/hadoop-2.9.2 \
        --net hadoop-cluster --ip 172.20.0.4 centos-jdk:v3 /usr/sbin/init

# Fix ownership of the /var/lib/hadoop-hdfs directory
docker exec cluster-slave02 bash -c "mkdir -p /var/lib/hadoop-hdfs && chown root /var/lib/hadoop-hdfs"

After startup, enter the containers and run start-dfs.sh on the master node.

5. Starting with docker-compose (the underlying image is assumed to have been updated accordingly)

# compose file format version
version: '3'

# compose keyword that defines the services
services:
services:

  cch_hadoop_master: 
    container_name: cch_hadoop_master
    privileged: true
    networks: 
      default: 
        ipv4_address: 172.20.0.2
    image: cch/centos:v3
    ports: 
    - "22:22"
    - "50070:50070"
    - "50075:50075"
    - "19888:19888"
    volumes:
      - /Users/caochuanhong/develop/docker/hadoop/volume/soft:/soft
      - /Users/caochuanhong/develop/docker/hadoop/volume/ssh/cluster-master:/root/.ssh
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-master/hosts:/etc/hosts
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-master/profile:/etc/profile
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-master/hostname:/etc/hostname
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/ssh/sshd_config:/etc/ssh/sshd_config
      - /Users/caochuanhong/develop/docker/hadoop/volume/opt/cluster-master/hadoop-2.9.2:/opt/hadoop-2.9.2
    command: 
      - /bin/sh
      - -c
      - |

        mkdir -p /var/lib/hadoop-hdfs && chown root /var/lib/hadoop-hdfs
        source /etc/profile
        /usr/sbin/init
    stdin_open: true
    tty: true


  cch_hadoop_slave01: 
    container_name: cch_hadoop_slave01
    privileged: true
    networks: 
      default: 
        ipv4_address: 172.20.0.3
    image: cch/centos:v3
    ports: 
      - "122:22"
    volumes:
      - /Users/caochuanhong/develop/docker/hadoop/volume/soft:/soft
      - /Users/caochuanhong/develop/docker/hadoop/volume/ssh/cluster-slave01:/root/.ssh
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave01/hosts:/etc/hosts
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave01/profile:/etc/profile
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave01/hostname:/etc/hostname
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/ssh/sshd_config:/etc/ssh/sshd_config
      - /Users/caochuanhong/develop/docker/hadoop/volume/opt/cluster-slave01/hadoop-2.9.2:/opt/hadoop-2.9.2
    command: 
      - /bin/sh
      - -c
      - |

        mkdir -p /var/lib/hadoop-hdfs && chown root /var/lib/hadoop-hdfs
        source /etc/profile
        /usr/sbin/init
    stdin_open: true
    tty: true


  cch_hadoop_slave02: 
    container_name: cch_hadoop_slave02
    privileged: true
    networks: 
      default: 
        ipv4_address: 172.20.0.4
    image: cch/centos:v3
    ports: 
      - "222:22"
    volumes:
      - /Users/caochuanhong/develop/docker/hadoop/volume/soft:/soft
      - /Users/caochuanhong/develop/docker/hadoop/volume/ssh/cluster-slave02:/root/.ssh
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave02/hosts:/etc/hosts
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave02/profile:/etc/profile
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/cluster-slave02/hostname:/etc/hostname
      - /Users/caochuanhong/develop/docker/hadoop/volume/etc/ssh/sshd_config:/etc/ssh/sshd_config
      - /Users/caochuanhong/develop/docker/hadoop/volume/opt/cluster-slave02/hadoop-2.9.2:/opt/hadoop-2.9.2
    command: 
      - /bin/sh
      - -c
      - |

        mkdir -p /var/lib/hadoop-hdfs && chown root /var/lib/hadoop-hdfs
        source /etc/profile
        /usr/sbin/init
    stdin_open: true
    tty: true


networks:
  default:
    name: hadoop-cluster
      

Reference: https://www.jianshu.com/p/0c7b6de487ce
