How to install a specific hotfix on Windows Server

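Windows container environments have one peculiarity: the OS build number of the host and the container must match, and some scenarios even require the revision number to match. As a result, you often need to install a hotfix with a specific revision on a K8s node. Installing online through PowerShell is slow and the download is hard to control, so the smoothest path is to locate the msu package for the required revision number and install it directly: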

1. Find the KB for the target version on the Windows Update History site. For example, Windows Server 1809, OS Build 10.0.17763.1158:
https://support.microsoft.com/en-us/help/4549949

2. Search for the KB in the Windows Update Catalog: https://www.catalog.update.microsoft.com/
Find the matching download package. For 17763.1158 that is KB4549949: https://www.catalog.update.microsoft.com/Search.aspx?q=KB4549949

3. After downloading the msu package, install it with the wusa command:

wusa windows10.0-kb4549949-x64_90e8805e69944530b8d4d4877c7609b9a9e68d81.msu
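
The matching rule described at the top of this post can be written down as a small check. This is only a sketch of the rule as stated here (build must match, revision sometimes must match); the function names and the strict_revision flag are my own, and the real compatibility rules may have more cases than this.

```python
from typing import NamedTuple


class WindowsVersion(NamedTuple):
    major: int
    minor: int
    build: int
    revision: int


def parse_version(text: str) -> WindowsVersion:
    """Parse a string such as '10.0.17763.1158' into its four numeric parts."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected major.minor.build.revision, got {text!r}")
    return WindowsVersion(*(int(p) for p in parts))


def is_compatible(host: str, container: str, strict_revision: bool = False) -> bool:
    """Return True when the container version may run on the host.

    The build number must always match. The revision is compared only when
    strict_revision is set (a hypothetical switch for the stricter scenarios).
    """
    h, c = parse_version(host), parse_version(container)
    if (h.major, h.minor, h.build) != (c.major, c.minor, c.build):
        return False
    return h.revision == c.revision if strict_revision else True
```

With strict_revision=True, a host on 10.0.17763.1158 rejects an image built for a different 17763 revision, which is exactly the case where a specific hotfix has to be installed on the node.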

Appendix:

To keep the Windows node's version from changing, also turn off Windows Auto Update so that the node OS does not update itself:

a) Check the Auto Update status:

%systemroot%\system32\Cscript %systemroot%\system32\scregedit.wsf /AU /v

b) Disable Windows Auto Update:

Net stop wuauserv 
%systemroot%\system32\Cscript %systemroot%\system32\scregedit.wsf /AU 1 
Net start wuauserv

PS: Use wmic qfe list to see the installed hotfixes.

Reference:
https://docs.microsoft.com/en-us/windows-server/administration/server-core/server-core-servicing

Customize hosts records on Docker and Kubernetes

Docker:

docker run -it --rm --add-host=host1:172.17.0.2 --add-host=host2:192.168.1.3 busybox

Use "--add-host" to add entries to /etc/hosts.

Kubernetes:

apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  hostAliases:
  - ip: "127.0.0.1"
    hostnames:
    - "foo.local"
    - "bar.local"
  - ip: "10.1.2.3"
    hostnames:
    - "foo.remote"
    - "bar.remote"
  containers:
  - name: cat-hosts
    image: busybox
    command:
    - cat
    args:
    - "/etc/hosts"

Use "spec.hostAliases" to configure hosts entries for a Pod or Deployment.

https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
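
Docker takes one hostname:ip pair per --add-host flag, while hostAliases groups hostnames under each IP. A minimal sketch of converting between the two shapes follows; to_host_aliases is a hypothetical helper name of mine, and it assumes the plain hostname:ip form used in the Docker example above.

```python
def to_host_aliases(add_host_args):
    """Group 'hostname:ip' pairs by IP into the spec.hostAliases list shape.

    IPs keep their first-seen order. The split happens at the first colon,
    so an IPv6 address after the hostname stays intact.
    """
    grouped = {}
    for arg in add_host_args:
        hostname, sep, ip = arg.partition(":")
        if not sep or not hostname or not ip:
            raise ValueError(f"expected hostname:ip, got {arg!r}")
        grouped.setdefault(ip, []).append(hostname)
    return [{"ip": ip, "hostnames": names} for ip, names in grouped.items()]
```

The returned list can be dumped as YAML or JSON and placed under spec.hostAliases in a Pod template.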

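Running into the notorious "container runtime is down PLEG is not healthy"

After an unexpected power outage, a small kubernetes cluster in our development environment ran into the PLEG is not healthy problem. Pods in k8s went to Unknown or ContainerCreating, and the k8s nodes went NotReady: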

# kubectl get nodes -o wide
NAME             STATUS     ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
k8s-dev-master   Ready      master    1y        v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://17.3.0
k8s-dev-node1    NotReady   node      1y        v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://Unknown
k8s-dev-node2    NotReady   node      1y        v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://Unknown
k8s-dev-node3    NotReady   node      289d      v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://Unknown
k8s-dev-node4    Ready      node      289d      v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://17.3.0

The kubelet log reports: skipping pod synchronization, container runtime is down PLEG is not healthy:

9月 25 11:05:06 k8s-dev-node1 kubelet[546]: I0925 11:05:06.003645     546 kubelet.go:1794] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 21m18.877402888s ago; threshold is 3m0s]
9月 25 11:05:11 k8s-dev-node1 kubelet[546]: I0925 11:05:11.004116     546 kubelet.go:1794] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 21m23.877803484s ago; threshold is 3m0s]
9月 25 11:05:16 k8s-dev-node1 kubelet[546]: I0925 11:05:16.004382     546 kubelet.go:1794] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 21m28.878169681s ago; threshold is 3m0s]
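
To watch how far PLEG has fallen behind, the two durations can be pulled out of log lines like the ones above. This is a rough sketch that assumes the message keeps the exact wording shown here and that the durations only use h, m and s units; the function names are mine.

```python
import re
from typing import Optional, Tuple

_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(h|m|s)")
_PLEG_MESSAGE = re.compile(
    r"pleg was last seen active (\S+) ago; threshold is ([^\]\s]+)"
)
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0}


def go_duration_seconds(text: str) -> float:
    """Convert a Go-style duration such as '21m18.8s' or '3m0s' to seconds.

    Only h, m and s units are handled; anything else raises ValueError.
    """
    total = 0.0
    pos = 0
    for match in _DURATION_PART.finditer(text):
        if match.start() != pos:
            raise ValueError(f"unsupported duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return total


def pleg_lag(line: str) -> Optional[Tuple[float, float]]:
    """Return (seconds since PLEG was last active, threshold seconds), or None."""
    match = _PLEG_MESSAGE.search(line)
    if match is None:
        return None
    return go_duration_seconds(match.group(1)), go_duration_seconds(match.group(2))
```

Feeding it the output of journalctl for kubelet, line by line, gives a quick view of whether the lag keeps growing after a restart.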

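Restarting docker and kubelet on the node brought it back, but before long it failed again and went NotReady. A quick search turned up related issues on Stack Overflow and in github/kubernetes:

#45419 was only fixed in v1.16, and upgrading from 1.10 to 1.16 is too much work. A comment in #61117 says that clearing the /var/lib/kubelet/pods directory on the node resolves it. On my first attempt the directory could not be deleted because mounted volumes were still holding it, and the problem persisted. In the end I simply upgraded docker from 17.3.0 to 19.3.2 and cleared all data under /var/lib/kubelet/pods/ and /var/lib/docker on every node, after which the problem was resolved.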
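
Since leftover volume mounts were what blocked the first cleanup attempt, it helps to list what is still mounted under the directory before removing it. A small sketch, assuming the text comes from /proc/mounts on the node; mounts_under is a hypothetical helper name and it only lists mount points, it does not unmount anything.

```python
def mounts_under(proc_mounts_text: str, prefix: str) -> list:
    """List mount points at or below prefix, deepest paths first.

    proc_mounts_text is the content of /proc/mounts, where the second
    whitespace-separated field of each line is the mount point.
    """
    prefix = prefix.rstrip("/")
    found = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        mount_point = fields[1]
        if mount_point == prefix or mount_point.startswith(prefix + "/"):
            found.append(mount_point)
    # Deepest first, so nested mounts come before the mounts that contain them.
    return sorted(found, key=lambda path: path.count("/"), reverse=True)
```

Each returned path can then be passed to umount, in order, before running rm -rf on /var/lib/kubelet/pods/.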

Rough procedure:

# First disable autostart for docker and kubelet, reboot, then remove the files:
systemctl disable docker && systemctl disable kubelet
reboot
rm -rf /var/lib/kubelet/pods/
rm -rf /var/lib/docker

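# Along the way, docker-ce was upgraded from 17.3.0 to 19.3.2

# After upgrading docker, edit docker.service to keep overlay, the default storage-driver in 17.3.0. I also tried overlay2, devicemapper and vfs, and kubelet reported errors with each of them; I do not know whether that is a support issue in kubernetes v1.10 or leftover data that was not fully cleaned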
vi /etc/systemd/system/docker.service

ExecStart=/usr/bin/dockerd ... --storage-driver=overlay

# Reload the configuration, then start docker
systemctl daemon-reload
systemctl start docker && systemctl enable docker
systemctl status docker

# Because /var/lib/docker was deleted wholesale, a node that cannot reach the k8s image registry directly needs its base images imported manually:
docker load -i kubernetes-v10.0-node.tar

# Start kubelet
systemctl start kubelet && systemctl enable kubelet
systemctl status kubelet

Problem solved:

# kubectl get nodes -o wide
NAME             STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
k8s-dev-master   Ready     master    1y        v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://17.3.0
k8s-dev-node1    Ready     node      1y        v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://19.3.2
k8s-dev-node2    Ready     node      1y        v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://19.3.2
k8s-dev-node3    Ready     node      289d      v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://19.3.2
k8s-dev-node4    Ready     node      289d      v1.10.0   <none>        CentOS Linux 7 (Core)   3.10.0-957.21.3.el7.x86_64   docker://19.3.2

The power outage also, unfortunately, wiped out three months of configuration data on the kong gateway :( Back up! Back up! Back up!

Getting real client IP in Docker Swarm

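When a Service is deployed to Docker Swarm through Stack Deploy, the Service cannot see the client's IP address by default. A GitHub issue tracks this problem: Unable to retrieve user's IP address in docker swarm mode

The current fix, or workaround, is to switch the port to host mode. Take kong as an example.

The default port publishing mode: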

version: "3.7"
services:
  kong-proxy:
    image: kong:1.0.3-alpine
    deploy:
      mode: global
      labels:
        - "tier=frontend"
      restart_policy:
        condition: any
    ports:
      - "80:8000"
      - "443:8443"
    depends_on:
      - database-postgresql
    environment:
      KONG_ADMIN_LISTEN: 0.0.0.0:8001, 0.0.0.0:8444 ssl
      KONG_DATABASE: postgres
      KONG_PG_DATABASE: kong
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: PaSsW0rd
      KONG_PG_HOST: database-postgresql
      KONG_PG_PORT: "5432"

    volumes:
      - type: "bind"
        source: "/var/log/kong/"
        target: "/usr/local/kong/logs/"
#        read_only: true
    networks:
      - backend
      - frontend
networks:
  frontend:
  backend:

 

With the port changed to host mode:

version: "3.7"
services:
  kong-proxy:
    image: kong:1.0.3-alpine
    deploy:
      mode: global
      labels:
        - "tier=frontend"
      restart_policy:
        condition: any
    ports:
      - target: 8000
        published: 80
        mode: host
      - target: 8443
        published: 443
        mode: host
    depends_on:
      - database-postgresql
    environment:
      KONG_ADMIN_LISTEN: 0.0.0.0:8001, 0.0.0.0:8444 ssl
      KONG_DATABASE: postgres
      KONG_PG_DATABASE: kong
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: PaSsW0rd
      KONG_PG_HOST: database-postgresql
      KONG_PG_PORT: "5432"

    volumes:
      - type: "bind"
        source: "/var/log/kong/"
        target: "/usr/local/kong/logs/"
#        read_only: true
    networks:
      - backend
      - frontend
networks:
  frontend:
  backend:
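
The only difference between the two stack files is the ports section, so the rewrite can be done mechanically. A minimal sketch, assuming the short entries have the plain PUBLISHED:TARGET form used above (optionally followed by /tcp or /udp); to_host_mode is my own helper name, and host-IP prefixes or port ranges are not handled.

```python
def to_host_mode(short_port: str) -> dict:
    """Turn a short 'PUBLISHED:TARGET' port entry into long syntax with mode: host."""
    spec, _, protocol = short_port.partition("/")
    published, sep, target = spec.partition(":")
    if not sep:
        raise ValueError(f"expected PUBLISHED:TARGET, got {short_port!r}")
    entry = {"target": int(target), "published": int(published), "mode": "host"}
    if protocol:
        # Keep an explicit protocol suffix such as 'udp' when one was given.
        entry["protocol"] = protocol
    return entry
```

Applied to "80:8000" and "443:8443" this yields the two entries of the host-mode ports list.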

 

Docker for Windows 18.06.0-ce released

18.06.0-ce-win70 (19075)

  • Upgrades
  • New
  • Bug fixes and minor changes
    • AUFS storage driver is deprecated in Docker Desktop and AUFS support will be removed in the next major release. You can continue with AUFS in Docker Desktop 18.06.x, but you will need to reset disk image (in Settings > Reset menu) before updating to the next major update. You can check documentation to save images and backup volumes
    • Fix bug which would cause VM logs to be written to RAM rather than disk in some cases, and the VM to hang.
    • Fix security issue with named pipe connection to docker service.
    • Fix VPNKit memory leak. Fixes docker/for-win#2087, moby/vpnkit#371
    • Fix restart issue when using Windows fast startup on latest 1709 Windows updates. Fixes docker/for-win#1741, docker/for-win#1741
    • DNS name host.docker.internal can be used for host resolution from Windows Containers. Fixes docker/for-win#1976
    • Fix broken link in diagnostics window.
    • Added log rotation for docker-ce logs inside the virtual machine.
    • Changed smb permission to avoid issue when trying to manipulate files with different users in containers. Fixes docker/for-win#2170

License for OS (Windows) inside Docker [repost]

How does licensing work?

For production, licensing is at the host level, i.e. each machine or VM which is running Docker. Your Windows licence on the host allows you to run any number of Windows Docker containers on that host. With Windows Server 2016 you get the commercially supported version of Docker included in the licence costs, with support from Microsoft and Docker, Inc.

For development, Docker for Windows runs on Windows 10 and is free, open-source software. Docker for Windows can also run a Linux VM on your machine, so you can use both Linux and Windows containers in development. Like the server version, your Windows 10 licence allows you to run any number of Windows Docker containers.

Windows admins will want a unified platform for managing images and containers. That’s Docker Datacenter which is separately licensed, and will be available for Windows soon.

 

https://blog.docker.com/2017/01/docker-windows-server-image2docker/#h.x2hzndd3qwow

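Installing and configuring Docker on CentOS 7

1. Install docker from the binary package

On CentOS the required components are mostly in place, so you can download the docker release package and start it directly. Download links are listed on the following pages: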

https://docs.docker.com/install/linux/docker-ce/binaries/
https://download.docker.com/linux/static/stable/x86_64/

Download:

# curl -#O https://download.docker.com/linux/static/stable/`uname -m`/docker-17.12.1-ce.tgz

Extract and copy to /usr/bin/:

# tar xzvf docker-17.12.1-ce.tgz
# cp docker/* /usr/bin/

Other machines do not need to download it again; sftp to the first machine and copy it over:

# sftp root@192.168.1.11:/root/download/
sftp> get docker/*
sftp> exit

 

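Run dockerd directly

Test whether the docker daemon starts successfully:

Next, configure dockerd as a system service that starts automatically.

Following the official documentation at https://docs.docker.com/config/daemon/systemd/#manually-create-the-systemd-unit-files and https://github.com/moby/moby/tree/master/contrib/init/systemd, download docker.service and docker.socket into the /etc/systemd/system/ directory: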

# curl -o /etc/systemd/system/docker.service https://raw.githubusercontent.com/moby/moby/master/contrib/init/systemd/docker.service
# curl -o /etc/systemd/system/docker.socket https://raw.githubusercontent.com/moby/moby/master/contrib/init/systemd/docker.socket

# systemctl daemon-reload
# systemctl enable docker


Then start the docker service with # systemctl start docker. If you hit the following error during startup:

- Unit docker.socket has begun starting up.
3月 22 00:47:07 centos02 systemd[1148]: Failed to chown socket at step GROUP: No such process
3月 22 00:47:07 centos02 systemd[1]: docker.socket control process exited, code=exited status=216
3月 22 00:47:07 centos02 systemd[1]: Failed to listen on Docker Socket for the API.
-- Subject: Unit docker.socket has failed

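Check whether the group configured as SocketGroup in /etc/systemd/system/docker.socket exists. If it does not, add it with # groupadd and then start the docker service. The docker.socket downloaded from GitHub sets SocketGroup to docker, so that group has to be added first: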

# groupadd docker
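
The same check can be scripted before starting the service. This is a sketch under the assumption that the unit file is a simple key=value file with a [Socket] section, as the downloaded docker.socket is; socket_group and group_exists are hypothetical helper names, and grp is only available on Unix.

```python
import grp


def socket_group(unit_text: str):
    """Return the SocketGroup= value from the [Socket] section, or None."""
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if section == "Socket" and sep and key.strip() == "SocketGroup":
            return value.strip()
    return None


def group_exists(name: str) -> bool:
    """Report whether a group with this name is known to the system."""
    try:
        grp.getgrnam(name)
    except KeyError:
        return False
    return True
```

If group_exists(socket_group(text)) is False for the content of docker.socket, run groupadd for that group before systemctl start docker.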

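Then start the docker service again; this time it starts successfully:

Once the docker service is running, query the client and server version information with # docker version:

For other custom docker daemon startup parameters and environment variables, see the official documentation: https://docs.docker.com/config/daemon/systemd/. They are configured through a systemd drop-in and /etc/docker/daemon.json.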

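2. Install docker from a yum repo

Installing by manually downloading the binary package is somewhat tedious; installing through yum is far more automated and simple:

a) Add the yum repo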

# tee /etc/yum.repos.d/docker.repo <<-'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/$releasever/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF

b) Install docker

# yum install docker-engine

c) Start the docker service and enable it at boot

# systemctl start docker
# systemctl enable docker

3. The bridge-nf-call-iptables problem

Run docker info and check whether it prints WARNING messages saying bridge-nf-call-iptables is disabled and bridge-nf-call-ip6tables is disabled:

# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 17.12.1-ce
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9b55aab90508bd389d7654c4baf173a981477d55
runc version: 9f9c96235cc97674e935002fc3d78361b696a69e
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-862.2.3.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 991.7MiB
Name: centos01
ID: KL2R:7F52:M5SV:T3U7:GL3Y:UU6F:KGE2:DM3Y:STSY:MLEZ:XXEL:EWG3
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Fix this by adding the following settings:

# tee -a /etc/sysctl.conf <<-'EOF'
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
EOF
# sysctl -p
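
Because tee -a appends every time it runs, it is worth checking first which of the two settings are actually missing. A minimal sketch that works on the text of /etc/sysctl.conf; missing_settings and the REQUIRED table are my own names, and only simple key=value lines are understood.

```python
REQUIRED = {
    "net.bridge.bridge-nf-call-iptables": "1",
    "net.bridge.bridge-nf-call-ip6tables": "1",
}


def missing_settings(sysctl_conf_text: str, required=REQUIRED) -> dict:
    """Return the required settings that the given sysctl.conf text does not set.

    When a key appears more than once, the last value in the file is used.
    """
    seen = {}
    for raw in sysctl_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            seen[key.strip()] = value.strip()
    return {key: value for key, value in required.items() if seen.get(key) != value}
```

Only the entries returned by missing_settings need to be appended, which keeps the file free of duplicate lines on repeated runs.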

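For details, see: On the design issues of bridge-nf-call-iptables

4. Configure a proxy for the docker daemon

Sometimes the docker environment runs behind a proxy or firewall. For the docker daemon to pull images from the internet, it has to be given a proxy. There are two ways to configure it:

a) Through a service drop-in file

For example, with my proxy at http://192.168.1.3:1080/: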

# mkdir -p /etc/systemd/system/docker.service.d/
# tee /etc/systemd/system/docker.service.d/http-proxy.conf <<-'EOF'
[Service]
Environment="HTTP_PROXY=http://192.168.1.3:1080/" "HTTPS_PROXY=http://192.168.1.3:1080/" "NO_PROXY=192.168.1.1,192.168.1.3,192.168.1.11,192.168.1.12,192.168.1.13,192.168.1.14,192.168.1.99,127.0.0.1,localhost"
EOF
# systemctl daemon-reload
# systemctl restart docker

b) Edit /etc/systemd/system/docker.service and add Environment to the [Service] section:

[Service]
Environment="HTTP_PROXY=http://192.168.1.3:1080/" "HTTPS_PROXY=http://192.168.1.3:1080/" "NO_PROXY=192.168.1.1,192.168.1.3,192.168.1.11,192.168.1.12,192.168.1.13,192.168.1.14,192.168.1.99,127.0.0.1,localhost"

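If the proxy server requires authentication, the format is http://username:password@192.168.1.3:1080/. If the username or password contains special characters, they must be percent-encoded; for example, # becomes %23.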
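
The encoding can be done with the standard library rather than by hand. The sketch below uses urllib.parse.quote; proxy_url and escape_for_systemd are hypothetical helper names. The second helper reflects my understanding that systemd treats % as a specifier character in unit files, so a literal % in an Environment= line has to be doubled; treat that part as an assumption and check it against the Docker systemd documentation.

```python
from urllib.parse import quote


def proxy_url(host: str, port: int, username: str, password: str,
              scheme: str = "http") -> str:
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including / @ : #.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}/"


def escape_for_systemd(value: str) -> str:
    """Double each % so systemd does not read it as a specifier (assumption)."""
    return value.replace("%", "%%")
```

For a password such as p#ss@word, proxy_url produces the %23 and %40 escapes, and escape_for_systemd gives the form to paste into the drop-in file.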

c) Verify

# systemctl show --property Environment docker
Environment=HTTP_PROXY=http://192.168.1.3:1080/ HTTPS_PROXY=http://192.168.1.3:1080/ NO_PROXY=192.168.1.1,192.168.1.3,192.168.1.11,192.168.1.12,192.168.1.13,192.168.1.14,192.168.1.99,127.0.0.1,localhost
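
For a scripted check, the single Environment= line can be split into a dictionary. A small sketch, assuming the output has the one-line shape shown above; parse_environment is a name I made up.

```python
import shlex


def parse_environment(show_output: str) -> dict:
    """Parse the 'Environment=...' line printed by systemctl show into a dict."""
    prefix = "Environment="
    line = show_output.strip()
    if not line.startswith(prefix):
        raise ValueError("expected a line starting with 'Environment='")
    env = {}
    # shlex.split keeps quoted values together if systemd prints any.
    for item in shlex.split(line[len(prefix):]):
        name, sep, value = item.partition("=")
        if sep:
            env[name] = value
    return env
```

This makes it easy to assert, for instance, that HTTP_PROXY is set and that a given registry address appears in NO_PROXY.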

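If your proxy server uses HTTPS with its own certificate, it takes a bit more work. You need to:

  1. Install the ca-certificates package
  2. Download the HTTPS certificate in PEM format and save it to the right directory (/etc/pki/ca-trust/source/anchors/ on CentOS, /usr/local/share/ca-certificates/ on Ubuntu)
  3. Run the command that refreshes the trusted certificates (update-ca-trust on CentOS, update-ca-certificates on Ubuntu)

See: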

https://docs.docker.com/engine/reference/commandline/dockerd/#running-a-docker-daemon-behind-an-https_proxy

https://manuals.gfi.com/en/kerio/connect/content/server-configuration/ssl-certificates/adding-trusted-root-certificates-to-the-server-1605.html

 

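5. Other configuration parameters

The docker service has many other parameters that can be set through a drop-in, docker.service, or /etc/docker/daemon.json. For example, a local image registry can be added in several ways:

a) Edit docker.service and append one or more --insecure-registry 192.168.1.3:10000 after dockerd

b) Edit /etc/docker/daemon.json and add the insecure-registries setting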

{
    "insecure-registries": ["192.168.1.3:10000"]
}
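
When daemon.json already holds other settings, the new entry should be merged in rather than overwriting the file. A minimal sketch working on the file's text; add_insecure_registry is a hypothetical helper, and it assumes the file is either empty or valid JSON.

```python
import json


def add_insecure_registry(daemon_json_text: str, registry: str) -> str:
    """Return daemon.json text with registry present in insecure-registries.

    Existing keys are preserved and the registry is not added twice.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    registries = config.setdefault("insecure-registries", [])
    if registry not in registries:
        registries.append(registry)
    return json.dumps(config, indent=4)
```

Write the returned text back to /etc/docker/daemon.json and restart docker for the change to take effect.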

For more configuration parameters, see:

https://docs.docker.com/engine/reference/commandline/dockerd/#daemon

https://docs.docker.com/engine/reference/commandline/dockerd/#daemon-configuration-file

Appendix: installing docker-compose

# curl -L https://github.com/docker/compose/releases/download/1.21.0/docker-compose-$(uname -s)-$(uname -m) -o /usr/bin/docker-compose
# chmod +x /usr/bin/docker-compose
# docker-compose --version
docker-compose version 1.21.0, build 5920eb0

Installing the latest Community edition, 19.3.2

# yum remove docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine

# yum install -y yum-utils \
  device-mapper-persistent-data \
  lvm2

# yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo

# yum list docker-ce --showduplicates | sort -r

# yum install docker-ce docker-ce-cli containerd.io