For Windows Container, you need to set –image-pull-progress-deadline for kubelet

Windows镜像动则几个G, 基于Windows Server Core的镜像5~10G, Windows节点上的kubelet在下载镜像的时候经常会cancel掉:

Failed to pull image "XXX": rpc error: code = Unknown desc = context canceled

 

造成这个问题的原因是因为默认的image pulling progress deadline是1分钟, 如果1分钟内镜像下载没有任何进度更新, 下载动作就会取消, 比较大的镜像就无法成功下载. 见官方文档:

If no pulling progress is made before this deadline, the image pulling will be cancelled. This docker-specific flag only works when container-runtime is set to docker. (default 1m0s)

 

解决方法是为kubelet配置–image-pull-progress-deadline参数, 比如指定为30分钟:

"c:/k/kubelet.exe ... --image-pull-progress-deadline=30m"

 

对于Windows服务, 使用sc指令修改kubelet的binPath:

sc config kubelet binPath= " --image-pull-progress-deadline=30m

然后重启kubelet及依赖服务:

sc stop kubeproxy && sc stop kubelet && sc start kubelet && sc start kubeproxy && sc query kubelet && sc query kubeproxy

Refer to: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

 

Implementing Graceful Shutdown in Windows Container

Kubernetes Linux Pod中,当通过kubectl删除一个Pod或rolling update一个Pod时, 每Terminating的Pod中的每个Container中PID为1的进程会收到SIGTERM信号, 通知进程进行资源回收并准备退出. 如果在Pod spec.terminationGracePeriodSeconds指定的时间周期内进程没有退出, 则Kubernetes接着会发出SIGKILL信号KILL这个进程。

通过 kubectl delete –force –grace-period=0 … 的效果等同于直接发SIGKILL信号.

但SIGTERM和SIGKILL方式在Windows Container中并不工作, 目前Windows Container的表现是接收到Terminating指令5秒后直接终止。。。

参见:https://v1-18.docs.kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#v1-pod

  • V1.Pod.terminationGracePeriodSeconds – this is not fully implemented in Docker on Windows, see: reference. The behavior today is that the ENTRYPOINT process is sent CTRL_SHUTDOWN_EVENT, then Windows waits 5 seconds by default, and finally shuts down all processes using the normal Windows shutdown behavior. The 5 second default is actually in the Windows registry inside the container, so it can be overridden when the container is built.

基于社区的讨论结果及多次尝试, 目前Windows容器中行之有效的Graceful Shutdown方法是:

1. Build docker image时通过修改注册表延长等待时间

...
RUN reg add hklm\system\currentcontrolset\services\cexecsvc /v ProcessShutdownTimeoutSeconds /t REG_DWORD /d 300 && \
    reg add hklm\system\currentcontrolset\control /v WaitToKillServiceTimeout /t REG_SZ /d 300000 /f
...

上面两个注册表位置, 第1个单位为秒, 第2个为毫秒

2. 在应用程序中注册kernel32.dll中的SetConsoleCtrlHandler函数捕获CTRL_SHUTDOWN_EVENT事件, 进行资源回收

以一个.net framework 的Console App为例说明用法:

using System;
using System.Runtime.InteropServices;
using System.Threading;

namespace Q1.Foundation.SocketServer
{
    class Program
    {
        internal delegate bool HandlerRoutine(CtrlType CtrlType);
        private static HandlerRoutine ctrlTypeHandlerRoutine = new HandlerRoutine(ConsoleCtrlHandler);

        private static bool cancelled = false;
        private static bool cleanupCompleted = false;

        internal enum CtrlType
        {
            CTRL_C_EVENT = 0,
            CTRL_BREAK_EVENT = 1,
            CTRL_CLOSE_EVENT = 2,
            CTRL_LOGOFF_EVENT = 5,
            CTRL_SHUTDOWN_EVENT = 6
        }

        [DllImport("Kernel32")]
        internal static extern bool SetConsoleCtrlHandler(HandlerRoutine handler, bool add);

        static void Main()
        {
            var result = SetConsoleCtrlHandler(handlerRoutine, true);

            // INITIAL AND START APP HERE

            while (true)
            {
                if (cancelled) break;
            }

            // DO CLEANUP HERE
            ...
            cleanupCompleted = true;
        }

        private static bool ConsoleCtrlHandler(CtrlType type)
        {
            cancelled = true;

            while (!cleanupCompleted)
            {
                // Watting for clean-up to be completed...
            }

            return true;
        }
    }
}

代码解释:

  • 引入Kernel32并声明extern函数SetConsoleCtrlHandler
  • 创建static的HandlerRoutine.
  • 调用SetConsoleCtrlHandler注册处理函数进行事件捕获
  • 捕获后在HandlerRoutine应用程序中进行资源清理
  • 清理完成后在HandlerRoutine中返回true允许应用程序退出

上述两个步骤即完成了Graceful Shutdown.

 

需要注意的点是:

1. 传统.net Console App中的事件捕获( 比如: Console.CancelKeyPressSystemEvents.SessionEnding )在容器中都不会生效,AppDomain.CurrentDomain.ProcessExit的触发时间又太晚, 只有SetConsoleCtrlHandler可行. 更多的尝试代码请参见: https://github.com/moby/moby/issues/25982#issuecomment-250490552

2. 要防止程序退出前HandlerRoutine实例被回收, 所以上面示例中使用了static的HandlerRoutine. 这点很重要, 如果HandlerRoutine在应用程序未结束的时候被回收掉, 就会引发错误, 看如下代码:

static void Main()
{
    // Initialize here

    ...
    using
    {
        var sysEventHandler = new HandlerRoutine(type =>
        {
            cancelled = true;

            while (!cleanCompleted)
            {
                // Watting for clean-up to be completed...
            }

            return true;
        });
		
        var sysEventSetResult = SetConsoleCtrlHandler(sysEventHandler, true);
        ...
    }
    ...

    // Cleanup here
}

在应用程序退出前, HandlerRoutine实例已经被回收掉了,在CTRL_SHUTDOWN_EVENT 被触发时就会引发NullReferenceException, 具体错误信息如下:

Managed Debugging Assistant 'CallbackOnCollectedDelegate':
A callback was made on a garbage collected delegate of type 'Program+HandlerRoutine::Invoke'. This may cause application crashes, corruption and data loss. When passing delegates to unmanaged code, they must be kept alive by the managed application until it is guaranteed that they will never be called.

类似场景: CallbackOnCollectedDelegate was detected

 

关于SetConsoleCtrlHandler的使用参考:

SetConsoleCtrlHandler function

HandlerRoutine callback function

 

最后, 如果要处理的应用程序类型不是Console App, 而是图形化的界面应用,则要处理的消息应该是WM_QUERYENDSESSION, 参见文档:

https://docs.microsoft.com/en-us/windows/console/setconsolectrlhandler#remarks

WM_QUERYENDSESSION message

Add File Extension to Windows IIS Container during image build

Let’s say: we need to add json file extension to the containerized IIS.

Dockerfile:

FROM {imageRegistry}/mcr.microsoft.com/dotnet/framework/aspnet:4.8-20200114-windowsservercore-ltsc2019
COPY . /inetpub/wwwroot
WORKDIR /inetpub/wwwroot

RUN C:\windows\system32\inetsrv\appcmd.exe set config "Default Web Site" -section:system.webServer/security/requestFiltering /+"fileExtensions.[fileExtension='json',allowed='True']"

ENV ASPNETCORE_URLS http://+:80
EXPOSE 80/tcp

An error occurs during build docker image:

Step 1/6 : FROM repo.q1lan.k8s:9999/mcr.microsoft.com/dotnet/framework/aspnet:4.8-20200114-windowsservercore-ltsc2019
 ---> a5bc996f06b3
Step 2/6 : COPY . /inetpub/wwwroot
 ---> bdb9536e506a
Step 3/6 : WORKDIR /inetpub/wwwroot
 ---> Running in f7666a9ffd0b
Removing intermediate container f7666a9ffd0b
 ---> c9fe76854f6c
Step 4/6 : RUN C:\windows\system32\inetsrv\appcmd.exe set config "Default Web Site" -section:system.webServer/security/requestFiltering /+"fileExtensions.[fileExtension='json',allowed='True']"
 ---> Running in 1c74d16420c2
Failed to process input: The parameter 'Web' must begin with a / or - (HRESULT=80070057).

Try to escape all double-quotes in Dockerfile:

RUN C:\windows\system32\inetsrv\appcmd.exe set config \"Default Web Site\" -section:system.webServer/security/requestFiltering /+\"fileExtensions.[fileExtension='json',allowed='True']\"

It works like a charm:

Step 1/6 : FROM repo.q1lan.k8s:9999/mcr.microsoft.com/dotnet/framework/aspnet:4.8-20200114-windowsservercore-ltsc2019
 ---> a5bc996f06b3
Step 2/6 : COPY . /inetpub/wwwroot
 ---> 646bbf3d5def
Step 3/6 : WORKDIR /inetpub/wwwroot
 ---> Running in 584471c0524a
Removing intermediate container 584471c0524a
 ---> 54f6a3ade821
Step 4/6 : RUN C:\windows\system32\inetsrv\appcmd.exe set config \"Default Web Site\" -section:system.webServer/security/requestFiltering /+\"fileExtensions.[fileExtension='json',allowed='True']\"
 ---> Running in f84c38da656a
Applied configuration changes to section "system.webServer/security/requestFiltering" for "MACHINE/WEBROOT/APPHOST/Default Web Site" at configuration commit path "MACHINE/WEBROOT/APPHOST/Default Web Site"
Removing intermediate container f84c38da656a
 ---> 7dfffe2d9813
Step 5/6 : ENV ASPNETCORE_URLS http://+:80
 ---> Running in dff81c8282f1
Removing intermediate container dff81c8282f1
 ---> cbd697556dd7
Step 6/6 : EXPOSE 80/tcp
 ---> Running in d10903bec188
Removing intermediate container d10903bec188
...

Describe Kubelet Service Parameters on Azure Windows node

Query Kubelet service

Managed by nssm

C:\k>sc qc kubelet
[SC] QueryServiceConfig SUCCESS

SERVICE_NAME: kubelet
        TYPE               : 10  WIN32_OWN_PROCESS
        START_TYPE         : 2   AUTO_START
        ERROR_CONTROL      : 1   NORMAL
        BINARY_PATH_NAME   : C:\k\nssm.exe
        LOAD_ORDER_GROUP   :
        TAG                : 0
        DISPLAY_NAME       : Kubelet
        DEPENDENCIES       : docker
        SERVICE_START_NAME : LocalSystem

Query kubelet AppParameters by nssm

C:\k>nssm get kubelet Application
C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe

C:\k>nssm get kubelet AppParameters
c:\k\kubeletstart.ps1

Powershell scripts to start kubelet

$global:MasterIP = "q1game-q1game-6adca6-e3314a8c.hcp.westus2.azmk8s.io"
$global:KubeDnsSearchPath = "svc.cluster.local"
$global:KubeDnsServiceIp = "10.0.0.10"
$global:MasterSubnet = "10.240.0.0/16"
$global:KubeClusterCIDR = "10.240.0.0/16"
$global:KubeServiceCIDR = "10.0.0.0/16"
$global:KubeBinariesVersion = "1.17.3"
$global:CNIPath = "c:\k\cni"
$global:NetworkMode = "L2Bridge"
$global:ExternalNetwork = "ext"
$global:CNIConfig = "c:\k\cni\config\$global:NetworkMode.conf"
$global:HNSModule = "c:\k\hns.psm1"
$global:VolumePluginDir = "c:\k\volumeplugins"
$global:NetworkPlugin="azure"
$global:KubeletNodeLabels="kubernetes.azure.com/role=agent,agentpool=q1win,storageprofile=managed,storagetier=Premium_LRS,kubernetes.azure.com/cluster=MC_q1game_q1game_westus2"
Write-Host "NetworkPlugin azure, starting kubelet."

# Turn off Firewall to enable pods to talk to service endpoints. (Kubelet should eventually do this)
netsh advfirewall set allprofiles state off
# startup the service

# Find if network created by CNI exists, if yes, remove it
# This is required to keep the network non-persistent behavior
# Going forward, this would be done by HNS automatically during restart of the node

$hnsNetwork = Get-HnsNetwork | ? Name -EQ azure
if ($hnsNetwork)
{
    # Cleanup all containers
    docker ps -q | foreach {docker rm $_ -f}

    Write-Host "Cleaning up old HNS network found"
    Remove-HnsNetwork $hnsNetwork
    # Kill all cni instances & stale data left by cni
    # Cleanup all files related to cni
    taskkill /IM azure-vnet.exe /f
    taskkill /IM azure-vnet-ipam.exe /f
    $cnijson = [io.path]::Combine("c:\k", "azure-vnet-ipam.json")
    if ((Test-Path $cnijson))
    {
        Remove-Item $cnijson
    }
    $cnilock = [io.path]::Combine("c:\k", "azure-vnet-ipam.json.lock")
    if ((Test-Path $cnilock))
    {
        Remove-Item $cnilock
    }

    $cnijson = [io.path]::Combine("c:\k", "azure-vnet.json")
    if ((Test-Path $cnijson))
    {
        Remove-Item $cnijson
    }
    $cnilock = [io.path]::Combine("c:\k", "azure-vnet.json.lock")
    if ((Test-Path $cnilock))
    {
        Remove-Item $cnilock
    }
}

# Restart Kubeproxy, which would wait, until the network is created
# This was fixed in 1.15, workaround still needed for 1.14 https://github.com/kubernetes/kubernetes/pull/78612
Restart-Service Kubeproxy

$env:AZURE_ENVIRONMENT_FILEPATH="c:\k\azurestackcloud.json"

c:\k\kubelet.exe --address=0.0.0.0 --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --azure-container-registry-config=c:\k\azure.json --cgroups-per-qos=false --client-ca-file=c:\k\ca.crt --cloud-config=c:\k\azure.json --cloud-provider=azure --cluster-dns=10.0.0.10 --cluster-domain=cluster.local --dynamic-config-dir=/var/lib/kubelet --enforce-node-allocatable="" --event-qps=0 --eviction-hard="" --feature-gates=RotateKubeletServerCertificate=true --hairpin-mode=promiscuous-bridge --image-gc-high-threshold=85 --image-gc-low-threshold=80 --image-pull-progress-deadline=20m --keep-terminated-pod-volumes=false --kube-reserved=cpu=100m,memory=1843Mi --kubeconfig=c:\k\config --max-pods=30 --network-plugin=cni --node-status-update-frequency=10s --non-masquerade-cidr=0.0.0.0/0 --pod-infra-container-image=kubletwin/pause --pod-max-pids=-1 --protect-kernel-defaults=true --read-only-port=0 --resolv-conf="" --rotate-certificates=false --streaming-connection-idle-timeout=4h --system-reserved=memory=2Gi --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256 --node-labels=$global:KubeletNodeLabels --volume-plugin-dir=$global:VolumePluginDir --cni-bin-dir=c:\k\azurecni\bin --cni-conf-dir=c:\k\azurecni\netconf

Enable Hyper-V Isolation by modify kubelet parameters

1. Modify c:\k\kubeletstart.ps1 to add parameter to kubelet

--feature-gates="XXX=true,HyperVContainer=true"

2. Restart kubelet
C:\k>nssm restart kubelet
Kubelet: STOP: A stop control has been sent to a service that other running services are dependent on.

C:\k>sc queryex kubelet

SERVICE_NAME: kubelet
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 4044
        FLAGS              :

C:\k>taskkill /PID 4044 /F

C:\k>sc start kubelet

Restart the Windows node if necessary

Getting real client IP in Docker Swarm

在Docker Swarm中通过Stack Deploy部署Service的时候,在Service中默认无法获取到客户端的IP地址, Github中有一个issue在track这个问题:Unable to retrieve user’s IP address in docker swarm mode

目前的解决方法或Workaround是把port改成host模式, 以kong为例.

默认的port发布模式:

version: "3.7"
services:
  kong-proxy:
    image: kong:1.0.3-alpine
    deploy:
      mode: global
      labels:
        - "tier=frontend"
      restart_policy:
        condition: any
    ports:
      - "80:8000"
      - "443:8443"
    depends_on:
      - database-postgresql
    environment:
      KONG_ADMIN_LISTEN: 0.0.0.0:8001, 0.0.0.0:8444 ssl
      KONG_DATABASE: postgres
      KONG_PG_DATABASE: kong
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: PaSsW0rd
      KONG_PG_HOST: database-postgresql
      KONG_PG_PORT: "5432"

    volumes:
      - type: "bind"
        source: "/var/log/kong/"
        target: "/usr/local/kong/logs/"
#        read_only: true
    networks:
      - backend
      - frontend
networks:
  frontend:
  backend:

 

修改port为host模式:

version: "3.7"
services:
  kong-proxy:
    image: kong:1.0.3-alpine
    deploy:
      mode: global
      labels:
        - "tier=frontend"
      restart_policy:
        condition: any
    ports:
      - target: 8000
        published: 80
        mode: host
      - target: 8443
        published: 43
        mode: host
    depends_on:
      - database-postgresql
    environment:
      KONG_ADMIN_LISTEN: 0.0.0.0:8001, 0.0.0.0:8444 ssl
      KONG_DATABASE: postgres
      KONG_PG_DATABASE: kong
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: PaSsW0rd
      KONG_PG_HOST: database-postgresql
      KONG_PG_PORT: "5432"

    volumes:
      - type: "bind"
        source: "/var/log/kong/"
        target: "/usr/local/kong/logs/"
#        read_only: true
    networks:
      - backend
      - frontend
networks:
  frontend:
  backend:

 

Configuring /etc/hosts/in Kubernetes Depolyment/Pod

Example of Pod:

apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  restartPolicy: Never
  hostAliases:
  - ip: "127.0.0.1"
    hostnames:
    - "foo.local"
    - "bar.local"
  - ip: "10.1.2.3"
    hostnames:
    - "foo.remote"
    - "bar.remote"
  containers:
  - name: cat-hosts
    image: busybox
    command:
    - cat
    args:
    - "/etc/hosts"

Example of Deployment

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hostaliases-deployment
spec:
  template:
    spec:
      hostAliases:
      - ip: "127.0.0.1"
        hostnames:
        - "foo.local"
        - "bar.local"
      - ip: "10.1.2.3"
        hostnames:
        - "foo.remote"
        - "bar.remote"
      containers:
      - name: a-aspnetcore-app
        image: aspnetcore-app:v1.0.0
        env:
        - name: ASPNETCORE_ENVIRONMENT
          value: Development
        ports:
        - containerPort: 80
      imagePullSecrets:
      - name: docker-secret

See the result in Kubernetes container

# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
192.168.240.234 hostaliases-deployment-65d5c48f7c-pqqvn

# Entries added by HostAliases.
127.0.0.1       foo.local bar.local
10.1.2.3        foo.remote bar.remote

reference: add-entries-to-pod-etc-hosts-with-host-aliases

Docker Windows容 器中的时间问题

场景 #1:

主机OS版本: Windows 10 1803
容器OS版本: Windows Server Core 1803

容器以默认的 hyperv 模式启动, 空器中的时间是一个莫名其妙的未来时间,比主机的时间提前10多个小时:
主机的时间是2018-8-15 17:XX:XX, 容器中的时间是2018-8-16 07:XX:XX
又一次代码修改重新构建了容器的镜像,重启了容器,容器的时间与主机的时间同步了

测试:
1. 当前实际时间为2018-8-16 16:XX:XX, 关掉主机中的自动设置时间, 修改主机的时间为2018-5-16 16:XX:XX,容器中的时间不变,重启容器后容器中的时间变成2018-8-16 09:XX:XX
2. 打开主机中自动设置时间,主机时间变回,2018-8-16 16:07:XX, 容器的时间也跟着同步成了2018-8-16 16:07:XX
3. 再次关掉主机中的自动设置时间,把主机时间改为2018-8-19 16:07:XX, 容器的时间马上跟着变成了2018-8-19 16:07:XX
4. 再次打开主机中的自动设置时间,主机时间变回2018-8-16 16:09:XX, 容器时间还维持在2018-8-19 16:XX:XX
5. 再次重启容器,容器的时间又与主机同步了

结论: 当容器中的时间比主机的时间晚时,与立即与主机时间同步,反之则不会同步。莫名其妙, 参见bug: https://github.com/moby/moby/issues/37283

场景 #2:

主机OS版本: Windows Server 2016 14393.1358
容器OS版本: Windows Server Core 10.0.14393.2363

容器以 process 模式启动, docker run … –isolation process…
不管主机时间如果变化,容器中的时间都与主机时间同步

hyperv或process兼容列表见:
https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility

Docker for Windows 18.06.0-ce released

18.06.0-ce-win70 (19075)

  • Upgrades
  • New
  • Bug fixes and minor changes
    • AUFS storage driver is deprecated in Docker Desktop and AUFS support will be removed in the next major release. You can continue with AUFS in Docker Desktop 18.06.x, but you will need to reset disk image (in Settings > Reset menu) before updating to the next major update. You can check documentation to save images and backup volumes
    • Fix bug which would cause VM logs to be written to RAM rather than disk in some cases, and the VM to hang.
    • Fix security issue with named pipe connection to docker service.
    • Fix VPNKit memory leak. Fixes docker/for-win#2087, moby/vpnkit#371
    • Fix restart issue when using Windows fast startup on latest 1709 Windows updates. Fixes docker/for-win#1741, docker/for-win#1741
    • DNS name host.docker.internal can be used for host resolution from Windows Containers. Fixes docker/for-win#1976
    • Fix broken link in diagnostics window.
    • Added log rotation for docker-ce logs inside the virtual machine.
    • Changed smb permission to avoid issue when trying to manipulate files with different users in containers. Fixes docker/for-win#2170

License for OS (Windows) inside Docker [reshipment]

How does licensing work?

For production, licensing is at the host level, i.e. each machine or VM which is running Docker. Your Windows licence on the host allows you to run any number of Windows Docker containers on that host. With Windows Server 2016 you get the commercially supported version of Docker included in the licence costs, with support from Microsoft and Docker, Inc.

For development, Docker for Windows runs on Windows 10 and is free, open-source software. Docker for Windows can also run a Linux VM on your machine, so you can use both Linux and Windows containers in development. Like the server version, your Windows 10 licence allows you to run any number of Windows Docker containers.

Windows admins will want a unified platform for managing images and containers. That’s Docker Datacenter which is separately licensed, and will be available for Windows soon.

 

https://blog.docker.com/2017/01/docker-windows-server-image2docker/#h.x2hzndd3qwow