Category: Kubernetes
An exception about FlexVolume SMB storage plugin
We use Microsoft's FlexVolume SMB plugin (microsoft.com~smb.cmd) to mount SMB storage in Kubernetes Windows containers. On one of the nodes, containers intermittently fail to mount the volume:
E1114 09:48:35.398102 4496 driver-call.go:267] Failed to unmarshal output for command: init, output: "RunFlexVolume : \xce\u07b7\xa8\xbd\xab\xa1\xb0RunFlexVolume\xa1\xb1\xcf\xeeʶ\xb1\xf0Ϊ cmdlet\xa1\xa2\xba\xaf\xca\xfd\xa1\xa2\xbdű\xbe\xceļ\xfe\xbb\xf2\xbf\xc9\xd4\xcb\xd0г\xcc\xd0\xf2\xb5\xc4\xc3\xfb\xb3ơ\xa3\xc7\xeb\xbc\xec\xb2\xe9\xc3\xfb\xb3Ƶ\xc4ƴд\xa3\xac\xc8\xe7\xb9\xfb\xb0\xfc\xc0\xa8·\r\n\xbe\xb6\xa3\xac\xc7\xebȷ\xb1\xa3·\xbe\xb6\xd5\xfdȷ\xa3\xacȻ\xba\xf3\xd4\xd9\xca\xd4һ\xb4Ρ\xa3\r\n\xcb\xf9\xd4\xdaλ\xd6\xc3 C:\\usr\\libexec\\kubernetes\\kubelet-plugins\\volume\\exec\\microsoft.com~smb.cmd\\smb.ps1:89 \xd7ַ\xfb: 1\r\n+ RunFlexVolume\r\n+ ~~~~~~~~~~~~~\r\n + CategoryInfo : ObjectNotFound: (RunFlexVolume:String) [], ParentContainsErrorRecordException\r\n + FullyQualifiedErrorId : CommandNotFoundException\r\n \r\n", error: invalid character 'R' looking for beginning of value
E1114 09:48:35.398102 4496 plugins.go:766] Error dynamically probing plugins: …
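The `invalid character 'R' looking for beginning of value` in the log is a JSON parse error: kubelet expects a FlexVolume driver to answer `init` with a JSON object, but here it received PowerShell's error text instead. The following is a minimal Python sketch of that failure mode. kubelet itself is written in Go; `parse_driver_output` and the sample replies are illustrative assumptions, not the plugin's code. The last lines reassemble the first escaped bytes of the log and decode them as GBK, which suggests the unreadable part is a localized (Chinese) PowerShell message.

```python
import json


def parse_driver_output(raw: str) -> dict:
    """Parse a driver reply the way a JSON-expecting caller would (hypothetical helper)."""
    return json.loads(raw)


# A well-formed reply to `init` is a JSON object (sample values are illustrative).
ok = parse_driver_output('{"status": "Success", "capabilities": {"attach": false}}')

# The reply in the log starts with PowerShell error text, not JSON.
bad_reply = "RunFlexVolume : <localized PowerShell error text>"
try:
    parse_driver_output(bad_reply)
    parse_error = None
except json.JSONDecodeError as exc:
    parse_error = exc  # parsing fails on the very first character, 'R'

# First escaped bytes from the log: \xce, then \u07b7 (UTF-8 bytes de b7), \xa8, ...
leading_bytes = b"\xce\xde\xb7\xa8\xbd\xab\xa1\xb0"
decoded = leading_bytes.decode("gbk")

print(ok["status"], type(parse_error).__name__, decoded)
```

Read this way, the driver script failed before it could print any JSON: the `CommandNotFoundException` for `RunFlexVolume` at `smb.ps1:89` is the underlying problem, and the unmarshal error is only its symptom.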
Configuring Kubernetes Windows Nodes on an Upstream ToR Network
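Preface: Whether Windows nodes can be added to a Kubernetes cluster successfully depends mainly on the maturity of the Kubernetes network components and on the SDN capabilities of Windows Server. Kubernetes introduced Windows containers in version 1.5, based on Windows Server 2016 (1607), and has since reached its latest version, 1.18; Windows Server has likewise moved from 1607 through 1709/1803/1809/1903/1909, with many improvements to container support and SDN. There are many commonly used Kubernetes network components, but few of them support mixed Linux/Windows clusters (at least that used to be the case). I previously built a mixed cluster on Windows Server 1709 with OVS/OVN networking, but OVS needs too many base components (Open vSwitch, Central Database, Northbound Database, Southbound Database…), installation and configuration were complex and hard to maintain, and ovn-kubernetes was not yet mature at the time, so I gave it up after a trial period. After nearly two years of development, Windows container environments can now meet near-production standards, so this article draws on the open-source components and documentation provided by the community to describe the network configuration of a mixed Linux/Windows cluster based on the L3 Upstream ToR network model. Goal of this article: add Windows nodes without modifying the existing Kubernetes Linux cluster network, and achieve connectivity between Windows and Linux nodes: Container to Container, Pod to Pod, Container to Service, Pod to Service. About L3 Upstream ToR: first, a diagram from the official documentation that illustrates the upstream ToR network model. The diagram comes from the Kubernetes documentation: 在 Kubernetes …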
How to install a specific hotfix on Windows Server
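Windows container environments have one peculiarity: the OS Build Number of the host and the container must match, and some scenarios even require the Revision Number to match. As a result, you often need to install a hotfix with a specific Revision on a K8s node. Installing it online with PowerShell means a slow download that you cannot control; the smoothest path is to find the msu package for the required Revision Number and install it directly: 1. Find the KB for the version on the Windows Update History site, e.g. Windows Server 1809 OS Build 10.0.17763.1158: https://support.microsoft.com/en-us/help/4549949 2. Search for the KB in the Microsoft Update Catalog: https://www.catalog.update.microsoft.com/ and locate the download package, e.g. KB4549949 for 17763.1158: https://www.catalog.update.microsoft.com/Search.aspx?q=KB4549949 3. Download the msu package and install it with wusa: wusa windows10.0-kb4549949-x64_90e8805e69944530b8d4d4877c7609b9a9e68d81.msu Appendix: to keep the Windows node version from changing, also turn off Windows Auto Update so that the node OS does not update itself: a). Check the Auto Update status: %systemroot%\system32\Cscript %systemroot%\system32\scregedit.wsf /AU /v b). Disable Windows …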
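The matching rule described above (Build Number must match, and in some scenarios the Revision Number as well) can be sketched as a small check. This is a minimal sketch assuming versions are written as four dot-separated integers like `10.0.17763.1158`; `parse_version` and `versions_match` are hypothetical helper names, not part of any Windows or Kubernetes tooling.

```python
from typing import NamedTuple


class WinVersion(NamedTuple):
    major: int
    minor: int
    build: int
    revision: int


def parse_version(text: str) -> WinVersion:
    """Parse a version string such as '10.0.17763.1158'."""
    parts = [int(p) for p in text.strip().split(".")]
    if len(parts) != 4:
        raise ValueError(f"expected major.minor.build.revision, got {text!r}")
    return WinVersion(*parts)


def versions_match(host: str, container: str, require_revision: bool = False) -> bool:
    """True if major.minor.build are equal, and the revision too when required."""
    h, c = parse_version(host), parse_version(container)
    if h[:3] != c[:3]:
        return False
    return h.revision == c.revision if require_revision else True


# Same build, different revision: acceptable unless the revision must match too.
print(versions_match("10.0.17763.1158", "10.0.17763.1039"))
print(versions_match("10.0.17763.1158", "10.0.17763.1039", require_revision=True))
```

A check like this tells you whether a node needs a specific msu package before a given image can be scheduled on it.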
For Windows Containers, you need to set --image-pull-progress-deadline for kubelet
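Windows images easily run to several GB, and images based on Windows Server Core are 5~10 GB. The kubelet on a Windows node therefore often cancels an image pull partway through: Failed to pull image "XXX": rpc error: code = Unknown desc = context canceled The cause is that the default image pulling progress deadline is 1 minute: if the download reports no progress for 1 minute, the pull is cancelled, so larger images can never be downloaded successfully. From the official documentation: If no pulling progress is made before this deadline, the image pulling will be cancelled. This docker-specific flag only works when container-runtime is set to docker. (default …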
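To make the deadline rule concrete, here is a simplified Python model of it. It is a sketch of the behaviour as described in the quoted documentation, not kubelet's actual implementation: the pull is treated as cancelled as soon as any stretch without a progress update exceeds the deadline. `pull_is_cancelled` and the sample timestamps are assumptions made for illustration.

```python
from typing import Iterable


def pull_is_cancelled(progress_times: Iterable[float],
                      finished_at: float,
                      deadline: float = 60.0) -> bool:
    """Return True if any gap without progress exceeds `deadline` seconds.

    progress_times: seconds since the pull started at which progress was reported.
    finished_at:    seconds since the pull started at which it would complete.
    """
    last = 0.0  # the pull starts at t = 0
    for t in sorted(progress_times):
        if t > finished_at:
            break
        if t - last > deadline:
            return True
        last = t
    return finished_at - last > deadline


# A large layer stalls the progress reports for 80 s between t=50 and t=130.
stalled = [10.0, 50.0, 130.0]
print(pull_is_cancelled(stalled, finished_at=140.0))                   # 1-minute deadline
print(pull_is_cancelled(stalled, finished_at=140.0, deadline=1800.0))  # 30-minute deadline
```

In this model the same pull survives once the deadline is longer than the longest stall, which is the effect of passing a larger duration to `--image-pull-progress-deadline`.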
Implementing Graceful Shutdown in Windows Container
In a Kubernetes Linux Pod, when a Pod is deleted with kubectl or replaced during a rolling update, the PID 1 process in every container of the terminating Pod receives SIGTERM, which tells it to release its resources and prepare to exit. If the process has not exited within the period given by the Pod's spec.terminationGracePeriodSeconds, Kubernetes then sends SIGKILL to kill it. Running kubectl delete --force --grace-period=0 … has the same effect as sending SIGKILL directly. SIGTERM and SIGKILL do not work in Windows containers, however; today a Windows container is simply terminated 5 seconds after it receives the terminating instruction. See: https://v1-18.docs.kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#v1-pod V1.Pod.terminationGracePeriodSeconds - this is not fully implemented in Docker on Windows, see: reference. The behavior today is that the ENTRYPOINT process is sent CTRL_SHUTDOWN_EVENT, then Windows waits 5 seconds by default, and finally shuts down …
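For comparison, the Linux side of this contract can be sketched in a few lines of Python: the process installs a SIGTERM handler, finishes its work loop, and exits before the grace period runs out. This is a minimal POSIX-only sketch; `handle_sigterm` and `run_until_shutdown` are illustrative names, and the `os.kill` call merely stands in for the signal Kubernetes would deliver. It does not apply to Windows containers, where the shutdown notification arrives as CTRL_SHUTDOWN_EVENT instead.

```python
import os
import signal
import time

shutdown_requested = False


def handle_sigterm(signum, frame):
    """Record the request; the main loop does the actual cleanup."""
    global shutdown_requested
    shutdown_requested = True


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)


def run_until_shutdown(poll_interval: float = 0.01, max_iterations: int = 1000) -> int:
    """Simulated work loop that stops once SIGTERM has been received."""
    iterations = 0
    while not shutdown_requested and iterations < max_iterations:
        time.sleep(poll_interval)  # placeholder for real work
        iterations += 1
    return iterations


# Demo only: send SIGTERM to ourselves, as would happen on Pod deletion.
os.kill(os.getpid(), signal.SIGTERM)
iterations = run_until_shutdown()
print("shutdown requested:", shutdown_requested)
```

The handler only sets a flag, so cleanup runs in normal program flow rather than inside the signal handler.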
Describe Kubelet Service Parameters on Azure Windows node
Query the kubelet service managed by nssm:

    C:\k>sc qc kubelet
    [SC] QueryServiceConfig SUCCESS

    SERVICE_NAME: kubelet
            TYPE               : 10  WIN32_OWN_PROCESS
            START_TYPE         : 2   AUTO_START
            ERROR_CONTROL      : 1   NORMAL
            BINARY_PATH_NAME   : C:\k\nssm.exe
            LOAD_ORDER_GROUP   :
            TAG                : 0
            DISPLAY_NAME       : Kubelet
            DEPENDENCIES       : docker
            SERVICE_START_NAME : LocalSystem

Query the kubelet AppParameters with nssm:

    C:\k>nssm get kubelet Application
    C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe

    C:\k>nssm get …
Run Windows container with Hyper-V isolation mode in Kubernetes
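Windows containers have two isolation modes, Hyper-V and Process; see: Isolation Modes. The two modes differ in how compatible the host OS version must be with the container OS version; see: Windows container version compatibility. Hyper-V mode is clearly more compatible than Process mode: it is backward compatible, meaning a newer host OS can run an older container OS but not the other way round. In Process mode, Windows Server requires the host OS and container OS versions to be identical, and Windows 10 does not support Process mode at all. One day I wanted to run containers in Hyper-V mode on a Kubernetes Windows node, and found this in the 1.17 documentation: Note: In this document, when we talk about Windows containers we mean Windows containers with process isolation. Windows containers with Hyper-V isolation is planned for a future release. Unwilling to give up, I searched Google and found: 1. Someone had filed a bug, and it has been fixed: https://github.com/kubernetes/kubernetes/issues/58750 2. The code has been merged as well: …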
kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1
[Copied from]: https://access.redhat.com/solutions/3659011 RHEL7 and kubernetes: kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1. Solution verified, updated October 13 2019 at 9:17 PM. Issue: We are trying to prototype kubernetes on top of RHEL and encounter the situation that the device seems to be frozen. There are repeated messages similar to: …
[Kubernetes] Create deployment, service by Python client
Install the Kubernetes Python client and PyYAML:

    # pip install kubernetes pyyaml

1. Get Namespaces or Pods by CoreV1Api:

    # -*- coding: utf-8 -*-
    from kubernetes import client, config, utils

    config.kube_config.load_kube_config(config_file="../kubecfg.yaml")
    coreV1Api = client.CoreV1Api()

    print("\nListing all namespaces")
    for ns in coreV1Api.list_namespace().items:
        print(ns.metadata.name)

    print("\nListing pods with their IP, namespace, names:")
    for pod in coreV1Api.list_pod_for_all_namespaces(watch=False).items:
        print("%s\t\t%s\t%s" % (pod.status.pod_ip, …
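Since the post's title also covers creating a Deployment, here is a minimal sketch of the object such a call needs. It only builds the manifest as a plain dict using the standard library, so it runs without a cluster; `build_deployment`, the name `demo-web`, and the image tag are illustrative assumptions rather than the post's own code.

```python
import json


def build_deployment(name: str, image: str, replicas: int = 1, port: int = 80) -> dict:
    """Build an apps/v1 Deployment manifest as a plain dict (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the Pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


manifest = build_deployment("demo-web", "nginx:1.17", replicas=2)
print(json.dumps(manifest, indent=2))

# With a configured client, a dict like this can be passed as the request body, e.g.:
#   client.AppsV1Api().create_namespaced_deployment(namespace="default", body=manifest)
```

Keeping the manifest as a dict means the same structure can be loaded from a YAML file with PyYAML or built in code, and submitted either way.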