
[BUG] svclb-traefik* won't start after host crash and restart. #34

Open
bayeslearnerold opened this issue Apr 4, 2022 · 4 comments
@bayeslearnerold

What did you do

  • How was the cluster created?

    • Only one node, with a volume mapping for /var/rancher.../storage.
  • What did you do afterwards?
    My host crashed; after restarting it and restarting k3d, I am no longer able to connect to any app service through ingress.

What did you expect to happen

Ingress should work

Screenshots or terminal output

[rockylinux@rockylinux8 infra_k3d]$ kubectl -n kube-system logs svclb-traefik-dkgkq lb-port-80
+ trap exit TERM INT
+ echo 10.43.70.41
+ grep -Eq :
+ cat /proc/sys/net/ipv4/ip_forward
+ '[' 1 '!=' 1 ]
+ iptables -t nat -I PREROUTING '!' -s 10.43.70.41/32 -p TCP --dport 80 -j DNAT --to 10.43.70.41:80
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
iptables v1.8.4 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded. 

Which OS & Architecture

  • Linux, Windows, MacOS / amd64, x86, ...?
    Linux rockylinux8.linuxvmimages.local 4.18.0-348.20.1.el8_5.x86_64 #1 SMP Thu Mar 10 20:59:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Which version of k3d

  • output of k3d version
k3d version v5.3.0
k3s version v1.22.6-k3s1 (default)

Which version of docker

  • output of docker version and docker info
    [rockylinux@rockylinux8 infra_k3d]$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 3
  Running: 2
  Paused: 0
  Stopped: 1
 Images: 5
 Server Version: 20.10.13
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-348.20.1.el8_5.x86_64
 Operating System: Rocky Linux 8.5 (Green Obsidian)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.19GiB
 Name: rockylinux8.linuxvmimages.local
 ID: RI32:V7KA:PDQG:Q2Z2:DNET:CMMP:3MMG:23OF:RMTN:W6J2:WOQO:N4YA
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
@bayeslearnerold
Author

Name:           svclb-traefik-wqjjt
Namespace:      kube-system
Priority:       0
Node:           <none>
Labels:         app=svclb-traefik
                controller-revision-hash=f4f897b4f
                pod-template-generation=1
                svccontroller.k3s.cattle.io/svcname=traefik
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  DaemonSet/svclb-traefik
Containers:
  lb-port-80:
    Image:      rancher/klipper-lb:v0.3.4
    Port:       80/TCP
    Host Port:  80/TCP
    Environment:
      SRC_PORT:    80
      DEST_PROTO:  TCP
      DEST_PORT:   80
      DEST_IPS:    10.43.184.59
    Mounts:        <none>
  lb-port-443:
    Image:      rancher/klipper-lb:v0.3.4
    Port:       443/TCP
    Host Port:  443/TCP
    Environment:
      SRC_PORT:    443
      DEST_PROTO:  TCP
      DEST_PORT:   443
      DEST_IPS:    10.43.184.59
    Mounts:        <none>
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:         <none>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly op=Exists
                 node-role.kubernetes.io/control-plane:NoSchedule op=Exists
                 node-role.kubernetes.io/master:NoSchedule op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  5m44s  default-scheduler  0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
  Warning  FailedScheduling  4m32s  default-scheduler  0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.

@westrickc

I get the same error after creating a new cluster with k3d. My host OS is RHEL 8.5. I think it is related to the fact that RHEL 8.5 only supports nftables, while the klipper-lb Docker image has iptables symlinked to the legacy backend.

Relevant versions of things:

  • Red Hat Enterprise Linux release 8.5 (Ootpa)
  • Docker version 20.10.14, build a224086
  • k3d version v5.4.1
  • k3s version v1.22.7-k3s1 (default)
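On a RHEL 8-family host you can confirm which backend is active from the iptables version string; a minimal sketch (the version string below is sample output, assuming a host like the one described):

```shell
# Sample version string as printed by "iptables --version" on a RHEL 8.5
# host; run the real command on your host to get the actual value.
ver='iptables v1.8.4 (nf_tables)'
# Extract the backend name from the parenthesized suffix.
backend=$(printf '%s\n' "$ver" | sed -n 's/.*(\(.*\)).*/\1/p')
echo "$backend"   # nf_tables
```

A legacy-backend host would print "(legacy)" instead, which is the case the klipper-lb image assumes.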

My workaround was to recreate the rancher/klipper-lb:v0.3.4 image with this Dockerfile:

FROM rancher/klipper-lb:v0.3.4
# Use nftables iptables not legacy
RUN \
  ln -sf /sbin/xtables-nft-multi /sbin/iptables && \
  ln -sf /sbin/xtables-nft-multi /sbin/iptables-save && \
  ln -sf /sbin/xtables-nft-multi /sbin/iptables-restore
CMD ["entry"]

Then I used k3d image import to inject this new image into the cluster. Kubernetes eventually restarts the failed svclb-traefik-xxxxx pod with the new image.

It's a hack, but it gets ingress working on my system.
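The rebuild-and-import steps above look roughly like this (the image tag and cluster name are placeholders, not anything k3d creates for you):

```shell
# Build the patched image from the Dockerfile above (run in its directory).
docker build -t klipper-lb-nft:v0.3.4 .

# Import it into the running k3d cluster; "mycluster" is a placeholder
# for your actual cluster name (see "k3d cluster list").
k3d image import klipper-lb-nft:v0.3.4 -c mycluster
```

You would then point the svclb DaemonSet (or the k3s klipper-lb image setting) at the imported tag so the restarted pod picks it up.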

@r-ushil

r-ushil commented May 23, 2023

Check this out for a quick fix:

k3d-io/k3d#1021 (comment)

To solve the problem properly (rather than using this ad-hoc fix), I would suggest rewriting check_iptables_mode() to grep inside the /sbin directory, rather than trying to use lsmod / modprobe.
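One way to read that suggestion is to classify the backend from what the iptables symlink resolves to, instead of probing kernel modules (modprobe fails anyway when /lib/modules is absent in the container). A hypothetical sketch, simulated with a temp directory standing in for /sbin:

```shell
# Simulate an nft-based /sbin layout in a temp directory.
tmp=$(mktemp -d)
ln -s /sbin/xtables-nft-multi "$tmp/iptables"

# Classify the backend from the symlink target, no lsmod/modprobe needed.
case "$(readlink "$tmp/iptables")" in
  *nft*)    mode=nft ;;
  *legacy*) mode=legacy ;;
  *)        mode=unknown ;;
esac
echo "$mode"   # nft

rm -rf "$tmp"
```

This only tells you how the binaries inside the image are wired, of course; deciding which backend the *host* kernel uses still needs some signal from outside the container.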

@bartowl

bartowl commented Jul 3, 2023

It has now been over a year and this issue still has not been fixed? There are more and more nft-based systems, and this is really annoying... In particular, with 0.4.3:

+ info 'legacy mode detected'
+ echo '[INFO] ' 'legacy mode detected'
+ set_legacy
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables
[INFO]  legacy mode detected
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables-save
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables-restore
+ ln -sf /sbin/xtables-legacy-multi /sbin/ip6tables
+ start_proxy
+ echo 0.0.0.0/0
+ grep -Eq :
+ iptables -t filter -I FORWARD -s 0.0.0.0/0 -p TCP --dport 80 -j ACCEPT
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
iptables v1.8.8 (legacy): can't initialize iptables table `filter': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.

This is current (5.5.1) k3d using klipper-lb:v0.4.3 on Oracle Linux Server 8.7 (RHEL 8.7 binary compatible).
Host is running iptables v1.8.4 (nf_tables) with following packages installed:
iptables-1.8.4-23.0.1.el8.x86_64
nftables-0.9.3-26.el8.x86_64
iptables-ebtables-1.8.4-23.0.1.el8.x86_64
python3-nftables-0.9.3-26.el8.x86_64
iptables-libs-1.8.4-23.0.1.el8.x86_64

A proposed change to the detection would be to replace
lsmod | grep "nf_tables"
with
lsmod | grep "nf_conntrack"
since this is what the lsmod output looks like on this system after grepping for "nf_":

nf_conntrack_netlink    45056  0
nf_reject_ipv4         16384  1 ipt_REJECT
nf_nat                 45056  3 xt_nat,xt_MASQUERADE,nft_chain_nat
nf_conntrack          147456  5 nf_conntrack_netlink,xt_nat,xt_conntrack,xt_MASQUERADE,nf_nat
nf_defrag_ipv6         24576  1 nf_conntrack
nf_defrag_ipv4         16384  1 nf_conntrack
libcrc32c              16384  3 nf_nat,nf_conntrack,xfs
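To illustrate why the current check misses on such a host, here are both greps run against the sample lsmod output above (a minimal reproduction, not the actual klipper-lb entry script):

```shell
# Sample lsmod output from the Oracle Linux 8.7 host above ("nf_" lines only).
lsmod_out='nf_conntrack_netlink    45056  0
nf_reject_ipv4         16384  1 ipt_REJECT
nf_nat                 45056  3 xt_nat,xt_MASQUERADE,nft_chain_nat
nf_conntrack          147456  5 nf_conntrack_netlink,xt_nat,xt_conntrack,xt_MASQUERADE,nf_nat'

# Current check: no "nf_tables" line appears, so the script falls back to legacy.
printf '%s\n' "$lsmod_out" | grep -q 'nf_tables' && echo nft-detected || echo legacy-fallback
# prints: legacy-fallback

# Proposed check: nf_conntrack is loaded, so the active netfilter stack is seen.
printf '%s\n' "$lsmod_out" | grep -q 'nf_conntrack' && echo nft-detected || echo legacy-fallback
# prints: nft-detected
```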
