panic: runtime error: invalid memory address or nil pointer dereference #5767

Open
entense opened this issue Feb 21, 2025 · 23 comments · May be fixed by #5769
@entense

entense commented Feb 21, 2025

Contributing guidelines and issue reporting guide

Well-formed report checklist

  • I have found a bug and the documentation does not mention anything about my problem
  • I have found a bug and there are no open or closed issues related to my problem
  • I have provided version information about my environment and done my best to provide a reproducer

Description of bug

Today a very strange error appeared. buildx runs in a GitLab runner and creates builds, but today a problem suddenly appeared.

I tried different versions, all without success.

docker info
Client:
 Version:    25.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/local/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.25.0
    Path:     /usr/local/libexec/docker/cli-plugins/docker-compose
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 28.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.5-0-g59923ef
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.1.0-31-amd64
 Operating System: Alpine Linux v3.21 (containerized)
 OSType: linux
 Architecture: x86_64
 CPUs: 48
 Total Memory: 125.5GiB
 Name: f5dc9ade0ae0
 ID: d1f2f41e-e30b-434b-a789-b3ddefada255
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Reproduction

image:
    name: docker:git

services:
    - name: docker:dind
      alias: thedockerhost
      command: ['--tls=false']

variables:
    DOCKER_HOST: tcp://thedockerhost:2375/
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ''
    CONTAINER_RELEASE_IMAGE: $CI_REGISTRY_IMAGE:latest
    DOCKER_BUILDKIT: 1

build-image:
    stage: build
    tags:
        - docker-207
    before_script:
        - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
    script:
        - docker buildx version
        - docker buildx create --driver=docker-container --name=buildkit-builder --use
        - docker buildx inspect buildkit-builder --bootstrap

Version information

v0.13.1

$ docker buildx version
github.com/docker/buildx v0.13.1 788433953af10f2a698f5c07611dddce2e08c7a0
$ docker buildx create --driver=docker-container --name=buildkit-builder --use
buildkit-builder
$ docker buildx inspect buildkit-builder --bootstrap
#1 [internal] booting buildkit
#1 pulling image moby/buildkit:buildx-stable-1
#1 pulling image moby/buildkit:buildx-stable-1 15.8s done
#1 creating container buildx_buildkit_buildkit-builder0
#1 creating container buildx_buildkit_buildkit-builder0 15.5s done
#1 31.31 panic: runtime error: invalid memory address or nil pointer dereference
#1 31.31 github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
#1 31.31 github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1ce301f, 0xc}, {0x0, 0x0, 0xc0005129e8?})
@crazy-max
Member

crazy-max commented Feb 21, 2025

Version:    25.0.5
...
Server Version: 28.0.0

Seems you're using an old client with the latest Docker 28, but I don't think that's the issue.

panic: runtime error: invalid memory address or nil pointer dereference

This error seems to be returned by the Docker engine and not BuildKit. Can you run this command with debug enabled?

docker --debug buildx create --driver=docker-container --name=buildkit-builder --use --bootstrap

And this one, using docker run to create the container builder without Buildx:

docker --debug run -d --privileged --name testbk moby/buildkit:latest --debug

Also cc @thaJeztah @vvoland in case this is something you might be aware of.

@entense
Author

entense commented Feb 21, 2025

@crazy-max

  1. failed
    docker --debug buildx create --driver=docker-container --name=buildkit-builder --use --bootstrap
$ docker --debug buildx create --driver=docker-container --name=buildkit-builder --use --bootstrap
#1 [internal] booting buildkit
#1 pulling image moby/buildkit:buildx-stable-1
#1 pulling image moby/buildkit:buildx-stable-1 14.7s done
#1 creating container buildx_buildkit_buildkit-builder0
#1 creating container buildx_buildkit_buildkit-builder0 17.0s done
#1 31.68 panic: runtime error: invalid memory address or nil pointer dereference
full output
$ docker --debug buildx create --driver=docker-container --name=buildkit-builder --use --bootstrap
#1 [internal] booting buildkit
#1 pulling image moby/buildkit:buildx-stable-1
#1 pulling image moby/buildkit:buildx-stable-1 15.3s done
#1 creating container buildx_buildkit_buildkit-builder0
#1 creating container buildx_buildkit_buildkit-builder0 17.1s done
#1 32.42 time="2025-02-21T17:22:22Z" level=info msg="auto snapshotter: using overlayfs"
#1 32.42 panic: runtime error: invalid memory address or nil pointer dereference
#1 32.42 [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xe78fba]
#1 32.42 goroutine 1 [running, locked to thread]:
#1 32.42 github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
#1 32.42 	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:296
#1 32.42 github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1cdb3cf, 0x8}, {0x0, 0x0, 0xc0005949e8?})
#1 32.42 	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:372 +0x3a
#1 32.42 github.com/fsnotify/fsnotify.(*Watcher).Add(...)
#1 32.42 	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:362
#1 32.42 tags.cncf.io/container-device-interface/pkg/cdi.(*watch).update(0xc00019f610, 0xc0004f99b0, {0x0, 0x0, 0xc000594a18?})
#1 32.42 	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:572 +0xd9
#1 32.42 tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).refreshIfRequired(0xc00028c8c0, 0x0?)
#1 32.42 	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:217 +0x38
#1 32.42 tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).Refresh(0xc00028c8c0)
#1 32.42 	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:130 +0x9d
#1 32.42 main.getCDIManager.func1({0x0, {0x2d29920, 0x3, 0x3}, {0x0, 0x0, 0x0}})
#1 32.42 	/src/cmd/buildkitd/main.go:1067 +0x8f
#1 32.42 main.getCDIManager({0x0, {0x2d29920, 0x3, 0x3}, {0x0, 0x0, 0x0}})
#1 32.42 	/src/cmd/buildkitd/main.go:1071 +0xf1
#1 32.42 main.ociWorkerInitializer(0xc000513438?, {0xc000636908?, 0xc000690540?, {0x1d041a3?, 0xffffffffffffff9c?}})
#1 32.42 	/src/cmd/buildkitd/main_oci_worker.go:301 +0x406
#1 32.42 main.newWorkerController(0xc000476000, {0xc00016c908?, 0xc00019c3a8?, {0x1d041a3?, 0x0?}})
#1 32.42 	/src/cmd/buildkitd/main.go:882 +0xfa
#1 32.42 main.newController({0x1fa5538, 0xc00028c280}, 0xc000476000, 0xc00016c908)
#1 32.42 	/src/cmd/buildkitd/main.go:797 +0x2c5
#1 32.42 main.main.func3(0xc000476000)
#1 32.42 	/src/cmd/buildkitd/main.go:349 +0xc78
#1 32.42 github.com/urfave/cli.HandleAction({0x1929120?, 0xc0003d00a0?}, 0xc000584c40?)
#1 32.42 	/src/vendor/github.com/urfave/cli/app.go:524 +0x50
#1 32.42 github.com/urfave/cli.(*App).Run(0xc000584c40, {0xc000050040, 0x2, 0x2})
#1 32.42 	/src/vendor/github.com/urfave/cli/app.go:286 +0x79b
#1 32.42 main.main()
#1 32.42 	/src/cmd/buildkitd/main.go:413 +0x12a7
(the same panic repeats several times as buildkitd restarts; duplicated and interleaved trace lines trimmed for readability)
#1 ERROR: exit code 137
------
 > [internal] booting buildkit:
------
ERROR: exit code 137
147 v0.13.1 /usr/local/libexec/docker/cli-plugins/docker-buildx --debug buildx create --driver=docker-container --name=buildkit-builder --use --bootstrap
github.com/docker/buildx/driver/docker-container.(*Driver).run
	github.com/docker/buildx/driver/docker-container/driver.go:302
github.com/docker/buildx/driver/docker-container.(*Driver).wait
	github.com/docker/buildx/driver/docker-container/driver.go:204
github.com/docker/buildx/driver/docker-container.(*Driver).create.func3
	github.com/docker/buildx/driver/docker-container/driver.go:195
github.com/docker/buildx/util/progress.(*subLogger).Wrap
	github.com/docker/buildx/util/progress/progress.go:79
github.com/docker/buildx/driver/docker-container.(*Driver).create
	github.com/docker/buildx/driver/docker-container/driver.go:124
github.com/docker/buildx/driver/docker-container.(*Driver).Bootstrap.func1
	github.com/docker/buildx/driver/docker-container/driver.go:75
github.com/docker/buildx/util/progress.Wrap
	github.com/docker/buildx/util/progress/progress.go:47
github.com/docker/buildx/driver/docker-container.(*Driver).Bootstrap
	github.com/docker/buildx/driver/docker-container/driver.go:71
github.com/docker/buildx/driver.Boot
	github.com/docker/buildx/driver/driver.go:96
github.com/docker/buildx/builder.(*Builder).Boot.(*Builder).Boot.func1.func2
	github.com/docker/buildx/builder/builder.go:185
golang.org/x/sync/errgroup.(*Group).Go.func1
	golang.org/x/sync@v0.6.0/errgroup/errgroup.go:78
runtime.goexit
	runtime/asm_amd64.s:1650
  2. success
    docker --debug run -d --privileged --name testbk moby/buildkit:latest --debug
$ docker --debug run -d --privileged --name testbk moby/buildkit:latest --debug
Unable to find image 'moby/buildkit:latest' locally
latest: Pulling from moby/buildkit
f18232174bc9: Pulling fs layer
5131be6560d8: Pulling fs layer
681f4dd882f5: Pulling fs layer
7c7f793318a7: Pulling fs layer
7c7f793318a7: Waiting
681f4dd882f5: Verifying Checksum
681f4dd882f5: Download complete
f18232174bc9: Verifying Checksum
f18232174bc9: Download complete
f18232174bc9: Pull complete
5131be6560d8: Verifying Checksum
5131be6560d8: Download complete
5131be6560d8: Pull complete
681f4dd882f5: Pull complete
7c7f793318a7: Verifying Checksum
7c7f793318a7: Download complete
7c7f793318a7: Pull complete
Digest: sha256:2c59b0a95f5b2dc103814d69f695a61f131e75f3150ab58ea8afecd75e3d1f9a
Status: Downloaded newer image for moby/buildkit:latest
730c299b540b3272cef604a2435002214542b605f04fa3aecc495dda74bfa9d9

@thaJeztah
Member

thaJeztah commented Feb 21, 2025

@crazy-max panic seems to be in buildkitd (CDI?)

#1 [internal] booting buildkit
#1 pulling image moby/buildkit:buildx-stable-1
#1 pulling image moby/buildkit:buildx-stable-1 15.3s done
#1 creating container buildx_buildkit_buildkit-builder0
#1 creating container buildx_buildkit_buildkit-builder0 17.1s done
#1 32.42 panic: runtime error: invalid memory address or nil pointer dereference
#1 32.42 goroutine 1 [running, locked to thread]:
#1 32.42 github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
#1 32.42 github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1cdb3cf, 0x8}, {0x0, 0x0, 0xc0005949e8?})
#1 32.42 github.com/fsnotify/fsnotify.(*Watcher).Add(...)
#1 32.42 tags.cncf.io/container-device-interface/pkg/cdi.(*watch).update(0xc00019f610, 0xc0004f99b0, {0x0, 0x0, 0xc000594a18?})
#1 32.42 tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).refreshIfRequired(0xc00028c8c0, 0x0?)
#1 32.42 	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:130 +0x9d
#1 32.42 main.getCDIManager.func1(...)
#1 32.42 	/src/cmd/buildkitd/main.go:1067 +0x8f
#1 32.42 main.getCDIManager(...)
#1 32.42 	/src/cmd/buildkitd/main.go:1071 +0xf1
#1 32.42 main.ociWorkerInitializer(...)
#1 32.42 	/src/cmd/buildkitd/main_oci_worker.go:301 +0x406
#1 32.42 main.newWorkerController(0xc000476000, ...)
#1 32.42 	/src/cmd/buildkitd/main.go:882 +0xfa
#1 32.42 main.newController({0x1fa5538, 0xc00028c280}, ...)
#1 32.42 	/src/cmd/buildkitd/main.go:797 +0x2c5
#1 32.42 main.main.func3(0xc000476000)

(edit: removed control chars for readability)

@elezar

elezar commented Feb 21, 2025

@klihub do you know of any reason that fsnotify would panic like this?

@klihub

klihub commented Feb 21, 2025

@klihub do you know of any reason that fsnotify would panic like this?

No, I have never seen it do that. Did you check if there are any related issues open?

@crazy-max
Member

@entense Thanks for the logs

In any case, do you have a default buildkitd config file in your home directory?

  • $BUILDX_CONFIG/buildkitd.default.toml
  • $DOCKER_CONFIG/buildx/buildkitd.default.toml
  • ~/.docker/buildx/buildkitd.default.toml

I would say no as it seems to run the build in a fresh docker:dind container.

@crazy-max
Member

@entense Can you try with this command so it shows the output?

docker --debug run --rm -t --privileged moby/buildkit:latest --debug

@entense
Author

entense commented Feb 21, 2025

@crazy-max

$ docker --debug run --rm -t --privileged moby/buildkit:latest --debug
Unable to find image 'moby/buildkit:latest' locally
latest: Pulling from moby/buildkit
f18232174bc9: Pulling fs layer
5131be6560d8: Pulling fs layer
681f4dd882f5: Pulling fs layer
7c7f793318a7: Pulling fs layer
7c7f793318a7: Waiting
681f4dd882f5: Verifying Checksum
681f4dd882f5: Download complete
f18232174bc9: Verifying Checksum
f18232174bc9: Download complete
f18232174bc9: Pull complete
5131be6560d8: Verifying Checksum
5131be6560d8: Download complete
5131be6560d8: Pull complete
681f4dd882f5: Pull complete
7c7f793318a7: Verifying Checksum
7c7f793318a7: Download complete
7c7f793318a7: Pull complete
Digest: sha256:2c59b0a95f5b2dc103814d69f695a61f131e75f3150ab58ea8afecd75e3d1f9a
Status: Downloaded newer image for moby/buildkit:latest
INFO[2025-02-21T18:17:33Z] auto snapshotter: using overlayfs            
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xe78fba]
goroutine 1 [running, locked to thread]:
github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:296
github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1ce301f, 0xc}, {0x0, 0x0, 0xc0005149e8?})
	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:372 +0x3a
github.com/fsnotify/fsnotify.(*Watcher).Add(...)
	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:362
tags.cncf.io/container-device-interface/pkg/cdi.(*watch).update(0xc0005031a0, 0xc000827a40, {0x0, 0x0, 0xc000514a18?})
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:572 +0xd9
tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).refreshIfRequired(0xc0003f65a0, 0x0?)
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:217 +0x38
tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).Refresh(0xc0003f65a0)
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:130 +0x9d
main.getCDIManager.func1({0x0, {0x2d29920, 0x3, 0x3}, {0x0, 0x0, 0x0}})
	/src/cmd/buildkitd/main.go:1067 +0x8f
main.getCDIManager({0x0, {0x2d29920, 0x3, 0x3}, {0x0, 0x0, 0x0}})
	/src/cmd/buildkitd/main.go:1071 +0xf1
main.ociWorkerInitializer(0xc000515438?, {0xc0004a4908?, 0xc000010888?, {0x1d041a3?, 0xffffffffffffff9c?}})
	/src/cmd/buildkitd/main_oci_worker.go:301 +0x406
main.newWorkerController(0xc0001e22c0, {0xc0004a4908?, 0xc000010888?, {0x1d041a3?, 0x0?}})
	/src/cmd/buildkitd/main.go:882 +0xfa
main.newController({0x1fa5538, 0xc0003f6190}, 0xc0001e22c0, 0xc0004a4908)
	/src/cmd/buildkitd/main.go:797 +0x2c5
main.main.func3(0xc0001e22c0)
	/src/cmd/buildkitd/main.go:349 +0xc78
github.com/urfave/cli.HandleAction({0x1929120?, 0xc0003d2170?}, 0xc000504a80?)
	/src/vendor/github.com/urfave/cli/app.go:524 +0x50
github.com/urfave/cli.(*App).Run(0xc000504a80, {0xc000050040, 0x2, 0x2})
	/src/vendor/github.com/urfave/cli/app.go:286 +0x79b
main.main()
	/src/cmd/buildkitd/main.go:413 +0x12a7
time="2025-02-21T18:17:33Z" level=debug msg="[hijack] End of stdout"

@crazy-max
Member

@klihub do you know of any reason that fsnotify would panic like this?

@elezar We are vendoring fsnotify 1.7.0 https://github.com/moby/buildkit/blob/v0.20.0/go.mod#L150 while in your case it's 1.5.1: https://github.com/cncf-tags/container-device-interface/blob/v0.8.0/go.mod#L6. Wonder if this is related.

@crazy-max
Member

crazy-max commented Feb 21, 2025

$ docker --debug run --rm -t --privileged moby/buildkit:latest --debug
...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xe78fba]
goroutine 1 [running, locked to thread]:
github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)

@entense Thanks! So not something wrong with buildx.

@crazy-max
Member

crazy-max commented Feb 21, 2025

@entense Can you try with this command and post the output, please?

$ docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged moby/buildkit:latest --debug

and this one:

$ docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged crazymax/buildkit:5767 --debug

@entense
Author

entense commented Feb 21, 2025

@crazy-max

docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged moby/buildkit:latest --debug
$ docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged moby/buildkit:latest --debug
Unable to find image 'moby/buildkit:latest' locally
latest: Pulling from moby/buildkit
f18232174bc9: Pulling fs layer
5131be6560d8: Pulling fs layer
681f4dd882f5: Pulling fs layer
7c7f793318a7: Pulling fs layer
7c7f793318a7: Waiting
681f4dd882f5: Verifying Checksum
681f4dd882f5: Download complete
f18232174bc9: Verifying Checksum
f18232174bc9: Download complete
f18232174bc9: Pull complete
5131be6560d8: Verifying Checksum
5131be6560d8: Download complete
5131be6560d8: Pull complete
681f4dd882f5: Pull complete
7c7f793318a7: Verifying Checksum
7c7f793318a7: Download complete
7c7f793318a7: Pull complete
Digest: sha256:2c59b0a95f5b2dc103814d69f695a61f131e75f3150ab58ea8afecd75e3d1f9a
Status: Downloaded newer image for moby/buildkit:latest
INFO[2025-02-21T18:40:15Z] auto snapshotter: using overlayfs            
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xe78fba]
goroutine 1 [running, locked to thread]:
github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:296
github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1cdb3cf, 0x8}, {0x0, 0x0, 0xc0005949e8?})
	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:372 +0x3a
github.com/fsnotify/fsnotify.(*Watcher).Add(...)
	/src/vendor/github.com/fsnotify/fsnotify/backend_inotify.go:362
tags.cncf.io/container-device-interface/pkg/cdi.(*watch).update(0xc00005b160, 0xc000132ed0, {0x0, 0x0, 0xc000594a18?})
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:572 +0xd9
tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).refreshIfRequired(0xc00045e4b0, 0x0?)
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:217 +0x38
tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).Refresh(0xc00045e4b0)
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:130 +0x9d
main.getCDIManager.func1({0x0, {0x2d29920, 0x3, 0x3}, {0x0, 0x0, 0x0}})
	/src/cmd/buildkitd/main.go:1067 +0x8f
main.getCDIManager({0x0, {0x2d29920, 0x3, 0x3}, {0x0, 0x0, 0x0}})
	/src/cmd/buildkitd/main.go:1071 +0xf1
main.ociWorkerInitializer(0xc000595438?, {0xc00056c908?, 0xc000010450?, {0x1d041a3?, 0xffffffffffffff9c?}})
	/src/cmd/buildkitd/main_oci_worker.go:301 +0x406
main.newWorkerController(0xc0005302c0, {0xc00056c908?, 0xc000010450?, {0x1d041a3?, 0x0?}})
	/src/cmd/buildkitd/main.go:882 +0xfa
main.newController({0x1fa5538, 0xc00045e050}, 0xc0005302c0, 0xc00056c908)
	/src/cmd/buildkitd/main.go:797 +0x2c5
main.main.func3(0xc0005302c0)
	/src/cmd/buildkitd/main.go:349 +0xc78
github.com/urfave/cli.HandleAction({0x1929120?, 0xc0003e6080?}, 0xc00010cfc0?)
	/src/vendor/github.com/urfave/cli/app.go:524 +0x50
github.com/urfave/cli.(*App).Run(0xc00010cfc0, {0xc0001b6000, 0x2, 0x2})
	/src/vendor/github.com/urfave/cli/app.go:286 +0x79b
main.main()
	/src/cmd/buildkitd/main.go:413 +0x12a7
time="2025-02-21T18:40:16Z" level=debug msg="[hijack] End of stdout"
docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged crazymax/buildkit:5767 --debug
$ docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged crazymax/buildkit:5767 --debug
Unable to find image 'crazymax/buildkit:5767' locally
5767: Pulling from crazymax/buildkit
f18232174bc9: Pulling fs layer
ee672d332da4: Pulling fs layer
7da2d16a16d6: Pulling fs layer
cce32a221996: Pulling fs layer
cce32a221996: Waiting
7da2d16a16d6: Verifying Checksum
7da2d16a16d6: Download complete
f18232174bc9: Verifying Checksum
f18232174bc9: Download complete
f18232174bc9: Pull complete
ee672d332da4: Verifying Checksum
ee672d332da4: Download complete
ee672d332da4: Pull complete
7da2d16a16d6: Pull complete
cce32a221996: Verifying Checksum
cce32a221996: Download complete
cce32a221996: Pull complete
Digest: sha256:fec85bfe291c5d5a7b7f54de596411ea71e33b4faa637bdf4a018bb164c79d19
Status: Downloaded newer image for crazymax/buildkit:5767
INFO[2025-02-21T18:41:56Z] auto snapshotter: using overlayfs            
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xeb406c]
goroutine 1 [running, locked to thread]:
github.com/fsnotify/fsnotify.(*Watcher).Add(...)
	/src/vendor/github.com/fsnotify/fsnotify/fsnotify.go:313
tags.cncf.io/container-device-interface/pkg/cdi.(*watch).update(0xc0006a62b0, 0xc0004a4630, {0x0, 0x0, 0xc000594a18?})
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:572 +0xcc
tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).refreshIfRequired(0xc0001c0000, 0x0?)
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:217 +0x38
tags.cncf.io/container-device-interface/pkg/cdi.(*Cache).Refresh(0xc0001c0000)
	/src/vendor/tags.cncf.io/container-device-interface/pkg/cdi/cache.go:130 +0x9d
main.getCDIManager.func1({0x0, {0x2d2d940, 0x3, 0x3}, {0x0, 0x0, 0x0}})
	/src/cmd/buildkitd/main.go:1067 +0x8f
main.getCDIManager({0x0, {0x2d2d940, 0x3, 0x3}, {0x0, 0x0, 0x0}})
	/src/cmd/buildkitd/main.go:1071 +0xf1
main.ociWorkerInitializer(0xc000595438?, {0xc0004c0908?, 0xc000011db8?, {0x1d0689b?, 0xffffffffffffff9c?}})
	/src/cmd/buildkitd/main_oci_worker.go:301 +0x406
main.newWorkerController(0xc000202000, {0xc0004c0908?, 0xc000011db8?, {0x1d0689b?, 0x1c?}})
	/src/cmd/buildkitd/main.go:882 +0xfa
main.newController({0x1fa8298, 0xc0001c0dc0}, 0xc000202000, 0xc0004c0908)
	/src/cmd/buildkitd/main.go:797 +0x2c5
main.main.func3(0xc000202000)
	/src/cmd/buildkitd/main.go:349 +0xc78
github.com/urfave/cli.HandleAction({0x192b2e0?, 0xc0003d20c0?}, 0xc000504700?)
	/src/vendor/github.com/urfave/cli/app.go:524 +0x50
github.com/urfave/cli.(*App).Run(0xc000504700, {0xc000050040, 0x2, 0x2})
	/src/vendor/github.com/urfave/cli/app.go:286 +0x79b
main.main()
	/src/cmd/buildkitd/main.go:413 +0x12a7
time="2025-02-21T18:41:57Z" level=debug msg="[hijack] End of stdout"

@tonistiigi
Member

@entense Could you run again with the tonistiigi/buildkit:v0.20.0-fsnotify-error-debug image and paste the output.

Afaics the likely issue is in https://github.com/cncf-tags/container-device-interface/blob/main/pkg/cdi/cache.go#L506 where it ignores the error return from NewWatcher. SIGSEGV happens if watch.watcher is nil in isClosed().
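For illustration, here is a minimal, self-contained sketch of that failure mode (simplified names, not the actual CDI source): on a healthy system NewWatcher succeeds, but once the fd limit is hit it fails, and the nil watcher blows up on the next Add.

package main

import "github.com/fsnotify/fsnotify"

// watch loosely mirrors the shape of the CDI cache's watcher wrapper.
type watch struct {
	watcher *fsnotify.Watcher
}

func (w *watch) setup() {
	// Bug pattern: the error is discarded. Under fd exhaustion
	// ("too many open files") NewWatcher fails and w.watcher stays nil.
	w.watcher, _ = fsnotify.NewWatcher()
}

func (w *watch) update(dir string) {
	// Calling Add on a nil *Watcher dereferences nil inside isClosed()
	// and crashes with exactly the SIGSEGV shown in the traces above.
	_ = w.watcher.Add(dir)
}

func main() {
	w := &watch{}
	w.setup()            // succeeds normally; fails silently at the fd limit
	w.update("/etc/cdi") // panics when w.watcher is nil
}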

@entense
Author

entense commented Feb 21, 2025

@tonistiigi

$ docker --debug run --rm -t -e FSNOTIFY_DEBUG=1 --privileged tonistiigi/buildkit:v0.20.0-fsnotify-error-debug --debug
Unable to find image 'tonistiigi/buildkit:v0.20.0-fsnotify-error-debug' locally
v0.20.0-fsnotify-error-debug: Pulling from tonistiigi/buildkit
f18232174bc9: Pulling fs layer
893fbe5bd45d: Pulling fs layer
1dc2dbb20c11: Pulling fs layer
4dcb72f687b1: Pulling fs layer
4dcb72f687b1: Waiting
1dc2dbb20c11: Verifying Checksum
1dc2dbb20c11: Download complete
f18232174bc9: Download complete
f18232174bc9: Pull complete
893fbe5bd45d: Verifying Checksum
893fbe5bd45d: Download complete
893fbe5bd45d: Pull complete
1dc2dbb20c11: Pull complete
4dcb72f687b1: Download complete
4dcb72f687b1: Pull complete
Digest: sha256:52a68b176c30f33afb9d2d5d0a4a7941b4ee700c31153dfafce43644fd309085
Status: Downloaded newer image for tonistiigi/buildkit:v0.20.0-fsnotify-error-debug
INFO[2025-02-21T19:20:37Z] auto snapshotter: using overlayfs            
2025/02/21 19:20:37 NewWatcher error: too many open files
2025/02/21 19:20:37 CDI FSNotify error: /etc/cdi: failed to create watcher: too many open files
2025/02/21 19:20:37 CDI FSNotify error: /var/run/cdi: failed to create watcher: too many open files
2025/02/21 19:20:37 CDI FSNotify error: /etc/buildkit/cdi: failed to create watcher: too many open files
2025/02/21 19:20:37 skip /var/run/cdi, no watcher
2025/02/21 19:20:37 skip /etc/buildkit/cdi, no watcher
2025/02/21 19:20:37 skip /etc/cdi, no watcher
WARN[2025-02-21T19:20:37Z] using host network as the default            
INFO[2025-02-21T19:20:37Z] found worker "0644p9p0pmzcfiz5sjuzprxz6", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:0104d02f8590 org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.selinux.enabled:false org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/amd64/v4 linux/386] 
WARN[2025-02-21T19:20:37Z] skipping containerd worker, as "/run/containerd/containerd.sock" does not exist 
INFO[2025-02-21T19:20:37Z] found 1 workers, default="0644p9p0pmzcfiz5sjuzprxz6" 
WARN[2025-02-21T19:20:37Z] currently, only the default worker can be used. 
INFO[2025-02-21T19:20:37Z] running server on /run/buildkit/buildkitd.sock

@tonistiigi
Member

@klihub That confirms the above theory. A NewWatcher error leaves the watcher as nil, and without the early return behind the "skip /etc/cdi, no watcher" message that leads directly to the panic.
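A guard of roughly this shape (an illustration of that early return as surfaced by the debug image's "skip ..., no watcher" lines, not the final upstream patch) avoids the crash:

package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

type watch struct {
	watcher *fsnotify.Watcher
}

func (w *watch) update(dir string) {
	// Early return when the watcher was never created, matching the
	// "skip /etc/cdi, no watcher" lines printed by the debug image.
	if w.watcher == nil {
		log.Printf("skip %s, no watcher", dir)
		return
	}
	if err := w.watcher.Add(dir); err != nil {
		log.Printf("failed to watch %s: %v", dir, err)
	}
}

func main() {
	(&watch{}).update("/etc/cdi") // nil watcher: logs and returns instead of panicking
}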

I do wonder what the root cause of the "too many open files" is. Is it some system configuration error, or is fsnotify itself (or something in dind/buildkit) causing it?

@khiemcare

khiemcare commented Feb 22, 2025

I came across this thread when we ran into this issue and discovered that our GitHub runner had kept the buildx containers running for each build and eventually consumed all the resources. The short-term fix was to stop and remove all the containers, which resolved it. I figured it is worth sharing in case you are running into the same issue.

@klihub

klihub commented Feb 22, 2025

@klihub That confirms the above theory. NewWatcher error leaves watcher as nil, and without the early return in "skip /etc/cdi, no watcher" that would lead directly to panic.

I do wonder what is the root cause of the "too many open files". Is it some system configuration error, or is the fsnotify itself (or something in dind/buildkit) causing it.

@tonistiigi @elezar Thank you for tracking this down! I'll try to roll a fix for this.

@entense
Author

entense commented Feb 22, 2025

@klihub That confirms the above theory. NewWatcher error leaves watcher as nil, and without the early return in "skip /etc/cdi, no watcher" that would lead directly to panic.

I do wonder what is the root cause of the "too many open files". Is it some system configuration error, or is the fsnotify itself (or something in dind/buildkit) causing it.

I found out something: we have about 120 Docker containers running on our server. If I stop any of them, the error disappears, until I start another container.

It seems that all of them together exceed the limit on the number of open files.

@jrscholey

jrscholey commented Feb 22, 2025

As of Friday 21st Feb, we are also experiencing the same issue in our GitLab build pipelines using docker-dind:20.10 with our k8s GitLab runners, with no configuration change on our side.

On Friday afternoon the error appeared to be intermittent, but from the evening (CET) until now it returns the error 100% of the time. Clearing runner caches/artifacts and restarting our runner deployments has no effect.

I will try to provide more logs when I can, but initially we are seeing the same output as OP in the GitLab build logs.

@entense
Author

entense commented Feb 22, 2025

I managed to solve this temporarily by increasing the limits on the server.

Increase the hard ulimit from 1048576 to 4096000:

sudo nano /etc/security/limits.conf
* soft nofile 4096
* hard nofile 4096000
root soft nofile 4096
root hard nofile 4096000

Add session required pam_limits.so:

sudo nano /etc/pam.d/common-session

Set DefaultLimitNOFILE=4096000 in:

sudo nano /etc/systemd/system.conf
sudo nano /etc/systemd/user.conf

Set default ulimits for containers:

sudo nano /etc/docker/daemon.json
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": 4096000,
      "Hard": 4096000
    }
  }
}

Raise the kernel limits:

# cat /proc/sys/fs/nr_open
echo 4096000 | sudo tee /proc/sys/fs/nr_open
sysctl fs.inotify.max_user_watches
sysctl fs.inotify.max_user_instances
echo 4096000 | sudo tee /proc/sys/fs/inotify/max_user_watches
echo 512 | sudo tee /proc/sys/fs/inotify/max_user_instances

And make them persistent in /etc/sysctl.conf:

fs.nr_open=4096000
fs.inotify.max_user_watches=4096000
sudo sysctl -p

GitLab runner config (config.toml):

...
  [runners.docker]
    tls_verify = false
    image = "docker:dind"
    privileged = true
    ulimits = { nofile = 4096000 }
...

Then reboot the system.

@klihub

klihub commented Feb 22, 2025

@klihub That confirms the above theory. NewWatcher error leaves watcher as nil, and without the early return in "skip /etc/cdi, no watcher" that would lead directly to panic.
I do wonder what is the root cause of the "too many open files". Is it some system configuration error, or is the fsnotify itself (or something in dind/buildkit) causing it.

@tonistiigi @elezar Thank you for tracking this down ! I try to roll a fix for this.

@tonistiigi @elezar Here is a minimal-footprint fix: cncf-tags/container-device-interface#254. Provided that's deemed good enough and gets merged, would you like us to roll a v0.8.1 patch release with only the fix cherry-picked?

@tonistiigi tonistiigi linked a pull request Feb 22, 2025 that will close this issue
@tonistiigi
Member

Pushed test PR #5769 with image tonistiigi/buildkit:cdi-fsnotify-fix with that patch.

would you like us to roll a v0.8.1 patch release with only the fix cherry-picked ?

Yes please.

Anyone hitting this (in GitLab): you can try increasing the limits as suggested above, or use moby/buildkit:v0.19.0 or tonistiigi/buildkit:cdi-fsnotify-fix as a workaround. If you have more details on what may cause reaching the open-file limit in that GitLab configuration, please post them. It is still not clear to me why this setup specifically would reach the limit.

@klihub By default, the CDI directories we listen on do not even exist in the container image configuration. Would that cause a file watcher on the parent (and result in many open files)? If so, why would that be: even if fsnotify needs to listen on the parent, it only cares about the root and not about events from subfiles of that parent dir.

@klihub

klihub commented Feb 22, 2025

@klihub By default the CDI directories we listen on, do not even exist in the container image configuration. Would that cause file watcher on the parent? (and result in many open files). If true, why would that be as even if fsnotify needs to listen on parent it only cares about the root and not about the events from subfiles of that parent dir.

@tonistiigi No, we don't do that. We don't walk up the directory tree and install a set of watches to detect when all missing directories are created. Instead we simply record that we haven't managed to create a watch for a (missing) directory we need to track, and then retry creating the watch when we do a refresh.
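For readers following along, a rough sketch of that record-and-retry behaviour (assumed shape only; the real code lives in pkg/cdi/cache.go):

package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

type watch struct {
	watcher *fsnotify.Watcher
	tracked map[string]bool // dir -> watch successfully installed
}

func (w *watch) refresh(dirs []string) {
	for _, dir := range dirs {
		if w.tracked[dir] {
			continue // already watched
		}
		// Retry: the directory may have been created since the last refresh.
		if err := w.watcher.Add(dir); err != nil {
			log.Printf("still cannot watch %s: %v", dir, err)
			continue
		}
		w.tracked[dir] = true
	}
}

func main() {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer watcher.Close()
	w := &watch{watcher: watcher, tracked: map[string]bool{}}
	w.refresh([]string{"/etc/cdi", "/var/run/cdi"}) // called again on every refresh
}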

@entense mentioned in this comment earlier that the problem seems to arise whenever you have 120 Docker containers running on the server. Stopping one makes the problem go away, and creating one brings it back.

Could we try to figure out whether you really hit a limit that is set too low, or whether some component (maybe us in CDI) leaks file descriptors somewhere, by driving the system to the 120-container state and then checking /proc/$pid/fd of the failing process to see what kind of descriptors are there? That could help narrow down where to look for the culprit.
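In case it helps with that step, a small hypothetical Go helper that counts a process's open descriptors and how many of them are inotify instances (inotify fds show up as anon_inode:inotify symlinks under /proc/$pid/fd on Linux):

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: fdcount <pid>")
		os.Exit(1)
	}
	dir := filepath.Join("/proc", os.Args[1], "fd")
	entries, err := os.ReadDir(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	inotify := 0
	for _, e := range entries {
		// inotify descriptors appear as "anon_inode:inotify" links.
		target, err := os.Readlink(filepath.Join(dir, e.Name()))
		if err == nil && strings.Contains(target, "inotify") {
			inotify++
		}
	}
	fmt.Printf("%d open fds, %d inotify instances\n", len(entries), inotify)
}

Running it against the buildkitd pid inside the dind container at the 120-container mark should show whether the descriptors pile up in buildkitd itself or the limit is simply shared too thin across containers.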

@crazy-max crazy-max added this to the v0.20.1 milestone Feb 24, 2025