Jobs are not deregistered when bringing down a docker container #76

Open
codersaur opened this issue Dec 11, 2024 · 6 comments
Labels
bug Something isn't working

Comments


codersaur commented Dec 11, 2024

I have a strange issue that I can't work out.

Docker discovery is working and jobs are registered when I bring up containers with ofelia labels. However, now that I have multiple jobs for multiple containers across several docker compose projects, I am finding that when I bring down those containers, the associated jobs do not get deregistered as expected. This leads to "No such container" errors next time a job is due...

ERROR  [Job "test" (9714b2942917)] Finished in "1.296536ms", failed: true, skipped: false, error: error creating exec: No such container: testcontainer

The weird thing is that I tested the registration/deregistration of jobs not long ago and it all seemed to be working (in fact I chose the netresearch fork precisely because it doesn't work with the mcuadros image).

What could be causing ofelia to miss docker events or otherwise fail to deregister jobs?

Config:

ofelia:
  container_name: ofelia
  image: ghcr.io/netresearch/ofelia:latest
  command: daemon --config=/etc/ofelia/config.ini
  volumes:
    # Config for global settings (doesn't work as labels):
    - ./volumes/ofelia/etc:/etc/ofelia/:rw
    # Docker socket so jobs defined as labels on other containers can be dynamically registered:
    - /var/run/docker.sock:/var/run/docker.sock:ro
    # Get trusted certs from host:
    - /etc/ssl/certs:/etc/ssl/certs:ro
@mickaelperrin

I can confirm the issue. In the same way, jobs are also not updated if you modify a label and restart the compose project or container.

Currently tested with a docker-compose project, not in a docker swarm environment.

@codersaur
Author

> I can confirm the issue. In the same way, jobs are also not updated if you modify a label and restart the compose project or container.
>
> Currently tested with a docker-compose project, not in a docker swarm environment.

FYI, I have since switched to https://github.com/funkyfuture/deck-chores which doesn't appear to suffer from such fundamental issues.

@mickaelperrin

mickaelperrin commented Feb 3, 2025

Thanks, but deck-chores only covers the equivalent of "ofelia.job-exec"; there is no way to trigger new containers as with "ofelia.job-run", and ofelia seems to have "better" docker swarm support, at least in theory. However, this issue ruins all of those benefits, because basically it needs a restart each time a container with labels is turned on, turned off, or modified, and it's currently not supported.

Maybe, I can hack something using docker-gen.

@CybotTM CybotTM closed this as completed in 9cfba8a Feb 3, 2025
@CybotTM CybotTM reopened this Feb 4, 2025
@CybotTM
Member

CybotTM commented Feb 4, 2025

The reported issue should be fixed, but I assume that configured jobs are now removed as well.

I need to have a look.

@CybotTM CybotTM added the bug Something isn't working label Feb 4, 2025
@mickaelperrin

I compiled the Docker image locally and will use it over the next few days to see whether I encounter any weird behaviour. I did some basic tests and can confirm that:

  • updating the label value of an existing container works (e.g. changing the frequency of the task)
  • stopping a docker-compose project removes all associated tasks

Thx
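
For readers following along, the two behaviours confirmed above (label changes are picked up, and jobs vanish when their container goes away) amount to reconciling the set of registered jobs against the set of jobs currently discovered from labels. The following is only a minimal sketch of that idea, not ofelia's actual implementation: the `Job` type, the `reconcile` function, and the example commands are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Job(NamedTuple):
    """Hypothetical job record; not ofelia's real data structure."""
    schedule: str
    command: str


def reconcile(
    registered: Dict[str, Job], discovered: Dict[str, Job]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (to_add, to_update, to_remove) job names, each sorted."""
    # Jobs seen in labels but not yet scheduled.
    to_add = sorted(name for name in discovered if name not in registered)
    # Jobs still scheduled although their labels are gone.
    to_remove = sorted(name for name in registered if name not in discovered)
    # Jobs whose schedule or command changed in the labels.
    to_update = sorted(
        name
        for name in discovered
        if name in registered and registered[name] != discovered[name]
    )
    return to_add, to_update, to_remove


# Illustrative values only: "test" has lost its container,
# "datecron" has had its schedule label changed.
registered = {"datecron": Job("@every 5s", "date"), "test": Job("@daily", "true")}
discovered = {"datecron": Job("@every 10s", "date")}
print(reconcile(registered, discovered))  # ([], ['datecron'], ['test'])
```

Note that this sketch only tracks label-defined jobs. Jobs that come from a config file would have to be kept out of `registered`, otherwise they would be removed on the first pass, which is the kind of side effect raised earlier in this thread.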

@ogmueller

It seems that the associated tasks are removed once a docker compose project has been stopped, but I get at least 3 email warnings before ofelia notices the change. I used the edge image from yesterday.

Labels:

      - "ofelia.enabled=true"
      - "ofelia.job-exec.datecron.schedule=@every 5s"
      - "ofelia.job-exec.datecron.command=nonexistingcommand"
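
As a side note, the labels above follow the pattern `ofelia.<job-type>.<job-name>.<field>=<value>`. A rough sketch of grouping such labels into job definitions is shown below. The function name `parse_job_labels` is hypothetical, and only the two job types mentioned in this thread (`job-exec` and `job-run`) are handled; this is not how ofelia itself parses labels.

```python
from typing import Dict, Iterable

PREFIX = "ofelia."
# Only the job types mentioned in this thread; an assumption, not a full list.
JOB_TYPES = ("job-exec", "job-run")


def parse_job_labels(labels: Iterable[str]) -> Dict[str, Dict[str, Dict[str, str]]]:
    """Group 'ofelia.<type>.<name>.<field>=<value>' labels by type and name."""
    jobs: Dict[str, Dict[str, Dict[str, str]]] = {}
    for label in labels:
        key, sep, value = label.partition("=")
        if not sep or not key.startswith(PREFIX):
            continue  # not a key=value label, or not an ofelia label
        parts = key[len(PREFIX):].split(".", 2)
        if len(parts) != 3 or parts[0] not in JOB_TYPES:
            continue  # e.g. "ofelia.enabled" is not a job definition
        job_type, name, field = parts
        jobs.setdefault(job_type, {}).setdefault(name, {})[field] = value
    return jobs


labels = [
    "ofelia.enabled=true",
    "ofelia.job-exec.datecron.schedule=@every 5s",
    "ofelia.job-exec.datecron.command=nonexistingcommand",
]
print(parse_job_labels(labels))
```

With the three labels from this comment, the result contains a single `job-exec` job named `datecron` with its `schedule` and `command` fields.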
ofelia                | 2025-02-24T15:17:19.433Z  scheduler.go:53 ▶ NOTICE  New job registered "datecron" - "nonexistingcommand" - "@every 5s" - ID: 2
ofelia                | 2025-02-24T15:17:19.433Z  config.go:192 ▶ DEBUG  checking exec job datecron if new
ofelia                | 2025-02-24T15:17:19.433Z  config.go:249 ▶ DEBUG  no run jobs to update
ofelia                | 2025-02-24T15:17:19.433Z  config.go:274 ▶ DEBUG  no new run jobs
ofelia                | 2025-02-24T15:17:24Z  cron_utils.go:13 ▶ DEBUG  wake
ofelia                | 2025-02-24T15:17:24Z  cron_utils.go:13 ▶ DEBUG  run
ofelia                | 2025-02-24T15:17:24.023Z  common.go:128 ▶ NOTICE  [Job "datecron" (fec729574704)] Started - nonexistingcommand
ofelia                | 2025-02-24T15:17:24.035Z  common.go:128 ▶ NOTICE  [Job "datecron" (fec729574704)] Finished in "11.992565ms", failed: false, skipped: false, error: none
ofelia                | 2025-02-24T15:17:29.002Z  cron_utils.go:13 ▶ DEBUG  wake
ofelia                | 2025-02-24T15:17:29.002Z  cron_utils.go:13 ▶ DEBUG  run
ofelia                | 2025-02-24T15:17:29.006Z  common.go:128 ▶ NOTICE  [Job "datecron" (3cfefbd7b4d2)] Started - nonexistingcommand
ofelia                | 2025-02-24T15:17:29.431Z  config.go:138 ▶ DEBUG  dockerLabelsUpdate started
ofelia                | 2025-02-24T15:17:29.431Z  config.go:143 ▶ DEBUG  dockerLabelsUpdate labels: map[ofelia:map[ofelia.enabled:true ofelia.service:true] traefik:map[ofelia.enabled:true ofelia.job-exec.datecron.command:nonexistingcommand ofelia.job-exec.datecron.schedule:@every 5s]]
ofelia                | 2025-02-24T15:17:29.431Z  config.go:151 ▶ DEBUG  checking exec job datecron for changes
ofelia                | 2025-02-24T15:17:29.431Z  config.go:154 ▶ DEBUG  checking exec job datecron vs datecron
ofelia                | 2025-02-24T15:17:29.432Z  config.go:192 ▶ DEBUG  checking exec job datecron if new
ofelia                | 2025-02-24T15:17:29.432Z  config.go:249 ▶ DEBUG  no run jobs to update
ofelia                | 2025-02-24T15:17:29.432Z  config.go:274 ▶ DEBUG  no new run jobs
ofelia-postfix-relay  | 2025-02-24T16:17:29.639968+01:00 213e79315295 postfix/smtpd[128]: connect from ofelia.ofelia_default[172.18.6.3]
ofelia-postfix-relay  | 2025-02-24T16:17:29.650707+01:00 213e79315295 postfix/smtpd[128]: 9ED6F3DAAD59: client=ofelia.ofelia_default[172.18.6.3]
ofelia-postfix-relay  | 2025-02-24T16:17:29.651354+01:00 213e79315295 postfix/cleanup[131]: 9ED6F3DAAD59: message-id=<>
ofelia-postfix-relay  | 2025-02-24T16:17:29.653577+01:00 213e79315295 postfix/smtpd[128]: disconnect from ofelia.ofelia_default[172.18.6.3] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
ofelia                | 2025-02-24T15:17:29.653Z  common.go:124 ▶ ERROR  [Job "datecron" (3cfefbd7b4d2)] Finished in "615.089337ms", failed: true, skipped: false, error: error creating exec: API error (409): container f99d24f22abcb680acc85fbb716a6aa8ef12b001c270881edd74b88a43c0ddab is not running
ofelia-postfix-relay  | 2025-02-24T16:17:29.653719+01:00 213e79315295 postfix/qmgr[125]: 9ED6F3DAAD59: from=<srv-docker@teqneers.de>, size=2562, nrcpt=1 (queue active)
ofelia-postfix-relay  | 2025-02-24T16:17:29.819970+01:00 213e79315295 postfix/smtp[132]: 9ED6F3DAAD59: to=<srv-docker@teqneers.de>, relay=abbe.teqneers.de[5.9.198.18]:25, delay=0.17, delays=0.01/0.02/0.11/0.03, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as C22E6C48EFC)
ofelia-postfix-relay  | 2025-02-24T16:17:29.820994+01:00 213e79315295 postfix/qmgr[125]: 9ED6F3DAAD59: removed
ofelia                | 2025-02-24T15:17:34.001Z  cron_utils.go:13 ▶ DEBUG  wake
ofelia                | 2025-02-24T15:17:34.001Z  cron_utils.go:13 ▶ DEBUG  run
ofelia                | 2025-02-24T15:17:34.005Z  common.go:128 ▶ NOTICE  [Job "datecron" (1734601a0b16)] Started - nonexistingcommand
ofelia-postfix-relay  | 2025-02-24T16:17:34.008675+01:00 213e79315295 postfix/smtpd[128]: connect from ofelia.ofelia_default[172.18.6.3]
ofelia-postfix-relay  | 2025-02-24T16:17:34.009092+01:00 213e79315295 postfix/smtpd[128]: 0232F3DAAD68: client=ofelia.ofelia_default[172.18.6.3]
ofelia-postfix-relay  | 2025-02-24T16:17:34.009358+01:00 213e79315295 postfix/cleanup[131]: 0232F3DAAD68: message-id=<>
ofelia-postfix-relay  | 2025-02-24T16:17:34.011457+01:00 213e79315295 postfix/smtpd[128]: disconnect from ofelia.ofelia_default[172.18.6.3] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
ofelia                | 2025-02-24T15:17:34.011Z  common.go:124 ▶ ERROR  [Job "datecron" (1734601a0b16)] Finished in "2.770582ms", failed: true, skipped: false, error: error creating exec: No such container: traefik
ofelia-postfix-relay  | 2025-02-24T16:17:34.011539+01:00 213e79315295 postfix/qmgr[125]: 0232F3DAAD68: from=<srv-docker@teqneers.de>, size=2558, nrcpt=1 (queue active)
ofelia-postfix-relay  | 2025-02-24T16:17:34.187950+01:00 213e79315295 postfix/smtp[132]: 0232F3DAAD68: to=<srv-docker@teqneers.de>, relay=abbe.teqneers.de[5.9.198.18]:25, delay=0.18, delays=0/0/0.14/0.03, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 27B79C48EFD)
ofelia-postfix-relay  | 2025-02-24T16:17:34.188709+01:00 213e79315295 postfix/qmgr[125]: 0232F3DAAD68: removed
ofelia                | 2025-02-24T15:17:39.001Z  cron_utils.go:13 ▶ DEBUG  wake
ofelia                | 2025-02-24T15:17:39.001Z  cron_utils.go:13 ▶ DEBUG  run
ofelia                | 2025-02-24T15:17:39.005Z  common.go:128 ▶ NOTICE  [Job "datecron" (2f981f43fb13)] Started - nonexistingcommand
ofelia-postfix-relay  | 2025-02-24T16:17:39.013051+01:00 213e79315295 postfix/smtpd[128]: connect from ofelia.ofelia_default[172.18.6.3]
ofelia-postfix-relay  | 2025-02-24T16:17:39.014692+01:00 213e79315295 postfix/smtpd[128]: 038363DAAD6F: client=ofelia.ofelia_default[172.18.6.3]
ofelia-postfix-relay  | 2025-02-24T16:17:39.015872+01:00 213e79315295 postfix/cleanup[131]: 038363DAAD6F: message-id=<>
ofelia-postfix-relay  | 2025-02-24T16:17:39.025146+01:00 213e79315295 postfix/smtpd[128]: disconnect from ofelia.ofelia_default[172.18.6.3] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
ofelia-postfix-relay  | 2025-02-24T16:17:39.025422+01:00 213e79315295 postfix/qmgr[125]: 038363DAAD6F: from=<srv-docker@teqneers.de>, size=2558, nrcpt=1 (queue active)
ofelia                | 2025-02-24T15:17:39.025Z  common.go:124 ▶ ERROR  [Job "datecron" (2f981f43fb13)] Finished in "5.861274ms", failed: true, skipped: false, error: error creating exec: No such container: traefik
ofelia-postfix-relay  | 2025-02-24T16:17:39.201573+01:00 213e79315295 postfix/smtp[132]: 038363DAAD6F: to=<srv-docker@teqneers.de>, relay=abbe.teqneers.de[5.9.198.18]:25, delay=0.19, delays=0.01/0/0.14/0.03, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 2B5A4C48EFE)
ofelia-postfix-relay  | 2025-02-24T16:17:39.201857+01:00 213e79315295 postfix/qmgr[125]: 038363DAAD6F: removed
ofelia                | 2025-02-24T15:17:39.43Z  config.go:138 ▶ DEBUG  dockerLabelsUpdate started
ofelia                | 2025-02-24T15:17:39.43Z  config.go:143 ▶ DEBUG  dockerLabelsUpdate labels: map[ofelia:map[ofelia.enabled:true ofelia.service:true]]
ofelia                | 2025-02-24T15:17:39.43Z  config.go:151 ▶ DEBUG  checking exec job datecron for changes
ofelia                | 2025-02-24T15:17:39.43Z  scheduler.go:58 ▶ NOTICE  Job deregistered (will not fire again) "datecron" - "nonexistingcommand" - "@every 5s" - ID: 2
ofelia                | 2025-02-24T15:17:39.43Z  config.go:181 ▶ DEBUG  removing exec job datecron
ofelia                | 2025-02-24T15:17:39.43Z  config.go:210 ▶ DEBUG  no new exec jobs
ofelia                | 2025-02-24T15:17:39.43Z  config.go:249 ▶ DEBUG  no run jobs to update
ofelia                | 2025-02-24T15:17:39.43Z  config.go:274 ▶ DEBUG  no new run jobs
ofelia                | 2025-02-24T15:17:39.43Z  cron_utils.go:13 ▶ DEBUG  removed
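
Reading the timestamps in the log above, the two `dockerLabelsUpdate started` lines are about 10 seconds apart, while the job fires every 5 seconds. That spacing is inferred from this log only, not from any documented setting. The small sketch below counts how many job runs start between the first failing run and the deregistration line; the helper names `at` and `runs_against_missing_container` are made up for this illustration.

```python
from datetime import datetime
from typing import List


def at(clock: str) -> datetime:
    """Parse an 'HH:MM:SS.fff' wall-clock time copied from the log."""
    return datetime.strptime(clock, "%H:%M:%S.%f")


def runs_against_missing_container(
    job_starts: List[datetime], first_failure: datetime, deregistered: datetime
) -> int:
    """Count job starts from the first failing run until deregistration."""
    return sum(1 for start in job_starts if first_failure <= start < deregistered)


# "Started" timestamps of the datecron job, taken from the log.
job_starts = [at("15:17:24.023"), at("15:17:29.006"), at("15:17:34.005"), at("15:17:39.005")]
# The two "dockerLabelsUpdate started" timestamps.
polls = [at("15:17:29.431"), at("15:17:39.430")]
# The run started at 15:17:29.006 is the first one reported as failed.
first_failure = at("15:17:29.006")
# "Job deregistered" is logged during the second label update.
deregistered = at("15:17:39.430")

interval = polls[1] - polls[0]
print(interval.total_seconds())  # 9.999
print(runs_against_missing_container(job_starts, first_failure, deregistered))  # 3
```

Under these assumptions, three runs fall into the window before the label update removes the job, which matches the three warning emails visible in the postfix log lines.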
