add unhandled events #127

Open
10 of 36 tasks
mathisloevenich opened this issue Oct 13, 2021 · 1 comment

Comments

@mathisloevenich
Collaborator

mathisloevenich commented Oct 13, 2021

Job events that are handled at the current state:

  • 000 (Job submitted)
  • 001 (Job executing)
  • 002 (Error in executable)
  • 003 (Job was checkpointed)
  • 004 (Job evicted from machine)
  • 005 (Job terminated)
  • 006 (Image size of job updated)
  • 007 (Shadow exception)
  • 008 (Generic log event) -> does not require handling
  • 009 (Job aborted)
  • 010 (Job was suspended)
  • 011 (Job was unsuspended)
  • 012 (Job was held)
  • 013 (Job was released)
  • 014 (Parallel node executed)
  • 015 (Parallel node terminated)
  • 016 (POST script terminated)
  • Job events between 016 and 022 are not documented and are probably for internal use only
  • 022 (Remote system call socket lost)
  • 023 (Remote system call socket reestablished)
  • 024 (Remote system call reconnect failure)
  • 025 (Grid Resource Back Up)
  • 026 (Detected Down Grid Resource)
  • 027 (Job submitted to grid resource)
  • 028 (Job ad information event triggered.)
  • 029 ()
  • 030 ()
  • 031 ()
  • 032 ()
  • 033 ()
  • 034 ()
  • 035 ()
  • 036 ()
  • 037 ()
  • 038 ()
  • 039 ()
  • 040 ()
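A minimal sketch of how the checklist above could be tracked in code, assuming a plain dictionary registry. The names `HANDLERS`, `IGNORED`, `dispatch`, and `unhandled_codes` are hypothetical and not taken from this project, and the three registered handlers are placeholders rather than the actual handled set. Descriptions are copied from the list; codes without a description are left out.

```python
# Descriptions copied from the checklist; codes listed without text are omitted.
EVENT_DESCRIPTIONS = {
    0: "Job submitted",
    1: "Job executing",
    2: "Error in executable",
    3: "Job was checkpointed",
    4: "Job evicted from machine",
    5: "Job terminated",
    6: "Image size of job updated",
    7: "Shadow exception",
    8: "Generic log event",
    9: "Job aborted",
    10: "Job was suspended",
    11: "Job was unsuspended",
    12: "Job was held",
    13: "Job was released",
    14: "Parallel node executed",
    15: "Parallel node terminated",
    16: "POST script terminated",
    22: "Remote system call socket lost",
    23: "Remote system call socket reestablished",
    24: "Remote system call reconnect failure",
    25: "Grid Resource Back Up",
    26: "Detected Down Grid Resource",
    27: "Job submitted to grid resource",
    28: "Job ad information event triggered",
}

# Hypothetical handlers, for illustration only.
def on_submitted(payload):
    return "job submitted"

def on_executing(payload):
    return "job executing"

def on_terminated(payload):
    return "job terminated"

HANDLERS = {0: on_submitted, 1: on_executing, 5: on_terminated}

# Codes that are deliberately skipped (008 does not require handling).
IGNORED = {8}

def dispatch(code, payload):
    """Run the handler for an event code, or report that it is unhandled."""
    if code in IGNORED:
        return None
    handler = HANDLERS.get(code)
    if handler is None:
        description = EVENT_DESCRIPTIONS.get(code, "no description")
        return f"unhandled event {code:03d} ({description})"
    return handler(payload)

def unhandled_codes(first=0, last=40):
    """List the codes in [first, last] that have no handler and are not ignored."""
    return [c for c in range(first, last + 1)
            if c not in HANDLERS and c not in IGNORED]
```

Calling `unhandled_codes()` would give the remaining to-do list, and the fallback branch in `dispatch` keeps unknown events visible instead of dropping them silently.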
@mathisloevenich
Collaborator Author

mathisloevenich commented Nov 17, 2021

Have a look here: https://htcondor.readthedocs.io/en/latest/codes-other-values/job-event-log-codes.html

Not all job events are handled yet, and some of them might never be worth handling or mentioning.

There is no API for finding out which data each of the different event types carries.

Therefore it would make sense to contact the team of htcondor for this:
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

I've been trying to find information about the JobEvent types in their database, but I couldn't find anything.
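Until there is an answer, one workaround might be to collect the data fields empirically from real logs. A rough sketch, assuming each event can be read as a plain dict with a `"type"` key. That shape is an assumption, the actual htcondor event objects may differ, and the sample events below are made up.

```python
from collections import defaultdict

def payload_keys_by_type(events):
    """Collect the set of payload keys observed for each event type."""
    seen = defaultdict(set)
    for event in events:
        # Everything except the type itself counts as payload.
        seen[event["type"]].update(key for key in event if key != "type")
    return {event_type: sorted(keys) for event_type, keys in seen.items()}

# Made-up sample events, not real HTCondor output.
sample = [
    {"type": 0, "host": "a"},
    {"type": 12, "reason": "x"},
    {"type": 12, "reason": "y", "code": 3},
]

print(payload_keys_by_type(sample))
```

Running this over enough log files would give a per-event overview of the fields that actually occur, which could then be checked with the HTCondor team.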

@mathisloevenich mathisloevenich changed the title from "add unhandled events (JOB_DISCONNECTED, JOB_RECONNECT_FAILED, JOB_EVICTED)" to "add unhandled events" Nov 23, 2021
@mathisloevenich mathisloevenich pinned this issue Nov 23, 2021