
Pilots 2.0: generic, configurable pilots

fstagni edited this page May 2, 2014 · 7 revisions

This is a proposal for refactoring the "pilots" code. Its primary goal is to make today's pilots easy to configure and easy to extend.

We start with a definition of a pilot. A pilot is what makes it possible to run jobs on a worker node. A pilot can be sent as a script to be run, or it can be fetched. A pilot can run on every computing resource, e.g. a CREAM Computing Element, a DIRAC Computing Element, or a Virtual Machine (in the form of a contextualization script).

A pilot has, at a minimum, to:

  • install DIRAC
  • configure DIRAC
  • run the JobAgent

A pilot has to run on each and every computing resource type.

Limitations of current solution

The current solution lacks extensibility and maintainability. Its building block is the script dirac-pilot.py, which uses the script dirac-install.py for the installation. Communities can define their own pilot scripts, but it is currently impossible to:

  • add capabilities to the current pilot while inheriting from the base one
  • define different pilots based on the type of the computing resource

Proposal

We propose a solution where:

  • the pilot script is generated, at runtime, server-side
  • a toolbox of pilot capabilities (that we will call "commands") is available for generating the pilot script
  • each command implements a single, atomic function, e.g.:
    • run tests
    • install DIRAC
    • configure DIRAC
    • run JobAgent
    • run monitoring agent
    • report usage
    • ... and whatever else is needed
  • VOs can easily extend the content of the toolbox, adding more commands
  • different computing resource types can run different pilots
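
The generation step described above can be sketched as follows. This is only an illustration under stated assumptions: the names COMMANDS_BY_RESOURCE and generate_pilot_script, the command class names, and the resource-type keys "CREAM" and "VM" are all hypothetical and not part of DIRAC. It also assumes that the command classes are made available to the generated script by some other means.

```python
# Hypothetical mapping: computing resource type -> ordered list of command names.
# Neither the keys nor the command names are taken from the DIRAC code base.
COMMANDS_BY_RESOURCE = {
    "CREAM": ["InstallDIRAC", "ConfigureDIRAC", "RunJobAgent"],
    "VM": ["RunTests", "InstallDIRAC", "ConfigureDIRAC", "RunJobAgent"],
}


def generate_pilot_script(resource_type, extra_commands=()):
    """Return the text of a pilot script for the given resource type.

    extra_commands lets a VO append its own commands to the default list.
    """
    names = list(COMMANDS_BY_RESOURCE[resource_type]) + list(extra_commands)
    lines = [
        "#!/usr/bin/env python",
        "# Generated pilot for resource type: %s" % resource_type,
    ]
    # One line per command: instantiate it and call its single entry point.
    for name in names:
        lines.append("%s().do()" % name)
    return "\n".join(lines) + "\n"


# Example: a CREAM pilot extended by a VO with one additional command.
script = generate_pilot_script("CREAM", extra_commands=["ReportUsage"])
print(script)
```

Keeping the per-resource command lists as plain data means that adding a new resource type, or a VO-specific command, would not require touching the generator itself.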

The proposed solution requires that each command follow a coding convention and be located in a specific part of the code, e.g. "DIRAC.WorkloadManagementSystem.Command.MyCommand":

    class MyCommand(object):

        def do(self):
            """ Here, we specify what the command does """
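
One way such a convention could be exploited is to locate commands by name alone. The sketch below assumes that each command lives in a module named after its class and that do is an instance method. The helpers load_command and run_commands, the package name "demo_toolbox", and the return value of do are hypothetical, introduced only so that the example runs without DIRAC installed.

```python
import importlib
import sys
import types


def load_command(name, package):
    """Return the class `name` defined in the module `<package>.<name>`."""
    module = importlib.import_module("%s.%s" % (package, name))
    return getattr(module, name)


def run_commands(names, package):
    """Instantiate each named command in order and call its do() method."""
    return [load_command(name, package)().do() for name in names]


# --- Demo only -------------------------------------------------------------
# Build a throwaway in-memory package so the sketch is self-contained.
# "demo_toolbox" is a hypothetical stand-in for a real command package.
class MyCommand(object):
    def do(self):
        """Here, we specify what the command does."""
        return "MyCommand done"  # returning a value is an assumption


package = types.ModuleType("demo_toolbox")
package.__path__ = []  # an (empty) __path__ marks the module as a package
module = types.ModuleType("demo_toolbox.MyCommand")
module.MyCommand = MyCommand
sys.modules["demo_toolbox"] = package
sys.modules["demo_toolbox.MyCommand"] = module

results = run_commands(["MyCommand"], "demo_toolbox")
```

With a lookup of this kind, a VO could add a command simply by placing a new module in its own package and listing the command's name in the pilot configuration.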
