
Schedule jobs to SE (RFC)

acasajus edited this page Mar 13, 2013 · 6 revisions

RFC #8

Authors: R.Graciani, A.Casajús

Last Modified: 2013-03-13

Motivation

DIRAC job scheduling has mainly been driven by directing jobs to sites. Although this mechanism suits most use cases, there are several where it is not enough. Most of these are scenarios where jobs need to be directed to Storage Elements (SEs), something DIRAC currently lacks. Here we present a proposal:

Goal of the proposal

Users can define a list of SEs from which the Input Data (ID) has to be read or retrieved. An example use case would be a VO that would like to divide the data processing based on SE capacity rather than on the processing power provided by Sites.

The proposal also takes into account that data can be either read directly from the SE or downloaded into the Worker Node for local processing. Users should be able to select how they want to access the data.

Proposed implementation

Users can define two options in the job manifest:

  • AllowDataDownload: Defines how data can be retrieved: either by direct access to the SE, or by allowing data to be downloaded to the WN for local processing. Defaults to false.
  • InputSE: Defines a list of SEs from which the data has to be accessed. Every InputData file must have a replica in at least one of the requested SEs. By default this value is undefined, which is equivalent to a list with all SEs.
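
The InputSE constraint above can be illustrated with a minimal Python sketch. It is not DIRAC code: the function name `missing_from_input_se`, the dictionary layout of the manifest and of the replica catalogue, and the SE and file names are all hypothetical, chosen only to show the rule that every input file needs a replica in at least one requested SE, and that an undefined InputSE behaves like a list with all SEs.

```python
from typing import Dict, Iterable, List, Optional, Set


def missing_from_input_se(
    replicas: Dict[str, Set[str]],
    input_se: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the input files that have no replica in any requested SE.

    replicas maps each input file name to the set of SEs holding a replica.
    input_se is the requested SE list; None (undefined) means "all SEs".
    """
    if input_se is None:
        # Undefined InputSE is treated as a list with all SEs, so only
        # files with no replica at all fail the check.
        return sorted(lfn for lfn, ses in replicas.items() if not ses)
    allowed = set(input_se)
    return sorted(lfn for lfn, ses in replicas.items() if not (ses & allowed))


# Hypothetical manifest options and replica catalogue, for illustration only.
manifest = {"AllowDataDownload": False, "InputSE": ["SE-A", "SE-B"]}
replicas = {
    "/vo/data/file1": {"SE-A", "SE-C"},
    "/vo/data/file2": {"SE-C"},
}

missing = missing_from_input_se(replicas, manifest.get("InputSE"))
print(missing)  # ['/vo/data/file2']
```

In this sketch a job whose check returns a non-empty list could not be scheduled against the requested SEs. How AllowDataDownload would then influence the choice between direct access and download is left open, as in the proposal.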

TODO: finish when days have more than 24 hours.
