Schedule jobs to SE (RFC)
RFC #8
Authors: R.Graciani, A.Casajús
Last Modified: 2013-03-13
DIRAC job scheduling has mainly been driven by directing jobs to sites. Although this mechanism suits most use cases, there are several where it is not enough. Most of these are scenarios where jobs need to be directed to Storage Elements (SEs), a capability DIRAC currently lacks. Here we present a proposal:
Users can define a list of SEs from which the Input Data (ID) has to be read or retrieved. An example use case would be a VO that would like to divide the data processing based on the SE capacity and not the processing power provided by Sites.
The proposal also takes into account that data can be either read directly from the SE or downloaded into the Worker Node for local processing. Users should be able to select how they want to access the data.
Users can define in the job manifest two options:
- AllowDataDownload: Defines how data can be retrieved: either by direct access to the SE only, or also by downloading data to the WN for local processing. The default is false.
- InputSE: Defines a list of SEs from which the data has to be accessed. Every InputData file must have a replica in at least one of the requested SEs. By default this value is undefined, which is equivalent to a list with all SEs.
When the job goes through the optimization chain, if InputSE is defined, all the usable replicas are reduced to the ones in the requested SEs.
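The replica reduction described above can be sketched roughly as follows. This is an illustration only, not DIRAC code: the function name `filter_replicas`, the dictionary layout (file name mapped to the set of SEs holding a replica), and the choice to raise an error when a file has no replica left are all assumptions made for the example.

```python
def filter_replicas(replicas, input_se=None):
    """Keep only the replicas hosted on the requested SEs.

    replicas: dict mapping a file name to the set of SE names holding it
              (hypothetical layout chosen for this sketch).
    input_se: list of requested SE names, or None when InputSE is undefined,
              which is treated as "all SEs".
    """
    if input_se is None:
        # InputSE undefined: every replica stays usable.
        return {lfn: set(ses) for lfn, ses in replicas.items()}

    allowed = set(input_se)
    filtered = {lfn: set(ses) & allowed for lfn, ses in replicas.items()}

    # Every input file needs a replica in at least one requested SE.
    # Raising here is an assumption; the RFC does not say how failure is reported.
    missing = sorted(lfn for lfn, ses in filtered.items() if not ses)
    if missing:
        raise ValueError("No replica in requested SEs for: %s" % ", ".join(missing))
    return filtered
```

For example, with `{"/vo/a": {"SE1", "SE2"}, "/vo/b": {"SE2"}}` and `InputSE = ["SE2"]`, both files remain usable but only through their `SE2` replica.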
At matching time, the JobAgent has to present a list of all Local and Close SEs (Close SEs being SEs from which the job can download data). If the manifest defines InputSE, the requested SEs will be matched against the ones presented by the JobAgent. If AllowDataDownload is set to True, all the SEs presented by the JobAgent will be considered; otherwise only the Local ones will be used for the matching.
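A minimal sketch of this matching rule is given below. The names `candidate_ses` and `job_matches` are hypothetical, and the sketch assumes that "matched against" means the requested SEs and the SEs considered for the JobAgent share at least one entry; the RFC does not spell out the exact criterion.

```python
def candidate_ses(local_ses, close_ses, allow_data_download=False):
    """SEs taken into account for a JobAgent during matching."""
    ses = set(local_ses)
    if allow_data_download:
        # Downloading to the WN is allowed, so Close SEs count as well.
        ses |= set(close_ses)
    return ses


def job_matches(input_se, local_ses, close_ses, allow_data_download=False):
    """Return True if the job's InputSE requirement is compatible with the agent.

    input_se is None when the manifest leaves InputSE undefined; this sketch
    treats that case as "no SE constraint" (an assumption).
    """
    if input_se is None:
        return True
    usable = candidate_ses(local_ses, close_ses, allow_data_download)
    # Assumed criterion: at least one requested SE is usable from this agent.
    return bool(set(input_se) & usable)
```

With `InputSE = ["SE-B"]`, an agent presenting `SE-A` as Local and `SE-B` as Close would match only when AllowDataDownload is true, since `SE-B` can be reached only by downloading.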