fstagni edited this page Aug 21, 2014 · 20 revisions

The main point of this version is the introduction of a new type of pilot which, for the most part, implements the points discussed in https://github.com/DIRACGrid/DIRAC/wiki/Pilots-2.0:-generic,-configurable-pilots. These changes will be transparent to VOs. This version also brings several changes to the Data Management system.

Changes for the pilot

If your VO only uses Grid resources, its pilots are only sent by SiteDirector and TaskQueueDirector agents, and you don't plan to have any specific pilot behaviour, you can stop reading here: you will not notice any difference between the new pilot and the old one.

If instead you want, for example, to install DIRAC in a different way, or you want your pilot to perform some VO-specific action, you should carefully read RFC 18 and what follows. You should also keep reading if your resources include IAAS and IAAC resources, such as Virtual Machines.

The files to consider are in https://github.com/DIRACGrid/DIRAC/tree/rel-v6r12/WorkloadManagementSystem/PilotAgent. The main file to look at is https://github.com/DIRACGrid/DIRAC/blob/rel-v6r12/WorkloadManagementSystem/PilotAgent/dirac-pilot.py, which also contains a good explanation of how the system works.

The system works with "commands", as explained in the RFC. Any command can be added. If your command is executed before the "InstallDIRAC" command, be aware that DIRAC functionality will not be available to it.
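The ordering constraint can be illustrated with a minimal sketch. All class and function names below (`PilotState`, `Command`, `CheckWorkerNode`, `MyVOSetup`, `run_commands`) are hypothetical and are not the classes defined in pilotCommands.py. Only the idea of an ordered list of commands, with "InstallDIRAC" as the point after which DIRAC becomes usable, comes from the text above.

```python
class PilotState:
    """Hypothetical state shared by all commands of one pilot run."""

    def __init__(self):
        self.dirac_installed = False
        self.log = []


class Command:
    """Hypothetical base class: one pilot step with an execute() method."""

    needs_dirac = False  # set to True if the step relies on DIRAC being installed

    def execute(self, state):
        raise NotImplementedError


class CheckWorkerNode(Command):
    """Example step that runs before the installation and needs no DIRAC code."""

    def execute(self, state):
        state.log.append("CheckWorkerNode")


class InstallDIRAC(Command):
    """Stand-in for the installation step; here it only flips a flag."""

    def execute(self, state):
        state.dirac_installed = True
        state.log.append("InstallDIRAC")


class MyVOSetup(Command):
    """Example VO-specific step that relies on DIRAC being installed."""

    needs_dirac = True

    def execute(self, state):
        state.log.append("MyVOSetup")


def run_commands(commands):
    """Run the commands in order, refusing DIRAC-dependent steps that come too early."""
    state = PilotState()
    for command in commands:
        if command.needs_dirac and not state.dirac_installed:
            raise RuntimeError(
                "%s needs DIRAC but runs before InstallDIRAC" % type(command).__name__
            )
        command.execute(state)
    return state
```

In this sketch, `run_commands([CheckWorkerNode(), InstallDIRAC(), MyVOSetup()])` succeeds, while placing `MyVOSetup()` before `InstallDIRAC()` raises an error.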

We have introduced a special command named "GetPilotVersion" in https://github.com/DIRACGrid/DIRAC/blob/rel-v6r12/WorkloadManagementSystem/PilotAgent/pilotCommands.py that you should use, and possibly extend, if you want to send or start pilots that don't know beforehand which (VO)DIRAC version they are going to install. In this case, you have to provide a freely accessible JSON file that contains the pilot version. This is typically the case for VMs in IAAS and IAAC.

Beware that sending pilots containing a specific list of commands via SiteDirector agents requires a SiteDirector extension.
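As an illustration of reading a pilot version from such a JSON file, here is a minimal sketch. The file layout (a top-level key per setup holding a "Version" list), the setup name "MySetup", the version string "v1r0", and the function name `pilot_version_from_json` are all assumptions made for this example. Check the GetPilotVersion command in pilotCommands.py for the layout it actually expects.

```python
import json


def pilot_version_from_json(text, setup):
    """Return the first version listed for `setup` in the JSON text.

    Assumed (hypothetical) layout: {"<setup>": {"Version": ["<version>", ...]}}
    """
    data = json.loads(text)
    try:
        versions = data[setup]["Version"]
    except KeyError:
        raise ValueError("no pilot version defined for setup %r" % setup)
    if not versions:
        raise ValueError("empty version list for setup %r" % setup)
    return str(versions[0])


# Hypothetical file content, for illustration only
example = '{"MySetup": {"Version": ["v1r0"]}}'
version = pilot_version_from_json(example, "MySetup")
```

In a real pilot the text would first be downloaded from the freely accessible location where the VO publishes the file.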

Changes in the WMS

It is now possible to set the delay after which jobs in final states are removed from the WMS, via the JobCleaningAgent CS parameters RemoveStatusDelay/Done, RemoveStatusDelay/Killed and RemoveStatusDelay/Failed (the default is 7 days).
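The effect of these delays can be sketched as follows. This is not the JobCleaningAgent code: the function `is_removable` and its arguments are hypothetical, and it assumes the delay is counted in days from the job's last update.

```python
from datetime import datetime, timedelta

DEFAULT_DELAY_DAYS = 7  # default stated above
FINAL_STATES = ("Done", "Killed", "Failed")


def is_removable(status, last_update, now, delays=None):
    """Return True if a job in a final state is older than its configured delay.

    `delays` maps a status to a number of days, mirroring the
    RemoveStatusDelay/<Status> options; missing entries use the default.
    """
    if status not in FINAL_STATES:
        return False
    days = (delays or {}).get(status, DEFAULT_DELAY_DAYS)
    return now - last_update >= timedelta(days=days)


# Example: keep Failed jobs for 14 days, everything else for the default 7
now = datetime(2014, 8, 21)
delays = {"Failed": 14}
old_done = is_removable("Done", now - timedelta(days=8), now, delays)
recent_failed = is_removable("Failed", now - timedelta(days=8), now, delays)
```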

Changes for the DFC (DIRAC File Catalog)

As visible in https://github.com/DIRACGrid/DIRAC/pull/1983, some fixes and improvements of the DFC require the tables to use the InnoDB engine. It is thus necessary to update your DB so that all the tables use that engine (ALTER TABLE myTable ENGINE = INNODB;).
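When many tables have to be converted, the statements can be generated rather than typed by hand. The sketch below only builds the SQL strings. The helper name `innodb_statements` and the table names are placeholders, and the list of tables should come from your own DFC database (for example from SHOW TABLES).

```python
def innodb_statements(table_names):
    """Build one ALTER TABLE statement per table, switching it to InnoDB."""
    return ["ALTER TABLE `%s` ENGINE = INNODB;" % name for name in table_names]


# Placeholder table names, for illustration only
statements = innodb_statements(["myTable", "otherTable"])
for statement in statements:
    print(statement)
```

It is prudent to back up the database before running the generated statements, since converting large tables can take a long time.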

Changes for FTS (MANDATORY)

Work still ongoing.

There are some mandatory changes in the CS structure, even if you choose to keep using FTS2.

  • In the DataManagement section of Operations, a new flag is needed: 'FTSVersion', whose value can be 'FTS2' (default) or 'FTS3'
  • Still in Operations/[default or setup]/DataManagement/ a new nested section needs to be created:
  FTSPlacement
  {
    FTS2
    {
      ...
    }
    FTS3
    {
      # How to choose the FTS server. It can be:
      # Random : choose random from the list
      # Sequence : one after the other
      # Failover : always use the first one, goes to the next if problem
      ServerPolicy = Random 
    }
  }
  • The section Systems/DataManagement/Services/FTSManager/FTSStrategy can be removed, and its content moved to the previously created section Operations/[default or setup]/DataManagement/FTSPlacement/FTS2
  • The section /Resources/FTSEndpoints also needs to be divided into FTS2 and FTS3. The previous list of servers can go in FTS2. BEWARE: the FTS3 servers must point to the REST API port (default 8446)
  • In Systems/DataManagement/Agents/FTSAgents, the attribute FTSGraphValidityPeriod is removed, and the attribute RWAccessValidityPeriod is replaced with FTSPlacementValidityPeriod
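The three ServerPolicy values described in the FTS3 section above can be sketched as follows. This is an illustration of the policies as worded in the comments, not the DIRAC implementation: the function `make_chooser`, the `failed` argument, and the server URLs are hypothetical.

```python
import random


def make_chooser(servers, policy, rng=None):
    """Return a function that picks an FTS server according to `policy`."""
    servers = list(servers)
    if not servers:
        raise ValueError("at least one server is required")
    rng = rng or random.Random()
    position = [0]  # index of the next server for the Sequence policy

    def choose(failed=()):
        if policy == "Random":
            # Random: pick any server from the list
            return rng.choice(servers)
        if policy == "Sequence":
            # Sequence: one after the other, wrapping around at the end
            server = servers[position[0] % len(servers)]
            position[0] += 1
            return server
        if policy == "Failover":
            # Failover: always the first server, unless it is marked as failed
            for server in servers:
                if server not in failed:
                    return server
            raise RuntimeError("all servers are marked as failed")
        raise ValueError("unknown policy: %s" % policy)

    return choose


# Hypothetical endpoints on the REST API port mentioned above
endpoints = ["https://fts3-a.example.org:8446", "https://fts3-b.example.org:8446"]
choose = make_chooser(endpoints, "Failover")
primary = choose()
backup = choose(failed={endpoints[0]})
```

Here only the Failover policy consults the `failed` set. Whether Random and Sequence should also skip servers with problems is a design choice this sketch leaves open.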

The FTS3 REST API release is needed in the externals, but is not yet deployed. For testing, you can get it here: https://github.com/cern-it-sdc-id/fts3-rest/tree/master

Changes for ResourceManagementDB (Resource Status System)

As committed in https://github.com/DIRACGrid/DIRAC/pull/1950, there is a new field in the DowntimeCache table: 'GOCDBServiceType' : 'VARCHAR(32) NOT NULL'
