@@ -341,16 +341,13 @@ Working with the scheduler is difficult. Challenges include:
   later. For example, data unpacked from the CIB can safely be used anytime
   after ``unpack_cib()``, but actions may become optional or required anytime
   before ``pcmk__create_graph()``. There's no easy way to deal with this.
-* Many names of struct members, functions, etc., are suboptimal, but are part
-  of the public API and cannot be changed until an API backward compatibility
-  break.
 
 
 .. index::
    single: pcmk_scheduler_t
 
-Cluster Working Set
-___________________
+The Scheduler Object
+____________________
 
 The main data object for the scheduler is ``pcmk_scheduler_t``, which contains
 all information needed about nodes, resources, constraints, etc., both as the
@@ -363,18 +360,21 @@ transition graph XML. The variable name is usually ``scheduler``.
 Resources
 _________
 
-``pcmk_resource_t`` is the data object representing cluster resources. A
-resource has a variant: :term:`primitive`, group, clone, or :term:`bundle`.
+``pcmk_resource_t`` is the data object representing cluster resources. It has a
+couple of public members for backward compatibility reasons, but most of the
+implementation is in the internal ``pcmk__resource_private_t`` type.
 
-The resource object has members for two sets of methods,
-``resource_object_functions_t`` from the ``libpe_status`` public API, and
-``resource_alloc_functions_t`` whose implementation is internal to
+A resource has a variant: :term:`primitive`, group, clone, or :term:`bundle`.
+
+The private resource object has members for two sets of methods,
+``pcmk__rsc_methods_t`` from ``libcrmcommon``, and
+``pcmk__assignment_methods_t`` whose implementation is internal to
 ``libpacemaker``. The actual functions vary by variant.
 
-The object functions have basic capabilities such as unpacking the resource
+The resource methods have basic capabilities such as unpacking the resource
 XML, and determining the current or planned location of the resource.
 
-The :term:`assignment <assign>` functions have more obscure capabilities needed
+The :term:`assignment <assign>` methods have more obscure capabilities needed
 for scheduling, such as processing location and ordering constraints. For
 example, ``pcmk__create_internal_constraints()`` simply calls the
 ``internal_constraints()`` method for each top-level resource in the cluster.
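
The per-variant method dispatch described in this hunk can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in (``struct resource``, ``struct assignment_methods``, ``methods_by_variant``, and the counter used to make the effect visible); it is not Pacemaker's actual type layout, only an illustration of a method table selected by variant and invoked once per top-level resource.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Pacemaker's real types */
enum variant { variant_primitive, variant_group };

struct resource;

struct assignment_methods {
    void (*internal_constraints)(struct resource *rsc);
};

struct resource {
    const char *id;
    const struct assignment_methods *methods; /* chosen by variant */
    int constraints_created;                  /* makes the call observable */
};

static void primitive_internal_constraints(struct resource *rsc)
{
    rsc->constraints_created += 1;
}

static void group_internal_constraints(struct resource *rsc)
{
    rsc->constraints_created += 10;
}

/* One method table per variant: "the actual functions vary by variant" */
static const struct assignment_methods methods_by_variant[] = {
    [variant_primitive] = { primitive_internal_constraints },
    [variant_group] = { group_internal_constraints },
};

static void init_resource(struct resource *rsc, const char *id,
                          enum variant variant)
{
    rsc->id = id;
    rsc->methods = &methods_by_variant[variant];
    rsc->constraints_created = 0;
}

/* Call the variant-specific method for each top-level resource */
static void create_internal_constraints(struct resource *top_level,
                                        size_t count)
{
    for (size_t i = 0; i < count; i++) {
        assert(top_level[i].methods != NULL);
        top_level[i].methods->internal_constraints(&top_level[i]);
    }
}
```

The caller never checks the variant itself; it only follows the function pointer, which is why the top-level function can be described as "simply" calling the method.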
@@ -390,25 +390,33 @@ with the highest :term:`score` for a given resource. The scheduler does a bunch
 of processing to generate the scores, then the actual assignment is
 straightforward.
 
+The scheduler node implementation is a little confusing.
+
+``pcmk_node_t`` (``struct pcmk__scored_node``) is the primary object used.
+
+It contains two sub-structs, ``pcmk__node_private_t *priv`` (which is internal)
+and ``struct pcmk__node_details *details`` (which is public for backward
+compatibility reasons), that contain all node information that is independent
+of resource assignment (the node name, etc.).
+
+It contains one other (internal) sub-struct, ``struct pcmk__node_assignment
+*assign``, which contains information particular to a specific resource being
+assigned.
+
 Node lists are frequently used. For example, ``pcmk_scheduler_t`` has a
-``nodes`` member which is a list of all nodes in the cluster, and
-``pcmk_resource_t`` has a ``running_on`` member which is a list of all nodes on
-which the resource is (or might be) active. These are lists of ``pcmk_node_t``
-objects.
+``nodes`` member which is a list of all nodes in the cluster, and the internal
+resource object has an ``active_nodes`` member which is a list of all nodes on
+which the resource is (or might be) active.
 
-The ``pcmk_node_t`` object contains a ``struct pe_node_shared_s *details``
-member with all node information that is independent of resource assignment
-(the node name, etc.).
+Only the scheduler's ``nodes`` list has the full, original node instances. All
+other node lists have shallow copies created by ``pe__copy_node()``, which
+share ``details`` and ``priv`` from the main list (but can differ in their
+``assign`` member).
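
As a rough illustration of the shallow-copy arrangement described in the added text, the sketch below uses simplified, hypothetical structs (``struct scored_node``, ``struct node_details``, ``struct node_assignment``) and a hypothetical ``copy_node()`` helper. It is not the code of ``pe__copy_node()``; it only assumes that a copy shares the resource-independent data by pointer and gets its own assignment data. A ``priv`` pointer would be shared in the same way as ``details`` and is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not Pacemaker's real definitions */
struct node_details {        /* independent of resource assignment */
    const char *name;
    int online;
};

struct node_assignment {     /* specific to one resource being assigned */
    int score;
};

struct scored_node {
    struct node_details *details;   /* shared with the original */
    struct node_assignment *assign; /* owned by each copy */
};

/* Shallow copy: share details, duplicate only the assignment data */
static struct scored_node *copy_node(const struct scored_node *orig)
{
    struct scored_node *copy = NULL;

    assert((orig != NULL) && (orig->assign != NULL));

    copy = calloc(1, sizeof(*copy));
    if (copy == NULL) {
        return NULL;
    }
    copy->assign = calloc(1, sizeof(*copy->assign));
    if (copy->assign == NULL) {
        free(copy);
        return NULL;
    }
    copy->details = orig->details;  /* same pointer, not a duplicate */
    *copy->assign = *orig->assign;  /* independent value */
    return copy;
}

/* Free a copy without touching the shared details */
static void free_node_copy(struct scored_node *copy)
{
    if (copy != NULL) {
        free(copy->assign);
        free(copy);
    }
}
```

With this layout, a score change on a copy stays local to that node list, while a change to the shared details (for example, the node going offline) is visible through every copy.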
 
-The working set's ``nodes`` member contains the original of this information.
-All other node lists contain copies of ``pcmk_node_t`` where only the
-``details`` member points to the originals in the working set's ``nodes`` list.
-In this way, the other members of ``pcmk_node_t`` (such as ``weight``, which is
-the node score) may vary by node list, while the common details are shared.
 
 .. index::
    single: pcmk_action_t
-   single: pe_action_flags
+   single: pcmk__action_flags
 
 Actions
 _______
@@ -418,16 +426,16 @@ taken. These could be resource actions, cluster-wide actions such as fencing a
 node, or "pseudo-actions" which are abstractions used as convenient points for
 ordering other actions against.
 
-It has a ``flags`` member which is a bitmask of ``enum pe_action_flags``. The
-most important of these are ``pe_action_runnable`` (if not set, the action is
-"blocked" and cannot be added to the transition graph) and
-``pe_action_optional`` (actions with this set will not be added to the
-transition graph; actions often start out as optional, and may become required
-later).
+Its (internal) implementation has a ``flags`` member which is a bitmask of
+``enum pcmk__action_flags``. The most important of these are
+``pcmk__action_runnable`` (if not set, the action is "blocked" and cannot be
+added to the transition graph) and ``pcmk__action_optional`` (actions with this
+set will not be added to the transition graph; actions often start out as
+optional, and may become required later).
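
The interaction of the two flags can be sketched as follows. The enum values, ``belongs_in_graph()``, and ``make_required()`` are hypothetical names invented for this example; the real flag values and the scheduler's actual checks may differ. The sketch only encodes what the text states: an action goes into the transition graph when it is runnable and not optional.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only */
enum action_flags {
    action_optional = (1u << 0),
    action_runnable = (1u << 1),
};

/* An action is added to the graph only if runnable and not optional */
static bool belongs_in_graph(uint32_t flags)
{
    bool runnable = (flags & action_runnable) != 0;
    bool optional = (flags & action_optional) != 0;

    return runnable && !optional;
}

/* Actions often start out optional and become required later */
static uint32_t make_required(uint32_t flags)
{
    uint32_t updated = flags & ~(uint32_t) action_optional;

    /* Only the optional bit may have changed */
    assert((updated | action_optional) == (flags | action_optional));
    return updated;
}
```

An action that starts with both bits set is skipped until something clears the optional bit, at which point it qualifies for the graph; a blocked action (runnable bit clear) never qualifies.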
 
 
 .. index::
-   single: pe__colocation_t
+   single: pcmk__colocation_t
 
 Colocations
 ___________
@@ -462,30 +470,45 @@ The resource assignment functions have several methods related to colocations:
 
 
 .. index::
-   single: pe__ordering_t
-   single: pe_ordering
+   single: pcmk__action_relation_t
+   single: action; relation
 
-Orderings
-_________
+Action Relations
+________________
 
 Ordering constraints are simple in concept, but they are one of the most
 important, powerful, and difficult to follow aspects of the scheduler code.
 
-``pe__ordering_t`` is the data object representing an ordering, better thought
-of as a relationship between two actions, since the relation can be more
-complex than just "this one runs after that one".
+``pcmk__action_relation_t`` is the data object representing an ordering, better
+thought of as a relationship between two actions, since the relation can be
+more complex than just "this one runs after that one".
 
-For an ordering "A then B", the code generally refers to A as "first" or
+For a relation "A then B", the code generally refers to A as "first" or
 "before", and B as "then" or "after".
 
-Much of the power comes from ``enum pe_ordering``, which are flags that
-determine how an ordering behaves. There are many obscure flags with big
-effects. A few examples:
-
-* ``pe_order_none`` means the ordering is disabled and will be ignored. It's 0,
-  meaning no flags set, so it must be compared with equality rather than
-  ``pcmk_is_set()``.
-* ``pe_order_optional`` means the ordering does not make either action
-  required, so it only applies if they both become required for other reasons.
-* ``pe_order_implies_first`` means that if action B becomes required for any
-  reason, then action A will become required as well.
+Much of the power comes from ``enum pcmk__action_relation_flags``, which are
+flags that determine how a relation behaves. There are many obscure flags with
+big effects. A few examples:
+
+* ``pcmk__ar_none`` means the relation is disabled and will be ignored. The
+  value is 0, meaning no flags set, so it must be compared with equality rather
+  than ``pcmk_is_set()``.
+* ``pcmk__ar_ordered`` without any other flags set means the relation does not
+  make either action required, so it applies only if they both become required
+  for other reasons.
+* ``pcmk__ar_then_implies_first`` means that if action B becomes required for
+  any reason, then action A will become required as well.
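
The point about the zero-valued flag is easy to get wrong, so here is a minimal sketch. The enum values and the helpers ``is_set()``, ``relation_disabled()``, and ``relation_is_optional()`` are hypothetical stand-ins chosen for this example, with ``is_set()`` assumed to be a conventional "all requested bits are set" mask test; they are not the real Pacemaker definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only */
enum relation_flags {
    ar_none = 0,                       /* no flags set: relation disabled */
    ar_ordered = (1u << 0),
    ar_then_implies_first = (1u << 1),
};

/* Conventional mask test: true if every bit in wanted is set in flags */
static bool is_set(uint32_t flags, uint32_t wanted)
{
    /* A zero mask would match every value, so it is rejected here */
    assert(wanted != 0);
    return (flags & wanted) == wanted;
}

/* The "none" value has no bits, so it must be compared with equality */
static bool relation_disabled(uint32_t flags)
{
    return flags == ar_none;
}

/* "Ordered" with no other flags set makes neither action required */
static bool relation_is_optional(uint32_t flags)
{
    return flags == ar_ordered;
}
```

A mask test against 0 is vacuously true for any flag value, which is why the disabled check uses ``==``. The same reasoning applies to "ordered without any other flags set": that is a statement about the whole value, not about one bit.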
+
+Adding a New Scheduler Regression Test
+______________________________________
+
+#. Choose a test name.
+#. Copy the uncompressed input CIB to cts/scheduler/xml/TESTNAME.xml. It's
+   helpful to add an XML comment at the top describing the essential features of
+   the test (which configuration and status scenarios are being tested).
+#. Edit ``cts/cts-scheduler.in`` and add the test name and description to the
+   ``TESTS`` array.
+#. Run ``cts/cts-scheduler --update --run TESTNAME`` to generate the expected
+   transition graph, scores, etc. Look over the generated files to make sure
+   they are as expected.
+#. Commit your changes.