Commit 96c989b

Merge pull request #4178 from STEllAR-GROUP/module_checkpoint
Move checkpointing support to its own module
2 parents 3b0408f + 90be121 commit 96c989b

26 files changed

+437
-155
lines changed

docs/CMakeLists.txt

+2-1
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,6 @@ set(doxygen_dependencies
5858
"${PROJECT_SOURCE_DIR}/hpx/lcos/when_some.hpp"
5959
"${PROJECT_SOURCE_DIR}/hpx/lcos/wait_each.hpp"
6060
"${PROJECT_SOURCE_DIR}/hpx/lcos/when_each.hpp"
61-
"${PROJECT_SOURCE_DIR}/hpx/util/checkpoint.hpp"
6261
"${PROJECT_SOURCE_DIR}/hpx/util/debugging.hpp"
6362
"${PROJECT_SOURCE_DIR}/hpx/util/pack_traversal.hpp"
6463
"${PROJECT_SOURCE_DIR}/hpx/util/pack_traversal_async.hpp"
@@ -260,6 +259,8 @@ create_symbolic_link("${PROJECT_SOURCE_DIR}/examples"
260259
"${CMAKE_CURRENT_BINARY_DIR}/examples")
261260
create_symbolic_link("${PROJECT_SOURCE_DIR}/tests"
262261
"${CMAKE_CURRENT_BINARY_DIR}/tests")
262+
create_symbolic_link("${PROJECT_SOURCE_DIR}/libs"
263+
"${CMAKE_CURRENT_BINARY_DIR}/libs")
263264

264265
hpx_source_to_doxygen(hpx_autodoc
265266
DEPENDENCIES ${doxygen_dependencies})

docs/sphinx/manual/miscellaneous.rst

-127
@@ -150,133 +150,6 @@ Utilities in |hpx|
150150
In order to ease the burden of programming in |hpx| we have provided several
151151
utilities to users. The following section documents those facilities.
152152

153-
.. _checkpoint:
154-
155-
Checkpoint
156-
----------
157-
158-
A common need of users is to periodically backup an application. This practice
159-
provides resiliency and potential restart points in code. We have developed the
160-
concept of a ``checkpoint`` to support this use case.
161-
162-
Found in ``hpx/util/checkpoint.hpp``, ``checkpoint``\ s are defined as objects
163-
which hold a serialized version of an object or set of objects at a particular
164-
moment in time. This representation can be stored in memory for later use or it
165-
can be written to disk for storage and/or recovery at a later point. In order to
166-
create and fill this object with data we use a function called
167-
``save_checkpoint``. In code the function looks like this::
168-
169-
hpx::future<hpx::util::checkpoint> hpx::util::save_checkpoint(a, b, c, ...);
170-
171-
``save_checkpoint`` takes arbitrary data containers such as int, double, float,
172-
vector, and future and serializes them into a newly created ``checkpoint``
173-
object. This function returns a ``future`` to a ``checkpoint`` containing the
174-
data. Let us look a simple use case below::
175-
176-
using hpx::util::checkpoint;
177-
using hpx::util::save_checkpoint;
178-
179-
std::vector<int> vec{1,2,3,4,5};
180-
hpx::future<checkpoint> save_checkpoint(vec);
181-
182-
Once the future is ready the checkpoint object will contain the ``vector``
183-
``vec`` and its five elements.
184-
185-
It is also possible to modify the launch policy used by ``save_checkpoint``.
186-
This is accomplished by passing a launch policy as the first argument. It is
187-
important to note that passing ``hpx::launch::sync`` will cause
188-
``save_checkpoint`` to return a ``checkpoint`` instead of a ``future`` to a
189-
``checkpoint``. All other policies passed to ``save_checkpoint`` will return a
190-
``future`` to a ``checkpoint``.
191-
192-
Sometimes ``checkpoint`` s must be declared before they are used.
193-
``save_checkpoint`` allows users to move pre-created ``checkpoint`` s into the
194-
function as long as they are the first container passing into the function (In
195-
the case where a launch policy is used, the ``checkpoint`` will immediately
196-
follow the launch policy). An example of these features can be found below:
197-
198-
.. literalinclude:: ../../tests/unit/util/checkpoint.cpp
199-
:language: c++
200-
:lines: 27-38
201-
202-
Now that we can create ``checkpoint`` s we now must be able to restore the
203-
objects they contain into memory. This is accomplished by the function
204-
``restore_checkpoint``. This function takes a ``checkpoint`` and fills its data
205-
into the containers it is provided. It is important to remember that the
206-
containers must be ordered in the same way they were placed into the
207-
``checkpoint``. For clarity see the example below:
208-
209-
.. literalinclude:: ../../tests/unit/util/checkpoint.cpp
210-
:language: c++
211-
:lines: 41-49
212-
213-
The core utility of ``checkpoint`` is in its ability to make certain data
214-
persistent. Often this means that the data is needed to be stored in an object,
215-
such as a file, for later use. For these cases we have provided two solutions:
216-
stream operator overloads and access iterators.
217-
218-
We have created the two stream overloads
219-
``operator<<`` and ``operator>>`` to stream data
220-
out of and into ``checkpoint``. You can see an
221-
example of the overloads in use below:
222-
223-
.. literalinclude:: ../../tests/unit/util/checkpoint.cpp
224-
:language: c++
225-
:lines: 176-186
226-
227-
This is the primary way to move data into and out of a ``checkpoint``. It is
228-
important to note, however, that users should be cautious when using a stream
229-
operator to load data an another function to remove it (or vice versa). Both
230-
``operator<<`` and ``operator>>`` rely on a ``.write()`` and a ``.read()``
231-
function respectively. In order to know how much data to read from the
232-
``std::istream``, the ``operator<<`` will write the size of the ``checkpoint``
233-
before writing the ``checkpoint`` data. Correspondingly, the ``operator>>`` will
234-
read the size of the stored data before reading the data into new instance of
235-
``checkpoint``. As long as the user employs the ``operator<<`` and
236-
``operator>>`` to stream the data this detail can be ignored.
237-
238-
.. important::
239-
240-
Be careful when mixing ``operator<<`` and ``operator>>`` with other
241-
facilities to read and write to a ``checkpoint``. ``operator<<`` writes and
242-
extra variable and ``operator>>`` reads this variable back separately. Used
243-
together the user will not encounter any issues and can safely ignore this
244-
detail.
245-
246-
Users may also move the data into and out of a ``checkpoint`` using the exposed
247-
``.begin()`` and ``.end()`` iterators. An example of this use case is
248-
illustrated below.
249-
250-
.. literalinclude:: ../../tests/unit/util/checkpoint.cpp
251-
:language: c++
252-
:lines: 129-150
253-
254-
Checkpointing Components
255-
------------------------
256-
257-
``save_checkpoint`` and ``restore_checkpoint`` are also able to store components
258-
inside ``checkpoint``s. This can be done in one of two ways. First a client of
259-
the component can be passed to ``save_checkpoint``. When the user wishes to
260-
resurrect the component she can pass a client instance to ``restore_checkpoint``.
261-
262-
This technique is demonstrated below:
263-
264-
.. literalinclude:: ../../tests/unit/util/checkpoint.cpp
265-
:language: c++
266-
:lines: 143-144
267-
268-
The second way a user can save a component is by passing a ``shared_ptr`` to the
269-
component to ``save_checkpoint``. This component can be resurrected by creating
270-
a new instance of the component type and passing a ``shared_ptr`` to the new
271-
instance to ``restore_checkpoint``. An example can be found below:
272-
273-
This technique is demonstrated below:
274-
275-
.. literalinclude:: ../../tests/unit/util/checkpoint.cpp
276-
:language: c++
277-
:lines: 113-126
278-
279-
280153
.. _iostreams:
281154

282155
The |hpx| I/O-streams component

examples/1d_stencil/CMakeLists.txt

-2
@@ -10,7 +10,6 @@ set(example_programs
1010
1d_stencil_2
1111
1d_stencil_3
1212
1d_stencil_4
13-
1d_stencil_4_checkpoint
1413
1d_stencil_4_parallel
1514
1d_stencil_5
1615
1d_stencil_6
@@ -42,7 +41,6 @@ set(1d_stencil_1_PARAMETERS THREADS_PER_LOCALITY 4)
4241
set(1d_stencil_2_PARAMETERS THREADS_PER_LOCALITY 4)
4342
set(1d_stencil_3_PARAMETERS THREADS_PER_LOCALITY 4)
4443
set(1d_stencil_4_PARAMETERS THREADS_PER_LOCALITY 4)
45-
set(1d_stencil_4_checkpoint_PARAMETERS THREADS_PER_LOCALITY 4)
4644
set(1d_stencil_4_parallel_PARAMETERS THREADS_PER_LOCALITY 4)
4745
set(1d_stencil_5_PARAMETERS THREADS_PER_LOCALITY 4)
4846
set(1d_stencil_6_PARAMETERS THREADS_PER_LOCALITY 4)

libs/CMakeLists.txt

+5-3
@@ -16,6 +16,7 @@ set(HPX_LIBS
1616
assertion
1717
basic_execution
1818
cache
19+
checkpoint
1920
collectives
2021
compute
2122
compute_cuda
@@ -104,6 +105,7 @@ foreach(lib ${HPX_LIBS})
104105

105106
set(MODULE_FORCE_LINKING_INCLUDES
106107
"${MODULE_FORCE_LINKING_INCLUDES}#include <hpx/${lib}/force_linking.hpp>\n")
108+
107109
set(MODULE_FORCE_LINKING_CALLS
108110
"${MODULE_FORCE_LINKING_CALLS}\n ${lib}::force_linking();")
109111

@@ -114,9 +116,9 @@ foreach(lib ${HPX_LIBS})
114116
endforeach()
115117

116118
configure_file(
117-
"${PROJECT_SOURCE_DIR}/cmake/templates/modules.cpp.in"
118-
"${CMAKE_BINARY_DIR}/libs/modules.cpp"
119-
@ONLY)
119+
"${PROJECT_SOURCE_DIR}/cmake/templates/modules.cpp.in"
120+
"${CMAKE_BINARY_DIR}/libs/modules.cpp"
121+
@ONLY)
120122

121123
configure_file(
122124
"${PROJECT_SOURCE_DIR}/cmake/templates/config_defines_strings_modules.hpp.in"

libs/all_modules.rst

+1
@@ -19,6 +19,7 @@ All modules
1919
/libs/assertion/docs/index.rst
2020
/libs/basic_execution/docs/index.rst
2121
/libs/cache/docs/index.rst
22+
/libs/checkpoint/docs/index.rst
2223
/libs/collectives/docs/index.rst
2324
/libs/compute/docs/index.rst
2425
/libs/compute_cuda/docs/index.rst

libs/checkpoint/CMakeLists.txt

+34
@@ -0,0 +1,34 @@
1+
# Copyright (c) 2019 The STE||AR-Group
2+
#
3+
# SPDX-License-Identifier: BSL-1.0
4+
# Distributed under the Boost Software License, Version 1.0. (See accompanying
5+
# file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
6+
7+
cmake_minimum_required(VERSION 3.3.2 FATAL_ERROR)
8+
9+
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")
10+
11+
# Default location is $HPX_ROOT/libs/checkpoint/include
12+
set(checkpoint_headers
13+
hpx/checkpoint/checkpoint.hpp
14+
)
15+
16+
# Default location is $HPX_ROOT/libs/checkpoint/include_compatibility
17+
set(checkpoint_compat_headers
18+
hpx/util/checkpoint.hpp
19+
)
20+
21+
set(checkpoint_sources)
22+
23+
include(HPX_AddModule)
24+
add_hpx_module(checkpoint
25+
COMPATIBILITY_HEADERS ON
26+
DEPRECATION_WARNINGS
27+
FORCE_LINKING_GEN
28+
GLOBAL_HEADER_GEN ON
29+
SOURCES ${checkpoint_sources}
30+
HEADERS ${checkpoint_headers}
31+
COMPAT_HEADERS ${checkpoint_compat_headers}
32+
DEPENDENCIES hpx_serialization
33+
CMAKE_SUBDIRS examples tests
34+
)

libs/checkpoint/README.rst

+16
@@ -0,0 +1,16 @@
1+
2+
..
3+
Copyright (c) 2019 The STE||AR-Group
4+
5+
SPDX-License-Identifier: BSL-1.0
6+
Distributed under the Boost Software License, Version 1.0. (See accompanying
7+
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
8+
9+
==========
10+
checkpoint
11+
==========
12+
13+
This library is part of HPX.
14+
15+
Documentation can be found `here
16+
<https://stellar-group.github.io/hpx/docs/sphinx/latest/html/libs/checkpoint/docs/index.html>`__.

libs/checkpoint/docs/index.rst

+136
@@ -0,0 +1,136 @@
1+
..
2+
Copyright (c) 2019 The STE||AR-Group
3+
4+
SPDX-License-Identifier: BSL-1.0
5+
Distributed under the Boost Software License, Version 1.0. (See accompanying
6+
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
7+
8+
.. _libs_checkpoint:
9+
10+
==========
11+
checkpoint
12+
==========
13+
14+
A common need of users is to periodically back up an application. This practice
15+
provides resiliency and potential restart points in code. We have developed the
16+
concept of a ``checkpoint`` to support this use case.
17+
18+
Found in ``hpx/util/checkpoint.hpp``, ``checkpoint``\ s are defined as objects
19+
which hold a serialized version of an object or set of objects at a particular
20+
moment in time. This representation can be stored in memory for later use or it
21+
can be written to disk for storage and/or recovery at a later point. In order to
22+
create and fill this object with data we use a function called
23+
``save_checkpoint``. In code the function looks like this::
24+
25+
hpx::future<hpx::util::checkpoint> hpx::util::save_checkpoint(a, b, c, ...);
26+
27+
``save_checkpoint`` takes arbitrary data containers such as int, double, float,
28+
vector, and future and serializes them into a newly created ``checkpoint``
29+
object. This function returns a ``future`` to a ``checkpoint`` containing the
30+
data. Let us look at a simple use case below::
31+
32+
using hpx::util::checkpoint;
33+
using hpx::util::save_checkpoint;
34+
35+
std::vector<int> vec{1,2,3,4,5};
36+
hpx::future<checkpoint> f = save_checkpoint(vec);
37+
38+
Once the future is ready the checkpoint object will contain the ``vector``
39+
``vec`` and its five elements.
40+
41+
It is also possible to modify the launch policy used by ``save_checkpoint``.
42+
This is accomplished by passing a launch policy as the first argument. It is
43+
important to note that passing ``hpx::launch::sync`` will cause
44+
``save_checkpoint`` to return a ``checkpoint`` instead of a ``future`` to a
45+
``checkpoint``. All other policies passed to ``save_checkpoint`` will return a
46+
``future`` to a ``checkpoint``.
47+
48+
Sometimes ``checkpoint``\ s must be declared before they are used.
49+
``save_checkpoint`` allows users to move pre-created ``checkpoint``\ s into the
50+
function as long as they are the first container passed into the function (in
51+
the case where a launch policy is used, the ``checkpoint`` will immediately
52+
follow the launch policy). An example of these features can be found below:
53+
54+
.. literalinclude:: ../../../../libs/tests/unit/checkpoint.cpp
55+
:language: c++
56+
:lines: 27-38
57+
58+
Now that we can create ``checkpoint``\ s we must be able to restore the
59+
objects they contain into memory. This is accomplished by the function
60+
``restore_checkpoint``. This function takes a ``checkpoint`` and fills its data
61+
into the containers it is provided. It is important to remember that the
62+
containers must be ordered in the same way they were placed into the
63+
``checkpoint``. For clarity see the example below:
64+
65+
.. literalinclude:: ../../../../libs/tests/unit/checkpoint.cpp
66+
:language: c++
67+
:lines: 41-49
68+
69+
The core utility of ``checkpoint`` is in its ability to make certain data
70+
persistent. Often this means that the data needs to be stored in an object,
71+
such as a file, for later use. For these cases we have provided two solutions:
72+
stream operator overloads and access iterators.
73+
74+
We have created the two stream overloads
75+
``operator<<`` and ``operator>>`` to stream data
76+
out of and into ``checkpoint``. You can see an
77+
example of the overloads in use below:
78+
79+
.. literalinclude:: ../../../../libs/tests/unit/checkpoint.cpp
80+
:language: c++
81+
:lines: 176-186
82+
83+
This is the primary way to move data into and out of a ``checkpoint``. It is
84+
important to note, however, that users should be cautious when using a stream
85+
operator to load data and another function to remove it (or vice versa). Both
86+
``operator<<`` and ``operator>>`` rely on a ``.write()`` and a ``.read()``
87+
function respectively. In order to know how much data to read from the
88+
``std::istream``, the ``operator<<`` will write the size of the ``checkpoint``
89+
before writing the ``checkpoint`` data. Correspondingly, the ``operator>>`` will
90+
read the size of the stored data before reading the data into a new instance of
91+
``checkpoint``. As long as the user employs the ``operator<<`` and
92+
``operator>>`` to stream the data this detail can be ignored.
93+
94+
.. important::
95+
96+
Be careful when mixing ``operator<<`` and ``operator>>`` with other
97+
facilities to read and write to a ``checkpoint``. ``operator<<`` writes an
98+
extra variable and ``operator>>`` reads this variable back separately. Used
99+
together the user will not encounter any issues and can safely ignore this
100+
detail.
101+
102+
Users may also move the data into and out of a ``checkpoint`` using the exposed
103+
``.begin()`` and ``.end()`` iterators. An example of this use case is
104+
illustrated below.
105+
106+
.. literalinclude:: ../../../../libs/tests/unit/checkpoint.cpp
107+
:language: c++
108+
:lines: 129-150
109+
110+
Checkpointing Components
111+
------------------------
112+
113+
``save_checkpoint`` and ``restore_checkpoint`` are also able to store components
114+
inside ``checkpoint``\ s. This can be done in one of two ways. First, a client of
115+
the component can be passed to ``save_checkpoint``. When the user wishes to
116+
resurrect the component she can pass a client instance to ``restore_checkpoint``.
117+
118+
This technique is demonstrated below:
119+
120+
.. literalinclude:: ../../../../libs/tests/unit/checkpoint.cpp
121+
:language: c++
122+
:lines: 143-144
123+
124+
The second way a user can save a component is by passing a ``shared_ptr`` to the
125+
component to ``save_checkpoint``. This component can be resurrected by creating
126+
a new instance of the component type and passing a ``shared_ptr`` to the new
127+
instance to ``restore_checkpoint``.
128+
129+
This technique is demonstrated below:
130+
131+
.. literalinclude:: ../../../../libs/tests/unit/checkpoint.cpp
132+
:language: c++
133+
:lines: 113-126
134+
135+
136+
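
The new ``libs/checkpoint/docs/index.rst`` describes ``operator<<`` writing the size of the ``checkpoint`` before its data, and ``operator>>`` reading that size back first. The sketch below illustrates that size-prefixed layout using only the C++ standard library. It is not HPX code: ``blob``, ``round_trip`` and ``streamed_size`` are hypothetical names invented for this illustration, and the 64-bit size field is an assumption, not something the commit states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for a checkpoint: an owned buffer of serialized bytes.
struct blob
{
    std::vector<char> data;
};

// Write the payload size first, then the payload itself.
std::ostream& operator<<(std::ostream& os, blob const& b)
{
    std::int64_t const size = static_cast<std::int64_t>(b.data.size());
    os.write(reinterpret_cast<char const*>(&size), sizeof(size));
    os.write(b.data.data(), static_cast<std::streamsize>(b.data.size()));
    return os;
}

// Read the size back first so we know how many payload bytes follow.
std::istream& operator>>(std::istream& is, blob& b)
{
    std::int64_t size = 0;
    is.read(reinterpret_cast<char*>(&size), sizeof(size));
    if (!is || size < 0)
    {
        // Could not read a valid size prefix: report failure to the caller.
        is.setstate(std::ios::failbit);
        return is;
    }
    b.data.resize(static_cast<std::size_t>(size));
    is.read(b.data.data(), static_cast<std::streamsize>(size));
    return is;
}

// Stream a blob out and back in through an in-memory binary stream.
blob round_trip(blob const& in)
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    ss << in;
    blob out;
    ss >> out;
    return out;
}

// Number of bytes the stream holds after writing the blob.
std::size_t streamed_size(blob const& b)
{
    std::ostringstream os(std::ios::out | std::ios::binary);
    os << b;
    return os.str().size();
}
```

In this sketch the streamed bytes are longer than the payload by ``sizeof(std::int64_t)``. That is why a reader that copies raw bytes, for example through iterators, would see the size field as part of the data unless it was also written without the stream operator.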
