@@ -1454,121 +1454,13 @@ and at any point in the past or future.
variety of environments or machines.
]

- === Environments <ch2-environments>
-
- Environments where a build or computational process occurs can be broadly
- categorised into two types: hardware and software environments. While software
- environments can be managed to a high degree of consistency, achieving
- reproducibility across different hardware, particularly different #gls("CPU")
- architectures #eg[`x86`, `ARM`], is essentially impossible. Tasks like
- instruction execution, memory management, and floating-point calculations are
- handled in distinct ways. Even small variations in these processes can lead to
- differences in output. Consequently, even with identical software, builds on
- different types of #gls("CPU") architectures will produce different results.
- When something is said to be reproducible, it typically means reproducible
- within the same #gls("CPU") architecture. Therefore, this section will focus
- exclusively on the reproducibility challenges within software environments.
-
- A software environment is composed of the #gls("OS"), along with the set of
- tools, libraries, and dependencies required to build or run a specific
- application. Any change in these components can influence the outcome of a
- software build or execution. For example, a minor update to a library could
- potentially alter the behaviour of the software, producing different outcomes
- across different executions or, more importantly, affecting the security
- level.
-
- To enhance reproducibility, it is critical to ensure that the software
- environment remains stable and unaltered during both the build and execution
- phases. Unfortunately, conventional #glspl("OS") such as Linux distributions,
- Microsoft Windows, and macOS are #emph[mutable] by default. This mutability is
- primarily facilitated through package managers, which enable users to easily
- modify their environments by installing or upgrading software packages. As a
- result, uncontrolled changes to dependencies may also lead to inconsistencies in
- software behaviour, or have an impact on the security level, undermining
- reproducibility.
-
- To mitigate these issues, #emph[immutable] environments have gained popularity.
- Tools such as Docker #cite(<docker>, form: "normal") provide mechanisms to
- encapsulate software and its dependencies in containers, thus creating
- environments that remain unchanged after creation. Once a container is built, it
- can be shared and executed across different systems with the guarantee that it
- will function identically, given the same environment. This characteristic makes
- containers highly suitable for distributing software.
-
- Despite the advantages of immutability, it does not guarantee reproducibility by
- default. For instance, container images hosted on platforms like Docker Hub
- #cite(<dockerhub>, form: "normal"), including popular language interpreters
- #eg[Python, Node, PHP], may not be reproducible due to non-deterministic
- steps during the image creation. A specific example can be found in
- #ref(<python-dockerfile>), which runs `apt-get update` at line 4 as part of the
- image build process. Since `apt-get` pulls the latest version of package lists
- at build-time, it is impossible to reproduce the same image later, compromising
- Docker's build-time reproducibility.
-
- #figure(
-   sourcefile(
-     lang: "dockerfile",
-     read("../../resources/sourcecode/python.dockerfile"),
-   ),
-   caption: [
-     An excerpt of the Python Dockerfile
-     #cite(<python-dockerfile-repository>, form: "normal") used to build the
-     #emph[official] Python images.
-   ],
- ) <python-dockerfile>
-
- Docker images, once built, are immutable. While Docker does not guarantee
- build-time reproducibility, it has the potential to ensure run-time
- reproducibility, reflecting Docker's philosophy of
- #emph["build once, use everywhere"]. This distinction between build-time
- reproducibility (@def-reproducibility-build-time) and run-time reproducibility
- (@def-reproducibility-run-time) is key. Docker does not ensure that an image
- will always be built consistently, often due to the base image used (as
- specified in the `FROM` directive of a `Dockerfile`), as seen in
- @python-dockerfile. Although building a reproducible image with Docker is
- technically possible, it would require additional effort, external tools, and a
- more complex setup. Therefore, we assume that build-time reproducibility is not
- guaranteed, but the immutability of the environment significantly enhances the
- potential for reproducibility at run-time.
-
- #info-box[
- Docker #cite(<docker>, form: "normal") is a platform for building, shipping, and
- running applications in containers, with Docker Hub
- #cite(<dockerhub>, form: "normal") providing a large repository of container
- images, which has significantly contributed to Docker's popularity. Among
- these are "official" Docker images
- #cite(<dockerofficialimages>, form: "normal"), which are curated and reviewed by
- Docker Inc. These images offer standard environments for popular software and
- adhere to some quality standards.
-
- However, the term "official" can be misleading. One might assume that these
- images are maintained by the original software's developers, but this is not
- always the case. For example, the PHP Docker image is not maintained by the
- core PHP development team. This means updates or fixes may not be as prompt or
- specific as if the software’s developers maintained the image.
-
- While Docker vets these images for quality, responsibility for the contents
- rests with the maintainers. Users should be aware that official images are not
- immune to security risks or outdated software, and reviewing the documentation
- for issues is advisable.
-
- In summary, "official" Docker images are trusted but may not be maintained by
- the software’s creators. Developers should use them with care, especially in
- production environments, and verify that the images meet their security and
- functionality needs.
- ]
-
- Package managers are a critical aspect of the reproducibility puzzle. Without
- proper control over how dependencies are resolved and installed, achieving
- consistent and reproducible builds becomes difficult.
-
==== Configuration Management

Reproducibility relies on stable, consistent and well-maintained codebases but
also heavily depends on stable, consistent and well-maintained environments as
- seen in @ch2-environments. In addition, a critical component is environment
- configuration management. Configuration management plays a critical role in
- ensuring reproducibility by mitigating the non-deterministic behaviours
+ seen in (add ref to ch2-environments). In addition, a critical component is
+ environment configuration management. Configuration management plays a critical
+ role in ensuring reproducibility by mitigating the non-deterministic behaviours
introduced by configuration drifts.

#info-box[
@@ -1656,15 +1548,15 @@ goal of this model, providing the highest level of determinism and reliability
in system behaviours.

Congruent management, particularly through the adoption of immutable
- environments (@ch2-environments), ensures that environments remain in a
+ environments ((add ref to ch2-environments)), ensures that environments remain in a
well-defined state, thus maximising reproducibility. However, this approach can
lack the flexibility required for dynamic environments, where each minor
adjustment may necessitate rebuilding the entire system. This limitation
highlights the importance of carefully choosing between convergent and congruent
approaches based on the environment's needs.

#info-box[
- Immutable environments (@ch2-environments) are environments that are designed
+ Immutable environments ((add ref to ch2-environments)) are environments that are designed
to be unchangeable once they are created. They are often used in containers
#eg[Docker #cite(<docker>, form: "normal")], where the ability to quickly create
and destroy environments is essential. Immutable environments enhance