\newcommand{\codesize}{\fontsize{\bodyfontsize}{\bodybaselineskip}}

- % Syntax highlighting for ARM asm (minted doesn't do this well)
+ % Syntax highlighting for Arm asm (minted doesn't do this well)
\usepackage{listings}
\lstset{
    basicstyle=\ttfamily\codesize\selectfont,
    keywordstyle=\color{darkGreen}\bfseries,
    commentstyle=\textcolor[rgb]{0.25,0.50,0.50}
}
- % listings definitions for ARM assembly.
- % Get them from https://github.com/frosc/arm-assembler-latex-listings,
- % install as shown at http://tex.stackexchange.com/a/1138/92465
+ % listings definitions for Arm assembly.
+ % Get them from https://github.com/sysprog21/arm-assembler-latex-listings .
\usepackage{lstlangarm} % See above

\usepackage{changepage} % For adjustwidth
@@ -588,7 +587,7 @@ \section{Sequential consistency on weakly-ordered hardware}
or \introduce{memory models}.
For example, x64 is relatively \introduce{strongly-ordered},
and can be trusted to preserve some system-wide order of loads and stores in most cases.
- Other architectures like \textsc{arm} are \introduce{weakly-ordered},
+ Other architectures like \textsc{Arm} are \introduce{weakly-ordered},
so you cannot assume that loads and stores are executed in program order unless the \textsc{cpu} is given special instructions---
called \introduce{memory barriers}---to not shuffle them around.
@@ -597,7 +596,7 @@ \section{Sequential consistency on weakly-ordered hardware}
and to see why the \clang{} and \cplusplus{} concurrency models were designed as they were.\punckern\footnote{%
It is worth noting that the concepts we discuss here are not specific to \clang{} and \cplusplus{}.
Other systems programming languages like D and Rust have converged on similar models.}
- Let's examine \textsc{arm}, since it is both popular and straightforward.
+ Let's examine \textsc{Arm}, since it is both popular and straightforward.
Consider the simplest atomic operations: loads and stores.
Given some \mintinline{cpp}{atomic_int foo},
% Shield your eyes.
@@ -667,8 +666,8 @@ \section{Implementing atomic read-modify-write operations with LL/SC instruction

Like many other \textsc{risc}\footnote{%
\introduce{Reduced instruction set computer},
- in contrast to a \introduce{complex instruction set computer} \textsc{(cisc)} architecture like x64.}
- architectures, \textsc{arm} lacks dedicated \textsc{rmw} instructions.
+ in contrast to a \introduce{complex instruction set computer} \textsc{(cisc)} architecture like x64.} architectures,
+ \textsc{Arm} lacks dedicated \textsc{rmw} instructions.
And since the processor can context switch to another thread at any time,
we cannot build \textsc{rmw} ops from normal loads and stores.
Instead, we need special instructions:
@@ -677,7 +676,7 @@ \section{Implementing atomic read-modify-write operations with LL/SC instruction
A load-link reads a value from an address---like any other load---but also instructs the processor to monitor that address.
Store-conditional writes the given value \emph{only if} no other stores were made to that address since the corresponding load-link.
Let's see them in action with an atomic fetch and add.
- On \textsc{arm},
+ On \textsc{Arm},
\begin{colfigure}
\begin{minted}[fontsize=\codesize]{cpp}
void incFoo() { ++foo; }
@@ -752,7 +751,7 @@ \section{Do we always need sequentially consistent operations?}
\label{lock-example}

All of our examples so far have been sequentially consistent to prevent reorderings that break our code.
- We've also seen how weakly-ordered architectures like \textsc{arm} use memory barriers to create sequential consistency.
+ We have also seen how weakly-ordered architectures like \textsc{Arm} use memory barriers to create sequential consistency.
But as you might expect,
these barriers can have a noticeable impact on performance.
After all,
@@ -1083,7 +1082,7 @@ \subsection{Consume}
}
\end{minted}
\end{colfigure}
- and an \textsc{arm} compiler could emit:
+ and an \textsc{Arm} compiler could emit:
\begin{colfigure}
\begin{lstlisting}[language={[ARM]Assembler}]
ldr r3, &peripherals
@@ -1130,10 +1129,10 @@ \subsection{\textsc{Hc Svnt Dracones}}

\section{Hardware convergence}

- Those familiar with \textsc{arm} may have noticed that all assembly shown here is for the seventh version of the architecture.
+ Those familiar with \textsc{Arm} may have noticed that all assembly shown here is for the seventh version of the architecture.
Excitingly, the eighth generation offers massive improvements for lockless code.
Since most programming languages have converged on the memory model we have been exploring,
- \textsc{arm}v8 processors offer dedicated load-acquire and store-release instructions: \keyword{lda} and \keyword{stl}.
+ \textsc{Arm}v8 processors offer dedicated load-acquire and store-release instructions: \keyword{lda} and \keyword{stl}.
Hopefully, future \textsc{cpu} architectures will follow suit.

\section{Cache effects and false sharing}