Commit 96f8ea3

Committed Nov 13, 2024
Lay the foundation for low-concurrency performance improvements
Additional chapters on optimizing read and write performance for low concurrency will be added later.
1 parent 591ad41 commit 96f8ea3

File tree

1 file changed: +15 -14 lines

Chapter4_8.md

Lines changed: 15 additions & 14 deletions
@@ -26,8 +26,8 @@ Throughput and response time have a generally reciprocal but subtly complex rela
 
 In performance optimization, two main goals are:
 
-1. **Optimal response time:** Minimize waiting for task completion.
-2. **Maximal throughput:** Handle as many simultaneous tasks as possible.
+1. **Optimal response time:** Minimize waiting for task completion.
+2. **Maximal throughput:** Handle as many simultaneous tasks as possible.
 
 These goals are contradictory: optimizing for response time requires minimizing system load, while optimizing for throughput requires maximizing it. Balancing these conflicting objectives is key to effective performance optimization.
 
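
The contradiction between these two goals can be made precise with Little's Law, a standard queueing-theory identity (background knowledge, not stated in the chapter itself): the average number of in-flight requests $N$, throughput $X$, and response time $R$ satisfy

$$
N = X \cdot R \quad\Longrightarrow\quad R = \frac{N}{X}
$$

Raising throughput means carrying more concurrent load $N$, but once a resource saturates, $X$ stops growing, so every additional unit of $N$ translates directly into longer response time $R$.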

@@ -87,9 +87,9 @@ Based on extensive testing, it is concluded that the root cause of discrepancies
 
 MySQL's complexity makes performance modeling challenging, but focusing on specific subsystems can offer valuable insights into performance problems. For instance, when modeling the performance of major latches in MySQL 5.7, it's found that executing a transaction (with a transaction isolation level of Read Committed) involves certain operations:
 
-- **Read Operations:** Pass through the trx-sys subsystem, potentially involving global latch queueing.
-- **Write Operations:** Go through the lock-sys subsystem, which involves global latch queueing for lock scheduling.
-- **Redo Log Operations:** Write operations require updates to the redo log subsystem, which also involves global latch queueing.
+- **Read Operations:** Pass through the trx-sys subsystem, potentially involving global latch queueing.
+- **Write Operations:** Go through the lock-sys subsystem, which involves global latch queueing for lock scheduling.
+- **Redo Log Operations:** Write operations require updates to the redo log subsystem, which also involves global latch queueing.
 
 ![](media/197c7662d3b25ebbc2870a1cee917e3f.png)
 
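
The three bullets above all reduce to the same pattern: a single global latch that every transaction must queue on. A minimal sketch of that pattern in illustrative C++ (the names `TrxSys`, `trx_sys`, and `enter_trx_sys` are hypothetical, not MySQL's actual symbols):

```cpp
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a transaction-system structure guarded by
// one global latch. Every transaction, read or write, must acquire it,
// so under high concurrency threads queue here even when the data they
// touch is unrelated.
struct TrxSys {
    std::mutex latch;                  // the single global latch
    std::vector<uint64_t> active_ids;  // shared state it protects
};

TrxSys trx_sys;

void enter_trx_sys(uint64_t trx_id) {
    std::lock_guard<std::mutex> guard(trx_sys.latch);  // global queueing point
    trx_sys.active_ids.push_back(trx_id);
}
```

With one such latch per subsystem (trx-sys, lock-sys, redo log), a transaction can pay the queueing cost up to three times, which is why these latches dominate the model.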

@@ -117,12 +117,12 @@ Saturated latches degrade multithreaded application performance, causing scalabi
 
 To address these scalability problems, consider the following measures:
 
-- Improve critical resource access speed.
-- Use latch sharding to reduce conflicts.
-- Minimize unnecessary wake-up processes.
-- Implement latch-free mechanisms.
-- Design the architecture thoughtfully.
-- Implement a transaction throttling mechanism.
+- Improve critical resource access speed.
+- Use latch sharding to reduce conflicts.
+- Minimize unnecessary wake-up processes.
+- Implement latch-free mechanisms.
+- Design the architecture thoughtfully.
+- Implement a transaction throttling mechanism.
 
 #### 4.8.6.1 Improve Critical Resource Access Speed
 
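
Of the measures listed above, latch sharding is the easiest to show concretely. A minimal sketch, again in illustrative C++ rather than actual lock-sys code: replace one global latch with an array of latches and pick one by hashing the record, so threads locking different records rarely collide.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>

// One global latch replaced by N_SHARDS independent latches.
constexpr std::size_t N_SHARDS = 64;
std::array<std::mutex, N_SHARDS> lock_shards;

// Map a record to a shard, so unrelated records use unrelated latches.
std::mutex& shard_for(std::size_t record_id) {
    return lock_shards[std::hash<std::size_t>{}(record_id) % N_SHARDS];
}

void lock_record(std::size_t record_id) {
    std::lock_guard<std::mutex> guard(shard_for(record_id));
    // ... grant or queue the record lock under this shard only ...
}
```

Contention now concentrates only on records that genuinely hash to the same shard, instead of on every lock request in the system.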

@@ -180,8 +180,9 @@ Regarding algorithms, optimization opportunities are generally hard to find in m
 
 Cache has a significant impact on performance, and maintaining cache-friendliness primarily involves the following principles:
 
-1. **Sequential Memory Access:** Access memory data sequentially whenever possible. Sequential access benefits cache efficiency. For example, algorithms like direct insertion sort, which operate on small data sets, are highly cache-friendly.
-2. **Avoid False Sharing:** False sharing occurs when different threads modify parts of the same cache line simultaneously, leading to frequent cache invalidations and performance degradation. This often happens when different members of the same struct are modified by different threads concurrently.
+1. **Sequential Memory Access:** Access memory data sequentially whenever possible. Sequential access benefits cache efficiency. For example, algorithms like direct insertion sort, which operate on small data sets, are highly cache-friendly.
+2. **Ensure Cache-Friendly Code:** Whether frequently accessed functions are inlined, whether any code hinders inlining, and whether switch statements are used appropriately all affect how cache-friendly the code is.
+3. **Avoid False Sharing:** False sharing occurs when different threads modify parts of the same cache line simultaneously, leading to frequent cache invalidations and performance degradation. This often happens when different members of the same struct are modified by different threads concurrently.
 
 False sharing is a well-known problem in multiprocessor systems, causing performance degradation in multi-threaded programs running in such environments. The figure below shows an example of false sharing.
 

@@ -206,7 +207,7 @@ Date: Fri Nov 8 20:58:48 2013 +0100
 ...
 Added missing PFS_cacheline_uint32 to atomic counters,
 to enforce no false sharing happens.
-
+
 This is a performance improvement.
 ```
 
