Add more specific details for scaling of snapshot
No description

---

Pull Request resolved: #227
commit_hash:5fa8496deac7e649e2ae7f7cd52230cb7636327a
laskoviymishka authored and robot-piglet committed Feb 21, 2025
1 parent 7b304a4 commit 4147d30
Showing 1 changed file with 27 additions and 0 deletions.
27 changes: 27 additions & 0 deletions docs/concepts/scaling.md
@@ -6,6 +6,33 @@
- Add new storage pairs seamlessly.
- Scales Data Plane independently.

### Snapshot Sharding


1. **Worker with index 0** (initializer) analyzes the database schema and determines partitioning strategies:
   - Uses table statistics.
   - Splits data by PK ranges if possible.
   - Otherwise, applies heuristics (e.g., partitioning by `id % N`).
2. Worker 0 stores the list of segments (value ranges) in the **coordinator** (e.g., S3).
3. **Workers 1..N** request available segments from the coordinator, process them, and report completion.
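
The partitioning step performed by worker 0 can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Segment` type and the `split_by_pk_range`, `split_by_modulo`, and `plan_segments` functions are hypothetical names, and the sketch assumes an integer primary key whose minimum and maximum are known from table statistics.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """One unit of snapshot work (hypothetical structure)."""
    table: str
    predicate: str  # SQL WHERE fragment that selects the rows of this segment


def split_by_pk_range(table: str, pk: str, min_id: int, max_id: int, parts: int) -> List[Segment]:
    """Split the closed interval [min_id, max_id] into at most `parts` contiguous ranges."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if max_id < min_id:
        return []
    total = max_id - min_id + 1
    parts = min(parts, total)  # never create empty ranges
    base, extra = divmod(total, parts)
    segments = []
    lo = min_id
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        hi = lo + size - 1
        segments.append(Segment(table, f"{pk} >= {lo} AND {pk} <= {hi}"))
        lo = hi + 1
    return segments


def split_by_modulo(table: str, column: str, parts: int) -> List[Segment]:
    """Fallback heuristic: partition rows by `column % parts`."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    return [Segment(table, f"{column} % {parts} = {i}") for i in range(parts)]


def plan_segments(table: str, pk: str, parts: int,
                  min_id: Optional[int] = None,
                  max_id: Optional[int] = None) -> List[Segment]:
    """Prefer PK ranges when the bounds are known, otherwise fall back to modulo."""
    if min_id is not None and max_id is not None:
        return split_by_pk_range(table, pk, min_id, max_id, parts)
    return split_by_modulo(table, pk, parts)
```

For example, `plan_segments("users", "id", 3, min_id=1, max_id=10)` yields three ranges covering ids 1 to 4, 5 to 7, and 8 to 10, while omitting the bounds yields the three predicates `id % 3 = 0`, `id % 3 = 1`, and `id % 3 = 2`.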

```mermaid
sequenceDiagram
    participant Worker0 as Worker 0 (Main)
    participant Coordinator as Coordinator (S3)
    participant Workers as Workers (Secondary)
    participant DB as MySQL/PostgreSQL
    Worker0 ->> DB: Analyze schema
    Worker0 ->> Coordinator: Store data split information
    loop For each Worker
        Workers ->> Coordinator: Request segment
        Workers ->> DB: Process segment
        Workers ->> Coordinator: Report completion
    end
    Coordinator ->> Worker0: All segments completed
```
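
The flow in the diagram above can be approximated in a few lines. This is only a sketch under stated assumptions: an in-memory object stands in for the S3-backed coordinator, threads stand in for separate worker processes, and every name here (`InMemoryCoordinator`, `run_secondary_worker`, `run_snapshot`) is hypothetical rather than taken from the project.

```python
import threading
from typing import Callable, Dict, List, Optional


class InMemoryCoordinator:
    """Stand-in for the shared coordinator; a real one would persist state externally."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: List[str] = []
        self._done: Dict[str, int] = {}  # segment -> id of the worker that finished it
        self._total = 0

    def store_segments(self, segments: List[str]) -> None:
        """Called once by worker 0 after it has planned the split."""
        with self._lock:
            self._pending = list(segments)
            self._total = len(self._pending)

    def request_segment(self) -> Optional[str]:
        """Hand out the next unclaimed segment, or None when nothing is left."""
        with self._lock:
            return self._pending.pop(0) if self._pending else None

    def report_completion(self, segment: str, worker_id: int) -> None:
        with self._lock:
            self._done[segment] = worker_id

    def all_completed(self) -> bool:
        with self._lock:
            return len(self._done) == self._total


def run_secondary_worker(worker_id: int, coordinator: InMemoryCoordinator,
                         process: Callable[[str], None]) -> None:
    """Request, process, and report segments until none remain."""
    while True:
        segment = coordinator.request_segment()
        if segment is None:
            return
        process(segment)
        coordinator.report_completion(segment, worker_id)


def run_snapshot(segments: List[str], worker_count: int,
                 process: Callable[[str], None]) -> InMemoryCoordinator:
    coordinator = InMemoryCoordinator()
    coordinator.store_segments(segments)  # worker 0 publishes the split
    threads = [
        threading.Thread(target=run_secondary_worker, args=(i, coordinator, process))
        for i in range(1, worker_count + 1)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # worker 0 waits for every secondary worker to finish
    return coordinator
```

Because each segment is removed from the pending list under a lock, no two workers process the same segment, and `all_completed()` becomes true only once every published segment has been reported.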

## Vertical Scaling
- Handles high-load databases by splitting jobs into read and write tasks.
- Supports persistent queues for decoupled processing.