From 3919aa38812b4c0170319850ea34e84c4e3039b9 Mon Sep 17 00:00:00 2001
From: Vladislav Kokosh
Date: Tue, 11 Jun 2024 17:29:26 +0300
Subject: [PATCH 1/3] Added split and merge shard algorithm

---
 .../develop/blockchain/sharding-lifecycle.mdx | 75 +++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/docs/develop/blockchain/sharding-lifecycle.mdx b/docs/develop/blockchain/sharding-lifecycle.mdx
index edde6b18a7..d33951a190 100644
--- a/docs/develop/blockchain/sharding-lifecycle.mdx
+++ b/docs/develop/blockchain/sharding-lifecycle.mdx
@@ -14,6 +14,81 @@ ISP underpins the TON Blockchain's design, treating each account as part of its
 
 Each shardchain, or more precisely, each shardchain block, is identified by a combination of `workchain_id` and a binary prefix `s` of the account_id.
 
+## Algorithm for deciding whether to split or merge
+
+### 1. Assessment of the current state of the block
+
+Before deciding whether to split or merge, it is necessary to assess the current state of the block. This includes an assessment of the block size, gas consumption and logical block time difference (lt_delta).
+
+The `estimate_block_size` formula calculates the total block size based on various state parameters and statistics. Let's understand this formula in detail and translate it into an understandable language.
+
+```cpp
+td::uint64 BlockLimitStatus::estimate_block_size(const vm::NewCellStorageStat::Stat* extra) const {
+  auto sum = st_stat.get_total_stat();
+  if (extra) {
+    sum += *extra;
+  }
+  return 2000 + (sum.bits >> 3) + sum.cells * 12 + sum.internal_refs * 3 + sum.external_refs * 40 + accounts * 200 +
+         transactions * 200 + (extra ? 200 : 0) + extra_out_msgs * 300 + public_library_diff * 700;
+}
+```
+### 2. Block classification
+
+Using the estimation results, we categorise the block into three parameters and their limits: size, gas, and lt_delta.
+
+There are three main classification classes for different block parameters and their limits: 
+- underload (0) - is a state where a shard understands that there is no load, and is inclined to merge if the neighbouring shard wishes to do so.
+- soft limit (2) - when this limit is reached, internal messages are no longer processed.
+- hard limit (4) - that's the absolute maximum size.
+
+Block limits are loaded from the contract config - [23](/develop/howto/blockchain-configs#param-22-and-23).
+
+At the moment, the classification of blocks is as follows:
+
+#### Classification of Block size
+- 0 - if block size < 131,072 bytes
+- 2 - if 131,072 ≤ block size ≤ 524,288 bytes
+- 4 - if block size = 524,288 bytes
+
+#### Classification of Block gas
+- 0 - if block gas < 2000000 gas
+- 2 - if 2000000 ≤ block gas ≤ 10000000 gas
+- 4 - if block gas = 20000000 gas
+
+#### Classification of Block lt_delta
+- 0 - if block lt_delta < 1000 lt_delta
+- 2 - if 5000 ≤ block lt_delta ≤ 10000 lt_delta
+- 4 - if block lt_delta = 10000 lt_delta
+
+#### Final block classification:
+
+Classification of block limit = max(`Classification of Block size`, `Classification of Block gas`, `Classification of Block lt_delta`)
+
+For example: if classification of Block size - 2, classification of Block gas - 2, classification of Block lt_delta - 4, then the final block classification is `hard(4)`.
+
+### 3. Determination of overload or underload condition
+
+After classifying the block, we check its overload or underload condition.
+
+- If the block class by limits ≤ `underload(0)` and message queue ≤ `MERGE_MAX_QUEUE_SIZE = 2047`, set the status to `underloaded`.
+- If the block class by limits ≥ `soft(2)` и message queue ≤ `SPLIT_MAX_QUEUE_SIZE = 100000`, set the status to `overloaded`.
+
+### 4. Deciding whether to split or merge
+
+Using the history of overloading and underloading, decide whether a split or merge is needed.
+
+- If the overload condition is `overload` and the message queue size is between `FORCE_SPLIT_QUEUE_SIZE = 4096` and `SPLIT_MAX_QUEUE_SIZE = 100000`, the `want_split` status is set.
+- If the overload condition is `underloaded`и the message queue size ≤ `MERGE_MAX_QUEUE_SIZE = 2047`, the `want_merge` status is set.
+
+### 5. Checking the manual settings of the collator
+
+- If `want_split` is set in the collator settings, a split will be made.
+- If `want_merge` is set in the collator settings, a merge will be made.
+
+### 6. Final decision
+
+The final decision to split or merge is based on `want_split` and `want_merge` values.
+
 ## Messages and Instant Hypercube Routing (Instant Hypercube Routing)
 
 In the infinite sharding paradigm, each account (or smart-contract) is treated as if it were itself in a separate shardchain.

From f9804da5ffc4ef902a576d6b5b6f320fd3460414 Mon Sep 17 00:00:00 2001
From: Vladislav Kokosh
Date: Thu, 13 Jun 2024 11:48:01 +0300
Subject: [PATCH 2/3] Removed block size formula

---
 .../develop/blockchain/sharding-lifecycle.mdx | 20 ++++---------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/docs/develop/blockchain/sharding-lifecycle.mdx b/docs/develop/blockchain/sharding-lifecycle.mdx
index d33951a190..c04a32f532 100644
--- a/docs/develop/blockchain/sharding-lifecycle.mdx
+++ b/docs/develop/blockchain/sharding-lifecycle.mdx
@@ -20,23 +20,11 @@ Each shardchain, or more precisely, each shardchain block, is identified by a co
 
 Before deciding whether to split or merge, it is necessary to assess the current state of the block. This includes an assessment of the block size, gas consumption and logical block time difference (lt_delta).
 
-The `estimate_block_size` formula calculates the total block size based on various state parameters and statistics. Let's understand this formula in detail and translate it into an understandable language.
-
-```cpp
-td::uint64 BlockLimitStatus::estimate_block_size(const vm::NewCellStorageStat::Stat* extra) const {
-  auto sum = st_stat.get_total_stat();
-  if (extra) {
-    sum += *extra;
-  }
-  return 2000 + (sum.bits >> 3) + sum.cells * 12 + sum.internal_refs * 3 + sum.external_refs * 40 + accounts * 200 +
-         transactions * 200 + (extra ? 200 : 0) + extra_out_msgs * 300 + public_library_diff * 700;
-}
-```
 ### 2. Block classification
 
-Using the estimation results, we categorise the block into three parameters and their limits: size, gas, and lt_delta.
+Using the estimation results, we classify the block into three parameters and their limits: size, gas, and lt_delta.
 
-There are three main classification classes for different block parameters and their limits: 
+There are three main classification classes for different block parameters and their limits:
 - underload (0) - is a state where a shard understands that there is no load, and is inclined to merge if the neighbouring shard wishes to do so.
 - soft limit (2) - when this limit is reached, internal messages are no longer processed.
 - hard limit (4) - that's the absolute maximum size.
@@ -77,8 +65,8 @@ After classifying the block, we check its overload or underload condition.
 
 Using the history of overloading and underloading, decide whether a split or merge is needed.
 
-- If the overload condition is `overload` and the message queue size is between `FORCE_SPLIT_QUEUE_SIZE = 4096` and `SPLIT_MAX_QUEUE_SIZE = 100000`, the `want_split` status is set.
-- If the overload condition is `underloaded`и the message queue size ≤ `MERGE_MAX_QUEUE_SIZE = 2047`, the `want_merge` status is set.
+- If the overload history is `overload` and the message queue size is between `FORCE_SPLIT_QUEUE_SIZE = 4096` and `SPLIT_MAX_QUEUE_SIZE = 100000`, the `want_split` status is set.
+- If the overload condition is `underloaded` and the message queue size ≤ `MERGE_MAX_QUEUE_SIZE = 2047`, the `want_merge` status is set.
 
 ### 5. Checking the manual settings of the collator
 

From 876000308604081433a752edf856901159072210 Mon Sep 17 00:00:00 2001
From: Vladislav Kokosh
Date: Tue, 30 Jul 2024 11:27:52 +0300
Subject: [PATCH 3/3] Updated after review

---
 .../develop/blockchain/sharding-lifecycle.mdx | 92 ++++++++++---------
 1 file changed, 49 insertions(+), 43 deletions(-)

diff --git a/docs/develop/blockchain/sharding-lifecycle.mdx b/docs/develop/blockchain/sharding-lifecycle.mdx
index c04a32f532..b91afc5e1f 100644
--- a/docs/develop/blockchain/sharding-lifecycle.mdx
+++ b/docs/develop/blockchain/sharding-lifecycle.mdx
@@ -16,66 +16,72 @@ Each shardchain, or more precisely, each shardchain block, is identified by a co
 
 ## Algorithm for deciding whether to split or merge
 
-### 1. Assessment of the current state of the block
-
-Before deciding whether to split or merge, it is necessary to assess the current state of the block. This includes an assessment of the block size, gas consumption and logical block time difference (lt_delta).
-
-### 2. Block classification
-
-Using the estimation results, we classify the block into three parameters and their limits: size, gas, and lt_delta.
-
-There are three main classification classes for different block parameters and their limits:
-- underload (0) - is a state where a shard understands that there is no load, and is inclined to merge if the neighbouring shard wishes to do so.
-- soft limit (2) - when this limit is reached, internal messages are no longer processed.
-- hard limit (4) - that's the absolute maximum size.
+Validators decide whether to split or merge shards in the following way:
+1. For each block, block size, gas consumption and lt delta are calculated.
+2. Using these values, blocks can be considered overloaded or underloaded.
+3. Each shard keeps underload and overload history. If enough recent blocks were underloaded or overloaded, `want_merge` or `want_split` flag is set.
+4. Validators merge or split shards using these flags.
 
-Block limits are loaded from the contract config - [23](/develop/howto/blockchain-configs#param-22-and-23).
-
-At the moment, the classification of blocks is as follows:
-
-#### Classification of Block size
-- 0 - if block size < 131,072 bytes
-- 2 - if 131,072 ≤ block size ≤ 524,288 bytes
-- 4 - if block size = 524,288 bytes
+### 1. Assessment of the current state of the block
 
-#### Classification of Block gas
-- 0 - if block gas < 2000000 gas
-- 2 - if 2000000 ≤ block gas ≤ 10000000 gas
-- 4 - if block gas = 20000000 gas
+Each block has the following parameters. They are used to determine overload and underload.
+1. *Block size estimation* - not an actual block size, but an estimation calculated during collation.
+2. *Gas consumption* - total gas consumed in all transactions (excluding ticktock and mint/recover special transactions).
+3. *Lt delta* - difference between start and end lt of the block.
 
-#### Classification of Block lt_delta
-- 0 - if block lt_delta < 1000 lt_delta
-- 2 - if 5000 ≤ block lt_delta ≤ 10000 lt_delta
-- 4 - if block lt_delta = 10000 lt_delta
+### 2. Block limits and classification
 
-#### Final block classification:
+Block limits are loaded from the [configuration parameters 22 and 23](/develop/howto/blockchain-configs#param-22-and-23).
+Each of the three parameters has three limits: underload, soft, hard:
+1. *Block size*: `128/256/512 KiB`.
+2. *Gas consumption*: `2M/10M/20M` in basechain, `200K/1M/2.5M` in masterchain.
+3. *Lt delta*: `1000/5000/10000`.
+Also, there is a medium limit, which is equal to `(soft + hard) / 2`.
 
-Classification of block limit = max(`Classification of Block size`, `Classification of Block gas`, `Classification of Block lt_delta`)
+We classify the three parameters (size, gas, and lt delta) into categories:
+- `0` - underload limit is not reached.
+- `1` - underload limit is exceeded.
+- `2` - soft limit is exceeded.
+- `3` - medium limit is exceeded.
+- `4` - hard limit is exceeded.
 
-For example: if classification of Block size - 2, classification of Block gas - 2, classification of Block lt_delta - 4, then the final block classification is `hard(4)`.
+Block classification is max(`Classification of size`, `Classification of gas`, `Classification of lt delta`). For example: if classification of size is 2, classification of gas is 3, classification of lt delta is 1, then the final block classification is 3.
 
-### 3. Determination of overload or underload condition
+- When classification of the block is 0 (underload), the block is inclined to merge with its sibling.
+- When classification of the block is 2 (soft limit reached), collator stops processing internal messages. The block is inclined to split.
+- When classification of the block is 3 (medium limit reached), collator stops processing external messages.
 
-After classifying the block, we check its overload or underload condition.
+### 3. Determination of overload or underload
 
-- If the block class by limits ≤ `underload(0)` and message queue ≤ `MERGE_MAX_QUEUE_SIZE = 2047`, set the status to `underloaded`.
-- If the block class by limits ≥ `soft(2)` и message queue ≤ `SPLIT_MAX_QUEUE_SIZE = 100000`, set the status to `overloaded`.
+After classifying the block, collator checks overload and underload conditions.
+Size of the outbound message queue and status of dispatch queue processing is also taken into consideration.
+- If the block class is ≥ `2` (soft) and message queue size ≤ `SPLIT_MAX_QUEUE_SIZE = 100000` then the block is overloaded.
+- If limit for total processed messages from dispatch queue was reached and message queue size ≤ `SPLIT_MAX_QUEUE_SIZE = 100000` then the block is overloaded.
+- If the block class is `0` (underload) and message queue size ≤ `MERGE_MAX_QUEUE_SIZE = 2047` then the block is underloaded.
+- If message queue size is ≥ `FORCE_SPLIT_QUEUE_SIZE = 4096` and ≤ `SPLIT_MAX_QUEUE_SIZE = 100000` then the block is overloaded.
 
 ### 4. Deciding whether to split or merge
 
-Using the history of overloading and underloading, decide whether a split or merge is needed.
+Each block keeps underload and overload history - it is a 64-bit mask of the underload/overload status of the last 64 blocks.
+It is used to decide whether to split or merge.
+
+Underload and overload history has a weight, which is calculated as follows:
+`one_bits(mask & 0xffff) * 3 + one_bits(mask & 0xffff0000) * 2 + one_bits(mask & 0xffff00000000) - (3 + 2 + 1) * 16 * 2 / 3`
+(here `one_bits` is the number of `1`-bits in a mask, and the lower bits correspond to the most recent blocks).
 
-- If the overload history is `overload` and the message queue size is between `FORCE_SPLIT_QUEUE_SIZE = 4096` and `SPLIT_MAX_QUEUE_SIZE = 100000`, the `want_split` status is set.
-- If the overload condition is `underloaded` and the message queue size ≤ `MERGE_MAX_QUEUE_SIZE = 2047`, the `want_merge` status is set.
+When underload or overload history has a non-negative weight, the flag `want_merge` or `want_split` is set.
 
-### 5. Checking the manual settings of the collator
+### 5. Final decision
 
-- If `want_split` is set in the collator settings, a split will be made.
-- If `want_merge` is set in the collator settings, a merge will be made.
+Validators decide to split or merge shards using `want_split` and `want_merge` flags and [workchain configuration parameters](/develop/howto/blockchain-configs#param-12).
 
-### 6. Final decision
+- If the shard has depth < `min_split` then it will split.
+- If the shard has depth > `max_split` then it will merge.
+- Shards with depth `min_split` cannot merge, shards with depth `max_split` cannot split.
+- If the block has `want_split` flag, the shard will split.
+- If the block and its sibling have `want_merge` flag, the shards will merge.
 
-The final decision to split or merge is based on `want_split` and `want_merge` values.
+Shards split and merge in `split_merge_delay = 100` seconds after the decision is made.
 
 ## Messages and Instant Hypercube Routing (Instant Hypercube Routing)
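
Note on the final text of PATCH 3/3: the classification and history-weight rules it documents may be easier to check against a small sketch. The C++ below is only an illustration of the rules as worded in the patch, not the collator's actual code. `Limits`, `classify`, `block_class`, `history_weight` and `flag_is_set` are hypothetical names, and the sketch assumes that a limit counts as reached when the value is greater than or equal to it, which the patch text does not state explicitly.

```cpp
#include <algorithm>
#include <bitset>
#include <cstdint>

// Hypothetical container for the underload/soft/hard limits of one parameter.
struct Limits {
  std::uint64_t underload;
  std::uint64_t soft;
  std::uint64_t hard;
  // Medium limit as described in the patch: (soft + hard) / 2.
  std::uint64_t medium() const { return (soft + hard) / 2; }
};

// Maps a parameter value to a class 0..4.
// Assumption: a limit is treated as reached when value >= limit.
int classify(std::uint64_t value, const Limits& lim) {
  if (value >= lim.hard) return 4;
  if (value >= lim.medium()) return 3;
  if (value >= lim.soft) return 2;
  if (value >= lim.underload) return 1;
  return 0;
}

// Block class is the maximum of the three per-parameter classes.
int block_class(int size_class, int gas_class, int lt_class) {
  return std::max({size_class, gas_class, lt_class});
}

// Weight of a 64-bit underload/overload history mask; lower bits are the
// most recent blocks. The constant term (3 + 2 + 1) * 16 * 2 / 3 equals 64.
int history_weight(std::uint64_t mask) {
  auto one_bits = [](std::uint64_t m) {
    return static_cast<int>(std::bitset<64>(m).count());
  };
  return one_bits(mask & 0xffffULL) * 3 + one_bits(mask & 0xffff0000ULL) * 2 +
         one_bits(mask & 0xffff00000000ULL) - (3 + 2 + 1) * 16 * 2 / 3;
}

// want_split / want_merge is set when the corresponding weight is non-negative.
bool flag_is_set(std::uint64_t mask) { return history_weight(mask) >= 0; }
```

With the block size limits from the patch (`128/256/512 KiB`), the medium limit in this sketch is 393216 bytes. An empty history has weight -64, and a history in which the 24 most recent blocks are all marked has weight exactly 0, which is the smallest run of consecutive recent blocks that sets the flag under this formula.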