DOC-12485 prevent bucket from running out of space #3811

Open: wants to merge 5 commits into base: release/8.0

ggray-cb (Contributor) commented May 30, 2025

This doc PR covers the Morpheus feature that lets users set a threshold to prevent the data storage path from becoming full (MB-59113).

It also addresses several other issues in the areas of the documentation that were being updated anyhow:

  • DOC-12778 Data Settings guidance for reader/writer threads change to 'disk i/o optimized' needs to be revised
  • New alert added by MB-58882 and MB-57062 which weren't labeled with needs-doc so they were not called to our attention.
  • New alert added by MB-65138 (alert when there is an item count mismatch between an index and its replica), which someone tried to alert docs about, but sadly they typoed the label as "need-doc" so we didn't see it on our dashboard.

Main changes in this PR, with links to the preview site (see here for username/password for the site):

  • Added a What's New entry.
  • Storage Properties: lots of editing to bring it up to doc standards. Added a new section (Filesystem Free Space and Usage Limits) to cover the new default alert and the ability to limit disk use.
  • Updated the Available Alerts section of the Alerts page to add the new default disk use percent alert. Also added the alerts for stuck rebalance and index replica divergence.
  • In the Data Settings section of the General page, revised to meet doc standards. Added documentation on the checkbox to enable the data limit. Also revised the guidance on when to use the Disk i/o optimized setting, as requested by DOC-12778.
  • Set Data Disk Use Limits: new page for the new REST API endpoint to change disk usage limit settings.
  • Setting Alerts: added the limit entry for maxDataDiskUsedPerc to set the default warning disk usage threshold. Also added entries for the stuck rebalance thresholds for the alert added by MB-58882 and MB-57062.

ggray-cb added 5 commits May 28, 2025 11:46
* Initial pass on Storage Properties to bring up to doc standards.
* Added some coverage for alerts that were added without alerting the doc team: rebalance timeouts and an index issue that I haven't dug into.

anuthan commented Jun 2, 2025

Thanks @ggray-cb. I glanced over it and have one minor comment.
@Peter-Searby, could you do the review? I just glanced over it. Thanks.

Items written to disk are always written in compressed form.
Based on bucket configuration, items may be maintained in compressed form in memory also.
See xref:buckets-memory-and-storage/compression.adoc[Compression] for information.
Disk access does not interrupt most client interactions.

"Disk access does not interrupt most client interactions." We should probably get rid of this line. Durable writes, which are client operations, require a flush to disk.

Peter-Searby (Contributor) left a comment

Given the number of changes to the kv/storage sections, perhaps it would be worth having someone from one or both of those teams review this?

You can configure Couchbase Server to prevent writes to buckets from consuming all of the disk space in a node.
You set a minimum amount of space every node must have free in the filesystem used by the data service.
If the node has less free space than this limit, Couchbase Server prevents writes to buckets.
Even if you do not set this limit, Couchbase Server now alerts you when a node starts to run out of disk space.

See xref:buckets-memory-and-storage/buckets.adoc[Buckets] for information.
Couchbase Server compresses the data it writes to disk.
Compression reduces the amount of disk space used which can help reduce costs.
It also makes the backup and restore procedures easier.

I'm not sure what compression has to do with the ease of backup and restore. Does this just mean speed/performance?


For illustrations of how Couchbase Server saves new and updates existing Couchbase-bucket items, thereby employing both memory and storage resources, see xref:buckets-memory-and-storage/memory-and-storage.adoc[Memory and Storage].
To see how Couchbase Server saves new items and updates existing items in Couchbase buckets, using both memory and storage, seexref:buckets-memory-and-storage/memory-and-storage.adoc[Memory and Storage].

Suggested change
To see how Couchbase Server saves new items and updates existing items in Couchbase buckets, using both memory and storage, seexref:buckets-memory-and-storage/memory-and-storage.adoc[Memory and Storage].
To see how Couchbase Server saves new items and updates existing items in Couchbase buckets, using both memory and storage, see xref:buckets-memory-and-storage/memory-and-storage.adoc[Memory and Storage].

You can control the number of reader and writer threads.
In the Couchbase Server Web Console, you can have Couchbase Server automatically choose a default value or a value that optimizes disk I/O.
You can also manually set the number of threads per node to a value between 1 and 64.
Using a higher number of threads may improve performance if your hardware supports it, such as when your CPU has a high large of cores.

Suggested change
Using a higher number of threads may improve performance if your hardware supports it, such as when your CPU has a high large of cores.
Using a higher number of threads may improve performance if your hardware supports it, such as when your CPU has a larger number of cores.

You can also manually set the number of threads per node to a value between 1 and 64.
Using a higher number of threads may improve performance if your hardware supports it, such as when your CPU has a high large of cores.
Increasing the number of writer threads helps optimize durable writes.
For more information, see xref:learn:data/durability.adoc[Durability].

This feels like it applies to the whole paragraph, when it is really only relevant to the prior sentence. Perhaps it could be phrased better?

Left-clicking on the *Advanced Data Settings* tab displays radio buttons for *Reader Thread Settings* and *Writer Thread Settings*:
The *Reader Thread Settings* and *Writer Thread Settings* options let you control the number of threads the Data Service uses on each node to read and write data.
Allocating more threads can improve performance.
In particular, adding more writer threads can improve durable write performance,.

Suggested change
In particular, adding more writer threads can improve durable write performance,.
In particular, adding more writer threads can improve durable write performance.

[[get-privs]]
=== Required Privileges

You must have at least on one of the following roles:

Suggested change
You must have at least on one of the following roles:
You must have at least one of the following roles:

[source,bash]
----
curl -u Administrator:password \
-X GET 'http://127.0.0.1:8091//settings/resourceManagement' | jq

Suggested change
-X GET 'http://127.0.0.1:8091//settings/resourceManagement' | jq
-X GET 'http://127.0.0.1:8091/settings/resourceManagement' | jq
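
The double slash fixed in this suggestion is easy to reintroduce when a base URL and a path are joined by hand. A minimal Python sketch of the same GET request, built but not sent, assuming the host, port, and credentials from the curl example; `build_settings_request` is a hypothetical helper, not part of any Couchbase SDK:

```python
import base64
import urllib.request


def build_settings_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for /settings/resourceManagement."""
    # Strip trailing slashes from the base so the join never produces '//'.
    url = base_url.rstrip("/") + "/settings/resourceManagement"
    # HTTP Basic authentication, equivalent to curl's -u user:password.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    req = build_settings_request("http://127.0.0.1:8091/", "Administrator", "password")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. The response format is not shown in this conversation, so the sketch does not parse it.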

@@ -173,6 +184,12 @@ NOTE: If the node exceeds 90% of the available system connections, then please c

* `memcachedUserConnectionWarningThreshold`. Trigger the `xref:manage:manage-settings/configure-alerts.adoc#memcached-alert[memcached_connections]` alert if the number of `user` connections in use exceeds the given percentage of connections available. (E.g., if this value is set to `90`, the system will trigger an alert if the number of user connections used by the data service exceeds 90% of the available connections.)

* `stuckRebalanceThresholdIndex` and `stuckRebalanceThresholdKV`.
Sets the timeout threshold for an index rebalance and a data operation to be considered stuck.

Suggested change
Sets the timeout threshold for an index rebalance and a data operation to be considered stuck.
Sets the timeout threshold for a data or index service rebalance to make no identified progress to be considered stuck.
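
One way to read the suggested wording is as a per-service check on how long a rebalance has gone without progress. A rough Python sketch under stated assumptions: the threshold values, the unit (seconds), and the "greater than or equal" comparison are all hypothetical, and only the two setting names come from the text above.

```python
from typing import Dict

# Hypothetical values in seconds; the real defaults and units are not stated in this PR.
EXAMPLE_THRESHOLDS: Dict[str, float] = {
    "stuckRebalanceThresholdKV": 600.0,
    "stuckRebalanceThresholdIndex": 1200.0,
}


def is_rebalance_stuck(setting_name: str,
                       seconds_without_progress: float,
                       thresholds: Dict[str, float]) -> bool:
    """Return True when a rebalance has made no progress for at least the threshold."""
    return seconds_without_progress >= thresholds[setting_name]


if __name__ == "__main__":
    # The same idle time can count as stuck for one service but not the other.
    for name in EXAMPLE_THRESHOLDS:
        print(name, is_rebalance_stuck(name, 900.0, EXAMPLE_THRESHOLDS))
```

Keeping the two thresholds separate mirrors the two settings: a data rebalance and an index rebalance can tolerate different idle periods before the alert fires.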


For all information on using the REST API for compaction, see the xref:rest-api:compaction-rest-api.adoc[Compaction API].
You can enable a feature to have Couchbase Server stop writing to the Data Service storage path when it reaches a certain percentage of disk usage.

Suggested change
You can enable a feature to have Couchbase Server stop writing to the Data Service storage path when it reaches a certain percentage of disk usage.
You can enable a feature to have Couchbase Server Data Service stop writing to the Data Service storage path when it reaches a certain percentage of disk usage.

Also worth noting that this storage path may be on the same disk as other data, which may still be written to.

You can also perform compaction manually on a specific bucket.
For information about performing manual compaction with the command line, see xref:cli:cbcli/couchbase-cli-bucket-compact.adoc[bucket-compact].

For all information about using the REST API for compaction, see the xref:rest-api:compaction-rest-api.adoc[Compaction API].

== Disk I/O Priority

Assuming this is the Bucket Priority (in the UI), my understanding is that this doesn't actually do anything. I'm not too sure why we've kept the config around, but it would be worth getting confirmation from KV how this should be documented (they look to be planning on cleaning this up in Ponyo: https://jira.issues.couchbase.com/browse/MB-66579)

@@ -202,17 +202,30 @@ The size of the change history may need to be increased.
For information, on establishing change-history size, see xref:rest-api:rest-bucket-create.adoc[Creating and Editing Buckets].
| `history_size_warning`

| Low Indexer Residence Percentage
| Approaching Indexer low resident percentage
| Warns that the Index Service is, on a given node, occupying a percentage of available memory that is below an established threshold, the default for which is `10`.

Suggested change
| Warns that the Index Service is, on a given node, occupying a percentage of available memory that is below an established threshold, the default for which is `10`.
| Warns that the Index Service is, on a given node, occupying a percentage of available memory that is below an established threshold, the default for which is `10`%.


A high thread-allocation may improve performance on systems whose hardware-resources are commensurately supportive (for example, where the number of CPU cores is high).
In particular, a high number of _writer_ threads on such systems may significantly optimize the performance of _durable writes_: see xref:learn:data/durability.adoc[Durability], for information.
*Prevent writes to buckets when storage becomes <number>% full* controls whether Couchbase Server prevents the filesystem containing the data path from becoming full.

"whether Couchbase Server prevents the filesystem containing the data path from becoming full."
This is too strongly worded. We can't prevent the filesystem from becoming full, so let's be careful not to imply that we can.

This alert warns you that the disk is becoming full.
It occurs even if data disk usage limits are not enabled.
The value must be an integer between `1` and `100`, which is the percentage of disk space used.
It defaults to `90`.

It actually defaults to 75%. Also, if the data disk limit is enabled, then it will ignore the configured threshold and use 10% less than the enforcement threshold.
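
The rule described in this comment can be sketched as a small function. This is an illustration of the comment, not the server's implementation. It assumes that "10% less" means 10 percentage points and that the enforcement limit accepts the same 1 to 100 range as the alert threshold. `effective_alert_threshold` and its parameter names are hypothetical.

```python
from typing import Optional

DEFAULT_ALERT_PERCENT = 75  # default reported in the review comment above
ALERT_MARGIN_POINTS = 10    # assumption: "10% less" means 10 percentage points


def _check_percent(name: str, value: int) -> None:
    """Reject anything that is not an integer between 1 and 100."""
    if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 100:
        raise ValueError(f"{name} must be an integer between 1 and 100, got {value!r}")


def effective_alert_threshold(configured_percent: int = DEFAULT_ALERT_PERCENT,
                              limit_enabled: bool = False,
                              limit_percent: Optional[int] = None) -> int:
    """Return the disk-used percentage at which the warning alert would fire."""
    _check_percent("configured_percent", configured_percent)
    if not limit_enabled:
        # No enforcement limit: the configured (or default) threshold applies.
        return configured_percent
    if limit_percent is None:
        raise ValueError("limit_percent is required when the limit is enabled")
    _check_percent("limit_percent", limit_percent)
    # With the limit enabled, the configured threshold is ignored.
    # Behaviour for limits of 10 or below is not described, so no clamping is applied.
    return limit_percent - ALERT_MARGIN_POINTS


if __name__ == "__main__":
    print(effective_alert_threshold())
    print(effective_alert_threshold(90, limit_enabled=True, limit_percent=85))
```

With the limit enabled at 85, the sketch returns 75 even though 90 was configured, which is the override the comment describes.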
