Closed
Describe the bug
version: V0.8.1
According to the AWS documentation, an application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned Amazon S3 prefix:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html
Error log from the Quickwit indexer:
storage error(kind=Service, source=service error: unhandled error: unhandled error: Error { code: "SlowDown", message: "Please reduce your request rate.", s3_extended_request_id: "*******+QcTMf+==", aws_request_id: "***********" }
(ServiceError(ServiceError { source: Unhandled(Unhandled { source: ErrorMetadata { code: Some("SlowDown"), message: Some("Please reduce your request rate."), extras: Some({"s3_extended_request_id": "*********************+QcTMf+==", "aws_request_id": "**************"}) }, meta: ErrorMetadata { code: Some("SlowDown"),
message: Some("Please reduce your request rate."), extras: Some({"s3_extended_request_id": "**************+QcTMf+**************==", "aws_request_id": "**************"}) } }),
raw: Response { inner: Response { status: 503, version: HTTP/1.1, headers: {"x-amz-request-id": "**************", "x-amz-id-2": "**************+QcTMf+**************==", "content-type": "application/xml", "transfer-encoding": "chunked", "date": "Tue, 24 Sep 2024 15:12:39 GMT", "server": "AmazonS3", "connection": "close"}, body: SdkBody { inner: Once(Some(b"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Error><Code>SlowDown</Code><Message>Please reduce your request rate.</Message><RequestId>**************</RequestId><HostId>**************+QcTMf+**************==</HostId></Error>")), retryable: true } },
properties: SharedPropertyBag(Mutex { data: PropertyBag { contents: ["aws_types::SigningService", "alloc::vec::Vec<http::version::Version>", "aws_smithy_http::operation::Metadata", "aws_smithy_http::connection::CaptureSmithyConnection", "aws_credential_types::credentials_impl::Credentials", "aws_http::user_agent::AwsUserAgent", "aws_sig_auth::signer::OperationSigningConfig", "aws_types::region::Region", "aws_smithy_types::endpoint::Endpoint", "aws_sig_auth::middleware::Signature", "aws_credential_types::cache::SharedCredentialsCache", "aws_sdk_s3::endpoint::Params", "aws_types::region::SigningRegion"] }, poisoned: false, .. }) } })))
After this error occurred, the Quickwit cluster became very unstable: Kafka consumption kept rebalancing continuously, and merge operations could no longer be performed.
How to fix? According to AWS's recommendation, the S3 key prefixes need to be subdivided so that requests are spread across more prefixes, since the request-rate limits apply per prefix.
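
To make the suggestion concrete, here is a minimal sketch of one way to subdivide prefixes: derive a short shard component from a hash of the object name and insert it into the key. This is only an illustration, not Quickwit's actual storage layout. The function `sharded_key`, the `indexes/my-index` base path, the `.split` file names, and the default of 16 shards are all assumptions made up for this example. Because the limits are per prefix, the AWS page linked above notes that 10 prefixes could scale reads to roughly 55,000 requests per second.

```python
import hashlib


def sharded_key(base: str, name: str, num_shards: int = 16) -> str:
    """Return an object key of the form <base>/<shard>/<name>.

    Hypothetical helper for illustration only. The shard is derived from a
    hash of the object name, so the same name always maps to the same prefix
    and readers can recompute the key without a lookup table.
    """
    if not 1 <= num_shards <= 256:
        raise ValueError("num_shards must be between 1 and 256")
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce it
    # to a shard number in the range [0, num_shards).
    shard = int.from_bytes(digest[:8], "big") % num_shards
    # Two hex digits are enough for up to 256 shards.
    return f"{base.rstrip('/')}/{shard:02x}/{name}"


if __name__ == "__main__":
    # Made-up object names; each one lands under one of 16 sub-prefixes.
    for name in ["split-0001.split", "split-0002.split", "split-0003.split"]:
        print(sharded_key("indexes/my-index", name))
```

The trade-off is that listing all objects of an index then requires one list call per shard prefix, and existing objects would need to be migrated or looked up under the old layout.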