diff --git a/deploy-manage/cloud-organization/tools-and-apis.md b/deploy-manage/cloud-organization/tools-and-apis.md
index e1a502ebf..df57726d2 100644
--- a/deploy-manage/cloud-organization/tools-and-apis.md
+++ b/deploy-manage/cloud-organization/tools-and-apis.md
@@ -18,7 +18,7 @@ Most Elastic resources can be accessed and managed through RESTful APIs. While t
 Elasticsearch APIs
 : This set of APIs allows you to interact directly with the Elasticsearch nodes in your deployment. You can ingest data, run search queries, check the health of your clusters, manage snapshots, and more.

- To use these APIs on {{ecloud}} read our topic [Access the API console](asciidocalypse://docs/cloud/docs/reference/cloud-hosted/ec-api-console.md), and to learn about all of the available endpoints check the [Elasticsearch API reference documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md).
+ To use these APIs on {{ecloud}}, read our topic [Access the API console](asciidocalypse://docs/cloud/docs/reference/cloud-hosted/ec-api-console.md), and to learn about all of the available endpoints, check the [Elasticsearch API reference documentation](elasticsearch://reference/elasticsearch/rest-apis/index.md).

 Some [restrictions](../deploy/elastic-cloud/restrictions-known-problems.md#ec-restrictions-apis-elasticsearch) apply when using the Elasticsearch APIs on {{ecloud}}.

diff --git a/deploy-manage/deploy/cloud-enterprise/add-custom-bundles-plugins.md b/deploy-manage/deploy/cloud-enterprise/add-custom-bundles-plugins.md
index 74e917ddd..d97006ba1 100644
--- a/deploy-manage/deploy/cloud-enterprise/add-custom-bundles-plugins.md
+++ b/deploy-manage/deploy/cloud-enterprise/add-custom-bundles-plugins.md
@@ -1,8 +1,8 @@
 ---
-mapped_pages: 
+mapped_pages:
   - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-add-custom-bundle-plugin.html
 navigation_title: "Custom bundles and plugins"
-applies_to: 
+applies_to:
   deployment:
     ece:
 ---
@@ -360,7 +360,7 @@ You do not need to do this step if you are using default filename and password (
 }
 ```

-4. To use this bundle, you can refer it in the [GeoIP processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) of an ingest pipeline as `MyGeoLite2-City.mmdb` under `database_file` such as:
+4. To use this bundle, you can refer to it in the [GeoIP processor](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) of an ingest pipeline as `MyGeoLite2-City.mmdb` under `database_file`, such as:

 ```sh
 ...

diff --git a/deploy-manage/deploy/cloud-enterprise/ce-add-support-for-node-roles-autoscaling.md b/deploy-manage/deploy/cloud-enterprise/ce-add-support-for-node-roles-autoscaling.md
index d3c4db11e..88ed3c041 100644
--- a/deploy-manage/deploy/cloud-enterprise/ce-add-support-for-node-roles-autoscaling.md
+++ b/deploy-manage/deploy/cloud-enterprise/ce-add-support-for-node-roles-autoscaling.md
@@ -17,7 +17,7 @@ The `node_roles` field defines the roles that an Elasticsearch topology element
 There are a number of fields that need to be added to each Elasticsearch node in order to support `node_roles`:

 * **id**: Unique identifier of the topology element. This field, along with the `node_roles`, identifies an Elasticsearch topology element.
-* **node_roles**: The list of node roles. Allowable roles are: `master`, `ingest`, `ml`, `data_hot`, `data_content`, `data_warm`, `data_cold`, `data_frozen`, `remote_cluster_client`, and `transform`. 
For details, check [Node roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles). +* **node_roles**: The list of node roles. Allowable roles are: `master`, `ingest`, `ml`, `data_hot`, `data_content`, `data_warm`, `data_cold`, `data_frozen`, `remote_cluster_client`, and `transform`. For details, check [Node roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles). * **topology_element_control**: Controls for the topology element. * **min**: The absolute minimum size limit for a topology element. If the value is `0`, that means the topology element can be disabled. diff --git a/deploy-manage/deploy/cloud-enterprise/ece-ha.md b/deploy-manage/deploy/cloud-enterprise/ece-ha.md index 2d57c6f76..105d67da0 100644 --- a/deploy-manage/deploy/cloud-enterprise/ece-ha.md +++ b/deploy-manage/deploy/cloud-enterprise/ece-ha.md @@ -31,11 +31,11 @@ Increasing the number of zones should not be used to add more resources. The con ## Master nodes [ece-ece-ha-2-master-nodes] -$$$ece-ha-tiebreaker$$$Tiebreakers are used in distributed clusters to avoid cases of [split brain](https://en.wikipedia.org/wiki/Split-brain_(computing)), where an {{es}} cluster splits into multiple, autonomous parts that continue to handle requests independently of each other, at the risk of affecting cluster consistency and data loss. A split-brain scenario is avoided by making sure that a minimum number of [master-eligible nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#master-node) must be present in order for any part of the cluster to elect a master node and accept user requests. To prevent multiple parts of a cluster from being eligible, there must be a [quorum-based majority](/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md) of `(n/2)+1` nodes, where `n` is the number of master-eligible nodes in the cluster. The minimum number of master nodes to reach quorum in a two-node cluster is the same as for a three-node cluster: two nodes must be available. +$$$ece-ha-tiebreaker$$$Tiebreakers are used in distributed clusters to avoid cases of [split brain](https://en.wikipedia.org/wiki/Split-brain_(computing)), where an {{es}} cluster splits into multiple, autonomous parts that continue to handle requests independently of each other, at the risk of affecting cluster consistency and data loss. A split-brain scenario is avoided by making sure that a minimum number of [master-eligible nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#master-node) must be present in order for any part of the cluster to elect a master node and accept user requests. To prevent multiple parts of a cluster from being eligible, there must be a [quorum-based majority](/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md) of `(n/2)+1` nodes, where `n` is the number of master-eligible nodes in the cluster. The minimum number of master nodes to reach quorum in a two-node cluster is the same as for a three-node cluster: two nodes must be available. When you create a cluster with nodes in two availability zones when a third zone is available, Elastic Cloud Enterprise can create a tiebreaker in the third availability zone to help establish quorum in case of loss of an availability zone. 
The extra tiebreaker node that helps to provide quorum does not have to be a full-fledged and expensive node, as it does not hold data. For example, by tagging allocator hosts in Elastic Cloud Enterprise, you can create a cluster with eight nodes each in zones `ece-1a` and `ece-1b`, for a total of 16 nodes, and one tiebreaker node in zone `ece-1c`. This cluster can lose any of the three availability zones while maintaining quorum, which means that the cluster can continue to process user requests, provided that there is sufficient capacity available when an availability zone goes down.

-By default, each node in an {{es}} cluster is a master-eligible node and a data node. In larger clusters, such as production clusters, it’s a good practice to split the roles, so that master nodes are not handling search or indexing work. When you create a cluster, you can specify to use dedicated [master-eligible nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#master-node), one per availability zone.
+By default, each node in an {{es}} cluster is a master-eligible node and a data node. In larger clusters, such as production clusters, it’s a good practice to split the roles, so that master nodes are not handling search or indexing work. When you create a cluster, you can specify to use dedicated [master-eligible nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#master-node), one per availability zone.

 ::::{warning}
 Clusters that only have two or fewer master-eligible nodes are not [highly available](/deploy-manage/production-guidance/availability-and-resilience.md) and are at risk of data loss. You must have [at least three master-eligible nodes](/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md).
diff --git a/deploy-manage/deploy/cloud-enterprise/ece-manage-capacity.md b/deploy-manage/deploy/cloud-enterprise/ece-manage-capacity.md
index ab8440506..d55c417c4 100644
--- a/deploy-manage/deploy/cloud-enterprise/ece-manage-capacity.md
+++ b/deploy-manage/deploy/cloud-enterprise/ece-manage-capacity.md
@@ -15,13 +15,13 @@ This section focuses on the allocator role, and explains how to plan its capacit
 * [Storage](#ece-alloc-storage)

-## Memory [ece-alloc-memory] 
+## Memory [ece-alloc-memory]

 You should plan your deployment size based on the amount of data you ingest. Memory is the main scaling unit for a deployment. Other units, like CPU and disks, are proportional to the memory size.

 The memory available for an allocator is called *capacity*. During installation, the allocator capacity defaults to 85% of the host physical memory, as the rest is reserved for ECE system services.

-::::{note} 
+::::{note}
 ECE does not support hot-adding of resources to a running node. When increasing CPU/memory allocated to an ECE node, a restart is needed to utilize the additional resources.
 ::::

@@ -38,13 +38,13 @@ curl -X PUT \

 For more information on how to use API keys for authentication, check the section [Access the API from the Command Line](asciidocalypse://docs/cloud/docs/reference/cloud-enterprise/ece-api-command-line.md).

-::::{important} 
+::::{important}
 Prior to ECE 3.5.0, regardless of the use of this API, the [CPU quota](#ece-alloc-cpu) used the memory specified at installation time.
 ::::


-### Examples [ece_examples] 
+### Examples [ece_examples]

 Here are some examples to make Elastic deployments and ECE system services run smoothly on your host:

@@ -56,14 +56,14 @@ Note that the recommended reservations above are not guaranteed upper limits, if

 These fluctuations should not be a concern in practice. To get actual limits that could be used in alerts, you could add 4GB to the recommended values above.

-## CPU quotas [ece-alloc-cpu] 
+## CPU quotas [ece-alloc-cpu]

 ECE uses CPU quotas to assign shares of the allocator host to the instances that are running on it. To calculate the CPU quota, use the following formula:

 `CPU quota = DeploymentRAM / HostCapacity`

-### Examples [ece_examples_2] 
+### Examples [ece_examples_2]

 Consider a 32GB deployment hosted on a 128GB allocator.

@@ -84,19 +84,19 @@ If you use 12GB Allocator system service reservation, the CPU quota is 28%:

 Those percentages represent the upper limit on the percentage of total CPU resources available in a given 100ms period.

-## Processors setting [ece-alloc-processors-setting] 
+## Processors setting [ece-alloc-processors-setting]

 In addition to the [CPU quotas](#ece-alloc-cpu), the `processors` setting also plays a relevant role.

-The allocated `processors` setting originates from Elasticsearch and is responsible for calculating your [thread pools](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md#node.processors). While the CPU quota defines the percentage of the total CPU resources of an allocator that are assigned to an instance, the allocated `processors` define how the thread pools are calculated in Elasticsearch, and therefore how many concurrent search and indexing requests an instance can process. In other words, the CPU ratio defines how fast a single task can be completed, while the `processors` setting defines how many different tasks can be completed at the same time.
+The allocated `processors` setting originates from Elasticsearch and is responsible for calculating your [thread pools](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md#node.processors). While the CPU quota defines the percentage of the total CPU resources of an allocator that are assigned to an instance, the allocated `processors` define how the thread pools are calculated in Elasticsearch, and therefore how many concurrent search and indexing requests an instance can process. In other words, the CPU ratio defines how fast a single task can be completed, while the `processors` setting defines how many different tasks can be completed at the same time.

 We rely on Elasticsearch and the `-XX:ActiveProcessorCount` JVM setting to automatically detect the allocated `processors`.
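To see what a node actually detected, the nodes info API reports the allocated processor count; a minimal check might look like this (the endpoint and credentials are placeholders, not part of the ECE examples above):

```sh
# Hedged sketch: read back the processor count each node detected
# (endpoint and credentials are placeholders)
curl -s -u "$ES_USER:$ES_PASS" \
  "https://<deployment-endpoint>:9243/_nodes/os?filter_path=nodes.*.name,nodes.*.os.allocated_processors"
```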
-In earlier versions of ECE and Elasticsearch, the [Elasticsearch processors](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md#node.processors) setting was used to configure the allocated `processors` according to the following formula: +In earlier versions of ECE and Elasticsearch, the [Elasticsearch processors](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md#node.processors) setting was used to configure the allocated `processors` according to the following formula: `Math.min(16,Math.max(2,(16*instanceCapacity*1.0/1024/64).toInt))` -The following table gives an overview of the allocated `processors` that are used to calculate the Elasticsearch [thread pools](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md) based on the preceding formula: +The following table gives an overview of the allocated `processors` that are used to calculate the Elasticsearch [thread pools](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md) based on the preceding formula: | instance size | vCPU | | --- | --- | @@ -110,18 +110,18 @@ The following table gives an overview of the allocated `processors` that are use This table also provides a rough indication of what the auto-detected value could be on newer versions of ECE and Elasticsearch. -## Storage [ece-alloc-storage] +## Storage [ece-alloc-storage] ECE has specific [hardware prerequisites](ece-hardware-prereq.md) for storage. Disk space is consumed by system logs, container overhead, and deployment data. The main factor for selecting a disk quota is the deployment data, that is, data from your Elasticsearch, Kibana, and APM nodes. The biggest portion of data is consumed by the Elasticsearch nodes. -::::{note} +::::{note} ECE uses [XFS](ece-software-prereq.md#ece-xfs) to enforce specific disk space quotas to control the disk consumption for the deployment nodes running on your allocator. :::: -::::{important} +::::{important} You must use XFS and have quotas enabled on all allocators, otherwise disk usage won’t display correctly. :::: @@ -144,7 +144,7 @@ You can change the value of the disk multiplier at different levels: 3. Adjust the disk quota to your needs. -::::{important} +::::{important} The override only persists during the lifecycle of the instance container. If a new container is created, for example during a `grow_and_shrink` plan or a vacate operation, the quota is reset to its default. To increase the storage ratio in a persistent way, [edit the instance configurations](ece-configuring-ece-instance-configurations-edit.md). :::: diff --git a/deploy-manage/deploy/cloud-enterprise/post-installation-steps.md b/deploy-manage/deploy/cloud-enterprise/post-installation-steps.md index ee7bafd32..7ea271136 100644 --- a/deploy-manage/deploy/cloud-enterprise/post-installation-steps.md +++ b/deploy-manage/deploy/cloud-enterprise/post-installation-steps.md @@ -8,7 +8,7 @@ mapped_pages: After your Elastic Cloud Enterprise installation is up, some additional steps might be required: * Add your own load balancer. Load balancers are user supplied and we do not currently provide configuration steps for you. 
-* [Add more capacity](../../maintenance/ece/scale-out-installation.md) to your Elastic Cloud Enterprise installation, [resize your deployment](resize-deployment.md), [upgrade to a newer Elasticsearch version](../../upgrade/deployment-or-cluster.md), and [add some plugins](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/cloud-enterprise/ece-add-plugins.md). +* [Add more capacity](../../maintenance/ece/scale-out-installation.md) to your Elastic Cloud Enterprise installation, [resize your deployment](resize-deployment.md), [upgrade to a newer Elasticsearch version](../../upgrade/deployment-or-cluster.md), and [add some plugins](elasticsearch://reference/elasticsearch-plugins/cloud-enterprise/ece-add-plugins.md). * [Configure ECE system deployments](system-deployments-configuration.md) to ensure a highly available and resilient setup. * [Configure ECE for deployment templates](configure-deployment-templates.md) to indicate what kind of hardware you have available for Elastic Stack deployments. * [Install your security certificates](../../security/secure-your-elastic-cloud-enterprise-installation/manage-security-certificates.md) to enable TLS/SSL authentication for secure connections over HTTPS. @@ -19,7 +19,7 @@ After your Elastic Cloud Enterprise installation is up, some additional steps mi * Learn how to work around host maintenance or a host failure by [moving nodes off of an allocator](../../maintenance/ece/move-nodes-instances-from-allocators.md). * If you received a license from Elastic, [manage the licenses](../../license/manage-your-license-in-ece.md) for your Elastic Cloud Enterprise installation. -::::{warning} +::::{warning} During installation, the system generates secrets that are placed into the `/mnt/data/elastic/bootstrap-state/bootstrap-secrets.json` secrets file, unless you passed in a different path with the --host-storage-path parameter. Keep the information in the `bootstrap-secrets.json` file secure by removing it from its default location and placing it into a secure storage location. :::: diff --git a/deploy-manage/deploy/cloud-on-k8s/advanced-elasticsearch-node-scheduling.md b/deploy-manage/deploy/cloud-on-k8s/advanced-elasticsearch-node-scheduling.md index 8e0e8d2c8..a26e20417 100644 --- a/deploy-manage/deploy/cloud-on-k8s/advanced-elasticsearch-node-scheduling.md +++ b/deploy-manage/deploy/cloud-on-k8s/advanced-elasticsearch-node-scheduling.md @@ -19,7 +19,7 @@ You can combine these features to deploy a production-grade Elasticsearch cluste ## Define Elasticsearch nodes roles [k8s-define-elasticsearch-nodes-roles] -You can configure Elasticsearch nodes with [one or multiple roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md). +You can configure Elasticsearch nodes with [one or multiple roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md). ::::{tip} You can use [YAML anchors](https://yaml.org/spec/1.2/spec.md#id2765878) to declare the configuration change once and reuse it across all the node sets. @@ -198,7 +198,7 @@ This example restricts Elasticsearch nodes so they are only scheduled on Kuberne ## Topology spread constraints and availability zone awareness [k8s-availability-zone-awareness] -Starting with ECK 2.0 the operator can make Kubernetes Node labels available as Pod annotations. It can be used to make information, such as logical failure domains, available in a running Pod. 
Combined with [Elasticsearch shard allocation awareness](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#allocation-awareness) and [Kubernetes topology spread constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/), you can create an availability zone-aware Elasticsearch cluster.
+Starting with ECK 2.0, the operator can make Kubernetes Node labels available as Pod annotations. This can be used to make information, such as logical failure domains, available in a running Pod. Combined with [Elasticsearch shard allocation awareness](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) and [Kubernetes topology spread constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/), you can create an availability zone-aware Elasticsearch cluster.


 ### Exposing Kubernetes node topology labels in Pods [k8s-availability-zone-awareness-downward-api]

@@ -253,13 +253,13 @@ This example relies on:

 * Kubernetes nodes in each zone being labeled accordingly. `topology.kubernetes.io/zone` [is standard](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#interlude-built-in-node-labels), but any label can be used.
 * [Pod topology spread constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) to spread the Pods across availability zones in the Kubernetes cluster.
-* Elasticsearch configured to [allocate shards based on node attributes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#allocation-awareness). Here we specified `node.attr.zone`, but any attribute name can be used. `node.attr.rack_id` is another common example.
+* Elasticsearch configured to [allocate shards based on node attributes](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md). Here we specified `node.attr.zone`, but any attribute name can be used. `node.attr.rack_id` is another common example.
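Once such a cluster is up, you can confirm that every node picked up its zone attribute through the `_cat/nodeattrs` API. A sketch, assuming the quickstart naming conventions for the secret and service:

```sh
# Hedged sketch: confirm each node reports its zone attribute
# (secret and service names assume a cluster named "quickstart")
PASSWORD=$(kubectl get secret quickstart-es-elastic-user \
  -o go-template='{{.data.elastic | base64decode}}')
kubectl port-forward service/quickstart-es-http 9200 &
curl -s -k -u "elastic:$PASSWORD" \
  "https://localhost:9200/_cat/nodeattrs?v&h=node,attr,value" | grep zone
```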
## Hot-warm topologies [k8s-hot-warm-topologies] -By combining [Elasticsearch shard allocation awareness](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#allocation-awareness) with [Kubernetes node affinity](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature), you can set up an Elasticsearch cluster with hot-warm topology: +By combining [Elasticsearch shard allocation awareness](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) with [Kubernetes node affinity](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature), you can set up an Elasticsearch cluster with hot-warm topology: ```yaml apiVersion: elasticsearch.k8s.elastic.co/v1 diff --git a/deploy-manage/deploy/cloud-on-k8s/create-custom-images.md b/deploy-manage/deploy/cloud-on-k8s/create-custom-images.md index 806783180..bd44eea45 100644 --- a/deploy-manage/deploy/cloud-on-k8s/create-custom-images.md +++ b/deploy-manage/deploy/cloud-on-k8s/create-custom-images.md @@ -8,7 +8,7 @@ mapped_pages: # Create custom images [k8s-custom-images] -You can create your own custom application images (Elasticsearch, Kibana, APM Server, Beats, Elastic Agent, Elastic Maps Server, and Logstash) instead of using the base images provided by Elastic. You might want to do this to have a canonical image with all the necessary plugins pre-loaded rather than [installing them through an init container](init-containers-for-plugin-downloads.md) each time a Pod starts. You must use the official image as the base for custom images. For example, if you want to create an Elasticsearch 8.16.1 image with the [ICU Analysis Plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/analysis-icu.md), you can do the following: +You can create your own custom application images (Elasticsearch, Kibana, APM Server, Beats, Elastic Agent, Elastic Maps Server, and Logstash) instead of using the base images provided by Elastic. You might want to do this to have a canonical image with all the necessary plugins pre-loaded rather than [installing them through an init container](init-containers-for-plugin-downloads.md) each time a Pod starts. You must use the official image as the base for custom images. For example, if you want to create an Elasticsearch 8.16.1 image with the [ICU Analysis Plugin](elasticsearch://reference/elasticsearch-plugins/analysis-icu.md), you can do the following: 1. Create a `Dockerfile` containing: diff --git a/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md b/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md index 4e68b62a3..d0480c52d 100644 --- a/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md +++ b/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md @@ -56,7 +56,7 @@ Refer to [Creating custom images](create-custom-images.md) for instructions on h ## Use init containers for plugins installation -The following example describes option 2, using a repository plugin. To install the plugin before the Elasticsearch nodes start, use an init container to run the [plugin installation tool](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/installation.md). +The following example describes option 2, using a repository plugin. 
To install the plugin before the Elasticsearch nodes start, use an init container to run the [plugin installation tool](elasticsearch://reference/elasticsearch-plugins/installation.md).

 ```yaml
 spec:
@@ -107,7 +107,7 @@ To install custom configuration files you can:

 1. Add the configuration data into a ConfigMap or Secret.
 2. Use volumes and volume mounts in your manifest to mount the contents of the ConfigMap or Secret as files in your {{es}} nodes.

-The next example shows how to add a synonyms file for the [synonym token filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) in Elasticsearch. But you can **use the same approach for any kind of file you want to mount into the configuration directory of Elasticsearch**, like adding CA certificates of external systems.
+The next example shows how to add a synonyms file for the [synonym token filter](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) in Elasticsearch. But you can **use the same approach for any kind of file you want to mount into the configuration directory of Elasticsearch**, like adding CA certificates of external systems.

 1. Create the ConfigMap or Secret with the data:

diff --git a/deploy-manage/deploy/cloud-on-k8s/elasticsearch-deployment-quickstart.md b/deploy-manage/deploy/cloud-on-k8s/elasticsearch-deployment-quickstart.md
index 4a7ac2b47..22a88db27 100644
--- a/deploy-manage/deploy/cloud-on-k8s/elasticsearch-deployment-quickstart.md
+++ b/deploy-manage/deploy/cloud-on-k8s/elasticsearch-deployment-quickstart.md
@@ -108,7 +108,7 @@ NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
 quickstart-es-http ClusterIP 10.15.251.145 <none> 9200/TCP 34m
 ```

-In order to make requests to the [{{es}} API](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md):
+In order to make requests to the [{{es}} API](elasticsearch://reference/elasticsearch/rest-apis/index.md):

 1. Get the credentials.

diff --git a/deploy-manage/deploy/cloud-on-k8s/nodes-orchestration.md b/deploy-manage/deploy/cloud-on-k8s/nodes-orchestration.md
index 11384843e..5cd97fc02 100644
--- a/deploy-manage/deploy/cloud-on-k8s/nodes-orchestration.md
+++ b/deploy-manage/deploy/cloud-on-k8s/nodes-orchestration.md
@@ -177,7 +177,7 @@ Advanced users may force an upgrade by manually deleting Pods themselves. The de
 Operations that reduce the number of nodes in the cluster cannot make progress without user intervention, if the Elasticsearch index replica settings are incompatible with the intended downscale. Specifically, if the Elasticsearch index settings demand a higher number of shard copies than data nodes in the cluster after the downscale operation, ECK cannot migrate the data away from the node about to be removed. You can address this in the following ways:

 * Adjust the Elasticsearch [index settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) to a number of replicas that allows the desired node removal.
-* Use [`auto_expand_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md) to automatically adjust the replicas to the number of data nodes in the cluster.
+* Use [`auto_expand_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md) to automatically adjust the replicas to the number of data nodes in the cluster, as shown in the sketch after this list.
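A minimal sketch of that second option, assuming a port-forwarded cluster and a placeholder index name:

```sh
# Hedged sketch: let the replica count track the available data nodes
# ("my-index", the endpoint, and $PASSWORD are placeholders)
curl -s -k -u "elastic:$PASSWORD" -H "Content-Type: application/json" \
  -X PUT "https://localhost:9200/my-index/_settings" \
  -d '{"index.auto_expand_replicas": "0-1"}'
```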
## Advanced control during rolling upgrades [k8s-advanced-upgrade-control]

diff --git a/deploy-manage/deploy/cloud-on-k8s/readiness-probe.md b/deploy-manage/deploy/cloud-on-k8s/readiness-probe.md
index 8b39c91c2..58a25f00c 100644
--- a/deploy-manage/deploy/cloud-on-k8s/readiness-probe.md
+++ b/deploy-manage/deploy/cloud-on-k8s/readiness-probe.md
@@ -46,6 +46,6 @@ Note that this requires restarting the Pods.

 % this feature might have disappeared, we will need to investigate this a bit more, as the link below doesn't work anymore but it does for 8.15 for example.

-We do not recommend overriding the default readiness probe on Elasticsearch 8.2.0 and later. ECK configures a socket based readiness probe using the Elasticsearch [readiness port feature](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#readiness-tcp-port) which is not influenced by the load on the Elasticsearch cluster.
+We do not recommend overriding the default readiness probe on Elasticsearch 8.2.0 and later. ECK configures a socket-based readiness probe using the Elasticsearch [readiness port feature](elasticsearch://reference/elasticsearch/jvm-settings.md#readiness-tcp-port), which is not influenced by the load on the Elasticsearch cluster.

diff --git a/deploy-manage/deploy/cloud-on-k8s/requests-routing-to-elasticsearch-nodes.md b/deploy-manage/deploy/cloud-on-k8s/requests-routing-to-elasticsearch-nodes.md
index 7626121c8..903b665d2 100644
--- a/deploy-manage/deploy/cloud-on-k8s/requests-routing-to-elasticsearch-nodes.md
+++ b/deploy-manage/deploy/cloud-on-k8s/requests-routing-to-elasticsearch-nodes.md
@@ -8,7 +8,7 @@ mapped_pages:

 # Requests routing to Elasticsearch nodes [k8s-traffic-splitting]

-The default Kubernetes service created by ECK, named `<cluster_name>-es-http`, is configured to include all the Elasticsearch nodes in that cluster. This configuration is good to get started and is adequate for most use cases. However, if you are operating an Elasticsearch cluster with [different node types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md) and want control over which nodes handle which types of traffic, you should create additional Kubernetes services yourself.
+The default Kubernetes service created by ECK, named `<cluster_name>-es-http`, is configured to include all the Elasticsearch nodes in that cluster. This configuration is good to get started and is adequate for most use cases. However, if you are operating an Elasticsearch cluster with [different node types](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md) and want control over which nodes handle which types of traffic, you should create additional Kubernetes services yourself.

 As an alternative, you can use features provided by third-party software such as service meshes and ingress controllers to achieve more advanced traffic management configurations. Check the [recipes directory](https://github.com/elastic/cloud-on-k8s/tree/2.16/config/recipes) in the ECK source repository for a few examples.
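For instance, a service that routes client traffic only to data nodes can select on the per-role labels that ECK applies to Pods. A sketch, assuming a cluster named `quickstart` (the label keys reflect ECK's Pod labeling and should be verified against your operator version):

```sh
# Hedged sketch: a Service that targets only the data nodes of a cluster
# (cluster name "quickstart" and label keys are assumptions)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: quickstart-es-data-nodes
spec:
  ports:
    - name: https
      port: 9200
      targetPort: 9200
  selector:
    elasticsearch.k8s.elastic.co/cluster-name: quickstart
    elasticsearch.k8s.elastic.co/node-data: "true"
EOF
```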
diff --git a/deploy-manage/deploy/cloud-on-k8s/transport-settings.md b/deploy-manage/deploy/cloud-on-k8s/transport-settings.md index e0887773e..8197b55c7 100644 --- a/deploy-manage/deploy/cloud-on-k8s/transport-settings.md +++ b/deploy-manage/deploy/cloud-on-k8s/transport-settings.md @@ -8,7 +8,7 @@ mapped_pages: # Transport settings [k8s-transport-settings] -The transport module in Elasticsearch is used for internal communication between nodes within the cluster as well as communication between remote clusters. Check the [Elasticsearch documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) for details. For customization options of the HTTP layer, check [Services](accessing-services.md) and [TLS certificates](/deploy-manage/security/secure-http-communications.md). +The transport module in Elasticsearch is used for internal communication between nodes within the cluster as well as communication between remote clusters. Check the [Elasticsearch documentation](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) for details. For customization options of the HTTP layer, check [Services](accessing-services.md) and [TLS certificates](/deploy-manage/security/secure-http-communications.md). ## Customize the Transport Service [k8s_customize_the_transport_service] diff --git a/deploy-manage/deploy/cloud-on-k8s/virtual-memory.md b/deploy-manage/deploy/cloud-on-k8s/virtual-memory.md index fad49f599..003d163d9 100644 --- a/deploy-manage/deploy/cloud-on-k8s/virtual-memory.md +++ b/deploy-manage/deploy/cloud-on-k8s/virtual-memory.md @@ -14,7 +14,7 @@ The kernel setting `vm.max_map_count=262144` can be set on the host directly, by For more information, check the Elasticsearch documentation on [Virtual memory](/deploy-manage/deploy/self-managed/vm-max-map-count.md). -Optionally, you can select a different type of file system implementation for the storage. For possible options, check the [store module documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/store.md). +Optionally, you can select a different type of file system implementation for the storage. For possible options, check the [store module documentation](elasticsearch://reference/elasticsearch/index-settings/store.md). ```yaml spec: diff --git a/deploy-manage/deploy/elastic-cloud/add-plugins-extensions.md b/deploy-manage/deploy/elastic-cloud/add-plugins-extensions.md index 65598caf1..d2ee9afed 100644 --- a/deploy-manage/deploy/elastic-cloud/add-plugins-extensions.md +++ b/deploy-manage/deploy/elastic-cloud/add-plugins-extensions.md @@ -15,16 +15,16 @@ Plugins extend the core functionality of {{es}}. There are many suitable plugins * Analysis plugins, to provide analyzers targeted at languages other than English. * Scripting plugins, to provide additional scripting languages. -Plugins can come from different sources: the official ones created or at least maintained by Elastic, community-sourced plugins from other users, and plugins that you provide. Some of the official plugins are always provided with our service, and can be [enabled per deployment](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md). +Plugins can come from different sources: the official ones created or at least maintained by Elastic, community-sourced plugins from other users, and plugins that you provide. 
Some of the official plugins are always provided with our service, and can be [enabled per deployment](elasticsearch://reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md). There are two ways to add plugins to a hosted deployment in {{ecloud}}: -* [Enable one of the official plugins already available in {{ecloud}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md). +* [Enable one of the official plugins already available in {{ecloud}}](elasticsearch://reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md). * [Upload a custom plugin and then enable it per deployment](upload-custom-plugins-bundles.md). -Custom plugins can include the official {{es}} plugins not provided with {{ecloud}}, any of the community-sourced plugins, or [plugins that you write yourself](asciidocalypse://docs/elasticsearch/docs/extend/index.md). Uploading custom plugins is available only to Gold, Platinum, and Enterprise subscriptions. For more information, check [Upload custom plugins and bundles](upload-custom-plugins-bundles.md). +Custom plugins can include the official {{es}} plugins not provided with {{ecloud}}, any of the community-sourced plugins, or [plugins that you write yourself](elasticsearch://extend/index.md). Uploading custom plugins is available only to Gold, Platinum, and Enterprise subscriptions. For more information, check [Upload custom plugins and bundles](upload-custom-plugins-bundles.md). -To learn more about the official and community-sourced plugins, refer to [{{es}} Plugins and Integrations](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/index.md). +To learn more about the official and community-sourced plugins, refer to [{{es}} Plugins and Integrations](elasticsearch://reference/elasticsearch-plugins/index.md). For a detailed guide with examples of using the {{ecloud}} API to create, get information about, update, and delete extensions and plugins, check [Managing plugins and extensions through the API](manage-plugins-extensions-through-api.md). diff --git a/deploy-manage/deploy/elastic-cloud/available-stack-versions.md b/deploy-manage/deploy/elastic-cloud/available-stack-versions.md index fa8628954..b1ff431bd 100644 --- a/deploy-manage/deploy/elastic-cloud/available-stack-versions.md +++ b/deploy-manage/deploy/elastic-cloud/available-stack-versions.md @@ -28,7 +28,7 @@ You might sometimes notice additional versions listed in the user interface beyo Whenever a new Elastic Stack version is released, we do our best to provide the new version on our hosted service at the same time. We send you an email and add a notice to the console, recommending an upgrade. You’ll need to decide whether to upgrade to the new version with new features and bug fixes or to stay with a version you know works for you a while longer. -There can be [breaking changes](asciidocalypse://docs/elasticsearch/docs/release-notes/breaking-changes.md) in some new versions of Elasticsearch that break what used to work in older versions. Before upgrading, you’ll want to check if the new version introduces any changes that might affect your applications. A breaking change might be a function that was previously deprecated and that has been removed in the latest version, for example. If you have an application that depends on the removed function, the application will need to be updated to continue working with the new version of Elasticsearch. 
+There can be [breaking changes](elasticsearch://release-notes/breaking-changes.md) in some new versions of Elasticsearch that break what used to work in older versions. Before upgrading, you’ll want to check if the new version introduces any changes that might affect your applications. A breaking change might be a function that was previously deprecated and that has been removed in the latest version, for example. If you have an application that depends on the removed function, the application will need to be updated to continue working with the new version of Elasticsearch. To learn more about upgrading to newer versions of the Elastic Stack on our hosted service, check [Upgrade Versions](../../upgrade/deployment-or-cluster.md). diff --git a/deploy-manage/deploy/elastic-cloud/differences-from-other-elasticsearch-offerings.md b/deploy-manage/deploy/elastic-cloud/differences-from-other-elasticsearch-offerings.md index 368990388..9ec7081d2 100644 --- a/deploy-manage/deploy/elastic-cloud/differences-from-other-elasticsearch-offerings.md +++ b/deploy-manage/deploy/elastic-cloud/differences-from-other-elasticsearch-offerings.md @@ -42,7 +42,7 @@ To ensure optimal performance, follow these recommendations for sizing individua For large datasets that exceed the recommended maximum size for a single index, consider splitting your data across smaller indices and using an alias to search them collectively. -These recommendations do not apply to indices using better binary quantization (BBQ). Refer to [vector quantization](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) in the core {{es}} docs for more information. +These recommendations do not apply to indices using better binary quantization (BBQ). Refer to [vector quantization](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) in the core {{es}} docs for more information. ## API availability [elasticsearch-differences-serverless-apis-availability] @@ -94,7 +94,7 @@ When attempting to use an unavailable API, you’ll receive a clear error messag ## Settings availability [elasticsearch-differences-serverless-settings-availability] -In {{es-serverless}}, you can only configure [index-level settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md). Cluster-level settings and node-level settings are not required by end users and the `elasticsearch.yml` file is fully managed by Elastic. +In {{es-serverless}}, you can only configure [index-level settings](elasticsearch://reference/elasticsearch/index-settings/index.md). Cluster-level settings and node-level settings are not required by end users and the `elasticsearch.yml` file is fully managed by Elastic. Available settings : **Index-level settings**: Settings that control how {{es}} documents are processed, stored, and searched are available to end users. 
These include: @@ -154,6 +154,6 @@ The following features are not available in {{es-serverless}} and are not planne * [Custom plugins and bundles](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) * [{{es}} for Apache Hadoop](elasticsearch-hadoop://reference/index.md) -* [Scripted metric aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) +* [Scripted metric aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) * Managed web crawler: You can use the [self-managed web crawler](https://github.com/elastic/crawler) instead. -* Managed Search connectors: You can use [self-managed Search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) instead. \ No newline at end of file +* Managed Search connectors: You can use [self-managed Search connectors](elasticsearch://reference/ingestion-tools/search-connectors/self-managed-connectors.md) instead. \ No newline at end of file diff --git a/deploy-manage/deploy/elastic-cloud/ech-version-policy.md b/deploy-manage/deploy/elastic-cloud/ech-version-policy.md index ff660ca6d..79b75d280 100644 --- a/deploy-manage/deploy/elastic-cloud/ech-version-policy.md +++ b/deploy-manage/deploy/elastic-cloud/ech-version-policy.md @@ -25,7 +25,7 @@ You might sometimes notice additional versions listed in the user interface beyo Whenever a new Elastic Stack version is released, we do our best to provide the new version on our hosted service at the same time. We send you an email and add a notice to the console, recommending an upgrade. You’ll need to decide whether to upgrade to the new version with new features and bug fixes or to stay with a version you know works for you a while longer. -There can be [breaking changes](asciidocalypse://docs/elasticsearch/docs/release-notes/breaking-changes.md) in some new versions of Elasticsearch that break what used to work in older versions. Before upgrading, you’ll want to check if the new version introduces any changes that might affect your applications. A breaking change might be a function that was previously deprecated and that has been removed in the latest version, for example. If you have an application that depends on the removed function, the application will need to be updated to continue working with the new version of Elasticsearch. +There can be [breaking changes](elasticsearch://release-notes/breaking-changes.md) in some new versions of Elasticsearch that break what used to work in older versions. Before upgrading, you’ll want to check if the new version introduces any changes that might affect your applications. A breaking change might be a function that was previously deprecated and that has been removed in the latest version, for example. If you have an application that depends on the removed function, the application will need to be updated to continue working with the new version of Elasticsearch. To learn more about upgrading to newer versions of the Elastic Stack on our hosted service, check [Upgrade Versions](../../upgrade/deployment-or-cluster.md). 
diff --git a/deploy-manage/deploy/elastic-cloud/ech-whats-new.md b/deploy-manage/deploy/elastic-cloud/ech-whats-new.md index eaa7c4049..7b458ad83 100644 --- a/deploy-manage/deploy/elastic-cloud/ech-whats-new.md +++ b/deploy-manage/deploy/elastic-cloud/ech-whats-new.md @@ -15,7 +15,7 @@ Check the Release Notes to get the recent updates for each product. Elasticsearch -* [Elasticsearch 8.x Release Notes](asciidocalypse://docs/elasticsearch/docs/release-notes/index.md) +* [Elasticsearch 8.x Release Notes](elasticsearch://release-notes/index.md) * [Elasticsearch 7.x Release Notes](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/es-release-notes.html) * [Elasticsearch 6.x Release Notes](https://www.elastic.co/guide/en/elasticsearch/reference/6.8/es-release-notes.html) * [Elasticsearch 5.x Release Notes](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/es-release-notes.html) diff --git a/deploy-manage/deploy/elastic-cloud/edit-stack-settings.md b/deploy-manage/deploy/elastic-cloud/edit-stack-settings.md index 408788433..65c8be45a 100644 --- a/deploy-manage/deploy/elastic-cloud/edit-stack-settings.md +++ b/deploy-manage/deploy/elastic-cloud/edit-stack-settings.md @@ -57,9 +57,9 @@ From the {{ecloud}} Console you can customize {{es}}, {{kib}}, and related produ Change how {{es}} runs by providing your own user settings. {{ech}} appends these settings to each node’s `elasticsearch.yml` configuration file. -{{ech}} automatically rejects `elasticsearch.yml` settings that could break your cluster. +{{ech}} automatically rejects `elasticsearch.yml` settings that could break your cluster. -For a list of supported settings, check [Supported {{es}} settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/elastic-cloud-hosted-elasticsearch-settings.md). +For a list of supported settings, check [Supported {{es}} settings](elasticsearch://reference/elasticsearch/configuration-reference/elastic-cloud-hosted-elasticsearch-settings.md). ::::{warning} You can also update [dynamic cluster settings](../../../deploy-manage/deploy/self-managed/configure-elasticsearch.md#dynamic-cluster-setting) using {{es}}'s [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). However, {{ech}} doesn’t reject unsafe setting changes made using this API. Use it with caution. diff --git a/deploy-manage/deploy/elastic-cloud/manage-plugins-extensions-through-api.md b/deploy-manage/deploy/elastic-cloud/manage-plugins-extensions-through-api.md index 00165ca3c..7d7539ac4 100644 --- a/deploy-manage/deploy/elastic-cloud/manage-plugins-extensions-through-api.md +++ b/deploy-manage/deploy/elastic-cloud/manage-plugins-extensions-through-api.md @@ -22,21 +22,21 @@ This guide provides a full list of tasks for managing [plugins and extensions](a * [Delete an extension](#ec-extension-guide-delete) -## Create an extension [ec-extension-guide-create] +## Create an extension [ec-extension-guide-create] There are two methods to create an extension. You can: 1. Stream the file from a publicly-accessible download URL. 2. Upload the file from a local file path. -::::{note} +::::{note} For plugins larger than 200MB the download URL option **must** be used. Plugins larger than 8GB cannot be uploaded with either method. :::: These two examples are for the `plugin` extension type. For bundles, change `extension_type` to `bundle`. 
-For plugins, `version` must match (exactly) the `elasticsearch.version` field defined in the plugin’s `plugin-descriptor.properties` file. Check [Help for plugin authors](asciidocalypse://docs/elasticsearch/docs/extend/index.md#plugin-authors) for details. For plugins larger than 5GB, the `plugin-descriptor.properties` file needs to be at the top of the archive. This ensures that the our verification process is able to detect that it is an Elasticsearch plugin; otherwise the plugin will be rejected by the API. This order can be achieved by specifying at time of creating the ZIP file: `zip -r name-of-plugin.zip plugin-descriptor.properties *`.
+For plugins, `version` must match (exactly) the `elasticsearch.version` field defined in the plugin’s `plugin-descriptor.properties` file. Check [Help for plugin authors](elasticsearch://extend/index.md) for details. For plugins larger than 5GB, the `plugin-descriptor.properties` file needs to be at the top of the archive. This ensures that our verification process is able to detect that it is an Elasticsearch plugin; otherwise the plugin will be rejected by the API. This order can be achieved by listing the descriptor file first when creating the ZIP file: `zip -r name-of-plugin.zip plugin-descriptor.properties *`.

 For bundles, we recommend setting `version` using wildcard notation that matches the major version of the Elasticsearch deployment. For example, if Elasticsearch is on version 8.4.3, simply set `8.*` as the version. The value `8.*` means that the bundle is compatible with all 8.x versions of Elasticsearch.

@@ -58,12 +58,12 @@ curl -X POST \

 The single POST request creates an extension with the metadata, validates, and streams the file from the `download_url` specified. The accepted protocols for `download_url` are `http` and `https`.

-::::{note} 
+::::{note}
 The `download_url` must be directly and publicly accessible. There is currently no support for redirection or authentication unless it contains security credentials/tokens expected by your HTTP service as part of the URL. Otherwise, use the following Option 2 to upload the file from a local path.
 ::::


-::::{note} 
+::::{note}
 When the file is larger than 5GB, the request may time out after 2-5 minutes, but streaming will continue on the server. Check the Extensions page in the Cloud UI after 5-10 minutes to make sure that the plugin has been created. A successfully created plugin will contain the correct name, type, version, size, and last-modified information.
 ::::

@@ -105,7 +105,7 @@ curl -v -X PUT "https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTE
 -T "/path_to/custom-plugin-8.4.3.zip"
 ```

-::::{note} 
+::::{note}
 When using curl, always use the `-T` option. DO NOT use `-F` (we have seen inconsistency in curl behavior across systems; using `-F` can result in partially uploaded or truncated files).
 ::::

@@ -123,7 +123,7 @@ The above PUT request uploads the file from the local path specified. This reque
 ```

-## Add an extension to a deployment plan [ec-extension-guide-add-plan] 
+## Add an extension to a deployment plan [ec-extension-guide-add-plan]

 Once the extension is created and uploaded, you can add the extension using its `EXTENSION_ID` in an [update deployment API call](https://www.elastic.co/docs/api/doc/cloud/operation/operation-update-deployment).

@@ -180,7 +180,7 @@ The previous examples are for plugins. 
For bundles, use the `user_bundles` const
 ```

-## Get an extension [ec-extension-guide-get-extension] 
+## Get an extension [ec-extension-guide-get-extension]

 You can use the GET call to retrieve information about an extension.

@@ -227,7 +227,7 @@ For example, the previous call returns:
 ```

-## Update the name of an existing extension [ec-extension-guide-update-name] 
+## Update the name of an existing extension [ec-extension-guide-update-name]

 To update the name of an existing extension, simply update the name field without uploading a new file. You do not have to specify the `download_url` when only making metadata changes to an extension.

@@ -262,12 +262,12 @@ curl -X POST \

 Updating the name of an existing extension does not change its `EXTENSION_ID`.

-## Update the type of an existing extension [ec-extension-guide-update-type] 
+## Update the type of an existing extension [ec-extension-guide-update-type]

 Updating `extension_type` has no effect. You cannot change the extension’s type (`plugin` versus `bundle`) after the initial creation of a plugin.

-## Update the version of an existing bundle [ec-extension-guide-update-version-bundle] 
+## Update the version of an existing bundle [ec-extension-guide-update-version-bundle]

 For bundles, we recommend setting `version` using wildcard notation that matches the major version of the Elasticsearch deployment. For example, if Elasticsearch is on version 8.4.3, simply set `8.*` as the version. The value `8.*` means that the bundle is compatible with all 8.x versions of Elasticsearch.

@@ -304,12 +304,12 @@ curl -X POST \

 Updating the version of an existing bundle does not change its `EXTENSION_ID`.

-## Update the version of an existing plugin [ec-extension-guide-update-version-plugin] 
+## Update the version of an existing plugin [ec-extension-guide-update-version-plugin]

-For plugins, `version` must match (exactly) the `elasticsearch.version` field defined in the plugin’s `plugin-descriptor.properties` file. Check [Help for plugin authors](asciidocalypse://docs/elasticsearch/docs/extend/index.md#plugin-authors) for details. If you change the version, the associated plugin file *must* also be updated accordingly.
+For plugins, `version` must match (exactly) the `elasticsearch.version` field defined in the plugin’s `plugin-descriptor.properties` file. Check [Help for plugin authors](elasticsearch://extend/index.md) for details. If you change the version, the associated plugin file *must* also be updated accordingly.

-## Update the file associated to an existing extension [ec-extension-guide-update-file] 
+## Update the file associated with an existing extension [ec-extension-guide-update-file]

 You may want to update an uploaded file for an existing extension without performing an Elasticsearch upgrade. If you are updating the extension to prepare for an Elasticsearch upgrade, check the [Upgrade Elasticsearch](#ec-extension-guide-upgrade-elasticsearch) scenario later on this page.

@@ -340,7 +340,7 @@ curl -v -X PUT "https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTE
 -T "/path_to/custom-plugin-8.4.3-10212022.zip"
 ```

-::::{important} 
+::::{important}
 If you are not making any other plan changes and simply updating an extension file, you need to issue a no-op plan so that Elasticsearch will make use of this new file. A *no-op* (no operation) plan triggers a rolling restart on the deployment, applying the same (unchanged) plan as the current plan.
 ::::

@@ -348,7 +348,7 @@ If you are not making any other plan changes and simply updating an extension fi

 Updating the file of an existing extension or bundle does not change its `EXTENSION_ID`.

-## Upgrade Elasticsearch [ec-extension-guide-upgrade-elasticsearch] 
+## Upgrade Elasticsearch [ec-extension-guide-upgrade-elasticsearch]

 When you upgrade Elasticsearch in a deployment, you must ensure that:

@@ -473,7 +473,7 @@ Unlike bundles, plugins *must* match the Elasticsearch version down to the patch

-## Delete an extension [ec-extension-guide-delete] 
+## Delete an extension [ec-extension-guide-delete]

 You can delete an extension simply by calling a DELETE against the EXTENSION_ID of interest:

diff --git a/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md b/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md
index c1c472980..41bc48ca5 100644
--- a/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md
+++ b/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md
@@ -50,7 +50,7 @@ The following restrictions apply when using APIs in {{ecloud}}:

 $$$ec-restrictions-apis-elasticsearch$$$

 Elasticsearch APIs
-: The Elasticsearch APIs do not natively enforce rate limiting. However, all requests to the Elasticsearch cluster are subject to Elasticsearch configuration settings, such as the [network HTTP setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#http-settings) `http:max_content_length` which restricts the maximum size of an HTTP request body. This setting has a default value of 100MB, hence restricting API request payloads to that size. This setting is not currently configurable in {{ecloud}}. For a list of which Elasticsearch settings are supported on Cloud, check [Add Elasticsearch user settings](edit-stack-settings.md). To learn about using the Elasticsearch APIs in {{ecloud}}, check [Access the Elasticsearch API console](asciidocalypse://docs/cloud/docs/reference/cloud-hosted/ec-api-console.md). And, for full details about the Elasticsearch APIs and their endpoints, check the [Elasticsearch API reference documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md).
+: The Elasticsearch APIs do not natively enforce rate limiting. However, all requests to the Elasticsearch cluster are subject to Elasticsearch configuration settings, such as the [network HTTP setting](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#http-settings) `http.max_content_length`, which restricts the maximum size of an HTTP request body. This setting has a default value of 100MB, hence restricting API request payloads to that size. This setting is not currently configurable in {{ecloud}}. For a list of which Elasticsearch settings are supported on Cloud, check [Add Elasticsearch user settings](edit-stack-settings.md). To learn about using the Elasticsearch APIs in {{ecloud}}, check [Access the Elasticsearch API console](asciidocalypse://docs/cloud/docs/reference/cloud-hosted/ec-api-console.md). And, for full details about the Elasticsearch APIs and their endpoints, check the [Elasticsearch API reference documentation](elasticsearch://reference/elasticsearch/rest-apis/index.md).
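To confirm the effective limit on a given deployment, the value can be read back through the cluster settings API; a sketch (endpoint and credentials are placeholders):

```sh
# Hedged sketch: read back the effective HTTP payload limit
# (endpoint and credentials are placeholders)
curl -s -u "$ES_USER:$ES_PASS" \
  "https://<deployment-endpoint>:9243/_cluster/settings?include_defaults=true&filter_path=defaults.http.max_content_length"
```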
$$$ec-restrictions-apis-kibana$$$
diff --git a/deploy-manage/deploy/elastic-cloud/tools-apis.md b/deploy-manage/deploy/elastic-cloud/tools-apis.md
index 705c19c46..cc78d2786 100644
--- a/deploy-manage/deploy/elastic-cloud/tools-apis.md
+++ b/deploy-manage/deploy/elastic-cloud/tools-apis.md
@@ -60,17 +60,17 @@ Note that some [restrictions](/deploy-manage/deploy/elastic-cloud/restrictions-k
The following APIs are available for {{es-serverless}} users. These links will take you to the autogenerated API reference documentation.

- [Elasticsearch Serverless APIs](https://www.elastic.co/docs/api/doc/elasticsearch-serverless): Use these APIs to index, manage, search, and analyze your data in {{es-serverless}}.
-
-  ::::{tip}
+
+  ::::{tip}
  Learn how to [connect to your {{es-serverless}} endpoint](/solutions/search/get-started.md).
  ::::
-
+
- [Kibana Serverless APIs](https://www.elastic.co/docs/api/doc/serverless): Use these APIs to manage resources such as connectors, data views, and saved objects for your {{serverless-full}} project.

**Additional API information**

-- [{{es}} API conventions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md): Reference information about headers and request body conventions for {{es-serverless}} REST APIs.
+- [{{es}} API conventions](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md): Reference information about headers and request body conventions for {{es-serverless}} REST APIs.
::::

::::{tab-item} {{ech}}
@@ -92,7 +92,7 @@ serverless: unavailable
For each deployment, an **API Console** page is available from the {{ecloud}} Console for you to execute queries using the available APIs. You can find this console when selecting a specific deployment to manage. From there, the API Console is available under the **{{es}}** page.

:::{note}
-This API Console is different from the [Dev Tools Console](/explore-analyze/query-filter/tools/console.md) available in each deployment, from which you can call {{es}} and {{kib}} APIs. On the {{ecloud}} API Console, you cannot run Kibana APIs.
+This API Console is different from the [Dev Tools Console](/explore-analyze/query-filter/tools/console.md) available in each deployment, from which you can call {{es}} and {{kib}} APIs. On the {{ecloud}} API Console, you cannot run Kibana APIs.
:::

## ECCTL - Command-line interface for {{ecloud}}
diff --git a/deploy-manage/deploy/self-managed/bootstrap-checks-heap-size.md b/deploy-manage/deploy/self-managed/bootstrap-checks-heap-size.md
index 36abba2ff..fc6e239fb 100644
--- a/deploy-manage/deploy/self-managed/bootstrap-checks-heap-size.md
+++ b/deploy-manage/deploy/self-managed/bootstrap-checks-heap-size.md
@@ -5,5 +5,5 @@ mapped_pages:

# Heap size check [bootstrap-checks-heap-size]

-By default, {{es}} automatically sizes JVM heap based on a node’s [roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and total memory. If you manually override the default sizing and start the JVM with different initial and max heap sizes, the JVM may pause as it resizes the heap during system usage. If you enable [`bootstrap.memory_lock`](setup-configuration-memory.md#bootstrap-memory_lock), the JVM locks the initial heap size on startup. If the initial heap size is not equal to the maximum heap size, some JVM heap may not be locked after a resize.
To avoid these issues, start the JVM with an initial heap size equal to the maximum heap size.
+By default, {{es}} automatically sizes JVM heap based on a node’s [roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and total memory. If you manually override the default sizing and start the JVM with different initial and max heap sizes, the JVM may pause as it resizes the heap during system usage. If you enable [`bootstrap.memory_lock`](setup-configuration-memory.md#bootstrap-memory_lock), the JVM locks the initial heap size on startup. If the initial heap size is not equal to the maximum heap size, some JVM heap may not be locked after a resize. To avoid these issues, start the JVM with an initial heap size equal to the maximum heap size.
diff --git a/deploy-manage/deploy/self-managed/bootstrap-checks-max-map-count.md b/deploy-manage/deploy/self-managed/bootstrap-checks-max-map-count.md
index 46c4cac7b..0c94b5d8a 100644
--- a/deploy-manage/deploy/self-managed/bootstrap-checks-max-map-count.md
+++ b/deploy-manage/deploy/self-managed/bootstrap-checks-max-map-count.md
@@ -7,5 +7,5 @@ mapped_pages:

Continuing from the previous [point](max-size-virtual-memory-check.md), to use `mmap` effectively, Elasticsearch also requires the ability to create many memory-mapped areas. The maximum map count check checks that the kernel allows a process to have at least 262,144 memory-mapped areas and is enforced on Linux only. To pass the maximum map count check, you must configure `vm.max_map_count` via `sysctl` to be at least `262144`.

-Alternatively, the maximum map count check is only needed if you are using `mmapfs` or `hybridfs` as the [store type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/store.md) for your indices. If you [do not allow](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/store.md#allow-mmap) the use of `mmap` then this bootstrap check will not be enforced.
+Note that the maximum map count check is only needed if you are using `mmapfs` or `hybridfs` as the [store type](elasticsearch://reference/elasticsearch/index-settings/store.md) for your indices. If you [do not allow](elasticsearch://reference/elasticsearch/index-settings/store.md#allow-mmap) the use of `mmap`, then this bootstrap check will not be enforced.
diff --git a/deploy-manage/deploy/self-managed/executable-jna-tmpdir.md b/deploy-manage/deploy/self-managed/executable-jna-tmpdir.md
index 40847b987..3e17f591a 100644
--- a/deploy-manage/deploy/self-managed/executable-jna-tmpdir.md
+++ b/deploy-manage/deploy/self-managed/executable-jna-tmpdir.md
@@ -5,7 +5,7 @@ mapped_pages:

# Ensure JNA temporary directory permits executables [executable-jna-tmpdir]

-::::{note}
+::::{note}
This is only relevant for Linux.
::::

@@ -30,9 +30,9 @@ To resolve these problems, either remove the `noexec` option from your `/tmp` fi
```

-If you need finer control over the location of these temporary files, you can also configure the path that JNA uses with the [JVM flag](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options) `-Djna.tmpdir=<path>` and you can configure the path that `libffi` uses for its temporary files by setting the `LIBFFI_TMPDIR` environment variable. Future versions of {{es}} may need additional configuration, so you should prefer to set `ES_TMPDIR` wherever possible.
+If you need finer control over the location of these temporary files, you can also configure the path that JNA uses with the [JVM flag](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options) `-Djna.tmpdir=<path>`, and you can configure the path that `libffi` uses for its temporary files by setting the `LIBFFI_TMPDIR` environment variable. Future versions of {{es}} may need additional configuration, so you should prefer to set `ES_TMPDIR` wherever possible.

-::::{note}
+::::{note}
{{es}} does not remove its temporary directory. You should remove leftover temporary directories while {{es}} is not running. It is best to do this automatically, for instance on each reboot. If you are running on Linux, you can achieve this by using the [tmpfs](https://www.kernel.org/doc/html/latest/filesystems/tmpfs.html) file system.
::::

diff --git a/deploy-manage/deploy/self-managed/important-settings-configuration.md b/deploy-manage/deploy/self-managed/important-settings-configuration.md
index 398a0166a..d5fe27d2a 100644
--- a/deploy-manage/deploy/self-managed/important-settings-configuration.md
+++ b/deploy-manage/deploy/self-managed/important-settings-configuration.md
@@ -8,15 +8,15 @@ mapped_pages:

{{es}} requires very little configuration to get started, but there are a number of items which **must** be considered before using your cluster in production:

* [Path settings](#path-settings)
-* [Cluster name setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name)
+* [Cluster name setting](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name)
* [Node name setting](#node-name)
* [Network host settings](#network.host)
* [Discovery settings](#discovery-settings)
* [Heap size settings](#heap-size-settings)
* [JVM heap dump path setting](#heap-dump-path)
-* [GC logging settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging)
+* [GC logging settings](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging)
* [Temporary directory settings](#es-tmpdir)
-* [JVM fatal error log setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#error-file-path)
+* [JVM fatal error log setting](elasticsearch://reference/elasticsearch/jvm-settings.md#error-file-path)
* [Cluster backups](#important-settings-backups)

Our [{{ecloud}}](https://cloud.elastic.co/registration?page=docs&placement=docs-body) service configures these items automatically, making your cluster production-ready by default.
@@ -60,7 +60,7 @@ Don’t modify anything within the data directory or run processes that might in
::::

-Elasticsearch offers a deprecated setting that allows you to specify multiple paths in `path.data`. To learn about this setting, and how to migrate away from it, refer to [Multiple data paths](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/path.md#multiple-data-paths).
+Elasticsearch offers a deprecated setting that allows you to specify multiple paths in `path.data`. To learn about this setting, and how to migrate away from it, refer to [Multiple data paths](elasticsearch://reference/elasticsearch/index-settings/path.md#multiple-data-paths).

## Cluster name setting [_cluster_name_setting]

@@ -93,7 +93,7 @@ node.name: prod-data-2

## Network host setting [network.host]

-By default, {{es}} only binds to loopback addresses such as `127.0.0.1` and `[::1]`.
This is sufficient to run a cluster of one or more nodes on a single server for development and testing, but a [resilient production cluster](../../production-guidance/availability-and-resilience.md) must involve nodes on other servers. There are many [network settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) but usually all you need to configure is `network.host`: +By default, {{es}} only binds to loopback addresses such as `127.0.0.1` and `[::1]`. This is sufficient to run a cluster of one or more nodes on a single server for development and testing, but a [resilient production cluster](../../production-guidance/availability-and-resilience.md) must involve nodes on other servers. There are many [network settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) but usually all you need to configure is `network.host`: ```yaml network.host: 192.168.1.10 @@ -158,21 +158,21 @@ cluster.initial_master_nodes: <1> 1. Identify the initial master nodes by their [`node.name`](#node-name), which defaults to their hostname. Ensure that the value in `cluster.initial_master_nodes` matches the `node.name` exactly. If you use a fully-qualified domain name (FQDN) such as `master-node-a.example.com` for your node names, then you must use the FQDN in this list. Conversely, if `node.name` is a bare hostname without any trailing qualifiers, you must also omit the trailing qualifiers in `cluster.initial_master_nodes`. -See [bootstrapping a cluster](../../distributed-architecture/discovery-cluster-formation/modules-discovery-bootstrap-cluster.md) and [discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md). +See [bootstrapping a cluster](../../distributed-architecture/discovery-cluster-formation/modules-discovery-bootstrap-cluster.md) and [discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md). ## Heap size settings [heap-size-settings] -By default, {{es}} automatically sets the JVM heap size based on a node’s [roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and total memory. We recommend the default sizing for most production environments. +By default, {{es}} automatically sets the JVM heap size based on a node’s [roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and total memory. We recommend the default sizing for most production environments. -If needed, you can override the default sizing by manually [setting the JVM heap size](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-heap-size). +If needed, you can override the default sizing by manually [setting the JVM heap size](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-heap-size). ## JVM heap dump path setting [heap-dump-path] By default, {{es}} configures the JVM to dump the heap on out of memory exceptions to the default data directory. On [RPM](install-elasticsearch-with-rpm.md) and [Debian](install-elasticsearch-with-debian-package.md) packages, the data directory is `/var/lib/elasticsearch`. 
On [Linux and MacOS](install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](install-elasticsearch-with-zip-on-windows.md) distributions, the `data` directory is located under the root of the {{es}} installation. -If this path is not suitable for receiving heap dumps, modify the `-XX:HeapDumpPath=...` entry in [`jvm.options`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options): +If this path is not suitable for receiving heap dumps, modify the `-XX:HeapDumpPath=...` entry in [`jvm.options`](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options): * If you specify a directory, the JVM will generate a filename for the heap dump based on the PID of the running instance. * If you specify a fixed filename instead of a directory, the file must not exist when the JVM needs to perform a heap dump on an out of memory exception. Otherwise, the heap dump will fail. @@ -180,7 +180,7 @@ If this path is not suitable for receiving heap dumps, modify the `-XX:HeapDumpP ## GC logging settings [_gc_logging_settings] -By default, {{es}} enables garbage collection (GC) logs. These are configured in [`jvm.options`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options) and output to the same default location as the {{es}} logs. The default configuration rotates the logs every 64 MB and can consume up to 2 GB of disk space. +By default, {{es}} enables garbage collection (GC) logs. These are configured in [`jvm.options`](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options) and output to the same default location as the {{es}} logs. The default configuration rotates the logs every 64 MB and can consume up to 2 GB of disk space. You can reconfigure JVM logging using the command line options described in [JEP 158: Unified JVM Logging](https://openjdk.java.net/jeps/158). Unless you change the default `jvm.options` file directly, the {{es}} default configuration is applied in addition to your own settings. To disable the default configuration, first disable logging by supplying the `-Xlog:disable` option, then supply your own command line options. This disables *all* JVM logging, so be sure to review the available options and enable everything that you require. @@ -225,7 +225,7 @@ If you intend to run the `.tar.gz` distribution on Linux or MacOS for an extende By default, {{es}} configures the JVM to write fatal error logs to the default logging directory. On [RPM](install-elasticsearch-with-rpm.md) and [Debian](install-elasticsearch-with-debian-package.md) packages, this directory is `/var/log/elasticsearch`. On [Linux and MacOS](install-elasticsearch-from-archive-on-linux-macos.md) and [Windows](install-elasticsearch-with-zip-on-windows.md) distributions, the `logs` directory is located under the root of the {{es}} installation. -These are logs produced by the JVM when it encounters a fatal error, such as a segmentation fault. If this path is not suitable for receiving logs, modify the `-XX:ErrorFile=...` entry in [`jvm.options`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options). +These are logs produced by the JVM when it encounters a fatal error, such as a segmentation fault. If this path is not suitable for receiving logs, modify the `-XX:ErrorFile=...` entry in [`jvm.options`](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options). 
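For example, on the RPM and Debian packages you could route both heap dumps and fatal error logs to a dedicated directory with a small drop-in options file; a minimal sketch, where `/var/lib/elasticsearch/dumps` is a placeholder directory that you create and make writable by the `elasticsearch` user:

```sh
# Add a custom JVM options file; {{es}} reads every *.options file in
# jvm.options.d on startup, so the defaults in jvm.options stay untouched.
sudo tee /etc/elasticsearch/jvm.options.d/crash-paths.options <<'EOF'
-XX:HeapDumpPath=/var/lib/elasticsearch/dumps
-XX:ErrorFile=/var/lib/elasticsearch/dumps/hs_err_pid%p.log
EOF
```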
## Cluster backups [important-settings-backups] diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md b/deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md index 55c1c4f0f..bd23611b1 100644 --- a/deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md +++ b/deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md @@ -7,7 +7,7 @@ mapped_pages: {{es}} is available as a `.tar.gz` archive for Linux and MacOS. -This package contains both free and subscription features. [Start a 30-day trial](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features. +This package contains both free and subscription features. [Start a 30-day trial](elasticsearch://reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features. The latest stable version of {{es}} can be found on the [Download {{es}}](https://elastic.co/downloads/elasticsearch) page. Other versions can be found on the [Past Releases page](https://elastic.co/downloads/past-releases). @@ -130,11 +130,11 @@ When {{es}} starts for the first time, the security auto-configuration process b Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically. -Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. +Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. To enroll new nodes in your cluster, create an enrollment token with the `elasticsearch-create-enrollment-token` tool on any existing node in your cluster. You can then start a new node with the `--enrollment-token` parameter so that it joins an existing cluster. -1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes. +1. 
In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes. ```sh bin/elasticsearch-create-enrollment-token -s node @@ -307,7 +307,7 @@ When you install {{es}}, the following certificates and keys are generated in th `transport.p12` : Keystore that contains the key and certificate for the transport layer for all the nodes in your cluster. -`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. +`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. Use the following command to retrieve the password for `http.p12`: diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md index dab8ea2e5..d2ae50cf9 100644 --- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md +++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md @@ -7,7 +7,7 @@ mapped_pages: The Debian package for Elasticsearch can be [downloaded from our website](#install-deb) or from our [APT repository](#deb-repo). It can be used to install Elasticsearch on any Debian-based system such as Debian and Ubuntu. -This package contains both free and subscription features. [Start a 30-day trial](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features. +This package contains both free and subscription features. [Start a 30-day trial](elasticsearch://reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features. The latest stable version of Elasticsearch can be found on the [Download Elasticsearch](https://elastic.co/downloads/elasticsearch) page. Other versions can be found on the [Past Releases page](https://elastic.co/downloads/past-releases). @@ -78,7 +78,7 @@ When installing {{es}}, security features are enabled and configured by default. * Authentication and authorization are enabled, and a password is generated for the `elastic` built-in superuser. * Certificates and keys for TLS are generated for the transport and HTTP layer, and TLS is enabled and configured with these keys and certificates. -The password and certificate and keys are output to your terminal. You can reset the password for the `elastic` user with the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) command. +The password and certificate and keys are output to your terminal. 
You can reset the password for the `elastic` user with the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) command. We recommend storing the `elastic` password as an environment variable in your shell. For example: @@ -340,7 +340,7 @@ When you install {{es}}, the following certificates and keys are generated in th `transport.p12` : Keystore that contains the key and certificate for the transport layer for all the nodes in your cluster. -`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. +`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. Use the following command to retrieve the password for `http.p12`: diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md index 4d53003d0..6c173e9c4 100644 --- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md +++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker.md @@ -7,7 +7,7 @@ mapped_pages: Docker images for {{es}} are available from the Elastic Docker registry. A list of all published Docker images and tags is available at [www.docker.elastic.co](https://www.docker.elastic.co). The source code is in [GitHub](https://github.com/elastic/elasticsearch/blob/master/distribution/docker). -This package contains both free and subscription features. [Start a 30-day trial](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features. +This package contains both free and subscription features. [Start a 30-day trial](elasticsearch://reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features. ::::{tip} If you just want to test {{es}} in local development, refer to [Run {{es}} locally](../../../solutions/search/get-started.md). Please note that this setup is not suitable for production environments. @@ -452,9 +452,9 @@ The image [exposes](https://docs.docker.com/engine/reference/builder/#/expose) T ### Manually set the heap size [docker-set-heap-size] -By default, {{es}} automatically sizes JVM heap based on a nodes’s [roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and the total memory available to the node’s container. We recommend this default sizing for most production environments. If needed, you can override default sizing by manually setting JVM heap size. +By default, {{es}} automatically sizes JVM heap based on a nodes’s [roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and the total memory available to the node’s container. We recommend this default sizing for most production environments. 
If needed, you can override default sizing by manually setting JVM heap size.
+By default, {{es}} automatically sizes JVM heap based on a node’s [roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and the total memory available to the node’s container. We recommend this default sizing for most production environments. If needed, you can override default sizing by manually setting JVM heap size.

-To manually set the heap size in production, bind mount a [JVM options](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options) file under `/usr/share/elasticsearch/config/jvm.options.d` that includes your desired [heap size](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-heap-size) settings.
+To manually set the heap size in production, bind mount a [JVM options](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options) file under `/usr/share/elasticsearch/config/jvm.options.d` that includes your desired [heap size](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-heap-size) settings.

For testing, you can also manually set the heap size using the `ES_JAVA_OPTS` environment variable. For example, to use 1GB, use the following command.
@@ -595,7 +595,7 @@ Some plugins require additional security permissions. You must explicitly accept

* Attaching a `tty` when you run the Docker image and allowing the permissions when prompted.
* Inspecting the security permissions and accepting them (if appropriate) by adding the `--batch` flag to the plugin install command.

-See [Plugin management](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/_other_command_line_parameters.md) for more information.
+See [Plugin management](elasticsearch://reference/elasticsearch-plugins/_other_command_line_parameters.md) for more information.

### Troubleshoot Docker errors for {{es}} [troubleshoot-docker-errors]
diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md
index 47010a425..0dc8fcbfc 100644
--- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md
+++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md
@@ -12,7 +12,7 @@ RPM install is not supported on distributions with old versions of RPM, such as
::::

-This package contains both free and subscription features. [Start a 30-day trial](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features.
+This package contains both free and subscription features. [Start a 30-day trial](elasticsearch://reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features.

The latest stable version of Elasticsearch can be found on the [Download Elasticsearch](https://elastic.co/downloads/elasticsearch) page. Other versions can be found on the [Past Releases page](https://elastic.co/downloads/past-releases).

@@ -82,7 +82,7 @@ When installing {{es}}, security features are enabled and configured by default.
* Authentication and authorization are enabled, and a password is generated for the `elastic` built-in superuser.
* Certificates and keys for TLS are generated for the transport and HTTP layer, and TLS is enabled and configured with these keys and certificates.

-The password and certificate and keys are output to your terminal. You can reset the password for the `elastic` user with the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) command.
+The password and certificate and keys are output to your terminal.
You can reset the password for the `elastic` user with the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) command.

We recommend storing the `elastic` password as an environment variable in your shell. For example:
@@ -344,7 +344,7 @@ When you install {{es}}, the following certificates and keys are generated in th

`transport.p12`
: Keystore that contains the key and certificate for the transport layer for all the nodes in your cluster.

-`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool.
+`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool.

Use the following command to retrieve the password for `http.p12`:
diff --git a/deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md b/deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md
index 4cdbe53c5..ca5b5a363 100644
--- a/deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md
+++ b/deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md
@@ -7,7 +7,7 @@ mapped_pages:

{{es}} can be installed on Windows using the Windows `.zip` archive. This comes with an `elasticsearch-service.bat` command, which will set up {{es}} to run as a service.

-This package contains both free and subscription features. [Start a 30-day trial](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features.
+This package contains both free and subscription features. [Start a 30-day trial](elasticsearch://reference/elasticsearch/configuration-reference/license-settings.md) to try out all of the features.

::::{note}
On Windows the {{es}} {{ml}} feature requires the Microsoft Universal C Runtime library. This is built into Windows 10, Windows Server 2016 and more recent versions of Windows. For older versions of Windows it can be installed via Windows Update, or from a [separate download](https://support.microsoft.com/en-us/help/2999226/update-for-universal-c-runtime-in-windows). If you cannot install the Microsoft Universal C Runtime library you can still use the rest of {{es}} if you disable the {{ml}} feature.
@@ -87,11 +87,11 @@ When {{es}} starts for the first time, the security auto-configuration process b

Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically.

-Additionally, only nodes on the same host can join the cluster without additional configuration.
If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information.
+Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information.

To enroll new nodes in your cluster, create an enrollment token with the `elasticsearch-create-enrollment-token` tool on any existing node in your cluster. You can then start a new node with the `--enrollment-token` parameter so that it joins an existing cluster.

-1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes.
+1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes.

    ```sh
    bin\elasticsearch-create-enrollment-token -s node
@@ -194,7 +194,7 @@ You can install {{es}} as a service that runs in the background or starts automa
    TLS is not enabled or configured when you start {{es}} as a service.
    ::::

-3. Generate a password for the `elastic` user with the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. The password is output to the command line.
+3. Generate a password for the `elastic` user with the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. The password is output to the command line.

    ```sh
    C:\Program Files\elasticsearch-9.0.0-beta1\bin>elasticsearch-reset-password -u elastic
@@ -284,9 +284,9 @@ At its core, `elasticsearch-service.bat` relies on [Apache Commons Daemon](https

::::{note}
-By default, {{es}} automatically sizes JVM heap based on a node’s [roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) and total memory.
We recommend this default sizing for most production environments. If needed, you can override default sizing by manually setting the heap size. -When installing {{es}} on Windows as a service for the first time or running {{es}} from the command line, you can manually [Set the JVM heap size](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-heap-size). To resize the heap for an already installed service, use the service manager: `bin\elasticsearch-service.bat manager`. +When installing {{es}} on Windows as a service for the first time or running {{es}} from the command line, you can manually [Set the JVM heap size](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-heap-size). To resize the heap for an already installed service, use the service manager: `bin\elasticsearch-service.bat manager`. :::: diff --git a/deploy-manage/deploy/self-managed/install-from-archive-on-linux-macos.md b/deploy-manage/deploy/self-managed/install-from-archive-on-linux-macos.md index dc8991e99..748435391 100644 --- a/deploy-manage/deploy/self-managed/install-from-archive-on-linux-macos.md +++ b/deploy-manage/deploy/self-managed/install-from-archive-on-linux-macos.md @@ -77,7 +77,7 @@ If this is the first time you’re starting {{kib}}, this command generates a un 3. Log in to {{kib}} as the `elastic` user with the password that was generated when you started {{es}}. ::::{note} -If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. +If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. :::: diff --git a/deploy-manage/deploy/self-managed/install-on-windows.md b/deploy-manage/deploy/self-managed/install-on-windows.md index cd2e52e71..294f3045c 100644 --- a/deploy-manage/deploy/self-managed/install-on-windows.md +++ b/deploy-manage/deploy/self-managed/install-on-windows.md @@ -50,7 +50,7 @@ If this is the first time you’re starting {{kib}}, this command generates a un 3. Log in to {{kib}} as the `elastic` user with the password that was generated when you started {{es}}. ::::{note} -If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. 
+If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. :::: diff --git a/deploy-manage/deploy/self-managed/install-with-debian-package.md b/deploy-manage/deploy/self-managed/install-with-debian-package.md index 0316337fa..388be1cb5 100644 --- a/deploy-manage/deploy/self-managed/install-with-debian-package.md +++ b/deploy-manage/deploy/self-managed/install-with-debian-package.md @@ -49,7 +49,7 @@ When you start {{es}} for the first time, the following security configuration o The password and certificate and keys are output to your terminal. -You can then generate an enrollment token for {{kib}} with the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool: +You can then generate an enrollment token for {{kib}} with the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool: ```sh bin/elasticsearch-create-enrollment-token -s kibana diff --git a/deploy-manage/deploy/self-managed/install-with-rpm.md b/deploy-manage/deploy/self-managed/install-with-rpm.md index ce62ded7e..8750dc530 100644 --- a/deploy-manage/deploy/self-managed/install-with-rpm.md +++ b/deploy-manage/deploy/self-managed/install-with-rpm.md @@ -59,7 +59,7 @@ When you start {{es}} for the first time, the following security configuration o The password and certificate and keys are output to your terminal. -You can then generate an enrollment token for {{kib}} with the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool: +You can then generate an enrollment token for {{kib}} with the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool: ```sh bin/elasticsearch-create-enrollment-token -s kibana diff --git a/deploy-manage/deploy/self-managed/max-number-threads-check.md b/deploy-manage/deploy/self-managed/max-number-threads-check.md index 121251822..43e75f063 100644 --- a/deploy-manage/deploy/self-managed/max-number-threads-check.md +++ b/deploy-manage/deploy/self-managed/max-number-threads-check.md @@ -5,5 +5,5 @@ mapped_pages: # Maximum number of threads check [max-number-threads-check] -Elasticsearch executes requests by breaking the request down into stages and handing those stages off to different thread pool executors. There are different [thread pool executors](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md) for a variety of tasks within Elasticsearch. Thus, Elasticsearch needs the ability to create a lot of threads. The maximum number of threads check ensures that the Elasticsearch process has the rights to create enough threads under normal use. This check is enforced only on Linux. If you are on Linux, to pass the maximum number of threads check, you must configure your system to allow the Elasticsearch process the ability to create at least 4096 threads. 
This can be done via `/etc/security/limits.conf` using the `nproc` setting (note that you might have to increase the limits for the `root` user too).
+Elasticsearch executes requests by breaking the request down into stages and handing those stages off to different thread pool executors. There are different [thread pool executors](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md) for a variety of tasks within Elasticsearch. Thus, Elasticsearch needs the ability to create a lot of threads. The maximum number of threads check ensures that the Elasticsearch process has the rights to create enough threads under normal use. This check is enforced only on Linux. If you are on Linux, to pass the maximum number of threads check, you must configure your system to allow the Elasticsearch process the ability to create at least 4096 threads. This can be done via `/etc/security/limits.conf` using the `nproc` setting (note that you might have to increase the limits for the `root` user too).
diff --git a/deploy-manage/deploy/self-managed/networkaddress-cache-ttl.md b/deploy-manage/deploy/self-managed/networkaddress-cache-ttl.md
index c554d7a34..294304f53 100644
--- a/deploy-manage/deploy/self-managed/networkaddress-cache-ttl.md
+++ b/deploy-manage/deploy/self-managed/networkaddress-cache-ttl.md
@@ -5,5 +5,5 @@ mapped_pages:

# DNS cache settings [networkaddress-cache-ttl]

-Elasticsearch runs with a security manager in place. With a security manager in place, the JVM defaults to caching positive hostname resolutions indefinitely and defaults to caching negative hostname resolutions for ten seconds. Elasticsearch overrides this behavior with default values to cache positive lookups for sixty seconds, and to cache negative lookups for ten seconds. These values should be suitable for most environments, including environments where DNS resolutions vary with time. If not, you can edit the values `es.networkaddress.cache.ttl` and `es.networkaddress.cache.negative.ttl` in the [JVM options](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options). Note that the values [`networkaddress.cache.ttl=`](https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.md) and [`networkaddress.cache.negative.ttl=`](https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.md) in the [Java security policy](https://docs.oracle.com/javase/8/docs/technotes/guides/security/PolicyFiles.md) are ignored by Elasticsearch unless you remove the settings for `es.networkaddress.cache.ttl` and `es.networkaddress.cache.negative.ttl`.
+Elasticsearch runs with a security manager in place. With a security manager, the JVM defaults to caching positive hostname resolutions indefinitely and to caching negative hostname resolutions for ten seconds. Elasticsearch overrides this behavior with default values to cache positive lookups for sixty seconds, and to cache negative lookups for ten seconds. These values should be suitable for most environments, including environments where DNS resolutions vary with time. If not, you can edit the values `es.networkaddress.cache.ttl` and `es.networkaddress.cache.negative.ttl` in the [JVM options](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options). Note that the values [`networkaddress.cache.ttl=`](https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html) and [`networkaddress.cache.negative.ttl=`](https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html) in the [Java security policy](https://docs.oracle.com/javase/8/docs/technotes/guides/security/PolicyFiles.html) are ignored by Elasticsearch unless you remove the settings for `es.networkaddress.cache.ttl` and `es.networkaddress.cache.negative.ttl`.
diff --git a/deploy-manage/deploy/self-managed/plugins.md b/deploy-manage/deploy/self-managed/plugins.md
index 1bb9edc9b..7361a65f9 100644
--- a/deploy-manage/deploy/self-managed/plugins.md
+++ b/deploy-manage/deploy/self-managed/plugins.md
@@ -7,7 +7,7 @@ mapped_pages:

Plugins are a way to enhance the basic Elasticsearch functionality in a custom manner. They range from adding custom mapping types, custom analyzers (in a more built-in fashion), custom script engines, custom discovery and more.

-For information about selecting and installing plugins, see [{{es}} Plugins and Integrations](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/index.md).
+For information about selecting and installing plugins, see [{{es}} Plugins and Integrations](elasticsearch://reference/elasticsearch-plugins/index.md).

-For information about developing your own plugin, see [Help for plugin authors](asciidocalypse://docs/elasticsearch/docs/extend/index.md).
+For information about developing your own plugin, see [Help for plugin authors](elasticsearch://extend/index.md).
diff --git a/deploy-manage/deploy/self-managed/system-config-tcpretries.md b/deploy-manage/deploy/self-managed/system-config-tcpretries.md
index 91a3e83e4..8204b24b1 100644
--- a/deploy-manage/deploy/self-managed/system-config-tcpretries.md
+++ b/deploy-manage/deploy/self-managed/system-config-tcpretries.md
@@ -5,7 +5,7 @@ mapped_pages:

# TCP retransmission timeout [system-config-tcpretries]

-Each pair of {{es}} nodes communicates via a number of TCP connections which [remain open](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections) until one of the nodes shuts down or communication between the nodes is disrupted by a failure in the underlying infrastructure.
+Each pair of {{es}} nodes communicates via a number of TCP connections which [remain open](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections) until one of the nodes shuts down or communication between the nodes is disrupted by a failure in the underlying infrastructure.

TCP provides reliable communication over occasionally unreliable networks by hiding temporary network disruptions from the communicating applications. Your operating system will retransmit any lost messages a number of times before informing the sender of any problem. {{es}} must wait while the retransmissions are happening and can only react once the operating system decides to give up. Users must therefore also wait for a sequence of retransmissions to complete.

@@ -30,6 +30,6 @@ This setting applies to all TCP connections and will affect the reliability of c

{{es}} also implements its own internal health checks with timeouts that are much shorter than the default retransmission timeout on Linux. Since these are application-level health checks, their timeouts must allow for application-level effects such as garbage collection pauses.
You should not reduce any timeouts related to these application-level health checks.

-You must also ensure your network infrastructure does not interfere with the long-lived connections between nodes, [even if those connections appear to be idle](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). Devices which drop connections when they reach a certain age are a common source of problems to {{es}} clusters, and must not be used.
+You must also ensure your network infrastructure does not interfere with the long-lived connections between nodes, [even if those connections appear to be idle](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). Devices which drop connections when they reach a certain age are a common source of problems for {{es}} clusters, and must not be used.
diff --git a/deploy-manage/deploy/self-managed/vm-max-map-count.md b/deploy-manage/deploy/self-managed/vm-max-map-count.md
index 808376a7d..2e3f48241 100644
--- a/deploy-manage/deploy/self-managed/vm-max-map-count.md
+++ b/deploy-manage/deploy/self-managed/vm-max-map-count.md
@@ -5,7 +5,7 @@ mapped_pages:

# Virtual memory [vm-max-map-count]

-Elasticsearch uses a [`mmapfs`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/store.md#mmapfs) directory by default to store its indices. The default operating system limits on mmap counts is likely to be too low, which may result in out of memory exceptions.
+Elasticsearch uses a [`mmapfs`](elasticsearch://reference/elasticsearch/index-settings/store.md#mmapfs) directory by default to store its indices. The default operating system limit on mmap counts is likely to be too low, which may result in out-of-memory exceptions.

On Linux, you can increase the limits by running the following command as `root`:
diff --git a/deploy-manage/distributed-architecture.md b/deploy-manage/distributed-architecture.md
index a26c9a6de..a1a3cf9ed 100644
--- a/deploy-manage/distributed-architecture.md
+++ b/deploy-manage/distributed-architecture.md
@@ -16,5 +16,5 @@ The topics in this section provides information about the architecture of {{es}}

* [Shard allocation awareness](distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md): Learn how to use custom node attributes to distribute shards across different racks or availability zones.

-* [Shard request cache](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/shard-request-cache-settings.md): Learn how {{es}} caches search requests to improve performance.
+* [Shard request cache](elasticsearch://reference/elasticsearch/configuration-reference/shard-request-cache-settings.md): Learn how {{es}} caches search requests to improve performance.
diff --git a/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md b/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md
index 3b601a7d7..c3a2178a9 100644
--- a/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md
+++ b/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md
@@ -5,7 +5,7 @@ mapped_pages:

# Node roles [node-roles-overview]

-Any time that you start an instance of {{es}}, you are starting a *node*.
A collection of connected nodes is called a [cluster](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md). If you are running a single node of {{es}}, then you have a cluster of one node. All nodes know about all the other nodes in the cluster and can forward client requests to the appropriate node. +Any time that you start an instance of {{es}}, you are starting a *node*. A collection of connected nodes is called a [cluster](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md). If you are running a single node of {{es}}, then you have a cluster of one node. All nodes know about all the other nodes in the cluster and can forward client requests to the appropriate node. Each node performs one or more roles. Roles control the behavior of the node in the cluster. @@ -61,14 +61,14 @@ Similarly, each master-eligible node maintains the following data on disk: * the index metadata for every index in the cluster, and * the cluster-wide metadata, such as settings and index templates. -Each node checks the contents of its data path at startup. If it discovers unexpected data then it will refuse to start. This is to avoid importing unwanted [dangling indices](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/local-gateway.md#dangling-indices) which can lead to a red cluster health. To be more precise, nodes without the `data` role will refuse to start if they find any shard data on disk at startup, and nodes without both the `master` and `data` roles will refuse to start if they have any index metadata on disk at startup. +Each node checks the contents of its data path at startup. If it discovers unexpected data then it will refuse to start. This is to avoid importing unwanted [dangling indices](elasticsearch://reference/elasticsearch/configuration-reference/local-gateway.md#dangling-indices) which can lead to a red cluster health. To be more precise, nodes without the `data` role will refuse to start if they find any shard data on disk at startup, and nodes without both the `master` and `data` roles will refuse to start if they have any index metadata on disk at startup. It is possible to change the roles of a node by adjusting its `elasticsearch.yml` file and restarting it. This is known as *repurposing* a node. In order to satisfy the checks for unexpected data described above, you must perform some extra steps to prepare a node for repurposing when starting the node without the `data` or `master` roles. -* If you want to repurpose a data node by removing the `data` role then you should first use an [allocation filter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) to safely migrate all the shard data onto other nodes in the cluster. -* If you want to repurpose a node to have neither the `data` nor `master` roles then it is simplest to start a brand-new node with an empty data path and the desired roles. You may find it safest to use an [allocation filter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) to migrate the shard data elsewhere in the cluster first. 
+* If you want to repurpose a data node by removing the `data` role then you should first use an [allocation filter](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) to safely migrate all the shard data onto other nodes in the cluster. +* If you want to repurpose a node to have neither the `data` nor `master` roles then it is simplest to start a brand-new node with an empty data path and the desired roles. You may find it safest to use an [allocation filter](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) to migrate the shard data elsewhere in the cluster first. -If it is not possible to follow these extra steps then you may be able to use the [`elasticsearch-node repurpose`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/node-tool.md#node-tool-repurpose) tool to delete any excess data that prevents a node from starting. +If it is not possible to follow these extra steps then you may be able to use the [`elasticsearch-node repurpose`](elasticsearch://reference/elasticsearch/command-line-tools/node-tool.md#node-tool-repurpose) tool to delete any excess data that prevents a node from starting. ## Available node roles [node-roles-list] @@ -222,7 +222,7 @@ node.roles: [ data_warm ] Cold data nodes are part of the cold tier. When you no longer need to search time series data regularly, it can move from the warm tier to the cold tier. While still searchable, this tier is typically optimized for lower storage costs rather than search speed. -For better storage savings, you can keep [fully mounted indices](../../tools/snapshot-and-restore/searchable-snapshots.md#fully-mounted) of [{{search-snaps}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) on the cold tier. Unlike regular indices, these fully mounted indices don’t require replicas for reliability. In the event of a failure, they can recover data from the underlying snapshot instead. This potentially halves the local storage needed for the data. A snapshot repository is required to use fully mounted indices in the cold tier. Fully mounted indices are read-only. +For better storage savings, you can keep [fully mounted indices](../../tools/snapshot-and-restore/searchable-snapshots.md#fully-mounted) of [{{search-snaps}}](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) on the cold tier. Unlike regular indices, these fully mounted indices don’t require replicas for reliability. In the event of a failure, they can recover data from the underlying snapshot instead. This potentially halves the local storage needed for the data. A snapshot repository is required to use fully mounted indices in the cold tier. Fully mounted indices are read-only. Alternatively, you can use the cold tier to store regular indices with replicas instead of using {{search-snaps}}. This lets you store older data on less expensive hardware but doesn’t reduce required disk space compared to the warm tier. 
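By analogy with the warm-tier example shown above, a dedicated cold data node is declared in `elasticsearch.yml` with the corresponding role. A minimal sketch (real nodes often carry additional roles, so treat this as illustrative rather than prescriptive):

```yaml
# elasticsearch.yml: a dedicated cold tier data node
node.roles: [ data_cold ]
```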
diff --git a/deploy-manage/distributed-architecture/discovery-cluster-formation.md b/deploy-manage/distributed-architecture/discovery-cluster-formation.md
index ef0a74999..0a1f9bcc2 100644
--- a/deploy-manage/distributed-architecture/discovery-cluster-formation.md
+++ b/deploy-manage/distributed-architecture/discovery-cluster-formation.md
@@ -30,7 +30,7 @@ The following processes and settings are part of discovery and cluster formation

[Cluster fault detection](discovery-cluster-formation/cluster-fault-detection.md)
: {{es}} performs health checks to detect and remove faulty nodes.

-[Settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md)
+[Settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md)
: There are settings that enable users to influence the discovery, cluster formation, master election and fault detection processes.

diff --git a/deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md b/deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md
index 4428525e6..6fb315168 100644
--- a/deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md
+++ b/deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md
@@ -7,12 +7,12 @@ mapped_pages:

The elected master periodically checks each of the nodes in the cluster to ensure that they are still connected and healthy. Each node in the cluster also periodically checks the health of the elected master. These checks are known respectively as *follower checks* and *leader checks*.

-Elasticsearch allows these checks to occasionally fail or timeout without taking any action. It considers a node to be faulty only after a number of consecutive checks have failed. You can control fault detection behavior with [`cluster.fault_detection.*` settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md).
+Elasticsearch allows these checks to occasionally fail or time out without taking any action. It considers a node to be faulty only after a number of consecutive checks have failed. You can control fault detection behavior with [`cluster.fault_detection.*` settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md).

If the elected master detects that a node has disconnected, however, this situation is treated as an immediate failure. The master bypasses the timeout and retry setting values and attempts to remove the node from the cluster. Similarly, if a node detects that the elected master has disconnected, this situation is treated as an immediate failure. The node bypasses the timeout and retry settings and restarts its discovery phase to try and find or elect a new master.

$$$cluster-fault-detection-filesystem-health$$$
-Additionally, each node periodically verifies that its data path is healthy by writing a small file to disk and then deleting it again. If a node discovers its data path is unhealthy then it is removed from the cluster until the data path recovers. You can control this behavior with the [`monitor.fs.health` settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md). 
+Additionally, each node periodically verifies that its data path is healthy by writing a small file to disk and then deleting it again. If a node discovers its data path is unhealthy then it is removed from the cluster until the data path recovers. You can control this behavior with the [`monitor.fs.health` settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md). $$$cluster-fault-detection-cluster-state-publishing$$$ The elected master node will also remove nodes from the cluster if nodes are unable to apply an updated cluster state within a reasonable time. The timeout defaults to 2 minutes starting from the beginning of the cluster state update. Refer to [Publishing the cluster state](cluster-state-overview.md#cluster-state-publishing) for a more detailed description. @@ -22,27 +22,27 @@ The elected master node will also remove nodes from the cluster if nodes are una See [*Troubleshooting an unstable cluster*](../../../troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md). -#### Diagnosing `disconnected` nodes [_diagnosing_disconnected_nodes] +#### Diagnosing `disconnected` nodes [_diagnosing_disconnected_nodes] See [Diagnosing `disconnected` nodes](../../../troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md#troubleshooting-unstable-cluster-disconnected). -#### Diagnosing `lagging` nodes [_diagnosing_lagging_nodes] +#### Diagnosing `lagging` nodes [_diagnosing_lagging_nodes] See [Diagnosing `lagging` nodes](../../../troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md#troubleshooting-unstable-cluster-lagging). -#### Diagnosing `follower check retry count exceeded` nodes [_diagnosing_follower_check_retry_count_exceeded_nodes] +#### Diagnosing `follower check retry count exceeded` nodes [_diagnosing_follower_check_retry_count_exceeded_nodes] See [Diagnosing `follower check retry count exceeded` nodes](../../../troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md#troubleshooting-unstable-cluster-follower-check). -#### Diagnosing `ShardLockObtainFailedException` failures [_diagnosing_shardlockobtainfailedexception_failures] +#### Diagnosing `ShardLockObtainFailedException` failures [_diagnosing_shardlockobtainfailedexception_failures] See [Diagnosing `ShardLockObtainFailedException` failures](../../../troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md#troubleshooting-unstable-cluster-shardlockobtainfailedexception). -#### Diagnosing other network disconnections [_diagnosing_other_network_disconnections] +#### Diagnosing other network disconnections [_diagnosing_other_network_disconnections] See [Diagnosing other network disconnections](../../../troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md#troubleshooting-unstable-cluster-network). 
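For reference, the follower and leader checks described above are tuned through node-level settings. The sketch below shows what we believe are the default values; verify them against the discovery and cluster formation settings reference before changing anything, and remember that these timeouts should generally not be reduced:

```yaml
# Follower checks: the elected master probing every other node (assumed defaults)
cluster.fault_detection.follower_check.interval: 1s
cluster.fault_detection.follower_check.timeout: 10s
cluster.fault_detection.follower_check.retry_count: 3

# Leader checks: every node probing the elected master (assumed defaults)
cluster.fault_detection.leader_check.interval: 1s
cluster.fault_detection.leader_check.timeout: 10s
cluster.fault_detection.leader_check.retry_count: 3
```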
diff --git a/deploy-manage/distributed-architecture/discovery-cluster-formation/discovery-hosts-providers.md b/deploy-manage/distributed-architecture/discovery-cluster-formation/discovery-hosts-providers.md
index 2316afa77..c032a857a 100644
--- a/deploy-manage/distributed-architecture/discovery-cluster-formation/discovery-hosts-providers.md
+++ b/deploy-manage/distributed-architecture/discovery-cluster-formation/discovery-hosts-providers.md
@@ -19,12 +19,12 @@ Refer to [Troubleshooting discovery](../../../troubleshoot/elasticsearch/discove

## Seed hosts providers [built-in-hosts-providers]

-By default the cluster formation module offers two seed hosts providers to configure the list of seed nodes: a *settings*-based and a *file*-based seed hosts provider. It can be extended to support cloud environments and other forms of seed hosts providers via [discovery plugins](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/discovery-plugins.md). Seed hosts providers are configured using the `discovery.seed_providers` setting, which defaults to the *settings*-based hosts provider. This setting accepts a list of different providers, allowing you to make use of multiple ways to find the seed hosts for your cluster.
+By default, the cluster formation module offers two seed hosts providers to configure the list of seed nodes: a *settings*-based and a *file*-based seed hosts provider. It can be extended to support cloud environments and other forms of seed hosts providers via [discovery plugins](elasticsearch://reference/elasticsearch-plugins/discovery-plugins.md). Seed hosts providers are configured using the `discovery.seed_providers` setting, which defaults to the *settings*-based hosts provider. This setting accepts a list of different providers, allowing you to make use of multiple ways to find the seed hosts for your cluster.

Each seed hosts provider yields the IP addresses or hostnames of the seed nodes. If it returns any hostnames then these are resolved to IP addresses using a DNS lookup. If a hostname resolves to multiple IP addresses then {{es}} tries to find a seed node at all of these addresses. If the hosts provider does not explicitly give the TCP port of the node, then it will implicitly use the first port in the port range given by `transport.profiles.default.port`, or by `transport.port` if `transport.profiles.default.port` is not set. The number of concurrent lookups is controlled by `discovery.seed_resolver.max_concurrent_resolvers` which defaults to `10`, and the timeout for each lookup is controlled by `discovery.seed_resolver.timeout` which defaults to `5s`. Note that DNS lookups are subject to [JVM DNS caching](../../deploy/self-managed/networkaddress-cache-ttl.md).

-#### Settings-based seed hosts provider [settings-based-hosts-provider] 
+#### Settings-based seed hosts provider [settings-based-hosts-provider]

The settings-based seed hosts provider uses a node setting to configure a static list of the addresses of the seed nodes. These addresses can be given as hostnames or IP addresses; hosts specified as hostnames are resolved to IP addresses during each round of discovery.

@@ -42,7 +42,7 @@ discovery.seed_hosts:

-#### File-based seed hosts provider [file-based-hosts-provider] 
+#### File-based seed hosts provider [file-based-hosts-provider]

The file-based seed hosts provider configures a list of hosts via an external file. {{es}} reloads this file when it changes, so that the list of seed nodes can change dynamically without needing to restart each node. 
For example, this gives a convenient mechanism for an {{es}} instance that is run in a Docker container to be dynamically supplied with a list of IP addresses to connect to when those IP addresses may not be known at node startup.

@@ -73,18 +73,18 @@ Host names are allowed instead of IP addresses and are resolved by DNS as descri

You can also add comments to this file. All comments must appear on their own lines starting with `#` (i.e. comments cannot start in the middle of a line).

-#### EC2 hosts provider [ec2-hosts-provider] 
+#### EC2 hosts provider [ec2-hosts-provider]

-The [EC2 discovery plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/discovery-ec2.md) adds a hosts provider that uses the [AWS API](https://github.com/aws/aws-sdk-java) to find a list of seed nodes.
+The [EC2 discovery plugin](elasticsearch://reference/elasticsearch-plugins/discovery-ec2.md) adds a hosts provider that uses the [AWS API](https://github.com/aws/aws-sdk-java) to find a list of seed nodes.

-#### Azure Classic hosts provider [azure-classic-hosts-provider] 
+#### Azure Classic hosts provider [azure-classic-hosts-provider]

-The [Azure Classic discovery plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/discovery-azure-classic.md) adds a hosts provider that uses the Azure Classic API find a list of seed nodes.
+The [Azure Classic discovery plugin](elasticsearch://reference/elasticsearch-plugins/discovery-azure-classic.md) adds a hosts provider that uses the Azure Classic API to find a list of seed nodes.

-#### Google Compute Engine hosts provider [gce-hosts-provider] 
+#### Google Compute Engine hosts provider [gce-hosts-provider]

-The [GCE discovery plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/discovery-gce.md) adds a hosts provider that uses the GCE API find a list of seed nodes.
+The [GCE discovery plugin](elasticsearch://reference/elasticsearch-plugins/discovery-gce.md) adds a hosts provider that uses the GCE API to find a list of seed nodes.

diff --git a/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-bootstrap-cluster.md b/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-bootstrap-cluster.md
index dd579557b..7e4e85093 100644
--- a/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-bootstrap-cluster.md
+++ b/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-bootstrap-cluster.md
@@ -11,12 +11,12 @@ The initial set of master-eligible nodes is defined in the [`cluster.initial_mas

* The [node name](../../deploy/self-managed/important-settings-configuration.md#node-name) of the node.
* The node’s hostname if `node.name` is not set, because `node.name` defaults to the node’s hostname. You must use either the fully-qualified hostname or the bare hostname [depending on your system configuration](#modules-discovery-bootstrap-cluster-fqdns).
-* The IP address of the node’s [transport publish address](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-binding-publishing), if it is not possible to use the `node.name` of the node. 
This is normally the IP address to which [`network.host`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#common-network-settings) resolves but [this can be overridden](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#advanced-network-settings). +* The IP address of the node’s [transport publish address](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-binding-publishing), if it is not possible to use the `node.name` of the node. This is normally the IP address to which [`network.host`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#common-network-settings) resolves but [this can be overridden](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#advanced-network-settings). * The IP address and port of the node’s publish address, in the form `IP:PORT`, if it is not possible to use the `node.name` of the node and there are multiple nodes sharing a single IP address. Do not set `cluster.initial_master_nodes` on master-ineligible nodes. -::::{important} +::::{important} After the cluster has formed, remove the `cluster.initial_master_nodes` setting from each node’s configuration and never set it again for this cluster. Do not configure this setting on nodes joining an existing cluster. Do not configure this setting on nodes which are restarting. Do not configure this setting when performing a full-cluster restart. If you leave `cluster.initial_master_nodes` in place once the cluster has formed then there is a risk that a future misconfiguration may result in bootstrapping a new cluster alongside your existing cluster. It may not be possible to recover from this situation without losing data. @@ -39,7 +39,7 @@ cluster.initial_master_nodes: - master-c ``` -::::{important} +::::{important} You must set `cluster.initial_master_nodes` to the same list of nodes on each node on which it is set in order to be sure that only a single cluster forms during bootstrapping. If `cluster.initial_master_nodes` varies across the nodes on which it is set then you may bootstrap multiple clusters. It is usually not possible to recover from this situation without losing data. :::: @@ -63,7 +63,7 @@ This message shows the node names `master-a.example.com` and `master-b.example.c ## Choosing a cluster name [bootstrap-cluster-name] -The [`cluster.name`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name) setting enables you to create multiple clusters which are separated from each other. Nodes verify that they agree on their cluster name when they first connect to each other, and Elasticsearch will only form a cluster from nodes that all have the same cluster name. The default value for the cluster name is `elasticsearch`, but it is recommended to change this to reflect the logical name of the cluster. +The [`cluster.name`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name) setting enables you to create multiple clusters which are separated from each other. Nodes verify that they agree on their cluster name when they first connect to each other, and Elasticsearch will only form a cluster from nodes that all have the same cluster name. 
The default value for the cluster name is `elasticsearch`, but it is recommended to change this to reflect the logical name of the cluster. ## Auto-bootstrapping in development mode [bootstrap-auto-bootstrap] @@ -84,14 +84,14 @@ Once an {{es}} node has joined an existing cluster, or bootstrapped a new cluste If you intended to add a node into an existing cluster but instead bootstrapped a separate single-node cluster then you must start again: 1. Shut down the node. -2. Completely wipe the node by deleting the contents of its [data folder](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#data-path). +2. Completely wipe the node by deleting the contents of its [data folder](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#data-path). 3. Configure `discovery.seed_hosts` or `discovery.seed_providers` and other relevant discovery settings. Ensure `cluster.initial_master_nodes` is not set on any node. 4. Restart the node and verify that it joins the existing cluster rather than forming its own one-node cluster. If you intended to form a new multi-node cluster but instead bootstrapped a collection of single-node clusters then you must start again: 1. Shut down all the nodes. -2. Completely wipe each node by deleting the contents of their [data folders](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#data-path). +2. Completely wipe each node by deleting the contents of their [data folders](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#data-path). 3. Configure `cluster.initial_master_nodes` as described above. 4. Configure `discovery.seed_hosts` or `discovery.seed_providers` and other relevant discovery settings. 5. Restart all the nodes and verify that they have formed a single cluster. diff --git a/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md b/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md index 4481774ef..eba853009 100644 --- a/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md +++ b/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md @@ -24,7 +24,7 @@ After a master-eligible node has joined or left the cluster the elected master m ## Master elections [_master_elections] -Elasticsearch uses an election process to agree on an elected master node, both at startup and if the existing elected master fails. Any master-eligible node can start an election, and normally the first election that takes place will succeed. Elections only usually fail when two nodes both happen to start their elections at about the same time, so elections are scheduled randomly on each node to reduce the probability of this happening. Nodes will retry elections until a master is elected, backing off on failure, so that eventually an election will succeed (with arbitrarily high probability). The scheduling of master elections are controlled by the [master election settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md#master-election-settings). +Elasticsearch uses an election process to agree on an elected master node, both at startup and if the existing elected master fails. 
Any master-eligible node can start an election, and normally the first election that takes place will succeed. Elections usually fail only when two nodes both happen to start their elections at about the same time, so elections are scheduled randomly on each node to reduce the probability of this happening. Nodes will retry elections until a master is elected, backing off on failure, so that eventually an election will succeed (with arbitrarily high probability). The scheduling of master elections is controlled by the [master election settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md#master-election-settings).

## Cluster maintenance, rolling restarts and migrations [_cluster_maintenance_rolling_restarts_and_migrations]

diff --git a/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-voting.md b/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-voting.md
index 9c0c9229e..4de16d881 100644
--- a/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-voting.md
+++ b/deploy-manage/distributed-architecture/discovery-cluster-formation/modules-discovery-voting.md
@@ -32,7 +32,7 @@ The current voting configuration is not necessarily the same as the set of all a

Larger voting configurations are usually more resilient, so Elasticsearch normally prefers to add master-eligible nodes to the voting configuration after they join the cluster. Similarly, if a node in the voting configuration leaves the cluster and there is another master-eligible node in the cluster that is not in the voting configuration then it is preferable to swap these two nodes over. The size of the voting configuration is thus unchanged but its resilience increases.

-It is not so straightforward to automatically remove nodes from the voting configuration after they have left the cluster. Different strategies have different benefits and drawbacks, so the right choice depends on how the cluster will be used. You can control whether the voting configuration automatically shrinks by using the [`cluster.auto_shrink_voting_configuration` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md).
+It is not so straightforward to automatically remove nodes from the voting configuration after they have left the cluster. Different strategies have different benefits and drawbacks, so the right choice depends on how the cluster will be used. You can control whether the voting configuration automatically shrinks by using the [`cluster.auto_shrink_voting_configuration` setting](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md).

::::{note}
If `cluster.auto_shrink_voting_configuration` is set to `true` (which is the default and recommended value) and there are at least three master-eligible nodes in the cluster, Elasticsearch remains capable of processing cluster state updates as long as all but one of its master-eligible nodes are healthy. 
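As a sketch of how this might look in practice, `cluster.auto_shrink_voting_configuration` is a cluster-wide setting and should be adjustable through the cluster update settings API; here it is set explicitly to its default of `true`:

```console
PUT /_cluster/settings
{
  "persistent": {
    "cluster.auto_shrink_voting_configuration": true
  }
}
```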
diff --git a/deploy-manage/distributed-architecture/reading-and-writing-documents.md b/deploy-manage/distributed-architecture/reading-and-writing-documents.md index 559d0290e..ad038228e 100644 --- a/deploy-manage/distributed-architecture/reading-and-writing-documents.md +++ b/deploy-manage/distributed-architecture/reading-and-writing-documents.md @@ -41,7 +41,7 @@ These indexing stages (coordinating, primary, and replica) are sequential. To en Many things can go wrong during indexing — disks can get corrupted, nodes can be disconnected from each other, or some configuration mistake could cause an operation to fail on a replica despite it being successful on the primary. These are infrequent but the primary has to respond to them. -In the case that the primary itself fails, the node hosting the primary will send a message to the master about it. The indexing operation will wait (up to 1 minute, by [default](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md)) for the master to promote one of the replicas to be a new primary. The operation will then be forwarded to the new primary for processing. Note that the master also monitors the health of the nodes and may decide to proactively demote a primary. This typically happens when the node holding the primary is isolated from the cluster by a networking issue. See [here](#demoted-primary) for more details. +In the case that the primary itself fails, the node hosting the primary will send a message to the master about it. The indexing operation will wait (up to 1 minute, by [default](elasticsearch://reference/elasticsearch/index-settings/index-modules.md)) for the master to promote one of the replicas to be a new primary. The operation will then be forwarded to the new primary for processing. Note that the master also monitors the health of the nodes and may decide to proactively demote a primary. This typically happens when the node holding the primary is isolated from the cluster by a networking issue. See [here](#demoted-primary) for more details. Once the operation has been successfully performed on the primary, the primary has to deal with potential failures when executing it on the replica shards. This may be caused by an actual failure on the replica or due to a network issue preventing the operation from reaching the replica (or preventing the replica from responding). All of these share the same end result: a replica which is part of the in-sync replica set misses an operation that is about to be acknowledged. In order to avoid violating the invariant, the primary sends a message to the master requesting that the problematic shard be removed from the in-sync replica set. Only once removal of the shard has been acknowledged by the master does the primary acknowledge the operation. Note that the master will also instruct another node to start building a new shard copy in order to restore the system to a healthy state. @@ -62,7 +62,7 @@ Reads in Elasticsearch can be very lightweight lookups by ID or a heavy search r When a read request is received by a node, that node is responsible for forwarding it to the nodes that hold the relevant shards, collating the responses, and responding to the client. We call that node the *coordinating node* for that request. The basic flow is as follows: 1. Resolve the read requests to the relevant shards. 
Note that since most searches will be sent to one or more indices, they typically need to read from multiple shards, each representing a different subset of the data.
-2. Select an active copy of each relevant shard, from the shard replication group. This can be either the primary or a replica. By default, {{es}} uses [adaptive replica selection](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/search-shard-routing.md#search-adaptive-replica) to select the shard copies.
+2. Select an active copy of each relevant shard, from the shard replication group. This can be either the primary or a replica. By default, {{es}} uses [adaptive replica selection](elasticsearch://reference/elasticsearch/rest-apis/search-shard-routing.md#search-adaptive-replica) to select the shard copies.
3. Send shard level read requests to the selected copies.
4. Combine the results and respond. Note that in the case of a get-by-ID lookup, only one shard is relevant and this step can be skipped.

diff --git a/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery.md b/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery.md
index 6fe8da97b..653dec505 100644
--- a/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery.md
+++ b/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery.md
@@ -9,7 +9,7 @@ Each [index](../../manage-data/data-store/index-basics.md) in Elasticsearch is d

A cluster can contain multiple copies of a shard. Each shard has one distinguished shard copy called the *primary*, and zero or more non-primary copies called *replicas*. The primary shard copy serves as the main entry point for all indexing operations. The operations on the primary shard copy are then forwarded to its replicas.

-Replicas maintain redundant copies of your data across the [nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md) in your cluster, protecting against hardware failure and increasing capacity to serve read requests like searching or retrieving a document. If the primary shard copy fails, then a replica is promoted to primary and takes over the primary’s responsibilities.
+Replicas maintain redundant copies of your data across the [nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md) in your cluster, protecting against hardware failure and increasing capacity to serve read requests like searching or retrieving a document. If the primary shard copy fails, then a replica is promoted to primary and takes over the primary’s responsibilities.

Over the course of normal operation, Elasticsearch allocates shard copies to nodes, relocates shard copies across nodes to balance the cluster or satisfy new allocation constraints, and recovers shards to initialize new copies. In this topic, you’ll learn how these operations work and how you can control them.

@@ -30,7 +30,7 @@ By default, the primary and replica shard copies for an index can be allocated t

You can control how shard copies are allocated using the following settings:

-* [Cluster-level shard allocation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md): Use these settings to control how shard copies are allocated and balanced across the entire cluster. 
For example, you might want to [allocate nodes availability zones](shard-allocation-relocation-recovery/shard-allocation-awareness.md), or prevent certain nodes from being used so you can perform maintenance.
+* [Cluster-level shard allocation settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md): Use these settings to control how shard copies are allocated and balanced across the entire cluster. For example, you might want to [allocate nodes across availability zones](shard-allocation-relocation-recovery/shard-allocation-awareness.md), or prevent certain nodes from being used so you can perform maintenance.
* [Index-level shard allocation settings](shard-allocation-relocation-recovery/index-level-shard-allocation.md): Use these settings to control how the shard copies for a specific index are allocated. For example, you might want to allocate an index to a node in a specific data tier, or to a node with specific attributes.

@@ -67,8 +67,8 @@ You can determine the cause of a shard recovery using the [recovery](https://www.

To control how shards are recovered, for example the resources that can be used by recovery operations, and which indices should be prioritized for recovery, you can adjust the following settings:

-* [Index recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md)
-* [Cluster-level shard allocation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md)
+* [Index recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md)
+* [Cluster-level shard allocation settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md)
* [Index-level shard allocation settings](shard-allocation-relocation-recovery/index-level-shard-allocation.md), including [delayed allocation](shard-allocation-relocation-recovery/delaying-allocation-when-node-leaves.md) and [index recovery prioritization](shard-allocation-relocation-recovery/index-level-shard-allocation.md)

Shard recovery operations also respect general shard allocation settings.

@@ -91,7 +91,7 @@ When a shard copy is relocated, it is created as a new shard copy on the target

### Adjust shard relocation settings [_adjust_shard_relocation_settings]

-You can control how and when shard copies are relocated. For example, you can adjust the rebalancing settings that control when shard copies are relocated to balance the cluster, or the high watermark for disk-based shard allocation that can trigger relocation. These settings are part of the [cluster-level shard allocation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md).
+You can control how and when shard copies are relocated. For example, you can adjust the rebalancing settings that control when shard copies are relocated to balance the cluster, or the high watermark for disk-based shard allocation that can trigger relocation. These settings are part of the [cluster-level shard allocation settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md).

Shard relocation operations also respect shard allocation and recovery settings. 
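For example, the disk-based trigger mentioned above is governed by dynamic watermark settings. A minimal sketch that sets them explicitly; `85%` and `90%` are, to the best of our knowledge, the defaults and are shown purely for illustration, not as recommendations:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
```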
diff --git a/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/delaying-allocation-when-node-leaves.md b/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/delaying-allocation-when-node-leaves.md index 29e21ced5..3bffc7e50 100644 --- a/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/delaying-allocation-when-node-leaves.md +++ b/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/delaying-allocation-when-node-leaves.md @@ -13,7 +13,7 @@ When a node leaves the cluster for whatever reason, intentional or otherwise, th These actions are intended to protect the cluster against data loss by ensuring that every shard is fully replicated as soon as possible. -Even though we throttle concurrent recoveries both at the [node level](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md) and at the [cluster level](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-settings), this shard-shuffle can still put a lot of extra load on the cluster which may not be necessary if the missing node is likely to return soon. Imagine this scenario: +Even though we throttle concurrent recoveries both at the [node level](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md) and at the [cluster level](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-settings), this shard-shuffle can still put a lot of extra load on the cluster which may not be necessary if the missing node is likely to return soon. Imagine this scenario: * Node 5 loses network connectivity. * The master promotes a replica shard to primary for each primary that was on Node 5. @@ -47,7 +47,7 @@ With delayed allocation enabled, the above scenario changes to look like this: * Node 5 returns after a few minutes, before the `timeout` expires. * The missing replicas are re-allocated to Node 5 (and sync-flushed shards recover almost immediately). -::::{note} +::::{note} This setting will not affect the promotion of replicas to primaries, nor will it affect the assignment of replicas that have not been assigned previously. In particular, delayed allocation does not come into effect after a full cluster restart. Also, in case of a master failover situation, elapsed delay time is forgotten (i.e. reset to the full initial delay). :::: diff --git a/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md b/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md index 495199f72..d91216a9b 100644 --- a/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md +++ b/deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md @@ -23,7 +23,7 @@ Learn more about [designing resilient clusters](../../production-guidance/availa To enable shard allocation awareness: -1. Specify the location of each node with a [custom node attribute](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#custom-node-attributes). 
For example, if you want Elasticsearch to distribute shards across different racks, you might use an awareness attribute called `rack_id`. +1. Specify the location of each node with a [custom node attribute](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#custom-node-attributes). For example, if you want Elasticsearch to distribute shards across different racks, you might use an awareness attribute called `rack_id`. You can set custom attributes in two ways: diff --git a/deploy-manage/maintenance/add-and-remove-elasticsearch-nodes.md b/deploy-manage/maintenance/add-and-remove-elasticsearch-nodes.md index e71a68179..175a087ea 100644 --- a/deploy-manage/maintenance/add-and-remove-elasticsearch-nodes.md +++ b/deploy-manage/maintenance/add-and-remove-elasticsearch-nodes.md @@ -16,7 +16,7 @@ If you are running a single instance of {{es}}, you have a cluster of one node. :alt: A cluster with one node and three primary shards ::: -You add nodes to a cluster to increase its capacity and reliability. By default, a node is both a data node and eligible to be elected as the master node that controls the cluster. You can also configure a new node for a specific purpose, such as handling ingest requests. For more information, see [Nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md). +You add nodes to a cluster to increase its capacity and reliability. By default, a node is both a data node and eligible to be elected as the master node that controls the cluster. You can also configure a new node for a specific purpose, such as handling ingest requests. For more information, see [Nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md). When you add more nodes to a cluster, it automatically allocates replica shards. When all primary and replica shards are active, the cluster state changes to green. @@ -47,11 +47,11 @@ When {{es}} starts for the first time, the security auto-configuration process b Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically. -Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. +Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. 
Refer to [transport settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information.

To enroll new nodes in your cluster, create an enrollment token with the `elasticsearch-create-enrollment-token` tool on any existing node in your cluster. You can then start a new node with the `--enrollment-token` parameter so that it joins an existing cluster.

-1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes.
+1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes.

    ```sh
    bin/elasticsearch-create-enrollment-token -s node
    ```

@@ -73,7 +73,7 @@ To enroll new nodes in your cluster, create an enrollment token with the `elasti

3. Repeat the previous step for any new nodes that you want to enroll.

-For more information about discovery and shard allocation, refer to [*Discovery and cluster formation*](../distributed-architecture/discovery-cluster-formation.md) and [Cluster-level shard allocation and routing settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md).
+For more information about discovery and shard allocation, refer to [*Discovery and cluster formation*](../distributed-architecture/discovery-cluster-formation.md) and [Cluster-level shard allocation and routing settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md).

## Master-eligible nodes [add-elasticsearch-nodes-master-eligible]

@@ -122,7 +122,7 @@ Adding an exclusion for a node creates an entry for that node in the voting conf

GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions
```

-This list is limited in size by the `cluster.max_voting_config_exclusions` setting, which defaults to `10`. See [Discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md). Since voting configuration exclusions are persistent and limited in number, they must be cleaned up. Normally an exclusion is added when performing some maintenance on the cluster, and the exclusions should be cleaned up when the maintenance is complete. Clusters should have no voting configuration exclusions in normal operation.
+This list is limited in size by the `cluster.max_voting_config_exclusions` setting, which defaults to `10`. See [Discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md). Since voting configuration exclusions are persistent and limited in number, they must be cleaned up. Normally an exclusion is added when performing some maintenance on the cluster, and the exclusions should be cleaned up when the maintenance is complete. Clusters should have no voting configuration exclusions in normal operation.
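As a sketch of the add-then-clean-up lifecycle described here, where the node name `master-a` is purely illustrative: the `POST` adds an exclusion before maintenance begins, and the `DELETE` clears all exclusions once maintenance is complete.

```console
POST /_cluster/voting_config_exclusions?node_names=master-a

DELETE /_cluster/voting_config_exclusions
```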
If a node is excluded from the voting configuration because it is to be shut down permanently, its exclusion can be removed after it is shut down and removed from the cluster. Exclusions that were created in error, or that were only required temporarily, can also be cleared by specifying `?wait_for_removal=false`.

diff --git a/deploy-manage/maintenance/ece/start-stop-routing-requests.md b/deploy-manage/maintenance/ece/start-stop-routing-requests.md
index c4c0367d4..9130380f3 100644
--- a/deploy-manage/maintenance/ece/start-stop-routing-requests.md
+++ b/deploy-manage/maintenance/ece/start-stop-routing-requests.md
@@ -16,7 +16,7 @@ The {{ecloud}} proxy routes HTTP requests to its deployment’s individual produ

It might be helpful to temporarily block upstream requests in order to protect some or all instances or products within your deployment. For example, you might stop request routing in the following cases:

* If another team within your company starts streaming new data into your production {{integrations-server}} without previous load testing, both it and {{es}} might experience performance issues. You might consider stopping routing requests on all {{integrations-server}} instances in order to protect your downstream {{es}} instance.
-* If {{es}} is being overwhelmed by upstream requests, it might experience increased response times or even become unresponsive. This might impact your ability to resize components in your deployment and increase the duration of pending plans or increase the chance of plan changes failing. Because every {{es}} node is an [implicit coordinating node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md), you should stop routing requests across all {{es}} nodes to completely block upstream traffic.
+* If {{es}} is being overwhelmed by upstream requests, it might experience increased response times or even become unresponsive. This might impact your ability to resize components in your deployment and increase the duration of pending plans or increase the chance of plan changes failing. Because every {{es}} node is an [implicit coordinating node](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md), you should stop routing requests across all {{es}} nodes to completely block upstream traffic.

## Considerations [request-routing-considerations]

diff --git a/deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md b/deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md
index a4674e858..31bd0123c 100644
--- a/deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md
+++ b/deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md
@@ -11,14 +11,14 @@ applies_to:

There may be [situations where you want to perform a full-cluster restart](../../security/secure-cluster-communications.md) or a rolling restart. In the case of a [full-cluster restart](#restart-cluster-full), you shut down and restart all the nodes in the cluster, while in the case of a [rolling restart](#restart-cluster-rolling), you shut down only one node at a time, so the service remains uninterrupted.

::::{warning}
-Nodes exceeding the low watermark threshold will be slow to restart. 
Reduce the disk usage below the [low watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-low) before restarting nodes. +Nodes exceeding the low watermark threshold will be slow to restart. Reduce the disk usage below the [low watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-low) before restarting nodes. :::: ## Full-cluster restart [restart-cluster-full] 1. **Disable shard allocation.** - When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role): + When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role): ```console PUT _cluster/settings @@ -29,7 +29,7 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the } ``` - You can also consider [gateway settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/local-gateway.md) when restarting large clusters to reduce initial strain while nodes are processing [through discovery](../../distributed-architecture/discovery-cluster-formation.md). + You can also consider [gateway settings](elasticsearch://reference/elasticsearch/configuration-reference/local-gateway.md) when restarting large clusters to reduce initial strain while nodes are processing [through discovery](../../distributed-architecture/discovery-cluster-formation.md). 2. **Stop indexing and perform a flush.** Performing a [flush](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-flush) speeds up shard recovery. @@ -124,7 +124,7 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the ## Rolling restart [restart-cluster-rolling] 1. **Disable shard allocation.** - When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. 
You can avoid racing the clock by [disabling allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role): + When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](../../distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role): ```console PUT _cluster/settings @@ -135,7 +135,7 @@ Nodes exceeding the low watermark threshold will be slow to restart. Reduce the } ``` - You can also consider [gateway settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/local-gateway.md) when restarting large clusters to reduce initial strain while nodes are processing [through discovery](../../distributed-architecture/discovery-cluster-formation.md). + You can also consider [gateway settings](elasticsearch://reference/elasticsearch/configuration-reference/local-gateway.md) when restarting large clusters to reduce initial strain while nodes are processing [through discovery](../../distributed-architecture/discovery-cluster-formation.md). 2. **Stop non-essential indexing and perform a flush.** (Optional) While you can continue indexing during the rolling restart, shard recovery can be faster if you temporarily stop non-essential indexing and perform a [flush](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-flush). diff --git a/deploy-manage/maintenance/start-stop-services/start-stop-kibana.md b/deploy-manage/maintenance/start-stop-services/start-stop-kibana.md index b92b57f5f..9390bc9c0 100644 --- a/deploy-manage/maintenance/start-stop-services/start-stop-kibana.md +++ b/deploy-manage/maintenance/start-stop-services/start-stop-kibana.md @@ -30,8 +30,8 @@ If this is the first time you’re starting {{kib}}, this command generates a un 2. In your browser, paste the enrollment token that was generated in the terminal when you started {{es}}, and then click the button to connect your {{kib}} instance with {{es}}. 3. Log in to {{kib}} as the `elastic` user with the password that was generated when you started {{es}}. -::::{note} -If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. +::::{note} +If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. 
To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. :::: @@ -55,8 +55,8 @@ If this is the first time you’re starting {{kib}}, this command generates a un 2. In your browser, paste the enrollment token that was generated in the terminal when you started {{es}}, and then click the button to connect your {{kib}} instance with {{es}}. 3. Log in to {{kib}} as the `elastic` user with the password that was generated when you started {{es}}. -::::{note} -If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. +::::{note} +If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. :::: diff --git a/deploy-manage/manage-connectors.md b/deploy-manage/manage-connectors.md index 085c1f30a..1c6d76880 100644 --- a/deploy-manage/manage-connectors.md +++ b/deploy-manage/manage-connectors.md @@ -12,7 +12,7 @@ applies_to: Connectors serve as a central place to store connection information for both Elastic and third-party systems. They enable the linking of actions to rules, which execute as background tasks on the {{kib}} server when rule conditions are met. This allows rules to route actions to various destinations such as log files, ticketing systems, and messaging tools. Different {{kib}} apps may have their own rule types, but they typically share connectors. The **{{stack-manage-app}} > {{connectors-ui}}** provides a central location to view and manage all connectors in the current space. ::::{note} -This page is about {{kib}} connectors that integrate with services like generative AI model providers. If you’re looking for Search connectors that synchronize third-party data into {{es}}, refer to [Connector clients](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md). +This page is about {{kib}} connectors that integrate with services like generative AI model providers. If you’re looking for Search connectors that synchronize third-party data into {{es}}, refer to [Connector clients](elasticsearch://reference/ingestion-tools/search-connectors/index.md). :::: @@ -66,7 +66,7 @@ Some connector types are paid commercial features, while others are free. For a After you create a connector, it is available for use any time you set up an action in the current space. ::::{tip} -For out-of-the-box and standardized connectors, refer to [preconfigured connectors](asciidocalypse://docs/kibana/docs/reference/connectors-kibana/pre-configured-connectors.md). 
+For out-of-the-box and standardized connectors, refer to [preconfigured connectors](asciidocalypse://docs/kibana/docs/reference/connectors-kibana/pre-configured-connectors.md). You can also manage connectors as resources with the [Elasticstack provider](https://registry.terraform.io/providers/elastic/elasticstack/latest) for Terraform. For more details, refer to the [elasticstack_kibana_action_connector](https://registry.terraform.io/providers/elastic/elasticstack/latest/docs/resources/kibana_action_connector) resource. diff --git a/deploy-manage/monitor/logging-configuration/auditing-search-queries.md b/deploy-manage/monitor/logging-configuration/auditing-search-queries.md index 6afdf75fc..6df7c1471 100644 --- a/deploy-manage/monitor/logging-configuration/auditing-search-queries.md +++ b/deploy-manage/monitor/logging-configuration/auditing-search-queries.md @@ -12,7 +12,7 @@ applies_to: # Audit Elasticsearch search queries [auditing-search-queries] -There is no [audit event type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/elasticsearch-audit-events.md) specifically dedicated to search queries. Search queries are analyzed and then processed; the processing triggers authorization actions that are audited. However, the original raw query, as submitted by the client, is not accessible downstream when authorization auditing occurs. +There is no [audit event type](elasticsearch://reference/elasticsearch/elasticsearch-audit-events.md) specifically dedicated to search queries. Search queries are analyzed and then processed; the processing triggers authorization actions that are audited. However, the original raw query, as submitted by the client, is not accessible downstream when authorization auditing occurs. Search queries are contained inside HTTP request bodies, however, and some audit events that are generated by the REST layer, on the coordinating node, can be toggled to output the request body to the audit log. Therefore, one must audit request bodies in order to audit search queries. @@ -24,26 +24,26 @@ xpack.security.audit.logfile.events.emit_request_body: true You can apply this setting through [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), as described in [](./configuring-audit-logs.md). Alternatively, you can modify `elasticsearch.yml` in all nodes and restart for the changes to take effect. -::::{important} +::::{important} No filtering is performed when auditing, so sensitive data might be audited in plain text when audit events include the request body. Also, the request body can contain malicious content that can break a parser consuming the audit logs. :::: The request body is printed as an escaped JSON string value (RFC 4627) to the `request.body` event attribute. Not all events contain the `request.body` attribute, even when the above setting is toggled. The ones that do are: - + * `authentication_success` * `authentication_failed` * `realm_authentication_failed` * `tampered_request` -* `run_as_denied` +* `run_as_denied` * `anonymous_access_denied` -The `request.body` attribute is printed on the coordinating node only (the node that handles the REST request). Most of these event types are [not included by default](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md#xpack-sa-lf-events-include). +The `request.body` attribute is printed on the coordinating node only (the node that handles the REST request). 
Most of these event types are [not included by default](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md#xpack-sa-lf-events-include). As a practical recommendation, add `authentication_success` to the audited event types (include it in the `xpack.security.audit.logfile.events.include` list), because this event type is not audited by default. -::::{note} +::::{note} Typically, the include list contains other event types as well, such as `access_granted` or `access_denied`. :::: diff --git a/deploy-manage/monitor/logging-configuration/configuring-audit-logs.md b/deploy-manage/monitor/logging-configuration/configuring-audit-logs.md index 4d6a21874..f4ae3fb93 100644 --- a/deploy-manage/monitor/logging-configuration/configuring-audit-logs.md +++ b/deploy-manage/monitor/logging-configuration/configuring-audit-logs.md @@ -16,16 +16,16 @@ When auditing security events, a single client request might generate multiple a {{es}} configuration options include: - * [{{es}} audited events settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md#event-audit-settings): Use include and exclude filters to control the types of events that get logged. - * [{{es}} node information settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md#node-audit-settings): Control whether to add or hide node information such as hostname or IP address in the audited events. - * [{{es}} ignore policies settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md#audit-event-ignore-policies): Use ignore policies for fine-grained control over which audit events are printed to the log file. + * [{{es}} audited events settings](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md#event-audit-settings): Use include and exclude filters to control the types of events that get logged. + * [{{es}} node information settings](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md#node-audit-settings): Control whether to add or hide node information such as hostname or IP address in the audited events. + * [{{es}} ignore policies settings](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md#audit-event-ignore-policies): Use ignore policies for fine-grained control over which audit events are printed to the log file. ::::{tip} - In {{es}}, all auditing settings except `xpack.security.audit.enabled` are dynamic. This means you can configure them using the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), allowing changes to take effect immediately without requiring a restart. This approach is faster and more convenient than modifying `elasticsearch.yml`. + In {{es}}, all auditing settings except `xpack.security.audit.enabled` are dynamic. This means you can configure them using the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), allowing changes to take effect immediately without requiring a restart. This approach is faster and more convenient than modifying `elasticsearch.yml`.
:::: For a complete description of event details and format, refer to the following resources: - * [{{es}} audit events details and schema](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/elasticsearch-audit-events.md) + * [{{es}} audit events details and schema](elasticsearch://reference/elasticsearch/elasticsearch-audit-events.md) * [{{es}} log entry output format](/deploy-manage/monitor/logging-configuration/logfile-audit-output.md#audit-log-entry-format) ### Kibana auditing configuration @@ -43,7 +43,7 @@ For a complete description of auditing event details, such as `category`, `type` ### General recommendations -* Consider starting with {{es}} [`xpack.security.audit.logfile.events.include`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md#xpack-sa-lf-events-include) and [{{kib}} ignore filters](asciidocalypse://docs/kibana/docs/reference/configuration-reference/security-settings.md#audit-logging-ignore-filters) settings to specify the type of events you want to include or exclude in the auditing output. +* Consider starting with {{es}} [`xpack.security.audit.logfile.events.include`](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md#xpack-sa-lf-events-include) and [{{kib}} ignore filters](asciidocalypse://docs/kibana/docs/reference/configuration-reference/security-settings.md#audit-logging-ignore-filters) settings to specify the type of events you want to include or exclude in the auditing output. * If you need more granular control, refer to [{{es}} audit events ignore policies](./logfile-audit-events-ignore-policies.md) for a better understanding of how ignore policies work and when they are beneficial. diff --git a/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md b/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md index 50777cef3..c140495cb 100644 --- a/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md +++ b/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md @@ -15,15 +15,15 @@ applies_to: The comprehensive audit trail is necessary to ensure accountability. It offers tremendous value during incident response and can even be required for demonstrating compliance. -The drawback of an audited system is represented by the inevitable performance penalty incurred. In all truth, the audit trail spends *I/O ops* that are not available anymore for the user’s queries. Sometimes the verbosity of the audit trail may become a problem that the event type restrictions, [defined by `include` and `exclude`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md#event-audit-settings), will not alleviate. +An audited system inevitably incurs a performance penalty: the audit trail consumes *I/O ops* that are no longer available for user queries. Sometimes the verbosity of the audit trail may become a problem that the event type restrictions, [defined by `include` and `exclude`](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md#event-audit-settings), will not alleviate. **Audit events ignore policies** are a finer way to tune the verbosity of the audit trail. These policies define rules that match audit events which will be *ignored* (read as: not printed).
Rules match on the values of attributes of audit events and complement the `include` or `exclude` method. Imagine the corpus of audit events and the policies chopping off unwanted events. With a sole exception, all audit events are subject to the ignore policies. The exception is events of type `security_config_change`, which cannot be filtered out, unless excluded altogether. -::::{important} +::::{important} When utilizing audit events ignore policies, you are acknowledging potential accountability gaps that could render illegitimate actions undetectable. Please take time to review these policies whenever your system architecture changes. :::: -A policy is a named set of filter rules. Each filter rule applies to a single event attribute, one of the `users`, `realms`, `actions`, `roles` or `indices` attributes. The filter rule defines a list of [Lucene regexp](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/regexp-syntax.md), **any** of which has to match the value of the audit event attribute for the rule to match. A policy matches an event if **all** the rules comprising it match the event. An audit event is ignored, therefore not printed, if it matches **any** policy. All other non-matching events are printed as usual. +A policy is a named set of filter rules. Each filter rule applies to a single event attribute, one of the `users`, `realms`, `actions`, `roles` or `indices` attributes. The filter rule defines a list of [Lucene regexp](elasticsearch://reference/query-languages/regexp-syntax.md), **any** of which has to match the value of the audit event attribute for the rule to match. A policy matches an event if **all** the rules comprising it match the event. An audit event is ignored, therefore not printed, if it matches **any** policy. All other non-matching events are printed as usual. All policies are defined under the `xpack.security.audit.logfile.events.ignore_filters` settings namespace. For example, the following policy named *example1* matches events from the *kibana_system* or *admin_user* principals that operate over indices of the wildcard form *app-logs**: diff --git a/deploy-manage/monitor/logging-configuration/logfile-audit-output.md b/deploy-manage/monitor/logging-configuration/logfile-audit-output.md index 718242d0a..87e56c79b 100644 --- a/deploy-manage/monitor/logging-configuration/logfile-audit-output.md +++ b/deploy-manage/monitor/logging-configuration/logfile-audit-output.md @@ -19,7 +19,7 @@ In self-managed clusters, you can configure how the `logfile` is written in the Orchestrated deployments (ECH, ECE, and ECK) do not support changes in `log4j2.properties` files of the {{es}} instances. -::::{note} +::::{note} If you overwrite the `log4j2.properties` and do not specify appenders for any of the audit trails, audit events are forwarded to the root appender, which by default points to the `elasticsearch.log` file. :::: @@ -33,4 +33,4 @@ There are however a few attributes that are exceptions to the above format. The When the `request.body` attribute is present (see [Auditing search queries](auditing-search-queries.md)), it contains the full HTTP request body as a string value, escaped as per the JSON RFC 4627. -Refer to [audit event types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/elasticsearch-audit-events.md) for a complete list of fields, as well as examples, for each entry type.
+Refer to [audit event types](elasticsearch://reference/elasticsearch/elasticsearch-audit-events.md) for a complete list of fields, as well as examples, for each entry type. diff --git a/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md b/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md index 4c5e7c60a..7c190a582 100644 --- a/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md +++ b/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md @@ -25,7 +25,7 @@ In this section, you'll learn how to: * [](./configuring-audit-logs.md): Filter and control what security events get logged in the audit log output. -* [Audit {{es}} search queries](./auditing-search-queries.md): Audit and log search request bodies. +* [Audit {{es}} search queries](./auditing-search-queries.md): Audit and log search request bodies. * [Correlate audit events](./correlating-kibana-elasticsearch-audit-logs.md): Explore audit logs and understand how events from the same request are correlated. @@ -33,5 +33,5 @@ By following these guidelines, you can effectively audit system activity, enhanc For a complete description of audit event details and format, refer to: -* [Elasticsearch audit events](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/elasticsearch-audit-events.md) +* [Elasticsearch audit events](elasticsearch://reference/elasticsearch/elasticsearch-audit-events.md) * [Kibana audit events](asciidocalypse://docs/kibana/docs/reference/kibana-audit-events.md) diff --git a/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md b/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md index 41fd8ae84..5cd900b4f 100644 --- a/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md +++ b/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md @@ -23,7 +23,7 @@ The percentage number used in the JVM memory pressure indicator is actually the When the JVM memory pressure reaches 75%, the indicator turns red. At this level, garbage collection becomes more frequent as the memory usage increases, potentially impacting the performance of your cluster. As long as the cluster performance suits your needs, JVM memory pressure above 75% is not a problem in itself, but there is not much spare memory capacity. Review the [common causes of high JVM memory usage](#ec-memory-pressure-causes) to determine your best course of action. -When the JVM memory pressure indicator rises above 95%, {{es}}'s [real memory circuit breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker) triggers to prevent your instance from running out of memory. This situation can reduce the stability of your cluster and the integrity of your data. Unless you expect the load to drop soon, we recommend that you resize to a larger cluster before you reach this level of memory pressure. Even if you’re planning to optimize your memory usage, it is best to resize the cluster first. Resizing the cluster to increase capacity can give you more time to apply other changes, and also provides the cluster with more resource for when those changes are applied. +When the JVM memory pressure indicator rises above 95%, {{es}}'s [real memory circuit breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker) triggers to prevent your instance from running out of memory. 
This situation can reduce the stability of your cluster and the integrity of your data. Unless you expect the load to drop soon, we recommend that you resize to a larger cluster before you reach this level of memory pressure. Even if you’re planning to optimize your memory usage, it is best to resize the cluster first. Resizing the cluster to increase capacity can give you more time to apply other changes, and also provides the cluster with more resource for when those changes are applied. ## Common causes of high JVM memory usage [ec-memory-pressure-causes] diff --git a/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md b/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md index b51415693..7b1e19549 100644 --- a/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md +++ b/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md @@ -14,22 +14,22 @@ applies_to: You can use {{filebeat}} to monitor the {{es}} log files, collect log events, and ship them to the monitoring cluster. Your recent logs are visible on the **Monitoring** page in {{kib}}. -::::{important} +::::{important} If you’re using {{agent}}, do not deploy {{filebeat}} for log collection. Instead, configure the {{es}} integration to collect logs. :::: 1. Verify that {{es}} is running and that the monitoring cluster is ready to receive data from {{filebeat}}. - ::::{tip} + ::::{tip} In production environments, we strongly recommend using a separate cluster (referred to as the *monitoring cluster*) to store the data. Using a separate monitoring cluster prevents production cluster outages from impacting your ability to access your monitoring data. It also prevents monitoring activities from impacting the performance of your production cluster. See [*Monitoring in a production environment*](elasticsearch-monitoring-self-managed.md). :::: 2. Identify which logs you want to monitor. - The {{filebeat}} {{es}} module can handle [audit logs](../logging-configuration/logfile-audit-output.md), [deprecation logs](../logging-configuration/elasticsearch-log4j-configuration-self-managed.md#deprecation-logging), [gc logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging), [server logs](../logging-configuration/elasticsearch-log4j-configuration-self-managed.md), and [slow logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md). For more information about the location of your {{es}} logs, see the [path.logs](../../deploy/self-managed/important-settings-configuration.md#path-settings) setting. + The {{filebeat}} {{es}} module can handle [audit logs](../logging-configuration/logfile-audit-output.md), [deprecation logs](../logging-configuration/elasticsearch-log4j-configuration-self-managed.md#deprecation-logging), [gc logs](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging), [server logs](../logging-configuration/elasticsearch-log4j-configuration-self-managed.md), and [slow logs](elasticsearch://reference/elasticsearch/index-settings/slow-log.md). For more information about the location of your {{es}} logs, see the [path.logs](../../deploy/self-managed/important-settings-configuration.md#path-settings) setting. - ::::{important} + ::::{important} If there are both structured (`*.json`) and unstructured (plain text) versions of the logs, you must use the structured logs. Otherwise, they might not appear in the appropriate context in {{kib}}. 
:::: @@ -54,7 +54,7 @@ If you’re using {{agent}}, do not deploy {{filebeat}} for log collection. Inst If you configured the monitoring cluster to use encrypted communications, you must access it via HTTPS. For example, use a `hosts` setting like `https://es-mon-1:9200`. - ::::{important} + ::::{important} The {{es}} {{monitor-features}} use ingest pipelines, therefore the cluster that stores the monitoring data must have at least one [ingest node](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). :::: @@ -74,7 +74,7 @@ If you’re using {{agent}}, do not deploy {{filebeat}} for log collection. Inst #password: "YOUR_PASSWORD" ``` - ::::{tip} + ::::{tip} In production environments, we strongly recommend using a dedicated {{kib}} instance for your monitoring cluster. :::: @@ -101,13 +101,13 @@ If you’re using {{agent}}, do not deploy {{filebeat}} for log collection. Inst If the logs that you want to monitor aren’t in the default location, set the appropriate path variables in the `modules.d/elasticsearch.yml` file. See [Configure the {{es}} module](asciidocalypse://docs/beats/docs/reference/filebeat/filebeat-module-elasticsearch.md#configuring-elasticsearch-module). - ::::{important} + ::::{important} If there are JSON logs, configure the `var.paths` settings to point to them instead of the plain text logs. :::: 8. [Start {{filebeat}}](asciidocalypse://docs/beats/docs/reference/filebeat/filebeat-starting.md) on each node. - ::::{note} + ::::{note} Depending on how you’ve installed {{filebeat}}, you might see errors related to file ownership or permissions when you try to run {{filebeat}} modules. See [Config file ownership and permissions](asciidocalypse://docs/beats/docs/reference/libbeat/config-file-permissions.md). :::: @@ -115,7 +115,7 @@ If you’re using {{agent}}, do not deploy {{filebeat}} for log collection. Inst For example, use the [cat indices](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-indices) command to verify that there are new `filebeat-*` indices. - ::::{tip} + ::::{tip} If you want to use the **Monitoring** UI in {{kib}}, there must also be `.monitoring-*` indices. Those indices are generated when you collect metrics about {{stack}} products. For example, see [Collecting monitoring data with {{metricbeat}}](collecting-monitoring-data-with-metricbeat.md). :::: diff --git a/deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md b/deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md index 964a0bf76..247ce65af 100644 --- a/deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md +++ b/deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md @@ -22,12 +22,12 @@ Monitoring consists of two components: The steps in this section cover only the enablement of the monitoring and logging features in Elastic Cloud Enterprise. For more information on how to use the monitoring features, refer to [Monitor a cluster](../../monitor.md). -### Before you begin [ece-logging-and-monitoring-limitations] +### Before you begin [ece-logging-and-monitoring-limitations] Some limitations apply when you use monitoring on Elastic Cloud Enterprise. To learn more, check the monitoring [restrictions and limitations](ece-restrictions-monitoring.md). -### Monitoring for production use [ece-logging-and-monitoring-production] +### Monitoring for production use [ece-logging-and-monitoring-production] For production use, you should send your deployment logs and metrics to a dedicated monitoring deployment. 
Monitoring indexes logs and metrics into {{es}} and these indexes consume storage, memory, and CPU cycles like any other index. By using a separate monitoring deployment, you avoid affecting your other production deployments and can view the logs and metrics even when a production deployment is unavailable. @@ -44,15 +44,15 @@ How many monitoring deployments you use depends on your requirements: Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](#ece-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. -### Retention of monitoring daily indices [ece-logging-and-monitoring-retention] +### Retention of monitoring daily indices [ece-logging-and-monitoring-retention] -#### Stack versions 8.0 and above [ece-logging-and-monitoring-retention-8] +#### Stack versions 8.0 and above [ece-logging-and-monitoring-retention-8] When you enable monitoring in Elastic Cloud Enterprise, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. The retention period is configured in the `.monitoring-8-ilm-policy` index lifecycle policy. To view or edit the policy open {{kib}} **Stack management > Data > Index Lifecycle Policies**. -### Sending monitoring data to itself (self monitoring) [ece-logging-and-monitoring-retention-self-monitoring] +### Sending monitoring data to itself (self monitoring) [ece-logging-and-monitoring-retention-self-monitoring] $$$ece-logging-and-monitoring-retention-7$$$ When you enable self-monitoring in Elastic Cloud Enterprise, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. Monitoring data is retained for three days by default or as specified by the [`xpack.monitoring.history.duration` user setting](https://www.elastic.co/guide/en/cloud-enterprise/current/ece-change-user-settings-examples.html#xpack-monitoring-history-duration). @@ -74,7 +74,7 @@ PUT /_cluster/settings ``` -### Sending monitoring data to a dedicated monitoring deployment [ece-logging-and-monitoring-retention-dedicated-monitoring] +### Sending monitoring data to a dedicated monitoring deployment [ece-logging-and-monitoring-retention-dedicated-monitoring] When [monitoring for production use](#ece-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. There are three options open to you: @@ -107,17 +107,17 @@ When [monitoring for production use](#ece-logging-and-monitoring-production), wh * To retain monitoring indices on a dedicated monitoring deployment as is without deleting them automatically, no additional steps are required other than making sure that you do not enable the monitoring deployment to send monitoring data to itself. You should also monitor the deployment for disk space usage and upgrade your deployment periodically, if necessary. -### Retention of logging indices [ece-logging-and-monitoring-log-retention] +### Retention of logging indices [ece-logging-and-monitoring-log-retention] An ILM policy is pre-configured to manage log retention. The policy can be adjusted according to your requirements. 
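For illustration, adjusting retention comes down to editing the policy's `delete` phase. The following sketch uses the [ILM API](/manage-data/lifecycle/index-lifecycle-management.md) with a hypothetical policy name and a hypothetical 30-day retention; substitute the name of the pre-configured policy and the retention period your requirements call for:

```console
PUT _ilm/policy/logs-retention-example
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```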
-### Index management [ece-logging-and-monitoring-index-management-ilm] +### Index management [ece-logging-and-monitoring-index-management-ilm] When sending monitoring data to a deployment, you can configure [Index Lifecycle Management (ILM)](/manage-data/lifecycle/index-lifecycle-management.md) to manage retention of your monitoring and logging indices. When sending logs to a deployment, an ILM policy is pre-configured to manage log retention and the policy can be customized to your needs. -### Enable logging and monitoring [ece-enable-logging-and-monitoring-steps] +### Enable logging and monitoring [ece-enable-logging-and-monitoring-steps] Elastic Cloud Enterprise manages the installation and configuration of the monitoring agent for you. When you enable monitoring on a deployment, you are configuring where the monitoring agent for your current deployment should send its logs and metrics. @@ -134,23 +134,23 @@ To enable monitoring on your deployment: If a deployment is not listed, make sure that it is running a compatible version. The monitoring deployment and production deployment must be on the same major version, cloud provider, and region. - ::::{tip} + ::::{tip} Remember to send logs and metrics for production deployments to a dedicated monitoring deployment, so that your production deployments are not impacted by the overhead of indexing and storing monitoring data. A dedicated monitoring deployment also gives you more control over the retention period for monitoring data. :::: -::::{note} +::::{note} Enabling logs and monitoring may trigger a plan change on your deployment. You can monitor the plan change progress from the deployment’s **Activity** page. :::: -::::{note} +::::{note} Enabling logs and monitoring requires some extra resource on a deployment. For production systems, we recommend sizing deployments with logs and monitoring enabled to at least 4 GB of RAM. :::: -### Access the monitoring application in Kibana [ece-access-kibana-monitoring] +### Access the monitoring application in Kibana [ece-access-kibana-monitoring] With monitoring enabled for your deployment, you can access the [logs](https://www.elastic.co/guide/en/kibana/current/observability.html) and [stack monitoring](../monitoring-data/visualizing-monitoring-data.md) through Kibana. @@ -174,28 +174,28 @@ Alternatively, you can access logs and metrics directly on the Kibana **Logs** a | `service.version` | The version of the stack resource that generated the log | `8.13.1` | -### Logging features [ece-extra-logging-features] +### Logging features [ece-extra-logging-features] When shipping logs to a monitoring deployment there are more logging features available to you. These features include: -#### For {{es}}: [ece-extra-logging-features-elasticsearch] +#### For {{es}}: [ece-extra-logging-features-elasticsearch] * [Audit logging](../logging-configuration/enabling-audit-logs.md) - logs security-related events on your deployment -* [Slow query and index logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md) - helps find and debug slow queries and indexing +* [Slow query and index logging](elasticsearch://reference/elasticsearch/index-settings/slow-log.md) - helps find and debug slow queries and indexing * Verbose logging - helps debug stack issues by increasing component logs After you’ve enabled log delivery on your deployment, you can [add the Elasticsearch user settings](../../deploy/cloud-enterprise/edit-stack-settings.md) to enable these features. 
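As a sketch of what those user settings might look like, the following snippet enables {{es}} audit logging; the event types in the include list are illustrative examples, not a recommended baseline:

```yaml
# Illustrative Elasticsearch user settings; adjust the event types to your needs
xpack.security.audit.enabled: true
xpack.security.audit.logfile.events.include:
  - authentication_failed
  - access_denied
  - tampered_request
```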
-#### For Kibana: [ece-extra-logging-features-kibana] +#### For Kibana: [ece-extra-logging-features-kibana] * [Audit logging](../logging-configuration/enabling-audit-logs.md) - logs security-related events on your deployment After you’ve enabled log delivery on your deployment, you can [add the Kibana user settings](../../deploy/cloud-enterprise/edit-stack-settings.md) to enable this feature. -### Other components [ece-extra-logging-features-enterprise-search] +### Other components [ece-extra-logging-features-enterprise-search] Enabling log collection also supports collecting and indexing the following types of logs from other components in your deployments: @@ -213,12 +213,12 @@ The ˆ*ˆ indicates that we also index the archived files of each type of log. Check the respective product documentation for more information about the logging capabilities of each product. -## Metrics features [ece-extra-metrics-features] +## Metrics features [ece-extra-metrics-features] With logging and monitoring enabled for a deployment, metrics are collected for Elasticsearch, Kibana, and APM with Fleet Server. -#### Enabling Elasticsearch/Kibana audit logs on your deployment [ece-enable-audit-logs] +#### Enabling Elasticsearch/Kibana audit logs on your deployment [ece-enable-audit-logs] Audit logs are useful for tracking security events on your {{es}} and/or {{kib}} clusters. To enable {{es}} audit logs on your deployment: diff --git a/deploy-manage/monitor/stack-monitoring/es-http-exporter.md b/deploy-manage/monitor/stack-monitoring/es-http-exporter.md index 4d3b617e8..69a036a9e 100644 --- a/deploy-manage/monitor/stack-monitoring/es-http-exporter.md +++ b/deploy-manage/monitor/stack-monitoring/es-http-exporter.md @@ -9,7 +9,7 @@ applies_to: # HTTP exporters [http-exporter] -::::{important} +::::{important} {{agent}} and {{metricbeat}} are the recommended methods for collecting and shipping monitoring data to a monitoring cluster. If you have previously configured legacy collection methods, you should migrate to using [{{agent}}](collecting-monitoring-data-with-elastic-agent.md) or [{{metricbeat}}](collecting-monitoring-data-with-metricbeat.md) collection. Do not use legacy collection alongside other collection methods. @@ -19,9 +19,9 @@ If you have previously configured legacy collection methods, you should migrate The `http` exporter is the preferred exporter in the {{es}} {{monitor-features}} because it enables the use of a separate monitoring cluster. As a secondary benefit, it avoids using a production cluster node as a coordinating node for indexing monitoring data because all requests are HTTP requests to the monitoring cluster. -The `http` exporter uses the low-level {{es}} REST Client, which enables it to send its data to any {{es}} cluster it can access through the network. Its requests make use of the [`filter_path`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#common-options-response-filtering) parameter to reduce bandwidth whenever possible, which helps to ensure that communications between the production and monitoring clusters are as lightweight as possible. +The `http` exporter uses the low-level {{es}} REST Client, which enables it to send its data to any {{es}} cluster it can access through the network. 
Its requests make use of the [`filter_path`](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#common-options-response-filtering) parameter to reduce bandwidth whenever possible, which helps to ensure that communications between the production and monitoring clusters are as lightweight as possible. -The `http` exporter supports a number of settings that control how it communicates over HTTP to remote clusters. In most cases, it is not necessary to explicitly configure these settings. For detailed descriptions, see [Monitoring settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md). +The `http` exporter supports a number of settings that control how it communicates over HTTP to remote clusters. In most cases, it is not necessary to explicitly configure these settings. For detailed descriptions, see [Monitoring settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md). ```yaml xpack.monitoring.exporters: @@ -49,13 +49,13 @@ xpack.monitoring.exporters: 2. An `http` exporter defined whose arbitrary name is `my_remote`. This name uniquely defines the exporter but is otherwise unused. 3. `host` is a required setting for `http` exporters. It must specify the HTTP port rather than the transport port. The default port value is `9200`. 4. User authentication for those using {{stack}} {{security-features}} or some other form of user authentication protecting the cluster. -5. See [HTTP exporter settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#http-exporter-settings) for all TLS/SSL settings. If not supplied, the default node-level TLS/SSL settings are used. +5. See [HTTP exporter settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#http-exporter-settings) for all TLS/SSL settings. If not supplied, the default node-level TLS/SSL settings are used. 6. Optional base path to prefix any outgoing request with in order to work with proxies. 7. Arbitrary key/value pairs to define as headers to send with every request. The array-based key/value format sends one header per value. 8. A mechanism for changing the date suffix used by default. -::::{note} +::::{note} The `http` exporter accepts an array of `hosts` and it will round robin through the list. It is a good idea to take advantage of that feature when the monitoring cluster contains more than one node. :::: @@ -69,17 +69,17 @@ Unlike the `local` exporter, *every* node that uses the `http` exporter attempts The easiest way to trigger a check is to disable, then re-enable the exporter. -::::{warning} +::::{warning} This resource management behavior can create a hole for users that delete monitoring resources. Since the `http` exporter does not re-check its resources unless one of the triggers occurs, this can result in malformed index mappings. :::: Unlike the `local` exporter, the `http` exporter is inherently routing requests outside of the cluster. This situation means that the exporter must provide a username and password when the monitoring cluster requires one (or other appropriate security configurations, such as TLS/SSL settings). -::::{important} +::::{important} When discussing security relative to the `http` exporter, it is critical to remember that all users are managed on the monitoring cluster. 
This is particularly important to remember when you move from development environments to production environments, where you often have dedicated monitoring clusters. :::: -For more information about the configuration options for the `http` exporter, see [HTTP exporter settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#http-exporter-settings). +For more information about the configuration options for the `http` exporter, see [HTTP exporter settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#http-exporter-settings). diff --git a/deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md b/deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md index 1d8afb703..546caa3e6 100644 --- a/deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md +++ b/deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md @@ -30,16 +30,16 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). 1. Verify that the `xpack.monitoring.elasticsearch.collection.enabled` setting is `true`, which is its default value, on each node in the cluster. - ::::{note} + ::::{note} You can specify this setting in either the `elasticsearch.yml` on each node or across the cluster as a dynamic cluster setting. If {{es}} {{security-features}} are enabled, you must have `monitor` cluster privileges to view the cluster settings and `manage` cluster privileges to change them. :::: - For more information, see [Monitoring settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). + For more information, see [Monitoring settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). 2. Set the `xpack.monitoring.collection.enabled` setting to `true` on each node in the cluster. By default, it is disabled (`false`). - ::::{note} + ::::{note} You can specify this setting in either the `elasticsearch.yml` on each node or across the cluster as a dynamic cluster setting. If {{es}} {{security-features}} are enabled, you must have `monitor` cluster privileges to view the cluster settings and `manage` cluster privileges to change them. :::: @@ -61,7 +61,7 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). Alternatively, you can enable this setting in {{kib}}. In the side navigation, click **Monitoring**. If data collection is disabled, you are prompted to turn it on. - For more information, see [Monitoring settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). + For more information, see [Monitoring settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). 3. Optional: Specify which indices you want to monitor. @@ -73,13 +73,13 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). 
You can prepend `-` to explicitly exclude index names or patterns. For example, to include all indices that start with `test` except `test3`, you could specify `test*,-test3`. To include system indices such as `.security` and `.kibana`, add `.*` to the list of included names. For example, `.*,test*,-test3`. - 4. Optional: Specify how often to collect monitoring data. The default value for the `xpack.monitoring.collection.interval` setting 10 seconds. See [Monitoring settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md). + 4. Optional: Specify how often to collect monitoring data. The default value for the `xpack.monitoring.collection.interval` setting is 10 seconds. See [Monitoring settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md). 2. Identify where to store monitoring data. By default, the data is stored on the same cluster by using a [`local` exporter](es-local-exporter.md). Alternatively, you can use an [`http` exporter](es-http-exporter.md) to send data to a separate *monitoring cluster*. - ::::{important} + ::::{important} The {{es}} {{monitor-features}} use ingest pipelines, therefore the cluster that stores the monitoring data must have at least one [ingest node](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). :::: @@ -147,8 +147,8 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). 4. Configure your cluster to route monitoring data from sources such as {{kib}}, Beats, and {{ls}} to the monitoring cluster. For information about configuring each product to collect and send monitoring data, see [Monitor a cluster](../../monitor.md). 5. If you updated settings in the `elasticsearch.yml` files on your production cluster, restart {{es}}. See [*Stopping Elasticsearch*](../../maintenance/start-stop-services/start-stop-elasticsearch.md) and [*Starting Elasticsearch*](../../maintenance/start-stop-services/start-stop-elasticsearch.md). - ::::{tip} - You may want to temporarily [disable shard allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) before you restart your nodes to avoid unnecessary shard reallocation during the install process. + ::::{tip} + You may want to temporarily [disable shard allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) before you restart your nodes to avoid unnecessary shard reallocation during the install process. :::: 6. Optional: [Configure the indices that store the monitoring data](../monitoring-data/configuring-data-streamsindices-for-monitoring.md). diff --git a/deploy-manage/monitor/stack-monitoring/es-local-exporter.md b/deploy-manage/monitor/stack-monitoring/es-local-exporter.md index a4e40e0cb..a392e3fe3 100644 --- a/deploy-manage/monitor/stack-monitoring/es-local-exporter.md +++ b/deploy-manage/monitor/stack-monitoring/es-local-exporter.md @@ -8,7 +8,7 @@ applies_to: # Local exporters [local-exporter] -::::{important} +::::{important} {{agent}} and {{metricbeat}} are the recommended methods for collecting and shipping monitoring data to a monitoring cluster. If you have previously configured legacy collection methods, you should migrate to using [{{agent}}](collecting-monitoring-data-with-elastic-agent.md) or [{{metricbeat}}](collecting-monitoring-data-with-metricbeat.md) collection.
Do not use legacy collection alongside other collection methods. @@ -39,7 +39,7 @@ The elected master node is the only node to set up resources for the `local` exp One benefit of the `local` exporter is that it lives within the cluster and therefore no extra configuration is required when the cluster is secured with {{stack}} {{security-features}}. All operations, including indexing operations, that occur from a `local` exporter make use of the internal transport mechanisms within {{es}}. This behavior enables the exporter to be used without providing any user credentials when {{security-features}} are enabled. -For more information about the configuration options for the `local` exporter, see [Local exporter settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#local-exporter-settings). +For more information about the configuration options for the `local` exporter, see [Local exporter settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#local-exporter-settings). ## Cleaner service [local-exporter-cleaner] diff --git a/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md b/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md index fbc39ac14..9bfe49790 100644 --- a/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md +++ b/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md @@ -9,7 +9,7 @@ applies_to: # Collectors [es-monitoring-collectors] -::::{important} +::::{important} {{agent}} and {{metricbeat}} are the recommended methods for collecting and shipping monitoring data to a monitoring cluster. If you have previously configured legacy collection methods, you should migrate to using [{{agent}}](collecting-monitoring-data-with-elastic-agent.md) or [{{metricbeat}}](collecting-monitoring-data-with-metricbeat.md) collection. Do not use legacy collection alongside other collection methods. @@ -40,23 +40,23 @@ Once collection has completed, all of the monitoring data is passed to the expor If gaps exist in the monitoring charts in {{kib}}, it is typically because either a collector failed or the monitoring cluster did not receive the data (for example, it was being restarted). In the event that a collector fails, a logged error should exist on the node that attempted to perform the collection. -::::{note} +::::{note} Collection is currently done serially, rather than in parallel, to avoid extra overhead on the elected master node. The downside to this approach is that collectors might observe a different version of the cluster state within the same collection period. In practice, this does not make a significant difference and running the collectors in parallel would not prevent such a possibility. :::: -For more information about the configuration options for the collectors, see [Monitoring collection settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). +For more information about the configuration options for the collectors, see [Monitoring collection settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). -## Collecting data from across the Elastic Stack [es-monitoring-stack] +## Collecting data from across the Elastic Stack [es-monitoring-stack] {{es}} {{monitor-features}} also receive monitoring data from other parts of the Elastic Stack. 
In this way, it serves as an unscheduled monitoring data collector for the stack. -By default, data collection is disabled. {{es}} monitoring data is not collected and all monitoring data from other sources such as {{kib}}, Beats, and Logstash is ignored. You must set `xpack.monitoring.collection.enabled` to `true` to enable the collection of monitoring data. See [Monitoring settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md). +By default, data collection is disabled. {{es}} monitoring data is not collected and all monitoring data from other sources such as {{kib}}, Beats, and Logstash is ignored. You must set `xpack.monitoring.collection.enabled` to `true` to enable the collection of monitoring data. See [Monitoring settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md). Once data is received, it is forwarded to the exporters to be routed to the monitoring cluster like all monitoring data. -::::{warning} +::::{warning} Because this stack-level "collector" lives outside of the collection interval of {{es}} {{monitor-features}}, it is not impacted by the `xpack.monitoring.collection.interval` setting. Therefore, data is passed to the exporters whenever it is received. This behavior can result in indices for {{kib}}, Logstash, or Beats being created somewhat unexpectedly. :::: diff --git a/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md b/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md index b2d667edc..d15f010a9 100644 --- a/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md +++ b/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md @@ -45,7 +45,7 @@ You are strongly recommended to manage the curation of indices and particularly :::: -There is also a disk watermark (known as the flood stage watermark), which protects clusters from running out of disk space. When this feature is triggered, it makes all indices (including monitoring indices) read-only until the issue is fixed and a user manually makes the index writeable again. While an active monitoring index is read-only, it will naturally fail to write (index) new data and will continuously log errors that indicate the write failure. For more information, see [Disk-based shard allocation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). +There is also a disk watermark (known as the flood stage watermark), which protects clusters from running out of disk space. When this feature is triggered, it makes all indices (including monitoring indices) read-only until the issue is fixed and a user manually makes the index writeable again. While an active monitoring index is read-only, it will naturally fail to write (index) new data and will continuously log errors that indicate the write failure. For more information, see [Disk-based shard allocation settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). ## Default exporters [es-monitoring-default-exporter] @@ -77,7 +77,7 @@ Before exporters can route monitoring data, they must set up certain {{es}} reso The templates are ordinary {{es}} templates that control the default settings and mappings for the monitoring indices. 
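As an illustrative check (assuming the legacy exporters described here are in use), you can list these templates with the legacy index template API:

```console
GET _template/.monitoring-*
```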
-By default, monitoring indices are created daily (for example, `.monitoring-es-6-2017.08.26`). You can change the default date suffix for monitoring indices with the `index.name.time_format` setting. You can use this setting to control how frequently monitoring indices are created by a specific `http` exporter. You cannot use this setting with `local` exporters. For more information, see [HTTP exporter settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#http-exporter-settings). +By default, monitoring indices are created daily (for example, `.monitoring-es-6-2017.08.26`). You can change the default date suffix for monitoring indices with the `index.name.time_format` setting. You can use this setting to control how frequently monitoring indices are created by a specific `http` exporter. You cannot use this setting with `local` exporters. For more information, see [HTTP exporter settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#http-exporter-settings). ::::{warning} Some users create their own templates that match *all* index patterns, which therefore impact the monitoring indices that get created. It is critical that you do not disable `_source` storage for the monitoring indices. If you do, {{kib}} {{monitor-features}} do not work and you cannot visualize monitoring data for your cluster. diff --git a/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md index d54ffdfb2..47a04e47e 100644 --- a/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md @@ -61,7 +61,7 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). } ``` - For more information, see [Monitoring settings in {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). + For more information, see [Monitoring settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). 2. Verify that `monitoring.enabled` and `monitoring.kibana.collection.enabled` are set to `true` in the `kibana.yml` file. These are the default values. For more information, see [Monitoring settings in {{kib}}](asciidocalypse://docs/kibana/docs/reference/configuration-reference/monitoring-settings.md). 3. Identify where to send monitoring data. {{kib}} automatically sends metrics to the {{es}} cluster specified in the `elasticsearch.hosts` setting in the `kibana.yml` file. This property has a default value of `http://localhost:9200`.
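For reference, the defaults described in the steps above correspond to the following `kibana.yml` snippet; adjust `elasticsearch.hosts` for your environment:

```yaml
monitoring.enabled: true
monitoring.kibana.collection.enabled: true
elasticsearch.hosts: ["http://localhost:9200"]
```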
diff --git a/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md index 0cc7fef06..c83639823 100644 --- a/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md @@ -63,7 +63,7 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). } ``` - For more information, see [Monitoring settings in {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). + For more information, see [Monitoring settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). 4. [Install {{metricbeat}}](asciidocalypse://docs/beats/docs/reference/metricbeat/metricbeat-installation-configuration.md) on the same server as {{kib}}. 5. Enable the {{kib}} {{xpack}} module in {{metricbeat}}.
diff --git a/deploy-manage/production-guidance/availability-and-resilience.md b/deploy-manage/production-guidance/availability-and-resilience.md index 278b17e1d..af4cdbd9d 100644 --- a/deploy-manage/production-guidance/availability-and-resilience.md +++ b/deploy-manage/production-guidance/availability-and-resilience.md @@ -10,7 +10,7 @@ Distributed systems like {{es}} are designed to keep working even if some of the There is a limit to how small a resilient cluster can be. All {{es}} clusters require the following components to function: * One [elected master node](../distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md) -* At least one node for each [role](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md) +* At least one node for each [role](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md) * At least one copy of every [shard](../../deploy-manage/index.md) A resilient cluster requires redundancy for every required cluster component. This means a resilient cluster must have the following components: diff --git a/deploy-manage/production-guidance/availability-and-resilience/resilience-in-larger-clusters.md b/deploy-manage/production-guidance/availability-and-resilience/resilience-in-larger-clusters.md index 529d3f5dd..91ce725c6 100644 --- a/deploy-manage/production-guidance/availability-and-resilience/resilience-in-larger-clusters.md +++ b/deploy-manage/production-guidance/availability-and-resilience/resilience-in-larger-clusters.md @@ -9,7 +9,7 @@ It’s not unusual for nodes to share common infrastructure, such as network int {{es}} expects node-to-node connections to be reliable, have low latency, and have adequate bandwidth. Many {{es}} tasks require multiple round-trips between nodes. A slow or unreliable interconnect may have a significant effect on the performance and stability of your cluster. -For example, a few milliseconds of latency added to each round-trip can quickly accumulate into a noticeable performance penalty. An unreliable network may have frequent network partitions. {{es}} will automatically recover from a network partition as quickly as it can but your cluster may be partly unavailable during a partition and will need to spend time and resources to [resynchronize any missing data](../../distributed-architecture/shard-allocation-relocation-recovery.md#shard-recovery) and [rebalance](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#shards-rebalancing-settings) itself once the partition heals. Recovering from a failure may involve copying a large amount of data between nodes so the recovery time is often determined by the available bandwidth. +For example, a few milliseconds of latency added to each round-trip can quickly accumulate into a noticeable performance penalty. An unreliable network may have frequent network partitions. {{es}} will automatically recover from a network partition as quickly as it can but your cluster may be partly unavailable during a partition and will need to spend time and resources to [resynchronize any missing data](../../distributed-architecture/shard-allocation-relocation-recovery.md#shard-recovery) and [rebalance](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#shards-rebalancing-settings) itself once the partition heals. 
Recovering from a failure may involve copying a large amount of data between nodes so the recovery time is often determined by the available bandwidth. If you’ve divided your cluster into zones, the network connections within each zone are typically of higher quality than the connections between the zones. Ensure the network connections between zones are of sufficiently high quality. You will see the best results by locating all your zones within a single data center with each zone having its own independent power supply and other supporting infrastructure. You can also *stretch* your cluster across nearby data centers as long as the network interconnection between each pair of data centers is good enough. diff --git a/deploy-manage/production-guidance/availability-and-resilience/resilience-in-small-clusters.md b/deploy-manage/production-guidance/availability-and-resilience/resilience-in-small-clusters.md index 8cbb4ca47..d5ffea269 100644 --- a/deploy-manage/production-guidance/availability-and-resilience/resilience-in-small-clusters.md +++ b/deploy-manage/production-guidance/availability-and-resilience/resilience-in-small-clusters.md @@ -11,7 +11,7 @@ In smaller clusters, it is most important to be resilient to single-node failure If your cluster consists of one node, that single node must do everything. To accommodate this, {{es}} assigns nodes every role by default. -A single node cluster is not resilient. If the node fails, the cluster will stop working. Because there are no replicas in a one-node cluster, you cannot store your data redundantly. However, by default at least one replica is required for a [`green` cluster health status](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-health). To ensure your cluster can report a `green` status, override the default by setting [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md) to `0` on every index. +A single node cluster is not resilient. If the node fails, the cluster will stop working. Because there are no replicas in a one-node cluster, you cannot store your data redundantly. However, by default at least one replica is required for a [`green` cluster health status](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-health). To ensure your cluster can report a `green` status, override the default by setting [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md) to `0` on every index. If the node fails, you may need to restore an older copy of any lost indices from a [snapshot](../../tools/snapshot-and-restore.md). @@ -20,7 +20,7 @@ Because they are not resilient to any failures, we do not recommend using one-no ## Two-node clusters [high-availability-cluster-design-two-nodes] -If you have two nodes, we recommend they both be data nodes. You should also ensure every shard is stored redundantly on both nodes by setting [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md) to `1` on every index that is not a [searchable snapshot index](../../tools/snapshot-and-restore/searchable-snapshots.md). This is the default behaviour but may be overridden by an [index template](../../../manage-data/data-store/templates.md). 
[Auto-expand replicas](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md) can also achieve the same thing, but it’s not necessary to use this feature in such a small cluster. +If you have two nodes, we recommend they both be data nodes. You should also ensure every shard is stored redundantly on both nodes by setting [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md) to `1` on every index that is not a [searchable snapshot index](../../tools/snapshot-and-restore/searchable-snapshots.md). This is the default behaviour but may be overridden by an [index template](../../../manage-data/data-store/templates.md). [Auto-expand replicas](elasticsearch://reference/elasticsearch/index-settings/index-modules.md) can also achieve the same thing, but it’s not necessary to use this feature in such a small cluster. We recommend you set only one of your two nodes to be [master-eligible](../../distributed-architecture/clusters-nodes-shards/node-roles.md#master-node-role). This means you can be certain which of your nodes is the elected master of the cluster. The cluster can tolerate the loss of the other master-ineligible node. If you set both nodes to master-eligible, two nodes are required for a master election. Since the election will fail if either node is unavailable, your cluster cannot reliably tolerate the loss of either node. diff --git a/deploy-manage/production-guidance/general-recommendations.md b/deploy-manage/production-guidance/general-recommendations.md index d5f148aa5..57537a71b 100644 --- a/deploy-manage/production-guidance/general-recommendations.md +++ b/deploy-manage/production-guidance/general-recommendations.md @@ -6,16 +6,16 @@ mapped_pages: # General recommendations [general-recommendations] -## Don’t return large result sets [large-size] +## Don’t return large result sets [large-size] -Elasticsearch is designed as a search engine, which makes it very good at getting back the top documents that match a query. However, it is not as good for workloads that fall into the database domain, such as retrieving all documents that match a particular query. If you need to do this, make sure to use the [Scroll](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results) API. +Elasticsearch is designed as a search engine, which makes it very good at getting back the top documents that match a query. However, it is not as good for workloads that fall into the database domain, such as retrieving all documents that match a particular query. If you need to do this, make sure to use the [Scroll](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results) API. -## Avoid large documents [maximum-document-size] +## Avoid large documents [maximum-document-size] -Given that the default [`http.max_content_length`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#http-settings) is set to 100MB, Elasticsearch will refuse to index any document that is larger than that. You might decide to increase that particular setting, but Lucene still has a limit of about 2GB. +Given that the default [`http.max_content_length`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#http-settings) is set to 100MB, Elasticsearch will refuse to index any document that is larger than that. 
You might decide to increase that particular setting, but Lucene still has a limit of about 2GB.

-Even without considering hard limits, large documents are usually not practical. Large documents put more stress on network, memory usage and disk, even for search requests that do not request the `_source` since Elasticsearch needs to fetch the `_id` of the document in all cases, and the cost of getting this field is bigger for large documents due to how the filesystem cache works. Indexing this document can use an amount of memory that is a multiplier of the original size of the document. Proximity search (phrase queries for instance) and [highlighting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/highlighting.md) also become more expensive since their cost directly depends on the size of the original document.
+Even without considering hard limits, large documents are usually not practical. Large documents put more stress on network, memory, and disk, even for search requests that do not request the `_source`, since Elasticsearch needs to fetch the `_id` of the document in all cases, and the cost of getting this field is higher for large documents due to how the filesystem cache works. Indexing such a document can use an amount of memory that is a multiple of the original size of the document. Proximity search (phrase queries, for instance) and [highlighting](elasticsearch://reference/elasticsearch/rest-apis/highlighting.md) also become more expensive since their cost directly depends on the size of the original document.

It is sometimes useful to reconsider what the unit of information should be. For instance, the fact that you want to make books searchable doesn't necessarily mean that a document should consist of a whole book. It might be a better idea to use chapters or even paragraphs as documents, and then have a property in these documents that identifies which book they belong to. Not only does this avoid the issues with large documents, it also makes the search experience better. For instance, if a user searches for two words `foo` and `bar`, a match across different chapters is probably very poor, while a match within the same paragraph is likely good.

diff --git a/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md b/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md
index 3c493cb70..63e9fc15e 100644
--- a/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md
+++ b/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md
@@ -10,33 +10,33 @@ mapped_pages:

# Approximate kNN search [tune-knn-search]

Many of these recommendations help improve search speed. With approximate kNN, the indexing algorithm runs searches under the hood to create the vector index structures. So these same recommendations also help with indexing speed.

-## Reduce vector memory foot-print [_reduce_vector_memory_foot_print]

-The default [`element_type`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-element-type) is `float`. But this can be automatically quantized during index time through [`quantization`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization).
Quantization will reduce the required memory by 4x, 8x, or as much as 32x, but it will also reduce the precision of the vectors and increase disk usage for the field (by up to 25%, 12.5%, or 3.125%, respectively). Increased disk usage is a result of {{es}} storing both the quantized and the unquantized vectors. For example, when int8 quantizing 40GB of floating point vectors an extra 10GB of data will be stored for the quantized vectors. The total disk usage amounts to 50GB, but the memory usage for fast search will be reduced to 10GB. +The default [`element_type`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-element-type) is `float`. But this can be automatically quantized during index time through [`quantization`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization). Quantization will reduce the required memory by 4x, 8x, or as much as 32x, but it will also reduce the precision of the vectors and increase disk usage for the field (by up to 25%, 12.5%, or 3.125%, respectively). Increased disk usage is a result of {{es}} storing both the quantized and the unquantized vectors. For example, when int8 quantizing 40GB of floating point vectors an extra 10GB of data will be stored for the quantized vectors. The total disk usage amounts to 50GB, but the memory usage for fast search will be reduced to 10GB. -For `float` vectors with `dim` greater than or equal to `384`, using a [`quantized`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) index is highly recommended. +For `float` vectors with `dim` greater than or equal to `384`, using a [`quantized`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) index is highly recommended. -## Reduce vector dimensionality [_reduce_vector_dimensionality] +## Reduce vector dimensionality [_reduce_vector_dimensionality] The speed of kNN search scales linearly with the number of vector dimensions, because each similarity computation considers each element in the two vectors. Whenever possible, it’s better to use vectors with a lower dimension. Some embedding models come in different "sizes", with both lower and higher dimensional options available. You could also experiment with dimensionality reduction techniques like PCA. When experimenting with different approaches, it’s important to measure the impact on relevance to ensure the search quality is still acceptable. -## Exclude vector fields from `_source` [_exclude_vector_fields_from_source] +## Exclude vector fields from `_source` [_exclude_vector_fields_from_source] -{{es}} stores the original JSON document that was passed at index time in the [`_source` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md). By default, each hit in the search results contains the full document `_source`. When the documents contain high-dimensional `dense_vector` fields, the `_source` can be quite large and expensive to load. This could significantly slow down the speed of kNN search. +{{es}} stores the original JSON document that was passed at index time in the [`_source` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md). By default, each hit in the search results contains the full document `_source`. 
When the documents contain high-dimensional `dense_vector` fields, the `_source` can be quite large and expensive to load. This could significantly slow down the speed of kNN search. -::::{note} +::::{note} [reindex](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex), [update](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update), and [update by query](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update-by-query) operations generally require the `_source` field. Disabling `_source` for a field might result in unexpected behavior for these operations. For example, reindex might not actually contain the `dense_vector` field in the new index. :::: -You can disable storing `dense_vector` fields in the `_source` through the [`excludes`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) mapping parameter. This prevents loading and returning large vectors during search, and also cuts down on the index size. Vectors that have been omitted from `_source` can still be used in kNN search, since it relies on separate data structures to perform the search. Before using the [`excludes`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) parameter, make sure to review the downsides of omitting fields from `_source`. +You can disable storing `dense_vector` fields in the `_source` through the [`excludes`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) mapping parameter. This prevents loading and returning large vectors during search, and also cuts down on the index size. Vectors that have been omitted from `_source` can still be used in kNN search, since it relies on separate data structures to perform the search. Before using the [`excludes`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) parameter, make sure to review the downsides of omitting fields from `_source`. -Another option is to use [synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). +Another option is to use [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). -## Ensure data nodes have enough memory [_ensure_data_nodes_have_enough_memory] +## Ensure data nodes have enough memory [_ensure_data_nodes_have_enough_memory] {{es}} uses the [HNSW](https://arxiv.org/abs/1603.09320) algorithm for approximate kNN search. HNSW is a graph-based algorithm which only works efficiently when most vector data is held in memory. You should ensure that data nodes have at least enough RAM to hold the vector data and index structures. To check the size of the vector data, you can use the [Analyze index disk usage](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-disk-usage) API. @@ -56,11 +56,11 @@ Note that the required RAM is for the filesystem cache, which is separate from t The data nodes should also leave a buffer for other ways that RAM is needed. For example your index might also include text fields and numerics, which also benefit from using filesystem cache. It’s recommended to run benchmarks with your specific dataset to ensure there’s a sufficient amount of memory to give good search performance. 
You can find some examples of datasets and configurations that we use for our nightly benchmarks [here](https://elasticsearch-benchmarks.elastic.co/#tracks/so_vector) and [here](https://elasticsearch-benchmarks.elastic.co/#tracks/dense_vector).

-## Warm up the filesystem cache [dense-vector-preloading]
+## Warm up the filesystem cache [dense-vector-preloading]

-If the machine running Elasticsearch is restarted, the filesystem cache will be empty, so it will take some time before the operating system loads hot regions of the index into memory so that search operations are fast. You can explicitly tell the operating system which files should be loaded into memory eagerly depending on the file extension using the [`index.store.preload`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/preloading-data-into-file-system-cache.md) setting.
+If the machine running Elasticsearch is restarted, the filesystem cache will be empty, so it will take some time before the operating system loads hot regions of the index into memory so that search operations are fast. You can explicitly tell the operating system which files should be loaded into memory eagerly, depending on the file extension, using the [`index.store.preload`](elasticsearch://reference/elasticsearch/index-settings/preloading-data-into-file-system-cache.md) setting.

-::::{warning}
+::::{warning}
Loading data into the filesystem cache eagerly on too many indices or too many files will make search *slower* if the filesystem cache is not large enough to hold all the data. Use with caution.

::::

@@ -69,47 +69,47 @@ The following file extensions are used for the approximate kNN search: Each exte

* `vex` for the HNSW graph
* `vec` for all non-quantized vector values. This includes all element types: `float`, `byte`, and `bit`.
-* `veq` for quantized vectors indexed with [`quantization`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization): `int4` or `int8`
-* `veb` for binary vectors indexed with [`quantization`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization): `bbq`
+* `veq` for quantized vectors indexed with [`quantization`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization): `int4` or `int8`
+* `veb` for binary vectors indexed with [`quantization`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization): `bbq`
* `vem`, `vemf`, `vemq`, and `vemb` for metadata, usually small and not a concern for preloading

Generally, if you are using a quantized index, you should only preload the relevant quantized values and the HNSW graph. Preloading the raw vectors is not necessary and might be counterproductive.

-## Reduce the number of index segments [_reduce_the_number_of_index_segments]
-{{es}} shards are composed of segments, which are internal storage elements in the index. For approximate kNN search, {{es}} stores the vector values of each segment as a separate HNSW graph, so kNN search must check each segment. The recent parallelization of kNN search made it much faster to search across multiple segments, but still kNN search can be up to several times faster if there are fewer segments.
By default, {{es}} periodically merges smaller segments into larger ones through a background [merge process](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/merge.md). If this isn't sufficient, you can take explicit steps to reduce the number of index segments.
+{{es}} shards are composed of segments, which are internal storage elements in the index. For approximate kNN search, {{es}} stores the vector values of each segment as a separate HNSW graph, so kNN search must check each segment. The recent parallelization of kNN search made it much faster to search across multiple segments, but still kNN search can be up to several times faster if there are fewer segments. By default, {{es}} periodically merges smaller segments into larger ones through a background [merge process](elasticsearch://reference/elasticsearch/index-settings/merge.md). If this isn't sufficient, you can take explicit steps to reduce the number of index segments.

-### Increase maximum segment size [_increase_maximum_segment_size]
+### Increase maximum segment size [_increase_maximum_segment_size]

{{es}} provides many tunable settings for controlling the merge process. One important setting is `index.merge.policy.max_merged_segment`. This controls the maximum size of the segments that are created during the merge process. By increasing the value, you can reduce the number of segments in the index. The default value is `5GB`, but that might be too small for larger dimensional vectors. Increasing this value to `10GB` or `20GB` can help reduce the number of segments.

-### Create large segments during bulk indexing [_create_large_segments_during_bulk_indexing]
+### Create large segments during bulk indexing [_create_large_segments_during_bulk_indexing]

A common pattern is to first perform an initial bulk upload, then make an index available for searches. Instead of force merging, you can adjust the index settings to encourage {{es}} to create larger initial segments:

-* Ensure there are no searches during the bulk upload and disable [`index.refresh_interval`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-refresh-interval-setting) by setting it to `-1`. This prevents refresh operations and avoids creating extra segments.
-* Give {{es}} a large indexing buffer so it can accept more documents before flushing. By default, the [`indices.memory.index_buffer_size`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/indexing-buffer-settings.md) is set to 10% of the heap size. With a substantial heap size like 32GB, this is often enough. To allow the full indexing buffer to be used, you should also increase the limit [`index.translog.flush_threshold_size`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/translog.md).
+* Ensure there are no searches during the bulk upload and disable [`index.refresh_interval`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-refresh-interval-setting) by setting it to `-1`. This prevents refresh operations and avoids creating extra segments.
+* Give {{es}} a large indexing buffer so it can accept more documents before flushing. By default, the [`indices.memory.index_buffer_size`](elasticsearch://reference/elasticsearch/configuration-reference/indexing-buffer-settings.md) is set to 10% of the heap size. With a substantial heap size like 32GB, this is often enough.
To allow the full indexing buffer to be used, you should also increase the [`index.translog.flush_threshold_size`](elasticsearch://reference/elasticsearch/index-settings/translog.md) limit.

-## Avoid heavy indexing during searches [_avoid_heavy_indexing_during_searches]
+## Avoid heavy indexing during searches [_avoid_heavy_indexing_during_searches]

Actively indexing documents can have a negative impact on approximate kNN search performance, since indexing threads steal compute resources from search. When indexing and searching at the same time, {{es}} also refreshes frequently, which creates several small segments. This also hurts search performance, since approximate kNN search is slower when there are more segments.

When possible, it's best to avoid heavy indexing during approximate kNN search. If you need to reindex all the data, perhaps because the vector embedding model changed, then it's better to reindex the new documents into a separate index rather than update them in-place. This helps avoid the slowdown mentioned above, and prevents expensive merge operations due to frequent document updates.

-## Avoid page cache thrashing by using modest readahead values on Linux [_avoid_page_cache_thrashing_by_using_modest_readahead_values_on_linux_2]
+## Avoid page cache thrashing by using modest readahead values on Linux [_avoid_page_cache_thrashing_by_using_modest_readahead_values_on_linux_2]

-Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping (see [storage types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/store.md#file-system)).
+Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping (see [storage types](elasticsearch://reference/elasticsearch/index-settings/store.md#file-system)).

Most Linux distributions use a sensible readahead value of `128KiB` for a single plain device. However, when using software RAID, LVM, or dm-crypt, the resulting block device (backing Elasticsearch [path.data](../../deploy/self-managed/important-settings-configuration.md#path-settings)) may end up having a very large readahead value (in the range of several MiB). This usually results in severe page (filesystem) cache thrashing, adversely affecting search (or [update](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-document)) performance.

You can check the current value in `KiB` using `lsblk -o NAME,RA,MOUNTPOINT,TYPE,SIZE`. Consult the documentation of your distribution on how to alter this value (for example with a `udev` rule to persist across reboots, or via [blockdev --setra](https://man7.org/linux/man-pages/man8/blockdev.8.md) as a transient setting). We recommend a value of `128KiB` for readahead.

-::::{warning}
+::::{warning}
`blockdev` expects values in 512-byte sectors, whereas `lsblk` reports values in `KiB`. As an example, to temporarily set readahead to `128KiB` for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`.
::::

diff --git a/deploy-manage/production-guidance/optimize-performance/disk-usage.md b/deploy-manage/production-guidance/optimize-performance/disk-usage.md
index 7d29c1cfd..1db3ae1b3 100644
--- a/deploy-manage/production-guidance/optimize-performance/disk-usage.md
+++ b/deploy-manage/production-guidance/optimize-performance/disk-usage.md
@@ -24,12 +24,12 @@ PUT index
}
```

-[`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) fields store normalization factors in the index to facilitate document scoring. If you only need matching capabilities on a `text` field but do not care about the produced scores, you can use the [`match_only_text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#match-only-text-field-type) type instead. This field type saves significant space by dropping scoring and positional information.
+[`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) fields store normalization factors in the index to facilitate document scoring. If you only need matching capabilities on a `text` field but do not care about the produced scores, you can use the [`match_only_text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md#match-only-text-field-type) type instead. This field type saves significant space by dropping scoring and positional information.

## Don’t use default dynamic string mappings [default-dynamic-string-mapping]

-The default [dynamic string mappings](../../../manage-data/data-store/mapping/dynamic-mapping.md) will index string fields both as [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) and [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md). This is wasteful if you only need one of them. Typically an `id` field will only need to be indexed as a `keyword` while a `body` field will only need to be indexed as a `text` field.
+The default [dynamic string mappings](../../../manage-data/data-store/mapping/dynamic-mapping.md) will index string fields both as [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) and [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md). This is wasteful if you only need one of them. Typically, an `id` field will only need to be indexed as a `keyword`, while a `body` field will only need to be indexed as a `text` field.

This can be disabled by either configuring explicit mappings on string fields or setting up dynamic templates that will map string fields as either `text` or `keyword`.

@@ -63,12 +63,12 @@ Keep in mind that large shard sizes come with drawbacks, such as long full recov

## Disable `_source` [disable-source]

-The [`_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) field stores the original JSON body of the document. If you don’t need access to it you can disable it. However, APIs that needs access to `_source` such as update, highlight and reindex won’t work.
+The [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) field stores the original JSON body of the document. If you don’t need access to it, you can disable it. However, APIs that need access to `_source`, such as update, highlight, and reindex, won’t work.
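For illustration, a minimal sketch of disabling `_source` at index creation time (`my-index` is a placeholder name; weigh the limitations above before doing this):

```console
PUT my-index
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}
```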
## Use `best_compression` [best-compression]

-The `_source` and stored fields can easily take a non negligible amount of disk space. They can be compressed more aggressively by using the `best_compression` [codec](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-codec).
+The `_source` and stored fields can easily take a non-negligible amount of disk space. They can be compressed more aggressively by using the `best_compression` [codec](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-codec).

## Force merge [_force_merge]

@@ -90,14 +90,14 @@ The [shrink API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/ope

## Use the smallest numeric type that is sufficient [_use_the_smallest_numeric_type_that_is_sufficient]

-The type that you pick for [numeric data](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) can have a significant impact on disk usage. In particular, integers should be stored using an integer type (`byte`, `short`, `integer` or `long`) and floating points should either be stored in a `scaled_float` if appropriate or in the smallest type that fits the use-case: using `float` over `double`, or `half_float` over `float` will help save storage.
+The type that you pick for [numeric data](elasticsearch://reference/elasticsearch/mapping-reference/number.md) can have a significant impact on disk usage. In particular, integers should be stored using an integer type (`byte`, `short`, `integer` or `long`), and floating points should either be stored in a `scaled_float` if appropriate or in the smallest type that fits the use case: using `float` over `double`, or `half_float` over `float`, will help save storage.

## Use index sorting to colocate similar documents [_use_index_sorting_to_colocate_similar_documents]

When Elasticsearch stores `_source`, it compresses multiple documents at once in order to improve the overall compression ratio. For instance, it is very common that documents share the same field names, and quite common that they share some field values, especially on fields that have a low cardinality or a [zipfian](https://en.wikipedia.org/wiki/Zipf%27s_law) distribution.

-By default documents are compressed together in the order that they are added to the index. If you enabled [index sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md) then instead they are compressed in sorted order. Sorting documents with similar structure, fields, and values together should improve the compression ratio.
+By default, documents are compressed together in the order that they are added to the index. If you enable [index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md), they are instead compressed in sorted order. Sorting documents with similar structure, fields, and values together should improve the compression ratio.
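As a sketch, index sorting can only be enabled at index creation time and might look like this (`host.name` is a hypothetical low-cardinality `keyword` field):

```console
PUT my-index
{
  "settings": {
    "index.sort.field": "host.name",
    "index.sort.order": "asc"
  },
  "mappings": {
    "properties": {
      "host.name": { "type": "keyword" }
    }
  }
}
```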
## Put fields in the same order in documents [_put_fields_in_the_same_order_in_documents]

diff --git a/deploy-manage/production-guidance/optimize-performance/indexing-speed.md b/deploy-manage/production-guidance/optimize-performance/indexing-speed.md
index 36757e4f4..667b1db2a 100644
--- a/deploy-manage/production-guidance/optimize-performance/indexing-speed.md
+++ b/deploy-manage/production-guidance/optimize-performance/indexing-speed.md
@@ -6,12 +6,12 @@ mapped_pages:

# Indexing speed [tune-for-indexing-speed]

-## Use bulk requests [_use_bulk_requests]
+## Use bulk requests [_use_bulk_requests]

Bulk requests will yield much better performance than single-document index requests. In order to know the optimal size of a bulk request, you should run a benchmark on a single node with a single shard. First try to index 100 documents at once, then 200, then 400, and so on, doubling the number of documents in a bulk request in every benchmark run. When the indexing speed starts to plateau, you know you have reached the optimal size of a bulk request for your data. In case of a tie, it is better to err in the direction of too few rather than too many documents. Beware that too-large bulk requests might put the cluster under memory pressure when many of them are sent concurrently, so it is advisable to avoid going beyond a couple of tens of megabytes per request, even if larger requests seem to perform better.

-## Use multiple workers/threads to send data to Elasticsearch [multiple-workers-threads]
+## Use multiple workers/threads to send data to Elasticsearch [multiple-workers-threads]

A single thread sending bulk requests is unlikely to be able to max out the indexing capacity of an Elasticsearch cluster. In order to use all resources of the cluster, you should send data from multiple threads or processes. In addition to making better use of the resources of the cluster, this should help reduce the cost of each fsync.

@@ -20,7 +20,7 @@ Make sure to watch for `TOO_MANY_REQUESTS (429)` response codes (`EsRejectedExec

Similarly to sizing bulk requests, only testing can tell what the optimal number of workers is. This can be tested by progressively increasing the number of workers until either I/O or CPU is saturated on the cluster.

-## Unset or increase the refresh interval [_unset_or_increase_the_refresh_interval]
+## Unset or increase the refresh interval [_unset_or_increase_the_refresh_interval]

The operation that consists of making changes visible to search - called a [refresh](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-refresh) - is costly, and calling it often while there is ongoing indexing activity can hurt indexing speed.

@@ -28,63 +28,63 @@ By default, Elasticsearch periodically refreshes indices every second, but only

This is the optimal configuration if you have no or very little search traffic (e.g. less than one search request every 5 minutes) and want to optimize for indexing speed. This behavior aims to automatically optimize bulk indexing in the default case when no searches are performed. In order to opt out of this behavior, set the refresh interval explicitly.

-On the other hand, if your index experiences regular search requests, this default behavior means that Elasticsearch will refresh your index every 1 second.
If you can afford to increase the amount of time between when a document gets indexed and when it becomes visible, increasing the [`index.refresh_interval`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-refresh-interval-setting) to a larger value, e.g. `30s`, might help improve indexing speed.
+On the other hand, if your index experiences regular search requests, this default behavior means that Elasticsearch will refresh your index every 1 second. If you can afford to increase the amount of time between when a document gets indexed and when it becomes visible, increasing the [`index.refresh_interval`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-refresh-interval-setting) to a larger value, e.g. `30s`, might help improve indexing speed.

-## Disable replicas for initial loads [_disable_replicas_for_initial_loads]
+## Disable replicas for initial loads [_disable_replicas_for_initial_loads]

If you have a large amount of data that you want to load all at once into Elasticsearch, it may be beneficial to set `index.number_of_replicas` to `0` in order to speed up indexing. Having no replicas means that losing a single node may incur data loss, so it is important that the data lives elsewhere so that this initial load can be retried in case of an issue. Once the initial load is finished, you can set `index.number_of_replicas` back to its original value.

If `index.refresh_interval` is configured in the index settings, it may further help to unset it during this initial load and to set it back to its original value once the initial load is finished.

-## Disable swapping [_disable_swapping_2]
+## Disable swapping [_disable_swapping_2]

You should make sure that the operating system is not swapping out the Java process by [disabling swapping](../../deploy/self-managed/setup-configuration-memory.md).

-## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache]
+## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache]

The filesystem cache will be used in order to buffer I/O operations. You should make sure to give at least half the memory of the machine running Elasticsearch to the filesystem cache.

-## Use auto-generated ids [_use_auto_generated_ids]
+## Use auto-generated ids [_use_auto_generated_ids]

When indexing a document that has an explicit id, Elasticsearch needs to check whether a document with the same id already exists within the same shard, which is a costly operation and gets even more costly as the index grows. By using auto-generated ids, Elasticsearch can skip this check, which makes indexing faster.

-## Use faster hardware [indexing-use-faster-hardware]
+## Use faster hardware [indexing-use-faster-hardware]

If indexing is I/O-bound, consider increasing the size of the filesystem cache (see above) or using faster storage. Elasticsearch generally creates individual files with sequential writes. However, indexing involves writing multiple files concurrently, and a mix of random and sequential reads too, so SSD drives tend to perform better than spinning disks.

Stripe your index across multiple SSDs by configuring a RAID 0 array. Remember that it will increase the risk of failure since the failure of any one SSD destroys the index. However, this is typically the right tradeoff to make: optimize single shards for maximum performance, and then add replicas across different nodes so there’s redundancy for any node failures.
You can also use [snapshot and restore](../../tools/snapshot-and-restore.md) to back up the index for further insurance.

-### Local vs. remote storage [_local_vs_remote_storage]
+### Local vs. remote storage [_local_vs_remote_storage]

Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads. Some remote storage performs very poorly, especially under the kind of load that {{es}} imposes. However, with careful tuning, it is sometimes possible to achieve acceptable performance using remote storage too. Before committing to a particular storage architecture, benchmark your system with a realistic workload to determine the effects of any tuning parameters. If you cannot achieve the performance you expect, work with the vendor of your storage system to identify the problem.

-## Indexing buffer size [_indexing_buffer_size]
+## Indexing buffer size [_indexing_buffer_size]

-If your node is doing only heavy indexing, be sure [`indices.memory.index_buffer_size`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/indexing-buffer-settings.md) is large enough to give at most 512 MB indexing buffer per shard doing heavy indexing (beyond that indexing performance does not typically improve). Elasticsearch takes that setting (a percentage of the java heap or an absolute byte-size), and uses it as a shared buffer across all active shards. Very active shards will naturally use this buffer more than shards that are performing lightweight indexing.
+If your node is doing only heavy indexing, be sure [`indices.memory.index_buffer_size`](elasticsearch://reference/elasticsearch/configuration-reference/indexing-buffer-settings.md) is large enough to give at most 512 MB of indexing buffer per shard doing heavy indexing (beyond that, indexing performance does not typically improve). Elasticsearch takes that setting (a percentage of the Java heap or an absolute byte size), and uses it as a shared buffer across all active shards. Very active shards will naturally use this buffer more than shards that are performing lightweight indexing.

The default is `10%`, which is often plenty: for example, if you give the JVM 10GB of memory, it will give 1GB to the index buffer, which is enough to host two shards that are heavily indexing.

-## Use {{ccr}} to prevent searching from stealing resources from indexing [_use_ccr_to_prevent_searching_from_stealing_resources_from_indexing]
+## Use {{ccr}} to prevent searching from stealing resources from indexing [_use_ccr_to_prevent_searching_from_stealing_resources_from_indexing]

Within a single cluster, indexing and searching can compete for resources. By setting up two clusters, configuring [{{ccr}}](../../tools/cross-cluster-replication.md) to replicate data from one cluster to the other one, and routing all searches to the cluster that has the follower indices, search activity will no longer steal resources from indexing on the cluster that hosts the leader indices.

-## Avoid hot spotting [_avoid_hot_spotting]
+## Avoid hot spotting [_avoid_hot_spotting]

[Hot Spotting](../../../troubleshoot/elasticsearch/hotspotting.md) can occur when node resources, shards, or requests are not evenly distributed. {{es}} maintains cluster state by syncing it across nodes, so continually hot spotted nodes can cause overall cluster performance degradation.
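One simple starting point for spotting uneven shard or disk distribution across nodes (an illustrative check we are adding here, not a full diagnosis) is the cat allocation API:

```console
GET _cat/allocation?v=true
```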
-## Additional optimizations [_additional_optimizations]
+## Additional optimizations [_additional_optimizations]

Many of the strategies outlined in [*Tune for disk usage*](disk-usage.md) also provide an improvement in the speed of indexing.

diff --git a/deploy-manage/production-guidance/optimize-performance/search-speed.md b/deploy-manage/production-guidance/optimize-performance/search-speed.md
index a2eb96171..58c9c927d 100644
--- a/deploy-manage/production-guidance/optimize-performance/search-speed.md
+++ b/deploy-manage/production-guidance/optimize-performance/search-speed.md
@@ -6,49 +6,49 @@ mapped_pages:

# Search speed [tune-for-search-speed]

-## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache_2]
+## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache_2]

Elasticsearch heavily relies on the filesystem cache in order to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.

-## Avoid page cache thrashing by using modest readahead values on Linux [_avoid_page_cache_thrashing_by_using_modest_readahead_values_on_linux]
+## Avoid page cache thrashing by using modest readahead values on Linux [_avoid_page_cache_thrashing_by_using_modest_readahead_values_on_linux]

-Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping (see [storage types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/store.md#file-system)).
+Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping (see [storage types](elasticsearch://reference/elasticsearch/index-settings/store.md#file-system)).

Most Linux distributions use a sensible readahead value of `128KiB` for a single plain device. However, when using software RAID, LVM, or dm-crypt, the resulting block device (backing Elasticsearch [path.data](../../deploy/self-managed/important-settings-configuration.md#path-settings)) may end up having a very large readahead value (in the range of several MiB). This usually results in severe page (filesystem) cache thrashing, adversely affecting search (or [update](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-document)) performance.

You can check the current value in `KiB` using `lsblk -o NAME,RA,MOUNTPOINT,TYPE,SIZE`. Consult the documentation of your distribution on how to alter this value (for example with a `udev` rule to persist across reboots, or via [blockdev --setra](https://man7.org/linux/man-pages/man8/blockdev.8.md) as a transient setting). We recommend a value of `128KiB` for readahead.

-::::{warning}
+::::{warning}
`blockdev` expects values in 512-byte sectors, whereas `lsblk` reports values in `KiB`. As an example, to temporarily set readahead to `128KiB` for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`.

::::

-## Use faster hardware [search-use-faster-hardware]
+## Use faster hardware [search-use-faster-hardware]

If your searches are I/O-bound, consider increasing the size of the filesystem cache (see above) or using faster storage.
Each search involves a mix of sequential and random reads across multiple files, and there may be many searches running concurrently on each shard, so SSD drives tend to perform better than spinning disks. If your searches are CPU-bound, consider using a larger number of faster CPUs. -### Local vs. remote storage [_local_vs_remote_storage_2] +### Local vs. remote storage [_local_vs_remote_storage_2] Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads. Some remote storage performs very poorly, especially under the kind of load that {{es}} imposes. However, with careful tuning, it is sometimes possible to achieve acceptable performance using remote storage too. Before committing to a particular storage architecture, benchmark your system with a realistic workload to determine the effects of any tuning parameters. If you cannot achieve the performance you expect, work with the vendor of your storage system to identify the problem. -## Document modeling [_document_modeling] +## Document modeling [_document_modeling] Documents should be modeled so that search-time operations are as cheap as possible. -In particular, joins should be avoided. [`nested`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) can make queries several times slower and [parent-child](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/parent-join.md) relations can make queries hundreds of times slower. So if the same questions can be answered without joins by denormalizing documents, significant speedups can be expected. +In particular, joins should be avoided. [`nested`](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) can make queries several times slower and [parent-child](elasticsearch://reference/elasticsearch/mapping-reference/parent-join.md) relations can make queries hundreds of times slower. So if the same questions can be answered without joins by denormalizing documents, significant speedups can be expected. -## Search as few fields as possible [search-as-few-fields-as-possible] +## Search as few fields as possible [search-as-few-fields-as-possible] -The more fields a [`query_string`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-query-string-query.md) or [`multi_match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md) query targets, the slower it is. A common technique to improve search speed over multiple fields is to copy their values into a single field at index time, and then use this field at search time. This can be automated with the [`copy-to`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/copy-to.md) directive of mappings without having to change the source of documents. Here is an example of an index containing movies that optimizes queries that search over both the name and the plot of the movie by indexing both values into the `name_and_plot` field. +The more fields a [`query_string`](elasticsearch://reference/query-languages/query-dsl-query-string-query.md) or [`multi_match`](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md) query targets, the slower it is. A common technique to improve search speed over multiple fields is to copy their values into a single field at index time, and then use this field at search time. 
This can be automated with the [`copy-to`](elasticsearch://reference/elasticsearch/mapping-reference/copy-to.md) directive of mappings without having to change the source of documents. Here is an example of an index containing movies that optimizes queries that search over both the name and the plot of the movie by indexing both values into the `name_and_plot` field.

```console
PUT movies
@@ -72,9 +72,9 @@ PUT movies
```

-## Pre-index data [_pre_index_data]
+## Pre-index data [_pre_index_data]

-You should leverage patterns in your queries to optimize the way data is indexed. For instance, if all your documents have a `price` field and most queries run [`range`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-range-aggregation.md) aggregations on a fixed list of ranges, you could make this aggregation faster by pre-indexing the ranges into the index and using a [`terms`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) aggregations.
+You should leverage patterns in your queries to optimize the way data is indexed. For instance, if all your documents have a `price` field and most queries run [`range`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-range-aggregation.md) aggregations on a fixed list of ranges, you could make this aggregation faster by pre-indexing the ranges into the index and using a [`terms`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) aggregation.

For instance, if documents look like:

@@ -106,7 +106,7 @@ GET index/_search
}
```

-Then documents could be enriched by a `price_range` field at index time, which should be mapped as a [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md):
+Then documents could be enriched with a `price_range` field at index time, which should be mapped as a [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md):

```console
PUT index
@@ -144,26 +144,26 @@ GET index/_search
```

-## Consider mapping identifiers as `keyword` [map-ids-as-keyword]
+## Consider mapping identifiers as `keyword` [map-ids-as-keyword]

-Not all numeric data should be mapped as a [numeric](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) field data type. {{es}} optimizes numeric fields, such as `integer` or `long`, for [`range`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) queries. However, [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) fields are better for [`term`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md) and other [term-level](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/term-level-queries.md) queries.
+Not all numeric data should be mapped as a [numeric](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field data type. {{es}} optimizes numeric fields, such as `integer` or `long`, for [`range`](elasticsearch://reference/query-languages/query-dsl-range-query.md) queries.
However, [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) fields are better for [`term`](elasticsearch://reference/query-languages/query-dsl-term-query.md) and other [term-level](elasticsearch://reference/query-languages/term-level-queries.md) queries. Identifiers, such as an ISBN or a product ID, are rarely used in `range` queries. However, they are often retrieved using term-level queries. Consider mapping a numeric identifier as a `keyword` if: -* You don’t plan to search for the identifier data using [`range`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) queries. +* You don’t plan to search for the identifier data using [`range`](elasticsearch://reference/query-languages/query-dsl-range-query.md) queries. * Fast retrieval is important. `term` query searches on `keyword` fields are often faster than `term` searches on numeric fields. -If you’re unsure which to use, you can use a [multi-field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md) to map the data as both a `keyword` *and* a numeric data type. +If you’re unsure which to use, you can use a [multi-field](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) to map the data as both a `keyword` *and* a numeric data type. -## Avoid scripts [_avoid_scripts] +## Avoid scripts [_avoid_scripts] -If possible, avoid using [script](../../../explore-analyze/scripting.md)-based sorting, scripts in aggregations, and the [`script_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) query. See [Scripts, caching, and search speed](../../../explore-analyze/scripting/scripts-search-speed.md). +If possible, avoid using [script](../../../explore-analyze/scripting.md)-based sorting, scripts in aggregations, and the [`script_score`](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) query. See [Scripts, caching, and search speed](../../../explore-analyze/scripting/scripts-search-speed.md). -## Search rounded dates [_search_rounded_dates] +## Search rounded dates [_search_rounded_dates] Queries on date fields that use `now` are typically not cacheable since the range that is being matched changes all the time. However, switching to a rounded date is often acceptable in terms of user experience, and has the benefit of making better use of the query cache. @@ -214,7 +214,7 @@ GET index/_search In that case we rounded to the minute, so if the current time is `16:31:29`, the range query will match everything whose value of the `my_date` field is between `15:31:00` and `16:31:59`. And if several users run a query that contains this range in the same minute, the query cache could help speed things up a bit. The longer the interval that is used for rounding, the more the query cache can help, but beware that too aggressive rounding might also hurt user experience. -::::{note} +::::{note} It might be tempting to split ranges into a large cacheable part and smaller non-cacheable parts in order to be able to leverage the query cache, as shown below: :::: @@ -262,19 +262,19 @@ GET index/_search However, such practice might make the query run slower in some cases since the overhead introduced by the `bool` query may defeat the savings from better leveraging the query cache.
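To make the recommended form concrete, here is a minimal sketch of a rounded-date query, assuming an index with a `my_date` date field (the index and field names are illustrative). The `/m` date-math suffix rounds `now` down to the minute, so the range stays identical for a full minute and repeated requests can be served from the query cache:

```console
GET index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "my_date": {
              "gte": "now-1h/m",
              "lte": "now/m"
            }
          }
        }
      ]
    }
  }
}
```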
-## Force-merge read-only indices [_force_merge_read_only_indices] +## Force-merge read-only indices [_force_merge_read_only_indices] Indices that are read-only may benefit from being [merged down to a single segment](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge). This is typically the case with time-based indices: only the index for the current time frame is getting new documents while older indices are read-only. Shards that have been force-merged into a single segment can use simpler and more efficient data structures to perform searches. -::::{important} +::::{important} Do not force-merge indices to which you are still writing, or to which you will write again in the future. Instead, rely on the automatic background merge process to perform merges as needed to keep the index running smoothly. If you continue to write to a force-merged index then its performance may become much worse. :::: -## Warm up global ordinals [_warm_up_global_ordinals] +## Warm up global ordinals [_warm_up_global_ordinals] -[Global ordinals](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/eager-global-ordinals.md) are a data structure that is used to optimize the performance of aggregations. They are calculated lazily and stored in the JVM heap as part of the [field data cache](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/field-data-cache-settings.md). For fields that are heavily used for bucketing aggregations, you can tell {{es}} to construct and cache the global ordinals before requests are received. This should be done carefully because it will increase heap usage and can make [refreshes](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-refresh) take longer. The option can be updated dynamically on an existing mapping by setting the [eager global ordinals](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/eager-global-ordinals.md) mapping parameter: +[Global ordinals](elasticsearch://reference/elasticsearch/mapping-reference/eager-global-ordinals.md) are a data structure that is used to optimize the performance of aggregations. They are calculated lazily and stored in the JVM heap as part of the [field data cache](elasticsearch://reference/elasticsearch/configuration-reference/field-data-cache-settings.md). For fields that are heavily used for bucketing aggregations, you can tell {{es}} to construct and cache the global ordinals before requests are received. This should be done carefully because it will increase heap usage and can make [refreshes](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-refresh) take longer. The option can be updated dynamically on an existing mapping by setting the [eager global ordinals](elasticsearch://reference/elasticsearch/mapping-reference/eager-global-ordinals.md) mapping parameter: ```console PUT index @@ -291,29 +291,29 @@ PUT index ``` -## Warm up the filesystem cache [_warm_up_the_filesystem_cache] +## Warm up the filesystem cache [_warm_up_the_filesystem_cache] -If the machine running Elasticsearch is restarted, the filesystem cache will be empty, so it will take some time before the operating system loads hot regions of the index into memory so that search operations are fast. 
You can explicitly tell the operating system which files should be loaded into memory eagerly depending on the file extension using the [`index.store.preload`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/preloading-data-into-file-system-cache.md) setting. +If the machine running Elasticsearch is restarted, the filesystem cache will be empty, so it will take some time before the operating system loads hot regions of the index into memory so that search operations are fast. You can explicitly tell the operating system which files should be loaded into memory eagerly depending on the file extension using the [`index.store.preload`](elasticsearch://reference/elasticsearch/index-settings/preloading-data-into-file-system-cache.md) setting. -::::{warning} +::::{warning} Loading data into the filesystem cache eagerly on too many indices or too many files will make search *slower* if the filesystem cache is not large enough to hold all the data. Use with caution. :::: -## Use index sorting to speed up conjunctions [_use_index_sorting_to_speed_up_conjunctions] +## Use index sorting to speed up conjunctions [_use_index_sorting_to_speed_up_conjunctions] -[Index sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md) can be useful in order to make conjunctions faster at the cost of slightly slower indexing. Read more about it in the [index sorting documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting-conjunctions.md). +[Index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) can be useful in order to make conjunctions faster at the cost of slightly slower indexing. Read more about it in the [index sorting documentation](elasticsearch://reference/elasticsearch/index-settings/sorting-conjunctions.md). -## Use `preference` to optimize cache utilization [preference-cache-optimization] +## Use `preference` to optimize cache utilization [preference-cache-optimization] -There are multiple caches that can help with search performance, such as the [filesystem cache](https://en.wikipedia.org/wiki/Page_cache), the [request cache](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/shard-request-cache-settings.md) or the [query cache](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-query-cache-settings.md). Yet all these caches are maintained at the node level, meaning that if you run the same request twice in a row, have 1 replica or more and use [round-robin](https://en.wikipedia.org/wiki/Round-robin_DNS), the default routing algorithm, then those two requests will go to different shard copies, preventing node-level caches from helping. +There are multiple caches that can help with search performance, such as the [filesystem cache](https://en.wikipedia.org/wiki/Page_cache), the [request cache](elasticsearch://reference/elasticsearch/configuration-reference/shard-request-cache-settings.md) or the [query cache](elasticsearch://reference/elasticsearch/configuration-reference/node-query-cache-settings.md). Yet all these caches are maintained at the node level, meaning that if you run the same request twice in a row, have 1 replica or more and use [round-robin](https://en.wikipedia.org/wiki/Round-robin_DNS), the default routing algorithm, then those two requests will go to different shard copies, preventing node-level caches from helping. 
Since it is common for users of a search application to run similar requests one after another, for instance in order to analyze a narrower subset of the index, using a preference value that identifies the current user or session could help optimize usage of the caches. -## Replicas might help with throughput, but not always [_replicas_might_help_with_throughput_but_not_always] +## Replicas might help with throughput, but not always [_replicas_might_help_with_throughput_but_not_always] In addition to improving resiliency, replicas can help improve throughput. For instance if you have a single-shard index and three nodes, you will need to set the number of replicas to 2 in order to have 3 copies of your shard in total so that all nodes are utilized. @@ -322,7 +322,7 @@ Now imagine that you have a 2-shards index and two nodes. In one case, the numbe So what is the right number of replicas? If you have a cluster that has `num_nodes` nodes, `num_primaries` primary shards *in total* and if you want to be able to cope with `max_failures` node failures at once at most, then the right number of replicas for you is `max(max_failures, ceil(num_nodes / num_primaries) - 1)`. -## Tune your queries with the Search Profiler [_tune_your_queries_with_the_search_profiler] +## Tune your queries with the Search Profiler [_tune_your_queries_with_the_search_profiler] The [Profile API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-profile.html) provides detailed information about how each component of your queries and aggregations impacts the time it takes to process the request. @@ -331,17 +331,17 @@ The [Search Profiler](../../../explore-analyze/query-filter/tools/search-profile Because the Profile API itself adds significant overhead to the query, this information is best used to understand the relative cost of the various query components. It does not provide a reliable measure of actual processing time. -## Faster phrase queries with `index_phrases` [faster-phrase-queries] +## Faster phrase queries with `index_phrases` [faster-phrase-queries] -The [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) field has an [`index_phrases`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/index-phrases.md) option that indexes 2-shingles and is automatically leveraged by query parsers to run phrase queries that don’t have a slop. If your use-case involves running lots of phrase queries, this can speed up queries significantly. +The [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) field has an [`index_phrases`](elasticsearch://reference/elasticsearch/mapping-reference/index-phrases.md) option that indexes 2-shingles and is automatically leveraged by query parsers to run phrase queries that don’t have a slop. If your use-case involves running lots of phrase queries, this can speed up queries significantly. -## Faster prefix queries with `index_prefixes` [faster-prefix-queries] +## Faster prefix queries with `index_prefixes` [faster-prefix-queries] -The [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) field has an [`index_prefixes`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/index-prefixes.md) option that indexes prefixes of all terms and is automatically leveraged by query parsers to run prefix queries. 
If your use-case involves running lots of prefix queries, this can speed up queries significantly. +The [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) field has an [`index_prefixes`](elasticsearch://reference/elasticsearch/mapping-reference/index-prefixes.md) option that indexes prefixes of all terms and is automatically leveraged by query parsers to run prefix queries. If your use-case involves running lots of prefix queries, this can speed up queries significantly. -## Use `constant_keyword` to speed up filtering [faster-filtering-with-constant-keyword] +## Use `constant_keyword` to speed up filtering [faster-filtering-with-constant-keyword] There is a general rule that the cost of a filter is mostly a function of the number of matched documents. Imagine that you have an index containing cycles. There are a large number of bicycles and many searches perform a filter on `cycle_type: bicycle`. This very common filter is unfortunately also very costly since it matches most documents. There is a simple way to avoid running this filter: move bicycles to their own index and filter bicycles by searching this index instead of adding a filter to the query. @@ -422,7 +422,7 @@ This is a powerful way of making queries cheaper by putting common values in a d The `constant_keyword` is not strictly required for this optimization: it is also possible to update the client-side logic in order to route queries to the relevant indices based on filters. However, `constant_keyword` makes this transparent and allows you to decouple search requests from the index topology in exchange for very little overhead. -## Default search timeout [_default_search_timeout] +## Default search timeout [_default_search_timeout] By default, search requests don’t time out. You can set a timeout using the [`search.default_search_timeout`](../../../solutions/search/the-search-api.md#search-timeout) setting. diff --git a/deploy-manage/production-guidance/optimize-performance/size-shards.md index 5d559037b..aa7b654ec 100644 --- a/deploy-manage/production-guidance/optimize-performance/size-shards.md +++ b/deploy-manage/production-guidance/optimize-performance/size-shards.md @@ -28,14 +28,14 @@ Keep the following things in mind when building your sharding strategy. ### Searches run on a single thread per shard [single-thread-per-shard] -Most searches hit multiple shards. Each shard runs the search on a single CPU thread. While a shard can run multiple concurrent searches, searches across a large number of shards can deplete a node’s [search thread pool](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md). This can result in low throughput and slow search speeds. +Most searches hit multiple shards. Each shard runs the search on a single CPU thread. While a shard can run multiple concurrent searches, searches across a large number of shards can deplete a node’s [search thread pool](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md). This can result in low throughput and slow search speeds. ### Each index, shard, segment and field has overhead [each-shard-has-overhead] Every index and every shard requires some memory and CPU resources. In most cases, a small set of large shards uses fewer resources than many small shards. Segments play a big role in a shard’s resource usage. Most shards contain several segments, which store its index data.
{{es}} keeps some segment metadata in heap memory so it can be quickly retrieved for searches. As a shard grows, its segments are [merged](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/merge.md) into fewer, larger segments. This decreases the number of segments, which means less metadata is kept in heap memory. +Segments play a big role in a shard’s resource usage. Most shards contain several segments, which store its index data. {{es}} keeps some segment metadata in heap memory so it can be quickly retrieved for searches. As a shard grows, its segments are [merged](elasticsearch://reference/elasticsearch/index-settings/merge.md) into fewer, larger segments. This decreases the number of segments, which means less metadata is kept in heap memory. Every mapped field also carries some overhead in terms of memory usage and disk space. By default {{es}} will automatically create a mapping for every field in every document it indexes, but you can switch off this behaviour to [take control of your mappings](../../../manage-data/data-store/mapping/explicit-mapping.md). @@ -54,7 +54,7 @@ Where applicable, use the following best practices as starting points for your s ### Delete indices, not documents [delete-indices-not-documents] -Deleted documents aren’t immediately removed from {{es}}'s file system. Instead, {{es}} marks the document as deleted on each related shard. The marked document will continue to use resources until it’s removed during a periodic [segment merge](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/merge.md). +Deleted documents aren’t immediately removed from {{es}}'s file system. Instead, {{es}} marks the document as deleted on each related shard. The marked document will continue to use resources until it’s removed during a periodic [segment merge](elasticsearch://reference/elasticsearch/index-settings/merge.md). When possible, delete entire indices instead. {{es}} can immediately remove deleted indices directly from the file system and free up resources. @@ -67,8 +67,8 @@ One advantage of this setup is [automatic rollover](../../../manage-data/lifecyc {{ilm-init}} also makes it easy to change your sharding strategy over time: -* **Want to decrease the shard count for new indices?**
Change the [`index.number_of_shards`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-number-of-shards) setting in the data stream’s [matching index template](../../../manage-data/data-store/data-streams/modify-data-stream.md#data-streams-change-mappings-and-settings). -* **Want larger shards or fewer backing indices?**
Increase your {{ilm-init}} policy’s [rollover threshold](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md). +* **Want to decrease the shard count for new indices?**
Change the [`index.number_of_shards`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-number-of-shards) setting in the data stream’s [matching index template](../../../manage-data/data-store/data-streams/modify-data-stream.md#data-streams-change-mappings-and-settings). +* **Want larger shards or fewer backing indices?**
Increase your {{ilm-init}} policy’s [rollover threshold](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md). * **Need indices that span shorter intervals?**
Offset the increased shard count by deleting older indices sooner. You can do this by lowering the `min_age` threshold for your policy’s [delete phase](../../../manage-data/lifecycle/index-lifecycle-management/index-lifecycle.md). Every new backing index is an opportunity to further tune your strategy. @@ -82,7 +82,7 @@ There is no hard limit on the physical size of a shard, and each shard can in th You may be able to use larger shards depending on your network and use case, and smaller shards may be appropriate for certain use cases. -If you use {{ilm-init}}, set the [rollover action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md)'s `max_primary_shard_size` threshold to `50gb` to avoid shards larger than 50GB and `min_primary_shard_size` threshold to `10gb` to avoid shards smaller than 10GB. +If you use {{ilm-init}}, set the [rollover action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md)'s `max_primary_shard_size` threshold to `50gb` to avoid shards larger than 50GB and `min_primary_shard_size` threshold to `10gb` to avoid shards smaller than 10GB. To see the current size of your shards, use the [cat shards API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-shards). @@ -133,7 +133,7 @@ GET _cat/shards?v=true ### Add enough nodes to stay within the cluster shard limits [shard-count-per-node-recommendation] -[Cluster shard limits](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-shard-limit) prevent creation of more than 1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen node. Make sure you have enough nodes of each type in your cluster to handle the number of shards you need. +[Cluster shard limits](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-shard-limit) prevent creation of more than 1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen node. Make sure you have enough nodes of each type in your cluster to handle the number of shards you need. ### Allow enough heap for field mappers and overheads [field-count-recommendation] @@ -223,7 +223,7 @@ Note that the above rules do not necessarily guarantee the performance of search If too many shards are allocated to a specific node, the node can become a hotspot. For example, if a single node contains too many shards for an index with a high indexing volume, the node is likely to have issues. -To prevent hotspots, use the [`index.routing.allocation.total_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/total-shards-per-node.md#total-shards-per-node) index setting to explicitly limit the number of shards on a single node. You can configure `index.routing.allocation.total_shards_per_node` using the [update index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings). +To prevent hotspots, use the [`index.routing.allocation.total_shards_per_node`](elasticsearch://reference/elasticsearch/index-settings/total-shards-per-node.md#total-shards-per-node) index setting to explicitly limit the number of shards on a single node. You can configure `index.routing.allocation.total_shards_per_node` using the [update index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings). 
```console PUT my-index-000001/_settings @@ -237,7 +237,7 @@ PUT my-index-000001/_settings ### Avoid unnecessary mapped fields [avoid-unnecessary-fields] -By default {{es}} [automatically creates a mapping](../../../manage-data/data-store/mapping/dynamic-mapping.md) for every field in every document it indexes. Every mapped field corresponds to some data structures on disk which are needed for efficient search, retrieval, and aggregations on this field. Details about each mapped field are also held in memory. In many cases this overhead is unnecessary because a field is not used in any searches or aggregations. Use [*Explicit mapping*](../../../manage-data/data-store/mapping/explicit-mapping.md) instead of dynamic mapping to avoid creating fields that are never used. If a collection of fields are typically used together, consider using [`copy_to`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/copy-to.md) to consolidate them at index time. If a field is only rarely used, it may be better to make it a [Runtime field](../../../manage-data/data-store/mapping/runtime-fields.md) instead. +By default {{es}} [automatically creates a mapping](../../../manage-data/data-store/mapping/dynamic-mapping.md) for every field in every document it indexes. Every mapped field corresponds to some data structures on disk which are needed for efficient search, retrieval, and aggregations on this field. Details about each mapped field are also held in memory. In many cases this overhead is unnecessary because a field is not used in any searches or aggregations. Use [*Explicit mapping*](../../../manage-data/data-store/mapping/explicit-mapping.md) instead of dynamic mapping to avoid creating fields that are never used. If a collection of fields are typically used together, consider using [`copy_to`](elasticsearch://reference/elasticsearch/mapping-reference/copy-to.md) to consolidate them at index time. If a field is only rarely used, it may be better to make it a [Runtime field](../../../manage-data/data-store/mapping/runtime-fields.md) instead. You can get information about which fields are being used with the [Field usage stats](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-field-usage-stats) API, and you can analyze the disk usage of mapped fields using the [Analyze index disk usage](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-disk-usage) API. Note however that unnecessary mapped fields also carry some memory overhead as well as their disk usage. @@ -273,7 +273,7 @@ DELETE my-index-000001 ### Force merge during off-peak hours [force-merge-during-off-peak-hours] -If you no longer write to an index, you can use the [force merge API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) to [merge](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/merge.md) smaller segments into larger ones. This can reduce shard overhead and improve search speeds. However, force merges are resource-intensive. If possible, run the force merge during off-peak hours. +If you no longer write to an index, you can use the [force merge API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) to [merge](elasticsearch://reference/elasticsearch/index-settings/merge.md) smaller segments into larger ones. This can reduce shard overhead and improve search speeds. However, force merges are resource-intensive. 
If possible, run the force merge during off-peak hours. ```console POST my-index-000001/_forcemerge @@ -284,7 +284,7 @@ POST my-index-000001/_forcemerge If you no longer write to an index, you can use the [shrink index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-shrink) to reduce its shard count. -{{ilm-init}} also has a [shrink action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md) for indices in the warm phase. +{{ilm-init}} also has a [shrink action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md) for indices in the warm phase. ### Combine smaller indices [combine-smaller-indices] @@ -311,7 +311,7 @@ Here’s how to resolve common shard-related errors. ### this action would add [x] total shards, but this cluster currently has [y]/[z] maximum shards open; [troubleshooting-max-shards-open] -The [`cluster.max_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) cluster setting limits the maximum number of open shards for a cluster. This error indicates an action would exceed this limit. +The [`cluster.max_shards_per_node`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) cluster setting limits the maximum number of open shards for a cluster. This error indicates an action would exceed this limit. If you’re confident your changes won’t destabilize the cluster, you can temporarily increase the limit using the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) and retry the action. diff --git a/deploy-manage/remote-clusters/ec-remote-cluster-self-managed.md b/deploy-manage/remote-clusters/ec-remote-cluster-self-managed.md index 0e1f93873..ed72c4b42 100644 --- a/deploy-manage/remote-clusters/ec-remote-cluster-self-managed.md +++ b/deploy-manage/remote-clusters/ec-remote-cluster-self-managed.md @@ -137,7 +137,7 @@ A deployment can be configured to trust all or specific deployments in any envir 7. Configure the self-managed cluster to trust this deployment, so that both deployments are configured to trust each other: * Download the Certificate Authority used to sign the certificates of your deployment nodes (it can be found in the Security page of your deployment) - * Trust this CA either using the [setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md) `xpack.security.transport.ssl.certificate_authorities` in `elasticsearch.yml` or by [adding it to the trust store](../security/different-ca.md). + * Trust this CA either using the [setting](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md) `xpack.security.transport.ssl.certificate_authorities` in `elasticsearch.yml` or by [adding it to the trust store](../security/different-ca.md). 8. Generate certificates with an `otherName` attribute using the {{es}} certutil. Create a file called `instances.yaml` with all the details of the nodes in your on-premise cluster like below. The `dns` and `ip` settings are optional, but `cn` is mandatory for use with the `trust_restrictions` path setting in the next step. Next, run `./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 -in instances.yaml` to create new certificates for all the nodes at once. 
You can then copy the resulting files into each node. diff --git a/deploy-manage/remote-clusters/ece-remote-cluster-self-managed.md b/deploy-manage/remote-clusters/ece-remote-cluster-self-managed.md index 88b107c9e..346bd8de7 100644 --- a/deploy-manage/remote-clusters/ece-remote-cluster-self-managed.md +++ b/deploy-manage/remote-clusters/ece-remote-cluster-self-managed.md @@ -136,7 +136,7 @@ A deployment can be configured to trust all or specific deployments in any envir 7. Configure the self-managed cluster to trust this deployment, so that both deployments are configured to trust each other: * Download the Certificate Authority used to sign the certificates of your deployment nodes (it can be found in the Security page of your deployment) - * Trust this CA either using the [setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md) `xpack.security.transport.ssl.certificate_authorities` in `elasticsearch.yml` or by [adding it to the trust store](../security/different-ca.md). + * Trust this CA either using the [setting](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md) `xpack.security.transport.ssl.certificate_authorities` in `elasticsearch.yml` or by [adding it to the trust store](../security/different-ca.md). 8. Generate certificates with an `otherName` attribute using the {{es}} certutil. Create a file called `instances.yaml` with all the details of the nodes in your on-premise cluster like below. The `dns` and `ip` settings are optional, but `cn` is mandatory for use with the `trust_restrictions` path setting in the next step. Next, run `./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 -in instances.yaml` to create new certificates for all the nodes at once. You can then copy the resulting files into each node. diff --git a/deploy-manage/remote-clusters/eck-remote-clusters.md b/deploy-manage/remote-clusters/eck-remote-clusters.md index 10c4e0f98..8a002da7c 100644 --- a/deploy-manage/remote-clusters/eck-remote-clusters.md +++ b/deploy-manage/remote-clusters/eck-remote-clusters.md @@ -148,7 +148,7 @@ kubectl get secret cluster-one-es-transport-certs-public \ -o go-template='{{index .data "ca.crt" | base64decode}}' > remote.ca.crt ``` -You then need to configure the CA as one of the trusted CAs in `cluster-two`. If that cluster is hosted outside of Kubernetes, take the CA certificate that you have just extracted and add it to the list of CAs in [`xpack.security.transport.ssl.certificate_authorities`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#_pem_encoded_files_3). +You then need to configure the CA as one of the trusted CAs in `cluster-two`. If that cluster is hosted outside of Kubernetes, take the CA certificate that you have just extracted and add it to the list of CAs in [`xpack.security.transport.ssl.certificate_authorities`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#_pem_encoded_files_3). ::::{note} Beware of copying the source Secret as-is into a different namespace. Check [Common Problems: Owner References](../../troubleshoot/deployments/cloud-on-k8s/common-problems.md#k8s-common-problems-owner-refs) for more information. 
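Returning to the trust configuration above, here is a minimal sketch of how `cluster-two` (when self-managed outside of Kubernetes) might list the extracted CA alongside its existing one in `elasticsearch.yml`. The file paths are illustrative assumptions, not fixed locations:

```yaml
# elasticsearch.yml on each cluster-two node (paths are illustrative)
xpack.security.transport.ssl.certificate_authorities:
  - /etc/elasticsearch/certs/existing-transport-ca.crt  # CA cluster-two already trusts
  - /etc/elasticsearch/certs/remote.ca.crt              # CA extracted from cluster-one
```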
diff --git a/deploy-manage/remote-clusters/remote-clusters-api-key.md b/deploy-manage/remote-clusters/remote-clusters-api-key.md index 63e243673..9e9208114 100644 --- a/deploy-manage/remote-clusters/remote-clusters-api-key.md +++ b/deploy-manage/remote-clusters/remote-clusters-api-key.md @@ -14,7 +14,7 @@ All cross-cluster requests from the local cluster are bound by the API key’s p On the local cluster side, not every local user needs to access every piece of data allowed by the API key. An administrator of the local cluster can further configure additional permission constraints on local users so each user only gets access to the necessary remote data. Note it is only possible to further reduce the permissions allowed by the API key for individual local users. It is impossible to increase the permissions to go beyond what is allowed by the API key. -In this model, cross-cluster operations use [a dedicated server port](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (remote cluster interface) for communication between clusters. A remote cluster must enable this port for local clusters to connect. Configure Transport Layer Security (TLS) for this port to maximize security (as explained in [Establish trust with a remote cluster](#remote-clusters-security-api-key)). +In this model, cross-cluster operations use [a dedicated server port](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (remote cluster interface) for communication between clusters. A remote cluster must enable this port for local clusters to connect. Configure Transport Layer Security (TLS) for this port to maximize security (as explained in [Establish trust with a remote cluster](#remote-clusters-security-api-key)). The local cluster must trust the remote cluster on the remote cluster interface. This means that the local cluster trusts the remote cluster’s certificate authority (CA) that signs the server certificate used by the remote cluster interface. When establishing a connection, all nodes from the local cluster that participate in cross-cluster communication verify certificates from nodes on the other side, based on the TLS trust configuration. @@ -29,7 +29,7 @@ If you run into any issues, refer to [Troubleshooting](/troubleshoot/elasticsear ## Prerequisites [remote-clusters-prerequisites-api-key] -* The {{es}} security features need to be enabled on both clusters, on every node. Security is enabled by default. If it’s disabled, set `xpack.security.enabled` to `true` in `elasticsearch.yml`. Refer to [General security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings). +* The {{es}} security features need to be enabled on both clusters, on every node. Security is enabled by default. If it’s disabled, set `xpack.security.enabled` to `true` in `elasticsearch.yml`. Refer to [General security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings). * The nodes of the local and remote clusters must be on {{stack}} 8.14 or later. * The local and remote clusters must have an appropriate license. For more information, refer to [https://www.elastic.co/subscriptions](https://www.elastic.co/subscriptions). @@ -45,9 +45,9 @@ If a remote cluster is part of an {{ech}} deployment, it has a valid certificate 1. 
Enable the remote cluster server on every node of the remote cluster. In `elasticsearch.yml`: - 1. Set [`remote_cluster_server.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings) to `true`. - 2. Configure the bind and publish address for remote cluster server traffic, for example using [`remote_cluster.host`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings). Without configuring the address, remote cluster traffic may be bound to the local interface, and remote clusters running on other machines can’t connect. - 3. Optionally, configure the remote server port using [`remote_cluster.port`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (defaults to `9443`). + 1. Set [`remote_cluster_server.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings) to `true`. + 2. Configure the bind and publish address for remote cluster server traffic, for example using [`remote_cluster.host`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings). Without configuring the address, remote cluster traffic may be bound to the local interface, and remote clusters running on other machines can’t connect. + 3. Optionally, configure the remote server port using [`remote_cluster.port`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (defaults to `9443`). 2. Next, generate a certificate authority (CA) and a server certificate/key pair. On one of the nodes of the remote cluster, from the directory where {{es}} has been installed: @@ -81,7 +81,7 @@ If a remote cluster is part of an {{ech}} deployment, it has a valid certificate 4. If the remote cluster has multiple nodes, you can either: * create a single wildcard certificate for all nodes; - * or, create separate certificates for each node either manually or in batch with the [silent mode](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md#certutil-silent). + * or, create separate certificates for each node either manually or in batch with the [silent mode](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md#certutil-silent). 3. On every node of the remote cluster: @@ -139,7 +139,7 @@ You must have the `manage` cluster privilege to connect remote clusters. :::: -The local cluster uses the [remote cluster interface](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) to establish communication with remote clusters. The coordinating nodes in the local cluster establish [long-lived](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections) TCP connections with specific nodes in the remote cluster. {{es}} requires these connections to remain open, even if the connections are idle for an extended period. +The local cluster uses the [remote cluster interface](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) to establish communication with remote clusters. 
The coordinating nodes in the local cluster establish [long-lived](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections) TCP connections with specific nodes in the remote cluster. {{es}} requires these connections to remain open, even if the connections are idle for an extended period. To add a remote cluster from Stack Management in {{kib}}: @@ -206,7 +206,7 @@ The API response indicates that the local cluster is connected to the remote clu Use the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) to dynamically configure remote settings on every node in the cluster. The following request adds three remote clusters: `cluster_one`, `cluster_two`, and `cluster_three`. -The `seeds` parameter specifies the hostname and [remote cluster port](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) (default `9443`) of a seed node in the remote cluster. +The `seeds` parameter specifies the hostname and [remote cluster port](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) (default `9443`) of a seed node in the remote cluster. The `mode` parameter determines the configured connection mode, which defaults to [`sniff`](/deploy-manage/remote-clusters/remote-clusters-self-managed.md#sniff-mode). Because `cluster_one` doesn’t specify a `mode`, it uses the default. Both `cluster_two` and `cluster_three` explicitly use different modes. diff --git a/deploy-manage/remote-clusters/remote-clusters-cert.md index 54514c3cf..ba751510a 100644 --- a/deploy-manage/remote-clusters/remote-clusters-cert.md +++ b/deploy-manage/remote-clusters/remote-clusters-cert.md @@ -26,7 +26,7 @@ If you run into any issues, refer to [Troubleshooting](/troubleshoot/elasticsear ## Prerequisites [remote-clusters-prerequisites-cert] -1. The {{es}} security features need to be enabled on both clusters, on every node. Security is enabled by default. If it’s disabled, set `xpack.security.enabled` to `true` in `elasticsearch.yml`. Refer to [General security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings). +1. The {{es}} security features need to be enabled on both clusters, on every node. Security is enabled by default. If it’s disabled, set `xpack.security.enabled` to `true` in `elasticsearch.yml`. Refer to [General security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings). 2. The local and remote cluster versions must be compatible. :::{include} _snippets/remote-cluster-certificate-compatibility.md @@ -57,7 +57,7 @@ You must have the `manage` cluster privilege to connect remote clusters. :::: -The local cluster uses the [transport interface](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) to establish communication with remote clusters. The coordinating nodes in the local cluster establish [long-lived](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections) TCP connections with specific nodes in the remote cluster. {{es}} requires these connections to remain open, even if the connections are idle for an extended period.
+The local cluster uses the [transport interface](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) to establish communication with remote clusters. The coordinating nodes in the local cluster establish [long-lived](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections) TCP connections with specific nodes in the remote cluster. {{es}} requires these connections to remain open, even if the connections are idle for an extended period. To add a remote cluster from Stack Management in {{kib}}: @@ -122,7 +122,7 @@ The API response indicates that the local cluster is connected to the remote clu Use the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) to dynamically configure remote settings on every node in the cluster. The following request adds three remote clusters: `cluster_one`, `cluster_two`, and `cluster_three`. -The `seeds` parameter specifies the hostname and [transport port](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) (default `9300`) of a seed node in the remote cluster. +The `seeds` parameter specifies the hostname and [transport port](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) (default `9300`) of a seed node in the remote cluster. The `mode` parameter determines the configured connection mode, which defaults to [`sniff`](/deploy-manage/remote-clusters/remote-clusters-self-managed.md#sniff-mode). Because `cluster_one` doesn’t specify a `mode`, it uses the default. Both `cluster_two` and `cluster_three` explicitly use different modes. diff --git a/deploy-manage/remote-clusters/remote-clusters-migrate.md b/deploy-manage/remote-clusters/remote-clusters-migrate.md index fc36820df..7470c316f 100644 --- a/deploy-manage/remote-clusters/remote-clusters-migrate.md +++ b/deploy-manage/remote-clusters/remote-clusters-migrate.md @@ -42,9 +42,9 @@ On the remote cluster: 1. Enable the remote cluster server on every node of the remote cluster. In `elasticsearch.yml`: - 1. Set [`remote_cluster_server.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings) to `true`. - 2. Configure the bind and publish address for remote cluster server traffic, for example using [`remote_cluster.host`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings). Without configuring the address, remote cluster traffic may be bound to the local interface, and remote clusters running on other machines can’t connect. - 3. Optionally, configure the remote server port using [`remote_cluster.port`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (defaults to `9443`). + 1. Set [`remote_cluster_server.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings) to `true`. + 2. Configure the bind and publish address for remote cluster server traffic, for example using [`remote_cluster.host`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings). 
Without configuring the address, remote cluster traffic may be bound to the local interface, and remote clusters running on other machines can’t connect. + 3. Optionally, configure the remote server port using [`remote_cluster.port`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (defaults to `9443`). 2. Next, generate a certificate authority (CA) and a server certificate/key pair. On one of the nodes of the remote cluster, from the directory where {{es}} has been installed: @@ -78,7 +78,7 @@ On the remote cluster: 4. If the remote cluster has multiple nodes, you can either: * create a single wildcard certificate for all nodes; - * or, create separate certificates for each node either manually or in batch with the [silent mode](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md#certutil-silent). + * or, create separate certificates for each node either manually or in batch with the [silent mode](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md#certutil-silent). 3. On every node of the remote cluster: @@ -224,7 +224,7 @@ Resume any persistent tasks that you stopped earlier. Tasks should be restarted ## Disable certificate based authentication and authorization [remote-clusters-migration-disable-cert] -::::{note} +::::{note} Only proceed with this step if the migration has been proved successful on the local cluster. If the migration is unsuccessful, either [find out what the problem is and attempt to fix it](/troubleshoot/elasticsearch/remote-clusters.md) or [roll back](#remote-clusters-migration-rollback). :::: diff --git a/deploy-manage/remote-clusters/remote-clusters-settings.md b/deploy-manage/remote-clusters/remote-clusters-settings.md index acf9c3610..20bdabbf2 100644 --- a/deploy-manage/remote-clusters/remote-clusters-settings.md +++ b/deploy-manage/remote-clusters/remote-clusters-settings.md @@ -16,8 +16,8 @@ The following settings apply to both [sniff mode](/deploy-manage/remote-clusters `cluster.remote.initial_connect_timeout` : The time to wait for remote connections to be established when the node starts. The default is `30s`. -`remote_cluster_client` [role](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) -: By default, any node in the cluster can act as a cross-cluster client and connect to remote clusters. To prevent a node from connecting to remote clusters, specify the [node.roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) setting in `elasticsearch.yml` and exclude `remote_cluster_client` from the listed roles. Search requests targeting remote clusters must be sent to a node that is allowed to act as a cross-cluster client. Other features such as {{ml}} [data feeds](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md#general-ml-settings), [transforms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/transforms-settings.md#general-transform-settings), and [{{ccr}}](../tools/cross-cluster-replication/set-up-cross-cluster-replication.md) require the `remote_cluster_client` role. +`remote_cluster_client` [role](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) +: By default, any node in the cluster can act as a cross-cluster client and connect to remote clusters. 
To prevent a node from connecting to remote clusters, specify the [node.roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) setting in `elasticsearch.yml` and exclude `remote_cluster_client` from the listed roles. Search requests targeting remote clusters must be sent to a node that is allowed to act as a cross-cluster client. Other features such as {{ml}} [data feeds](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md#general-ml-settings), [transforms](elasticsearch://reference/elasticsearch/configuration-reference/transforms-settings.md#general-transform-settings), and [{{ccr}}](../tools/cross-cluster-replication/set-up-cross-cluster-replication.md) require the `remote_cluster_client` role. `cluster.remote.<cluster_alias>.skip_unavailable` : Per-cluster boolean setting that allows you to skip specific clusters when no nodes belonging to them are available and they are the target of a remote cluster request. @@ -28,13 +28,13 @@ In {{es}} 8.15, the default value for `skip_unavailable` was changed from `false `cluster.remote.<cluster_alias>.transport.ping_schedule` -: Sets the time interval between regular application-level ping messages that are sent to try and keep remote cluster connections alive. If set to `-1`, application-level ping messages to this remote cluster are not sent. If unset, application-level ping messages are sent according to the global `transport.ping_schedule` setting, which defaults to `-1` meaning that pings are not sent. It is preferable to correctly configure TCP keep-alives instead of configuring a `ping_schedule`, because TCP keep-alives are handled by the operating system and not by {{es}}. By default {{es}} enables TCP keep-alives on remote cluster connections. Remote cluster connections are transport connections so the `transport.tcp.*` [advanced settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) regarding TCP keep-alives apply to them. +: Sets the time interval between regular application-level ping messages that are sent to try and keep remote cluster connections alive. If set to `-1`, application-level ping messages to this remote cluster are not sent. If unset, application-level ping messages are sent according to the global `transport.ping_schedule` setting, which defaults to `-1` meaning that pings are not sent. It is preferable to correctly configure TCP keep-alives instead of configuring a `ping_schedule`, because TCP keep-alives are handled by the operating system and not by {{es}}. By default {{es}} enables TCP keep-alives on remote cluster connections. Remote cluster connections are transport connections so the `transport.tcp.*` [advanced settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) regarding TCP keep-alives apply to them. `cluster.remote.<cluster_alias>.transport.compress` -: Per-cluster setting that enables you to configure compression for requests to a specific remote cluster. The handling cluster will automatically compress responses to compressed requests. The setting options are `true`, `indexing_data`, and `false`. If unset, defaults to the behaviour specified by the node-wide `transport.compress` setting. See the [documentation for the `transport.compress` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings-compress) for further information.
+: Per-cluster setting that enables you to configure compression for requests to a specific remote cluster. The handling cluster will automatically compress responses to compressed requests. The setting options are `true`, `indexing_data`, and `false`. If unset, defaults to the behaviour specified by the node-wide `transport.compress` setting. See the [documentation for the `transport.compress` setting](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings-compress) for further information. `cluster.remote.<cluster_alias>.transport.compression_scheme` -: Per-cluster setting that enables you to configure the compression scheme for requests to a specific cluster if those requests are selected to be compressed by to the `cluster.remote..transport.compress` setting. The handling cluster will automatically use the same compression scheme for responses as for the corresponding requests. The setting options are `deflate` and `lz4`. If unset, defaults to the behaviour specified by the node-wide `transport.compression_scheme` setting. See the [documentation for the `transport.compression_scheme` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings-compression-scheme) for further information. +: Per-cluster setting that enables you to configure the compression scheme for requests to a specific cluster if those requests are selected for compression by the `cluster.remote.<cluster_alias>.transport.compress` setting. The handling cluster will automatically use the same compression scheme for responses as for the corresponding requests. The setting options are `deflate` and `lz4`. If unset, defaults to the behaviour specified by the node-wide `transport.compression_scheme` setting. See the [documentation for the `transport.compression_scheme` setting](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings-compression-scheme) for further information. $$$remote-cluster-credentials-setting$$$ diff --git a/deploy-manage/security/different-ca.md index 5e50eeaac..1b935d4a2 100644 --- a/deploy-manage/security/different-ca.md +++ b/deploy-manage/security/different-ca.md @@ -15,7 +15,7 @@ If you have to trust a new CA from your organization, or you need to generate a Create a new CA certificate, or get the CA certificate of your organization, and add it to your existing CA truststore. After you finish updating your certificates for all nodes, you can remove the old CA certificate from your truststore (but not before!). -::::{note} +::::{note} The following examples use PKCS#12 files, but the same steps apply to JKS keystores. :::: @@ -24,7 +24,7 @@ The following examples use PKCS#12 files, but the same steps apply to JKS keysto In this example, the keystore and truststore are using different files. Your configuration might use the same file for both the keystore and the truststore. - ::::{note} + ::::{note} These instructions assume that the provided certificate is signed by a trusted CA and the verification mode is set to `certificate`. This setting ensures that nodes do not attempt to perform hostname verification. :::: @@ -54,7 +54,7 @@ The following examples use PKCS#12 files, but the same steps apply to JKS keysto 1. Enter a name for the compressed output file that will contain your certificate and key, or accept the default name of `elastic-stack-ca.zip`. 2. Unzip the output file.
The resulting directory contains a CA certificate (`ca.crt`) and a private key (`ca.key`). - ::::{important} + ::::{important} Keep these files in a secure location as they contain the private key for your CA. :::: @@ -92,12 +92,12 @@ The following examples use PKCS#12 files, but the same steps apply to JKS keysto -### Generate a new certificate for each node in your cluster [node-certs-different-nodes] +### Generate a new certificate for each node in your cluster [node-certs-different-nodes] Now that your CA truststore is updated, use your new CA certificate to sign a certificate for your nodes. -::::{note} -If your organization has its own CA, you’ll need to [generate Certificate Signing Requests (CSRs)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md#certutil-csr). CSRs contain information that your CA uses to generate and sign a security certificate. +::::{note} +If your organization has its own CA, you’ll need to [generate Certificate Signing Requests (CSRs)](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md#certutil-csr). CSRs contain information that your CA uses to generate and sign a security certificate. :::: @@ -126,7 +126,7 @@ If your organization has its own CA, you’ll need to [generate Certificate Sign 3. Replace your existing keystore with the new keystore, ensuring that the file names match. For example, `elastic-certificates.p12`.
- ::::{important} + ::::{important} If your [keystore password is changing](same-ca.md#cert-password-updates), then save the keystore with a new filename so that {{es}} doesn’t attempt to reload the file before you update the password. :::: @@ -167,7 +167,7 @@ If your organization has its own CA, you’ll need to [generate Certificate Sign -### What’s next? [transport-layer-newca-whatsnext] +### What’s next? [transport-layer-newca-whatsnext] Well done! You’ve updated the keystore for the transport layer. You can also [update the keystore for the HTTP layer](#node-certs-different-http) if necessary. If you’re not updating the keystore for the HTTP layer, then you’re all set. @@ -176,8 +176,8 @@ Well done! You’ve updated the keystore for the transport layer. You can also [ You can generate certificates for the HTTP layer using your new CA certificate and private key. Other components such as {{kib}} or any of the Elastic language clients verify this certificate when they connect to {{es}}. -::::{note} -If your organization has its own CA, you’ll need to [generate Certificate Signing Requests (CSRs)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md#certutil-csr). CSRs contain information that your CA uses to generate and sign a security certificate instead of using self-signed certificates that the `elasticsearch-certutil` tool generates. +::::{note} +If your organization has its own CA, you’ll need to [generate Certificate Signing Requests (CSRs)](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md#certutil-csr). CSRs contain information that your CA uses to generate and sign a security certificate instead of using self-signed certificates that the `elasticsearch-certutil` tool generates. :::: @@ -245,7 +245,7 @@ This process is different for each client, so refer to your client’s documenta 6. Replace your existing keystore with the new keystore, ensuring that the file names match. For example, `node1-http.p12`.
:::: diff --git a/deploy-manage/security/same-ca.md b/deploy-manage/security/same-ca.md index e8ae052da..dc1379320 100644 --- a/deploy-manage/security/same-ca.md +++ b/deploy-manage/security/same-ca.md @@ -17,14 +17,14 @@ You don’t have to restart each node, but doing so forces new TLS connections a The following steps provide instructions for generating new node certificates and keys for both the transport layer and the HTTP layer. You might only need to replace one of these layer’s certificates depending on which of your certificates are expiring. -::::{important} +::::{important} :name: cert-password-updates If your keystore is password protected, the password is stored in the {{es}} secure settings, *and* the password needs to change, then you must perform a [rolling restart](../maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling) on your cluster. You must also use a different file name for the keystore so that {{es}} doesn’t reload the file before the node is restarted. :::: -::::{tip} +::::{tip} If your CA has changed, complete the steps in [update security certificates with a different CA](different-ca.md). :::: @@ -37,7 +37,7 @@ The following examples use PKCS#12 files, but the same steps apply to JKS keysto In this example, the keystore and truststore are pointing to different files. Your configuration might use the same file that contains the certificate and CA. In this case, include the path to that file for both the keystore and truststore. - ::::{note} + ::::{note} These instructions assume that the provided certificate is signed by a trusted CA and the verification mode is set to `certificate`. This setting ensures that nodes do not attempt to perform hostname verification. :::: @@ -79,7 +79,7 @@ The following examples use PKCS#12 files, but the same steps apply to JKS keysto 5. $$$replace-keystores$$$Replace your existing keystore with the new keystore, ensuring that the file names match. For example, `elastic-certificates.p12`. - ::::{important} + ::::{important} If your [keystore password is changing](#cert-password-updates), then save the keystore with a new filename so that {{es}} doesn’t attempt to reload the file before you update the password. :::: @@ -105,7 +105,7 @@ The following examples use PKCS#12 files, but the same steps apply to JKS keysto -### What’s next? [transport-layer-sameca-whatsnext] +### What’s next? [transport-layer-sameca-whatsnext] Well done! You’ve updated the keystore for the transport layer. You can also [update the keystore for the HTTP layer](#node-certs-same-http) if necessary. If you’re not updating the keystore for the HTTP layer, then you’re all set. @@ -114,8 +114,8 @@ Well done! You’ve updated the keystore for the transport layer. You can also [ Other components such as {{kib}} or any of the Elastic language clients verify this certificate when they connect to {{es}}. -::::{note} -If your organization has its own CA, you’ll need to [generate Certificate Signing Requests (CSRs)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md#certutil-csr). CSRs contain information that your CA uses to generate and sign a certificate. +::::{note} +If your organization has its own CA, you’ll need to [generate Certificate Signing Requests (CSRs)](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md#certutil-csr). CSRs contain information that your CA uses to generate and sign a certificate.
:::: @@ -175,7 +175,7 @@ If your organization has its own CA, you’ll need to [generate Certificate Sign 6. Replace your existing keystore with the new keystore, ensuring that the file names match. For example, `node1-http.p12`. - ::::{important} + ::::{important} If your [keystore password is changing](#cert-password-updates), then save the keystore with a new filename so that {{es}} doesn’t attempt to reload the file before you update the password. :::: diff --git a/deploy-manage/security/secure-endpoints.md b/deploy-manage/security/secure-endpoints.md index 9db6054c0..7757019e3 100644 --- a/deploy-manage/security/secure-endpoints.md +++ b/deploy-manage/security/secure-endpoints.md @@ -8,24 +8,24 @@ mapped_pages: Protecting your {{es}} cluster and the data it contains is of utmost importance. Implementing a defense in depth strategy provides multiple layers of security to help safeguard your system. The following principles provide a foundation for running {{es}} in a secure manner that helps to mitigate attacks on your system at multiple levels. -## Run {{es}} with security enabled [security-run-with-security] +## Run {{es}} with security enabled [security-run-with-security] Never run an {{es}} cluster without security enabled. This principle cannot be overstated. Running {{es}} without security leaves your cluster exposed to anyone who can send network traffic to {{es}}, permitting these individuals to download, modify, or delete any data in your cluster. [Start the {{stack}} with security enabled](../deploy/self-managed/installing-elasticsearch.md) or [manually configure security](manually-configure-security-in-self-managed-cluster.md) to prevent unauthorized access to your clusters and ensure that internode communication is secure. -## Run {{es}} with a dedicated non-root user [security-not-root-user] +## Run {{es}} with a dedicated non-root user [security-not-root-user] Never try to run {{es}} as the `root` user, which would invalidate any defense strategy and permit a malicious user to do **anything** on your server. You must create a dedicated, unprivileged user to run {{es}}. By default, the `rpm`, `deb`, `docker`, and Windows packages of {{es}} contain an `elasticsearch` user with this scope. -## Protect {{es}} from public internet traffic [security-protect-cluster-traffic] +## Protect {{es}} from public internet traffic [security-protect-cluster-traffic] Even with security enabled, never expose {{es}} to public internet traffic. Using an application to sanitize requests to {{es}} still poses risks, such as a malicious user writing [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-search) requests that could overwhelm an {{es}} cluster and bring it down. Keep {{es}} as isolated as possible, preferably behind a firewall and a VPN. Any internet-facing applications should run pre-canned aggregations, or not run aggregations at all. -While you absolutely shouldn’t expose {{es}} directly to the internet, you also shouldn’t expose {{es}} directly to users. Instead, use an intermediary application to make requests on behalf of users. This implementation allows you to track user behaviors, such as can submit requests, and to which specific nodes in the cluster. For example, you can implement an application that accepts a search term from a user and funnels it through a [`simple_query_string`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-simple-query-string-query.md) query. 
+While you absolutely shouldn’t expose {{es}} directly to the internet, you also shouldn’t expose {{es}} directly to users. Instead, use an intermediary application to make requests on behalf of users. This implementation allows you to track user behaviors, such as who can submit requests, and to which specific nodes in the cluster. For example, you can implement an application that accepts a search term from a user and funnels it through a [`simple_query_string`](elasticsearch://reference/query-languages/query-dsl-simple-query-string-query.md) query. -## Implement role based access control [security-create-appropriate-users] +## Implement role-based access control [security-create-appropriate-users] [Define roles](../users-roles/cluster-or-deployment-auth/defining-roles.md) for your users and [assign appropriate privileges](../users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md) to ensure that users have access only to the resources that they need. This process determines whether the user behind an incoming request is allowed to run that request. diff --git a/deploy-manage/security/security-certificates-keys.md b/deploy-manage/security/security-certificates-keys.md index 26a6512e7..6d41517ff 100644 --- a/deploy-manage/security/security-certificates-keys.md +++ b/deploy-manage/security/security-certificates-keys.md @@ -39,7 +39,7 @@ There are [some cases](../deploy/self-managed/installing-elasticsearch.md#stack- 2. Copy the generated `elastic` password and enrollment token. These credentials are only shown when you start {{es}} for the first time. ::::{note} - If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. + If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. :::: @@ -90,11 +90,11 @@ When {{es}} starts for the first time, the security auto-configuration process b Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically. -Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it.
Refer to [transport settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. +Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. To enroll new nodes in your cluster, create an enrollment token with the `elasticsearch-create-enrollment-token` tool on any existing node in your cluster. You can then start a new node with the `--enrollment-token` parameter so that it joins an existing cluster. -1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes. +1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes. ```sh bin/elasticsearch-create-enrollment-token -s node @@ -177,7 +177,7 @@ When you install {{es}}, the following certificates and keys are generated in th `transport.p12` : Keystore that contains the key and certificate for the transport layer for all the nodes in your cluster. -`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. +`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. Use the following command to retrieve the password for `http.p12`: @@ -228,11 +228,11 @@ The {{es}} configuration directory isn’t writable The following settings are incompatible with security auto configuration. If any of these settings exist, the node startup process skips configuring security automatically and the node starts normally. 
-* [`node.roles`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) is set to a value where the node can’t be elected as `master`, or if the node can’t hold data -* [`xpack.security.autoconfiguration.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) is set to `false` -* [`xpack.security.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) has a value set -* Any of the [`xpack.security.transport.ssl.*`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings) or [`xpack.security.http.ssl.*`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#http-tls-ssl-settings) settings have a value set in the `elasticsearch.yml` configuration file or in the `elasticsearch.keystore` -* Any of the `discovery.type`, `discovery.seed_hosts`, or `cluster.initial_master_nodes` [discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) have a value set +* [`node.roles`](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) is set to a value where the node can’t be elected as `master`, or if the node can’t hold data +* [`xpack.security.autoconfiguration.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) is set to `false` +* [`xpack.security.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) has a value set +* Any of the [`xpack.security.transport.ssl.*`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings) or [`xpack.security.http.ssl.*`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#http-tls-ssl-settings) settings have a value set in the `elasticsearch.yml` configuration file or in the `elasticsearch.keystore` +* Any of the `discovery.type`, `discovery.seed_hosts`, or `cluster.initial_master_nodes` [discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) have a value set ::::{note} Exceptions are when `discovery.type` is set to `single-node`, or when `cluster.initial_master_nodes` exists but contains only the name of the current node. diff --git a/deploy-manage/security/set-up-basic-security-plus-https.md b/deploy-manage/security/set-up-basic-security-plus-https.md index 3a11acfd3..b77f32be2 100644 --- a/deploy-manage/security/set-up-basic-security-plus-https.md +++ b/deploy-manage/security/set-up-basic-security-plus-https.md @@ -269,7 +269,7 @@ To send monitoring data securely, create a monitoring user and grant it the nece You can use the built-in `beats_system` user, if it’s available in your environment. Because the built-in users are not available in {{ecloud}}, these instructions create a user that is explicitly used for monitoring {{metricbeat}}. -1. 
If you’re using the built-in `beats_system` user, on any node in your cluster, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) utility to set the password for that user: +1. If you’re using the built-in `beats_system` user, on any node in your cluster, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) utility to set the password for that user: This command resets the password for the `beats_system` user to an auto-generated value. diff --git a/deploy-manage/security/set-up-basic-security.md b/deploy-manage/security/set-up-basic-security.md index e60f3aa30..173d367d9 100644 --- a/deploy-manage/security/set-up-basic-security.md +++ b/deploy-manage/security/set-up-basic-security.md @@ -11,7 +11,7 @@ mapped_pages: When you start {{es}} for the first time, passwords are generated for the `elastic` user and TLS is automatically configured for you. If you configure security manually *before* starting your {{es}} nodes, the auto-configuration process will respect your security configuration. You can adjust your TLS configuration at any time, such as [updating node certificates](updating-certificates.md). -::::{important} +::::{important} If your cluster has multiple nodes, then you must configure TLS between nodes. [Production mode](../deploy/self-managed/bootstrap-checks.md#dev-vs-prod-mode) clusters will not start if you do not enable TLS. :::: @@ -72,7 +72,7 @@ The transport networking layer is used for internal communication between nodes Now that you’ve generated a certificate authority and certificates, you’ll update your cluster to use these files. -::::{note} +::::{note} {{es}} monitors all files such as certificates, keys, keystores, or truststores that are configured as values of TLS-related node settings. If you update any of these files, such as when your hostnames change or your certificates are due to expire, {{es}} reloads them. The files are polled for changes at a frequency determined by the global {{es}} `resource.reload.interval.high` setting, which defaults to 5 seconds. :::: @@ -81,7 +81,7 @@ Complete the following steps **for each node in your cluster**. To join the same 1. Open the `$ES_PATH_CONF/elasticsearch.yml` file and make the following changes: - 1. Add the [`cluster-name`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name) setting and enter a name for your cluster: + 1. Add the [`cluster.name`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name) setting and enter a name for your cluster: ```yaml cluster.name: my-cluster @@ -105,7 +105,7 @@ Complete the following steps **for each node in your cluster**. To join the same xpack.security.transport.ssl.truststore.path: elastic-certificates.p12 ``` - 1. If you want to use hostname verification, set the verification mode to `full`. You should generate a different certificate for each host that matches the DNS or IP address. See the `xpack.security.transport.ssl.verification_mode` parameter in [TLS settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings). + 1. If you want to use hostname verification, set the verification mode to `full`. You should generate a different certificate for each host that matches the DNS or IP address.
See the `xpack.security.transport.ssl.verification_mode` parameter in [TLS settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings). 2. If you entered a password when creating the node certificate, run the following commands to store the password in the {{es}} keystore: @@ -122,7 +122,7 @@ Complete the following steps **for each node in your cluster**. To join the same For example, if you installed {{es}} with an archive distribution (`tar.gz` or `.zip`), you can enter `Ctrl+C` on the command line to stop {{es}}. - ::::{warning} + ::::{warning} You must perform a full cluster restart. Nodes that are configured to use TLS for transport cannot communicate with nodes that use an unencrypted transport connection (and vice-versa). :::: diff --git a/deploy-manage/security/set-up-minimal-security.md b/deploy-manage/security/set-up-minimal-security.md index 60c8c9461..a3dd85c8a 100644 --- a/deploy-manage/security/set-up-minimal-security.md +++ b/deploy-manage/security/set-up-minimal-security.md @@ -9,7 +9,7 @@ mapped_pages: # Set up minimal security [security-minimal-setup] -::::{important} +::::{important} You only need to complete the following steps if you’re running an existing, unsecured cluster and want to enable the {{es}} {{security-features}}. :::: @@ -18,7 +18,7 @@ In {{es}} 8.0 and later, security is [enabled automatically](../deploy/self-mana If you’re running an existing {{es}} cluster where security is disabled, you can manually enable the {{es}} {{security-features}} and then create passwords for built-in users. You can add more users later, but using the built-in users simplifies the process of enabling security for your cluster. -::::{important} +::::{important} The minimal security scenario is not sufficient for [production mode](../deploy/self-managed/bootstrap-checks.md#dev-vs-prod-mode) clusters. If your cluster has multiple nodes, you must enable minimal security and then [configure Transport Layer Security (TLS)](secure-cluster-communications.md) between nodes. :::: @@ -34,7 +34,7 @@ Enabling the {{es}} security features provides basic authentication so that you xpack.security.enabled: true ``` - ::::{note} + ::::{note} The `$ES_PATH_CONF` variable is the path for the {{es}} configuration files. If you installed {{es}} using archive distributions (`zip` or `tar.gz`), the variable defaults to `$ES_HOME/config`. If you used package distributions (Debian or RPM), the variable defaults to `/etc/elasticsearch`. :::: @@ -50,7 +50,7 @@ Enabling the {{es}} security features provides basic authentication so that you To communicate with your cluster, you must configure a password for the `elastic` and `kibana_system` built-in users. Unless you enable anonymous access (not recommended), all requests that don’t include credentials are rejected. -::::{note} +::::{note} You only need to set passwords for the `elastic` and `kibana_system` users when enabling minimal or basic security. :::: @@ -61,7 +61,7 @@ You only need to set passwords for the `elastic` and `kibana_system` users when ./bin/elasticsearch ``` -2. On any node in your cluster, open another terminal window and set the password for the `elastic` built-in user by running the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) utility. This command resets the password to an auto-generated value. +2.
On any node in your cluster, open another terminal window and set the password for the `elastic` built-in user by running the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) utility. This command resets the password to an auto-generated value. ```shell ./bin/elasticsearch-reset-password -u elastic @@ -98,7 +98,7 @@ This account is not meant for individual users and does not have permission to l elasticsearch.username: "kibana_system" ``` - ::::{note} + ::::{note} The `KBN_PATH_CONF` variable is the path for the {{kib}} configuration files. If you installed {{kib}} using archive distributions (`zip` or `tar.gz`), the variable defaults to `KBN_HOME/config`. If you used package distributions (Debian or RPM), the variable defaults to `/etc/kibana`. :::: diff --git a/deploy-manage/security/supported-ssltls-versions-by-jdk-version.md b/deploy-manage/security/supported-ssltls-versions-by-jdk-version.md index e96d8f6ca..4f024d3d3 100644 --- a/deploy-manage/security/supported-ssltls-versions-by-jdk-version.md +++ b/deploy-manage/security/supported-ssltls-versions-by-jdk-version.md @@ -93,9 +93,9 @@ jdk.tls.disabledAlgorithms=SSLv3, TLSv1, RC4, DES, MD5withRSA, \ ### Enable your custom security configuration [_enable_your_custom_security_configuration] -To enable your custom security policy, add a file in the [`jvm.options.d`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options) directory within your {{es}} configuration directory. +To enable your custom security policy, add a file in the [`jvm.options.d`](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options) directory within your {{es}} configuration directory. -To enable your custom security policy, create a file named `java.security.options` within the [jvm.options.d](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options) directory of your {{es}} configuration directory, with this content: +Specifically, create a file named `java.security.options` within the [jvm.options.d](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options) directory of your {{es}} configuration directory, with this content: ```text -Djava.security.properties=/path/to/your/es.java.security @@ -105,7 +105,7 @@ To enable your custom security policy, create a file named `java.security.option ## Enabling TLS versions in {{es}} [_enabling_tls_versions_in_es] -SSL/TLS versions can be enabled and disabled within {{es}} via the [`ssl.supported_protocols` settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings). +SSL/TLS versions can be enabled and disabled within {{es}} via the [`ssl.supported_protocols` settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings). {{es}} will only support the TLS versions that are enabled by the underlying JDK. If you configure `ssl.supported_protocols` to include a TLS version that is not enabled in your JDK, then it will be silently ignored.
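For illustration, here is a minimal sketch of what this can look like in `elasticsearch.yml`, restricting both the HTTP and transport layers to TLS 1.3 and TLS 1.2. The protocol values shown are assumptions for the example; use whichever versions your JDK and your compliance requirements allow:

```yaml
# Example only: restrict each SSL context to TLSv1.3 and TLSv1.2.
# Protocols that are not enabled in the underlying JDK are silently ignored.
xpack.security.http.ssl.supported_protocols: ["TLSv1.3", "TLSv1.2"]
xpack.security.transport.ssl.supported_protocols: ["TLSv1.3", "TLSv1.2"]
```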
diff --git a/deploy-manage/tools/cross-cluster-replication.md b/deploy-manage/tools/cross-cluster-replication.md index 694a65755..37ae19739 100644 --- a/deploy-manage/tools/cross-cluster-replication.md +++ b/deploy-manage/tools/cross-cluster-replication.md @@ -1,10 +1,10 @@ --- applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Cross-cluster replication [xpack-ccr] @@ -184,7 +184,7 @@ When you create a follower index, you cannot use it until it is fully initialize Remote recovery is a network intensive process that transfers all of the Lucene segment files from the leader cluster to the follower cluster. The follower requests that a recovery session be initiated on the primary shard in the leader cluster. The follower then requests file chunks concurrently from the leader. By default, the process concurrently requests five 1MB file chunks. This default behavior is designed to support leader and follower clusters with high network latency between them. ::::{tip} -You can modify dynamic [remote recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cross-cluster-replication-settings.md#ccr-recovery-settings) to rate-limit the transmitted data and manage the resources consumed by remote recoveries. +You can modify dynamic [remote recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/cross-cluster-replication-settings.md#ccr-recovery-settings) to rate-limit the transmitted data and manage the resources consumed by remote recoveries. :::: @@ -193,11 +193,11 @@ Use the [recovery API](https://www.elastic.co/docs/api/doc/elasticsearch/operati ## Replicating a leader requires soft deletes [ccr-leader-requirements] -{{ccr-cap}} works by replaying the history of individual write operations that were performed on the shards of the leader index. {{es}} needs to retain the [history of these operations](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/history-retention.md) on the leader shards so that they can be pulled by the follower shard tasks. The underlying mechanism used to retain these operations is *soft deletes*. +{{ccr-cap}} works by replaying the history of individual write operations that were performed on the shards of the leader index. {{es}} needs to retain the [history of these operations](elasticsearch://reference/elasticsearch/index-settings/history-retention.md) on the leader shards so that they can be pulled by the follower shard tasks. The underlying mechanism used to retain these operations is *soft deletes*. A soft delete occurs whenever an existing document is deleted or updated. By retaining these soft deletes up to configurable limits, the history of operations can be retained on the leader shards and made available to the follower shard tasks as it replays the history of operations. -The [`index.soft_deletes.retention_lease.period`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) setting defines the maximum time to retain a shard history retention lease before it is considered expired. This setting determines how long the cluster containing your follower index can be offline, which is 12 hours by default. If a shard copy recovers after its retention lease expires, but the missing operations are still available on the leader index, then {{es}} will establish a new lease and copy the missing operations. 
However {{es}} does not guarantee to retain unleased operations, so it is also possible that some of the missing operations have been discarded by the leader and are now completely unavailable. If this happens then the follower cannot recover automatically so you must [recreate it](cross-cluster-replication/ccr-recreate-follower-index.md). +The [`index.soft_deletes.retention_lease.period`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) setting defines the maximum time to retain a shard history retention lease before it is considered expired. This setting determines how long the cluster containing your follower index can be offline, which is 12 hours by default. If a shard copy recovers after its retention lease expires, but the missing operations are still available on the leader index, then {{es}} will establish a new lease and copy the missing operations. However, {{es}} does not guarantee to retain unleased operations, so it is also possible that some of the missing operations have been discarded by the leader and are now completely unavailable. If this happens, the follower cannot recover automatically, so you must [recreate it](cross-cluster-replication/ccr-recreate-follower-index.md). Soft deletes must be enabled for indices that you want to use as leader indices. Soft deletes are enabled by default on new indices created on or after {{es}} 7.0.0. @@ -221,13 +221,13 @@ The following sections provide more information about how to configure and use {{ccr-cap}} is designed to replicate user-generated indices only, and doesn’t currently replicate any of the following: -* [System indices](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#system-indices) +* [System indices](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#system-indices) * [Machine learning jobs](../../explore-analyze/machine-learning.md) * [Index templates](../../manage-data/data-store/templates.md) * [{{ilm-cap}}](../../manage-data/lifecycle/index-lifecycle-management.md) and [{{slm}}](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-slm) policies * [User permissions and role mappings](../users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md) * [Snapshot repository settings](snapshot-and-restore/self-managed.md) -* [Cluster settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) +* [Cluster settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) * [Searchable snapshot](snapshot-and-restore/searchable-snapshots.md) If you want to replicate any of this data, you must replicate it to a remote cluster manually.
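To make the soft-delete retention discussed above concrete, here is a minimal sketch that lengthens the retention lease period on a leader index, using the tutorial’s `kibana_sample_data_ecommerce` index as an assumed example. `index.soft_deletes.retention_lease.period` is a dynamic index setting, so it can be updated on an existing index:

```console
PUT /kibana_sample_data_ecommerce/_settings
{
  "index.soft_deletes.retention_lease.period": "24h"
}
```

This would allow the cluster containing the follower index to be offline for up to 24 hours instead of the default 12, at the cost of retaining more operation history on the leader shards.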
diff --git a/deploy-manage/tools/cross-cluster-replication/_failback_when_clustera_comes_back.md b/deploy-manage/tools/cross-cluster-replication/_failback_when_clustera_comes_back.md index 13e71bb2a..fe8022fd0 100644 --- a/deploy-manage/tools/cross-cluster-replication/_failback_when_clustera_comes_back.md +++ b/deploy-manage/tools/cross-cluster-replication/_failback_when_clustera_comes_back.md @@ -4,10 +4,10 @@ mapped_pages: applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Failback when clusterA comes back [_failback_when_clustera_comes_back] @@ -61,8 +61,8 @@ When `clusterA` comes back, `clusterB` becomes the new leader and `clusterA` bec GET kibana_sample_data_ecommerce/_search?q=kimchy ``` - ::::{tip} - If a soft delete is merged away before it can be replicated to a follower the following process will fail due to incomplete history on the leader, see [index.soft_deletes.retention_lease.period](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) for more details. + ::::{tip} + If a soft delete is merged away before it can be replicated to a follower, the following process will fail due to incomplete history on the leader. See [index.soft_deletes.retention_lease.period](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) for more details. :::: diff --git a/deploy-manage/tools/cross-cluster-replication/_perform_update_or_delete_by_query.md b/deploy-manage/tools/cross-cluster-replication/_perform_update_or_delete_by_query.md index f9fe7389a..4b7defb7c 100644 --- a/deploy-manage/tools/cross-cluster-replication/_perform_update_or_delete_by_query.md +++ b/deploy-manage/tools/cross-cluster-replication/_perform_update_or_delete_by_query.md @@ -4,10 +4,10 @@ mapped_pages: applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Perform update or delete by query [_perform_update_or_delete_by_query] @@ -53,8 +53,8 @@ It is possible to update or delete the documents but you can only perform these } ``` - ::::{tip} - If a soft delete is merged away before it can be replicated to a follower the following process will fail due to incomplete history on the leader, see [index.soft_deletes.retention_lease.period](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) for more details. + ::::{tip} + If a soft delete is merged away before it can be replicated to a follower, the following process will fail due to incomplete history on the leader. See [index.soft_deletes.retention_lease.period](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) for more details.
:::: diff --git a/deploy-manage/tools/cross-cluster-replication/ccr-getting-started-prerequisites.md b/deploy-manage/tools/cross-cluster-replication/ccr-getting-started-prerequisites.md index 27a0a5d49..31606ae5a 100644 --- a/deploy-manage/tools/cross-cluster-replication/ccr-getting-started-prerequisites.md +++ b/deploy-manage/tools/cross-cluster-replication/ccr-getting-started-prerequisites.md @@ -1,13 +1,13 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/ccr-getting-started-prerequisites.html - + applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Prerequisites [ccr-getting-started-prerequisites] @@ -17,5 +17,5 @@ To complete this tutorial, you need: * The `manage` cluster privilege on the local cluster. * A license on both clusters that includes {{ccr}}. [Activate a free 30-day trial](../../license/manage-your-license-in-self-managed-cluster.md). * An index on the remote cluster that contains the data you want to replicate. This tutorial uses the sample eCommerce orders data set. [Load sample data](../../../explore-analyze/index.md#gs-get-data-into-kibana). -* In the local cluster, all nodes with the `master` [node role](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) must also have the [`remote_cluster_client`](../../distributed-architecture/clusters-nodes-shards/node-roles.md#remote-node) role. The local cluster must also have at least one node with both a data role and the [`remote_cluster_client`](../../distributed-architecture/clusters-nodes-shards/node-roles.md#remote-node) role. Individual tasks for coordinating replication scale based on the number of data nodes with the `remote_cluster_client` role in the local cluster. +* In the local cluster, all nodes with the `master` [node role](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) must also have the [`remote_cluster_client`](../../distributed-architecture/clusters-nodes-shards/node-roles.md#remote-node) role. The local cluster must also have at least one node with both a data role and the [`remote_cluster_client`](../../distributed-architecture/clusters-nodes-shards/node-roles.md#remote-node) role. Individual tasks for coordinating replication scale based on the number of data nodes with the `remote_cluster_client` role in the local cluster. diff --git a/deploy-manage/tools/cross-cluster-replication/ccr-recreate-follower-index.md b/deploy-manage/tools/cross-cluster-replication/ccr-recreate-follower-index.md index 77cdf96fb..92f9ae5e0 100644 --- a/deploy-manage/tools/cross-cluster-replication/ccr-recreate-follower-index.md +++ b/deploy-manage/tools/cross-cluster-replication/ccr-recreate-follower-index.md @@ -4,21 +4,21 @@ mapped_pages: applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Recreate a follower index [ccr-recreate-follower-index] -When a document is updated or deleted, the underlying operation is retained in the Lucene index for a period of time defined by the [`index.soft_deletes.retention_lease.period`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) parameter. You configure this setting on the [leader index](../cross-cluster-replication.md#ccr-leader-requirements). 
+When a document is updated or deleted, the underlying operation is retained in the Lucene index for a period of time defined by the [`index.soft_deletes.retention_lease.period`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#ccr-index-soft-deletes-retention-period) parameter. You configure this setting on the [leader index](../cross-cluster-replication.md#ccr-leader-requirements). When a follower index starts, it acquires a retention lease from the leader index. This lease informs the leader that it should not allow a soft delete to be pruned until either the follower indicates that it has received the operation, or until the lease expires. If a follower index falls sufficiently behind a leader and cannot replicate operations, {{es}} reports an `indices[].fatal_exception` error. To resolve the issue, recreate the follower index. When the new follower index starts, the [remote recovery](../cross-cluster-replication.md#ccr-remote-recovery) process recopies the Lucene segment files from the leader. -::::{important} +::::{important} Recreating the follower index is a destructive action. All existing Lucene segment files are deleted on the cluster containing the follower index. :::: diff --git a/deploy-manage/tools/cross-cluster-replication/manage-auto-follow-patterns.md b/deploy-manage/tools/cross-cluster-replication/manage-auto-follow-patterns.md index 1c8816cb1..71c49f8ea 100644 --- a/deploy-manage/tools/cross-cluster-replication/manage-auto-follow-patterns.md +++ b/deploy-manage/tools/cross-cluster-replication/manage-auto-follow-patterns.md @@ -4,18 +4,18 @@ mapped_pages: applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Manage auto-follow patterns [ccr-auto-follow] To replicate time series indices, you configure an auto-follow pattern so that each new index in the series is replicated automatically. Whenever the name of a new index on the remote cluster matches the auto-follow pattern, a corresponding follower index is added to the local cluster. -::::{note} -Auto-follow patterns only match open indices on the remote cluster that have all primary shards started. Auto-follow patterns do not match indices that can’t be used for {{ccr-init}} such as [closed indices](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-open) or [{{search-snaps}}](../snapshot-and-restore/searchable-snapshots.md). Avoid using an auto-follow pattern that matches indices with a [read or write block](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md). These blocks prevent follower indices from replicating such indices. +::::{note} +Auto-follow patterns only match open indices on the remote cluster that have all primary shards started. Auto-follow patterns do not match indices that can’t be used for {{ccr-init}} such as [closed indices](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-open) or [{{search-snaps}}](../snapshot-and-restore/searchable-snapshots.md). Avoid using an auto-follow pattern that matches indices with a [read or write block](elasticsearch://reference/elasticsearch/index-settings/index-block.md). These blocks prevent follower indices from replicating such indices.
:::: diff --git a/deploy-manage/tools/snapshot-and-restore.md b/deploy-manage/tools/snapshot-and-restore.md index 962c75515..8fc3c3797 100644 --- a/deploy-manage/tools/snapshot-and-restore.md +++ b/deploy-manage/tools/snapshot-and-restore.md @@ -1,10 +1,10 @@ --- applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Snapshot and restore @@ -60,7 +60,7 @@ Use **Kibana** to manage your snapshots. In Kibana, you can: In **Elastic Cloud Enterprise**, you can also [restore snapshots](snapshot-and-restore/restore-snapshot.md) across clusters. :::: - + ::::{dropdown} Elastic Cloud on Kubernetes (ECK) On Elastic Cloud on Kubernetes, you must manually configure snapshot repositories. The system does not create **Snapshot Lifecycle Management (SLM) policies** or **automatic snapshots** by default. @@ -75,7 +75,7 @@ Snapshots back up only open indices. If you close an index, it is not included i By default, a snapshot of a cluster contains the cluster state, all regular data streams, and all regular indices. The cluster state includes: -- [Persistent cluster settings](/deploy-manage/deploy/self-managed/configure-elasticsearch.md#cluster-setting-types) +- [Persistent cluster settings](/deploy-manage/deploy/self-managed/configure-elasticsearch.md#cluster-setting-types) - [Index templates](/manage-data/data-store/templates.md) - [Legacy index templates](https://www.elastic.co/guide/en/elasticsearch/reference/8.17/indices-templates-v1.html) - [Ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) @@ -100,7 +100,7 @@ A **feature state** contains the indices and data streams used to store configur To retrieve a list of feature states, use the [Features API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-features-get-features). :::: -A feature state typically includes one or more [system indices or system data streams](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#system-indices). It may also include regular indices and data streams used by the feature. For example, a feature state may include a regular index that contains the feature’s execution history. Storing this history in a regular index lets you more easily search it. +A feature state typically includes one or more [system indices or system data streams](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#system-indices). It may also include regular indices and data streams used by the feature. For example, a feature state may include a regular index that contains the feature’s execution history. Storing this history in a regular index lets you more easily search it. In Elasticsearch 8.0 and later versions, feature states are the only way to back up and restore system indices and system data streams. @@ -147,7 +147,7 @@ You can’t restore an index to an earlier version of Elasticsearch. For example A compatible snapshot can contain indices created in an older incompatible version. For example, a snapshot of a 7.17 cluster can contain an index created in 6.8. Restoring the 6.8 index to an 8.17 cluster fails unless you can use the [archive functionality](/deploy-manage/upgrade/deployment-or-cluster/reading-indices-from-older-elasticsearch-versions.md). Keep this in mind if you take a snapshot before upgrading a cluster. -As a workaround, you can first restore the index to another cluster running the latest version of Elasticsearch that’s compatible with both the index and your current cluster. 
You can then use [reindex-from-remote](https://www.elastic.co/guide/en/elasticsearch/reference/8.17/docs-reindex.html#reindex-from-remote) to rebuild the index on your current cluster. Reindex from remote is only possible if the index’s [`_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) is enabled. +As a workaround, you can first restore the index to another cluster running the latest version of Elasticsearch that’s compatible with both the index and your current cluster. You can then use [reindex-from-remote](https://www.elastic.co/guide/en/elasticsearch/reference/8.17/docs-reindex.html#reindex-from-remote) to rebuild the index on your current cluster. Reindex from remote is only possible if the index’s [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) is enabled. Reindexing from remote can take significantly longer than restoring a snapshot. Before you start, test the reindex from remote process with a subset of the data to estimate your time requirements. diff --git a/deploy-manage/tools/snapshot-and-restore/azure-repository.md b/deploy-manage/tools/snapshot-and-restore/azure-repository.md index 4379a2b55..a747330d0 100644 --- a/deploy-manage/tools/snapshot-and-restore/azure-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/azure-repository.md @@ -1,7 +1,7 @@ --- applies_to: deployment: - self: + self: --- # Azure repository [repository-azure] @@ -101,7 +101,7 @@ The following list describes the available client settings. Those that must be s : A shared access signature (SAS) token, which the repository’s internal Azure client uses for authentication. The SAS token must have read (r), write (w), list (l), and delete (d) permissions for the repository base path and all its contents. These permissions must be granted for the blob service (b) and apply to resource types service (s), container (c), and object (o). Alternatively, use `key`. `azure.client.CLIENT_NAME.timeout` -: The client side timeout for any single request to Azure, as a [time unit](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units). For example, a value of `5s` specifies a 5 second timeout. There is no default value, which means that {{es}} uses the [default value](https://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.md#setTimeoutIntervalInMs(java.lang.Integer)) set by the Azure client. +: The client-side timeout for any single request to Azure, as a [time unit](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units). For example, a value of `5s` specifies a 5-second timeout. There is no default value, which means that {{es}} uses the [default value](https://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.md#setTimeoutIntervalInMs(java.lang.Integer)) set by the Azure client. `azure.client.CLIENT_NAME.endpoint` : The Azure endpoint to connect to. It must include the protocol used to connect to Azure. @@ -157,16 +157,16 @@ PUT _snapshot/my_backup `chunk_size` -: Big files can be broken down into multiple smaller blobs in the blob store during snapshotting. It is not recommended to change this value from its default unless there is an explicit reason for limiting the size of blobs in the repository.
Setting a value lower than the default can result in an increased number of API calls to the Azure blob store during snapshot create as well as restore operations compared to using the default value and thus make both operations slower as well as more costly. Specify the chunk size as a [byte unit](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units), for example: `10MB`, `5KB`, `500B`. Defaults to the maximum size of a blob in the Azure blob store which is `5TB`. +: Big files can be broken down into multiple smaller blobs in the blob store during snapshotting. It is not recommended to change this value from its default unless there is an explicit reason for limiting the size of blobs in the repository. Setting a lower value increases the number of API calls to the Azure blob store during both snapshot create and restore operations, making both operations slower and more costly. Specify the chunk size as a [byte unit](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units), for example: `10MB`, `5KB`, `500B`. Defaults to the maximum size of a blob in the Azure blob store, which is `5TB`. `compress` : When set to `true` metadata files are stored in compressed format. This setting doesn’t affect index files that are already compressed by default. Defaults to `true`. `max_restore_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `max_snapshot_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `readonly` : (Optional, Boolean) If `true`, the repository is read-only.
The cluster can retrieve and restore snapshots from the repository but not write to the repository or create snapshots in it. diff --git a/deploy-manage/tools/snapshot-and-restore/create-snapshots.md b/deploy-manage/tools/snapshot-and-restore/create-snapshots.md index 9696b86e6..a22737df6 100644 --- a/deploy-manage/tools/snapshot-and-restore/create-snapshots.md +++ b/deploy-manage/tools/snapshot-and-restore/create-snapshots.md @@ -1,10 +1,10 @@ --- applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Create snapshots [snapshots-take-snapshot] @@ -31,7 +31,7 @@ The guide also provides tips for creating dedicated cluster state snapshots and * You can only take a snapshot from a running cluster with an elected [master node](../../distributed-architecture/clusters-nodes-shards/node-roles.md#master-node-role). * A snapshot repository must be [registered](self-managed.md) and available to the cluster. -* The cluster’s global metadata must be readable. To include an index in a snapshot, the index and its metadata must also be readable. Ensure there aren’t any [cluster blocks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-read-only) or [index blocks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md) that prevent read access. +* The cluster’s global metadata must be readable. To include an index in a snapshot, the index and its metadata must also be readable. Ensure there aren’t any [cluster blocks](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-read-only) or [index blocks](elasticsearch://reference/elasticsearch/index-settings/index-block.md) that prevent read access. ## Considerations [create-snapshot-considerations] @@ -41,7 +41,7 @@ The guide also provides tips for creating dedicated cluster state snapshots and * Each snapshot is logically independent. You can delete a snapshot without affecting other snapshots. * Taking a snapshot can temporarily pause shard allocations. See [Snapshots and shard allocation](../snapshot-and-restore.md#snapshots-shard-allocation). * Taking a snapshot doesn’t block indexing or other requests. However, the snapshot won’t include changes made after the snapshot process starts. -* You can take multiple snapshots at the same time. The [`snapshot.max_concurrent_operations`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#snapshot-max-concurrent-ops) cluster setting limits the maximum number of concurrent snapshot operations. +* You can take multiple snapshots at the same time. The [`snapshot.max_concurrent_operations`](elasticsearch://reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#snapshot-max-concurrent-ops) cluster setting limits the maximum number of concurrent snapshot operations. * If you include a data stream in a snapshot, the snapshot also includes the stream’s backing indices and metadata. You can also include only specific backing indices in a snapshot. However, the snapshot won’t include the data stream’s metadata or its other backing indices. @@ -134,7 +134,7 @@ PUT _slm/policy/nightly-snapshots ``` 1. When to take snapshots, written in [Cron syntax](/explore-analyze/alerts-cases/watcher/schedule-types.md#schedule-cron). -2. Snapshot name. 
Supports [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names). To prevent naming conflicts, the policy also appends a UUID to each snapshot name. +2. Snapshot name. Supports [date math](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names). To prevent naming conflicts, the policy also appends a UUID to each snapshot name. 3. [Registered snapshot repository](self-managed.md) used to store the policy’s snapshots. 4. Data streams and indices to include in the policy’s snapshots. 5. If `true`, the policy’s snapshots include the cluster state. This also includes all feature states by default. To only include specific feature states, see [Back up a specific feature state](#back-up-specific-feature-state). @@ -157,7 +157,7 @@ The snapshot process runs in the background. To monitor its progress, see [Monit ### {{slm-init}} retention [slm-retention-task] -{{slm-init}} snapshot retention is a cluster-level task that runs separately from a policy’s snapshot schedule. To control when the {{slm-init}} retention task runs, configure the [`slm.retention_schedule`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#slm-retention-schedule) cluster setting. +{{slm-init}} snapshot retention is a cluster-level task that runs separately from a policy’s snapshot schedule. To control when the {{slm-init}} retention task runs, configure the [`slm.retention_schedule`](elasticsearch://reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#slm-retention-schedule) cluster setting. ```console PUT _cluster/settings @@ -186,7 +186,7 @@ A snapshot repository can safely scale to thousands of snapshots. However, to ma ## Manually create a snapshot [manually-create-snapshot] -To take a snapshot without an {{slm-init}} policy, use the [create snapshot API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-snapshot-create). The snapshot name supports [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names). +To take a snapshot without an {{slm-init}} policy, use the [create snapshot API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-snapshot-create). The snapshot name supports [date math](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names). ```console # PUT _snapshot/my_repository/ diff --git a/deploy-manage/tools/snapshot-and-restore/google-cloud-storage-repository.md b/deploy-manage/tools/snapshot-and-restore/google-cloud-storage-repository.md index 98e3d384c..56acaf0ec 100644 --- a/deploy-manage/tools/snapshot-and-restore/google-cloud-storage-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/google-cloud-storage-repository.md @@ -1,7 +1,7 @@ --- applies_to: deployment: - self: + self: --- # Google Cloud Storage repository [repository-gcs] @@ -192,10 +192,10 @@ The following settings are supported: : When set to `true` metadata files are stored in compressed format. This setting doesn’t affect index files that are already compressed by default. Defaults to `true`. `max_restore_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. 
Note that restores are also throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `max_snapshot_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `readonly` : (Optional, Boolean) If `true`, the repository is read-only. The cluster can retrieve and restore snapshots from the repository but not write to the repository or create snapshots in it. diff --git a/deploy-manage/tools/snapshot-and-restore/manage-snapshot-repositories.md b/deploy-manage/tools/snapshot-and-restore/manage-snapshot-repositories.md index 1bd5d6887..e62b8f6de 100644 --- a/deploy-manage/tools/snapshot-and-restore/manage-snapshot-repositories.md +++ b/deploy-manage/tools/snapshot-and-restore/manage-snapshot-repositories.md @@ -1,10 +1,10 @@ --- applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Manage snapshot repositories @@ -26,13 +26,13 @@ If you manage your own Elasticsearch cluster, you can use the following built-in Other repository types are available through official plugins: -* [Hadoop Distributed File System (HDFS)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/repository-hdfs.md) +* [Hadoop Distributed File System (HDFS)](elasticsearch://reference/elasticsearch-plugins/repository-hdfs.md) ### Elastic Cloud Hosted {{ech}} deployments automatically register a repository named `found-snapshots` in {{es}} clusters. These repositories are used together with the `cloud-snapshot-policy` SLM policy to take periodic snapshots of your {{es}} clusters. You can also use the `found-snapshots` repository for your own [SLM policies](/deploy-manage/tools/snapshot-and-restore/create-snapshots.md#automate-snapshots-slm) or to store searchable snapshots. -The `found-snapshots` repository is specific to each deployment. However, you can restore snapshots from another deployment’s found-snapshots repository if the deployments are under the same account and in the same region. 
+The `found-snapshots` repository is specific to each deployment. However, you can restore snapshots from another deployment’s found-snapshots repository if the deployments are under the same account and in the same region. Elastic Cloud Hosted deployments also support the following repository types: @@ -57,7 +57,7 @@ Elastic Cloud Enterprise installations support the following Elasticsearch snaps * [Minio](/deploy-manage/tools/snapshot-and-restore/minio-on-premise-repository.md) :::{note} -No repository types other than those listed are supported in the Elastic Cloud Enterprise platform, even if they are supported by Elasticsearch. +No repository types other than those listed are supported in the Elastic Cloud Enterprise platform, even if they are supported by Elasticsearch. ::: For more details, refer to [Managing snapshot repositories in Elastic Cloud Enterprise](/deploy-manage/tools/snapshot-and-restore/cloud-enterprise.md). diff --git a/deploy-manage/tools/snapshot-and-restore/minio-on-premise-repository.md b/deploy-manage/tools/snapshot-and-restore/minio-on-premise-repository.md index 07d523bc2..1d38dcaa1 100644 --- a/deploy-manage/tools/snapshot-and-restore/minio-on-premise-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/minio-on-premise-repository.md @@ -1,7 +1,7 @@ --- applies_to: deployment: - ece: + ece: --- # Minio on-premise repository [ece-configuring-minio] @@ -63,7 +63,7 @@ How you create the AWS S3 bucket depends on what version of Elasticsearch you ar * For version 7.x: 1. Using the Minio browser or an S3 client application, create an S3 bucket to store your snapshots. - 2. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md) and [add the S3 repository plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/cloud-enterprise/ece-add-plugins.md) to your cluster. + 2. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md) and [add the S3 repository plugin](elasticsearch://reference/elasticsearch-plugins/cloud-enterprise/ece-add-plugins.md) to your cluster. * For versions 8.0 and later, {{es}} has built-in support for AWS S3 repositories; no repository plugin is needed. Use the Minio browser or an S3 client application to create an S3 bucket to store your snapshots. diff --git a/deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md b/deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md index 61f822de5..71e2ca91e 100644 --- a/deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md @@ -1,12 +1,12 @@ --- applies_to: deployment: - self: + self: --- # Read-only URL repository [snapshots-read-only-repository] -::::{note} +::::{note} This repository type is only available if you run {{es}} on your own hardware. If you use {{ech}}, see [{{ech}} repository types](self-managed.md). :::: @@ -28,13 +28,13 @@ PUT _snapshot/my_read_only_url_repository ## Repository settings [read-only-url-repository-settings] `chunk_size` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum size of files in snapshots. In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to `null` (unlimited file size). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum size of files in snapshots. 
In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to `null` (unlimited file size). `http_max_retries` : (Optional, integer) Maximum number of retries for `http` and `https` URLs. Defaults to `5`. `http_socket_timeout` -: (Optional, [time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) Maximum wait time for data transfers over a connection. Defaults to `50s`. +: (Optional, [time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) Maximum wait time for data transfers over a connection. Defaults to `50s`. `compress` : (Optional, Boolean) If `true`, metadata files, such as index mappings and settings, are compressed in snapshots. Data files are not compressed. Defaults to `true`. @@ -43,10 +43,10 @@ PUT _snapshot/my_read_only_url_repository : (Optional, integer) Maximum number of snapshots the repository can contain. Defaults to `Integer.MAX_VALUE`, which is `2^31-1` or `2147483647`. `max_restore_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `max_snapshot_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `url` : (Required, string) URL location of the root of the shared filesystem repository. The following protocols are supported: @@ -57,7 +57,7 @@ PUT _snapshot/my_read_only_url_repository * `https` * `jar` -URLs using the `http`, `https`, or `ftp` protocols must be explicitly allowed with the [`repositories.url.allowed_urls`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#repositories-url-allowed) cluster setting. 
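For example, a minimal sketch of such an allow-list in `elasticsearch.yml`; the host and path here are placeholders, not recommendations:

```yaml
# Allow read-only URL repositories rooted at this hypothetical backup server.
repositories.url.allowed_urls: ["https://backups.example.com/snapshots/*"]
```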
This setting supports wildcards in the place of a host, path, query, or fragment in the URL. +URLs using the `http`, `https`, or `ftp` protocols must be explicitly allowed with the [`repositories.url.allowed_urls`](elasticsearch://reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#repositories-url-allowed) cluster setting. This setting supports wildcards in the place of a host, path, query, or fragment in the URL. URLs using the `file` protocol must point to the location of a shared filesystem accessible to all master and data nodes in the cluster. This location must be registered in the `path.repo` setting. You don’t need to register URLs using the `ftp`, `http`, `https`, or `jar` protocols in the `path.repo` setting. diff --git a/deploy-manage/tools/snapshot-and-restore/s3-repository.md b/deploy-manage/tools/snapshot-and-restore/s3-repository.md index d598fd2b6..318528dea 100644 --- a/deploy-manage/tools/snapshot-and-restore/s3-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/s3-repository.md @@ -1,14 +1,14 @@ ---- +--- applies_to: deployment: - self: + self: --- # S3 repository [repository-s3] You can use AWS S3 as a repository for [Snapshot/Restore](../snapshot-and-restore.md). -::::{note} +::::{note} If you are looking for a hosted solution of Elasticsearch on AWS, please visit [https://www.elastic.co/cloud/](https://www.elastic.co/cloud/). :::: @@ -106,7 +106,7 @@ The following list contains the available client settings. Those that must be st : The password to connect to the `proxy.host` with. `read_timeout` -: ([time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) The maximum time {{es}} will wait to receive the next byte of data over an established, open connection to the repository before it closes the connection. The default value is 50 seconds. +: ([time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) The maximum time {{es}} will wait to receive the next byte of data over an established, open connection to the repository before it closes the connection. The default value is 50 seconds. `max_connections` : The maximum number of concurrent connections to S3. The default value is `50`. @@ -120,7 +120,7 @@ The following list contains the available client settings. Those that must be st `path_style_access` : Whether to force the use of the path style access pattern. If `true`, the path style access pattern will be used. If `false`, the access pattern will be automatically determined by the AWS Java SDK (See [AWS documentation](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Builder.md#setPathStyleAccessEnabled-java.lang.Boolean-) for details). Defaults to `false`. -::::{note} +::::{note} :name: repository-s3-path-style-deprecation In versions `7.0`, `7.1`, `7.2` and `7.3` all bucket operations used the [now-deprecated](https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/) path style access pattern. If your deployment requires the path style access pattern then you should set this setting to `true` when upgrading. @@ -166,22 +166,22 @@ The following settings are supported: `base_path` : Specifies the path to the repository data within its bucket. Defaults to an empty string, meaning that the repository is at the root of the bucket. The value of this setting should not start or end with a `/`. 
- ::::{note} + ::::{note} Don’t set `base_path` when configuring a snapshot repository for {{ECE}}. {{ECE}} automatically generates the `base_path` for each deployment so that multiple deployments may share the same bucket. :::: `chunk_size` -: ([byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) The maximum size of object that {{es}} will write to the repository when creating a snapshot. Files which are larger than `chunk_size` will be chunked into several smaller objects. {{es}} may also split a file across multiple objects to satisfy other constraints such as the `max_multipart_parts` limit. Defaults to `5TB` which is the [maximum size of an object in AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.md). +: ([byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) The maximum size of object that {{es}} will write to the repository when creating a snapshot. Files which are larger than `chunk_size` will be chunked into several smaller objects. {{es}} may also split a file across multiple objects to satisfy other constraints such as the `max_multipart_parts` limit. Defaults to `5TB` which is the [maximum size of an object in AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.md). `compress` : When set to `true` metadata files are stored in compressed format. This setting doesn’t affect index files that are already compressed by default. Defaults to `true`. `max_restore_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `max_snapshot_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). 
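As a concrete illustration of how several of these settings combine, the following is a hedged sketch of registering an `s3` repository on a self-managed cluster; the bucket name, `base_path`, and rates are placeholder values, not recommendations:

```console
PUT _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-snapshot-bucket",
    "base_path": "snapshots/production",
    "chunk_size": "1gb",
    "max_snapshot_bytes_per_sec": "100mb",
    "max_restore_bytes_per_sec": "200mb"
  }
}
```

Note that `base_path` is shown here only for the self-managed case; as mentioned above, it should not be set for {{ECE}} deployments.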
`readonly` : (Optional, Boolean) If `true`, the repository is read-only. The cluster can retrieve and restore snapshots from the repository but not write to the repository or create snapshots in it. @@ -190,7 +190,7 @@ The following settings are supported: If `false`, the cluster can write to the repository and create snapshots in it. Defaults to `false`. - ::::{important} + ::::{important} If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository. :::: @@ -200,7 +200,7 @@ The following settings are supported: : When set to `true` files are encrypted on server side using AES256 algorithm. Defaults to `false`. `buffer_size` -: ([byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Minimum threshold below which the chunk is uploaded using a single request. Beyond this threshold, the S3 repository will use the [AWS Multipart Upload API](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.md) to split the chunk into several parts, each of `buffer_size` length, and to upload each part in its own request. Note that setting a buffer size lower than `5mb` is not allowed since it will prevent the use of the Multipart API and may result in upload errors. It is also not possible to set a buffer size greater than `5gb` as it is the maximum upload size allowed by S3. Defaults to `100mb` or `5%` of JVM heap, whichever is smaller. +: ([byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Minimum threshold below which the chunk is uploaded using a single request. Beyond this threshold, the S3 repository will use the [AWS Multipart Upload API](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.md) to split the chunk into several parts, each of `buffer_size` length, and to upload each part in its own request. Note that setting a buffer size lower than `5mb` is not allowed since it will prevent the use of the Multipart API and may result in upload errors. It is also not possible to set a buffer size greater than `5gb` as it is the maximum upload size allowed by S3. Defaults to `100mb` or `5%` of JVM heap, whichever is smaller. `max_multipart_parts` : (integer) The maximum number of parts that {{es}} will write during a multipart upload of a single object. Files which are larger than `buffer_size × max_multipart_parts` will be chunked into several smaller objects. {{es}} may also split a file across multiple objects to satisfy other constraints such as the `chunk_size` limit. Defaults to `10000` which is the [maximum number of parts in a multipart upload in AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.md). @@ -218,18 +218,18 @@ The following settings are supported: : (integer) Sets the maximum number of possibly-dangling multipart uploads to clean up in each batch of snapshot deletions. Defaults to `1000` which is the maximum number supported by the [AWS ListMultipartUploads API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListMultipartUploads.md). If set to `0`, {{es}} will not attempt to clean up dangling multipart uploads. 
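To make the chunking arithmetic above concrete: with a `buffer_size` of `100mb` (the default) and the default `max_multipart_parts` of `10000`, a single multipart upload can span at most 100mb × 10000 = 1,000,000mb, roughly 0.95TB, so larger files are split across several objects even though the default `chunk_size` of `5TB` would otherwise allow bigger ones.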
`throttled_delete_retry.delay_increment`
-: ([time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) This value is used as the delay before the first retry and the amount the delay is incremented by on each subsequent retry. Default is 50ms, minimum is 0ms.
+: ([time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) This value is used as the delay before the first retry and the amount the delay is incremented by on each subsequent retry. Default is 50ms, minimum is 0ms.
 
 `throttled_delete_retry.maximum_delay`
-: ([time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) This is the upper bound on how long the delays between retries will grow to. Default is 5s, minimum is 0ms.
+: ([time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) This is the upper bound on how long the delays between retries will grow to. Default is 5s, minimum is 0ms.
 
 `throttled_delete_retry.maximum_number_of_retries`
 : (integer) Sets the number of times to retry a throttled snapshot deletion. Defaults to `10`, minimum value is `0` which will disable retries altogether. Note that if retries are enabled in the S3 client, each of these retries comprises that many client-level retries.
 
 `get_register_retry_delay`
-: ([time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) Sets the time to wait before trying again if an attempt to read a [linearizable register](#repository-s3-linearizable-registers) fails. Defaults to `5s`.
+: ([time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) Sets the time to wait before trying again if an attempt to read a [linearizable register](#repository-s3-linearizable-registers) fails. Defaults to `5s`.
 
-::::{note} 
+::::{note}
 The option of defining client settings in the repository settings as documented below is considered deprecated, and will be removed in a future version.
 ::::
 
@@ -350,7 +350,7 @@ You may further restrict the permissions by specifying a prefix within the bucke
 
 The bucket needs to exist to register a repository for snapshots. If you did not create the bucket then the repository registration will fail.
 
-#### Using IAM roles for Kubernetes service accounts for authentication [iam-kubernetes-service-accounts] 
+#### Using IAM roles for Kubernetes service accounts for authentication [iam-kubernetes-service-accounts]
 
 If you want to use [Kubernetes service accounts](https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/) for authentication, you need to add a symlink to the `$AWS_WEB_IDENTITY_TOKEN_FILE` environment variable (which should be automatically set by a Kubernetes pod) in the S3 repository config directory, so the repository can read the service account token (a repository can’t read any files outside its config directory). For example:
 
 ```sh
 mkdir -p "${ES_PATH_CONF}/repository-s3"
 ln -s $AWS_WEB_IDENTITY_TOKEN_FILE "${ES_PATH_CONF}/repository-s3/aws-web-identity-token-file"
 ```
 
-::::{important} 
+::::{important}
 The symlink must be created on all data and master eligible nodes and be readable by the `elasticsearch` user. By default, {{es}} runs as user `elasticsearch` using uid:gid `1000:0`.
 ::::
@@ -388,7 +388,7 @@ You can perform some basic checks of the suitability of your storage system usin
 
 Most storage systems can be configured to log the details of their interaction with {{es}}. If you are investigating a suspected incompatibility with AWS S3, it is usually simplest to collect these logs and provide them to the supplier of your storage system for further analysis. If the incompatibility is not clear from the logs emitted by the storage system, configure {{es}} to log every request it makes to the S3 API by [setting the logging level](../../monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md#configuring-logging-levels) of the `com.amazonaws.request` logger to `DEBUG`.
 
-To prevent leaking sensitive information such as credentials and keys in logs, {{es}} rejects configuring this logger at high verbosity unless [insecure network trace logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#http-rest-request-tracer) is enabled. To do so, you must explicitly enable it on each node by setting the system property `es.insecure_network_trace_enabled` to `true`.
+To prevent leaking sensitive information such as credentials and keys in logs, {{es}} rejects configuring this logger at high verbosity unless [insecure network trace logging](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#http-rest-request-tracer) is enabled. To do so, you must explicitly enable it on each node by setting the system property `es.insecure_network_trace_enabled` to `true`.
 
 Once enabled, you can configure the `com.amazonaws.request` logger:
 
@@ -401,7 +401,7 @@ PUT /_cluster/settings
 }
 ```
 
-Collect the Elasticsearch logs covering the time period of the failed analysis from all nodes in your cluster and share them with the supplier of your storage system along with the analysis response so they can use them to determine the problem. See the [AWS Java SDK](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-../../monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md) documentation for further information, including details about other loggers that can be used to obtain even more verbose logs. When you have finished collecting the logs needed by your supplier, set the logger settings back to `null` to return to the default logging configuration and disable insecure network trace logging again. See [Logger](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-logger) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) for more information.
+Collect the Elasticsearch logs covering the time period of the failed analysis from all nodes in your cluster and share them with the supplier of your storage system along with the analysis response so they can use them to determine the problem. See the [AWS Java SDK](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-logging.html) documentation for further information, including details about other loggers that can be used to obtain even more verbose logs. When you have finished collecting the logs needed by your supplier, set the logger settings back to `null` to return to the default logging configuration and disable insecure network trace logging again.
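For example, a minimal sketch of that cleanup step, assuming the logger was configured as a persistent setting:

```console
PUT /_cluster/settings
{
  "persistent": {
    "logger.com.amazonaws.request": null
  }
}
```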
See [Logger](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-logger) and [Cluster update settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) for more information. ## Linearizable register implementation [repository-s3-linearizable-registers] diff --git a/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md b/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md index d1c02d8db..61853384e 100644 --- a/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md +++ b/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md @@ -1,10 +1,10 @@ --- applies_to: deployment: - eck: - ess: - ece: - self: + eck: + ess: + ece: + self: --- # Searchable snapshots [searchable-snapshots] @@ -22,13 +22,13 @@ By default, {{search-snap}} indices have no replicas. The underlying snapshot pr If a node fails and {{search-snap}} shards need to be recovered elsewhere, there is a brief window of time while {{es}} allocates the shards to other nodes where the cluster health will not be `green`. Searches that hit these shards may fail or return partial results until the shards are reallocated to healthy nodes. -You typically manage {{search-snaps}} through {{ilm-init}}. The [searchable snapshots](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) action automatically converts a regular index into a {{search-snap}} index when it reaches the `cold` or `frozen` phase. You can also make indices in existing snapshots searchable by manually mounting them using the [mount snapshot](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-searchable-snapshots-mount) API. +You typically manage {{search-snaps}} through {{ilm-init}}. The [searchable snapshots](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) action automatically converts a regular index into a {{search-snap}} index when it reaches the `cold` or `frozen` phase. You can also make indices in existing snapshots searchable by manually mounting them using the [mount snapshot](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-searchable-snapshots-mount) API. To mount an index from a snapshot that contains multiple indices, we recommend creating a [clone](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-snapshot-clone) of the snapshot that contains only the index you want to search, and mounting the clone. You should not delete a snapshot if it has any mounted indices, so creating a clone enables you to manage the lifecycle of the backup snapshot independently of any {{search-snaps}}. If you use {{ilm-init}} to manage your {{search-snaps}} then it will automatically look after cloning the snapshot as needed. You can control the allocation of the shards of {{search-snap}} indices using the same mechanisms as for regular indices. For example, you could use [Index-level shard allocation filtering](../../distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md) to restrict {{search-snap}} shards to a subset of your nodes. -The speed of recovery of a {{search-snap}} index is limited by the repository setting `max_restore_bytes_per_sec` and the node setting `indices.recovery.max_bytes_per_sec` just like a normal restore operation. 
By default `max_restore_bytes_per_sec` is unlimited, but the default for `indices.recovery.max_bytes_per_sec` depends on the configuration of the node. See [Recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings). +The speed of recovery of a {{search-snap}} index is limited by the repository setting `max_restore_bytes_per_sec` and the node setting `indices.recovery.max_bytes_per_sec` just like a normal restore operation. By default `max_restore_bytes_per_sec` is unlimited, but the default for `indices.recovery.max_bytes_per_sec` depends on the configuration of the node. See [Recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings). We recommend that you [force-merge](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) indices to a single segment per shard before taking a snapshot that will be mounted as a {{search-snap}} index. Each read from a snapshot repository takes time and costs money, and the fewer segments there are the fewer reads are needed to restore the snapshot or to respond to a search. @@ -46,7 +46,7 @@ Use any of the following repository types with searchable snapshots: * [AWS S3](s3-repository.md) * [Google Cloud Storage](google-cloud-storage-repository.md) * [Azure Blob Storage](azure-repository.md) -* [Hadoop Distributed File Store (HDFS)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/repository-hdfs.md) +* [Hadoop Distributed File Store (HDFS)](elasticsearch://reference/elasticsearch-plugins/repository-hdfs.md) * [Shared filesystems](shared-file-system-repository.md) such as NFS * [Read-only HTTP and HTTPS repositories](read-only-url-repository.md) @@ -99,7 +99,7 @@ Manually mounting snapshots captured by an Index Lifecycle Management ({{ilm-ini For optimal results, allow {{ilm-init}} to manage snapshots automatically. -[Learn more about {{ilm-init}} snapshot management](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md). +[Learn more about {{ilm-init}} snapshot management](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md). :::: @@ -107,10 +107,10 @@ For optimal results, allow {{ilm-init}} to manage snapshots automatically. $$$searchable-snapshots-shared-cache$$$ `xpack.searchable.snapshot.shared_cache.size` -: ([Static](../../deploy/self-managed/configure-elasticsearch.md#static-cluster-setting)) Disk space reserved for the shared cache of partially mounted indices. Accepts a percentage of total disk space or an absolute [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units). Defaults to `90%` of total disk space for dedicated frozen data tier nodes. Otherwise defaults to `0b`. +: ([Static](../../deploy/self-managed/configure-elasticsearch.md#static-cluster-setting)) Disk space reserved for the shared cache of partially mounted indices. Accepts a percentage of total disk space or an absolute [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units). Defaults to `90%` of total disk space for dedicated frozen data tier nodes. Otherwise defaults to `0b`. 
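The shared cache is consumed by partially mounted indices. As a rough sketch, such an index can be created with the mount snapshot API’s `shared_cache` storage option; the repository, snapshot, and index names below are placeholders:

```console
POST _snapshot/my_repository/my_snapshot/_mount?storage=shared_cache
{
  "index": "my-index",
  "renamed_index": "partial-my-index"
}
```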
`xpack.searchable.snapshot.shared_cache.size.max_headroom` -: ([Static](../../deploy/self-managed/configure-elasticsearch.md#static-cluster-setting), [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) For dedicated frozen tier nodes, the max headroom to maintain. If `xpack.searchable.snapshot.shared_cache.size` is not explicitly set, this setting defaults to `100GB`. Otherwise it defaults to `-1` (not set). You can only configure this setting if `xpack.searchable.snapshot.shared_cache.size` is set as a percentage. +: ([Static](../../deploy/self-managed/configure-elasticsearch.md#static-cluster-setting), [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) For dedicated frozen tier nodes, the max headroom to maintain. If `xpack.searchable.snapshot.shared_cache.size` is not explicitly set, this setting defaults to `100GB`. Otherwise it defaults to `-1` (not set). You can only configure this setting if `xpack.searchable.snapshot.shared_cache.size` is set as a percentage. To illustrate how these settings work in concert let us look at two examples when using the default values of the settings on a dedicated frozen node: diff --git a/deploy-manage/tools/snapshot-and-restore/self-managed.md b/deploy-manage/tools/snapshot-and-restore/self-managed.md index e2afe2f79..98f262509 100644 --- a/deploy-manage/tools/snapshot-and-restore/self-managed.md +++ b/deploy-manage/tools/snapshot-and-restore/self-managed.md @@ -3,7 +3,7 @@ navigation_title: "Self-managed" applies_to: deployment: - self: + self: --- # Manage snapshot repositories in self-managed deployments [snapshots-register-repository] @@ -23,7 +23,7 @@ In this guide, you’ll learn how to: * [Cluster privileges](../../users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-cluster): `monitor`, `manage_slm`, `cluster:admin/snapshot`, and `cluster:admin/repository` * [Index privilege](../../users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-indices): `all` on the `monitor` index -* To register a snapshot repository, the cluster’s global metadata must be writeable. Ensure there aren’t any [cluster blocks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-read-only) that prevent write access. +* To register a snapshot repository, the cluster’s global metadata must be writeable. Ensure there aren’t any [cluster blocks](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-read-only) that prevent write access. ## Considerations [snapshot-repo-considerations] @@ -59,7 +59,7 @@ If you manage your own {{es}} cluster, you can use the following built-in snapsh $$$snapshots-repository-plugins$$$ Other repository types are available through official plugins: -* [Hadoop Distributed File System (HDFS)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/repository-hdfs.md) +* [Hadoop Distributed File System (HDFS)](elasticsearch://reference/elasticsearch-plugins/repository-hdfs.md) You can also use alternative storage implementations with these repository types, as long as the alternative implementation is fully compatible. For instance, [MinIO](https://minio.io) provides an alternative implementation of the AWS S3 API and you can use MinIO with the [`s3` repository type](s3-repository.md). 
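As a hedged sketch of that MinIO setup (the endpoint, client name, and bucket are assumptions, not a tested recipe), you would point a named S3 client at the MinIO endpoint in `elasticsearch.yml` and then register an `s3` repository that uses it:

```yaml
# elasticsearch.yml: hypothetical S3 client aimed at a MinIO endpoint
s3.client.minio.endpoint: "minio.example.com:9000"
s3.client.minio.protocol: http
s3.client.minio.path_style_access: true
```

```console
PUT _snapshot/minio_repository
{
  "type": "s3",
  "settings": {
    "bucket": "es-snapshots",
    "client": "minio"
  }
}
```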
diff --git a/deploy-manage/tools/snapshot-and-restore/shared-file-system-repository.md b/deploy-manage/tools/snapshot-and-restore/shared-file-system-repository.md index 3861c7835..09f7450e4 100644 --- a/deploy-manage/tools/snapshot-and-restore/shared-file-system-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/shared-file-system-repository.md @@ -1,12 +1,12 @@ ---- +--- applies_to: deployment: - self: + self: --- # Shared file system repository [snapshots-filesystem-repository] -::::{note} +::::{note} This repository type is only available if you run {{es}} on your own hardware. See [Manage snapshot repositories](/deploy-manage/tools/snapshot-and-restore/manage-snapshot-repositories.md) for other deployment methods. :::: @@ -137,7 +137,7 @@ PUT _snapshot/my_fs_backup ## Repository settings [filesystem-repository-settings] `chunk_size` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum size of files in snapshots. In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to `null` (unlimited file size). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum size of files in snapshots. In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to `null` (unlimited file size). `compress` : (Optional, Boolean) If `true`, metadata files, such as index mappings and settings, are compressed in snapshots. Data files are not compressed. Defaults to `true`. @@ -149,10 +149,10 @@ PUT _snapshot/my_fs_backup : (Optional, integer) Maximum number of snapshots the repository can contain. Defaults to `Integer.MAX_VALUE`, which is `2^31-1` or `2147483647`. `max_restore_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `max_snapshot_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. 
Note that if the [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `readonly` : (Optional, Boolean) If `true`, the repository is read-only. The cluster can retrieve and restore snapshots from the repository but not write to the repository or create snapshots in it. @@ -161,7 +161,7 @@ PUT _snapshot/my_fs_backup If `false`, the cluster can write to the repository and create snapshots in it. Defaults to `false`. - ::::{important} + ::::{important} If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository. :::: @@ -180,7 +180,7 @@ If the verify repository or repository analysis APIs fail with an error indicati The verify repository and repository analysis APIs will also fail if the operating system returns any other kind of I/O error when accessing the repository. If this happens, address the cause of the I/O error reported by the operating system. -::::{tip} +::::{tip} Many NFS implementations match accounts across nodes using their *numeric* user IDs (UIDs) and group IDs (GIDs) rather than their names. It is possible for {{es}} to run under an account with the same name (often `elasticsearch`) on each node, but for these accounts to have different numeric user or group IDs. If your shared file system uses NFS then ensure that every node is running with the same numeric UID and GID, or else update your NFS configuration to account for the variance in numeric IDs across nodes. :::: diff --git a/deploy-manage/tools/snapshot-and-restore/source-only-repository.md b/deploy-manage/tools/snapshot-and-restore/source-only-repository.md index c2e5ddb59..ef6cdff0a 100644 --- a/deploy-manage/tools/snapshot-and-restore/source-only-repository.md +++ b/deploy-manage/tools/snapshot-and-restore/source-only-repository.md @@ -1,7 +1,7 @@ --- applies_to: deployment: - self: + self: --- # Source-only repository [snapshots-source-only-repository] @@ -12,7 +12,7 @@ Unlike other repository types, a source-only repository doesn’t directly store When you take a snapshot using a source-only repository, {{es}} creates a source-only snapshot in the delegated storage repository. This snapshot only contains stored fields and metadata. It doesn’t include index or doc values structures and isn’t immediately searchable when restored. To search the restored data, you first have to [reindex](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) it into a new data stream or index. -::::{important} +::::{important} Source-only snapshots are only supported if the `_source` field is enabled and no source-filtering is applied. As a result, indices adopting synthetic source cannot be restored. When you restore a source-only snapshot: * The restored index is read-only and can only serve `match_all` search or scroll requests to enable reindexing. 
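Because a restored source-only index can only serve `match_all` queries, the typical follow-up is to reindex it into a regular, fully searchable index; a minimal sketch with placeholder index names:

```console
POST _reindex
{
  "source": {
    "index": "restored-my-index"
  },
  "dest": {
    "index": "my-index-searchable"
  }
}
```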
@@ -39,7 +39,7 @@ PUT _snapshot/my_src_only_repository ## Repository settings [source-only-repository-settings] `chunk_size` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum size of files in snapshots. In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to `null` (unlimited file size). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum size of files in snapshots. In snapshots, files larger than this are broken down into chunks of this size or smaller. Defaults to `null` (unlimited file size). `compress` : (Optional, Boolean) If `true`, metadata files, such as index mappings and settings, are compressed in snapshots. Data files are not compressed. Defaults to `true`. @@ -54,10 +54,10 @@ PUT _snapshot/my_src_only_repository : (Optional, integer) Maximum number of snapshots the repository can contain. Defaults to `Integer.MAX_VALUE`, which is `2^31-1` or `2147483647`. `max_restore_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot restore rate per node. Defaults to unlimited. Note that restores are also throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `max_snapshot_bytes_per_sec` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Maximum snapshot creation rate per node. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md). `readonly` : (Optional, Boolean) If `true`, the repository is read-only. The cluster can retrieve and restore snapshots from the repository but not write to the repository or create snapshots in it. @@ -66,7 +66,7 @@ PUT _snapshot/my_src_only_repository If `false`, the cluster can write to the repository and create snapshots in it. Defaults to `false`. 
- ::::{important} + ::::{important} If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. Having multiple clusters write to the repository at the same time risks corrupting the contents of the repository. :::: diff --git a/deploy-manage/upgrade/deployment-or-cluster/reading-indices-from-older-elasticsearch-versions.md b/deploy-manage/upgrade/deployment-or-cluster/reading-indices-from-older-elasticsearch-versions.md index 4f5fa6dd5..f434b982f 100644 --- a/deploy-manage/upgrade/deployment-or-cluster/reading-indices-from-older-elasticsearch-versions.md +++ b/deploy-manage/upgrade/deployment-or-cluster/reading-indices-from-older-elasticsearch-versions.md @@ -12,30 +12,30 @@ The archive functionality provides slower read-only access to older {{es}} data, For this, {{es}} has the ability to access older snapshot repositories (going back to version 5). The legacy indices in the [snapshot repository](../../tools/snapshot-and-restore.md) can either be [restored](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-snapshot-restore), or can be directly accessed via [searchable snapshots](../../tools/snapshot-and-restore/searchable-snapshots.md) so that the archived data won’t even need to fully reside on local disks for access. -## Supported field types [archive-indices-supported-field-types] +## Supported field types [archive-indices-supported-field-types] Old mappings are imported as much "as-is" as possible into {{es}} 8, but only provide regular query capabilities on a select subset of fields: -* [Numeric types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) -* [`boolean` type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/boolean.md) -* [`ip` type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ip.md) -* [`geo_point` type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) -* [`date` types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md): the date `format` setting on date fields is supported as long as it behaves similarly across these versions. In case it is not, for example [when using custom date formats](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/migrate-to-java-time.html), this field can be updated on legacy indices so that it can be changed by a user if need be. -* [`keyword` type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type): the `normalizer` setting on keyword fields is supported as long as it behaves similarly across these versions. In case it is not, this field can be updated on legacy indices if need be. -* [`text` type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#text-field-type): scoring capabilities are limited, and all queries return constant scores that are equal to 1.0. The `analyzer` settings on text fields are supported as long as they behave similarly across these versions. In case they do not, they can be updated on legacy indices if need be. 
-* [Multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md) -* [Field aliases](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-alias.md) -* [`object`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) fields +* [Numeric types](elasticsearch://reference/elasticsearch/mapping-reference/number.md) +* [`boolean` type](elasticsearch://reference/elasticsearch/mapping-reference/boolean.md) +* [`ip` type](elasticsearch://reference/elasticsearch/mapping-reference/ip.md) +* [`geo_point` type](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) +* [`date` types](elasticsearch://reference/elasticsearch/mapping-reference/date.md): the date `format` setting on date fields is supported as long as it behaves similarly across these versions. If it does not, for example [when using custom date formats](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/migrate-to-java-time.html), this setting can be updated on legacy indices if need be. +* [`keyword` type](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type): the `normalizer` setting on keyword fields is supported as long as it behaves similarly across these versions. If it does not, this setting can be updated on legacy indices if need be. +* [`text` type](elasticsearch://reference/elasticsearch/mapping-reference/text.md#text-field-type): scoring capabilities are limited, and all queries return constant scores that are equal to 1.0. The `analyzer` settings on text fields are supported as long as they behave similarly across these versions. If they do not, they can be updated on legacy indices if need be. +* [Multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) +* [Field aliases](elasticsearch://reference/elasticsearch/mapping-reference/field-alias.md) +* [`object`](elasticsearch://reference/elasticsearch/mapping-reference/object.md) fields * some basic metadata fields, e.g. `_type` for querying {{es}} 5 indices * [runtime fields](../../../manage-data/data-store/mapping/map-runtime-field.md) -* [`_source` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) +* [`_source` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) {{es}} 5 indices with mappings that have [multiple mapping types](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html) are collapsed together on a best-effort basis before they are imported. -In case the auto-import of mappings does not work, or the new {{es}} version can’t make sense of the mapping, it falls back to importing the index without the mapping, but stores the original mapping in the [_meta](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-meta-field.md) section of the imported index. The legacy mapping can then be introspected using the [GET mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-mapping) API and an updated mapping can be manually put in place using the [update mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping) API, copying and adapting relevant sections of the legacy mapping to work with the current {{es}} version.
While auto-import is expected to work in most cases, failures of doing so should be [raised](https://github.com/elastic/elasticsearch/issues/new/choose) with the Elastic team for future improvements. +In case the auto-import of mappings does not work, or the new {{es}} version can’t make sense of the mapping, it falls back to importing the index without the mapping, but stores the original mapping in the [_meta](elasticsearch://reference/elasticsearch/mapping-reference/mapping-meta-field.md) section of the imported index. The legacy mapping can then be introspected using the [GET mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-mapping) API and an updated mapping can be manually put in place using the [update mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping) API, copying and adapting relevant sections of the legacy mapping to work with the current {{es}} version. While auto-import is expected to work in most cases, any failures should be [raised](https://github.com/elastic/elasticsearch/issues/new/choose) with the Elastic team for future improvements. -## Supported APIs [_supported_apis] +## Supported APIs [_supported_apis] Archive indices are read-only, and provide data access via the [search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [field capabilities](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-field-caps) APIs. They do not support the [Get API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-get) or any write APIs. @@ -44,7 +44,7 @@ Archive indices allow running queries as well as aggregations in so far as they Due to `_source` access, the data can also be [reindexed](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) to a new index that has full compatibility with the current {{es}} version. -## How to upgrade older {{es}} 5 or 6 clusters? [_how_to_upgrade_older_es_5_or_6_clusters] +## How to upgrade older {{es}} 5 or 6 clusters? [_how_to_upgrade_older_es_5_or_6_clusters] Take a snapshot of the indices in the old cluster, delete indices that are not directly supported by ES 8 (i.e. indices older than 7.0), upgrade the cluster without the old indices, and then [restore](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-snapshot-restore) the legacy indices from the snapshot or [mount](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-searchable-snapshots-mount) them via searchable snapshots. diff --git a/deploy-manage/upgrade/prepare-to-upgrade/index-compatibility.md b/deploy-manage/upgrade/prepare-to-upgrade/index-compatibility.md index cdeba0875..9f71d58fa 100644 --- a/deploy-manage/upgrade/prepare-to-upgrade/index-compatibility.md +++ b/deploy-manage/upgrade/prepare-to-upgrade/index-compatibility.md @@ -29,7 +29,7 @@ To upgrade to 9.0.0-beta1 from 7.16 or an earlier version, **you must first upgr ## REST API compatibility [upgrade-rest-api-compatibility] -[REST API compatibility](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/compatibility.md) is a per-request opt-in feature that can help REST clients mitigate non-compatible (breaking) changes to the REST API. +[REST API compatibility](elasticsearch://reference/elasticsearch/rest-apis/compatibility.md) is a per-request opt-in feature that can help REST clients mitigate non-compatible (breaking) changes to the REST API.
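On the wire, the opt-in is a pair of media-type headers. A minimal sketch (host, index name, and query are placeholders; this assumes a 9.x cluster being asked to honor 8.x conventions):

```sh
# Ask the cluster to interpret the request, and serialize the response,
# according to the previous major version's conventions for this one call.
curl -X GET "https://localhost:9200/my-index/_search" \
  -H "Content-Type: application/vnd.elasticsearch+json; compatible-with=8" \
  -H "Accept: application/vnd.elasticsearch+json; compatible-with=8" \
  -d '{"query": {"match_all": {}}}'
```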
## FIPS Compliance and Java 17 [upgrade-fips-java17] diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/authorization-plugins.md b/deploy-manage/users-roles/cluster-or-deployment-auth/authorization-plugins.md index 4d1d7c2c6..4fcda4a2e 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/authorization-plugins.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/authorization-plugins.md @@ -60,7 +60,7 @@ In order to register the security extension for your custom roles provider or au 1. Implement a plugin class that extends `org.elasticsearch.plugins.Plugin`. 2. Create a build configuration file for the plugin; Gradle is our recommendation. -3. Create a `plugin-descriptor.properties` file as described in [Help for plugin authors](asciidocalypse://docs/elasticsearch/docs/extend/index.md). +3. Create a `plugin-descriptor.properties` file as described in [Help for plugin authors](elasticsearch://extend/index.md). 4. Create a `META-INF/services/org.elasticsearch.xpack.core.security.SecurityExtension` descriptor file for the extension that contains the fully qualified class name of your `org.elasticsearch.xpack.core.security.SecurityExtension` implementation. 5. Bundle everything into a single zip file. diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-sm.md b/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-sm.md index 9c1f1b72d..4dcfabf8c 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-sm.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-sm.md @@ -11,11 +11,11 @@ navigation_title: Change passwords After you implement security, you might need or want to change passwords for different users. If you want to reset a password for a [built-in user](/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md) such as the `elastic` or `kibana_system` users, or a user in the [native realm](/deploy-manage/users-roles/cluster-or-deployment-auth/native.md), you can use the following tools: * The **Manage users** UI in {{kib}} -* The [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool +* The [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool * The [change passwords API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) :::{{tip}} -This topic describes resetting passwords after the initial bootstrap password is reset. To learn about the users that are used to communicate between {{stack}} components, and about managing bootstrap passwords for built-in users, refer to [](/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md). +This topic describes resetting passwords after the initial bootstrap password is reset. To learn about the users that are used to communicate between {{stack}} components, and about managing bootstrap passwords for built-in users, refer to [](/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md). ::: ## Using {{kib}} @@ -47,7 +47,7 @@ POST /_security/user/user1/_password ## Using the `user` API [native-users-api] -You can manage users through the Elasticsearch `user` API. +You can manage users through the Elasticsearch `user` API.
For example, you can change a user's password: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md b/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md index b9d1ce34c..c86ac8fc1 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md @@ -16,7 +16,7 @@ The built-in users serve specific purposes and are not intended for general use. :::: -::::{note} +::::{note} On {{ecloud}}, [operator privileges](/deploy-manage/users-roles/cluster-or-deployment-auth/operator-privileges.md) are enabled. These privileges restrict some infrastructure functionality, even if a role would otherwise permit a user to complete an administrative task. :::: @@ -45,14 +45,14 @@ The following built-in users are available: : The user {{metricbeat}} uses when collecting and storing monitoring information in {{es}}. It has the `remote_monitoring_agent` and `remote_monitoring_collector` built-in roles. -## How the built-in users work [built-in-user-explanation] +## How the built-in users work [built-in-user-explanation] These built-in users are stored in a special `.security` index, which is managed by {{es}}. If a built-in user is disabled or its password changes, the change is automatically reflected on each node in the cluster. If your `.security` index is deleted or restored from a snapshot, however, any changes you have applied are lost. Although they share the same API, the built-in users are separate and distinct from users managed by the [native realm](/deploy-manage/users-roles/cluster-or-deployment-auth/native.md). Disabling the native realm will not have any effect on the built-in users. The built-in users can be disabled individually, using the [disable users API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-disable-user). -## The Elastic bootstrap password [bootstrap-elastic-passwords] +## The Elastic bootstrap password [bootstrap-elastic-passwords] ```{{applies_to}} deployment: self: @@ -66,11 +66,11 @@ When you install {{es}}, if the `elastic` user does not already have a password, By default, the bootstrap password is derived from a randomized `keystore.seed` setting, which is added to the keystore during installation. You do not need to know or change this bootstrap password. If you have defined a `bootstrap.password` setting in the keystore, however, that value is used instead. For more information about interacting with the keystore, see [Secure settings](/deploy-manage/security/secure-settings.md). -::::{note} +::::{note} After you [set passwords for the built-in users](/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md#set-built-in-user-passwords), in particular for the `elastic` user, there is no further use for the bootstrap password. :::: -## Setting initial built-in user passwords [set-built-in-user-passwords] +## Setting initial built-in user passwords [set-built-in-user-passwords] ```{{applies_to}} deployment: self: @@ -78,7 +78,7 @@ deployment: You must set the passwords for all built-in users. You can set or reset passwords using several methods. 
-* Using `elasticsearch-setup-passwords` +* Using `elasticsearch-setup-passwords` * Using {{kib}} user management * Using the change password API @@ -96,15 +96,15 @@ The `elasticsearch-setup-passwords` tool is the simplest method to set the built bin/elasticsearch-setup-passwords interactive ``` -For more information about the command options, see [elasticsearch-setup-passwords](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/setup-passwords.md). +For more information about the command options, see [elasticsearch-setup-passwords](elasticsearch://reference/elasticsearch/command-line-tools/setup-passwords.md). -::::{important} +::::{important} After you set a password for the `elastic` user, the bootstrap password is no longer valid; you cannot run the `elasticsearch-setup-passwords` command a second time. :::: ### Using {{kib}} user management or the change password API -You can set the initial passwords for the built-in users by using the **Management > Users** page in {{kib}} or the [change password API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password). +You can set the initial passwords for the built-in users by using the **Management > Users** page in {{kib}} or the [change password API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password). To use these methods, you must supply the `elastic` user and its bootstrap password to log in to {{kib}} or run the API. This requirement means that you can't use the default bootstrap password that is derived from the `keystore.seed` setting. Instead, you must explicitly set a `bootstrap.password` setting in the keystore before you start {{es}}. For example, the following command prompts you to enter a new bootstrap password: @@ -112,13 +112,13 @@ To use these methods, you must supply the `elastic` user and its bootstrap passw bin/elasticsearch-keystore add "bootstrap.password" ``` -You can then start {{es}} and {{kib}} and use the `elastic` user and bootstrap password to log in to {{kib}} and change the passwords. +You can then start {{es}} and {{kib}} and use the `elastic` user and bootstrap password to log in to {{kib}} and change the passwords. ### Using the Change Password API Alternatively, you can submit Change Password API requests for each built-in user. These methods are better suited for changing your passwords after the initial setup is complete, since at that point the bootstrap password is no longer required. -## Adding built-in user passwords to {{kib}} [add-built-in-user-kibana] +## Adding built-in user passwords to {{kib}} [add-built-in-user-kibana] After the `kibana_system` user password is set, you need to update the {{kib}} server with the new password by setting `elasticsearch.password` in the `kibana.yml` configuration file: @@ -129,7 +129,7 @@ elasticsearch.password: kibanapassword See [Configuring security in {{kib}}](/deploy-manage/security.md). -## Adding built-in user passwords to {{ls}} [add-built-in-user-logstash] +## Adding built-in user passwords to {{ls}} [add-built-in-user-logstash] The `logstash_system` user is used internally within Logstash when monitoring is enabled for Logstash. @@ -148,7 +148,7 @@ PUT _security/user/logstash_system/_enable See [Configuring credentials for {{ls}} monitoring](asciidocalypse://docs/logstash/docs/reference/ingestion-tools/logstash/secure-connection.md#ls-monitoring-user). 
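If you prefer to set or rotate this password through the change passwords API instead of {{kib}}, a request of the following shape works (a minimal sketch; the password value is a placeholder, and the request must be made by a user with the `manage_security` privilege, such as `elastic`):

```sh
# Set a new password for the logstash_system built-in user
# (placeholder password; adjust the host and credentials to your cluster).
curl -X POST "https://localhost:9200/_security/user/logstash_system/_password" \
  -u elastic \
  -H "Content-Type: application/json" \
  -d '{"password": "logstashpassword"}'
```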
-## Adding built-in user passwords to Beats [add-built-in-user-beats] +## Adding built-in user passwords to Beats [add-built-in-user-beats] The `beats_system` user is used internally within Beats when monitoring is enabled for Beats. @@ -166,7 +166,7 @@ The `remote_monitoring_user` is used when {{metricbeat}} collects and stores mon If you have upgraded from an older version of {{es}}, then you may not have set a password for the `beats_system` or `remote_monitoring_user` users. If this is the case, then you should use the **Management > Users** page in {{kib}} or the [change password API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) to set a password for these users. -## Adding built-in user passwords to APM [add-built-in-user-apm] +## Adding built-in user passwords to APM [add-built-in-user-apm] The `apm_system` user is used internally within APM when monitoring is enabled. @@ -180,9 +180,9 @@ xpack.monitoring.elasticsearch.password: apmserverpassword If you have upgraded from an older version of {{es}}, then you may not have set a password for the `apm_system` user. If this is the case, then you should use the **Management > Users** page in {{kib}} or the [change password API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) to set a password for this user. -## Disabling default password functionality [disabling-default-password] +## Disabling default password functionality [disabling-default-password] -::::{important} +::::{important} This setting is deprecated. The `elastic` user no longer has a default password. The password must be set before the user can be used. See [The Elastic bootstrap password](/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md#bootstrap-elastic-passwords). :::: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-user-cache.md b/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-user-cache.md index 37955b46e..6f7029fe9 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-user-cache.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-user-cache.md @@ -3,8 +3,8 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/controlling-user-cache.html applies_to: deployment: - ess: - ece: + ess: + ece: eck: self: --- @@ -13,17 +13,17 @@ applies_to: User credentials are cached in memory on each node to avoid connecting to a remote authentication service or hitting the disk for every incoming request. You can configure characteristics of the user cache with the `cache.ttl`, `cache.max_users`, and `cache.hash_algo` realm settings. -::::{note} +::::{note} JWT realms use `jwt.cache.ttl` and `jwt.cache.size` realm settings. :::: -::::{note} +::::{note} PKI and JWT realms do not cache user credentials, but do cache the resolved user object to avoid unnecessarily needing to perform role mapping on each request. :::: -The cached user credentials are hashed in memory. By default, the {{es}} {{security-features}} use a salted `sha-256` hash algorithm. You can use a different hashing algorithm by setting the `cache.hash_algo` realm settings. See [User cache and password hash algorithms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings). +The cached user credentials are hashed in memory. By default, the {{es}} {{security-features}} use a salted `sha-256` hash algorithm.
You can use a different hashing algorithm by setting the `cache.hash_algo` realm setting. See [User cache and password hash algorithms](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings). ## Evicting users from the cache [cache-eviction-api] diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/jwt.md b/deploy-manage/users-roles/cluster-or-deployment-auth/jwt.md index 0c2ebd3e3..3202b6888 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/jwt.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/jwt.md @@ -72,13 +72,13 @@ Not every access token is formatted as a JSON Web Token (JWT). For it to be comp To use JWT authentication, create the realm in the `elasticsearch.yml` file to configure it within the {{es}} authentication chain. -The JWT realm has a few mandatory settings, plus optional settings that are described in [JWT realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings). +The JWT realm has a few mandatory settings, plus optional settings that are described in [JWT realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings). ::::{note} Client authentication is enabled by default for the JWT realms. Disabling client authentication is possible, but strongly discouraged. :::: -1. Configure the realm using your preferred token type: +1. Configure the realm using your preferred token type: :::::{tab-set} @@ -153,7 +153,7 @@ Client authentication is enabled by default for the JWT realms. Disabling client : Specifies a list of JWT subjects that the realm will allow. These values are typically URLs, UUIDs, or other case-sensitive string values. `allowed_subject_patterns` - : Analogous to `allowed_subjects` but it accepts a list of [Lucene regexp](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/regexp-syntax.md) and wildcards for the allowed JWT subjects. Wildcards use the `*` and `?` special characters (which are escaped by `\`) to mean "any string" and "any single character" respectively, for example "a?\**", matches "a1*" and "ab*whatever", but not "a", "abc", or "abc*" (in Java strings `\` must itself be escaped by another `\`). [Lucene regexp](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/regexp-syntax.md) must be enclosed between `/`, for example "/https?://[^/]+/?/" matches any http or https URL with no path component (matches "https://elastic.co/" but not "https://elastic.co/guide"). + : Analogous to `allowed_subjects` but it accepts a list of [Lucene regexps](elasticsearch://reference/query-languages/regexp-syntax.md) and wildcards for the allowed JWT subjects. Wildcards use the `*` and `?` special characters (which are escaped by `\`) to mean "any string" and "any single character" respectively, for example "a?\**" matches "a1*" and "ab*whatever", but not "a", "abc", or "abc*" (in Java strings `\` must itself be escaped by another `\`). A [Lucene regexp](elasticsearch://reference/query-languages/regexp-syntax.md) must be enclosed between `/`, for example "/https?://[^/]+/?/" matches any http or https URL with no path component (matches "https://elastic.co/" but not "https://elastic.co/guide"). At least one of the `allowed_subjects` or `allowed_subject_patterns` settings must be specified (and be non-empty) when `token_type` is `access_token`.
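With a realm like this in place, a client presents its JWT as a bearer token on each request, together with the shared secret while client authentication is enabled. A minimal sketch against the authenticate endpoint (the token and secret values are placeholders):

```sh
# The JWT travels in the Authorization header; the shared secret
# (required while client authentication is enabled) travels in the
# ES-Client-Authentication header.
curl -X GET "https://localhost:9200/_security/_authenticate" \
  -H "Authorization: Bearer <signed-jwt>" \
  -H "ES-Client-Authentication: sharedsecret <shared-secret>"
```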
@@ -172,13 +172,13 @@ Client authentication is enabled by default for the JWT realms. Disabling client 2. Add secure settings [to the {{es}} keystore](/deploy-manage/security/secure-settings.md): - * The `shared_secret` value for `client_authentication.type` - + * The `shared_secret` value for `client_authentication.type` + (`xpack.security.authc.realms.jwt.jwt1.client_authentication.shared_secret`) - * The HMAC keys for `allowed_signature_algorithms` - + * The HMAC keys for `allowed_signature_algorithms` + (`xpack.security.authc.realms.jwt.jwt1.hmac_jwkset`) - + This setting can be a path to a JWKS, which is a resource for a set of JSON-encoded secret keys. The file can be removed after you load the contents into the {{es}} keystore. @@ -208,7 +208,7 @@ Signature: UnnFmsoFKfNmKMsVoDQmKI_3-j95PCaKdgqqau3jPMY This example illustrates a partial decoding of a JWT. The validity period is from 2000 to 2099 (inclusive), as defined by the issue time (`iat`) and expiration time (`exp`). JWTs typically have a validity period shorter than 100 years, such as 1-2 hours or 1-7 days, not an entire human life. -The signature in this example is deterministic because the header, claims, and HMAC key are fixed. JWTs typically have a `nonce` claim to make the signature non-deterministic. The supported JWT encoding is JSON Web Signature (JWS), and the JWS `Header` and `Signature` are validated using OpenID Connect ID Token validation rules. Some validation is customizable through [JWT realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings). +The signature in this example is deterministic because the header, claims, and HMAC key are fixed. JWTs typically have a `nonce` claim to make the signature non-deterministic. The supported JWT encoding is JSON Web Signature (JWS), and the JWS `Header` and `Signature` are validated using OpenID Connect ID Token validation rules. Some validation is customizable through [JWT realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings). ### Header claims [jwt-validation-header] @@ -280,12 +280,12 @@ You can relax validation of any of the time-based claims by setting `allowed_clo ## Role mapping [jwt-authorization] -You can map LDAP groups to roles in the following ways: +You can map JWT users to roles in the following ways: * Using the role mappings page in {{kib}}. -* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). +* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). * By delegating authorization [to another realm](#jwt-authorization-delegation). - + For more information, see [Mapping users and groups to roles](/deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md). ::::{important} @@ -445,7 +445,7 @@ Separate reload requests cannot be combined if JWT signature failures trigger: * PKC JWKS reloads in the same {{es}} node at different times ::::{important} -Enabling client authentication (`client_authentication.type`) is strongly recommended. Only trusted client applications and realm-specific JWT users can trigger PKC reload attempts.
Additionally, configuring the following [JWT security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings) is recommended: +Enabling client authentication (`client_authentication.type`) is strongly recommended. Only trusted client applications and realm-specific JWT users can trigger PKC reload attempts. Additionally, configuring the following [JWT security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings) is recommended: * `allowed_audiences` * `allowed_clock_skew` @@ -497,7 +497,7 @@ xpack.security.authc.realms.jwt.jwt8.client_authentication.type: shared_secret ### JWT realm secure settings [_jwt_realm_secure_settings] -After defining the realm settings, use the [`elasticsearch-keystore`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool to add the following secure settings to the {{es}} keystore. In {{ecloud}}, you define settings for the {{es}} keystore under **Security** in your deployment. +After defining the realm settings, use the [`elasticsearch-keystore`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool to add the following secure settings to the {{es}} keystore. In {{ecloud}}, you define settings for the {{es}} keystore under **Security** in your deployment. ```yaml xpack.security.authc.realms.jwt.jwt8.hmac_key: hmac-oidc-key-string-for-hs256-algorithm diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/kerberos.md b/deploy-manage/users-roles/cluster-or-deployment-auth/kerberos.md index 6e28afabf..3da66e05a 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/kerberos.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/kerberos.md @@ -17,12 +17,12 @@ applies_to: You can configure the {{stack}} {{security-features}} to support Kerberos V5 authentication, an industry-standard protocol to authenticate users in {{es}} and {{kib}}. -::::{note} +::::{note} You can't use the Kerberos realm to authenticate on the transport network layer. :::: -To authenticate users with Kerberos, you need to configure a Kerberos realm and map users to roles. For more information on realm settings, see [Kerberos realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-kerberos-settings). +To authenticate users with Kerberos, you need to configure a Kerberos realm and map users to roles. For more information on realm settings, see [Kerberos realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-kerberos-settings). ## Key concepts [kerberos-terms] @@ -47,7 +47,7 @@ realm keytab : A file that stores pairs of principals and encryption keys. - ::::{important} + ::::{important} Anyone with read permissions to this file can use the credentials in the network to access other services, so it is important to protect it with proper file permissions. :::: @@ -69,13 +69,13 @@ In Kerberos, users authenticate with an authentication service and later with a Before you set up a Kerberos realm, you must have the Kerberos infrastructure set up in your environment. -::::{note} +::::{note} Kerberos requires a lot of external services to function properly, such as time synchronization between all machines and working forward and reverse DNS mappings in your domain. Refer to your Kerberos documentation for more details.
:::: These instructions do not cover setting up and configuring your Kerberos deployment. Where examples are provided, they pertain to an MIT Kerberos V5 deployment. For more information, see the [MIT Kerberos documentation](http://web.mit.edu/kerberos/www/index.html). -If you're using a self-managed cluster, then perform the following additional steps: +If you're using a self-managed cluster, then perform the following additional steps: * Enable TLS for HTTP. @@ -83,7 +83,7 @@ If you're using a self-managed cluster, then perform the following additional st This step is necessary to support Kerberos authentication through {{kib}}. It is not required for Kerberos authentication directly against the {{es}} REST API. - If you started {{es}} [with security enabled](/deploy-manage/deploy/self-managed/installing-elasticsearch.md), then TLS is already enabled for HTTP. + If you started {{es}} [with security enabled](/deploy-manage/deploy/self-managed/installing-elasticsearch.md), then TLS is already enabled for HTTP. {{ech}}, {{ece}}, and {{eck}} have TLS enabled by default. @@ -110,7 +110,7 @@ To configure a Kerberos realm in {{es}}: * `krb5.conf`: The Kerberos configuration file (`krb5.conf`) provides information such as the default realm, the Key Distribution Center (KDC), and other configuration details required for Kerberos authentication. For more information, see [krb5.conf](https://web.mit.edu/kerberos/krb5-latest/doc/admin/conf_files/krb5_conf.html). * `keytab`: A keytab is a file that stores pairs of principals and encryption keys. {{es}} uses the keys from the keytab to decrypt the tickets presented by the user. You must create a keytab for {{es}} by using the tools provided by your Kerberos implementation. For example, some tools that create keytabs are `ktpass.exe` on Windows and `kadmin` for MIT Kerberos. - + The configuration requirements depend on your Kerberos setup. Refer to your Kerberos documentation to configure the `krb5.conf` file. For more information on Java GSS, see [Java GSS Kerberos requirements](https://docs.oracle.com/javase/10/security/kerberos-requirements1.htm). @@ -119,34 +119,34 @@ For more information on Java GSS, see [Java GSS Kerberos requirements](https://d The way that you provide Kerberos config files to {{es}} depends on your deployment method. -For detailed information of available realm settings, see [Kerberos realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-kerberos-settings). +For detailed information about available realm settings, see [Kerberos realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-kerberos-settings). :::::{tab-set} ::::{tab-item} Self-managed 1. Configure the JVM to find the Kerberos configuration file. - - {{es}} uses Java GSS and JAAS Krb5LoginModule to support Kerberos authentication using a Simple and Protected GSSAPI Negotiation (SPNEGO) mechanism. When the JVM needs some configuration properties, it tries to find those values by locating and loading the `krb5.conf` file. The JVM system property to configure the file path is `java.security.krb5.conf`. To configure JVM system properties see [Set JVM options](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options). If this system property is not specified, Java tries to locate the file based on the conventions.
- - :::{tip} + + {{es}} uses Java GSS and JAAS Krb5LoginModule to support Kerberos authentication using a Simple and Protected GSSAPI Negotiation (SPNEGO) mechanism. When the JVM needs some configuration properties, it tries to find those values by locating and loading the `krb5.conf` file. The JVM system property to configure the file path is `java.security.krb5.conf`. To configure JVM system properties see [Set JVM options](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options). If this system property is not specified, Java tries to locate the file based on the conventions. + + :::{tip} It is recommended that this system property be configured for {{es}}. The method for setting this property depends on your Kerberos infrastructure. Refer to your Kerberos documentation for more details. ::: For more information, see [krb5.conf](https://web.mit.edu/kerberos/krb5-latest/doc/admin/conf_files/krb5_conf.html). 2. Put the keytab file in the {{es}} configuration directory. - + Make sure that this keytab file has read permissions. This file contains credentials, therefore you must take appropriate measures to protect it. - - :::{important} + + :::{important} {{es}} uses Kerberos on the HTTP network layer, therefore there must be a keytab file for the HTTP service principal on every {{es}} node. The service principal name must have the format `HTTP/es.domain.local@ES.DOMAIN.LOCAL`. The keytab files are unique for each node since they include the hostname. An {{es}} node can act as any principal a client requests as long as that principal and its credentials are found in the configured keytab. ::: 3. Create a Kerberos realm. - + To enable Kerberos authentication in {{es}}, you must add a Kerberos realm in the realm chain. - + :::{note} You can configure only one Kerberos realm on {{es}} nodes. ::: @@ -154,7 +154,7 @@ For detailed information of available realm settings, see [Kerberos realm settin To configure a Kerberos realm, there are a few mandatory realm settings and other optional settings that you need to configure in the `elasticsearch.yml` configuration file. Add a realm configuration under the `xpack.security.authc.realms.kerberos` namespace. The most common configuration for a Kerberos realm is as follows: - + ```yaml xpack.security.authc.realms.kerberos.kerb1: order: 3 @@ -166,7 +166,7 @@ For detailed information of available realm settings, see [Kerberos realm settin ::::{tab-item} ECH and ECE -1. Create a [custom bundle](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/cloud-enterprise/ece-add-plugins.md) that contains your `krb5.conf` and `keytab` files, and add it to your cluster. +1. Create a [custom bundle](elasticsearch://reference/elasticsearch-plugins/cloud-enterprise/ece-add-plugins.md) that contains your `krb5.conf` and `keytab` files, and add it to your cluster. :::{tip} You should use these exact filenames for {{ecloud}} to recognize the file in the bundle. @@ -191,13 +191,13 @@ For detailed information of available realm settings, see [Kerberos realm settin 1. Install your `krb5.conf` and `keytab` files as a [custom configuration files](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret). Mount them in a sub-directory of the main config directory, for example `/usr/share/elasticsearch/config/kerberos`, and use a `Secret` instead of a `ConfigMap` to store the information. 2. Configure the JVM to find the Kerberos configuration file. 
- + {{es}} uses Java GSS and JAAS Krb5LoginModule to support Kerberos authentication using a Simple and Protected GSSAPI Negotiation (SPNEGO) mechanism. When the JVM needs some configuration properties, it tries to find those values by locating and loading the `krb5.conf` file. The JVM system property to configure the file path is `java.security.krb5.conf`. If this system property is not specified, Java tries to locate the file based on the conventions. To provide JVM setting overrides to your cluster: 1. Create a new ConfigMap with a valid JVM options file as the key. The source file should be a JVM `.options` file containing the JVM system property `-Djava.security.krb5.conf=/usr/share/elasticsearch/config/kerberos/krb5.conf`, assuming the `krb5.conf` file was mounted on `/usr/share/elasticsearch/config/kerberos` in the previous step. - + ``` # create a configmap with a key named override.options and the content of your local file kubectl create configmap jvm-options --from-file=override.options= @@ -209,7 +209,7 @@ For detailed information of available realm settings, see [Kerberos realm settin apiVersion: elasticsearch.k8s.elastic.co/v1 kind: Elasticsearch metadata: - name: test-cluster + name: test-cluster spec: version: 8.17.0 nodeSets: @@ -253,12 +253,12 @@ The `username` is extracted from the ticket presented by the user and usually ha ## Map Kerberos users to roles -The `kerberos` realm enables you to map Kerberos users to roles. +The `kerberos` realm enables you to map Kerberos users to roles. -You can map these users to roles in multiple ways: +You can map these users to roles in multiple ways: * Using the role mappings page in {{kib}}. -* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). +* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). You identify users by their `username` field. @@ -275,14 +275,14 @@ POST /_security/role_mapping/kerbrolemapping } ``` -In case you want to support Kerberos cross realm authentication, you may need to map roles based on the Kerberos realm name. For such scenarios, the following additional user metadata can be used for role mapping: +In case you want to support Kerberos cross realm authentication, you may need to map roles based on the Kerberos realm name. For such scenarios, the following additional user metadata can be used for role mapping: -- `kerberos_realm`: The Kerberos realm name. +- `kerberos_realm`: The Kerberos realm name. - `kerberos_user_principal_name`: The user principal name from the Kerberos ticket. For more information, see [Mapping users and groups to roles](/deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md). -::::{note} +::::{note} The Kerberos realm supports [authorization realms](/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md#authorization_realms) as an alternative to role mapping. :::: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md b/deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md index 67001f840..b507faca2 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md @@ -18,7 +18,7 @@ You can configure the {{stack}} {{security-features}} to communicate with a Ligh To integrate with LDAP, you configure an `ldap` realm and map LDAP groups to user roles. 
:::{{tip}} -This topic describes implementing LDAP at the cluster or deployment level, for the purposes of authenticating with {{es}} and {{kib}}. +This topic describes implementing LDAP at the cluster or deployment level, for the purposes of authenticating with {{es}} and {{kib}}. You can also configure an {{ece}} installation to use an LDAP server to authenticate users. [Learn more](/deploy-manage/users-roles/cloud-enterprise-orchestrator/ldap.md). ::: @@ -31,7 +31,7 @@ The path to an entry is a *Distinguished Name* (DN) that uniquely identifies a u The `ldap` realm supports two modes of operation, a user search mode and a mode with specific templates for user DNs. -::::{important} +::::{important} When you configure realms in `elasticsearch.yml`, only the realms you specify are used for authentication. If you also want to use the `native` or `file` realms, you must include them in the realm chain. :::: @@ -44,13 +44,13 @@ The `ldap` realm supports two modes of operation, a user search mode and a mode * **DN templates**: If your LDAP environment uses a few specific standard naming conditions for users, you can use user DN templates to configure the realm. The advantage of this method is that a search does not have to be performed to find the user DN. However, multiple bind operations might be needed to find the correct user DN. -### Set up LDAP user search mode +### Set up LDAP user search mode To configure an `ldap` realm with user search: -1. Add a realm configuration to `elasticsearch.yml` under the `xpack.security.authc.realms.ldap` namespace. - - At a minimum, you must specify the `url` and `order` of the LDAP server, and set `user_search.base_dn` to the container DN where the users are searched for. See [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an `ldap` realm. +1. Add a realm configuration to `elasticsearch.yml` under the `xpack.security.authc.realms.ldap` namespace. + + At a minimum, you must specify the `url` and `order` of the LDAP server, and set `user_search.base_dn` to the container DN where the users are searched for. See [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an `ldap` realm. For example, the following snippet shows an LDAP realm configured with a user search: @@ -90,10 +90,10 @@ To configure an `ldap` realm with user search: 1. (Optional) Configure how the {{security-features}} interact with multiple LDAP servers. - The `load_balance.type` setting can be used at the realm level. The {{es}} {{security-features}} support both failover and load balancing modes of operation. See [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings). + The `load_balance.type` setting can be used at the realm level. The {{es}} {{security-features}} support both failover and load balancing modes of operation. See [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings). 2. (Optional) To protect passwords, [encrypt communications between {{es}} and the LDAP server](../../../deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md#tls-ldap). 
- + * **For self-managed clusters and {{eck}} deployments**, clients and nodes that connect using SSL/TLS to the Active Directory server need to have the Active Directory server’s certificate or the server’s root CA certificate installed in their keystore or trust store. * **For {{ece}} and {{ech}} deployments**, if your Domain Controller is configured to use LDAP over TLS and it uses a self-signed certificate or a certificate that is signed by your organization’s CA, you need to enable the deployment to trust this certificate. @@ -103,7 +103,7 @@ To configure an `ldap` realm with user search: To configure an `ldap` realm with user DN templates: -1. Add a realm configuration to `elasticsearch.yml` in the `xpack.security.authc.realms.ldap` namespace. At a minimum, you must specify the `url` and `order` of the LDAP server, and specify at least one template with the `user_dn_templates` option. See [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an `ldap` realm. +1. Add a realm configuration to `elasticsearch.yml` in the `xpack.security.authc.realms.ldap` namespace. At a minimum, you must specify the `url` and `order` of the LDAP server, and specify at least one template with the `user_dn_templates` option. See [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an `ldap` realm. For example, the following snippet shows an LDAP realm configured with user DN templates: @@ -136,10 +136,10 @@ To configure an `ldap` realm with user DN templates: 2. (Optional) Configure how the {{security-features}} interact with multiple LDAP servers. - The `load_balance.type` setting can be used at the realm level. The {{es}} {{security-features}} support both failover and load balancing modes of operation. See [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings). + The `load_balance.type` setting can be used at the realm level. The {{es}} {{security-features}} support both failover and load balancing modes of operation. See [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings). 3. (Optional) To protect passwords, [encrypt communications between {{es}} and the LDAP server](../../../deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md#tls-ldap). - + * **For self-managed clusters and {{eck}} deployments**, clients and nodes that connect using SSL/TLS to the Active Directory server need to have the Active Directory server’s certificate or the server’s root CA certificate installed in their keystore or trust store. * **For {{ece}} and {{ech}} deployments**, if your Domain Controller is configured to use LDAP over TLS and it uses a self-signed certificate or a certificate that is signed by your organization’s CA, you need to enable the deployment to trust this certificate. @@ -151,17 +151,17 @@ An integral part of a realm authentication process is to resolve the roles assoc Because users are managed externally in the LDAP server, the expectation is that their roles are managed there as well. LDAP groups often represent user roles for different systems in the organization. 
-The `active_directory` realm enables you to map Active Directory users to roles using their Active Directory groups or other metadata. +The `ldap` realm enables you to map LDAP users to roles using their LDAP groups or other metadata. -You can map LDAP groups to roles in the following ways: +You can map LDAP groups to roles in the following ways: * Using the role mappings page in {{kib}}. -* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). +* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). * Using a role mapping file. For more information, see [Mapping users and groups to roles](/deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md). -::::{note} +::::{note} The LDAP realm supports [authorization realms](../../../deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md#authorization_realms) as an alternative to role mapping. :::: @@ -190,7 +190,7 @@ POST /_security/role_mapping/ldap-superuser <1> ### Example: Using a role mapping file -:::{tip} +:::{tip} If you're using {{ece}} or {{ech}}, then you must [upload this file as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced. If you're using {{eck}}, then install the file as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret). If you're using a self-managed cluster, then the file must be present on each node. ::: @@ -260,7 +260,7 @@ xpack: The `load_balance.type` setting can be used at the realm level to configure how the {{security-features}} should interact with multiple LDAP servers. The {{security-features}} support both failover and load balancing modes of operation. -See [Load balancing and failover](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#load-balancing). +See [Load balancing and failover](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#load-balancing). ## Encrypting communications between {{es}} and LDAP [tls-ldap] @@ -271,7 +271,7 @@ If you're using {{ech}} or {{ece}}, then you must [upload your certificate as a If you're using {{eck}}, then install the certificate as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret). -:::{tip} +:::{tip} If you're using {{ece}} or {{ech}}, then these steps are required only if TLS is enabled and the Active Directory controller is using self-signed certificates. ::: @@ -306,7 +306,7 @@ You can also specify the individual server certificates rather than the CA certi For more information about these settings, see [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings). -::::{note} +::::{note} By default, when you configure {{es}} to connect to an LDAP server using SSL/TLS, it attempts to verify the hostname or IP address specified with the `url` attribute in the realm configuration with the values in the certificate. If the values in the certificate and realm configuration do not match, {{es}} does not allow a connection to the LDAP server.
This is done to protect against man-in-the-middle attacks. If necessary, you can disable this behavior by setting the `ssl.verification_mode` property to `certificate`. :::: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/looking-up-users-without-authentication.md b/deploy-manage/users-roles/cluster-or-deployment-auth/looking-up-users-without-authentication.md index bbb72bf91..4e4635fb7 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/looking-up-users-without-authentication.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/looking-up-users-without-authentication.md @@ -3,8 +3,8 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/user-lookup.html applies_to: deployment: - ess: - ece: + ess: + ece: eck: self: --- @@ -28,18 +28,18 @@ See the [run_as](submitting-requests-on-behalf-of-other-users.md) and [delegated * The reserved, [`native`](native.md), and [`file`](file-based.md) realms always support user lookup. * The [`ldap`](ldap.md) realm supports user lookup when the realm is configured in [*user search* mode](ldap.md#ldap-realm-configuration). User lookup is not supported when the realm is configured with `user_dn_templates`. -* User lookup support in the [`active_directory`](active-directory.md) realm requires that the realm be configured with a [`bind_dn`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ad-settings) and a bind password. +* User lookup support in the [`active_directory`](active-directory.md) realm requires that the realm be configured with a [`bind_dn`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ad-settings) and a bind password. The `pki`, `saml`, `oidc`, `kerberos`, and `jwt` realms do not support user lookup. -::::{note} -If you want to use a realm only for user lookup and prevent users from authenticating against that realm, you can [configure the realm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-realm-settings) and set `authentication.enabled` to `false` +::::{note} +If you want to use a realm only for user lookup and prevent users from authenticating against that realm, you can [configure the realm](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-realm-settings) and set `authentication.enabled` to `false`. :::: The user lookup feature is an internal capability that is used to implement the `run_as` and delegated authorization features; there are no APIs for user lookup. If you want to test your user lookup configuration, then you can do this with `run_as`. Use the [Authenticate](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-authenticate) API, authenticate as a `superuser` (e.g. the built-in `elastic` user), and specify the [`es-security-runas-user` request header](submitting-requests-on-behalf-of-other-users.md). -::::{note} +::::{note} The [Get users](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-get-user) API and [User profiles](user-profiles.md) feature are alternative ways to retrieve information about a {{stack}} user. Those APIs are not related to the user lookup feature.
:::: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/manage-authentication-for-multiple-clusters.md b/deploy-manage/users-roles/cluster-or-deployment-auth/manage-authentication-for-multiple-clusters.md index 87e53b094..9ead7ddea 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/manage-authentication-for-multiple-clusters.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/manage-authentication-for-multiple-clusters.md @@ -34,7 +34,7 @@ Make sure you check the complete [guide to setting up LDAP with {{es}}](/deploy- ### Configure LDAP using {{stack}} configuration policy with user search[k8s_to_configure_ldap_using_elastic_stack_configuration_policy_with_user_search] -1. Add a realm configuration to the `config` field under `elasticsearch` in the `xpack.security.authc.realms.ldap` namespace. At a minimum, you must specify the URL of the LDAP server and the order of the LDAP realm compared to other configured security realms. You also have to set `user_search.base_dn` to the container DN where the users are searched for. Refer to [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an LDAP realm. For example, the following snippet shows an LDAP realm configured with a user search: +1. Add a realm configuration to the `config` field under `elasticsearch` in the `xpack.security.authc.realms.ldap` namespace. At a minimum, you must specify the URL of the LDAP server and the order of the LDAP realm compared to other configured security realms. You also have to set `user_search.base_dn` to the container DN where the users are searched for. Refer to [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an LDAP realm. For example, the following snippet shows an LDAP realm configured with a user search: ```yaml elasticsearch: @@ -127,7 +127,7 @@ spec: ### Configure an LDAP realm with user DN templates[k8s_to_configure_an_ldap_realm_with_user_dn_templates] -Add a realm configuration to `elasticsearch.yml` in the xpack.security.authc.realms.ldap namespace. At a minimum, you must specify the url and order of the LDAP server, and specify at least one template with the user_dn_templates option. Check [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an ldap realm. +Add a realm configuration to `elasticsearch.yml` in the `xpack.security.authc.realms.ldap` namespace. At a minimum, you must specify the `url` and `order` of the LDAP server, and specify at least one template with the `user_dn_templates` option. Check [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings) for all of the options you can set for an `ldap` realm.
diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/openid-connect.md b/deploy-manage/users-roles/cluster-or-deployment-auth/openid-connect.md
index 6217ff8cd..4d90d2f46 100644
--- a/deploy-manage/users-roles/cluster-or-deployment-auth/openid-connect.md
+++ b/deploy-manage/users-roles/cluster-or-deployment-auth/openid-connect.md
@@ -26,7 +26,7 @@ Because this feature is designed with {{kib}} in mind, most sections of this gui

For a detailed description of how to implement OpenID Connect with various OpenID Connect Providers (OPs), refer to [Set up OpenID Connect with Azure, Google, or Okta](/deploy-manage/users-roles/cluster-or-deployment-auth/oidc-examples.md).

-::::{note} 
+::::{note}
OpenID Connect realm support in {{kib}} is designed with the expectation that it will be the primary authentication method for the users of that {{kib}} instance. The [Configuring {{kib}}](/deploy-manage/users-roles/cluster-or-deployment-auth/openid-connect.md#oidc-configure-kibana) section describes what this entails and how you can set it up to support other realms if necessary.
::::

@@ -39,9 +39,9 @@ In order for the {{stack}} to be able to use your OpenID Connect Provider for au

The process for registering the {{stack}} RP will be different from OP to OP, so you should follow your provider's documentation. The information for the RP that you commonly need to provide for registration is the following:

* `Relying Party Name`: An arbitrary identifier for the relying party. There are no constraints on this value, either from the specification or the {{stack}} implementation.
-* `Redirect URI`: The URI where the OP will redirect the user’s browser after authentication, sometimes referred to as a `Callback URI`. The appropriate value for this will depend on your setup, and whether or not {{kib}} sits behind a proxy or load balancer. 
-
-    It will typically be `${kibana-url}/api/security/oidc/callback` (for the authorization code flow) or `${kibana-url}/api/security/oidc/implicit` (for the implicit flow) where *${kibana-url}* is the base URL for your {{kib}} instance. 
+* `Redirect URI`: The URI where the OP will redirect the user’s browser after authentication, sometimes referred to as a `Callback URI`. The appropriate value for this will depend on your setup, and whether or not {{kib}} sits behind a proxy or load balancer.
+
+    It will typically be `${kibana-url}/api/security/oidc/callback` (for the authorization code flow) or `${kibana-url}/api/security/oidc/implicit` (for the implicit flow) where *${kibana-url}* is the base URL for your {{kib}} instance.

    If you're using {{ech}}, then set this value to `/api/security/oidc/callback`.

@@ -52,13 +52,13 @@ At the end of the registration process, the OP will assign a Client Identifier a

Before you set up an OpenID Connect realm, you must have an OpenID Connect Provider where the {{stack}} Relying Party will be registered.

-If you're using a self-managed cluster, then perform the following additional steps: 
+If you're using a self-managed cluster, then perform the following additional steps:

* Enable TLS for HTTP. If your {{es}} cluster is operating in production mode, you must configure the HTTP interface to use SSL/TLS before you can enable OIDC authentication. For more information, see [Encrypt HTTP client communications for {{es}}](../../../deploy-manage/security/set-up-basic-security-plus-https.md#encrypt-http-communication).
- If you started {{es}} [with security enabled](/deploy-manage/deploy/self-managed/installing-elasticsearch.md), then TLS is already enabled for HTTP. + If you started {{es}} [with security enabled](/deploy-manage/deploy/self-managed/installing-elasticsearch.md), then TLS is already enabled for HTTP. {{ech}}, {{ece}}, and {{eck}} have TLS enabled by default. @@ -76,31 +76,31 @@ If you're using a self-managed cluster, then perform the following additional st OpenID Connect based authentication is enabled by configuring the appropriate realm within the authentication chain for {{es}}. -This realm has a few mandatory settings, and a number of optional settings. The available settings are described in detail in [OpenID Connect realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-oidc-settings). This guide will explore the most common settings. +This realm has a few mandatory settings, and a number of optional settings. The available settings are described in detail in [OpenID Connect realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-oidc-settings). This guide will explore the most common settings. 1. Create an OpenID Connect (the realm type is `oidc`) realm in your `elasticsearch.yml` file similar to what is shown below. If you're using {{ece}} or {{ech}}, and you're using machine learning or a deployment with hot-warm architecture, you must include this configuration in the user settings section for each node type. - ::::{note} + ::::{note} The values used below are meant to be an example and are not intended to apply to every use case. The details below the configuration snippet provide insights and suggestions to help you pick the proper values, depending on your OP configuration. :::: ```yaml - xpack.security.authc.realms.oidc.oidc1: - order: 2 + xpack.security.authc.realms.oidc.oidc1: + order: 2 rp.client_id: "the_client_id" - rp.response_type: code - rp.redirect_uri: "https://kibana.example.org:5601/api/security/oidc/callback" - op.issuer: "https://op.example.org" - op.authorization_endpoint: "https://op.example.org/oauth2/v1/authorize" - op.token_endpoint: "https://op.example.org/oauth2/v1/token" - op.jwkset_path: oidc/jwkset.json - op.userinfo_endpoint: "https://op.example.org/oauth2/v1/userinfo" - op.endsession_endpoint: "https://op.example.org/oauth2/v1/logout" - rp.post_logout_redirect_uri: "https://kibana.example.org:5601/security/logged_out" - claims.principal: sub - claims.groups: "http://example.info/claims/groups" + rp.response_type: code + rp.redirect_uri: "https://kibana.example.org:5601/api/security/oidc/callback" + op.issuer: "https://op.example.org" + op.authorization_endpoint: "https://op.example.org/oauth2/v1/authorize" + op.token_endpoint: "https://op.example.org/oauth2/v1/token" + op.jwkset_path: oidc/jwkset.json + op.userinfo_endpoint: "https://op.example.org/oauth2/v1/userinfo" + op.endsession_endpoint: "https://op.example.org/oauth2/v1/logout" + rp.post_logout_redirect_uri: "https://kibana.example.org:5601/security/logged_out" + claims.principal: sub + claims.groups: "http://example.info/claims/groups" ``` ::::{dropdown} Common settings @@ -139,9 +139,9 @@ This realm has a few mandatory settings, and a number of optional settings. The If your OpenID Connect Provider doesn’t publish its JWKS at an https URL, or if you want to use a local copy, you can upload the JWKS as a file. 
- :::{tip} 
- * In self-managed clusters, the specified path is resolved relative to the {{es}} config directory. {{es}} will automatically monitor this file for changes and will reload the configuration whenever it is updated. 
- * If you're using {{ece}} or {{ech}}, then you must [upload this file as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced. 
+ :::{tip}
+ * In self-managed clusters, the specified path is resolved relative to the {{es}} config directory. {{es}} will automatically monitor this file for changes and will reload the configuration whenever it is updated.
+ * If you're using {{ece}} or {{ech}}, then you must [upload this file as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced.
  * If you're using {{eck}}, then install the file as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret).
  :::

@@ -168,7 +168,7 @@ This realm has a few mandatory settings, and a number of optional settings. The

  In {{ech}} and {{ece}}, after you configure the Client Secret, any attempt to restart the deployment will fail until you complete the rest of the configuration steps. If you want to roll back the OpenID Connect realm configurations, you need to remove the `xpack.security.authc.realms.oidc.oidc1.rp.client_secret` that was just added.
  :::

-::::{note} 
+::::{note}
According to the OpenID Connect specification, the OP should also make their configuration available at a well-known URL, which is the concatenation of their `Issuer` value with the `.well-known/openid-configuration` string. For example: `https://op.org.com/.well-known/openid-configuration`. That document should contain all the necessary information to configure the OpenID Connect realm in {{es}}.

@@ -176,17 +176,17 @@ That document should contain all the necessary information to configure the Open

## Map claims [oidc-claims-mappings]

-When authenticating to {{kib}} using OpenID Connect, the OP will provide information about the user in the form of **OpenID Connect Claims**. These claims can be included either in the ID Token, or be retrieved from the UserInfo endpoint of the OP. 
+When authenticating to {{kib}} using OpenID Connect, the OP will provide information about the user in the form of **OpenID Connect Claims**. These claims can be included either in the ID Token, or be retrieved from the UserInfo endpoint of the OP.

-An **OpenID Connect Claim** is a piece of information asserted by the OP for the authenticated user, and consists of a name/value pair that contains information about the user. 
+An **OpenID Connect Claim** is a piece of information asserted by the OP for the authenticated user, and consists of a name/value pair that contains information about the user.

-**OpenID Connect Scopes** are identifiers that are used to request access to specific lists of claims. The standard defines a set of scope identifiers that can be requested. 
+**OpenID Connect Scopes** are identifiers that are used to request access to specific lists of claims. The standard defines a set of scope identifiers that can be requested.

* **Mandatory scopes**: `openid`
* **Commonly used scopes**:

  * `profile`: Requests access to the `name`,`family_name`,`given_name`,`middle_name`,`nickname`, `preferred_username`,`profile`,`picture`,`website`,`gender`,`birthdate`,`zoneinfo`,`locale`, and `updated_at` claims.
- * `email`: Requests access to the `email` and `email_verified` claims. 
+ * `email`: Requests access to the `email` and `email_verified` claims.

The RP requests specific scopes during the authentication request. If the OP Privacy Policy allows it and the authenticating user consents to it, the related claims are returned to the RP (either in the ID Token or as a UserInfo response).

@@ -194,7 +194,7 @@ The list of the supported claims will vary depending on the OP you are using, bu

### How claims appear in user metadata [oidc-user-metadata]

-By default, users who authenticate through OpenID Connect have additional metadata fields. These fields include every OpenID claim that is provided in the authentication response, regardless of whether it is mapped to an {{es}} user property. 
+By default, users who authenticate through OpenID Connect have additional metadata fields. These fields include every OpenID claim that is provided in the authentication response, regardless of whether it is mapped to an {{es}} user property.

For example, in the metadata field `oidc(claim_name)`, "claim_name" is the name of the claim as it was contained in the ID Token or in the User Info response. Note that these will include all the [ID Token claims](https://openid.net/specs/openid-connect-core-1_0.html#IDToken) that pertain to the authentication event, rather than the user themselves.

@@ -207,29 +207,29 @@ The goal of claims mapping is to configure {{es}} in such a way as to be able to

To configure claims mapping:

-1. Using your OP configuration, identify the claims that it might support. 
-
+1. Using your OP configuration, identify the claims that it might support.
+
    The list provided in the OP’s metadata or in the configuration page of the OP is a list of potentially supported claims. However, for privacy reasons it might not be a complete one, or not all supported claims will be available for all authenticated users.

2. Review the list of [user properties](#oidc-user-properties) that {{es}} supports, and decide which of them are useful to you, and can be provided by your OP in the form of claims. At a minimum, the `principal` user property is required.
-3. Configure your OP to "release" those claims to your {{stack}} Relying Party. This process greatly varies by provider. You can use a static configuration while others will support that the RP requests the scopes that correspond to the claims to be "released" on authentication time. See [`rp.requested_scopes`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-oidc-settings) for details about how to configure the scopes to request. To ensure interoperability and minimize the errors, you should only request scopes that the OP supports, and that you intend to map to {{es}} user properties.
+3. Configure your OP to "release" those claims to your {{stack}} Relying Party. This process greatly varies by provider: some OPs use a static configuration, while others let the RP request, at authentication time, the scopes that correspond to the claims to be "released". See [`rp.requested_scopes`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-oidc-settings) for details about how to configure the scopes to request. To ensure interoperability and minimize errors, you should only request scopes that the OP supports, and that you intend to map to {{es}} user properties.
:::{note}
You can only map claims with values that are strings, numbers, Boolean values, or an array consisting of strings, numbers, and Boolean values.
:::

-4. Configure the OpenID Connect realm in {{es}} to associate the [{{es}} user properties](#oidc-user-properties) to the name of the claims that your OP will release. 
-
+4. Configure the OpenID Connect realm in {{es}} to associate the [{{es}} user properties](#oidc-user-properties) to the name of the claims that your OP will release.
+
    The [sample configuration](#oidc-create-realm) configures the `principal` and `groups` user properties as follows:

-    * `claims.principal: sub`: Instructs {{es}} to look for the OpenID Connect claim named `sub` in the ID Token that the OP issued for the user (or in the UserInfo response) and assign the value of this claim to the `principal` user property. 
-
+    * `claims.principal: sub`: Instructs {{es}} to look for the OpenID Connect claim named `sub` in the ID Token that the OP issued for the user (or in the UserInfo response) and assign the value of this claim to the `principal` user property.
+
        `sub` is a commonly used claim for the principal property as it is an identifier of the user in the OP and it is also a required claim of the ID Token. This means that `sub` is available in most OPs. However, the OP may provide another claim that is a better fit for your needs.

-    * `claims.groups: "http://example.info/claims/groups"`: Instructs {{es}} to look for the claim with the name `http://example.info/claims/groups`, either in the ID Token or in the UserInfo response, and map the value(s) of it to the user property `groups` in {{es}}. 
-
+    * `claims.groups: "http://example.info/claims/groups"`: Instructs {{es}} to look for the claim with the name `http://example.info/claims/groups`, either in the ID Token or in the UserInfo response, and map the value(s) of it to the user property `groups` in {{es}}.
+
        There is no standard claim in the specification that is used for expressing roles or group memberships of the authenticated user in the OP, so the name of the claim that should be mapped here will vary between providers. Consult your OP documentation for more details.

    :::{tip}
-    In this example, the value is a URI, treated as a string and not a URL pointing to a location that will be retrieved. 
+    In this example, the value is a URI, treated as a string and not a URL pointing to a location that will be retrieved.
    :::

@@ -241,7 +241,7 @@ The {{es}} OpenID Connect realm can be configured to map OpenID Connect claims t

principal
: *(Required)* This is the username that will be applied to a user that authenticates against this realm. The `principal` appears in places such as the {{es}} audit logs.

-::::{note} 
+::::{note}
If the principal property fails to be mapped from a claim, the authentication fails.
::::

@@ -285,7 +285,7 @@ In this case, the user’s `principal` is mapped from the `email_verified` claim

In this example, the email address must belong to the `staff.example.com` domain, and then the local-part (anything before the `@`) is used as the principal. Any users who try to log in using a different email domain will fail because the regular expression will not match against their email address, and thus their principal user property - which is mandatory - will not be populated.

-::::{important} 
+::::{important}
Small mistakes in these regular expressions can have significant security consequences.
For example, if we accidentally left off the trailing `$` from the example above, then we would match any email address where the domain starts with `staff.example.com`, and this would accept an email address such as `admin@staff.example.com.attacker.net`. It is important that you make sure your regular expressions are as precise as possible so that you don't open an avenue for user impersonation attacks.
::::
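For reference, a minimal sketch of the realm settings this example describes (assuming the `claim_patterns.principal` setting from the OIDC realm settings reference, and keeping the trailing `$` that the warning above insists on):

```yaml
claims.principal: email_verified
claim_patterns.principal: "^([^@]+)@staff\\.example\\.com$"
```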
@@ -303,9 +303,9 @@ The OpenID Connect realm in {{es}} supports RP-initiated logout functionality as

In this process, the OpenID Connect RP (the {{stack}} in this case) will redirect the user’s browser to a predefined URL of the OP after successfully completing a local logout. The OP can then also log out the user, depending on the configuration, and should finally redirect the user back to the RP.

-RP-initiated logout is controlled by two settings: 
+RP-initiated logout is controlled by two settings:

-* `op.endsession_endpoint`: The URL in the OP that the browser will be redirected to. 
+* `op.endsession_endpoint`: The URL in the OP that the browser will be redirected to.
* `rp.post_logout_redirect_uri`: The URL to redirect the user back to after the OP logs them out.

When configuring `rp.post_logout_redirect_uri`, do not point to a URL that will trigger re-authentication of the user. For instance, when using OpenID Connect to support single sign-on to {{kib}}, this could be set to either `${kibana-url}/security/logged_out`, which will show a user-friendly message to the user, or `${kibana-url}/login?msg=LOGGED_OUT` which will take the user to the login selector in {{kib}}.

@@ -315,7 +315,7 @@ OpenID Connect depends on TLS to provide security properties such as encryption
in transit and endpoint authentication. The RP is required to establish back-channel communication with the OP in order to exchange the code for an ID Token during the Authorization code grant flow and in order to get additional user information from the `UserInfo` endpoint. If you configure `op.jwkset_path` as a URL, {{es}} will need to get the OP’s signing keys from the file hosted there. As such, it is important that {{es}} can validate and trust the server certificate that the OP uses for TLS. Because the system trust store is used for the client context of outgoing https connections, if your OP is using a certificate from a trusted CA, no additional configuration is needed.

-However, if the issuer of your OP’s certificate is not trusted by the JVM on which {{es}} is running (e.g it uses an organization CA), then you must configure {{es}} to trust that CA. 
+However, if the issuer of your OP’s certificate is not trusted by the JVM on which {{es}} is running (e.g. it uses an organizational CA), then you must configure {{es}} to trust that CA.

If you're using {{ech}} or {{ece}}, then you must [upload your certificate as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced.

@@ -336,15 +336,15 @@ When a user authenticates using OpenID Connect, they are identified to the {{sta

Your OpenID Connect users can't do anything until they are assigned roles. This can be done through either the [add role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping) or with [authorization realms](/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md#authorization_realms).
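As a sketch, a minimal mapping created through the role mapping API could look like this, ahead of the mapping options listed below (the realm name `oidc1` follows the sample realm configuration above; the role name is an illustrative assumption):

```console
PUT /_security/role_mapping/oidc-example
{
  "roles": [ "example_role" ],
  "enabled": true,
  "rules": {
    "field": { "realm.name": "oidc1" }
  }
}
```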
-You can map LDAP groups to roles in the following ways: 
+You can map OpenID Connect users to roles in the following ways:

* Using the role mappings page in {{kib}}.
-* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). 
+* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping).
* By delegating authorization to another realm.
-
+

For more information, see [Mapping users and groups to roles](/deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md).

-::::{note} 
+::::{note}
You can't use [role mapping files](/deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md#mapping-roles-file) to grant roles to users authenticating using OpenID Connect.
::::

@@ -440,14 +440,14 @@ xpack.security.authc.providers:

The configuration values used in the example above are:

`xpack.security.authc.providers`
-: Add an `oidc` provider to instruct {{kib}} to use OpenID Connect single sign-on as the authentication method. This instructs Kibana to attempt to initiate an SSO flow every time a user attempts to access a URL in {{kib}}, if the user is not already authenticated. 
+: Add an `oidc` provider to instruct {{kib}} to use OpenID Connect single sign-on as the authentication method. This instructs {{kib}} to attempt to initiate an SSO flow every time a user attempts to access a URL in {{kib}}, if the user is not already authenticated.

`xpack.security.authc.providers.oidc.<provider-name>.realm`
: The name of the OpenID Connect realm in {{es}} that should handle authentication for this Kibana instance.

### Supporting OIDC and basic authentication in {{kib}}

-If you also want to allow users to log in with a username and password, you must enable the `basic` authentication provider too. This will allow users that haven’t already authenticated with OpenID Connect to log in using the {{kib}} login form: 
+If you also want to allow users to log in with a username and password, you must enable the `basic` authentication provider too. This will allow users that haven’t already authenticated with OpenID Connect to log in using the {{kib}} login form:

```yaml
xpack.security.authc.providers:
@@ -468,7 +468,7 @@ The OpenID Connect realm is designed to allow users to authenticate to {{kib}}.

Single sign-on realms such as OpenID Connect and SAML make use of the Token Service in {{es}} and in principle exchange a SAML or OpenID Connect Authentication response for an {{es}} access token and a refresh token. The access token is used as credentials for subsequent calls to {{es}}. The refresh token enables the user to get new {{es}} access tokens after the current one expires.

-::::{note} 
+::::{note}
The {{es}} Token Service can be seen as a minimal OAuth2 authorization server and the access token and refresh token mentioned above are tokens that pertain *only* to this authorization server. They are generated and consumed *only* by {{es}} and are in no way related to the tokens (access token and ID Token) that the OpenID Connect Provider issues.
:::: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/operator-only-functionality.md b/deploy-manage/users-roles/cluster-or-deployment-auth/operator-only-functionality.md index 03c6f9c33..56a26cfb9 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/operator-only-functionality.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/operator-only-functionality.md @@ -3,9 +3,9 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/operator-only-functionality.html applies_to: deployment: - ess: - ece: - eck: + ess: + ece: + eck: --- # Operator-only functionality [operator-only-functionality] @@ -34,7 +34,7 @@ Operator privileges provide protection for APIs and dynamic cluster settings. An ## Operator-only dynamic cluster settings [operator-only-dynamic-cluster-settings] * All [IP filtering](../../security/ip-traffic-filtering.md) settings -* The following dynamic [machine learning settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md): +* The following dynamic [machine learning settings](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md): * `xpack.ml.node_concurrent_job_allocations` * `xpack.ml.max_machine_memory_percent` @@ -46,8 +46,8 @@ Operator privileges provide protection for APIs and dynamic cluster settings. An * `xpack.ml.enable_config_migration` * `xpack.ml.persist_results_max_retries` -* The [`cluster.routing.allocation.disk.threshold_enabled` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-disk-threshold) -* The following [recovery settings for managed services](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services): +* The [`cluster.routing.allocation.disk.threshold_enabled` setting](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-disk-threshold) +* The following [recovery settings for managed services](elasticsearch://reference/elasticsearch/configuration-reference/index-recovery-settings.md#recovery-settings-for-managed-services): * `node.bandwidth.recovery.operator.factor` * `node.bandwidth.recovery.operator.factor.read` diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/pki.md b/deploy-manage/users-roles/cluster-or-deployment-auth/pki.md index 5467e38bb..3574f6119 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/pki.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/pki.md @@ -19,7 +19,7 @@ You can also use PKI certificates to authenticate to {{kib}}, however this requi To use PKI in {{es}}, you configure a PKI realm, enable client authentication on the desired network layers (transport or http), and map the Distinguished Names (DNs) from the `Subject` field in the user certificates to roles. You create the mappings in a role mapping file or use the role mappings API. -1. Add a realm configuration for a `pki` realm to `elasticsearch.yml` under the `xpack.security.authc.realms.pki` namespace. You must explicitly set the `order` attribute. See [PKI realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-pki-settings) for all of the options you can set for a `pki` realm. +1. 
Add a realm configuration for a `pki` realm to `elasticsearch.yml` under the `xpack.security.authc.realms.pki` namespace. You must explicitly set the `order` attribute. See [PKI realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-pki-settings) for all of the options you can set for a `pki` realm.

    For example, the following snippet shows the most basic `pki` realm configuration:

@@ -35,11 +35,11 @@ To use PKI in {{es}}, you configure a PKI realm, enable client authentication on

    With this configuration, any certificate trusted by the {{es}} SSL/TLS layer is accepted for authentication. The username is the common name (CN) extracted from the DN in the Subject field of the end-entity certificate. This configuration is not sufficient to permit PKI authentication to {{kib}}; additional steps are required.

-    ::::{important} 
+    ::::{important}
    When you configure realms in `elasticsearch.yml`, only the realms you specify are used for authentication. If you also want to use the `native` or `file` realms, you must include them in the realm chain.
    ::::

-2. Optional: The username is defined by the [username_pattern](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-pki-settings). If you want to use something other than the CN of the Subject DN as the username, you can specify a regex to extract the desired username. The regex is applied on the Subject DN.
+2. Optional: The username is defined by the [username_pattern](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-pki-settings). If you want to use something other than the CN of the Subject DN as the username, you can specify a regex to extract the desired username. The regex is applied to the Subject DN.

    For example, the regex in the following configuration extracts the email address from the Subject DN:

@@ -54,16 +54,16 @@ To use PKI in {{es}}, you configure a PKI realm, enable client authentication on

              username_pattern: "EMAILADDRESS=(.*?)(?:,|$)"
    ```

-    ::::{note} 
+    ::::{note}
    If the regex is too restrictive and does not match the Subject DN of the client’s certificate, then the realm does not authenticate the certificate.
    ::::

3. Optional: If you want the same users to also be authenticated using certificates when they connect to {{kib}}, you must configure the {{es}} PKI realm to allow delegation. See [PKI authentication for clients connecting to {{kib}}](#pki-realm-for-proxied-clients).
4. Restart {{es}} because realm configuration is not reloaded automatically. If you’re following through with the next steps, you might wish to hold the restart for last.
5. If you're using a self-managed cluster, then [enable SSL/TLS](../../security/secure-cluster-communications.md#encrypt-internode-communication).
-6. If you're using a self-managed cluster or {{eck}}, then enable client authentication on the desired network layers (transport or http). 
+6. If you're using a self-managed cluster or {{eck}}, then enable client authentication on the desired network layers (transport or http).

-    ::::{important} 
+    ::::{important}
    To use PKI when clients connect directly to {{es}}, you must enable SSL/TLS with client authentication by setting `xpack.security.transport.ssl.client_authentication` and `xpack.security.http.ssl.client_authentication` to `optional` or `required`. If the setting value is `optional`, clients without certificates can authenticate with other credentials.
    ::::
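A sketch of those two settings in `elasticsearch.yml` (choose `optional` or `required` per the note above; the mix shown here is only an illustration):

```yaml
xpack.security.transport.ssl.client_authentication: required
xpack.security.http.ssl.client_authentication: optional
```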
@@ -75,10 +75,10 @@ To use PKI in {{es}}, you configure a PKI realm, enable client authentication on

    * The interface must *trust* the certificate that is presented by the client by configuring either the `truststore` or `certificate_authorities` paths, or by setting `verification_mode` to `none`.
    * The *protocols* supported by the interface must be compatible with those used by the client.

-    For an explanation of these settings, see [General TLS settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings).
+    For an explanation of these settings, see [General TLS settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings).

7. Optional: Configure the PKI realm to trust a subset of certificates.
-
+
    The relevant network interface (transport or http) must be configured to trust any certificate that is to be used within the PKI realm. However, it is possible to configure the PKI realm to trust only a *subset* of the certificates accepted by the network interface. This is useful when the SSL/TLS layer trusts clients with certificates that are signed by a different CA than the one that signs your users' certificates.

    1. To configure the PKI realm with its own trust store, specify the `truststore.path` option. The path must be located within the {{es}} configuration directory (`ES_PATH_CONF`). For example:

@@ -95,10 +95,10 @@ To use PKI in {{es}}, you configure a PKI realm, enable client authentication on

                path: "pki1_truststore.jks"
    ```

-    :::{tip} 
+    :::{tip}
    If you're using {{ece}} or {{ech}}, then you must [upload this file as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced.

-    If you're using {{eck}}, then install the file as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret). 
+    If you're using {{eck}}, then install the file as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret).

    If you're using a self-managed cluster, then the file must be present on each node.
    :::

@@ -142,10 +142,10 @@ To use PKI in {{es}}, you configure a PKI realm, enable client authentication on

    1. The name of a role.
    2. The distinguished name (DN) of a PKI user.

-    :::{tip} 
+    :::{tip}
    If you're using {{ece}} or {{ech}}, then you must [upload this file as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced.

-    If you're using {{eck}}, then install the file as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret). 
+    If you're using {{eck}}, then install the file as a [custom configuration file](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret).

    If you're using a self-managed cluster, then the file must be present on each node.
    :::

@@ -158,7 +158,7 @@ To use PKI in {{es}}, you configure a PKI realm, enable client authentication on

    For more information, see [Mapping users and groups to roles](mapping-users-groups-to-roles.md).
-    ::::{note} 
+    ::::{note}
    The PKI realm supports [authorization realms](realm-chains.md#authorization_realms) as an alternative to role mapping.
    ::::

diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md b/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md
index 8d4e1ee05..0faecf5e3 100644
--- a/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md
+++ b/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md
@@ -11,19 +11,19 @@ applies_to:

# Realm chains [realm-chains]

-[Realms](authentication-realms.md) live within a *realm chain*. It is essentially a prioritized list of configured realms typically of various types. Realms are consulted in ascending order: the realm with the lowest `order` value is consulted first. 
+[Realms](authentication-realms.md) live within a *realm chain*. It is essentially a prioritized list of configured realms, typically of various types. Realms are consulted in ascending order: the realm with the lowest `order` value is consulted first.

You must make sure each configured realm has a distinct `order` setting. In the event that two or more realms have the same `order`, the node will fail to start.

During the authentication process, {{es}} consults and tries to authenticate the request one realm at a time. Once one of the realms successfully authenticates the request, the authentication is considered to be successful. The authenticated user is associated with the request, which then proceeds to the authorization phase. If a realm can't authenticate the request, the next realm in the chain is consulted. If none of the realms in the chain can authenticate the request, the authentication is considered to be unsuccessful and an authentication error is returned (as HTTP status code `401`).

-::::{note} 
+::::{note}
Some systems, such as Active Directory, have a temporary lock-out period after several successive failed login attempts. If the same username exists in multiple realms, unintentional account lockouts are possible. For more information, see [Users are frequently locked out of Active Directory](/troubleshoot/elasticsearch/security/trouble-shoot-active-directory.md).
::::

## Configure a realm chain

-The default realm chain contains the `file` and `native` realms. To explicitly configure a realm chain, specify the chain in the `elasticsearch.yml` file. 
+The default realm chain contains the `file` and `native` realms. To explicitly configure a realm chain, specify the chain in the `elasticsearch.yml` file.

If your realm chain does not contain a `file` or `native` realm or does not disable them explicitly, the `file` and `native` realms will be added automatically to the beginning of the realm chain, in that order. To opt out of this automatic behavior, you can explicitly configure the `file` and `native` realms with the `order` and `enabled` settings.

@@ -63,13 +63,13 @@ For example, you might want to use a PKI realm to authenticate your users with T

Any realm that supports retrieving users without needing their credentials can be used as an authorization realm. Refer to [Looking up users without authentication](looking-up-users-without-authentication.md) to learn which realms can be used as authorization realms.

-For realms that support this feature, it can be enabled by configuring the `authorization_realms` setting on the authenticating realm. 
Check the list of [supported settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#realm-settings) for each realm to see if they support the `authorization_realms` setting.
+For realms that support this feature, it can be enabled by configuring the `authorization_realms` setting on the authenticating realm. Check the list of [supported settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#realm-settings) for each realm to see if they support the `authorization_realms` setting.

If delegated authorization is enabled for a realm, it authenticates the user in its standard manner, including relevant caching, then looks for that user in the configured list of authorization realms. It tries each realm in the order they are specified in the `authorization_realms` setting. The user is retrieved by principal - the user must have identical usernames in the authentication and authorization realms. If the user can't be found in any of the authorization realms, then authentication fails.

See [Configuring authorization delegation](authorization-delegation.md) for more details.

-::::{note} 
+::::{note}
Delegated authorization requires that you have a [subscription](https://www.elastic.co/subscriptions) that includes custom authentication and authorization realms.
::::

diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/saml.md b/deploy-manage/users-roles/cluster-or-deployment-auth/saml.md
index 6d3893716..8ccf858cd 100644
--- a/deploy-manage/users-roles/cluster-or-deployment-auth/saml.md
+++ b/deploy-manage/users-roles/cluster-or-deployment-auth/saml.md
@@ -21,13 +21,13 @@ applies_to:

# SAML authentication [saml-realm]

-The {{stack}} supports SAML single-sign-on (SSO) into {{kib}}, using {{es}} as a backend service. 
+The {{stack}} supports SAML single sign-on (SSO) into {{kib}}, using {{es}} as a backend service.
The {{security-features}} provide this support using the Web Browser SSO profile of the SAML 2.0 protocol. This protocol is specifically designed to support authentication using an interactive web browser, so it does not operate as a standard authentication realm. Instead, there are {{kib}} and {{es}} {{security-features}} that work together to enable interactive SAML sessions.

This means that the SAML realm is not suitable for use by standard REST clients. If you configure a SAML realm for use in {{kib}}, you should also configure another realm, such as the [native realm](/deploy-manage/users-roles/cluster-or-deployment-auth/native.md) in your authentication chain.

-Because this feature is designed with {{kib}} in mind, most sections of this guide assume {{kib}} is used. To learn how a custom web application could use the OpenID Connect REST APIs to authenticate the users to {{es}} with SAML, refer to [SAML without {{kib}}](#saml-no-kibana). 
+Because this feature is designed with {{kib}} in mind, most sections of this guide assume {{kib}} is used. To learn how a custom web application could use the SAML REST APIs to authenticate the users to {{es}} with SAML, refer to [SAML without {{kib}}](#saml-no-kibana).

The SAML support in {{kib}} is designed with the expectation that it will be the primary (or sole) authentication method for users of that {{kib}} instance. After you enable SAML authentication in {{kib}}, it will affect all users who try to log in.
The [Configuring {{kib}}](/deploy-manage/users-roles/cluster-or-deployment-auth/saml.md#saml-configure-kibana) section provides more detail about how this works.

@@ -47,7 +47,7 @@ Additional steps outlined in this document are optional.
::::

::::{tip}
-This topic describes implementing SAML SSO at the deployment or cluster level, for the purposes of authenticating with a {{kib}} instance. 
+This topic describes implementing SAML SSO at the deployment or cluster level, for the purposes of authenticating with a {{kib}} instance.

Depending on your deployment type, you can also configure SSO for the following use cases:

@@ -59,7 +59,7 @@ Depending on your deployment type, you can also configure SSO for the following

In SAML terminology, the {{stack}} is operating as a *Service Provider*.

-The other component that is needed to enable SAML single-sign-on is the *Identity Provider*, which is a service that handles your credentials and performs that actual authentication of users. 
+The other component that is needed to enable SAML single sign-on is the *Identity Provider*, which is a service that handles your credentials and performs the actual authentication of users.

If you are interested in configuring SSO into {{kib}}, then you need to provide {{es}} with information about your *Identity Provider*, and you will need to register the {{stack}} as a known *Service Provider* within that Identity Provider. There are also a few configuration changes that are required in {{kib}} to activate the SAML authentication provider.

@@ -89,13 +89,13 @@ The {{stack}} requires that all messages from the IdP are signed. For authentica

Before you set up SAML single sign-on, you must have a [SAML IdP](#saml-guide-idp) configured.

-If you're using a self-managed cluster, then perform the following additional steps: 
+If you're using a self-managed cluster, then perform the following additional steps:

* Enable TLS for HTTP. If your {{es}} cluster is operating in production mode, you must configure the HTTP interface to use SSL/TLS before you can enable SAML authentication. For more information, see [Encrypt HTTP client communications for {{es}}](/deploy-manage/security/set-up-basic-security-plus-https.md#encrypt-http-communication).

-    If you started {{es}} [with security enabled](/deploy-manage/deploy/self-managed/installing-elasticsearch.md), then TLS is already enabled for HTTP. 
+    If you started {{es}} [with security enabled](/deploy-manage/deploy/self-managed/installing-elasticsearch.md), then TLS is already enabled for HTTP.

    {{ech}}, {{ece}}, and {{eck}} have TLS enabled by default.

@@ -113,11 +113,11 @@ If you're using a self-managed cluster, then perform the following additional st

SAML authentication is enabled by configuring a SAML realm within the authentication chain for {{es}}.

-This realm has a few mandatory settings, and a number of optional settings. 
The available settings are described in detail in [Security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md):
-* [SAML realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings)
-* [SAML realm signing settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-signing-settings)
-* [SAML realm encryption settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-encryption-settings)
-* [SAML realm SSL settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-ssl-settings)
+This realm has a few mandatory settings, and a number of optional settings. The available settings are described in detail in [Security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md):
+* [SAML realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings)
+* [SAML realm signing settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-signing-settings)
+* [SAML realm encryption settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-encryption-settings)
+* [SAML realm SSL settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-ssl-settings)

This guide will walk you through the most common settings.

@@ -148,13 +148,13 @@ order

If you're using {{eck}}, then make sure not to disable Elasticsearch’s file realm set by ECK, as ECK relies on the file realm for its operation. Set the `order` setting of the SAML realm to a greater value than the `order` value set for the file and native realms, which default to -100 and -99 respectively.

idp.metadata.path
-: $$$idp-metadata-path$$$ The path to the metadata file for your Identity Provider. The metadata file path can either be a path, or an HTTPS URL. 
+: $$$idp-metadata-path$$$ The path to the metadata file for your Identity Provider. The metadata file path can be either a file path or an HTTPS URL.

-    :::{tip} 
-    If you want to pass a file path, then review the following: 
-
-    * File path settings are resolved relative to the {{es}} config directory. {{es}} will automatically monitor this file for changes and will reload the configuration whenever it is updated. 
-    * If you're using {{ece}} or {{ech}}, then you must upload the file [as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced. 
+    :::{tip}
+    If you want to pass a file path, then review the following:
+
+    * File path settings are resolved relative to the {{es}} config directory. {{es}} will automatically monitor this file for changes and will reload the configuration whenever it is updated.
+    * If you're using {{ece}} or {{ech}}, then you must upload the file [as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before it can be referenced.
    * If you're using {{eck}}, then install the file as [custom configuration files](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret).
    :::
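The walkthrough above refers to a sample realm configuration that the patch context does not show. As a rough, hedged sketch only (the realm name, entity IDs, URLs, and attribute OID are illustrative assumptions, not the document's own sample), such a realm in `elasticsearch.yml` could look like:

```yaml
xpack.security.authc.realms.saml.saml1:
  order: 2
  idp.metadata.path: saml/idp-metadata.xml
  idp.entity_id: "https://sso.example.com/"
  sp.entity_id: "https://kibana.example.com"
  sp.acs: "https://kibana.example.com/api/security/saml/callback"
  sp.logout: "https://kibana.example.com/logout"
  attributes.principal: "urn:oid:0.9.2342.19200300.100.1.1"
```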
@@ -212,8 +212,8 @@ The recommended steps for configuring these SAML attributes are as follows:

1. Consult your IdP to see what user attributes it can provide. This varies greatly between providers, but you should be able to obtain a list from the documentation, or from your local admin.
2. Review the list of [user properties](#saml-es-user-properties) that {{es}} supports, and decide which of them are useful to you, and can be provided by your IdP. At a *minimum*, the `principal` attribute is required.
-3. Configure your IdP to "release" those attributes to your {{kib}} SAML service provider. This process varies by provider: some will provide a user interface for this, while others may require that you edit configuration files. 
-
+3. Configure your IdP to "release" those attributes to your {{kib}} SAML service provider. This process varies by provider: some will provide a user interface for this, while others may require that you edit configuration files.
+
    Because {{es}} does not require that any specific URIs are used, you can use the URIs recommended by the IdP or your local administrator for each attribute.

4. Configure the SAML realm in {{es}} to associate the [{{es}} user properties](#saml-es-user-properties) to the URIs that you configured in your IdP. The [sample configuration](#saml-create-realm) configures the `principal` and `groups` attributes.

@@ -228,7 +228,7 @@ In general, {{es}} expects that the configured value for an attribute will be a

: This uses the SAML `NameID` value (all leading and trailing whitespace removed), but only if the NameID format is `urn:oasis:names:tc:SAML:2.0:nameid-format:persistent`. A SAML `NameID` element has an optional `Format` attribute that indicates the semantics of the provided name. It is common for IdPs to be configured with "transient" NameIDs that present a new identifier for each session. Since it is rarely useful to use a transient NameID as part of an attribute mapping, the "nameid:persistent" attribute name can be used as a safety mechanism that will cause an error if you attempt to map from a `NameID` that does not have a persistent value.

:::{note}
-Identity Providers can be either statically configured to release a `NameID` with a specific format, or they can be configured to try to conform with the requirements of the SP. The SP declares its requirements as part of the Authentication Request, using an element which is called the `NameIDPolicy`. If this is needed, you can set the relevant [settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings) named `nameid_format` in order to request that the IdP releases a `NameID` with a specific format.
+Identity Providers can be either statically configured to release a `NameID` with a specific format, or they can be configured to try to conform with the requirements of the SP. The SP declares its requirements as part of the Authentication Request, using an element which is called the `NameIDPolicy`. If this is needed, you can set the relevant [settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings) named `nameid_format` in order to request that the IdP releases a `NameID` with a specific format.
::: *friendlyName* @@ -304,7 +304,7 @@ It is sometimes necessary for a SAML SP to be able to impose specific restrictio The SAML SP defines a set of Authentication Context Class Reference values, which describe the restrictions to be imposed on the IdP, and sends these in the Authentication Request. The IdP attempts to grant these restrictions. If it cannot grant them, the authentication attempt fails. If the user is successfully authenticated, the Authentication Statement of the SAML Response contains an indication of the restrictions that were satisfied. -You can define the Authentication Context Class Reference values by using the `req_authn_context_class_ref` option in the SAML realm configuration. See [SAML realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings). +You can define the Authentication Context Class Reference values by using the `req_authn_context_class_ref` option in the SAML realm configuration. See [SAML realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings). {{es}} supports only the `exact` comparison method for the Authentication Context. When it receives the Authentication Response from the IdP, {{es}} examines the value of the Authentication Context Class Reference that is part of the Authentication Statement of the SAML Assertion. If it matches one of the requested values, the authentication is considered successful. Otherwise, the authentication attempt fails. @@ -364,7 +364,7 @@ You can configure {{es}} for signing, encryption or both, using a single key or The {{stack}} uses X.509 certificates with RSA private keys for SAML cryptography. These keys can be generated using any standard SSL tool, including the `elasticsearch-certutil` tool. -Your IdP may require that the {{stack}} have a cryptographic key for signing SAML messages, and that you provide the corresponding signing certificate within the Service Provider configuration (either within the {{stack}} SAML metadata file, or manually configured within the IdP administration interface). +Your IdP may require that the {{stack}} have a cryptographic key for signing SAML messages, and that you provide the corresponding signing certificate within the Service Provider configuration (either within the {{stack}} SAML metadata file, or manually configured within the IdP administration interface). While most IdPs do not expect authentication requests to be signed, it is commonly the case that signatures are required for logout requests. Your IdP will validate these signatures against the signing certificate that has been configured for the {{stack}} Service Provider. @@ -372,7 +372,7 @@ Encryption certificates are rarely needed, but the {{stack}} supports them for c ### Generate certificates and keys [_generating_certificates_and_keys] -{{es}} supports certificates and keys in either PEM, PKCS#12 or JKS format. Some Identity Providers are more restrictive in the formats they support, and will require you to provide the certificates as a file in a particular format. You should consult the documentation for your IdP to determine what formats they support. +{{es}} supports certificates and keys in either PEM, PKCS#12 or JKS format. Some Identity Providers are more restrictive in the formats they support, and will require you to provide the certificates as a file in a particular format. 
You should consult the documentation for your IdP to determine what formats they support.

#### Example: Using `openssl`

@@ -387,7 +387,7 @@ deployment:
  self:
```

-Using the [`elasticsearch-certutil` tool](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md), you can generate a signing certificate with the following command. Because PEM format is the most commonly supported format, the example generates a certificate in that format.
+Using the [`elasticsearch-certutil` tool](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md), you can generate a signing certificate with the following command. Because PEM format is the most commonly supported format, the example generates a certificate in that format.

```sh
bin/elasticsearch-certutil cert --self-signed --pem --days 1100 --name saml-sign --out saml-sign.zip
@@ -414,8 +414,8 @@ Encryption certificates can be generated with the same process.

By default, {{es}} will sign *all* outgoing SAML messages if a signing key has been configured.

-:::{tip} 
-* In self-managed clusters, file path settings is resolved relative to the {{es}} config directory. {{es}} will automatically monitor this file for changes and will reload the configuration whenever it is updated. 
+:::{tip}
+* In self-managed clusters, file path settings are resolved relative to the {{es}} config directory. {{es}} will automatically monitor these files for changes and will reload the configuration whenever they are updated.
* If you're using {{ece}} or {{ech}}, then you must upload any certificate or keystore files [as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before they can be referenced. You can add these files to your existing SAML bundle.
* If you're using {{eck}}, then install the files as [custom configuration files](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret).
:::

@@ -478,8 +478,8 @@ The {{es}} {{security-features}} support a single key for message decryption. If

If an `Assertion` contains both encrypted and plain-text attributes, then failure to decrypt the encrypted attributes will not cause an automatic rejection. Rather, {{es}} processes the available plain-text attributes (and any `EncryptedAttributes` that could be decrypted).

-:::{tip} 
-* In self-managed clusters, file path settings is resolved relative to the {{es}} config directory. {{es}} will automatically monitor this file for changes and will reload the configuration whenever it is updated. 
+:::{tip}
+* In self-managed clusters, file path settings are resolved relative to the {{es}} config directory. {{es}} will automatically monitor these files for changes and will reload the configuration whenever they are updated.
* If you're using {{ece}} or {{ech}}, then you must upload any certificate or keystore files [as a custom bundle](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) before they can be referenced. You can add these files to your existing SAML bundle.
* If you're using {{eck}}, then install the files as [custom configuration files](/deploy-manage/deploy/cloud-on-k8s/custom-configuration-files-plugins.md#use-a-volume-and-volume-mount-together-with-a-configmap-or-secret).
:::

@@ -519,7 +519,7 @@ If you want to use **PKCS#12 formatted** files or a **Java Keystore** for SAML e

Some Identity Providers support importing a metadata file from the Service Provider. 
This will automatically configure many of the integration options between the IdP and the SP. -The {{stack}} supports generating such a metadata file using the [SAML service provider metadata API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-saml-service-provider-metadata) or the [`bin/elasticsearch-saml-metadata` command](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/saml-metadata.md). +The {{stack}} supports generating such a metadata file using the [SAML service provider metadata API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-saml-service-provider-metadata) or the [`bin/elasticsearch-saml-metadata` command](elasticsearch://reference/elasticsearch/command-line-tools/saml-metadata.md). ### Using the SAML service provider metadata API @@ -531,7 +531,7 @@ curl -u user_name:password -X GET http://localhost:9200/_security/saml/metadata ### Using the `elasticsearch-saml-metadata` command -You can generate the SAML metadata by running the [`bin/elasticsearch-saml-metadata` command](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/saml-metadata.md). +You can generate the SAML metadata by running the [`bin/elasticsearch-saml-metadata` command](elasticsearch://reference/elasticsearch/command-line-tools/saml-metadata.md). ```{applies_to} deployment: self: ``` @@ -566,17 +566,17 @@ When a user authenticates using SAML, they are identified to the {{stack}}, but Your SAML users cannot do anything until they are assigned roles. This can be done through either the [add role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping) or [authorization realms](/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md#authorization_realms). -You can map SAML users to roles in the following ways: +You can map SAML users to roles in the following ways: * Using the role mappings page in {{kib}}. -* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). +* Using the [role mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-role-mapping). * By delegating authorization to another realm. ::::{note} You can't use [role mapping files](/deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md#mapping-roles-file) to grant roles to users authenticating using SAML. :::: -### Example: Using the role mapping API +### Example: Using the role mapping API This is an example of a simple role mapping that grants the `example_role` role to any user who authenticates against the `saml1` realm: @@ -628,10 +628,10 @@ PUT /_security/role_mapping/saml-finance If your users also exist in a repository that can be directly accessed by {{es}} (such as an LDAP directory), then you can use [authorization realms](/deploy-manage/users-roles/cluster-or-deployment-auth/realm-chains.md#authorization_realms) instead of role mappings. -In this case, you perform the following steps: +In this case, you perform the following steps: -1. In your SAML realm, assigned a SAML attribute to act as the lookup userid, by configuring the `attributes.principal` setting. -2. Create a new realm that can lookup users from your local repository (e.g. an `ldap` realm) +1. In your SAML realm, assign a SAML attribute to act as the lookup user ID by configuring the `attributes.principal` setting. +2.
Create a new realm that can look up users from your local repository (e.g. an `ldap` realm). 3. In your SAML realm, set `authorization_realms` to the name of the realm you created in step 2. ## Configure {{kib}} [saml-configure-kibana] diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/service-accounts.md b/deploy-manage/users-roles/cluster-or-deployment-auth/service-accounts.md index 941a67771..5e0bbdf3e 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/service-accounts.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/service-accounts.md @@ -3,10 +3,10 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/service-accounts.html applies_to: deployment: - ess: - ece: - eck: - self: + ess: + ece: + eck: + self: --- # Service accounts [service-accounts] @@ -26,7 +26,7 @@ Service accounts provide flexibility over [built-in users](built-in-users.md) be Service accounts are not included in the response of the [get users API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-get-user). To retrieve a service account, use the [get service accounts API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-get-service-accounts). Use the [get service account credentials API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-get-service-credentials) to retrieve all service credentials for a service account. -## Service accounts usage [service-accounts-explanation] +## Service accounts usage [service-accounts-explanation] Service accounts have a [unique principal](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-get-service-accounts#security-api-get-service-accounts-path-params) that takes the format of `<namespace>/<service>`, where the `namespace` is a top-level grouping of service accounts, and `service` is the name of the service and must be unique within its namespace. @@ -38,13 +38,13 @@ Service accounts are predefined in code. The following service accounts are avai `elastic/kibana` : The service account used by {{kib}} to communicate with {{es}}. -::::{important} +::::{important} Do not attempt to use service accounts for authenticating individual users. Service accounts can only be authenticated with service tokens, which are not applicable to regular users. :::: -## Service account tokens [service-accounts-tokens] +## Service account tokens [service-accounts-tokens] A service account token, or service token, is a unique string that a service uses to authenticate with {{es}}. For a given service account, each token must have a unique name. Because tokens include access credentials, they should always be kept secret by whichever client is using them. @@ -53,7 +53,7 @@ Service tokens can be backed by either the `.security` index (recommended) or th You must create a service token to use a service account. You can create a service token using either: * The [create service account token API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-create-service-token), which saves the new service token in the `.security` index and returns the bearer token in the HTTP response.
-* Self-managed and {{eck}} deployments only: The [elasticsearch-service-tokens](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/service-tokens-command.md) CLI tool, which saves the new service token in the `$ES_HOME/config/service_tokens` file and outputs the bearer token to your terminal +* Self-managed and {{eck}} deployments only: The [elasticsearch-service-tokens](elasticsearch://reference/elasticsearch/command-line-tools/service-tokens-command.md) CLI tool, which saves the new service token in the `$ES_HOME/config/service_tokens` file and outputs the bearer token to your terminal We recommend that you create service tokens via the REST API rather than the CLI. The API stores service tokens within the `.security` index which means that the tokens are available for authentication on all nodes, and will be backed up within cluster snapshots. The use of the CLI is intended for cases where there is an external orchestration process (such as [{{ece}}](https://www.elastic.co/guide/en/cloud-enterprise/current) or [{{eck}}](https://www.elastic.co/guide/en/cloud-on-k8s/current)) that will manage the creation and distribution of the `service_tokens` file. @@ -62,9 +62,9 @@ Both of these methods (API and CLI) create a service token with a guaranteed sec Service tokens never expire. You must actively [delete](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-delete-service-token) them if they are no longer needed. -## Authenticate with service tokens [authenticate-with-service-account-token] +## Authenticate with service tokens [authenticate-with-service-account-token] -::::{note} +::::{note} Service accounts currently do not support basic authentication. :::: diff --git a/deploy-manage/users-roles/cluster-or-deployment-auth/token-based-authentication-services.md b/deploy-manage/users-roles/cluster-or-deployment-auth/token-based-authentication-services.md index dfe380194..67db6261b 100644 --- a/deploy-manage/users-roles/cluster-or-deployment-auth/token-based-authentication-services.md +++ b/deploy-manage/users-roles/cluster-or-deployment-auth/token-based-authentication-services.md @@ -3,10 +3,10 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/token-authentication-services.html applies_to: deployment: - ess: - ece: - eck: - self: + ess: + ece: + eck: + self: --- # Token-based authentication services [token-authentication-services] @@ -16,7 +16,7 @@ The {{stack-security-features}} authenticate users by using realms and one or mo The {{security-features}} provide the following built-in token-based authentication services, which are listed in the order they are consulted: *service-accounts* -: The [service accounts](service-accounts.md) use either the [create service account token API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-create-service-token) or the [elasticsearch-service-tokens](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/service-tokens-command.md) CLI tool to generate service account tokens. +: The [service accounts](service-accounts.md) use either the [create service account token API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-create-service-token) or the [elasticsearch-service-tokens](elasticsearch://reference/elasticsearch/command-line-tools/service-tokens-command.md) CLI tool to generate service account tokens. 
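As a quick illustration, here is a minimal sketch of creating such a token with the API (the token name `my-token` is hypothetical; `elastic/kibana` is one of the predefined service accounts):

```sh
# A sketch: create a service token for the elastic/kibana service account.
# The HTTP response contains the bearer token value for the service to use.
curl -u elastic -X POST "http://localhost:9200/_security/service/elastic/kibana/credential/token/my-token"
```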
To use a service account token, include the generated token value in a request with an `Authorization: Bearer` header: @@ -24,7 +24,7 @@ To use a service account token, include the generated token value in a request w curl -H "Authorization: Bearer AAEAAWVsYXN0aWMvZ...mXQtc2VydmMTpyNXdkYmRib1FTZTl2R09Ld2FKR0F3" http://localhost:9200/_cluster/health ``` -::::{important} +::::{important} Do not attempt to use service accounts for authenticating individual users. Service accounts can only be authenticated with service tokens, which are not applicable to regular users. :::: @@ -52,7 +52,7 @@ $$$token-authentication-api-key$$$ Depending on your use case, you may want to decide on the lifetime of the tokens generated by these services. You can then use this information to decide which service to use to generate and manage the tokens. Non-expiring API keys may seem like the easy option but you must consider the security implications that come with non-expiring keys. Both the *token-service* and *api-key-service* permit you to invalidate the tokens. See [invalidate token API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-invalidate-token) and [invalidate API key API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-invalidate-api-key). -::::{important} +::::{important} Authentication support for JWT bearer tokens was introduced in {{es}} 8.2 through the [JWT authentication](jwt.md), which cannot be enabled through token-authentication services. Realms offer flexible order and configurations of zero, one, or multiple JWT realms. :::: diff --git a/explore-analyze/alerts-cases/alerts/alerting-setup.md b/explore-analyze/alerts-cases/alerts/alerting-setup.md index f8c547b56..f4bc74636 100644 --- a/explore-analyze/alerts-cases/alerts/alerting-setup.md +++ b/explore-analyze/alerts-cases/alerts/alerting-setup.md @@ -20,9 +20,9 @@ If you are using an **on-premises** {{stack}} deployment: If you are using an **on-premises** {{stack}} deployment with [**security**](../../../deploy-manage/security.md): -* If you are unable to access {{kib}} {{alert-features}}, ensure that you have not [explicitly disabled API keys](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#api-key-service-settings). +* If you are unable to access {{kib}} {{alert-features}}, ensure that you have not [explicitly disabled API keys](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#api-key-service-settings). -The alerting framework uses queries that require the `search.allow_expensive_queries` setting to be `true`. See the scripts [documentation](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-query.md#_allow_expensive_queries_4). +The alerting framework uses queries that require the `search.allow_expensive_queries` setting to be `true`. See the scripts [documentation](elasticsearch://reference/query-languages/query-dsl-script-query.md#_allow_expensive_queries_4). 
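If the setting has been turned off, a minimal sketch of re-enabling it dynamically with the cluster settings API (host and credentials are placeholders):

```sh
# A sketch: allow expensive queries cluster-wide; the setting is dynamic
curl -u elastic -X PUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "search.allow_expensive_queries": true
  }
}'
```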
## Production considerations and scaling guidance [alerting-setup-production] diff --git a/explore-analyze/alerts-cases/alerts/rule-type-es-query.md b/explore-analyze/alerts-cases/alerts/rule-type-es-query.md index 9001556d6..289532836 100644 --- a/explore-analyze/alerts-cases/alerts/rule-type-es-query.md +++ b/explore-analyze/alerts-cases/alerts/rule-type-es-query.md @@ -52,7 +52,7 @@ When you create an {{es}} query rule, your choice of query type affects the info : Specify how to calculate the value that is compared to the threshold. The value is calculated by aggregating a numeric field within the time window. The aggregation options are: `count`, `average`, `sum`, `min`, and `max`. When using `count` the document count is used and an aggregation field is not necessary. Over or Grouped Over - : Specify whether the aggregation is applied over all documents or split into groups using up to four grouping fields. If you choose to use grouping, it’s a [terms](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) or [multi terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-multi-terms-aggregation.md); an alert will be created for each unique set of values when it meets the condition. To limit the number of alerts on high cardinality fields, you must specify the number of groups to check against the threshold. Only the top groups are checked. + : Specify whether the aggregation is applied over all documents or split into groups using up to four grouping fields. If you choose to use grouping, it’s a [terms](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) or [multi terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-multi-terms-aggregation.md); an alert will be created for each unique set of values when it meets the condition. To limit the number of alerts on high cardinality fields, you must specify the number of groups to check against the threshold. Only the top groups are checked. Threshold : Defines a threshold value and a comparison operator (`is above`, `is above or equals`, `is below`, `is below or equals`, or `is between`). The value calculated by the aggregation is compared to this threshold. @@ -150,7 +150,7 @@ The following variables are specific to the {{es}} query rule: {{/context.hits}} ``` - The documents returned by `context.hits` include the [`_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) field. If the {{es}} query search API’s [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param) parameter is used, documents will also return the `fields` field, which can be used to access any runtime fields defined by the [`runtime_mappings`](../../../manage-data/data-store/mapping/define-runtime-fields-in-search-request.md) parameter. For example: + The documents returned by `context.hits` include the [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) field. 
If the {{es}} query search API’s [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param) parameter is used, documents will also return the `fields` field, which can be used to access any runtime fields defined by the [`runtime_mappings`](../../../manage-data/data-store/mapping/define-runtime-fields-in-search-request.md) parameter. For example: ```handlebars {{#context.hits}} @@ -162,7 +162,7 @@ The following variables are specific to the {{es}} query rule: 1. The `fields` parameter here is used to access the `day_of_week` runtime field. - As the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-response) response always returns an array of values for each field, the [Mustache](https://mustache.github.io/) template array syntax is used to iterate over these values in your actions. For example: + As the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-response) response always returns an array of values for each field, the [Mustache](https://mustache.github.io/) template array syntax is used to iterate over these values in your actions. For example: ```handlebars {{#context.hits}} diff --git a/explore-analyze/alerts-cases/watcher/actions-email.md b/explore-analyze/alerts-cases/watcher/actions-email.md index 3e18a4b96..ab7a52eb9 100644 --- a/explore-analyze/alerts-cases/watcher/actions-email.md +++ b/explore-analyze/alerts-cases/watcher/actions-email.md @@ -282,7 +282,7 @@ bin/elasticsearch-keystore add xpack.notification.email.account.exchange_account The `email` action supports sending messages with an HTML body. However, for security reasons, {{watcher}} [sanitizes](https://en.wikipedia.org/wiki/HTML_sanitization) the HTML. -You can control which HTML features are allowed or disallowed by configuring the `xpack.notification.email.html.sanitization.allow` and `xpack.notification.email.html.sanitization.disallow` settings in `elasticsearch.yml`. You can specify individual HTML elements and [HTML feature groups](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#html-feature-groups). By default, {{watcher}} allows the following features: `body`, `head`, `_tables`, `_links`, `_blocks`, `_formatting` and `img:embedded`. +You can control which HTML features are allowed or disallowed by configuring the `xpack.notification.email.html.sanitization.allow` and `xpack.notification.email.html.sanitization.disallow` settings in `elasticsearch.yml`. You can specify individual HTML elements and [HTML feature groups](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#html-feature-groups). By default, {{watcher}} allows the following features: `body`, `head`, `_tables`, `_links`, `_blocks`, `_formatting` and `img:embedded`. For example, the following settings allow the HTML to contain tables and block elements, but disallow `<h4>`, `<h5>` and `<h6>` tags. diff --git a/explore-analyze/alerts-cases/watcher/actions-index.md b/explore-analyze/alerts-cases/watcher/actions-index.md index d28a50d1d..14d52df6a 100644 --- a/explore-analyze/alerts-cases/watcher/actions-index.md +++ b/explore-analyze/alerts-cases/watcher/actions-index.md @@ -43,7 +43,7 @@ The following snippet shows a simple `index` action definition: | `op_type` | no | `index` | The [op_type](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-create) for the index operation. Must be one of either `index` or `create`. Must be `create` if `index` is a data stream. | | `execution_time_field` | no | - | The field that will store/index the watch execution time. | | `timeout` | no | 60s | The timeout for waiting for the index api call to return. If no response is returned within this time, the index action times out and fails. This setting overrides the default timeouts. | -| `refresh` | no | - | Optional setting of the [refresh policy](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/refresh-parameter.md) for the write request | +| `refresh` | no | - | Optional setting of the [refresh policy](elasticsearch://reference/elasticsearch/rest-apis/refresh-parameter.md) for the write request | ## Multi-document support [anatomy-actions-index-multi-doc-support] diff --git a/explore-analyze/alerts-cases/watcher/actions-jira.md b/explore-analyze/alerts-cases/watcher/actions-jira.md index 7ff0ec9b4..dd395e5d7 100644 --- a/explore-analyze/alerts-cases/watcher/actions-jira.md +++ b/explore-analyze/alerts-cases/watcher/actions-jira.md @@ -108,7 +108,7 @@ xpack.notification.jira: It is strongly advised to use Basic Authentication with secured HTTPS protocol only. :::: -You can also specify defaults for the [Jira issues](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#jira-account-attributes): +You can also specify defaults for the [Jira issues](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#jira-account-attributes): ```yaml xpack.notification.jira: diff --git a/explore-analyze/alerts-cases/watcher/actions-slack.md b/explore-analyze/alerts-cases/watcher/actions-slack.md index 2398e9fb8..0770c0dd7 100644 --- a/explore-analyze/alerts-cases/watcher/actions-slack.md +++ b/explore-analyze/alerts-cases/watcher/actions-slack.md @@ -153,7 +153,7 @@ You can no longer configure Slack accounts using `elasticsearch.yml` settings. P :::: -You can specify defaults for the [Slack notification attributes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#slack-account-attributes): +You can specify defaults for the [Slack notification attributes](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#slack-account-attributes): ```yaml xpack.notification.slack: diff --git a/explore-analyze/alerts-cases/watcher/actions-webhook.md b/explore-analyze/alerts-cases/watcher/actions-webhook.md index dcf6dabbf..3c6674f24 100644 --- a/explore-analyze/alerts-cases/watcher/actions-webhook.md +++ b/explore-analyze/alerts-cases/watcher/actions-webhook.md @@ -71,7 +71,7 @@ You can use basic authentication when sending a request to a secured webservice. By default, both the username and the password are stored in the `.watches` index in plain text. When the {{es}} {{security-features}} are enabled, {{watcher}} can encrypt the password before storing it.
:::: -You can also use PKI-based authentication when submitting requests to a cluster that has {{es}} {{security-features}} enabled. When you use PKI-based authentication instead of HTTP basic auth, you don’t need to store any authentication information in the watch itself. To use PKI-based authentication, you [configure the SSL key settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#ssl-notification-settings) for {{watcher}} in `elasticsearch.yml`. +You can also use PKI-based authentication when submitting requests to a cluster that has {{es}} {{security-features}} enabled. When you use PKI-based authentication instead of HTTP basic auth, you don’t need to store any authentication information in the watch itself. To use PKI-based authentication, you [configure the SSL key settings](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#ssl-notification-settings) for {{watcher}} in `elasticsearch.yml`. ## Query parameters [webhook-query-parameters] diff --git a/explore-analyze/alerts-cases/watcher/enable-watcher.md b/explore-analyze/alerts-cases/watcher/enable-watcher.md index d2550a260..52d4b4bf5 100644 --- a/explore-analyze/alerts-cases/watcher/enable-watcher.md +++ b/explore-analyze/alerts-cases/watcher/enable-watcher.md @@ -164,4 +164,4 @@ An example on how to configure a new account from the Elastic cloud console: 6. The new email account is now set up. It will now be used by default for watcher email actions. -For a full reference of all available settings, see the [Elasticsearch documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#email-notification-settings). +For a full reference of all available settings, see the [Elasticsearch documentation](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#email-notification-settings). diff --git a/explore-analyze/alerts-cases/watcher/encrypting-data.md b/explore-analyze/alerts-cases/watcher/encrypting-data.md index a74ef574a..81233df32 100644 --- a/explore-analyze/alerts-cases/watcher/encrypting-data.md +++ b/explore-analyze/alerts-cases/watcher/encrypting-data.md @@ -14,19 +14,19 @@ Every `password` field that is used in your watch within an HTTP basic authentic To encrypt sensitive data in {{watcher}}: -1. Use the [elasticsearch-syskeygen](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/syskeygen.md) command to create a system key file. +1. Use the [elasticsearch-syskeygen](elasticsearch://reference/elasticsearch/command-line-tools/syskeygen.md) command to create a system key file. 2. Copy the `system_key` file to all of the nodes in your cluster. ::::{important} The system key is a symmetric key, so the same key must be used on every node in the cluster. :::: -3. Set the [`xpack.watcher.encrypt_sensitive_data` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md): +3. Set the [`xpack.watcher.encrypt_sensitive_data` setting](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md): ```sh xpack.watcher.encrypt_sensitive_data: true ``` -4. Set the [`xpack.watcher.encryption_key` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md) in the [{{es}} keystore](../../../deploy-manage/security/secure-settings.md) on each node in the cluster. +4. 
Set the [`xpack.watcher.encryption_key` setting](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md) in the [{{es}} keystore](../../../deploy-manage/security/secure-settings.md) on each node in the cluster. For example, run the following command to import the `system_key` file on each node: diff --git a/explore-analyze/alerts-cases/watcher/input-search.md b/explore-analyze/alerts-cases/watcher/input-search.md index 6b08ae491..277e4fb03 100644 --- a/explore-analyze/alerts-cases/watcher/input-search.md +++ b/explore-analyze/alerts-cases/watcher/input-search.md @@ -141,9 +141,9 @@ The total number of hits in the search response is returned as an object in the | `request.indices` | no | - | The indices to search. If omitted, all indices are searched, which is the default behaviour in Elasticsearch. | | `request.body` | no | - | The body of the request. The [request body](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) follows the same structure you normally send in the body of a REST `_search` request. The body can be static text or include `mustache` [templates](how-watcher-works.md#templates). | | `request.template` | no | - | The body of the search template. See [configure templates](how-watcher-works.md#templates) for more information. | -| `request.indices_options.expand_wildcards` | no | `open` | How to expand wildcards. Valid values are: `all`, `open`, `closed`, and `none` See [`expand_wildcards`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) for more information. | -| `request.indices_options.ignore_unavailable` | no | `true` | Whether the search should ignore unavailable indices. See [`ignore_unavailable`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) for more information. | -| `request.indices_options.allow_no_indices` | no | `true` | Whether to allow a search where a wildcard indices expression results in no concrete indices. See [allow_no_indices](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) for more information. | +| `request.indices_options.expand_wildcards` | no | `open` | How to expand wildcards. Valid values are: `all`, `open`, `closed`, and `none`. See [`expand_wildcards`](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) for more information. | +| `request.indices_options.ignore_unavailable` | no | `true` | Whether the search should ignore unavailable indices. See [`ignore_unavailable`](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) for more information. | +| `request.indices_options.allow_no_indices` | no | `true` | Whether to allow a search where a wildcard indices expression results in no concrete indices. See [allow_no_indices](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) for more information. | | `extract` | no | - | An array of JSON keys to extract from the search response and load as the payload. When a search generates a large response, you can use `extract` to select the relevant fields instead of loading the entire response. | | `timeout` | no | 1m | The timeout for waiting for the search api call to return. If no response is returned within this time, the search input times out and fails. This setting overrides the default search operations timeouts.
| diff --git a/explore-analyze/alerts-cases/watcher/schedule-types.md b/explore-analyze/alerts-cases/watcher/schedule-types.md index ba2e6e638..5b164ca40 100644 --- a/explore-analyze/alerts-cases/watcher/schedule-types.md +++ b/explore-analyze/alerts-cases/watcher/schedule-types.md @@ -421,10 +421,10 @@ By default, the `yearly` schedule is evaluated in the UTC time zone. To use a di ## {{watcher}} cron schedule [schedule-cron] -Defines a [`schedule`](trigger-schedule.md) using a [cron expression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-cron-expressions) that specifiues when to execute a watch. +Defines a [`schedule`](trigger-schedule.md) using a [cron expression](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-cron-expressions) that specifies when to execute a watch. ::::{tip} -While cron expressions are powerful, a regularly occurring schedule is easier to configure with the other schedule types. If you must use a cron schedule, make sure you verify it with [`elasticsearch-croneval`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-croneval.md) . +While cron expressions are powerful, a regularly occurring schedule is easier to configure with the other schedule types. If you must use a cron schedule, make sure you verify it with [`elasticsearch-croneval`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-croneval.md). :::: @@ -487,7 +487,7 @@ By default, cron expressions are evaluated in the UTC time zone. To use a differ ### Use croneval to validate cron expressions [croneval] -{{es}} provides a [`elasticsearch-croneval`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-croneval.md) command line tool in the `$ES_HOME/bin` directory that you can use to check that your cron expressions are valid and produce the expected results. +{{es}} provides an [`elasticsearch-croneval`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-croneval.md) command line tool in the `$ES_HOME/bin` directory that you can use to check that your cron expressions are valid and produce the expected results. To validate a cron expression, pass it in as a parameter to `elasticsearch-croneval`: diff --git a/explore-analyze/alerts-cases/watcher/transform-search.md b/explore-analyze/alerts-cases/watcher/transform-search.md index e86a1cbbd..1dd0d73bf 100644 --- a/explore-analyze/alerts-cases/watcher/transform-search.md +++ b/explore-analyze/alerts-cases/watcher/transform-search.md @@ -52,9 +52,9 @@ The following table lists all available settings for the search {{watcher-transf | `request.search_type` | no | query_then_fetch | The search [type](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search). | | `request.indices` | no | all indices | One or more indices to search on. | | `request.body` | no | `match_all` query | The body of the request. The [request body](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) follows the same structure you normally send in the body of a REST `_search` request. The body can be static text or include `mustache` [templates](how-watcher-works.md#templates). | -| `request.indices_options.expand_wildcards` | no | `open` | Determines how to expand indices wildcards. An array consisting of a combination of `open`, `closed`, and `hidden`. Alternatively a value of `none` or `all`.
(see [multi-target syntax](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)) | -| `request.indices_options.ignore_unavailable` | no | `true` | A boolean value that determines whether the search should leniently ignore unavailable indices (see [multi-target syntax](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)) | -| `request.indices_options.allow_no_indices` | no | `true` | A boolean value that determines whether the search should leniently return no results when no indices are resolved (see [multi-target syntax](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)) | +| `request.indices_options.expand_wildcards` | no | `open` | Determines how to expand indices wildcards. An array consisting of a combination of `open`, `closed`, and `hidden`. Alternatively a value of `none` or `all`. (see [multi-target syntax](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)) | +| `request.indices_options.ignore_unavailable` | no | `true` | A boolean value that determines whether the search should leniently ignore unavailable indices (see [multi-target syntax](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)) | +| `request.indices_options.allow_no_indices` | no | `true` | A boolean value that determines whether the search should leniently return no results when no indices are resolved (see [multi-target syntax](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)) | | `request.template` | no | - | The body of the search template. See [configure templates](how-watcher-works.md#templates) for more information. | | `timeout` | no | 30s | The timeout for waiting for the search api call to return. If no response is returned within this time, the search {{watcher-transform}} times out and fails. This setting overrides the default timeouts. | diff --git a/explore-analyze/alerts-cases/watcher/watch-cluster-status.md b/explore-analyze/alerts-cases/watcher/watch-cluster-status.md index 5f1317c31..70edbdab5 100644 --- a/explore-analyze/alerts-cases/watcher/watch-cluster-status.md +++ b/explore-analyze/alerts-cases/watcher/watch-cluster-status.md @@ -85,7 +85,7 @@ PUT _watcher/watch/cluster_health_watch It would be a good idea to create a user with the minimum privileges required for use with such a watch configuration. -Depending on how your cluster is configured, there may be additional settings required before the watch can access your cluster such as keystores, truststores, or certificates. For more information, see [{{watcher}} settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md). +Depending on how your cluster is configured, there may be additional settings required before the watch can access your cluster, such as keystores, truststores, or certificates. For more information, see [{{watcher}} settings](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md). If you check the watch history, you’ll see that the cluster status is recorded as part of the `watch_record` each time the watch executes.
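As a hedged sketch of such a check (reusing the `cluster_health_watch` ID from the example above; host and credentials are placeholders):

```sh
# A sketch: list the most recent watch records for cluster_health_watch
curl -u elastic -X GET "http://localhost:9200/.watcher-history*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "watch_id": "cluster_health_watch" } },
  "sort": [ { "result.execution_time": "desc" } ]
}'
```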
diff --git a/explore-analyze/alerts-cases/watcher/watcher-ui.md b/explore-analyze/alerts-cases/watcher/watcher-ui.md index d8eb217ff..9eb0b55dc 100644 --- a/explore-analyze/alerts-cases/watcher/watcher-ui.md +++ b/explore-analyze/alerts-cases/watcher/watcher-ui.md @@ -135,7 +135,7 @@ On the Watch overview page, click **Create** and choose **Create advanced watch* The **Simulate** tab allows you to override parts of the watch, and then run a simulation. Be aware of these implementation details on overrides: -* Trigger overrides use [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math). +* Trigger overrides use [date math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math). * Input overrides accept a JSON blob. * Condition overrides indicate whether you want to force the condition to always be `true`. * Action overrides support [multiple options](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-watcher-execute-watch). diff --git a/explore-analyze/discover/discover-get-started.md b/explore-analyze/discover/discover-get-started.md index 411cc2e3c..a469f8876 100644 --- a/explore-analyze/discover/discover-get-started.md +++ b/explore-analyze/discover/discover-get-started.md @@ -29,7 +29,7 @@ Select the data you want to explore, and then specify the time range in which to 2. Select the data view that contains the data you want to explore. ::::{tip} By default, {{kib}} requires a [{{data-source}}](../find-and-organize/data-views.md) to access your Elasticsearch data. A {{data-source}} can point to one or more indices, [data streams](../../manage-data/data-store/data-streams.md), or [index aliases](https://www.elastic.co/guide/en/elasticsearch/reference/current/alias.html). When adding data to {{es}} using one of the many integrations available, sometimes data views are created automatically, but you can also create your own. - + You can also [try {{esql}}](try-esql.md), which lets you query any data you have in {{es}} without specifying a {{data-source}} first. :::: If you’re using sample data, data views are automatically created and are ready to use. @@ -283,5 +283,5 @@ This section references common questions and issues encountered when using Disco This can happen in several cases: -* With runtime fields and regular keyword fields, when the string exceeds the value set for the [ignore_above](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ignore-above.md) setting used when indexing the data into {{es}}. +* With runtime fields and regular keyword fields, when the string exceeds the value set for the [ignore_above](elasticsearch://reference/elasticsearch/mapping-reference/ignore-above.md) setting used when indexing the data into {{es}}. * Due to the structure of nested fields, a leaf field added to the table as a column will not contain values in any of its cells. Instead, add the root field as a column to view a JSON representation of its values. Learn more in [this blog post](https://www.elastic.co/de/blog/discover-uses-fields-api-in-7-12). diff --git a/explore-analyze/find-and-organize/data-views.md b/explore-analyze/find-and-organize/data-views.md index a7dc8db7c..07591d7b1 100644 --- a/explore-analyze/find-and-organize/data-views.md +++ b/explore-analyze/find-and-organize/data-views.md @@ -476,7 +476,7 @@ Built-in validation is unsupported for scripted fields. When your scripts contai 5. Select **Set format**, then enter the **Format** for the field.
::::{note} -For numeric fields the default field formatters are based on the `meta.unit` field. The unit is associated with a [time unit](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units), percent, or byte. The convention for percents is to use value 1 to mean 100%. +For numeric fields, the default field formatters are based on the `meta.unit` field. The unit is associated with a [time unit](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units), percent, or byte. The convention for percents is to use value 1 to mean 100%. :::: @@ -613,7 +613,7 @@ Numeric fields support **Bytes**, **Color**, **Duration**, **Histogram**, **Numb The **Bytes**, **Number**, and **Percentage** formatters enable you to choose the display formats of numbers in the field using the [Elastic numeral pattern](../../explore-analyze/numeral-formatting.md) syntax that {{kib}} maintains. -The **Histogram** formatter is used only for the [histogram field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/histogram.md). When you use the **Histogram** formatter, you can apply the **Bytes**, **Number**, or **Percentage** format to aggregated data. +The **Histogram** formatter is used only for the [histogram field type](elasticsearch://reference/elasticsearch/mapping-reference/histogram.md). When you use the **Histogram** formatter, you can apply the **Bytes**, **Number**, or **Percentage** format to aggregated data. You can specify the following types to the `Url` field formatter: diff --git a/explore-analyze/geospatial-analysis.md b/explore-analyze/geospatial-analysis.md index 155fb30d1..b9a90ba13 100644 --- a/explore-analyze/geospatial-analysis.md +++ b/explore-analyze/geospatial-analysis.md @@ -15,7 +15,7 @@ Not sure where to get started with {{es}} and geo? Then, you have come to the ri ## Geospatial mapping [geospatial-mapping] -{{es}} supports two types of geo data: [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) fields which support lat/lon pairs, and [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md) fields, which support points, lines, circles, polygons, multi-polygons, and so on. Use [explicit mapping](../manage-data/data-store/mapping/explicit-mapping.md) to index geo data fields. +{{es}} supports two types of geo data: [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) fields, which support lat/lon pairs, and [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md) fields, which support points, lines, circles, polygons, multi-polygons, and so on. Use [explicit mapping](../manage-data/data-store/mapping/explicit-mapping.md) to index geo data fields. Have an index with lat/lon pairs but no geo_point mapping? Use [runtime fields](../manage-data/data-store/mapping/map-runtime-field.md) to make a geo_point field without reindexing. @@ -24,46 +24,46 @@ Data is often messy and incomplete. [Ingest pipelines](../manage-data/ingest/transform-enrich/ingest-pipelines.md) let you clean, transform, and augment your data before indexing.
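As a minimal sketch of the mechanics (the pipeline and field names here are hypothetical), a pipeline is a named list of processors created with a single request; the items below call out the geo-oriented processors:

```sh
# A sketch: a pipeline whose geoip processor derives a location from a
# hypothetical "client_ip" field at ingest time
curl -u elastic -X PUT "http://localhost:9200/_ingest/pipeline/geo-cleanup" -H 'Content-Type: application/json' -d'
{
  "processors": [
    { "geoip": { "field": "client_ip", "target_field": "client_geo" } }
  ]
}'
```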
-* Use [CSV](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/csv-processor.md) together with [explicit mapping](../manage-data/data-store/mapping/explicit-mapping.md) to index CSV files with geo data. Kibana’s [Import CSV](visualize/maps/import-geospatial-data.md) feature can help with this. -* Use [GeoIP](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) to add geographical location of an IPv4 or IPv6 address. -* Use [geo-grid processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/ingest-geo-grid-processor.md) to convert grid tiles or hexagonal cell ids to bounding boxes or polygons which describe their shape. +* Use [CSV](elasticsearch://reference/ingestion-tools/enrich-processor/csv-processor.md) together with [explicit mapping](../manage-data/data-store/mapping/explicit-mapping.md) to index CSV files with geo data. Kibana’s [Import CSV](visualize/maps/import-geospatial-data.md) feature can help with this. +* Use [GeoIP](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) to add the geographical location of an IPv4 or IPv6 address. +* Use [geo-grid processor](elasticsearch://reference/ingestion-tools/enrich-processor/ingest-geo-grid-processor.md) to convert grid tiles or hexagonal cell ids to bounding boxes or polygons which describe their shape. * Use [geo_match enrich policy](../manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md) for reverse geocoding. For example, use [reverse geocoding](visualize/maps/reverse-geocoding-tutorial.md) to visualize metropolitan areas by web traffic. ## Query [geospatial-query] -[Geo queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/geo-queries.md) answer location-driven questions. Find documents that intersect with, are within, are contained by, or do not intersect your query geometry. Combine geospatial queries with full text search queries for unparalleled searching experience. For example, "Show me all subscribers that live within 5 miles of our new gym location, that joined in the last year and have running mentioned in their profile". +[Geo queries](elasticsearch://reference/query-languages/geo-queries.md) answer location-driven questions. Find documents that intersect with, are within, are contained by, or do not intersect your query geometry. Combine geospatial queries with full text search queries for an unparalleled searching experience. For example, "Show me all subscribers that live within 5 miles of our new gym location, that joined in the last year and have running mentioned in their profile". ## ES|QL [esql-query] -[ES|QL](query-filter/languages/esql.md) has support for [Geospatial Search](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-spatial-functions) functions, enabling efficient index searching for documents that intersect with, are within, are contained by, or are disjoint from a query geometry. In addition, the `ST_DISTANCE` function calculates the distance between two points. +[ES|QL](query-filter/languages/esql.md) has support for [Geospatial Search](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-spatial-functions) functions, enabling efficient index searching for documents that intersect with, are within, are contained by, or are disjoint from a query geometry. In addition, the `ST_DISTANCE` function calculates the distance between two points.
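As a hedged sketch (the `airports` index and `location` field are hypothetical), such a query can be sent to the ES|QL `_query` endpoint; the supported spatial functions are listed below:

```sh
# A sketch: match documents whose geo_point "location" lies within 5 km of a
# query point; ST_DISTANCE returns meters for geo_point values
curl -u elastic -X POST "http://localhost:9200/_query?format=txt" -H 'Content-Type: application/json' -d'
{
  "query": "FROM airports | WHERE ST_DISTANCE(location, TO_GEOPOINT(\"POINT(12.565 55.673)\")) < 5000"
}'
```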
-* [`ST_INTERSECTS`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-st_intersects) -* [`ST_DISJOINT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-st_disjoint) -* [`ST_CONTAINS`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-st_contains) -* [`ST_WITHIN`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-st_within) -* [`ST_DISTANCE`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-st_distance) +* [`ST_INTERSECTS`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-st_intersects) +* [`ST_DISJOINT`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-st_disjoint) +* [`ST_CONTAINS`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-st_contains) +* [`ST_WITHIN`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-st_within) +* [`ST_DISTANCE`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-st_distance) ## Aggregate [geospatial-aggregate] -[Aggregations](query-filter/aggregations.md) summarizes your data as metrics, statistics, or other analytics. Use [bucket aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/bucket.md) to group documents into buckets, also called bins, based on field values, ranges, or other criteria. Then, use [metric aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/metrics.md) to calculate metrics, such as a sum or average, from field values in each bucket. Compare metrics across buckets to gain insights from your data. +[Aggregations](query-filter/aggregations.md) summarize your data as metrics, statistics, or other analytics. Use [bucket aggregations](elasticsearch://reference/data-analysis/aggregations/bucket.md) to group documents into buckets, also called bins, based on field values, ranges, or other criteria. Then, use [metric aggregations](elasticsearch://reference/data-analysis/aggregations/metrics.md) to calculate metrics, such as a sum or average, from field values in each bucket. Compare metrics across buckets to gain insights from your data. Geospatial bucket aggregations: -* [Geo-distance aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geodistance-aggregation.md) evaluates the distance of each geo_point location from an origin point and determines the buckets it belongs to based on the ranges (a document belongs to a bucket if the distance between the document and the origin falls within the distance range of the bucket). -* [Geohash grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geohashgrid-aggregation.md) groups geo_point and geo_shape values into buckets that represent a grid. -* [Geohex grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md) groups geo_point and geo_shape values into buckets that represent an H3 hexagonal cell.
-* [Geotile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) groups geo_point and geo_shape values into buckets that represent a grid. Each cell corresponds to a [map tile](https://en.wikipedia.org/wiki/Tiled_web_map) as used by many online map sites. +* [Geo-distance aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geodistance-aggregation.md) evaluates the distance of each geo_point location from an origin point and determines the buckets it belongs to based on the ranges (a document belongs to a bucket if the distance between the document and the origin falls within the distance range of the bucket). +* [Geohash grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geohashgrid-aggregation.md) groups geo_point and geo_shape values into buckets that represent a grid. +* [Geohex grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md) groups geo_point and geo_shape values into buckets that represent an H3 hexagonal cell. +* [Geotile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) groups geo_point and geo_shape values into buckets that represent a grid. Each cell corresponds to a [map tile](https://en.wikipedia.org/wiki/Tiled_web_map) as used by many online map sites. Geospatial metric aggregations: -* [Geo-bounds aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-geobounds-aggregation.md) computes the geographic bounding box containing all values for a Geopoint or Geoshape field. -* [Geo-centroid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-geocentroid-aggregation.md) computes the weighted centroid from all coordinate values for geo fields. -* [Geo-line aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-geo-line.md) aggregates all geo_point values within a bucket into a LineString ordered by the chosen sort field. Use geo_line aggregation to create [vehicle tracks](visualize/maps/asset-tracking-tutorial.md). +* [Geo-bounds aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-geobounds-aggregation.md) computes the geographic bounding box containing all values for a Geopoint or Geoshape field. +* [Geo-centroid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-geocentroid-aggregation.md) computes the weighted centroid from all coordinate values for geo fields. +* [Geo-line aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-geo-line.md) aggregates all geo_point values within a bucket into a LineString ordered by the chosen sort field. Use geo_line aggregation to create [vehicle tracks](visualize/maps/asset-tracking-tutorial.md). -Combine aggregations to perform complex geospatial analysis. For example, to calculate the most recent GPS tracks per flight, use a [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) to group documents into buckets per aircraft. Then use geo-line aggregation to compute a track for each aircraft. 
In another example, use geotile grid aggregation to group documents into a grid. Then use geo-centroid aggregation to find the weighted centroid of each grid cell. +Combine aggregations to perform complex geospatial analysis. For example, to calculate the most recent GPS tracks per flight, use a [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) to group documents into buckets per aircraft. Then use geo-line aggregation to compute a track for each aircraft. In another example, use geotile grid aggregation to group documents into a grid. Then use geo-centroid aggregation to find the weighted centroid of each grid cell. ## Integrate [geospatial-integrate] diff --git a/explore-analyze/machine-learning/anomaly-detection/geographic-anomalies.md b/explore-analyze/machine-learning/anomaly-detection/geographic-anomalies.md index 8a3b99cce..fd424f2da 100644 --- a/explore-analyze/machine-learning/anomaly-detection/geographic-anomalies.md +++ b/explore-analyze/machine-learning/anomaly-detection/geographic-anomalies.md @@ -15,9 +15,9 @@ If your data includes geographic fields, you can use {{ml-features}} to detect a To run this type of {{anomaly-job}}, you must have [{{ml-features}} set up](../setting-up-machine-learning.md). You must also have time series data that contains spatial data types. In particular, you must have: * two comma-separated numbers of the form `latitude,longitude`, -* a [`geo_point`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) field, -* a [`geo_shape`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md) field that contains point values, or -* a [`geo_centroid`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-geocentroid-aggregation.md) aggregation +* a [`geo_point`](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) field, +* a [`geo_shape`](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md) field that contains point values, or +* a [`geo_centroid`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-geocentroid-aggregation.md) aggregation The latitude and longitude must be in the range -180 to 180 and represent a point on the surface of the Earth. diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md b/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md index 31588c65b..a9baa16b0 100644 --- a/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md +++ b/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md @@ -45,7 +45,7 @@ The {{ml-features}} use the concept of a *bucket* to divide the time series into The *bucket span* is part of the configuration information for an {{anomaly-job}}. It defines the time interval that is used to summarize and model the data. This is typically between 5 minutes and 1 hour, and it depends on your data characteristics. When you set the bucket span, take into account the granularity at which you want to analyze, the frequency of the input data, the typical duration of the anomalies, and the frequency at which alerting is required. -The bucket span must contain a valid [time interval](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units).
When you create an {{anomaly-job}} in {{kib}}, you can choose to estimate a bucket span value based on your data characteristics. If you choose a value that is larger than one day or is significantly different than the estimated value, you receive an informational message. +The bucket span must contain a valid [time interval](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units). When you create an {{anomaly-job}} in {{kib}}, you can choose to estimate a bucket span value based on your data characteristics. If you choose a value that is larger than one day or is significantly different than the estimated value, you receive an informational message. ### Detectors [ml-ad-detectors] diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-configuring-aggregation.md b/explore-analyze/machine-learning/anomaly-detection/ml-configuring-aggregation.md index 01a6ddee2..44c133464 100644 --- a/explore-analyze/machine-learning/anomaly-detection/ml-configuring-aggregation.md +++ b/explore-analyze/machine-learning/anomaly-detection/ml-configuring-aggregation.md @@ -34,13 +34,13 @@ There are a number of requirements for using aggregations in {{dfeeds}}. * If your [{{dfeed}} uses aggregations with nested `terms` aggs](#aggs-dfeeds) and model plot is not enabled for the {{anomaly-job}}, neither the **Single Metric Viewer** nor the **Anomaly Explorer** can plot and display an anomaly chart. In these cases, an explanatory message is shown instead of the chart. * Your {{dfeed}} can contain multiple aggregations, but only the ones with names that match values in the job configuration are fed to the job. -* Using [scripted metric](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) aggregations is not supported in {{dfeeds}}. +* Using [scripted metric](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) aggregations is not supported in {{dfeeds}}. ## Recommendations [aggs-recommendations-dfeeds] * When your detectors use [metric](asciidocalypse://docs/docs-content/docs/reference/data-analysis/machine-learning/ml-metric-functions.md) or [sum](asciidocalypse://docs/docs-content/docs/reference/data-analysis/machine-learning/ml-sum-functions.md) analytical functions, it’s recommended to set the `date_histogram` or `composite` aggregation interval to a tenth of the bucket span. This creates finer, more granular time buckets, which are ideal for this type of analysis. * When your detectors use [count](asciidocalypse://docs/docs-content/docs/reference/data-analysis/machine-learning/ml-count-functions.md) or [rare](asciidocalypse://docs/docs-content/docs/reference/data-analysis/machine-learning/ml-rare-functions.md) functions, set the interval to the same value as the bucket span. -* If you have multiple influencers or partition fields or if your field cardinality is more than 1000, use [composite aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md). +* If you have multiple influencers or partition fields or if your field cardinality is more than 1000, use [composite aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md). 
To determine the cardinality of your data, you can run searches such as:

@@ -254,10 +254,10 @@ Use the following format to define a composite aggregation in your {{dfeed}}:
You can also use complex nested aggregations in {{dfeeds}}.

-The next example uses the [`derivative` pipeline aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-derivative-aggregation.md) to find the first order derivative of the counter `system.network.out.bytes` for each value of the field `beat.name`.
+The next example uses the [`derivative` pipeline aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-derivative-aggregation.md) to find the first order derivative of the counter `system.network.out.bytes` for each value of the field `beat.name`.

::::{note}
-`derivative` or other pipeline aggregations may not work within `composite` aggregations. See [composite aggregations and pipeline aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md#search-aggregations-bucket-composite-aggregation-pipeline-aggregations).
+`derivative` or other pipeline aggregations may not work within `composite` aggregations. See [composite aggregations and pipeline aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md#search-aggregations-bucket-composite-aggregation-pipeline-aggregations).
::::

```js
@@ -346,7 +346,7 @@ You can also use single bucket aggregations in {{dfeeds}}. The following example
It is not currently possible to use `aggregate_metric_double` type fields in {{dfeeds}} without aggregations.
::::

-You can use fields with the [`aggregate_metric_double`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/aggregate-metric-double.md) field type in a {{dfeed}} with aggregations. It is required to retrieve the `value_count` of the `aggregate_metric_double` filed in an aggregation and then use it as the `summary_count_field_name` to provide the correct count that represents the aggregation value.
+You can use fields with the [`aggregate_metric_double`](elasticsearch://reference/elasticsearch/mapping-reference/aggregate-metric-double.md) field type in a {{dfeed}} with aggregations. You must retrieve the `value_count` of the `aggregate_metric_double` field in an aggregation and then use it as the `summary_count_field_name` to provide the correct count that represents the aggregation value.

In the following example, `presum` is an `aggregate_metric_double` type field that has all the possible metrics: `[ min, max, sum, value_count ]`. 
To use an `avg` aggregation on this field, you need to perform a `value_count` aggregation on `presum` and then set the field that contains the aggregated values `my_count` as the `summary_count_field_name`: diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-configuring-categories.md b/explore-analyze/machine-learning/anomaly-detection/ml-configuring-categories.md index d6c898cde..af6635988 100644 --- a/explore-analyze/machine-learning/anomaly-detection/ml-configuring-categories.md +++ b/explore-analyze/machine-learning/anomaly-detection/ml-configuring-categories.md @@ -101,7 +101,7 @@ If you use the categorization wizard in {{kib}}, you can see which categorizatio :class: screenshot ::: -The categorization analyzer can refer to a built-in {{es}} analyzer or a combination of zero or more character filters, a tokenizer, and zero or more token filters. In this example, adding a [`pattern_replace` character filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-pattern-replace-charfilter.md) achieves the same behavior as the `categorization_filters` job configuration option described earlier. For more details about these properties, refer to the [`categorization_analyzer` API object](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job#ml-put-job-request-body). +The categorization analyzer can refer to a built-in {{es}} analyzer or a combination of zero or more character filters, a tokenizer, and zero or more token filters. In this example, adding a [`pattern_replace` character filter](elasticsearch://reference/data-analysis/text-analysis/analysis-pattern-replace-charfilter.md) achieves the same behavior as the `categorization_filters` job configuration option described earlier. For more details about these properties, refer to the [`categorization_analyzer` API object](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job#ml-put-job-request-body). If you use the default categorization analyzer in {{kib}} or omit the `categorization_analyzer` property from the API, the following default values are used: @@ -137,7 +137,7 @@ POST _ml/anomaly_detectors/_validate If you specify any part of the `categorization_analyzer`, however, any omitted sub-properties are *not* set to default values. 
-The `ml_standard` tokenizer and the day and month stopword filter are almost equivalent to the following analyzer, which is defined using only built-in {{es}} [tokenizers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/tokenizer-reference.md) and [token filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md):
+The `ml_standard` tokenizer and the day and month stopword filter are almost equivalent to the following analyzer, which is defined using only built-in {{es}} [tokenizers](elasticsearch://reference/data-analysis/text-analysis/tokenizer-reference.md) and [token filters](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md):

```console
PUT _ml/anomaly_detectors/it_ops_new_logs

diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-configuring-transform.md b/explore-analyze/machine-learning/anomaly-detection/ml-configuring-transform.md
index bfc2b4ffb..dce9a190c 100644
--- a/explore-analyze/machine-learning/anomaly-detection/ml-configuring-transform.md
+++ b/explore-analyze/machine-learning/anomaly-detection/ml-configuring-transform.md
@@ -74,7 +74,7 @@ PUT /my-index-000001/_doc/1
}
```

-1. In this example, string fields are mapped as `keyword` fields to support aggregation. If you want both a full text (`text`) and a keyword (`keyword`) version of the same field, use multi-fields. For more information, see [fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md).
+1. In this example, string fields are mapped as `keyword` fields to support aggregation. If you want both a full text (`text`) and a keyword (`keyword`) version of the same field, use multi-fields. For more information, see [fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md).

$$$ml-configuring-transform1$$$

diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md b/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md
index eb46af80b..ab27eaa9b 100644
--- a/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md
+++ b/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md
@@ -50,7 +50,7 @@ To get the best results from {{ml}} analytics, you must understand your data. Yo
6. Optional: You can change the random sampling behavior, which affects the number of documents per shard that are used in the {{data-viz}}. You can use automatic random sampling that balances accuracy and speed, manual sampling where you can choose a value for the sampling percentage, or you can turn the feature off to use the full data set. There is a relatively small number of documents in the {{kib}} sample data, so you can turn random sampling off. For larger data sets, keep in mind that using a large sample size increases query run times and increases the load on the cluster.
7. Explore the fields in the {{data-viz}}.
- You can filter the list by field names or [field types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). The {{data-viz}} indicates how many of the documents in the sample for the selected time period contain each field.
+ You can filter the list by field names or [field types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). The {{data-viz}} indicates how many of the documents in the sample for the selected time period contain each field. 
In particular, look at the `clientip`, `response.keyword`, and `url.keyword` fields, since we’ll use them in our {{anomaly-jobs}}. For these fields, the {{data-viz}} provides the number of distinct values, a list of the top values, and the number and percentage of documents that contain the field. For example: :::{image} ../../../images/machine-learning-ml-gs-data-keyword.jpg @@ -271,7 +271,7 @@ To create a forecast in {{kib}}: :class: screenshot ::: -3. Specify a duration for your forecast. This value indicates how far to extrapolate beyond the last record that was processed. You must use [time units](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units). In this example, the duration is one week (`1w`): +3. Specify a duration for your forecast. This value indicates how far to extrapolate beyond the last record that was processed. You must use [time units](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units). In this example, the duration is one week (`1w`): :::{image} ../../../images/machine-learning-ml-gs-duration.png :alt: Specify a duration of 1w :class: screenshot diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-limitations.md b/explore-analyze/machine-learning/anomaly-detection/ml-limitations.md index 1c548dcc5..a379ed8e0 100644 --- a/explore-analyze/machine-learning/anomaly-detection/ml-limitations.md +++ b/explore-analyze/machine-learning/anomaly-detection/ml-limitations.md @@ -20,7 +20,7 @@ The following limitations and known problems apply to the 9.0.0-beta1 release of ### CPUs must support SSE4.2 [ml-limitations-sse] -{{ml-cap}} uses Streaming SIMD Extensions (SSE) 4.2 instructions, so it works only on machines whose CPUs [support](https://en.wikipedia.org/wiki/SSE4#Supporting_CPUs) SSE4.2. If you run {{es}} on older hardware you must disable {{ml}} by setting `xpack.ml.enabled` to `false`. See [{{ml-cap}} settings in {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md). +{{ml-cap}} uses Streaming SIMD Extensions (SSE) 4.2 instructions, so it works only on machines whose CPUs [support](https://en.wikipedia.org/wiki/SSE4#Supporting_CPUs) SSE4.2. If you run {{es}} on older hardware you must disable {{ml}} by setting `xpack.ml.enabled` to `false`. See [{{ml-cap}} settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md). ### CPU scheduling improvements apply to Linux and MacOS only [ml-scheduling-priority] @@ -40,7 +40,7 @@ If you send pre-aggregated data to a job for analysis, you must ensure that the ### Scripted metric aggregations are not supported [_scripted_metric_aggregations_are_not_supported] -Using [scripted metric aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) in {{dfeeds}} is not supported. Refer to the [Aggregating data for faster performance](ml-configuring-aggregation.md) page to learn more about aggregations in {{dfeeds}}. +Using [scripted metric aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) in {{dfeeds}} is not supported. Refer to the [Aggregating data for faster performance](ml-configuring-aggregation.md) page to learn more about aggregations in {{dfeeds}}. 
### Fields named "by", "count", or "over" cannot be used to split data [_fields_named_by_count_or_over_cannot_be_used_to_split_data] @@ -124,7 +124,7 @@ In {{kib}}, **Anomaly Explorer** and **Single Metric Viewer** charts are not dis * for anomalies that were due to categorization (if model plot is not enabled), * if the {{dfeed}} uses scripted fields and model plot is not enabled (except for scripts that define metric fields), -* if the {{dfeed}} uses [composite aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md) that have composite sources other than `terms` and `date_histogram`, +* if the {{dfeed}} uses [composite aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md) that have composite sources other than `terms` and `date_histogram`, * if your [{{dfeed}} uses aggregations with nested `terms` aggs](ml-configuring-aggregation.md#aggs-dfeeds) and model plot is not enabled, * `freq_rare` functions, * `info_content`, `high_info_content`, `low_info_content` functions, diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md index fc850eae4..f45c2c825 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md @@ -193,13 +193,13 @@ For instance, suppose you have an online service and you would like to predict w {{infer-cap}} can be used as a processor specified in an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). It uses a trained model to infer against the data that is being ingested in the pipeline. The model is used on the ingest node. {{infer-cap}} pre-processes the data by using the model and provides a prediction. After the process, the pipeline continues executing (if there is any other processor in the pipeline), finally the new data together with the results are indexed into the destination index. -Check the [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more. +Check the [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more. #### {{infer-cap}} aggregation [ml-inference-aggregation-class] {{infer-cap}} can also be used as a pipeline aggregation. You can reference a trained model in the aggregation to infer on the result field of the parent bucket aggregation. The {{infer}} aggregation uses the model on the results to provide a prediction. This aggregation enables you to run {{classification}} or {{reganalysis}} at search time. If you want to perform the analysis on a small set of data, this aggregation enables you to generate predictions without the need to set up a processor in the ingest pipeline. 
-Check the [{{infer}} bucket aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-inference-bucket-aggregation.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more.
+Check the [{{infer}} bucket aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-inference-bucket-aggregation.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more.

::::{note}
If you use trained model aliases to reference your trained model in an {{infer}} processor or {{infer}} aggregation, you can replace your trained model with a new one without the need to update the processor or the aggregation. Reassign the alias you used to a new trained model ID by using the [Create or update trained model aliases API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-trained-model-alias). The new trained model needs to use the same type of {{dfanalytics}} as the old one.

diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md
index 5a717eac5..1663cbd74 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md
@@ -37,7 +37,7 @@ You cannot update {{dfanalytics}} configurations. Instead, delete the {{dfanalyt
### {{dfanalytics-cap}} memory limitation [dfa-dataframe-size-limitations]

-{{dfanalytics-cap}} can only perform analyses that fit into the memory available for {{ml}}. Overspill to disk is not currently possible. For general {{ml}} settings, see [{{ml-cap}} settings in {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md).
+{{dfanalytics-cap}} can only perform analyses that fit into the memory available for {{ml}}. Overspill to disk is not currently possible. For general {{ml}} settings, see [{{ml-cap}} settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md).

When you create a {{dfanalytics-job}} and the inference step of the process fails because the model is too large to fit into the JVM, follow the steps in [this GitHub issue](https://github.com/elastic/elasticsearch/issues/76093) for a workaround.

diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md
index 528ef948f..8b98eba0d 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md
@@ -139,13 +139,13 @@ For instance, suppose you have an online service and you would like to predict w
{{infer-cap}} can be used as a processor specified in an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). It uses a trained model to infer against the data that is being ingested in the pipeline. The model is used on the ingest node. {{infer-cap}} pre-processes the data by using the model and provides a prediction. 
After the process, the pipeline continues executing (if there is any other processor in the pipeline); finally, the new data together with the results are indexed into the destination index.

-Check the [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more.
+Check the [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more.

#### {{infer-cap}} aggregation [ml-inference-aggregation-reg]

{{infer-cap}} can also be used as a pipeline aggregation. You can reference a trained model in the aggregation to infer on the result field of the parent bucket aggregation. The {{infer}} aggregation uses the model on the results to provide a prediction. This aggregation enables you to run {{classification}} or {{reganalysis}} at search time. If you want to perform the analysis on a small set of data, this aggregation enables you to generate predictions without the need to set up a processor in the ingest pipeline.

-Check the [{{infer}} bucket aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-inference-bucket-aggregation.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more.
+Check the [{{infer}} bucket aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-inference-bucket-aggregation.md) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ml-data-frame) to learn more.

::::{note}
If you use trained model aliases to reference your trained model in an {{infer}} processor or {{infer}} aggregation, you can replace your trained model with a new one without the need to update the processor or the aggregation. Reassign the alias you used to a new trained model ID by using the [Create or update trained model aliases API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-trained-model-alias). The new trained model needs to use the same type of {{dfanalytics}} as the old one.

diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md b/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md
index 38248df7b..36efc9355 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md
@@ -106,7 +106,7 @@ A few observations:
::::{note}

-* Models exported from the [get trained models API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-get-trained-models) are limited in size by the [http.max_content_length](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) global configuration value in {{es}}. The default value is `100mb` and may need to be increased depending on the size of model being exported. 
+* Models exported from the [get trained models API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-get-trained-models) are limited in size by the [http.max_content_length](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) global configuration value in {{es}}. The default value is `100mb` and may need to be increased depending on the size of the model being exported.
* Connection timeouts can occur, for example, when model sizes are very large or your cluster is under load. If needed, you can increase [timeout configurations](https://ec.haxx.se/usingcurl/usingcurl-timeouts) for `curl` (for example, `curl --max-time 600`) or your client of choice.

::::

diff --git a/explore-analyze/machine-learning/machine-learning-in-kibana/inference-processing.md b/explore-analyze/machine-learning/machine-learning-in-kibana/inference-processing.md
index 08ff1ed48..e8041f1f4 100644
--- a/explore-analyze/machine-learning/machine-learning-in-kibana/inference-processing.md
+++ b/explore-analyze/machine-learning/machine-learning-in-kibana/inference-processing.md
@@ -35,7 +35,7 @@ Most commonly used to detect entities such as People, Places, and Organization i
### Text embedding [ingest-pipeline-search-inference-text-embedding]

-Analyzing a text field using a [Text embedding](../nlp/ml-nlp-search-compare.md#ml-nlp-text-embedding) model will generate a [dense vector](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) representation of the text. This array of numeric values encodes the semantic *meaning* of the text. Using the same model with a user’s search query will produce a vector that can then be used to search, ranking results based on vector similarity - semantic similarity - as opposed to traditional word or text similarity.
+Analyzing a text field using a [Text embedding](../nlp/ml-nlp-search-compare.md#ml-nlp-text-embedding) model will generate a [dense vector](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) representation of the text. This array of numeric values encodes the semantic *meaning* of the text. Using the same model with a user’s search query will produce a vector that can then be used to search, ranking results based on vector similarity - semantic similarity - as opposed to traditional word or text similarity.

A common use case is a user searching FAQs, or a support agent searching a knowledge base, where semantically similar content may be indexed with little similarity in phrasing.

diff --git a/explore-analyze/machine-learning/machine-learning-in-kibana/xpack-ml-aiops.md b/explore-analyze/machine-learning/machine-learning-in-kibana/xpack-ml-aiops.md
index 558e21ac0..a76b5d9af 100644
--- a/explore-analyze/machine-learning/machine-learning-in-kibana/xpack-ml-aiops.md
+++ b/explore-analyze/machine-learning/machine-learning-in-kibana/xpack-ml-aiops.md
@@ -52,7 +52,7 @@ Select a field for categorization and optionally apply any filters that you want
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. 
:::: -Change point detection uses the [change point aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-change-point-aggregation.md) to detect distribution changes, trend changes, and other statistically significant change points in a metric of your time series data. +Change point detection uses the [change point aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-change-point-aggregation.md) to detect distribution changes, trend changes, and other statistically significant change points in a metric of your time series data. You can find change point detection under **{{ml-app}}** > **AIOps Labs** or by using the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). Here, you can select the {{data-source}} or saved Discover session that you want to analyze. diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md b/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md index 59657108b..c3736bb54 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md @@ -37,4 +37,4 @@ For the resource levels when adaptive resources are enabled, refer to <[*Trained Each allocation of a model deployment has a dedicated queue to buffer {{infer}} requests. The size of this queue is determined by the `queue_capacity` parameter in the [start trained model deployment API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-start-trained-model-deployment). When the queue reaches its maximum capacity, new requests are declined until some of the queued requests are processed, creating available capacity once again. When multiple ingest pipelines reference the same deployment, the queue can fill up, resulting in rejected requests. Consider using dedicated deployments to prevent this situation. -{{infer-cap}} requests originating from search, such as the [`text_expansion` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-text-expansion-query.md), have a higher priority compared to non-search requests. The {{infer}} ingest processor generates normal priority requests. If both a search query and an ingest processor use the same deployment, the search requests with higher priority skip ahead in the queue for processing before the lower priority ingest requests. This prioritization accelerates search responses while potentially slowing down ingest where response time is less critical. +{{infer-cap}} requests originating from search, such as the [`text_expansion` query](elasticsearch://reference/query-languages/query-dsl-text-expansion-query.md), have a higher priority compared to non-search requests. The {{infer}} ingest processor generates normal priority requests. If both a search query and an ingest processor use the same deployment, the search requests with higher priority skip ahead in the queue for processing before the lower priority ingest requests. This prioritization accelerates search responses while potentially slowing down ingest where response time is less critical. 
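Tying this back to the queue discussion above, a hedged sketch of starting a dedicated deployment with a larger request queue might look like the following; the model ID, deployment ID, and capacity value are placeholders, not recommendations:

```console
POST _ml/trained_models/my_model/deployment/_start?deployment_id=for_ingest&queue_capacity=10000
```

Starting a separate deployment per workload (for example, one for search and one for ingest) keeps a busy ingest pipeline from filling the queue that search requests depend on.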
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index 4d0156457..fe6e50eef 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -21,7 +21,7 @@ While ELSER V2 is generally available, ELSER V1 is in [preview] and will remain ## Tokens - not synonyms [elser-tokens] -ELSER expands the indexed and searched passages into collections of terms that are learned to co-occur frequently within a diverse set of training data. The terms that the text is expanded into by the model *are not* synonyms for the search terms; they are learned associations capturing relevance. These expanded terms are weighted as some of them are more significant than others. Then the {{es}} [sparse vector](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) (or [rank features](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/rank-features.md)) field type is used to store the terms and weights at index time, and to search against later. +ELSER expands the indexed and searched passages into collections of terms that are learned to co-occur frequently within a diverse set of training data. The terms that the text is expanded into by the model *are not* synonyms for the search terms; they are learned associations capturing relevance. These expanded terms are weighted as some of them are more significant than others. Then the {{es}} [sparse vector](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) (or [rank features](elasticsearch://reference/elasticsearch/mapping-reference/rank-features.md)) field type is used to store the terms and weights at index time, and to search against later. This approach provides a more understandable search experience compared to vector embeddings. However, attempting to directly interpret the tokens and weights can be misleading, as the expansion essentially results in a vector in a very high-dimensional space. Consequently, certain tokens, especially those with low weight, contain information that is intertwined with other low-weight tokens in the representation. In this regard, they function similarly to a dense vector representation, making it challenging to separate their individual contributions. This complexity can potentially lead to misinterpretations if not carefully considered during analysis. @@ -172,7 +172,7 @@ POST _ml/trained_models/.elser_model_2/deployment/_start?deployment_id=for_searc If you want to deploy ELSER in a restricted or closed network, you have two options: * create your own HTTP/HTTPS endpoint with the model artifacts on it, -* put the model artifacts into a directory inside the config directory on all [master-eligible nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#master-node). +* put the model artifacts into a directory inside the config directory on all [master-eligible nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#master-node). ### Model artifact files [elser-model-artifacts] @@ -284,7 +284,7 @@ To learn more about ELSER performance, refer to the [Benchmark information](#els ## Pre-cleaning input text [pre-cleaning] -The quality of the input text significantly affects the quality of the embeddings. To achieve the best results, it’s recommended to clean the input text before generating embeddings. 
The exact preprocessing you may need to do heavily depends on your text. For example, if your text contains HTML tags, use the [HTML strip processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/htmlstrip-processor.md) in an ingest pipeline to remove unnecessary elements. Always review and clean your input text before ingestion to eliminate any irrelevant entities that might affect the results. +The quality of the input text significantly affects the quality of the embeddings. To achieve the best results, it’s recommended to clean the input text before generating embeddings. The exact preprocessing you may need to do heavily depends on your text. For example, if your text contains HTML tags, use the [HTML strip processor](elasticsearch://reference/ingestion-tools/enrich-processor/htmlstrip-processor.md) in an ingest pipeline to remove unnecessary elements. Always review and clean your input text before ingestion to eliminate any irrelevant entities that might affect the results. ## Recommendations for using ELSER [elser-recommendations] diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-inference.md b/explore-analyze/machine-learning/nlp/ml-nlp-inference.md index 82539d37b..3f76b8d5b 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-inference.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-inference.md @@ -25,7 +25,7 @@ In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **In ::: 1. Click **Create pipeline** or edit an existing pipeline. -2. Add an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) to your pipeline: +2. Add an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) to your pipeline: 1. Click **Add a processor** and select the **{{infer-cap}}** processor type. 2. Set **Model ID** to the name of your trained model, for example `elastic__distilbert-base-cased-finetuned-conll03-english` or `lang_ident_model_1`. @@ -39,7 +39,7 @@ In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **In } ``` - 2. You can also optionally add [classification configuration options](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md#inference-processor-classification-opt) in the **{{infer-cap}} configuration** section. For example, to include the top five language predictions: + 2. You can also optionally add [classification configuration options](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md#inference-processor-classification-opt) in the **{{infer-cap}} configuration** section. For example, to include the top five language predictions: ```js { @@ -51,7 +51,7 @@ In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **In 4. Click **Add** to save the processor. -3. Optional: Add a [set processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/set-processor.md) to index the ingest timestamp. +3. Optional: Add a [set processor](elasticsearch://reference/ingestion-tools/enrich-processor/set-processor.md) to index the ingest timestamp. 1. Click **Add a processor** and select the **Set** processor type. 2. Choose a name for the field (such as `event.ingested`) and set its value to `{{{_ingest.timestamp}}}`. 
For more details, refer to [Access ingest metadata in a processor](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md#access-ingest-metadata). @@ -117,7 +117,7 @@ PUT ner-test ``` ::::{tip} -To use the `annotated_text` data type in this example, you must install the [mapper annotated text plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/mapper-annotated-text.md). For more installation details, refer to [Add plugins provided with {{ech}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md). +To use the `annotated_text` data type in this example, you must install the [mapper annotated text plugin](elasticsearch://reference/elasticsearch-plugins/mapper-annotated-text.md). For more installation details, refer to [Add plugins provided with {{ech}}](elasticsearch://reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md). :::: You can then use the new pipeline to index some documents. For example, use a bulk indexing request with the `pipeline` query parameter for your NER pipeline: diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-limitations.md b/explore-analyze/machine-learning/nlp/ml-nlp-limitations.md index db6372216..60b4ba8f5 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-limitations.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-limitations.md @@ -12,7 +12,7 @@ The following limitations and known problems apply to the 9.0.0-beta1 release of ## Document size limitations when using `semantic_text` fields [ml-nlp-large-documents-limit-10k-10mb] -When using semantic text to ingest documents, chunking takes place automatically. The number of chunks is limited by the [`index.mapping.nested_objects.limit`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/mapping-limit.md) cluster setting, which defaults to 10k. Documents that are too large will cause errors during ingestion. To avoid this issue, please split your documents into roughly 1MB parts before ingestion. +When using semantic text to ingest documents, chunking takes place automatically. The number of chunks is limited by the [`index.mapping.nested_objects.limit`](elasticsearch://reference/elasticsearch/index-settings/mapping-limit.md) cluster setting, which defaults to 10k. Documents that are too large will cause errors during ingestion. To avoid this issue, please split your documents into roughly 1MB parts before ingestion. ## ELSER semantic search is limited to 512 tokens per field that inference is applied to [ml-nlp-elser-v1-limit-512] diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md b/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md index d30917239..d3b9af07d 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md @@ -70,7 +70,7 @@ Sparse embedding models should be configured with the `text_expansion` task type Text Embedding models are designed to work with specific scoring functions for calculating the similarity between the embeddings they produce. Examples of typical scoring functions are: `cosine`, `dot product` and `euclidean distance` (also known as `l2_norm`). 
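As a minimal sketch of such a mapping (the index name and the 384-dimension `cosine` configuration are assumptions; match them to the model you deploy):

```console
PUT my-embeddings-index
{
  "mappings": {
    "properties": {
      "text_embedding": {
        "type": "dense_vector",
        "dims": 384,
        "similarity": "cosine"
      }
    }
  }
}
```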
-The embeddings produced by these models should be indexed in {{es}} using the [dense vector field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) with an appropriate [similarity function](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-params) chosen for the model.
+The embeddings produced by these models should be indexed in {{es}} using the [dense vector field type](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) with an appropriate [similarity function](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-params) chosen for the model.

To find similar embeddings in {{es}}, use the efficient [Approximate k-nearest neighbor (kNN)](../../../solutions/search/vector/knn.md#approximate-knn) search API with a text embedding as the query vector. Approximate kNN search uses the similarity function defined in the dense vector field mapping to calculate the relevance. For the best results, the function must be one of the suitable similarity functions for the model.

diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-ner-example.md b/explore-analyze/machine-learning/nlp/ml-nlp-ner-example.md
index b1c2d3305..af867517b 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-ner-example.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-ner-example.md
@@ -113,7 +113,7 @@ Using the example text "Elastic is headquartered in Mountain View, California.",
## Add the NER model to an {{infer}} ingest pipeline [ex-ner-ingest]

-You can perform bulk {{infer}} on documents as they are ingested by using an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) in your ingest pipeline. The novel *Les Misérables* by Victor Hugo is used as an example for {{infer}} in the following example. [Download](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/les-miserables-nd.json) the novel text split by paragraph as a JSON file, then upload it by using the [Data Visualizer](../../../manage-data/ingest/upload-data-files.md). Give the new index the name `les-miserables` when uploading the file.
+You can perform bulk {{infer}} on documents as they are ingested by using an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) in your ingest pipeline. The novel *Les Misérables* by Victor Hugo is used for {{infer}} in the following example. [Download](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/les-miserables-nd.json) the novel text split by paragraph as a JSON file, then upload it by using the [Data Visualizer](../../../manage-data/ingest/upload-data-files.md). Give the new index the name `les-miserables` when uploading the file. 
Now create an ingest pipeline either in the [Stack management UI](ml-nlp-inference.md#ml-nlp-inference-processor) or by using the API: diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-overview.md b/explore-analyze/machine-learning/nlp/ml-nlp-overview.md index cc3f42f4b..e1b433355 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-overview.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-overview.md @@ -20,7 +20,7 @@ The [{{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endp You can **upload and manage NLP models** using the Eland client and the [{{stack}}](ml-nlp-deploy-models.md). Find the [list of recommended and compatible models here](ml-nlp-model-ref.md). Refer to [*Examples*](ml-nlp-examples.md) to learn more about how to use {{ml}} models deployed in your cluster. -You can **store embeddings in your {{es}} vector database** if you generate [dense vector](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) or [sparse vector](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) model embeddings outside of {{es}}. +You can **store embeddings in your {{es}} vector database** if you generate [dense vector](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) or [sparse vector](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) model embeddings outside of {{es}}. ## What is NLP? [what-is-nlp] diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md b/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md index 043873cbf..36516b423 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md @@ -46,7 +46,7 @@ Elastic Rerank is available in Elastic Stack version 8.17+: To download and deploy Elastic Rerank, use the [create inference API](../../../solutions/search/inference-api/elasticsearch-inference-integration.md) to create an {{es}} service `rerank` endpoint. -::::{tip} +::::{tip} Refer to this [Python notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/12-semantic-reranking-elastic-rerank.ipynb) for an end-to-end example using Elastic Rerank. :::: @@ -166,7 +166,7 @@ For a file-based access, follow these steps: * English language only * Maximum context window of 512 tokens - When using the [`semantic_text` field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md), text is divided into chunks. By default, each chunk contains 250 words (approximately 400 tokens). Be cautious when increasing the chunk size - if the combined length of your query and chunk text exceeds 512 tokens, the model won’t have access to the full content. + When using the [`semantic_text` field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md), text is divided into chunks. By default, each chunk contains 250 words (approximately 400 tokens). Be cautious when increasing the chunk size - if the combined length of your query and chunk text exceeds 512 tokens, the model won’t have access to the full content. When the combined inputs exceed the 512 token limit, a balanced truncation strategy is used. If both the query and input text are longer than 255 tokens each then both are truncated, otherwise the longest is truncated. 
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md b/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md index 984e9c668..3f899165b 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md @@ -13,11 +13,11 @@ The {{stack-ml-features}} can generate embeddings, which you can use to search i * [Text embedding](#ml-nlp-text-embedding) * [Text similarity](#ml-nlp-text-similarity) -## Text embedding [ml-nlp-text-embedding] +## Text embedding [ml-nlp-text-embedding] Text embedding is a task which produces a mathematical representation of text called an embedding. The {{ml}} model turns the text into an array of numerical values (also known as a *vector*). Pieces of content with similar meaning have similar representations. This means it is possible to determine whether different pieces of text are either semantically similar, different, or even opposite by using a mathematical similarity function. -This task is responsible for producing only the embedding. When the embedding is created, it can be stored in a [dense_vector](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) field and used at search time. For example, you can use these vectors in a [k-nearest neighbor (kNN) search](../../../solutions/search/vector/knn.md) to achieve semantic search capabilities. +This task is responsible for producing only the embedding. When the embedding is created, it can be stored in a [dense_vector](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) field and used at search time. For example, you can use these vectors in a [k-nearest neighbor (kNN) search](../../../solutions/search/vector/knn.md) to achieve semantic search capabilities. The following is an example of producing a text embedding: @@ -39,7 +39,7 @@ The task returns the following result: ... ``` -## Text similarity [ml-nlp-text-similarity] +## Text similarity [ml-nlp-text-similarity] The text similarity task estimates how similar two pieces of text are to each other and expresses the similarity in a numeric value. This is commonly referred to as cross-encoding. This task is useful for ranking document text when comparing it to another provided text input. diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md b/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md index ca3609c6c..5e897abb9 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md @@ -112,7 +112,7 @@ Upload the file by using the [Data Visualizer](../../../manage-data/ingest/uploa ## Add the text embedding model to an {{infer}} ingest pipeline [ex-text-emb-ingest] -Process the initial data with an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md). It adds an embedding for each passage. For this, create a text embedding ingest pipeline and then reindex the initial data with this pipeline. +Process the initial data with an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md). It adds an embedding for each passage. For this, create a text embedding ingest pipeline and then reindex the initial data with this pipeline. 
Now create an ingest pipeline either in the [{{stack-manage-app}} UI](ml-nlp-inference.md#ml-nlp-inference-processor) or by using the API: diff --git a/explore-analyze/machine-learning/setting-up-machine-learning.md b/explore-analyze/machine-learning/setting-up-machine-learning.md index 732a9708f..62819a592 100644 --- a/explore-analyze/machine-learning/setting-up-machine-learning.md +++ b/explore-analyze/machine-learning/setting-up-machine-learning.md @@ -14,8 +14,8 @@ mapped_pages: To use the {{stack}} {{ml-features}}, you must have: * the [appropriate subscription](https://www.elastic.co/subscriptions) level or the free trial period activated -* `xpack.ml.enabled` set to its default value of `true` on every node in the cluster (refer to [{{ml-cap}} settings in {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md)) -* `ml` value defined in the list of `node.roles` on the [{{ml}} nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#ml-node) +* `xpack.ml.enabled` set to its default value of `true` on every node in the cluster (refer to [{{ml-cap}} settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md)) +* `ml` value defined in the list of `node.roles` on the [{{ml}} nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#ml-node) * {{ml}} features visible in the {{kib}} space * security privileges assigned to the user that: diff --git a/explore-analyze/query-filter/aggregations.md b/explore-analyze/query-filter/aggregations.md index 41e013a39..4789c0b0c 100644 --- a/explore-analyze/query-filter/aggregations.md +++ b/explore-analyze/query-filter/aggregations.md @@ -17,13 +17,13 @@ An aggregation summarizes your data as metrics, statistics, or other analytics. {{es}} organizes aggregations into three categories: -* [Metric](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/metrics.md) aggregations that calculate metrics, such as a sum or average, from field values. -* [Bucket](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/bucket.md) aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria. -* [Pipeline](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/pipeline.md) aggregations that take input from other aggregations instead of documents or fields. +* [Metric](elasticsearch://reference/data-analysis/aggregations/metrics.md) aggregations that calculate metrics, such as a sum or average, from field values. +* [Bucket](elasticsearch://reference/data-analysis/aggregations/bucket.md) aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria. +* [Pipeline](elasticsearch://reference/data-analysis/aggregations/pipeline.md) aggregations that take input from other aggregations instead of documents or fields. ## Run an aggregation [run-an-agg] -You can run aggregations as part of a [search](../../solutions/search/querying-for-search.md) by specifying the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search)'s `aggs` parameter. 
The following search runs a [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) on `my-field`:
+You can run aggregations as part of a [search](../../solutions/search/querying-for-search.md) by specifying the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search)'s `aggs` parameter. The following search runs a [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) on `my-field`:

```console
GET /my-index-000001/_search
@@ -137,7 +137,7 @@ GET /my-index-000001/_search

## Run sub-aggregations [run-sub-aggs]

-Bucket aggregations support bucket or metric sub-aggregations. For example, a terms aggregation with an [avg](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-avg-aggregation.md) sub-aggregation calculates an average value for each bucket of documents. There is no level or depth limit for nesting sub-aggregations.
+Bucket aggregations support bucket or metric sub-aggregations. For example, a terms aggregation with an [avg](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-avg-aggregation.md) sub-aggregation calculates an average value for each bucket of documents. There is no level or depth limit for nesting sub-aggregations.

```console
GET /my-index-000001/_search
@@ -244,7 +244,7 @@ GET /my-index-000001/_search?typed_keys

The response returns the aggregation type as a prefix to the aggregation’s name.

::::{important}
-Some aggregations return a different aggregation type from the type in the request. For example, the terms, [significant terms](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-significantterms-aggregation.md), and [percentiles](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md) aggregations return different aggregations types depending on the data type of the aggregated field.
+Some aggregations return a different aggregation type from the type in the request. For example, the terms, [significant terms](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-significantterms-aggregation.md), and [percentiles](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md) aggregations return different aggregation types depending on the data type of the aggregated field.
::::

```console-result
@@ -284,14 +284,14 @@ GET /my-index-000001/_search?size=0
}
```

-Scripts calculate field values dynamically, which adds a little overhead to the aggregation. In addition to the time spent calculating, some aggregations like [`terms`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) and [`filters`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-filters-aggregation.md) can’t use some of their optimizations with runtime fields. In total, performance costs for using a runtime field varies from aggregation to aggregation.
+Scripts calculate field values dynamically, which adds a little overhead to the aggregation. In addition to the time spent calculating, some aggregations like [`terms`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) and [`filters`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-filters-aggregation.md) can’t use some of their optimizations with runtime fields. In total, performance costs for using a runtime field vary from aggregation to aggregation.

## Aggregation caches [agg-caches]

-For faster responses, {{es}} caches the results of frequently run aggregations in the [shard request cache](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/shard-request-cache-settings.md). To get cached results, use the same [`preference` string](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/search-shard-routing.md#shard-and-node-preference) for each search. If you don’t need search hits, [set `size` to `0`](#return-only-agg-results) to avoid filling the cache.
+For faster responses, {{es}} caches the results of frequently run aggregations in the [shard request cache](elasticsearch://reference/elasticsearch/configuration-reference/shard-request-cache-settings.md). To get cached results, use the same [`preference` string](elasticsearch://reference/elasticsearch/rest-apis/search-shard-routing.md#shard-and-node-preference) for each search. If you don’t need search hits, [set `size` to `0`](#return-only-agg-results) to avoid filling the cache.

{{es}} routes searches with the same preference string to the same shards. If the shards' data doesn’t change between searches, the shards return cached aggregation results.

## Limits for `long` values [limits-for-long-values]

-When running aggregations, {{es}} uses [`double`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) values to hold and represent numeric data. As a result, aggregations on [`long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) numbers greater than `253` are approximate.
+When running aggregations, {{es}} uses [`double`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) values to hold and represent numeric data. As a result, aggregations on [`long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) numbers greater than `2^53` are approximate.
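To make the caching behavior above concrete, here is a minimal sketch, assuming the same `my-index-000001` index and `my-field` field as the examples in this section; the `preference` value `my-dashboard` is an arbitrary placeholder:

```console
GET /my-index-000001/_search?preference=my-dashboard
{
  "size": 0,
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
```

Because `size` is `0`, the response returns only aggregation results, and repeating the request with the same `preference` string routes it to the same shards, which can serve the aggregation from the shard request cache if their data has not changed.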
diff --git a/explore-analyze/query-filter/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md b/explore-analyze/query-filter/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md
index d7517bf71..def3d4f77 100644
--- a/explore-analyze/query-filter/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md
+++ b/explore-analyze/query-filter/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md
@@ -268,19 +268,19 @@ The response shows the field mappings for the `kibana_sample_data_ecommerce` ind
::::

-The sample data includes the following [field data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md):
+The sample data includes the following [field data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md):

-* [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) and [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) for text fields
-  * Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md)
+* [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) and [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) for text fields
+  * Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md)

-* [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) for date fields
-* 3 [numeric](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) types:
+* [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) for date fields
+* 3 [numeric](elasticsearch://reference/elasticsearch/mapping-reference/number.md) types:

  * `integer` for whole numbers
  * `long` for large whole numbers
  * `half_float` for floating-point numbers

-* [`geo_point`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) for geographic coordinates
-* [`object`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) for nested structures such as `products`, `geoip`, `event`
+* [`geo_point`](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) for geographic coordinates
+* [`object`](elasticsearch://reference/elasticsearch/mapping-reference/object.md) for nested structures such as `products`, `geoip`, `event`

Now that we understand the structure of our sample data, let’s start analyzing it.

@@ -290,7 +290,7 @@ Let’s start by calculating important metrics about orders and customers.

### Get average order size [aggregations-tutorial-order-value]

-Calculate the average order value across all orders in the dataset using the [`avg`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-avg-aggregation.md) aggregation.
+Calculate the average order value across all orders in the dataset using the [`avg`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-avg-aggregation.md) aggregation.
```console GET kibana_sample_data_ecommerce/_search @@ -347,7 +347,7 @@ GET kibana_sample_data_ecommerce/_search ### Get multiple order statistics at once [aggregations-tutorial-order-stats] -Calculate multiple statistics about orders in one request using the [`stats`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) aggregation. +Calculate multiple statistics about orders in one request using the [`stats`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) aggregation. ```console GET kibana_sample_data_ecommerce/_search @@ -391,7 +391,7 @@ GET kibana_sample_data_ecommerce/_search :::: ::::{tip} -The [stats aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) is more efficient than running individual min, max, avg, and sum aggregations. +The [stats aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) is more efficient than running individual min, max, avg, and sum aggregations. :::: @@ -401,7 +401,7 @@ Let’s group orders in different ways to understand sales patterns. ### Break down sales by category [aggregations-tutorial-category-breakdown] -Group orders by category to see which product categories are most popular, using the [`terms`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) aggregation. +Group orders by category to see which product categories are most popular, using the [`terms`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) aggregation. ```console GET kibana_sample_data_ecommerce/_search @@ -421,7 +421,7 @@ GET kibana_sample_data_ecommerce/_search 1. Name reflecting the business purpose of this breakdown 2. `terms` aggregation groups documents by field values -3. Use [`.keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) field for exact matching on text fields +3. Use [`.keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) field for exact matching on text fields 4. Limit to top 5 categories 5. Order by number of orders (descending) @@ -476,7 +476,7 @@ GET kibana_sample_data_ecommerce/_search } ``` -1. Due to Elasticsearch’s distributed architecture, when [terms aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) run across multiple shards, the doc counts may have a small margin of error. This value indicates the maximum possible error in the counts. +1. Due to Elasticsearch’s distributed architecture, when [terms aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) run across multiple shards, the doc counts may have a small margin of error. This value indicates the maximum possible error in the counts. 2. Count of documents in categories beyond the requested size. 3. Array of category buckets, ordered by count. 4. Category name. 
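The doc count error described in the first annotation can also be reported per bucket. As a hedged variation on the query above (same sample index; `show_term_doc_count_error` is a standard `terms` aggregation parameter), you can ask for an error bound on each category:

```console
GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "category.keyword",
        "size": 5,
        "show_term_doc_count_error": true
      }
    }
  }
}
```

Each bucket in the response then carries its own `doc_count_error_upper_bound`; a value of `0` means the count for that term is exact.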
@@ -486,7 +486,7 @@ GET kibana_sample_data_ecommerce/_search ### Track daily sales patterns [aggregations-tutorial-daily-sales] -Group orders by day to track daily sales patterns using the [`date_histogram`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) aggregation. +Group orders by day to track daily sales patterns using the [`date_histogram`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) aggregation. ```console GET kibana_sample_data_ecommerce/_search @@ -507,8 +507,8 @@ GET kibana_sample_data_ecommerce/_search 1. Descriptive name for the time-series aggregation results. 2. The `date_histogram` aggregation groups documents into time-based buckets, similar to terms aggregation but for dates. -3. Uses [calendar and fixed time intervals](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#calendar_and_fixed_intervals) to handle months with different lengths. `"day"` ensures consistent daily grouping regardless of timezone. -4. Formats dates in response using [date patterns](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md) (e.g. "yyyy-MM-dd"). Refer to [date math expressions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math) for additional options. +3. Uses [calendar and fixed time intervals](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#calendar_and_fixed_intervals) to handle months with different lengths. `"day"` ensures consistent daily grouping regardless of timezone. +4. Formats dates in response using [date patterns](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md) (e.g. "yyyy-MM-dd"). Refer to [date math expressions](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math) for additional options. 5. When `min_doc_count` is 0, returns buckets for days with no orders, useful for continuous time series visualization. ::::{dropdown} Example response @@ -705,7 +705,7 @@ GET kibana_sample_data_ecommerce/_search ## Combine metrics with groupings [aggregations-tutorial-combined-analysis] -Now let’s calculate [metrics](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/metrics.md) within each group to get deeper insights. +Now let’s calculate [metrics](elasticsearch://reference/data-analysis/aggregations/metrics.md) within each group to get deeper insights. ### Compare category performance [aggregations-tutorial-category-metrics] @@ -827,7 +827,7 @@ GET kibana_sample_data_ecommerce/_search ``` 1. Daily revenue -2. Uses the [`cardinality`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-cardinality-aggregation.md) aggregation to count unique customers per day +2. Uses the [`cardinality`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-cardinality-aggregation.md) aggregation to count unique customers per day 3. 
Average number of items per order ::::{dropdown} Example response @@ -1297,11 +1297,11 @@ GET kibana_sample_data_ecommerce/_search ## Track trends and patterns [aggregations-tutorial-trends] -You can use [pipeline aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/pipeline.md) on the results of other aggregations. Let’s analyze how metrics change over time. +You can use [pipeline aggregations](elasticsearch://reference/data-analysis/aggregations/pipeline.md) on the results of other aggregations. Let’s analyze how metrics change over time. ### Smooth out daily fluctuations [aggregations-tutorial-moving-average] -Moving averages help identify trends by reducing day-to-day noise in the data. Let’s observe sales trends more clearly by smoothing daily revenue variations, using the [Moving Function](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-movfn-aggregation.md) aggregation. +Moving averages help identify trends by reducing day-to-day noise in the data. Let’s observe sales trends more clearly by smoothing daily revenue variations, using the [Moving Function](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-movfn-aggregation.md) aggregation. ```console GET kibana_sample_data_ecommerce/_search @@ -1724,7 +1724,7 @@ Notice how the smoothed values lag behind the actual values - this is because th ### Track running totals [aggregations-tutorial-cumulative] -Track running totals over time using the [`cumulative_sum`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-cumulative-sum-aggregation.md) aggregation. +Track running totals over time using the [`cumulative_sum`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-cumulative-sum-aggregation.md) aggregation. ```console GET kibana_sample_data_ecommerce/_search diff --git a/explore-analyze/query-filter/languages/eql.md b/explore-analyze/query-filter/languages/eql.md index 0903047e3..42646e5c6 100644 --- a/explore-analyze/query-filter/languages/eql.md +++ b/explore-analyze/query-filter/languages/eql.md @@ -18,7 +18,7 @@ Event Query Language (EQL) is a query language for event-based time series data, ## Advantages of EQL [eql-advantages] * **EQL lets you express relationships between events.**
Many query languages allow you to match single events. EQL lets you match a sequence of events across different event categories and time spans. -* **EQL has a low learning curve.**
[EQL syntax](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md) looks like other common query languages, such as SQL. EQL lets you write and read queries intuitively, which makes for quick, iterative searching. +* **EQL has a low learning curve.**
[EQL syntax](elasticsearch://reference/query-languages/eql-syntax.md) looks like other common query languages, such as SQL. EQL lets you write and read queries intuitively, which makes for quick, iterative searching. * **EQL is designed for security use cases.**
While you can use it for any event-based data, we created EQL for threat hunting. EQL not only supports indicator of compromise (IOC) searches but can describe activity that goes beyond IOCs.

@@ -26,7 +26,7 @@ Event Query Language (EQL) is a query language for event-based time series data,

With the exception of sample queries, EQL searches require that the searched data stream or index contains a *timestamp* field. By default, EQL uses the `@timestamp` field from the [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/current).

-EQL searches also require an *event category* field, unless you use the [`any` keyword](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-syntax-match-any-event-category) to search for documents without an event category field. By default, EQL uses the ECS `event.category` field.
+EQL searches also require an *event category* field, unless you use the [`any` keyword](elasticsearch://reference/query-languages/eql-syntax.md#eql-syntax-match-any-event-category) to search for documents without an event category field. By default, EQL uses the ECS `event.category` field.

To use a different timestamp or event category field, see [Specify a timestamp or event category field](#specify-a-timestamp-or-event-category-field).

@@ -38,7 +38,7 @@ While no schema is required to use EQL, we recommend using the [ECS](https://www

## Run an EQL search [run-an-eql-search]

-Use the [EQL search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-eql-search) to run a [basic EQL query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-basic-syntax).
+Use the [EQL search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-eql-search) to run a [basic EQL query](elasticsearch://reference/query-languages/eql-syntax.md#eql-basic-syntax).

```console
GET /my-data-stream/_eql/search
@@ -119,7 +119,7 @@ GET /my-data-stream/_eql/search

## Search for a sequence of events [eql-search-sequence]

-Use EQL’s [sequence syntax](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-sequences) to search for a series of ordered events. List the event items in ascending chronological order, with the most recent event listed last:
+Use EQL’s [sequence syntax](elasticsearch://reference/query-languages/eql-syntax.md#eql-sequences) to search for a series of ordered events. List the event items in ascending chronological order, with the most recent event listed last:

```console
GET /my-data-stream/_eql/search
@@ -188,7 +188,7 @@ The response’s `hits.sequences` property contains the 10 most recent matching
}
```

-Use [`with maxspan`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-with-maxspan-keywords) to constrain matching sequences to a timespan:
+Use [`with maxspan`](elasticsearch://reference/query-languages/eql-syntax.md#eql-with-maxspan-keywords) to constrain matching sequences to a timespan:

```console
GET /my-data-stream/_eql/search
@@ -201,7 +201,7 @@ GET /my-data-stream/_eql/search
}
```

-Use `!` to match [missing events](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-missing-events): events in a sequence that do not meet a condition within a given timespan:
+Use `!` to match [missing events](elasticsearch://reference/query-languages/eql-syntax.md#eql-missing-events): events in a sequence that do not meet a condition within a given timespan:

```console
GET /my-data-stream/_eql/search
@@ -276,7 +276,7 @@ Missing events are indicated in the response as `missing": true`:
}
```

-Use the [`by` keyword](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-by-keyword) to match events that share the same field values:
+Use the [`by` keyword](elasticsearch://reference/query-languages/eql-syntax.md#eql-by-keyword) to match events that share the same field values:

```console
GET /my-data-stream/_eql/search
@@ -320,7 +320,7 @@ The `hits.sequences.join_keys` property contains the shared field values.
}
```

-Use the [`until` keyword](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-until-keyword) to specify an expiration event for sequences. Matching sequences must end before this event.
+Use the [`until` keyword](elasticsearch://reference/query-languages/eql-syntax.md#eql-until-keyword) to specify an expiration event for sequences. Matching sequences must end before this event.

```console
GET /my-data-stream/_eql/search
@@ -337,7 +337,7 @@ GET /my-data-stream/_eql/search

## Sample chronologically unordered events [eql-search-sample]

-Use EQL’s [sample syntax](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-samples) to search for events that match one or more join keys and a set of filters. Samples are similar to sequences, but do not return events in chronological order. In fact, sample queries can run on data without a timestamp. Sample queries can be useful to find correlations in events that don’t always occur in the same sequence, or that occur across long time spans.
+Use EQL’s [sample syntax](elasticsearch://reference/query-languages/eql-syntax.md#eql-samples) to search for events that match one or more join keys and a set of filters. Samples are similar to sequences, but do not return events in chronological order. In fact, sample queries can run on data without a timestamp. Sample queries can be useful to find correlations in events that don’t always occur in the same sequence, or that occur across long time spans.
::::{dropdown} Click to show the sample data used in the examples below ```console @@ -553,7 +553,7 @@ POST /my-index-000003/_bulk?refresh :::: -A sample query specifies at least one join key, using the [`by` keyword](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-by-keyword), and up to five filters: +A sample query specifies at least one join key, using the [`by` keyword](elasticsearch://reference/query-languages/eql-syntax.md#eql-by-keyword), and up to five filters: ```console GET /my-index*/_eql/search @@ -871,7 +871,7 @@ GET /my-index*/_eql/search By default, each hit in the search response includes the document `_source`, which is the entire JSON object that was provided when indexing the document. -You can use the [`filter_path`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#common-options-response-filtering) query parameter to filter the API response. For example, the following search returns only the timestamp and PID from the `_source` of each matching event. +You can use the [`filter_path`](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#common-options-response-filtering) query parameter to filter the API response. For example, the following search returns only the timestamp and PID from the `_source` of each matching event. ```console GET /my-data-stream/_eql/search?filter_path=hits.events._source.@timestamp,hits.events._source.process.pid @@ -909,12 +909,12 @@ The API returns the following response. } ``` -You can also use the `fields` parameter to retrieve and format specific fields in the response. This field is identical to the search API’s [`fields` parameter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md). +You can also use the `fields` parameter to retrieve and format specific fields in the response. This field is identical to the search API’s [`fields` parameter](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md). Because it consults the index mappings, the `fields` parameter provides several advantages over referencing the `_source` directly. Specifically, the `fields` parameter: * Returns each value in a standardized way that matches its mapping type -* Accepts [multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md) and [field aliases](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-alias.md) +* Accepts [multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) and [field aliases](elasticsearch://reference/elasticsearch/mapping-reference/field-alias.md) * Formats dates and spatial data types * Retrieves [runtime field values](../../../manage-data/data-store/mapping/retrieve-runtime-field.md) * Returns fields calculated by a script at index time @@ -1055,7 +1055,7 @@ GET /my-data-stream/_eql/search } ``` -The event category field must be mapped as a [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) family field type. The timestamp field should be mapped as a [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) field type. [`date_nanos`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date_nanos.md) timestamp fields are not supported. 
You cannot use a [`nested`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) field or the sub-fields of a `nested` field as the timestamp or event category field. +The event category field must be mapped as a [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) family field type. The timestamp field should be mapped as a [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) field type. [`date_nanos`](elasticsearch://reference/elasticsearch/mapping-reference/date_nanos.md) timestamp fields are not supported. You cannot use a [`nested`](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) field or the sub-fields of a `nested` field as the timestamp or event category field. ## Specify a sort tiebreaker [eql-search-specify-a-sort-tiebreaker] @@ -1286,5 +1286,5 @@ GET /cluster_one:my-data-stream,cluster_two:my-data-stream/_eql/search ## EQL circuit breaker settings [eql-circuit-breaker] -The relevant circuit breaker settings can be found in the [Circuit Breakers page](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#circuit-breakers-page-eql). +The relevant circuit breaker settings can be found in the [Circuit Breakers page](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#circuit-breakers-page-eql). diff --git a/explore-analyze/query-filter/languages/esql-cross-clusters.md b/explore-analyze/query-filter/languages/esql-cross-clusters.md index af921965d..4023c3ef3 100644 --- a/explore-analyze/query-filter/languages/esql-cross-clusters.md +++ b/explore-analyze/query-filter/languages/esql-cross-clusters.md @@ -361,7 +361,7 @@ Which returns: ## Enrich across clusters [ccq-enrich] -Enrich in {{esql}} across clusters operates similarly to [local enrich](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-enrich). If the enrich policy and its enrich indices are consistent across all clusters, simply write the enrich command as you would without remote clusters. In this default mode, {{esql}} can execute the enrich command on either the local cluster or the remote clusters, aiming to minimize computation or inter-cluster data transfer. Ensuring that the policy exists with consistent data on both the local cluster and the remote clusters is critical for ES|QL to produce a consistent query result. +Enrich in {{esql}} across clusters operates similarly to [local enrich](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-enrich). If the enrich policy and its enrich indices are consistent across all clusters, simply write the enrich command as you would without remote clusters. In this default mode, {{esql}} can execute the enrich command on either the local cluster or the remote clusters, aiming to minimize computation or inter-cluster data transfer. Ensuring that the policy exists with consistent data on both the local cluster and the remote clusters is critical for ES|QL to produce a consistent query result. ::::{tip} Enrich in {{esql}} across clusters using the API key based security model was introduced in version **8.15.0**. Cross cluster API keys created in versions prior to 8.15.0 will need to replaced or updated to use the new required permissions. Refer to the example in the [API key authentication](#esql-ccs-security-model-api-key) section. 
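To make the default-versus-pinned behavior concrete, here is a hedged sketch run through the ES|QL `_query` endpoint. It assumes an example enrich policy named `hosts`, keyed on an `ip` match field, that has been created and executed on the participating clusters; the `_coordinator` prefix forces the enrichment to run on the coordinating cluster instead of letting {{esql}} choose:

```console
POST /_query?format=txt
{
  "query": """
    FROM my-index-000001,cluster_one:my-index-000001
    | EVAL ip = TO_IP(client_ip)
    | ENRICH _coordinator:hosts ON ip
    | LIMIT 10
  """
}
```

The same query with `_remote:hosts` pushes the lookup to each remote cluster instead, subject to the ordering restriction with `stats` discussed next.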
@@ -417,7 +417,7 @@ FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
| LIMIT 10
```

-A `_remote` enrich cannot be executed after a [stats](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-stats-by) command. The following example would result in an error:
+A `_remote` enrich cannot be executed after a [stats](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-stats-by) command. The following example would result in an error:

```esql
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001
diff --git a/explore-analyze/query-filter/languages/esql-getting-started.md b/explore-analyze/query-filter/languages/esql-getting-started.md
index 8f7d780db..29937f7a1 100644
--- a/explore-analyze/query-filter/languages/esql-getting-started.md
+++ b/explore-analyze/query-filter/languages/esql-getting-started.md
@@ -117,13 +117,13 @@ You can adjust the editor’s height by dragging its bottom border to your likin

## Your first {{esql}} query [esql-getting-started-first-query]

-Each {{esql}} query starts with a [source command](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-source-commands). A source command produces a table, typically with data from {{es}}.
+Each {{esql}} query starts with a [source command](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-source-commands). A source command produces a table, typically with data from {{es}}.

:::{image} ../../../images/elasticsearch-reference-source-command.svg
:alt: A source command producing a table from {{es}}
:::

-The [`FROM`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-from) source command returns a table with documents from a data stream, index, or alias. Each row in the resulting table represents a document. This query returns up to 1000 documents from the `sample_data` index:
+The [`FROM`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-from) source command returns a table with documents from a data stream, index, or alias. Each row in the resulting table represents a document. This query returns up to 1000 documents from the `sample_data` index:

```esql
FROM sample_data
@@ -144,13 +144,13 @@ from sample_data

## Processing commands [esql-getting-started-limit]

-A source command can be followed by one or more [processing commands](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-processing-commands), separated by a pipe character: `|`. Processing commands change an input table by adding, removing, or changing rows and columns. Processing commands can perform filtering, projection, aggregation, and more.
+A source command can be followed by one or more [processing commands](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-processing-commands), separated by a pipe character: `|`. Processing commands change an input table by adding, removing, or changing rows and columns. Processing commands can perform filtering, projection, aggregation, and more.
:::{image} ../../../images/elasticsearch-reference-esql-limit.png :alt: A processing command changing an input table ::: -For example, you can use the [`LIMIT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-limit) command to limit the number of rows that are returned, up to a maximum of 10,000 rows: +For example, you can use the [`LIMIT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-limit) command to limit the number of rows that are returned, up to a maximum of 10,000 rows: ```esql FROM sample_data @@ -174,7 +174,7 @@ FROM sample_data | LIMIT 3 :alt: A processing command sorting an input table ::: -Another processing command is the [`SORT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-sort) command. By default, the rows returned by `FROM` don’t have a defined sort order. Use the `SORT` command to sort rows on one or more columns: +Another processing command is the [`SORT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-sort) command. By default, the rows returned by `FROM` don’t have a defined sort order. Use the `SORT` command to sort rows on one or more columns: ```esql FROM sample_data @@ -184,14 +184,14 @@ FROM sample_data ### Query the data [esql-getting-started-where] -Use the [`WHERE`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-where) command to query the data. For example, to find all events with a duration longer than 5ms: +Use the [`WHERE`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-where) command to query the data. For example, to find all events with a duration longer than 5ms: ```esql FROM sample_data | WHERE event_duration > 5000000 ``` -`WHERE` supports several [operators](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-operators). For example, you can use [`LIKE`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-like-operator) to run a wildcard query against the `message` column: +`WHERE` supports several [operators](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-operators). For example, you can use [`LIKE`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-like-operator) to run a wildcard query against the `message` column: ```esql FROM sample_data @@ -201,7 +201,7 @@ FROM sample_data ### More processing commands [esql-getting-started-more-commands] -There are many other processing commands, like [`KEEP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-keep) and [`DROP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-drop) to keep or drop columns, [`ENRICH`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-enrich) to enrich a table with data from indices in {{es}}, and [`DISSECT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-dissect) and [`GROK`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-grok) to process data. Refer to [Processing commands](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-processing-commands) for an overview of all processing commands. 
+There are many other processing commands, like [`KEEP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-keep) and [`DROP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-drop) to keep or drop columns, [`ENRICH`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-enrich) to enrich a table with data from indices in {{es}}, and [`DISSECT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-dissect) and [`GROK`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-grok) to process data. Refer to [Processing commands](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-processing-commands) for an overview of all processing commands.

## Chain processing commands [esql-getting-started-chaining]

@@ -228,14 +228,14 @@ The order of processing commands is important. First limiting the result set to

## Compute values [esql-getting-started-eval]

-Use the [`EVAL`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-eval) command to append columns to a table, with calculated values. For example, the following query appends a `duration_ms` column. The values in the column are computed by dividing `event_duration` by 1,000,000. In other words: `event_duration` converted from nanoseconds to milliseconds.
+Use the [`EVAL`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-eval) command to append columns to a table, with calculated values. For example, the following query appends a `duration_ms` column. The values in the column are computed by dividing `event_duration` by 1,000,000. In other words: `event_duration` converted from nanoseconds to milliseconds.

```esql
FROM sample_data
| EVAL duration_ms = event_duration/1000000.0
```

-`EVAL` supports several [functions](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-functions). For example, to round a number to the closest number with the specified number of digits, use the [`ROUND`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-round) function:
+`EVAL` supports several [functions](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-functions). For example, to round a number to the closest number with the specified number of digits, use the [`ROUND`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-round) function:

```esql
FROM sample_data
@@ -245,7 +245,7 @@ FROM sample_data

## Calculate statistics [esql-getting-started-stats]

-{{esql}} can not only be used to query your data, you can also use it to aggregate your data. Use the [`STATS`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-stats-by) command to calculate statistics. For example, the median duration:
+{{esql}} can not only be used to query your data, you can also use it to aggregate your data. Use the [`STATS`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-stats-by) command to calculate statistics. For example, the median duration:

```esql
FROM sample_data
@@ -269,7 +269,7 @@ FROM sample_data

## Access columns [esql-getting-started-access-columns]

-You can access columns by their name. If a name contains special characters, [it needs to be quoted](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-syntax.md#esql-identifiers) with backticks (```).
+You can access columns by their name. If a name contains special characters, [it needs to be quoted](elasticsearch://reference/query-languages/esql/esql-syntax.md#esql-identifiers) with backticks (```).

Assigning an explicit name to a column created by `EVAL` or `STATS` is optional. If you don’t provide a name, the new column name is equal to the function expression. For example:

@@ -289,9 +289,9 @@ FROM sample_data

## Create a histogram [esql-getting-started-histogram]

-To track statistics over time, {{esql}} enables you to create histograms using the [`BUCKET`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-bucket) function. `BUCKET` creates human-friendly bucket sizes and returns a value for each row that corresponds to the resulting bucket the row falls into.
+To track statistics over time, {{esql}} enables you to create histograms using the [`BUCKET`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-bucket) function. `BUCKET` creates human-friendly bucket sizes and returns a value for each row that corresponds to the resulting bucket the row falls into.

-Combine `BUCKET` with [`STATS`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-stats-by) to create a histogram. For example, to count the number of events per hour:
+Combine `BUCKET` with [`STATS`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-stats-by) to create a histogram. For example, to count the number of events per hour:

```esql
FROM sample_data
@@ -309,13 +309,13 @@ FROM sample_data

## Enrich data [esql-getting-started-enrich]

-{{esql}} enables you to [enrich](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md) a table with data from indices in {{es}}, using the [`ENRICH`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-enrich) command.
+{{esql}} enables you to [enrich](elasticsearch://reference/query-languages/esql/esql-enrich-data.md) a table with data from indices in {{es}}, using the [`ENRICH`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-enrich) command.

:::{image} ../../../images/elasticsearch-reference-esql-enrich.png
:alt: esql enrich
:::

-Before you can use `ENRICH`, you first need to [create](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md#esql-create-enrich-policy) and [execute](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md#esql-execute-enrich-policy) an [enrich policy](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md#esql-enrich-policy).
+Before you can use `ENRICH`, you first need to [create](elasticsearch://reference/query-languages/esql/esql-enrich-data.md#esql-create-enrich-policy) and [execute](elasticsearch://reference/query-languages/esql/esql-enrich-data.md#esql-execute-enrich-policy) an [enrich policy](elasticsearch://reference/query-languages/esql/esql-enrich-data.md#esql-enrich-policy).

:::::::{tab-set}

@@ -386,12 +386,12 @@ FROM sample_data
| STATS median_duration = MEDIAN(event_duration) BY env
```

-For more about data enrichment with {{esql}}, refer to [Data enrichment](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md).
+For more about data enrichment with {{esql}}, refer to [Data enrichment](elasticsearch://reference/query-languages/esql/esql-enrich-data.md).
## Process data [esql-getting-started-process-data] -Your data may contain unstructured strings that you want to [structure](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-process-data-with-dissect-grok.md) to make it easier to analyze the data. For example, the sample data contains log messages like: +Your data may contain unstructured strings that you want to [structure](elasticsearch://reference/query-languages/esql/esql-process-data-with-dissect-grok.md) to make it easier to analyze the data. For example, the sample data contains log messages like: ```txt "Connected to 10.1.0.3" @@ -399,7 +399,7 @@ Your data may contain unstructured strings that you want to [structure](asciidoc By extracting the IP address from these messages, you can determine which IP has accepted the most client connections. -To structure unstructured strings at query time, you can use the {{esql}} [`DISSECT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-dissect) and [`GROK`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-grok) commands. `DISSECT` works by breaking up a string using a delimiter-based pattern. `GROK` works similarly, but uses regular expressions. This makes `GROK` more powerful, but generally also slower. +To structure unstructured strings at query time, you can use the {{esql}} [`DISSECT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-dissect) and [`GROK`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-grok) commands. `DISSECT` works by breaking up a string using a delimiter-based pattern. `GROK` works similarly, but uses regular expressions. This makes `GROK` more powerful, but generally also slower. In this case, no regular expressions are needed, as the `message` is straightforward: "Connected to ", followed by the server IP. To match this string, you can use the following `DISSECT` command: @@ -419,10 +419,10 @@ FROM sample_data | STATS COUNT(*) BY server_ip ``` -For more about data processing with {{esql}}, refer to [Data processing with DISSECT and GROK](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-process-data-with-dissect-grok.md). +For more about data processing with {{esql}}, refer to [Data processing with DISSECT and GROK](elasticsearch://reference/query-languages/esql/esql-process-data-with-dissect-grok.md). ## Learn more [esql-getting-learn-more] -To learn more about {{esql}}, refer to [{{esql}} reference](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql.md). +To learn more about {{esql}}, refer to [{{esql}} reference](elasticsearch://reference/query-languages/esql.md). diff --git a/explore-analyze/query-filter/languages/esql-kibana.md b/explore-analyze/query-filter/languages/esql-kibana.md index dd0071f63..e68d6a83a 100644 --- a/explore-analyze/query-filter/languages/esql-kibana.md +++ b/explore-analyze/query-filter/languages/esql-kibana.md @@ -39,9 +39,9 @@ After switching to {{esql}} mode, the query bar shows a sample query. For exampl from kibana_sample_data_logs | limit 10 ``` -Every query starts with a [source command](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md). In this query, the source command is [`FROM`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-from). `FROM` retrieves data from data streams, indices, or aliases. In this example, the data is retrieved from `kibana_sample_data_logs`. 
+Every query starts with a [source command](elasticsearch://reference/query-languages/esql/esql-commands.md). In this query, the source command is [`FROM`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-from). `FROM` retrieves data from data streams, indices, or aliases. In this example, the data is retrieved from `kibana_sample_data_logs`. -A source command can be followed by one or more [processing commands](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md). In this query, the processing command is [`LIMIT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-limit). `LIMIT` limits the number of rows that are retrieved. +A source command can be followed by one or more [processing commands](elasticsearch://reference/query-languages/esql/esql-commands.md). In this query, the processing command is [`LIMIT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-limit). `LIMIT` limits the number of rows that are retrieved. ::::{tip} Click the **ES|QL help** button to open the in-product reference documentation for all commands and functions or to get recommended queries that will help you get started. @@ -129,7 +129,7 @@ the 10,000 row limit only applies to the number of rows that are retrieved by th :::: -Each row shows two columns for the example query: a column with the `@timestamp` field and a column with the full document. To display specific fields from the documents, use the [`KEEP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-keep) command: +Each row shows two columns for the example query: a column with the `@timestamp` field and a column with the full document. To display specific fields from the documents, use the [`KEEP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-keep) command: ```esql FROM kibana_sample_data_logs @@ -151,7 +151,7 @@ The maximum number of columns in Discover is 50. If a query returns more than 50 ### Sorting [_sorting] -To sort on one of the columns, click the column name you want to sort on and select the sort order. Note that this performs client-side sorting. It only sorts the rows that were retrieved by the query, which may not be the full dataset because of the (implicit) limit. To sort the full data set, use the [`SORT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-sort) command: +To sort on one of the columns, click the column name you want to sort on and select the sort order. Note that this performs client-side sorting. It only sorts the rows that were retrieved by the query, which may not be the full dataset because of the (implicit) limit. To sort the full data set, use the [`SORT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-sort) command: ```esql FROM kibana_sample_data_logs @@ -179,7 +179,7 @@ FROM my_index | WHERE custom_timestamp >= ?_tstart AND custom_timestamp < ?_tend ``` -You can also use the `?_tstart` and `?_tend` parameters with the [`BUCKET`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-bucket) function to create auto-incrementing time buckets in {{esql}} [visualizations](#esql-kibana-visualizations). 
For example: +You can also use the `?_tstart` and `?_tend` parameters with the [`BUCKET`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-bucket) function to create auto-incrementing time buckets in {{esql}} [visualizations](#esql-kibana-visualizations). For example: ```esql FROM kibana_sample_data_logs @@ -191,7 +191,7 @@ This example uses `50` buckets, which is the maximum number of buckets. #### WHERE command [_where_command] -You can also limit the time range using the [`WHERE`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-where) command and the [`NOW`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-now) function. For example, if the timestamp field is called `timestamp`, to query the last 15 minutes of data: +You can also limit the time range using the [`WHERE`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-where) command and the [`NOW`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-now) function. For example, if the timestamp field is called `timestamp`, to query the last 15 minutes of data: ```esql FROM kibana_sample_data_logs @@ -254,7 +254,7 @@ You can also edit the {{esql}} visualization from here. Click the options button ## Create an enrich policy [esql-kibana-enrich] -The {{esql}} [`ENRICH`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-enrich) command enables you to [enrich](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md) your query dataset with fields from another dataset. Before you can use `ENRICH`, you need to [create and execute an enrich policy](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md#esql-set-up-enrich-policy). If a policy exists, it will be suggested by auto-complete. If not, click **Click to create** to create one. +The {{esql}} [`ENRICH`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-enrich) command enables you to [enrich](elasticsearch://reference/query-languages/esql/esql-enrich-data.md) your query dataset with fields from another dataset. Before you can use `ENRICH`, you need to [create and execute an enrich policy](elasticsearch://reference/query-languages/esql/esql-enrich-data.md#esql-set-up-enrich-policy). If a policy exists, it will be suggested by auto-complete. If not, click **Click to create** to create one. :::{image} ../../../images/elasticsearch-reference-esql-kibana-enrich-autocomplete.png :alt: esql kibana enrich autocomplete @@ -296,8 +296,8 @@ You can use {{esql}} queries to create alerts. From Discover, click **Alerts** a ## Limitations [esql-kibana-limitations] -* The user interface to filter data is not enabled when Discover is in {{esql}} mode. To filter data, write a query that uses the [`WHERE`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-where) command instead. +* The user interface to filter data is not enabled when Discover is in {{esql}} mode. To filter data, write a query that uses the [`WHERE`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-where) command instead. * Discover shows no more than 10,000 rows. This limit only applies to the number of rows that are retrieved by the query and displayed in Discover. Queries and aggregations run on the full data set. * Discover shows no more than 50 columns. 
If a query returns more than 50 columns, Discover only shows the first 50.
* CSV export from Discover shows no more than 10,000 rows. This limit only applies to the number of rows that are retrieved by the query and displayed in Discover. Queries and aggregations run on the full data set.
-* Querying many indices at once without any filters can cause an error in kibana which looks like `[esql] > Unexpected error from Elasticsearch: The content length (536885793) is bigger than the maximum allowed string (536870888)`. The response from {{esql}} is too long. Use [`DROP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-drop) or [`KEEP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-keep) to limit the number of fields returned.
+* Querying many indices at once without any filters can cause an error in {{kib}} which looks like `[esql] > Unexpected error from Elasticsearch: The content length (536885793) is bigger than the maximum allowed string (536870888)`. The response from {{esql}} is too long. Use [`DROP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-drop) or [`KEEP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-keep) to limit the number of fields returned.
diff --git a/explore-analyze/query-filter/languages/esql-multi-index.md b/explore-analyze/query-filter/languages/esql-multi-index.md
index 87859125d..9222e484a 100644
--- a/explore-analyze/query-filter/languages/esql-multi-index.md
+++ b/explore-analyze/query-filter/languages/esql-multi-index.md
@@ -25,7 +25,7 @@ FROM cluster_one:employees-00001,cluster_two:other-employees-*
```

-## Field type mismatches [esql-multi-index-invalid-mapping]
+## Field type mismatches [esql-multi-index-invalid-mapping]

When querying multiple indices, data streams, or aliases, you might find that the same field is mapped to multiple different types. For example, consider the two indices with the following field mappings:

@@ -106,16 +106,16 @@ Cannot use field [client_ip] due to ambiguities being mapped as
```

-## Union types [esql-multi-index-union-types]
+## Union types [esql-multi-index-union-types]

-::::{warning}
+::::{warning}
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
::::

-{{esql}} has a way to handle [field type mismatches](#esql-multi-index-invalid-mapping). When the same field is mapped to multiple types in multiple indices, the type of the field is understood to be a *union* of the various types in the index mappings. As seen in the preceding examples, this *union type* cannot be used in the results, and cannot be referred to by the query — except in `KEEP`, `DROP` or when it’s passed to a type conversion function that accepts all the types in the *union* and converts the field to a single type. {{esql}} offers a suite of [type conversion functions](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-type-conversion-functions) to achieve this.
+{{esql}} has a way to handle [field type mismatches](#esql-multi-index-invalid-mapping). When the same field is mapped to multiple types in multiple indices, the type of the field is understood to be a *union* of the various types in the index mappings. As seen in the preceding examples, this *union type* cannot be used in the results, and cannot be referred to by the query — except in `KEEP`, `DROP` or when it’s passed to a type conversion function that accepts all the types in the *union* and converts the field to a single type. {{esql}} offers a suite of [type conversion functions](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-type-conversion-functions) to achieve this.

-In the above examples, the query can use a command like `EVAL client_ip = TO_IP(client_ip)` to resolve the union of `ip` and `keyword` to just `ip`. You can also use the type-conversion syntax `EVAL client_ip = client_ip::IP`. Alternatively, the query could use [`TO_STRING`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-to_string) to convert all supported types into `KEYWORD`.
+In the above examples, the query can use a command like `EVAL client_ip = TO_IP(client_ip)` to resolve the union of `ip` and `keyword` to just `ip`. You can also use the type-conversion syntax `EVAL client_ip = client_ip::IP`. Alternatively, the query could use [`TO_STRING`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-to_string) to convert all supported types into `KEYWORD`.

For example, the [query](#query-unsupported) that returned `client_ip:unsupported` with `null` values can be improved using the `TO_IP` function or the equivalent `field::ip` syntax. These changes also resolve the error message. As long as the only reference to the original field is to pass it to a conversion function that resolves the type ambiguity, no error results.

@@ -137,9 +137,9 @@ FROM events_*
| 2023-10-23T12:15:03.360Z | 172.21.2.162 | 3450233 | Connected to 10.1.0.3 |

-## Index metadata [esql-multi-index-index-metadata]
+## Index metadata [esql-multi-index-index-metadata]

-It can be helpful to know the particular index from which each row is sourced. To get this information, use the [`METADATA`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-metadata-fields.md) option on the [`FROM`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-from) command.
+It can be helpful to know the particular index from which each row is sourced. To get this information, use the [`METADATA`](elasticsearch://reference/query-languages/esql/esql-metadata-fields.md) option on the [`FROM`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-from) command.

```esql
FROM events_* METADATA _index
diff --git a/explore-analyze/query-filter/languages/esql.md b/explore-analyze/query-filter/languages/esql.md
index 2ea7e3e91..18607a061 100644
--- a/explore-analyze/query-filter/languages/esql.md
+++ b/explore-analyze/query-filter/languages/esql.md
@@ -26,9 +26,9 @@ mapped_urls:

## What's {{esql}}? [_the_esql_compute_engine]

-**Elasticsearch Query Language ({{esql}})** is a piped query language for filtering, transforming, and analyzing data.
+**Elasticsearch Query Language ({{esql}})** is a piped query language for filtering, transforming, and analyzing data.

-You can author {{esql}} queries to find specific events, perform statistical analysis, and generate visualizations. It supports a wide range of [commands, functions and operators](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md) to perform various data operations, such as filtering, aggregation, time-series analysis, and more.
Today, it supports a subset of the features available in Query DSL, but it is rapidly evolving. +You can author {{esql}} queries to find specific events, perform statistical analysis, and generate visualizations. It supports a wide range of [commands, functions and operators](elasticsearch://reference/query-languages/esql/esql-functions-operators.md) to perform various data operations, such as filtering, aggregation, time-series analysis, and more. Today, it supports a subset of the features available in Query DSL, but it is rapidly evolving. ::::{note} **{{esql}}'s compute architecture** @@ -52,10 +52,10 @@ You can use it: ## Next steps Find more details about {{esql}} in the following documentation pages: -- [{{esql}} reference](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql.md): - - Reference documentation for the [{{esql}} syntax](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-syntax.md), [commands](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md), and [functions and operators](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md). - - Information about working with [metadata fields](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-metadata-fields.md) and [multivalued fields](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-multivalued-fields.md). - - Guidance for [data processing with DISSECT and GROK](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-process-data-with-dissect-grok.md) and [data enrichment with ENRICH](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-enrich-data.md). +- [{{esql}} reference](elasticsearch://reference/query-languages/esql.md): + - Reference documentation for the [{{esql}} syntax](elasticsearch://reference/query-languages/esql/esql-syntax.md), [commands](elasticsearch://reference/query-languages/esql/esql-commands.md), and [functions and operators](elasticsearch://reference/query-languages/esql/esql-functions-operators.md). + - Information about working with [metadata fields](elasticsearch://reference/query-languages/esql/esql-metadata-fields.md) and [multivalued fields](elasticsearch://reference/query-languages/esql/esql-multivalued-fields.md). + - Guidance for [data processing with DISSECT and GROK](elasticsearch://reference/query-languages/esql/esql-process-data-with-dissect-grok.md) and [data enrichment with ENRICH](elasticsearch://reference/query-languages/esql/esql-enrich-data.md). - Using {{esql}}: - An overview of using the [`_query` API endpoint](/explore-analyze/query-filter/languages/esql-rest.md). @@ -64,7 +64,7 @@ Find more details about {{esql}} in the following documentation pages: - [Using {{esql}} across clusters](/explore-analyze/query-filter/languages/esql-cross-clusters.md). - [Task management](/explore-analyze/query-filter/languages/esql-task-management.md). -- [Limitations](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/limitations.md): The current limitations of {{esql}}. +- [Limitations](elasticsearch://reference/query-languages/esql/limitations.md): The current limitations of {{esql}}. - [Examples](/explore-analyze/query-filter/languages/esql.md): A few examples of what you can do with {{esql}}. 
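To make the piped flow described above concrete before diving into those references, here is a minimal sketch. The `events_*` index pattern and `client_ip` field mirror the multi-index examples earlier in this section; the time filter and aggregation are illustrative, not taken from these docs:

```esql
FROM events_*
| WHERE @timestamp > NOW() - 1 day
| EVAL client_ip = client_ip::IP
| STATS events = COUNT(*) BY client_ip
| SORT events DESC
| LIMIT 10
```

Each `|` stage consumes the table produced by the previous one, which is what makes {{esql}} queries easy to build up incrementally.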
diff --git a/explore-analyze/query-filter/languages/example-detect-threats-with-eql.md b/explore-analyze/query-filter/languages/example-detect-threats-with-eql.md index 40f6af614..902cb08fb 100644 --- a/explore-analyze/query-filter/languages/example-detect-threats-with-eql.md +++ b/explore-analyze/query-filter/languages/example-detect-threats-with-eql.md @@ -19,7 +19,7 @@ One common variant of regsvr32 misuse is a [Squiblydoo attack](https://attack.mi ``` -## Setup [eql-ex-threat-detection-setup] +## Setup [eql-ex-threat-detection-setup] This tutorial uses a test dataset from [Atomic Red Team](https://github.com/redcanaryco/atomic-red-team) that includes events imitating a Squiblydoo attack. The data has been mapped to [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/current) fields. @@ -58,7 +58,7 @@ To get started: -## Get a count of regsvr32 events [eql-ex-get-a-count-of-regsvr32-events] +## Get a count of regsvr32 events [eql-ex-get-a-count-of-regsvr32-events] First, get a count of events associated with a `regsvr32.exe` process: @@ -95,7 +95,7 @@ The response returns 143 related events. ``` -## Check for command line artifacts [eql-ex-check-for-command-line-artifacts] +## Check for command line artifacts [eql-ex-check-for-command-line-artifacts] `regsvr32.exe` processes were associated with 143 events. But how was `regsvr32.exe` first called? And who called it? `regsvr32.exe` is a command-line utility. Narrow your results to processes where the command line was used: @@ -155,7 +155,7 @@ The query matches one event with an `event.type` of `creation`, indicating the s ``` -## Check for malicious script loads [eql-ex-check-for-malicious-script-loads] +## Check for malicious script loads [eql-ex-check-for-malicious-script-loads] Check if `regsvr32.exe` later loads the `scrobj.dll` library: @@ -205,9 +205,9 @@ The query matches an event, confirming `scrobj.dll` was loaded. ``` -## Determine the likelihood of success [eql-ex-detemine-likelihood-of-success] +## Determine the likelihood of success [eql-ex-detemine-likelihood-of-success] -In many cases, attackers use malicious scripts to connect to remote servers or download other files. Use an [EQL sequence query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-sequences) to check for the following series of events: +In many cases, attackers use malicious scripts to connect to remote servers or download other files. Use an [EQL sequence query](elasticsearch://reference/query-languages/eql-syntax.md#eql-sequences) to check for the following series of events: 1. A `regsvr32.exe` process 2. A load of the `scrobj.dll` library by the same process diff --git a/explore-analyze/query-filter/languages/kql.md b/explore-analyze/query-filter/languages/kql.md index 474b88c01..fd225c603 100644 --- a/explore-analyze/query-filter/languages/kql.md +++ b/explore-analyze/query-filter/languages/kql.md @@ -30,7 +30,7 @@ Combine free text search with field-based search using KQL. Type a search term t -## Filter for documents where a field exists [_filter_for_documents_where_a_field_exists] +## Filter for documents where a field exists [_filter_for_documents_where_a_field_exists] To filter documents for which an indexed value exists for a given field, use the `*` operator. For example, to filter for documents where the `http.request.method` field exists, use the following syntax: @@ -41,7 +41,7 @@ http.request.method: * This checks for any indexed value, including an empty string. 
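Conversely, to find documents where the field is *missing*, the exists check can be negated with the `not` keyword (covered later on this page); a minimal sketch:

```
not http.request.method: *
```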
-## Filter for documents that match a value [_filter_for_documents_that_match_a_value] +## Filter for documents that match a value [_filter_for_documents_that_match_a_value] Use KQL to filter for documents that match a specific number, text, date, or boolean value. For example, to filter for documents where the `http.request.method` is GET, use the following query: @@ -81,7 +81,7 @@ You must escape following characters: ``` -## Filter for documents within a range [_filter_for_documents_within_a_range] +## Filter for documents within a range [_filter_for_documents_within_a_range] To search documents that contain terms within a provided range, use KQL’s range syntax. For example, to search for all documents for which `http.response.bytes` is less than 10000, use the following syntax: @@ -101,10 +101,10 @@ You can also use range syntax for string values, IP addresses, and timestamps. F @timestamp < now-2w ``` -For more examples on acceptable date formats, refer to [Date Math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math). +For more examples on acceptable date formats, refer to [Date Math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math). -## Filter for documents using wildcards [_filter_for_documents_using_wildcards] +## Filter for documents using wildcards [_filter_for_documents_using_wildcards] To search for documents matching a pattern, use the wildcard syntax. For example, to find documents where `http.response.status_code` begins with a 4, use the following syntax: @@ -114,13 +114,13 @@ http.response.status_code: 4* By default, leading wildcards are not allowed for performance reasons. You can modify this with the [`query:allowLeadingWildcards`](asciidocalypse://docs/kibana/docs/reference/advanced-settings.md#query-allowleadingwildcards) advanced setting. -::::{note} +::::{note} Only `*` is currently supported. This matches zero or more characters. :::: -## Negating a query [_negating_a_query] +## Negating a query [_negating_a_query] To negate or exclude a set of documents, use the `not` keyword (not case-sensitive). For example, to filter documents where the `http.request.method` is **not** GET, use the following query: @@ -129,7 +129,7 @@ NOT http.request.method: GET ``` -## Combining multiple queries [_combining_multiple_queries] +## Combining multiple queries [_combining_multiple_queries] To combine multiple queries, use the `and`/`or` keywords (not case-sensitive). For example, to find documents where the `http.request.method` is GET **or** the `http.response.status_code` is 400, use the following query: @@ -157,7 +157,7 @@ http.request.method: (GET OR POST OR DELETE) ``` -## Matching multiple fields [_matching_multiple_fields] +## Matching multiple fields [_matching_multiple_fields] Wildcards can also be used to query multiple fields. For example, to search for documents where any sub-field of `datastream` contains “logs”, use the following: @@ -165,15 +165,15 @@ Wildcards can also be used to query multiple fields. For example, to search for datastream.*: logs ``` -::::{note} +::::{note} When using wildcards to query multiple fields, errors might occur if the fields are of different types. For example, if `datastream.*` matches both numeric and string fields, the above query will result in an error because numeric fields cannot be queried for string values. 
::::
-## Querying nested fields [_querying_nested_fields] 
+## Querying nested fields [_querying_nested_fields]
-Querying [nested fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) requires a special syntax. Consider the following document, where `user` is a nested field:
+Querying [nested fields](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) requires a special syntax. Consider the following document, where `user` is a nested field:
 ```yaml
 {
diff --git a/explore-analyze/query-filter/languages/lucene-query-syntax.md b/explore-analyze/query-filter/languages/lucene-query-syntax.md
index 3be04055e..96f325b5f 100644
--- a/explore-analyze/query-filter/languages/lucene-query-syntax.md
+++ b/explore-analyze/query-filter/languages/lucene-query-syntax.md
@@ -8,7 +8,7 @@ mapped_pages:
 # Lucene query syntax [lucene-query]
-Lucene query syntax is available to {{kib}} users who opt out of the [{{kib}} Query Language](kql.md). Full documentation for this syntax is available as part of {{es}} [query string syntax](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-query-string-query.md#query-string-syntax).
+Lucene query syntax is available to {{kib}} users who opt out of the [{{kib}} Query Language](kql.md). Full documentation for this syntax is available as part of {{es}} [query string syntax](elasticsearch://reference/query-languages/query-dsl-query-string-query.md#query-string-syntax).
 The main reason to use the Lucene query syntax in {{kib}} is for advanced Lucene features, such as regular expressions or fuzzy term matching. However, Lucene syntax is not able to search nested objects or scripted fields.
diff --git a/explore-analyze/query-filter/languages/querydsl.md b/explore-analyze/query-filter/languages/querydsl.md
index f3376a021..d8fdac4b2 100644
--- a/explore-analyze/query-filter/languages/querydsl.md
+++ b/explore-analyze/query-filter/languages/querydsl.md
@@ -39,10 +39,10 @@ The [`_search` endpoint](../../../solutions/search/querying-for-search.md) accep
 Query DSL supports a wide range of search techniques, including the following:
 * [**Full-text search**](/solutions/search/full-text.md): Search text that has been analyzed and indexed to support phrase or proximity queries, fuzzy matches, and more.
-* [**Keyword search**](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md): Search for exact matches using `keyword` fields.
+* [**Keyword search**](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md): Search for exact matches using `keyword` fields.
 * [**Semantic search**](/solutions/search/semantic-search/semantic-search-semantic-text.md): Search `semantic_text` fields using dense or sparse vector search on embeddings generated in your {{es}} cluster.
 * [**Vector search**](/solutions/search/vector/knn.md): Search for similar dense vectors using the kNN algorithm for embeddings generated outside of {{es}}.
-* [**Geospatial search**](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/geo-queries.md): Search for locations and calculate spatial relationships using geospatial queries.
+* [**Geospatial search**](elasticsearch://reference/query-languages/geo-queries.md): Search for locations and calculate spatial relationships using geospatial queries.
You can also filter data using Query DSL. Filters enable you to include or exclude documents by retrieving documents that match specific field-level criteria. 
A query that uses the `filter` parameter indicates [filter context](#filter-context). @@ -54,9 +54,9 @@ Because aggregations leverage the same data structures used for search, they are The following aggregation types are available: -* [Metric](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/metrics.md): Calculate metrics, such as a sum or average, from field values. -* [Bucket](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/bucket.md): Group documents into buckets based on field values, ranges, or other criteria. -* [Pipeline](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/pipeline.md): Run aggregations on the results of other aggregations. +* [Metric](elasticsearch://reference/data-analysis/aggregations/metrics.md): Calculate metrics, such as a sum or average, from field values. +* [Bucket](elasticsearch://reference/data-analysis/aggregations/bucket.md): Group documents into buckets based on field values, ranges, or other criteria. +* [Pipeline](elasticsearch://reference/data-analysis/aggregations/pipeline.md): Run aggregations on the results of other aggregations. Run aggregations by specifying the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search)'s `aggs` parameter. Learn more in [Run an aggregation](/explore-analyze/query-filter/aggregations.md#run-an-agg). @@ -65,9 +65,9 @@ Run aggregations by specifying the [search API](https://www.elastic.co/docs/api/ Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses: -**Leaf query clauses**: Leaf query clauses look for a particular value in a particular field, such as the [`match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md), [`term`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md) or [`range`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) queries. These queries can be used by themselves. +**Leaf query clauses**: Leaf query clauses look for a particular value in a particular field, such as the [`match`](elasticsearch://reference/query-languages/query-dsl-match-query.md), [`term`](elasticsearch://reference/query-languages/query-dsl-term-query.md) or [`range`](elasticsearch://reference/query-languages/query-dsl-range-query.md) queries. These queries can be used by themselves. -**Compound query clauses**: Compound query clauses wrap other leaf **or** compound queries and are used to combine multiple queries in a logical fashion (such as the [`bool`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md) or [`dis_max`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-dis-max-query.md) query), or to alter their behavior (such as the [`constant_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-constant-score-query.md) query). +**Compound query clauses**: Compound query clauses wrap other leaf **or** compound queries and are used to combine multiple queries in a logical fashion (such as the [`bool`](elasticsearch://reference/query-languages/query-dsl-bool-query.md) or [`dis_max`](elasticsearch://reference/query-languages/query-dsl-dis-max-query.md) query), or to alter their behavior (such as the [`constant_score`](elasticsearch://reference/query-languages/query-dsl-constant-score-query.md) query). 
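To make the leaf/compound distinction concrete, here is a minimal sketch of a `bool` compound query wrapping two leaf queries; the index name and field values are illustrative, not part of the surrounding docs:

```console
GET /my-index-000001/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "message": "login failed" } },
        { "term": { "http.response.status_code": 401 } }
      ]
    }
  }
}
```

Both leaf clauses could also run on their own; wrapping them in `bool` simply combines their criteria and, in query context, their scores.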
Query clauses behave differently depending on whether they are used in [query context or filter context](#query-filter-context). @@ -77,22 +77,22 @@ $$$query-dsl-allow-expensive-queries$$$ - Queries that need to do linear scans to identify matches: - - [`script` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-query.md) - - queries on [numeric](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md), [date](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md), [boolean](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/boolean.md), [ip](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ip.md), [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [keyword](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) fields that are not indexed but have [doc values](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/doc-values.md) enabled + - [`script` queries](elasticsearch://reference/query-languages/query-dsl-script-query.md) + - queries on [numeric](elasticsearch://reference/elasticsearch/mapping-reference/number.md), [date](elasticsearch://reference/elasticsearch/mapping-reference/date.md), [boolean](elasticsearch://reference/elasticsearch/mapping-reference/boolean.md), [ip](elasticsearch://reference/elasticsearch/mapping-reference/ip.md), [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [keyword](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) fields that are not indexed but have [doc values](elasticsearch://reference/elasticsearch/mapping-reference/doc-values.md) enabled - Queries that have a high up-front cost: - - [`fuzzy` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-fuzzy-query.md) (except on [`wildcard`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields) - - [`regexp` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-regexp-query.md) (except on [`wildcard`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields) - - [`prefix` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-prefix-query.md) (except on [`wildcard`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields or those without [`index_prefixes`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/index-prefixes.md)) - - [`wildcard` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-wildcard-query.md) (except on [`wildcard`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields) - - [`range` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) on [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) and [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) fields + - [`fuzzy` queries](elasticsearch://reference/query-languages/query-dsl-fuzzy-query.md) (except on 
[`wildcard`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields) + - [`regexp` queries](elasticsearch://reference/query-languages/query-dsl-regexp-query.md) (except on [`wildcard`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields) + - [`prefix` queries](elasticsearch://reference/query-languages/query-dsl-prefix-query.md) (except on [`wildcard`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields or those without [`index_prefixes`](elasticsearch://reference/elasticsearch/mapping-reference/index-prefixes.md)) + - [`wildcard` queries](elasticsearch://reference/query-languages/query-dsl-wildcard-query.md) (except on [`wildcard`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type) fields) + - [`range` queries](elasticsearch://reference/query-languages/query-dsl-range-query.md) on [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) and [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) fields - - [Joining queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/joining-queries.md) + - [Joining queries](elasticsearch://reference/query-languages/joining-queries.md) - Queries that may have a high per-document cost: - - [`script_score` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) - - [`percolate` queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-percolate-query.md) + - [`script_score` queries](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) + - [`percolate` queries](elasticsearch://reference/query-languages/query-dsl-percolate-query.md) The execution of such queries can be prevented by setting the value of the `search.allow_expensive_queries` setting to `false` (defaults to `true`). @@ -142,9 +142,9 @@ Common filter applications include: Filter context applies when a query clause is passed to a `filter` parameter, such as: -* `filter` or `must_not` parameters in [`bool`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md) queries -* `filter` parameter in [`constant_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-constant-score-query.md) queries -* [`filter`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-filter-aggregation.md) aggregations +* `filter` or `must_not` parameters in [`bool`](elasticsearch://reference/query-languages/query-dsl-bool-query.md) queries +* `filter` parameter in [`constant_score`](elasticsearch://reference/query-languages/query-dsl-constant-score-query.md) queries +* [`filter`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-filter-aggregation.md) aggregations Filters optimize query performance and efficiency, especially for structured data queries and when combined with full-text searches. 
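As a sketch of filter context in practice (index and field names are illustrative), the clauses under `filter` below match documents without contributing to `_score`, while the `match` clause under `must` is scored:

```console
GET /my-index-000001/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "search" } }
      ],
      "filter": [
        { "term": { "status": "published" } },
        { "range": { "publish_date": { "gte": "2015-01-01" } } }
      ]
    }
  }
}
```

Because clauses in filter context do not affect scoring, frequently used filters are also good candidates for caching.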
diff --git a/explore-analyze/query-filter/languages/sql-cli.md b/explore-analyze/query-filter/languages/sql-cli.md index ac55a787f..d837bc985 100644 --- a/explore-analyze/query-filter/languages/sql-cli.md +++ b/explore-analyze/query-filter/languages/sql-cli.md @@ -54,7 +54,7 @@ $ ./java -cp [PATH_TO_CLI_JAR]/elasticsearch-sql-cli-[VERSION].jar org.elasticse The jar name will be different for each Elasticsearch version (for example `elasticsearch-sql-cli-7.3.2.jar`), thus the generic `VERSION` specified in the example above. Furthermore, if not running the command from the folder where the SQL CLI jar resides, you’d have to provide the full path, as well. -## CLI commands [cli-commands] +## CLI commands [cli-commands] Apart from SQL queries, CLI can also execute some specific commands: @@ -83,7 +83,7 @@ fetch separator set to "---------------------" ``` `lenient = ` (default `false`) -: If `false`, Elasticsearch SQL returns an error for fields containing [array values](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/array.md). If `true`, Elasticsearch SQL returns the first value from the array with no guarantee of consistent results. +: If `false`, Elasticsearch SQL returns an error for fields containing [array values](elasticsearch://reference/elasticsearch/mapping-reference/array.md). If `true`, Elasticsearch SQL returns the first value from the array with no guarantee of consistent results. ```sql sql> lenient = true; diff --git a/explore-analyze/query-filter/languages/sql-data-types.md b/explore-analyze/query-filter/languages/sql-data-types.md index 5c5988408..1f3d5deff 100644 --- a/explore-analyze/query-filter/languages/sql-data-types.md +++ b/explore-analyze/query-filter/languages/sql-data-types.md @@ -12,31 +12,31 @@ mapped_pages: | --- | --- | --- | --- | | **{{es}} type** | **Elasticsearch SQL type** | **SQL type** | **SQL precision** | | Core types | -| [`null`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/null-value.md) | `null` | NULL | 0 | -| [`boolean`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/boolean.md) | `boolean` | BOOLEAN | 1 | -| [`byte`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `byte` | TINYINT | 3 | -| [`short`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `short` | SMALLINT | 5 | -| [`integer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `integer` | INTEGER | 10 | -| [`long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `long` | BIGINT | 19 | -| [`unsigned_long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `[preview] unsigned_long` | BIGINT | 20 | -| [`double`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `double` | DOUBLE | 15 | -| [`float`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `float` | REAL | 7 | -| [`half_float`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `half_float` | FLOAT | 3 | -| [`scaled_float`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) | `scaled_float` | DOUBLE | 15 | -| [keyword type 
family](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) | `keyword` | VARCHAR | 32,766 | -| [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) | `text` | VARCHAR | 2,147,483,647 | -| [`binary`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/binary.md) | `binary` | VARBINARY | 2,147,483,647 | -| [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) | `datetime` | TIMESTAMP | 29 | -| [`ip`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ip.md) | `ip` | VARCHAR | 39 | -| [`version`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/version.md) | `version` | VARCHAR | 32,766 | +| [`null`](elasticsearch://reference/elasticsearch/mapping-reference/null-value.md) | `null` | NULL | 0 | +| [`boolean`](elasticsearch://reference/elasticsearch/mapping-reference/boolean.md) | `boolean` | BOOLEAN | 1 | +| [`byte`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `byte` | TINYINT | 3 | +| [`short`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `short` | SMALLINT | 5 | +| [`integer`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `integer` | INTEGER | 10 | +| [`long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `long` | BIGINT | 19 | +| [`unsigned_long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `[preview] unsigned_long` | BIGINT | 20 | +| [`double`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `double` | DOUBLE | 15 | +| [`float`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `float` | REAL | 7 | +| [`half_float`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `half_float` | FLOAT | 3 | +| [`scaled_float`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) | `scaled_float` | DOUBLE | 15 | +| [keyword type family](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) | `keyword` | VARCHAR | 32,766 | +| [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) | `text` | VARCHAR | 2,147,483,647 | +| [`binary`](elasticsearch://reference/elasticsearch/mapping-reference/binary.md) | `binary` | VARBINARY | 2,147,483,647 | +| [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) | `datetime` | TIMESTAMP | 29 | +| [`ip`](elasticsearch://reference/elasticsearch/mapping-reference/ip.md) | `ip` | VARCHAR | 39 | +| [`version`](elasticsearch://reference/elasticsearch/mapping-reference/version.md) | `version` | VARCHAR | 32,766 | | Complex types | -| [`object`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) | `object` | STRUCT | 0 | -| [`nested`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) | `nested` | STRUCT | 0 | +| [`object`](elasticsearch://reference/elasticsearch/mapping-reference/object.md) | `object` | STRUCT | 0 | +| [`nested`](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) | `nested` | STRUCT | 0 | | Unsupported types | | *types not mentioned above* | `unsupported` | OTHER | 0 | ::::{note} -Most of {{es}} [data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md) are available in Elasticsearch SQL, as indicated above. 
As one can see, all of {{es}} [data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md) are mapped to the data type with the same name in Elasticsearch SQL, with the exception of **date** data type which is mapped to **datetime** in Elasticsearch SQL. This is to avoid confusion with the ANSI SQL types **DATE** (date only) and **TIME** (time only), which are also supported by Elasticsearch SQL in queries (with the use of [`CAST`](sql-functions-type-conversion.md#sql-functions-type-conversion-cast)/[`CONVERT`](sql-functions-type-conversion.md#sql-functions-type-conversion-convert)), but don’t correspond to an actual mapping in {{es}} (see the [`table`](#es-sql-only-types) below).
+Most of {{es}} [data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) are available in Elasticsearch SQL, as indicated above. As one can see, all of {{es}} [data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) are mapped to the data type with the same name in Elasticsearch SQL, with the exception of the **date** data type, which is mapped to **datetime** in Elasticsearch SQL. This is to avoid confusion with the ANSI SQL types **DATE** (date only) and **TIME** (time only), which are also supported by Elasticsearch SQL in queries (with the use of [`CAST`](sql-functions-type-conversion.md#sql-functions-type-conversion-cast)/[`CONVERT`](sql-functions-type-conversion.md#sql-functions-type-conversion-convert)), but don’t correspond to an actual mapping in {{es}} (see the [`table`](#es-sql-only-types) below).
::::
@@ -72,9 +72,9 @@ The table below indicates these types:
 ## SQL and multi-fields [sql-multi-field]
-A core concept in {{es}} is that of an `analyzed` field, that is a full-text value that is interpreted in order to be effectively indexed. These fields are of type [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) and are not used for sorting or aggregations as their actual value depends on the [`analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) used hence why {{es}} also offers the [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) type for storing the *exact* value.
+A core concept in {{es}} is that of an `analyzed` field, that is a full-text value that is interpreted in order to be effectively indexed. These fields are of type [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) and are not used for sorting or aggregations, as their actual value depends on the [`analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/analyzer.md) used, which is why {{es}} also offers the [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) type for storing the *exact* value.
-In most case, and the default actually, is to use both types for strings which {{es}} supports through [multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md), that is the ability to index the same string in multiple ways; for example index it both as `text` for search but also as `keyword` for sorting and aggregations. 
+In most cases, and by default, both types are used for strings, which {{es}} supports through [multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md), that is, the ability to index the same string in multiple ways; for example, index it both as `text` for search but also as `keyword` for sorting and aggregations.
 As SQL requires exact values, when encountering a `text` field Elasticsearch SQL will search for an exact multi-field that it can use for comparisons, sorting and aggregations. To do that, it will search for the first `keyword` that it can find that is *not* normalized and use that as the original field *exact* value.
diff --git a/explore-analyze/query-filter/languages/sql-functions-aggs.md b/explore-analyze/query-filter/languages/sql-functions-aggs.md
index 7c425154d..3ccd61256 100644
--- a/explore-analyze/query-filter/languages/sql-functions-aggs.md
+++ b/explore-analyze/query-filter/languages/sql-functions-aggs.md
@@ -11,7 +11,7 @@ mapped_pages:
 Functions for computing a *single* result from a set of input values. Elasticsearch SQL supports aggregate functions only alongside [grouping](sql-syntax-select.md#sql-syntax-group-by) (implicit or explicit).
-## General Purpose [sql-functions-aggs-general] 
+## General Purpose [sql-functions-aggs-general]
 ## `AVG` [sql-functions-aggs-avg]
@@ -243,13 +243,13 @@ F |umant
 M |emzi
 ```
-::::{note} 
+::::{note}
`FIRST` cannot be used in a HAVING clause.
::::
-::::{note} 
-`FIRST` cannot be used with columns of type [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) unless the field is also [saved as a keyword](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#before-enabling-fielddata).
+::::{note} 
+`FIRST` cannot be used with columns of type [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) unless the field is also [saved as a keyword](elasticsearch://reference/elasticsearch/mapping-reference/text.md#before-enabling-fielddata).
::::
@@ -364,13 +364,13 @@ F |ldiodio
 M |lari
 ```
-::::{note} 
+::::{note}
`LAST` cannot be used in a `HAVING` clause.
::::
-::::{note} 
-`LAST` cannot be used with columns of type [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) unless the field is also [`saved as a keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#before-enabling-fielddata).
+::::{note} 
+`LAST` cannot be used with columns of type [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) unless the field is also [`saved as a keyword`](elasticsearch://reference/elasticsearch/mapping-reference/text.md#before-enabling-fielddata).
::::
@@ -406,8 +406,8 @@ SELECT MAX(ABS(salary / -12.0)) AS max FROM emp;
 6249.916666666667
 ```
-::::{note} 
-`MAX` on a field of type [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) or [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) is translated into [`LAST/LAST_VALUE`](#sql-functions-aggs-last) and therefore, it cannot be used in `HAVING` clause. 
+::::{note} 
+`MAX` on a field of type [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) or [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) is translated into [`LAST/LAST_VALUE`](#sql-functions-aggs-last) and therefore it cannot be used in a `HAVING` clause.
::::
@@ -435,8 +435,8 @@ SELECT MIN(salary) AS min FROM emp;
 25324
 ```
-::::{note} 
-`MIN` on a field of type [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) or [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) is translated into [`FIRST/FIRST_VALUE`](#sql-functions-aggs-first) and therefore, it cannot be used in `HAVING` clause.
+::::{note} 
+`MIN` on a field of type [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) or [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) is translated into [`FIRST/FIRST_VALUE`](#sql-functions-aggs-first) and therefore it cannot be used in a `HAVING` clause.
::::
@@ -473,7 +473,7 @@ SELECT ROUND(SUM(salary / 12.0), 1) AS sum FROM emp;
 ```
-## Statistics [sql-functions-aggs-statistics] 
+## Statistics [sql-functions-aggs-statistics]
 ## `KURTOSIS` [sql-functions-aggs-kurtosis]
@@ -501,7 +501,7 @@ SELECT MIN(salary) AS min, MAX(salary) AS max, KURTOSIS(salary) AS k FROM emp;
 25324 |74999 |2.0444718929142986
 ```
-::::{note} 
+::::{note}
`KURTOSIS` cannot be used on top of scalar functions or operators but only directly on a field. So, for example, the following is not allowed and an error is returned:
 ```sql
@@ -560,8 +560,8 @@ PERCENTILE(
 1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
 2. a numeric expression (must be a constant and not based on a field). If `null`, the function returns `null`.
-3. optional string literal for the [percentile algorithm](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Possible values: `tdigest` or `hdr`. Defaults to `tdigest`.
-4. optional numeric literal that configures the [percentile algorithm](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm.
+3. optional string literal for the [percentile algorithm](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Possible values: `tdigest` or `hdr`. Defaults to `tdigest`.
+4. optional numeric literal that configures the [percentile algorithm](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm.
 **Output**: `double` numeric value
@@ -631,8 +631,8 @@ PERCENTILE_RANK(
 1. a numeric field. If this field contains only `null` values, the function returns `null`. 
Otherwise, the function ignores `null` values in this field. 2. a numeric expression (must be a constant and not based on a field). If `null`, the function returns `null`. -3. optional string literal for the [percentile algorithm](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Possible values: `tdigest` or `hdr`. Defaults to `tdigest`. -4. optional numeric literal that configures the [percentile algorithm](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm. +3. optional string literal for the [percentile algorithm](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Possible values: `tdigest` or `hdr`. Defaults to `tdigest`. +4. optional numeric literal that configures the [percentile algorithm](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm. **Output**: `double` numeric value @@ -711,7 +711,7 @@ SELECT MIN(salary) AS min, MAX(salary) AS max, SKEWNESS(salary) AS s FROM emp; 25324 |74999 |0.2707722118423227 ``` -::::{note} +::::{note} `SKEWNESS` cannot be used on top of scalar functions but only directly on a field. So, for example, the following is not allowed and an error is returned: ```sql diff --git a/explore-analyze/query-filter/languages/sql-functions-datetime.md b/explore-analyze/query-filter/languages/sql-functions-datetime.md index eae19ce4a..51139cb03 100644 --- a/explore-analyze/query-filter/languages/sql-functions-datetime.md +++ b/explore-analyze/query-filter/languages/sql-functions-datetime.md @@ -14,7 +14,7 @@ Elasticsearch SQL offers a wide range of facilities for performing date/time man A common requirement when dealing with date/time in general revolves around the notion of `interval`, a topic that is worth exploring in the context of {{es}} and Elasticsearch SQL. -{{es}} has comprehensive support for [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math) both inside [index names](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names) and [queries](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md). Inside Elasticsearch SQL the former is supported as is by passing the expression in the table name, while the latter is supported through the standard SQL `INTERVAL`. +{{es}} has comprehensive support for [date math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math) both inside [index names](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names) and [queries](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md). 
Inside Elasticsearch SQL the former is supported as-is by passing the expression in the table name, while the latter is supported through the standard SQL `INTERVAL`.
 The table below shows the mapping between {{es}} and Elasticsearch SQL:
@@ -34,7 +34,7 @@ The table below shows the mapping between {{es}} and Elasticsearch SQL:
 `INTERVAL` allows either `YEAR` and `MONTH` to be mixed together *or* `DAY`, `HOUR`, `MINUTE` and `SECOND`.
-::::{tip} 
+::::{tip}
Elasticsearch SQL also accepts the plural for each time unit (e.g. both `YEAR` and `YEARS` are valid).
::::
 Example of the possible combinations below:
 ## Comparison [_comparison]
-Date/time fields can be compared to [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math) expressions with the equality (`=`) and `IN` operators:
+Date/time fields can be compared to [date math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math) expressions with the equality (`=`) and `IN` operators:
 ```sql
 SELECT hire_date FROM emp WHERE hire_date = '1987-03-01||+4y/y';
@@ -153,7 +153,7 @@ CURDATE()
 **Description**: Returns the date (no time part) when the current query reached the server. It can be used both as a keyword: `CURRENT_DATE` or as a function with no arguments: `CURRENT_DATE()`.
-::::{note} 
+::::{note}
Unlike CURRENT_DATE, `CURDATE()` can only be used as a function with no arguments and not as a keyword.
::::
@@ -264,7 +264,7 @@ Anoosh Arumugam
 ```
-::::{important} 
+::::{important}
Currently, using a *precision* greater than 6 doesn’t make any difference to the output of the function as the maximum number of second fractional digits returned is 6.
::::
@@ -326,7 +326,7 @@ Anoosh Arumugam
 ```
-::::{important} 
+::::{important}
Currently, using a *precision* greater than 6 doesn’t make any difference to the output of the function as the maximum number of second fractional digits returned is 6.
::::
@@ -352,7 +352,7 @@ DATE_ADD(
 **Description**: Add the given number of date/time units to a date/datetime. If the number of units is negative then it’s subtracted from the date/datetime.
-::::{warning} 
+::::{warning}
If the second argument is a long, there is a possibility of truncation, since an integer value will be extracted from that long and used.
::::
@@ -484,7 +484,7 @@ SELECT DATE_DIFF('qq', '2019-09-04'::date, '2025-04-25'::date) AS "diffInQuarter
 23
 ```
-::::{note} 
+::::{note}
For `hour` and `minute`, `DATEDIFF` doesn’t do any rounding, but instead first truncates the more detailed time fields on the 2 dates to zero and then calculates the subtraction.
::::
@@ -532,7 +532,7 @@ DATE_FORMAT(
 **Description**: Returns the date/datetime/time as a string using the format specified in the 2nd argument. The formatting pattern is one of the specifiers used in the [MySQL DATE_FORMAT() function](https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_date-format).
-::::{note} 
+::::{note}
If the 1st argument is of type `time`, then the pattern specified by the 2nd argument cannot contain date-related units (e.g. *dd*, *MM*, *yyyy*, etc.). If it contains such units, an error is returned.
Ranges for month and day specifiers (%c, %D, %d, %e, %m) start at one, unlike MySQL, where they start at zero, due to the fact that MySQL permits the storing of incomplete dates such as *2014-00-00*. Elasticsearch in this case returns an error. 
::::
@@ -580,7 +580,7 @@ DATE_PARSE(
 **Description**: Returns a date by parsing the 1st argument using the format specified in the 2nd argument. The parsing format pattern used is the one from [`java.time.format.DateTimeFormatter`](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/time/format/DateTimeFormatter.html).
-::::{note} 
+::::{note}
If the parsing pattern does not contain all valid date units (e.g. *HH:mm:ss*, *dd-MM HH:mm:ss*, etc.), an error is returned, as the function needs to return a value of `date` type, which will contain the date part.
::::
@@ -593,7 +593,7 @@ SELECT DATE_PARSE('07/04/2020', 'dd/MM/yyyy') AS "date";
 2020-04-07
 ```
-::::{note} 
+::::{note}
The resulting `date` will have the time zone specified by the user through the [`time_zone`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-query)/[`timezone`](sql-jdbc.md#jdbc-cfg-timezone) REST/driver parameters with no conversion applied.
 ```sql
@@ -629,7 +629,7 @@ DATETIME_FORMAT(
 **Description**: Returns the date/datetime/time as a string using the format specified in the 2nd argument. The formatting pattern used is the one from [`java.time.format.DateTimeFormatter`](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/time/format/DateTimeFormatter.html).
-::::{note} 
+::::{note}
If the 1st argument is of type `time`, then the pattern specified by the 2nd argument cannot contain date-related units (e.g. *dd*, *MM*, *yyyy*, etc.). If it contains such units, an error is returned.
::::
@@ -677,7 +677,7 @@ DATETIME_PARSE(
 **Description**: Returns a datetime by parsing the 1st argument using the format specified in the 2nd argument. The parsing format pattern used is the one from [`java.time.format.DateTimeFormatter`](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/time/format/DateTimeFormatter.html).
-::::{note} 
+::::{note}
If the parsing pattern contains only date or only time units (e.g. *dd/MM/yyyy*, *HH:mm:ss*, etc.), an error is returned, as the function needs to return a value of `datetime` type, which must contain both.
::::
@@ -698,7 +698,7 @@ SELECT DATETIME_PARSE('10:20:30 07/04/2020 Europe/Berlin', 'HH:mm:ss dd/MM/yyyy
 2020-04-07T08:20:30.000Z
 ```
-::::{note} 
+::::{note}
If timezone is not specified in the datetime string expression and the parsing pattern, the resulting `datetime` will have the time zone specified by the user through the [`time_zone`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-query)/[`timezone`](sql-jdbc.md#jdbc-cfg-timezone) REST/driver parameters with no conversion applied.
 ```sql
@@ -734,7 +734,7 @@ TIME_PARSE(
 **Description**: Returns a time by parsing the 1st argument using the format specified in the 2nd argument. The parsing format pattern used is the one from [`java.time.format.DateTimeFormatter`](https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/time/format/DateTimeFormatter.html).
-::::{note} 
+::::{note}
If the parsing pattern contains only date units (e.g. *dd/MM/yyyy*), an error is returned, as the function needs to return a value of `time` type, which will contain only the time part. 
::::
@@ -755,7 +755,7 @@ SELECT TIME_PARSE('10:20:30-01:00', 'HH:mm:ssXXX') AS "time";
 11:20:30.000Z
 ```
-::::{note} 
+::::{note}
If timezone is not specified in the time string expression and the parsing pattern, the resulting `time` will have the offset of the time zone specified by the user through the [`time_zone`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-query)/[`timezone`](sql-jdbc.md#jdbc-cfg-timezone) REST/driver parameters at the Unix epoch date (`1970-01-01`) with no conversion applied.
 ```sql
@@ -841,7 +841,7 @@ SELECT DATE_PART('month', CAST('2019-09-24' AS DATE)) AS month;
 9
 ```
-::::{note} 
+::::{note}
For `week` and `weekday` the unit is extracted using the non-ISO calculation, which means that a given week is considered to start from Sunday, not Monday.
::::
@@ -854,7 +854,7 @@ SELECT DATE_PART('week', '2019-09-22T11:22:33.123Z'::datetime) AS week;
 39
 ```
-::::{note} 
+::::{note}
`tzoffset` returns the total number of minutes (signed) that represent the time zone’s offset.
::::
@@ -995,7 +995,7 @@ FORMAT(
 **Description**: Returns the date/datetime/time as a string using the [format](https://docs.microsoft.com/en-us/sql/t-sql/functions/format-transact-sql#arguments) specified in the 2nd argument. The formatting pattern used is the one from [Microsoft SQL Server Format Specification](https://docs.microsoft.com/en-us/dotnet/standard/base-types/custom-date-and-time-format-strings).
-::::{note} 
+::::{note}
If the 1st argument is of type `time`, then the pattern specified by the 2nd argument cannot contain date-related units (e.g. *dd*, *MM*, *yyyy*, etc.). If it contains such units, an error is returned.
Format specifier `F` works similarly to format specifier `f`: it returns the fractional part of seconds, with as many digits as the number of `F`s provided as input (up to 9 digits). The result is padded with trailing `0`s to match the number of `F`s provided, e.g. for a time part `10:20:30.1234` and pattern `HH:mm:ss.FFFFFF`, the output string of the function would be `10:20:30.123400`.
Format specifier `y` will return the year-of-era instead of one/two low-order digits, e.g. for year `2009`, `y` returns `2009` instead of `9`, and for year `43`, `y` returns `43`.
- Special characters like `"`, `\` and `%` will be returned as is, without any change, e.g. formatting the date `17-sep-2020` with `%M` will return `%9`.
::::
@@ -1043,7 +1043,7 @@ TO_CHAR(
 **Description**: Returns the date/datetime/time as a string using the format specified in the 2nd argument. The formatting pattern conforms to [PostgreSQL Template Patterns for Date/Time Formatting](https://www.postgresql.org/docs/13/functions-formatting.html).
-::::{note} 
+::::{note}
If the 1st argument is of type `time`, then the pattern specified by the 2nd argument cannot contain date-related units (e.g. *dd*, *MM*, *YYYY*, etc.). If it contains such units, an error is returned.
The result of the patterns `TZ` and `tz` (time zone abbreviations) in some cases differs from the results returned by `TO_CHAR` in PostgreSQL. The reason is that the time zone abbreviations specified by the JDK are different from the ones specified by PostgreSQL. This function might show an actual time zone abbreviation instead of the generic `LMT`, empty string, or offset returned by the PostgreSQL implementation. The summer/daylight markers might also differ between the two implementations (e.g. it will show `HT` instead of `HST` for Hawaii).
The `FX`, `TM`, `SP` pattern modifiers are not supported and will show up as `FX`, `TM`, `SP` literals in the output.
::::
diff --git a/explore-analyze/query-filter/languages/sql-functions-geo.md b/explore-analyze/query-filter/languages/sql-functions-geo.md
index 1df38bd39..4d3dda725 100644
--- a/explore-analyze/query-filter/languages/sql-functions-geo.md
+++ b/explore-analyze/query-filter/languages/sql-functions-geo.md
@@ -8,7 +8,7 @@ mapped_pages:
 # Geo Functions [sql-functions-geo]
-::::{warning} 
+::::{warning}
This functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
::::
@@ -17,7 +17,7 @@ The geo functions work with geometries stored in `geo_point`, `geo_shape` and `s
 ## Limitations [_limitations_4]
-[`geo_point`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md), [`geo_shape`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md) and [`shape`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/shape.md) and types are represented in SQL as geometry and can be used interchangeably with the following exceptions:
+[`geo_point`](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md), [`geo_shape`](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md) and [`shape`](elasticsearch://reference/elasticsearch/mapping-reference/shape.md) types are represented in SQL as geometry and can be used interchangeably with the following exceptions:
 * `geo_shape` and `shape` fields don’t have doc values, therefore these fields cannot be used for filtering, grouping or sorting.
 * `geo_point` fields are indexed and have doc values by default; however, only latitude and longitude are stored and indexed with some loss of precision from the original values (4.190951585769653E-8 for the latitude and 8.381903171539307E-8 for longitude). The altitude component is accepted but not stored in doc values nor indexed. Therefore calling the `ST_Z` function in filtering, grouping or sorting will return `null`.
diff --git a/explore-analyze/query-filter/languages/sql-functions-grouping.md b/explore-analyze/query-filter/languages/sql-functions-grouping.md
index 7f5e0d137..3a8723a6d 100644
--- a/explore-analyze/query-filter/languages/sql-functions-grouping.md
+++ b/explore-analyze/query-filter/languages/sql-functions-grouping.md
@@ -39,7 +39,7 @@ bucket_key = Math.floor(value / interval) * interval
 ```
 ::::{note}
-The histogram in SQL does **NOT** return empty buckets for missing intervals as the traditional [histogram](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-histogram-aggregation.md) and [date histogram](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md). Such behavior does not fit conceptually in SQL which treats all missing values as `null`; as such the histogram places all missing values in the `null` group. 

diff --git a/explore-analyze/query-filter/languages/sql-functions-geo.md b/explore-analyze/query-filter/languages/sql-functions-geo.md
index 1df38bd39..4d3dda725 100644
--- a/explore-analyze/query-filter/languages/sql-functions-geo.md
+++ b/explore-analyze/query-filter/languages/sql-functions-geo.md
@@ -8,7 +8,7 @@ mapped_pages:

# Geo Functions [sql-functions-geo]

-::::{warning} 
+::::{warning}
This functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
::::

@@ -17,7 +17,7 @@ The geo functions work with geometries stored in `geo_point`, `geo_shape` and `s

## Limitations [_limitations_4]

-[`geo_point`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md), [`geo_shape`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md) and [`shape`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/shape.md) and types are represented in SQL as geometry and can be used interchangeably with the following exceptions:
+[`geo_point`](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md), [`geo_shape`](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md) and [`shape`](elasticsearch://reference/elasticsearch/mapping-reference/shape.md) types are represented in SQL as geometry and can be used interchangeably, with the following exceptions:

* `geo_shape` and `shape` fields don’t have doc values, therefore these fields cannot be used for filtering, grouping or sorting.
* `geo_points` fields are indexed and have doc values by default, however only latitude and longitude are stored and indexed with some loss of precision from the original values (4.190951585769653E-8 for the latitude and 8.381903171539307E-8 for longitude). The altitude component is accepted but not stored in doc values nor indexed. Therefore calling `ST_Z` function in the filtering, grouping or sorting will return `null`.

diff --git a/explore-analyze/query-filter/languages/sql-functions-grouping.md b/explore-analyze/query-filter/languages/sql-functions-grouping.md
index 7f5e0d137..3a8723a6d 100644
--- a/explore-analyze/query-filter/languages/sql-functions-grouping.md
+++ b/explore-analyze/query-filter/languages/sql-functions-grouping.md
@@ -39,7 +39,7 @@ bucket_key = Math.floor(value / interval) * interval
```

::::{note}
-The histogram in SQL does **NOT** return empty buckets for missing intervals as the traditional [histogram](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-histogram-aggregation.md) and [date histogram](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md). Such behavior does not fit conceptually in SQL which treats all missing values as `null`; as such the histogram places all missing values in the `null` group.
+The histogram in SQL does **NOT** return empty buckets for missing intervals as the traditional [histogram](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-histogram-aggregation.md) and [date histogram](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) aggregations do. Such behavior does not fit conceptually in SQL, which treats all missing values as `null`; as such, the histogram places all missing values in the `null` group.
::::

@@ -137,7 +137,7 @@ When the histogram in SQL is applied on **DATE** type instead of **DATETIME**, t

::::{important}
-All intervals specified for a date/time HISTOGRAM will use a [fixed interval](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) in their `date_histogram` aggregation definition, with the notable exceptions of `INTERVAL '1' YEAR`, `INTERVAL '1' MONTH` and `INTERVAL '1' DAY` where a calendar interval is used. The choice for a calendar interval was made for having a more intuitive result for YEAR, MONTH and DAY groupings. In the case of YEAR, for example, the calendar intervals consider a one year bucket as the one starting on January 1st that specific year, whereas a fixed interval one-year-bucket considers one year as a number of milliseconds (for example, `31536000000ms` corresponding to 365 days, 24 hours per day, 60 minutes per hour etc.). With fixed intervals, the day of February 5th, 2019 for example, belongs to a bucket that starts on December 20th, 2018 and {{es}} (and implicitly Elasticsearch SQL) would have returned the year 2018 for a date that’s actually in 2019. With calendar interval this behavior is more intuitive, having the day of February 5th, 2019 actually belonging to the 2019 year bucket.
+All intervals specified for a date/time HISTOGRAM use a [fixed interval](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) in their `date_histogram` aggregation definition, with the notable exceptions of `INTERVAL '1' YEAR`, `INTERVAL '1' MONTH` and `INTERVAL '1' DAY`, where a calendar interval is used. The choice of a calendar interval was made to produce more intuitive results for YEAR, MONTH and DAY groupings. In the case of YEAR, for example, a calendar interval considers a one-year bucket as the one starting on January 1st of that specific year, whereas a fixed-interval one-year bucket considers one year as a number of milliseconds (for example, `31536000000ms`, corresponding to 365 days of 24 hours, 60 minutes per hour etc.). With fixed intervals, February 5th, 2019, for example, belongs to a bucket that starts on December 20th, 2018, and {{es}} (and implicitly Elasticsearch SQL) would have returned the year 2018 for a date that’s actually in 2019. With a calendar interval this behavior is more intuitive, with February 5th, 2019 actually belonging to the 2019 bucket.
::::
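
+For example, a one-year calendar interval buckets rows by the year they actually fall in; a minimal sketch against the `emp` test index used throughout these docs:
+
+```sql
+SELECT HISTOGRAM(hire_date, INTERVAL '1' YEAR) AS h, COUNT(*) AS c FROM emp GROUP BY h;
+```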

diff --git a/explore-analyze/query-filter/languages/sql-functions-search.md b/explore-analyze/query-filter/languages/sql-functions-search.md
index cdfb738d8..87252476b 100644
--- a/explore-analyze/query-filter/languages/sql-functions-search.md
+++ b/explore-analyze/query-filter/languages/sql-functions-search.md
@@ -10,7 +10,7 @@ mapped_pages:

Search functions should be used when performing full-text search, namely when the `MATCH` or `QUERY` predicates are being used. Outside a so-called search context, these functions will return default values such as `0` or `NULL`.

-Elasticsearch SQL optimizes all queries executed against {{es}} depending on the scoring needs. Using [`track_scores`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/sort-search-results.md#_track_scores) on the search request or [`_doc` sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/sort-search-results.md) that disables scores calculation, Elasticsearch SQL instructs {{es}} not to compute scores when these are not needed. For example, every time a `SCORE()` function is encountered in the SQL query, the scores are computed.
+Elasticsearch SQL optimizes all queries executed against {{es}} depending on the scoring needs. By using [`track_scores`](elasticsearch://reference/elasticsearch/rest-apis/sort-search-results.md#_track_scores) on the search request, or [`_doc` sorting](elasticsearch://reference/elasticsearch/rest-apis/sort-search-results.md), which disables score calculation, Elasticsearch SQL instructs {{es}} not to compute scores when they are not needed. Conversely, every time a `SCORE()` function is encountered in the SQL query, the scores are computed.

## `MATCH` [sql-functions-search-match]

@@ -28,7 +28,7 @@ MATCH(

3. additional parameters; optional

-**Description**: A full-text search option, in the form of a predicate, available in Elasticsearch SQL that gives the user control over powerful [match](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md) and [multi_match](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md) {{es}} queries.
+**Description**: A full-text search option, in the form of a predicate, available in Elasticsearch SQL that gives the user control over the powerful [match](elasticsearch://reference/query-languages/query-dsl-match-query.md) and [multi_match](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md) {{es}} queries.

The first parameter is the field or fields to match against. In case it receives one value only, Elasticsearch SQL will use a `match` query to perform the search:

@@ -56,8 +56,8 @@ Frank Herbert |Children of Dune |8.043278
Frank Herbert |God Emperor of Dune|7.0029488
```

-::::{note} 
-The `multi_match` query in {{es}} has the option of [per-field boosting](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md) that gives preferential weight (in terms of scoring) to fields being searched in, using the `^` character. In the example above, the `name` field has a greater weight in the final score than the `author` field when searching for `frank dune` text in both of them.
+::::{note}
+The `multi_match` query in {{es}} has the option of [per-field boosting](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md) that gives preferential weight (in terms of scoring) to fields being searched in, using the `^` character. In the example above, the `name` field has a greater weight in the final score than the `author` field when searching for `frank dune` text in both of them.
::::

@@ -73,12 +73,12 @@ Douglas Adams |The Hitchhiker's Guide to the Galaxy|3.1756816
Peter F. Hamilton|Pandora's Star |3.0997515
```

-::::{note} 
+::::{note}
The allowed optional parameters for a single-field `MATCH()` variant (for the `match` {{es}} query) are: `analyzer`, `auto_generate_synonyms_phrase_query`, `lenient`, `fuzziness`, `fuzzy_transpositions`, `fuzzy_rewrite`, `minimum_should_match`, `operator`, `max_expansions`, `prefix_length`.
::::

-::::{note} 
+::::{note}
The allowed optional parameters for a multi-field `MATCH()` variant (for the `multi_match` {{es}} query) are: `analyzer`, `auto_generate_synonyms_phrase_query`, `lenient`, `fuzziness`, `fuzzy_transpositions`, `fuzzy_rewrite`, `minimum_should_match`, `operator`, `max_expansions`, `prefix_length`, `slop`, `tie_breaker`, `type`.
::::

@@ -98,7 +98,7 @@ QUERY(

2. additional parameters; optional

-**Description**: Just like `MATCH`, `QUERY` is a full-text search predicate that gives the user control over the [query_string](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-query-string-query.md) query in {{es}}.
+**Description**: Just like `MATCH`, `QUERY` is a full-text search predicate that gives the user control over the [query_string](elasticsearch://reference/query-languages/query-dsl-query-string-query.md) query in {{es}}.

The first parameter is basically the input that will be passed as is to the `query_string` query, which means that anything that `query_string` accepts in its `query` field can be used here as well:

@@ -140,7 +140,7 @@ SELECT author, name, SCORE() FROM library WHERE QUERY('dune god', 'default_opera
Frank Herbert |God Emperor of Dune|3.6984892
```

-::::{note} 
+::::{note}
The allowed optional parameters for `QUERY()` are: `allow_leading_wildcard`, `analyze_wildcard`, `analyzer`, `auto_generate_synonyms_phrase_query`, `default_field`, `default_operator`, `enable_position_increments`, `escape`, `fuzziness`, `fuzzy_max_expansions`, `fuzzy_prefix_length`, `fuzzy_rewrite`, `fuzzy_transpositions`, `lenient`, `max_determinized_states`, `minimum_should_match`, `phrase_slop`, `rewrite`, `quote_analyzer`, `quote_field_suffix`, `tie_breaker`, `time_zone`, `type`.
::::

@@ -158,8 +158,8 @@ SCORE()

**Description**: Returns the [relevance](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/relevance-intro.html) of a given input to the executed query. The higher the score, the more relevant the data.

-::::{note} 
-When doing multiple text queries in the `WHERE` clause then, their scores will be combined using the same rules as {{es}}'s [bool query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md).
+::::{note}
+When doing multiple text queries in the `WHERE` clause, their scores are combined using the same rules as {{es}}'s [bool query](elasticsearch://reference/query-languages/query-dsl-bool-query.md).
::::
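
+For example, combining two `MATCH` predicates over the `library` test data used above yields one combined score per row; a minimal sketch:
+
+```sql
+SELECT author, name, SCORE() FROM library WHERE MATCH(name, 'dune') OR MATCH(author, 'frank') ORDER BY SCORE() DESC;
+```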

diff --git a/explore-analyze/query-filter/languages/sql-index-patterns.md b/explore-analyze/query-filter/languages/sql-index-patterns.md
index 7e946f50e..772e15d5d 100644
--- a/explore-analyze/query-filter/languages/sql-index-patterns.md
+++ b/explore-analyze/query-filter/languages/sql-index-patterns.md
@@ -11,9 +11,9 @@ mapped_pages:

Elasticsearch SQL supports two types of patterns for matching multiple indices or tables:

-## {{es}} multi-target syntax [sql-index-patterns-multi] 
+## {{es}} multi-target syntax [sql-index-patterns-multi]

-The {{es}} notation for enumerating, including or excluding [multi-target syntax](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) is supported *as long* as it is quoted or escaped as a table identifier.
+The {{es}} notation for enumerating, including or excluding indices via the [multi-target syntax](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) is supported *as long as* it is quoted or escaped as a table identifier.

For example:

@@ -40,7 +40,7 @@ SELECT emp_no FROM "e*p" LIMIT 1;
10001
```

-::::{note} 
+::::{note}
There is the restriction that all resolved concrete tables have the exact same mapping.
::::

@@ -58,7 +58,7 @@ SELECT emp_no FROM "my*cluster:*emp" LIMIT 1;
```

-## SQL `LIKE` notation [sql-index-patterns-like] 
+## SQL `LIKE` notation [sql-index-patterns-like]

The common `LIKE` statement (including escaping, if needed) matches a wildcard pattern, where `_` stands for a single character and `%` for any number of characters.
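
+For example, to list all tables whose name starts with `emp` (a minimal sketch; table names follow the test data used on this page):
+
+```sql
+SHOW TABLES LIKE 'emp%';
+```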

@@ -101,7 +101,7 @@ In a nutshell, the differences between the two type of patterns are:

Which one to use is up to you; however, try to stick to the same one across your queries for consistency.

-::::{note} 
+::::{note}
As the query type of quoting between the two patterns is fairly similar (`"` vs `'`), Elasticsearch SQL *always* requires the keyword `LIKE` for SQL `LIKE` patterns.
::::

diff --git a/explore-analyze/query-filter/languages/sql-lexical-structure.md b/explore-analyze/query-filter/languages/sql-lexical-structure.md
index 4616ba2eb..12da08300 100644
--- a/explore-analyze/query-filter/languages/sql-lexical-structure.md
+++ b/explore-analyze/query-filter/languages/sql-lexical-structure.md
@@ -43,7 +43,7 @@ Identifiers can be of two types: *quoted* and *unquoted*:

SELECT ip_address FROM "hosts-*"
```

-This query has two identifiers, `ip_address` and `hosts-*` (an [index pattern](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)). As `ip_address` does not clash with any key words it can be used verbatim, `hosts-*` on the other hand cannot as it clashes with `-` (minus operation) and `*` hence the double quotes.
+This query has two identifiers, `ip_address` and `hosts-*` (an [index pattern](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index)). As `ip_address` does not clash with any keywords, it can be used verbatim; `hosts-*`, on the other hand, cannot, as it clashes with `-` (the minus operation) and `*`, hence the double quotes.

Another example:

@@ -51,7 +51,7 @@ Another example:

SELECT "from" FROM ""

-The first identifier from needs to quoted as otherwise it clashes with the `FROM` key word (which is case insensitive as thus can be written as `from`) while the second identifier using {{es}} [Date math support in index and index alias names](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names) would have otherwise confuse the parser.
+The first identifier, `from`, needs to be quoted as it otherwise clashes with the `FROM` keyword (which is case insensitive and can thus be written as `from`), while the second identifier, which uses {{es}} [Date math support in index and index alias names](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names), would otherwise have confused the parser.

Hence why in general, **especially** when dealing with user input, it is **highly** recommended to use quotes for identifiers. It adds minimal overhead to your queries and in return offers clarity and disambiguation.

@@ -61,17 +61,17 @@ Hence why in general, **especially** when dealing with user input it is **highly

Elasticsearch SQL supports two kinds of *implicitly-typed* literals: strings and numbers.

-#### String Literals [sql-syntax-string-literals] 
+#### String Literals [sql-syntax-string-literals]

A string literal is an arbitrary number of characters bounded by single quotes `'`: `'Giant Robot'`. To include a single quote in the string, escape it using another single quote: `'Captain EO''s Voyage'`.

-::::{note} 
+::::{note}
An escaped single quote is **not** a double quote (`"`), but a single quote `'` *repeated* (`''`).
::::

-#### Numeric Literals [_numeric_literals] 
+#### Numeric Literals [_numeric_literals]

Numeric literals are accepted both in decimal and scientific notation with exponent marker (`e` or `E`), starting either with a digit or decimal point `.`:

@@ -86,7 +86,7 @@ Numeric literals are accepted both in decimal and scientific notation with expon

Numeric literals that contain a decimal point are always interpreted as being of type `double`. Those without are considered `integer` if they fit, otherwise their type is `long` (or `BIGINT` in ANSI SQL types).

-#### Generic Literals [sql-syntax-generic-literals] 
+#### Generic Literals [sql-syntax-generic-literals]

When dealing with an arbitrary-type literal, one creates the object by casting, typically from the string representation, to the desired type. This can be achieved through the dedicated [cast operator](sql-operators-cast.md) and [functions](sql-functions-type-conversion.md):
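
+For example, casting a string literal to a date type (a minimal, illustrative sketch):
+
+```sql
+SELECT CAST('2021-05-13T12:34:56' AS DATETIME);
+```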

@@ -116,7 +116,7 @@ SELECT "first_name" <1>

2. Single quotes `'` used for a string literal

-::::{note} 
+::::{note}
To escape single or double quotes, one needs to use that specific quote one more time. For example, the literal `John's` can be escaped like `SELECT 'John''s' AS name`. The same goes for double quotes escaping - `SELECT 123 AS "test""number"` will display as a result a column with the name `test"number`.
::::

diff --git a/explore-analyze/query-filter/languages/sql-like-rlike-operators.md b/explore-analyze/query-filter/languages/sql-like-rlike-operators.md
index bf5a96bdc..7b5d86e84 100644
--- a/explore-analyze/query-filter/languages/sql-like-rlike-operators.md
+++ b/explore-analyze/query-filter/languages/sql-like-rlike-operators.md
@@ -10,8 +10,8 @@ mapped_pages:

`LIKE` and `RLIKE` operators are commonly used to filter data based on string patterns. They usually act on a field placed on the left-hand side of the operator, but can also act on a constant (literal) expression. The right-hand side of the operator represents the pattern. Both can be used in the `WHERE` clause of the `SELECT` statement, but `LIKE` can also be used in other places, such as defining an [index pattern](sql-index-patterns.md) or across various [SHOW commands](sql-commands.md). This section covers only the `SELECT ... WHERE ...` usage.

-::::{note} 
-One significant difference between `LIKE`/`RLIKE` and the [full-text search predicates](sql-functions-search.md) is that the former act on [exact fields](sql-data-types.md#sql-multi-field) while the latter also work on [analyzed](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) fields. If the field used with `LIKE`/`RLIKE` doesn’t have an exact not-normalized sub-field (of [keyword](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) type) Elasticsearch SQL will not be able to run the query. If the field is either exact or has an exact sub-field, it will use it as is, or it will automatically use the exact sub-field even if it wasn’t explicitly specified in the statement.
+::::{note}
+One significant difference between `LIKE`/`RLIKE` and the [full-text search predicates](sql-functions-search.md) is that the former act on [exact fields](sql-data-types.md#sql-multi-field) while the latter also work on [analyzed](elasticsearch://reference/elasticsearch/mapping-reference/text.md) fields. If the field used with `LIKE`/`RLIKE` doesn’t have an exact non-normalized sub-field (of [keyword](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) type), Elasticsearch SQL will not be able to run the query. If the field is either exact or has an exact sub-field, it will use it as is, or it will automatically use the exact sub-field even if it wasn’t explicitly specified in the statement.
::::

@@ -33,7 +33,7 @@ LIKE constant_exp <2>

The percent sign represents zero, one or multiple characters. The underscore represents a single number or character. These symbols can be used in combinations.

-::::{note} 
+::::{note}
No other characters have special meaning or act as wildcards. Characters often used as wildcards in other languages (`*` or `?`) are treated as normal characters.
::::

@@ -54,7 +54,7 @@ SELECT name, author FROM library WHERE name LIKE 'Dune/%' ESCAPE '/';
```

In the example above `/` is defined as an escape character which needs to be placed before the `%` or `_` characters if one needs to match those characters in the pattern specifically. By default, there is no escape character defined.

-::::{important} 
+::::{important}
Even though `LIKE` is a valid option when searching or filtering in Elasticsearch SQL, the full-text search predicates `MATCH` and `QUERY` are [faster and much more powerful and are the preferred alternative](#sql-like-prefer-full-text).
::::

@@ -73,7 +73,7 @@ RLIKE constant_exp <2>

**Description**: This operator is similar to `LIKE`, but the user is not limited to searching for a string based on a fixed pattern with the percent sign (`%`) and underscore (`_`); the pattern in this case is a regular expression which allows the construction of more flexible patterns.

-For supported syntax, see [*Regular expression syntax*](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/regexp-syntax.md).
+For supported syntax, see [*Regular expression syntax*](elasticsearch://reference/query-languages/regexp-syntax.md).

```sql
SELECT author, name FROM library WHERE name RLIKE 'Child.* Dune';
@@ -83,7 +83,7 @@ SELECT author, name FROM library WHERE name RLIKE 'Child.* Dune';
Frank Herbert |Children of Dune
```

-::::{important} 
+::::{important}
Even though `RLIKE` is a valid option when searching or filtering in Elasticsearch SQL, full-text search predicates `MATCH` and `QUERY` are [faster and much more powerful and are the preferred alternative](#sql-like-prefer-full-text).
::::

diff --git a/explore-analyze/query-filter/languages/sql-limitations.md b/explore-analyze/query-filter/languages/sql-limitations.md
index e86fae499..86352db5e 100644
--- a/explore-analyze/query-filter/languages/sql-limitations.md
+++ b/explore-analyze/query-filter/languages/sql-limitations.md
@@ -134,7 +134,7 @@ But, if the sub-select would include a `GROUP BY` or `HAVING` or the enclosing `

## Using [`FIRST`](sql-functions-aggs.md#sql-functions-aggs-first)/[`LAST`](sql-functions-aggs.md#sql-functions-aggs-last) aggregation functions in `HAVING` clause [first-last-agg-functions-having-clause]

-Using `FIRST` and `LAST` in the `HAVING` clause is not supported. The same applies to [`MIN`](sql-functions-aggs.md#sql-functions-aggs-min) and [`MAX`](sql-functions-aggs.md#sql-functions-aggs-max) when their target column is of type [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) or [`unsigned_long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) as they are internally translated to `FIRST` and `LAST`.
+Using `FIRST` and `LAST` in the `HAVING` clause is not supported. The same applies to [`MIN`](sql-functions-aggs.md#sql-functions-aggs-min) and [`MAX`](sql-functions-aggs.md#sql-functions-aggs-max) when their target column is of type [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) or [`unsigned_long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) as they are internally translated to `FIRST` and `LAST`.

## Using TIME data type in GROUP BY or [`HISTOGRAM`](sql-functions-grouping.md#sql-functions-grouping-histogram) [group-by-time]

@@ -167,7 +167,7 @@ By default,`geo_points` fields are indexed and have doc values. However only lat

## Retrieving using the `fields` search parameter [using-fields-api]

-Elasticsearch SQL retrieves column values using the [search API’s `fields` parameter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param). Any limitations on the `fields` parameter also apply to Elasticsearch SQL queries. For example, if `_source` is disabled for any of the returned fields or at index level, the values cannot be retrieved.
+Elasticsearch SQL retrieves column values using the [search API’s `fields` parameter](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param). Any limitations on the `fields` parameter also apply to Elasticsearch SQL queries. For example, if `_source` is disabled for any of the returned fields or at index level, the values cannot be retrieved.

## Aggregations in the [`PIVOT`](sql-syntax-select.md#sql-syntax-pivot) clause [aggs-in-pivot]

diff --git a/explore-analyze/query-filter/languages/sql-pagination.md b/explore-analyze/query-filter/languages/sql-pagination.md
index 3362e4db5..fc7567c95 100644
--- a/explore-analyze/query-filter/languages/sql-pagination.md
+++ b/explore-analyze/query-filter/languages/sql-pagination.md
@@ -34,7 +34,7 @@ Which looks like:

Note that the `columns` object is only part of the first page.

-You’ve reached the last page when there is no `cursor` returned in the results. Like Elasticsearch’s [scroll](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results), SQL may keep state in Elasticsearch to support the cursor. Unlike scroll, receiving the last page is enough to guarantee that the Elasticsearch state is cleared.
+You’ve reached the last page when there is no `cursor` returned in the results. Like Elasticsearch’s [scroll](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results), SQL may keep state in Elasticsearch to support the cursor. Unlike scroll, receiving the last page is enough to guarantee that the Elasticsearch state is cleared.

To clear the state earlier, use the [clear cursor API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-clear-cursor):

diff --git a/explore-analyze/query-filter/languages/sql-rest-filtering.md b/explore-analyze/query-filter/languages/sql-rest-filtering.md
index 898a83665..3136e2587 100644
--- a/explore-analyze/query-filter/languages/sql-rest-filtering.md
+++ b/explore-analyze/query-filter/languages/sql-rest-filtering.md
@@ -34,8 +34,8 @@ Which returns:

Douglas Adams |The Hitchhiker's Guide to the Galaxy|180 |1979-10-12T00:00:00.000Z
```

-::::{tip} 
-A useful and less obvious usage for standard Query DSL filtering is to search documents by a specific [routing key](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/search-shard-routing.md#search-routing). Because Elasticsearch SQL does not support a `routing` parameter, one can specify a [`terms` filter for the `_routing` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-routing-field.md) instead:
+::::{tip}
+A useful and less obvious usage for standard Query DSL filtering is to search documents by a specific [routing key](elasticsearch://reference/elasticsearch/rest-apis/search-shard-routing.md#search-routing). Because Elasticsearch SQL does not support a `routing` parameter, one can specify a [`terms` filter for the `_routing` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-routing-field.md) instead:

```console
POST /_sql?format=txt

diff --git a/explore-analyze/query-filter/languages/sql-syntax-select.md b/explore-analyze/query-filter/languages/sql-syntax-select.md
index 7b6ba4b1e..39212d726 100644
--- a/explore-analyze/query-filter/languages/sql-syntax-select.md
+++ b/explore-analyze/query-filter/languages/sql-syntax-select.md
@@ -133,7 +133,7 @@ SELECT * FROM "emp" LIMIT 1;

1953-09-02T00:00:00Z|10001 |Georgi |M |1986-06-26T00:00:00.000Z|2 |Facello |Georgi Facello |57305
```

-The name can be a [pattern](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) pointing to multiple indices (likely requiring quoting as mentioned above) with the restriction that **all** resolved concrete tables have **exact mapping**.
+The name can be a [pattern](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) pointing to multiple indices (likely requiring quoting, as mentioned above) with the restriction that **all** resolved concrete tables have the **exact same mapping**.

```sql
SELECT emp_no FROM "e*p" LIMIT 1;

@@ -507,7 +507,7 @@ Ordering by aggregation is possible for up to **10000** entries for memory consu

When doing full-text queries in the `WHERE` clause, results can be returned based on their [score](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/relevance-intro.html) or *relevance* to the given query.

::::{note}
-When doing multiple text queries in the `WHERE` clause then, their scores will be combined using the same rules as {{es}}'s [bool query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md).
+When doing multiple text queries in the `WHERE` clause, their scores are combined using the same rules as {{es}}'s [bool query](elasticsearch://reference/query-languages/query-dsl-bool-query.md).
::::

diff --git a/explore-analyze/query-filter/languages/sql-syntax-show-tables.md b/explore-analyze/query-filter/languages/sql-syntax-show-tables.md
index 65cd00dd9..65173078b 100644
--- a/explore-analyze/query-filter/languages/sql-syntax-show-tables.md
+++ b/explore-analyze/query-filter/languages/sql-syntax-show-tables.md
@@ -38,7 +38,7 @@ javaRestTest |employees |VIEW |ALIAS

javaRestTest |library |TABLE |INDEX
```

-Match multiple indices by using {{es}} [multi-target syntax](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) notation:
+Match multiple indices by using the {{es}} [multi-target syntax](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) notation:

```sql
SHOW TABLES "*,-l*";

diff --git a/explore-analyze/query-filter/languages/sql-translate.md b/explore-analyze/query-filter/languages/sql-translate.md
index 42ccf9cee..dbd6dac7f 100644
--- a/explore-analyze/query-filter/languages/sql-translate.md
+++ b/explore-analyze/query-filter/languages/sql-translate.md
@@ -52,7 +52,7 @@ Which returns:

}
```

-Which is the request that SQL will run to provide the results. In this case, SQL will use the [scroll](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results) API. If the result contained an aggregation then SQL would use the normal [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search).
+This is the request that SQL will run to provide the results. In this case, SQL will use the [scroll](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results) API. If the result contained an aggregation, then SQL would use the normal [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search).

The request body accepts the same [parameters](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-query) as the [SQL search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-query), excluding `cursor`.
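
+For example, a translation request of the following shape (reusing the `library` test data from these docs) produces output like the one shown above; a minimal sketch:
+
+```console
+POST /_sql/translate
+{
+  "query": "SELECT * FROM library ORDER BY page_count DESC LIMIT 10"
+}
+```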

diff --git a/explore-analyze/query-filter/tools/console.md b/explore-analyze/query-filter/tools/console.md
index 3de92d1f8..143168877 100644
--- a/explore-analyze/query-filter/tools/console.md
+++ b/explore-analyze/query-filter/tools/console.md
@@ -26,7 +26,7 @@ $$$configuring-console$$$

$$$import-export-console-requests$$$

-**Console** is an interactive UI for sending requests to [{{es}} APIs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md) and [{{kib}} APIs](https://www.elastic.co/docs/api) and viewing their responses.
+**Console** is an interactive UI for sending requests to [{{es}} APIs](elasticsearch://reference/elasticsearch/rest-apis/index.md) and [{{kib}} APIs](https://www.elastic.co/docs/api) and viewing their responses.

:::{image} ../../../images/kibana-console.png
:alt: Console

diff --git a/explore-analyze/query-filter/tools/grok-debugger.md b/explore-analyze/query-filter/tools/grok-debugger.md
index 7e9e9f135..70bbaf8c6 100644
--- a/explore-analyze/query-filter/tools/grok-debugger.md
+++ b/explore-analyze/query-filter/tools/grok-debugger.md
@@ -10,7 +10,7 @@ mapped_pages:

You can build and debug grok patterns in the {{kib}} **Grok Debugger** before you use them in your data processing pipelines. Grok is a pattern matching syntax that you can use to parse arbitrary text and structure it. Grok is good for parsing syslog, apache, and other webserver logs, mysql logs, and in general, any log format that is written for human consumption.

-Grok patterns are supported in {{es}} [runtime fields](../../../manage-data/data-store/mapping/runtime-fields.md), the {{es}} [grok ingest processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/grok-processor.md), and the {{ls}} [grok filter](asciidocalypse://docs/logstash/docs/reference/plugins-filters-grok.md). For syntax, see [Grokking grok](../../scripting/grok.md).
+Grok patterns are supported in {{es}} [runtime fields](../../../manage-data/data-store/mapping/runtime-fields.md), the {{es}} [grok ingest processor](elasticsearch://reference/ingestion-tools/enrich-processor/grok-processor.md), and the {{ls}} [grok filter](asciidocalypse://docs/logstash/docs/reference/plugins-filters-grok.md). For syntax, see [Grokking grok](../../scripting/grok.md).

The {{stack}} ships with more than 120 reusable grok patterns. For a complete list of patterns, see [{{es}} grok patterns](https://github.com/elastic/elasticsearch/tree/master/libs/grok/src/main/resources/patterns) and [{{ls}} grok patterns](https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns).

diff --git a/explore-analyze/report-and-share/reporting-troubleshooting-csv.md b/explore-analyze/report-and-share/reporting-troubleshooting-csv.md
index 8e1ceaed4..9dbcb3ada 100644
--- a/explore-analyze/report-and-share/reporting-troubleshooting-csv.md
+++ b/explore-analyze/report-and-share/reporting-troubleshooting-csv.md
@@ -43,7 +43,7 @@ The Kibana CSV export feature collects all of the data from Elasticsearch by usi

1. Permissions to read data aliases alone will not work: the permissions are needed on the underlying indices or data streams.
2. In cases where data shards are unavailable or time out, the export will be empty rather than returning partial data.

-Some users may benefit from using the [scroll API](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results), an alternative to paging through the data. The behavior of this API does not have the limitations of point in time API, however it has its own limitations:
+Some users may benefit from using the [scroll API](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results), an alternative to paging through the data. The behavior of this API does not have the limitations of the point in time API; however, it has its own limitations:

1. Search is limited to 500 shards at the very most.
2. In cases where the data shards are unavailable or time out, the export may return partial data.

diff --git a/explore-analyze/scripting/dissect.md b/explore-analyze/scripting/dissect.md
index 48592568e..4630a5fd4 100644
--- a/explore-analyze/scripting/dissect.md
+++ b/explore-analyze/scripting/dissect.md
@@ -55,7 +55,7 @@ Now that you have a dissect pattern, how do you test and use it?

## Test dissect patterns with Painless [dissect-patterns-test]

-You can incorporate dissect patterns into Painless scripts to extract data. To test your script, use either the [field contexts](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-api-examples.md#painless-execute-runtime-field-context) of the Painless execute API or create a runtime field that includes the script. Runtime fields offer greater flexibility and accept multiple documents, but the Painless execute API is a great option if you don’t have write access on a cluster where you’re testing a script.
+You can incorporate dissect patterns into Painless scripts to extract data. To test your script, use either the [field contexts](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md#painless-execute-runtime-field-context) of the Painless execute API or create a runtime field that includes the script. Runtime fields offer greater flexibility and accept multiple documents, but the Painless execute API is a great option if you don’t have write access on a cluster where you’re testing a script.

For example, test your dissect pattern with the Painless execute API by including your Painless script and a single document that matches your data. Start by indexing the `message` field as a `wildcard` data type:

diff --git a/explore-analyze/scripting/grok.md b/explore-analyze/scripting/grok.md
index 370afb18b..e5dc1d666 100644
--- a/explore-analyze/scripting/grok.md
+++ b/explore-analyze/scripting/grok.md
@@ -53,7 +53,7 @@ New features and enhancements will be added to the ECS-compliant files. The lega

## Use grok patterns in Painless scripts [grok-patterns]

-You can incorporate predefined grok patterns into Painless scripts to extract data. To test your script, use either the [field contexts](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-api-examples.md#painless-execute-runtime-field-context) of the Painless execute API or create a runtime field that includes the script. Runtime fields offer greater flexibility and accept multiple documents, but the Painless execute API is a great option if you don’t have write access on a cluster where you’re testing a script.
+You can incorporate predefined grok patterns into Painless scripts to extract data. To test your script, use either the [field contexts](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md#painless-execute-runtime-field-context) of the Painless execute API or create a runtime field that includes the script. Runtime fields offer greater flexibility and accept multiple documents, but the Painless execute API is a great option if you don’t have write access on a cluster where you’re testing a script.

::::{tip}
If you need help building grok patterns to match your data, use the [Grok Debugger](../query-filter/tools/grok-debugger.md) tool in {{kib}}.
::::

@@ -67,7 +67,7 @@ For example, if you’re working with Apache log data, you can use the `%{{COMMO

[30/Apr/2020:14:30:17 -0500] \"GET /images/hm_bg.jpg HTTP/1.0\" 200 24736"
```

-To extract the IP address from the `message` field, you can write a Painless script that incorporates the `%{{COMMONAPACHELOG}}` syntax. You can test this script using the [`ip` field context](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-api-examples.md#painless-runtime-ip) of the Painless execute API, but let’s use a runtime field instead.
+To extract the IP address from the `message` field, you can write a Painless script that incorporates the `%{{COMMONAPACHELOG}}` syntax. You can test this script using the [`ip` field context](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md#painless-runtime-ip) of the Painless execute API, but let’s use a runtime field instead.

Based on the sample document, index the `@timestamp` and `message` fields. To remain flexible, use `wildcard` as the field type for `message`:

@@ -154,7 +154,7 @@ GET my-index/_search

## Return calculated results [grok-pattern-results]

-Using the `http.clientip` runtime field, you can define a simple query to run a search for a specific IP address and return all related fields. The [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API works for all fields, even those that weren’t sent as part of the original `_source`:
+Using the `http.clientip` runtime field, you can define a simple query to run a search for a specific IP address and return all related fields. The [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API works for all fields, even those that weren’t sent as part of the original `_source`:

```console
GET my-index/_search

diff --git a/explore-analyze/scripting/modules-scripting-engine.md b/explore-analyze/scripting/modules-scripting-engine.md
index 06e0f1d25..52093e8bd 100644
--- a/explore-analyze/scripting/modules-scripting-engine.md
+++ b/explore-analyze/scripting/modules-scripting-engine.md
@@ -10,7 +10,7 @@ mapped_pages:

A `ScriptEngine` is a backend for implementing a scripting language. It may also be used to write scripts that need to use advanced internals of scripting; for example, a script that wants to use term frequencies while scoring.

-The plugin [documentation](asciidocalypse://docs/elasticsearch/docs/extend/index.md) has more information on how to write a plugin so that Elasticsearch will properly load it. To register the `ScriptEngine`, your plugin should implement the `ScriptPlugin` interface and override the `getScriptEngine(Settings settings)` method.
+The plugin [documentation](elasticsearch://extend/index.md) has more information on how to write a plugin so that Elasticsearch will properly load it. To register the `ScriptEngine`, your plugin should implement the `ScriptPlugin` interface and override the `getScriptEngine(Settings settings)` method.
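
+A minimal registration sketch, assuming illustrative class names (newer {{es}} versions pass an additional `Collection<ScriptContext<?>>` argument to this method; adjust to the interface version you build against):
+
+```java
+import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.plugins.Plugin;
+import org.elasticsearch.plugins.ScriptPlugin;
+import org.elasticsearch.script.ScriptEngine;
+
+// Hypothetical plugin class that hands the custom engine to Elasticsearch.
+public class MyExpertScriptPlugin extends Plugin implements ScriptPlugin {
+    @Override
+    public ScriptEngine getScriptEngine(Settings settings) {
+        return new MyExpertScriptEngine(); // the engine class is shown in the full example below
+    }
+}
+```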

The following is an example of a custom `ScriptEngine` which uses the language name `expert_scripts`. It implements a single script called `pure_df`, which may be used as a search script to override each document’s score with the document frequency of a provided term.

diff --git a/explore-analyze/scripting/modules-scripting-fields.md b/explore-analyze/scripting/modules-scripting-fields.md
index 88c591a86..5f891ea4c 100644
--- a/explore-analyze/scripting/modules-scripting-fields.md
+++ b/explore-analyze/scripting/modules-scripting-fields.md
@@ -11,34 +11,34 @@ mapped_pages:

Depending on where a script is used, it will have access to certain special variables and document fields.

-## Update scripts [_update_scripts] 
+## Update scripts [_update_scripts]

A script used in the [update](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update), [update-by-query](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update-by-query), or [reindex](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) API will have access to the `ctx` variable which exposes:

`ctx._source`
-: Access to the document [`_source` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md).
+: Access to the document [`_source` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md).

`ctx.op`
: The operation that should be applied to the document: `index` or `delete`.

`ctx._index` etc
-: Access to [document metadata fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/document-metadata-fields.md), some of which may be read-only.
+: Access to [document metadata fields](elasticsearch://reference/elasticsearch/mapping-reference/document-metadata-fields.md), some of which may be read-only.

These scripts do not have access to the `doc` variable and have to use `ctx` to access the documents they operate on.
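
+For instance, a minimal update sketch (the `my_counter` field and `count` parameter are illustrative):
+
+```console
+POST my-index-000001/_update/1
+{
+  "script": {
+    "source": "ctx._source.my_counter += params.count",
+    "params": {
+      "count": 4
+    }
+  }
+}
+```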

-## Search and aggregation scripts [_search_and_aggregation_scripts] 
+## Search and aggregation scripts [_search_and_aggregation_scripts]

-With the exception of [script fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) which are executed once per search hit, scripts used in search and aggregations will be executed once for every document which might match a query or an aggregation. Depending on how many documents you have, this could mean millions or billions of executions: these scripts need to be fast!
+With the exception of [script fields](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) which are executed once per search hit, scripts used in search and aggregations will be executed once for every document which might match a query or an aggregation. Depending on how many documents you have, this could mean millions or billions of executions: these scripts need to be fast!

Field values can be accessed from a script using [doc-values](#modules-scripting-doc-vals), [the `_source` field](#modules-scripting-source), or [stored fields](#modules-scripting-stored), each of which is explained below.

-### Accessing the score of a document within a script [scripting-score] 
+### Accessing the score of a document within a script [scripting-score]

-Scripts used in the [`function_score` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-function-score-query.md), in [script-based sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/sort-search-results.md), or in [aggregations](../query-filter/aggregations.md) have access to the `_score` variable which represents the current relevance score of a document.
+Scripts used in the [`function_score` query](elasticsearch://reference/query-languages/query-dsl-function-score-query.md), in [script-based sorting](elasticsearch://reference/elasticsearch/rest-apis/sort-search-results.md), or in [aggregations](../query-filter/aggregations.md) have access to the `_score` variable which represents the current relevance score of a document.

-Here’s an example of using a script in a [`function_score` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-function-score-query.md) to alter the relevance `_score` of each document:
+Here’s an example of using a script in a [`function_score` query](elasticsearch://reference/query-languages/query-dsl-function-score-query.md) to alter the relevance `_score` of each document:

```console
PUT my-index-000001/_doc/1?refresh

@@ -74,11 +74,11 @@ GET my-index-000001/_search
```

-### Accessing term statistics of a document within a script [scripting-term-statistics] 
+### Accessing term statistics of a document within a script [scripting-term-statistics]

-Scripts used in a [`script_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) query have access to the `_termStats` variable which provides statistical information about the terms in the child query.
+Scripts used in a [`script_score`](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) query have access to the `_termStats` variable which provides statistical information about the terms in the child query.

-In the following example, `_termStats` is used within a [`script_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) query to retrieve the average term frequency for the terms `quick`, `brown`, and `fox` in the `text` field:
+In the following example, `_termStats` is used within a [`script_score`](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) query to retrieve the average term frequency for the terms `quick`, `brown`, and `fox` in the `text` field:

```console
PUT my-index-000001/_doc/1?refresh

@@ -141,9 +141,9 @@ The `_termStats` variable is only available when using the [Painless](modules-sc

-### Doc values [modules-scripting-doc-vals] 
+### Doc values [modules-scripting-doc-vals]

-By far the fastest most efficient way to access a field value from a script is to use the `doc['field_name']` syntax, which retrieves the field value from [doc values](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/doc-values.md). Doc values are a columnar field value store, enabled by default on all fields except for [analyzed `text` fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md).
+By far the fastest and most efficient way to access a field value from a script is to use the `doc['field_name']` syntax, which retrieves the field value from [doc values](elasticsearch://reference/elasticsearch/mapping-reference/doc-values.md). Doc values are a columnar field value store, enabled by default on all fields except for [analyzed `text` fields](elasticsearch://reference/elasticsearch/mapping-reference/text.md).

```console
PUT my-index-000001/_doc/1?refresh

@@ -180,22 +180,22 @@ The `doc['field']` will throw an error if `field` is missing from the mappings.

::::{admonition} Doc values and `text` fields
:class: note

-The `doc['field']` syntax can also be used for [analyzed `text` fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) if [`fielddata`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#fielddata-mapping-param) is enabled, but **BEWARE**: enabling fielddata on a `text` field requires loading all of the terms into the JVM heap, which can be very expensive both in terms of memory and CPU. It seldom makes sense to access `text` fields from scripts.
+The `doc['field']` syntax can also be used for [analyzed `text` fields](elasticsearch://reference/elasticsearch/mapping-reference/text.md) if [`fielddata`](elasticsearch://reference/elasticsearch/mapping-reference/text.md#fielddata-mapping-param) is enabled, but **BEWARE**: enabling fielddata on a `text` field requires loading all of the terms into the JVM heap, which can be very expensive both in terms of memory and CPU. It seldom makes sense to access `text` fields from scripts.
::::

-### The document `_source` [modules-scripting-source] 
+### The document `_source` [modules-scripting-source]

-The document [`_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) can be accessed using the `_source.field_name` syntax. The `_source` is loaded as a map-of-maps, so properties within object fields can be accessed as, for example, `_source.name.first`.
+The document [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) can be accessed using the `_source.field_name` syntax. The `_source` is loaded as a map-of-maps, so properties within object fields can be accessed as, for example, `_source.name.first`.

::::{admonition} Prefer doc-values to _source
:class: important

Accessing the `_source` field is much slower than using doc-values. The `_source` field is optimised for returning several fields per result, while doc values are optimised for accessing the value of a specific field in many documents.

-It makes sense to use `_source` when generating a [script field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) for the top ten hits from a search result but, for other search and aggregation use cases, always prefer using doc values.
+It makes sense to use `_source` when generating a [script field](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) for the top ten hits from a search result, but for other search and aggregation use cases, always prefer using doc values.

::::

@@ -237,9 +237,9 @@ GET my-index-000001/_search
```

-### Stored fields [modules-scripting-stored] 
+### Stored fields [modules-scripting-stored]

-*Stored fields* — fields explicitly marked as [`"store": true`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-store.md) in the mapping — can be accessed using the `_fields['field_name'].value` or `_fields['field_name']` syntax:
+*Stored fields* — fields explicitly marked as [`"store": true`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-store.md) in the mapping — can be accessed using the `_fields['field_name'].value` or `_fields['field_name']` syntax:

```console
PUT my-index-000001

diff --git a/explore-analyze/scripting/modules-scripting-painless.md b/explore-analyze/scripting/modules-scripting-painless.md
index a668b26ee..99bbef45c 100644
--- a/explore-analyze/scripting/modules-scripting-painless.md
+++ b/explore-analyze/scripting/modules-scripting-painless.md
@@ -22,4 +22,4 @@ Painless provides numerous capabilities that center around the following core pr

Ready to start scripting with Painless? Learn how to [write your first script](modules-scripting-using.md).

-If you’re already familiar with Painless, see the [Painless Language Specification](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-language-specification.md) for a detailed description of the Painless syntax and language features.
+If you’re already familiar with Painless, see the [Painless Language Specification](elasticsearch://reference/scripting-languages/painless/painless-language-specification.md) for a detailed description of the Painless syntax and language features.

diff --git a/explore-analyze/scripting/modules-scripting-security.md b/explore-analyze/scripting/modules-scripting-security.md
index 410e2c5cd..79ef2fd30 100644
--- a/explore-analyze/scripting/modules-scripting-security.md
+++ b/explore-analyze/scripting/modules-scripting-security.md
@@ -16,16 +16,16 @@ The second layer of security is the [Java Security Manager](https://www.oracle.c

{{es}} uses [seccomp](https://en.wikipedia.org/wiki/Seccomp) in Linux, [Seatbelt](https://www.chromium.org/developers/design-documents/sandbox/osx-sandboxing-design) in macOS, and [ActiveProcessLimit](https://msdn.microsoft.com/en-us/library/windows/desktop/ms684147) on Windows as additional security layers to prevent {{es}} from forking or running other processes.

-Finally, scripts used in [scripted metrics aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) can be restricted to a defined list of scripts, or forbidden altogether. This can prevent users from running particularly slow or resource intensive aggregation queries.
+Finally, scripts used in [scripted metrics aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) can be restricted to a defined list of scripts, or forbidden altogether. This can prevent users from running particularly slow or resource intensive aggregation queries.

-You can modify the following script settings to restrict the type of scripts that are allowed to run, and control the available [contexts](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-contexts.md) that scripts can run in. To implement additional layers in your defense in depth strategy, follow the [{{es}} security principles](../../deploy-manage/security.md).
+You can modify the following script settings to restrict the type of scripts that are allowed to run, and control the available [contexts](elasticsearch://reference/scripting-languages/painless/painless-contexts.md) that scripts can run in. To implement additional layers in your defense in depth strategy, follow the [{{es}} security principles](../../deploy-manage/security.md).

-## Allowed script types setting [allowed-script-types-setting] 
+## Allowed script types setting [allowed-script-types-setting]

{{es}} supports two script types: `inline` and `stored`. By default, {{es}} is configured to run both types of scripts. To limit what type of scripts are run, set `script.allowed_types` to `inline` or `stored`. To prevent any scripts from running, set `script.allowed_types` to `none`.

-::::{important} 
+::::{important}
If you use {{kib}}, set `script.allowed_types` to both or just `inline`. Some {{kib}} features rely on inline scripts and do not function as expected if {{es}} does not allow inline scripts.
::::

@@ -37,7 +37,7 @@ script.allowed_types: inline
```

-## Allowed script contexts setting [allowed-script-contexts-setting] 
+## Allowed script contexts setting [allowed-script-contexts-setting]

By default, all script contexts are permitted. Use the `script.allowed_contexts` setting to specify the contexts that are allowed. To specify that no contexts are allowed, set `script.allowed_contexts` to `none`.

@@ -48,9 +48,9 @@ script.allowed_contexts: score, update
```

-## Allowed scripts in scripted metrics aggregations [allowed-script-in-aggs-settings] 
+## Allowed scripts in scripted metrics aggregations [allowed-script-in-aggs-settings]

-By default, all scripts are permitted in [scripted metrics aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md). To restrict the set of allowed scripts, set [`search.aggs.only_allowed_metric_scripts`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/search-settings.md#search-settings-only-allowed-scripts) to `true` and provide the allowed scripts using [`search.aggs.allowed_inline_metric_scripts`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/search-settings.md#search-settings-allowed-inline-scripts) and/or [`search.aggs.allowed_stored_metric_scripts`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/search-settings.md#search-settings-allowed-stored-scripts).
+By default, all scripts are permitted in [scripted metrics aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md). To restrict the set of allowed scripts, set [`search.aggs.only_allowed_metric_scripts`](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-only-allowed-scripts) to `true` and provide the allowed scripts using [`search.aggs.allowed_inline_metric_scripts`](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-allowed-inline-scripts) and/or [`search.aggs.allowed_stored_metric_scripts`](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-allowed-stored-scripts).

To disallow certain script types, omit the corresponding script list (`search.aggs.allowed_inline_metric_scripts` or `search.aggs.allowed_stored_metric_scripts`) or set it to an empty array. When both script lists are not empty, the given stored scripts and the given inline scripts will be allowed.
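
+For example, to allow a single inline metric script in `elasticsearch.yml` (the script body is illustrative):
+
+```yaml
+search.aggs.only_allowed_metric_scripts: true
+search.aggs.allowed_inline_metric_scripts:
+  - 'state.transactions = []'
+```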
To disallow certain script types, omit the corresponding script list (`search.aggs.allowed_inline_metric_scripts` or `search.aggs.allowed_stored_metric_scripts`) or set it to an empty array. When both script lists are non-empty, the listed stored scripts and the listed inline scripts are allowed. diff --git a/explore-analyze/scripting/modules-scripting-using.md index e37b0a284..085f7e036 100644 --- a/explore-analyze/scripting/modules-scripting-using.md +++ b/explore-analyze/scripting/modules-scripting-using.md @@ -28,13 +28,13 @@ Wherever scripting is supported in the {{es}} APIs, the syntax follows the same : Specifies any named parameters that are passed into the script as variables. [Use parameters](#prefer-params) instead of hard-coded values to decrease compile time. -## Write your first script [hello-world-script] +## Write your first script [hello-world-script] [Painless](modules-scripting-painless.md) is the default scripting language for {{es}}. It is secure, performant, and provides a natural syntax for anyone with a little coding experience. A Painless script is structured as one or more statements and optionally has one or more user-defined functions at the beginning. A script must always have at least one statement. -The [Painless execute API](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-api-examples.md) provides the ability to test a script with simple user-defined parameters and receive a result. Let’s start with a complete script and review its constituent parts. +The [Painless execute API](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md) provides the ability to test a script with simple user-defined parameters and receive a result. Let’s start with a complete script and review its constituent parts. First, index a document with a single field so that we have some data to work with: @@ -45,7 +45,7 @@ PUT my-index-000001/_doc/1 } ``` -We can then construct a script that operates on that field and run evaluate the script as part of a query. The following query uses the [`script_fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) parameter of the search API to retrieve a script valuation. There’s a lot happening here, but we’ll break it down the components to understand them individually. For now, you only need to understand that this script takes `my_field` and operates on it. +We can then construct a script that operates on that field and evaluate the script as part of a query. The following query uses the [`script_fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) parameter of the search API to retrieve a script evaluation. There’s a lot happening here, but we’ll break down its components to understand them individually. For now, you only need to understand that this script takes `my_field` and operates on it. ```console GET my-index-000001/_search @@ -70,7 +70,7 @@ GET my-index-000001/_search The `script` is a standard JSON object that defines scripts under most APIs in {{es}}. This object requires `source` to define the script itself. The script doesn’t specify a language, so it defaults to Painless. -## Use parameters in your script [prefer-params] +## Use parameters in your script [prefer-params] The first time {{es}} sees a new script, it compiles the script and stores the compiled version in a cache.
Compilation can be a heavy process. Rather than hard-coding values in your script, pass them as named `params` instead. @@ -97,13 +97,13 @@ You can compile up to 150 scripts per 5 minutes by default. For ingest contexts, script.context.field.max_compilations_rate=100/10m ``` -::::{important} +::::{important} If you compile too many unique scripts within a short time, {{es}} rejects the new dynamic scripts with a `circuit_breaking_exception` error. :::: -## Shorten your script [script-shorten-syntax] +## Shorten your script [script-shorten-syntax] Using syntactic abilities that are native to Painless, you can reduce verbosity in your scripts and make them shorter. Here’s a simple script that we can make shorter: @@ -152,13 +152,13 @@ This version of the script removes several components and simplifies the syntax Use this abbreviated syntax anywhere that {{es}} supports scripts, such as when you’re creating [runtime fields](../../manage-data/data-store/mapping/map-runtime-field.md). -## Store and retrieve scripts [script-stored-scripts] +## Store and retrieve scripts [script-stored-scripts] You can store and retrieve scripts from the cluster state using the [stored script APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-script). Stored scripts allow you to reference shared scripts for operations like scoring, aggregating, filtering, and reindexing. Instead of embedding scripts inline in each query, you can reference these shared operations. Stored scripts can also reduce request payload size. Depending on script size and request frequency, this can help lower latency and data transfer costs. -::::{note} +::::{note} Unlike regular scripts, stored scripts require that you specify a script language using the `lang` parameter. :::: @@ -214,7 +214,7 @@ DELETE _scripts/calculate-score ``` -## Update documents with scripts [scripts-update-scripts] +## Update documents with scripts [scripts-update-scripts] You can use the [update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) to update documents with a specified script. The script can update, delete, or skip modifying the document. The update API also supports passing a partial document, which is merged into the existing document. diff --git a/explore-analyze/scripting/scripting-field-extraction.md b/explore-analyze/scripting/scripting-field-extraction.md index 3f4ca58d7..ad3490e90 100644 --- a/explore-analyze/scripting/scripting-field-extraction.md +++ b/explore-analyze/scripting/scripting-field-extraction.md @@ -246,7 +246,7 @@ The following pattern tells dissect to return the term `used`, a blank space, th emit("used" + ' ' + gc.usize + ', ' + "capacity" + ' ' + gc.csize + ', ' + "committed" + ' ' + gc.comsize) ``` -Putting it all together, you can create a runtime field named `gc_size` in a search request. Using the [`fields` option](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param), you can retrieve all values for the `gc_size` runtime field. This query also includes a bucket aggregation to group your data. +Putting it all together, you can create a runtime field named `gc_size` in a search request. Using the [`fields` option](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param), you can retrieve all values for the `gc_size` runtime field. This query also includes a bucket aggregation to group your data. 
```console GET my-index/_search diff --git a/explore-analyze/scripting/scripts-search-speed.md b/explore-analyze/scripting/scripts-search-speed.md index db12416d8..ba2bbd18f 100644 --- a/explore-analyze/scripting/scripts-search-speed.md +++ b/explore-analyze/scripting/scripts-search-speed.md @@ -16,13 +16,13 @@ If you see a large number of script cache evictions and a rising number of compi All scripts are cached by default so that they only need to be recompiled when updates occur. By default, scripts do not have a time-based expiration. You can change this behavior by using the `script.cache.expire` setting. Use the `script.cache.max_size` setting to configure the size of the cache. -::::{note} +::::{note} The size of scripts is limited to 65,535 bytes. Set the value of `script.max_size_in_bytes` to increase that soft limit. If your scripts are really large, then consider using a [native script engine](modules-scripting-engine.md). :::: -## Improving search speed [_improving_search_speed] +## Improving search speed [_improving_search_speed] Scripts are incredibly useful, but can’t use {{es}}'s index structures or related optimizations. This relationship can sometimes result in slower search speeds. @@ -72,7 +72,7 @@ PUT /my_test_scores/_mapping } ``` -Next, use an [ingest pipeline](../../manage-data/ingest/transform-enrich/ingest-pipelines.md) containing the [script processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/script-processor.md) to calculate the sum of `math_score` and `verbal_score` and index it in the `total_score` field. +Next, use an [ingest pipeline](../../manage-data/ingest/transform-enrich/ingest-pipelines.md) containing the [script processor](elasticsearch://reference/ingestion-tools/enrich-processor/script-processor.md) to calculate the sum of `math_score` and `verbal_score` and index it in the `total_score` field. ```console PUT _ingest/pipeline/my_test_scores_pipeline diff --git a/explore-analyze/transforms/ecommerce-transforms.md b/explore-analyze/transforms/ecommerce-transforms.md index 3e938b3cb..fb82e8af3 100644 --- a/explore-analyze/transforms/ecommerce-transforms.md +++ b/explore-analyze/transforms/ecommerce-transforms.md @@ -27,7 +27,7 @@ mapped_pages: :class: screenshot ::: - Group the data by customer ID and add one or more aggregations to learn more about each customer’s orders. For example, let’s calculate the sum of products they purchased, the total price of their purchases, the maximum number of products that they purchased in a single order, and their total number of orders. We’ll accomplish this by using the [`sum` aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-sum-aggregation.md) on the `total_quantity` and `taxless_total_price` fields, the [`max` aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-max-aggregation.md) on the `total_quantity` field, and the [`cardinality` aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-cardinality-aggregation.md) on the `order_id` field: + Group the data by customer ID and add one or more aggregations to learn more about each customer’s orders. For example, let’s calculate the sum of products they purchased, the total price of their purchases, the maximum number of products that they purchased in a single order, and their total number of orders. 
We’ll accomplish this by using the [`sum` aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-sum-aggregation.md) on the `total_quantity` and `taxless_total_price` fields, the [`max` aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-max-aggregation.md) on the `total_quantity` field, and the [`cardinality` aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-cardinality-aggregation.md) on the `order_id` field: :::{image} ../../images/elasticsearch-reference-ecommerce-pivot2.png :alt: Adding multiple aggregations to a {{transform}} in {{kib}} @@ -171,7 +171,7 @@ mapped_pages: :::: 5. Optional: Create the destination index. - If the destination index does not exist, it is created the first time you start your {{transform}}. A pivot transform deduces the mappings for the destination index from the source indices and the transform aggregations. If there are fields in the destination index that are derived from scripts (for example, if you use [`scripted_metrics`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) or [`bucket_scripts`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md) aggregations), they’re created with [dynamic mappings](../../manage-data/data-store/mapping/dynamic-mapping.md). You can use the preview {{transform}} API to preview the mappings it will use for the destination index. In {{kib}}, if you copied the API request to your clipboard, paste it into the console, then refer to the `generated_dest_index` object in the API response. + If the destination index does not exist, it is created the first time you start your {{transform}}. A pivot transform deduces the mappings for the destination index from the source indices and the transform aggregations. If there are fields in the destination index that are derived from scripts (for example, if you use [`scripted_metrics`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) or [`bucket_scripts`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md) aggregations), they’re created with [dynamic mappings](../../manage-data/data-store/mapping/dynamic-mapping.md). You can use the preview {{transform}} API to preview the mappings it will use for the destination index. In {{kib}}, if you copied the API request to your clipboard, paste it into the console, then refer to the `generated_dest_index` object in the API response. ::::{note} {{transforms-cap}} might have more configuration options provided by the APIs than the options available in {{kib}}. For example, you can set an ingest pipeline for `dest` by calling the [Create {{transform}}](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-transform-put-transform). For all the {{transform}} configuration options, refer to the [documentation](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-transform). 
:::: diff --git a/explore-analyze/transforms/transform-checkpoints.md index 09699505e..db09d7eb7 100644 --- a/explore-analyze/transforms/transform-checkpoints.md +++ b/explore-analyze/transforms/transform-checkpoints.md @@ -41,7 +41,7 @@ If the cluster experiences unsuitable performance degradation due to the {{trans In most cases, it is strongly recommended to use the ingest timestamp of the source indices for syncing the {{transform}}. This is the optimal way for {{transforms}} to be able to identify new changes. If your data source follows the [ECS standard](asciidocalypse://docs/ecs/docs/reference/index.md), you might already have an [`event.ingested`](asciidocalypse://docs/ecs/docs/reference/ecs-event.md#field-event-ingested) field. In this case, use `event.ingested` as the `sync`.`time`.`field` property of your {{transform}}. -If you don’t have a `event.ingested` field or it isn’t populated, you can set it by using an ingest pipeline. Create an ingest pipeline either using the [ingest pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) (like the example below) or via {{kib}} under **Stack Management > Ingest Pipelines**. Use a [`set` processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/set-processor.md) to set the field and associate it with the value of the ingest timestamp. +If you don’t have an `event.ingested` field or it isn’t populated, you can set it by using an ingest pipeline. Create an ingest pipeline either using the [ingest pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) (like the example below) or via {{kib}} under **Stack Management > Ingest Pipelines**. Use a [`set` processor](elasticsearch://reference/ingestion-tools/enrich-processor/set-processor.md) to set the field and associate it with the value of the ingest timestamp. ```console PUT _ingest/pipeline/set_ingest_time diff --git a/explore-analyze/transforms/transform-examples.md index 9d112bb7b..5feffc7ba 100644 --- a/explore-analyze/transforms/transform-examples.md +++ b/explore-analyze/transforms/transform-examples.md @@ -94,7 +94,7 @@ It’s possible to answer these questions using aggregations alone, however {{tr ## Finding air carriers with the most delays [example-airline] -This example uses the Flights sample data set to find out which air carrier had the most delays. First, filter the source data such that it excludes all the cancelled flights by using a query filter. Then transform the data to contain the distinct number of flights, the sum of delayed minutes, and the sum of the flight minutes by air carrier. Finally, use a [`bucket_script`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md) to determine what percentage of the flight time was actually delay. +This example uses the Flights sample data set to find out which air carrier had the most delays. First, filter the source data such that it excludes all the cancelled flights by using a query filter. Then transform the data to contain the distinct number of flights, the sum of delayed minutes, and the sum of the flight minutes by air carrier.
Finally, use a [`bucket_script`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md) to determine what percentage of the flight time was actually delayed. ```console POST _transform/_preview @@ -415,9 +415,9 @@ This {{transform}} makes it easier to answer questions such as: ## Finding client IPs that sent the most bytes to the server [example-bytes] -This example uses the web log sample data set to find the client IP that sent the most bytes to the server in every hour. The example uses a `pivot` {{transform}} with a [`top_metrics`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-top-metrics.md) aggregation. +This example uses the web log sample data set to find the client IP that sent the most bytes to the server in every hour. The example uses a `pivot` {{transform}} with a [`top_metrics`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-top-metrics.md) aggregation. -Group the data by a [date histogram](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md#_date_histogram) on the time field with an interval of one hour. Use a [max aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-max-aggregation.md) on the `bytes` field to get the maximum amount of data that is sent to the server. Without the `max` aggregation, the API call still returns the client IP that sent the most bytes, however, the amount of bytes that it sent is not returned. In the `top_metrics` property, specify `clientip` and `geo.src`, then sort them by the `bytes` field in descending order. The {{transform}} returns the client IP that sent the biggest amount of data and the 2-letter ISO code of the corresponding location. +Group the data by a [date histogram](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md#_date_histogram) on the time field with an interval of one hour. Use a [max aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-max-aggregation.md) on the `bytes` field to get the maximum amount of data that is sent to the server. Without the `max` aggregation, the API call still returns the client IP that sent the most bytes; however, the number of bytes that it sent is not returned. In the `top_metrics` property, specify `clientip` and `geo.src`, then sort them by the `bytes` field in descending order. The {{transform}} returns the client IP that sent the largest amount of data and the 2-letter ISO code of the corresponding location. ```console POST _transform/_preview diff --git a/explore-analyze/transforms/transform-limitations.md index febf4c0d1..226f02477 100644 --- a/explore-analyze/transforms/transform-limitations.md +++ b/explore-analyze/transforms/transform-limitations.md @@ -49,7 +49,7 @@ A {{ctransform}} periodically checks for changes to source data. The functionali ### Aggregation responses may be incompatible with destination index mappings [transform-aggresponse-limitations] -When a pivot {{transform}} is first started, it will deduce the mappings required for the destination index. This process is based on the field types of the source index and the aggregations used.
If the fields are derived from [`scripted_metrics`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) or [`bucket_scripts`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md), [dynamic mappings](../../manage-data/data-store/mapping/dynamic-mapping.md) will be used. In some instances the deduced mappings may be incompatible with the actual data. For example, numeric overflows might occur or dynamically mapped fields might contain both numbers and strings. Please check {{es}} logs if you think this may have occurred. +When a pivot {{transform}} is first started, it will deduce the mappings required for the destination index. This process is based on the field types of the source index and the aggregations used. If the fields are derived from [`scripted_metrics`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) or [`bucket_scripts`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md), [dynamic mappings](../../manage-data/data-store/mapping/dynamic-mapping.md) will be used. In some instances the deduced mappings may be incompatible with the actual data. For example, numeric overflows might occur or dynamically mapped fields might contain both numbers and strings. Please check {{es}} logs if you think this may have occurred. You can view the deduced mappings by using the [preview transform API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-transform-preview-transform). See the `generated_dest_index` object in the API response. @@ -57,7 +57,7 @@ If it’s required, you may define custom mappings prior to starting the {{trans ### Batch {{transforms}} may not account for changed documents [transform-batch-limitations] -A batch {{transform}} uses a [composite aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md) which allows efficient pagination through all buckets. Composite aggregations do not yet support a search context, therefore if the source data is changed (deleted, updated, added) while the batch {{dataframe}} is in progress, then the results may not include these changes. +A batch {{transform}} uses a [composite aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-composite-aggregation.md), which allows efficient pagination through all buckets. Composite aggregations do not yet support a search context; therefore, if the source data is changed (deleted, updated, added) while the batch {{dataframe}} is in progress, the results may not include these changes. ### {{ctransform-cap}} consistency does not account for deleted or updated documents [transform-consistency-limitations] @@ -77,7 +77,7 @@ When deleting a {{transform}} using `DELETE _transform/index` neither the destin During the development of {{transforms}}, control was favored over performance. By design, it is preferable for a {{transform}} to take longer and complete quietly in the background rather than to finish quickly and take precedence in resource consumption. -Composite aggregations are well suited for high cardinality data enabling pagination through results.
If a [circuit breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md) memory exception occurs when performing the composite aggregated search then we try again reducing the number of buckets requested. This circuit breaker is calculated based upon all activity within the cluster, not just activity from {{transforms}}, so it therefore may only be a temporary resource availability issue. +Composite aggregations are well suited for high cardinality data, enabling pagination through results. If a [circuit breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md) memory exception occurs when performing the composite aggregation search, then we try again, reducing the number of buckets requested. This circuit breaker is calculated based upon all activity within the cluster, not just activity from {{transforms}}, so it may therefore be only a temporary resource availability issue. For a batch {{transform}}, the number of buckets requested is only ever adjusted downwards. Lowering the value may result in a longer duration for the {{transform}} checkpoint to complete. For {{ctransforms}}, the number of buckets requested is reset back to its default at the start of every checkpoint and it is possible for circuit breaker exceptions to occur repeatedly in the {{es}} logs. @@ -85,11 +85,11 @@ The {{transform}} retrieves data in batches which means it calculates several bu ### Handling dynamic adjustments for many terms [transform-dynamic-adjustments-limitations] -For each checkpoint, entities are identified that have changed since the last time the check was performed. This list of changed entities is supplied as a [terms query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-terms-query.md) to the {{transform}} composite aggregation, one page at a time. Then updates are applied to the destination index for each page of entities. +For each checkpoint, entities are identified that have changed since the last time the check was performed. This list of changed entities is supplied as a [terms query](elasticsearch://reference/query-languages/query-dsl-terms-query.md) to the {{transform}} composite aggregation, one page at a time. Then updates are applied to the destination index for each page of entities. The page `size` is defined by `max_page_search_size`, which is also used to define the number of buckets returned by the composite aggregation search. The default value is 500 and the minimum is 10. -The index setting [`index.max_terms_count`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-settings) defines the maximum number of terms that can be used in a terms query. The default value is 65536. If `max_page_search_size` exceeds `index.max_terms_count` the {{transform}} will fail. +The index setting [`index.max_terms_count`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-settings) defines the maximum number of terms that can be used in a terms query. The default value is 65536. If `max_page_search_size` exceeds `index.max_terms_count`, the {{transform}} will fail. Using smaller values for `max_page_search_size` may result in a longer duration for the {{transform}} checkpoint to complete.
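If a {{transform}} needs a larger `max_page_search_size` than the terms limit allows, one option is to raise the limit on the source index. A minimal sketch, assuming a hypothetical source index `my-source-index`; the linked reference lists `index.max_terms_count` among the dynamic index settings, so it can be updated in place:

```console
PUT my-source-index/_settings
{
  "index.max_terms_count": 70000
}
```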
@@ -109,7 +109,7 @@ If using a `sync.time.field` that represents the data ingest time and using a ze ### Support for date nanoseconds data type [transform-date-nanos] -If your data uses the [date nanosecond data type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date_nanos.md), aggregations are nonetheless on millisecond resolution. This limitation also affects the aggregations in your {{transforms}}. +If your data uses the [date nanosecond data type](elasticsearch://reference/elasticsearch/mapping-reference/date_nanos.md), aggregations are nonetheless on millisecond resolution. This limitation also affects the aggregations in your {{transforms}}. ### Data streams as destination indices are not supported [transform-data-streams-destination] @@ -119,7 +119,7 @@ If your data uses the [date nanosecond data type](asciidocalypse://docs/elastics Using [ILM](../../manage-data/lifecycle/index-lifecycle-management.md) as a {{transform}} destination index is not recommended. {{transforms-cap}} update documents in the current destination, and cannot delete documents in the indices previously used by ILM. This may lead to duplicated documents when you use {{transforms}} combined with ILM in case of a rollover. -If you use ILM to have time-based indices, please consider using the [Date index name](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/date-index-name-processor.md) instead. The processor works without duplicated documents if your {{transform}} contains a `group_by` based on `date_histogram`. +If you use ILM to have time-based indices, please consider using the [Date index name](elasticsearch://reference/ingestion-tools/enrich-processor/date-index-name-processor.md) processor instead. The processor works without duplicated documents if your {{transform}} contains a `group_by` based on `date_histogram`. ## Limitations in {{kib}} [transform-ui-limitations] diff --git a/explore-analyze/transforms/transform-painless-examples.md index acc5061f3..962bff1a5 100644 --- a/explore-analyze/transforms/transform-painless-examples.md +++ b/explore-analyze/transforms/transform-painless-examples.md @@ -9,11 +9,11 @@ mapped_pages: # Painless examples [transform-painless-examples] -::::{important} +::::{important} The examples that use the `scripted_metric` aggregation are not supported on {{es}} Serverless. :::: -These examples demonstrate how to use Painless in {{transforms}}. You can learn more about the Painless scripting language in the [Painless guide](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless.md). +These examples demonstrate how to use Painless in {{transforms}}. You can learn more about the Painless scripting language in the [Painless guide](elasticsearch://reference/scripting-languages/painless/painless.md). * [Getting top hits by using scripted metric aggregation](#painless-top-hits) * [Getting time features by using aggregations](#painless-time-features) @@ -31,7 +31,7 @@ These examples demonstrate how to use Painless in {{transforms}}. You can learn ## Getting top hits by using scripted metric aggregation [painless-top-hits] -This snippet shows how to find the latest document, in other words the document with the latest timestamp.
From a technical perspective, it helps to achieve the function of a [Top hits](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-top-hits-aggregation.md) by using scripted metric aggregation in a {{transform}}, which provides a metric output. +This snippet shows how to find the latest document, in other words, the document with the latest timestamp. From a technical perspective, it helps to achieve the function of a [Top hits](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-top-hits-aggregation.md) by using scripted metric aggregation in a {{transform}}, which provides a metric output. ::::{important} This example uses a `scripted_metric` aggregation which is not supported on {{es}} Serverless. @@ -66,7 +66,7 @@ This example uses a `scripted_metric` aggregation which is not supported on {{es 3. The `combine_script` returns `state` from each shard. 4. The `reduce_script` iterates through the value of `s.timestamp_latest` returned by each shard and returns the document with the latest timestamp (`last_doc`). In the response, the top hit (in other words, the `latest_doc`) is nested below the `latest_doc` field. -Check the [scope of scripts](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md#scripted-metric-aggregation-scope) for detailed explanation on the respective scripts. +Check the [scope of scripts](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md#scripted-metric-aggregation-scope) for a detailed explanation of the respective scripts. You can retrieve the last value in a similar way: @@ -215,7 +215,7 @@ This snippet shows how to extract time based features by using Painless in a {{t ## Getting duration by using bucket script [painless-bucket-script] -This example shows you how to get the duration of a session by client IP from a data log by using [bucket script](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md). The example uses the {{kib}} sample web logs dataset. +This example shows you how to get the duration of a session by client IP from a data log by using [bucket script](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-script-aggregation.md). The example uses the {{kib}} sample web logs dataset. ```console PUT _transform/data_log diff --git a/explore-analyze/transforms/transform-scale.md index 1c7525674..7b829b557 100644 --- a/explore-analyze/transforms/transform-scale.md +++ b/explore-analyze/transforms/transform-scale.md @@ -55,7 +55,7 @@ Imagine your {{ctransform}} is configured to group by `IP` and calculate the sum To limit which historical indices are accessed, exclude certain tiers (for example `"must_not": { "terms": { "_tier": [ "data_frozen", "data_cold" ] } }`) and/or use an absolute time value as a date range filter in your source query (for example, greater than 2024-01-01T00:00:00). If you use a relative time value (for example, gte now-30d/d) then ensure date rounding is applied to take advantage of query caching and ensure that the relative time is much larger than the largest of `frequency` or `time.sync.delay` or the date histogram bucket, otherwise data may be missed.
Do not use date filters which are less than a date value (for example, `lt`: less than or `lte`: less than or equal to) as this conflicts with the logic applied at each checkpoint execution and data may be missed. -Consider using [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names) in your index names to reduce the number of indices to resolve in your queries. Add a date pattern - for example, `yyyy-MM-dd` - to your index names and use it to limit your query to a specific date. The example below queries indices only from yesterday and today: +Consider using [date math](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-date-math-index-names) in your index names to reduce the number of indices to resolve in your queries. Add a date pattern - for example, `yyyy-MM-dd` - to your index names and use it to limit your query to a specific date. The example below queries indices only from yesterday and today: ```js "source": { @@ -88,10 +88,10 @@ Index sorting enables you to store documents on disk in a specific order which c ## 9. Disable the `_source` field on the destination index (storage) [disable-source-dest] -The [`_source` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) contains the original JSON document body that was passed at index time. The `_source` field itself is not indexed (and thus is not searchable), but it is still stored in the index and incurs a storage overhead. Consider disabling `_source` to save storage space if you have a large destination index. Disabling `_source` is only possible during index creation. +The [`_source` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) contains the original JSON document body that was passed at index time. The `_source` field itself is not indexed (and thus is not searchable), but it is still stored in the index and incurs a storage overhead. Consider disabling `_source` to save storage space if you have a large destination index. Disabling `_source` is only possible during index creation. ::::{note} -When the `_source` field is disabled, a number of features are not supported. Consult [Disabling the `_source` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#disable-source-field) to understand the consequences before disabling it. +When the `_source` field is disabled, a number of features are not supported. Consult [Disabling the `_source` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#disable-source-field) to understand the consequences before disabling it. :::: ## Further reading [_further_reading] diff --git a/explore-analyze/transforms/transform-usage.md index 470dea6ee..cb94d5dee 100644 --- a/explore-analyze/transforms/transform-usage.md +++ b/explore-analyze/transforms/transform-usage.md @@ -18,11 +18,11 @@ You might want to consider using {{transforms}} instead of aggregations when: In {{ml}}, you often need a complete set of behavioral features rather than just the top-N. For example, if you are predicting customer churn, you might look at features such as the number of website visits in the last week, the total number of sales, or the number of emails sent.
The {{stack}} {{ml-features}} create models based on this multi-dimensional feature space, so they benefit from the full feature indices that are created by {{transforms}}. - This scenario also applies when you are trying to search across the results of an aggregation or multiple aggregations. Aggregation results can be ordered or filtered, but there are [limitations to ordering](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md#search-aggregations-bucket-terms-aggregation-order) and [filtering by bucket selector](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-selector-aggregation.md) is constrained by the maximum number of buckets returned. If you want to search all aggregation results, you need to create the complete {{dataframe}}. If you need to sort or filter the aggregation results by multiple fields, {{transforms}} are particularly useful. + This scenario also applies when you are trying to search across the results of an aggregation or multiple aggregations. Aggregation results can be ordered or filtered, but there are [limitations to ordering](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md#search-aggregations-bucket-terms-aggregation-order) and [filtering by bucket selector](elasticsearch://reference/data-analysis/aggregations/search-aggregations-pipeline-bucket-selector-aggregation.md) is constrained by the maximum number of buckets returned. If you want to search all aggregation results, you need to create the complete {{dataframe}}. If you need to sort or filter the aggregation results by multiple fields, {{transforms}} are particularly useful. * You need to sort aggregation results by a pipeline aggregation. - [Pipeline aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/pipeline.md) cannot be used for sorting. Technically, this is because pipeline aggregations are run during the reduce phase after all other aggregations have already completed. If you create a {{transform}}, you can effectively perform multiple passes over the data. + [Pipeline aggregations](elasticsearch://reference/data-analysis/aggregations/pipeline.md) cannot be used for sorting. Technically, this is because pipeline aggregations are run during the reduce phase after all other aggregations have already completed. If you create a {{transform}}, you can effectively perform multiple passes over the data. * You want to create summary tables to optimize queries (see the sketch after this list).
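As a sketch of that last point, the following preview outlines a pivot {{transform}} that turns raw orders into a per-customer summary table. It assumes the `kibana_sample_data_ecommerce` sample data set that other examples in these pages use; queries can then read one summary document per customer instead of re-aggregating every order.

```console
POST _transform/_preview
{
  "source": { "index": "kibana_sample_data_ecommerce" },
  "pivot": {
    "group_by": {
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "total_spent": { "sum": { "field": "taxless_total_price" } },
      "order_count": { "cardinality": { "field": "order_id" } }
    }
  }
}
```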
diff --git a/explore-analyze/visualize/custom-visualizations-with-vega.md index a8b19822a..67a848840 100644 --- a/explore-analyze/visualize/custom-visualizations-with-vega.md +++ b/explore-analyze/visualize/custom-visualizations-with-vega.md @@ -116,7 +116,7 @@ POST kibana_sample_data_ecommerce/_search } ``` -Add the [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md), then click **Click to send request**: +Add the [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md), then click **Click to send request**: ```js POST kibana_sample_data_ecommerce/_search diff --git a/explore-analyze/visualize/esorql.md index 940bded3d..f0801a41d 100644 --- a/explore-analyze/visualize/esorql.md +++ b/explore-analyze/visualize/esorql.md @@ -24,7 +24,7 @@ You can then **Save** and add it to an existing or a new dashboard using the sav 2. Choose **ES|QL** under **Visualizations**. An ES|QL editor appears and lets you configure your query and its associated visualization. The **Suggestions** panel can help you find alternative ways to configure the visualization. ::::{tip} - Check the [ES|QL reference](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql.md) to get familiar with the syntax and optimize your query. + Check the [ES|QL reference](elasticsearch://reference/query-languages/esql.md) to get familiar with the syntax and optimize your query. :::: 3. When editing your query or its configuration, run the query to update the preview of the visualization. diff --git a/explore-analyze/visualize/field-statistics.md index 3fd743e36..f90df3ab8 100644 --- a/explore-analyze/visualize/field-statistics.md +++ b/explore-analyze/visualize/field-statistics.md @@ -14,7 +14,7 @@ mapped_pages: 2. Choose **Field statistics** under **Visualizations**. An ES|QL editor appears and lets you configure your query with the fields and information that you want to show. ::::{tip} - Check the [ES|QL reference](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql.md) to get familiar with the syntax and optimize your query. + Check the [ES|QL reference](elasticsearch://reference/query-languages/esql.md) to get familiar with the syntax and optimize your query. :::: 3. When editing your query or its configuration, run the query to update the preview of the visualization. diff --git a/explore-analyze/visualize/graph/graph-troubleshooting.md index c688fb391..e4340fe05 100644 --- a/explore-analyze/visualize/graph/graph-troubleshooting.md +++ b/explore-analyze/visualize/graph/graph-troubleshooting.md @@ -13,7 +13,7 @@ mapped_pages: -## Why are results missing? [_why_are_results_missing] +## Why are results missing? [_why_are_results_missing] The default settings in Graph API requests are configured to tune out noisy results by using the following strategies: @@ -28,20 +28,20 @@ These are useful defaults for getting the "big picture" signals from noisy data, * Set the `min_doc_count` for your vertices to 1 to ensure only one document is required to assert a relationship. -## What can I do to improve performance? [_what_can_i_do_to_improve_performance] +## What can I do to improve performance? [_what_can_i_do_to_improve_performance]
With the default setting of `use_significance` set to `true`, the Graph API performs a background frequency check of the terms it discovers as part of exploration. Each unique term has to have its frequency looked up in the index, which costs at least one disk seek. Disk seeks are expensive. If you don’t need to perform this noise-filtering, setting `use_significance` to `false` eliminates all of these expensive checks (at the expense of not performing any quality-filtering on the terms). If your data is noisy and you need to filter based on significance, you can reduce the number of frequency checks by: * Reducing the `sample_size`. Considering fewer documents can actually be better when the quality of matches is quite variable. -* Avoiding noisy documents that have a large number of terms. You can do this by either allowing ranking to naturally favor shorter documents in the top-results sample (see [enabling norms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/norms.md)) or by explicitly excluding large documents with your seed and guiding queries. +* Avoiding noisy documents that have a large number of terms. You can do this by either allowing ranking to naturally favor shorter documents in the top-results sample (see [enabling norms](elasticsearch://reference/elasticsearch/mapping-reference/norms.md)) or by explicitly excluding large documents with your seed and guiding queries. * Increasing the frequency threshold. Many terms occur very infrequently, so even increasing the frequency threshold by one can massively reduce the number of candidate terms whose background frequencies are checked. Keep in mind that all of these options reduce the scope of information analyzed and can increase the potential to miss what could be interesting details. However, the information that’s lost tends to be associated with lower-quality documents with lower-frequency terms, which can be an acceptable trade-off. -## Limited support for multiple indices [_limited_support_for_multiple_indices] +## Limited support for multiple indices [_limited_support_for_multiple_indices] The graph API can explore multiple indices, types, or aliases in a single API request, but the assumption is that each "hop" it performs is querying the same set of indices. Currently, it is not possible to take a term found in a field from one index and use that value to explore connections in *a different field* held in another type or index. diff --git a/explore-analyze/visualize/maps/heatmap-layer.md index 62f4ab3fc..405533e5a 100644 --- a/explore-analyze/visualize/maps/heatmap-layer.md +++ b/explore-analyze/visualize/maps/heatmap-layer.md @@ -15,7 +15,7 @@ Heat map layers cluster point data to show locations with higher densities. :class: screenshot ::: -To add a heat map layer to your map, click **Add layer**, then select **Heat map**. The index must contain at least one field mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md). +To add a heat map layer to your map, click **Add layer**, then select **Heat map**.
The index must contain at least one field mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md). ::::{note} Only the count, sum, and unique count metric aggregations are available with the grid aggregation source and heat map layers. Average, min, and max are turned off because the heat map will blend nearby values. Blending two average values would make the cluster more prominent, even though it just might literally mean that these nearby areas are average. diff --git a/explore-analyze/visualize/maps/import-geospatial-data.md index a92370a38..84e119d6c 100644 --- a/explore-analyze/visualize/maps/import-geospatial-data.md +++ b/explore-analyze/visualize/maps/import-geospatial-data.md @@ -8,7 +8,7 @@ mapped_pages: # Import geospatial data [import-geospatial-data] -To import geospatical data into the Elastic Stack, the data must be indexed as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md). Geospatial data comes in many formats. Choose an import tool based on the format of your geospatial data. +To import geospatial data into the Elastic Stack, the data must be indexed as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md). Geospatial data comes in many formats. Choose an import tool based on the format of your geospatial data. ## Security privileges [import-geospatial-privileges] @@ -114,7 +114,7 @@ To draw features: ## Upload data with IP addresses [_upload_data_with_ip_addresses] -The GeoIP processor adds information about the geographical location of IP addresses. See [GeoIP processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) for details. For private IP addresses, see [Enriching data with GeoIPs from internal, private IP addresses](https://www.elastic.co/blog/enriching-elasticsearch-data-geo-ips-internal-private-ip-addresses). +The GeoIP processor adds information about the geographical location of IP addresses. See [GeoIP processor](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) for details. For private IP addresses, see [Enriching data with GeoIPs from internal, private IP addresses](https://www.elastic.co/blog/enriching-elasticsearch-data-geo-ips-internal-private-ip-addresses). ## Upload data with GDAL [_upload_data_with_gdal] diff --git a/explore-analyze/visualize/maps/indexing-geojson-data-tutorial.md index 945a357ca..b7a7187da 100644 --- a/explore-analyze/visualize/maps/indexing-geojson-data-tutorial.md +++ b/explore-analyze/visualize/maps/indexing-geojson-data-tutorial.md @@ -50,7 +50,7 @@ For each GeoJSON file you downloaded, complete the following steps: 2. From the list of layer types, click **Upload file**. 3. Using the File Picker, upload the GeoJSON file.
- Depending on the geometry type of your features, this will auto-populate **Index type** with either [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md) and **Index name** with ``. + Depending on the geometry type of your features, this will auto-populate **Index type** with either [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md) and **Index name** with ``. 4. Click **Import file**. @@ -72,12 +72,12 @@ For each GeoJSON file you downloaded, complete the following steps: ## Add a heatmap aggregation layer [_add_a_heatmap_aggregation_layer] -Looking at the `Lightning detected` layer, it’s clear where lightning has struck. What’s less clear, is if there have been more lightning strikes in some areas than others, in other words, where the lightning hot spots are. An advantage of having indexed [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) data for the lightning strikes is that you can perform aggregations on the data. +Looking at the `Lightning detected` layer, it’s clear where lightning has struck. What’s less clear is whether there have been more lightning strikes in some areas than others, in other words, where the lightning hot spots are. An advantage of having indexed [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) data for the lightning strikes is that you can perform aggregations on the data. 1. Click **Add layer**. 2. From the list of layer types, click **Heat map**. - Because you indexed `lightning_detected.geojson` using the index name and pattern `lightning_detected`, that data is available as a [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) aggregation. + Because you indexed `lightning_detected.geojson` using the index name and pattern `lightning_detected`, that data is available as a [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) aggregation. 3. Select `lightning_detected`. 4. Click **Add layer** to add the heat map layer "Lightning intensity". diff --git a/explore-analyze/visualize/maps/maps-create-filter-from-map.md index 94ba53584..3b2ea3dc9 100644 --- a/explore-analyze/visualize/maps/maps-create-filter-from-map.md +++ b/explore-analyze/visualize/maps/maps-create-filter-from-map.md @@ -35,7 +35,7 @@ A spatial filter narrows search results to documents that either intersect with, Spatial filters have the following properties: * **Geometry label** enables you to provide a meaningful name for your spatial filter. -* **Spatial relation** determines the [spatial relation operator](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-geo-shape-query.md#geo-shape-spatial-relations) to use at search time. +* **Spatial relation** determines the [spatial relation operator](elasticsearch://reference/query-languages/query-dsl-geo-shape-query.md#geo-shape-spatial-relations) to use at search time (see the query sketch after this list). * **Action** specifies whether to apply the filter to the current view or to a drilldown action.
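As a sketch of how the selected relation surfaces in the query DSL, the filter below assumes a hypothetical index `my-index` whose `location` field is mapped as `geo_shape`; `relation` accepts `intersects` (the default), `disjoint`, `within`, or `contains`.

```console
GET my-index/_search
{
  "query": {
    "geo_shape": {
      "location": {
        "shape": {
          "type": "envelope",
          "coordinates": [ [ -80.0, 40.0 ], [ -70.0, 30.0 ] ]
        },
        "relation": "within"
      }
    }
  }
}
```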
::::{note} diff --git a/explore-analyze/visualize/maps/maps-grid-aggregation.md b/explore-analyze/visualize/maps/maps-grid-aggregation.md index a414974b8..35683a847 100644 --- a/explore-analyze/visualize/maps/maps-grid-aggregation.md +++ b/explore-analyze/visualize/maps/maps-grid-aggregation.md @@ -8,21 +8,21 @@ mapped_pages: # Clusters [maps-grid-aggregation] -Clusters use [Geotile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) or [Geohex grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md) to group your documents into grids. You can calculate metrics for each gridded cell. +Clusters use [Geotile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) or [Geohex grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md) to group your documents into grids. You can calculate metrics for each gridded cell. Symbolize cluster metrics as: **Clusters** -: Uses [Geotile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into grids. Creates a [vector layer](vector-layer.md) with a cluster symbol for each gridded cell. The cluster location is the weighted centroid for all documents in the gridded cell. +: Uses [Geotile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into grids. Creates a [vector layer](vector-layer.md) with a cluster symbol for each gridded cell. The cluster location is the weighted centroid for all documents in the gridded cell. **Grids** -: Uses [Geotile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into grids. Creates a [vector layer](vector-layer.md) with a bounding box polygon for each gridded cell. +: Uses [Geotile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into grids. Creates a [vector layer](vector-layer.md) with a bounding box polygon for each gridded cell. **Heat map** -: Uses [Geotile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into grids. Creates a [heat map layer](heatmap-layer.md) that clusters the weighted centroids for each gridded cell. +: Uses [Geotile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into grids. Creates a [heat map layer](heatmap-layer.md) that clusters the weighted centroids for each gridded cell. **Hexbins** -: Uses [Geohex grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md) to group your documents into H3 hexagon grids. Creates a [vector layer](vector-layer.md) with a hexagon polygon for each gridded cell. 
+: Uses [Geohex grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md) to group your documents into H3 hexagon grids. Creates a [vector layer](vector-layer.md) with a hexagon polygon for each gridded cell. To enable a clusters layer: diff --git a/explore-analyze/visualize/maps/maps-search-across-multiple-indices.md b/explore-analyze/visualize/maps/maps-search-across-multiple-indices.md index 491c0798b..3fbf51134 100644 --- a/explore-analyze/visualize/maps/maps-search-across-multiple-indices.md +++ b/explore-analyze/visualize/maps/maps-search-across-multiple-indices.md @@ -20,7 +20,7 @@ One strategy for eliminating unintentional empty layers from a cross index searc ## Use _index in a search [maps-add-index-search] -Add [_index](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-index-field.md) to your search to include documents from indices that do not contain a search field. +Add [_index](elasticsearch://reference/elasticsearch/mapping-reference/mapping-index-field.md) to your search to include documents from indices that do not contain a search field. For example, suppose you have a vector layer showing the `kibana_sample_data_logs` documents and another vector layer with `kibana_sample_data_flights` documents. (See [adding sample data](/explore-analyze/index.md) to install the `kibana_sample_data_logs` and `kibana_sample_data_flights` indices.) diff --git a/explore-analyze/visualize/maps/maps-top-hits-aggregation.md b/explore-analyze/visualize/maps/maps-top-hits-aggregation.md index d727b035a..9abeaebe9 100644 --- a/explore-analyze/visualize/maps/maps-top-hits-aggregation.md +++ b/explore-analyze/visualize/maps/maps-top-hits-aggregation.md @@ -8,7 +8,7 @@ mapped_pages: # Display the most relevant documents per entity [maps-top-hits-aggregation] -Use **Top hits per entity** to display the most relevant documents per entity, for example, the most recent GPS tracks per flight route. To get this data, {{es}} first groups your data using a [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md), then accumulates the most relevant documents based on sort order for each entry using a [top hits metric aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-top-hits-aggregation.md). +Use **Top hits per entity** to display the most relevant documents per entity, for example, the most recent GPS tracks per flight route. To get this data, {{es}} first groups your data using a [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md), then accumulates the most relevant documents based on sort order for each entry using a [top hits metric aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-top-hits-aggregation.md). 
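+
+As a sketch of the two-step flow just described, a terms aggregation buckets documents by entity and a nested top hits aggregation keeps the most relevant document per bucket; the index and field names here are hypothetical:
+
+```console
+GET /my-tracks/_search
+{
+  "size": 0,
+  "aggs": {
+    "per_route": {
+      "terms": { "field": "flight_route" }, <1>
+      "aggs": {
+        "most_recent_track": {
+          "top_hits": { <2>
+            "size": 1,
+            "sort": [ { "@timestamp": { "order": "desc" } } ]
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+1. Groups documents into one bucket per entity.
+2. Accumulates the single most recent document in each bucket, based on the sort order.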
To enable top hits: diff --git a/explore-analyze/visualize/maps/maps-troubleshooting.md b/explore-analyze/visualize/maps/maps-troubleshooting.md index 3c439bc31..5711c7973 100644 --- a/explore-analyze/visualize/maps/maps-troubleshooting.md +++ b/explore-analyze/visualize/maps/maps-troubleshooting.md @@ -35,7 +35,7 @@ Maps uses the [{{es}} vector tile search API](https://www.elastic.co/docs/api/do ### Data view not listed when adding layer [_data_view_not_listed_when_adding_layer] -* Verify your geospatial data is correctly mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md). +* Verify your geospatial data is correctly mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md). * Run `GET myIndexName/_field_caps?fields=myGeoFieldName` in [Console](../../query-filter/tools/console.md), replacing `myIndexName` and `myGeoFieldName` with your index and geospatial field name. * Ensure response specifies `type` as `geo_point` or `geo_shape`. diff --git a/explore-analyze/visualize/maps/point-to-point.md b/explore-analyze/visualize/maps/point-to-point.md index 3b5dc1929..7806015b2 100644 --- a/explore-analyze/visualize/maps/point-to-point.md +++ b/explore-analyze/visualize/maps/point-to-point.md @@ -10,7 +10,7 @@ mapped_pages: A point-to-point connection plots aggregated data paths between the source and the destination. Thicker, darker lines symbolize more connections between a source and destination, and thinner, lighter lines symbolize less connections. -Point to point uses an {{es}} [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) to group your documents by destination. Then, a nested [GeoTile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) groups sources for each destination into grids. A line connects each source grid centroid to each destination. +Point to point uses an {{es}} [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) to group your documents by destination. Then, a nested [GeoTile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) groups sources for each destination into grids. A line connects each source grid centroid to each destination. Point-to-point layers are used in several common use cases: diff --git a/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md b/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md index ab54eae6a..71ad146f2 100644 --- a/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md +++ b/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md @@ -17,7 +17,7 @@ In this tutorial, you’ll use reverse geocoding to visualize United States Cens You’ll learn to: * Upload custom regions. -* Reverse geocode with the {{es}} [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md). +* Reverse geocode with the {{es}} [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md). 
* Create a map and visualize CSA regions by web traffic.

When you complete this tutorial, you’ll have a map that looks like this:

@@ -32,7 +32,7 @@ When you complete this tutorial, you’ll have a map that looks like this:

GeoIP is a common way of transforming an IP address to a longitude and latitude. GeoIP is roughly accurate on the city level globally and neighborhood level in selected countries. It’s not as good as an actual GPS location from your phone, but it’s much more precise than just a country, state, or province.

-You’ll use the [web logs sample data set](../../index.md#gs-get-data-into-kibana) that comes with Kibana for this tutorial. Web logs sample data set has longitude and latitude. If your web log data does not contain longitude and latitude, use [GeoIP processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) to transform an IP address into a [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) field.
+You’ll use the [web logs sample data set](../../index.md#gs-get-data-into-kibana) that comes with Kibana for this tutorial. The web logs sample data set includes longitude and latitude. If your web log data does not contain longitude and latitude, use the [GeoIP processor](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) to transform an IP address into a [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) field.

## Step 2: Index Combined Statistical Area (CSA) regions [_step_2_index_combined_statistical_area_csa_regions]

@@ -75,7 +75,7 @@ Looking at the map, you get a sense of what constitutes a metro area in the eyes

## Step 3: Reverse geocoding [_step_3_reverse_geocoding]

-To visualize CSA regions by web log traffic, the web log traffic must contain a CSA region identifier. You’ll use {{es}} [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) to add CSA region identifiers to the web logs sample data set. You can skip this step if your source data already contains region identifiers.
+To visualize CSA regions by web log traffic, the web log traffic must contain a CSA region identifier. You’ll use the {{es}} [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) to add CSA region identifiers to the web logs sample data set. You can skip this step if your source data already contains region identifiers.

1. Go to **Developer tools** using the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
2. In **Console**, create a [geo_match enrichment policy](../../../manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md):

diff --git a/explore-analyze/visualize/maps/terms-join.md b/explore-analyze/visualize/maps/terms-join.md
index 717a8daca..77fea4082 100644
--- a/explore-analyze/visualize/maps/terms-join.md
+++ b/explore-analyze/visualize/maps/terms-join.md
@@ -62,7 +62,7 @@ In the following example, **iso2** property defines the shared key for the left

The right source uses the Kibana sample data set "Sample web logs". In this data set, the **geo.src** field contains the ISO 3166-1 alpha-2 code of the country of origin.
-A [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) groups the sample web log documents by **geo.src** and calculates metrics for each term.
+A [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) groups the sample web log documents by **geo.src** and calculates metrics for each term.

The METRICS configuration defines two metric aggregations:

diff --git a/explore-analyze/visualize/maps/vector-layer.md b/explore-analyze/visualize/maps/vector-layer.md
index acbc03449..bc3b94eeb 100644
--- a/explore-analyze/visualize/maps/vector-layer.md
+++ b/explore-analyze/visualize/maps/vector-layer.md
@@ -21,18 +21,18 @@ To add a vector layer to your map, click **Add layer**, then select one of the f

: Shaded areas to compare statistics across boundaries.

**Clusters**
-: Geospatial data grouped in grids with metrics for each gridded cell. The index must contain at least one field mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md).
+: Geospatial data grouped in grids with metrics for each gridded cell. The index must contain at least one field mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md).

**Create index**
: Draw shapes on the map and index in Elasticsearch.

**Documents**
-: Points, lines, and polyons from Elasticsearch. The index must contain at least one field mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md).
+: Points, lines, and polygons from Elasticsearch. The index must contain at least one field mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md).

    Results are limited to the `index.max_result_window` index setting, which defaults to 10000. Select the appropriate **Scaling** option for your use case.

    * **Limit results to 10,000** The layer displays features from the first `index.max_result_window` documents. Results exceeding `index.max_result_window` are not displayed.
-    * **Show clusters when results exceed 10,000** When results exceed `index.max_result_window`, the layer uses [GeoTile grid aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into clusters and displays metrics for each cluster. When results are less then `index.max_result_window`, the layer displays features from individual documents.
+    * **Show clusters when results exceed 10,000** When results exceed `index.max_result_window`, the layer uses [GeoTile grid aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) to group your documents into clusters and displays metrics for each cluster. When results are less than `index.max_result_window`, the layer displays features from individual documents.
    * **Use vector tiles.** Vector tiles partition your map into tiles.
Each tile request is limited to the `index.max_result_window` index setting. When a tile exceeds `index.max_result_window`, results exceeding `index.max_result_window` are not contained in the tile and a dashed rectangle outlining the bounding box containing all geo values within the tile is displayed. @@ -43,13 +43,13 @@ To add a vector layer to your map, click **Add layer**, then select one of the f : Points and lines associated with anomalies. The {{anomaly-job}} must use a `lat_long` function. Go to [Detecting anomalous locations in geographic data](../../machine-learning/anomaly-detection/geographic-anomalies.md) for an example. **Point to point** -: Aggregated data paths between the source and destination. The index must contain at least 2 fields mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md), source and destination. +: Aggregated data paths between the source and destination. The index must contain at least 2 fields mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md), source and destination. **Top hits per entity** -: The layer displays the [most relevant documents per entity](maps-top-hits-aggregation.md). The index must contain at least one field mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md). +: The layer displays the [most relevant documents per entity](maps-top-hits-aggregation.md). The index must contain at least one field mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md). **Tracks** -: Create lines from points. The index must contain at least one field mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md). +: Create lines from points. The index must contain at least one field mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md). **Upload Geojson** : Index GeoJSON data in Elasticsearch. 
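+
+Under the hood, the vector tiles scaling option described above fetches tiles through the {{es}} vector tile search API. A minimal sketch of a raw tile request, with a hypothetical index and geospatial field:
+
+```console
+GET /my-index/_mvt/location/2/1/1 <1>
+```
+
+1. The path takes the form `<index>/_mvt/<field>/<zoom>/<x>/<y>`; `my-index` and `location` are placeholders.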
diff --git a/explore-analyze/visualize/supported-chart-types.md b/explore-analyze/visualize/supported-chart-types.md index c8f5a28f0..2b76bc5e0 100644 --- a/explore-analyze/visualize/supported-chart-types.md +++ b/explore-analyze/visualize/supported-chart-types.md @@ -25,7 +25,7 @@ $$$aggregation-reference$$$ | Tag cloud | ✓ | | ✓ | ✓ | | -## Bar, line, and area chart features [xy-features] +## Bar, line, and area chart features [xy-features] | Feature | **Lens** | **TSVB** | **Aggregation-based** | **Vega** | **Timelion** | | --- | --- | --- | --- | --- | --- | @@ -37,7 +37,7 @@ $$$aggregation-reference$$$ | Synchronized tooltips | ✓ | ✓ | | | | -## Advanced features [other-features] +## Advanced features [other-features] | Feature | **Lens** | **TSVB** | **Vega** | **Timelion** | | --- | --- | --- | --- | --- | @@ -51,7 +51,7 @@ $$$aggregation-reference$$$ | Annotations | ✓ | ✓ | | | -## Table features [table-features] +## Table features [table-features] | Feature | **Lens** | **TSVB** | **Aggregation-based** | | --- | --- | --- | --- | @@ -61,7 +61,7 @@ $$$aggregation-reference$$$ | Color by value | ✓ | ✓ | | -## Functions [custom-functions] +## Functions [custom-functions] | Function | **Lens** | **TSVB** | | --- | --- | --- | @@ -72,7 +72,7 @@ $$$aggregation-reference$$$ | Static value | ✓ | ✓ | -## Metrics aggregations [metrics-aggregations] +## Metrics aggregations [metrics-aggregations] Metric aggregations are calculated from the values in the aggregated documents. The values are extracted from the document fields. @@ -89,10 +89,10 @@ Metric aggregations are calculated from the values in the aggregated documents. | Value count | ✓ | | ✓ | ✓ | | Variance | ✓ | ✓ | | ✓ | -For information about {{es}} metrics aggregations, refer to [Metrics aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/metrics.md). +For information about {{es}} metrics aggregations, refer to [Metrics aggregations](elasticsearch://reference/data-analysis/aggregations/metrics.md). -## Bucket aggregations [bucket-aggregations] +## Bucket aggregations [bucket-aggregations] Bucket aggregations group, or bucket, documents based on the aggregation type. To define the document buckets, bucket aggregations compute and return the number of documents for each bucket. @@ -110,10 +110,10 @@ Bucket aggregations group, or bucket, documents based on the aggregation type. T | Terms | ✓ | ✓ | ✓ | ✓ | | Significant terms | ✓ | | ✓ | ✓ | -For information about {{es}} bucket aggregations, refer to [Bucket aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/bucket.md). +For information about {{es}} bucket aggregations, refer to [Bucket aggregations](elasticsearch://reference/data-analysis/aggregations/bucket.md). -## Pipeline aggregations [pipeline-aggregations] +## Pipeline aggregations [pipeline-aggregations] Pipeline aggregations are dependent on the outputs calculated from other aggregations. Parent pipeline aggregations are provided with the output of the parent aggregation, and compute new buckets or aggregations that are added to existing buckets. Sibling pipeline aggregations are provided with the output of a sibling aggregation, and compute new aggregations for the same level as the sibling aggregation. 
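+
+To make the parent pipeline case concrete, a sketch in {{es}} query DSL of a derivative computed from a metric inside each date histogram bucket; the index and `bytes` field are hypothetical:
+
+```console
+GET /my-index/_search
+{
+  "size": 0,
+  "aggs": {
+    "per_day": {
+      "date_histogram": { "field": "@timestamp", "calendar_interval": "day" },
+      "aggs": {
+        "bytes_total": { "sum": { "field": "bytes" } },
+        "bytes_change": {
+          "derivative": { "buckets_path": "bytes_total" } <1>
+        }
+      }
+    }
+  }
+}
+```
+
+1. `buckets_path` points the pipeline aggregation at the output of another aggregation; here, the per-day change of `bytes_total` is added to each existing bucket.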
@@ -130,5 +130,5 @@ Pipeline aggregations are dependent on the outputs calculated from other aggrega | Bucket selector | | | | ✓ | | Serial differencing | | ✓ | ✓ | ✓ | -For information about {{es}} pipeline aggregations, refer to [Pipeline aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/pipeline.md). +For information about {{es}} pipeline aggregations, refer to [Pipeline aggregations](elasticsearch://reference/data-analysis/aggregations/pipeline.md). diff --git a/manage-data/data-store/aliases.md b/manage-data/data-store/aliases.md index 49a8962fb..043961a36 100644 --- a/manage-data/data-store/aliases.md +++ b/manage-data/data-store/aliases.md @@ -316,7 +316,7 @@ Filters are only applied when using the [Query DSL](../../explore-analyze/query- ## Routing [alias-routing] -Use the `routing` option to [route](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-routing-field.md) requests for an alias to a specific shard. This lets you take advantage of [shard caches](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/shard-request-cache-settings.md) to speed up searches. Data stream aliases do not support routing options. +Use the `routing` option to [route](elasticsearch://reference/elasticsearch/mapping-reference/mapping-routing-field.md) requests for an alias to a specific shard. This lets you take advantage of [shard caches](elasticsearch://reference/elasticsearch/configuration-reference/shard-request-cache-settings.md) to speed up searches. Data stream aliases do not support routing options. ```console POST _aliases diff --git a/manage-data/data-store/data-streams.md b/manage-data/data-store/data-streams.md index eb0965268..f551f9d29 100644 --- a/manage-data/data-store/data-streams.md +++ b/manage-data/data-store/data-streams.md @@ -30,7 +30,7 @@ Keep in mind that some features such as [Time Series Data Streams (TSDS)](../dat ## Backing indices [backing-indices] -A data stream consists of one or more [hidden](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-hidden), auto-generated backing indices. +A data stream consists of one or more [hidden](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-hidden), auto-generated backing indices. :::{image} ../../images/elasticsearch-reference-data-streams-diagram.svg :alt: data streams diagram @@ -38,7 +38,7 @@ A data stream consists of one or more [hidden](asciidocalypse://docs/elasticsear A data stream requires a matching [index template](templates.md). The template contains the mappings and settings used to configure the stream’s backing indices. -Every document indexed to a data stream must contain a `@timestamp` field, mapped as a [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date_nanos.md) field type. If the index template doesn’t specify a mapping for the `@timestamp` field, {{es}} maps `@timestamp` as a `date` field with default options. +Every document indexed to a data stream must contain a `@timestamp` field, mapped as a [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](elasticsearch://reference/elasticsearch/mapping-reference/date_nanos.md) field type. 
If the index template doesn’t specify a mapping for the `@timestamp` field, {{es}} maps `@timestamp` as a `date` field with default options. The same index template can be used for multiple data streams. You cannot delete an index template in use by a data stream. diff --git a/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md b/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md index 81b6763a2..8e1608540 100644 --- a/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md +++ b/manage-data/data-store/data-streams/downsampling-time-series-data-stream.md @@ -86,7 +86,7 @@ POST /my-time-series-index/_downsample/my-downsampled-time-series-index } ``` -To downsample time series data as part of ILM, include a [Downsample action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy and set `fixed_interval` to the level of granularity that you’d like: +To downsample time series data as part of ILM, include a [Downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy and set `fixed_interval` to the level of granularity that you’d like: ```console PUT _ilm/policy/my_policy @@ -118,12 +118,12 @@ The result of a time based histogram aggregation is in a uniform bucket size and There are a few things to note about querying downsampled indices: * When you run queries in {{kib}} and through Elastic solutions, a normal response is returned without notification that some of the queried indices are downsampled. -* For [date histogram aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md), only `fixed_intervals` (and not calendar-aware intervals) are supported. +* For [date histogram aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md), only `fixed_intervals` (and not calendar-aware intervals) are supported. * Timezone support comes with caveats: * Date histograms at intervals that are multiples of an hour are based on values generated at UTC. This works well for timezones that are on the hour, e.g. +5:00 or -3:00, but requires offsetting the reported time buckets, e.g. `2020-01-01T10:30:00.000` instead of `2020-03-07T10:00:00.000` for timezone +5:30 (India), if downsampling aggregates values per hour. In this case, the results include the field `downsampled_results_offset: true`, to indicate that the time buckets are shifted. This can be avoided if a downsampling interval of 15 minutes is used, as it allows properly calculating hourly values for the shifted buckets. * Date histograms at intervals that are multiples of a day are similarly affected, in case downsampling aggregates values per day. In this case, the beginning of each day is always calculated at UTC when generated the downsampled values, so the time buckets need to be shifted, e.g. reported as `2020-03-07T19:00:00.000` instead of `2020-03-07T00:00:00.000` for timezone `America/New_York`. The field `downsampled_results_offset: true` is added in this case too. - * Daylight savings and similar peculiarities around timezones affect reported results, as [documented](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone) for date histogram aggregation. 
Besides, downsampling at daily interval hinders tracking any information related to daylight savings changes.
+  * Daylight savings and similar peculiarities around timezones affect reported results, as [documented](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone) for date histogram aggregation. Besides, downsampling at a daily interval hinders tracking any information related to daylight savings changes.

@@ -136,9 +136,9 @@ The following restrictions and limitations apply for downsampling:

* Within a data stream, a downsampled index replaces the original index and the original index is deleted. Only one index can exist for a given time period.
* A source index must be in read-only mode for the downsampling process to succeed. Check the [Run downsampling manually](./run-downsampling-manually.md) example for details.
* Downsampling data for the same period many times (downsampling of a downsampled index) is supported. The downsampling interval must be a multiple of the interval of the downsampled index.
-* Downsampling is provided as an ILM action. See [Downsample](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md).
+* Downsampling is provided as an ILM action. See [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md).
* The new, downsampled index is created on the data tier of the original index and it inherits its settings (for example, the number of shards and replicas).
-* The numeric `gauge` and `counter` [metric types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-field-meta.md) are supported.
+* The numeric `gauge` and `counter` [metric types](elasticsearch://reference/elasticsearch/mapping-reference/mapping-field-meta.md) are supported.
* The downsampling configuration is extracted from the time series data stream [index mapping](./set-up-tsds.md#create-tsds-index-template). The only additional required setting is the downsampling `fixed_interval`.

diff --git a/manage-data/data-store/data-streams/logs-data-stream.md b/manage-data/data-store/data-streams/logs-data-stream.md
index 8de970a00..cccf0513f 100644
--- a/manage-data/data-store/data-streams/logs-data-stream.md
+++ b/manage-data/data-store/data-streams/logs-data-stream.md
@@ -47,13 +47,13 @@ You can also set the index mode and adjust other template settings in [the Elast

## Synthetic source [logsdb-synthetic-source]

-If you have the required [subscription](https://www.elastic.co/subscriptions), `logsdb` index mode uses [synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), which omits storing the original `_source` field. Instead, the document source is synthesized from doc values or stored fields upon document retrieval.
+If you have the required [subscription](https://www.elastic.co/subscriptions), `logsdb` index mode uses [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), which omits storing the original `_source` field. Instead, the document source is synthesized from doc values or stored fields upon document retrieval.

If you don’t have the required [subscription](https://www.elastic.co/subscriptions), `logsdb` mode uses the original `_source` field.
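+
+For reference, a minimal sketch of an index template that opts a data stream into `logsdb` mode; the template name and index pattern are hypothetical:
+
+```console
+PUT _index_template/logs-custom-template
+{
+  "index_patterns": ["logs-custom-*"],
+  "data_stream": {},
+  "priority": 201, <1>
+  "template": {
+    "settings": {
+      "index.mode": "logsdb" <2>
+    }
+  }
+}
+```
+
+1. A priority above `200` avoids collisions with built-in templates.
+2. Selects `logsdb` mode for the stream’s backing indices.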
-Before using synthetic source, make sure to review the [restrictions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions). +Before using synthetic source, make sure to review the [restrictions](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions). -When working with multi-value fields, the `index.mapping.synthetic_source_keep` setting controls how field values are preserved for [synthetic source](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source) reconstruction. In `logsdb`, the default value is `arrays`, which retains both duplicate values and the order of entries. However, the exact structure of array elements and objects is not necessarily retained. Preserving duplicates and ordering can be critical for some log fields, such as DNS A records, HTTP headers, and log entries that represent sequential or repeated events. +When working with multi-value fields, the `index.mapping.synthetic_source_keep` setting controls how field values are preserved for [synthetic source](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source) reconstruction. In `logsdb`, the default value is `arrays`, which retains both duplicate values and the order of entries. However, the exact structure of array elements and objects is not necessarily retained. Preserving duplicates and ordering can be critical for some log fields, such as DNS A records, HTTP headers, and log entries that represent sequential or repeated events. ## Index sort settings [logsdb-sort-settings] @@ -68,7 +68,7 @@ In `logsdb` index mode, indices are sorted by the fields `host.name` and `@times * To prioritize the latest data, `host.name` is sorted in ascending order and `@timestamp` is sorted in descending order. -You can override the default sort settings by manually configuring `index.sort.field` and `index.sort.order`. For more details, see [*Index Sorting*](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md). +You can override the default sort settings by manually configuring `index.sort.field` and `index.sort.order`. For more details, see [*Index Sorting*](elasticsearch://reference/elasticsearch/index-settings/sorting.md). To modify the sort configuration of an existing data stream, update the data stream’s component templates, and then perform or wait for a [rollover](../data-streams.md#data-streams-rollover). @@ -86,7 +86,7 @@ To avoid mapping conflicts, consider these options: * **Adjust mappings:** Check your existing mappings to ensure that `host.name` is mapped as a keyword. * **Change sorting:** If needed, you can remove `host.name` from the sort settings and use a different set of fields. Sorting by `@timestamp` can be a good fallback. -* **Switch to a different [index mode](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-mode-setting)**: If resolving `host.name` mapping conflicts is not feasible, you can choose not to use `logsdb` mode. +* **Switch to a different [index mode](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-mode-setting)**: If resolving `host.name` mapping conflicts is not feasible, you can choose not to use `logsdb` mode. 
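+
+For instance, a hedged sketch of the `@timestamp`-only sort fallback mentioned above, expressed as a component template (the template name is hypothetical):
+
+```console
+PUT _component_template/logs-custom-settings
+{
+  "template": {
+    "settings": {
+      "index.sort.field": [ "@timestamp" ], <1>
+      "index.sort.order": [ "desc" ]
+    }
+  }
+}
+```
+
+1. Replaces the default `host.name`, `@timestamp` sort, sidestepping a `host.name` mapping conflict.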
::::{important} On existing data streams, `logsdb` mode is applied on [rollover](../data-streams.md#data-streams-rollover) (automatic or manual). @@ -103,15 +103,15 @@ In benchmarks, routing optimizations reduced storage requirements by 20% compare To configure a routing optimization: * Include the index setting `[index.logsdb.route_on_sort_fields:true]` in the data stream configuration. -* [Configure index sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md) with two or more fields, in addition to `@timestamp`. -* Make sure the [`_id`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-id-field.md) field is not populated in ingested documents. It should be auto-generated instead. +* [Configure index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) with two or more fields, in addition to `@timestamp`. +* Make sure the [`_id`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-id-field.md) field is not populated in ingested documents. It should be auto-generated instead. A custom sort configuration is required, to improve storage efficiency and to minimize hotspots from logging spikes that may route documents to a single shard. For best results, use a few sort fields that have a relatively low cardinality and don’t co-vary (for example, `host.name` and `host.id` are not optimal). ## Specialized codecs [logsdb-specialized-codecs] -By default, `logsdb` index mode uses the `best_compression` [codec](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-codec), which applies [ZSTD](https://en.wikipedia.org/wiki/Zstd) compression to stored fields. You can switch to the `default` codec for faster compression with a slightly larger storage footprint. +By default, `logsdb` index mode uses the `best_compression` [codec](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-codec), which applies [ZSTD](https://en.wikipedia.org/wiki/Zstd) compression to stored fields. You can switch to the `default` codec for faster compression with a slightly larger storage footprint. The `logsdb` index mode also automatically applies specialized codecs for numeric doc values, in order to optimize storage usage. Numeric fields are encoded using the following sequence of codecs: @@ -158,7 +158,7 @@ When automatically injected, `host.name` and `@timestamp` count toward the limit ## Fields without `doc_values` [logsdb-nodocvalue-fields] -When the `logsdb` index mode uses synthetic `_source` and `doc_values` are disabled for a field in the mapping, {{es}} might set the `store` setting to `true` for that field. This ensures that the field’s data remains accessible for reconstructing the document’s source when using [synthetic source](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). +When the `logsdb` index mode uses synthetic `_source` and `doc_values` are disabled for a field in the mapping, {{es}} might set the `store` setting to `true` for that field. This ensures that the field’s data remains accessible for reconstructing the document’s source when using [synthetic source](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). For example, this adjustment occurs with text fields when `store` is `false` and no suitable multi-field is available for reconstructing the original value. 
diff --git a/manage-data/data-store/data-streams/modify-data-stream.md b/manage-data/data-store/data-streams/modify-data-stream.md index 5a565e6cd..82d70cac5 100644 --- a/manage-data/data-store/data-streams/modify-data-stream.md +++ b/manage-data/data-store/data-streams/modify-data-stream.md @@ -23,7 +23,7 @@ If you later need to change the mappings or settings for a data stream, you have * [Change a static index setting for a data stream](../data-streams/modify-data-stream.md#change-static-index-setting-for-a-data-stream) ::::{tip} -If your changes include modifications to existing field mappings or [static index settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md), a reindex is often required to apply the changes to a data stream’s backing indices. If you are already performing a reindex, you can use the same process to add new field mappings and change [dynamic index settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md). See [Use reindex to change mappings or settings](../data-streams/modify-data-stream.md#data-streams-use-reindex-to-change-mappings-settings). +If your changes include modifications to existing field mappings or [static index settings](elasticsearch://reference/elasticsearch/index-settings/index.md), a reindex is often required to apply the changes to a data stream’s backing indices. If you are already performing a reindex, you can use the same process to add new field mappings and change [dynamic index settings](elasticsearch://reference/elasticsearch/index-settings/index.md). See [Use reindex to change mappings or settings](../data-streams/modify-data-stream.md#data-streams-use-reindex-to-change-mappings-settings). :::: @@ -92,13 +92,13 @@ To add a mapping for a new field to a data stream, following these steps: ### Change an existing field mapping in a data stream [change-existing-field-mapping-in-a-data-stream] -The documentation for each [mapping parameter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-parameters.md) indicates whether you can update it for an existing field using the [update mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping). To update these parameters for an existing field, follow these steps: +The documentation for each [mapping parameter](elasticsearch://reference/elasticsearch/mapping-reference/mapping-parameters.md) indicates whether you can update it for an existing field using the [update mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping). To update these parameters for an existing field, follow these steps: 1. Update the index template used by the data stream. This ensures the updated field mapping is added to future backing indices created for the stream. For example, `my-data-stream-template` is an existing index template used by `my-data-stream`. - The following [create or update index template](../templates.md) request changes the argument for the `host.ip` field’s [`ignore_malformed`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ignore-malformed.md) mapping parameter to `true`. + The following [create or update index template](../templates.md) request changes the argument for the `host.ip` field’s [`ignore_malformed`](elasticsearch://reference/elasticsearch/mapping-reference/ignore-malformed.md) mapping parameter to `true`. 
```console PUT /_index_template/my-data-stream-template @@ -173,7 +173,7 @@ If you need to change the mapping of an existing field, create a new data stream ### Change a dynamic index setting for a data stream [change-dynamic-index-setting-for-a-data-stream] -To change a [dynamic index setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md) for a data stream, follow these steps: +To change a [dynamic index setting](elasticsearch://reference/elasticsearch/index-settings/index.md) for a data stream, follow these steps: 1. Update the index template used by the data stream. This ensures the setting is applied to future backing indices created for the stream. @@ -219,7 +219,7 @@ To change the `index.lifecycle.name` setting, first use the [remove policy API]( ### Change a static index setting for a data stream [change-static-index-setting-for-a-data-stream] -[Static index settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md) can only be set when a backing index is created. You cannot update static index settings using the [update index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings). +[Static index settings](elasticsearch://reference/elasticsearch/index-settings/index.md) can only be set when a backing index is created. You cannot update static index settings using the [update index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings). To apply a new static setting to future backing indices, update the index template used by the data stream. The setting is automatically applied to any backing index created after the update. @@ -426,7 +426,7 @@ Follow these steps: You can also use a query to reindex only a subset of documents with each request. - The following [reindex API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) request copies documents from `my-data-stream` to `new-data-stream`. The request uses a [`range` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) to only reindex documents with a timestamp within the last week. Note the request’s `op_type` is `create`. + The following [reindex API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) request copies documents from `my-data-stream` to `new-data-stream`. The request uses a [`range` query](elasticsearch://reference/query-languages/query-dsl-range-query.md) to only reindex documents with a timestamp within the last week. Note the request’s `op_type` is `create`. ```console POST /_reindex diff --git a/manage-data/data-store/data-streams/reindex-tsds.md b/manage-data/data-store/data-streams/reindex-tsds.md index 0a6eff68b..69384cee5 100644 --- a/manage-data/data-store/data-streams/reindex-tsds.md +++ b/manage-data/data-store/data-streams/reindex-tsds.md @@ -13,7 +13,7 @@ applies_to: -## Introduction [tsds-reindex-intro] +## Introduction [tsds-reindex-intro] With reindexing, you can copy documents from an old [time-series data stream (TSDS)](../data-streams/time-series-data-stream-tsds.md) to a new one. Data streams support reindexing in general, with a few [restrictions](use-data-stream.md#reindex-with-a-data-stream). Still, time-series data streams introduce additional challenges due to tight control on the accepted timestamp range for each backing index they contain. 
Direct use of the reindex API would likely error out due to attempting to insert documents with timestamps that are outside the current acceptance window. @@ -30,7 +30,7 @@ To avoid these limitations, use the process that is outlined below: 4. Revert the overriden index settings in the destination index template. 5. Invoke the `rollover` api to create a new backing index that can receive new documents. -::::{note} +::::{note} This process only applies to time-series data streams without [downsampling](./downsampling-time-series-data-stream.md) configuration. Data streams with downsampling can only be re-indexed by re-indexing their backing indexes individually and adding them to an empty destination data stream. :::: @@ -38,7 +38,7 @@ This process only applies to time-series data streams without [downsampling](./d In what follows, we elaborate on each step of the process with examples. -## Create a TSDS template to accept old documents [tsds-reindex-create-template] +## Create a TSDS template to accept old documents [tsds-reindex-create-template] Consider a TSDS with the following template: @@ -201,7 +201,7 @@ POST /_index_template/2 ``` -## Reindex [tsds-reindex-op] +## Reindex [tsds-reindex-op] Invoke the reindex api, for instance: @@ -219,7 +219,7 @@ POST /_reindex ``` -## Restore the destination index template [tsds-reindex-restore] +## Restore the destination index template [tsds-reindex-restore] Once the reindexing operation completes, restore the index template for the destination TSDS as follows: @@ -267,5 +267,5 @@ POST /k9s/_rollover/ This creates a new backing index with the updated index settings. The destination data stream is now ready to accept new documents. -Note that the initial backing index can still accept documents within the range of timestamps derived from the source data stream. If this is not desired, mark it as [read-only](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md#index-blocks-read-only) explicitly. +Note that the initial backing index can still accept documents within the range of timestamps derived from the source data stream. If this is not desired, mark it as [read-only](elasticsearch://reference/elasticsearch/index-settings/index-block.md#index-blocks-read-only) explicitly. diff --git a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md index 879c281c0..a21929429 100644 --- a/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md +++ b/manage-data/data-store/data-streams/run-downsampling-using-data-stream-lifecycle.md @@ -317,7 +317,7 @@ POST /datastream/_rollover/ ## View downsampling results [downsampling-dsl-view-results] -By default, data stream lifecycle actions are executed every five minutes. Downsampling takes place after the index is rolled over and the [index time series end time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has lapsed as the source index is still expected to receive major writes until then. Index is now rolled over after previous step but its time series range end is likely still in the future. Once index time series range is in the past, re-run the `GET _data_stream` request. +By default, data stream lifecycle actions are executed every five minutes. 
Downsampling takes place after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has lapsed, as the source index is still expected to receive major writes until then. The index was rolled over in the previous step, but its time series range end is likely still in the future. Once the index’s time series range is in the past, re-run the `GET _data_stream` request.

```console
GET _data_stream
diff --git a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md
index 8546f72dd..505127951 100644
--- a/manage-data/data-store/data-streams/run-downsampling-with-ilm.md
+++ b/manage-data/data-store/data-streams/run-downsampling-with-ilm.md
@@ -32,9 +32,9 @@ Before running this example you may want to try the [Run downsampling manually](

Create an ILM policy for your time series data. While not required, an ILM policy is recommended to automate the management of your time series data stream indices.

-To enable downsampling, add a [Downsample action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) and set [`fixed_interval`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md#ilm-downsample-options) to the downsampling interval at which you want to aggregate the original time series data.
+To enable downsampling, add a [Downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) and set [`fixed_interval`](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md#ilm-downsample-options) to the downsampling interval at which you want to aggregate the original time series data.

-In this example, an ILM policy is configured for the `hot` phase. The downsample takes place after the index is rolled over and the [index time series end time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has lapsed as the source index is still expected to receive major writes until then. {{ilm-cap}} will not proceed with any action that expects the index to not receive writes anymore until the [index’s end time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed. The {{ilm-cap}} actions that wait on the end time before proceeding are: - [Delete](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-delete.md) - [Downsample](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) - [Force merge](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) - [Read only](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md) - [Searchable snapshot](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) - [Shrink](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)
+In this example, an ILM policy is configured for the `hot` phase.
The downsample takes place after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has lapsed, as the source index is still expected to receive major writes until then. {{ilm-cap}} will not proceed with any action that expects the index to no longer receive writes until the [index’s end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed. The {{ilm-cap}} actions that wait on the end time before proceeding are:
+
+- [Delete](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-delete.md)
+- [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
+- [Force merge](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md)
+- [Read only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
+- [Searchable snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)
+- [Shrink](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)

```console
PUT _ilm/policy/datastream_policy
diff --git a/manage-data/data-store/data-streams/set-up-data-stream.md b/manage-data/data-store/data-streams/set-up-data-stream.md
index 581ee8196..61b43ad41 100644
--- a/manage-data/data-store/data-streams/set-up-data-stream.md
+++ b/manage-data/data-store/data-streams/set-up-data-stream.md
@@ -92,13 +92,13 @@ A data stream requires a matching index template. In most cases, you compose thi

When creating your component templates, include:

-* A [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date_nanos.md) mapping for the `@timestamp` field. If you don’t specify a mapping, {{es}} maps `@timestamp` as a `date` field with default options.
+* A [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](elasticsearch://reference/elasticsearch/mapping-reference/date_nanos.md) mapping for the `@timestamp` field. If you don’t specify a mapping, {{es}} maps `@timestamp` as a `date` field with default options.
* Your lifecycle policy in the `index.lifecycle.name` index setting.

::::{tip}
Use the [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/current) when mapping your fields. ECS fields integrate with several {{stack}} features by default.

-If you’re unsure how to map your fields, use [runtime fields](../mapping/define-runtime-fields-in-search-request.md) to extract fields from [unstructured content](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#mapping-unstructured-content) at search time. For example, you can index a log message to a `wildcard` field and later extract IP addresses and other data from this field during a search.
+If you’re unsure how to map your fields, use [runtime fields](../mapping/define-runtime-fields-in-search-request.md) to extract fields from [unstructured content](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#mapping-unstructured-content) at search time. For example, you can index a log message to a `wildcard` field and later extract IP addresses and other data from this field during a search.
:::: diff --git a/manage-data/data-store/data-streams/set-up-tsds.md b/manage-data/data-store/data-streams/set-up-tsds.md index 5aebe1cf7..4520e8b96 100644 --- a/manage-data/data-store/data-streams/set-up-tsds.md +++ b/manage-data/data-store/data-streams/set-up-tsds.md @@ -97,7 +97,7 @@ To setup a TSDS create an index template with the following details: * Enable data streams. * Specify a mapping that defines your dimensions and metrics: - * One or more [dimension fields](time-series-data-stream-tsds.md#time-series-dimension) with a `time_series_dimension` value of `true`. Alternatively, one or more [pass-through](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields configured as dimension containers, provided that they will contain at least one sub-field (mapped statically or dynamically). + * One or more [dimension fields](time-series-data-stream-tsds.md#time-series-dimension) with a `time_series_dimension` value of `true`. Alternatively, one or more [pass-through](elasticsearch://reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields configured as dimension containers, provided that they will contain at least one sub-field (mapped statically or dynamically). * One or more [metric fields](time-series-data-stream-tsds.md#time-series-metric), marked using the `time_series_metric` mapping parameter. * Optional: A `date` or `date_nanos` mapping for the `@timestamp` field. If you don’t specify a mapping, Elasticsearch maps `@timestamp` as a `date` field with default options. @@ -105,7 +105,7 @@ To setup a TSDS create an index template with the following details: * Set `index.mode` setting to `time_series`. * Your lifecycle policy in the `index.lifecycle.name` index setting. - * Optional: Other index settings, such as [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas), for your TSDS’s backing indices. + * Optional: Other index settings, such as [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas), for your TSDS’s backing indices. * A priority higher than `200` to avoid collisions with built-in templates. See [Avoid index pattern collisions](../templates.md#avoid-index-pattern-collisions). * Optional: Component templates containing your mappings and other index settings. diff --git a/manage-data/data-store/data-streams/time-series-data-stream-tsds.md b/manage-data/data-store/data-streams/time-series-data-stream-tsds.md index 24d2ea729..d333e21a1 100644 --- a/manage-data/data-store/data-streams/time-series-data-stream-tsds.md +++ b/manage-data/data-store/data-streams/time-series-data-stream-tsds.md @@ -32,9 +32,9 @@ A TSDS works like a regular data stream with some key differences: * {{es}} generates a hidden [`_tsid`](#tsid) metadata field for each document in a TSDS. * A TSDS uses [time-bound backing indices](#time-bound-indices) to store data from the same time period in the same backing index. * The matching index template for a TSDS must contain the `index.routing_path` index setting. A TSDS uses this setting to perform [dimension-based routing](#dimension-based-routing). -* A TSDS uses internal [index sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md) to order shard segments by `_tsid` and `@timestamp`. 
+* A TSDS uses internal [index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) to order shard segments by `_tsid` and `@timestamp`. * TSDS documents only support auto-generated document `_id` values. For TSDS documents, the document `_id` is a hash of the document’s dimensions and `@timestamp`. A TSDS doesn’t support custom document `_id` values. -* A TSDS uses [synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), and as a result is subject to some [restrictions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions) and [modifications](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) applied to the `_source` field. +* A TSDS uses [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), and as a result is subject to some [restrictions](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions) and [modifications](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) applied to the `_source` field. ::::{note} A time series index can contain fields other than dimensions or metrics. @@ -66,18 +66,18 @@ A TSDS document is uniquely identified by its time series and timestamp, both of You mark a field as a dimension using the boolean `time_series_dimension` mapping parameter. The following field types support the `time_series_dimension` parameter: -* [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type) -* [`ip`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ip.md) -* [`byte`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) -* [`short`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) -* [`integer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) -* [`long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) -* [`unsigned_long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) -* [`boolean`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/boolean.md) +* [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type) +* [`ip`](elasticsearch://reference/elasticsearch/mapping-reference/ip.md) +* [`byte`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) +* [`short`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) +* [`integer`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) +* [`long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) +* [`unsigned_long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) +* [`boolean`](elasticsearch://reference/elasticsearch/mapping-reference/boolean.md) -For a flattened field, use the `time_series_dimensions` parameter to configure an array of fields as dimensions. 
For details refer to [`flattened`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/flattened.md#flattened-params). +For a flattened field, use the `time_series_dimensions` parameter to configure an array of fields as dimensions. For details refer to [`flattened`](elasticsearch://reference/elasticsearch/mapping-reference/flattened.md#flattened-params). -Dimension definitions can be simplified through [pass-through](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields. +Dimension definitions can be simplified through [pass-through](elasticsearch://reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields. ### Metrics [time-series-metric] @@ -88,9 +88,9 @@ Metrics differ from dimensions in that while dimensions generally remain constan To mark a field as a metric, you must specify a metric type using the `time_series_metric` mapping parameter. The following field types support the `time_series_metric` parameter: -* [`aggregate_metric_double`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/aggregate-metric-double.md) -* [`histogram`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/histogram.md) -* All [numeric field types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) +* [`aggregate_metric_double`](elasticsearch://reference/elasticsearch/mapping-reference/aggregate-metric-double.md) +* [`histogram`](elasticsearch://reference/elasticsearch/mapping-reference/histogram.md) +* All [numeric field types](elasticsearch://reference/elasticsearch/mapping-reference/number.md) Accepted metric types vary based on the field type: @@ -123,7 +123,7 @@ Due to the cumulative nature of counter fields, the following aggregations are s ## Time series mode [time-series-mode] -The matching index template for a TSDS must contain a `data_stream` object with the `index_mode: time_series` option. This option ensures the TSDS creates backing indices with an [`index.mode`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-mode) setting of `time_series`. This setting enables most TSDS-related functionality in the backing indices. +The matching index template for a TSDS must contain a `data_stream` object with the `index_mode: time_series` option. This option ensures the TSDS creates backing indices with an [`index.mode`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-mode) setting of `time_series`. This setting enables most TSDS-related functionality in the backing indices. If you convert an existing data stream to a TSDS, only backing indices created after the conversion have an `index.mode` of `time_series`. You can’t change the `index.mode` of an existing backing index. @@ -132,7 +132,7 @@ If you convert an existing data stream to a TSDS, only backing indices created a When you add a document to a TSDS, {{es}} automatically generates a `_tsid` metadata field for the document. The `_tsid` is an object containing the document’s dimensions. Documents in the same TSDS with the same `_tsid` are part of the same time series. -The `_tsid` field is not queryable or updatable. You also can’t retrieve a document’s `_tsid` using a [get document](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-get) request. 
However, you can use the `_tsid` field in aggregations and retrieve the `_tsid` value in searches using the [`fields` parameter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param). +The `_tsid` field is not queryable or updatable. You also can’t retrieve a document’s `_tsid` using a [get document](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-get) request. However, you can use the `_tsid` field in aggregations and retrieve the `_tsid` value in searches using the [`fields` parameter](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param). ::::{warning} The format of the `_tsid` field shouldn’t be relied upon. It may change from version to version. @@ -142,7 +142,7 @@ The format of the `_tsid` field shouldn’t be relied upon. It may change from v ### Time-bound indices [time-bound-indices] -In a TSDS, each backing index, including the most recent backing index, has a range of accepted `@timestamp` values. This range is defined by the [`index.time_series.start_time`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-time-series-start-time) and [`index.time_series.end_time`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) index settings. +In a TSDS, each backing index, including the most recent backing index, has a range of accepted `@timestamp` values. This range is defined by the [`index.time_series.start_time`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-start-time) and [`index.time_series.end_time`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) index settings. When you add a document to a TSDS, {{es}} adds the document to the appropriate backing index based on its `@timestamp` value. As a result, a TSDS can add documents to any TSDS backing index that can receive writes. This applies even if the index isn’t the most recent backing index. @@ -151,7 +151,7 @@ When you add a document to a TSDS, {{es}} adds the document to the appropriate b ::: ::::{tip} -Some {{ilm-init}} actions mark the source index as read-only, or expect the index to not be actively written anymore in order to provide good performance. These actions are: - [Delete](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-delete.md) - [Downsample](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) - [Force merge](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) - [Read only](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md) - [Searchable snapshot](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) - [Shrink](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md) {{ilm-cap}} will **not** proceed with executing these actions until the upper time-bound for accepting writes, represented by the [`index.time_series.end_time`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) index setting, has lapsed. 
+Some {{ilm-init}} actions mark the source index as read-only, or expect the index to no longer be actively written to, in order to provide good performance. These actions are:
+
+- [Delete](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-delete.md)
+- [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
+- [Force merge](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md)
+- [Read only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
+- [Searchable snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)
+- [Shrink](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)
+
+{{ilm-cap}} will **not** proceed with executing these actions until the upper time-bound for accepting writes, represented by the [`index.time_series.end_time`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) index setting, has lapsed.
::::

@@ -162,7 +162,7 @@ If no backing index can accept a document’s `@timestamp` value, {{es}} rejects

### Look-ahead time [tsds-look-ahead-time]

-Use the [`index.look_ahead_time`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-look-ahead-time) index setting to configure how far into the future you can add documents to an index. When you create a new write index for a TSDS, {{es}} calculates the index’s `index.time_series.end_time` value as:
+Use the [`index.look_ahead_time`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-look-ahead-time) index setting to configure how far into the future you can add documents to an index. When you create a new write index for a TSDS, {{es}} calculates the index’s `index.time_series.end_time` value as:

`now + index.look_ahead_time`

@@ -175,7 +175,7 @@ This process continues until the write index rolls over. When the index rolls ov

### Look-back time [tsds-look-back-time]

-Use the [`index.look_back_time`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-look-back-time) index setting to configure how far in the past you can add documents to an index. When you create a data stream for a TSDS, {{es}} calculates the index’s `index.time_series.start_time` value as:
+Use the [`index.look_back_time`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-look-back-time) index setting to configure how far in the past you can add documents to an index. When you create a data stream for a TSDS, {{es}} calculates the index’s `index.time_series.start_time` value as:

`now - index.look_back_time`

@@ -196,24 +196,24 @@ You can use the [get data stream API](https://www.elastic.co/docs/api/doc/elasti

### Dimension-based routing [dimension-based-routing]

-Within each TSDS backing index, {{es}} uses the [`index.routing_path`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/time-series.md#index-routing-path) index setting to route documents with the same dimensions to the same shards.
+Within each TSDS backing index, {{es}} uses the [`index.routing_path`](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-routing-path) index setting to route documents with the same dimensions to the same shards.

When you create the matching index template for a TSDS, you must specify one or more dimensions in the `index.routing_path` setting.
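To make that requirement concrete, here is a minimal sketch of such a template; the template name and the `sensor_id` and `location` dimension fields are hypothetical:

```console
PUT _index_template/weather-sensors-template
{
  "index_patterns": ["metrics-weather_sensors-*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": ["sensor_id", "location"]
    },
    "mappings": {
      "properties": {
        "sensor_id": { "type": "keyword", "time_series_dimension": true },
        "location": { "type": "keyword", "time_series_dimension": true }
      }
    }
  }
}
```

Every field listed in `index.routing_path` here is also mapped as a dimension, which keeps the routing definition and the mapping consistent.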
Each document in a TSDS must contain one or more dimensions that match the `index.routing_path` setting. The `index.routing_path` setting accepts wildcard patterns (for example `dim.*`) and can dynamically match new fields. However, {{es}} will reject any mapping updates that add scripted, runtime, or non-dimension fields that match the `index.routing_path` value.

-[Pass-through](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields may be configured as dimension containers. In this case, their sub-fields get included to the routing path automatically.
+[Pass-through](elasticsearch://reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields may be configured as dimension containers. In this case, their sub-fields are included in the routing path automatically.

TSDS documents don’t support a custom `_routing` value. Similarly, you can’t require a `_routing` value in mappings for a TSDS.

### Index sorting [tsds-index-sorting]

-{{es}} uses [compression algorithms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-codec) to compress repeated values. This compression works best when repeated values are stored near each other — in the same index, on the same shard, and side-by-side in the same shard segment.
+{{es}} uses [compression algorithms](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-codec) to compress repeated values. This compression works best when repeated values are stored near each other — in the same index, on the same shard, and side-by-side in the same shard segment.

Most time series data contains repeated values. Dimensions are repeated across documents in the same time series. The metric values of a time series may also change slowly over time.

-Internally, each TSDS backing index uses [index sorting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md) to order its shard segments by `_tsid` and `@timestamp`. This makes it more likely that these repeated values are stored near each other for better compression. A TSDS doesn’t support any [`index.sort.*`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/sorting.md) index settings.
+Internally, each TSDS backing index uses [index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) to order its shard segments by `_tsid` and `@timestamp`. This makes it more likely that these repeated values are stored near each other for better compression. A TSDS doesn’t support any [`index.sort.*`](elasticsearch://reference/elasticsearch/index-settings/sorting.md) index settings.

## What’s next? [tsds-whats-next]

diff --git a/manage-data/data-store/data-streams/use-data-stream.md b/manage-data/data-store/data-streams/use-data-stream.md
index 7d547f646..1b17f34f4 100644
--- a/manage-data/data-store/data-streams/use-data-stream.md
+++ b/manage-data/data-store/data-streams/use-data-stream.md
@@ -21,7 +21,7 @@ After you [set up a data stream](set-up-data-stream.md), you can do the followin

* [Update or delete documents in a backing index](#update-delete-docs-in-a-backing-index)

-## Add documents to a data stream [add-documents-to-a-data-stream]
+## Add documents to a data stream [add-documents-to-a-data-stream]

To add an individual document, use the [index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-create).
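For example, a minimal sketch of an individual indexing request against `my-data-stream` (the document body is illustrative):

```console
POST /my-data-stream/_doc
{
  "@timestamp": "2099-03-08T11:06:07.000Z",
  "user": {
    "id": "8a4f500d"
  },
  "message": "Login successful"
}
```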
[Ingest pipelines](../../ingest/transform-enrich/ingest-pipelines.md) are supported. @@ -51,7 +51,7 @@ PUT /my-data-stream/_bulk?refresh ``` -## Search a data stream [search-a-data-stream] +## Search a data stream [search-a-data-stream] The following search APIs support data streams: @@ -62,7 +62,7 @@ The following search APIs support data streams: * [EQL search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-eql-search) -## Get statistics for a data stream [get-stats-for-a-data-stream] +## Get statistics for a data stream [get-stats-for-a-data-stream] Use the [data stream stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-data-streams-stats-1) to get statistics for one or more data streams: @@ -71,7 +71,7 @@ GET /_data_stream/my-data-stream/_stats?human=true ``` -## Manually roll over a data stream [manually-roll-over-a-data-stream] +## Manually roll over a data stream [manually-roll-over-a-data-stream] Use the [rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) to manually [roll over](../data-streams.md#data-streams-rollover) a data stream. You have two options when manually rolling over: @@ -91,7 +91,7 @@ Use the [rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operati -## Open closed backing indices [open-closed-backing-indices] +## Open closed backing indices [open-closed-backing-indices] You cannot search a [closed](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-close) backing index, even by searching its data stream. You also cannot [update](#update-docs-in-a-data-stream-by-query) or [delete](#delete-docs-in-a-data-stream-by-query) documents in a closed index. @@ -108,7 +108,7 @@ POST /my-data-stream/_open/ ``` -## Reindex with a data stream [reindex-with-a-data-stream] +## Reindex with a data stream [reindex-with-a-data-stream] Use the [reindex API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) to copy documents from an existing index, alias, or data stream to a data stream. Because data streams are [append-only](../data-streams.md#data-streams-append-only), a reindex into a data stream must use an `op_type` of `create`. A reindex cannot update existing documents in a data stream. @@ -126,7 +126,7 @@ POST /_reindex ``` -## Update documents in a data stream by query [update-docs-in-a-data-stream-by-query] +## Update documents in a data stream by query [update-docs-in-a-data-stream-by-query] Use the [update by query API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update-by-query) to update documents in a data stream that match a provided query: @@ -148,7 +148,7 @@ POST /my-data-stream/_update_by_query ``` -## Delete documents in a data stream by query [delete-docs-in-a-data-stream-by-query] +## Delete documents in a data stream by query [delete-docs-in-a-data-stream-by-query] Use the [delete by query API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-delete-by-query) to delete documents in a data stream that match a provided query: @@ -164,13 +164,13 @@ POST /my-data-stream/_delete_by_query ``` -## Update or delete documents in a backing index [update-delete-docs-in-a-backing-index] +## Update or delete documents in a backing index [update-delete-docs-in-a-backing-index] If needed, you can update or delete documents in a data stream by sending requests to the backing index containing the document. 
You’ll need: -* The [document ID](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-id-field.md) +* The [document ID](elasticsearch://reference/elasticsearch/mapping-reference/mapping-id-field.md) * The name of the backing index containing the document -* If updating the document, its [sequence number and primary term](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/optimistic-concurrency-control.md) +* If updating the document, its [sequence number and primary term](elasticsearch://reference/elasticsearch/rest-apis/optimistic-concurrency-control.md) To get this information, use a [search request](#search-a-data-stream): diff --git a/manage-data/data-store/index-basics.md b/manage-data/data-store/index-basics.md index 0590615dd..2dbeefb86 100644 --- a/manage-data/data-store/index-basics.md +++ b/manage-data/data-store/index-basics.md @@ -14,7 +14,7 @@ This content applies to: [![Elasticsearch](/images/serverless-es-badge.svg "")]( An index is a fundamental unit of storage in {{es}}. It is a collection of documents uniquely identified by a name or an [alias](/manage-data/data-store/aliases.md). This unique name is important because it’s used to target the index in search queries and other operations. -::::{tip} +::::{tip} A closely related concept is a [data stream](/manage-data/data-store/data-streams.md). This index abstraction is optimized for append-only timestamped data, and is made up of hidden, auto-generated backing indices. If you’re working with timestamped data, we recommend the [Elastic Observability](https://www.elastic.co/guide/en/observability/current) solution for additional tools and optimized content. :::: @@ -22,7 +22,7 @@ A closely related concept is a [data stream](/manage-data/data-store/data-stream An index is made up of the following components. -### Documents [elasticsearch-intro-documents-fields] +### Documents [elasticsearch-intro-documents-fields] {{es}} serializes and stores data in the form of JSON documents. A document is a set of fields, which are key-value pairs that contain your data. Each document has a unique ID, which you can create or have {{es}} auto-generate. @@ -53,17 +53,17 @@ A simple {{es}} document might look like this: } ``` -### Metadata fields [elasticsearch-intro-documents-fields-data-metadata] +### Metadata fields [elasticsearch-intro-documents-fields-data-metadata] -An indexed document contains data and metadata. [Metadata fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/document-metadata-fields.md) are system fields that store information about the documents. In {{es}}, metadata fields are prefixed with an underscore. For example, the following fields are metadata fields: +An indexed document contains data and metadata. [Metadata fields](elasticsearch://reference/elasticsearch/mapping-reference/document-metadata-fields.md) are system fields that store information about the documents. In {{es}}, metadata fields are prefixed with an underscore. For example, the following fields are metadata fields: * `_index`: The name of the index where the document is stored. * `_id`: The document’s ID. IDs must be unique per index. -### Mappings and data types [elasticsearch-intro-documents-fields-mappings] +### Mappings and data types [elasticsearch-intro-documents-fields-mappings] -Each index has a [mapping](/manage-data/data-store/mapping.md) or schema for how the fields in your documents are indexed. 
A mapping defines the [data type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md) for each field, how the field should be indexed, and how it should be stored. +Each index has a [mapping](/manage-data/data-store/mapping.md) or schema for how the fields in your documents are indexed. A mapping defines the [data type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) for each field, how the field should be indexed, and how it should be stored. ## Index management @@ -82,12 +82,12 @@ Investigate your indices and perform operations from the **Indices** view. * To show details and perform operations, click the index name. To perform operations on multiple indices, select their checkboxes and then open the **Manage** menu. For more information on managing indices, refer to [Index APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-indices). * To filter the list of indices, use the search bar or click a badge. Badges indicate if an index is a [follower index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ccr-follow), a [rollup index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-rollup-get-rollup-index-caps), or [frozen](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-unfreeze). -* To drill down into the index [mappings](/manage-data/data-store/mapping.md), [settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md), and statistics, click an index name. From this view, you can navigate to **Discover** to further explore the documents in the index. +* To drill down into the index [mappings](/manage-data/data-store/mapping.md), [settings](elasticsearch://reference/elasticsearch/index-settings/index.md), and statistics, click an index name. From this view, you can navigate to **Discover** to further explore the documents in the index. * To create new indices, use the **Create index** wizard. ### Manage data streams -A [data stream](/manage-data/data-store/data-streams.md) lets you store append-only time series data across multiple indices while giving you a single named resource for requests. +A [data stream](/manage-data/data-store/data-streams.md) lets you store append-only time series data across multiple indices while giving you a single named resource for requests. Investigate your data streams and address lifecycle management needs in the **Data Streams** view. @@ -96,16 +96,16 @@ Investigate your data streams and address lifecycle management needs in the **Da :class: screenshot ::: -In {{es-serverless}}, indices matching the `logs-*-*` pattern use the logsDB index mode by default. The logsDB index mode creates a [logs data stream](https://www.elastic.co/guide/en/elasticsearch/reference/master/logs-data-stream.html). +In {{es-serverless}}, indices matching the `logs-*-*` pattern use the logsDB index mode by default. The logsDB index mode creates a [logs data stream](https://www.elastic.co/guide/en/elasticsearch/reference/master/logs-data-stream.html). * To view information about the stream's backing indices, click the number in the **Indices** column. -* A value in the **Data retention** column indicates that the data stream is managed by a data stream lifecycle policy. This value is the time period for which your data is guaranteed to be stored. Data older than this period can be deleted by {{es}} at a later time. 
+* A value in the **Data retention** column indicates that the data stream is managed by a data stream lifecycle policy. This value is the time period for which your data is guaranteed to be stored. Data older than this period can be deleted by {{es}} at a later time. * To modify the data retention value, select an index, open the **Manage** menu, and click **Edit data retention**. * To view more information about a data stream, such as its generation or its current index lifecycle policy, click the stream's name. From this view, you can navigate to **Discover** to further explore data within the data stream. ### Manage index templates [index-management-manage-index-templates] -An [index template](/manage-data/data-store/templates.md) is a way to tell {{es}} how to configure an index when it is created. +An [index template](/manage-data/data-store/templates.md) is a way to tell {{es}} how to configure an index when it is created. Create, edit, clone, and delete your index templates in the **Index Templates** view. Changes made to an index template do not affect existing indices. @@ -115,7 +115,7 @@ Create, edit, clone, and delete your index templates in the **Index Templates** ::: * To show details and perform operations, click the template name. -* To view more information about the component templates within an index template, click the value in the **Component templates** column. +* To view more information about the component templates within an index template, click the value in the **Component templates** column. * Values in the **Content** column indicate whether a template contains index mappings, settings, and aliases. * To create new index templates, use the **Create template** wizard. diff --git a/manage-data/data-store/mapping.md b/manage-data/data-store/mapping.md index 115cdf833..79e7b0773 100644 --- a/manage-data/data-store/mapping.md +++ b/manage-data/data-store/mapping.md @@ -35,7 +35,7 @@ $$$mapping-explicit$$$ Mapping is the process of defining how a document and the fields it contains are stored and indexed. -Each document is a collection of fields, which each have their own [data type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). When mapping your data, you create a mapping definition, which contains a list of fields that are pertinent to the document. A mapping definition also includes [metadata fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/document-metadata-fields.md), like the `_source` field, which customize how a document’s associated metadata is handled. +Each document is a collection of fields, which each have their own [data type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). When mapping your data, you create a mapping definition, which contains a list of fields that are pertinent to the document. A mapping definition also includes [metadata fields](elasticsearch://reference/elasticsearch/mapping-reference/document-metadata-fields.md), like the `_source` field, which customize how a document’s associated metadata is handled. Depending on where you are in your data journey, use *dynamic mapping* and *explicit mapping* to define your data. For example, you can explicitly map fields where you don’t want to use the defaults, or to gain greater control over which fields are created. Then you can allow {{es}} to dynamically map other fields. 
Using a combination of dynamic and explicit mapping on the same index is especially useful when you have a mix of known and unknown fields in your data. @@ -43,13 +43,13 @@ Depending on where you are in your data journey, use *dynamic mapping* and *expl Before 7.0.0, the mapping definition included a type name. {{es}} 7.0.0 and later no longer accept a default mapping. [Removal of mapping types](/manage-data/data-store/mapping/removal-of-mapping-types.md) provides more information. :::: -## Dynamic mapping [mapping-dynamic] +## Dynamic mapping [mapping-dynamic] -When you use [dynamic mapping](/manage-data/data-store/mapping/dynamic-mapping.md), {{es}} automatically detects the data types of fields in your documents and creates mappings for you. If you index additional documents with new fields, {{es}} will add these fields automatically. You can add fields to the top-level mapping, and to inner [`object`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) and [`nested`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) fields. Dynamic mapping helps you get started quickly, but might yield suboptimal results for your specific use case due to automatic field type inference. +When you use [dynamic mapping](/manage-data/data-store/mapping/dynamic-mapping.md), {{es}} automatically detects the data types of fields in your documents and creates mappings for you. If you index additional documents with new fields, {{es}} will add these fields automatically. You can add fields to the top-level mapping, and to inner [`object`](elasticsearch://reference/elasticsearch/mapping-reference/object.md) and [`nested`](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) fields. Dynamic mapping helps you get started quickly, but might yield suboptimal results for your specific use case due to automatic field type inference. -Use [dynamic templates](/manage-data/data-store/mapping/dynamic-templates.md) to define custom mappings that are applied to dynamically added fields based on the matching condition. +Use [dynamic templates](/manage-data/data-store/mapping/dynamic-templates.md) to define custom mappings that are applied to dynamically added fields based on the matching condition. -## Explicit mapping [mapping-explicit] +## Explicit mapping [mapping-explicit] Use [explicit mapping](/manage-data/data-store/mapping/explicit-mapping.md) to define mappings by specifying data types for each field. This is recommended for production use cases, because you have full control over how your data is indexed to suit your specific use case. @@ -58,7 +58,7 @@ Defining your own mappings enables you to: * Define which string fields should be treated as full-text fields. * Define which fields contain numbers, dates, or geolocations. * Use data types that cannot be automatically detected (such as `geo_point` and `geo_shape`.) -* Choose date value [formats](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md), including custom date formats. +* Choose date value [formats](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md), including custom date formats. * Create custom rules to control the mapping for [dynamically added fields](/manage-data/data-store/mapping/dynamic-mapping.md). * Optimize fields for partial matching. * Perform language-specific text analysis. 
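As a brief sketch tying several of these points together (the index and field names are hypothetical), an explicit mapping might look like:

```console
PUT /my-products-index
{
  "mappings": {
    "properties": {
      "description": { "type": "text" },
      "sku": { "type": "keyword" },
      "released": { "type": "date", "format": "yyyy/MM/dd" },
      "warehouse_location": { "type": "geo_point" }
    }
  }
}
```

Here `description` is treated as a full-text field, `released` uses a custom date format, and `warehouse_location` uses a type that dynamic mapping would not detect.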
@@ -87,11 +87,11 @@ In most cases, you can’t change mappings for fields that are already mapped. T However, you can update mappings under certain conditions: * You can add new fields to an existing mapping at any time, dynamically or explicitly. -* You can add new [multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md) for existing fields. +* You can add new [multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) for existing fields. * Documents indexed before the mapping update will not have values for the new multi-fields until they are updated or reindexed. Documents indexed after the mapping change will automatically have values for the new multi-fields. -* Some [mapping parameters](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-parameters.md) can be updated for existing fields of certain [data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). +* Some [mapping parameters](elasticsearch://reference/elasticsearch/mapping-reference/mapping-parameters.md) can be updated for existing fields of certain [data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). ## Prevent mapping explosions [mapping-limit-settings] @@ -100,4 +100,4 @@ Defining too many fields in an index can lead to a mapping explosion, which can Consider a situation where every new document inserted introduces new fields, such as with [dynamic mapping](/manage-data/data-store/mapping/dynamic-mapping.md). Each new field is added to the index mapping, which can become a problem as the mapping grows. -Use the [mapping limit settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/mapping-limit.md) to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion. \ No newline at end of file +Use the [mapping limit settings](elasticsearch://reference/elasticsearch/index-settings/mapping-limit.md) to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion. \ No newline at end of file diff --git a/manage-data/data-store/mapping/define-runtime-fields-in-search-request.md b/manage-data/data-store/mapping/define-runtime-fields-in-search-request.md index ca434b28a..d2c13c490 100644 --- a/manage-data/data-store/mapping/define-runtime-fields-in-search-request.md +++ b/manage-data/data-store/mapping/define-runtime-fields-in-search-request.md @@ -36,7 +36,7 @@ GET my-index-000001/_search ``` -## Create runtime fields that use other runtime fields [runtime-search-request-examples] +## Create runtime fields that use other runtime fields [runtime-search-request-examples] You can even define runtime fields in a search request that return values from other runtime fields. For example, let’s say you bulk index some sensor data: @@ -74,7 +74,7 @@ PUT my-index-000001/_mapping Runtime fields take precedence over fields defined with the same name in the index mappings. This flexibility allows you to shadow existing fields and calculate a different value, without modifying the field itself. If you made a mistake in your index mapping, you can use runtime fields to calculate values that [override values](override-field-values-at-query-time.md) in the mapping during the search request. 
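A hedged sketch of that shadowing pattern, assuming a hypothetical numeric `voltage` field whose indexed values need a temporary correction at query time:

```console
GET my-index-000001/_search
{
  "runtime_mappings": {
    "voltage": {
      "type": "double",
      "script": {
        "source": "emit(params._source['voltage'] * 1.1)"
      }
    }
  },
  "fields": ["voltage"]
}
```

Because the runtime field shares the mapped field’s name, the script reads the original value through `params._source` rather than `doc['voltage']`.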
-Now, you can easily run an [average aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-avg-aggregation.md) on the `measures.start` and `measures.end` fields: +Now, you can easily run an [average aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-avg-aggregation.md) on the `measures.start` and `measures.end` fields: ```console GET my-index-000001/_search @@ -109,7 +109,7 @@ The response includes the aggregation results without changing the values for th } ``` -Further, you can define a runtime field as part of a search query that calculates a value, and then run a [stats aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) on that field *in the same query*. +Further, you can define a runtime field as part of a search query that calculates a value, and then run a [stats aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) on that field *in the same query*. The `duration` runtime field doesn’t exist in the index mapping, but we can still search and aggregate on that field. The following query returns the calculated value for the `duration` field and runs a stats aggregation to compute statistics over numeric values extracted from the aggregated documents. diff --git a/manage-data/data-store/mapping/dynamic-field-mapping.md b/manage-data/data-store/mapping/dynamic-field-mapping.md index 996e863c2..19fc993c3 100644 --- a/manage-data/data-store/mapping/dynamic-field-mapping.md +++ b/manage-data/data-store/mapping/dynamic-field-mapping.md @@ -8,12 +8,12 @@ applies_to: # Dynamic field mapping [dynamic-field-mapping] -When {{es}} detects a new field in a document, it *dynamically* adds the field to the type mapping by default. The [`dynamic`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dynamic.md) parameter controls this behavior. +When {{es}} detects a new field in a document, it *dynamically* adds the field to the type mapping by default. The [`dynamic`](elasticsearch://reference/elasticsearch/mapping-reference/dynamic.md) parameter controls this behavior. You can explicitly instruct {{es}} to dynamically create fields based on incoming documents by setting the `dynamic` parameter to `true` or `runtime`. When dynamic field mapping is enabled, {{es}} uses the rules in the following table to determine how to map data types for each field. -::::{note} -The field data types in the following table are the only [field data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md) that {{es}} detects dynamically. You must explicitly map all other data types. +::::{note} +The field data types in the following table are the only [field data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) that {{es}} detects dynamically. You must explicitly map all other data types. :::: @@ -33,9 +33,9 @@ $$$dynamic-field-mapping-types$$$ | `string` that passes [numeric detection](#numeric-detection) | `float` or `long` | `double` or `long` | | `string` that doesn’t pass `date` detection or `numeric` detection | `text` with a `.keyword` sub-field | `keyword` | -You can disable dynamic mapping, both at the document and at the [`object`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) level. 
Setting the `dynamic` parameter to `false` ignores new fields, and `strict` rejects the document if {{es}} encounters an unknown field. +You can disable dynamic mapping, both at the document and at the [`object`](elasticsearch://reference/elasticsearch/mapping-reference/object.md) level. Setting the `dynamic` parameter to `false` ignores new fields, and `strict` rejects the document if {{es}} encounters an unknown field. -::::{tip} +::::{tip} Use the [update mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping) to update the `dynamic` setting on existing fields. :::: @@ -44,11 +44,11 @@ You can customize dynamic field mapping rules for [date detection](#date-detecti ## Date detection [date-detection] -If `date_detection` is enabled (default), then new string fields are checked to see whether their contents match any of the date patterns specified in `dynamic_date_formats`. If a match is found, a new [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) field is added with the corresponding format. +If `date_detection` is enabled (default), then new string fields are checked to see whether their contents match any of the date patterns specified in `dynamic_date_formats`. If a match is found, a new [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) field is added with the corresponding format. The default value for `dynamic_date_formats` is: -[ [`"strict_date_optional_time"`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md#strict-date-time),`"yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"`] +[ [`"strict_date_optional_time"`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md#strict-date-time),`"yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"`] For example: @@ -61,7 +61,7 @@ PUT my-index-000001/_doc/1 GET my-index-000001/_mapping <1> ``` -1. The `create_date` field has been added as a [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) field with the [`format`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md):
`"yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"`. +1. The `create_date` field has been added as a [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) field with the [`format`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md):
`"yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"`. ### Disabling date detection [_disabling_date_detection] @@ -82,13 +82,13 @@ PUT my-index-000001/_doc/1 <1> } ``` -1. The `create_date` field has been added as a [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) field. +1. The `create_date` field has been added as a [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) field. ### Customizing detected date formats [_customizing_detected_date_formats] -Alternatively, the `dynamic_date_formats` can be customized to support your own [date formats](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md): +Alternatively, the `dynamic_date_formats` can be customized to support your own [date formats](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md): ```console PUT my-index-000001 @@ -104,7 +104,7 @@ PUT my-index-000001/_doc/1 } ``` -::::{note} +::::{note} There is a difference between configuring an array of date patterns and configuring multiple patterns in a single string separated by `||`. When you configure an array of date patterns, the pattern that matches the date in the first document with an unmapped date field will determine the mapping of that field: ```console @@ -181,7 +181,7 @@ The resulting mapping will be: :::: -::::{note} +::::{note} Epoch formats (`epoch_millis` and `epoch_second`) are not supported as dynamic date formats. :::: @@ -208,8 +208,8 @@ PUT my-index-000001/_doc/1 } ``` -1. The `my_float` field is added as a [`float`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) field. -2. The `my_integer` field is added as a [`long`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) field. +1. The `my_float` field is added as a [`float`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field. +2. The `my_integer` field is added as a [`long`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field. diff --git a/manage-data/data-store/mapping/dynamic-templates.md b/manage-data/data-store/mapping/dynamic-templates.md index bcc6f2f80..6d6adebd7 100644 --- a/manage-data/data-store/mapping/dynamic-templates.md +++ b/manage-data/data-store/mapping/dynamic-templates.md @@ -17,7 +17,7 @@ Dynamic templates allow you greater control over how {{es}} maps your data beyon Use the `{{name}}` and `{{dynamic_type}}` [template variables](#template-variables) in the mapping specification as placeholders. -::::{important} +::::{important} Dynamic field mappings are only added when a field contains a concrete value. {{es}} doesn’t add a dynamic field mapping when the field contains `null` or an empty array. If the `null_value` option is used in a `dynamic_template`, it will only be applied after the first document with a concrete value for the field has been indexed. :::: @@ -89,7 +89,7 @@ The `match_mapping_type` parameter matches fields by the data type detected by t Because JSON doesn’t distinguish a `long` from an `integer` or a `double` from a `float`, any parsed floating point number is considered a `double` JSON data type, while any parsed `integer` number is considered a `long`. -::::{note} +::::{note} With dynamic mappings, {{es}} will always choose the wider data type. The one exception is `float`, which requires less storage space than `double` and is precise enough for most applications. 
Runtime fields do not support `float`, which is why `"dynamic":"runtime"` uses `double`. :::: @@ -174,7 +174,7 @@ PUT my-index-000001/_doc/1 ``` 1. The `my_integer` field is mapped as an `integer`. -2. The `my_string` field is mapped as a `text`, with a `keyword` [multi-field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md). +2. The `my_string` field is mapped as a `text`, with a `keyword` [multi-field](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md). 3. The `my_boolean` field is mapped as a `keyword`. 4. The `field.count` field is mapped as a `long`. @@ -359,7 +359,7 @@ PUT my-index-000001/_doc/2 ## Template variables [template-variables] -The `{{name}}` and `{{dynamic_type}}` placeholders are replaced in the `mapping` with the field name and detected dynamic type. The following example sets all string fields to use an [`analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) with the same name as the field, and disables [`doc_values`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/doc-values.md) for all non-string fields: +The `{{name}}` and `{{dynamic_type}}` placeholders are replaced in the `mapping` with the field name and detected dynamic type. The following example sets all string fields to use an [`analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/analyzer.md) with the same name as the field, and disables [`doc_values`](elasticsearch://reference/elasticsearch/mapping-reference/doc-values.md) for all non-string fields: ```console PUT my-index-000001 diff --git a/manage-data/data-store/mapping/explicit-mapping.md b/manage-data/data-store/mapping/explicit-mapping.md index f2881f960..f49cbfb52 100644 --- a/manage-data/data-store/mapping/explicit-mapping.md +++ b/manage-data/data-store/mapping/explicit-mapping.md @@ -13,7 +13,7 @@ You know more about your data than {{es}} can guess, so while dynamic mapping ca You can create field mappings when you [create an index](#create-mapping) and [add fields to an existing index](#add-field-mapping). -## Create an index with an explicit mapping [create-mapping] +## Create an index with an explicit mapping [create-mapping] You can use the [create index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create) API to create a new index with an explicit mapping. @@ -30,17 +30,17 @@ PUT /my-index-000001 } ``` -1. Creates `age`, an [`integer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md) field -2. Creates `email`, a [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) field -3. Creates `name`, a [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) field +1. Creates `age`, an [`integer`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field +2. Creates `email`, a [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) field +3. Creates `name`, a [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) field -## Add a field to an existing mapping [add-field-mapping] +## Add a field to an existing mapping [add-field-mapping] You can use the [update mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping) API to add one or more new fields to an existing index. 
-The following example adds `employee-id`, a `keyword` field with an [`index`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-index.md) mapping parameter value of `false`. This means values for the `employee-id` field are stored but not indexed or available for search. +The following example adds `employee-id`, a `keyword` field with an [`index`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-index.md) mapping parameter value of `false`. This means values for the `employee-id` field are stored but not indexed or available for search. ```console PUT /my-index-000001/_mapping @@ -55,18 +55,18 @@ PUT /my-index-000001/_mapping ``` -## Update the mapping of a field [update-mapping] +## Update the mapping of a field [update-mapping] -Except for supported [mapping parameters](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-parameters.md), you can’t change the mapping or field type of an existing field. Changing an existing field could invalidate data that’s already indexed. +Except for supported [mapping parameters](elasticsearch://reference/elasticsearch/mapping-reference/mapping-parameters.md), you can’t change the mapping or field type of an existing field. Changing an existing field could invalidate data that’s already indexed. If you need to change the mapping of a field in a data stream’s backing indices, see [Change mappings and settings for a data stream](../data-streams/modify-data-stream.md#data-streams-change-mappings-and-settings). If you need to change the mapping of a field in other indices, create a new index with the correct mapping and [reindex](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) your data into that index. -Renaming a field would invalidate data already indexed under the old field name. Instead, add an [`alias`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-alias.md) field to create an alternate field name. +Renaming a field would invalidate data already indexed under the old field name. Instead, add an [`alias`](elasticsearch://reference/elasticsearch/mapping-reference/field-alias.md) field to create an alternate field name. -## View the mapping of an index [view-mapping] +## View the mapping of an index [view-mapping] You can use the [get mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-mapping) API to view the mapping of an existing index. @@ -101,7 +101,7 @@ The API returns the following response: ``` -## View the mapping of specific fields [view-field-mapping] +## View the mapping of specific fields [view-field-mapping] If you only want to view the mapping of one or more specific fields, you can use the [get field mapping](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-mapping) API. 
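For example, reusing the `employee-id` field added earlier, a request along these lines is a reasonable sketch:

```console
GET /my-index-000001/_mapping/field/employee-id
```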
diff --git a/manage-data/data-store/mapping/explore-data-with-runtime-fields.md b/manage-data/data-store/mapping/explore-data-with-runtime-fields.md index f58913092..4fe70b312 100644 --- a/manage-data/data-store/mapping/explore-data-with-runtime-fields.md +++ b/manage-data/data-store/mapping/explore-data-with-runtime-fields.md @@ -238,7 +238,7 @@ If the script didn’t include this condition, the query would fail on any shard ### Search for documents in a specific range [runtime-examples-grok-range] -You can also run a [range query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) that operates on the `timestamp` field. The following query returns any documents where the `timestamp` is greater than or equal to `2020-04-30T14:31:27-05:00`: +You can also run a [range query](elasticsearch://reference/query-languages/query-dsl-range-query.md) that operates on the `timestamp` field. The following query returns any documents where the `timestamp` is greater than or equal to `2020-04-30T14:31:27-05:00`: ```console GET my-index-000001/_search @@ -292,7 +292,7 @@ The response includes the document where the log format doesn’t match, but the ## Define a runtime field with a dissect pattern [runtime-examples-dissect] -If you don’t need the power of regular expressions, you can use [dissect patterns](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) instead of grok patterns. Dissect patterns match on fixed delimiters but are typically faster than grok. +If you don’t need the power of regular expressions, you can use [dissect patterns](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) instead of grok patterns. Dissect patterns match on fixed delimiters but are typically faster than grok. You can use dissect to achieve the same results as parsing the Apache logs with a [grok pattern](#runtime-examples-grok). Instead of matching on a log pattern, you include the parts of the string that you want to discard. Paying special attention to the parts of the string you want to discard will help build successful dissect patterns. diff --git a/manage-data/data-store/mapping/index-runtime-field.md b/manage-data/data-store/mapping/index-runtime-field.md index f764f3843..c38d1528f 100644 --- a/manage-data/data-store/mapping/index-runtime-field.md +++ b/manage-data/data-store/mapping/index-runtime-field.md @@ -10,14 +10,14 @@ applies_to: Runtime fields are defined by the context where they run. For example, you can define runtime fields in the [context of a search query](define-runtime-fields-in-search-request.md) or within the [`runtime` section](map-runtime-field.md) of an index mapping. If you decide to index a runtime field for greater performance, just move the full runtime field definition (including the script) to the context of an index mapping. {{es}} automatically uses these indexed fields to drive queries, resulting in a fast response time. This capability means you can write a script only once, and apply it to any context that supports runtime fields. -::::{note} +::::{note} Indexing a `composite` runtime field is currently not supported. :::: You can then use runtime fields to limit the number of fields that {{es}} needs to calculate values for. Using indexed fields in tandem with runtime fields provides flexibility in the data that you index and how you define queries for other fields. 
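As a minimal sketch of that move (the `voltage_corrected` field and its multiplier are assumptions for illustration), the script simply lives under `properties` instead of `runtime`:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "voltage": { "type": "double" },
      "voltage_corrected": {
        "type": "double",
        "script": {
          "source": "emit(doc['voltage'].value * params['multiplier'])",
          "params": { "multiplier": 4 }
        }
      }
    }
  }
}
```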
-::::{important} +::::{important} After indexing a runtime field, you cannot update the included script. If you need to change the script, create a new field with the updated script. :::: @@ -85,7 +85,7 @@ PUT my-index-000001/_mapping } ``` -You retrieve the calculated values using the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API: +You retrieve the calculated values using the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API: ```console GET my-index-000001/_search @@ -157,7 +157,7 @@ POST my-index-000001/_bulk?refresh=true { "timestamp": 1516297294000, "temperature": 202, "voltage": 4.0, "node": "c"} ``` -You can now retrieve calculated values in a search query, and find documents based on precise values. The following range query returns all documents where the calculated `voltage_corrected` is greater than or equal to `16`, but less than or equal to `20`. Again, use the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API to retrieve the fields you want: +You can now retrieve calculated values in a search query, and find documents based on precise values. The following range query returns all documents where the calculated `voltage_corrected` is greater than or equal to `16`, but less than or equal to `20`. Again, use the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API to retrieve the fields you want: ```console POST my-index-000001/_search diff --git a/manage-data/data-store/mapping/map-runtime-field.md b/manage-data/data-store/mapping/map-runtime-field.md index 68bf74781..9d56e5418 100644 --- a/manage-data/data-store/mapping/map-runtime-field.md +++ b/manage-data/data-store/mapping/map-runtime-field.md @@ -11,7 +11,7 @@ applies_to: You map runtime fields by adding a `runtime` section under the mapping definition and defining [a Painless script](../../../explore-analyze/scripting/modules-scripting-using.md). This script has access to the entire context of a document, including the original `_source` via `params._source` and any mapped fields plus their values. At query time, the script runs and generates values for each scripted field that is required for the query. ::::{admonition} Emitting runtime field values -When defining a Painless script to use with runtime fields, you must include the [`emit` method](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-runtime-fields-context.md) to emit calculated values. +When defining a Painless script to use with runtime fields, you must include the [`emit` method](elasticsearch://reference/scripting-languages/painless/painless-runtime-fields-context.md) to emit calculated values. :::: @@ -49,7 +49,7 @@ The `runtime` section can be any of these data types: * `long` * [`lookup`](retrieve-runtime-field.md#lookup-runtime-fields) -Runtime fields with a `type` of `date` can accept the [`format`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md) parameter exactly as the `date` field type. +Runtime fields with a `type` of `date` can accept the [`format`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md) parameter exactly as the `date` field type. 
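For example, the following hedged sketch (the `launch_date` field name is hypothetical) sets `format` alongside the runtime field’s `type`:

```console
PUT my-index-000001
{
  "mappings": {
    "runtime": {
      "launch_date": {
        "type": "date",
        "format": "MM-dd-yyyy"
      }
    }
  }
}
```

Because no script is included here, {{es}} looks for a `launch_date` field in `_source` at query time, as described below.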
Runtime fields with a `type` of `lookup` allow retrieving fields from related indices. See [`retrieve fields from related indices`](retrieve-runtime-field.md#lookup-runtime-fields).

@@ -88,11 +88,11 @@ PUT my-index-000001/

When no script is provided, {{es}} implicitly looks in `_source` at query time for a field with the same name as the runtime field, and returns a value if one exists. If a field with the same name doesn’t exist, the response doesn’t include any values for that runtime field.

-In most cases, retrieve field values through [`doc_values`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/doc-values.md) whenever possible. Accessing `doc_values` with a runtime field is faster than retrieving values from `_source` because of how data is loaded from Lucene.
+In most cases, retrieve field values through [`doc_values`](elasticsearch://reference/elasticsearch/mapping-reference/doc-values.md) whenever possible. Accessing `doc_values` with a runtime field is faster than retrieving values from `_source` because of how data is loaded from Lucene.

However, there are cases where retrieving fields from `_source` is necessary. For example, `text` fields do not have `doc_values` available by default, so you have to retrieve values from `_source`. In other instances, you might choose to disable `doc_values` on a specific field.

-::::{note} 
+::::{note}
You can alternatively prefix the field you want to retrieve values for with `params._source` (such as `params._source.day_of_week`). For simplicity, defining a runtime field in the mapping definition without a script is the recommended option, whenever possible.
::::

@@ -119,7 +119,7 @@ PUT my-index-000001/_mapping

:::::{admonition} Downstream impacts
Updating or removing a runtime field while a dependent query is running can return inconsistent results. Each shard might have access to different versions of the script, depending on when the mapping change takes effect.

-::::{warning} 
+::::{warning}
Existing queries or visualizations in {{kib}} that rely on runtime fields can fail if you remove or update the field. For example, a bar chart visualization that uses a runtime field of type `ip` will fail if the type is changed to `boolean`, or if the runtime field is removed.
::::

diff --git a/manage-data/data-store/mapping/override-field-values-at-query-time.md b/manage-data/data-store/mapping/override-field-values-at-query-time.md
index 1b434ba67..f703548aa 100644
--- a/manage-data/data-store/mapping/override-field-values-at-query-time.md
+++ b/manage-data/data-store/mapping/override-field-values-at-query-time.md
@@ -86,7 +86,7 @@ The response includes indexed values for documents matching model number `HG537P

The following request defines a runtime field whose script evaluates documents where the `model_number` field value is `HG537PU`. For each match, the script multiplies the value of the `voltage` field by `1.7`.
-Using the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API, you can retrieve the value that the script calculates for the `measures.voltage` field for documents matching the search request:
+Using the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API, you can retrieve the value that the script calculates for the `measures.voltage` field for documents matching the search request:

```console
POST my-index-000001/_search
diff --git a/manage-data/data-store/mapping/retrieve-runtime-field.md b/manage-data/data-store/mapping/retrieve-runtime-field.md
index 2df1c243c..a6da515e0 100644
--- a/manage-data/data-store/mapping/retrieve-runtime-field.md
+++ b/manage-data/data-store/mapping/retrieve-runtime-field.md
@@ -8,7 +8,7 @@ applies_to:

# Retrieve a runtime field [runtime-retrieving-fields]

-Use the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API to retrieve the values of runtime fields. Runtime fields won’t display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of the original `_source`.
+Use the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API to retrieve the values of runtime fields. Runtime fields won’t display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of the original `_source`.

## Define a runtime field to calculate the day of week [runtime-define-field-dayofweek]

@@ -155,7 +155,7 @@ This time, the response includes only two hits. The value for `day_of_week` (`Su

## Retrieve fields from related indices [lookup-runtime-fields]

-The [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API can also be used to retrieve fields from the related indices via runtime fields with a type of `lookup`.
+The [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API can also be used to retrieve fields from related indices via runtime fields with a type of `lookup`.

::::{note}
Fields that are retrieved by runtime fields of type `lookup` can be used to enrich the hits in a search response. It’s not possible to query or aggregate on these fields.
::::

@@ -202,11 +202,11 @@ POST logs/_search
}
```

-1. Define a runtime field in the main search request with a type of `lookup` that retrieves fields from the target index using the [`term`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md) queries.
+1. Define a runtime field in the main search request with a type of `lookup` that retrieves fields from the target index using [`term`](elasticsearch://reference/query-languages/query-dsl-term-query.md) queries.
2. The target index that the lookup query executes against.
3. A field on the main index whose values are used as the input values of the lookup term query.
4. A field on the lookup index that the lookup query searches against.
-5. A list of fields to retrieve from the lookup index. See the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter of a search request.
+5. A list of fields to retrieve from the lookup index.
See the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter of a search request.

The above search returns the country and city from the `ip_location` index for each IP address of the returned search hits.

diff --git a/manage-data/data-store/mapping/runtime-fields.md b/manage-data/data-store/mapping/runtime-fields.md
index e0e24661e..7e86a09b0 100644
--- a/manage-data/data-store/mapping/runtime-fields.md
+++ b/manage-data/data-store/mapping/runtime-fields.md
@@ -17,12 +17,12 @@ A *runtime field* is a field that is evaluated at query time. Runtime fields ena

You access runtime fields from the search API like any other field, and {{es}} sees runtime fields no differently. You can define runtime fields in the [index mapping](map-runtime-field.md) or in the [search request](define-runtime-fields-in-search-request.md). The choice is yours, and it’s part of the inherent flexibility of runtime fields.

-Use the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API to [retrieve the values of runtime fields](retrieve-runtime-field.md). Runtime fields won’t display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of the original `_source`.
+Use the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md) parameter on the `_search` API to [retrieve the values of runtime fields](retrieve-runtime-field.md). Runtime fields won’t display in `_source`, but the `fields` API works for all fields, even those that were not sent as part of the original `_source`.

Runtime fields are useful when working with log data (see [examples](explore-data-with-runtime-fields.md)), especially when you’re unsure about the data structure. Your search speed decreases, but your index size is much smaller and you can more quickly process logs without having to index them.

-## Benefits [runtime-benefits] 
+## Benefits [runtime-benefits]

Because runtime fields aren’t indexed, adding a runtime field doesn’t increase the index size. You define runtime fields directly in the index mapping, saving storage costs and increasing ingestion speed. You can more quickly ingest data into the Elastic Stack and access it right away. When you define a runtime field, you can immediately use it in search requests, aggregations, filtering, and sorting.

@@ -31,20 +31,20 @@ If you change a runtime field into an indexed field, you don’t need to modify

At its core, the most important benefit of runtime fields is the ability to add fields to documents after you’ve ingested them. This capability simplifies mapping decisions because you don’t have to decide how to parse your data up front, and can use runtime fields to amend the mapping at any time. Using runtime fields allows for a smaller index and faster ingest time, which combined use fewer resources and reduce your operating costs.

-## Incentives [runtime-incentives] 
+## Incentives [runtime-incentives]

Runtime fields can replace many of the ways you can use scripting with the `_search` API. How you use a runtime field is impacted by the number of documents that the included script runs against. For example, if you’re using the `fields` parameter on the `_search` API to [retrieve the values of a runtime field](retrieve-runtime-field.md), the script runs only against the top hits just like script fields do.
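For example, in this hedged sketch (the `duration_s` source field and `duration_ms` runtime field are hypothetical), the script only runs against the top `size` hits that the query returns:

```console
GET my-index-000001/_search
{
  "runtime_mappings": {
    "duration_ms": {
      "type": "long",
      "script": {
        "source": "emit((long) (doc['duration_s'].value * 1000))"
      }
    }
  },
  "fields": ["duration_ms"],
  "size": 10
}
```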
-You can use [script fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) to access values in `_source` and return calculated values based on a script valuation. Runtime fields have the same capabilities, but provide greater flexibility because you can query and aggregate on runtime fields in a search request. Script fields can only fetch values.
+You can use [script fields](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) to access values in `_source` and return calculated values based on a script evaluation. Runtime fields have the same capabilities, but provide greater flexibility because you can query and aggregate on runtime fields in a search request. Script fields can only fetch values.

-Similarly, you could write a [script query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-query.md) that filters documents in a search request based on a script. Runtime fields provide a very similar feature that is more flexible. You write a script to create field values and they are available everywhere, such as [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md), [all queries](../../../explore-analyze/query-filter/languages/querydsl.md), and [aggregations](../../../explore-analyze/query-filter/aggregations.md).
+Similarly, you could write a [script query](elasticsearch://reference/query-languages/query-dsl-script-query.md) that filters documents in a search request based on a script. Runtime fields provide a very similar capability that is more flexible. You write a script to create field values, and they are available everywhere, such as [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md), [all queries](../../../explore-analyze/query-filter/languages/querydsl.md), and [aggregations](../../../explore-analyze/query-filter/aggregations.md).

-You can also use scripts to [sort search results](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/sort-search-results.md#script-based-sorting), but that same script works exactly the same in a runtime field.
+You can also use scripts to [sort search results](elasticsearch://reference/elasticsearch/rest-apis/sort-search-results.md#script-based-sorting), but that same script works exactly the same way in a runtime field.

If you move a script from any of these sections in a search request to a runtime field that is computing values from the same number of documents, the performance should be about the same. The performance for these features is largely dependent upon the calculations that the included script is running and how many documents the script runs against.

-## Compromises [runtime-compromises] 
+## Compromises [runtime-compromises]

Runtime fields use less disk space and provide flexibility in how you access your data, but can impact search performance based on the computation defined in the runtime script.

@@ -52,7 +52,7 @@ To balance search performance and flexibility, index fields that you’ll freque

Use the [asynchronous search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) to run searches that include runtime fields. This method of search helps to offset the performance impacts of computing values for runtime fields in each document containing that field.
If the query can’t return the result set synchronously, you’ll get results asynchronously as they become available.

-::::{important} 
+::::{important}
Queries against runtime fields are considered expensive. If [`search.allow_expensive_queries`](../../../explore-analyze/query-filter/languages/querydsl.md#query-dsl-allow-expensive-queries) is set to `false`, expensive queries are not allowed and {{es}} will reject any queries against runtime fields.
::::

diff --git a/manage-data/data-store/near-real-time-search.md b/manage-data/data-store/near-real-time-search.md
index 57b3fc8bd..f57309e9e 100644
--- a/manage-data/data-store/near-real-time-search.md
+++ b/manage-data/data-store/near-real-time-search.md
@@ -30,7 +30,7 @@ Lucene allows new segments to be written and opened, making the documents they c

In {{es}}, this process of writing and opening a new segment is called a *refresh*. A refresh makes all operations performed on an index since the last refresh available for search. You can control refreshes through the following means:

* Waiting for the refresh interval
-* Setting the [?refresh](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/refresh-parameter.md) option
+* Setting the [?refresh](elasticsearch://reference/elasticsearch/rest-apis/refresh-parameter.md) option
* Using the [Refresh API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-refresh) to explicitly complete a refresh (`POST _refresh`)

By default, {{es}} periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. This is why we say that {{es}} has *near* real-time search: document changes are not visible to search immediately, but will become visible within this timeframe.

diff --git a/manage-data/data-store/templates.md b/manage-data/data-store/templates.md
index 109334157..ea72e8454 100644
--- a/manage-data/data-store/templates.md
+++ b/manage-data/data-store/templates.md
@@ -41,7 +41,7 @@ The following conditions apply to index templates:

If you use {{fleet}} or {{agent}}, assign your index templates a priority lower than `100` to avoid overriding these templates. Otherwise, to avoid accidentally applying the templates, do one or more of the following:

-* To disable all built-in index and component templates, set [`stack.templates.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-management-settings.md#stack-templates-enabled) to `false` using the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). Note, however, that this is not recommended, see the [setting documentation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-management-settings.md#stack-templates-enabled) for more information.
+* To disable all built-in index and component templates, set [`stack.templates.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/index-management-settings.md#stack-templates-enabled) to `false` using the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). Note, however, that this is not recommended; see the [setting documentation](elasticsearch://reference/elasticsearch/configuration-reference/index-management-settings.md#stack-templates-enabled) for more information.
* Use a non-overlapping index pattern.
* Assign templates with an overlapping pattern a `priority` higher than `500`. For example, if you don’t use {{fleet}} or {{agent}} and want to create a template for the `logs-*` index pattern, assign your template a priority of `500`. This ensures your template is applied instead of the built-in template for `logs-*-*`. * To avoid naming collisions with built-in and Fleet-managed index templates, avoid using `@` as part of the name of your own index templates. diff --git a/manage-data/data-store/templates/index-template-management.md b/manage-data/data-store/templates/index-template-management.md index e599b9376..666239ed4 100644 --- a/manage-data/data-store/templates/index-template-management.md +++ b/manage-data/data-store/templates/index-template-management.md @@ -51,7 +51,7 @@ In this tutorial, you’ll create an index template and use it to configure two ::: 2. Define index settings. These are optional. For this tutorial, leave this section blank. -3. Define a mapping that contains an [object](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) field named `geo` with a child [`geo_point`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) field named `coordinates`: +3. Define a mapping that contains an [object](elasticsearch://reference/elasticsearch/mapping-reference/object.md) field named `geo` with a child [`geo_point`](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) field named `coordinates`: :::{image} ../../../images/elasticsearch-reference-management-index-templates-mappings.png :alt: Mapped fields page diff --git a/manage-data/data-store/text-analysis.md b/manage-data/data-store/text-analysis.md index 968264ab4..d93b3a986 100644 --- a/manage-data/data-store/text-analysis.md +++ b/manage-data/data-store/text-analysis.md @@ -14,7 +14,7 @@ _Text analysis_ is the process of converting unstructured text, like the body of Text analysis enables {{es}} to perform full-text search, where the search returns all *relevant* results rather than just exact matches. For example, if you search for `Quick fox jumps`, you probably want the document that contains `A quick brown fox jumps over the lazy dog`, and you might also want documents that contain related words like `fast fox` or `foxes leap`. -{{es}} performs text analysis when indexing or searching [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) fields. If your index does _not_ contain `text` fields, no further setup is needed; you can skip the pages in this section. If you _do_ use `text` fields or your text searches aren’t returning results as expected, configuring text analysis can often help. You should also look into analysis configuration if you’re using {{es}} to: +{{es}} performs text analysis when indexing or searching [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) fields. If your index does _not_ contain `text` fields, no further setup is needed; you can skip the pages in this section. If you _do_ use `text` fields or your text searches aren’t returning results as expected, configuring text analysis can often help. 
You should also look into analysis configuration if you’re using {{es}} to:

* Build a search engine
* Mine unstructured data

@@ -47,9 +47,9 @@ To ensure search terms match these words as intended, you can apply the same tok

Text analysis is performed by an [*analyzer*](/manage-data/data-store/text-analysis/anatomy-of-an-analyzer.md), a set of rules that govern the entire process.

-{{es}} includes a default analyzer, called the [standard analyzer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-standard-analyzer.md), which works well for most use cases right out of the box.
+{{es}} includes a default analyzer, called the [standard analyzer](elasticsearch://reference/data-analysis/text-analysis/analysis-standard-analyzer.md), which works well for most use cases right out of the box.

-If you want to tailor your search experience, you can choose a different [built-in analyzer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md) or even [configure a custom one](/manage-data/data-store/text-analysis/create-custom-analyzer.md). A custom analyzer gives you control over each step of the analysis process, including:
+If you want to tailor your search experience, you can choose a different [built-in analyzer](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md) or even [configure a custom one](/manage-data/data-store/text-analysis/create-custom-analyzer.md). A custom analyzer gives you control over each step of the analysis process, including:

* Changes to the text *before* tokenization
* How text is converted to tokens

diff --git a/manage-data/data-store/text-analysis/anatomy-of-an-analyzer.md b/manage-data/data-store/text-analysis/anatomy-of-an-analyzer.md
index c16327d3d..968175199 100644
--- a/manage-data/data-store/text-analysis/anatomy-of-an-analyzer.md
+++ b/manage-data/data-store/text-analysis/anatomy-of-an-analyzer.md
@@ -10,30 +10,30 @@ applies_to:

An *analyzer*  — whether built-in or custom — is just a package which contains three lower-level building blocks: *character filters*, *tokenizers*, and *token filters*.

-The built-in [analyzers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md) pre-package these building blocks into analyzers suitable for different languages and types of text. Elasticsearch also exposes the individual building blocks so that they can be combined to define new [`custom`](create-custom-analyzer.md) analyzers.
+The built-in [analyzers](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md) pre-package these building blocks into analyzers suitable for different languages and types of text. Elasticsearch also exposes the individual building blocks so that they can be combined to define new [`custom`](create-custom-analyzer.md) analyzers.

## Character filters [analyzer-anatomy-character-filters]

A *character filter* receives the original text as a stream of characters and can transform the stream by adding, removing, or changing characters. For instance, a character filter could be used to convert Hindu-Arabic numerals (٠‎١٢٣٤٥٦٧٨‎٩‎) into their Arabic-Latin equivalents (0123456789), or to strip HTML elements like `<b>` from the stream.

-An analyzer may have **zero or more** [character filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/character-filter-reference.md), which are applied in order.
+An analyzer may have **zero or more** [character filters](elasticsearch://reference/data-analysis/text-analysis/character-filter-reference.md), which are applied in order. ## Tokenizer [analyzer-anatomy-tokenizer] -A *tokenizer* receives a stream of characters, breaks it up into individual *tokens* (usually individual words), and outputs a stream of *tokens*. For instance, a [`whitespace`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-whitespace-tokenizer.md) tokenizer breaks text into tokens whenever it sees any whitespace. It would convert the text `"Quick brown fox!"` into the terms `[Quick, brown, fox!]`. +A *tokenizer* receives a stream of characters, breaks it up into individual *tokens* (usually individual words), and outputs a stream of *tokens*. For instance, a [`whitespace`](elasticsearch://reference/data-analysis/text-analysis/analysis-whitespace-tokenizer.md) tokenizer breaks text into tokens whenever it sees any whitespace. It would convert the text `"Quick brown fox!"` into the terms `[Quick, brown, fox!]`. The tokenizer is also responsible for recording the order or *position* of each term and the start and end *character offsets* of the original word which the term represents. -An analyzer must have **exactly one** [tokenizer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/tokenizer-reference.md). +An analyzer must have **exactly one** [tokenizer](elasticsearch://reference/data-analysis/text-analysis/tokenizer-reference.md). ## Token filters [analyzer-anatomy-token-filters] -A *token filter* receives the token stream and may add, remove, or change tokens. For example, a [`lowercase`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) token filter converts all tokens to lowercase, a [`stop`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md) token filter removes common words (*stop words*) like `the` from the token stream, and a [`synonym`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) token filter introduces synonyms into the token stream. +A *token filter* receives the token stream and may add, remove, or change tokens. For example, a [`lowercase`](elasticsearch://reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) token filter converts all tokens to lowercase, a [`stop`](elasticsearch://reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md) token filter removes common words (*stop words*) like `the` from the token stream, and a [`synonym`](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) token filter introduces synonyms into the token stream. Token filters are not allowed to change the position or character offsets of each token. -An analyzer may have **zero or more** [token filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md), which are applied in order. +An analyzer may have **zero or more** [token filters](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md), which are applied in order. 
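To see the three building blocks working together before wiring them into a mapping, you can pass them directly to the `_analyze` API. This is a minimal sketch using only built-in components; no index is required:

```console
POST _analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": ["lowercase", "asciifolding"],
  "text": "Is this <b>déjà vu</b>?"
}
```

The `html_strip` character filter removes the `<b>` tags, the `standard` tokenizer splits the remaining text into words, and the two token filters normalize them, producing the tokens `[is, this, deja, vu]`.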
diff --git a/manage-data/data-store/text-analysis/configure-text-analysis.md b/manage-data/data-store/text-analysis/configure-text-analysis.md
index 6537fdf6d..c2672633c 100644
--- a/manage-data/data-store/text-analysis/configure-text-analysis.md
+++ b/manage-data/data-store/text-analysis/configure-text-analysis.md
@@ -8,9 +8,9 @@ applies_to:

# Configure text analysis [configure-text-analysis]

-By default, {{es}} uses the [`standard` analyzer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-standard-analyzer.md) for all text analysis. The `standard` analyzer gives you out-of-the-box support for most natural languages and use cases. If you chose to use the `standard` analyzer as-is, no further configuration is needed.
+By default, {{es}} uses the [`standard` analyzer](elasticsearch://reference/data-analysis/text-analysis/analysis-standard-analyzer.md) for all text analysis. The `standard` analyzer gives you out-of-the-box support for most natural languages and use cases. If you choose to use the `standard` analyzer as-is, no further configuration is needed.

-If the standard analyzer does not fit your needs, review and test {{es}}'s other built-in [built-in analyzers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md). Built-in analyzers don’t require configuration, but some support options that can be used to adjust their behavior. For example, you can configure the `standard` analyzer with a list of custom stop words to remove.
+If the standard analyzer does not fit your needs, review and test {{es}}'s other [built-in analyzers](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md). Built-in analyzers don’t require configuration, but some support options that can be used to adjust their behavior. For example, you can configure the `standard` analyzer with a list of custom stop words to remove.

If no built-in analyzer fits your needs, you can test and create a custom analyzer. Custom analyzers involve selecting and combining different [analyzer components](anatomy-of-an-analyzer.md), giving you greater control over the process.

diff --git a/manage-data/data-store/text-analysis/configuring-built-in-analyzers.md b/manage-data/data-store/text-analysis/configuring-built-in-analyzers.md
index de6cb3bea..a93a9177b 100644
--- a/manage-data/data-store/text-analysis/configuring-built-in-analyzers.md
+++ b/manage-data/data-store/text-analysis/configuring-built-in-analyzers.md
@@ -8,7 +8,7 @@ applies_to:

# Configuring built-in analyzers [configuring-analyzers]

-The built-in analyzers can be used directly without any configuration. Some of them, however, support configuration options to alter their behaviour. For instance, the [`standard` analyzer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-standard-analyzer.md) can be configured to support a list of stop words:
+The built-in analyzers can be used directly without any configuration. Some of them, however, support configuration options to alter their behaviour.
For instance, the [`standard` analyzer](elasticsearch://reference/data-analysis/text-analysis/analysis-standard-analyzer.md) can be configured to support a list of stop words: ```console PUT my-index-000001 diff --git a/manage-data/data-store/text-analysis/create-custom-analyzer.md b/manage-data/data-store/text-analysis/create-custom-analyzer.md index 9be4da578..0fd42f661 100644 --- a/manage-data/data-store/text-analysis/create-custom-analyzer.md +++ b/manage-data/data-store/text-analysis/create-custom-analyzer.md @@ -10,46 +10,46 @@ applies_to: When the built-in analyzers do not fulfill your needs, you can create a `custom` analyzer which uses the appropriate combination of: -* zero or more [character filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/character-filter-reference.md) -* a [tokenizer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/tokenizer-reference.md) -* zero or more [token filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md). +* zero or more [character filters](elasticsearch://reference/data-analysis/text-analysis/character-filter-reference.md) +* a [tokenizer](elasticsearch://reference/data-analysis/text-analysis/tokenizer-reference.md) +* zero or more [token filters](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md). -## Configuration [_configuration] +## Configuration [_configuration] The `custom` analyzer accepts the following parameters: `type` -: Analyzer type. Accepts [built-in analyzer types](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md). For custom analyzers, use `custom` or omit this parameter. +: Analyzer type. Accepts [built-in analyzer types](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md). For custom analyzers, use `custom` or omit this parameter. `tokenizer` -: A built-in or customised [tokenizer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/tokenizer-reference.md). (Required) +: A built-in or customised [tokenizer](elasticsearch://reference/data-analysis/text-analysis/tokenizer-reference.md). (Required) `char_filter` -: An optional array of built-in or customised [character filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/character-filter-reference.md). +: An optional array of built-in or customised [character filters](elasticsearch://reference/data-analysis/text-analysis/character-filter-reference.md). `filter` -: An optional array of built-in or customised [token filters](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md). +: An optional array of built-in or customised [token filters](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md). `position_increment_gap` -: When indexing an array of text values, Elasticsearch inserts a fake "gap" between the last term of one value and the first term of the next value to ensure that a phrase query doesn’t match two terms from different array elements. Defaults to `100`. See [`position_increment_gap`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/position-increment-gap.md) for more. 
+: When indexing an array of text values, Elasticsearch inserts a fake "gap" between the last term of one value and the first term of the next value to ensure that a phrase query doesn’t match two terms from different array elements. Defaults to `100`. See [`position_increment_gap`](elasticsearch://reference/elasticsearch/mapping-reference/position-increment-gap.md) for more. -## Example configuration [_example_configuration] +## Example configuration [_example_configuration] Here is an example that combines the following: Character Filter -: * [HTML Strip Character Filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-htmlstrip-charfilter.md) +: * [HTML Strip Character Filter](elasticsearch://reference/data-analysis/text-analysis/analysis-htmlstrip-charfilter.md) Tokenizer -: * [Standard Tokenizer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-standard-tokenizer.md) +: * [Standard Tokenizer](elasticsearch://reference/data-analysis/text-analysis/analysis-standard-tokenizer.md) Token Filters -: * [Lowercase Token Filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) -* [ASCII-Folding Token Filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-asciifolding-tokenfilter.md) +: * [Lowercase Token Filter](elasticsearch://reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) +* [ASCII-Folding Token Filter](elasticsearch://reference/data-analysis/text-analysis/analysis-asciifolding-tokenfilter.md) ```console @@ -95,16 +95,16 @@ The previous example used tokenizer, token filters, and character filters with t Here is a more complicated example that combines the following: Character Filter -: * [Mapping Character Filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-mapping-charfilter.md), configured to replace `:)` with `_happy_` and `:(` with `_sad_` +: * [Mapping Character Filter](elasticsearch://reference/data-analysis/text-analysis/analysis-mapping-charfilter.md), configured to replace `:)` with `_happy_` and `:(` with `_sad_` Tokenizer -: * [Pattern Tokenizer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-pattern-tokenizer.md), configured to split on punctuation characters +: * [Pattern Tokenizer](elasticsearch://reference/data-analysis/text-analysis/analysis-pattern-tokenizer.md), configured to split on punctuation characters Token Filters -: * [Lowercase Token Filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) -* [Stop Token Filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md), configured to use the pre-defined list of English stop words +: * [Lowercase Token Filter](elasticsearch://reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) +* [Stop Token Filter](elasticsearch://reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md), configured to use the pre-defined list of English stop words Here is an example: diff --git a/manage-data/data-store/text-analysis/index-search-analysis.md b/manage-data/data-store/text-analysis/index-search-analysis.md index 9c023e5e8..e7834e1d0 100644 --- a/manage-data/data-store/text-analysis/index-search-analysis.md +++ b/manage-data/data-store/text-analysis/index-search-analysis.md @@ -11,10 +11,10 @@ 
applies_to: Text analysis occurs at two times: Index time -: When a document is indexed, any [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) field values are analyzed. +: When a document is indexed, any [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) field values are analyzed. Search time -: When running a [full-text search](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/full-text-queries.md) on a `text` field, the query string (the text the user is searching for) is analyzed. Search time is also called *query time*. +: When running a [full-text search](elasticsearch://reference/query-languages/full-text-queries.md) on a `text` field, the query string (the text the user is searching for) is analyzed. Search time is also called *query time*. For more details on text analysis at search time, refer to [Text analysis during search](/solutions/search/full-text/text-analysis-during-search.md). diff --git a/manage-data/data-store/text-analysis/specify-an-analyzer.md b/manage-data/data-store/text-analysis/specify-an-analyzer.md index aa9a08375..44553bf8d 100644 --- a/manage-data/data-store/text-analysis/specify-an-analyzer.md +++ b/manage-data/data-store/text-analysis/specify-an-analyzer.md @@ -31,15 +31,15 @@ If you don’t typically create mappings for your indices, you can use [index te {{es}} determines which index analyzer to use by checking the following parameters in order: -1. The [`analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter for the field. See [Specify the analyzer for a field](#specify-index-field-analyzer). +1. The [`analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter for the field. See [Specify the analyzer for a field](#specify-index-field-analyzer). 2. The `analysis.analyzer.default` index setting. See [Specify the default analyzer for an index](#specify-index-time-default-analyzer). -If none of these parameters are specified, the [`standard` analyzer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-standard-analyzer.md) is used. +If none of these parameters are specified, the [`standard` analyzer](elasticsearch://reference/data-analysis/text-analysis/analysis-standard-analyzer.md) is used. ## Specify the analyzer for a field [specify-index-field-analyzer] -When mapping an index, you can use the [`analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter to specify an analyzer for each `text` field. +When mapping an index, you can use the [`analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter to specify an analyzer for each `text` field. The following [create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create) request sets the `whitespace` analyzer as the analyzer for the `title` field. @@ -92,19 +92,19 @@ If you choose to specify a separate search analyzer, we recommend you thoroughly At search time, {{es}} determines which analyzer to use by checking the following parameters in order: -1. The [`analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) parameter in the search query. See [Specify the search analyzer for a query](#specify-search-query-analyzer). -2. 
The [`search_analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/search-analyzer.md) mapping parameter for the field. See [Specify the search analyzer for a field](#specify-search-field-analyzer).
+1. The [`analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/analyzer.md) parameter in the search query. See [Specify the search analyzer for a query](#specify-search-query-analyzer).
+2. The [`search_analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/search-analyzer.md) mapping parameter for the field. See [Specify the search analyzer for a field](#specify-search-field-analyzer).
3. The `analysis.analyzer.default_search` index setting. See [Specify the default search analyzer for an index](#specify-search-default-analyzer).
-4. The [`analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter for the field. See [Specify the analyzer for a field](#specify-index-field-analyzer).
+4. The [`analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter for the field. See [Specify the analyzer for a field](#specify-index-field-analyzer).

-If none of these parameters are specified, the [`standard` analyzer](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-standard-analyzer.md) is used.
+If none of these parameters are specified, the [`standard` analyzer](elasticsearch://reference/data-analysis/text-analysis/analysis-standard-analyzer.md) is used.

## Specify the search analyzer for a query [specify-search-query-analyzer]

-When writing a [full-text query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/full-text-queries.md), you can use the `analyzer` parameter to specify a search analyzer. If provided, this overrides any other search analyzers.
+When writing a [full-text query](elasticsearch://reference/query-languages/full-text-queries.md), you can use the `analyzer` parameter to specify a search analyzer. If provided, this overrides any other search analyzers.

-The following [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) request sets the `stop` analyzer as the search analyzer for a [`match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md) query.
+The following [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) request sets the `stop` analyzer as the search analyzer for a [`match`](elasticsearch://reference/query-languages/query-dsl-match-query.md) query.

```console
GET my-index-000001/_search
@@ -123,7 +123,7 @@ GET my-index-000001/_search

## Specify the search analyzer for a field [specify-search-field-analyzer]

-When mapping an index, you can use the [`search_analyzer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/analyzer.md) mapping parameter to specify a search analyzer for each `text` field.
+When mapping an index, you can use the [`search_analyzer`](elasticsearch://reference/elasticsearch/mapping-reference/search-analyzer.md) mapping parameter to specify a search analyzer for each `text` field.

If a search analyzer is provided, the index analyzer must also be specified using the `analyzer` parameter.
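For example, the following sketch (mirroring the `title` field used earlier on this page; the analyzer choices are illustrative) sets both parameters together:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "whitespace",
        "search_analyzer": "simple"
      }
    }
  }
}
```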
diff --git a/manage-data/data-store/text-analysis/stemming.md b/manage-data/data-store/text-analysis/stemming.md index 88ab114ca..d00720aed 100644 --- a/manage-data/data-store/text-analysis/stemming.md +++ b/manage-data/data-store/text-analysis/stemming.md @@ -44,10 +44,10 @@ However, most algorithmic stemmers only alter the existing text of a word. This The following token filters use algorithmic stemming: -* [`stemmer`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-stemmer-tokenfilter.md), which provides algorithmic stemming for several languages, some with additional variants. -* [`kstem`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-kstem-tokenfilter.md), a stemmer for English that combines algorithmic stemming with a built-in dictionary. -* [`porter_stem`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-porterstem-tokenfilter.md), our recommended algorithmic stemmer for English. -* [`snowball`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-snowball-tokenfilter.md), which uses [Snowball](https://snowballstem.org/)-based stemming rules for several languages. +* [`stemmer`](elasticsearch://reference/data-analysis/text-analysis/analysis-stemmer-tokenfilter.md), which provides algorithmic stemming for several languages, some with additional variants. +* [`kstem`](elasticsearch://reference/data-analysis/text-analysis/analysis-kstem-tokenfilter.md), a stemmer for English that combines algorithmic stemming with a built-in dictionary. +* [`porter_stem`](elasticsearch://reference/data-analysis/text-analysis/analysis-porterstem-tokenfilter.md), our recommended algorithmic stemmer for English. +* [`snowball`](elasticsearch://reference/data-analysis/text-analysis/analysis-snowball-tokenfilter.md), which uses [Snowball](https://snowballstem.org/)-based stemming rules for several languages. ## Dictionary stemmers [dictionary-stemmers] @@ -68,10 +68,10 @@ In practice, algorithmic stemmers typically outperform dictionary stemmers. This * **Dictionary quality**
A dictionary stemmer is only as good as its dictionary. To work well, these dictionaries must include a significant number of words, be updated regularly, and change with language trends. Often, by the time a dictionary has been made available, it’s incomplete and some of its entries are already outdated. * **Size and performance**
Dictionary stemmers must load all words, prefixes, and suffixes from their dictionary into memory. This can use a significant amount of RAM. Low-quality dictionaries may also be less efficient with prefix and suffix removal, which can slow the stemming process significantly.

-You can use the [`hunspell`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-hunspell-tokenfilter.md) token filter to perform dictionary stemming.
+You can use the [`hunspell`](elasticsearch://reference/data-analysis/text-analysis/analysis-hunspell-tokenfilter.md) token filter to perform dictionary stemming.

-::::{tip} 
-If available, we recommend trying an algorithmic stemmer for your language before using the [`hunspell`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-hunspell-tokenfilter.md) token filter.
+::::{tip}
+If available, we recommend trying an algorithmic stemmer for your language before using the [`hunspell`](elasticsearch://reference/data-analysis/text-analysis/analysis-hunspell-tokenfilter.md) token filter.
::::

@@ -83,10 +83,10 @@ Sometimes stemming can produce shared root words that are spelled similarly but

To prevent this and better control stemming, you can use the following token filters:

-* [`stemmer_override`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-stemmer-override-tokenfilter.md), which lets you define rules for stemming specific tokens.
-* [`keyword_marker`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-keyword-marker-tokenfilter.md), which marks specified tokens as keywords. Keyword tokens are not stemmed by subsequent stemmer token filters.
-* [`conditional`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-condition-tokenfilter.md), which can be used to mark tokens as keywords, similar to the `keyword_marker` filter.
+* [`stemmer_override`](elasticsearch://reference/data-analysis/text-analysis/analysis-stemmer-override-tokenfilter.md), which lets you define rules for stemming specific tokens.
+* [`keyword_marker`](elasticsearch://reference/data-analysis/text-analysis/analysis-keyword-marker-tokenfilter.md), which marks specified tokens as keywords. Keyword tokens are not stemmed by subsequent stemmer token filters.
+* [`conditional`](elasticsearch://reference/data-analysis/text-analysis/analysis-condition-tokenfilter.md), which can be used to mark tokens as keywords, similar to the `keyword_marker` filter.

-For built-in [language analyzers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-lang-analyzer.md), you also can use the [`stem_exclusion`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-lang-analyzer.md#_excluding_words_from_stemming) parameter to specify a list of words that won’t be stemmed.
+For built-in [language analyzers](elasticsearch://reference/data-analysis/text-analysis/analysis-lang-analyzer.md), you can also use the [`stem_exclusion`](elasticsearch://reference/data-analysis/text-analysis/analysis-lang-analyzer.md#_excluding_words_from_stemming) parameter to specify a list of words that won’t be stemmed.
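As a brief, hedged sketch (the `protect_words` filter and `my_stemming_analyzer` names are hypothetical), placing a `keyword_marker` filter before a stemmer protects the listed tokens from stemming:

```console
PUT my-index-000001
{
  "settings": {
    "analysis": {
      "filter": {
        "protect_words": {
          "type": "keyword_marker",
          "keywords": ["skies"]
        }
      },
      "analyzer": {
        "my_stemming_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "protect_words", "stemmer"]
        }
      }
    }
  }
}
```

Filter order matters here: `keyword_marker` must run before `stemmer` for the protection to take effect.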
diff --git a/manage-data/data-store/text-analysis/token-graphs.md b/manage-data/data-store/text-analysis/token-graphs.md index 9212a0fdc..02d140f08 100644 --- a/manage-data/data-store/text-analysis/token-graphs.md +++ b/manage-data/data-store/text-analysis/token-graphs.md @@ -36,10 +36,10 @@ Some token filters can add tokens that span multiple positions. These can includ However, only some token filters, known as *graph token filters*, accurately record the `positionLength` for multi-position tokens. These filters include: -* [`synonym_graph`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-graph-tokenfilter.md) -* [`word_delimiter_graph`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-word-delimiter-graph-tokenfilter.md) +* [`synonym_graph`](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-graph-tokenfilter.md) +* [`word_delimiter_graph`](elasticsearch://reference/data-analysis/text-analysis/analysis-word-delimiter-graph-tokenfilter.md) -Some tokenizers, such as the [`nori_tokenizer`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/analysis-nori-tokenizer.md), also accurately decompose compound tokens into multi-position tokens. +Some tokenizers, such as the [`nori_tokenizer`](elasticsearch://reference/elasticsearch-plugins/analysis-nori-tokenizer.md), also accurately decompose compound tokens into multi-position tokens. In the following graph, `domain name system` and its synonym, `dns`, both have a position of `0`. However, `dns` has a `positionLength` of `3`. Other tokens in the graph have a default `positionLength` of `1`. @@ -51,7 +51,7 @@ In the following graph, `domain name system` and its synonym, `dns`, both have a [Indexing](index-search-analysis.md) ignores the `positionLength` attribute and does not support token graphs containing multi-position tokens. -However, queries, such as the [`match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md) or [`match_phrase`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query-phrase.md) query, can use these graphs to generate multiple sub-queries from a single query string. +However, queries, such as the [`match`](elasticsearch://reference/query-languages/query-dsl-match-query.md) or [`match_phrase`](elasticsearch://reference/query-languages/query-dsl-match-query-phrase.md) query, can use these graphs to generate multiple sub-queries from a single query string. :::::{dropdown} Example A user runs a search for the following phrase using the `match_phrase` query: @@ -81,8 +81,8 @@ This means the query matches documents containing either `dns is fragile` *or* ` The following token filters can add tokens that span multiple positions but only record a default `positionLength` of `1`: -* [`synonym`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) -* [`word_delimiter`](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-word-delimiter-tokenfilter.md) +* [`synonym`](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) +* [`word_delimiter`](elasticsearch://reference/data-analysis/text-analysis/analysis-word-delimiter-tokenfilter.md) This means these filters will produce invalid token graphs for streams containing such tokens. 
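As a hedged sketch (the `dns_synonyms` and `dns_search_time` names are hypothetical), a graph token filter such as `synonym_graph` is typically applied in a search-time analyzer so that multi-word synonyms like `domain name system` produce a valid token graph:

```console
PUT my-index-000001
{
  "settings": {
    "analysis": {
      "filter": {
        "dns_synonyms": {
          "type": "synonym_graph",
          "synonyms": ["dns, domain name system"]
        }
      },
      "analyzer": {
        "dns_search_time": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "dns_synonyms"]
        }
      }
    }
  }
}
```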
diff --git a/manage-data/ingest.md b/manage-data/ingest.md index 983944963..28e2408f8 100644 --- a/manage-data/ingest.md +++ b/manage-data/ingest.md @@ -30,7 +30,7 @@ Elastic offer tools designed to ingest specific types of general content. The co * To index **documents** directly into {{es}}, use the {{es}} [document APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-document). * To send **application data** directly to {{es}}, use an [{{es}} language client](https://www.elastic.co/guide/en/elasticsearch/client/index.html). * To index **web page content**, use the Elastic [web crawler](https://www.elastic.co/web-crawler). -* To sync **data from third-party sources**, use [connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md). A connector syncs content from an original data source to an {{es}} index. Using connectors you can create *searchable*, read-only replicas of your data sources. +* To sync **data from third-party sources**, use [connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md). A connector syncs content from an original data source to an {{es}} index. Using connectors you can create *searchable*, read-only replicas of your data sources. * To index **single files** for testing in a non-production environment, use the {{kib}} [file uploader](ingest/upload-data-files.md). If you would like to try things out before you add your own data, try using our [sample data](ingest/sample-data.md). diff --git a/manage-data/ingest/ingesting-data-for-elastic-solutions.md b/manage-data/ingest/ingesting-data-for-elastic-solutions.md index 3fc3627c3..11579471d 100644 --- a/manage-data/ingest/ingesting-data-for-elastic-solutions.md +++ b/manage-data/ingest/ingesting-data-for-elastic-solutions.md @@ -41,7 +41,7 @@ To use [Elastic Agent](https://www.elastic.co/guide/en/fleet/current) and [Elast * [{{es}} document APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-document) * [{{es}} language clients](https://www.elastic.co/guide/en/elasticsearch/client/index.html) * [Elastic web crawler](https://www.elastic.co/web-crawler) - * [Elastic connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) + * [Elastic connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) @@ -101,6 +101,6 @@ Bring your ideas and use {{es}} and the {{stack}} to store, search, and visualiz * [{{es}} document APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-document) * [{{es}} language clients](https://www.elastic.co/guide/en/elasticsearch/client/index.html) * [Elastic web crawler](https://www.elastic.co/web-crawler) - * [Elastic connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) + * [Elastic connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) * [Tutorial: Get started with vector search and generative AI](https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-general-purpose.html) diff --git a/manage-data/ingest/tools.md b/manage-data/ingest/tools.md index a47901d32..6c657e11d 100644 --- a/manage-data/ingest/tools.md +++ b/manage-data/ingest/tools.md @@ -53,5 +53,5 @@ Depending on the type of data you want to ingest, you have a number of methods a | Application logs | Ingest application logs using Filebeat, {{agent}}, or the APM agent, or reformat application logs into 
Elastic Common Schema (ECS) logs and then ingest them using Filebeat or {{agent}}. | [Stream application logs](/solutions/observability/logs/stream-application-logs.md)
[ECS formatted application logs](/solutions/observability/logs/ecs-formatted-application-logs.md) | | Elastic Serverless forwarder for AWS | Ship logs from your AWS environment to cloud-hosted, self-managed Elastic environments, or {{ls}}. | [Elastic Serverless Forwarder](elastic-serverless-forwarder://reference/index.md) | | Connectors | Use connectors to extract data from an original data source and sync it to an {{es}} index. | [Ingest content with Elastic connectors -](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md)
[Connector clients](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) | +](elasticsearch://reference/ingestion-tools/search-connectors/index.md)
[Connector clients](elasticsearch://reference/ingestion-tools/search-connectors/index.md) | | Web crawler | Discover, extract, and index searchable content from websites and knowledge bases using the web crawler. | [Elastic Open Web Crawler](https://github.com/elastic/crawler#readme) | \ No newline at end of file diff --git a/manage-data/ingest/transform-enrich/data-enrichment.md b/manage-data/ingest/transform-enrich/data-enrichment.md index 3b56249d1..5c277f11d 100644 --- a/manage-data/ingest/transform-enrich/data-enrichment.md +++ b/manage-data/ingest/transform-enrich/data-enrichment.md @@ -9,7 +9,7 @@ applies_to: # Data enrichment -You can use the [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) to add data from your existing indices to incoming documents during ingest. +You can use the [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) to add data from your existing indices to incoming documents during ingest. For example, you can use the enrich processor to: @@ -75,7 +75,7 @@ Use the **Enrich Policies** view to add data from your existing indices to incom * The source indices that store enrich data as documents * The fields from the source indices used to match incoming documents * The enrich fields containing enrich data from the source indices that you want to add to incoming documents -* An optional [query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-all-query.md). +* An optional [query](elasticsearch://reference/query-languages/query-dsl-match-all-query.md). :::{image} ../../../images/elasticsearch-reference-management-enrich-policies.png :alt: Enrich policies diff --git a/manage-data/ingest/transform-enrich/example-enrich-data-based-on-exact-values.md b/manage-data/ingest/transform-enrich/example-enrich-data-based-on-exact-values.md index b66fb7e52..b10658a74 100644 --- a/manage-data/ingest/transform-enrich/example-enrich-data-based-on-exact-values.md +++ b/manage-data/ingest/transform-enrich/example-enrich-data-based-on-exact-values.md @@ -8,7 +8,7 @@ applies_to: # Example: Enrich your data based on exact values [match-enrich-policy-type] -`match` [enrich policies](data-enrichment.md#enrich-policy) match enrich data to incoming documents based on an exact value, such as a email address or ID, using a [`term` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md). +`match` [enrich policies](data-enrichment.md#enrich-policy) match enrich data to incoming documents based on an exact value, such as an email address or ID, using a [`term` query](elasticsearch://reference/query-languages/query-dsl-term-query.md). The following example creates a `match` enrich policy that adds user name and contact information to incoming documents based on an email address. It then adds the `match` enrich policy to a processor in an ingest pipeline. @@ -53,7 +53,7 @@ Use the [execute enrich policy API](https://www.elastic.co/docs/api/doc/elastics POST /_enrich/policy/users-policy/_execute?wait_for_completion=false ``` -Use the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) to create an ingest pipeline. 
In the pipeline, add an [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) that includes: +Use the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) to create an ingest pipeline. In the pipeline, add an [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) that includes: * Your enrich policy. * The `field` of incoming documents used to match documents from the enrich index. diff --git a/manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md b/manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md index 19cf704cd..d386a2673 100644 --- a/manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md +++ b/manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md @@ -8,7 +8,7 @@ applies_to: # Example: Enrich your data based on geolocation [geo-match-enrich-policy-type] -`geo_match` [enrich policies](data-enrichment.md#enrich-policy) match enrich data to incoming documents based on a geographic location, using a [`geo_shape` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-geo-shape-query.md). +`geo_match` [enrich policies](data-enrichment.md#enrich-policy) match enrich data to incoming documents based on a geographic location, using a [`geo_shape` query](elasticsearch://reference/query-languages/query-dsl-geo-shape-query.md). The following example creates a `geo_match` enrich policy that adds postal codes to incoming documents based on a set of coordinates. It then adds the `geo_match` enrich policy to a processor in an ingest pipeline. @@ -66,12 +66,12 @@ Use the [execute enrich policy API](https://www.elastic.co/docs/api/doc/elastics POST /_enrich/policy/postal_policy/_execute?wait_for_completion=false ``` -Use the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) to create an ingest pipeline. In the pipeline, add an [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) that includes: +Use the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) to create an ingest pipeline. In the pipeline, add an [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) that includes: * Your enrich policy. * The `field` of incoming documents used to match the geoshape of documents from the enrich index. * The `target_field` used to store appended enrich data for incoming documents. This field contains the `match_field` and `enrich_fields` specified in your enrich policy. -* The `shape_relation`, which indicates how the processor matches geoshapes in incoming documents to geoshapes in documents from the enrich index. See [Spatial Relations](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-shape-query.md#_spatial_relations) for valid options and more information. +* The `shape_relation`, which indicates how the processor matches geoshapes in incoming documents to geoshapes in documents from the enrich index. See [Spatial Relations](elasticsearch://reference/query-languages/query-dsl-shape-query.md#_spatial_relations) for valid options and more information. 
```console PUT /_ingest/pipeline/postal_lookup diff --git a/manage-data/ingest/transform-enrich/example-enrich-data-by-matching-value-to-range.md b/manage-data/ingest/transform-enrich/example-enrich-data-by-matching-value-to-range.md index 03fdd90da..1098ba802 100644 --- a/manage-data/ingest/transform-enrich/example-enrich-data-by-matching-value-to-range.md +++ b/manage-data/ingest/transform-enrich/example-enrich-data-by-matching-value-to-range.md @@ -8,7 +8,7 @@ applies_to: # Example: Enrich your data by matching a value to a range [range-enrich-policy-type] -A `range` [enrich policy](data-enrichment.md#enrich-policy) uses a [`term` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md) to match a number, date, or IP address in incoming documents to a range of the same type in the enrich index. Matching a range to a range is not supported. +A `range` [enrich policy](data-enrichment.md#enrich-policy) uses a [`term` query](elasticsearch://reference/query-languages/query-dsl-term-query.md) to match a number, date, or IP address in incoming documents to a range of the same type in the enrich index. Matching a range to a range is not supported. The following example creates a `range` enrich policy that adds a descriptive network name and responsible department to incoming documents based on an IP address. It then adds the enrich policy to a processor in an ingest pipeline. @@ -63,7 +63,7 @@ Use the [execute enrich policy API](https://www.elastic.co/docs/api/doc/elastics POST /_enrich/policy/networks-policy/_execute?wait_for_completion=false ``` -Use the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) to create an ingest pipeline. In the pipeline, add an [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) that includes: +Use the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) to create an ingest pipeline. In the pipeline, add an [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) that includes: * Your enrich policy. * The `field` of incoming documents used to match documents from the enrich index. diff --git a/manage-data/ingest/transform-enrich/example-parse-logs.md b/manage-data/ingest/transform-enrich/example-parse-logs.md index 513ef9464..d38317bce 100644 --- a/manage-data/ingest/transform-enrich/example-parse-logs.md +++ b/manage-data/ingest/transform-enrich/example-parse-logs.md @@ -31,7 +31,7 @@ These logs contain a timestamp, IP address, and user agent. You want to give the 2. Click **Create pipeline > New pipeline**. 3. Set **Name** to `my-pipeline` and optionally add a description for the pipeline. -4. Add a [grok processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/grok-processor.md) to parse the log message: +4. Add a [grok processor](elasticsearch://reference/ingestion-tools/enrich-processor/grok-processor.md) to parse the log message: 1. Click **Add a processor** and select the **Grok** processor type. 2. Set **Field** to `message` and **Patterns** to the following [grok pattern](../../../explore-analyze/scripting/grok.md): @@ -47,9 +47,9 @@ These logs contain a timestamp, IP address, and user agent. 
You want to give the | Processor type | Field | Additional options | Description | | --- | --- | --- | --- | - | [**Date**](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/date-processor.md) | `@timestamp` | **Formats**: `dd/MMM/yyyy:HH:mm:ss Z` | `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'` | - | [**GeoIP**](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) | `source.ip` | **Target field**: `source.geo` | `Add 'source.geo' GeoIP data for 'source.ip'` | - | [**User agent**](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/user-agent-processor.md) | `user_agent` | | `Extract fields from 'user_agent'` | + | [**Date**](elasticsearch://reference/ingestion-tools/enrich-processor/date-processor.md) | `@timestamp` | **Formats**: `dd/MMM/yyyy:HH:mm:ss Z` | `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'` | + | [**GeoIP**](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) | `source.ip` | **Target field**: `source.geo` | `Add 'source.geo' GeoIP data for 'source.ip'` | + | [**User agent**](elasticsearch://reference/ingestion-tools/enrich-processor/user-agent-processor.md) | `user_agent` | | `Extract fields from 'user_agent'` | Your form should look similar to this: @@ -135,7 +135,7 @@ These logs contain a timestamp, IP address, and user agent. You want to give the } ``` -12. To verify, search the data stream to retrieve the document. The following search uses [`filter_path`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#common-options-response-filtering) to return only the [document source](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md). +12. To verify, search the data stream to retrieve the document. The following search uses [`filter_path`](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#common-options-response-filtering) to return only the [document source](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md). ```console GET my-data-stream/_search?filter_path=hits.hits._source diff --git a/manage-data/ingest/transform-enrich/ingest-pipelines.md b/manage-data/ingest/transform-enrich/ingest-pipelines.md index 67d1dc244..00c99d9da 100644 --- a/manage-data/ingest/transform-enrich/ingest-pipelines.md +++ b/manage-data/ingest/transform-enrich/ingest-pipelines.md @@ -10,7 +10,7 @@ applies_to: {{es}} ingest pipelines let you perform common transformations on your data before indexing. For example, you can use pipelines to remove fields, extract values from text, and enrich your data. -A pipeline consists of a series of configurable tasks called [processors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/index.md). Each processor runs sequentially, making specific changes to incoming documents. After the processors have run, {{es}} adds the transformed documents to your data stream or index. +A pipeline consists of a series of configurable tasks called [processors](elasticsearch://reference/ingestion-tools/enrich-processor/index.md). Each processor runs sequentially, making specific changes to incoming documents. After the processors have run, {{es}} adds the transformed documents to your data stream or index. 
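You can watch this sequential behavior without creating anything by dry-running processors against a sample document with the simulate pipeline API. The pipeline and document below are placeholders for illustration:

```console
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      { "set": { "field": "status", "value": "PENDING" } },
      { "lowercase": { "field": "status" } }
    ]
  },
  "docs": [
    { "_source": { "message": "example log line" } }
  ]
}
```

In the response, `set` first adds `status: PENDING`, then `lowercase` rewrites it to `pending`, reflecting the order in which the processors are defined.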
:::{image} ../../../images/elasticsearch-reference-ingest-process.svg :alt: Ingest pipeline diagram @@ -49,7 +49,7 @@ The **New pipeline from CSV** option lets you use a CSV to create an ingest pipe :::: -You can also use the [ingest APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ingest) to create and manage pipelines. The following [create pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) request creates a pipeline containing two [`set`](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/set-processor.md) processors followed by a [`lowercase`](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/lowercase-processor.md) processor. The processors run sequentially in the order specified. +You can also use the [ingest APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-ingest) to create and manage pipelines. The following [create pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline) request creates a pipeline containing two [`set`](elasticsearch://reference/ingestion-tools/enrich-processor/set-processor.md) processors followed by a [`lowercase`](elasticsearch://reference/ingestion-tools/enrich-processor/lowercase-processor.md) processor. The processors run sequentially in the order specified. ```console PUT _ingest/pipeline/my-pipeline @@ -228,12 +228,12 @@ POST _reindex ## Set a default pipeline [set-default-pipeline] -Use the [`index.default_pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-default-pipeline) index setting to set a default pipeline. {{es}} applies this pipeline to indexing requests if no `pipeline` parameter is specified. +Use the [`index.default_pipeline`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-default-pipeline) index setting to set a default pipeline. {{es}} applies this pipeline to indexing requests if no `pipeline` parameter is specified. ## Set a final pipeline [set-final-pipeline] -Use the [`index.final_pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-final-pipeline) index setting to set a final pipeline. {{es}} applies this pipeline after the request or default pipeline, even if neither is specified. +Use the [`index.final_pipeline`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-final-pipeline) index setting to set a final pipeline. {{es}} applies this pipeline after the request or default pipeline, even if neither is specified. ## Pipelines for {{beats}} [pipelines-for-beats] @@ -270,7 +270,7 @@ $$$pipeline-custom-logs-index-template$$$ } ``` -2. Create an [index template](../../data-store/templates.md) that includes your pipeline in the [`index.default_pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-default-pipeline) or [`index.final_pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-final-pipeline) index setting. Ensure the template is [data stream enabled](../../data-store/data-streams/set-up-data-stream.md#create-index-template). The template’s index pattern should match `logs--*`. +2. 
Create an [index template](../../data-store/templates.md) that includes your pipeline in the [`index.default_pipeline`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-default-pipeline) or [`index.final_pipeline`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-final-pipeline) index setting. Ensure the template is [data stream enabled](../../data-store/data-streams/set-up-data-stream.md#create-index-template). The template’s index pattern should match `logs--*`. You can create this template using {{kib}}'s [**Index Management**](../../lifecycle/index-lifecycle-management/index-management-in-kibana.md#manage-index-templates) feature or the [create index template API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-index-template). @@ -345,12 +345,12 @@ $$$pipeline-custom-logs-configuration$$$ **{{agent}} standalone** -If you run {{agent}} standalone, you can apply pipelines using an [index template](../../data-store/templates.md) that includes the [`index.default_pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-default-pipeline) or [`index.final_pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-final-pipeline) index setting. Alternatively, you can specify the `pipeline` policy setting in your `elastic-agent.yml` configuration. See [Install standalone {{agent}}s](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/install-standalone-elastic-agent.md). +If you run {{agent}} standalone, you can apply pipelines using an [index template](../../data-store/templates.md) that includes the [`index.default_pipeline`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-default-pipeline) or [`index.final_pipeline`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-final-pipeline) index setting. Alternatively, you can specify the `pipeline` policy setting in your `elastic-agent.yml` configuration. See [Install standalone {{agent}}s](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/install-standalone-elastic-agent.md). ## Pipelines for search indices [pipelines-in-enterprise-search] -When you create Elasticsearch indices for search use cases, for example, using the [web crawler^](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) or [connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md), these indices are automatically set up with specific ingest pipelines. These processors help optimize your content for search. See [*Ingest pipelines in Search*](../../../solutions/search/ingest-for-search.md) for more information. +When you create Elasticsearch indices for search use cases, for example, using the [web crawler^](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) or [connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md), these indices are automatically set up with specific ingest pipelines. These processors help optimize your content for search. See [*Ingest pipelines in Search*](../../../solutions/search/ingest-for-search.md) for more information. ## Access source fields in a processor [access-source-fields] @@ -390,7 +390,7 @@ PUT _ingest/pipeline/my-pipeline Use dot notation to access object fields. 
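As a minimal sketch (the pipeline name and fields are hypothetical), a processor's `field` option accepts this dot notation directly:

```console
PUT _ingest/pipeline/dot-notation-example
{
  "processors": [
    {
      "trim": {
        "field": "network.name"
      }
    }
  ]
}
```

Here the `trim` processor reads and writes the `name` property nested under the `network` object.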
::::{important} -If your document contains flattened objects, use the [`dot_expander`](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dot-expand-processor.md) processor to expand them first. Other ingest processors cannot access flattened objects. +If your document contains flattened objects, use the [`dot_expander`](elasticsearch://reference/ingestion-tools/enrich-processor/dot-expand-processor.md) processor to expand them first. Other ingest processors cannot access flattened objects. :::: @@ -636,10 +636,10 @@ PUT _ingest/pipeline/my-pipeline ## Conditionally run a processor [conditionally-run-processor] -Each processor supports an optional `if` condition, written as a [Painless script](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless.md). If provided, the processor only runs when the `if` condition is `true`. +Each processor supports an optional `if` condition, written as a [Painless script](elasticsearch://reference/scripting-languages/painless/painless.md). If provided, the processor only runs when the `if` condition is `true`. ::::{important} -`if` condition scripts run in Painless’s [ingest processor context](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-ingest-processor-context.md). In `if` conditions, `ctx` values are read-only. +`if` condition scripts run in Painless’s [ingest processor context](elasticsearch://reference/scripting-languages/painless/painless-ingest-processor-context.md). In `if` conditions, `ctx` values are read-only. :::: @@ -657,7 +657,7 @@ PUT _ingest/pipeline/my-pipeline } ``` -If the [`script.painless.regex.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#script-painless-regex-enabled) cluster setting is enabled, you can use regular expressions in your `if` condition scripts. For supported syntax, see [Painless regular expressions](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-regexes.md). +If the [`script.painless.regex.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#script-painless-regex-enabled) cluster setting is enabled, you can use regular expressions in your `if` condition scripts. For supported syntax, see [Painless regular expressions](elasticsearch://reference/scripting-languages/painless/painless-regexes.md). ::::{tip} If possible, avoid using regular expressions. Expensive regular expressions can slow indexing speeds. @@ -745,7 +745,7 @@ PUT _ingest/pipeline/my-pipeline } ``` -Incoming documents often contain object fields. If a processor script attempts to access a field whose parent object does not exist, {{es}} returns a `NullPointerException`. To avoid these exceptions, use [null safe operators](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-operators-reference.md#null-safe-operator), such as `?.`, and write your scripts to be null safe. +Incoming documents often contain object fields. If a processor script attempts to access a field whose parent object does not exist, {{es}} returns a `NullPointerException`. To avoid these exceptions, use [null safe operators](elasticsearch://reference/scripting-languages/painless/painless-operators-reference.md#null-safe-operator), such as `?.`, and write your scripts to be null safe. For example, `ctx.network?.name.equalsIgnoreCase('Guest')` is not null safe. 
`ctx.network?.name` can return null. Rewrite the script as `'Guest'.equalsIgnoreCase(ctx.network?.name)`, which is null safe because `Guest` is always non-null. @@ -768,7 +768,7 @@ PUT _ingest/pipeline/my-pipeline ## Conditionally apply pipelines [conditionally-apply-pipelines] -Combine an `if` condition with the [`pipeline`](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/pipeline-processor.md) processor to apply other pipelines to documents based on your criteria. You can use this pipeline as the [default pipeline](ingest-pipelines.md#set-default-pipeline) in an [index template](../../data-store/templates.md) used to configure multiple data streams or indices. +Combine an `if` condition with the [`pipeline`](elasticsearch://reference/ingestion-tools/enrich-processor/pipeline-processor.md) processor to apply other pipelines to documents based on your criteria. You can use this pipeline as the [default pipeline](ingest-pipelines.md#set-default-pipeline) in an [index template](../../data-store/templates.md) used to configure multiple data streams or indices. ```console PUT _ingest/pipeline/one-pipeline-to-rule-them-all diff --git a/manage-data/ingest/transform-enrich/set-up-an-enrich-processor.md b/manage-data/ingest/transform-enrich/set-up-an-enrich-processor.md index ddf1c3d16..b6fe860f4 100644 --- a/manage-data/ingest/transform-enrich/set-up-an-enrich-processor.md +++ b/manage-data/ingest/transform-enrich/set-up-an-enrich-processor.md @@ -20,7 +20,7 @@ To set up an enrich processor, follow these steps: Once you have an enrich processor set up, you can [update your enrich data](#update-enrich-data) and [update your enrich policies](#update-enrich-policies). ::::{important} -The enrich processor performs several operations and may impact the speed of your ingest pipeline. We recommend [node roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md) co-locating ingest and data roles to minimize remote search operations. +The enrich processor performs several operations and may impact the speed of your ingest pipeline. We recommend co-locating ingest and data [node roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md) to minimize remote search operations. We strongly recommend testing and benchmarking your enrich processors before deploying them in production. @@ -68,7 +68,7 @@ Once the enrich policy is created, you need to execute it using the [execute enr The *enrich index* contains documents from the policy’s source indices. Enrich indices always begin with `.enrich-*`, are read-only, and are [force merged](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge). ::::{warning} -Enrich indices should only be used by the [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) or the [{{esql}} `ENRICH` command](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-enrich). Avoid using enrich indices for other purposes. +Enrich indices should only be used by the [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) or the [{{esql}} `ENRICH` command](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-enrich). Avoid using enrich indices for other purposes. 
:::: @@ -82,7 +82,7 @@ Once you have source indices, an enrich policy, and the related enrich index in :alt: enrich processor ::: -Define an [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) and add it to an ingest pipeline using the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline). +Define an [enrich processor](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) and add it to an ingest pipeline using the [create or update pipeline API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-put-pipeline). When defining the enrich processor, you must include at least the following: @@ -92,9 +92,9 @@ When defining the enrich processor, you must include at least the following: You also can use the `max_matches` option to set the number of enrich documents an incoming document can match. If set to the default of `1`, data is added to an incoming document’s target field as a JSON object. Otherwise, the data is added as an array. -See [Enrich](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) for a full list of configuration options. +See [Enrich](elasticsearch://reference/ingestion-tools/enrich-processor/enrich-processor.md) for a full list of configuration options. -You also can add other [processors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/index.md) to your ingest pipeline. +You also can add other [processors](elasticsearch://reference/ingestion-tools/enrich-processor/index.md) to your ingest pipeline. ## Ingest and enrich documents [ingest-enrich-docs] diff --git a/manage-data/lifecycle/data-stream.md b/manage-data/lifecycle/data-stream.md index 5839ac19c..ec0c2389e 100644 --- a/manage-data/lifecycle/data-stream.md +++ b/manage-data/lifecycle/data-stream.md @@ -21,28 +21,28 @@ To achieve that, it supports: A data stream lifecycle also supports downsampling the data stream backing indices. See [the downsampling example](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for more details. -## How does it work? [data-streams-lifecycle-how-it-works] +## How does it work? [data-streams-lifecycle-how-it-works] -In intervals configured by [`data_streams.lifecycle.poll_interval`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-poll-interval), {{es}} goes over each data stream and performs the following steps: +In intervals configured by [`data_streams.lifecycle.poll_interval`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-poll-interval), {{es}} goes over each data stream and performs the following steps: 1. Checks if the data stream has a data stream lifecycle configured, skipping any indices not part of a managed data stream. -2. Rolls over the write index of the data stream, if it fulfills the conditions defined by [`cluster.lifecycle.default.rollover`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#cluster-lifecycle-default-rollover). +2. 
Rolls over the write index of the data stream, if it fulfills the conditions defined by [`cluster.lifecycle.default.rollover`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#cluster-lifecycle-default-rollover). 3. After an index is not the write index anymore (i.e. the data stream has been rolled over), automatically tail merges the index. Data stream lifecycle executes a merge operation that only targets the long tail of small segments instead of the whole shard. As the segments are organised into tiers of exponential sizes, merging the long tail of small segments is only a fraction of the cost of force merging to a single segment. The small segments would usually hold the most recent data so tail merging will focus the merging resources on the higher-value data that is most likely to keep being queried. 4. If [downsampling](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) is configured it will execute all the configured downsampling rounds. -5. Applies retention to the remaining backing indices. This means deleting the backing indices whose `generation_time` is longer than the effective retention period (read more about the [effective retention calculation](data-stream/tutorial-data-stream-retention.md#effective-retention-calculation)). The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured in the [`index.lifecycle.origination_date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting. +5. Applies retention to the remaining backing indices. This means deleting the backing indices whose `generation_time` is longer than the effective retention period (read more about the [effective retention calculation](data-stream/tutorial-data-stream-retention.md#effective-retention-calculation)). The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured in the [`index.lifecycle.origination_date`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting. -::::{important} +::::{important} We use the `generation_time` instead of the creation time because this ensures that all data in the backing index have passed the retention period. As a result, the retention period is not the exact time data gets deleted, but the minimum time data will be stored. :::: -::::{note} -Steps `2-4` apply only to backing indices that are not already managed by {{ilm-init}}, meaning that these indices either do not have an {{ilm-init}} policy defined, or if they do, they have [`index.lifecycle.prefer_ilm`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) set to `false`. +::::{note} +Steps `2-4` apply only to backing indices that are not already managed by {{ilm-init}}, meaning that these indices either do not have an {{ilm-init}} policy defined, or if they do, they have [`index.lifecycle.prefer_ilm`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) set to `false`. 
:::: -## Configuring data stream lifecycle [data-stream-lifecycle-configuration] +## Configuring data stream lifecycle [data-stream-lifecycle-configuration] Since the lifecycle is configured on the data stream level, the process to configure a lifecycle on a new data stream and on an existing one differ. @@ -52,7 +52,7 @@ In the following sections, we will go through the following tutorials: * To update the lifecycle of an existing data stream you need to use the [data stream lifecycle APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-data-stream) to edit the lifecycle on the data stream itself (see [Tutorial: Update existing data stream](data-stream/tutorial-update-existing-data-stream.md)). * Migrate an existing {{ilm-init}} managed data stream to Data stream lifecycle using [Tutorial: Migrate ILM managed data stream to data stream lifecycle](data-stream/tutorial-migrate-ilm-managed-data-stream-to-data-stream-lifecycle.md). -::::{note} +::::{note} Updating the data stream lifecycle of an existing data stream is different from updating the settings or the mapping, because it is applied on the data stream level and not on the individual backing indices. :::: diff --git a/manage-data/lifecycle/data-stream/tutorial-data-stream-retention.md b/manage-data/lifecycle/data-stream/tutorial-data-stream-retention.md index 1d85d6e4b..8c0487ca3 100644 --- a/manage-data/lifecycle/data-stream/tutorial-data-stream-retention.md +++ b/manage-data/lifecycle/data-stream/tutorial-data-stream-retention.md @@ -53,8 +53,8 @@ Retention does not define the period that the data will be removed, but the mini We define 4 different types of retention: * The data stream retention, or `data_retention`, which is the retention configured on the data stream level. It can be set via an [index template](../../data-store/templates.md) for future data streams or via the [PUT data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for an existing data stream. When the data stream retention is not set, it implies that the data need to be kept forever. -* The global default retention, let’s call it `default_retention`, which is a retention configured via the cluster setting [`data_streams.lifecycle.retention.default`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-default) and will be applied to all data streams managed by data stream lifecycle that do not have `data_retention` configured. Effectively, it ensures that there will be no data streams keeping their data forever. This can be set via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). -* The global max retention, let’s call it `max_retention`, which is a retention configured via the cluster setting [`data_streams.lifecycle.retention.max`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-max) and will be applied to all data streams managed by data stream lifecycle. Effectively, it ensures that there will be no data streams whose retention will exceed this time period. This can be set via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). 
+* The global default retention, let’s call it `default_retention`, which is a retention configured via the cluster setting [`data_streams.lifecycle.retention.default`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-default) and will be applied to all data streams managed by data stream lifecycle that do not have `data_retention` configured. Effectively, it ensures that there will be no data streams keeping their data forever. This can be set via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). +* The global max retention, let’s call it `max_retention`, which is a retention configured via the cluster setting [`data_streams.lifecycle.retention.max`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-max) and will be applied to all data streams managed by data stream lifecycle. Effectively, it ensures that there will be no data streams whose retention will exceed this time period. This can be set via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). * The effective retention, or `effective_retention`, which is the retention applied at a data stream on a given moment. Effective retention cannot be set, it is derived by taking into account all the configured retention listed above and is calculated as it is described [here](#effective-retention-calculation). ::::{note} @@ -169,7 +169,7 @@ We see that it will remain the same with what the user configured: ## How is the effective retention applied? [effective-retention-application] -Retention is applied to the remaining backing indices of a data stream as the last step of [a data stream lifecycle run](../data-stream.md#data-streams-lifecycle-how-it-works). Data stream lifecycle will retrieve the backing indices whose `generation_time` is longer than the effective retention period and delete them. The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured in the [`index.lifecycle.origination_date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting. +Retention is applied to the remaining backing indices of a data stream as the last step of [a data stream lifecycle run](../data-stream.md#data-streams-lifecycle-how-it-works). Data stream lifecycle will retrieve the backing indices whose `generation_time` is longer than the effective retention period and delete them. The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured in the [`index.lifecycle.origination_date`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting. ::::{important} We use the `generation_time` instead of the creation time because this ensures that all data in the backing index have passed the retention period. As a result, the retention period is not the exact time data get deleted, but the minimum time data will be stored. 
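As a sketch of how the global default and max retention described above can be configured, both cluster settings take a time value through the update cluster settings API. The durations here are placeholders, not recommendations:

```console
PUT _cluster/settings
{
  "persistent": {
    "data_streams.lifecycle.retention.default": "7d",
    "data_streams.lifecycle.retention.max": "90d"
  }
}
```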
diff --git a/manage-data/lifecycle/data-stream/tutorial-migrate-ilm-managed-data-stream-to-data-stream-lifecycle.md b/manage-data/lifecycle/data-stream/tutorial-migrate-ilm-managed-data-stream-to-data-stream-lifecycle.md index a5b6b842c..0c64035a6 100644 --- a/manage-data/lifecycle/data-stream/tutorial-migrate-ilm-managed-data-stream-to-data-stream-lifecycle.md +++ b/manage-data/lifecycle/data-stream/tutorial-migrate-ilm-managed-data-stream-to-data-stream-lifecycle.md @@ -15,7 +15,7 @@ In this tutorial we’ll look at migrating an existing data stream from [Index L To migrate a data stream from {{ilm-init}} to data stream lifecycle we’ll have to execute two steps: -1. Update the index template that’s backing the data stream to set [prefer_ilm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) to `false`, and to configure data stream lifecycle. +1. Update the index template that’s backing the data stream to set [prefer_ilm](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) to `false`, and to configure data stream lifecycle. 2. Configure the data stream lifecycle for the *existing* data stream using the [lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle). For more details see the [migrate to data stream lifecycle](#migrate-from-ilm-to-dsl) section. @@ -127,11 +127,11 @@ Inspecting the response we’ll see that both backing indices are managed by {{i ``` 1. The name of the backing index. -2. For each backing index we display the value of the [prefer_ilm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) configuration which will indicate if {{ilm-init}} takes precedence over data stream lifecycle in case both systems are configured for an index. +2. For each backing index we display the value of the [prefer_ilm](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) configuration which will indicate if {{ilm-init}} takes precedence over data stream lifecycle in case both systems are configured for an index. 3. The {{ilm-init}} policy configured for this index. 4. The system that manages this index (possible values are "Index Lifecycle Management", "Data stream lifecycle", or "Unmanaged") 5. The system that will manage the next generation index (the new write index of this data stream, once the data stream is rolled over). The possible values are "Index Lifecycle Management", "Data stream lifecycle", or "Unmanaged". -6. The [prefer_ilm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) value configured in the index template that’s backing the data stream. This value will be configured for all the new backing indices. If it’s not configured in the index template the backing indices will receive the `true` default value ({{ilm-init}} takes precedence over data stream lifecycle by default as it’s currently richer in features). +6. The [prefer_ilm](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) value configured in the index template that’s backing the data stream. 
This value will be configured for all the new backing indices. If it’s not configured in the index template the backing indices will receive the `true` default value ({{ilm-init}} takes precedence over data stream lifecycle by default as it’s currently richer in features). 7. The {{ilm-init}} policy configured in the index template that’s backing this data stream (which will be configured on all the new backing indices, as long as it exists in the index template). @@ -140,7 +140,7 @@ Inspecting the response we’ll see that both backing indices are managed by {{i To migrate the `dsl-data-stream` to data stream lifecycle we’ll have to execute two steps: -1. Update the index template that’s backing the data stream to set [prefer_ilm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) to `false`, and to configure data stream lifecycle. +1. Update the index template that’s backing the data stream to set [prefer_ilm](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-lifecycle-prefer-ilm) to `false`, and to configure data stream lifecycle. 2. Configure the data stream lifecycle for the *existing* `dsl-data-stream` using the [lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle). ::::{important} diff --git a/manage-data/lifecycle/data-tiers.md b/manage-data/lifecycle/data-tiers.md index 43e78eda4..3768b699f 100644 --- a/manage-data/lifecycle/data-tiers.md +++ b/manage-data/lifecycle/data-tiers.md @@ -10,7 +10,7 @@ applies_to: # Data tiers -A *data tier* is a collection of [nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md) within a cluster that share the same [data node role](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles), and a hardware profile that’s appropriately sized for the role. Elastic recommends that nodes in the same tier share the same hardware profile to avoid [hot spotting](/troubleshoot/elasticsearch/hotspotting.md). +A *data tier* is a collection of [nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md) within a cluster that share the same [data node role](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles), and a hardware profile that’s appropriately sized for the role. Elastic recommends that nodes in the same tier share the same hardware profile to avoid [hot spotting](/troubleshoot/elasticsearch/hotspotting.md). ## Available data tiers [available-tier] @@ -24,8 +24,8 @@ The data tiers that you use, and the way that you use them, depends on the data * [Hot tier](/manage-data/lifecycle/data-tiers.md#hot-tier) nodes handle the indexing load for time series data, such as logs or metrics. They hold your most recent, most-frequently-accessed data. * [Warm tier](/manage-data/lifecycle/data-tiers.md#warm-tier) nodes hold time series data that is accessed less-frequently and rarely needs to be updated. -* [Cold tier](/manage-data/lifecycle/data-tiers.md#cold-tier) nodes hold time series data that is accessed infrequently and not normally updated. 
To save space, you can keep [fully mounted indices](/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md#fully-mounted) of [{{search-snaps}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) on the cold tier. These fully mounted indices eliminate the need for replicas, reducing required disk space by approximately 50% compared to the regular indices. -* [Frozen tier](/manage-data/lifecycle/data-tiers.md#frozen-tier) nodes hold time series data that is accessed rarely and never updated. The frozen tier stores [partially mounted indices](/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md#partially-mounted) of [{{search-snaps}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) exclusively. This extends the storage capacity even further — by up to 20 times compared to the warm tier. +* [Cold tier](/manage-data/lifecycle/data-tiers.md#cold-tier) nodes hold time series data that is accessed infrequently and not normally updated. To save space, you can keep [fully mounted indices](/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md#fully-mounted) of [{{search-snaps}}](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) on the cold tier. These fully mounted indices eliminate the need for replicas, reducing required disk space by approximately 50% compared to the regular indices. +* [Frozen tier](/manage-data/lifecycle/data-tiers.md#frozen-tier) nodes hold time series data that is accessed rarely and never updated. The frozen tier stores [partially mounted indices](/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md#partially-mounted) of [{{search-snaps}}](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) exclusively. This extends the storage capacity even further — by up to 20 times compared to the warm tier. ::::{tip} The performance of an {{es}} node is often limited by the performance of the underlying storage and hardware profile. For example hardware profiles, refer to Elastic Cloud’s [instance configurations](asciidocalypse://docs/cloud/docs/reference/cloud-hosted/hardware.md). Review our recommendations for optimizing your storage for [indexing](/deploy-manage/production-guidance/optimize-performance/indexing-speed.md#indexing-use-faster-hardware) and [search](/deploy-manage/production-guidance/optimize-performance/search-speed.md#search-use-faster-hardware). @@ -69,7 +69,7 @@ Time series data can move to the warm tier once it is being queried less frequen When you no longer need to search time series data regularly, it can move from the warm tier to the cold tier. While still searchable, this tier is typically optimized for lower storage costs rather than search speed. -For better storage savings, you can keep [fully mounted indices](/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md#fully-mounted) of [{{search-snaps}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) on the cold tier. Unlike regular indices, these fully mounted indices don’t require replicas for reliability. In the event of a failure, they can recover data from the underlying snapshot instead. This potentially halves the local storage needed for the data. A snapshot repository is required to use fully mounted indices in the cold tier. Fully mounted indices are read-only. 
+For better storage savings, you can keep [fully mounted indices](/deploy-manage/tools/snapshot-and-restore/searchable-snapshots.md#fully-mounted) of [{{search-snaps}}](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) on the cold tier. Unlike regular indices, these fully mounted indices don’t require replicas for reliability. In the event of a failure, they can recover data from the underlying snapshot instead. This potentially halves the local storage needed for the data. A snapshot repository is required to use fully mounted indices in the cold tier. Fully mounted indices are read-only. Alternatively, you can use the cold tier to store regular indices with replicas instead of using {{search-snaps}}. This lets you store older data on less expensive hardware but doesn’t reduce required disk space compared to the warm tier. @@ -436,7 +436,7 @@ We recommend you use [dedicated nodes](/deploy-manage/distributed-architecture/c ## Data tier index allocation [data-tier-allocation] -The [`index.routing.allocation.include._tier_preference`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter) setting determines which tier the index should be allocated to. +The [`index.routing.allocation.include._tier_preference`](elasticsearch://reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter) setting determines which tier the index should be allocated to. When you create an index, by default {{es}} sets the `_tier_preference` to `data_content` to automatically allocate the index shards to the content tier. @@ -451,7 +451,7 @@ You can override this setting after index creation by [updating the index settin This setting also accepts multiple tiers in order of preference. This prevents indices from remaining unallocated if no nodes are available in the preferred tier. For example, when {{ilm}} migrates an index to the cold phase, it sets the index `_tier_preference` to `data_cold,data_warm,data_hot`. -To remove the data tier preference setting, set the `_tier_preference` value to `null`. This allows the index to allocate to any data node within the cluster. Setting the `_tier_preference` to `null` does not restore the default value. Note that, in the case of managed indices, a [migrate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action might apply a new value in its place. +To remove the data tier preference setting, set the `_tier_preference` value to `null`. This allows the index to allocate to any data node within the cluster. Setting the `_tier_preference` to `null` does not restore the default value. Note that, in the case of managed indices, a [migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action might apply a new value in its place. ### Determine the current data tier preference [data-tier-allocation-value] @@ -470,4 +470,4 @@ This setting will not unallocate a currently allocated shard, but might prevent ### Automatic data tier migration [data-tier-migration] -{{ilm-init}} automatically transitions managed indices through the available data tiers using the [migrate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action. By default, this action is automatically injected in every phase. 
You can explicitly specify the migrate action with `"enabled": false` to [disable automatic migration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md#ilm-disable-migrate-ex), for example, if you’re using the [allocate action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md) to manually specify allocation rules. +{{ilm-init}} automatically transitions managed indices through the available data tiers using the [migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action. By default, this action is automatically injected in every phase. You can explicitly specify the migrate action with `"enabled": false` to [disable automatic migration](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md#ilm-disable-migrate-ex), for example, if you’re using the [allocate action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md) to manually specify allocation rules. diff --git a/manage-data/lifecycle/index-lifecycle-management.md b/manage-data/lifecycle/index-lifecycle-management.md index edd8b8ee3..5d4395c8f 100644 --- a/manage-data/lifecycle/index-lifecycle-management.md +++ b/manage-data/lifecycle/index-lifecycle-management.md @@ -40,7 +40,7 @@ To use {{ilm-init}}, all nodes in a cluster must run the same version. Although * **Shrink**: Reduces the number of primary shards in an index. * **Force merge**: Triggers a [force merge](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) to reduce the number of segments in an index’s shards. * **Delete**: Permanently remove an index, including all of its data and metadata. -* [And more](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/index.md) +* [And more](elasticsearch://reference/elasticsearch/index-lifecycle-actions/index.md) Each action has options you can use to specify index behavior and characteristics like: @@ -58,7 +58,7 @@ For example, if you are indexing metrics data from a fleet of ATMs into Elastics 3. After 7 days, move the index into the cold phase and move it to less expensive hardware. 4. Delete the index once the required 30 day retention period is reached. -**Learn about all available actions in [Index lifecycle actions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/index.md).** +**Learn about all available actions in [Index lifecycle actions](elasticsearch://reference/elasticsearch/index-lifecycle-actions/index.md).** ## Create and manage {{ilm-init}} policies diff --git a/manage-data/lifecycle/index-lifecycle-management/configure-lifecycle-policy.md b/manage-data/lifecycle/index-lifecycle-management/configure-lifecycle-policy.md index c08f0ad0d..2e822e069 100644 --- a/manage-data/lifecycle/index-lifecycle-management/configure-lifecycle-policy.md +++ b/manage-data/lifecycle/index-lifecycle-management/configure-lifecycle-policy.md @@ -205,7 +205,7 @@ To switch an index’s lifecycle policy, follow these steps: 2. The remove policy API removes all {{ilm-init}} metadata from the index and doesn’t consider the index’s lifecycle status. This can leave indices in an undesired state. - For example, the [`forcemerge`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) action temporarily closes an index before reopening it. 
Removing an index’s {{ilm-init}} policy during a `forcemerge` can leave the index closed indefinitely.
+    For example, the [`forcemerge`](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) action temporarily closes an index before reopening it. Removing an index’s {{ilm-init}} policy during a `forcemerge` can leave the index closed indefinitely.

 After policy removal, use the [get index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get) to check an index’s state. Target a data stream or alias to get the state of all its indices.

diff --git a/manage-data/lifecycle/index-lifecycle-management/index-lifecycle.md b/manage-data/lifecycle/index-lifecycle-management/index-lifecycle.md
index fbcce4ad2..e045b4b12 100644
--- a/manage-data/lifecycle/index-lifecycle-management/index-lifecycle.md
+++ b/manage-data/lifecycle/index-lifecycle-management/index-lifecycle.md
@@ -34,7 +34,7 @@ If you use {{es}}'s security features, {{ilm-init}} performs operations as the u
 The minimum age defaults to zero, which causes {{ilm-init}} to move indices to the next phase as soon as all actions in the current phase complete.

::::{note}
-If an index has been [rolled over](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md), then the `min_age` value is relative to the time the index was rolled over, not the index creation time. [Learn more](../../../troubleshoot/elasticsearch/index-lifecycle-management-errors.md#min-age-calculation).
+If an index has been [rolled over](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md), then the `min_age` value is relative to the time the index was rolled over, not the index creation time. [Learn more](../../../troubleshoot/elasticsearch/index-lifecycle-management-errors.md#min-age-calculation).
::::

@@ -59,42 +59,42 @@ When an index enters a phase, {{ilm-init}} caches the phase definition in the in

* Hot

- * [Set Priority](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-set-priority.md)
- * [Unfollow](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
- * [Rollover](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md)
- * [Read-Only](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
- * [Downsample](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
- * [Shrink](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)
- * [Force Merge](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md)
- * [Searchable Snapshot](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)
+ * [Set Priority](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-set-priority.md)
+ * [Unfollow](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
+ * [Rollover](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md)
+ * [Read-Only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
+ * [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
+ * [Shrink](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)
+ * [Force Merge](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md)
+ * [Searchable Snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)

* Warm

- * [Set Priority](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-set-priority.md)
- * [Unfollow](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
- * [Read-Only](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
- * [Downsample](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
- * [Allocate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md)
- * [Migrate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md)
- * [Shrink](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)
- * [Force Merge](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md)
+ * [Set Priority](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-set-priority.md)
+ * [Unfollow](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
+ * [Read-Only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
+ * [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
+ * [Allocate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md)
+ * [Migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md)
+ * [Shrink](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)
+ * [Force Merge](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md)

* Cold

- * [Set Priority](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-set-priority.md)
- * [Unfollow](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
- * [Read-Only](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
- * [Downsample](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
- * [Searchable Snapshot](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)
- * [Allocate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md)
- * [Migrate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md)
+ * [Set Priority](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-set-priority.md)
+ * [Unfollow](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
+ * [Read-Only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)
+ * [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)
+ * [Searchable Snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)
+ * [Allocate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md)
+ * [Migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md)

* Frozen

- * [Unfollow](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
- * [Searchable Snapshot](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)
+ * [Unfollow](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-unfollow.md)
+ * [Searchable Snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md)

* Delete

- * [Wait For Snapshot](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-wait-for-snapshot.md)
- * [Delete](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-delete.md)
+ * [Wait For Snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-wait-for-snapshot.md)
+ * [Delete](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-delete.md)

diff --git a/manage-data/lifecycle/index-lifecycle-management/index-management-in-kibana.md b/manage-data/lifecycle/index-lifecycle-management/index-management-in-kibana.md
index ae3be01f4..bce55cb0d 100644
--- a/manage-data/lifecycle/index-lifecycle-management/index-management-in-kibana.md
+++ b/manage-data/lifecycle/index-lifecycle-management/index-management-in-kibana.md
@@ -35,7 +35,7 @@ Investigate your indices and perform operations from the **Indices** view.

* To show details and perform operations such as close, forcemerge, and flush, click the index name. To perform operations on multiple indices, select their checkboxes and then open the **Manage** menu. For more information on managing indices, refer to [Index APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-indices).
* To filter the list of indices, use the search bar or click a badge. Badges indicate if an index is a [follower index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ccr-follow), a [rollup index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-rollup-get-rollup-index-caps), or [frozen](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-unfreeze). -* To drill down into the index [mappings](../../data-store/mapping.md), [settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md), and statistics, click an index name. From this view, you can navigate to **Discover** to further explore the documents in the index. +* To drill down into the index [mappings](../../data-store/mapping.md), [settings](elasticsearch://reference/elasticsearch/index-settings/index.md), and statistics, click an index name. From this view, you can navigate to **Discover** to further explore the documents in the index. :::{image} ../../../images/elasticsearch-reference-management_index_details.png :alt: Index Management UI @@ -102,7 +102,7 @@ In this tutorial, you’ll create an index template and use it to configure two ::: 2. Define index settings. These are optional. For this tutorial, leave this section blank. -3. Define a mapping that contains an [object](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) field named `geo` with a child [`geo_point`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) field named `coordinates`: +3. Define a mapping that contains an [object](elasticsearch://reference/elasticsearch/mapping-reference/object.md) field named `geo` with a child [`geo_point`](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) field named `coordinates`: :::{image} ../../../images/elasticsearch-reference-management-index-templates-mappings.png :alt: Mapped fields page @@ -192,7 +192,7 @@ Use the **Enrich Policies** view to add data from your existing indices to incom * The source indices that store enrich data as documents * The fields from the source indices used to match incoming documents * The enrich fields containing enrich data from the source indices that you want to add to incoming documents -* An optional [query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-all-query.md). +* An optional [query](elasticsearch://reference/query-languages/query-dsl-match-all-query.md). :::{image} ../../../images/elasticsearch-reference-management-enrich-policies.png :alt: Enrich policies diff --git a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md index 2648c3902..d6c82e7f7 100644 --- a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md +++ b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md @@ -27,7 +27,7 @@ Define a separate policy for your older indices that omits the rollover action. Keep in mind that policies applied to existing indices compare the `min_age` for each phase to the original creation date of the index, and might proceed through multiple phases immediately. If your policy performs resource-intensive operations like force merge, you don’t want to have a lot of indices performing those operations all at once when you switch over to {{ilm-init}}. 
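To ground the `manage-existing-indices.md` guidance above, here is a minimal sketch of a separate policy for older indices that omits the rollover action. The policy name `pre-ilm-retention` and the `30d` threshold are illustrative assumptions, not values from the source:

```console
PUT _ilm/policy/pre-ilm-retention
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Because these indices were never rolled over, `min_age` here is measured from the original index creation date, as the paragraph above notes.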
-You can specify different `min_age` values in the policy you use for existing indices, or set [`index.lifecycle.origination_date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-lifecycle-management-settings.md#index-lifecycle-origination-date) to control how the index age is calculated. +You can specify different `min_age` values in the policy you use for existing indices, or set [`index.lifecycle.origination_date`](elasticsearch://reference/elasticsearch/configuration-reference/index-lifecycle-management-settings.md#index-lifecycle-origination-date) to control how the index age is calculated. Once all pre-{{ilm-init}} indices have been aged out and removed, you can delete the policy you used to manage them. diff --git a/manage-data/lifecycle/index-lifecycle-management/migrate-index-allocation-filters-to-node-roles.md b/manage-data/lifecycle/index-lifecycle-management/migrate-index-allocation-filters-to-node-roles.md index 8fd0b551a..58ffd8473 100644 --- a/manage-data/lifecycle/index-lifecycle-management/migrate-index-allocation-filters-to-node-roles.md +++ b/manage-data/lifecycle/index-lifecycle-management/migrate-index-allocation-filters-to-node-roles.md @@ -8,9 +8,9 @@ applies_to: # Migrate index allocation filters to node roles [migrate-index-allocation-filters] -If you currently use [custom node attributes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#custom-node-attributes) and [attribute-based allocation filters](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md) to move indices through [data tiers](../data-tiers.md) in a [hot-warm-cold architecture](https://www.elastic.co/blog/implementing-hot-warm-cold-in-elasticsearch-with-index-lifecycle-management), we recommend that you switch to using the built-in node roles and automatic [data tier allocation](../data-tiers.md#data-tier-allocation). Using node roles enables {{ilm-init}} to automatically move indices between data tiers. +If you currently use [custom node attributes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#custom-node-attributes) and [attribute-based allocation filters](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md) to move indices through [data tiers](../data-tiers.md) in a [hot-warm-cold architecture](https://www.elastic.co/blog/implementing-hot-warm-cold-in-elasticsearch-with-index-lifecycle-management), we recommend that you switch to using the built-in node roles and automatic [data tier allocation](../data-tiers.md#data-tier-allocation). Using node roles enables {{ilm-init}} to automatically move indices between data tiers. -::::{note} +::::{note} While we recommend relying on automatic data tier allocation to manage your data in a hot-warm-cold architecture, you can still use attribute-based allocation filters to control shard allocation for other purposes. :::: @@ -18,7 +18,7 @@ While we recommend relying on automatic data tier allocation to manage your data {{ech}} and {{ece}} can perform the migration automatically. For self-managed deployments, you need to manually update your configuration, ILM policies, and indices to switch to node roles. 
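Before the detailed migration steps, a hedged sketch of the end state for a single existing index: the custom attribute filter is cleared and a tier preference takes its place. The index name is illustrative; the `data` attribute matches the example used later in this file:

```console
PUT my-index-000001/_settings
{
  "index.routing.allocation.require.data": null,
  "index.routing.allocation.include._tier_preference": "data_hot"
}
```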
-## Automatically migrate to node roles on {{ech}} or {{ece}} [cloud-migrate-to-node-roles] +## Automatically migrate to node roles on {{ech}} or {{ece}} [cloud-migrate-to-node-roles] If you are using node attributes from the default deployment template in {{ech}} or {{ece}}, you will be prompted to switch to node roles when you: @@ -30,13 +30,13 @@ These actions automatically update your cluster configuration and {{ilm-init}} p If you use custom index templates, check them after the automatic migration completes and remove any [attribute-based allocation filters](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md). -::::{note} +::::{note} You do not need to take any further action after the automatic migration. The following manual steps are only necessary if you do not allow the automatic migration or have a self-managed deployment. :::: -## Migrate to node roles on self-managed deployments [on-prem-migrate-to-node-roles] +## Migrate to node roles on self-managed deployments [on-prem-migrate-to-node-roles] To switch to using node roles: @@ -46,9 +46,9 @@ To switch to using node roles: 4. Update existing indices to [set a tier preference](#set-tier-preference). -### Assign data nodes to a data tier [assign-data-tier] +### Assign data nodes to a data tier [assign-data-tier] -Configure the appropriate roles for each data node to assign it to one or more data tiers: `data_hot`, `data_content`, `data_warm`, `data_cold`, or `data_frozen`. A node can also have other [roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md). By default, new nodes are configured with all roles. +Configure the appropriate roles for each data node to assign it to one or more data tiers: `data_hot`, `data_content`, `data_warm`, `data_cold`, or `data_frozen`. A node can also have other [roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md). By default, new nodes are configured with all roles. When you add a data tier to an {{ech}} deployment, one or more nodes are automatically configured with the corresponding role. To explicitly change the role of a node in an {{ech}} deployment, use the [Update deployment API](../../../deploy-manage/deploy/elastic-cloud/manage-deployments-using-elastic-cloud-api.md#ec_update_a_deployment). Replace the node’s `node_type` configuration with the appropriate `node_roles`. For example, the following configuration adds the node to the hot and content tiers, and enables it to act as an ingest node, remote, and transform node. @@ -69,19 +69,19 @@ node.roles [ data_hot, data_content ] ``` -### Remove custom allocation settings from existing {{ilm-init}} policies [remove-custom-allocation-settings] +### Remove custom allocation settings from existing {{ilm-init}} policies [remove-custom-allocation-settings] -Update the allocate action for each lifecycle phase to remove the attribute-based allocation settings. {{ilm-init}} will inject a [migrate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action into each phase to automatically transition the indices through the data tiers. +Update the allocate action for each lifecycle phase to remove the attribute-based allocation settings. {{ilm-init}} will inject a [migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action into each phase to automatically transition the indices through the data tiers. 
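As an illustration of the cleaned-up allocate action, here is a hedged sketch of a warm phase that keeps only a replica count once the attribute-based filters are removed; the policy name, `min_age`, and replica count are assumptions for the example:

```console
PUT _ilm/policy/my-timeseries-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          }
        }
      }
    }
  }
}
```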
If the allocate action does not set the number of replicas, remove the allocate action entirely. (An empty allocate action is invalid.)

-::::{important} 
+::::{important}
 The policy must specify the corresponding phase for each data tier in your architecture. Each phase must be present so {{ilm-init}} can inject the migrate action to move indices through the data tiers. If you don’t need to perform any other actions, the phase can be empty. For example, if you enable the warm and cold data tiers for a deployment, your policy must include the hot, warm, and cold phases.
::::

-### Stop setting the custom hot attribute on new indices [stop-setting-custom-hot-attribute] 
+### Stop setting the custom hot attribute on new indices [stop-setting-custom-hot-attribute]

 When you create a data stream, its first backing index is now automatically assigned to `data_hot` nodes. Similarly, when you directly create an index, it is automatically assigned to `data_content` nodes.

@@ -96,14 +96,14 @@ If you’re using a custom index template, update it to remove the [attribute-ba

 To completely avoid the issues that arise when mixing the tier preference and custom attribute routing settings, we also recommend updating all the legacy, composable, and component templates to remove the [attribute-based allocation filters](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md) from the settings they configure.

-### Set a tier preference for existing indices [set-tier-preference] 
+### Set a tier preference for existing indices [set-tier-preference]

-{{ilm-init}} automatically transitions managed indices through the available data tiers by automatically injecting a [migrate action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) into each phase.
+{{ilm-init}} automatically transitions managed indices through the available data tiers by automatically injecting a [migrate action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) into each phase.

 To enable {{ilm-init}} to move an *existing* managed index through the data tiers, update the index settings to:

1. Remove the custom allocation filter by setting it to `null`.
-2. Set the [tier preference](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter).
+2. Set the [tier preference](elasticsearch://reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter).

For example, if your old template set the `data` attribute to `hot` to allocate shards to the hot tier, set the `data` attribute to `null` and set the `_tier_preference` to `data_hot`.

diff --git a/manage-data/lifecycle/index-lifecycle-management/rollover.md b/manage-data/lifecycle/index-lifecycle-management/rollover.md
index 55db51d2a..e981bb835 100644
--- a/manage-data/lifecycle/index-lifecycle-management/rollover.md
+++ b/manage-data/lifecycle/index-lifecycle-management/rollover.md
@@ -20,7 +20,7 @@ We recommend using [data streams](https://www.elastic.co/docs/api/doc/elasticsea

 Each data stream requires an [index template](../../data-store/templates.md) that contains:

* A name or wildcard (`*`) pattern for the data stream.
-* The data stream’s timestamp field. 
This field must be mapped as a [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date_nanos.md) field data type and must be included in every document indexed to the data stream.
+* The data stream’s timestamp field. This field must be mapped as a [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](elasticsearch://reference/elasticsearch/mapping-reference/date_nanos.md) field data type and must be included in every document indexed to the data stream.
* The mappings and settings applied to each backing index when it’s created.

 Data streams are designed for append-only data, where the data stream name can be used as the operations (read, write, rollover, shrink, etc.) target. If your use case requires data to be updated in place, you can instead manage your time series data using [index aliases](../../data-store/aliases.md). However, there are a few more configuration steps and concepts:

@@ -29,28 +29,28 @@ Data streams are designed for append-only data, where the data stream name can b

* An *index alias* that references the entire set of indices.
* A single index designated as the *write index*. This is the active index that handles all write requests. On each rollover, the new index becomes the write index.

-::::{note} 
+::::{note}
 When an index is rolled over, the previous index’s age is updated to reflect the rollover time. This date, rather than the index’s `creation_date`, is used in {{ilm}} `min_age` phase calculations. [Learn more](../../../troubleshoot/elasticsearch/index-lifecycle-management-errors.md#min-age-calculation).
::::

-## Automatic rollover [ilm-automatic-rollover] 
+## Automatic rollover [ilm-automatic-rollover]

{{ilm-init}} and the data stream lifecycle (in preview) enable you to automatically roll over to a new index based on conditions like the index size, document count, or age. When a rollover is triggered, a new index is created, the write alias is updated to point to the new index, and all subsequent updates are written to the new index.

-::::{tip} 
+::::{tip}
 Rolling over to a new index based on size, document count, or age is preferable to time-based rollovers. Rolling over at an arbitrary time often results in many small indices, which can have a negative impact on performance and resource usage.
::::

-::::{important} 
+::::{important}
 Empty indices will not be rolled over, even if they have an associated `max_age` that would otherwise result in a rollover occurring. A policy can override this behavior, and explicitly opt in to rolling over empty indices, by adding a `"min_docs": 0` condition. This can also be disabled on a cluster-wide basis by setting `indices.lifecycle.rollover.only_if_has_documents` to `false`.
::::

-::::{important} 
+::::{important}
 The rollover action implicitly always rolls over a data stream or alias if one or more shards contain 200000000 or more documents. Normally a shard will reach 50GB long before it reaches 200M documents, but this isn’t the case for space-efficient data sets. Search performance will very likely suffer if a shard contains more than 200M documents. This is the reason for the built-in limit.
::::

diff --git a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
index cdc92f763..625837f68 100644
--- a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
+++ b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
@@ -18,7 +18,7 @@ When you continuously index timestamped documents into {{es}}, you typically use

 To automate rollover and management of a data stream with {{ilm-init}}, you:

-1. [Create a lifecycle policy](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-create-policy) that defines the appropriate [phases](index-lifecycle.md) and [actions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/index.md).
+1. [Create a lifecycle policy](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-create-policy) that defines the appropriate [phases](index-lifecycle.md) and [actions](elasticsearch://reference/elasticsearch/index-lifecycle-actions/index.md).
2. [Create an index template](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-apply-policy) to [create the data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-create-the-data-stream) and apply the ILM policy and the index settings and mappings for the backing indices.
3. [Verify indices are moving through the lifecycle phases](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-check-progress) as expected.

diff --git a/manage-data/lifecycle/rollup/understanding-groups.md b/manage-data/lifecycle/rollup/understanding-groups.md
index c430a5b10..51dbf639d 100644
--- a/manage-data/lifecycle/rollup/understanding-groups.md
+++ b/manage-data/lifecycle/rollup/understanding-groups.md
@@ -111,7 +111,7 @@ Ultimately, when configuring `groups` for a job, think in terms of how you might

## Calendar vs fixed time intervals [rollup-understanding-group-intervals]

-Each rollup-job must have a date histogram group with a defined interval. {{es}} understands both [calendar and fixed time intervals](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#calendar_and_fixed_intervals). Fixed time intervals are fairly easy to understand; `60s` means sixty seconds. But what does `1M` mean? One month of time depends on which month we are talking about, some months are longer or shorter than others. This is an example of calendar time and the duration of that unit depends on context. Calendar units are also affected by leap-seconds, leap-years, etc.
+Each rollup job must have a date histogram group with a defined interval. {{es}} understands both [calendar and fixed time intervals](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#calendar_and_fixed_intervals). Fixed time intervals are fairly easy to understand; `60s` means sixty seconds. But what does `1M` mean? One month of time depends on which month we are talking about; some months are longer or shorter than others. This is an example of calendar time, and the duration of that unit depends on context. Calendar units are also affected by leap-seconds, leap-years, etc.
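To make the distinction concrete, here is a hedged sketch of both interval types in a date histogram aggregation; the index and field names are illustrative assumptions:

```console
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "per_calendar_month": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "1M"
      }
    },
    "per_fixed_minute": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "60s"
      }
    }
  }
}
```

`1M` buckets follow month boundaries and vary in length, while `60s` buckets are always exactly sixty seconds long.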
This is important because the buckets generated by rollup are in either calendar or fixed intervals, and this limits how you can query them later. See [Requests must be multiples of the config](rollup-search-limitations.md#rollup-search-limitations-intervals).

diff --git a/manage-data/use-case-use-elasticsearch-to-manage-time-series-data.md b/manage-data/use-case-use-elasticsearch-to-manage-time-series-data.md
index 04836910a..bced164af 100644
--- a/manage-data/use-case-use-elasticsearch-to-manage-time-series-data.md
+++ b/manage-data/use-case-use-elasticsearch-to-manage-time-series-data.md
@@ -34,7 +34,7 @@ The steps for setting up data tiers vary based on your deployment type:

::::::
::::::{tab-item} Self-managed
-To assign a node to a data tier, add the respective [node role](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) to the node’s `elasticsearch.yml` file. Changing an existing node’s roles requires a [rolling restart](../deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling).
+To assign a node to a data tier, add the respective [node role](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) to the node’s `elasticsearch.yml` file. Changing an existing node’s roles requires a [rolling restart](../deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling).

```yaml
# Content tier
@@ -94,7 +94,7 @@ Use any of the following repository types with searchable snapshots:

* [AWS S3](../deploy-manage/tools/snapshot-and-restore/s3-repository.md)
* [Google Cloud Storage](../deploy-manage/tools/snapshot-and-restore/google-cloud-storage-repository.md)
* [Azure Blob Storage](../deploy-manage/tools/snapshot-and-restore/azure-repository.md)
-* [Hadoop Distributed File Store (HDFS)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/repository-hdfs.md)
+* [Hadoop Distributed File System (HDFS)](elasticsearch://reference/elasticsearch-plugins/repository-hdfs.md)
* [Shared filesystems](../deploy-manage/tools/snapshot-and-restore/shared-file-system-repository.md) such as NFS
* [Read-only HTTP and HTTPS repositories](../deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md)

@@ -249,13 +249,13 @@ If you use a custom application, you need to set up your own data stream. A data

 When creating your component templates, include:

-* A [`date`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date_nanos.md) mapping for the `@timestamp` field. If you don’t specify a mapping, {{es}} maps `@timestamp` as a `date` field with default options.
+* A [`date`](elasticsearch://reference/elasticsearch/mapping-reference/date.md) or [`date_nanos`](elasticsearch://reference/elasticsearch/mapping-reference/date_nanos.md) mapping for the `@timestamp` field. If you don’t specify a mapping, {{es}} maps `@timestamp` as a `date` field with default options.
* Your lifecycle policy in the `index.lifecycle.name` index setting.

::::{tip}
Use the [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/current) when mapping your fields. ECS fields integrate with several {{stack}} features by default.
-If you’re unsure how to map your fields, use [runtime fields](data-store/mapping/define-runtime-fields-in-search-request.md) to extract fields from [unstructured content](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#mapping-unstructured-content) at search time. For example, you can index a log message to a `wildcard` field and later extract IP addresses and other data from this field during a search.
+If you’re unsure how to map your fields, use [runtime fields](data-store/mapping/define-runtime-fields-in-search-request.md) to extract fields from [unstructured content](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#mapping-unstructured-content) at search time. For example, you can index a log message to a `wildcard` field and later extract IP addresses and other data from this field during a search.
::::

diff --git a/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-orchestration.md b/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-orchestration.md
index 5aca84f13..6ab3f1e40 100644
--- a/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-orchestration.md
+++ b/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-orchestration.md
@@ -169,7 +169,7 @@ Advanced users may force an upgrade by manually deleting Pods themselves. The de

 Operations that reduce the number of nodes in the cluster cannot make progress without user intervention if the Elasticsearch index replica settings are incompatible with the intended downscale. Specifically, if the Elasticsearch index settings demand a higher number of shard copies than data nodes in the cluster after the downscale operation, ECK cannot migrate the data away from the node about to be removed. You can address this in the following ways:

* Adjust the Elasticsearch [index settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) to a number of replicas that allow the desired node removal.
-* Use [`auto_expand_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-settings) to automatically adjust the replicas to the number of data nodes in the cluster.
+* Use [`auto_expand_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-settings) to automatically adjust the replicas to the number of data nodes in the cluster.

## Advanced control during rolling upgrades [k8s-advanced-upgrade-control]

diff --git a/raw-migrated-files/cloud/cloud-enterprise/ece-add-custom-bundle-plugin.md b/raw-migrated-files/cloud/cloud-enterprise/ece-add-custom-bundle-plugin.md
index 460d85a66..4d26f8dc8 100644
--- a/raw-migrated-files/cloud/cloud-enterprise/ece-add-custom-bundle-plugin.md
+++ b/raw-migrated-files/cloud/cloud-enterprise/ece-add-custom-bundle-plugin.md
@@ -353,7 +353,7 @@ You do not need to do this step if you are using default filename and password (
    }
    ```

-4. To use this bundle, you can refer it in the [GeoIP processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) of an ingest pipeline as `MyGeoLite2-City.mmdb` under `database_file` such as:
+4. To use this bundle, you can refer to it in the [GeoIP processor](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) of an ingest pipeline as `MyGeoLite2-City.mmdb` under `database_file` such as:

    ```sh
    ...
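The pipeline body is elided above; as a hedged sketch, a pipeline referencing the custom bundle might look like the following, where the pipeline ID `geoip-custom` and the source field `ip` are illustrative assumptions:

```console
PUT _ingest/pipeline/geoip-custom
{
  "processors": [
    {
      "geoip": {
        "field": "ip",
        "database_file": "MyGeoLite2-City.mmdb"
      }
    }
  ]
}
```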
diff --git a/raw-migrated-files/cloud/cloud-enterprise/ece-upgrade-deployment.md b/raw-migrated-files/cloud/cloud-enterprise/ece-upgrade-deployment.md
index 4d57619e6..b3ec18b83 100644
--- a/raw-migrated-files/cloud/cloud-enterprise/ece-upgrade-deployment.md
+++ b/raw-migrated-files/cloud/cloud-enterprise/ece-upgrade-deployment.md
@@ -43,7 +43,7 @@ To upgrade a cluster in Elastic Cloud Enterprise:

4. Select one of the available software versions. Let the user interface guide you through the steps for upgrading a deployment. When you save your changes, your deployment configuration is updated to the new version.

    ::::{tip}
-    You cannot downgrade after upgrading, so plan ahead to make sure that your applications still work after upgrading. For more information on changes that might affect your applications, check [Breaking changes](asciidocalypse://docs/elasticsearch/docs/release-notes/breaking-changes.md).
+    You cannot downgrade after upgrading, so plan ahead to make sure that your applications still work after upgrading. For more information on changes that might affect your applications, check [Breaking changes](elasticsearch://release-notes/breaking-changes.md).
    ::::

5. If you are upgrading to version 6.6 or earlier, major upgrades require a full cluster restart to complete the upgrade process.

diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md b/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md
index 11c447e1b..f9ed62a51 100644
--- a/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md
+++ b/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md
@@ -35,7 +35,7 @@ Elasticsearch Add-On for Heroku supports the following `elasticsearch.yml` setti

 The following general settings are supported:

$$$http-cors-settings$$$`http.cors.*`
-: Enables cross-origin resource sharing (CORS) settings for the [HTTP module](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md).
+: Enables cross-origin resource sharing (CORS) settings for the [HTTP module](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md).

::::{note}
If your use case depends on the ability to receive CORS requests and you have a cluster that was provisioned prior to January 25th 2019, you must manually set `http.cors.enabled` to `true` and allow a specific set of hosts with `http.cors.allow-origin`. Applying these changes in your Elasticsearch configuration allows cross-origin resource sharing requests.

@@ -43,13 +43,13 @@ $$$http-cors-settings$$$`http.cors.*`

`http.compression`
-: Support for [HTTP compression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) when possible (with Accept-Encoding). Defaults to `true`.
+: Support for [HTTP compression](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) when possible (with Accept-Encoding). Defaults to `true`.

`transport.compress`
-: Configures [transport compression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic.
+: Configures [transport compression](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic.
`transport.compression_scheme` -: Configures [transport compression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic. +: Configures [transport compression](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic. `repositories.url.allowed_urls` : Enables explicit allowing of [read-only URL repositories](../../../deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md). @@ -61,7 +61,7 @@ $$$http-cors-settings$$$`http.cors.*` : To learn more on how to configure reindex SSL user settings, check [configuring reindex SSL parameters](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex). `script.painless.regex.enabled` -: Enables [regular expressions](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/brief-painless-walkthrough.md#modules-scripting-painless-regex) for the Painless scripting language. +: Enables [regular expressions](elasticsearch://reference/scripting-languages/painless/brief-painless-walkthrough.md#modules-scripting-painless-regex) for the Painless scripting language. `action.auto_create_index` : [Automatically create index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-create) if it doesn’t already exist. @@ -94,19 +94,19 @@ $$$http-cors-settings$$$`http.cors.*` The following circuit breaker settings are supported: `indices.breaker.total.limit` -: Configures [the parent circuit breaker settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker). +: Configures [the parent circuit breaker settings](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker). `indices.breaker.fielddata.limit` -: Configures [the limit for the fielddata breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). +: Configures [the limit for the fielddata breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). `indices.breaker.fielddata.overhead` -: Configures [a constant that all field data estimations are multiplied with to determine a final estimation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). +: Configures [a constant that all field data estimations are multiplied with to determine a final estimation](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). `indices.breaker.request.limit` -: Configures [the limit for the request breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). +: Configures [the limit for the request breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). `indices.breaker.request.overhead` -: Configures [a constant that all request estimations are multiplied by to determine a final estimation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). 
+: Configures [a constant that all request estimations are multiplied by to determine a final estimation](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). ### Indexing pressure settings [echindexing_pressure_settings] @@ -114,7 +114,7 @@ The following circuit breaker settings are supported: The following indexing pressure settings are supported: `indexing_pressure.memory.limit` -: Configures [the indexing pressure settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/pressure.md). +: Configures [the indexing pressure settings](elasticsearch://reference/elasticsearch/index-settings/pressure.md). ### X-Pack [echx_pack] @@ -128,7 +128,7 @@ The following indexing pressure settings are supported: #### All supported versions [echall_supported_versions] `xpack.ml.inference_model.time_to_live` -: Sets the duration of time that the trained models are cached. Check [{{ml-cap}} settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md). +: Sets the duration of time that the trained models are cached. Check [{{ml-cap}} settings](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md). `xpack.security.loginAssistanceMessage` : Adds a message to the login screen. Useful for displaying corporate messages. @@ -146,10 +146,10 @@ The following indexing pressure settings are supported: : Defines when the watch should start, based on date and time [Learn more](/explore-analyze/alerts-cases/watcher/trigger-schedule.md). `xpack.notification.email.html.sanitization.*` -: Enables [email notification settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md) to sanitize HTML elements in emails that are sent. +: Enables [email notification settings](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md) to sanitize HTML elements in emails that are sent. `xpack.monitoring.collection.interval` -: Controls [how often data samples are collected](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). +: Controls [how often data samples are collected](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). `xpack.monitoring.collection.min_interval_seconds` : Specifies the minimum number of seconds that a time bucket in a chart can represent. If you modify the `xpack.monitoring.collection.interval`, use the same value in this setting. @@ -158,10 +158,10 @@ The following indexing pressure settings are supported: $$$xpack-monitoring-history-duration$$$`xpack.monitoring.history.duration` -: Sets the [retention duration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings) beyond which the indices created by a monitoring exporter will be automatically deleted. +: Sets the [retention duration](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings) beyond which the indices created by a monitoring exporter will be automatically deleted. 
`xpack.watcher.history.cleaner_service.enabled`
-: Controls [whether old watcher indices are automatically deleted](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#general-notification-settings).
+: Controls [whether old watcher indices are automatically deleted](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#general-notification-settings).

`xpack.http.ssl.cipher_suites`
: Controls the list of supported cipher suites for all outgoing TLS connections.

@@ -197,16 +197,16 @@ The following search settings are supported:

 The following disk-based allocation settings are supported:

`cluster.routing.allocation.disk.threshold_enabled`
-: Enable or disable [disk allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) decider and defaults to `true`.
+: Enables or disables the [disk allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) decider. Defaults to `true`.

`cluster.routing.allocation.disk.watermark.low`
-: Configures [disk-based shard allocation’s low watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation).
+: Configures [disk-based shard allocation’s low watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation).

`cluster.routing.allocation.disk.watermark.high`
-: Configures [disk-based shard allocation’s high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation).
+: Configures [disk-based shard allocation’s high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation).

`cluster.routing.allocation.disk.watermark.flood_stage`
-: Configures [disk-based shard allocation’s flood_stage](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation).
+: Configures [disk-based shard allocation’s flood_stage](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation).

::::{tip}
Remember to update user settings for alerts when performing a major version upgrade.

diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md b/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md
index 50c6b171d..6a8df2a48 100644
--- a/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md
+++ b/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md
@@ -13,12 +13,12 @@ Monitoring consists of two components:

 The steps in this section cover only the enablement of the monitoring and logging features in Elasticsearch Add-On for Heroku. For more information on how to use the monitoring features, refer to [Monitor a cluster](../../../deploy-manage/monitor.md).
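Looping back to the disk-based allocation settings listed in `ech-add-user-settings.md` above: on a self-managed cluster, the same values can also be applied dynamically through the cluster settings API. A hedged sketch, using the stack's default watermark percentages:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}
```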
-### Before you begin [ech-logging-and-monitoring-limitations] +### Before you begin [ech-logging-and-monitoring-limitations] Some limitations apply when you use monitoring on Elasticsearch Add-On for Heroku. To learn more, check the monitoring [restrictions and limitations](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). -### Monitoring for production use [ech-logging-and-monitoring-production] +### Monitoring for production use [ech-logging-and-monitoring-production] For production use, you should send your deployment logs and metrics to a dedicated monitoring deployment. Monitoring indexes logs and metrics into {{es}} and these indexes consume storage, memory, and CPU cycles like any other index. By using a separate monitoring deployment, you avoid affecting your other production deployments and can view the logs and metrics even when a production deployment is unavailable. @@ -35,15 +35,15 @@ How many monitoring deployments you use depends on your requirements: Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. -### Retention of monitoring daily indices [ech-logging-and-monitoring-retention] +### Retention of monitoring daily indices [ech-logging-and-monitoring-retention] -#### Stack versions 8.0 and above [ech-logging-and-monitoring-retention-8] +#### Stack versions 8.0 and above [ech-logging-and-monitoring-retention-8] When you enable monitoring in Elasticsearch Add-On for Heroku, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. The retention period is configured in the `.monitoring-8-ilm-policy` index lifecycle policy. To view or edit the policy open {{kib}} **Stack management > Data > Index Lifecycle Policies**. -### Sending monitoring data to itself (self monitoring) [ech-logging-and-monitoring-retention-self-monitoring] +### Sending monitoring data to itself (self monitoring) [ech-logging-and-monitoring-retention-self-monitoring] $$$ech-logging-and-monitoring-retention-7$$$ When you enable self-monitoring in Elasticsearch Add-On for Heroku, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. Monitoring data is retained for three days by default or as specified by the [`xpack.monitoring.history.duration` user setting](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md#xpack-monitoring-history-duration). @@ -65,7 +65,7 @@ PUT /_cluster/settings ``` -### Sending monitoring data to a dedicated monitoring deployment [ech-logging-and-monitoring-retention-dedicated-monitoring] +### Sending monitoring data to a dedicated monitoring deployment [ech-logging-and-monitoring-retention-dedicated-monitoring] When [monitoring for production use](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. 
There are three options open to you: @@ -98,17 +98,17 @@ When [monitoring for production use](../../../deploy-manage/monitor/stack-monito * To retain monitoring indices on a dedicated monitoring deployment as is without deleting them automatically, no additional steps are required other than making sure that you do not enable the monitoring deployment to send monitoring data to itself. You should also monitor the deployment for disk space usage and upgrade your deployment periodically, if necessary. -### Retention of logging indices [ech-logging-and-monitoring-log-retention] +### Retention of logging indices [ech-logging-and-monitoring-log-retention] An ILM policy is pre-configured to manage log retention. The policy can be adjusted according to your requirements. -### Index management [ech-logging-and-monitoring-index-management-ilm] +### Index management [ech-logging-and-monitoring-index-management-ilm] When sending monitoring data to a deployment, you can configure [Index Lifecycle Management (ILM)](../../../manage-data/lifecycle/index-lifecycle-management.md) to manage retention of your monitoring and logging indices. When sending logs to a deployment, an ILM policy is pre-configured to manage log retention and the policy can be customized to your needs. -### Enable logging and monitoring [ech-enable-logging-and-monitoring-steps] +### Enable logging and monitoring [ech-enable-logging-and-monitoring-steps] Elasticsearch Add-On for Heroku manages the installation and configuration of the monitoring agent for you. When you enable monitoring on a deployment, you are configuring where the monitoring agent for your current deployment should send its logs and metrics. @@ -125,23 +125,23 @@ To enable monitoring on your deployment: If a deployment is not listed, make sure that it is running a compatible version. The monitoring deployment and production deployment must be on the same major version, cloud provider, and region. - ::::{tip} + ::::{tip} Remember to send logs and metrics for production deployments to a dedicated monitoring deployment, so that your production deployments are not impacted by the overhead of indexing and storing monitoring data. A dedicated monitoring deployment also gives you more control over the retention period for monitoring data. :::: -::::{note} +::::{note} Enabling logs and monitoring may trigger a plan change on your deployment. You can monitor the plan change progress from the deployment’s **Activity** page. :::: -::::{note} +::::{note} Enabling logs and monitoring requires some extra resource on a deployment. For production systems, we recommend sizing deployments with logs and monitoring enabled to at least 4 GB of RAM. :::: -### Access the monitoring application in Kibana [ech-access-kibana-monitoring] +### Access the monitoring application in Kibana [ech-access-kibana-monitoring] With monitoring enabled for your deployment, you can access the [logs](https://www.elastic.co/guide/en/kibana/current/observability.html) and [stack monitoring](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md) through Kibana. @@ -165,28 +165,28 @@ Alternatively, you can access logs and metrics directly on the Kibana **Logs** a | `service.version` | The version of the stack resource that generated the log | `8.13.1` | -### Logging features [ech-extra-logging-features] +### Logging features [ech-extra-logging-features] When shipping logs to a monitoring deployment there are more logging features available to you. 
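For the self-monitoring retention period discussed earlier, `xpack.monitoring.history.duration` is a dynamic setting, so a sketch like the following should adjust it; the `3d` value mirrors the default mentioned above:

```console
PUT _cluster/settings
{
  "persistent": {
    "xpack.monitoring.history.duration": "3d"
  }
}
```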
These features include: -#### For {{es}}: [ech-extra-logging-features-elasticsearch] +#### For {{es}}: [ech-extra-logging-features-elasticsearch] * [Audit logging](../../../deploy-manage/monitor/logging-configuration/enabling-audit-logs.md) - logs security-related events on your deployment -* [Slow query and index logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md) - helps find and debug slow queries and indexing +* [Slow query and index logging](elasticsearch://reference/elasticsearch/index-settings/slow-log.md) - helps find and debug slow queries and indexing * Verbose logging - helps debug stack issues by increasing component logs After you’ve enabled log delivery on your deployment, you can [add the Elasticsearch user settings](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to enable these features. -#### For Kibana: [ech-extra-logging-features-kibana] +#### For Kibana: [ech-extra-logging-features-kibana] * [Audit logging](../../../deploy-manage/monitor/logging-configuration/enabling-audit-logs.md) - logs security-related events on your deployment After you’ve enabled log delivery on your deployment, you can [add the Kibana user settings](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to enable this feature. -### Other components [ech-extra-logging-features-enterprise-search] +### Other components [ech-extra-logging-features-enterprise-search] Enabling log collection also supports collecting and indexing the following types of logs from other components in your deployments: @@ -204,12 +204,12 @@ The ˆ*ˆ indicates that we also index the archived files of each type of log. Check the respective product documentation for more information about the logging capabilities of each product. -## Metrics features [ech-extra-metrics-features] +## Metrics features [ech-extra-metrics-features] With logging and monitoring enabled for a deployment, metrics are collected for Elasticsearch, Kibana, and APM with Fleet Server. -#### Enabling Elasticsearch/Kibana audit logs on your deployment [ech-enable-audit-logs] +#### Enabling Elasticsearch/Kibana audit logs on your deployment [ech-enable-audit-logs] Audit logs are useful for tracking security events on your {{es}} and/or {{kib}} clusters. To enable {{es}} audit logs on your deployment: diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md b/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md index 4a944696f..d160ca3b7 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md @@ -588,7 +588,7 @@ This setting is not available in versions 8.0.0 through 8.2.0. As such, this set : When enabled, specifies the email address to receive cluster alert notifications. `xpack.monitoring.kibana.collection.interval` -: Controls [how often data samples are collected](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). +: Controls [how often data samples are collected](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). `xpack.monitoring.min_interval_seconds` : Specifies the minimum number of seconds that a time bucket in a chart can represent. If you modify the `xpack.monitoring.kibana.collection.interval`, use the same value in this setting. 
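For example, to sample Kibana monitoring data every 30 seconds instead of the 10-second default, you could set both values together in the Kibana user settings (a minimal sketch — the 30-second interval is illustrative, and note that the two settings use different units):

```sh
# Collect Kibana monitoring samples every 30 seconds (value in milliseconds)
xpack.monitoring.kibana.collection.interval: 30000
# Keep chart time buckets in step with the collection interval (value in seconds)
xpack.monitoring.min_interval_seconds: 30
```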
@@ -599,7 +599,7 @@ This setting is not available in versions 8.0.0 through 8.2.0. As such, this set `xpack.ml.enabled` : Set to true (default) to enable machine learning. - If set to `false` in `kibana.yml`, the machine learning icon is hidden in this Kibana instance. If `xpack.ml.enabled` is set to `true` in `elasticsearch.yml`, however, you can still use the machine learning APIs. To disable machine learning entirely, check the [Elasticsearch Machine Learning Settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md). + If set to `false` in `kibana.yml`, the machine learning icon is hidden in this Kibana instance. If `xpack.ml.enabled` is set to `true` in `elasticsearch.yml`, however, you can still use the machine learning APIs. To disable machine learning entirely, check the [Elasticsearch Machine Learning Settings](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md). #### Content security policy configuration [echcontent_security_policy_configuration] @@ -692,7 +692,7 @@ Each method has its own unique limitations which are important to understand. `xpack.reporting.csv.scroll.duration` -: Amount of [time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. Valid option is either `auto` or [time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units), Defaults to `30s`. +: Amount of [time](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. Valid options are `auto` or a [time](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units) value. Defaults to `30s`. ::::{note} Support for the `auto` option was added here. When the config value is set to `auto`, the scroll context is preserved for as long as possible, before the report task is terminated due to the limits of `xpack.reporting.queue.timeout`. @@ -757,7 +757,7 @@ Support for the The option `auto` was included here, when the config value is se Defaults to `true`. `xpack.reporting.csv.scroll.duration` -: Amount of [time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. +: Amount of [time](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. Defaults to `30s` (30 seconds). diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md b/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md index 0c318af48..5a0f3eb54 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md @@ -27,7 +27,7 @@ After you have created a new deployment, you should enable shipping logs and met 5. Select **Save**. -Optionally, turn on [audit logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md) to capture security-related events, such as authentication failures, refused connections, and data-access events through the proxy. 
To turn on audit logging, [edit your deployment’s elasticsearch.yml file](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to add these lines: +Optionally, turn on [audit logging](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md) to capture security-related events, such as authentication failures, refused connections, and data-access events through the proxy. To turn on audit logging, [edit your deployment’s elasticsearch.yml file](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to add these lines: ```sh xpack.security.audit.enabled: true diff --git a/raw-migrated-files/cloud/cloud/ec-add-user-settings.md b/raw-migrated-files/cloud/cloud/ec-add-user-settings.md index 9c8dcb5eb..9828a4f41 100644 --- a/raw-migrated-files/cloud/cloud/ec-add-user-settings.md +++ b/raw-migrated-files/cloud/cloud/ec-add-user-settings.md @@ -35,7 +35,7 @@ In some cases, you may get a warning saying "User settings are different across The following general settings are supported: $$$http-cors-settings$$$`http.cors.*` -: Enables cross-origin resource sharing (CORS) settings for the [HTTP module](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md). +: Enables cross-origin resource sharing (CORS) settings for the [HTTP module](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md). ::::{note} If your use case depends on the ability to receive CORS requests and you have a cluster that was provisioned prior to January 25th 2019, you must manually set `http.cors.enabled` to `true` and allow a specific set of hosts with `http.cors.allow-origin`. Applying these changes in your Elasticsearch configuration allows cross-origin resource sharing requests. @@ -43,13 +43,13 @@ $$$http-cors-settings$$$`http.cors.*` `http.compression` -: Support for [HTTP compression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) when possible (with Accept-Encoding). Defaults to `true`. +: Support for [HTTP compression](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) when possible (with Accept-Encoding). Defaults to `true`. `transport.compress` -: Configures [transport compression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic. +: Configures [transport compression](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic. `transport.compression_scheme` -: Configures [transport compression](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic. +: Configures [transport compression](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) for node-to-node traffic. `repositories.url.allowed_urls` : Enables explicit allowing of [read-only URL repositories](../../../deploy-manage/tools/snapshot-and-restore/read-only-url-repository.md). @@ -61,7 +61,7 @@ $$$http-cors-settings$$$`http.cors.*` : To learn more on how to configure reindex SSL user settings, check [configuring reindex SSL parameters](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex). 
`script.painless.regex.enabled` -: Enables [regular expressions](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/brief-painless-walkthrough.md#modules-scripting-painless-regex) for the Painless scripting language. +: Enables [regular expressions](elasticsearch://reference/scripting-languages/painless/brief-painless-walkthrough.md#modules-scripting-painless-regex) for the Painless scripting language. `action.auto_create_index` : [Automatically create index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-create) if it doesn’t already exist. @@ -94,19 +94,19 @@ $$$http-cors-settings$$$`http.cors.*` The following circuit breaker settings are supported: `indices.breaker.total.limit` -: Configures [the parent circuit breaker settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker). +: Configures [the parent circuit breaker settings](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker). `indices.breaker.fielddata.limit` -: Configures [the limit for the fielddata breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). +: Configures [the limit for the fielddata breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). `indices.breaker.fielddata.overhead` -: Configures [a constant that all field data estimations are multiplied with to determine a final estimation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). +: Configures [a constant that all field data estimations are multiplied with to determine a final estimation](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker). `indices.breaker.request.limit` -: Configures [the limit for the request breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). +: Configures [the limit for the request breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). `indices.breaker.request.overhead` -: Configures [a constant that all request estimations are multiplied by to determine a final estimation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). +: Configures [a constant that all request estimations are multiplied by to determine a final estimation](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#request-circuit-breaker). ### Indexing pressure settings [ec_indexing_pressure_settings] @@ -114,7 +114,7 @@ The following circuit breaker settings are supported: The following indexing pressure settings are supported: `indexing_pressure.memory.limit` -: Configures [the indexing pressure settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/pressure.md). +: Configures [the indexing pressure settings](elasticsearch://reference/elasticsearch/index-settings/pressure.md). 
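For example, a deployment that needs more headroom for heavy bulk ingest could raise this limit slightly in the {{es}} user settings (a minimal sketch — the `15%` value is purely illustrative; the default is `10%` of the heap):

```sh
# Let in-flight indexing requests use up to 15% of the heap before new ones are rejected
indexing_pressure.memory.limit: 15%
```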
### X-Pack [ec_x_pack] @@ -128,7 +128,7 @@ The following indexing pressure settings are supported: #### All supported versions [ec_all_supported_versions] `xpack.ml.inference_model.time_to_live` -: Sets the duration of time that the trained models are cached. Check [{{ml-cap}} settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md). +: Sets the duration of time that the trained models are cached. Check [{{ml-cap}} settings](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md). `xpack.security.loginAssistanceMessage` : Adds a message to the login screen. Useful for displaying corporate messages. @@ -146,10 +146,10 @@ The following indexing pressure settings are supported: : Defines when the watch should start, based on date and time [Learn more](/explore-analyze/alerts-cases/watcher/trigger-schedule.md). `xpack.notification.email.html.sanitization.*` -: Enables [email notification settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md) to sanitize HTML elements in emails that are sent. +: Enables [email notification settings](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md) to sanitize HTML elements in emails that are sent. `xpack.monitoring.collection.interval` -: Controls [how often data samples are collected](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). +: Controls [how often data samples are collected](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). `xpack.monitoring.collection.min_interval_seconds` : Specifies the minimum number of seconds that a time bucket in a chart can represent. If you modify the `xpack.monitoring.collection.interval`, use the same value in this setting. @@ -158,10 +158,10 @@ The following indexing pressure settings are supported: $$$xpack-monitoring-history-duration$$$`xpack.monitoring.history.duration` -: Sets the [retention duration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings) beyond which the indices created by a monitoring exporter will be automatically deleted. +: Sets the [retention duration](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings) beyond which the indices created by a monitoring exporter will be automatically deleted. `xpack.watcher.history.cleaner_service.enabled` -: Controls [whether old watcher indices are automatically deleted](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#general-notification-settings). +: Controls [whether old watcher indices are automatically deleted](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#general-notification-settings). `xpack.http.ssl.cipher_suites` : Controls the list of supported cipher suites for all outgoing TLS connections. 
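As an illustration, several of the settings above can be combined in the {{es}} user settings to tune how monitoring and Watcher data is collected and retained (a minimal sketch — all values are illustrative, not recommendations):

```sh
# Collect monitoring samples every 30 seconds (default is 10s)
xpack.monitoring.collection.interval: 30s
# Delete exporter-created monitoring indices after three days (default is 7d)
xpack.monitoring.history.duration: 3d
# Keep automatic cleanup of old Watcher history indices enabled (the default)
xpack.watcher.history.cleaner_service.enabled: true
```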
@@ -197,16 +197,16 @@ The following search settings are supported: The following disk-based allocation settings are supported: `cluster.routing.allocation.disk.threshold_enabled` -: Enable or disable [disk allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) decider and defaults to `true`. +: Enables or disables the [disk allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) decider. Defaults to `true`. `cluster.routing.allocation.disk.watermark.low` -: Configures [disk-based shard allocation’s low watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). +: Configures [disk-based shard allocation’s low watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). `cluster.routing.allocation.disk.watermark.high` -: Configures [disk-based shard allocation’s high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). +: Configures [disk-based shard allocation’s high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). `cluster.routing.allocation.disk.watermark.flood_stage` -: Configures [disk-based shard allocation’s flood_stage](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). +: Configures [disk-based shard allocation’s flood_stage](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). ::::{tip} Remember to update user settings for alerts when performing a major version upgrade. diff --git a/raw-migrated-files/cloud/cloud/ec-cloud-ingest-data.md b/raw-migrated-files/cloud/cloud/ec-cloud-ingest-data.md index 36f8fffcd..cc244db5a 100644 --- a/raw-migrated-files/cloud/cloud/ec-cloud-ingest-data.md +++ b/raw-migrated-files/cloud/cloud/ec-cloud-ingest-data.md @@ -5,7 +5,7 @@ You have a number of options for getting data into Elasticsearch, referred to as $$$ec-ingest-methods$$$ General content -: Index content like HTML pages, catalogs and other files. Send data directly to Elasticseach from your application using an Elastic language client. Otherwise use Elastic content [connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) or the Elastic [web crawler](https://github.com/elastic/crawler). +: Index content like HTML pages, catalogs, and other files. Send data directly to Elasticsearch from your application using an Elastic language client. Otherwise, use Elastic content [connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) or the Elastic [web crawler](https://github.com/elastic/crawler). Timestamped data : The preferred way to index timestamped data is to use Elastic Agent. Elastic Agent is a single, unified way to add monitoring for logs, metrics, and other types of data to a host. 
It can also protect hosts from security threats, query data from operating systems, and forward data from remote services or hardware. Each Elastic Agent based integration includes default ingestion rules, dashboards, and visualizations to start analyzing your data right away. Fleet Management enables you to centrally manage all of your deployed Elastic Agents from Kibana. @@ -191,7 +191,7 @@ For users who want to build their own solution, we can help you get started inge [Add data with the web crawler](https://github.com/elastic/crawler) : Use the web crawler to programmatically discover, extract, and index searchable content from websites and knowledge bases. -[Add data with connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) +[Add data with connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) : Sync data from an original data source to an {{es}} index. Connectors enable you to create searchable, read-only replicas of your data sources. diff --git a/raw-migrated-files/cloud/cloud/ec-custom-bundles.md b/raw-migrated-files/cloud/cloud/ec-custom-bundles.md index 61e50495b..b78577cb2 100644 --- a/raw-migrated-files/cloud/cloud/ec-custom-bundles.md +++ b/raw-migrated-files/cloud/cloud/ec-custom-bundles.md @@ -16,7 +16,7 @@ The selected plugins/bundles are downloaded and provided when a node starts. Cha With great power comes great responsibility: your plugins can extend your deployment with new functionality, but also break it. Be careful. We obviously cannot guarantee that your custom code works. -::::{important} +::::{important} You cannot edit or delete a custom extension after it has been used in a deployment. To remove it from your deployment, you can disable the extension and update your deployment configuration. :::: @@ -39,7 +39,7 @@ Plugins {{es}} assumes that the uploaded ZIP file contains binaries. If it finds any source code, it fails with an error message, causing provisioning to fail. Make sure you upload binaries, and not source code. - ::::{note} + ::::{note} Plugins larger than 5GB should have the plugin descriptor file at the top of the archive. This order can be achieved by specifying at time of creating the ZIP file: ```sh @@ -76,7 +76,7 @@ Bundles The dictionary `synonyms.txt` can be used as `synonyms.txt` or using the full path `/app/config/synonyms.txt` in the `synonyms_path` of the `synonym-filter`. - To learn more about analyzing with synonyms, check [Synonym token filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) and [Formatting Synonyms](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/synonym-formats.html). + To learn more about analyzing with synonyms, check [Synonym token filter](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) and [Formatting Synonyms](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/synonym-formats.html). **GeoIP database bundle** @@ -110,7 +110,7 @@ You must upload your files before you can apply them to your cluster configurati After creating your extension, you can [enable them for existing {{es}} deployments](../../../deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md#ec-update-bundles) or enable them when creating new deployments. -::::{note} +::::{note} Creating extensions larger than 200MB should be done through the extensions API. 
Refer to [Managing plugins and extensions through the API](../../../deploy-manage/deploy/elastic-cloud/manage-plugins-extensions-through-api.md) for more details. @@ -169,7 +169,7 @@ To update an extension with a new file version, ## How to use the extensions API [ec-extension-api-usage-guide] -::::{note} +::::{note} For a full set of examples, check [Managing plugins and extensions through the API](../../../deploy-manage/deploy/elastic-cloud/manage-plugins-extensions-through-api.md). :::: diff --git a/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md b/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md index 27c12b67b..fb308686a 100644 --- a/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md +++ b/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md @@ -13,12 +13,12 @@ Monitoring consists of two components: The steps in this section cover only the enablement of the monitoring and logging features in {{ech}}. For more information on how to use the monitoring features, refer to [Monitor a cluster](../../../deploy-manage/monitor.md). -### Before you begin [ec-logging-and-monitoring-limitations] +### Before you begin [ec-logging-and-monitoring-limitations] Some limitations apply when you use monitoring on {{ech}}. To learn more, check the monitoring [restrictions and limitations](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-restrictions-monitoring). -### Monitoring for production use [ec-logging-and-monitoring-production] +### Monitoring for production use [ec-logging-and-monitoring-production] For production use, you should send your deployment logs and metrics to a dedicated monitoring deployment. Monitoring indexes logs and metrics into {{es}} and these indexes consume storage, memory, and CPU cycles like any other index. By using a separate monitoring deployment, you avoid affecting your other production deployments and can view the logs and metrics even when a production deployment is unavailable. @@ -35,15 +35,15 @@ How many monitoring deployments you use depends on your requirements: Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. -### Retention of monitoring daily indices [ec-logging-and-monitoring-retention] +### Retention of monitoring daily indices [ec-logging-and-monitoring-retention] -#### Stack versions 8.0 and above [ec-logging-and-monitoring-retention-8] +#### Stack versions 8.0 and above [ec-logging-and-monitoring-retention-8] When you enable monitoring in {{ech}}, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. The retention period is configured in the `.monitoring-8-ilm-policy` index lifecycle policy. To view or edit the policy open {{kib}} **Stack management > Data > Index Lifecycle Policies**. -### Sending monitoring data to itself (self monitoring) [ec-logging-and-monitoring-retention-self-monitoring] +### Sending monitoring data to itself (self monitoring) [ec-logging-and-monitoring-retention-self-monitoring] $$$ec-logging-and-monitoring-retention-7$$$ When you enable self-monitoring in {{ech}}, your monitoring indices are retained for a certain period by default. 
After the retention period has passed, the monitoring indices are deleted automatically. Monitoring data is retained for three days by default or as specified by the [`xpack.monitoring.history.duration` user setting](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md#xpack-monitoring-history-duration). @@ -65,7 +65,7 @@ PUT /_cluster/settings ``` -### Sending monitoring data to a dedicated monitoring deployment [ec-logging-and-monitoring-retention-dedicated-monitoring] +### Sending monitoring data to a dedicated monitoring deployment [ec-logging-and-monitoring-retention-dedicated-monitoring] When [monitoring for production use](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. There are three options open to you: @@ -98,17 +98,17 @@ When [monitoring for production use](../../../deploy-manage/monitor/stack-monito * To retain monitoring indices on a dedicated monitoring deployment as is without deleting them automatically, no additional steps are required other than making sure that you do not enable the monitoring deployment to send monitoring data to itself. You should also monitor the deployment for disk space usage and upgrade your deployment periodically, if necessary. -### Retention of logging indices [ec-logging-and-monitoring-log-retention] +### Retention of logging indices [ec-logging-and-monitoring-log-retention] An ILM policy is pre-configured to manage log retention. The policy can be adjusted according to your requirements. -### Index management [ec-logging-and-monitoring-index-management-ilm] +### Index management [ec-logging-and-monitoring-index-management-ilm] When sending monitoring data to a deployment, you can configure [Index Lifecycle Management (ILM)](../../../manage-data/lifecycle/index-lifecycle-management.md) to manage retention of your monitoring and logging indices. When sending logs to a deployment, an ILM policy is pre-configured to manage log retention and the policy can be customized to your needs. -### Enable logging and monitoring [ec-enable-logging-and-monitoring-steps] +### Enable logging and monitoring [ec-enable-logging-and-monitoring-steps] {{ech}} manages the installation and configuration of the monitoring agent for you. When you enable monitoring on a deployment, you are configuring where the monitoring agent for your current deployment should send its logs and metrics. @@ -125,23 +125,23 @@ To enable monitoring on your deployment: If a deployment is not listed, make sure that it is running a compatible version. The monitoring deployment and production deployment must be on the same major version, cloud provider, and region. - ::::{tip} + ::::{tip} Remember to send logs and metrics for production deployments to a dedicated monitoring deployment, so that your production deployments are not impacted by the overhead of indexing and storing monitoring data. A dedicated monitoring deployment also gives you more control over the retention period for monitoring data. :::: -::::{note} +::::{note} Enabling logs and monitoring may trigger a plan change on your deployment. You can monitor the plan change progress from the deployment’s **Activity** page. 
:::: -::::{note} +::::{note} Enabling logs and monitoring requires some extra resource on a deployment. For production systems, we recommend sizing deployments with logs and monitoring enabled to at least 4 GB of RAM. :::: -### Access the monitoring application in Kibana [ec-access-kibana-monitoring] +### Access the monitoring application in Kibana [ec-access-kibana-monitoring] With monitoring enabled for your deployment, you can access the [logs](https://www.elastic.co/guide/en/kibana/current/observability.html) and [stack monitoring](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md) through Kibana. @@ -165,28 +165,28 @@ Alternatively, you can access logs and metrics directly on the Kibana **Logs** a | `service.version` | The version of the stack resource that generated the log | `8.13.1` | -### Logging features [ec-extra-logging-features] +### Logging features [ec-extra-logging-features] When shipping logs to a monitoring deployment there are more logging features available to you. These features include: -#### For {{es}}: [ec-extra-logging-features-elasticsearch] +#### For {{es}}: [ec-extra-logging-features-elasticsearch] * [Audit logging](../../../deploy-manage/monitor/logging-configuration/enabling-audit-logs.md) - logs security-related events on your deployment -* [Slow query and index logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md) - helps find and debug slow queries and indexing +* [Slow query and index logging](elasticsearch://reference/elasticsearch/index-settings/slow-log.md) - helps find and debug slow queries and indexing * Verbose logging - helps debug stack issues by increasing component logs After you’ve enabled log delivery on your deployment, you can [add the Elasticsearch user settings](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to enable these features. -#### For Kibana: [ec-extra-logging-features-kibana] +#### For Kibana: [ec-extra-logging-features-kibana] * [Audit logging](../../../deploy-manage/monitor/logging-configuration/enabling-audit-logs.md) - logs security-related events on your deployment After you’ve enabled log delivery on your deployment, you can [add the Kibana user settings](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to enable this feature. -### Other components [ec-extra-logging-features-enterprise-search] +### Other components [ec-extra-logging-features-enterprise-search] Enabling log collection also supports collecting and indexing the following types of logs from other components in your deployments: @@ -204,12 +204,12 @@ The ˆ*ˆ indicates that we also index the archived files of each type of log. Check the respective product documentation for more information about the logging capabilities of each product. -## Metrics features [ec-extra-metrics-features] +## Metrics features [ec-extra-metrics-features] With logging and monitoring enabled for a deployment, metrics are collected for Elasticsearch, Kibana, and APM with Fleet Server. -#### Enabling Elasticsearch/Kibana audit logs on your deployment [ec-enable-audit-logs] +#### Enabling Elasticsearch/Kibana audit logs on your deployment [ec-enable-audit-logs] % Added by eedugon to audit logging in deploy and manage -> monitoring -> logging section Audit logs are useful for tracking security events on your {{es}} and/or {{kib}} clusters. 
To enable {{es}} audit logs on your deployment: diff --git a/raw-migrated-files/cloud/cloud/ec-maintenance-mode-routing.md b/raw-migrated-files/cloud/cloud/ec-maintenance-mode-routing.md index 223d5d9a3..badafd1ca 100644 --- a/raw-migrated-files/cloud/cloud/ec-maintenance-mode-routing.md +++ b/raw-migrated-files/cloud/cloud/ec-maintenance-mode-routing.md @@ -7,7 +7,7 @@ The {{ecloud}} proxy routes HTTP requests to its deployment’s individual produ It might be helpful to temporarily block upstream requests in order to protect some or all instances or products within your deployment. For example, you might stop request routing in the following cases: * If another team within your company starts streaming new data into your production {{integrations-server}} without previous load testing, both it and {{es}} might experience performance issues. You might consider stopping routing requests on all {{integrations-server}} instances in order to protect your downstream {{es}} instance. -* If {{es}} is being overwhelmed by upstream requests, it might experience increased response times or even become unresponsive. This might impact your ability to resize components in your deployment and increase the duration of pending plans or increase the chance of plan changes failing. Because every {{es}} node is an [implicit coordinating node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md), you should stop routing requests across all {{es}} nodes to completely block upstream traffic. +* If {{es}} is being overwhelmed by upstream requests, it might experience increased response times or even become unresponsive. This might impact your ability to resize components in your deployment and increase the duration of pending plans or increase the chance of plan changes failing. Because every {{es}} node is an [implicit coordinating node](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md), you should stop routing requests across all {{es}} nodes to completely block upstream traffic. ## Considerations [ec_considerations] diff --git a/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md b/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md index 059023584..5d197896b 100644 --- a/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md +++ b/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md @@ -588,7 +588,7 @@ This setting is not available in versions 8.0.0 through 8.2.0. As such, this set : When enabled, specifies the email address to receive cluster alert notifications. `xpack.monitoring.kibana.collection.interval` -: Controls [how often data samples are collected](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). +: Controls [how often data samples are collected](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md#monitoring-collection-settings). `xpack.monitoring.min_interval_seconds` : Specifies the minimum number of seconds that a time bucket in a chart can represent. If you modify the `xpack.monitoring.kibana.collection.interval`, use the same value in this setting. @@ -599,7 +599,7 @@ This setting is not available in versions 8.0.0 through 8.2.0. As such, this set `xpack.ml.enabled` : Set to true (default) to enable machine learning. - If set to `false` in `kibana.yml`, the machine learning icon is hidden in this Kibana instance. 
If `xpack.ml.enabled` is set to `true` in `elasticsearch.yml`, however, you can still use the machine learning APIs. To disable machine learning entirely, check the [Elasticsearch Machine Learning Settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md). + If set to `false` in `kibana.yml`, the machine learning icon is hidden in this Kibana instance. If `xpack.ml.enabled` is set to `true` in `elasticsearch.yml`, however, you can still use the machine learning APIs. To disable machine learning entirely, check the [Elasticsearch Machine Learning Settings](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md). #### Content security policy configuration [ec_content_security_policy_configuration] @@ -692,7 +692,7 @@ Each method has its own unique limitations which are important to understand. `xpack.reporting.csv.scroll.duration` -: Amount of [time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. Valid option is either `auto` or [time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units), Defaults to `30s`. +: Amount of [time](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. Valid options are `auto` or a [time](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units) value. Defaults to `30s`. ::::{note} Support for the `auto` option was added here. When the config value is set to `auto`, the scroll context is preserved for as long as possible, before the report task is terminated due to the limits of `xpack.reporting.queue.timeout`. @@ -757,7 +757,7 @@ Support for the The option `auto` was included here, when the config value is se Defaults to `true`. `xpack.reporting.csv.scroll.duration` -: Amount of [time](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. +: Amount of [time](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units) allowed before {{kib}} cleans the scroll context during a CSV export. Defaults to `30s` (30 seconds). diff --git a/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md b/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md index 13d58758a..521973328 100644 --- a/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md +++ b/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md @@ -27,7 +27,7 @@ After you have created a new deployment, you should enable shipping logs and met 5. Select **Save**. -Optionally, turn on [audit logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md) to capture security-related events, such as authentication failures, refused connections, and data-access events through the proxy. To turn on audit logging, [edit your deployment’s elasticsearch.yml file](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to add these lines: +Optionally, turn on [audit logging](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md) to capture security-related events, such as authentication failures, refused connections, and data-access events through the proxy. 
To turn on audit logging, [edit your deployment’s elasticsearch.yml file](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to add these lines: ```sh xpack.security.audit.enabled: true diff --git a/raw-migrated-files/docs-content/serverless/ai-assistant-knowledge-base.md b/raw-migrated-files/docs-content/serverless/ai-assistant-knowledge-base.md index 9d2c0b497..0d5fadaf3 100644 --- a/raw-migrated-files/docs-content/serverless/ai-assistant-knowledge-base.md +++ b/raw-migrated-files/docs-content/serverless/ai-assistant-knowledge-base.md @@ -117,7 +117,7 @@ Refer to the following video for an example of adding a document to Knowledge Ba Add an index as a knowledge source when you want new information added to that index to automatically inform AI Assistant’s responses. Common security examples include asset inventories, network configuration information, on-call matrices, threat intelligence reports, and vulnerability scans. ::::{important} -Indices added to Knowledge Base must have at least one field mapped as [semantic text](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md). +Indices added to Knowledge Base must have at least one field mapped as [semantic text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md). :::: diff --git a/raw-migrated-files/docs-content/serverless/detections-logsdb-index-mode-impact.md b/raw-migrated-files/docs-content/serverless/detections-logsdb-index-mode-impact.md index c5cae52bc..ebb99b4ce 100644 --- a/raw-migrated-files/docs-content/serverless/detections-logsdb-index-mode-impact.md +++ b/raw-migrated-files/docs-content/serverless/detections-logsdb-index-mode-impact.md @@ -2,22 +2,22 @@ Logsdb is enabled by default for {{serverless-full}}. This topic explains the impact of using logsdb index mode with {{sec-serverless}}. -With logsdb index mode, the original `_source` field is not stored in the index but can be reconstructed using [synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). +With logsdb index mode, the original `_source` field is not stored in the index but can be reconstructed using [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). -When the `_source` is reconstructed, [modifications](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) are possible. Therefore, there could be a mismatch between users' expectations and how fields are formatted. +When the `_source` is reconstructed, [modifications](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) are possible. Therefore, there could be a mismatch between users' expectations and how fields are formatted. Continue reading to find out how this affects specific {{sec-serverless}} components. -## Alerts [logsdb-alerts] +## Alerts [logsdb-alerts] When alerts are generated, the `_source` event is copied into the alert to retain the original data. When the logsdb index mode is applied, the `_source` event stored in the alert is reconstructed using synthetic `_source`. 
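You can observe this reshaping outside of any alert by indexing a document with a repeated keyword array into a logsdb index and reading it back (a minimal sketch using a hypothetical `logsdb-demo` index; the deduplication and sorting shown in the comment is the documented synthetic-source behavior for keyword arrays):

```shell
PUT logsdb-demo
{
  "settings": { "index.mode": "logsdb" },
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "host.name":  { "type": "keyword" },
      "tags":       { "type": "keyword" }
    }
  }
}

PUT logsdb-demo/_doc/1
{
  "@timestamp": "2024-01-01T00:00:00.000Z",
  "host.name": "web-01",
  "tags": ["web", "db", "web"]
}

# The reconstructed _source can come back with tags deduplicated and sorted: ["db", "web"]
GET logsdb-demo/_doc/1
```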
If you’re switching to use logsdb index mode, the `_source` field stored in the alert might look different in certain situations: -* [Arrays can be reconstructed differently or deduplicated](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-leaf-arrays) -* [Field names](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-field-names) -* `geo_point` data fields (refer to [Representation of ranges](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-ranges) and [Reduced precision of `geo_point` values](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-precision-loss-for-point-types) for more information) +* [Arrays can be reconstructed differently or deduplicated](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-leaf-arrays) +* [Field names](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-field-names) +* `geo_point` data fields (refer to [Representation of ranges](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-ranges) and [Reduced precision of `geo_point` values](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-precision-loss-for-point-types) for more information) Alerts generated by the following rule types could be affected: @@ -28,7 +28,7 @@ Alerts generated by the following rule types could be affected: Alerts that are generated by threshold, {{ml}}, and event correlation sequence rules are not affected since they do not contain copies of the original source. -## Rule actions [logsdb-rule-actions] +## Rule actions [logsdb-rule-actions] While we do not recommend using `_source` for actions, in cases where the action relies on the `_source`, the same limitations and changes apply. @@ -37,7 +37,7 @@ If you send alert notifications by enabling [actions](../../../explore-analyze/a We recommend checking and adjusting the rule actions using `_source` before switching to logsdb index mode. -## Runtime fields [logsdb-runtime-fields] +## Runtime fields [logsdb-runtime-fields] Runtime fields that reference `_source` may be affected. Some runtime fields might not work and need to be adjusted. For example, if an event was indexed with the value of `agent.name` in the dot-notation form, it will be returned in the nested form and might not work. diff --git a/raw-migrated-files/docs-content/serverless/elasticsearch-differences.md b/raw-migrated-files/docs-content/serverless/elasticsearch-differences.md index 2857d1d16..db61c8909 100644 --- a/raw-migrated-files/docs-content/serverless/elasticsearch-differences.md +++ b/raw-migrated-files/docs-content/serverless/elasticsearch-differences.md @@ -36,7 +36,7 @@ To ensure optimal performance, follow these recommendations for sizing individua For large datasets that exceed the recommended maximum size for a single index, consider splitting your data across smaller indices and using an alias to search them collectively. -These recommendations do not apply to indices using better binary quantization (BBQ). 
Refer to [vector quantization](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) in the core {{es}} docs for more information. +These recommendations do not apply to indices using better binary quantization (BBQ). Refer to [vector quantization](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) in the core {{es}} docs for more information. ## API availability [elasticsearch-differences-serverless-apis-availability] @@ -88,7 +88,7 @@ When attempting to use an unavailable API, you’ll receive a clear error messag ## Settings availability [elasticsearch-differences-serverless-settings-availability] -In {{es-serverless}}, you can only configure [index-level settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md). Cluster-level settings and node-level settings are not required by end users and the `elasticsearch.yml` file is fully managed by Elastic. +In {{es-serverless}}, you can only configure [index-level settings](elasticsearch://reference/elasticsearch/index-settings/index.md). Cluster-level settings and node-level settings are not required by end users and the `elasticsearch.yml` file is fully managed by Elastic. Available settings : **Index-level settings**: Settings that control how {{es}} documents are processed, stored, and searched are available to end users. These include: @@ -148,6 +148,6 @@ The following features are not available in {{es-serverless}} and are not planne * [Custom plugins and bundles](/deploy-manage/deploy/elastic-cloud/upload-custom-plugins-bundles.md) * [{{es}} for Apache Hadoop](elasticsearch-hadoop://reference/index.md) -* [Scripted metric aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) +* [Scripted metric aggregations](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) * Managed web crawler: You can use the [self-managed web crawler](https://github.com/elastic/crawler) instead. -* Managed Search connectors: You can use [self-managed Search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) instead. +* Managed Search connectors: You can use [self-managed Search connectors](elasticsearch://reference/ingestion-tools/search-connectors/self-managed-connectors.md) instead. diff --git a/raw-migrated-files/docs-content/serverless/observability-plaintext-application-logs.md b/raw-migrated-files/docs-content/serverless/observability-plaintext-application-logs.md index d1fc4e6c8..025e8dbc1 100644 --- a/raw-migrated-files/docs-content/serverless/observability-plaintext-application-logs.md +++ b/raw-migrated-files/docs-content/serverless/observability-plaintext-application-logs.md @@ -259,7 +259,7 @@ Also, refer to [{{filebeat}} and systemd](asciidocalypse://docs/beats/docs/refer Use an ingest pipeline to parse the contents of your logs into structured, [Elastic Common Schema (ECS)](asciidocalypse://docs/ecs/docs/reference/index.md)-compatible fields. -Create an ingest pipeline with a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured ECS fields from your log messages. 
In your project, go to **Developer Tools** and use a command similar to the following example: +Create an ingest pipeline with a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured ECS fields from your log messages. In your project, go to **Developer Tools** and use a command similar to the following example: ```shell PUT _ingest/pipeline/filebeat* <1> @@ -277,7 +277,7 @@ PUT _ingest/pipeline/filebeat* <1> ``` 1. `_ingest/pipeline/filebeat*`: The name of the pipeline. Update the pipeline name to match the name of your data stream. For more information, refer to [Data stream naming scheme](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/data-streams.md#data-streams-naming-scheme). -2. `processors.dissect`: Adds a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message. +2. `processors.dissect`: Adds a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message. 3. `field`: The field you’re extracting data from, `message` in this case. 4. `pattern`: The pattern of the elements in your log data. The pattern varies depending on your log format. `%{@timestamp}`, `%{log.level}`, `%{host.ip}`, and `%{{message}}` are common [ECS](asciidocalypse://docs/ecs/docs/reference/index.md) fields. This pattern would match a log file in this format: `2023-11-07T09:39:01.012Z ERROR 192.168.1.110 Server hardware failure detected.` @@ -344,7 +344,7 @@ To aggregate or search for information in plaintext logs, use an ingest pipeline 2. Select the integration policy you created in the previous section. 3. Click **Change defaults** → **Advanced options**. 4. Under **Ingest pipelines**, click **Add custom pipeline**. -5. Create an ingest pipeline with a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log messages. +5. Create an ingest pipeline with a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log messages. Click **Import processors** and add a similar JSON to the following example: @@ -362,7 +362,7 @@ To aggregate or search for information in plaintext logs, use an ingest pipeline } ``` - 1. `processors.dissect`: Adds a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message. + 1. `processors.dissect`: Adds a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message. 2. `field`: The field you’re extracting data from, `message` in this case. 3. `pattern`: The pattern of the elements in your log data. The pattern varies depending on your log format. `%{@timestamp}`, `%{log.level}`, `%{host.ip}`, and `%{{message}}` are common [ECS](asciidocalypse://docs/ecs/docs/reference/index.md) fields. 
This pattern would match a log file in this format: `2023-11-07T09:39:01.012Z ERROR 192.168.1.110 Server hardware failure detected.` diff --git a/raw-migrated-files/docs-content/serverless/security-about-rules.md b/raw-migrated-files/docs-content/serverless/security-about-rules.md index 8d8cb08c2..fbe959915 100644 --- a/raw-migrated-files/docs-content/serverless/security-about-rules.md +++ b/raw-migrated-files/docs-content/serverless/security-about-rules.md @@ -25,7 +25,7 @@ You can create the following types of rules: For example, if the threshold `field` is `source.ip` and its `value` is `10`, an alert is generated for every source IP address that appears in at least 10 of the rule’s search results. * [**Event correlation**](../../../solutions/security/detect-and-alert/create-detection-rule.md#create-eql-rule): Searches the defined indices and creates an alert when results match an [Event Query Language (EQL)](../../../explore-analyze/query-filter/languages/eql.md) query. -* [**Indicator match**](../../../solutions/security/detect-and-alert/create-detection-rule.md#create-indicator-rule): Creates an alert when {{elastic-sec}} index field values match field values defined in the specified indicator index patterns. For example, you can create an indicator index for IP addresses and use this index to create an alert whenever an event’s `destination.ip` equals a value in the index. Indicator index field mappings should be [ECS-compliant](https://www.elastic.co/guide/en/ecs/current). For information on creating {{es}} indices and field types, see [Index some documents](https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-general-purpose.html#gp-gs-add-data), [Create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create), and [Field data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). If you have indicators in a standard file format, such as CSV or JSON, you can also use the Machine Learning Data Visualizer to import your indicators into an indicator index. See [Explore the data in {{kib}}](../../../explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md#sample-data-visualizer) and use the **Import Data** option to import your indicators. +* [**Indicator match**](../../../solutions/security/detect-and-alert/create-detection-rule.md#create-indicator-rule): Creates an alert when {{elastic-sec}} index field values match field values defined in the specified indicator index patterns. For example, you can create an indicator index for IP addresses and use this index to create an alert whenever an event’s `destination.ip` equals a value in the index. Indicator index field mappings should be [ECS-compliant](https://www.elastic.co/guide/en/ecs/current). For information on creating {{es}} indices and field types, see [Index some documents](https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-general-purpose.html#gp-gs-add-data), [Create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create), and [Field data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). If you have indicators in a standard file format, such as CSV or JSON, you can also use the Machine Learning Data Visualizer to import your indicators into an indicator index. 
See [Explore the data in {{kib}}](../../../explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md#sample-data-visualizer) and use the **Import Data** option to import your indicators. ::::{tip} You can also use value lists as the indicator match index. See [Use value lists with indicator match rules](../../../solutions/security/detect-and-alert/create-detection-rule.md#indicator-value-lists) at the end of this topic for more information. diff --git a/raw-migrated-files/docs-content/serverless/security-data-quality-dash.md b/raw-migrated-files/docs-content/serverless/security-data-quality-dash.md index a1e3487c3..68cb8c8fc 100644 --- a/raw-migrated-files/docs-content/serverless/security-data-quality-dash.md +++ b/raw-migrated-files/docs-content/serverless/security-data-quality-dash.md @@ -70,7 +70,7 @@ After an index is checked, a `Pass` or `Fail` status appears. `Fail` indicates m The index check flyout provides more information about the status of fields in that index. Each of its tabs describe fields grouped by mapping status. ::::{note} -Fields in the Same family category have the correct search behavior, but might have different storage or performance characteristics (for example, you can index strings to both text and keyword fields). To learn more, refer to [Field data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). +Fields in the Same family category have the correct search behavior, but might have different storage or performance characteristics (for example, you can index strings to both text and keyword fields). To learn more, refer to [Field data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). :::: diff --git a/raw-migrated-files/docs-content/serverless/security-interactive-investigation-guides.md b/raw-migrated-files/docs-content/serverless/security-interactive-investigation-guides.md index 9c5dd34a4..1ebb865f7 100644 --- a/raw-migrated-files/docs-content/serverless/security-interactive-investigation-guides.md +++ b/raw-migrated-files/docs-content/serverless/security-interactive-investigation-guides.md @@ -75,7 +75,7 @@ The following syntax defines a query button in an interactive investigation guid | `label` | Identifying text on the button. | | `description` | Additional text included with the button. | | `providers` | A two-level nested array that defines the query to run in Timeline. Similar to the structure of queries in Timeline, items in the outer level are joined by an `OR` relationship, and items in the inner level are joined by an `AND` relationship.

Each item in `providers` corresponds to a filter created in the query builder UI and is defined by these attributes:

* `field`: The name of the field to query.
* `excluded`: Whether the query result is excluded (such as **is not one of**) or included (**is one of**).
* `queryType`: The query type used to filter events, based on the filter’s operator. For example, `phrase` or `range`.
* `value`: The value to search for. Either a hard-coded literal value, or the name of an alert field (in double curly brackets) whose value you want to use as a query parameter.
* `valueType`: The data type of `value`, such as `string` or `boolean`.
| -| `relativeFrom`, `relativeTo` | (Optional) The start and end, respectively, of the relative time range for the query. Times are relative to the alert’s creation time, represented as `now` in [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math) format. For example, selecting **Last 15 minutes** in the query builder form creates the syntax `"relativeFrom": "now-15m", "relativeTo": "now"`. | +| `relativeFrom`, `relativeTo` | (Optional) The start and end, respectively, of the relative time range for the query. Times are relative to the alert’s creation time, represented as `now` in [date math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math) format. For example, selecting **Last 15 minutes** in the query builder form creates the syntax `"relativeFrom": "now-15m", "relativeTo": "now"`. | ::::{note} Some characters must be escaped with a backslash, such as `\"` for a quotation mark and `\\` for a literal backslash. Divide Windows paths with double backslashes (for example, `C:\\Windows\\explorer.exe`), and paths that already include double backslashes might require four backslashes for each divider. A clickable error icon (![Error](../../../images/serverless-error.svg "")) displays below the Markdown editor if there are any syntax errors. diff --git a/raw-migrated-files/docs-content/serverless/security-rules-create.md b/raw-migrated-files/docs-content/serverless/security-rules-create.md index da695d1a1..8f291b023 100644 --- a/raw-migrated-files/docs-content/serverless/security-rules-create.md +++ b/raw-migrated-files/docs-content/serverless/security-rules-create.md @@ -149,10 +149,10 @@ To create or edit {{ml}} rules, you need an appropriate user role. Additionally, 2. To create an event correlation rule using EQL, select **Event Correlation**, then: 1. Define which {{es}} indices or data view the rule searches when querying for events. - 2. Write an [EQL query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md) that searches for matching events or a series of matching events. + 2. Write an [EQL query](elasticsearch://reference/query-languages/eql-syntax.md) that searches for matching events or a series of matching events. ::::{tip} - To find events that are missing in a sequence, use the [missing events](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-missing-events) syntax. + To find events that are missing in a sequence, use the [missing events](elasticsearch://reference/query-languages/eql-syntax.md#eql-missing-events) syntax. :::: @@ -190,7 +190,7 @@ To create or edit {{ml}} rules, you need an appropriate user role. Additionally, 3. (Optional) Click the EQL settings icon (![EQL settings](../../../images/serverless-controlsVertical.svg "")) to configure additional fields used by [EQL search](../../../explore-analyze/query-filter/languages/eql.md#specify-a-timestamp-or-event-category-field): - * **Event category field**: Contains the event classification, such as `process`, `file`, or `network`. This field is typically mapped as a field type in the [keyword family](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md). Defaults to the `event.category` ECS field. + * **Event category field**: Contains the event classification, such as `process`, `file`, or `network`. 
This field is typically mapped as a field type in the [keyword family](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md). Defaults to the `event.category` ECS field. * **Tiebreaker field**: Sets a secondary field for sorting events (in ascending, lexicographic order) if they have the same timestamp. * **Timestamp field**: Contains the event timestamp used for sorting a sequence of events. This is different from the **Timestamp override** advanced setting, which is used for querying events within a range. Defaults to the `@timestamp` ECS field. @@ -391,7 +391,7 @@ To create an {{esql}} rule: #### Aggregating query [esql-agg-query] -Aggregating queries use [`STATS...BY`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-agg-functions) functions to aggregate source event data. Alerts generated by a rule with an aggregating query only contain the fields that the {{esql}} query returns and any new fields that the query creates. +Aggregating queries use [`STATS...BY`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-agg-functions) functions to aggregate source event data. Alerts generated by a rule with an aggregating query only contain the fields that the {{esql}} query returns and any new fields that the query creates. ::::{note} A *new field* is a field that doesn’t exist in the query’s source index and is instead created when the rule runs. You can access new fields in the details of any alerts that are generated by the rule. For example, if you use the `STATS...BY` function to create a column with aggregated values, the column is created when the rule runs and is added as a new field to any alerts that are generated by the rule. @@ -425,7 +425,7 @@ Rules that use aggregating queries might create duplicate alerts. This can happe Non-aggregating queries don’t use `STATS...BY` functions and don’t aggregate source event data. Alerts generated by a non-aggregating query contain source event fields that the query returns, new fields the query creates, and all other fields in the source event document. ::::{note} -A *new field* is a field that doesn’t exist in the query’s source index and is instead created when the rule runs. You can access new fields in the details of any alerts that are generated by the rule. For example, if you use the [`EVAL`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-eval) command to append new columns with calculated values, the columns are created when the rule runs and are added as new fields to any alerts generated by the rule. +A *new field* is a field that doesn’t exist in the query’s source index and is instead created when the rule runs. You can access new fields in the details of any alerts that are generated by the rule. For example, if you use the [`EVAL`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-eval) command to append new columns with calculated values, the columns are created when the rule runs and are added as new fields to any alerts generated by the rule. :::: @@ -455,7 +455,7 @@ FROM logs-* METADATA _id, _index, _version When those metadata fields are provided, unique alert IDs are created for each alert generated by the query. -When developing the query, make sure you don’t [`DROP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-drop) or filter out the `_id`, `_index`, or `_version` metadata fields. 
+When developing the query, make sure you don’t [`DROP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-drop) or filter out the `_id`, `_index`, or `_version` metadata fields. Here is an example of a query that fails to deduplicate alerts. It uses the `DROP` command to omit the `_id` property from the results table: @@ -480,11 +480,11 @@ FROM logs-* METADATA _id, _index, _version When writing your query, consider the following: -* The [`LIMIT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-limit) command specifies the maximum number of rows an {{esql}} query returns and the maximum number of alerts created per rule run. Similarly, a detection rule’s **Max alerts per run** setting specifies the maximum number of alerts it can create every time it runs. +* The [`LIMIT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-limit) command specifies the maximum number of rows an {{esql}} query returns and the maximum number of alerts created per rule run. Similarly, a detection rule’s **Max alerts per run** setting specifies the maximum number of alerts it can create every time it runs. If the `LIMIT` value and **Max alerts per run** value are different, the rule uses the lower value to determine the maximum number of alerts the rule generates. -* When writing an aggregating query, use the [`STATS...BY`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-stats-by) command with fields that you want to search and filter for after alerts are created. For example, using the `host.name`, `user.name`, `process.name` fields with the `BY` operator of the `STATS...BY` command returns these fields in alert documents, and allows you to search and filter for them from the Alerts table. +* When writing an aggregating query, use the [`STATS...BY`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-stats-by) command with fields that you want to search and filter for after alerts are created. For example, using the `host.name`, `user.name`, `process.name` fields with the `BY` operator of the `STATS...BY` command returns these fields in alert documents, and allows you to search and filter for them from the Alerts table. * When configuring alert suppression on a non-aggregating query, we recommend sorting results by ascending `@timestamp` order. Doing so ensures that alerts are properly suppressed, especially if the number of alerts generated is higher than the **Max alerts per run** value. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-fixed-decider.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-fixed-decider.md index 88ea10ff1..6b41028d9 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-fixed-decider.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-fixed-decider.md @@ -1,11 +1,11 @@ # Fixed decider [autoscaling-fixed-decider] -::::{warning} +::::{warning} This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. :::: -::::{warning} +::::{warning} The fixed decider is intended for testing only. Do not use this decider in production. 
:::: @@ -15,10 +15,10 @@ The [autoscaling](../../../deploy-manage/autoscaling.md) `fixed` decider respond ## Configuration settings [_configuration_settings] `storage` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Required amount of node-level storage. Defaults to `-1` (disabled). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Required amount of node-level storage. Defaults to `-1` (disabled). `memory` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Required amount of node-level memory. Defaults to `-1` (disabled). +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) Required amount of node-level memory. Defaults to `-1` (disabled). `processors` : (Optional, float) Required number of processors. Defaults to disabled. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-frozen-shards-decider.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-frozen-shards-decider.md index ef1921b2f..e54a37130 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-frozen-shards-decider.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-frozen-shards-decider.md @@ -5,6 +5,6 @@ The [autoscaling](../../../deploy-manage/autoscaling.md) frozen shards decider ( ## Configuration settings [autoscaling-frozen-shards-decider-settings] `memory_per_shard` -: (Optional, [byte value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) The memory needed per shard, in bytes. Defaults to 2000 shards per 64 GB node (roughly 32 MB per shard). Notice that this is total memory, not heap, assuming that the Elasticsearch default heap sizing mechanism is used and that nodes are not bigger than 64 GB. +: (Optional, [byte value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#byte-units)) The memory needed per shard, in bytes. Defaults to 2000 shards per 64 GB node (roughly 32 MB per shard). Notice that this is total memory, not heap, assuming that the Elasticsearch default heap sizing mechanism is used and that nodes are not bigger than 64 GB. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-machine-learning-decider.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-machine-learning-decider.md index 106eacf3b..2ea195129 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-machine-learning-decider.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-machine-learning-decider.md @@ -4,8 +4,8 @@ The [autoscaling](../../../deploy-manage/autoscaling.md) {{ml}} decider (`ml`) c The {{ml}} decider is enabled for policies governing `ml` nodes. -::::{note} -For {{ml}} jobs to open when the cluster is not appropriately scaled, set `xpack.ml.max_lazy_ml_nodes` to the largest number of possible {{ml}} nodes (refer to [Advanced machine learning settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/machine-learning-settings.md#advanced-ml-settings) for more information). In {{ech}}, this is automatically set. 
+::::{note} +For {{ml}} jobs to open when the cluster is not appropriately scaled, set `xpack.ml.max_lazy_ml_nodes` to the largest number of possible {{ml}} nodes (refer to [Advanced machine learning settings](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md#advanced-ml-settings) for more information). In {{ech}}, this is automatically set. :::: @@ -20,7 +20,7 @@ Both `num_anomaly_jobs_in_queue` and `num_analytics_jobs_in_queue` are designed : (Optional, integer) Specifies the number of queued {{dfanalytics-jobs}} to allow. Defaults to `0`. `down_scale_delay` -: (Optional, [time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) Specifies the time to delay before scaling down. Defaults to 1 hour. If a scale down is possible for the entire time window, then a scale down is requested. If the cluster requires a scale up during the window, the window is reset. +: (Optional, [time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) Specifies the time to delay before scaling down. Defaults to 1 hour. If a scale down is possible for the entire time window, then a scale down is requested. If the cluster requires a scale up during the window, the window is reset. ## {{api-examples-title}} [autoscaling-machine-learning-decider-examples] diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-proactive-storage-decider.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-proactive-storage-decider.md index 754178efb..17b2a7c9a 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-proactive-storage-decider.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-proactive-storage-decider.md @@ -9,7 +9,7 @@ The estimation of expected additional data is based on past indexing that occurr ## Configuration settings [autoscaling-proactive-storage-decider-settings] `forecast_window` -: (Optional, [time value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#time-units)) The window of time to use for forecasting. Defaults to 30 minutes. +: (Optional, [time value](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#time-units)) The window of time to use for forecasting. Defaults to 30 minutes. ## {{api-examples-title}} [autoscaling-proactive-storage-decider-examples] diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-reactive-storage-decider.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-reactive-storage-decider.md index 0b3d0e83d..7dba43dd9 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-reactive-storage-decider.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/autoscaling-reactive-storage-decider.md @@ -4,5 +4,5 @@ The [autoscaling](../../../deploy-manage/autoscaling.md) reactive storage decide The reactive storage decider is enabled for all policies governing data nodes and has no configuration options. -The decider relies partially on using [data tier preference](../../../manage-data/lifecycle/data-tiers.md#data-tier-allocation) allocation rather than node attributes. In particular, scaling a data tier into existence (starting the first node in a tier) will result in starting a node in any data tier that is empty if not using allocation based on data tier preference. 
Using the [ILM migrate](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action to migrate between tiers is the preferred way of allocating to tiers and fully supports scaling a tier into existence. +The decider relies partially on using [data tier preference](../../../manage-data/lifecycle/data-tiers.md#data-tier-allocation) allocation rather than node attributes. In particular, scaling a data tier into existence (starting the first node in a tier) will result in starting a node in any data tier that is empty if not using allocation based on data tier preference. Using the [ILM migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action to migrate between tiers is the preferred way of allocating to tiers and fully supports scaling a tier into existence. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks-xpack.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks-xpack.md index e627ecbbb..229f36a94 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks-xpack.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks-xpack.md @@ -3,21 +3,21 @@ In addition to the [{{es}} bootstrap checks](../../../deploy-manage/deploy/self-managed/bootstrap-checks.md), there are checks that are specific to {{xpack}} features. -## Encrypt sensitive data check [bootstrap-checks-xpack-encrypt-sensitive-data] +## Encrypt sensitive data check [bootstrap-checks-xpack-encrypt-sensitive-data] If you use {{watcher}} and have chosen to encrypt sensitive data (by setting `xpack.watcher.encrypt_sensitive_data` to `true`), you must also place a key in the secure settings store. To pass this bootstrap check, you must set the `xpack.watcher.encryption_key` on each node in the cluster. For more information, see [Encrypting sensitive data in Watcher](../../../explore-analyze/alerts-cases/watcher/encrypting-data.md). -## PKI realm check [bootstrap-checks-xpack-pki-realm] +## PKI realm check [bootstrap-checks-xpack-pki-realm] If you use {{es}} {{security-features}} and a Public Key Infrastructure (PKI) realm, you must configure Transport Layer Security (TLS) on your cluster and enable client authentication on the network layers (either transport or http). For more information, see [PKI user authentication](../../../deploy-manage/users-roles/cluster-or-deployment-auth/pki.md) and [Set up basic security plus HTTPS](../../../deploy-manage/security/set-up-basic-security-plus-https.md). To pass this bootstrap check, if a PKI realm is enabled, you must configure TLS and enable client authentication on at least one network communication layer. -## Role mappings check [bootstrap-checks-xpack-role-mappings] +## Role mappings check [bootstrap-checks-xpack-role-mappings] If you authenticate users with realms other than `native` or `file` realms, you must create role mappings. These role mappings define which roles are assigned to each user. @@ -26,11 +26,11 @@ If you use files to manage the role mappings, you must configure a YAML file and To pass this bootstrap check, the role mapping files must exist and must be valid. The Distinguished Names (DNs) that are listed in the role mappings files must also be valid. -## SSL/TLS check [bootstrap-checks-tls] +## SSL/TLS check [bootstrap-checks-tls] If you enable {{es}} {{security-features}}, unless you have a trial license, you must configure SSL/TLS for internode-communication. 
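If you are setting up internode TLS from scratch, the bundled `elasticsearch-certutil` CLI can generate the required certificates. A minimal sketch (the output file names shown are the tool's defaults):

```sh
# Create a certificate authority for your cluster (produces elastic-stack-ca.p12 by default)
bin/elasticsearch-certutil ca

# Create a certificate and private key signed by that CA (produces elastic-certificates.p12)
bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
```

The resulting keystore can then be referenced from the `xpack.security.transport.ssl.*` settings on each node.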
-::::{note} +::::{note} Single-node clusters that use a loopback interface do not have this requirement. For more information, see [*Start the {{stack}} with security enabled automatically*](../../../deploy-manage/security/security-certificates-keys.md). :::: @@ -38,11 +38,11 @@ Single-node clusters that use a loopback interface do not have this requirement. To pass this bootstrap check, you must [set up SSL/TLS in your cluster](../../../deploy-manage/security/set-up-basic-security.md#encrypt-internode-communication). -## Token SSL check [bootstrap-checks-xpack-token-ssl] +## Token SSL check [bootstrap-checks-xpack-token-ssl] If you use {{es}} {{security-features}} and the built-in token service is enabled, you must configure your cluster to use SSL/TLS for the HTTP interface. HTTPS is required in order to use the token service. -In particular, if `xpack.security.authc.token.enabled` is set to `true` in the `elasticsearch.yml` file, you must also set `xpack.security.http.ssl.enabled` to `true`. For more information about these settings, see [Security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md) and [Advanced HTTP settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#http-settings). +In particular, if `xpack.security.authc.token.enabled` is set to `true` in the `elasticsearch.yml` file, you must also set `xpack.security.http.ssl.enabled` to `true`. For more information about these settings, see [Security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md) and [Advanced HTTP settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#http-settings). To pass this bootstrap check, you must enable HTTPS or disable the built-in token service. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks.md index 7a89c61d0..e8792692f 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/bootstrap-checks.md @@ -7,21 +7,21 @@ These bootstrap checks inspect a variety of Elasticsearch and system settings an There are some bootstrap checks that are always enforced to prevent Elasticsearch from running with incompatible settings. These checks are documented individually. -## Development vs. production mode [dev-vs-prod-mode] +## Development vs. production mode [dev-vs-prod-mode] -By default, {{es}} binds to loopback addresses for [HTTP and transport (internal) communication](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md). This is fine for downloading and playing with {{es}} as well as everyday development, but it’s useless for production systems. To join a cluster, an {{es}} node must be reachable via transport communication. To join a cluster via a non-loopback address, a node must bind transport to a non-loopback address and not be using [single-node discovery](../../../deploy-manage/deploy/self-managed/bootstrap-checks.md#single-node-discovery). Thus, we consider an Elasticsearch node to be in development mode if it can not form a cluster with another machine via a non-loopback address, and is otherwise in production mode if it can join a cluster via non-loopback addresses. 
+By default, {{es}} binds to loopback addresses for [HTTP and transport (internal) communication](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md). This is fine for downloading and playing with {{es}} as well as everyday development, but it’s useless for production systems. To join a cluster, an {{es}} node must be reachable via transport communication. To join a cluster via a non-loopback address, a node must bind transport to a non-loopback address and not be using [single-node discovery](../../../deploy-manage/deploy/self-managed/bootstrap-checks.md#single-node-discovery). Thus, we consider an Elasticsearch node to be in development mode if it cannot form a cluster with another machine via a non-loopback address, and is otherwise in production mode if it can join a cluster via non-loopback addresses.

-Note that HTTP and transport can be configured independently via [`http.host`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#http-settings) and [`transport.host`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings); this can be useful for configuring a single node to be reachable via HTTP for testing purposes without triggering production mode.
+Note that HTTP and transport can be configured independently via [`http.host`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#http-settings) and [`transport.host`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings); this can be useful for configuring a single node to be reachable via HTTP for testing purposes without triggering production mode.

-## Single-node discovery [single-node-discovery]
+## Single-node discovery [single-node-discovery]

We recognize that some users need to bind the transport to an external interface for testing a remote-cluster configuration. For this situation, we provide the discovery type `single-node` (configure it by setting `discovery.type` to `single-node`); in this situation, a node will elect itself master and will not join a cluster with any other node.

-## Forcing the bootstrap checks [_forcing_the_bootstrap_checks]
+## Forcing the bootstrap checks [_forcing_the_bootstrap_checks]

-If you are running a single node in production, it is possible to evade the bootstrap checks (either by not binding transport to an external interface, or by binding transport to an external interface and setting the discovery type to `single-node`). For this situation, you can force execution of the bootstrap checks by setting the system property `es.enforce.bootstrap.checks` to `true` in the [JVM options](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options). We strongly encourage you to do this if you are in this specific situation. This system property can be used to force execution of the bootstrap checks independent of the node configuration.
+If you are running a single node in production, it is possible to evade the bootstrap checks (either by not binding transport to an external interface, or by binding transport to an external interface and setting the discovery type to `single-node`).
For this situation, you can force execution of the bootstrap checks by setting the system property `es.enforce.bootstrap.checks` to `true` in the [JVM options](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options). We strongly encourage you to do this if you are in this specific situation. This system property can be used to force execution of the bootstrap checks independent of the node configuration. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/change-passwords-native-users.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/change-passwords-native-users.md index 61559bd13..9996cbaad 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/change-passwords-native-users.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/change-passwords-native-users.md @@ -1,6 +1,6 @@ # Setting passwords for native and built-in users [change-passwords-native-users] -After you implement security, you might need or want to change passwords for different users. You can use the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool or the [change passwords API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) to change passwords for native users and [built-in users](../../../deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md), such as the `elastic` or `kibana_system` users. +After you implement security, you might need or want to change passwords for different users. You can use the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool or the [change passwords API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) to change passwords for native users and [built-in users](../../../deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md), such as the `elastic` or `kibana_system` users. For example, the following command changes the password for a user with the username `user1` to an auto-generated value, and prints the new password to the terminal: diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/configuring-stack-security.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/configuring-stack-security.md index f68ea7803..410ff8170 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/configuring-stack-security.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/configuring-stack-security.md @@ -34,7 +34,7 @@ There are [some cases](../../../deploy-manage/security/security-certificates-key 2. Copy the generated `elastic` password and enrollment token. These credentials are only shown when you start {{es}} for the first time. ::::{note} - If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. 
+ If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. To generate new enrollment tokens for {{kib}} or {{es}} nodes, run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool. These tools are available in the {{es}} `bin` directory. :::: @@ -85,11 +85,11 @@ When {{es}} starts for the first time, the security auto-configuration process b Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically. -Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. +Additionally, only nodes on the same host can join the cluster without additional configuration. If you want nodes from another host to join your cluster, you need to set `transport.host` to a [supported value](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#network-interface-values) (such as uncommenting the suggested value of `0.0.0.0`), or an IP address that’s bound to an interface where other hosts can reach it. Refer to [transport settings](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-settings) for more information. To enroll new nodes in your cluster, create an enrollment token with the `elasticsearch-create-enrollment-token` tool on any existing node in your cluster. You can then start a new node with the `--enrollment-token` parameter so that it joins an existing cluster. -1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes. +1. In a separate terminal from where {{es}} is running, navigate to the directory where you installed {{es}} and run the [`elasticsearch-create-enrollment-token`](elasticsearch://reference/elasticsearch/command-line-tools/create-enrollment-token.md) tool to generate an enrollment token for your new nodes. ```sh bin/elasticsearch-create-enrollment-token -s node @@ -172,7 +172,7 @@ When you install {{es}}, the following certificates and keys are generated in th `transport.p12` : Keystore that contains the key and certificate for the transport layer for all the nodes in your cluster. -`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. 
{{es}} stores the passwords for these keystores as [secure settings](../../../deploy-manage/security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. +`http.p12` and `transport.p12` are password-protected PKCS#12 keystores. {{es}} stores the passwords for these keystores as [secure settings](../../../deploy-manage/security/secure-settings.md). To retrieve the passwords so that you can inspect or change the keystore contents, use the [`bin/elasticsearch-keystore`](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) tool. Use the following command to retrieve the password for `http.p12`: @@ -223,11 +223,11 @@ The {{es}} configuration directory isn’t writable The following settings are incompatible with security auto configuration. If any of these settings exist, the node startup process skips configuring security automatically and the node starts normally. -* [`node.roles`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) is set to a value where the node can’t be elected as `master`, or if the node can’t hold data -* [`xpack.security.autoconfiguration.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) is set to `false` -* [`xpack.security.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) has a value set -* Any of the [`xpack.security.transport.ssl.*`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings) or [`xpack.security.http.ssl.*`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#http-tls-ssl-settings) settings have a value set in the `elasticsearch.yml` configuration file or in the `elasticsearch.keystore` -* Any of the `discovery.type`, `discovery.seed_hosts`, or `cluster.initial_master_nodes` [discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) have a value set +* [`node.roles`](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) is set to a value where the node can’t be elected as `master`, or if the node can’t hold data +* [`xpack.security.autoconfiguration.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) is set to `false` +* [`xpack.security.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#general-security-settings) has a value set +* Any of the [`xpack.security.transport.ssl.*`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings) or [`xpack.security.http.ssl.*`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#http-tls-ssl-settings) settings have a value set in the `elasticsearch.yml` configuration file or in the `elasticsearch.keystore` +* Any of the `discovery.type`, `discovery.seed_hosts`, or `cluster.initial_master_nodes` [discovery and cluster formation 
settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) have a value set

::::{note}
Exceptions are when `discovery.type` is set to `single-node`, or when `cluster.initial_master_nodes` exists but contains only the name of the current node.

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/defining-roles.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/defining-roles.md
index 7525007da..2bb8838a4 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/defining-roles.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/defining-roles.md
@@ -23,8 +23,8 @@ A role is defined by the following JSON structure:
5. A list of application privilege entries. This field is optional.
6. A list of indices permissions entries for [remote clusters configured with the API key based model](../../../deploy-manage/remote-clusters/remote-clusters-api-key.md). This field is optional (missing `remote_indices` privileges effectively mean no index level permissions for any API key based remote clusters).
7. A list of cluster permissions entries for [remote clusters configured with the API key based model](../../../deploy-manage/remote-clusters/remote-clusters-api-key.md). This field is optional (missing `remote_cluster` privileges effectively means no additional cluster permissions for any API key based remote clusters).
-8. Metadata field associated with the role, such as `metadata.app_tag`. Metadata is internally indexed as a [flattened](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/flattened.md) field type. This means that all sub-fields act like `keyword` fields when querying and sorting. Metadata values can be simple values, but also lists and maps. This field is optional.
-9. A string value with the description text of the role. The maximum length of it is `1000` chars. The field is internally indexed as a [text](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#text-field-type) field type (with default values for all parameters). This field is optional.
+8. Metadata field associated with the role, such as `metadata.app_tag`. Metadata is internally indexed as a [flattened](elasticsearch://reference/elasticsearch/mapping-reference/flattened.md) field type. This means that all sub-fields act like `keyword` fields when querying and sorting. Metadata values can be simple values, but also lists and maps. This field is optional.
+9. A string value with the description text of the role. Its maximum length is `1000` characters. The field is internally indexed as a [text](elasticsearch://reference/elasticsearch/mapping-reference/text.md#text-field-type) field type (with default values for all parameters). This field is optional.

::::{note}

@@ -143,7 +143,7 @@ The remote indices privileges entry has an extra mandatory `clusters` field comp
}
```

-1. A list of remote cluster aliases.
It supports literal strings as well as [wildcards](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) and [regular expressions](elasticsearch://reference/query-languages/regexp-syntax.md). This field is required. 2. A list of data streams, indices, and aliases to which the permissions in this entry apply. Supports wildcards (`*`). 3. The index level privileges the owners of the role have on the associated data streams and indices specified in the `names` argument. 4. Specification for document fields the owners of the role have read access to. See [Setting up field and document level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md) for details. @@ -170,7 +170,7 @@ The following describes the structure of a remote cluster permissions entry: } ``` -1. A list of remote cluster aliases. It supports literal strings as well as [wildcards](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) and [regular expressions](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/regexp-syntax.md). This field is required. +1. A list of remote cluster aliases. It supports literal strings as well as [wildcards](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) and [regular expressions](elasticsearch://reference/query-languages/regexp-syntax.md). This field is required. 2. The cluster level privileges for the remote cluster. The allowed values here are a subset of the [cluster privileges](../../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-cluster). The [builtin privileges API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-get-builtin-privileges) can be used to determine which privileges are allowed here. This field is required. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/documents-indices.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/documents-indices.md index 46ac46928..42fe29605 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/documents-indices.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/documents-indices.md @@ -9,14 +9,14 @@ The index is the fundamental unit of storage in {{es}}, a logical namespace for An index is a collection of documents uniquely identified by a name or an [alias](../../../manage-data/data-store/aliases.md). This unique name is important because it’s used to target the index in search queries and other operations. -::::{tip} +::::{tip} A closely related concept is a [data stream](../../../manage-data/data-store/data-streams.md). This index abstraction is optimized for append-only timestamped data, and is made up of hidden, auto-generated backing indices. If you’re working with timestamped data, we recommend the [Elastic Observability](https://www.elastic.co/guide/en/observability/current) solution for additional tools and optimized content. :::: -## Documents and fields [elasticsearch-intro-documents-fields] +## Documents and fields [elasticsearch-intro-documents-fields] {{es}} serializes and stores data in the form of JSON documents. A document is a set of fields, which are key-value pairs that contain your data. Each document has a unique ID, which you can create or have {{es}} auto-generate. 
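For example, indexing a document with an explicit ID is a single request (the index name and field values below are illustrative):

```shell
PUT /my-index/_doc/1
{
  "title": "Server hardware failure detected",
  "severity": "high",
  "created_at": "2023-11-07T09:39:01.012Z"
}
```

Repeating the request with `_doc/2` and a different body adds a second document to the same index; omitting the ID entirely (`POST /my-index/_doc`) lets {{es}} auto-generate one.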
@@ -48,22 +48,22 @@ A simple {{es}} document might look like this: ``` -## Metadata fields [elasticsearch-intro-documents-fields-data-metadata] +## Metadata fields [elasticsearch-intro-documents-fields-data-metadata] -An indexed document contains data and metadata. [Metadata fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/document-metadata-fields.md) are system fields that store information about the documents. In {{es}}, metadata fields are prefixed with an underscore. For example, the following fields are metadata fields: +An indexed document contains data and metadata. [Metadata fields](elasticsearch://reference/elasticsearch/mapping-reference/document-metadata-fields.md) are system fields that store information about the documents. In {{es}}, metadata fields are prefixed with an underscore. For example, the following fields are metadata fields: * `_index`: The name of the index where the document is stored. * `_id`: The document’s ID. IDs must be unique per index. -## Mappings and data types [elasticsearch-intro-documents-fields-mappings] +## Mappings and data types [elasticsearch-intro-documents-fields-mappings] -Each index has a [mapping](../../../manage-data/data-store/mapping.md) or schema for how the fields in your documents are indexed. A mapping defines the [data type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md) for each field, how the field should be indexed, and how it should be stored. When adding documents to {{es}}, you have two options for mappings: +Each index has a [mapping](../../../manage-data/data-store/mapping.md) or schema for how the fields in your documents are indexed. A mapping defines the [data type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) for each field, how the field should be indexed, and how it should be stored. When adding documents to {{es}}, you have two options for mappings: * [Dynamic mapping](../../../manage-data/data-store/mapping.md#mapping-dynamic): Let {{es}} automatically detect the data types and create the mappings for you. Dynamic mapping helps you get started quickly, but might yield suboptimal results for your specific use case due to automatic field type inference. * [Explicit mapping](../../../manage-data/data-store/mapping.md#mapping-explicit): Define the mappings up front by specifying data types for each field. Recommended for production use cases, because you have full control over how your data is indexed to suit your specific use case. -::::{tip} +::::{tip} You can use a combination of dynamic and explicit mapping on the same index. This is useful when you have a mix of known and unknown fields in your data. :::: diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/es-security-principles.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/es-security-principles.md index 6537b21ea..fe0fc1a1e 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/es-security-principles.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/es-security-principles.md @@ -3,24 +3,24 @@ Protecting your {{es}} cluster and the data it contains is of utmost importance. Implementing a defense in depth strategy provides multiple layers of security to help safeguard your system. The following principles provide a foundation for running {{es}} in a secure manner that helps to mitigate attacks on your system at multiple levels. 
-## Run {{es}} with security enabled [security-run-with-security]
+## Run {{es}} with security enabled [security-run-with-security]

Never run an {{es}} cluster without security enabled. This principle cannot be overstated. Running {{es}} without security leaves your cluster exposed to anyone who can send network traffic to {{es}}, permitting these individuals to download, modify, or delete any data in your cluster. [Start the {{stack}} with security enabled](../../../deploy-manage/security/security-certificates-keys.md) or [manually configure security](../../../deploy-manage/security/manually-configure-security-in-self-managed-cluster.md) to prevent unauthorized access to your clusters and ensure that internode communication is secure.

-## Run {{es}} with a dedicated non-root user [security-not-root-user]
+## Run {{es}} with a dedicated non-root user [security-not-root-user]

Never try to run {{es}} as the `root` user, which would invalidate any defense strategy and permit a malicious user to do **anything** on your server. You must create a dedicated, unprivileged user to run {{es}}. By default, the `rpm`, `deb`, `docker`, and Windows packages of {{es}} contain an `elasticsearch` user with this scope.

-## Protect {{es}} from public internet traffic [security-protect-cluster-traffic]
+## Protect {{es}} from public internet traffic [security-protect-cluster-traffic]

Even with security enabled, never expose {{es}} to public internet traffic. Using an application to sanitize requests to {{es}} still poses risks, such as a malicious user writing [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-search) requests that could overwhelm an {{es}} cluster and bring it down. Keep {{es}} as isolated as possible, preferably behind a firewall and a VPN. Any internet-facing applications should run pre-canned aggregations, or not run aggregations at all.

-While you absolutely shouldn’t expose {{es}} directly to the internet, you also shouldn’t expose {{es}} directly to users. Instead, use an intermediary application to make requests on behalf of users. This implementation allows you to track user behaviors, such as can submit requests, and to which specific nodes in the cluster. For example, you can implement an application that accepts a search term from a user and funnels it through a [`simple_query_string`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-simple-query-string-query.md) query.
+While you absolutely shouldn’t expose {{es}} directly to the internet, you also shouldn’t expose {{es}} directly to users. Instead, use an intermediary application to make requests on behalf of users. This implementation allows you to track user behaviors, such as who can submit requests, and to which specific nodes in the cluster. For example, you can implement an application that accepts a search term from a user and funnels it through a [`simple_query_string`](elasticsearch://reference/query-languages/query-dsl-simple-query-string-query.md) query.

-## Implement role based access control [security-create-appropriate-users]
+## Implement role-based access control [security-create-appropriate-users]

[Define roles](../../../deploy-manage/users-roles/cluster-or-deployment-auth/defining-roles.md) for your users and [assign appropriate privileges](../../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md) to ensure that users have access only to the resources that they need.
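As a sketch of this principle, the create role API can define a narrowly scoped, read-only role (the role name, index pattern, and privilege list below are examples only):

```shell
POST /_security/role/webapp_reader
{
  "indices": [
    {
      "names": [ "webapp-logs-*" ],
      "privileges": [ "read", "view_index_metadata" ]
    }
  ]
}
```

Users assigned only this role can search the matching indices but cannot write to them or change cluster state.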
This process determines whether the user behind an incoming request is allowed to run that request. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/field-and-document-access-control.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/field-and-document-access-control.md index 07109e9a9..4c7215324 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/field-and-document-access-control.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/field-and-document-access-control.md @@ -2,14 +2,14 @@ You can control access to data within a data stream or index by adding field and document level security permissions to a role. [Field level security permissions](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md) restrict access to particular fields within a document. [Document level security permissions](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md) restrict access to particular documents. -::::{note} +::::{note} Document and field level security is currently meant to operate with read-only privileged accounts. Users with document and field level security enabled for a data stream or index should not perform write operations. :::: A role can define both field and document level permissions on a per-index basis. A role that doesn’t specify field level permissions grants access to ALL fields. Similarly, a role that doesn’t specify document level permissions grants access to ALL documents in the index. -::::{important} +::::{important} When assigning users multiple roles, be careful that you don’t inadvertently grant wider access than intended. Each user has a single set of field level and document level permissions per data stream or index. See [Multiple roles with document and field level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#multiple-roles-dls-fls). :::: @@ -25,7 +25,7 @@ Field level security takes into account each role the user has and combines all For example, let’s say `role_a` grants access to only the `address` field of the documents in `index1`; it doesn’t specify any document restrictions. Conversely, `role_b` limits access to a subset of the documents in `index1`; it doesn’t specify any field restrictions. If you assign a user both roles, `role_a` gives the user access to all documents and `role_b` gives the user access to all fields. -::::{important} +::::{important} If you need to restrict access to both documents and fields, consider splitting documents by index instead. :::: @@ -110,16 +110,16 @@ POST /_security/role/example3 ## Pre-processing documents to add security details [set-security-user-processor] -To guarantee that a user reads only their own documents, it makes sense to set up document level security. In this scenario, each document must have the username or role name associated with it, so that this information can be used by the role query for document level security. This is a situation where the [set security user processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/ingest-node-set-security-user-processor.md) ingest processor can help. +To guarantee that a user reads only their own documents, it makes sense to set up document level security. 
-::::{note} 
+::::{note}
Document level security doesn’t apply to write APIs. You must use unique ids for each user that uses the same data stream or index, otherwise they might overwrite other users' documents. The ingest processor just adds properties for the current authenticated user to the documents that are being indexed.
::::

-The [set security user processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/ingest-node-set-security-user-processor.md) attaches user-related details (such as `username`, `roles`, `email`, `full_name` and `metadata` ) from the current authenticated user to the current document by pre-processing the ingest. When you index data with an ingest pipeline, user details are automatically attached to the document. If the authenticating credential is an API key, the API key `id`, `name` and `metadata` (if it exists and is non-empty) are also attached to the document.
+The [set security user processor](elasticsearch://reference/ingestion-tools/enrich-processor/ingest-node-set-security-user-processor.md) attaches user-related details (such as `username`, `roles`, `email`, `full_name`, and `metadata`) from the current authenticated user to the current document by pre-processing the ingest. When you index data with an ingest pipeline, user details are automatically attached to the document. If the authenticating credential is an API key, the API key `id`, `name`, and `metadata` (if it exists and is non-empty) are also attached to the document.

-For more information see [Ingest pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) and [Set security user](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/ingest-node-set-security-user-processor.md)
+For more information, see [Ingest pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) and [Set security user](elasticsearch://reference/ingestion-tools/enrich-processor/ingest-node-set-security-user-processor.md).

## Field and document level security with Cross-cluster API keys [ccx-apikeys-dls-fls]

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/field-level-security.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/field-level-security.md
index c02cd1db2..867ba73dc 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/field-level-security.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/field-level-security.md
@@ -23,7 +23,7 @@ POST /_security/role/test_role1
Access to the following metadata fields is always allowed: `_id`, `_type`, `_parent`, `_routing`, `_timestamp`, `_ttl`, `_size` and `_index`. If you specify an empty list of fields, only these metadata fields are accessible.

-::::{note} 
+::::{note}
Omitting the fields entry entirely disables field level security.
::::

@@ -188,8 +188,8 @@ The resulting permission is equal to:
}
```

-::::{note} 
-Field-level security should not be set on [`alias`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-alias.md) fields. To secure a concrete field, its field name must be used directly.
+::::{note}
+Field-level security should not be set on [`alias`](elasticsearch://reference/elasticsearch/mapping-reference/field-alias.md) fields. To secure a concrete field, its field name must be used directly.
::::
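To round out the field level security discussion, a minimal sketch of a role that grants read access to only a subset of fields (the role, index, and field names are placeholders):

```console
POST /_security/role/customer_care
{
  "indices": [
    {
      "names": ["customers"],
      "privileges": ["read"],
      "field_security": {
        "grant": ["name", "email"]
      }
    }
  ]
}
```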
diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/fips-140-compliance.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/fips-140-compliance.md
index d2c882cc7..cf3a71e4f 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/fips-140-compliance.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/fips-140-compliance.md
@@ -2,7 +2,7 @@
The Federal Information Processing Standard (FIPS) Publication 140-2 (FIPS PUB 140-2), titled "Security Requirements for Cryptographic Modules", is a U.S. government computer security standard used to approve cryptographic modules. {{es}} offers a FIPS 140-2 compliant mode and as such can run in a FIPS 140-2 configured JVM.

-::::{important} 
+::::{important}
The JVM bundled with {{es}} is not configured for FIPS 140-2. You must configure an external JDK with a FIPS 140-2 certified Java Security Provider. Refer to the {{es}} [JVM support matrix](https://www.elastic.co/support/matrix#matrix_jvm) for supported JVM configurations. See [subscriptions](https://www.elastic.co/subscriptions) for required licensing.
::::

@@ -14,7 +14,7 @@ Compliance with FIPS 140-2 requires using only FIPS approved / NIST recommended

* Setting `xpack.security.fips_mode.enabled` to `true` in `elasticsearch.yml`. Note - this setting alone is not sufficient to be compliant with FIPS 140-2.

-## Configuring {{es}} for FIPS 140-2 [_configuring_es_for_fips_140_2] 
+## Configuring {{es}} for FIPS 140-2 [_configuring_es_for_fips_140_2]

Detailed instructions for the configuration required for FIPS 140-2 compliance are beyond the scope of this document. It is the responsibility of the user to ensure compliance with FIPS 140-2. {{es}} has been tested with a specific configuration described below. However, there are other configurations possible to achieve compliance.

@@ -33,40 +33,40 @@ The following is a high-level overview of the required configuration:

* Review the upgrade considerations ([see below](../../../deploy-manage/security/fips-140-2.md#fips-upgrade-considerations)) and limitations ([see below](../../../deploy-manage/security/fips-140-2.md#fips-limitations)).

-### Java security provider [java-security-provider] 
+### Java security provider [java-security-provider]

Detailed instructions for installation and configuration of a FIPS certified Java security provider are beyond the scope of this document. Specifically, a FIPS certified [JCA](https://docs.oracle.com/en/java/javase/17/security/java-cryptography-architecture-jca-reference-guide.html) and [JSSE](https://docs.oracle.com/en/java/javase/17/security/java-secure-socket-extension-jsse-reference-guide.html) implementation is required so that the JVM uses FIPS validated implementations of NIST recommended cryptographic algorithms.

Elasticsearch has been tested with Bouncy Castle’s [bc-fips 1.0.2.5](https://repo1.maven.org/maven2/org/bouncycastle/bc-fips/1.0.2.5/bc-fips-1.0.2.5.jar) and [bctls-fips 1.0.19](https://repo1.maven.org/maven2/org/bouncycastle/bctls-fips/1.0.19/bctls-fips-1.0.19.jar). Please refer to the {{es}} [JVM support matrix](https://www.elastic.co/support/matrix#matrix_jvm) for details on which combinations of JVM and security provider are supported in FIPS mode. Elasticsearch does not ship with a FIPS certified provider. It is the responsibility of the user to install and configure the security provider to ensure compliance with FIPS 140-2. Using a FIPS certified provider will ensure that only approved cryptographic algorithms are used.

-To configure {{es}} to use additional security provider(s) configure {{es}}'s [JVM property](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options) `java.security.properties` to point to a file ([example](https://raw.githubusercontent.com/elastic/elasticsearch/main/build-tools-internal/src/main/resources/fips_java.security)) in {{es}}'s `config` directory. Ensure the FIPS certified security provider is configured with the lowest order. This file should contain the necessary configuration to instruct Java to use the FIPS certified security provider.
+To configure {{es}} to use additional security provider(s), configure {{es}}'s [JVM property](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options) `java.security.properties` to point to a file ([example](https://raw.githubusercontent.com/elastic/elasticsearch/main/build-tools-internal/src/main/resources/fips_java.security)) in {{es}}'s `config` directory. Ensure the FIPS certified security provider is configured with the lowest order. This file should contain the necessary configuration to instruct Java to use the FIPS certified security provider.
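For orientation, a minimal sketch of what such a `java.security` override file can contain, assuming Bouncy Castle's FIPS provider (the entries mirror the linked example file; treat them as an assumption and follow your provider's documentation):

```
# Register the FIPS certified providers with the lowest order (highest priority).
security.provider.1=org.bouncycastle.jcajce.provider.BouncyCastleFipsProvider
security.provider.2=org.bouncycastle.jsse.provider.BouncyCastleJsseProvider fips:BCFIPS
security.provider.3=SUN
```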
-### Java security manager [java-security-manager] 
+### Java security manager [java-security-manager]

All code running in {{es}} is subject to the security restrictions enforced by the Java security manager. The security provider you have installed and configured may require additional permissions in order to function correctly. You can grant these permissions by providing your own [Java security policy](https://docs.oracle.com/javase/8/docs/technotes/guides/security/PolicyFiles.html#FileSyntax).

To configure {{es}}'s security manager, configure the JVM property `java.security.policy` to point to a file ([example](https://raw.githubusercontent.com/elastic/elasticsearch/main/build-tools-internal/src/main/resources/fips_java.policy)) in {{es}}'s `config` directory with the desired permissions. This file should contain the necessary configuration for the Java security manager to grant the required permissions needed by the security provider.

-### {{es}} Keystore [keystore-fips-password] 
+### {{es}} Keystore [keystore-fips-password]

-FIPS 140-2 (via NIST Special Publication 800-132) dictates that encryption keys should at least have an effective strength of 112 bits. As such, the {{es}} keystore that stores the node’s [secure settings](../../../deploy-manage/security/secure-settings.md) needs to be password protected with a password that satisfies this requirement. This means that the password needs to be 14 bytes long which is equivalent to a 14 character ASCII encoded password, or a 7 character UTF-8 encoded password. You can use the [elasticsearch-keystore passwd](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) subcommand to change or set the password of an existing keystore. Note that when the keystore is password-protected, you must supply the password each time Elasticsearch starts.
+FIPS 140-2 (via NIST Special Publication 800-132) dictates that encryption keys should at least have an effective strength of 112 bits. As such, the {{es}} keystore that stores the node’s [secure settings](../../../deploy-manage/security/secure-settings.md) needs to be password protected with a password that satisfies this requirement. This means that the password needs to be 14 bytes long, which is equivalent to a 14 character ASCII encoded password or a 7 character UTF-8 encoded password. You can use the [elasticsearch-keystore passwd](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) subcommand to change or set the password of an existing keystore. Note that when the keystore is password-protected, you must supply the password each time Elasticsearch starts.
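A sketch of the keystore commands (run from the {{es}} home directory; the `passwd` subcommand prompts for the new password):

```sh
# Check whether the keystore is already password protected.
bin/elasticsearch-keystore has-passwd

# Set or change the password of the existing keystore.
bin/elasticsearch-keystore passwd
```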
-### TLS [fips-tls] 
+### TLS [fips-tls]

-SSLv2 and SSLv3 are not allowed by FIPS 140-2, so `SSLv2Hello` and `SSLv3` cannot be used for [`ssl.supported_protocols`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings).
+SSLv2 and SSLv3 are not allowed by FIPS 140-2, so `SSLv2Hello` and `SSLv3` cannot be used for [`ssl.supported_protocols`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings).

-::::{note} 
-The use of TLS ciphers is mainly governed by the relevant crypto module (the FIPS Approved Security Provider that your JVM uses). All the ciphers that are configured by default in {{es}} are FIPS 140-2 compliant and as such can be used in a FIPS 140-2 JVM. See [`ssl.cipher_suites`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings).
+::::{note}
+The use of TLS ciphers is mainly governed by the relevant crypto module (the FIPS Approved Security Provider that your JVM uses). All the ciphers that are configured by default in {{es}} are FIPS 140-2 compliant and as such can be used in a FIPS 140-2 JVM. See [`ssl.cipher_suites`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings).
::::

-### TLS keystores and keys [_tls_keystores_and_keys] 
+### TLS keystores and keys [_tls_keystores_and_keys]

-Keystores can be used in a number of [General TLS settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings) in order to conveniently store key and trust material. Neither `JKS`, nor `PKCS#12` keystores can be used in a FIPS 140-2 configured JVM. Avoid using these types of keystores. Your FIPS 140-2 provider may provide a compliant keystore implementation that can be used, or you can use PEM encoded files. To use PEM encoded key material, you can use the relevant `\*.key` and `*.certificate` configuration options, and for trust material you can use `*.certificate_authorities`.
+Keystores can be used in a number of [General TLS settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ssl-tls-settings) in order to conveniently store key and trust material. Neither `JKS` nor `PKCS#12` keystores can be used in a FIPS 140-2 configured JVM. Avoid using these types of keystores. Your FIPS 140-2 provider may provide a compliant keystore implementation that can be used, or you can use PEM encoded files. To use PEM encoded key material, you can use the relevant `*.key` and `*.certificate` configuration options, and for trust material you can use `*.certificate_authorities`.
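A sketch of PEM based transport TLS settings in `elasticsearch.yml` (the file names are placeholders):

```yaml
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.key: node01.key
xpack.security.transport.ssl.certificate: node01.crt
xpack.security.transport.ssl.certificate_authorities: ["ca.crt"]
```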
FIPS 140-2 compliance dictates that the length of the public keys used for TLS must correspond to the strength of the symmetric key algorithm in use in TLS. Depending on the value of `ssl.cipher_suites` that you select to use, the TLS keys must have corresponding length according to the following table:

@@ -80,32 +80,32 @@ $$$comparable-key-strength$$$

| `AES-256` | 15360 | 512+ |

-### Stored password hashing [_stored_password_hashing] 
+### Stored password hashing [_stored_password_hashing]

$$$fips-stored-password-hashing$$$

-While {{es}} offers a number of algorithms for securely hashing credentials on disk, only the `PBKDF2` based family of algorithms is compliant with FIPS 140-2 for stored password hashing. However, since `PBKDF2` is essentially a key derivation function, your JVM security provider may enforce a [112-bit key strength requirement](../../../deploy-manage/security/fips-140-2.md#keystore-fips-password). Although FIPS 140-2 does not mandate user password standards, this requirement may affect password hashing in {{es}}. To comply with this requirement, while allowing you to use passwords that satisfy your security policy, {{es}} offers [pbkdf2_stretch](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings) which is the suggested hashing algorithm when running {{es}} in FIPS 140-2 environments. `pbkdf2_stretch` performs a single round of SHA-512 on the user password before passing it to the `PBKDF2` implementation.
+While {{es}} offers a number of algorithms for securely hashing credentials on disk, only the `PBKDF2` based family of algorithms is compliant with FIPS 140-2 for stored password hashing. However, since `PBKDF2` is essentially a key derivation function, your JVM security provider may enforce a [112-bit key strength requirement](../../../deploy-manage/security/fips-140-2.md#keystore-fips-password). Although FIPS 140-2 does not mandate user password standards, this requirement may affect password hashing in {{es}}. To comply with this requirement, while allowing you to use passwords that satisfy your security policy, {{es}} offers [pbkdf2_stretch](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings), which is the suggested hashing algorithm when running {{es}} in FIPS 140-2 environments. `pbkdf2_stretch` performs a single round of SHA-512 on the user password before passing it to the `PBKDF2` implementation.

-::::{note} 
+::::{note}
You can still use one of the plain `pbkdf2` options instead of `pbkdf2_stretch` if you have external policies and tools that can ensure all user passwords for the reserved, native, and file realms are longer than 14 bytes.
::::

-You must set the `xpack.security.authc.password_hashing.algorithm` setting to one of the available `pbkdf_stretch_*` values. When FIPS-140 mode is enabled, the default value for `xpack.security.authc.password_hashing.algorithm` is `pbkdf2_stretch`. See [User cache and password hash algorithms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings).
+You must set the `xpack.security.authc.password_hashing.algorithm` setting to one of the available `pbkdf2_stretch_*` values. When FIPS-140 mode is enabled, the default value for `xpack.security.authc.password_hashing.algorithm` is `pbkdf2_stretch`. See [User cache and password hash algorithms](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings).
-Password hashing configuration changes are not retroactive so the stored hashed credentials of existing users of the reserved, native, and file realms are not updated on disk. To ensure FIPS 140-2 compliance, recreate users or change their password using the [elasticsearch-user](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/users-command.md) CLI tool for the file realm and the [create users](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-user) and [change password](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) APIs for the native and reserved realms. Other types of realms are not affected and do not require any changes.
+Password hashing configuration changes are not retroactive, so the stored hashed credentials of existing users of the reserved, native, and file realms are not updated on disk. To ensure FIPS 140-2 compliance, recreate users or change their password using the [elasticsearch-user](elasticsearch://reference/elasticsearch/command-line-tools/users-command.md) CLI tool for the file realm and the [create users](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-put-user) and [change password](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-change-password) APIs for the native and reserved realms. Other types of realms are not affected and do not require any changes.

-### Cached password hashing [_cached_password_hashing] 
+### Cached password hashing [_cached_password_hashing]

$$$fips-cached-password-hashing$$$

`ssha256` (salted `sha256`) is recommended for cache hashing. Though `PBKDF2` is compliant with FIPS 140-2, it is, by design, slow, and thus not generally suitable as a cache hashing algorithm. Cached credentials are never stored on disk, and salted `sha256` provides an adequate level of security for in-memory credential hashing, without imposing prohibitive performance overhead. You *may* use `PBKDF2`; however, you should carefully assess the performance impact first. Depending on your deployment, the overhead of `PBKDF2` could undo most of the performance gain of using a cache.

-Either set all `cache.hash_algo` settings to `ssha256` or leave them undefined, since `ssha256` is the default value for all `cache.hash_algo` settings. See [User cache and password hash algorithms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings).
+Either set all `cache.hash_algo` settings to `ssha256` or leave them undefined, since `ssha256` is the default value for all `cache.hash_algo` settings. See [User cache and password hash algorithms](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#hashing-settings).

The user cache will be emptied upon node restart, so any existing hashes using non-compliant algorithms will be discarded and the new ones will be created using the algorithm you have selected.

-### Configure {{es}} elasticsearch.yml [configuring-es-yml] 
+### Configure {{es}} elasticsearch.yml [configuring-es-yml]

* Set `xpack.security.fips_mode.enabled` to `true` in `elasticsearch.yml`. This setting ensures that internal configuration is FIPS 140-2 compliant and enables some additional verification.
* Set `xpack.security.autoconfiguration.enabled` to `false`. This disables the automatic configuration of the security settings. Users must ensure that the security settings are configured correctly for FIPS 140-2 compliance. This is only applicable to new installations.

@@ -121,7 +121,7 @@

xpack.security.authc.password_hashing.algorithm: "pbkdf2_stretch"
```

-### Verify the security provider is installed [verify-security-provider] 
+### Verify the security provider is installed [verify-security-provider]

To verify that the security provider is installed and in use, you can use any of the following steps:

@@ -129,13 +129,13 @@ To verify that the security provider is installed and in use, you can use any of

* Set `xpack.security.fips_mode.required_providers` in `elasticsearch.yml` to the list of required security providers. This setting is used to ensure that the correct security provider is installed and configured. (8.13+) If the security provider is not installed correctly, {{es}} will fail to start. `["BCFIPS", "BCJSSE"]` are the values to use for Bouncy Castle’s FIPS JCE and JSSE certified provider.

-## Upgrade considerations [fips-upgrade-considerations] 
+## Upgrade considerations [fips-upgrade-considerations]

{{es}} 8.0+ requires Java 17 or later. {{es}} 8.13+ has been tested with [Bouncy Castle](https://www.bouncycastle.org/java.html)'s Java 17 [certified](https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4616) FIPS implementation and is the recommended Java security provider when running {{es}} in FIPS 140-2 mode. Note - {{es}} does not ship with a FIPS certified security provider and requires explicit installation and configuration.

Alternatively, consider using {{ech}} in the [FedRAMP-certified GovCloud region](https://www.elastic.co/industries/public-sector/fedramp).

-::::{important} 
+::::{important}
Some encryption algorithms may no longer be available by default in updated FIPS 140-2 security providers. Notably, Triple DES and PKCS1.5 RSA are now discouraged and [Bouncy Castle](https://www.bouncycastle.org/fips-java) now requires explicit configuration to continue using these algorithms.
::::

@@ -149,11 +149,11 @@ If you plan to upgrade your existing cluster to a version that can be run in a F

If your [subscription](https://www.elastic.co/subscriptions) already supports FIPS 140-2 mode, you can elect to perform a rolling upgrade while at the same time running each upgraded node in a FIPS 140-2 JVM. In this case, you would need to also manually regenerate your `elasticsearch.keystore` and migrate all secure settings to it, in addition to the necessary configuration changes outlined below, before starting each node.

-## Limitations [fips-limitations] 
+## Limitations [fips-limitations]

Due to the limitations that FIPS 140-2 compliance enforces, a small number of features are not available while running in FIPS 140-2 mode. The list is as follows:

* Azure Classic Discovery Plugin
-* The [`elasticsearch-certutil`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md) tool. However, `elasticsearch-certutil` can very well be used in a non FIPS 140-2 configured JVM (pointing `ES_JAVA_HOME` environment variable to a different java installation) in order to generate the keys and certificates that can be later used in the FIPS 140-2 configured JVM.
+* The [`elasticsearch-certutil`](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md) tool. However, `elasticsearch-certutil` can still be used in a non-FIPS 140-2 configured JVM (by pointing the `ES_JAVA_HOME` environment variable to a different Java installation) to generate the keys and certificates that can later be used in the FIPS 140-2 configured JVM.
* The SQL CLI client cannot run in a FIPS 140-2 configured JVM while using TLS for transport security or PKI for client authentication.

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/index-modules-allocation.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/index-modules-allocation.md
index fa7a52651..ec7a51e02 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/index-modules-allocation.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/index-modules-allocation.md
@@ -4,8 +4,8 @@ This module provides per-index settings to control the allocation of shards to n

* [Shard allocation filtering](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md): Controlling which shards are allocated to which nodes.
* [Delayed allocation](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/delaying-allocation-when-node-leaves.md): Delaying allocation of unassigned shards caused by a node leaving.
-* [Total shards per node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/total-shards-per-node.md): A hard limit on the number of shards from the same index per node.
-* [Data tier allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/data-tier-allocation.md): Controls the allocation of indices to [data tiers](../../../manage-data/lifecycle/data-tiers.md).
+* [Total shards per node](elasticsearch://reference/elasticsearch/index-settings/total-shards-per-node.md): A hard limit on the number of shards from the same index per node.
+* [Data tier allocation](elasticsearch://reference/elasticsearch/index-settings/data-tier-allocation.md): Controls the allocation of indices to [data tiers](../../../manage-data/lifecycle/data-tiers.md).

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/ip-filtering.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/ip-filtering.md
index c4bb8210d..681f8df2a 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/ip-filtering.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/ip-filtering.md
@@ -4,19 +4,19 @@ You can apply IP filtering to application clients, node clients, or transport cl

If a node’s IP address is on the denylist, the {{es}} {{security-features}} allow the connection to {{es}} but it is dropped immediately and no requests are processed.

-::::{note} 
+::::{note}
Elasticsearch installations are not designed to be publicly accessible over the Internet. IP Filtering and the other capabilities of the {{es}} {{security-features}} do not change this condition.
::::

-## Enabling IP filtering [_enabling_ip_filtering] 
+## Enabling IP filtering [_enabling_ip_filtering]

The {{es}} {{security-features}} contain an access control feature that allows or rejects hosts, domains, or subnets. If the [{{operator-feature}}](../../../deploy-manage/users-roles/cluster-or-deployment-auth/operator-privileges.md) is enabled, only operator users can update these settings.

You configure IP filtering by specifying the `xpack.security.transport.filter.allow` and `xpack.security.transport.filter.deny` settings in `elasticsearch.yml`. Allow rules take precedence over the deny rules.
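A minimal sketch of such a configuration (the addresses are placeholders):

```yaml
xpack.security.transport.filter.allow: "192.168.0.1"
xpack.security.transport.filter.deny: "192.168.0.0/24"
```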
-::::{important} 
+::::{important}
Unless explicitly specified, `xpack.security.http.filter.*` and `xpack.security.remote_cluster.filter.*` settings default to the corresponding `xpack.security.transport.filter.*` setting’s value.
::::

@@ -48,7 +48,7 @@ xpack.security.transport.filter.deny: '*.google.com'
```

-## Disabling IP Filtering [_disabling_ip_filtering] 
+## Disabling IP Filtering [_disabling_ip_filtering]

Disabling IP filtering can slightly improve performance under some conditions. To disable IP filtering entirely, set the value of the `xpack.security.transport.filter.enabled` setting in the `elasticsearch.yml` configuration file to `false`.

@@ -64,9 +64,9 @@ xpack.security.http.filter.enabled: true
```

-## Specifying TCP transport profiles [_specifying_tcp_transport_profiles] 
+## Specifying TCP transport profiles [_specifying_tcp_transport_profiles]

-[TCP transport profiles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#transport-profiles) enable Elasticsearch to bind on multiple hosts. The {{es}} {{security-features}} enable you to apply different IP filtering on different profiles.
+[TCP transport profiles](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#transport-profiles) enable Elasticsearch to bind on multiple hosts. The {{es}} {{security-features}} enable you to apply different IP filtering on different profiles.

```yaml
xpack.security.transport.filter.allow: 172.16.0.0/24
@@ -75,13 +75,13 @@ transport.profiles.client.xpack.security.filter.allow: 192.168.0.0/24
transport.profiles.client.xpack.security.filter.deny: _all
```

-::::{note} 
+::::{note}
When you do not specify a profile, `default` is used automatically.
::::

-## HTTP filtering [_http_filtering] 
+## HTTP filtering [_http_filtering]

You may want to have different IP filtering for the transport and HTTP protocols.

@@ -93,7 +93,7 @@ xpack.security.http.filter.deny: _all
```

-## Remote cluster (API key based model) filtering [_remote_cluster_api_key_based_model_filtering] 
+## Remote cluster (API key based model) filtering [_remote_cluster_api_key_based_model_filtering]

If other clusters connect [using API key authentication](../../../deploy-manage/remote-clusters/remote-clusters-api-key.md) for {{ccs}} or {{ccr}}, you may want to have different IP filtering for the remote cluster server interface.

@@ -106,13 +106,13 @@ xpack.security.http.filter.allow: 172.16.0.0/16
xpack.security.http.filter.deny: _all
```

-::::{note} 
+::::{note}
Whether IP filtering for remote cluster is enabled is controlled by `xpack.security.transport.filter.enabled` as well. This means filtering for the remote cluster and transport interfaces must be enabled or disabled together. But the exact allow and deny lists can be different between them.
::::

-## Dynamically updating IP filter settings [dynamic-ip-filtering] 
+## Dynamically updating IP filter settings [dynamic-ip-filtering]

If you run in an environment with highly dynamic IP addresses, such as cloud-based hosting, it is very hard to know the IP addresses up front when provisioning a machine. Instead of changing the configuration file and restarting the node, you can use the *Cluster Update Settings API*. For example:

@@ -136,7 +136,7 @@ PUT /_cluster/settings
}
```

-::::{note} 
+::::{note}
In order to avoid locking yourself out of the cluster, the default bound transport address will never be denied.
This means you can always SSH into a system and use curl to apply changes.
::::

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/mapping-roles.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/mapping-roles.md
index 51ea20963..1318d81c2 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/mapping-roles.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/mapping-roles.md
@@ -12,14 +12,14 @@ The PKI, LDAP, AD, Kerberos, OpenID Connect, JWT, and SAML realms also support [

To use role mapping, you create roles and role mapping rules. Role mapping rules can be based on realm name, realm type, username, groups, other user metadata, or combinations of those values.

-::::{note} 
+::::{note}
When [anonymous access](../../../deploy-manage/users-roles/cluster-or-deployment-auth/anonymous-access.md) is enabled, the roles of the anonymous user are assigned to all the other users as well.
::::

You can define role mappings via an [API](../../../deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md#mapping-roles-api) or manage them through [files](../../../deploy-manage/users-roles/cluster-or-deployment-auth/mapping-users-groups-to-roles.md#mapping-roles-file). These two sources of role mapping are combined inside of the {{es}} {{security-features}}, so it is possible for a single user to have some roles that have been mapped through the API and other roles that are mapped through files.

-::::{note} 
+::::{note}
Users with no roles assigned will be unauthorized for any action. In other words, they may be able to authenticate, but they will have no roles. No roles means no privileges, and no privileges means no authorization to make requests.
::::

@@ -35,7 +35,7 @@ You can define role-mappings through the [add role mapping API](https://www.elas

To use file-based role mappings, you must configure the mappings in a YAML file and copy it to each node in the cluster. Tools like Puppet or Chef can help with this.

-By default, role mappings are stored in `ES_PATH_CONF/role_mapping.yml`, where `ES_PATH_CONF` is `ES_HOME/config` (zip/tar installations) or `/etc/elasticsearch` (package installations). To specify a different location, you configure the `files.role_mapping` setting in the [Active Directory](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ad-settings), [LDAP](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings), and [PKI](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-pki-settings) realm settings in `elasticsearch.yml`.
+By default, role mappings are stored in `ES_PATH_CONF/role_mapping.yml`, where `ES_PATH_CONF` is `ES_HOME/config` (zip/tar installations) or `/etc/elasticsearch` (package installations). To specify a different location, you configure the `files.role_mapping` setting in the [Active Directory](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ad-settings), [LDAP](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings), and [PKI](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-pki-settings) realm settings in `elasticsearch.yml`.

Within the role mapping file, the security roles are keys, and groups and users are values. The mappings can have a many-to-many relationship. When you map roles to groups, the roles of a user in that group are the combination of the roles assigned to that group and the roles assigned to that user.
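As an illustration, a sketch of a role mapping file (the role names and DNs are placeholders):

```yaml
monitoring:
  - "cn=admins,dc=example,dc=com"
user:
  - "cn=John Doe,cn=contractors,dc=example,dc=com"
  - "cn=users,dc=example,dc=com"
```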
@@ -48,7 +48,7 @@ While the *role mapping APIs* is the preferred way to manage role mappings, usin

Note, however, that the `role_mapping.yml` file is provided as a minimal administrative function and is not intended to cover all use cases for defining roles.

-::::{important} 
+::::{important}
You cannot view, edit, or remove any roles that are defined in the role mapping files by using the role mapping APIs.
::::

@@ -57,11 +57,11 @@ You cannot view, edit, or remove any roles that are defined in the role mapping

## Realm specific details [_realm_specific_details]

-#### Active Directory and LDAP realms [ldap-role-mapping] 
+#### Active Directory and LDAP realms [ldap-role-mapping]

To specify users and groups in the role mappings, you use their *Distinguished Names* (DNs). A DN is a string that uniquely identifies the user or group, for example `"cn=John Doe,cn=contractors,dc=example,dc=com"`.

-::::{note} 
+::::{note}
The {{es}} {{security-features}} support only Active Directory security groups. You cannot map distribution groups to roles.
::::

@@ -106,7 +106,7 @@ PUT /_security/role_mapping/basic_users
```

-#### PKI realms [pki-role-mapping] 
+#### PKI realms [pki-role-mapping]

PKI realms support mapping users to roles, but you cannot map groups because the PKI realm has no notion of a group.

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/role-mapping-resources.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/role-mapping-resources.md
index d5c13e1d6..f4925c99a 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/role-mapping-resources.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/role-mapping-resources.md
@@ -28,7 +28,7 @@ A role mapping resource has the following properties:

-## Field rules [mapping-roles-rule-field] 
+## Field rules [mapping-roles-rule-field]

The `field` rule is the primary building block for a role mapping expression. It takes a single object as its value, and that object must contain a single member with key *F* and value *V*. The field rule looks up the value of *F* within the user object and then tests whether the user value *matches* the provided value *V*.

@@ -38,13 +38,13 @@ The value specified in the field rule can be one of the following types:

| --- | --- | --- |
| Simple String | Exactly matches the provided value. | `"esadmin"` |
| Wildcard String | Matches the provided value using a wildcard. | `"*,dc=example,dc=com"` |
-| Regular Expression | Matches the provided value using a [Lucene regexp](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/regexp-syntax.md). | `"/.*-admin[0-9]*/"` |
| `"/.*-admin[0-9]*/"` | +| Regular Expression | Matches the provided value using a [Lucene regexp](elasticsearch://reference/query-languages/regexp-syntax.md). | `"/.*-admin[0-9]*/"` | | Number | Matches an equivalent numerical value. | `7` | | Null | Matches a null or missing value. | `null` | | Array | Tests each element in the array in accordance with the above definitions. If *any* of elements match, the match is successful. | `["admin", "operator"]` | -### User fields [_user_fields] +### User fields [_user_fields] The *user object* against which rules are evaluated has the following fields: diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/search-analyze.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/search-analyze.md index a809ea63f..a31b10afa 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/search-analyze.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/search-analyze.md @@ -37,10 +37,10 @@ The [`_search` endpoint](../../../solutions/search/querying-for-search.md) accep Query DSL support a wide range of search techniques, including the following: * [**Full-text search**](../../../solutions/search/full-text.md): Search text that has been analyzed and indexed to support phrase or proximity queries, fuzzy matches, and more. -* [**Keyword search**](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md): Search for exact matches using `keyword` fields. +* [**Keyword search**](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md): Search for exact matches using `keyword` fields. * [**Semantic search**](../../../solutions/search/semantic-search/semantic-search-semantic-text.md): Search `semantic_text` fields using dense or sparse vector search on embeddings generated in your {{es}} cluster. * [**Vector search**](../../../solutions/search/vector/knn.md): Search for similar dense vectors using the kNN algorithm for embeddings generated outside of {{es}}. -* [**Geospatial search**](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/geo-queries.md): Search for locations and calculate spatial relationships using geospatial queries. +* [**Geospatial search**](elasticsearch://reference/query-languages/geo-queries.md): Search for locations and calculate spatial relationships using geospatial queries. Learn about the full range of queries supported by [Query DSL](../../../explore-analyze/query-filter/languages/querydsl.md). @@ -55,9 +55,9 @@ Because aggregations leverage the same data structures used for search, they are The folowing aggregation types are available: -* [Metric](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/metrics.md): Calculate metrics, such as a sum or average, from field values. -* [Bucket](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/bucket.md): Group documents into buckets based on field values, ranges, or other criteria. -* [Pipeline](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/pipeline.md): Run aggregations on the results of other aggregations. +* [Metric](elasticsearch://reference/data-analysis/aggregations/metrics.md): Calculate metrics, such as a sum or average, from field values. +* [Bucket](elasticsearch://reference/data-analysis/aggregations/bucket.md): Group documents into buckets based on field values, ranges, or other criteria. 
@@ -70,7 +70,7 @@ The [`_query` endpoint](../../../explore-analyze/query-filter/languages/esql-res

Today, it supports a subset of the features available in Query DSL, but it is rapidly evolving.

-It comes with a comprehensive set of [functions and operators](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md) for working with data and has robust integration with {{kib}}'s Discover, dashboards and visualizations.
+It comes with a comprehensive set of [functions and operators](elasticsearch://reference/query-languages/esql/esql-functions-operators.md) for working with data and has robust integration with {{kib}}'s Discover, dashboards, and visualizations.

Learn more in [Getting started with {{esql}}](../../../solutions/search/get-started.md), or try [our training course](https://www.elastic.co/training/introduction-to-esql).

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md
index cc0f08dc8..b58055666 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md
@@ -110,10 +110,10 @@ An index with invalid synonym rules cannot be reopened, making it inoperable whe

::::

-{{es}} uses synonyms as part of the [analysis process](../../../manage-data/data-store/text-analysis.md). You can use two types of [token filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md) to include synonyms:
+{{es}} uses synonyms as part of the [analysis process](../../../manage-data/data-store/text-analysis.md). You can use two types of [token filter](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md) to include synonyms:

-* [Synonym graph](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-graph-tokenfilter.md): It is recommended to use it, as it can correctly handle multi-word synonyms ("hurriedly", "in a hurry").
-* [Synonym](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md): Not recommended if you need to use multi-word synonyms.
+* [Synonym graph](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-graph-tokenfilter.md): Recommended, as it can correctly handle multi-word synonyms ("hurriedly", "in a hurry").
+* [Synonym](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md): Not recommended if you need to use multi-word synonyms.

Check each synonym token filter’s documentation for configuration details and instructions on adding it to an analyzer.
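A minimal sketch of an analyzer that uses the synonym graph token filter with inline synonyms (the index, filter, and analyzer names are placeholders):

```console
PUT /my-index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym_graph",
          "synonyms": ["hurriedly, in a hurry"]
        }
      },
      "analyzer": {
        "my_search_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonym_filter"]
        }
      }
    }
  }
}
```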
diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/secure-settings.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/secure-settings.md
index d594ce8e5..7b821533c 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/secure-settings.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/secure-settings.md
@@ -1,8 +1,8 @@
# Secure settings [secure-settings]

-Some settings are sensitive, and relying on filesystem permissions to protect their values is not sufficient. For this use case, {{es}} provides a keystore and the [`elasticsearch-keystore` tool](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) to manage the settings in the keystore.
+Some settings are sensitive, and relying on filesystem permissions to protect their values is not sufficient. For this use case, {{es}} provides a keystore and the [`elasticsearch-keystore` tool](elasticsearch://reference/elasticsearch/command-line-tools/elasticsearch-keystore.md) to manage the settings in the keystore.

-::::{important} 
+::::{important}
Only some settings are designed to be read from the keystore. Adding unsupported settings to the keystore causes the validation in the `_nodes/reload_secure_settings` API to fail and, if not addressed, will cause {{es}} to fail to start. To see whether a setting is supported in the keystore, look for a "Secure" qualifier in the setting reference.
::::

@@ -12,7 +12,7 @@ All the modifications to the keystore take effect only after restarting {{es}}.

These settings, just like the regular ones in the `elasticsearch.yml` config file, need to be specified on each node in the cluster. Currently, all secure settings are node-specific settings that must have the same value on every node.

-## Reloadable secure settings [reloadable-secure-settings] 
+## Reloadable secure settings [reloadable-secure-settings]

Just like the settings values in `elasticsearch.yml`, changes to the keystore contents are not automatically applied to the running {{es}} node. Re-reading settings requires a node restart. However, certain secure settings are marked as **reloadable**. Such settings can be re-read and applied on a running node.
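After modifying a reloadable secure setting in the keystore on every node, the new value can be applied without a restart; a sketch of the call (the password body is needed only when the keystore is password protected):

```console
POST _nodes/reload_secure_settings
{
  "secure_settings_password": "keystore-password"
}
```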
@@ -37,13 +37,13 @@ When changing multiple **reloadable** secure settings, modify all of them on eac

There are reloadable secure settings for:

* [The Azure repository plugin](../../../deploy-manage/tools/snapshot-and-restore/azure-repository.md)
-* [The EC2 discovery plugin](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch-plugins/discovery-ec2-usage.md#_configuring_ec2_discovery)
+* [The EC2 discovery plugin](elasticsearch://reference/elasticsearch-plugins/discovery-ec2-usage.md#_configuring_ec2_discovery)
* [The GCS repository plugin](../../../deploy-manage/tools/snapshot-and-restore/google-cloud-storage-repository.md)
* [The S3 repository plugin](../../../deploy-manage/tools/snapshot-and-restore/s3-repository.md)
-* [Monitoring settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/monitoring-settings.md)
-* [{{watcher}} settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md)
-* [JWT realm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings)
-* [Active Directory realm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ad-settings)
-* [LDAP realm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings)
+* [Monitoring settings](elasticsearch://reference/elasticsearch/configuration-reference/monitoring-settings.md)
+* [{{watcher}} settings](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md)
+* [JWT realm](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-jwt-settings)
+* [Active Directory realm](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ad-settings)
+* [LDAP realm](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings)
* [Remote cluster credentials for the API key based security model](../../../deploy-manage/remote-clusters/remote-clusters-settings.md#remote-cluster-credentials-setting)

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup-https.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup-https.md
index dc57585d9..bfd20a800 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup-https.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup-https.md
@@ -265,7 +265,7 @@ To send monitoring data securely, create a monitoring user and grant it the nece

You can use the built-in `beats_system` user if it’s available in your environment. Because the built-in users are not available in {{ecloud}}, these instructions create a user that is explicitly used for monitoring {{metricbeat}}.

-1. If you’re using the built-in `beats_system` user, on any node in your cluster, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) utility to set the password for that user:
+1. If you’re using the built-in `beats_system` user, on any node in your cluster, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) utility to set the password for that user:
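A sketch of the invocation (run from the {{es}} home directory):

```sh
bin/elasticsearch-reset-password -u beats_system
```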
This command resets the password for the `beats_system` user to an auto-generated value.

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup.md
index c59eae3dd..ee8800602 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/security-basic-setup.md
@@ -77,7 +77,7 @@ Complete the following steps **for each node in your cluster**. To join the same

1. Open the `$ES_PATH_CONF/elasticsearch.yml` file and make the following changes:

-   1. Add the [`cluster-name`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name) setting and enter a name for your cluster:
+   1. Add the [`cluster.name`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-name) setting and enter a name for your cluster:

      ```yaml
      cluster.name: my-cluster
@@ -101,7 +101,7 @@ xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
      ```

-   1. If you want to use hostname verification, set the verification mode to `full`. You should generate a different certificate for each host that matches the DNS or IP address. See the `xpack.security.transport.ssl.verification_mode` parameter in [TLS settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings).
+   1. If you want to use hostname verification, set the verification mode to `full`. You should generate a different certificate for each host that matches the DNS or IP address. See the `xpack.security.transport.ssl.verification_mode` parameter in [TLS settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#transport-tls-ssl-settings).

2. If you entered a password when creating the node certificate, run the following commands to store the password in the {{es}} keystore:

diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/security-limitations.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/security-limitations.md
index e4cd12b40..553057de2 100644
--- a/raw-migrated-files/elasticsearch/elasticsearch-reference/security-limitations.md
+++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/security-limitations.md
@@ -6,27 +6,27 @@ navigation_title: "Limitations"

-## Plugins [_plugins] 
+## Plugins [_plugins]

{{es}}'s plugin infrastructure is extremely flexible in terms of what can be extended. While it opens up {{es}} to a wide variety of (often custom) additional functionality, when it comes to security, this high extensibility level comes at a cost. We have no control over the third-party plugins’ code (open source or not) and therefore we cannot guarantee their compliance with {{stack-security-features}}. For this reason, third-party plugins are not officially supported on clusters with {{security-features}} enabled.
-## Changes in wildcard behavior [_changes_in_wildcard_behavior] 
+## Changes in wildcard behavior [_changes_in_wildcard_behavior]

{{es}} clusters with the {{security-features}} enabled apply `_all` and other wildcards to data streams, indices, and aliases the current user has privileges for, not all data streams, indices, and aliases on the cluster.

-## Multi document APIs [_multi_document_apis] 
+## Multi document APIs [_multi_document_apis]

The multi get and multi term vectors APIs throw an IndexNotFoundException when trying to access non-existing indices that the user is not authorized for. In doing so, they leak the information that the data stream or index doesn’t exist, even though the user is not authorized to know anything about those data streams or indices.

-## Filtered index aliases [_filtered_index_aliases] 
+## Filtered index aliases [_filtered_index_aliases]

Aliases containing filters are not a secure way to restrict access to individual documents, due to the limitations described in [Index and field names can be leaked when using aliases](../../../deploy-manage/security.md#alias-limitations). The {{stack-security-features}} provide a secure way to restrict access to documents through the [document-level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md) feature.

-## Field and document level security limitations [field-document-limitations] 
+## Field and document level security limitations [field-document-limitations]

When a user’s role enables document or [field level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md) for a data stream or index:

@@ -52,7 +52,7 @@ When a user’s role enables [document level security](../../../deploy-manage/us

* Document level security doesn’t affect global index statistics that relevancy scoring uses. This means that scores are computed without taking the role query into account. Documents that don’t match the role query are never returned.
* The `has_child` and `has_parent` queries aren’t supported as query parameters in the role definition. The `has_child` and `has_parent` queries can be used in the search API with document level security enabled.
-* [Date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math) expressions cannot contain `now` in [range queries with date fields](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md#ranges-on-dates)
+* [Date math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math) expressions cannot contain `now` in [range queries with date fields](elasticsearch://reference/query-languages/query-dsl-range-query.md#ranges-on-dates).
* Any query that makes remote calls to fetch query data isn’t supported, including the following queries:

  * `terms` query with terms lookup

@@ -62,27 +62,27 @@ When a user’s role enables [document level security](../../../deploy-manage/us

* If suggesters are specified and document level security is enabled, the specified suggesters are ignored.
* A search request cannot be profiled if document level security is enabled.
* The [terms enum API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-terms-enum) does not return terms if document level security is enabled.
-* The [`multi_match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md) query does not support specifying fields using wildcards. +* The [`multi_match`](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md) query does not support specifying fields using wildcards. -::::{note} +::::{note} While document-level security prevents users from viewing restricted documents, it’s still possible to write search requests that return aggregate information about the entire index. A user whose access is restricted to specific documents in an index could still learn about field names and terms that only exist in inaccessible documents, and count how many inaccessible documents contain a given term. :::: -## Index and field names can be leaked when using aliases [alias-limitations] +## Index and field names can be leaked when using aliases [alias-limitations] Calling certain {{es}} APIs on an alias can potentially leak information about indices that the user isn’t authorized to access. For example, when you get the mappings for an alias with the `_mapping` API, the response includes the index name and mappings for each index that the alias applies to. Until this limitation is addressed, avoid index and field names that contain confidential or sensitive information. -## LDAP realm [_ldap_realm] +## LDAP realm [_ldap_realm] The [LDAP Realm](../../../deploy-manage/users-roles/cluster-or-deployment-auth/ldap.md) does not currently support the discovery of nested LDAP Groups. For example, if a user is a member of `group_1` and `group_1` is a member of `group_2`, only `group_1` will be discovered. However, the [Active Directory Realm](../../../deploy-manage/users-roles/cluster-or-deployment-auth/active-directory.md) **does** support transitive group membership. -## Resource sharing check for users and API keys [can-access-resources-check] +## Resource sharing check for users and API keys [can-access-resources-check] The result of [async search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) and [scroll](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-scroll) requests can be retrieved later by the same user or API key that submitted the initial request. The verification process involves comparing the username, authentication realm type, and (for realms other than file or native) realm name. If you used an API key to submit the request, only that key can retrieve the results. This logic also has a few limitations: diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md index 1410a4d58..c6bb0136a 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md @@ -315,7 +315,7 @@ PUT _inference/text_embedding/alibabacloud_ai_search_embeddings <1> ## Create the index mapping [infer-service-mappings] -The mapping of the destination index - the index that contains the embeddings that the model will create based on your input text - must be created. 
The destination index must have a field with the [`dense_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) field type for most models and the [`sparse_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) field type for the sparse vector models like in the case of the `elasticsearch` service to index the output of the used model. +The mapping of the destination index - the index that contains the embeddings that the model will create based on your input text - must be created. To index the output of the model, the destination index must have a field with the [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) field type for most models, or the [`sparse_vector`](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) field type for sparse vector models (as with the `elasticsearch` service). :::::::{tab-set} @@ -592,7 +592,7 @@ PUT alibabacloud-ai-search-embeddings ## Create an ingest pipeline with an inference processor [infer-service-inference-ingest-pipeline] -Create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) and use the model you created above to infer against the data that is being ingested in the pipeline. +Create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) and use the model you created above to infer against the data that is being ingested in the pipeline. :::::::{tab-set} diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-allocation-filtering.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-allocation-filtering.md index 03ff71195..ff1639946 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-allocation-filtering.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-allocation-filtering.md @@ -1,15 +1,15 @@ # Index-level shard allocation filtering [shard-allocation-filtering] -You can use shard allocation filters to control where {{es}} allocates shards of a particular index. These per-index filters are applied in conjunction with [cluster-wide allocation filtering](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) and [allocation awareness](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md). +You can use shard allocation filters to control where {{es}} allocates shards of a particular index. These per-index filters are applied in conjunction with [cluster-wide allocation filtering](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) and [allocation awareness](../../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md).
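For example, the following request illustrates an index-level filter (a minimal sketch, assuming a custom node attribute named `size` was set on your nodes; the index name `test` is illustrative):

```console
PUT /test/_settings
{
  "index.routing.allocation.include.size": "big,medium"
}
```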
-Shard allocation filters can be based on [custom node attributes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#custom-node-attributes) or the built-in `_name`, `_host_ip`, `_publish_ip`, `_ip`, `_host`, `_id`, `_tier` and `_tier_preference` attributes. [Index lifecycle management](../../../manage-data/lifecycle/index-lifecycle-management.md) uses filters based on custom node attributes to determine how to reallocate shards when moving between phases. +Shard allocation filters can be based on [custom node attributes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#custom-node-attributes) or the built-in `_name`, `_host_ip`, `_publish_ip`, `_ip`, `_host`, `_id`, `_tier` and `_tier_preference` attributes. [Index lifecycle management](../../../manage-data/lifecycle/index-lifecycle-management.md) uses filters based on custom node attributes to determine how to reallocate shards when moving between phases. The `cluster.routing.allocation` settings are dynamic, enabling existing indices to be moved immediately from one set of nodes to another. Shards are only relocated if it is possible to do so without breaking another routing constraint, such as never allocating a primary and replica shard on the same node. For example, you could use a custom node attribute to indicate a node’s performance characteristics and use shard allocation filtering to route shards for a particular index to the most appropriate class of hardware. -## Enabling index-level shard allocation filtering [index-allocation-filters] +## Enabling index-level shard allocation filtering [index-allocation-filters] To filter based on a custom node attribute: @@ -52,7 +52,7 @@ To filter based on a custom node attribute: -## Index allocation filter settings [index-allocation-settings] +## Index allocation filter settings [index-allocation-settings] `index.routing.allocation.include.{{attribute}}` : Assign the index to a node whose `{{attribute}}` has at least one of the comma-separated values. @@ -84,10 +84,10 @@ The index allocation settings support the following built-in attributes: : Match nodes by node id `_tier` -: Match nodes by the node’s [data tier](../../../manage-data/lifecycle/data-tiers.md) role. For more details see [data tier allocation filtering](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/data-tier-allocation.md) +: Match nodes by the node’s [data tier](../../../manage-data/lifecycle/data-tiers.md) role. For more details see [data tier allocation filtering](elasticsearch://reference/elasticsearch/index-settings/data-tier-allocation.md) -::::{note} -`_tier` filtering is based on [node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md) roles. Only a subset of roles are [data tier](../../../manage-data/lifecycle/data-tiers.md) roles, and the generic [data role](../../../deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role) will match any tier filtering. +::::{note} +`_tier` filtering is based on [node](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md) roles. Only a subset of roles are [data tier](../../../manage-data/lifecycle/data-tiers.md) roles, and the generic [data role](../../../deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md#data-node-role) will match any tier filtering. 
:::: diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-request-cache.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-request-cache.md index 8b2487c6c..e80002fb2 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-request-cache.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/shard-request-cache.md @@ -4,12 +4,12 @@ When a search request is run against an index or against many indices, each invo The shard-level request cache module caches the local results on each shard. This allows frequently used (and potentially heavy) search requests to return results almost instantly. The requests cache is a very good fit for the logging use case, where only the most recent index is being actively updated — results from older indices will be served directly from the cache. -You can control the size and expiration of the cache at the node level using the [shard request cache settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/shard-request-cache-settings.md). +You can control the size and expiration of the cache at the node level using the [shard request cache settings](elasticsearch://reference/elasticsearch/configuration-reference/shard-request-cache-settings.md). -::::{important} +::::{important} By default, the requests cache will only cache the results of search requests where `size=0`, so it will not cache `hits`, but it will cache `hits.total`, [aggregations](../../../explore-analyze/query-filter/aggregations.md), and [suggestions](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html). -Most queries that use `now` (see [Date Math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math)) cannot be cached. +Most queries that use `now` (see [Date Math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math)) cannot be cached. Scripted queries that use API calls which are non-deterministic, such as `Math.random()` or `new Date()`, are not cached. @@ -17,7 +17,7 @@ Scripted queries that use the API calls which are non-deterministic, such as `Ma -## Cache invalidation [_cache_invalidation] +## Cache invalidation [_cache_invalidation] The cache is smart — it keeps the same *near real-time* promise as uncached search. @@ -32,7 +32,7 @@ POST /my-index-000001,my-index-000002/_cache/clear?request=true ``` -## Enabling and disabling caching [_enabling_and_disabling_caching] +## Enabling and disabling caching [_enabling_and_disabling_caching] The cache is enabled by default, but can be disabled when creating a new index as follows: @@ -53,7 +53,7 @@ PUT /my-index-000001/_settings ``` -## Enabling and disabling caching per request [_enabling_and_disabling_caching_per_request] +## Enabling and disabling caching per request [_enabling_and_disabling_caching_per_request] The `request_cache` query-string parameter can be used to enable or disable caching on a **per-request** basis. If set, it overrides the index-level setting: @@ -74,17 +74,17 @@ GET /my-index-000001/_search?request_cache=true Requests where `size` is greater than 0 will not be cached even if the request cache is enabled in the index settings. To cache these requests you will need to use the query-string parameter detailed here. -## Cache key [_cache_key] +## Cache key [_cache_key] A hash of the whole JSON body is used as the cache key.
This means that if the JSON changes — for instance if keys are output in a different order — then the cache key will not be recognised. -::::{tip} +::::{tip} Most JSON libraries support a *canonical* mode which ensures that JSON keys are always emitted in the same order. This canonical mode can be used in the application to ensure that a request is always serialized in the same way. :::: -## Monitoring cache usage [_monitoring_cache_usage] +## Monitoring cache usage [_monitoring_cache_usage] The size of the cache (in bytes) and the number of evictions can be viewed by index, with the [`indices-stats`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-stats) API: diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshot-restore.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshot-restore.md index 72d1a6ded..6c253ea99 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshot-restore.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshot-restore.md @@ -55,7 +55,7 @@ To retrieve a list of feature states, use the [Features API](https://www.elastic :::: -A feature state typically includes one or more [system indices or system data streams](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#system-indices). It may also include regular indices and data streams used by the feature. For example, a feature state may include a regular index that contains the feature’s execution history. Storing this history in a regular index lets you more easily search it. +A feature state typically includes one or more [system indices or system data streams](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#system-indices). It may also include regular indices and data streams used by the feature. For example, a feature state may include a regular index that contains the feature’s execution history. Storing this history in a regular index lets you more easily search it. In {{es}} 8.0 and later versions, feature states are the only way to back up and restore system indices and system data streams. @@ -115,7 +115,7 @@ You can’t restore an index to an earlier version of {{es}}. For example, you c A compatible snapshot can contain indices created in an older incompatible version. For example, a snapshot of an 8.17 cluster can contain an index created in 6.8. Restoring the 6.8 index to a 9.0 cluster fails unless you can use the [archive functionality](../../../deploy-manage/upgrade/deployment-or-cluster/reading-indices-from-older-elasticsearch-versions.md). Keep this in mind if you take a snapshot before upgrading a cluster. -As a workaround, you can first restore the index to another cluster running the latest version of {{es}} that’s compatible with both the index and your current cluster.
You can then use [reindex-from-remote](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) to rebuild the index on your current cluster. Reindex from remote is only possible if the index’s [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) is enabled. Reindexing from remote can take significantly longer than restoring a snapshot. Before you start, test the reindex from remote process with a subset of the data to estimate your time requirements. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshots-restore-snapshot.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshots-restore-snapshot.md index a33ad3f9e..1447b0438 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshots-restore-snapshot.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/snapshots-restore-snapshot.md @@ -23,7 +23,7 @@ This guide also provides tips for [restoring to another cluster](../../../deploy * You can only restore a snapshot to a running cluster with an elected [master node](../../../deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md#master-node-role). The snapshot’s repository must be [registered](../../../deploy-manage/tools/snapshot-and-restore/self-managed.md) and available to the cluster. * The snapshot and cluster versions must be compatible. See [Snapshot compatibility](../../../deploy-manage/tools/snapshot-and-restore.md#snapshot-restore-version-compatibility). -* To restore a snapshot, the cluster’s global metadata must be writable. Ensure there aren’t any [cluster blocks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-read-only) that prevent writes. The restore operation ignores [index blocks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md). +* To restore a snapshot, the cluster’s global metadata must be writable. Ensure there aren’t any [cluster blocks](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-read-only) that prevent writes. The restore operation ignores [index blocks](elasticsearch://reference/elasticsearch/index-settings/index-block.md). * Before you restore a data stream, ensure the cluster contains a [matching index template](../../../manage-data/data-store/data-streams/set-up-data-stream.md#create-index-template) with data stream enabled. To check, use {{kib}}'s [**Index Management**](../../../manage-data/lifecycle/index-lifecycle-management/index-management-in-kibana.md#manage-index-templates) feature or the [get index template API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-index-template): ```console @@ -83,7 +83,7 @@ If you’re restoring data to a pre-existing cluster, use one of the following m The simplest way to avoid conflicts is to delete an existing index or data stream before restoring it. To prevent the accidental re-creation of the index or data stream, we recommend you temporarily stop all indexing until the restore operation is complete. 
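As an illustration, a minimal sketch of this delete-then-restore flow (the repository, snapshot, and index names are hypothetical):

```console
DELETE /my-index

POST /_snapshot/my_repository/my_snapshot/_restore
{
  "indices": "my-index"
}
```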
::::{warning} -If the [`action.destructive_requires_name`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-management-settings.md#action-destructive-requires-name) cluster setting is `false`, don’t use the [delete index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-delete) to target the `*` or `.*` wildcard pattern. If you use {{es}}'s security features, this will delete system indices required for authentication. Instead, target the `*,-.*` wildcard pattern to exclude these system indices and other index names that begin with a dot (`.`). +If the [`action.destructive_requires_name`](elasticsearch://reference/elasticsearch/configuration-reference/index-management-settings.md#action-destructive-requires-name) cluster setting is `false`, don’t use the [delete index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-delete) to target the `*` or `.*` wildcard pattern. If you use {{es}}'s security features, this will delete system indices required for authentication. Instead, target the `*,-.*` wildcard pattern to exclude these system indices and other index names that begin with a dot (`.`). :::: @@ -279,7 +279,7 @@ If you’re restoring to a different cluster, see [Restore to a different cluste } ``` -3. $$$restore-create-file-realm-user$$$If you use {{es}} security features, log in to a node host, navigate to the {{es}} installation directory, and add a user with the `superuser` role to the file realm using the [`elasticsearch-users`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/users-command.md) tool. +3. $$$restore-create-file-realm-user$$$If you use {{es}} security features, log in to a node host, navigate to the {{es}} installation directory, and add a user with the `superuser` role to the file realm using the [`elasticsearch-users`](elasticsearch://reference/elasticsearch/command-line-tools/users-command.md) tool. For example, the following command creates a user named `restore_user`. @@ -289,7 +289,7 @@ If you’re restoring to a different cluster, see [Restore to a different cluste Use this file realm user to authenticate requests until the restore operation is complete. -4. Use the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) to set [`action.destructive_requires_name`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-management-settings.md#action-destructive-requires-name) to `false`. This lets you delete data streams and indices using wildcards. +4. Use the [cluster update settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) to set [`action.destructive_requires_name`](elasticsearch://reference/elasticsearch/configuration-reference/index-management-settings.md#action-destructive-requires-name) to `false`. This lets you delete data streams and indices using wildcards. ```console PUT _cluster/settings @@ -466,7 +466,7 @@ Before you start a restore operation, ensure the new cluster has enough capacity * Add nodes or upgrade your hardware to increase capacity. * Restore fewer indices and data streams. -* Reduce the [number of replicas](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) for restored indices. 
+* Reduce the [number of replicas](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) for restored indices. For example, the following restore snapshot API request uses the `index_settings` option to set `index.number_of_replicas` to `1`. diff --git a/raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md b/raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md index ea7f80aae..8bcb91a19 100644 --- a/raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md +++ b/raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md @@ -21,7 +21,7 @@ A good place to start is with one of our Elastic solutions, which offer experien * **Elastic connectors and crawler.** - * Create searchable mirrors of your data in Sharepoint Online, S3, Google Drive, and many other web services using our open code [Elastic connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md). + * Create searchable mirrors of your data in Sharepoint Online, S3, Google Drive, and many other web services using our open code [Elastic connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md). * Discover, extract, and index your web content into {{es}} using the [Elastic web crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html). * **Elastic Observability.** Get logs, metrics, traces, and uptime data into the Elastic Stack. Integrations are available for popular services and platforms, such as Nginx, AWS, and MongoDB, and generic input types like log files. Refer to [Elastic Observability](../../../solutions/observability/get-started/what-is-elastic-observability.md) for more information. diff --git a/raw-migrated-files/kibana/kibana/console-kibana.md b/raw-migrated-files/kibana/kibana/console-kibana.md index 8c775b5c3..a3e003bdd 100644 --- a/raw-migrated-files/kibana/kibana/console-kibana.md +++ b/raw-migrated-files/kibana/kibana/console-kibana.md @@ -1,6 +1,6 @@ # Run API requests with Console [console-kibana] -**Console** lets you interact with [{{es}} APIs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md) and [{{kib}} APIs](https://www.elastic.co/docs/api) from within {{kib}}. +**Console** lets you interact with [{{es}} APIs](elasticsearch://reference/elasticsearch/rest-apis/index.md) and [{{kib}} APIs](https://www.elastic.co/docs/api) from within {{kib}}. :::{image} ../../../images/kibana-console.png :alt: Console diff --git a/raw-migrated-files/kibana/kibana/elasticsearch-mutual-tls.md b/raw-migrated-files/kibana/kibana/elasticsearch-mutual-tls.md index 0da3e478d..af542141a 100644 --- a/raw-migrated-files/kibana/kibana/elasticsearch-mutual-tls.md +++ b/raw-migrated-files/kibana/kibana/elasticsearch-mutual-tls.md @@ -34,7 +34,7 @@ If you haven’t already, start {{kib}} and connect it to {{es}} using the [enro :::: - You may choose to generate a client certificate and private key using the [`elasticsearch-certutil`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md) tool. If you followed the {{es}} documentation for [generating the certificates authority](../../../deploy-manage/security/set-up-basic-security.md#generate-certificates), then you already have a certificate authority (CA) to sign the {{es}} server certificate. You may choose to use the same CA to sign the {{kib}} client certificate. 
For example: + You may choose to generate a client certificate and private key using the [`elasticsearch-certutil`](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md) tool. If you followed the {{es}} documentation for [generating the certificates authority](../../../deploy-manage/security/set-up-basic-security.md#generate-certificates), then you already have a certificate authority (CA) to sign the {{es}} server certificate. You may choose to use the same CA to sign the {{kib}} client certificate. For example: ```sh bin/elasticsearch-certutil cert -ca elastic-stack-ca.p12 -name kibana-client -dns diff --git a/raw-migrated-files/kibana/kibana/search-ai-assistant.md b/raw-migrated-files/kibana/kibana/search-ai-assistant.md index e732113a3..2e35dc2be 100644 --- a/raw-migrated-files/kibana/kibana/search-ai-assistant.md +++ b/raw-migrated-files/kibana/kibana/search-ai-assistant.md @@ -104,18 +104,18 @@ This functionality is not available on Elastic Cloud Serverless projects. :::: -You can ingest external data (GitHub issues, Markdown files, Jira tickets, text files, etc.) into {{es}} using [Search Connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md). Connectors sync third party data sources to {{es}}. +You can ingest external data (GitHub issues, Markdown files, Jira tickets, text files, etc.) into {{es}} using [Search Connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md). Connectors sync third party data sources to {{es}}. -Supported service types include [GitHub](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/es-connectors-github.md), [Slack](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/es-connectors-slack.md), [Jira](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/es-connectors-jira.md), and more. These can be Elastic managed or self-managed on your own infrastructure. +Supported service types include [GitHub](elasticsearch://reference/ingestion-tools/search-connectors/es-connectors-github.md), [Slack](elasticsearch://reference/ingestion-tools/search-connectors/es-connectors-slack.md), [Jira](elasticsearch://reference/ingestion-tools/search-connectors/es-connectors-jira.md), and more. These can be Elastic managed or self-managed on your own infrastructure. To create a connector and make its content available to the AI Assistant knowledge base, follow these steps: 1. **In {{kib}} UI, go to *Search → Content → Connectors* and follow the instructions to create a new connector.** - For example, if you create a [GitHub connector](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/es-connectors-github.md) you must set a `name`, attach it to a new or existing `index`, add your `personal access token` and include the `list of repositories` to synchronize. + For example, if you create a [GitHub connector](elasticsearch://reference/ingestion-tools/search-connectors/es-connectors-github.md) you must set a `name`, attach it to a new or existing `index`, add your `personal access token` and include the `list of repositories` to synchronize. ::::{tip} - Learn more about configuring and [using connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/connectors-ui-in-kibana.md) in the Elasticsearch documentation. 
+ Learn more about configuring and [using connectors](elasticsearch://reference/ingestion-tools/search-connectors/connectors-ui-in-kibana.md) in the Elasticsearch documentation. :::: 2. **Create a pipeline and process the data with ELSER.** diff --git a/raw-migrated-files/kibana/kibana/secure-reporting.md b/raw-migrated-files/kibana/kibana/secure-reporting.md index 8c49e6c9f..93446ae51 100644 --- a/raw-migrated-files/kibana/kibana/secure-reporting.md +++ b/raw-migrated-files/kibana/kibana/secure-reporting.md @@ -198,7 +198,7 @@ To automatically generate reports with {{watcher}}, you must configure {{watcher xpack.http.ssl.certificate_authorities: ["/path/to/your/cacert1.pem", "/path/to/your/cacert2.pem"] ``` - For more information, see [the {{watcher}} HTTP TLS/SSL Settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/watcher-settings.md#ssl-notification-settings). + For more information, see [the {{watcher}} HTTP TLS/SSL Settings](elasticsearch://reference/elasticsearch/configuration-reference/watcher-settings.md#ssl-notification-settings). 4. Add one or more users who have access to the {{report-features}}. @@ -240,5 +240,5 @@ If using PNG/PDF {{report-features}} in a production environment, it is preferre ## Ensure {{es}} allows built-in templates [reporting-elasticsearch-configuration] -Reporting relies on {{es}} to install a mapping template for the data stream that stores reports. Ensure that {{es}} allows built-in templates to be installed by keeping the `stack.templates.enabled` setting at the default value of `true`. For more information, see [Index management settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-management-settings.md#stack-templates-enabled). +Reporting relies on {{es}} to install a mapping template for the data stream that stores reports. Ensure that {{es}} allows built-in templates to be installed by keeping the `stack.templates.enabled` setting at the default value of `true`. For more information, see [Index management settings](elasticsearch://reference/elasticsearch/configuration-reference/index-management-settings.md#stack-templates-enabled). diff --git a/raw-migrated-files/kibana/kibana/using-kibana-with-security.md b/raw-migrated-files/kibana/kibana/using-kibana-with-security.md index eb5ade401..28aee5568 100644 --- a/raw-migrated-files/kibana/kibana/using-kibana-with-security.md +++ b/raw-migrated-files/kibana/kibana/using-kibana-with-security.md @@ -9,13 +9,13 @@ When you start {{es}} for the first time, {{stack-security-features}} are enable You can then log in to {{kib}} as the `elastic` user to create additional roles and users. -::::{note} +::::{note} When a user is not authorized to view data in an index (such as an {{es}} index), the entire index will be inaccessible and not display in {{kib}}. :::: -## Configure security settings [security-configure-settings] +## Configure security settings [security-configure-settings] Set an encryption key so that sessions are not invalidated. You can optionally configure additional security settings and authentication. @@ -32,14 +32,14 @@ Set an encryption key so that sessions are not invalidated. You can optionally c 4. Restart {{kib}}. -## Create roles and users [security-create-roles] +## Create roles and users [security-create-roles] Configure roles for your {{kib}} users to control what data those users can access. 1. 
Temporarily log in to {{kib}} using the built-in `elastic` superuser so you can create new users and assign roles. If you are running {{kib}} locally, go to `https://localhost:5601` to view the login page. - ::::{note} - The password for the built-in `elastic` user is generated as part of the security configuration process on {{es}}. If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/reset-password.md) tool. + ::::{note} + The password for the built-in `elastic` user is generated as part of the security configuration process on {{es}}. If you need to reset the password for the `elastic` user or other built-in users, run the [`elasticsearch-reset-password`](elasticsearch://reference/elasticsearch/command-line-tools/reset-password.md) tool. :::: 2. $$$kibana-roles$$$Create roles and users to grant access to {{kib}}. @@ -56,13 +56,13 @@ Configure roles for your {{kib}} users to control what data those users can acce } ``` - ::::{tip} + ::::{tip} For more information on Basic Authentication and additional methods of authenticating {{kib}} users, see [Authentication](../../../deploy-manage/users-roles/cluster-or-deployment-auth/user-authentication.md). :::: 3. Grant users access to the indices that they will be working with in {{kib}}. - ::::{tip} + ::::{tip} You can define as many different roles for your {{kib}} users as you need. :::: @@ -71,7 +71,7 @@ Configure roles for your {{kib}} users to control what data those users can acce 4. Log out of {{kib}} and verify that you can log in as a normal user. If you are running {{kib}} locally, go to `https://localhost:5601` and enter the credentials for a user you’ve assigned a {{kib}} user role. For example, you could log in as the user `jacknich`. - ::::{note} + ::::{note} This must be a user who has been assigned [Kibana privileges](../../../deploy-manage/users-roles/cluster-or-deployment-auth/kibana-privileges.md). {{kib}} server credentials (the built-in `kibana_system` user) should only be used internally by the {{kib}} server. :::: diff --git a/raw-migrated-files/observability-docs/observability/obs-ai-assistant.md b/raw-migrated-files/observability-docs/observability/obs-ai-assistant.md index 863e39e2a..53db9dbe8 100644 --- a/raw-migrated-files/observability-docs/observability/obs-ai-assistant.md +++ b/raw-migrated-files/observability-docs/observability/obs-ai-assistant.md @@ -39,7 +39,7 @@ Also, the data you provide to the Observability AI assistant is *not* anonymized The AI assistant requires the following: * {{stack}} version 8.9 and later. -* A [self-managed](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) connector service must be deployed if search connectors are used to populate external data into the knowledge base. +* A [self-managed](elasticsearch://reference/ingestion-tools/search-connectors/self-managed-connectors.md) connector service must be deployed if search connectors are used to populate external data into the knowledge base. * An account with a third-party generative AI provider that preferably supports function calling. If your AI provider does not support function calling, you can configure AI Assistant settings under **Stack Management** to simulate function calling, but this might affect performance. 
Refer to the [connector documentation](../../../deploy-manage/manage-connectors.md) for your provider to learn about supported and default models. @@ -142,16 +142,16 @@ To add external data to the knowledge base in {{kib}}: ### Use search connectors [obs-ai-search-connectors] ::::{tip} -The [search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) described in this section differ from the [Stack management → Connectors](../../../deploy-manage/manage-connectors.md) configured during the [AI Assistant setup](../../../solutions/observability/observability-ai-assistant.md#obs-ai-set-up). Search connectors are only needed when importing external data into the Knowledge base of the AI Assistant, while the stack connector to the LLM is required for the AI Assistant to work. +The [search connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) described in this section differ from the [Stack management → Connectors](../../../deploy-manage/manage-connectors.md) configured during the [AI Assistant setup](../../../solutions/observability/observability-ai-assistant.md#obs-ai-set-up). Search connectors are only needed when importing external data into the Knowledge base of the AI Assistant, while the stack connector to the LLM is required for the AI Assistant to work. :::: -[Connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) allow you to index content from external sources thereby making it available for the AI Assistant. This can greatly improve the relevance of the AI Assistant’s responses. Data can be integrated from sources such as GitHub, Confluence, Google Drive, Jira, AWS S3, Microsoft Teams, Slack, and more. +[Connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) allow you to index content from external sources, thereby making it available for the AI Assistant. This can greatly improve the relevance of the AI Assistant’s responses. Data can be integrated from sources such as GitHub, Confluence, Google Drive, Jira, AWS S3, Microsoft Teams, Slack, and more. UI affordances for creating and managing search connectors are available in the Search Solution in {{kib}}. You can also use the {{es}} [Connector APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-connector) to create and manage search connectors. -A [self-managed](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) connector service must be deployed to run connectors. +A [self-managed](elasticsearch://reference/ingestion-tools/search-connectors/self-managed-connectors.md) connector service must be deployed to run connectors. By default, the AI Assistant queries all search connector indices. To override this behavior and customize which indices are queried, adjust the **Search connector index pattern** setting on the [AI Assistant Settings](../../../solutions/observability/observability-ai-assistant.md#obs-ai-settings) page. This allows precise control over which data sources are included in the AI Assistant knowledge base. @@ -166,9 +166,9 @@ To create a connector in the {{kib}} UI and make its content available to the AI 2. Follow the instructions to create a new connector.
- For example, if you create a [GitHub connector](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/es-connectors-github.md) you have to set a `name`, attach it to a new or existing `index`, add your `personal access token` and include the `list of repositories` to synchronize. + For example, if you create a [GitHub connector](elasticsearch://reference/ingestion-tools/search-connectors/es-connectors-github.md) you have to set a `name`, attach it to a new or existing `index`, add your `personal access token` and include the `list of repositories` to synchronize. - Learn more about configuring and [using connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/connectors-ui-in-kibana.md) in the Elasticsearch documentation. + Learn more about configuring and [using connectors](elasticsearch://reference/ingestion-tools/search-connectors/connectors-ui-in-kibana.md) in the Elasticsearch documentation. After creating your connector, create the embeddings needed by the AI Assistant. You can do this using either: @@ -194,7 +194,7 @@ After creating the pipeline, complete the following steps: Once the pipeline is set up, perform a **Full Content Sync** of the connector. The inference pipeline will process the data as follows: - * As data comes in, ELSER is applied to the data, and embeddings (weights and tokens into a [sparse vector field](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md)) are added to capture semantic meaning and context of the data. + * As data comes in, ELSER is applied to the data, and embeddings (weights and tokens into a [sparse vector field](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md)) are added to capture semantic meaning and context of the data. * When you look at the ingested documents, you can see the embeddings are added to the `predicted_value` field in the documents. 2. Check if AI Assistant can use the index (optional). @@ -205,7 +205,7 @@ After creating the pipeline, complete the following steps: #### Use a `semantic_text` field type to create AI Assistant embeddings [obs-ai-search-connectors-semantic-text] -To create the embeddings needed by the AI Assistant using a [`semantic_text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md) field type: +To create the embeddings needed by the AI Assistant using a [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field type: 1. Open the previously created connector, and select the **Mappings** tab. 2. Select **Add field**. diff --git a/raw-migrated-files/stack-docs/elastic-stack/air-gapped-install.md b/raw-migrated-files/stack-docs/elastic-stack/air-gapped-install.md index 2f7fb5909..67e6e759b 100644 --- a/raw-migrated-files/stack-docs/elastic-stack/air-gapped-install.md +++ b/raw-migrated-files/stack-docs/elastic-stack/air-gapped-install.md @@ -59,7 +59,7 @@ Air-gapped install of {{es}} may require additional steps in order to access som Specifically: -* To be able to use the GeoIP processor, refer to [the GeoIP processor documentation](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md#manually-update-geoip-databases) for instructions on downloading and deploying the required databases. 
+* To be able to use the GeoIP processor, refer to [the GeoIP processor documentation](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md#manually-update-geoip-databases) for instructions on downloading and deploying the required databases. * Refer to [{{ml-cap}}](../../../deploy-manage/deploy/self-managed/air-gapped-install.md#air-gapped-machine-learning) for instructions on deploying the Elastic Learned Sparse EncodeR (ELSER) natural language processing (NLP) model and other trained {{ml}} models. diff --git a/raw-migrated-files/stack-docs/elastic-stack/install-stack-demo-secure.md b/raw-migrated-files/stack-docs/elastic-stack/install-stack-demo-secure.md index 32f38892d..e8e2ae9f8 100644 --- a/raw-migrated-files/stack-docs/elastic-stack/install-stack-demo-secure.md +++ b/raw-migrated-files/stack-docs/elastic-stack/install-stack-demo-secure.md @@ -40,7 +40,7 @@ In a production environment you would typically use the CA certificate from your sudo systemctl stop elasticsearch.service ``` -2. Generate a CA certificate using the provided certificate utility, `elasticsearch-certutil`. Note that the location of the utility depends on the installation method you used to install {{es}}. Refer to [elasticsearch-certutil](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md) for the command details and to [Update security certificates with a different CA](../../../deploy-manage/security/different-ca.md) for details about the procedure as a whole. +2. Generate a CA certificate using the provided certificate utility, `elasticsearch-certutil`. Note that the location of the utility depends on the installation method you used to install {{es}}. Refer to [elasticsearch-certutil](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md) for the command details and to [Update security certificates with a different CA](../../../deploy-manage/security/different-ca.md) for details about the procedure as a whole. Run the following command. When prompted, specify a unique name for the output file, such as `elastic-stack-ca.zip`: diff --git a/raw-migrated-files/stack-docs/elastic-stack/installing-stack-demo-self.md b/raw-migrated-files/stack-docs/elastic-stack/installing-stack-demo-self.md index c38777164..c4fec1844 100644 --- a/raw-migrated-files/stack-docs/elastic-stack/installing-stack-demo-self.md +++ b/raw-migrated-files/stack-docs/elastic-stack/installing-stack-demo-self.md @@ -149,7 +149,7 @@ Before moving ahead to configure additional {{es}} nodes, you’ll need to updat ``` ::::{tip} - You can find details about the `network.host` and `transport.host` settings in the {{es}} [Networking](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md) documentation. + You can find details about the `network.host` and `transport.host` settings in the {{es}} [Networking](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md) documentation. :::: 6. Save your changes and close the editor. 
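For reference, a minimal sketch of how these two settings might look in `elasticsearch.yml` (the addresses are illustrative, not taken from this guide):

```yaml
# Bind HTTP traffic to the node's own IP address (illustrative value)
network.host: 10.128.0.84
# Accept transport connections on all interfaces (illustrative value)
transport.host: 0.0.0.0
```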
diff --git a/raw-migrated-files/stack-docs/elastic-stack/upgrading-elastic-stack.md b/raw-migrated-files/stack-docs/elastic-stack/upgrading-elastic-stack.md index 6c89e4640..875b73c09 100644 --- a/raw-migrated-files/stack-docs/elastic-stack/upgrading-elastic-stack.md +++ b/raw-migrated-files/stack-docs/elastic-stack/upgrading-elastic-stack.md @@ -17,7 +17,7 @@ Upgrading from a release candidate build, such as 8.0.0-rc1 or 8.0.0-rc2, is not * [APM breaking changes](https://www.elastic.co/guide/en/observability/current/apm-breaking.html) * [{{beats}} breaking changes](asciidocalypse://docs/beats/docs/release-notes/breaking-changes.md) - * [{{es}} migration guide](asciidocalypse://docs/elasticsearch/docs/release-notes/breaking-changes.md) + * [{{es}} migration guide](elasticsearch://release-notes/breaking-changes.md) * [{{elastic-sec}} release notes](https://www.elastic.co/guide/en/security/current/release-notes.html) * [{{ents}} release notes](https://www.elastic.co/guide/en/enterprise-search/current/changelog.html) * [{{fleet}} and {{agent}} release notes](asciidocalypse://docs/docs-content/docs/release-notes/fleet.md) @@ -74,7 +74,7 @@ You can view your remote clusters from **Stack Management > Remote Clusters**. 3. Make the recommended changes to ensure that your applications continue to operate as expected after the upgrade. ::::{note} - As a temporary solution, you can submit requests to 9.x using the 8.x syntax with the REST API compatibility mode. While this enables you to submit requests that use the old syntax, it does not guarantee the same behavior. REST API compatibility should be a bridge to smooth out the upgrade process, not a long term strategy. For more information, see [REST API compatibility](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/compatibility.md). + As a temporary solution, you can submit requests to 9.x using the 8.x syntax with the REST API compatibility mode. While this enables you to submit requests that use the old syntax, it does not guarantee the same behavior. REST API compatibility should be a bridge to smooth out the upgrade process, not a long term strategy. For more information, see [REST API compatibility](elasticsearch://reference/elasticsearch/rest-apis/compatibility.md). :::: 4. If you use any {{es}} plugins, make sure there is a version of each plugin that is compatible with {{es}} version 9.0.0-beta1. diff --git a/raw-migrated-files/stack-docs/elastic-stack/upgrading-elasticsearch.md b/raw-migrated-files/stack-docs/elastic-stack/upgrading-elasticsearch.md index 80e972006..638c157f9 100644 --- a/raw-migrated-files/stack-docs/elastic-stack/upgrading-elasticsearch.md +++ b/raw-migrated-files/stack-docs/elastic-stack/upgrading-elasticsearch.md @@ -14,7 +14,7 @@ To upgrade a cluster: 1. **Disable shard allocation**. - When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. 
You can avoid racing the clock by [disabling allocation](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#data-node): + When you shut down a data node, the allocation process waits for `index.unassigned.node_left.delayed_timeout` (by default, one minute) before starting to replicate the shards on that node to other nodes in the cluster, which can involve a lot of I/O. Since the node is shortly going to be restarted, this I/O is unnecessary. You can avoid racing the clock by [disabling allocation](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) of replicas before shutting down [data nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#data-node): ```console PUT _cluster/settings diff --git a/solutions/observability/apps/built-in-data-filters.md b/solutions/observability/apps/built-in-data-filters.md index 371654d24..e6440edfe 100644 --- a/solutions/observability/apps/built-in-data-filters.md +++ b/solutions/observability/apps/built-in-data-filters.md @@ -59,7 +59,7 @@ This setting supports [Central configuration](apm-agent-central-configuration.md By default, the APM Server captures some personal data associated with trace events: -* `client.ip`: The client’s IP address. Typically derived from the HTTP headers of incoming requests. `client.ip` is also used in conjunction with the [`geoip` processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/geoip-processor.md) to assign geographical information to trace events. To learn more about how `client.ip` is derived, see [Deriving an incoming request’s `client.ip` address](anonymous-authentication.md#apm-derive-client-ip). +* `client.ip`: The client’s IP address. Typically derived from the HTTP headers of incoming requests. `client.ip` is also used in conjunction with the [`geoip` processor](elasticsearch://reference/ingestion-tools/enrich-processor/geoip-processor.md) to assign geographical information to trace events. To learn more about how `client.ip` is derived, see [Deriving an incoming request’s `client.ip` address](anonymous-authentication.md#apm-derive-client-ip). * `user_agent`: User agent data, including the client operating system, device name, vendor, and version. The capturing of this data can be turned off by setting **Capture personal data** to `false`. diff --git a/solutions/observability/apps/configure-logstash-output.md b/solutions/observability/apps/configure-logstash-output.md index 04a546e18..7d11a8ee1 100644 --- a/solutions/observability/apps/configure-logstash-output.md +++ b/solutions/observability/apps/configure-logstash-output.md @@ -308,7 +308,7 @@ To use SSL mutual authentication: 1. Create a certificate authority (CA) and use it to sign the certificates that you plan to use for APM Server and {{ls}}. Creating a correct SSL/TLS infrastructure is outside the scope of this document. There are many online resources available that describe how to create certificates. 
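As a minimal sketch of this step using the `elasticsearch-certutil` tool mentioned in the tip below (file and instance names are illustrative):

```sh
# Create a CA (you are prompted for an output file name and password)
bin/elasticsearch-certutil ca
# Use the CA to sign a certificate for Logstash
bin/elasticsearch-certutil cert -ca elastic-stack-ca.p12 -name logstash
```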
::::{tip} - If you are using {{security-features}}, you can use the [`elasticsearch-certutil` tool](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md) to generate certificates. + If you are using {{security-features}}, you can use the [`elasticsearch-certutil` tool](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md) to generate certificates. :::: 2. Configure APM Server to use SSL. In the `apm-server.yml` config file, specify the following settings under `ssl`: @@ -333,7 +333,7 @@ To use SSL mutual authentication: * `ssl`: When set to true, enables {{ls}} to use SSL/TLS. * `ssl_certificate_authorities`: Configures {{ls}} to trust any certificates signed by the specified CA. * `ssl_certificate` and `ssl_key`: Specify the certificate and key that {{ls}} uses to authenticate with the client. - * `ssl_verify_mode`: Specifies whether the {{ls}} server verifies the client certificate against the CA. You need to specify either `peer` or `force_peer` to make the server ask for the certificate and validate it. If you specify `force_peer`, and APM Server doesn’t provide a certificate, the {{ls}} connection will be closed. If you choose not to use [`certutil`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/certutil.md), the certificates that you obtain must allow for both `clientAuth` and `serverAuth` if the extended key usage extension is present. + * `ssl_verify_mode`: Specifies whether the {{ls}} server verifies the client certificate against the CA. You need to specify either `peer` or `force_peer` to make the server ask for the certificate and validate it. If you specify `force_peer`, and APM Server doesn’t provide a certificate, the {{ls}} connection will be closed. If you choose not to use [`certutil`](elasticsearch://reference/elasticsearch/command-line-tools/certutil.md), the certificates that you obtain must allow for both `clientAuth` and `serverAuth` if the extended key usage extension is present. For example: diff --git a/solutions/observability/apps/create-assign-feature-roles-to-apm-server-users.md b/solutions/observability/apps/create-assign-feature-roles-to-apm-server-users.md index e07db60f6..ac7fc4a10 100644 --- a/solutions/observability/apps/create-assign-feature-roles-to-apm-server-users.md +++ b/solutions/observability/apps/create-assign-feature-roles-to-apm-server-users.md @@ -65,7 +65,7 @@ To grant an APM Server user the required privileges for writing events to {{es}} | --- | --- | --- | | Index | `auto_configure` on `traces-apm*`, `logs-apm*`, and `metrics-apm*` indices | Permits auto-creation of indices and data streams | | Index | `create_doc` on `traces-apm*`, `logs-apm*`, and `metrics-apm*` indices | Write events into {{es}} | - | Cluster | `monitor` | Allows cluster UUID checks, which are performed as part of APM server startup preconditionsif [Elasticsearch security](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md) is enabled (it is enabled by default), and allows a license check, which is required if [tail-based sampling](transaction-sampling.md#apm-tail-based-sampling) is enabled. 
|
 + | Cluster | `monitor` | Allows cluster UUID checks, which are performed as part of APM server startup preconditions if [Elasticsearch security](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md) is enabled (it is enabled by default), and allows a license check, which is required if [tail-based sampling](transaction-sampling.md#apm-tail-based-sampling) is enabled. |


::::{note}
diff --git a/solutions/observability/apps/custom-filters.md b/solutions/observability/apps/custom-filters.md
index d3c9f800c..6023883c8 100644
--- a/solutions/observability/apps/custom-filters.md
+++ b/solutions/observability/apps/custom-filters.md
@@ -51,7 +51,7 @@ Say you decide to [capture HTTP request bodies](built-in-data-filters.md#apm-fil
 }
 ```

-To obfuscate the passwords stored in the request body, you can use a series of [ingest processors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/index.md).
+To obfuscate the passwords stored in the request body, you can use a series of [ingest processors](elasticsearch://reference/ingestion-tools/enrich-processor/index.md).


### Create a pipeline [_create_a_pipeline]

@@ -78,7 +78,7 @@ To start, create a pipeline with a simple description and an empty array of proc


#### Add a JSON processor [_add_a_json_processor]

-Add your first processor to the processors array. Because the agent captures the request body as a string, use the [JSON processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/json-processor.md) to convert the original field value into a structured JSON object. Save this JSON object in a new field:
+Add your first processor to the processors array. Because the agent captures the request body as a string, use the [JSON processor](elasticsearch://reference/ingestion-tools/enrich-processor/json-processor.md) to convert the original field value into a structured JSON object. Save this JSON object in a new field:

```json
{
@@ -93,7 +93,7 @@ Add your first processor to the processors array.
Because the agent captures the #### Add a set processor [_add_a_set_processor] -If `body.original_json` is not `null`, i.e., it exists, we’ll redact the `password` with the [set processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/set-processor.md), by setting the value of `body.original_json.password` to `"redacted"`: +If `body.original_json` is not `null`, i.e., it exists, we’ll redact the `password` with the [set processor](elasticsearch://reference/ingestion-tools/enrich-processor/set-processor.md), by setting the value of `body.original_json.password` to `"redacted"`: ```json { @@ -108,7 +108,7 @@ If `body.original_json` is not `null`, i.e., it exists, we’ll redact the `pass #### Add a convert processor [_add_a_convert_processor] -Use the [convert processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/convert-processor.md) to convert the JSON value of `body.original_json` to a string and set it as the `body.original` value: +Use the [convert processor](elasticsearch://reference/ingestion-tools/enrich-processor/convert-processor.md) to convert the JSON value of `body.original_json` to a string and set it as the `body.original` value: ```json { @@ -125,7 +125,7 @@ Use the [convert processor](asciidocalypse://docs/elasticsearch/docs/reference/i #### Add a remove processor [_add_a_remove_processor] -Finally, use the [remove processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/remove-processor.md) to remove the `body.original_json` field: +Finally, use the [remove processor](elasticsearch://reference/ingestion-tools/enrich-processor/remove-processor.md) to remove the `body.original_json` field: ```json { diff --git a/solutions/observability/apps/data-streams.md b/solutions/observability/apps/data-streams.md index bf0cefcbb..e3dd1d3a1 100644 --- a/solutions/observability/apps/data-streams.md +++ b/solutions/observability/apps/data-streams.md @@ -60,7 +60,7 @@ Metrics ::::{important} - Additional storage efficiencies provided by [Synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) are available to users with an [appropriate license](https://www.elastic.co/subscriptions). + Additional storage efficiencies provided by [Synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) are available to users with an [appropriate license](https://www.elastic.co/subscriptions). :::: @@ -75,7 +75,7 @@ Logs ## APM data stream rerouting [apm-data-stream-rerouting] -APM supports rerouting APM data to user-defined APM data stream names other than the defaults. This can be achieved by using a [`reroute` processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/reroute-processor.md) in ingest pipelines to set the data stream dataset or namespace. The benefit of separating APM data streams is that custom retention and security policies can be used. +APM supports rerouting APM data to user-defined APM data stream names other than the defaults. This can be achieved by using a [`reroute` processor](elasticsearch://reference/ingestion-tools/enrich-processor/reroute-processor.md) in ingest pipelines to set the data stream dataset or namespace. The benefit of separating APM data streams is that custom retention and security policies can be used. For example, consider traces that would originally be indexed to `traces-apm-default`. 
To set the data stream namespace from the trace’s `service.environment` and fallback to a static string `"default"`, create an ingest pipeline named `traces-apm@custom` which will be used automatically: diff --git a/solutions/observability/apps/fleet-managed-apm-server.md b/solutions/observability/apps/fleet-managed-apm-server.md index 7bdeb58db..2ed5add63 100644 --- a/solutions/observability/apps/fleet-managed-apm-server.md +++ b/solutions/observability/apps/fleet-managed-apm-server.md @@ -16,7 +16,7 @@ You need {{es}} for storing and searching your data, and {{kib}} for visualizing * Secure, encrypted connection between {{kib}} and {{es}}. For more information, see [Start the {{stack}} with security enabled](../../../deploy-manage/deploy/self-managed/installing-elasticsearch.md). * Internet connection for {{kib}} to download integration packages from the {{package-registry}}. Make sure the {{kib}} server can connect to `https://epr.elastic.co` on port `443`. If your environment has network traffic restrictions, there are ways to work around this requirement. See [Air-gapped environments](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/air-gapped.md) for more information. * {{kib}} user with `All` privileges on {{fleet}} and {{integrations}}. Since many Integrations assets are shared across spaces, users need the {{kib}} privileges in all spaces. -* In the {{es}} configuration, the [built-in API key service](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#api-key-service-settings) must be enabled. (`xpack.security.authc.api_key.enabled: true`) +* In the {{es}} configuration, the [built-in API key service](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#api-key-service-settings) must be enabled. (`xpack.security.authc.api_key.enabled: true`) * In the {{kib}} configuration, the saved objects encryption key must be set. {{fleet}} requires this setting in order to save API keys and encrypt them in {{kib}}. You can either set `xpack.encryptedSavedObjects.encryptionKey` to an alphanumeric value of at least 32 characters, or run the [`kibana-encryption-keys` command](asciidocalypse://docs/kibana/docs/reference/commands/kibana-encryption-keys.md) to generate the key. **Example security settings** @@ -85,7 +85,7 @@ You can install only a single {{agent}} per host, which means you cannot run {{f * Use your own {{fleet-server}} policy. You can create a new {{fleet-server}} policy or select an existing one. Alternatively you can [create a {{fleet-server}} policy without using the UI](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/create-policy-no-ui.md), and select the policy here. * Use your own TLS certificates to encrypt traffic between {{agent}}s and {{fleet-server}}. To learn how to generate certs, refer to [Configure SSL/TLS for self-managed {{fleet-server}}s](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/secure-connections.md). -* It’s recommended you generate a unique service token for each {{fleet-server}}. For other ways to generate service tokens, see [`elasticsearch-service-tokens`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/service-tokens-command.md). +* It’s recommended you generate a unique service token for each {{fleet-server}}. 
For other ways to generate service tokens, see [`elasticsearch-service-tokens`](elasticsearch://reference/elasticsearch/command-line-tools/service-tokens-command.md). * If you are providing your own certificates: * Before running the `install` command, make sure you replace the values in angle brackets. diff --git a/solutions/observability/apps/metadata.md b/solutions/observability/apps/metadata.md index 0806cd85f..b7f9a1089 100644 --- a/solutions/observability/apps/metadata.md +++ b/solutions/observability/apps/metadata.md @@ -13,7 +13,7 @@ Metadata can enrich your events and make application performance monitoring even Labels add **indexed** information to transactions, spans, and errors. Indexed means the data is searchable and aggregatable in {{es}}. Add additional key-value pairs to define multiple labels. * Indexed: Yes -* {{es}} type: [object](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) +* {{es}} type: [object](elasticsearch://reference/elasticsearch/mapping-reference/object.md) * {{es}} field: `labels` * Applies to: [Transactions](transactions.md) | [Spans](spans.md) | [Errors](errors.md) @@ -44,7 +44,7 @@ Custom context adds **non-indexed**, custom contextual information to transactio Non-indexed information is useful for providing contextual information to help you quickly debug performance issues or errors. * Indexed: No -* {{es}} type: [object](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/object.md) +* {{es}} type: [object](elasticsearch://reference/elasticsearch/mapping-reference/object.md) * {{es}} fields: `transaction.custom` | `error.custom` * Applies to: [Transactions](transactions.md) | [Errors](errors.md) @@ -72,7 +72,7 @@ Setting a circular object, a large object, or a non JSON serializable object can User context adds **indexed** user information to transactions and errors. Indexed means the data is searchable and aggregatable in {{es}}. * Indexed: Yes -* {{es}} type: [keyword](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) +* {{es}} type: [keyword](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) * {{es}} fields: `user.email` | `user.name` | `user.id` * Applies to: [Transactions](transactions.md) | [Errors](errors.md) diff --git a/solutions/observability/apps/switch-self-installation-to-apm-integration.md b/solutions/observability/apps/switch-self-installation-to-apm-integration.md index 10f3aed9d..cda1e6d0d 100644 --- a/solutions/observability/apps/switch-self-installation-to-apm-integration.md +++ b/solutions/observability/apps/switch-self-installation-to-apm-integration.md @@ -28,7 +28,7 @@ Review the APM [release notes](asciidocalypse://docs/docs-content/docs/release-n {{fleet}} Server is a component of the {{stack}} used to centrally manage {{agent}}s. The APM integration requires a {{fleet}} Server to be running and accessible to your hosts. Add a {{fleet}} Server by following [this guide](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/deployment-models.md). ::::{tip} -If you’re upgrading a self-managed deployment of the {{stack}}, you’ll need to enable [{{es}} security](../../../deploy-manage/deploy/self-managed/installing-elasticsearch.md) and the [API key service](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md). 
+If you’re upgrading a self-managed deployment of the {{stack}}, you’ll need to enable [{{es}} security](../../../deploy-manage/deploy/self-managed/installing-elasticsearch.md) and the [API key service](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md).

::::


diff --git a/solutions/observability/apps/tutorial-monitor-java-application.md b/solutions/observability/apps/tutorial-monitor-java-application.md
index 3a3ba106a..7f37d873f 100644
--- a/solutions/observability/apps/tutorial-monitor-java-application.md
+++ b/solutions/observability/apps/tutorial-monitor-java-application.md
@@ -1288,9 +1288,9 @@ Visualize the number of log messages over time, split by the log level. Since th

1. Log into {{kib}} and select **Visualize** → **Create Visualization**.
2. Create a line chart and select `metricbeat-*` as the source.

- The basic idea is to have a [max aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-max-aggregation.md) on the y-axis on the `prometheus.log4j2_events_total.rate` field, whereas the x-axis, is split by date using a [date_histogram aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) on the `@timestamp` field.
+ The basic idea is to have a [max aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-max-aggregation.md) on the y-axis on the `prometheus.log4j2_events_total.rate` field, whereas the x-axis is split by date using a [date_histogram aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) on the `@timestamp` field.

- There is one more split within each date histogram bucket, split by log level, using a [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) on the `prometheus.labels.level`, which contains the log level. Also, increase the size of the log level to six to display every log level.
+ There is one more split within each date histogram bucket, split by log level, using a [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) on the `prometheus.labels.level`, which contains the log level. Also, increase the size of the log level split to six to display every log level.

 The final result looks like this.

diff --git a/solutions/observability/apps/view-elasticsearch-index-template.md b/solutions/observability/apps/view-elasticsearch-index-template.md
index aa9e612fe..52d55e7a1 100644
--- a/solutions/observability/apps/view-elasticsearch-index-template.md
+++ b/solutions/observability/apps/view-elasticsearch-index-template.md
@@ -28,7 +28,7 @@ Add any custom metadata, index settings, or mappings.


### Index settings [apm-custom-index-template-index-settings]

-In the **Index settings** step, you can specify custom [index settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md). For example, you could:
+In the **Index settings** step, you can specify custom [index settings](elasticsearch://reference/elasticsearch/index-settings/index.md). For example, you could:

* Customize the index lifecycle policy applied to a data stream. See [custom index lifecycle policies](index-lifecycle-management.md#apm-data-streams-custom-policy) for a walk-through.
* Change the number of [shards](../../../deploy-manage/index.md) per index. Specify the number of primary shards:

@@ -57,7 +57,7 @@ In the **Index settings** step, you can specify custom [index settings](asciidoc

 [Mapping](../../../manage-data/data-store/mapping.md) is the process of defining how a document, and the fields it contains, are stored and indexed. In the **Mappings** step, you can add custom field mappings. For example, you could:

-* Add custom field mappings that you can index on and search. In the **Mapped fields** tab, add a new field including the [field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md):
+* Add custom field mappings that you can index on and search. In the **Mapped fields** tab, add a new field including the [field type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md):

    :::{image} ../../../images/observability-custom-index-template-mapped-fields.png
    :alt: Editing a component template to add a new mapped field
diff --git a/solutions/observability/data-set-quality-monitoring.md b/solutions/observability/data-set-quality-monitoring.md
index 6836924b6..23dcbf5a3 100644
--- a/solutions/observability/data-set-quality-monitoring.md
+++ b/solutions/observability/data-set-quality-monitoring.md
@@ -23,7 +23,7 @@ Users with the `viewer` role can view the Data Sets Quality summary. To view the

::::


-The quality of your data sets is based on the percentage of degraded documents in each data set. A degraded document in a data set contains the [`_ignored`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) property because one or more of its fields were ignored during indexing. Fields are ignored for a variety of reasons. For example, when the [`ignore_malformed`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) parameter is set to true, if a document field contains the wrong data type, the malformed field is ignored and the rest of the document is indexed.
+The quality of your data sets is based on the percentage of degraded documents in each data set. A degraded document in a data set contains the [`_ignored`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-ignored-field.md) property because one or more of its fields were ignored during indexing. Fields are ignored for a variety of reasons. For example, when the [`ignore_malformed`](elasticsearch://reference/elasticsearch/mapping-reference/ignore-malformed.md) parameter is set to true, if a document field contains the wrong data type, the malformed field is ignored and the rest of the document is indexed.

From the data set table, you’ll find information for each data set such as its namespace, when the data set was last active, and the percentage of degraded docs.
The percentage of degraded documents determines the data set’s quality according to the following scale: diff --git a/solutions/observability/incident-management/configure-service-level-objective-slo-access.md b/solutions/observability/incident-management/configure-service-level-objective-slo-access.md index 2a3747f0f..a4cc7b821 100644 --- a/solutions/observability/incident-management/configure-service-level-objective-slo-access.md +++ b/solutions/observability/incident-management/configure-service-level-objective-slo-access.md @@ -10,7 +10,7 @@ mapped_pages: ::::{important} -To create and manage SLOs, you need an [appropriate license](https://www.elastic.co/subscriptions) and an {{es}} cluster with both `transform` and `ingest` [node roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) present. +To create and manage SLOs, you need an [appropriate license](https://www.elastic.co/subscriptions) and an {{es}} cluster with both `transform` and `ingest` [node roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) present. :::: diff --git a/solutions/observability/incident-management/create-an-elasticsearch-query-rule.md b/solutions/observability/incident-management/create-an-elasticsearch-query-rule.md index 33040a631..acee0e709 100644 --- a/solutions/observability/incident-management/create-an-elasticsearch-query-rule.md +++ b/solutions/observability/incident-management/create-an-elasticsearch-query-rule.md @@ -64,7 +64,7 @@ When you create an {{es}} query rule, your choice of query type affects the info : Specify how to calculate the value that is compared to the threshold. The value is calculated by aggregating a numeric field within the time window. The aggregation options are: `count`, `average`, `sum`, `min`, and `max`. When using `count` the document count is used and an aggregation field is not necessary. Over or Grouped Over - : Specify whether the aggregation is applied over all documents or split into groups using up to four grouping fields. If you choose to use grouping, it’s a [terms](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) or [multi terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-multi-terms-aggregation.md); an alert will be created for each unique set of values when it meets the condition. To limit the number of alerts on high cardinality fields, you must specify the number of groups to check against the threshold. Only the top groups are checked. + : Specify whether the aggregation is applied over all documents or split into groups using up to four grouping fields. If you choose to use grouping, it’s a [terms](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) or [multi terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-multi-terms-aggregation.md); an alert will be created for each unique set of values when it meets the condition. To limit the number of alerts on high cardinality fields, you must specify the number of groups to check against the threshold. Only the top groups are checked. Threshold : Defines a threshold value and a comparison operator (`is above`, `is above or equals`, `is below`, `is below or equals`, or `is between`). The value calculated by the aggregation is compared to this threshold. 
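To get a feel for what a grouped-over rule evaluates on each check, you can run the equivalent aggregation yourself in **Developer Tools**. This is a minimal sketch rather than the rule definition itself; the `logs-*` index, the `host.name` grouping field, and the `system.cpu.total.pct` metric field are illustrative assumptions:

```console
GET logs-*/_search
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-5m" } }
  },
  "aggs": {
    "groups": {
      "terms": { "field": "host.name", "size": 4 },
      "aggs": {
        "metric": { "avg": { "field": "system.cpu.total.pct" } }
      }
    }
  }
}
```

Each bucket returned under `groups` corresponds to one potential alert: the rule compares each bucket’s `metric` value against the threshold, and only the top `size` groups are checked.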
@@ -192,7 +192,7 @@ The following variables are specific to this rule type. You can also specify [va {{/context.hits}} ``` - The documents returned by `context.hits` include the [`_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) field. If the {{es}} query search API’s [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param) parameter is used, documents will also return the `fields` field, which can be used to access any runtime fields defined by the [`runtime_mappings`](../../../manage-data/data-store/mapping/define-runtime-fields-in-search-request.md) parameter. For example: + The documents returned by `context.hits` include the [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) field. If the {{es}} query search API’s [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param) parameter is used, documents will also return the `fields` field, which can be used to access any runtime fields defined by the [`runtime_mappings`](../../../manage-data/data-store/mapping/define-runtime-fields-in-search-request.md) parameter. For example: ```txt {{#context.hits}} @@ -204,7 +204,7 @@ The following variables are specific to this rule type. You can also specify [va 1. The `fields` parameter here is used to access the `day_of_week` runtime field. - As the [`fields`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-response) response always returns an array of values for each field, the [Mustache](https://mustache.github.io/) template array syntax is used to iterate over these values in your actions. For example: + As the [`fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-response) response always returns an array of values for each field, the [Mustache](https://mustache.github.io/) template array syntax is used to iterate over these values in your actions. For example: ```txt {{#context.hits}} diff --git a/solutions/observability/infra-and-hosts/manage-data-storage.md b/solutions/observability/infra-and-hosts/manage-data-storage.md index 36bb390a0..04051a899 100644 --- a/solutions/observability/infra-and-hosts/manage-data-storage.md +++ b/solutions/observability/infra-and-hosts/manage-data-storage.md @@ -10,8 +10,8 @@ Universal Profiling provides the following ways to manage how your data is store * [Index lifecycle management](universal-profiling-index-life-cycle-management.md) automatically manages your indices according to age or size metric thresholds. Universal Profiling ships with a default index lifecycle policy, but you can create a custom policy to meet your requirements. * [Probabilistic profiling](configure-probabilistic-profiling.md) mode uses representative samples of profiling data to reduce storage needs even further. -::::{important} -Additional storage efficiencies provided by [Synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) are available to users with an [appropriate license](https://www.elastic.co/subscriptions). 
+::::{important}
+Additional storage efficiencies provided by [Synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) are available to users with an [appropriate license](https://www.elastic.co/subscriptions).
::::


diff --git a/solutions/observability/infra-and-hosts/universal-profiling-index-life-cycle-management.md b/solutions/observability/infra-and-hosts/universal-profiling-index-life-cycle-management.md
index 28577d8fe..304fb1ce0 100644
--- a/solutions/observability/infra-and-hosts/universal-profiling-index-life-cycle-management.md
+++ b/solutions/observability/infra-and-hosts/universal-profiling-index-life-cycle-management.md
@@ -26,7 +26,7 @@ The following table lists the default thresholds for rollover and delete:

 | after 30 days or 50 GB | after 30 days | after 60 days |

::::{note}
-The [rollover condition blocks phase transitions](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md#_rollover_condition_blocks_phase_transition) which means that indices are kept 30 days **after** rollover on the hot tier.
+The [rollover condition blocks phase transitions](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md#_rollover_condition_blocks_phase_transition), which means that indices are kept 30 days **after** rollover on the hot tier.
::::


diff --git a/solutions/observability/logs/add-service-name-to-logs.md b/solutions/observability/logs/add-service-name-to-logs.md
index 3e78fdf6b..a079d8700 100644
--- a/solutions/observability/logs/add-service-name-to-logs.md
+++ b/solutions/observability/logs/add-service-name-to-logs.md
@@ -40,7 +40,7 @@ For more on defining processors, refer to [define processors](asciidocalypse://d


## Map an existing field to the service name field [observability-add-logs-service-name-map-an-existing-field-to-the-service-name-field]

-For logs that with an existing field being used to represent the service name, map that field to the `service.name` field using the [alias field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-alias.md). Follow these steps to update your mapping:
+For logs with an existing field that represents the service name, map that field to the `service.name` field using the [alias field type](elasticsearch://reference/elasticsearch/mapping-reference/field-alias.md). Follow these steps to update your mapping:

 1. find **Stack Management** in the main menu or use the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
 2. Select **Index Templates**.
diff --git a/solutions/observability/logs/filter-aggregate-logs.md b/solutions/observability/logs/filter-aggregate-logs.md
index 86b3eb4fd..d490bd819 100644
--- a/solutions/observability/logs/filter-aggregate-logs.md
+++ b/solutions/observability/logs/filter-aggregate-logs.md
@@ -129,7 +129,7 @@ For more on using Logs Explorer, refer to the [Discover](../../../explore-analyz

 [Query DSL](../../../explore-analyze/query-filter/languages/querydsl.md) is a JSON-based language that sends requests and retrieves data from indices and data streams. You can filter your log data using Query DSL from **Developer Tools**.

-For example, you might want to troubleshoot an issue that happened on a specific date or at a specific time.
To do this, use a boolean query with a [range query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) to filter for the specific timestamp range and a [term query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md) to filter for `WARN` and `ERROR` log levels.
+For example, you might want to troubleshoot an issue that happened on a specific date or at a specific time. To do this, use a boolean query with a [range query](elasticsearch://reference/query-languages/query-dsl-range-query.md) to filter for the specific timestamp range and a [term query](elasticsearch://reference/query-languages/query-dsl-term-query.md) to filter for `WARN` and `ERROR` log levels.

First, from **Developer Tools**, add some logs with varying timestamps and log levels to your data stream with the following command:

@@ -212,7 +212,7 @@ The filtered results should show `WARN` and `ERROR` logs that occurred within th


## Aggregate logs [logs-aggregate]

-Use aggregation to analyze and summarize your log data to find patterns and gain insight. [Bucket aggregations](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/bucket.md) organize log data into meaningful groups making it easier to identify patterns, trends, and anomalies within your logs.
+Use aggregation to analyze and summarize your log data to find patterns and gain insight. [Bucket aggregations](elasticsearch://reference/data-analysis/aggregations/bucket.md) organize log data into meaningful groups, making it easier to identify patterns, trends, and anomalies within your logs.

For example, you might want to understand error distribution by analyzing the count of logs per log level.

diff --git a/solutions/observability/logs/parse-route-logs.md b/solutions/observability/logs/parse-route-logs.md
index 9aa248b8e..75b9f77ed 100644
--- a/solutions/observability/logs/parse-route-logs.md
+++ b/solutions/observability/logs/parse-route-logs.md
@@ -130,9 +130,9 @@ When looking into issues, you want to filter for logs by when the issue occurred


#### Use an ingest pipeline to extract the `@timestamp` field [observability-parse-log-data-use-an-ingest-pipeline-to-extract-the-timestamp-field]

-Ingest pipelines consist of a series of processors that perform common transformations on incoming documents before they are indexed. To extract the `@timestamp` field from the example log, use an ingest pipeline with a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md). The dissect processor extracts structured fields from unstructured log messages based on a pattern you set.
+Ingest pipelines consist of a series of processors that perform common transformations on incoming documents before they are indexed. To extract the `@timestamp` field from the example log, use an ingest pipeline with a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md). The dissect processor extracts structured fields from unstructured log messages based on a pattern you set.

-Elastic can parse string timestamps that are in `yyyy-MM-dd'T'HH:mm:ss.SSSZ` and `yyyy-MM-dd` formats into date fields. Since the log example’s timestamp is in one of these formats, you don’t need additional processors.
More complex or nonstandard timestamps require a [date processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/date-processor.md) to parse the timestamp into a date field. +Elastic can parse string timestamps that are in `yyyy-MM-dd'T'HH:mm:ss.SSSZ` and `yyyy-MM-dd` formats into date fields. Since the log example’s timestamp is in one of these formats, you don’t need additional processors. More complex or nonstandard timestamps require a [date processor](elasticsearch://reference/ingestion-tools/enrich-processor/date-processor.md) to parse the timestamp into a date field. Use the following command to extract the timestamp from the `message` field into the `@timestamp` field: @@ -245,7 +245,7 @@ The example index template above sets the following component templates: * The default lifecycle policy that rolls over when the primary shard reaches 50 GB or after 30 days. * The default pipeline uses the ingest timestamp if there is no specified `@timestamp` and places a hook for the `logs@custom` pipeline. If a `logs@custom` pipeline is installed, it’s applied to logs ingested into this data stream. - * Sets the [`ignore_malformed`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ignore-malformed.md) flag to `true`. When ingesting a large batch of log data, a single malformed field like an IP address can cause the entire batch to fail. When set to true, malformed fields with a mapping type that supports this flag are still processed. + * Sets the [`ignore_malformed`](elasticsearch://reference/elasticsearch/mapping-reference/ignore-malformed.md) flag to `true`. When ingesting a large batch of log data, a single malformed field like an IP address can cause the entire batch to fail. When set to true, malformed fields with a mapping type that supports this flag are still processed. * `logs@custom`: a predefined component template that is not installed by default. Use this name to install a custom component template to override or extend any of the default mappings or settings. * `ecs@mappings`: dynamic templates that automatically ensure your data stream mappings comply with the [Elastic Common Schema (ECS)](asciidocalypse://docs/ecs/docs/reference/index.md). @@ -304,8 +304,8 @@ You can now use the `@timestamp` field to sort your logs by the date and time th Check the following common issues and solutions with timestamps: * **Timestamp failure:** If your data has inconsistent date formats, set `ignore_failure` to `true` for your date processor. This processes logs with correctly formatted dates and ignores those with issues. -* **Incorrect timezone:** Set your timezone using the `timezone` option on the [date processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/date-processor.md). -* **Incorrect timestamp format:** Your timestamp can be a Java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. For more information on timestamp formats, refer to the [mapping date format](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-date-format.md). +* **Incorrect timezone:** Set your timezone using the `timezone` option on the [date processor](elasticsearch://reference/ingestion-tools/enrich-processor/date-processor.md). +* **Incorrect timestamp format:** Your timestamp can be a Java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. 
For more information on timestamp formats, refer to the [mapping date format](elasticsearch://reference/elasticsearch/mapping-reference/mapping-date-format.md). ### Extract the `log.level` field [observability-parse-log-data-extract-the-loglevel-field] @@ -476,7 +476,7 @@ The results should show only the high-severity logs: Extracting the `host.ip` field lets you filter logs by host IP addresses allowing you to focus on specific hosts that you’re having issues with or find disparities between hosts. -The `host.ip` field is part of the [Elastic Common Schema (ECS)](asciidocalypse://docs/ecs/docs/reference/index.md). Through the ECS, the `host.ip` field is mapped as an [`ip` field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ip.md). `ip` field types allow range queries so you can find logs with IP addresses in a specific range. You can also query `ip` field types using Classless Inter-Domain Routing (CIDR) notation to find logs from a particular network or subnet. +The `host.ip` field is part of the [Elastic Common Schema (ECS)](asciidocalypse://docs/ecs/docs/reference/index.md). Through the ECS, the `host.ip` field is mapped as an [`ip` field type](elasticsearch://reference/elasticsearch/mapping-reference/ip.md). `ip` field types allow range queries so you can find logs with IP addresses in a specific range. You can also query `ip` field types using Classless Inter-Domain Routing (CIDR) notation to find logs from a particular network or subnet. This section shows you how to extract the `host.ip` field from the following example logs and query based on the extracted fields: @@ -676,7 +676,7 @@ Because all of the example logs are in this range, you’ll get the following re ##### Range queries [observability-parse-log-data-range-queries] -Use [range queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) to query logs in a specific range. +Use [range queries](elasticsearch://reference/query-languages/query-dsl-range-query.md) to query logs in a specific range. The following command searches for IP addresses greater than or equal to `192.168.1.100` and less than or equal to `192.168.1.102`. @@ -744,7 +744,7 @@ You’ll get the following results only showing logs in the range you’ve set: ## Reroute log data to specific data streams [observability-parse-log-data-reroute-log-data-to-specific-data-streams] -By default, an ingest pipeline sends your log data to a single data stream. To simplify log data management, use a [reroute processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/reroute-processor.md) to route data from the generic data stream to a target data stream. For example, you might want to send high-severity logs to a specific data stream to help with categorization. +By default, an ingest pipeline sends your log data to a single data stream. To simplify log data management, use a [reroute processor](elasticsearch://reference/ingestion-tools/enrich-processor/reroute-processor.md) to route data from the generic data stream to a target data stream. For example, you might want to send high-severity logs to a specific data stream to help with categorization. 
This section shows you how to use a reroute processor to send the high-severity logs (`WARN` or `ERROR`) from the following example logs to a specific data stream and keep the regular logs (`DEBUG` and `INFO`) in the default data stream:

diff --git a/solutions/observability/logs/plaintext-application-logs.md b/solutions/observability/logs/plaintext-application-logs.md
index 112706800..d8c78ad9b 100644
--- a/solutions/observability/logs/plaintext-application-logs.md
+++ b/solutions/observability/logs/plaintext-application-logs.md
@@ -231,7 +231,7 @@ By default, Windows log files are stored in `C:\ProgramData\filebeat\Logs`.

 Use an ingest pipeline to parse the contents of your logs into structured, [Elastic Common Schema (ECS)](asciidocalypse://docs/ecs/docs/reference/index.md)-compatible fields.

-Create an ingest pipeline that defines a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured ECS fields from your log messages. In your project, navigate to **Developer Tools** and using a command similar to the following example:
+Create an ingest pipeline that defines a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured ECS fields from your log messages. In your project, navigate to **Developer Tools** and use a command similar to the following example:

```console
PUT _ingest/pipeline/filebeat* <1>
@@ -249,7 +249,7 @@ PUT _ingest/pipeline/filebeat* <1>
 ```

1. `_ingest/pipeline/filebeat*`: The name of the pipeline. Update the pipeline name to match the name of your data stream. For more information, refer to [Data stream naming scheme](asciidocalypse://docs/docs-content/docs/reference/ingestion-tools/fleet/data-streams.md#data-streams-naming-scheme).
-2. `processors.dissect`: Adds a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message.
+2. `processors.dissect`: Adds a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message.
3. `field`: The field you’re extracting data from, `message` in this case.
4. `pattern`: The pattern of the elements in your log data. The pattern varies depending on your log format. `%{@timestamp}` is required. `%{log.level}`, `%{host.ip}`, and `%{{message}}` are common [ECS](asciidocalypse://docs/ecs/docs/reference/index.md) fields. This pattern would match a log file in this format: `2023-11-07T09:39:01.012Z ERROR 192.168.1.110 Server hardware failure detected.`

@@ -297,7 +297,7 @@ To aggregate or search for information in plaintext logs, use an ingest pipeline

2. Select the integration policy you created in the previous section.
3. Click **Change defaults → Advanced options**.
4. Under **Ingest pipelines**, click **Add custom pipeline**.
-5. Create an ingest pipeline with a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log messages.
+5. Create an ingest pipeline with a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log messages.
Click **Import processors** and add a similar JSON to the following example: @@ -315,7 +315,7 @@ To aggregate or search for information in plaintext logs, use an ingest pipeline } ``` - 1. `processors.dissect`: Adds a [dissect processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message. + 1. `processors.dissect`: Adds a [dissect processor](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) to extract structured fields from your log message. 2. `field`: The field you’re extracting data from, `message` in this case. 3. `pattern`: The pattern of the elements in your log data. The pattern varies depending on your log format. `%{@timestamp}`, `%{log.level}`, `%{host.ip}`, and `%{{message}}` are common [ECS](asciidocalypse://docs/ecs/docs/reference/index.md) fields. This pattern would match a log file in this format: `2023-11-07T09:39:01.012Z ERROR 192.168.1.110 Server hardware failure detected.` diff --git a/solutions/observability/observability-ai-assistant.md b/solutions/observability/observability-ai-assistant.md index 38862a062..c6cd5e1a3 100644 --- a/solutions/observability/observability-ai-assistant.md +++ b/solutions/observability/observability-ai-assistant.md @@ -44,7 +44,7 @@ Also, the data you provide to the Observability AI assistant is *not* anonymized The AI assistant requires the following: * {{stack}} version 8.9 and later. -* A self-deployed connector service if [search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md) are used to populate external data into the knowledge base. +* A self-deployed connector service if [search connectors](elasticsearch://reference/ingestion-tools/search-connectors/self-managed-connectors.md) are used to populate external data into the knowledge base. * An account with a third-party generative AI provider that preferably supports function calling. If your AI provider does not support function calling, you can configure AI Assistant settings under **Stack Management** to simulate function calling, but this might affect performance. Refer to the [connector documentation](../../deploy-manage/manage-connectors.md) for your provider to learn about supported and default models. @@ -147,16 +147,16 @@ To add external data to the knowledge base in {{kib}}: ### Use search connectors [obs-ai-search-connectors] ::::{tip} -The [search connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) described in this section differ from the [Stack management → Connectors](../../deploy-manage/manage-connectors.md) configured during the [AI Assistant setup](#obs-ai-set-up). Search connectors are only needed when importing external data into the Knowledge base of the AI Assistant, while the stack connector to the LLM is required for the AI Assistant to work. +The [search connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) described in this section differ from the [Stack management → Connectors](../../deploy-manage/manage-connectors.md) configured during the [AI Assistant setup](#obs-ai-set-up). Search connectors are only needed when importing external data into the Knowledge base of the AI Assistant, while the stack connector to the LLM is required for the AI Assistant to work. 
::::


-[Connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) allow you to index content from external sources thereby making it available for the AI Assistant. This can greatly improve the relevance of the AI Assistant’s responses. Data can be integrated from sources such as GitHub, Confluence, Google Drive, Jira, AWS S3, Microsoft Teams, Slack, and more.
+[Connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) allow you to index content from external sources, thereby making it available for the AI Assistant. This can greatly improve the relevance of the AI Assistant’s responses. Data can be integrated from sources such as GitHub, Confluence, Google Drive, Jira, AWS S3, Microsoft Teams, Slack, and more.

UI affordances for creating and managing search connectors are available in the Search Solution in {{kib}}. You can also use the {{es}} [Connector APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-connector) to create and manage search connectors.

-The infrastructure for deploying connectors must be [self-managed](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/self-managed-connectors.md).
+The infrastructure for deploying connectors must be [self-managed](elasticsearch://reference/ingestion-tools/search-connectors/self-managed-connectors.md).

By default, the AI Assistant queries all search connector indices. To override this behavior and customize which indices are queried, adjust the **Search connector index pattern** setting on the [AI Assistant Settings](#obs-ai-settings) page. This allows precise control over which data sources are included in AI Assistant knowledge base.

@@ -171,9 +171,9 @@ To create a connector in the {{kib}} UI and make its content available to the AI

2. Follow the instructions to create a new connector.

- For example, if you create a [GitHub connector](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/es-connectors-github.md) you have to set a `name`, attach it to a new or existing `index`, add your `personal access token` and include the `list of repositories` to synchronize.
+ For example, if you create a [GitHub connector](elasticsearch://reference/ingestion-tools/search-connectors/es-connectors-github.md), you have to set a `name`, attach it to a new or existing `index`, add your `personal access token`, and include the `list of repositories` to synchronize.

- Learn more about configuring and [using connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/connectors-ui-in-kibana.md) in the Elasticsearch documentation.
+ Learn more about configuring and [using connectors](elasticsearch://reference/ingestion-tools/search-connectors/connectors-ui-in-kibana.md) in the Elasticsearch documentation.

After creating your connector, create the embeddings needed by the AI Assistant. You can do this using either:

@@ -199,7 +199,7 @@ After creating the pipeline, complete the following steps:

 Once the pipeline is set up, perform a **Full Content Sync** of the connector. The inference pipeline will process the data as follows:

- * As data comes in, ELSER is applied to the data, and embeddings (weights and tokens into a [sparse vector field](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md)) are added to capture semantic meaning and context of the data.
+ * As data comes in, ELSER is applied to the data, and embeddings (weights and tokens into a [sparse vector field](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md)) are added to capture semantic meaning and context of the data. * When you look at the ingested documents, you can see the embeddings are added to the `predicted_value` field in the documents. 2. Check if AI Assistant can use the index (optional). @@ -210,7 +210,7 @@ After creating the pipeline, complete the following steps: #### Use a `semantic_text` field type to create AI Assistant embeddings [obs-ai-search-connectors-semantic-text] -To create the embeddings needed by the AI Assistant using a [`semantic_text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md) field type: +To create the embeddings needed by the AI Assistant using a [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field type: 1. Open the previously created connector, and select the **Mappings** tab. 2. Select **Add field**. diff --git a/solutions/search/cross-cluster-search.md b/solutions/search/cross-cluster-search.md index 320bbc2c6..db4a49a1b 100644 --- a/solutions/search/cross-cluster-search.md +++ b/solutions/search/cross-cluster-search.md @@ -21,7 +21,7 @@ The following APIs support {{ccs}}: * [Search template](search-templates.md) * [Multi search template](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-msearch-template) * [Field capabilities](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-field-caps) -* [Painless execute API](asciidocalypse://docs/elasticsearch/docs/reference/scripting-languages/painless/painless-api-examples.md) +* [Painless execute API](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md) * [Resolve Index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-resolve-index) * [preview] [EQL search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-eql-search) * [preview] [SQL search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-sql-query) @@ -979,7 +979,7 @@ In Elasticsearch 8.15, the default value for `skip_unavailable` was changed from If `skip_unavailable` is `true`, a {{ccs}}: * Skips the remote cluster if its nodes are unavailable during the search. The response’s `_clusters.skipped` value contains a count of any skipped clusters and the `_clusters.details` section of the response will show a `skipped` status. -* Ignores errors returned by the remote cluster, such as errors related to unavailable shards or indices. This can include errors related to search parameters such as [`allow_no_indices`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) and [`ignore_unavailable`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index). +* Ignores errors returned by the remote cluster, such as errors related to unavailable shards or indices. This can include errors related to search parameters such as [`allow_no_indices`](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index) and [`ignore_unavailable`](elasticsearch://reference/elasticsearch/rest-apis/api-conventions.md#api-multi-index). 
* Ignores the [`allow_partial_search_results`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-allow_partial_search_results) parameter and the related `search.default_allow_partial_results` cluster setting when searching the remote cluster. This means searches on the remote cluster may return partial results. You can modify the `skip_unavailable` setting by editing the `cluster.remote.` settings in the elasticsearch.yml config file. For example: @@ -1007,7 +1007,7 @@ If at least one shard from a cluster provides search results, those results will Because {{ccs}} involves sending requests to remote clusters, any network delays can impact search speed. To avoid slow searches, {{ccs}} offers two options for handling network delays: [Minimize network roundtrips](#ccs-min-roundtrips) -: By default, {{es}} reduces the number of network roundtrips between remote clusters. This reduces the impact of network delays on search speed. However, {{es}} can’t reduce network roundtrips for large search requests, such as those including a [scroll](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results) or [inner hits](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-inner-hits.md). +: By default, {{es}} reduces the number of network roundtrips between remote clusters. This reduces the impact of network delays on search speed. However, {{es}} can’t reduce network roundtrips for large search requests, such as those including a [scroll](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md#scroll-search-results) or [inner hits](elasticsearch://reference/elasticsearch/rest-apis/retrieve-inner-hits.md). See [Considerations for choosing whether to minimize roundtrips in a {{ccs}}](#ccs-min-roundtrips) to learn how this option works. diff --git a/solutions/search/elasticsearch-basics-quickstart.md b/solutions/search/elasticsearch-basics-quickstart.md index 41f44ad1c..a0defa8ec 100644 --- a/solutions/search/elasticsearch-basics-quickstart.md +++ b/solutions/search/elasticsearch-basics-quickstart.md @@ -309,7 +309,7 @@ GET /books/_mapping ### Define explicit mapping [getting-started-explicit-mapping] -Create an index named `my-explicit-mappings-books` with explicit mappings. Pass each field’s properties as a JSON object. This object should contain the [field data type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md) and any additional [mapping parameters](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-parameters.md). +Create an index named `my-explicit-mappings-books` with explicit mappings. Pass each field’s properties as a JSON object. This object should contain the [field data type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) and any additional [mapping parameters](elasticsearch://reference/elasticsearch/mapping-reference/mapping-parameters.md). ```console PUT /my-explicit-mappings-books @@ -416,7 +416,7 @@ GET books/_search ### `match` query [getting-started-match-query] -You can use the [`match` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md) to search for documents that contain a specific value in a specific field. This is the standard query for full-text searches. 
+You can use the [`match` query](elasticsearch://reference/query-languages/query-dsl-match-query.md) to search for documents that contain a specific value in a specific field. This is the standard query for full-text searches. Run the following command to search the `books` index for documents containing `brave` in the `name` field: diff --git a/solutions/search/full-text.md b/solutions/search/full-text.md index c0f9ec84d..e316a7de6 100644 --- a/solutions/search/full-text.md +++ b/solutions/search/full-text.md @@ -12,7 +12,7 @@ applies_to: Would you prefer to start with a hands-on example? Refer to our [full-text search tutorial](querydsl-full-text-filter-tutorial.md). :::: -Full-text search, also known as lexical search, is a technique for fast, efficient searching through text fields in documents. Documents and search queries are transformed to enable returning [relevant](https://www.elastic.co/what-is/search-relevance) results instead of simply exact term matches. Fields of type [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#text-field-type) are analyzed and indexed for full-text search. +Full-text search, also known as lexical search, is a technique for fast, efficient searching through text fields in documents. Documents and search queries are transformed to enable returning [relevant](https://www.elastic.co/what-is/search-relevance) results instead of simply exact term matches. Fields of type [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md#text-field-type) are analyzed and indexed for full-text search. Built on decades of information retrieval research, full-text search delivers reliable results that scale predictably as your data grows. Because it runs efficiently on CPUs, {{es}}'s full-text search requires minimal computational resources compared to GPU-intensive vector operations. 
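To make the analyze-then-match flow concrete, here is a minimal, self-contained sketch that relies only on the default `standard` analyzer; the `products` index and `description` field are illustrative assumptions:

```console
PUT products
{
  "mappings": {
    "properties": {
      "description": { "type": "text" }
    }
  }
}

POST products/_doc?refresh=true
{
  "description": "Trail running shoes with waterproof lining"
}

GET products/_search
{
  "query": {
    "match": { "description": "WATERPROOF shoes" }
  }
}
```

The query matches despite the different casing and word order because both the stored text and the query string are tokenized and lowercased by the same analyzer before being compared against the inverted index.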
@@ -34,18 +34,18 @@ Here are some resources to help you learn more about full-text search with {{es}

Learn about the core components of full-text search:

-* [Text fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md)
+* [Text fields](elasticsearch://reference/elasticsearch/mapping-reference/text.md)
* [Text analysis](full-text/text-analysis-during-search.md)
-  * [Tokenizers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/tokenizer-reference.md)
-  * [Analyzers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md)
+  * [Tokenizers](elasticsearch://reference/data-analysis/text-analysis/tokenizer-reference.md)
+  * [Analyzers](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md)

**{{es}} query languages**

Learn how to build full-text search queries using {{es}}'s query languages:

-* [Full-text queries using Query DSL](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/full-text-queries.md)
-* [Full-text search functions in {{esql}}](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-search-functions)
+* [Full-text queries using Query DSL](elasticsearch://reference/query-languages/full-text-queries.md)
+* [Full-text search functions in {{esql}}](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-search-functions)

**Advanced topics**

diff --git a/solutions/search/full-text/how-full-text-works.md b/solutions/search/full-text/how-full-text-works.md
index cb9a77cde..dc7b721bf 100644
--- a/solutions/search/full-text/how-full-text-works.md
+++ b/solutions/search/full-text/how-full-text-works.md
@@ -14,7 +14,7 @@ The following diagram illustrates the components of full-text search.

At a high level, full-text search involves the following:

-* [**Text analysis**](../../../manage-data/data-store/text-analysis.md): Analysis consists of a pipeline of sequential transformations. Text is transformed into a format optimized for searching using techniques such as stemming, lowercasing, and stop word elimination. {{es}} contains a number of built-in [analyzers](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md) and tokenizers, including options to analyze specific language text. You can also create custom analyzers.
+* [**Text analysis**](../../../manage-data/data-store/text-analysis.md): Analysis consists of a pipeline of sequential transformations. Text is transformed into a format optimized for searching using techniques such as stemming, lowercasing, and stop word elimination. {{es}} contains a number of built-in [analyzers](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md) and tokenizers, including options to analyze specific language text. You can also create custom analyzers.

::::{tip}
Refer to [Test an analyzer](../../../manage-data/data-store/text-analysis/test-an-analyzer.md) to learn how to test an analyzer and inspect the tokens and metadata it generates.
::::

@@ -26,10 +26,10 @@ Refer to [Test an analyzer](../../../manage-data/data-store/text-analysis/test-a

* **Relevance scoring**: Results are ranked by how relevant they are to the given query. The relevance score of each document is represented by a positive floating-point number called the `_score`. The higher the `_score`, the more relevant the document.

-  The default [similarity algorithm](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/similarity.md) {{es}} uses for calculating relevance scores is [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25), a variation of the [TF-IDF algorithm](https://en.wikipedia.org/wiki/Tf–idf). BM25 calculates relevance scores based on term frequency, document frequency, and document length. Refer to this [technical blog post](https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables) for a deep dive into BM25.
+  The default [similarity algorithm](elasticsearch://reference/elasticsearch/index-settings/similarity.md) {{es}} uses for calculating relevance scores is [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25), a variation of the [TF-IDF algorithm](https://en.wikipedia.org/wiki/Tf–idf). BM25 calculates relevance scores based on term frequency, document frequency, and document length. Refer to this [technical blog post](https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables) for a deep dive into BM25.

* **Full-text search query**: Query text is analyzed [the same way as the indexed text](../../../manage-data/data-store/text-analysis/index-search-analysis.md), and the resulting tokens are used to search the inverted index.

-  Query DSL supports a number of [full-text queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/full-text-queries.md).
+  Query DSL supports a number of [full-text queries](elasticsearch://reference/query-languages/full-text-queries.md).

-  As of 8.17, {{esql}} also supports [full-text search](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-search-functions) functions.
+  As of 8.17, {{esql}} also supports [full-text search](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-search-functions) functions.

diff --git a/solutions/search/full-text/search-relevance/mixing-exact-search-with-stemming.md b/solutions/search/full-text/search-relevance/mixing-exact-search-with-stemming.md
index 0625e740d..716c748fc 100644
--- a/solutions/search/full-text/search-relevance/mixing-exact-search-with-stemming.md
+++ b/solutions/search/full-text/search-relevance/mixing-exact-search-with-stemming.md
@@ -8,7 +8,7 @@ applies_to:

# Mixing exact search with stemming [mixing-exact-search-with-stemming]

-When building a search application, stemming is often a must as it is desirable for a query on `skiing` to match documents that contain `ski` or `skis`. But what if a user wants to search for `skiing` specifically? The typical way to do this would be to use a [multi-field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md) in order to have the same content indexed in two different ways:
+When building a search application, stemming is often a must as it is desirable for a query on `skiing` to match documents that contain `ski` or `skis`. But what if a user wants to search for `skiing` specifically? The typical way to do this would be to use a [multi-field](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) in order to have the same content indexed in two different ways:

```console
PUT index
@@ -199,7 +199,7 @@ GET index/_search

In the above case, since `ski` was between quotes, it was searched on the `body.exact` field due to the `quote_field_suffix` parameter, so only document `1` matched. This allows users to mix exact search with stemmed search as they like.

-::::{note} 
+::::{note}
If the choice of field passed in `quote_field_suffix` does not exist, the search will fall back to using the default field for the query string.
::::

diff --git a/solutions/search/full-text/search-relevance/static-scoring-signals.md b/solutions/search/full-text/search-relevance/static-scoring-signals.md
index 04d55d351..9d4739b81 100644
--- a/solutions/search/full-text/search-relevance/static-scoring-signals.md
+++ b/solutions/search/full-text/search-relevance/static-scoring-signals.md
@@ -12,12 +12,12 @@ Many domains have static signals that are known to be correlated with relevance.

There are two main queries that allow combining static score contributions with textual relevance, e.g. as computed with BM25:

-* [`script_score` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md)
-* [`rank_feature` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-rank-feature-query.md)
+* [`script_score` query](elasticsearch://reference/query-languages/query-dsl-script-score-query.md)
+* [`rank_feature` query](elasticsearch://reference/query-languages/query-dsl-rank-feature-query.md)

For instance, imagine that you have a `pagerank` field that you wish to combine with the BM25 score so that the final score is equal to `score = bm25_score + pagerank / (10 + pagerank)`.

-With the [`script_score` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) the query would look like this:
+With the [`script_score` query](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) the query would look like this:

```console
GET index/_search
@@ -35,10 +35,10 @@ GET index/_search
}
```

-1. `pagerank` must be mapped as a [Numeric](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/number.md)
+1. `pagerank` must be mapped as a [Numeric](elasticsearch://reference/elasticsearch/mapping-reference/number.md)

-while with the [`rank_feature` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-rank-feature-query.md) it would look like below:
+while with the [`rank_feature` query](elasticsearch://reference/query-languages/query-dsl-rank-feature-query.md) it would look like below:

```console
GET _search
@@ -61,8 +61,8 @@ GET _search
}
```

-1. `pagerank` must be mapped as a [`rank_feature`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/rank-feature.md) field
+1. `pagerank` must be mapped as a [`rank_feature`](elasticsearch://reference/elasticsearch/mapping-reference/rank-feature.md) field

-While both options would return similar scores, there are trade-offs: [script_score](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) provides a lot of flexibility, enabling you to combine the text relevance score with static signals as you prefer. On the other hand, the [`rank_feature` query](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/rank-feature.md) only exposes a couple ways to incorporate static signals into the score. However, it relies on the [`rank_feature`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/rank-feature.md) and [`rank_features`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/rank-features.md) fields, which index values in a special way that allows the [`rank_feature` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-rank-feature-query.md) to skip over non-competitive documents and get the top matches of a query faster.
+While both options would return similar scores, there are trade-offs: [script_score](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) provides a lot of flexibility, enabling you to combine the text relevance score with static signals as you prefer. On the other hand, the [`rank_feature` query](elasticsearch://reference/elasticsearch/mapping-reference/rank-feature.md) only exposes a couple of ways to incorporate static signals into the score. However, it relies on the [`rank_feature`](elasticsearch://reference/elasticsearch/mapping-reference/rank-feature.md) and [`rank_features`](elasticsearch://reference/elasticsearch/mapping-reference/rank-features.md) fields, which index values in a special way that allows the [`rank_feature` query](elasticsearch://reference/query-languages/query-dsl-rank-feature-query.md) to skip over non-competitive documents and get the top matches of a query faster.

diff --git a/solutions/search/full-text/search-with-synonyms.md b/solutions/search/full-text/search-with-synonyms.md
index 0aea17f3f..cd58dc543 100644
--- a/solutions/search/full-text/search-with-synonyms.md
+++ b/solutions/search/full-text/search-with-synonyms.md
@@ -136,10 +136,10 @@ An index with invalid synonym rules cannot be reopened, making it inoperable whe

::::

-{{es}} uses synonyms as part of the [analysis process](../../../manage-data/data-store/text-analysis.md). You can use two types of [token filter](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md) to include synonyms:
+{{es}} uses synonyms as part of the [analysis process](../../../manage-data/data-store/text-analysis.md). You can use two types of [token filter](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md) to include synonyms:

-* [Synonym graph](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-graph-tokenfilter.md): It is recommended to use it, as it can correctly handle multi-word synonyms ("hurriedly", "in a hurry").
-* [Synonym](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md): Not recommended if you need to use multi-word synonyms.
+* [Synonym graph](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-graph-tokenfilter.md): Recommended, as it can correctly handle multi-word synonyms ("hurriedly", "in a hurry").
+* [Synonym](elasticsearch://reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md): Not recommended if you need to use multi-word synonyms.

Check each synonym token filter documentation for configuration details and instructions on adding it to an analyzer.
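To make the synonym guidance above concrete, here is a minimal, hypothetical sketch of an index that applies a `synonym_graph` token filter at search time. The index name, analyzer name, and synonym list are illustrative placeholders, not part of the original page:

```console
PUT /my-synonym-index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_graph": {
          "type": "synonym_graph",
          "synonyms": [ "hurriedly, in a hurry" ]
        }
      },
      "analyzer": {
        "my_search_analyzer": {
          "tokenizer": "standard",
          "filter": [ "lowercase", "my_synonym_graph" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "body": {
        "type": "text",
        "analyzer": "standard",
        "search_analyzer": "my_search_analyzer"
      }
    }
  }
}
```

Applying the filter only through `search_analyzer`, rather than at index time, is the usual pattern for multi-word synonyms, because the token graph that `synonym_graph` produces is supported at search time only.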
diff --git a/solutions/search/full-text/text-analysis-during-search.md b/solutions/search/full-text/text-analysis-during-search.md
index 3977a9282..b3d88aab8 100644
--- a/solutions/search/full-text/text-analysis-during-search.md
+++ b/solutions/search/full-text/text-analysis-during-search.md
@@ -11,9 +11,9 @@ applies_to:

*Text analysis* is the process of converting unstructured text, like the body of an email or a product description, into a structured format that’s [optimized for search](../full-text.md).

-## When to configure text analysis [when-to-configure-analysis] 
+## When to configure text analysis [when-to-configure-analysis]

-{{es}} performs text analysis when indexing or searching [`text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) fields.
+{{es}} performs text analysis when indexing or searching [`text`](elasticsearch://reference/elasticsearch/mapping-reference/text.md) fields.

If your index doesn’t contain `text` fields, no further setup is needed; you can skip the pages in this section.

@@ -25,16 +25,16 @@ However, if you use `text` fields or your text searches aren’t returning resul

* Perform lexicographic or linguistic research

-## Learn more [analysis-toc] 
+## Learn more [analysis-toc]

Learn more about text analysis in the **Manage Data** section of the documentation:

* [Overview](../../../manage-data/data-store/text-analysis.md)
* [Concepts](../../../manage-data/data-store/text-analysis/concepts.md)
* [*Configure text analysis*](../../../manage-data/data-store/text-analysis/configure-text-analysis.md)
-* [*Built-in analyzer reference*](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/analyzer-reference.md)
-* [*Tokenizer reference*](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/tokenizer-reference.md)
-* [*Token filter reference*](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/token-filter-reference.md)
-* [*Character filters reference*](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/character-filter-reference.md)
-* [*Normalizers*](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/text-analysis/normalizers.md)
+* [*Built-in analyzer reference*](elasticsearch://reference/data-analysis/text-analysis/analyzer-reference.md)
+* [*Tokenizer reference*](elasticsearch://reference/data-analysis/text-analysis/tokenizer-reference.md)
+* [*Token filter reference*](elasticsearch://reference/data-analysis/text-analysis/token-filter-reference.md)
+* [*Character filters reference*](elasticsearch://reference/data-analysis/text-analysis/character-filter-reference.md)
+* [*Normalizers*](elasticsearch://reference/data-analysis/text-analysis/normalizers.md)

diff --git a/solutions/search/hybrid-search.md b/solutions/search/hybrid-search.md
index a7924f2cf..471c9ef89 100644
--- a/solutions/search/hybrid-search.md
+++ b/solutions/search/hybrid-search.md
@@ -9,4 +9,4 @@ Hybrid search combines traditional [full-text search](full-text.md) with [AI-pow

The recommended way to use hybrid search in the Elastic Stack is following the `semantic_text` workflow. Check out the [hands-on tutorial](hybrid-semantic-text.md) for a step-by-step guide.

-We recommend implementing hybrid search with the [reciprocal rank fusion (RRF)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) algorithm. This approach merges rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant.
\ No newline at end of file
+We recommend implementing hybrid search with the [reciprocal rank fusion (RRF)](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) algorithm. This approach merges rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant.
\ No newline at end of file

diff --git a/solutions/search/hybrid-semantic-text.md b/solutions/search/hybrid-semantic-text.md
index 79fd2be5d..104ab593d 100644
--- a/solutions/search/hybrid-semantic-text.md
+++ b/solutions/search/hybrid-semantic-text.md
@@ -102,7 +102,7 @@ POST _tasks//_cancel

## Perform hybrid search [hybrid-search-perform-search]

-After reindexing the data into the `semantic-embeddings` index, you can perform hybrid search by using [reciprocal rank fusion (RRF)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md). RRF is a technique that merges the rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant.
+After reindexing the data into the `semantic-embeddings` index, you can perform hybrid search by using [reciprocal rank fusion (RRF)](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md). RRF is a technique that merges the rankings from both semantic and lexical queries, giving more weight to results that rank high in either search. This ensures that the final results are balanced and relevant.

```console
GET semantic-embeddings/_search

diff --git a/solutions/search/querydsl-full-text-filter-tutorial.md b/solutions/search/querydsl-full-text-filter-tutorial.md
index f2efcb1a6..c0625b81d 100644
--- a/solutions/search/querydsl-full-text-filter-tutorial.md
+++ b/solutions/search/querydsl-full-text-filter-tutorial.md
@@ -24,7 +24,7 @@ The goal is to create search queries that enable users to:

To achieve these goals we’ll use different Elasticsearch queries to perform full-text search, apply filters, and combine multiple search criteria.

-## Requirements [full-text-filter-tutorial-requirements] 
+## Requirements [full-text-filter-tutorial-requirements]

You’ll need a running {{es}} cluster, together with {{kib}} to use the Dev Tools API Console. Run the following command in your terminal to set up a [single-node local cluster in Docker](get-started.md):

```sh
curl -fsSL https://elastic.co/start-local | sh
```

-## Step 1: Create an index [full-text-filter-tutorial-create-index] 
+## Step 1: Create an index [full-text-filter-tutorial-create-index]

Create the `cooking_blog` index to get started:

```console
PUT /cooking_blog
@@ -101,18 +101,18 @@ PUT /cooking_blog/_mapping
```

1. The `standard` analyzer is used by default for `text` fields if an `analyzer` isn’t specified. It’s included here for demonstration purposes.
-2. [Multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/multi-fields.md) are used here to index `text` fields as both `text` and `keyword` [data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). This enables both full-text search and exact matching/filtering on the same field. Note that if you used [dynamic mapping](../../manage-data/data-store/mapping/dynamic-field-mapping.md), these multi-fields would be created automatically.
-3. The [`ignore_above` parameter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/ignore-above.md) prevents indexing values longer than 256 characters in the `keyword` field. Again this is the default value, but it’s included here for demonstration purposes. It helps to save disk space and avoid potential issues with Lucene’s term byte-length limit.
+2. [Multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/multi-fields.md) are used here to index `text` fields as both `text` and `keyword` [data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). This enables both full-text search and exact matching/filtering on the same field. Note that if you used [dynamic mapping](../../manage-data/data-store/mapping/dynamic-field-mapping.md), these multi-fields would be created automatically.
+3. The [`ignore_above` parameter](elasticsearch://reference/elasticsearch/mapping-reference/ignore-above.md) prevents indexing values longer than 256 characters in the `keyword` field. Again, this is the default value, but it’s included here for demonstration purposes. It helps to save disk space and avoid potential issues with Lucene’s term byte-length limit.

-::::{tip} 
+::::{tip}
Full-text search is powered by [text analysis](full-text/text-analysis-during-search.md). Text analysis normalizes and standardizes text data so it can be efficiently stored in an inverted index and searched in near real-time. Analysis happens at both [index and search time](../../manage-data/data-store/text-analysis/index-search-analysis.md). This tutorial won’t cover analysis in detail, but it’s important to understand how text is processed to create effective search queries.
::::

-## Step 2: Add sample blog posts to your index [full-text-filter-tutorial-index-data] 
+## Step 2: Add sample blog posts to your index [full-text-filter-tutorial-index-data]

Now you’ll need to index some example blog posts using the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk). Note that `text` fields are analyzed and multi-fields are generated at index time.

```console
POST /cooking_blog/_bulk?refresh=wait_for
@@ -131,14 +131,14 @@ POST /cooking_blog/_bulk?refresh=wait_for
```

-## Step 3: Perform basic full-text searches [full-text-filter-tutorial-match-query] 
+## Step 3: Perform basic full-text searches [full-text-filter-tutorial-match-query]

Full-text search involves executing text-based queries across one or more document fields. These queries calculate a relevance score for each matching document, based on how closely the document’s content aligns with the search terms. {{es}} offers various query types, each with its own method for matching text and [relevance scoring](../../explore-analyze/query-filter/languages/querydsl.md#relevance-scores).

-### `match` query [_match_query] 
+### `match` query [_match_query]

-The [`match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md) query is the standard query for full-text, or "lexical", search. The query text will be analyzed according to the analyzer configuration specified on each field (or at query time).
+The [`match`](elasticsearch://reference/query-languages/query-dsl-match-query.md) query is the standard query for full-text, or "lexical", search. The query text will be analyzed according to the analyzer configuration specified on each field (or at query time).

First, search the `description` field for "fluffy pancakes":

@@ -212,7 +212,7 @@ At search time, {{es}} defaults to the analyzer defined in the field mapping. In

-### Require all terms in a match query [_require_all_terms_in_a_match_query] 
+### Require all terms in a match query [_require_all_terms_in_a_match_query]

Specify the `and` operator to require both terms in the `description` field. This stricter search returns *zero hits* on our sample data, as no document contains both "fluffy" and "pancakes" in the description.

@@ -256,9 +256,9 @@ GET /cooking_blog/_search

-### Specify a minimum number of terms to match [_specify_a_minimum_number_of_terms_to_match] 
+### Specify a minimum number of terms to match [_specify_a_minimum_number_of_terms_to_match]

-Use the [`minimum_should_match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-minimum-should-match.md) parameter to specify the minimum number of terms a document should have to be included in the search results.
+Use the [`minimum_should_match`](elasticsearch://reference/query-languages/query-dsl-minimum-should-match.md) parameter to specify the minimum number of terms a document should have to be included in the search results.

Search the title field to match at least 2 of the 3 terms: "fluffy", "pancakes", or "breakfast". This is useful for improving relevance while allowing some flexibility.

@@ -277,9 +277,9 @@ GET /cooking_blog/_search
```

-## Step 4: Search across multiple fields at once [full-text-filter-tutorial-multi-match] 
+## Step 4: Search across multiple fields at once [full-text-filter-tutorial-multi-match]

-When users enter a search query, they often don’t know (or care) whether their search terms appear in a specific field. A [`multi_match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md) query allows searching across multiple fields simultaneously.
+When users enter a search query, they often don’t know (or care) whether their search terms appear in a specific field. A [`multi_match`](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md) query allows searching across multiple fields simultaneously.

Let’s start with a basic `multi_match` query:

@@ -320,7 +320,7 @@ GET /cooking_blog/_search

-Learn more about fields and per-field boosting in the [`multi_match` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md) reference.
+Learn more about fields and per-field boosting in the [`multi_match` query](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md) reference.

::::{dropdown} Example response
```console-result
@@ -374,18 +374,18 @@ This result demonstrates how the `multi_match` query with field boosts helps use

::::

-::::{tip} 
+::::{tip}
The `multi_match` query is often recommended over a single `match` query for most text search use cases, as it provides more flexibility and better matches user expectations.
::::

-## Step 5: Filter and find exact matches [full-text-filter-tutorial-filtering] 
+## Step 5: Filter and find exact matches [full-text-filter-tutorial-filtering]

[Filtering](../../explore-analyze/query-filter/languages/querydsl.md#filter-context) allows you to narrow down your search results based on exact criteria. Unlike full-text searches, filters are binary (yes/no) and do not affect the relevance score. Filters execute faster than queries because excluded results don’t need to be scored.

-This [`bool`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md) query will return only blog posts in the "Breakfast" category.
+This [`bool`](elasticsearch://reference/query-languages/query-dsl-bool-query.md) query will return only blog posts in the "Breakfast" category.

```console
GET /cooking_blog/_search
@@ -400,10 +400,10 @@ GET /cooking_blog/_search
}
```

-1. Note the use of `category.keyword` here. This refers to the [`keyword`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md) multi-field of the `category` field, ensuring an exact, case-sensitive match.
+1. Note the use of `category.keyword` here. This refers to the [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) multi-field of the `category` field, ensuring an exact, case-sensitive match.

-::::{tip} 
+::::{tip}
The `.keyword` suffix accesses the unanalyzed version of a field, enabling exact, case-sensitive matching. This works in two scenarios:

1. **When using dynamic mapping for text fields**. Elasticsearch automatically creates a `.keyword` sub-field.
@@ -413,9 +413,9 @@ The `.keyword` suffix accesses the unanalyzed version of a field, enabling exact

-### Search for posts within a date range [full-text-filter-tutorial-range-query] 
+### Search for posts within a date range [full-text-filter-tutorial-range-query]

-Often users want to find content published within a specific time frame. A [`range`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-range-query.md) query finds documents that fall within numeric or date ranges.
+Often users want to find content published within a specific time frame. A [`range`](elasticsearch://reference/query-languages/query-dsl-range-query.md) query finds documents that fall within numeric or date ranges.

```console
GET /cooking_blog/_search
@@ -436,9 +436,9 @@ GET /cooking_blog/_search

-### Find exact matches [full-text-filter-tutorial-term-query] 
+### Find exact matches [full-text-filter-tutorial-term-query]

-Sometimes users want to search for exact terms to eliminate ambiguity in their search results. A [`term`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-term-query.md) query searches for an exact term in a field without analyzing it. Exact, case-sensitive matches on specific terms are often referred to as "keyword" searches.
+Sometimes users want to search for exact terms to eliminate ambiguity in their search results. A [`term`](elasticsearch://reference/query-languages/query-dsl-term-query.md) query searches for an exact term in a field without analyzing it. Exact, case-sensitive matches on specific terms are often referred to as "keyword" searches.

Here you’ll search for the author "Maria Rodriguez" in the `author.keyword` field.

@@ -456,16 +456,16 @@ GET /cooking_blog/_search

1. The `term` query has zero flexibility. For example, here the queries `maria` or `maria rodriguez` would have zero hits, due to case sensitivity.

-::::{tip} 
-Avoid using the `term` query for [`text` fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md) because they are transformed by the analysis process.
+::::{tip}
+Avoid using the `term` query for [`text` fields](elasticsearch://reference/elasticsearch/mapping-reference/text.md) because they are transformed by the analysis process.
::::

-## Step 6: Combine multiple search criteria [full-text-filter-tutorial-complex-bool] 
+## Step 6: Combine multiple search criteria [full-text-filter-tutorial-complex-bool]

-A [`bool`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md) query allows you to combine multiple query clauses to create sophisticated searches. In this tutorial scenario it’s useful for when users have complex requirements for finding recipes.
+A [`bool`](elasticsearch://reference/query-languages/query-dsl-bool-query.md) query allows you to combine multiple query clauses to create sophisticated searches. In this tutorial scenario, it’s useful when users have complex requirements for finding recipes.

Let’s create a query that addresses the following user needs:

@@ -571,7 +571,7 @@ GET /cooking_blog/_search
}
```

-1. The title contains "Spicy" and "Curry", matching our should condition. With the default [best_fields](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-multi-match-query.md#type-best-fields) behavior, this field contributes most to the relevance score.
+1. The title contains "Spicy" and "Curry", matching our `should` condition. With the default [best_fields](elasticsearch://reference/query-languages/query-dsl-multi-match-query.md#type-best-fields) behavior, this field contributes most to the relevance score.
2. While the description also contains matching terms, only the best matching field’s score is used by default.
3. The recipe was published within the last month, satisfying our recency preference.
4. The "Main Course" category satisfies another `should` condition.

@@ -583,7 +583,7 @@

-## Learn more [full-text-filter-tutorial-learn-more] 
+## Learn more [full-text-filter-tutorial-learn-more]

This tutorial introduced the basics of full-text search and filtering in {{es}}. Building a real-world search experience requires understanding many more advanced concepts and techniques. Here are some resources once you’re ready to dive deeper:

diff --git a/solutions/search/rag/playground-troubleshooting.md b/solutions/search/rag/playground-troubleshooting.md
index 421a027d6..2d4765cb4 100644
--- a/solutions/search/rag/playground-troubleshooting.md
+++ b/solutions/search/rag/playground-troubleshooting.md
@@ -8,13 +8,13 @@ applies_to:

# Troubleshooting [playground-troubleshooting]

-::::{warning} 
+::::{warning}
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
::::

Dense vectors are not searchable
-: Embeddings must be generated using the [inference processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) with an ML node.
+: Embeddings must be generated using the [inference processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) with an ML node.

Context length error
: You’ll need to adjust the size of the context you’re sending to the model. Refer to [Optimize model context](playground-context.md).
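To connect the first troubleshooting entry above to a request, here is a minimal, hypothetical sketch of an ingest pipeline that generates embeddings with the inference processor. The pipeline name, `model_id`, and field names are placeholders, and the referenced model must already be deployed on an ML node:

```console
PUT _ingest/pipeline/my-embedding-pipeline
{
  "description": "Generate vector embeddings at ingest time (illustrative sketch)",
  "processors": [
    {
      "inference": {
        "model_id": "my-text-embedding-model",
        "input_output": [
          {
            "input_field": "body",
            "output_field": "body_embedding"
          }
        ]
      }
    }
  ]
}
```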
diff --git a/solutions/search/rag/playground.md b/solutions/search/rag/playground.md index 6d6720f52..7de09dc4d 100644 --- a/solutions/search/rag/playground.md +++ b/solutions/search/rag/playground.md @@ -146,7 +146,7 @@ If you need to update a connector, or add a new one, click the 🔧 **Manage** b There are many options for ingesting data into {{es}}, including: * The [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) for web content (**NOTE**: Not yet available in *Serverless*) -* [Elastic connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) for data synced from third-party sources +* [Elastic connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md) for data synced from third-party sources * The {{es}} [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) for JSON documents ::::{dropdown} **Expand** for example diff --git a/solutions/search/ranking.md b/solutions/search/ranking.md index 615c13dcb..3fd64fb43 100644 --- a/solutions/search/ranking.md +++ b/solutions/search/ranking.md @@ -17,38 +17,38 @@ Later stages use more powerful models, often machine learning-based, to reorder {{es}} supports various ranking and re-ranking techniques to optimize search relevance and performance. -## Two-stage retrieval pipelines [re-ranking-two-stage-pipeline] +## Two-stage retrieval pipelines [re-ranking-two-stage-pipeline] -### Initial retrieval [re-ranking-first-stage-pipeline] +### Initial retrieval [re-ranking-first-stage-pipeline] -#### Full-text search: BM25 scoring [re-ranking-ranking-overview-bm25] +#### Full-text search: BM25 scoring [re-ranking-ranking-overview-bm25] {{es}} ranks documents based on term frequency and inverse document frequency, adjusted for document length. [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) is the default statistical scoring algorithm in {{es}}. -#### Vector search: similarity scoring [re-ranking-ranking-overview-vector] +#### Vector search: similarity scoring [re-ranking-ranking-overview-vector] Vector search involves transforming data into dense or sparse vector embeddings to capture semantic meanings, and computing similarity scores for query vectors. Store vectors using `semantic_text` fields for automatic inference and vectorization or `dense_vector` and `sparse_vector` fields when you need more control over the underlying embedding model. Query vector fields with `semantic`, `knn` or `sparse_vector` queries to compute similarity scores. Refer to [semantic search](semantic-search.md) for more information. -#### Hybrid techniques [re-ranking-ranking-overview-hybrid] +#### Hybrid techniques [re-ranking-ranking-overview-hybrid] -Hybrid search techniques combine results from full-text and vector search pipelines. {{es}} enables combining lexical matching (BM25) and vector search scores using the [Reciprocal Rank Fusion (RRF)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) algorithm. +Hybrid search techniques combine results from full-text and vector search pipelines. {{es}} enables combining lexical matching (BM25) and vector search scores using the [Reciprocal Rank Fusion (RRF)](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) algorithm. 
-### Re-ranking [re-ranking-overview-second-stage] +### Re-ranking [re-ranking-overview-second-stage] When using the following advanced re-ranking pipelines, first-stage retrieval mechanisms effectively generate a set of candidates. These candidates are funneled into the re-ranker to perform more computationally expensive re-ranking tasks. -#### Semantic re-ranking [re-ranking-overview-semantic] +#### Semantic re-ranking [re-ranking-overview-semantic] [*Semantic re-ranking*](ranking/semantic-reranking.md) uses machine learning models to reorder search results based on their semantic similarity to a query. Models can be hosted directly in your {{es}} cluster, or you can use [inference endpoints](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference) to call models provided by third-party services. Semantic re-ranking enables out-of-the-box semantic search capabilities on existing full-text search indices. -#### Learning to Rank (LTR) [re-ranking-overview-ltr] +#### Learning to Rank (LTR) [re-ranking-overview-ltr] [*Learning To Rank*](ranking/learning-to-rank-ltr.md) is for advanced users. Learning To Rank involves training a machine learning model to build a ranking function for your search experience that updates over time. LTR is best suited for when you have ample training data and need highly customized relevance tuning. diff --git a/solutions/search/ranking/learning-to-rank-model-training.md b/solutions/search/ranking/learning-to-rank-model-training.md index 26d2dadcd..7669ec9e2 100644 --- a/solutions/search/ranking/learning-to-rank-model-training.md +++ b/solutions/search/ranking/learning-to-rank-model-training.md @@ -81,7 +81,7 @@ feature_extractors=[ ::::{admonition} Term statistics as features :class: note -It is very common for an LTR model to leverage raw term statistics as features. To extract this information, you can use the [term statistics feature](../../../explore-analyze/scripting/modules-scripting-fields.md#scripting-term-statistics) provided as part of the [`script_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) query. +It is very common for an LTR model to leverage raw term statistics as features. To extract this information, you can use the [term statistics feature](../../../explore-analyze/scripting/modules-scripting-fields.md#scripting-term-statistics) provided as part of the [`script_score`](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) query. :::: diff --git a/solutions/search/ranking/learning-to-rank-search-usage.md b/solutions/search/ranking/learning-to-rank-search-usage.md index 77bead5bb..48509d792 100644 --- a/solutions/search/ranking/learning-to-rank-search-usage.md +++ b/solutions/search/ranking/learning-to-rank-search-usage.md @@ -12,15 +12,15 @@ applies_to: # Search using LTR [learning-to-rank-search-usage] -::::{note} +::::{note} This feature was introduced in version 8.12.0 and is only available to certain subscription levels. For more information, see {{subscriptions}}. 
:::: -## Learning To Rank as a rescorer [learning-to-rank-rescorer] +## Learning To Rank as a rescorer [learning-to-rank-rescorer] -Once your LTR model is trained and deployed in {{es}}, it can be used as a [rescorer](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/filter-search-results.md#rescore) in the [search API](../querying-for-search.md): +Once your LTR model is trained and deployed in {{es}}, it can be used as a [rescorer](elasticsearch://reference/elasticsearch/rest-apis/filter-search-results.md#rescore) in the [search API](../querying-for-search.md): ```console GET my-index/_search @@ -50,20 +50,20 @@ GET my-index/_search -### Known limitations [learning-to-rank-rescorer-limitations] +### Known limitations [learning-to-rank-rescorer-limitations] -#### Rescore window size [learning-to-rank-rescorer-limitations-window-size] +#### Rescore window size [learning-to-rank-rescorer-limitations-window-size] Scores returned by LTR models are usually not comparable with the scores issued by the first pass query and can be lower than the non-rescored score. This can cause the non-rescored result document to be ranked higher than the rescored document. To prevent this, the `window_size` parameter is mandatory for LTR rescorers and should be greater than or equal to `from + size`. -#### Pagination [learning-to-rank-rescorer-limitations-pagination] +#### Pagination [learning-to-rank-rescorer-limitations-pagination] When exposing pagination to users, `window_size` should remain constant as each page is progressed by passing different `from` values. Changing the `window_size` can alter the top hits causing results to confusingly shift as the user steps through pages. -#### Negative scores [learning-to-rank-rescorer-limitations-negative-scores] +#### Negative scores [learning-to-rank-rescorer-limitations-negative-scores] Depending on how your model is trained, it’s possible that the model will return negative scores for documents. While negative scores are not allowed from first-stage retrieval and ranking, it is possible to use them in the LTR rescorer. diff --git a/solutions/search/ranking/semantic-reranking.md b/solutions/search/ranking/semantic-reranking.md index 4e3516074..43cdaa355 100644 --- a/solutions/search/ranking/semantic-reranking.md +++ b/solutions/search/ranking/semantic-reranking.md @@ -39,7 +39,7 @@ Semantic re-ranking enables a variety of use cases: * **Semantic retrieval results re-ranking** * Improves results from semantic retrievers using ELSER sparse vector embeddings or dense vector embeddings by using more powerful models. - * Adds a refinement layer on top of hybrid retrieval with [reciprocal rank fusion (RRF)](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md). + * Adds a refinement layer on top of hybrid retrieval with [reciprocal rank fusion (RRF)](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md). * **General applications** diff --git a/solutions/search/retrievers-examples.md b/solutions/search/retrievers-examples.md index 6ba37f528..fb9480a44 100644 --- a/solutions/search/retrievers-examples.md +++ b/solutions/search/retrievers-examples.md @@ -401,7 +401,7 @@ Which would return the following results: ## Example: Grouping results by year with `collapse` [retrievers-examples-collapsing-retriever-results] -In our result set, we have many documents with the same `year` value. We can clean this up using the `collapse` parameter with our retriever. 
This, as with the standard [collapse](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/collapse-search-results.md) feature, +In our result set, we have many documents with the same `year` value. We can clean this up using the `collapse` parameter with our retriever. This, as with the standard [collapse](elasticsearch://reference/elasticsearch/rest-apis/collapse-search-results.md) feature, enables grouping results by any field and returns only the highest-scoring document from each group. In this example we’ll collapse our results based on the `year` field. ```console @@ -551,7 +551,7 @@ This returns the following response with collapsed results. ## Example: Highlighting results based on nested sub-retrievers [retrievers-examples-highlighting-retriever-results] -Highlighting is now also available for nested sub-retrievers matches. For example, consider the same `rrf` retriever as above, with a `knn` and `standard` retriever as its sub-retrievers. We can specify a `highlight` section, as defined in the [highlighting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/highlighting.md) documentation, and compute highlights for the top results. +Highlighting is now also available for nested sub-retrievers matches. For example, consider the same `rrf` retriever as above, with a `knn` and `standard` retriever as its sub-retrievers. We can specify a `highlight` section, as defined in the [highlighting](elasticsearch://reference/elasticsearch/rest-apis/highlighting.md) documentation, and compute highlights for the top results. ```console GET /retrievers_example/_search @@ -748,7 +748,7 @@ POST /retrievers_example_nested/_doc/3 POST /retrievers_example_nested/_refresh ``` -Now we can run an `rrf` retriever query and also compute [inner hits](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-inner-hits.md) for the `nested_field.nested_vector` field, based on the `knn` query specified. +Now we can run an `rrf` retriever query and also compute [inner hits](elasticsearch://reference/elasticsearch/rest-apis/retrieve-inner-hits.md) for the `nested_field.nested_vector` field, based on the `knn` query specified. ```console GET /retrievers_example_nested/_search diff --git a/solutions/search/retrievers-overview.md b/solutions/search/retrievers-overview.md index 38196bdd3..e1e7de820 100644 --- a/solutions/search/retrievers-overview.md +++ b/solutions/search/retrievers-overview.md @@ -28,7 +28,7 @@ Retrievers come in various types, each tailored for different search operations. * [**kNN Retriever**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). Returns top documents from a [knn search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-knn), in the context of a retriever framework. * [**Linear Retriever**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). Combines the top results from multiple sub-retrievers using a weighted sum of their scores. Allows to specify different weights for each retriever, as well as independently normalize the scores from each result set. * [**RRF Retriever**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). 
Combines and ranks multiple first-stage retrievers using the reciprocal rank fusion (RRF) algorithm. Allows you to combine multiple result sets with different relevance indicators into a single result set. An RRF retriever is a **compound retriever**, where its `filter` element is propagated to its sub retrievers. -* [**Rule Retriever**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). Applies [query rules](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/searching-with-query-rules.md#query-rules) to the query before returning results. +* [**Rule Retriever**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). Applies [query rules](elasticsearch://reference/elasticsearch/rest-apis/searching-with-query-rules.md#query-rules) to the query before returning results. * [**Text Similarity Re-ranker Retriever**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). Used for [semantic reranking](ranking/semantic-reranking.md). Requires first creating a `rerank` task using the [{{es}} Inference API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put). diff --git a/solutions/search/search-applications/search-application-api.md b/solutions/search/search-applications/search-application-api.md index 0b484a1f9..4f881ca3f 100644 --- a/solutions/search/search-applications/search-application-api.md +++ b/solutions/search/search-applications/search-application-api.md @@ -166,7 +166,7 @@ When you actually perform a search with no parameters, it will execute the under POST _application/search_application/my_search_application/_search ``` -Searching with the `query_string` and/or `default_field` parameters will perform a [`query_string`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-query-string-query.md) query. +Searching with the `query_string` and/or `default_field` parameters will perform a [`query_string`](elasticsearch://reference/query-languages/query-dsl-query-string-query.md) query. ::::{warning} The default template is subject to change in future versions of the Search Applications feature. @@ -284,7 +284,7 @@ The `text_fields` parameters can be overridden with new/different fields and boo ### Text search + ELSER with RRF [search-application-api-rrf-template] -This example supports the [reciprocal rank fusion (RRF)]](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) method for combining BM25 and [ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md) searches. Reciprocal Rank Fusion consistently improves the combined results of different search algorithms. It outperforms all other ranking algorithms, and often surpasses the best individual results, without calibration. +This example supports the [reciprocal rank fusion (RRF)]](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) method for combining BM25 and [ELSER](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md) searches. Reciprocal Rank Fusion consistently improves the combined results of different search algorithms. It outperforms all other ranking algorithms, and often surpasses the best individual results, without calibration. 
```console PUT _application/search_application/my-search-app @@ -516,10 +516,10 @@ POST _application/search_application/my_search_application/_search Text search results and ELSER search results are expected to have significantly different scores in some cases, which makes ranking challenging. To find the best search result mix for your dataset, we suggest experimenting with the boost values provided in the example template: * `text_query_boost` to boost the BM25 query as a whole -* [`boost`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-query-string-query.md#_boosting) fields to boost individual text search fields +* [`boost`](elasticsearch://reference/query-languages/query-dsl-query-string-query.md#_boosting) fields to boost individual text search fields * [`min_score`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-min_score) parameter to omit significantly low confidence results -The above boosts should be sufficient for many use cases, but there are cases when adding a [rescore](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/filter-search-results.md#rescore) query or [index boost](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/search-multiple-data-streams-indices.md#index-boost) to your template may be beneficial. Remember to update your search application to use the new template using the [put search application command](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search-application-put). +The above boosts should be sufficient for many use cases, but there are cases when adding a [rescore](elasticsearch://reference/elasticsearch/rest-apis/filter-search-results.md#rescore) query or [index boost](elasticsearch://reference/elasticsearch/rest-apis/search-multiple-data-streams-indices.md#index-boost) to your template may be beneficial. Remember to update your search application to use the new template using the [put search application command](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search-application-put). :::: diff --git a/solutions/search/search-applications/search-application-client.md b/solutions/search/search-applications/search-application-client.md index 72bd71bc3..337e01856 100644 --- a/solutions/search/search-applications/search-application-client.md +++ b/solutions/search/search-applications/search-application-client.md @@ -309,7 +309,7 @@ If you need to adjust `search_fields` at query request time, you can add a new p **Use case: I want to boost results given a certain proximity to the user** -You can add additional template parameters to send the geo-coordinates of the user. Then use [`function_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-function-score-query.md) to boost documents which match a certain [`geo_distance`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-geo-distance-query.md) from the user. +You can add additional template parameters to send the geo-coordinates of the user. Then use [`function_score`](elasticsearch://reference/query-languages/query-dsl-function-score-query.md) to boost documents which match a certain [`geo_distance`](elasticsearch://reference/query-languages/query-dsl-geo-distance-query.md) from the user. 
## Result fields [search-application-client-client-features-result-fields] @@ -368,7 +368,7 @@ If you need to adjust the fields returned at query request time, you can add a n ### Highlighting and snippets [search-application-client-client-features-highlight-snippets] -Highlighting support is straightforward to add to the template. With the [highlighting API](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/highlighting.md), you can specify which fields you want to highlight for matches. +Highlighting support is straightforward to add to the template. With the [highlighting API](elasticsearch://reference/elasticsearch/rest-apis/highlighting.md), you can specify which fields you want to highlight for matches. In the following example, we specify `title` and `plot` as the highlighted fields. `title` typically has a short value length, compared to `plot` which is variable and tends to be longer. diff --git a/solutions/search/search-pipelines.md b/solutions/search/search-pipelines.md index 5a373955a..aca7f4049 100644 --- a/solutions/search/search-pipelines.md +++ b/solutions/search/search-pipelines.md @@ -41,7 +41,7 @@ To this end, when you create indices for search use cases, (including web crawle This pipeline is called `search-default-ingestion`. While it is a "managed" pipeline (meaning it should not be tampered with), you can view its details via the Kibana UI or the Elasticsearch API. You can also [read more about its contents below](#ingest-pipeline-search-details-generic-reference). -You can control whether you run some of these processors. While all features are enabled by default, they are eligible for opt-out. For [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) and [connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md). , you can opt out (or back in) per index, and your choices are saved. For API indices, you can opt out (or back in) by including specific fields in your documents. [See below for details](#ingest-pipeline-search-pipeline-settings-using-the-api). +You can control whether you run some of these processors. While all features are enabled by default, they are eligible for opt-out. For [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) and [connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md). , you can opt out (or back in) per index, and your choices are saved. For API indices, you can opt out (or back in) by including specific fields in your documents. [See below for details](#ingest-pipeline-search-pipeline-settings-using-the-api). At the deployment level, you can change the default settings for all new indices. This will not effect existing indices. @@ -111,12 +111,12 @@ This pipeline is a "managed" pipeline. That means that it is not intended to be #### Processors [ingest-pipeline-search-details-generic-reference-processors] -1. `attachment` - this uses the [Attachment](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/attachment.md) processor to convert any binary data stored in a document’s `_attachment` field to a nested object of plain text and metadata. -2. `set_body` - this uses the [Set](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/set-processor.md) processor to copy any plain text extracted from the previous step and persist it on the document in the `body` field. -3. 
`remove_replacement_chars` - this uses the [Gsub](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/gsub-processor.md) processor to remove characters like "�" from the `body` field. -4. `remove_extra_whitespace` - this uses the [Gsub](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/gsub-processor.md) processor to replace consecutive whitespace characters with single spaces in the `body` field. While not perfect for every use case (see below for how to disable), this can ensure that search experiences display more content and highlighting and less empty space for your search results. -5. `trim` - this uses the [Trim](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/trim-processor.md) processor to remove any remaining leading or trailing whitespace from the `body` field. -6. `remove_meta_fields` - this final step of the pipeline uses the [Remove](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/remove-processor.md) processor to remove special fields that may have been used elsewhere in the pipeline, whether as temporary storage or as control flow parameters. +1. `attachment` - this uses the [Attachment](elasticsearch://reference/ingestion-tools/enrich-processor/attachment.md) processor to convert any binary data stored in a document’s `_attachment` field to a nested object of plain text and metadata. +2. `set_body` - this uses the [Set](elasticsearch://reference/ingestion-tools/enrich-processor/set-processor.md) processor to copy any plain text extracted from the previous step and persist it on the document in the `body` field. +3. `remove_replacement_chars` - this uses the [Gsub](elasticsearch://reference/ingestion-tools/enrich-processor/gsub-processor.md) processor to remove characters like "�" from the `body` field. +4. `remove_extra_whitespace` - this uses the [Gsub](elasticsearch://reference/ingestion-tools/enrich-processor/gsub-processor.md) processor to replace consecutive whitespace characters with single spaces in the `body` field. While not perfect for every use case (see below for how to disable), this can ensure that search experiences display more content and highlighting and less empty space for your search results. +5. `trim` - this uses the [Trim](elasticsearch://reference/ingestion-tools/enrich-processor/trim-processor.md) processor to remove any remaining leading or trailing whitespace from the `body` field. +6. `remove_meta_fields` - this final step of the pipeline uses the [Remove](elasticsearch://reference/ingestion-tools/enrich-processor/remove-processor.md) processor to remove special fields that may have been used elsewhere in the pipeline, whether as temporary storage or as control flow parameters. #### Control flow parameters [ingest-pipeline-search-details-generic-reference-params] @@ -161,8 +161,8 @@ This pipeline is a "managed" pipeline. That means that it is not intended to be In addition to the processors inherited from the [`search-default-ingestion` pipeline](#ingest-pipeline-search-details-generic-reference), the index-specific pipeline also defines: -* `index_ml_inference_pipeline` - this uses the [Pipeline](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/pipeline-processor.md) processor to run the `@ml-inference` pipeline. This processor will only be run if the source document includes a `_run_ml_inference` field with the value `true`.
-* `index_custom_pipeline` - this uses the [Pipeline](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/pipeline-processor.md) processor to run the `@custom` pipeline. +* `index_ml_inference_pipeline` - this uses the [Pipeline](elasticsearch://reference/ingestion-tools/enrich-processor/pipeline-processor.md) processor to run the `@ml-inference` pipeline. This processor will only be run if the source document includes a `_run_ml_inference` field with the value `true`. +* `index_custom_pipeline` - this uses the [Pipeline](elasticsearch://reference/ingestion-tools/enrich-processor/pipeline-processor.md) processor to run the `@custom` pipeline. ##### Control flow parameters [ingest-pipeline-search-details-specific-reference-params] diff --git a/solutions/search/semantic-search/cohere-es.md index 2fda0c058..f164c1f88 100644 --- a/solutions/search/semantic-search/cohere-es.md +++ b/solutions/search/semantic-search/cohere-es.md @@ -127,7 +127,7 @@ client.indices.create( ## Create the {{infer}} pipeline [cohere-es-infer-pipeline] -Now you have an {{infer}} endpoint and an index ready to store embeddings. The next step is to create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) that will create the embeddings using the {{infer}} endpoint and stores them in the index. +Now you have an {{infer}} endpoint and an index ready to store embeddings. The next step is to create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) that will create the embeddings using the {{infer}} endpoint and store them in the index. ```py client.ingest.put_pipeline( diff --git a/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md index d7f1729fb..7a9ac7286 100644 --- a/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md +++ b/solutions/search/semantic-search/semantic-search-elser-ingest-pipelines.md @@ -37,7 +37,7 @@ The minimum dedicated ML node size for deploying and using the ELSER model is 4 ### Create the index mapping [elser-mappings] -First, the mapping of the destination index - the index that contains the tokens that the model created based on your text - must be created. The destination index must have a field with the [`sparse_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) or [`rank_features`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/rank-features.md) field type to index the ELSER output. +First, create the mapping of the destination index - the index that contains the tokens that the model created based on your text. The destination index must have a field with the [`sparse_vector`](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) or [`rank_features`](elasticsearch://reference/elasticsearch/mapping-reference/rank-features.md) field type to index the ELSER output. ::::{note} ELSER output must be ingested into a field with the `sparse_vector` or `rank_features` field type.
Otherwise, {{es}} interprets the token-weight pairs as a massive number of fields in a document. If you get an error similar to this: `"Limit of total fields [1000] has been exceeded while adding new fields"` then the ELSER output field is not mapped properly and it has a field type different from `sparse_vector` or `rank_features`. @@ -66,7 +66,7 @@ PUT my-index 4. The field type which is text in this example. -To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](/manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) to use ELSER to infer against the data that is being ingested in the pipeline. +To learn how to optimize space, refer to the [Saving disk space by excluding the ELSER tokens from document source](#save-space) section. Then create an [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) to use ELSER to infer against the data that is being ingested in the pipeline. ```console PUT _ingest/pipeline/elser-v2-test @@ -143,7 +143,7 @@ POST _tasks//_cancel ### Semantic search by using the `sparse_vector` query [text-expansion-query] -To perform semantic search, use the [`sparse_vector` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md), and provide the query text and the inference ID associated with your ELSER model. The example below uses the query text "How to avoid muscle soreness after running?", the `content_embedding` field contains the generated ELSER output: +To perform semantic search, use the [`sparse_vector` query](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md), and provide the query text and the inference ID associated with your ELSER model. The example below uses the query text "How to avoid muscle soreness after running?"; the `content_embedding` field contains the generated ELSER output: ```console GET my-index/_search @@ -200,7 +200,7 @@ The result is the top 10 documents that are closest in meaning to your query tex ### Combining semantic search with other queries [text-expansion-compound-query] -You can combine [`sparse_vector`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md) with other queries in a [compound query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/compound-queries.md). For example, use a filter clause in a [Boolean](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md) or a full text query with the same (or different) query text as the `sparse_vector` query. This enables you to combine the search results from both queries. +You can combine [`sparse_vector`](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md) with other queries in a [compound query](elasticsearch://reference/query-languages/compound-queries.md). For example, use a filter clause in a [Boolean](elasticsearch://reference/query-languages/query-dsl-bool-query.md) or a full text query with the same (or different) query text as the `sparse_vector` query. This enables you to combine the search results from both queries. The search hits from the `sparse_vector` query tend to score higher than other {{es}} queries.
Those scores can be regularized by increasing or decreasing the relevance scores of each query by using the `boost` parameter. Recall on the `sparse_vector` query can be high where there is a long tail of less relevant results. Use the `min_score` parameter to prune those less relevant documents. @@ -243,10 +243,10 @@ GET my-index/_search ### Saving disk space by excluding the ELSER tokens from document source [save-space] -The tokens generated by ELSER must be indexed for use in the [sparse_vector query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md). However, it is not necessary to retain those terms in the document source. You can save disk space by using the [source exclude](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) mapping to remove the ELSER terms from the document source. +The tokens generated by ELSER must be indexed for use in the [sparse_vector query](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md). However, it is not necessary to retain those terms in the document source. You can save disk space by using the [source exclude](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) mapping to remove the ELSER terms from the document source. ::::{warning} -Reindex uses the document source to populate the destination index. **Once the ELSER terms have been excluded from the source, they cannot be recovered through reindexing.** Excluding the tokens from the source is a space-saving optimization that should only be applied if you are certain that reindexing will not be required in the future! It’s important to carefully consider this trade-off and make sure that excluding the ELSER terms from the source aligns with your specific requirements and use case. Review the [Disabling the `_source` field](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#disable-source-field) and [Including / Excluding fields from `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) sections carefully to learn more about the possible consequences of excluding the tokens from the `_source`. +Reindex uses the document source to populate the destination index. **Once the ELSER terms have been excluded from the source, they cannot be recovered through reindexing.** Excluding the tokens from the source is a space-saving optimization that should only be applied if you are certain that reindexing will not be required in the future! It’s important to carefully consider this trade-off and make sure that excluding the ELSER terms from the source aligns with your specific requirements and use case. Review the [Disabling the `_source` field](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#disable-source-field) and [Including / Excluding fields from `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#include-exclude) sections carefully to learn more about the possible consequences of excluding the tokens from the `_source`.
:::: diff --git a/solutions/search/semantic-search/semantic-search-inference.md index 8546c0b59..0f08e2b0a 100644 --- a/solutions/search/semantic-search/semantic-search-inference.md +++ b/solutions/search/semantic-search/semantic-search-inference.md @@ -320,7 +320,7 @@ PUT _inference/text_embedding/alibabacloud_ai_search_embeddings <1> ## Create the index mapping [infer-service-mappings] -The mapping of the destination index - the index that contains the embeddings that the model will create based on your input text - must be created. The destination index must have a field with the [`dense_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) field type for most models and the [`sparse_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) field type for the sparse vector models like in the case of the `elasticsearch` service to index the output of the used model. +The mapping of the destination index - the index that contains the embeddings that the model will create based on your input text - must be created. The destination index must have a field with the [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) field type for most models, or the [`sparse_vector`](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) field type for sparse vector models such as those used by the `elasticsearch` service, to index the output of the used model. :::::::{tab-set} @@ -597,7 +597,7 @@ PUT alibabacloud-ai-search-embeddings ## Create an ingest pipeline with an inference processor [infer-service-inference-ingest-pipeline] -Create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md) and use the model you created above to infer against the data that is being ingested in the pipeline. +Create an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [{{infer}} processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md) and use the model you created above to infer against the data that is being ingested in the pipeline. :::::::{tab-set} diff --git a/solutions/search/semantic-search/semantic-search-semantic-text.md index 7f0f4c49a..8a768723e 100644 --- a/solutions/search/semantic-search/semantic-search-semantic-text.md +++ b/solutions/search/semantic-search/semantic-search-semantic-text.md @@ -31,7 +31,7 @@ This tutorial uses the [`elasticsearch` service](../inference-api/elasticsearch- ## Create the index mapping [semantic-text-index-mapping] -The mapping of the destination index - the index that contains the embeddings that the inference endpoint will generate based on your input text - must be created. The destination index must have a field with the [`semantic_text`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md) field type to index the output of the used inference endpoint. +The mapping of the destination index - the index that contains the embeddings that the inference endpoint will generate based on your input text - must be created.
The destination index must have a field with the [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field type to index the output of the used inference endpoint. ```console PUT semantic-embeddings diff --git a/solutions/search/serverless-elasticsearch-get-started.md index 1e4b00154..d6577e8c0 100644 --- a/solutions/search/serverless-elasticsearch-get-started.md +++ b/solutions/search/serverless-elasticsearch-get-started.md @@ -108,7 +108,7 @@ If you’re already familiar with Elasticsearch, you can jump right into setting 2. Ingest your data. Elasticsearch provides several methods for ingesting data: * [{{es}} API](ingest-for-search.md) - * [Connector clients](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md) + * [Connector clients](elasticsearch://reference/ingestion-tools/search-connectors/index.md) * [File Uploader](/manage-data/ingest/upload-data-files.md) * [{{beats}}](asciidocalypse://docs/beats/docs/reference/index.md) * [{{ls}}](asciidocalypse://docs/logstash/docs/reference/index.md) diff --git a/solutions/search/site-or-app/clients.md index fe2b00d6c..f9cc44840 100644 --- a/solutions/search/site-or-app/clients.md +++ b/solutions/search/site-or-app/clients.md @@ -27,7 +27,7 @@ applies_to: In addition to official clients, the Elastic community has contributed libraries for other programming languages. -- [Community-contributed clients](asciidocalypse://docs/elasticsearch/docs/reference/community-contributed.md) +- [Community-contributed clients](elasticsearch://reference/community-contributed.md) ::::{tip} Learn how to [connect to your {{es}} endpoint](/solutions/search/search-connection-details.md). diff --git a/solutions/search/the-search-api.md index 4366fa7c5..06177d579 100644 --- a/solutions/search/the-search-api.md +++ b/solutions/search/the-search-api.md @@ -18,7 +18,7 @@ You can use the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/o ## Run a search [run-an-es-search] -The following request searches `my-index-000001` using a [`match`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-query.md) query. This query matches documents with a `user.id` value of `kimchy`. +The following request searches `my-index-000001` using a [`match`](elasticsearch://reference/query-languages/query-dsl-match-query.md) query. This query matches documents with a `user.id` value of `kimchy`. ```console GET /my-index-000001/_search @@ -87,10 +87,10 @@ You can use the following options to customize your searches. **Query DSL**
[Query DSL](../../explore-analyze/query-filter/languages/querydsl.md) supports a variety of query types you can mix and match to get the results you want. Query types include: -* [Boolean](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-bool-query.md) and other [compound queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/compound-queries.md), which let you combine queries and match results based on multiple criteria -* [Term-level queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/term-level-queries.md) for filtering and finding exact matches -* [Full text queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/full-text-queries.md), which are commonly used in search engines -* [Geo](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/geo-queries.md) and [spatial queries](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/shape-queries.md) +* [Boolean](elasticsearch://reference/query-languages/query-dsl-bool-query.md) and other [compound queries](elasticsearch://reference/query-languages/compound-queries.md), which let you combine queries and match results based on multiple criteria +* [Term-level queries](elasticsearch://reference/query-languages/term-level-queries.md) for filtering and finding exact matches +* [Full text queries](elasticsearch://reference/query-languages/full-text-queries.md), which are commonly used in search engines +* [Geo](elasticsearch://reference/query-languages/geo-queries.md) and [spatial queries](elasticsearch://reference/query-languages/shape-queries.md) **Aggregations**
You can use [search aggregations](../../explore-analyze/query-filter/aggregations.md) to get statistics and other analytics for your search results. Aggregations help you answer questions like: @@ -98,13 +98,13 @@ You can use the following options to customize your searches. * What are the top IP addresses hit by users on my network? * What is the total transaction revenue by customer? -**Search multiple data streams and indices**
You can use comma-separated values and grep-like index patterns to search several data streams and indices in the same request. You can even boost search results from specific indices. See [Search multiple data streams and indices using a query](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/search-multiple-data-streams-indices.md). +**Search multiple data streams and indices**
You can use comma-separated values and grep-like index patterns to search several data streams and indices in the same request. You can even boost search results from specific indices. See [Search multiple data streams and indices using a query](elasticsearch://reference/elasticsearch/rest-apis/search-multiple-data-streams-indices.md). -**Paginate search results**
By default, searches return only the top 10 matching hits. To retrieve more or fewer documents, see [Paginate search results](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md). +**Paginate search results**
By default, searches return only the top 10 matching hits. To retrieve more or fewer documents, see [Paginate search results](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md). -**Retrieve selected fields**
The search response’s `hits.hits` property includes the full document [`_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md) for each hit. To retrieve only a subset of the `_source` or other fields, see [Retrieve selected fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-selected-fields.md). +**Retrieve selected fields**
The search response’s `hits.hits` property includes the full document [`_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) for each hit. To retrieve only a subset of the `_source` or other fields, see [Retrieve selected fields](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md). -**Sort search results**
By default, search hits are sorted by `_score`, a [relevance score](../../explore-analyze/query-filter/languages/querydsl.md#relevance-scores) that measures how well each document matches the query. To customize the calculation of these scores, use the [`script_score`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md) query. To sort search hits by other field values, see [Sort search results](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/sort-search-results.md). +**Sort search results**
By default, search hits are sorted by `_score`, a [relevance score](../../explore-analyze/query-filter/languages/querydsl.md#relevance-scores) that measures how well each document matches the query. To customize the calculation of these scores, use the [`script_score`](elasticsearch://reference/query-languages/query-dsl-script-score-query.md) query. To sort search hits by other field values, see [Sort search results](elasticsearch://reference/elasticsearch/rest-apis/sort-search-results.md). **Run an async search**
{{es}} searches are designed to run on large volumes of data quickly, often returning results in milliseconds. For this reason, searches are *synchronous* by default. The search request waits for complete results before returning a response. @@ -119,7 +119,7 @@ Instead of indexing your data and then searching it, you can define [runtime fie For example, the following query defines a runtime field called `day_of_week`. The included script calculates the day of the week based on the value of the `@timestamp` field, and uses `emit` to return the calculated value. -The query also includes a [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) that operates on `day_of_week`. +The query also includes a [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) that operates on `day_of_week`. ```console GET /my-index-000001/_search @@ -341,7 +341,7 @@ GET /_search?q=user.id:elkbee&size=0&terminate_after=1 ``` ::::{note} -`terminate_after` is always applied **after** the [`post_filter`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/filter-search-results.md#post-filter) and stops the query as well as the aggregation executions when enough hits have been collected on the shard. Though the doc count on aggregations may not reflect the `hits.total` in the response since aggregations are applied **before** the post filtering. +`terminate_after` is always applied **after** the [`post_filter`](elasticsearch://reference/elasticsearch/rest-apis/filter-search-results.md#post-filter) and stops the query as well as the aggregation executions when enough hits have been collected on the shard. However, the doc count on aggregations may not reflect the `hits.total` in the response, since aggregations are applied **before** the post filtering. :::: diff --git a/solutions/search/vector/bring-own-vectors.md index ccb055b46..21b347241 100644 --- a/solutions/search/vector/bring-own-vectors.md +++ b/solutions/search/vector/bring-own-vectors.md @@ -14,24 +14,24 @@ This tutorial demonstrates how to index documents that already have dense vector You’ll find links at the end of this tutorial for more information about deploying a text embedding model in {{es}}, so you can generate embeddings for queries on the fly. -::::{tip} +::::{tip} This is an advanced use case. Refer to [Semantic search](../semantic-search.md) for an overview of your options for semantic search with {{es}}. :::: -## Step 1: Create an index with `dense_vector` mapping [bring-your-own-vectors-create-index] +## Step 1: Create an index with `dense_vector` mapping [bring-your-own-vectors-create-index] Each document in our simple dataset will have: * A review: stored in a `review_text` field * An embedding of that review: stored in a `review_vector` field * The `review_vector` field is defined as a [`dense_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) data type. -::::{tip} -The `dense_vector` type automatically uses `int8_hnsw` quantization by default to reduce the memory footprint required when searching float vectors.
Learn more about balancing performance and accuracy in [Dense vector quantization](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization). +::::{tip} +The `dense_vector` type automatically uses `int8_hnsw` quantization by default to reduce the memory footprint required when searching float vectors. Learn more about balancing performance and accuracy in [Dense vector quantization](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization). :::: @@ -61,10 +61,10 @@ PUT /amazon-reviews -## Step 2: Index documents with embeddings [bring-your-own-vectors-index-documents] +## Step 2: Index documents with embeddings [bring-your-own-vectors-index-documents] -### Index a single document [_index_a_single_document] +### Index a single document [_index_a_single_document] First, index a single document to understand the document structure. @@ -80,7 +80,7 @@ PUT /amazon-reviews/_doc/1 -### Bulk index multiple documents [_bulk_index_multiple_documents] +### Bulk index multiple documents [_bulk_index_multiple_documents] In a production scenario, you’ll want to index many documents at once using the [`_bulk` endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk). @@ -99,7 +99,7 @@ POST /_bulk ``` -## Step 3: Search documents with embeddings [bring-your-own-vectors-search-documents] +## Step 3: Search documents with embeddings [bring-your-own-vectors-search-documents] Now you can query these document vectors using a [`knn` retriever](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever). `knn` is a type of vector search, which finds the `k` most similar documents to a query vector. Here we’re simply using a raw vector for the query text, for demonstration purposes. @@ -123,15 +123,15 @@ POST /amazon-reviews/_search -## Learn more [bring-your-own-vectors-learn-more] +## Learn more [bring-your-own-vectors-learn-more] In this simple example, we’re sending a raw vector for the query text. In a real-world scenario you won’t know the query text ahead of time. You’ll need to generate query vectors, on the fly, using the same embedding model that generated the document vectors. -For this you’ll need to deploy a text embedding model in {{es}} and use the [`query_vector_builder` parameter](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-knn-query.md#knn-query-top-level-parameters). Alternatively, you can generate vectors client-side and send them directly with the search request. +For this, you’ll need to deploy a text embedding model in {{es}} and use the [`query_vector_builder` parameter](elasticsearch://reference/query-languages/query-dsl-knn-query.md#knn-query-top-level-parameters). Alternatively, you can generate vectors client-side and send them directly with the search request. Learn how to [use a deployed text embedding model](dense-versus-sparse-ingest-pipelines.md) for semantic search. -::::{tip} +::::{tip} If you’re just getting started with vector search in {{es}}, refer to [Semantic search](../semantic-search.md).
:::: diff --git a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md index 538e6b241..d4d2e2ffd 100644 --- a/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md +++ b/solutions/search/vector/dense-versus-sparse-ingest-pipelines.md @@ -10,9 +10,9 @@ applies_to: # Tutorial: Dense and sparse workflows using ingest pipelines [semantic-search-deployed-nlp-model] -::::{important} +::::{important} * For the easiest way to perform semantic search in the {{stack}}, refer to the [`semantic_text`](../semantic-search/semantic-search-semantic-text.md) end-to-end tutorial. -* This tutorial was written before the [{{infer}} endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference) and [`semantic_text` field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md) was introduced. Today we have simpler options for performing semantic search. +* This tutorial was written before the [{{infer}} endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference) and [`semantic_text` field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) were introduced. Today we have simpler options for performing semantic search. :::: @@ -20,7 +20,7 @@ applies_to: This guide shows you how to implement semantic search with models deployed in {{es}}: from selecting an NLP model, to writing queries. -## Select an NLP model [deployed-select-nlp-model] +## Select an NLP model [deployed-select-nlp-model] {{es}} offers the usage of a [wide range of NLP models](/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), including both dense and sparse vector models. Your choice of the language model is critical for implementing semantic search successfully. @@ -31,7 +31,7 @@ To address this issue, Elastic provides a pre-trained representational model cal In the case of sparse vector representation, the vectors mostly consist of zero values, with only a small subset containing non-zero values. This representation is commonly used for textual data. In the case of ELSER, each document in an index and the query text itself are represented by high-dimensional sparse vectors. Each non-zero element of the vector corresponds to a term in the model vocabulary. The ELSER vocabulary contains around 30000 terms, so the sparse vectors created by ELSER contain about 30000 values, the majority of which are zero. Effectively the ELSER model is replacing the terms in the original query with other terms that have been learnt to exist in the documents that best match the original search terms in a training dataset, and weights to control how important each is. -## Deploy the model [deployed-deploy-nlp-model] +## Deploy the model [deployed-deploy-nlp-model] After you decide which model you want to use for implementing semantic search, you need to deploy the model in {{es}}. @@ -47,14 +47,14 @@ To deploy a third-party text embedding model, refer to [Deploy a text embedding ::::::: -## Map a field for the text embeddings [deployed-field-mappings] +## Map a field for the text embeddings [deployed-field-mappings] Before you start using the deployed model to generate embeddings based on your input text, you need to prepare your index mapping first. The mapping of the index depends on the type of model.
:::::::{tab-set} ::::::{tab-item} ELSER -ELSER produces token-weight pairs as output from the input text and the query. The {{es}} [`sparse_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) field type can store these token-weight pairs as numeric feature vectors. The index must have a field with the `sparse_vector` field type to index the tokens that ELSER generates. +ELSER produces token-weight pairs as output from the input text and the query. The {{es}} [`sparse_vector`](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) field type can store these token-weight pairs as numeric feature vectors. The index must have a field with the `sparse_vector` field type to index the tokens that ELSER generates. To create a mapping for your ELSER index, refer to the [Create the index mapping section](../semantic-search/semantic-search-elser-ingest-pipelines.md#elser-mappings) of the tutorial. The example shows how to create an index mapping for `my-index` that defines the `my_embeddings.tokens` field - which will contain the ELSER output - as a `sparse_vector` field. @@ -81,7 +81,7 @@ PUT my-index :::::: ::::::{tab-item} Dense vector models -The models compatible with {{es}} NLP generate dense vectors as output. The [`dense_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) field type is suitable for storing dense vectors of numeric values. The index must have a field with the `dense_vector` field type to index the embeddings that the supported third-party model that you selected generates. Keep in mind that the model produces embeddings with a certain number of dimensions. The `dense_vector` field must be configured with the same number of dimensions using the `dims` option. Refer to the respective model documentation to get information about the number of dimensions of the embeddings. +The models compatible with {{es}} NLP generate dense vectors as output. The [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) field type is suitable for storing dense vectors of numeric values. The index must have a field with the `dense_vector` field type to index the embeddings that the supported third-party model that you selected generates. Keep in mind that the model produces embeddings with a certain number of dimensions. The `dense_vector` field must be configured with the same number of dimensions using the `dims` option. Refer to the respective model documentation to get information about the number of dimensions of the embeddings. To review a mapping of an index for an NLP model, refer to the mapping code snippet in the [Add the text embedding model to an ingest inference pipeline](/explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md#ex-text-emb-ingest) section of the tutorial. The example shows how to create an index mapping that defines the `my_embeddings.predicted_value` field - which will contain the model output - as a `dense_vector` field. @@ -111,9 +111,9 @@ PUT my-index ::::::: -## Generate text embeddings [deployed-generate-embeddings] -Once you have created the mappings for the index, you can generate text embeddings from your input text.
This can be done by using an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [inference processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/inference-processor.md). The ingest pipeline processes the input data and indexes it into the destination index. At index time, the inference ingest processor uses the trained model to infer against the data ingested through the pipeline. After you created the ingest pipeline with the inference processor, you can ingest your data through it to generate the model output. +Once you have created the mappings for the index, you can generate text embeddings from your input text. This can be done by using an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) with an [inference processor](elasticsearch://reference/ingestion-tools/enrich-processor/inference-processor.md). The ingest pipeline processes the input data and indexes it into the destination index. At index time, the inference ingest processor uses the trained model to infer against the data ingested through the pipeline. After you have created the ingest pipeline with the inference processor, you can ingest your data through it to generate the model output. :::::::{tab-set} @@ -178,14 +178,14 @@ To ingest data through the pipeline to generate text embeddings with your chosen Now it is time to perform semantic search! -## Search the data [deployed-search] -Depending on the type of model you have deployed, you can query rank features with a [sparse vector](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md) query, or dense vectors with a kNN search. +Depending on the type of model you have deployed, you can query rank features with a [sparse vector](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md) query, or dense vectors with a kNN search. :::::::{tab-set} ::::::{tab-item} ELSER -ELSER text embeddings can be queried using a [sparse vector query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md). The sparse vector query enables you to query a [sparse vector](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/sparse-vector.md) field, by providing the inference ID associated with the NLP model you want to use, and the query text: +ELSER text embeddings can be queried using a [sparse vector query](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md). The sparse vector query enables you to query a [sparse vector](elasticsearch://reference/elasticsearch/mapping-reference/sparse-vector.md) field, by providing the inference ID associated with the NLP model you want to use, and the query text: ```console GET my-index/_search @@ -224,16 +224,16 @@ GET my-index/_search ::::::: -## Beyond semantic search with hybrid search [deployed-hybrid-search] +## Beyond semantic search with hybrid search [deployed-hybrid-search] In some situations, lexical search may perform better than semantic search. For example, when searching for single words or IDs, like product numbers. -Combining semantic and lexical search into one hybrid search request using [reciprocal rank fusion](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) provides the best of both worlds.
Not only that, but hybrid search using reciprocal rank fusion [has been shown to perform better in general](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid). +Combining semantic and lexical search into one hybrid search request using [reciprocal rank fusion](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) provides the best of both worlds. Not only that, but hybrid search using reciprocal rank fusion [has been shown to perform better in general](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid). :::::::{tab-set} ::::::{tab-item} ELSER -Hybrid search between a semantic and lexical query can be achieved by using an [`rrf` retriever](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever) as part of your search request. Provide a `sparse_vector` query and a full-text query as [`standard` retrievers](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever) for the `rrf` retriever. The `rrf` retriever uses [reciprocal rank fusion](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) to rank the top documents. +Hybrid search between a semantic and lexical query can be achieved by using an [`rrf` retriever](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever) as part of your search request. Provide a `sparse_vector` query and a full-text query as [`standard` retrievers](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-retriever) for the `rrf` retriever. The `rrf` retriever uses [reciprocal rank fusion](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) to rank the top documents. ```console GET my-index/_search @@ -271,7 +271,7 @@ GET my-index/_search ::::::{tab-item} Dense vector models Hybrid search between a semantic and lexical query can be achieved by providing: -* an `rrf` retriever to rank top documents using [reciprocal rank fusion](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) +* an `rrf` retriever to rank top documents using [reciprocal rank fusion](elasticsearch://reference/elasticsearch/rest-apis/reciprocal-rank-fusion.md) * a `standard` retriever as a child retriever with `query` clause for the full-text query * a `knn` retriever as a child retriever with the kNN search that queries the dense vector field diff --git a/solutions/search/vector/knn.md index 91fda8703..f6df5ef13 100644 --- a/solutions/search/vector/knn.md +++ b/solutions/search/vector/knn.md @@ -30,7 +30,7 @@ Common use cases for kNN include: ## Prerequisites [knn-prereqs] * To run a kNN search, your data must be transformed into vectors. You can [use an NLP model in {{es}}](../../../explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md), or generate them outside {{es}}. - - Dense vectors need to use the [`dense_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md) field type. + - Dense vectors need to use the [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) field type. - Queries are represented as vectors with the same dimension.
You should use the same model to generate the query vector as you used to generate the document vectors. - If you already have vectors, refer to the [Bring your own dense vectors](bring-own-vectors.md) guide. @@ -62,7 +62,7 @@ To run an approximate kNN search, use the [`knn` option](https://www.elastic.co/ 1. Explicitly map one or more `dense_vector` fields. Approximate kNN search requires the following mapping options: - * A `similarity` value. This value determines the similarity metric used to score documents based on similarity between the query and document vector. For a list of available metrics, see the [`similarity`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) parameter documentation. The `similarity` setting defaults to `cosine`. + * A `similarity` value. This value determines the similarity metric used to score documents based on similarity between the query and document vector. For a list of available metrics, see the [`similarity`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) parameter documentation. The `similarity` setting defaults to `cosine`. ```console PUT image-index @@ -103,7 +103,7 @@ To run an approximate kNN search, use the [`knn` option](https://www.elastic.co/ ... ``` -3. Run the search using the [`knn` option](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-knn) or the [`knn` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-knn-query.md) (expert case). +3. Run the search using the [`knn` option](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-knn) or the [`knn` query](elasticsearch://reference/query-languages/query-dsl-knn-query.md) (expert case). ```console POST image-index/_search @@ -119,7 +119,7 @@ To run an approximate kNN search, use the [`knn` option](https://www.elastic.co/ ``` -The document `_score` is a positive 32-bit floating point number used to score the relevance of the returned document, determined by the similarity between the query and document vector. See [`similarity`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) for more information on how kNN search scores are computed. +The document `_score` is a positive 32-bit floating point number used to score the relevance of the returned document, determined by the similarity between the query and document vector. See [`similarity`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) for more information on how kNN search scores are computed. ::::{note} Support for approximate kNN search was added in version 8.0. Before this, `dense_vector` fields did not support enabling `index` in the mapping. If you created an index prior to 8.0 containing `dense_vector` fields, then to support approximate kNN search the data must be reindexed using a new field mapping that sets `index: true` which is the default option. @@ -129,7 +129,7 @@ Support for approximate kNN search was added in version 8.0. Before this, `dense For approximate kNN search, {{es}} stores the dense vector values of each segment as an [HNSW graph](https://arxiv.org/abs/1603.09320). Indexing vectors for approximate kNN search can take substantial time because of how expensive it is to build these graphs.
You may need to increase the client request timeout for index and bulk requests. The [approximate kNN tuning guide](/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md) contains important guidance around indexing performance, and how the index configuration can affect search performance. -In addition to its search-time tuning parameters, the HNSW algorithm has index-time parameters that trade off between the cost of building the graph, search speed, and accuracy. When setting up the `dense_vector` mapping, you can use the [`index_options`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options) argument to adjust these parameters: +In addition to its search-time tuning parameters, the HNSW algorithm has index-time parameters that trade off between the cost of building the graph, search speed, and accuracy. When setting up the `dense_vector` mapping, you can use the [`index_options`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options) argument to adjust these parameters: ```console PUT image-index @@ -162,9 +162,9 @@ Similarly, you can decrease `num_candidates` for faster searches with potentiall ### Approximate kNN using byte vectors [approximate-knn-using-byte-vectors] -The approximate kNN search API supports `byte` value vectors in addition to `float` value vectors. Use the [`knn` option](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-knn) to search a `dense_vector` field with [`element_type`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-params) set to `byte` and indexing enabled. +The approximate kNN search API supports `byte` value vectors in addition to `float` value vectors. Use the [`knn` option](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search#operation-search-body-application-json-knn) to search a `dense_vector` field with [`element_type`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-params) set to `byte` and indexing enabled. -1. Explicitly map one or more `dense_vector` fields with [`element_type`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-params) set to `byte` and indexing enabled. +1. Explicitly map one or more `dense_vector` fields with [`element_type`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-params) set to `byte` and indexing enabled. ```console PUT byte-image-index @@ -230,7 +230,7 @@ POST byte-image-index/_search ### Byte quantized kNN search [knn-search-quantized-example] -If you want to provide `float` vectors, but want the memory savings of `byte` vectors, you can use the [quantization](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) feature. Quantization allows you to provide `float` vectors, but internally they are indexed as `byte` vectors. Additionally, the original `float` vectors are still retained in the index. +If you want to provide `float` vectors, but want the memory savings of `byte` vectors, you can use the [quantization](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) feature.
Quantization allows you to provide `float` vectors, but internally they are indexed as `byte` vectors. Additionally, the original `float` vectors are still retained in the index. ::::{note} The default index type for `dense_vector` is `int8_hnsw`. @@ -501,11 +501,11 @@ To alleviate this worry, there is a `similarity` parameter available in the `knn * Do not return any vectors that are further away than the configured `similarity` ::::{note} -`similarity` is the true [similarity](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) before it has been transformed into `_score` and boost applied. +`similarity` is the true [similarity](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity) before it has been transformed into `_score` and boost applied. :::: -For each configured [similarity](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity), here is the corresponding inverted `_score` function. This is so if you are wanting to filter from a `_score` perspective, you can do this minor transformation to correctly reject irrelevant results. +For each configured [similarity](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-similarity), here is the corresponding inverted `_score` function. If you want to filter from a `_score` perspective, you can apply this minor transformation to correctly reject irrelevant results. * `l2_norm`: `sqrt((1 / _score) - 1)` * `cosine`: `(2 * _score) - 1` @@ -543,7 +543,7 @@ In our data set, the only document with the file type of `png` has a vector of ` ### Nested kNN Search [nested-knn-search] -It is common for text to exceed a particular model’s token limit and requires chunking before building the embeddings for individual chunks. When using [`nested`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) with [`dense_vector`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md), you can achieve nearest passage retrieval without copying top-level document metadata. +It is common for text to exceed a particular model’s token limit, which requires chunking before building the embeddings for individual chunks. When using [`nested`](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) with [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md), you can achieve nearest passage retrieval without copying top-level document metadata. Here is a simple passage vectors index that stores vectors and some top-level metadata for filtering. @@ -739,10 +739,10 @@ Now we have filtered based on the top level `"creation_time"` and only one docum ### Nested kNN Search with Inner hits [nested-knn-search-inner-hits] -Additionally, if you wanted to extract the nearest passage for a matched document, you can supply [inner_hits](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-inner-hits.md) to the `knn` clause.
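As a rough sketch of that request shape, the following could be used. The `passage_vectors` index, `paragraph.vector` and `paragraph.text` fields, and the two-dimensional query vector echo the passage vectors example mentioned above, but they are assumptions here, not the page's exact listing:

```console
POST passage_vectors/_search
{
  "_source": false,
  "fields": ["full_text"],
  "knn": {
    "field": "paragraph.vector",
    "query_vector": [0.45, 45],
    "k": 2,
    "num_candidates": 2,
    "inner_hits": {
      "name": "nearest_passage",
      "_source": false,
      "fields": ["paragraph.text"]
    }
  }
}
```

Each top-level hit then carries a `nearest_passage` section under `inner_hits`, identifying which chunk of the document was closest to the query vector.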
::::{note} -When using `inner_hits` and multiple `knn` clauses, be sure to specify the [`inner_hits.name`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/retrieve-inner-hits.md#inner-hits-options) field. Otherwise, a naming clash can occur and fail the search request. +When using `inner_hits` and multiple `knn` clauses, be sure to specify the [`inner_hits.name`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-inner-hits.md#inner-hits-options) field. Otherwise, a naming clash can occur and fail the search request. :::: @@ -898,7 +898,7 @@ Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elas ### Oversampling and rescoring for quantized vectors [dense-vector-knn-search-rescoring] -When using [quantized vectors](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing: +When using [quantized vectors](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing: * **Oversampling**: Retrieve more candidates per shard. * **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates. @@ -955,7 +955,7 @@ The following sections provide additional ways of rescoring: You can use this option when you don’t want to rescore on each shard, but on the top results from all shards. -Use the [rescore section](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/filter-search-results.md#rescore) in the `_search` request to rescore the top results from a kNN search. +Use the [rescore section](elasticsearch://reference/elasticsearch/rest-apis/filter-search-results.md#rescore) in the `_search` request to rescore the top results from a kNN search. Here is an example using the top level `knn` search with oversampling and using `rescore` to rerank the results: @@ -1005,7 +1005,7 @@ POST /my-index/_search You can use this option when you want to rescore on each shard and want more fine-grained control on the rescoring than the `rescore_vector` option provides. -Use rescore per shard with the [knn query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-knn-query.md) and [script_score query ](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md). Generally, this means that there will be more rescoring per shard, but this can increase overall recall at the cost of compute. +Use rescore per shard with the [knn query](elasticsearch://reference/query-languages/query-dsl-knn-query.md) and [script_score query](elasticsearch://reference/query-languages/query-dsl-script-score-query.md). Generally, this means that there will be more rescoring per shard, but this can increase overall recall at the cost of compute. ```console POST /my-index/_search @@ -1075,10 +1075,10 @@ To run an exact kNN search, use a `script_score` query with a vector function. ... ``` -3. Use the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) to run a `script_score` query containing a [vector function](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-script-score-query.md#vector-functions). +3.
Use the [search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) to run a `script_score` query containing a [vector function](elasticsearch://reference/query-languages/query-dsl-script-score-query.md#vector-functions). ::::{tip} - To limit the number of matched documents passed to the vector function, we recommend you specify a filter query in the `script_score.query` parameter. If needed, you can use a [`match_all` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-all-query.md) in this parameter to match all documents. However, matching all documents can significantly increase search latency. + To limit the number of matched documents passed to the vector function, we recommend you specify a filter query in the `script_score.query` parameter. If needed, you can use a [`match_all` query](elasticsearch://reference/query-languages/query-dsl-match-all-query.md) in this parameter to match all documents. However, matching all documents can significantly increase search latency. :::: @@ -1117,6 +1117,6 @@ Common use cases for kNN include: * Product recommendations and recommendation engines * Similarity search for images or videos -::::{tip} +::::{tip} Check out our [hands-on tutorial](bring-own-vectors.md) to learn how to ingest dense vector embeddings into Elasticsearch. :::: \ No newline at end of file diff --git a/solutions/search/vector/sparse-vector.md b/solutions/search/vector/sparse-vector.md index 7d50d1dac..238b1e402 100644 --- a/solutions/search/vector/sparse-vector.md +++ b/solutions/search/vector/sparse-vector.md @@ -28,4 +28,4 @@ Sparse vector search with ELSER expands both documents and queries into weighted - Deploy and configure the ELSER model - Use the `sparse_vector` field type - See [this overview](../semantic-search.md#using-nlp-models) for implementation options -2. Query the index using [`sparse_vector` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-sparse-vector-query.md). \ No newline at end of file +2. Query the index using [`sparse_vector` query](elasticsearch://reference/query-languages/query-dsl-sparse-vector-query.md). \ No newline at end of file diff --git a/solutions/security/ai/ai-assistant-knowledge-base.md b/solutions/security/ai/ai-assistant-knowledge-base.md index 9d6600500..c5936e753 100644 --- a/solutions/security/ai/ai-assistant-knowledge-base.md +++ b/solutions/security/ai/ai-assistant-knowledge-base.md @@ -136,7 +136,7 @@ Refer to the following video for an example of adding a document to Knowledge Ba Add an index as a knowledge source when you want new information added to that index to automatically inform AI Assistant’s responses. Common security examples include asset inventories, network configuration information, on-call matrices, threat intelligence reports, and vulnerability scans. ::::{important} -Indices added to Knowledge Base must have at least one field mapped as [semantic text](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/semantic-text.md). +Indices added to Knowledge Base must have at least one field mapped as [semantic text](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md). :::: @@ -175,7 +175,7 @@ Refer to the following video for an example of adding an index to Knowledge Base You can use an {{es}} connector or web crawler to create an index that contains data you want to add to Knowledge Base. 
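As a rough sketch, an index satisfying the `semantic_text` requirement noted above could be created like this (the index and field names are hypothetical, and the cluster's default inference endpoint is assumed):

```console
PUT threat-intel-reports
{
  "mappings": {
    "properties": {
      "report_text": {
        "type": "semantic_text"
      }
    }
  }
}
```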
-This section provides an example of adding a threat intelligence feed to Knowledge Base using a web crawler. For more information on adding data to {{es}} using a connector, refer to [Ingest data with Elastic connectors](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/search-connectors/index.md). For more information on web crawlers, refer to [Elastic web crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html). +This section provides an example of adding a threat intelligence feed to Knowledge Base using a web crawler. For more information on adding data to {{es}} using a connector, refer to [Ingest data with Elastic connectors](elasticsearch://reference/ingestion-tools/search-connectors/index.md). For more information on web crawlers, refer to [Elastic web crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html). #### Use a web crawler to add threat intelligence to Knowledge Base [_use_a_web_crawler_to_add_threat_intelligence_to_knowledge_base] diff --git a/solutions/security/dashboards/data-quality-dashboard.md b/solutions/security/dashboards/data-quality-dashboard.md index 0cf917875..5d027141a 100644 --- a/solutions/security/dashboards/data-quality-dashboard.md +++ b/solutions/security/dashboards/data-quality-dashboard.md @@ -92,7 +92,7 @@ After an index is checked, a **Pass** or **Fail*** status appears. ***Fail*** in The index check flyout provides more information about the status of fields in that index. Each of its tabs describe fields grouped by mapping status. ::::{note} -Fields in the **Same family** category have the correct search behavior, but might have different storage or performance characteristics (for example, you can index strings to both `text` and `keyword` fields). To learn more, refer to [Field data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). +Fields in the **Same family** category have the correct search behavior, but might have different storage or performance characteristics (for example, you can index strings to both `text` and `keyword` fields). To learn more, refer to [Field data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). :::: diff --git a/solutions/security/detect-and-alert/about-detection-rules.md b/solutions/security/detect-and-alert/about-detection-rules.md index 8528b30c7..2eefab191 100644 --- a/solutions/security/detect-and-alert/about-detection-rules.md +++ b/solutions/security/detect-and-alert/about-detection-rules.md @@ -37,7 +37,7 @@ You can create the following types of rules: For example, if the threshold `field` is `source.ip` and its `value` is `10`, an alert is generated for every source IP address that appears in at least 10 of the rule’s search results. * [**Event correlation**](/solutions/security/detect-and-alert/create-detection-rule.md#create-eql-rule): Searches the defined indices and creates an alert when results match an [Event Query Language (EQL)](/explore-analyze/query-filter/languages/eql.md) query. -* [**Indicator match**](/solutions/security/detect-and-alert/create-detection-rule.md#create-indicator-rule): Creates an alert when {{elastic-sec}} index field values match field values defined in the specified indicator index patterns. For example, you can create an indicator index for IP addresses and use this index to create an alert whenever an event’s `destination.ip` equals a value in the index. 
Indicator index field mappings should be [ECS-compliant](https://www.elastic.co/guide/en/ecs/current). For information on creating {{es}} indices and field types, see [Index some documents](https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-general-purpose.html#gp-gs-add-data), [Create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create), and [Field data types](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). If you have indicators in a standard file format, such as CSV or JSON, you can also use the Machine Learning Data Visualizer to import your indicators into an indicator index. See [Explore the data in {{kib}}](/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md#sample-data-visualizer) and use the **Import Data** option to import your indicators. +* [**Indicator match**](/solutions/security/detect-and-alert/create-detection-rule.md#create-indicator-rule): Creates an alert when {{elastic-sec}} index field values match field values defined in the specified indicator index patterns. For example, you can create an indicator index for IP addresses and use this index to create an alert whenever an event’s `destination.ip` equals a value in the index. Indicator index field mappings should be [ECS-compliant](https://www.elastic.co/guide/en/ecs/current). For information on creating {{es}} indices and field types, see [Index some documents](https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-general-purpose.html#gp-gs-add-data), [Create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create), and [Field data types](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). If you have indicators in a standard file format, such as CSV or JSON, you can also use the Machine Learning Data Visualizer to import your indicators into an indicator index. See [Explore the data in {{kib}}](/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md#sample-data-visualizer) and use the **Import Data** option to import your indicators. ::::{tip} You can also use value lists as the indicator match index. See [Use value lists with indicator match rules](/solutions/security/detect-and-alert/create-detection-rule.md#indicator-value-lists) at the end of this topic for more information. diff --git a/solutions/security/detect-and-alert/add-manage-exceptions.md b/solutions/security/detect-and-alert/add-manage-exceptions.md index c68421c11..ef54e54b5 100644 --- a/solutions/security/detect-and-alert/add-manage-exceptions.md +++ b/solutions/security/detect-and-alert/add-manage-exceptions.md @@ -93,7 +93,7 @@ You can add exceptions to a rule from the rule details page, the Alerts table, t :::: - * `matches` | `does not match` — Allows you to use wildcards in **Value**, such as `C:\\path\\*\\app.exe`. Available wildcards are `?` (match one character) and `*` (match zero or more characters). The selected **Field** data type must be [keyword](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type), [text](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#text-field-type), or [wildcard](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type). 
+ * `matches` | `does not match` — Allows you to use wildcards in **Value**, such as `C:\\path\\*\\app.exe`. Available wildcards are `?` (match one character) and `*` (match zero or more characters). The selected **Field** data type must be [keyword](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type), [text](elasticsearch://reference/elasticsearch/mapping-reference/text.md#text-field-type), or [wildcard](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type). ::::{note} Some characters must be escaped with a backslash, such as `\\` for a literal backslash, `\*` for an asterisk, and `\?` for a question mark. Windows paths must be divided with double backslashes (for example, `C:\\Windows\\explorer.exe`), and paths that already include double backslashes might require four backslashes for each divider. @@ -156,7 +156,7 @@ Additionally, to add an Endpoint exception to an endpoint protection rule, there ::::{important} -[Binary fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/binary.md) are not supported in detection rule exceptions. +[Binary fields](elasticsearch://reference/elasticsearch/mapping-reference/binary.md) are not supported in detection rule exceptions. :::: diff --git a/solutions/security/detect-and-alert/create-detection-rule.md b/solutions/security/detect-and-alert/create-detection-rule.md index 96b6bfa6b..823c015a4 100644 --- a/solutions/security/detect-and-alert/create-detection-rule.md +++ b/solutions/security/detect-and-alert/create-detection-rule.md @@ -212,10 +212,10 @@ To create or edit {{ml}} rules, you must have the [appropriate license](https:// 4. To create an event correlation rule using EQL, select **Event Correlation**, then: 1. Define which {{es}} indices or data view the rule searches when querying for events. - 2. Write an [EQL query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md) that searches for matching events or a series of matching events. + 2. Write an [EQL query](elasticsearch://reference/query-languages/eql-syntax.md) that searches for matching events or a series of matching events. ::::{tip} - To find events that are missing in a sequence, use the [missing events](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/eql-syntax.md#eql-missing-events) syntax. + To find events that are missing in a sequence, use the [missing events](elasticsearch://reference/query-languages/eql-syntax.md#eql-missing-events) syntax. :::: @@ -251,7 +251,7 @@ To create or edit {{ml}} rules, you must have the [appropriate license](https:// 5. (Optional) Click the EQL settings icon (![EQL settings icon](../../../images/security-eql-settings-icon.png "")) to configure additional fields used by [EQL search](/explore-analyze/query-filter/languages/eql.md#specify-a-timestamp-or-event-category-field): - * **Event category field**: Contains the event classification, such as `process`, `file`, or `network`. This field is typically mapped as a field type in the [keyword family](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md). Defaults to the `event.category` ECS field. + * **Event category field**: Contains the event classification, such as `process`, `file`, or `network`. This field is typically mapped as a field type in the [keyword family](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md). Defaults to the `event.category` ECS field. 
* **Tiebreaker field**: Sets a secondary field for sorting events (in ascending, lexicographic order) if they have the same timestamp. * **Timestamp field**: Contains the event timestamp used for sorting a sequence of events. This is different from the **Timestamp override** advanced setting, which is used for querying events within a range. Defaults to the `@timestamp` ECS field. @@ -444,7 +444,7 @@ To create an {{esql}} rule: #### Aggregating query [esql-agg-query] -Aggregating queries use [`STATS...BY`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-functions-operators.md#esql-agg-functions) functions to aggregate source event data. Alerts generated by a rule with an aggregating query only contain the fields that the {{esql}} query returns and any new fields that the query creates. +Aggregating queries use [`STATS...BY`](elasticsearch://reference/query-languages/esql/esql-functions-operators.md#esql-agg-functions) functions to aggregate source event data. Alerts generated by a rule with an aggregating query only contain the fields that the {{esql}} query returns and any new fields that the query creates. ::::{note} A *new field* is a field that doesn’t exist in the query’s source index and is instead created when the rule runs. You can access new fields in the details of any alerts that are generated by the rule. For example, if you use the `STATS...BY` function to create a column with aggregated values, the column is created when the rule runs and is added as a new field to any alerts that are generated by the rule. @@ -476,7 +476,7 @@ Rules that use aggregating queries might create duplicate alerts. This can happe Non-aggregating queries don’t use `STATS...BY` functions and don’t aggregate source event data. Alerts generated by a non-aggregating query contain source event fields that the query returns, new fields the query creates, and all other fields in the source event document. ::::{note} -A *new field* is a field that doesn’t exist in the query’s source index and is instead created when the rule runs. You can access new fields in the details of any alerts that are generated by the rule. For example, if you use the [`EVAL`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-eval) command to append new columns with calculated values, the columns are created when the rule runs, and are added as new fields to any alerts generated by the rule. +A *new field* is a field that doesn’t exist in the query’s source index and is instead created when the rule runs. You can access new fields in the details of any alerts that are generated by the rule. For example, if you use the [`EVAL`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-eval) command to append new columns with calculated values, the columns are created when the rule runs, and are added as new fields to any alerts generated by the rule. :::: @@ -505,7 +505,7 @@ FROM logs-* METADATA _id, _index, _version When those metadata fields are provided, unique alert IDs are created for each alert generated by the query. -When developing the query, make sure you don’t [`DROP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-drop) or filter out the `_id`, `_index`, or `_version` metadata fields. +When developing the query, make sure you don’t [`DROP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-drop) or filter out the `_id`, `_index`, or `_version` metadata fields. 
Here is an example of a query that fails to deduplicate alerts. It uses the `DROP` command to omit the `_id` property from the results table: @@ -530,11 +530,11 @@ FROM logs-* METADATA _id, _index, _version When writing your query, consider the following: -* The [`LIMIT`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-limit) command specifies the maximum number of rows an {{esql}} query returns and the maximum number of alerts created per rule execution. Similarly, a detection rule’s **Max alerts per run** setting specifies the maximum number of alerts it can create every time it runs. +* The [`LIMIT`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-limit) command specifies the maximum number of rows an {{esql}} query returns and the maximum number of alerts created per rule execution. Similarly, a detection rule’s **Max alerts per run** setting specifies the maximum number of alerts it can create every time it runs. If the `LIMIT` value and **Max alerts per run** value are different, the rule uses the lower value to determine the maximum number of alerts the rule generates. -* When writing an aggregating query, use the [`STATS...BY`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-stats-by) command with fields that you want to search and filter for after alerts are created. For example, using the `host.name`, `user.name`, `process.name` fields with the `BY` operator of the `STATS...BY` command returns these fields in alert documents, and allows you to search and filter for them from the Alerts table. +* When writing an aggregating query, use the [`STATS...BY`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-stats-by) command with fields that you want to search and filter for after alerts are created. For example, using the `host.name`, `user.name`, `process.name` fields with the `BY` operator of the `STATS...BY` command returns these fields in alert documents, and allows you to search and filter for them from the Alerts table. * When configuring alert suppression on a non-aggregating query, we recommend sorting results by ascending `@timestamp` order. Doing so ensures that alerts are properly suppressed, especially if the number of alerts generated is higher than the **Max alerts per run** value. diff --git a/solutions/security/detect-and-alert/create-manage-shared-exception-lists.md b/solutions/security/detect-and-alert/create-manage-shared-exception-lists.md index 2bf7daa40..35604bd8f 100644 --- a/solutions/security/detect-and-alert/create-manage-shared-exception-lists.md +++ b/solutions/security/detect-and-alert/create-manage-shared-exception-lists.md @@ -60,7 +60,7 @@ Add exception items: :::: - * `matches` | `does not match` — Allows you to use wildcards in **Value**, such as `C:\path\*\app.exe`. Available wildcards are `?` (match one character) and `*` (match zero or more characters). The selected **Field** data type must be [keyword](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type), [text](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#text-field-type), or [wildcard](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type). + * `matches` | `does not match` — Allows you to use wildcards in **Value**, such as `C:\path\*\app.exe`. 
Available wildcards are `?` (match one character) and `*` (match zero or more characters). The selected **Field** data type must be [keyword](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#keyword-field-type), [text](elasticsearch://reference/elasticsearch/mapping-reference/text.md#text-field-type), or [wildcard](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md#wildcard-field-type). ::::{important} Using wildcards can impact performance. To create a more efficient exception using wildcards, use multiple conditions and make them as specific as possible. For example, adding conditions using `process.name` or `file.name` can help limit the scope of wildcard matching. diff --git a/solutions/security/detect-and-alert/create-manage-value-lists.md b/solutions/security/detect-and-alert/create-manage-value-lists.md index f0b7ca564..647fd3cdf 100644 --- a/solutions/security/detect-and-alert/create-manage-value-lists.md +++ b/solutions/security/detect-and-alert/create-manage-value-lists.md @@ -8,7 +8,7 @@ mapped_urls: Value lists hold multiple values of the same Elasticsearch data type, such as IP addresses, which are used to determine when an exception prevents an alert from being generated. You can use value lists to define exceptions for detection rules; however, you cannot use value lists to define endpoint rule exceptions. -Value lists are lists of items with the same {{es}} [data type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). You can create value lists with these types: +Value lists are lists of items with the same {{es}} [data type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). You can create value lists with these types: * `Keywords` (many [ECS fields](asciidocalypse://docs/ecs/docs/reference/ecs-field-reference.md) are keywords) * `IP Addresses` diff --git a/solutions/security/detect-and-alert/detections-requirements.md b/solutions/security/detect-and-alert/detections-requirements.md index 9166bf9ac..95165600f 100644 --- a/solutions/security/detect-and-alert/detections-requirements.md +++ b/solutions/security/detect-and-alert/detections-requirements.md @@ -36,7 +36,7 @@ Additionally, there are some [advanced settings](/solutions/security/detect-and- These steps are only required for **self-managed** deployments: * HTTPS must be configured for communication between [{{es}} and {{kib}}](/deploy-manage/security/set-up-basic-security-plus-https.md#encrypt-kibana-http). -* In the `elasticsearch.yml` configuration file, set the `xpack.security.enabled` setting to `true`. For more information, refer to [Configuring {{es}}](/deploy-manage/deploy/self-managed/configure-elasticsearch.md) and [Security settings in {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md). +* In the `elasticsearch.yml` configuration file, set the `xpack.security.enabled` setting to `true`. For more information, refer to [Configuring {{es}}](/deploy-manage/deploy/self-managed/configure-elasticsearch.md) and [Security settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md). * In the `kibana.yml` [configuration file](/deploy-manage/deploy/self-managed/configure.md), add the `xpack.encryptedSavedObjects.encryptionKey` setting with any alphanumeric value of at least 32 characters. 
For example: `xpack.encryptedSavedObjects.encryptionKey: 'fhjskloppd678ehkdfdlliverpoolfcr'` diff --git a/solutions/security/detect-and-alert/launch-timeline-from-investigation-guides.md b/solutions/security/detect-and-alert/launch-timeline-from-investigation-guides.md index 01d1531c4..8ef0b7883 100644 --- a/solutions/security/detect-and-alert/launch-timeline-from-investigation-guides.md +++ b/solutions/security/detect-and-alert/launch-timeline-from-investigation-guides.md @@ -103,7 +103,7 @@ The following syntax defines a query button in an interactive investigation guid | `label` | Identifying text on the button. | | `description` | Additional text included with the button. | | `providers` | A two-level nested array that defines the query to run in Timeline. Similar to the structure of queries in Timeline, items in the outer level are joined by an `OR` relationship, and items in the inner level are joined by an `AND` relationship.

Each item in `providers` corresponds to a filter created in the query builder UI and is defined by these attributes:

* `field`: The name of the field to query.
* `excluded`: Whether the query result is excluded (such as **is not one of**) or included (**is one of**).
* `queryType`: The query type used to filter events, based on the filter’s operator. For example, `phrase` or `range`.
* `value`: The value to search for. Either a hard-coded literal value, or the name of an alert field (in double curly brackets) whose value you want to use as a query parameter.
* `valueType`: The data type of `value`, such as `string` or `boolean`.
| -| `relativeFrom`, `relativeTo` | (Optional) The start and end, respectively, of the relative time range for the query. Times are relative to the alert’s creation time, represented as `now` in [date math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md#date-math) format. For example, selecting **Last 15 minutes** in the query builder form creates the syntax `"relativeFrom": "now-15m", "relativeTo": "now"`. | +| `relativeFrom`, `relativeTo` | (Optional) The start and end, respectively, of the relative time range for the query. Times are relative to the alert’s creation time, represented as `now` in [date math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md#date-math) format. For example, selecting **Last 15 minutes** in the query builder form creates the syntax `"relativeFrom": "now-15m", "relativeTo": "now"`. | ::::{note} Some characters must be escaped with a backslash, such as `\"` for a quotation mark and `\\` for a literal backslash. Divide Windows paths with double backslashes (for example, `C:\\Windows\\explorer.exe`), and paths that already include double backslashes might require four backslashes for each divider. A clickable error icon (![Error icon](../../../images/security-ig-error-icon.png "")) displays below the Markdown editor if there are any syntax errors. diff --git a/solutions/security/detect-and-alert/using-logsdb-index-mode-with-elastic-security.md b/solutions/security/detect-and-alert/using-logsdb-index-mode-with-elastic-security.md index 65a15f010..95fe8d2d2 100644 --- a/solutions/security/detect-and-alert/using-logsdb-index-mode-with-elastic-security.md +++ b/solutions/security/detect-and-alert/using-logsdb-index-mode-with-elastic-security.md @@ -13,34 +13,34 @@ mapped_urls: % - [x] ./raw-migrated-files/security-docs/security/detections-logsdb-index-mode-impact.md % - [ ] ./raw-migrated-files/docs-content/serverless/detections-logsdb-index-mode-impact.md -::::{note} -To use the [synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source) feature, you must have the appropriate subscription. Refer to the subscription page for [Elastic Cloud](https://www.elastic.co/subscriptions/cloud) and [Elastic Stack/self-managed](https://www.elastic.co/subscriptions) for the breakdown of available features and their associated subscription tiers. +::::{note} +To use the [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source) feature, you must have the appropriate subscription. Refer to the subscription page for [Elastic Cloud](https://www.elastic.co/subscriptions/cloud) and [Elastic Stack/self-managed](https://www.elastic.co/subscriptions) for the breakdown of available features and their associated subscription tiers. :::: This topic explains the impact of using logsdb index mode with {{elastic-sec}}. -With logsdb index mode, the original `_source` field is not stored in the index but can be reconstructed using [synthetic `_source`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). +With logsdb index mode, the original `_source` field is not stored in the index but can be reconstructed using [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source). 
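As a brief aside, logsdb index mode is selected at index creation time through the `index.mode` setting; a minimal sketch (the index name is hypothetical):

```console
PUT my-logsdb-index
{
  "settings": {
    "index.mode": "logsdb"
  }
}
```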
-When the `_source` is reconstructed, [modifications](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) are possible. Therefore, there could be a mismatch between users' expectations and how fields are formatted. +When the `_source` is reconstructed, [modifications](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) are possible. Therefore, there could be a mismatch between users' expectations and how fields are formatted. Continue reading to find out how this affects specific {{elastic-sec}} components. -::::{note} +::::{note} Logsdb is not recommended for {{elastic-sec}} at this time. Users must fully understand and accept the documented changes to detection alert documents (see below), and ensure their deployment has excess hot data tier CPU resource capacity before enabling logsdb mode, as logsdb mode requires additional CPU resources during the ingest/indexing process. Enabling logsdb without sufficient hot data tier CPU may result in data ingestion backups and/or security detection rule timeouts and errors. :::: -## Alerts [logsdb-alerts] +## Alerts [logsdb-alerts] When alerts are generated, the `_source` event is copied into the alert to retain the original data. When the logsdb index mode is applied, the `_source` event stored in the alert is reconstructed using synthetic `_source`. If you’re switching to use logsdb index mode, the `_source` field stored in the alert might look different in certain situations: -* [Arrays can be reconstructed differently or deduplicated](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-leaf-arrays) -* [Field names](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-field-names) -* `geo_point` data fields (refer to [Representation of ranges](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-ranges) and [Reduced precision of `geo_point` values](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-precision-loss-for-point-types) for more information) +* [Arrays can be reconstructed differently or deduplicated](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-leaf-arrays) +* [Field names](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-field-names) +* `geo_point` data fields (refer to [Representation of ranges](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications-ranges) and [Reduced precision of `geo_point` values](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-precision-loss-for-point-types) for more information) Alerts generated by the following rule types could be affected: @@ -51,7 +51,7 @@ Alerts generated by the following rule types could be affected: Alerts that are generated by threshold, {{ml}}, and event correlation sequence rules are not affected since they do not contain copies of the original source. 
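To make the reconstruction differences concrete, here is a minimal illustration (index name and values are invented) of how synthetic `_source` returns a `keyword` array sorted and deduplicated:

```console
PUT logsdb-example
{
  "settings": { "index.mode": "logsdb" },
  "mappings": {
    "properties": {
      "tags": { "type": "keyword" }
    }
  }
}

PUT logsdb-example/_doc/1?refresh=true
{ "@timestamp": "2025-01-01T00:00:00Z", "tags": ["windows", "alert", "windows"] }

GET logsdb-example/_doc/1
```

The returned `_source` contains `"tags": ["alert", "windows"]` rather than the original `["windows", "alert", "windows"]`, which is exactly the kind of difference that can surface in copied alert documents.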
-## Rule actions [logsdb-rule-actions] +## Rule actions [logsdb-rule-actions] While we do not recommend using `_source` for actions, in cases where the action relies on the `_source`, the same limitations and changes apply. @@ -60,7 +60,7 @@ If you send alert notifications by enabling [actions](/explore-analyze/alerts-ca We recommend checking and adjusting the rule actions using `_source` before switching to logsdb index mode. -## Runtime fields [logsdb-runtime-fields] +## Runtime fields [logsdb-runtime-fields] Runtime fields that reference `_source` may be affected. Some runtime fields might not work and need to be adjusted. For example, if an event was indexed with the value of `agent.name` in the dot-notation form, it will be returned in the nested form and might not work. diff --git a/solutions/security/get-started/configure-advanced-settings.md b/solutions/security/get-started/configure-advanced-settings.md index 1e5f310ff..352c8da79 100644 --- a/solutions/security/get-started/configure-advanced-settings.md +++ b/solutions/security/get-started/configure-advanced-settings.md @@ -141,7 +141,7 @@ These settings determine the default time interval and refresh rate {{elastic-se * `securitySolution:refreshIntervalDefaults`: Default refresh rate ::::{note} -Refer to [Date Math](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/common-options.md) for information about the syntax. The UI [time filter](/explore-analyze/query-filter/filtering.md) overrides the default values. +Refer to [Date Math](elasticsearch://reference/elasticsearch/rest-apis/common-options.md) for information about the syntax. The UI [time filter](/explore-analyze/query-filter/filtering.md) overrides the default values. :::: diff --git a/solutions/security/get-started/elastic-security-requirements.md b/solutions/security/get-started/elastic-security-requirements.md index e48bdb274..fa095fab0 100644 --- a/solutions/security/get-started/elastic-security-requirements.md +++ b/solutions/security/get-started/elastic-security-requirements.md @@ -22,21 +22,21 @@ For information about installing and managing the {{stack}} yourself, see [Insta The [Support Matrix](https://www.elastic.co/support/matrix) page lists officially supported operating systems, platforms, and browsers on which {{es}}, {{kib}}, {{beats}}, and Elastic Endpoint have been tested. -## Node role requirements [node-role-requirements] +## Node role requirements [node-role-requirements] -To use Elastic Security, at least one node in your Elasticsearch cluster must have the [`transform` role](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/transforms-settings.md). Nodes are automatically given this role when they’re created, so changes are not required if default role settings remain the same. This applies to on-premise and cloud deployments. +To use Elastic Security, at least one node in your Elasticsearch cluster must have the [`transform` role](elasticsearch://reference/elasticsearch/configuration-reference/transforms-settings.md). Nodes are automatically given this role when they’re created, so changes are not required if default role settings remain the same. This applies to on-premise and cloud deployments. -Changes might be required if your nodes have customized roles. When updating node roles, nodes are only assigned the roles you specify, and default roles are removed. 
If you need to reassign the `transform` role to a node, [create a dedicated transform node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#transform-node). +Changes might be required if your nodes have customized roles. When updating node roles, nodes are only assigned the roles you specify, and default roles are removed. If you need to reassign the `transform` role to a node, [create a dedicated transform node](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#transform-node). -## Space and index privileges [_space_and_index_privileges] +## Space and index privileges [_space_and_index_privileges] To use {{elastic-sec}}, your role must have at least: * `Read` privilege for the `Security` feature in the [space](/deploy-manage/manage-spaces.md). This grants you `Read` access to all features in {{elastic-sec}} except cases. You need additional [minimum privileges](/solutions/security/investigate/cases-requirements.md) to use cases. * `Read` and `view_index_metadata` privileges for all {{elastic-sec}} indices, such as `filebeat-*`, `packetbeat-*`, `logs-*`, and `endgame-*` indices. -::::{note} +::::{note} [*Configure advanced settings*](/solutions/security/get-started/configure-advanced-settings.md) describes how to modify {{elastic-sec}} indices. :::: @@ -44,7 +44,7 @@ To use {{elastic-sec}}, your role must have at least: For more information about index privileges, refer to [{{es}} security privileges](/deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md). -## Feature-specific requirements [_feature_specific_requirements] +## Feature-specific requirements [_feature_specific_requirements] There are some additional requirements for specific features: @@ -56,7 +56,7 @@ There are some additional requirements for specific features: * [Configure network map data](/solutions/security/explore/configure-network-map-data.md) -## License requirements [_license_requirements] +## License requirements [_license_requirements] All features are available as part of the free Basic plan **except**: @@ -67,22 +67,22 @@ All features are available as part of the free Basic plan **except**: [Elastic Stack subscriptions](https://www.elastic.co/subscriptions) lists the required subscription plans for all features. -## Advanced configuration and UI options [_advanced_configuration_and_ui_options] +## Advanced configuration and UI options [_advanced_configuration_and_ui_options] [*Configure advanced settings*](/solutions/security/get-started/configure-advanced-settings.md) describes how to modify advanced settings, such as the {{elastic-sec}} indices, default time intervals used in filters, and IP reputation links. -## Third-party collectors mapped to ECS [_third_party_collectors_mapped_to_ecs] +## Third-party collectors mapped to ECS [_third_party_collectors_mapped_to_ecs] The [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/current) defines a common set of fields to be used for storing event data in Elasticsearch. ECS helps users normalize their event data to better analyze, visualize, and correlate the data represented in their events. {{elastic-sec}} can ingest and normalize events from any ECS-compliant data source. -::::{important} +::::{important} {{elastic-sec}} requires [ECS-compliant data](https://www.elastic.co/guide/en/ecs/current). If you use third-party data collectors to ship data to {{es}}, the data must be mapped to ECS. 
[*Elastic Security ECS field reference*](asciidocalypse://docs/docs-content/docs/reference/security/fields-and-object-schemas/siem-field-reference.md) lists ECS fields used in {{elastic-sec}}. :::: -## Cross-cluster searches [_cross_cluster_searches] +## Cross-cluster searches [_cross_cluster_searches] For information on how to perform cross-cluster searches on {{elastic-sec}} indices, see: diff --git a/solutions/security/investigate/timeline.md b/solutions/security/investigate/timeline.md index fa342f839..03cc9d089 100644 --- a/solutions/security/investigate/timeline.md +++ b/solutions/security/investigate/timeline.md @@ -164,7 +164,7 @@ To learn more about cases, refer to [Cases](/solutions/security/investigate/case You can view, duplicate, export, delete, and create templates from existing Timelines: -1. Find **Timelines** in the navigation menu or use the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). +1. Find **Timelines** in the navigation menu or use the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). 2. Click the **All actions** menu in the desired row, then select an action: * **Create template from timeline** (refer to [Timeline templates](/solutions/security/investigate/timeline-templates.md)) @@ -246,7 +246,7 @@ You can use {{esql}} in Timeline by opening the **{{esql}}** tab. From there, yo * Finally, it keeps the default Timeline fields (`@timestamp`, `message`, `event.category`, `event.action`, `host.name`, `source.ip`, `destination.ip`, and `user.name`) in the output. ::::{tip} - When querying indices that tend to be large (for example, `logs-*`), performance can be impacted by the number of fields returned in the output. To optimize performance, we recommend using the [`KEEP`](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/esql-commands.md#esql-keep) command to specify fields that you want returned. For example, add the clause `KEEP @timestamp, user.name` to the end of your query to specify that you only want the `@timestamp` and `user.name` fields returned. + When querying indices that tend to be large (for example, `logs-*`), performance can be impacted by the number of fields returned in the output. To optimize performance, we recommend using the [`KEEP`](elasticsearch://reference/query-languages/esql/esql-commands.md#esql-keep) command to specify fields that you want returned. For example, add the clause `KEEP @timestamp, user.name` to the end of your query to specify that you only want the `@timestamp` and `user.name` fields returned. :::: diff --git a/troubleshoot/deployments/cloud-on-k8s/troubleshooting-methods.md b/troubleshoot/deployments/cloud-on-k8s/troubleshooting-methods.md index a8c2533a2..768f1293e 100644 --- a/troubleshoot/deployments/cloud-on-k8s/troubleshooting-methods.md +++ b/troubleshoot/deployments/cloud-on-k8s/troubleshooting-methods.md @@ -297,7 +297,7 @@ This can also be done for Kibana and APM Server. ## Suspend Elasticsearch [k8s-suspend-elasticsearch] -In exceptional cases, you might need to suspend the Elasticsearch process while using `kubectl exec` (as in the [previous section](#k8s-exec-into-containers)) to troubleshoot. One such example where Elasticsearch has to be stopped are the unsafe operations on Elasticsearch nodes that can be executed with the [elasticsearch-node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/node-tool.md) tool. 
+In exceptional cases, you might need to suspend the Elasticsearch process while using `kubectl exec` (as in the [previous section](#k8s-exec-into-containers)) to troubleshoot. One such example where Elasticsearch has to be stopped is running the unsafe operations on Elasticsearch nodes that can be executed with the [elasticsearch-node](elasticsearch://reference/elasticsearch/command-line-tools/node-tool.md) tool. To suspend an Elasticsearch node, while keeping the corresponding Pod running, you can annotate the Elasticsearch resource with the `eck.k8s.elastic.co/suspend` annotation. The value should be a comma-separated list of the names of the Pods whose Elasticsearch process you want to suspend. diff --git a/troubleshoot/elasticsearch/allow-all-cluster-allocation.md b/troubleshoot/elasticsearch/allow-all-cluster-allocation.md index 9f959ab3f..be7e1b6d7 100644 --- a/troubleshoot/elasticsearch/allow-all-cluster-allocation.md +++ b/troubleshoot/elasticsearch/allow-all-cluster-allocation.md @@ -1,12 +1,12 @@ --- navigation_title: Data allocation -mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/allow-all-cluster-allocation.html --- # Allow Elasticsearch to allocate the data in the system [allow-all-cluster-allocation] -The allocation of data in an {{es}} deployment can be controlled using the [enable cluster allocation configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable). In certain circumstances users might want to temporarily disable or restrict the allocation of data in the system. +The allocation of data in an {{es}} deployment can be controlled using the [enable cluster allocation configuration](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable). In certain circumstances users might want to temporarily disable or restrict the allocation of data in the system. Forgetting to re-allow all data allocations can lead to unassigned shards. @@ -15,7 +15,7 @@ In order to (re)allow all data to be allocated follow these steps: :::::::{tab-set} ::::::{tab-item} {{ech}} -In order to get the shards assigned we’ll need to change the value of the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) that restricts the assignemnt of the shards to allow all shards to be allocated. +In order to get the shards assigned we’ll need to change the value of the [configuration](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) that restricts the assignment of the shards to allow all shards to be allocated. We’ll achieve this by inspecting the system-wide `cluster.routing.allocation.enable` [cluster setting](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-get-settings) and changing the configured value to `all`. @@ -54,7 +54,7 @@ We’ll achieve this by inspecting the system-wide `cluster.routing.allocation.e 1. Represents the current configured value that controls if data is partially or fully allowed to be allocated in the system. -5.
[Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) value to allow all the data in the system to be fully allocated: +5. [Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) the [configuration](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) value to allow all the data in the system to be fully allocated: ```console PUT _cluster/settings @@ -69,7 +69,7 @@ We’ll achieve this by inspecting the system-wide `cluster.routing.allocation.e :::::: ::::::{tab-item} Self-managed -In order to get the shards assigned we’ll need to change the value of the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) that restricts the assignemnt of the shards to allow all shards to be allocated. +In order to get the shards assigned we’ll need to change the value of the [configuration](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) that restricts the assignment of the shards to allow all shards to be allocated. We’ll achieve this by inspecting the system-wide `cluster.routing.allocation.enable` [cluster setting](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-get-settings) and changing the configured value to `all`. @@ -92,7 +92,7 @@ We’ll achieve this by inspecting the system-wide `cluster.routing.allocation.e 1. Represents the current configured value that controls if data is partially or fully allowed to be allocated in the system. -2. [Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) value to allow all the data in the system to be fully allocated: +2. [Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings) the [configuration](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-allocation-enable) value to allow all the data in the system to be fully allocated: ```console PUT _cluster/settings diff --git a/troubleshoot/elasticsearch/allow-all-index-allocation.md b/troubleshoot/elasticsearch/allow-all-index-allocation.md index 41a8d97c1..59bcdf043 100644 --- a/troubleshoot/elasticsearch/allow-all-index-allocation.md +++ b/troubleshoot/elasticsearch/allow-all-index-allocation.md @@ -9,7 +9,7 @@ mapped_pages: # Allow Elasticsearch to allocate the index [allow-all-index-allocation] -The allocation of data can be controlled using the [enable allocation configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting). In certain circumstances users might want to temporarily disable or restrict the allocation of data.
+The allocation of data can be controlled using the [enable allocation configuration](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting). In certain circumstances users might want to temporarily disable or restrict the allocation of data. Forgetting to re-allow all data allocation can lead to unassigned shards. @@ -18,7 +18,7 @@ In order to (re)allow all data to be allocated follow these steps: :::::::{tab-set} ::::::{tab-item} {{ech}} -In order to get the shards assigned we’ll need to change the value of the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) that restricts the assignemnt of the shards to `all`. +In order to get the shards assigned we’ll need to change the value of the [configuration](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) that restricts the assignment of the shards to `all`. **Use {{kib}}** @@ -56,7 +56,7 @@ In order to get the shards assigned we’ll need to change the value of the [con 1. Represents the current configured value that controls if the index is allowed to be partially or totally allocated. -5. [Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) value to allow the index to be fully allocated: +5. [Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) the [configuration](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) value to allow the index to be fully allocated: ```console PUT /my-index-000001/_settings @@ -71,7 +71,7 @@ In order to get the shards assigned we’ll need to change the value of the [con :::::: ::::::{tab-item} Self-managed -In order to get the shards assigned we’ll need to change the value of the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) that restricts the assignemnt of the shards to `all`. +In order to get the shards assigned we’ll need to change the value of the [configuration](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) that restricts the assignment of the shards to `all`. 1. Inspect the `index.routing.allocation.enable` [index setting](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-settings) for the index with unassigned shards: @@ -93,7 +93,7 @@ In order to get the shards assigned we’ll need to change the value of the [con 1. Represents the current configured value that controls if the index is allowed to be partially or totally allocated. -2. [Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) the [configuration](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) value to allow the index to be fully allocated: +2.
[Change](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) the [configuration](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-routing-allocation-enable-setting) value to allow the index to be fully allocated: ```console PUT /my-index-000001/_settings diff --git a/troubleshoot/elasticsearch/circuit-breaker-errors.md b/troubleshoot/elasticsearch/circuit-breaker-errors.md index 5b9d379dd..6d65abd6b 100644 --- a/troubleshoot/elasticsearch/circuit-breaker-errors.md +++ b/troubleshoot/elasticsearch/circuit-breaker-errors.md @@ -5,9 +5,9 @@ mapped_pages: # Circuit breaker errors [circuit-breaker-errors] -{{es}} uses [circuit breakers](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md) to prevent nodes from running out of JVM heap memory. If Elasticsearch estimates an operation would exceed a circuit breaker, it stops the operation and returns an error. +{{es}} uses [circuit breakers](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md) to prevent nodes from running out of JVM heap memory. If Elasticsearch estimates an operation would exceed a circuit breaker, it stops the operation and returns an error. -By default, the [parent circuit breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker) triggers at 95% JVM memory usage. To prevent errors, we recommend taking steps to reduce memory pressure if usage consistently exceeds 85%. +By default, the [parent circuit breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#parent-circuit-breaker) triggers at 95% JVM memory usage. To prevent errors, we recommend taking steps to reduce memory pressure if usage consistently exceeds 85%. See [this video](https://www.youtube.com/watch?v=k3wYlRVbMSw) for a walkthrough of diagnosing circuit breaker errors. @@ -68,7 +68,7 @@ High JVM memory pressure often causes circuit breaker errors. See [High JVM memo **Avoid using fielddata on `text` fields** -For high-cardinality `text` fields, fielddata can use a large amount of JVM memory. To avoid this, {{es}} disables fielddata on `text` fields by default. If you’ve enabled fielddata and triggered the [fielddata circuit breaker](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker), consider disabling it and using a `keyword` field instead. See [`fielddata` mapping parameter](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/text.md#fielddata-mapping-param). +For high-cardinality `text` fields, fielddata can use a large amount of JVM memory. To avoid this, {{es}} disables fielddata on `text` fields by default. If you’ve enabled fielddata and triggered the [fielddata circuit breaker](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md#fielddata-circuit-breaker), consider disabling it and using a `keyword` field instead. See [`fielddata` mapping parameter](elasticsearch://reference/elasticsearch/mapping-reference/text.md#fielddata-mapping-param). 
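For instance, rather than enabling fielddata on a `text` field, you can index the same value into a `keyword` multi-field and aggregate on that instead — a minimal sketch (the index and field names are hypothetical):

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "status": {
        "type": "text",
        "fields": {
          "keyword": { "type": "keyword" }
        }
      }
    }
  }
}

POST my-index/_search
{
  "size": 0,
  "aggs": {
    "status_counts": {
      "terms": { "field": "status.keyword" }
    }
  }
}
```

Aggregating on the `keyword` sub-field uses doc values on disk instead of loading fielddata into the JVM heap.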
**Clear the fielddata cache** diff --git a/troubleshoot/elasticsearch/decrease-disk-usage-data-node.md b/troubleshoot/elasticsearch/decrease-disk-usage-data-node.md index 82cea7c2f..0d30caa5d 100644 --- a/troubleshoot/elasticsearch/decrease-disk-usage-data-node.md +++ b/troubleshoot/elasticsearch/decrease-disk-usage-data-node.md @@ -45,7 +45,7 @@ Reducing the replicas of an index can potentially reduce search throughput and d ::::::{tab-item} Self-managed In order to estimate how many replicas need to be removed, first you need to estimate the amount of disk space that needs to be released. -1. First, retrieve the relevant disk thresholds that will indicate how much space should be released. The relevant thresholds are the [high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) for all the tiers apart from the frozen one and the [frozen flood stage watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage-frozen) for the frozen tier. The following example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: +1. First, retrieve the relevant disk thresholds that will indicate how much space should be released. The relevant thresholds are the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) for all the tiers apart from the frozen one and the [frozen flood stage watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage-frozen) for the frozen tier. The following example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: ```console GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* @@ -72,7 +72,7 @@ In order to estimate how many replicas need to be removed, first you need to est } ``` - The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have more than 150GB available, read more on how this threshold works [here](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). + The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have more than 150GB available. Read more on how this threshold works [here](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). 2. The next step is to find out the current disk usage; this will indicate how much space should be freed. For simplicity, our example has one node, but you can apply the same for every node over the relevant threshold.
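As a quick way to see per-node disk usage before doing the replica math, you can sort the cat allocation output by disk percentage (a sketch; `v` adds column headers and `s` sorts):

```console
GET _cat/allocation?v&s=disk.percent:desc
```

The `disk.used`, `disk.avail`, and `disk.percent` columns show how far each node is from the watermark retrieved in step 1.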
diff --git a/troubleshoot/elasticsearch/diagnostic.md b/troubleshoot/elasticsearch/diagnostic.md index 027906279..dd614c221 100644 --- a/troubleshoot/elasticsearch/diagnostic.md +++ b/troubleshoot/elasticsearch/diagnostic.md @@ -73,9 +73,9 @@ To capture an {{es}} diagnostic: You can execute the script in three [modes](https://github.com/elastic/support-diagnostics#diagnostic-types): - * `local` (default, recommended): Polls the [{{es}} API](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md), gathers operating system info, and captures cluster and GC logs. + * `local` (default, recommended): Polls the [{{es}} API](elasticsearch://reference/elasticsearch/rest-apis/index.md), gathers operating system info, and captures cluster and GC logs. * `remote`: Establishes an ssh session to the applicable target server to pull the same information as `local`. - * `api`: Polls the [{{es}} API](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/index.md). All other data must be collected manually. + * `api`: Polls the [{{es}} API](elasticsearch://reference/elasticsearch/rest-apis/index.md). All other data must be collected manually. :::: diff --git a/troubleshoot/elasticsearch/discovery-troubleshooting.md b/troubleshoot/elasticsearch/discovery-troubleshooting.md index d4859f54f..0e92e132d 100644 --- a/troubleshoot/elasticsearch/discovery-troubleshooting.md +++ b/troubleshoot/elasticsearch/discovery-troubleshooting.md @@ -17,7 +17,7 @@ If the cluster has no elected master node for more than a few seconds, the maste The following sections describe some common discovery and election problems. -## No master is elected [discovery-no-master] +## No master is elected [discovery-no-master] When a node wins the master election, it logs a message containing `elected-as-master` and all nodes log a message containing `master node changed` identifying the new elected master node. @@ -33,48 +33,48 @@ If the logs or the health report indicate that {{es}} *has* discovered a possibl If the logs suggest that discovery or master elections are failing due to timeouts or network-related issues then narrow down the problem as follows. -* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem. +* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem. * VM pauses also affect other processes on the same host. 
A VM pause also typically causes a discontinuity in the system clock, which {{es}} will report in its logs. If you see evidence of other processes pausing at the same time, or unexpected clock discontinuities, investigate the infrastructure on which you are running {{es}}. * Packet captures will reveal system-level and network-level faults, especially if you capture the network traffic simultaneously at all relevant nodes and analyse it alongside the {{es}} logs from those nodes. You should be able to observe any retransmissions, packet loss, or other delays on the connections between the nodes. * Long waits for particular threads to be available can be identified by taking stack dumps of the main {{es}} process (for example, using `jstack`) or a profiling trace (for example, using Java Flight Recorder) in the few seconds leading up to the relevant log message. The [Nodes hot threads](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads) API sometimes yields useful information, but bear in mind that this API also requires a number of `transport_worker` and `generic` threads across all the nodes in the cluster. The API may be affected by the very problem you’re trying to diagnose. `jstack` is much more reliable since it doesn’t require any JVM threads. - The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information. + The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information. -## Master is elected but unstable [discovery-master-unstable] +## Master is elected but unstable [discovery-master-unstable] When a node wins the master election, it logs a message containing `elected-as-master`. If this happens repeatedly, the elected master node is unstable. In this situation, focus on the logs from the master-eligible nodes to understand why the election winner stops being the master and triggers another election. If the logs suggest that the master is unstable due to timeouts or network-related issues then narrow down the problem as follows. -* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem. 
+* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem. * VM pauses also affect other processes on the same host. A VM pause also typically causes a discontinuity in the system clock, which {{es}} will report in its logs. If you see evidence of other processes pausing at the same time, or unexpected clock discontinuities, investigate the infrastructure on which you are running {{es}}. * Packet captures will reveal system-level and network-level faults, especially if you capture the network traffic simultaneously at all relevant nodes and analyse it alongside the {{es}} logs from those nodes. You should be able to observe any retransmissions, packet loss, or other delays on the connections between the nodes. * Long waits for particular threads to be available can be identified by taking stack dumps of the main {{es}} process (for example, using `jstack`) or a profiling trace (for example, using Java Flight Recorder) in the few seconds leading up to the relevant log message. The [Nodes hot threads](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads) API sometimes yields useful information, but bear in mind that this API also requires a number of `transport_worker` and `generic` threads across all the nodes in the cluster. The API may be affected by the very problem you’re trying to diagnose. `jstack` is much more reliable since it doesn’t require any JVM threads. - The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information. + The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information. -## Node cannot discover or join stable master [discovery-cannot-join-master] +## Node cannot discover or join stable master [discovery-cannot-join-master] If there is a stable elected master but a node can’t discover or join its cluster, it will repeatedly log messages about the problem using the `ClusterFormationFailureHelper` logger. The [Health](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-health-report) API on the affected node will also provide useful information about the situation. 
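For example, a sketch of querying just the master-stability indicator of the health report on the affected node (the `_health_report` API is available in {{es}} 8.7 and later):

```console
GET _health_report/master_is_stable
```

Its `symptom` and `diagnosis` fields usually point at the same root cause that the `ClusterFormationFailureHelper` log messages describe.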
Other log messages on the affected node and the elected master may provide additional information about the problem. If the logs suggest that the node cannot discover or join the cluster due to timeouts or network-related issues then narrow down the problem as follows. -* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem. +* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem. * VM pauses also affect other processes on the same host. A VM pause also typically causes a discontinuity in the system clock, which {{es}} will report in its logs. If you see evidence of other processes pausing at the same time, or unexpected clock discontinuities, investigate the infrastructure on which you are running {{es}}. * Packet captures will reveal system-level and network-level faults, especially if you capture the network traffic simultaneously at all relevant nodes and analyse it alongside the {{es}} logs from those nodes. You should be able to observe any retransmissions, packet loss, or other delays on the connections between the nodes. * Long waits for particular threads to be available can be identified by taking stack dumps of the main {{es}} process (for example, using `jstack`) or a profiling trace (for example, using Java Flight Recorder) in the few seconds leading up to the relevant log message. The [Nodes hot threads](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads) API sometimes yields useful information, but bear in mind that this API also requires a number of `transport_worker` and `generic` threads across all the nodes in the cluster. The API may be affected by the very problem you’re trying to diagnose. `jstack` is much more reliable since it doesn’t require any JVM threads. - The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information. + The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. 
There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information. -## Node joins cluster and leaves again [discovery-node-leaves] +## Node joins cluster and leaves again [discovery-node-leaves] If a node joins the cluster but {{es}} determines it to be faulty then it will be removed from the cluster again. See [Troubleshooting an unstable cluster](../../deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md#cluster-fault-detection-troubleshooting) for more information. diff --git a/troubleshoot/elasticsearch/elasticsearch-client-javascript-api/nodejs.md b/troubleshoot/elasticsearch/elasticsearch-client-javascript-api/nodejs.md index aff3fd9de..3dcae7ed9 100644 --- a/troubleshoot/elasticsearch/elasticsearch-client-javascript-api/nodejs.md +++ b/troubleshoot/elasticsearch/elasticsearch-client-javascript-api/nodejs.md @@ -6,7 +6,7 @@ mapped_pages: # Troubleshoot {{es}} Node.js client [timeout-best-practices] -Starting in 9.0.0, this client is configured to not time out any HTTP request by default. {{es}} will always eventually respond to any request, even if it takes several minutes. Reissuing a request that it has not responded to yet can cause performance side effects. See the [official {{es}} recommendations for HTTP clients](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#_http_client_configuration) for more information. +Starting in 9.0.0, this client is configured to not time out any HTTP request by default. {{es}} will always eventually respond to any request, even if it takes several minutes. Reissuing a request that it has not responded to yet can cause performance side effects. See the [official {{es}} recommendations for HTTP clients](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#_http_client_configuration) for more information. Prior to 9.0, this client was configured by default to operate like many HTTP client libraries do, by using a relatively short (30 second) timeout on all requests sent to {{es}}, raising a `TimeoutError` when that time period elapsed without receiving a response. diff --git a/troubleshoot/elasticsearch/fix-master-node-out-of-disk.md b/troubleshoot/elasticsearch/fix-master-node-out-of-disk.md index aeddcf168..9272f5a90 100644 --- a/troubleshoot/elasticsearch/fix-master-node-out-of-disk.md +++ b/troubleshoot/elasticsearch/fix-master-node-out-of-disk.md @@ -28,7 +28,7 @@ mapped_pages: ::::::{tab-item} Self-managed In order to increase the disk capacity of a master node, you will need to replace **all** the master nodes with master nodes of higher disk capacity. -1. First, retrieve the disk threshold that will indicate how much disk space is needed. The relevant threshold is the [high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) and can be retrieved via the following command: +1. First, retrieve the disk threshold that will indicate how much disk space is needed. 
The relevant threshold is the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) and can be retrieved via the following command: ```console GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* @@ -54,7 +54,7 @@ In order to increase the disk capacity of a master node, you will need to replac } ``` - The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have more than 150GB available, read more how this threshold works [here](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). + The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have more than 150GB available. Read more on how this threshold works [here](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). 2. The next step is to find out the current disk usage; this will let you calculate how much extra space is needed. In the following example, we show only the master nodes for readability purposes: diff --git a/troubleshoot/elasticsearch/fix-other-node-out-of-disk.md b/troubleshoot/elasticsearch/fix-other-node-out-of-disk.md index a4978042b..65bd7baa9 100644 --- a/troubleshoot/elasticsearch/fix-other-node-out-of-disk.md +++ b/troubleshoot/elasticsearch/fix-other-node-out-of-disk.md @@ -28,7 +28,7 @@ mapped_pages: ::::::{tab-item} Self-managed In order to increase the disk capacity of any other node, you will need to replace the instance that has run out of space with one of higher disk capacity. -1. First, retrieve the disk threshold that will indicate how much disk space is needed. The relevant threshold is the [high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) and can be retrieved via the following command: +1. First, retrieve the disk threshold that will indicate how much disk space is needed. The relevant threshold is the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) and can be retrieved via the following command: ```console GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* @@ -54,7 +54,7 @@ In order to increase the disk capacity of any other node, you will repla } ``` - The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have more than 150GB available, read more how this threshold works [here](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). + The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have more than 150GB available. Read more on how this threshold works [here](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). 2. 
The next step is to find out the current disk usage; this will let you calculate how much extra space is needed. In the following example, we show only a machine learning node for readability purposes: diff --git a/troubleshoot/elasticsearch/fix-watermark-errors.md b/troubleshoot/elasticsearch/fix-watermark-errors.md index 6730c66d6..9a860fc85 100644 --- a/troubleshoot/elasticsearch/fix-watermark-errors.md +++ b/troubleshoot/elasticsearch/fix-watermark-errors.md @@ -9,11 +9,11 @@ mapped_pages: # Watermark errors [fix-watermark-errors] -When a data node is critically low on disk space and has reached the [flood-stage disk usage watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage), the following error is logged: `Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block`. +When a data node is critically low on disk space and has reached the [flood-stage disk usage watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage), the following error is logged: `Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block`. -To prevent a full disk, when a node reaches this watermark, {{es}} [blocks writes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md) to any index with a shard on the node. If the block affects related system indices, {{kib}} and other {{stack}} features may become unavailable. For example, this could induce {{kib}}'s `Kibana Server is not Ready yet` [error message](/troubleshoot/kibana/error-server-not-ready.md). +To prevent a full disk, when a node reaches this watermark, {{es}} [blocks writes](elasticsearch://reference/elasticsearch/index-settings/index-block.md) to any index with a shard on the node. If the block affects related system indices, {{kib}} and other {{stack}} features may become unavailable. For example, this could induce {{kib}}'s `Kibana Server is not Ready yet` [error message](/troubleshoot/kibana/error-server-not-ready.md). -{{es}} will automatically remove the write block when the affected node’s disk usage falls below the [high disk watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). To achieve this, {{es}} attempts to rebalance some of the affected node’s shards to other nodes in the same data tier. +{{es}} will automatically remove the write block when the affected node’s disk usage falls below the [high disk watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). To achieve this, {{es}} attempts to rebalance some of the affected node’s shards to other nodes in the same data tier. ::::{tip} If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).
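To confirm whether a specific index is currently write-blocked, you can inspect its block settings directly; a minimal sketch against a hypothetical index:

```console
GET /my-index-000001/_settings/index.blocks.read_only_allow_delete
```

If the setting is absent or `false`, no flood-stage block is in place; `true` means the block described above is active.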
@@ -45,7 +45,7 @@ GET _cluster/allocation/explain ## Temporary Relief [fix-watermark-errors-temporary] -To immediately restore write operations, you can temporarily increase [disk watermarks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) and remove the [write block](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md). +To immediately restore write operations, you can temporarily increase [disk watermarks](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) and remove the [write block](elasticsearch://reference/elasticsearch/index-settings/index-block.md). ```console PUT _cluster/settings diff --git a/troubleshoot/elasticsearch/high-cpu-usage.md b/troubleshoot/elasticsearch/high-cpu-usage.md index 7bd0240b1..c04fa2c73 100644 --- a/troubleshoot/elasticsearch/high-cpu-usage.md +++ b/troubleshoot/elasticsearch/high-cpu-usage.md @@ -6,7 +6,7 @@ mapped_pages: # Symptom: High CPU usage [high-cpu-usage] -{{es}} uses [thread pools](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md) to manage CPU resources for concurrent operations. High CPU usage typically means one or more thread pools are running low. +{{es}} uses [thread pools](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md) to manage CPU resources for concurrent operations. High CPU usage typically means one or more thread pools are running low. If a thread pool is depleted, {{es}} will [reject requests](rejected-requests.md) related to the thread pool. For example, if the `search` thread pool is depleted, {{es}} will reject search requests until more threads are available. diff --git a/troubleshoot/elasticsearch/high-jvm-memory-pressure.md b/troubleshoot/elasticsearch/high-jvm-memory-pressure.md index a4b886c12..a1a79c79f 100644 --- a/troubleshoot/elasticsearch/high-jvm-memory-pressure.md +++ b/troubleshoot/elasticsearch/high-jvm-memory-pressure.md @@ -57,7 +57,7 @@ As memory usage increases, garbage collection becomes more frequent and takes lo **Capture a JVM heap dump** -To determine the exact reason for the high JVM memory pressure, capture a heap dump of the JVM while its memory usage is high, and also capture the [garbage collector logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging) covering the same time period. +To determine the exact reason for the high JVM memory pressure, capture a heap dump of the JVM while its memory usage is high, and also capture the [garbage collector logs](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging) covering the same time period. ## Reduce JVM memory pressure [reduce-jvm-memory-pressure] @@ -71,12 +71,12 @@ Every shard uses memory. In most cases, a small set of large shards uses fewer r $$$avoid-expensive-searches$$$ **Avoid expensive searches** -Expensive searches can use large amounts of memory. To better track expensive searches on your cluster, enable [slow logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md). +Expensive searches can use large amounts of memory. To better track expensive searches on your cluster, enable [slow logs](elasticsearch://reference/elasticsearch/index-settings/slow-log.md). 
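As a sketch of enabling the search slow log on a hypothetical index (the thresholds are illustrative and should be tuned to your workload):

```console
PUT /my-index-000001/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}
```

Searches slower than these thresholds are then logged, which makes the expensive searches described next much easier to spot.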
-Expensive searches may have a large [`size` argument](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/rest-apis/paginate-search-results.md), use aggregations with a large number of buckets, or include [expensive queries](../../explore-analyze/query-filter/languages/querydsl.md#query-dsl-allow-expensive-queries). To prevent expensive searches, consider the following setting changes: +Expensive searches may have a large [`size` argument](elasticsearch://reference/elasticsearch/rest-apis/paginate-search-results.md), use aggregations with a large number of buckets, or include [expensive queries](../../explore-analyze/query-filter/languages/querydsl.md#query-dsl-allow-expensive-queries). To prevent expensive searches, consider the following setting changes: -* Lower the `size` limit using the [`index.max_result_window`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#index-max-result-window) index setting. -* Decrease the maximum number of allowed aggregation buckets using the [search.max_buckets](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/search-settings.md#search-settings-max-buckets) cluster setting. +* Lower the `size` limit using the [`index.max_result_window`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-max-result-window) index setting. +* Decrease the maximum number of allowed aggregation buckets using the [search.max_buckets](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-max-buckets) cluster setting. * Disable expensive queries using the [`search.allow_expensive_queries`](../../explore-analyze/query-filter/languages/querydsl.md#query-dsl-allow-expensive-queries) cluster setting. * Set a default search timeout using the [`search.default_search_timeout`](../../solutions/search/the-search-api.md#search-timeout) cluster setting. @@ -97,7 +97,7 @@ PUT _cluster/settings **Prevent mapping explosions** -Defining too many fields or nesting fields too deeply can lead to [mapping explosions](../../manage-data/data-store/mapping.md#mapping-limit-settings) that use large amounts of memory. To prevent mapping explosions, use the [mapping limit settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/mapping-limit.md) to limit the number of field mappings. +Defining too many fields or nesting fields too deeply can lead to [mapping explosions](../../manage-data/data-store/mapping.md#mapping-limit-settings) that use large amounts of memory. To prevent mapping explosions, use the [mapping limit settings](elasticsearch://reference/elasticsearch/index-settings/mapping-limit.md) to limit the number of field mappings. **Spread out bulk requests** diff --git a/troubleshoot/elasticsearch/hotspotting.md b/troubleshoot/elasticsearch/hotspotting.md index c78ee3644..50789a00f 100644 --- a/troubleshoot/elasticsearch/hotspotting.md +++ b/troubleshoot/elasticsearch/hotspotting.md @@ -9,7 +9,7 @@ mapped_pages: # Hot spotting [hotspotting] -Computer [hot spotting](https://en.wikipedia.org/wiki/Hot_spot_(computer_programming)) may occur in {{es}} when resource utilizations are unevenly distributed across [nodes](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md). Temporary spikes are not usually considered problematic, but ongoing significantly unique utilization may lead to cluster bottlenecks and should be reviewed. 
+Computer [hot spotting](https://en.wikipedia.org/wiki/Hot_spot_(computer_programming)) may occur in {{es}} when resource utilization is unevenly distributed across [nodes](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md). Temporary spikes are not usually considered problematic, but sustained, uneven utilization may lead to cluster bottlenecks and should be reviewed. ::::{tip} If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md). @@ -54,18 +54,18 @@ Here are some common improper hardware setups which may contribute to hot spotti * Resources are allocated non-uniformly. For example, if one hot node is given half the CPU of its peers. {{es}} expects all nodes on a [data tier](../../manage-data/lifecycle/data-tiers.md) to share the same hardware profiles or specifications. * Resources are consumed by another service on the host, including other {{es}} nodes. Refer to our [dedicated host](../../deploy-manage/deploy/self-managed/deploy-cluster.md#dedicated-host) recommendation. * Resources experience different network or disk throughputs. For example, if one node’s I/O is lower than its peers. Refer to [Use faster hardware](../../deploy-manage/production-guidance/optimize-performance/indexing-speed.md) for more information. -* A JVM that has been configured with a heap larger than 31GB. Refer to [Set the JVM heap size](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-heap-size) for more information. +* A JVM that has been configured with a heap larger than 31GB. Refer to [Set the JVM heap size](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-heap-size) for more information. * Problematic resources uniquely report [memory swapping](../../deploy-manage/deploy/self-managed/setup-configuration-memory.md). ## Shard distributions [causes-shards] -{{es}} indices are divided into one or more [shards](https://en.wikipedia.org/wiki/Shard_(database_architecture)) which can sometimes be poorly distributed. {{es}} accounts for this by [balancing shard counts](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) across data nodes. As [introduced in version 8.6](https://www.elastic.co/blog/whats-new-elasticsearch-kibana-cloud-8-6-0), {{es}} by default also enables [desired balancing](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) to account for ingest load. A node may still experience hot spotting either due to write-heavy indices or by the overall shards it’s hosting. +{{es}} indices are divided into one or more [shards](https://en.wikipedia.org/wiki/Shard_(database_architecture)) which can sometimes be poorly distributed. {{es}} accounts for this by [balancing shard counts](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) across data nodes. 
As [introduced in version 8.6](https://www.elastic.co/blog/whats-new-elasticsearch-kibana-cloud-8-6-0), {{es}} by default also enables [desired balancing](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) to account for ingest load. A node may still experience hot spotting either due to write-heavy indices or due to the overall set of shards it’s hosting. ### Node level [causes-shards-nodes] -You can check for shard balancing via [cat allocation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-allocation), though as of version 8.6, [desired balancing](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) may no longer fully expect to balance shards. Kindly note, both methods may temporarily show problematic imbalance during [cluster stability issues](../../deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md). +You can check for shard balancing via [cat allocation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-allocation), though as of version 8.6, [desired balancing](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md) may no longer fully expect to balance shards. Note that both methods may temporarily show problematic imbalance during [cluster stability issues](../../deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md). For example, let’s showcase two separate plausible issues using cat allocation: @@ -82,11 +82,11 @@ node_2 31 52 44.6gb 372.7gb node_3 445 43 271.5gb 289.4gb -Here we see two significantly unique situations. `node_2` has recently restarted, so it has a much lower number of shards than all other nodes. This also relates to `disk.indices` being much smaller than `disk.used` while shards are recovering as seen via [cat recovery](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-recovery). While `node_2`'s shard count is low, it may become a write hot spot due to ongoing [ILM rollovers](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md). This is a common root cause of write hot spots covered in the next section. +Here we see two distinct situations. `node_2` has recently restarted, so it has a much lower number of shards than all other nodes. This also relates to `disk.indices` being much smaller than `disk.used` while shards are recovering as seen via [cat recovery](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-recovery). While `node_2`'s shard count is low, it may become a write hot spot due to ongoing [ILM rollovers](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md). This is a common root cause of write hot spots covered in the next section. The second situation is that `node_3` has a higher `disk.percent` than `node_1`, even though they hold roughly the same number of shards. This occurs when either shards are not evenly sized (refer to [Aim for shards of up to 200M documents, or with sizes between 10GB and 50GB](../../deploy-manage/production-guidance/optimize-performance/size-shards.md#shard-size-recommendation)) or when there are a lot of empty indices. -Cluster rebalancing based on desired balance does much of the heavy lifting of keeping nodes from hot spotting. 
It can be limited by either nodes hitting [watermarks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) (refer to [fixing disk watermark errors](fix-watermark-errors.md)) or by a write-heavy index’s total shards being much lower than the written-to nodes. +Cluster rebalancing based on desired balance does much of the heavy lifting of keeping nodes from hot spotting. It can be limited by either nodes hitting [watermarks](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) (refer to [fixing disk watermark errors](fix-watermark-errors.md)) or by a write-heavy index’s total shard count being much lower than the number of nodes being written to. You can confirm hot spotted nodes via [the nodes stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-stats), potentially polling twice over time and checking only the differences in stats between the two polls, rather than polling once and getting stats for the node’s full [node uptime](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-usage). For example, to check all nodes’ indexing stats: @@ -115,7 +115,7 @@ write node_2 0 4 0 980 write node_3 1 5 0 8714 -Here you can see two significantly unique situations. Firstly, `node_1` has a severely backed up write queue compared to other nodes. Secondly, `node_3` shows historically completed writes that are double any other node. These are both probably due to either poorly distributed write-heavy indices, or to multiple write-heavy indices allocated to the same node. Since primary and replica writes are majorly the same amount of cluster work, we usually recommend setting [`index.routing.allocation.total_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/total-shards-per-node.md#total-shards-per-node) to force index spreading after lining up index shard counts to total nodes. +Here you can see two distinct situations. Firstly, `node_1` has a severely backed up write queue compared to other nodes. Secondly, `node_3` shows historically completed writes that are double any other node. These are both probably due to either poorly distributed write-heavy indices, or to multiple write-heavy indices allocated to the same node. Since primary and replica writes are roughly the same amount of cluster work, we usually recommend setting [`index.routing.allocation.total_shards_per_node`](elasticsearch://reference/elasticsearch/index-settings/total-shards-per-node.md#total-shards-per-node) to force index spreading after lining up index shard counts to total nodes. We normally recommend heavy-write indices have sufficient primary `number_of_shards` and replica `number_of_replicas` to evenly spread across indexing nodes. Alternatively, you can [reroute](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-reroute) shards to quieter nodes to alleviate the nodes with write hot spotting. @@ -145,7 +145,7 @@ cat shard_stats.json | jq -rc 'sort_by(-.avg_indexing)[]' | head Shard distribution problems will most likely surface as task load as seen above in the [cat thread pool](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-thread-pool) example. 
It is also possible for tasks to hot spot a node either due to individual qualitative expensiveness or overall quantitative traffic loads. -For example, if [cat thread pool](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-thread-pool) reported a high queue on the `warmer` [thread pool](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/thread-pool-settings.md), you would look-up the effected node’s [hot threads](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads). Let’s say it reported `warmer` threads at `100% cpu` related to `GlobalOrdinalsBuilder`. This would let you know to inspect [field data’s global ordinals](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/eager-global-ordinals.md). +For example, if [cat thread pool](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-thread-pool) reported a high queue on the `warmer` [thread pool](elasticsearch://reference/elasticsearch/configuration-reference/thread-pool-settings.md), you would look up the affected node’s [hot threads](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads). Let’s say it reported `warmer` threads at `100% cpu` related to `GlobalOrdinalsBuilder`. This would let you know to inspect [field data’s global ordinals](elasticsearch://reference/elasticsearch/mapping-reference/eager-global-ordinals.md). Alternatively, let’s say [cat nodes](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-nodes) shows a hot spotted master node and [cat thread pool](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-thread-pool) shows general queuing across nodes. This would suggest the master node is overwhelmed. To resolve this, first ensure [hardware high availability](../../deploy-manage/production-guidance/availability-and-resilience/resilience-in-small-clusters.md) setup and then look to ephemeral causes. In this example, [the nodes hot threads API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads) reports multiple threads in `other` which indicates they’re waiting on or blocked by either garbage collection or I/O. diff --git a/troubleshoot/elasticsearch/increase-capacity-data-node.md b/troubleshoot/elasticsearch/increase-capacity-data-node.md index cc04ca4d1..90b628dca 100644 --- a/troubleshoot/elasticsearch/increase-capacity-data-node.md +++ b/troubleshoot/elasticsearch/increase-capacity-data-node.md @@ -47,7 +47,7 @@ In order to increase the disk capacity of the data nodes in your cluster: ::::::{tab-item} Self-managed In order to increase the data node capacity in your cluster, you will need to calculate the amount of extra disk space needed. -1. First, retrieve the relevant disk thresholds that will indicate how much space should be available. The relevant thresholds are the [high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) for all the tiers apart from the frozen one and the [frozen flood stage watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage-frozen) for the frozen tier. The following example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: +1. 
First, retrieve the relevant disk thresholds that will indicate how much space should be available. The relevant thresholds are the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) for all the tiers apart from the frozen one and the [frozen flood stage watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage-frozen) for the frozen tier. The following example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: ```console GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* @@ -74,7 +74,7 @@ In order to increase the data node capacity in your cluster, you will need to ca } ``` - The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have more than 150GB available, read more on how this threshold works [here](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). + The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have more than 150GB available. Read more on how this threshold works [here](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). 2. The next step is to find out the current disk usage; this will indicate how much extra space is needed. For simplicity, our example has one node, but you can apply the same for every node over the relevant threshold. diff --git a/troubleshoot/elasticsearch/increase-cluster-shard-limit.md b/troubleshoot/elasticsearch/increase-cluster-shard-limit.md index ebae1c4fc..027e95642 100644 --- a/troubleshoot/elasticsearch/increase-cluster-shard-limit.md +++ b/troubleshoot/elasticsearch/increase-cluster-shard-limit.md @@ -10,7 +10,7 @@ mapped_pages: Elasticsearch tries to take advantage of all the available resources by distributing data (index shards) amongst the cluster nodes. -Users might want to influence this data distribution by configuring the [`cluster.routing.allocation.total_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/total-shards-per-node.md#cluster-total-shards-per-node) system setting to restrict the number of shards that can be hosted on a single node in the system, regardless of the index. Various configurations limiting how many shards can be hosted on a single node can lead to shards being unassigned due to the cluster not having enough nodes to satisfy the configuration. +Users might want to influence this data distribution by configuring the [`cluster.routing.allocation.total_shards_per_node`](elasticsearch://reference/elasticsearch/index-settings/total-shards-per-node.md#cluster-total-shards-per-node) system setting to restrict the number of shards that can be hosted on a single node in the system, regardless of the index. Various configurations limiting how many shards can be hosted on a single node can lead to shards being unassigned due to the cluster not having enough nodes to satisfy the configuration.
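As a quick first check, you can verify whether a cluster-level limit is currently configured, mirroring the watermark queries above:

```console
GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.total_shards_per_node
```

The default value of `-1` means no cluster-wide limit is enforced.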
In order to fix this, follow the next steps: diff --git a/troubleshoot/elasticsearch/increase-shard-limit.md b/troubleshoot/elasticsearch/increase-shard-limit.md index 576da94eb..271d67a37 100644 --- a/troubleshoot/elasticsearch/increase-shard-limit.md +++ b/troubleshoot/elasticsearch/increase-shard-limit.md @@ -10,7 +10,7 @@ mapped_pages: Elasticsearch tries to take advantage of all the available resources by distributing data (index shards) among nodes in the cluster. -Users might want to influence this data distribution by configuring the [index.routing.allocation.total_shards_per_node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/total-shards-per-node.md#total-shards-per-node) index setting to a custom value (for e.g. `1` in case of a highly trafficked index). Various configurations limiting how many shards an index can have located on one node can lead to shards being unassigned due to the cluster not having enough nodes to satisfy the index configuration. +Users might want to influence this data distribution by configuring the [index.routing.allocation.total_shards_per_node](elasticsearch://reference/elasticsearch/index-settings/total-shards-per-node.md#total-shards-per-node) index setting to a custom value (e.g., `1` in the case of a highly trafficked index). Various configurations limiting how many shards an index can have located on one node can lead to shards being unassigned due to the cluster not having enough nodes to satisfy the index configuration. In order to fix this, follow the next steps: diff --git a/troubleshoot/elasticsearch/increase-tier-capacity.md b/troubleshoot/elasticsearch/increase-tier-capacity.md index 94ab500ed..05369882e 100644 --- a/troubleshoot/elasticsearch/increase-tier-capacity.md +++ b/troubleshoot/elasticsearch/increase-tier-capacity.md @@ -8,7 +8,7 @@ mapped_pages: Distributing copies of the data (index shard replicas) on different nodes can parallelize processing requests, thus speeding up search queries. This can be achieved by increasing the number of replica shards up to the maximum value (total number of nodes minus one), which also serves to protect against hardware failure. If the index has a preferred tier, Elasticsearch will only place the copies of the data for that index on nodes in the target tier. -If a warning is encountered with not enough nodes to allocate all shard replicas, you can influence this behavior by adding more nodes to the cluster (or tier if tiers are in use), or by reducing the [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting. +If a warning is encountered with not enough nodes to allocate all shard replicas, you can influence this behavior by adding more nodes to the cluster (or tier if tiers are in use), or by reducing the [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting. In order to fix this, follow the next steps: @@ -71,10 +71,10 @@ Now that you know the tier, you want to increase the number of nodes in that tie * Find the **Availability zones** selection. If it is less than 3, you can select a higher number of availability zones for that tier. -If it is not possible to increase the size per zone or the number of availability zones, you can reduce the number of replicas of your index data. 
We’ll achieve this by inspecting the [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting index setting and decreasing the configured value. +If it is not possible to increase the size per zone or the number of availability zones, you can reduce the number of replicas of your index data. We’ll achieve this by inspecting the [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting and decreasing the configured value. 1. Access {{kib}} as described above. -2. Inspect the [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting. +2. Inspect the [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting. ```console GET /my-index-000001/_settings/index.number_of_replicas @@ -150,9 +150,9 @@ The response will look like this: 1. Represents a comma separated list of data tier node roles this index is allowed to be allocated on, the first one in the list being the one with the higher priority i.e. the tier the index is targeting. e.g. in this example the tier preference is `data_warm,data_hot` so the index is targeting the `warm` tier and more nodes with the `data_warm` role are needed in the {{es}} cluster. -Alternatively, if adding more nodes to the {{es}} cluster is not desired, inspect the [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting and decrease the configured value: +Alternatively, if adding more nodes to the {{es}} cluster is not desired, inspect the [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting and decrease the configured value: -1. Inspect the [`index.number_of_replicas`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting for the index with unassigned replica shards: +1. Inspect the [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) index setting for the index with unassigned replica shards: ```console GET /my-index-000001/_settings/index.number_of_replicas diff --git a/troubleshoot/elasticsearch/index-lifecycle-management-errors.md b/troubleshoot/elasticsearch/index-lifecycle-management-errors.md index da9adb739..0e71e6a06 100644 --- a/troubleshoot/elasticsearch/index-lifecycle-management-errors.md +++ b/troubleshoot/elasticsearch/index-lifecycle-management-errors.md @@ -144,9 +144,9 @@ POST /my-index-000001/_ilm/retry When setting up an [{{ilm-init}} policy](../../manage-data/lifecycle/index-lifecycle-management/configure-lifecycle-policy.md) or [automating rollover with {{ilm-init}}](../../manage-data/lifecycle/index-lifecycle-management.md), be aware that `min_age` can be relative to either the rollover time or the index creation time. -If you use [{{ilm-init}} rollover](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md), `min_age` is calculated relative to the time the index was rolled over. 
This is because the [rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) generates a new index and updates the `age` of the previous index to reflect the rollover time. If the index hasn’t been rolled over, then the `age` is the same as the `creation_date` for the index. +If you use [{{ilm-init}} rollover](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-rollover.md), `min_age` is calculated relative to the time the index was rolled over. This is because the [rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) generates a new index and updates the `age` of the previous index to reflect the rollover time. If the index hasn’t been rolled over, then the `age` is the same as the `creation_date` for the index. -You can override how `min_age` is calculated using the `index.lifecycle.origination_date` and `index.lifecycle.parse_origination_date` [{{ilm-init}} settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/index-lifecycle-management-settings.md). +You can override how `min_age` is calculated using the `index.lifecycle.origination_date` and `index.lifecycle.parse_origination_date` [{{ilm-init}} settings](elasticsearch://reference/elasticsearch/configuration-reference/index-lifecycle-management-settings.md). ## Common {{ilm-init}} errors [_common_ilm_init_errors] diff --git a/troubleshoot/elasticsearch/mapping-explosion.md b/troubleshoot/elasticsearch/mapping-explosion.md index 817b42c9e..07c75243b 100644 --- a/troubleshoot/elasticsearch/mapping-explosion.md +++ b/troubleshoot/elasticsearch/mapping-explosion.md @@ -5,7 +5,7 @@ mapped_pages: # Mapping explosion [mapping-explosion] -{{es}}'s search and [{{kib}}'s discover](../../explore-analyze/discover.md) Javascript rendering are dependent on the search’s backing indices total amount of [mapped fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md), of all mapping depths. When this total amount is too high or is exponentially climbing, we refer to it as experiencing mapping explosion. Field counts going this high are uncommon and usually suggest an upstream document formatting issue as [shown in this blog](https://www.elastic.co/blog/found-crash-elasticsearch#mapping-explosion). +{{es}}'s search and [{{kib}}'s discover](../../explore-analyze/discover.md) JavaScript rendering depend on the total number of [mapped fields](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md), at all mapping depths, across the search’s backing indices. When this total is too high or is climbing exponentially, we refer to it as experiencing mapping explosion. Field counts going this high are uncommon and usually suggest an upstream document formatting issue, as [shown in this blog](https://www.elastic.co/blog/found-crash-elasticsearch#mapping-explosion). Mapping explosion may surface as the following performance symptoms: @@ -20,15 +20,15 @@ Mapping explosion may surface as the following performance symptoms: ## Prevent or prepare [prevent] -[Mappings](../../manage-data/data-store/mapping.md) cannot be field-reduced once initialized.
{{es}} indices default to [dynamic mappings](../../manage-data/data-store/mapping.md) which doesn’t normally cause problems unless it’s combined with overriding [`index.mapping.total_fields.limit`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/mapping-limit.md). The default `1000` limit is considered generous, though overriding to `10000` doesn’t cause noticeable impact depending on use case. However, to give a bad example, overriding to `100000` and this limit being hit by mapping totals would usually have strong performance implications. +[Mappings](../../manage-data/data-store/mapping.md) cannot be field-reduced once initialized. {{es}} indices default to [dynamic mappings](../../manage-data/data-store/mapping.md), which don’t normally cause problems unless combined with overriding [`index.mapping.total_fields.limit`](elasticsearch://reference/elasticsearch/index-settings/mapping-limit.md). The default `1000` limit is considered generous, and overriding it to `10000` often causes no noticeable impact, depending on the use case. Overriding it to `100000`, however, and then actually hitting that limit with mapping totals would usually have strong performance implications. If your index’s mapped fields are expected to contain a large, arbitrary set of keys, you may instead consider: -* Setting [`index.mapping.total_fields.ignore_dynamic_beyond_limit`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/mapping-limit.md) to `true`. Instead of rejecting documents that exceed the field limit, this will ignore dynamic fields once the limit is reached. -* Using the [flattened](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/flattened.md) data type. Please note, however, that flattened objects is [not fully supported in {{kib}}](https://github.com/elastic/kibana/issues/25820) yet. For example, this could apply to sub-mappings like { `host.name` , `host.os`, `host.version` }. Desired fields are still accessed by [runtime fields](../../manage-data/data-store/mapping/define-runtime-fields-in-search-request.md). +* Setting [`index.mapping.total_fields.ignore_dynamic_beyond_limit`](elasticsearch://reference/elasticsearch/index-settings/mapping-limit.md) to `true`. Instead of rejecting documents that exceed the field limit, this will ignore dynamic fields once the limit is reached (see the sketch below). +* Using the [flattened](elasticsearch://reference/elasticsearch/mapping-reference/flattened.md) data type. Please note, however, that flattened objects are [not fully supported in {{kib}}](https://github.com/elastic/kibana/issues/25820) yet. For example, this could apply to sub-mappings like { `host.name`, `host.os`, `host.version` }. Desired fields can still be accessed by [runtime fields](../../manage-data/data-store/mapping/define-runtime-fields-in-search-request.md). * Disable [dynamic mappings](../../manage-data/data-store/mapping.md). This cannot affect the current index mapping, but can apply going forward via an [index template](../../manage-data/data-store/templates.md). -Modifying to the [nested](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/nested.md) data type would not resolve the core issue. +Modifying to the [nested](elasticsearch://reference/elasticsearch/mapping-reference/nested.md) data type would not resolve the core issue.
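As a rough sketch of combining the first two options above when creating a new index (the index name `my-index-000001` and the `host` field are placeholders, and `index.mapping.total_fields.ignore_dynamic_beyond_limit` is only available in recent {{es}} versions):

```console
PUT /my-index-000001
{
  "settings": {
    "index.mapping.total_fields.limit": 1000,
    "index.mapping.total_fields.ignore_dynamic_beyond_limit": true
  },
  "mappings": {
    "properties": {
      "host": { "type": "flattened" }
    }
  }
}
```

With this in place, documents whose dynamic fields would push the mapping past the limit are indexed with the excess fields ignored rather than rejected, and arbitrary keys under `host` are stored in a single `flattened` field instead of growing the mapping.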
## Check for issue [check] @@ -54,9 +54,9 @@ Mapping explosions also covers when an individual index field totals are within However, though less common, it is possible to experience mapping explosions only across the combination of backing indices. For example, a [data stream](../../manage-data/data-store/data-streams.md)'s backing indices might all be at the field total limit while each contains fields unique from one another. -This situation most easily surfaces by adding a [data view](../../explore-analyze/find-and-organize/data-views.md) and checking its **Fields** tab for its total fields count. This statistic does tells you overall fields and not only where [`index:true`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/mapping-index.md), but serves as a good baseline. +This situation most easily surfaces by adding a [data view](../../explore-analyze/find-and-organize/data-views.md) and checking its **Fields** tab for its total fields count. This statistic tells you the overall field count, not only fields where [`index:true`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-index.md), but it serves as a good baseline. -If your issue only surfaces via a [data view](../../explore-analyze/find-and-organize/data-views.md), you may consider this menu’s **Field filters** if you’re not using [multi-fields](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/field-data-types.md). Alternatively, you may consider a more targeted index pattern or using a negative pattern to filter-out problematic indices. For example, if `logs-*` has too high a field count because of problematic backing indices `logs-lotsOfFields-*`, then you could update to either `logs-*,-logs-lotsOfFields-*` or `logs-iMeantThisAnyway-*`. +If your issue only surfaces via a [data view](../../explore-analyze/find-and-organize/data-views.md), you may consider this menu’s **Field filters** if you’re not using [multi-fields](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md). Alternatively, you may consider a more targeted index pattern or using a negative pattern to filter out problematic indices. For example, if `logs-*` has too high a field count because of problematic backing indices `logs-lotsOfFields-*`, then you could update to either `logs-*,-logs-lotsOfFields-*` or `logs-iMeantThisAnyway-*`. ## Resolve [resolve] diff --git a/troubleshoot/elasticsearch/red-yellow-cluster-status.md b/troubleshoot/elasticsearch/red-yellow-cluster-status.md index 288442769..9d9d9668e 100644 --- a/troubleshoot/elasticsearch/red-yellow-cluster-status.md +++ b/troubleshoot/elasticsearch/red-yellow-cluster-status.md @@ -63,7 +63,7 @@ A shard can become unassigned for several reasons. The following tips outline th ### Single node cluster [fix-cluster-status-only-one-node] -{{es}} will never assign a replica to the same node as the primary shard. A single-node cluster will always have yellow status. To change to green, set [number_of_replicas](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) to 0 for all indices. +{{es}} will never assign a replica to the same node as the primary shard. A single-node cluster will always have yellow status. To change to green, set [number_of_replicas](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas) to 0 for all indices.
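For example, a minimal sketch that zeroes out replicas across all indices (the `*` wildcard is illustrative; you may want to target specific indices instead, and hidden or system indices may need separate handling):

```console
PUT /*/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}
```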
Therefore, if the number of replicas equals or exceeds the number of nodes, some shards won’t be allocated. @@ -92,7 +92,7 @@ POST _cluster/reroute Misconfigured allocation settings can result in an unassigned primary shard. These settings include: * [Shard allocation](../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/index-level-shard-allocation.md) index settings -* [Allocation filtering](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) cluster settings +* [Allocation filtering](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-shard-allocation-filtering) cluster settings * [Allocation awareness](../../deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md) cluster settings To review your allocation settings, use the [get index settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-settings) and [cluster get settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-get-settings) APIs. @@ -123,7 +123,7 @@ PUT _settings ### Free up or increase disk space [fix-cluster-status-disk-space] -{{es}} uses a [low disk watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) to ensure data nodes have enough disk space for incoming shards. By default, {{es}} does not allocate shards to nodes using more than 85% of disk space. +{{es}} uses a [low disk watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) to ensure data nodes have enough disk space for incoming shards. By default, {{es}} does not allocate shards to nodes using more than 85% of disk space. To check the current disk space of your nodes, use the [cat allocation API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-allocation). @@ -135,14 +135,14 @@ If your nodes are running low on disk space, you have a few options: * Upgrade your nodes to increase disk space. * Add more nodes to the cluster. -* Delete unneeded indices to free up space. If you use {{ilm-init}}, you can update your lifecycle policy to use [searchable snapshots](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) or add a delete phase. If you no longer need to search the data, you can use a [snapshot](../../deploy-manage/tools/snapshot-and-restore.md) to store it off-cluster. -* If you no longer write to an index, use the [force merge API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) or {{ilm-init}}'s [force merge action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) to merge its segments into larger ones. +* Delete unneeded indices to free up space. If you use {{ilm-init}}, you can update your lifecycle policy to use [searchable snapshots](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) or add a delete phase. If you no longer need to search the data, you can use a [snapshot](../../deploy-manage/tools/snapshot-and-restore.md) to store it off-cluster. 
+* If you no longer write to an index, use the [force merge API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) or {{ilm-init}}'s [force merge action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) to merge its segments into larger ones. ```console POST my-index/_forcemerge ``` -* If an index is read-only, use the [shrink index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-shrink) or {{ilm-init}}'s [shrink action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md) to reduce its primary shard count. +* If an index is read-only, use the [shrink index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-shrink) or {{ilm-init}}'s [shrink action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md) to reduce its primary shard count. ```console POST my-index/_shrink/my-shrunken-index @@ -186,7 +186,7 @@ See [this video](https://www.youtube.com/watch?v=MiKKUdZvwnI) for walkthrough of ### Reduce JVM memory pressure [fix-cluster-status-jvm] -Shard allocation requires JVM heap memory. High JVM memory pressure can trigger [circuit breakers](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md) that stop allocation and leave shards unassigned. See [High JVM memory pressure](high-jvm-memory-pressure.md). +Shard allocation requires JVM heap memory. High JVM memory pressure can trigger [circuit breakers](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md) that stop allocation and leave shards unassigned. See [High JVM memory pressure](high-jvm-memory-pressure.md). ### Recover data for a lost primary shard [fix-cluster-status-restore] diff --git a/troubleshoot/elasticsearch/rejected-requests.md b/troubleshoot/elasticsearch/rejected-requests.md index db1e3b682..f7a7e4ad2 100644 --- a/troubleshoot/elasticsearch/rejected-requests.md +++ b/troubleshoot/elasticsearch/rejected-requests.md @@ -9,7 +9,7 @@ When {{es}} rejects a request, it stops the operation and returns an error with * A [depleted thread pool](high-cpu-usage.md). A depleted `search` or `write` thread pool returns a `TOO_MANY_REQUESTS` error message. * A [circuit breaker error](circuit-breaker-errors.md). -* High [indexing pressure](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/pressure.md) that exceeds the [`indexing_pressure.memory.limit`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/pressure.md#memory-limits). +* High [indexing pressure](elasticsearch://reference/elasticsearch/index-settings/pressure.md) that exceeds the [`indexing_pressure.memory.limit`](elasticsearch://reference/elasticsearch/index-settings/pressure.md#memory-limits). ::::{tip} If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md). 
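As a quick first check for the depleted thread pool case above, the cat thread pool API shows the queue size and cumulative rejection count for the `search` and `write` pools (a sketch; the column selection is illustrative):

```console
GET /_cat/thread_pool/search,write?v=true&h=node_name,name,active,queue,rejected,completed
```

A steadily growing `rejected` count for a pool lines up with the `TOO_MANY_REQUESTS` errors described above.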
@@ -35,20 +35,20 @@ See [this video](https://www.youtube.com/watch?v=auZJRXoAVpI) for a walkthrough ## Check circuit breakers [check-circuit-breakers] -To check the number of tripped [circuit breakers](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md), use the [node stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-stats). +To check the number of tripped [circuit breakers](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md), use the [node stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-stats). ```console GET /_nodes/stats/breaker ``` -These statistics are cumulative from node startup. For more information, see [circuit breaker errors](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/circuit-breaker-settings.md). +These statistics are cumulative from node startup. For more information, see [circuit breaker errors](elasticsearch://reference/elasticsearch/configuration-reference/circuit-breaker-settings.md). See [this video](https://www.youtube.com/watch?v=k3wYlRVbMSw) for a walkthrough of diagnosing circuit breaker errors. ## Check indexing pressure [check-indexing-pressure] -To check the number of [indexing pressure](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/pressure.md) rejections, use the [node stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-stats). +To check the number of [indexing pressure](elasticsearch://reference/elasticsearch/index-settings/pressure.md) rejections, use the [node stats API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-stats). ```console GET _nodes/stats?human&filter_path=nodes.*.indexing_pressure @@ -58,7 +58,7 @@ These stats are cumulative from node startup. Indexing pressure rejections appear as an `EsRejectedExecutionException`, and indicate that they were rejected due to `combined_coordinating_and_primary`, `coordinating`, `primary`, or `replica`. -These errors are often related to [backlogged tasks](task-queue-backlog.md), [bulk index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) sizing, or the ingest target's [`refresh_interval` setting](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-modules.md). +These errors are often related to [backlogged tasks](task-queue-backlog.md), [bulk index](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) sizing, or the ingest target's [`refresh_interval` setting](elasticsearch://reference/elasticsearch/index-settings/index-modules.md). See [this video](https://www.youtube.com/watch?v=QuV8QqSfc0c) for a walkthrough of diagnosing indexing pressure rejections. diff --git a/troubleshoot/elasticsearch/remote-clusters.md b/troubleshoot/elasticsearch/remote-clusters.md index f5e5b2176..edd1d0140 100644 --- a/troubleshoot/elasticsearch/remote-clusters.md +++ b/troubleshoot/elasticsearch/remote-clusters.md @@ -52,9 +52,9 @@ The API should return `"connected" : true`. When using [API key authentication]( When using API key authentication, cross-cluster traffic happens on the remote cluster interface, instead of the transport interface. The remote cluster interface is not enabled by default. 
This means a node is not ready to accept incoming cross-cluster requests by default, while it is ready to send outgoing cross-cluster requests. Ensure you’ve enabled the remote cluster server on every node of the remote cluster. In `elasticsearch.yml`: -* Set [`remote_cluster_server.enabled`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings) to `true`. -* Configure the bind and publish address for remote cluster server traffic, for example using [`remote_cluster.host`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings). Without configuring the address, remote cluster traffic may be bound to the local interface, and remote clusters running on other machines can’t connect. -* Optionally, configure the remote server port using [`remote_cluster.port`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (defaults to `9443`). +* Set [`remote_cluster_server.enabled`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings) to `true`. +* Configure the bind and publish address for remote cluster server traffic, for example using [`remote_cluster.host`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote-cluster-network-settings). Without configuring the address, remote cluster traffic may be bound to the local interface, and remote clusters running on other machines can’t connect. +* Optionally, configure the remote server port using [`remote_cluster.port`](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#remote_cluster.port) (defaults to `9443`). @@ -111,7 +111,7 @@ Note that with some network configurations it could take minutes or hours for th #### Resolution [_resolution_2] * Ensure that the network between the clusters is as reliable as possible. -* Ensure that the network is configured to permit [Long-lived idle connections](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). +* Ensure that the network is configured to permit [Long-lived idle connections](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). * Ensure that the network is configured to detect faulty connections quickly. In particular, you must enable and fully support TCP keepalives, and set a short [retransmission timeout](../../deploy-manage/deploy/self-managed/system-config-tcpretries.md). * On Linux systems, execute `ss -tonie` to verify the details of the configuration of each network connection between the clusters. * If the problems persist, capture network packets at both ends of the connection and analyse the traffic to look for delays and lost messages. diff --git a/troubleshoot/elasticsearch/repeated-snapshot-failures.md b/troubleshoot/elasticsearch/repeated-snapshot-failures.md index bcef7a5f9..022bb615b 100644 --- a/troubleshoot/elasticsearch/repeated-snapshot-failures.md +++ b/troubleshoot/elasticsearch/repeated-snapshot-failures.md @@ -8,7 +8,7 @@ mapped_pages: Repeated snapshot failures are usually an indicator of a problem with your deployment. 
Continuous failures of automated snapshots can leave a deployment without recovery options in cases of data loss or outages. -Elasticsearch keeps track of the number of repeated failures when executing automated snapshots. If an automated snapshot fails too many times without a successful execution, the health API will report a warning. The number of repeated failures before reporting a warning is controlled by the [`slm.health.failed_snapshot_warn_threshold`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#slm-health-failed-snapshot-warn-threshold) setting. +Elasticsearch keeps track of the number of repeated failures when executing automated snapshots. If an automated snapshot fails too many times without a successful execution, the health API will report a warning. The number of repeated failures before reporting a warning is controlled by the [`slm.health.failed_snapshot_warn_threshold`](elasticsearch://reference/elasticsearch/configuration-reference/snapshot-restore-settings.md#slm-health-failed-snapshot-warn-threshold) setting. In the event that an automated {{slm}} policy execution is experiencing repeated failures, follow these steps to get more information about the problem: diff --git a/troubleshoot/elasticsearch/security/security-trb-extraargs.md b/troubleshoot/elasticsearch/security/security-trb-extraargs.md index f52fb9192..d54bab575 100644 --- a/troubleshoot/elasticsearch/security/security-trb-extraargs.md +++ b/troubleshoot/elasticsearch/security/security-trb-extraargs.md @@ -1,5 +1,5 @@ --- -navigation_title: "Error: Extra arguments provided" +navigation_title: "Error: Extra arguments provided" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/security-trb-extraargs.html --- @@ -14,5 +14,5 @@ mapped_pages: This error occurs when the `elasticsearch-users` tool is parsing the input and finds unexpected arguments. This can happen when there are special characters used in some of the arguments. For example, on Windows systems the `,` character is considered a parameter separator; in other words `-r role1,role2` is translated to `-r role1 role2` and the `elasticsearch-users` tool only recognizes `role1` as an expected parameter. The solution here is to quote the parameter: `-r "role1,role2"`. -For more information about this command, see [`elasticsearch-users` command](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/users-command.md). +For more information about this command, see [`elasticsearch-users` command](elasticsearch://reference/elasticsearch/command-line-tools/users-command.md). diff --git a/troubleshoot/elasticsearch/security/security-trb-roles.md b/troubleshoot/elasticsearch/security/security-trb-roles.md index c2215d4e1..4fba2d9a0 100644 --- a/troubleshoot/elasticsearch/security/security-trb-roles.md +++ b/troubleshoot/elasticsearch/security/security-trb-roles.md @@ -25,13 +25,13 @@ mapped_pages: 1. `unknown_role` was not found in `roles.yml` - For more information about this command, see the [`elasticsearch-users` command](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/users-command.md). + For more information about this command, see the [`elasticsearch-users` command](elasticsearch://reference/elasticsearch/command-line-tools/users-command.md). 2. If you are authenticating to LDAP, a number of configuration options can cause this error. 
| | | | --- | --- | - | *group identification* | Groups are located by either an LDAP search or by the "memberOf" attribute onthe user. Also, If subtree search is turned off, it will search only onelevel deep. For all the options, see [LDAP realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings).There are many options here and sticking to the defaults will not work for allscenarios. | + | *group identification* | Groups are located by either an LDAP search or by the "memberOf" attribute on the user. Also, if subtree search is turned off, it will search only one level deep. For all the options, see [LDAP realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-ldap-settings). There are many options here and sticking to the defaults will not work for all scenarios. | | *group to role mapping* | Either the `role_mapping.yml` file or the location for this file could be misconfigured. For more information, see [Security files](../../../deploy-manage/security.md). | | *role definition* | The role definition might be missing or invalid. | diff --git a/troubleshoot/elasticsearch/security/trb-security-kerberos.md b/troubleshoot/elasticsearch/security/trb-security-kerberos.md index b158e9795..4ace23459 100644 --- a/troubleshoot/elasticsearch/security/trb-security-kerberos.md +++ b/troubleshoot/elasticsearch/security/trb-security-kerberos.md @@ -47,7 +47,7 @@ As Kerberos logs are often cryptic in nature and many things can go wrong as it xpack.security.authc.realms.kerberos..krb.debug: true ``` -For detailed information, see [Kerberos realm settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-kerberos-settings). +For detailed information, see [Kerberos realm settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-kerberos-settings). Sometimes you may need to go deeper to understand the problem during SPNEGO GSS context negotiation or look at the Kerberos message exchange. To enable Kerberos/SPNEGO debug logging on the JVM, add the following JVM system properties: @@ -55,5 +55,5 @@ Sometimes you may need to go deeper to understand the problem during SPNEGO GSS `-Dsun.security.spnego.debug=true` -For more information about JVM system properties, see [Set JVM options](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-options). +For more information about JVM system properties, see [Set JVM options](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-options). diff --git a/troubleshoot/elasticsearch/security/trb-security-saml.md b/troubleshoot/elasticsearch/security/trb-security-saml.md index 2199143ed..82d8acf38 100644 --- a/troubleshoot/elasticsearch/security/trb-security-saml.md +++ b/troubleshoot/elasticsearch/security/trb-security-saml.md @@ -132,7 +132,7 @@ Some of the common SAML problems are shown below with tips on how to resolve the This means that the SAML Identity Provider failed to authenticate the user and sent a SAML Response to the Service Provider ({{stack}}) indicating this failure. The message will convey whether the SAML Identity Provider thinks that the problem is with the Service Provider ({{stack}}) or with the Identity Provider itself, and the specific status code that follows is extremely useful, as it usually indicates the underlying issue.
The list of specific error codes is defined in the [SAML 2.0 Core specification - Section 3.2.2.2](https://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf) and the most commonly encountered ones are: 1. `urn:oasis:names:tc:SAML:2.0:status:AuthnFailed`: The SAML Identity Provider failed to authenticate the user. There is not much to troubleshoot on the {{stack}} side for this status; the logs of the SAML Identity Provider will hopefully offer much more information. - 2. `urn:oasis:names:tc:SAML:2.0:status:InvalidNameIDPolicy`: The SAML Identity Provider cannot support releasing a NameID with the requested format. When creating SAML Authentication Requests, {{es}} sets the NameIDPolicy element of the Authentication request with the appropriate value. This is controlled by the [`nameid_format`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings) configuration parameter in `elasticsearch.yml`, which if not set defaults to `urn:oasis:names:tc:SAML:2.0:nameid-format:transient`. This instructs the Identity Provider to return a NameID with that specific format in the SAML Response. If the SAML Identity Provider cannot grant that request, for example because it is configured to release a NameID format with `urn:oasis:names:tc:SAML:2.0:nameid-format:persistent` format instead, it returns this error indicating an invalid NameID policy. This issue can be resolved by adjusting `nameid_format` to match the format the SAML Identity Provider can return or by setting it to `urn:oasis:names:tc:SAML:2.0:nameid-format:unspecified` so that the Identity Provider is allowed to return any format it wants. + 2. `urn:oasis:names:tc:SAML:2.0:status:InvalidNameIDPolicy`: The SAML Identity Provider cannot support releasing a NameID with the requested format. When creating SAML Authentication Requests, {{es}} sets the NameIDPolicy element of the Authentication request with the appropriate value. This is controlled by the [`nameid_format`](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md#ref-saml-settings) configuration parameter in `elasticsearch.yml`, which if not set defaults to `urn:oasis:names:tc:SAML:2.0:nameid-format:transient`. This instructs the Identity Provider to return a NameID with that specific format in the SAML Response. If the SAML Identity Provider cannot grant that request, for example because it is configured to release a NameID with the `urn:oasis:names:tc:SAML:2.0:nameid-format:persistent` format instead, it returns this error indicating an invalid NameID policy. This issue can be resolved by adjusting `nameid_format` to match the format the SAML Identity Provider can return or by setting it to `urn:oasis:names:tc:SAML:2.0:nameid-format:unspecified` so that the Identity Provider is allowed to return any format it wants. 8. **Symptoms:** diff --git a/troubleshoot/elasticsearch/security/trb-security-setup.md b/troubleshoot/elasticsearch/security/trb-security-setup.md index 83982c482..fbb16beab 100644 --- a/troubleshoot/elasticsearch/security/trb-security-setup.md +++ b/troubleshoot/elasticsearch/security/trb-security-setup.md @@ -6,7 +6,7 @@ mapped_pages: # Diagnose password setup connection failures [trb-security-setup] -The [elasticsearch-setup-passwords command](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/command-line-tools/setup-passwords.md) sets passwords for the built-in users by sending user management API requests.
If your cluster uses SSL/TLS for the HTTP (REST) interface, the command attempts to establish a connection with the HTTPS protocol. If the connection attempt fails, the command fails. +The [elasticsearch-setup-passwords command](elasticsearch://reference/elasticsearch/command-line-tools/setup-passwords.md) sets passwords for the built-in users by sending user management API requests. If your cluster uses SSL/TLS for the HTTP (REST) interface, the command attempts to establish a connection with the HTTPS protocol. If the connection attempt fails, the command fails. **Symptoms:** @@ -59,5 +59,5 @@ The [elasticsearch-setup-passwords command](asciidocalypse://docs/elasticsearch/ 2. If the command does not trust the {{es}} server, verify that you configured the `xpack.security.http.ssl.certificate_authorities` setting or the `xpack.security.http.ssl.truststore.path` setting. 3. If hostname verification fails, you can disable this verification by setting `xpack.security.http.ssl.verification_mode` to `certificate`. -For more information about these settings, see [Security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md). +For more information about these settings, see [Security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md). diff --git a/troubleshoot/elasticsearch/security/trb-security-ssl.md b/troubleshoot/elasticsearch/security/trb-security-ssl.md index be452a3ee..103dff8a4 100644 --- a/troubleshoot/elasticsearch/security/trb-security-ssl.md +++ b/troubleshoot/elasticsearch/security/trb-security-ssl.md @@ -19,13 +19,13 @@ mapped_pages: `org.elasticsearch.common.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record:` : Indicates that there was incoming plaintext traffic on an SSL connection. This typically occurs when a node is not configured to use encrypted communication and tries to connect to nodes that are using encrypted communication. Please verify that all nodes are using the same setting for `xpack.security.transport.ssl.enabled`. -For more information about this setting, see [Security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md). +For more information about this setting, see [Security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md). `java.io.StreamCorruptedException: invalid internal transport message format, got` : Indicates an issue with data received on the transport interface in an unknown format. This can happen when a node with encrypted communication enabled connects to a node that has encrypted communication disabled. Please verify that all nodes are using the same setting for `xpack.security.transport.ssl.enabled`. -For more information about this setting, see [Security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md). +For more information about this setting, see [Security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md). `java.lang.IllegalArgumentException: empty text` @@ -35,7 +35,7 @@ For more information about this setting, see [Security settings](asciidocalypse: xpack.security.http.ssl.enabled: true ``` -For more information about this setting, see [Security settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/security-settings.md). 
+For more information about this setting, see [Security settings](elasticsearch://reference/elasticsearch/configuration-reference/security-settings.md). `ERROR: unsupported ciphers [...] were requested but cannot be used in this JVM` diff --git a/troubleshoot/elasticsearch/troubleshoot-migrate-to-tiers.md b/troubleshoot/elasticsearch/troubleshoot-migrate-to-tiers.md index 73b0fe00a..8c339e399 100644 --- a/troubleshoot/elasticsearch/troubleshoot-migrate-to-tiers.md +++ b/troubleshoot/elasticsearch/troubleshoot-migrate-to-tiers.md @@ -83,7 +83,7 @@ In order to get the shards assigned we need to call the [migrate to data tiers r ``` 1. The ILM policies that were updated. - 2. The indices that were migrated to [tier preference](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter) routing. + 2. The indices that were migrated to [tier preference](elasticsearch://reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter) routing. 3. The legacy index templates that were updated to not contain custom routing settings for the provided data attribute. 4. The composable index templates that were updated to not contain custom routing settings for the provided data attribute. 5. The component templates that were updated to not contain custom routing settings for the provided data attribute. @@ -160,7 +160,7 @@ In order to get the shards assigned we need to make sure the deployment is using ``` 1. The ILM policies that were updated. - 2. The indices that were migrated to [tier preference](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter) routing. + 2. The indices that were migrated to [tier preference](elasticsearch://reference/elasticsearch/index-settings/data-tier-allocation.md#tier-preference-allocation-filter) routing. 3. The legacy index templates that were updated to not contain custom routing settings for the provided data attribute. 4. The composable index templates that were updated to not contain custom routing settings for the provided data attribute. 5. The component templates that were updated to not contain custom routing settings for the provided data attribute. diff --git a/troubleshoot/elasticsearch/troubleshooting-searches.md b/troubleshoot/elasticsearch/troubleshooting-searches.md index ab40ac230..c7f36787e 100644 --- a/troubleshoot/elasticsearch/troubleshooting-searches.md +++ b/troubleshoot/elasticsearch/troubleshooting-searches.md @@ -113,7 +113,7 @@ To change the mapping of an existing field, refer to [Changing the mapping of a ## Check the field’s values [troubleshooting-check-field-values] -Use the [`exists` query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-exists-query.md) to check whether there are documents that return a value for a field. Check that `count` in the response is not 0. +Use the [`exists` query](elasticsearch://reference/query-languages/query-dsl-exists-query.md) to check whether there are documents that return a value for a field. Check that `count` in the response is not 0. ```console GET /my-index-000001/_count @@ -126,7 +126,7 @@ GET /my-index-000001/_count } ``` -If the field is aggregatable, you can use [aggregations](../../explore-analyze/query-filter/aggregations.md) to check the field’s values. 
For `keyword` fields, you can use a [terms aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) to retrieve the field’s most common values: +If the field is aggregatable, you can use [aggregations](../../explore-analyze/query-filter/aggregations.md) to check the field’s values. For `keyword` fields, you can use a [terms aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) to retrieve the field’s most common values: ```console GET /my-index-000001/_search?filter_path=aggregations @@ -143,7 +143,7 @@ GET /my-index-000001/_search?filter_path=aggregations } ``` -For numeric fields, you can use the [stats aggregation](asciidocalypse://docs/elasticsearch/docs/reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) to get an idea of the field’s value distribution: +For numeric fields, you can use the [stats aggregation](elasticsearch://reference/data-analysis/aggregations/search-aggregations-metrics-stats-aggregation.md) to get an idea of the field’s value distribution: ```console GET my-index-000001/_search?filter_path=aggregations @@ -211,7 +211,7 @@ To troubleshoot queries in {{kib}}, select **Inspect** in the toolbar. Next, sel ## Check index settings [troubleshooting-searches-settings] -[Index settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index.md) can influence search results. For example, the `index.query.default_field` setting, which determines the field that is queried when a query specifies no explicit field. Use the [get index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-settings) to retrieve the settings for an index: +[Index settings](elasticsearch://reference/elasticsearch/index-settings/index.md) can influence search results. For example, the `index.query.default_field` setting determines the field that is queried when a query specifies no explicit field. Use the [get index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-settings) to retrieve the settings for an index: ```console GET /my-index-000001/_settings @@ -224,7 +224,7 @@ For static settings, you need to create a new index with the correct settings. N ## Find slow queries [troubleshooting-slow-searches] -[Slow logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md) can help pinpoint slow performing search requests. Enabling [audit logging](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/auding-settings.md) on top can help determine query source. Add the following settings to the `elasticsearch.yml` configuration file to trace queries. The resulting logging is verbose, so disable these settings when not troubleshooting. +[Slow logs](elasticsearch://reference/elasticsearch/index-settings/slow-log.md) can help pinpoint slow-performing search requests. Enabling [audit logging](elasticsearch://reference/elasticsearch/configuration-reference/auding-settings.md) on top can help determine the query source. Add the following settings to the `elasticsearch.yml` configuration file to trace queries. The resulting logging is verbose, so disable these settings when not troubleshooting.
```yaml xpack.security.audit.enabled: true diff --git a/troubleshoot/elasticsearch/troubleshooting-shards-capacity-issues.md b/troubleshoot/elasticsearch/troubleshooting-shards-capacity-issues.md index 43fb3eb36..739b64469 100644 --- a/troubleshoot/elasticsearch/troubleshooting-shards-capacity-issues.md +++ b/troubleshoot/elasticsearch/troubleshooting-shards-capacity-issues.md @@ -6,12 +6,12 @@ mapped_pages: # Troubleshoot shard capacity health issues [troubleshooting-shards-capacity-issues] -{{es}} limits the maximum number of shards to be held per node using the [`cluster.max_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) and [`cluster.max_shards_per_node.frozen`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) settings. The current shards capacity of the cluster is available in the [health API shards capacity section](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-health-report). +{{es}} limits the maximum number of shards to be held per node using the [`cluster.max_shards_per_node`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) and [`cluster.max_shards_per_node.frozen`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) settings. The current shards capacity of the cluster is available in the [health API shards capacity section](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-health-report). ## Cluster is close to reaching the configured maximum number of shards for data nodes. [_cluster_is_close_to_reaching_the_configured_maximum_number_of_shards_for_data_nodes] -The [`cluster.max_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) cluster setting limits the maximum number of open shards for a cluster, only counting data nodes that do not belong to the frozen tier. +The [`cluster.max_shards_per_node`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) cluster setting limits the maximum number of open shards for a cluster, only counting data nodes that do not belong to the frozen tier. This symptom indicates that action should be taken; otherwise, either the creation of new indices or upgrading the cluster could be blocked. @@ -74,7 +74,7 @@ If you’re confident your changes won’t destabilize the cluster, you can temp 1. Current value of the setting `cluster.max_shards_per_node` 2. Current number of open shards across the cluster -5. Update the [`cluster.max_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) setting with a proper value: +5. Update the [`cluster.max_shards_per_node`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) setting with a proper value: ```console PUT _cluster/settings @@ -166,7 +166,7 @@ The response will look like this: 2.
Current number of open shards across the cluster -Using the [`cluster settings API`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), update the [`cluster.max_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) setting: +Using the [`cluster settings API`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), update the [`cluster.max_shards_per_node`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) setting: ```console PUT _cluster/settings @@ -221,7 +221,7 @@ PUT _cluster/settings ## Cluster is close to reaching the configured maximum number of shards for frozen nodes. [_cluster_is_close_to_reaching_the_configured_maximum_number_of_shards_for_frozen_nodes] -The [`cluster.max_shards_per_node.frozen`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) cluster setting limits the maximum number of open shards for a cluster, only counting data nodes that belong to the frozen tier. +The [`cluster.max_shards_per_node.frozen`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) cluster setting limits the maximum number of open shards for a cluster, only counting data nodes that belong to the frozen tier. This symptom indicates that action should be taken; otherwise, either the creation of new indices or upgrading the cluster could be blocked. @@ -283,7 +283,7 @@ If you’re confident your changes won’t destabilize the cluster, you can temp 1. Current value of the setting `cluster.max_shards_per_node.frozen` 2. Current number of open shards used by frozen nodes across the cluster -5. Update the [`cluster.max_shards_per_node.frozen`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) setting: +5. Update the [`cluster.max_shards_per_node.frozen`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) setting: ```console PUT _cluster/settings @@ -373,7 +373,7 @@ GET _health_report/shards_capacity 2. Current number of open shards used by frozen nodes across the cluster.
-Using the [`cluster settings API`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), update the [`cluster.max_shards_per_node.frozen`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) setting: +Using the [`cluster settings API`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings), update the [`cluster.max_shards_per_node.frozen`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node-frozen) setting: ```console PUT _cluster/settings diff --git a/troubleshoot/elasticsearch/troubleshooting-unbalanced-cluster.md b/troubleshoot/elasticsearch/troubleshooting-unbalanced-cluster.md index b30943a23..433732719 100644 --- a/troubleshoot/elasticsearch/troubleshooting-unbalanced-cluster.md +++ b/troubleshoot/elasticsearch/troubleshooting-unbalanced-cluster.md @@ -62,7 +62,7 @@ This is not concerning as long as the number of such shards is decreasing and th If the cluster has this warning repeatedly for an extended period of time (multiple hours), it is possible that the desired balance is diverging too far from the current state. -If so, increase the [`cluster.routing.allocation.balance.threshold`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#shards-rebalancing-heuristics) to reduce the sensitivity of the algorithm that tries to level up the shard count and disk usage within the cluster. +If so, increase the [`cluster.routing.allocation.balance.threshold`](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#shards-rebalancing-heuristics) to reduce the sensitivity of the algorithm that tries to even out the shard count and disk usage within the cluster. Then reset the desired balance using the following API call: diff --git a/troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md b/troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md index 8200fbc72..e7afe0b9e 100644 --- a/troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md +++ b/troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md @@ -65,19 +65,19 @@ The [Health](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operati If the node did not restart then you should look at the reason for its departure more closely. Each reason has different troubleshooting steps, described below. There are three possible reasons: * `disconnected`: The connection from the master node to the removed node was closed. -* `lagging`: The master published a cluster state update, but the removed node did not apply it within the permitted timeout. By default, this timeout is 2 minutes. Refer to [Discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism. -* `followers check retry count exceeded`: The master sent a number of consecutive health checks to the removed node. These checks were rejected or timed out. By default, each health check times out after 10 seconds and {{es}} removes the node removed after three consecutively failed health checks.
Refer to [Discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism. +* `lagging`: The master published a cluster state update, but the removed node did not apply it within the permitted timeout. By default, this timeout is 2 minutes. Refer to [Discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism. +* `followers check retry count exceeded`: The master sent a number of consecutive health checks to the removed node. These checks were rejected or timed out. By default, each health check times out after 10 seconds and {{es}} removes the node after three consecutive failed health checks. Refer to [Discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism. ## Diagnosing `disconnected` nodes [troubleshooting-unstable-cluster-disconnected] Nodes typically leave the cluster with reason `disconnected` when they shut down, but if they rejoin the cluster without restarting then there is some other problem. -{{es}} is designed to run on a fairly reliable network. It opens a number of TCP connections between nodes and expects these connections to remain open [forever](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). If a connection is closed then {{es}} will try and reconnect, so the occasional blip may fail some in-flight operations but should otherwise have limited impact on the cluster. In contrast, repeatedly-dropped connections will severely affect its operation. +{{es}} is designed to run on a fairly reliable network. It opens a number of TCP connections between nodes and expects these connections to remain open [forever](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). If a connection is closed then {{es}} will try to reconnect, so the occasional blip may fail some in-flight operations but should otherwise have limited impact on the cluster. In contrast, repeatedly dropped connections will severely affect its operation. The connections from the elected master node to every other node in the cluster are particularly important. The elected master never spontaneously closes its outbound connections to other nodes. Similarly, once an inbound connection is fully established, a node never spontaneously closes it unless the node is shutting down. -If you see a node unexpectedly leave the cluster with the `disconnected` reason, something other than {{es}} likely caused the connection to close. A common cause is a misconfigured firewall with an improper timeout or another policy that’s [incompatible with {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). It could also be caused by general connectivity issues, such as packet loss due to faulty hardware or network congestion.
If you’re an advanced user, configure the following loggers to get more detailed information about network exceptions:
+If you see a node unexpectedly leave the cluster with the `disconnected` reason, something other than {{es}} likely caused the connection to close. A common cause is a misconfigured firewall with an improper timeout or another policy that’s [incompatible with {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). It could also be caused by general connectivity issues, such as packet loss due to faulty hardware or network congestion. If you’re an advanced user, configure the following loggers to get more detailed information about network exceptions:

```yaml
logger.org.elasticsearch.transport.TcpTransport: DEBUG
@@ -89,7 +89,7 @@ If these logs do not show enough information to diagnose the problem, obtain a p

## Diagnosing `lagging` nodes [troubleshooting-unstable-cluster-lagging]

-{{es}} needs every node to process cluster state updates reasonably quickly. If a node takes too long to process a cluster state update, it can be harmful to the cluster. The master will remove these nodes with the `lagging` reason. Refer to [Discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism.
+{{es}} needs every node to process cluster state updates reasonably quickly. If a node takes too long to process a cluster state update, it can be harmful to the cluster. The master will remove these nodes with the `lagging` reason. Refer to [Discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism.

Lagging is typically caused by performance issues on the removed node. However, a node may also lag due to severe network delays. To rule out network delays, ensure that `net.ipv4.tcp_retries2` is [configured properly](../../deploy-manage/deploy/self-managed/system-config-tcpretries.md). Log messages that contain `warn threshold` may provide more information about the root cause.

@@ -120,20 +120,20 @@ cat lagdetector.log | sed -e 's/.*://' | base64 --decode | gzip --decompress

Nodes sometimes leave the cluster with reason `follower check retry count exceeded` when they shut down, but if they rejoin the cluster without restarting then there is some other problem.

-{{es}} needs every node to respond to network messages successfully and reasonably quickly. If a node rejects requests or does not respond at all then it can be harmful to the cluster. If enough consecutive checks fail then the master will remove the node with reason `follower check retry count exceeded` and will indicate in the `node-left` message how many of the consecutive unsuccessful checks failed and how many of them timed out. Refer to [Discovery and cluster formation settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism.
+{{es}} needs every node to respond to network messages successfully and reasonably quickly. If a node rejects requests or does not respond at all then it can be harmful to the cluster. If enough consecutive checks fail then the master will remove the node with reason `follower check retry count exceeded` and will indicate in the `node-left` message how many of the consecutive unsuccessful checks failed and how many of them timed out. Refer to [Discovery and cluster formation settings](elasticsearch://reference/elasticsearch/configuration-reference/discovery-cluster-formation-settings.md) for information about the settings which control this mechanism.

Timeouts and failures may be due to network delays or performance problems on the affected nodes. Ensure that `net.ipv4.tcp_retries2` is [configured properly](../../deploy-manage/deploy/self-managed/system-config-tcpretries.md) to eliminate network delays as a possible cause for this kind of instability. Log messages containing `warn threshold` may give further clues about the cause of the instability.

If the last check failed with an exception then the exception is reported, and typically indicates the problem that needs to be addressed. If any of the checks timed out then narrow down the problem as follows.

-* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem.
+* GC pauses are recorded in the GC logs that {{es}} emits by default, and also usually by the `JvmMonitorService` in the main node logs. Use these logs to confirm whether or not the node is experiencing high heap usage with long GC pauses. If so, [the troubleshooting guide for high heap usage](high-jvm-memory-pressure.md) has some suggestions for further investigation but typically you will need to capture a heap dump and the [garbage collector logs](elasticsearch://reference/elasticsearch/jvm-settings.md#gc-logging) during a time of high heap usage to fully understand the problem.
* VM pauses also affect other processes on the same host. A VM pause also typically causes a discontinuity in the system clock, which {{es}} will report in its logs. If you see evidence of other processes pausing at the same time, or unexpected clock discontinuities, investigate the infrastructure on which you are running {{es}}.
* Packet captures will reveal system-level and network-level faults, especially if you capture the network traffic simultaneously at the elected master and the faulty node and analyse it alongside the {{es}} logs from those nodes. The connection used for follower checks is not used for any other traffic so it can be easily identified from the flow pattern alone, even if TLS is in use: almost exactly every second there will be a few hundred bytes sent each way, first the request by the master and then the response by the follower. You should be able to observe any retransmissions, packet loss, or other delays on such a connection.
* Long waits for particular threads to be available can be identified by taking stack dumps of the main {{es}} process (for example, using `jstack`) or a profiling trace (for example, using Java Flight Recorder) in the few seconds leading up to the relevant log message. The [Nodes hot threads](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads) API sometimes yields useful information, but bear in mind that this API also requires a number of `transport_worker` and `generic` threads across all the nodes in the cluster. The API may be affected by the very problem you’re trying to diagnose. `jstack` is much more reliable since it doesn’t require any JVM threads.

- The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information.
+ The threads involved in discovery and cluster membership are mainly `transport_worker` and `cluster_coordination` threads, for which there should never be a long wait. There may also be evidence of long waits for threads in the {{es}} logs, particularly looking at warning logs from `org.elasticsearch.transport.InboundHandler`. See [Networking threading model](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#modules-network-threading-model) for more information.

By default the follower checks will time out after 30s, so if node departures are unpredictable then capture stack dumps every 15s to be sure that at least one stack dump was taken at the right time.

@@ -168,7 +168,7 @@ cat shardlock.log | sed -e 's/.*://' | base64 --decode | gzip --decompress

## Diagnosing other network disconnections [troubleshooting-unstable-cluster-network]

-{{es}} is designed to run on a fairly reliable network. It opens a number of TCP connections between nodes and expects these connections to remain open [forever](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). If a connection is closed then {{es}} will try and reconnect, so the occasional blip may fail some in-flight operations but should otherwise have limited impact on the cluster. In contrast, repeatedly-dropped connections will severely affect its operation.
+{{es}} is designed to run on a fairly reliable network. It opens a number of TCP connections between nodes and expects these connections to remain open [forever](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). If a connection is closed then {{es}} will try to reconnect, so the occasional blip may fail some in-flight operations but should otherwise have limited impact on the cluster. In contrast, repeatedly dropped connections will severely affect its operation.

{{es}} nodes will only actively close an outbound connection to another node if the other node leaves the cluster. See [Troubleshooting an unstable cluster](../../deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md#cluster-fault-detection-troubleshooting) for further information about identifying and troubleshooting this situation. If an outbound connection closes for some other reason, nodes will log a message such as the following:

@@ -178,7 +178,7 @@ cat shardlock.log | sed -e 's/.*://' | base64 --decode | gzip --decompress

Similarly, once an inbound connection is fully established, a node never spontaneously closes it unless the node is shutting down.

-Therefore if you see a node report that a connection to another node closed unexpectedly, something other than {{es}} likely caused the connection to close. A common cause is a misconfigured firewall with an improper timeout or another policy that’s [incompatible with {{es}}](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). It could also be caused by general connectivity issues, such as packet loss due to faulty hardware or network congestion. If you’re an advanced user, configure the following loggers to get more detailed information about network exceptions:
+Therefore if you see a node report that a connection to another node closed unexpectedly, something other than {{es}} likely caused the connection to close. A common cause is a misconfigured firewall with an improper timeout or another policy that’s [incompatible with {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/networking-settings.md#long-lived-connections). It could also be caused by general connectivity issues, such as packet loss due to faulty hardware or network congestion. If you’re an advanced user, configure the following loggers to get more detailed information about network exceptions:

```yaml
logger.org.elasticsearch.transport.TcpTransport: DEBUG
diff --git a/troubleshoot/kibana/error-server-not-ready.md b/troubleshoot/kibana/error-server-not-ready.md
index aa9680d7a..8f1ff1596 100644
--- a/troubleshoot/kibana/error-server-not-ready.md
+++ b/troubleshoot/kibana/error-server-not-ready.md
@@ -20,7 +20,7 @@ To troubleshoot the `Kibana server is not ready yet` error, try these steps:

curl -XGET elasticsearch_ip_or_hostname:9200/_cat/indices/.kibana,.kibana_task_manager,.kibana_security_session?v=true
```

- These {{kib}}-backing indices must also not have [index settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-settings) flagging `read_only_allow_delete` or `write` [index blocks](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/index-block.md).
+ These {{kib}}-backing indices must also not have [index settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-settings) flagging `read_only_allow_delete` or `write` [index blocks](elasticsearch://reference/elasticsearch/index-settings/index-block.md).

3. [Shut down all {{kib}} nodes](../../deploy-manage/maintenance/start-stop-services/start-stop-kibana.md).
4. Choose any {{kib}} node, then update the config to set the [debug logging](../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md#change-overall-log-level).
diff --git a/troubleshoot/kibana/graph.md b/troubleshoot/kibana/graph.md
index 03400a69e..9459eaabe 100644
--- a/troubleshoot/kibana/graph.md
+++ b/troubleshoot/kibana/graph.md
@@ -10,7 +10,7 @@ mapped_pages:



-## Why are results missing? [_why_are_results_missing]
+## Why are results missing? [_why_are_results_missing]

The default settings in Graph API requests are configured to tune out noisy results by using the following strategies:

@@ -25,20 +25,20 @@ These are useful defaults for getting the "big picture" signals from noisy data,

* Set the `min_doc_count` for your vertices to 1 to ensure only one document is required to assert a relationship.

-## What can I do to improve performance? [_what_can_i_do_to_improve_performance]
+## What can I do to improve performance? [_what_can_i_do_to_improve_performance] 

With the default setting of `use_significance` set to `true`, the Graph API performs a background frequency check of the terms it discovers as part of exploration. Each unique term has to have its frequency looked up in the index, which costs at least one disk seek. Disk seeks are expensive. If you don’t need to perform this noise-filtering, setting `use_significance` to `false` eliminates all of these expensive checks (at the expense of not performing any quality-filtering on the terms). If your data is noisy and you need to filter based on significance, you can reduce the number of frequency checks by:

* Reducing the `sample_size`. Considering fewer documents can actually be better when the quality of matches is quite variable.
-* Avoiding noisy documents that have a large number of terms. You can do this by either allowing ranking to naturally favor shorter documents in the top-results sample (see [enabling norms](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/norms.md)) or by explicitly excluding large documents with your seed and guiding queries.
+* Avoiding noisy documents that have a large number of terms. You can do this by either allowing ranking to naturally favor shorter documents in the top-results sample (see [enabling norms](elasticsearch://reference/elasticsearch/mapping-reference/norms.md)) or by explicitly excluding large documents with your seed and guiding queries.
* Increasing the frequency threshold. Many terms occur very infrequently, so even increasing the frequency threshold by one can massively reduce the number of candidate terms whose background frequencies are checked.

Keep in mind that all of these options reduce the scope of information analyzed and can increase the potential to miss what could be interesting details. However, the information that’s lost tends to be associated with lower-quality documents with lower-frequency terms, which can be an acceptable trade-off.

-## Limited support for multiple indices [_limited_support_for_multiple_indices]
+## Limited support for multiple indices [_limited_support_for_multiple_indices] 

The graph API can explore multiple indices, types, or aliases in a single API request, but the assumption is that each "hop" it performs is querying the same set of indices. Currently, it is not possible to take a term found in a field from one index and use that value to explore connections in *a different field* held in another type or index.
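The graph.md changes above describe the tuning controls in prose only. As a concrete illustration, the following Graph explore request applies the relaxations discussed: it disables significance filtering, samples fewer documents, and lowers `min_doc_count` to 1. This is a minimal sketch rather than part of the patch; the `clicklogs` index and the `product` and `query` field names are hypothetical.

```console
POST clicklogs/_graph/explore
{
  "query": {
    "match": { "query": "midi" }
  },
  "controls": {
    "use_significance": false,
    "sample_size": 300
  },
  "vertices": [
    { "field": "product", "min_doc_count": 1 }
  ],
  "connections": {
    "vertices": [
      { "field": "query", "min_doc_count": 1 }
    ]
  }
}
```

With `use_significance` set to `false`, the background frequency checks described above are skipped entirely, and `min_doc_count: 1` asserts a relationship from a single document, trading noise filtering for recall.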
diff --git a/troubleshoot/kibana/maps.md b/troubleshoot/kibana/maps.md
index 2e85043cb..451e33598 100644
--- a/troubleshoot/kibana/maps.md
+++ b/troubleshoot/kibana/maps.md
@@ -32,7 +32,7 @@ Maps uses the [{{es}} vector tile search API](https://www.elastic.co/docs/api/do

### Data view not listed when adding layer [_data_view_not_listed_when_adding_layer]

-* Verify your geospatial data is correctly mapped as [geo_point](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/geo-shape.md).
+* Verify your geospatial data is correctly mapped as [geo_point](elasticsearch://reference/elasticsearch/mapping-reference/geo-point.md) or [geo_shape](elasticsearch://reference/elasticsearch/mapping-reference/geo-shape.md).
  * Run `GET myIndexName/_field_caps?fields=myGeoFieldName` in [Console](../../explore-analyze/query-filter/tools/console.md), replacing `myIndexName` and `myGeoFieldName` with your index and geospatial field name.
  * Ensure response specifies `type` as `geo_point` or `geo_shape`.

diff --git a/troubleshoot/kibana/migration-failures.md b/troubleshoot/kibana/migration-failures.md
index 91fee511f..07a89ae6e 100644
--- a/troubleshoot/kibana/migration-failures.md
+++ b/troubleshoot/kibana/migration-failures.md
@@ -193,4 +193,4 @@ PUT /_cluster/settings

When upgrading, {{kib}} creates new indices requiring a small number of new shards. If the number of open {{es}} shards approaches or exceeds the {{es}} `cluster.max_shards_per_node` setting, {{kib}} is unable to complete the upgrade. Ensure that {{kib}} is able to add at least 10 more shards by removing indices to clear up resources, or by increasing the `cluster.max_shards_per_node` setting.

-For more information, refer to the documentation on [total shards per node](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/total-shards-per-node.md).
+For more information, refer to the documentation on [total shards per node](elasticsearch://reference/elasticsearch/index-settings/total-shards-per-node.md).
diff --git a/troubleshoot/kibana/trace-elasticsearch-query-to-the-origin-in-kibana.md b/troubleshoot/kibana/trace-elasticsearch-query-to-the-origin-in-kibana.md
index 2760d1801..416f5a1dd 100644
--- a/troubleshoot/kibana/trace-elasticsearch-query-to-the-origin-in-kibana.md
+++ b/troubleshoot/kibana/trace-elasticsearch-query-to-the-origin-in-kibana.md
@@ -6,9 +6,9 @@ mapped_pages:

# Trace an {{es}} query in {{kib}} [kibana-troubleshooting-trace-query]

-Sometimes the {{es}} server might be slowed down by the execution of an expensive query. Such queries are logged to {{es}}'s [search slow log](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-settings/slow-log.md#search-slow-log) file. But there is a problem: it’s impossible to say what triggered a slow search request—a {{kib}} instance or a user accessing an {{es}} endpoint directly. To simplify the investigation of such cases, the search slow log file includes the `x-opaque-id` header, which might provide additional information about a request if it originated from {{kib}}.
+Sometimes the {{es}} server might be slowed down by the execution of an expensive query. Such queries are logged to {{es}}'s [search slow log](elasticsearch://reference/elasticsearch/index-settings/slow-log.md#search-slow-log) file. But there is a problem: it’s impossible to say what triggered a slow search request—a {{kib}} instance or a user accessing an {{es}} endpoint directly. To simplify the investigation of such cases, the search slow log file includes the `x-opaque-id` header, which might provide additional information about a request if it originated from {{kib}}.

-::::{warning} 
+::::{warning}
At the moment, {{kib}} can only highlight cases where a slow query originated from a {{kib}} visualization, **Lens**, **Discover**, **Maps**, or **Alerting**.
::::

diff --git a/troubleshoot/monitoring/node-bootlooping.md b/troubleshoot/monitoring/node-bootlooping.md
index 8cd9ea897..25821f89f 100644
--- a/troubleshoot/monitoring/node-bootlooping.md
+++ b/troubleshoot/monitoring/node-bootlooping.md
@@ -143,4 +143,4 @@ To resolve this:

## Insufficient Storage [ec-config-change-errors-insufficient-storage]

-Configuration change errors can occur when there is insufficient disk space for a data tier. To resolve this, you need to increase the size of that tier to ensure it provides enough storage to accommodate the data in your cluster tier considering the [high watermark](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). For troubleshooting walkthrough, see [Fix watermark errors](/troubleshoot/elasticsearch/fix-watermark-errors.md). \ No newline at end of file
+Configuration change errors can occur when there is insufficient disk space for a data tier. To resolve this, you need to increase the size of that tier to ensure it provides enough storage to accommodate the data in that tier, considering the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation). For a troubleshooting walkthrough, see [Fix watermark errors](/troubleshoot/elasticsearch/fix-watermark-errors.md). \ No newline at end of file
diff --git a/troubleshoot/monitoring/unavailable-shards.md b/troubleshoot/monitoring/unavailable-shards.md
index 8b44530c8..3fb09d215 100644
--- a/troubleshoot/monitoring/unavailable-shards.md
+++ b/troubleshoot/monitoring/unavailable-shards.md
@@ -221,7 +221,7 @@ Review the topic for your deployment architecture:

To learn more, review the following topics:

-* [Cluster-level shard allocation and routing settings](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md)
+* [Cluster-level shard allocation and routing settings](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md)
* [Fix watermark errors](/troubleshoot/elasticsearch/fix-watermark-errors.md)


@@ -263,7 +263,7 @@ When shards cannot be assigned, due to [data tier allocation](/manage-data/lifec

* Make sure nodes are available in each data tier and have sufficient disk space.
* [Check the index settings](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-indices) and ensure shards can be allocated to the expected data tier.
-* Check the [ILM policy](/manage-data/lifecycle/index-lifecycle-management.md) and check for issues with the [allocate action](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md). 
+* Check the [ILM policy](/manage-data/lifecycle/index-lifecycle-management.md) and check for issues with the [allocate action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-allocate.md). * Inspect the [index templates](/manage-data/data-store/templates.md) and check for issues with the index settings. @@ -304,7 +304,7 @@ The bugs also affect searchable snapshots. If you still have data in the cluster **Symptom** -The parameter [`cluster.max_shards_per_node`](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) limits the total number of primary and replica shards for the cluster. If your cluster has a number of shards beyond this limit, you might get the following message: +The parameter [`cluster.max_shards_per_node`](elasticsearch://reference/elasticsearch/configuration-reference/miscellaneous-cluster-settings.md#cluster-max-shards-per-node) limits the total number of primary and replica shards for the cluster. If your cluster has a number of shards beyond this limit, you might get the following message: `Validation Failed: 1: this action would add [2] shards, but this cluster currently has [1000]/[1000] maximum normal shards open` diff --git a/troubleshoot/observability/troubleshoot-logs.md b/troubleshoot/observability/troubleshoot-logs.md index 942d7911f..fa16d9799 100644 --- a/troubleshoot/observability/troubleshoot-logs.md +++ b/troubleshoot/observability/troubleshoot-logs.md @@ -225,7 +225,7 @@ PUT my-index-000001 } ``` -Refer to the [`date` field type](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/mapping-reference/date.md) docs for more information. +Refer to the [`date` field type](elasticsearch://reference/elasticsearch/mapping-reference/date.md) docs for more information. ### Grok or dissect pattern mismatch [logs-mapping-troubleshooting-grok-mismatch] @@ -239,7 +239,7 @@ Provided Grok patterns do not match field value... #### Solution [logs-mapping-troubleshooting-grok-solution] -Make sure your [grok](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/grok-processor.md) or [dissect](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/dissect-processor.md) processor pattern matches your log document format. +Make sure your [grok](elasticsearch://reference/ingestion-tools/enrich-processor/grok-processor.md) or [dissect](elasticsearch://reference/ingestion-tools/enrich-processor/dissect-processor.md) processor pattern matches your log document format. You can build and debug grok patterns in {{kib}} using the [Grok Debugger](../../explore-analyze/query-filter/tools/grok-debugger.md). Find the **Grok Debugger** by navigating to the **Developer tools** page using the navigation menu or the global search field. 
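The grok guidance above can be verified without touching a real pipeline by using the simulate API. The example below is a sketch for illustration only, not part of the patch: the pattern and the sample `message` value are hypothetical, and the capture syntax (`IPORHOST`, `WORD`, `URIPATHPARAM`) assumes the standard built-in grok pattern set.

```console
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{IPORHOST:source.address} %{WORD:http.request.method} %{URIPATHPARAM:url.path}"]
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "203.0.113.42 GET /status" } }
  ]
}
```

If the document does not match, the response reports the same `Provided Grok patterns do not match field value` error quoted above, which makes this a quick way to iterate on a pattern alongside the Grok Debugger.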
diff --git a/troubleshoot/observability/troubleshoot-service-level-objectives-slos.md b/troubleshoot/observability/troubleshoot-service-level-objectives-slos.md
index 31a301c4a..bb97cd81f 100644
--- a/troubleshoot/observability/troubleshoot-service-level-objectives-slos.md
+++ b/troubleshoot/observability/troubleshoot-service-level-objectives-slos.md
@@ -9,7 +9,7 @@ mapped_pages:


::::{important}
-In {{stack}}, to create and manage SLOs, you need an [appropriate license](https://www.elastic.co/subscriptions), an {{es}} cluster with both `transform` and `ingest` [node roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles) present, and [SLO access](../../solutions/observability/incident-management/configure-service-level-objective-slo-access.md) must be configured.
+In {{stack}}, to create and manage SLOs, you need an [appropriate license](https://www.elastic.co/subscriptions), an {{es}} cluster with both `transform` and `ingest` [node roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles) present, and [SLO access](../../solutions/observability/incident-management/configure-service-level-objective-slo-access.md) configured.
::::


@@ -64,7 +64,7 @@ It’s common for SLO problems to arise when there are underlying problems in th

### No transform or ingest nodes [slo-no-transform-ingest-node]

-Because SLOs depend on both [ingest pipelines](../../manage-data/ingest/transform-enrich/ingest-pipelines.md) and [transforms](../../explore-analyze/transforms.md) to process the data, it’s essential to ensure that the cluster has nodes with the appropriate [roles](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/configuration-reference/node-settings.md#node-roles).
+Because SLOs depend on both [ingest pipelines](../../manage-data/ingest/transform-enrich/ingest-pipelines.md) and [transforms](../../explore-analyze/transforms.md) to process the data, it’s essential to ensure that the cluster has nodes with the appropriate [roles](elasticsearch://reference/elasticsearch/configuration-reference/node-settings.md#node-roles).

Ensure the cluster includes one or more nodes with both `ingest` and `transform` roles to support the data processing and transformations required for SLOs to function properly. The roles can exist on the same node or be distributed across separate nodes.

diff --git a/troubleshoot/security/detection-rules.md b/troubleshoot/security/detection-rules.md
index 9ec830ba3..5194aa237 100644
--- a/troubleshoot/security/detection-rules.md
+++ b/troubleshoot/security/detection-rules.md
@@ -53,7 +53,7 @@ If you receive the following rule failure: `"An error occurred during rule execu

::::{dropdown} Indicator match rules are failing because the `maxClauseCount` limit is too low
:name: IM-rule-heap-memory

-If you receive the following rule failure: `Bulk Indexing of signals failed: index: ".index-name" reason: "maxClauseCount is set to 1024" type: "too_many_clauses"`, this indicates that the limit for the total number of clauses that a query tree can have is too low. To update your maximum clause count, [increase the size of your {{es}} JVM heap memory](asciidocalypse://docs/elasticsearch/docs/reference/elasticsearch/jvm-settings.md#set-jvm-heap-size). 1 GB of {{es}} JVM heap size or more is sufficient. 
+If you receive the following rule failure: `Bulk Indexing of signals failed: index: ".index-name" reason: "maxClauseCount is set to 1024" type: "too_many_clauses"`, this indicates that the limit for the total number of clauses that a query tree can have is too low. To update your maximum clause count, [increase the size of your {{es}} JVM heap memory](elasticsearch://reference/elasticsearch/jvm-settings.md#set-jvm-heap-size). A JVM heap of 1 GB or more is sufficient.
::::
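For the heap-size change referenced in this final hunk, a self-managed node can have its heap pinned with a custom options file. The following is a sketch under stated assumptions: it presumes a package-based install where the configuration lives in `/etc/elasticsearch`, and the `heap.options` file name is arbitrary.

```sh
# Create a JVM options override; Elasticsearch reads every file in jvm.options.d.
# Set the minimum and maximum heap to the same value (1 GB here).
cat <<'EOF' | sudo tee /etc/elasticsearch/jvm.options.d/heap.options
-Xms1g
-Xmx1g
EOF

# Restart the node so the new heap size takes effect.
sudo systemctl restart elasticsearch.service
```

On {{ecloud}} and {{ece}} deployments the heap is managed for you and scales with the instance size, so resize the deployment instead of editing JVM options.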