arangodb · Simran-B · May 15, 2025 · Jun 5, 2025 · Jun 5, 2025 · Jun 5, 2025
diff --git a/site/content/3.12/about-arangodb/features/community-edition.md b/site/content/3.12/about-arangodb/features/community-edition.md
@@ -149,7 +149,7 @@ see [arangodb.com/community-server/](https://www.arangodb.com/community-server/)
 {{% /comment %}}
 
 {{% comment %}} Experimental feature
-- [**Vector search**](#TODO):
+- [**Vector search**](../../index-and-search/indexing/working-with-indexes/vector-indexes.md):
   Find items with similar properties by comparing vector embeddings generated by
   machine learning models.
 {{% /comment %}}

diff --git a/site/content/3.12/aql/functions/vector.md b/site/content/3.12/aql/functions/vector.md
@@ -0,0 +1,193 @@
+---
+title: Vector search functions in AQL
+menuTitle: Vector
+weight: 60
+description: >-
+  The functions for vector search let you quickly find semantically similar
+  documents utilizing indexed vector embeddings
+---
+<small>Introduced in: v3.12.4</small>
+
+To use vector search, you need to have vector embeddings stored in documents
+and the attribute that stores them needs to be indexed by a
+[vector index](../../index-and-search/indexing/working-with-indexes/vector-indexes.md).
+
+You can calculate vector embeddings using [ArangoDB's GraphML](../../data-science/arangographml/_index.md)
+capabilities (available in ArangoGraph) or using external tools.
+
+{{< warning >}}
+The vector index is an experimental feature that you need to enable for the
+ArangoDB server with the `--experimental-vector-index` startup option.
+Once enabled for a deployment, it cannot be disabled anymore because it
+permanently changes how the data is managed by the RocksDB storage engine
+(it adds an additional column family).
+
+To restore a dump that contains vector indexes, the `--experimental-vector-index`
+startup option needs to be enabled on the deployment you want to restore to.
+{{< /warning >}}
+
+## Vector similarity functions
+
+In order to utilize a vector index, you need to do the following in an AQL query:
+
+- Use one of the following vector similarity functions in a query.
+- `SORT` by the similarity so that the most similar documents come first.
+- Specify the maximum number of documents to retrieve with a `LIMIT` operation.
+
+As a result, you get up to the specified number of documents whose vector embeddings
+are the most similar to the reference vector embedding you provided in the query,
+as approximated by the vector index.
+
+Example:
+
+```aql
+FOR doc IN coll
+  SORT APPROX_NEAR_COSINE(doc.vector, @q) DESC
+  LIMIT 5
+  RETURN doc
+```
+
+For this query, a vector index over the `vector` attribute and with the `cosine`
+metric is required. The `@q` bind variable needs to be a vector (array of numbers)
+with the dimension as specified in the vector index. It defines the point at
+which to look for similar documents (up to `5` in this case). How many documents can
+be found depends on the data as well as the search effort (see the `nProbe` option).
+
+{{< info >}}
+- If there is more than one suitable vector index over the same attribute, it is
+  undefined which one is selected.
+- You cannot have any `FILTER` operation between `FOR` and `LIMIT` for
+  pre-filtering.
+{{< /info >}}
+
+### APPROX_NEAR_COSINE()
+
+`APPROX_NEAR_COSINE(vector1, vector2, options) → similarity`
+
+Retrieve the approximate angular similarity using the cosine metric, accelerated
+by a matching vector index.
+
+The higher the cosine similarity value is, the more similar the two vectors
+are. The closer it is to 0, the more different they are. The value can also
+be negative, indicating that the vectors are not similar and point in opposite
+directions. You need to sort in descending order so that the most similar
+documents come first, which is what a vector index using the `cosine` metric
+can provide.
+
+- **vector1** (array of numbers): The first vector. Either this parameter or
+  `vector2` needs to reference an attribute of a stored document that holds
+  the vector embedding, like `doc.vector`.
+- **vector2** (array of numbers): The second vector. Either this parameter or
+  `vector1` needs to reference a stored attribute holding the vector embedding.
+- **options** (object, _optional_):
+  - **nProbe** (number, _optional_): How many neighboring centroids (that is,
+    the closest Voronoi cells) to consider for the search results. The larger the number,
+    the slower the search but the better the search results. If not specified, the
+    `defaultNProbe` value of the vector index is used.
+- returns **similarity** (number): The approximate angular similarity between
+  both vectors.
+
+**Examples**
+
+Return up to `10` similar documents based on their closeness to the vector
+`@q` according to the cosine metric:
+
+```aql
+FOR doc IN coll
+  SORT APPROX_NEAR_COSINE(doc.vector, @q) DESC
+  LIMIT 10
+  RETURN doc
+```
+
+Return up to `5` similar documents as well as the similarity value,
+considering `20` neighboring centroids (the closest Voronoi cells):
+
+```aql
+FOR doc IN coll
+  LET similarity = APPROX_NEAR_COSINE(doc.vector, @q, { nProbe: 20 })
+  SORT similarity DESC
+  LIMIT 5
+  RETURN MERGE( { similarity }, doc)
+```
+
+Return the similarity value and the document keys of up to `3` similar documents
+for multiple input vectors using a subquery. In this example, the input vectors
+are taken from ten arbitrary documents of the same collection:
+
+```aql
+FOR docOuter IN coll
+  LIMIT 10
+  LET neighbors = (
+    FOR docInner IN coll
+      LET similarity = APPROX_NEAR_COSINE(docInner.vector, docOuter.vector)
+      SORT similarity DESC
+      LIMIT 3
+      RETURN { key: docInner._key, similarity }
+  )
+  RETURN { key: docOuter._key, neighbors }
+```
+
+### APPROX_NEAR_L2()
+
+`APPROX_NEAR_L2(vector1, vector2, options) → similarity`
+
+Retrieve the approximate distance using the L2 (Euclidean) metric, accelerated
+by a matching vector index.
+
+The closer the distance is to 0, the more similar the two vectors are. The higher
+the value, the more different they are. You need to sort in ascending order
+so that the most similar documents come first, which is what a vector index using
+the `l2` metric can provide.
+
+- **vector1** (array of numbers): The first vector. Either this parameter or
+  `vector2` needs to reference an attribute of a stored document that holds
+  the vector embedding, like `doc.vector`.
+- **vector2** (array of numbers): The second vector. Either this parameter or
+  `vector1` needs to reference a stored attribute holding the vector embedding.
+- **options** (object, _optional_):
+  - **nProbe** (number, _optional_): How many neighboring centroids to consider
+    for the search results. The larger the number, the slower the search but the
+    better the search results. If not specified, the `defaultNProbe` value of
+    the vector index is used.
+- returns **similarity** (number): The approximate L2 (Euclidean) distance between
+  both vectors.
+
+**Examples**
+
+Return up to `10` similar documents based on their closeness to the vector
+`@q` according to the L2 (Euclidean) metric:
+
+```aql
+FOR doc IN coll
+  SORT APPROX_NEAR_L2(doc.vector, @q)
+  LIMIT 10
+  RETURN doc
+```
+
+Return up to `5` similar documents as well as the similarity value,
+considering `20` neighboring centroids (the closest Voronoi cells):
+
+```aql
+FOR doc IN coll
+  LET similarity = APPROX_NEAR_L2(doc.vector, @q, { nProbe: 20 })
+  SORT similarity
+  LIMIT 5
+  RETURN MERGE( { similarity }, doc)
+```
+
+Return the similarity value and the document keys of up to `3` similar documents
+for multiple input vectors using a subquery. In this example, the input vectors
+are taken from ten arbitrary documents of the same collection:
+
+```aql
+FOR docOuter IN coll
+  LIMIT 10
+  LET neighbors = (
+    FOR docInner IN coll
+      LET similarity = APPROX_NEAR_L2(docInner.vector, docOuter.vector)
+      SORT similarity
+      LIMIT 3
+      RETURN { key: docInner._key, similarity }
+  )
+  RETURN { key: docOuter._key, neighbors }
+```
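
The two metrics documented in `vector.md` need opposite sort directions: cosine similarity is sorted descending, L2 distance ascending. The following is a minimal Python sketch of that difference, assuming an exact brute-force comparison over a small in-memory list. It is not how the vector index works internally (the index only approximates these values), and the helper names `cosine_similarity`, `l2_distance`, and `top_k` as well as the sample data are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Exact cosine similarity: higher means more similar, range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def l2_distance(a, b):
    """Exact Euclidean distance: lower means more similar, minimum 0."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(docs, query, k, metric):
    """Brute-force stand-in for SORT ... LIMIT k (hypothetical helper)."""
    if metric == "cosine":
        scored = [(cosine_similarity(d["vector"], query), d["_key"]) for d in docs]
        scored.sort(key=lambda pair: pair[0], reverse=True)  # like SORT ... DESC
    elif metric == "l2":
        scored = [(l2_distance(d["vector"], query), d["_key"]) for d in docs]
        scored.sort(key=lambda pair: pair[0])  # like SORT ... (ascending)
    else:
        raise ValueError(f"unsupported metric: {metric}")
    return [key for _, key in scored[:k]]

# Illustrative sample data, not taken from the documentation.
docs = [
    {"_key": "a", "vector": [1.0, 0.0]},
    {"_key": "b", "vector": [0.0, 1.0]},
    {"_key": "c", "vector": [-1.0, 0.0]},
    {"_key": "d", "vector": [3.0, 0.5]},
]
query = [1.0, 0.1]

print(top_k(docs, query, 2, "cosine"))
print(top_k(docs, query, 2, "l2"))
```

With this sample data, the two metrics rank the documents differently: the cosine metric only looks at the angle and prefers `d`, whereas the L2 metric also takes the magnitude into account and prefers `a`. This is why the metric of the vector index has to match the function used in the query.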
diff --git a/site/content/3.12/develop/http-api/indexes/_index.md b/site/content/3.12/develop/http-api/indexes/_index.md
@@ -247,9 +247,9 @@ paths:
         `cacheEnabled` defaults to `false` and should only be used for indexes that
         are known to benefit from an extra layer of caching.
 
-        The optional attribute **inBackground** can be set to `true` to create the index
-        in the background, which will not write-lock the underlying collection for
-        as long as if the index is built in the foreground.
+        The optional attribute **inBackground** can be set to `true` to keep the
+        collection/shards available for write operations by not using an exclusive
+        write lock for the duration of the index creation.
       parameters:
         - name: database-name
           in: path

diff --git a/site/content/3.12/develop/http-api/indexes/fulltext.md b/site/content/3.12/develop/http-api/indexes/fulltext.md
@@ -49,6 +49,7 @@ paths:
                   description: |
                     Must be equal to `"fulltext"`.
                   type: string
+                  example: fulltext
                 name:
                   description: |
                     An easy-to-remember name for the index to look it up or refer to it in index hints.
@@ -58,9 +59,10 @@ paths:
                   type: string
                 fields:
                   description: |
-                    an array of attribute names. Currently, the array is limited
-                    to exactly one attribute.
+                    A list with exactly one attribute path.
                   type: array
+                  minItems: 1
+                  maxItems: 1
                   items:
                     type: string
                 minLength:
@@ -71,22 +73,21 @@ paths:
                   type: integer
                 inBackground:
                   description: |
-                    You can set this option to `true` to create the index
-                    in the background, which will not write-lock the underlying collection for
-                    as long as if the index is built in the foreground. The default value is `false`.
+                    Set this option to `true` to keep the collection/shards available for
+                    write operations by not using an exclusive write lock for the duration
+                    of the index creation.
                   type: boolean
+                  default: false
       responses:
         '200':
           description: |
-            If the index already exists, then a *HTTP 200* is
-            returned.
+            The index exists already.
         '201':
           description: |
-            If the index does not already exist and could be created, then a *HTTP 201*
-            is returned.
+            The index is created as there is no such existing index.
         '404':
           description: |
-            If the `collection-name` is unknown, then a *HTTP 404* is returned.
+            The collection is unknown.
       tags:
         - Indexes
 ```

diff --git a/site/content/3.12/develop/http-api/indexes/geo-spatial.md b/site/content/3.12/develop/http-api/indexes/geo-spatial.md
@@ -13,7 +13,7 @@ paths:
       operationId: createIndexGeo
       description: |
         Creates a geo-spatial index in the collection `collection`, if
-        it does not already exist. Expects an object containing the index details.
+        it does not already exist.
 
         Geo indexes are always sparse, meaning that documents that do not contain
         the index attributes or have non-numeric values in the index attributes
@@ -47,6 +47,7 @@ paths:
                   description: |
                     Must be equal to `"geo"`.
                   type: string
+                  example: geo
                 name:
                   description: |
                     An easy-to-remember name for the index to look it up or refer to it in index hints.
@@ -72,6 +73,8 @@ paths:
                     All documents which do not have the attribute paths or which have
                     values that are not suitable are ignored.
                   type: array
+                  minItems: 1
+                  maxItems: 2
                   items:
                     type: string
                 geoJson:
@@ -98,21 +101,21 @@ paths:
                   type: boolean
                 inBackground:
                   description: |
-                    You can set this option to `true` to create the index
-                    in the background, which will not write-lock the underlying collection for
-                    as long as if the index is built in the foreground. The default value is `false`.
+                    Set this option to `true` to keep the collection/shards available for
+                    write operations by not using an exclusive write lock for the duration
+                    of the index creation.
                   type: boolean
+                  default: false
       responses:
         '200':
           description: |
-            If the index already exists, then a *HTTP 200* is returned.
+            The index exists already.
         '201':
           description: |
-            If the index does not already exist and could be created, then a *HTTP 201*
-            is returned.
+            The index is created as there is no such existing index.
         '404':
           description: |
-            If the `collection` is unknown, then a *HTTP 404* is returned.
+            The collection is unknown.
       tags:
         - Indexes
 ```

diff --git a/site/content/3.12/develop/http-api/indexes/inverted.md b/site/content/3.12/develop/http-api/indexes/inverted.md
@@ -44,6 +44,7 @@ paths:
                   description: |
                     Must be equal to `"inverted"`.
                   type: string
+                  example: inverted
                 name:
                   description: |
                     An easy-to-remember name for the index to look it up or refer to it in index hints.
@@ -57,6 +58,7 @@ paths:
                     default options, or objects to specify options for the fields (with the
                     attribute path in the `name` property), or a mix of both.
                   type: array
+                  minItems: 1
                   items:
                     type: object
                     required:
@@ -487,10 +489,11 @@ paths:
                   type: integer
                 inBackground:
                   description: |
-                    This attribute can be set to `true` to create the index
-                    in the background, not write-locking the underlying collection for
-                    as long as if the index is built in the foreground. The default value is `false`.
+                    Set this option to `true` to keep the collection/shards available for
+                    write operations by not using an exclusive write lock for the duration
+                    of the index creation.
                   type: boolean
+                  default: false
                 cleanupIntervalStep:
                   description: |
                     Wait at least this many commits between removing unused files in the
@@ -625,14 +628,13 @@ paths:
       responses:
         '200':
           description: |
-            If the index already exists, then a *HTTP 200* is returned.
+            The index exists already.
         '201':
           description: |
-            If the index does not already exist and can be created, then a *HTTP 201*
-            is returned.
+            The index is created as there is no such existing index.
         '404':
           description: |
-            If the `collection-name` is unknown, then a *HTTP 404* is returned.
+            The collection is unknown.
       tags:
         - Indexes
 ```
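
The `minItems`/`maxItems` constraints added to the `fields` arrays in these hunks can be mirrored on the client side before sending a request. The following is a minimal Python sketch under that assumption. The helpers `check_fields` and `build_index_body` are hypothetical and not part of ArangoDB or any driver. The server remains the authority on what it accepts.

```python
# Limits as stated in the schema changes of this diff:
# fulltext: exactly one attribute path, geo: one or two, inverted: at least one.
FIELD_LIMITS = {
    "fulltext": (1, 1),
    "geo": (1, 2),
    "inverted": (1, None),  # None means no upper bound in the schema
}

def check_fields(index_type, fields):
    """Return True if the number of fields fits the documented limits."""
    min_items, max_items = FIELD_LIMITS[index_type]
    count = len(fields)
    if count < min_items:
        return False
    if max_items is not None and count > max_items:
        return False
    return True

def build_index_body(index_type, fields, in_background=False):
    """Assemble a request body; inBackground defaults to false as documented."""
    if not check_fields(index_type, fields):
        raise ValueError(f"invalid number of fields for a {index_type} index")
    return {"type": index_type, "fields": fields, "inBackground": in_background}

print(build_index_body("geo", ["lat", "lng"], in_background=True))
```

Setting `in_background=True` corresponds to the reworded `inBackground` description: the collection or shards stay available for write operations while the index is being created.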