[Search] Update/consolidate ingest section, move search pipelines content (#326)

leemthompo authored Feb 5, 2025
1 parent 368fd9c commit 1305342
Showing 6 changed files with 52 additions and 85 deletions.
4 changes: 2 additions & 2 deletions explore-analyze/machine-learning/nlp/inference-processing.md
@@ -5,7 +5,7 @@ mapped_pages:

# Inference processing [ingest-pipeline-search-inference]

When you create an index through the **Content** UI, a set of default ingest pipelines is also created, including an ML inference pipeline. The [ML inference pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-ml-reference) uses inference processors to analyze fields and enrich documents with the output. Inference processors use trained ML models, so you need to use a built-in model or [deploy a trained model in your cluster^](ml-nlp-deploy-models.md) to use this feature.
When you create an index through the **Content** UI, a set of default ingest pipelines is also created, including an ML inference pipeline. The [ML inference pipeline](/solutions/search/search-pipelines.md#ingest-pipeline-search-details-specific-ml-reference) uses inference processors to analyze fields and enrich documents with the output. Inference processors use trained ML models, so you need to use a built-in model or [deploy a trained model in your cluster^](ml-nlp-deploy-models.md) to use this feature.
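
For illustration only, a pipeline along these lines can also be created by hand with the Python {{es}} client. This is a minimal sketch, not the default pipeline the **Content** UI generates: the pipeline ID, model ID, and field names below are placeholders.

```python
# Sketch only: a hand-made ingest pipeline with an inference processor.
# Pipeline ID, model ID, and field names are placeholders, not the
# defaults generated by the Content UI.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust endpoint and auth for your deployment

es.ingest.put_pipeline(
    id="my-index@ml-inference",  # hypothetical pipeline name
    description="Enrich documents with the output of a trained model",
    processors=[
        {
            "inference": {
                "model_id": ".elser_model_2",  # any built-in or deployed model ID
                "input_output": [
                    {
                        "input_field": "body_content",            # field the model analyzes
                        "output_field": "body_content_expanded",  # field that stores the model output
                    }
                ],
            }
        }
    ],
)
```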

This guide focuses on the ML inference pipeline, its use, and how to manage it.

@@ -129,7 +129,7 @@ To ensure the ML inference pipeline will be run when ingesting documents, you mu

## Learn More [ingest-pipeline-search-inference-learn-more]

* See [Overview](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-in-enterprise-search) for information on the various pipelines that are created.
* See [Overview](/solutions/search/search-pipelines.md#ingest-pipeline-search-in-enterprise-search) for information on the various pipelines that are created.
* Learn about [ELSER](ml-nlp-elser.md), Elastic’s proprietary retrieval model for semantic search with sparse vectors.
* [NER HuggingFace Models](https://huggingface.co/models?library=pytorch&pipeline_tag=token-classification&sort=downloads)
* [Text Classification HuggingFace Models](https://huggingface.co/models?library=pytorch&pipeline_tag=text-classification&sort=downloads)

This file was deleted.

2 changes: 0 additions & 2 deletions raw-migrated-files/toc.yml
@@ -603,7 +603,6 @@ toc:
- file: elasticsearch/elasticsearch-reference/document-level-security.md
- file: elasticsearch/elasticsearch-reference/documents-indices.md
- file: elasticsearch/elasticsearch-reference/elasticsearch-intro-deploy.md
- file: elasticsearch/elasticsearch-reference/es-ingestion-overview.md
- file: elasticsearch/elasticsearch-reference/es-security-principles.md
- file: elasticsearch/elasticsearch-reference/esql-using.md
- file: elasticsearch/elasticsearch-reference/field-and-document-access-control.md
@@ -618,7 +617,6 @@
- file: elasticsearch/elasticsearch-reference/index-modules-analysis.md
- file: elasticsearch/elasticsearch-reference/index-modules-mapper.md
- file: elasticsearch/elasticsearch-reference/ingest-enriching-data.md
- file: elasticsearch/elasticsearch-reference/ingest-pipeline-search.md
- file: elasticsearch/elasticsearch-reference/ingest.md
- file: elasticsearch/elasticsearch-reference/install-elasticsearch.md
- file: elasticsearch/elasticsearch-reference/ip-filtering.md
46 changes: 28 additions & 18 deletions solutions/search/ingest-for-search.md
@@ -6,35 +6,45 @@ mapped_urls:
- https://www.elastic.co/guide/en/serverless/current/elasticsearch-ingest-your-data.html
---

# Ingest for search
# Ingest for search use cases

% What needs to be done: Lift-and-shift
% ----
% navigation_title: "Ingest for search use cases"
% ----

% Scope notes: guidance on what ingest options you might want to use for search - connectors, crawler ...
$$$elasticsearch-ingest-time-series-data$$$
::::{note}
This page covers ingest methods specifically for search use cases. If you're working with a different use case, refer to the [ingestion overview](/manage-data/ingest.md) for more options.
::::

% Use migrated content from existing pages that map to this page:
Search use cases usually focus on general **content**, typically text-heavy data that does not have a timestamp. This could be data like knowledge bases, website content, product catalogs, and more.

% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md
% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-data-through-api.md
% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md
% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-your-data.md
Once you've decided how to [deploy Elastic](/deploy-manage/index.md), the next step is getting your content into {{es}}. Your choice of ingestion method depends on where your content lives and how you need to access it.

% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc):
There are several methods to ingest data into {{es}} for search use cases. Choose one or more based on your requirements.

$$$elasticsearch-ingest-time-series-data$$$
::::{tip}
If you just want to do a quick test, you can load [sample data](/manage-data/ingest/sample-data.md) into your {{es}} cluster using the UI.
::::

## Use APIs [es-ingestion-overview-apis]

$$$ingest-pipeline-search-details-specific-ml-reference$$$
You can use the [`_bulk` API](https://www.elastic.co/docs/api/doc/elasticsearch/v8/group/endpoint-document) to add data to your {{es}} indices, using any HTTP client, including the [{{es}} client libraries](/solutions/search/site-or-app/clients.md).
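
As a minimal sketch, bulk ingestion with the Python client's bulk helper could look like the following; the index name and document fields are illustrative, not part of any default setup.

```python
# Sketch only: add a few documents with the _bulk API via the Python
# client's bulk helper. Index name and fields are illustrative.
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")  # adjust endpoint and auth for your deployment

actions = [
    {
        "_index": "knowledge-base",
        "_source": {"title": "Returns policy", "body": "Items can be returned within 30 days."},
    },
    {
        "_index": "knowledge-base",
        "_source": {"title": "Shipping", "body": "Orders ship within two business days."},
    },
]

# bulk() batches the actions into _bulk requests and returns a summary
success, errors = bulk(es, actions)
print(f"Indexed {success} documents with {len(errors)} errors")
```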

$$$ingest-pipeline-search-in-enterprise-search$$$
While the {{es}} APIs can be used for any data type, Elastic provides specialized tools that optimize ingestion for specific use cases.

$$$ingest-pipeline-search-details-generic-reference$$$
## Use specialized tools [es-ingestion-overview-general-content]

$$$ingest-pipeline-search-details-specific-custom-reference$$$
You can use these specialized tools to add general content to {{es}} indices.

$$$ingest-pipeline-search-details-specific-reference-processors$$$
| Method | Description | Notes |
|--------|-------------|-------|
| [**Web crawler**](https://github.com/elastic/crawler) | Programmatically discover and index content from websites and knowledge bases | Crawl public-facing web content or internal sites accessible via HTTP proxy |
| [**Search connectors**]() | Third-party integrations to popular content sources like databases, cloud storage, and business applications | Choose from a range of Elastic-built connectors or build your own in Python using the Elastic connector framework |
| [**File upload**](/manage-data/ingest/tools/upload-data-files.md) | One-off manual uploads through the UI | Useful for testing or very small-scale use cases, but not recommended for production workflows |

$$$ingest-pipeline-search-details-specific$$$
### Process data at ingest time

$$$ingest-pipeline-search-pipeline-settings-using-the-api$$$
You can also transform and enrich your content at ingest time using [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md).
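
As a sketch, an ingest pipeline that cleans up web content before indexing could look like this; the pipeline, index, and field names are illustrative assumptions.

```python
# Sketch only: transform and enrich documents at ingest time.
# Pipeline, index, and field names are illustrative.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust endpoint and auth for your deployment

es.ingest.put_pipeline(
    id="clean-web-content",
    processors=[
        {"html_strip": {"field": "body"}},  # strip HTML tags from the body field
        {"trim": {"field": "body"}},        # remove leading/trailing whitespace
        {"set": {"field": "ingested_at", "value": "{{_ingest.timestamp}}"}},  # record ingest time
    ],
)

# Route a document through the pipeline at index time
es.index(
    index="website-content",
    document={"title": "About us", "body": "<p>  We build search experiences.  </p>"},
    pipeline="clean-web-content",
)
```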

$$$ingest-pipeline-search-pipeline-settings$$$
The Elastic UI has a set of tools for creating and managing indices optimized for search use cases. You can also manage your ingest pipelines in this UI. Learn more in [](search-pipelines.md).
