[pull] main from swirlai:main #97

Merged
merged 2 commits into from
Feb 25, 2025
5 changes: 5 additions & 0 deletions docs/AI-Connect.md
@@ -470,6 +470,8 @@ The following SearchProvider configuration is recommended for public source data

```

To obtain a Diffbot token, sign up here: <https://www.diffbot.com/>

If you prefer not to use Diffbot, the following configuration is recommended:

```
@@ -497,6 +499,9 @@ If you prefer not to use Diffbot, the following configuration is recommended:
},
```

{: .highlight }
Consult the [SearchProvider Guide](SP-Guide.md#activating-a-google-programmable-search-engine-pse-searchprovider) for more information about the Google PSE SearchProvider.

### Notes

* The `cache` parameter is set to "false" by default as of Release 3.0.
2 changes: 1 addition & 1 deletion docs/Quick-Start.md
@@ -128,7 +128,7 @@ The Docker version of SWIRL AI Connect Community Edition does *not* retain any d
## Notes

{: .highlight }
SWIRL includes SearchProviders for Google Web (via their Programmable Search Engine offering), Arxiv.org, European PMC, Google News and SWIRL Documentation to get you up and running right away. The credentials for the Google Cloud API are shared with the SWIRL Community for this purpose.
SWIRL includes active SearchProviders for Arxiv.org, European PMC and Google News that will work "out of the box" so long as internet access is available. There are also inactive providers for Google Web and SWIRL Documentation that use the Google Programmable Search Engine (PSE). These services require a Google API key. Consult the [SearchProvider Guide](SP-Guide.md#activating-a-google-programmable-search-engine-pse-searchprovider) for more information.

{: .highlight }
Using SWIRL with Microsoft 365 requires installation and approval by an authorized company Administrator. For more information, please review the [M365 Guide](M365-Guide.html) or contact support as noted below.
51 changes: 48 additions & 3 deletions docs/SP-Guide.md
@@ -25,7 +25,7 @@ SearchProviders are the essential element of SWIRL. They make it quick and easy
SearchProviders are JSON objects. SWIRL's distribution comes preloaded with a variety of configurations for sources like Elastic, Solr, PostgreSQL, BigQuery, NLResearch.com, Miro.com, Atlassian, and more.

{: .highlight }
SWIRL includes a Google's Programmable Search Engines SearchProvider with live credentials so you can use SWIRL on web data right away. The credentials for these are shared with the SWIRL Community. The EuropePMC SearchProvider, Arxiv.org and the SWIRL documentation are also enabled for search by default; no credentials are required for any of those sources.
SWIRL includes active SearchProviders for Arxiv.org, European PMC and Google News that will work "out of the box" so long as internet access is available. There are also inactive providers for Google Web and SWIRL Documentation that use the Google Programmable Search Engine (PSE). These services require a Google API key. Consult the [SearchProvider Guide](#activating-a-google-programmable-search-engine-pse-searchprovider) for more information.

[SearchProvider Example JSON](https://github.com/swirlai/swirl-search/tree/main/SearchProviders)

@@ -47,7 +47,7 @@ SWIRL includes a Google's Programmable Search Engines SearchProvider with live c
| funding_db_sqlite3.json | SQLite3 funding database | [Funding Dataset](Developer-Reference.html#funding-data-set) |
| github.json | Searches public repositories for Code, Commits, Issues, and Pull Requests | Requires a bearer token |
| google_news.json | Searches the [Google News](https://news.google.com/) feed | No authorization required |
| google_pse.json | Five Google Programmable Search Engines (PSE) | Includes shared SWIRL credentials; may return a 429 error if overused |
| google_pse.json | Search the web, SWIRL documentation and more | Uses Google PSE, requires a valid Google API key |
| hacker_news.json | Queries a [searchable version](https://hn.algolia.com/) of the Hacker News feeds | No authorization required |
| http_get_with_auth.json | Generic HTTP GET query with basic authentication | Requires url, credentials |
| http_post_with_auth.json | Generic HTTP POST query with basic authentication | Requires url, credentials |
@@ -86,6 +86,51 @@ Click the `PUT` button to save the change. You can use the `HTML Form` at the bo

![picture of the SearchProvider endpoint HTML form](images/swirl_sp_html_form.png)

## Activating a Google Programmable Search Engine (PSE) SearchProvider

SWIRL includes an inactive Google PSE configuration that can be used to search the web or almost any "slice" of it.
PSE is not free, and it requires a valid Google API key to operate.

To create a Google PSE:
* <https://programmablesearchengine.google.com/about/> - click "Get Started" and log in with your Google account

To create a Google API key:
* <https://developers.google.com/custom-search/v1/overview>
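
Before entering the two values into SWIRL, it can help to confirm that the PSE ID and API key work together. The sketch below only builds a test URL for Google's Custom Search JSON API, which you can open in a browser or fetch with any HTTP client. The endpoint and the `key`, `cx` and `q` parameters come from Google's API documentation rather than from this guide, and `build_pse_test_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API (not part of SWIRL itself).
CUSTOM_SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_pse_test_url(api_key: str, cx: str, query: str = "test") -> str:
    """Return a URL that runs one test query against a Google PSE.

    api_key: the Google API key created above.
    cx: the ID of the Programmable Search Engine created above.
    """
    if not api_key or not cx:
        raise ValueError("both api_key and cx are required")
    # urlencode escapes the values, so spaces and symbols in the query are safe.
    params = {"key": api_key, "cx": cx, "q": query}
    return f"{CUSTOM_SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values; substitute your own key and PSE ID.
    print(build_pse_test_url("YOUR_API_KEY", "YOUR_PSE_ID", "swirl search"))
```

If the URL returns a JSON document with search results, the same key and ID should be usable in the SearchProvider fields described next.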

To activate the SearchProvider:
* [Edit the Google PSE provider](#editing)

* Change:

``` json
"active": false
```

to

``` json
"active": true
```

Or use the HTML form at the bottom of the page.

* Edit the `query_mappings` field and enter the ID (the `cx` parameter) of a valid Google PSE you created:

``` json
"query_mappings": "cx=<your-Google-PSE-id>",
```


* Edit the `credentials` field and enter your Google API key, keeping the `key=` prefix as shown:

``` json
"credentials": "key=<your-Google-API-key>",
```

* Click the `PUT` button to save the change.

* Reload SWIRL Galaxy to see the new source appear in the source selector.
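
The three field edits above can also be sketched as a small script that works on a SearchProvider loaded as a Python dictionary. This is a minimal illustration, not SWIRL's own tooling: `activate_google_pse` is a hypothetical helper, the example record is a trimmed-down stand-in for the real configuration, and the sketch assumes `query_mappings` holds only the `cx=` entry, as shown above.

```python
import copy
import json


def activate_google_pse(provider: dict, cx: str, api_key: str) -> dict:
    """Return a copy of a SearchProvider dict with the three edits applied."""
    if not cx or not api_key:
        raise ValueError("cx and api_key must be non-empty")
    updated = copy.deepcopy(provider)  # leave the caller's dict untouched
    updated["active"] = True
    # Assumption: the field contains only the cx mapping, so it is overwritten.
    updated["query_mappings"] = f"cx={cx}"
    # The key= prefix is required, as noted in the steps above.
    updated["credentials"] = f"key={api_key}"
    return updated


if __name__ == "__main__":
    # Hypothetical, trimmed-down provider record for illustration only.
    example = {
        "name": "Example Google PSE (hypothetical)",
        "active": False,
        "query_mappings": "cx=",
        "credentials": "key=",
    }
    print(json.dumps(activate_google_pse(example, "my-pse-id", "my-api-key"), indent=2))
```

The resulting JSON can then be pasted into the SearchProvider form and saved with the `PUT` button, exactly as in the manual steps.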

## Copy/Paste Install

If you have the JSON for a SearchProvider, you can copy and paste it into the form at the bottom of the SearchProvider endpoint.
@@ -428,4 +473,4 @@ The PAYLOAD is a JSON list structure that can hold arbitrary data structures. Th
Once the fields are mapped the way you want them, add this directive to the `result_mappings` so that you only get back what you want.

{: .highlight }
To use `NO_PAYLOAD` most effectively, send your first query to a SearchProvider *without it* to see what you get back in the PAYLOAD.
To use `NO_PAYLOAD` most effectively, send your first query to a SearchProvider *without it* to see what you get back in the PAYLOAD.