diff --git a/fern/assets/images/1d24fd7-Screenshot_2024-07-01_at_10.33.04_AM.png b/fern/assets/images/1d24fd7-Screenshot_2024-07-01_at_10.33.04_AM.png new file mode 100644 index 000000000..c9d7d12cf Binary files /dev/null and b/fern/assets/images/1d24fd7-Screenshot_2024-07-01_at_10.33.04_AM.png differ diff --git a/fern/assets/images/27062e8-Screenshot_2024-07-01_at_10.33.54_AM.png b/fern/assets/images/27062e8-Screenshot_2024-07-01_at_10.33.54_AM.png new file mode 100644 index 000000000..50071ec56 Binary files /dev/null and b/fern/assets/images/27062e8-Screenshot_2024-07-01_at_10.33.54_AM.png differ diff --git a/fern/docs.yml b/fern/docs.yml index a4ac4d08d..fe070dbaa 100644 --- a/fern/docs.yml +++ b/fern/docs.yml @@ -123,25 +123,36 @@ navigation: - page: Rerank path: pages/models/rerank-2.mdx - section: Text Generation - skip-slug: true contents: - page: Using the Chat API path: pages/text-generation/chat-api.mdx - page: Streaming Responses path: pages/text-generation/streaming.mdx + - page: Structured Generations (JSON) + path: pages/text-generation/structured-outputs-json.mdx - page: Predictable Outputs path: pages/text-generation/predictable-outputs.mdx - page: Advanced Generation Parameters path: pages/text-generation/advanced-generation-hyperparameters.mdx - page: Retrieval Augmented Generation (RAG) path: pages/text-generation/retrieval-augmented-generation-rag.mdx + - section: RAG Connectors + contents: + - page: Overview of RAG Connectors + path: pages/text-generation/connectors/overview-1.mdx + - page: Creating and Deploying a Connector + path: pages/text-generation/connectors/creating-and-deploying-a-connector.mdx + - page: Managing your Connector + path: pages/text-generation/connectors/managing-your-connector.mdx + - page: Connector Authentication + path: pages/text-generation/connectors/connector-authentication.mdx + - page: Connector FAQs + path: pages/text-generation/connectors/connector-faqs.mdx - section: Tool Use path: pages/text-generation/tools.mdx - skip-slug: true contents: - section: Multi-step Tool Use (Agents) path: pages/text-generation/tools/multi-step-tool-use.mdx - skip-slug: true contents: - page: Implementing a Multi-Step Agent with Langchain path: pages/text-generation/tools/multi-step-tool-use/implementing-a-multi-step-agent-with-langchain.mdx @@ -179,23 +190,11 @@ navigation: path: pages/text-generation/prompt-engineering/prompt-library/faster-web-search.mdx - page: Multilingual interpreter path: pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx - - section: RAG Connectors - contents: - - page: Overview - path: pages/text-generation/connectors/overview-1.mdx - - page: Creating and Deploying a Connector - path: pages/text-generation/connectors/creating-and-deploying-a-connector.mdx - - page: Managing your Connector - path: pages/text-generation/connectors/managing-your-connector.mdx - - page: Connector Authentication - path: pages/text-generation/connectors/connector-authentication.mdx - - page: Connector FAQs - path: pages/text-generation/connectors/connector-faqs.mdx - page: Migrating from the Generate API to the Chat API path: pages/text-generation/migrating-from-cogenerate-to-cochat.mdx - section: Text Embeddings (Vectors, Search, Retrieval) contents: - - page: Embeddings + - page: Introduction to Embeddings at Cohere path: pages/text-embeddings/embeddings.mdx - page: Batch Embedding Jobs path: pages/text-embeddings/embed-jobs-api.mdx @@ -205,6 +204,8 @@ navigation: path: pages/text-embeddings/reranking/overview.mdx - 
page: Rerank Best Practices path: pages/text-embeddings/reranking/reranking-best-practices.mdx + - page: Text Classification + path: pages/text-embeddings/text-classification-1.mdx - section: Fine-Tuning contents: - page: Introduction @@ -229,7 +230,7 @@ navigation: contents: - page: Preparing the Classify Fine-tuning data path: pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx - - page: Starting the Classify Fine-Tuning + - page: Trains and deploys a fine-tuned model path: pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx - page: Understanding the Classify Fine-tuning Results path: pages/fine-tuning/classify-fine-tuning/classify-understanding-the-results.mdx @@ -269,22 +270,22 @@ navigation: path: pages/integrations/integrations/redis-and-cohere.mdx - page: Haystack and Cohere path: pages/integrations/integrations/haystack-and-cohere.mdx + - page: Pinecone and Cohere + path: pages/integrations/integrations/pinecone-and-cohere.mdx + - page: Weaviate and Cohere + path: pages/integrations/integrations/weaviate-and-cohere.mdx - page: Open Search and Cohere path: pages/integrations/integrations/opensearch-and-cohere.mdx - page: Vespa and Cohere path: pages/integrations/integrations/vespa-and-cohere.mdx - - page: Chroma and Cohere - path: pages/integrations/integrations/chroma-and-cohere.mdx - page: Qdrant and Cohere path: pages/integrations/integrations/qdrant-and-cohere.mdx - - page: Weaviate and Cohere - path: pages/integrations/integrations/weaviate-and-cohere.mdx - - page: Pinecone and Cohere - path: pages/integrations/integrations/pinecone-and-cohere.mdx - page: Milvus and Cohere path: pages/integrations/integrations/milvus-and-cohere.mdx - page: Zilliz and Cohere path: pages/integrations/integrations/zilliz-and-cohere.mdx + - page: Chroma and Cohere + path: pages/integrations/integrations/chroma-and-cohere.mdx - section: LangChain path: pages/integrations/cohere-and-langchain.mdx contents: @@ -334,9 +335,8 @@ navigation: path: pages/responsible-use/responsible-use/generation-benchmarks.mdx - page: Representation Benchmarks path: pages/responsible-use/responsible-use/representation-benchmarks.mdx - - page: Security - # LINK ELSEWHERE - path: pages/responsible-use/security.mdx + - link: Security + href: https://cohere.ai/security - page: Environmental Impact path: pages/responsible-use/environmental-impact.mdx - section: Cohere for AI @@ -587,9 +587,6 @@ navigation: - page: Sending Feedback hidden: true path: pages/text-generation/feedback.mdx - - page: Structured Generations (JSON) - hidden: true - path: pages/text-generation/structured-outputs-json.mdx - page: Book an appointment hidden: true path: pages/text-generation/prompt-engineering/prompt-library/book-an-appointment.mdx diff --git a/fern/openapi/cohere.yaml b/fern/openapi/cohere.yaml index d78496811..75286a529 100644 --- a/fern/openapi/cohere.yaml +++ b/fern/openapi/cohere.yaml @@ -9392,7 +9392,7 @@ components: description: | [BETA] A JSON schema object that the output will adhere to. There are some restrictions we have on the schema, refer to [our guide](/docs/structured-outputs-json#schema-constraints) for more information. 
Example (required name and age object): - ```json + ```json JSON { "type": "object", "properties": { diff --git a/fern/pages/-ARCHIVE-/old-tutorials/semantic-search.mdx b/fern/pages/-ARCHIVE-/old-tutorials/semantic-search.mdx index 203d4640a..f94fe1312 100644 --- a/fern/pages/-ARCHIVE-/old-tutorials/semantic-search.mdx +++ b/fern/pages/-ARCHIVE-/old-tutorials/semantic-search.mdx @@ -36,7 +36,7 @@ You can find the code in the >") @@ -103,7 +103,7 @@ public class ChatPost { go get github.com/cohere-ai/cohere-go/v2 ``` -```go +```go package main import ( diff --git a/fern/pages/cohere-api/errors.mdx b/fern/pages/cohere-api/errors.mdx index 5d96b064e..d18b8aa53 100644 --- a/fern/pages/cohere-api/errors.mdx +++ b/fern/pages/cohere-api/errors.mdx @@ -25,3 +25,32 @@ With a non-2xx response code, the response will be an error object in the follow ``` Here are code examples for how error handling might look in our SDKs: + + +```python PYTHON + try: + response = co.generate( + model='invalid-model', + prompt='sample prompt') + except cohere.CohereError as e: + print(e.message) + print(e.http_status) + print(e.headers) +``` +```javascript JAVASCRIPT +(async () => { + const response = await cohere.generate({model: 'invalid-model'}); + + if (response.statusCode !== 200) { + console.log(response.body.message); + } +})(); +``` +```go GO +_, err := co.Generate(generateOptions) +if err != nil { + fmt.Println(err) + return +} +``` + diff --git a/fern/pages/cookbooks/agent-api-calls.mdx b/fern/pages/cookbooks/agent-api-calls.mdx index d535c4a17..e51ee61d5 100644 --- a/fern/pages/cookbooks/agent-api-calls.mdx +++ b/fern/pages/cookbooks/agent-api-calls.mdx @@ -40,14 +40,14 @@ With this approach, we bring together the best of two worlds: the ability of LLM # Step 1: Setup [#sec_step1] -```python +```python PYTHON # Uncomment if you need to install the following packages # !pip install cohere # !pip install python-dotenv # !pip install pandas ``` -```python +```python PYTHON import os import json import re @@ -60,7 +60,7 @@ from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.tools import tool ``` -```python +```python PYTHON # load the cohere api key os.environ["COHERE_API_KEY"] = getpass.getpass() ``` @@ -70,7 +70,7 @@ os.environ["COHERE_API_KEY"] = getpass.getpass() Here we create a tool which implements the deterministic function to extract alphanumeric strings from the user's query and match them to the right parameter. -```python +```python PYTHON @tool def regex_extractor(user_query: str) -> dict: """Function which, given the query from the user, returns a dictionary parameter:value.""" @@ -94,7 +94,7 @@ regex_extractor.args_schema = extract_code_v1productssearch tools=[regex_extractor] ``` -```python +```python PYTHON # Let's define the preamble for the Agent. # The preamble includes info about: # - the tool the Agent has access to @@ -122,7 +122,7 @@ Search products sport | Search products dress and jumpsuit | [{'taxonomies': ['S """ ``` -```python +```python PYTHON # Define the prompt prompt = ChatPromptTemplate.from_template("{input}") # Define the agent @@ -140,7 +140,7 @@ agent_executor = AgentExecutor(agent=agent, ) ``` -```python +```python PYTHON # finally, let's write a function to convert the Agents output to a json def convert_to_json(string: str) -> json: return json.loads( @@ -155,7 +155,7 @@ def convert_to_json(string: str) -> json: Let's now test the Agent we just defined! 
-```python +```python PYTHON query_1 = "Look for urn:75f2b737-06dd-4399-9206-a6c11b65138e, GLCMS004AGTCAMIS; 0000234GLCMS0100ANORAKCAA, GLCHL000CGUCHALE" response_1 = agent_executor.invoke( { @@ -172,7 +172,7 @@ I will use the regex_extractor tool to extract the codes from the user query. {'tool_name': 'regex_extractor', 'parameters': {'user_query': 'Look for urn:75f2b737-06dd-4399-9206-a6c11b65138e, GLCMS004AGTCAMIS; 0000234GLCMS0100ANORAKCAA, GLCHL000CGUCHALE'}} {'nmgs': ['0000234GLCMS0100ANORAKCAA'], 'objref': ['GLCMS004AGTCAMIS', 'GLCHL000CGUCHALE'], 'urn': ['urn:75f2b737-06dd-4399-9206-a6c11b65138e']}Relevant Documents: 0 Cited Documents: 0 -Answer: ```json +Answer: ```json JSON [     {         "urn": ["urn:75f2b737-06dd-4399-9206-a6c11b65138e"], @@ -181,7 +181,7 @@ Answer: ```json     } ] ``` -Grounded answer: ```json +Grounded answer: ```json JSON [     {         "urn": ["urn:75f2b737-06dd-4399-9206-a6c11b65138e"], @@ -197,7 +197,7 @@ Grounded answer: ```json In the reasoning chain above, we can see that the Agent uses the tool we provided it to extract the strings in the query. The output of the tool is then used to generate the request. -```python +```python PYTHON # let's have a look at the final output convert_to_json(response_1['output']) ``` @@ -210,7 +210,7 @@ convert_to_json(response_1['output']) As mentioned above, the Agent can use the tool when specific alphanumeric patterns have to be extracted from the query; however, it can also generate the output based on its semantic understanding of the query. For example: -```python +```python PYTHON query_2 = "I need tennis products" response_2 = agent_executor.invoke( @@ -228,7 +228,7 @@ I will use the regex_extractor tool to extract the relevant information from the {'tool_name': 'regex_extractor', 'parameters': {'user_query': 'I need tennis products'}} {}Relevant Documents: None Cited Documents: None -Answer: ```json +Answer: ```json JSON [ { "taxonomies": [ @@ -237,7 +237,7 @@ Answer: ```json } ] ``` -Grounded answer: ```json +Grounded answer: ```json JSON [ { "taxonomies": [ @@ -252,7 +252,7 @@ Grounded answer: ```json The Agent runs the tool to check if any target string was in the query, then it generated the request body based on its understanding. -```python +```python PYTHON convert_to_json(response_2['output']) ``` @@ -262,7 +262,7 @@ convert_to_json(response_2['output']) Finally, the two paths to generation - deterministic and semantic - can be applied in parallel by the Agent, as shown below: -```python +```python PYTHON query_3 = "Look for GLBRL0000GACHALE, nmg 0000234GLCZD0000GUREDTOAA and car products" response_3 = agent_executor.invoke( @@ -280,7 +280,7 @@ I will use the regex_extractor tool to extract the codes from the user query. Th {'tool_name': 'regex_extractor', 'parameters': {'user_query': 'Look for GLBRL0000GACHALE, nmg 0000234GLCZD0000GUREDTOAA and car products'}} {'nmgs': ['0000234GLCZD0000GUREDTOAA'], 'objref': ['GLBRL0000GACHALE']}Relevant Documents: 0 Cited Documents: 0 -Answer: ```json +Answer: ```json JSON [     {         "objref": ["GLBRL0000GACHALE"], @@ -291,7 +291,7 @@ Answer: ```json     } ] ``` -Grounded answer: ```json +Grounded answer: ```json JSON [     {         "objref": ["GLBRL0000GACHALE"], @@ -306,7 +306,7 @@ Grounded answer: ```json > Finished chain. 
```` -```python +```python PYTHON convert_to_json(response_3['output']) ``` diff --git a/fern/pages/cookbooks/agent-short-term-memory.mdx b/fern/pages/cookbooks/agent-short-term-memory.mdx index 153d44be4..004c32f8d 100644 --- a/fern/pages/cookbooks/agent-short-term-memory.mdx +++ b/fern/pages/cookbooks/agent-short-term-memory.mdx @@ -35,14 +35,14 @@ Below, we show that, with augmented memory objects, the Agent is more aware of t # Step 1: Setup the Prompt and the Agent [#sec_step1] -```python +```python PYTHON # Uncomment if you need to install the following packages # !pip install cohere # !pip install python-dotenv # !pip install pandas ``` -```python +```python PYTHON import os import pandas as pd import getpass @@ -58,22 +58,22 @@ from langchain_core.messages.system import SystemMessage from langchain_core.messages.human import HumanMessage ``` -```python +```python PYTHON # load the cohere api key os.environ["COHERE_API_KEY"] = getpass.getpass() ``` -```python +```python PYTHON # Load the data revenue_table = pd.read_csv('revenue_table.csv') ``` -```python +```python PYTHON # Define the prompt prompt = ChatPromptTemplate.from_template("{input}") ``` -```python +```python PYTHON # Define the tools python_repl = PythonREPL() python_tool = Tool( @@ -90,7 +90,7 @@ python_tool.args_schema = ToolInput tools=[python_tool] ``` -```python +```python PYTHON # Define the agent llm = ChatCohere(model="command-r", temperature=0) @@ -109,7 +109,7 @@ agent_executor = AgentExecutor(agent=agent, # Step 2: Conversation without memory [#sec_step2] -```python +```python PYTHON # let's start the conversation with a question about the csv we have loaded q1 = "read revenue_table.csv and show me the column names" a1=agent_executor.invoke({ @@ -139,7 +139,7 @@ Grounded answer: The column names in the CSV file are: > Finished chain. ``` -```python +```python PYTHON # nice! now let's ask a follow-up question q2 = "plot revenue numbers" a2_no_mem = agent_executor.invoke({ @@ -150,7 +150,7 @@ a2_no_mem = agent_executor.invoke({ ````txt title="Output" > Entering new AgentExecutor chain... Plan: I will ask the user for clarification on what data they would like to visualise. -Action: ```json +Action: ```json JSON [ { "tool_name": "directly_answer", @@ -174,7 +174,7 @@ Without memory, the model cannot answer follow up questions because it misses th Here we will populate the chat history only with the generations from the model. This is the current approach used, e.g., here: https://python.langchain.com/docs/modules/agents/how_to/custom_agent/ -```python +```python PYTHON # let's answer the followup question above with the new setup a2_mem_ai = agent_executor.invoke({ "input": q2, @@ -206,7 +206,7 @@ Also in this case, the model cannot manage the follow up question. The reason is # Step 4: Conversation with Memory using AI Messages and Human Messages [#sec_step4] -```python +```python PYTHON a2_mem_ai_hum = agent_executor.invoke({ "input": q2, "chat_history": [HumanMessage(content=q1), @@ -236,7 +236,7 @@ Grounded answer: Here's a plot of the revenue numbers: It works! Let's go on with the conversation. -```python +```python PYTHON q3 = "set the min of y axis to zero and the max to 1000" a3_mem_ai_hum = agent_executor.invoke({ "input": q3, @@ -277,7 +277,7 @@ Reasoning chains can be very long, especially in the cases that contain errors a To avoid this issue, we need a way to extract the relevant info from the previous turns. Below, we propose a simple approach to info extraction. 
We format the extracted info in such a way to enhance human interpretability. We call the objects passed in the chat history _augmented memory objects_. -```python +```python PYTHON # function to create augmented memory objects def create_augmented_mem_objs(output_previous_turn: dict) -> str: """Function to convert the output of a ReAct agent to a compact and interpretable representation""" @@ -301,14 +301,14 @@ def create_augmented_mem_objs(output_previous_turn: dict) -> str: return augmented_mem_obj ``` -```python +```python PYTHON augmented_mem_obj_a1 = create_augmented_mem_objs(a1) augmented_mem_obj_a2 = create_augmented_mem_objs(a2_mem_ai_hum) ``` Below, an example of the augmented memory object generated by the model. You can see that the agent now has full visibility on what it did in the previous step. -```python +```python PYTHON print(augmented_mem_obj_a2) ``` @@ -331,7 +331,7 @@ Here's a plot of the revenue numbers: END OUTPUT ``` -```python +```python PYTHON a3_mem_ai_hum_amo = agent_executor.invoke({ "input": q3, "chat_history": [SystemMessage(content=augmented_mem_obj_a1), diff --git a/fern/pages/cookbooks/agentic-multi-stage-rag.mdx b/fern/pages/cookbooks/agentic-multi-stage-rag.mdx index 85707637b..bdc001662 100644 --- a/fern/pages/cookbooks/agentic-multi-stage-rag.mdx +++ b/fern/pages/cookbooks/agentic-multi-stage-rag.mdx @@ -41,7 +41,7 @@ One of the challenges in building a RAG system is that it has many moving pieces As you will see more below, the multi-stage retrieval is achieved by adding a new function `reference_extractor()` that extracts other references in the documents and updating the instruction so the agent continues to retrieve more documents. -```python +```python PYTHON import os from pprint import pprint @@ -50,7 +50,7 @@ import pandas as pd from sklearn.metrics.pairwise import cosine_similarity ``` -```python +```python PYTHON # versions print('cohere version:', cohere.__version__) ``` @@ -61,7 +61,7 @@ cohere version: 5.5.1 ## Setup -```python +```python PYTHON COHERE_API_KEY = os.environ.get("CO_API_KEY") COHERE_MODEL = 'command-r-plus' co = cohere.Client(api_key=COHERE_API_KEY) @@ -71,7 +71,7 @@ co = cohere.Client(api_key=COHERE_API_KEY) We leveraged data from [Washington Department of Transportation](https://wsdot.wa.gov/travel/bicycling-walking/bicycling-washington/bicyclist-laws-safety) and modified to fit the need of this demo. -```python +```python PYTHON documents = [ { "title": "Bicycle law", @@ -132,7 +132,7 @@ db["embeddings"] = embeddings.embeddings ``` -```python +```python PYTHON db ``` @@ -184,7 +184,7 @@ db Following functions and tools will be used in the subsequent tasks. -```python +```python PYTHON def retrieve_documents(query: str, n=1) -> dict: """ Function to retrieve documents a given query. @@ -227,7 +227,7 @@ tools = [ ## RAG function -```python +```python PYTHON def simple_rag(query, db): """ Given user's query, retrieve top documents and generate response using documents parameter. 
@@ -246,7 +246,7 @@ def simple_rag(query, db): ## Agentic RAG - cohere_agent() -```python +```python PYTHON def cohere_agent( message: str, preamble: str, @@ -328,13 +328,13 @@ Here we are asking a question that can be answered easily with single-stage retr | ----------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | | Is there a state level law for wearing helmets? | There is currently no state law requiring the use of helmets when riding a bicycle. However, some cities and counties do require helmet use. | There is currently no state law requiring helmet use. However, some cities and counties do require helmet use with bicycles. | -```python +```python PYTHON question1 = "Is there a state level law for wearing helmets?" ``` ### Simple RAG -```python +```python PYTHON output = simple_rag(question1, db) print(output) ``` @@ -347,7 +347,7 @@ There is currently no state law requiring the use of helmets when riding a bicyc ### Agentic RAG -```python +```python PYTHON preamble = """ You are an expert assistant that helps users answers question about legal documents and policies. Use the provided documents to answer questions about an employee's specific situation. @@ -377,13 +377,13 @@ The second question requires a double-stage retrieval because top matched docume | --------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------- | | I live in orting, do I need to wear a helmet with a bike? | In the state of Washington, there is no law requiring you to wear a helmet when riding a bike. However, some cities and counties do require helmet use, so it is worth checking your local laws. | Yes, you do need to wear a helmet with a bike in Orting if you are under 17. | -```python +```python PYTHON question2 = "I live in orting, do I need to wear a helmet with a bike?" ``` ### Simple RAG -```python +```python PYTHON output = simple_rag(question2, db) print(output) ``` @@ -398,7 +398,7 @@ In the state of Washington, there is no law requiring you to wear a helmet when Produces same quality answer as the simple rag. -```python +```python PYTHON preamble = """ You are an expert assistant that helps users answers question about legal documents and policies. Use the provided documents to answer questions about an employee's specific situation. @@ -427,7 +427,7 @@ In order for the model to retrieve correct documents, we do two things: 1. New reference_extractor() function is added. This function finds the references to other documents when given query and documents. 2. We update the instruction that directs the agent to keep retrieving relevant documents. -```python +```python PYTHON def reference_extractor(query: str, documents: list[str]) -> str: """ Given a query and document, find references to other documents. @@ -491,7 +491,7 @@ tools = [ ``` -```python +```python PYTHON preamble2 = """# Instruction You are an expert assistant that helps users answers question about legal documents and policies. 
diff --git a/fern/pages/cookbooks/agentic-rag-mixed-data.mdx b/fern/pages/cookbooks/agentic-rag-mixed-data.mdx index 816d54737..7d8f3efc7 100644 --- a/fern/pages/cookbooks/agentic-rag-mixed-data.mdx +++ b/fern/pages/cookbooks/agentic-rag-mixed-data.mdx @@ -41,7 +41,7 @@ Various LangChain-supported parsers can be found [here](https://python.langchain ## Install Dependencies -```python +```python PYTHON # there may be other dependencies that will need installation # ! pip install --quiet langchain langchain_cohere langchain_experimental # !pip --quiet install faiss-cpu tiktoken @@ -52,7 +52,7 @@ Various LangChain-supported parsers can be found [here](https://python.langchain # !pip install chromadb ``` -```python +```python PYTHON # LLM import os from langchain.text_splitter import RecursiveCharacterTextSplitter @@ -90,7 +90,7 @@ We have found that the best option for parsing is unstructured.io since the pars - separate tables from text - automatically chunk the tables and text by title during the parsing step so that similar elements are grouped -```python +```python PYTHON # UNSTRUCTURED pdf loader # Get elements raw_pdf_elements = partition_pdf( @@ -113,7 +113,7 @@ raw_pdf_elements = partition_pdf( ``` -```python +```python PYTHON # extract table and textual objects from parser class Element(BaseModel): type: str @@ -153,7 +153,7 @@ Below, we demonstrate the following process: - summaries of each chunk are embedded - during inference, the multi-vector retrieval returns the full context document related to the summary -```python +```python PYTHON co = cohere.Client() def get_chat_output(message, preamble, chat_history, model, temp, documents=None): return co.chat( @@ -183,7 +183,7 @@ def rerank_cohere(query, returned_documents,model:str="rerank-multilingual-v3.0" ``` -```python +```python PYTHON # generate table and text summaries prompt_text = """You are an assistant tasked with summarizing tables and text. \ Give a concise summary of the table or text. Table or text chunk: {element}. Only provide the summary and no other text.""" @@ -196,7 +196,7 @@ tables = [i.text for i in table_elements] texts = [i.text for i in text_elements] ``` -```python +```python PYTHON # The vectorstore to use to index the child chunks vectorstore = Chroma(collection_name="summaries", embedding_function=CohereEmbeddings()) # The storage layer for the parent documents @@ -234,7 +234,7 @@ With our database in place, we can run queries against it. The query process can - use each augmented query to retrieve the top k docs and then rerank them - concatenate all the shortlisted/reranked docs and pass them to the generation model -```python +```python PYTHON def process_query(query, retriever): """Runs query augmentation, retrieval, rerank and final generation in one call.""" augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, search_queries_only=True) @@ -282,7 +282,7 @@ Unless the user asks for a different style of answer, you should answer in full We can now test out a query. In this example, the final answer can be found on page 12 of the PDF, which aligns with the response provided by the model: -```python +```python PYTHON query = "what are the charges for services in 2022" final_answer, final_answer_docs = process_query(query, retriever) print(final_answer) @@ -307,7 +307,7 @@ We detect questions that do not require RAG by examining the `search_queries` ob In the example below, the `else` statement is invoked based on `query2`. 
We still pass in the chat history, allowing the question to be answered with only the prior context. -```python +```python PYTHON query2='divide this by two' augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, search_queries_only=True) if augmented_queries.search_queries: @@ -339,11 +339,11 @@ else: Here, we connect all of the pieces discussed above into one class object, which is then used as a tool for a Cohere ReAct agent. This class definition consolidates and clarify the key parameters used to define the RAG pipeline. -```python +```python PYTHON co = cohere.Client() ``` -```python +```python PYTHON class Element(BaseModel): type: str text: Any @@ -549,7 +549,7 @@ Unless the user asks for a different style of answer, you should answer in full ``` -```python +```python PYTHON rag_object=RAG_pipeline(paths=["city_ny_popular_fin_report.pdf"]) ``` @@ -572,7 +572,7 @@ Finally, we build a simple agent that utilizes the RAG pipeline defined above. W The intention behind coupling these tools is to enable the model to perform mathematical and other postprocessing operations on RAG outputs using Python. -```python +```python PYTHON from langchain.agents import Tool from langchain_experimental.utilities import PythonREPL from langchain.agents import AgentExecutor @@ -655,11 +655,11 @@ You also have access to a python interpreter tool which you can use to run code ``` -```python +```python PYTHON agent_object=react_agent(rag_retriever=rag_object) ``` -```python +```python PYTHON step1_response=agent_object.run_agent("what are the charges for services in 2022 and 2023") ``` @@ -680,18 +680,18 @@ Grounded answer: The charges for services in 2022 were 💡 If you'd like to run the OCR pipeline yourself, you can find more info in the section titled **PDF to Text using OCR and `pdf2image`**. -```python +```python PYTHON # Using langchain here since they have access to the Unstructured Data Loader powered by unstructured.io from langchain_community.document_loaders import UnstructuredURLLoader @@ -87,7 +87,7 @@ We choose to use LlamaIndex's `SentenceSplitter` in this case in order to get th You may also apply further transformations from the LlamaIndex repo if you so choose. Take a look at the [docs](https://docs.llamaindex.ai/en/stable/understanding/loading/loading.html) for inspiration on what is possible with transformations. -```python +```python PYTHON from llama_index.core.ingestion import IngestionPipeline from llama_index.core.node_parser import SentenceSplitter @@ -155,7 +155,7 @@ Special tokens have been added in the vocabulary, make sure the associated word Loading the document into a LlamaIndex vector store will allow us to use the Cohere embedding model and rerank model to retrieve the relevant parts of the form to pass into Command. -```python +```python PYTHON from llama_index.core import Settings, VectorStoreIndex from llama_index.postprocessor.cohere_rerank import CohereRerank @@ -187,7 +187,7 @@ In order to do RAG, we need a query or a set of queries to actually _do_ the ret To learn more about document mode and query generation, check out [our documentation](https://docs.cohere.com/docs/retrieval-augmented-generation-rag). -```python +```python PYTHON PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends." # Get queries to run against our index from the command-nightly model @@ -200,7 +200,7 @@ else: Now, with the queries in hand, we search against our vector index. 
-```python +```python PYTHON # Convenience function for formatting documents def format_for_cohere_client(nodes_): return [ @@ -232,7 +232,7 @@ You can see this for yourself by inspecting the `response.citations` field to ch You can learn more about the `chat` endpoint by checking out the API reference [here](https://docs.cohere.com/reference/chat). -```python +```python PYTHON # Make a request to the model response = co.chat( message=PROMPT, @@ -258,7 +258,7 @@ The revenue growth trend demonstrates sustained strong travel demand. On a const Other factors influencing the company's financial performance are described outside of the revenue growth trends. ``` -```python +```python PYTHON # Helper function for displaying response WITH citations def insert_citations(text: str, citations: list[dict]): """ @@ -312,7 +312,7 @@ To go from PDF to text with PyTesseract, there is an intermediary step of conver To do this, we use `pdf2image`, which uses `poppler` behind the scenes to convert the PDF into a PNG. From there, we can pass the image (which is a PIL Image object) directly into the OCR tool. -```python +```python PYTHON import pytesseract from pdf2image import convert_from_path @@ -326,7 +326,7 @@ pages = [pytesseract.image_to_string(page) for page in pages] ## Token count / price comparison and latency -```python +```python PYTHON def get_response(prompt, rag): if rag: # Get queries to run against our index from the command-nightly model @@ -364,7 +364,7 @@ def get_response(prompt, rag): return response ``` -```python +```python PYTHON prompt_template = """# financial form 10-K {tenk} @@ -374,17 +374,17 @@ prompt_template = """# financial form 10-K full_context_prompt = prompt_template.format(tenk=edgar_10k, question=PROMPT) ``` -```python +```python PYTHON r1 = get_response(PROMPT, rag=True) r2 = get_response(full_context_prompt, rag=False) ``` -```python +```python PYTHON def get_price(r): return (r.token_count["prompt_tokens"] * 0.5 / 10e6) + (r.token_count["response_tokens"] * 1.5 / 10e6) ``` -```python +```python PYTHON rag_price = get_price(r1) full_context_price = get_price(r2) @@ -395,7 +395,7 @@ print(f"RAG is {(full_context_price - rag_price) / full_context_price:.0%} cheap RAG is 93% cheaper than full context ``` -```python +```python PYTHON %timeit get_response(PROMPT, rag=True) ``` @@ -403,7 +403,7 @@ RAG is 93% cheaper than full context 14.9 s ± 1.4 s per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` -```python +```python PYTHON %timeit get_response(full_context_prompt, rag=False) ``` diff --git a/fern/pages/cookbooks/analyzing-hacker-news.mdx b/fern/pages/cookbooks/analyzing-hacker-news.mdx index b0c7b843f..d666e0635 100644 --- a/fern/pages/cookbooks/analyzing-hacker-news.mdx +++ b/fern/pages/cookbooks/analyzing-hacker-news.mdx @@ -23,7 +23,7 @@ In this notebook we take thousands of the most popular posts from Hacker News an Let's start by installing the tools we'll need and then importing them. -```python +```python PYTHON !pip install cohere umap-learn altair annoy bertopic ``` @@ -88,7 +88,7 @@ Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/alexiscook/anaconda3 Requirement already satisfied: mpmath>=0.19 in /Users/alexiscook/anaconda3/lib/python3.11/site-packages (from sympy->torch>=1.11.0->sentence-transformers>=0.4.1->bertopic) (1.3.0) ``` -```python +```python PYTHON import cohere import numpy as np import pandas as pd @@ -106,7 +106,7 @@ pd.set_option('display.max_colwidth', None) Fill in your Cohere API key in the next cell. 
To do this, begin by [signing up to Cohere](https://os.cohere.ai/) (for free!) if you haven't yet. Then get your API key [here](https://dashboard.cohere.com/api-keys). -```python +```python PYTHON co = cohere.Client("COHERE_API_KEY") # Insert your Cohere API key ``` @@ -114,7 +114,7 @@ co = cohere.Client("COHERE_API_KEY") # Insert your Cohere API key We will use the top 3,000 posts from the Ask HN section of Hacker News. We provide a CSV containing the posts. -```python +```python PYTHON df = pd.read_csv('https://storage.googleapis.com/cohere-assets/blog/text-clustering/data/askhn3k_df.csv', index_col=0) print(f'Loaded a DataFrame with {len(df)} rows') @@ -124,7 +124,7 @@ print(f'Loaded a DataFrame with {len(df)} rows') Loaded a DataFrame with 3000 rows ``` -```python +```python PYTHON df.head() ``` @@ -241,7 +241,7 @@ df.head() We calculate the embeddings using Cohere's `embed-english-v3.0` model. The resulting embeddings matrix has 3,000 rows (one for each post) and 1024 columns (meaning each post title is represented with a 1024-dimensional embedding). -```python +```python PYTHON batch_size = 90 embeds_list = [] @@ -265,7 +265,7 @@ embeds.shape For nearest-neighbor search, we can use the open-source Annoy library. Let's create a semantic search index and feed it all the embeddings. -```python +```python PYTHON search_index = AnnoyIndex(embeds.shape[1], 'angular') for i in range(len(embeds)): search_index.add_item(i, embeds[i]) @@ -282,7 +282,7 @@ True We can query neighbors of a specific post using `get_nns_by_item`. -```python +```python PYTHON example_id = 50 similar_item_ids = search_index.get_nns_by_item(example_id, @@ -365,7 +365,7 @@ Nearest neighbors: We're not limited to searching using existing items. If we get a query, we can embed it and find its nearest neighbors from the dataset. -```python +```python PYTHON query = "How can I improve my knowledge of calculus?" query_embed = co.embed(texts=[query], @@ -454,12 +454,12 @@ Nearest neighbors: What if we want to browse the archive instead of only searching it? Let's plot all the questions in a 2D chart so you're able to visualize the posts in the archive and their similarities. -```python +```python PYTHON reducer = umap.UMAP(n_neighbors=100) umap_embeds = reducer.fit_transform(embeds) ``` -```python +```python PYTHON df['x'] = umap_embeds[:,0] df['y'] = umap_embeds[:,1] @@ -491,7 +491,7 @@ chart.interactive() Let's proceed to cluster the embeddings using KMeans from scikit-learn. 
-```python +```python PYTHON n_clusters = 8 kmeans_model = KMeans(n_clusters=n_clusters, random_state=0) @@ -500,7 +500,7 @@ classes = kmeans_model.fit_predict(embeds) ## 5- Extract major keywords from each cluster so we can identify what the cluster is about -```python +```python PYTHON documents = df['title'] documents = pd.DataFrame({"Document": documents, "ID": range(len(documents)), @@ -512,7 +512,7 @@ count = count_vectorizer.transform(documents_per_topic.Document) words = count_vectorizer.get_feature_names_out() ``` -```python +```python PYTHON ctfidf = ClassTfidfTransformer().fit_transform(count).toarray() words_per_class = {label: [words[index] for index in ctfidf[label].argsort()[-10:]] for label in documents_per_topic.Topic} df['cluster'] = classes @@ -523,7 +523,7 @@ df['keywords'] = df['cluster'].map(lambda topic_num: ", ".join(np.array(words_pe We can now plot the documents with their clusters and keywords -```python +```python PYTHON selection = alt.selection_multi(fields=['keywords'], bind='legend') chart = alt.Chart(df).transform_calculate( diff --git a/fern/pages/cookbooks/article-recommender-with-text-embeddings.mdx b/fern/pages/cookbooks/article-recommender-with-text-embeddings.mdx index 420e357cc..1d84a3942 100644 --- a/fern/pages/cookbooks/article-recommender-with-text-embeddings.mdx +++ b/fern/pages/cookbooks/article-recommender-with-text-embeddings.mdx @@ -34,7 +34,7 @@ We will implement the following steps: **4: Show the top 5 recommended articles.** -```python +```python PYTHON ! pip install cohere ``` @@ -52,7 +52,7 @@ Installing collected packages: cohere Successfully installed cohere-1.3.10 ``` -```python +```python PYTHON import numpy as np import pandas as pd import re @@ -73,7 +73,7 @@ Throughout this article, we'll use the [BBC news article dataset](https://www.ka We'll extract a subset of the data and in Step 1, use the first 100 data points. -```python +```python PYTHON df = pd.read_csv('https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/bbc_news_subset.csv', delimiter=',') INP_START = 0 @@ -125,7 +125,7 @@ Next we turn each article text into embeddings. An [embedding](https://docs.cohe We do this by calling Cohere's [Embed endpoint](https://docs.cohere.ai/embed-reference), which takes in text as input and returns embeddings as output. -```python +```python PYTHON articles = df_inputs['Text'].tolist() output = co.embed( @@ -147,7 +147,7 @@ Next, we pick any one article to be the one the reader is currently reading (let [Cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) is a metric that measures how similar sequences of numbers are (embeddings in our case), and we compute it for each target-candidate pair. -```python +```python PYTHON print(f'Choose one article ID between {INP_START} and {INP_END-1} below...') ``` @@ -155,13 +155,13 @@ print(f'Choose one article ID between {INP_START} and {INP_END-1} below...') Choose one article ID between 0 and 99 below... ``` -```python +```python PYTHON READING_IDX = 70 reading = embeds[READING_IDX] ``` -```python +```python PYTHON from sklearn.metrics.pairwise import cosine_similarity @@ -182,7 +182,7 @@ def get_similarity(target,candidates): return similarity_scores ``` -```python +```python PYTHON similarity = get_similarity(reading,embeds) print('Target:') @@ -222,7 +222,7 @@ A typical text classification model requires hundreds/thousands of data points t To build the classifier, we need a set of examples consisting of text (news text) and labels (news category). 
The BBC News dataset happens to have both (columns 'Text' and 'Category'), so this time we’ll use the categories for building our examples. For this, we will set aside another portion of dataset. -```python +```python PYTHON EX_START = 100 EX_END = 200 df_examples = df.iloc[EX_START:EX_END] @@ -274,7 +274,7 @@ df_examples.head() With the Classify endpoint, there is a limit of 512 tokens per input. This means full articles won't be able to fit in the examples, so we will approximate and limit each article to its first 300 characters. -```python +```python PYTHON MAX_CHARS = 300 def shorten_text(text): @@ -285,7 +285,7 @@ df_examples['Text'] = df_examples['Text'].apply(shorten_text) The Classify endpoint needs a minimum of 2 examples for each category. We'll have 5 examples each, sampled randomly from the dataset. We have 5 categories, so we will have a total of 25 examples. -```python +```python PYTHON EX_PER_CAT = 5 categories = df_examples['Category'].unique().tolist() @@ -313,7 +313,7 @@ Total number of examples: 25 Once the examples are ready, we can now get the classifications. Here is a function that returns the classification given an input. -```python +```python PYTHON from cohere import ClassifyExample @@ -334,7 +334,7 @@ def classify_text(texts, examples): Before actually using the classifier, let's first test its performance. Here we take another 100 data points as the test dataset and the classifier will predict its class i.e. news category. -```python +```python PYTHON TEST_START = 200 TEST_END = 300 df_test = df.iloc[TEST_START:TEST_END] @@ -386,7 +386,7 @@ df_test.head() -```python +```python PYTHON predictions = [] BATCH_SIZE = 90 # The API accepts a maximum of 96 inputs for i in range(0, len(df_test['Text']), BATCH_SIZE): @@ -396,7 +396,7 @@ for i in range(0, len(df_test['Text']), BATCH_SIZE): actual = df_test['Category'].tolist() ``` -```python +```python PYTHON from sklearn.metrics import accuracy_score accuracy = accuracy_score(actual, predictions) @@ -421,7 +421,7 @@ We do this with the Chat endpoint. We call the endpoint by specifying a few settings, and it will generate the corresponding extractions. -```python +```python PYTHON def extract_tags(article): prompt = f"""Given an article, extract a list of tags containing keywords of that article. @@ -469,7 +469,7 @@ Let's now put everything together for our article recommender system. First, we select the target article and compute the similarity scores against the candidate articles. -```python +```python PYTHON print(f'Choose one article ID between {INP_START} and {INP_END-1} below...') ``` @@ -477,7 +477,7 @@ print(f'Choose one article ID between {INP_START} and {INP_END-1} below...') Choose one article ID between 0 and 99 below... ``` -```python +```python PYTHON READING_IDX = 70 reading = embeds[READING_IDX] @@ -487,7 +487,7 @@ similarity = get_similarity(reading,embeds) Next, we filter the articles via classification. Finally, we extract the keywords from each article and show the recommendations. -```python +```python PYTHON SHOW_TOP = 5 df_inputs = df_inputs.copy() @@ -532,7 +532,7 @@ def get_recommendations(reading_idx,similarity,show_top): break ``` -```python +```python PYTHON get_recommendations(READING_IDX,similarity,SHOW_TOP) ``` @@ -565,7 +565,7 @@ Let's try a couple of other articles in business and tech and see the output... 
Business article (returning recommendations around German economy and economic growth/slump): -```python +```python PYTHON READING_IDX = 1 @@ -599,7 +599,7 @@ Tags: bmw, diesel cars, robert bosch, fuel injection pump Tech article (returning recommendations around consumer devices): -```python +```python PYTHON READING_IDX = 71 diff --git a/fern/pages/cookbooks/basic-multi-step.mdx b/fern/pages/cookbooks/basic-multi-step.mdx index 5889021d2..90a97db3f 100644 --- a/fern/pages/cookbooks/basic-multi-step.mdx +++ b/fern/pages/cookbooks/basic-multi-step.mdx @@ -16,7 +16,7 @@ The recommended way to achieve [multi-step tool use with Cohere](https://docs.co ## Install Dependencies -```python +```python PYTHON ! pip install --quiet langchain langchain_cohere langchain_experimental ``` @@ -37,7 +37,7 @@ The recommended way to achieve [multi-step tool use with Cohere](https://docs.co [?25h ``` -```python +```python PYTHON import os os.environ['COHERE_API_KEY'] = ``` @@ -57,7 +57,7 @@ Plus the model can self-reflect. You can easily equip your agent with web search! -```python +```python PYTHON from langchain_community.tools.tavily_search import TavilySearchResults os.environ["TAVILY_API_KEY"] = # you can create an API key for free on Tavily's website @@ -77,7 +77,7 @@ internet_search.args_schema = TavilySearchInput You can easily equip your agent with a vector store! -```python +```python PYTHON !pip --quiet install faiss-cpu tiktoken ``` @@ -87,7 +87,7 @@ You can easily equip your agent with a vector store! [?25h ``` -```python +```python PYTHON from langchain.text_splitter import RecursiveCharacterTextSplitter from langchain_community.document_loaders import WebBaseLoader from langchain_community.vectorstores import FAISS @@ -116,7 +116,7 @@ vectorstore_retriever = vectorstore.as_retriever() ``` -```python +```python PYTHON from langchain.tools.retriever import create_retriever_tool vectorstore_search = create_retriever_tool( @@ -130,7 +130,7 @@ vectorstore_search = create_retriever_tool( You can easily equip your agent with a python interpreter! -```python +```python PYTHON from langchain.agents import Tool from langchain_experimental.utilities import PythonREPL @@ -151,7 +151,7 @@ python_tool.args_schema = ToolInput You can easily equip your agent with any Python function! -```python +```python PYTHON from langchain_core.tools import tool import random @@ -179,7 +179,7 @@ random_operation_tool.args_schema = random_operation_inputs The model can smartly pick the right tool(s) for the user query, call them in any sequence, analyze the results and self-reflect. Once the model considers it has enough information to answer the user question, it generates the final answer. -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent from langchain_core.prompts import ChatPromptTemplate @@ -208,7 +208,7 @@ agent_executor = AgentExecutor(agent=agent, tools=[internet_search, vectorstore_ A question that requires using a predefined tool from Langchain -```python +```python PYTHON response = agent_executor.invoke({ "input": "I want to write an essay about the Roman Empire. Any tips for writing an essay? Any fun facts?", "preamble": preamble, @@ -397,7 +397,7 @@ Here are some fun facts about the Roman Empire: A question that requires the large language model to use a custom tool. -```python +```python PYTHON response = agent_executor.invoke({ "input": "Calculate the result of the random operation of 10 and 20. 
Then find a few fun facts about that number, as well as its prime factors.", "preamble": preamble, @@ -447,7 +447,7 @@ The prime factors of 200 are 2 and 5. A question that requires the large language model to directly answer. -```python +```python PYTHON response = agent_executor.invoke({ "input": "Hey how are you?", "preamble": preamble, @@ -480,7 +480,7 @@ Grounded answer: I'm an AI chatbot, so I don't have feelings as such, but I'm he A question that requires using multipe tools, in sequence -```python +```python PYTHON response = agent_executor.invoke({ "input": "In what year was the company that was founded as Sound of Music went public? What was its stock price in 2000 and 2010.", "preamble": preamble, @@ -517,7 +517,7 @@ Grounded answer: Best Buy, which was founded as Sound of The chat history enables you to have multi-turn conversations with the ReAct agent. -```python +```python PYTHON from langchain_core.messages import HumanMessage, AIMessage chat_history = [ @@ -529,7 +529,7 @@ chat_history = [ prompt = ChatPromptTemplate.from_messages(chat_history) ``` -```python +```python PYTHON agent = create_cohere_react_agent( llm=llm, tools=[internet_search, vectorstore_search, python_tool], @@ -539,7 +539,7 @@ agent = create_cohere_react_agent( agent_executor = AgentExecutor(agent=agent, tools=[internet_search, vectorstore_search, python_tool], verbose=True) ``` -```python +```python PYTHON response = agent_executor.invoke({ "preamble": preamble, }) diff --git a/fern/pages/cookbooks/basic-rag.mdx b/fern/pages/cookbooks/basic-rag.mdx index 7a14326f0..589b97b98 100644 --- a/fern/pages/cookbooks/basic-rag.mdx +++ b/fern/pages/cookbooks/basic-rag.mdx @@ -26,7 +26,7 @@ In this example, we'll use a recent piece of text, that wasn't in the training d In practice, you would typically do RAG on much longer text, that doesn't fit in the context window of the model. -```python +```python PYTHON %pip install "cohere<5" --quiet ``` @@ -36,13 +36,13 @@ In practice, you would typically do RAG on much longer text, that doesn't fit in [?25h ``` -```python +```python PYTHON import cohere API_KEY = "..." # fill in your Cohere API key here co = cohere.Client(API_KEY) ``` -```python +```python PYTHON !pip install wikipedia --quiet import wikipedia ``` @@ -52,7 +52,7 @@ Preparing metadata (setup.py) ... [?25l[?25hdone Building wheel for wikipedia (setup.py) ... [?25l[?25hdone ``` -```python +```python PYTHON article = wikipedia.page('Dune Part Two') text = article.content print(f"The text has roughly {len(text.split())} words.") @@ -68,7 +68,7 @@ We index the document in a vector database. This requires getting the documents, ### We split the document into chunks of roughly 512 words -```python +```python PYTHON %pip install -qU langchain-text-splitters --quiet from langchain_text_splitters import RecursiveCharacterTextSplitter ``` @@ -80,7 +80,7 @@ from langchain_text_splitters import RecursiveCharacterTextSplitter [?25h ``` -```python +```python PYTHON text_splitter = RecursiveCharacterTextSplitter( chunk_size=512, chunk_overlap=50, @@ -101,7 +101,7 @@ The text has been broken down in 91 chunks. Cohere embeddings are state-of-the-art. -```python +```python PYTHON model="embed-english-v3.0" response = co.embed( texts= chunks, @@ -121,11 +121,11 @@ We just computed 91 embeddings. We use the simplest vector database ever: a python dictionary using `np.array()`. 
-```python +```python PYTHON !pip install numpy --quiet ``` -```python +```python PYTHON import numpy as np vector_database = {i: np.array(embedding) for i, embedding in enumerate(embeddings)} ``` @@ -134,7 +134,7 @@ vector_database = {i: np.array(embedding) for i, embedding in enumerate(embeddin ### Define the user question -```python +```python PYTHON query = "Name everyone involved in writing the script, directing, and producing 'Dune: Part Two'?" ``` @@ -143,7 +143,7 @@ query = "Name everyone involved in writing the script, directing, and producing Cohere embeddings are state-of-the-art. -```python +```python PYTHON response = co.embed( texts=[query], model=model, @@ -162,7 +162,7 @@ query_embedding: [-0.068603516, -0.02947998, -0.06274414, -0.015449524, -0.0332 We use cosine similarity to find the most similar chunks -```python +```python PYTHON def cosine_similarity(a, b): return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) @@ -202,7 +202,7 @@ We rerank the 10 chunks retrieved from the vector database. Reranking boosts ret Reranking lets us go from 10 chunks retrieved from the vector database, to the 3 most relevant chunks. -```python +```python PYTHON response = co.rerank( query=query, documents=top_chunks_after_retrieval, @@ -225,7 +225,7 @@ Here are the top 3 chunks after rerank: ## Step 3 - Generate the model final answer, given the retrieved and reranked chunks -```python +```python PYTHON preamble = """ ## Task & Context You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging. @@ -235,7 +235,7 @@ Unless the user asks for a different style of answer, you should answer in full """ ``` -```python +```python PYTHON documents = [ {"title": "chunk 0", "snippet": top_chunks_after_rerank[0]}, {"title": "chunk 1", "snippet": top_chunks_after_rerank[1]}, @@ -284,7 +284,7 @@ These citations make it easy to check where the model’s generated response cla They help users gain visibility into the model reasoning, and sanity check the final model generation. These citations are optional — you can decide to ignore them. -```python +```python PYTHON print("Citations that support the final answer:") for cite in response.citations: print(cite) @@ -324,7 +324,7 @@ Citations that support the final answer: {'start': 686, 'end': 691, 'text': '2024.', 'document_ids': ['doc_0']} ``` -```python +```python PYTHON def insert_citations_in_order(text, citations): """ A helper function to pretty print citations. diff --git a/fern/pages/cookbooks/basic-semantic-search.mdx b/fern/pages/cookbooks/basic-semantic-search.mdx index e5c424e09..a4132ce70 100644 --- a/fern/pages/cookbooks/basic-semantic-search.mdx +++ b/fern/pages/cookbooks/basic-semantic-search.mdx @@ -24,7 +24,7 @@ In this notebook, we'll build a simple semantic search engine. The applications And if you're running an older version of the SDK, you might need to upgrade it like so: -```python +```python PYTHON #!pip install --upgrade cohere ``` @@ -32,7 +32,7 @@ Get your Cohere API key by [signing up here](https://os.cohere.ai/register). Pas ## 1. 
Getting Set Up -```python +```python PYTHON #@title Import libraries (Run this cell to execute required code) {display-mode: "form"} import cohere @@ -52,7 +52,7 @@ pd.set_option('display.max_colwidth', None) You'll need your API key for this next cell. [Sign up to Cohere](https://os.cohere.ai/) and get one if you haven't yet. -```python +```python PYTHON model_name = "embed-english-v3.0" api_key = "" input_type_embed = "search_document" @@ -64,7 +64,7 @@ co = cohere.Client(api_key) We'll use the [trec](https://www.tensorflow.org/datasets/catalog/trec) dataset which is made up of questions and their categories. -```python +```python PYTHON dataset = load_dataset("trec", split="train") df = pd.DataFrame(dataset)[:1000] @@ -160,13 +160,13 @@ The next step is to embed the text of the questions. To get a thousand embeddings of this length should take about fifteen seconds. -```python +```python PYTHON embeds = co.embed(texts=list(df['text']), model=model_name, input_type=input_type_embed).embeddings ``` -```python +```python PYTHON embeds = np.array(embeds) embeds.shape ``` @@ -186,7 +186,7 @@ Let's now use [Annoy](https://github.com/spotify/annoy) to build an index that s After building the index, we can use it to retrieve the nearest neighbors either of existing questions (section 3.1), or of new questions that we embed (section 3.2). -```python +```python PYTHON search_index = AnnoyIndex(embeds.shape[1], 'angular') for i in range(len(embeds)): search_index.add_item(i, embeds[i]) @@ -203,7 +203,7 @@ True If we're only interested in measuring the distance between the questions in the dataset (no outside queries), a simple way is to calculate the distance between every pair of embeddings we have. -```python +```python PYTHON example_id = 92 similar_item_ids = search_index.get_nns_by_item(example_id,10, @@ -283,7 +283,7 @@ Nearest neighbors: We're not limited to searching using existing items. If we get a query, we can embed it and find its nearest neighbors from the dataset. -```python +```python PYTHON query = "What is the tallest mountain in the world?" input_type_query = "search_query" @@ -380,7 +380,7 @@ Nearest neighbors: Finally, let's plot out all the questions onto a 2D chart so you're able to visualize the semantic similarities of this dataset! -```python +```python PYTHON #@title Plot the archive {display-mode: "form"} reducer = umap.UMAP(n_neighbors=20) diff --git a/fern/pages/cookbooks/basic-tool-use.mdx b/fern/pages/cookbooks/basic-tool-use.mdx index 0d4ebf122..6231533d3 100644 --- a/fern/pages/cookbooks/basic-tool-use.mdx +++ b/fern/pages/cookbooks/basic-tool-use.mdx @@ -19,7 +19,7 @@ Below, we illustrate tool use in four steps: - Step 3: the tool calls are executed - Step 4: the model **generates a final answer with precise citations** based on the tool results -```python +```python PYTHON import cohere, json API_KEY = "..." # fill in your Cohere API key here co = cohere.Client(API_KEY) @@ -29,7 +29,7 @@ co = cohere.Client(API_KEY) Before we can illustrate tool use, we first need to do some setup. Here, we'll define the mock data that our tools will query. This data represents sales reports and a product catalog. -```python +```python PYTHON sales_database = { '2023-09-28': { 'total_sales_amount': 5000, @@ -62,7 +62,7 @@ product_catalog = { Now, we'll define the tools that simulate querying this database. For example, you could use the API of an enterprise sales platform. 
-```python +```python PYTHON def query_daily_sales_report(day: str) -> dict: """ Function to retrieve the sales report for the given day @@ -107,7 +107,7 @@ You can specify one or many tools to the model. Every tool needs to be described In our example, we provide two tools to the model: `daily_sales_report` and `product_catalog`. -```python +```python PYTHON tools = [ { "name": "query_daily_sales_report", @@ -140,7 +140,7 @@ In our example we'll use: "Can you provide a sales summary for 29th September 20 Only a langage model with Tool Use can answer this request: it requires looking up information in the right external tools (step 2), and then providing a final answer based on the tool results (step 4). -```python +```python PYTHON preamble = """ ## Task & Context You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging. @@ -156,7 +156,7 @@ message = "Can you provide a sales summary for 29th September 2023, and also giv The model intelligently selects the right tool(s) to call -- and the right parameters for each tool call -- based on the content of the user message. -```python +```python PYTHON response = co.chat( message=message, tools=tools, @@ -188,7 +188,7 @@ cohere.ToolCall { You can now execute the appropriate calls, using the tool calls and tool parameters generated by the model. These tool calls return tool results that will be fed to the model in Step 4. -```python +```python PYTHON tool_results = [] for tool_call in response.tool_calls: # here is where you would call the tool recommended by the model, using the parameters recommended by the model @@ -270,7 +270,7 @@ Tool results that will be fed back to the model in step 4: Finally, the developer calls the Cohere model, providing the tools results, in order to generate the model's final answer. -```python +```python PYTHON response = co.chat( message=message, tools=tools, @@ -281,7 +281,7 @@ response = co.chat( ) ``` -```python +```python PYTHON print("Final answer:") print(response.text) ``` @@ -309,7 +309,7 @@ These citations make it easy to check where the model’s generated response cla They help users gain visibility into the model reasoning, and sanity check the final model generation. These citations are optional — you can decide to ignore them. -```python +```python PYTHON print("Citations that support the final answer:") for cite in response.citations: print(cite) @@ -331,7 +331,7 @@ Citations that support the final answer: {'start': 298, 'end': 300, 'text': '25', 'document_ids': ['query_product_catalog:1:0']} ``` -```python +```python PYTHON def insert_citations_in_order(text, citations): """ A helper function to pretty print citations. diff --git a/fern/pages/cookbooks/calendar-agent.mdx b/fern/pages/cookbooks/calendar-agent.mdx index b9cbf3115..a111f52ce 100644 --- a/fern/pages/cookbooks/calendar-agent.mdx +++ b/fern/pages/cookbooks/calendar-agent.mdx @@ -10,11 +10,11 @@ import { CookbookHeader } from "../../components/cookbook-header"; In the example below, we demonstrate how to use the cohere Chat API with the `list_calendar_events` and `create_calendar_event` tools to book appointments. 
Booking the correct appointment requires the model to first check for an available slot by listing existing events, reasoning about the correct slot to book the new appointment and then finally invoking the right tool to create the calendar event. To learn more about Tool Use, read the official [multi-step tool use guide](https://docs.cohere.com/docs/multi-step-tool-use). -```python +```python PYTHON # !pip install cohere==5.5.3 ``` -```python +```python PYTHON # Instantiate the Cohere client import cohere @@ -24,7 +24,7 @@ COHERE_API_KEY = os.environ["COHERE_API_KEY"] co = cohere.Client(api_key=COHERE_API_KEY) ``` -```python +```python PYTHON # Define the tools import json @@ -91,7 +91,7 @@ def invoke_tool(tool_call: cohere.ToolCall): raise f"Unknown tool name '{tool_call.name}'" ``` -```python +```python PYTHON # Check what tools the model wants to use and how to use them res = co.chat( model="command-r-plus", diff --git a/fern/pages/cookbooks/chunking-strategies.mdx b/fern/pages/cookbooks/chunking-strategies.mdx index 526d7239f..62dbfdc7e 100644 --- a/fern/pages/cookbooks/chunking-strategies.mdx +++ b/fern/pages/cookbooks/chunking-strategies.mdx @@ -17,7 +17,7 @@ import { CookbookHeader } from "../../components/cookbook-header"; -```python +```python PYTHON %%capture !pip install cohere !pip install -qU langchain-text-splitters @@ -25,7 +25,7 @@ import { CookbookHeader } from "../../components/cookbook-header"; !pip install llama-index-postprocessor-cohere-rerank ``` -```python +```python PYTHON import requests from typing import List @@ -44,7 +44,7 @@ from llama_index.postprocessor.cohere_rerank import CohereRerank from llama_index.core import VectorStoreIndex, ServiceContext ``` -```python +```python PYTHON co_model = 'command-r' co_api_key = getpass("Enter Cohere API key: ") co = cohere.Client(api_key=co_api_key) @@ -94,7 +94,7 @@ Designing a robust chunking strategy is as much a science as an art. There are n ## Utils -```python +```python PYTHON def set_css(): display(HTML(''' @@ -104,7 +104,7 @@ get_ipython().events.register('pre_run_cell', set_css) set_css() ``` -```python +```python PYTHON def insert_citations(text: str, citations: List[dict]): """ A helper function to pretty print citations. @@ -150,7 +150,7 @@ def build_retriever(documents, top_n=5): In this example we will work with an 2023 Tesla earning call transcript. -```python +```python PYTHON # Get all investement memos (19) in bvp repository url_path = 'https://www.fool.com/earnings/call-transcripts/2024/01/24/tesla-tsla-q4-2023-earnings-call-transcript/' response = requests.get(url_path) @@ -186,7 +186,7 @@ Elon Musk -- Chief Executive Officer and Product Architect Yeah. The creators of Westworld, Jonathan Nolan, Lisa Joy Nolan, are friends -- are all friends of mine, actually. And I invited them to come see the lab and, like, well, come see it, hopefully soon. It's pretty well -- especially the sort of subsystem test stands where you've just got like one leg on a test stand just doing repetitive exercises and one arm on a test stand pretty well. ``` -```python +```python PYTHON # Define the question question = "Who mentions Jonathan Nolan?" ``` @@ -195,7 +195,7 @@ In this case, we are more concerned about accuracy than a verbose answer, so we We employ the `RecursiveCharacterTextSplitter` from [LangChain](https://python.langchain.com/docs/get_started/introduction) for this task. 
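Before applying it to the transcript, here is a toy illustration (separate from the pipeline below) of what `chunk_size` and `chunk_overlap` control: the splitter keeps chunks to at most `chunk_size` characters and, when `chunk_overlap` is non-zero, repeats a small amount of text from the end of one chunk at the start of the next.

```python PYTHON
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Toy example only -- not part of the transcript pipeline
toy_splitter = RecursiveCharacterTextSplitter(
    chunk_size=40,     # maximum characters per chunk
    chunk_overlap=10,  # characters shared between consecutive chunks
)
toy_text = "Jonathan Nolan and Lisa Joy are the creators of Westworld."
for piece in toy_splitter.split_text(toy_text):
    print(repr(piece))
```

The exact split points depend on where whitespace falls, but consecutive chunks share some text whenever `chunk_overlap` is greater than zero. The chunking function used for the experiments follows.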
-```python +```python PYTHON # Define the chunking function def get_chunks(text, chunk_size, chunk_overlap): text_splitter = RecursiveCharacterTextSplitter( @@ -217,7 +217,7 @@ In our first experiment we define the chunk size as 500 and allow **no overlap b Subsequently, we implement the standard RAG pipeline. We feed the chunks into a retriever, selecting the `top_n` most pertinent to the query chunks, and supply them as context to the generation model. Throughout this pipeline, we leverage [Cohere's endpoints](https://docs.cohere.com/reference/about), specifically, `co.embed`, `co.re.rank`, and finally, `co.chat`. -```python +```python PYTHON chunk_size = 500 chunk_overlap = 0 documents = get_chunks(text, chunk_size, chunk_overlap) @@ -244,7 +244,7 @@ An unknown speaker mentions Jonathan Nolan in a conversation about the creators A notable feature of [`co.chat`](https://docs.cohere.com/reference/chat) is its ability to ground the model's answer within the context. This means we can identify which chunks were used to generate the answer. Below, we show the previous output of the model together with the citation reference, where `[num]` represents the index of the chunk. -```python +```python PYTHON print(insert_citations(response.text, response.citations)) ``` @@ -254,7 +254,7 @@ An unknown speaker [0] mentions Jonathan Nolan in a conversation about the creat Indeed, by printing the cited chunk, we can validate that the text was divided so that the generation model could not provide the correct response. Notably, the speaker's name is not included in the context, which is why the model refes to an `unknown speaker`. -```python +```python PYTHON print(source_nodes[0]) ``` @@ -268,7 +268,7 @@ In the previous experiment, we discovered that the chunks were generated in a wa Therefore, this time to mitigate this issue, we **allow for overlap between consecutive chunks**. -```python +```python PYTHON chunk_size = 500 chunk_overlap = 100 documents = get_chunks(text,chunk_size, chunk_overlap) @@ -295,7 +295,7 @@ Elon Musk mentions Jonathan Nolan. Musk is the CEO and Product Architect of the Again, we can print the text along with the citations. -```python +```python PYTHON print(insert_citations(response.text, response.citations)) ``` @@ -305,7 +305,7 @@ Elon Musk [0] mentions Jonathan Nolan. Musk is the CEO and Product Architect [0] And investigate the chunks which were used as context to answer the query. -```python +```python PYTHON source_nodes[0] ``` @@ -329,7 +329,7 @@ Firstly, let's observe that in the HTML text, each time the speaker changes, the To facilitate our text chunking process, we'll use the above observation and introduce a unique character sequence `###`, which we'll utilize as a marker for splitting the text. -```python +```python PYTHON print('HTML text') print(target_divs[:3]) print('-------------------\n') @@ -359,7 +359,7 @@ During this call, we will discuss our business outlook and make forward-looking In this approach, we prioritize splitting the text at the appropriate separator, `###.` To ensure this behavior, we'll use `CharacterTextSplitter` from [LangChain](https://python.langchain.com/docs/get_started/introduction), guaranteeing such behavior. From our analysis of the text and the fact that we aim to preserve entire speaker speeches intact, we anticipate that most of them will exceed a length of 500. Hence, we'll increase the chunk size to 1000. 
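One detail worth knowing (a quick aside, not part of the experiment itself): `CharacterTextSplitter` splits on the separator first and only then merges the resulting pieces up to `chunk_size`, so a single speech longer than 1,000 characters is kept as one over-length chunk (LangChain logs a warning when this happens) rather than being cut mid-sentence. A toy check makes this visible:

```python PYTHON
from langchain_text_splitters import CharacterTextSplitter

# Toy transcript with one short speech and one speech much longer than chunk_size
toy_transcript = (
    "### Speaker A\nShort opening remarks.\n"
    "### Speaker B\n" + "A very long answer about production targets. " * 40
)
splitter = CharacterTextSplitter(separator="###", chunk_size=1000, chunk_overlap=0)
for piece in splitter.split_text(toy_transcript):
    print(len(piece), repr(piece[:40]))
```

The full experiment on the transcript follows.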
-```python +```python PYTHON separator = "###" chunk_size = 1000 chunk_overlap = 0 @@ -409,7 +409,7 @@ Elon Musk mentions Jonathan Nolan. Musk is friends with the creators of Westworl Below we validate the answer using citations. -```python +```python PYTHON print(insert_citations(response.text, response.citations)) ``` @@ -417,7 +417,7 @@ print(insert_citations(response.text, response.citations)) Elon Musk [0] mentions Jonathan Nolan. [0] Musk is friends [0] with the creators of Westworld [0], Jonathan Nolan [0] and Lisa Joy Nolan. [0] ``` -```python +```python PYTHON source_nodes[0] ``` diff --git a/fern/pages/cookbooks/creating-a-qa-bot.mdx b/fern/pages/cookbooks/creating-a-qa-bot.mdx index 90c1d88ed..64cc0dba8 100644 --- a/fern/pages/cookbooks/creating-a-qa-bot.mdx +++ b/fern/pages/cookbooks/creating-a-qa-bot.mdx @@ -21,12 +21,12 @@ We proceed as follows: ## Setup -```python +```python PYTHON %%capture !pip install cohere datasets llama_index llama-index-llms-cohere llama-index-embeddings-cohere ``` -```python +```python PYTHON import cohere import datasets from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage @@ -41,7 +41,7 @@ from typing import List ``` -```python +```python PYTHON api_key = "" # co = cohere.Client(api_key=api_key) ``` @@ -54,7 +54,7 @@ co = cohere.Client(api_key=api_key) Because this process is lengthy (~2h for all documents on a MacBookPro), we store the index to disc for future reuse. We also provide a (commented) code snippet to index only a subset of the data. If you use this snippet, bear in mind that many documents will become unavailable to the model and, as a result, performance will suffer! -```python +```python PYTHON data = datasets.load_dataset("sauravjoshi23/aws-documentation-chunked") print(data) @@ -79,7 +79,7 @@ DatasetDict({ }) ``` -```python +```python PYTHON overwrite = True # only compute index if it doesn't exist path_index = Path(".") / "aws-documentation_index_cohere" @@ -119,13 +119,13 @@ else: The vector database we built using `VectorStoreIndex` comes with an in-built retriever. We can call that retriever to fetch the top $k$ documents most relevant to the user question with: -```python +```python PYTHON retriever = index.as_retriever(similarity_top_k=top_k) ``` We recently released [Rerank-3](https://txt.cohere.com/rerank-3/) (April '24), which we can use to improve the quality of retrieval, as well as reduce latency and the cost of inference. To use the retriever with `rerank`, we create a thin wrapper around `index.as_retriever` as follows: -```python +```python PYTHON class RetrieverWithRerank: def __init__(self, retriever, api_key): self.retriever = retriever @@ -158,7 +158,7 @@ retriever = RetrieverWithRerank( ``` -```python +```python PYTHON query = "What happens to my Amazon EC2 instances if I delete my Auto Scaling group?" documents = retriever.retrieve(query, top_n=top_n) @@ -170,7 +170,7 @@ print(resp.text) This works! With `co.chat`, you get the additional benefit that citations are returned for every span of text. Here's a simple function to display the citations inside square brackets. -```python +```python PYTHON def build_answer_with_citations(response): """ """ text = response.text @@ -203,7 +203,7 @@ Now that we have a running pipeline, we need to assess its performance. The author of the repository provides 100 QA pairs that we can test the model on. Let's download these questions, then run inference on all 100 questions. 
Later, we will use Command-R+ -- Cohere's largest and most powerful model -- to measure performance. -```python +```python PYTHON url = "https://github.com/siagholami/aws-documentation/blob/main/QA_true.csv?raw=true" qa_pairs = pd.read_csv(url) qa_pairs.sample(2) @@ -221,7 +221,7 @@ We'll loop over each question and generate our model answer. We'll also complete 1. We compute the rank of the golden document amid the retrieved documents -- this will inform how well our retrieval system performs 2. We prepare the grading prompts -- these will be sent to an LLM scorer to compute the goodness of responses -```python +```python PYTHON LLM_EVAL_TEMPLATE = """## References {references} @@ -264,7 +264,7 @@ def get_rank_of_golden_within_retrieved(golden: str, retrieved: List[dict]) -> i ``` -```python +```python PYTHON from tqdm import tqdm answers = [] @@ -308,7 +308,7 @@ We want to test our model performance on two dimensions: Note that this pipeline is for illustration only. To measure performance in practice, we would want to run more in-depths tests on a broader, representative dataset. -```python +```python PYTHON results = pd.DataFrame() results["answer"] = answers results["golden_answer"] = qa_pairs["Answer_True"] @@ -320,7 +320,7 @@ results["rank"] = ranks We'll use Command-R+ as a judge of whether the answers produced by our model convey the same information as the golden answers. Since we've defined the grading prompts earlier, we can simply ask our LLM judge to evaluate that grading prompt. After a little bit of postprocessing, we can then extract our model scores. -````python +````python PYTHON scores = [] reasonings = [] @@ -348,12 +348,12 @@ for prompt in tqdm(grading_prompts, total=len(grading_prompts)): ```` -```python +```python PYTHON results["score"] = scores results["reasoning"] = reasonings ``` -```python +```python PYTHON print(f"Average score: {results['score'].mean():.3f}") ``` @@ -362,7 +362,7 @@ print(f"Average score: {results['score'].mean():.3f}") We've already computed the rank of the golden documents using `get_rank_of_golden_within_retrieved`. Here, we'll plot the histogram of ranks, using blue when the answer scored a 1, and red when the answer scored a 0. 
-```python +```python PYTHON import matplotlib.pyplot as plt import seaborn as sns diff --git a/fern/pages/cookbooks/csv-agent-native-api.mdx b/fern/pages/cookbooks/csv-agent-native-api.mdx index 9413d6d58..9e65c0457 100644 --- a/fern/pages/cookbooks/csv-agent-native-api.mdx +++ b/fern/pages/cookbooks/csv-agent-native-api.mdx @@ -38,7 +38,7 @@ In this notebook we explore how to setup a [Cohere Agent](https://docs.cohere.co # Setup [#setup] -```python +```python PYTHON import os from typing import List @@ -52,12 +52,12 @@ from langchain_core.pydantic_v1 import BaseModel, Field from langchain_experimental.utilities import PythonREPL ``` -```python +```python PYTHON # Uncomment if you need to install the following packages # !pip install --quiet langchain langchain_experimental cohere --upgrade ``` -```python +```python PYTHON # versions print('cohere version:', cohere.__version__) print('langchain version:', langchain.__version__) @@ -74,7 +74,7 @@ langchain_experimental version: 0.0.59 ### API Key -```python +```python PYTHON COHERE_API_KEY = os.environ["COHERE_API_KEY"] CHAT_URL= "https://api.cohere.ai/v1/chat" COHERE_MODEL = 'command-r-plus' @@ -83,12 +83,12 @@ co = cohere.Client(api_key=COHERE_API_KEY) ### Data Loading -```python +```python PYTHON income_statement = pd.read_csv('income_statement.csv') balance_sheet = pd.read_csv('balance_sheet.csv') ``` -```python +```python PYTHON income_statement.head(2) ``` @@ -163,7 +163,7 @@ income_statement.head(2) -```python +```python PYTHON balance_sheet.head(2) ``` @@ -253,7 +253,7 @@ balance_sheet.head(2) Here we define the python tool using langchain's PythonREPL. We also define `functions_map` that will later be used by the Cohere Agent to correctly map function name to the actual function. Lastly, we define the tools that will be passed in the Cohere API. -```python +```python PYTHON python_repl = PythonREPL() python_tool = Tool( name="python_repl", @@ -295,7 +295,7 @@ tools = [ As [Multi-Step Tool Use](https://docs.cohere.com/page/basic-multi-step) shows, you have a lot of flexiblity on how you can customize and interact with the cohere agent. Here I am creating a wrapper so that it automatically determines when to stop calling the tools and output final answer. It will run maximum of 15 steps. -```python +```python PYTHON def cohere_agent( message: str, preamble: str, @@ -392,7 +392,7 @@ We will ask the following questions given income statement data. - What is the largest gross profit margin? - What is the minimum ratio of operating income loss divided by non operating income expense? -```python +```python PYTHON question_dict ={ 'q1': ['what is the highest value of cost of goods and service?',169559000000], 'q2': ['what is the largest gross profit margin?',0.3836194330595236], @@ -400,7 +400,7 @@ question_dict ={ } ``` -```python +```python PYTHON preamble = """ You are an expert who answers the user's question. You are working with a pandas dataframe in Python. The name of the dataframe is `income_statement.csv`. 
Here is a preview of the dataframe: @@ -418,7 +418,7 @@ print(preamble) | 1 | 1 | 2018-09-30-2018-12-29 | 84310000000 | nan | 32031000000 | nan | nan | nan | nan | nan | nan | nan | 19965000000 | 1.05 | 1.05 | nan | nan | | 2 | 2 | 2018-09-30-2019-09-28 | 260174000000 | 1.61782e+11 | 98392000000 | 1.6217e+10 | 1.8245e+10 | 3.4462e+10 | 6.393e+10 | 1.807e+09 | 6.5737e+10 | 1.0481e+10 | 55256000000 | 2.99 | 2.97 | 1.84713e+10 | 1.85957e+10 | -```python +```python PYTHON for qsn,val in question_dict.items(): print(f'question:{qsn}') question = val[0] @@ -472,13 +472,13 @@ We now make the task for the Agent more complicated by asking a question that ca As you will see below, this question can be obtained only by accessing both the balance sheet and the income statement. -```python +```python PYTHON question_dict ={ 'q1': ['what is the ratio of the largest stockholders equity to the smallest revenue'], } ``` -```python +```python PYTHON # get the largest stockholders equity x = balance_sheet['StockholdersEquity'].astype(float).max() print(f"The largest stockholders equity value is: {x}") @@ -498,7 +498,7 @@ The smallest revenue value is: 53809000000.0 Their ratio is: 2.4911631883142227 ``` -```python +```python PYTHON preamble = """ You are an expert who answers the user's question in complete sentences. You are working with two pandas dataframe in Python. Ensure your output is a string. @@ -531,7 +531,7 @@ Here is a preview of the `balance_sheet.csv` dataframe: | 2 | 2 | 2019-09-28 | 4.8844e+10 | 5.1713e+10 | 2.2926e+10 | 4.106e+09 | 2.2878e+10 | 1.2352e+10 | 1.62819e+11 | 1.05341e+11 | 3.7378e+10 | 3.2978e+10 | 1.75697e+11 | 3.38516e+11 | 4.6236e+10 | 3.772e+10 | 5.522e+09 | 5.98e+09 | 1.026e+10 | 1.05718e+11 | 9.1807e+10 | 5.0503e+10 | 1.4231e+11 | 2.48028e+11 | 0 | 4.5174e+10 | 4.5898e+10 | -5.84e+08 | 90488000000 | 3.38516e+11 | ``` -```python +```python PYTHON for qsn,val in question_dict.items(): print(f'question:{qsn}') question = val[0] @@ -559,7 +559,7 @@ In the previous example over single table, the model successfully answered your You will see that the second method is able to come to the answer with fewer steps. -```python +```python PYTHON preamble = """ You are an expert who answers the user's question. You are working with a pandas dataframe in Python. The name of the dataframe is `income_statement.csv`. """ @@ -581,7 +581,7 @@ Sorry, there is no column named 'Cost of Goods and Service' in the 'income_state As you see above, the model failed to execute because it assumed certain column names but they turned out to be wrong. One simple fix is to tell the model to continue to solve the problem in the face of error. -```python +```python PYTHON preamble = """ You are an expert who answers the user's question. You are working with a pandas dataframe in Python. The name of the dataframe is `income_statement.csv`. If you run into error, keep trying until you fix it. You may need to view the data to understand the error. @@ -616,7 +616,7 @@ The highest value of 'Cost of Goods and Services' is 169559000000.0. What if we directly give the model the ability to view the data as a tool so that it can explicitly use it instead of indirectly figuring it out? -```python +```python PYTHON def view_csv_data(path: str) -> dict: """ Function to view the head, tail and shape of a given csv file. @@ -660,7 +660,7 @@ tools = [ ] ``` -```python +```python PYTHON preamble = """ You are an expert who answers the user's question. You are working with a pandas dataframe in Python. 
The name of the dataframe is `income_statement.csv`. Always view the data first to write flawless code. diff --git a/fern/pages/cookbooks/csv-agent.mdx b/fern/pages/cookbooks/csv-agent.mdx index 6bf8bdb71..7af8ade8f 100644 --- a/fern/pages/cookbooks/csv-agent.mdx +++ b/fern/pages/cookbooks/csv-agent.mdx @@ -38,7 +38,7 @@ In this notebook we explore how to setup a [Cohere ReAct Agent](https://github.c # Setup [#sec_step0] -```python +```python PYTHON from langchain_core.pydantic_v1 import BaseModel, Field from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent @@ -50,7 +50,7 @@ from langchain_experimental.utilities import PythonREPL from langchain_experimental.tools.python.tool import PythonAstREPLTool ``` -```python +```python PYTHON # Uncomment if you need to install the following packages #!pip install --quiet langchain langchain_cohere langchain_experimental --upgrade #!pip install sec-api @@ -65,7 +65,7 @@ This notebook assumes the input is a set of csv files extracted from Apple's SEC We use the sec-api to download the income statement and balance sheet from the SEC 10K .htm file. Please note that the tables need to be parsed such that the index is numerical as the python code generation struggles to filter based on index. We have processed the tables and provided them for you. They can be found [here](https://github.com/cohere-ai/notebooks/tree/main/notebooks/agents/financial-csv-agent). -```python +```python PYTHON income_statement = pd.read_csv('income_statement.csv') balance_sheet = pd.read_csv('balance_sheet.csv') ``` @@ -81,7 +81,7 @@ In the example below, we show how the python tool can be used to load a datafram First, let's implement the ReAct agent. -```python +```python PYTHON # instantiate the Cohere llm llm = ChatCohere(model="command-r", temperature=0.1,cohere_api_key="",verbose=True) @@ -124,7 +124,7 @@ Here is a preview of the dataframe: Then, we define the dictionary including the questions we want the Agent to answer, and their answer. -```python +```python PYTHON question_dict ={ 'q1': ['what is the highest value of cost of goods and service?',169559000000], 'q2': ['what is the largest gross profit margin?',0.3836194330595236], @@ -134,7 +134,7 @@ question_dict ={ Let's now see how the Agent answers each of the questions. -```python +```python PYTHON for qsn,val in question_dict.items(): print(f'question:{qsn}') agent_executor.invoke({ @@ -202,7 +202,7 @@ Nice! The agent uses the Python tool to write the code to access the data in the In the example above, the model needs to load the dataframe in the python call before carrying out operations. In this example, we show how to pass the dataframes to the python tool so it has the file already loaded. -```python +```python PYTHON # call the PythonAstREPLTool in order to pass tables to the tool df_locals = {'df':pd.read_csv('income_statement.csv')} tools = [PythonAstREPLTool(locals=df_locals)] @@ -228,7 +228,7 @@ Here is a preview of the dataframe: Let's loop again over the same dictionary of questions. -```python +```python PYTHON for qsn,val in question_dict.items(): print(f'question:{qsn}') agent_executor.invoke({ @@ -296,7 +296,7 @@ Also in this case, the Agent correctly answers all the questions. We now make the task for the Agent more complicated, by asking it questions whose answer can be computed only by retrieving relevant information from multiple tables. 
-```python +```python PYTHON # define the Agent python_repl = PythonREPL() python_tool = Tool( @@ -337,7 +337,7 @@ Here is a preview of the `balance_sheet.csv` dataframe: We now define a new question. -```python +```python PYTHON question_dict ={ 'q1': ['what is the ratio of the largest stockholders equity to the smallest revenue'], } @@ -345,7 +345,7 @@ question_dict ={ The answer to this question can be obtained only by accessing both the balance sheet and the income statement, as shown below: -```python +```python PYTHON # get the largest stockholders equity x = balance_sheet['StockholdersEquity'].astype(float).max() print(f"The largest stockholders equity value is: {x}") @@ -367,7 +367,7 @@ Their ratio is: 2.4911631883142227 Let's now get the answer from the Agent. -```python +```python PYTHON for qsn,val in question_dict.items(): print(f'question:{qsn}') agent_executor.invoke({ diff --git a/fern/pages/cookbooks/data-analyst-agent.mdx b/fern/pages/cookbooks/data-analyst-agent.mdx index ebaca22e8..244e90408 100644 --- a/fern/pages/cookbooks/data-analyst-agent.mdx +++ b/fern/pages/cookbooks/data-analyst-agent.mdx @@ -19,7 +19,7 @@ In this notebook, we'll see how we can use two tools to create a simple data ana Let's start by installing the required libraries -```python +```python PYTHON ! pip install --quiet langchain langchain_cohere langchain_experimental ``` @@ -27,13 +27,13 @@ Let's start by installing the required libraries We'll need a Cohere API key here. Grab your key and paste it in the next slide if you have one, or [register](https://dashboard.cohere.ai/welcome/register) and create a new API key. -```python +```python PYTHON ### LLMs import os os.environ['COHERE_API_KEY'] = "" ``` -```python +```python PYTHON from langchain_cohere.chat_models import ChatCohere chat = ChatCohere(model="command-r-plus", temperature=0.3) ``` @@ -44,7 +44,7 @@ Our simple data analyst will be equipped with a web search tool, and a python in Let's first equip our agent with web search! We can use the Tavily API for this. Head on to [tavily.com](https://tavily.com) and grab an API key to use here. -```python +```python PYTHON from langchain_community.tools.tavily_search import TavilySearchResults os.environ['TAVILY_API_KEY'] = "" @@ -64,7 +64,7 @@ internet_search.args_schema = TavilySearchInput Let's equip our agent with a python interpreter! 
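Under the hood, the interpreter tool is just LangChain's `PythonREPL` utility: it executes a string of Python code and returns whatever was printed to stdout. A quick standalone check (separate from the agent wiring below) shows what the agent will get back from the tool:

```python PYTHON
from langchain_experimental.utilities import PythonREPL

repl = PythonREPL()
# The agent sends code as a string; the tool returns the captured stdout.
print(repl.run("import math\nprint(round(math.sqrt(2), 4))"))
```

With that behaviour in mind, we wrap it as a tool the agent can call: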
-```python +```python PYTHON from langchain.agents import Tool from langchain_experimental.utilities import PythonREPL @@ -82,13 +82,13 @@ repl_tool.args_schema = ToolInput ``` -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent from langchain_core.prompts import ChatPromptTemplate ``` -```python +```python PYTHON prompt = ChatPromptTemplate.from_template("{input}") @@ -99,11 +99,11 @@ agent = create_cohere_react_agent( ) ``` -```python +```python PYTHON agent_executor = AgentExecutor(agent=agent, tools=[internet_search, repl_tool], verbose=True) ``` -```python +```python PYTHON agent_executor.invoke({ "input": "Create a plot of the number of full time employees at the 3 tech companies with the highest market cap in the United States in 2024.", }) @@ -162,7 +162,7 @@ The companies with the highest number of full-time employees are Mic src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAABJrklEQVR4nO3de3zO9eP/8ee1mc1pm9HMIiOnJszZouRjzKEi6uOUU8spcpgclsNIRXyFRfSpGH0+JQt9mFoNIZkzOWQINac5xDaGbXa9f3/02fVzpXJdXHPN1eN+u123j+v9fl3v67n1/uTZ+/1+vd8mwzAMAQAA4L7n5uwAAAAAcAyKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIso5OwAfydms1mnT59WiRIlZDKZnB0HAADcBwzD0OXLlxUYGCg3t78+Jkexu4dOnz6t8uXLOzsGAAC4D504cULlypX7yzEUu3uoRIkSkn77B+Pt7e3kNAAA4H6QkZGh8uXLW3rEX6HY3UN5p1+9vb0pdgAAwC62XMbF5AkAAAAXQbEDAABwERQ7AAAAF0GxAwAAcBEUOwAAABdBsQMAAHARFDsAAAAXQbEDAABwERQ7AAAAF0GxAwAAcBEUOwAAABdBsQMAAHARFDsAAAAXQbEDAABwEYWcHQCOFTRmtbMjoID7eWo7Z0cAAOQTjtgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgItwarGbMmWKGjRooBIlSsjf318dOnTQoUOHrMZcv35dgwYNUqlSpVS8eHF16tRJZ8+etRqTkpKidu3aqWjRovL399fIkSN148YNqzHr169X3bp15enpqcqVKys2NvaWPHPnzlVQUJC8vLzUqFEjbdu2ze4sAAAAzuLUYrdhwwYNGjRIW7ZsUWJionJyctSqVStlZmZaxgwfPlyrVq1SXFycNmzYoNOnT6tjx46W9bm5uWrXrp2ys7O1efNmLVq0SLGxsZowYYJlzPHjx9WuXTs1b95ce/bs0bBhw/TSSy/p66+/toz57LPPFBkZqejoaO3atUu1a9dWeHi4zp07Z3MWAAAAZzIZhmE4O0Se8+fPy9/fXxs2bNATTzyh9PR0PfDAA/rkk0/03HPPSZKSk5P1yCOPKCkpSY0bN9ZXX32lp556SqdPn1aZMmUkSfPnz9fo0aN1/vx5FS5cWKNHj9bq1au1f/9+y3d16dJFaWlpSkhIkCQ1atRIDRo00Jw5cyRJZrNZ5cuX1yuvvKIxY8bYlOV2MjIy5OPjo/T0dHl7ezv0d5cnaMzqfNkuXMfPU9s5OwIAwA729IcCdY1denq6JMnPz0+StHPnTuXk5CgsLMwypnr16nrooYeUlJQkSUpKSlLNmjUtpU6SwsPDlZGRoQMHDljG3LyNvDF528jOztbOnTutxri5uSksLMwyxpYsAAAAzlTI2QHymM1mDRs2TE2aNNGjjz4qSUpNTVXhwoXl6+trNbZMmTJKTU21jLm51OWtz1v3V2MyMjJ07do1Xbp0Sbm5uX84Jjk52eYsv5eVlaWsrCzL+4yMjNv9GgAAAO5YgTliN2jQIO3fv19LlixxdhSHmTJlinx8fCyv8uXLOzsSAABwYQWi2A0ePFjx8fH69ttvVa5cOcvygIAAZWdnKy0tzWr82bNnFRAQYBnz+5mpee9vN8bb21tFihRR6dKl5e7u/odjbt7G7bL8XlRUlNLT0y2vEydO2PDbAAAAuDNOLXaGYWjw4MFasWKF1q1bp4oVK1qtr1evnjw8PLR27VrLskOHDiklJUWhoaGSpNDQUO3bt89q9mpiYqK8vb0VHBxsGXPzNvLG5G2jcOHCqlevntUYs9mstWvXWsbYkuX3PD095e3tbfUCAADIL069xm7QoEH65JNP9N///lclSpSwXKvm4+OjIkWKyMfHRxEREYqMjJSfn5+8vb31yiuvKDQ01DILtVWrVgoODlaPHj00bdo0paamaty4cRo0aJA8PT0lSQMGDNCcOXM0atQovfjii1q3bp2WLl2q1av//wzSyMhI9erVS/Xr11fDhg01a9YsZWZmqk+fPpZMt8sCAADgTE4tdvPmzZMkPfnkk1bLFy5cqN69e0uSZs6cKTc3N3Xq1ElZWVkKDw/Xe++9Zxnr7u6u+Ph4DRw4UKGhoSpWrJh69eql119/3TKmYsWKWr16tYYPH67Zs2erXLly+vDDDxUeHm4Z07lzZ50/f14TJkxQamqqQkJClJCQYDWh4nZZAAAAnKlA3cfO1XEfOxQE3McOAO4v9+197AAAAHDnKHYAAAAugmIHAADgIih2AAAALoJiBwAA4CIodgAAAC6CYgcAAOAiKHYAAAAu
gmIHAADgIih2AAAALoJiBwAA4CIodgAAAC6CYgcAAOAiKHYAAAAugmIHAADgIih2AAAALoJiBwAA4CLsLna7du3Svn37LO//+9//qkOHDnrttdeUnZ3t0HAAAACwnd3Frn///jp8+LAk6dixY+rSpYuKFi2quLg4jRo1yuEBAQAAYBu7i93hw4cVEhIiSYqLi9MTTzyhTz75RLGxsVq2bJmj8wEAAMBGdhc7wzBkNpslSWvWrFHbtm0lSeXLl9eFCxccmw4AAAA2s7vY1a9fX2+88YY+/vhjbdiwQe3atZMkHT9+XGXKlHF4QAAAANjG7mI3a9Ys7dq1S4MHD9bYsWNVuXJlSdLnn3+uxx57zOEBAQAAYJtC9n6gVq1aVrNi80yfPl3u7u4OCQUAAAD73dF97NLS0vThhx8qKipKFy9elCT9+OOPOnfunEPDAQAAwHZ2H7Hbu3evWrRoIV9fX/3888/q27ev/Pz8tHz5cqWkpGjx4sX5kRMAAAC3YfcRu8jISPXp00dHjhyRl5eXZXnbtm21ceNGh4YDAACA7ewudtu3b1f//v1vWf7ggw8qNTXVIaEAAABgP7uLnaenpzIyMm5ZfvjwYT3wwAMOCQUAAAD72V3snnnmGb3++uvKycmRJJlMJqWkpGj06NHq1KmTwwMCAADANnYXuxkzZujKlSvy9/fXtWvX1KxZM1WuXFklSpTQm2++mR8ZAQAAYAO7Z8X6+PgoMTFRmzZt0t69e3XlyhXVrVtXYWFh+ZEPAAAANrK72OVp2rSp6tevL09PT5lMJkdmAgAAwB2w+1Ss2WzW5MmT9eCDD6p48eI6fvy4JGn8+PH66KOPHB4QAAAAtrG72L3xxhuKjY3VtGnTVLhwYcvyRx99VB9++KFDwwEAAMB2dhe7xYsX61//+pe6d+9u9WzY2rVrKzk52aHhAAAAYDu7i92pU6dUuXLlW5abzWbLLVAAAABw79ld7IKDg/Xdd9/dsvzzzz9XnTp1HBIKAAAA9rN7VuyECRPUq1cvnTp1SmazWcuXL9ehQ4e0ePFixcfH50dGAAAA2MDuI3bt27fXqlWrtGbNGhUrVkwTJkzQwYMHtWrVKrVs2TI/MgIAAMAGd3Qfu8cff1yJiYmOzgIAAIC7YPcRu169emnjxo35kQUAAAB3we5il56errCwMFWpUkVvvfWWTp06lR+5AAAAYCe7i90XX3yhU6dOaeDAgfrss88UFBSkNm3a6PPPP+d2JwAAAE5kd7GTpAceeECRkZH64YcftHXrVlWuXFk9evRQYGCghg8friNHjjg6JwAAAG7jjopdnjNnzigxMVGJiYlyd3dX27ZttW/fPgUHB2vmzJmOyggAAAAb2F3scnJytGzZMj311FOqUKGC4uLiNGzYMJ0+fVqLFi3SmjVrtHTpUr3++uv5kRcAAAB/wu7bnZQtW1Zms1ldu3bVtm3bFBIScsuY5s2by9fX1wHxAAAAYCu7i93MmTP1/PPPy8vL60/H+Pr66vjx43cVDAAAAPaxu9j16NHD8ueTJ09KksqVK+e4RAAAALgjdl9jZzab9frrr8vHx0cVKlRQhQoV5Ovrq8mTJ8tsNudHRgAAANjA7iN2Y8eO1UcffaSpU6eqSZMmkqRNmzZp4sSJun79ut58802HhwQAAMDt2V3sFi1apA8//FDPPPOMZVmtWrX04IMP6uWXX6bYAQAAOIndp2IvXryo6tWr37K8evXqunjxokNCAQAAwH52F7vatWtrzpw5tyyfM2eOateu7ZBQAAAAsJ/dp2KnTZumdu3aac2aNQoNDZUkJSUl6cSJE/ryyy8dHhCAawoas9rZEXAf+HlqO2dHAO4rdh+xa9asmQ4fPqxnn31WaWlpSktLU8eOHXXo0CE9/vjj+ZERAAAANrD7iJ0kBQYGMkkCAACggLGp2O3du9fmDdaqVeuOwwAAAODO2VTsQkJCZDKZZBjGX44zmUzKzc11SDAAAADYx6Zix3NfAQAACj6bil2FChXyOwcAAADu0h1Nnjh06JDeffddHTx4UJL0yCOP6JVXXlG1atUcGg4AAAC2s/t2J8uWLdOjjz6qnTt3qnbt2qpdu7Z27dqlRx99VMuWLcuPjAAAALCB3UfsRo0apaioKL3++utWy6OjozVq1Ch16tTJYeEAAABgO7uP2J05c0Y9e/a8ZfkLL7ygM2fOOCQUAAAA7Gd3sXvyySf13Xff3bJ806ZNPHkCAADAiew+FfvMM89o9OjR2rlzpxo3bixJ2rJli+Li4jRp0iStXLnSaiwAAADuDbuP2L388su6cOGC3nvvPfXs2VM9e/bUe++9p/Pnz+vll19Whw4d1KFDBz377LO33dbGjRv19NNPKzAwUCaTSV988YXV+t69e8tkMlm9WrdubTXm4sWL6t69u7y9veXr66uIiAhduXLFaszevXv1+OOPy8vLS+XLl9e0adNuyRIXF6fq1avLy8tLNWvW1Jdffmm13jAMTZgwQWXLllWRIkUUFhamI0eO2PhbAwAAyH92Fzuz2WzTy5YnUGRmZqp27dqaO3fun45p3bq1zpw5Y3l9+umnVuu7d++uAwcOKDExUfHx8dq4caP69etnWZ+RkaFWrVqpQoUK2rlzp6ZPn66JEyfqX//6l2XM5s2b1bVrV0VERGj37t2Wcrp//37LmGnTpikmJkbz58/X1q1bVaxYMYWHh+v69ev2/PoAAADyjcm43XPC7hGTyaQVK1aoQ4cOlmW9e/dWWlraLUfy8hw8eFDBwcHavn276tevL0lKSEhQ27ZtdfLkSQUGBmrevHkaO3asUlNTVbhwYUnSmDFj9MUXXyg5OVmS1LlzZ2VmZio+Pt6y7caNGyskJETz58+XYRgKDAzUiBEj9Oqrr0qS0tPTVaZMGcXGxqpLly42/YwZGRny8fFRenq6vL297f0V2SRozOp82S5cx89T2zk7giT2VdimoOyvgDPZ0x/u6AbF27dv17fffqtz587JbDZbrXvnnXfuZJN/av369fL391fJkiX1j3/8Q2+88YZKlSolSUpKSpKvr6+l1ElSWFiY3NzctHXrVj377LNKSkrSE088YSl1khQeHq63335bly5dUsmSJZWUlKTIyEir7w0PD7cUyuPHjys1NVVhYWGW9T4+PmrUqJGSkpL+tNhlZWUpKyvL8j4jI+Oufx8AAAB/xu5i99Zbb2ncuHGqVq2aypQpI5PJZFl3858doXXr1urYsaMqVqyoo0eP6rXXXlObNm2UlJQkd3d3paamyt/f3+ozhQoVkp+fn1JTUyVJqampqlixotWYMmXKWNaVLFlSqamplmU3j7l5Gzd/7o/G/JEpU6Zo0qRJd/CTAwAA2M/uYjd79mwtWLBAvXv3zoc41m4+ElazZk3VqlVLDz/8sNavX68WLVrk+/ffraioKKsjgRkZGSpfvrwTEwEAAFdm9+QJNzc3NWnSJD+y3FalSpVUunRp/fTTT5KkgIAAnTt3zmrMjRs3dPHiRQUEBFjGnD171mpM3vvbjbl5/c2f+6Mxf8TT01Pe3t5WLwA
AgPxid7EbPnz4X85izU8nT57Ur7/+qrJly0qSQkNDlZaWpp07d1rGrFu3TmazWY0aNbKM2bhxo3JycixjEhMTVa1aNZUsWdIyZu3atVbflZiYqNDQUElSxYoVFRAQYDUmIyNDW7dutYwBAABwNrtPxb766qtq166dHn74YQUHB8vDw8Nq/fLly23e1pUrVyxH36TfJins2bNHfn5+8vPz06RJk9SpUycFBATo6NGjGjVqlCpXrqzw8HBJ0iOPPKLWrVurb9++mj9/vnJycjR48GB16dJFgYGBkqRu3bpp0qRJioiI0OjRo7V//37Nnj1bM2fOtHzv0KFD1axZM82YMUPt2rXTkiVLtGPHDsstUUwmk4YNG6Y33nhDVapUUcWKFTV+/HgFBgZazeIFAABwJruL3ZAhQ/Ttt9+qefPmKlWq1F1NmNixY4eaN29ueZ93PVqvXr00b9487d27V4sWLVJaWpoCAwPVqlUrTZ48WZ6enpbP/Oc//9HgwYPVokULubm5qVOnToqJibGs9/Hx0TfffKNBgwapXr16Kl26tCZMmGB1r7vHHntMn3zyicaNG6fXXntNVapU0RdffKFHH33UMmbUqFHKzMxUv379lJaWpqZNmyohIUFeXl53/PMDAAA4kt33sStRooSWLFmidu24t5C9uI8dCoKCcl8w9lXYoqDsr4Az2dMf7L7Gzs/PTw8//PAdhwMAAED+sLvYTZw4UdHR0bp69Wp+5AEAAMAdsvsau5iYGB09elRlypRRUFDQLZMndu3a5bBwAAAAsJ3dxY5ZoAAAAAWT3cUuOjo6P3IAAADgLtl8jd22bduUm5v7p+uzsrK0dOlSh4QCAACA/WwudqGhofr1118t7729vXXs2DHL+7S0NHXt2tWx6QAAAGAzm4vd729390e3v7PzlngAAABwILtvd/JX7uYpFAAAALg7Di12AAAAcB67ZsX++OOPSk1NlfTbadfk5GRduXJFknThwgXHpwMAAIDN7Cp2LVq0sLqO7qmnnpL02ylYwzA4FQsAAOBENhe748eP52cOAAAA3CWbi12FChXyMwcAAADuEpMnAAAAXATFDgAAwEVQ7AAAAFwExQ4AAMBF3FGxu3HjhtasWaP3339fly9fliSdPn3ack87AAAA3Ht23cdOkn755Re1bt1aKSkpysrKUsuWLVWiRAm9/fbbysrK0vz58/MjJwAAAG7D7iN2Q4cOVf369XXp0iUVKVLEsvzZZ5/V2rVrHRoOAAAAtrP7iN13332nzZs3q3DhwlbLg4KCdOrUKYcFAwAAgH3sPmJnNpuVm5t7y/KTJ0+qRIkSDgkFAAAA+9ld7Fq1aqVZs2ZZ3ptMJl25ckXR0dFq27atI7MBAADADnafip0xY4bCw8MVHBys69evq1u3bjpy5IhKly6tTz/9ND8yAgAAwAZ2F7ty5crphx9+0JIlS7R3715duXJFERER6t69u9VkCgAAANxbdhc7SSpUqJBeeOEFR2cBAADAXbijYnf69Glt2rRJ586dk9lstlo3ZMgQhwQDAACAfewudrGxserfv78KFy6sUqVKyWQyWdaZTCaKHQAAgJPYXezGjx+vCRMmKCoqSm5uPGoWAACgoLC7mV29elVdunSh1AEAABQwdreziIgIxcXF5UcWAAAA3AW7T8VOmTJFTz31lBISElSzZk15eHhYrX/nnXccFg4AAAC2u6Ni9/XXX6tatWqSdMvkCQAAADjHHT15YsGCBerdu3c+xAEAAMCdsvsaO09PTzVp0iQ/sgAAAOAu2F3shg4dqnfffTc/sgAAAOAu2H0qdtu2bVq3bp3i4+NVo0aNWyZPLF++3GHhAAAAYDu7i52vr686duyYH1kAAABwF+wudgsXLsyPHAAAALhLPD4CAADARdh0xK5u3bpau3atSpYsqTp16vzl/ep27drlsHAAAACwnU3Frn379vL09JQkdejQIT/zAAAA4A7ZVOyio6P14osvavbs2YqOjs7vTAAAALgDNl9jt2jRIl27di0/swAAAOAu2FzsDMPIzxwAAAC4S3bd7uTy5cvy8vL6yzHe3t53FQgAAAB3xq5iV7Vq1T9dZxiGTCaTcnNz7zoUAAAA7GdXsfv888/l5+eXX1kAAABwF+wqdk2aNJG/v39+ZQEAAMBd4MkTAAAALsLmYlehQgW5u7vnZxYAAADcBZtPxR4/fjw/cwAAAOAucSoWAADARVDsAAAAXATFDgAAwEVQ7AAAAFyETZMnYmJibN7gkCFD7jgMAAAA7pxNxW7mzJk2bcxkMlHsAAAAnMSmYsetTgAAAAo+rrEDAABwETYdsYuMjLR5g++8884dhwEAAMCds6nY7d6926aNmUymuwoDAACAO2dTsfv222/zOwcAAADuEtfYAQAAuAibjtjdrHnz5n95ynXdunV3FQgAAAB3xu5iFxISYvU+JydHe/bs0f79+9WrVy9H5QIAAICd7C52f3az4okTJ+rKlSt3HQgAAAB3xmHX2L3wwgtasGCBXZ/ZuHGjnn76aQUGBspkMumLL76wWm8YhiZMmKCyZcuqSJEiCgsL05EjR6zGXLx4Ud27d5e3t7d8fX0VERFxS8Hcu3evHn/8cXl5eal8+fKaNm3aLVni4uJUvXp1eXl5qWbNmvryyy/tzgIAAOBMDit2SUlJ8vLysuszmZmZql27tubOnfuH66dNm6aYmBjNnz9fW7duVbFixRQeHq7r169bxnTv3l0HDhxQYmKi4uPjtXHjRvXr18+yPiMjQ61atVKFChW0c+dOTZ8+XRMnTtS//vUvy5jNmzera9euioiI0O7du9WhQwd16NBB+/fvtysLAACAM5kMwzDs+UDHjh2t3huGoTNnzmjHjh0aP368oqOj7yyIyaQVK1aoQ4cOlu0GBgZqxIgRevXVVyVJ6enpKlOmjGJjY9WlSxcdPHhQwcHB2r59u+rXry9JSkhIUNu2bXXy5EkFBgZq3rx5Gjt2rFJTU1W4cGFJ0pgxY/TFF18oOTlZktS5c2dlZmYqPj7ekqdx48YKCQnR/Pnzbcpii4yMDPn4+Cg9PV3e3t539Hu6naAxq/Nlu3AdP09t5+wIkthXYZuCsr8CzmRPf7D7iJ2Pj4/Vy8/PT08++aS+/PLLOy51f+T48eNKTU1VWFiY1Xc3atRISUlJkn47Sujr62spdZIUFhYmNzc3bd261TLmiSeesJQ6SQoPD9ehQ4d06dIly5ibvydvTN732JLlj2RlZSkjI8PqBQAAkF9snjxx7NgxVaxYUQsXLszPPBapqamSpDJlylgtL1OmjGVdamqq/P39rdYXKlRIfn5+VmMqVqx4yzby1pUsWVKpqam3/Z7bZfkjU6ZM0aRJk27/wwIAADiAzUfsqlSpovPnz1ved+7cWWfPns2XUK4iKipK6enplteJEyecHQkAALgwm4vd7y/F+/LLL5WZmenwQHkCAgIk6ZbyePbsWcu6gIAAnTt3zmr9jRs3dPHiRasxf7SNm7/jz8bcvP52Wf6Ip6envL29rV4AAAD5pcA+UqxixYoKCA
jQ2rVrLcsyMjK0detWhYaGSpJCQ0OVlpamnTt3WsasW7dOZrNZjRo1sozZuHGjcnJyLGMSExNVrVo1lSxZ0jLm5u/JG5P3PbZkAQAAcDabi53JZLrlUWJ/9WgxW1y5ckV79uzRnj17JP02SWHPnj1KSUmRyWTSsGHD9MYbb2jlypXat2+fevbsqcDAQMvM2UceeUStW7dW3759tW3bNn3//fcaPHiwunTposDAQElSt27dVLhwYUVEROjAgQP67LPPNHv2bEVGRlpyDB06VAkJCZoxY4aSk5M1ceJE7dixQ4MHD7b8nLfLAgAA4Gw2T54wDEO9e/eWp6enJOn69esaMGCAihUrZjVu+fLlNn/5jh071Lx5c8v7vLLVq1cvxcbGatSoUcrMzFS/fv2Ulpampk2bKiEhwep+ef/5z380ePBgtWjRQm5uburUqZNiYmIs6318fPTNN99o0KBBqlevnkqXLq0JEyZY3evuscce0yeffKJx48bptddeU5UqVfTFF1/o0UcftYyxJQsAAIAz2Xwfuz59+ti0wXs1a/Z+xH3sUBAUlPuCsa/CFgVlfwWcyZ7+YPMROwobAABAwVZgJ08AAADAPhQ7AAAAF0GxAwAAcBEUOwAAABdhU7GrW7euLl26JEl6/fXXdfXq1XwNBQAAAPvZVOwOHjxoeXzYpEmTdOXKlXwNBQAAAPvZdLuTkJAQ9enTR02bNpVhGPq///s/FS9e/A/HTpgwwaEBAQAAYBubil1sbKyio6MVHx8vk8mkr776SoUK3fpRk8lEsQMAAHASm4pdtWrVtGTJEkmSm5ub1q5dK39//3wNBgAAAPvY/OSJPGazOT9yAAAA4C7ZXewk6ejRo5o1a5YOHjwoSQoODtbQoUP18MMPOzQcAAAAbGf3fey+/vprBQcHa9u2bapVq5Zq1aqlrVu3qkaNGkpMTMyPjAAAALCB3UfsxowZo+HDh2vq1Km3LB89erRatmzpsHAAAACwnd1H7A4ePKiIiIhblr/44ov68ccfHRIKAAAA9rO72D3wwAPas2fPLcv37NnDTFkAAAAnsvtUbN++fdWvXz8dO3ZMjz32mCTp+++/19tvv63IyEiHBwQAAIBt7C5248ePV4kSJTRjxgxFRUVJkgIDAzVx4kQNGTLE4QEBAABgG7uLnclk0vDhwzV8+HBdvnxZklSiRAmHBwMAAIB97ug+dnkodAAAAAWH3ZMnAAAAUDBR7AAAAFwExQ4AAMBF2FXscnJy1KJFCx05ciS/8gAAAOAO2VXsPDw8tHfv3vzKAgAAgLtg96nYF154QR999FF+ZAEAAMBdsPt2Jzdu3NCCBQu0Zs0a1atXT8WKFbNa/8477zgsHAAAAGxnd7Hbv3+/6tatK0k6fPiw1TqTyeSYVAAAALCb3cXu22+/zY8cAAAAuEt3fLuTn376SV9//bWuXbsmSTIMw2GhAAAAYD+7i92vv/6qFi1aqGrVqmrbtq3OnDkjSYqIiNCIESMcHhAAAAC2sbvYDR8+XB4eHkpJSVHRokUtyzt37qyEhASHhgMAAIDt7L7G7ptvvtHXX3+tcuXKWS2vUqWKfvnlF4cFAwAAgH3sPmKXmZlpdaQuz8WLF+Xp6emQUAAAALCf3cXu8ccf1+LFiy3vTSaTzGazpk2bpubNmzs0HAAAAGxn96nYadOmqUWLFtqxY4eys7M1atQoHThwQBcvXtT333+fHxkBAABgA7uP2D366KM6fPiwmjZtqvbt2yszM1MdO3bU7t279fDDD+dHRgAAANjA7iN2kuTj46OxY8c6OgsAAADuwh0Vu0uXLumjjz7SwYMHJUnBwcHq06eP/Pz8HBoOAAAAtrP7VOzGjRsVFBSkmJgYXbp0SZcuXVJMTIwqVqyojRs35kdGAAAA2MDuI3aDBg1S586dNW/ePLm7u0uScnNz9fLLL2vQoEHat2+fw0MCAADg9uw+YvfTTz9pxIgRllInSe7u7oqMjNRPP/3k0HAAAACwnd3Frm7dupZr62528OBB1a5d2yGhAAAAYD+bTsXu3bvX8uchQ4Zo6NCh+umnn9S4cWNJ0pYtWzR37lxNnTo1f1ICAADgtmwqdiEhITKZTDIMw7Js1KhRt4zr1q2bOnfu7Lh0AAAAsJlNxe748eP5nQMAAAB3yaZiV6FChfzOAQAAgLt0RzcoPn36tDZt2qRz587JbDZbrRsyZIhDggEAAMA+dhe72NhY9e/fX4ULF1apUqVkMpks60wmE8UOAADASewuduPHj9eECRMUFRUlNze775YCAACAfGJ3M7t69aq6dOlCqQMAAChg7G5nERERiouLy48sAAAAuAt2n4qdMmWKnnrqKSUkJKhmzZry8PCwWv/OO+84LBwAAABsd0fF7uuvv1a1atUk6ZbJEwAAAHAOu4vdjBkztGDBAvXu3Tsf4gAAAOBO2X2Nnaenp5o0aZIfWQAAAHAX7C52Q4cO1bvvvpsfWQAAAHAX7D4Vu23bNq1bt07x8fGqUaPGLZMnli9f7rBwAAAAsJ3dxc7X11cdO3bMjywAAAC4C3YXu4ULF+ZHDgAAANwlHh8BAADgIuw+YlexYsW/vF/dsWPH7ioQAAAA7ozdxW7YsGFW73NycrR7924lJCRo5MiRjsoFAAAAO9ld7IYOHfqHy+fOnasdO3bcdSAAAADcGYddY9emTRstW7bMUZsDAACAnRxW7D7//HP5+fk5anMAAACwk92nYuvUqWM1ecIwDKWmpur8+fN67733HBoOAAAAtrO72HXo0MHqvZubmx544AE9+eSTql69uqNyAQAAwE52n4qNjo62eo0fP14DBgzIl1I3ceJEmUwmq9fN33P9+nUNGjRIpUqVUvHixdWpUyedPXvWahspKSlq166dihYtKn9/f40cOVI3btywGrN+/XrVrVtXnp6eqly5smJjY2/JMnfuXAUFBcnLy0uNGjXStm3bHP7zAgAA3I0Cf4PiGjVq6MyZM5bXpk2bLOuGDx+uVatWKS4uThs2bNDp06etHneWm5urdu3aKTs7W5s3b9aiRYsUGxurCRMmWMYcP35c7dq1U/PmzbVnzx4NGzZML730kr7++mvLmM8++0yRkZGKjo7Wrl27VLt2bYWHh+vcuXP35pcAAABgA5uLnZubm9zd3f/yVaiQ3Wd2b6tQoUIKCAiwvEqXLi1JSk9P10cffaR33nlH//jHP1SvXj0tXLhQmzdv1pYtWyRJ33zzjX788Uf9+9//VkhIiNq0aaPJkydr7ty5ys7OliTNnz9fFStW1IwZM/TII49o8ODBeu655zRz5kxLhnfeeUd9+/ZVnz59FBwcrPnz56to0aJasGCBw39eAACAO2VzE1uxYsWfrktKSlJMTIzMZrNDQt3syJEjCgwMlJeXl0JDQzVlyhQ99NBD2rlzp3JychQWFmYZW716dT300ENKSkpS48aNlZSUpJo1a6pMmTKWMeHh4Ro4cKAOHDigOnXqKCkpyWobeWPybsScnZ2tnTt3KioqyrLezc1NYWFhSkpK+svsWVlZysrKsrzPyMi4m
18FAADAX7K52LVv3/6WZYcOHdKYMWO0atUqde/eXa+//rpDwzVq1EixsbGqVq2azpw5o0mTJunxxx/X/v37lZqaqsKFC8vX19fqM2XKlFFqaqokKTU11arU5a3PW/dXYzIyMnTt2jVdunRJubm5fzgmOTn5L/NPmTJFkyZNsvvnBgAAuBN3dO709OnTio6O1qJFixQeHq49e/bo0UcfdXQ2tWnTxvLnWrVqqVGjRqpQoYKWLl2qIkWKOPz7HC0qKkqRkZGW9xkZGSpfvrwTEwEAAFdm1+SJ9PR0jR49WpUrV9aBAwe0du1arVq1Kl9K3R/x9fVV1apV9dNPPykgIEDZ2dlKS0uzGnP27FkFBARIkgICAm6ZJZv3/nZjvL29VaRIEZUuXVru7u5/OCZvG3/G09NT3t7eVi8AAID8YnOxmzZtmipVqqT4+Hh9+umn2rx5sx5//PH8zHaLK1eu6OjRoypbtqzq1asnDw8PrV271rL+0KFDSklJUWhoqCQpNDRU+/bts5q9mpiYKG9vbwUHB1vG3LyNvDF52yhcuLDq1atnNcZsNmvt2rWWMQAAAAWBzadix4wZoyJFiqhy5cpatGiRFi1a9Ifjli9f7rBwr776qp5++mlVqFDBcvrX3d1dXbt2lY+PjyIiIhQZGSk/Pz95e3vrlVdeUWhoqBo3bixJatWqlYKDg9WjRw9NmzZNqampGjdunAYNGiRPT09J0oABAzRnzhyNGjVKL774otatW6elS5dq9erVlhyRkZHq1auX6tevr4YNG2rWrFnKzMxUnz59HPazAgAA3C2bi13Pnj2tHiV2L5w8eVJdu3bVr7/+qgceeEBNmzbVli1b9MADD0iSZs6cKTc3N3Xq1ElZWVkKDw+3eqyZu7u74uPjNXDgQIWGhqpYsWLq1auX1SSPihUravXq1Ro+fLhmz56tcuXK6cMPP1R4eLhlTOfOnXX+/HlNmDBBqampCgkJUUJCwi0TKgAAAJzJZBiG4ewQfxcZGRny8fFRenp6vl1vFzRm9e0H4W/t56ntnB1BEvsqbFNQ9lfAmezpDwX+yRMAAACwDcUOAADARVDsAAAAXATFDgAAwEVQ7AAAAFwExQ4AAMBFUOwAAABcBMUOAADARVDsAAAAXATFDgAAwEVQ7AAAAFwExQ4AAMBFUOwAAABcBMUOAADARVDsAAAAXATFDgAAwEVQ7AAAAFwExQ4AAMBFUOwAAABcBMUOAADARVDsAAAAXATFDgAAwEVQ7AAAAFwExQ4AAMBFUOwAAABcRCFnBwAAoKALGrPa2RFQwP08tZ2zI0jiiB0AAIDLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIih0AAICLoNgBAAC4CIodAACAi6DYAQAAuAiKHQAAgIug2AEAALgIip2d5s6dq6CgIHl5ealRo0batm2bsyMBAABIotjZ5bPPPlNkZKSio6O1a9cu1a5dW+Hh4Tp37pyzowEAAFDs7PHOO++ob9++6tOnj4KDgzV//nwVLVpUCxYscHY0AAAAFXJ2gPtFdna2du7cqaioKMsyNzc3hYWFKSkp6Q8/k5WVpaysLMv79PR0SVJGRka+5TRnXc23bcM15Of+Zw/2VdiC/RX3i/zcV/O2bRjGbcdS7Gx04cIF5ebmqkyZMlbLy5Qpo+Tk5D/8zJQpUzRp0qRblpcvXz5fMgK28Jnl7ASA7dhfcb+4F/vq5cuX5ePj85djKHb5KCoqSpGRkZb3ZrNZFy9eVKlSpWQymZyY7O8hIyND5cuX14kTJ+Tt7e3sOMBfYn/F/YJ99d4zDEOXL19WYGDgbcdS7GxUunRpubu76+zZs1bLz549q4CAgD/8jKenpzw9Pa2W+fr65ldE/Alvb2/+5YP7Bvsr7hfsq/fW7Y7U5WHyhI0KFy6sevXqae3atZZlZrNZa9euVWhoqBOTAQAA/IYjdnaIjIxUr169VL9+fTVs2FCzZs1SZmam+vTp4+xoAAAAFDt7dO7cWefPn9eECROUmpqqkJAQJSQk3DKhAgWDp6enoqOjbzkdDhRE7K+4X7CvFmwmw5a5swAAACjwuMYOAADARVDsAAAAXATFDgAAwEVQ7AAAAFwExQ4AAMBFUOyA28jNzbX82Ww2OzEJAAB/jWIH/E7eHYAyMjKUnZ0td3d3bdy4UZLk5sb/ZXDvcVcqALbibyngd44ePaqjR4/queee0/fff68lS5boySefVGJiorOj4W/IbDbLZDJJktLS0pSZmamrV69KovABuBVPngBuMnDgQP30009avHixLl++rP79++vnn3/WRx99pJYtW8psNnPUDveMYRiW/e2tt97Sd999p59//lmNGjVSjx491KJFCycnBP7c4cOH9cMPP6hGjRoKDg62LM/NzZW7u7sTk7k2ih3wP2vXrtXKlSu1dOlSlS1bVhMmTFC3bt0UGBiocuXKKScnRx4eHjIMw3IEBcgPeftY3n42duxYvf/++5o3b57c3Nw0a9Ysde/eXXv37pW/v7+T0wK3unz5sp555hn5+vrq6tWreu655/TMM88oJCSEUpfPOPQA/M/ly5d17do1NWnSRCtXrlS/fv0UGxurqlWrauzYsVq9erVycnJkMpmsToExoQKOZjKZLPvVsWPHtGbNGn3++ed6/vnnVaxYMe3du1eTJ0+Wv7+/1eQeoKDw9PTUQw89pGrVqunjjz/W+vXrNWLECD333HM6cOCA0tPTJXE5QX6g2AH/06JFC9WoUUNBQUF69tlnNW3aNLVv317Lly9XiRIl9NZbb+mrr77SjRs3ZDKZ9O9//1sSEyrgOK+88oqmTJki6f/vV1lZWTpx4oRCQkK0atUqPf/883r77bfVt29fXbt2TQsWLNCpU6ecGRuwYjabVbhwYb399tv68ccf5e3trSVLlmjmzJlyd3fXk08+qS5duuibb75RZmams+O6HP5GAv6nRIkS6tChg1JSUhQQEKCuXbtKkooXL66VK1fK29tbU6ZM0cyZM/Xaa6+pZ8+e+umnn5ycGq7i7Nmzun79uhYvXqw5c+ZYlhcpUkSPPPKI5s2bpx49emj69OkaMGCAJCk5OVmJiYlKSUlxVmzAIu/om5ubmwzDUNmyZfXggw8qPj5e/v7+qlWrlnx8fFSsWDH5+/vr2WefVfPmza32d9w9ih2g3y7mzczM1C+//KLJkycrMDBQ1atXt8w+LFasmFauXKmHHnpIK1eu1KpVq7Rr1y5VrlzZycnhKsqUKaOoqCg99dRTmjNnjt59911JUlBQkPz9
/TV27FgNGjTIUuquXr2qcePGKTMzU40aNXJmdECHDx/WBx98oMuXL0v67XKCgIAAhYeH680331RmZqZeeuklxcfHa8WKFVq0aJG+/PJLNW3aVK1atXJyetdiMjjBjb+xP5sIsWPHDr300ku6fv26du/erSJFikiSbty4oQsXLsjT01MlS5a813HxN3Ds2DG99957io+P14ABAzRs2DBJUsuWLfXjjz+qa9euKly4sLZs2aJz585p9+7d8vDwYMY2nGrGjBkaOXKkYmJi1Lt3bxUvXlySdO3aNfXu3VubN2+Wm5ubPv/8czVo0MDyuRs3bqhQIeZxOhK/Tfxt5ZW6rVu3avPmzTIMQ/Xq1VOzZs1Uv359LVy4UH369FHdunW1a9cuFSlSRG5ubgoICHB2dLiQvEKW97+VKlVSv379JEnz5s2TYRgaPny4EhMTNXz4cB06dEiSVL9+fb311lsqVKgQfznC6UaMGKGsrCwNHTpUZrNZL774oooXL64iRYqoSpUqiouL08GDB1WtWjWr/6Bmv3U8jtjhb23ZsmV68cUXFRwcrOvXr+uHH37QuHHjNGrUKBUvXlw7d+7UgAEDlJKSol9++UVeXl7OjgwXcvNRtpSUFLm5uenBBx+UyWTSL7/8opiYGK1atUovv/yy5cjd9evX5enpafmLkXuCoSCZNGmSXn/9dc2cOVO9e/eWt7e3MjMzVbt2bfXu3Vvjxo1zdkSXR1XG39bhw4c1ZMgQzZgxQy+++KJu3Lihzz77TBEREXJ3d1d0dLRCQkI0Z84cjRo1SqdOndLDDz/s7NhwIXmlbuzYsfr000+Vm5urYsWKaerUqXrqqac0YsQISdL7778vNzc3DRky5Jb/uKDUwRnS09N14sQJ7dixQ6VKlVKdOnVUrlw5RUdHW44yS1KvXr3k4+Ojf/7zn1qzZo0iIiJUtmxZJ6d3cQbwN7V9+3ajatWqxrFjxwyz2WxZvnDhQsPNzc1ISkqyLLt+/bozIsJF5ebmWv68dOlSw8/Pz1iyZIkRHx9vvPDCC0bp0qWNuXPnGoZhGEePHjVGjhxp+Pr6GnFxcc6KDFgcPHjQaNu2rRESEmL4+PgYbm5uRt26dY2ZM2daxkRHRxtubm7GrFmzDMMwjA0bNhilS5c2UlNTnZT674NTsfjbSkpKUpMmTbR9+3bVq1fPcp1SWlqaGjVqpKioKPXu3dvZMeHCPv30U124cEGFChXSwIEDLctHjhypDz74QImJiWrQoIGSk5O1du1aDRgwgCN0cKoffvhBLVu2VJcuXdSxY0dVr15dx44d09ChQ/Xrr7+qT58+Gj9+vCRp8uTJeuONNzRx4kRFRUUpIyND3t7eTv4J/gac3SyBeyHviNyuXbuMxMREIycnxzAMw2jfvr3xj3/8wzh48KBl7LVr14yQkBBj0aJFTsmKv4cjR44YDz30kGEymYw333zTMIzf9r08Tz75pNGxY8dbPnfjxo17lhG42b59+4wiRYoYEydOvGXd8ePHjTZt2hgVK1Y0Vq1aZVkeFRVl+Pn5GRcvXryXUf/WmBsPl2f8bwbW8uXL1bZtW+3YsUO//PKLJKlnz55yd3dX//79tWnTJu3Zs0eTJ0/W6dOn9cQTTzg5OVyJ8buTI+XKlVNMTIxq1aqlzz//XJLk5eWlnJwcSVLVqlX/cMYgR+zgDKmpqWrTpo0aNmyo6OhoSb9N3DEMQ2azWUFBQZo3b57S0tK0atUqy+feeustHTp0iNtD3UMUO7g8k8mkxMRE9erVS9HR0Ro2bJhlEkTHjh0VFRUlX19fPfHEE+ratavi4uKUkJCgoKAg5waHyzCbzVb3S8zOzpaXl5fatm2rKVOm6MKFC3riiSeUlZUls9kswzC0b98+y73AAGfz9PRUSEiI3Nzc9P7771tmYxuGITc3N2VnZ6tChQoaMGCAtmzZovT0dN24cUOSVKpUKSen/3vhGju4NMMwlJubq969e6tEiRKaN2+eLl++rGPHjmnJkiXy8PBQVFSUihQpoj179qhYsWLy8fGRv7+/s6PDBU2bNk1btmzRmTNn1LNnT7Vt21YVKlRQQkKC+vXrZ7mPXfny5bV161bt27dPHh4ef3ojbSC/nT9/XteuXdNDDz2kCxcuaPDgwTpx4oR69uypvn37Wt2DUfrtLEhKSorWr1/v3OB/Y9zuBC7NZDKpUKFCKl68uE6fPq0NGzZo8eLFOnnypE6ePCl3d3etW7dO69atU0hIiLPjwsXc/BfexIkT9e6776pHjx7y8/PTuHHjtGnTJkVGRqp169aaP3++Jk2apOTkZM2cOVO1a9eWxJ354Tzp6el67rnnVL16dUVFRSkoKEhz5szR4MGD9fHHH0uSpdzl5uYqPT1dWVlZatu2raQ/f7IP8henYuFy/uggdIMGDXTx4kW1bt1aV69e1YABA7R7924NHDjQ8rgwwNFuvvlwVlaW4uLiNGvWLH344Yf69NNPdejQIc2ZM0dXr15V8+bNNXbsWJUsWVIjR460bIO/GOEMFy9elI+Pj1q1aqXdu3fr3Xff1fHjx1W6dGnNmTNH5cqV0+LFi/XBBx/IbDbL3d1dM2bM0O7du/Xcc89JYt91Fv4zEC4l778Qt2/friNHjujcuXN6/vnnFRERofbt2+vnn39W/fr1LeOSk5Mtj2QqXLiws+PDBa1cuVIdOnTQAw88oJYtW1qWt2rVSmazWe3bt1evXr305JNPqnXr1nJ3d9fo0aPVsGFDbdu2jckSuOdSU1PVtm1bvffeexo7dqzc3NwUFxcnSRo0aJAqVapkOXK3ePFiFS1aVIcOHdLs2bO1adMmVapUyck/wd+cU+biAvkoLi7O8Pb2Npo0aWL4+PgYVatWNSZMmGCkpaVZxiQnJxsjRowwfH19jb179zoxLVxN3s2H8/731KlTxqBBgwyTyWTExsYahmFYbrdjGIZRo0YNY/r06Zb32dnZxrJly4wGDRoYv/zyyz1MDvwmOzvbKFu2rBEREWFZNnXqVKNOnTpGZGSkcfToUcMwDOP8+fNGt27djFKlShlFihQxduzY4azIuAmnYuFSDhw4oKFDh2rmzJlKTExUWlqann32WX377beaOXOmsrKytHXrVk2cOFEbNmzQ+vXrVbNmTWfHhotYsmSJXnrpJR0+fFjXrl2TJAUGBmrcuHHq0aOHBg4cqHXr1lmumcvIyNC1a9dUokQJSb8dcfbw8NDTTz+tb7/9Vg899JDTfhb8PeXm5srDw0OTJk3Spk2btGXLFknS6NGj1aVLF3377beaO3eujh07ptKlS2v27Nl6/vnntXPnTtWrV8/J6SExKxb3OeN3F+euXbtWERER2rhxo+UvxaysLI0dO1YJCQnauHGj/Pz8lJSUpKCgIJ5ZCIfJyMhQ3bp1lZGRoYCAADVs2FBNmza1PL3k6tWreumll7RixQr1799fgYGB+u677/Tzzz9r9+7dTJBAgbJ37161bt1aY8aM0ZAhQyzLp02bpiVLligsLEz9+vVT5cqVrSYJwfn4JwGXsG7dOu3evVvZ2dkyDMN
ytCQnJ0eenp566623dPToUa1cuVKSFBoaSqmDQxUrVkz//Oc/NXnyZMXGxqp69eoaPny4unXrpqlTp8rDw0MxMTEaOHCgYmJitH37dvXo0UM7duywXOcJOFtubq4kqVatWurbt6+mTp1quaG7JI0aNUrdu3dXXFycYmNjdePGDSZJFDAUO9zXTCaTNm7cqLCwMJ0+fVqNGzeW2WzWhAkTJEkeHh6SfjuaEhwcrICAAGfGhQtzd3fX448/rpEjR6pQoUJ69dVXdebMGVWuXFmvvfaaQkNDtWDBArVs2VLDhg1TQkKCypUrJ09PT2VlZXHEDk5x7NgxPffcc9q8ebPS09MtNx2WpPDwcJUqVUqbNm2S9NvZD0kaMWKEhg8froiICBUqVIhiV8BwKhb3taNHj+rAgQP68ccfNWbMGEnS5s2b9fTTT6t58+YaM2aMihcvrv/85z/68MMPtWXLFlWoUMHJqeHKBg0aJEmaO3euJKlGjRqqWrWqHn74YR04cEBff/21pkyZov379ys+Pl4rVqzQk08+6cTE+Ls6fvy49u7dq0mTJunChQsqU6aMxo0bpzp16lguZenYsaOOHTumPXv2SOK+ivcDih3uW6dPn7Y89mvs2LGW5xdK0rZt29StWzdlZ2fL3d1d7u7uWrp0qerWreuktPi7+Oijj7Rw4UKtWrVKLVq0UNGiRfXll1/K29tbJ0+e1ObNm9WxY0dlZWWpe/fu2rFjh44cOcL9FHFPXb9+Xa1atVJqaqoOHz6sNWvWaMGCBYqPj1etWrXUsmVLjRw5UocOHVJERISGDBliuV4UBRvFDveVvMkSeRfrLl68WGPGjFHTpk21dOlSqzFXr17V/v37LQ+o5jQs7pWGDRtqx44deuKJJ7R8+XL5+fndMubGjRuWO/UHBgY6ISX+zsxms77//nv17dtXJUuW1ObNm2UymfTVV19pw4YNmjdvnipXrqwKFSro6NGjatasmWJiYpwdGzag2OG+kVfYvv/+ex06dEhPP/20SpcurU8//VQREREaOHCg3nnnHUm/TZrIu74OuFfy9tF///vfevvttxUbG6t69erxaCUUSGazWdu2bVPv3r3l5eWl3bt3W/bTc+fOafbs2frhhx/05Zdfqnjx4jp16pSKFy/OvlzAMXkC9w2TyaRly5apbdu2OnHihFJTU2UymfTPf/5TH374od577z29+uqrkmR5cDpwL+X9hde8eXP9+uuvSkxMtFoOOFNqaqrlvnTSb4+8q1evnhYvXqyrV6+qTp06ln9v+vv76/XXX9eKFSu0cOFCbdu2TSVKlGBfvg9wxA4F2s1HOnbs2KE2bdrorbfeUkREhNV9k27cuKHPPvtM/fv31wsvvKD58+c7KzIgSXr33Xc1adIkbdy4UcHBwc6Og7+5EydOqE6dOrp48aKaNWum0NBQhYWFqX79+vL29tb27dvVv39/5ebmas+ePTKZTMrOzuZRi/chjtihQNq0adMt90favn27qlevrhdeeMFS6sxmsySpUKFC6t69u2bNmqUVK1bo3LlzTskN5Gnbtq3atWun6tWrOzsKILPZrPLly6tq1aq6cuWKTp8+rXbt2qlZs2bq2bOnjh8/rqioKGVlZalFixYyDINSd5/iiB0KnI8//lixsbFaunSpSpUqZVk+YcIEffXVV9q+fbsk66N5W7ZsUeXKlVW6dGllZGTI29vbKdmBm+Xto7m5uXJ3d3d2HPzN/fTTTxo1apTMZrOioqJUtmxZbd68WXPmzFFOTo7279+vhx9+WPv371eHDh20fPlyZ0fGHaDYocDIm+l6+fJlXb58WYGBgUpJSVFgYKAKFSqkuLg4de7cWevWrbO671d2drbGjBmjhg0bqnPnzlwDAgB/4tChQxo6dKjMZrPefPNNNWjQQJKUlpamVatWKTk5WV999ZU++ugj1alTx8lpcScodigQ8krd0aNHlZycrHbt2ungwYPq0aOHunXrpldeeUUeHh7q1q2bVq9erWXLlqlp06a6fv26pk+frg8//FBJSUmqVKmSs38UACjQjhw5oldeeUWSFBUVpWbNmlmt5ybE9zf+yaFAcHNzszwSzN/fX5mZmerQoYOqVKmiZcuWycvLSwMGDND//d//qVixYmrVqpWqV68uLy8vnT17VgkJCZQ6ALBBlSpV9O6772rIkCGaMmWKPDw89Nhjj1nWU+rubxyxQ4Gxfv16tWjRQvXq1ZO/v7/69++v8PBwDRgwQPv27VNERIT69u0rd3d3JSQkKCUlRUWLFtXjjz/OY8IAwE5HjhxRZGSkLly4oJkzZ6px48bOjgQHoNihQImIiNCuXbv08MMP6/z58xo1apRatmxpKXe9e/dWv379uPkwADhAcnKyxo8frxkzZlieD4v7G8UOTpF3TV2erKwseXp66ssvv1RcXJy6du2q999/X6mpqRo7dqzCwsI0cOBAJScn69lnn9Xw4cOZZQgADsD96lwL97HDPZdX6k6cOKEVK1ZIkjw9PSVJDRo00JYtW3TkyBHNnz9fAQEBmjJlitasWaN58+bpwQcfVEJCgjIyMpz5IwCAy6DUuRaO2MEpbr4Leps2bdSrVy+FhISoatWqWrVqlaZPn65ly5bpwoULGjdunC5evKghQ4boqaee0oULF1S2bFln/wgAABQ4HLGDU5jNZlWsWFGNGzdWamqqEhMT1apVK/3rX//StWvX5OPjox07duiRRx7R5MmTVahQIX3wwQfKzs6m1AEA8Cc4YgenOXLkiMaMGSOz2ayePXvKZDJp9uzZ8vX11X//+181bNhQGzduVOHChXXo0CEVK1ZM5cqVc3ZsAAAKLIodnOrQoUMaPny4cnNz9e677+rBBx/Uvn379Oabb6pz58564YUXrB4dBgAA/hzFDk535MgRDR48WNJvz4Nt0qSJkxMBAHB/4ho7OF2VKlU0Z84cubm5afLkydq0aZOzIwEAcF+i2KFAqFKlimJiYuTh4aGRI0dqy5Ytzo4EAMB9h2KHAqNKlSqaPn26ypUrp8DAQGfHAQDgvsM1dihwuAs6AAB3hmIHAADgIjgVCwAA4CIodgAAAC6CYgcAAOAiKHYAAAAugmIHAADgIih2AAAALoJiBwAA4CIodgDwO6mpqXrllVdUqVIleXp6qnz58nr66ae1du1aZ0cDgL9UyNkBAKAg+fnnn9WkSRP5+vpq+vTpqlmzpnJycvT1119r0KBBSk5OdnZEAPhTHLEDgJu8/PLLMplM2rZtmzp16qSqVauqRo0aioyM1JYtWyRJKSkpat++vYoXLy5vb2/985//1NmzZy3bmDhxokJCQrRgwQI99NBDKl68uF5++WXl5uZq2rRpCggIkL+/v958802r7zaZTJo3b57atGmjIkWKqFKlSvr888+txowePVpVq1ZV0aJFValSJY0fP145OTm3fPfHH3+soKAg+fj4qEuXLrp8+bIkafHixSpVqpSysrKsttuhQwf16NHDob9LAPcexQ4A/ufixYtKSEjQoEGDVKxYsVvW+/r6ymw2q3379rp48aI2bN
igxMREHTt2TJ07d7Yae/ToUX311VdKSEjQp59+qo8++kjt2rXTyZMntWHDBr399tsaN26ctm7davW58ePHq1OnTvrhhx/UvXt3denSRQcPHrSsL1GihGJjY/Xjjz9q9uzZ+uCDDzRz5sxbvvuLL75QfHy84uPjtWHDBk2dOlWS9Pzzzys3N1crV660jD937pxWr16tF1988a5/hwCczAAAGIZhGFu3bjUkGcuXL//TMd98843h7u5upKSkWJYdOHDAkGRs27bNMAzDiI6ONooWLWpkZGRYxoSHhxtBQUFGbm6uZVm1atWMKVOmWN5LMgYMGGD1fY0aNTIGDhz4p3mmT59u1KtXz/L+j7575MiRRqNGjSzvBw4caLRp08byfsaMGUalSpUMs9n8p98D4P7ANXYA8D+GYdx2zMGDB1W+fHmVL1/esiw4OFi+vr46ePCgGjRoIEkKCgpSiRIlLGPKlCkjd3d3ubm5WS07d+6c1fZDQ0Nveb9nzx7L+88++0wxMTE6evSorly5ohs3bsjb29vqM7//7rJly1p9T9++fdWgQQOdOnVKDz74oGJjY9W7d2+ZTKbb/vwACjZOxQLA/1SpUkUmk8khEyQ8PDys3ptMpj9cZjabbd5mUlKSunfvrrZt2yo+Pl67d+/W2LFjlZ2dfdvvvvl76tSpo9q1a2vx4sXauXOnDhw4oN69e9ucA0DBRbEDgP/x8/NTeHi45s6dq8zMzFvWp6Wl6ZFHHtGJEyd04sQJy/Iff/xRaWlpCg4OvusMeRM0bn7/yCOPSJI2b96sChUqaOzYsapfv76qVKmiX3755Y6+56WXXlJsbKwWLlyosLAwqyOQAO5fFDsAuMncuXOVm5urhg0batmyZTpy5IgOHjyomJgYhYaGKiwsTDVr1lT37t21a9cubdu2TT179lSzZs1Uv379u/7+uLg4LViwQIcPH1Z0dLS2bdumwYMHS/rtiGJKSoqWLFmio0ePKiYmRitWrLij7+nWrZtOnjypDz74gEkTgAuh2AHATSpVqqRdu3apefPmGjFihB599FG1bNlSa9eu1bx582QymfTf//5XJUuW1BNPPKGwsDBVqlRJn332mUO+f9KkSVqyZIlq1aqlxYsX69NPP7UcCXzmmWc0fPhwDR48WCEhIdq8ebPGjx9/R9/j4+OjTp06qXjx4urQoYNDsgNwPpNhy9XCAIB8ZzKZtGLFintWtFq0aKEaNWooJibmnnwfgPzHrFgA+Ju5dOmS1q9fr/Xr1+u9995zdhwADkSxA4C/mTp16ujSpUt6++23Va1aNWfHAeBAnIoFAABwEUyeAAAAcBEUOwAAABdBsQMAAHARFDsAAAAXQbEDAABwERQ7AAAAF0GxAwAAcBEUOwAAABdBsQMAAHAR/w+eV9RfyOw7vAAAAABJRU5ErkJggg==" /> -```python +```python PYTHON agent_executor.invoke({ "input": "Hey how are you?", }) @@ -171,7 +171,7 @@ agent_executor.invoke({ ````txt title="Output" > Entering new AgentExecutor chain... Plan: I will respond to the user's greeting. -Action: ```json +Action: ```json JSON [ { "tool_name": "directly_answer", diff --git a/fern/pages/cookbooks/document-parsing-for-enterprises.mdx b/fern/pages/cookbooks/document-parsing-for-enterprises.mdx index 3108d7349..b5edce8f0 100644 --- a/fern/pages/cookbooks/document-parsing-for-enterprises.mdx +++ b/fern/pages/cookbooks/document-parsing-for-enterprises.mdx @@ -57,20 +57,20 @@ Before we dive into the technical weeds, we need to set up the notebook's runtim - precomputed parsed documents for each parsing solution. While the point of this notebook is to illustrate how this is done, we provide the parsed final results to allow readers to skip ahead to the RAG section without having to set up the required infrastructure for each solution.) - Add utility functions needed for later sections -```python +```python PYTHON %%capture ! sudo apt install tesseract-ocr poppler-utils ! pip install "cohere<5" fsspec hnswlib google-cloud-documentai google-cloud-storage boto3 langchain-text-splitters llama_parse pytesseract pdf2image pandas ``` -```python +```python PYTHON data_dir = "data/document-parsing" source_filename = "example-drug-label" extension = "pdf" ``` -```python +```python PYTHON from pathlib import Path sources = ["gcp", "aws", "unstructured-io", "llamaparse-text", "llamaparse-markdown", "pytesseract"] @@ -88,13 +88,13 @@ for filename in filenames: Make sure to include the notebook's utility functions in the runtime. 
-```python +```python PYTHON def store_document(path: str, doc_content: str): with open(path, 'w') as f: f.write(doc_content) ``` -```python +```python PYTHON import json def insert_citations_in_order(text, citations, documents): @@ -130,7 +130,7 @@ def insert_citations_in_order(text, citations, documents): return text_with_citations, "\n".join(citations_reference) ``` -```python +```python PYTHON def format_docs_for_chat(documents): return [{"id": str(index), "text": x} for index, x in enumerate(documents)] ``` @@ -157,7 +157,7 @@ The following block can be executed in one of two ways: **Note: You can skip to the next block if you want to use the pre-existing parsed version.** -```python +```python PYTHON """ Extracted from https://cloud.google.com/document-ai/docs/samples/documentai-batch-process-document """ @@ -269,7 +269,7 @@ def batch_process_documents( # ) ``` -```python +```python PYTHON """ Post process parsed document and store it locally. Make sure to run this in a Google Vertex AI environment or include a credentials file. @@ -301,7 +301,7 @@ for filename, doc_content in parsed_documents: #### Visualize the parsed document -```python +```python PYTHON filename = "gcp-parsed-{}.txt".format(source_filename) with open("{}/{}".format(data_dir, filename), "r") as doc: parsed_document = doc.read() @@ -329,7 +329,7 @@ At minimum, you will need access to the following AWS resources to get started: First, we bring in the `TextractWrapper` class provided in the [AWS Code Examples repository](https://github.com/awsdocs/aws-doc-sdk-examples/blob/main/python/example_code/textract/textract_wrapper.py). This class makes it simpler to interface with the Textract service. -```python +```python PYTHON # source: https://github.com/awsdocs/aws-doc-sdk-examples/tree/main/python/example_code/textract # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. @@ -612,7 +612,7 @@ class TextractWrapper: Next, we set up Textract and S3, and provide this to an instance of `TextractWrapper`. -```python +```python PYTHON import boto3 textract_client = boto3.client('textract') @@ -630,7 +630,7 @@ Asynchronous calls follow the below process: 3. Once the request is complete, Textract sends out a message to the SNS topic. This can be used in conjunction with other services such as Lambda or SQS for downstream processes. 4. The parsed results can be fetched from Textract in chunks via the job ID. -```python +```python PYTHON bucket_name = "your-bucket-name" sns_topic_arn = "your-sns-arn" # this can be found under the topic you created in the Amazon SNS dashboard sns_role_arn = "sns-role-arn" # this is an IAM role that allows Textract to interact with SNS @@ -638,7 +638,7 @@ sns_role_arn = "sns-role-arn" # this is an IAM role that allows Textract to inte file_name = "example-drug-label.pdf" ``` -```python +```python PYTHON # kick off a text detection job. This returns a job ID. job_id = textractWrapper.start_detection_job(bucket_name=bucket_name, document_file_name=file_name, sns_topic_arn=sns_topic_arn, sns_role_arn=sns_role_arn) @@ -652,7 +652,7 @@ This response corresponds to one chunk of information parsed by Textract. The nu Textract returns an information-rich representation of the extracted text, such as their position on the page and hierarchical relationships with other entities, all the way down to the individual word level. Since we are only interested in the raw text, we need a way to parse through all of the chunks and their `Blocks`. 
Lucky for us, Amazon provides some [helper functions](https://github.com/aws-samples/textract-paragraph-identification/tree/main) for this purpose, which we utilize below. -```python +```python PYTHON def get_text_results_from_textract(job_id): response = textract_client.get_document_text_detection(JobId=job_id) collection_of_textract_responses = [] @@ -729,7 +729,7 @@ We feed in the Job ID from before into the function `get_text_results_from_textr Finally, we can concatenate the lines into one string to pass into our downstream RAG pipeline. -```python +```python PYTHON all_text = "\n".join([line["text"] if line else "" for line in text_info_with_line_spacing]) with open(f"aws-parsed-{source_filename}.txt", "w") as f: @@ -738,7 +738,7 @@ with open(f"aws-parsed-{source_filename}.txt", "w") as f: #### Visualize the parsed document -```python +```python PYTHON filename = "aws-parsed-{}.txt".format(source_filename) with open("{}/{}".format(data_dir, filename), "r") as doc: parsed_document = doc.read() @@ -761,7 +761,7 @@ The guide assumes an endpoint exists that hosts this service. The API is offered **Note: You can skip to the next block if you want to use the pre-existing parsed version.** -```python +```python PYTHON import os import requests @@ -789,7 +789,7 @@ parsed_document = " ".join([parsed_entry["text"] for parsed_entry in parsed_resp print("Parsed {}".format(source_filename)) ``` -```python +```python PYTHON """ Post process parsed document and store it locally. """ @@ -800,7 +800,7 @@ store_document(file_path, parsed_document) #### Visualize the parsed document -```python +```python PYTHON filename = "unstructured-io-parsed-{}.txt".format(source_filename) with open("{}/{}".format(data_dir, filename), "r") as doc: parsed_document = doc.read() @@ -825,7 +825,7 @@ Parsing documents with LlamaParse offers an option for two output modes both of **Note: You can skip to the next block if you want to use the pre-existing parsed version.** -```python +```python PYTHON import os from llama_parse import LlamaParse @@ -836,7 +836,7 @@ llama_index_api_key = "{API_KEY}" input_path = "{}/{}.{}".format(data_dir, source_filename, extension) ``` -```python +```python PYTHON # Text mode text_parser = LlamaParse( api_key=llama_index_api_key, @@ -849,7 +849,7 @@ text_parsed_document = " ".join([parsed_entry.text for parsed_entry in text_resp print("Parsed {} to text".format(source_filename)) ``` -```python +```python PYTHON """ Post process parsed document and store it locally. """ @@ -858,7 +858,7 @@ file_path = "{}/{}-text-parsed-fda-approved-drug.txt".format(data_dir, "llamapar store_document(file_path, text_parsed_document) ``` -```python +```python PYTHON # Markdown mode markdown_parser = LlamaParse( api_key=llama_index_api_key, @@ -871,7 +871,7 @@ markdown_parsed_document = " ".join([parsed_entry.text for parsed_entry in markd print("Parsed {} to markdown".format(source_filename)) ``` -```python +```python PYTHON """ Post process parsed document and store it locally. 
""" @@ -882,7 +882,7 @@ store_document(file_path, markdown_parsed_document) #### Visualize the parsed document -```python +```python PYTHON # Text parsing filename = "llamaparse-text-parsed-{}.txt".format(source_filename) @@ -893,7 +893,7 @@ with open("{}/{}".format(data_dir, filename), "r") as doc: print(parsed_document[:1000]) ``` -```python +```python PYTHON # Markdown parsing filename = "llamaparse-markdown-parsed-fda-approved-drug.txt" @@ -909,18 +909,18 @@ The final parsing method we examine does not rely on cloud services, but rather #### Parsing the document -```python +```python PYTHON from matplotlib import pyplot as plt from pdf2image import convert_from_path import pytesseract ``` -```python +```python PYTHON # pdf2image extracts as a list of PIL.Image objects pages = convert_from_path(filename) ``` -```python +```python PYTHON # we look at the first page as a sanity check: plt.imshow(pages[0]) @@ -930,11 +930,11 @@ plt.show() Now, we can process the image of each page with `pytesseract` and concatenate the results to get our parsed document. -```python +```python PYTHON label_ocr_pytesseract = "".join([pytesseract.image_to_string(page) for page in pages]) ``` -```python +```python PYTHON print(label_ocr_pytesseract[:200]) ``` @@ -948,7 +948,7 @@ IWILFIN. IWILFIN™ (eflor ``` -```python +```python PYTHON label_ocr_pytesseract = "".join([pytesseract.image_to_string(page) for page in pages]) with open(f"pytesseract-parsed-{source_filename}.txt", "w") as f: @@ -957,7 +957,7 @@ with open(f"pytesseract-parsed-{source_filename}.txt", "w") as f: #### Visualize the parsed document -```python +```python PYTHON filename = "pytesseract-parsed-{}.txt".format(source_filename) with open("{}/{}".format(data_dir, filename), "r") as doc: parsed_document = doc.read() @@ -976,12 +976,12 @@ We can now ask a set of simple + complex questions and see how each parsing solu - **I need a succinct summary of the compound name, indication, route of administration, and mechanism of action of Iwilfin.** - Task: Overall document summary -```python +```python PYTHON import cohere co = cohere.Client(api_key="{API_KEY}") ``` -```python +```python PYTHON """ Document Questions """ @@ -1004,7 +1004,7 @@ source = "gcp" In order to set up our RAG implementation, we need to separate the parsed text into chunks and load the chunks to an index. The index will allow us to retrieve relevant passages from the document for different queries. Here, we use a simple implementation of indexing using the `hnswlib` library. Note that there are many different indexing solutions that are appropriate for specific production use cases. -```python +```python PYTHON """ Read parsed document content and chunk data """ @@ -1038,14 +1038,14 @@ documents = [c.page_content for c in chunks_] print("Source document has been broken down to {} chunks".format(len(documents))) ``` -```python +```python PYTHON """ Embed document chunks """ document_embeddings = co.embed(texts=documents, model="embed-english-v3.0", input_type="search_document").embeddings ``` -```python +```python PYTHON """ Create document index and add embedded chunks """ @@ -1066,7 +1066,7 @@ Count: 115 In this step, we use k-nearest neighbors to fetch the most relevant documents for our query. Once the nearest neighbors are retrieved, we use Cohere's reranker to reorder the documents in the most relevant order with regards to our input search query. 
-```python +```python PYTHON """ Embed search query Fetch k nearest neighbors @@ -1080,7 +1080,7 @@ neighbors = [(result[0][0][i], result[1][0][i]) for i in range(len(result[0][0]) relevant_docs = [documents[x[0]] for x in sorted(neighbors, key=lambda x: x[1])] ``` -```python +```python PYTHON """ Rerank retrieved documents """ @@ -1091,7 +1091,7 @@ reranked_relevant_docs = format_docs_for_chat([x.document["text"] for x in reran ## Final Step: Call Command-R + RAG! -```python +```python PYTHON """ Call the /chat endpoint with command-r """ @@ -1113,12 +1113,12 @@ print(citations_reference) Run the code cells below to make head to head comparisons of the different parsing techniques across different questions. -```python +```python PYTHON import pandas as pd results = pd.read_csv("{}/results-table.csv".format(data_dir)) ``` -```python +```python PYTHON question = input(""" Question 1: What are the most common adverse reactions of Iwilfin? Question 2: What is the recommended dosage of Iwilfin on body surface area between 0.5 m2 and 0.75 m2? diff --git a/fern/pages/cookbooks/elasticsearch-and-cohere.mdx b/fern/pages/cookbooks/elasticsearch-and-cohere.mdx index 37321f373..998547253 100644 --- a/fern/pages/cookbooks/elasticsearch-and-cohere.mdx +++ b/fern/pages/cookbooks/elasticsearch-and-cohere.mdx @@ -30,13 +30,13 @@ First we need to `pip` install the following packages: After installing, in the Serverless dashboard, find your endpoint URL, and create your API key. -```python +```python PYTHON pip install elasticsearch_serverless cohere ``` Next, we need to import the modules we need. 🔐 NOTE: getpass enables us to securely prompt the user for credentials without echoing them to the terminal, or storing it in memory. -```python +```python PYTHON from elasticsearch_serverless import Elasticsearch, helpers from getpass import getpass import cohere @@ -51,7 +51,7 @@ Then we create a `client` object that instantiates an instance of the `Elasticse When creating your Elastic Serverless API key make sure to turn on Control security privileges, and edit cluster privileges to specify `"cluster": ["all"]` -```python +```python PYTHON ELASTICSEARCH_ENDPOINT = getpass("Elastic Endpoint: ") ELASTIC_API_KEY = getpass("Elastic encoded API key: ") # Use the encoded API key @@ -63,7 +63,7 @@ client = Elasticsearch( Confirm that the client has connected with this test: -```python +```python PYTHON print(client.info()) ``` @@ -73,7 +73,7 @@ Let's create the inference endpoint by using the [Create inference API](https:// You'll need an Cohere API key for this that you can find in your Cohere account under the [API keys section](https://dashboard.cohere.com/api-keys). A production key is required to complete the steps in this notebook as the Cohere free trial API usage is limited. -```python +```python PYTHON COHERE_API_KEY = getpass("Enter Cohere API key: ") client.options(ignore_status=[404]).inference.delete_model(inference_id="cohere_embeddings") @@ -98,7 +98,7 @@ client.inference.put_model( Create an ingest pipeline with an inference processor by using the [`put_pipeline`](https://www.elastic.co/guide/en/elasticsearch/reference/master/put-pipeline-api.html) method. Reference the inference endpoint created above as the `model_id` to infer against the data that is being ingested in the pipeline. 
-```python +```python PYTHON client.options(ignore_status=[404]).ingest.delete_pipeline(id="cohere_embeddings") client.ingest.put_pipeline( @@ -132,7 +132,7 @@ The mapping of the destination index – the index that contains the embeddings Let's create an index named `cohere-wiki-embeddings` with the mappings we need. -```python +```python PYTHON client.indices.delete(index="cohere-wiki-embeddings", ignore_unavailable=True) client.indices.create( index="cohere-wiki-embeddings", @@ -161,7 +161,7 @@ client.indices.create( Let's insert our example wiki dataset. You need a production Cohere account to complete this step, otherwise the documentation ingest will time out due to the API request rate limits. -```python +```python PYTHON url = "https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl" response = requests.get(url) @@ -187,7 +187,7 @@ After the dataset has been enriched with the embeddings, you can query the data Pass a `query_vector_builder` to the k-nearest neighbor (kNN) vector search API, and provide the query text and the model you have used to create the embeddings. -```python +```python PYTHON query = "When were the semi-finals of the 2022 FIFA world cup played?" response = client.search( @@ -228,7 +228,7 @@ In order to effectively combine the results from our vector and BM25 retrieval, First, create an inference endpoint with your Cohere API key. Make sure to specify a name for your endpoint, and the model_id of one of the rerank models. In this example we will use Rerank 3. -```python +```python PYTHON client.options(ignore_status=[404]).inference.delete_model(inference_id="cohere_rerank") client.inference.put_model( @@ -253,7 +253,7 @@ The inference service will respond with a list of documents in descending order In this case we will set the response to False and will reconstruct the input documents based on the index returned in the response. -```python +```python PYTHON response = client.inference.inference( inference_id="cohere_rerank", body={ @@ -280,13 +280,13 @@ Now that we have ranked our results, we can easily turn this into a RAG system w First, we will create the Cohere client. -```python +```python PYTHON co = cohere.Client(COHERE_API_KEY) ``` Next, we can easily get a grounded generation with citations from the Cohere Chat API. We simply pass in the user query and documents retrieved from Elastic to the API, and print out our grounded response. 
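The cell below sends the user query and the reranked Elastic hits to the Chat API. Once it has run and `response` is populated, the grounding can be inspected by walking the `citations` field; the sketch below assumes the dict-style citation entries (with `start`, `end`, and `document_ids` keys) used elsewhere in these cookbooks, so adjust the field access if your SDK version returns citation objects instead.

```python PYTHON
# Run after the chat call below has populated `response`.
for citation in response.citations or []:
    cited_span = response.text[citation["start"]:citation["end"]]
    print(f'"{cited_span}" -> grounded in {citation["document_ids"]}')
```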
-```python +```python PYTHON response = co.chat( message=query, documents=ranked_documents, diff --git a/fern/pages/cookbooks/embed-jobs-serverless-pinecone.mdx b/fern/pages/cookbooks/embed-jobs-serverless-pinecone.mdx index cf203a414..3814c588c 100644 --- a/fern/pages/cookbooks/embed-jobs-serverless-pinecone.mdx +++ b/fern/pages/cookbooks/embed-jobs-serverless-pinecone.mdx @@ -8,7 +8,7 @@ import { CookbookHeader } from "../../components/cookbook-header"; -```python +```python PYTHON import os import json import time @@ -30,7 +30,7 @@ pc = Pinecone( ## Step 1: Upload a dataset -```python +```python PYTHON dataset_file_path = "data/embed_jobs_sample_data.jsonl" # Full path - https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl ds=co.create_dataset( @@ -64,7 +64,7 @@ cohere.Dataset { ## Step 2: Create embeddings via Cohere's Embed Jobs endpoint -```python +```python PYTHON job = co.create_embed_job(dataset_id=ds.id, input_type='search_document', model='embed-english-v3.0', @@ -78,7 +78,7 @@ job.wait() # poll the server until the job is completed ... ``` -```python +```python PYTHON print(job) ``` @@ -108,7 +108,7 @@ cohere.EmbedJob { ## Step 3: Prepare embeddings for upsert -```python +```python PYTHON output_dataset=co.get_dataset(job.output.id) data_array = [] for record in output_dataset: @@ -123,7 +123,7 @@ to_upsert = list(zip(ids, embeds, meta)) ## Step 4: Initialize Pinecone vector database -```python +```python PYTHON from pinecone import ServerlessSpec index_name = "embed-jobs-serverless-test-example" @@ -140,7 +140,7 @@ idx = pc.Index(index_name) ## Step 5: Upsert embeddings into the index -```python +```python PYTHON batch_size = 128 for i in range(0, len(data_array), batch_size): @@ -159,7 +159,7 @@ print(idx.describe_index_stats()) ## Step 6: Query the index -```python +```python PYTHON query = "What did Microsoft announce in Las Vegas?" xq = co.embed( @@ -178,7 +178,7 @@ res = idx.query(xq, top_k=20, include_metadata=True) (1, 1024) ``` -```python +```python PYTHON for match in res['matches']: print(f"{match['score']:.2f}: {match['metadata']['text']}") ``` @@ -208,7 +208,7 @@ for match in res['matches']: ## Step 7: Rerank the retrieved results -```python +```python PYTHON docs =[match['metadata']['text'] for match in res['matches']] rerank_response = co.rerank( @@ -229,7 +229,7 @@ for response in rerank_response: ## Another example - query and rerank -```python +```python PYTHON query = "What was the first youtube video about?" xq = co.embed( @@ -271,7 +271,7 @@ for match in res['matches']: 0.45: In September 2020, YouTube announced that it would be launching a beta version of a new platform of 15-second videos, similar to TikTok, called YouTube Shorts. The platform was first tested in India but as of March 2021 has expanded to other countries including the United States with videos now able to be up to 1 minute long. The platform is not a standalone app, but is integrated into the main YouTube app. Like TikTok, it gives users access to built-in creative tools, including the possibility of adding licensed music to their videos. The platform had its global beta launch in July 2021. 
``` -```python +```python PYTHON docs =[match['metadata']['text'] for match in res['matches']] rerank_response = co.rerank( diff --git a/fern/pages/cookbooks/embed-jobs.mdx b/fern/pages/cookbooks/embed-jobs.mdx index b6eb861b2..7c2e93cb2 100644 --- a/fern/pages/cookbooks/embed-jobs.mdx +++ b/fern/pages/cookbooks/embed-jobs.mdx @@ -8,7 +8,7 @@ import { CookbookHeader } from "../../components/cookbook-header"; -```python +```python PYTHON import time import cohere import hnswlib @@ -17,7 +17,7 @@ co = cohere.Client('COHERE_API_KEY') ## Step 1: Upload a dataset -```python +```python PYTHON dataset_file_path = "data/embed_jobs_sample_data.jsonl" # Full path - https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl @@ -35,7 +35,7 @@ sample-file-hca4x0 was uploaded ... ``` -```python +```python PYTHON print(ds.await_validation()) ``` @@ -55,7 +55,7 @@ cohere.Dataset { ## Step 2: Create embeddings via Cohere's Embed Jobs endpoint -```python +```python PYTHON job = co.create_embed_job( dataset_id=ds.id, input_type='search_document' , @@ -70,7 +70,7 @@ job.wait() # poll the server until the job is completed ... ``` -```python +```python PYTHON print(job) ``` @@ -100,13 +100,13 @@ cohere.EmbedJob { ## Step 3: Download and prepare the embeddings -```python +```python PYTHON embeddings_file_path = 'embed_jobs_output.csv' output_dataset=co.get_dataset(job.output.id) output_dataset.save(filepath=embeddings_file_path, format="csv") ``` -```python +```python PYTHON embeddings=[] texts=[] for record in output_dataset: @@ -116,7 +116,7 @@ for record in output_dataset: ## Step 4: Initialize Hnwslib index and add embeddings -```python +```python PYTHON index = hnswlib.Index(space='ip', dim=1024) index.init_index(max_elements=len(embeddings), ef_construction=512, M=64) index.add_items(embeddings,list(range(len(embeddings)))) @@ -124,7 +124,7 @@ index.add_items(embeddings,list(range(len(embeddings)))) ## Step 5: Query the index and rerank the results -```python +```python PYTHON query = "What was the first youtube video about?" query_emb=co.embed( @@ -146,7 +146,7 @@ final_result = co.rerank( ## Step 6: Display the results -```python +```python PYTHON for idx, r in enumerate(final_result): print(f"Document Rank: {idx + 1}, Document Index: {r.index}") print(f"Document: {r.document['text']}") diff --git a/fern/pages/cookbooks/fueling-generative-content.mdx b/fern/pages/cookbooks/fueling-generative-content.mdx index 4539b317b..525a10de1 100644 --- a/fern/pages/cookbooks/fueling-generative-content.mdx +++ b/fern/pages/cookbooks/fueling-generative-content.mdx @@ -12,11 +12,11 @@ Generative models have proven extremely useful in content idea generation. But t Read the accompanying [blog post here](https://txt.cohere.ai/generative-content-keyword-research/). -```python +```python PYTHON ! pip install cohere -q ``` -```python +```python PYTHON import cohere import numpy as np import pandas as pd @@ -26,7 +26,7 @@ import cohere co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys ``` -```python +```python PYTHON #@title Enable text wrapping in Google Colab from IPython.display import HTML, display @@ -40,7 +40,7 @@ get_ipython().events.register('pre_run_cell', set_css) First, we need to get a supply of high-traffic keywords for a given topic. We can get this via keyword research tools, of which are many available. We’ll use Google Keyword Planner, which is free to use. 
-```python +```python PYTHON import wget wget.download("https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/remote_teams.csv", "remote_teams.csv") @@ -50,7 +50,7 @@ wget.download("https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebo 'remote_teams.csv' ``` -```python +```python PYTHON df = pd.read_csv('remote_teams.csv') df.columns = ["keyword","volume"] df.head() @@ -103,7 +103,7 @@ We can do that by clustering them into topics. For this, we’ll leverage Cohere The Cohere Embed endpoint turns a text input into a text embedding. -```python +```python PYTHON def embed_text(texts): output = co.embed( texts=texts, @@ -119,7 +119,7 @@ embeds = np.array(embed_text(df['keyword'].tolist())) We then use these embeddings to cluster the keywords. A common term used for this exercise is “topic modeling.” Here, we can leverage scikit-learn’s KMeans module, a machine learning algorithm for clustering. -```python +```python PYTHON NUM_TOPICS = 4 kmeans = KMeans(n_clusters=NUM_TOPICS, random_state=21, n_init="auto").fit(embeds) df['topic'] = list(kmeans.labels_) @@ -175,11 +175,11 @@ df.head() We use the Chat to generate a topic name for that cluster. -```python +```python PYTHON topic_keywords_dict = {topic: list(set(group['keyword'])) for topic, group in df.groupby('topic')} ``` -```python +```python PYTHON def generate_topic_name(keywords): # Construct the prompt prompt = f"""Generate a concise topic name that best represents these keywords.\ @@ -197,7 +197,7 @@ Keywords: {', '.join(keywords)}""" return response.text ``` -```python +```python PYTHON topic_name_mapping = {topic: generate_topic_name(keywords) for topic, keywords in topic_keywords_dict.items()} df['topic_name'] = df['topic'].map(topic_name_mapping) @@ -256,7 +256,7 @@ df.head() -```python +```python PYTHON for topic, name in topic_name_mapping.items(): print(f"Topic {topic}: {name}") ``` @@ -274,7 +274,7 @@ Now that we have the keywords nicely grouped into topics, we can proceed to gene Here we can implement a filter to take just the top N keywords from each topic, sorted by the search volume. In our case, we use 10. -```python +```python PYTHON TOP_N = 10 top_keywords = (df.groupby('topic') @@ -290,7 +290,7 @@ for topic, group in top_keywords.groupby('topic'): content_by_topic[topic] = {'topic_name': topic_name, 'keywords': keywords} ``` -```python +```python PYTHON content_by_topic ``` @@ -309,7 +309,7 @@ content_by_topic Next, we use the Chat endpoint to produce the content ideas. The prompt we’ll use is as follows -```python +```python PYTHON def generate_blog_ideas(keywords): prompt = f"""{keywords}\n\nThe above is a list of high-traffic keywords obtained from a keyword research tool. Suggest three blog post ideas that are highly relevant to these keywords. @@ -329,7 +329,7 @@ Abstract: """ Next, we generate the blog post ideas. It takes in a string of keywords, calls the Chat endpoint, and returns the generated text. -```python +```python PYTHON for key,value in content_by_topic.items(): value['ideas'] = generate_blog_ideas(value['keywords']) diff --git a/fern/pages/cookbooks/grounded-summarization.mdx b/fern/pages/cookbooks/grounded-summarization.mdx index 3aa4b5fd7..38f819fd7 100644 --- a/fern/pages/cookbooks/grounded-summarization.mdx +++ b/fern/pages/cookbooks/grounded-summarization.mdx @@ -19,7 +19,7 @@ This notebook provides the code to produce the outputs described in [this blog p ## 1. 
Setup [#setup] -```python +```python PYTHON %%capture import cohere @@ -41,7 +41,7 @@ co = cohere.Client(co_api_key) ``` -```python +```python PYTHON from google.colab import drive drive.mount("/content/drive", force_remount=True) @@ -58,7 +58,7 @@ print(f"Loaded IMF report with {num_tokens} tokens") ### Aside: define utils -```python +```python PYTHON def split_text_into_sentences(text: str) -> List[str]: sentences = sent_tokenize(text) @@ -189,7 +189,7 @@ def _add_chunks_by_priority( First, let's see Command-R's out-of-the-box performance. It's a 128k-context model, so we can pass the full IMF report in a single call. We replicate the exact instructions from the original tweet (correcting for a minor typo) for enabling fair comparisons. -```python +```python PYTHON prompt_template = """\ ## text {text} @@ -232,7 +232,7 @@ For more information on how to enable grounded generation via our `co.chat` API, Finally, note that we chunk the IMF report into multiple documents before passing them to `co.chat`. This isn't necessary (`co.chat` annotates citations at the character level), but allows for more human-readable citations. -```python +```python PYTHON summarize_preamble = """\ You will receive a series of text fragments from an article that are presented in chronological order. \ As the assistant, you must generate responses to user's requests based on the information given in the fragments. \ @@ -268,7 +268,7 @@ print(resp.text) Let's display the citations inside our answer: -```python +```python PYTHON print(insert_citations(resp.text, resp.citations)) ``` @@ -282,7 +282,7 @@ Around 40% of employment worldwide is exposed to AI [1, 6] by checking its chunk: -```python +```python PYTHON print(chunked[6]) ``` @@ -295,7 +295,7 @@ Even though Command-R is an efficient, light-weight model, for some applications We have a whole notebook dedicated to methods for reducing context length. Here, we call our 'text-rank' method to select maximally central chunks in a graph based on the chunk-to-chunk similarties. For more detail, please refer [to this cookbook](https://colab.research.google.com/drive/1zxSAbruOWwWJHNsj3N56uxZtUeiS7Evd). -```python +```python PYTHON num_tokens = 8192 shortened = textrank(text, co, num_tokens, n_sentences_per_passage=30) diff --git a/fern/pages/cookbooks/hello-world-meet-ai.mdx b/fern/pages/cookbooks/hello-world-meet-ai.mdx index cdec3bb11..bf96ebd27 100644 --- a/fern/pages/cookbooks/hello-world-meet-ai.mdx +++ b/fern/pages/cookbooks/hello-world-meet-ai.mdx @@ -25,11 +25,11 @@ We’ll cover three groups of tasks that you will typically work on when dealing The first step is to install the Cohere Python SDK. Next, create an API key, which you can generate from the Cohere [dashboard](https://os.cohere.ai/register) or [CLI tool](https://docs.cohere.ai/cli-key). -```python +```python PYTHON ! pip install cohere altair umap-learn -q ``` -```python +```python PYTHON import cohere import pandas as pd import numpy as np @@ -42,7 +42,7 @@ The Cohere Generate endpoint generates text given an input, called “prompt”. ### Try a Simple Prompt -```python +```python PYTHON prompt = "What is a Hello World program." response = co.chat( @@ -69,11 +69,11 @@ int main() { } ``` 2. **Python**: -```python +```python PYTHON print("Hello World") ``` 3. 
**Java**: -```java +```java JAVA class HelloWorld { public static void main(String[] args) { System.out.println("Hello World"); @@ -101,7 +101,7 @@ The "Hello World" program is a testament to the power of programming, as a simpl The output is not bad, but it can be better. We need to find a way to make the output tighter to how we want it to be, which is where we leverage _prompt engineering_. -```python +```python PYTHON prompt = """ Write the first paragraph of a blog post given a blog title. -- @@ -134,7 +134,7 @@ Starting to code can be daunting, but it's actually simpler than you think! The In real applications, you will likely need to produce these text generations on an ongoing basis, given different inputs. Let’s simulate that with our example. -```python +```python PYTHON def generate_text(topic): prompt = f""" Write the first paragraph of a blog post given a blog title. @@ -160,13 +160,13 @@ First Paragraph:""" return response.text ``` -```python +```python PYTHON topics = ["How to Grow in Your Career", "The Habits of Great Software Developers", "Ideas for a Relaxing Weekend"] ``` -```python +```python PYTHON paragraphs = [] for topic in topics: @@ -194,7 +194,7 @@ Cohere’s Classify endpoint makes it easy to take a list of texts and predict t ### Sentiment Analysis -```python +```python PYTHON from cohere import ClassifyExample examples = [ @@ -216,7 +216,7 @@ examples = [ ] ``` -```python +```python PYTHON inputs=["Hello, world! What a beautiful day", "It was a great time with great people", "Great place to work", @@ -232,7 +232,7 @@ inputs=["Hello, world! What a beautiful day", ] ``` -```python +```python PYTHON def classify_text(inputs, examples): """ Classify a list of input texts @@ -253,7 +253,7 @@ def classify_text(inputs, examples): return classifications ``` -```python +```python PYTHON predictions = classify_text(inputs,examples) classes = ["positive","negative","neutral"] @@ -325,7 +325,7 @@ Cohere’s Embed endpoint takes a piece of text and turns it into a vector embed Here we have a list of 50 top web search keywords about Hello, World! taken from a keyword tool. Let’s look at a few examples: -```python +```python PYTHON df = pd.read_csv("https://github.com/cohere-ai/notebooks/raw/main/notebooks/data/hello-world-kw.csv", names=["search_term"]) df.head() ``` @@ -365,7 +365,7 @@ df.head() We use the Embed endpoint to get the embeddings for each of these keywords. -```python +```python PYTHON def embed_text(texts, input_type): """ Turns a piece of text into embeddings @@ -383,7 +383,7 @@ def embed_text(texts, input_type): return response.embeddings ``` -```python +```python PYTHON df["search_term_embeds"] = embed_text(texts=df["search_term"].tolist(), input_type="search_document") doc_embeds = np.array(df["search_term_embeds"].tolist()) @@ -393,7 +393,7 @@ doc_embeds = np.array(df["search_term_embeds"].tolist()) We’ll look at a couple of example applications. The first example is semantic search. Given a new query, our "search engine" must return the most similar FAQs, where the FAQs are the 50 search terms we uploaded earlier. 
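One implementation detail is worth calling out before running the search: the 50 FAQs were embedded with `input_type="search_document"`, while incoming queries should be embedded with `input_type="search_query"`, so each side of the retrieval pair is encoded the way the model expects. A minimal reminder of that asymmetry, reusing the `embed_text` helper defined above (the next cells then run the actual search):

```python PYTHON
# The FAQ corpus and the user query deliberately use different input types.
faq_embeds = embed_text(texts=df["search_term"].tolist(), input_type="search_document")
question_embeds = embed_text(texts=["what is the history of hello world"], input_type="search_query")
```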
-```python +```python PYTHON query = "what is the history of hello world" query_embeds = embed_text(texts=[query], @@ -402,7 +402,7 @@ query_embeds = embed_text(texts=[query], We use cosine similarity to compare the similarity of the new query with each of the FAQs -```python +```python PYTHON from sklearn.metrics.pairwise import cosine_similarity @@ -433,7 +433,7 @@ def get_similarity(target, candidates): Finally, we display the top 5 FAQs that match the new query -```python +```python PYTHON similarity = get_similarity(query_embeds,doc_embeds) print("New query:") @@ -462,7 +462,7 @@ In the second example, we take the same idea as semantic search and take a broad We'll use the same 50 top web search terms about Hello, World! There are different techniques we can use to compress the embeddings down to just 2 dimensions while retaining as much information as possible. We'll use a technique called UMAP. And once we can get it down to 2 dimensions, we can plot these embeddings on a 2D chart. -```python +```python PYTHON import umap reducer = umap.UMAP(n_neighbors=49) umap_embeds = reducer.fit_transform(doc_embeds) @@ -471,7 +471,7 @@ df['x'] = umap_embeds[:,0] df['y'] = umap_embeds[:,1] ``` -```python +```python PYTHON chart = alt.Chart(df).mark_circle(size=500).encode( x= alt.X('x', diff --git a/fern/pages/cookbooks/long-form-general-strategies.mdx b/fern/pages/cookbooks/long-form-general-strategies.mdx index 933ae22be..31165ae18 100644 --- a/fern/pages/cookbooks/long-form-general-strategies.mdx +++ b/fern/pages/cookbooks/long-form-general-strategies.mdx @@ -33,7 +33,7 @@ We'll show you three potential mitigation strategies: truncating the document, q ## Getting Started [#getting-started] -```python +```python PYTHON %%capture !pip install cohere !pip install python-dotenv @@ -44,7 +44,7 @@ We'll show you three potential mitigation strategies: truncating the document, q !pip install pypdf2 ``` -```python +```python PYTHON import os import requests from collections import deque @@ -74,14 +74,14 @@ from IPython.display import HTML, display [nltk_data] Package punkt is already up-to-date! ``` -```python +```python PYTHON # Set up Cohere client co_model = 'command-r' co_api_key = getpass("Enter your Cohere API key: ") co = cohere.Client(api_key=co_api_key) ``` -```python +```python PYTHON def load_long_pdf(file_path): """ Load a long PDF file and extract its text content. @@ -120,7 +120,7 @@ def save_pdf_from_url(pdf_url, save_path): In this example we use the Proposal for a Regulation of the European Parliament and of the Council defining rules on Artificial Intelligence from 26 January 2024, [link](https://data.consilium.europa.eu/doc/document/ST-5662-2024-INIT/en/pdf). -```python +```python PYTHON # Download the PDF file from the URL pdf_url = 'https://data.consilium.europa.eu/doc/document/ST-5662-2024-INIT/en/pdf' save_path = 'example.pdf' @@ -141,7 +141,7 @@ Document length - #tokens: 134184 ## Summarizing the text -```python +```python PYTHON def generate_response(message, max_tokens=300, temperature=0.2, k=0): """ A wrapper around the Cohere API to generate a response based on a given prompt. @@ -166,7 +166,7 @@ def generate_response(message, max_tokens=300, temperature=0.2, k=0): return response.text ``` -```python +```python PYTHON # Example summary prompt. prompt_template = """ ## Instruction @@ -183,7 +183,7 @@ If you run the cell below, an error will occur. 
Therefore, in the following sect Error: :`CohereAPIError: too many tokens:` -```python +```python PYTHON prompt = prompt_template.format(document=long_text) # print(generate_response(message=prompt)) ``` @@ -194,7 +194,7 @@ Therefore, in the following sections, we will explore some techniques to address First we try to truncate the document so that it meets the length constraints. This approach is simple to implement and understand. However, it drops potentially important information contained towards the end of the document. -```python +```python PYTHON # The new Cohere model has a context limit of 128k tokens. However, for the purpose of this exercise, we will assume a smaller context window. # Employing a smaller context window also has the additional benefit of reducing the cost per request, especially if billed by the number of tokens. @@ -212,7 +212,7 @@ def truncate(long: str, max_tokens: int) -> str: return short ``` -```python +```python PYTHON short_text = truncate(long_text, MAX_TOKENS) prompt = prompt_template.format(document=short_text) @@ -248,7 +248,7 @@ See `query_based_retrieval` function for the starting point. ### Query based retrieval implementation -```python +```python PYTHON def split_text_into_sentences(text) -> List[str]: """ Split the input text into a list of sentences. @@ -279,7 +279,7 @@ def build_simple_chunks(text, n_sentences=5): return chunks ``` -```python +```python PYTHON sentences = split_text_into_sentences(long_text) passages = group_sentences_into_passages(sentences, n_sentences_per_passage=5) print('Example sentence:', np.random.choice(np.asarray(sentences), size=1, replace=False)) @@ -293,7 +293,7 @@ Example sentence: ['4.'] Example passage: ['T echnical robustness and safety means that AI systems are developed and used in a way that allows robustness in case of problems and resilience against attempts to alter the use or performance of the AI system so as to allow unlawful use by third parties, a nd minimise unintended harm. Privacy and data governance means that AI systems are developed and used in compliance with existing privacy and data protection rules, while processing data that meets high standards in terms of quality and integrity. Transpar ency means that AI systems are developed and used in a way that allows appropriate traceability and explainability, while making humans aware that they communicate or interact with an AI system, as well as duly informing deployers of the capabilities and l imitations of that AI system and affected persons about their rights. Diversity, non - discrimination and fairness means that AI systems are developed and used in a way that includes diverse actors and promotes equal access, gender equality and cultural dive rsity, while avoiding discriminatory impacts and unfair biases that are prohibited by Union or national law. Social and environmental well - being means that AI systems are developed and used in a sustainable and environmentally friendly manner as well as in a way to benefit all human beings, while monitoring and assessing the long - term impacts on the individual, society and democracy. 
'] ``` -```python +```python PYTHON def _add_chunks_by_priority( chunks: List[str], idcs_sorted_by_priority: List[int], @@ -348,7 +348,7 @@ def query_based_retrieval( return short ``` -```python +```python PYTHON # Example prompt prompt_template = """ ## Instruction @@ -361,7 +361,7 @@ prompt_template = """ """.strip() ``` -```python +```python PYTHON query = "What does the report say about biometric identification? Answer only based on the document." short_text = query_based_retrieval(long_text, MAX_TOKENS, query) prompt = prompt_template.format(query=query, document=short_text) @@ -403,7 +403,7 @@ See `text_rank` as the starting point. ### Text rank implementation -```python +```python PYTHON def text_rank(text: str, max_tokens: int, n_setences_per_passage: int) -> str: """ Shortens text by extracting key units of text from it based on their centrality. @@ -438,7 +438,7 @@ def text_rank(text: str, max_tokens: int, n_setences_per_passage: int) -> str: return short ``` -```python +```python PYTHON # Example summary prompt. prompt_template = """ ## Instruction @@ -451,7 +451,7 @@ Summarize the following Document in 3-5 sentences. Only answer based on the info """.strip() ``` -```python +```python PYTHON short_text = text_rank(long_text, MAX_TOKENS, 5) prompt = prompt_template.format(document=short_text) print(generate_response(message=prompt, max_tokens=600)) diff --git a/fern/pages/cookbooks/migrating-prompts.mdx b/fern/pages/cookbooks/migrating-prompts.mdx index e57df4faf..2ebcb4cd6 100644 --- a/fern/pages/cookbooks/migrating-prompts.mdx +++ b/fern/pages/cookbooks/migrating-prompts.mdx @@ -19,11 +19,11 @@ The two use cases demonstrated here are: 1. Autobiography Assistant; and 2. Legal Question Answering -```python +```python PYTHON #!pip install cohere ``` -```python +```python PYTHON import json import os import re @@ -32,7 +32,7 @@ import cohere import getpass ``` -```python +```python PYTHON CO_API_KEY = getpass.getpass('cohere API key:') ``` @@ -40,7 +40,7 @@ CO_API_KEY = getpass.getpass('cohere API key:') cohere API key:·········· ``` -```python +```python PYTHON co = cohere.Client(CO_API_KEY) ``` @@ -48,7 +48,7 @@ co = cohere.Client(CO_API_KEY) This application scenario is a common LLM-as-assistant use case: given some context, help the user to complete a task. In this case, the task is to write a concise autobiographical summary. -```python +```python PYTHON original_prompt = '''## information Current Job Title: Senior Software Engineer Current Company Name: GlobalSolTech @@ -64,14 +64,14 @@ The length of the text should be no more than 100 words. Write the summary in first person.''' ``` -```python +```python PYTHON response = co.chat( message=original_prompt, model='command-r', ) ``` -```python +```python PYTHON print(response.text) ``` @@ -81,7 +81,7 @@ print(response.text) Using Command-R, we can automatically upgrade the original prompt to a RAG-style prompt to get more faithful adherence to the instructions, a clearer and more concise prompt, and in-line citations for free. Consider the following meta-prompt: -```python +```python PYTHON meta_prompt = f'''Below is a task for an LLM delimited with ## Original Task. Your task is to split that task into two parts: (1) the context; and (2) the instructions. The context should be split into several separate parts and returned as a JSON object where each part has a name describing its contents and the value is the contents itself. 
Make sure to include all of the context contained in the original task description and do not change its meaning. @@ -98,7 +98,7 @@ Return everything in a JSON object with the following structure: ''' ``` -```python +```python PYTHON print(meta_prompt) ``` @@ -132,20 +132,20 @@ Write the summary in first person. Command-R returns with the following: -```python +```python PYTHON upgraded_prompt = co.chat( message=meta_prompt, model='command-r', ) ``` -```python +```python PYTHON print(upgraded_prompt.text) ``` ````txt title="Output" Here is the task delved into a JSON object as requested: -```json +```json JSON { "context": [ { @@ -168,7 +168,7 @@ Here is the task delved into a JSON object as requested: To extract the returned information, we will write two simple functions to post-process out the JSON and then parse it. -````python +````python PYTHON def get_json(text: str) -> str: matches = [m.group(1) for m in re.finditer("```([\w\W]*?)```", text)] if len(matches): @@ -179,7 +179,7 @@ def get_json(text: str) -> str: return text ```` -```python +```python PYTHON def get_prompt_and_docs(text: str) -> tuple: json_obj = json.loads(get_json(text)) prompt = json_obj['instructions'] @@ -190,11 +190,11 @@ def get_prompt_and_docs(text: str) -> tuple: return prompt, docs ``` -```python +```python PYTHON new_prompt, docs = get_prompt_and_docs(upgraded_prompt.text) ``` -```python +```python PYTHON new_prompt, docs ``` @@ -212,7 +212,7 @@ new_prompt, docs As we can see above, the new prompt is much more concise and gets right to the point. The context has been split into 4 "documents" that Command-R can ground the information to. Now let's run the same task with the new prompt while leveraging the `documents=` parameter. Note that the `docs` variable is a list of dict objects with `title` describing the contents of a text and `snippet` containing the text itself: -```python +```python PYTHON response = co.chat( message=new_prompt, model='command-r', @@ -220,7 +220,7 @@ response = co.chat( ) ``` -```python +```python PYTHON print(response.text) ``` @@ -230,7 +230,7 @@ I'm a senior software engineer with a Ph.D. in Statistics and over 15 years of A The response is concise. More importantly, we can ensure that there is no hallucination because the text is automatically grounded in the input documents. Using the simple function below, we can add this grounding information to the text as citations: -```python +```python PYTHON def insert_citations(text: str, citations: list[dict], add_one: bool=False): """ A helper function to pretty print citations. @@ -259,7 +259,7 @@ def insert_citations(text: str, citations: list[dict], add_one: bool=False): return text ``` -```python +```python PYTHON print(insert_citations(response.text, response.citations, True)) ``` @@ -273,11 +273,11 @@ Now let's move on to an arguably more difficult problem. On March 21st, the DOJ announced that it is [suing Apple](https://www.theverge.com/2024/3/21/24107659/apple-doj-lawsuit-antitrust-documents-suing) for anti-competitive practices. The [complaint](https://www.justice.gov/opa/media/1344546/dl) is 88 pages long and consists of about 230 paragraphs of text. To understand what the suit alleges, a common use case would be to ask for a summary. Because Command-R has a context window of 128K, even an 88-page legal complaint fits comfortably within the window. 
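Before pasting the whole complaint into a prompt, it is worth confirming that it actually fits. The sketch below mirrors the `co.tokenize` call used in the next cells and compares the count against an assumed 128K budget, with some headroom reserved for the question and the generated answer; the headroom figure is an illustrative choice, not a hard limit.

```python PYTHON
CONTEXT_WINDOW = 128_000   # Command-R context size, per the text above
RESERVED_TOKENS = 4_000    # rough headroom for instructions and the answer

legal_text = open('data/apple_mod.txt').read()
n_tokens = len(co.tokenize(text=legal_text, model='command-r').tokens)

if n_tokens + RESERVED_TOKENS > CONTEXT_WINDOW:
    print(f"Too long ({n_tokens} tokens): consider chunking or truncating first.")
else:
    print(f"Fits comfortably: {n_tokens} tokens.")
```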
-```python +```python PYTHON apple = open('data/apple_mod.txt').read() ``` -```python +```python PYTHON tokens = co.tokenize(text=apple, model='command-r') len(tokens.tokens) ``` @@ -288,7 +288,7 @@ len(tokens.tokens) We can set up a prompt template that allows us to ask questions on the original text. -```python +```python PYTHON prompt_template = ''' {legal_text} @@ -296,12 +296,12 @@ prompt_template = ''' ''' ``` -```python +```python PYTHON question = '''Please summarize the attached legal complaint succinctly. Focus on answering the question: what does the complaint allege?''' rendered_prompt = prompt_template.format(legal_text=apple, question=question) ``` -```python +```python PYTHON response = co.chat( message=rendered_prompt, model='command-r', @@ -309,7 +309,7 @@ response = co.chat( ) ``` -```python +```python PYTHON print(response.text) ``` @@ -319,19 +319,19 @@ The complaint alleges that Apple has violated antitrust laws by engaging in a pa The summary seems clear enough. But we are interested in the specific allegations that the DOJ makes. For example, skimming the full complaint, it looks like the DOJ is alleging that Apple could encrypt text messages sent to Android phones if it wanted to do so. We can amend the rendered prompt and ask: -```python +```python PYTHON question = '''Does the DOJ allege that Apple could encrypt text messages sent to Android phones?''' rendered_prompt = prompt_template.format(legal_text=apple, question=question) ``` -```python +```python PYTHON response = co.chat( message=rendered_prompt, model='command-r', ) ``` -```python +```python PYTHON print(response.text) ``` @@ -343,7 +343,7 @@ This is a very interesting allegation that at first glance suggests that the mod While previously we asked Command-R to chunk the text for us, the legal complaint is highly structured with numbered paragraphs so we can use the following function to break the complaint into input docs ready for RAG: -```python +```python PYTHON def chunk_doc(input_doc: str) -> list: chunks = [] current_para = 'Preamble' @@ -366,11 +366,11 @@ def chunk_doc(input_doc: str) -> list: return docs ``` -```python +```python PYTHON chunks = chunk_doc(apple) ``` -```python +```python PYTHON print(chunks[18]) ``` @@ -380,7 +380,7 @@ print(chunks[18]) We can now try the same question but ask it directly to Command-R with the chunks as grounding information. -```python +```python PYTHON response = co.chat( message='''Does the DOJ allege that Apple could encrypt text messages sent to Android phones?''', model='command-r', @@ -388,7 +388,7 @@ response = co.chat( ) ``` -```python +```python PYTHON print(response.text) ``` @@ -398,7 +398,7 @@ Yes, according to the DOJ, Apple could encrypt text messages sent from iPhones t The responses seem similar, but we should add citations and check the citation to get confidence in the response. -```python +```python PYTHON print(insert_citations(response.text, response.citations)) ``` @@ -408,7 +408,7 @@ Yes, according to the DOJ, Apple could encrypt text messages sent from iPhones t The most important passage seems to be paragraph 144. Paragraph 93 is also cited. Let's check what they contain. -```python +```python PYTHON print(chunks[144]['snippet']) ``` @@ -420,7 +420,7 @@ users to send encrypted messages to Android users while still using iMessage on which would instantly improve the privacy and security of iPhone and other smartphone users. 
``` -```python +```python PYTHON print(chunks[93]['snippet']) ``` diff --git a/fern/pages/cookbooks/multilingual-search.mdx b/fern/pages/cookbooks/multilingual-search.mdx index 442ed68bf..a6a62ecce 100644 --- a/fern/pages/cookbooks/multilingual-search.mdx +++ b/fern/pages/cookbooks/multilingual-search.mdx @@ -38,7 +38,7 @@ We'll go through the following examples: - Enter a question - Answer the question based on the most relevant documents -```python +```python PYTHON from langchain.embeddings.cohere import CohereEmbeddings from langchain.llms import Cohere from langchain.prompts import PromptTemplate @@ -66,7 +66,7 @@ True ### Import a list of documents -```python +```python PYTHON import tensorflow_datasets as tfds dataset = tfds.load('trec', split='train') texts = [item['text'].decode('utf-8') for item in tfds.as_numpy(dataset)] @@ -113,7 +113,7 @@ Dataset trec downloaded and prepared to /root/tensorflow_datasets/trec/1.0.0. Su Number of documents: 5452 ``` -```python +```python PYTHON random.seed(11) for item in random.sample(texts, 5): print(item) @@ -129,7 +129,7 @@ What is a female rabbit called ? ### Embed the documents and store them in an index -```python +```python PYTHON embeddings = CohereEmbeddings(model = "multilingual-22-12") db = Qdrant.from_texts(texts, embeddings, location=":memory:", collection_name="my_documents", distance_func="Dot") @@ -137,7 +137,7 @@ db = Qdrant.from_texts(texts, embeddings, location=":memory:", collection_name=" ### Enter a query -```python +```python PYTHON queries = ["How to get in touch with Bill Gates", "Comment entrer en contact avec Bill Gates", "Cara menghubungi Bill Gates"] @@ -147,14 +147,14 @@ queries_lang = ["English", "French", "Indonesian"] ### Return the document most similar to the query -```python +```python PYTHON answers = [] for query in queries: docs = db.similarity_search(query) answers.append(docs[0].page_content) ``` -```python +```python PYTHON for idx,query in enumerate(queries): print(f"Query language: {queries_lang[idx]}") print(f"Query: {query}") @@ -186,7 +186,7 @@ Most similar existing question: What is Bill Gates of Microsoft E-mail address ? 
## Add an article and chunk it into smaller passages -```python +```python PYTHON !wget 'https://docs.google.com/uc?export=download&id=1f1INWOfJrHTFmbyF_0be5b4u_moz3a4F' -O steve-jobs-commencement.txt ``` @@ -210,7 +210,7 @@ steve-jobs-commence 100%[===================>] 11.71K --.-KB/s in 0s 2023-06-08 06:11:20 (115 MB/s) - ‘steve-jobs-commencement.txt’ saved [11993/11993] ``` -```python +```python PYTHON loader = TextLoader("steve-jobs-commencement.txt") documents = loader.load() text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0) @@ -219,14 +219,14 @@ texts = text_splitter.split_documents(documents) ## Embed the passages and store them in an index -```python +```python PYTHON embeddings = CohereEmbeddings(model = "multilingual-22-12") db = Qdrant.from_documents(texts, embeddings, location=":memory:", collection_name="my_documents", distance_func="Dot") ``` ## Enter a question -```python +```python PYTHON questions = [ "What did the author liken The Whole Earth Catalog to?", "What was Reed College great at?", @@ -238,7 +238,7 @@ questions = [ ## Answer the question based on the most relevant documents -```python +```python PYTHON prompt_template = """Text: {context} @@ -251,7 +251,7 @@ PROMPT = PromptTemplate( ) ``` -```python +```python PYTHON chain_type_kwargs = {"prompt": PROMPT} qa = RetrievalQA.from_chain_type(llm=Cohere(model="command", temperature=0), @@ -368,7 +368,7 @@ to you with a bit more certainty than when death was a useful but purely intelle ## Questions in French -```python +```python PYTHON questions_fr = [ "À quoi se compare The Whole Earth Catalog ?", "Dans quoi Reed College était-il excellent ?", @@ -378,11 +378,11 @@ questions_fr = [ ] ``` -```python +```python PYTHON ``` -```python +```python PYTHON chain_type_kwargs = {"prompt": PROMPT} diff --git a/fern/pages/cookbooks/pdf-extractor.mdx b/fern/pages/cookbooks/pdf-extractor.mdx index 8cac04314..25193576f 100644 --- a/fern/pages/cookbooks/pdf-extractor.mdx +++ b/fern/pages/cookbooks/pdf-extractor.mdx @@ -30,7 +30,7 @@ In the directory, we have a simple_invoice.pdf file. Everytime a user uploads th 3. The agent summarizes the document and passes that information to convert_to_json() function. This function makes another call to command model to convert the summary to json output. This separation of tasks is useful when the text document is complicated and long. Therefore, we first distill the information and ask another model to convert the text into json object. This is useful so each model or agent focuses on its own task without suffering from long context. 4. Then the json object goes through a check to make sure all keys are present and gets saved as a csv file. When the document is too long or the task is too complex, the model may fail to extract all information. These checks are then very useful because they give feedback to the model so it can adjust it's parameters to retry. 
-```python +```python PYTHON import os import cohere @@ -39,12 +39,12 @@ import json from unstructured.partition.pdf import partition_pdf ``` -```python +```python PYTHON # uncomment to install dependencies # !pip install cohere unstructured ``` -```python +```python PYTHON # versions print('cohere version:', cohere.__version__) ``` @@ -55,7 +55,7 @@ cohere version: 5.5.1 ## Setup -```python +```python PYTHON COHERE_API_KEY = os.environ.get("CO_API_KEY") COHERE_MODEL = 'command-r-plus' co = cohere.Client(api_key=COHERE_API_KEY) @@ -69,7 +69,7 @@ The sample invoice data is from https://unidoc.io/media/simple-invoices/simple_i Here we define the tool which converts summary of the pdf into json object. Then, it checks to make sure all necessary keys are present and saves it as csv. -```python +```python PYTHON def convert_to_json(text: str) -> dict: """ Given text files, convert to json object and saves to csv. @@ -125,7 +125,7 @@ def convert_to_json(text: str) -> dict: Below is a cohere agent that leverages multi-step API. It is equipped with convert_to_json tool. -```python +```python PYTHON def cohere_agent( message: str, preamble: str, @@ -213,7 +213,7 @@ def cohere_agent( ### main -```python +```python PYTHON def extract_pdf(path): """ Function to extract text from a PDF file. diff --git a/fern/pages/cookbooks/pondr.mdx b/fern/pages/cookbooks/pondr.mdx index 48b7ce81f..17b32a0c0 100644 --- a/fern/pages/cookbooks/pondr.mdx +++ b/fern/pages/cookbooks/pondr.mdx @@ -24,13 +24,13 @@ In this notebook we will walk through the first two steps. Install and import the tools we will need as well as initializing the Cohere model. -```python +```python PYTHON import cohere from cohere.responses.classify import Example import pandas as pd ``` -```python +```python PYTHON co=cohere.Client('YOUR_API_KEY') ``` @@ -38,7 +38,7 @@ co=cohere.Client('YOUR_API_KEY') Generate a list of potential conversation questions and retain the first 10. -```python +```python PYTHON #user_input is hardcoded for this example user_input='I am meeting up with a coworker. We are meeting at a fancy restaurant. I wanna ask some interesting questions. These questions should be deep.' prompt=user_input+'\nHere are 10 interesting questions to ask:\n1)' @@ -46,7 +46,7 @@ response=co.generate(model='xlarge', prompt=prompt, max_tokens=200, temperature= response ``` -```python +```python PYTHON def generation_to_df(generation): generation=response.split('\n') clean_questions=[] @@ -57,7 +57,7 @@ def generation_to_df(generation): return clean_q_df ``` -```python +```python PYTHON clean_q_df = generation_to_df(response) pd.options.display.max_colwidth=150 clean_q_df @@ -67,7 +67,7 @@ clean_q_df Rank and sort the questions based on interestingness and specificity. 
-```python +```python PYTHON interestingness=[ Example("What do you think is the hardest part of what I do for a living?", "Not Interesting"), Example("What\'s the first thing you noticed about me?", "Interesting"), @@ -107,7 +107,7 @@ specificity=[ Example("What would your younger self not believe about your life today?", "Specific")] ``` -```python +```python PYTHON def add_attribute(df, attribute, name, target): response = co.classify( @@ -122,7 +122,7 @@ def add_attribute(df, attribute, name, target): df[name]=q_conf ``` -```python +```python PYTHON add_attribute(clean_q_df, interestingness, 'interestingness', 'Interesting') add_attribute(clean_q_df, specificity, 'specificity', 'Specific') clean_q_df['average']= clean_q_df.iloc[:,1:].mean(axis=1) diff --git a/fern/pages/cookbooks/rag-evaluation-deep-dive.mdx b/fern/pages/cookbooks/rag-evaluation-deep-dive.mdx index 8a8938cf4..42de0df66 100644 --- a/fern/pages/cookbooks/rag-evaluation-deep-dive.mdx +++ b/fern/pages/cookbooks/rag-evaluation-deep-dive.mdx @@ -39,13 +39,13 @@ To demonstrate the metrics, we will use data from the [Docugami's KG-RAG](https: Let's start by setting the environment and downloading the dataset. -```python +```python PYTHON %%capture !pip install llama-index cohere openai !pip install mistralai ``` -```python +```python PYTHON # required imports from getpass import getpass import os @@ -60,7 +60,7 @@ from mistralai.client import MistralClient For Response evaluation, we will use an LLM as a judge. Any LLM can be used for this goal, but because evaluation is a very challenging task, we recommend using powerful LLMs, possibly as an ensemble of models. In [previous work](https://arxiv.org/pdf/2303.16634.pdf), it has been shown that models tend to assign higher scores to their own output. Since we generated the answers in this notebook using `command-r`, we will not use it for evaluation. We will provide two alternatives, `gpt-4` and `mistral`. We set `gpt-4` as the default model because, as mentioned above, evaluation is challenging, and `gpt-4` is powerful enough to efficiently perform the task. -```python +```python PYTHON # Get keys openai_api_key = getpass("Enter your OpenAI API Key: ") # uncomment if you want to use mistral @@ -73,14 +73,14 @@ model = "gpt-4" ``` -```python +```python PYTHON if model == "gpt-4": client = Client(api_key=openai_api_key) else: client = MistralClient(api_key=mistral_api_key) ``` -```python +```python PYTHON # let's define a function to get the model's response for a given input def get_response(model, client, prompt): response = client.chat.completions.create( @@ -90,7 +90,7 @@ def get_response(model, client, prompt): return response.choices[0].message.content ``` -```python +```python PYTHON # load the DocugamiKgRagSec10Q dataset if os.path.exists("./data/source_files") and os.path.exists("./data/rag_dataset.json"): rag_dataset = LabelledRagDataset.from_json("./data/rag_dataset.json") @@ -111,7 +111,7 @@ We use three standard metrics to evaluate retrieval: We implement these three metrics in the class below: -```python +```python PYTHON class RetrievalEvaluator: def compute_precision(self, retrieved_documents, golden_documents): @@ -143,7 +143,7 @@ class RetrievalEvaluator: Let's now see how to use the class above to compute the results on a single datapoint. 
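Before applying it to a real datapoint, a toy example makes the arithmetic concrete. Suppose the golden set contains four documents and the retriever returns three, two of which are correct: precision is 2/3 ≈ 0.67 and recall is 2/4 = 0.5. The sketch below (with made-up document names) spells that out with plain set operations; the next cell then runs the same computation on an actual datapoint from the dataset.

```python PYTHON
golden = {"doc_1", "doc_2", "doc_3", "doc_4"}   # toy golden documents
retrieved = {"doc_1", "doc_3", "doc_5"}         # toy retrieved documents

hits = golden & retrieved
precision = len(hits) / len(retrieved)   # 2 / 3 ≈ 0.67
recall = len(hits) / len(golden)         # 2 / 4 = 0.50
print(round(precision, 2), round(recall, 2))
```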
-```python +```python PYTHON # select the index of a single datapoint - the first one in the dataset idx = 0 @@ -167,7 +167,7 @@ Golden docs: ['2022 Q3 AAPL.pdf', '2023 Q1 AAPL.pdf', '2023 Q2 AAPL.pdf', '2023 Retrieved docs: ['2022 Q3 AAPL.pdf', '2023 Q1 MSFT.pdf', '2023 Q1 AAPL.pdf'] ``` -```python +```python PYTHON # we can now instantiate the evaluator evaluate_retrieval = RetrievalEvaluator() @@ -210,7 +210,7 @@ Also, while Correctness is measuring the precision of the claims in the response Let's now see how we implement the evaluation described above using LLMs. Let's start with **claim extraction**. -```python +```python PYTHON # first, let's define a function which extracts the claims from a response def extract_claims(query, response, model, client): @@ -227,7 +227,7 @@ def extract_claims(query, response, model, client): ``` -```python +```python PYTHON # now, let's consider this answer, which we previously generated with command-r response = "Apple's total net sales experienced a decline over the last year. The three-month period ended July 1, 2023, saw a total net sale of $81,797 million, which was a 1% decrease from the same period in 2022. The nine-month period ended July 1, 2023, fared slightly better, with a 3% decrease in net sales compared to the first nine months of 2022.\nThis downward trend continued into the three and six-month periods ending April 1, 2023. Apple's total net sales decreased by 3% and 4% respectively, compared to the same periods in 2022." @@ -253,7 +253,7 @@ List of claims extracted from the model's response: Nice! now that we have the list of claims, we can go ahead and **assess the validity** of each claim. -```python +```python PYTHON # Let's create a function that checks each claim against a reference text, # which here we will call "context". As you will see, we will use different contexts, # depending on the metric we want to compute. @@ -278,7 +278,7 @@ def assess_claims(query, claims, context, model, client): ### Faithfulness -```python +```python PYTHON # Let's start with Faithfulness: in this case, we want to assess the claims # in the response against the retrieved documents (i.e., context = retrieved documents) @@ -308,7 +308,7 @@ Assessment of the claims extracted from the model's response: Great, we now have an assessment for each of the claims: in the last step, we just need to use these assessments to define the final score. -```python +```python PYTHON # given the list of claims and their label, compute the final score # as the proportion of correct claims over the full list of claims def get_final_score(claims_list): @@ -318,7 +318,7 @@ def get_final_score(claims_list): return round(score, 2) ``` -```python +```python PYTHON score_faithfulness = get_final_score(assessed_claims_faithfulness) print(f'Faithfulness: {score_faithfulness}') ``` @@ -331,7 +331,7 @@ The final Faithfulness score is 1, which means that the model's response is full Before moving on, let's modify the model's response by adding a piece of information which is **not** grounded in any document, and re-compute Faithfulness. -```python +```python PYTHON # let's mess up the century, changing 2022 to 1922 modified_response = response.replace('2022', '1922') @@ -370,7 +370,7 @@ As you can see, by assessing claims one by one, we are able to spot **hallucinat As said, Faithfulness and Correctness share the same logic, the only difference being that we will check the claims against the gold answer. 
We can therefore repeat the process above, and just substitute the `context`. -```python +```python PYTHON # let's get the gold answer from the dataset golden_answer = rag_dataset[idx].reference_answer @@ -400,7 +400,7 @@ Assess the claims extracted from the model's response against the golden answer: As mentioned above, automatic evaluation is a hard task, and even when using powerful models, claim assessment can present problems: for example, the third claim is labelled as 0, even if it might be inferred from the information in the gold answer. -```python +```python PYTHON # we can now compute the final Correctness score score_correctness = get_final_score(assessed_claims_correctness) print(f'Correctness: {score_correctness}') @@ -416,7 +416,7 @@ For Correctness, we found that only half of the claims in the generated response We finally move to Coverage. Remember that, in this case, we want to check how many of the claims _in the gold answer_ are included in the generated response. To do it, we first need to extract the claims from the gold answer. -```python +```python PYTHON # let's extract the golden claims gold_claims = extract_claims(query, golden_answer, model, client) @@ -436,7 +436,7 @@ List of claims extracted from the gold answer: Then, we check which of these claims is present in the response generated by the model. -```python +```python PYTHON # note that in, this case, the context is the model's response assessed_claims_coverage = assess_claims(query=query, claims=gold_claims, @@ -459,7 +459,7 @@ Assess which of the gold claims is in the model's response: - There was a decrease in total net sales in the quarters ended April 1, 2023, and July 1, 2023. SUPPORTED=1 ``` -```python +```python PYTHON # we compute the final Coverage score score_coverage = get_final_score(assessed_claims_coverage) print(f'Coverage: {score_coverage}') diff --git a/fern/pages/cookbooks/rag-with-chat-embed.mdx b/fern/pages/cookbooks/rag-with-chat-embed.mdx index a74563bc5..84c0678ba 100644 --- a/fern/pages/cookbooks/rag-with-chat-embed.mdx +++ b/fern/pages/cookbooks/rag-with-chat-embed.mdx @@ -39,11 +39,11 @@ For each user-chatbot interaction: - If no query is generated - **Step 4**: Call the Chat endpoint in normal mode to generate a response -```python +```python PYTHON ! pip install cohere hnswlib unstructured python-dotenv -q ``` -```python +```python PYTHON import cohere from pinecone import Pinecone, PodSpec import uuid @@ -56,7 +56,7 @@ co = cohere.Client("COHERE_API_KEY") # Get your API key here: https://dashboard. pc = Pinecone(api_key="PINECONE_API_KEY") # (get API key at app.pinecone.io) ``` -```python +```python PYTHON import cohere import os import dotenv @@ -71,7 +71,7 @@ pc = Pinecone( First, we define the list of documents we want to ingest and make available for retrieval. As an example, we'll use the contents from the first module of Cohere's _LLM University: What are Large Language Models?_. -```python +```python PYTHON raw_documents = [ { "title": "Text Embeddings", @@ -105,7 +105,7 @@ This method uses Cohere's `embed-english-v3.0` model to generate embeddings of t `index()` This method uses the `hsnwlib` package to index the document chunk embeddings. This will ensure efficient similarity search during retrieval. Note that `hnswlib` uses a vector library, and we have chosen it for its simplicity. -```python +```python PYTHON class Vectorstore: """ A class representing a collection of documents indexed into a vectorstore. 
@@ -248,7 +248,7 @@ class Vectorstore: In the code cell below, we initialize an instance of the `Vectorstore` class and pass in the `raw_documents` list as input. -```python +```python PYTHON vectorstore = Vectorstore(raw_documents) ``` @@ -279,7 +279,7 @@ In the code cell below, we check the document chunks that are retrieved for the ## Test Retrieval -```python +```python PYTHON vectorstore.retrieve("multi-head attention definition") ``` @@ -310,7 +310,7 @@ In either case, we also pass the `conversation_id` parameter, which retains the We then print the chatbot's response. In the case that the external information was used to generate a response, we also display citations. -```python +```python PYTHON class Chatbot: def __init__(self, vectorstore: Vectorstore): """ @@ -405,7 +405,7 @@ The format of each citation is: - `text`: The text representing this span - `document_ids`: The IDs of the documents being referenced (`doc_0` being the ID of the first document passed to the `documents` creating parameter in the endpoint call, and so on) -```python +```python PYTHON chatbot = Chatbot(vectorstore) chatbot.run() diff --git a/fern/pages/cookbooks/rerank-demo.mdx b/fern/pages/cookbooks/rerank-demo.mdx index d76814c0b..56c361b23 100644 --- a/fern/pages/cookbooks/rerank-demo.mdx +++ b/fern/pages/cookbooks/rerank-demo.mdx @@ -18,7 +18,7 @@ In our benchmarks across 20 datasets, we **saw significant improvements compared We will demonstrate the rerank endpoint in this notebook. -```python +```python PYTHON !pip install "cohere<5" ``` @@ -45,7 +45,7 @@ Requirement already satisfied: certifi>=2017.4.17 in /Users/elliottchoi/Library/  ``` -```python +```python PYTHON import cohere import requests import numpy as np @@ -54,7 +54,7 @@ from typing import List from pprint import pprint ``` -```python +```python PYTHON API_KEY = "" co = cohere.Client(API_KEY) MODEL_NAME = "rerank-english-v3.0" # another option is rerank-multilingual-02 @@ -77,7 +77,7 @@ docs = [ In the following cell we will call rerank to rank `docs` based on how relevant they are with `query`. -```python +```python PYTHON results = co.rerank(query=query, model=MODEL_NAME, documents=docs, top_n=3) # Change top_n to change the number of results returned. If top_n is not passed, all results will be returned. for idx, r in enumerate(results): print(f"Document Rank: {idx + 1}, Document Index: {r.index}") @@ -108,7 +108,7 @@ The following is an example how to use this model end-to-end to search over the We use BM25 lexical search to retrieve the top-100 passages matching the query and then send these 100 passages and the query to our rerank endpoint to get a re-ranked list. We output the top-3 hits according to BM25 lexical search (as used by e.g. Elasticsearch) and the re-ranked list from our endpoint. 
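At a high level, the two-stage flow described above can be sketched as follows. This is a condensed outline rather than the notebook's exact code: it assumes the `bm25` index, the `passages` list, and the `bm25_tokenizer` helper are already built as shown in the cells below.

```python PYTHON
import numpy as np

def two_stage_search(query, num_candidates=100, top_k=3):
    # Stage 1: BM25 lexical retrieval over the tokenized corpus
    scores = bm25.get_scores(bm25_tokenizer(query))
    candidate_ids = np.argsort(-scores)[:num_candidates]
    candidates = [passages[i] for i in candidate_ids]

    # Stage 2: semantic re-ranking of the BM25 candidates with the Rerank endpoint
    return co.rerank(query=query, model=MODEL_NAME,
                     documents=candidates, top_n=top_k)
```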
-```python +```python PYTHON !pip install -U rank_bm25 ``` @@ -124,7 +124,7 @@ Installing collected packages: rank_bm25 Successfully installed rank_bm25-0.2.2 ``` -```python +```python PYTHON import json import gzip import os @@ -139,7 +139,7 @@ from tqdm.autonotebook import tqdm from tqdm.autonotebook import tqdm ``` -```python +```python PYTHON !wget http://sbert.net/datasets/simplewiki-2020-11-01.jsonl.gz ``` @@ -165,7 +165,7 @@ simplewiki-2020-11- 100%[===================>] 47.90M 5.78MB/s in 8.9s 2024-04-08 14:28:11 (5.37 MB/s) - ‘simplewiki-2020-11-01.jsonl.gz’ saved [50223724/50223724] ``` -```python +```python PYTHON wikipedia_filepath = 'simplewiki-2020-11-01.jsonl.gz' passages = [] @@ -181,13 +181,13 @@ print("Passages:", len(passages)) Passages: 509663 ``` -```python +```python PYTHON print(passages[0], passages[1]) ``` Ted Cassidy (July 31, 1932 - January 16, 1979) was an American actor. He was best known for his roles as Lurch and Thing on "The Addams Family". Aileen Carol Wuornos Pralle (born Aileen Carol Pittman; February 29, 1956 – October 9, 2002) was an American serial killer. She was born in Rochester, Michigan. She confessed to killing six men in Florida and was executed in Florida State Prison by lethal injection for the murders. Wuornos said that the men she killed had raped her or tried to rape her while she was working as a prostitute. -```python +```python PYTHON def bm25_tokenizer(text): tokenized_doc = [] @@ -210,7 +210,7 @@ bm25 = BM25Okapi(tokenized_corpus) 100%|██████████| 509663/509663 [00:09<00:00, 51180.82it/s] ``` -```python +```python PYTHON def search(query, top_k=3, num_candidates=100): print("Input question:", query) @@ -235,7 +235,7 @@ def search(query, top_k=3, num_candidates=100): print("\t{:.3f}\t{}".format(hit.relevance_score, hit.document["text"].replace("\n", " "))) ``` -```python +```python PYTHON search(query = "What is the capital of the United States?") ``` @@ -252,7 +252,7 @@ Top-3 hits by rank-API (100 BM25 hits re-ranked) 0.993 As the national capital of the United States, Washington, D.C. has numerous media outlets in various mediums. Some of these media are known throughout the United States, including "The Washington Post" and various broadcasting networks headquartered in D.C. ``` -```python +```python PYTHON search(query = "Number countries Europe") ``` @@ -269,7 +269,7 @@ Top-3 hits by rank-API (100 BM25 hits re-ranked) 0.981 Europe, the planet's 6th largest continent, includes 47 countries and assorted dependencies, islands and territories. ``` -```python +```python PYTHON search(query = "Elon Musk year birth") ``` @@ -286,7 +286,7 @@ Top-3 hits by rank-API (100 BM25 hits re-ranked) 0.474 In early 2002, Musk was seeking workers for his new space company, soon to be named SpaceX. Musk found a rocket engineer Tom Mueller (later SpaceX's CTO of Propulsion). He agreed to work for Musk. That was how SpaceX was born. The first headquarters of SpaceX was in a warehouse in El Segundo, California. The company has grown rapidly since it was founded in 2002, growing from 160 workers in November 2005 to 1,100 in 2010, 3,800 workers and contractors by October 2013, nearly 5,000 by late 2015, and about 6,000 in April 2017. ``` -```python +```python PYTHON search(query = "Which US president was killed?") ``` @@ -303,7 +303,7 @@ Top-3 hits by rank-API (100 BM25 hits re-ranked) 0.916 On the night that President Abraham Lincoln was killed, someone also tried to kill Seward. For the rest of his life, Seward had scars on his face from the attack. 
Later, the man who attacked him was caught and put to death. ``` -```python +```python PYTHON search(query="When is Chinese New Year") ``` @@ -320,7 +320,7 @@ Top-3 hits by rank-API (100 BM25 hits re-ranked) 0.996 Chinese New Year lasts fifteen days, including one week as a national holiday. It starts with the first day of the Chinese lunar year and ends with the full moon fifteen days later. It is always in the middle of winter, but is called the Spring Festival in Chinese because Chinese seasons are a little different from English ones. On the first day of the Chinese New Year, people call on friends and relatives. Because most people watch the special performances on CCTV all the night on New Year's Eve and don't go to bed until 12:00 AM, they usually get up later in the next day. The fifth day of the Chinese New Year is the day to welcome the god of Wealth (Chinese:财神爷), many people make and eat dumplings (Chinese:饺子. Pinyin: Jaozi). They believe that dumplings can hold the god of Wealth and bring luck. The last day of the Chinese New Year is the Lantern Festival. On this day, the moon becomes the full moon. People go out and watch the lantern festivals everywhere. After that, they eat sweet dumpling (Chinese:汤圆,元宵), a kind of dumpling which is round and looks like the full moon. ``` -```python +```python PYTHON search(query="How many people live in Paris") ``` @@ -337,7 +337,7 @@ Top-3 hits by rank-API (100 BM25 hits re-ranked) 0.602 Essonne is a department to the south of Paris in the Île-de-France region. Its prefecture is Évry. About 1,172,000 people live there (2006 estimation). ``` -```python +```python PYTHON search(query="Who is the director of The Matrix?") ``` diff --git a/fern/pages/cookbooks/sql-agent.mdx b/fern/pages/cookbooks/sql-agent.mdx index 8240a32b0..eb230e2e9 100644 --- a/fern/pages/cookbooks/sql-agent.mdx +++ b/fern/pages/cookbooks/sql-agent.mdx @@ -27,7 +27,7 @@ In this notebook we explore how to setup a [Cohere ReAct Agent](https://github.c # Toolkit Setup [#sec_step0] -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent from langchain_core.prompts import ChatPromptTemplate @@ -38,7 +38,7 @@ import os import json ``` -```python +```python PYTHON # Uncomment if you need to install the following packages #!pip install --quiet langchain langchain_cohere langchain_experimental --upgrade ``` @@ -52,12 +52,12 @@ These are the following tools: - 'sql_db_list_tables': lists the tables in the database - 'sql_db_query_checker': validates the SQL query -```python +```python PYTHON # load the cohere api key os.environ["COHERE_API_KEY"] = "" ``` -```python +```python PYTHON DB_NAME='Chinook.db' MODEL="command-r-plus" llm = ChatCohere(model=MODEL, temperature=0.1,verbose=True) @@ -79,7 +79,7 @@ print([tool.name for tool in tools]) We follow the general cohere react agent setup in Langchain to build our SQL agent. -```python +```python PYTHON # define the prompt template prompt = ChatPromptTemplate.from_template("{input}") # instantiate the ReAct agent @@ -95,7 +95,7 @@ agent_executor = AgentExecutor(agent=agent, ) ``` -```python +```python PYTHON output=agent_executor.invoke({ "input": 'what tables are available?', }) @@ -114,7 +114,7 @@ Grounded answer: The following tables are available: Album, Finished chain. 
``` -```python +```python PYTHON print(output['output']) ``` @@ -124,7 +124,7 @@ The following tables are available: Album, Artist, Customer, Employee, Genre, In The agent uses the list_tables tool to effectively highlight all the tables in the DB. -```python +```python PYTHON output=agent_executor.invoke({ "input": 'show the first row of the Playlist and Genre tables?', }) @@ -191,7 +191,7 @@ Here is the first row of the Playlist table: > Finished chain. ``` -```python +```python PYTHON print(output['output']) ``` @@ -211,7 +211,7 @@ Here is the first row of the Playlist table: Here we see that the tool takes a list of tables to query the sql_db_schema tool to retrieve the various schemas. -```python +```python PYTHON output=agent_executor.invoke({ "input": 'which countries have the most invoices?', }) @@ -267,7 +267,7 @@ Grounded answer: The countries with the most invoices are the USA (91< > Finished chain. ``` -```python +```python PYTHON print(output['output']) ``` @@ -277,7 +277,7 @@ The countries with the most invoices are the USA (91), Canada (56), and France ( The agent initially makes some errors as it jumps to answer the question using the db_query tool, but it then realizes it needs to figure out what tables it has access to and what they look like. It then fixes the SQL code and is able to generate the right answer. -```python +```python PYTHON output=agent_executor.invoke({ "input": 'who is the best customer? The customer who has spent the most money is the best.', }) @@ -356,7 +356,7 @@ Grounded answer: The best customer is Helena Holý, who has spen > Finished chain. ``` -```python +```python PYTHON print(output['output']) ``` @@ -370,7 +370,7 @@ As you can see, the agent makes an error, but is able to rectify itself. It also Generally, passing in additional context to the preamble can help reduce the initial failures. This context is provided by the SQLDBToolkit and contains the first 3 rows of the tables in the Database. -```python +```python PYTHON print('**Context to pass to LLM on tables**') print('Table Names') print(context['table_names']) @@ -604,7 +604,7 @@ TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice We can pass this context into the preamble and re-run a query to see how it performs. -```python +```python PYTHON preamble="""## Task And Context You use your advanced complex reasoning capabilities to help people by answering their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You may need to use multiple tools in parallel or sequentially to complete your task. You should focus on serving the user's needs as best you can, which will be wide-ranging. @@ -620,7 +620,7 @@ Here is information about the database: """.format(schema_info=context) ``` -```python +```python PYTHON output=agent_executor.invoke({ "input": 'provide the name of the best customer? The customer who has spent the most money is the best.', "preamble": preamble @@ -640,7 +640,7 @@ Grounded answer: The customer who has spent the most money is Helena H > Finished chain. 
``` -```python +```python PYTHON print(output['output']) ``` diff --git a/fern/pages/cookbooks/summarization-evals.mdx b/fern/pages/cookbooks/summarization-evals.mdx index 22a51738b..b9c6c63b2 100644 --- a/fern/pages/cookbooks/summarization-evals.mdx +++ b/fern/pages/cookbooks/summarization-evals.mdx @@ -14,11 +14,11 @@ In this cookbook, we will be demonstrating an approach we use for evaluating sum You'll need a Cohere API key to run this notebook. If you don't have a key, head to https://cohere.com/ to generate your key. -```python +```python PYTHON !pip install cohere datasets --quiet ``` -```python +```python PYTHON import json import random import re @@ -36,7 +36,7 @@ co = cohere.Client(api_key=co_api_key) As test data, we'll use transcripts from the [QMSum dataset](https://github.com/Yale-LILY/QMSum). Note that in addition to the transcripts, this dataset also contains reference summaries -- we will use only the transcripts as our approach is reference-free. -```python +```python PYTHON qmsum = load_dataset("MocktaiLEngineer/qmsum-processed", split="validation") transcripts = [x for x in qmsum["meeting_transcript"] if x is not None] ``` @@ -68,7 +68,7 @@ Therefore, we must first create a dataset that contains diverse summarization pr First, we define the prompt that combines the text and instructions. Here, we use a very basic prompt: -```python +```python PYTHON prompt_template = """## meeting transcript {transcript} @@ -78,7 +78,7 @@ prompt_template = """## meeting transcript Next, we build the instructions. Because each instruction may have a different objective and modifiers, we track them using metadata. This will later be required for evaluation (i.e. to know what the prompt is asking). -```python +```python PYTHON instruction_objectives = { "general_summarization": "Summarize the meeting based on the transcript.", @@ -129,7 +129,7 @@ format_length_modifiers = { Let's combine the objectives and format/length modifiers to finish building the instructions. -```python +```python PYTHON instructions = [] for obj_name, obj_text in instruction_objectives.items(): for mod_data in format_length_modifiers.values(): @@ -170,7 +170,7 @@ print(json.dumps(instructions[:2], indent=4)) Finally, let's build the final prompts by semi-randomly pairing the instructions with transcripts from the QMSum dataset. -```python +```python PYTHON data = pd.DataFrame(instructions) transcripts = sorted(transcripts, key=lambda x: len(x), reverse=True)[:int(len(transcripts) * 0.25)] @@ -181,11 +181,11 @@ data["transcript"] = transcripts[:len(data)] data["prompt"] = data.apply(lambda x: prompt_template.format(transcript=x["transcript"], instructions=x["instruction"]), axis=1) ``` -```python +```python PYTHON data["transcript_token_len"] = [len(x) for x in co.batch_tokenize(data["transcript"].tolist(), model=co_model)] ``` -```python +```python PYTHON print(data["prompt"][0]) ``` @@ -249,7 +249,7 @@ We use three criteria that are graded using LLMs: In this cookbook, we will use Command-R to grade the completions. However, note that in practice, we typically use an ensemble of multiple LLM evaluators to reduce any bias. -```python +```python PYTHON grading_prompt_template = """You are an AI grader that given a prompt, a completion, and a criterion, grades the completion based on the prompt and criterion. Below is a prompt, a completion, and a criterion with which to grade the completion. You need to respond according to the criterion instructions. 
@@ -304,7 +304,7 @@ In addition, we have two criteria that are graded programmatically: - Format: checks if the summary follows the format (e.g. bullets) that was requested in the prompt - Length: checks if the summary follows the length that was requested in the prompt. -```python +```python PYTHON def score_format(completion: str, format_type: str) -> int: """ @@ -391,7 +391,7 @@ Now that we have our evaluation dataset and defined our evaluation functions, le First, we generate completions to be graded. We will use Cohere's [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01) model, boasting a context length of 128K. -```python +```python PYTHON completions = [] for prompt in data["prompt"]: completion = co.chat(message=prompt, model="command-r", temperature=0.2).text @@ -400,7 +400,7 @@ for prompt in data["prompt"]: data["completion"] = completions ``` -```python +```python PYTHON print(data["completion"][0]) ``` @@ -408,7 +408,7 @@ print(data["completion"][0]) Let's grade the completions using our LLM and non-LLM checks. -```python +```python PYTHON data["format_score"] = data.apply( lambda x: score_format(x["completion"], x["eval_metadata"]["format"]), axis=1 ) @@ -436,7 +436,7 @@ data["conciseness_score"] = data.apply( ) ``` -```python +```python PYTHON data ``` @@ -556,7 +556,7 @@ data Finally, let's print the average scores per critiera. -```python +```python PYTHON avg_scores = data[["format_score", "length_score", "completeness_score", "correctness_score", "conciseness_score"]].mean() print(avg_scores) ``` diff --git a/fern/pages/cookbooks/text-classification-using-embeddings.mdx b/fern/pages/cookbooks/text-classification-using-embeddings.mdx index a7824b89e..703caaf66 100644 --- a/fern/pages/cookbooks/text-classification-using-embeddings.mdx +++ b/fern/pages/cookbooks/text-classification-using-embeddings.mdx @@ -27,13 +27,13 @@ We'll go through the following steps: If you're running an older version of the SDK you'll want to upgrade it, like this: -```python +```python PYTHON #!pip install --upgrade cohere ``` ## 1. Get the dataset -```python +```python PYTHON import cohere from sklearn.model_selection import train_test_split @@ -43,7 +43,7 @@ pd.set_option('display.max_colwidth', None) df = pd.read_csv('https://github.com/clairett/pytorch-sentiment-classification/raw/master/data/SST2/train.tsv', delimiter='\t', header=None) ``` -```python +```python PYTHON df.head() ``` @@ -107,7 +107,7 @@ We'll only use a subset of the training and testing datasets in this example. We The `train_test_split` method splits arrays or matrices into random train and test subsets. -```python +```python PYTHON num_examples = 500 df_sample = df.sample(num_examples) @@ -126,7 +126,7 @@ labels_test = labels_test[:95] We're now ready to retrieve the embeddings from the API. You'll need your API key for this next cell. [Sign up to Cohere](https://os.cohere.ai/) and get one if you haven't yet. -```python +```python PYTHON model_name = "embed-english-v3.0" api_key = "" @@ -135,7 +135,7 @@ input_type = "classification" co = cohere.Client(api_key) ``` -```python +```python PYTHON embeddings_train = co.embed(texts=sentences_train, model=model_name, input_type=input_type @@ -154,7 +154,7 @@ We now have two sets of embeddings, `embeddings_train` contains the embeddings o Curious what an embedding looks like? 
We can print it: -```python +```python PYTHON print(f"Review text: {sentences_train[0]}") print(f"Embedding vector: {embeddings_train[0][:10]}") ``` @@ -168,7 +168,7 @@ Embedding vector: [1.1531117, -0.8543223, -1.2496399, -0.28317127, -0.75870246, Now that we have the embedding, we can train our classifier. We'll use an SVM from sklearn. -```python +```python PYTHON from sklearn.svm import SVC from sklearn.pipeline import make_pipeline from sklearn.preprocessing import StandardScaler @@ -187,7 +187,7 @@ Pipeline(steps=[('standardscaler', StandardScaler()), ## 4. Evaluate the performance of the classifier on the testing set -```python +```python PYTHON score = svm_classifier.score(embeddings_test, labels_test) print(f"Validation accuracy on is {100*score}%!") ``` diff --git a/fern/pages/cookbooks/topic-modeling-ai-papers.mdx b/fern/pages/cookbooks/topic-modeling-ai-papers.mdx index 53cb2b339..d043fe7b8 100644 --- a/fern/pages/cookbooks/topic-modeling-ai-papers.mdx +++ b/fern/pages/cookbooks/topic-modeling-ai-papers.mdx @@ -16,13 +16,13 @@ To follow along with this tutorial, you need to be familiar with Python. Make su First, you need to install the python dependencies required to run the project. Use pip to install them using the command below -```python +```python PYTHON pip install requests beautifulsoup4 cohere altair clean-text numpy pandas sklearn > /dev/null ``` Create a new python file named cohere_nlp.py. Write all your code in this file. import the dependencies and initialize Cohere’s client. -```python +```python PYTHON import cohere api_key = '' @@ -33,7 +33,7 @@ This tutorial focuses on applying topic modeling to look for recent trends in AI First, import the required libraries to make web requests and process the web content . -```python +```python PYTHON import requests from bs4 import BeautifulSoup import pandas as pd @@ -43,14 +43,14 @@ from cleantext import clean Next, make an HTTP request to the source website that has an archive of the AI papers. -```python +```python PYTHON URL = "https://www.jair.org/index.php/jair/issue/archive" page = requests.get(URL) ``` Use this archive to get the list of AI papers published. This archive has papers published since 2015. This tutorial considers papers published recently, on or after 2020 only. -```python +```python PYTHON soup = BeautifulSoup(page.content, "html.parser") archive_links = [] @@ -64,7 +64,7 @@ for link in soup.select('a.title'): Finally, you’ll need to clean the titles of the AI papers gathered. Remove trailing white spaces and unwanted characters. Use the NTLK library to get English stop words and filter them out. -```python +```python PYTHON papers = [] for archive in archive_links: page = requests.get(archive['link']) @@ -98,7 +98,7 @@ for archive in archive_links: The dataset created using this process has 258 AI papers published between 2020 and 2022. Use pandas library to create a data frame to hold our text data. -```python +```python PYTHON df = pd.DataFrame(papers) print(len(df)) ``` @@ -117,7 +117,7 @@ Cohere’s platform provides an Embed Endpoint that returns text embeddings. An Write a function to create the word embeddings using Cohere. The function should read as follows: -```python +```python PYTHON def get_embeddings(text,model='medium'): output = co.embed( model=model, @@ -127,13 +127,13 @@ def get_embeddings(text,model='medium'): Create a new column in your pandas data frame to hold the embeddings created. 
-```python +```python PYTHON df['title_embeds'] = df['title'].apply(get_embeddings) ``` Congratulations! You have created the word embeddings . Now, you will proceed to visualize the embeddings using a scatter plot. First, you need to reduce the dimensions of the word embeddings. You’ll use the Principal Component Analysis (PCA) method to achieve this task. Import the necessary packages and create a function to return the principle components. -```python +```python PYTHON from sklearn.decomposition import PCA def get_pc(arr,n): @@ -144,7 +144,7 @@ def get_pc(arr,n): Next, create a function to generate a scatter plot chart. You’ll use the altair library to create the charts. -```python +```python PYTHON import altair as alt def generate_chart(df,xcol,ycol,lbl='off',color='basic',title=''): chart = alt.Chart(df).mark_circle(size=500).encode( @@ -181,7 +181,7 @@ def generate_chart(df,xcol,ycol,lbl='off',color='basic',title=''): Finally, use the embeddings with reduced dimensionality to create a scatter plot. -```python +```python PYTHON sample = 200 embeds = np.array(df['title_embeds'].tolist()) embeds_pc2 = get_pc(embeds,2) @@ -237,7 +237,7 @@ print(df_pc2.iloc[:sample]) Here’s a chart demonstrating the word embeddings for AI papers. It is important to note that the chart represents a sample size of 200 papers. -```python +```python PYTHON generate_chart(df_pc2.iloc[:sample],'0','1',title='2D Embeddings') ``` @@ -247,7 +247,7 @@ Data searching techniques focus on using keywords to retrieve text-based informa First, create a function to get similarities between two embeddings. This will use the cosine similarity algorithm from the sci-kit learn library. -```python +```python PYTHON from sklearn.metrics.pairwise import cosine_similarity def get_similarity(target,candidates): @@ -269,7 +269,7 @@ def get_similarity(target,candidates): Next, create embeddings for the search query -```python +```python PYTHON new_query = "graph network strategies" new_query_embeds = get_embeddings(new_query) @@ -278,7 +278,7 @@ new_query_embeds = get_embeddings(new_query) Finally, check the similarity between the two embeddings. Display the top 10 similar papers using your result -```python +```python PYTHON similarity = get_similarity(new_query_embeds,embeds[:sample]) print('Query:') @@ -503,7 +503,7 @@ Clustering is a process of grouping similar documents into clusters. It allows y First, import the k-means algorithm from the scikit-learn package. Then configure two variables: the number of clusters and a duplicate dataset. -```python +```python PYTHON from sklearn.cluster import KMeans df_clust = df_pc2.copy() @@ -513,7 +513,7 @@ n_clusters=5 Next, initialize the k-means model and use it to fit the embeddings to create the clusters. -```python +```python PYTHON kmeans_model = KMeans(n_clusters=n_clusters, random_state=0) classes = kmeans_model.fit_predict(embeds).tolist() print(classes) @@ -527,7 +527,7 @@ df_clust['cluster'] = (list(map(str,classes))) Finally, plot a scatter plot to visualize the 5 clusters in our sample size. 
-```python +```python PYTHON df_clust.columns = df_clust.columns.astype(str) generate_chart(df_clust.iloc[:sample],'0','1',lbl='off',color='cluster',title='Clustering with 5 Clusters') ``` diff --git a/fern/pages/cookbooks/wikipedia-search-with-weaviate.mdx b/fern/pages/cookbooks/wikipedia-search-with-weaviate.mdx index ab60a48e3..289f8090c 100644 --- a/fern/pages/cookbooks/wikipedia-search-with-weaviate.mdx +++ b/fern/pages/cookbooks/wikipedia-search-with-weaviate.mdx @@ -10,7 +10,7 @@ import { CookbookHeader } from "../../components/cookbook-header"; This is starter code that you can use to search 10 million vectors from Wikipedia embedded with Cohere's multilingual model and hosted as a Weaviate public dataset. This dataset contains 1M vectors in each of the Wikipedia sites in these languages: English, German, French, Spanish, Italian, Japanese, Arabic, Chinese (Simplified), Korean, Hindi \[respective language codes: `en, de, fr, es, it, ja, ar, zh, ko, hi`\] -```python +```python PYTHON import weaviate cohere_api_key = '' @@ -33,7 +33,7 @@ True Let's now define the search function that queries our vector database. Optionally, we want the ability to filter by language. -```python +```python PYTHON def semantic_serch(query, results_lang=''): """ @@ -98,7 +98,7 @@ def print_result(result): We can now query the database with any query we want. In the background, Weaviate uses your Cohere API key to embed the query, then retrun the most relevant passages to the query. -```python +```python PYTHON query_result = semantic_serch("time travel plot twist") print_result(query_result) @@ -130,7 +130,7 @@ Will and Sylvia rob Weis' time banks, giving the time capsules to the needy. The If we're interested in results in only one language, we can specify it. -```python +```python PYTHON query_result = semantic_serch("time travel plot twist", results_lang='ja') print_result(query_result) diff --git a/fern/pages/cookbooks/wikipedia-semantic-search.mdx b/fern/pages/cookbooks/wikipedia-semantic-search.mdx index e057f16cf..8e7914e01 100644 --- a/fern/pages/cookbooks/wikipedia-semantic-search.mdx +++ b/fern/pages/cookbooks/wikipedia-semantic-search.mdx @@ -12,7 +12,7 @@ This notebook contains the starter code to do simple [semantic search](https://t Let's now download 1,000 records from the English Wikipedia embeddings archive so we can search it afterwards. -```python +```python PYTHON from datasets import load_dataset import torch import cohere @@ -44,7 +44,7 @@ Using custom data configuration Cohere--wikipedia-22-12-simple-embeddings-94deea Now, `doc_embeddings` holds the embeddings of the first 1,000 documents in the dataset. Each document is represented as an [embeddings vector](https://txt.cohere.ai/sentence-word-embeddings/) of 768 values. -```python +```python PYTHON doc_embeddings.shape ``` @@ -56,7 +56,7 @@ We can now search these vectors for any query we want. For this toy example, we' To search, we embed the query, then get the nearest neighbors to its embedding (using dot product). 
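A minimal sketch of that dot-product step, assuming `query_embedding` holds the 768-dimensional vector returned by the Embed endpoint for the query and `doc_embeddings` is the `(1000, 768)` tensor loaded above:

```python PYTHON
import torch

# Assumes `query_embedding` is the 768-dim query vector and
# `doc_embeddings` is the (1000, 768) tensor of document vectors.
query_vec = torch.tensor(query_embedding, dtype=doc_embeddings.dtype)

# Dot-product scores between the query and every document vector
dot_scores = torch.matmul(doc_embeddings, query_vec)

# Indices of the 3 highest-scoring documents
top_3 = torch.topk(dot_scores, k=3).indices.tolist()
print(top_3)
```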
-```python +```python PYTHON query = 'Who founded Wikipedia' response = co.embed(texts=[query], model='multilingual-22-12') diff --git a/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx b/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx index 991bd5e23..e816a647b 100644 --- a/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx +++ b/fern/pages/deployment-options/cohere-on-aws/amazon-bedrock.mdx @@ -27,7 +27,6 @@ Here are the steps you'll need to get set up in advance of running Cohere models - Subscribe to Cohere's models on Amazon Bedrock. For more details, [see here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html). - You'll also need to install the AWS Python SDK and some related tooling. Run: - - `pip install boto3`. You can find more details about the boto3 library [here](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#install-boto3). - `pip install cohere-aws` (or `pip install --upgrade cohere-aws` if you need to upgrade). You can also install from source with `python setup.py install`. - For more details, see this [GitHub repo](https://github.com/cohere-ai/cohere-aws/) and [related notebooks](https://github.com/cohere-ai/cohere-aws/tree/main/notebooks/bedrock). - Finally, you'll have to configure your authentication credentials for AWS. This [document](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration) has more information. @@ -36,12 +35,15 @@ Here are the steps you'll need to get set up in advance of running Cohere models You can use this code to invoke Cohere's embed model on Amazon Bedrock: -```python -import boto3 -import json +```python PYTHON +import cohere -# Create the AWS client for the Bedrock runtime with boto3 -aws_client = boto3.client(service_name="bedrock-runtime") +co = cohere.BedrockClient( + aws_region="us-east-1", + aws_access_key="...", + aws_secret_key="...", + aws_session_token="...", +) # Input parameters for embed. In this example we are embedding hacker news post titles. 
texts = ["Interesting (Non software) books?", @@ -55,38 +57,34 @@ input_type = "clustering" truncate = "NONE" # optional model_id = "cohere.embed-english-v3" # or "cohere.embed-multilingual-v3" -# Create the JSON payload for the request -json_params = { - 'texts': texts, - 'truncate': truncate, - "input_type": input_type - } -json_body = json.dumps(json_params) -params = {'body': json_body, 'modelId': model_id,} # Invoke the model and print the response -result = aws_client.invoke_model(**params) -response = json.loads(result['body'].read().decode()) -print(response) +result = co.embed( + model=model_id, + input_type=input_type, + texts=texts, + truncate=truncate) # aws_client.invoke_model(**params) + +print(result) ``` ## Text Generation You can use this code to invoke Cohere's Command models on Amazon Bedrock: -```python -import boto3 -import json +```python PYTHON +import cohere -# Create the AWS client for the Bedrock runtime with boto3 -aws_client = boto3.client(service_name="bedrock-runtime") +co = cohere.BedrockClient( + aws_region="us-east-1", + aws_access_key="...", + aws_secret_key="...", + aws_session_token="...", +) -# Create the JSON payload for the request -json_params = {'prompt': "Write a LinkedIn post about starting a career in tech:"} -params = {'body': json.dumps(json_params),'modelId': 'cohere.command-text-v14',} +result = co.chat(message="Write a LinkedIn post about starting a career in tech:", + model='cohere.command-r-plus-v1:0' # or 'cohere.command-r-v1:0' + ) -# Invoke the model and print the response -result = aws_client.invoke_model(**params) -response = json.loads(result['body'].read().decode()) -print(response) +print(result) ``` diff --git a/fern/pages/deployment-options/cohere-on-aws/amazon-sagemaker-setup-guide.mdx b/fern/pages/deployment-options/cohere-on-aws/amazon-sagemaker-setup-guide.mdx index 7684ad214..5d4e277df 100644 --- a/fern/pages/deployment-options/cohere-on-aws/amazon-sagemaker-setup-guide.mdx +++ b/fern/pages/deployment-options/cohere-on-aws/amazon-sagemaker-setup-guide.mdx @@ -27,7 +27,6 @@ These permissions allow a user to manage your organization’s Amazon SageMaker You'll also need to install the AWS Python SDK and some related tooling. Run: -- `pip install boto3`. You can find more details about the boto3 library [here](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#install-boto3). - `pip install cohere-aws` (or `pip install --upgrade cohere-aws` if you want to upgrade to the most recent version of the SDK). ## Cohere with Amazon SageMaker Setup @@ -41,6 +40,64 @@ Next, explore the tools on the **Product Detail** page to evaluate how you want - Subscribing: This section will once again present you with both the pricing details and the EULA for final review before you accept the offer. This information is identical to the information on Product Detail page. - Configuration: The primary goal of this section is to retrieve the [Amazon Resource Name (ARN)](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference-arns.html) for the product you have subscribed to. +## Embeddings + +You can use this code to invoke Cohere's embed model on Amazon SageMaker: + +```python PYTHON +import cohere + +co = cohere.SageMakerClient( + aws_region="us-east-1", + aws_access_key="...", + aws_secret_key="...", + aws_session_token="...", +) + +# Input parameters for embed. In this example we are embedding hacker news post titles. 
+texts = ["Interesting (Non software) books?", + "Non-tech books that have helped you grow professionally?", + "I sold my company last month for $5m. What do I do with the money?", + "How are you getting through (and back from) burning out?", + "I made $24k over the last month. Now what?", + "What kind of personal financial investment do you do?", + "Should I quit the field of software development?"] +input_type = "clustering" +truncate = "NONE" # optional +model_id = "" # On SageMaker, you create a model name that you'll pass here. + + +# Invoke the model and print the response +result = co.embed( + model=model_id, + input_type=input_type, + texts=texts, + truncate=truncate) + +print(result) +``` + +## Text Generation + +You can use this code to invoke Cohere's Command models on Amazon SageMaker: + +```python PYTHON +import cohere + +co = cohere.SageMakerClient( + aws_region="us-east-1", + aws_access_key="...", + aws_secret_key="...", + aws_session_token="...", +) + +# Invoke the model and print the response +result = co.chat(message="Write a LinkedIn post about starting a career in tech:", + model="") # On SageMaker, you create a model name that you'll pass here. + +print(result) +``` + ## Next Steps With your selected configuration and Product ARN available, you now have everything you need to integrate with Cohere’s model offerings on SageMaker. diff --git a/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx b/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx index 0a8c6fc0e..4b6a23273 100644 --- a/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx +++ b/fern/pages/deployment-options/cohere-on-microsoft-azure.mdx @@ -46,7 +46,7 @@ You can find more information about Azure's API [here](https://learn.microsoft.c Here's a code snippet demonstrating how to programmatically interact with a Cohere model on Azure: -```python +```python PYTHON import urllib.request import json @@ -99,7 +99,7 @@ We expose two routes for Embed v3 - English and Embed v3 - Multilingual inferenc You can find more information about Azure's API [here](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-embed#embed-api-reference-for-cohere-embed-models-deployed-as-a-service). -```python +```python PYTHON import urllib.request import json diff --git a/fern/pages/deployment-options/cohere-works-everywhere.mdx b/fern/pages/deployment-options/cohere-works-everywhere.mdx index fe546bc4e..1fb049813 100644 --- a/fern/pages/deployment-options/cohere-works-everywhere.mdx +++ b/fern/pages/deployment-options/cohere-works-everywhere.mdx @@ -43,4 +43,456 @@ The most complete set of features is found on the cohere platform, while each of ### Typescript -#### Cohere Platform +#### Cohere Platform + +```typescript TS +const { CohereClient } = require('cohere-ai'); + +const cohere = new CohereClient({ + token: 'Your API key', +}); + +(async () => { + const response = await cohere.chat({ + chatHistory: [ + { role: 'USER', message: 'Who discovered gravity?' }, + { + role: 'CHATBOT', + message: 'The man who is widely credited with discovering gravity is Sir Isaac Newton', + }, + ], + message: 'What year was he born?', + // perform web search before answering the question. You can also use your own custom connector. 
+ connectors: [{ id: 'web-search' }], + }); + + console.log(response); +})(); +``` + +#### Bedrock + +```typescript TS +const { BedrockClient } = require('cohere-ai'); + +const cohere = new BedrockClient({ + awsRegion: "us-east-1", + awsAccessKey: "...", + awsSecretKey: "...", + awsSessionToken: "...", +}); + +(async () => { + const response = await cohere.chat({ + model: "cohere.command-r-plus-v1:0", + chatHistory: [ + { role: 'USER', message: 'Who discovered gravity?' }, + { + role: 'CHATBOT', + message: 'The man who is widely credited with discovering gravity is Sir Isaac Newton', + }, + ], + message: 'What year was he born?', + }); + + console.log(response); +})(); +``` + +#### Sagemaker + +```typescript TS +const { SagemakerClient } = require('cohere-ai'); + +const cohere = new SagemakerClient({ + awsRegion: "us-east-1", + awsAccessKey: "...", + awsSecretKey: "...", + awsSessionToken: "...", +}); + +(async () => { + const response = await cohere.chat({ + model: "my-endpoint-name", + chatHistory: [ + { role: 'USER', message: 'Who discovered gravity?' }, + { + role: 'CHATBOT', + message: 'The man who is widely credited with discovering gravity is Sir Isaac Newton', + }, + ], + message: 'What year was he born?', + }); + + console.log(response); +})(); +``` + +#### Azure + +```typescript TS +const { CohereClient } = require('cohere-ai'); + +const cohere = new CohereClient({ + token: "", + environment: "https://Cohere-command-r-plus-phulf-serverless.eastus2.inference.ai.azure.com/v1", +}); + +(async () => { + const response = await cohere.chat({ + chatHistory: [ + { role: 'USER', message: 'Who discovered gravity?' }, + { + role: 'CHATBOT', + message: 'The man who is widely credited with discovering gravity is Sir Isaac Newton', + }, + ], + message: 'What year was he born?', + }); + + console.log(response); +})(); +``` + +### Python + +#### Cohere Platform + +```python PYTHON +import cohere + +co = cohere.Client("Your API key") + +response = co.chat( + chat_history=[ + {"role": "USER", "message": "Who discovered gravity?"}, + { + "role": "CHATBOT", + "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }, + ], + message="What year was he born?", + # perform web search before answering the question. You can also use your own custom connector. 
+ connectors=[{"id": "web-search"}], +) + +print(response) +``` + +#### Bedrock + +```python PYTHON +import cohere + +co = cohere.BedrockClient( + aws_region="us-east-1", + aws_access_key="...", + aws_secret_key="...", + aws_session_token="...", +) + +response = co.chat( + model="cohere.command-r-plus-v1:0", + chat_history=[ + {"role": "USER", "message": "Who discovered gravity?"}, + { + "role": "CHATBOT", + "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }, + ], + message="What year was he born?", +) + +print(response) +``` + +#### Sagemaker + +```python PYTHON +import cohere + +co = cohere.SagemakerClient( + aws_region="us-east-1", + aws_access_key="...", + aws_secret_key="...", + aws_session_token="...", +) + +response = co.chat( + model="my-endpoint-name", + chat_history=[ + {"role": "USER", "message": "Who discovered gravity?"}, + { + "role": "CHATBOT", + "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }, + ], + message="What year was he born?", +) + +print(response) +``` + +#### Azure + +```python PYTHON +import cohere + +co = cohere.Client( + api_key="", + base_url="https://Cohere-command-r-plus-phulf-serverless.eastus2.inference.ai.azure.com/v1", +) + +response = co.chat( + chat_history=[ + {"role": "USER", "message": "Who discovered gravity?"}, + { + "role": "CHATBOT", + "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }, + ], + message="What year was he born?", +) + +print(response) +``` + +### Go + +#### Cohere Platform + +```go GO +package main + +import ( + "context" + "log" + + cohere "github.com/cohere-ai/cohere-go/v2" + client "github.com/cohere-ai/cohere-go/v2/client" +) + +func main() { + co := client.NewClient(client.WithToken("Your API key")) + + resp, err := co.Chat( + context.TODO(), + &cohere.ChatRequest{ + ChatHistory: []*cohere.ChatMessage{ + { + Role: cohere.ChatMessageRoleUser, + Message: "Who discovered gravity?", + }, + { + Role: cohere.ChatMessageRoleChatbot, + Message: "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }}, + Message: "What year was he born?", + Connectors: []*cohere.ChatConnector{ + {Id: "web-search"}, + }, + }, + ) + + if err != nil { + log.Fatal(err) + } + + log.Printf("%+v", resp) +} +``` + +#### Bedrock + +```go GO +package main + +import ( + "context" + "log" + + cohere "github.com/cohere-ai/cohere-go/v2" + client "github.com/cohere-ai/cohere-go/v2/client" + "github.com/cohere-ai/cohere-go/v2/core" +) + +func main() { + co := client.NewBedrockClient([]core.RequestOption{}, []client.AwsRequestOption{ + client.WithAwsRegion("us-east-1"), + client.WithAwsAccessKey(""), + client.WithAwsSecretKey(""), + client.WithAwsSessionToken(""), + }) + + resp, err := co.Chat( + context.TODO(), + &cohere.ChatRequest{ + ChatHistory: []*cohere.ChatMessage{ + { + Role: cohere.ChatMessageRoleUser, + Message: "Who discovered gravity?", + }, + { + Role: cohere.ChatMessageRoleChatbot, + Message: "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }}, + Message: "What year was he born?", + }, + ) + + if err != nil { + log.Fatal(err) + } + + log.Printf("%+v", resp) +} +``` + +#### Sagemaker + +```go GO +package main + +import ( + "context" + "log" + + cohere "github.com/cohere-ai/cohere-go/v2" + client "github.com/cohere-ai/cohere-go/v2/client" + "github.com/cohere-ai/cohere-go/v2/core" +) + +func main() { + co := client.NewSagemakerClient([]core.RequestOption{}, 
[]client.AwsRequestOption{ + client.WithAwsRegion("us-east-1"), + client.WithAwsAccessKey(""), + client.WithAwsSecretKey(""), + client.WithAwsSessionToken(""), + }) + + resp, err := co.Chat( + context.TODO(), + &cohere.ChatRequest{ + Model: cohere.String("my-endpoint-name"), + ChatHistory: []*cohere.ChatMessage{ + { + Role: cohere.ChatMessageRoleUser, + Message: "Who discovered gravity?", + }, + { + Role: cohere.ChatMessageRoleChatbot, + Message: "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }}, + Message: "What year was he born?", + }, + ) + + if err != nil { + log.Fatal(err) + } + + log.Printf("%+v", resp) +} +``` + +#### Azure + +```go GO +package main + +import ( + "context" + "log" + + cohere "github.com/cohere-ai/cohere-go/v2" + client "github.com/cohere-ai/cohere-go/v2/client" +) + +func main() { + client := client.NewClient( + client.WithToken(""), + client.WithBaseURL("https://Cohere-command-r-plus-phulf-serverless.eastus2.inference.ai.azure.com/v1"), + ) + + resp, err := co.Chat( + context.TODO(), + &cohere.ChatRequest{ + ChatHistory: []*cohere.ChatMessage{ + { + Role: cohere.ChatMessageRoleUser, + Message: "Who discovered gravity?", + }, + { + Role: cohere.ChatMessageRoleChatbot, + Message: "The man who is widely credited with discovering gravity is Sir Isaac Newton", + }}, + Message: "What year was he born?", + }, + ) + + if err != nil { + log.Fatal(err) + } + + log.Printf("%+v", resp) +} +``` + +### Java + +#### Cohere Platform + +```java JAVA +import com.cohere.api.Cohere; +import com.cohere.api.requests.ChatRequest; +import com.cohere.api.types.ChatMessage; +import com.cohere.api.types.Message; +import com.cohere.api.types.NonStreamedChatResponse; + +import java.util.List; + + +public class ChatPost { + public static void main(String[] args) { + Cohere cohere = Cohere.builder().token("Your API key").clientName("snippet").build(); + + NonStreamedChatResponse response = cohere.chat( + ChatRequest.builder() + .message("What year was he born?") + .chatHistory( + List.of(Message.user(ChatMessage.builder().message("Who discovered gravity?").build()), + Message.chatbot(ChatMessage.builder().message("The man who is widely credited with discovering gravity is Sir Isaac Newton").build()))).build()); + + System.out.println(response); + } +} +``` + +#### Azure + +```java JAVA +import com.cohere.api.Cohere; +import com.cohere.api.requests.ChatRequest; +import com.cohere.api.types.ChatMessage; +import com.cohere.api.types.Message; +import com.cohere.api.types.NonStreamedChatResponse; + +import java.util.List; + + +public class ChatPost { + public static void main(String[] args) { + Cohere cohere = Cohere.builder().environment(Environment.custom("https://Cohere-command-r-plus-phulf-serverless.eastus2.inference.ai.azure.com/v1")).token("").clientName("snippet").build(); + + NonStreamedChatResponse response = cohere.chat( + ChatRequest.builder() + .message("What year was he born?") + .chatHistory( + List.of(Message.user(ChatMessage.builder().message("Who discovered gravity?").build()), + Message.chatbot(ChatMessage.builder().message("The man who is widely credited with discovering gravity is Sir Isaac Newton").build()))).build()); + + System.out.println(response); + } +} +``` + diff --git a/fern/pages/deployment-options/getting-started-with-coral-toolkit.mdx b/fern/pages/deployment-options/getting-started-with-coral-toolkit.mdx index 74f7e7ff6..12698ab04 100644 --- a/fern/pages/deployment-options/getting-started-with-coral-toolkit.mdx +++ 
b/fern/pages/deployment-options/getting-started-with-coral-toolkit.mdx @@ -12,23 +12,26 @@ With Cohere's decision to open source some of our Github repositories, it has no To begin, make sure you have the with the SDK installed (the examples below are in Python, Typescript, and Go): -```python + +```python PYTHON pip install cohere ``` -```typescript +```typescript TYPESCRIPT npm i -s cohere-ai ``` -```go +```go GO go get github.com/cohere-ai/cohere-go/v2 ``` + Import dependencies and set up the Cohere client. -```python + +```python PYTHON import cohere co = cohere.Client('Your API key') ``` -```typescript +```typescript TYPESCRIPT import { CohereClient } from "cohere-ai"; const cohere = new CohereClient({ @@ -44,11 +47,12 @@ const cohere = new CohereClient({ console.log("Received prediction", prediction); })(); ``` -```go +```go GO import cohereclient "github.com/cohere-ai/cohere-go/v2/client" client := cohereclient.NewClient(cohereclient.WithToken("")) ``` + (All the rest of the examples on this page will be in Python, but you can find more detailed instructions for getting set up by checking out the Github repositories for [Python](https://github.com/cohere-ai/cohere-python), [Typescript](https://github.com/cohere-ai/cohere-typescript), and [Go](https://github.com/cohere-ai/cohere-go).) diff --git a/fern/pages/deployment-options/single-container-on-private-clouds.mdx b/fern/pages/deployment-options/single-container-on-private-clouds.mdx index 8255ed34f..6f6f910d2 100644 --- a/fern/pages/deployment-options/single-container-on-private-clouds.mdx +++ b/fern/pages/deployment-options/single-container-on-private-clouds.mdx @@ -76,6 +76,7 @@ This section provides simple examples of using each primary Cohere model in a Do Here are the `bash` commands you can run to use the Embed English, Embed Multilingual, Rerank English, Rerank Multilingual, and Command models through Docker. + ```Text Embed English docker run -d --rm --name embed-english --gpus=1 --net=host $IMAGE_TAG @@ -236,6 +237,7 @@ curl --header "Content-Type: application/json" --request POST http://localhost:8 docker stop command ``` + You'll note that final example includes documents that the Command model can use to ground its replies. This functionality falls under [retrieval augmented generation](/docs/retrieval-augmented-generation-rag). @@ -249,7 +251,7 @@ Deploying to Kubernetes requires nodes with the following installed: To deploy the same image on Kubernetes, we must first convert the docker configuration into an image pull secret (see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#registry-secret-existing-credentials) for more detail). -```yaml +```yaml YAML kubectl create secret generic cohere-pull-secret \ --from-file=.dockerconfigjson="~/.docker/config.json" \ --type=kubernetes.io/dockerconfigjson @@ -338,6 +340,7 @@ Leave that running in the background, and up a new terminal session to execute a Here are the `bash` commands you can run to use the Embed English, Embed Multilingual, Rerank English, Rerank Multilingual, and Command models through Kubernetes. + ```Text Embed English curl --header "Content-Type: application/json" --request POST http://localhost:8080/embed --data-raw '{"texts": ["testing embeddings in english"], "input_type": "classification"}' @@ -476,6 +479,7 @@ curl --header "Content-Type: application/json" --request POST http://localhost:8 ] } ``` + Remember that this is only an illustrative deployment. 
Feel free to modify it as needed to accommodate your environment. diff --git a/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx b/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx index d6ab4f524..3ee3da042 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning/chat-preparing-the-data.mdx @@ -25,7 +25,7 @@ A message consist of the following parts: Here is a chat example that trains a chat bot to answer questions. Notice that, for the sake of readability, the document spans over multiple lines. For your dataset, make sure that each line contains one whole example. -```json +```json JSON { "messages": [ { @@ -74,7 +74,7 @@ Evaluation data is utilized to calculate metrics that depict the performance of If you intend to fine-tune through our UI you can skip to the next chapter. Otherwise continue reading to learn how to create datasets for fine-tuning via our Python SDK. Before you start, we recommend that you read about [datasets](/docs/datasets). Please also see the 'Data Formatting and Requirements' in 'Using the Python SDK' in the next chapter for a full table of expected validation errors. Below you will find some code samples on how create datasets via the SDK: -```python +```python PYTHON import cohere # instantiate the Cohere client diff --git a/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx b/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx index 6210efd87..41346308f 100644 --- a/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx +++ b/fern/pages/fine-tuning/chat-fine-tuning/chat-starting-the-training.mdx @@ -33,7 +33,7 @@ Upload your training data by clicking on the `TRAINING SET` button at the bottom Your data has to be in a `.jsonl` file, where each `json` object is a conversation with the following structure: -```json +```json JSON {'messages': [{'role': 'System', 'content': 'You are a chatbot trained to answer to my every question.'}, @@ -108,7 +108,7 @@ Creating a fine-tuned model that can be used with the `co.Chat` API requires goo Your data has to be in a `.jsonl` file, where each `json` object is a conversation with the following structure: -```json +```json JSON {'messages': [{'role': 'System', 'content': 'You are a chatbot trained to answer to my every question.'}, @@ -126,7 +126,7 @@ We require a minimum of two valid conversations to begin training. Currently, us Using the `co.finetuning.create_finetuned_model()` method of the Cohere client, you can kick off a training job that will result in a fine-tuned model. Fine-tuned models are trained on custom datasets which are created using the `co.datasets.create()` method. In the example below, we create a dataset with training and evaluation data, and use it to fine-tune a model. -```python +```python PYTHON import cohere co = cohere.Client('Your API key') @@ -181,7 +181,7 @@ To train a custom model, please see the example below for parameters to pass to ## Example -```python +```python PYTHON import cohere from cohere.finetuning import Hyperparameters, Settings, BaseModel @@ -247,7 +247,7 @@ print(response.text) After your first message with the model, an `id` field will be returned which you can pass as the `conversation_id` to continue the conversation from that point onwards, like so: -```python +```python PYTHON # Continuing the above conversation with `response.id`. 
response_2 = co.chat( message="How are you?", diff --git a/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx b/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx index 2d181b59c..4f2ec1180 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning/classify-preparing-the-data.mdx @@ -30,7 +30,7 @@ Please notice that both text and label are required fields. When it comes to sin **jsonl** -```json +```json JSON {"text":"This movie offers that rare combination of entertainment and education", "label":"positive"} {"text":"Boring movie that is not as good as the book", "label":"negative"} {"text":"We had a great time watching it!", "label":"positive"} @@ -55,7 +55,7 @@ Multi-label data differs from single-label data in the following ways: **jsonl** -```json +```json JSON {"text":"About 99% of the mass of the human body is made up of six elements: oxygen, carbon, hydrogen, nitrogen, calcium, and phosphorus.", "label":["biology", "physics"]} {"text":"The square root of a number is defined as the value, which gives the number when it is multiplied by itself", "label":["mathematics"]} {"text":"Hello world!", "label":[]} @@ -78,7 +78,7 @@ Evaluation data is utilized to calculate metrics that depict the performance of If you intend to fine-tune through our UI you can skip to the next chapter. Otherwise continue reading to learn how to create datasets for fine-tuning via our [Python SDK](/docs/fine-tuning-with-the-python-sdk). Before you start, we recommend that you read about the [dataset](/docs/datasets) API. Below you will find some code samples on how create datasets via the SDK: -```python +```python PYTHON import cohere # instantiate the Cohere client diff --git a/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx b/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx index aa6721a76..3dfe34cfd 100644 --- a/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx +++ b/fern/pages/fine-tuning/classify-fine-tuning/classify-starting-the-training.mdx @@ -1,5 +1,5 @@ --- -title: "Starting the Classify Fine-Tuning" +title: "Trains and deploys a fine-tuned model." slug: "docs/classify-starting-the-training" hidden: false @@ -34,8 +34,9 @@ You also have the option of uploading a validation dataset. This will not be use At this point in time, if there are labels in the training set with less than five unique examples, those labels will be removed. + set. - + Once done, click 'Next'. @@ -94,7 +95,7 @@ Here are some example code snippets for you to use. 
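Note: the snippets below assume you have already instantiated a Cohere client as `co`. If you are starting from a fresh script, a minimal setup might look like this (substitute your own API key):

```python PYTHON
import cohere

# Instantiate the Cohere client with your API key
co = cohere.Client("Your API key")
```
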
### Starting a Single-label Fine-tune -```python +```python PYTHON # create dataset single_label_dataset = co.datasets.create(name="single-label-dataset", data=open("path/to/train.csv, "rb"), @@ -120,7 +121,7 @@ print(f"fine-tune ID: {finetune.id}, fine-tune status: {finetune.status}" ### Starting a Multi-label Fine-tune -``` +```python PYTHON # create dataset multi_label_dataset = co.create_dataset(name="multi-label-dataset", data=open("path/to/train.jsonl", "rb"), diff --git a/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx b/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx index 26afc5d74..a4995441b 100644 --- a/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx +++ b/fern/pages/fine-tuning/fine-tuning-with-the-python-sdk.mdx @@ -18,7 +18,7 @@ Before a fine-tune job can be started, users must upload a [Dataset](/docs/datas The snippet below creates a dataset for fine-tuning a model on records of customer service interactions. -```python +```python PYTHON # create a dataset co = cohere.Client('Your API key') @@ -36,7 +36,7 @@ result = co.wait(my_dataset) Below is an example of starting a fine-tune job of a generative model for Chat using a dataset of conversational data. -```python +```python PYTHON from cohere.finetuning import FinetunedModel, Settings, BaseModel # start training a custom model using the dataset diff --git a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx index 74c2f4a36..f7822b5f4 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-preparing-the-data.mdx @@ -22,7 +22,7 @@ First, ensure your data is in `jsonl` format. There are three required fields: Here are a few example lines from a dataset that could be used to train a model that finds the paraphrased question most relevant to a target question. -```json +```json JSON {"query": "What are your views on the supreme court's decision to make playing national anthem mandatory in cinema halls?", "relevant_passages": ["What are your views on Supreme Court decision of must National Anthem before movies?"], "hard_negatives": ["Is the decision of SC justified by not allowing national anthem inside courts but making it compulsory at cinema halls?", "Why has the supreme court of India ordered that cinemas play the national anthem before the screening of all movies? Is it justified?", "Is it a good decision by SC to play National Anthem in the theater before screening movie?", "Why is the national anthem being played in theaters?", "What does Balaji Vishwanathan think about the compulsory national anthem rule?"]} {"query": "Will Google's virtual monopoly in web search ever end? When?", "relevant_passages": ["Is Google's search monopoly capable of being disrupted?"], "hard_negatives": ["Who is capable of ending Google's monopoly in search?", "What is the future of Google?", "When will the Facebook era end?", "When will Facebook stop being the most popular?", "What happened to Google Search?"]} ``` @@ -43,7 +43,7 @@ Evaluation data is utilized to calculate metrics that depict the performance of If you intend to fine-tune through our UI you can skip to the next chapter. Otherwise continue reading to learn how to create datasets for fine-tuning via our Python SDK. Before you start we recommend that you read about the [dataset](/docs/datasets) API. 
Below you will find some code samples on how create datasets via the SDK: -```python +```python PYTHON import cohere # instantiate the Cohere client diff --git a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx index 19add0608..f7f9f07d4 100644 --- a/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx +++ b/fern/pages/fine-tuning/rerank-fine-tuning/rerank-starting-the-training.mdx @@ -38,7 +38,9 @@ You also have the option of uploading a validation dataset. This will not be use At this point in time, the platform will error if you upload a query in which a passage is listed as both a relevant passage and a hard negative + list. + In addition, if your `hard_negatives` are empty strings or duplicated in a given row, we will remove those from the training set as well. diff --git a/fern/pages/get-started/datasets.mdx b/fern/pages/get-started/datasets.mdx index 3e7beaa2f..2f8741824 100644 --- a/fern/pages/get-started/datasets.mdx +++ b/fern/pages/get-started/datasets.mdx @@ -34,13 +34,13 @@ You should also be aware of how Cohere handles data retention. This is the most First, let's install the SDK -```python +```python PYTHON pip install cohere ``` Import dependencies and set up the Cohere client. -```python +```python PYTHON import cohere co = cohere.Client(api_key='Your API key') ``` @@ -57,7 +57,7 @@ The dataset `name` is useful when browsing the datasets you've uploaded. In addi Here is an example code snippet illustrating the process of creating a dataset, with both the `name` and the `dataset_type` specified. -```python +```python PYTHON my_dataset = co.datasets.create( name="shakespeare", data=open("./shakespeare.csv", "rb"), @@ -72,7 +72,7 @@ Whenever a dataset is created, the data is validated asynchronously against the Here's a code snippet showing how to check the validation status of a dataset you've created. -```python +```python PYTHON ds = co.wait(my_dataset) print(ds.validation_status) ``` @@ -100,14 +100,14 @@ The Dataset API will preserve metadata if specified at time of upload. During th #### Sample Dataset Input Format -```json +```json JSON {"wiki_id": 69407798, "url": "https://en.wikipedia.org/wiki?curid=69407798", "views": 5674.4492597435465, "langs": 38, "title": "Deaths in 2022", "text": "The following notable deaths occurred in 2022. Names are reported under the date of death, in alphabetical order. A typical entry reports information in the following sequence:", "paragraph_id": 0, "id": 0} {"wiki_id": 3524766, "url": "https://en.wikipedia.org/wiki?curid=3524766", "views": 5409.5609619796405, "title": "YouTube", "text": "YouTube is a global online video sharing and social media platform headquartered in San Bruno, California. It was launched on February 14, 2005, by Steve Chen, Chad Hurley, and Jawed Karim. It is owned by Google, and is the second most visited website, after Google Search. YouTube has more than 2.5 billion monthly users who collectively watch more than one billion hours of videos each day. , videos were being uploaded at a rate of more than 500 hours of content per minute.", "paragraph_id": 0, "id": 1} ``` As seen in the above example, the following would be a valid `create_dataset` call since `langs` is in the first entry but not in the second entry. The fields `wiki_id`, `url`, `views` and `title` are present in both JSONs. 
-```python +```python PYTHON # Upload a dataset for embed jobs ds=co.datasets.create( name='sample_file', @@ -129,7 +129,7 @@ Once the dataset passes validation, it can be used to fine-tune a model. To do t In the example below, we will create a new dataset and upload an evaluation set using the optional `eval_data` parameter. We will then kick off a fine-tuning job using `co.finetuning.create_finetuned_model`. -```python +```python PYTHON # create a dataset my_dataset = co.datasets.create( name="shakespeare", @@ -153,7 +153,7 @@ When a dataset is created, the `dataset_type` field _must_ be specified in order Datasets of type `chat-finetune-input`, for example, are expected to have a json with the field `messages` containing a list of messages: -```python +```python PYTHON { "messages": [ { @@ -189,7 +189,7 @@ Datasets can be fetched using its unique `id`. Note that the dataset `name` and Here is an example code snippet showing how to fetch a dataset by its unique `id`. -```python +```python PYTHON # fetch the dataset by ID my_dataset = co.datasets.get(id="") @@ -207,6 +207,6 @@ co.utils.save_dataset(dataset=my_dataset, filepath='./path/to/new/file.csv') Datasets are automatically deleted after 30 days, but they can also be deleted manually. Here's a code snippet showing how to do that: -```python +```python PYTHON co.datasets.delete(id="") ``` diff --git a/fern/pages/going-to-production/rate-limits.mdx b/fern/pages/going-to-production/rate-limits.mdx index 96d69a4a7..2245d90fb 100644 --- a/fern/pages/going-to-production/rate-limits.mdx +++ b/fern/pages/going-to-production/rate-limits.mdx @@ -35,9 +35,16 @@ With a trial key: ## Production Key Specifications -Production keys for all endpoints are rate-limited at 10,000 calls per minute and are intended for serving Cohere in a public-facing application and testing purposes. Usage of production keys is metered at price points which can be found on our pricing page. +Production keys for all endpoints are rate-limited at 10,000 calls per minute and are intended for serving Cohere in a public-facing application and testing purposes. Usage of production keys is metered at price points which can be found on our [pricing page](/docs/how-does-cohere-pricing-work). + +To get a production key, start by navigating to the [API Keys](https://dashboard.cohere.com/api-keys) page in your Cohere dashboard. You'll either need to be the admin of your organization, or ask your organization Admin to complete these steps. + +![](../../assets/images/1d24fd7-Screenshot_2024-07-01_at_10.33.04_AM.png) + +From there, click on _Create Production key_ to finish the process. + +![](../../assets/images/27062e8-Screenshot_2024-07-01_at_10.33.54_AM.png) + +The whole process should complete in less than three minutes, and enables you to generate a production key that you can use to serve Cohere APIs in production. If you deploy without completing the go to production workflow, your API key may be temporarily or permanently revoked. -To get a production key, you will need to complete a few steps in our Go to Production workflow. You can start the process by navigating to the [API Keys](https://dashboard.cohere.com/api-keys) page in your Cohere dashboard as the Admin of your organization (or asking your organization Admin to complete these steps). From there, click on the _New Production Key_ button to start the process. 
-![](../../assets/images/e0c5638-Going_Live_Screenshot_1.png) -The process takes less than three minutes to finish and enables you to generate a Production key that you can use to serve Cohere APIs in production. If you deploy without completing the Go to Production workflow, your API key may be temporarily or permanently revoked. diff --git a/fern/pages/integrations/cohere-and-langchain/chat-on-langchain.mdx b/fern/pages/integrations/cohere-and-langchain/chat-on-langchain.mdx index 4939b56b2..da43b791c 100644 --- a/fern/pages/integrations/cohere-and-langchain/chat-on-langchain.mdx +++ b/fern/pages/integrations/cohere-and-langchain/chat-on-langchain.mdx @@ -20,7 +20,7 @@ Running Cohere Chat with LangChain doesn't require many prerequisites, consult t To use [Cohere chat](/docs/cochat-beta) with LangChain, simply create a [ChatCohere](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/chat_models/cohere.py) object and pass in the message or message history. In the example below, you will need to add your Cohere API key. -```python +```python PYTHON from langchain_community.chat_models import ChatCohere from langchain_core.messages import AIMessage, HumanMessage @@ -46,7 +46,7 @@ To use Cohere's multi hop agent create a `create_cohere_react_agent` and pass in For example, using an internet search tool to get essay writing advice from Cohere with citations: -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.chat_models import ChatCohere from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent @@ -84,7 +84,7 @@ To use Cohere's [retrieval augmented generation (RAG)](/docs/retrieval-augmented In this example, we use the [wikipedia retriever](https://python.langchain.com/docs/integrations/retrievers/wikipedia) but any [retriever supported by LangChain](https://python.langchain.com/docs/integrations/retrievers) can be used here. In order to set up the wikipedia retriever you need to install the wikipedia python package using `%pip install --upgrade --quiet wikipedia`. 
With that done, you can execute this code to see how a retriever works: -```python +```python PYTHON from langchain.retrievers import CohereRagRetriever from langchain.retrievers import WikipediaRetriever from langchain_community.chat_models import ChatCohere @@ -118,7 +118,7 @@ print(citations) In this example, we take documents (which might be generated in other parts of your application) and pass them into the [CohereRagRetriever](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/retrievers/cohere_rag_retriever.py) object: -```python +```python PYTHON from langchain.retrievers import CohereRagRetriever from langchain_community.chat_models import ChatCohere from langchain_core.documents import Document @@ -153,7 +153,7 @@ In this example, we create a generation with a [connector](/docs/connectors) whi Here's a code sample illustrating how to use a connector: -```python +```python PYTHON from langchain.retrievers import CohereRagRetriever from langchain_community.chat_models import ChatCohere from langchain_core.documents import Document diff --git a/fern/pages/integrations/cohere-and-langchain/embed-on-langchain.mdx b/fern/pages/integrations/cohere-and-langchain/embed-on-langchain.mdx index d4fb48215..1784fe98a 100644 --- a/fern/pages/integrations/cohere-and-langchain/embed-on-langchain.mdx +++ b/fern/pages/integrations/cohere-and-langchain/embed-on-langchain.mdx @@ -16,7 +16,7 @@ Running Cohere embeddings with LangChain doesn't require many prerequisites, con To use [Cohere's Embeddings](/docs/embeddings) with LangChain, create a [CohereEmbedding](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/embeddings/cohere.py) object as follows (the available cohere embedding models [are listed here](/reference/embed)): -```python +```python PYTHON from langchain_community.embeddings import CohereEmbeddings cohere_embeddings = CohereEmbeddings(cohere_api_key="{API_KEY}", model="embed-english-light-v3.0") @@ -29,7 +29,7 @@ print(doc_result) To use these embeddings with Cohere's RAG functionality, you will need to use one of the vector DBs [from this list](https://python.langchain.com/docs/integrations/vectorstores). In this example we use chroma, so in order to run it you will need to install chroma using `pip install chromadb`. -```python +```python PYTHON from langchain.retrievers import ContextualCompressionRetriever, CohereRagRetriever from langchain.retrievers.document_compressors import CohereRerank from langchain_community.embeddings import CohereEmbeddings @@ -82,7 +82,7 @@ In addition to the prerequisites above, integrating Cohere with LangChain on Ama In this example, we create embeddings for a query using Bedrock and LangChain: -```python +```python PYTHON from langchain_community.embeddings import BedrockEmbeddings # Replace the profile name with the one created in the setup. diff --git a/fern/pages/integrations/cohere-and-langchain/rerank-on-langchain.mdx b/fern/pages/integrations/cohere-and-langchain/rerank-on-langchain.mdx index 254dce602..2d67e4234 100644 --- a/fern/pages/integrations/cohere-and-langchain/rerank-on-langchain.mdx +++ b/fern/pages/integrations/cohere-and-langchain/rerank-on-langchain.mdx @@ -18,7 +18,7 @@ To use Cohere's [rerank functionality ](/docs/reranking) with LangChain, start w You can then use it with LangChain retrievers, embeddings, and RAG. The example below uses the vector DB chroma, for which you will need to install `pip install chromadb`. 
Other vector DB's [from this list](https://python.langchain.com/docs/integrations/vectorstores) can also be used. -```python +```python PYTHON from langchain.retrievers import ContextualCompressionRetriever, CohereRagRetriever from langchain.retrievers.document_compressors import CohereRerank from langchain_community.embeddings import CohereEmbeddings diff --git a/fern/pages/integrations/cohere-and-langchain/tools-on-langchain.mdx b/fern/pages/integrations/cohere-and-langchain/tools-on-langchain.mdx index 580c83759..ce8e1fa14 100644 --- a/fern/pages/integrations/cohere-and-langchain/tools-on-langchain.mdx +++ b/fern/pages/integrations/cohere-and-langchain/tools-on-langchain.mdx @@ -20,7 +20,7 @@ Running Cohere tools with LangChain doesn't require many prerequisites, consult Multi-step is enabled by default. Here's an example of using it to put together a simple agent: -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent from langchain_core.prompts import ChatPromptTemplate @@ -71,7 +71,7 @@ print(response['output']) In order to utilize single-step mode, you have to set `force_single_step=False`. Here's an example of using it to answer a few questions: -```python +```python PYTHON ### Router from typing import Literal diff --git a/fern/pages/integrations/integrations/chroma-and-cohere.mdx b/fern/pages/integrations/integrations/chroma-and-cohere.mdx index c88ad5ea2..d90c3b2f8 100644 --- a/fern/pages/integrations/integrations/chroma-and-cohere.mdx +++ b/fern/pages/integrations/integrations/chroma-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 16:53:24 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 16:53:54 GMT+0000 (Coordinated Universal Time)" --- - + Chroma is an open-source vector search engine that's quick to install and start building with using Python or Javascript. diff --git a/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx b/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx index 10b016808..228f1f16f 100644 --- a/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx +++ b/fern/pages/integrations/integrations/elasticsearch-and-cohere.mdx @@ -11,7 +11,7 @@ createdAt: "Sun Apr 07 2024 20:15:08 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 30 2024 15:56:35 GMT+0000 (Coordinated Universal Time)" --- - + [Elasticsearch](https://www.elastic.co/search-labs/blog/elasticsearch-cohere-embeddings-support) has all the tools developers need to build next generation search experiences with generative AI, and it supports native integration with [Cohere](https://www.elastic.co/search-labs/blog/elasticsearch-cohere-embeddings-support) through their [inference API](https://www.elastic.co/guide/en/elasticsearch/reference/master/semantic-search-inference.html). 
@@ -60,14 +60,14 @@ Install and import the required Python Packages: To install the packages, use the following code -```python +```python PYTHON !pip install elasticsearch_serverless==0.2.0.20231031 !pip install cohere==5.2.5 ``` ## Import the required packages -```python +```python PYTHON from elasticsearch_serverless import Elasticsearch, helpers import cohere import json @@ -84,7 +84,7 @@ In order to create an Elasticsearch client you will need: When creating your API key in the Serverless dashboard make sure to turn on Control security privileges, and edit cluster privileges to specify `"cluster": ["all"].` Note - you can also create a client using a local or Elastic Cloud cluster. For simplicity we use Elastic Serverless. -```python +```python PYTHON ELASTICSEARCH_ENDPOINT = "elastic_endpoint" ELASTIC_API_KEY = "encoded_api_key" @@ -107,7 +107,7 @@ To set up an inference pipeline for ingestion we first must create an inference We will create an inference endpoint that uses `embed-english-v3.0` and `int8` or `byte` compression to save on storage. -```python +```python PYTHON COHERE_API_KEY = "cohere_api_key" client.inference.put_model( @@ -130,7 +130,7 @@ client.inference.put_model( Now that we have an inference endpoint we can create an inference pipeline and processor to use when we ingest documents into our index. -```python +```python PYTHON client.ingest.put_pipeline( id="cohere_embeddings", description="Ingest pipeline for Cohere inference.", @@ -160,7 +160,7 @@ Let's note a few important parameters from that API call: We will now create an empty index that will be the destination of our documents and embeddings. -```python +```python PYTHON client.indices.create( index="cohere-wiki-embeddings", settings={"index": {"default_pipeline": "cohere_embeddings"}}, @@ -188,7 +188,7 @@ client.indices.create( Let’s now index our wiki dataset. -```python +```python PYTHON url = "https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/embed_jobs_sample_data.jsonl" response = requests.get(url) @@ -218,7 +218,7 @@ Our index should now be populated with our wiki data and text embeddings for the Now let’s start querying our index. We will perform a hybrid search query, which means we will compute the relevance of search results based on the vector similarity to our query, as well as the keyword similarity. Hybrid search tends to lead to state-of-the-art search results and Elastic is well-suited to offer this. Here we build a query that will search over the `title` and `text` fields using keyword matching, and will search over our text embeddings using vector similarity. -```python +```python PYTHON query = "When were the semi-finals of the 2022 FIFA world cup played?" response = client.search( @@ -263,7 +263,7 @@ In order to effectively combine the results from our vector and BM25 retrieval, First, create an inference endpoint with your Cohere API key. Make sure to specify a name for your endpoint, and the model_id of one of the rerank models. In this example we will use Rerank v3. -```python +```python PYTHON client.inference.put_model( task_type="rerank", inference_id="cohere_rerank", @@ -284,7 +284,7 @@ You can now rerank your results using that inference endpoint. Here we will pass The inference service will respond with a list of documents in descending order of relevance. 
Each document has a corresponding index (reflecting the order the documents were in when sent to the inference endpoint), and if the “return_documents” task setting is True, then the document texts will be included as well. -```python +```python PYTHON response = client.inference.inference( inference_id="cohere_rerank", body={ @@ -315,7 +315,7 @@ Now that we have ranked our results, we can easily turn this into a RAG system w Next, we can easily get a grounded generation with citations from the Cohere Chat API. We simply pass in the user query and documents retrieved from Elasticsearch to the API, and print out our grounded response. -```python +```python PYTHON response = co.chat(message=query, documents=ranked_documents, model='command-r-plus') source_documents = [] diff --git a/fern/pages/integrations/integrations/haystack-and-cohere.mdx b/fern/pages/integrations/integrations/haystack-and-cohere.mdx index b2255d888..36b50ed26 100644 --- a/fern/pages/integrations/integrations/haystack-and-cohere.mdx +++ b/fern/pages/integrations/integrations/haystack-and-cohere.mdx @@ -11,7 +11,7 @@ createdAt: "Tue Feb 27 2024 20:06:57 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 17:07:17 GMT+0000 (Coordinated Universal Time)" --- - + [Haystack](https://github.com/deepset-ai/haystack) is an open source LLM framework in Python by [deepset](https://www.deepset.ai/) for building customizable, production-ready LLM applications. You can use Cohere's `/embed`, `/generate`, `/chat`, and `/rerank` models with Haystack. @@ -35,7 +35,7 @@ Haystack’s `CohereChatGenerator` component enables chat completion using Coher In the example below, you will need to add your Cohere API key. We suggest using an environment variable, `COHERE_API_KEY`. Don’t commit API keys to source control! -```python +```python PYTHON from haystack import Pipeline from haystack.components.builders import DynamicChatPromptBuilder from haystack.dataclasses import ChatMessage @@ -60,7 +60,7 @@ print(res) You can pass additional dynamic variables to the LLM, like so: -```python +```python PYTHON messages = [system_message, ChatMessage.from_user("What's the weather forecast for {{location}} in the next {{day_count}} days?")] res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"}, @@ -73,7 +73,7 @@ print(res) This Haystack [retrieval augmented generation](/docs/retrieval-augmented-generation-rag) (RAG) pipeline passes Cohere’s documentation to a Cohere model, so it can better explain Cohere’s capabilities. In the example below, you can see the `LinkContentFetcher` replacing a classic retriever. The contents of the URL are passed to our generator. -```python +```python PYTHON from haystack import Document from haystack import Pipeline from haystack.components.builders import DynamicChatPromptBuilder @@ -127,7 +127,7 @@ The code sample below adds a set of documents to an `InMemoryDocumentStore`, the Although these examples use an `InMemoryDocumentStore` to keep things simple, Haystack supports [a variety](https://haystack.deepset.ai/integrations?type=Document+Store) of vector database and document store options. You can use any of them in combination with Cohere models. 
-```python +```python PYTHON from haystack import Pipeline from haystack.components.retrievers.in_memory import InMemoryBM25Retriever from haystack.components.builders.prompt_builder import PromptBuilder @@ -185,7 +185,7 @@ Although these examples use an `InMemoryDocumentStore` to keep things simple, Ha #### Index Documents with Haystack and Cohere Embeddings -```python +```python PYTHON from haystack import Pipeline from haystack import Document from haystack.document_stores.in_memory import InMemoryDocumentStore @@ -218,7 +218,7 @@ print(document_store.filter_documents()) After the indexing pipeline has added the embeddings to the document store, you can build a retrieval pipeline that gets the most relevant documents from your database. This can also form the basis of RAG pipelines, where a generator component can be added at the end. -```python +```python PYTHON from haystack import Pipeline from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever from haystack_integrations.components.embedders.cohere import CohereTextEmbedder diff --git a/fern/pages/integrations/integrations/milvus-and-cohere.mdx b/fern/pages/integrations/integrations/milvus-and-cohere.mdx index dcac8251f..ffc99e73d 100644 --- a/fern/pages/integrations/integrations/milvus-and-cohere.mdx +++ b/fern/pages/integrations/integrations/milvus-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 16:59:08 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 16:59:13 GMT+0000 (Coordinated Universal Time)" --- - + [Milvus](https://milvus.io/) is a highly flexible, reliable, and blazing-fast cloud-native, open-source vector database. It powers embedding similarity search and AI applications and strives to make vector databases accessible to every organization. Milvus is a graduated-stage project of the LF AI & Data Foundation. diff --git a/fern/pages/integrations/integrations/mongodb-and-cohere.mdx b/fern/pages/integrations/integrations/mongodb-and-cohere.mdx index 76c3a9037..97248f9f3 100644 --- a/fern/pages/integrations/integrations/mongodb-and-cohere.mdx +++ b/fern/pages/integrations/integrations/mongodb-and-cohere.mdx @@ -11,7 +11,7 @@ createdAt: "Thu May 23 2024 16:41:27 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 17:06:29 GMT+0000 (Coordinated Universal Time)" --- - + MongoDB Atlas Vector Search is a fully managed vector search platform from MongoDB. It can be used with Cohere's Embed and Rerank models to easily build semantic search or retrieval-augmented generation (RAG) systems with your data from MongoDB. diff --git a/fern/pages/integrations/integrations/opensearch-and-cohere.mdx b/fern/pages/integrations/integrations/opensearch-and-cohere.mdx index 3198fb260..83e6ab5fb 100644 --- a/fern/pages/integrations/integrations/opensearch-and-cohere.mdx +++ b/fern/pages/integrations/integrations/opensearch-and-cohere.mdx @@ -10,7 +10,7 @@ keywords: "OpenSearch, Cohere" createdAt: "Fri Feb 02 2024 15:17:19 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 17:09:12 GMT+0000 (Coordinated Universal Time)" --- - + [OpenSearch](https://opensearch.org/platform/search/vector-database.html) is an open-source, distributed search and analytics engine platform that allows users to search, analyze, and visualize large volumes of data in real time. When it comes to text search, OpenSearch is well-known for powering keyword-based (also called lexical) search methods. 
OpenSearch supports Vector Search and integrates with Cohere through [3rd-Party ML Connectors](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/connectors/) as well as Amazon Bedrock diff --git a/fern/pages/integrations/integrations/pinecone-and-cohere.mdx b/fern/pages/integrations/integrations/pinecone-and-cohere.mdx index e7b13efdb..0b7490622 100644 --- a/fern/pages/integrations/integrations/pinecone-and-cohere.mdx +++ b/fern/pages/integrations/integrations/pinecone-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 16:56:18 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 16:57:06 GMT+0000 (Coordinated Universal Time)" --- - + The [Pinecone](https://www.pinecone.io/) vector database makes it easy to build high-performance vector search applications. Use Cohere to generate language embeddings, then store them in Pinecone and use them for Semantic Search. diff --git a/fern/pages/integrations/integrations/qdrant-and-cohere.mdx b/fern/pages/integrations/integrations/qdrant-and-cohere.mdx index 21925bd78..eba83a7c4 100644 --- a/fern/pages/integrations/integrations/qdrant-and-cohere.mdx +++ b/fern/pages/integrations/integrations/qdrant-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 16:54:52 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 16:55:09 GMT+0000 (Coordinated Universal Time)" --- - + [Qdrant](https://qdrant.tech/) is an open-source vector similarity search engine and vector database. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. Qdrant is tailored to extended filtering support. It makes it useful for all sorts of neural-network or semantic-based matching, faceted search, and other applications. diff --git a/fern/pages/integrations/integrations/redis-and-cohere.mdx b/fern/pages/integrations/integrations/redis-and-cohere.mdx index 21aeb0410..fb1d41889 100644 --- a/fern/pages/integrations/integrations/redis-and-cohere.mdx +++ b/fern/pages/integrations/integrations/redis-and-cohere.mdx @@ -11,7 +11,7 @@ createdAt: "Mon Feb 26 2024 22:22:44 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 17:06:49 GMT+0000 (Coordinated Universal Time)" --- - + [RedisVL](https://www.redisvl.com/) provides a powerful, dedicated Python client library for using Redis as a Vector Database. This walks through how to integrate [Cohere embeddings](/docs/embeddings) with Redis using a dataset of Wikipedia articles to set up a pipeline for semantic search. It will cover: @@ -36,7 +36,7 @@ The code samples on this page assume the following: - You have docker running locally -```shell +```shell SHELL docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest ``` @@ -53,7 +53,7 @@ Install and import the required Python Packages: To install the packages, use the following code -```shell +```shell SHELL !pip install redisvl==0.1.0 !pip install cohere==4.45 !pip install jsonlines @@ -61,7 +61,7 @@ To install the packages, use the following code ### Import the required packages: -```python +```python PYTHON from redis import Redis from redisvl.index import SearchIndex from redisvl.schema import IndexSchema @@ -77,7 +77,7 @@ import jsonlines To configure a Redis index you can either specify a `yaml` file or import a dictionary. In this tutorial we will be using a `yaml` file with the following schema. 
Either use the `yaml` file found at this [link](https://github.com/cohere-ai/notebooks/blob/main/notebooks/configs/redis_guide_schema.yaml), or create a `.yaml` file locally with the following configuration. -```yaml +```yaml YAML version: "0.1.0" index: name: semantic_search_demo @@ -116,7 +116,7 @@ For this guide, we will be using the Cohere `embed-english-v3.0 model` which has ## Initializing the Cohere Text Vectorizer: -```python +```python PYTHON # create a vectorizer api_key='{Insert your cohere API Key}' @@ -132,7 +132,7 @@ The following [link](/docs/embed-2) contains details around the available embedd ## Initializing the Redis Index: -```python +```python PYTHON # construct a search index from the schema - this schema is called "semantic_search_demo" schema = IndexSchema.from_yaml("./schema.yaml") client = Redis.from_url("redis://localhost:6379") @@ -144,20 +144,20 @@ index.create(overwrite=True) Note that we are using `SearchIndex.from_yaml` because we are choosing to import the schema from a yaml file, we could also do `SearchIndex.from_dict` as well. -```curl +```curl CURL !rvl index listall ``` The above code checks to see if an index has been created. If it has, you should see something like this below: -```text +```text TEXT 15:39:22 [RedisVL] INFO Indices: 15:39:22 [RedisVL] INFO 1. semantic_search_demo ``` Look inside the index to make sure it matches the schema you want -```curl +```curl CURL !rvl index info -i semantic_search_demo ``` @@ -188,11 +188,13 @@ Index Fields: You can also visit: [http://localhost:8001/redis-stack/browser](http://localhost:8001/redis-stack/browser). The Redis GUI will show you the index in realtime. + GUI + ## Loading your Documents and Embedding them into Redis: -```python +```python PYTHON # read in your documents jsonl_file_path='data/redis_guide_data.jsonl' @@ -214,7 +216,7 @@ We will be loading a subset of data which contains paragraphs from wikipedia - t ## Prepare your Data to be inserted into the Index: -```python +```python PYTHON # contruct the data payload to be uploaded to your index data = [{"url": row['url'], "title": row['title'], @@ -237,7 +239,7 @@ At this point, your Redis DB is ready for semantic search! 
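Before moving on to queries, you can optionally confirm that the documents and embeddings made it into the index. Here is a minimal sketch using the `client` connection created above; it assumes the index name `semantic_search_demo` from the schema file, so adjust it if yours differs:

```python PYTHON
# Optional sanity check: inspect the index and count indexed documents
info = client.ft("semantic_search_demo").info()
print(f"Documents indexed: {info['num_docs']}")
```
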
## Query your Redis DB: -```python +```python PYTHON # use the Cohere vectorizer again to create a query embedding query_embedding = cohere_vectorizer.embed("What did Microsoft release in 2015?", input_type='search_query',as_buffer=True) @@ -261,7 +263,7 @@ Use the `VectorQuery` class to construct a query object - here you can specify t ## Adding Tag Filters: -```python +```python PYTHON # Initialize a tag filter tag_filter = Tag("title") == "Microsoft Office" @@ -278,7 +280,7 @@ One feature of Redis is the ability to add [filtering](https://www.redisvl.com/a ## Using Filter Expressions: -```python +```python PYTHON # define a tag match on the title, text match on the text field, and numeric filter on the views field filter_data=(Tag('title')=='Elizabeth II') & (Text("text")% "born") & (Num("views")>4500) diff --git a/fern/pages/integrations/integrations/vespa-and-cohere.mdx b/fern/pages/integrations/integrations/vespa-and-cohere.mdx index d5b929de0..600a26197 100644 --- a/fern/pages/integrations/integrations/vespa-and-cohere.mdx +++ b/fern/pages/integrations/integrations/vespa-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 16:52:12 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 16:52:39 GMT+0000 (Coordinated Universal Time)" --- - + [Vespa](https://vespa.ai/) is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query. Integrated machine-learned model inference allows you to apply AI to make sense of your data in real time. diff --git a/fern/pages/integrations/integrations/weaviate-and-cohere.mdx b/fern/pages/integrations/integrations/weaviate-and-cohere.mdx index f5c57e8c4..6baca4056 100644 --- a/fern/pages/integrations/integrations/weaviate-and-cohere.mdx +++ b/fern/pages/integrations/integrations/weaviate-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 16:55:16 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 16:56:09 GMT+0000 (Coordinated Universal Time)" --- - + [Weaviate](https://weaviate.io/) is an open source vector search engine that stores both objects and vectors, allowing for combining vector search with structured filtering. diff --git a/fern/pages/integrations/integrations/zilliz-and-cohere.mdx b/fern/pages/integrations/integrations/zilliz-and-cohere.mdx index 1d5f59b15..35921934b 100644 --- a/fern/pages/integrations/integrations/zilliz-and-cohere.mdx +++ b/fern/pages/integrations/integrations/zilliz-and-cohere.mdx @@ -7,7 +7,7 @@ createdAt: "Thu May 23 2024 17:00:11 GMT+0000 (Coordinated Universal Time)" updatedAt: "Thu May 23 2024 20:28:12 GMT+0000 (Coordinated Universal Time)" --- - + [Zilliz Cloud](https://zilliz.com/cloud) is a cloud-native vector database that stores, indexes, and searches for billions of embedding vectors to power enterprise-grade similarity search, recommender systems, anomaly detection, and more. Zilliz Cloud provides a fully-managed Milvus service, made by the creators of Milvus that allows for easy integration with vectorizers from Cohere and other popular models. Purpose-built to solve the challenge of managing billions of embeddings, Zilliz Cloud makes it easy to build applications for scale. 
diff --git a/fern/pages/integrations/llamaindex.mdx b/fern/pages/integrations/llamaindex.mdx index d1499c8b6..d6fa6a045 100644 --- a/fern/pages/integrations/llamaindex.mdx +++ b/fern/pages/integrations/llamaindex.mdx @@ -22,7 +22,7 @@ To use LlamaIndex and Cohere, you will need: To use Cohere's chat functionality with LlamaIndex create a [Cohere model object](https://docs.llamaindex.ai/en/stable/examples/llm/cohere.html) and call the `chat` function. -```python +```python PYTHON from llama_index.llms.cohere import Cohere from llama_index.core.llms import ChatMessage @@ -36,7 +36,7 @@ print(resp) To use Cohere's embeddings with LlamaIndex create a [Cohere Embeddings object](https://docs.llamaindex.ai/en/stable/examples/embeddings/cohereai.html) with an embedding model [from this list](/reference/embed) and call `get_text_embedding`. -```python +```python PYTHON from llama_index.embeddings.cohereai import CohereEmbedding embed_model = CohereEmbedding( @@ -57,7 +57,7 @@ print(embeddings[:5]) To use Cohere's rerank functionality with LlamaIndex create a [ Cohere Rerank object ](https://docs.llamaindex.ai/en/latest/examples/node_postprocessor/CohereRerank.html#) and use as a [node post processor.](https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/root.html) -```python +```python PYTHON cohere_rerank = CohereRerank(api_key="{API_KEY}", top_n=2) ``` @@ -65,7 +65,7 @@ cohere_rerank = CohereRerank(api_key="{API_KEY}", top_n=2) The following example uses Cohere's chat model, embeddings and rerank functionality to generate a response based on your data. -```python +```python PYTHON # rerank from llama_index import ServiceContext, VectorStoreIndex diff --git a/fern/pages/llm-university/intro-deployment/deploying-with-aws-sagemaker.mdx b/fern/pages/llm-university/intro-deployment/deploying-with-aws-sagemaker.mdx index 592ecd3b7..bbd06b4ac 100644 --- a/fern/pages/llm-university/intro-deployment/deploying-with-aws-sagemaker.mdx +++ b/fern/pages/llm-university/intro-deployment/deploying-with-aws-sagemaker.mdx @@ -63,7 +63,7 @@ The notebook goes through an example of creating an endpoint (the complete [note ### Step 1: Import the Required Libraries -```python +```python PYTHON !pip install cohere-sagemaker from cohere_sagemaker import Client @@ -74,7 +74,7 @@ import boto3 Select the product ARN while creating a deployable model using Boto3. 
-```python +```python PYTHON # Map the ARNs (current us-east-1 and eu-west-1 are supported) model_package_map = { "us-east-1": "arn:aws:sagemaker:us-east-1:865070037744:model-package/cohere-gpt-xlarge-v1-2-4d938caa0259377e94c4eb5bf6bc365a", @@ -92,7 +92,7 @@ model_package_arn = model_package_map[region] ### Step 3: Create an Endpoint -```python +```python PYTHON co = Client(region_name=region) co.create_endpoint(arn=model_package_arn, endpoint_name="cohere-gpt-xlarge", instance_type="ml.p4d.24xlarge", n_instances=1) @@ -102,7 +102,7 @@ co.create_endpoint(arn=model_package_arn, endpoint_name="cohere-gpt-xlarge", ins ### Step 4: Run Inference on the Endpoint -```python +```python PYTHON prompt="Write a creative product description for a wireless headphone product named the CO-1T" response = co.generate(prompt=prompt, max_tokens=100, temperature=0.9) @@ -117,7 +117,7 @@ The CO-1T is a sleek and stylish wireless headphone that is perfect for on-the-g ### Step 5: Delete the Endpoint -```python +```python PYTHON co.delete_endpoint() co.close() ``` diff --git a/fern/pages/llm-university/intro-deployment/deploying-with-databutton.mdx b/fern/pages/llm-university/intro-deployment/deploying-with-databutton.mdx index 04633877c..10830f30c 100644 --- a/fern/pages/llm-university/intro-deployment/deploying-with-databutton.mdx +++ b/fern/pages/llm-university/intro-deployment/deploying-with-databutton.mdx @@ -49,7 +49,7 @@ As the file is now saved to storage, the user can access the data throughout the The following are the helper functions mentioned above. -```python +```python PYTHON # Function to generate Cohere embeddings def embed_text(texts): embeddings = \[] @@ -64,7 +64,7 @@ def embed_text(texts): return embeddings ``` -```python +```python PYTHON # Function to reduce dimensionality of embeddings using umap def reduce_dimensionality(embeddings): @@ -73,7 +73,7 @@ def reduce_dimensionality(embeddings): return umap_embeddings[:, 0], umap_embeddings[:, 1] ``` -```python +```python PYTHON # Function to save embeddings into a json file in Databutton data storage def save_embeddings_to_json(df): @@ -97,7 +97,7 @@ The following is what the user interface will look like at this point. The reduced embeddings are now saved as a new file. We can copy the code snippet to import and use the DataFrame anywhere in our application. Here is an example of how you call the data as a DataFrame: -```python +```python PYTHON # Call the embeddings data as a DataFrame df = db.storage.dataframes.get(key=”reduced.csv”) ``` @@ -118,7 +118,7 @@ We will create a user interface that provides interactivity at a few crucial poi - **Calculate cluster centers and distances from centroids**: The script calculates the centroid (or geometric center) of each cluster. It then computes how far each point in your data is from its cluster’s centroid. We will need this in the next step of the process where we label our clusters. - **Display and save the results**: The script will display the DataFrame that now includes cluster labels and distances from centroids. If you’re satisfied with the results, you can save this labeled data back into Databutton’s storage for later use. -```python +```python PYTHON import databutton as db import streamlit as st from sklearn.cluster import KMeans @@ -196,7 +196,7 @@ Finally we’ll use the [Chat endpoint](/reference/chat) to suggest a name for e We use the `utterance_prompt` as the prompt to the Chat endpoint to generate descriptive labels for the data clusters. 
-```python +```python PYTHON utterance_prompt = """ These are clusters of commands given to an AI-based personal assistant. Each cluster represents a specific type of task or query that users often ask their personal assistant to perform. A list of keywords summarizing the collection is included, along with the name of the cluster. The name of each cluster should be a brief, precise description of the common theme within the utterances. --- @@ -237,7 +237,7 @@ Sample utterances from this cluster: We create several helper functions for processing the DataFrame, generating keywords, creating labels, and displaying information to the user. The `extract_top_n_words` function generates the most relevant keywords for each cluster. The `generate_label` function uses an AI model to generate a descriptive label for each cluster. The `generate_keywords_and_label` function wraps up these processes for each cluster and updates the DataFrame accordingly. The `present_cluster_data` function is used to present the information about each cluster to the user. -```python +```python PYTHON # Function to generate a name for each cluster def generate_label(customer_service_prompt, text_series): @@ -258,7 +258,7 @@ def generate_label(customer_service_prompt, text_series): return response.text, prompt ``` -```python +```python PYTHON # Function to generate keywords for each cluster def extract_top_n_words(vectorizer, tfidf_matrix, n=10): @@ -279,7 +279,7 @@ def extract_top_n_words(vectorizer, tfidf_matrix, n=10): return [word[0] for word in words_freq[:n]] ``` -```python +```python PYTHON # Helper function to generate the cluster name and keywords @st.cache_resource @@ -327,7 +327,7 @@ And finally, putting the steps together for cluster names and keyword generation These are reflected in the corresponding code block. -```python +```python PYTHON from sklearn.feature_extraction.text import TfidfVectorizer import databutton as db import streamlit as st diff --git a/fern/pages/llm-university/intro-deployment/deploying-with-fastapi.mdx b/fern/pages/llm-university/intro-deployment/deploying-with-fastapi.mdx index 4b0ac77f0..9a28ab5e6 100644 --- a/fern/pages/llm-university/intro-deployment/deploying-with-fastapi.mdx +++ b/fern/pages/llm-university/intro-deployment/deploying-with-fastapi.mdx @@ -29,11 +29,11 @@ First, we create a Python file. Let's name it `main.py`. Next, we import FastAPI and Cohere, as well as Pydantic, for structuring inputs to the API. -```shell +```shell SHELL pip install cohere fastapi "uvicorn[standard]" ``` -```python +```python PYTHON from fastapi import FastAPI from pydantic import BaseModel, conlist import cohere @@ -55,7 +55,7 @@ We feed these examples to the `co.classify()` method via the `examples` argument If you need more information about the endpoint, visit the [Classify endpoint documentation](/reference/classify). -```python +```python PYTHON examples=[ClassifyExample(text="The order came 5 days early", label="positive"), ClassifyExample(text="The item exceeded my expectations", label="positive"), ClassifyExample(text="I ordered more for my friends", label="positive"), @@ -87,7 +87,7 @@ Next, we create an endpoint that we call `prediction`together with a function th We take the code to call the Classify endpoint from the previous step and put it inside the function. -```python +```python PYTHON app = FastAPI() class ProductReviews(BaseModel): @@ -122,7 +122,7 @@ def predict_sentiment(product_reviews: ProductReviews): We can now test the endpoint locally. 
Switch your terminal working directory to the location of your saved Python file, then input the following shell command. This brings up a server on your localhost. -```shell +```shell SHELL uvicorn main:app ``` @@ -135,7 +135,7 @@ Let's test with these two text inputs and get the predicted classes from the mod There are a couple of options to call the endpoint. One way is to run a cURL command on your terminal. -```shell +```shell SHELL curl -X 'POST' \ 'http://127.0.0.1:8000/prediction' \ -H 'accept: application/json' \ diff --git a/fern/pages/llm-university/intro-deployment/deploying-with-streamlit.mdx b/fern/pages/llm-university/intro-deployment/deploying-with-streamlit.mdx index 86110562c..261d6bb2e 100644 --- a/fern/pages/llm-university/intro-deployment/deploying-with-streamlit.mdx +++ b/fern/pages/llm-university/intro-deployment/deploying-with-streamlit.mdx @@ -36,7 +36,7 @@ In our case, we create a prompt containing an instruction and a few examples of We then build a function that leverages Cohere’s [Python SDK](/generate-reference?ref=txt.cohere.com) to take in user input and return the generated text, and the code looks as follows. -```python +```python PYTHON def generate_idea(industry): prompt = f""" Generate a startup idea given the industry. @@ -74,7 +74,7 @@ Generating startup ideas is great, but it would make the app much more exciting We create another function that takes in a startup idea as the input and returns the generated startup name. The code looks as follows. -```python +```python PYTHON def generate_name(idea): prompt= f""" Generate a startup name and name given the startup idea. @@ -114,7 +114,7 @@ Now that the text generation part is working, let’s create the front end with The following is our front-end code using Streamlit, which gets us a basic working app in just a few lines. -```python +```python PYTHON st.title("🚀 Startup Idea Generator") form = st.form(key="user_settings") @@ -160,7 +160,7 @@ The second is a bit more interesting. With the Chat endpoint, we can use the `te For this, we’ll also use `st.slider()` to let the users control the `temperature` value. We’ll also need to modify the `generate_idea()` and `generate_name()` functions to accept the `temperature` argument, to be passed to the Chat API call. The following is an example with `generate_idea()`. -```python +```python PYTHON def generate_idea(industry, temperature): ... response = co.chat( @@ -176,7 +176,7 @@ Let’s also add a couple more things. First, we’ll use `st.progress()` to sho Altogether, the completed front-end Streamlit code looks like the following, about 20 lines. -```python +```python PYTHON st.title("🚀 Startup Idea Generator") form = st.form(key="user_settings") diff --git a/fern/pages/llm-university/intro-large-language-models/similarity-between-words-and-sentences.mdx b/fern/pages/llm-university/intro-large-language-models/similarity-between-words-and-sentences.mdx index 9a514cdc2..8e4b24fe6 100644 --- a/fern/pages/llm-university/intro-large-language-models/similarity-between-words-and-sentences.mdx +++ b/fern/pages/llm-university/intro-large-language-models/similarity-between-words-and-sentences.mdx @@ -78,7 +78,7 @@ Of course, this was a very small example. Let’s do a real-life example with th To set up, we first import several tools we'll need. -```python +```python PYTHON import numpy as np import seaborn as sns import altair as alt @@ -87,14 +87,14 @@ from sklearn.metrics.pairwise import cosine_similarity We also import the Cohere module and create a client. 
-```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` Consider the following 3 sentences, stored in the Python list `texts`. -```python +```python PYTHON texts = ["I like to be in my house", "I enjoy staying home", "the isotope 238u decays to 206pb"] @@ -110,7 +110,7 @@ To get the corresponding sentence embeddings, we call the [Embed endpoint](/refe You'll learn about these parameters in more detail in the [LLMU Module on Text Representation](/docs/intro-text-representation). -```python +```python PYTHON response = co.embed( texts=texts, model='embed-english-v3.0', @@ -120,7 +120,7 @@ response = co.embed( The embeddings are stored in the `embeddings` value of the response. After getting the embeddings, we separate them by sentence and print the values. -```python +```python PYTHON embeddings = response.embeddings [sentence1, sentence2, sentence3] = embeddings @@ -147,7 +147,7 @@ Note that the embeddings are vectors (lists) of 1024 numbers, so they are trunca Let’s calculate the dot products between the three sentences. The following line of code will do it. -```python +```python PYTHON print("Similarity between sentences 1 and 2:", np.dot(sentence1, sentence2)) print("Similarity between sentences 1 and 3:", np.dot(sentence1, sentence3)) print("Similarity between sentences 2 and 3:", np.dot(sentence2, sentence3)) @@ -177,7 +177,7 @@ This checks out—the similarity between a sentence and itself is around 1, whic Now let’s calculate the cosine similarities between them. -```python +```python PYTHON print("Cosine similarity between sentences 1 and 2:", cosine_similarity([sentence1], [sentence2])[0][0]) print("Cosine similarity between sentences 1 and 3:", cosine_similarity([sentence1], [sentence3])[0][0]) print("Cosine similarity between sentences 2 and 3:", cosine_similarity([sentence2], [sentence3])[0][0]) diff --git a/fern/pages/llm-university/intro-prompt-engineering/chaining-prompts-2.mdx b/fern/pages/llm-university/intro-prompt-engineering/chaining-prompts-2.mdx index 106fb4853..ded026453 100644 --- a/fern/pages/llm-university/intro-prompt-engineering/chaining-prompts-2.mdx +++ b/fern/pages/llm-university/intro-prompt-engineering/chaining-prompts-2.mdx @@ -49,13 +49,13 @@ A sequential chain of prompts is needed when the subtasks depend on each other. Let’s say we are building an application that generates recipe ideas for a whole week and then generates a shopping list of ingredients for the user to buy. In this case, given a user input of, say, the number of meals or days, we can run the recipe generation step in parallel. The prompt might look something like the following: -```python +```python PYTHON prompt = f'Suggest a simple and quick recipe for {meal}. Write in JSON containing these keys "Ingredients" and "Instructions"' ``` Next, we’ll repeat the recipe generation across all meals. Once complete, we can consolidate the ingredients from each meal into a single shopping list that the user can use immediately. -```python +```python PYTHON prompt = f"""Consolidate the following ingredients into a single shopping list, without repetition: {ingredients}""" ``` @@ -108,7 +108,7 @@ Let’s take a rephrasing task as an example. Say we have an application that ta The prompt, taking in the user input to be rephrased, might look something like the following: -```python +```python PYTHON user_input = "I really don't have time for this nonsense." 
prompt_rephrase = f"""Rephrase this user comment into something more polite: @@ -126,7 +126,7 @@ I think we might need to set aside some time to discuss this properly. Next, we create another prompt to check if the rephrased comment is similar enough to the original comment. -```python +```python PYTHON prompt_check = f"""Below is a rude comment that has been rephrased into a polite version. The rephrased comment must maintain a similar meaning to the original comment. Check if this is true. Answer with YES or NO. diff --git a/fern/pages/llm-university/intro-prompt-engineering/constructing-prompts.mdx b/fern/pages/llm-university/intro-prompt-engineering/constructing-prompts.mdx index 00a415484..3afc842e0 100644 --- a/fern/pages/llm-university/intro-prompt-engineering/constructing-prompts.mdx +++ b/fern/pages/llm-university/intro-prompt-engineering/constructing-prompts.mdx @@ -24,7 +24,7 @@ First, let’s install the Cohere Python SDK, get the Cohere API key, and set up ``` -```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Get your API key: https://dashboard.cohere.com/api-keys ``` @@ -58,7 +58,7 @@ While prompts can morph into something very lengthy and complex, it doesn’t ha Let’s say we want to generate a product description for a wireless headphone. Here’s an example prompt, where we create a variable for the user to input some text and merge that into the main prompt. -```python +```python PYTHON user_input = "a wireless headphone product named the CO-1T" prompt = f"""Write a creative product description for {user_input}""" @@ -83,7 +83,7 @@ A simple and short prompt can get you started, but in most cases, you’ll need Going back to the previous prompt, the generated product description was great, but what if we wanted it to include specific things, such as its features, who it is designed for, and so on? We can adjust the prompt to take more inputs from the user, like so: -```python +```python PYTHON user_input_product = "a wireless headphone product named the CO-1T" user_input_keywords = '"bluetooth", "wireless", "fast charging"' user_input_customer = "a software developer who works in noisy offices" @@ -98,7 +98,7 @@ generate_text(prompt, temp=0.5) In the example above, we pack the additional details of the prompt in a single paragraph. Alternatively, we can also compose it to be more structured, like so: -```python +```python PYTHON user_input_product = "a wireless headphone product named the CO-1T" user_input_keywords = '"bluetooth", "wireless", "fast charging"' user_input_customer = "a software developer who works in noisy offices" @@ -145,7 +145,7 @@ This is a whole topic on its own, but to provide some idea, [this demo](https:// Here’s an example where we ask the model to list the features of the CO-1T wireless headphone without any additional context: -```python +```python PYTHON user_input ="What are the key features of the CO-1T wireless headphone" prompt = user_input @@ -162,7 +162,7 @@ The CO-1T wireless headphone is a high-quality, comfortable, and durable headpho And here’s the same request to the model, this time with the product description of the product added as context. -```python +```python PYTHON context = """Think back to the last time you were working without any distractions in the office. That's right...I bet it's been a while. \ With the newly improved CO-1T noise-cancelling Bluetooth headphones, you can work in peace all day. 
Designed in partnership with \ software developers who work around the mayhem of tech startups, these headphones are finally the break you've been waiting for. With \ @@ -193,7 +193,7 @@ So far, we saw how to get the model to generate responses that follow certain st Here, the task is to extract information from a list of invoices. Instead of providing the information in plain text, we can prompt the model to generate a table that contains all the information required. -```python +```python PYTHON prompt="""Turn the following information into a table with columns Invoice Number, Merchant Name, and Account Number. Bank Invoice: INVOICE #0521 MERCHANT ALLBIRDS ACC XXX3846 Bank Invoice: INVOICE #6781 MERCHANT SHOPPERS ACC XXX9877 @@ -220,7 +220,7 @@ Let me know if you'd like me to make any modifications to this table or provide Another useful format is JSON, which we can modify the prompt as follows. -```python +```python PYTHON prompt="""Turn the following information into a JSON string with the following keys: Invoice Number, Merchant Name, and Account Number. Bank Invoice: INVOICE #0521 MERCHANT ALLBIRDS ACC XXX3846 Bank Invoice: INVOICE #6781 MERCHANT SHOPPERS ACC XXX9877 @@ -235,7 +235,7 @@ This returns the following response. ```` Certainly, here is the JSON format of the three bank invoices: -```json +```json JSON [ { "invoice_number": "INVOICE #0521", @@ -270,7 +270,7 @@ We’ll use this example request: “Send a message to Alison to ask if she can First, let’s generate a response without giving the model an example. Here’s the prompt: -```python +```python PYTHON prompt="""Turn the following message to a virtual assistant into the correct action: Send a message to Alison to ask if she can pick me up tonight to go to the concert together""" @@ -287,7 +287,7 @@ Ok, I will send a message to Alison asking if she can pick you up tonight to go Now, let’s modify the prompt by adding a few examples of how we expect the output to be. -```python +```python PYTHON user_input = "Send a message to Alison to ask if she can pick me up tonight to go to the concert together" prompt=f"""Turn the following message to a virtual assistant into the correct action: @@ -322,7 +322,7 @@ This concept is called _chain of thought_ prompting, introduced by [Wei et al.]( First let’s look at a prompt _without_ a chain of thought. It contains one example of a question followed by the answer, without any intermediate calculation step. It also contains the new question we want to answer. -```python +```python PYTHON prompt=f"""Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. \ How many tennis balls does he have now? A: The answer is 11. @@ -347,7 +347,7 @@ The answer is 12. There are 5 balls that are red and 5 balls that are not red, a Now, let’s repeat that, this time with a chain of thought. Now, the example answer contains a reasoning step, describing the calculation logic to get to the final answer, before giving the final answer. -```python +```python PYTHON prompt=f"""Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. \ How many tennis balls does he have now? A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. \ @@ -383,7 +383,7 @@ There could be another scenario where we specifically need the response to conta Let’s use an example of generating startup ideas. 
We can get the model to directly generate an idea for a given industry, like so: -```python +```python PYTHON user_input = "education" prompt = f"""Generate a startup idea for this industry: {user_input}""" @@ -411,7 +411,7 @@ Approach & Features: Alternatively, we can ask the model to generate information in steps, such as describing the problem to be solved and the target audience experiencing this problem. -```python +```python PYTHON user_input = "education" prompt = f"""Generate a startup idea for this industry: {user_input} @@ -454,7 +454,7 @@ To ensure that the response follows a consistent format or style, sometimes we n Let’s say we are generating the characteristics of football players for a given position, with one separate paragraph per characteristic. A prompt without any guiding prefix would look something like this: -```python +```python PYTHON user_input_position = "modern centre forward" prompt = f"""Describe the ideal {user_input_position}. In particular, describe the following characteristics: \ @@ -481,7 +481,7 @@ Awareness: A great centre... But if we just added a prefix of the first characteristic (“Pace”) at the end of the prompt, it will give a signal to the model as to how the output should look like. -```python +```python PYTHON user_input_position = "modern centre forward" prompt = f"""Describe the ideal {user_input_position}. In particular, describe the following characteristics: \ @@ -519,7 +519,7 @@ The paper proposes adding a prefix that nudges the model to perform a reasoning Here’s an example taken from the paper. First, we look at a prompt without the “Let’s think step by step” prefix. -```python +```python PYTHON prompt=f"""Q: A juggler can juggle 16 balls. Half of the balls are golf balls, and half of the golf balls are blue. How many blue golf balls are there? diff --git a/fern/pages/llm-university/intro-prompt-engineering/evaluating-outputs.mdx b/fern/pages/llm-university/intro-prompt-engineering/evaluating-outputs.mdx index 945efec10..358d9a8df 100644 --- a/fern/pages/llm-university/intro-prompt-engineering/evaluating-outputs.mdx +++ b/fern/pages/llm-university/intro-prompt-engineering/evaluating-outputs.mdx @@ -99,7 +99,7 @@ Any human evaluation paradigms that we discussed (reference, scoring, and A/B te The example below uses the Command model to perform an A/B testing evaluation for the same question-answering task. The model's task is to choose the winning response between two responses to the question. -```python +```python PYTHON # Add text to evaluate ref_answer = "Because the sound quality is the best out there" gen_answers = ["Because the audio experience is unrivaled", @@ -147,7 +147,7 @@ We calculate the precision, recall, and F1-score of the n-grams of the question- Here is an example using ROUGE: -```python +```python PYTHON from collections import Counter def rouge_1(reference, candidate): diff --git a/fern/pages/llm-university/intro-prompt-engineering/use-case-patterns.mdx b/fern/pages/llm-university/intro-prompt-engineering/use-case-patterns.mdx index e3d4ee15d..4bd745fd1 100644 --- a/fern/pages/llm-university/intro-prompt-engineering/use-case-patterns.mdx +++ b/fern/pages/llm-university/intro-prompt-engineering/use-case-patterns.mdx @@ -21,7 +21,7 @@ In this blog post, we’ll go through several broad use case categories for the First, let’s install the Cohere package, get the [Cohere API key](https://dashboard.cohere.ai/api-keys?ref=txt.cohere.com), and set up the client. -```python +```python PYTHON ! 
pip install cohere import cohere @@ -31,7 +31,7 @@ co = cohere.Client("COHERE_API_KEY") # Your Cohere API key Let’s also define a function to take a [prompt](https://txt.cohere.com/generative-ai-part-1/) and a [temperature value](https://txt.cohere.com/generative-ai-part-1/#being-creative-vs-predictable) and then call the [Chat endpoint](/reference/chat?ref=txt.cohere.com), which is how we can access the Command model.​​ We set a default temperature value of 0, which nudges the response to be more predictable and less random. This function returns the text response generated by the model. -```python +```python PYTHON def generate_text(prompt, temp=0): response = co.chat_stream( message=prompt, @@ -64,7 +64,7 @@ Here we can ask the model to write freeform text, for example, with this prompt: Here’s how we can do that. Let’s say we’re building an application for users to enter some bullet points and get a complete email written. We can set up the prompt in the following way: create a variable for the user to input some text and merge that, together with the product description, into the main prompt. -```python +```python PYTHON user_input =""" - announce product launch - create a call to action @@ -108,7 +108,7 @@ This use case is about answering a question that a user asks, be it in a single- Question answering can take place in either a closed or open setting. In a closed-book question answering setting, we rely on the model to answer questions based on the general knowledge from which it has been trained. Here’s one example: -```python +```python PYTHON user_input ="What features should I consider when choosing a wireless headphone" prompt = user_input @@ -153,7 +153,7 @@ In this setting, we can get the model to refer to specific knowledge bases to he Here, a customer asks a product question. We can append the customer’s question and the product description to the prompt, as follows. -```python +```python PYTHON user_input ="How do I control the sound levels" prompt = f"""{product} @@ -178,7 +178,7 @@ Another form of writing is brainstorming, where we want the model to generate a In this example, we want the model to act as an assistant to a customer support agent in identifying possible ways to troubleshoot a technical problem that a customer is facing. -```python +```python PYTHON user_input = "I can't get the Bluetooth connection working" prompt = f"""{product} A customer provided the following complaint about this product: {user_input}. @@ -212,7 +212,7 @@ One example is transforming a passage of text into a different form, making it r For example, creating a list of Frequently Asked Questions (FAQs) about wireless headphones is crucial, but it requires some effort to create. We can cut short this process by getting the model to generate a list of FAQs based on the product description, as follows: -```python +```python PYTHON prompt =f"""Turn the following product description into a list of frequently asked questions (FAQ). Product description: {product} @@ -252,7 +252,7 @@ One popular use case for synthesizing text is summarization. Here we take a long In this example, we create a prompt to summarize a list of customer reviews about the wireless headphone. -```python +```python PYTHON user_input ="""Customer reviews of the CO-1T wireless headphones: "The CO-1T is a great pair of headphones! 
The design is sleek and modern, and the headphones are \ @@ -291,7 +291,7 @@ The CO-1T wireless headphones have a sleek, modern design and are comfortable to Rewriting text is another useful use case where you need to modify some aspects of the text while maintaining its overall meaning. One example is changing the tone of a piece of text to tailor it to a specific audience. Here we want to rewrite the product description so it’s more relatable to students. -```python +```python PYTHON user_input = "college students" prompt = f"""Create a version of this product description that's tailored towards {user_input}. @@ -316,7 +316,7 @@ Another extremely useful way of looking at text synthesis is information extract Here is an example of an email in which a customer is, unfortunately, asking for a refund for the wireless headphone. We can have the model process this email by getting it to extract information, such as the product name, refund reason, and pick-up address. -```python +```python PYTHON user_input ="""I am writing to request a refund for a recent CO-1T purchase I made on your platform. \ Unfortunately, the product has not met my expectations due to its poor battery life. \ Please arrange for the pick-up at this address: 171 John Street, Toronto ON, M5T 1X2.""" @@ -346,7 +346,7 @@ One of the most widely deployed use cases in NLP is text classification. Here, t We can create the prompt as follows. -```python +```python PYTHON user_input ="The battery drains fast" prompt = f"""The following is a user message to a customer support agent. @@ -371,7 +371,7 @@ Alternatively, the [Classify endpoint](/reference/classify?ref=txt.cohere.com) p Here’s how we can use the Classify endpoint. It requires a minimum of two examples per class, which is passed as an argument to the API call. We have six examples altogether – two for each class. -```python +```python PYTHON from cohere import ClassifyExample response = co.classify( diff --git a/fern/pages/llm-university/intro-prompt-engineering/validating-outputs.mdx b/fern/pages/llm-university/intro-prompt-engineering/validating-outputs.mdx index 12f4c1ece..86de0cedb 100644 --- a/fern/pages/llm-university/intro-prompt-engineering/validating-outputs.mdx +++ b/fern/pages/llm-university/intro-prompt-engineering/validating-outputs.mdx @@ -23,7 +23,7 @@ However, one key property of LLMs that’s different from traditional software i Here’s an example. In [Chapter 1](/docs/constructing-prompts), we looked at a text extraction task of turning a list of bank invoices in a text document into a JSON object containing three fields: “Invoice Number,” “Merchant Name,” and “Account Number.” For brevity, we’ll turn it into a shorter version with the document containing just one invoice, as follows. -```python +```python PYTHON prompt="""Turn the following information into a JSON string with the following keys: Invoice Number, Merchant Name, and Account Number. Bank Invoice: INVOICE #0521 MERCHANT ALLBIRDS ACC XXX3846 """ @@ -31,7 +31,7 @@ This produced an LLM response that followed exactly what we wanted, as shown bel -```json +```json JSON { "Invoice Number": "0521", "Merchant Name": "Allbirds", @@ -41,7 +41,7 @@ But how do we ensure we’ll get the same response every time? Perhaps another time, the output may miss some information, such as returning incomplete information like this one.
-```json +```json JSON { "Invoice Number": "0521" } @@ -83,7 +83,7 @@ Implementation-wise, the following steps are involved in incorporating Guardrail Let’s look at an example of using Guardrails in a text extraction task. The task is to extract the information from a doctor’s note into a JSON object. The following is the doctor’s note. -```python +```python PYTHON doctors_notes = """49 y/o Male with chronic macular rash to face & hair, worse in beard, eyebrows & nares. Itchy, flaky, slightly scaly. Moderate response to OTC steroid cream""" ``` @@ -112,7 +112,7 @@ We'll also need to download the validators required for this tutorial from [Guar Next, import the necessary packages and create a Cohere client. -```python +```python PYTHON import cohere import guardrails as gd from guardrails.hub import ValidRange, ValidChoices @@ -128,7 +128,7 @@ co = cohere.Client(api_key="COHERE_API_KEY") Next, we define the output schema that defines what the LLM response should look like. As mentioned earlier, Guardrails provides an option to define the schema using Pydantic. We’ll use this option, and below is the schema we’ll use for the doctor notes extraction task. -```python +```python PYTHON class Symptom(BaseModel): symptom: str = Field(..., description="Symptom that a patient is experiencing") affected_area: str = Field( @@ -159,7 +159,7 @@ Next, we initialize a `Guard` object based on the schema we have defined. First, we define the base instruction prompt for the LLM as follows. -```python +```python PYTHON PROMPT = """Given the following doctor's notes about a patient, please extract a dictionary that contains the patient's information. @@ -171,7 +171,7 @@ ${gr.complete_json_suffix_v2} Then, we initialize a `Guard` object from the `PatientInfo` Pydantic model. -```python +```python PYTHON # Initialize a Guard object from the Pydantic model PatientInfo guard = gd.Guard.from_pydantic(PatientInfo, prompt=PROMPT) print(guard.base_prompt) @@ -223,7 +223,7 @@ Here are examples of simple (XML, JSON) pairs that show the expected behavior: We’re ready to run an LLM call using the Cohere `Generate` endpoint. For this, we wrap the LLM call with the `Guard` object. This means it will take care of the validation and reasking (if any) until the final generated output fulfills the defined schema. -```python +```python PYTHON # Wrap the Cohere API call with the `guard` object response = guard( co.chat, @@ -238,7 +238,7 @@ print(response.validated_output) And we get the final validated output as follows. -```json +```json JSON { 'gender': 'Male', 'age': 49, @@ -249,13 +249,13 @@ And we get the final validated output as follows. Behind the scenes, Guardrails performs the validation step on the output against the schema, raises any errors if there are mismatches, and triggers a reask. We can trace the execution steps as follows. -```python +```python PYTHON guard.history.last.tree ``` The LLM call first returned the following response. However, notice that the `affected_area` field returned `"Face & Head"`, which did not fall within the options we had defined earlier (any of `"Head", "Face", "Neck", or "Chest"`). -```json +```json JSON { "gender": "Male", "age": 49, @@ -276,7 +276,7 @@ The LLM call first returned the following response. However, notice that the `af Guardrails captured this discrepancy by raising a `FieldReAsk` object containing the incorrect value, the error message, and other additional information. 
-```json +```json JSON { 'gender': 'Male', 'age': 49, @@ -364,7 +364,7 @@ I was given the following JSON response, which had problems due to incorrect val Guardrails then generated the final validated output, which now completely fulfills the schema. -```json +```json JSON { "gender": "Male", "age": 49, diff --git a/fern/pages/llm-university/intro-semantic-search/deeper-semantic-search.mdx b/fern/pages/llm-university/intro-semantic-search/deeper-semantic-search.mdx index 9f6a516bd..089ad29e0 100644 --- a/fern/pages/llm-university/intro-semantic-search/deeper-semantic-search.mdx +++ b/fern/pages/llm-university/intro-semantic-search/deeper-semantic-search.mdx @@ -29,7 +29,7 @@ As you've seen before, semantic search goes way beyond keyword search. The appli ### 1. Download the Dependencies -```python +```python PYTHON #title Import libraries (Run this cell to execute required code) {display-mode: "form"} import cohere @@ -51,7 +51,7 @@ pd.set_option('display.max_colwidth', None) We'll use the trec dataset which is made up of questions and their categories. -```python +```python PYTHON # Get dataset dataset = load_dataset("trec", split="train") # Import into a pandas dataframe, take only the first 1000 rows @@ -81,7 +81,7 @@ Let's now embed the text of the questions. To get a thousand embeddings of this length should take a few seconds. -```python +```python PYTHON # Paste your API key here. Remember to not share publicly api_key = '' @@ -99,7 +99,7 @@ embeds = co.embed(texts=list(df['text']), Let's build an index using the library called annoy. Annoy is a library created by Spotify to do nearest neighbour search; nearest neighbour search is an optimization problem of finding the point in a given set that is closest (or most similar) to a given point. -```python +```python PYTHON # Create the search index, pass the size of embedding search_index = AnnoyIndex(np.array(embeds).shape[1], 'angular') # Add all the vectors to the search index @@ -115,7 +115,7 @@ After building the index, we can use it to retrieve the nearest neighbours eithe If we're only interested in measuring the similarities between the questions in the dataset (no outside queries), a simple way is to calculate the similarities between every pair of embeddings we have. -```python +```python PYTHON # Choose an example (we'll retrieve others similar to it) example_id = 92 # Retrieve nearest neighbors @@ -128,7 +128,7 @@ print(f"Question:'{df.iloc[example_id]['text']}'\nNearest neighbors:") results ``` -```python +```python PYTHON # Output: Question:'What are bear and bull markets ?' Nearest neighbors: @@ -150,7 +150,7 @@ Nearest neighbors: We're not limited to searching using existing items. If we get a query, we can embed it and find its nearest neighbours from the dataset. -```python +```python PYTHON query = "What is the tallest mountain in the world?" # Get the query's embedding @@ -184,7 +184,7 @@ results ### 5. 
Visualize the archive -```python +```python PYTHON #@title Plot the archive {display-mode: "form"} # UMAP reduces the dimensions from 1024 to 2 dimensions that we can plot diff --git a/fern/pages/llm-university/intro-semantic-search/dense-retrieval.mdx b/fern/pages/llm-university/intro-semantic-search/dense-retrieval.mdx index 6772ca1bb..222aa76ad 100644 --- a/fern/pages/llm-university/intro-semantic-search/dense-retrieval.mdx +++ b/fern/pages/llm-university/intro-semantic-search/dense-retrieval.mdx @@ -34,7 +34,7 @@ In short, dense retrieval consists of the following: To use dense retrieval, we’ll first define the following function. Just like with keyword search, we’ll tell the vector database what properties we want from each retrieved document, and filter them to the English language (using results_lang). -```python +```python PYTHON def dense_retrieval(query, results_lang='en', num_results=10): nearText = {"concepts": [query]} diff --git a/fern/pages/llm-university/intro-semantic-search/fine-tuning-for-rerank.mdx b/fern/pages/llm-university/intro-semantic-search/fine-tuning-for-rerank.mdx index b94db29b9..b7818e895 100644 --- a/fern/pages/llm-university/intro-semantic-search/fine-tuning-for-rerank.mdx +++ b/fern/pages/llm-university/intro-semantic-search/fine-tuning-for-rerank.mdx @@ -20,7 +20,7 @@ To understand the importance of domain-specific training, we will work with a co We'll start by importing the tools we'll need. -```python +```python PYTHON import os import cohere import json @@ -32,7 +32,7 @@ from datasets import load_dataset Next, we'll instantiate a Cohere client. -```python +```python PYTHON # Paste your API key here. Remember to not share publicly co = cohere.Client("COHERE_API_KEY") ``` @@ -46,7 +46,7 @@ We'll work with [the CaseHOLD dataset](https://huggingface.co/datasets/casehold/ We'll work with an [IterableDataset](https://huggingface.co/docs/datasets/en/about_mapstyle_vs_iterable) and load only a small fraction of examples at a time to avoid loading the entire dataset in memory. -```python +```python PYTHON iterable_dataset = load_dataset("casehold/casehold", split="train", streaming=True) ``` @@ -60,7 +60,7 @@ The data is stored in a Pandas DataFrame `df` with 5 columns: - `"relevant_passages"` - The document that correctly answers the query - `"hard_negatives"` - The four documents that don't correctly answer the query -```python +```python PYTHON # Size of data subset num_examples = 420 @@ -90,7 +90,7 @@ df = pd.DataFrame(d) We next split the data into training (in `df_train`), validation (in `df_valid`), and test (in `df_test`) sets. -```python +```python PYTHON # Set number of examples for train-valid-test split train_num = 256 valid_num = 154 @@ -113,7 +113,7 @@ To get predictions, we'll use the [`rerank()` method](/reference/rerank-1) of th - `documents` - List of documents to choose from - `top_n` - Number of documents to return -```python +```python PYTHON # Predict index of document that correctly answers query def get_prediction(item, model="rerank-english-v3.0"): response = co.rerank(model=model, @@ -126,7 +126,7 @@ def get_prediction(item, model="rerank-english-v3.0"): We apply this function to every row in the test set and save the predictions in a new column `"baseline_prediction"`. Then, to calculate the test accuracy, we compare the predictions to the ground truth labels in the `"label"` column.
-```python +```python PYTHON # Calculate pre-trained model's test accuracy df_test["baseline_prediction"] = df_test.apply(get_prediction, axis=1) print("Baseline accuracy:", sum(df_test["baseline_prediction"] == df_test["label"])/len(df_test)) @@ -144,7 +144,7 @@ To prepare for fine-tuning with the Rerank endpoint, we'll need to convert the d We do this separately for training and validation data. You can learn more about preparing the Rerank fine-tuning data in [the documentation](/docs/rerank-preparing-the-data). -```python +```python PYTHON # Arranges the data to suit Cohere's format def create_rerank_ft_data(query, relevant_passages, hard_negatives): formatted_data = { @@ -206,7 +206,7 @@ Navigate to the API tab of the fine-tuned model. There, you'll see the model ID In the following code, we calculate the test accuracy of the fine-tuned model. We use the same `get_prediction()` function as before, but now just need to pass in the fine-tuned model ID. -```python +```python PYTHON # Calculate fine-tuned model's test accuracy df_test['ft_prediction'] = df_test.apply(get_prediction, model='9f22e08a-f1ab-4cee-9524-607dcb08c954-ft', axis=1) print("Fine-tune accuracy:", sum(df_test["ft_prediction"] == df_test["label"])/len(df_test)) diff --git a/fern/pages/llm-university/intro-semantic-search/generating-answers.mdx b/fern/pages/llm-university/intro-semantic-search/generating-answers.mdx index dc9c79284..aa834fae3 100644 --- a/fern/pages/llm-university/intro-semantic-search/generating-answers.mdx +++ b/fern/pages/llm-university/intro-semantic-search/generating-answers.mdx @@ -42,7 +42,7 @@ The answer to this question is five: _Marie Curie, Linus Pauling, John Bardeen, The way to ask this to the model is with the following line of code, which calls the `co.chat` endpoint. -```python +```python PYTHON prediction_without_search = [ co.chat( message=query, @@ -69,7 +69,7 @@ Instead, let’s first search for the answer using what we’ve learned in the p In order to find the answer to this question in the Wikipedia dataset (the one we’ve been working with throughout this post), we can use the same `dense_retrieval` function that we used before. For simplicity, we will only use dense retrieval without Rerank, but we invite you to add it to the lab and see how the results improve! -```python +```python PYTHON responses = dense_retrieval(query, num_results=20) ``` @@ -87,13 +87,13 @@ Next, we’ll feed these 20 paragraphs to a generative model, and instruct it to In order to get the generative model to answer a question based on a certain context, we need to create a prompt. And in this prompt, we need to give it a command and a context. The context will be the concatenation of all the paragraphs retrieved in the search step, which we can obtain using this line of code: -```python +```python PYTHON context = [r['text'] for r in responses] ``` The array `context` contains a lot of text, and, given the good results we’ve been obtaining with search mechanisms, we are fairly confident that somewhere in this text lies the answer to our original question. Now, we invoke the `Chat` endpoint. The prompt we’ll use is the following. -```python +```python PYTHON prompt = f""" Use the information provided below to answer the questions at the end. If the answer to the question is not contained in the provided information, say "The answer is not in the context". --- @@ -106,7 +106,7 @@ Question: How many people have won more than one Nobel prize? 
In other words, we’ve prompted the model to answer the question, but only from information coming from the `context` array. And if the information is not there, we are prompting the model to state that the answer is not in the context. The following line of code will run the prompt. As before, we generate 5 answers, and `max_tokens` controls the length of each answer. -```python +```python PYTHON prediction_with_search = [ co.chat( message=prompt, diff --git a/fern/pages/llm-university/intro-semantic-search/keyword-search.mdx b/fern/pages/llm-university/intro-semantic-search/keyword-search.mdx index 2f7ef7147..8f25411ab 100644 --- a/fern/pages/llm-university/intro-semantic-search/keyword-search.mdx +++ b/fern/pages/llm-university/intro-semantic-search/keyword-search.mdx @@ -45,7 +45,7 @@ client = weaviate.Client( To use keyword matching, we’ll first define the following function for keyword search. In this function, we’ll tell the vector database what properties we want from each retrieved document. We’ll also filter them to the English language (using results_lang), but feel free to explore searching in other languages as well! -```python +```python PYTHON def keyword_search(query, results_lang='en', num_results=10): properties = ["text", "title", "url", "views", "lang", "_additional {distance}"] diff --git a/fern/pages/llm-university/intro-semantic-search/reranking-2.mdx b/fern/pages/llm-university/intro-semantic-search/reranking-2.mdx index 923b51fd0..ccd6adaa0 100644 --- a/fern/pages/llm-university/intro-semantic-search/reranking-2.mdx +++ b/fern/pages/llm-university/intro-semantic-search/reranking-2.mdx @@ -58,7 +58,7 @@ These could contain the answer somewhere in the document, but they are certainly Ok, there’s a high chance that the answer is there. Let’s see if Rerank can help us find it. The following function calls the Rerank endpoint. Its inputs are the query, the responses, and the number of responses we’d like to retrieve. -```python +```python PYTHON def rerank_responses(query, responses, num_responses=3): reranked_responses = co.rerank( query = query, diff --git a/fern/pages/llm-university/intro-text-generation/building-a-chatbot.mdx b/fern/pages/llm-university/intro-text-generation/building-a-chatbot.mdx index 79cb30e25..cd920cac5 100644 --- a/fern/pages/llm-university/intro-text-generation/building-a-chatbot.mdx +++ b/fern/pages/llm-university/intro-text-generation/building-a-chatbot.mdx @@ -18,7 +18,7 @@ Additionally, [the API reference](/reference/chat?ref=txt.cohere.com) page conta To set up, we first import the Cohere module and create a client. -```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` @@ -31,7 +31,7 @@ Here’s an example. We call the endpoint with "Hello" as the user message. In o Right now, we’re interested in the main content of the response, which is stored in the `text` value of the response. -```python +```python PYTHON response = co.chat(message="Hello", model="command-r-plus") print(response.text) ``` @@ -52,7 +52,7 @@ In the quickstart example, we didn’t have to define a preamble because a defau Here’s an example. We added a preamble telling the chatbot to assume the persona of an expert public speaking coach. As a result, we get a response that adopts that persona. -```python +```python PYTHON response = co.chat(message="Hello", model="command-r-plus", preamble="You are an expert public speaking coach. 
Don't use any greetings.") @@ -77,7 +77,7 @@ In streaming mode, the endpoint will generate a series of objects. To get the ac If you have not already, make your own copy of the Google Colaboratory notebook and run the code in this section to see the same example with streamed responses activated. -```python +```python PYTHON stream = co.chat_stream(message="Hello. I'd like to learn about techniques for effective audience engagement", model="command-r-plus", preamble="You are an expert public speaking coach") @@ -132,7 +132,7 @@ Putting everything together, let’s now build a simple chat interface that take As described before, in streaming mode, the Chat endpoint generates a series of objects. To get the conversation history, we take the object with `event_type` of `"stream-end"` and save it as a new variable `chat_history`. -```python +```python PYTHON import uuid # Create a conversation ID @@ -213,7 +213,7 @@ Ending chat. Next, we print the full conversation history. -```python +```python PYTHON for chat in chat_history: print(chat) ``` @@ -233,7 +233,7 @@ If you opt not to use the endpoint’s conversation history persistence feature, The chat history is a list of multiple turns of messages from the user and the chatbot. Each item is a `cohere.ChatMessage` object containing the `role`, which can be either `”USER”` or `”CHATBOT”`, and the `message` containing the message string. The following is an example of a chat history. -```python +```python PYTHON from cohere import ChatMessage chat_history = [ @@ -247,7 +247,7 @@ chat_history = [ The following modifies the previous implementation by using `chat_history` instead of `conversation_id` for managing the conversation history. -```python +```python PYTHON # Initialize the chat history chat_history = [] diff --git a/fern/pages/llm-university/intro-text-generation/creating-custom-models.mdx b/fern/pages/llm-university/intro-text-generation/creating-custom-models.mdx index 2c1c31604..3ae861fd6 100644 --- a/fern/pages/llm-university/intro-text-generation/creating-custom-models.mdx +++ b/fern/pages/llm-university/intro-text-generation/creating-custom-models.mdx @@ -59,7 +59,7 @@ The finetuning feature runs on the `command` model family, trained to follow use The finetuning format takes JSONL files. We provide for each training example a formatted prompt, alongside a completion. Here is what the file should look like: -```json +```json JSON {"prompt": "This is example prompt #1", "completion": "This is the completion example #1"} {"prompt": "This is example prompt #2", "completion": "This is the completion example #2"} ... @@ -132,7 +132,7 @@ The training will take some time, and once it’s done, you will receive an emai Using a custom model is as simple as substituting the baseline model with the model ID (replace the ID shown below with your model ID, which you can get from the dashboard). -```python +```python PYTHON response = co.generate( model='26db2994-cf88-4243-898d-31258411c120-ft', # REPLACE WITH YOUR MODEL ID prompt="""Turn the following message to a virtual assistant into the correct action: @@ -176,7 +176,7 @@ can you pick me up tonight to go to the concert together We run the following code for each of the baseline (`command`) and the finetuned (using the model ID) models. 
We make API calls for the two models through four temperature values (0.0, 0.5, 1.0, and 1.5) and three generations each: -```python +```python PYTHON # Create a function to call the endpoint def generate_text(prompt,temperature,num_gens): response = co.generate( diff --git a/fern/pages/llm-university/intro-text-generation/fine-tuning-for-chat.mdx b/fern/pages/llm-university/intro-text-generation/fine-tuning-for-chat.mdx index 62334774c..7068ffcc2 100644 --- a/fern/pages/llm-university/intro-text-generation/fine-tuning-for-chat.mdx +++ b/fern/pages/llm-university/intro-text-generation/fine-tuning-for-chat.mdx @@ -100,7 +100,7 @@ When you're ready to use the fine-tuned model, navigate to the API tab. There, y In the following code, we supply a message from the test dataset to both the pre-trained and fine-tuned models for comparison. -```python +```python PYTHON user_message = "Make the text coherent: Pimelodella kronei is a species of three-barbeled catfish endemic to Brazil. Discovered by the German naturalist Sigismund Ernst Richard Krone, Pimelodella kronei was the first troglobitic species described in Brazil, but several others have been described later." # Desired response: Pimelodella kronei is a species of three-barbeled catfish endemic to Brazil. Discovered by the German naturalist Sigismund Ernst Richard Krone, it was the first troglobitic fish described in Brazil, but several others have been described later. @@ -145,7 +145,7 @@ We have demonstrated that the fine-tuned model can provide good answers to indiv To see this, we will borrow from the code in the [Building a Chatbot chapter](/docs/building-a-chatbot) to build a simple chat interface. The only change we need to make is to supply the model nickname when using `co.chat_stream()` to ensure that we are chatting with the model we just fine-tuned. -```python +```python PYTHON # Create a conversation ID import uuid conversation_id = str(uuid.uuid4()) diff --git a/fern/pages/llm-university/intro-text-generation/parameters-for-controlling-outputs.mdx b/fern/pages/llm-university/intro-text-generation/parameters-for-controlling-outputs.mdx index 160d4e664..e2fbfe21a 100644 --- a/fern/pages/llm-university/intro-text-generation/parameters-for-controlling-outputs.mdx +++ b/fern/pages/llm-university/intro-text-generation/parameters-for-controlling-outputs.mdx @@ -16,7 +16,7 @@ As you’ll learn, the Command model has many variations to select from, where e To set up, we first import the Cohere module and create a client. -```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` @@ -34,7 +34,7 @@ With the [Chat endpoint](/reference/chat) , you can choose from several variatio Use the `model` parameter to select a variation that suits your requirements. In the code cell, we select `command-r`. -```python +```python PYTHON response = co.chat(message="Hello", model="command-r-plus") print(response.text) @@ -75,7 +75,7 @@ The temperature parameter is a value between 0 and 1. As you increase the temper Let’s look at a code example, where we suggest that the model generate alternative names for a blog post. Prompting the endpoint five times when the temperature is set to 0 yields the same output each time. -```python +```python PYTHON message = """Suggest a more exciting title for a blog post titled: Intro to Retrieval-Augmented Generation. 
\ Respond in a single line.""" @@ -98,7 +98,7 @@ The Future of AI: Unlocking the Power of Retrieval-Augmented Generation However, if we increase the temperature to the maximum value of 1, the model gives different proposals. -```python +```python PYTHON message = """Suggest a more exciting title for a blog post titled: Intro to Retrieval-Augmented Generation. \ Respond in a single line.""" diff --git a/fern/pages/llm-university/intro-text-generation/prompt-engineering-basics.mdx b/fern/pages/llm-university/intro-text-generation/prompt-engineering-basics.mdx index 0e4299bd6..d5dbfcb15 100644 --- a/fern/pages/llm-university/intro-text-generation/prompt-engineering-basics.mdx +++ b/fern/pages/llm-university/intro-text-generation/prompt-engineering-basics.mdx @@ -18,14 +18,14 @@ Coming up with a good prompt is a bit of both science and art. On the one hand, To set up, we first import the Cohere module and create a client. -```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` Let's also define a function `generate_text()` to take a user message, call the Chat endpoint, and stream the response. -```python +```python PYTHON def generate_text(message): stream = co.chat_stream(message=message, model="command-r-plus") for event in stream: @@ -39,7 +39,7 @@ The best way to design prompts for a model like [Command](https://cohere.com/mod For instance, let’s say that we are creating the product description copy for a wireless earbuds product. We can write the prompt as follows. -```python +```python PYTHON generate_text("Generate a concise product description for the product: wireless earbuds.") ``` @@ -59,7 +59,7 @@ But perhaps we want to be more specific regarding what we want the output to loo Let’s say we want the model to write the product description in a particular format with specific information. In this case, we can append this specific instruction in the prompt. -```python +```python PYTHON generate_text(""" Generate a concise product description for the product: wireless earbuds. Use the following format: Hook, Solution, Features and Benefits, Call to Action. @@ -91,7 +91,7 @@ The model returns an output following the format that we wanted. The prompt can also be constructed as a combination of an instruction and some context. Let’s see this in action with another example: emails. We can create a prompt to summarize an email, which is included in the prompt for context. -```python +```python PYTHON generate_text(""" Summarize this email in one sentence. Dear [Team Members], @@ -117,7 +117,7 @@ This instruction–context prompt format is extremely useful as it means that we Let's move to another example — an extraction task, which a generative model can do very well. Given context, which in this case is a description of a movie, we want the model to extract the movie title. -```python +```python PYTHON generate_text(""" Extract the movie title from the text below. Deadpool 2 | Official HD Deadpool's "Wet on Wet" Teaser | 2018 @@ -136,7 +136,7 @@ The model is also effective at tasks that involve taking a piece of text and rew Here's an example. We have a one-line instruction followed by the context, which in this case is a blog excerpt. The instruction is to generate a list of frequently asked questions (FAQ) based on the passage, which involves a mixture of several tasks, such as extraction and rewriting. 
-```python +```python PYTHON generate_text(""" Given the following text, write down a list of potential frequently asked questions (FAQ), together with the answers. The Cohere Platform provides an API for developers and organizations to access cutting-edge LLMs without needing machine learning know-how. diff --git a/fern/pages/llm-university/intro-text-generation/the-generate-endpoint.mdx b/fern/pages/llm-university/intro-text-generation/the-generate-endpoint.mdx index 3c2b32fa8..f975a6075 100644 --- a/fern/pages/llm-university/intro-text-generation/the-generate-endpoint.mdx +++ b/fern/pages/llm-university/intro-text-generation/the-generate-endpoint.mdx @@ -16,7 +16,7 @@ Here’s a quick look at how to generate a piece of text via the endpoint. It’ We enter a prompt: -```python +```python PYTHON response = co.generate( model='command', prompt='Generate a social ad copy for the product: Wireless Earbuds.', @@ -65,7 +65,7 @@ First, if you haven’t done so already, you can SVM video. -```python +```python PYTHON # Train the classifier with Support Vector Machine (SVM) algorithm # import SVM classifier code @@ -50,7 +50,7 @@ svm_classifier.fit(features, label) Once that is done, we’ll take the embeddings of the 9 data points, put them through the trained model, and get the class predictions on the other side. And with this small test dataset, we get all predictions correct. -```python +```python PYTHON # Predict with test data # Prepare the test inputs diff --git a/fern/pages/llm-university/intro-text-representation/classify-endpoint.mdx b/fern/pages/llm-university/intro-text-representation/classify-endpoint.mdx index 4c23f3962..5e29f7455 100644 --- a/fern/pages/llm-university/intro-text-representation/classify-endpoint.mdx +++ b/fern/pages/llm-university/intro-text-representation/classify-endpoint.mdx @@ -40,7 +40,7 @@ A typical machine learning model requires many training examples to perform text Our sentiment analysis classifier has three classes with five examples each: “Positive” for a positive sentiment, “Negative” for a negative sentiment, and “Neutral” for a neutral sentiment. The code looks as follows. The examples: -```python +```python PYTHON from cohere import ClassifyExample examples = [ @@ -64,7 +64,7 @@ examples = [ The inputs (we have twelve in this example): -```python +```python PYTHON inputs=["Hello, world! What a beautiful day", "It was a great time with great people", "Great place to work", @@ -86,7 +86,7 @@ With the Classify endpoint, setting up the model is quite straightforward. The m Putting everything together with the Classify endpoint looks like the following: -```python +```python PYTHON co = cohere.Client(api_key) def classify_text(inputs, examples): diff --git a/fern/pages/llm-university/intro-text-representation/clustering-hacker-news-posts.mdx b/fern/pages/llm-university/intro-text-representation/clustering-hacker-news-posts.mdx index 3ededebb8..f3b9c080f 100644 --- a/fern/pages/llm-university/intro-text-representation/clustering-hacker-news-posts.mdx +++ b/fern/pages/llm-university/intro-text-representation/clustering-hacker-news-posts.mdx @@ -56,7 +56,7 @@ The next step is to embed these titles so we can examine the dataset based on th As you've seen before, Cohere’s embed endpoint gives us vector representations from a large embedding language model specifically tuned for text embedding (as opposed to word embedding or text generation). 
-```python +```python PYTHON embeds = co.embed(texts=list_of_posts, model="small", truncate="LEFT").embeddings @@ -70,7 +70,7 @@ Next, we’ll reduce the embeddings down to two dimensions so we can plot them a The UMAP call looks like this: -```python +```python PYTHON import umap reducer = umap.UMAP(n_neighbors=100) @@ -135,7 +135,7 @@ Let’s now cluster these posts to understand their overall hierarchy. The goal Just like we did in a previous chapter, we can use K-Means clustering on the original embeddings to create eight clusters (more on K-Means clustering on this video). -```python +```python PYTHON from sklearn.cluster import KMeans # Pick the number of clusters @@ -166,7 +166,7 @@ We can use a hierarchical plot to better understand the hierarchy of the cluster For this plot, we use the hierarchy package from scipy: -```python +```python PYTHON Z = hierarchy.linkage(kmeans_model.cluster_centers_, 'single') dn = hierarchy.dendrogram(Z, orientation='right', labels=label_list) diff --git a/fern/pages/llm-university/intro-text-representation/clustering-using-embeddings.mdx b/fern/pages/llm-university/intro-text-representation/clustering-using-embeddings.mdx index 241dce530..b2f2386e5 100644 --- a/fern/pages/llm-university/intro-text-representation/clustering-using-embeddings.mdx +++ b/fern/pages/llm-university/intro-text-representation/clustering-using-embeddings.mdx @@ -26,7 +26,7 @@ Implementation-wise, we use the K-means algorithms to cluster these data points Other than providing the embeddings, the only other key information we need to provide for the algorithm is the number of clusters we want to find. This is normally larger in actual applications, but since our dataset is small, we’ll set the number of clusters to 2. -```python +```python PYTHON from sklearn.cluster import KMeans # Pick the number of clusters diff --git a/fern/pages/llm-university/intro-text-representation/clustering-with-embeddings.mdx b/fern/pages/llm-university/intro-text-representation/clustering-with-embeddings.mdx index f9d370696..37b0c1ef1 100644 --- a/fern/pages/llm-university/intro-text-representation/clustering-with-embeddings.mdx +++ b/fern/pages/llm-university/intro-text-representation/clustering-with-embeddings.mdx @@ -35,7 +35,7 @@ Let’s look at an example using the same 9 data points. We embed the documents using the same `get_embeddings()` function as before, but now we set `input_type="clustering"` because we'll use the embeddings for clustering. -```python +```python PYTHON # Embed the text for clustering df['clustering_embeds'] = get_embeddings(df['query'].tolist(), input_type="clustering") embeds = np.array(df['clustering_embeds'].tolist()) @@ -47,7 +47,7 @@ Implementation-wise, we use the K-means algorithms to cluster these data points Other than providing the embeddings, the only other key information we need to provide for the algorithm is the number of clusters we want to find. This is normally larger in actual applications, but since our dataset is small, we’ll set the number of clusters to 2. 
-```python +```python PYTHON # Pick the number of clusters n_clusters = 2 diff --git a/fern/pages/llm-university/intro-text-representation/embed-endpoint.mdx b/fern/pages/llm-university/intro-text-representation/embed-endpoint.mdx index e60fed277..86ee3a430 100644 --- a/fern/pages/llm-university/intro-text-representation/embed-endpoint.mdx +++ b/fern/pages/llm-university/intro-text-representation/embed-endpoint.mdx @@ -21,7 +21,7 @@ For the setup, please refer to the Se The dataset we'll use is formed of 50 top search terms on the web about "Hello, World!". -```python +```python PYTHON df = pd.read_csv("https://github.com/cohere-ai/notebooks/raw/main/notebooks/data/hello-world-kw.csv", names=["search_term"]) df.head() ``` @@ -53,7 +53,7 @@ Here's how to use the Embed endpoint: The code looks like this: -```python +```python PYTHON def embed_text(texts): output = co.embed( model="embed-english-v3.0", @@ -79,7 +79,7 @@ To understand what these numbers represent, there are techniques we can use to c We can make use of the UMAP technique to do this. The code is as follows: -```python +```python PYTHON # If you don't have umap installed, pleased run `pip install umap-learn` first! import umap embeds = list(df["search_term_embeds"]) diff --git a/fern/pages/llm-university/intro-text-representation/few-shot-classification.mdx b/fern/pages/llm-university/intro-text-representation/few-shot-classification.mdx index f26f1fe1c..b05a3eef5 100644 --- a/fern/pages/llm-university/intro-text-representation/few-shot-classification.mdx +++ b/fern/pages/llm-university/intro-text-representation/few-shot-classification.mdx @@ -22,14 +22,14 @@ In this chapter, you'll learn to classify text based on sentiment using Cohere's To set up, we first import several tools. -```python +```python PYTHON import cohere from cohere import ClassifyExample ``` We also create a Cohere client. -```python +```python PYTHON co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` @@ -54,7 +54,7 @@ Our sentiment analysis classifier has three classes with five examples each: “ The examples: -```python +```python PYTHON examples = [ClassifyExample(text="I’m so proud of you", label="positive"), ClassifyExample(text="What a great time to be alive", label="positive"), ClassifyExample(text="That’s awesome work", label="positive"), @@ -74,7 +74,7 @@ examples = [ClassifyExample(text="I’m so proud of you", label="positive"), The inputs (we have twelve in this example): -```python +```python PYTHON inputs = ["Hello, world! What a beautiful day", "It was a great time with great people", "Great place to work", @@ -95,7 +95,7 @@ With the Classify endpoint, setting up the model is quite straightforward. The m Putting everything together with the Classify endpoint looks like the following: -```python +```python PYTHON def classify_text(inputs, examples): """ Classifies a list of input texts given the examples diff --git a/fern/pages/llm-university/intro-text-representation/fine-tuning-for-classification.mdx b/fern/pages/llm-university/intro-text-representation/fine-tuning-for-classification.mdx index 497dceb18..c906ab116 100644 --- a/fern/pages/llm-university/intro-text-representation/fine-tuning-for-classification.mdx +++ b/fern/pages/llm-university/intro-text-representation/fine-tuning-for-classification.mdx @@ -20,7 +20,7 @@ In this chapter, you'll learn how to fine-tune a model for classification. To set up, we first import several tools. 
-```python +```python PYTHON import os import json import numpy as np @@ -33,7 +33,7 @@ from sklearn.metrics import f1_score We also import the Cohere module and create a client. -```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` @@ -42,14 +42,14 @@ co = cohere.Client("COHERE_API_KEY") # Your Cohere API key We'll use the [Airline Travel Information System (ATIS)](https://www.kaggle.com/datasets/hassanamin/atis-airlinetravelinformationsystem?select=atis_intents_train.csv) intent classification dataset \[[source](https://aclanthology.org/H90-1021/)]. For demonstration purposes, we’ll take just a small portion of the dataset: 1,000 data points in total. -```python +```python PYTHON # Load the dataset to a dataframe df = pd.read_csv('https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/atis_subset.csv', names=['query','intent']) ``` The first thing we need is to create a training dataset, to be used for building the classifier, and a test dataset, to be used for testing the classifier performance. We will use 800 and 200 data points for these datasets, respectively. -```python +```python PYTHON # Split the dataset into training and test portions df_train, df_test = train_test_split(df, test_size=200, random_state=21) ``` @@ -69,7 +69,7 @@ Our goal is to train the classifier so it can predict the class of a new custome We transform the data to JSONL format to match the style expected by the Classification endpoint ([documentation](/docs/classify-preparing-the-data)). -```python +```python PYTHON def create_classification_data(text, label): formatted_data = { "text": text, @@ -130,7 +130,7 @@ Once the model has finished fine-tuning, it’s time to evaluate its performance We fill in the model ID to generate test predictions. -```python +```python PYTHON # Generate classification predictions on the test dataset using the finetuned model # Classification function @@ -152,7 +152,7 @@ for i in range(0, len(df_test), BATCH_SIZE): Next, we calculate the model's test accuracy and F1 score. -```python +```python PYTHON # Compute metrics on the test dataset accuracy = accuracy_score(df_test["intent"], y_pred) f1 = f1_score(df_test["intent"], y_pred, average='weighted') diff --git a/fern/pages/llm-university/intro-text-representation/finetuning.mdx b/fern/pages/llm-university/intro-text-representation/finetuning.mdx index 6530dd475..cec589334 100644 --- a/fern/pages/llm-university/intro-text-representation/finetuning.mdx +++ b/fern/pages/llm-university/intro-text-representation/finetuning.mdx @@ -63,7 +63,7 @@ When you go to the custom model’s page, you can see a few evaluation metrics s Once finetuning is complete, we’ll re-generate the embeddings, now using the finetuned model. -```python +```python PYTHON # Embedding API call def get_embeddings(texts,model): output = co.embed( diff --git a/fern/pages/llm-university/intro-text-representation/introduction-to-semantic-search.mdx b/fern/pages/llm-university/intro-text-representation/introduction-to-semantic-search.mdx index 6c1abac45..55149ba4f 100644 --- a/fern/pages/llm-university/intro-text-representation/introduction-to-semantic-search.mdx +++ b/fern/pages/llm-university/intro-text-representation/introduction-to-semantic-search.mdx @@ -41,7 +41,7 @@ Implementation-wise, there are many ways we can approach this. 
And in our case, We embed the query using the same `get_embeddings()` function as before, but now we set `input_type="search_query"` because we're embedding a search query that we want to compare to the embedded documents. -```python +```python PYTHON # Define new query new_query = "How can I find a taxi or a bus when the plane lands?" @@ -53,7 +53,7 @@ new_query_embeds = get_embeddings([new_query], input_type="search_query")[0] We define and use a function `get_similarity()` that employs cosine similarity to determine how similar the documents are to our query. -```python +```python PYTHON # Calculate cosine similarity between the search query and existing queries def get_similarity(target, candidates): # Turn list into array @@ -76,7 +76,7 @@ similarity = get_similarity(new_query_embeds, embeds[:sample]) We'll then view the documents in decreasing order of similarity. -```python +```python PYTHON # View the top 5 articles print('Query:') print(new_query,'\n') diff --git a/fern/pages/llm-university/intro-text-representation/introduction-to-text-embeddings.mdx b/fern/pages/llm-university/intro-text-representation/introduction-to-text-embeddings.mdx index 5b42353b5..95fd4d4da 100644 --- a/fern/pages/llm-university/intro-text-representation/introduction-to-text-embeddings.mdx +++ b/fern/pages/llm-university/intro-text-representation/introduction-to-text-embeddings.mdx @@ -25,7 +25,7 @@ In this chapter, we take a visual approach to understand the intuition behind te To set up, we first import several tools. We'll use the same notebook for the next several chapters, and we'll import everything we need here. -```python +```python PYTHON import pandas as pd import numpy as np import altair as alt @@ -36,7 +36,7 @@ from sklearn.cluster import KMeans We also import the Cohere module and create a client. -```python +```python PYTHON import cohere co = cohere.Client("COHERE_API_KEY") # Your Cohere API key ``` @@ -45,7 +45,7 @@ co = cohere.Client("COHERE_API_KEY") # Your Cohere API key We'll work a subset of the [Airline Travel Information System (ATIS) intent classification dataset](https://www.kaggle.com/datasets/hassanamin/atis-airlinetravelinformationsystem?select=atis_intents_train.csv) \[[Source](https://aclanthology.org/H90-1021/)]. The following code loads the dataset into a pandas Dataframe `df` with a single column `"queries"` containing 91 inquiries coming to airline travel inquiry systems. -```python +```python PYTHON # Load the dataset to a dataframe df_orig = pd.read_csv('https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/atis_intents_train.csv', names=['intent','query']) @@ -97,7 +97,7 @@ Next, we embed each inquiry by calling Cohere’s [Embed endpoint](/reference/em The code looks like this: -```python +```python PYTHON def get_embeddings(texts, model='embed-english-v3.0', input_type="search_document"): output = co.embed( model=model, @@ -120,7 +120,7 @@ Let’s get some visual intuition about this by plotting these numbers in a heat The `get_pc()` function below does this via a technique called [Principal Component Analysis (PCA)](https://en.wikipedia.org/wiki/Principal_component_analysis), which reduces the number of dimensions in an embedding while retaining as much information as possible. We set `embeds_pc` to the ten-dimensional version of the document embeddings. 
-```python +```python PYTHON # Function to return the principal components def get_pc(arr, n): pca = PCA(n_components=n) diff --git a/fern/pages/llm-university/intro-text-representation/setting-up.mdx b/fern/pages/llm-university/intro-text-representation/setting-up.mdx index a5f7556d0..9e3ce6c97 100644 --- a/fern/pages/llm-university/intro-text-representation/setting-up.mdx +++ b/fern/pages/llm-university/intro-text-representation/setting-up.mdx @@ -33,7 +33,7 @@ In order to make API calls to the models, you need an API key. You can sign up f Once that is done, you can set up the Cohere client as follows. -```python +```python PYTHON import cohere co = cohere.Client("add_your_api_key_name") diff --git a/fern/pages/models/the-command-family-of-models/command-beta.mdx b/fern/pages/models/the-command-family-of-models/command-beta.mdx index 35ac77cd0..489f67e9e 100644 --- a/fern/pages/models/the-command-family-of-models/command-beta.mdx +++ b/fern/pages/models/the-command-family-of-models/command-beta.mdx @@ -50,20 +50,20 @@ Install the SDK, if you haven't already. Then, set up the Cohere client. -```python +```python PYTHON import cohere co = cohere.Client(api_key) ``` ### Create prompt -```python +```python PYTHON message = "Write an introductory paragraph for a blog post about language models." ``` ### Generate text -```python +```python PYTHON response = co.chat( model='command', message=message, diff --git a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx index 625c870cc..366aafd58 100644 --- a/fern/pages/models/the-command-family-of-models/command-r-plus.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r-plus.mdx @@ -36,7 +36,7 @@ Additionally, pre-training data has been included for the following 13 languages The model has been trained to respond in the language of the user. Here's an example: -```python +```python PYTHON co.chat( message="Écris une description de produit pour une voiture électrique en 50 à 75 mots" ) @@ -44,7 +44,7 @@ co.chat( And here's what the response might look like: -```text +```text TEXT Découvrez la voiture électrique qui va révolutionner votre façon de conduire. Avec son design élégant, cette voiture offre une expérience de conduite unique avec une accélération puissante et une autonomie impressionnante. Sa diff --git a/fern/pages/models/the-command-family-of-models/command-r.mdx b/fern/pages/models/the-command-family-of-models/command-r.mdx index f8049e6a3..3520ec697 100644 --- a/fern/pages/models/the-command-family-of-models/command-r.mdx +++ b/fern/pages/models/the-command-family-of-models/command-r.mdx @@ -36,7 +36,7 @@ Additionally, pre-training data has been included for the following 13 languages The model has been trained to respond in the language of the user. Here's an example: -```python +```python PYTHON co.chat( message="Écris une description de produit pour une voiture électrique en 50 à 75 mots" ) @@ -44,7 +44,7 @@ co.chat( And here's what the response might look like: -```text +```text TEXT Découvrez la voiture électrique qui va révolutionner votre façon de conduire. Avec son design élégant, cette voiture offre une expérience de conduite unique avec une accélération puissante et une autonomie impressionnante. 
Sa diff --git a/fern/pages/responsible-use/responsible-use/generation-benchmarks.mdx b/fern/pages/responsible-use/responsible-use/generation-benchmarks.mdx index 79c88b2a1..1f8960e50 100644 --- a/fern/pages/responsible-use/responsible-use/generation-benchmarks.mdx +++ b/fern/pages/responsible-use/responsible-use/generation-benchmarks.mdx @@ -21,35 +21,6 @@ Outputs from Classify can be used for classification and analysis tasks, such as Always refer to the [Usage Guidelines](/docs/usage-guidelines) for guidance on using the Cohere Platform responsibly. Additionally, please consult the following model-specific usage notes: -## Performance Benchmarks - -Performance has been evaluated on the following research benchmarks. These metrics are reported on the Extremely Large (Beta) model. To gain a deeper understanding of what these benchmarks mean for your use case, read more about [Hellaswag](https://aclanthology.org/P19-1472.pdf) and [COPA](https://people.ict.usc.edu/~gordon/copa.html). - -
- -| **Model** | **Benchmark** | **Metric** | **Statistic** | -| --------------- | ------------- | --------------------- | ------------- | -| Extremely Large | Hellaswag | Accuracy, Zero-Shot | 0.805 | -| | PIQA | Likelihood, Zero-Shot | 0.824 | - -## Safety Benchmarks - -Performance has been evaluated on the following safety-related research benchmarks. These metrics are reported on the Large model. - -| **Model** | **Benchmark** | **Metric** | **Statistic** | -| --------- | --------------------- | ---------------------------------------------- | ------------- | -| XLarge | Real Toxicity Prompts | Max Toxicity in 5K Unconditional Generations | 0.93 | -| | Real Toxicity Prompts | Percent Toxic in 10K Conditional Generations | 0.08 | -| | Real Toxicity Prompts | Percent Toxic in 10K "Challenging" Generations | 0.38 | -| Large | StereoSet | Stereotype Score | 51.95 | -| | StereoSet | Language Modeling Score | 80.92 | -| | BOLD | Gender Ratio (M:F) | 1.99 | -| | BOLD | Sentiment (+:-) | 3.21 | -| | BOLD | Regard (+:-) | 3.41 | -| | BOLD | Toxic samples in 1K | 5.4 | - -We are researching how to expand our safety benchmarking to the multilingual context; multilingual benchmarks will be introduced in the future. - ### Model Toxicity and Bias Language models learn the statistical relationships present in training datasets, which may include toxic language and historical biases along race, gender, sexual orientation, ability, language, cultural, and intersectional dimensions. We recommend that developers using the Generation model take model toxicity and bias into account and design applications carefully to avoid the following: diff --git a/fern/pages/responsible-use/responsible-use/representation-benchmarks.mdx b/fern/pages/responsible-use/responsible-use/representation-benchmarks.mdx index ad68ba4f2..593bef73e 100644 --- a/fern/pages/responsible-use/responsible-use/representation-benchmarks.mdx +++ b/fern/pages/responsible-use/responsible-use/representation-benchmarks.mdx @@ -8,27 +8,6 @@ updatedAt: "Fri Jun 21 2024 10:25:44 GMT+0000 (Coordinated Universal Time)" --- Here, you'll find safety benchmarks, intended use cases, and various other information pertaining to Cohere representation models. -## Safety Benchmarks - -Performance has been evaluated on the following safety-related research benchmarks. These metrics are reported for the Small model. - -| **Model** | **Benchmark** | **Metric** | **Statistic** | -| --------- | ------------- | ---------------------------------------------- | ------------- | -| Large | StereoSet | Stereotype Score | 65.8502 | -| | StereoSet | Language Modeling Score | 96.9383 | -| | SEAT | S3: EA/AA Names | \- | -| | SEAT | S6: Male/Female, Career | 0.3322 | -| | SEAT | S7: Male/Female, Math/Arts | 0.4046 | -| | SEAT | S8: Male/Female, Science/Arts | \- | -| | SEAT | S10: Young/Old | \- | -| | SEAT | Angry Black Woman Stereotype - Terms | \- | -| | SEAT | Heilman Double Bind - Male/Female, Achievement | \- | -| | SEAT | Heilman Double Bind - Male/Female, Likeable | \- | - -For StereoSet Stereotype Score, 50 is best. For Language Modeling Score, 100 is best. - -For SEAT tests, a dash "-" indicates no significant evidence of bias was found. Otherwise, a number indicates the bias effect size. We are researching how to expand our safety benchmarking to the multilingual context; multilingual benchmarks will be introduced in the future. 
- ## Intended Use Case Embeddings may be used for purposes such as estimating semantic similarity between two sentences, choosing a sentence which is most likely to follow another sentence, sentiment analysis, topic extraction, or categorizing user feedback. Performance of embeddings will vary across use cases depending on the language, dialect, subject matter, and other qualities of the represented text. diff --git a/fern/pages/text-embeddings/embed-jobs-api.mdx b/fern/pages/text-embeddings/embed-jobs-api.mdx index 6f6dd2776..316a94190 100644 --- a/fern/pages/text-embeddings/embed-jobs-api.mdx +++ b/fern/pages/text-embeddings/embed-jobs-api.mdx @@ -58,7 +58,7 @@ The Embed Jobs and Dataset APIs respect metadata through two fields: `keep_field As seen in the example above, the following would be a valid `create_dataset` call since `langs` is in the first entry but not in the second entry. The fields `wiki_id`, `url`, `views` and `title` are present in both JSONs. -```python +```python PYTHON # Upload a dataset for embed jobs ds=co.datasets.create( name='sample_file', @@ -80,7 +80,7 @@ Currently the dataset endpoint will accept `.csv` and `.jsonl` files - in both c The Embed Jobs API takes in `dataset IDs` as an input. Uploading a local file to the Datasets API with `dataset_type="embed-input"` will validate the data for embedding. The input file types we currently support are `.csv` and `.jsonl`. Here's a code snippet of what this looks like: -```python +```python PYTHON import cohere co = cohere.Client(api_key="") @@ -102,7 +102,7 @@ uploading file, starting validation... Once the dataset has been uploaded and validated you will get a response like this: -```text +```text TEXT sample-file-m613zv was uploaded ``` @@ -150,7 +150,7 @@ The Embed Jobs API will respect the original order of your dataset and the outpu Below is a sample of what the output would look like if you downloaded the dataset as a `jsonl`. -```json +```json JSON { "text": "The following notable deaths occurred in 2022. Names are reported under the date of death, in alphabetical order......", "embeddings": { @@ -165,7 +165,7 @@ Below is a sample of what the output would look like if you downloaded the datas If you have specified any metadata to be kept either as `optional_fields` or `keep_fields` when uploading a dataset, the output of embed jobs will look like this: -```json +```json JSON { "text": "The following notable deaths occurred in 2022. 
Names are reported under the date of death, in alphabetical order......", "embeddings": { diff --git a/fern/pages/text-embeddings/embed-jobs.mdx b/fern/pages/text-embeddings/embed-jobs.mdx index 7fd9e834e..6118864fc 100644 --- a/fern/pages/text-embeddings/embed-jobs.mdx +++ b/fern/pages/text-embeddings/embed-jobs.mdx @@ -18,7 +18,7 @@ If you have a large collection of text rather than a single file, bulk embedding #### Request -```python +```python PYTHON # Request co.bulk_embed( model: "string", @@ -31,7 +31,7 @@ co.bulk_embed( #### Response -```python +```python PYTHON { job_id:"string" } @@ -41,7 +41,7 @@ co.bulk_embed( #### Request -```python +```python PYTHON co.get_bulk_embed( job_id: "string" ) @@ -49,7 +49,7 @@ co.get_bulk_embed( #### Response -```python +```python PYTHON # Response of co.get_bulk_embed(): { job_id: "string", @@ -69,13 +69,13 @@ co.get_bulk_embed( #### Request -```python +```python PYTHON co.list_bulk_embed() ``` #### Response -```python +```python PYTHON # Response of co.list_bulk_embeds(): {bulk_embed : [{ job_id: "string", diff --git a/fern/pages/text-embeddings/embeddings.mdx b/fern/pages/text-embeddings/embeddings.mdx index bb97a16fc..929e1a07b 100644 --- a/fern/pages/text-embeddings/embeddings.mdx +++ b/fern/pages/text-embeddings/embeddings.mdx @@ -1,5 +1,5 @@ --- -title: "Embeddings" +title: "Introduction to Embeddings at Cohere" slug: "docs/embeddings" hidden: false @@ -17,7 +17,7 @@ Embeddings are a way to represent the **meaning** of text as a list of numbers. In the example below we use the `embed-english-v3.0` model to generate embeddings for 3 phrases and compare them using a similarity function. The two **similar** phrases have a **high similarity score**, and the embeddings for two **unrelated** phrases have a **low similarity score**: -```python +```python PYTHON import cohere import numpy as np @@ -87,7 +87,7 @@ The following embedding types are now supported: The parameter defaults to `float`, so if you pass in no argument you'll get back `float` embeddings: -```python +```python PYTHON ret = co.embed(texts=phrases, model=model, input_type=input_type) @@ -97,7 +97,7 @@ ret.embeddings # This contains the float embeddings However we recommend being explicit about the `embedding type(s)`. To specify an `embedding type`, pass one of the types from the list above in as list containing a string: -```python +```python PYTHON ret = co.embed(texts=phrases, model=model, input_type=input_type, @@ -112,7 +112,7 @@ ret.embeddings.binary # This will be empty Finally, you can also pass several `embedding types` in as a list, in which case the endpoint will return a dictionary with both types available: -```python +```python PYTHON ret = co.embed(texts=phrases, model=model, input_type=input_type, diff --git a/fern/pages/text-embeddings/reranking/overview.mdx b/fern/pages/text-embeddings/reranking/overview.mdx index dee80744f..450c23cb4 100644 --- a/fern/pages/text-embeddings/reranking/overview.mdx +++ b/fern/pages/text-embeddings/reranking/overview.mdx @@ -20,7 +20,7 @@ In the example below, we use the [Rerank API endpoint](/reference/rerank-1) to i In this example, the documents being passed in are a list of strings: -```python +```python PYTHON import cohere co = cohere.Client(api_key="") @@ -94,7 +94,7 @@ Alternatively, you can pass in a JSON object and specify the fields you'd like t **Request** -```python +```python PYTHON query = "What is the capital of the United States?" 
docs = [ {"Title":"Facts about Carson City","Content":"Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274."}, @@ -108,7 +108,7 @@ results = co.rerank(model="rerank-english-v3.0", query=query, documents=docs, ra In the `docs` parameter, we are passing in a list of objects which have the key values: `[Title ,Content]`. As part of the Rerank call, we are specifying which keys to rank over, as well as the order in which the key value pairs should be considered. -```python +```python PYTHON { "id": "75a94aa7-6761-4a64-a2ae-4bc0a62bc601", "results": [ diff --git a/fern/pages/text-embeddings/reranking/reranking-best-practices.mdx b/fern/pages/text-embeddings/reranking/reranking-best-practices.mdx index 01d11a1c7..bf423df07 100644 --- a/fern/pages/text-embeddings/reranking/reranking-best-practices.mdx +++ b/fern/pages/text-embeddings/reranking/reranking-best-practices.mdx @@ -65,7 +65,7 @@ Our `rerank-v3.0` models are trained with a context length of 4096 tokens. The m Our `rerank-v3.0` models support semi-structured data reranking through a list of JSON objects. The `rank_fields` parameter will default to a field parameter called `text` unless otherwise specified. If the `rank_fields` parameter is unspecified _and_ none of your JSON objects have a `text` field, the endpoint will return an error. -```json +```json JSON [ { "Title":"How to fix a dishwasher" @@ -87,7 +87,7 @@ Looking at the example above, passing in `rank_fields=["Title","Content"]` would The most important output from the [Rerank API endpoint](/reference/rerank-1) is the absolute rank exposed in the response object. The score is query dependent, and could be higher or lower depending on the query and passages sent in. In the example below, what matters is that Ottawa is more relevant than Toronto, but the user should not assume that Ottawa is two times more relevant than Ontario. -```python +```python PYTHON [ RerankResult, RerankResult, diff --git a/fern/pages/text-embeddings/text-classification-1.mdx b/fern/pages/text-embeddings/text-classification-1.mdx index 1acc381bd..ce6c2d0a0 100644 --- a/fern/pages/text-embeddings/text-classification-1.mdx +++ b/fern/pages/text-embeddings/text-classification-1.mdx @@ -2,7 +2,141 @@ title: "Text Classification" slug: "docs/text-classification-1" -hidden: true +hidden: false createdAt: "Wed Jan 31 2024 20:35:25 GMT+0000 (Coordinated Universal Time)" updatedAt: "Wed Jan 31 2024 20:35:26 GMT+0000 (Coordinated Universal Time)" --- + +Among the most popular use cases for language embeddings is 'text classification,' in which different pieces of text -- blog posts, lyrics, poems, headlines, etc. -- are grouped based on their similarity, their sentiment, or some other property. + +Here, we'll discuss how to perform simple text classification tasks with Cohere's `classify` endpoint, and provide links to more information on how to fine-tune this endpoint for more specialized work. + +## Few-Shot Classification with Cohere's `classify` Endpoint + +Generally, training a text classifier requires a tremendous amount of data. But with large language models, it's now possible to create so-called 'few shot' classification models able to perform well after seeing a far smaller number of samples. + +In the next few sections, we'll create a sentiment analysis classifier to sort text into "positive," "negative," and "neutral" categories. 
+ +### Setting up the SDK + +First, let's import the required tools and set up a Cohere client. + +```python PYTHON +import cohere +from cohere import ClassifyExample +``` +```python PYTHON +co = cohere.Client("COHERE_API_KEY") # Your Cohere API key +``` + +### Preparing the Data and Inputs + +With the `classify` endpoint, you can create a text classifier with as few as two examples per class, and each example **must** contain the text itself and the corresponding label (i.e. class). So, if you have two classes you need a minimum of four examples, if you have three classes you need a minimum of six examples, and so on. + +Here are examples, created as `ClassifyExample` objects: + +```python PYTHON +examples = [ClassifyExample(text="I’m so proud of you", label="positive"), + ClassifyExample(text="What a great time to be alive", label="positive"), + ClassifyExample(text="That’s awesome work", label="positive"), + ClassifyExample(text="The service was amazing", label="positive"), + ClassifyExample(text="I love my family", label="positive"), + ClassifyExample(text="They don't care about me", label="negative"), + ClassifyExample(text="I hate this place", label="negative"), + ClassifyExample(text="The most ridiculous thing I've ever heard", label="negative"), + ClassifyExample(text="I am really frustrated", label="negative"), + ClassifyExample(text="This is so unfair", label="negative"), + ClassifyExample(text="This made me think", label="neutral"), + ClassifyExample(text="The good old days", label="neutral"), + ClassifyExample(text="What's the difference", label="neutral"), + ClassifyExample(text="You can't ignore this", label="neutral"), + ClassifyExample(text="That's how I see it", label="neutral")] +``` + +Besides the examples, you'll also need the 'inputs,' which are the strings of text you want the classifier to sort. Here are the ones we'll be using: + +```python PYTHON +inputs = ["Hello, world! What a beautiful day", + "It was a great time with great people", + "Great place to work", + "That was a wonderful evening", + "Maybe this is why", + "Let's start again", + "That's how I see it", + "These are all facts", + "This is the worst thing", + "I cannot stand this any longer", + "This is really annoying", + "I am just plain fed up"] +``` + +### Generate Predictions + +Setting up the model is quite straightforward with the `classify` endpoint. We'll use Cohere's `embed-english-v3.0` model, here's what that looks like: + +```python PYTHON +def classify_text(inputs, examples): + + """ + Classifies a list of input texts given the examples + Arguments: + model (str): identifier of the model + inputs (list[str]): a list of input texts to be classified + examples (list[Example]): a list of example texts and class labels + Returns: + classifications (list): each result contains the text, labels, and conf values + """ + + # Classify text by calling the Classify endpoint + response = co.classify( + model='embed-english-v3.0', + inputs=inputs, + examples=examples) + + classifications = response.classifications + + return classifications + +# Classify the inputs +predictions = classify_text(inputs, examples) + +print(predictions) +``` + +Here’s a sample output returned (note that this output has been truncated to make it easier to read, you'll get much more in return if you run the code yourself): + +``` +[ClassifyResponseClassificationsItem(id='9df6628d-57b2-414c-837e-c8a22f00d3db', + input='hello, world! 
what a beautiful day', + prediction='positive', + predictions=['positive'], + confidence=0.40137812, + confidences=[0.40137812], + labels={'negative': ClassifyResponseClassificationsItemLabelsValue(confidence=0.23582731), + 'neutral': ClassifyResponseClassificationsItemLabelsValue(confidence=0.36279458), + 'positive': ClassifyResponseClassificationsItemLabelsValue(confidence=0.40137812)}, + classification_type='single-label'), + ClassifyResponseClassificationsItem(id='ce2c3b0b-ce98-4905-9ef5-fc83c6848fc5', + input='it was a great time with great people', + prediction='positive', + predictions=['positive'], + confidence=0.49054274, + confidences=[0.49054274], + labels={'negative': ClassifyResponseClassificationsItemLabelsValue(confidence=0.19989403), + 'neutral': ClassifyResponseClassificationsItemLabelsValue(confidence=0.30956325), + 'positive': ClassifyResponseClassificationsItemLabelsValue(confidence=0.49054274)}, + classification_type='single-label') + ....] +``` + +Most of this is pretty easy to understand, but there are a few things worth drawing attention to. + +Besides returning the predicted class in the `prediction` field, the endpoint also returns the `confidence` value of the prediction, which varies between 0 (unconfident) and 1 (completely confident). + +Also, these confidence values are split among the classes; since we're using three, the confidence values for the "positive," "negative," and "neutral" classes must add up to a total of 1. + +Under the hood, the classifier selects the class with the highest confidence value as the “predicted class.” A high confidence value for the predicted class therefore indicates that the model is very confident of its prediction, and vice versa. + +### What If I Need to Fine-Tune the `classify` endpoint? + +Cohere has [dedicated documentation](/docs/classify-fine-tuning) on fine-tuning the `classify` endpoint for bespoke tasks. You can also read this [blog post](/blog/fine-tuning-for-classification), which works out a detailed example. \ No newline at end of file diff --git a/fern/pages/text-generation/chat-api.mdx b/fern/pages/text-generation/chat-api.mdx index 3a89b83de..513494faa 100644 --- a/fern/pages/text-generation/chat-api.mdx +++ b/fern/pages/text-generation/chat-api.mdx @@ -12,7 +12,8 @@ updatedAt: "Tue Jun 18 2024 07:20:15 GMT+0000 (Coordinated Universal Time)" --- The Chat API endpoint is used to generate text with Cohere LLMs. This endpoint facilitates a conversational interface, allowing users to send messages to the model and receive text responses. 
-```python + +```python PYTHON import cohere co = cohere.Client(api_key="") @@ -23,7 +24,7 @@ response = co.chat( print(response.text) # "The Art of API Design: Crafting Elegant and Powerful Interfaces" ``` -```java +```java JAVA public class ChatPost { public static void main(String[] args) { Cohere cohere = Cohere.builder().token("").build(); @@ -38,7 +39,7 @@ public class ChatPost { } } ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require('cohere-ai'); const cohere = new CohereClient({ @@ -53,12 +54,13 @@ const cohere = new CohereClient({ console.log(response.text) })(); ``` + ## Response Structure Below is a sample response from the Chat API -```json +```json JSON { "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces", "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948", @@ -103,7 +105,7 @@ Every response contains the following fields: The user message in the Chat request can be sent together with a `chat_history` to provide the model with conversational context: -```python +```python PYTHON import cohere co = cohere.Client(api_key="") @@ -123,7 +125,7 @@ print(response.text) # "Sure thing Michael, LLMs are ..." Instead of manually building the chat_history, we can grab it from the response of the previous turn. -```python +```python PYTHON chat_history = [] max_turns = 10 @@ -150,7 +152,7 @@ for _ in range(max_turns): Providing the model with the conversation history is one way to have a multi-turn conversation with the model. Cohere has developed another option for users who do not wish to save the conversation history, and it works through a user-defined `conversation_id`. -```python +```python PYTHON import cohere co = cohere.Client("") @@ -165,7 +167,7 @@ answer = response.text Then, if you wanted to continue the conversation, you could do so like this (keeping the `id` consistent): -```python +```python PYTHON response2 = co.chat( model="command-r-plus", message="What is the secret word?", diff --git a/fern/pages/text-generation/connectors/connector-authentication.mdx b/fern/pages/text-generation/connectors/connector-authentication.mdx index e29a6b0d3..7fd3c3dc0 100644 --- a/fern/pages/text-generation/connectors/connector-authentication.mdx +++ b/fern/pages/text-generation/connectors/connector-authentication.mdx @@ -37,7 +37,7 @@ To enable service level authentication, you will need to generate a token, confi First, start by generating a secure token. Here’s a snippet of what generating a token looks like in python: -```python +```python PYTHON # Generate a token import secrets secrets.token_urlsafe(32) @@ -45,7 +45,8 @@ secrets.token_urlsafe(32) After generating the token, you will have to configure your connector to check it. The quick start connectors should expose an environment variable for you to use. For example, the Google Drive connector exposes `CONNECTOR_API_KEY` for this purpose. After setting this environment variable, you should verify that requests without the appropriate `Authorization` header are being denied: -```curl + +```curl CURL curl --request POST --url 'https://connector-example.com/search' --header 'Content-Type: application/json' @@ -53,11 +54,11 @@ curl --request POST "query": "How do I expense a meal?" 
}' ``` -```python +```python PYTHON import requests r = requests.post('{base_connector_url}/search', {'query': 'How do I expense a meal?'}) ``` -```typescript +```typescript TYPESCRIPT const response = await fetch('{base_connector_url}/search'{ method: 'POST', body: {'query': 'How do I expense a meal?'}, @@ -66,10 +67,12 @@ const data = await response.json(); console.log(data); ``` + You should also verify that requests with the correct header are successful: -```curl + +```curl CURL curl --request POST --url 'https://connector-example.com/search' --header 'Content-Type: application/json' @@ -78,13 +81,13 @@ curl --request POST "query": "How do I expense a meal?" }' ``` -```python +```python PYTHON import requests r = requests.post('{base_connector_url}/search', data={'query': 'How do I expense a meal?'}, headers={"Authorization":"Bearer {Connector API key}"}) ``` -```typescript +```typescript TYPESCRIPT const response = await fetch('{base_connector_url}/search'{ method: 'POST', body: {'query': 'How do I expense a meal?'}, @@ -93,10 +96,12 @@ const data = await response.json(); console.log(data); ``` + Finally, you will have to provide Cohere with the token. You can do this during registration: -```curl + +```curl CURL curl --request POST --url 'https://api.cohere.ai/v1/connectors' --header 'Authorization: Bearer {Cohere API key}' @@ -111,7 +116,7 @@ curl --request POST } }' ``` -```python +```python PYTHON import cohere co = cohere.Client('Your API key') created_connector = co.create_connector( @@ -123,7 +128,7 @@ created_connector = co.create_connector( }, ) ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -141,10 +146,12 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + Or if you have already registered the connector, by performing an update: -```curl + +```curl CURL curl --request PATCH --url 'https://api.cohere.ai/v1/connectors/{id}' --header 'Authorization: Bearer {Cohere API key}' @@ -156,7 +163,7 @@ curl --request PATCH } }' ``` -```python +```python PYTHON import cohere # initialize the Cohere Client with an API Key co = cohere.Client('YOUR_API_KEY') @@ -165,7 +172,7 @@ connectors = co.update_connector(connector_id, service_auth={ "token": "{Connector API Key}", }) ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -180,6 +187,7 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + Now when you call the Chat API with this connector specified, Cohere will now pass the configured bearer token in the search request to your connector. @@ -196,7 +204,8 @@ To enable **OAuth 2.0** for your connector, you will have to modify your connect First you will have to modify your connector to forward the Authorization header from the request to the connector to the request to the data source. A few quickstart connectors (Google Drive and Slack) do this out of the box without any configuration, so you may wish to look at those to copy this functionality. 
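
To make the header-forwarding step concrete, here is a minimal sketch of a `/search` handler that passes the incoming `Authorization` header on to the data source. It assumes a Flask app and a hypothetical downstream endpoint (`https://datasource.example.com/search`); the quickstart connectors implement this differently, so treat it as an illustration rather than their actual code.

```python PYTHON
# Minimal illustration only: a Flask /search handler that forwards the
# incoming Authorization header (OAuth access token or personal API key)
# to a hypothetical downstream data source.
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/search", methods=["POST"])
def search():
    query = request.get_json().get("query", "")
    auth_header = request.headers.get("Authorization", "")

    # Forward the same Authorization header to the data source's search API
    upstream = requests.post(
        "https://datasource.example.com/search",  # hypothetical endpoint
        json={"query": query},
        headers={"Authorization": auth_header},
    )

    # Wrap the documents in whatever response shape your connector already
    # returns; "results" here follows the quickstart connectors.
    return jsonify({"results": upstream.json().get("documents", [])})
```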
If you have access to an API key for the service, you should be able test your connector with the following request (depending on the underlying data source; most handle personal API keys and OAuth access tokens similarly): -```curl + +```curl CURL curl --request POST --url https://connector-example.com/search --header 'Content-Type: application/json' @@ -205,13 +214,13 @@ curl --request POST "query": "How do I expense a meal?" }' ``` -```python +```python PYTHON import requests r = requests.post('http://connector-example.com/search', data={'query': 'How do I expense a meal?'}, headers={"Authorization":"Bearer {Personal/Service API key}"}) ``` -```typescript +```typescript TYPESCRIPT const response = await fetch('http://connector-example.com/search'{ method: 'POST', body: {'query': 'How do I expense a meal?'}, @@ -220,6 +229,7 @@ const data = await response.json(); console.log(data); ``` + Next, you will need to configure OAuth 2.0 credentials in your data source. This looks different depending on the data source but when complete you should have a `client_id`, a `client_secret`, and optionally the desired `scope`s that define what Cohere can query on behalf of the user. You will also have to provide the following redirect URI as a part of the configuration: @@ -236,7 +246,8 @@ https://oauth2.googleapis.com/token You will have to provide all of this information to Cohere during registration: -```curl + +```curl CURL curl --request POST --url 'https://api.cohere.ai/v1/connectors' --header 'Authorization: Bearer {Cohere API key}' @@ -254,7 +265,7 @@ curl --request POST } }' ``` -```python +```python PYTHON import cohere co = cohere.Client('Your API key') created_connector = co.create_connector( @@ -269,7 +280,7 @@ created_connector = co.create_connector( }, ) ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -290,10 +301,12 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + Or if you have already registered the connector, by performing an update: -```curl + +```curl CURL curl --request PATCH --url 'https://api.cohere.ai/v1/connectors/{id}' --header 'Authorization: Bearer {Cohere API key}' @@ -308,7 +321,7 @@ curl --request PATCH } }' ``` -```python +```python PYTHON import cohere # initialize the Cohere Client with an API Key co = cohere.Client('YOUR_API_KEY') @@ -320,7 +333,7 @@ connectors = co.update_connector(connector_id, oauth={ "scope": "https://www.googleapis.com/auth/drive.readonly" }) ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -338,6 +351,7 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + Now you have set up OAuth 2.0. Remember, before users are able to use the connector, they will have to complete an OAuth flow. 
@@ -350,7 +364,8 @@ The last option available for auth allows you to specify an access token per con To use pass through authentication/authorization specify the access token in the chat request like so: -```python + +```python PYTHON import cohere co = cohere.Client('Your API key') response = co.chat( @@ -358,7 +373,7 @@ response = co.chat( connectors=[{"id": "web-search", "user_access_token": "{Personal/Service API key}" }] ) ``` -```curl +```curl CURL curl --location 'https://production.api.cohere.ai/v1/chat' \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer {Your API key}' \ @@ -368,6 +383,7 @@ curl --location 'https://production.api.cohere.ai/v1/chat' \ "connectors": [{"id": "web-search", "user_access_token": "{Personal/Service API key}" }] } ``` + In this example, cohere will call your `internal-docs` connector with an Authorization header `Bearer {Personal/Service API key}`. diff --git a/fern/pages/text-generation/connectors/connector-faqs.mdx b/fern/pages/text-generation/connectors/connector-faqs.mdx index f3c86c98c..a404f195a 100644 --- a/fern/pages/text-generation/connectors/connector-faqs.mdx +++ b/fern/pages/text-generation/connectors/connector-faqs.mdx @@ -49,7 +49,7 @@ Should you find yourself facing connector latency issues, there are a few things If your connector is returning vast numbers of trivial documents, it may be that the underlying search API is matching on "stopwords". These are low-information words like "the" or "and" which are routinely removed before natural language tasks, and you can modify the search query to get rid of them with a library like NLTK. Here's an example: -- ```python +- ```python PYTHON from nltk.corpus import stopwords stop_words = set(stopwords.words("english")) query = [word for word in query if word.lower() not in stop_words] diff --git a/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx b/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx index 02fb90978..5a56d60cf 100644 --- a/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx +++ b/fern/pages/text-generation/connectors/creating-and-deploying-a-connector.mdx @@ -43,7 +43,8 @@ Once the connection between the connector and the data source is configured you The request from the Chat API to the connector is a POST request that accepts a single `query` parameter in the body. The request URL must end with `/search` and contain no query parameters, like so: -```curl + +```curl CURL curl --request POST --url 'https://connector-example.com/search' --header 'Content-Type: application/json' @@ -51,11 +52,11 @@ curl --request POST "query": "How do I expense a meal?" }' ``` -```python +```python PYTHON import requests r = requests.post('{base_connector_url}/search', {'query': 'How do I expense a meal?'}) ``` -```typescript +```typescript TYPESCRIPT const response = await fetch('{base_connector_url}/search'{ method: 'POST', body: {'query': 'How do I expense a meal?'}, @@ -64,6 +65,7 @@ const data = await response.json(); console.log(data); ``` + The response from the connector should be a JSON object with a list of documents in the result field, like so: @@ -110,7 +112,8 @@ Once deployed, ensure that the connector can respond to requests from outside yo After you’ve deployed the connector and verified it can respond to requests, it is time to register the connector with Cohere. You can do so by sending the following POST request to our API. 
You will need a Cohere API key, which you can generate [here](https://dashboard.cohere.com/api-keys). Here’s an example request and related response: -```python + +```python PYTHON import cohere # initialize the Cohere Client with an API Key @@ -121,7 +124,7 @@ created_connector = co.create_connector( url="https://connector-example.com/search", ) ``` -```curl +```curl CURL curl --request POST --url ' --header 'Authorization: Bearer {Cohere API key}' @@ -132,7 +135,7 @@ curl --request POST "url":" }' ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -146,6 +149,7 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + ```json Example Response JSON { @@ -164,7 +168,8 @@ Make note of the `id`, as you will need it when you want to use it later to grou During registration, the API will attempt to query the connector to verify that it works as expected. If this step fails, ensure that the connector can respond to requests from outside your network, like so: -```curl + +```curl CURL curl --request POST --url 'https://connector-example.com/search' --header 'Content-Type: application/json' @@ -172,11 +177,11 @@ curl --request POST "query": "How do I expense a meal?" }' ``` -```python +```python PYTHON import requests r = requests.post('{base_connector_url}/search', {'query': 'How do I expense a meal?'}) ``` -```typescript +```typescript TYPESCRIPT const response = await fetch('https://connector.example.com/search'{ method: 'POST', body: {'query': 'How do I expense a meal?'}, @@ -185,12 +190,14 @@ const data = await response.json(); console.log(data); ``` + #### Use your Connector with the Chat API In order to produce grounded generations, include your connector id in the `connectors` field in your chat request. Heres an example: -```python + +```python PYTHON import cohere co = cohere.Client('Your API key') response = co.chat( @@ -198,7 +205,7 @@ response = co.chat( connectors=[{"id": "example_connector_id"}] # this is from the create step ) ``` -```curl +```curl CURL curl --request POST \ --url \ --header 'Content-Type: application/json' \ @@ -209,7 +216,7 @@ curl --request POST \ "connectors": [{"id": "example_connector_id"}] } ``` -```typescript +```typescript TYPESCRIPT import { CohereClient } from "cohere-ai"; const cohere = new CohereClient({ token: "YOUR_API_KEY", @@ -222,7 +229,7 @@ const cohere = new CohereClient({ console.log("Received response", response); })(); ``` -```go +```go GO import ( cohere "github.com/cohere-ai/cohere-go/v2" cohereclient "github.com/cohere-ai/cohere-go/v2/client" @@ -235,6 +242,7 @@ response, err := client.Chat( Connectors:[]*cohereclient.ChatConnector{{Id: "web-search"}}, ) ``` + And here’s an example response: diff --git a/fern/pages/text-generation/connectors/managing-your-connector.mdx b/fern/pages/text-generation/connectors/managing-your-connector.mdx index 840675b34..44cbac949 100644 --- a/fern/pages/text-generation/connectors/managing-your-connector.mdx +++ b/fern/pages/text-generation/connectors/managing-your-connector.mdx @@ -16,12 +16,13 @@ Once your connector is deployed and registered, there are a couple of features t You can see all the connectors registered under your organization through the [Cohere dashboard](https://dashboard.cohere.com/connectors). 
Alternatively, you can make a GET request like the one below: -```curl + +```curl CURL curl --request GET --url 'https://api.cohere.ai/v1/connectors' --header 'Authorization: Bearer {Cohere API key}' ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -31,23 +32,25 @@ const cohere = new CohereClient({ console.log(connectors); })(); ``` -```python +```python PYTHON import cohere # initialize the Cohere Client with an API Key co = cohere.Client('YOUR_API_KEY') connectors = co.list_connectors() ``` + ### Authorizing an OAuth 2.0 Connector If your connector is set up using OAuth 2.0, a user in your organization can authorize the connector through the dashboard by clicking on “connect your account”. Alternatively, you can make a request to the `/oauth/authorize` endpoint in your application. This will provide a redirect URL that the user can follow to authorize the OAuth application. -```curl + +```curl CURL curl --request POST --url 'https://api.cohere.ai/v1/connectors/{connector-id}/oauth/authorize' --header 'Authorization: Bearer {Cohere API key for user wishing to authorize}' ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -59,12 +62,14 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + ### Updating a Connector You can enable and disable a connector [through the dashboard](https://dashboard.cohere.com/connectors). Additionally, you can update the connector name, URL, auth settings, and handle similar sorts of tasks through the API, as follows: -```curl + +```curl CURL curl --request PATCH --url 'https://api.cohere.ai/v1/connectors/{id}' --header 'Authorization: Bearer {Cohere API key}' @@ -81,13 +86,13 @@ curl --request PATCH "active": true, }' ``` -```python +```python PYTHON import cohere # initialize the Cohere Client with an API Key co = cohere.Client('YOUR_API_KEY') connectors = co.update_connector(connector_id, name="new name", url="new_url") ``` -```typescript +```typescript TYPESCRIPT const { CohereClient } = require("cohere-ai"); const cohere = new CohereClient({ token: "<>", @@ -100,6 +105,7 @@ const cohere = new CohereClient({ console.log(connector); })(); ``` + ### Debugging a Connector @@ -107,7 +113,8 @@ To debug issues with a registered connector, you can follow the steps in this se Step 1: Make a streaming request to the connector using the Chat API and check the search results for the error. 
Here's an example request: -```python + +```python PYTHON import cohere co = cohere.Client('Your API key') response = co.chat( @@ -116,7 +123,7 @@ response = co.chat( connectors=[{"id": "example_connector_id"}] # this is from the create step ) ``` -```curl +```curl CURL curl --request POST \ --url \ --header 'Content-Type: application/json' \ @@ -128,7 +135,7 @@ curl --request POST \ "connectors": [{"id": "example_connector_id"}] } ``` -```typescript +```typescript TYPESCRIPT import { CohereClient } from "cohere-ai"; const cohere = new CohereClient({ token: "YOUR_API_KEY", @@ -142,7 +149,7 @@ const cohere = new CohereClient({ console.log("Received response", response); })(); ``` -```go +```go GO import ( cohere "github.com/cohere-ai/cohere-go/v2" cohereclient "github.com/cohere-ai/cohere-go/v2/client" @@ -156,6 +163,7 @@ response, err := client.Chat( Connectors:[]*cohereclient.ChatConnector{{Id: "web-search"}}, ) ``` + The response in the search results array should contain the error message from the connector: diff --git a/fern/pages/text-generation/connectors/overview-1.mdx b/fern/pages/text-generation/connectors/overview-1.mdx index e216f720c..f789c8339 100644 --- a/fern/pages/text-generation/connectors/overview-1.mdx +++ b/fern/pages/text-generation/connectors/overview-1.mdx @@ -1,5 +1,5 @@ --- -title: "Overview" +title: "Overview of RAG Connectors" slug: "docs/overview-rag-connectors" hidden: false @@ -19,7 +19,8 @@ The following graphic demonstrates the flow of information when using a connecto Connectors are specified when calling the Chat endpoint, which you can read more about [here](/docs/chat-api#connectors-mode). An example request specifying the managed web-search connector would look like this: -```python + +```python PYTHON import cohere co = cohere.Client(api_key='Your API key') @@ -29,7 +30,7 @@ response = co.chat( connectors=[{"id": "web-search"}] ) ``` -```curl +```curl CURL curl --location 'https://production.api.cohere.ai/v1/chat' \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer {Your API key}' \ @@ -39,7 +40,7 @@ curl --location 'https://production.api.cohere.ai/v1/chat' \ "connectors": [{"id": "web-search"}] } ``` -```typescript +```typescript TYPESCRIPT import { CohereClient } from "cohere-ai"; const cohere = new CohereClient({ token: "YOUR_API_KEY", @@ -52,7 +53,7 @@ const cohere = new CohereClient({ console.log("Received response", response); })(); ``` -```go +```go GO import ( cohere "github.com/cohere-ai/cohere-go/v2" cohereclient "github.com/cohere-ai/cohere-go/v2/client" @@ -65,10 +66,11 @@ response, err := client.Chat( Connectors:[]*cohereclient.ChatConnector{{Id: "web-search"}}, ) ``` + If you or an administrator at your organization has created a new connector, you can add this connector id to the list. Here’s an example: -```python +```python PYTHON connectors=[{"id": "web-search"}, {"id": "customer-connector-id"}]. ``` diff --git a/fern/pages/text-generation/documents-and-citations.mdx b/fern/pages/text-generation/documents-and-citations.mdx index 95ad9b932..635f45174 100644 --- a/fern/pages/text-generation/documents-and-citations.mdx +++ b/fern/pages/text-generation/documents-and-citations.mdx @@ -20,7 +20,7 @@ Document mode involves users providing the model with their own documents direct Here's an example of interacting with document mode via the Postman API service. 
We're asking the `co.chat()` about penguins, and uploading documents for it to use: -```python +```python PYTHON { "message": "Where do the tallest penguins live?", "documents": [ @@ -43,7 +43,7 @@ Here's an example of interacting with document mode via the Postman API service. Here's an example reply: -```python +```python PYTHON { "response_id": "ea9eaeb0-073c-42f4-9251-9ecef5b189ef", "text": "The tallest penguins, Emperor penguins, live in Antarctica.", diff --git a/fern/pages/text-generation/feedback.mdx b/fern/pages/text-generation/feedback.mdx index 0ad64b1b7..13a1ef09a 100644 --- a/fern/pages/text-generation/feedback.mdx +++ b/fern/pages/text-generation/feedback.mdx @@ -44,7 +44,7 @@ The endpoint has a number of settings you can use to control the kind of output If the annotator _accepts_ the suggested response, you could format a request like this: -```python +```python PYTHON generations = co.generate(prompt=f"Write me a polite email responding to the one below: {email}. Response:") if user_accepted_suggestion: co.generate_feedback(request_id=generations[0].id, good_response=True) @@ -52,7 +52,7 @@ if user_accepted_suggestion: If the annotator _edits_ the suggested response, you could format a request like this: -```python +```python PYTHON generations = co.generate(prompt=f"Write me a polite email responding to the one below: {email}. Response:") if user_edits_suggestion: co.generate_feedback(request_id=generations[0].id, good_response=False, desired_response=user_edited_suggestion) @@ -75,7 +75,7 @@ Alternatively, you can generate feedback based on which response an annotator pr A user accepts a model's suggestion in an assisted writing setting, and prefers it to a second suggestion. Here's what a request might look like: -```python +```python PYTHON generations = co.generate(prompt=f"Write me a polite email responding to the one below: {email}. Response:", num_generations=2) if user_accepted_idx: # prompt user for which generation they prefer ratings = \[] diff --git a/fern/pages/text-generation/migrating-from-cogenerate-to-cochat.mdx b/fern/pages/text-generation/migrating-from-cogenerate-to-cochat.mdx index 7fd82da4e..ddbec7d22 100644 --- a/fern/pages/text-generation/migrating-from-cogenerate-to-cochat.mdx +++ b/fern/pages/text-generation/migrating-from-cogenerate-to-cochat.mdx @@ -24,7 +24,7 @@ The difference between Chat and Generate is that the Chat endpoint adds a defaul Here's an example: -```python +```python PYTHON # BEFORE co.generate(prompt="Write me three bullet points for my resume") diff --git a/fern/pages/text-generation/predictable-outputs.mdx b/fern/pages/text-generation/predictable-outputs.mdx index 39eb14cc8..8f205fa1e 100644 --- a/fern/pages/text-generation/predictable-outputs.mdx +++ b/fern/pages/text-generation/predictable-outputs.mdx @@ -20,7 +20,7 @@ The predictability of the model's output can be controlled using the `seed` and The easiest way to force the model into reproducible behavior is by providing a value for the `seed` parameter. Specifying the same integer `seed` in consecutive requests will result in the same set of tokens being generated by the model. This can be useful for debugging and testing. 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="YOUR API KEY")
diff --git a/fern/pages/text-generation/prompt-engineering/advanced-prompt-engineering-techniques.mdx b/fern/pages/text-generation/prompt-engineering/advanced-prompt-engineering-techniques.mdx
index 18e7b3394..ddb3f3354 100644
--- a/fern/pages/text-generation/prompt-engineering/advanced-prompt-engineering-techniques.mdx
+++ b/fern/pages/text-generation/prompt-engineering/advanced-prompt-engineering-techniques.mdx
@@ -45,7 +45,7 @@ of Angela's firsthand account, the statement itself isn't hearsay.
 
 Using the Chat API, we could do the following:
 
-```python
+```python PYTHON
 example = '''On the issue of Albert's wellbeing after the accident, Angela testified that he gave a thumbs up when asked how he was feeling.'''
 message = f'''{example} Is there hearsay?'''
diff --git a/fern/pages/text-generation/prompt-engineering/crafting-effective-prompts.mdx b/fern/pages/text-generation/prompt-engineering/crafting-effective-prompts.mdx
index b145e19b8..826b3cc6e 100644
--- a/fern/pages/text-generation/prompt-engineering/crafting-effective-prompts.mdx
+++ b/fern/pages/text-generation/prompt-engineering/crafting-effective-prompts.mdx
@@ -138,7 +138,7 @@ But importantly, it also returns citations that ground the completion in the inc
 These can easily be rendered into the text to show the source of each piece of information. The following Python function adds the returned citations to the returned completion.
 
-```python
+```python PYTHON
 def insert_citations(text: str, citations: list[dict]):
     """
     A helper function to pretty print citations.
diff --git a/fern/pages/text-generation/prompt-engineering/preambles.mdx b/fern/pages/text-generation/prompt-engineering/preambles.mdx
index 59a4397b9..9106fa119 100644
--- a/fern/pages/text-generation/prompt-engineering/preambles.mdx
+++ b/fern/pages/text-generation/prompt-engineering/preambles.mdx
@@ -7,11 +7,10 @@ createdAt: "Tue Mar 12 2024 19:19:02 GMT+0000 (Coordinated Universal Time)"
 updatedAt: "Thu Jun 13 2024 16:10:09 GMT+0000 (Coordinated Universal Time)"
 ---
 
-
-
-
 A preamble is a system message that is provided to a model at the beginning of a conversation which dictates how the model should behave throughout. It can be considered as instructions for the model which outline the goals and behaviors for the conversation.
 
+
+
 ## Writing a custom preamble
 
 While prompting is a natural way to interact with and instruct an LLM, writing a preamble is a shortcut to direct the model’s behavior. Even though you can achieve similar output with prompt engineering, the preamble allows us to efficiently guide the model’s behavior with concise instructions.
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/add-a-docstring-to-your-code.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/add-a-docstring-to-your-code.mdx
index 0e2ed8a3f..56e015e22 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/add-a-docstring-to-your-code.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/add-a-docstring-to-your-code.mdx
@@ -26,7 +26,7 @@ def add(a,b):
 
 **Output**
 
-```python
+```python PYTHON
 def add(a: int, b: int) -> int:
     """
     This function takes two integers 'a' and 'b' and returns their sum.
@@ -43,7 +43,7 @@ def add(a: int, b: int) -> int:
 
 **API Request**
 
-````python
+````python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/book-an-appointment.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/book-an-appointment.mdx
index 66457f6b9..b38ffacb3 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/book-an-appointment.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/book-an-appointment.mdx
@@ -34,7 +34,7 @@ format "%Y-%m-%d %H". If there are multiple times, choose the earliest.
 If no times are available, output None.
 Output should be in JSON format:
-```json
+```json JSON
 {
   next_available_time: "%Y-%m-%d %H"
 }
@@ -52,7 +52,7 @@ Output should be in JSON format:
 
 **API Request**
 
-````python
+````python PYTHON
 import cohere
 co = cohere.Client('<>')
@@ -75,7 +75,7 @@ Each appointment takes 1 hour. If there is availabiltiy within "available times"
 If there are multiple times, choose the earliest. If no times are available, output None.
 Output should be in JSON format:
-```json
+```json JSON
 {
   next_available_time: "%Y-%m-%d %H"
 }
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/create-a-markdown-table-from-raw-data.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/create-a-markdown-table-from-raw-data.mdx
index 8447735a7..450d82b6e 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/create-a-markdown-table-from-raw-data.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/create-a-markdown-table-from-raw-data.mdx
@@ -38,7 +38,7 @@ Emily Davis,37,Product Manager
 
 **API Request**
 
-````python
+````python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/create-csv-data-from-json-data.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/create-csv-data-from-json-data.mdx
index 376e15eba..686b7efb8 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/create-csv-data-from-json-data.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/create-csv-data-from-json-data.mdx
@@ -51,7 +51,7 @@ Emily Davis,37,Product Manager
 
 **API Request**
 
-````python
+````python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/evaluate-your-llm-response.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/evaluate-your-llm-response.mdx
index 5448150ee..46587bfb9 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/evaluate-your-llm-response.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/evaluate-your-llm-response.mdx
@@ -35,7 +35,7 @@ and business appropriate tone and 0 being an informal tone.
 Respond only with th
 ```
 
 **API Request**
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/faster-web-search.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/faster-web-search.mdx
index 892a8c7b4..ad70ac058 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/faster-web-search.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/faster-web-search.mdx
@@ -13,7 +13,7 @@ updatedAt: "Thu May 23 2024 05:33:58 GMT+0000 (Coordinated Universal Time)"
 
 Find summarized results from the web faster without having to read multiple sources.
 
 **API Request**
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(Api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/meeting-summarizer.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/meeting-summarizer.mdx
index 90fd54f1a..49e376fe5 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/meeting-summarizer.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/meeting-summarizer.mdx
@@ -104,7 +104,7 @@ homes, and economic strategies during the pandemic.
 ```
 
 **API Request**
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx
index 302d7d4d4..64e136c31 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/multilingual-interpreter.mdx
@@ -53,7 +53,7 @@ Arabic: يواجه العميل مشكلة
 ```
 
 **API Request**
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompt-library/remove-pii.mdx b/fern/pages/text-generation/prompt-engineering/prompt-library/remove-pii.mdx
index d41f3ef4b..4e58b2149 100644
--- a/fern/pages/text-generation/prompt-engineering/prompt-library/remove-pii.mdx
+++ b/fern/pages/text-generation/prompt-engineering/prompt-library/remove-pii.mdx
@@ -46,7 +46,7 @@ Here is the conversation with all personally identifiable information redacted:
 ```
 
 **API Request**
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key='<>')
diff --git a/fern/pages/text-generation/prompt-engineering/prompting-command-r.md b/fern/pages/text-generation/prompt-engineering/prompting-command-r.md
index 486a87369..569723892 100644
--- a/fern/pages/text-generation/prompt-engineering/prompting-command-r.md
+++ b/fern/pages/text-generation/prompt-engineering/prompting-command-r.md
@@ -169,7 +169,7 @@ Note that you could get the same result if you were using the HuggingFace Tokeni
 >
 > Here is a list of tools that you have available to you:
 >
-> ```python
+> ```python PYTHON
 > def internet_search(query: str) -> List[Dict]:
 >      """Returns a list of relevant document snippets for a textual query retrieved from the internet
 >      Args:
@@ -178,7 +178,7 @@ Note that you could get the same result if you were using the HuggingFace Tokeni
 >      pass
 > ```
 >
-> ```python
+> ```python PYTHON
 > def directly_answer() -> List[Dict]:
 >     """Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history
 >     """
@@ -189,7 +189,7 @@ Note that you could get the same result if you were using the HuggingFace Tokeni
 >  What's the biggest penguin in the world?\<|END_OF_TURN_TOKEN|> \<|START_OF_TURN_TOKEN|>\<|SYSTEM_TOKEN|>  
 > Write 'Action:' followed by a json-formatted list of actions that you want to perform in order to produce a good response to the user's last input. You can use any of the supplied tools any number of times, but you should aim to execute the minimum number of necessary actions for the input. You should use the \`directly-answer\` tool if calling the other tools is unnecessary. The list of actions you want to call should be formatted as a list of json objects, for example:
 >
-> ```json
+> ```json JSON
 > [
 >     {
 >         "tool_name": title of the tool in the specification,
@@ -440,7 +440,7 @@ And this has the model output:
 
 ## Appendix
 
-```python
+```python PYTHON
 documents = [
    { "title": "Tall penguins",
       "text": "Emperor penguins are the tallest growing up to 122 cm in height." },
@@ -464,7 +464,7 @@ rendered_docs = render_docs(documents)
 
 ```
 
-```python
+```python PYTHON
 conversation = [
    {"role": "user", "content": "Whats the biggest penguin in the world?"}
    {"role": "system", "content": rendered_docs}
diff --git a/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx b/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx
index fdfc543a1..f26195a06 100644
--- a/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx
+++ b/fern/pages/text-generation/retrieval-augmented-generation-rag.mdx
@@ -16,7 +16,7 @@ The code snippet below, for example, will produce a grounded answer to `"Where d
 
 **Request**
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="")
 
@@ -30,9 +30,9 @@ co.chat(
   ])
 ```
 
-**Response **
+**Response**
 
-```json
+```json JSON
 {
   "text": "The tallest penguins, Emperor penguins, live in Antarctica.",  
   "citations": [  
@@ -62,9 +62,9 @@ You can find more code and context in [this colab notebook](https://github.com/c
 
 The RAG workflow generally consists of **3 steps**:
 
-- **Generating search queries** for finding relevant documents. _What does the model recommend looking up before answering this question? _
+- **Generating search queries** for finding relevant documents. _What does the model recommend looking up before answering this question?_
 - **Fetching relevant documents** from an external data source using the generated search queries. _Performing a search to find some relevant information._
-- **Generating a response **with inline citations using the fetched documents. _Using the acquired knowledge to produce an educated answer_.
+- **Generating a response** with inline citations using the fetched documents. _Using the acquired knowledge to produce an educated answer_.
 
 #### Example: Using RAG to identify the definitive 90s boy band
 
@@ -76,7 +76,7 @@ Calling the [Chat API](/reference/chat) with the `search_queries_only` parameter
 
 **Request**
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="")
 
@@ -89,7 +89,7 @@ co.chat(
 
 **Response**
 
-```json
+```json JSON
 {
   "is_search_required": true,
   "search_queries": [
@@ -142,7 +142,7 @@ co.chat(
 
 **Response**
 
-```json
+```json JSON
 {
   "text": "Both the Backstreet Boys and *NSYNC enjoyed immense popularity during the late 1990s and early 2000s. \n\nThe Backstreet Boys, with massive album sales, chart-topping releases and highly successful tours, dominated the music industry for several years worldwide. They sold millions of copies of their albums No Strings Attached and Celebrity, breaking even the sales records of Adele and ranking as the second-fastest-selling album of the Soundscan era before 2015. \n\n*NSYNC, led by Justin Timberlake, also achieved tremendous success, selling millions of copies of their albums and finding popularity not just in the US but globally. \n\nHowever, when comparing the two groups, the extent of the Backstreet Boys' success puts them at a significantly higher level. They dominated the industry for several years, achieving unparalleled success in traditionally non-Western markets. Therefore, I can conclude that the Backstreet Boys are the more popular group.",
   "citations": [
@@ -166,7 +166,7 @@ As an alternative to manually implementing the 3 step workflow, the Chat API off
 
 **Request**
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="")
 
@@ -178,7 +178,7 @@ co.chat(
 
 **Response**
 
-```json
+```json JSON
 {
   "text": "The Backstreet Boys have sold over 100 million records worldwide, making them the best-selling boy band of all time, and one of the world's best-selling music artists. They are the only boy band to have their first ten albums reach the top 10 on the Billboard 200. \n\n'NSYNC sold over 2.4 million copies in the United States during its first week of release, setting the record as the first album to have sold more than two million copies in a single week since the chart adopted Nielsen SoundScan data in May 1991. Their best-selling album, No Strings Attached sold over 11 million copies in the US alone. \n\nIt's clear that both bands have enjoyed enormous commercial success, however, based on the available sales data, The Backstreet Boys take the crown as the better-selling band of the two.",
   "search_queries": [
diff --git a/fern/pages/text-generation/streaming.mdx b/fern/pages/text-generation/streaming.mdx
index 049bbdb42..21b080072 100644
--- a/fern/pages/text-generation/streaming.mdx
+++ b/fern/pages/text-generation/streaming.mdx
@@ -16,7 +16,7 @@ You're likely already familiar with streaming. When you ask the model a question
 
 ## Example
 
-```python
+```python PYTHON
 import cohere
 
 co = cohere.Client(api_key='')
@@ -72,7 +72,7 @@ For an illustration of a generated citation with document-specific indices, look
 
 Emitted when the next token of the tool plan or the tool call is generated.
 
-```json
+```json JSON
 ...
 {
     "is_finished": false,
@@ -139,7 +139,7 @@ Emitted when the model generates tool calls that require actioning upon. The eve
 
 Below, we have a stream of events which shows the **full** output you might see during a streaming session:
 
-```json
+```json JSON
 
 {
     "is_finished": false,
diff --git a/fern/pages/text-generation/structured-outputs-json.mdx b/fern/pages/text-generation/structured-outputs-json.mdx
index 6e879827e..d9008df59 100644
--- a/fern/pages/text-generation/structured-outputs-json.mdx
+++ b/fern/pages/text-generation/structured-outputs-json.mdx
@@ -2,7 +2,7 @@
 title: "Structured Generations (JSON)"
 slug: "docs/structured-outputs-json"
 
-hidden: true
+hidden: false
 createdAt: "Thu Jun 06 2024 05:37:56 GMT+0000 (Coordinated Universal Time)"
 updatedAt: "Tue Jun 11 2024 02:43:00 GMT+0000 (Coordinated Universal Time)"
 ---
@@ -16,7 +16,7 @@ JSON is a lightweight format that is easy for humans to read and write and is al
 
 When making an API request, you can specify the `response_format` parameter to indicate that you want the response in a JSON object format.
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="YOUR API KEY")
 
@@ -41,7 +41,7 @@ The `response_format` parameter also allows you to define a schema for the gener
 
 For example, let's say you want the LLM to generate a JSON object with specific keys for a book, such as "title," "author," and "publication_year." Your API request might look like this:
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="YOUR API KEY")
 
@@ -78,5 +78,15 @@ We do not support the entirety of the [JSON Schema specification](https://json-s
 - [Schema Composition](https://json-schema.org/understanding-json-schema/reference/combining#schema-composition) (`anyOf`, `allOf`, `oneOf` and `not`)
 - [Numeric Ranges](https://json-schema.org/understanding-json-schema/reference/numeric#range) (`maximum` and  `minimum`)
 - [Array Length Ranges](https://json-schema.org/understanding-json-schema/reference/array#length) (`minItems` and `maxItems`) 
-- [String Length](https://json-schema.org/understanding-json-schema/reference/string#length) (`maxLength` and `minLength`)
-- [Regular Expressions](https://json-schema.org/understanding-json-schema/reference/string#regexp)
+- String limitations: 
+  - [String Length](https://json-schema.org/understanding-json-schema/reference/string#length) (`maxLength` and `minLength`)
+  - The following are not supported in [Regular Expressions](https://json-schema.org/understanding-json-schema/reference/string#regexp):
+    - `^`
+    - `$`
+    - `?=`
+    - `?!`
+  - The following are the only supported [formats](https://json-schema.org/understanding-json-schema/reference/string#format):
+    - `date-time`
+    - `uuid`
+    - `date`
+    - `time`
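To make these limits concrete, here is a minimal sketch of a request whose schema stays within them, assuming the `response_format` shape shown earlier on this page; the model name and the `published_date` field are illustrative choices rather than part of the original example:

```python PYTHON
import cohere
co = cohere.Client(api_key="YOUR API KEY")

# Sketch only: the schema avoids the unsupported features listed above
# (no schema composition, numeric or length ranges, and no `^`/`$`/lookahead
# regex patterns) and uses "date", one of the supported string formats.
res = co.chat(
    model="command-r-plus",
    message="Generate a JSON describing a book, with a title, an author, and a published_date.",
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "required": ["title", "author", "published_date"],
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
                "published_date": {"type": "string", "format": "date"},
            },
        },
    },
)
print(res.text)
```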
diff --git a/fern/pages/text-generation/tokens-and-tokenizers.mdx b/fern/pages/text-generation/tokens-and-tokenizers.mdx
index e9182f884..83e211af2 100644
--- a/fern/pages/text-generation/tokens-and-tokenizers.mdx
+++ b/fern/pages/text-generation/tokens-and-tokenizers.mdx
@@ -53,7 +53,7 @@ The cache for the tokenizer configuration is declared for each client instance.
 
 If you are doing development work before going to production with your application, this might be slow if you are just experimenting by redefining the client initialization. Cohere API offers endpoints for `tokenize` and `detokenize` which avoids downloading the tokenizer configuration file. In the Python SDK, these can be accessed by setting `offline=False` like so:
 
-```python
+```python PYTHON
 import cohere  
 co = cohere.Client(api_key="")
 
@@ -64,7 +64,7 @@ co.tokenize(text="caterpillar", model="command-r", offline=False) # -> [74, 2340
 
 Alternatively, the latest version of the tokenizer can be downloaded manually:
 
-```python
+```python PYTHON
 # pip install tokenizers
 
 from tokenizers import Tokenizer  
@@ -82,7 +82,7 @@ tokenizer.encode(sequence="...", add_special_tokens=False)
 
 The URL for the tokenizer should be obtained dynamically by calling the [Models API](/reference/get-model). Here is a sample response for the Command R model:
 
-```json
+```json JSON
 {  
   "name": "command-r",  
   ...
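As a rough sketch of obtaining the URL dynamically and loading the result, assuming the Models API response exposes the tokenizer location in a `tokenizer_url` field (as the paragraph above implies) and reusing the `tokenizers` library from the previous snippet:

```python PYTHON
import cohere
import requests
from tokenizers import Tokenizer

co = cohere.Client(api_key="")

# Assumption: the Models API response includes a `tokenizer_url` field
# pointing at the latest tokenizer configuration for the model.
model_info = co.models.get("command-r")

# Download the configuration and load it with the `tokenizers` library.
tokenizer_config = requests.get(model_info.tokenizer_url).text
tokenizer = Tokenizer.from_str(tokenizer_config)

print(tokenizer.encode(sequence="caterpillar", add_special_tokens=False).tokens)
```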
diff --git a/fern/pages/text-generation/tools/multi-step-tool-use.mdx b/fern/pages/text-generation/tools/multi-step-tool-use.mdx
index 2a553495d..33f66345f 100644
--- a/fern/pages/text-generation/tools/multi-step-tool-use.mdx
+++ b/fern/pages/text-generation/tools/multi-step-tool-use.mdx
@@ -20,7 +20,7 @@ Also, note that multi-step is enabled by default.
 
 ### Step 1: Define the tools
 
-```python
+```python PYTHON
 # define the `web_search` tool.
 
 def web_search(query: str) -> list[dict]:
@@ -45,7 +45,7 @@ web_search_tool = {
 
 ### Step 2: Ask model for tool calls and send back tool results
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="")
 
@@ -107,18 +107,10 @@ That said single-step tool use only facilitates the model calling multiple tools
 
 #### When Should I Use Multi-step Tool Use?
 
-For more complex queries, such as those that require multiple steps, it's probably better to operate in multi-step mode. You can do this by setting `enable_multistep=True` and providing a list of tools through the Chat API. In multi-step mode, the model can reason across steps and select multiple tools to answer a question completely. 
-
-To illustrate, imagine you give the Chat API a query like "What was the weather where I was yesterday," along with a location tool (to return the user’s location given a timestamp) and a weather tool (to return the weather at a given location). Here's what happens:
-
-- First, the model will make a plan, which consists in first calling the location tool (step 1), and then calling the weather tool (step 2) based on the output of the location tool.
-- Then, the model receives the results of these tool calls and the underlying model's reasoning.
-- In a subsequent call, the model will determine that it still doesn’t have all the information required to answer, and select another tool. 
-- Etc.
+For more complex queries, it's probably better to operate in multi-step mode, which is the default (you can operate in single-step mode by setting `force_single_step=True`). In multi-step mode, the model can reason across steps and select multiple tools to answer a question completely.
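As a minimal sketch of the difference (the tool definition below is condensed from Step 1 above, and the weather question reuses the example this page previously walked through), the only change between the two modes is the `force_single_step` flag:

```python PYTHON
import cohere
co = cohere.Client(api_key="")

# A minimal tool definition, following the format from Step 1 above.
tools = [
    {
        "name": "web_search",
        "description": "Returns relevant document snippets for a textual query retrieved from the internet.",
        "parameter_definitions": {
            "query": {
                "description": "Query to search the internet with",
                "type": "str",
                "required": True,
            }
        },
    }
]

# Multi-step (the default): the model can plan, call tools over several
# steps, and reason over intermediate results before answering.
multi_step_response = co.chat(
    message="What was the weather where I was yesterday?",
    tools=tools,
)

# Single-step: the model is limited to a single round of tool calls.
single_step_response = co.chat(
    message="What was the weather where I was yesterday?",
    tools=tools,
    force_single_step=True,
)
```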
 
 #### What is the difference between tool use and Retrieval Augmented Generation (RAG)?
 
 Tool use is a natural extension of retrieval augmented generation (RAG). RAG is about enabling the model to interact with an information retrieval system (like a vector database). Our models are trained to be excellent at RAG use cases.
 
-Tool use pushes this further, allowing Cohere models to go far beyond information retrieval, interact with search engines, APIs, functions, databases, and many other tools.
-
+Tool use pushes this further, allowing Cohere models to go far beyond information retrieval and interact with search engines, APIs, functions, databases, and many other tools.
\ No newline at end of file
diff --git a/fern/pages/text-generation/tools/multi-step-tool-use/implementing-a-multi-step-agent-with-langchain.mdx b/fern/pages/text-generation/tools/multi-step-tool-use/implementing-a-multi-step-agent-with-langchain.mdx
index 2abfb1479..77a4a3b90 100644
--- a/fern/pages/text-generation/tools/multi-step-tool-use/implementing-a-multi-step-agent-with-langchain.mdx
+++ b/fern/pages/text-generation/tools/multi-step-tool-use/implementing-a-multi-step-agent-with-langchain.mdx
@@ -18,7 +18,7 @@ Multi-step tool use with Cohere can be implemented using the [Langchain framewor
 
 First, we'll install the dependencies. (Note: the `!` is required for notebooks, but you must omit it if you're in the command line).
 
-```python
+```python PYTHON
 ! pip install --quiet langchain langchain_cohere langchain_experimental
 ```
 
@@ -28,7 +28,7 @@ Below, we've included two code snippets, equipping the agent with the Web Search
 
 #### Example: define the Web Search tool
 
-```python
+```python PYTHON
 from langchain_community.tools.tavily_search import TavilySearchResults
 
 os.environ["TAVILY_API_KEY"] = #
@@ -46,7 +46,7 @@ internet_search.args_schema = TavilySearchInput
 
 #### Example: define the Python Interpreter tool
 
-```python
+```python PYTHON
 from langchain.agents import Tool
 from langchain_experimental.utilities import PythonREPL
 
@@ -68,7 +68,7 @@ Even better any Python function can easily be _transformed_ into a Langchain too
 
 #### Example: define a custom tool
 
-```python
+```python PYTHON
 
 from langchain_core.tools import tool
 import random
@@ -96,7 +96,7 @@ random_operation_tool.args_schema = random_operation_inputs
 
 Third, create a ReAct agent in Langchain. The model can dynamically pick the right tool(s) for the user query, call them in a sequence, analyze the results, and self-reflect. Note that your ReAct agent can optionally take an input preamble.
 
-```python
+```python PYTHON
 from langchain.agents import AgentExecutor
 from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent
 from langchain_core.prompts import ChatPromptTemplate
@@ -131,7 +131,7 @@ agent_executor = AgentExecutor(agent=agent,
 
 Finally, call your agent with a question!
 
-```python
+```python PYTHON
 agent_executor.invoke({
    "input": "I want to write an essay about the Roman Empire. Any tips for writing an essay? Any fun facts?",
    "preamble": preamble,
@@ -213,7 +213,7 @@ Here are some fun facts about the Roman Empire:
 
 Beyond the web search tool and the Python interpreter tool shared in the code snippets above, we have found some tools to be particularly useful. Here's an example of leveraging a vector store for greater functionality:
 
-```python
+```python PYTHON
 # You can easily equip your agent with a vector store!
 
 from langchain.text_splitter import RecursiveCharacterTextSplitter
@@ -263,7 +263,7 @@ So far, we asked one-off questions to the ReAct agent. In many enterprise applic
 
 The ReAct agent can handle multi-turn conversations by using `chat_history`.
 
-```python
+```python PYTHON
 # Step 1: Construct the chat history as a list of LangChain Messages, ending with the last user message
 from langchain_core.messages import HumanMessage, AIMessage
 
@@ -300,7 +300,7 @@ Yes. The ReAct agent from Cohere comes out of the box with the ability to answer
 
 For example, let’s look at the following question:
 
-```python
+```python PYTHON
 agent_executor.invoke({
    "input": "Hey how are you?",
 })
@@ -311,7 +311,7 @@ By inspecting the logs, we see that the ReAct agent decided to just respond dire
 ````asp
 > Entering new AgentExecutor chain...
 Plan: I will respond to the user's greeting.
-Action: ```json
+Action: ```json JSON
 [
     {
         "tool_name": "directly_answer",
diff --git a/fern/pages/text-generation/tools/parameter-types-in-tool-use.mdx b/fern/pages/text-generation/tools/parameter-types-in-tool-use.mdx
index 1a0291c12..732a37807 100644
--- a/fern/pages/text-generation/tools/parameter-types-in-tool-use.mdx
+++ b/fern/pages/text-generation/tools/parameter-types-in-tool-use.mdx
@@ -62,7 +62,7 @@ response = co.chat(
 
 ### With specific element types
 
-```python
+```python PYTHON
 tools = [
    {
        "name": "query_daily_sales_report",
@@ -81,7 +81,7 @@ tools = [
 
 ### Without specific element types
 
-```python
+```python PYTHON
 tools = [
     {
         "name": "query_daily_sales_report",
@@ -103,7 +103,7 @@ tools = [
 
 To make sure a tool only accepts certain values you can list those values in the parameter's description. For example, you can say "Possible enum values: customer, supplier."
 
-```python
+```python PYTHON
 tools = [
     {
         "name": "fetch_contacts",
@@ -125,7 +125,7 @@ tools = [
 
 To ensure a tool is called with a default value it's recommended to specify the default on the tool's implementation and use required: False whenever possible. When this is not possible you can specify the default in the parameter's description (with required: True). For example:
 
-```python
+```python PYTHON
 tools = [
     {
         "name": "fetch_contacts",
@@ -147,7 +147,7 @@ tools = [
 
 We recommend using individual parameters whenever possible. However, when that's not possible, to make sure a tool is called with a specific array or dictionary structure you can specify the keys in the parameter's description. For example:
 
-```python
+```python PYTHON
 tools = [
     {
         "name": "plot_daily_sales_volume",
@@ -170,7 +170,7 @@ tools = [
 
 It's possible to call a tool that accepts custom Python objects, for example a data class.
 
-```python
+```python PYTHON
 from dataclasses import dataclass
 
 
diff --git a/fern/pages/text-generation/tools/single-step-vs-multi-step.mdx b/fern/pages/text-generation/tools/single-step-vs-multi-step.mdx
index 96d4bbbb0..2b12f6261 100644
--- a/fern/pages/text-generation/tools/single-step-vs-multi-step.mdx
+++ b/fern/pages/text-generation/tools/single-step-vs-multi-step.mdx
@@ -30,7 +30,7 @@ The developer provides the sales database and the products database to the model
 
 Observe that, for each tool, the developer describes the tool name, description, and inputs. Each input can have a type and can be marked as required.
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="")
 
@@ -85,7 +85,7 @@ response = co.chat(
 
 The model's response contains the list of appropriate tools to call in order to answer the user's question, as well as the appropriate inputs for each tool call.
 
-```python
+```python PYTHON
 print("The model recommends doing the following tool calls:")
 print("\n".join(str(tool_call) for tool_call in response.tool_calls))
 
@@ -107,7 +107,7 @@ print("\n".join(str(tool_call) for tool_call in response.tool_calls))
 
 Now, the developer will query the appropriate tools and receive a tool result in return.
 
-```python
+```python PYTHON
 tool_results = []
 # Iterate over the tool calls generated by the model
 for tool_call in response.tool_calls:
@@ -183,7 +183,7 @@ print(json.dumps(tool_results, indent=4))
 
 Call the chat endpoint again with the tool results to get the final model answer. Note that this is done through the `tool_results` parameter, with the other parameters operating as expected.
 
-```python
+```python PYTHON
 response = co.chat(
    message=message,
    tools=tools,
diff --git a/fern/pages/text-generation/tools/tool-use.mdx b/fern/pages/text-generation/tools/tool-use.mdx
index dcf74feda..0b5d2de5c 100644
--- a/fern/pages/text-generation/tools/tool-use.mdx
+++ b/fern/pages/text-generation/tools/tool-use.mdx
@@ -78,7 +78,7 @@ The developer provides the sales database and the products database to the model
 
 Observe that, for each tool, the developer describes the tool name, description, and inputs. Each input can have a type and can be marked as required.
 
-```python
+```python PYTHON
 import cohere
 co = cohere.Client(api_key="")
 
@@ -133,7 +133,7 @@ response = co.chat(
 
 The model's response contains the list of appropriate tools to call in order to answer the user's question, as well as the appropriate inputs for each tool call.
 
-```python
+```python PYTHON
 print("The model recommends doing the following tool calls:")
 print("\n".join(str(tool_call) for tool_call in response.tool_calls))
 
@@ -155,7 +155,7 @@ print("\n".join(str(tool_call) for tool_call in response.tool_calls))
 
 Now, the developer will query the appropriate tools and receive a tool result in return.
 
-```python
+```python PYTHON
 tool_results = []
 # Iterate over the tool calls generated by the model
 for tool_call in response.tool_calls:
@@ -231,7 +231,7 @@ print(json.dumps(tool_results, indent=4))
 
 Call the chat endpoint again with the tool results to get the final model answer. Note that this is done through the `tool_results` parameter, with the other parameters operating as expected.
 
-```python
+```python PYTHON
 response = co.chat(
    message=message,
    tools=tools,
@@ -273,7 +273,7 @@ These citations are optional — you can decide to ignore them. Having said that
 
 Developers can control the granularity of these citations. Simply split tool results into multiple tool result objects (`tool_results` accepts lists). The language model will then cite tool results at the specified level of granularity.
 
-```python
+```python PYTHON
 print("Citations that support the final answer:")
 for cite in response.citations:
   print(cite)
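As a rough sketch of that splitting (continuing with the `co`, `message`, and `tools` variables from the earlier steps on this page; the report values are illustrative), one coarse tool output becomes two smaller result objects that citations can then point at individually:

```python PYTHON
# Sketch only: instead of one result object containing the whole report,
# split it into two objects so citations can reference each part separately.
tool_results = [
    {
        "call": {"name": "query_daily_sales_report", "parameters": {"day": "2023-09-29"}},
        "outputs": [{"date": "2023-09-29", "summary": "Total Sales Amount: 10000"}],
    },
    {
        "call": {"name": "query_daily_sales_report", "parameters": {"day": "2023-09-29"}},
        "outputs": [{"date": "2023-09-29", "summary": "Total Units Sold: 250"}],
    },
]

response = co.chat(
    message=message,
    tools=tools,
    tool_results=tool_results,
)
```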
diff --git a/scripts/cookbooks-mdx/agent-api-calls.mdx b/scripts/cookbooks-mdx/agent-api-calls.mdx
index 65c12558b..38dd17256 100644
--- a/scripts/cookbooks-mdx/agent-api-calls.mdx
+++ b/scripts/cookbooks-mdx/agent-api-calls.mdx
@@ -201,7 +201,7 @@ With this approach, we bring together the best of two worlds: the ability of LLM
 # Step 1: Setup
 
 
-```python
+```python PYTHON
 # Uncomment if you need to install the following packages
 # !pip install cohere
 # !pip install python-dotenv
@@ -209,7 +209,7 @@ With this approach, we bring together the best of two worlds: the ability of LLM
 ```
 
 
-```python
+```python PYTHON
 import os
 import json
 import re
@@ -223,7 +223,7 @@ from langchain_core.tools import tool
 ```
 
 
-```python
+```python PYTHON
 # load the cohere api key
 os.environ["COHERE_API_KEY"] = getpass.getpass()
 ```
@@ -233,7 +233,7 @@ os.environ["COHERE_API_KEY"] = getpass.getpass()
 Here we create a tool which implements the deterministic function to extract alphanumeric strings from the user's query and match them to the right parameter.
 
 
-```python
+```python PYTHON
 @tool
 def regex_extractor(user_query: str) -> dict:
     """Function which, given the query from the user, returns a dictionary parameter:value."""
@@ -258,7 +258,7 @@ tools=[regex_extractor]
 ```
 
 
-```python
+```python PYTHON
 # Let's define the preamble for the Agent.
 # The preamble includes info about:
 # - the tool the Agent has access to
@@ -287,7 +287,7 @@ Search products sport | Search products dress and jumpsuit | [{'taxonomies': ['S
 ```
 
 
-```python
+```python PYTHON
 # Define the prompt
 prompt = ChatPromptTemplate.from_template("{input}")
 # Define the agent
@@ -306,7 +306,7 @@ agent_executor = AgentExecutor(agent=agent,
 ```
 
 
-```python
+```python PYTHON
 # finally, let's write a function to convert the Agents output to a json
 def convert_to_json(string: str) -> json:
     return json.loads(
@@ -322,7 +322,7 @@ def convert_to_json(string: str) -> json:
 Let's now test the Agent we just defined!
 
 
-```python
+```python PYTHON
 query_1 = "Look for urn:75f2b737-06dd-4399-9206-a6c11b65138e, GLCMS004AGTCAMIS; 0000234GLCMS0100ANORAKCAA, GLCHL000CGUCHALE"
 response_1 = agent_executor.invoke(
             {
@@ -340,7 +340,7 @@ response_1 = agent_executor.invoke(
     {'tool_name': 'regex_extractor', 'parameters': {'user_query': 'Look for urn:75f2b737-06dd-4399-9206-a6c11b65138e, GLCMS004AGTCAMIS; 0000234GLCMS0100ANORAKCAA, GLCHL000CGUCHALE'}}
     {'nmgs': ['0000234GLCMS0100ANORAKCAA'], 'objref': ['GLCMS004AGTCAMIS', 'GLCHL000CGUCHALE'], 'urn': ['urn:75f2b737-06dd-4399-9206-a6c11b65138e']}Relevant Documents: 0
     Cited Documents: 0
-    Answer: ```json
+    Answer: ```json JSON
     [
         {
             "urn": ["urn:75f2b737-06dd-4399-9206-a6c11b65138e"],
@@ -349,7 +349,7 @@ response_1 = agent_executor.invoke(
         }
     ]
     ```
-    Grounded answer: ```json
+    Grounded answer: ```json JSON
      [
         {
             "urn": ["urn:75f2b737-06dd-4399-9206-a6c11b65138e"],
@@ -366,7 +366,7 @@ In the reasoning chain above, we can see that the Agent uses the tool we provide
 The output of the tool is then used to generate the request.
 
 
-```python
+```python PYTHON
 # let's have a look at the final output
 convert_to_json(response_1['output'])
 ```
@@ -383,7 +383,7 @@ convert_to_json(response_1['output'])
 As mentioned above, the Agent can use the tool when specific alphanumeric patterns have to be extracted from the query; however, it can also generate the output based on its semantic understanding of the query. For example:
 
 
-```python
+```python PYTHON
 query_2 = "I need tennis products"
 
 response_2 = agent_executor.invoke(
@@ -402,7 +402,7 @@ response_2 = agent_executor.invoke(
     {'tool_name': 'regex_extractor', 'parameters': {'user_query': 'I need tennis products'}}
     {}Relevant Documents: None
     Cited Documents: None
-    Answer: ```json
+    Answer: ```json JSON
     [
         {
             "taxonomies": [
@@ -411,7 +411,7 @@ response_2 = agent_executor.invoke(
         }
     ]
     ```
-    Grounded answer: ```json
+    Grounded answer: ```json JSON
      [
      {
      "taxonomies": [
@@ -427,7 +427,7 @@ response_2 = agent_executor.invoke(
 The Agent runs the tool to check if any target string was in the query, then it generated the request body based on its understanding.
 
 
-```python
+```python PYTHON
 convert_to_json(response_2['output'])
 ```
 
@@ -441,7 +441,7 @@ convert_to_json(response_2['output'])
 Finally, the two paths to generation - deterministic and semantic - can be applied in parallel by the Agent, as shown below:
 
 
-```python
+```python PYTHON
 query_3 = "Look for GLBRL0000GACHALE, nmg 0000234GLCZD0000GUREDTOAA and car products"
 
 response_3 = agent_executor.invoke(
@@ -460,7 +460,7 @@ response_3 = agent_executor.invoke(
     {'tool_name': 'regex_extractor', 'parameters': {'user_query': 'Look for GLBRL0000GACHALE, nmg 0000234GLCZD0000GUREDTOAA and car products'}}
     {'nmgs': ['0000234GLCZD0000GUREDTOAA'], 'objref': ['GLBRL0000GACHALE']}Relevant Documents: 0
     Cited Documents: 0
-    Answer: ```json
+    Answer: ```json JSON
     [
         {
             "objref": ["GLBRL0000GACHALE"],
@@ -471,7 +471,7 @@ response_3 = agent_executor.invoke(
         }
     ]
     ```
-    Grounded answer: ```json
+    Grounded answer: ```json JSON
      [
         {
             "objref": ["GLBRL0000GACHALE"],
@@ -487,7 +487,7 @@ response_3 = agent_executor.invoke(
 
 
 
-```python
+```python PYTHON
 convert_to_json(response_3['output'])
 ```
 
diff --git a/scripts/cookbooks-mdx/agent-short-term-memory.mdx b/scripts/cookbooks-mdx/agent-short-term-memory.mdx
index afaeb0429..2c446de03 100644
--- a/scripts/cookbooks-mdx/agent-short-term-memory.mdx
+++ b/scripts/cookbooks-mdx/agent-short-term-memory.mdx
@@ -197,7 +197,7 @@ Below, we show that, with augmented memory objects, the Agent is more aware of t
 # Step 1: Setup the Prompt and the Agent
 
 
-```python
+```python PYTHON
 # Uncomment if you need to install the following packages
 # !pip install cohere
 # !pip install python-dotenv
@@ -205,7 +205,7 @@ Below, we show that, with augmented memory objects, the Agent is more aware of t
 ```
 
 
-```python
+```python PYTHON
 import os
 import pandas as pd
 import getpass
@@ -222,25 +222,25 @@ from langchain_core.messages.human import HumanMessage
 ```
 
 
-```python
+```python PYTHON
 # load the cohere api key
 os.environ["COHERE_API_KEY"] = getpass.getpass()
 ```
 
 
-```python
+```python PYTHON
 # Load the data
 revenue_table = pd.read_csv('revenue_table.csv')
 ```
 
 
-```python
+```python PYTHON
 # Define the prompt
 prompt = ChatPromptTemplate.from_template("{input}")
 ```
 
 
-```python
+```python PYTHON
 # Define the tools
 python_repl = PythonREPL()
 python_tool = Tool(
@@ -258,7 +258,7 @@ tools=[python_tool]
 ```
 
 
-```python
+```python PYTHON
 # Define the agent
 llm = ChatCohere(model="command-r", temperature=0)
 
@@ -280,7 +280,7 @@ agent_executor = AgentExecutor(agent=agent,
 
 
 
-```python
+```python PYTHON
 # let's start the conversation with a question about the csv we have loaded
 q1 = "read revenue_table.csv and show me the column names"
 a1=agent_executor.invoke({
@@ -312,7 +312,7 @@ a1=agent_executor.invoke({
 
 
 
-```python
+```python PYTHON
 # nice! now let's ask a follow-up question
 q2 = "plot revenue numbers"
 a2_no_mem = agent_executor.invoke({
@@ -324,7 +324,7 @@ a2_no_mem = agent_executor.invoke({
     
     > Entering new AgentExecutor chain...
     Plan: I will ask the user for clarification on what data they would like to visualise.
-    Action: ```json
+    Action: ```json JSON
     [
         {
             "tool_name": "directly_answer",
@@ -348,7 +348,7 @@ Without memory, the model cannot answer follow up questions because it misses th
 Here we will populate the chat history only with the generations from the model. This is the current approach used, e.g., here: https://python.langchain.com/docs/modules/agents/how_to/custom_agent/
 
 
-```python
+```python PYTHON
 # let's answer the followup question above with the new setup
 a2_mem_ai = agent_executor.invoke({
    "input": q2,
@@ -384,7 +384,7 @@ Also in this case, the model cannot manage the follow up question. The reason is
 # Step 4: Conversation with Memory using AI Messages and Human Messages
 
 
-```python
+```python PYTHON
 a2_mem_ai_hum = agent_executor.invoke({
    "input": q2,
    "chat_history": [HumanMessage(content=q1),
@@ -417,7 +417,7 @@ a2_mem_ai_hum = agent_executor.invoke({
 It works! Let's go on with the conversation.
 
 
-```python
+```python PYTHON
 q3 = "set the min of y axis to zero and the max to 1000"
 a3_mem_ai_hum = agent_executor.invoke({
    "input": q3,
@@ -463,7 +463,7 @@ Reasoning chains can be very long, especially in the cases that contain errors a
 To avoid this issue, we need a way to extract the relevant info from the previous turns. Below, we propose a simple approach to info extraction. We format the extracted info in such a way to enhance human interpretability. We call the objects passed in the chat history *augmented memory objects*.
 
 
-```python
+```python PYTHON
 # function to create augmented memory objects
 def create_augmented_mem_objs(output_previous_turn: dict) -> str:
     """Function to convert the output of a ReAct agent to a compact and interpretable representation"""
@@ -488,7 +488,7 @@ def create_augmented_mem_objs(output_previous_turn: dict) -> str:
 ```
 
 
-```python
+```python PYTHON
 augmented_mem_obj_a1 = create_augmented_mem_objs(a1)
 augmented_mem_obj_a2 = create_augmented_mem_objs(a2_mem_ai_hum)
 ```
@@ -496,7 +496,7 @@ augmented_mem_obj_a2 = create_augmented_mem_objs(a2_mem_ai_hum)
 Below, an example of the augmented memory object generated by the model. You can see that the agent now has full visibility on what it did in the previous step.
 
 
-```python
+```python PYTHON
 print(augmented_mem_obj_a2)
 ```
 
@@ -519,7 +519,7 @@ print(augmented_mem_obj_a2)
 
 
 
-```python
+```python PYTHON
 a3_mem_ai_hum_amo = agent_executor.invoke({
     "input": q3,
     "chat_history": [SystemMessage(content=augmented_mem_obj_a1),
diff --git a/scripts/cookbooks-mdx/agentic-multi-stage-rag.mdx b/scripts/cookbooks-mdx/agentic-multi-stage-rag.mdx
index 57c194dec..30a82a3da 100644
--- a/scripts/cookbooks-mdx/agentic-multi-stage-rag.mdx
+++ b/scripts/cookbooks-mdx/agentic-multi-stage-rag.mdx
@@ -200,7 +200,7 @@ As you will see more below, the multi-stage retrieval is achieved by adding a ne
 
 
 
-```python
+```python PYTHON
 import os
 from pprint import pprint
 
@@ -210,7 +210,7 @@ from sklearn.metrics.pairwise import cosine_similarity
 ```
 
 
-```python
+```python PYTHON
 # versions
 print('cohere version:', cohere.__version__)
 ```
@@ -221,7 +221,7 @@ print('cohere version:', cohere.__version__)
 ## Setup 
 
 
-```python
+```python PYTHON
 COHERE_API_KEY = os.environ.get("CO_API_KEY")
 COHERE_MODEL = 'command-r-plus'
 co = cohere.Client(api_key=COHERE_API_KEY)
@@ -232,7 +232,7 @@ co = cohere.Client(api_key=COHERE_API_KEY)
 We leveraged data from [Washington Department of Transportation](https://wsdot.wa.gov/travel/bicycling-walking/bicycling-washington/bicyclist-laws-safety) and modified to fit the need of this demo. 
 
 
-```python
+```python PYTHON
 documents = [
     {
         "title": "Bicycle law",
@@ -294,7 +294,7 @@ db["embeddings"] = embeddings.embeddings
 ```
 
 
-```python
+```python PYTHON
 db
 ```
 
@@ -365,7 +365,7 @@ db
 Following functions and tools will be used in the subsequent tasks. 
 
 
-```python
+```python PYTHON
 def retrieve_documents(query: str, n=1) -> dict:
     """
     Function to retrieve documents a given query.
@@ -410,7 +410,7 @@ tools = [
 
 
 
-```python
+```python PYTHON
 def simple_rag(query, db):
     """
     Given user's query, retrieve top documents and generate response using documents parameter.
@@ -430,7 +430,7 @@ def simple_rag(query, db):
 ## Agentic RAG - cohere_agent()
 
 
-```python
+```python PYTHON
 def cohere_agent(
     message: str,
     preamble: str,
@@ -515,14 +515,14 @@ Here we are asking a question that can be answered easily with single-stage retr
 
 
 
-```python
+```python PYTHON
 question1 = "Is there a state level law for wearing helmets?"
 ```
 
 ### Simple RAG 
 
 
-```python
+```python PYTHON
 output = simple_rag(question1, db)
 print(output)
 ```
@@ -535,7 +535,7 @@ print(output)
 ### Agentic RAG 
 
 
-```python
+```python PYTHON
 preamble = """
 You are an expert assistant that helps users answers question about legal documents and policies.
 Use the provided documents to answer questions about an employee's specific situation.
@@ -567,14 +567,14 @@ The second question requires a double-stage retrieval because top matched docume
 |I live in orting, do I need to wear a helmet with a bike?|In the state of Washington, there is no law requiring you to wear a helmet when riding a bike. However, some cities and counties do require helmet use, so it is worth checking your local laws.|Yes, you do need to wear a helmet with a bike in Orting if you are under 17.|
 
 
-```python
+```python PYTHON
 question2 = "I live in orting, do I need to wear a helmet with a bike?"
 ```
 
 ### Simple RAG 
 
 
-```python
+```python PYTHON
 output = simple_rag(question2, db)
 print(output)
 ```
@@ -589,7 +589,7 @@ print(output)
 Produces same quality answer as the simple rag.
 
 
-```python
+```python PYTHON
 preamble = """
 You are an expert assistant that helps users answers question about legal documents and policies.
 Use the provided documents to answer questions about an employee's specific situation.
@@ -618,7 +618,7 @@ In order for the model to retrieve correct documents, we do two things:
 2. We update the instruction that directs the agent to keep retrieving relevant documents. 
 
 
-```python
+```python PYTHON
 def reference_extractor(query: str, documents: list[str]) -> str:
     """
     Given a query and document, find references to other documents.
@@ -683,7 +683,7 @@ tools = [
 ```
 
 
-```python
+```python PYTHON
 preamble2 = """# Instruction
 You are an expert assistant that helps users answers question about legal documents and policies.
 
diff --git a/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx b/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx
index 10058d450..23a385b81 100644
--- a/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx
+++ b/scripts/cookbooks-mdx/agentic-rag-mixed-data.mdx
@@ -202,7 +202,7 @@ Various LangChain-supported parsers can be found [here](https://python.langchain
 ## Install Dependencies
 
 
-```python
+```python PYTHON
 # there may be other dependencies that will need installation
 # ! pip install --quiet langchain langchain_cohere langchain_experimental
 # !pip --quiet install faiss-cpu tiktoken
@@ -214,7 +214,7 @@ Various LangChain-supported parsers can be found [here](https://python.langchain
 ```
 
 
-```python
+```python PYTHON
 # LLM
 import os
 from langchain.text_splitter import RecursiveCharacterTextSplitter
@@ -254,7 +254,7 @@ We have found that the best option for parsing is unstructured.io since the pars
 
 
 
-```python
+```python PYTHON
 # UNSTRUCTURED pdf loader
 # Get elements
 raw_pdf_elements = partition_pdf(
@@ -278,7 +278,7 @@ raw_pdf_elements = partition_pdf(
 ```
 
 
-```python
+```python PYTHON
 # extract table and textual objects from parser
 class Element(BaseModel):
     type: str
@@ -318,7 +318,7 @@ Below, we demonstrate the following process:
 - during inference, the multi-vector retrieval returns the full context document related to the summary
 
 
-```python
+```python PYTHON
 co = cohere.Client()
 def get_chat_output(message, preamble, chat_history, model, temp, documents=None):
     return co.chat(
@@ -349,7 +349,7 @@ def rerank_cohere(query, returned_documents,model:str="rerank-multilingual-v3.0"
 ```
 
 
-```python
+```python PYTHON
 # generate table and text summaries
 prompt_text = """You are an assistant tasked with summarizing tables and text. \ 
 Give a concise summary of the table or text. Table or text chunk: {element}. Only provide the summary and no other text."""
@@ -363,7 +363,7 @@ texts = [i.text for i in text_elements]
 ```
 
 
-```python
+```python PYTHON
 # The vectorstore to use to index the child chunks
 vectorstore = Chroma(collection_name="summaries", embedding_function=CohereEmbeddings())
 # The storage layer for the parent documents
@@ -403,7 +403,7 @@ With our database in place, we can run queries against it. The query process can
 - concatenate all the shortlisted/reranked docs and pass them to the generation model
 
 
-```python
+```python PYTHON
 def process_query(query, retriever):
     """Runs query augmentation, retrieval, rerank and final generation in one call."""
     augmented_queries=co.chat(message=query,model='command-r-plus',temperature=0.2, search_queries_only=True)
@@ -452,7 +452,7 @@ Unless the user asks for a different style of answer, you should answer in full
 We can now test out a query. In this example, the final answer can be found on page 12 of the PDF, which aligns with the response provided by the model:
 
 
-```python
+```python PYTHON
 query = "what are the charges for services in 2022"
 final_answer, final_answer_docs = process_query(query, retriever)
 print(final_answer)
@@ -478,7 +478,7 @@ We detect questions that do not require RAG by examining the `search_queries` ob
 In the example below, the `else` statement is invoked based on `query2`. We still pass in the chat history, allowing the question to be answered with only the prior context.
 
 
-```python
+```python PYTHON
 query2='divide this by two'
 augmented_queries=co.chat(message=query2,model='command-r-plus',temperature=0.2, search_queries_only=True)
 if augmented_queries.search_queries:
@@ -511,12 +511,12 @@ else:
 Here, we connect all of the pieces discussed above into one class object, which is then used as a tool for a Cohere ReAct agent. This class definition consolidates and clarify the key parameters used to define the RAG pipeline.
 
 
-```python
+```python PYTHON
 co = cohere.Client()
 ```
 
 
-```python
+```python PYTHON
 class Element(BaseModel):
     type: str
     text: Any
@@ -723,7 +723,7 @@ Unless the user asks for a different style of answer, you should answer in full
 ```
 
 
-```python
+```python PYTHON
 rag_object=RAG_pipeline(paths=["city_ny_popular_fin_report.pdf"])
 ```
 
@@ -748,7 +748,7 @@ Finally, we build a simple agent that utilizes the RAG pipeline defined above. W
 The intention behind coupling these tools is to enable the model to perform mathematical and other postprocessing operations on RAG outputs using Python.
 
 
-```python
+```python PYTHON
 from langchain.agents import Tool
 from langchain_experimental.utilities import PythonREPL
 from langchain.agents import AgentExecutor
@@ -832,12 +832,12 @@ You also have access to a python interpreter tool which you can use to run code
 ```
 
 
-```python
+```python PYTHON
 agent_object=react_agent(rag_retriever=rag_object)
 ```
 
 
-```python
+```python PYTHON
 step1_response=agent_object.run_agent("what are the charges for services in 2022 and 2023")
 ```
 
@@ -860,12 +860,12 @@ step1_response=agent_object.run_agent("what are the charges for services in 2022
 Just like earlier, we can also pass chat history to the LangChain agent to refer to for any other queries.
 
 
-```python
+```python PYTHON
 from langchain_core.messages import HumanMessage, AIMessage
 ```
 
 
-```python
+```python PYTHON
 chat_history=[
 HumanMessage(content=step1_response['input']),
 AIMessage(content=step1_response['output'])
@@ -873,7 +873,7 @@ AIMessage(content=step1_response['output'])
 ```
 
 
-```python
+```python PYTHON
 agent_object.run_agent("what is the mean of the two values",history=chat_history)
 ```
 
diff --git a/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx b/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx
index c86301df8..3e3690587 100644
--- a/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx
+++ b/scripts/cookbooks-mdx/analysis-of-financial-forms.mdx
@@ -177,14 +177,14 @@ You may use this script to jumpstart financial analysis of 10-Ks or 10-Qs with C
 This cookbook relies on helpful tooling from LlamaIndex, as well as our Cohere SDK. If you're familiar with LlamaIndex, it should be easy to slot this process into your own productivity flows.
 
 
-```python
+```python PYTHON
 %%capture
 !sudo apt install tesseract-ocr poppler-utils
 !pip install "cohere<5" langchain llama-index llama-index-embeddings-cohere llama-index-postprocessor-cohere-rerank pytesseract pdf2image
 ```
 
 
-```python
+```python PYTHON
 # Due to compatibility issues, we need to do imports like this
 from llama_index.core.schema import TextNode
 
@@ -193,7 +193,7 @@ from llama_index.core.schema import TextNode
 ```
 
 
-```python
+```python PYTHON
 import cohere
 from getpass import getpass
 
@@ -225,7 +225,7 @@ You may run the following cells to load a 10-K that has already been preprocesse
 > 💡 If you'd like to run the OCR pipeline yourself, you can find more info in the section titled **PDF to Text using OCR and `pdf2image`**.
 
 
-```python
+```python PYTHON
 # Using langchain here since they have access to the Unstructured Data Loader powered by unstructured.io
 from langchain_community.document_loaders import UnstructuredURLLoader
 
@@ -263,7 +263,7 @@ We choose to use LlamaIndex's `SentenceSplitter` in this case in order to get th
 You may also apply further transformations from the LlamaIndex repo if you so choose. Take a look at the [docs](https://docs.llamaindex.ai/en/stable/understanding/loading/loading.html) for inspiration on what is possible with transformations.
 
 
-```python
+```python PYTHON
 from llama_index.core.ingestion import IngestionPipeline
 from llama_index.core.node_parser import SentenceSplitter
 
@@ -341,7 +341,7 @@ nodes = pipeline.run(nodes=nodes)
 Loading the document into a LlamaIndex vector store will allow us to use the Cohere embedding model and rerank model to retrieve the relevant parts of the form to pass into Command.
 
 
-```python
+```python PYTHON
 from llama_index.core import Settings, VectorStoreIndex
 
 from llama_index.postprocessor.cohere_rerank import CohereRerank
@@ -384,7 +384,7 @@ In order to do RAG, we need a query or a set of queries to actually _do_ the ret
 To learn more about document mode and query generation, check out [our documentation](https://docs.cohere.com/docs/retrieval-augmented-generation-rag).
 
 
-```python
+```python PYTHON
 PROMPT = "List the overall revenue numbers for 2021, 2022, and 2023 in the 10-K as bullet points, then explain the revenue growth trends."
 
 # Get queries to run against our index from the command-nightly model
@@ -408,7 +408,7 @@ else:
 Now, with the queries in hand, we search against our vector index.
 
 
-```python
+```python PYTHON
 # Convenience function for formatting documents
 def format_for_cohere_client(nodes_):
     return [
@@ -451,7 +451,7 @@ You can see this for yourself by inspecting the `response.citations` field to ch
 You can learn more about the `chat` endpoint by checking out the API reference [here](https://docs.cohere.com/reference/chat).
 
 
-```python
+```python PYTHON
 # Make a request to the model
 response = co.chat(
     message=PROMPT,
@@ -487,7 +487,7 @@ print(response.text)
 
 
 
-```python
+```python PYTHON
 # Helper function for displaying response WITH citations
 def insert_citations(text: str, citations: list[dict]):
     """
@@ -551,7 +551,7 @@ To go from PDF to text with PyTesseract, there is an intermediary step of conver
 To do this, we use `pdf2image`, which uses `poppler` behind the scenes to convert the PDF into a PNG. From there, we can pass the image (which is a PIL Image object) directly into the OCR tool.
 
 
-```python
+```python PYTHON
 import pytesseract
 from pdf2image import convert_from_path
 
@@ -566,7 +566,7 @@ pages = [pytesseract.image_to_string(page) for page in pages]
 ## Token count / price comparison and latency
 
 
-```python
+```python PYTHON
 def get_response(prompt, rag):
     if rag:
         # Get queries to run against our index from the command-nightly model
@@ -615,7 +615,7 @@ def get_response(prompt, rag):
 
 
 
-```python
+```python PYTHON
 prompt_template = """# financial form 10-K
 {tenk}
 
@@ -636,7 +636,7 @@ full_context_prompt = prompt_template.format(tenk=edgar_10k, question=PROMPT)
 
 
 
-```python
+```python PYTHON
 r1 = get_response(PROMPT, rag=True)
 r2 = get_response(full_context_prompt, rag=False)
 ```
@@ -652,7 +652,7 @@ r2 = get_response(full_context_prompt, rag=False)
 
 
 
-```python
+```python PYTHON
 def get_price(r):
     return (r.token_count["prompt_tokens"] * 0.5 / 10e6) + (r.token_count["response_tokens"] * 1.5 / 10e6)
 ```
@@ -668,7 +668,7 @@ def get_price(r):
 
 
 
-```python
+```python PYTHON
 rag_price = get_price(r1)
 full_context_price = get_price(r2)
 
@@ -689,7 +689,7 @@ print(f"RAG is {(full_context_price - rag_price) / full_context_price:.0%} cheap
 
 
 
-```python
+```python PYTHON
 %timeit get_response(PROMPT, rag=True)
 ```
 
@@ -707,7 +707,7 @@ print(f"RAG is {(full_context_price - rag_price) / full_context_price:.0%} cheap
 
 
 
-```python
+```python PYTHON
 %timeit get_response(full_context_prompt, rag=False)
 ```
 
diff --git a/scripts/cookbooks-mdx/analyzing-hacker-news.mdx b/scripts/cookbooks-mdx/analyzing-hacker-news.mdx
index cca47fd19..3f2ff3786 100644
--- a/scripts/cookbooks-mdx/analyzing-hacker-news.mdx
+++ b/scripts/cookbooks-mdx/analyzing-hacker-news.mdx
@@ -148,7 +148,7 @@ In this notebook we take thousands of the most popular posts from Hacker News an
 
 Let's start by installing the tools we'll need and then importing them.
 
-```python
+```python PYTHON
 !pip install cohere umap-learn altair annoy bertopic
 ```
 
@@ -213,7 +213,7 @@ Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/alexiscook/anaconda3
 Requirement already satisfied: mpmath>=0.19 in /Users/alexiscook/anaconda3/lib/python3.11/site-packages (from sympy->torch>=1.11.0->sentence-transformers>=0.4.1->bertopic) (1.3.0)
 ```
 
-```python
+```python PYTHON
 import cohere
 import numpy as np
 import pandas as pd
@@ -231,7 +231,7 @@ pd.set_option('display.max_colwidth', None)
 
 Fill in your Cohere API key in the next cell. To do this, begin by [signing up to Cohere](https://os.cohere.ai/) (for free!) if you haven't yet. Then get your API key [here](https://dashboard.cohere.com/api-keys).
 
-```python
+```python PYTHON
 co = cohere.Client("COHERE_API_KEY") # Insert your Cohere API key
 ```
 
@@ -239,7 +239,7 @@ co = cohere.Client("COHERE_API_KEY") # Insert your Cohere API key
 
 We will use the top 3,000 posts from the Ask HN section of Hacker News. We provide a CSV containing the posts.
 
-```python
+```python PYTHON
 df = pd.read_csv('https://storage.googleapis.com/cohere-assets/blog/text-clustering/data/askhn3k_df.csv', index_col=0)
 
 print(f'Loaded a DataFrame with {len(df)} rows')
@@ -249,7 +249,7 @@ print(f'Loaded a DataFrame with {len(df)} rows')
 Loaded a DataFrame with 3000 rows
 ```
 
-```python
+```python PYTHON
 df.head()
 ```
 
@@ -382,7 +382,7 @@ df.head()
 
 We calculate the embeddings using Cohere's `embed-english-v3.0` model. The resulting embeddings matrix has 3,000 rows (one for each post) and 1024 columns (meaning each post title is represented with a 1024-dimensional embedding).
 
-```python
+```python PYTHON
 batch_size = 90
 
 embeds_list = []
@@ -406,7 +406,7 @@ embeds.shape
 
 For nearest-neighbor search, we can use the open-source Annoy library. Let's create a semantic search index and feed it all the embeddings.
 
-```python
+```python PYTHON
 search_index = AnnoyIndex(embeds.shape[1], 'angular')
 for i in range(len(embeds)):
     search_index.add_item(i, embeds[i])
@@ -423,7 +423,7 @@ True
 
 We can query neighbors of a specific post using `get_nns_by_item`.
 
-```python
+```python PYTHON
 example_id = 50
 
 similar_item_ids = search_index.get_nns_by_item(example_id,
@@ -520,7 +520,7 @@ Nearest neighbors:
 
 We're not limited to searching using existing items. If we get a query, we can embed it and find its nearest neighbors from the dataset.
 
-```python
+```python PYTHON
 query = "How can I improve my knowledge of calculus?"
 
 query_embed = co.embed(texts=[query],
@@ -625,12 +625,12 @@ Nearest neighbors:
 
 What if we want to browse the archive instead of only searching it? Let's plot all the questions in a 2D chart so you're able to visualize the posts in the archive and their similarities.
 
-```python
+```python PYTHON
 reducer = umap.UMAP(n_neighbors=100)
 umap_embeds = reducer.fit_transform(embeds)
 ```
 
-```python
+```python PYTHON
 df['x'] = umap_embeds[:,0]
 df['y'] = umap_embeds[:,1]
 
@@ -678,7 +678,7 @@ chart.interactive()
 
 Let's proceed to cluster the embeddings using KMeans from scikit-learn.
 
-```python
+```python PYTHON
 n_clusters = 8
 
 kmeans_model = KMeans(n_clusters=n_clusters, random_state=0)
@@ -687,7 +687,7 @@ classes = kmeans_model.fit_predict(embeds)
 
 ## 5- Extract major keywords from each cluster so we can identify what the cluster is about
 
-```python
+```python PYTHON
 documents =  df['title']
 documents = pd.DataFrame({"Document": documents,
                           "ID": range(len(documents)),
@@ -699,7 +699,7 @@ count = count_vectorizer.transform(documents_per_topic.Document)
 words = count_vectorizer.get_feature_names_out()
 ```
 
-```python
+```python PYTHON
 ctfidf = ClassTfidfTransformer().fit_transform(count).toarray()
 words_per_class = {label: [words[index] for index in ctfidf[label].argsort()[-10:]] for label in documents_per_topic.Topic}
 df['cluster'] = classes
@@ -710,7 +710,7 @@ df['keywords'] = df['cluster'].map(lambda topic_num: ", ".join(np.array(words_pe
 
 We can now plot the documents with their clusters and keywords
 
-```python
+```python PYTHON
 selection = alt.selection_multi(fields=['keywords'], bind='legend')
 
 chart = alt.Chart(df).transform_calculate(
diff --git a/scripts/cookbooks-mdx/article-recommender-with-text-embeddings.mdx b/scripts/cookbooks-mdx/article-recommender-with-text-embeddings.mdx
index b990cdd05..213844b7c 100644
--- a/scripts/cookbooks-mdx/article-recommender-with-text-embeddings.mdx
+++ b/scripts/cookbooks-mdx/article-recommender-with-text-embeddings.mdx
@@ -157,7 +157,7 @@ We will implement the following steps:
 
 **4: Show the top 5 recommended articles.**
 
-```python
+```python PYTHON
 ! pip install cohere
 ```
 
@@ -175,7 +175,7 @@ Installing collected packages: cohere
 Successfully installed cohere-1.3.10
 ```
 
-```python
+```python PYTHON
 import numpy as np
 import pandas as pd
 import re
@@ -194,7 +194,7 @@ Throughout this article, we'll use the [BBC news article dataset](https://www.ka
 
 We'll extract a subset of the data and, in Step 1, use the first 100 data points.
 
-```python
+```python PYTHON
 df = pd.read_csv('https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/bbc_news_subset.csv', delimiter=',')
 
 INP_START = 0
@@ -262,7 +262,7 @@ Next we turn each article text into embeddings. An [embedding](https://docs.cohe
 
 We do this by calling Cohere's [Embed endpoint](https://docs.cohere.ai/embed-reference), which takes in text as input and returns embeddings as output.
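Sketched out, the call looks roughly like this (the model name and `input_type` are assumptions):

```python PYTHON
# Sketch: embed every article and collect the vectors into a NumPy array
articles = df_inputs['Text'].tolist()

output = co.embed(
    texts=articles,
    model='embed-english-v3.0',
    input_type='search_document',
)
embeds = np.array(output.embeddings)
```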
 
-```python
+```python PYTHON
 articles = df_inputs['Text'].tolist()
 
 output = co.embed(
@@ -284,7 +284,7 @@ Next, we pick any one article to be the one the reader is currently reading (let
 
 [Cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) is a metric that measures how similar two sequences of numbers are (embeddings, in our case), and we compute it for each target-candidate pair.
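Such a helper can be sketched with scikit-learn's `cosine_similarity` (variable names are illustrative):

```python PYTHON
# Sketch: rank candidate embeddings by cosine similarity to a target embedding
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def get_similarity(target, candidates):
    # cosine_similarity expects 2D arrays: (1, dim) for the target, (n, dim) for the candidates
    candidates = np.array(candidates)
    target = np.expand_dims(np.array(target), axis=0)

    sim = cosine_similarity(target, candidates)[0]
    # Return (index, score) pairs, most similar first
    return sorted(enumerate(sim), key=lambda x: x[1], reverse=True)
```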
 
-```python
+```python PYTHON
 print(f'Choose one article ID between {INP_START} and {INP_END-1} below...')
 ```
 
@@ -292,13 +292,13 @@ print(f'Choose one article ID between {INP_START} and {INP_END-1} below...')
 Choose one article ID between 0 and 99 below...
 ```
 
-```python
+```python PYTHON
 READING_IDX = 70
 
 reading = embeds[READING_IDX]
 ```
 
-```python
+```python PYTHON
 
 from sklearn.metrics.pairwise import cosine_similarity
 
@@ -319,7 +319,7 @@ def get_similarity(target,candidates):
   return similarity_scores
 ```
 
-```python
+```python PYTHON
 similarity = get_similarity(reading,embeds)
 
 print('Target:')
@@ -357,7 +357,7 @@ A typical text classification model requires hundreds/thousands of data points t
 
 To build the classifier, we need a set of examples consisting of text (news text) and labels (news category). The BBC News dataset happens to have both (columns 'Text' and 'Category'), so this time we'll use the categories to build our examples. For this, we will set aside another portion of the dataset.
 
-```python
+```python PYTHON
 EX_START = 100
 EX_END = 200
 df_examples = df.iloc[EX_START:EX_END]
@@ -425,7 +425,7 @@ df_examples.head()
 
 With the Classify endpoint, there is a limit of 512 tokens per input. This means full articles won't fit in the examples, so as an approximation we will limit each article to its first 300 characters.
 
-```python
+```python PYTHON
 MAX_CHARS = 300
 
 def shorten_text(text):
@@ -436,7 +436,7 @@ df_examples['Text'] = df_examples['Text'].apply(shorten_text)
 
 The Classify endpoint needs a minimum of 2 examples for each category. We'll use 5 examples per category, sampled randomly from the dataset. Since we have 5 categories, we will have a total of 25 examples.
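One way to assemble these examples is sketched below (it assumes `df_examples` from above and the `ClassifyExample` class from the Cohere SDK; the sampling seed is arbitrary):

```python PYTHON
# Sketch: sample EX_PER_CAT examples per category and wrap them for the Classify endpoint
from cohere import ClassifyExample

EX_PER_CAT = 5
categories = df_examples['Category'].unique().tolist()

examples = []
for category in categories:
    sampled = df_examples[df_examples['Category'] == category].sample(
        EX_PER_CAT, random_state=0)
    for _, row in sampled.iterrows():
        examples.append(ClassifyExample(text=row['Text'], label=row['Category']))

print(f"Total number of examples: {len(examples)}")
```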
 
-```python
+```python PYTHON
 EX_PER_CAT = 5 
 
 categories = df_examples['Category'].unique().tolist()
@@ -464,7 +464,7 @@ Total number of examples: 25
 
 Once the examples are ready, we can get the classifications. Here is a function that returns the classifications for a list of input texts.
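A minimal sketch of such a function, calling the Classify endpoint and returning the predicted labels:

```python PYTHON
# Sketch: classify a batch of texts against the example set
def classify_text(texts, examples):
    classifications = co.classify(
        inputs=texts,
        examples=examples,
    )
    # One prediction per input text
    return [c.prediction for c in classifications.classifications]
```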
 
-```python
+```python PYTHON
 
 from cohere import ClassifyExample
 
@@ -485,7 +485,7 @@ def classify_text(texts, examples):
 
 Before actually using the classifier, let's first test its performance. Here we take another 100 data points as the test dataset, and the classifier will predict each one's class, i.e., its news category.
 
-```python
+```python PYTHON
 TEST_START = 200
 TEST_END = 300
 df_test = df.iloc[TEST_START:TEST_END]
@@ -553,7 +553,7 @@ df_test.head()
 
 
-```python +```python PYTHON predictions = [] BATCH_SIZE = 90 # The API accepts a maximum of 96 inputs for i in range(0, len(df_test['Text']), BATCH_SIZE): @@ -563,7 +563,7 @@ for i in range(0, len(df_test['Text']), BATCH_SIZE): actual = df_test['Category'].tolist() ``` -```python +```python PYTHON from sklearn.metrics import accuracy_score accuracy = accuracy_score(actual, predictions) @@ -586,7 +586,7 @@ We do this with the Chat endpoint. We call the endpoint by specifying a few settings, and it will generate the corresponding extractions. -```python +```python PYTHON def extract_tags(article): prompt = f"""Given an article, extract a list of tags containing keywords of that article. @@ -632,7 +632,7 @@ Let's now put everything together for our article recommender system. First, we select the target article and compute the similarity scores against the candidate articles. -```python +```python PYTHON print(f'Choose one article ID between {INP_START} and {INP_END-1} below...') ``` @@ -640,7 +640,7 @@ print(f'Choose one article ID between {INP_START} and {INP_END-1} below...') Choose one article ID between 0 and 99 below... ``` -```python +```python PYTHON READING_IDX = 70 reading = embeds[READING_IDX] @@ -650,7 +650,7 @@ similarity = get_similarity(reading,embeds) Next, we filter the articles via classification. Finally, we extract the keywords from each article and show the recommendations. -```python +```python PYTHON SHOW_TOP = 5 df_inputs = df_inputs.copy() @@ -695,7 +695,7 @@ def get_recommendations(reading_idx,similarity,show_top): break ``` -```python +```python PYTHON get_recommendations(READING_IDX,similarity,SHOW_TOP) ``` @@ -728,7 +728,7 @@ Let's try a couple of other articles in business and tech and see the output... Business article (returning recommendations around German economy and economic growth/slump): -```python +```python PYTHON READING_IDX = 1 @@ -762,7 +762,7 @@ Tags: bmw, diesel cars, robert bosch, fuel injection pump Tech article (returning recommendations around consumer devices): -```python +```python PYTHON READING_IDX = 71 diff --git a/scripts/cookbooks-mdx/basic-multi-step.mdx b/scripts/cookbooks-mdx/basic-multi-step.mdx index 8d7413796..429fd2133 100644 --- a/scripts/cookbooks-mdx/basic-multi-step.mdx +++ b/scripts/cookbooks-mdx/basic-multi-step.mdx @@ -142,7 +142,7 @@ The recommended way to achieve [multi-step tool use with Cohere](https://docs.co ## Install Dependencies -```python +```python PYTHON ! pip install --quiet langchain langchain_cohere langchain_experimental ``` @@ -162,7 +162,7 @@ The recommended way to achieve [multi-step tool use with Cohere](https://docs.co [?25h -```python +```python PYTHON import os os.environ['COHERE_API_KEY'] = ``` @@ -182,7 +182,7 @@ Plus the model can self-reflect. You can easily equip your agent with web search! -```python +```python PYTHON from langchain_community.tools.tavily_search import TavilySearchResults os.environ["TAVILY_API_KEY"] = # you can create an API key for free on Tavily's website @@ -202,7 +202,7 @@ internet_search.args_schema = TavilySearchInput You can easily equip your agent with a vector store! -```python +```python PYTHON !pip --quiet install faiss-cpu tiktoken ``` @@ -211,7 +211,7 @@ You can easily equip your agent with a vector store! 
[?25h -```python +```python PYTHON from langchain.text_splitter import RecursiveCharacterTextSplitter from langchain_community.document_loaders import WebBaseLoader from langchain_community.vectorstores import FAISS @@ -241,7 +241,7 @@ vectorstore_retriever = vectorstore.as_retriever() ``` -```python +```python PYTHON from langchain.tools.retriever import create_retriever_tool vectorstore_search = create_retriever_tool( @@ -255,7 +255,7 @@ vectorstore_search = create_retriever_tool( You can easily equip your agent with a python interpreter! -```python +```python PYTHON from langchain.agents import Tool from langchain_experimental.utilities import PythonREPL @@ -276,7 +276,7 @@ python_tool.args_schema = ToolInput You can easily equip your agent with any Python function! -```python +```python PYTHON from langchain_core.tools import tool import random @@ -305,7 +305,7 @@ The model can smartly pick the right tool(s) for the user query, call them in an Once the model considers it has enough information to answer the user question, it generates the final answer. -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent from langchain_core.prompts import ChatPromptTemplate @@ -334,7 +334,7 @@ agent_executor = AgentExecutor(agent=agent, tools=[internet_search, vectorstore_ A question that requires using a predefined tool from Langchain -```python +```python PYTHON response = agent_executor.invoke({ "input": "I want to write an essay about the Roman Empire. Any tips for writing an essay? Any fun facts?", "preamble": preamble, @@ -526,7 +526,7 @@ response['output'] A question that requires the large language model to use a custom tool. -```python +```python PYTHON response = agent_executor.invoke({ "input": "Calculate the result of the random operation of 10 and 20. Then find a few fun facts about that number, as well as its prime factors.", "preamble": preamble, @@ -579,7 +579,7 @@ response['output'] A question that requires the large language model to directly answer. -```python +```python PYTHON response = agent_executor.invoke({ "input": "Hey how are you?", "preamble": preamble, @@ -614,7 +614,7 @@ response['output'] A question that requires using multipe tools, in sequence -```python +```python PYTHON response = agent_executor.invoke({ "input": "In what year was the company that was founded as Sound of Music went public? What was its stock price in 2000 and 2010.", "preamble": preamble, @@ -653,7 +653,7 @@ response['output'] The chat history enables you to have multi-turn conversations with the ReAct agent. 
-```python +```python PYTHON from langchain_core.messages import HumanMessage, AIMessage chat_history = [ @@ -666,7 +666,7 @@ prompt = ChatPromptTemplate.from_messages(chat_history) ``` -```python +```python PYTHON agent = create_cohere_react_agent( llm=llm, tools=[internet_search, vectorstore_search, python_tool], @@ -677,7 +677,7 @@ agent_executor = AgentExecutor(agent=agent, tools=[internet_search, vectorstore_ ``` -```python +```python PYTHON response = agent_executor.invoke({ "preamble": preamble, }) diff --git a/scripts/cookbooks-mdx/basic-rag.mdx b/scripts/cookbooks-mdx/basic-rag.mdx index 10b924654..1c49b2b2d 100644 --- a/scripts/cookbooks-mdx/basic-rag.mdx +++ b/scripts/cookbooks-mdx/basic-rag.mdx @@ -154,7 +154,7 @@ In this example, we'll use a recent piece of text, that wasn't in the training d In practice, you would typically do RAG on much longer text, that doesn't fit in the context window of the model. -```python +```python PYTHON %pip install "cohere<5" --quiet ``` @@ -163,14 +163,14 @@ In practice, you would typically do RAG on much longer text, that doesn't fit in [?25h -```python +```python PYTHON import cohere API_KEY = "..." # fill in your Cohere API key here co = cohere.Client(API_KEY) ``` -```python +```python PYTHON !pip install wikipedia --quiet import wikipedia ``` @@ -180,7 +180,7 @@ import wikipedia -```python +```python PYTHON article = wikipedia.page('Dune Part Two') text = article.content print(f"The text has roughly {len(text.split())} words.") @@ -196,7 +196,7 @@ We index the document in a vector database. This requires getting the documents, ### We split the document into chunks of roughly 512 words -```python +```python PYTHON %pip install -qU langchain-text-splitters --quiet from langchain_text_splitters import RecursiveCharacterTextSplitter ``` @@ -207,7 +207,7 @@ from langchain_text_splitters import RecursiveCharacterTextSplitter [?25h -```python +```python PYTHON text_splitter = RecursiveCharacterTextSplitter( chunk_size=512, chunk_overlap=50, @@ -228,7 +228,7 @@ print(f"The text has been broken down in {len(chunks)} chunks.") Cohere embeddings are state-of-the-art. -```python +```python PYTHON model="embed-english-v3.0" response = co.embed( texts= chunks, @@ -248,12 +248,12 @@ print(f"We just computed {len(embeddings)} embeddings.") We use the simplest vector database ever: a python dictionary using `np.array()`. -```python +```python PYTHON !pip install numpy --quiet ``` -```python +```python PYTHON import numpy as np vector_database = {i: np.array(embedding) for i, embedding in enumerate(embeddings)} ``` @@ -264,7 +264,7 @@ vector_database = {i: np.array(embedding) for i, embedding in enumerate(embeddin ### Define the user question -```python +```python PYTHON query = "Name everyone involved in writing the script, directing, and producing 'Dune: Part Two'?" ``` @@ -274,7 +274,7 @@ query = "Name everyone involved in writing the script, directing, and producing Cohere embeddings are state-of-the-art. -```python +```python PYTHON response = co.embed( texts=[query], model=model, @@ -293,7 +293,7 @@ print("query_embedding: ", query_embedding) We use cosine similarity to find the most similar chunks -```python +```python PYTHON def cosine_similarity(a, b): return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) @@ -333,7 +333,7 @@ We rerank the 10 chunks retrieved from the vector database. Reranking boosts ret Reranking lets us go from 10 chunks retrieved from the vector database, to the 3 most relevant chunks. 
-```python +```python PYTHON response = co.rerank( query=query, documents=top_chunks_after_retrieval, @@ -356,7 +356,7 @@ for t in top_chunks_after_rerank: ## Step 3 - Generate the model final answer, given the retrieved and reranked chunks -```python +```python PYTHON preamble = """ ## Task & Context You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging. @@ -367,7 +367,7 @@ Unless the user asks for a different style of answer, you should answer in full ``` -```python +```python PYTHON documents = [ {"title": "chunk 0", "snippet": top_chunks_after_rerank[0]}, {"title": "chunk 1", "snippet": top_chunks_after_rerank[1]}, @@ -417,7 +417,7 @@ These citations are optional — you can decide to ignore them. -```python +```python PYTHON print("Citations that support the final answer:") for cite in response.citations: print(cite) @@ -457,7 +457,7 @@ for cite in response.citations: -```python +```python PYTHON def insert_citations_in_order(text, citations): """ A helper function to pretty print citations. diff --git a/scripts/cookbooks-mdx/basic-semantic-search.mdx b/scripts/cookbooks-mdx/basic-semantic-search.mdx index d1710fa43..e1fb8646f 100644 --- a/scripts/cookbooks-mdx/basic-semantic-search.mdx +++ b/scripts/cookbooks-mdx/basic-semantic-search.mdx @@ -147,7 +147,7 @@ In this notebook, we'll build a simple semantic search engine. The applications And if you're running an older version of the SDK, you might need to upgrade it like so: -```python +```python PYTHON #!pip install --upgrade cohere ``` @@ -155,7 +155,7 @@ Get your Cohere API key by [signing up here](https://os.cohere.ai/register). Pas ## 1. Getting Set Up -```python +```python PYTHON #@title Import libraries (Run this cell to execute required code) {display-mode: "form"} import cohere @@ -175,7 +175,7 @@ pd.set_option('display.max_colwidth', None) You'll need your API key for this next cell. [Sign up to Cohere](https://os.cohere.ai/) and get one if you haven't yet. -```python +```python PYTHON model_name = "embed-english-v3.0" api_key = "" input_type_embed = "search_document" @@ -187,7 +187,7 @@ co = cohere.Client(api_key) We'll use the [trec](https://www.tensorflow.org/datasets/catalog/trec) dataset which is made up of questions and their categories. -```python +```python PYTHON dataset = load_dataset("trec", split="train") df = pd.DataFrame(dataset)[:1000] @@ -295,13 +295,13 @@ The next step is to embed the text of the questions. To get a thousand embeddings of this length should take about fifteen seconds. -```python +```python PYTHON embeds = co.embed(texts=list(df['text']), model=model_name, input_type=input_type_embed).embeddings ``` -```python +```python PYTHON embeds = np.array(embeds) embeds.shape ``` @@ -319,7 +319,7 @@ Let's now use [Annoy](https://github.com/spotify/annoy) to build an index that s After building the index, we can use it to retrieve the nearest neighbors either of existing questions (section 3.1), or of new questions that we embed (section 3.2). 
-```python +```python PYTHON search_index = AnnoyIndex(embeds.shape[1], 'angular') for i in range(len(embeds)): search_index.add_item(i, embeds[i]) @@ -336,7 +336,7 @@ True If we're only interested in measuring the distance between the questions in the dataset (no outside queries), a simple way is to calculate the distance between every pair of embeddings we have. -```python +```python PYTHON example_id = 92 similar_item_ids = search_index.get_nns_by_item(example_id,10, @@ -432,7 +432,7 @@ Nearest neighbors: We're not limited to searching using existing items. If we get a query, we can embed it and find its nearest neighbors from the dataset. -```python +```python PYTHON query = "What is the tallest mountain in the world?" input_type_query = "search_query" @@ -539,7 +539,7 @@ Nearest neighbors: Finally, let's plot out all the questions onto a 2D chart so you're able to visualize the semantic similarities of this dataset! -```python +```python PYTHON #@title Plot the archive {display-mode: "form"} reducer = umap.UMAP(n_neighbors=20) diff --git a/scripts/cookbooks-mdx/basic-tool-use.mdx b/scripts/cookbooks-mdx/basic-tool-use.mdx index 470fdad08..1af37d9d7 100644 --- a/scripts/cookbooks-mdx/basic-tool-use.mdx +++ b/scripts/cookbooks-mdx/basic-tool-use.mdx @@ -144,7 +144,7 @@ Below, we illustrate tool use in four steps: - Step 3: the tool calls are executed - Step 4: the model **generates a final answer with precise citations** based on the tool results -```python +```python PYTHON import cohere, json API_KEY = "..." # fill in your Cohere API key here co = cohere.Client(API_KEY) @@ -154,7 +154,7 @@ co = cohere.Client(API_KEY) Before we can illustrate tool use, we first need to do some setup. Here, we'll define the mock data that our tools will query. This data represents sales reports and a product catalog. -```python +```python PYTHON sales_database = { '2023-09-28': { 'total_sales_amount': 5000, @@ -187,7 +187,7 @@ product_catalog = { Now, we'll define the tools that simulate querying this database. For example, you could use the API of an enterprise sales platform. -```python +```python PYTHON def query_daily_sales_report(day: str) -> dict: """ Function to retrieve the sales report for the given day @@ -232,7 +232,7 @@ You can specify one or many tools to the model. Every tool needs to be described In our example, we provide two tools to the model: `daily_sales_report` and `product_catalog`. -```python +```python PYTHON tools = [ { "name": "query_daily_sales_report", @@ -265,7 +265,7 @@ In our example we'll use: "Can you provide a sales summary for 29th September 20 Only a langage model with Tool Use can answer this request: it requires looking up information in the right external tools (step 2), and then providing a final answer based on the tool results (step 4). -```python +```python PYTHON preamble = """ ## Task & Context You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging. @@ -281,7 +281,7 @@ message = "Can you provide a sales summary for 29th September 2023, and also giv The model intelligently selects the right tool(s) to call -- and the right parameters for each tool call -- based on the content of the user message. 
-```python +```python PYTHON response = co.chat( message=message, tools=tools, @@ -313,7 +313,7 @@ cohere.ToolCall { You can now execute the appropriate calls, using the tool calls and tool parameters generated by the model. These tool calls return tool results that will be fed to the model in Step 4. -```python +```python PYTHON tool_results = [] for tool_call in response.tool_calls: # here is where you would call the tool recommended by the model, using the parameters recommended by the model @@ -395,7 +395,7 @@ Tool results that will be fed back to the model in step 4: Finally, the developer calls the Cohere model, providing the tools results, in order to generate the model's final answer. -```python +```python PYTHON response = co.chat( message=message, tools=tools, @@ -406,7 +406,7 @@ response = co.chat( ) ``` -```python +```python PYTHON print("Final answer:") print(response.text) ``` @@ -434,7 +434,7 @@ These citations make it easy to check where the model’s generated response cla They help users gain visibility into the model reasoning, and sanity check the final model generation. These citations are optional — you can decide to ignore them. -```python +```python PYTHON print("Citations that support the final answer:") for cite in response.citations: print(cite) @@ -456,7 +456,7 @@ Citations that support the final answer: {'start': 298, 'end': 300, 'text': '25', 'document_ids': ['query_product_catalog:1:0']} ``` -```python +```python PYTHON def insert_citations_in_order(text, citations): """ A helper function to pretty print citations. diff --git a/scripts/cookbooks-mdx/calendar-agent.mdx b/scripts/cookbooks-mdx/calendar-agent.mdx index 402b9c1f3..e990f2fa7 100644 --- a/scripts/cookbooks-mdx/calendar-agent.mdx +++ b/scripts/cookbooks-mdx/calendar-agent.mdx @@ -136,12 +136,12 @@ slug: /page/calendar-agent In the example below, we demonstrate how to use the cohere Chat API with the `list_calendar_events` and `create_calendar_event` tools to book appointments. Booking the correct appointment requires the model to first check for an available slot by listing existing events, reasoning about the correct slot to book the new appointment and then finally invoking the right tool to create the calendar event. To learn more about Tool Use, read the official [multi-step tool use guide](https://docs.cohere.com/docs/multi-step-tool-use). 
-```python +```python PYTHON # !pip install cohere==5.5.3 ``` -```python +```python PYTHON # Instantiate the Cohere client import cohere @@ -152,7 +152,7 @@ co = cohere.Client(api_key=COHERE_API_KEY) ``` -```python +```python PYTHON # Define the tools import json @@ -220,7 +220,7 @@ def invoke_tool(tool_call: cohere.ToolCall): ``` -```python +```python PYTHON # Check what tools the model wants to use and how to use them res = co.chat( model="command-r-plus", diff --git a/scripts/cookbooks-mdx/chunking-strategies.mdx b/scripts/cookbooks-mdx/chunking-strategies.mdx index 941e42b33..ef32d1919 100644 --- a/scripts/cookbooks-mdx/chunking-strategies.mdx +++ b/scripts/cookbooks-mdx/chunking-strategies.mdx @@ -169,7 +169,7 @@ slug: /page/chunking-strategies } -```python +```python PYTHON %%capture !pip install cohere !pip install -qU langchain-text-splitters @@ -178,7 +178,7 @@ slug: /page/chunking-strategies ``` -```python +```python PYTHON import requests from typing import List @@ -198,7 +198,7 @@ from llama_index.core import VectorStoreIndex, ServiceContext ``` -```python +```python PYTHON co_model = 'command-r' co_api_key = getpass("Enter Cohere API key: ") co = cohere.Client(api_key=co_api_key) @@ -257,7 +257,7 @@ Designing a robust chunking strategy is as much a science as an art. There are n ## Utils -```python +```python PYTHON def set_css(): display(HTML(''' -```python +```python PYTHON #@title Enable text wrapping in Google Colab from IPython.display import HTML, display @@ -175,7 +175,7 @@ get_ipython().events.register('pre_run_cell', set_css) First, we need to get a supply of high-traffic keywords for a given topic. We can get this via keyword research tools, of which are many available. We’ll use Google Keyword Planner, which is free to use. -```python +```python PYTHON import wget wget.download("https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebooks/data/remote_teams.csv", "remote_teams.csv") @@ -191,7 +191,7 @@ wget.download("https://raw.githubusercontent.com/cohere-ai/notebooks/main/notebo 'remote_teams.csv' ``` -```python +```python PYTHON df = pd.read_csv('remote_teams.csv') df.columns = ["keyword","volume"] df.head() @@ -266,7 +266,7 @@ We can do that by clustering them into topics. For this, we’ll leverage Cohere The Cohere Embed endpoint turns a text input into a text embedding. -```python +```python PYTHON def embed_text(texts): output = co.embed( texts=texts, @@ -288,7 +288,7 @@ embeds = np.array(embed_text(df['keyword'].tolist())) We then use these embeddings to cluster the keywords. A common term used for this exercise is “topic modeling.” Here, we can leverage scikit-learn’s KMeans module, a machine learning algorithm for clustering. -```python +```python PYTHON NUM_TOPICS = 4 kmeans = KMeans(n_clusters=NUM_TOPICS, random_state=21, n_init="auto").fit(embeds) df['topic'] = list(kmeans.labels_) @@ -366,7 +366,7 @@ df.head() We use the Chat to generate a topic name for that cluster. 
-```python +```python PYTHON topic_keywords_dict = {topic: list(set(group['keyword'])) for topic, group in df.groupby('topic')} ``` @@ -376,7 +376,7 @@ topic_keywords_dict = {topic: list(set(group['keyword'])) for topic, group in df } -```python +```python PYTHON def generate_topic_name(keywords): # Construct the prompt prompt = f"""Generate a concise topic name that best represents these keywords.\ @@ -400,7 +400,7 @@ Keywords: {', '.join(keywords)}""" } -```python +```python PYTHON topic_name_mapping = {topic: generate_topic_name(keywords) for topic, keywords in topic_keywords_dict.items()} df['topic_name'] = df['topic'].map(topic_name_mapping) @@ -481,7 +481,7 @@ df.head() -```python +```python PYTHON for topic, name in topic_name_mapping.items(): print(f"Topic {topic}: {name}") ``` @@ -505,7 +505,7 @@ Now that we have the keywords nicely grouped into topics, we can proceed to gene Here we can implement a filter to take just the top N keywords from each topic, sorted by the search volume. In our case, we use 10. -```python +```python PYTHON TOP_N = 10 top_keywords = (df.groupby('topic') @@ -527,7 +527,7 @@ for topic, group in top_keywords.groupby('topic'): } -```python +```python PYTHON content_by_topic ``` @@ -552,7 +552,7 @@ content_by_topic Next, we use the Chat endpoint to produce the content ideas. The prompt we’ll use is as follows -```python +```python PYTHON def generate_blog_ideas(keywords): prompt = f"""{keywords}\n\nThe above is a list of high-traffic keywords obtained from a keyword research tool. Suggest three blog post ideas that are highly relevant to these keywords. @@ -578,7 +578,7 @@ Abstract: """ Next, we generate the blog post ideas. It takes in a string of keywords, calls the Chat endpoint, and returns the generated text. -```python +```python PYTHON for key,value in content_by_topic.items(): value['ideas'] = generate_blog_ideas(value['keywords']) diff --git a/scripts/cookbooks-mdx/grounded-summarization.mdx b/scripts/cookbooks-mdx/grounded-summarization.mdx index f94fc38e8..e2b0100fb 100644 --- a/scripts/cookbooks-mdx/grounded-summarization.mdx +++ b/scripts/cookbooks-mdx/grounded-summarization.mdx @@ -153,7 +153,7 @@ This notebook provides the code to produce the outputs described in [this blog p ## 1. Setup -```python +```python PYTHON %%capture import cohere @@ -176,7 +176,7 @@ co = cohere.Client(co_api_key) ``` -```python +```python PYTHON from google.colab import drive drive.mount("/content/drive", force_remount=True) @@ -194,7 +194,7 @@ print(f"Loaded IMF report with {num_tokens} tokens") ### Aside: define utils -```python +```python PYTHON def split_text_into_sentences(text: str) -> List[str]: sentences = sent_tokenize(text) @@ -328,7 +328,7 @@ def _add_chunks_by_priority( First, let's see Command-R's out-of-the-box performance. It's a 128k-context model, so we can pass the full IMF report in a single call. We replicate the exact instructions from the original tweet (correcting for a minor typo) for enabling fair comparisons. -```python +```python PYTHON prompt_template = """\ ## text {text} @@ -374,7 +374,7 @@ For more information on how to enable grounded generation via our `co.chat` API, Finally, note that we chunk the IMF report into multiple documents before passing them to `co.chat`. This isn't necessary (`co.chat` annotates citations at the character level), but allows for more human-readable citations. 
-```python +```python PYTHON summarize_preamble = """\ You will receive a series of text fragments from an article that are presented in chronological order. \ As the assistant, you must generate responses to user's requests based on the information given in the fragments. \ @@ -411,7 +411,7 @@ print(resp.text) Let's display the citations inside our answer: -```python +```python PYTHON print(insert_citations(resp.text, resp.citations)) ``` @@ -424,7 +424,7 @@ Around 40% of employment worldwide is exposed to AI [1, 6] by checking its chunk: -```python +```python PYTHON print(chunked[6]) ``` @@ -440,7 +440,7 @@ Even though Command-R is an efficient, light-weight model, for some applications We have a whole notebook dedicated to methods for reducing context length. Here, we call our 'text-rank' method to select maximally central chunks in a graph based on the chunk-to-chunk similarties. For more detail, please refer [to this cookbook](https://colab.research.google.com/drive/1zxSAbruOWwWJHNsj3N56uxZtUeiS7Evd). -```python +```python PYTHON num_tokens = 8192 shortened = textrank(text, co, num_tokens, n_sentences_per_passage=30) diff --git a/scripts/cookbooks-mdx/hello-world-meet-ai.mdx b/scripts/cookbooks-mdx/hello-world-meet-ai.mdx index b07b28a22..b7a4b0896 100644 --- a/scripts/cookbooks-mdx/hello-world-meet-ai.mdx +++ b/scripts/cookbooks-mdx/hello-world-meet-ai.mdx @@ -148,11 +148,11 @@ We’ll cover three groups of tasks that you will typically work on when dealing The first step is to install the Cohere Python SDK. Next, create an API key, which you can generate from the Cohere [dashboard](https://os.cohere.ai/register) or [CLI tool](https://docs.cohere.ai/cli-key). -```python +```python PYTHON ! pip install cohere altair umap-learn -q ``` -```python +```python PYTHON import cohere import pandas as pd import numpy as np @@ -165,7 +165,7 @@ The Cohere Generate endpoint generates text given an input, called “prompt”. ### Try a Simple Prompt -```python +```python PYTHON prompt = "What is a Hello World program." response = co.chat( @@ -192,11 +192,11 @@ int main() { } ``` 2. **Python**: -```python +```python PYTHON print("Hello World") ``` 3. **Java**: -```java +```java JAVA class HelloWorld { public static void main(String[] args) { System.out.println("Hello World"); @@ -224,7 +224,7 @@ The "Hello World" program is a testament to the power of programming, as a simpl The output is not bad, but it can be better. We need to find a way to make the output tighter to how we want it to be, which is where we leverage _prompt engineering_. -```python +```python PYTHON prompt = """ Write the first paragraph of a blog post given a blog title. -- @@ -257,7 +257,7 @@ Starting to code can be daunting, but it's actually simpler than you think! The In real applications, you will likely need to produce these text generations on an ongoing basis, given different inputs. Let’s simulate that with our example. -```python +```python PYTHON def generate_text(topic): prompt = f""" Write the first paragraph of a blog post given a blog title. 
@@ -283,13 +283,13 @@ First Paragraph:""" return response.text ``` -```python +```python PYTHON topics = ["How to Grow in Your Career", "The Habits of Great Software Developers", "Ideas for a Relaxing Weekend"] ``` -```python +```python PYTHON paragraphs = [] for topic in topics: @@ -317,7 +317,7 @@ Cohere’s Classify endpoint makes it easy to take a list of texts and predict t ### Sentiment Analysis -```python +```python PYTHON from cohere import ClassifyExample examples = [ @@ -339,7 +339,7 @@ examples = [ ] ``` -```python +```python PYTHON inputs=["Hello, world! What a beautiful day", "It was a great time with great people", "Great place to work", @@ -355,7 +355,7 @@ inputs=["Hello, world! What a beautiful day", ] ``` -```python +```python PYTHON def classify_text(inputs, examples): """ Classify a list of input texts @@ -376,7 +376,7 @@ def classify_text(inputs, examples): return classifications ``` -```python +```python PYTHON predictions = classify_text(inputs,examples) classes = ["positive","negative","neutral"] @@ -448,7 +448,7 @@ Cohere’s Embed endpoint takes a piece of text and turns it into a vector embed Here we have a list of 50 top web search keywords about Hello, World! taken from a keyword tool. Let’s look at a few examples: -```python +```python PYTHON df = pd.read_csv("https://github.com/cohere-ai/notebooks/raw/main/notebooks/data/hello-world-kw.csv", names=["search_term"]) df.head() ``` @@ -504,7 +504,7 @@ df.head() We use the Embed endpoint to get the embeddings for each of these keywords. -```python +```python PYTHON def embed_text(texts, input_type): """ Turns a piece of text into embeddings @@ -522,7 +522,7 @@ def embed_text(texts, input_type): return response.embeddings ``` -```python +```python PYTHON df["search_term_embeds"] = embed_text(texts=df["search_term"].tolist(), input_type="search_document") doc_embeds = np.array(df["search_term_embeds"].tolist()) @@ -532,7 +532,7 @@ doc_embeds = np.array(df["search_term_embeds"].tolist()) We’ll look at a couple of example applications. The first example is semantic search. Given a new query, our "search engine" must return the most similar FAQs, where the FAQs are the 50 search terms we uploaded earlier. -```python +```python PYTHON query = "what is the history of hello world" query_embeds = embed_text(texts=[query], @@ -541,7 +541,7 @@ query_embeds = embed_text(texts=[query], We use cosine similarity to compare the similarity of the new query with each of the FAQs -```python +```python PYTHON from sklearn.metrics.pairwise import cosine_similarity @@ -572,7 +572,7 @@ def get_similarity(target, candidates): Finally, we display the top 5 FAQs that match the new query -```python +```python PYTHON similarity = get_similarity(query_embeds,doc_embeds) print("New query:") @@ -601,7 +601,7 @@ In the second example, we take the same idea as semantic search and take a broad We'll use the same 50 top web search terms about Hello, World! There are different techniques we can use to compress the embeddings down to just 2 dimensions while retaining as much information as possible. We'll use a technique called UMAP. And once we can get it down to 2 dimensions, we can plot these embeddings on a 2D chart. 
-```python +```python PYTHON import umap reducer = umap.UMAP(n_neighbors=49) umap_embeds = reducer.fit_transform(doc_embeds) @@ -610,7 +610,7 @@ df['x'] = umap_embeds[:,0] df['y'] = umap_embeds[:,1] ``` -```python +```python PYTHON chart = alt.Chart(df).mark_circle(size=500).encode( x= alt.X('x', diff --git a/scripts/cookbooks-mdx/long-form-general-strategies.mdx b/scripts/cookbooks-mdx/long-form-general-strategies.mdx index 753717a85..75b3e6da0 100644 --- a/scripts/cookbooks-mdx/long-form-general-strategies.mdx +++ b/scripts/cookbooks-mdx/long-form-general-strategies.mdx @@ -194,7 +194,7 @@ We'll show you three potential mitigation strategies: truncating the document, q ## Getting Started -```python +```python PYTHON %%capture !pip install cohere !pip install python-dotenv @@ -206,7 +206,7 @@ We'll show you three potential mitigation strategies: truncating the document, q ``` -```python +```python PYTHON import os import requests from collections import deque @@ -236,7 +236,7 @@ from IPython.display import HTML, display -```python +```python PYTHON # Set up Cohere client co_model = 'command-r' co_api_key = getpass("Enter your Cohere API key: ") @@ -244,7 +244,7 @@ co = cohere.Client(api_key=co_api_key) ``` -```python +```python PYTHON def load_long_pdf(file_path): """ Load a long PDF file and extract its text content. @@ -284,7 +284,7 @@ def save_pdf_from_url(pdf_url, save_path): In this example we use the Proposal for a Regulation of the European Parliament and of the Council defining rules on Artificial Intelligence from 26 January 2024, [link](https://data.consilium.europa.eu/doc/document/ST-5662-2024-INIT/en/pdf). -```python +```python PYTHON # Download the PDF file from the URL pdf_url = 'https://data.consilium.europa.eu/doc/document/ST-5662-2024-INIT/en/pdf' save_path = 'example.pdf' @@ -305,7 +305,7 @@ print("Document length - #tokens:", len(co.tokenize(text=long_text, model=co_mod ## Summarizing the text -```python +```python PYTHON def generate_response(message, max_tokens=300, temperature=0.2, k=0): """ A wrapper around the Cohere API to generate a response based on a given prompt. @@ -331,7 +331,7 @@ def generate_response(message, max_tokens=300, temperature=0.2, k=0): ``` -```python +```python PYTHON # Example summary prompt. prompt_template = """ ## Instruction @@ -351,7 +351,7 @@ Error: :`CohereAPIError: too many tokens:` -```python +```python PYTHON prompt = prompt_template.format(document=long_text) # print(generate_response(message=prompt)) ``` @@ -364,7 +364,7 @@ Therefore, in the following sections, we will explore some techniques to address First we try to truncate the document so that it meets the length constraints. This approach is simple to implement and understand. However, it drops potentially important information contained towards the end of the document. -```python +```python PYTHON # The new Cohere model has a context limit of 128k tokens. However, for the purpose of this exercise, we will assume a smaller context window. # Employing a smaller context window also has the additional benefit of reducing the cost per request, especially if billed by the number of tokens. @@ -383,7 +383,7 @@ def truncate(long: str, max_tokens: int) -> str: ``` -```python +```python PYTHON short_text = truncate(long_text, MAX_TOKENS) prompt = prompt_template.format(document=short_text) @@ -420,7 +420,7 @@ See `query_based_retrieval` function for the starting point. 
### Query based retrieval implementation -```python +```python PYTHON def split_text_into_sentences(text) -> List[str]: """ Split the input text into a list of sentences. @@ -452,7 +452,7 @@ def build_simple_chunks(text, n_sentences=5): ``` -```python +```python PYTHON sentences = split_text_into_sentences(long_text) passages = group_sentences_into_passages(sentences, n_sentences_per_passage=5) print('Example sentence:', np.random.choice(np.asarray(sentences), size=1, replace=False)) @@ -466,7 +466,7 @@ print('Example passage:', np.random.choice(np.asarray(passages), size=1, replace -```python +```python PYTHON def _add_chunks_by_priority( chunks: List[str], idcs_sorted_by_priority: List[int], @@ -522,7 +522,7 @@ def query_based_retrieval( ``` -```python +```python PYTHON # Example prompt prompt_template = """ ## Instruction @@ -536,7 +536,7 @@ prompt_template = """ ``` -```python +```python PYTHON query = "What does the report say about biometric identification? Answer only based on the document." short_text = query_based_retrieval(long_text, MAX_TOKENS, query) prompt = prompt_template.format(query=query, document=short_text) @@ -576,7 +576,7 @@ See `text_rank` as the starting point. ### Text rank implementation -```python +```python PYTHON def text_rank(text: str, max_tokens: int, n_setences_per_passage: int) -> str: """ Shortens text by extracting key units of text from it based on their centrality. @@ -612,7 +612,7 @@ def text_rank(text: str, max_tokens: int, n_setences_per_passage: int) -> str: ``` -```python +```python PYTHON # Example summary prompt. prompt_template = """ ## Instruction @@ -626,7 +626,7 @@ Summarize the following Document in 3-5 sentences. Only answer based on the info ``` -```python +```python PYTHON short_text = text_rank(long_text, MAX_TOKENS, 5) prompt = prompt_template.format(document=short_text) print(generate_response(message=prompt, max_tokens=600)) diff --git a/scripts/cookbooks-mdx/migrating-prompts.mdx b/scripts/cookbooks-mdx/migrating-prompts.mdx index dbce6d246..9c56558c6 100644 --- a/scripts/cookbooks-mdx/migrating-prompts.mdx +++ b/scripts/cookbooks-mdx/migrating-prompts.mdx @@ -145,12 +145,12 @@ The two use cases demonstrated here are: 2. Legal Question Answering -```python +```python PYTHON #!pip install cohere ``` -```python +```python PYTHON import json import os import re @@ -160,7 +160,7 @@ import getpass ``` -```python +```python PYTHON CO_API_KEY = getpass.getpass('cohere API key:') ``` @@ -168,7 +168,7 @@ CO_API_KEY = getpass.getpass('cohere API key:') -```python +```python PYTHON co = cohere.Client(CO_API_KEY) ``` @@ -177,7 +177,7 @@ co = cohere.Client(CO_API_KEY) This application scenario is a common LLM-as-assistant use case: given some context, help the user to complete a task. In this case, the task is to write a concise autobiographical summary. -```python +```python PYTHON original_prompt = '''## information Current Job Title: Senior Software Engineer Current Company Name: GlobalSolTech @@ -194,7 +194,7 @@ Write the summary in first person.''' ``` -```python +```python PYTHON response = co.chat( message=original_prompt, model='command-r', @@ -202,7 +202,7 @@ response = co.chat( ``` -```python +```python PYTHON print(response.text) ``` @@ -212,7 +212,7 @@ print(response.text) Using Command-R, we can automatically upgrade the original prompt to a RAG-style prompt to get more faithful adherence to the instructions, a clearer and more concise prompt, and in-line citations for free. 
Consider the following meta-prompt: -```python +```python PYTHON meta_prompt = f'''Below is a task for an LLM delimited with ## Original Task. Your task is to split that task into two parts: (1) the context; and (2) the instructions. The context should be split into several separate parts and returned as a JSON object where each part has a name describing its contents and the value is the contents itself. Make sure to include all of the context contained in the original task description and do not change its meaning. @@ -230,7 +230,7 @@ Return everything in a JSON object with the following structure: ``` -```python +```python PYTHON print(meta_prompt) ``` @@ -265,7 +265,7 @@ print(meta_prompt) Command-R returns with the following: -```python +```python PYTHON upgraded_prompt = co.chat( message=meta_prompt, model='command-r', @@ -273,12 +273,12 @@ upgraded_prompt = co.chat( ``` -```python +```python PYTHON print(upgraded_prompt.text) ``` Here is the task delved into a JSON object as requested: - ```json + ```json JSON { "context": [ { @@ -302,7 +302,7 @@ print(upgraded_prompt.text) To extract the returned information, we will write two simple functions to post-process out the JSON and then parse it. -```python +```python PYTHON def get_json(text: str) -> str: matches = [m.group(1) for m in re.finditer("```([\w\W]*?)```", text)] if len(matches): @@ -314,7 +314,7 @@ def get_json(text: str) -> str: ``` -```python +```python PYTHON def get_prompt_and_docs(text: str) -> tuple: json_obj = json.loads(get_json(text)) prompt = json_obj['instructions'] @@ -326,12 +326,12 @@ def get_prompt_and_docs(text: str) -> tuple: ``` -```python +```python PYTHON new_prompt, docs = get_prompt_and_docs(upgraded_prompt.text) ``` -```python +```python PYTHON new_prompt, docs ``` @@ -353,7 +353,7 @@ new_prompt, docs As we can see above, the new prompt is much more concise and gets right to the point. The context has been split into 4 "documents" that Command-R can ground the information to. Now let's run the same task with the new prompt while leveraging the `documents=` parameter. Note that the `docs` variable is a list of dict objects with `title` describing the contents of a text and `snippet` containing the text itself: -```python +```python PYTHON response = co.chat( message=new_prompt, model='command-r', @@ -362,7 +362,7 @@ response = co.chat( ``` -```python +```python PYTHON print(response.text) ``` @@ -372,7 +372,7 @@ print(response.text) The response is concise. More importantly, we can ensure that there is no hallucination because the text is automatically grounded in the input documents. Using the simple function below, we can add this grounding information to the text as citations: -```python +```python PYTHON def insert_citations(text: str, citations: list[dict], add_one: bool=False): """ A helper function to pretty print citations. @@ -402,7 +402,7 @@ def insert_citations(text: str, citations: list[dict], add_one: bool=False): ``` -```python +```python PYTHON print(insert_citations(response.text, response.citations, True)) ``` @@ -416,12 +416,12 @@ Now let's move on to an arguably more difficult problem. On March 21st, the DOJ announced that it is [suing Apple](https://www.theverge.com/2024/3/21/24107659/apple-doj-lawsuit-antitrust-documents-suing) for anti-competitive practices. The [complaint](https://www.justice.gov/opa/media/1344546/dl) is 88 pages long and consists of about 230 paragraphs of text. To understand what the suit alleges, a common use case would be to ask for a summary. 
Because Command-R has a context window of 128K, even an 88-page legal complaint fits comfortably within the window. -```python +```python PYTHON apple = open('data/apple_mod.txt').read() ``` -```python +```python PYTHON tokens = co.tokenize(text=apple, model='command-r') len(tokens.tokens) ``` @@ -436,7 +436,7 @@ len(tokens.tokens) We can set up a prompt template that allows us to ask questions on the original text. -```python +```python PYTHON prompt_template = ''' {legal_text} @@ -445,13 +445,13 @@ prompt_template = ''' ``` -```python +```python PYTHON question = '''Please summarize the attached legal complaint succinctly. Focus on answering the question: what does the complaint allege?''' rendered_prompt = prompt_template.format(legal_text=apple, question=question) ``` -```python +```python PYTHON response = co.chat( message=rendered_prompt, model='command-r', @@ -460,7 +460,7 @@ response = co.chat( ``` -```python +```python PYTHON print(response.text) ``` @@ -470,13 +470,13 @@ print(response.text) The summary seems clear enough. But we are interested in the specific allegations that the DOJ makes. For example, skimming the full complaint, it looks like the DOJ is alleging that Apple could encrypt text messages sent to Android phones if it wanted to do so. We can amend the rendered prompt and ask: -```python +```python PYTHON question = '''Does the DOJ allege that Apple could encrypt text messages sent to Android phones?''' rendered_prompt = prompt_template.format(legal_text=apple, question=question) ``` -```python +```python PYTHON response = co.chat( message=rendered_prompt, model='command-r', @@ -484,7 +484,7 @@ response = co.chat( ``` -```python +```python PYTHON print(response.text) ``` @@ -496,7 +496,7 @@ This is a very interesting allegation that at first glance suggests that the mod While previously we asked Command-R to chunk the text for us, the legal complaint is highly structured with numbered paragraphs so we can use the following function to break the complaint into input docs ready for RAG: -```python +```python PYTHON def chunk_doc(input_doc: str) -> list: chunks = [] current_para = 'Preamble' @@ -520,12 +520,12 @@ def chunk_doc(input_doc: str) -> list: ``` -```python +```python PYTHON chunks = chunk_doc(apple) ``` -```python +```python PYTHON print(chunks[18]) ``` @@ -535,7 +535,7 @@ print(chunks[18]) We can now try the same question but ask it directly to Command-R with the chunks as grounding information. -```python +```python PYTHON response = co.chat( message='''Does the DOJ allege that Apple could encrypt text messages sent to Android phones?''', model='command-r', @@ -544,7 +544,7 @@ response = co.chat( ``` -```python +```python PYTHON print(response.text) ``` @@ -554,7 +554,7 @@ print(response.text) The responses seem similar, but we should add citations and check the citation to get confidence in the response. -```python +```python PYTHON print(insert_citations(response.text, response.citations)) ``` @@ -564,7 +564,7 @@ print(insert_citations(response.text, response.citations)) The most important passage seems to be paragraph 144. Paragraph 93 is also cited. Let's check what they contain. 
-```python +```python PYTHON print(chunks[144]['snippet']) ``` @@ -579,7 +579,7 @@ print(chunks[144]['snippet']) -```python +```python PYTHON print(chunks[93]['snippet']) ``` diff --git a/scripts/cookbooks-mdx/multilingual-search.mdx b/scripts/cookbooks-mdx/multilingual-search.mdx index d2b6a7e47..b9d704e45 100644 --- a/scripts/cookbooks-mdx/multilingual-search.mdx +++ b/scripts/cookbooks-mdx/multilingual-search.mdx @@ -160,7 +160,7 @@ We'll go through the following examples: - Answer the question based on the most relevant documents -```python +```python PYTHON from langchain.embeddings.cohere import CohereEmbeddings from langchain.llms import Cohere from langchain.prompts import PromptTemplate @@ -190,7 +190,7 @@ dotenv.load_dotenv(".env") # Upload an '.env' file containing an environment var ### Import a list of documents -```python +```python PYTHON import tensorflow_datasets as tfds dataset = tfds.load('trec', split='train') texts = [item['text'].decode('utf-8') for item in tfds.as_numpy(dataset)] @@ -237,7 +237,7 @@ print(f"Number of documents: {len(texts)}") -```python +```python PYTHON random.seed(11) for item in random.sample(texts, 5): print(item) @@ -253,7 +253,7 @@ for item in random.sample(texts, 5): ### Embed the documents and store them in an index -```python +```python PYTHON embeddings = CohereEmbeddings(model = "multilingual-22-12") db = Qdrant.from_texts(texts, embeddings, location=":memory:", collection_name="my_documents", distance_func="Dot") @@ -262,7 +262,7 @@ db = Qdrant.from_texts(texts, embeddings, location=":memory:", collection_name=" ### Enter a query -```python +```python PYTHON queries = ["How to get in touch with Bill Gates", "Comment entrer en contact avec Bill Gates", "Cara menghubungi Bill Gates"] @@ -273,7 +273,7 @@ queries_lang = ["English", "French", "Indonesian"] ### Return the document most similar to the query -```python +```python PYTHON answers = [] for query in queries: docs = db.similarity_search(query) @@ -281,7 +281,7 @@ for query in queries: ``` -```python +```python PYTHON for idx,query in enumerate(queries): print(f"Query language: {queries_lang[idx]}") print(f"Query: {query}") @@ -312,7 +312,7 @@ for idx,query in enumerate(queries): ## Add an article and chunk it into smaller passages -```python +```python PYTHON !wget 'https://docs.google.com/uc?export=download&id=1f1INWOfJrHTFmbyF_0be5b4u_moz3a4F' -O steve-jobs-commencement.txt ``` @@ -337,7 +337,7 @@ for idx,query in enumerate(queries): -```python +```python PYTHON loader = TextLoader("steve-jobs-commencement.txt") documents = loader.load() text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0) @@ -347,7 +347,7 @@ texts = text_splitter.split_documents(documents) ## Embed the passages and store them in an index -```python +```python PYTHON embeddings = CohereEmbeddings(model = "multilingual-22-12") db = Qdrant.from_documents(texts, embeddings, location=":memory:", collection_name="my_documents", distance_func="Dot") ``` @@ -355,7 +355,7 @@ db = Qdrant.from_documents(texts, embeddings, location=":memory:", collection_na ## Enter a question -```python +```python PYTHON questions = [ "What did the author liken The Whole Earth Catalog to?", "What was Reed College great at?", @@ -369,7 +369,7 @@ questions = [ -```python +```python PYTHON prompt_template = """Text: {context} @@ -383,7 +383,7 @@ PROMPT = PromptTemplate( ``` -```python +```python PYTHON chain_type_kwargs = {"prompt": PROMPT} qa = RetrievalQA.from_chain_type(llm=Cohere(model="command", temperature=0), @@ 
-500,7 +500,7 @@ for question in questions: ## Questions in French -```python +```python PYTHON questions_fr = [ "À quoi se compare The Whole Earth Catalog ?", "Dans quoi Reed College était-il excellent ?", @@ -511,11 +511,11 @@ questions_fr = [ ``` -```python +```python PYTHON ``` -```python +```python PYTHON chain_type_kwargs = {"prompt": PROMPT} diff --git a/scripts/cookbooks-mdx/pdf-extractor.mdx b/scripts/cookbooks-mdx/pdf-extractor.mdx index d5bd9bf0c..20f4bf7ab 100644 --- a/scripts/cookbooks-mdx/pdf-extractor.mdx +++ b/scripts/cookbooks-mdx/pdf-extractor.mdx @@ -185,7 +185,7 @@ In the directory, we have a simple_invoice.pdf file. Everytime a user uploads th -```python +```python PYTHON import os import cohere @@ -195,13 +195,13 @@ from unstructured.partition.pdf import partition_pdf ``` -```python +```python PYTHON # uncomment to install dependencies # !pip install cohere unstructured ``` -```python +```python PYTHON # versions print('cohere version:', cohere.__version__) ``` @@ -212,7 +212,7 @@ print('cohere version:', cohere.__version__) ## Setup -```python +```python PYTHON COHERE_API_KEY = os.environ.get("CO_API_KEY") COHERE_MODEL = 'command-r-plus' co = cohere.Client(api_key=COHERE_API_KEY) @@ -231,7 +231,7 @@ The sample invoice data is from https://unidoc.io/media/simple-invoices/simple_i Here we define the tool which converts summary of the pdf into json object. Then, it checks to make sure all necessary keys are present and saves it as csv. -```python +```python PYTHON def convert_to_json(text: str) -> dict: """ Given text files, convert to json object and saves to csv. @@ -288,7 +288,7 @@ def convert_to_json(text: str) -> dict: Below is a cohere agent that leverages multi-step API. It is equipped with convert_to_json tool. -```python +```python PYTHON def cohere_agent( message: str, preamble: str, @@ -377,7 +377,7 @@ def cohere_agent( ### main -```python +```python PYTHON def extract_pdf(path): """ Function to extract text from a PDF file. diff --git a/scripts/cookbooks-mdx/pondr.mdx b/scripts/cookbooks-mdx/pondr.mdx index 752722ea6..94ab70dd3 100644 --- a/scripts/cookbooks-mdx/pondr.mdx +++ b/scripts/cookbooks-mdx/pondr.mdx @@ -149,13 +149,13 @@ In this notebook we will walk through the first two steps. Install and import the tools we will need as well as initializing the Cohere model. -```python +```python PYTHON import cohere from cohere.responses.classify import Example import pandas as pd ``` -```python +```python PYTHON co=cohere.Client('YOUR_API_KEY') ``` @@ -163,7 +163,7 @@ co=cohere.Client('YOUR_API_KEY') Generate a list of potential conversation questions and retain the first 10. -```python +```python PYTHON #user_input is hardcoded for this example user_input='I am meeting up with a coworker. We are meeting at a fancy restaurant. I wanna ask some interesting questions. These questions should be deep.' prompt=user_input+'\nHere are 10 interesting questions to ask:\n1)' @@ -171,7 +171,7 @@ response=co.generate(model='xlarge', prompt=prompt, max_tokens=200, temperature= response ``` -```python +```python PYTHON def generation_to_df(generation): generation=response.split('\n') clean_questions=[] @@ -182,7 +182,7 @@ def generation_to_df(generation): return clean_q_df ``` -```python +```python PYTHON clean_q_df = generation_to_df(response) pd.options.display.max_colwidth=150 clean_q_df @@ -192,7 +192,7 @@ clean_q_df Rank and sort the questions based on interestingness and specificity. 
-```python +```python PYTHON interestingness=[ Example("What do you think is the hardest part of what I do for a living?", "Not Interesting"), Example("What\'s the first thing you noticed about me?", "Interesting"), @@ -232,7 +232,7 @@ specificity=[ Example("What would your younger self not believe about your life today?", "Specific")] ``` -```python +```python PYTHON def add_attribute(df, attribute, name, target): response = co.classify( @@ -247,7 +247,7 @@ def add_attribute(df, attribute, name, target): df[name]=q_conf ``` -```python +```python PYTHON add_attribute(clean_q_df, interestingness, 'interestingness', 'Interesting') add_attribute(clean_q_df, specificity, 'specificity', 'Specific') clean_q_df['average']= clean_q_df.iloc[:,1:].mean(axis=1) diff --git a/scripts/cookbooks-mdx/rag-evaluation-deep-dive.mdx b/scripts/cookbooks-mdx/rag-evaluation-deep-dive.mdx index f67b5dbe1..0fbc90867 100644 --- a/scripts/cookbooks-mdx/rag-evaluation-deep-dive.mdx +++ b/scripts/cookbooks-mdx/rag-evaluation-deep-dive.mdx @@ -195,14 +195,14 @@ To demonstrate the metrics, we will use data from the [Docugami's KG-RAG](https: Let's start by setting the environment and downloading the dataset. -```python +```python PYTHON %%capture !pip install llama-index cohere openai !pip install mistralai ``` -```python +```python PYTHON # required imports from getpass import getpass import os @@ -218,7 +218,7 @@ For Response evaluation, we will use an LLM as a judge. Any LLM can be used for this goal, but because evaluation is a very challenging task, we recommend using powerful LLMs, possibly as an ensemble of models. In [previous work](https://arxiv.org/pdf/2303.16634.pdf), it has been shown that models tend to assign higher scores to their own output. Since we generated the answers in this notebook using `command-r`, we will not use it for evaluation. We will provide two alternatives, `gpt-4` and `mistral`. We set `gpt-4` as the default model because, as mentioned above, evaluation is challenging, and `gpt-4` is powerful enough to efficiently perform the task. -```python +```python PYTHON # Get keys openai_api_key = getpass("Enter your OpenAI API Key: ") # uncomment if you want to use mistral @@ -232,7 +232,7 @@ model = "gpt-4" ``` -```python +```python PYTHON if model == "gpt-4": client = Client(api_key=openai_api_key) else: @@ -240,7 +240,7 @@ else: ``` -```python +```python PYTHON # let's define a function to get the model's response for a given input def get_response(model, client, prompt): response = client.chat.completions.create( @@ -251,7 +251,7 @@ def get_response(model, client, prompt): ``` -```python +```python PYTHON # load the DocugamiKgRagSec10Q dataset if os.path.exists("./data/source_files") and os.path.exists("./data/rag_dataset.json"): rag_dataset = LabelledRagDataset.from_json("./data/rag_dataset.json") @@ -274,7 +274,7 @@ We use three standard metrics to evaluate retrieval: We implement these three metrics in the class below: -```python +```python PYTHON class RetrievalEvaluator: def compute_precision(self, retrieved_documents, golden_documents): @@ -307,7 +307,7 @@ class RetrievalEvaluator: Let's now see how to use the class above to compute the results on a single datapoint. 
-```python +```python PYTHON # select the index of a single datapoint - the first one in the dataset idx = 0 @@ -331,7 +331,7 @@ print(f'Retrieved docs: {retrieved_docs}') -```python +```python PYTHON # we can now instantiate the evaluator evaluate_retrieval = RetrievalEvaluator() @@ -375,7 +375,7 @@ Also, while Correctness is measuring the precision of the claims in the response Let's now see how we implement the evaluation described above using LLMs. Let's start with **claim extraction**. -```python +```python PYTHON # first, let's define a function which extracts the claims from a response def extract_claims(query, response, model, client): @@ -393,7 +393,7 @@ def extract_claims(query, response, model, client): ``` -```python +```python PYTHON # now, let's consider this answer, which we previously generated with command-r response = "Apple's total net sales experienced a decline over the last year. The three-month period ended July 1, 2023, saw a total net sale of $81,797 million, which was a 1% decrease from the same period in 2022. The nine-month period ended July 1, 2023, fared slightly better, with a 3% decrease in net sales compared to the first nine months of 2022.\nThis downward trend continued into the three and six-month periods ending April 1, 2023. Apple's total net sales decreased by 3% and 4% respectively, compared to the same periods in 2022." @@ -419,7 +419,7 @@ print(f"List of claims extracted from the model's response:\n\n{claims}") Nice! now that we have the list of claims, we can go ahead and **assess the validity** of each claim. -```python +```python PYTHON # Let's create a function that checks each claim against a reference text, # which here we will call "context". As you will see, we will use different contexts, # depending on the metric we want to compute. @@ -445,7 +445,7 @@ def assess_claims(query, claims, context, model, client): ### Faithfulness -```python +```python PYTHON # Let's start with Faithfulness: in this case, we want to assess the claims # in the response against the retrieved documents (i.e., context = retrieved documents) @@ -475,7 +475,7 @@ print(f"Assessment of the claims extracted from the model's response:\n\n{assess Great, we now have an assessment for each of the claims: in the last step, we just need to use these assessments to define the final score. -```python +```python PYTHON # given the list of claims and their label, compute the final score # as the proportion of correct claims over the full list of claims def get_final_score(claims_list): @@ -486,7 +486,7 @@ def get_final_score(claims_list): ``` -```python +```python PYTHON score_faithfulness = get_final_score(assessed_claims_faithfulness) print(f'Faithfulness: {score_faithfulness}') ``` @@ -499,7 +499,7 @@ The final Faithfulness score is 1, which means that the model's response is full Before moving on, let's modify the model's response by adding a piece of information which is **not** grounded in any document, and re-compute Faithfulness. -```python +```python PYTHON # let's mess up the century, changing 2022 to 1922 modified_response = response.replace('2022', '1922') @@ -538,7 +538,7 @@ As you can see, by assessing claims one by one, we are able to spot **hallucinat As said, Faithfulness and Correctness share the same logic, the only difference being that we will check the claims against the gold answer. We can therefore repeat the process above, and just substitute the `context`. 
-```python +```python PYTHON # let's get the gold answer from the dataset golden_answer = rag_dataset[idx].reference_answer @@ -568,7 +568,7 @@ print(f"Assess the claims extracted from the model's response against the golden As mentioned above, automatic evaluation is a hard task, and even when using powerful models, claim assessment can present problems: for example, the third claim is labelled as 0, even if it might be inferred from the information in the gold answer. -```python +```python PYTHON # we can now compute the final Correctness score score_correctness = get_final_score(assessed_claims_correctness) print(f'Correctness: {score_correctness}') @@ -585,7 +585,7 @@ For Correctness, we found that only half of the claims in the generated response We finally move to Coverage. Remember that, in this case, we want to check how many of the claims *in the gold answer* are included in the generated response. To do it, we first need to extract the claims from the gold answer. -```python +```python PYTHON # let's extract the golden claims gold_claims = extract_claims(query, golden_answer, model, client) @@ -605,7 +605,7 @@ print(f"List of claims extracted from the gold answer:\n\n{gold_claims}") Then, we check which of these claims is present in the response generated by the model. -```python +```python PYTHON # note that in, this case, the context is the model's response assessed_claims_coverage = assess_claims(query=query, claims=gold_claims, @@ -628,7 +628,7 @@ print(f"Assess which of the gold claims is in the model's response:\n\n{assessed -```python +```python PYTHON # we compute the final Coverage score score_coverage = get_final_score(assessed_claims_coverage) print(f'Coverage: {score_coverage}') diff --git a/scripts/cookbooks-mdx/rag-with-chat-embed.mdx b/scripts/cookbooks-mdx/rag-with-chat-embed.mdx index 22c23fe6f..a50326efe 100644 --- a/scripts/cookbooks-mdx/rag-with-chat-embed.mdx +++ b/scripts/cookbooks-mdx/rag-with-chat-embed.mdx @@ -162,11 +162,11 @@ For each user-chatbot interaction: - If no query is generated - **Step 4**: Call the Chat endpoint in normal mode to generate a response -```python +```python PYTHON ! pip install cohere hnswlib unstructured python-dotenv -q ``` -```python +```python PYTHON import cohere from pinecone import Pinecone, PodSpec import uuid @@ -179,7 +179,7 @@ co = cohere.Client("COHERE_API_KEY") # Get your API key here: https://dashboard. pc = Pinecone(api_key="PINECONE_API_KEY") # (get API key at app.pinecone.io) ``` -```python +```python PYTHON import cohere import os import dotenv @@ -194,7 +194,7 @@ pc = Pinecone( First, we define the list of documents we want to ingest and make available for retrieval. As an example, we'll use the contents from the first module of Cohere's _LLM University: What are Large Language Models?_. -```python +```python PYTHON raw_documents = [ { "title": "Text Embeddings", @@ -228,7 +228,7 @@ This method uses Cohere's `embed-english-v3.0` model to generate embeddings of t `index()` This method uses the `hsnwlib` package to index the document chunk embeddings. This will ensure efficient similarity search during retrieval. Note that `hnswlib` uses a vector library, and we have chosen it for its simplicity. -```python +```python PYTHON class Vectorstore: """ A class representing a collection of documents indexed into a vectorstore. @@ -371,7 +371,7 @@ class Vectorstore: In the code cell below, we initialize an instance of the `Vectorstore` class and pass in the `raw_documents` list as input. 
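Conceptually, the `embed()` and `index()` steps described above follow a pattern like the minimal sketch below, which calls `hnswlib` directly and uses a hypothetical `chunk_texts` list; the actual methods of the class may differ in detail:

```python PYTHON
import hnswlib

# embed(): compute embeddings for every document chunk with Cohere
chunk_embeddings = co.embed(
    texts=chunk_texts,  # hypothetical list of chunked document strings
    model="embed-english-v3.0",
    input_type="search_document",
).embeddings

# index(): build an hnswlib index over the chunk embeddings to support fast
# approximate nearest-neighbour search at retrieval time
index = hnswlib.Index(space="ip", dim=len(chunk_embeddings[0]))
index.init_index(max_elements=len(chunk_embeddings))
index.add_items(chunk_embeddings, list(range(len(chunk_embeddings))))
```

With that picture in mind, the next cell performs the initialization described above.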
-```python +```python PYTHON vectorstore = Vectorstore(raw_documents) ``` @@ -402,7 +402,7 @@ In the code cell below, we check the document chunks that are retrieved for the ## Test Retrieval -```python +```python PYTHON vectorstore.retrieve("multi-head attention definition") ``` @@ -433,7 +433,7 @@ In either case, we also pass the `conversation_id` parameter, which retains the We then print the chatbot's response. In the case that the external information was used to generate a response, we also display citations. -```python +```python PYTHON class Chatbot: def __init__(self, vectorstore: Vectorstore): """ @@ -540,7 +540,7 @@ The format of each citation is: - `text`: The text representing this span - `document_ids`: The IDs of the documents being referenced (`doc_0` being the ID of the first document passed to the `documents` creating parameter in the endpoint call, and so on) -```python +```python PYTHON chatbot = Chatbot(vectorstore) chatbot.run() diff --git a/scripts/cookbooks-mdx/rerank-demo.mdx b/scripts/cookbooks-mdx/rerank-demo.mdx index 6254a3f87..ec2864cec 100644 --- a/scripts/cookbooks-mdx/rerank-demo.mdx +++ b/scripts/cookbooks-mdx/rerank-demo.mdx @@ -145,7 +145,7 @@ We will demonstrate the rerank endpoint in this notebook. -```python +```python PYTHON !pip install "cohere<5" ``` @@ -171,7 +171,7 @@ We will demonstrate the rerank endpoint in this notebook.  -```python +```python PYTHON import cohere import requests import numpy as np @@ -181,7 +181,7 @@ from pprint import pprint ``` -```python +```python PYTHON API_KEY = "" co = cohere.Client(API_KEY) MODEL_NAME = "rerank-english-v3.0" # another option is rerank-multilingual-02 @@ -206,7 +206,7 @@ In the following cell we will call rerank to rank `docs` based on how relevant t -```python +```python PYTHON results = co.rerank(query=query, model=MODEL_NAME, documents=docs, top_n=3) # Change top_n to change the number of results returned. If top_n is not passed, all results will be returned. 
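# Each item in `results` is ordered from most to least relevant and exposes
# `r.index`, which points back to the position of the corresponding passage
# in the original `docs` list, as printed below.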
for idx, r in enumerate(results): print(f"Document Rank: {idx + 1}, Document Index: {r.index}") @@ -239,7 +239,7 @@ We use BM25 lexical search to retrieve the top-100 passages matching the query a -```python +```python PYTHON !pip install -U rank_bm25 ``` @@ -255,7 +255,7 @@ We use BM25 lexical search to retrieve the top-100 passages matching the query a -```python +```python PYTHON import json import gzip import os @@ -270,7 +270,7 @@ from tqdm.autonotebook import tqdm -```python +```python PYTHON !wget http://sbert.net/datasets/simplewiki-2020-11-01.jsonl.gz ``` @@ -297,7 +297,7 @@ from tqdm.autonotebook import tqdm -```python +```python PYTHON wikipedia_filepath = 'simplewiki-2020-11-01.jsonl.gz' passages = [] @@ -313,7 +313,7 @@ print("Passages:", len(passages)) -```python +```python PYTHON print(passages[0], passages[1]) ``` @@ -321,7 +321,7 @@ print(passages[0], passages[1]) -```python +```python PYTHON def bm25_tokenizer(text): tokenized_doc = [] @@ -344,7 +344,7 @@ bm25 = BM25Okapi(tokenized_corpus) -```python +```python PYTHON def search(query, top_k=3, num_candidates=100): print("Input question:", query) @@ -370,7 +370,7 @@ def search(query, top_k=3, num_candidates=100): ``` -```python +```python PYTHON search(query = "What is the capital of the United States?") ``` @@ -387,7 +387,7 @@ search(query = "What is the capital of the United States?") -```python +```python PYTHON search(query = "Number countries Europe") ``` @@ -404,7 +404,7 @@ search(query = "Number countries Europe") -```python +```python PYTHON search(query = "Elon Musk year birth") ``` @@ -421,7 +421,7 @@ search(query = "Elon Musk year birth") -```python +```python PYTHON search(query = "Which US president was killed?") ``` @@ -438,7 +438,7 @@ search(query = "Which US president was killed?") -```python +```python PYTHON search(query="When is Chinese New Year") ``` @@ -455,7 +455,7 @@ search(query="When is Chinese New Year") -```python +```python PYTHON search(query="How many people live in Paris") ``` @@ -472,7 +472,7 @@ search(query="How many people live in Paris") -```python +```python PYTHON search(query="Who is the director of The Matrix?") ``` diff --git a/scripts/cookbooks-mdx/sql-agent.mdx b/scripts/cookbooks-mdx/sql-agent.mdx index e1b9b7631..a987029d3 100644 --- a/scripts/cookbooks-mdx/sql-agent.mdx +++ b/scripts/cookbooks-mdx/sql-agent.mdx @@ -188,7 +188,7 @@ In this notebook we explore how to setup a [Cohere ReAct Agent](https://github.c -```python +```python PYTHON from langchain.agents import AgentExecutor from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent from langchain_core.prompts import ChatPromptTemplate @@ -200,7 +200,7 @@ import json ``` -```python +```python PYTHON # Uncomment if you need to install the following packages #!pip install --quiet langchain langchain_cohere langchain_experimental --upgrade ``` @@ -217,13 +217,13 @@ These are the following tools: -```python +```python PYTHON # load the cohere api key os.environ["COHERE_API_KEY"] = "" ``` -```python +```python PYTHON DB_NAME='Chinook.db' MODEL="command-r-plus" llm = ChatCohere(model=MODEL, temperature=0.1,verbose=True) @@ -246,7 +246,7 @@ print([tool.name for tool in tools]) We follow the general cohere react agent setup in Langchain to build our SQL agent. 
-```python +```python PYTHON # define the prompt template prompt = ChatPromptTemplate.from_template("{input}") # instantiate the ReAct agent @@ -263,7 +263,7 @@ agent_executor = AgentExecutor(agent=agent, ``` -```python +```python PYTHON output=agent_executor.invoke({ "input": 'what tables are available?', }) @@ -284,7 +284,7 @@ output=agent_executor.invoke({ -```python +```python PYTHON print(output['output']) ``` @@ -294,7 +294,7 @@ print(output['output']) The agent uses the list_tables tool to effectively highlight all the tables in the DB. -```python +```python PYTHON output=agent_executor.invoke({ "input": 'show the first row of the Playlist and Genre tables?', }) @@ -363,7 +363,7 @@ output=agent_executor.invoke({ -```python +```python PYTHON print(output['output']) ``` @@ -383,7 +383,7 @@ print(output['output']) Here we see that the tool takes a list of tables to query the sql_db_schema tool to retrieve the various schemas. -```python +```python PYTHON output=agent_executor.invoke({ "input": 'which countries have the most invoices?', }) @@ -441,7 +441,7 @@ output=agent_executor.invoke({ -```python +```python PYTHON print(output['output']) ``` @@ -451,7 +451,7 @@ print(output['output']) The agent initially makes some errors as it jumps to answer the question using the db_query tool, but it then realizes it needs to figure out what tables it has access to and what they look like. It then fixes the SQL code and is able to generate the right answer. -```python +```python PYTHON output=agent_executor.invoke({ "input": 'who is the best customer? The customer who has spent the most money is the best.', }) @@ -532,7 +532,7 @@ output=agent_executor.invoke({ -```python +```python PYTHON print(output['output']) ``` @@ -547,7 +547,7 @@ As you can see, the agent makes an error, but is able to rectify itself. It also Generally, passing in additional context to the preamble can help reduce the initial failures. This context is provided by the SQLDBToolkit and contains the first 3 rows of the tables in the Database. -```python +```python PYTHON print('**Context to pass to LLM on tables**') print('Table Names') print(context['table_names']) @@ -781,7 +781,7 @@ print(context['table_info']) We can pass this context into the preamble and re-run a query to see how it performs. -```python +```python PYTHON preamble="""## Task And Context You use your advanced complex reasoning capabilities to help people by answering their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You may need to use multiple tools in parallel or sequentially to complete your task. You should focus on serving the user's needs as best you can, which will be wide-ranging. @@ -798,7 +798,7 @@ Here is information about the database: ``` -```python +```python PYTHON output=agent_executor.invoke({ "input": 'provide the name of the best customer? 
The customer who has spent the most money is the best.', "preamble": preamble @@ -820,7 +820,7 @@ output=agent_executor.invoke({ -```python +```python PYTHON print(output['output']) ``` diff --git a/scripts/cookbooks-mdx/summarization-evals.mdx b/scripts/cookbooks-mdx/summarization-evals.mdx index 7bb2b3e60..cbfc526e8 100644 --- a/scripts/cookbooks-mdx/summarization-evals.mdx +++ b/scripts/cookbooks-mdx/summarization-evals.mdx @@ -148,12 +148,12 @@ In this cookbook, we will be demonstrating an approach we use for evaluating sum You'll need a Cohere API key to run this notebook. If you don't have a key, head to https://cohere.com/ to generate your key. -```python +```python PYTHON !pip install cohere datasets --quiet ``` -```python +```python PYTHON import json import random import re @@ -172,7 +172,7 @@ co = cohere.Client(api_key=co_api_key) As test data, we'll use transcripts from the [QMSum dataset](https://github.com/Yale-LILY/QMSum). Note that in addition to the transcripts, this dataset also contains reference summaries -- we will use only the transcripts as our approach is reference-free. -```python +```python PYTHON qmsum = load_dataset("MocktaiLEngineer/qmsum-processed", split="validation") transcripts = [x for x in qmsum["meeting_transcript"] if x is not None] ``` @@ -205,7 +205,7 @@ Therefore, we must first create a dataset that contains diverse summarization pr First, we define the prompt that combines the text and instructions. Here, we use a very basic prompt: -```python +```python PYTHON prompt_template = """## meeting transcript {transcript} @@ -216,7 +216,7 @@ prompt_template = """## meeting transcript Next, we build the instructions. Because each instruction may have a different objective and modifiers, we track them using metadata. This will later be required for evaluation (i.e. to know what the prompt is asking). -```python +```python PYTHON instruction_objectives = { "general_summarization": "Summarize the meeting based on the transcript.", @@ -268,7 +268,7 @@ format_length_modifiers = { Let's combine the objectives and format/length modifiers to finish building the instructions. -```python +```python PYTHON instructions = [] for obj_name, obj_text in instruction_objectives.items(): for mod_data in format_length_modifiers.values(): @@ -309,7 +309,7 @@ print(json.dumps(instructions[:2], indent=4)) Finally, let's build the final prompts by semi-randomly pairing the instructions with transcripts from the QMSum dataset. -```python +```python PYTHON data = pd.DataFrame(instructions) transcripts = sorted(transcripts, key=lambda x: len(x), reverse=True)[:int(len(transcripts) * 0.25)] @@ -321,12 +321,12 @@ data["prompt"] = data.apply(lambda x: prompt_template.format(transcript=x["trans ``` -```python +```python PYTHON data["transcript_token_len"] = [len(x) for x in co.batch_tokenize(data["transcript"].tolist(), model=co_model)] ``` -```python +```python PYTHON print(data["prompt"][0]) ``` @@ -391,7 +391,7 @@ We use three criteria that are graded using LLMs: In this cookbook, we will use Command-R to grade the completions. However, note that in practice, we typically use an ensemble of multiple LLM evaluators to reduce any bias. -```python +```python PYTHON grading_prompt_template = """You are an AI grader that given a prompt, a completion, and a criterion, grades the completion based on the prompt and criterion. Below is a prompt, a completion, and a criterion with which to grade the completion. You need to respond according to the criterion instructions. 
@@ -446,7 +446,7 @@ In addition, we have two criteria that are graded programmatically: - Length: checks if the summary follows the length that was requested in the prompt. -```python +```python PYTHON def score_format(completion: str, format_type: str) -> int: """ @@ -536,7 +536,7 @@ Now that we have our evaluation dataset and defined our evaluation functions, le First, we generate completions to be graded. We will use Cohere's [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01) model, boasting a context length of 128K. -```python +```python PYTHON completions = [] for prompt in data["prompt"]: completion = co.chat(message=prompt, model="command-r", temperature=0.2).text @@ -546,7 +546,7 @@ data["completion"] = completions ``` -```python +```python PYTHON print(data["completion"][0]) ``` @@ -556,7 +556,7 @@ print(data["completion"][0]) Let's grade the completions using our LLM and non-LLM checks. -```python +```python PYTHON data["format_score"] = data.apply( lambda x: score_format(x["completion"], x["eval_metadata"]["format"]), axis=1 ) @@ -585,7 +585,7 @@ data["conciseness_score"] = data.apply( ``` -```python +```python PYTHON data ``` @@ -724,7 +724,7 @@ data Finally, let's print the average scores per critiera. -```python +```python PYTHON avg_scores = data[["format_score", "length_score", "completeness_score", "correctness_score", "conciseness_score"]].mean() print(avg_scores) ``` diff --git a/scripts/cookbooks-mdx/text-classification-using-embeddings.mdx b/scripts/cookbooks-mdx/text-classification-using-embeddings.mdx index 290cf6db5..f03a02b17 100644 --- a/scripts/cookbooks-mdx/text-classification-using-embeddings.mdx +++ b/scripts/cookbooks-mdx/text-classification-using-embeddings.mdx @@ -149,13 +149,13 @@ We'll go through the following steps: If you're running an older version of the SDK you'll want to upgrade it, like this: -```python +```python PYTHON #!pip install --upgrade cohere ``` ## 1. Get the dataset -```python +```python PYTHON import cohere from sklearn.model_selection import train_test_split @@ -165,7 +165,7 @@ pd.set_option('display.max_colwidth', None) df = pd.read_csv('https://github.com/clairett/pytorch-sentiment-classification/raw/master/data/SST2/train.tsv', delimiter='\t', header=None) ``` -```python +```python PYTHON df.head() ``` @@ -228,7 +228,7 @@ We'll only use a subset of the training and testing datasets in this example. We The `train_test_split` method splits arrays or matrices into random train and test subsets. -```python +```python PYTHON num_examples = 500 df_sample = df.sample(num_examples) @@ -247,7 +247,7 @@ labels_test = labels_test[:95] We're now ready to retrieve the embeddings from the API. You'll need your API key for this next cell. [Sign up to Cohere](https://os.cohere.ai/) and get one if you haven't yet. -```python +```python PYTHON model_name = "embed-english-v3.0" api_key = "" @@ -256,7 +256,7 @@ input_type = "classification" co = cohere.Client(api_key) ``` -```python +```python PYTHON embeddings_train = co.embed(texts=sentences_train, model=model_name, input_type=input_type @@ -275,7 +275,7 @@ We now have two sets of embeddings, `embeddings_train` contains the embeddings o Curious what an embedding looks like? We can print it: -```python +```python PYTHON print(f"Review text: {sentences_train[0]}") print(f"Embedding vector: {embeddings_train[0][:10]}") ``` @@ -289,7 +289,7 @@ Embedding vector: [1.1531117, -0.8543223, -1.2496399, -0.28317127, -0.75870246, Now that we have the embedding, we can train our classifier. 
We'll use an SVM from sklearn. -```python +```python PYTHON from sklearn.svm import SVC from sklearn.pipeline import make_pipeline from sklearn.preprocessing import StandardScaler @@ -308,7 +308,7 @@ Pipeline(steps=[('standardscaler', StandardScaler()), ## 4. Evaluate the performance of the classifier on the testing set -```python +```python PYTHON score = svm_classifier.score(embeddings_test, labels_test) print(f"Validation accuracy on is {100*score}%!") ``` diff --git a/scripts/cookbooks-mdx/topic-modeling-ai-papers.mdx b/scripts/cookbooks-mdx/topic-modeling-ai-papers.mdx index 6a1e45fa2..fa56e7944 100644 --- a/scripts/cookbooks-mdx/topic-modeling-ai-papers.mdx +++ b/scripts/cookbooks-mdx/topic-modeling-ai-papers.mdx @@ -141,13 +141,13 @@ To follow along with this tutorial, you need to be familiar with Python. Make su First, you need to install the python dependencies required to run the project. Use pip to install them using the command below -```python +```python PYTHON pip install requests beautifulsoup4 cohere altair clean-text numpy pandas sklearn > /dev/null ``` Create a new python file named cohere_nlp.py. Write all your code in this file. import the dependencies and initialize Cohere’s client. -```python +```python PYTHON import cohere api_key = '' @@ -158,7 +158,7 @@ This tutorial focuses on applying topic modeling to look for recent trends in AI First, import the required libraries to make web requests and process the web content . -```python +```python PYTHON import requests from bs4 import BeautifulSoup import pandas as pd @@ -168,14 +168,14 @@ from cleantext import clean Next, make an HTTP request to the source website that has an archive of the AI papers. -```python +```python PYTHON URL = "https://www.jair.org/index.php/jair/issue/archive" page = requests.get(URL) ``` Use this archive to get the list of AI papers published. This archive has papers published since 2015. This tutorial considers papers published recently, on or after 2020 only. -```python +```python PYTHON soup = BeautifulSoup(page.content, "html.parser") archive_links = [] @@ -189,7 +189,7 @@ for link in soup.select('a.title'): Finally, you’ll need to clean the titles of the AI papers gathered. Remove trailing white spaces and unwanted characters. Use the NTLK library to get English stop words and filter them out. -```python +```python PYTHON papers = [] for archive in archive_links: page = requests.get(archive['link']) @@ -223,7 +223,7 @@ for archive in archive_links: The dataset created using this process has 258 AI papers published between 2020 and 2022. Use pandas library to create a data frame to hold our text data. -```python +```python PYTHON df = pd.DataFrame(papers) print(len(df)) ``` @@ -242,7 +242,7 @@ Cohere’s platform provides an Embed Endpoint that returns text embeddings. An Write a function to create the word embeddings using Cohere. The function should read as follows: -```python +```python PYTHON def get_embeddings(text,model='medium'): output = co.embed( model=model, @@ -252,13 +252,13 @@ def get_embeddings(text,model='medium'): Create a new column in your pandas data frame to hold the embeddings created. -```python +```python PYTHON df['title_embeds'] = df['title'].apply(get_embeddings) ``` Congratulations! You have created the word embeddings . Now, you will proceed to visualize the embeddings using a scatter plot. First, you need to reduce the dimensions of the word embeddings. You’ll use the Principal Component Analysis (PCA) method to achieve this task. 
Import the necessary packages and create a function to return the principle components. -```python +```python PYTHON from sklearn.decomposition import PCA def get_pc(arr,n): @@ -269,7 +269,7 @@ def get_pc(arr,n): Next, create a function to generate a scatter plot chart. You’ll use the altair library to create the charts. -```python +```python PYTHON import altair as alt def generate_chart(df,xcol,ycol,lbl='off',color='basic',title=''): chart = alt.Chart(df).mark_circle(size=500).encode( @@ -306,7 +306,7 @@ def generate_chart(df,xcol,ycol,lbl='off',color='basic',title=''): Finally, use the embeddings with reduced dimensionality to create a scatter plot. -```python +```python PYTHON sample = 200 embeds = np.array(df['title_embeds'].tolist()) embeds_pc2 = get_pc(embeds,2) @@ -362,7 +362,7 @@ print(df_pc2.iloc[:sample]) Here’s a chart demonstrating the word embeddings for AI papers. It is important to note that the chart represents a sample size of 200 papers. -```python +```python PYTHON generate_chart(df_pc2.iloc[:sample],'0','1',title='2D Embeddings') ``` @@ -372,7 +372,7 @@ Data searching techniques focus on using keywords to retrieve text-based informa First, create a function to get similarities between two embeddings. This will use the cosine similarity algorithm from the sci-kit learn library. -```python +```python PYTHON from sklearn.metrics.pairwise import cosine_similarity def get_similarity(target,candidates): @@ -394,7 +394,7 @@ def get_similarity(target,candidates): Next, create embeddings for the search query -```python +```python PYTHON new_query = "graph network strategies" new_query_embeds = get_embeddings(new_query) @@ -403,7 +403,7 @@ new_query_embeds = get_embeddings(new_query) Finally, check the similarity between the two embeddings. Display the top 10 similar papers using your result -```python +```python PYTHON similarity = get_similarity(new_query_embeds,embeds[:sample]) print('Query:') @@ -628,7 +628,7 @@ Clustering is a process of grouping similar documents into clusters. It allows y First, import the k-means algorithm from the scikit-learn package. Then configure two variables: the number of clusters and a duplicate dataset. -```python +```python PYTHON from sklearn.cluster import KMeans df_clust = df_pc2.copy() @@ -638,7 +638,7 @@ n_clusters=5 Next, initialize the k-means model and use it to fit the embeddings to create the clusters. -```python +```python PYTHON kmeans_model = KMeans(n_clusters=n_clusters, random_state=0) classes = kmeans_model.fit_predict(embeds).tolist() print(classes) @@ -652,7 +652,7 @@ df_clust['cluster'] = (list(map(str,classes))) Finally, plot a scatter plot to visualize the 5 clusters in our sample size. -```python +```python PYTHON df_clust.columns = df_clust.columns.astype(str) generate_chart(df_clust.iloc[:sample],'0','1',lbl='off',color='cluster',title='Clustering with 5 Clusters') ``` diff --git a/scripts/cookbooks-mdx/wikipedia-search-with-weaviate.mdx b/scripts/cookbooks-mdx/wikipedia-search-with-weaviate.mdx index 3788958a9..792e14a8f 100644 --- a/scripts/cookbooks-mdx/wikipedia-search-with-weaviate.mdx +++ b/scripts/cookbooks-mdx/wikipedia-search-with-weaviate.mdx @@ -135,7 +135,7 @@ slug: /page/wikipedia-search-with-weaviate This is starter code that you can use to search 10 million vectors from Wikipedia embedded with Cohere's multilingual model and hosted as a Weaviate public dataset. 
This dataset contains 1M vectors in each of the Wikipedia sites in these languages: English, German, French, Spanish, Italian, Japanese, Arabic, Chinese (Simplified), Korean, Hindi \[respective language codes: `en, de, fr, es, it, ja, ar, zh, ko, hi`\] -```python +```python PYTHON import weaviate cohere_api_key = '' @@ -162,7 +162,7 @@ client.is_ready() #check if True Let's now define the search function that queries our vector database. Optionally, we want the ability to filter by language. -```python +```python PYTHON def semantic_serch(query, results_lang=''): """ @@ -228,7 +228,7 @@ def print_result(result): We can now query the database with any query we want. In the background, Weaviate uses your Cohere API key to embed the query, then retrun the most relevant passages to the query. -```python +```python PYTHON query_result = semantic_serch("time travel plot twist") print_result(query_result) @@ -260,7 +260,7 @@ print_result(query_result) If we're interested in results in only one language, we can specify it. -```python +```python PYTHON query_result = semantic_serch("time travel plot twist", results_lang='ja') print_result(query_result) diff --git a/scripts/cookbooks-mdx/wikipedia-semantic-search.mdx b/scripts/cookbooks-mdx/wikipedia-semantic-search.mdx index f8f4a488e..fc1c6a26a 100644 --- a/scripts/cookbooks-mdx/wikipedia-semantic-search.mdx +++ b/scripts/cookbooks-mdx/wikipedia-semantic-search.mdx @@ -137,7 +137,7 @@ This notebook contains the starter code to do simple [semantic search](https://t Let's now download 1,000 records from the English Wikipedia embeddings archive so we can search it afterwards. -```python +```python PYTHON from datasets import load_dataset import torch import cohere @@ -170,7 +170,7 @@ doc_embeddings = torch.tensor(doc_embeddings) Now, `doc_embeddings` holds the embeddings of the first 1,000 documents in the dataset. Each document is represented as an [embeddings vector](https://txt.cohere.ai/sentence-word-embeddings/) of 768 values. -```python +```python PYTHON doc_embeddings.shape ``` @@ -186,7 +186,7 @@ We can now search these vectors for any query we want. For this toy example, we' To search, we embed the query, then get the nearest neighbors to its embedding (using dot product). -```python +```python PYTHON query = 'Who founded Wikipedia' response = co.embed(texts=[query], model='multilingual-22-12')
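# response.embeddings holds one 768-dimensional vector for the query; scoring
# it against doc_embeddings with a dot product (e.g. torch.mm) and keeping the
# top-scoring rows yields the nearest-neighbour passages.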