An Introduction to LLM Agents with Langchain: When RAG Is Not Enough

Published: 2024-04-10 21:19:29

Author: DeepPrompting


Introduction to Agents

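Let's start by looking at a few examples of LLM agents. The topic is widely discussed, but few people actively use agents; often, what we call an agent is just a large language model. Consider a simple task: search for football match results and save them as a CSV file. We can compare several available tools: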

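GPT-4 with search and plugins: as the chat history for this attempt shows, GPT-4 fails to complete the task because of a code error.
AutoGPT via https://evo.ninja/: at least manages to generate some kind of CSV, though not an ideal one.
AgentGPT via https://agentgpt.reworkd.ai/: decides to treat the task as synthetic data generation, which is not what we asked for; see the chat history for this attempt.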

Since the available tools are not great, let's learn from first principles how to build an agent from scratch.

Step 1: Planning

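You may have come across various techniques for improving the performance of large language models, such as offering the model a tip or even jokingly threatening it. One popular technique is "chain of thought", where the model is asked to think step by step, which enables self-correction. This approach has evolved into more advanced versions such as "chain of thought with self-consistency" and the generalized "tree of thoughts", where multiple thoughts are created, re-evaluated, and consolidated to produce an output.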
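The self-consistency idea can be sketched in a few lines: sample several reasoning paths and keep the final answer that appears most often. This is a minimal sketch, not taken from the original tutorial; `fake_sampler` is a hypothetical stand-in that replays canned answers where a real setup would call a model at a non-zero temperature.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_answer(
    sample_answer: Callable[[str], str], question: str, n_samples: int = 5
) -> str:
    """Sample several reasoning paths and return the most common final answer."""
    answers: List[str] = [sample_answer(question) for _ in range(n_samples)]
    # Majority vote over final answers; ties go to the answer seen first.
    return Counter(answers).most_common(1)[0][0]


# Hypothetical stand-in for an LLM: replays a fixed list of final answers
# instead of calling a model.
_canned = iter(["42", "41", "42", "42", "40"])


def fake_sampler(question: str) -> str:
    return next(_canned)


best = self_consistent_answer(fake_sampler, "What is 6 * 7?")
print(best)
```

The vote only looks at final answers, so the individual reasoning chains can differ freely as long as they converge on the same result.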

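In this tutorial I make heavy use of Langsmith, a platform for productionizing LLM applications. For example, while building the tree of thoughts prompts, I save my sub-prompts in the prompt repository and load them: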

from langchain import hub
from langchain.chains import LLMChain, SequentialChain
from langchain_openai import ChatOpenAI

cot_step1 = hub.pull("rachnogstyle/nlw_jan24_cot_step1")
cot_step2 = hub.pull("rachnogstyle/nlw_jan24_cot_step2")
cot_step3 = hub.pull("rachnogstyle/nlw_jan24_cot_step3")
cot_step4 = hub.pull("rachnogstyle/nlw_jan24_cot_step4")

model = "gpt-3.5-turbo"

chain1 = LLMChain(
    llm=ChatOpenAI(temperature=0, model=model),
    prompt=cot_step1,
    output_key="solutions"
)

chain2 = LLMChain(
    llm=ChatOpenAI(temperature=0, model=model),
    prompt=cot_step2,
    output_key="review"
)

chain3 = LLMChain(
    llm=ChatOpenAI(temperature=0, model=model),
    prompt=cot_step3,
    output_key="deepen_thought_process"
)

chain4 = LLMChain(
    llm=ChatOpenAI(temperature=0, model=model),
    prompt=cot_step4,
    output_key="ranked_solutions"
)

overall_chain = SequentialChain(
    chains=[chain1, chain2, chain3, chain4],
    input_variables=["input", "perfect_factors"],
    output_variables=["ranked_solutions"],
    verbose=True
)
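To make the data flow of the sequential chain above easier to follow, here is a plain-Python sketch of the same pattern: every step reads a shared dictionary and writes its result under its own output key. The lambdas are hypothetical stand-ins for the four LLM calls, and which earlier keys each hub prompt really consumes is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict[str, str]], str]]


def run_sequential(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order; each step reads the shared state and writes one new key."""
    state = dict(inputs)  # copy so the caller's dict is not mutated
    for output_key, step in steps:
        state[output_key] = step(state)
    return state


# Hypothetical stand-ins for the four LLM calls; each one only formats a string.
steps: List[Step] = [
    ("solutions", lambda s: f"solutions for {s['input']}"),
    ("review", lambda s: f"review of [{s['solutions']}] against {s['perfect_factors']}"),
    ("deepen_thought_process", lambda s: f"deeper look at [{s['review']}]"),
    ("ranked_solutions", lambda s: f"ranking based on [{s['deepen_thought_process']}]"),
]

result = run_sequential(steps, {"input": "task", "perfect_factors": "cost"})
print(result["ranked_solutions"])
```

Because each step only adds a key, any later step can still see the original inputs as well as every intermediate result.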

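You can see the result of this kind of reasoning in the accompanying notebook. The point I want to make here is the right process: define your reasoning steps and version them in an LLMOps system such as Langsmith. You can also find examples of other popular reasoning techniques, such as ReAct or Self-ask with search, in public repositories: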

prompt = hub.pull("hwchase17/react")
prompt = hub.pull("hwchase17/self-ask-with-search")

Other notable approaches include:

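Reflexion (Shinn & Labash 2023) is a framework that equips agents with dynamic memory and self-reflection capabilities to improve their reasoning skills.
Chain of Hindsight (CoH; Liu et al. 2023) encourages the model to improve its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback.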
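The common thread in these approaches is a feedback loop: try, evaluate, remember what went wrong, and try again. The sketch below is a simplified loop in the spirit of Reflexion, not the paper's implementation; `toy_actor` and `toy_evaluator` are hypothetical stand-ins for the model and for whatever signal judges its output.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    act: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    task: str,
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Retry a task, feeding verbal feedback from failed trials back to the actor."""
    reflections: List[str] = []  # memory of lessons from earlier trials
    answer = ""
    for _ in range(max_trials):
        answer = act(task, reflections)
        ok, feedback = evaluate(answer)
        if ok:
            break
        reflections.append(feedback)
    return answer, reflections


# Hypothetical actor: answers wrongly until it has at least one reflection.
def toy_actor(task: str, reflections: List[str]) -> str:
    return "4" if reflections else "5"


# Hypothetical evaluator: checks the answer and produces verbal feedback.
def toy_evaluator(answer: str) -> Tuple[bool, str]:
    return (answer == "4", f"'{answer}' was wrong; recheck the arithmetic")


final_answer, memory = reflexion_loop(toy_actor, toy_evaluator, "What is 2 + 2?")
print(final_answer, memory)
```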

Step 2: Memory

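Sensory memory: this component of memory captures immediate sensory input, such as what we see, hear, or feel. In the context of prompt engineering and AI models, the prompt acts as a transient input, much like a momentary touch or sensation. It is the initial stimulus that triggers the model's processing.
Short-term memory: short-term memory holds information temporarily, usually in relation to the ongoing task or conversation. In prompt engineering, this corresponds to keeping the recent chat history. This memory lets the agent maintain context and coherence throughout the interaction, so that its responses stay consistent with the current dialogue. In code, you typically add it as conversation history: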

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
# retriever_tool and prompt are assumed to be defined elsewhere in the tutorial
tools = [retriever_tool]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# A single shared history object is returned for every session_id
message_history = ChatMessageHistory()
agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    lambda session_id: message_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
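Note that the lambda in the snippet above returns the same history object regardless of session_id, which is fine for a single-user demo. If separate conversations should not share context, the factory can keep one history per session. The sketch below illustrates that idea with plain Python data structures; `get_session_history` and `chat` are hypothetical names, and the reply is a stub rather than a model call.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)

# One history list per session id; a hypothetical in-memory store.
_histories: Dict[str, List[Message]] = defaultdict(list)


def get_session_history(session_id: str) -> List[Message]:
    """Return the mutable history for a session, creating it on first use."""
    return _histories[session_id]


def chat(session_id: str, user_input: str) -> str:
    history = get_session_history(session_id)
    # A real agent would pass `history` to the model; this stub only counts turns.
    reply = f"turn {len(history) // 2 + 1}: you said {user_input!r}"
    history.append(("human", user_input))
    history.append(("ai", reply))
    return reply


chat("alice", "hello")
second = chat("alice", "what did I say?")
other = chat("bob", "hi")
print(second)
print(other)
```

Here the second call for "alice" sees her earlier turn, while "bob" starts from an empty history.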

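Long-term memory: long-term memory stores both factual knowledge and procedural instructions. In AI models, it is represented by the data used for training and fine-tuning. Long-term memory also underpins RAG frameworks, letting agents access learned information and integrate it into their responses. It acts as a comprehensive knowledge base that agents draw on to produce informed and relevant output. In code, you typically add it as a vectorized database: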

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

loader = WebBaseLoader("https://neurons-lab.com/")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()
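To show what happens conceptually behind the splitter and the vector store, here is a toy retriever built only on the standard library. It is a sketch under strong simplifications: chunks are fixed-size character windows (the real splitter prefers natural separators), and the "embedding" is a bag-of-words count vector rather than a learned model. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Fixed-size character chunks; each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    stop = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (real systems use a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 1) -> List[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "The company builds machine learning solutions for healthcare.",
    "Football results are published every weekend.",
    "Vector stores index document embeddings for similarity search.",
]
top = retrieve("where can I find football results", chunks)
print(top)
```

The overlap between chunks exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk.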

Step 3: Tools

ChatGPT plugins and OpenAI API function calling are good examples of LLMs augmented with tool use in practice.
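The mechanism behind both is similar: the model emits a structured request naming a tool and its arguments, and the surrounding program executes it and returns the result. The sketch below assumes a simple JSON shape with "name" and "arguments" fields for illustration; it is not the exact format of any particular API, and both tools are hypothetical stubs.

```python
import json
from typing import Callable, Dict


def search_stub(query: str) -> str:
    # Hypothetical stand-in for a web search tool; returns canned text.
    return f"results for {query!r}"


def add(a: float, b: float) -> float:
    return a + b


TOOLS: Dict[str, Callable] = {"search": search_stub, "add": add}


def dispatch(tool_call_json: str) -> str:
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']!r}"
    return str(tool(**call["arguments"]))


# Pretend the model emitted this structured request.
observation = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
print(observation)
```

Returning an error string for an unknown tool, instead of raising, lets the observation be fed back to the model so it can correct itself.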

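Built-in Langchain tools: Langchain ships with a large collection of built-in tools, from internet search and the Arxiv toolkit to Zapier and Yahoo Finance. In this simple tutorial, we will try internet search provided by Tavily: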

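from langchain.agents import AgentType, initialize_agent
from langchain.tools.tavily_search import TavilySearchResults
from langchain.utilities.tavily_search import TavilySearchAPIWrapper
from langchain_openai import ChatOpenAI

# The wrapper expects a Tavily API key (TAVILY_API_KEY environment variable)
search = TavilySearchAPIWrapper()
tavily_tool = TavilySearchResults(api_wrapper=search)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)
# retriever_tool is assumed to be defined elsewhere in the tutorial
agent_chain = initialize_agent(
    [retriever_tool, tavily_tool],
    llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)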

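Custom tools: it is also very easy to define your own tools. Let's dissect a simple example of a tool that calculates the length of a string. You need to use the @tool decorator to let Langchain know about it. Then, don't forget the types of the input and output. But the most important part is the function docstring between the triple quotes (""" """): this is how your agent knows what the tool does and compares its description with the descriptions of the other tools: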

from langchain.agents import AgentType, initialize_agent
from langchain.tools import tool
from langchain_openai import ChatOpenAI

@tool
def calculate_length_tool(a: str) -> int:
    """The function calculates the length of the input string."""
    return len(a)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)
# retriever_tool and tavily_tool come from the earlier snippets
agent_chain = initialize_agent(
    [retriever_tool, tavily_tool, calculate_length_tool],
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
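To see why the docstring and type hints matter, it helps to look at what a decorator of this kind can collect from a function. The sketch below is a hypothetical `simple_tool` decorator written with the standard library only; it illustrates the idea and is not how Langchain implements @tool.

```python
import inspect
from typing import Callable, Dict, get_type_hints

REGISTRY: Dict[str, dict] = {}


def simple_tool(func: Callable) -> Callable:
    """Hypothetical decorator: record the metadata an agent needs to pick a tool."""
    hints = get_type_hints(func)
    REGISTRY[func.__name__] = {
        "description": inspect.getdoc(func) or "",
        "args": {k: v.__name__ for k, v in hints.items() if k != "return"},
        "returns": hints.get("return", type(None)).__name__,
        "func": func,
    }
    return func


@simple_tool
def calculate_length_tool(a: str) -> int:
    """The function calculates the length of the input string."""
    return len(a)


def render_tool_list() -> str:
    """Build a text block of the kind that is placed in an agent prompt."""
    return "\n".join(
        f"{name}: {meta['description']}" for name, meta in REGISTRY.items()
    )


print(render_tool_list())
```

The rendered list is all the model sees when it chooses between tools, so a vague or missing docstring directly degrades tool selection.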

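You can find an example of how this works in the accompanying script, but you will also see an error: the agent does not pull the correct description of the Neurons Lab company, and although it calls the right custom length-calculation function, the final result is wrong. Let's try to fix it!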

Step 4: All together

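I provide a clean version that combines all the architecture pieces in the accompanying script. Notice how easily we can decompose and define each part separately: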

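All kinds of tools (search, custom tools, etc.)
All kinds of memory (sensory as the prompt, short-term as the runnable message history and as a scratchpad in the prompt, long-term as retrieval from the vector database)
Any kind of planning strategy (as part of a prompt pulled from the LLMOps system)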

The final definition of the agent looks like this:

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    lambda session_id: message_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

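As you can see in the output of the script (or you can run it yourself), this solves the tool-related issue from the previous part. What changed? We defined a complete architecture in which short-term memory plays a crucial role. Our agent now has the message history and a scratchpad as part of its reasoning structure, which lets it pull the correct website description and calculate its length.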
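The role of the scratchpad can be illustrated with a toy agent loop: each tool call and its observation is appended to the scratchpad, and the next decision is made with that record in view. In this sketch `scripted_policy` is a hypothetical stand-in for the LLM, the tools are stubs, and the description text is made up for the example.

```python
from typing import Callable, Dict, List, Tuple

Scratchpad = List[Tuple[str, str, str]]  # (tool name, tool input, observation)


def describe_site(_: str) -> str:
    # Hypothetical retrieval tool returning a canned description.
    return "An example company description"


TOOLS: Dict[str, Callable[[str], str]] = {
    "describe_site": describe_site,
    "length": lambda text: str(len(text)),
}


def scripted_policy(question: str, scratchpad: Scratchpad) -> Tuple[str, str]:
    """Stand-in for the LLM: pick the next action from what was observed so far."""
    if not scratchpad:
        return ("describe_site", question)
    if len(scratchpad) == 1:
        # Reuse the first observation as the input of the second tool.
        return ("length", scratchpad[0][2])
    return ("final", f"The description is {scratchpad[1][2]} characters long.")


def run_agent(question: str, max_steps: int = 5) -> str:
    scratchpad: Scratchpad = []
    for _ in range(max_steps):
        action, action_input = scripted_policy(question, scratchpad)
        if action == "final":
            return action_input
        observation = TOOLS[action](action_input)
        scratchpad.append((action, action_input, observation))
    return "stopped: step limit reached"


answer = run_agent("Describe the site and count the characters.")
print(answer)
```

Without the scratchpad, the second step would have no access to the description returned by the first tool, which mirrors the failure seen in the earlier version of the agent.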