[AI + Search] Studying an Open-Source AI Search Project: The Whole Pipeline in 400 Lines of Core Code

Published: 2024-07-26 07:31:54

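0. Background

Large AI models took off a year or two ago. So far, one of the more mature application areas is AI + search: products such as Kimi Chat, the Baidu App, and New Bing have gradually added this capability. The user simply types in what they want to know; the software searches the web automatically and then summarizes a final answer from what it finds, which greatly reduces the user's burden of retrieval and analysis.

I have previously analyzed the principles behind this kind of AI search feature and also built an AI search tool from scratch. If you are interested, see this article: "[AI Large Model Application Development] [Hands-On Project] AI + Search: Build Your Own AI Search Engine Step by Step (Full Code Included)".

But a self-built implementation is, in the end, only a demo. It proves the principle works, but how good are the results? Quite possibly not very good in practice. After all, the defining trait of large-model applications is that they are easy to start and hard to ship. A product that works well still takes a great deal of detail work and polish.

Recently I came across an open-source AI search tool called Lepton Search. It has 7.5K stars on GitHub and is fairly popular. In this article we look at how it is implemented, whether it differs from my earlier approach, and whether it has anything else worth borrowing.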

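1. Introducing Lepton Search

Online demo: https://search.lepton.run/
GitHub source: https://github.com/leptonai/search_with_lepton

The interface is fairly clean. After a search, the answer page lists the final answer, the source links it cites, and a few related questions the user might want to ask next.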

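2. How It Works

We will not discuss the front end here, only how the back end works. The core back-end code is a little over 400 lines, in https://github.com/leptonai/search_with_lepton/blob/main/search_with_lepton.py.

2.1 Summary

The conclusion first: the approach differs little from the one in my earlier article. It first calls a search API to retrieve relevant web pages and text, then uses that text as the RAG reference material, passes it to the large model together with the original question, and the model produces the final answer from the reference text.

Put plainly, it is a RAG application; only the data source is different.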
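The three stages described above can be sketched in a few lines. This is only an illustration of the flow, not the project's code: fake_search, fake_llm, answer_with_search, and the example URLs are all hypothetical stand-ins, and no network call is made.

```python
from typing import Callable, Dict, List

Context = Dict[str, str]  # assumed shape: {"url": ..., "snippet": ...}


def fake_search(query: str) -> List[Context]:
    """Hypothetical stand-in for a real web search API call."""
    return [
        {"url": "https://example.com/a", "snippet": f"Background on {query}."},
        {"url": "https://example.com/b", "snippet": f"More details about {query}."},
    ]


def fake_llm(system_prompt: str, question: str) -> str:
    """Hypothetical stand-in for a chat-completion call.

    It only reports how many snippets it was given.
    """
    return f"Answer to {question!r} using {system_prompt.count('snippet:')} snippets."


def answer_with_search(
    query: str,
    search: Callable[[str], List[Context]],
    llm: Callable[[str, str], str],
) -> str:
    # Step 1: retrieve web results for the question.
    contexts = search(query)
    # Step 2: pack the retrieved snippets into the prompt as reference text.
    reference = "\n\n".join(f"snippet: {c['snippet']}" for c in contexts)
    system_prompt = f"Answer using only these references:\n\n{reference}"
    # Step 3: let the model answer the original question from the references.
    return llm(system_prompt, query)


if __name__ == "__main__":
    print(answer_with_search("Lepton Search", fake_search, fake_llm))
```

Passing the search and model functions in as parameters keeps the flow independent of any particular search provider or model, which mirrors the point that only the data source changes.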

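2.2 Key Code Walkthrough

2.2.1 Search Data Sources

The project can use different search data sources, such as Google and Bing, and ships ready-made code for them. You do need to apply for your own API key for each service.

The ready-to-use function definitions for the different search APIs are: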

def search_with_bing(query: str, subscription_key: str):

def search_with_google(query: str, subscription_key: str, cx: str):

def search_with_serper(query: str, subscription_key: str):

def search_with_searchapi(query: str, subscription_key: str):
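Because each backend takes slightly different credentials, one way to give the rest of the code a single search_function(query) entry point is to bind the credentials up front. The sketch below assumes that idea; make_search_function is a hypothetical helper, and the two search functions are stubs that only copy the signatures listed above and return made-up data without calling any API.

```python
from functools import partial
from typing import Callable, Dict, List

Context = Dict[str, str]


def search_with_bing(query: str, subscription_key: str) -> List[Context]:
    """Stub with the same signature as the project's function; no network call."""
    return [{"snippet": f"bing result for {query}"}]


def search_with_google(query: str, subscription_key: str, cx: str) -> List[Context]:
    """Stub with the same signature as the project's function; no network call."""
    return [{"snippet": f"google result for {query} (cx={cx})"}]


def make_search_function(
    backend: str, key: str, cx: str = ""
) -> Callable[[str], List[Context]]:
    """Hypothetical helper: bind credentials so callers only pass the query."""
    if backend == "BING":
        return partial(search_with_bing, subscription_key=key)
    if backend == "GOOGLE":
        return partial(search_with_google, subscription_key=key, cx=cx)
    raise ValueError(f"unsupported backend: {backend}")


# The placeholder key and cx values are assumptions for the demo only.
search_function = make_search_function("GOOGLE", key="dummy-key", cx="dummy-cx")
print(search_function("AI search"))
```

With this shape, the entry function never needs to know which provider is configured; it just calls search_function(query) and reads the snippet field of each result.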

2.2.2 Search Entry Function

query_function is the project's search entry point. The main code is as follows:

def query_function(
    self,
    query: str,
    search_uuid: str,
    generate_related_questions: Optional[bool] = True,
) -> StreamingResponse:
    if self.backend == "LEPTON":
        # delegate to the lepton search api.
        result = self.leptonsearch_client.query(
            query=query,
            search_uuid=search_uuid,
            generate_related_questions=generate_related_questions,
        )
        return StreamingResponse(content=result, media_type="text/html")

    # First, do a search query.
    query = query or _default_query
    ......
    contexts = self.search_function(query)

    system_prompt = _rag_query_text.format(
        context="\n\n".join(
            [f"[[citation:{i+1}]] {c['snippet']}" for i, c in enumerate(contexts)]
        )
    )
    try:
        client = self.local_client()
        llm_response = client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": query},
            ],
            max_tokens=1024,
            stop=stop_words,
            stream=True,
            temperature=0.9,
        )
        if self.should_do_related_questions and generate_related_questions:
            # While the answer is being generated, we can start generating
            # related questions as a future.
            related_questions_future = self.executor.submit(
                self.get_related_questions, query, contexts
            )
        else:
            related_questions_future = None
    except Exception as e:
        ......

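The code above does the following, which are also the usual steps of an AI search engine:

(1) contexts = self.search_function(query) retrieves the relevant text.

(2) system_prompt assembles the RAG prompt. The prompt template is as follows: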

_rag_query_text = """
You are a large language AI assistant built by Lepton AI. You are given a user question, and please write clean, concise and accurate answer to the question. You will be given a set of related contexts to the question, each starting with a reference number like [[citation:x]], where x is a number. Please use the context and cite the context at the end of each sentence if applicable.

Your answer must be correct, accurate and written by an expert using an unbiased and professional tone. Please limit to 1024 tokens. Do not give any information that is not related to the question, and do not repeat. Say "information is missing on" followed by the related topic, if the given context do not provide sufficient information.

Please cite the contexts with the reference numbers, in the format [citation:x]. If a sentence comes from multiple contexts, please list all applicable citations, like [citation:3][citation:5]. Other than code and specific names and citations, your answer must be written in the same language as the question.

Here are the set of contexts:

{context}

Remember, don't blindly repeat the contexts verbatim. And here is the user question:
"""

(3) client.chat.completions.create calls the large model to get the answer.

These three steps are the basics. The project also adds an extra step: fetching related questions.
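To make step (2) concrete, here is a small sketch of how numbered snippets end up in the prompt, plus one possible way to read the citation numbers back out of an answer. RAG_TEMPLATE is a shortened stand-in for _rag_query_text, and cited_indices is a hypothetical helper that does not appear in the excerpt; how the project itself maps citations back to sources is not shown in this article.

```python
import re
from typing import Dict, List

# Shortened stand-in for the project's _rag_query_text template.
RAG_TEMPLATE = (
    "Cite contexts as [citation:x].\n\n"
    "Here are the set of contexts:\n\n{context}\n"
)


def build_system_prompt(contexts: List[Dict[str, str]]) -> str:
    # Same numbering scheme as the excerpt: citations start at 1.
    numbered = [f"[[citation:{i+1}]] {c['snippet']}" for i, c in enumerate(contexts)]
    return RAG_TEMPLATE.format(context="\n\n".join(numbered))


def cited_indices(answer: str) -> List[int]:
    """Hypothetical helper: collect the distinct citation numbers in an answer."""
    return sorted({int(n) for n in re.findall(r"\[citation:(\d+)\]", answer)})


if __name__ == "__main__":
    contexts = [
        {"snippet": "Lepton Search is open source."},
        {"snippet": "It answers questions from web results."},
    ]
    print(build_system_prompt(contexts))
    # Made-up answer text, used only to exercise the helper.
    sample_answer = "It is open source [citation:1]. It uses web results [citation:1][citation:2]."
    print(cited_indices(sample_answer))
```

Numbering the snippets in the prompt is what lets the model refer to a source by index, so the front end can turn each index back into the matching link from the search results.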

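2.2.3 Getting Related Questions

Showing related questions to the user is useful in some situations: it gives hints and inspiration when the user does not know what to ask next.

It is implemented as follows: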

def get_related_questions(self, query, contexts):
    ......

    try:
        response = self.local_client().chat.completions.create(
            model=self.model,
            messages=[
                {
                    "role": "system",
                    "content": _more_questions_prompt.format(
                        context="\n\n".join([c["snippet"] for c in contexts])
                    ),
                },
                {
                    "role": "user",
                    "content": query,
                },
            ],
            tools=[{
                "type": "function",
                "function": tool.get_tools_spec(ask_related_questions),
            }],
            max_tokens=512,
        )
        ......

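This also relies on the large model, which generates a few related questions from the original question and the retrieved contexts (the code passes the search snippets, not the generated answer). The prompt makes the approach easy to see: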

_more_questions_prompt = """
You are a helpful assistant that helps the user to ask related questions, based on user's original question and the related contexts. Please identify worthwhile topics that can be follow-ups, and write questions no longer than 20 words each. Please make sure that specifics, like events, names, locations, are included in follow up questions so they can be asked standalone. For example, if the original question asks about "the Manhattan project", in the follow up question, do not just say "the project", but use the full name "the Manhattan project". Your related questions must be in the same language as the original question.

Here are the contexts of the question:

{context}

Remember, based on the original question and related contexts, suggest three such further questions. Do NOT repeat the original question. Each related question should be no longer than 20 words. Here is the original question:
"""
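As the comment in query_function notes, the related questions are started as a future while the answer is being generated. The sketch below illustrates that pattern with Python's standard ThreadPoolExecutor. Both stream_answer and get_related_questions here are hypothetical stubs that sleep instead of calling a model, and answer_and_related is an illustrative wrapper, not a function from the project.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List, Tuple


def stream_answer(query: str) -> Iterator[str]:
    """Hypothetical stand-in for a streamed LLM answer, yielded piece by piece."""
    for token in ["Lepton ", "Search ", "is ", "a ", "RAG ", "app."]:
        time.sleep(0.01)  # simulate generation latency
        yield token


def get_related_questions(query: str, contexts: List[Dict[str, str]]) -> List[str]:
    """Hypothetical stand-in for the second LLM call that suggests follow-ups."""
    time.sleep(0.02)  # simulate a slower, separate request
    return [f"What else is known about {query}?"]


def answer_and_related(
    query: str, contexts: List[Dict[str, str]], executor: ThreadPoolExecutor
) -> Tuple[str, List[str]]:
    # Submit the related-questions call so it runs in the background...
    future = executor.submit(get_related_questions, query, contexts)
    # ...while the main answer is consumed chunk by chunk.
    answer = "".join(stream_answer(query))
    # Block only at the end, once the answer has been fully streamed.
    return answer, future.result()


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(answer_and_related("Lepton Search", [], pool))
```

Running the two calls concurrently means the follow-up questions are usually ready by the time the answer finishes streaming, so the user does not wait for two model calls back to back.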