Application level Search Feedback | 应用级搜索反馈 #6482

arvinxx · 2025-02-24T18:54:13Z

As of 1.64.0, we support application-level web search through SearXNG. We welcome feedback on your experience and any suggestions.

Set environment variable: SEARXNG_URL=https://searxng-instance.com

A one-click SearXNG deployment template is available on Zeabur: https://zeabur.com/templates/77FSH6

130aac8 · 2025-02-24T20:33:11Z

The model should not override the default search engine configuration if the SearXNG instance is well-adjusted and optimized. Default results from the instance could be better, as it may aggregate multiple sources like Google, DuckDuckGo, Brave, Mojeek, and others, providing more comprehensive results. Allowing the model to select specific engines (e.g., Brave or DuckDuckGo) could lead to fewer results, especially if the selected engine becomes inaccessible due to anti-bot mechanisms. Overriding the default configuration might result in no available results.
Embed search results to improve relevance and allow configuration of the number of results or relevance thresholds provided to the AI.
Enable configuration of the SearXNG instance directly within the user interface.
(Optional) Allow the model to fetch the full content of one or more links to enhance the depth of information and provide more valuable insights.

Kac001 · 2025-02-25T01:04:39Z

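Solved. The cause is that the provided Zeabur template (https://zeabur.com/templates/77FSH6) does not enable JSON output.
Enter the searxng container and edit the /etc/searxng/settings.yml file.
Find:

formats:
    - html

Change it to:

formats:
    - html
    - json

Finally, restart the container.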
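
The manual edit described in this comment can also be scripted. The following is a minimal Python sketch, assuming the settings file contains a `formats:` key followed directly by a YAML list, as shown in the comment. The function name `enable_json_format` is hypothetical, and the code works on plain text rather than through a YAML parser, so it only handles this simple layout.

```python
def enable_json_format(settings_text: str) -> str:
    """Add `- json` to the list under the first `formats:` key, if missing."""
    lines = settings_text.splitlines()
    out = []
    i = 0
    patched = False
    while i < len(lines):
        out.append(lines[i])
        if not patched and lines[i].strip() == "formats:":
            # Copy the existing list items that follow `formats:`.
            j = i + 1
            items = []
            while j < len(lines) and lines[j].strip().startswith("- "):
                items.append(lines[j])
                j += 1
            out.extend(items)
            if items and all(item.strip() != "- json" for item in items):
                # Reuse the indentation of the last existing item.
                last = items[-1]
                indent = last[: len(last) - len(last.lstrip())]
                out.append(f"{indent}- json")
            patched = True
            i = j
            continue
        i += 1
    trailing = "\n" if settings_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Running the function twice leaves the text unchanged, so it is safe to apply to a file that has already been fixed. The container still needs a restart afterwards.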

Kac001 · 2025-02-25T01:46:04Z

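You can also deploy it yourself with Docker.
1. Deploy the searxng container, changing the port as needed:

docker run -d --name searxng -p 8080:8080 \
  -v <your-local-directory>:/etc/searxng \
  -e SEARXNG_SETTINGS_FILE=/etc/searxng/settings.yml \
  searxng/searxng

2. Edit settings.yml.
Find:

formats:
- html

Change it to:

formats:
- html
- json

3. Restart the container, then update the LobeChat configuration:
SEARXNG_URL=https://<server-ip>:8080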
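
Before pointing LobeChat at a self-hosted instance, it can help to confirm that the instance actually answers JSON queries. The sketch below assumes the instance exposes SearXNG's `/search` endpoint with the `q` and `format=json` query parameters and that the JSON body carries a `results` list. The helper names `build_search_url` and `fetch_results` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build a SearXNG search URL that asks for JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def fetch_results(base_url: str, query: str, timeout: float = 10.0) -> list:
    """Query the instance and return the `results` list from the JSON body.

    Raises urllib.error.HTTPError if the instance rejects the request,
    for example when the json format is not enabled in settings.yml.
    """
    url = build_search_url(base_url, query)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload.get("results", [])
```

A call such as `fetch_results("http://localhost:8080", "lobechat")` (the address is a placeholder) that raises an HTTP error suggests the problem lies in the SearXNG configuration rather than in LobeChat.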

arvinxx · 2025-02-25T02:05:58Z

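@Kac001 Would you be interested in improving our docker-compose setup? 😆 It would be great to get this straight into the one-click deployment script!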

arvinxx · 2025-02-25T02:13:47Z

@130aac8 Thanks for your advice! Let me reply point by point.

The model should not override the default search engine configuration if the SearXNG instance is well-adjusted and optimized. Default results from the instance could be better, as it may aggregate multiple sources like Google, DuckDuckGo, Brave, Mojeek, and others, providing more comprehensive results. Allowing the model to select specific engines (e.g., Brave or DuckDuckGo) could lead to fewer results, especially if the selected engine becomes inaccessible due to anti-bot mechanisms. Overriding the default configuration might result in no available results.

I think you are right. I will adjust the prompts so that a specific engine is searched only when the user names it in the query. What you describe has bothered me in some cases as well.

Embed search results to improve relevance and allow configuration of the number of results or relevance thresholds provided to the AI.

I don't think this is needed, as SearXNG already has a page-rank algorithm, and we use its confidence scores to select the useful results. So embedding is not necessary.

Enable configuration of the SearXNG instance directly within the user interface.

I think configuring SearXNG from the UI is the next step. We will also support more search providers, such as Tavily and Exa.

(Optional) Allow the model to fetch the full content of one or more links to enhance the depth of information and provide more valuable insights.

Yes! That is also a next step for improving search. We already have a plugin named web-crawler that provides the same functionality, so you can combine the two features.

Kac001 · 2025-02-25T02:24:38Z

@arvinxx Do you mean the docker-compose/setup.sh script?

arvinxx · 2025-02-25T02:33:48Z

@Kac001 This file: https://github.com/lobehub/lobe-chat/blob/main/docker-compose/local/docker-compose.yml

ChenLuoi · 2025-02-25T03:00:23Z

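The current search implementation appears to take the title and content of the first 5 search engine results and hand them to the AI for analysis, but in many cases this does not give a good experience.
For example, when I search for "today's top news", the final summary is not the news itself but a description of each major news website.
Fetching the page content and sending it to the AI as well would probably give better results, but at the cost of a much longer response time and higher token usage.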
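
The behaviour described in this comment, passing only the title and snippet of the first few results to the model, can be pictured with the sketch below. It is an illustration of the trade-off and not LobeChat's actual code. The field names `title`, `url`, and `content` are assumed to follow SearXNG's JSON results, and `build_context` and `limit` are hypothetical names.

```python
def build_context(results: list[dict], limit: int = 5) -> str:
    """Format the first `limit` results as a compact context block for an LLM."""
    blocks = []
    for index, item in enumerate(results[:limit], start=1):
        title = item.get("title", "")
        url = item.get("url", "")
        snippet = item.get("content", "")
        blocks.append(f"[{index}] {title}\n{url}\n{snippet}")
    return "\n\n".join(blocks)
```

With snippets only, a news homepage contributes little more than its site description, which is consistent with the "top news" example. Raising `limit` or replacing the snippet with fetched page text adds depth at the price of more tokens and latency.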

130aac8 · 2025-02-25T03:07:07Z

I don't think this is needed, as SearXNG already has a page-rank algorithm, and we use its confidence scores to select the useful results. So embedding is not necessary.

Yes, it indeed has a ranking algorithm, but if you take a closer look at its source code, you'll find that it doesn't suit the scenarios we currently require. Its algorithm is based on a per-search engine, weight-based approach, similar to the principles of traditional search engines. It factors in the credibility of the sources and incorporates the ranking positions of upstream search engines into the weight calculation, considering that it itself is a meta-search engine.

This method works well for traditional usage scenarios where users search for keywords, quickly browse through the results, and manually select the links they want to open. However, in our scenario, where the input for LLMs is limited and billed by token usage, this approach falls short. This limitation might also explain why you choose to extract only the top five results to submit to the model. Such a small number of results means that the quality of our search results must be exceptionally high. Additionally, since the search only returns partial content and the LLM cannot currently access the target links, the limited information from five results significantly restricts the depth and quality of the model's responses.

In my tests, directly extracting the top five results was not ideal. The relevance and depth of the content are limited: each query still consumed search resources and AI costs, but the responses were relatively brief and sometimes contained only a few key points. A lot of valuable information remains buried in search results that are not included.

Besides the per-engine weight configuration, SearXNG also has a hostname-based priority setting. However, these scoring methods are quite restrictive and often fail to deliver the best search results for every query. This is especially problematic when the user's query requires more than a simple "yes" or "no" answer. In such cases, the current implementation in LobeChat frequently leads to missing information and insufficient depth.

We can also take inspiration from how similar products handle this challenge. For instance, products like Perplexity typically use at least 10–15 search results as the information source for their models. Even then, embeddings are often used to include as much relevant information as possible within the limited input.

I suggest making some optimizations here. For example, consider optionally incorporating embeddings, increasing the number of search results to at least 10–15, or allowing users to configure the number of search results submitted to the model.
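
The suggestion above (rerank by similarity to the query, with a configurable result count and relevance threshold) could look roughly like the following sketch. It is not an existing LobeChat or SearXNG feature. The `embed` function here is only a placeholder that counts words; a real setup would call an embedding model instead. The names `rerank`, `top_k`, and `min_score` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Placeholder embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, results: list[dict], top_k: int = 10, min_score: float = 0.0) -> list[dict]:
    """Keep the `top_k` results most similar to the query, dropping low scores."""
    query_vec = embed(query)
    scored = [
        (cosine(query_vec, embed(f"{r.get('title', '')} {r.get('content', '')}")), r)
        for r in results
    ]
    kept = [(score, r) for score, r in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in kept[:top_k]]
```

The idea is to request more results from the search backend than the model will see, then let `top_k` and `min_score` decide how many are submitted, so token usage stays bounded while less relevant hits are filtered out.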

SAnBlog · 2025-02-25T03:08:01Z

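The current search implementation appears to take the title and content of the first 5 search engine results and hand them to the AI for analysis, but in many cases this does not give a good experience. For example, when I search for "today's top news", the final summary is not the news itself but a description of each major news website. Fetching the page content and sending it to the AI as well would probably give better results, but at the cost of a much longer response time and higher token usage.

Could the number of results be made selectable in the web search settings?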

KoellM · 2025-02-25T07:26:24Z

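I am also getting TRPCClientError. My setup is self-hosted with basic auth.
The URL is
https://username:[email protected]/search?format=json&q=claude%20latest%20model
It can be opened directly in the browser and returns the search response.

A request to trpc/tools/search.query returns: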

[
  {
    "error": {
      "json": {
        "message": "Request cannot be constructed from a URL that includes credentials: https://username:[email protected]/search?format=json&q=claude%20latest%20model",
        "code": -32603,
        "data": {
          "code": "SERVICE_UNAVAILABLE",
          "httpStatus": 503,
          "path": "search.query"
        }
      }
    }
  }
]
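
The error text matches what fetch-style `Request` constructors raise for URLs that embed `user:password@` credentials. One general workaround is to strip the credentials from the URL and send them in an `Authorization: Basic` header instead. The sketch below shows that transformation only; it is not something LobeChat is confirmed to support, and `split_basic_auth` is a hypothetical helper name.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def split_basic_auth(url: str) -> tuple[str, dict[str, str]]:
    """Return the URL without credentials plus a Basic auth header, if any."""
    parts = urlsplit(url)
    if parts.username is None:
        return url, {}
    # Rebuild the network location without the `user:password@` prefix.
    host = parts.hostname or ""
    if ":" in host:  # an IPv6 literal needs its brackets back
        host = f"[{host}]"
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    clean_url = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return clean_url, {"Authorization": f"Basic {token}"}
```

A reverse proxy in front of SearXNG that injects the header, or keeping the instance on a private network without auth, would avoid the problem without touching the client.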

arvinxx · 2025-02-25T07:39:17Z

@KoellM We are not planning to support a complex case like yours for now…

tom-kate · 2025-02-25T07:54:24Z

@xccado @KoellM If a self-hosted deployment throws TRPCClientError, add - json to the mounted settings.yml file. The exact format is:

search:
  formats:
    - html
    - json

That is, formats under the search key.

xccado · 2025-02-25T08:04:25Z

@xccado @KoellM If a self-hosted deployment throws TRPCClientError, add - json to the mounted settings.yml file. The exact format is:
search:
  formats:
    - html
    - json
That is, formats under the search key.

OK, solved. This method works.

SidneyLYZhang · 2025-02-25T09:07:06Z

I also self-host SearXNG. After setting it up, queries show Failed to search: TOO MANY REQUESTS ...

tom-kate · 2025-02-25T09:09:36Z

I also self-host SearXNG. After setting it up, queries show Failed to search: TOO MANY REQUESTS ...

Set limiter: false in settings.yml.

SidneyLYZhang · 2025-02-25T09:17:38Z

I also self-host SearXNG. After setting it up, queries show Failed to search: TOO MANY REQUESTS ...

Set limiter: false in settings.yml.

Solved, thanks!
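
The two self-hosting failures reported in this thread can usually be told apart by the HTTP status that SearXNG returns for a JSON query. The mapping below is a rough sketch: the 429 case follows the "TOO MANY REQUESTS" report above, while treating 403 as "JSON output not enabled" is an assumption about how SearXNG rejects a disallowed format. The function name `likely_cause` is hypothetical.

```python
def likely_cause(status: int) -> str:
    """Map an HTTP status from a SearXNG JSON query to a probable fix."""
    hints = {
        200: "OK: the instance returned results.",
        # Assumption: a disallowed output format is rejected with 403.
        403: "JSON output is probably disabled: add `- json` under search.formats in settings.yml.",
        # Matches the TOO MANY REQUESTS report in this thread.
        429: "The limiter is probably rejecting requests: set `limiter: false` in settings.yml.",
    }
    return hints.get(status, f"Unexpected status {status}: check the SearXNG container logs.")
```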

breakstring · 2025-02-25T12:41:56Z

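Hmm... two questions:

1. Is the request to SearXNG sent by the client or by the server?
2. If SearXNG needs an API key or some other form of authentication, how should that be configured?

@arvinxx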

breakstring · 2025-02-25T13:29:50Z

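Hmm... two questions:

1. Is the request to SearXNG sent by the client or by the server?

2. If SearXNG needs an API key or some other form of authentication, how should that be configured?

@arvinxx

I saw the requests in the logs and figured it out: they come from LobeChat itself.
As for question 2, I was worried about having to expose SearXNG publicly. Since it does not need to be exposed, going without authentication has no impact on me for now.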

lobehubbot · 2025-02-25T14:03:58Z

When using searchWithSearXNG, the official DeepSeek API (deepseek-chat) keeps searching in an infinite loop and is nowhere near usable. This needs further optimization...

lamcodes · 2025-02-26T01:06:06Z

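With Sonnet 3.5 the response comes back empty and the chat stays stuck at this point. On the backend I can see that the call has already finished and a log entry was recorded.

Using claude-3-7-sonnet produces this error: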
"error": {
  "error": {
    "error": {
      "type": "invalid_request_error",
      "message": "messages.1.content.0.type: Expected thinking or redacted_thinking, but found text. When thinking is enabled, a final assistant message must start with a thinking block (preceeding the lastmost set of tool_use and tool_result blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable thinking. Please consult our documentation at https://****/en/docs/build-with-claude/extended-thinking (request id: 20250226083719768385422koglCkYC)"
    }
  },
  "status": 400,
  "headers": {
    "date": "Wed, 26 Feb 2025 00:37:24 GMT",
    "server": "nginx",
    "connection": "keep-alive",
    "content-type": "application/json; charset=utf-8",
    "content-length": "551",
    "x-rixapi-request-id": "20250226083719768385422koglCkYC"
  }
}

arvinxx · 2025-02-26T01:18:45Z

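When using searchWithSearXNG, the official DeepSeek API (deepseek-chat) keeps searching in an infinite loop and is nowhere near usable. This needs further optimization...

This is because DeepSeek V3's function calling ability is weak. Switching to another model will fix it.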
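
Independently of the model choice, a tool-calling loop can be protected against endless searching by capping the number of search rounds. The sketch below shows that guard in isolation. It is not LobeChat's implementation, and `run_with_search`, `ask_model`, `search`, and `max_rounds` are all hypothetical names.

```python
from typing import Callable, Optional


def run_with_search(
    ask_model: Callable[[list[str]], Optional[str]],
    search: Callable[[str], str],
    max_rounds: int = 3,
) -> list[str]:
    """Let the model request searches, but stop after `max_rounds` rounds.

    `ask_model` receives the search results gathered so far and returns the
    next query to run, or None when it is ready to answer.
    """
    gathered: list[str] = []
    for _ in range(max_rounds):
        query = ask_model(gathered)
        if query is None:
            break
        gathered.append(search(query))
    return gathered
```

A model that never stops asking for searches is cut off after `max_rounds` calls, and the caller can then ask it to answer from whatever was gathered.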

AmossXu · 2025-02-26T02:52:40Z

Can models that do not support function calling, such as DS R1, use web search at the moment?

cachexy123 · 2025-02-26T05:46:07Z

When will application-level search be supported? 😢

breakstring · 2025-02-26T06:07:00Z

When will application-level search be supported? 😢

Isn't it supported already? This thread is collecting usage feedback...

cachexy123 · 2025-02-26T06:08:36Z

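When will application-level search be supported? 😢

Isn't it supported already? This thread is collecting usage feedback...

Right now, search still requires a model that supports function calling.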

SAnBlog · 2025-02-26T07:28:46Z

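When deployed in mainland China, the icons in the search results fail to load. Apart from using a proxy, is it possible to customize this address?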
