glidea
2025-04-19 15:50:26 +08:00
commit 8b33df8a05
109 changed files with 24407 additions and 0 deletions

17
docs/cherry-studio-mcp.md Normal file

@@ -0,0 +1,17 @@
**Configure the MCP Server**
Default URL: `http://localhost:1301/sse`
<img src="images/cherry-studio-mcp.png" alt="Cherry Studio MCP" width="500">
**Configure the Prompt (optional, but results may not meet expectations without it)**
For the full prompt, see [mcp-client-prompt.md](mcp-client-prompt.md)
<img src="images/cherry-studio-mcp-prompt.png" alt="Cherry Studio MCP Prompt" width="500">
**Usage ideas**
[Doc](preview.md)
Very powerful: it can also modify zenfeed configuration items directly.

178
docs/config-zh.md Normal file

@@ -0,0 +1,178 @@
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :--------- | :------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------- | :------------- |
| `timezone` | `string` | 应用的时区。例如 `Asia/Shanghai`。 | 服务器本地时区 | 否 |
| `log` | `object` | 日志配置。详见下方的 **日志配置** 部分。 | (见具体字段) | 否 |
| `api` | `object` | API 配置。详见下方的 **API 配置** 部分。 | (见具体字段) | 否 |
| `llms` | `列表` | 大语言模型 (LLM) 配置。会被其他配置部分引用。详见下方的 **LLM 配置** 部分。 | `[]` | 是 (至少 1 个) |
| `scrape` | `object` | 抓取配置。详见下方的 **抓取配置** 部分。 | (见具体字段) | 否 |
| `storage` | `object` | 存储配置。详见下方的 **存储配置** 部分。 | (见具体字段) | 否 |
| `scheduls` | `object` | 用于监控 Feed 的调度配置 (也称为监控规则)。详见下方的 **调度配置** 部分。 | (见具体字段) | 否 |
| `notify` | `object` | 通知配置。它接收来自调度模块的结果,通过路由配置进行分组,并通过通知渠道发送给通知接收者。详见下方的 **通知配置**, **通知路由**, **通知接收者**, **通知渠道** 部分。 | (见具体字段) | 是 |
### 日志配置 (`log`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :---------- | :------- | :--------------------------------------------------------- | :----- | :------- |
| `log.level` | `string` | 日志级别, 可选值为 `debug`, `info`, `warn`, `error` 之一。 | `info` | 否 |
### API 配置 (`api`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :----------------- | :------- | :---------------------------------------------------------------------------------------- | :---------------------- | :-------------------- |
| `api.http` | `object` | HTTP API 配置。 | (见具体字段) | 否 |
| `api.http.address` | `string` | HTTP API 的地址 (`[host]:port`)。例如 `0.0.0.0:1300`。应用运行后不可更改。 | `:1300` | 否 |
| `api.mcp` | `object` | MCP API 配置。 | (见具体字段) | 否 |
| `api.mcp.address` | `string` | MCP API 的地址 (`[host]:port`)。例如 `0.0.0.0:1301`。应用运行后不可更改。 | `:1301` | 否 |
| `api.llm` | `string` | 用于总结 Feed 的 LLM 名称。例如 `my-favorite-gemini-king`。引用在 `llms` 部分定义的 LLM。 | `llms` 部分中的默认 LLM | 是 (如果使用总结功能) |
### LLM 配置 (`llms[]`)
此部分定义了可用的大语言模型列表。至少需要一个 LLM 配置。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :----------------------- | :-------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :----------------- | :--------------------------------------------- |
| `llms[].name` | `string` | LLM 的名称 (或称 'id')。例如 `my-favorite-gemini-king`。用于在其他配置部分 (如 `api.llm`, `storage.feed.embedding_llm` 等) 引用此 LLM。 | | 是 |
| `llms[].default` | `bool` | 此 LLM 是否为默认 LLM。只能有一个 LLM 是默认的。 | `false` | 否 (但如果依赖默认行为,则必须有一个为 `true`) |
| `llms[].provider` | `string` | LLM 的提供商, 可选值为 `openai`, `openrouter`, `deepseek`, `gemini`, `volc`, `siliconflow` 之一。例如 `openai`。 | | 是 |
| `llms[].endpoint` | `string` | LLM 的自定义端点。例如 `https://api.openai.com/v1`。 | (提供商特定默认值) | 否 |
| `llms[].api_key` | `string` | LLM 的 API 密钥。 | | 是 |
| `llms[].model` | `string` | LLM 的模型。例如 `gpt-4o-mini`。如果用于生成任务 (如总结),则不能为空。如果此 LLM 被使用,则不能与 `embedding_model` 同时为空。 | | 条件性必需 |
| `llms[].embedding_model` | `string` | LLM 的 Embedding 模型。例如 `text-embedding-3-small`。如果用于 Embedding,则不能为空。如果此 LLM 被使用,则不能与 `model` 同时为空。**注意:** 初次使用后请勿直接修改,应添加新的 LLM 配置。 | | 条件性必需 |
| `llms[].temperature` | `float32` | LLM 的温度 (0-2)。 | `0.0` | 否 |
### 抓取配置 (`scrape`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :----------------------- | :-------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------- | :----- | :---------------------------------- |
| `scrape.past` | `time.Duration` | 抓取 Feed 的回溯时间窗口。例如 `1h` 表示只抓取过去 1 小时的 Feed。 | `3d` | 否 |
| `scrape.interval` | `time.Duration` | 抓取每个源的频率 (全局默认值)。例如 `1h`。 | `1h` | 否 |
| `scrape.rsshub_endpoint` | `string` | RSSHub 的端点。你可以部署自己的 RSSHub 服务器或使用公共实例 (参见 [RSSHub 文档](https://docs.rsshub.app/guide/instances))。例如 `https://rsshub.app`。 | | 是 (如果使用了 `rsshub_route_path`) |
| `scrape.sources` | `对象列表` | 用于抓取 Feed 的源列表。详见下方的 **抓取源配置**。 | `[]` | 是 (至少一个) |
### 抓取源配置 (`scrape.sources[]`)
描述每个要抓取的源。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :-------------------------- | :------------------ | :----------------------------------------------------------------------------------- | :-------------- | :-------------------- |
| `scrape.sources[].interval` | `time.Duration` | 抓取此特定源的频率。覆盖全局 `scrape.interval`。 | 全局 `interval` | 否 |
| `scrape.sources[].name` | `string` | 源的名称。用于标记 Feed。 | | 是 |
| `scrape.sources[].labels` | `map[string]string` | 附加到此源 Feed 的额外键值标签。 | `{}` | 否 |
| `scrape.sources[].rss` | `object` | 此源的 RSS 配置。详见下方的 **抓取源 RSS 配置**。每个源只能设置一种类型 (例如 RSS)。 | `nil` | 是 (如果源类型是 RSS) |
### 抓取源 RSS 配置 (`scrape.sources[].rss`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :--------------------------------------- | :------- | :--------------------------------------------------------------------------------------------------------------------------------- | :----- | :---------------------------------------------- |
| `scrape.sources[].rss.url` | `string` | RSS Feed 的完整 URL。例如 `http://localhost:1200/github/trending/daily/any`。如果设置了 `rsshub_route_path` 则不能设置此项。 | | 是 (除非设置了 `rsshub_route_path`) |
| `scrape.sources[].rss.rsshub_route_path` | `string` | RSSHub 路由路径。例如 `github/trending/daily/any`。将与 `scrape.rsshub_endpoint` 拼接成最终 URL。如果设置了 `url` 则不能设置此项。 | | 是 (除非设置了 `url`, 且需要 `rsshub_endpoint`) |
### 存储配置 (`storage`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :------------- | :------- | :-------------------------------------------- | :----------- | :------- |
| `storage.dir` | `string` | 所有存储的基础目录。应用运行后不可更改。 | `./data` | 否 |
| `storage.feed` | `object` | Feed 存储配置。详见下方的 **Feed 存储配置**。 | (见具体字段) | 否 |
### Feed 存储配置 (`storage.feed`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :---------------------------- | :-------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------- | :------- |
| `storage.feed.rewrites` | `对象列表` | 在存储每个 Feed 之前如何处理它。受 Prometheus relabeling 启发。详见下方的 **重写规则配置**。 | `[]` | 否 |
| `storage.feed.flush_interval` | `time.Duration` | 将 Feed 存储刷新到数据库的频率。更高的值会带来更高的数据丢失风险,但能减少磁盘操作并提高性能。 | `200ms` | 否 |
| `storage.feed.embedding_llm` | `string` | 用于 Feed Embedding 的 LLM 名称 (来自 `llms` 部分)。显著影响语义搜索的准确性。**注意:** 如果要切换,请注意保留旧的 LLM 配置,因为过去的数据仍隐式关联它,否则会导致过去的数据无法进行语义搜索。 | `llms` 部分中的默认 LLM | 否 |
| `storage.feed.retention` | `time.Duration` | Feed 的保留时长。 | `8d` | 否 |
| `storage.feed.block_duration` | `time.Duration` | 每个基于时间的 Feed 存储块的保留时长 (类似于 Prometheus TSDB Block)。 | `25h` | 否 |
### 重写规则配置 (`storage.feed.rewrites[]`)
定义在存储前处理 Feed 的规则。规则按顺序应用。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :--------------------------------------- | :------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :----------------------- | :--------------------------------------------- |
| `...rewrites[].source_label` | `string` | 用作转换源文本的 Feed 标签。默认标签包括: `type`, `source`, `title`, `link`, `pub_time`, `content`。 | `content` | 否 |
| `...rewrites[].skip_too_short_threshold` | `*int` | 如果设置,`source_label` 文本长度低于此阈值的 Feed 将被此规则跳过 (处理将继续进行下一条规则,如果没有更多规则则进行 Feed 存储)。有助于过滤掉过短/信息量不足的 Feed。 | `300` | 否 |
| `...rewrites[].transform` | `object` | 配置如何转换 `source_label` 文本。详见下方的 **重写规则转换配置**。如果未设置,则直接使用 `source_label` 文本进行匹配。 | `nil` | 否 |
| `...rewrites[].match` | `string` | 用于匹配 (转换后) 文本的简单字符串。不能与 `match_re` 同时设置。 | | 否 (使用 `match` 或 `match_re`) |
| `...rewrites[].match_re` | `string` | 用于匹配 (转换后) 文本的正则表达式。 | `.*` (匹配所有) | 否 (使用 `match` 或 `match_re`) |
| `...rewrites[].action` | `string` | 匹配时执行的操作: `create_or_update_label` (使用匹配/转换后的文本添加/更新标签), `drop_feed` (完全丢弃该 Feed)。 | `create_or_update_label` | 否 |
| `...rewrites[].label` | `string` | 要创建或更新的 Feed 标签名称。 | | 是 (如果 `action` 为 `create_or_update_label`) |
### 重写规则转换配置 (`storage.feed.rewrites[].transform`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :--------------------- | :------- | :------------------------------------------------------------------- | :----- | :------- |
| `...transform.to_text` | `object` | 使用 LLM 将源文本转换为文本。详见下方的 **重写规则转换为文本配置**。 | `nil` | 否 |
### 重写规则转换为文本配置 (`storage.feed.rewrites[].transform.to_text`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :------------------ | :------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | :---------------------- | :------- |
| `...to_text.llm` | `string` | 用于转换的 LLM 名称 (来自 `llms` 部分)。 | `llms` 部分中的默认 LLM | 否 |
| `...to_text.prompt` | `string` | 用于转换的 Prompt。源文本将被注入。可以使用 Go 模板语法引用内置 Prompt: `{{ .summary }}`, `{{ .category }}`, `{{ .tags }}`, `{{ .score }}`, `{{ .comment_confucius }}`, `{{ .summary_html_snippet }}`。 | | 是 |
### 调度配置 (`scheduls`)
定义查询和监控 Feed 的规则。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :--------------- | :--------- | :------------------------------------------------------------------------------------------------------- | :----- | :------- |
| `scheduls.rules` | `对象列表` | 用于调度 Feed 的规则列表。每个规则的结果 (匹配的 Feed) 将被发送到通知路由。详见下方的 **调度规则配置**。 | `[]` | 否 |
### 调度规则配置 (`scheduls.rules[]`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :-------------------------------- | :-------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | :----- | :---------------------------------------- |
| `scheduls.rules[].name` | `string` | 规则的名称。 | | 是 |
| `scheduls.rules[].query` | `string` | 用于查找相关 Feed 的语义查询。可选。 | | 否 |
| `scheduls.rules[].threshold` | `float32` | 相关性得分阈值 (0-1),用于过滤语义查询结果。仅在设置了 `query` 时有效。 | `0.6` | 否 |
| `scheduls.rules[].label_filters` | `字符串列表` | 基于 Feed 标签的过滤器 (等于或不等于)。例如 `["category=tech", "source!=github"]`。 | `[]` | 否 |
| `scheduls.rules[].every_day` | `string` | 相对于每天结束时间的查询范围。格式: `start~end` (HH:MM)。例如, `00:00~23:59` (今天), `-22:00~07:00` (昨天 22:00 到今天 07:00)。不能与 `watch_interval` 同时设置。 | | 否 (使用 `every_day` 或 `watch_interval`) |
| `scheduls.rules[].watch_interval` | `time.Duration` | 运行查询的频率。例如 `10m`。不能与 `every_day` 同时设置。 | `10m` | 否 (使用 `every_day` 或 `watch_interval`) |
### 通知配置 (`notify`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :----------------- | :--------- | :------------------------------------------------------------------- | :----------- | :---------------- |
| `notify.route` | `object` | 主通知路由配置。详见下方的 **通知路由配置**。 | (见具体字段) | 是 |
| `notify.receivers` | `对象列表` | 定义通知接收者 (例如电子邮件地址)。详见下方的 **通知接收者配置**。 | `[]` | 是 (至少一个) |
| `notify.channels` | `object` | 配置通知渠道 (例如电子邮件 SMTP 设置)。详见下方的 **通知渠道配置**。 | (见具体字段) | 是 (如果使用渠道) |
### 通知路由配置 (`notify.route` 及 `notify.route.sub_routes[]`)
此结构可以使用 `sub_routes` 进行嵌套。Feed 会首先尝试匹配子路由;如果没有子路由匹配,则应用父路由的配置。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :--------------------------------- | :----------- | :-------------------------------------------------------------------------------------------------------- | :----- | :------------ |
| `...matchers` (仅子路由) | `字符串列表` | 标签匹配器,用于确定 Feed 是否属于此子路由。例如 `["category=tech", "source!=github"]`。 | `[]` | 是 (仅子路由) |
| `...receivers` | `字符串列表` | 接收者的名称列表 (在 `notify.receivers` 中定义),用于发送匹配此路由的 Feed 的通知。 | `[]` | 是 (至少一个) |
| `...group_by` | `字符串列表` | 在发送通知前用于对 Feed 进行分组的标签列表。每个分组会产生一个单独的通知。例如 `["source", "category"]`。 | `[]` | 是 (至少一个) |
| `...compress_by_related_threshold` | `*float32` | 如果设置,则根据语义相关性压缩分组内高度相似的 Feed,仅发送一个代表。阈值 (0-1),越高表示越相似。 | `0.85` | 否 |
| `...sub_routes` | `对象列表` | 嵌套路由列表。允许定义更具体的路由规则。每个对象遵循 **通知路由配置**。 | `[]` | 否 |
### 通知接收者配置 (`notify.receivers[]`)
定义*谁*接收通知。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :------------------------- | :------- | :------------------------------- | :----- | :------------------ |
| `notify.receivers[].name` | `string` | 接收者的唯一名称。在路由中使用。 | | 是 |
| `notify.receivers[].email` | `string` | 接收者的电子邮件地址。 | | 是 (如果使用 Email) |
### 通知渠道配置 (`notify.channels`)
配置通知*如何*发送。
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :---------------------- | :------- | :-------------------------------------------------------- | :----- | :------------------ |
| `notify.channels.email` | `object` | 全局 Email 渠道配置。详见下方的 **通知渠道 Email 配置**。 | `nil` | 是 (如果使用 Email) |
### 通知渠道 Email 配置 (`notify.channels.email`)
| 字段 | 类型 | 描述 | 默认值 | 是否必需 |
| :------------------------------------ | :------- | :------------------------------------------------------------------------------------------------------------------------------------------------------ | :--------------- | :------- |
| `...email.smtp_endpoint` | `string` | SMTP 服务器端点。例如 `smtp.gmail.com:587`。 | | 是 |
| `...email.from` | `string` | 发件人 Email 地址。 | | 是 |
| `...email.password` | `string` | 发件人 Email 的应用专用密码。(对于 Gmail, 参见 [Google 应用密码](https://support.google.com/mail/answer/185833))。 | | 是 |
| `...email.feed_markdown_template` | `string` | 用于在 Email 正文中格式化每个 Feed 的 Markdown 模板。默认渲染 Feed 内容。不能与 `feed_html_snippet_template` 同时设置。可用的模板变量取决于 Feed 标签。 | `{{ .content }}` | 否 |
| `...email.feed_html_snippet_template` | `string` | 用于格式化每个 Feed 的 HTML 片段模板。不能与 `feed_markdown_template` 同时设置。可用的模板变量取决于 Feed 标签。 | | 否 |

178
docs/config.md Normal file

@@ -0,0 +1,178 @@
| Field | Type | Description | Default | Required |
| :------- | :----- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------- | :-------- |
| timezone | string | The timezone of the app. e.g. `Asia/Shanghai`. | server's local timezone | No |
| log | object | The log config. See **Log Configuration** section below. | (see fields) | No |
| api | object | The API config. See **API Configuration** section below. | (see fields) | No |
| llms | list | The LLMs config. Referred to by other config sections. See **LLM Configuration** section below. | `[]` | Yes (>=1) |
| scrape | object | The scrape config. See **Scrape Configuration** section below. | (see fields) | No |
| storage | object | The storage config. See **Storage Configuration** section below. | (see fields) | No |
| scheduls | object | The scheduls config for monitoring feeds (aka monitoring rules). See **Scheduls Configuration** section below. | (see fields) | No |
| notify | object | The notify config. It receives results from scheduls, groups them via route config, and sends to receivers via channels. See **Notify Configuration**, **Notify Route**, **Notify Receiver**, **Notify Channels** sections below. | (see fields) | Yes |
### Log Configuration (`log`)
| Field | Type | Description | Default | Required |
| :---------- | :----- | :-------------------------------------------------- | :------ | :------- |
| `log.level` | string | Log level, one of `debug`, `info`, `warn`, `error`. | `info` | No |
### API Configuration (`api`)
| Field | Type | Description | Default | Required |
| :----------------- | :----- | :------------------------------------------------------------------------------------------------------------------ | :---------------------------- | :------------------------------------- |
| `api.http` | object | The HTTP API config. | (see fields) | No |
| `api.http.address` | string | The address (`[host]:port`) of the HTTP API. e.g. `0.0.0.0:1300`. Cannot be changed after the app is running. | `:1300` | No |
| `api.mcp` | object | The MCP API config. | (see fields) | No |
| `api.mcp.address` | string | The address (`[host]:port`) of the MCP API. e.g. `0.0.0.0:1301`. Cannot be changed after the app is running. | `:1301` | No |
| `api.llm` | string | The LLM name for summarizing feeds. e.g. `my-favorite-gemini-king`. Refers to an LLM defined in the `llms` section. | default LLM in `llms` section | Yes (if summarization feature is used) |
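
As a rough illustration of the table above, an `api` section could look like the fragment below. This is a sketch that assumes the configuration file is written in YAML (the table itself does not state the format); the addresses are the documented examples and the LLM name is the example name from the table.

```yaml
# Sketch only: YAML is an assumed file format; values are illustrative.
api:
  http:
    address: "0.0.0.0:1300"      # [host]:port, documented default is ":1300"
  mcp:
    address: "0.0.0.0:1301"      # [host]:port, documented default is ":1301"
  llm: my-favorite-gemini-king   # must match the name of an entry in llms
```

Both addresses are documented as fixed once the app is running, so it is worth settling them before the first start.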
### LLM Configuration (`llms[]`)
This section defines a list of available Large Language Models. At least one LLM configuration is required.
| Field | Type | Description | Default | Required |
| :----------------------- | :------ | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-------------------------- | :------------------------------------------------------------- |
| `llms[].name` | string | The name (or 'id') of the LLM. e.g. `my-favorite-gemini-king`. Used to refer to this LLM in other sections (`api.llm`, `storage.feed.embedding_llm`, etc.). | | Yes |
| `llms[].default` | bool | Whether this LLM is the default LLM. Only one LLM can be the default. | `false` | No (but one must be `true` if default behavior is relied upon) |
| `llms[].provider` | string | The provider of the LLM, one of `openai`, `openrouter`, `deepseek`, `gemini`, `volc`, `siliconflow`. e.g. `openai`. | | Yes |
| `llms[].endpoint` | string | The custom endpoint of the LLM. e.g. `https://api.openai.com/v1`. | (provider specific default) | No |
| `llms[].api_key` | string | The API key of the LLM. | | Yes |
| `llms[].model` | string | The model of the LLM. e.g. `gpt-4o-mini`. Cannot be empty if used for generation tasks (like summarization). If this LLM is used, `model` and `embedding_model` cannot both be empty. | | Conditionally Yes |
| `llms[].embedding_model` | string | The embedding model of the LLM. e.g. `text-embedding-3-small`. Cannot be empty if used for embedding. If this LLM is used, `model` and `embedding_model` cannot both be empty. **NOTE:** Do not modify after initial use; add a new LLM config instead. | | Conditionally Yes |
| `llms[].temperature` | float32 | The temperature (0-2) of the LLM. | `0.0` | No |
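
A minimal sketch of a single-entry `llms` list, assuming a YAML configuration file. The entry name `my-openai` and the API key are hypothetical placeholders; the provider, endpoint, and model names are the examples given in the table.

```yaml
# Sketch only: YAML is an assumed file format.
llms:
  - name: my-openai                          # hypothetical name, referenced from other sections
    default: true                            # only one entry may be the default
    provider: openai
    endpoint: https://api.openai.com/v1      # optional custom endpoint
    api_key: YOUR_API_KEY                    # placeholder
    model: gpt-4o-mini                       # needed for generation tasks
    embedding_model: text-embedding-3-small  # needed for embedding
    temperature: 0.0
```

Setting both `model` and `embedding_model` on one entry lets the same LLM serve summarization and embedding. Per the note in the table, a later change of embedding model is better expressed as a new list entry than as an edit to this one.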
### Scrape Configuration (`scrape`)
| Field | Type | Description | Default | Required |
| :----------------------- | :-------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------ | :-------------------------------- |
| `scrape.past` | duration | The lookback time window for scraping feeds. e.g. `1h` means only scrape feeds in the past 1 hour. | `3d` | No |
| `scrape.interval` | duration | How often to scrape each source (global default). e.g. `1h`. | `1h` | No |
| `scrape.rsshub_endpoint` | string | The endpoint of the RSSHub. You can deploy your own or use a public one (see [RSSHub Docs](https://docs.rsshub.app/guide/instances)). e.g. `https://rsshub.app`. | | Yes (if `rsshub_route_path` used) |
| `scrape.sources` | list of objects | The sources for scraping feeds. See **Scrape Source Configuration** below. | `[]` | Yes (at least one) |
### Scrape Source Configuration (`scrape.sources[]`)
Describes each source to be scraped.
| Field | Type | Description | Default | Required |
| :-------------------------- | :---------------- | :------------------------------------------------------------------------------------------------------------------------------------- | :-------------- | :-------------------------- |
| `scrape.sources[].interval` | duration | How often to scrape this specific source. Overrides the global `scrape.interval`. | global interval | No |
| `scrape.sources[].name` | string | The name of the source. Used for labeling feeds. | | Yes |
| `scrape.sources[].labels` | map[string]string | Additional key-value labels to add to feeds from this source. | `{}` | No |
| `scrape.sources[].rss` | object | The RSS config for this source. See **Scrape Source RSS Configuration** below. Only one source type (e.g., RSS) can be set per source. | `nil` | Yes (if source type is RSS) |
### Scrape Source RSS Configuration (`scrape.sources[].rss`)
| Field | Type | Description | Default | Required |
| :--------------------------------------- | :----- | :------------------------------------------------------------------------------------------------------------------------------------ | :------ | :---------------------------------------------------- |
| `scrape.sources[].rss.url` | string | The full URL of the RSS feed. e.g. `http://localhost:1200/github/trending/daily/any`. Cannot be set if `rsshub_route_path` is set. | | Yes (unless `rsshub_route_path` is set) |
| `scrape.sources[].rss.rsshub_route_path` | string | The RSSHub route path. e.g. `github/trending/daily/any`. Will be joined with `scrape.rsshub_endpoint`. Cannot be set if `url` is set. | | Yes (unless `url` is set, requires `rsshub_endpoint`) |
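
The scrape tables above can be combined into one fragment. The sketch below assumes a YAML configuration file; the source names and the `category` label are hypothetical, while the URL, route path, and endpoint are the documented examples.

```yaml
# Sketch only: YAML is an assumed file format; source names are hypothetical.
scrape:
  past: 3d
  interval: 1h                           # global default
  rsshub_endpoint: https://rsshub.app    # required because a source uses rsshub_route_path
  sources:
    - name: github-trending
      interval: 10m                      # overrides the global interval for this source
      labels:
        category: tech
      rss:
        rsshub_route_path: github/trending/daily/any   # joined with rsshub_endpoint
    - name: local-rsshub
      rss:
        url: http://localhost:1200/github/trending/daily/any   # full URL, so no route path
```

Each source sets either `url` or `rsshub_route_path`, never both.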
### Storage Configuration (`storage`)
| Field | Type | Description | Default | Required |
| :------------- | :----- | :------------------------------------------------------------------------------- | :----------- | :------- |
| `storage.dir` | string | The base directory for all storages. Cannot be changed after the app is running. | `./data` | No |
| `storage.feed` | object | The feed storage config. See **Feed Storage Configuration** below. | (see fields) | No |
### Feed Storage Configuration (`storage.feed`)
| Field | Type | Description | Default | Required |
| :---------------------------- | :-------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------- | :------- |
| `storage.feed.rewrites` | list of objects | How to process each feed before storing it. Inspired by Prometheus relabeling. See **Rewrite Rule Configuration** below. | `[]` | No |
| `storage.feed.flush_interval` | duration | How often to flush feed storage to the database. A higher value increases the risk of data loss but reduces disk operations and improves performance. | `200ms` | No |
| `storage.feed.embedding_llm` | string | The name of the LLM (from `llms` section) used for embedding feeds. Significantly affects semantic search accuracy. **NOTE:** If you switch, keep the old LLM config defined: past data is still implicitly associated with it and otherwise can no longer be searched semantically. | default LLM in `llms` section | No |
| `storage.feed.retention` | duration | How long to keep a feed. | `8d` | No |
| `storage.feed.block_duration` | duration | How long to keep each time-based feed storage block (similar to Prometheus TSDB Block). | `25h` | No |
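
For orientation, a `storage` section that spells out the documented defaults might look like this. It is a sketch assuming a YAML configuration file; `my-openai` is a hypothetical LLM name that would have to exist in `llms`.

```yaml
# Sketch only: YAML is an assumed file format.
storage:
  dir: ./data                  # cannot be changed after the app is running
  feed:
    flush_interval: 200ms
    embedding_llm: my-openai   # hypothetical; omit to use the default LLM
    retention: 8d
    block_duration: 25h
```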
### Rewrite Rule Configuration (`storage.feed.rewrites[]`)
Defines rules to process feeds before storage. Rules are applied in order.
| Field | Type | Description | Default | Required |
| :--------------------------------------- | :----- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :----------------------- | :-------------------------------------------- |
| `...rewrites[].source_label` | string | The feed label to use as the source text for transformation. Default labels: `type`, `source`, `title`, `link`, `pub_time`, `content`. | `content` | No |
| `...rewrites[].skip_too_short_threshold` | *int | If set, feeds where the `source_label` text length is below this threshold are skipped by this rule (processing continues with the next rule or feed storage if no more rules). Helps filter short/uninformative feeds. | `300` | No |
| `...rewrites[].transform` | object | Configures how to transform the `source_label` text. See **Rewrite Rule Transform Configuration** below. If unset, the `source_label` text is used directly for matching. | `nil` | No |
| `...rewrites[].match` | string | A simple string to match against the (transformed) text. Cannot be set with `match_re`. | | No (use `match` or `match_re`) |
| `...rewrites[].match_re` | string | A regular expression to match against the (transformed) text. | `.*` (matches all) | No (use `match` or `match_re`) |
| `...rewrites[].action` | string | Action to perform if matched: `create_or_update_label` (adds/updates a label with the matched/transformed text), `drop_feed` (discards the feed entirely). | `create_or_update_label` | No |
| `...rewrites[].label` | string | The feed label name to create or update. | | Yes (if `action` is `create_or_update_label`) |
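
A rule without a `transform` matches the label text directly. The fragment below is a sketch of such a rule that drops feeds by title, assuming a YAML configuration file. The pattern is hypothetical, and the threshold is set to `0` on the assumption that titles would otherwise fall under the documented default of `300` and be skipped by the rule.

```yaml
# Sketch only: YAML is an assumed file format; the pattern is hypothetical.
storage:
  feed:
    rewrites:
      - source_label: title
        skip_too_short_threshold: 0       # assumption: lets short titles reach this rule
        match_re: "sponsored|advertisement"
        action: drop_feed                 # discard matching feeds entirely
```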
### Rewrite Rule Transform Configuration (`storage.feed.rewrites[].transform`)
| Field | Type | Description | Default | Required |
| :--------------------- | :----- | :---------------------------------------------------------------------------------------------------------- | :------ | :------- |
| `...transform.to_text` | object | Transform the source text to text using an LLM. See **Rewrite Rule Transform To Text Configuration** below. | `nil` | No |
### Rewrite Rule Transform To Text Configuration (`storage.feed.rewrites[].transform.to_text`)
| Field | Type | Description | Default | Required |
| :------------------ | :----- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | :---------------------------- | :------- |
| `...to_text.llm` | string | The name of the LLM (from `llms` section) to use for transformation. | default LLM in `llms` section | No |
| `...to_text.prompt` | string | The prompt used for transformation. The source text is injected. Go template syntax can refer to built-in prompts: `{{ .summary }}`, `{{ .category }}`, `{{ .tags }}`, `{{ .score }}`, `{{ .comment_confucius }}`, `{{ .summary_html_snippet }}`. | | Yes |
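
Putting the rule, transform, and to-text tables together, a rule that asks an LLM to categorize each feed and stores the answer as a label could be sketched as follows. This assumes a YAML configuration file; `my-openai` is a hypothetical LLM name, and `{{ .category }}` is one of the built-in prompts listed above.

```yaml
# Sketch only: YAML is an assumed file format.
storage:
  feed:
    rewrites:
      - source_label: content        # default source label
        transform:
          to_text:
            llm: my-openai           # hypothetical; omit to use the default LLM
            prompt: "{{ .category }}"
        action: create_or_update_label
        label: category              # required for create_or_update_label
```

With `match` and `match_re` left unset, the documented default `.*` matches every transformed text, so the label is written for every feed the rule does not skip.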
### Scheduls Configuration (`scheduls`)
Defines rules for querying and monitoring feeds.
| Field | Type | Description | Default | Required |
| :--------------- | :-------------- | :------------------------------------------------------------------------------------------------------------------------------------------------- | :------ | :------- |
| `scheduls.rules` | list of objects | The rules for scheduling feeds. Each rule's result (matched feeds) is sent to the notify route. See **Scheduls Rule Configuration** section below. | `[]` | No |
### Scheduls Rule Configuration (`scheduls.rules[]`)
| Field | Type | Description | Default | Required |
| :-------------------------------- | :-------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------ | :--------------------------------------- |
| `scheduls.rules[].name` | string | The name of the rule. | | Yes |
| `scheduls.rules[].query` | string | The semantic query to find relevant feeds. Optional. | | No |
| `scheduls.rules[].threshold` | float32 | Relevance score threshold (0-1) to filter semantic query results. Only works if `query` is set. | `0.6` | No |
| `scheduls.rules[].label_filters` | list of strings | Filters based on feed labels (exact match or non-match). e.g. `["category=tech", "source!=github"]`. | `[]` | No |
| `scheduls.rules[].every_day` | string | Query range relative to the end of each day. Format: `start~end` (HH:MM). e.g., `00:00~23:59` (today), `-22:00~07:00` (yesterday 22:00 to today 07:00). Cannot be set with `watch_interval`. | | No (use `every_day` or `watch_interval`) |
| `scheduls.rules[].watch_interval` | duration | How often to run the query. e.g. `10m`. Cannot be set with `every_day`. | `10m` | No (use `every_day` or `watch_interval`) |
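
The two scheduling modes are mutually exclusive per rule. The sketch below shows one rule of each kind, assuming a YAML configuration file; the rule names and the query text are hypothetical, and the filter and range values are the documented examples.

```yaml
# Sketch only: YAML is an assumed file format; names and query are hypothetical.
scheduls:
  rules:
    - name: overnight-digest
      query: "large language model releases"
      threshold: 0.6                      # only meaningful because query is set
      label_filters: ["category=tech", "source!=github"]
      every_day: "-22:00~07:00"           # yesterday 22:00 to today 07:00
    - name: tech-watch
      label_filters: ["category=tech"]
      watch_interval: 10m                 # run the query every 10 minutes
```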
### Notify Configuration (`notify`)
| Field | Type | Description | Default | Required |
| :----------------- | :-------------- | :------------------------------------------------------------------------------------------------------------- | :----------- | :---------------------- |
| `notify.route` | object | The main notify routing configuration. See **Notify Route Configuration** below. | (see fields) | Yes |
| `notify.receivers` | list of objects | Defines the notification receivers (e.g., email addresses). See **Notify Receiver Configuration** below. | `[]` | Yes (at least one) |
| `notify.channels` | object | Configures the notification channels (e.g., email SMTP settings). See **Notify Channels Configuration** below. | (see fields) | Yes (if using channels) |
### Notify Route Configuration (`notify.route` and `notify.route.sub_routes[]`)
This structure can be nested using `sub_routes`. A feed is matched against sub-routes first; if no sub-route matches, the parent route's configuration applies.
| Field | Type | Description | Default | Required |
| :--------------------------------- | :-------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------ | :------------------- |
| `...matchers` (only in sub-routes) | list of strings | Label matchers to determine if a feed belongs to this sub-route. e.g. `["category=tech", "source!=github"]`. | `[]` | Yes (for sub-routes) |
| `...receivers` | list of strings | Names of the receivers (defined in `notify.receivers`) to send notifications for feeds matching this route. | `[]` | Yes (at least one) |
| `...group_by` | list of strings | Labels to group feeds by before sending notifications. Each group results in a separate notification. e.g., `["source", "category"]`. | `[]` | Yes (at least one) |
| `...compress_by_related_threshold` | *float32 | If set, compresses highly similar feeds (based on semantic relatedness) within a group, sending only one representative. Threshold (0-1). Higher means more similar. | `0.85` | No |
| `...sub_routes` | list of objects | Nested routes. Allows defining more specific routing rules. Each object follows the **Notify Route Configuration**. | `[]` | No |
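
A nested route could be sketched as below, assuming a YAML configuration file. The receiver names `me` and `tech-team` are hypothetical and would need matching entries in `notify.receivers`.

```yaml
# Sketch only: YAML is an assumed file format; receiver names are hypothetical.
notify:
  route:
    receivers: [me]
    group_by: [source]
    compress_by_related_threshold: 0.85
    sub_routes:
      - matchers: ["category=tech", "source!=github"]
        receivers: [tech-team]
        group_by: [source, category]   # one notification per source/category pair
```

Under the matching order described above, a feed labeled `category=tech` from a source other than `github` goes to `tech-team`; every other feed falls back to the parent route and goes to `me`.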
### Notify Receiver Configuration (`notify.receivers[]`)
Defines *who* receives notifications.
| Field | Type | Description | Default | Required |
| :------------------------- | :----- | :----------------------------------------------- | :------ | :------------------- |
| `notify.receivers[].name` | string | The unique name of the receiver. Used in routes. | | Yes |
| `notify.receivers[].email` | string | The email address of the receiver. | | Yes (if using email) |
### Notify Channels Configuration (`notify.channels`)
Configures *how* notifications are sent.
| Field | Type | Description | Default | Required |
| :---------------------- | :----- | :--------------------------------------------------------------------------------- | :------ | :------------------- |
| `notify.channels.email` | object | The global email channel config. See **Notify Channel Email Configuration** below. | `nil` | Yes (if using email) |
### Notify Channel Email Configuration (`notify.channels.email`)
| Field | Type | Description | Default | Required |
| :------------------------------------ | :----- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------- | :------- |
| `...email.smtp_endpoint` | string | The SMTP server endpoint. e.g. `smtp.gmail.com:587`. | | Yes |
| `...email.from` | string | The sender email address. | | Yes |
| `...email.password` | string | The application password for the sender email. (For Gmail, see [Google App Passwords](https://support.google.com/mail/answer/185833)). | | Yes |
| `...email.feed_markdown_template` | string | Markdown template for formatting each feed in the email body. Default renders the feed content. Cannot be set with `feed_html_snippet_template`. Available template variables depend on feed labels. | `{{ .content }}` | No |
| `...email.feed_html_snippet_template` | string | HTML snippet template for formatting each feed. Cannot be set with `feed_markdown_template`. Available template variables depend on feed labels. | | No |
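
A minimal sketch of how the receiver and email channel settings above might fit together. The field names come from the tables; the nesting is inferred from the dotted paths (`notify.receivers[]`, `notify.channels.email`), and every address, name, and password below is a placeholder, not a real value.

```yaml
notify:
  receivers:
    - name: me                       # placeholder; routes refer to this name
      email: me@example.com          # placeholder address
  channels:
    email:
      smtp_endpoint: smtp.gmail.com:587         # example endpoint from the table
      from: sender@example.com                  # placeholder sender address
      password: your-app-password               # placeholder; use an application password
      feed_markdown_template: "{{ .content }}"  # the documented default, quoted for YAML
```

A complete `notify` section would also need the route configuration described earlier. Only one of `feed_markdown_template` and `feed_html_snippet_template` may be set.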

BIN docs/images/ (binary image files, not shown): add-rss.png (1.1 MiB), arch.png (132 KiB), daily-brief.png (570 KiB), monitoring.png (561 KiB), wechat.png (715 KiB), and further images whose filenames are not listed.

105
docs/mcp-client-prompt.md Normal file
View File

@@ -0,0 +1,105 @@
**Your Role:** You are an expert Zenfeed assistant. Your mission is to proactively help the user manage the Zenfeed application and explore its content effectively. You demonstrate deep knowledge of Zenfeed's capabilities, anticipate user needs, and act as an intelligent interface to the application's functions.
**Your capabilities include, but are not limited to:**
* **Searching content:** Use semantic search to find articles and information in Zenfeed.
* **Exploring RSSHub:** Browse RSSHub's categories, websites, and feeds to help the user discover new content sources.
* **Configuring Zenfeed:** Modify Zenfeed's settings, such as adding new feeds, configuring information monitoring, and sending daily briefs.
**Interaction Style:**
* **Expert & Insightful:** Showcase your expertise not just by *using* tools, but by explaining the *implications* of the results. Provide relevant context, analysis, and potential next steps. Demonstrate understanding of *why* you're taking an action.
* **Clearly Structured:** Organize your responses logically using clear headings or bullet points. Follow this structure:
1. **Action Taken:** State clearly *which* tool you are using and *why* it addresses the user's inferred goal.
2. **Key Findings:** Present the essential results from the tool concisely and accurately.
3. **Analysis & Next Steps:** Interpret the findings, explain their significance in relation to the user's goal, and suggest relevant follow-up actions or considerations.
* **Approachable & Moderately Conversational:** Use clear, natural language. Avoid unnecessary jargon, but maintain a professional and knowledgeable tone. Be helpful, engaging, and guide the user effectively.
* **Substantive and Informative:** Your replies must be detailed enough to be genuinely useful. **Avoid overly brief or superficial answers.**
**Core Principles:**
1. **Infer Intent, Act Directly, Explain Thoroughly:** Carefully analyze the user's request to determine their underlying objective. Select the *most appropriate* tool and execute it *without asking for confirmation* (except for `apply_app_config`). Then, report and analyze the results comprehensively.
2. **Prioritize Tool Usage:** Your primary function is to leverage the available Zenfeed tools. **Always attempt to use a relevant tool first** to fulfill the user's request before resorting to general knowledge. Ensure you select the *correct* tool for the task based on your understanding of the user's intent; avoid misusing tools.
3. **Proactivity:** Anticipate user needs. If a user asks about finding new feeds, proactively suggest exploring categories. If they query content, provide insightful summaries and direct links.
**CRITICAL SAFETY EXCEPTION: Applying Configuration (`apply_app_config`)**
Modifying the application configuration requires **strict adherence** to the following **MANDATORY** steps. **DO NOT DEVIATE:**
1. **Identify Need:** Recognize the user wants to change Zenfeed's configuration.
2. **Retrieve Current Config (If Needed):** Use `query_app_config` if the current state is unknown or needed for context. State: "Okay, I need to check the current settings first. Retrieving the current Zenfeed configuration..."
3. **Construct *Complete* New Configuration:** Based *only* on the user's request and potentially the current config, formulate the **entire desired new configuration** in YAML format. This YAML *must* represent the complete final state, including any unchanged settings necessary for a valid config. Ensure correctness and proper formatting.
4. **Present Full YAML for Review:** Show the user the **complete proposed YAML configuration** you have constructed.
5. **Explicitly Request Confirmation:** Ask for the user's explicit approval using clear phrasing:
* "Okay, I've prepared the following *complete* configuration based on your request. Please review it carefully to ensure it matches exactly what you want:"
* `[Present the full YAML here]`
* "**Shall I apply this exact configuration to Zenfeed?**"
6. **Await Clear Confirmation:** **DO NOT** proceed without a clear "yes," "confirm," or equivalent affirmative response *specifically for the presented YAML*.
7. **Execute `apply_app_config`:** *Only after* receiving explicit confirmation, call the `apply_app_config` tool, passing the *exact confirmed YAML* as the `yaml` parameter.
8. **Report Outcome:** Inform the user whether the configuration was applied successfully or if an error occurred.
**Typical Workflow Emphasis: Exploring and Adding RSSHub Feeds**
When a user expresses interest in exploring new feeds via RSSHub, anticipate and guide them through this common sequence:
1. **Discover Categories:** Use `query_rsshub_categories` to show available high-level categories.
* *Assistant Action Example:* "To help you find new feeds, I'll start by fetching the available RSSHub categories..."
2. **Explore Websites within a Category:** Once the user chooses a category, use `query_rsshub_websites` with the chosen `category` ID.
* *Assistant Action Example:* "Okay, let's look at the websites available in the '[Category Name]' category. Fetching the list..."
3. **Find Specific Routes/Feeds for a Website:** When the user selects a website, use `query_rsshub_routes` with the chosen `website_id`.
* *Assistant Action Example:* "Great, let's see what specific feeds are available for '[Website Name]'. Querying the routes..."
4. **Prepare Configuration Change:** If the user wants to add a discovered route:
* Optionally use `query_app_config_schema` if needed to understand the structure for adding feeds. ("Checking the configuration rules...")
* Use `query_app_config` to get the current configuration. ("Fetching your current configuration so I can add the new feed...")
* Follow the **CRITICAL SAFETY EXCEPTION** steps precisely to construct the *new complete YAML*, present it, get explicit confirmation, and *then* use `apply_app_config`.
## Available Zenfeed Tools:
1. **`query_app_config_schema`**
* **Purpose:** Retrieves the JSON schema defining the structure and validation rules for Zenfeed's configuration (`config.yml`).
* **When to Use:** Primarily before constructing a new configuration (`apply_app_config`) to ensure validity, or if the user asks about configuration options. Mention if you're consulting it.
* **Input:** None.
* **Output:** JSON schema string. (Summarize its purpose if fetched: "I've fetched the schema that defines how the configuration file should be structured.")
2. **`query_app_config`**
* **Purpose:** Fetches Zenfeed's *current* operational configuration settings as YAML.
* **When to Use:** Essential before proposing changes (`apply_app_config`). Also useful if the user asks about current settings. Fetch proactively when config changes are likely.
* **Input:** None.
* **Output:** Current configuration as a YAML string. (Summarize key relevant settings.)
3. **`apply_app_config`** (**Requires Strict Confirmation Workflow - See Above!**)
* **Purpose:** Applies a *complete new* configuration to Zenfeed, entirely replacing the existing one.
* **Input:** `yaml` (string, required): The **complete new configuration** in valid YAML format, **as explicitly confirmed by the user.** To keep the YAML valid, do not add a backslash (`\`) after the pipe symbol (`|`) that introduces a multi-line string. For example, write `prompt: |`, not `prompt: |\`.
* **Output:** Success/error message.
* **Reminder:** **NEVER** use without the full confirmation workflow. Safety is paramount.
4. **`query_rsshub_categories`**
* **Purpose:** Lists the main categories available within the integrated RSSHub service.
* **When to Use:** Use proactively when the user wants to discover new feed types or explore RSSHub content sources.
* **Input:** None.
* **Output:** JSON list of categories. **Present the category *names* clearly**, perhaps suggesting diverse options. Explain this is the starting point for exploring RSSHub.
5. **`query_rsshub_websites`**
* **Purpose:** Lists the specific websites/services available within a *specific* RSSHub category.
* **Input:** `category` (string, required): The **ID** of the category (infer from context or user selection, state your assumption if inferring).
* **When to Use:** After the user expresses interest in a category (Step 2 of RSSHub exploration). State which category you're querying.
* **Output:** JSON list of websites. **Present the website *names* clearly**.
6. **`query_rsshub_routes`**
* **Purpose:** Lists the specific feed routes (endpoints/feeds) available for a particular RSSHub website/service.
* **Input:** `website_id` (string, required): The **ID** of the website (infer from context or user selection, state assumption if needed).
* **When to Use:** When the user wants specific feeds from a chosen website (Step 3 of RSSHub exploration). State which website you're querying.
* **Output:** JSON list of routes. **Present the route *titles/descriptions* clearly**, explaining what kind of content each feed represents.
7. **`query`**
* **Purpose:** Performs a semantic search over the content collected by Zenfeed feeds within a specified time range.
* **Input:**
* `query` (string, required): The semantic search terms. **Formulate a specific, effective query (aim for descriptive phrases, potentially >10 words)** based on the user's *information need*, not just echoing their exact words.
* `past` (string, optional, default: `"24h"`): Lookback period (e.g., "2h", "36h"). Use default unless specified or context implies otherwise.
* **When to Use:** When the user asks to find information, articles, or summaries within their collected feeds. Act directly.
* **Output:** A textual summary of the search results. **Crucially, for each relevant finding, include the original `link` using Markdown format: `[Title](link)`.** Briefly explain *why* each result is relevant. Summarize overall findings. If no results are found, state that clearly.
* **Note:** Search results may be inaccurate. Judge for yourself whether each result is actually relevant to the request, and base your reply only on the relevant ones.
**Final Reminder:** Always prioritize understanding the user's true goal, using the correct tool effectively, and providing clear, structured, insightful responses. Follow the `apply_app_config` safety protocol without exception. Reply in the same language as the user's question.
When generating YAML configurations, do not add a backslash (`\`) after the pipe symbol (`|`) that introduces a multi-line string. For example, write `prompt: |`, not `prompt: |\`.
Reply in the same language as the user's question.
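
Illustration of the YAML rule above. This is only a sketch: the surrounding key `example_section` is hypothetical and not part of Zenfeed's schema; only the `prompt: |` form matters.

```yaml
# Hypothetical fragment; `example_section` is an illustrative key name.
example_section:
  prompt: |
    Summarize the article in three sentences.
    Keep the original language of the article.
```

The pipe starts a literal block scalar, so the indented lines that follow are kept as written, line breaks included. A trailing backslash after the pipe makes the YAML invalid.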

View File

@@ -0,0 +1,11 @@
## Export an OPML file from Follow
<img src="images/migrate-from-follow-1.png" alt="" width="300">
<img src="images/migrate-from-follow-2.png" alt="" width="500">
<img src="images/migrate-from-follow-3.png" alt="" width="500">
> Note: be sure to enter http://rsshub:1200
## Import into zenfeed-web
<img src="images/migrate-from-follow-4.png" alt="" width="500">
<img src="images/migrate-from-follow-5.png" alt="" width="500">

34
docs/preview.md Normal file
View File

@@ -0,0 +1,34 @@
## Information Monitoring
```yaml
rules:
  - name: US Tariff Impact
    query: The various impacts and developments of recent US tariff policies, different perspectives, especially their impact on China
```
<img src="images/monitoring.png" alt="Monitoring" width="500">
## Daily Brief
```yaml
rules:
  - name: Evening News
    every_day: "06:30~18:00"
```
<img src="images/daily-brief.png" alt="Daily Brief" width="500">
## Chat with feeds
<img src="images/chat-with-feeds.png" alt="Chat with feeds" width="500">
## Add RSS Feeds
> If you are an experienced RSS user, just give the AI an RSS URL or an OPML file.
<img src="images/add-rss.png" alt="Add RSS" width="500">
## Using with zenfeed-web
<img src="images/feed-list-with-web.png" alt="" width="500">
<img src="images/notification-with-web.png" alt="" width="500">
<img src="images/update-config-with-web.png" alt="" width="500">