23 Commits

Author SHA1 Message Date
glidea  8f32e427d4  update README  2025-05-06 11:31:39 +08:00
glidea  3049c49f7a  fix dedup  2025-05-06 11:27:36 +08:00
glidea  14a4f2b8d4  fix rewrite error handing  2025-05-03 14:51:27 +08:00
glidea  6a869574fc  update README  2025-05-02 11:38:58 +08:00
glidea  c581cbacda  fix rewrite error handing  2025-05-01 19:19:34 +08:00
glidea  e7fe17a4bc  update image  2025-04-30 20:16:28 +08:00
glidea  b35aaa3b68  update image  2025-04-30 20:13:25 +08:00
glidea  be83967168  update README  2025-04-30 11:41:44 +08:00
glidea  064bca1dda  fix lint  2025-04-29 08:22:03 +08:00
glidea  ab05089ec6  update README  2025-04-28 23:32:42 +08:00
glidea  b15c52a8c7  update README  2025-04-28 23:30:19 +08:00
glidea  18cc247532  add summary for notification & misc fix  2025-04-28 23:29:34 +08:00
glidea  98837b7d6d  optimize docker compose  2025-04-28 23:26:09 +08:00
glidea  dca095f41c  update README  2025-04-26 13:06:49 +08:00
glidea  92bde40ef0  update README  2025-04-26 12:35:02 +08:00
glidea  fea0bfa88d  update README  2025-04-25 11:20:39 +08:00
glidea  9f9044b078  update README  2025-04-25 11:02:26 +08:00
glidea  6ee9517b31  update README  2025-04-24 21:22:26 +08:00
glidea  b6f81a3ad6  remove legacy tests  2025-04-24 18:53:13 +08:00
glidea  eb788dc738  update README  2025-04-24 13:08:18 +08:00
glidea  9b5aee1ed7  update README  2025-04-24 08:57:10 +08:00
glidea  185cb2fba5  add English doc  2025-04-23 20:58:46 +08:00
glidea  ddf284be0a  update README  2025-04-23 20:19:26 +08:00
21 changed files with 440 additions and 590 deletions

README-en.md (new file, 172 lines)

@@ -0,0 +1,172 @@
zenfeed: Empower RSS with AI. It automatically filters, summarizes, and pushes the information that matters to you, so you can say goodbye to information overload and take back control of your reading.
## Preface
RSS (Really Simple Syndication) was born in the Web 1.0 era to solve the problem of information fragmentation, allowing users to aggregate and track updates from multiple websites in one place without frequent visits. It pushes website updates in summary form to subscribers for quick information access.
However, with the rise of Web 2.0, social media, and algorithmic recommendations, RSS didn't become mainstream. The shutdown of Google Reader in 2013 was a landmark event. As Zhang Yiming pointed out at the time, RSS demands a lot from users: strong information filtering skills and self-discipline to manage feeds, otherwise it's easy to get overwhelmed by information noise. He believed that for most users, the easier "personalized recommendation" was a better solution, which later led to Toutiao and TikTok.
Algorithmic recommendations indeed lowered the bar for information acquisition, but their excessive catering to human weaknesses often leads to filter bubbles and addiction to entertainment. If you want to get truly valuable content from the information stream, you actually need stronger self-control to resist the algorithm's "feeding".
So, is pure RSS subscription the answer? Not necessarily. Information overload and filtering difficulties (information noise) remain pain points for RSS users.
Confucius advocated the doctrine of the mean in all things. Can we find a middle ground that combines the sense of control and high-quality sources from active RSS subscription with technological means to overcome its information overload drawbacks?
Try zenfeed! **AI + RSS** might be a better way to acquire information in this era. zenfeed aims to leverage AI capabilities to help you automatically filter and summarize the information you care about, allowing you to maintain Zen (calmness) amidst the Feed (information flood).
## Project Introduction
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/1b51f1087558402d85496fbe7bddde89)](https://app.codacy.com/gh/glidea/zenfeed/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
[![Maintainability Rating](https://sonarcloud.io/api/project_badges/measure?project=glidea_zenfeed&metric=sqale_rating)](https://sonarcloud.io/summary/new_code?id=glidea_zenfeed)
[![Go Report Card](https://goreportcard.com/badge/github.com/glidea/zenfeed)](https://goreportcard.com/report/github.com/glidea/zenfeed)
zenfeed is your intelligent information assistant. It automatically collects, filters, and summarizes news or topics you follow, then sends them to you. But we're not just building another "Toutiao"... 🤔
![Zenfeed](docs/images/arch.png)
**For [RSS](https://en.wikipedia.org/wiki/RSS) Veterans** 🚗
* zenfeed can be your AI-powered RSS reader (works with [zenfeed-web](https://github.com/glidea/zenfeed-web))
* An [MCP](https://mcp.so/) Server for [RSSHub](https://github.com/DIYgod/RSSHub)
* A customizable, trusted RSS data source and an incredibly fast AI search engine
* Similar to [Feedly AI](https://feedly.com/ai)
<details>
<summary>Preview</summary>
<img src="docs/images/feed-list-with-web.png" alt="Feed list with web UI" width="600">
<img src="docs/images/chat-with-feeds.png" alt="Chat with feeds" width="500">
</details>
**For Seekers of [WWZZ](https://www.wwzzai.com/) Alternatives** 🔍
* zenfeed also offers [information tracking capabilities](https://github.com/glidea/zenfeed/blob/main/docs/config.md#schedule-configuration-schedules), emphasizing high-quality, customizable data sources.
* Think of it as an RSS-based, flexible, more PaaS-like version of [AI Chief Information Officer](https://github.com/TeamWiseFlow/wiseflow?tab=readme-ov-file).
<details>
<summary>Preview</summary>
<img src="docs/images/monitoring.png" alt="Monitoring preview" width="500">
<img src="docs/images/notification-with-web.png" alt="Notification with web UI" width="500">
</details>
**For Information Anxiety Sufferers (like me)** 😌
* "zenfeed" combines "zen" and "feed," signifying maintaining calm (zen) amidst the information flood (feed).
* If constantly checking information streams leaves you anxious and tired, it's because context switching costs more than you think and keeps you from entering a flow state. Try the briefing feature: receive a summary email at a fixed time each day covering the relevant period, giving you a quick, comprehensive overview in one sitting. A bit of a renaissance feel, isn't it? ✨
<details>
<summary>Preview</summary>
<img src="docs/images/daily-brief.png" alt="Daily brief preview" width="500">
</details>
**For Explorers of AI Content Processing** 🔬
* zenfeed features a custom mechanism for pipelining content processing, similar to Prometheus [Relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config).
* Each piece of content is abstracted as a set of labels (e.g., title, source, body... are labels). At each node in the pipeline, you can process specific label values based on custom prompts (e.g., scoring, classifying, summarizing, filtering, adding new labels...). Subsequently, you can filter based on label queries, [route](https://github.com/glidea/zenfeed/blob/main/docs/config.md#notification-route-configuration-notifyroute-and-notifyroutesub_routes), and [display](https://github.com/glidea/zenfeed/blob/main/docs/config.md#notification-channel-email-configuration-notifychannelsemail)... See [Rewrite Rules](https://github.com/glidea/zenfeed/blob/main/docs/config.md#rewrite-rule-configuration-storagefeedrewrites).
* Crucially, you can orchestrate all of this flexibly, which gives zenfeed a strong tooling and personalization feel. You are welcome to integrate private data via the Push API and explore further possibilities.
<details>
<summary>Preview</summary>
<img src="docs/images/update-config-with-web.png" alt="Update config with web UI" width="500">
</details>
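As a sketch of what one stage of such a pipeline could look like in configuration, the hypothetical snippet below reads one label, classifies it with a custom prompt, and writes the result to a new label. The field names and nesting here are illustrative assumptions only; consult the Rewrite Rules documentation linked above for the actual schema.

```yaml
# Illustrative only: a rewrite stage that reads a label, asks an LLM to
# classify it via a custom prompt, and writes the result to a new label.
# Field names are assumptions; see docs/config.md for the real schema.
storage:
  feed:
    rewrites:
      - source_label: content          # label to read from
        transform:
          to_text:
            prompt: |
              Classify the article into one of: tech, finance, other.
              Return only the category word.
        label: category                # new label to write the result to
```

Later stages (routing, notification, display) can then filter on label queries such as `category=tech`.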
**For Onlookers** 🍉
Even if just for the exquisite email styling, install it and try it now!
<img src="docs/images/monitoring.png" alt="Monitoring email style" width="400">
[More Previews](docs/preview.md)
## Installation and Usage
### 1. Installation
By default, zenfeed uses SiliconFlow's Qwen/Qwen2.5-7B-Instruct (free) and Pro/BAAI/bge-m3. If you don't have a SiliconFlow account yet, use this [invitation link](https://cloud.siliconflow.cn/i/U2VS0Q5A) to get ¥14 of credit.
Support for other vendors or models is available; follow the instructions below.
#### Mac/Linux
```bash
curl -L -O https://raw.githubusercontent.com/glidea/zenfeed/main/docker-compose.yml
# If you need to customize more configuration parameters, directly edit docker-compose.yml#configs.zenfeed_config.content BEFORE running the command below.
# Configuration Docs: https://github.com/glidea/zenfeed/blob/main/docs/config.md
API_KEY=your_apikey TZ=your_local_IANA LANG=English docker-compose -p zenfeed up -d
```
#### Windows
> Use PowerShell to execute
```powershell
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/glidea/zenfeed/main/docker-compose.yml" -OutFile "docker-compose.yml"
# If you need to customize more configuration parameters, directly edit docker-compose.yml#configs.zenfeed_config.content BEFORE running the command below.
# Configuration Docs: https://github.com/glidea/zenfeed/blob/main/docs/config.md
$env:API_KEY = "your_apikey"; $env:TZ = "your_local_IANA"; $env:LANG = "English"; docker-compose -p zenfeed up -d
```
### 2. Using the Web UI
Access https://zenfeed-web.pages.dev
> If deployed in an environment like a VPS, access https://vps_public_ip:1400 (remember to open the security group port). Do not use the public frontend above.
> ⚠️ zenfeed currently lacks authentication. Exposing it to the public internet might leak your API Key. Please configure your security groups carefully. If you have security concerns, please open an Issue.
#### Add RSS Feeds
<img src="docs/images/web-add-source.png" alt="Add source via web UI" width="400">
> To migrate from Follow, refer to [migrate-from-follow.md](docs/migrate-from-follow.md)
> Requires access to the respective source sites; ensure network connectivity.
> Wait a few minutes after adding, especially if the model has strict rate limits.
#### Configure Daily Briefings, Monitoring, etc.
<img src="docs/images/notification-with-web.png" alt="Configure notifications via web UI" width="400">
### 3. Configure MCP (Optional)
Using Cherry Studio as an example, configure MCP and connect to zenfeed; see [Cherry Studio MCP](docs/cherry-studio-mcp.md)
> Default address http://localhost:1301/sse
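For MCP clients that take a JSON server list, a minimal entry pointing at zenfeed's SSE endpoint might look like the sketch below. The surrounding key names (`mcpServers`, `type`) vary by client and are shown here only as an assumption; the URL is the default address above.

```json
{
  "mcpServers": {
    "zenfeed": {
      "type": "sse",
      "url": "http://localhost:1301/sse"
    }
  }
}
```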
## Roadmap
* P0 (Very Likely)
  * Support generating podcasts (male/female dialogue), similar to NotebookLM
  * More data sources
    * Email
    * Web-clipping Chrome extension
* P1 (Possible)
  * Keyword search
  * Support search engines as data sources
  * App?
* The following are deprioritized for now due to copyright risks:
  * Webhook notifications
  * Web scraping
## Notice
* Compatibility is not guaranteed before version 1.0.
* The project uses the AGPLv3 license; any forks must also be open source.
* For commercial use, please contact us to register; reasonable support can be provided. Note: legal commercial use only; gray-area activities are not welcome.
* Data is not stored permanently; default retention is 8 days.
## Acknowledgments
* Thanks to [eryajf](https://github.com/eryajf) for providing the [Compose Inline Config](https://github.com/glidea/zenfeed/issues/1) idea, making deployment easier to understand.
## 👏🏻 Contributions Welcome
* No formal guidelines yet; just one requirement: code consistency. It is very important.
## Disclaimer
**Before using the `zenfeed` software (hereinafter referred to as "the Software"), please read and understand this disclaimer carefully. Your download, installation, or use of the Software or any related services signifies that you have read, understood, and agreed to be bound by all terms of this disclaimer. If you do not agree with any part of this disclaimer, please cease using the Software immediately.**
1. **Provided "AS IS":** The Software is provided on an "AS IS" and "AS AVAILABLE" basis, without any warranties of any kind, either express or implied. The project authors and contributors make no warranties or representations regarding the Software's merchantability, fitness for a particular purpose, non-infringement, accuracy, completeness, reliability, security, timeliness, or performance.
2. **User Responsibility:** You are solely responsible for all actions taken using the Software. This includes, but is not limited to:
* **Data Source Selection:** You are responsible for selecting and configuring the data sources (e.g., RSS feeds, potential future Email sources) you connect to the Software. You must ensure you have the right to access and process the content from these sources and comply with their respective terms of service, copyright policies, and applicable laws and regulations.
* **Content Compliance:** You must not use the Software to process, store, or distribute any content that is unlawful, infringing, defamatory, obscene, or otherwise objectionable.
* **API Key and Credential Security:** You are responsible for safeguarding the security of any API keys, passwords, or other credentials you configure within the Software. The authors and contributors are not liable for any loss or damage arising from your failure to maintain proper security.
* **Configuration and Use:** You are responsible for correctly configuring and using the Software's features, including content processing pipelines, filtering rules, notification settings, etc.
3. **Third-Party Content and Services:** The Software may integrate with or rely on third-party data sources and services (e.g., RSSHub, LLM providers, SMTP service providers). The project authors and contributors are not responsible for the availability, accuracy, legality, security, or terms of service of such third-party content or services. Your interactions with these third parties are governed by their respective terms and policies. Copyright for third-party content accessed or processed via the Software (including original articles, summaries, classifications, scores, etc.) belongs to the original rights holders, and you assume all legal liability arising from your use of such content.
4. **No Warranty on Content Processing:** The Software utilizes technologies like Large Language Models (LLMs) to process content (e.g., summarization, classification, scoring, filtering). These processed results may be inaccurate, incomplete, or biased. The project authors and contributors are not responsible for any decisions made or actions taken based on these processed results. The accuracy of semantic search results is also affected by various factors and is not guaranteed.
5. **No Liability for Indirect or Consequential Damages:** In no event shall the project authors or contributors be liable under any legal theory (whether contract, tort, or otherwise) for any direct, indirect, incidental, special, exemplary, or consequential damages arising out of the use or inability to use the Software. This includes, but is not limited to, loss of profits, loss of data, loss of goodwill, business interruption, or other commercial damages or losses, even if advised of the possibility of such damages.
6. **Open Source Software:** The Software is licensed under the AGPLv3 License. You are responsible for understanding and complying with the terms of this license.
7. **Not Legal Advice:** This disclaimer does not constitute legal advice. If you have any questions regarding the legal implications of using the Software, you should consult a qualified legal professional.
8. **Modification and Acceptance:** The project authors reserve the right to modify this disclaimer at any time. Continued use of the Software following any modifications will be deemed acceptance of the revised terms.
**Please be aware: Using the Software to fetch, process, and distribute copyrighted content may carry legal risks. Users are responsible for ensuring their usage complies with all applicable laws, regulations, and third-party terms of service. The project authors and contributors assume no liability for any legal disputes or losses arising from user misuse or improper use of the Software.**


@@ -1,5 +1,42 @@
[English](README-en.md)
![](docs/images/crad.png)
Three things:
**1. An AI-powered RSS reader**
**2. A real-time "news" knowledge base**
**3. A secretary that keeps an eye on "specified events" for you (e.g. "tariff policy changes", "xx stock fluctuations")**
A ready-to-use public service site: https://zenfeed.xyz (aggregating common public sources such as Hacker News, GitHub Trending, and the V2EX hot list)
> The summarization model has been updated to Gemini 2.5 Pro!!
The Doubao bot is being launched!
Join the WeChat group below 👇🏻 to follow updates
## Preface
RSS (Really Simple Syndication) was born in the Web 1.0 era to solve the problem of information fragmentation, letting users aggregate and track updates from multiple websites in one place without frequent visits. It pushes website updates to subscribers in summary form for quick access.
However, with the rise of Web 2.0, social media, and algorithmic recommendations, RSS never became mainstream. The shutdown of Google Reader in 2013 was a landmark event. As Zhang Yiming pointed out at the time, RSS demands a lot from users: strong information-filtering skills and the self-discipline to manage feeds, otherwise it is easy to be drowned in information noise. He believed that for most users the easier "personalized recommendation" was the better solution, which later gave rise to Toutiao and Douyin.
Algorithmic recommendation did lower the bar for information acquisition, but its excessive catering to human weaknesses often leads to filter bubbles and entertainment addiction. If you want truly valuable content from the information stream, you actually need stronger self-control to resist the algorithm's "feeding".
So is pure RSS subscription the answer? Not necessarily. Information overload and filtering difficulties (information noise) remain pain points for RSS users.
Confucius advocated the doctrine of the mean. Can we find a middle ground that keeps the sense of control and high-quality sources of active RSS subscription, while using technology to overcome its information-overload drawbacks?
Try zenfeed! **AI + RSS** may be a better way to acquire information in this era. zenfeed aims to use AI to automatically filter and summarize the information you care about, letting you stay Zen amid the Feed (the information flood).
> Reference article: [AI 复兴 RSS - 少数派](https://sspai.com/post/89494)
## Project Introduction
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/1b51f1087558402d85496fbe7bddde89)](https://app.codacy.com/gh/glidea/zenfeed/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
[![Maintainability Rating](https://sonarcloud.io/api/project_badges/measure?project=glidea_zenfeed&metric=sqale_rating)](https://sonarcloud.io/summary/new_code?id=glidea_zenfeed)
[![Go Report Card](https://goreportcard.com/badge/github.com/glidea/zenfeed)](https://goreportcard.com/report/github.com/glidea/zenfeed)
zenfeed is your intelligent information assistant. It automatically collects, filters, and summarizes the news or topics you follow, then sends them to you. But we're not just building another "Toutiao"... 🤔
@@ -58,19 +95,19 @@ zenfeed 是你的智能信息助手。它自动收集、筛选并总结关注的
## Installation and Usage
### 1. Installation
> Up and running in as little as 1 minute
By default, uses SiliconFlow's Qwen/Qwen2.5-7B-Instruct (free) and Pro/BAAI/bge-m3. If you don't have a SiliconFlow account yet, use this [invitation link](https://cloud.siliconflow.cn/i/U2VS0Q5A) to get ¥14 of credit
Other vendors or models are supported; follow the instructions below
To use another vendor or model, or to customize the deployment: edit **docker-compose.yml**#configs.zenfeed_config.content below.
See the [configuration docs](https://github.com/glidea/zenfeed/blob/main/docs/config-zh.md)
#### Mac/Linux
```bash
curl -L -O https://raw.githubusercontent.com/glidea/zenfeed/main/docker-compose.yml
# If you need to customize more configuration parameters, edit docker-compose.yml#configs.zenfeed_config.content BEFORE running the command below
# Configuration docs: https://github.com/glidea/zenfeed/blob/main/docs/config-zh.md
API_KEY=your_apikey docker-compose -p zenfeed up -d
API_KEY=your_siliconflow_apikey docker-compose -p zenfeed up -d
```
#### Windows
@@ -78,14 +115,13 @@ API_KEY=your_apikey docker-compose -p zenfeed up -d
```powershell
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/glidea/zenfeed/main/docker-compose.yml" -OutFile "docker-compose.yml"
# If you need to customize more configuration parameters, edit docker-compose.yml#configs.zenfeed_config.content BEFORE running the command below
# Configuration docs: https://github.com/glidea/zenfeed/blob/main/docs/config-zh.md
$env:API_KEY = "your_apikey"; docker-compose -p zenfeed up -d
$env:API_KEY = "your_siliconflow_apikey"; docker-compose -p zenfeed up -d
```
Installation complete! Visit https://zenfeed-web.pages.dev
### 2. Using the Web UI
Visit https://zenfeed-web.pages.dev
> If deployed on a VPS or similar environment, visit https://vps_public_ip:1400 (remember to open the security-group port); do not use the public frontend above
> ⚠️ zenfeed has no authentication yet; exposing it to the public internet may leak your API key, so configure your security groups carefully. If you need authentication, please open an Issue
@@ -119,14 +155,18 @@ $env:API_KEY = "your_apikey"; docker-compose -p zenfeed up -d
* Webhook notification support
* Web crawling
> Progress is posted first on [Linux Do](https://linux.do/u/ajd/summary)
## Questions or feedback? Join the group to discuss
<img src="docs/images/wechat.png" alt="Wechat" width="150">
Since you've read this far, how about a Star ⭐️? It helps keep the project from being abandoned
If you have interesting AI work, please contact me!
If you like this project, sponsor a cup of 🧋 (cyber-begging)
<img src="docs/images/sponsor.png" alt="Wechat" width="150">
## Notice
* Compatibility is not guaranteed before version 1.0
* The project uses the AGPLv3 license; any fork must also be open source


@@ -7,6 +7,7 @@ services:
- PUBLIC_DEFAULT_API_URL=http://zenfeed:1300
depends_on:
- zenfeed
restart: unless-stopped
zenfeed:
image: glidea/zenfeed:latest
@@ -32,13 +33,15 @@ services:
- "1301:1301"
depends_on:
- rsshub
restart: unless-stopped
rsshub:
image: diygod/rsshub:2024-12-14
ports:
- "1200:1200"
environment:
- NODE_ENV=production
restart: unless-stopped
volumes:
data: {}


@@ -1,17 +1,17 @@
**配置 MCP Server**
**Configure MCP Server**
默认 URL: `http://localhost:1301/sse`
Default URL: `http://localhost:1301/sse`
<img src="images/cherry-studio-mcp.png" alt="Cherry Studio MCP" width="500">
**配置 Prompt可选但不使用效果可能不符合预期**
**Configure Prompt (Optional but recommended for optimal results)**
完整 Prompt 见 [mcp-client-prompt.md](mcp-client-prompt.md)
For complete prompt, see [mcp-client-prompt.md](mcp-client-prompt.md)
<img src="images/cherry-studio-mcp-prompt.png" alt="Cherry Studio MCP Prompt" width="500">
**玩法参考**
**Usage Examples**
[Doc](preview.md)
非常强大,还可以直接修改 zenfeed 配置项
Very powerful - you can even directly modify zenfeed configuration settings


@@ -142,13 +142,16 @@
This structure can be nested using `sub_routes`. A feed is matched against sub-routes first; if no sub-route matches, the parent route's configuration applies.
| Field | Type | Description | Default | Required |
| :-- | :-- | :-- | :-- | :-- |
| `...matchers` (sub-routes only) | list of strings | Label matchers to determine whether a feed belongs to this sub-route, e.g. `["category=tech", "source!=github"]`. | `[]` | Yes (sub-routes only) |
| `...receivers` | list of strings | Names of the receivers (defined in `notify.receivers`) to notify for feeds matching this route. | `[]` | Yes (at least one) |
| `...group_by` | list of strings | Labels to group feeds by before sending notifications; each group results in a separate notification, e.g. `["source", "category"]`. | `[]` | Yes (at least one) |
| `...compress_by_related_threshold` | `*float32` | If set, compresses highly similar feeds within a group based on semantic relatedness, sending only one representative. Threshold (0-1); higher means more similar. | `0.85` | No |
| `...sub_routes` | list of objects | Nested routes, allowing more specific routing rules. Each object follows the **Notify Route Configuration**. | `[]` | No |
| Field | Type | Description | Default | Required |
| :-- | :-- | :-- | :-- | :-- |
| `...matchers` (sub-routes only) | list of strings | Label matchers to determine whether a feed belongs to this sub-route, e.g. `["category=tech", "source!=github"]`. | `[]` | Yes (sub-routes only) |
| `...receivers` | list of strings | Names of the receivers (defined in `notify.receivers`) to notify for feeds matching this route. | `[]` | Yes (at least one) |
| `...group_by` | list of strings | Labels to group feeds by before sending notifications; each group results in a separate notification, e.g. `["source", "category"]`. | `[]` | Yes (at least one) |
| `...source_label` | `string` | The source label from which each feed's content is extracted for summarization. Defaults to all labels. Strongly recommended to set to 'summary' to reduce context length. | all labels | No |
| `...summary_prompt` | `string` | The prompt used to summarize each group's feeds. | | No |
| `...llm` | `string` | The LLM name to use. Defaults to the default LLM in the `llms` section. An LLM with a large context length is recommended. | default LLM in `llms` section | No |
| `...compress_by_related_threshold` | `*float32` | If set, compresses highly similar feeds within a group based on semantic relatedness, sending only one representative. Threshold (0-1); higher means more similar. | `0.85` | No |
| `...sub_routes` | list of objects | Nested routes, allowing more specific routing rules. Each object follows the **Notify Route Configuration**. | `[]` | No |
### Notify Receiver Configuration (`notify.receivers[]`)


@@ -142,13 +142,16 @@ Defines rules for querying and monitoring feeds.
This structure can be nested using `sub_routes`. A feed is matched against sub-routes first; if no sub-route matches, the parent route's configuration applies.
| Field | Type | Description | Default | Required |
| :--------------------------------- | :-------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------ | :------------------- |
| `...matchers` (only in sub-routes) | list of strings | Label matchers to determine if a feed belongs to this sub-route. e.g. `["category=tech", "source!=github"]`. | `[]` | Yes (for sub-routes) |
| `...receivers` | list of strings | Names of the receivers (defined in `notify.receivers`) to send notifications for feeds matching this route. | `[]` | Yes (at least one) |
| `...group_by` | list of strings | Labels to group feeds by before sending notifications. Each group results in a separate notification. e.g., `["source", "category"]`. | `[]` | Yes (at least one) |
| `...compress_by_related_threshold` | *float32 | If set, compresses highly similar feeds (based on semantic relatedness) within a group, sending only one representative. Threshold (0-1). Higher means more similar. | `0.85` | No |
| `...sub_routes` | list of objects | Nested routes. Allows defining more specific routing rules. Each object follows the **Notify Route Configuration**. | `[]` | No |
| Field | Type | Description | Default | Required |
| :--------------------------------- | :-------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------- | :------------------- |
| `...matchers` (only in sub-routes) | list of strings | Label matchers to determine if a feed belongs to this sub-route. e.g. `["category=tech", "source!=github"]`. | `[]` | Yes (for sub-routes) |
| `...receivers` | list of strings | Names of the receivers (defined in `notify.receivers`) to send notifications for feeds matching this route. | `[]` | Yes (at least one) |
| `...group_by` | list of strings | Labels to group feeds by before sending notifications. Each group results in a separate notification. e.g., `["source", "category"]`. | `[]` | Yes (at least one) |
| `...source_label`                  | string          | The source label from which each feed's content is extracted for summarization. Defaults to all labels. It is strongly recommended to set it to 'summary' to reduce context length. | all labels                    | No                   |
| `...summary_prompt`                | string          | The prompt used to summarize each group's feeds.                                                                                                                             |                               | No                   |
| `...llm`                           | string          | The LLM name to use. Defaults to the default LLM in the `llms` section. An LLM with a large context length is recommended.                                                   | default LLM in `llms` section | No                   |
| `...compress_by_related_threshold` | *float32 | If set, compresses highly similar feeds (based on semantic relatedness) within a group, sending only one representative. Threshold (0-1). Higher means more similar. | `0.85` | No |
| `...sub_routes` | list of objects | Nested routes. Allows defining more specific routing rules. Each object follows the **Notify Route Configuration**. | `[]` | No |
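Putting the fields above together, a route using the new summarization options might be configured like the sketch below. The receiver name and LLM name are placeholders; only the field names come from the table above.

```yaml
notify:
  route:
    receivers: [my-email]                  # must exist in notify.receivers
    group_by: [source]
    source_label: summary                  # recommended: keeps LLM context short
    summary_prompt: "Summarize the key points of this group's feeds in one paragraph."
    llm: large-context-llm                 # defaults to the default entry in `llms`
    compress_by_related_threshold: 0.85
    sub_routes:
      - matchers: ["category=tech"]        # sub-routes are tried first
        receivers: [my-email]
        group_by: [source, category]
```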
### Notify Receiver Configuration (`notify.receivers[]`)

BIN  docs/images/crad.png (new file, 617 KiB; binary content not shown)

BIN  docs/images/sponsor.png (new file, 201 KiB; binary content not shown)


@@ -1,11 +1,11 @@
## 从 Follow 导出 OPML 文件
## Export OPML File from Follow
<img src="images/migrate-from-follow-1.png" alt="" width="300">
<img src="images/migrate-from-follow-2.png" alt="" width="500">
<img src="images/migrate-from-follow-3.png" alt="" width="500">
> 注意一定要填写 http://rsshub:1200
> Note: Make sure to fill in http://rsshub:1200
## 导入 zenfeed-web
## Import to zenfeed-web
<img src="images/migrate-from-follow-4.png" alt="" width="500">
<img src="images/migrate-from-follow-5.png" alt="" width="500">


@@ -355,6 +355,7 @@ func (a *App) setupNotifier() (err error) {
RouterFactory: route.NewFactory(),
ChannelFactory: channel.NewFactory(),
KVStorage: a.kvStorage,
LLMFactory: a.llmFactory,
})
if err != nil {
return err


@@ -145,6 +145,9 @@ type SchedulsRule struct {
type NotifyRoute struct {
Receivers []string `yaml:"receivers,omitempty" json:"receivers,omitempty" desc:"The notify receivers. It is required, at least one receiver is needed."`
GroupBy []string `yaml:"group_by,omitempty" json:"group_by,omitempty" desc:"The group by config to group the feeds, each group will be notified individually. It is required, at least one group by is needed."`
SourceLabel string `yaml:"source_label,omitempty" json:"source_label,omitempty" desc:"The source label to extract the content from each feed, and summarize them. Default are all labels. It is very recommended to set it to 'summary' to reduce context length."`
SummaryPrompt string `yaml:"summary_prompt,omitempty" json:"summary_prompt,omitempty" desc:"The prompt to summarize the feeds of each group."`
LLM string `yaml:"llm,omitempty" json:"llm,omitempty" desc:"The LLM name to use. Default is the default LLM in llms section. A large context length LLM is recommended."`
CompressByRelatedThreshold *float32 `yaml:"compress_by_related_threshold,omitempty" json:"compress_by_related_threshold,omitempty" desc:"The threshold to compress the feeds by relatedness, that is, if the feeds are too similar, only one will be notified. Default is 0.85."`
SubRoutes []NotifySubRoute `yaml:"sub_routes,omitempty" json:"sub_routes,omitempty" desc:"The sub routes to notify the feeds. A feed prefers to be matched by the sub routes, if not matched, it will be matched by the parent route."`
}
@@ -154,6 +157,9 @@ type NotifySubRoute struct {
Receivers []string `yaml:"receivers,omitempty" json:"receivers,omitempty" desc:"The notify receivers. It is required, at least one receiver is needed."`
GroupBy []string `yaml:"group_by,omitempty" json:"group_by,omitempty" desc:"The group by config to group the feeds, each group will be notified individually. It is required, at least one group by is needed."`
SourceLabel string `yaml:"source_label,omitempty" json:"source_label,omitempty" desc:"The source label to extract the content from each feed, and summarize them. Default are all labels. It is very recommended to set it to 'summary' to reduce context length."`
SummaryPrompt string `yaml:"summary_prompt,omitempty" json:"summary_prompt,omitempty" desc:"The prompt to summarize the feeds of each group."`
LLM string `yaml:"llm,omitempty" json:"llm,omitempty" desc:"The LLM name to use. Default is the default LLM in llms section. A large context length LLM is recommended."`
CompressByRelatedThreshold *float32 `yaml:"compress_by_related_threshold,omitempty" json:"compress_by_related_threshold,omitempty" desc:"The threshold to compress the feeds by relatedness, that is, if the feeds are too similar, only one will be notified. Default is 0.85."`
SubRoutes []NotifySubRoute `yaml:"sub_routes,omitempty" json:"sub_routes,omitempty" desc:"The sub routes to notify the feeds. A feed prefers to be matched by the sub routes, if not matched, it will be matched by the parent route."`
}


@@ -130,7 +130,7 @@ func (e *email) buildEmail(receiver Receiver, group *route.FeedGroup) (*gomail.M
m.SetHeader("To", receiver.Email)
m.SetHeader("Subject", group.Name)
body, err := e.buildBodyHTML(group.Feeds)
body, err := e.buildBodyHTML(group)
if err != nil {
return nil, errors.Wrap(err, "build email body HTML")
}
@@ -139,7 +139,7 @@ func (e *email) buildEmail(receiver Receiver, group *route.FeedGroup) (*gomail.M
return m, nil
}
func (e *email) buildBodyHTML(feeds []*route.Feed) ([]byte, error) {
func (e *email) buildBodyHTML(group *route.FeedGroup) ([]byte, error) {
bodyBuf := buffer.Get()
defer buffer.Put(bodyBuf)
@@ -148,14 +148,24 @@ func (e *email) buildBodyHTML(feeds []*route.Feed) ([]byte, error) {
return nil, errors.Wrap(err, "write HTML header")
}
// Write summary.
if err := e.writeSummary(bodyBuf, group.Summary); err != nil {
return nil, errors.Wrap(err, "write summary")
}
// Write each feed content.
for i, feed := range feeds {
if _, err := bodyBuf.WriteString(`
<div style="margin-top:20px; padding-top:15px; border-top:1px solid #f1f3f4;">
<p style="font-size:32px; font-weight:500; margin:0 0 10px 0;">Feeds</p>`); err != nil {
return nil, errors.Wrap(err, "write feeds header")
}
for i, feed := range group.Feeds {
if err := e.writeFeedContent(bodyBuf, feed); err != nil {
return nil, errors.Wrap(err, "write feed content")
}
// Add separator (except the last feed).
if i < len(feeds)-1 {
if i < len(group.Feeds)-1 {
if err := e.writeSeparator(bodyBuf); err != nil {
return nil, errors.Wrap(err, "write separator")
}
@@ -188,6 +198,29 @@ func (e *email) writeHTMLHeader(buf *buffer.Bytes) error {
return err
}
func (e *email) writeSummary(buf *buffer.Bytes, summary string) error {
if summary == "" {
return nil
}
if _, err := buf.WriteString(`
<p style="font-size:32px; font-weight:500; margin:0 0 10px 0;">Summary</p>`); err != nil {
return errors.Wrap(err, "write summary header")
}
contentHTML, err := textconvert.MarkdownToHTML([]byte(summary))
if err != nil {
return errors.Wrap(err, "markdown to HTML")
}
contentHTMLWithStyle := fmt.Sprintf(`<div style="font-size:16px; line-height:1.8;">%s</div>`, contentHTML)
if _, err := buf.WriteString(contentHTMLWithStyle); err != nil {
return errors.Wrap(err, "write summary")
}
return nil
}
const timeLayout = "01-02 15:04"
func (e *email) writeFeedContent(buf *buffer.Bytes, feed *route.Feed) error {
@@ -311,7 +344,8 @@ func (e *email) renderMarkdownContent(buf *buffer.Bytes, feed *route.Feed) (n in
return 0, errors.Wrap(err, "markdown to HTML")
}
if _, err := buf.Write(contentHTML); err != nil {
contentHTMLWithStyle := fmt.Sprintf(`<div style="font-size:16px; line-height:1.8;">%s</div>`, contentHTML)
if _, err := buf.WriteString(contentHTMLWithStyle); err != nil {
return 0, errors.Wrap(err, "write content HTML")
}


@@ -27,6 +27,7 @@ import (
"github.com/glidea/zenfeed/pkg/component"
"github.com/glidea/zenfeed/pkg/config"
"github.com/glidea/zenfeed/pkg/llm"
"github.com/glidea/zenfeed/pkg/notify/channel"
"github.com/glidea/zenfeed/pkg/notify/route"
"github.com/glidea/zenfeed/pkg/schedule/rule"
@@ -67,6 +68,9 @@ func (c *Config) From(app *config.App) *Config {
c.Route = route.Config{
Route: route.Route{
GroupBy: app.Notify.Route.GroupBy,
SourceLabel: app.Notify.Route.SourceLabel,
SummaryPrompt: app.Notify.Route.SummaryPrompt,
LLM: app.Notify.Route.LLM,
CompressByRelatedThreshold: app.Notify.Route.CompressByRelatedThreshold,
Receivers: app.Notify.Route.Receivers,
},
@@ -105,6 +109,9 @@ func convertSubRoute(from *config.NotifySubRoute) *route.SubRoute {
to := &route.SubRoute{
Route: route.Route{
GroupBy: from.GroupBy,
SourceLabel: from.SourceLabel,
SummaryPrompt: from.SummaryPrompt,
LLM: from.LLM,
CompressByRelatedThreshold: from.CompressByRelatedThreshold,
Receivers: from.Receivers,
},
@@ -169,6 +176,7 @@ type Dependencies struct {
RouterFactory route.Factory
ChannelFactory channel.Factory
KVStorage kv.Storage
LLMFactory llm.Factory
}
// --- Factory code block ---
@@ -322,7 +330,10 @@ func (n *notifier) newRouter(config *route.Config) (route.Router, error) {
return n.Dependencies().RouterFactory.New(
n.Instance(),
config,
route.Dependencies{RelatedScore: n.Dependencies().RelatedScore},
route.Dependencies{
RelatedScore: n.Dependencies().RelatedScore,
LLMFactory: n.Dependencies().LLMFactory,
},
)
}
@@ -339,7 +350,7 @@ func (n *notifier) handle(ctx context.Context, result *rule.Result) {
router := n.router
n.mu.RUnlock()
groups, err := router.Route(result)
groups, err := router.Route(ctx, result)
if err != nil {
// We don't retry in the notifier; retries should happen upstream.
log.Error(ctx, errors.Wrap(err, "route"))
@@ -428,7 +439,7 @@ func (n *notifier) send(ctx context.Context, work sendWork) error {
}
var nlogKey = func(group *route.FeedGroup, receiver Receiver) string {
return fmt.Sprintf("notifier.group.%s.receiver.%s", group.Name, receiver.Name)
return fmt.Sprintf("notifier.group.%s.receiver.%s.%d", group.Name, receiver.Name, group.Time.Unix())
}
func (n *notifier) isSent(ctx context.Context, group *route.FeedGroup, receiver Receiver) bool {


@@ -16,6 +16,8 @@
package route
import (
"context"
"encoding/json"
"fmt"
"sort"
"strings"
@@ -25,16 +27,20 @@ import (
"k8s.io/utils/ptr"
"github.com/glidea/zenfeed/pkg/component"
"github.com/glidea/zenfeed/pkg/llm"
"github.com/glidea/zenfeed/pkg/model"
"github.com/glidea/zenfeed/pkg/schedule/rule"
"github.com/glidea/zenfeed/pkg/storage/feed/block"
"github.com/glidea/zenfeed/pkg/telemetry"
telemetrymodel "github.com/glidea/zenfeed/pkg/telemetry/model"
runtimeutil "github.com/glidea/zenfeed/pkg/util/runtime"
timeutil "github.com/glidea/zenfeed/pkg/util/time"
)
// --- Interface code block ---
type Router interface {
component.Component
Route(result *rule.Result) (groups []*Group, err error)
Route(ctx context.Context, result *rule.Result) (groups []*Group, err error)
}
type Config struct {
@@ -43,6 +49,9 @@ type Config struct {
type Route struct {
GroupBy []string
SourceLabel string
SummaryPrompt string
LLM string
CompressByRelatedThreshold *float32
Receivers []string
SubRoutes SubRoutes
@@ -158,6 +167,7 @@ func (c *Config) Validate() error {
type Dependencies struct {
RelatedScore func(a, b [][]float32) (float32, error) // MUST match the vector index.
LLMFactory llm.Factory
}
type Group struct {
@@ -166,10 +176,11 @@ type Group struct {
}
type FeedGroup struct {
Name string
Time time.Time
Labels model.Labels
Feeds []*Feed
Name string
Time time.Time
Labels model.Labels
Feeds []*Feed
Summary string
}
func (g *FeedGroup) ID() string {
@@ -216,7 +227,10 @@ type router struct {
*component.Base[Config, Dependencies]
}
func (r *router) Route(result *rule.Result) (groups []*Group, err error) {
func (r *router) Route(ctx context.Context, result *rule.Result) (groups []*Group, err error) {
ctx = telemetry.StartWith(ctx, append(r.TelemetryLabels(), telemetrymodel.KeyOperation, "Route")...)
defer func() { telemetry.End(ctx, err) }()
// Find route for each feed.
feedsByRoute := r.routeFeeds(result.Feeds)
@@ -233,12 +247,21 @@ func (r *router) Route(result *rule.Result) (groups []*Group, err error) {
// Build final groups.
for ls, feeds := range relatedGroups {
var summary string
if prompt := route.SummaryPrompt; prompt != "" && len(feeds) > 1 {
// TODO: Avoid potential for duplicate generation.
summary, err = r.generateSummary(ctx, prompt, feeds, route.SourceLabel)
if err != nil {
return nil, errors.Wrap(err, "generate summary")
}
}
groups = append(groups, &Group{
FeedGroup: FeedGroup{
Name: fmt.Sprintf("%s %s", result.Rule, ls.String()),
Time: result.Time,
Labels: *ls,
Feeds: feeds,
Name: fmt.Sprintf("%s %s", result.Rule, ls.String()),
Time: result.Time,
Labels: *ls,
Feeds: feeds,
Summary: summary,
},
Receivers: route.Receivers,
})
@@ -252,6 +275,44 @@ func (r *router) Route(result *rule.Result) (groups []*Group, err error) {
return groups, nil
}
func (r *router) generateSummary(
ctx context.Context,
prompt string,
feeds []*Feed,
sourceLabel string,
) (string, error) {
content := r.parseContentToSummary(feeds, sourceLabel)
if content == "" {
return "", nil
}
llm := r.Dependencies().LLMFactory.Get(r.Config().LLM)
summary, err := llm.String(ctx, []string{
content,
prompt,
})
if err != nil {
return "", errors.Wrap(err, "llm string")
}
return summary, nil
}
func (r *router) parseContentToSummary(feeds []*Feed, sourceLabel string) string {
if sourceLabel == "" {
b := runtimeutil.Must1(json.Marshal(feeds))
return string(b)
}
var sb strings.Builder
for _, feed := range feeds {
sb.WriteString(feed.Labels.Get(sourceLabel))
}
return sb.String()
}
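The new `parseContentToSummary` helper above can be sketched standalone. The `Feed` type here is a hypothetical simplification (the real one carries `model.Labels` and uses `runtimeutil.Must1` for the JSON fallback):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Feed is a trimmed stand-in for route.Feed: just a label map.
type Feed struct {
	Labels map[string]string
}

// parseContentToSummary mirrors the router helper: when a source label
// is configured, concatenate that label's values across feeds; when it
// is not, fall back to a JSON dump of the whole feed slice.
func parseContentToSummary(feeds []*Feed, sourceLabel string) string {
	if sourceLabel == "" {
		b, _ := json.Marshal(feeds) // real code treats this as infallible
		return string(b)
	}
	var sb strings.Builder
	for _, f := range feeds {
		sb.WriteString(f.Labels[sourceLabel])
	}
	return sb.String()
}

func main() {
	feeds := []*Feed{
		{Labels: map[string]string{"content": "Go 1.22 released. "}},
		{Labels: map[string]string{"content": "New RSS draft. "}},
	}
	fmt.Println(parseContentToSummary(feeds, "content"))
}
```

The concatenated text (or JSON) is then sent to the LLM together with the configured summary prompt.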
func (r *router) routeFeeds(feeds []*block.FeedVO) map[*Route][]*block.FeedVO {
config := r.Config()
feedsByRoute := make(map[*Route][]*block.FeedVO)
@@ -290,6 +351,12 @@ func (r *router) groupFeedsByLabels(route *Route, feeds []*block.FeedVO) map[*mo
groupedFeeds[labelGroup] = append(groupedFeeds[labelGroup], feed)
}
for _, feeds := range groupedFeeds {
sort.Slice(feeds, func(i, j int) bool {
return feeds[i].ID < feeds[j].ID
})
}
return groupedFeeds
}
@@ -344,6 +411,16 @@ func (r *router) compressRelatedFeedsForGroup(
}
}
// Sort.
sort.Slice(feedsWithRelated, func(i, j int) bool {
return feedsWithRelated[i].ID < feedsWithRelated[j].ID
})
for _, feed := range feedsWithRelated {
sort.Slice(feed.Related, func(i, j int) bool {
return feed.Related[i].ID < feed.Related[j].ID
})
}
return feedsWithRelated, nil
}
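The sorting added above is a determinism fix: grouped feeds (and their related feeds) are ordered by ID so repeated routing of the same rule result yields identical group contents, which keeps downstream dedup stable. A minimal sketch, with a hypothetical `Feed` type:

```go
package main

import (
	"fmt"
	"sort"
)

// Feed is a minimal stand-in for block.FeedVO; only the ID matters here.
type Feed struct{ ID uint64 }

// sortByID applies the same ordering the router now enforces, so two
// routings that produce the same set of feeds in different map-iteration
// orders end up byte-identical.
func sortByID(feeds []*Feed) {
	sort.Slice(feeds, func(i, j int) bool { return feeds[i].ID < feeds[j].ID })
}

func main() {
	a := []*Feed{{ID: 3}, {ID: 1}, {ID: 2}}
	b := []*Feed{{ID: 2}, {ID: 3}, {ID: 1}}
	sortByID(a)
	sortByID(b)
	fmt.Println(a[0].ID, a[1].ID, a[2].ID) // → 1 2 3, same for b
}
```

This matters because Go map iteration order is randomized, so without the sort the same group could hash or render differently on every run.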
@@ -351,8 +428,8 @@ type mockRouter struct {
component.Mock
}
func (m *mockRouter) Route(result *rule.Result) (groups []*Group, err error) {
m.Called(result)
func (m *mockRouter) Route(ctx context.Context, result *rule.Result) (groups []*Group, err error) {
m.Called(ctx, result)
return groups, err
}


@@ -1,6 +1,7 @@
package route
import (
"context"
"fmt"
"testing"
"time"
@@ -11,6 +12,7 @@ import (
"k8s.io/utils/ptr"
"github.com/glidea/zenfeed/pkg/component"
"github.com/glidea/zenfeed/pkg/llm"
"github.com/glidea/zenfeed/pkg/model"
"github.com/glidea/zenfeed/pkg/schedule/rule"
"github.com/glidea/zenfeed/pkg/storage/feed/block"
@@ -382,6 +384,11 @@ func TestRoute(t *testing.T) {
tt.GivenDetail.relatedScore(&mockDep.Mock)
}
llmFactory, err := llm.NewFactory("", nil, llm.FactoryDependencies{}, component.MockOption(func(m *mock.Mock) {
m.On("String", mock.Anything, mock.Anything).Return("test", nil)
}))
Expect(err).NotTo(HaveOccurred())
routerInstance := &router{
Base: component.New(&component.BaseConfig[Config, Dependencies]{
Name: "TestRouter",
@@ -389,11 +396,12 @@ func TestRoute(t *testing.T) {
Config: tt.GivenDetail.config,
Dependencies: Dependencies{
RelatedScore: mockDep.RelatedScore,
LLMFactory: llmFactory,
},
}),
}
groups, err := routerInstance.Route(tt.WhenDetail.ruleResult)
groups, err := routerInstance.Route(context.Background(), tt.WhenDetail.ruleResult)
if tt.ThenExpected.isErr {
Expect(err).To(HaveOccurred())


@@ -465,9 +465,9 @@ func (r *rewriter) Reload(app *config.App) error {
return nil
}
func (r *rewriter) Labels(ctx context.Context, labels model.Labels) (model.Labels, error) {
func (r *rewriter) Labels(ctx context.Context, labels model.Labels) (rewritten model.Labels, err error) {
ctx = telemetry.StartWith(ctx, append(r.TelemetryLabels(), telemetrymodel.KeyOperation, "Labels")...)
defer func() { telemetry.End(ctx, nil) }()
defer func() { telemetry.End(ctx, err) }()
rules := *r.Config()
for _, rule := range rules {


@@ -77,7 +77,7 @@ func (r *periodic) Run() (err error) {
return nil
case now := <-tick.C:
iter(now)
tick.Reset(3 * time.Minute)
tick.Reset(5 * time.Minute)
}
}
}


@@ -208,10 +208,11 @@ func (s *scraper) fillIDs(feeds []*model.Feed) []*model.Feed {
for _, feed := range feeds {
// We cannot include the pub time in the hash,
// because some sources report a dynamic pub time.
//
// The title can also change after publication, so it is excluded as well.
source := feed.Labels.Get(model.LabelSource)
title := feed.Labels.Get(model.LabelTitle)
link := feed.Labels.Get(model.LabelLink)
feed.ID = hashutil.Sum64s([]string{source, title, link})
feed.ID = hashutil.Sum64s([]string{source, link})
}
return feeds
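The ID change above (dropping the title from the hash input) can be illustrated with a stand-in hash. `sum64s` below is hypothetical FNV-1a over the concatenated parts, not necessarily what the repo's `hashutil.Sum64s` does:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// sum64s is a stand-in for hashutil.Sum64s: FNV-1a over the parts.
func sum64s(parts []string) uint64 {
	h := fnv.New64a()
	for _, p := range parts {
		h.Write([]byte(p))
	}
	return h.Sum64()
}

// feedID now hashes only source and link: the title is excluded because
// some sources edit it after publication, which would break dedup.
func feedID(source, title, link string) uint64 {
	_ = title // intentionally unused since this change
	return sum64s([]string{source, link})
}

func main() {
	a := feedID("hn", "Original title", "https://example.com/x")
	b := feedID("hn", "Edited title", "https://example.com/x")
	fmt.Println(a == b) // same source+link → same ID despite the edit
}
```

The trade-off is that two genuinely different posts sharing one link (rare, but possible with aggregators) now collapse into a single feed.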


@@ -523,9 +523,9 @@ type block struct {
coldLoaded bool
}
func (b *block) Run() error {
func (b *block) Run() (err error) {
ctx := telemetry.StartWith(b.Context(), append(b.TelemetryLabels(), telemetrymodel.KeyOperation, "Run")...)
defer func() { telemetry.End(ctx, nil) }()
defer func() { telemetry.End(ctx, err) }()
// Maintain metrics.
go b.maintainMetrics(ctx)
@@ -715,9 +715,9 @@ func (b *block) Query(ctx context.Context, query QueryOptions) (feeds []*FeedVO,
return result.Slice(), nil
}
func (b *block) Exists(ctx context.Context, id uint64) (bool, error) {
func (b *block) Exists(ctx context.Context, id uint64) (exists bool, err error) {
ctx = telemetry.StartWith(ctx, append(b.TelemetryLabels(), telemetrymodel.KeyOperation, "Exists")...)
defer func() { telemetry.End(ctx, nil) }()
defer func() { telemetry.End(ctx, err) }()
// Ensure the block is loaded.
if err := b.ensureLoaded(ctx); err != nil {


@@ -22,6 +22,7 @@ import (
"reflect"
"strconv"
"sync"
"sync/atomic"
"time"
"github.com/benbjohnson/clock"
@@ -578,10 +579,14 @@ func (s *storage) blockDependencies() block.Dependencies {
}
func (s *storage) rewrite(ctx context.Context, feeds []*model.Feed) ([]*model.Feed, error) {
rewritten := make([]*model.Feed, 0, len(feeds))
var wg sync.WaitGroup
var errs []error
var mu sync.Mutex
var (
rewritten = make([]*model.Feed, 0, len(feeds))
wg sync.WaitGroup
mu sync.Mutex
errs []error
dropped atomic.Int32
)
for _, item := range feeds { // TODO: Limit the concurrency & goroutine number.
wg.Add(1)
go func(item *model.Feed) {
@@ -596,6 +601,7 @@ func (s *storage) rewrite(ctx context.Context, feeds []*model.Feed) ([]*model.Fe
}
if len(labels) == 0 {
log.Debug(ctx, "drop feed", "id", item.ID)
dropped.Add(1)
return // Drop empty labels.
}
@@ -607,8 +613,13 @@ func (s *storage) rewrite(ctx context.Context, feeds []*model.Feed) ([]*model.Fe
}(item)
}
wg.Wait()
if len(errs) > 0 {
return nil, errs[0]
switch len(errs) {
case 0:
case len(feeds) - int(dropped.Load()):
return nil, errs[0] // All failed.
default:
log.Error(ctx, errors.Wrap(errs[0], "rewrite feeds"), "error_count", len(errs))
}
return rewritten, nil


@@ -1,520 +0,0 @@
// Copyright (C) 2025 wangyusong
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU Affero General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Affero General Public License for more details.
//
// You should have received a copy of the GNU Affero General Public License
// along with this program. If not, see <https://www.gnu.org/licenses/>.
package feed
// import (
// "context"
// "os"
// "testing"
// "time"
//
// "github.com/benbjohnson/clock"
// . "github.com/onsi/gomega"
// "github.com/stretchr/testify/mock"
// "github.com/glidea/zenfeed/pkg/config"
// "github.com/glidea/zenfeed/pkg/storage/feed/block"
// "github.com/glidea/zenfeed/pkg/storage/feed/block/chunk"
// "github.com/glidea/zenfeed/pkg/test"
// timeutil "github.com/glidea/zenfeed/pkg/util/time"
// )
// func TestNew(t *testing.T) {
// RegisterTestingT(t)
// type givenDetail struct {
// now time.Time
// blocksOnDisk []string // Block directory names in format "2006-01-02T15:04:05Z-2006-01-02T15:04:05Z"
// }
// type whenDetail struct {
// app *config.App
// }
// type thenExpected struct {
// storage storage
// storageHotLen int
// storageColdLen int
// blockCalls []func(obj *mock.Mock)
// }
// tests := []test.Case[givenDetail, whenDetail, thenExpected]{
// {
// Scenario: "Create a new storage from an empty directory",
// Given: "just mock a time",
// When: "call New with a config with a data directory",
// Then: "should return a new storage and a hot block created",
// GivenDetail: givenDetail{
// now: timeutil.MustParse("2025-03-03T10:00:00Z"),
// },
// WhenDetail: whenDetail{
// app: &config.App{
// DB: config.DB{
// Dir: "/tmp/TestNew",
// },
// },
// },
// ThenExpected: thenExpected{
// storage: storage{
// config: &Config{
// Dir: "/tmp/TestNew",
// },
// },
// storageHotLen: 1,
// storageColdLen: 0,
// },
// },
// {
// Scenario: "Create a storage from existing directory with blocks",
// Given: "existing blocks on disk",
// GivenDetail: givenDetail{
// now: timeutil.MustParse("2025-03-03T10:00:00Z"),
// blocksOnDisk: []string{
// "2025-03-02T10:00:00Z ~ 2025-03-03T10:00:00Z", // Hot block
// "2025-03-01T10:00:00Z ~ 2025-03-02T10:00:00Z", // Cold block
// "2025-02-28T10:00:00Z ~ 2025-03-01T10:00:00Z", // Cold block
// },
// },
// When: "call New with a config with existing data directory",
// WhenDetail: whenDetail{
// app: &config.App{
// DB: config.DB{
// Dir: "/tmp/TestNew",
// WriteableWindow: 49 * time.Hour,
// },
// },
// },
// Then: "should return a storage with existing blocks loaded",
// ThenExpected: thenExpected{
// storage: storage{
// config: &Config{
// Dir: "/tmp/TestNew",
// Block: BlockConfig{
// WriteableWindow: 49 * time.Hour,
// },
// },
// },
// storageHotLen: 1,
// storageColdLen: 2,
// blockCalls: []func(obj *mock.Mock){
// func(m *mock.Mock) {
// m.On("State").Return(block.StateHot).Once()
// },
// func(m *mock.Mock) {
// m.On("State").Return(block.StateCold).Once()
// },
// func(m *mock.Mock) {
// m.On("State").Return(block.StateCold).Once()
// },
// },
// },
// },
// }
// for _, tt := range tests {
// t.Run(tt.Scenario, func(t *testing.T) {
// // Given.
// c := clock.NewMock()
// c.Set(tt.GivenDetail.now)
// clk = c // Set global clock.
// defer func() { clk = clock.New() }()
// // Create test directories if needed
// if len(tt.GivenDetail.blocksOnDisk) > 0 {
// for _, blockDir := range tt.GivenDetail.blocksOnDisk {
// err := os.MkdirAll(tt.WhenDetail.app.DB.Dir+"/"+blockDir, 0755)
// Expect(err).To(BeNil())
// }
// }
// // When.
// var calls int
// var blockCalls []*mock.Mock
// blockFactory := block.NewFactory(func(obj *mock.Mock) {
// if calls < len(tt.ThenExpected.blockCalls) {
// tt.ThenExpected.blockCalls[calls](obj)
// calls++
// blockCalls = append(blockCalls, obj)
// }
// })
// s, err := new(tt.WhenDetail.app, blockFactory)
// defer os.RemoveAll(tt.WhenDetail.app.DB.Dir)
// // Then.
// Expect(err).To(BeNil())
// Expect(s).NotTo(BeNil())
// storage := s.(*storage)
// Expect(storage.config).To(Equal(tt.ThenExpected.storage.config))
// Expect(len(storage.hot.blocks)).To(Equal(tt.ThenExpected.storageHotLen))
// Expect(len(storage.cold.blocks)).To(Equal(tt.ThenExpected.storageColdLen))
// for _, call := range blockCalls {
// call.AssertExpectations(t)
// }
// })
// }
// }
// func TestAppend(t *testing.T) {
// RegisterTestingT(t)
// type givenDetail struct {
// hotBlocks []func(m *mock.Mock)
// coldBlocks []func(m *mock.Mock)
// }
// type whenDetail struct {
// feeds []*chunk.Feed
// }
// type thenExpected struct {
// err string
// }
// tests := []test.Case[givenDetail, whenDetail, thenExpected]{
// {
// Scenario: "Append feeds to hot block",
// Given: "a storage with one hot block",
// When: "append feeds within hot block time range",
// Then: "should append feeds to hot block successfully",
// GivenDetail: givenDetail{
// hotBlocks: []func(m *mock.Mock){
// func(m *mock.Mock) {
// m.On("Start").Return(timeutil.MustParse("2025-03-02T10:00:00Z")).Twice()
// m.On("End").Return(timeutil.MustParse("2025-03-03T10:00:00Z")).Twice()
// m.On("State").Return(block.StateHot).Twice()
// m.On("Append", mock.Anything, []*chunk.Feed{
// {ID: 1, Time: timeutil.MustParse("2025-03-02T11:00:00Z")},
// {ID: 2, Time: timeutil.MustParse("2025-03-02T12:00:00Z")},
// }).Return(nil)
// },
// },
// },
// WhenDetail: whenDetail{
// feeds: []*chunk.Feed{
// {ID: 1, Time: timeutil.MustParse("2025-03-02T11:00:00Z")},
// {ID: 2, Time: timeutil.MustParse("2025-03-02T12:00:00Z")},
// },
// },
// ThenExpected: thenExpected{
// err: "",
// },
// },
// {
// Scenario: "Append feeds to non-hot block",
// Given: "a storage with hot and cold blocks",
// When: "append feeds with time in cold block range",
// Then: "should return error",
// GivenDetail: givenDetail{
// coldBlocks: []func(m *mock.Mock){
// func(m *mock.Mock) {},
// },
// },
// WhenDetail: whenDetail{
// feeds: []*chunk.Feed{
// {ID: 1, Time: timeutil.MustParse("2025-03-01T11:00:00Z")},
// },
// },
// ThenExpected: thenExpected{
// err: "cannot find hot block",
// },
// },
// }
// for _, tt := range tests {
// t.Run(tt.Scenario, func(t *testing.T) {
// // Given.
// calls := 0
// var blockMocks []*mock.Mock
// blockFactory := block.NewFactory(func(obj *mock.Mock) {
// if calls < len(tt.GivenDetail.hotBlocks) {
// tt.GivenDetail.hotBlocks[calls](obj)
// calls++
// blockMocks = append(blockMocks, obj)
// }
// })
// var hotBlocks blockChain
// for range tt.GivenDetail.hotBlocks {
// block, err := blockFactory.New(nil, nil, nil, nil, nil)
// Expect(err).To(BeNil())
// hotBlocks.add(block)
// }
// blockFactory = block.NewFactory(func(obj *mock.Mock) {
// if calls < len(tt.GivenDetail.coldBlocks) {
// tt.GivenDetail.coldBlocks[calls](obj)
// calls++
// blockMocks = append(blockMocks, obj)
// }
// })
// var coldBlocks blockChain
// for range tt.GivenDetail.coldBlocks {
// block, err := blockFactory.New(nil, nil, nil, nil, nil)
// Expect(err).To(BeNil())
// coldBlocks.add(block)
// }
// s := storage{
// hot: &hotBlocks,
// cold: &coldBlocks,
// }
// // When.
// err := s.Append(context.Background(), tt.WhenDetail.feeds...)
// // Then.
// if tt.ThenExpected.err != "" {
// Expect(err.Error()).To(ContainSubstring(tt.ThenExpected.err))
// } else {
// Expect(err).To(BeNil())
// }
// for _, m := range blockMocks {
// m.AssertExpectations(t)
// }
// })
// }
// }
// func TestQuery(t *testing.T) {
// RegisterTestingT(t)
// type givenDetail struct {
// hotBlocks []func(m *mock.Mock)
// coldBlocks []func(m *mock.Mock)
// }
// type whenDetail struct {
// query block.QueryOptions
// }
// type thenExpected struct {
// feeds []*block.FeedVO
// err string
// }
// tests := []test.Case[givenDetail, whenDetail, thenExpected]{
// {
// Scenario: "Query feeds from hot blocks",
// Given: "a storage with one hot block containing feeds",
// When: "querying with time range within hot block",
// Then: "should return matching feeds from hot block",
// GivenDetail: givenDetail{
// hotBlocks: []func(m *mock.Mock){
// func(m *mock.Mock) {
// m.On("Start").Return(timeutil.MustParse("2025-03-02T10:00:00Z")).Once()
// m.On("End").Return(timeutil.MustParse("2025-03-03T10:00:00Z")).Once()
// m.On("Query", mock.Anything, mock.MatchedBy(func(q block.QueryOptions) bool {
// return q.Start.Equal(timeutil.MustParse("2025-03-02T12:00:00Z")) &&
// q.End.Equal(timeutil.MustParse("2025-03-02T14:00:00Z"))
// })).Return([]*block.FeedVO{
// {ID: 1, Time: timeutil.MustParse("2025-03-02T12:30:00Z")},
// {ID: 2, Time: timeutil.MustParse("2025-03-02T13:00:00Z")},
// }, nil)
// },
// },
// },
// WhenDetail: whenDetail{
// query: block.QueryOptions{
// Start: timeutil.MustParse("2025-03-02T12:00:00Z"),
// End: timeutil.MustParse("2025-03-02T14:00:00Z"),
// Limit: 10,
// },
// },
// ThenExpected: thenExpected{
// feeds: []*block.FeedVO{
// {ID: 2, Time: timeutil.MustParse("2025-03-02T13:00:00Z")},
// {ID: 1, Time: timeutil.MustParse("2025-03-02T12:30:00Z")},
// },
// err: "",
// },
// },
// {
// Scenario: "Query feeds from multiple blocks",
// Given: "a storage with hot and cold blocks containing feeds",
// When: "querying with time range spanning multiple blocks",
// Then: "should return combined and sorted feeds from all matching blocks",
// GivenDetail: givenDetail{
// hotBlocks: []func(m *mock.Mock){
// func(m *mock.Mock) {
// m.On("Start").Return(timeutil.MustParse("2025-03-02T10:00:00Z"))
// m.On("End").Return(timeutil.MustParse("2025-03-03T10:00:00Z"))
// m.On("Query", mock.Anything, mock.MatchedBy(func(q block.QueryOptions) bool {
// return !q.Start.IsZero() && q.End.IsZero()
// })).Return([]*block.FeedVO{
// {ID: 3, Time: timeutil.MustParse("2025-03-02T15:00:00Z")},
// {ID: 4, Time: timeutil.MustParse("2025-03-02T16:00:00Z")},
// }, nil)
// },
// },
// coldBlocks: []func(m *mock.Mock){
// func(m *mock.Mock) {
// m.On("Start").Return(timeutil.MustParse("2025-03-01T10:00:00Z"))
// m.On("End").Return(timeutil.MustParse("2025-03-02T10:00:00Z"))
// m.On("Query", mock.Anything, mock.MatchedBy(func(q block.QueryOptions) bool {
// return !q.Start.IsZero() && q.End.IsZero()
// })).Return([]*block.FeedVO{
// {ID: 1, Time: timeutil.MustParse("2025-03-01T15:00:00Z")},
// {ID: 2, Time: timeutil.MustParse("2025-03-01T16:00:00Z")},
// }, nil)
// },
// },
// },
// WhenDetail: whenDetail{
// query: block.QueryOptions{
// Start: timeutil.MustParse("2025-03-01T12:00:00Z"),
// Limit: 3,
// },
// },
// ThenExpected: thenExpected{
// feeds: []*block.FeedVO{
// {ID: 4, Time: timeutil.MustParse("2025-03-02T16:00:00Z")},
// {ID: 3, Time: timeutil.MustParse("2025-03-02T15:00:00Z")},
// {ID: 2, Time: timeutil.MustParse("2025-03-01T16:00:00Z")},
// },
// err: "",
// },
// },
// }
// for _, tt := range tests {
// t.Run(tt.Scenario, func(t *testing.T) {
// // Given.
// calls := 0
// var blockMocks []*mock.Mock
// blockFactory := block.NewFactory(func(obj *mock.Mock) {
// if calls < len(tt.GivenDetail.hotBlocks) {
// tt.GivenDetail.hotBlocks[calls](obj)
// calls++
// blockMocks = append(blockMocks, obj)
// }
// })
// var hotBlocks blockChain
// for range tt.GivenDetail.hotBlocks {
// block, err := blockFactory.New(nil, nil, nil, nil, nil)
// Expect(err).To(BeNil())
// hotBlocks.add(block)
// }
// blockFactory = block.NewFactory(func(obj *mock.Mock) {
// if calls < len(tt.GivenDetail.hotBlocks)+len(tt.GivenDetail.coldBlocks) {
// tt.GivenDetail.coldBlocks[calls-len(tt.GivenDetail.hotBlocks)](obj)
// calls++
// blockMocks = append(blockMocks, obj)
// }
// })
// var coldBlocks blockChain
// for range tt.GivenDetail.coldBlocks {
// block, err := blockFactory.New(nil, nil, nil, nil, nil)
// Expect(err).To(BeNil())
// coldBlocks.add(block)
// }
// s := storage{
// hot: &hotBlocks,
// cold: &coldBlocks,
// }
// // When.
// feeds, err := s.Query(context.Background(), tt.WhenDetail.query)
// // Then.
// if tt.ThenExpected.err != "" {
// Expect(err).NotTo(BeNil())
// Expect(err.Error()).To(ContainSubstring(tt.ThenExpected.err))
// } else {
// Expect(err).To(BeNil())
// Expect(feeds).To(HaveLen(len(tt.ThenExpected.feeds)))
// // Check feeds match expected
// for i, feed := range feeds {
// Expect(feed.ID).To(Equal(tt.ThenExpected.feeds[i].ID))
// Expect(feed.Time).To(Equal(tt.ThenExpected.feeds[i].Time))
// Expect(feed.Labels).To(Equal(tt.ThenExpected.feeds[i].Labels))
// }
// }
// for _, m := range blockMocks {
// m.AssertExpectations(t)
// }
// })
// }
// }