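An automated daily tech news collection and publishing system built on RSS feeds. It pulls real news from authoritative media RSS feeds, optionally supplements the finance candidates through a third-party API, then formats the result with AI and publishes it to a WeChat Official Account.

RSS Feed (XML) → Python parsing → rule-based dedup/filtering → (optional) third-party API finance supplement → AI/rule-based classification → HTML daily report → Official Account publishing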
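
The rule-based dedup/filtering stage in the pipeline above could look roughly like the following. This is a minimal sketch, not the project's actual implementation: the item shape (a dict with `title` and a timezone-aware `published` datetime), the function names, and the 24-hour default are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone


def normalize_title(title: str) -> str:
    """Lowercase a title and drop whitespace/punctuation so near-identical titles compare equal."""
    return re.sub(r"[\W_]+", "", title.lower())


def dedupe_and_filter(items, now, max_age_hours=24):
    """Keep the first item per normalized title and drop items older than max_age_hours.

    Hypothetical item shape: {"title": str, "published": timezone-aware datetime}.
    """
    cutoff = now - timedelta(hours=max_age_hours)
    seen = set()
    kept = []
    for item in items:
        key = normalize_title(item["title"])
        if not key or key in seen or item["published"] < cutoff:
            continue
        seen.add(key)
        kept.append(item)
    return kept


now = datetime(2024, 1, 2, 8, 30, tzinfo=timezone.utc)
sample = [
    {"title": "OpenAI releases a new model", "published": now - timedelta(hours=2)},
    {"title": "OpenAI Releases a New Model!", "published": now - timedelta(hours=1)},
    {"title": "Old story", "published": now - timedelta(hours=30)},
]
# Only the first headline survives: the second is a duplicate, the third is too old.
print([item["title"] for item in dedupe_and_filter(sample, now)])
```

Normalizing titles before comparison is a cheap way to catch the same story repeated with different casing or punctuation; it will not catch rewritten headlines.
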

| Source | RSS URL | Daily fetch limit | Status |
|---|---|---|---|
| 量子位 (QbitAI) | https://www.qbitai.com/feed | 12 items | ✅ |
| 机器之心 (Jiqizhixin) | https://www.jiqizhixin.com/rss | 12 items | ✅ |
| OpenAI Blog | https://openai.com/blog/rss.xml | 8 items | ✅ |
| Hugging Face Blog | https://huggingface.co/blog/feed.xml | 8 items | ✅ |
| AI News | https://www.artificialintelligence-news.com/feed/ | 8 items | ✅ |
| Google DeepMind | https://deepmind.google/blog/rss.xml | 8 items | ✅ |
| Google Research | https://research.google/blog/rss | 8 items | ✅ |
| AWS ML Blog | https://aws.amazon.com/blogs/machine-learning/feed/ | 8 items | ✅ |
| NVIDIA Blog | https://blogs.nvidia.com/feed/ | 8 items | ✅ |
| ZDNet AI | https://www.zdnet.com/topic/artificial-intelligence/rss.xml | 10 items | ✅ |
| MIT News AI | https://news.mit.edu/rss/topic/artificial-intelligence2 | 8 items | ✅ |
| KDnuggets | https://www.kdnuggets.com/feed | 8 items | ✅ |
| IEEE Spectrum AI | https://spectrum.ieee.org/feeds/topic/artificial-intelligence.rss | 8 items | ✅ |

| Source | RSS URL | Daily fetch limit | Status |
|---|---|---|---|
| 财联社快讯 (CLS Telegraph) | https://rsshub.rssforever.com/cls/telegraph | 15 items | ✅ |
| 财新网 (Caixin) | https://rsshub.pseudoyu.com/caixin/latest | 15 items | ✅ |
| 金十数据 (Jin10) | https://rsshub.rssforever.com/jin10/flash | 10 items | ✅ |
| 华尔街见闻 (Wallstreetcn) | https://dedicated.wallstreetcn.com/rss.xml | 12 items | ✅ |
| Bloomberg Markets | https://feeds.bloomberg.com/markets/news.rss | 10 items | ✅ |
| CNBC | https://www.cnbc.com/id/100003114/device/rss/rss.html | 8 items | ✅ |
| MarketWatch | https://feeds.marketwatch.com/marketwatch/topstories/ | 8 items | ✅ |
| Yahoo Finance | https://finance.yahoo.com/news/rssindex | 8 items | ✅ |
| Seeking Alpha | https://seekingalpha.com/feed.xml | 8 items | ✅ |
| Forbes Business | https://www.forbes.com/business/feed2 | 8 items | ✅ |
| Business Insider | https://feeds.businessinsider.com/custom/all | 8 items | ✅ |
| CoinDesk | https://www.coindesk.com/arc/outboundfeeds/rss/ | 8 items | ✅ |
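
Deprecated / removed: 极客公园 GeekPark (404), VentureBeat AI (404), 品玩 PingWest (parsing issues), DeepLearning.AI (404), Reuters (404), Financial Times / FT Markets / The Economist (not enabled in the current version), 虎嗅 Huxiu / 动点科技 TechNode / Microsoft Research / Analytics Vidhya (not enabled in the current version)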

~/.claude/skills/daily-tech-news/
├── SKILL.md                      # Skill documentation
├── README.md                     # This file
├── news_YYYYMMDD.md              # Generated daily report
├── raw_news_YYYYMMDD.json        # Raw news data
├── scripts/
│   ├── rss_news_collector.py     # Main RSS collection script
│   └── daily-news.sh             # Shell wrapper script
└── logs/
    ├── rss-news.log              # Collection log
    ├── scheduler.log             # Scheduler log
    └── scheduler-error.log       # Error log

- Production run time: daily at 08:30 (Beijing time)
- Trigger: Cloudflare Worker repository_dispatch
- GitHub Workflow: .github/workflows/daily-news.yml
- Local manual debugging: run scripts/rss_news_collector.py or scripts/auto_daily_news.py directly
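
For testing the trigger path by hand, the same kind of repository_dispatch event can be built from Python. The sketch below only constructs the request against GitHub's REST endpoint for repository dispatch events; the owner, repository, token, and the `daily-news` event type are placeholders, and the event type must match whatever the workflow file actually listens for.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, token, event_type):
    """Build (but do not send) a GitHub repository_dispatch request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps({"event_type": event_type}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace them before sending.
req = build_dispatch_request("OWNER", "REPO", "TOKEN", "daily-news")
print(req.get_method(), req.full_url)
# To actually fire the event: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload without touching the network.
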
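
python3 ~/.claude/skills/daily-tech-news/scripts/rss_news_collector.py

cp ~/.claude/skills/daily-tech-news/.env.example ~/.claude/skills/daily-tech-news/.env.local

- Shell environment variables take precedence
- If a variable is not set, .env.local in the project root is loaded automatically
- .env.local is listed in .gitignore, so it is a suitable place for machine-local private keys
- TLS verification is enabled by default for WeChat publishing; if your local network environment requires it, you can explicitly set WECHAT_SSL_VERIFY=false in .env.local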
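
The precedence described above (shell environment first, `.env.local` as a fallback) can be sketched as follows. This is an illustrative loader, not the project's own code: the function name `load_local_env` and the simple `KEY=VALUE` parsing rules are assumptions.

```python
import os
from pathlib import Path


def load_local_env(path, environ=None):
    """Load KEY=VALUE pairs from path without overriding variables that are already set."""
    environ = os.environ if environ is None else environ
    env_file = Path(path)
    if not env_file.is_file():
        return environ
    for raw_line in env_file.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps a value exported in the shell and only fills in missing keys.
        environ.setdefault(key.strip(), value.strip().strip("'\""))
    return environ
```

Calling `load_local_env(".env.local")` once at startup is enough; a key exported in the shell is never replaced by the file.
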

python3 -m unittest \
  ~/.claude/skills/daily-tech-news/scripts/test_rss_datetime_parsing.py \
  ~/.claude/skills/daily-tech-news/scripts/test_rule_based_classification.py \
  ~/.claude/skills/daily-tech-news/scripts/test_local_env_loading.py \
  ~/.claude/skills/daily-tech-news/scripts/test_ssl_config.py

# Collection log
tail -f ~/.claude/skills/daily-tech-news/logs/rss-news.log
# Scheduler log
tail -f ~/.claude/skills/daily-tech-news/logs/scheduler.log
# Error log
tail -f ~/.claude/skills/daily-tech-news/logs/scheduler-error.log

launchctl list | grep dailytechnews

# Start
launchctl start com.dailytechnews.scheduler
# Stop
launchctl stop com.dailytechnews.scheduler
# Reload
launchctl unload ~/Library/LaunchAgents/com.dailytechnews.scheduler.plist
launchctl load ~/Library/LaunchAgents/com.dailytechnews.scheduler.plist

Edit RSS_SOURCES in scripts/rss_news_collector.py:
RSS_SOURCES = {
    "AI 领域": [
        {
            "name": "New source name",
            "url": "https://example.com/feed",
            "limit": 8,  # number of items to fetch
        }
    ]
}

The generated daily report includes:
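- Date card: pink-green gradient background showing the lunar date / weekday / Gregorian date
- AI: purple gradient label, 5 curated single-line news briefs
- Tech updates: blue gradient label, 5 curated single-line news briefs
- Finance headlines: pink-red gradient label, 5 curated single-line news briefs
- Daily quote (微语): pink-yellow gradient background, a motivational quote
- Raw JSON diagnostics: raw_news_*.json additionally stores rss_source_health, which helps troubleshoot RSS sources that return nothing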
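
As a rough illustration of how one of the labeled sections listed above might be rendered, here is a small sketch that builds the HTML for a section with a gradient label and at most five single-line briefs. The function name, markup, and gradient colors are hypothetical; the actual template and palette are not specified here.

```python
from html import escape

# Hypothetical gradient; the report's real colors are not documented in this file.
PURPLE_GRADIENT = "linear-gradient(90deg, #7b5cff, #b388ff)"


def render_section(label, gradient, headlines, limit=5):
    """Render a labeled section with up to `limit` numbered, HTML-escaped headlines."""
    briefs = "".join(
        f"<p>{index}. {escape(title)}</p>"
        for index, title in enumerate(headlines[:limit], start=1)
    )
    label_html = (
        f'<span style="background: {gradient}; color: #fff; padding: 2px 8px;">'
        f"{escape(label)}</span>"
    )
    return f"<section>{label_html}{briefs}</section>"


print(render_section("AI", PURPLE_GRADIENT, ["Model A released", "Chip <B> announced"]))
```

Escaping each headline matters because feed titles can contain characters such as `<` or `&` that would otherwise break the markup.
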

- Check the log: tail -50 logs/rss-news.log
- Some RSS feeds may temporarily have no updates, or their Atom timestamp format may not be recognized correctly
- Inspect rss_source_health and external_source_health in raw_news_*.json
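
Unrecognized timestamps usually come down to the two families of date formats: RSS feeds typically use RFC 822 style dates, while Atom feeds use ISO 8601 / RFC 3339. The sketch below shows one way to accept both with the standard library. It is an assumption-laden illustration (the function name is hypothetical, and naive timestamps are treated as UTC), not a copy of the project's parser.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_feed_datetime(text):
    """Parse an RSS (RFC 822) or Atom (ISO 8601) timestamp into an aware UTC datetime.

    Returns None when neither format matches.
    """
    text = text.strip()
    try:
        parsed = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        parsed = None
    if parsed is None:
        try:
            # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
            parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
        except ValueError:
            return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_feed_datetime("Tue, 02 Jan 2024 08:30:00 +0800"))
print(parse_feed_datetime("2024-01-02T00:30:00Z"))
```

Both sample inputs describe the same instant, so comparing the two results is a quick sanity check that offsets are handled.
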
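
- Check the Doubao API key: echo $DOUBAO_API_KEY
- If the environment variable is not exported locally, confirm that .env.local is filled in
- Review the API call logs
- A rule-based classification fallback is built in, so Doubao rate limiting no longer empties the whole issue
- To strengthen finance coverage, configure MARKETAUX_API_TOKEN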
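
The fallback idea above (try the AI classifier, fall back to rules when it fails) can be sketched as below. The category labels, keyword lists, and function names are hypothetical placeholders chosen for illustration; the project's real rules are not shown in this document.

```python
# Hypothetical keyword rules; matching is a plain substring test on the lowercased title.
CATEGORY_KEYWORDS = {
    "AI": ("openai", "llm", "machine learning", "人工智能", "大模型"),
    "Finance": ("stock", "earnings", "interest rate", "股市", "央行"),
}
DEFAULT_CATEGORY = "Tech"


def classify_by_rules(title):
    """Return the first category whose keyword appears in the title, else the default."""
    lowered = title.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return DEFAULT_CATEGORY


def classify(title, ai_classifier=None):
    """Prefer the AI classifier; fall back to rules if it is missing or raises."""
    if ai_classifier is not None:
        try:
            return ai_classifier(title)
        except Exception:
            pass  # e.g. rate limiting: fall through to the rule-based result
    return classify_by_rules(title)


def rate_limited(_title):
    raise RuntimeError("429 Too Many Requests")


print(classify("OpenAI ships a new LLM", ai_classifier=rate_limited))
```

Substring rules are crude and will misfire on some titles, but they guarantee that every item lands in some category even when the API is unavailable.
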
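
- An optional Marketaux finance supplement is currently supported
- It is triggered only when the finance RSS feeds supply too few items within the past 24 hours
- To enable it in GitHub Actions, configure the repository Secret: MARKETAUX_API_TOKEN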
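
The trigger condition above amounts to counting finance items published in the last 24 hours and comparing against a threshold. A minimal sketch, assuming a hypothetical threshold of 5 items (the actual value used by the project is not stated here) and timezone-aware publish times:

```python
from datetime import datetime, timedelta, timezone


def needs_finance_supplement(published_times, now, min_items=5, window_hours=24):
    """Return True when fewer than min_items finance items fall inside the time window."""
    cutoff = now - timedelta(hours=window_hours)
    recent = sum(1 for published in published_times if published >= cutoff)
    return recent < min_items


now = datetime(2024, 1, 2, 8, 30, tzinfo=timezone.utc)
times = [now - timedelta(hours=h) for h in (1, 3, 30, 48)]
# Only two items are within 24 hours, so the supplement would be requested.
print(needs_finance_supplement(times, now))
```

Gating the API call this way keeps the third-party source as a backup rather than a default, which also limits token usage.
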
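
- Check the Official Account authorization status
- Confirm that the WeChat API key is valid
- If you hit a local certificate-chain problem, you can temporarily set WECHAT_SSL_VERIFY=false in .env.local, or provide WECHAT_CA_BUNDLE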
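
One plausible way to turn these two settings into an SSL context is sketched below. The variable names WECHAT_SSL_VERIFY and WECHAT_CA_BUNDLE come from this document; the function name, the accepted "false" spellings, and the precedence between the two settings are assumptions.

```python
import os
import ssl

FALSE_VALUES = {"0", "false", "no", "off"}


def build_wechat_ssl_context(environ=None):
    """Build an SSL context from WECHAT_SSL_VERIFY / WECHAT_CA_BUNDLE (verification on by default)."""
    environ = os.environ if environ is None else environ
    verify = environ.get("WECHAT_SSL_VERIFY", "true").strip().lower() not in FALSE_VALUES
    ca_bundle = environ.get("WECHAT_CA_BUNDLE", "").strip()
    if not verify:
        context = ssl.create_default_context()
        # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    if ca_bundle:
        # Trust the custom CA bundle while keeping verification enabled.
        return ssl.create_default_context(cafile=ca_bundle)
    return ssl.create_default_context()
```

Supplying a CA bundle is generally the safer fix for certificate-chain problems, since disabling verification removes protection against intercepted connections.
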

launchctl unload ~/Library/LaunchAgents/com.dailytechnews.scheduler.plist
rm ~/Library/LaunchAgents/com.dailytechnews.scheduler.plist

Validate a new source before adding it:
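python3 -c "
import ssl
import urllib.request
import xml.etree.ElementTree as ET

# Quick check only: certificate verification is deliberately disabled here.
ssl_context = ssl._create_unverified_context()
req = urllib.request.Request('RSS_URL', headers={'User-Agent': 'Mozilla/5.0'})
with urllib.request.urlopen(req, timeout=10, context=ssl_context) as response:
    root = ET.fromstring(response.read())
# RSS feeds use <item>; Atom feeds use namespaced <entry> elements.
items = root.findall('.//item') or root.findall('.//{http://www.w3.org/2005/Atom}entry')
print(f'✅ Valid: {len(items)} items')
"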