What is Rover

Rover is an AI-powered daily tech article digest.

Its backend pipeline automatically fetches articles from 25 RSS feeds and 45+ Twitter accounts every 3 hours, uses AI to score each article from 0 to 10 on 8 dimensions, and generates a curated daily digest at 10:00 Beijing time.

Most articles score between 3 and 6. Only a handful receive high scores. If nothing significant happens on a given day, the digest will be short — by design.

Sources

Rover currently tracks 25 RSS feeds and 45+ Twitter accounts, spanning tech media, AI/ML publications, engineering blogs, Chinese tech communities, and industry practitioners on Twitter. All sources are fetched automatically every 3 hours to keep the digest timely.

How It Works

Collect — Fetch latest content from RSS and Twitter concurrently every 3 hours
Filter — Remove noise like ads, job postings, and overly short content
Deduplicate — Generate semantic embeddings via Gemini Embedding, remove duplicate coverage using cosine similarity (same-day threshold 0.72, cross-day threshold 0.85)
Score — AI evaluates each article across 8 independent dimensions, then computes a weighted total
Select — Cluster by topic, enforce source diversity (max 2 per source), apply quality gap detection, and pick 5-15 best articles
Generate — Create bilingual (Chinese/English) titles and summaries for each article
Publish — Push a summary via Telegram and update the website
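
The Deduplicate step can be sketched roughly as follows. This is a minimal illustration, not Rover's actual code: it assumes embeddings have already been computed (no Gemini call is shown), the `day` and `embedding` field names are hypothetical, and both the "similarity at or above the threshold counts as a duplicate" comparison and the "keep the first article seen" policy are assumptions. Only the two thresholds come from the description above.

```python
import math

SAME_DAY_THRESHOLD = 0.72   # from the pipeline description
CROSS_DAY_THRESHOLD = 0.85  # from the pipeline description


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def is_duplicate(candidate, kept):
    """Use the looser threshold for same-day pairs, the stricter one otherwise."""
    same_day = candidate["day"] == kept["day"]
    threshold = SAME_DAY_THRESHOLD if same_day else CROSS_DAY_THRESHOLD
    similarity = cosine_similarity(candidate["embedding"], kept["embedding"])
    return similarity >= threshold  # assumption: inclusive comparison


def deduplicate(articles):
    """Keep the first article seen and drop later near-duplicates (assumed policy)."""
    kept = []
    for article in articles:
        if not any(is_duplicate(article, k) for k in kept):
            kept.append(article)
    return kept
```

The lower same-day threshold means that two stories published on the same day are merged more aggressively than a story that resembles something from an earlier day.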

Scoring Dimensions

Each article is evaluated by Gemini across 8 dimensions, scored independently (0-10), then combined with weights into a total score:

Relevance (35%) — How closely it matches the focus areas
Scale (10%) — How broadly the event affects people
Impact (10%) — How strong the real-world effect is
Novelty (10%) — How unique and unexpected the information is
Potential (10%) — Likely relevance over the next 6-12 months
Legacy (10%) — Whether it could be a historical turning point
Credibility (10%) — Authority and reliability of the source
Positivity (5%) — Counterbalances media negativity bias
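
As a rough illustration of how the weights combine, the sketch below assumes the total is a plain weighted sum that stays on the same 0-10 scale. The weights are the ones listed above; the dictionary keys, the `weighted_total` function, and the sample scores are hypothetical.

```python
# Weights from the list above; they sum to 1.0.
WEIGHTS = {
    "relevance": 0.35,
    "scale": 0.10,
    "impact": 0.10,
    "novelty": 0.10,
    "potential": 0.10,
    "legacy": 0.10,
    "credibility": 0.10,
    "positivity": 0.05,
}


def weighted_total(scores):
    """Combine per-dimension scores (each 0-10) into one total (assumed weighted sum)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical article: highly relevant, average on everything else.
example = {
    "relevance": 8, "scale": 5, "impact": 5, "novelty": 5,
    "potential": 5, "legacy": 5, "credibility": 5, "positivity": 6,
}
print(round(weighted_total(example), 2))  # 6.1
```

Because Relevance carries 35% of the weight, a one-point change there moves the total by 0.35, more than three times the effect of any 10% dimension.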

Why Positivity?

News sources have a built-in negativity bias: negative stories get more attention and spread faster. Positivity carries a very low weight (only 5% of the total score), so it doesn't significantly change the overall distribution. It does make a noticeable difference in the high-score range. Without it, top articles are almost entirely about conflicts and disasters; with it, scientific discoveries and tech breakthroughs get the ranking they deserve.
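
A small worked example shows why a 5% weight can still matter near the top. It assumes the total is a weighted sum on a 0-10 scale, so Positivity can add at most 0.5 points. The two articles and their scores are made up for illustration.

```python
POSITIVITY_WEIGHT = 0.05  # from the text
MAX_SCORE = 10

# Largest possible contribution of Positivity to the total: 0.5 points.
max_swing = POSITIVITY_WEIGHT * MAX_SCORE


def total(subtotal_without_positivity, positivity):
    """Add the Positivity contribution to the weighted sum of the other 7 dimensions."""
    return subtotal_without_positivity + POSITIVITY_WEIGHT * positivity


# Hypothetical near-tied articles at the top of the ranking.
disaster = total(7.9, positivity=1)      # about 7.95
breakthrough = total(7.7, positivity=9)  # about 8.15

# Without Positivity the disaster story leads (7.9 vs 7.7); with it, the order flips.
print(breakthrough > disaster)  # True
```

Half a point is negligible for the many articles scoring between 3 and 6, but it is enough to reorder articles whose other scores are within a few tenths of each other.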

Focus Areas

Relevance is the highest-weighted dimension (35%). It measures how closely an article matches the following focus areas:

AI & LLM
Crypto/Web3
Frontend
Product Design
Indie Dev & Startups
Internet Products
Open Source
IoT

Selection Logic

Scoring is just the first step. The final selection goes through additional processing:

Topic clustering — Multiple reports on the same topic are clustered. Topics covered by multiple sources get a buzz boost (up to 1.4x), but only if relevance score is ≥ 7
Source diversity — Max 2 articles per source to prevent any single source from dominating
Quality gap — After selecting at least 5 articles, if the gap to the next candidate exceeds 2.5 points, selection stops automatically
Content balance — At least 2 Twitter posts are included to maintain source variety
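
The rules above could be combined along these lines. This is a sketch under several assumptions, not the actual implementation: the field names (`source`, `total`, `relevance`, `topic_sources`) are hypothetical, the boost is assumed to grow by 0.1 per additional source up to the stated 1.4x cap, the quality gap is assumed to be measured between the last selected article and the next candidate, and the 15-article cap is taken from the 5-15 range in How It Works. The content balance rule is left out because the text does not say how it is enforced.

```python
from collections import Counter

MAX_PER_SOURCE = 2           # source diversity
MIN_PICKS, MAX_PICKS = 5, 15
QUALITY_GAP = 2.5            # stop if the next candidate trails by more than this
MAX_BUZZ_BOOST = 1.4
MIN_RELEVANCE_FOR_BOOST = 7


def buzz_boost(source_count, relevance):
    """Multiplier for topics covered by several sources (assumed linear ramp)."""
    if source_count < 2 or relevance < MIN_RELEVANCE_FOR_BOOST:
        return 1.0
    return min(MAX_BUZZ_BOOST, 1.0 + 0.1 * (source_count - 1))


def final_score(candidate):
    return candidate["total"] * buzz_boost(
        candidate["topic_sources"], candidate["relevance"]
    )


def select(candidates):
    """Pick articles in score order, applying the diversity and quality-gap rules."""
    ranked = sorted(candidates, key=final_score, reverse=True)
    picked = []
    per_source = Counter()
    for candidate in ranked:
        if len(picked) >= MAX_PICKS:
            break
        if per_source[candidate["source"]] >= MAX_PER_SOURCE:
            continue  # this source already has its two slots
        if (len(picked) >= MIN_PICKS
                and final_score(picked[-1]) - final_score(candidate) > QUALITY_GAP):
            break  # quality drops off sharply: end the digest here
        picked.append(candidate)
        per_source[candidate["source"]] += 1
    return picked
```

The quality-gap check is what keeps the digest short on quiet days: once five articles are in, a steep drop in score ends the selection instead of padding the list up to 15.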