2026-04-28 Digest

Tracked 291 · Curated 14

#1 OpenMOSS Releases MOSS-Audio: Open-Source Foundation Model for Audio Understanding

OpenMOSS has released MOSS-Audio, an open-source audio understanding foundation model that unifies capabilities like speech and sound recognition, music analysis, and time-aware question answering. It aims to perform complex reasoning over audio without needing multiple specialized systems. Four variants are available: MOSS-Audio-4B and MOSS-Audio-8B, each with Instruct and Thinking versions.

10.7

#2 Applied Intuition Discusses the Challenges and Future of 'Physical AI'

Qasar Younis and Peter Ludwig of Applied Intuition discuss their journey from autonomy tooling to a $15 billion 'physical AI' company. They argue that 'physical AI' differs from LLMs, with the real bottleneck being deployment onto constrained hardware. They suggest the future of autonomy might resemble an 'Android for every moving machine' rather than one-off demos. The discussion covers the reliability needs of physical AI, its evolution from tools to a platform, AI operating systems, validation challenges, and deployment realities.

8.5

#3 Why Developers Are Betting on Postgres for AI Applications

AI applications and agents need access to existing enterprise data to avoid hallucinations, with structured data serving as the 'ground truth' for AI. pgEdge co-founder Phillip Merrick highlights Postgres as an ideal platform because it is open source, easy to use, scalable, and supports vector database functionality through the pgvector extension. This positions Postgres as a one-stop solution for managing structured, unstructured, and vector data, and for building AI applications that use techniques such as retrieval-augmented generation (RAG), particularly with tools such as the pgEdge Agentic AI Toolkit.

8.1

#4 Study: One-Third of New Websites are AI-Generated

Researchers have found that approximately 35% of new websites created since late 2022 are AI-generated or AI-assisted. The study found that AI is making the web more positive, but it did not confirm an increase in misinformation or a reduction in source diversity.

7.8

#5 Bot-to-Bot Communication Enables Multi-Agent Collaboration

The newly released Bot-to-Bot Communication feature allows bots to converse directly, making it suitable for multi-agent collaborative workflows. For instance, one bot can write code while another reviews it, communicating and making revisions in a group chat.

7.4

#6 Choco Automates Food Distribution with AI Agents

Choco shares a customer story on how they leveraged OpenAI APIs to automate food distribution, enhance productivity, and drive growth, demonstrating the real-world impact of AI.

7.3

#7 Open Questions on AI Compute, Data, and Continual Learning

The author, who is looking to hire a core researcher, poses several open questions about AI. The questions cover whether AI compute monopolies could price out ordinary users, what primarily drives model improvements (data vs. algorithms), breakthroughs in long-horizon coding agents, the trade-off between sample efficiency and memory in models, the impact of AI-generated content on future training data, and the potential for continual learning to drive AI progress.

7.2

#8 Google Cloud Next Marathon Planning System Now Open Source

The marathon planning system showcased at Google Cloud Next is now open source. The system demonstrates a coordinated multi-agent architecture with memory and MCPs, and users can run it locally or in the cloud. The code is available on GitHub.

7.0

#9 Our Principles for AGI

OpenAI's mission is to ensure that Artificial General Intelligence (AGI) benefits all of humanity. Sam Altman shares five core principles that guide the company's work.

6.9

#10 Clicky AI App Simplifies AI Interaction and Mac App Development

Clicky AI has launched a new application featuring a simple interface for interacting with AI and spawning agents. The app can build Mac applications, assist in finding Instagram micro-influencers, and integrates with native Apple Notes, Calendar, and Reminders, requiring zero setup for consumers.

6.8

#11 OSS Agent Outperforms Gemini-3-flash-preview on TerminalBench

An open-source (OSS) agent developed by an individual user has scored 65.2% on the TerminalBench benchmark, surpassing Google's Gemini-3-flash-preview (47.8%) and the closed-source Junie CLI (64.3%). The developer states that the run complied with the rules and used no cheating mechanisms. The leaderboard submission has been pending for 8 days because the maintainers have not responded.

6.7

#12 Google Earth AI Launches New Roads Management Insights Capabilities

Google Earth AI is introducing new Roads Management Insights capabilities, including 'Road Disruptions' and 'Vehicle Counts,' for public and private sectors. These features aim to provide timely information on road conditions, helping fleet management optimize route planning and identify potential safety risks.

6.4

#13 GPT-5.5 is 39% Cheaper than Opus 4.7

According to reports, GPT-5.5 is approximately 39% cheaper than Opus 4.7. Despite a higher output token cost, GPT-5.5 is cheaper for input tokens (cache writes are free), more token-efficient, and tokenizes the same text into fewer tokens.
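The mechanism described above, where a model with a higher output price can still cost less per request, can be sketched with a little arithmetic. The snippet below is only an illustration: `model_a`, `model_b`, every price, the tokenizer ratio, and the request size are hypothetical values chosen to show the effect, not figures from the report, and cache pricing is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in USD. All values used below are hypothetical."""
    input_per_mtok: float
    output_per_mtok: float
    # Tokens this model's tokenizer produces per baseline token of the same text.
    tokens_per_baseline_token: float = 1.0


def request_cost(p: Pricing, input_size: float, output_size: float) -> float:
    """Cost of one request; sizes are given in baseline-tokenizer tokens."""
    input_tokens = input_size * p.tokens_per_baseline_token
    output_tokens = output_size * p.tokens_per_baseline_token
    total = input_tokens * p.input_per_mtok + output_tokens * p.output_per_mtok
    return total / 1_000_000


# Hypothetical: model_a has pricier output, cheaper input, and a denser tokenizer.
model_a = Pricing(input_per_mtok=4.0, output_per_mtok=30.0, tokens_per_baseline_token=0.8)
model_b = Pricing(input_per_mtok=5.0, output_per_mtok=25.0, tokens_per_baseline_token=1.0)

# Hypothetical input-heavy request, typical of agent or RAG workloads.
cost_a = request_cost(model_a, input_size=100_000, output_size=5_000)
cost_b = request_cost(model_b, input_size=100_000, output_size=5_000)
savings = 1 - cost_a / cost_b

print(f"model_a: ${cost_a:.3f}  model_b: ${cost_b:.3f}  savings: {savings:.1%}")
```

With these made-up inputs, the input-heavy request is cheaper on `model_a` even though its output price is higher. The result depends heavily on the input-to-output ratio of the workload, so an output-heavy request could reverse the comparison.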

6.2

#14 AI Generates Playable 3D Worlds from Images

An AI experience takes the generation of explorable 3D worlds from a research-paper demonstration to a one-tap mobile experience. Users can now explore unfamiliar or imaginary scenes simply by providing an image.

5.9
