2026-05-15 Digest

Tracked 269 · Curated 13

#1 Fastino Labs Open-Sources GLiGuard: A Small, Fast, Safe Moderation Model

Fastino Labs has open-sourced GLiGuard, a 300-million-parameter safety moderation model designed to address the high cost and latency of LLM safety checks. Unlike larger, slower decoder-only models, GLiGuard uses an encoder-based approach to text classification and handles multiple safety dimensions in a single pass. It runs up to 16x faster, and it matches or exceeds the accuracy of models 23-90x its size on nine benchmarks.

11.7

#2 Anthropic Overtakes OpenAI in Business AI Adoption, Amazon Launches "Alexa for Shopping"

Fintech firm Ramp's latest AI Index shows that Anthropic has surpassed OpenAI in paid business user adoption for the first time, with usage quadrupling. The result is consistent with OpenAI's earlier 'code red' concerns and its strategic shifts in 2026. Meanwhile, Amazon has integrated its Rufus chatbot into "Alexa for Shopping," a personalized shopping agent that draws on extensive user data and works across devices.

10.0

#3 OpenAI Integrates Codex into ChatGPT Mobile App

OpenAI has launched a preview of Codex within its ChatGPT mobile app for iOS and Android. The feature lets users monitor, guide, and approve code execution tasks that Codex performs on their computers. It is available to all ChatGPT users, including those on the free tier.

10.0

#4 Codex Now Accessible via ChatGPT Mobile App

Codex is now available through the ChatGPT mobile app, allowing users to monitor, steer, and approve coding tasks in real-time across devices and remote environments.

7.3

#5 Rethinking Collaboration Essentials for Human-Agent Products

The article examines what collaboration fundamentally requires and what teams actually need to align on. The author argues that a thorough understanding of communication and collaboration models is essential for building successful human-agent products.

7.0

#6 Nvidia CEO Jensen Huang Addresses Carnegie Mellon Class of 2026

Nvidia CEO Jensen Huang told the Carnegie Mellon Class of 2026 that no generation has entered the world with more powerful tools or greater opportunities. He stated they are at the starting line of the AI era and have a moment to shape what comes next.

6.9

#7 Programming Languages Are Less of a Lock-In

Echoing Mitchell Hashimoto's sentiment about Bun's migration from Zig to Rust, this post argues that programming languages are becoming less of a lock-in. A representative from a mid-sized tech company shared how they used AI coding agents to rewrite their legacy iPhone and Android apps in React Native. They chose React Native for its recent improvements and because they can port back to native if needed, which illustrates the diminishing lock-in effect of programming languages.

6.8

#8 CuPy Tutorial: Mastering GPU Computing with CUDA Kernels, Streams, Sparse Matrices, and Profiling

This tutorial explores CuPy, a GPU-accelerated Python library for high-performance numerical computing, serving as a NumPy alternative. It covers CUDA device introspection, performance comparisons between NumPy and CuPy for matrix multiplication and FFTs, memory pool management, custom Elementwise and Reduction kernels, raw CUDA kernels, CUDA streams, sparse matrices, dense linear solvers, GPU image processing, DLPack interoperability, event-based profiling, and cupyx.jit. The goal is to build a practical understanding of leveraging CuPy for advanced CUDA-level performance.

6.5
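
As a rough illustration of the NumPy-versus-CuPy comparison described in item #8, the sketch below runs the same matrix multiplication and FFT through whichever array module is available. It is not taken from the tutorial: the helper name `matmul_and_fft`, the array size, and the fallback to NumPy when no CUDA device is usable are all assumptions made for illustration.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable,
# so the same code path runs on machines with and without a GPU.
try:
    import cupy as cp
    if cp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device found")
    xp = cp
except Exception:
    xp = np


def matmul_and_fft(n=64, seed=0):
    """Hypothetical helper: multiply two n x n matrices and take a row-wise FFT."""
    rng = np.random.default_rng(seed)
    a_host = rng.standard_normal((n, n)).astype(np.float32)
    b_host = rng.standard_normal((n, n)).astype(np.float32)

    # Move inputs to the active backend (a device copy under CuPy, a no-op under NumPy).
    a = xp.asarray(a_host)
    b = xp.asarray(b_host)

    product = a @ b                   # dense matrix multiplication
    spectrum = xp.fft.fft(a, axis=1)  # FFT along each row

    # Bring results back to host memory so callers always receive NumPy arrays.
    to_host = np.asarray if xp is np else cp.asnumpy
    return a_host, b_host, to_host(product), to_host(spectrum)


if __name__ == "__main__":
    a, b, product, spectrum = matmul_and_fft()
    print("backend:", xp.__name__, "product shape:", product.shape)
```

Because CuPy mirrors much of the NumPy API, swapping the module alias is often enough for simple workloads; the kernels, streams, and profiling topics the tutorial lists go beyond what this sketch shows.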

#9 Arena AI Model ELO History Tracker

A developer created a live tracker visualizing the performance changes of flagship AI models by plotting their historical ELO ratings from Arena AI. The dashboard focuses on a single continuous curve per major AI lab, tracking their highest-rated model over time to illustrate generational jumps and performance decay. The creator is seeking community insights on evaluation datasets that specifically test consumer-facing chat UIs, beyond raw API endpoints, to better capture real-world user experiences.

6.4
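
The "one curve per lab" idea in item #9 can be sketched as follows: for each snapshot date, keep only the highest-rated model from each lab. This is a guess at the approach, not the tracker's actual code; the function name `lab_curves`, the record layout, and the lab and model names in the sample data are hypothetical.

```python
from collections import defaultdict


def lab_curves(records):
    """Build one curve per lab from (date, lab, model, rating) records.

    Dates are assumed to be ISO strings (YYYY-MM-DD) so they sort chronologically.
    Returns {lab: [(date, best_model, best_rating), ...]} ordered by date.
    """
    best = defaultdict(dict)  # lab -> date -> (model, rating)
    for date, lab, model, rating in records:
        current = best[lab].get(date)
        # Keep the top-rated model for this lab on this date.
        if current is None or rating > current[1]:
            best[lab][date] = (model, rating)

    return {
        lab: [(date, model, rating) for date, (model, rating) in sorted(by_date.items())]
        for lab, by_date in best.items()
    }


if __name__ == "__main__":
    # Hypothetical sample data; not real leaderboard ratings.
    sample = [
        ("2025-01-01", "LabA", "a-1", 1200),
        ("2025-02-01", "LabA", "a-1", 1190),  # older model drifting down
        ("2025-02-01", "LabA", "a-2", 1250),  # newer model takes over the curve
        ("2025-01-01", "LabB", "b-1", 1210),
    ]
    for lab, curve in lab_curves(sample).items():
        print(lab, curve)
```

Tracking the maximum per date, rather than a running all-time maximum, lets the curve dip when a lab's best model loses rating, which is the decay the dashboard sets out to show.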

#10 A Guide To Event-Driven Architectural Patterns

This article explores event-driven architecture (EDA) patterns. It explains why traditional synchronous communication breaks down at scale in distributed systems and introduces EDA as an alternative in which services publish events and other services react independently. The article covers the foundations of EDA and walks through six patterns that solve specific problems introduced by this model.

6.1
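
The publish-and-react model summarized in item #10 can be illustrated with a minimal in-process event bus. This is only a sketch: the `EventBus` class, the event name, and the handlers are hypothetical, and a real distributed system would typically put a message broker between publisher and subscribers and deliver events asynchronously.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-process bus: publishers emit events, subscribers react."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event type -> list of callables

    def subscribe(self, event_type, handler):
        """Register a handler to be called for every event of this type."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Deliver the payload to each subscriber; return how many were notified."""
        handlers = list(self._handlers.get(event_type, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


if __name__ == "__main__":
    bus = EventBus()
    # Two independent consumers; the publisher knows about neither of them.
    bus.subscribe("order_placed", lambda e: print("billing charges order", e["order_id"]))
    bus.subscribe("order_placed", lambda e: print("shipping schedules order", e["order_id"]))
    bus.publish("order_placed", {"order_id": 42})
```

The publisher depends only on the event name, so new consumers can be added without changing it. That decoupling is the property the article contrasts with direct synchronous calls.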

#11 Anthropic Separates Claude API Usage from Subscriptions, Bills at Full Price

Starting June 15, Anthropic is separating programmatic Claude API usage from existing subscription quotas. Subscribers will receive a dedicated monthly credit ranging from $20 to $200, and SDK and third-party requests will be billed at full API rates instead of previous subsidized flat rates.

6.1

#12 Ben's Bites: Using Video Feedback to Enhance AI Agent Workflows

Ben's Bites introduces a novel approach to AI agent feedback using screen recordings and voiceovers to create visual reports. This method generates HTML files with action checklists, aiding agent comprehension and execution. The newsletter also covers recent AI updates from Claude, Google Gemini, Notion, Vercel, Cursor, and Orca, among others.

6.0

#13 ai-cli Allows Rendering Images Directly in the Terminal

The `ai-cli` tool now supports rendering images directly in the terminal. Users can execute commands like `npx ai-cli image 'diagram description'` to access all image, video, and text models from Vercel AI Gateway instantly.

5.9