AI pulse last 7 days
Daily AI pulse from YouTube, blogs, Reddit, HN. Ruthlessly filtered.
Sources (41)
- critical · Andrej Karpathy
Former AI director at Tesla, OpenAI cofounder. Every video is gold.
- critical · Anthropic
Official Anthropic channel. Every Claude release.
- critical · ComfyUI Blog
Release log for ComfyUI integrations: Luma Uni-1, GPT Image 2, ACE-Step music gen, Seedance. Covers video + image + music + workflow.
- critical · OpenAI Blog
Official OpenAI blog. All releases.
- critical · Simon Willison's Weblog
The best AI 'thinker'. Daily posts, deep insights, low hype rate.
- high · AI Explained
Deep analysis of papers and benchmarks, low hype rate.
- high · AI Jason
Practical tutorials on Claude Code, MCP, and vibe-coding workflows.
- high · Ben's Bites
Daily AI digest, creator-friendly tone. Codex, model releases, agentic AI.
- high · Cole Medin
Vibe coding + agentic workflows + Claude Code MCP integrations.
- high · Fal AI Blog
Fal hosts most new AI image/video models; their blog is an early signal for launches.
- high · HN: 3D & Gaussian Splatting
HN signal for generative 3D: Gaussian Splatting, NeRF, image-to-3D. Threshold of 20 points because the category is niche (historic top: 182 pts).
- high · HN: AI agents / MCP
HN posts about agents, MCP, and vibe coding with at least 100 points.
- high · HN: Claude / Anthropic
HN posts mentioning 'Claude' or 'Anthropic' with at least 100 points.
- high · Hugging Face Blog
Releases for image, video, audio, and 3D models. Some posts are tech-heavy; Gemini relevance scoring filters the noise. Downgraded from critical: too much volume for 'must-read' status.
- high · IndyDevDan
Claude Code power user, prompts, hooks.
- high · Interconnects (Nathan Lambert)
AI policy + research analysis. Low hype rate, opinionated.
- high · Latent Space
Swyx's podcast + blog: founder interviews and engineering deep dives.
- high · Matt Wolfe
Comprehensive AI tools weekly digest. ~700K subs.
- high · Matthew Berman
AI news, model release reviews, agent demos. High output.
- high · r/aivideo
AI video community: Sora, Veo, Runway, Kling, LTX. What genuinely surprises creators.
- high · r/ClaudeAI
The Claude community: power users, tips, problems.
- high · r/LocalLLaMA
Open-source LLMs, local inference, benchmarks without the hype.
- high · r/StableDiffusion
The largest open-source image-gen community (700k+ users). Model launches, LoRAs, ComfyUI workflows.
- high · Riley Brown
Vibe coding, AI builder workflows, Cursor + Claude tutorials.
- high · The Decoder
German AI news outlet publishing in English, good breaking-news coverage.
- high · Theo - t3.gg
TypeScript + AI dev workflows. Hot takes, narrative-driven.
- high · Yannic Kilcher
Paper reviews and deep dives into AI research.
- low · AI Weirdness
Janelle Shane: playful AI experiments, image-gen quirks. Low volume, unique perspective.
- medium · bycloud
AI papers made digestible: somewhere between Two Minute Papers and Yannic Kilcher.
- medium · Creative Bloq
Design industry: where AI is encroaching on classic graphic disciplines.
- medium · Fireship
100-second format, often AI/LLM + tech news.
- medium · fxguide
VFX and film industry: ever more AI in the pipeline. A professional perspective.
- medium · Greg Isenberg
Solo-founder vibe: builds products with AI, podcasts with indie hackers.
- medium · r/ChatGPTCoding
Vibe-coding tips, IDE setups, prompts. A mix of all models.
- medium · r/comfyui
ComfyUI workflows: custom nodes, JSON workflows, optimizations.
- medium · r/midjourney
Midjourney community: v7+ launches, style references, prompt patterns.
- medium · r/runwayml
Runway-specific community: feature launches, prompt patterns, comparisons with competitors.
- medium · r/SunoAI
Suno music-gen community: new model versions, lyric-prompting techniques. Audio AI has a weak RSS ecosystem.
- medium · Tina Huang
AI workflows for data science, practical applications.
- medium · Two Minute Papers
Short summaries of AI papers, great for a quick scan.
- medium · Wes Roth
AI news with a more clickbaity tone; the Gemini filter weeds out the hype.

ZAYA1-8B: Frontier intelligence density, trained on AMD
ZAYA1-8B is a new 8B model that reportedly outperforms Llama 3.1 8B, showing that high intelligence density can be achieved on an AMD-based training stack.
Zyphra has released ZAYA1-8B, a new language model designed to maximize intelligence density within the 8-billion parameter class. The model reportedly outperforms Llama 3.1 8B and Gemma 2 9B across several key benchmarks, including MMLU and GSM8K. Notably, ZAYA1-8B was trained entirely on AMD Instinct MI300X accelerators, showcasing a viable alternative to the NVIDIA-dominated training ecosystem. This release targets developers looking for high-performance models that can run efficiently on consumer hardware or edge devices. The architecture focuses on better data efficiency and architectural refinements to squeeze more reasoning capability out of fewer parameters.
r/LocalLLaMA·model_release·05/06/2026, 07:43 PM·/u/carbocation
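For orientation, here is a minimal sketch of loading an 8B-class model with Hugging Face transformers. The repo id "Zyphra/ZAYA1-8B" is an assumption, not confirmed by the post; check Zyphra's release page for the published name and weights license.

```python
# Minimal sketch of loading an 8B model with Hugging Face transformers.
# The repo id "Zyphra/ZAYA1-8B" is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/ZAYA1-8B"  # assumption, verify against the actual release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # an 8B model in bf16 needs roughly 16 GB
    device_map="auto",           # spread across available GPU(s)/CPU
)

prompt = "Explain intelligence density in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```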
An Open Benchmark for Testing RAG on Realistic Company-Internal Data
EnterpriseRAG-Bench provides 500k synthetic documents (Slack, Jira, etc.) to test if your RAG system can actually handle realistic, messy corporate data.
EnterpriseRAG-Bench is a new open-source benchmark designed to test RAG systems on realistic, messy enterprise data. It features a synthetic corpus of 500,000 documents simulating a fictional company, including Slack threads, emails, Jira tickets, and meeting transcripts. Unlike benchmarks based on clean Wikipedia data, this set includes intentional noise like conflicting information, near-duplicates, and misfiled documents. Initial findings released with the benchmark show that traditional BM25 keyword search often outperforms vector search on this type of data. The project includes 500 evaluation questions and a harness to measure retrieval accuracy and answer quality across 10 distinct failure-mode categories.
r/LocalLLaMA·tooling·05/06/2026, 12:58 PM·/u/Weves11
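The BM25-vs-vector finding is easy to reproduce in miniature. Below is a toy sketch using the standard rank_bm25 and sentence-transformers libraries; the documents and query are invented stand-ins, not EnterpriseRAG-Bench data, and the real harness scores retrieval against labeled answers rather than printing raw scores.

```python
# Toy sketch of the BM25-vs-embeddings comparison the benchmark reports.
# Documents and query are stand-ins, not EnterpriseRAG-Bench data.
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "JIRA-1432: payment webhook retries disabled in prod, see thread",
    "Slack #eng: did we ever re-enable webhook retries? (no answer)",
    "Runbook (outdated): webhook retries are always on by default",
]
query = "are payment webhook retries enabled in production?"

# Keyword retrieval: BM25 over whitespace tokens, robust to ticket ids and jargon.
bm25 = BM25Okapi([d.lower().split() for d in docs])
bm25_scores = bm25.get_scores(query.lower().split())

# Dense retrieval: cosine similarity of normalized sentence embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
emb = encoder.encode(docs + [query], normalize_embeddings=True)
dense_scores = emb[:-1] @ emb[-1]

for doc, b, v in zip(docs, bm25_scores, dense_scores):
    print(f"bm25={b:5.2f}  dense={v:5.2f}  {doc}")
```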
Quality comparison between Qwen 3.6 27B quantizations (BF16, Q8_0, Q6_K, Q5_K_XL, Q4_K_XL, IQ4_XS, IQ3_XXS,...)
For 16GB VRAM users, Qwen 3.6 27B at IQ4_XS quantization is the ideal choice, balancing high-quality reasoning (like SVG generation) with usable local performance.
A detailed community benchmark by /u/bobaburger compares various quantization levels of the Qwen 3.6 27B model to find the optimal balance for 16GB VRAM hardware. The test uses a creative and difficult task: tracking a non-standard chess game from PGN and rendering the board state as functional SVG code. Results show that while BF16 and Q8 are near-perfect, IQ4_XS emerges as the recommended 'sweet spot' for consumer GPUs, maintaining spatial reasoning where lower quants (Q3 and below) fail. The author also demonstrates significant performance gains using the TurboQuant fork of llama.cpp, reaching 22 tokens per second on an RTX 5060 Ti.
r/LocalLLaMA·tooling·05/06/2026, 05:10 AM·/u/bobaburger
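As a reference point, this is roughly what running such a quant locally looks like with the stock llama-cpp-python bindings (the TurboQuant fork mentioned in the post is a separate llama.cpp build). The GGUF filename is hypothetical.

```python
# Minimal sketch of running an IQ4_XS GGUF quant with llama-cpp-python.
# The filename is hypothetical; use whatever quant file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3.6-27b-IQ4_XS.gguf",  # hypothetical quant file
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload all layers; IQ4_XS of a 27B fits in ~16 GB VRAM
)

out = llm(
    "Render this chess position as SVG: ...",  # the post's test task, elided
    max_tokens=1024,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```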
Dense Model Shoot-Off: Gemma 4 31B vs Qwen3.6/5 27B... Result is Slower is Faster.
Gemma 4 31B proves that token efficiency beats raw speed: it completes tasks faster than Qwen 3.6 by being smarter with every token generated.
A performance comparison between Google's Gemma 4 31B and Alibaba's Qwen 3.6/3.5 27B highlights a critical distinction between raw inference speed and task completion time. While Qwen models often achieve higher scores on synthetic benchmarks, Gemma 4 demonstrates superior token efficiency, requiring fewer tokens to generate accurate responses. This creates a 'slower is faster' scenario where Gemma, despite having lower tokens-per-second throughput due to its larger size, finishes complex tasks more quickly than its competitors. The analysis suggests that Qwen may be 'benchmaxxed', i.e. optimized specifically for test scores, whereas Gemma offers higher intelligence density for real-world use. Local LLM enthusiasts are now looking forward to further optimizations like DFlash and MTP to enhance Gemma's performance.
r/LocalLLaMA·news·05/05/2026, 06:12 PM·/u/MiaBchDave
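The 'slower is faster' effect is plain arithmetic: wall-clock time is tokens generated divided by generation speed, so a terser model can win despite lower throughput. The figures below are illustrative, not measurements from the post.

```python
# Wall-clock time for a task = tokens generated / tokens per second.
# Numbers are illustrative, not from the post.
def task_seconds(tokens_needed: int, tokens_per_sec: float) -> float:
    return tokens_needed / tokens_per_sec

# A verbose fast model vs. a terse slower one (hypothetical figures).
qwen_time = task_seconds(tokens_needed=2400, tokens_per_sec=30.0)   # 80 s
gemma_time = task_seconds(tokens_needed=1200, tokens_per_sec=22.0)  # ~55 s

print(f"Qwen-style:  {qwen_time:.0f} s")
print(f"Gemma-style: {gemma_time:.0f} s  (fewer tokens wins despite lower t/s)")
```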
DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark — 10 weeks later, ~17× cheaper
DeepSeek V4 Pro delivers GPT-5.2 level agentic performance at 1/17th the cost, effectively closing the US-China AI gap to just 10 weeks.
DeepSeek V4 Pro has demonstrated performance parity with GPT-5.2 on the FoodTruck Bench, a rigorous 30-day agentic simulation requiring the use of 34 distinct tools and persistent memory. While ranking #4 overall, the model stays within 3% of GPT-5.2's median performance and shows superior consistency compared to Grok 4.3, with significantly less resource waste. The most significant disruption is the pricing: at $0.435/M input, it is approximately 17 times cheaper than GPT-5.2 for identical agentic workloads. This release marks a significant closing of the US-China frontier gap, now estimated at just ten weeks. The benchmark also saw a strong debut from Xiaomi’s MiMo v2.5 Pro, further populating the leaderboard with high-efficiency Chinese models.
r/LocalLLaMA·model_release·05/05/2026, 06:51 AM·/u/Disastrous_Theme5906
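The ~17x figure is simple per-token cost arithmetic. Only DeepSeek's $0.435/M input price comes from the post; the GPT-5.2 price and the token volume below are assumptions chosen to match the stated ratio.

```python
# Sketch of the cost arithmetic behind the ~17x claim. Only DeepSeek's
# $0.435/M input price is from the post; the GPT-5.2 price and the
# token volume are illustrative assumptions.
def run_cost(input_tokens: int, price_per_million: float) -> float:
    return input_tokens / 1_000_000 * price_per_million

AGENT_RUN_TOKENS = 50_000_000  # assumed input volume for a 30-day agent run

deepseek = run_cost(AGENT_RUN_TOKENS, price_per_million=0.435)  # from the post
gpt52 = run_cost(AGENT_RUN_TOKENS, price_per_million=7.40)      # assumed, ~17x

print(f"DeepSeek V4 Pro: ${deepseek:,.2f}")
print(f"GPT-5.2:         ${gpt52:,.2f}  (~{gpt52 / deepseek:.0f}x)")
```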
Relevance auto-scored by LLM (0–10). List shows top 30 from the last 7 days.