AI pulse last 7 days
Daily AI pulse from YouTube, blogs, Reddit, HN. Ruthlessly filtered.
Sources (41)
- critical · Andrej Karpathy
Former director of AI at Tesla, OpenAI cofounder. Every video is gold.
- critical · Anthropic
Official Anthropic channel. Every Claude release.
- critical · ComfyUI Blog
Release log for ComfyUI integrations: Luma Uni-1, GPT Image 2, ACE-Step music gen, Seedance. Covers video, image, music, and workflow.
- critical · OpenAI Blog
Official OpenAI blog. All releases.
- critical · Simon Willison's Weblog
The best AI 'thinker'. Daily posts, deep insights, low hype rate.
- high · AI Explained
In-depth analysis of papers and benchmarks, low hype rate.
- high · AI Jason
Practical tutorials on Claude Code, MCP, and vibe coding workflows.
- high · Ben's Bites
Daily AI digest, creator-friendly tone. Codex, model releases, agentic AI.
- high · Cole Medin
Vibe coding + agentic workflows + Claude Code MCP integrations.
- high · Fal AI Blog
Fal hosts most new AI image/video models, so its blog is an early signal of launches.
- high · HN: 3D & Gaussian Splatting
HN signal for 3D generative work: Gaussian Splatting, NeRF, image-to-3D. Threshold is 20 because this is a niche category (historic top: 182 pts).
- high · HN: AI agents / MCP
HN posts about agents, MCP, and vibe coding with at least 100 points.
- high · HN: Claude / Anthropic
HN posts mentioning 'Claude' or 'Anthropic' with at least 100 points.
- high · Hugging Face Blog
Releases for image, video, audio, and 3D models. Partly tech-heavy; the Gemini relevance filter removes the noise. Downgraded from critical: too much volume for 'must-read' status.
- high · IndyDevDan
Claude Code power user, prompts, hooks.
- high · Interconnects (Nathan Lambert)
AI policy + research analysis. Low hype rate, opinionated.
- high · Latent Space
Swyx's podcast + blog: interviews with founders and engineering deep dives.
- high · Matt Wolfe
Comprehensive AI tools weekly digest. ~700K subs.
- high · Matthew Berman
AI news, model release reviews, agent demos. High output.
- high · r/aivideo
AI video community: Sora, Veo, Runway, Kling, LTX. What actually surprises creators.
- high · r/ClaudeAI
The Claude community: power users, tips, problems.
- high · r/LocalLLaMA
Open-source LLMs, running models locally, benchmarks without hype.
- high · r/StableDiffusion
The largest open-source image gen community (700k+ users). Model launches, LoRA, ComfyUI workflows.
- high · Riley Brown
Vibe coding, AI builder workflows, Cursor + Claude tutorials.
- high · The Decoder
German AI news outlet publishing in English, good breaking news.
- high · Theo - t3.gg
TypeScript + AI dev workflows. Hot takes, narrative-driven.
- high · Yannic Kilcher
Paper reviews and deep dives into AI research.
- low · AI Weirdness
Janelle Shane: playful AI experiments, image gen quirks. Low volume, unique perspective.
- medium · bycloud
AI papers made digestible; sits between Two Minute Papers (2MP) and Yannic Kilcher.
- medium · Creative Bloq
Design industry: where AI is moving into the classic graphic design disciplines.
- medium · Fireship
100-sec format, often AI/LLM + tech news.
- medium · fxguide
VFX and film industry, with more and more AI in the pipeline. Professional perspective.
- medium · Greg Isenberg
Solo founder vibe: builds products with AI, podcasts with indie hackers.
- medium · r/ChatGPTCoding
Vibe coding tips, IDE setups, prompts. A mix of all models.
- medium · r/comfyui
ComfyUI workflows: custom nodes, JSON workflows, optimizations.
- medium · r/midjourney
Midjourney community: v7+ launches, style references, prompt patterns.
- medium · r/runwayml
Runway-specific community: feature launches, prompt patterns, comparisons with competitors.
- medium · r/SunoAI
Suno music gen community: new model versions, lyric prompting techniques. Audio AI has a weak RSS ecosystem.
- medium · Tina Huang
AI workflows for data science, practical applications.
- medium · Two Minute Papers
Short summaries of AI papers, great for a quick scan.
- medium · Wes Roth
AI news with a more clickbait tone; the Gemini filter screens out the hype.
Need advice on hardware purchasing decision: RTX 5090 vs. M5 Max 128GB for agentic software development
Choosing between Nvidia and Apple for local AI coding: RTX 5090 wins on raw speed for fast iterations, while M5 Max wins on memory capacity for massive codebases.
This discussion evaluates the trade-offs between the RTX 5090 and M5 Max (128GB) for local agentic software development using models like Qwen 3.6 27B. The RTX 5090 provides approximately 3x faster token generation, which is vital for rapid code iteration, but its 32GB VRAM limits context windows and quantization levels (Q4/Q5). Conversely, the M5 Max's 128GB of unified memory supports massive context and higher precision models, though at significantly lower speeds. The author considers a multi-agent setup where a high-level orchestrator manages faster sub-agents for codebase exploration. Technical factors like Multi-Token Prediction (MTP) and MLX optimizations are highlighted as potential game-changers for Apple Silicon's usability in agentic workflows.
r/LocalLLaMA·tooling·05/07/2026, 12:34 AM·/u/BawbbySmith
ProgramBench: Can we really rebuild huge binaries from scratch? (doesn't look like it)
ProgramBench is a new, rigorous benchmark from Meta Research that tests if LLM agents can rebuild entire programs from scratch using only binaries and documentation.
Meta Research has introduced ProgramBench, a benchmark designed to evaluate how well LLM agents can reconstruct complex software from scratch. Unlike previous case studies that relied on hand-tuned setups, this framework includes 200 diverse tasks and 6 million lines of behavioral tests to prevent cheating and ensure robustness. Agents are provided only with a target executable and a README, forcing them to architect the entire system without internet access or decompilation. Initial results show that even top-tier closed-source models struggle, while open-source models underperform due to potential overfitting on older benchmarks like SWE-bench. The project is fully open-sourced, including Docker images and a CLI tool for easy evaluation.
r/LocalLLaMA·tooling·05/05/2026, 03:40 PM·/u/klieret
DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark — 10 weeks later, ~17× cheaper
DeepSeek V4 Pro delivers GPT-5.2-level agentic performance at 1/17th the cost, narrowing the US-China AI gap to just 10 weeks.
DeepSeek V4 Pro has demonstrated performance parity with GPT-5.2 on the FoodTruck Bench, a rigorous 30-day agentic simulation requiring the use of 34 distinct tools and persistent memory. While ranking #4 overall, the model stays within 3% of GPT-5.2's median performance and shows superior consistency compared to Grok 4.3, with significantly less resource waste. The most significant disruption is the pricing: at $0.435/M input, it is approximately 17 times cheaper than GPT-5.2 for identical agentic workloads. This release marks a significant closing of the US-China frontier gap, now estimated at just ten weeks. The benchmark also saw a strong debut from Xiaomi’s MiMo v2.5 Pro, further populating the leaderboard with high-efficiency Chinese models.
r/LocalLLaMA·model_release·05/05/2026, 06:51 AM·/u/Disastrous_Theme5906
DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark — 10 weeks later, ~17× cheaper
DeepSeek V4 Pro offers GPT-5.2 level agentic performance at 1/17th the cost, narrowing the US-China AI gap to just 10 weeks.
DeepSeek V4 Pro has achieved performance parity with GPT-5.2 on the FoodTruck Bench, a complex 30-day agentic simulation involving 34 tools and persistent memory. While GPT-5.2 was tested in February, DeepSeek matched its results only ten weeks later, signaling a rapid closing of the gap between US and Chinese frontier models. Crucially, DeepSeek is approximately 17 times cheaper for agentic workloads, with significantly lower input/output pricing. The model also demonstrated superior consistency compared to Grok 4.3, showing lower variance in outcomes and better resource management. Additionally, Xiaomi’s MiMo v2.5 Pro also entered the top 6, further establishing Chinese models as high-value competitors in the frontier tier.
r/LocalLLaMA·model_release·05/05/2026, 06:51 AM·/u/Disastrous_Theme5906
Relevance auto-scored by LLM (0–10). List shows top 30 from the last 7 days.