AI pulse last 7 days
Daily AI pulse from YouTube, blogs, Reddit, HN. Ruthlessly filtered.
Sources (41)
- critical · Andrej Karpathy
Former Tesla AI director, OpenAI cofounder. Every video is gold.
- critical · Anthropic
Official Anthropic channel. Every Claude release.
- critical · ComfyUI Blog
Release log for ComfyUI integrations — Luma Uni-1, GPT Image 2, ACE-Step music gen, Seedance. Covers video + image + music + workflows.
- critical · OpenAI Blog
Official OpenAI blog. Every release.
- critical · Simon Willison's Weblog
The best AI 'thinker'. Daily posts, deep insights, low hype rate.
- high · AI Explained
Deep analysis of papers and benchmarks, low hype rate.
- high · AI Jason
Practical tutorials on Claude Code, MCP, and vibe-coding workflows.
- high · Ben's Bites
Daily AI digest, creator-friendly tone. Codex, model releases, agentic AI.
- high · Cole Medin
Vibe coding + agentic workflows + Claude Code MCP integrations.
- high · Fal AI Blog
Fal hosts most new AI image/video models — their blog is an early signal for launches.
- high · HN: 3D & Gaussian Splatting
HN signal for generative 3D — Gaussian Splatting, NeRF, image-to-3D. Threshold of 20 points because the category is niche (historic top: 182 pts).
- high · HN: AI agents / MCP
HN posts about agents, MCP, and vibe coding with a 100-point minimum.
- high · HN: Claude / Anthropic
HN posts mentioning 'Claude' or 'Anthropic' with a 100-point minimum.
- high · Hugging Face Blog
Releases of image, video, audio, and 3D models. Partly tech-heavy — Gemini relevance scoring filters out the noise. Downgraded from critical: too much volume for 'must-read' status.
- high · IndyDevDan
Claude Code power user, prompts, hooks.
- high · Interconnects (Nathan Lambert)
AI policy + research analysis. Low hype rate, opinionated.
- high · Latent Space
Swyx's podcast + blog — founder interviews and engineering deep dives.
- high · Matt Wolfe
Comprehensive weekly digest of AI tools. ~700K subs.
- high · Matthew Berman
AI news, model release reviews, agent demos. High output.
- high · r/aivideo
AI video community — Sora, Veo, Runway, Kling, LTX. What genuinely surprises creators.
- high · r/ClaudeAI
Claude community — power users, tips, problems.
- high · r/LocalLLaMA
Open-source LLMs, local inference, benchmarks without the hype.
- high · r/StableDiffusion
Largest open-source image gen community (700k+ users). Model releases, LoRAs, ComfyUI workflows.
- high · Riley Brown
Vibe coding, AI builder workflows, Cursor + Claude tutorials.
- high · The Decoder
German AI news outlet in English, good breaking coverage.
- high · Theo - t3.gg
TypeScript + AI dev workflows. Hot takes, narrative-driven.
- high · Yannic Kilcher
Paper reviews and deep dives into AI research.
- low · AI Weirdness
Janelle Shane — playful AI experiments, image gen quirks. Low volume, unique perspective.
- medium · bycloud
AI papers made digestible — somewhere between Two Minute Papers and Yannic Kilcher.
- medium · Creative Bloq
Design industry — where AI is encroaching on classic graphic disciplines.
- medium · Fireship
100-second format, often AI/LLM + tech news.
- medium · fxguide
VFX and film industry — more and more AI in the pipeline. A professional's perspective.
- medium · Greg Isenberg
Solo-founder vibe — builds products with AI, podcasts with indie hackers.
- medium · r/ChatGPTCoding
Vibe-coding tips, IDE setups, prompts. A mix of all models.
- medium · r/comfyui
ComfyUI workflows — custom nodes, JSON workflows, optimizations.
- medium · r/midjourney
Midjourney community — v7+ releases, style references, prompt patterns.
- medium · r/runwayml
Runway-specific community — feature launches, prompt patterns, comparisons with competitors.
- medium · r/SunoAI
Suno music gen community — new model versions, lyric prompting techniques. Audio AI has a weak RSS ecosystem.
- medium · Tina Huang
AI workflows for data science, practical applications.
- medium · Two Minute Papers
Short AI paper summaries, great for a quick scan.
- medium · Wes Roth
AI news with a more clickbaity tone — the Gemini filter weeds out the hype.
Exaggerated PCIe bandwidth concerns?
PCIe bandwidth concerns for multi-GPU setups are likely exaggerated; even a 4.0 x4 link handles high-speed prefill for mid-range cards using vLLM and Tensor Parallelism.
A user on r/LocalLLaMA ran benchmarks to test whether PCIe bandwidth is a real bottleneck for multi-GPU local LLM setups on consumer hardware. Using two RTX 5060 Ti 16GB cards with vLLM and Tensor Parallelism (TP=2), they found that peak bandwidth during prefill reached only 3-4 GB/s — about 50% of the roughly 8 GB/s a PCIe 4.0 x4 slot provides — suggesting that even limited chipset-connected slots are sufficient for mid-range cards. The test used high-speed quants like NVFP4, achieving prefill rates up to 1700 t/s. These findings suggest hobbyists can scale to 3 or 4 GPUs using M.2 adapters without needing expensive workstation-grade motherboards; a sketch for reproducing the bandwidth measurement follows below.
r/LocalLLaMA·news·05/06/2026, 07:54 PM·/u/ziphnor
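For anyone wanting to reproduce this kind of measurement, NVML exposes per-device PCIe throughput counters. A minimal sketch, assuming the nvidia-ml-py (pynvml) package and the two TP=2 GPUs at indices 0 and 1 — none of this comes from the original post; run it in a second terminal while the vLLM benchmark is prefilling:

```python
# Sketch: watch per-GPU PCIe traffic while a vLLM benchmark runs.
# Assumes the nvidia-ml-py package (`pip install nvidia-ml-py`);
# adjust the GPU indices for your topology.
import time
import pynvml

pynvml.nvmlInit()
handles = [pynvml.nvmlDeviceGetHandleByIndex(i) for i in (0, 1)]

try:
    while True:
        for i, h in enumerate(handles):
            # NVML reports KB/s averaged over a ~20 ms sampling window.
            rx = pynvml.nvmlDeviceGetPcieThroughput(h, pynvml.NVML_PCIE_UTIL_RX_BYTES)
            tx = pynvml.nvmlDeviceGetPcieThroughput(h, pynvml.NVML_PCIE_UTIL_TX_BYTES)
            print(f"GPU{i}: rx={rx / 1e6:.2f} GB/s  tx={tx / 1e6:.2f} GB/s")
        time.sleep(0.5)
except KeyboardInterrupt:
    pynvml.nvmlShutdown()
```

If the printed peaks stay well under your slot's ceiling during prefill, the link is not the bottleneck for that workload.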
vLLM V0 to V1: Correctness Before Corrections in RL
vLLM V1 is a major upgrade optimized for RL and reasoning models, focusing on output correctness and significantly better inference performance.
vLLM is transitioning from V0 to V1, marking a major architectural overhaul focused on Reinforcement Learning (RL) workflows. The update emphasizes a 'Correctness Before Corrections' philosophy, addressing the critical need for high-fidelity outputs in complex reasoning tasks. This shift is particularly relevant for serving modern models like DeepSeek-R1 that rely on long-chain reasoning and RL-based optimization. The new version aims to significantly reduce overhead and improve throughput while maintaining strict output validation. It represents a move towards more robust, production-ready inference for the next generation of agentic and reasoning LLMs. A minimal opt-in sketch follows below.
Hugging Face Blog·tooling·05/06/2026, 07:06 PM
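During the V0/V1 overlap, vLLM exposed an environment-variable opt-in (VLLM_USE_V1) before V1 became the default in later releases. A minimal sketch of opting in; the model choice and prompt are illustrative, not from the post:

```python
# Sketch: opting in to the vLLM V1 engine during the V0 -> V1 transition.
# VLLM_USE_V1 is the transition-era opt-in env var (V1 is the default in
# newer releases); the model here is an illustrative reasoning model.
import os
os.environ["VLLM_USE_V1"] = "1"  # must be set before vllm is imported

from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
params = SamplingParams(temperature=0.6, max_tokens=512)

outputs = llm.generate(["Think step by step: what is 17 * 24?"], params)
print(outputs[0].outputs[0].text)
```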
Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM
You can now run Qwen3.6 27B with a massive 200k context window on a single RTX 5090 using NVFP4 quantization and vLLM.
A user successfully ran Qwen3.6 27B on a single RTX 5090 with 32GB VRAM, achieving a stable 200k context window. The setup utilizes NVFP4 quantization via the compressed-tensors library and vLLM's MTP (Multi-Token Prediction) for speculative decoding. Benchmarks show generation speeds of 65-75 tokens/second at 200k context, with TTFT (Time To First Token) dropping significantly when using prefix caching. This configuration demonstrates the potential of Blackwell's FP4 support for handling large-scale local inference. The author provides exact vLLM parameters and stability data for others to replicate the results on consumer hardware; an illustrative configuration sketch follows below.
r/LocalLLaMA·tooling·05/06/2026, 02:05 PM·/u/Maheidem
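The author's exact flags are in the original thread; as a rough illustration of the shape of such a setup, here is a hedged sketch using vLLM's offline LLM API. The model ID, the speculative_config keys, and the memory setting are all illustrative assumptions, not the author's values:

```python
# Sketch only: the shape of a single-GPU 200k-context NVFP4 + MTP setup.
# Model ID, speculative_config keys, and memory setting are assumptions;
# the author's exact parameters are in the original thread.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.6-27B-NVFP4",   # hypothetical compressed-tensors NVFP4 checkpoint;
                                      # vLLM reads the quant scheme from the model config
    max_model_len=200_000,            # the 200k context window from the post
    gpu_memory_utilization=0.95,      # squeeze the KV cache into a single 32GB RTX 5090
    speculative_config={              # MTP-style speculative decoding (keys assumed)
        "method": "mtp",
        "num_speculative_tokens": 1,
    },
)

out = llm.generate(["Summarize the trade-offs of FP4 inference."],
                   SamplingParams(max_tokens=256))
print(out[0].outputs[0].text)
```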
Relevance auto-scored by LLM (0–10). List shows top 30 from the last 7 days.