AI pulse last 7 days
Daily AI pulse from YouTube, blogs, Reddit, HN. Ruthlessly filtered.
Sources (41)
- [critical] Andrej Karpathy
Former Tesla AI director, OpenAI cofounder. Every video is gold.
- [critical] Anthropic
Official Anthropic channel. Every Claude release.
- [critical] ComfyUI Blog
Release log for ComfyUI integrations: Luma Uni-1, GPT Image 2, ACE-Step music gen, Seedance. Covers video, image, music, and workflows.
- [critical] OpenAI Blog
Official OpenAI blog. All releases.
- [critical] Simon Willison's Weblog
The best AI 'thinker'. Daily posts, deep insights, low hype rate.
- [high] AI Explained
Deep analysis of papers and benchmarks, low hype rate.
- [high] AI Jason
Practical tutorials on Claude Code, MCP, and vibe-coding workflows.
- [high] Ben's Bites
Daily AI digest, creator-friendly tone. Codex, model releases, agentic AI.
- [high] Cole Medin
Vibe coding + agentic workflows + Claude Code MCP integrations.
- [high] Fal AI Blog
Fal hosts most new AI image/video models; their blog gives early signals of upcoming releases.
- [high] HN: 3D & Gaussian Splatting
HN signal for generative 3D: Gaussian Splatting, NeRF, image-to-3D. Threshold of 20 points because it is a niche category (historic top: 182 pts).
- [high] HN: AI agents / MCP
HN posts about agents, MCP, and vibe coding with at least 100 points.
- [high] HN: Claude / Anthropic
HN posts mentioning 'Claude' or 'Anthropic' with at least 100 points.
- [high] Hugging Face Blog
Releases for image, video, audio, and 3D models. Some posts are tech-heavy; Gemini relevance scoring filters out the noise. Downgraded from critical: too much volume for 'must-read' status.
- [high] IndyDevDan
Claude Code power user, prompts, hooks.
- [high] Interconnects (Nathan Lambert)
AI policy and research analysis. Low hype rate, opinionated.
- [high] Latent Space
Swyx's podcast and blog: founder interviews and engineering deep dives.
- [high] Matt Wolfe
Comprehensive weekly digest of AI tools. ~700K subs.
- [high] Matthew Berman
AI news, model release reviews, agent demos. High output.
- [high] r/aivideo
AI video community: Sora, Veo, Runway, Kling, LTX. What actually surprises creators.
- [high] r/ClaudeAI
The Claude community: power users, tips, problems.
- [high] r/LocalLLaMA
Open-source LLMs, local inference, benchmarks without the hype.
- [high] r/StableDiffusion
The largest open-source image-gen community (700k+ users). Model releases, LoRAs, ComfyUI workflows.
- [high] Riley Brown
Vibe coding, AI builder workflows, Cursor + Claude tutorials.
- [high] The Decoder
German AI news outlet publishing in English, good breaking news.
- [high] Theo - t3.gg
TypeScript + AI dev workflows. Hot takes, narrative-driven.
- [high] Yannic Kilcher
Paper reviews and deep dives into AI research.
- [low] AI Weirdness
Janelle Shane: playful AI experiments, image-gen quirks. Low volume, unique perspective.
- [medium] bycloud
AI papers made digestible; sits somewhere between Two Minute Papers and Yannic Kilcher.
- [medium] Creative Bloq
Design industry: where AI is pushing into the classic graphic disciplines.
- [medium] Fireship
100-second format, often AI/LLM and tech news.
- [medium] fxguide
VFX and film industry: more and more AI in the pipeline. A professional perspective.
- [medium] Greg Isenberg
Solo-founder vibe: builds products with AI, podcasts with indie hackers.
- [medium] r/ChatGPTCoding
Vibe-coding tips, IDE setups, prompts. A mix of all models.
- [medium] r/comfyui
ComfyUI workflows: custom nodes, JSON workflows, optimizations.
- [medium] r/midjourney
Midjourney community: v7+ releases, style references, prompt patterns.
- [medium] r/runwayml
Runway-specific community: feature releases, prompt patterns, comparisons with competitors.
- [medium] r/SunoAI
Suno music-gen community: new model versions, lyric-prompting techniques. Audio AI has a weak RSS ecosystem.
- [medium] Tina Huang
AI workflows for data science, practical applications.
- [medium] Two Minute Papers
Short summaries of AI papers, great for a quick scan.
- [medium] Wes Roth
AI news with a more clickbaity tone; the Gemini filter weeds out the hype.
Open-sourcing Banodoco Hivemind: 1M+ Discord messages from artists and engineers working deeply with open image/video models, packaged as an agent skill
A massive dataset of real-world discussions from artists and engineers using open image/video AI models is now available, offering a unique resource for building smarter creative…
The Banodoco Hivemind, a substantial dataset comprising over 1 million Discord messages from artists and engineers, has been open-sourced. This collection captures deep, practical discussions around open image and video AI models, offering insights into real-world usage, problem-solving, and creative applications. Packaged as an "agent skill," this resource is designed to enhance the capabilities of AI agents, allowing them to better understand and assist users in creative workflows. It provides a novel foundation for developing more context-aware and helpful AI assistants, moving beyond generic training data to specialized, community-driven knowledge.
r/comfyui·tooling·05/07/2026, 01:30 PM·/u/PetersOdyssey
I built a tool to mix two artists on one image with region masks — Van Gogh + Picasso, no training, arbitrary refs
Mix different artistic styles in specific parts of an image using masks and IP-Adapters without any training or fine-tuning.
A new open-source tool allows users to apply distinct artistic styles to specific regions of an image using spatial masks. Built on Stable Diffusion 1.5, the system utilizes ControlNet (Canny and Tile) for structural integrity and two IP-Adapters for style injection. The technical core involves spatial routing, where each adapter's contribution is masked within the cross-attention layers to prevent 'muddy' averaging of styles. It offers three modes: global mixing, painterly emphasis, and region-specific stylization. While effective, the author notes that aggressive style weights can distort realistic faces and small color details. The project includes a GitHub repository with a Colab notebook and a Hugging Face Space for testing.
r/StableDiffusion·tooling·05/07/2026, 09:24 AM·/u/Longjumping_Gur_937
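The post above doesn't include code, but diffusers' documented IP-Adapter attention masking implements the same core trick: each style reference only contributes inside its masked region of cross-attention. A minimal sketch of that approach, with placeholder file names and an SD 1.5 base checkpoint; this is not the author's tool:
```python
import torch
from diffusers import AutoPipelineForText2Image
from diffusers.image_processor import IPAdapterMaskProcessor
from diffusers.utils import load_image

# SD 1.5 base; swap in whichever checkpoint you actually use.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")

# Binary masks (white = where each style reference applies); file names are placeholders.
masks = IPAdapterMaskProcessor().preprocess(
    [load_image("mask_left.png"), load_image("mask_right.png")],
    height=512, width=512,
)
# One adapter with two reference images: group the masks per adapter,
# as in the diffusers IP-Adapter masking docs.
masks = [masks.reshape(1, masks.shape[0], masks.shape[2], masks.shape[3])]

pipe.set_ip_adapter_scale([[0.6, 0.6]])  # per-reference style strength
image = pipe(
    prompt="a portrait, painterly",
    ip_adapter_image=[[load_image("van_gogh_ref.png"), load_image("picasso_ref.png")]],
    cross_attention_kwargs={"ip_adapter_masks": masks},
    num_inference_steps=30,
).images[0]
image.save("mixed_styles.png")
```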
So Far This is My Favorite Use-Case for LTX 2.3/ComfyUI
Discover a practical workflow for using the LTX 2.3 video model in ComfyUI to achieve high-quality, consistent video generation on local hardware.
The Reddit community is exploring the capabilities of LTX 2.3, a new video generation model, specifically within the ComfyUI node-based interface. This post demonstrates a high-quality use-case that highlights the model's strengths in temporal consistency and motion fidelity. LTX 2.3 is designed to be more accessible for local execution on consumer GPUs than previous state-of-the-art video models. The author's workflow provides a practical example of how to integrate this model into complex creative pipelines. This demonstration is particularly valuable for creators looking for alternatives to closed-source video tools like Runway or Luma.
r/StableDiffusion·tooling·05/07/2026, 08:33 AM·/u/optimisoprimeo
My Claude dreams at night and remembers everything. Better than mempalace.
A new open-source MCP server that gives Claude persistent long-term memory across sessions using local embeddings and background consolidation.
Developer /u/Mental-Spray-5263 has released iai-mcp, an open-source local daemon designed to provide Claude with persistent long-term memory across different sessions. The tool captures conversations and organizes them into three memory tiers, automatically feeding relevant context back into new chats without manual copy-pasting. It utilizes local neural embeddings for retrieval and AES-256 encryption for security, ensuring data stays private. A standout feature is background consolidation, where the system optimizes and links memories while the machine is idle. Performance benchmarks show over 99% verbatim recall and retrieval times under 100ms, with a session-start overhead of approximately 3,000 tokens.
r/ClaudeAI·tooling·05/07/2026, 03:08 AM·/u/Mental-Spray-5263
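For a sense of what "an MCP server that feeds memories back into new chats" looks like mechanically, here is a toy server built with the official mcp Python SDK's FastMCP helper. The in-memory store and word-overlap scoring are invented stand-ins; iai-mcp's memory tiers, embeddings, encryption, and background consolidation are not reproduced here:
```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("toy-memory")   # a stdio MCP server a client like Claude can be pointed at
_store: list[str] = []        # in-memory only; the real tool persists and encrypts

@mcp.tool()
def remember(note: str) -> str:
    """Save a note into long-term memory."""
    _store.append(note)
    return f"stored ({len(_store)} memories)"

@mcp.tool()
def recall(query: str, top_k: int = 3) -> list[str]:
    """Return the memories sharing the most words with the query
    (a crude stand-in for embedding-based retrieval)."""
    q = set(query.lower().split())
    ranked = sorted(_store, key=lambda m: -len(q & set(m.lower().split())))
    return ranked[:top_k]

if __name__ == "__main__":
    mcp.run()   # expose the tools over stdio to any MCP-capable client
```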
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
ParoQuant is a new quantization method that preserves the reasoning and logic capabilities of LLMs at low bitrates better than standard techniques.
ParoQuant introduces Pairwise Rotation Quantization, a novel technique designed to minimize information loss during the compression of reasoning-heavy LLMs. Unlike standard quantization methods that often degrade complex logic chains, ParoQuant uses a pairwise approach to handle outlier weights more effectively. The release includes a dedicated GitHub repository and pre-quantized models on HuggingFace for immediate testing. This is particularly significant for users running large reasoning models on consumer hardware where VRAM is limited. Initial benchmarks suggest superior performance in maintaining Chain of Thought (CoT) coherence compared to traditional 4-bit methods.
r/LocalLLaMA·tooling·05/07/2026, 02:07 AM·/u/Total-Resort-3120
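The post doesn't spell out the mechanics, but the intuition behind rotation-based quantization is easy to demonstrate: rotating a weight pair that contains an outlier balances the two magnitudes, so a shared 4-bit scale wastes fewer levels on the outlier. A toy NumPy illustration of that general idea, not ParoQuant's actual algorithm:
```python
import numpy as np

def quantize_4bit(x):
    """Symmetric 4-bit quantization with one scale per pair."""
    scale = np.abs(x).max() / 7.0
    return np.round(x / scale) * scale

w = np.array([10.0, 0.8])   # one outlier, one small weight

# Direct quantization: the shared scale is dominated by the outlier.
err_plain = np.linalg.norm(quantize_4bit(w) - w)

# Rotate the pair 45 degrees, quantize, rotate back. The rotation is exactly
# invertible, so it adds no error of its own.
c = s = 1 / np.sqrt(2)
R = np.array([[c, -s], [s, c]])
err_rot = np.linalg.norm(R.T @ quantize_4bit(R @ w) - w)

print(f"plain:   {err_plain:.3f}")   # ~0.63
print(f"rotated: {err_rot:.3f}")     # ~0.04, an order of magnitude lower
```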
Clippy Reloaded - a really sarky useful Clipboard node with no click.
Automatically import your system clipboard into ComfyUI workflows every time you queue a prompt, eliminating manual pasting.
Clippy Reloaded is a custom node for ComfyUI designed to streamline the process of getting text into your workflows. Instead of manually pasting text into a node, this tool automatically pulls whatever is currently in your system clipboard the moment you queue a prompt. This is particularly useful for users who frequently copy prompts, descriptions, or parameters from external websites or LLM chats. The node eliminates repetitive clicking and pasting, acting as a dynamic input source. It is available as an open-source repository on GitHub for easy integration into existing ComfyUI setups.
r/StableDiffusion·tooling·05/07/2026, 12:11 AM·/u/shootthesound
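Not the author's code, but the skeleton of such a node is small. A sketch of how a ComfyUI custom node could emit the clipboard as a STRING output and force re-evaluation on every queued prompt; it assumes pyperclip is installed, and the class and display names are made up:
```python
import time
import pyperclip

class ClipboardTextNode:
    """Outputs the current system clipboard as a STRING socket."""
    CATEGORY = "utils"
    RETURN_TYPES = ("STRING",)
    FUNCTION = "read"

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {}}

    @classmethod
    def IS_CHANGED(cls, **kwargs):
        # Return a fresh value every time so ComfyUI re-runs the node on each
        # queued prompt instead of reusing a cached output.
        return time.time()

    def read(self):
        return (pyperclip.paste(),)

NODE_CLASS_MAPPINGS = {"ClipboardTextNode": ClipboardTextNode}
NODE_DISPLAY_NAME_MAPPINGS = {"ClipboardTextNode": "Clipboard Text"}
```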
[WIP] ComfyUI Powered Klein 2 KV Edit i2i plugin (Chromium)
A browser sidebar plugin that lets you perform advanced image-to-image edits via ComfyUI using the Klein 2 KV model architecture.
Developer /u/deadsoulinside has released a Work-In-Progress (WIP) Chromium extension that integrates ComfyUI directly into the browser sidebar. The tool focuses on image-to-image (i2i) workflows using the Klein 2 KV architecture, which offers high prompt-based control over image manipulation. Users can create, save, and categorize custom prompts within the plugin's interface. To function, it requires a local ComfyUI instance with API mode and CORS enabled, specifically targeting the Flux-2-Klein 9B model and Qwen 3 text encoders. The project is open-source, serving as a template for others to build upon or port to Firefox.
r/StableDiffusion·tooling·05/06/2026, 10:12 PM·/u/deadsoulinside
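Under the hood, "API mode with CORS enabled" means the extension POSTs a workflow JSON to the local ComfyUI HTTP endpoint; CORS only matters because the caller is a browser page. The same call from Python, with the workflow file name as a placeholder:
```python
import json
import urllib.request

# A workflow exported via ComfyUI's "Save (API Format)" option; placeholder file name.
with open("klein2_i2i_workflow_api.json") as f:
    workflow = json.load(f)

# Queue it on a locally running ComfyUI instance (default port 8188).
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode(),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())  # response includes a prompt_id
```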
The GB10 Solution Atlas is now open source, the inference engine made for the community with breakneck inference speeds (Qwen3.6-35B-FP8 100+ tok/s)
Atlas is a high-performance, Rust-based open-source inference engine that delivers 3x faster speeds than vLLM on Blackwell hardware by removing Python overhead.
Atlas is a newly open-sourced inference engine written in pure Rust and CUDA, designed to bypass the performance bottlenecks of the standard Python/PyTorch stack. Optimized for NVIDIA Blackwell (GB10) architecture, it achieves over 100 tokens per second on Qwen3.5-35B models using NVFP4 precision and Multi-Token Prediction (MTP). The engine features a lightweight 2.5GB Docker image with sub-2-minute cold starts and provides native support for OpenAI and Anthropic API formats. By rewriting the stack from HTTP handlers to kernel dispatch, the developers claim a 3x throughput increase over vLLM. Future updates aim to bring these optimizations to AMD Strix Halo and RTX 6000 Blackwell hardware.
r/LocalLLaMA·tooling·05/06/2026, 08:36 PM·/u/Live-Possession-6726
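"Native support for OpenAI and Anthropic API formats" implies existing clients can simply point at the local server. A sketch using the openai Python client, where the base URL and model name are guesses for illustration rather than documented Atlas values:
```python
from openai import OpenAI

# Point the standard client at the local Atlas server instead of api.openai.com.
# Port, path, and model name below are assumptions; check the Atlas README.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="qwen3.5-35b-fp8",
    messages=[{"role": "user", "content": "Summarize multi-token prediction in two sentences."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```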
Anyone else tried this RefineAnything LoRA? Pretty impressed so far
A new ComfyUI plugin and LoRA workflow for surgical image refinement, perfect for fixing text, logos, and small details without affecting the rest of the image.
The RefineAnything project provides a specialized LoRA and workflow for surgical image repairs, specifically targeting text, logos, and product labels. A new ComfyUI plugin, ComfyUI-RefineNode, has been released to automate the manual labor of mask preparation, reference alignment, and pasting back the refined region. The plugin is model-agnostic, meaning it can enhance any local detail repair workflow, not just the RefineAnything LoRA. It supports both scribble masks and bounding boxes, ensuring the rest of the image remains 100% untouched. A technical tip from the developer suggests avoiding the 'index_timestep_zero' method to prevent noticeable color shifts during the process.
r/StableDiffusion·tooling·05/06/2026, 07:32 PM·/u/liangkun43
OpenAI built a networking protocol with AMD, Broadcom, Intel, Microsoft, and NVIDIA to fix AI supercomputer bottlenecks
OpenAI and tech giants released MRC, an open-source protocol that makes training massive models faster and cheaper by optimizing how 100,000+ GPUs communicate.
OpenAI, in collaboration with industry leaders like NVIDIA, Microsoft, and AMD, has introduced MRC (Multi-Path Remote Communication), an open-source networking protocol designed for AI supercomputing. The protocol addresses the massive data bottlenecks inherent in training LLMs across tens of thousands of GPUs. By enabling data transmission across hundreds of paths simultaneously, MRC reduces the required network switch layers from four down to just two. This architecture supports clusters of over 100,000 GPUs while significantly lowering power consumption and hardware costs. Currently, the protocol is operational within OpenAI's Stargate supercomputer project, signaling a shift towards more efficient, standardized AI infrastructure.
The Decoder·tooling·05/06/2026, 07:13 PM·Matthias Bastian
vLLM V0 to V1: Correctness Before Corrections in RL
vLLM V1 is a major upgrade optimized for RL and reasoning models, focusing on output correctness and significantly better inference performance.
vLLM is transitioning from V0 to V1, marking a major architectural overhaul focused on Reinforcement Learning (RL) workflows. The update emphasizes a 'Correctness Before Corrections' philosophy, addressing the critical need for high-fidelity outputs in complex reasoning tasks. This shift is particularly relevant for serving modern models like DeepSeek-R1 that rely on long-chain reasoning and RL-based optimization. The new version aims to significantly reduce overhead and improve throughput while maintaining strict output validation. It represents a move towards more robust, production-ready inference for the next generation of agentic and reasoning LLMs.
Hugging Face Blog·tooling·05/06/2026, 07:06 PM
Interactive Video Generation (Causal Forcing) - High Speed!
Generate high-speed interactive videos even on mid-range GPUs like the RTX 3060, with potential for real-time performance on high-end hardware.
Causal Forcing is a new approach to interactive video generation that emphasizes speed and efficiency. The release includes open-source code and models, with a community-repackaged version for ComfyUI. Performance benchmarks show that an RTX 3060 can generate a 2-second video (848x480) in just 11 seconds using only 4 steps. On high-end GPUs like the RTX 4090 or 5090, users report near real-time generation speeds. The model is lightweight, peaking at 6GB VRAM, making it accessible for hobbyists with mid-range hardware. This represents a significant step toward fluid, interactive AI video tools.
r/StableDiffusion·model_release·05/06/2026, 05:53 PM·/u/ZerOne82
DeepSeek V4 AI Beats Billion Dollar Systems…For Free
DeepSeek V4 is a powerful new open-source AI model that reportedly outperforms expensive commercial systems, offering advanced capabilities for free.
DeepSeek has released its new AI model, DeepSeek V4, which is being highlighted for its impressive performance. The model reportedly surpasses the capabilities of much larger and more expensive "billion-dollar" proprietary systems, yet it is available for free. This release signifies a notable advancement in the open-source LLM landscape, potentially democratizing access to high-tier AI capabilities. For creative non-developers and hobbyists, this means access to a powerful tool without significant financial investment, pushing the boundaries of what's achievable with freely available AI.
Two Minute Papers·model_release·05/06/2026, 04:07 PM·Two Minute Papers
Google speeds up Gemma 4 threefold with multi-token prediction
You can now generate text with Google's Gemma 4 models up to three times faster thanks to a new multi-token prediction technique.
Google has introduced multi-token prediction drafters for its Gemma 4 open model family, significantly accelerating text generation. This new feature allows Gemma 4 models to generate text up to three times faster than before. The technique involves a smaller auxiliary model that proposes several tokens simultaneously, which the main Gemma model then validates in a single pass. This enhancement provides a substantial performance boost for users working with Gemma 4, making it more efficient for various creative and development tasks.
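The described mechanism (a small drafter proposes several tokens, the main model validates them in a single pass) is the standard speculative-decoding recipe. A dependency-free toy of the accept/reject loop with made-up stand-in "models", just to show where the speedup comes from; it is not Google's implementation:
```python
def big_model_next(context):          # stand-in for the full model (greedy)
    return (sum(context) * 31 + 7) % 100

def drafter_next(context):            # stand-in for the small auxiliary drafter
    tok = big_model_next(context)     # agrees most of the time, diverges occasionally
    return tok if len(context) % 5 else (tok + 1) % 100

def generate(prompt, new_tokens=20, k=4):
    out = list(prompt)
    while len(out) < len(prompt) + new_tokens:
        # 1) Drafter proposes k tokens autoregressively (cheap).
        draft, ctx = [], list(out)
        for _ in range(k):
            t = drafter_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) The big model checks the draft (in practice in one batched pass);
        #    keep the longest prefix where it agrees.
        accepted, ctx = 0, list(out)
        for t in draft:
            if big_model_next(ctx) == t:
                out.append(t); ctx.append(t); accepted += 1
            else:
                break
        # 3) On the first disagreement, take the big model's own token instead.
        if accepted < k:
            out.append(big_model_next(ctx))
    return out[len(prompt):]

print(generate([1, 2, 3]))
```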
The Decoder·model_release·05/06/2026, 04:05 PM·Matthias Bastian
[Release] PaperStrip_FX COMP | An experimental scan-like strip compositor
A new experimental ComfyUI node for creating stylized 'paper strip' or 'scan-line' visual effects in AI-generated images and videos.
PaperStrip_FX COMP is an experimental tool released for ComfyUI that introduces a unique scan-like strip compositing effect. Developed by user TasTepeler, this node allows artists to slice and rearrange images into horizontal or vertical strips, mimicking physical paper collages or digital scanning glitches. It provides a creative way to post-process AI-generated content directly within the ComfyUI environment, eliminating the need for external video editing software for these specific visual styles. The release includes the workflow and custom nodes necessary to implement these transitions or static effects. This tool is particularly useful for creators seeking lo-fi, analog aesthetics in their digital generative workflows.
r/comfyui·tooling·05/06/2026, 03:56 PM·/u/TasTepeler
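A rough idea of what a strip compositor does can be shown in a few lines of Pillow: slice the image into strips and offset alternating strips. Purely illustrative and unrelated to the PaperStrip_FX node's actual code:
```python
from PIL import Image

def strip_shift(img: Image.Image, strip_h: int = 16, shift: int = 24) -> Image.Image:
    """Cut the image into horizontal strips and nudge alternating strips sideways."""
    out = Image.new("RGB", img.size)
    w, h = img.size
    for y in range(0, h, strip_h):
        strip = img.crop((0, y, w, min(y + strip_h, h)))
        dx = shift if (y // strip_h) % 2 else -shift   # alternate direction
        out.paste(strip, (dx, y))
    return out

strip_shift(Image.open("input.png").convert("RGB")).save("strips.png")
```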
ComfyUI - a few image/video utility nodes
A new set of ComfyUI utility nodes for video editing, batch manipulation, and workflow debugging, including transition effects and speed control.
User /u/qdr1en has released a collection of ComfyUI utility nodes developed with the assistance of Claude Sonnet. The package includes general workflow tools like an execution timer, dynamic LoRA loader, and variable interpreter. For image and video work, it offers batch splitting, frame selection, and mirroring. Advanced features include a video speed controller with easing curves and a transition effect node that mimics CSS-style transitions. While some nodes are enhanced versions of existing tools, the collection provides a convenient toolkit for fine-tuning video sequences and debugging complex workflows.
r/comfyui·tooling·05/06/2026, 01:28 PM·/u/qdr1en
Starting with Claude Code - my new open-source project: Git for AI Agents
Regent VCS is a new open-source 'Git for AI' that tracks prompts and sessions, making it easier to undo and branch AI-generated code changes in Claude Code.
Regent VCS is an open-source project aiming to become "Git for AI Agents," specifically targeting the limitations of traditional version control in AI workflows. The developer argues that Git fails at undoing AI-generated changes effectively and doesn't track the relationship between specific prompts and code modifications. The tool currently supports Claude Code and includes both a CLI and a VS Code extension. Key features include better session tracking, conversation branching (forking context), and correlating the file tree with actual prompts. It is currently in alpha, seeking community feedback and contributors to improve the developer experience for agentic coding.
r/ClaudeAI·tooling·05/06/2026, 01:16 PM·/u/Immediate-Landscape1
SenseNova U1 Infographic Test: Image Reasoning and Infographic Generation Capabilities
SenseNova U1 is a new model specialized in generating logical infographics and structured visual explanations from simple prompts.
SenseNova U1 is an emerging model designed for comprehension-driven image generation, specifically targeting infographics and technical illustrations. A recent community test demonstrated its ability to visualize a complex chemical reaction (eggshell in vinegar) with logical structure rather than just aesthetic elements. Unlike general-purpose models, it automatically organizes content into coherent informational layouts even with minimal prompting. While the visual reasoning is strong, the model still struggles with text clarity in some instances. The project is available on GitHub, offering a new tool for users needing structured visual communication.
r/comfyui·model_release·05/06/2026, 08:37 AM·/u/Beginning-Lie-4581
SenseNova U1 Infographic Test: Capabilities in Image-Based Reasoning
SenseNova U1 excels at generating structured infographics and technical diagrams, provided you use highly detailed prompts to guide its internal reasoning.
SenseNova U1 is a multimodal model capable of generating complex infographics by interpreting and structuring input concepts into visual steps. User testing reveals that the model excels at technical illustrations, such as cross-section diagrams with annotations and callout lines. A key finding is the model's sensitivity to prompt depth; detailed, multi-layered descriptions significantly improve reasoning stability and compositional clarity. While it can "guess" based on short prompts, the quality of logical layout drops without specific guidance. The project is open-source, with code available on GitHub for further exploration of its image-based reasoning capabilities.
r/StableDiffusion·model_release·05/06/2026, 08:16 AM·/u/Nearby-Recover4701
This experimental open-source AI turns prompts into playable Marvel, Star Wars and Harry Potter games
New open-source experimental AI turns text prompts into interactive, playable game environments, enabling instant 'vibe coding' for game prototypes.
This experimental open-source AI model allows users to generate playable, interactive game environments using simple text prompts. By training on vast amounts of gameplay footage, the system can simulate the visual styles and basic physics of iconic franchises like Star Wars and Marvel. Unlike traditional video generation, this tool focuses on real-time interactivity, enabling a form of 'vibe coding' for game design where the engine interprets intent rather than rigid code. While currently limited to basic movement and environmental interaction, it represents a significant step toward generative world models. The project highlights the potential for non-developers to prototype complex 3D spaces instantly.
Creative Bloq·model_release·05/06/2026, 08:00 AM·Joe Foley
Built a Claude Code monitoring tool
Monitor your Claude Code CLI sessions, token usage, and costs directly inside VSCode with this new open-source observability tool called Argus.
Argus is a new open-source monitoring and observability tool designed specifically for Claude Code, Anthropic's CLI agent. It integrates directly into VSCode, providing a visual interface to track agent sessions that would otherwise be confined to the terminal. The tool helps users monitor token consumption, financial costs, and the specific sequence of actions taken by the agent in real-time. By moving observability out of the CLI and into the IDE, it simplifies the debugging of complex agentic workflows. This is particularly useful for developers concerned about the "black box" nature and potential costs of long-running Claude Code sessions.
r/ClaudeAI·tooling·05/06/2026, 07:53 AM·/u/fIak88
Adding Benchmaxxer Repellant to the Open ASR Leaderboard
Hugging Face is cleaning up the Open ASR Leaderboard by using private test data to stop models from 'cheating' their way to the top.
Hugging Face has updated the Open ASR Leaderboard with a mechanism dubbed "Benchmaxxer Repellant" to combat benchmark gaming. The initiative addresses the growing issue of data contamination, where models are inadvertently or intentionally trained on public test sets. By introducing private, unseen evaluation datasets, the leaderboard can now provide a more accurate reflection of a model's generalization capabilities. This move ensures that top-ranking models actually perform better in real-world scenarios rather than just excelling at memorized benchmarks. It represents a shift towards more rigorous, verifiable standards in the open-source speech recognition community.
Hugging Face Blog·tooling·05/06/2026, 12:00 AM
Common and Obscure Models and Ways to Find Them [ Human Written ]
A high-quality list of local AI tools for audio, voice, and transcription that offer powerful alternatives to mainstream models like Whisper.
A curated collection of local AI tools and models focusing on audio processing, voice cloning, and transcription beyond standard LLM use cases. The author highlights Applio for voice-to-voice translation, Ultimate-TTS-Studio for converting EPUBs to audiobooks, and the beta desktop version of Open WebUI for a container-free experience. Notably, the post suggests alternatives to Whisper like Parakeet and VibeVoice for more accurate long-form speech transcription with fewer hallucinations. It also covers niche tools like Ultimate Vocal Remover for stem separation and Basic-Pitch for audio-to-MIDI conversion. The guide concludes with practical methods for discovering new open-source AI projects using GitHub tags and AlternativeTo.
r/LocalLLaMA·tooling·05/05/2026, 11:11 PM·/u/iMakeSense
"FLUX Creator Program" - New Flux models sooner than expected?
Black Forest Labs is launching a creator program, signaling that new FLUX models or updates are likely entering a testing phase.
Black Forest Labs (BFL) has announced the "FLUX Creator Program," sparking speculation about upcoming model releases. While specific details remain sparse, the program likely aims to provide early access or support to prominent creators in the AI art community. This move follows the massive success of FLUX.1 and suggests BFL is preparing to expand its ecosystem with new iterations. Users are particularly hopeful for new open-source weights or specialized versions like a "Klein" model mentioned in community discussions. The announcement indicates that BFL is shifting focus toward community-led development and feedback before broader public rollouts.
r/StableDiffusion·news·05/05/2026, 11:10 PM·/u/ArkCoon
I hope this helps everyone....
A massive release of 5 ComfyUI node packs (120+ nodes) covering advanced video masking, Wan Video jitter fixes, animal pose estimation, and professional VFX compositing.
Developer /u/kyahinaamrakhe-1 has released five comprehensive node packs for ComfyUI, totaling over 120 nodes designed for advanced creative workflows. The main 'CustomNodePacks' (72 nodes) introduces unique tools like a Mask Failure Explainer and a Temporal Anchor System using Signed Distance Fields (SDF) for smooth video masking without tracking. Specific fixes for Wan Video address limb jitter and face-cropping issues, while a dedicated animal preprocessor enables accurate pose estimation for species like cats, dogs, and horses. The 'NukeMaxNodes' pack bridges traditional VFX operations (FFT, PBR relighting) with AI, and the GLM-Image pack provides modular loaders for Zhipu AI's multilingual model. All tools are Apache-2.0 licensed and focus on solving production bottlenecks like tempo…
r/comfyui·tooling·05/05/2026, 04:31 PM·/u/kyahinaamrakhe-1
ProgramBench: Can we really rebuild huge binaries from scratch? (doesn't look like it)
ProgramBench is a new, rigorous benchmark from Meta Research that tests if LLM agents can rebuild entire programs from scratch using only binaries and documentation.
Meta Research has introduced ProgramBench, a benchmark designed to evaluate how well LLM agents can reconstruct complex software from scratch. Unlike previous case studies that relied on hand-tuned setups, this framework includes 200 diverse tasks and 6 million lines of behavioral tests to prevent cheating and ensure robustness. Agents are provided only with a target executable and a README, forcing them to architect the entire system without internet access or decompilation. Initial results show that even top-tier closed-source models struggle, while open-source models underperform due to potential overfitting on older benchmarks like SWE-bench. The project is fully open-sourced, including Docker images and a CLI tool for easy evaluation.
r/LocalLLaMA·tooling·05/05/2026, 03:40 PM·/u/klieret
Heretic 1.3 released: Reproducible models, integrated benchmarking system, reduced peak VRAM usage, broader model support, and more
Heretic 1.3 brings byte-for-byte reproducibility to model abliteration, integrated benchmarking, and lower VRAM requirements for processing large models like Qwen 3.5.
Heretic 1.3, the leading tool for LLM abliteration (decensoring), introduces several major technical updates focused on transparency and efficiency. The headline feature is a reproducibility system that allows users to generate byte-for-byte identical models by capturing environment metadata, including GPU drivers and library versions. A new integrated benchmarking suite based on lm-evaluation-harness enables running MMLU and GSM8K tests directly within the tool to verify model quality. Additionally, peak VRAM usage has been significantly reduced, and support has been expanded to include latest-generation architectures like Qwen 3.5 and Gemma 4. This release solidifies Heretic's position as a professional-grade utility for the local LLM community.
r/LocalLLaMA·tooling·05/05/2026, 02:57 PM·/u/-p-e-w-
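Heretic's new benchmarking reportedly wraps lm-evaluation-harness; running the same check by hand on an abliterated checkpoint looks roughly like this (the model path is a placeholder, and this calls lm-eval directly rather than through Heretic):
```python
import lm_eval

# Evaluate a locally saved (e.g. abliterated) checkpoint on GSM8K and MMLU to
# confirm quality hasn't degraded; "./my-abliterated-model" is a placeholder path.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./my-abliterated-model,dtype=bfloat16",
    tasks=["gsm8k", "mmlu"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"])
```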
Converting 2D animations to 3D with LTX 2.3 Lora
Transform 2D animations into depth-rich 3D videos using LTX-Video 2.3 and a specific LoRA workflow for improved spatial consistency.
This workflow demonstrates a method for converting flat 2D animations into 3D-style videos using the LTX-Video 2.3 model and a specialized LoRA. By leveraging the temporal consistency of the LTX architecture, the technique goes beyond simple depth effects to create genuine spatial volume and realistic lighting. The process involves using existing 2D footage as a structural reference while the LoRA guides the model to reinterpret the scene with 3D depth. This provides a powerful tool for creators to modernize 2D assets or generate complex parallax movements without traditional 3D software. It highlights the growing ecosystem of fine-tuned adapters for open-source video generation models.
r/StableDiffusion·tutorial·05/05/2026, 09:09 AM·/u/CQDSN
I know, it's not for everyone, but if you liked Codex Pets, here is now Claude Pets too
Add a visual 'pet' companion to your Claude AI sessions to make your coding or chatting experience more interactive and aesthetically pleasing.
Developer /u/alvinunreal has released 'Claude Pets', a desktop companion tool inspired by the previous 'Codex Pets' project. The tool allows users to have a visual pet on their screen that corresponds to their Claude AI interactions, adding a layer of personality to the interface. Currently, it supports a single pet, but the developer plans to update it to allow multiple pets tied to specific projects or sessions. This project aims to gamify or add a 'cozy' aesthetic to the AI development workflow. The source code is available on GitHub, and additional pet designs can be found at openpets.dev.
r/ClaudeAI·tooling·05/05/2026, 07:15 AM·/u/alvinunreal
Relevance auto-scored by LLM (0–10). List shows top 30 from the last 7 days.