AI pulse last 7 days
Daily AI pulse from YouTube, blogs, Reddit, and HN. Ruthlessly filtered.
Sources (41)
- [critical] Andrej Karpathy
Former Director of AI at Tesla, OpenAI cofounder. Every video is gold.
- [critical] Anthropic
Anthropic's official channel. Every Claude release.
- [critical] ComfyUI Blog
Release log for ComfyUI integrations — Luma Uni-1, GPT Image 2, ACE-Step music gen, Seedance. Covers video + image + music + workflows.
- [critical] OpenAI Blog
OpenAI's official blog. All releases.
- [critical] Simon Willison's Weblog
The best AI 'thinker'. Daily posts, deep insights, low hype rate.
- [high] AI Explained
Deep analysis of papers and benchmarks, low hype rate.
- [high] AI Jason
Practical tutorials on Claude Code, MCP, and vibe-coding workflows.
- [high] Ben's Bites
Daily AI digest, creator-friendly tone. Codex, model releases, agentic AI.
- [high] Cole Medin
Vibe coding + agentic workflows + Claude Code MCP integrations.
- [high] Fal AI Blog
Fal hosts most new AI image/video models — their blog is an early signal for launches.
- [high] HN: 3D & Gaussian Splatting
HN signal for generative 3D — Gaussian Splatting, NeRF, image-to-3D. Threshold of 20 points because it's a niche category (historic top: 182 pts).
- [high] HN: AI agents / MCP
HN posts about agents, MCP, and vibe coding with a minimum of 100 points.
- [high] HN: Claude / Anthropic
HN posts mentioning 'Claude' or 'Anthropic' with a minimum of 100 points.
- [high] Hugging Face Blog
Releases for image, video, audio, and 3D models. Partly tech-heavy — Gemini relevance filtering removes the noise. Downgraded from critical: too much volume for 'must-read' status.
- [high] IndyDevDan
Claude Code power user, prompts, hooks.
- [high] Interconnects (Nathan Lambert)
AI policy + research analysis. Low hype rate, opinionated.
- [high] Latent Space
Swyx's podcast + blog — founder interviews and engineering deep dives.
- [high] Matt Wolfe
Comprehensive weekly digest of AI tools. ~700K subs.
- [high] Matthew Berman
AI news, model release reviews, agent demos. High output.
- [high] r/aivideo
AI video community — Sora, Veo, Runway, Kling, LTX. What genuinely surprises creators.
- [high] r/ClaudeAI
The Claude community — power users, tips, problems.
- [high] r/LocalLLaMA
Open-source LLMs, local inference, benchmarks without the hype.
- [high] r/StableDiffusion
The largest open-source image-gen community (700k+ users). Model launches, LoRAs, ComfyUI workflows.
- [high] Riley Brown
Vibe coding, AI builder workflows, Cursor + Claude tutorials.
- [high] The Decoder
German AI news outlet publishing in English, good for breaking news.
- [high] Theo - t3.gg
TypeScript + AI dev workflows. Hot takes, narrative-driven.
- [high] Yannic Kilcher
Paper reviews and deep dives into AI research.
- [low] AI Weirdness
Janelle Shane — playful AI experiments, image-gen quirks. Low volume, unique perspective.
- [medium] bycloud
AI papers made digestible — somewhere between Two Minute Papers and Yannic Kilcher.
- [medium] Creative Bloq
Design industry — where AI is encroaching on classic graphic disciplines.
- [medium] Fireship
100-second format, often AI/LLM + tech news.
- [medium] fxguide
VFX and film industry — ever more AI in the pipeline. A professional perspective.
- [medium] Greg Isenberg
Solo-founder vibe — builds products with AI, podcasts with indie hackers.
- [medium] r/ChatGPTCoding
Vibe-coding tips, IDE setups, prompts. A mix of all models.
- [medium] r/comfyui
ComfyUI workflows — custom nodes, JSON workflows, optimizations.
- [medium] r/midjourney
Midjourney community — v7+ launches, style references, prompt patterns.
- [medium] r/runwayml
Runway-specific community — feature launches, prompt patterns, comparisons with competitors.
- [medium] r/SunoAI
Suno music-gen community — new model versions, lyric-prompting techniques. Audio AI has a weak RSS ecosystem.
- [medium] Tina Huang
AI workflows for data science, practical applications.
- [medium] Two Minute Papers
Short summaries of AI papers, great for a quick scan.
- [medium] Wes Roth
AI news with a more clickbaity tone — the Gemini filter sifts out the hype.
Open-sourcing Banodoco Hivemind: 1M+ Discord messages from artists and engineers working deeply with open image/video models, packaged as an agent skill
A massive dataset of real-world discussions from artists and engineers using open image/video AI models is now available, offering a unique resource for building smarter creative…
The Banodoco Hivemind, a substantial dataset comprising over 1 million Discord messages from artists and engineers, has been open-sourced. This collection captures deep, practical discussions around open image and video AI models, offering insights into real-world usage, problem-solving, and creative applications. Packaged as an "agent skill," this resource is designed to enhance the capabilities of AI agents, allowing them to better understand and assist users in creative workflows. It provides a novel foundation for developing more context-aware and helpful AI assistants, moving beyond generic training data to specialized, community-driven knowledge.
r/comfyui·tooling·05/07/2026, 01:30 PM·/u/PetersOdyssey
So Far This is My Favorite Use-Case for LTX 2.3/ComfyUI
Discover a practical workflow for using the LTX 2.3 video model in ComfyUI to achieve high-quality, consistent video generation on local hardware.
The Reddit community is exploring the capabilities of LTX 2.3, a new video generation model, specifically within the ComfyUI node-based interface. This post demonstrates a high-quality use-case that highlights the model's strengths in temporal consistency and motion fidelity. LTX 2.3 is designed to be more accessible for local execution on consumer GPUs than previous state-of-the-art video models. The author's workflow provides a practical example of how to integrate this model into complex creative pipelines. This demonstration is particularly valuable for creators looking for alternatives to closed-source video tools like Runway or Luma.
r/StableDiffusion·tooling·05/07/2026, 08:33 AM·/u/optimisoprimeo
testing LTX 2.3 1.1 distilled on my gpu. pretty much decent for creating ugc content or short tiktok vlog.
Distilled LTX 2.3 enables fast, high-quality local video generation on mid-range GPUs like the RTX 4060 Ti when paired with the latest CUDA/Torch updates.
A user on r/comfyui demonstrates the performance of the distilled LTX 2.3 1.1 model for generating short-form video content locally. The test highlights significant performance gains when using updated software stacks, specifically Torch 2.11.0 and CUDA 13.0. Running on consumer-grade hardware (RTX 4060 Ti 16GB), the model is capable of producing decent quality UGC and TikTok-style vlogs. The post includes a link to the specific ComfyUI workflow used for these results. This release represents a step forward in making high-quality video generation accessible on mid-range local GPUs.
r/comfyui·tooling·05/07/2026, 08:10 AM·/u/aziib
testing LTX 2.3 v1.1 distilled on my gpu. pretty decent for creating ugc content or short tiktok vlog.
LTX 2.3 v1.1 distilled runs efficiently on mid-range consumer GPUs (RTX 4060 Ti) for short video content when using updated Torch and CUDA drivers.
A user report demonstrates the performance of LTX 2.3 v1.1 distilled for creating short-form video content like TikTok vlogs. Running on an RTX 4060 Ti 16GB, the model shows significant speed improvements when paired with PyTorch 2.11.0 and CUDA 13.0 in ComfyUI. The distilled version of the model is specifically optimized for faster inference while maintaining enough quality for social media use cases. The post highlights the importance of driver and library updates for maximizing performance on consumer-grade hardware, making high-quality video generation more accessible.
r/StableDiffusion·tooling·05/07/2026, 08:10 AM·/u/aziib
Never got good results from Klein? Me neither, til now
Stop using turbo LoRAs with Klein 9B; it achieves peak quality and speed with just 4 steps natively.
A user on r/comfyui discovered why many creators struggle to get high-quality results from the Klein 9B model. The issue stems from incorrectly applying turbo LoRAs or using too many sampling steps, which degrades the output. Klein 9B is designed to be natively fast and performs optimally with only 4 steps without any speed-up modifications. The post includes a downloadable ComfyUI workflow and clarifies licensing terms, stating that while outputs can be used commercially, the model itself requires a commercial license from Black Forest Labs for business use. This finding explains the polarizing reception of the model and provides a clear path to better prompt adherence and speed.
r/comfyui·tutorial·05/07/2026, 01:43 AM·/u/Support_Marmoset
Clippy Reloaded - a really sarky useful Clipboard node with no click.
Streamline your ComfyUI workflow with a new clipboard node that automatically copies data without manual clicks.
Clippy Reloaded is a new custom node for ComfyUI designed to simplify data handling by automatically sending outputs to the system clipboard. Unlike standard clipboard nodes that require manual interaction, this version focuses on a "no-click" experience, triggering whenever a value passes through it. It features a humorous, sarcastic interface reminiscent of the classic Microsoft Office assistant. This tool is particularly useful for creators who frequently move prompts, seeds, or hex codes between ComfyUI and other applications. The node aims to reduce friction in repetitive creative tasks within the node-based environment.
r/comfyui·tooling·05/07/2026, 12:13 AM·/u/shootthesound
Clippy Reloaded - a really sarky useful Clipboard node with no click.
Automatically import your system clipboard into ComfyUI workflows every time you queue a prompt, eliminating manual pasting.
Clippy Reloaded is a custom node for ComfyUI designed to streamline the process of getting text into your workflows. Instead of manually pasting text into a node, this tool automatically pulls whatever is currently in your system clipboard the moment you queue a prompt. This is particularly useful for users who frequently copy prompts, descriptions, or parameters from external websites or LLM chats. The node eliminates repetitive clicking and pasting, acting as a dynamic input source. It is available as an open-source repository on GitHub for easy integration into existing ComfyUI setups.
r/StableDiffusion·tooling·05/07/2026, 12:11 AM·/u/shootthesound
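For readers curious how a "no-click" node like this hooks into ComfyUI, the pattern is small: ComfyUI discovers custom nodes via a module-level `NODE_CLASS_MAPPINGS` dict, and a node whose `IS_CHANGED` returns a fresh value is re-executed on every queue. The sketch below is illustrative only, not Clippy Reloaded's actual source; `ClipboardTextSource` and `read_clipboard` are names invented here.

```python
# Illustrative sketch of a "pull from clipboard on queue" custom node
# (hypothetical names, not Clippy Reloaded's real code).
import subprocess
import sys
import time


def read_clipboard() -> str:
    """Best-effort system clipboard read; returns '' when unavailable."""
    cmd = {"darwin": ["pbpaste"],
           "linux": ["xclip", "-selection", "clipboard", "-o"]}
    try:
        return subprocess.run(cmd.get(sys.platform, ["pbpaste"]),
                              capture_output=True, text=True,
                              timeout=2).stdout
    except Exception:
        return ""


class ClipboardTextSource:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {}}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "fetch"
    CATEGORY = "utils"

    @classmethod
    def IS_CHANGED(cls, **kwargs):
        # A changing value defeats caching, so the node re-reads the
        # clipboard every time a prompt is queued: the "no-click" part.
        return time.time()

    def fetch(self):
        return (read_clipboard(),)


NODE_CLASS_MAPPINGS = {"ClipboardTextSource": ClipboardTextSource}
```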
My Reference Latent Node including Auto Masking and Timesteps per image is out tomorrow
A new ComfyUI node simplifies character consistency with built-in auto-masking and granular timestep control for reference images.
A new custom node for ComfyUI, developed by /u/shootthesound, introduces advanced Reference Latent capabilities for image generation. The node stands out by integrating auto-masking directly, reducing the need for manual mask preparation or external nodes. It also allows users to define specific timesteps for each reference image, providing much finer control over how much influence a reference has during the diffusion process. This is particularly useful for maintaining character consistency or transferring specific styles without overriding the entire generation. The release represents a streamlined approach to complex multi-image conditioning workflows that previously required cumbersome setups.
r/comfyui·tooling·05/06/2026, 10:32 PM·/u/shootthesound
My Reference Latent Node including Auto Masking and Timesteps per image is out tomorrow
A new ComfyUI node that offers precise control over reference images through auto-masking and per-image timestep scheduling.
Developer /u/shootthesound has released ReferenceLatentPlus, a new custom node for ComfyUI designed to refine how reference images influence generations. The tool introduces auto-masking capabilities and allows users to set specific timesteps for each reference image, providing granular control over when and how much a source image affects the output. It includes integrated VAE input and maximum resolution controls, simplifying the pipeline for piping multiple images directly into a workflow. This release addresses the need for more precise element extraction from source material without complex manual masking. The node is now publicly available on GitHub for integration into existing Stable Diffusion setups.
r/StableDiffusion·tooling·05/06/2026, 10:31 PM·/u/shootthesound
[WIP] ComfyUI Powered Klein 2 KV Edit i2i plugin (Chromium)
A browser sidebar plugin that lets you perform advanced image-to-image edits via ComfyUI using the Klein 2 KV model architecture.
Developer /u/deadsoulinside has released a Work-In-Progress (WIP) Chromium extension that integrates ComfyUI directly into the browser sidebar. The tool focuses on image-to-image (i2i) workflows using the Klein 2 KV architecture, which offers high prompt-based control over image manipulation. Users can create, save, and categorize custom prompts within the plugin's interface. To function, it requires a local ComfyUI instance with API mode and CORS enabled, specifically targeting the Flux-2-Klein 9B model and Qwen 3 text encoders. The project is open-source, serving as a template for others to build upon or port to Firefox.
r/StableDiffusion·tooling·05/06/2026, 10:12 PM·/u/deadsoulinside
Kijai LTX 2.3 With 12 GB of VRAM demo reel
You can now run the high-quality LTX 2.3 22B video model on a standard 12GB VRAM GPU using GGUF quantization and specialized ComfyUI workflows.
A user demonstrated that the LTX 2.3 22B video generation model can produce high-quality 8-second clips on consumer-grade hardware. By utilizing GGUF quantization and specific ComfyUI workflows developed by Kijai, the model fits within 12GB of VRAM, specifically tested on an RTX 3060 with 32GB of system RAM. This is a significant milestone as it brings state-of-the-art open-weight video generation to hobbyist setups. The shared resources include the GGUF model files and optimized workflows available on Civitai. This setup balances performance and accessibility, making long-form AI video generation more feasible for local execution without requiring enterprise-grade hardware.
r/comfyui·tooling·05/06/2026, 09:09 PM·/u/OfficeMagic1
Acestep 1.5 XL Base Workflow?
Get the ComfyUI workflows for ACE-Step 1.5XL text-to-music generation, though be aware of potential vocal quality issues in the latest base version.
A user on r/comfyui has shared direct links to workflows for ACE-Step 1.5XL Base and ACE-Step 1.5 (4b LLM), which are models designed for text-to-music generation. While these workflows allow for integrated audio creation within ComfyUI, the author notes a significant drop in vocal quality in the 1.5XL version compared to the older 4b LLM variant. The issue persists across various prompts and default settings, resulting in audio that sounds low-bitrate or 'off'. This post serves as both a resource for those wanting to experiment with AI music and a warning about current technical limitations. It highlights the ongoing challenges in maintaining audio fidelity when scaling these specific generative models.
r/comfyui·tooling·05/06/2026, 08:48 PM·/u/uhf789
Anyone else tried this RefineAnything LoRA? Pretty impressed so far
A new ComfyUI plugin and LoRA workflow for surgical image refinement, perfect for fixing text, logos, and small details without affecting the rest of the image.
The RefineAnything project provides a specialized LoRA and workflow for surgical image repairs, specifically targeting text, logos, and product labels. A new ComfyUI plugin, ComfyUI-RefineNode, has been released to automate the manual labor of mask preparation, reference alignment, and pasting back the refined region. The plugin is model-agnostic, meaning it can enhance any local detail repair workflow, not just the RefineAnything LoRA. It supports both scribble masks and bounding boxes, ensuring the rest of the image remains 100% untouched. A technical tip from the developer suggests avoiding the 'index_timestep_zero' method to prevent noticeable color shifts during the process.
r/StableDiffusion·tooling·05/06/2026, 07:32 PM·/u/liangkun43
[Z-Image] REALSTAGRAM_ZIMG — subtle realism LoRA for Z-Image Turbo (works with any character LoRA)
Enhance Z-Image Turbo generations with a subtle, candid Instagram realism LoRA that stacks perfectly with character models.
REALSTAGRAM_ZIMG is a new realism-enhancing LoRA specifically designed for the Z-Image Turbo and De-Turbo models. It aims to shift image outputs away from the typical "AI-perfect" look toward a more amateur, candid Instagram aesthetic. The LoRA is lightweight (Rank 64, 325 MB) and does not require a trigger word, making it easy to integrate into existing prompts. It is optimized for stacking with character LoRAs at a strength of 0.2 to 0.6 to maintain character identity while adding subtle texture and lighting improvements. A ComfyUI workflow is provided to help users get started immediately.
r/StableDiffusion·tooling·05/06/2026, 06:35 PM·/u/Existing-House1230
Interactive Video Generation (Causal Forcing) - High Speed!
Generate high-speed interactive videos even on mid-range GPUs like the RTX 3060, with potential for real-time performance on high-end hardware.
Causal Forcing is a new approach to interactive video generation that emphasizes speed and efficiency. The release includes open-source code and models, with a community-repackaged version for ComfyUI. Performance benchmarks show that an RTX 3060 can generate a 2-second video (848x480) in just 11 seconds using only 4 steps. On high-end GPUs like the RTX 4090 or 5090, users report near real-time generation speeds. The model is lightweight, peaking at 6GB VRAM, making it accessible for hobbyists with mid-range hardware. This represents a significant step toward fluid, interactive AI video tools.
r/StableDiffusion·model_release·05/06/2026, 05:53 PM·/u/ZerOne82
LTX2.3 + Prompt relay + Keyframes | 2027 ChatGPT self awareness event 😝
Master complex video transitions in ComfyUI using a comprehensive LTX2.3 workflow that integrates prompt relaying and keyframe control.
A new advanced ComfyUI workflow for the LTX2.3 video model has been shared, focusing on the synergy between prompt relaying and keyframes. The setup allows for complex narrative transitions and visual consistency by chaining prompts and managing motion via keyframes. Beyond basic generation, the workflow integrates ID LoRA for character consistency, ControlNet for structural guidance, and a detailer/upscaler pass for high-quality output. It also includes support for custom audio synchronization. While the author notes that the results can be finicky, the provided Civitai link offers a complete all-in-one solution for creators looking to push the boundaries of AI video.
r/comfyui·tooling·05/06/2026, 03:57 PM·/u/Brief-Leg-8831
[Release] PaperStrip_FX COMP | An experimental scan-like strip compositor
A new experimental ComfyUI node for creating stylized 'paper strip' or 'scan-line' visual effects in AI-generated images and videos.
PaperStrip_FX COMP is an experimental tool released for ComfyUI that introduces a unique scan-like strip compositing effect. Developed by user TasTepeler, this node allows artists to slice and rearrange images into horizontal or vertical strips, mimicking physical paper collages or digital scanning glitches. It provides a creative way to post-process AI-generated content directly within the ComfyUI environment, eliminating the need for external video editing software for these specific visual styles. The release includes the workflow and custom nodes necessary to implement these transitions or static effects. This tool is particularly useful for creators seeking lo-fi, analog aesthetics in their digital generative workflows.
r/comfyui·tooling·05/06/2026, 03:56 PM·/u/TasTepeler
Thanks to the sub my silly node and workflow got 3k downloads overnight, therefore I fixed some bugs, unified some features, and uploaded the latest and the greatest version to HF.
A new ComfyUI node that automates character consistency and scene composition using a structured Qwen-based procedural prompting system.
The ComfyUI Character Composer is a procedural prompt system designed to streamline character consistency and scene composition. Built upon the Qwen-Image-Edit-Rapid-AIO ecosystem, it provides a structured approach to generation, reducing the need for manual LLM prompting or copy-pasting. The tool features a unified txt2img and img2img workflow and utilizes a SFW JSON library for managing assets. Following a viral reception on Reddit with over 3,000 downloads, the developer has updated the node with bug fixes and unified features. It aims to offer more controllable generation for users working with complex character-driven workflows.
r/StableDiffusion·tooling·05/06/2026, 03:14 PM·/u/Mundane-Ad-5737
Release: LoRA Lister + Trigger happy: local LoRA stacks, list testing, and prompt sync *Link inside*
Manage and test multiple LoRAs easily in ComfyUI with automatic trigger word syncing, stack saving, and sequential batch testing.
LoRA Lister and Trigger Happy are new custom nodes for ComfyUI designed to streamline LoRA management. LoRA Lister allows users to create, save, and reorder stacks of LoRAs with individual strength controls and visual state indicators. It features a List mode for batch-testing an entire library by stepping through models one by one. The tool automatically fetches metadata, including trigger words and preview images, from CivitAI and caches them locally. Trigger Happy complements this by automatically injecting relevant trigger words into the prompt and offering advanced text encoding features. It can also extract prompts from existing images and handle complex prompt merging.
r/comfyui·tooling·05/06/2026, 01:57 PM·/u/KitchenTight7894
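The core trick Trigger Happy automates can be sketched in a few lines: given the active LoRA stack and per-LoRA trigger words (fetched from CivitAI metadata in the real node), prepend any triggers the prompt is missing so each LoRA actually fires. The function below is a minimal sketch of that idea, with invented names, not the node's actual API.

```python
# Hedged sketch of trigger-word injection (hypothetical helper, not
# Trigger Happy's real code): prepend missing trigger words for every
# active (strength > 0) LoRA in the stack.
def inject_triggers(prompt: str,
                    lora_stack: list[tuple[str, float]],
                    triggers: dict[str, list[str]]) -> str:
    missing = []
    lowered = prompt.lower()
    for lora_name, strength in lora_stack:
        if strength <= 0:
            continue  # disabled LoRA: its triggers are not needed
        for word in triggers.get(lora_name, []):
            if word.lower() not in lowered and word not in missing:
                missing.append(word)
    return ", ".join(missing + [prompt]) if missing else prompt
```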
ComfyUI - a few image/video utility nodes
A new set of ComfyUI utility nodes for video editing, batch manipulation, and workflow debugging, including transition effects and speed control.
User /u/qdr1en has released a collection of ComfyUI utility nodes developed with the assistance of Claude Sonnet. The package includes general workflow tools like an execution timer, dynamic LoRA loader, and variable interpreter. For image and video work, it offers batch splitting, frame selection, and mirroring. Advanced features include a video speed controller with easing curves and a transition effect node that mimics CSS-style transitions. While some nodes are enhanced versions of existing tools, the collection provides a convenient toolkit for fine-tuning video sequences and debugging complex workflows.
r/comfyui·tooling·05/06/2026, 01:28 PM·/u/qdr1en
LTX 2.3 ComfyUI – Identity drift in Image-to-Video (first/last frame not stable)
LTX 2.3 users are reporting issues with identity drift in Image-to-Video workflows, where the subject's appearance changes between the first and last frames.
Users of the LTX 2.3 video generation model are reporting significant identity drift when using Image-to-Video (I2V) workflows in ComfyUI. The issue manifests as a lack of consistency where the subject's features change noticeably from the initial frame to the end of the sequence. This stability problem affects the professional utility of the model for character-driven content. Community discussions suggest that while LTX 2.3 offers improvements in motion, frame-one conditioning remains a challenge. Creators are currently looking for workflow workarounds or specific node configurations to lock the identity throughout the generation process.
r/comfyui·tooling·05/06/2026, 11:53 AM·/u/White_Dragon_0
ComfyUI XAV Google Sheets
Easily pull text data from public Google Sheets into your ComfyUI workflows for dynamic prompting or batch processing without complex API setups.
A new set of custom nodes for ComfyUI allows users to integrate public Google Sheets directly into their image generation workflows. The package includes a loader that fetches spreadsheet data as a matrix and a selector that retrieves specific cell values using 0-based row and column indices. This is particularly useful for users who want to manage large sets of prompts, styles, or parameters in a familiar spreadsheet interface rather than hardcoding them into nodes. By using public URLs, it bypasses complex API authentication for simple read-only tasks. It provides a lightweight solution for automating batch runs using external data sources.
r/comfyui·tooling·05/06/2026, 11:34 AM·/u/Asleep-Platypus-3319
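The mechanism behind nodes like this is worth knowing even outside ComfyUI: a public Google Sheet can be fetched as plain CSV from its export URL, parsed into a matrix, and indexed with 0-based row/column, no API key required. A minimal sketch of that pattern (our own helper names, not the XAV nodes' code; `sheet_id` is a placeholder you'd take from the sheet's URL):

```python
# Sketch of the public-sheet-as-CSV trick described above.
import csv
import io
import urllib.request


def fetch_sheet_matrix(sheet_id: str, gid: int = 0) -> list[list[str]]:
    """Download a *public* sheet tab as CSV and parse it into rows."""
    url = (f"https://docs.google.com/spreadsheets/d/{sheet_id}"
           f"/export?format=csv&gid={gid}")
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_matrix(resp.read().decode("utf-8"))


def parse_matrix(csv_text: str) -> list[list[str]]:
    return [row for row in csv.reader(io.StringIO(csv_text))]


def select_cell(matrix: list[list[str]], row: int, col: int) -> str:
    """0-based indices, matching the selector node described above."""
    return matrix[row][col]
```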
SenseNova-u1 | Low(ish) vram workflow
Run the new SenseNova-u1 multimodal model on 8GB VRAM using a GGUF-optimized ComfyUI workflow for high-res 2048px generations.
SenseNova-u1 is a unified multimodal model now accessible via GGUF quantization, making it runnable on consumer hardware like 8GB VRAM GPUs. The model excels at text rendering, portraiture, and image editing, with a native generation resolution of 2048x2048. Two versions are available: a Turbo variant requiring only 8 steps and a Base variant for 50 steps. While the Q6 GGUF file is approximately 16GB, the VRAM footprint is kept around 5GB during execution. A dedicated ComfyUI workflow has been released on Civitai to help users implement these high-resolution generations efficiently.
r/comfyui·model_release·05/06/2026, 11:13 AM·/u/MFGREBEL
Building a dedicated AI pipeline for 3DOOH Screen Adaptations (ComfyUI / Blender / RTX 5070)
A professional workflow for 3D anamorphic billboards using Blender and ComfyUI, optimized for high-end hardware like the RTX 5070.
This post details a specialized workflow for creating 3D Out-of-Home (3DOOH) advertising by bridging Blender's spatial precision with ComfyUI's generative capabilities. The author explains how to handle anamorphic perspectives required for large-scale public displays while leveraging AI for texture generation and scene enhancement. By integrating diffusion-based upscaling into the VFX pipeline, the process achieves high-fidelity results significantly faster than traditional rendering methods. The setup specifically utilizes the RTX 5070, providing performance benchmarks for real-time rendering and complex node execution. This approach represents a practical shift in how boutique agencies handle complex spatial media projects using accessible tools.
r/comfyui·tutorial·05/06/2026, 09:58 AM·/u/EquivalentTrash8332
ComfyUI with co-founder Yannik Marek (ComfyAnonymous)
A deep dive with the creator of ComfyUI on how node-based AI workflows are moving from experimental hacks to professional VFX production standards.
This podcast episode features an interview with Yannik Marek, the creator of ComfyUI known as ComfyAnonymous, discussing the tool's journey from a personal experiment to a professional industry standard. They explore how the node-based architecture allows for precise control over Stable Diffusion pipelines, making it indispensable for high-end VFX work. The discussion covers the transition to Comfy Org and the focus on stability and performance for enterprise environments. Marek explains the rationale behind the modular design, which enables rapid integration of new models and techniques. This is a deep dive into the technical philosophy that has made ComfyUI the preferred interface for advanced AI creators.
fxguide·tooling·05/06/2026, 09:38 AM·Mike Seymour
SenseNova U1 Infographic Test: Image Reasoning and Infographic Generation Capabilities
SenseNova U1 is a new model specialized in generating logical infographics and structured visual explanations from simple prompts.
SenseNova U1 is an emerging model designed for comprehension-driven image generation, specifically targeting infographics and technical illustrations. A recent community test demonstrated its ability to visualize a complex chemical reaction (eggshell in vinegar) with logical structure rather than just aesthetic elements. Unlike general-purpose models, it automatically organizes content into coherent informational layouts even with minimal prompting. While the visual reasoning is strong, the model still struggles with text clarity in some instances. The project is available on GitHub, offering a new tool for users needing structured visual communication.
r/comfyui·model_release·05/06/2026, 08:37 AM·/u/Beginning-Lie-4581
GTA 70s - Teaser Trailer (Alternative Version): Z-image Turbo - Flux Klein 9b - Wan 2.2
A high-quality fan trailer demonstrating the synergy between Flux Klein 9b and Wan 2.2 for consistent, cinematic AI video generation.
This creative project showcases a 1970s-themed Grand Theft Auto teaser trailer created using a sophisticated AI pipeline in ComfyUI. The creator utilized Flux Klein 9b for image generation and Wan 2.2 for video synthesis, achieving a distinct vintage aesthetic. The workflow also incorporates Z-image Turbo, likely for rapid prototyping or specific style transfers. This piece serves as a benchmark for how hobbyists can combine multiple specialized models to produce high-fidelity, thematic video content. It highlights the rapid evolution of open-source video tools and their ability to maintain stylistic consistency across scenes.
r/comfyui·creative_work·05/06/2026, 08:36 AM·/u/MayaProphecy
GTA 70s - Teaser Trailer (Alternative Version): Z-image Turbo - Flux Klein 9b - Wan 2.2
A high-quality 70s-style GTA trailer showcase using Flux and Wan 2.2, complete with downloadable ComfyUI workflows for replication.
This project showcases a fan-made 'GTA 70s' teaser trailer created using a sophisticated AI video pipeline. The creator utilized Flux Klein 9b for high-quality image generation and Wan 2.2 for video synthesis, achieving a distinct 70s cinematic aesthetic. Unlike many AI-generated videos that rely on heavy filters, this version focuses on clean film colors and realistic motion. Crucially, the author shared the full ComfyUI workflows via Google Drive, allowing the community to study and replicate the specific generation techniques. It serves as a practical benchmark for what is currently achievable with open-weight video models and fine-tuned Flux variants.
r/StableDiffusion·creative_work·05/06/2026, 08:36 AM·/u/MayaProphecy
Seedance 2.0 Anime MV
See how a complete anime music video was built using Seedance 2.0 in ComfyUI, combining AI video, Claude-generated prompts, and AI vocals.
A creator showcases an anime music video produced using the Seedance 2.0 workflow within ComfyUI. The project utilizes 'nano banana' for character and environment generation, while the video sequences rely on reference images and 'First Frame Last Frame' techniques to maintain consistency. The audio is a hybrid of human-arranged instruments and AI-generated vocals. The workflow is notably accessible, as the author used standard ComfyUI templates and leveraged Claude for scene prompting. This project serves as a practical benchmark for what hobbyists can achieve with current open-source video generation pipelines.
r/comfyui·creative_work·05/06/2026, 06:40 AM·/u/Time-Ad-7720
Chromium AI Image Description Plugin [ComfyUI Powered]
Analyze web images, detect AI artifacts, and generate motion prompts directly from your browser using your local ComfyUI setup and VLM models.
This Chromium plugin bridges the gap between web browsing and local ComfyUI workflows, allowing users to analyze images on any website. It leverages Vision Language Models (VLM) like Qwen 3.5 and Gemma 3 to provide detailed descriptions, OCR, and AI artifact detection. A standout feature is 'Motion Aware prompt', which suggests animation instructions for video generation based on a still image. The plugin requires a running ComfyUI backend and specific workflows provided by the author on GitHub. It also supports custom prompts for specialized image analysis tasks, making it a powerful tool for prompt engineering and quality control.
r/comfyui·tooling·05/06/2026, 02:26 AM·/u/deadsoulinside
Transcribing & Subtitling Audio Containing Multiple Languages
Current ComfyUI nodes for Qwen3-ASR and Whisper struggle to combine multi-language detection with sentence-level SRT output, requiring manual workarounds.
This discussion on r/comfyui addresses the technical difficulty of transcribing and subtitling audio files that contain multiple languages. The user highlights that while Faster Whisper is a standard for transcription, it fails when languages switch mid-audio. Two specific ComfyUI custom nodes based on Qwen3-ASR are evaluated: one by kaushiknishchay and the TTS-Audio-Suite. The analysis reveals a trade-off where one node handles language detection but lacks sentence-level SRT output, while the other provides proper formatting but forces a single-language output. This identifies a specific tooling gap for creators working with multilingual video content in ComfyUI.
r/comfyui·tooling·05/06/2026, 02:18 AM·/u/Far_Estimate7276
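One manual workaround the thread implies is running separate per-language ASR passes and merging the timed segments into sentence-level SRT yourself. A minimal sketch of that merge step; the segment tuple format and the `[lang]` tag convention are assumptions for illustration, not part of either node's output.

```python
def fmt_ts(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def merge_to_srt(segments):
    """segments: (start_sec, end_sec, lang, text) tuples collected from
    separate per-language passes; sort by time and render numbered SRT."""
    lines = []
    for i, (start, end, lang, text) in enumerate(sorted(segments), 1):
        lines.append(str(i))
        lines.append(f"{fmt_ts(start)} --> {fmt_ts(end)}")
        lines.append(f"[{lang}] {text}")
        lines.append("")  # blank line between cues
    return "\n".join(lines)

srt = merge_to_srt([
    (0.0, 2.5, "en", "Welcome back to the channel."),
    (2.5, 5.0, "ja", "今日は新しいワークフローを紹介します。"),
])
```

This sidesteps the single-node limitation by keeping language detection and SRT formatting as separate concerns, at the cost of running the model once per language region.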
Chromium AI Image Description Plugin
A Chromium plugin that connects your browser to local ComfyUI workflows for instant image analysis, OCR, and video prompting.
This Chromium-based browser plugin allows users to send images directly to local ComfyUI workflows for processing using Vision Language Models (VLMs) like Qwen 3.5 and Gemma 3. Beyond standard image descriptions, it features AI error detection to spot artifacts and a 'Motion Aware' prompt generator that suggests animation steps for video creation based on still frames. It also includes an OCR reader for text extraction and supports custom instructions via a settings menu. The tool is designed to streamline the creative process by bridging web browsing with local AI generation environments.
r/StableDiffusion·tooling·05/06/2026, 01:41 AM·/u/deadsoulinside
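Both posts describe the same mechanism: the extension hands images to ComfyUI's local HTTP API and gets the VLM's text back. A rough sketch of the kind of request involved; ComfyUI's queue endpoint is `POST /prompt` with a node graph keyed by id, but the node ids and class types below are hypothetical placeholders, not the plugin's actual workflow (the author ships the real ones on GitHub).

```python
import json

def build_describe_payload(image_b64, question="Describe this image."):
    """Assemble a ComfyUI /prompt payload for a hypothetical VLM graph.
    Node ids and class_type names are illustrative placeholders."""
    graph = {
        "1": {"class_type": "LoadImageFromBase64",  # hypothetical loader node
              "inputs": {"data": image_b64}},
        "2": {"class_type": "QwenVLDescribe",       # hypothetical VLM node
              "inputs": {"image": ["1", 0],         # wire: node 1, output 0
                         "prompt": question}},
    }
    return json.dumps({"prompt": graph, "client_id": "browser-plugin"})

payload = build_describe_payload("aGVsbG8=")
# The plugin would POST this to http://127.0.0.1:8188/prompt
# and poll /history for the resulting description text.
```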
LTX2.3 8GB VRAM WorkFlow
Run the latest LTX2.3 video generation model on consumer-grade 8GB VRAM GPUs using this optimized ComfyUI workflow.
This Reddit post provides a specialized ComfyUI workflow designed to run the LTX2.3 video generation model on hardware with only 8GB of VRAM. LTX-Video is a high-quality open-weights model known for strong temporal consistency, but it typically demands significant GPU resources. By utilizing optimizations like model offloading or specific node configurations, this workflow makes high-end video generation accessible to users with mid-range consumer GPUs like the RTX 3060 or 4060. This is a practical solution for hobbyists who previously could not run the full model locally due to memory constraints.
r/comfyui·tooling·05/05/2026, 10:27 PM·/u/Extension-Yard1918
Trying to use V2V to extend videos and create long-form in LTX2.3. Quality degrading over time.
Extending videos in LTX-2.3 using V2V workflows often leads to quality degradation after 30 seconds due to recursive referencing and artifact accumulation.
A user on r/comfyui is reporting significant quality loss when attempting to extend 10-second clips into 1-minute videos using the LTX-2.3 model. The process involves using Rune's V2V (Video-to-Video) workflow, which relies on the final 3 seconds of a previous segment to generate the next. By the 30-second mark, which is the third iteration, the visual fidelity begins to break down. This highlights a common 'drift' issue in recursive video generation where artifacts and noise accumulate over time. The discussion points to the limitations of current LTX-2.3 workflows for long-form content without more robust context management or latent refreshing.
r/comfyui·tooling·05/05/2026, 08:10 PM·/u/BarelyAI
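The degradation pattern can be reasoned about with a toy compounding model: each extension conditions on generated rather than ground-truth frames, so per-hop fidelity loss multiplies. The 10-second segment length comes from the post; the retention factor below is an invented illustration, not a measured property of LTX-2.3.

```python
def extension_schedule(seg_len=10.0, retention=0.85, target=60.0):
    """Toy model of recursive V2V extension: each segment appends
    seg_len seconds, conditioned on the previous segment's tail,
    and keeps `retention` of the prior fidelity on the next hop."""
    t, quality, steps = 0.0, 1.0, []
    while t < target:
        t += seg_len
        steps.append((t, quality))
        quality *= retention  # compounding loss per recursive hop
    return steps

steps = extension_schedule()
# By the third segment (the 30 s mark) fidelity has already been
# discounted twice, matching where the poster sees visible breakdown.
```

The toy model also suggests why the usual mitigations target the conditioning step (latent refreshing, re-anchoring on a clean reference) rather than the per-segment generator itself.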
I hope this helps everyone....
A massive release of 5 ComfyUI node packs (120+ nodes) covering advanced video masking, Wan Video jitter fixes, animal pose estimation, and professional VFX compositing.
Developer /u/kyahinaamrakhe-1 has released five comprehensive node packs for ComfyUI, totaling over 120 nodes designed for advanced creative workflows. The main 'CustomNodePacks' (72 nodes) introduces unique tools like a Mask Failure Explainer and a Temporal Anchor System using Signed Distance Fields (SDF) for smooth video masking without tracking. Specific fixes for Wan Video address limb jitter and face-cropping issues, while a dedicated animal preprocessor enables accurate pose estimation for species like cats, dogs, and horses. The 'NukeMaxNodes' pack bridges traditional VFX operations (FFT, PBR relighting) with AI, and the GLM-Image pack provides modular loaders for Zhipu AI's multilingual model. All tools are Apache-2.0 licensed and focus on solving production bottlenecks like tempo…
r/comfyui·tooling·05/05/2026, 04:31 PM·/u/kyahinaamrakhe-1
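The 'Temporal Anchor System' reportedly smooths video masks by working in signed-distance space. The pack's actual implementation isn't shown in the post; as a rough 1D illustration of why that helps, interpolating two SDFs and re-thresholding moves the mask boundary smoothly between frames, whereas averaging binary masks just produces a faded overlap.

```python
def sdf_1d(mask):
    """Signed distance for a 1D binary mask: negative inside the
    region (1s), positive outside (0s), measured in cells."""
    n = len(mask)
    inside = [i for i, v in enumerate(mask) if v]
    outside = [i for i, v in enumerate(mask) if not v]
    out = []
    for i, v in enumerate(mask):
        if v:
            out.append(-(min(abs(i - j) for j in outside) if outside else n))
        else:
            out.append(min(abs(i - j) for j in inside) if inside else n)
    return out

def blend_masks(mask_a, mask_b, t):
    """Interpolate in SDF space, then re-threshold at zero."""
    sa, sb = sdf_1d(mask_a), sdf_1d(mask_b)
    return [1 if (1 - t) * a + t * b <= 0 else 0
            for a, b in zip(sa, sb)]

a = [1, 1, 1, 0, 0, 0, 0, 0]   # region covers cells 0-2
b = [0, 0, 1, 1, 1, 0, 0, 0]   # region has moved to cells 2-4
mid = blend_masks(a, b, 0.5)   # halfway: boundary lands between the two
```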
Luma Uni-1 is now available via Partner Nodes
Luma's Uni-1, an autoregressive model that reasons before drawing, is now available in ComfyUI, offering superior prompt adherence and text rendering.
Luma AI has integrated its Uni-1 model into ComfyUI via new Partner Nodes. Unlike traditional diffusion models, Uni-1 uses a decoder-only autoregressive transformer architecture that processes text and images as a single interleaved sequence. This allows the model to reason through complex prompts, decomposing instructions and planning composition before generating pixels. Key features include high-quality text rendering, material accuracy, and temporal consistency across multi-panel outputs. Users can access it now through Comfy Cloud or by installing the specific partner nodes in their local workflows.
ComfyUI Blog·model_release·05/05/2026, 04:04 PM·Purz
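The 'single interleaved sequence' idea can be made concrete with a toy tokenization: text tokens and image patch tokens share one stream, separated by sentinel markers, so a single decoder attends across both modalities while generating. The token names are invented for illustration; the post does not publish Uni-1's actual vocabulary or markers.

```python
def interleave(text_tokens, image_tokens):
    """Toy interleaved sequence for a decoder-only multimodal model:
    text and image patch tokens share one stream with sentinels."""
    return (["<bos>"] + text_tokens
            + ["<img>"] + image_tokens + ["</img>"])

seq = interleave(["a", "red", "cube"], ["p00", "p01", "p10", "p11"])
# The decoder can emit more text ("reasoning") before <img>, which is
# roughly what "planning composition before generating pixels" implies.
```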
I used Blender as a layout tool for AI video generation — here's the full workflow
Use Blender to control composition and motion, then let Seedance 2 handle the photorealistic AI video rendering.
The author presents a hybrid workflow that uses Blender as a director's pre-vis tool to overcome the randomness of AI video generation. By setting up basic 3D layouts, camera paths, and object animations in Blender, they establish precise spatial control over the scene. Keyframes from this layout are then converted into photorealistic images using an AI model. Finally, both the original 3D animation and the generated images are fed into Seedance 2 (Reference to Video) to produce a consistent, high-quality video sequence. This method effectively separates creative direction and composition from the technical rendering process.
r/comfyui·tutorial·05/05/2026, 03:27 PM·/u/waterarttrkgl
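The keyframe-extraction step above reduces to picking evenly spaced frames from the Blender timeline to render as stills for stylization. A minimal sketch of that schedule; the two-second spacing is an assumed cadence, not the author's stated value.

```python
def keyframes_to_render(frame_start, frame_end, fps=24, every_sec=2.0):
    """Pick evenly spaced frames from a Blender timeline to render as
    reference stills (one roughly every `every_sec` seconds)."""
    step = max(1, round(fps * every_sec))
    frames = list(range(frame_start, frame_end + 1, step))
    if frames[-1] != frame_end:   # always include the final frame
        frames.append(frame_end)
    return frames

frames = keyframes_to_render(1, 241, fps=24, every_sec=2.0)
# Inside Blender each frame would be rendered with
# bpy.context.scene.frame_set(f) then bpy.ops.render.render(write_still=True).
```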
GTA 70s - Teaser Trailer: Z-Image Turbo - Flux Klein 9b - Wan 2.2
A high-quality demonstration of combining Flux Klein 9b and Wan 2.2 in ComfyUI to achieve a specific, consistent cinematic aesthetic.
This creative showcase presents a conceptual 'GTA 70s' trailer, demonstrating a high-end generative video pipeline within ComfyUI. The creator utilized Flux Klein 9b for base imagery, likely leveraging its efficiency and prompt adherence, combined with Wan 2.2 for video synthesis. The mention of 'Z-Image Turbo' suggests a real-time or accelerated generation layer used to speed up the creative iteration process. This project highlights the increasing convergence of specialized LoRAs and video models to achieve consistent stylistic results in a modular environment. It serves as a practical benchmark for what is possible with current open-weights models when properly orchestrated.
r/comfyui·creative_work·05/05/2026, 02:11 PM·/u/MayaProphecy
LTX2.3 8GB VRAM WorkFlow
Run the LTX2.3 video model on budget GPUs (8GB VRAM) using this optimized, multi-step ComfyUI workflow.
This Reddit post introduces a specialized ComfyUI workflow designed to run the LTX2.3 video generation model on GPUs with only 8GB of VRAM, such as the RTX 3060 Ti. Traditionally, high-end video models require significant hardware resources, but this optimization makes the technology accessible to hobbyists. The workflow achieves stability by generating initial video at a lower resolution at 24fps, then handling upscaling and frame interpolation as separate, decoupled steps. It supports both Text-to-Video and Image-to-Video modes, with the latter recommended for maintaining character consistency. This release provides a practical starting point for creative users who want to experiment with state-of-the-art video AI without expensive hardware upgrades.
r/StableDiffusion·tooling·05/05/2026, 12:46 PM·/u/Extension-Yard1918
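The memory win of the decoupled approach is mostly quadratic in resolution. A back-of-envelope sketch, assuming an 8x spatial VAE downsample, 16 latent channels, and fp16 values; these are generic diffusion-video figures, not LTX-2.3's published latent geometry.

```python
def latent_megabytes(width, height, frames, channels=16,
                     downsample=8, bytes_per_val=2):
    """Rough fp16 latent-tensor size for a video clip, assuming an 8x
    spatial VAE downsample (illustrative numbers, not LTX specifics)."""
    lw, lh = width // downsample, height // downsample
    return lw * lh * frames * channels * bytes_per_val / 1e6

hi = latent_megabytes(1280, 704, 121)   # full-resolution generation
lo = latent_megabytes(640, 352, 121)    # the workflow's low-res first pass
# Halving resolution quarters the latent footprint; upscaling and
# frame interpolation then run as separate, smaller passes.
```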
Y'all might want to try this
New Causal-Forcing technique brings KV Cache and potential real-time frame generation to Wan models in ComfyUI.
The Causal-Forcing technique from Thu-ML is being integrated into ComfyUI via a new Pull Request, specifically targeting the Wan model architecture. This method allows for generating video frames sequentially with the benefit of KV Cache, which significantly optimizes memory and compute during inference. While the original researchers claim real-time performance on an RTX 4090, specific resolution details remain unconfirmed. The implementation in ComfyUI's core signifies a shift towards more efficient autoregressive video generation. This update is crucial for users looking to experiment with long-form video or interactive AI generation.
r/StableDiffusion·tooling·05/05/2026, 06:13 AM·/u/Altruistic_Heat_9531
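Why the KV Cache matters can be shown with a toy counter: without caching, every new frame recomputes keys/values for all previous frames, so work grows quadratically; with a cache, each frame's K/V is computed exactly once. This is a conceptual sketch, not the Thu-ML implementation.

```python
class FrameGenerator:
    """Toy autoregressive frame loop counting expensive K/V computations."""
    def __init__(self, use_cache):
        self.use_cache = use_cache
        self.cache = {}
        self.kv_computations = 0

    def _kv(self, frame_idx):
        if self.use_cache and frame_idx in self.cache:
            return self.cache[frame_idx]
        self.kv_computations += 1          # the expensive part
        kv = ("k", "v", frame_idx)         # stand-in for real tensors
        if self.use_cache:
            self.cache[frame_idx] = kv
        return kv

    def generate(self, n_frames):
        for t in range(n_frames):
            # each new frame attends over all previous frames' K/V
            _ = [self._kv(i) for i in range(t + 1)]
        return self.kv_computations

without = FrameGenerator(use_cache=False).generate(16)   # 1+2+...+16 = 136
with_cache = FrameGenerator(use_cache=True).generate(16)  # 16
```

That quadratic-to-linear drop in recomputation is what makes long or interactive sequences plausible on a single GPU.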
LTX-2.3 + Union Control LoRA (8GB VRAM)
Generate high-quality 1280x640 LTX-2.3 videos with precise control on an 8GB VRAM GPU using this optimized ComfyUI workflow.
A new ComfyUI workflow demonstrates high-resolution video generation (1280x640) using the LTX-2.3 model on consumer-grade hardware with only 8GB of VRAM. By integrating the Union Control LoRA, users can achieve precise structural control over the video output, which was previously difficult on low-memory GPUs. The author provides a complete package including a Hugging Face repository for the workflow and a step-by-step YouTube tutorial. This release is significant for the creative community as it lowers the barrier to entry for high-quality AI cinematography. The pipeline uses Nano Banana for the initial frame generation before passing it to LTX-2.3 for temporal consistency.
r/comfyui·tooling·05/05/2026, 02:14 AM·/u/big-boss_97
LTX 2.3 Prompt Relay - Really good for consistency
Use the 'Prompt Relay' technique in ComfyUI to fix character flickering and maintain visual consistency in LTX 2.3 video generations.
A new workflow technique for LTX 2.3 called 'Prompt Relay' has been demonstrated to significantly improve character and environment consistency in generated videos. The method involves passing prompt information across frames or segments in a specific ComfyUI node setup to maintain visual coherence. This approach addresses the common issue of flickering or character morphing that plagues many open-source video models. By chaining prompt context, users can achieve more stable long-form or multi-shot sequences without losing the original artistic intent. The community is highlighting this as a practical solution for creators using LTX-Video checkpoints who need professional-grade stability.
r/comfyui·tooling·05/04/2026, 09:38 PM·/u/smereces
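The exact node graph isn't reproduced in the post; conceptually, 'relaying' means each segment's prompt carries a fixed identity block forward from the previous one, so the wording that defines character and setting never varies between generations. A minimal sketch with invented descriptor strings:

```python
def relay_prompts(base_descriptors, shot_actions):
    """Chain a locked identity block through every segment's prompt so
    character/setting wording stays identical across generations."""
    anchor = ", ".join(base_descriptors)   # relayed unchanged to each shot
    return [f"{anchor}. {action}" for action in shot_actions]

prompts = relay_prompts(
    ["woman with short silver hair", "red trench coat", "rainy neon street"],
    ["she walks toward camera", "she turns into an alley"],
)
```

Keeping the anchor token-for-token identical is the point: paraphrasing it between segments is exactly what reintroduces the morphing the technique is meant to fix.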
April Wrapped
ComfyUI adds massive video (Seedance 2.0, Wan 2.7), music (Ace Step 1.5 XL), and SVG (Quiver) support, plus parallel API execution for speed.
ComfyUI's April update introduces a wide array of new models and features, significantly expanding its creative reach. Key additions include Seedance 2.0 and Wan 2.7 for advanced video generation, and Quiver for structured SVG (vector) output. Music generation gets a boost with Ace Step 1.5 XL and Sonilo's video-to-audio capabilities. On the technical side, the introduction of Parallel Job Execution via API allows for simultaneous workflow processing, offering a major productivity gain for production environments. The ComfyHub repository has also grown to nearly 500 community-shared workflows, making it easier to find pre-built solutions.
ComfyUI Blog·tooling·05/04/2026, 04:37 PM·Team at Comfy
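Client-side, parallel job execution boils down to queueing several workflow graphs concurrently. The sketch below stubs the HTTP call (ComfyUI's queue endpoint is `POST /prompt`, on port 8188 by default) so the dispatch pattern is the focus; swap the stub for a real request against a live server.

```python
import json
from concurrent.futures import ThreadPoolExecutor

def queue_workflow(graph):
    """Stub for POST http://127.0.0.1:8188/prompt — returns the payload
    that would be sent; replace with urllib/requests for a live server."""
    return json.dumps({"prompt": graph, "client_id": "batch-runner"})

def run_parallel(graphs, max_workers=4):
    """Dispatch several workflow graphs concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(queue_workflow, graphs))

jobs = run_parallel([{"seed": i} for i in range(4)])
```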
Relevance auto-scored by LLM (0–10). List shows top 30 from the last 7 days.