AI pulse: last 7 days
Daily AI pulse from YouTube, blogs, Reddit, HN. Ruthlessly filtered.
Sources (41)
- critical · Andrej Karpathy
Former AI director at Tesla, OpenAI cofounder. Every video is gold.
- critical · Anthropic
Official Anthropic channel. Every Claude release.
- critical · ComfyUI Blog
Release log for ComfyUI integrations — Luma Uni-1, GPT Image 2, ACE-Step music gen, Seedance. Covers video + image + music + workflow.
- critical · OpenAI Blog
Official OpenAI blog. All releases.
- critical · Simon Willison's Weblog
The best AI 'thinker'. Daily posts, deep insights, low hype rate.
- high · AI Explained
Deep analysis of papers and benchmarks, low hype rate.
- high · AI Jason
Practical tutorials on Claude Code, MCP, and vibe-coding workflows.
- high · Ben's Bites
Daily AI digest, creator-friendly tone. Codex, model releases, agentic AI.
- high · Cole Medin
Vibe coding + agentic workflows + Claude Code MCP integrations.
- high · Fal AI Blog
Fal hosts most new AI image/video models — their blog is an early signal of launches.
- high · HN: 3D & Gaussian Splatting
HN signal for generative 3D — Gaussian Splatting, NeRF, image-to-3D. Threshold of 20 points because the category is niche (historic top: 182 pts).
- high · HN: AI agents / MCP
HN posts about agents, MCP, and vibe coding with a 100-point minimum.
- high · HN: Claude / Anthropic
HN posts mentioning 'Claude' or 'Anthropic' with a 100-point minimum.
- high · Hugging Face Blog
Releases for image, video, audio, and 3D models. Partly tech-heavy — Gemini relevance scoring filters out the noise. Downgraded from critical: too high a volume for 'must-read' status.
- high · IndyDevDan
Claude Code power user, prompts, hooks.
- high · Interconnects (Nathan Lambert)
AI policy + research analysis. Low hype rate, opinionated.
- high · Latent Space
Swyx's podcast + blog — founder interviews and engineering deep dives.
- high · Matt Wolfe
Comprehensive weekly digest of AI tools. ~700K subs.
- high · Matthew Berman
AI news, model release reviews, agent demos. High output.
- high · r/aivideo
The AI video community — Sora, Veo, Runway, Kling, LTX. What genuinely surprises creators.
- high · r/ClaudeAI
The Claude community — power users, tips, problems.
- high · r/LocalLLaMA
Open-source LLMs, local inference, hype-free benchmarks.
- high · r/StableDiffusion
The largest open-source image-gen community (700k+ users). Model launches, LoRAs, ComfyUI workflows.
- high · Riley Brown
Vibe coding, AI builder workflows, Cursor + Claude tutorials.
- high · The Decoder
German AI news outlet writing in English, good breaking news.
- high · Theo - t3.gg
TypeScript + AI dev workflows. Hot takes, narrative-driven.
- high · Yannic Kilcher
Paper reviews and deep dives into AI research.
- low · AI Weirdness
Janelle Shane — playful AI experiments, image-gen quirks. Low volume, unique perspective.
- medium · bycloud
AI papers made digestible — somewhere between Two Minute Papers and Yannic Kilcher.
- medium · Creative Bloq
The design industry — where AI is encroaching on classic graphic disciplines.
- medium · Fireship
100-second format, often AI/LLM + tech news.
- medium · fxguide
The VFX and film industry — ever more AI in the pipeline. A professional perspective.
- medium · Greg Isenberg
Solo-founder vibe — builds products with AI, podcasts with indie hackers.
- medium · r/ChatGPTCoding
Vibe-coding tips, IDE setups, prompts. A mix of all models.
- medium · r/comfyui
ComfyUI workflows — custom nodes, JSON workflows, optimizations.
- medium · r/midjourney
The Midjourney community — v7+ launches, style references, prompt patterns.
- medium · r/runwayml
The Runway-specific community — feature launches, prompt patterns, comparisons with competitors.
- medium · r/SunoAI
The Suno music-gen community — new model versions, lyric-prompting techniques. Audio AI has a weak RSS ecosystem.
- medium · Tina Huang
AI workflows for data science, practical applications.
- medium · Two Minute Papers
Short summaries of AI papers, great for a quick scan.
- medium · Wes Roth
AI news with a more clickbaity tone — the Gemini filter weeds out the hype.
Open-sourcing Banodoco Hivemind: 1M+ Discord messages from artists and engineers working deeply with open image/video models, packaged as an agent skill
A massive dataset of real-world discussions from artists and engineers using open image/video AI models is now available, offering a unique resource for building smarter creative…
The Banodoco Hivemind, a substantial dataset comprising over 1 million Discord messages from artists and engineers, has been open-sourced. This collection captures deep, practical discussions around open image and video AI models, offering insights into real-world usage, problem-solving, and creative applications. Packaged as an "agent skill," this resource is designed to enhance the capabilities of AI agents, allowing them to better understand and assist users in creative workflows. It provides a novel foundation for developing more context-aware and helpful AI assistants, moving beyond generic training data to specialized, community-driven knowledge.
r/comfyui·tooling·05/07/2026, 01:30 PM·/u/PetersOdyssey
So Far This is My Favorite Use-Case for LTX 2.3/ComfyUI
Discover a practical workflow for using the LTX 2.3 video model in ComfyUI to achieve high-quality, consistent video generation on local hardware.
The Reddit community is exploring the capabilities of LTX 2.3, a new video generation model, specifically within the ComfyUI node-based interface. This post demonstrates a high-quality use-case that highlights the model's strengths in temporal consistency and motion fidelity. LTX 2.3 is designed to be more accessible for local execution on consumer GPUs than previous state-of-the-art video models. The author's workflow provides a practical example of how to integrate this model into complex creative pipelines. This demonstration is particularly valuable for creators looking for alternatives to closed-source video tools like Runway or Luma.
r/StableDiffusion·tooling·05/07/2026, 08:33 AM·/u/optimisoprimeo
testing LTX 2.3 1.1 distilled on my gpu. pretty much decent for creating ugc content or short tiktok vlog.
Distilled LTX 2.3 enables fast, high-quality local video generation on mid-range GPUs like the RTX 4060 Ti when paired with the latest CUDA/Torch updates.
A user on r/comfyui demonstrates the performance of the distilled LTX 2.3 1.1 model for generating short-form video content locally. The test highlights significant performance gains when using updated software stacks, specifically Torch 2.11.0 and CUDA 13.0. Running on consumer-grade hardware (RTX 4060 Ti 16GB), the model is capable of producing decent quality UGC and TikTok-style vlogs. The post includes a link to the specific ComfyUI workflow used for these results. This release represents a step forward in making high-quality video generation accessible on mid-range local GPUs.
r/comfyui·tooling·05/07/2026, 08:10 AM·/u/aziib
testing LTX 2.3 v1.1 distilled on my gpu. pretty decent for creating ugc content or short tiktok vlog.
LTX 2.3 v1.1 distilled runs efficiently on mid-range consumer GPUs (RTX 4060 Ti) for short video content when using updated Torch and CUDA drivers.
A user report demonstrates the performance of LTX 2.3 v1.1 distilled for creating short-form video content like TikTok vlogs. Running on an RTX 4060 Ti 16GB, the model shows significant speed improvements when paired with PyTorch 2.11.0 and CUDA 13.0 in ComfyUI. The distilled version of the model is specifically optimized for faster inference while maintaining enough quality for social media use cases. The post highlights the importance of driver and library updates for maximizing performance on consumer-grade hardware, making high-quality video generation more accessible.
r/StableDiffusion·tooling·05/07/2026, 08:10 AM·/u/aziib
Never got good results from Klein? Me neither, til now
Stop using turbo LoRAs with Klein 9B; it achieves peak quality and speed with just 4 steps natively.
A user on r/comfyui discovered why many creators struggle to get high-quality results from the Klein 9B model. The issue stems from incorrectly applying turbo LoRAs or using too many sampling steps, which degrades the output. Klein 9B is designed to be natively fast and performs optimally with only 4 steps without any speed-up modifications. The post includes a downloadable ComfyUI workflow and clarifies licensing terms, stating that while outputs can be used commercially, the model itself requires a commercial license from Black Forest Labs for business use. This finding explains the polarizing reception of the model and provides a clear path to better prompt adherence and speed.
r/comfyui·tutorial·05/07/2026, 01:43 AM·/u/Support_Marmoset
Clippy Reloaded - a really sarky useful Clipboard node with no click.
Streamline your ComfyUI workflow with a new clipboard node that automatically copies data without manual clicks.
Clippy Reloaded is a new custom node for ComfyUI designed to simplify data handling by automatically sending outputs to the system clipboard. Unlike standard clipboard nodes that require manual interaction, this version focuses on a "no-click" experience, triggering whenever a value passes through it. It features a humorous, sarcastic interface reminiscent of the classic Microsoft Office assistant. This tool is particularly useful for creators who frequently move prompts, seeds, or hex codes between ComfyUI and other applications. The node aims to reduce friction in repetitive creative tasks within the node-based environment.
r/comfyui·tooling·05/07/2026, 12:13 AM·/u/shootthesound
Clippy Reloaded - a really sarky useful Clipboard node with no click.
Automatically import your system clipboard into ComfyUI workflows every time you queue a prompt, eliminating manual pasting.
Clippy Reloaded is a custom node for ComfyUI designed to streamline the process of getting text into your workflows. Instead of manually pasting text into a node, this tool automatically pulls whatever is currently in your system clipboard the moment you queue a prompt. This is particularly useful for users who frequently copy prompts, descriptions, or parameters from external websites or LLM chats. The node eliminates repetitive clicking and pasting, acting as a dynamic input source. It is available as an open-source repository on GitHub for easy integration into existing ComfyUI setups.
r/StableDiffusion·tooling·05/07/2026, 12:11 AM·/u/shootthesound
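The "no-click" behavior described above maps naturally onto ComfyUI's custom-node convention: a class with `INPUT_TYPES`, `RETURN_TYPES`, and a `FUNCTION` entry point, registered via `NODE_CLASS_MAPPINGS`. A minimal sketch of that pattern, with the clipboard helper and node name as stand-ins (this is not the actual Clippy Reloaded code):

```python
# Hypothetical sketch of a ComfyUI custom node that emits text as a
# STRING output each time a prompt is queued. read_clipboard() is a
# placeholder for a real clipboard read (e.g. via pyperclip).

def read_clipboard() -> str:
    """Stand-in for the actual system-clipboard access."""
    return "a cinematic photo of a lighthouse at dusk"

class ClipboardTextNode:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {}}

    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("text",)
    FUNCTION = "run"
    CATEGORY = "utils"

    # A common trick to force re-execution on every queue: NaN never
    # compares equal to itself, so ComfyUI treats the node as changed.
    @classmethod
    def IS_CHANGED(cls, **kwargs):
        return float("nan")

    def run(self):
        # Fresh clipboard contents on every queued prompt.
        return (read_clipboard(),)

# Registration dict ComfyUI scans for in custom_nodes packages.
NODE_CLASS_MAPPINGS = {"ClipboardTextNode": ClipboardTextNode}
```

Dropped into a `custom_nodes` package, a node shaped like this would surface the clipboard as a wireable string source, which is the core of the workflow the post describes.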
My Reference Latent Node including Auto Masking and Timesteps per image is out tomorrow
A new ComfyUI node simplifies character consistency with built-in auto-masking and granular timestep control for reference images.
A new custom node for ComfyUI, developed by /u/shootthesound, introduces advanced Reference Latent capabilities for image generation. The node stands out by integrating auto-masking directly, reducing the need for manual mask preparation or external nodes. It also allows users to define specific timesteps for each reference image, providing much finer control over how much influence a reference has during the diffusion process. This is particularly useful for maintaining character consistency or transferring specific styles without overriding the entire generation. The release represents a streamlined approach to complex multi-image conditioning workflows that previously required cumbersome setups.
r/comfyui·tooling·05/06/2026, 10:32 PM·/u/shootthesound
My Reference Latent Node including Auto Masking and Timesteps per image is out tomorrow
A new ComfyUI node that offers precise control over reference images through auto-masking and per-image timestep scheduling.
Developer /u/shootthesound has released ReferenceLatentPlus, a new custom node for ComfyUI designed to refine how reference images influence generations. The tool introduces auto-masking capabilities and allows users to set specific timesteps for each reference image, providing granular control over when and how much a source image affects the output. It includes integrated VAE input and maximum resolution controls, simplifying the pipeline for piping multiple images directly into a workflow. This release addresses the need for more precise element extraction from source material without complex manual masking. The node is now publicly available on GitHub for integration into existing Stable Diffusion setups.
r/StableDiffusion·tooling·05/06/2026, 10:31 PM·/u/shootthesound
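The per-image timestep control described above boils down to gating each reference to a window of the denoising schedule. A conceptual sketch of that gating logic (an illustration of the idea, not the node's actual code; the reference names are made up):

```python
# Per-reference timestep gating: each reference image is only applied
# inside a [start, end] fraction of the denoising schedule.

def active_references(step: int, total_steps: int, schedule: dict) -> list:
    """Return names of references whose window covers this step.

    schedule maps reference name -> (start_frac, end_frac), where
    0.0 is the first (noisiest) step and 1.0 the last.
    """
    progress = step / max(total_steps - 1, 1)
    return [name for name, (start, end) in schedule.items()
            if start <= progress <= end]

# Example: the face reference guides only the early structure-forming
# steps, while a style reference stays on for the whole run.
schedule = {
    "face_ref": (0.0, 0.4),
    "style_ref": (0.0, 1.0),
}
```

Restricting a character reference to early steps is one way to keep identity without letting the reference override fine detail later in the generation, which matches the use case the summary describes.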
[WIP] ComfyUI Powered Klein 2 KV Edit i2i plugin (Chromium)
A browser sidebar plugin that lets you perform advanced image-to-image edits via ComfyUI using the Klein 2 KV model architecture.
Developer /u/deadsoulinside has released a Work-In-Progress (WIP) Chromium extension that integrates ComfyUI directly into the browser sidebar. The tool focuses on image-to-image (i2i) workflows using the Klein 2 KV architecture, which offers high prompt-based control over image manipulation. Users can create, save, and categorize custom prompts within the plugin's interface. To function, it requires a local ComfyUI instance with API mode and CORS enabled, specifically targeting the Flux-2-Klein 9B model and Qwen 3 text encoders. The project is open-source, serving as a template for others to build upon or port to Firefox.
r/StableDiffusion·tooling·05/06/2026, 10:12 PM·/u/deadsoulinside
Kijai LTX 2.3 with 12 GB of VRAM demo reel
You can now run the high-quality LTX 2.3 22B video model on a standard 12GB VRAM GPU using GGUF quantization and specialized ComfyUI workflows.
A user demonstrated that the LTX 2.3 22B video generation model can produce high-quality 8-second clips on consumer-grade hardware. By utilizing GGUF quantization and specific ComfyUI workflows developed by Kijai, the model fits within 12GB of VRAM, specifically tested on an RTX 3060 with 32GB of system RAM. This is a significant milestone as it brings state-of-the-art open-weight video generation to hobbyist setups. The shared resources include the GGUF model files and optimized workflows available on Civitai. This setup balances performance and accessibility, making long-form AI video generation more feasible for local execution without requiring enterprise-grade hardware.
r/comfyui·tooling·05/06/2026, 09:09 PM·/u/OfficeMagic1
Acestep 1.5 XL Base Workflow?
Get the ComfyUI workflows for ACE-Step 1.5XL text-to-music generation, though be aware of potential vocal quality issues in the latest base version.
A user on r/comfyui has shared direct links to workflows for ACE-Step 1.5XL Base and ACE-Step 1.5 (4b LLM), which are models designed for text-to-music generation. While these workflows allow for integrated audio creation within ComfyUI, the author notes a significant drop in vocal quality in the 1.5XL version compared to the older 4b LLM variant. The issue persists across various prompts and default settings, resulting in audio that sounds low-bitrate or 'off'. This post serves as both a resource for those wanting to experiment with AI music and a warning about current technical limitations. It highlights the ongoing challenges in maintaining audio fidelity when scaling these specific generative models.
r/comfyui·tooling·05/06/2026, 08:48 PM·/u/uhf789
Anyone else tried this RefineAnything LoRA? Pretty impressed so far
A new ComfyUI plugin and LoRA workflow for surgical image refinement, perfect for fixing text, logos, and small details without affecting the rest of the image.
The RefineAnything project provides a specialized LoRA and workflow for surgical image repairs, specifically targeting text, logos, and product labels. A new ComfyUI plugin, ComfyUI-RefineNode, has been released to automate the manual labor of mask preparation, reference alignment, and pasting back the refined region. The plugin is model-agnostic, meaning it can enhance any local detail repair workflow, not just the RefineAnything LoRA. It supports both scribble masks and bounding boxes, ensuring the rest of the image remains 100% untouched. A technical tip from the developer suggests avoiding the 'index_timestep_zero' method to prevent noticeable color shifts during the process.
r/StableDiffusion·tooling·05/06/2026, 07:32 PM·/u/liangkun43
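The guarantee that "the rest of the image remains 100% untouched" comes down to the paste-back step any such refinement node has to perform: compositing the refined region over the original through a mask. A toy single-channel sketch of that composite (real nodes operate on image tensors, but the logic is the same):

```python
# Masked paste-back: mask value 1 takes the refined pixel,
# 0 keeps the original pixel byte-for-byte.

def paste_back(original, refined, mask):
    """Composite refined over original wherever mask is set."""
    h, w = len(original), len(original[0])
    return [[refined[y][x] if mask[y][x] else original[y][x]
             for x in range(w)] for y in range(h)]

original = [[10, 10, 10],
            [10, 10, 10]]
refined  = [[99, 99, 99],
            [99, 99, 99]]
mask     = [[0, 1, 0],
            [0, 1, 0]]   # only the middle column is "repaired"

result = paste_back(original, refined, mask)
# result == [[10, 99, 10], [10, 99, 10]]
```

Because unmasked pixels are copied straight from the source, a hard mask like this cannot introduce the global color shifts the developer warns about; those come from the refinement pass itself, not the composite.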
[Z-Image] REALSTAGRAM_ZIMG — subtle realism LoRA for Z-Image Turbo (works with any character LoRA)
Enhance Z-Image Turbo generations with a subtle, candid Instagram realism LoRA that stacks perfectly with character models.
REALSTAGRAM_ZIMG is a new realism-enhancing LoRA specifically designed for the Z-Image Turbo and De-Turbo models. It aims to shift image outputs away from the typical "AI-perfect" look toward a more amateur, candid Instagram aesthetic. The LoRA is lightweight (Rank 64, 325 MB) and does not require a trigger word, making it easy to integrate into existing prompts. It is optimized for stacking with character LoRAs at a strength of 0.2 to 0.6 to maintain character identity while adding subtle texture and lighting improvements. A ComfyUI workflow is provided to help users get started immediately.
r/StableDiffusion·tooling·05/06/2026, 06:35 PM·/u/Existing-House1230
Interactive Video Generation (Causal Forcing) - High Speed!
Generate high-speed interactive videos even on mid-range GPUs like the RTX 3060, with potential for real-time performance on high-end hardware.
Causal Forcing is a new approach to interactive video generation that emphasizes speed and efficiency. The release includes open-source code and models, with a community-repackaged version for ComfyUI. Performance benchmarks show that an RTX 3060 can generate a 2-second video (848x480) in just 11 seconds using only 4 steps. On high-end GPUs like the RTX 4090 or 5090, users report near real-time generation speeds. The model is lightweight, peaking at 6GB VRAM, making it accessible for hobbyists with mid-range hardware. This represents a significant step toward fluid, interactive AI video tools.
r/StableDiffusion·model_release·05/06/2026, 05:53 PM·/u/ZerOne82
LTX2.3 + Prompt relay + Keyframes | 2027 ChatGPT self awareness event 😝
Master complex video transitions in ComfyUI using a comprehensive LTX2.3 workflow that integrates prompt relaying and keyframe control.
A new advanced ComfyUI workflow for the LTX2.3 video model has been shared, focusing on the synergy between prompt relaying and keyframes. The setup allows for complex narrative transitions and visual consistency by chaining prompts and managing motion via keyframes. Beyond basic generation, the workflow integrates ID LoRA for character consistency, ControlNet for structural guidance, and a detailer/upscaler pass for high-quality output. It also includes support for custom audio synchronization. While the author notes that the results can be finicky, the provided Civitai link offers a complete all-in-one solution for creators looking to push the boundaries of AI video.
r/comfyui·tooling·05/06/2026, 03:57 PM·/u/Brief-Leg-8831
[Release] PaperStrip_FX COMP | An experimental scan-like strip compositor
A new experimental ComfyUI node for creating stylized 'paper strip' or 'scan-line' visual effects in AI-generated images and videos.
PaperStrip_FX COMP is an experimental tool released for ComfyUI that introduces a unique scan-like strip compositing effect. Developed by user TasTepeler, this node allows artists to slice and rearrange images into horizontal or vertical strips, mimicking physical paper collages or digital scanning glitches. It provides a creative way to post-process AI-generated content directly within the ComfyUI environment, eliminating the need for external video editing software for these specific visual styles. The release includes the workflow and custom nodes necessary to implement these transitions or static effects. This tool is particularly useful for creators seeking lo-fi, analog aesthetics in their digital generative workflows.
r/comfyui·tooling·05/06/2026, 03:56 PM·/u/TasTepeler
Thanks to the sub my silly node and workflow got 3k downloads overnight, therefore I fixed some bugs, unified some features, and uploaded the latest and the greatest version to HF.
A new ComfyUI node that automates character consistency and scene composition using a structured Qwen-based procedural prompting system.
The ComfyUI Character Composer is a procedural prompt system designed to streamline character consistency and scene composition. Built upon the Qwen-Image-Edit-Rapid-AIO ecosystem, it provides a structured approach to generation, reducing the need for manual LLM prompting or copy-pasting. The tool features a unified txt2img and img2img workflow and utilizes a SFW JSON library for managing assets. Following a viral reception on Reddit with over 3,000 downloads, the developer has updated the node with bug fixes and unified features. It aims to offer more controllable generation for users working with complex character-driven workflows.
r/StableDiffusion·tooling·05/06/2026, 03:14 PM·/u/Mundane-Ad-5737
Release: LoRA Lister + Trigger happy: local LoRA stacks, list testing, and prompt sync *Link inside*
Manage and test multiple LoRAs easily in ComfyUI with automatic trigger word syncing, stack saving, and sequential batch testing.
LoRA Lister and Trigger Happy are new custom nodes for ComfyUI designed to streamline LoRA management. LoRA Lister allows users to create, save, and reorder stacks of LoRAs with individual strength controls and visual state indicators. It features a List mode for batch-testing an entire library by stepping through models one by one. The tool automatically fetches metadata, including trigger words and preview images, from CivitAI and caches them locally. Trigger Happy complements this by automatically injecting relevant trigger words into the prompt and offering advanced text encoding features. It can also extract prompts from existing images and handle complex prompt merging.
r/comfyui·tooling·05/06/2026, 01:57 PM·/u/KitchenTight7894
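The trigger-word injection Trigger Happy performs can be sketched as a small prompt transform (assumed behavior for illustration, not the node's actual code; the LoRA filenames and trigger words are made up):

```python
# Prepend each active LoRA's trigger words to the prompt, skipping
# any word the prompt already contains (case-insensitive).

def inject_triggers(prompt: str, lora_triggers: dict) -> str:
    existing = {w.strip().lower() for w in prompt.split(",")}
    new_words = [t for triggers in lora_triggers.values()
                 for t in triggers
                 if t.lower() not in existing]
    return ", ".join(new_words + [prompt]) if new_words else prompt

loras = {
    "film_grain_v2.safetensors": ["flmgrn style"],
    "portrait_helper.safetensors": ["phlp", "portrait"],
}
print(inject_triggers("portrait, a woman reading by a window", loras))
# → "flmgrn style, phlp, portrait, a woman reading by a window"
```

Note the dedup step: "portrait" is already in the prompt, so only the missing triggers are added. That kind of idempotence is what makes automatic injection safe to leave enabled across runs.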
ComfyUI - a few image/video utility nodes
A new set of ComfyUI utility nodes for video editing, batch manipulation, and workflow debugging, including transition effects and speed control.
User /u/qdr1en has released a collection of ComfyUI utility nodes developed with the assistance of Claude Sonnet. The package includes general workflow tools like an execution timer, dynamic LoRA loader, and variable interpreter. For image and video work, it offers batch splitting, frame selection, and mirroring. Advanced features include a video speed controller with easing curves and a transition effect node that mimics CSS-style transitions. While some nodes are enhanced versions of existing tools, the collection provides a convenient toolkit for fine-tuning video sequences and debugging complex workflows.
r/comfyui·tooling·05/06/2026, 01:28 PM·/u/qdr1en
LTX 2.3 ComfyUI – Identity drift in Image-to-Video (first/last frame not stable)
LTX 2.3 users are reporting issues with identity drift in Image-to-Video workflows, where the subject's appearance changes between the first and last frames.
Users of the LTX 2.3 video generation model are reporting significant identity drift when using Image-to-Video (I2V) workflows in ComfyUI. The issue manifests as a lack of consistency where the subject's features change noticeably from the initial frame to the end of the sequence. This stability problem affects the professional utility of the model for character-driven content. Community discussions suggest that while LTX 2.3 offers improvements in motion, frame-one conditioning remains a challenge. Creators are currently looking for workflow workarounds or specific node configurations to lock the identity throughout the generation process.
r/comfyui·tooling·05/06/2026, 11:53 AM·/u/White_Dragon_0
ComfyUI XAV Google Sheets
Easily pull text data from public Google Sheets into your ComfyUI workflows for dynamic prompting or batch processing without complex API setups.
A new set of custom nodes for ComfyUI allows users to integrate public Google Sheets directly into their image generation workflows. The package includes a loader that fetches spreadsheet data as a matrix and a selector that retrieves specific cell values using 0-based row and column indices. This is particularly useful for users who want to manage large sets of prompts, styles, or parameters in a familiar spreadsheet interface rather than hardcoding them into nodes. By using public URLs, it bypasses complex API authentication for simple read-only tasks. It provides a lightweight solution for automating batch runs using external data sources.
r/comfyui·tooling·05/06/2026, 11:34 AM·/u/Asleep-Platypus-3319
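The loader/selector mechanics described above can be sketched in a few lines; this assumes the common public-sheet CSV export URL pattern and mirrors the 0-based indexing the summary mentions (the function names are illustrative, not the node package's API):

```python
# Read a public Google Sheet as CSV (no auth) and index it like the
# loader/selector pair described above.
import csv
import io
import urllib.request

def load_sheet_matrix(csv_text: str):
    """Parse CSV text into a row-major matrix of strings."""
    return list(csv.reader(io.StringIO(csv_text)))

def select_cell(matrix, row: int, col: int) -> str:
    """0-based row/column lookup, mirroring the selector node."""
    return matrix[row][col]

def fetch_public_sheet(sheet_id: str):
    """Requires network and a publicly shared sheet; uses the
    CSV export URL pattern."""
    url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv"
    with urllib.request.urlopen(url) as resp:
        return load_sheet_matrix(resp.read().decode("utf-8"))

# Offline example with inline CSV:
matrix = load_sheet_matrix("prompt,style\nlighthouse at dusk,film noir\n")
print(select_cell(matrix, 1, 0))  # → "lighthouse at dusk"
```

Keeping prompts in a sheet row per batch item and stepping the row index is the batch-run pattern the post is aimed at.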
SenseNova-u1 | Low(ish) vram workflow
Run the new SenseNova-u1 multimodal model on 8GB VRAM using a GGUF-optimized ComfyUI workflow for high-res 2048px generations.
SenseNova-u1 is a unified multimodal model now accessible via GGUF quantization, making it runnable on consumer hardware like 8GB VRAM GPUs. The model excels at text rendering, portraiture, and image editing, with a native generation resolution of 2048x2048. Two versions are available: a Turbo variant requiring only 8 steps and a Base variant for 50 steps. While the Q6 GGUF file is approximately 16GB, the VRAM footprint is kept around 5GB during execution. A dedicated ComfyUI workflow has been released on Civitai to help users implement these high-resolution generations efficiently.
r/comfyui·model_release·05/06/2026, 11:13 AM·/u/MFGREBEL
Building a dedicated AI pipeline for 3DOOH Screen Adaptations (ComfyUI / Blender / RTX 5070)
A professional workflow for 3D anamorphic billboards using Blender and ComfyUI, optimized for high-end hardware like the RTX 5070.
This post details a specialized workflow for creating 3D Out-of-Home (3DOOH) advertising by bridging Blender's spatial precision with ComfyUI's generative capabilities. The author explains how to handle anamorphic perspectives required for large-scale public displays while leveraging AI for texture generation and scene enhancement. By integrating diffusion-based upscaling into the VFX pipeline, the process achieves high-fidelity results significantly faster than traditional rendering methods. The setup specifically utilizes the RTX 5070, providing performance benchmarks for real-time rendering and complex node execution. This approach represents a practical shift in how boutique agencies handle complex spatial media projects using accessible tools.
r/comfyui·tutorial·05/06/2026, 09:58 AM·/u/EquivalentTrash8332
ComfyUI with co-founder Yannik Marek (ComfyAnonymous)
A deep dive with the creator of ComfyUI on how node-based AI workflows are moving from experimental hacks to professional VFX production standards.
This podcast episode features an interview with Yannik Marek, the creator of ComfyUI known as ComfyAnonymous, discussing the tool's journey from a personal experiment to a professional industry standard. They explore how the node-based architecture allows for precise control over Stable Diffusion pipelines, making it indispensable for high-end VFX work. The discussion covers the transition to Comfy Org and the focus on stability and performance for enterprise environments. Marek explains the rationale behind the modular design, which enables rapid integration of new models and techniques. This is a deep dive into the technical philosophy that has made ComfyUI the preferred interface for advanced AI creators.
fxguide·tooling·05/06/2026, 09:38 AM·Mike Seymour
SenseNova U1 Infographic Test: Image Reasoning and Infographic Generation Capabilities
SenseNova U1 is a new model specialized in generating logical infographics and structured visual explanations from simple prompts.
SenseNova U1 is an emerging model designed for comprehension-driven image generation, specifically targeting infographics and technical illustrations. A recent community test demonstrated its ability to visualize a complex chemical reaction (eggshell in vinegar) with logical structure rather than just aesthetic elements. Unlike general-purpose models, it automatically organizes content into coherent informational layouts even with minimal prompting. While the visual reasoning is strong, the model still struggles with text clarity in some instances. The project is available on GitHub, offering a new tool for users needing structured visual communication.
r/comfyui·model_release·05/06/2026, 08:37 AM·/u/Beginning-Lie-4581
GTA 70s - Teaser Trailer (Alternative Version): Z-image Turbo - Flux Klein 9b - Wan 2.2
A high-quality fan trailer demonstrating the synergy between Flux Klein 9b and Wan 2.2 for consistent, cinematic AI video generation.
This creative project showcases a 1970s-themed Grand Theft Auto teaser trailer created using a sophisticated AI pipeline in ComfyUI. The creator utilized Flux Klein 9b for image generation and Wan 2.2 for video synthesis, achieving a distinct vintage aesthetic. The workflow also incorporates Z-image Turbo, likely for rapid prototyping or specific style transfers. This piece serves as a benchmark for how hobbyists can combine multiple specialized models to produce high-fidelity, thematic video content. It highlights the rapid evolution of open-source video tools and their ability to maintain stylistic consistency across scenes.
r/comfyui·creative_work·05/06/2026, 08:36 AM·/u/MayaProphecy
GTA 70s - Teaser Trailer (Alternative Version): Z-image Turbo - Flux Klein 9b - Wan 2.2
A high-quality 70s-style GTA trailer showcase using Flux and Wan 2.2, complete with downloadable ComfyUI workflows for replication.
This project showcases a fan-made 'GTA 70s' teaser trailer created using a sophisticated AI video pipeline. The creator utilized Flux Klein 9b for high-quality image generation and Wan 2.2 for video synthesis, achieving a distinct 70s cinematic aesthetic. Unlike many AI-generated videos that rely on heavy filters, this version focuses on clean film colors and realistic motion. Crucially, the author shared the full ComfyUI workflows via Google Drive, allowing the community to study and replicate the specific generation techniques. It serves as a practical benchmark for what is currently achievable with open-weight video models and fine-tuned Flux variants.
r/StableDiffusion·creative_work·05/06/2026, 08:36 AM·/u/MayaProphecy
Seedance 2.0 Anime MV
See how a complete anime music video was built using Seedance 2.0 in ComfyUI, combining AI video, Claude-generated prompts, and AI vocals.
A creator showcases an anime music video produced using the Seedance 2.0 workflow within ComfyUI. The project utilizes 'nano banana' for character and environment generation, while the video sequences rely on reference images and 'First Frame Last Frame' techniques to maintain consistency. The audio is a hybrid of human-arranged instruments and AI-generated vocals. The workflow is notably accessible, as the author used standard ComfyUI templates and leveraged Claude for scene prompting. This project serves as a practical benchmark for what hobbyists can achieve with current open-source video generation pipelines.
r/comfyui·creative_work·05/06/2026, 06:40 AM·/u/Time-Ad-7720
Chromium AI Image Description Plugin [ComfyUI Powered]
Analyze web images, detect AI artifacts, and generate motion prompts directly from your browser using your local ComfyUI setup and VLM models.
This Chromium plugin bridges the gap between web browsing and local ComfyUI workflows, allowing users to analyze images on any website. It leverages Vision Language Models (VLM) like Qwen 3.5 and Gemma 3 to provide detailed descriptions, OCR, and AI artifact detection. A standout feature is 'Motion Aware prompt', which suggests animation instructions for video generation based on a still image. The plugin requires a running ComfyUI backend and specific workflows provided by the author on GitHub. It also supports custom prompts for specialized image analysis tasks, making it a powerful tool for prompt engineering and quality control.
r/comfyui·tooling·05/06/2026, 02:26 AM·/u/deadsoulinside
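Plugins like this one (and the Klein 2 KV sidebar above) talk to ComfyUI through its local HTTP API, where a workflow graph is POSTed as JSON to `/prompt`. A minimal sketch of that handshake; the node ids and graph content are made up for illustration, and CORS/API mode must be enabled on the ComfyUI side:

```python
# Drive a local ComfyUI instance over its HTTP API by queuing a
# workflow graph as JSON.
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # default local ComfyUI address

def build_payload(workflow: dict, client_id: str) -> bytes:
    """Wrap a workflow graph in the JSON body /prompt expects."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode()

def queue_prompt(workflow: dict, client_id: str = "sidebar-plugin") -> dict:
    """Requires a running local ComfyUI; returns the server's response."""
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=build_payload(workflow, client_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Illustrative (made-up) one-node graph, built offline:
payload = build_payload({"1": {"class_type": "LoadImage"}}, "sidebar-plugin")
```

The `client_id` lets the caller match its submissions to progress events, which is how a browser sidebar can track a generation it kicked off.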
Relevance auto-scored by LLM (0–10). List shows top 30 from the last 7 days.