divyaprakash's comments

divyaprakash · 2026-01-25T16:43:17 1769359397

If social is heading that way, at least my tool saves you the manual labor of editing the funeral.

wasmainiac · 2026-01-25T16:52:37 1769359957

divyaprakash · 2026-01-25T16:55:09 1769360109

I was just joking about your comment on social media's 'death

divyaprakash · 2026-01-25T14:30:23 1769351423

It’s actually designed for your own gameplay—it scans hours long raw session to find the best highlights and clips them into shorts. It's more about automating the tedious editing process for your own content rather than generating "slop" from scratch.

8organicbits · 2026-01-25T15:46:01 1769355961

Personal consumption is an interesting angle. I'm starting to think AI content is only desirable to the creator, but no one else wants to see the slop.

ares623 · 2026-01-25T20:04:31 1769371471

It’s like dreams.

simianparrot · 2026-01-25T14:55:20 1769352920

Automating editing is by definition making it slop.

divyaprakash · 2026-01-25T12:36:48 1769344608

Yes, more than enough. I have rtx4080 laptop gpu with 12gb vram.

divyaprakash · 2026-01-25T11:36:29 1769340989

Great idea. Integrating YOLO for 'Action Following' is high on the roadmap—I'd love a PR for that if you're interested!

divyaprakash · 2026-01-25T11:36:01 1769340961

Guilty as charged. I used Antigravity to handle the refactoring and docs so I could stay focused on the CUDA and VRAM orchestration.

wasmainiac · 2026-01-25T18:57:44 1769367464

This isn’t a job interview, drop the corpo speak. What’s going on with Cuda and vram? We are all friends here.

divyaprakash · 2026-01-25T20:00:40 1769371240

Haha fair enough.The actual internals are basically just one big fight with VRAM. I'm using decord to dump frames straight into GPU memory so the CPU doesn't bottleneck the pipeline. From there, everything—scene detection, hsv transforms, action scoring—is vectorized in torch (mostly fp16 to avoid ooming). I also had to chunk the audio stft/flux math because long files were just eating the card alive. The tts model stays cached as a singleton so it's snappy after the first run, and I'm manually tracking 'Allocated vs Reserved' memory to keep it from choking. Still plenty of refinement left on the roadmap, but it's a fun weekend project to mess around with.

wasmainiac · 2026-01-25T22:47:35 1769381255

Nice! Thanks :) what is ooming?

shaugen · 2026-01-25T23:15:13 1769382913

Out Of Memory-ing.

divyaprakash · 2026-01-25T11:35:33 1769340933

Definitely. The architecture is modular—just swap the LLM prompts for 'cinematic' styles. It's headless and dockerized, so it fits well as a SaaS backend worker

divyaprakash · 2026-01-25T07:37:33 1769326653

I built this because I was tired of "AI tools" that were just wrappers around expensive APIs with high latency. As a developer who lives in the terminal (Arch/Nushell), I wanted something that felt like a CLI tool and respected my hardware.

The Tech:

    GPU Heavy: It uses decord and PyTorch for scene analysis. I’m calculating action density and spectral flux locally to find hooks before hitting an LLM.

    Local Audio: I’m using ChatterBox locally for TTS to avoid recurring costs and privacy leaks.

    Rendering: Final assembly is offloaded to NVENC.

Looking for Collaborators: I’m currently looking for PRs specifically around:

    Intelligent Auto-Zoom: Using YOLO/RT-DETR to follow the action in a 9:16 crop.

    Voice Engine Upgrades: Moving toward ChatterBoxTurbo or NVIDIA's latest TTS.

It's fully dockerized, and also has a makefile. Would love some feedback on the pipeline architecture!

amelius · 2026-01-25T13:55:54 1769349354

> Multi-Provider Support: Choose between OpenAI (GPT-5-mini, GPT-4o) or Google Gemini for scene analysis

This is the first sentence in your features section, so it is not strange if users don't understand if this tool is running locally or not.

divyaprakash · 2026-01-25T14:06:50 1769350010

Fair point. I used SOTA models for the analysis to prioritize quality, but since the heavy media processing is local, API costs stay negligible (or free). The architecture is modular, though—you can definitely swap in a local LLM for a fully air-gapped setup.

ramon156 · 2026-01-25T09:59:40 1769335180

I don't get this reasoning. You were tired of LLM wrappers, but what is your tool? These two requirements (felt like a CLI and respects your hardware) do not line up.

Still a cool tool though! Although it seems partly AI generated.

fouc · 2026-01-25T11:21:40 1769340100

Seems like the post you're replying to has since been edited to clarify that he's referring to the wrappers that rely on third party AI APIs over the internet rather than running locally.

rustyhancock · 2026-01-25T10:06:28 1769335588

[flagged]

Hamuko · 2026-01-25T11:12:51 1769339571

I think my life's too short to ever read your READMEs.

pelasaco · 2026-01-25T16:05:43 1769357143

The life ist too short to read AI generated README, which are clearly not written for humans..

pelasaco · 2026-01-25T16:04:45 1769357085

You were tired of "AI tools", then you vibe-coded an AI tool to deal with that? Not sure if i get it why it deserves to be on "Show HN"

ithkuil · 2026-01-25T17:06:14 1769360774

The sentence continued with "that were just wrappers ...".