Here is a Hugging Face Space where you can test it yourself: https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha (still working)
I was asked to make a Gradio app for this, so I made an advanced app and 1-click installers.
It uses the CLIP model siglip-so400m-patch14-384 and Meta-Llama-3.1-8B-Instruct as its models, along with a fine-tuned checkpoint for better captioning.
For anyone who wants to check out my app: https://www.patreon.com/posts/110613301
It also has a batch folder captioning feature and automatically saves all captioned images and their captions into the outputs folder.
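To give a rough idea of what batch folder captioning involves, here is a minimal sketch. It is not the app's actual code: `caption_folder`, `caption_fn`, and `IMAGE_EXTENSIONS` are hypothetical names, and writing one `.txt` file next to each copied image is an assumption about how the captions are stored.

```python
import shutil
from pathlib import Path
from typing import Callable, List

# Assumed set of extensions to treat as images (hypothetical, adjust as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def caption_folder(input_dir, output_dir, caption_fn: Callable[[Path], str]) -> List[Path]:
    """Caption every image in input_dir and save image + caption into output_dir.

    caption_fn is a placeholder for whatever model call produces the caption;
    it receives the image path and returns the caption text.
    """
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    saved = []
    for image_path in sorted(input_dir.iterdir()):
        # Skip sub-folders and anything that is not a known image type.
        if not image_path.is_file() or image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption = caption_fn(image_path)
        target = output_dir / image_path.name
        shutil.copy2(image_path, target)  # copy the image into the outputs folder
        # Save the caption beside it, same file name with a .txt extension.
        target.with_suffix(".txt").write_text(caption, encoding="utf-8")
        saved.append(target)
    return saved
```

With a real captioning function plugged in as `caption_fn`, a call such as `caption_folder("my_images", "outputs", caption_fn)` would process the whole folder in one pass. Note that in this sketch two images sharing a name but not an extension (for example `a.jpg` and `a.png`) would end up sharing one caption file.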
I also have a very lightweight, super fast Gradio caption editor. Since I don't like the existing apps, I developed this one myself from scratch: https://www.patreon.com/posts/108992085