Show HN: Multimodal labeling app with Rerun and Gradio
I want to emphasize how powerful this combo is. Rerun excels at showing complex multimodal data, and Gradio makes it super easy to quickly spin up a UI in pure Python.
Here I use:
1. MoGe to estimate relative depth
2. BiRefNet to remove background + generate masks
3. RTMPose to estimate human poses
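To make the shape of that pipeline concrete, here is a minimal sketch of how the three model outputs could be bundled into one annotation per image. The `Annotation` class, the `annotate` function, and the `fake_*` stand-ins are all hypothetical names for illustration. They are not the real MoGe, BiRefNet, or RTMPose APIs, and plain nested lists stand in for image arrays.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[float]]       # H x W grayscale stand-in for a real image array
Mask = List[List[bool]]         # True where a pixel is foreground
Keypoint = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Annotation:
    """Everything the labeling UI needs for one image (hypothetical layout)."""
    depth: Image               # relative depth per pixel (the role MoGe plays)
    mask: Mask                 # foreground mask (the role BiRefNet plays)
    keypoints: List[Keypoint]  # 2D pose keypoints (the role RTMPose plays)


def annotate(
    image: Image,
    estimate_depth: Callable[[Image], Image],
    segment_foreground: Callable[[Image], Mask],
    estimate_pose: Callable[[Image], List[Keypoint]],
) -> Annotation:
    """Run each model on the same image and collect the results."""
    return Annotation(
        depth=estimate_depth(image),
        mask=segment_foreground(image),
        keypoints=estimate_pose(image),
    )


# Placeholder "models" so the sketch runs without any ML dependencies.
def fake_depth(image: Image) -> Image:
    return [[1.0 - px for px in row] for row in image]


def fake_mask(image: Image) -> Mask:
    return [[px > 0.5 for px in row] for row in image]


def fake_pose(image: Image) -> List[Keypoint]:
    height, width = len(image), len(image[0])
    return [(width / 2, height / 2)]  # a single keypoint at the image center
```

Passing the models in as callables keeps the pipeline swappable: replacing one estimator only means passing a different function to `annotate`.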
With this, I'm able to build a labeling pipeline that's fully customizable to my needs and gives me insight into both the 2D and 3D views of the data. I haven't seen other platforms offer this, at least anecdotally.
The important bit here is that I'm saving the final annotations as an RRD, a single, easily viewable data file that contains all the complex context. I also showcase reloading the saved file to allow for adjusting annotations after the fact.
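As a rough sketch of what writing such a file can look like with the Rerun Python SDK: the entity path names, the `labeling_app` application id, and the `entity_paths` and `save_annotations` helpers are assumptions made for illustration, not the app's actual code. It assumes `rerun-sdk` is installed and that the inputs are NumPy-style arrays, with the mask holding integer class ids.

```python
from pathlib import Path
from typing import Dict


def entity_paths(root: str = "image") -> Dict[str, str]:
    """Entity path for each logged layer (names are illustrative assumptions)."""
    return {
        "rgb": f"{root}/rgb",
        "depth": f"{root}/depth",
        "mask": f"{root}/mask",
        "keypoints": f"{root}/keypoints",
    }


def save_annotations(rgb, depth, mask, keypoints, out_path: str) -> Path:
    """Log every layer into one recording that is written to an .rrd file."""
    # Imported lazily so entity_paths() can be used without Rerun installed.
    import rerun as rr

    paths = entity_paths()
    rr.init("labeling_app")  # arbitrary application id
    rr.save(out_path)        # send everything logged below to a file

    rr.log(paths["rgb"], rr.Image(rgb))
    rr.log(paths["depth"], rr.DepthImage(depth))
    rr.log(paths["mask"], rr.SegmentationImage(mask))   # integer class ids
    rr.log(paths["keypoints"], rr.Points2D(keypoints))  # N x 2 pixel positions
    return Path(out_path)
```

A file written this way can be opened in the viewer with `rerun annotations.rrd`. How the app reads the annotations back into Gradio for editing is not covered by this sketch.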