Dear imgutils maintainers (@deepghs and other contributors),
Firstly, thank you for creating and maintaining the imgutils library! It's a very helpful tool, and I've been particularly impressed with the detect_person functionality for anime-style images.
I am currently working on a project that involves intelligent image cropping, aiming to automatically focus on and frame characters within images. The detect_person function provides excellent bounding boxes for characters, which I'm using as the primary guide for a subsequent smart-cropping step (leveraging a library similar in effect to smartcrop.js).
My Use Case & Why I Need More Granular Information:
My application processes a variety of images, often slide shows or collections, where multiple characters might be present. To enhance the smart-cropping logic and provide a better user experience, I'm looking for ways to prioritize or differentiate between detected persons based on more specific attributes, most notably gender.
For example, if an image contains multiple characters, my system could be configured to:
Preferentially crop around female characters.
Apply different framing or compositional rules based on the gender of the main subject.
Allow users to filter or select characters based on these attributes.
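To make the prioritization rules above concrete, here is a minimal sketch of the selection logic I have in mind. The `gender` attribute is hypothetical: detect_person does not currently return it, so this assumes it has been inferred by some separate classifier.

```python
# Sketch of the prioritization logic described above.
# The "gender" field is a hypothetical attribute, assumed to be
# filled in by a separate classification step.

def pick_primary_subject(detections, prefer_gender="female"):
    """Pick one detection to crop around.

    detections: list of dicts like
        {"bbox": (x0, y0, x1, y1), "score": float, "gender": "female" | "male" | None}
    Preference order: matching gender first, then larger bounding-box area.
    Returns None for an empty list.
    """
    def area(bbox):
        x0, y0, x1, y1 = bbox
        return max(0, x1 - x0) * max(0, y1 - y0)

    if not detections:
        return None
    # Tuples compare element-wise, so a gender match (True > False)
    # dominates, and area breaks ties among matches.
    return max(
        detections,
        key=lambda d: (d.get("gender") == prefer_gender, area(d["bbox"])),
    )
```

With this in place, an image containing a large male character and a smaller female character would still resolve to the female box, while falling back to the largest box when no gender information is available.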
Currently, detect_person returns the label 'person' and a confidence score. While this is great for general person detection, it doesn't provide attributes like gender.
My Question/Feature Request:
1. Is there any existing mechanism or planned feature within imgutils (specifically for the anime person detection models like deepghs/anime_person_detection) to extract or infer more granular attributes such as gender for the detected persons?
2. If not directly available, would you have any recommendations or insights on how one might extend or combine imgutils's person detection with other models or techniques to achieve gender classification for the detected bounding boxes, particularly in the context of anime-style images?
3. Are the underlying YOLOv8 models used (from deepghs/anime_person_detection) trained purely for "person" class detection, or do they potentially contain any multi-label/attribute information that is not currently exposed through the detect_person API?
What I'm currently doing:
I'm successfully using detect_person to get bounding boxes, which are then passed as boost regions to a smart-cropping algorithm. The limitation is that when multiple 'person' boxes are returned, my logic falls back on heuristics such as picking the largest box or the first one, which isn't optimal for my use case of prioritizing by gender.
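One workaround I've been considering, sketched below under stated assumptions: crop each detected box and run imgutils's general WD14 tagger on the crop, treating '1girl'/'1boy'-style tags as a rough gender signal. The `detect_person` and `get_wd14_tags` calls and their return shapes are what I understand from the docs (a bbox/label/score list, and a rating/general/character tag-dict triple respectively); please correct me if the API differs. The tag-matching heuristic is entirely my own guess.

```python
# Sketch only -- assumes imgutils's detect_person and get_wd14_tags APIs.

def infer_gender_from_tags(general_tags, threshold=0.5):
    """Rough heuristic: map tagger output to a gender label.

    general_tags: dict of tag name -> confidence.
    Matching on substrings ("girl"/"boy") is crude and will also catch
    tags like "cowgirl_shot"; a curated tag list would be more robust.
    """
    girl = max((v for t, v in general_tags.items() if "girl" in t), default=0.0)
    boy = max((v for t, v in general_tags.items() if "boy" in t), default=0.0)
    if max(girl, boy) < threshold:
        return None
    return "female" if girl >= boy else "male"

def detect_persons_with_gender(path):
    """Run person detection, then tag each crop to guess its gender."""
    # Lazy imports so the pure helper above is usable without imgutils.
    from PIL import Image
    from imgutils.detect import detect_person
    from imgutils.tagging import get_wd14_tags

    image = Image.open(path)
    results = []
    for bbox, label, score in detect_person(image):
        crop = image.crop(bbox)
        _rating, general_tags, _characters = get_wd14_tags(crop)
        results.append({
            "bbox": bbox,
            "score": score,
            "gender": infer_gender_from_tags(general_tags),
        })
    return results
```

If something along these lines is already the recommended composition of the library's pieces, a pointer to it in the docs would be very welcome; if not, exposing gender as a first-class detection attribute would make this unnecessary.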
Any guidance, suggestions, or information on potential future developments in this area would be greatly appreciated. I believe having access to such attributes would significantly enhance the capabilities of applications built on top of imgutils.
Thank you for your time and for this fantastic library!
Best regards,