Improving human detection in the presence of cartoon characters using retrained deep learning models
Journal
Signal, Image and Video Processing
ISSN
1863-1711
Date Issued
2025-04-02
Author(s)
Wei Jie Tiong
Khin Wee Lai
DOI
10.1007/s11760-025-04019-5
Abstract
When computer vision techniques are used to identify humans in public places, cartoon characters are often falsely detected as humans, which complicates human recognition and hinders the deployment of such technology in public. This paper aims to minimize the false detection rate by retraining pretrained human detection models using transfer learning. The retraining uses a dataset of two classes, humans and cartoon characters, with 11,000 images per class. The instances in the dataset are carefully labeled before the dataset is split into training, validation, and testing sets. Each selected model is retrained, evaluated, and compared with commonly used pretrained human detection models. The results show that the retrained YOLOv8n model performs best for real-time application: it achieves 96.97% accuracy, 99.52% precision, 97.42% recall, a 98.46% F1 score, and a false detection rate of 8.16%, with a model size of only 6.09 MB. In addition, it outperforms all the pretrained models in accuracy (by 5.38%) and F1 score (by 2.85%) while reducing the false detection of cartoon characters as humans. This has important implications for human counting and customer analytics. However, false detections of cartoons as humans still occur in both the pretrained and the retrained models. More sophisticated models such as the Vision Transformer will be studied in the future to minimize or completely eliminate these false detections, a task that a human being performs easily.
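As a reading aid for the metrics quoted in the abstract, the following minimal Python sketch shows how precision, recall, and the F1 score relate, and checks that the reported F1 score is consistent with the reported precision and recall. The function names and the example counts are hypothetical. The abstract does not define the false detection rate, so the `false_detection_rate` helper assumes one plausible reading (the fraction of cartoon instances detected as humans) and may not match the paper's definition.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Standard definitions from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def false_detection_rate(cartoons_detected_as_human: int, total_cartoons: int) -> float:
    """ASSUMED definition (not stated in the abstract):
    share of cartoon instances that the model reports as humans."""
    return cartoons_detected_as_human / total_cartoons


if __name__ == "__main__":
    # Precision and recall reported for the retrained YOLOv8n model.
    f1 = f1_score(0.9952, 0.9742)
    print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 98.46%"
```

The harmonic mean recovers the reported 98.46% F1 score from the reported 99.52% precision and 97.42% recall, so the three figures are mutually consistent.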
