Improving human detection in the presence of cartoon characters using retrained deep learning models

When computer vision techniques are used to identify humans in public places, cartoon characters are often falsely detected as humans, which complicates human recognition and hinders the deployment of such technology in public settings. This paper aims to minimize the false detection rate by retraining pretrained human detection models using transfer learning. The retraining uses a dataset of two classes, humans and cartoon characters, with 11,000 images per class. The instances in the dataset are carefully labeled before the dataset is split into training, validation, and testing sets. Each selected model is retrained, evaluated, and compared with commonly used pretrained human detection models. The results show that the retrained YOLOv8n model performs best for real-time applications: it achieves 96.97% accuracy, 99.52% precision, 97.42% recall, a 98.46% F1 score, and a false detection rate of 8.16%, with a model size of only 6.09 MB. In addition, it outperforms all the pretrained models in accuracy (by 5.38%) and F1 score (by 2.85%) in reducing the false detection of cartoon characters as humans. This has significant implications for human counting and customer analytics. However, false detections of cartoons as humans still occur in both the pretrained and the retrained models. More sophisticated models, such as the Vision Transformer, will be studied in future work to minimize or eliminate these false detections, since humans make this distinction easily.
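As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. This is a minimal sketch, not the authors' evaluation code; the function name `f1_score` is hypothetical, and the abstract does not give the exact formula used for the false detection rate, so that metric is not reproduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the retrained YOLOv8n model.
precision = 0.9952
recall = 0.9742

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.2f}%")
```

The recomputed value agrees with the 98.46% F1 score quoted above to two decimal places.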

Subjects

Human detection

Cartoon characters

Deep learning

Customer analysis

Computer vision

TRACKING

