Detection means localizing each person at the bounding-box level. An image may contain several people, and all of them should be detected. Detection performance is evaluated in terms of average precision (AP).
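As an illustration of how AP can be computed for a single "person" class, here is a minimal sketch. It assumes boxes in (x1, y1, x2, y2) pixel format, an IoU threshold of 0.5, and all-point interpolation of the precision-recall curve. None of these choices is stated in the source, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thr=0.5):
    """Single-class AP.

    detections:   list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    iou_thr:      assumed matching threshold (0.5 is an assumption, not from the source)
    """
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}

    # Walk detections from highest to lowest score and mark each as TP or FP.
    tp_flags = []
    for img, _score, box in sorted(detections, key=lambda d: -d[1]):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(ground_truth.get(img, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_iou >= iou_thr and not matched[img][best_j]:
            matched[img][best_j] = True   # each ground-truth box is matched at most once
            tp_flags.append(True)
        else:
            tp_flags.append(False)        # poor overlap or duplicate detection

    # Cumulative precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / n_gt if n_gt else 0.0)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Because every person in an image must be found, missed people lower recall and therefore AP, while duplicate boxes on the same person count as false positives.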

Input: Drone image of search and rescue operations

Output:
  • Detection of each person in the image
  • Classification: Standing, Walking, Running, Sitting, Lying, Not Defined
Dataset: https://ieee-dataport.org/documents/search-and-rescue-image-dataset-person-detection-sard

Dataset

  • 1,981 single frames containing people (1920 x 1080 pixels)
  • 8,746 annotated objects in total
  • 6 classes: Standing, Walking, Running, Sitting, Lying, Not Defined
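  • Images were captured in sequential batches, less than 5 seconds apart
  • No metadata available
  • A proper train/test split is not possible: consecutive frames are near-duplicates, and without metadata the sequences cannot be separated reliably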
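One possible mitigation, sketched below, is to split by contiguous blocks instead of shuffling frames at random. This is only a sketch under a strong assumption that is not confirmed by the source: that sorting the file names reproduces capture order. The function name, the validation fraction, and the gap size are all hypothetical. It reduces, but does not eliminate, leakage of near-duplicate frames between the two sets.

```python
def contiguous_split(frame_names, val_fraction=0.2, gap=0):
    """Split frames into a leading training block and a trailing validation block.

    frame_names:  iterable of file names, assumed to sort in capture order
    val_fraction: share of frames placed in the validation block (assumed value)
    gap:          number of frames dropped at the boundary so that the last
                  training frame and the first validation frame are further apart
    """
    frames = sorted(frame_names)
    n_val = int(round(len(frames) * val_fraction))
    cut = len(frames) - n_val
    train = frames[:max(0, cut - gap)]   # everything before the gap
    val = frames[cut:]                   # the final block
    return train, val


if __name__ == "__main__":
    names = [f"frame_{i:04d}.jpg" for i in range(10)]   # made-up file names
    train, val = contiguous_split(names, val_fraction=0.2, gap=1)
    print(len(train), "training frames,", len(val), "validation frames")
```

A random per-frame split would place frames taken seconds apart on both sides of the split, which tends to inflate the measured AP.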

Network architecture

  • Backbone: MobileNetV2
  • Head: CenterNet
  • Input size: 512 x 512
  • Pretrained on COCO 2017
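The source frames are 1920 x 1080 while the network input is 512 x 512, so each image has to be resized before inference. The source does not say how this is done. The sketch below assumes an aspect-preserving "letterbox" resize with symmetric padding, and the helper names are hypothetical. It shows how much a bounding box shrinks under that assumption.

```python
def letterbox_params(src_w, src_h, dst=512):
    """Scale and padding for fitting a src_w x src_h image into a dst x dst square.

    Assumes an aspect-preserving resize with the remainder padded equally
    on both sides (an assumption, not the documented preprocessing).
    """
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (dst - new_w) // 2, (dst - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def map_box(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from source pixels to network-input pixels."""
    x1, y1, x2, y2 = box
    return (x1 * scale + pad_x, y1 * scale + pad_y,
            x2 * scale + pad_x, y2 * scale + pad_y)


if __name__ == "__main__":
    scale, new_w, new_h, pad_x, pad_y = letterbox_params(1920, 1080)
    print("scale:", scale, "resized:", (new_w, new_h), "padding:", (pad_x, pad_y))
    # A made-up box for a person 40 pixels tall in the original frame.
    print(map_box((900, 500, 920, 540), scale, pad_x, pad_y))
```

Under this assumption the scale factor is 512 / 1920, about 0.27, so a person 40 pixels tall in the original frame covers roughly 11 pixels at the network input. This is worth keeping in mind for drone imagery, where people are already small.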

Results


Examples

Improvement ideas

SaR solution improvement ideas