
A Human Detection Model for UAV-Assisted Emergency Management Information Systems

Chu Myaet Thwal, Kyi Thar, Ye Lin Tun, and Choong Seon Hong*

Department of Computer Science and Engineering, Kyung Hee University, South Korea

{chumyaet, kyithar, yelintun, cshong}@khu.ac.kr

Abstract

With advances in automation and communication technologies, Unmanned Aerial Vehicles (UAVs) have become valuable assistants in several critical applications. In this paper, we apply computer vision (CV) techniques to UAV technology and propose a human detection model for UAV-assisted emergency management information systems. The proposed scheme includes a control center that monitors the emergency management procedures. The human detection model is trained and tested at the control center and then mounted on UAVs that are deployed in search and rescue (SAR) operations. Applied in real-world scenarios, the proposed system is expected to reduce the operational delay in acquiring situational information about disaster-affected areas during SAR operations and thus to increase the survival rate of the victims.

Keywords: Unmanned Aerial Vehicle (UAV), Human Detection, Computer Vision (CV), Search and Rescue (SAR)

1. Introduction

Natural disasters such as earthquakes, tsunamis, wildfires, and floods can create catastrophic situations that disrupt the environment and have an immediate impact on human lives. According to statistics from the World Health Organization (WHO), around 90,000 people are killed and nearly 160 million people are affected by large-scale natural disasters every year [1]. Regardless of the type of disaster or the size of the affected area, the authorities must carry out emergency management processes during or after a disaster. In such situations, time is a critical factor in limiting the number of victims and deaths. It is important to collect disaster information and provide first responders with accurate data on the critical situation within a short time. Traditional methods such as ground inspection and aerial scouting operated by humans can be time-consuming and have proven to be unreliable [2]. It can also be difficult for a human rescuer to search for and detect survivors in the damaged area.

With advances in automation and communication technologies, Unmanned Aerial Vehicles (UAVs) have become ubiquitous and popular assets, offering services and functions in a wide range of crucial applications. Being cost-effective and efficient, UAVs are broadly used in aerial photography, package delivery, surveillance, search and rescue (SAR), and military operations [3, 4]. In this paper, we integrate UAVs into an emergency management information system to acquire accurate and sufficient information for the authorities managing SAR operations. The automated capability and agility of UAVs, which let them access a disaster area with ease, may improve the situational awareness that supports SAR operations. To recognize and locate survivors via UAV cameras, we design a deep learning model based on artificial neural networks and computer vision (CV) techniques. Using convolutional neural network (CNN) layers, we develop an object detection model that identifies humans in images obtained from UAV cameras by applying the transfer learning approach [5] to a pre-trained model.

Our objective is thus to provide an automated human detection model, mounted on UAVs, that assists emergency management information systems in reducing operational delay and preserving human lives in critical situations. Applied in real-world scenarios, the proposed scheme can help to acquire relevant information and detect survivors in disaster-affected areas. Moreover, it is expected to drastically reduce the time needed to accomplish SAR operations compared to traditional methods operated by humans. Our contributions are summarized as follows:

• We design the system architecture of an automated search and rescue operation that works jointly with UAVs.

• We analyze the potential of transfer learning and develop an object detection model by fine-tuning the parameters of a pre-trained model.

• We train our model on a relevant aerial image dataset so that it can recognize humans via UAV cameras, and deploy it in the emergency management information system.

• We design the system so that the model is updated with real-time images and data obtained from the deployed UAVs to further improve its performance.


Proceedings of the Korea Computer Congress 2020

2. System Model and Problem Formulation

Fig. 1 shows the system architecture of a search and rescue operation that works jointly with our proposed emergency management information system. We consider a region where natural disasters occur frequently, with a control center that monitors the emergency management processes. A set of UAVs $\mathcal{U} = \{u_1, u_2, \ldots, u_U\}$ works under the administration of the control center to acquire emergency information and assist in SAR operations. Our system architecture consists of three stages for applying the proposed scheme in real-world scenarios: i) the model development stage, ii) the model deployment stage, and iii) the model improvement stage.

Figure 1: System model.

At the model development stage, we adapt the Single Shot Detector (SSD) MobileNet-V2 model [6], pre-trained on the MS-COCO dataset [7], which consists of 330k images with several features for object detection tasks. We apply transfer learning to the adapted model to fine-tune its parameters and develop an object detection model that automatically recognizes humans in images obtained via UAV cameras. We then train and test our model on the Semantic Drone dataset [8], which contains 400 aerial images at a size of 6000 × 4000 px (24 Mpx), taken by a high-resolution UAV camera at an altitude of 5 to 30 meters above the ground. We divide the dataset into three sets: 350 samples for the training set, 40 samples for the validation set, and 10 samples for the test set. The accuracy of the model is determined at the control center and measured by the average precision and recall of the predictions. Precision measures how accurate the model's predictions are, and recall measures how well the model detects every human present in an image. They are calculated by:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$

where TP (true positive) is the case in which our model detects a human that is actually present in the image, FP (false positive) is the case in which our model mistakenly detects a human that is not present in the image, and FN (false negative) is the case in which our model fails to detect a human that is present in the image. After training and testing the model locally at the control center, we deploy the model in real-world scenarios.
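As a minimal sketch of how Eqs. (1) and (2) could be evaluated for one image, the following Python code counts TP, FP, and FN by greedily matching predicted boxes to ground-truth boxes. The paper does not state how a detection is judged correct, so the intersection-over-union (IoU) criterion, the 0.5 threshold, and the helper names `iou`, `count_matches`, and `precision_recall` are assumptions made for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_matches(predicted: List[Box], truth: List[Box],
                  iou_threshold: float = 0.5) -> Tuple[int, int, int]:
    """Return (TP, FP, FN); each ground-truth box is matched at most once.

    The IoU threshold of 0.5 is an assumed value, not taken from the paper.
    """
    unmatched = list(truth)
    tp = 0
    for p in predicted:
        # Pick the remaining ground-truth box that overlaps this prediction most.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_threshold:
            unmatched.remove(best)
            tp += 1
    fp = len(predicted) - tp   # predictions with no matching human
    fn = len(unmatched)        # humans the model missed
    return tp, fp, fn


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Eq. (1) and Eq. (2), returning 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, with two annotated humans and two predictions of which only one overlaps an annotation, `count_matches` yields one TP, one FP, and one FN, so both precision and recall are 0.5.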

At the model deployment stage, our proposed model is mounted on UAVs that assist in SAR operations under the management of the control center. The UAVs collect relevant information on critical situations and detect survivors in disaster-affected areas. As the UAVs inspect an area, the images acquired via the UAV cameras are fed as inputs to our proposed model. The model detects each person present in the images and generates a detection box around that person as an output. The output images and the locations of the survivors are forwarded to the control center for evaluating the model performance and planning the SAR operations. Fig. 2 demonstrates our model applied to one of the samples in the test set.
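The step from raw detector output to the boxes forwarded to the control center might look like the sketch below. It assumes the detector returns normalized `[ymin, xmin, ymax, xmax]` boxes with a class id and a confidence score per detection, which is the convention of the TensorFlow Object Detection API; the paper does not describe its output format. The function name `select_person_boxes`, the class id of 1, and the score threshold of 0.5 are hypothetical choices.

```python
def select_person_boxes(boxes, classes, scores, image_width, image_height,
                        person_class_id=1, min_score=0.5):
    """Keep confident person detections and convert them to pixel coordinates.

    boxes:   list of normalized [ymin, xmin, ymax, xmax] values in [0, 1] (assumed format)
    classes: list of integer class ids, one per box
    scores:  list of confidence scores in [0, 1], one per box
    Returns a list of ((xmin, ymin, xmax, ymax) in pixels, score) tuples.
    """
    selected = []
    for (ymin, xmin, ymax, xmax), cls, score in zip(boxes, classes, scores):
        # Skip other classes and low-confidence detections.
        if cls != person_class_id or score < min_score:
            continue
        pixel_box = (round(xmin * image_width), round(ymin * image_height),
                     round(xmax * image_width), round(ymax * image_height))
        selected.append((pixel_box, score))
    return selected
```

For a 6000 × 4000 px image, a normalized box `[0.1, 0.2, 0.3, 0.4]` maps to the pixel box `(1200, 400, 2400, 1200)`. Converting such pixel boxes into ground locations of survivors would additionally need the UAV's position and camera geometry, which this sketch does not cover.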

Figure 2: Demonstration for human detection.

At the model improvement stage, we use the output images and data generated by the model at the end of the deployment stage to update the model, improving its performance and the precision of the detection boxes.

3. Performance Evaluations

Figure 3: Localization loss of detection boxes.



In this section, we analyze the performance of our proposed human detection model through simulations on the Semantic Drone dataset. We run our simulations on the Google Colab GPU backend using the TensorFlow API [9]. The results shown here are reported over 10,000 training steps. Fig. 3 shows the localization loss for the bounding box offset prediction, computed as the sum of squared errors. The loss decreases gradually over the training steps and reaches its minimum value of 0.3142 at training step 10,000.
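To make the loss in Fig. 3 concrete, a sum of squared errors over box offsets can be sketched as follows. Only the "sum of squared errors" form comes from the text above; the four-value offset encoding (for example center x, center y, width, height), the absence of any normalization, and the function name `localization_loss` are assumptions.

```python
def localization_loss(predicted_offsets, target_offsets):
    """Sum of squared errors between predicted and target box offsets.

    Each argument is a list of per-box offset tuples, e.g. (cx, cy, w, h);
    the exact encoding is an assumption and is not specified in the paper.
    """
    total = 0.0
    for pred, target in zip(predicted_offsets, target_offsets):
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total
```

A perfect prediction gives a loss of zero, and the loss grows quadratically with the offset error, so large box misplacements dominate the value.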

Figure 4: Detection boxes precision.

Fig. 4 shows the mean average precision (mAP) of the detection boxes, which increases with the training steps. Our model achieves its maximum mAP value of 0.7818 at training step 10,000.
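For readers unfamiliar with the metric, the sketch below computes average precision (AP) for a single class as the area under the interpolated precision-recall curve. The paper does not specify its evaluation protocol, so this all-point interpolation is only one common variant and is not claimed to reproduce the reported value; the name `average_precision` and its input format are hypothetical. The mAP would then be the mean of such AP values over classes and, in some protocols, over several IoU thresholds.

```python
def average_precision(scored_hits, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    scored_hits: list of (score, is_true_positive) pairs, one per detection
    num_ground_truth: total number of annotated objects of this class
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Interpolate: precision at each rank becomes the best precision at any later rank.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Accumulate area wherever recall increases.
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

With detections ranked as hit, miss, hit against two annotated objects, the interpolated curve gives an AP of 5/6.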

Figure 5: Detection boxes recall.

Fig. 5 shows the average recall (AR) of the model over 100 detections per image with respect to the training steps. Our model achieves its maximum AR value of 0.8276 at training step 10,000.

4. Conclusions

In this paper, we proposed the architecture of a search and rescue operation that works jointly with UAVs, for application in the smart city domain. According to the evaluation results, our proposed scheme can be deployed in emergency management information systems to provide the authorities with relevant data, thereby reducing the operational delay and increasing the survival rate of the victims in search and rescue operations. As future work, we aim to develop a model that can distinguish between safe and injured victims and integrate it into UAVs. The authorities could then prioritize rescuing the injured victims while UAVs provide safe victims with first-aid supplies.

Acknowledgement

This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-01287, Evolvable Deep Learning Model Generation Platform for Edge Computing).

*Dr. CS Hong is the corresponding author.

References

[1] “Natural events,” Aug 2012. [Online]. Available: https://www.who.int/environmental_health_emergencies/natural_events/en/

[2] N. Zhao, W. Lu, M. Sheng, Y. Chen, J. Tang, F. R. Yu, and K. Wong, “UAV-assisted emergency networks in disasters,” IEEE Wireless Communications, vol. 26, no. 1, pp. 45–51, 2019.

[3] M. Mozaffari, W. Saad, M. Bennis, Y. Nam, and M. Debbah, “A tutorial on UAVs for wireless networks: Applications, challenges, and open problems,” IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2334–2360, 2019.

[4] C. M. Thwal and C. S. Hong, “A UAV-assisted intelligent delivery system for smart city,” 2019.

[5] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep transfer learning,” CoRR, vol. abs/1808.01974, 2018. [Online]. Available: http://arxiv.org/abs/1808.01974

[6] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

[7] T. Lin, M. Maire, S. J. Belongie, L. D. Bourdev, R. B. Girshick, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” CoRR, vol. abs/1405.0312, 2014. [Online]. Available: http://arxiv.org/abs/1405.0312

[8] “News.” [Online]. Available: https://www.tugraz.at/index.php?id=22387

[9] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. [Online]. Available: http://tensorflow.org/


