Person Re-Identification in a Surveillance Camera Network

Person ReID is the task of associating cross-view images of the same person as he or
she moves through a network of non-overlapping cameras [1]. In recent years, along
with the development of surveillance camera systems, person re-identification (ReID)
has attracted increasing attention from the computer vision and pattern recognition
communities because of its promising applications in areas such as public safety and
security, human-robot interaction, and person retrieval. In its early years, person ReID
was considered a sub-task of Multi-Camera Tracking (MCT) [2], whose purpose is to
generate tracklets within each single field of view (FoV) and then associate the tracklets
that belong to the same pedestrian across different FoVs. In 2006, Gheissari et al. [3]
first considered person ReID as an independent task. In some respects, person ReID
and Multi-Target Multi-Camera Tracking (MTMCT) are closely related; however, the
two problems differ fundamentally in their objectives and evaluation metrics. The
objective of MTMCT is to determine the position of each pedestrian over time from
video streams captured by different cameras, whereas person ReID answers the question
"Which gallery images belong to a given probe person?" by returning a list of the
gallery persons sorted in descending order of their similarity to the query person.
Whereas MTMCT classifies a pair of images as co-identical or not, person ReID ranks
the gallery persons with respect to the given query person. Their performance is
therefore evaluated by different metrics: classification error rates for MTMCT and
ranking performance for ReID.
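As an illustration of this ranking behaviour, the short Python sketch below ranks a
gallery by descending cosine similarity to a query and checks a rank-1 match. The
feature vectors, the similarity measure, and the names used here are illustrative
assumptions for demonstration only, not the descriptors or metrics studied in this
thesis.

import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted by descending cosine similarity to the query."""
    # L2-normalize so that a dot product equals cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                 # one similarity score per gallery image
    return np.argsort(-sims)     # indices in descending order of similarity

# Toy example with hypothetical 3-dimensional appearance features.
gallery = np.array([[0.9, 0.1, 0.0],
                    [0.1, 0.8, 0.1],
                    [0.7, 0.2, 0.1],
                    [0.0, 0.1, 0.9]])
query = np.array([0.8, 0.15, 0.05])
ranking = rank_gallery(query, gallery)   # -> array([0, 2, 1, 3])
# Rank-k evaluation (summarized by the CMC curve): the query is a rank-k hit
# if its true identity appears among the first k entries of the ranking.
rank1_hit = ranking[0] == 0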

It is worth noting that, in the case of an overlapping camera network, the corresponding
images of the same person can be found through data association; this can be considered
a person tracking problem, which is out of the scope of this thesis. In the last decade,
thanks to unremitting efforts, person ReID has reached many important milestones with
impressive results [4, 5, 6, 7, 8]; however, it remains a challenging task that confronts
various difficulties. These difficulties and challenges will be presented in a later section.
First of all, the mathematical formulation of person ReID is given as follows.
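In generic terms (the notation here is illustrative and may differ from the symbols
adopted elsewhere in the thesis), the task can be stated as follows. Given a probe
image $q$ and a gallery set $\mathcal{G} = \{g_1, g_2, \dots, g_N\}$, a similarity
score $s(q, g_i)$ is computed between the probe and every gallery image, and ReID
returns the permutation $(g_{i_1}, g_{i_2}, \dots, g_{i_N})$ of the gallery such that
\[
s(q, g_{i_1}) \ge s(q, g_{i_2}) \ge \dots \ge s(q, g_{i_N}),
\]
with the best match given by
\[
g^{*} = \arg\max_{g \in \mathcal{G}} s(q, g).
\]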

MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
NGUYEN THUY BINH
PERSON RE-IDENTIFICATION
IN A SURVEILLANCE CAMERA NETWORK
DOCTORAL DISSERTATION OF
ELECTRONICS ENGINEERING
Hanoi - 2020
MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
NGUYEN THUY BINH
PERSON RE-IDENTIFICATION
IN A SURVEILLANCE CAMERA NETWORK
Major: Electronics Engineering
Code: 9520203
DOCTORAL DISSERTATION OF
ELECTRONICS ENGINEERING
SUPERVISORS:
1. Assoc. Prof. Pham Ngoc Nam
2. Assoc. Prof. Le Thi Lan
Hanoi - 2020
DECLARATION OF AUTHORSHIP
I, Nguyen Thuy Binh, declare that the thesis titled "Person re-identification in
a surveillance camera network" has been entirely composed by myself. I confirm the
following points:
• This work was done wholly or mainly while in candidature for a Ph.D. research
degree at Hanoi University of Science and Technology.
• The work has not been submitted for any other degree or qualification at Hanoi
University of Science and Technology or any other institution.
• Appropriate acknowledgement has been given within this thesis where reference has
been made to the published work of others.
• The thesis submitted is my own, except where work done in collaboration has been
included; the collaborative contributions have been clearly indicated.
Hanoi, 24/11/2020
PhD Student
SUPERVISORS
ACKNOWLEDGEMENT
This dissertation was written during my doctoral course at the School of Electronics
and Telecommunications (SET) and the International Research Institute of Multimedia,
Information, Communication and Applications (MICA), Hanoi University of Science
and Technology (HUST). I am grateful to all the people who have supported and
encouraged me in completing this study.
First, I would like to express my sincere gratitude to my advisors, Assoc. Prof. Pham
Ngoc Nam and Assoc. Prof. Le Thi Lan, for their effective guidance, patience,
continuous support and encouragement, and immense knowledge.
I would like to express my gratitude to Dr. Vo Le Cuong and Dr. Ha Thi Thu Lan for
their help. I would like to thank all members of the School of Electronics and Telecom-
munications and the International Research Institute of Multimedia, Information,
Communications and Applications (MICA), Hanoi University of Science and Technology
(HUST), as well as all of my colleagues in the Faculty of Electrical-Electronic
Engineering, University of Transport and Communications (UTC). They have always
helped me in the research process and given me helpful advice for overcoming my own
difficulties. Moreover, attending scientific conferences has always been a great
experience through which I have received many useful comments.
During my PhD course, I have received a great deal of support from the Management
Board of the School of Electronics and Telecommunications, the MICA Institute, and
the Faculty of Electrical-Electronic Engineering. My sincere thanks go to Assoc. Prof.
Nguyen Huu Thanh, Dr. Nguyen Viet Son, and Assoc. Prof. Nguyen Thanh Hai, who
gave me a lot of support and help. Without their precious support, it would have been
impossible to conduct this research. Thanks to my employer, the University of Transport
and Communications (UTC), for all the necessary support and encouragement during my
PhD journey. I am also grateful to Vietnam's Program 911 and to HUST and UTC projects
for their generous financial support. Special thanks to my family and relatives,
particularly my beloved husband and our children, for their never-ending support and
sacrifice.
Hanoi, 2020
Ph.D. Student
CONTENTS
DECLARATION OF AUTHORSHIP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
ACKNOWLEDGEMENT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
SYMBOLS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
INTRODUCTION. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
CHAPTER 1. LITERATURE REVIEW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1. Person ReID classifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.1. Single-shot versus Multi-shot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.2. Closed-set versus Open-set person ReID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.1.3. Supervised and unsupervised person ReID . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2. Datasets and evaluation metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.1. Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.2. Evaluation metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3. Feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.1. Hand-designed features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3.2. Deep-learned features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4. Metric learning and person matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.4.1. Metric learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.4.2. Person matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.5. Fusion schemes for person ReID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.6. Representative frame selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.7. Fully automated person ReID systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.8. Research on person ReID in Vietnam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
CHAPTER 2. MULTI-SHOT PERSON RE-ID THROUGH REPRESENTATIVE
FRAMES SELECTION AND TEMPORAL FEATURE POOLING . . . . . . . . . . 36
2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.2. Proposed method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.2.1. Overall framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.2.2. Representative image selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2.3. Image-level feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2.4. Temporal feature pooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.2.5. Person matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.3. Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.3.1. Evaluation of representative frame extraction and temporal feature pooling
schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.3.2. Quantitative evaluation of the trade-off between the accuracy and compu-
tational time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.3.3. Comparison with state-of-the-art methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.4. Conclusions and Future work. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
CHAPTER 3. PERSON RE-ID PERFORMANCE IMPROVEMENT BASED
ON FUSION SCHEMES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.2. Fusion schemes for the first setting of person ReID . . . . . . . . . . . . . . . . . . . . . . . 69
3.2.1. Image-to-images person ReID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.2.2. Images-to-images person ReID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2.3. Obtained results on the first setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.3. Fusion schemes for the second setting of person ReID . . . . . . . . . . . . . . . . . . . . 82
3.3.1. The proposed method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.3.2. Obtained results on the second setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
CHAPTER 4. QUANTITATIVE EVALUATION OF AN END-TO-END
PERSON REID PIPELINE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.2. An end-to-end person ReID pipeline. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.2.1. Pedestrian detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.2.2. Pedestrian tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.3. Person ReID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.3. GOG descriptor re-implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.3.1. Comparison of the performance of two implementations . . . . . . . . . . . . . 99
4.3.2. Analysis of the effect of GOG parameters . . . . . . . . . . . . . . . . . . . . . 99
4.4. Performance evaluation of an end-to-end person ReID pipeline . . . . . . . . . . 101
4.4.1. The effect of human detection and segmentation on person ReID in single-
shot scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.4.2. The effect of human detection and segmentation on person ReID in multi-
shot scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.5. Conclusions and Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
PUBLICATIONS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
ABBREVIATIONS
No. Abbreviation Meaning
1 ACF Aggregate Channel Features
2 AIT Austrian Institute of Technology
3 AMOC Accumulative Motion Context
4 BOW Bag of Words
5 CAR Learning Compact Appearance Representation
6 CIE The International Commission on Illumination
7 CFFM Comprehensive Feature Fusion Mechanism
8 CMC Cumulative Matching Characteristic
9 CNN Convolutional Neural Network
10 CPM Convolutional Pose Machines
11 CVPDL Cross-view Projective Dictionary Learning
12 CVPR Conference on Computer Vision and Pattern Recognition
13 DDLM Discriminative Dictionary Learning Method
14 DDN Deep Decompositional Network
15 DeepSORT Deep learning Simple Online and Realtime Tracking
16 DFGP Deep Feature Guided Pooling
17 DGM Dynamic Graph Matching
18 DPM Deformable Part-Based Model
19 ECCV European Conference on Computer Vision
20 FAST 3D Fast Adaptive Spatio-Temporal 3D
21 FEP Flow Energy Profile
22 FNN Feature Fusion Network
23 FPNN Filter Pairing Neural Network
24 GOG Gaussian of Gaussian
25 GRU Gated Recurrent Unit
26 HOG Histogram of Oriented Gradients
27 HUST Hanoi University of Science and Technology
28 IBP Indian Buffet Process
29 ICCV International Conference on Computer Vision
30 ICIP International Conference on Image Processing
31 IDE ID-Discriminative Embedding
32 iLIDS-VID Imagery Library for Intelligent Detection Systems
33 ILSVRC ImageNet Large Scale Visual Recognition Competition
34 ISR Iterative Sparse Ranking
35 KCF Kernelized Correlation Filter
36 KDES Kernel DEScriptor
37 KISSME Keep It Simple and Straightforward MEtric
38 kNN k-Nearest Neighbour
39 KXQDA Kernel Cross-view Quadratic Discriminant Analysis
40 LADF Locally-Adaptive Decision Functions
41 LBP Local Binary Pattern
42 LDA Linear Discriminant Analysis
43 LDFV Local Descriptors encoded by Fisher Vector
44 LMNN Large Margin Nearest Neighbor
45 LMNN-R Large Margin Nearest Neighbor with Rejection
46 LOMO LOcal Maximal Occurrence
47 LSTM Long-Short Term Memory
48 LSTMC Long Short-Term Memory network with a Coupled gate
49 mAP mean Average Precision
50 MAPR Multimedia Analysis and Pattern Recognition
51 Mask R-CNN Mask Region with CNN
52 MCT Multi-Camera Tracking
53 MCCNN Multi-Channel CNN
54 MCML Maximally Collapsing Metric Learning
55 MGCAM Mask-Guided Contrastive Attention Model
56 ML Machine Learning
57 MLAPG Metric Learning by Accelerated Proximal Gradient
58 MLR Metric Learning to Rank
59 MOT Multiple Object Tracking
60 MSCR Maximal Stable Color Region
61 MSVF Maximally Stable Video Frame
62 MTMCT Multi-Target Multi-Camera Tracking
63 Person ReID Person Re-Identification
64 Pedparsing Pedestrian Parsing
65 PPN Pose Prediction Network
66 PRW Person Re-identification in the Wild
67 QDA Quadratic Discriminant Analysis
68 RAiD Re-Identification Across indoor-outdoor Dataset
69 RAP Richly Annotated Pedestrian
70 ResNet Residual Neural Network
71 RHSP Recurrent High-Structured Patches
72 RKHS Reproducing Kernel Hilbert Space
73 RNN Recurrent Neural Network
74 ROIs Regions of Interest
75 SDALF Symmetry-Driven Accumulation of Local Features
76 SCNCD Salient Color Names based Color Descriptor
...
pp. 213–216. IEEE.
[92] Zeng M., Tian C., and Wu Z. (2018). Person re-identification with hierarchi-
cal deep learning feature and efficient xqda metric. In 2018 ACM Multimedia
Conference on Multimedia Conference, pp. 1838–1846. ACM.
[93] Eisenbach M., Kolarow A., Vorndran A., Niebling J., and Gross H.M. (2015).
Evaluation of multi feature fusion at score-level for appearance-based person
re-identification. In 2015 International Joint Conference on Neural Networks
(IJCNN), pp. 1–8. IEEE.
[94] Lejbølle A.R., Nasrollahi K., and Moeslund T.B. (2017). Enhancing person re-
identification by late fusion of low-, mid- and high-level features. IET Biometrics.
[95] Zheng L., Wang S., Tian L., He F., Liu Z., and Tian Q. (2015). Query-adaptive
late fusion for image search and person re-identification. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, pp. 1741–1750.
[96] Zhao H., Tian M., Sun S., Shao J., Yan J., Yi S., Wang X., and Tang X. (2017).
Spindle net: Person re-identification with human body region guided feature de-
composition and fusion. In Proceedings of the IEEE conference on computer
vision and pattern recognition, pp. 1077–1085.
[97] Wei S.E., Ramakrishna V., Kanade T., and Sheikh Y. (2016). Convolutional
pose machines. In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, pp. 4724–4732.
[98] Xin W., Dongdong G., Peng L., and Zhe J. (2016). Person re-identification by
features fusion. In Information Technology, Networking, Electronic and Automa-
tion Control Conference, pp. 285–289. IEEE.
[99] Wu S., Chen Y.C., Li X., Wu A.C., You J.J., and Zheng W.S. (2016). An
enhanced deep feature representation for person re-identification. In 2016 IEEE
Winter Conference on Applications of Computer Vision (WACV), pp. 1–8. IEEE.
[100] Liu K., Ma B., Zhang W., and Huang R. (2015). A spatio-temporal appearance
representation for video-based pedestrian re-identification. In Proceedings of the
IEEE International Conference on Computer Vision, pp. 3810–3818.
[101] Zhang W., Hu S., and Liu K. (2017). Learning compact appearance representation
for video-based person re-identification. arXiv preprint arXiv:1702.06294.
[102] Wang T., Gong S., Zhu X., and Wang S. (2016). Person re-identification by
discriminative selection in video ranking. IEEE Trans. Pattern Anal. Mach.
Intell., 38(12):pp. 2501–2514.
[103] Frikha M., Chebbi O., Fendri E., and Hammami M. (2016). Key frame selection
for multi-shot person re-identification. In International Workshop on Representa-
tions, Analysis and Recognition of Shape and Motion FroM Imaging Data, pp.
97–110. Springer.
[104] Hassen Y.H., Ayedi W., Ouni T., and Jallouli M. (2015). Multi-shot person re-
identification approach based key frame selection. In Eighth International Con-
ference on Machine Vision (ICMV 2015), volume 9875, p. 98751H. International
Society for Optics and Photonics.
[105] Hassen Y.H., Loukil K., Ouni T., and Jallouli M. (2017). Images selection and best
descriptor combination for multi-shot person re-identification. In International
Conference on Intelligent Interactive Multimedia Systems and Services, pp. 11–20.
Springer.
[106] El-Alfy H., Muramatsu D., Teranishi Y., Nishinaga N., Makihara Y., and Yagi
Y. (2017). A visual surveillance system for person re-identification. In Thirteenth
International Conference on Quality Control by Artificial Vision 2017, volume
10338, p. 103380D. International Society for Optics and Photonics.
[107] Zheng L., Zhang H., Sun S., Chandraker M., Yang Y., and Tian Q. (2017). Person
re-identification in the wild. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 1367–1376.
[108] Song C., Huang Y., Ouyang W., and Wang L. (2018). Mask-guided contrastive at-
tention model for person re-identification. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, pp. 1179–1188.
[109] Dollár P., Appel R., Belongie S., and Perona P. (2014). Fast feature pyramids
for object detection. IEEE Transactions on Pattern Analysis and Machine Intel-
ligence, 36(8):pp. 1532–1545.
[110] Redmon J. and Farhadi A. (2018). Yolov3: An incremental improvement. arXiv
preprint arXiv:1804.02767.
[111] He K., Gkioxari G., Dollár P., and Girshick R. (2017). Mask r-cnn. In Proceedings
of the IEEE international conference on computer vision, pp. 2961–2969.
[112] Luo P., Wang X., and Tang X. (2013). Pedestrian parsing via deep decomposi-
tional network. In Proceedings of the IEEE International Conference on Computer
Vision, pp. 2648–2655.
[113] Nguyen T.B., Van Phu P., Le T.L., and Le C.V. (2016). Background removal for
improving saliency-based person re-identification. In 2016 Eighth International
Conference on Knowledge and Systems Engineering (KSE), pp. 339–344. IEEE.
[114] McGuinness K. and O’Connor N.E. (2008). The k-space segmentation tool set.
[115] Le C.V., Tuan N.N., Hong Q.N., and Lee H.J. (2017). Evaluation of recurrent
neural network variants for person re-identification. IEIE Transactions on Smart
Processing & Computing, 6(3):pp. 193–199.
[116] Pham T.T.T., Le T.L., Dao T.K., and Le D.H. (2015). A robust model for person
re-identification in multimodal person localization. UBICOMM 2015, p. 51.
[117] Bo L., Ren X., and Fox D. (2010). Kernel descriptors for visual recognition. In
Advances in Neural Information Processing Systems, pp. 244–252.
[118] Nguyen N.B., Nguyen V.H., Duc T.N., Duong D.A., et al. (2015). Using attribute
relationships for person re-identification. In Knowledge and Systems Engineering,
pp. 195–207. Springer.
[119] Nguyen N.B., Nguyen V.H., Duc T.N., Le D.D., and Duong D.A. (2015). Attrel:
an approach to person re-identification by exploiting attribute relationships. In
International Conference on Multimedia Modeling, pp. 50–60. Springer.
[120] Layne R., Hospedales T.M., and Gong S. (2014). Attributes-based re-
identification. In Person re-identification, pp. 93–117. Springer.
[121] Nguyen N.B., Nguyen V.H., Ngo T.D., and Nguyen K.M. (2017). Person re-
identification with mutual re-ranking. Vietnam Journal of Computer Science,
4(4):pp. 233–244.
[122] Nguyen V.H., Nguyen K., Le D.D., Duong D.A., and Satoh S. (2013). Person
re-identification using deformable part models. In International Conference on
Neural Information Processing, pp. 616–623. Springer.
[123] Viet N.C., Cong D.T., and Ho-Phuoc T. (2015). Manifold-based learning for per-
son re-identification. In 2015 International Conference on Advanced Technologies
for Communications (ATC), pp. 688–691. IEEE.
[124] Le T.L., Thonnat M., Boucher A., and Brémond F. (2009). Appearance based
retrieval for tracked objects in surveillance videos. In Proceedings of the ACM
International Conference on Image and Video Retrieval, CIVR ’09, pp. 40:1–
40:8. ACM, New York, NY, USA. ISBN 978-1-60558-480-5. doi:10.1145/1646396.
1646444.
[125] Lucas B.D., Kanade T., et al. (1981). An iterative image registration technique
with an application to stereo vision.
[126] Li P., Wang Q., and Zhang L. (2013). A novel earth mover’s distance methodology
for image matching with Gaussian mixture models. In Proceedings of the IEEE
International Conference on Computer Vision, pp. 1689–1696.
[127] Singh B., Parwate D., and Shukla S. (2009). Radiosterilization of fluoroquinolones
and cephalosporins: Assessment of radiation damage on antibiotics by changes
in optical property and colorimetric parameters. AAPS PharmSciTech, 10(1):pp.
34–43.
[128] Wikipedia (2020). Illuminant D65. https://en.wikipedia.org/wiki/
Illuminant_D65/. [Online; accessed 10-March-2020].
[129] Popov V., Ostarek M., and Tenison C. (2018). Practices and pitfalls in inferring
neural representations. NeuroImage, 174:pp. 340–351.
[130] John Lu Z. (2010). The elements of statistical learning: data mining, inference,
and prediction. Journal of the Royal Statistical Society: Series A (Statistics in
Society), 173(3):pp. 693–694.
[131] Li Z., Chang S., Liang F., Huang T.S., Cao L., and Smith J.R. (2013). Learning
locally-adaptive decision functions for person verification. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, pp. 3610–3617.
[132] Geng S., Yu M., Liu Y., Yu Y., and Bai J. (2018). Re-ranking pedestrian re-
identification with multiple metrics. Multimedia Tools and Applications, pp. 1–23.
[133] Li M., Zhu X., and Gong S. (2018). Unsupervised person re-identification by
deep learning tracklet association. In Proceedings of the European Conference on
Computer Vision (ECCV), pp. 737–753.
[134] Li M., Zhu X., and Gong S. (2019). Unsupervised tracklet person re-identification.
IEEE Transactions on Pattern Analysis and Machine Intelligence.
[135] Zeng Z., Li Z., Cheng D., Zhang H., Zhan K., and Yang Y. (2017). Two-
stream multirate recurrent neural network for video-based pedestrian reidentifi-
cation. IEEE Transactions on Industrial Informatics, 14(7):pp. 3179–3186.
[136] Liu H., Jie Z., Jayashree K., Qi M., Jiang J., Yan S., and Feng J. (2017). Video-
based person re-identification with accumulative motion context. IEEE Transac-
tions on Circuits and Systems for Video Technology, 28(10):pp. 2788–2802.
[137] Liu Z., Chen J., and Wang Y. (2016). A fast adaptive spatio-temporal 3d feature
for video-based person re-identification. In 2016 IEEE International Conference
on Image Processing (ICIP), pp. 4294–4298. IEEE.
[138] Li Y., Zhuo L., Li J., Zhang J., Liang X., and Tian Q. (2017). Video-based
person re-identification by deep feature guided pooling. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition Workshops, pp. 39–46.
[139] Zhang D., Wu W., Cheng H., Zhang R., Dong Z., and Cai Z. (2017). Image-
to-video person re-identification with temporally memorized similarity learning.
IEEE Transactions on Circuits and Systems for Video Technology.
[140] Wang G., Lai J., and Xie X. (2017). P2snet: Can an image match a video for
person re-identification in an end-to-end way? IEEE Transactions on Circuits
and Systems for Video Technology.
[141] Ojala T., Pietikainen M., and Harwood D. (1994). Performance evaluation of
texture measures with classification based on Kullback discrimination of distribu-
tions. In Proceedings of the 12th IAPR International Conference on Pattern
Recognition, Vol. 1: Computer Vision & Image Processing, volume 1, pp. 582–585.
IEEE.
[142] Zheng Y., Sheng H., Zhang B., Zhang J., and Xiong Z. (2015). Weight-based
sparse coding for multi-shot person re-identification. Science China Information
Sciences, 58(10):pp. 1–15.
[143] Jia Y. et al. (2013). Caffe: an open source convolutional architecture for fast
feature embedding.
[144] Kittler J., Hatef M., Duin R.P., and Matas J. (1998). On combining classifiers.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3):pp. 226–
239.
[145] Kittler J., Hatef M., Duin R.P.W., and Matas J. (Mar 1998). On combining
classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence,
20(3):pp. 226–239. ISSN 0162-8828. doi:10.1109/34.667881.
[146] Lisanti G., Masi I., Bagdanov A.D., and Del Bimbo A. (2015). Person re-
identification by iterative re-weighted sparse ranking. IEEE Transactions on Pat-
tern Analysis and Machine Intelligence, 37(8):pp. 1629–1642.
[147] Sheng H., Zhou X., Zheng Y., Liu Y., and Yang D. (2017). Person re-identification
with discriminative dictionary learning. DEStech Transactions on Computer Sci-
ence and Engineering, (csae).
[148] Chen L., Yang H., Zhu J., Zhou Q., Wu S., and Gao Z. (2017). Deep spatial-
temporal fusion network for video-based person re-identification. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition Workshops,
pp. 63–70.
[149] Chen L., Yang H., and Gao Z. (2020). Comprehensive feature fusion mechanism
for video-based person re-identification via significance-aware attention. Signal
Processing: Image Communication, p. 115835.
[150] Ren S., He K., Girshick R., and Sun J. (2015). Faster r-cnn: Towards real-time
object detection with region proposal networks. In Advances in Neural Information
Processing Systems, pp. 91–99.
[151] Friedman J., Hastie T., Tibshirani R., et al. (2000). Additive logistic regression:
a statistical view of boosting (with discussion and a rejoinder by the authors).
The Annals of Statistics, 28(2):pp. 337–407.
[152] Redmon J., Divvala S., Girshick R., and Farhadi A. (2016). You only look once:
Unified, real-time object detection. In Proceedings of the IEEE conference on
computer vision and pattern recognition, pp. 779–788.
[153] Nguyen H.Q., Nguyen T.B., Le T.A., Le T.L., Vu T.H., and Noe A. (2019).
Comparative evaluation of human detection and tracking approaches for online
tracking applications. In 2019 International Conference on Advanced Technologies
for Communications (ATC), pp. 348–353. IEEE.
[154] Matsukawa T., Okabe T., Suzuki E., and Sato Y. (2017). Hierarchical gaus-
sian descriptors with application to person re-identification. arXiv preprint
arXiv:1706.04318.
[155] Liu H., Qin L., Cheng Z., and Huang Q. (2013). Set-based classification for
person re-identification utilizing mutual-information. In 2013 IEEE International
Conference on Image Processing, pp. 3078–3082. IEEE.
[156] Tian M., Yi S., Li H., Li S., Zhang X., Shi J., Yan J., and Wang X. (2018).
Eliminating background-bias for robust person re-identification. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5794–
5803.
[157] Ghorbel M., Ammar S., Kessentini Y., and Jmaiel M. (2019). Improving per-
son re-identification by background subtraction using two-stream convolutional
networks. In International Conference on Image Analysis and Recognition, pp.
345–356. Springer.
[158] Zheng L., Bie Z., Sun Y., Wang J., Su C., Wang S., and Tian Q. (2016). MARS:
A video benchmark for large-scale person re-identification. In European Conference
on Computer Vision. Springer.
[159] Liu Z., Zhang Z., Wu Q., and Wang Y. (2015). Enhancing person re-identification
by integrating gait biometric. Neurocomputing, 168:pp. 1144–1156. ISSN 0925-
2312. doi:https://doi.org/10.1016/j.neucom.2015.05.008.
[160] Li W., Zhu X., and Gong S. (2018). Harmonious attention network for person re-
identification. In 2018 IEEE/CVF Conference on Computer Vision and Pattern
Recognition, pp. 2285–2294.
[161] Yu H.X., Wu A., and Zheng W.S. (2018). Unsupervised person re-identification
by deep asymmetric metric embedding. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 42:pp. 956–973.
[162] Leng Q., Ye M., and Tian Q. (2020). A survey of open-world person re-
identification. IEEE Transactions on Circuits and Systems for Video Technology,
30(4):pp. 1092–1108.
