Person Re-Identification in a Surveillance Camera Network
Person ReID is the task of associating cross-view images of the same person as he/she moves through a non-overlapping camera network [1]. In recent years, along with the development of surveillance camera systems, person re-identification (ReID) has attracted increasing attention from the computer vision and pattern recognition communities because of its promising applications in many areas, such as public safety and security, human-robot interaction, and person retrieval. In the early years, person ReID was considered a sub-task of Multi-Camera Tracking (MCT) [2]. The purpose of MCT is to generate tracklets in every single field of view (FoV) and then associate the tracklets that belong to the same pedestrian across different FoVs. In 2006, Gheissari et al. [3] first treated person ReID as an independent task. In certain respects, person ReID and Multi-Target Multi-Camera Tracking (MTMCT) are close to each other. However, the two problems are fundamentally different in terms of objective and evaluation metrics. While the objective of MTMCT is to determine the position of each pedestrian over time from video streams taken by different cameras, person ReID tries to answer the question "Which gallery images belong to a given probe person?" and returns a list of the gallery persons sorted in descending order of their similarity to the query person. Whereas MTMCT classifies a pair of images as co-identical or not, person ReID ranks the gallery persons with respect to the given query person. Therefore, their performance is evaluated by different metrics: classification error rates for MTMCT and ranking performance for ReID. It is worth noting that in the case of an overlapping camera network, the corresponding images of the same person can be found through data association, which makes it a person tracking problem and is therefore out of the scope of this thesis. In the last decade, thanks to unremitting efforts, person ReID has reached numerous important milestones with many notable results [4, 5, 6, 7, 8]; however, it remains a challenging task that confronts various difficulties. These difficulties and challenges will be presented in a later section. First of all, the mathematical formulation of person ReID is given as follows.
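The formulation itself is cut off in this excerpt. As a placeholder, a standard formulation consistent with the ranking description above can be sketched; the symbols q, g_i, f, d, and pi below are illustrative choices, not necessarily the thesis's own notation:

```latex
% Gallery of N person images and a probe (query) image q
G = \{ g_i \}_{i=1}^{N}
% A feature extractor f(\cdot) maps an image to a descriptor, and
% d(\cdot,\cdot) is a (possibly learned) distance between descriptors.
% Person ReID returns the gallery indices sorted by increasing
% distance to the probe, i.e., by decreasing similarity:
\pi = \operatorname*{arg\,sort}_{i \in \{1,\dots,N\}} d\!\left( f(q),\, f(g_i) \right)
% The top-ranked gallery identity is then
i^{*} = \operatorname*{arg\,min}_{i \in \{1,\dots,N\}} d\!\left( f(q),\, f(g_i) \right)
```

Under this view, MTMCT reduces to a binary decision on pairs, while ReID evaluates the whole permutation pi, which is why ranking metrics are the natural performance measure.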
MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

NGUYEN THUY BINH

PERSON RE-IDENTIFICATION IN A SURVEILLANCE CAMERA NETWORK

Major: Electronics Engineering
Code: 9520203

DOCTORAL DISSERTATION OF ELECTRONICS ENGINEERING

SUPERVISORS:
1. Assoc. Prof. Pham Ngoc Nam
2. Assoc. Prof. Le Thi Lan

Hanoi, 2020

DECLARATION OF AUTHORSHIP

I, Nguyen Thuy Binh, declare that the thesis titled "Person re-identification in a surveillance camera network" has been entirely composed by myself. I assure the following points:

- This work was done wholly or mainly while in candidature for a Ph.D. research degree at Hanoi University of Science and Technology.
- The work has not been submitted for any other degree or qualification at Hanoi University of Science and Technology or any other institution.
- Appropriate acknowledgement has been given within this thesis where reference has been made to the published work of others.
- The thesis submitted is my own, except where work done in collaboration has been included. The collaborative contributions have been clearly indicated.

Hanoi, 24/11/2020
PhD Student                SUPERVISORS

ACKNOWLEDGEMENT

This dissertation was written during my doctoral course at the School of Electronics and Telecommunications (SET) and the International Research Institute of Multimedia, Information, Communication and Applications (MICA), Hanoi University of Science and Technology (HUST). I am grateful to all the people who have supported and encouraged me in completing this study. First, I would like to express my sincere gratitude to my advisors, Assoc. Prof. Pham Ngoc Nam and Assoc. Prof. Le Thi Lan, for their effective guidance, their patience, their continuous support and encouragement, and their immense knowledge. I would like to express my gratitude to Dr.
Vo Le Cuong and Dr. Ha Thi Thu Lan for their help. I would like to thank all the members of the School of Electronics and Telecommunications and the International Research Institute of Multimedia, Information, Communications and Applications (MICA), Hanoi University of Science and Technology (HUST), as well as all of my colleagues in the Faculty of Electrical-Electronic Engineering, University of Transport and Communications (UTC). They have always helped me in the research process and given me helpful advice to overcome my own difficulties. Moreover, attending scientific conferences has always been a great experience through which I received many useful comments. During my PhD course, I have received much support from the Management Boards of the School of Electronics and Telecommunications, the MICA Institute, and the Faculty of Electrical-Electronic Engineering. My sincere thanks to Assoc. Prof. Nguyen Huu Thanh, Dr. Nguyen Viet Son, and Assoc. Prof. Nguyen Thanh Hai, who gave me a lot of support and help. Without their precious support, it would have been impossible to conduct this research. Thanks to my employer, University of Transport and Communications (UTC), for all the necessary support and encouragement during my PhD journey. I am also grateful to Vietnam's Program 911 and the HUST and UTC projects for their generous financial support. Special thanks to my family and relatives, particularly my beloved husband and our children, for their never-ending support and sacrifice.

Hanoi, 2020
Ph.D. Student

CONTENTS

DECLARATION OF AUTHORSHIP
ACKNOWLEDGEMENT
CONTENTS
SYMBOLS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION

CHAPTER 1. LITERATURE REVIEW
1.1. Person ReID classifications
1.1.1. Single-shot versus Multi-shot
1.1.2. Closed-set versus Open-set person ReID
1.1.3. Supervised and unsupervised person ReID
1.2. Datasets and evaluation metrics
1.2.1. Datasets
1.2.2. Evaluation metrics
1.3. Feature extraction
1.3.1. Hand-designed features
1.3.2. Deep-learned features
1.4. Metric learning and person matching
1.4.1. Metric learning
1.4.2. Person matching
1.5. Fusion schemes for person ReID
1.6. Representative frame selection
1.7. Fully automated person ReID systems
1.8. Research on person ReID in Vietnam

CHAPTER 2. MULTI-SHOT PERSON RE-ID THROUGH REPRESENTATIVE FRAMES SELECTION AND TEMPORAL FEATURE POOLING
2.1. Introduction
2.2. Proposed method
2.2.1. Overall framework
2.2.2. Representative image selection
2.2.3. Image-level feature extraction
2.2.4. Temporal feature pooling
2.2.5. Person matching
2.3. Experimental results
2.3.1. Evaluation of representative frame extraction and temporal feature pooling schemes
2.3.2. Quantitative evaluation of the trade-off between accuracy and computational time
2.3.3. Comparison with state-of-the-art methods
2.4. Conclusions and Future work

CHAPTER 3. PERSON RE-ID PERFORMANCE IMPROVEMENT BASED ON FUSION SCHEMES
3.1. Introduction
3.2. Fusion schemes for the first setting of person ReID
3.2.1. Image-to-images person ReID
3.2.2. Images-to-images person ReID
3.2.3. Obtained results on the first setting
3.3. Fusion schemes for the second setting of person ReID
3.3.1. The proposed method
3.3.2. Obtained results on the second setting
3.4. Conclusions

CHAPTER 4. QUANTITATIVE EVALUATION OF AN END-TO-END PERSON REID PIPELINE
4.1. Introduction
4.2. An end-to-end person ReID pipeline
4.2.1. Pedestrian detection
4.2.2. Pedestrian tracking
4.2.3. Person ReID
4.3. GOG descriptor re-implementation
4.3.1. Comparison of the performance of two implementations
4.3.2. Analysis of the effect of GOG parameters
4.4. Evaluation of the performance of an end-to-end person ReID pipeline
4.4.1. The effect of human detection and segmentation on person ReID in the single-shot scenario
4.4.2. The effect of human detection and segmentation on person ReID in the multi-shot scenario
4.5. Conclusions and Future work

PUBLICATIONS
Bibliography

ABBREVIATIONS
ACF: Aggregate Channel Features
AIT: Austrian Institute of Technology
AMOC: Accumulative Motion Context
BOW: Bag of Words
CAR: Learning Compact Appearance Representation
CIE: The International Commission on Illumination
CFFM: Comprehensive Feature Fusion Mechanism
CMC: Cumulative Matching Characteristic
CNN: Convolutional Neural Network
CPM: Convolutional Pose Machines
CVPDL: Cross-view Projective Dictionary Learning
CVPR: Conference on Computer Vision and Pattern Recognition
DDLM: Discriminative Dictionary Learning Method
DDN: Deep Decompositional Network
DeepSORT: Deep learning Simple Online and Realtime Tracking
DFGP: Deep Feature Guided Pooling
DGM: Dynamic Graph Matching
DPM: Deformable Part-Based Model
ECCV: European Conference on Computer Vision
FAST 3D: Fast Adaptive Spatio-Temporal 3D
FEP: Flow Energy Profile
FNN: Feature Fusion Network
FPNN: Filter Pairing Neural Network
GOG: Gaussian of Gaussian
GRU: Gated Recurrent Unit
HOG: Histogram of Oriented Gradients
HUST: Hanoi University of Science and Technology
IBP: Indian Buffet Process
ICCV: International Conference on Computer Vision
ICIP: International Conference on Image Processing
IDE: ID-Discriminative Embedding
iLIDS-VID: Imagery Library for Intelligent Detection Systems
ILSVRC: ImageNet Large Scale Visual Recognition Competition
ISR: Iterative Sparse Ranking
KCF: Kernelized Correlation Filter
KDES: Kernel DEScriptor
KISSME: Keep It Simple and Straightforward MEtric
kNN: k-Nearest Neighbour
KXQDA: Kernel Cross-view Quadratic Discriminative Analysis
LADF: Locally-Adaptive Decision Functions
LBP: Local Binary Pattern
LDA: Linear Discriminant Analysis
LDFV: Local Descriptors encoded by Fisher Vector
LMNN: Large Margin Nearest Neighbor
LMNN-R: Large Margin Nearest Neighbor with Rejection
LOMO: LOcal Maximal Occurrence
LSTM: Long-Short Term Memory
LSTMC: Long Short-Term Memory network with a Coupled gate
mAP: mean Average Precision
MAPR: Multimedia Analysis and Pattern Recognition
Mask R-CNN: Mask Region with CNN
MCT: Multi-Camera Tracking
MCCNN: Multi-Channel CNN
MCML: Maximally Collapsing Metric Learning
MGCAM: Mask-Guided Contrastive Attention Model
ML: Machine Learning
MLAPG: Metric Learning by Accelerated Proximal Gradient
MLR: Metric Learning to Rank
MOT: Multiple Object Tracking
MSCR: Maximal Stable Color Region
MSVF: Maximally Stable Video Frame
MTMCT: Multi-Target Multi-Camera Tracking
Person ReID: Person Re-Identification
Pedparsing: Pedestrian Parsing
PPN: Pose Prediction Network
PRW: Person Re-identification in the Wild
QDA: Quadratic Discriminative Analysis
RAiD: Re-Identification Across indoor-outdoor Dataset
RAP: Richly Annotated Pedestrian
ResNet: Residual Neural Network
RHSP: Recurrent High-Structured Patches
RKHS: Reproducing Kernel Hilbert Space
RNN: Recurrent Neural Network
ROIs: Regions of Interest
SDALF: Symmetry Driven Accumulation of Local Feature
SCNCD: Salient Color Names bas ...
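Among the abbreviations above, CMC (Cumulative Matching Characteristic) and mAP (mean Average Precision) are the standard metrics for the ranking performance discussed in the introduction. As an illustration only (the function name, the toy distance matrix, and the single-gallery-shot assumption are mine, not the thesis's), both can be computed from a query-to-gallery distance matrix as follows:

```python
import numpy as np

def cmc_and_map(dist, query_ids, gallery_ids, max_rank=5):
    """Compute the CMC curve and mAP from a query-x-gallery distance matrix.

    Assumes the single-gallery-shot case: every query has at least one
    true match in the gallery, and a smaller distance means more similar.
    """
    num_q = dist.shape[0]
    cmc = np.zeros(max_rank)
    aps = []
    for i in range(num_q):
        order = np.argsort(dist[i])                 # rank gallery by ascending distance
        matches = (gallery_ids[order] == query_ids[i]).astype(int)
        first_hit = np.argmax(matches)              # rank (0-based) of first correct match
        if first_hit < max_rank:
            cmc[first_hit:] += 1
        # Average precision over all correct matches for this query
        hits = np.where(matches == 1)[0]
        precisions = [(k + 1) / (pos + 1) for k, pos in enumerate(hits)]
        aps.append(sum(precisions) / len(hits))
    return cmc / num_q, float(np.mean(aps))

# Toy example: two queries against a three-image gallery
dist = np.array([[0.9, 0.1, 0.5],
                 [0.8, 0.2, 0.3]])
cmc, mAP = cmc_and_map(dist, np.array([0, 1]), np.array([0, 1, 2]), max_rank=3)
# cmc[k-1] is the fraction of queries whose true match appears within rank k
```

CMC answers "how often is the right person in the top k?", while mAP also rewards retrieving every correct gallery image of the person early in the list, which matters in the multi-shot setting.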