[1] Ministry of Transport of China. Statistical bulletin on transportation industry development in 2018. [Online], available: http://xxgk.mot.gov.cn/jigou/zhghs/201904/t20190412_3186720.html, September 5, 2019
[2] Shi J G, Sun Y S, Schonfeld P, Qi J. Joint optimization of tram timetables and signal timing adjustments at intersections. Transportation Research Part C: Emerging Technologies, 2017, 83(6): 104−119
[3] Ji Y X, Tang Y, Du Y C, Zhang X. Coordinated optimization of tram trajectories with arterial signal timing resynchronization. Transportation Research Part C: Emerging Technologies, 2019, 99(4): 53−66
[4] Little J D C, Kelson M D, Gartner N H. MAXBAND: a program for setting signals on arteries and triangular networks. In: Proceedings of the 60th Annual Meeting of the Transportation Research Board. Washington, USA: Transportation Research Board, 1981. 40−46
[5] Jeong Y J, Kim Y C. Tram passive signal priority strategy based on the MAXBAND model. KSCE Journal of Civil Engineering, 2014, 18(5): 1518−1527 doi: 10.1007/s12205-014-0159-1
[6] Ma W, Zou L, An K, Gartner N H, Wang M. A partition-enabled multi-mode band approach to arterial traffic signal optimization. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(1): 313−322 doi: 10.1109/TITS.2018.2815520
[7] Kim H, Cheng Y, Chang G. Variable signal progression bands for transit vehicles under dwell time uncertainty and traffic queues. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(1): 109−122 doi: 10.1109/TITS.2018.2801567
[8] Ji Y X, Tang Y, Wang W, Du Y C. Tram-oriented traffic signal timing resynchronization. Journal of Advanced Transportation, 2018, 2018(1): 1−13
[9] Jacobson J, Sheffi Y. Analytical model of traffic delays under bus signal preemption: theory and application. Transportation Research Part B: Methodological, 1981, 15(2): 127−138 doi: 10.1016/0191-2615(81)90039-4
[10] Yang M, Ding J, Wang W, Ma Y Y. A coordinated signal priority strategy for modern trams on arterial streets by predicting the tram dwell time. KSCE Journal of Civil Engineering, 2018, 22(2): 823−836 doi: 10.1007/s12205-017-1187-4
[11] Gao Yang, Chen Shi-Fu, Lu Xin. Research on reinforcement learning technology: a review. Acta Automatica Sinica, 2004, 30(1): 1−15 (in Chinese) doi: 10.3969/j.issn.1003-8930.2004.01.001
[12] Bertsekas D P. Feature-based aggregation and deep reinforcement learning: a survey and some new implementations. IEEE/CAA Journal of Automatica Sinica, 2019, 6(1): 1−31
[13] Samah E T, Abdulhai B, Abdelgawad H. Design of reinforcement learning parameters for seamless application of adaptive traffic signal control. Journal of Intelligent Transportation Systems, 2014, 18(3): 227−245 doi: 10.1080/15472450.2013.810991
[14] Duan Yan-Jie, Lv Yi-Sheng, Zhang Jie, Zhao Xue-Liang, Wang Fei-Yue. Deep learning for control: the state of the art and prospects. Acta Automatica Sinica, 2016, 42(5): 643−654 (in Chinese)
[15] Li L, Lv Y, Wang F-Y. Traffic signal timing via deep reinforcement learning. IEEE/CAA Journal of Automatica Sinica, 2016, 3(3): 247−254
[16] Liang X, Du X, Wang G, Han Z. A deep reinforcement learning network for traffic light cycle control. IEEE Transactions on Vehicular Technology, 2019, 68(2): 1243−1253 doi: 10.1109/TVT.2018.2890726
[17] Ling K, Shalaby A. Automated transit headway control via adaptive signal priority. Journal of Advanced Transportation, 2004, 38(4): 45−67
[18] Shu Bo, Li Da-Ming, Zhao Xin-Liang. Transit signal priority strategy based on reinforcement learning algorithm. Journal of Northeastern University (Natural Science), 2012, 33(10): 1513−1516 (in Chinese) doi: 10.12068/j.issn.1005-3026.2012.10.035
[19] Liang Xing-Xing, Feng Yang-He, Ma Yang, Cheng Guang-Quan, Huang Jin-Cai, Wang Qi, et al. Deep multi-agent reinforcement learning: a survey. Acta Automatica Sinica, 2019 (in Chinese) doi: 10.16383/j.aas.c180372
[20] Zhao Ying-Nan, Liu Peng, Zhao Wei, Tang Xiang-Long. Twice sampling method in deep Q-network. Acta Automatica Sinica, 2019, 45(10): 1870−1882 (in Chinese) doi: 10.3969/j.issn.1003-8930.2019.01.001
[21] Wang Z Y, Schaul T, Hessel M, Hasselt H, Lanctot M, Freitas N. Dueling network architectures for deep reinforcement learning. In: Proceedings of the 33rd International Conference on Machine Learning. New York, USA: PMLR, 2016. 1995−2003
[22] Hasselt H V, Guez A, Silver D. Deep reinforcement learning with double Q-learning. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence. Phoenix, USA: MIT, 2015. 2094−2100
[23] Schaul T, Quan J, Antonoglou I, Silver D. Prioritized experience replay. In: Proceedings of the 2016 International Conference on Learning Representations. San Juan, Puerto Rico: arXiv, 2016. 1−21
[24] Lopez P A, Behrisch M, Walz L B, Erdmann J, Flotterod Y, Hilbrich R, et al. Microscopic traffic simulation using SUMO. In: Proceedings of the 21st IEEE International Conference on Intelligent Transportation Systems. Hawaii, USA: IEEE, 2018. 2575−2582
[25] Islam M T, Tiwana J, Bhowmick A, Qiu T Z. Design of LRT signal priority to improve arterial traffic mobility. Journal of Transportation Engineering, 2016, 142(9): 04016034 doi: 10.1061/(ASCE)TE.1943-5436.0000831