[1] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, Graves A, Riedmiller M, Fidjeland A K, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540): 529-533 doi: 10.1038/nature14236
[2] Mordvintsev A, Olah C, Tyka M. Inceptionism: going deeper into neural networks [Online], available: http://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html, August 22, 2016
[3] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: Curran Associates Inc., 2012. 1097-1105 http://dl.acm.org/citation.cfm?id=2999257
[4] Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014. 580-587 http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=6909475
[5] Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y. OverFeat: integrated recognition, localization and detection using convolutional networks [Online], available: http://arxiv.org/abs/1312.6229, August 22, 2016 http://www.oalib.com/paper/4042258
[6] Felzenszwalb P F, Girshick R B, McAllester D. Cascade object detection with deformable part models. In: Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010. 2241-2248 http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=5539906
[7] Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Kauai, HI, USA: IEEE, 2001. 511-518
[8] Mnih V, Heess N, Graves A, Kavukcuoglu K. Recurrent models of visual attention. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, MA, USA: MIT Press, 2014. 2204-2212 http://dl.acm.org/citation.cfm?id=2969073
[9] Rensink R A. The dynamic representation of scenes. Visual Cognition, 2000, 7(1-3): 17-42 doi: 10.1080/135062800394667
[10] Yoo D, Park S, Lee J Y, Paek A S, Kweon I S. AttentionNet: aggregating weak directions for accurate object detection [Online], available: http://arxiv.org/abs/1506.07704, August 22, 2016
[11] Stollenga M F, Masci J, Gomez F, Schmidhuber J. Deep networks with internal selective attention through feedback connections. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, MA, USA: MIT Press, 2014. 4(2): 107-122 http://www.ams.org/mathscinet-getitem?mr=1312581
[12] Legrand J, Collobert R. Joint RNN-based greedy parsing and word composition [Online], available: https://arxiv.org/abs/1412.7028?context=cs, August 22, 2016 http://arxiv.org/abs/1412.7028
[13] Alexe B, Heess N, Teh Y W, Ferrari V. Searching for objects driven by context. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: Curran Associates Inc., 2012. 881-889 http://dl.acm.org/citation.cfm?id=2999233
[14] Feng Xin, Yang Dan, Zhang Ling. Saliency variation based quality assessment for packet-loss-impaired videos. Acta Automatica Sinica, 2011, 37(11): 1322-1331 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract17526.shtml
[15] Liu Long, Fan Bo-Yang, Liu Jin-Xing, Yang Le-Chao. Particle filtering based visual attention model for moving target detection. Acta Electronica Sinica, 2016, 44(9): 2235-2241 (in Chinese) http://www.cnki.com.cn/Article/CJFDTOTAL-DZXU201609031.htm
[16] Zhang Chong. Text Classification Based on Attention-Based LSTM Model [Master dissertation], Nanjing University, China, 2016 (in Chinese) http://cdmd.cnki.com.cn/Article/CDMD-10284-1016136802.htm
[17] Denil M, Bazzani L, Larochelle H, de Freitas N. Learning where to attend with deep architectures for image tracking. Neural Computation, 2012, 24(8): 2151-2184 doi: 10.1162/NECO_a_00312
[18] Paletta L, Fritz G, Seifert C. Q-learning of sequential attention for visual object recognition from informative local descriptors. In: Proceedings of the 22nd International Conference on Machine Learning. New York, NY, USA: ACM, 2005. 649-656 http://dl.acm.org/citation.cfm?id=1102433
[19] Ranzato M. On learning where to look [Online], available: http://arxiv.org/abs/1405.5488, August 22, 2016
[20] Stanley K O, Miikkulainen R. Evolving a roving eye for Go. In: Proceedings of the 2004 Genetic and Evolutionary Computation Conference. Berlin, Heidelberg, Germany: Springer, 2004. 1226-1238 http://www.springerlink.com/index/96y7lyycbj8k67ey.pdf
[21] Larochelle H, Hinton G. Learning to combine foveal glimpses with a third-order Boltzmann machine. In: Proceedings of the 23rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2010. 1243-1251 http://dl.acm.org/citation.cfm?id=2997328