Analyzing the Characteristics of "East Turkistan" Activities Using Text Mining and Network Analysis
Abstract: Open source intelligence is a new data source for counter-terrorism research: its content is rich, and the techniques for collecting and analyzing it are steadily maturing. Research based on open source intelligence has already shown considerable promise for counter-terrorism applications. In this paper, we take "East Turkistan" separatist activities as the object of study and use web crawlers to collect relevant text data from the World Wide Web. We apply text analysis to extract four elements of these activities (persons, organizations, times, and places) and construct a multi-mode meta-network from the relations between these concepts. First, we use meta-network decomposition to split the multi-mode meta-network into single-vertex sub-networks and bipartite sub-networks, and we assess the importance of each type of node through centrality analysis of each sub-network. We then combine the centrality indices of the sub-networks into an importance composite index (ICI) for the person, organization, time, and place nodes. Next, we apply the k-shell decomposition method directly to the multi-mode meta-network to identify its core nodes. A comparison shows that the results agree well with the actual situation.
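The k-shell step mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes an undirected, unweighted network given as an edge list; the function name `k_shell` and the toy nodes (`person:P1`, `org:O1`, and so on) are hypothetical placeholders, not the paper's data or implementation.

```python
from collections import defaultdict


def k_shell(edges):
    """Assign each node a shell index by repeatedly peeling low-degree nodes.

    At level k, every node whose remaining degree is <= k is removed
    (and removal is repeated until no such node is left) before k grows.
    """
    remaining = defaultdict(set)
    for u, v in edges:
        remaining[u].add(v)
        remaining[v].add(u)
    remaining = dict(remaining)

    shell = {}
    k = 0
    while remaining:
        peel = [n for n, nbrs in remaining.items() if len(nbrs) <= k]
        if not peel:
            k += 1  # nothing left to peel at this level
            continue
        for n in peel:
            shell[n] = k
            for m in remaining[n]:
                remaining[m].discard(n)  # detach n from its neighbours
            del remaining[n]
    return shell


# Hypothetical multi-mode toy network: a person-organization-place
# triangle with two pendant nodes attached to it.
edges = [
    ("person:P1", "org:O1"),
    ("person:P1", "place:L1"),
    ("org:O1", "place:L1"),
    ("person:P2", "org:O1"),
    ("time:T1", "place:L1"),
]

shells = k_shell(edges)
top = max(shells.values())
core = sorted(n for n, s in shells.items() if s == top)
print(core)
```

Nodes in the highest shell are the candidates for core nodes; in this toy case the two pendant nodes fall into shell 1 and the triangle forms shell 2.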
Key words: data mining, social network analysis, text analysis, East Turkistan