Application of ESN-based Multi Indices Dual Heuristic Dynamic Programming on Wastewater Treatment Process
-
Abstract: To control the dissolved oxygen (DO) concentration and the nitrate concentration in the wastewater treatment process (WWTP), a dual heuristic dynamic programming policy with multiple critic indices (MDHP) is proposed. Using several critic indices simplifies the relationship between the critic network's inputs and outputs, which improves its approximation precision. Echo state networks (ESNs) are adopted to approximate the critic indices and the optimal control policy, and an online learning method for the controller is investigated. Experimental results indicate that the MDHP scheme outperforms both the single critic index DHP (SDHP) and a conventional PID controller in control performance.
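To make the role of the ESN concrete, the following is a minimal sketch of an echo state network whose linear readout is adapted online. It is not the controller described in the abstract: the class name `EchoStateNetwork`, the parameters `row_sum` and `lr`, and the choice of a normalized LMS readout update are all assumptions made for illustration, since the abstract does not state the learning rule. The reservoir weights are fixed and scaled so that the state update is a contraction, which is a conservative sufficient condition for the echo state property.

```python
import math
import random


class EchoStateNetwork:
    """Illustrative ESN: fixed random reservoir, linear readout trained online.

    All names and defaults here are hypothetical, not taken from the paper.
    """

    def __init__(self, n_inputs, n_reservoir, n_outputs, row_sum=0.9, seed=0):
        rng = random.Random(seed)
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_reservoir)]
        w = [[rng.uniform(-1.0, 1.0) for _ in range(n_reservoir)]
             for _ in range(n_reservoir)]
        # Scale so the largest absolute row sum equals row_sum (< 1).
        # With tanh units this makes the state map a contraction.
        norm = max(sum(abs(v) for v in row) for row in w)
        self.w_res = [[v * row_sum / norm for v in row] for row in w]
        # Only the readout weights are learned; they start at zero.
        self.w_out = [[0.0] * n_reservoir for _ in range(n_outputs)]
        self.state = [0.0] * n_reservoir

    def readout(self):
        """Linear readout of the current reservoir state."""
        return [sum(w * s for w, s in zip(row, self.state))
                for row in self.w_out]

    def step(self, u):
        """Advance the reservoir one step with input u and return the output."""
        self.state = [
            math.tanh(sum(a * b for a, b in zip(in_row, u))
                      + sum(a * b for a, b in zip(res_row, self.state)))
            for in_row, res_row in zip(self.w_in, self.w_res)
        ]
        return self.readout()

    def update(self, target, lr=0.5):
        """One normalized LMS step on the readout; returns the prior error."""
        err = [t - y for t, y in zip(target, self.readout())]
        energy = sum(s * s for s in self.state)
        gain = lr / (1.0 + energy)
        for row, e in zip(self.w_out, err):
            for j, s in enumerate(self.state):
                row[j] += gain * e * s
        return err
```

In a critic role, `step` would be fed the plant state at each sampling instant and `update` would be called with whatever target the chosen DHP recursion supplies; both the input signals and the target construction are left open here because the abstract does not specify them.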
-