Stochastic Variational Bayesian Learning of Wiener Model in the Presence of Uncertainty
Abstract: Nonlinear system identification under multiple sources of uncertainty remains an open problem. Bayesian learning has clear advantages in describing and handling uncertainty and has been widely applied to linear system identification. Its use for nonlinear system identification has been limited, however, because the required probability estimates are complex and computationally expensive, especially for large-scale data. To address these difficulties, this paper proposes a nonlinear system identification method based on stochastic variational Bayesian learning for the Wiener model, a typical nonlinear model. First, the process noise, measurement noise, and parameter uncertainty are described by probability distributions. Then, the posterior distributions of the model parameters are estimated with the stochastic variational Bayesian approach. Following the idea of stochastic optimization, the method uses the probability information of only a subset of the intermediate variables to estimate the expected natural gradient of the lower bound of the likelihood function with respect to the parameter distributions. Compared with the classical variational Bayesian approach, in which the estimation depends on the information of all intermediate variables, this significantly reduces the computational complexity. To the best of our knowledge, this is the first application of stochastic variational Bayesian learning to system identification. A numerical example and a benchmark problem for the Wiener model demonstrate the effectiveness of the method for nonlinear system identification with large-scale data.
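The subsampling idea in the abstract can be illustrated with a much simpler model than the one the paper treats. The sketch below is not the authors' algorithm: it assumes a purely linear FIR block with known Gaussian noise precision, no static nonlinearity, and no latent intermediate variables, so the natural-gradient step reduces to a weighted average of minibatch estimates of the Gaussian natural parameters. The function name `svi_fir_posterior`, the step-size constants `tau` and `kappa`, the noise level, and the data size are all hypothetical choices made for illustration; the impulse-response values match the true values of $\theta_0, \dots, \theta_4$ listed in Table 1.

```python
import numpy as np


def svi_fir_posterior(Phi, y, noise_precision, prior_precision=1e-2,
                      batch_size=100, n_iters=500, tau=1.0, kappa=0.7, seed=0):
    """Subsampled natural-gradient updates for q(theta) = N(mean, Lambda^{-1}).

    Assumed model (illustrative only): y = Phi @ theta + e, with
    e ~ N(0, 1/noise_precision) and prior theta ~ N(0, I/prior_precision).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_params = Phi.shape
    prior = prior_precision * np.eye(n_params)
    # Natural parameters start at the prior: Lambda = alpha*I, eta = Lambda @ mean = 0.
    Lambda = prior.copy()
    eta = np.zeros(n_params)
    scale = n_samples / batch_size  # pretend the minibatch is replicated N/B times
    for t in range(1, n_iters + 1):
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        Pb, yb = Phi[idx], y[idx]
        # Minibatch estimate of the optimal natural parameters.
        Lambda_hat = prior + noise_precision * scale * (Pb.T @ Pb)
        eta_hat = noise_precision * scale * (Pb.T @ yb)
        # Robbins-Monro step size; the natural-gradient step is a convex combination.
        rho = (t + tau) ** (-kappa)
        Lambda = (1.0 - rho) * Lambda + rho * Lambda_hat
        eta = (1.0 - rho) * eta + rho * eta_hat
    mean = np.linalg.solve(Lambda, eta)
    return mean, Lambda


# Synthetic data: white-noise input filtered by a 5-tap FIR response.
rng = np.random.default_rng(1)
theta_true = np.array([1.0, -0.5, 0.25, -0.125, 0.0625])
n_samples, n_lags, sigma_e = 2000, theta_true.size, 0.1
u = rng.standard_normal(n_samples + n_lags - 1)
# Column k holds the input delayed by k samples.
Phi = np.column_stack(
    [u[n_lags - 1 - k: n_lags - 1 - k + n_samples] for k in range(n_lags)]
)
y = Phi @ theta_true + sigma_e * rng.standard_normal(n_samples)

beta = 1.0 / sigma_e**2
mean_svi, _ = svi_fir_posterior(Phi, y, noise_precision=beta)

# Full-data posterior mean for comparison (uses every sample at once).
Lambda_full = 1e-2 * np.eye(n_lags) + beta * (Phi.T @ Phi)
mean_full = np.linalg.solve(Lambda_full, beta * (Phi.T @ y))
print(np.round(mean_svi, 3))
print(np.round(mean_full, 3))
```

Each iteration touches only `batch_size` rows of `Phi` (5% of the data here), whereas the full-data posterior requires all rows. In the paper's setting the saving comes from evaluating only a subset of the intermediate variables per iteration, which this linear sketch does not model.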
Table 1 Parameter identification results for different numbers of subsampled data points

| Sampling | $\langle \theta_0 \rangle$ | $\langle \theta_1 \rangle$ | $\langle \theta_2 \rangle$ | $\langle \theta_3 \rangle$ | $\langle \theta_4 \rangle$ | $\langle \lambda_0 \rangle$ | $\langle \lambda_1 \rangle$ | $\langle \lambda_2 \rangle$ | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| True value | 1 | −0.5 | 0.25 | −0.125 | 0.0625 | 0 | 1 | 1 | — |
| 1 sampled point | 1±0 | −0.5463±0.3604 | 0.2507±0.2471 | −0.2446±0.2655 | 0.0358±0.2882 | 0.5434±0.4180 | 0.6625±0.2907 | 0.3803±0.2185 | 0.6005 |
| 5% sampled | 1±0 | −0.5060±0.0330 | 0.2693±0.0497 | −0.1252±0.0323 | 0.0633±0.0323 | 0.0908±0.2707 | 0.9871±0.1480 | 0.9103±0.1246 | 3.1829 |
| 10% sampled | 1±0 | −0.5055±0.0248 | 0.2571±0.02567 | −0.1341±0.0255 | 0.0594±0.0256 | 0.0631±0.0504 | 0.9684±0.0498 | 0.9499±0.0459 | 7.7402 |
| 20% sampled | 1±0 | −0.5077±0.0204 | 0.2544±0.0202 | −0.1287±0.0289 | 0.0659±0.0291 | 0.0575±0.0540 | 0.9813±0.0518 | 0.9574±0.0451 | 11.4620 |
| All data | 1±0 | −0.5078±0.0278 | 0.2541±0.0283 | −0.1299±0.0271 | 0.0685±0.0246 | 0.0777±0.0726 | 0.9439±0.1183 | 0.9252±0.1326 | 9.0772 |
Table 2 Parameter identification results for different proportions of outliers

| Case | $\langle \theta_0 \rangle$ | $\langle \theta_1 \rangle$ | $\langle \theta_2 \rangle$ | $\langle \theta_3 \rangle$ | $\langle \theta_4 \rangle$ | $\langle \theta_5 \rangle$ | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| True value | 1 | −0.5 | 0.25 | −0.125 | 0.0625 | −0.03125 | — |
| No outliers | 1±0 | −0.4989±0.0292 | 0.2495±0.0293 | −0.1254±0.0223 | 0.0611±0.0257 | −0.0338±0.0262 | 2.9369 |
| 2% outliers | 1±0 | −0.5097±0.0389 | 0.2672±0.0497 | −0.1305±0.0426 | 0.0652±0.0452 | −0.0291±0.0494 | 2.9480 |
| 5% outliers | 1±0 | −0.5060±0.0330 | 0.2693±0.0497 | −0.1252±0.0323 | 0.0633±0.0323 | −0.0314±0.0523 | 3.1829 |
| 10% outliers | 1±0 | −0.5349±0.0325 | 0.2627±0.0323 | −0.1314±0.033 | 0.0685±0.0389 | −0.0377±0.0355 | 2.9057 |
Table 3 Performance comparison of different identification methods

| Outliers | Method | $b_0$ | $a_1$ | $\langle \lambda_0 \rangle$ ($\lambda_0$) | $\langle \lambda_1 \rangle$ ($\lambda_1$) | $\langle \lambda_2 \rangle$ ($\lambda_2$) | MSE | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| — | True value | 1 | 0.5 | 0 | 1 | 1 | — | — |
| None | SVBI | — | — | 0.0648±0.062 | 0.9633±0.0509 | 0.9766±0.0626 | 0.9136 | 2.9369 |
| None | VBEM | — | — | 0.0503±0.0346 | 0.9411±0.0393 | 0.9655±0.0459 | 0.8978 | 9.7046 |
| None | MLE | 1±0 | 0.5102±0.0136 | 0.1054±0.0405 | 1.0154±0.0464 | 0.949±0.0411 | 0.913 | 9.0350 |
| None | PEM | 1±0 | 0.4948±0.0172 | 0.0828±0.0524 | 0.9905±0.0373 | 1.0072±0.0449 | 0.9132 | 0.6474 |
| 5% | SVBI | — | — | 0.0575±0.054 | 0.9813±0.052 | 0.9573±0.045 | 5.454 | 2.9352 |
| 5% | VBEM | — | — | 0.0503±0.0411 | 0.977±0.0532 | 0.9748±0.0518 | 3.8695 | 9.7709 |
| 5% | MLE | 1±0 | 0.415±0.0711 | −0.9407±0.1253 | 1.0019±0.1839 | 1.3715±0.1895 | 3.9574 | 9.6693 |
| 5% | PEM | 1±0 | 0.4999±0.0549 | 0.1072±0.1871 | 0.9646±0.1926 | 0.9878±0.1558 | 3.8374 | 0.6580 |
| 10% | SVBI | — | — | 0.1439±0.1065 | 0.9163±0.0924 | 0.8416±0.0924 | 7.5364 | 2.9057 |
| 10% | VBEM | — | — | 0.0556±0.0468 | 0.9711±0.0538 | 0.9568±0.0553 | 5.511 | 9.9245 |
| 10% | MLE | — | — | — | — | — | — | — |
| 10% | PEM | 1±0 | 0.4723±0.2004 | 0.1458±0.5211 | 0.9746±0.3091 | 1.003±0.3253 | 5.4992 | 0.6620 |
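Table 4 Identification results for part of the parameters of process (52)

| Parameter | $\theta_0$ | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_4$ | $\theta_5$ | $\theta_6$ | $\theta_7$ | $\theta_8$ | $\theta_9$ | $c_0$ | $c_1$ | $c_2$ | $Q$ | $R$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Estimate | −0.039 | 0.0648 | −0.0547 | 0.0856 | −0.0462 | 0.2613 | 0.0501 | 0.2041 | 0.3396 | 0.4154 | −0.0188 | 0.1035 | −0.003 | 0.0034 | 0.0014 |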
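Table 5 Performance comparison of different methods

| Number of samples | Method | MSE (V) | Number of parameters | Time (s) |
| --- | --- | --- | --- | --- |
| 2 000 | SVBI | 0.05695 | 25 | 256.12 |
| 2 000 | VBEM | 0.06283 | 25 | 1211.27 |
| 2 000 | SVBI | 0.03407 | 40 | 264.273 |
| 2 000 | VBEM | 0.03425 | 40 | 1214.55 |
| 10 000 | SVBI | 0.06179 | 25 | 1299.99 |
| 10 000 | VBEM | 0.09334 | 25 | 6347.28 |
| 10 000 | SVBI | 0.03385 | 40 | 1332.31 |
| 10 000 | VBEM | 0.03404 | 40 | 6442.98 |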
