[1]
|
Iiro Harjunkoski, Christos T Maravelias, Peter Bongers, Pedro M Castro, Sebastian Engell, Ignacio E Grossmann, John Hooker, Carlos Méndez, Guido Sand, and John Wassick. Scope for industrial applications of production scheduling models and solution methods. Computers & Chemical Engineering, 2014, 62: 161−193
|
[2]
|
Pedro M. Castro, Ignacio E. Grossmann, and Qi Zhang. Expanding scope and computational challenges in process scheduling. Computers & Chemical Engineering, 2018, 114: 14−42
|
[3]
|
Yan Hou, NaiQi Wu, MengChu Zhou, and ZhiWu Li. Pareto-optimization for scheduling of crude oil operations in refinery via genetic algorithm. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2017, 47(3): 517−530 doi: 10.1109/TSMC.2015.2507161
|
[4]
|
Kaizhou Gao, Yun Huang, Ali Sadollah, and Ling Wang. A review of energy-efficient scheduling in intelligent production systems. Complex & Intelligent Systems, 2020, 6(2): 237−249
|
[5]
|
Amir M. Fathollahi-Fard, Lyne Woodward, and Ouassima Akhrif. A distributed permutation flow-shop considering sustainability criteria and real-time scheduling. Journal of Industrial Information Integration, 2024, 39: 100598 doi: 10.1016/j.jii.2024.100598
|
[6]
|
Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, USA, 1998.
|
[7]
|
Ana Esteso, David Peidro, Josefa Mula, and Manuel Díaz-Madroñero. Reinforcement learning applied to production planning and control. International Journal of Production Research, 2023, 61(16): 5772−5789 doi: 10.1080/00207543.2022.2104180
|
[8]
|
Oguzhan Dogru, Junyao Xie, Om Prakash, Ranjith Chiplunkar, Jansen Soesanto, Hongtian Chen, Kirubakaran Velswamy, Fadi Ibrahim, and Biao Huang. Reinforcement learning in process industries: Review and perspective. IEEE/CAA Journal of Automatica Sinica, 2024, 11(2): 283−300 doi: 10.1109/JAS.2024.124227
|
[9]
|
Rui Nian, Jinfeng Liu, and Biao Huang. A review on reinforcement learning: Introduction and applications in industrial process control. Computers & Chemical Engineering, 2020, 139: 106886
|
[10]
|
Richard Bellman. A Markovian decision process. Journal of Mathematics and Mechanics, 1957, 679−684
|
[11]
|
Richard S Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 1988, 3: 9−44 doi: 10.1023/A:1022633531479
|
[12]
|
Christopher JCH Watkins and Peter Dayan. Q-learning. Machine Learning, 1992, 8: 279−292
|
[13]
|
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari with deep reinforcement learning. arXiv preprint arXiv: 1312.5602, 2013.
|
[14]
|
Hado Van Hasselt, Arthur Guez, and David Silver. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.
|
[15]
|
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, and Nando de Freitas. Dueling network architectures for deep reinforcement learning. In International Conference on Machine Learning, pages 1995–2003, 2016.
|
[16]
|
Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 1992, 8: 229−256 doi: 10.1023/A:1022672621406
|
[17]
|
Vijay Konda and John Tsitsiklis. Actor-critic algorithms. Advances in Neural Information Processing Systems, 1999, 12.
|
[18]
|
Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning, pages 1928–1937. PMLR, 2016.
|
[19]
|
John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. Trust region policy optimization. In International Conference on Machine Learning, pages 1889–1897. PMLR, 2015.
|
[20]
|
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv: 1707.06347, 2017.
|
[21]
|
Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv preprint arXiv: 1509.02971, 2015.
|
[22]
|
Scott Fujimoto, Herke van Hoof, and David Meger. Addressing function approximation error in actor-critic methods. In International Conference on Machine Learning, pages 1587–1596. PMLR, 2018.
|
[23]
|
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1861–1870. PMLR, 10–15 Jul 2018.
|
[24]
|
Xiaoyong Gao, Diao Peng, Guofeng Kui, Jun Pan, Xin Zuo, and Feifei Li. Reinforcement learning based optimization algorithm for maintenance tasks scheduling in coalbed methane gas field. Computers & Chemical Engineering, 2023, 170: 108131
|
[25]
|
Saxena Nikita, Anamika Tiwari, Deepak Sonawat, Hariprasad Kodamana, and Anurag S Rathore. Reinforcement learning based optimization of process chromatography for continuous processing of biopharmaceuticals. Chemical Engineering Science, 2021, 230: 116171 doi: 10.1016/j.ces.2020.116171
|
[26]
|
Yin Cheng, Yuexin Huang, Bo Pang, and Weidong Zhang. ThermalNet: A deep reinforcement learning-based combustion optimization system for coal-fired boiler. Engineering Applications of Artificial Intelligence, 2018, 74: 303−311 doi: 10.1016/j.engappai.2018.07.003
|
[27]
|
Zhuang Shao, Fengqi Si, Daniel Kudenko, Peng Wang, and Xiaozhong Tong. Predictive scheduling of wet flue gas desulfurization system based on reinforcement learning. Computers & Chemical Engineering, 2020, 141: 107000
|
[28]
|
Christian D. Hubbs, Can Li, Nikolaos V. Sahinidis, Ignacio E. Grossmann, and John M. Wassick. A deep reinforcement learning approach for chemical production scheduling. Computers & Chemical Engineering, 2020, 141: 106982
|
[29]
|
Kody M Powell, Derek Machalek, and Titus Quah. Real-time optimization using reinforcement learning. Computers & Chemical Engineering, 2020, 143: 107077
|
[30]
|
Chao Liu, Jinliang Ding, and Jiyuan Sun. Reinforcement learning based decision making of operational indices in process industry under changing environment. IEEE Transactions on Industrial Informatics, 2020, 17(4): 2727−2736
|
[31]
|
Derrick Adams, Dong-Hoon Oh, Dong-Won Kim, Chang-Ha Lee, and Min Oh. Deep reinforcement learning optimization framework for a power generation plant considering performance and environmental issues. Journal of Cleaner Production, 2021, 291: 125915 doi: 10.1016/j.jclepro.2021.125915
|
[32]
|
PS Pravin, Zhiyao Luo, Lanyu Li, and Xiaonan Wang. Learning-based scheduling of industrial hybrid renewable energy systems. Computers & Chemical Engineering, 2022, 159: 107665
|
[33]
|
Geunseo Song, Pouya Ifaei, Jiwoo Ha, Doeun Kang, Wangyun Won, J Jay Liu, and Jonggeol Na. The AI circular hydrogen economist: Hydrogen supply chain design via hierarchical deep multi-agent reinforcement learning. Chemical Engineering Journal, 2024, 497: 154464 doi: 10.1016/j.cej.2024.154464
|
[34]
|
Hao Wang, Hongwen He, Yunfei Bai, and Hongwei Yue. Parameterized deep Q-network based energy management with balanced energy economy and battery life for hybrid electric vehicles. Applied Energy, 2022, 320: 119270 doi: 10.1016/j.apenergy.2022.119270
|
[35]
|
Gustavo Campos, Nael H El-Farra, and Ahmet Palazoglu. Soft actor-critic deep reinforcement learning with hybrid mixed-integer actions for demand responsive scheduling of energy systems. Industrial & Engineering Chemistry Research, 2022, 61(24): 8443−8461
|
[36]
|
Lijuan Li, Xue Yang, Shipin Yang, and Xiaowei Xu. Optimization of oxygen system scheduling in hybrid action space based on deep reinforcement learning. Computers & Chemical Engineering, 2023, 171: 108168
|
[37]
|
Laura Stops, Roel Leenhouts, Qinghe Gao, and Artur M Schweidtmann. Flowsheet generation through hierarchical reinforcement learning and graph neural networks. AIChE Journal, 2023, 69(1): e17938 doi: 10.1002/aic.17938
|
[38]
|
Warwick Masson, Pravesh Ranchod, and George Konidaris. Reinforcement learning with parameterized actions. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.
|
[39]
|
Quirin Göttl, Jonathan Pirnay, Jakob Burger, and Dominik G Grimm. Deep reinforcement learning enables conceptual design of processes for separating azeotropic mixtures without prior knowledge. Computers & Chemical Engineering, 2025, 194: 108975
|
[40]
|
Hengkai Zhang, Xiaoyu Liu, Dingshan Sun, Azita Dabiri, and Bart De Schutter. Integrated reinforcement learning and optimization for railway timetable rescheduling. IFAC-PapersOnLine, 2024, 58(10): 310−315 doi: 10.1016/j.ifacol.2024.07.358
|
[41]
|
Gelegen Che, Yanyan Zhang, Lixin Tang, and Shengnan Zhao. A deep reinforcement learning based multi-objective optimization for the scheduling of oxygen production system in integrated iron and steel plants. Applied Energy, 2023, 345: 121332 doi: 10.1016/j.apenergy.2023.121332
|
[42]
|
Ashish Kumar Shakya, Gopinatha Pillai, and Sohom Chakrabarty. Reinforcement learning algorithms: A brief survey. Expert Systems with Applications, 2023, 231: 120495 doi: 10.1016/j.eswa.2023.120495
|
[43]
|
Marcel Panzer and Benedict Bender. Deep reinforcement learning in production systems: a systematic literature review. International Journal of Production Research, 2022, 60(13): 4316−4341 doi: 10.1080/00207543.2021.1973138
|
[44]
|
Pawel Ladosz, Lilian Weng, Minwoo Kim, and Hyondong Oh. Exploration in deep reinforcement learning: A survey. Information Fusion, 2022, 85: 1−22 doi: 10.1016/j.inffus.2022.03.003
|
[45]
|
W. Bradley Knox and Peter Stone. Interactively shaping agents via human reinforcement: the TAMER framework. In Proceedings of the Fifth International Conference on Knowledge Capture, pages 9–16, New York, NY, USA, 2009. Association for Computing Machinery.
|
[46]
|
Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. Deep reinforcement learning from human preferences. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017.
|
[47]
|
Sam Devlin, Daniel Kudenko, and Marek Grześ. An empirical study of potential-based reward shaping and advice in complex, multi-agent systems. Advances in Complex Systems, 2011, 14(02): 251−278 doi: 10.1142/S0219525911002998
|
[48]
|
Marek Grześ and Daniel Kudenko. Plan-based reward shaping for reinforcement learning. In 2008 4th International IEEE Conference Intelligent Systems, volume 2, pages 10-22–10-29, 2008.
|
[49]
|
Marek Grześ and Daniel Kudenko. Multigrid reinforcement learning with reward shaping. In Artificial Neural Networks - ICANN 2008, pages 357–366, Berlin, Heidelberg, 2008.
|
[50]
|
Yu Du, Jun-qing Li, Xiao-long Chen, Pei-yong Duan, and Quan-ke Pan. Knowledge-based reinforcement learning and estimation of distribution algorithm for flexible job shop scheduling problem. IEEE Transactions on Emerging Topics in Computational Intelligence, 2023, 7(4): 1036−1050 doi: 10.1109/TETCI.2022.3145706
|
[51]
|
Yong Gui, Dunbing Tang, Haihua Zhu, Yi Zhang, and Zequn Zhang. Dynamic scheduling for flexible job shop using a deep reinforcement learning approach. Computers & Industrial Engineering, 2023, 180: 109255
|
[52]
|
Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, and Marco Pavone. Risk-constrained reinforcement learning with percentile risk criteria. Journal of Machine Learning Research, 2018, 18(167): 1−51
|
[53]
|
Lin Luo and Xuesong Yan. Scheduling of stochastic distributed hybrid flow-shop by hybrid estimation of distribution algorithm and proximal policy optimization. Expert Systems with Applications, 2025, 271: 126523 doi: 10.1016/j.eswa.2025.126523
|
[54]
|
Hepeng Li, Zhiqiang Wan, and Haibo He. Real-time residential demand response. IEEE Transactions on Smart Grid, 2020, 11(5): 4144−4154 doi: 10.1109/TSG.2020.2978061
|
[55]
|
Juntao Dai, Jiaming Ji, Long Yang, Qian Zheng, and Gang Pan. Augmented proximal policy optimization for safe reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(6): 7288−7295 doi: 10.1609/aaai.v37i6.25888
|
[56]
|
Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, and Alois Knoll. A review of safe reinforcement learning: Methods, theories, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(12): 11216−11235 doi: 10.1109/TPAMI.2024.3457538
|
[57]
|
Dongjie Yu, Wenjun Zou, Yujie Yang, Haitong Ma, Shengbo Eben Li, Yuming Yin, Jianyu Chen, and Jingliang Duan. Safe model-based reinforcement learning with an uncertainty-aware reachability certificate. IEEE Transactions on Automation Science and Engineering, 2024, 21(3): 4129−4142 doi: 10.1109/TASE.2023.3292388
|
[58]
|
Peng Kou, Deliang Liang, Chen Wang, Zihao Wu, and Lin Gao. Safe deep reinforcement learning-based constrained optimal control scheme for active distribution networks. Applied Energy, 2020, 264: 114772 doi: 10.1016/j.apenergy.2020.114772
|
[59]
|
Yan Song, Bin Zhang, Chuanbo Wen, Dong Wang, and Guoliang Wei. Model predictive control for complicated dynamic systems: a survey. International Journal of Systems Science, 2025.
|
[60]
|
Mario Zanon and Sebastien Gros. Safe reinforcement learning using robust MPC. IEEE Transactions on Automatic Control, 2021, 66(8): 3638−3652 doi: 10.1109/TAC.2020.3024161
|
[61]
|
Yanan Sui, Alkis Gotovos, Joel Burdick, and Andreas Krause. Safe exploration for optimization with Gaussian processes. In Francis Bach and David Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 997–1005, 2015.
|
[62]
|
Matteo Turchetta, Felix Berkenkamp, and Andreas Krause. Safe exploration in finite Markov decision processes with Gaussian processes. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016.
|
[63]
|
Haokun Yang, David E. Bernal, Robert E. Franzoi, Faramroze G. Engineer, Kysang Kwon, Sechan Lee, and Ignacio E. Grossmann. Integration of crude-oil scheduling and refinery planning by Lagrangean decomposition. Computers & Chemical Engineering, 2020, 138: 106812
|
[64]
|
Pedro M. Castro. Optimal scheduling of a multiproduct batch chemical plant with preemptive changeover tasks. Computers & Chemical Engineering, 2022, 162: 107818
|
[65]
|
Robert E. Franzoi, Brenno C. Menezes, Jeffrey D. Kelly, Jorge A.W. Gut, and Ignacio E. Grossmann. Large-scale optimization of nonconvex MINLP refinery scheduling. Computers & Chemical Engineering, 2024, 186: 108678
|
[66]
|
Lu Sun, Lin Lin, Haojie Li, and Mitsuo Gen. Large scale flexible scheduling optimization by a distributed evolutionary algorithm. Computers & Industrial Engineering, 2019, 128: 894−904
|
[67]
|
Wanting Zhang, Wei Du, Guo Yu, Renchu He, Wenli Du, and Yaochu Jin. Knowledge-assisted dual-stage evolutionary optimization of large-scale crude oil scheduling. IEEE Transactions on Emerging Topics in Computational Intelligence, 2024, 8(2): 1567−1581 doi: 10.1109/TETCI.2024.3353590
|
[68]
|
Meng Xu, Yi Mei, Fangfang Zhang, and Mengjie Zhang. Genetic programming with lexicase selection for large-scale dynamic flexible job shop scheduling. IEEE Transactions on Evolutionary Computation, 2024, 28(5): 1235−1249 doi: 10.1109/TEVC.2023.3244607
|
[69]
|
Yi Zheng, Jian Wang, Chengmin Wang, Chunyi Huang, Jingfei Yang, and Ning Xie. Strategic bidding of wind farms in medium-to-long-term rolling transactions: A bi-level multi-agent deep reinforcement learning approach. Applied Energy, 2025, 383: 125265 doi: 10.1016/j.apenergy.2024.125265
|
[70]
|
Youshan Liu, Jiaxin Fan, and Weiming Shen. A deep reinforcement learning approach with graph attention network and multi-signal differential reward for dynamic hybrid flow shop scheduling problem. Journal of Manufacturing Systems, 2025, 80: 643−661 doi: 10.1016/j.jmsy.2025.03.028
|
[71]
|
Atit Bashyal, Tina Boroukhian, Pakin Veerachanchai, Myanganbayar Naransukh, and Hendro Wicaksono. Multi-agent deep reinforcement learning based demand response and energy management for heavy industries with discrete manufacturing systems. Applied Energy, 2025, 392: 125990 doi: 10.1016/j.apenergy.2025.125990
|
[72]
|
Patrick de Mars and Aidan O’Sullivan. Applying reinforcement learning and tree search to the unit commitment problem. Applied Energy, 2021, 302: 117519 doi: 10.1016/j.apenergy.2021.117519
|
[73]
|
Lingwei Zhu, Go Takami, Mizuo Kawahara, Hiroaki Kanokogi, and Takamitsu Matsubara. Alleviating parameter-tuning burden in reinforcement learning for large-scale process control. Computers & Chemical Engineering, 2022, 158: 107658
|
[74]
|
Rong Hu, Yu-Fang Huang, Xing Wu, Bin Qian, Ling Wang, and Zi-Qi Zhang. Collaborative Q-learning hyper-heuristic evolutionary algorithm for the production and transportation integrated scheduling of silicon electrodes. Swarm and Evolutionary Computation, 2024, 86: 101498 doi: 10.1016/j.swevo.2024.101498
|
[75]
|
Xin Chen, Yibing Li, Kaipu Wang, Lei Wang, Jie Liu, Jun Wang, and Xi Vincent Wang. Reinforcement learning for distributed hybrid flowshop scheduling problem with variable task splitting towards mass personalized manufacturing. Journal of Manufacturing Systems, 2024, 76: 188−206 doi: 10.1016/j.jmsy.2024.07.011
|
[76]
|
Maxime Bouton, Kyle D. Julian, Alireza Nakhaei, Kikuo Fujimura, and Mykel J. Kochenderfer. Decomposition methods with deep corrections for reinforcement learning. Autonomous Agents and Multi-Agent Systems, 2019, 33(3): 330−352 doi: 10.1007/s10458-019-09407-z
|
[77]
|
Yuandong Chen, Jinliang Ding, and Qingda Chen. A reinforcement learning based large-scale refinery production scheduling algorithm. IEEE Transactions on Automation Science and Engineering, 2023, 21(4): 6041−6055
|
[78]
|
Joseph Frédéric Bonnans. Lectures on stochastic programming: Modeling and theory. SIAM Review, 2011, 181−183
|
[79]
|
Zukui Li and Marianthi G. Ierapetritou. Robust optimization for process scheduling under uncertainty. Industrial and Engineering Chemistry Research, 2008, 47(12): 4148−4157 doi: 10.1021/ie071431u
|
[80]
|
Lukas Glomb, Frauke Liers, and Florian Rösel. A rolling-horizon approach for multi-period optimization. European Journal of Operational Research, 2022, 300(1): 189−206 doi: 10.1016/j.ejor.2021.07.043
|
[81]
|
Muhammad Qasim, Kuan Yew Wong, and Komarudin. A review on aggregate production planning under uncertainty: Insights from a fuzzy programming perspective. Engineering Applications of Artificial Intelligence, 2024, 107436
|
[82]
|
Xinquan Wu, Xuefeng Yan, Donghai Guan, and Mingqiang Wei. A deep reinforcement learning model for dynamic job-shop scheduling problem with uncertain processing time. Engineering Applications of Artificial Intelligence, 2024, 131: 107790 doi: 10.1016/j.engappai.2023.107790
|
[83]
|
Marcelo Luis Ruiz-Rodríguez, Sylvain Kubler, Jérémy Robert, and Yves Le Traon. Dynamic maintenance scheduling approach under uncertainty: Comparison between reinforcement learning, genetic algorithm simheuristic, dispatching rules. Expert Systems with Applications, 2024, 123404
|
[84]
|
Daniel Rangel-Martinez and Luis A. Ricardez-Sandoval. Recurrent reinforcement learning strategy with a parameterized agent for online scheduling of a state task network under uncertainty. Industrial and Engineering Chemistry Research, 2025.
|
[85]
|
Muyi Huang, Renchu He, Xin Dai, Wenli Du, and Feng Qian. Reinforcement learning based gasoline blending optimization: Achieving more efficient nonlinear online blending of fuels. Chemical Engineering Science, 2024, 120574
|
[86]
|
Wuliang Peng, Xuejun Lin, and Haitao Li. Critical chain based proactive-reactive scheduling for resource-constrained project scheduling under uncertainty. Expert Systems with Applications, 2023, 119188
|
[87]
|
Felix Grumbach, Anna Müller, Pascal Reusch, and Sebastian Trojahn. Robust-stable scheduling in dynamic flow shops based on deep reinforcement learning. Journal of Intelligent Manufacturing, 2024, 667−686
|
[88]
|
Jiang-Ping Huang, Liang Gao, and Xin-Yu Li. A hierarchical multi-action deep reinforcement learning method for dynamic distributed job-shop scheduling problem with job arrivals. IEEE Transactions on Automation Science and Engineering, 2025, 2501−2513
|
[89]
|
Guillaume Infantes, Stéphanie Roussel, Pierre Pereira, Antoine Jacquet, and Emmanuel Benazera. Learning to solve job shop scheduling under uncertainty. In International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research, 2024.
|
[90]
|
Constantin Waubert de Puiseau, Richard Meyes, and Tobias Meisen. On reliability of reinforcement learning based production scheduling systems: a comparative survey. Journal of Intelligent Manufacturing, 2022, 911−927
|
[91]
|
Chia-Yen Lee, Yi-Tao Huang, and Peng-Jen Chen. Robust-optimization-guiding deep reinforcement learning for chemical material production scheduling. Computers & Chemical Engineering, 2024, 108745
|
[92]
|
Daniel Rangel-Martinez and Luis A. Ricardez-Sandoval. A recurrent reinforcement learning strategy for optimal scheduling of partially observable job-shop and flow-shop batch chemical plants under uncertainty. Computers & Chemical Engineering, 2024, 108748
|