Executive summary

This paper proposes an improved hybrid residual prediction model for quality prediction of coke and blended coal. It combines random forest feature extraction with various prediction models to forecast residuals between blended coal and single coal, and between coke and blended coal. An optimal strategy is determined by evaluating metrics and selecting the best predictive performance for different coal quality indicators. The paper also introduces an enhanced prior knowledge-based and adaptive random initialization genetic optimization algorithm (P-awGA) with adaptive weights to address challenges in generating initial populations under strict constraints. A novel coal allocation optimization method integrating these algorithms is proposed, using past blended coal ratios as prior knowledge and a hybrid residual prediction model to calculate the fitness of the population. The method demonstrates superior success rates in solving under strict constraints and significantly reduces coal blending costs.

Coal allocation optimization based on a hybrid residual prediction model with an improved genetic algorithm

By A.S. Akopov

Introduction

Studies show that in recent years coal has been the main energy source, meeting 29% of global fossil energy demand, generating 41% of the world's total electricity, and accounting for 44% of world industrial demand (“Data & Statistics - IEA,” n.d.; Wu and Chen, 2018). The massive worldwide consumption of coal as an industrial raw material and energy source has raised widespread concerns about environmental pollution and the cost of fossil energy (Jaramillo and Muller, 2016; Shon et al., 2020; Xie et al., 2020; Zhang et al., 2020). Coke, an important coal product, accounts for approximately 19% of total coal production worldwide, with approximately 650,000 kilotons of coke being produced each year (“Data & Statistics - IEA,” n.d.). The relatively high cost of good quality coal as an ideal raw material for coke, together with its low global production, has forced the coking industry to use poor quality coal as an alternative. However, poor quality coal contains more impurities and emits large amounts of pollutants during coking and use, including volatile organic compounds such as alkanes, olefins and alkynes, as well as fine particulate matter, carbon monoxide, carbon dioxide, nitrogen oxides and sulphur dioxide. These pollutants not only have a significant impact on the atmosphere and people's health, but also reduce productivity and cause safety accidents (Gong et al., 2017; Guo et al., 2018; Tian et al., 2018; Zhang et al., 2020). Therefore, this paper explores a method to predict coke quality and optimize coal blending in order to minimize production costs and reduce resource waste and environmental pollution while maintaining the quality of the coke produced.
A prediction and optimization model based on machine learning and genetic algorithm for coal blending and coking is proposed for the actual coal blending and coking production process, as shown in Fig. 1.

In the coking process, as shown in Fig. 1, individual coals need to be obtained, ground and mixed according to the corresponding ratios. The mixed coals are then put into the coke oven for a period of high-temperature coking to finally form coke. As the coals are mixed and then heated, their form, surface and chemical composition change (Chen et al., 2020; Guo et al., 2020; Wall et al., 2002; Xu et al., 2016). However, because the blended coal is a mixture of coals, it is challenging to determine its overall quality from the quality of the individual coals and the physical changes that occur during blending (Díez et al., 2002). The change from blended coal to coke involves complex reactions among various components of the blended coals at high temperatures. Although the traditional method of using simplified blast furnace coking before assaying can be used on a trial basis to obtain quality indicators for the coke as a product (Shen and Yu, 2016), it has the disadvantage that actual coking tests are required to obtain a true quality specification. Throughout the coking process, individual coal ratios are usually set by professionals based on experience and expectations of the target coke quality. Actual coking tests are carried out several times and the ratios are repeatedly adjusted to obtain the final target coke. These multiple tests inevitably result in higher production costs and wasted resources. The environmental pollution and waste of resources caused by the coking process and the use of coke are a severe obstacle to the development of the coking industry and its subsidiary industries (Martinson et al., 2017; Zhu et al., 2019).
Therefore, it is necessary to develop a predictive model of coke quality and to optimize the composition ratio of coking coal raw materials, exploiting the advantages of each raw material to improve coke quality while reducing resource consumption and costs and protecting the environment (Jiao et al., 2018).

The selection of the coking coal feed is the first step in the coking chemical process, and the ratio of the selected single coal feeds is directly related to the yield and quality of the coke output from the coking process (Díez et al., 2002; Jiao et al., 2018; van Krevelen and Schuyer, 1957; Zhang et al., 2004). The crushing strength ($M_{40}$), abrasion strength ($M_{10}$), impurity levels (ash $A_{d}$, sulphur content $S_{t.d}$) and grade parameter (volatile matter $V_{daf}$) of the coke (Li, 2016) directly affect its combustion and its performance in blast furnace iron making.

The performance of coke mainly depends on the grade parameters, the level of impurities (ash $A_{d}$, sulphur $S_{t.d}$), agglomeration characteristics, etc. (van Krevelen and Schuyer, 1957). Blended coal is formed by mixing single coals in a certain proportion, and its performance indices depend on the quality and proportion of each single coal. As only physical changes (crushing, mixing, etc.) occur when individual coals are converted to blended coal, the quality of the blended coal can be approximated by a weighted sum of the individual coal qualities, with the blending ratios as weights (Zhang et al., 2004).

For coke quality prediction methods, there are now several major research directions in the industry as follows.

Approaches based on coal petrography and optical analysis models (Smȩdowski and Krzesińska, 2013; Flores et al., 2017; Jiao et al., 2018; Chen et al., 2020; Yuan et al., 2020) predict coke quality by observing changes in the microstructure and optical properties of blended coals and coke before and after coking. Approaches based on fluid dynamics models (Wall et al., 2002; Wałowski, 2019; Xing et al., 2019) analyze the chemical and physical changes that occur in the coking process by modelling the fluid motion of the gases and of the pulverized coal under heat treatment conditions, and then predict the quality of the coke in combination with data from subsequent tests on the coke produced. Data mining algorithms such as random forests (Chehreh Chelgani et al., 2016), back propagation neural networks (Khoshjavan et al., 2011) and integrated deep learning (Yin et al., 2020) have been developed to analyze the quality indicators of blended coal and coke, learn the features relating the indicators, and finally use the resulting feature vectors for regression to build a predictive model of coke quality. Although these approaches have had some success, coal is heterogeneous, i.e., coals of similar quality can have different structures and compositions due to differences in geological conditions, and there are many types of single coals used as coke feedstock, with different microstructures and qualities. It is therefore difficult to predict coke quality from the properties of single coals and mixed coals themselves (Flores et al., 2017; Meng et al., 2017). A new approach that combines the strengths of different models is needed to enhance the reliability and effectiveness of coke quality prediction.

Furthermore, for coal blending optimization, practitioners want to minimize production costs while maintaining coke quality. A number of optimization methods have been proposed by the industry to solve such problems. These methods are usually based on a predictive model relating blended coal and coke, as summarized above; they construct an objective function that reasonably describes the optimization problem and then choose an appropriate optimization algorithm to solve it (Tian et al., 2016). Recent research has found that, owing to their ability to handle non-linear optimization under complex constraints efficiently, genetic algorithms (Chakraborty and Chakraborty, 2012; Ilbeigi et al., 2020; Kurnadi et al., 2022; Xi-Jin et al., 2009; You et al., 2020) and particle swarm algorithms (Jagtap et al., 2020; Li and Yao, 2017; Yuan et al., 2020) have been used in the industry to solve some complex optimization problems. Genetic algorithms perform well on complex constrained and non-linear optimization problems and are more robust than other optimization algorithms. They also allow the population size to be increased to improve the efficiency and robustness of the search; this inevitably increases time and space overheads, but these can be reduced by parallel and distributed operations (Akopov et al., 2019; Kim and Kim, 2017; Lu et al., 2020, Lu et al., 2020). However, the above optimization algorithms still have limitations: when the constraints are complex, they are highly likely to have difficulty finding solutions that satisfy all constraints (Ma et al., 2021). Furthermore, traditional heuristic optimization algorithms (including genetic algorithms and particle swarm optimization) still perform poorly in terms of time efficiency (Abbasi et al., 2020).
These issues have been confirmed to potentially be fatal in coal blending industrial production (Yuan et al., 2020). Therefore, proposing a novel optimization algorithm that can balance time efficiency and solution success rate is an urgent and significant challenge for researchers to address.

In summary, in order to optimize the cost of coal distribution, there are two key issues as follows.

1. Develop a predictive model for blended coal and coke quality with a high confidence level

Because the quality relationship between blended coal and coke is highly complex, it is difficult to extract feature vectors that are closely related to the corresponding quality indicators. Since this relationship is not simply linear, a suitable non-linear model is needed to describe it.

2. Select a reasonable optimization algorithm to solve the optimization objective function

The optimization of coke costs relies on the non-linear predictive model described above and has a non-convex feasible domain; in practice, numerous complex input constraints and proportionality constraints must also be considered. Without a suitable optimization algorithm, finding an optimal solution can be difficult or costly.
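As a reading aid, the second issue can be written as a constrained cost minimization. The formulation below is only a sketch under stated assumptions: none of the symbols come from the paper. Here $x_i$ is the assumed blending ratio of single coal $i$, $c_i$ its unit cost, $l_i$ and $u_i$ hypothetical lower and upper limits on its ratio, $f_k$ the non-linear predictive model for coke quality indicator $k$, and $q_k^{\min}$, $q_k^{\max}$ the assumed acceptable range for that indicator.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{n} c_i x_i \\
\text{s.t.}\quad & \sum_{i=1}^{n} x_i = 1, \\
& l_i \le x_i \le u_i, \qquad i = 1,\dots,n, \\
& q_k^{\min} \le f_k(x) \le q_k^{\max}, \qquad k = 1,\dots,K.
\end{aligned}
```

Because each $f_k$ is a learned non-linear model, the feasible set defined by the last line is generally non-convex, which is why a heuristic search method such as a genetic algorithm is considered instead of a convex solver.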

To address the above issues, this paper proposes a coal allocation optimization system based on a hybrid residual prediction model with a genetic algorithm, which is shown to predict coke quality accurately and to optimize coal allocation costs effectively. To improve the robustness of the model and to avoid the situation in which the model fits well but predicts poorly, we propose the use of residual prediction to reduce the influence of ' in the data set by predicting the difference between the target index and the input values, thus improving data utilization and the prediction effectiveness of the model (Peng et al., 2015; Zhou et al., 2017). We then use a random forest algorithm to predict the residuals and extract the feature vectors associated with the target metric, minimizing the effect of irrelevant variables in order to improve prediction accuracy and reduce the risk of overfitting (Chehreh Chelgani et al., 2016). Finally, we tested AdaBoost, LightGBM and XGBoost, selected the method with the highest prediction accuracy to build the prediction model, and combined the prediction models built by the different methods to improve prediction performance.
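The residual idea and the "pick the best model per indicator" step can be illustrated with a minimal sketch. This is not the authors' implementation: the two candidate models below (a constant-offset baseline and a least-squares line) are deliberately simple stand-ins for the random forest, AdaBoost, LightGBM and XGBoost models named in the text, and the function names and assay values are hypothetical.

```python
def fit_mean_residual(x, y):
    """Baseline stand-in: assume the residual y - x is a constant offset."""
    offset = sum(yi - xi for xi, yi in zip(x, y)) / len(x)
    return lambda xi: xi + offset


def fit_linear_residual(x, y):
    """Fit residual = a * x + b by least squares, then add it back to x."""
    res = [yi - xi for xi, yi in zip(x, y)]
    mean_x = sum(x) / len(x)
    mean_r = sum(res) / len(res)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    a = sum((xi - mean_x) * (ri - mean_r) for xi, ri in zip(x, res)) / sxx
    b = mean_r - a * mean_x
    # Final prediction = input value + predicted residual.
    return lambda xi: xi + (a * xi + b)


def rmse(model, x, y):
    """Root mean squared error of a fitted model on (x, y)."""
    return (sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)) ** 0.5


def select_best(fitters, train, valid):
    """Fit every candidate on train; keep the one with the lowest validation RMSE."""
    scores = {name: rmse(fit(*train), *valid) for name, fit in fitters.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical blended-coal ash (input) and coke ash (target) values,
# made up for illustration only.
train = ([9.0, 9.5, 10.0, 10.5, 11.0], [12.1, 12.7, 13.4, 14.0, 14.6])
valid = ([9.2, 10.8], [12.4, 14.3])

candidates = {
    "mean_residual": fit_mean_residual,
    "linear_residual": fit_linear_residual,
}
best_name, scores = select_best(candidates, train, valid)
print(best_name, scores)
```

In a fuller version, `select_best` would be run once per quality indicator, so that different indicators may end up with different underlying models, which is the sense in which the overall predictor is a hybrid.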

After building a coke prediction model with high confidence, we propose an improved genetic algorithm for the coal allocation optimization problem. First, the improved genetic algorithm is based on the adaptive weight genetic algorithm (awGA), which has good convergence and solution set distribution for optimization problems with complex constraints and high-dimensional decision variables, and is therefore suitable for solving industrial-scale optimization problems (Gu and Wang, 2020; Liu et al., 2020). An improved genetic algorithm based on prior knowledge and adaptive stochastic initialization is then designed and implemented for solving optimization problems with strict constraints, in order to provide better generalization capabilities. Finally, using the improved genetic algorithm, we obtained the optimal single coal blending ratios and the corresponding cost.

The main contributions of this paper can be summarized as follows:

1) Presented an improved hybrid residual prediction model applied to the task of quality prediction for coke and blended coal. Building upon the Random Forest feature extraction method, the residuals between blended coal and single coal, as well as between coke and blended coal, are computed. Various prediction models are then employed to forecast these residuals. Finally, an optimal strategy for model selection is determined by evaluating metrics, selecting the model with the best predictive performance for different coal quality indicators, and integrating them into a comprehensive quality prediction model.
2) Introduced an enhanced prior knowledge-based and adaptive random initialization genetic optimization algorithm (P-awGA) with adaptive weights. Building upon the adaptive weight genetic algorithm (awGA), to address the challenge of generating initial populations under strict constraints, this paper employs an adaptive constraint initialization method and prior knowledge to generate the initial population. Furthermore, an adaptive weight adjustment method is employed to guide changes in constraint conditions, aiming to improve the success rate of solving and optimization effectiveness.
3) Proposed a novel coal allocation optimization method integrating the aforementioned algorithms. Utilizing past blended coal ratios as prior knowledge, a population is generated around them. The fitness of the population is calculated using the hybrid residual prediction model, and an adaptive search mechanism ensures that the generated population satisfies constraint conditions. Ultimately, the optimal individual is iteratively determined, with its corresponding single coal ratio serving as the result. Through validation with real coal blending data, the method proposed in this paper demonstrates superior success rates in solving under strict constraints and significantly reduces coal blending costs.
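The idea of seeding the initial population from past blending ratios can be sketched as follows. This is an illustrative sketch only and not the P-awGA algorithm itself: the Gaussian perturbation, the `sigma` parameter, the per-coal bounds and the rejection of infeasible candidates are all assumptions made for the example, and the fitness evaluation, adaptive weights and genetic operators are omitted.

```python
import random


def is_feasible(ratios, lower, upper, tol=1e-9):
    """Check per-coal bounds and that the ratios sum to 1."""
    in_bounds = all(
        lo - tol <= r <= hi + tol for r, lo, hi in zip(ratios, lower, upper)
    )
    return in_bounds and abs(sum(ratios) - 1.0) <= 1e-6


def init_population(prior, lower, upper, size, sigma=0.02, max_tries=10_000, seed=0):
    """Generate feasible individuals by perturbing a past (prior) blending ratio."""
    rng = random.Random(seed)
    # Keep the prior itself as the first individual if it is feasible.
    population = [list(prior)] if is_feasible(prior, lower, upper) else []
    tries = 0
    while len(population) < size and tries < max_tries:
        tries += 1
        # Perturb each ratio and clip it to its bounds.
        candidate = [
            max(lo, min(hi, p + rng.gauss(0.0, sigma)))
            for p, lo, hi in zip(prior, lower, upper)
        ]
        total = sum(candidate)
        if total <= 0:
            continue
        # Renormalise so the ratios sum to 1, then re-check the bounds.
        candidate = [c / total for c in candidate]
        if is_feasible(candidate, lower, upper):
            population.append(candidate)
    return population


# Hypothetical prior ratio and bounds for three single coals.
prior = [0.30, 0.45, 0.25]
lower = [0.20, 0.30, 0.10]
upper = [0.40, 0.60, 0.40]
population = init_population(prior, lower, upper, size=20)
print(len(population), population[1])
```

Sampling near a ratio that is already known to work keeps most candidates inside the feasible region, whereas uniform random sampling under strict constraints would reject the large majority of individuals before the search even starts.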

Section snippets

Hybrid residual prediction model

Firstly, this paper uses the linear weighting method to build the prediction model from single coal to blended coal. After multiplying the input single coal index data by the single coal ratio and summing them, the index data of blended coal is obtained.
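The linear weighting step described above can be sketched in a few lines. The function name, the indicator keys and the assay values are hypothetical and chosen only for illustration; the only assumption about the method is what the text states, namely that each blended-coal indicator is the ratio-weighted sum of the single-coal indicators.

```python
def blend_indicators(single_coal_indicators, ratios):
    """Return blended-coal indicators as the ratio-weighted sum of single-coal values."""
    if len(single_coal_indicators) != len(ratios):
        raise ValueError("one ratio is required per single coal")
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("blending ratios must sum to 1")
    names = single_coal_indicators[0].keys()
    return {
        name: sum(r * coal[name] for r, coal in zip(ratios, single_coal_indicators))
        for name in names
    }


# Hypothetical assay values for two single coals (not from the paper's data set).
coals = [
    {"A_d": 9.0, "S_td": 0.6, "V_daf": 24.0},
    {"A_d": 11.0, "S_td": 1.0, "V_daf": 30.0},
]
blend = blend_indicators(coals, [0.5, 0.5])
print(blend)
```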

After building a quality prediction model from single coal to blended coal, a hybrid residual prediction model from blended coal to coke is then built. The prediction models for each metric used in this paper are independent of the prediction

Simulation conditions

All simulations covered in this paper were conducted on an AMD Ryzen 7 2700 CPU at 3.3 GHz with 16 GB RAM, running the 64-bit Windows 10 operating system. The programming language used was Python. The data used in this paper came from the assay data of a coal company for the blended coal and the coke produced from it from 2016 to 2020, totalling 2939 samples. In order to improve the accuracy and generalization of the model, the order of the data in the dataset was shuffled and the dataset was

Conclusions

The optimization of coal blending solutions is a challenging problem for the normal production and quality control of coke plants. However, existing coal allocation and coke quality prediction methods, which are based on traditional optimization algorithms and neural network prediction algorithms, suffer from problems such as low efficiency and poor prediction accuracy.

To address these problems, this paper proposes a coal allocation optimization algorithm based on a hybrid residual prediction

CRediT authorship contribution statement

Ming Liu: Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing. Ziqi Yu: Software, Visualization, Writing – original draft. Boran Li: Software, Validation, Writing – original draft. Qingjie Wang: Data curation, Formal analysis. Huawei Ren: Data curation, Formal analysis. Dong Xu: Funding acquisition, Methodology, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was partially supported by the National Science Foundation of China (grants 11631003; Ming Liu), the Science and Technology Project of Jilin Provincial Development and Reform Commission (grant 2022C043-2; Ming Liu), the Natural Science Foundation of Jilin Province (grant 20200201157JC; Ming Liu), and Paul K. and Diane Shumaker Endowment Fund at University of Missouri (Dong Xu).