Document Type : Research Paper
Authors
 Hossein Yavari ^{1}
 Mohammad Sabah ^{} ^{1}
 Rassoul Khosravanian ^{2}
 David. A Wood ^{3}
^{1} M.S. Student of Petroleum Department of Amirkabir University of Technology, Tehran, Iran
^{2} Assistant Professor of Petroleum Department of Amirkabir University of Technology, Tehran, Iran
^{3} DWA Energy Limited, Lincoln, United Kingdom
Abstract
The rate of penetration (ROP) is one of the vital parameters which directly affects the drilling time and costs. There are various parameters that influence the drilling rate; they include weight on bit, rotational speed, mud weight, bit type, formation type, and bit hydraulic. Several approaches, including mathematical models and artificial intelligence have been proposed to predict the rate of penetration. Previous research has showed that artificial intelligence such as neural network and adaptive neurofuzzy inference system are superior to conventional methods in the prediction of drilling rate. On the other hand, many complicated analytical ROP models have also been developed during recent years that are able to predict drilling rate with a high degree of accuracy. Therefore, comparing different approaches to find the most accurate model and assess the conditions in which each model works well can be highly effective in reducing drilling time as well as drilling cost. In this study, HarelandRampersad (HR) model, Bourgoyne and Young (BY) model, and an adaptiveneurofuzzy inference system (ANFIS) are employed to predict the drilling rate in the South Pars gas field (SP) offshore of Iran, and their results are compared to find the best ROPprediction model for each formation. A database covering the drilling parameters, sonic log data, and modular dynamic test data collected from several drilling sites in SP are used to construct the mentioned models for each formation. The results show that when a large amount of data is available, the ANFIS is more accurate than the other approaches in predicting drilling rate. In the case of ROP models, BY model works considerably better than HR model for the majority of the formations. However, in formations where some drilling parameters are constant, but formation strength is variable, HR model shows better prediction performance than BY model.
Keywords
 Rate of Penetration (ROP)
 ANFIS
 Bourgoyne and Young
 HarelandRampersad
 Simulated Annealing Algorithm (SAA)
Main Subjects
The exploration and development of gas and oil fields require the efficient and costeffective drilling of well bores, which can be achieved through the optimization of several variables (Aghajanpour et al., 2017; Alexeyev et al., 2017; Eren, 2010a). The rate of penetration (ROP) is one of the important parameters, which should be predicted and optimized to reduce the drilling costs. There are various parameters which significantly affect the drilling rate, some of which include formation properties, weight on bit, bit rotational speed, bit type, hydraulic, and bit wear; these parameters make drilling rate render its behavior nonlinear and difficult to predict.
Several researchers have investigated the effects of drilling parameters on the rate of penetration. Bilgesu et al. (1997) modeled the rate of penetration and bit wear using a new neural network under various formation types and drilling parameters. In their work, the operational parameters such as formation type, torque, weight on bit, rotational speed, and hydraulic horsepower are considered to predict drilling rate. Xiangchao Shi et al. (2015) employed a new confined compressive strength (CCS) model combined with specific mechanical energy to optimize the operational parameters. Their results showed that these parameters could be used to detect efficient drilling situations. Bodaghi et al. (2015) established a formulation between drilling variables and the rate of penetration by using optimized support vector regression. For this purpose, the genetic and cuckoo search algorithms were utilized to optimize the support vector regression. Following this, Ansari et al. (2016) predicted the rate of penetration by using a method based on imperialist competitive algorithm. Kahraman (2016) predicted the drilling rate by using artificial neural network and a multiple regression method. In addition to operational parameters, uniaxial compressive strength, tensile strength, and relative abrasiveness were considered as the input variables. Their results showed that the artificial neural network was more reliable than multiple regression method to predict the ROP. In a study of Khandelwal et al. (2016), multiple regression method, artificial neural network, and hybrid genetic algorithm were used to estimate the drilling rate. The comparison of the results showed that the hybrid genetic algorithm had better performance in ROP prediction compared to the other models. Yi et al. (2014) used a shuffled frog algorithm to obtain optimum values of drilling parameters. Weight on bit, bit rotational speed, flow rate, and bit tooth wear are some of the variables considered in their study. Jiang et al. (2016) combined an artificial neural network and ant colony optimization (ACO) to acquire optimum values for drilling rate. They considered the weight on bit, rotational speed, flow rate, and gamma ray as the input variables. Xian Shi et al. (2016) used an extreme learning machine, an upperlayer solution aware model, and an artificial neural network to predict drilling rate. Their results indicated that all of these methods can be an appropriate method to predict the drilling rate. Hegde et al. (2015) predicted the rate of penetration by using statistical learning techniques such as trees, bagged trees, and random forests; these techniques were used for a data set with nine predictors. Khosravanian et al. (2016) used the fuzzy inference systems (FIS) of Sugeno and Mamdani to predict the weight on bit. According to their results, Sugenotype FIS is more accurate than Mamdanitype FIS in the prediction of weight on bit. Moraveji et al. (2016) investigated the effect of six variables on ROP. They used response surface methodology to develop a mathematical relation between drilling rate and drilling parameters such as depth, weight on bit (WOB), RPM, jet impact force, yield point to plastic viscosity ratio, and 10 minto10 s gel strength ratio. They used bat algorithm to optimize drilling parameters to reach the maximum ROP. Their results showed that this model provided an efficient tool for the prediction and optimization of drilling rate. Monazami et al. (2012) utilized ANN for the prediction of ROP in one of Iranian oil fields. Their results showed that ANN is a useful tool for the prediction of ROP, especially when the relationships between drilling parameters and ROP are too complicated.
Arabjamaloei et al. (2011) used ANN to predict the drilling rate in one of the Iranian oil fields, and they then optimized the drilling parameters using genetic algorithm to achieve the maximum drilling rate. There was good agreement between their results and the real field data, indicating that the ANN was able to predict the drilling rate accurately. Basarir et al. (2014) compared the performance of linear regression, nonlinear regression, and ANFIS for the prediction of ROP. Their results showed that ANFIS is the most accurate method compared to the regression methods. The previous researches proved that the ANN is able to predict the ROP and other parameters such as permeability, minimum miscibility pressure, and stuck pipe when a large data set is available (Afshari et al., 2014; Ahmadi et al., 2012, 2013; Arabjamaloei et al., 2011; Asoodeh et al., 2015; Monazami et al., 2012; Rabiei et al., 2015; Hedayat Rahimzadeh et al., 2010; Shadizadeh et al., 2010; Zoveidavianpoor et al., 2013b). Since ANFIS is a combination of artificial neural network and fuzzy logic (Zoveidavianpoor et al., 2013a), it is expected that its results are more accurate than ANN in cases which a low amount of data exists (Basarir et al., 2014). Therefore, in this study, based on the number of available data, ANFIS is utilized to predict ROP.
Clearly, several approaches have been employed to predict the drilling rate and to optimize the drilling parameters to various degrees of accuracy in recent years (Hedayat Rahimzadeh et al., 2010). It is worth noting that the accuracy of these approaches depends on various parameters such as drilling condition, drilling variables, and the number of data points they incorporate. Selecting the most accurate approach for predicting the drilling rate can be extremely beneficial to reducing the drilling time as well as drilling costs.
In this study, HarelandRampersad (HR) model and Bourgoyne and Young (BY) model, which are the most widely used ROP models, together with an adaptiveneurofuzzyinference system (ANFIS) are used to predict the ROP in the South Pars (SP) gas field offshore of Iran. Their results are compared to find the most proper model for each formation and to assess the conditions in which each model works well.
2. Field information and data analysis
SP gas field consists of various phases, and the drilling data of one of these phases have been used in this study. Due to confidentiality purposes, the name of the field development phase and specific well numbers cannot be disclosed. The geological formations penetrated by the wells of the SP phase studied, in order of the shallowest to the deepest, consist of the Asmary, Ilam, Sarvak, Upper limestone, Dashtak, Surmeh, and Kangan.
All the information, which are required to construct ROP models for each formation, have been extracted from the daily mud logging reports (DMLR), modular dynamic test (MDT), and sonic log data. The extracted drilling data include drilling rate, weight on bit (WOB), rotational speed (RPM), pump flow rate, mud weight, bit type, and bit wear. Modular dynamic test and sonic log data are used to estimate the pore pressure and uniaxial compressive strength respectively. The overall data set consists of 721 records divided randomly into two parts, in which 70% (504 data records) of the overall data are used to construct (train) the model, and the remaining 30% (217 data records) are employed to test the developed model. All the data used herein are displayed graphically in Figures 12.
Figure 1
Drilling data extracted from daily drilling reports of one of the wells.
Figure 2
Drilling data extracted from daily drilling reports, pore pressure, and modular dynamic test data.
3. Bourgoyne and Young (BY) model
Several models have been suggested to predict the rate of penetration. The model established by Bourgoyne and Young (Bourgoyne Jr. et al., 1974) is one of the most complete drilling models used for roller cone bits (Nascimento et al., 2015). They suggested using eight functions to model the influence of different drilling parameters such as weight on bit, rotary speed, bit wear, formation strength, and jet impact force. BourgoyneYoung drilling model is defined by Equation 1:
(1) 
where,
(2) 

(3) 

(4) 

(5) 

(6) 

(7) 

(8) 

(9) 
in which, D (ft.)is the true verticalwell depth, and (lbm/gal) represents the porepressure gradient; (lbm/gal)is the equivalent circulating density, and (1000 lbf/in) stands for the threshold bit weight per inch of bit diameter at which the bit begins to drill; h is the fractional tooth wear; (lbf) denotes the hydraulic impact force beneath the bit, and through are the constants which must be chosen on the basis of local drilling conditions. All the information required to construct this model for each formation is extracted from daily mud logging reports (DMLR) and modular dynamic test (MDT) data. The multiple regression method is utilized to determine the model constants ( through ) for each formation. Initially, Bourgoyne and Young model needs to be expressed in a linear form; the linear form is obtained by taking the natural logarithms of both sides of Equation 1. Equation 10 shows the linear form of the BY model. Table 1 lists the values of to in the linear form of Bourgoyne and Young model.
(10) 
Table 1
Coefficients of linear form of Bourgoyne and Young model.
Characteristic 
Variable 
Amount 
Normal compaction parameter 

Under compaction parameter 

Pressure differential parameter 

Bit weight parameter 

Rotary speed parameter 

Tooth wear parameter 

Hydraulic parameter 
The and values (Equation 10) are then determined using the data set. The general form of multiplelinear regression (MLR) for a problem with k constants is as shown in Equation 11, where is the number of the data records involved (Eren, 2010b). By solving this matrix, the constants can be determined:
(11) 
Multiplelinear regression is an accurate method for determining the constant coefficients of a model, but sometimes, because of the quality of the existing data (e.g. anomalous data points in certain data records), it leads to meaningless constants (e.g. calculating negative ROP’s), especially when applied to Bourgoyne and Young model. In this situation, the abnormal data points need to be removed from the data set. In addition, there are some numerical methods for determining the constant coefficients. In this study, the simulated annealing algorithm was utilized to determine Bourgoyne and Young model constants when MLR leads to meaningless constants.
Simulated annealing algorithm
In 1983, Kirkpatrick et al. (1983) successfully used the simulation of physical annealing in optimization. The parameters of simulated annealing algorithm (SAA) include the design variables (X_{0}); the energy state (E(X_{0})),which is equivalent to the objective function; the initial temperature (T_{0}); the freezing temperature (T_{f}); the length of Markov chain (L); and temperature decrement factor (α) (Granville et al., 1994). SAA uses a Metropolis criterion to escape from local optimum point and to have a better chance to obtain the global optimum point. The acceptance probability of the Metropolis criterion is as follows (Metropolis et al., 1953).
(12) 
Figure 3 shows the flowchart of the simulated annealing algorithm.
Figure 3
Flowchart of the simulated annealing algorithm (Wang et al., 2016).
As shown in Figure 3, step 1 determines the algorithm parameters. Because there is no direct way, a trial and error approach is used to determine the best algorithm parameters. Second, in step 2, a random set of constant coefficients is generated. Thirdly, using these constants, the ROP values are calculated, and the RMSE_{1} for all the data set is determined using Equation 13. Fourthly, a neighboring set of constant coefficients is generated in step 4, and the RMSE_{2} is determined for this new solution. In the fifth place, If RMSE_{2} is lower than RMSE_{1}, the new solution is considered as the best solution, otherwise if P (Solution) > Random (0~1), the solution 2 is then considered as the best solution. Finally, the temperature decreases , and the process returns to step 4. The process is repeated until the freezing temperature is met.
In this study, objective function is considered as the root mean square error (RMSE). The RMSE is calculated using Equation 13.
(13) 
Bourgoyne and Young recommended lower bound and upper bound for each of the eight coefficients. Table 2 contains recommended bounds for each of the constants (Bourgoyne et al., 1991; H Rahimzadeh et al., 2011).
Table 2
Recommended bounds for each of the constant coefficient of Bourgoyne and Young model (Bourgoyne et al., 1991; H Rahimzadeh et al., 2011).
Coefficients 
Lower bound 
Upper bound 
0.5 
1.9 

0.000001 
0.0005 

0.000001 
0.0009 

0.000001 
0.0001 

0.5 
2 

0.4 
1 

0.3 
1.5 

0.3 
0.6 
Using a trial and error approach, the best parameters of the SAA are determined. Table 3 contains the algorithm parameters used herein.
Table 3
The best algorithm parameters.
No. 
SAA parameters 
Value 
1 
T_{0} 
1000 
2 
Α 
0.925 
3 
Length of Markov chain 
8 
4 
T_{f} 
0.0002 
Figure 4 shows the performance of the simulated annealing algorithm for determining the constant coefficients of Bourgoyne and Young model in Surmeh formation. It shows that the determined constants can predict the ROP with an RMSE value equal to 2.97318. The second diagram shows the determined constant coefficients, and the third diagram displays the final temperature. It shows that the algorithm stops when the temperature reaches the freezing temperature.
Figure 4
Performance of the simulated annealing algorithm for determining Bourgoyne and Young model in Surmeh formation.
The calculated constants are summarized in Table 4. When SAA is used for determining the constant coefficients of mathematical models, sometimes after each run, a set of completely different constants is obtained. In this situation, the parameters of the algorithm must be changed to reach the global optimum. The measured and predicted drilling rates for each formation are depicted in Figure 5.
Table 4
Bourgoyne and Young model constants for each formation.
Formation 

Asmary 
1.535 
0.000075 
0.000002 
0.0001 
0.595 
0.94 
0.895 
0.49 
Ilam 
1.125 
0.000101 
0.000459 
0.000002 
0.5 
0.405 
1.4999 
0.595 
Sarvak 
1.289 
0.000079 
0.000002 
0.000002 
0.505 
0.98 
0.3067 
0.435 
Upper limestone 
1.259 
0.00001 
0.000043 
0.000002 
0.865 
1 
0.305 
0.595 
Dashtak 
1.329 
0.000044 
0.00012 
0.000002 
0.505 
0.95 
0.301 
0.3046 
Surmeh 
1.831 
0.000021 
0.0001 
0.000001 
0.619 
0.516 
0.375 
0.342 
Kangan 
1.525 
0.000001 
0.000001 
0.000001 
0.79 
0.995 
0.303 
0.410 

Figure 5
Measured versus predicted drilling rate for each formation by Bourgoyne and Young model.
4. HarelandRampersad (HR) model
In 1994, Hareland and Rampersad established a model to predict the drilling rate. This model is derived using the conservation of mass where the rate of cutting removal is equivalent to the rate of penetration (Hareland et al., 1994). The general form of the model for completely efficient bit cleaning is as follows:
(14) 
where, ROP (ft./hr.) is the predicted rate of penetration, and WOB (lb_{.}) is weight on bit; RPM is the rotary speed of bit in revolution per minute; G stands for a constant coefficient which is related to bit type and blade geometry, and (in) denotes bit diameter; (psi) is the uniaxial compressive strength (UCS) of the rock, and and are model constants; indicates the wear function. Wear function calibrates the ROP for a worn bit. The wear function is calculated using Equation 15 as a function of cumulative bit wear. The cumulative bit wear is a function of applied RPM and WOB, rock strength, and length of drilling interval:
(15) 
where, ΔBG is the IADC bit dull grading estimated by Equation 16:
(16) 
In above equation, represents the bit wear coefficient and is a function of the durability of PDC layer material and the relative hardness of the cutters with respect to the PDC layer material (Liu et al., 2014). and are model constants, and is the length of drilling interval in feet. By assuming a linear relation for the bit wear function, is given by (Hedayat Rahimzadeh et al., 2010):
(17) 
where,
(18) 
In this model, the uniaxial compressive strength of the rock is obtained using empirical correlations which are a function of bulk density and sonic wave transit time. The same data set, which is used to construct Bourgoyne and Young model, is utilized to obtain Hareland model constants in each formation. For the calculation of , the summation term in Equation 18 is calculated using drilling data extracted from MLDR; then, the IADC bit dull grading term is divided by that summation to provide the value of . In addition, HarelandRampersad model has three constants; for determining these constants, the model needs to be linearized. The linear form of the HR model is written by Equations1920. The formulas for calculating the HR coefficients of and are given in Table 5.
(19) 

(20) 
Table 5
Coefficients of linear form of HarelandRampersad model.
Characteristic 
Variable 
Amount 
Drilling rate parameter 
Y 

Weight on b parameter 
X_{1} 

Rotary speed parameter 
X_{2} 

Constant coefficient 
Then, and values are determined using the data set, so MLR matrix is constructed. By solving the MLR matrix, the constants are determined. When MLR leads to negative constants, SAA can be employed to determine the constants. Table 6 contains the recommended ranges of HarelandRampersad model constants.
Table 6
Recommended bounds of HarelandRampersad model constants.
Coefficients 
Lower bound 
Upper bound 
0.1 
100 

0.5 
1.5 

0.5 
1.5 
The model constants for each formation are summarized in Table 7. Figure 6 shows the measured and predicted drilling rate acquired by Hareland model with the best fit line and values for different formations of the SP gas field.
Table 7
Hareland model constants for each formation.
Formation 
G 
α 
ɣ 

Asmary 
0.00521673 
1.3982 
0.9025 
1.53E6 
Ilam 
53.489 
0.5858 
0.9887 
3.4E7 
Sarvak 
7.198 
0.6113 
1.2944 
1.72E7 
Upper limestone 
50.436 
0.7473 
0.7163 
9.32E8 
Dashtak 
50.6 
0.5944 
1.0074 
3.06E8 
Surmeh 
54.535 
0.6948 
0.8687 
6.11E8 
Kangan 
53.425 
0.79568 
0.63118 
1.68E8 

Figure 6
Measured versus predicted drilling rate for each formation by Hareland model.
Artificial intelligence models
Previous researches proved that ANN can predict ROP accurately within the range of input data when a large data set is available. Table 8 contains the drilloff test data of Surmeh formation in a 12.25 inch hole section which was drilled with a PDC bit. The mud weight and flowrate were equal to 9.6 ppg 870 gpm respectively. The ANN and ANFIS are employed to construct a model for the prediction of ROP using the drilloff test data.
Table 8
Drilloff test data of Surmeh formation.
WOB (klb) 
5 
8 
11 
13 
14 
15 
17 
14 
14 
14 
14 
RPM (rpm) 
180 
180 
180 
180 
180 
180 
180 
190 
200 
210 
220 
ROP (m/hr.) 
5.5 
9.12 
13.46 
14 
14.21 
13.5 
13.09 
15.31 
16.5 
17.35 
18 
The WOB and RPM are considered as the input variables, and ROP is the output of the model. The available data sets consisted of 80% for the pure network learning process, 10% for validation, and 10% for testing purposes. A twolayered feedforward back propagation algorithm with 10 neurons in the hidden layer is used. The LevenbergMarquardt (LM) algorithm is selected for training the network. Figure 7 shows the structure of the designed ANN model.
Figure 7
Structure of the suggested ANN for the prediction of ROP.
Figure 8 shows the relationship between WOB, RPM, and ROP using the constructed ANN model. As it is seen, the ANN is able to predict the ROP only within the range of the input data, and when there is not enough input data with a sufficient distribution, it cannot be used as a reliable model for the prediction of ROP.
Figure 8
Relationship between WOB, RPM, and ROP using the designed ANN model.
Following this, ANFIS is used to construct a model for the prediction of ROP using the drilloff test data. WOB and RPM are the input variables, and ROP is the output of the model. The data used to construct the ANFIS are the same as ANN: 80% for training, 10% for testing, and 10% for checking purposes. The type of membership functions is selected as a Gaussian curve, and three membership functions are chosen for each linguistic variable. Figure 9 shows the structure of the designed ANFIS model.
Figure 9
Structure of the designed ANFIS model for the prediction of ROP.
Figure 10 shows the relationship between WOB, RPM, and ROP using the designed ANFIS model. It is noticeable that ANFIS has better performance than ANN in a way that it can be used as a reliable model for the prediction of ROP when a few numbers of input data with a sufficient distribution are available.
Figure 10
Relationship between WOB, RPM, and ROP using the designed ANFIS model.
It can be concluded that in cases in which a vast amount of data set are not available, using ANFIS instead of ANN can lead to more accurate and reliable results.
5. Theoretical basis and justification for ANFIS applied to ROP prediction models
The models which are based on fuzzyinference system use linguistic terms and ifthen rules instead of numerical terms. Linguistic variables have their values expressed as words or sentences of natural language describing degrees of membership. A fuzzy set, which belongs to these linguistic variables, is an extension of a crisp set where each element can have binary membership, i.e., full membership or no membership. However, fuzzy sets allow partial membership in which an element can partially belong to one or more than one set (Nedjah et al., 2005). In other words, in a crisp set, the membership level of x elements in set A can be expressed by the membership function , such that if:
(21) 
where, for fuzzy set A, membership function has values between 0, 1.
In ANFIS, input data are converted to fuzzy input by membership functions. Following this, fuzzy inputs are entered into the neural network block. This block is connected to an inference engine which includes a rule base. A backpropagation algorithm is used to train the inference engine and to determine appropriate rules that can reproduce meaningful dependent variable values. After training, the rules generated are applied to the dataset from the neural network to yield the optimum output. Then, the output which is obtained from the neural network block is converted into crisp values by a defuzzification algorithm (Sugeno et al., 1988). This analytical sequence is shown in Figure 11. There are five layers in the structure of neurofuzzy system; these layers are fuzzification, rules, normalization, defuzzification, and output as shown in Figure 12. Each of these layers includes nodes, which processes the fuzzy inputs. Since the ANFIS only allows one model output, the outputs of these nodes are combined to yield a single crisp output. Then, the derived output is reentered as an input to the model and compared with the actual set value. If there is any deviation, the error signal is generated and becomes the input to the next iteration of the ANFIS model. Following a series of iterations, results converge to a stable system with minimal errors between the predicted and measured values (Mathur et al., 2016).
The Takagi, Sugeno, and Kang (TSK) fuzzy inference system is used to construct the ANFIS model, which consists of two rules (Sugeno et al., 1988). The TSK ROP model involves two inputs WOB and RPM, one output ROP, and fuzzy sets . and are fuzzy sets of variables WOB and RPM respectively. In the ANFIS model, the relation between the inputs and output is expressed by the following IfThen rules:
Rule 1:
Rule 2:
where, , , , , , and are consequent parameters. are the fuzzy sets which represent the linguistic labels. Each layer in the ROP ANFIS model consists of the following node functions:
Layer 1: This layer is the fuzzification layer. In this layer, the crisp value enters into node which is converted into a fuzzy value associated with fuzzy set or . Then, the membership level of this input is determined by a membership function of the respective fuzzy set. The output of each node is calculated by using the following equations:
(22) 

(23) 
Layer 2: This layer is the first rule layer. The nodes of this layer are fixed, and they multiply the membership levels of all the inputs according to each rule as follows:
(24) 
where, denotes the output of layer 2, and is the firing strength. In this layer, each node calculates the firing strength of each rule via multiplication, and the rule which has the high firing strength matches the input data. The number of nodes is equal to the number of rules in this layer.
Layer 3: This layer is the second rule layer. In this layer, each node calculates the ratio of firing strength of each rule to the sum of all rules. The firing strength is calculated by:
= 
(25) 
where, represents normalized firing strength.
Layer 4: This layer is the defuzzification layer. The node function in this layer is calculated as follows:
(26) 
where, is normalized firing strength calculated from layer 3, and can be a polynomial function or constant number. is a consequent parameter set for rule (Jang, 1993).
Layer 5: This layer is the output layer. It only has one node, and this node calculates the sum of the output of all the nodes from layer 4 to produce the overall ANFIS output as reads:
(27) 
This ROP ANFIS model developed using the training data subset is then used to predict the rate of penetration for each data set record of the test subset so as to determine its accuracy.
Figure 11
A highlevel schematic for the sequence involved in a fuzzy neural network (Mathur et al., 2016).
Figure 12
ANFIS architecture involving two rules and two inputs.
5.1 Prediction of drilling rate by ANFIS
The same data set which is used for the two previous mathematical ROP models (i.e., BY and HR models) is utilized to construct the ANFIS model. After the classification of the data set, the ANFIS model is trained using MATLAB software. The Takagi, Sugeno and Kang (TSK) fuzzy inference system is used to construct the ANFIS model, and a hybrid rule algorithm is utilized to train the adaptive network. Model inputs include depth, weight on bit, rotational speed, flow rate, mud weight, pore pressure, and bit wear, and the only output is the rate of penetration. In the developed model, threemembership functions are considered for each input data record. The type of membership functions is trimf membership function which consists of three constants. The linguistic expressions for the input data, except for bit type, are low (L), moderate (M), and high (H). These linguistic labels state the relation between the input and output data via fuzzy IfThen rules. The linguistic labels and corresponding membership functions for the Sarvak formation are summarized in Table 9.
In this study, IfThen rules are created according to the relationship between the input and output for each record of the training subset of the data set. The created rulebase contains 2187 rules, and a sample of these rules is given by:
Rule 1:
The last step is defuzzification and drilling ROP is converted from a fuzzy expression into a crisp value. A plot of the predicted and measured penetration rates together with the best fit line and correlation coefficient ( values for testing data set is given in Figure 13.
Table 9
Linguistic labels and corresponding membership functions for the Sarvak formation.
Parameter 
Linguistic term 
Parameters of membership function 

TVD 
Low 
827.5 
995 
1162.5 
Moderate 
995 
1162.5 
1330 

High 
1162.5 
1330 
1497.5 

WOB 
Low 
2.45 
4.6 
11.65 
Moderate 
4.6 
11.65 
18.7 

High 
11.65 
18.7 
25.75 

RPM 
Low 
8.5 
85 
161.5 
Moderate 
85 
161.5 
238 

High 
161.5 
238 
314.5 

Flow rate 
Low 
2718.5 
3312 
3905.5 
Moderate 
3312 
3905.5 
4499 

High 
3905.5 
4499 
5092.5 

Mud weight 
Low 
9.41 
9.41 
9.41 
Moderate 
9.41 
9.41 
9.41 

High 
9.41 
9.41 
9.41 

Pore pressure 
Low 
8.51 
8.51 
8.51 
Moderate 
8.51 
8.51 
8.51 

High 
8.51 
8.51 
8.51 

Bit wear 
Low 
0.101 
0.0161 
0.1332 
Moderate 
0.0161 
0.1332 
0.2503 

High 
0.1332 
0.2503 
0.3674 

Figure 13
Measured versus predicted drilling rate for each formation applying the ANFIS ROP model.
6. Model performance analysis
According to the results obtained by the ANFIS model, this model has the least amount of error compared to Bourgoyne and Young and HarelandRampersad models; its average error is less than 10% in all the studied formations. Therefore, the ANFIS ROP model can be considered as the most appropriate tool to predict the drilling ROP. Figure 14 shows the measured and predicted values of ROP using the three different models evaluated. In addition, the amount of the residual error for each of the developed models is shown in Figure 15. Clearly, the residual errors yielded by the ANFIS ROP model are lower than the ones of the other two widely used ROP models. In other words, the deviation of the predicted values from the measured values is less in ANFIS. Analytical models consider the effect of a limited number of parameters to predict the drilling rate. On the other hand, there is no limitation for the number of input variables in ANFIS, and the effect of any variables on drilling ROP can be considered and included. For this reasons, ANFIS has the best performance in all the studied formations.
It is worth noting that the artificial intelligence systems such as adaptiveneurofuzzyinference system work better than other approaches when a large amount of data exists. In situations where a large number of data records are not available to train such models, conventional mathematical methods are likely to be superior to inference systems. On the other hand, ANFIS can predict ROP accurately within the range of input data, but it is not able to predict the ROP beyond the range of input data; however, mathematical ROP models can predict the ROP in all ranges, and they need less input data compared to ANFIS.
Overall, it can be seen that Bourgoyne and Young model has a better performance than the HarelandRampersad model in most of the formations. Since the effect of different parameters such as pore pressure, mud weight, flow rate, and bit nozzle size are considered in Bourgoyne and Young model, it works better than HarelandRampersad model in the prediction of drilling ROP.
Figure 14
Measured and predicted drilling rates using different approaches.
Figure 15
Residual errors of predicted values by ANFIS, Bourgoyne and Young, and Hareland models.
On the other hand, HarelandRampersad model is more accurate than Bourgoyne and Young model in Upperlimestone and Surmeh formations as can be seen in Table 10. In these formations, the pore pressure, mud weight, and mud flow rate have very little variations, while the uniaxial compressive strength (UCS) of these formations is variable. The HarelandRampersad model considers the effect of UCS, which has a significant impact on the rate of penetration, whereas the Bourgoyne and Young model considers a constant value of drillability for these formations. For this reason, the HarelandRampersad model has a better prediction performance than Bourgoyne and Young model for those two formations. It can be concluded that HarelandRampersad ROP model in formations where the pore pressure, flow rate, and mud weight do not vary significantly is a more appropriate choice compared to the Bourgoyne and Young model; nevertheless, both are inferior to the ANFIS ROP model developed here.
Table 10
Average percentage error for ANFIS, Bourgoyne and Young, and HarelandRampersad models for each formation.
Formation type 
ANFIS model (%) 
Bourgoyne and Young model (%) 
Hareland model (%) 
Asmary 
3.75 
36.91 
43.49 
Ilam 
3.01 
13.93 
17.96 
Sarvak 
10.79 
14.88 
23.97 
Upperlimestone 
0.75 
10.18 
9.73 
Dashtak 
8.01 
9.17 
13.99 
Surmeh 
9.64 
15.15 
12.45 
Kangan 
6.32 
14.44 
16.24 
7. Conclusions
In this study, the adaptive neurofuzzy inference system and two mostwidely used mathematical ROP models (i.e. Bourgoyne and Young and HarelandRampersad models) are used to predict the rate of penetration in the SP gas field. A comparison of the results suggests the following conclusions:
 When a few numbers of input data are available, the ANN cannot be used for the prediction of ROP. ANFIS uses a combination of fuzzy logic and neural network, which helps ANFIS to have a better performance than ANN, especially when a few numbers of input data are available.
 ANFIS is an accurate tool for the prediction of ROP, but the performance of ANFIS depends on various factors such as the number, accuracy, and distribution of input data and the type of membership functions. Since there is no direct way to determine the proper membership functions, trial and error should be used for this purpose. It is worth noting that this method is time consuming, especially when the number of input variables increases, and sometimes it is not possible to construct a reliable model for the prediction of ROP using ANFIS.
 Although mathematical ROP models are less accurate than ANFIS, they need less data to be constructed and always provide a reasonable estimation of ROP. Therefore, it is better to use ANFIS and mathematical ROP models simultaneously for the prediction of drilling rate in a particular formation.
 Overall, Bourgoyne and Young model has better performance than HarelandRampersad model in the prediction of drilling rate.
 In formations where some parameters such as mud weight, mud flow rate, and pore pressure gradients do not change significantly, HarelandRampersad model is a better choice for the prediction of drilling ROP compared to Bourgoyne and Young model.
Acknowledgments
The authors would like to thank National Iranian Drilling Company (NIDC) for supporting this work, for providing the mud logging data, and for permitting the publication of the results of the project.
Nomenclature
ANFIS 
: Adaptive neurofuzzy inference system 
ANN 
: Artificial neural network 
BG 
: IADC bit dull grading 
BY 
: Bourgoyne and Young model 
: Bit wear coefficient 

D 
: True vertical depth (ft.) 
d_{b} 
: Bit diameter (in) 
E 
: Energy function 
ECD 
: Equivalent circulation density (ppg) 
f 
: Bourgoyne and Young model functions 
FIS 
: Fuzzy inference system 
F_{j} 
: Jet impact force (lb_{f}) 
G 
: Constant coefficient of Hareland model 
h 
: Bit wear 
HR 
: Hareland model 
L 
: Length of Markov chain 
N 
: Rotary speed at Bourgoyne and Young model (rpm) 
O 
: Output of each node in ANFIS 
ROP 
: Rate of penetration (ft./hr.) 
RPM 
: Rotary speed (rpm) 
SAA 
: Simulated annealing algorithm 
T_{0} 
: Initial temperature (°C) 
T_{f} 
: Freezing temperature (°C) 
UCS 
: Uniaxial compressive strength of the rock (psi) 
w_{f} 
: Wear function 
WOB 
: Weight on bit (klb) 
X,y 
: Variables 
: Constant coefficients of Bourgoyne and Young model 

: Temperature decrement factor at SAA 

: Hareland model constants 

: Membership function 

: Uniaxial compressive strength of the rock (psi) 