A proposal for fault diagnosis in industrial systems using bio-inspired strategies

This work presents a study on the application of bio-inspired optimization strategies to Fault Diagnosis in industrial systems. The principal aim is to establish a basis for the development of new and viable model-based Fault Diagnosis methods that mitigate some difficulties the current methods cannot avoid. These difficulties are related mainly to fault sensitivity and robustness to external disturbances. The study considers the Differential Evolution and the Ant Colony Optimization algorithms. The application is illustrated using simulated data of the Two tanks system benchmark. In order to analyze the advantages of these algorithms for obtaining a diagnosis that is sensitive to faults and robust to external disturbances, experiments with incipient faults and noisy data have been simulated. The results indicate that the proposed approach, essentially the combination of the two algorithms, constitutes a promising methodology for Fault Diagnosis.


INTRODUCTION
Fault diagnosis is the process of detection and isolation of a fault (FDI). Faults change the characteristic properties of a system and leave it unable to fulfill the purpose for which it was designed [7]. Taking into account the severe consequences of this situation, many FDI methods have been developed. They should guarantee that faults can be detected and isolated early (sensitivity to faults) while rejecting any false alarm caused by noise, external disturbances or spurious signals (robustness).
The FDI methods are classified into three general groups: those which do not use a model of the process, those which use a qualitative model of the process and those that are based on a quantitative model [25]. The principal drawback of the methods that are not based on a model of the system is the need for a large amount of historical data that reflects the effect of the faults.
The model-based approaches that use a quantitative analytical model are known simply as analytical redundancy methods or model-based methods. They allow a deep insight into the process behavior [9]. This advantage, together with the ongoing development of techniques for mathematical simulation, means that the other methods are considered just an alternative to the analytical model-based ones [5].
The great variety of proposed model-based methods can be reduced to a few basic concepts: the parity space approach, the observer approach, the fault detection filter approach and the parameter identification or estimation approach [4,5,9]. For the final diagnosis these methods are commonly divided into two steps: detection and isolation. For the first three methods the fault detection decision is based on the difference between the estimated state, which is obtained from a mathematical model of the system, and the real input-output data. Parameter estimation is an alternative to the methods based on state estimation: it detects faults by estimating the parameters of the mathematical model [4,9].
Although robust model-based approaches have been developed [1,4,7,9,15], FDI is still considered a problem open to further research [19,20]. The unavoidable process disturbances and the modeling errors increase the complexity of FDI in practical applications, which makes most FDI methods almost unfeasible [19,20], so advanced methods of fault diagnosis are needed [9]. It has been recognized that the practical limitations of many of the cited FDI methods make it necessary to develop new and viable FDI methods [6,19,20]. In that sense, the principal objective of this work is to explore the advantages of applying bio-inspired strategies to the development of robust and sensitive FDI methods via parameter estimation. The selection of the bio-inspired strategies is based on their simple structure, their robustness and the great number of practical applications that have been reported [11-13, 17, 21, 24], which have led to some recent and incipient applications of bio-inspired strategies to FDI problems [26][27][28].
The main contribution of this paper can be summarized as the proposal of a new approach for developing robust and sensitive FDI methods based on bio-inspired strategies and their cooperation, which alleviates some of the disadvantages of the current methods, specifically those based on parameter estimation. The viability of the proposal is established by diagnosing simulation data of the Two tanks system benchmark with the Differential Evolution (DE) and Ant Colony Optimization (ACO) algorithms. The Two tanks system was selected as a first case study because its model is nonlinear and because it has been a benchmark for testing FDI methods. The work also explores the theoretical and computational properties of each algorithm.
The rest of this work is organized as follows. First, the model-based FDI methods via parameter estimation are introduced. After that, the two bio-inspired strategies that form part of this work are presented. Subsequently, the details of the case study, the Two tanks system, and the simulations are described, followed by the FDI proposal, its application to the Two tanks system and the details of the implementations. The Results section presents a set of test experiments performed to evaluate the properties of the two bio-inspired strategies when they are used in FDI methods. Lastly, some concluding comments and remarks are given.

STRUCTURE OF THE MODEL-BASED FDI METHODS VIA PARAMETER ESTIMATION
FDI based on model parameters that are partially or entirely unknown requires on-line parameter estimation methods. If the basic model structure is known, these parameters can be determined with parameter estimation methods by measuring input and output signals.
The two approaches that are commonly used for estimating the parameters in FDI consider a linear model of the system and differ in the function to be minimized [4,10]:
• minimization of the sum of least squares of the equation error
• minimization of the sum of least squares of the output error
The first case is linear in the parameters and therefore allows direct estimation of the parameters (least squares estimates) in non-recursive form.
In this case some improvements of the numerical properties are needed and the use of filters is recommended [9].
For the second case numerical optimization methods are needed. These methods give more precise parameter estimates, but the computational effort is much larger, and on-line real-time application is in general not possible [9].
Parameter estimation can also be applied to nonlinear static process models [8]. In general, the numerical methods have been reported as the more precise under the influence of process disturbances [4], with the disadvantage that their computational effort makes their application to on-line diagnosis almost unfeasible [8].
Let (1) be the process model that represents as closely as possible the physical laws which govern the process behavior [7,9]. The model involves the vector of state variables and the output signals y(t) ∈ ℜ^p, which can be obtained directly from physical sensors; θ_p ∈ ℜ^j and θ_F ∈ ℜ^l are the process parameters vector and the fault parameters vector, respectively, and together they determine the model parameters vector θ ∈ ℜ^(l+j).
The model (1) considers that the influence of the faults is completely represented by the fault parameters vector θ_F ∈ ℜ^l. For that reason, estimates of the vector θ_F allow the system to be diagnosed once the relationship between each component of θ_F and the faults f_k is established.
Considering that the process parameters vector θ_p remains constant, the estimates of the vector θ_F can be obtained from the solution of the minimization problem

$$\min_{\hat{\theta}_F} F(\hat{\theta}_F) = \sum_{t=1}^{I} \left\| y_t(t, \theta_F) - \hat{y}_t(t, \hat{\theta}_F) \right\|^2 \qquad (2)$$

where I is the number of sampling instants, ŷ_t(t, θ̂_F) is the estimated output vector at each instant of time t, obtained from the model (1), and y_t(t, θ_F) is the output vector measured by the sensors at the same instant t [9]. This procedure is represented in Figure 1.
The bio-inspired optimization strategies can be used to solve the optimization problem specified in (2), even in a noisy environment and regardless of whether F(θ̂_F) is linear in the parameters θ_F. The idea behind the application of the bio-inspired strategies is to obtain a robust and sensitive diagnosis of the system, via parameter estimation, with a computational effort that is acceptable for on-line diagnosis.
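As a minimal sketch of the output-error objective in (2), the function below sums the squared differences between measured and estimated outputs over all sampling instants. The names `output_error` and `toy_model` are hypothetical, and the toy model (a single output that decreases linearly with a leak-like parameter) is only an assumption chosen to keep the example small; it is not the process model of this paper.

```python
def output_error(theta_hat, measured, model):
    """Sum over all sampling instants of the squared output error."""
    total = 0.0
    for t, y_t in enumerate(measured):
        y_hat = model(t, theta_hat)  # estimated output vector at instant t
        total += sum((y - yh) ** 2 for y, yh in zip(y_t, y_hat))
    return total


def toy_model(t, theta):
    """Hypothetical one-output model: the output drops linearly with theta[0]."""
    return [10.0 - theta[0] * t]


# "Measurements" generated with a true fault parameter of 0.5 (no noise).
measured = [toy_model(t, [0.5]) for t in range(5)]

print(output_error([0.5], measured, toy_model))  # 0.0 at the true parameter
print(output_error([0.0], measured, toy_model))  # 7.5 when the fault is ignored
```

Any optimizer that only needs function values, such as DE or ACO, can be applied to an objective of this shape without requiring it to be linear in the parameters.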

DIFFERENTIAL EVOLUTION AND ANT COLONY OPTIMIZATION
This section describes the basics of the two algorithms used in our study.

Differential evolution
Differential Evolution (DE) was proposed around 1995 for optimization problems [23]. DE is an improved version of Goldberg's Genetic Algorithm [6] that builds on ideas from Simulated Annealing [10]. Some of the most important advantages of DE are its simple structure, simple computational implementation, speed and robustness [16,23].
Basically, DE is based on the generation, crossover and selection of individuals in a population. DE generates a new solution vector by adding the weighted difference between a pair of population vectors to a certain vector (the number of pairs can be changed) [16]. The general scheme of the algorithm is summarized by the notation DE/X^j/γ/λ, where X^j denotes the vector to be disturbed in iteration j, γ the number of pairs of vectors used for disturbing X^j, and λ the type of crossover to be used. In this case DE/X^{j(best)}/1/bin has been considered. Expressed more formally,

$$X^{j+1} = X^{j(best)} + F_s \left( X^{j(a)} - X^{j(b)} \right) \qquad (3)$$

where X^{j+1}, X^{j(best)}, X^{j(a)}, X^{j(b)} ∈ ℜ^n are members of the population and F_s is the weight applied to the random differential, also called the scaling factor. The crossover operator is applied to each vector component x_n and is governed by the crossover constant 0 ≤ C_R ≤ 1, another control parameter of DE, together with a random number R generated by the distribution λ. In this case the binomial distribution has been considered.
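Finally, the selection operator keeps whichever of the new vector and the vector it competes with has the lower value of the objective function. The key control parameters in DE are the population size Z, the crossover constant C_R and the scaling factor F_s. Simple rules for choosing these parameters are given in [23].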
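The DE/best/1/bin scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' Algorithm 1: it builds one trial vector per iteration (in line with the remark in the Implementation section that DE evaluates the objective function once per iteration), crosses it with a randomly chosen target vector, forces at least one mutated component, and clips the result to the bounds. The function name `de_best_1_bin` and all argument names are hypothetical.

```python
import random


def de_best_1_bin(objective, bounds, pop_size=10, f_s=0.6, c_r=0.9,
                  iterations=100, seed=0):
    """Sketch of DE/best/1/bin with one trial vector per iteration."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(iterations):
        best = pop[min(range(pop_size), key=cost.__getitem__)]
        target = rng.randrange(pop_size)
        a, b = rng.sample([i for i in range(pop_size) if i != target], 2)
        forced = rng.randrange(dim)  # assumption: always mutate one component
        trial = []
        for n, (lo, hi) in enumerate(bounds):
            if rng.random() < c_r or n == forced:
                # mutation: best vector plus weighted difference of a pair
                value = best[n] + f_s * (pop[a][n] - pop[b][n])
            else:
                value = pop[target][n]  # binomial crossover keeps the target
            trial.append(min(max(value, lo), hi))  # assumption: clip to bounds
        trial_cost = objective(trial)  # the only evaluation in this iteration
        if trial_cost <= cost[target]:  # greedy selection
            pop[target], cost[target] = trial, trial_cost
    i_best = min(range(pop_size), key=cost.__getitem__)
    return pop[i_best], cost[i_best]


# Usage on a hypothetical two-parameter problem with its minimum at (0.6, 0).
sphere = lambda x: (x[0] - 0.6) ** 2 + x[1] ** 2
print(de_best_1_bin(sphere, [(0.0, 1.0), (0.0, 1.0)]))
```

Because selection never replaces a vector with a worse one, the best cost in the population cannot increase from one iteration to the next.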

Ant colony optimization
Ant Colony Optimization (ACO) was initially proposed for integer programming problems [3], but recently it has been successfully extended and adapted to continuous optimization problems [18,21,22]. A good point of this algorithm is that its parameters can be tuned toward either exploitation or exploration, which allows an efficient hybridization with other algorithms. ACO is inspired by the behavior of ants seeking a path between their colony and a source of food. This behavior results from the deposit and evaporation of pheromone.
For the continuous case the idea of ACO is to mimic this behavior with simulated ants, which are identified with feasible solutions [18,21]. The first step is to divide the feasible interval of each variable of the problem into k possible values x_n^k. At each iteration a family of Z new ants is generated from the information obtained from the previous ants and a selection mechanism. The information of the previous ants is stored in the cumulative pheromone probability matrix PC (of dimensions n × k, where n is the number of variables in the problem), which is updated at each iteration according to equation (6) from the elements f_ij of the pheromone matrix F; f_ij expresses the pheromone level of the j-th discrete value of the i-th variable. The matrix F is updated at each iteration according to equation (7), based on an evaporation factor C_evap and an incremental factor C_inc. The scheme for generating the new colony of ants considers a parameter q_0 and a family of n random numbers q_1^rand, q_2^rand, ..., q_n^rand for the z-th ant to be generated. Each variable x_n^(z) that will be part of the z-th ant is set by the generation mechanism of equations (9)-(11). The control parameter q_0 controls the level of randomness during ant generation. Together with Z and k, it determines the level of intensification or diversification of ACO [21]. The general scheme of the algorithm is presented in Algorithm 2.
Algorithm 2: Algorithm ACO
1: Set k for each variable.
2: Set C_evap, C_inc and q_0.
3: Generate a random initial ant ⇒ X^(best)
4: Generate a random initial pheromone matrix F with the condition that all f_ij are the same.
5: Calculate PC following equation (6).
6: for j = 0 to j = (Iter_Max-1) do
7:   for i = 1 to Z do
8:     Generate an ant (based on equations (9)-(11)) ⇒ X^{j+1(i)}
9:   end for
10:   Update X^(best).
11:   Update the matrix F (based on equation (7)) and the matrix PC (based on equation (6)).
12:   Verify the stopping criteria
13: end for
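The steps of Algorithm 2 can be sketched in Python as below. Since the exact forms of equations (6)-(11) are not reproduced in this text, the sketch makes its own assumptions: PC is the row-wise normalized cumulative sum of the pheromone, an ant picks the value with the highest pheromone when its random number is below q_0 and otherwise samples by roulette wheel on PC, and the update multiplies every entry by (1 - C_evap) and adds C_inc to the entries of the best ant. The function name `aco_continuous` and its arguments are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def aco_continuous(objective, bounds, k=128, ants=10, c_evap=0.30,
                   c_inc=0.10, q0=0.55, iterations=100, seed=0):
    """Sketch of ACO over a grid of k candidate values per variable."""
    rng = random.Random(seed)
    n = len(bounds)
    grid = [[lo + (hi - lo) * j / (k - 1) for j in range(k)]
            for lo, hi in bounds]
    pher = [[1.0] * k for _ in range(n)]  # equal initial pheromone levels
    best_idx = [rng.randrange(k) for _ in range(n)]  # random initial ant
    best_cost = objective([grid[i][j] for i, j in enumerate(best_idx)])
    for _ in range(iterations):
        # cumulative probability matrix PC (assumed normalized cumulative sum)
        pc = []
        for row in pher:
            total = sum(row)
            pc.append([c / total for c in accumulate(row)])
        for _ in range(ants):
            idx = []
            for i in range(n):
                if rng.random() < q0:
                    # greedy choice: value with the highest pheromone
                    j = max(range(k), key=pher[i].__getitem__)
                else:
                    # roulette-wheel choice on the cumulative probabilities
                    j = min(bisect_left(pc[i], rng.random()), k - 1)
                idx.append(j)
            cost = objective([grid[i][j] for i, j in enumerate(idx)])
            if cost < best_cost:
                best_idx, best_cost = idx, cost
        for i in range(n):
            # assumed update: evaporate everywhere, reinforce the best ant
            pher[i] = [(1.0 - c_evap) * f for f in pher[i]]
            pher[i][best_idx[i]] += c_inc
    return [grid[i][j] for i, j in enumerate(best_idx)], best_cost


# Usage on a hypothetical two-parameter problem with its minimum at (0.6, 0).
sphere = lambda x: (x[0] - 0.6) ** 2 + x[1] ** 2
print(aco_continuous(sphere, [(0.0, 1.0), (0.0, 1.0)], iterations=30))
```

In this sketch each iteration costs one objective evaluation per ant, so 100 iterations with 10 ants use 1000 evaluations.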

CASE OF STUDY: TWO TANKS SYSTEM
In order to demonstrate the efficacy of the proposed FDI approach, the Two tanks benchmark was selected. This section describes the system and the simulations.

Mathematical model of the Two tanks system
The Two tanks system considered in this study is represented in Figure 2. It is a simplified version of the Three tanks system, which was adopted as a benchmark problem for FDI and reconfigurable control in the 1990s [14].
The Two tanks system is formed by two tanks of liquid that can be filled by two similar and independent pumps, one acting on tank 1 and the other on tank 2.
Both tanks have the same cross section S_1 = S_2. The pumps deliver the flow rates q_1 into tank 1 and q_2 into tank 2. The tanks are interconnected through lower pipes. All the pipes have the same cross section S_p. The liquid levels L_1 and L_2 in each tank are the controlled variables and the outputs of the system. They are measured with continuous-valued level sensors. The variables q_1 and q_2, the input signals, are chosen as manipulated variables to control the levels of tank 1 and tank 2.
The system has two additive faults to be detected and isolated:
• Fault 1: Leak in tank 1, an outflow with magnitude q_f1.
• Fault 2: Leak in tank 2, an outflow with magnitude q_f2.
The differential equations that describe the system in the presence of faults are derived from conservation of mass in the Two tanks system and from the application of Torricelli's law, where µ_i are the flow coefficients.
The resulting system of equations is referred to as (16). Table 1 shows the values of the constants considered in the model of the Two tanks system [2]. As a first approach, the leaks at both tanks are assumed not to change in time, and their magnitudes are assumed to be less than or equal to 1 l/s. In other words, the following restrictions on the parameters q_fi, i = 1, 2, have been established:

$$0 \le q_{fi} \le 1\ \text{l/s}, \qquad i = 1, 2 \qquad (17)$$

Simulations of the Two tanks system
The closed loop behavior of the process was simulated under no faults and under different faulty situations. The system of differential equations (16) was solved numerically with the fourth-order Runge-Kutta method. Figure 3 and Figure 4 show two of the simulated situations, in which the set points established to control the liquid levels in tank 1 and tank 2 were 4 m and 3 m, respectively.
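To make the simulation step concrete, the sketch below integrates a two-tank model with the classical fourth-order Runge-Kutta scheme. It is an illustration under explicit assumptions: the balance equations (mass conservation with a Torricelli-type flow between the tanks and out of tank 2) are a common textbook form and may differ from equation (16); the constants are placeholders because the values of Table 1 are not reproduced in this text; flows are in m^3/s; and the run is open loop with constant inflows, whereas the paper simulates the closed loop. All function and argument names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def tank_derivatives(levels, inflows, leaks, area=0.0154, pipe_area=5e-5,
                     mu=(0.5, 0.6)):
    """Assumed model: level derivatives from mass balance and Torricelli flows."""
    l1, l2 = levels
    q1, q2 = inflows
    qf1, qf2 = leaks
    diff = l1 - l2
    # flow from tank 1 to tank 2 (sign follows the level difference)
    q12 = mu[0] * pipe_area * math.copysign(math.sqrt(2 * G * abs(diff)), diff)
    # outflow of tank 2
    q20 = mu[1] * pipe_area * math.sqrt(2 * G * max(l2, 0.0))
    return ((q1 - q12 - qf1) / area, (q2 + q12 - q20 - qf2) / area)


def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + h * ki for xi, ki in zip(x, k3)))
    return tuple(xi + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))


def simulate(levels0, inflows, leaks, steps=100, h=1.0, **model):
    """Return the list of level pairs at each sampling instant."""
    f = lambda x: tank_derivatives(x, inflows, leaks, **model)
    traj = [tuple(levels0)]
    for _ in range(steps):
        traj.append(rk4_step(f, traj[-1], h))
    return traj


# Usage: a leak of 0.6 l/s (6e-4 m^3/s) in tank 1, placeholder inflows.
print(simulate((0.5, 0.4), (1e-3, 5e-4), (6e-4, 0.0), steps=5)[-1])
```

With both flow coefficients set to zero the tanks decouple and the levels change linearly, which gives a simple sanity check for the integrator.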

FDI PROPOSAL ON THE TWO TANKS SYSTEM
Estimates of the parameter vector θ_F = (q_f1, q_f2) allow the system to be diagnosed. The relationship between the components of the fault vector θ_F and the faults is one-to-one. In order to estimate these parameters, the following optimization problem is formulated:

$$\min_{\hat{\theta}_F} F(\hat{\theta}_F) = \sum_{t=1}^{I} \left\| L(t) - \hat{y}(t, \hat{\theta}_F) \right\|^2 \qquad (18)$$

where the components of L ∈ ℜ^2 are the measurements of the liquid levels at the instants of time t, and the components of ŷ ∈ ℜ^2 are the liquid levels of the tanks computed by the model with the fourth-order Runge-Kutta method at the same instants of time t.
In order to diagnose the faults, the minimization problem formulated in (18) was implemented considering DE and ACO.

Implementation
All the implementations were made using MATLAB R2007b. The stopping criterion for both algorithms combined a maximum number of iterations (100 iterations), a maximum number of repetitions of the value of the objective function (15 repetitions) and a threshold on an objective function measure e_F (e_F < 0.01). The function e_F makes comparisons easier. It is interesting to note that the computational effort of ACO is greater than that of DE: in each iteration, the DE algorithm evaluates the objective function only once while ACO makes Z evaluations. The values of the parameters in DE and ACO were selected according to the desired level of intensification and diversification. In the case of ACO, the parameters Z, k and q_0 determine the tendency of the search [21]. In DE the parameters that most influence the type of search are Z and C_R [16,23]. The number of individuals, Z = 10 in both algorithms (five times the number of variables in the minimization problem), has been kept constant.
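The combined stopping criterion can be sketched as a small predicate. This is an assumption-laden illustration: the definition of e_F is not reproduced in this text, so it is passed in as a precomputed value, and "15 repetitions" is read here as the best objective value staying unchanged over the last 15 recorded iterations. The name `should_stop` and its arguments are hypothetical.

```python
def should_stop(history, e_f, max_iter=100, max_repeats=15, e_f_tol=0.01):
    """Return True when any of the three stopping conditions holds.

    history: best objective value recorded at each completed iteration.
    e_f: current value of the measure e_F (its definition is assumed given).
    """
    if len(history) >= max_iter:  # iteration limit reached
        return True
    if e_f < e_f_tol:  # objective function measure small enough
        return True
    recent = history[-max_repeats:]
    # assumed reading: the same value repeated over the last max_repeats iterations
    return len(recent) == max_repeats and len(set(recent)) == 1


print(should_stop([5.0] + [2.0] * 15, e_f=0.5))  # True: value repeated 15 times
print(should_stop([5.0, 4.0, 3.0], e_f=0.5))     # False: still improving
```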
For both algorithms three strategies have been considered: one inclined toward intensification (DE_I, ACO_I), a second inclined toward diversification (DE_D, ACO_D) and a last one with no discernible trend toward diversification or intensification (DE_M, ACO_M).

ACO implementation:
The ACO implementation followed the mechanism described in Algorithm 2: the value of k is the same for both variables, k = 128, and all versions use C_evap = 0.30 and C_inc = 0.10. The versions differ in the value of q_0: q_0 = 0.85 for ACO_I, q_0 = 0.15 for ACO_D and q_0 = 0.55 for ACO_M.

RESULTS
In order to analyze the merit of our proposal, three aspects have been considered: robustness, sensitivity and computational effort. With this goal in mind, the Two tanks system is diagnosed under the faulty cases listed in Table 2. Case 1 considers a fault in only one of the two tanks (tank 1) and case 2 considers the two faults at the same time.
As a first step, noisy data with only 2-5% of noise were considered, and the behavior of the diagnosis via DE and ACO, with their different variants, was analyzed under the faulty situations of Table 2.
Table 2. Faulty situations to be diagnosed in the Two tanks system.

          Case 1           Case 2
Tank 1    q_f1 = 0.6 l/s   q_f1 = 0.2 l/s
Tank 2    q_f2 = 0 l/s     q_f2 = 0.2 l/s

Table 3 and Table 4 show the results. Each algorithm has been tested 30 times for each case of Table 2, with the intent of giving a statistically valid description of the results by means of two descriptive statistics: the arithmetic average and the variance of the estimates of the parameters. These statistics give a measure of central tendency of the estimates of each parameter and a measure of dispersion, respectively. The abbreviations used in the tables are: Alg for algorithm; Mean Eval F(θ̂_F) for the arithmetic average of the number of objective function evaluations; Mean q̂_f1, q̂_f2 and Var q̂_f1, q̂_f2 for the arithmetic average and the variance, respectively, of the estimates of the fault parameters q_f1, q_f2; and Mean t for the arithmetic average of the computing time, in seconds. The computational effort of the algorithm is analyzed based on the number of objective function evaluations and on the time.
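The two descriptive statistics can be computed as in the sketch below. The three estimates are made-up placeholder values used only to show the calculation, not results from the paper, and the use of the sample variance (divisor N - 1) is an assumption, since the text does not say which variance was reported. The name `summarize` is hypothetical.

```python
import statistics


def summarize(estimates):
    """Return (arithmetic mean, sample variance) of repeated estimates."""
    return statistics.fmean(estimates), statistics.variance(estimates)


# Placeholder estimates of q_f1 (l/s) from three hypothetical runs.
mean_qf1, var_qf1 = summarize([0.58, 0.60, 0.62])
print(round(mean_qf1, 6), round(var_qf1, 6))  # 0.6 0.0004
```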
Table 3. Results of the diagnosis obtained for case 1: q_f1 = 0.6 l/s and q_f2 = 0 l/s. Noise data between 2 and 5% error.

Table 3 and Table 4 show that both algorithms, with their variants, detect the faults, but the best performance is given by DE_M and ACO_M; note that the diagnosis of DE_M is more accurate than that of ACO_M. The variants DE_I and ACO_I yield the largest variances of the estimates, which is attributable to the stronger intensification trend of these versions: as a result, the estimates depend strongly on the initial population. DE_D and ACO_D reach, in all cases, the maximum number of iterations, which indicates slow convergence. The computational cost of DE is lower in all cases.

Analysis of robustness
In order to analyze the robustness of the diagnosis with DE and ACO, the faulty situations are diagnosed with noise levels between 15 and 20%.
The results are shown in Table 5 and Table 6. The number of experiments for each algorithm is kept at 30. This time the best mean of the estimates is provided by the versions of ACO, specifically by ACO_D, but its number of function evaluations reaches the maximum (1000), which means that the computational effort is high.
Table 5. Results of the diagnosis obtained for case 1: q_f1 = 0.6 l/s and q_f2 = 0 l/s. Noise data between 15 and 20% error.

Hybrid Strategy between ACO and DE
In order to keep the robustness shown by ACO_D and reduce its computational cost, a hybrid strategy between ACO and DE is proposed, specifically between ACO_D and DE_I.
The results indicate that greater diversification improves the robustness of the algorithm, but convergence is then slower. To mitigate this disadvantage, ACO_D is applied with a smaller value of k, which reduces the number of possible values and therefore the search space. Subsequently, the best ten ants in the history of ACO_D are used as the initial population of DE_I, with the aim of intensifying the search around a promising region. ACO_D is used for no more than 40 iterations, which means no more than 400 evaluations of the objective function, followed by no more than 100 iterations of DE_I. This reduces the maximum number of function evaluations to 500 (half of the best performance of ACO in the robustness study).
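The evaluation budget and the hand-over of the best ants can be illustrated with the short sketch below. It assumes, as stated in the Implementation section, one objective evaluation per DE iteration and one per ant in each ACO iteration; the helper names `max_evaluations` and `best_ants` and the (cost, ant) record format are hypothetical.

```python
import heapq


def max_evaluations(aco_iters, ants, de_iters, de_evals_per_iter=1):
    """Upper bound on objective evaluations for an ACO stage followed by DE."""
    return aco_iters * ants + de_iters * de_evals_per_iter


def best_ants(history, count=10):
    """Pick the `count` lowest-cost (cost, ant) records seen so far."""
    return heapq.nsmallest(count, history, key=lambda record: record[0])


hybrid = max_evaluations(aco_iters=40, ants=10, de_iters=100)
pure_aco = max_evaluations(aco_iters=100, ants=10, de_iters=0)
print(hybrid, pure_aco)  # 500 1000

# Placeholder history of three ants with made-up costs.
history = [(3.0, [0.9, 0.1]), (1.0, [0.6, 0.0]), (2.0, [0.5, 0.2])]
print(best_ants(history, count=2))  # the two cheapest records
```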
This hybrid strategy has been called ACO-DE and it is shown in Algorithm 3. It is expected to keep the robustness of ACO while reducing its computational cost.
Algorithm 3: Algorithm ACO-DE
1: Apply Algorithm 2 with the parameters k = 101 for each variable, C_evap = 0.30, C_inc = 0.10, q_0 = 0.15, Iter_Max = 40
2: Save the best ten ants of step 1 ⇒ Γ^(best)
3: Apply Algorithm 1 with the parameters C_R = 0.90, F_s = 0.6, Iter_Max = 100 and with initial population Γ^(best)

Table 7 and Table 8 present the results of the diagnosis by Algorithm 3. The faulty situations are the same as those considered in the analysis of robustness; the data are kept with noise between 15 and 20%, and 30 repetitions are made. A comparison with Table 5 and Table 6, which show the robustness results of the pure strategies, indicates that the hybrid strategy provides better results.

Analysis of sensitivity
To analyze the balance between sensitivity and robustness of the hybrid strategy, we propose some experiments with case 3 and case 4, which are shown in Table 9. Case 3 describes an incipient fault in tank 1 and case 4 considers that no faults are present. In both cases noisy data with noise between 15 and 20% are considered. The results are shown in Table 10 and Table 11. We have tested 30 runs of each algorithm for each case.
Table 9. Faulty situations that are considered for analyzing sensitivity in the Two tanks system.

Outflow   Case 3            Case 4
Tank 1    q_f1 = 0.05 l/s   q_f1 = 0 l/s
Tank 2    q_f2 = 0 l/s      q_f2 = 0 l/s

Table 10 reveals that the hybrid strategy is sensitive to incipient faults. The algorithm is able to diagnose the incipient fault described in case 3 of Table 9, even in a very noisy environment.
Table 11 shows the best (marked *) and the worst (marked ~) diagnosis obtained with each algorithm when the system is under case 4 and is affected by disturbances, represented by noise on the measurable variables. The ACO-DE strategy reduces the computational effort compared with ACO and provides a more robust diagnosis: the best/worst result for DE_M was 100/100 evaluations of the objective function, for ACO_D it was 1000/1000 evaluations, while for the ACO-DE strategy it was 410/500.

CONCLUSIONS
This study indicates that the application of bio-inspired algorithms and their cooperative use constitutes a promising methodology for the fault diagnosis problem based on parameter estimation, with the advantages that the structure generalizes to nonlinear models of the systems and that the computational effort makes on-line application possible.
The advantages observed in the application of the two algorithms to the FDI problem were: correct and fast diagnosis, simple structure, robustness to disturbances and sufficient sensitivity to incipient faults. In other words, these numerical properties alleviate the main disadvantages of the two groups of parameter estimation-based methods.
The hybrid strategy ACO-DE gives a more robust and more fault-sensitive diagnosis than pure DE or pure ACO. This strategy exploits the greater robustness shown by ACO and the lower computational cost of DE.
The experiments have shown that the parameter q_0, which is one of the parameters that characterize the diversification or intensification in ACO, has an influence on the robustness: the version with the stronger tendency toward diversification is the more robust. This was used for setting the parameters in the hybrid strategy and for combining ACO with a version of DE that guarantees more intensification of the search.
In this sense, a more cooperative strategy between these two nature-inspired algorithms will be studied, considering the influence of the parameters k and Z in a more diversifying version of the ACO algorithm and of the parameters Z and F_s of the DE algorithm, in order to obtain a version of DE that intensifies the search further.
We are interested in extending the applications of the hybrid strategy to more complex systems and in comparing it with other model-based FDI methods. With the aim of obtaining a general scheme, we are also interested in the performance of the two algorithms and the hybrid strategy when faults of a different nature are involved.

Figure 1. Representation of the FDI based on parameter estimation.

Figure 3. Closed loop behavior of the process when no faults are present, noise data 2-5%.
The DE implementation was based on the description made in this paper and on Algorithm 1. The value of the parameter C_R was C_R = 0.90 in DE_I, C_R = 0.10 in DE_D and C_R = 0.55 in DE_M. All the versions have the same mutation mechanism (DE/X^{j(best)}/1/bin) and the same F_s = 0.6. Simple rules for choosing the parameters of DE for any application are given in [23]. The general scheme of the algorithm is presented in Algorithm 1.

Table 1. Values of the constants of the Two tanks system.

Table 4. Results of the diagnosis obtained for case 2: q_f1 = 0.2 l/s and q_f2 = 0.2 l/s. Noise data between 2 and 5% error.

Table 6. Results of the diagnosis obtained for case 2: q_f1 = 0.2 l/s and q_f2 = 0.2 l/s. Noise data between 15 and 20% error.

Table 8. Results of the diagnosis obtained by ACO-DE for case 2: q_f1 = 0.2 l/s and q_f2 = 0.2 l/s. Noise data between 15 and 20% error.

Table 10. Results of the diagnosis obtained by ACO-DE for case 3: q_f1 = 0.05 l/s and q_f2 = 0 l/s. Noise data between 15 and 20% error.

Table 11. Comparison of the diagnosis obtained in thirty runs of case 4: q_f1 = q_f2 = 0 l/s. Noise data between 15 and 20%.