Why Not Use Neural Networks Alone?


Over-fitting Problem with ANN

Problem Background:

Artificial neural networks (ANNs) have strong data-fitting capability. In industrial environments, however, data collection is imperfect: at outdoor work sites, data points are often missed because of operator oversight or harsh conditions. The number of available data samples is always finite, and when it is also small, a large number of ANNs with different structures will fit the training samples equally well. This is because infinitely many nonlinear functions pass through any finite set of sample points, yet perhaps only a few of them are the "true" mathematical model of the practical problem. The dilemma is: which is the true model? The particular difficulty is that no method can assure or validate that the result of an ANN is the true mathematical model. This is why an ANN cannot avoid overfitting and cannot be used alone. Other methods must be used in combination with ANNs to offer an effective solution to data mining.
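
The point that many different functions fit the same finite sample can be illustrated with a minimal sketch. The data and both models below are invented for illustration and are not taken from any ZAPTRON data set; plain functions stand in for trained networks. Both models reproduce every training sample exactly, yet they disagree as soon as they are evaluated away from those samples.

```python
import numpy as np

# Synthetic training samples (illustrative only, not from the source).
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = 2.0 * x_train + 1.0  # assumed "true" model: a straight line

def model_a(x):
    """The simple model that generated the samples."""
    return 2.0 * np.asarray(x, dtype=float) + 1.0

def model_b(x, scale=0.5):
    """A different nonlinear model that fits the same samples.

    The extra term is a product of (x - x_i) over all training inputs,
    so it is zero at every training point and nonzero everywhere else.
    """
    x = np.asarray(x, dtype=float)
    bump = np.prod([x - xi for xi in x_train], axis=0)
    return 2.0 * x + 1.0 + scale * bump

def max_train_error(model):
    """Largest absolute error of `model` on the training samples."""
    return float(np.max(np.abs(model(x_train) - y_train)))

print("train error A:", max_train_error(model_a))
print("train error B:", max_train_error(model_b))
print("A and B at x = 5:", float(model_a(5.0)), float(model_b(5.0)))
```

Changing `scale` gives a whole family of such models, all with zero training error, which is why the training samples alone cannot decide which one is the true model.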

Cause Analysis:

Industrial data are often very noisy, and the distribution of data points is usually non-uniform. In these cases, overfitting is more serious.
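
The effect of noise can be sketched as follows. This is an illustration under stated assumptions, not an ANN experiment: a degree-9 polynomial stands in for a highly flexible model, the underlying process is assumed to be linear, and the "noise" is a deterministic alternating offset chosen so that the result is reproducible. The flexible model reproduces the noisy samples almost exactly but does worse than the simple model on noise-free points between them.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic data (invented for illustration): a linear process plus
# deterministic alternating "noise" of amplitude 0.3.
x_train = np.linspace(0.0, 1.0, 10)
noise = 0.3 * (-1.0) ** np.arange(10)
y_train = 2.0 * x_train + noise

# Held-out points halfway between the training points, without noise.
x_test = (x_train[:-1] + x_train[1:]) / 2.0
y_test = 2.0 * x_test

def rmse(model, x, y):
    """Root-mean-square error of a callable model on (x, y)."""
    return float(np.sqrt(np.mean((model(x) - y) ** 2)))

simple = Polynomial.fit(x_train, y_train, deg=1)    # low-capacity model
flexible = Polynomial.fit(x_train, y_train, deg=9)  # interpolates all 10 noisy samples

for name, m in [("degree 1", simple), ("degree 9", flexible)]:
    print(name,
          "train RMSE:", rmse(m, x_train, y_train),
          "held-out RMSE:", rmse(m, x_test, y_test))
```

A low training error is therefore no evidence that the model is right; here the model with the smaller training error is the one that has fitted the noise.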

For example, in the aluminum production case (the zz72.dat data file), the ANN predicts wrongly in the region of high a4 and low a1, where very few data points actually fall. Adding a few training points in this region and assigning them to class "2" would prevent the wrong prediction. In real production, however, very few data points meet these conditions (high a4 and low a1), because the rules of real-world operation do not allow them.
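
One simple way to notice that a prediction falls in a sparsely sampled region is to count the training points near the query before trusting the model there. The sketch below is an assumption-laden illustration: the data are synthetic (the contents of zz72.dat are not reproduced here), the names `support_count`, `dense_query`, and `sparse_query` are hypothetical, and the neighbor count is a generic check, not the method used by MasterMiner.

```python
import numpy as np

# Synthetic operating data (invented for illustration): a4 tracks a1 closely,
# so the corner "low a1, high a4" is essentially never visited.
a1 = np.linspace(0.0, 1.0, 200)
a4 = a1 + 0.05 * np.sin(40.0 * a1)
X_train = np.column_stack([a1, a4])

def support_count(X, query, radius):
    """Count training points within `radius` (Euclidean) of `query`."""
    dist = np.linalg.norm(X - np.asarray(query, dtype=float), axis=1)
    return int(np.sum(dist <= radius))

dense_query = (0.5, 0.5)   # inside the usual operating range
sparse_query = (0.1, 0.9)  # low a1, high a4

print("dense region: ", support_count(X_train, dense_query, radius=0.2))
print("sparse region:", support_count(X_train, sparse_query, radius=0.2))
```

A prediction at a query with little or no nearby training data is an extrapolation, and it should be treated with correspondingly less confidence.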

Dangerous Extrapolation by ANN

For comparison, real-world data from an aluminum production plant were processed, and two models of the underlying process were developed, one by ANN and the other by MasterMiner. In the picture shown below, the red region is the optimal operating zone calculated by ANN, whereas the blue region is calculated by the MREC algorithm of MasterMiner. Production data have verified that the red region calculated by ANN is wrong and that extrapolation based on ANN could damage production. In contrast, MasterMiner offers an effective and reliable approach that can point out the best direction for extrapolation, with significant benefit to users.

[Figure wpe11.jpg: optimal operating zone calculated by ANN (red) and by the MREC algorithm of MasterMiner (blue)]

ANNs Do Not Know Local Views (Statistical Interactions)

In practical industrial applications, the process under study is rather complicated. In particular, a change in a system parameter may lead to a change in the physical process or chemical reaction. The range of a parameter divides into several intervals, in each of which the system or process may have a different operating mechanism (called the local structure of the system). For example, in a chemical process, when the carbon (C) content is below 0.25%, the process is controlled by chemical kinetics, but when C is above 0.25%, it is controlled by diffusion. In these two cases, the data exhibit different patterns, as shown by the picture below.

Such a change in data pattern cannot be discovered by ANNs; see the left-hand side of the picture, where the data are not separated. MasterMiner, by contrast, can easily find the two different patterns using the hidden projection method. One pattern is displayed on the right-hand side of the picture.

[Figure wpe12.jpg: left, data not separated; right, one of the two patterns found by MasterMiner]
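
The idea of local structure can be sketched with a toy example. Only the 0.25% carbon threshold comes from the text above; the data, slopes, and intercepts are invented, a straight-line fit stands in for the model, and splitting at a known threshold is a simplification that is not the hidden projection method. A single global fit averages the two mechanisms together, while one fit per interval recovers each pattern.

```python
import numpy as np

THRESHOLD = 0.25  # carbon content (%) at which the mechanism changes, per the text

# Synthetic, noise-free data; slopes and intercepts are invented for illustration.
c = np.linspace(0.05, 0.45, 41)
y = np.where(c < THRESHOLD, 10.0 * c + 1.0, -4.0 * c + 4.5)

def linear_fit_rmse(x, t):
    """Fit t ~ a*x + b by least squares and return the RMS residual."""
    a, b = np.polyfit(x, t, 1)
    return float(np.sqrt(np.mean((a * x + b - t) ** 2)))

# One model for the whole range versus one model per interval.
global_rmse = linear_fit_rmse(c, y)
low = c < THRESHOLD
local_rmse = max(linear_fit_rmse(c[low], y[low]),
                 linear_fit_rmse(c[~low], y[~low]))

print("global fit RMSE:    ", global_rmse)
print("per-interval RMSE:  ", local_rmse)
```

In practice the threshold is not known in advance, which is the harder part of the problem: the intervals themselves have to be discovered from the data.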

 

 Copyright © 1997 - 2000, ZAPTRON Systems, Inc.