Hyperspace Data Mining Technology

Discover Knowledge using Projective Geometry


The Data Evolution:

Data -> Databases -> Data Patterns -> Data Mining/Data Fusion -> Data Models -> Data …

Related Technologies:

  • Correlation, association
  • Clustering
  • Factor analysis
  • Linear discrimination, logistic regression
  • Trend prediction & forecasting
  • Neural networks
  • Genetic algorithms
  • Fuzzy logic
  • Uncertainty reasoning (Dempster-Shafer, rough sets)
  • Bayesian nets
  • Hyperspace data mining

Wide Applications:

  • Internet (cookies, profiler, shopping cart)
  • Circuit design & optimization (EDA)
  • Traffic prediction/scheduling (wireless nets)
  • Semicon process design & optimization
  • Machine and equipment diagnostics
  • Customer support and warranty forecast
  • Advanced materials/metals design
  • Petrochemical, chemical - chemometrics
  • Biomedical, pharmaceutical
  • Defense and space applications
  • Financial database applications
  • Portfolio/Investment analysis
  • Consumer price/futures prediction
  • Credit/bank/insurance fraud detection
  • Automated fraud explanation
  • Consumer preference analysis/forecast
  • Customer interest profiling
  • Market research & services
  • Stock prediction (hard!)
  • Econometrics

The Common Issue - finding a model that describes the relationships in the data


The Catch-22 Problem:

       Data Pattern <--?--> Data Model

Can you see it? If not, mining in a hyperspace is needed.


Principal Component Analysis (PCA) - Not suited for nonlinear cases:
In nonlinear cases, neither traditional PCA nor Fisher's method generally achieves good separation of the data. In Fig-1 below, the red box is the 2-D space that contains both red (good) and blue (bad) data points. A model built from the data in the red box would not represent the underlying process well, because both good and bad points would be used in building it.

Fig-1 PCA - No separation of data.

Principal Component Analysis (PCA) or Karhunen-Loève Transform:

Projection onto the direction of maximum variance, which is taken as the most separable direction. Good for linear, Gaussian cases without noise. All data are used in building a model.
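To make the limitation concrete, here is a minimal sketch of PCA on 2-D points in plain Python. The data set (a small ellipse of "good" points surrounded by a larger ellipse of "bad" points) and the function names are assumptions made up for illustration; they are not taken from the figures on this page.

```python
import math

def first_principal_axis(points):
    """Return (center, unit axis) of maximum variance for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Orientation of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def project(points, center, axis):
    """Project centered points onto the given unit axis."""
    return [(x - center[0]) * axis[0] + (y - center[1]) * axis[1]
            for x, y in points]

# Hypothetical nonlinear case: good points enclosed by bad points.
angles = [2.0 * math.pi * k / 8 for k in range(8)]
good = [(2.0 * math.cos(a), math.sin(a)) for a in angles]
bad = [(6.0 * math.cos(a), 3.0 * math.sin(a)) for a in angles]

# All data, good and bad, go into the PCA model.
center, axis = first_principal_axis(good + bad)
good_scores = project(good, center, axis)
bad_scores = project(bad, center, axis)

print("principal axis:", axis)
print("good range:", min(good_scores), max(good_scores))
print("bad range: ", min(bad_scores), max(bad_scores))
```

In this made-up case the range of the good scores lies entirely inside the range of the bad scores, so no threshold on the first principal component separates the two groups.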

Fisher's Method:

Projection onto the line that maximizes the distance between clusters relative to the spread within them. The result is similar to that of PCA.

An example is shown in Fig-1.
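As a rough illustration of Fisher's method, the sketch below computes the discriminant direction for two 2-D clusters in plain Python. The two clusters are hypothetical and linearly separable, which is the case where the method works well. The function names are assumptions made for this example.

```python
import math

def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, m):
    """Entries (sxx, sxy, syy) of the scatter matrix around mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction(class_a, class_b):
    """Unit vector w proportional to Sw^-1 (mean_b - mean_a)."""
    ma, mb = mean2(class_a), mean2(class_b)
    axx, axy, ayy = scatter2(class_a, ma)
    bxx, bxy, byy = scatter2(class_b, mb)
    # Within-class scatter matrix Sw = Sa + Sb.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Apply the closed-form inverse of the symmetric 2x2 matrix.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)

# Hypothetical, linearly separable clusters.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]

w = fisher_direction(class_a, class_b)
proj_a = [x * w[0] + y * w[1] for x, y in class_a]
proj_b = [x * w[0] + y * w[1] for x, y in class_b]
print("direction:", w)
print("max of A:", max(proj_a), "min of B:", min(proj_b))
```

Here every projected point of cluster A falls below every projected point of cluster B. When one group surrounds the other, as in the nonlinear case discussed above, no single projection line can do this.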

Fig-2 Good separation by hidden projection of MasterMiner™

MasterMiner™:

Based on a projective geometry (hidden projection) method. It is well suited for nonlinear (or linear), non-Gaussian (or Gaussian) cases with noise. Only a subset of the data is used in building a model for the underlying process. For the same data, its data separability is superior to that of either PCA or Fisher's method.

For comparison with PCA, see the result in Fig-2.

Mathematical model built from the data in the red box generated by MasterMiner:
When good separation is achieved, a mathematical model for the process data can be readily generated from the data points in the red box shown in Fig-2 above. Linear regression is used to model these data points in a sub-space. The accompanying picture shows 4 inequalities that partition the space and 2 linear equations that represent the data model.
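The idea of combining inequalities with a linear equation can be sketched as follows. This is only an illustration under stated assumptions, not MasterMiner's actual procedure. The box bounds, the data points, and the function names are hypothetical, and a single line is fitted rather than the 2 equations mentioned above.

```python
def in_box(point, x_lo, x_hi, y_lo, y_hi):
    """Four inequalities that select a rectangular region of the plane."""
    x, y = point
    return x_lo <= x <= x_hi and y_lo <= y <= y_hi

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: three points follow y = 2x + 1, two lie far away.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (10.0, 0.0), (-5.0, 40.0)]

# Keep only the points inside the (assumed) box 0 <= x <= 4, 0 <= y <= 10.
selected = [p for p in data if in_box(p, 0.0, 4.0, 0.0, 10.0)]
slope, intercept = fit_line(selected)
print("points used:", len(selected))
print("model: y = %.2f * x + %.2f" % (slope, intercept))
```

Fitting only the selected subset keeps the two distant points from distorting the line, which mirrors the point made above about building the model from a subset of the data.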


Last updated November 21, 1998
Copyright © 1997 - 2000, ZAPTRON Systems, Inc.