Data Mining Techniques in Processing Medical Knowledge
Luminita STATE
Catalina COCIANU
Keywords
data mining,
principal component analysis,
fuzzy clustering,
c-means algorithm,
supervised learning,
cluster analysis
Abstract
Data mining is an evolving and growing area of research
and development in both academia and industry. It involves
interdisciplinary work that spans diverse domains.
In this age of multimedia data exploration, data mining should no longer
be restricted to extracting knowledge from large volumes of high-dimensional
data held in traditional databases. The aim of the paper is to
develop a new PCA-based learning-by-examples algorithm that extracts
skeleton information from data so as to ensure both good recognition performance
and good generalization capability on large data sets. The classes
are represented in the measurement/feature space by continuous distributions;
that is, the model is given by a family of density functions, one for each
h in H, where H stands for the finite set of hypotheses (classes). The basis of the
learning process is a set of samples, possibly of different sizes,
drawn from the considered classes. The skeleton of each class is given
by the principal components obtained for the corresponding sample.
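
As an illustration of the idea of a class skeleton, the following minimal sketch computes, for each class sample, the sample mean and the leading principal directions of the sample covariance matrix. It is not the algorithm developed in the paper. The function names (class_skeleton, residual, classify), the number of retained components, the synthetic data, and in particular the decision rule (assign a new point to the class whose skeleton leaves the smallest reconstruction residual) are all assumptions introduced here for illustration only.

```python
import numpy as np


def class_skeleton(sample, n_components):
    """Return (mean, components) for one class sample of shape (n, d).

    components has shape (n_components, d); its rows are orthonormal
    principal directions sorted by decreasing sample variance.
    """
    sample = np.asarray(sample, dtype=float)
    mean = sample.mean(axis=0)
    centered = sample - mean
    # Unbiased sample covariance matrix, shape (d, d).
    cov = centered.T @ centered / (len(sample) - 1)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order].T


def residual(x, skeleton):
    """Distance from x to the affine subspace defined by a class skeleton."""
    mean, components = skeleton
    centered = np.asarray(x, dtype=float) - mean
    projection = components.T @ (components @ centered)
    return float(np.linalg.norm(centered - projection))


def classify(x, skeletons):
    """Hypothetical rule (an assumption): smallest residual wins."""
    return min(skeletons, key=lambda h: residual(x, skeletons[h]))


# Synthetic samples of different sizes for two hypothetical classes:
# "h1" lies near the x-axis, "h2" near the vertical line x = 5.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
samples = {
    "h1": t[:120] * np.array([1.0, 0.0]) + 0.05 * rng.normal(size=(120, 2)),
    "h2": t[:80] * np.array([0.0, 1.0]) + np.array([5.0, 0.0])
    + 0.05 * rng.normal(size=(80, 2)),
}
skeletons = {h: class_skeleton(s, n_components=1) for h, s in samples.items()}

print(classify([2.0, 0.0], skeletons))
print(classify([5.0, 3.0], skeletons))
```

Each class is summarized by a mean vector and a few direction vectors, so the cost of storing and using a skeleton does not grow with the sample size. This is one way to read the abstract's emphasis on large data sets.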