
Müller J.-A., Lemke F. Self-organising Data Mining

  • File format: PDF
  • Size: 4.33 MB
Today, there is an increased need to extract information for decision making from a large collection of data. This transformation of data into knowledge is an interactive and iterative process of various subtasks and decisions, and is called Knowledge Discovery from Data. The central part of Knowledge Discovery is Data Mining.
The key to more sophisticated data mining is to limit user involvement in the process to supplying well-known a priori knowledge, while making the process itself more automated and more objective. Soft computing, i.e., Fuzzy Modelling, Neural Networks, Genetic Algorithms and other methods of automatic model generation, is a way to mine data by generating mathematical models from empirical data more or less automatically. In past years there has been much publicity about the ability of Artificial Neural Networks to learn and to generalize, despite important problems with the design, development and application of Neural Networks:
  • Neural Networks have no explanatory power by default to describe why results are as they are. This means that the knowledge (models) extracted by Neural Networks is still hidden and distributed over the network.
  • There is no systematic approach to designing and developing Neural Networks. It is a trial-and-error process.
  • Training of Neural Networks is a kind of statistical estimation, often using algorithms that are slower and less effective than those used in statistical software.
  • If noise is considerable in a data sample, the generated models systematically tend to be overfitted.
In contrast to Neural Networks, which use Genetic Algorithms as an external procedure to optimize the network architecture and several pruning techniques to counteract overtraining, the new approach described in this book introduces principles of evolution - inheritance, mutation and selection - to generate a network structure systematically, enabling automatic model structure synthesis and model validation. Models are generated from the data as networks of active neurons in an evolutionary fashion: populations of competing models of growing complexity are repeatedly generated, validated and selected until an optimal complex model - not too simple and not too complex - has been created. That is, a treelike network grows out of seed information (input and output variables' data) through pairwise combination and survival-of-the-fittest selection, from a simple single individual (neuron) to a desired final, not overspecialised behavior (model). Neither the number of neurons and layers in the network nor the actual behavior of each created neuron is predefined. All this is adjusted during the process of self-organisation, which is why the approach is called self-organising data mining.
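The layer-by-layer growth described above can be illustrated with a minimal sketch. It is not the book's algorithm or KnowledgeMiner's implementation: the neuron form (a bilinear function of two inputs), the `keep` parameter, the stopping rule and the synthetic data are all assumptions chosen for illustration. Each layer fits one neuron per pair of inputs on a learning set, ranks the neurons by their error on a separate checking set, and passes the outputs of the best few to the next layer. Growth stops when the checking error no longer improves.

```python
import itertools
import numpy as np


def fit_neuron(xi, xj, y):
    """Least-squares fit of y ~ a0 + a1*xi + a2*xj + a3*xi*xj (assumed neuron form)."""
    A = np.column_stack([np.ones_like(xi), xi, xj, xi * xj])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def apply_neuron(coef, xi, xj):
    return coef[0] + coef[1] * xi + coef[2] * xj + coef[3] * xi * xj


def self_organise(X_learn, y_learn, X_check, y_check, keep=4, max_layers=5):
    """Grow layers of pairwise neurons until the checking error stops improving."""
    learn, check = X_learn, X_check
    best_err, best_pred, layers = np.inf, None, 0
    for _ in range(max_layers):
        candidates = []
        # Pairwise combination: one candidate neuron per pair of current inputs.
        for i, j in itertools.combinations(range(learn.shape[1]), 2):
            coef = fit_neuron(learn[:, i], learn[:, j], y_learn)
            out_learn = apply_neuron(coef, learn[:, i], learn[:, j])
            out_check = apply_neuron(coef, check[:, i], check[:, j])
            err = float(np.mean((out_check - y_check) ** 2))
            candidates.append((err, out_learn, out_check))
        if not candidates:
            break  # fewer than two inputs left, nothing to combine
        # Selection: only the neurons with the lowest checking error survive.
        candidates.sort(key=lambda c: c[0])
        survivors = candidates[:keep]
        if survivors[0][0] >= best_err:
            break  # a further layer no longer helps on unseen data
        best_err, best_pred = survivors[0][0], survivors[0][2]
        layers += 1
        # Inheritance: survivor outputs become the inputs of the next layer.
        learn = np.column_stack([c[1] for c in survivors])
        check = np.column_stack([c[2] for c in survivors])
    return best_err, best_pred, layers


# Synthetic data (an assumption, not taken from the book).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)
X_learn, y_learn, X_val, y_val = X[:100], y[:100], X[100:], y[100:]

best_err, best_pred, layers = self_organise(X_learn, y_learn, X_val, y_val)
print(f"layers grown: {layers}, checking MSE: {best_err:.3f}")
```

Because the checking set is used for selection, the reported error is optimistic. A third, untouched sample would be needed for an unbiased estimate.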
Self-organising data mining creates optimal complex models systematically and autonomously by employing both parameter and structure identification. An optimal complex model is a model that optimally balances model quality on a given learning data set ("closeness of fit") and its generalisation power on new, not previously seen data with respect to the data's noise level and the task of modelling. It thus solves the basic problem of experimental systems analysis: systematically avoiding "overfitted" models based on the data's information only. This makes self-organising data mining a highly automated, fast and very efficient supplement and alternative to other data mining methods.
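The balance between closeness of fit and generalisation power can be made concrete with a small, hedged example. Polynomial regression stands in for the book's models here, and the data, split sizes and degree range are assumptions. As the degree grows, the error on the learning set can only fall, while the error on data held back from fitting eventually rises. The degree with the lowest held-back error plays the role of the "optimal complex" model.

```python
import numpy as np

# Synthetic noisy sample (an assumption for illustration only).
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=x.size)

# Learning set for parameter estimation, checking set for model selection.
x_learn, y_learn = x[:40], y[:40]
x_check, y_check = x[40:], y[40:]

degrees = list(range(0, 9))
learn_err, check_err = [], []
for d in degrees:
    coef = np.polyfit(x_learn, y_learn, deg=d)
    learn_err.append(float(np.mean((np.polyval(coef, x_learn) - y_learn) ** 2)))
    check_err.append(float(np.mean((np.polyval(coef, x_check) - y_check) ** 2)))

# Closeness of fit alone would always pick the most complex model;
# the checking error picks the complexity that generalises best.
best_degree = degrees[int(np.argmin(check_err))]
for d, le, ce in zip(degrees, learn_err, check_err):
    print(f"degree {d}: learning MSE {le:.4f}, checking MSE {ce:.4f}")
print("selected degree:", best_degree)
```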
This book provides a thorough introduction to self-organising data mining technologies for business executives, decision makers and specialists involved in developing Executive Information Systems or in modelling, data mining or knowledge discovery projects. It is a book for working professionals in many fields of decision making: economics (banking, financing, marketing), business-oriented computer science, ecology, medicine and biology, sociology, engineering sciences and all other fields that model ill-defined systems.
Each chapter includes some practical examples and a reference list for further reading. The book offers a comprehensive view of all major issues related to self-organising data mining and its practical application to real-world problems. It not only introduces self-organising data mining but also provides clear answers to questions such as:
  • what self-organising data mining is compared with other known data mining techniques,
  • what the pros, cons and difficulties of the main data mining approaches are,
  • what problems can be solved by self-organising data mining, specifically by using the KnowledgeMiner modelling and prediction tool,
  • what the basic methodology for self-organising data mining and application development is, illustrated by a set of real-world business problems,
  • how to handle KnowledgeMiner and how to prepare a problem for solution.