On Data Mining and Classification Using a
Bayesian Confidence Propagation Neural Network
(only summary pdf),
(thesis cover pdf)
Author: Roland Orre
Abstract
The aim of this thesis is to describe how a statistically based neural
network technology, here named BCPNN (Bayesian Confidence Propagation
Neural Network), which may be identified by rewriting Bayes' rule,
can be used in a few applications: data mining and classification
with credibility intervals, as well as unsupervised pattern
recognition.
BCPNN is a neural network model somewhat reminiscent of Bayesian
decision trees, which are often used within artificial intelligence
systems. It has previously been applied successfully to
classification tasks such as fault diagnosis, supervised
pattern recognition, and
hierarchical clustering, and has also been used as a model for cortical memory.
The learning paradigm used in BCPNN is rather different from
many other neural network architectures. The learning in, e.g.,
the popular backpropagation (BP) network, is a gradient
method on an error surface, but learning in BCPNN is based
upon calculations of marginal and joint probabilities
between attributes. This is quite a time-efficient
process compared to, for instance, gradient learning.
The interpretation of the weight values in BCPNN is also
easy compared to many other network architectures.
The values of these weights and their uncertainty are also what
we are focusing on in our data mining application.
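As a rough illustration of learning from marginal and joint probabilities, the sketch below estimates weights and biases by counting. It assumes the form commonly given for BCPNN in the literature, a weight log(P(x,y) / (P(x)P(y))) and a bias log P(y); this abstract does not state the formulas, so that form is an assumption here. The function name `bcpnn_weights` and the data layout are hypothetical, and the thesis's treatment of uncertainty and sparse counts is not reproduced.

```python
import math
from collections import Counter


def bcpnn_weights(samples):
    """Estimate weights and biases from (x, y) pairs by counting.

    Assumed form (not taken from this abstract):
        weight[(x, y)] = log(P(x, y) / (P(x) * P(y)))
        bias[y]        = log(P(y))
    Only pairs that actually occur get a weight, so no log(0) arises.
    """
    n = len(samples)
    x_counts = Counter(x for x, _ in samples)   # marginal counts of inputs
    y_counts = Counter(y for _, y in samples)   # marginal counts of outputs
    xy_counts = Counter(samples)                # joint counts

    weights = {}
    for (x, y), c_xy in xy_counts.items():
        p_xy = c_xy / n
        p_x = x_counts[x] / n
        p_y = y_counts[y] / n
        weights[(x, y)] = math.log(p_xy / (p_x * p_y))

    biases = {y: math.log(c / n) for y, c in y_counts.items()}
    return weights, biases


if __name__ == "__main__":
    data = [("a", "pos"), ("a", "pos"), ("b", "neg"), ("b", "neg")]
    w, b = bcpnn_weights(data)
    print(w)
    print(b)
```

A single pass over the data suffices, which is where the time efficiency relative to iterative gradient learning comes from. Under this assumed form, a weight of zero means the two attribute values look statistically independent in the sample, a positive weight means they co-occur more often than independence would predict, and a negative weight means less often.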
The most important results and findings in this thesis can be
summarised in the following points:
- We demonstrate how BCPNN (Bayesian Confidence Propagation
Neural Network) can be extended to model
the uncertainties in collected statistics to produce outcomes
as distributions from two different aspects: uncertainties induced
by sparse sampling, which is useful for data mining; uncertainties
due to input data distributions, which is useful for process modelling.
- We indicate how classification with BCPNN gives higher certainty
than an optimal Bayes classifier and better precision than a naive
Bayes classifier for limited data sets.
- We show how these techniques have been turned into a useful tool
for real world applications within the drug safety area in particular.
- We present a simple but working method for doing automatic
temporal segmentation of data sequences as well as indicate some
aspects of temporal tasks for which a Bayesian neural network may be
useful.
- We present a method, based on recurrent BCPNN, which performs a
task similar to unsupervised clustering on a large database
with noisy, incomplete data,
but much more quickly, and with an efficiency in finding patterns
comparable to that of a well-known Bayesian clustering method (Autoclass)
when we compare their performance on artificial data sets.
Apart from BCPNN being able to deal with
very large data sets, because it is a global method working on
collective statistics, we also get good indications that the outcome
from BCPNN has higher clinical relevance
than that from Autoclass in our application
on the WHO database of adverse drug reactions, and that BCPNN is therefore
a relevant data mining tool to use on that database.
Roland Orre
Addendums
(Python code for the Monty Hall problem, page 4)
Special thanks to:
Dean S Horak, who wrote a simulation in Java, and
Curt Welch, who wrote a simulator in Python, which inspired me to try on my own using the set type in Python.
Thanks also to Dean S Horak for adding the code to the
pythonfiddle.
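For readers who want a feel for what such a simulation looks like, here is a minimal sketch of a Monty Hall simulator built on Python's set type. It is not the code referred to above; the function names `play` and `win_rate`, the trial count, and the seed are all illustrative assumptions.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = {1, 2, 3}
    car = rng.choice(sorted(doors))
    pick = rng.choice(sorted(doors))
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice(sorted(doors - {pick, car}))
    if switch:
        # Exactly one door remains after removing the pick and the opened door.
        pick = (doors - {pick, opened}).pop()
    return pick == car


def win_rate(switch, trials=10000, seed=0):
    """Fraction of rounds won with the given strategy (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Set difference expresses the host's constraint directly: the doors he may open are all doors minus the contestant's pick and the car. Over many trials the switching strategy should win close to two thirds of the time and staying close to one third.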
Last modified: Thu Sep 22 16:10:00 CEST 2016