On Data Mining and Classification Using a Bayesian Confidence Propagation Neural Network

Author: Roland Orre

Abstract

The aim of this thesis is to describe how a statistically based neural network technology, here named BCPNN (Bayesian Confidence Propagation Neural Network), which may be identified by rewriting Bayes' rule, can be used in a few applications: data mining and classification with credibility intervals, as well as unsupervised pattern recognition.

BCPNN is a neural network model somewhat reminiscent of Bayesian decision trees, which are often used within artificial intelligence systems. It has previously been applied successfully to classification tasks such as fault diagnosis, supervised pattern recognition and hierarchical clustering, and has also been used as a model for cortical memory. The learning paradigm used in BCPNN is rather different from that of many other neural network architectures. Learning in, e.g., the popular backpropagation (BP) network is a gradient method on an error surface, whereas learning in BCPNN is based upon calculations of marginal and joint probabilities between attributes. This is a quite time-efficient process compared to, for instance, gradient learning. The weight values in BCPNN are also easy to interpret compared to those of many other network architectures. The values of these weights and their uncertainty are also what we focus on in our data mining application. The most important results and findings in this thesis can be summarised in the following points:

We demonstrate how BCPNN can be extended to model the uncertainties in collected statistics and produce outcomes as distributions, from two different aspects: uncertainties induced by sparse sampling, which is useful for data mining, and uncertainties due to input data distributions, which is useful for process modelling.
We indicate how classification with BCPNN gives higher certainty than an optimal Bayes classifier and better precision than a naive Bayes classifier for limited data sets.
We show how these techniques have been turned into a useful tool for real world applications within the drug safety area in particular.
We present a simple but working method for doing automatic temporal segmentation of data sequences as well as indicate some aspects of temporal tasks for which a Bayesian neural network may be useful.
We present a method, based on recurrent BCPNN, which performs a task similar to unsupervised clustering on a large database with noisy, incomplete data, but much more quickly. Its efficiency in finding patterns is comparable to that of a well-known Bayesian clustering method (Autoclass) when we compare their performance on artificial data sets. Because BCPNN is a global method working on collective statistics, it can deal with really large data sets. We also get good indications that the outcome from BCPNN has higher clinical relevance than that of Autoclass in our application to the WHO database of adverse drug reactions, which makes it a relevant data mining tool for that database.
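
As a rough illustration of learning from marginal and joint probabilities rather than by gradient descent, the sketch below estimates one bias and one weight from simple counts. It assumes the commonly cited BCPNN form, bias = log P(y) and weight = log(P(x,y) / (P(x) P(y))), which is not spelled out in this abstract. The function names and the toy data are hypothetical, and no smoothing or uncertainty estimate is included.

```python
import math


def estimate_bias_and_weight(pairs):
    """Estimate (bias, weight) for one input x and one output y.

    `pairs` is a list of (x, y) boolean observations. Assumed form:
    bias = log P(y), weight = log(P(x, y) / (P(x) * P(y))).
    Every count must be non-zero; no smoothing is applied in this sketch.
    """
    n = len(pairs)
    p_x = sum(1 for x, _ in pairs if x) / n
    p_y = sum(1 for _, y in pairs if y) / n
    p_xy = sum(1 for x, y in pairs if x and y) / n
    bias = math.log(p_y)
    weight = math.log(p_xy / (p_x * p_y))
    return bias, weight


def posterior_given_x(pairs):
    """P(y | x) recovered as exp(bias + weight), i.e. Bayes' rule in log form."""
    bias, weight = estimate_bias_and_weight(pairs)
    return math.exp(bias + weight)


# Toy data: x and y co-occur more often than independence would predict.
observations = (
    [(True, True)] * 3
    + [(True, False)]
    + [(False, True)]
    + [(False, False)] * 3
)

bias, weight = estimate_bias_and_weight(observations)
print("bias:", bias)
print("weight:", weight)
print("P(y | x):", posterior_given_x(observations))
```

A positive weight indicates that x and y occur together more often than expected under independence, a negative weight less often, and a weight of zero corresponds to independence. This is one reason the weights are easy to interpret, and a single pass over the counts suffices to compute them.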

Roland Orre

Addendums

(Python code for the Monty Hall problem, page 4)
Special thanks to:
Dean S Horak, who wrote a simulation in Java, and
Curt Welch, who wrote a simulator in Python, which inspired me to try on my own but using the set type in Python.
Thanks also to Dean S Horak for adding the code to the pythonfiddle.
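
For readers without access to the linked code, here is a minimal sketch of how a Monty Hall simulation can be written with Python's set type. It is not the code referred to above; the function names, the number of trials and the seed are all assumptions made for illustration.

```python
import random

DOORS = {1, 2, 3}


def play(switch, rng):
    """Play one round; return True if the final pick hides the car."""
    car = rng.choice(sorted(DOORS))
    pick = rng.choice(sorted(DOORS))
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice(sorted(DOORS - {pick, car}))
    if switch:
        # Exactly one door remains after removing the pick and the opened door.
        (pick,) = DOORS - {pick, opened}
    return pick == car


def win_rate(switch, trials=10000, seed=0):
    """Fraction of rounds won under the given strategy (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

Set difference expresses the host's constraint and the switch directly: the doors the host may open are all doors minus the pick and the car, and the door switched to is all doors minus the pick and the opened door. The simulated win rates should come out close to 2/3 for switching and 1/3 for staying.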

Last modified: Thu Sep 22 16:10:00 CEST 2016