The history of medical diagnosis is said to have begun with Hippocrates, a Greek physician who lived approximately 460-377 B.C. and is credited with laying the foundations of scientific medicine. Over the centuries, great progress has been made in medical research, including the sequencing of the human genome. However, diagnosis is still dominated by theories devised in the early 1900s.
As soon as electronic computers came into use in the fifties and sixties, algorithms were developed that enabled the modeling and analysis of large sets of data. From the very beginning, three major branches of machine learning emerged. Classical work in symbolic learning is described by Hunt et al. (1966), in statistical methods by Nilsson (1965), and in neural networks by Rosenblatt (1962). Over the years all three branches developed advanced methods (Michie et al., 1994): statistical or pattern recognition methods, such as the k-nearest neighbours, discriminant analysis, and Bayesian classifiers; inductive learning of symbolic rules, such as top-down induction of decision trees, decision rules, and induction of logic programs; and artificial neural networks, such as the multilayered feedforward neural network with backpropagation learning, Kohonen's self-organizing network, and Hopfield's associative memory.
The naive Bayesian classifier
I limit the historical overview of statistical methods to the naive Bayesian classifier, which interested me from the very beginning. The algorithm is extremely simple but very powerful, and I later discovered that it can also provide comprehensible explanations, which was confirmed in long discussions with physicians.
I was fascinated by its efficiency and its ability to outperform more advanced and sophisticated algorithms in many medical, and also non-medical, diagnostic problems. For example, when compared with the six algorithms described in Section 3, the naive Bayesian classifier outperformed all of them on five out of eight medical diagnostic problems (Kononenko et al., 1998). Another example is a hard problem in mechanical engineering called mesh design. In one study, sophisticated inductive logic programming algorithms achieved modest classification accuracies between 12 and 29% (Lavrač and Džeroski, 1994; Pompe and Kononenko, 1997), while the naive Bayesian classifier achieved 35%.
The naive Bayesian classifier became for me a benchmark algorithm that has to be tried in any medical domain before any other advanced method. Other researchers have had similar experiences. For example, Spiegelhalter et al. (1993) spent several man-months developing an expert system based on Bayesian belief networks for diagnosing heart disease in newborn babies. The final classification accuracy of the system was 65.5%. When they tried the naive Bayesian classifier, they obtained 67.3%.
The theoretical basis for the successful applications of the naive Bayesian classifier (also called simple Bayes) and its variants was developed by Good (1950; 1964). We demonstrated the efficiency of this approach in medical diagnosis and other applications (Kononenko et al., 1984; Cestnik et al., 1987). Only in the early nineties, however, was the transparency of this approach (in terms of the sum of information gains for or against a given decision) also addressed and shown to be successful in applications to medical diagnosis (Kononenko, 1989; 1993).
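The transparency mentioned above can be illustrated with a minimal sketch (not the implementation from the cited work): for discrete attributes, the log-odds of a class decompose into a prior term plus one information-gain term per attribute value, log2 P(v|c)/P(v), so each attribute's contribution for or against the decision can be shown to the physician. The training data, attribute encoding, and Laplace-style smoothing below are illustrative assumptions.

```python
from collections import defaultdict
import math

def train(examples, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = defaultdict(int)
    value_counts = defaultdict(int)   # key: (class, attribute index, value)
    for x, y in zip(examples, labels):
        class_counts[y] += 1
        for i, v in enumerate(x):
            value_counts[(y, i, v)] += 1
    return class_counts, value_counts

def log_odds_terms(x, c, class_counts, value_counts):
    """Per-attribute contributions log2 P(v|c)/P(v), plus the prior term.

    Positive terms speak in favour of class c, negative ones against it.
    Add-one smoothing avoids zero probabilities (an illustrative choice)."""
    n = sum(class_counts.values())
    n_classes = len(class_counts)
    terms = [("prior", math.log2((class_counts[c] + 1) / (n + n_classes)))]
    for i, v in enumerate(x):
        p_v_given_c = (value_counts[(c, i, v)] + 1) / (class_counts[c] + 2)
        count_v = sum(value_counts[(cl, i, v)] for cl in class_counts)
        p_v = (count_v + 1) / (n + 2)
        terms.append((f"attr{i}={v}", math.log2(p_v_given_c / p_v)))
    return terms

def classify(x, class_counts, value_counts):
    """Pick the class whose prior plus summed information gains is largest."""
    scores = {c: sum(t for _, t in log_odds_terms(x, c, class_counts, value_counts))
              for c in class_counts}
    return max(scores, key=scores.get)

# Toy data: two binary attributes, two classes (purely hypothetical values)
X = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 0)]
y = ["sick", "sick", "healthy", "healthy", "sick"]
cc, vc = train(X, y)
print(classify((1, 0), cc, vc))            # prints "sick"
for name, t in log_odds_terms((1, 0), "sick", cc, vc):
    print(f"{name}: {t:+.3f} bits")
```

Printing the per-attribute terms in bits is what makes the decision explainable: the physician sees which findings argued for the diagnosis and by how much, rather than a single opaque score.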