Diagnosing skin diseases using an artificial neural network

The development of medical expert systems that use artificial neural networks as their knowledge bases appears to be a promising method for predicting diagnoses and possible treatment routines. This paper deals with the construction and training of an artificial neural network for Skin Disease Diagnosis (SDD) based on patients' symptoms and causative organisms. The artificial neural network, constructed using a feed-forward architectural design, is shown to be capable of diagnosing selected skin diseases found in tropical areas such as Nigeria with 90 percent accuracy. The work may in the future serve as a knowledge base for an expert system specializing in medical diagnosis, testing evaluation, treatment evaluation, and treatment effectiveness. It is the first component of a much larger system that will help physicians order tests and treatments reasonably and minimize unnecessary laboratory routines while reducing operational costs.


Introduction
The field of Artificial Neural Networks (ANNs), also called Neurocomputing or Connectionist Theory (CT), was born out of research conducted by McCulloch and Pitts in 1943 on a simple model of a neuron. Following this were further attempts to generate knowledge from the study of the human nervous system and to apply such knowledge to the design and implementation of systems that mimic the way human beings process information. Most of the early works in ANNs attempted to simulate the human nervous system. An ANN is an information or signal processing system composed of a large number of simple processing elements, called artificial neurons or nodes, that are interconnected by direct links, called connections, and that cooperate to perform parallel distributed processing (PDP) in order to solve a given problem (Haykin, 1994). A major feature of ANNs is their ability to adapt to changes in the environment by changing their connection strengths or structure. This feature enables neural network systems to learn automatically. ANNs in their present form are a grossly simplified model of the human brain, because the operations involved in the behavior of the brain are much more complex than the model proposed in the ANN concepts. It is more reasonable to compare ANN capabilities to the simpler nervous systems found in primitive animals, like insects, which have the ability to adapt themselves to a complex environment. A simplified schematic diagram of a biological neuron is shown in fig. 1. The components of the biological neuron that are of importance in neural network simulation studies are discussed as follows:
i. Soma or Cell Body: This is the large round central body of the neuron in which all the logical functions are realized. The cell body contains the genetic and metabolic machinery necessary to keep the neuron alive. The neuron soma also contains the nucleus and the protein-synthesis machinery.
ii. Axon: This is the nerve fibre attached to the soma, and it can serve as the final output channel of the neuron. An axon is usually highly branched. The initial segment of the axon is called the axon hillock. Here the signals are converted into a sequence of nerve pulses (spikes), which are propagated without attenuation along the axon to the target cells (i.e. other neurons, receptors and muscles).
iii. Dendrites: The dendrites (inputs) represent a highly branching tree of fibres. These long, irregularly shaped nerve fibres are attached to the soma. There are about 10^3 to 10^4 dendrites per neuron. Dendrites connect a neuron to a set of other neurons. Dendrites either receive inputs from other neurons via specialized contacts called synapses or connect other dendrites to the synaptic outputs. Dendrites are regarded as providing a receptive surface for input signals to the neuron and as conducting signals with decrement to the cell body and the axon hillock.
iv. Synapses: Synapses are specialized contacts on a neuron, which are the termination points for the axons from other neurons. Synapses play the role of interfaces connecting the axons of some neurons to the spines of the input dendrites. Synapses can also be found on the cell body. Synapses are capable of changing a dendrite's local potential in a positive or negative direction. Owing to this function, synapses can be of an excitatory or inhibitory nature, in accordance with their ability to increase or to damp the neuron excitation. The storing of information in a neuron is supposed to be concentrated in its synaptic connections or, more precisely, in the pattern of these connections and the strengths (weights) of the synaptic connections. Human synapses are mainly of a complex chemical nature, while synapses in the nervous systems of primitive animals such as insects are predominantly based on electrical signal transmission.
www.intechopen.com Artificial Neural Networks - Methodological Advances and Biomedical Applications
along an axon to the synaptic connections of other neurons. The output pulse rate (impulse density) depends on both the strength of the input signals and the weight (strength) of the corresponding synaptic connections. The maximum firing rate is around 1000 pulses per second. The input signals at an excitatory synapse increase the pulse rate, while the input signals at an inhibitory synapse reduce the pulse rate or even block the output signal. A neuron operates in a mixed digital and analogue form. The neuron output signal is proportional, in some range, to a linear combination of the neuron's input signal values. Information between neurons is transmitted in the form of nerve impulses, which can be considered as digital signals. However, the encoded information has the form of the pulse density, which is an analogue signal. Each pulse (or nerve impulse) arriving at the synapse generates an analogue internal potential in some proportion to the synaptic strength, which has either a positive or a negative value corresponding to an excitatory or an inhibitory synapse. These potentials are summed in a spatial-temporal way, and when the total potential exceeds some value called the threshold, a train of pulses is generated and travels along the axon. The signals are changed when they travel along the connections: they are combined, usually by multiplication, with the connection weights. A neuron gathers the input from all incoming connections to compute its activation value. In 1949 Hebb hypothesized about learning in networks of artificial neurons. The Hebb rule encodes the correlations of activations of connected units in the weights. A weight is increased if the two neurons that it connects are active at the same time. The weight is decreased if only one of the two connected neurons is active (Hebb, 1949). The structure of a neuron model is illustrated in figure 2.
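The threshold behaviour and the Hebb rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the model used in this work: the function names, the learning rate of 0.1 and the example weights are assumptions introduced only for the sketch.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 (fire) when the summed weighted input exceeds the threshold, else 0."""
    potential = sum(x * w for x, w in zip(inputs, weights))
    return 1 if potential > threshold else 0


def hebb_update(weight, pre, post, rate=0.1):
    """Hebb-style weight change for one connection (rate is an assumed value).

    The weight grows when both connected units are active, shrinks when
    exactly one of them is active, and is left unchanged when both are silent.
    """
    if pre == 1 and post == 1:
        return weight + rate
    if pre != post:
        return weight - rate
    return weight


# Hypothetical example: two active excitatory inputs and one silent input.
fired = neuron_output([1, 0, 1], [0.5, 0.9, 0.4], threshold=0.8)
new_weight = hebb_update(0.5, pre=1, post=fired)
```

In this sketch the excitatory or inhibitory nature of a synapse corresponds to the sign of its weight.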
The units of most types of neural networks are organized in layers, which are called the input layer, hidden layer(s) and output layer, depending on their functionality. A neural network with n units in its input layer and m units in its output layer implements a mapping $f: X^n \to Y^m$. The sets $X^n \subseteq \mathbb{R}^n$ and $Y^m \subseteq \mathbb{R}^m$ are the input and output domains. Hidden layers add to the complexity of a neural network and are important if arbitrary mappings have to be represented. Neural networks differ from one another in the way the nodes are interconnected and in the rules they use to produce a given output signal for different sets of input signals. Some networks are fully interconnected grids, with each processor directly connected to every other processor; others embody treelike or layered architectures. The capabilities of ANNs that are considered especially attractive include (DeCegama, 1989):
i. Correct memory data retrieval, even if some individual processors (neurons) fail.
ii. Retrieval of the closest matching data if there is no exact match to the requested information.
iii. The ability to retrieve original inputs from a degraded version, to connect one item with another, or to connect an item to a set of categories.
iv. The capability to discover statistically salient features among the stored data.
v. The ability to find solutions for problems that involve combinatorial explosion.
vi. The ability to be trained (or taught) instead of programmed.
To achieve the above capabilities, the connection weights of ANNs are estimated by learning algorithms. Learning algorithms of neural networks use a learning problem, described by a set of training data, and iteratively update the parameters of a network such that some error measure is decreased or some performance measure is increased.
The training data can consist of input and output data (supervised learning), of input data and success or failure signals (reinforcement learning) or of input data alone (unsupervised learning). Among the many interesting properties of a neural network, the property that is of primary significance is the ability of the network to learn from its environment, and to improve its performance through learning; the improvement in performance takes place over time in accordance with some prescribed measure. A neural network learns about its environment through an iterative process of adjustments applied to its synaptic weights and thresholds. Ideally, the network becomes more knowledgeable about its environment after each iteration of the learning process. In the context of neural networks, learning is defined by Haykin (1994) as follows: Learning is a process by which the free parameters of a neural network are adapted through a continuing process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place. This definition of the learning process implies the following sequence of events: i. An environment stimulates the neural network. ii. The neural network undergoes changes as a result of this stimulation. iii. The neural network responds in a new way to the environment, because of the changes that have occurred in its internal structure. An essential ingredient of supervised or active learning evident in fig. 2 is the availability of an external teacher (Haykin, 1994). In conceptual terms, we may think of the teacher as having knowledge of the environment that is represented by a set of input-output examples. The environment is, however, unknown to the neural network of interest. Suppose now that the teacher and the neural network are both exposed to a training vector (i.e. example) drawn from the environment. 
By virtue of built-in knowledge, the teacher is able to provide the neural network with a desired or target response for that training vector. Indeed, the desired response represents the optimum action to be performed by the neural network. The network parameters are adjusted under the combined influence of the training vector and the error signal; the error signal is defined as the difference between the actual response of the network and the desired response. This adjustment is carried out iteratively in a step-by-step fashion with the aim of eventually making the neural network emulate the teacher; the emulation is presumed to be optimum in some statistical sense. In other words, knowledge of the environment available to the teacher is transferred to the neural network as fully as possible. When this condition is reached, we may then dispense with the teacher and let the neural network deal with the environment thereafter completely by itself (i.e. an unsupervised fashion). The form of supervised learning described is indeed the error-correction learning in a closed-loop feedback system, but the unknown environment is not in the loop. As a performance measure for the system, we may think in terms of the mean-squared error (i.e. the expected value of the sum of squared errors) defined as a function of the free parameters of the system. This function may be visualized as a multidimensional error-performance surface or simply error surface, with the free parameters as coordinates. The true error surface is averaged over all possible input-output examples (Haykin, 1999). Any given operation of the system under the teacher's supervision is represented as a point on the error surface.
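The error-correction scheme described above, in which the parameters are adjusted step by step from the error signal of one training vector at a time, can be sketched for a single linear unit. This is only an illustration under stated assumptions: the linear unit, the learning rate, the number of epochs and the "teacher" function `d = 2*x1 - x2 + 0.5` are all hypothetical and do not come from this work.

```python
import random


def predict(weights, bias, x):
    """Actual (linear) response of the unit to input vector x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mean_squared_error(weights, bias, examples):
    """Average squared difference between actual and desired responses."""
    return sum((predict(weights, bias, x) - d) ** 2 for x, d in examples) / len(examples)


def train_error_correction(examples, rate=0.1, epochs=500, seed=0):
    """Adjust the free parameters one example at a time (delta / LMS rule)."""
    rng = random.Random(seed)
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    order = list(examples)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, d in order:
            error = predict(weights, bias, x) - d  # actual minus desired response
            # Step against the instantaneous gradient of 0.5 * error**2.
            weights = [w - rate * error * xi for w, xi in zip(weights, x)]
            bias -= rate * error
    return weights, bias


# Hypothetical teacher providing the desired response for each training vector.
grid = [0.0, 0.5, 1.0]
examples = [((a, b), 2 * a - b + 0.5) for a in grid for b in grid]
initial_mse = mean_squared_error([0.0, 0.0], 0.0, examples)
weights, bias = train_error_correction(examples)
final_mse = mean_squared_error(weights, bias, examples)
```

Because each step uses a single example, the path of the parameters is noisy, but on this noise-free toy problem the mean-squared error still falls towards zero.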
For the system to improve its performance over time, and therefore learn from the teacher, the operating point has to move down successively toward a minimum point of the error surface; the minimum point may be a local minimum or a global minimum. A supervised learning system is able to do this by virtue of some useful information it has about the gradient of the error corresponding to the current behavior of the system. The gradient of the error surface at any point is a vector that points in the direction of steepest ascent, so the operating point is moved in the opposite direction, that of steepest descent. In fact, in the case of supervised learning from examples, the system uses an instantaneous estimate of the gradient vector, with the example indices presumed to be those of time. The use of such an estimate results in a motion of the operating point on the error surface that is typically in the form of a random walk. Nevertheless, given an algorithm designed to minimize the cost function of interest, an adequate set of input-output examples and enough time to do the training, a supervised learning system is usually able to perform such tasks as pattern classification and function approximation satisfactorily. For this reason, more industrial applications of ANNs to human development can be expected. A recent example is the use of ANNs in financial crime detection (Bakpo, 2008). The use of expert systems as a means of conducting medical diagnosis and recommending successful treatments has been a highly active research field in the past few years. The development of medical expert systems that use artificial neural networks (ANNs) as their knowledge base appears to be a promising method for diagnosis and possible treatment routines. One of the major applications of medical informatics has been the implementation and use of expert systems to predict medical diagnoses based upon a set of symptoms (Eugene et al., 1997).
Furthermore, such expert systems serve as an aid to medical professionals in recommending effective laboratory tests and treatments of diseases. An intelligent computer program assisting medical diagnosis could provide easy access to a wealth of information from past patient data. Such a resource may help hospitals reduce excessive costs from unnecessary laboratory test and ineffective patient treatment, while maintaining high quality of medical care. Klerfors (1998) argued that, so far these expert systems only served as aids to the physician and are not 100% reliable. Current expert systems do not provide enough value to the physician to justify their large-scale implementation. One major drawback of conventional medical expert systems is the use of static knowledge base developed from a limited number of cases and a limited population size, demographics, and geographic location. The knowledge base is inherently not dynamic and is not routinely updated to keep up with emerging trends such as the appearance or increased prevalence of unforeseen diagnoses. The result is that, after a given period of time this inflexibility limits the use of the knowledge base as it no longer reflects the current characteristics of the population at risk. Given these points, the development of a knowledge base using artificial neural network technology naturally lends itself towards the task of predicting medical diagnosis. In addition, the technology appears to be a promising method for recommending possible treatment routines. It offers flexible and quick means of designing dynamic expert systems that consider different decision variables in their predictive routines. With its dynamic nature and on-line learning capability, a neural network knowledge base can also be updated with more recent patient data. Thus, once an initial knowledge base has been set up, it can never become obsolete with time. 
In this way, the system can effectively capture varying ailment trends in a given population while retaining its previous knowledge. One of the most important problems of medical diagnosis, in general, is the subjectivity of the specialist. Human beings make mistakes, and because of their limitations errors do occur during diagnosis. It has been noted, in particular in pattern recognition activities, that the experience of the professional is closely related to the final diagnosis (Salim, 2004). This is because the result does not depend on a systematized solution but on the interpretation of the patient's signals. Brause (2001) pointed out that almost all physicians are confronted during their training with the task of learning to diagnose. Here, they have to solve the problem of deducing certain diseases or formulating a treatment based on more or less specified observations and past experience. For this task, certain basic difficulties have to be taken into account, namely the basis for a valid diagnosis. A sufficient number of experienced cases is reached only in the middle of a physician's career and is therefore not yet present at the end of academic training. This is especially true for rare or new diseases, where experienced physicians are in the same situation as newcomers. In principle, humans can recognize patterns or objects very easily but fail when probabilities have to be assigned to observations (Brause, 2001).

Skin diagnostic analysis and artificial neural network design
Skin diseases are diseases that may originate inside the body and manifest on the skin, or start from the skin and manifest on the skin (Buxton, 1991). This section presents the analysis of these diseases. It also presents the structure of an artificial neural network for handling skin disease diagnosis.

Analysis of skin disease diagnostic system
The human expert performs the diagnosis of skin diseases by collecting patient records and complaints. This list of patient complaints and observed skin conditions is then expanded into several Boolean symptoms. The symptoms are then matched against the knowledge already possessed by the human expert (the knowledge base, i.e. experience). If there is a match, the doctor names the disease as a possible skin disease. In some cases, the human expert may subject the patient to further laboratory tests in order to ascertain the causative agent of the skin condition. The test can serve as a confirmatory test if the disease diagnosed is actually caused by microorganisms such as bacteria, warts, viruses, fungi, etc. When the human expert is inexperienced or has not come across such a skin condition, he uses trial and error to diagnose. This is done by combining all the possible conditions, comparing them with known conditions and narrowing the judgment. During this process, learning is said to have taken place if the skin condition is properly diagnosed and treated.
Thus, the human expert depends largely on his experience and on the interpretation of the patient's complaints. In fig. 3, the disease survey comprises the general collection and listing of the skin diseases that are studied and used in the process of system development. This includes complaints, observations and laboratory tests.

Disease survey and plan of diagnosis
The disease survey comprises the general collection and listing of the skin diseases that will be studied and used in the development of the Artificial Neural Network for the system. Examples of several skin disease infections are shown in figure 4. Similarly, skin diseases and their symptoms are shown in Table 1. The National Center for skin diseases Singapore was selected in this study for the symptomatic classification because the center uses state-of-the-art equipment in checking the diseases. It uses skin-scanning machines, which can better classify the skin colour variation due to the disease using samples from different human races. A particular skin lesion can also be observed better by the machine than by human sight, which can be erratic. The center also offers an online sample test in its Online Consultation, in which a camera pointed at the affected portion of skin captures the symptom of a given case; this can be passed to our Neural Network system when operational.

Skin disease data input design
The data used for the diagnostic system consist of the following components: patient vital signs, patient verbal complaints, patient demographics and the presence of specific symptoms. The elements of each of the components serve as input variables to the network. Thus, the task of the artificial neural network is to draw a correlation between the patient's presentation, given by self-reported symptoms and vital signs, and the diagnosis. While patient demographics and vital signs are key elements in providing clues to an ailment, it has been observed that the patients' complaints provide greater insights for predicting medical diagnoses. However, some skin conditions have to rely heavily on good scanning machines for an accurate prediction (NSC, 2005). The expanded complaint input variables are shown in Table 2, which lists the input variables created by the expansion of the symptom string. The goal is to create an automated data processing system capable of taking as input a set of these records and using it to arrive at a correct diagnosis. The symptom input variables were created to draw a correlation between a specific complaint and the possible skin disease.
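The expansion of a complaint into Boolean input variables can be sketched as follows. This is a rough illustration only: the four variable names are borrowed from the scabies symptoms mentioned in this paper, the function name is hypothetical, and plain substring matching is an assumption standing in for whatever matching the actual system performs.

```python
# Hypothetical subset of the expanded complaint variables.
SYMPTOM_VARIABLES = ["tiny bumps", "itching", "scaly", "on fingers"]


def encode_complaint(complaint, variables=SYMPTOM_VARIABLES):
    """Map a free-text complaint to a 0/1 vector, one entry per symptom variable.

    Assumption: a symptom counts as present when its name occurs in the text.
    Negations such as "no itching" are not handled by this naive matching.
    """
    text = complaint.lower()
    return [1 if name in text else 0 for name in variables]


# Hypothetical complaint string.
vector = encode_complaint("Itching and tiny bumps on fingers")
```

Each position of the resulting vector would feed one neuron of the input layer.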

Artificial neural network structure for skin disease diagnostic
The design and architecture of the ANN selected for skin diseases is based on the feed-forward network. This means that the artificial neurons are organized in layers and send their signals forward (i.e., from input to output), and then the errors are propagated backwards (see fig. 5).
The network receives input symptoms through the neurons in the input layer, and the output of the network is given by the neurons in the output layer. There may be one or more intermediate hidden layers. The backpropagation algorithm uses supervised learning, which means that we provide the algorithm with examples of the inputs and outputs we want the network to compute, and then the error (the difference between actual and expected results) is calculated. The idea is to reduce this error until the ANN learns the training data. The training begins with random weights, and the goal is to adjust them so that the error will be minimal. The activation function of the artificial neurons in ANNs implementing the backpropagation algorithm is given as follows (Haykin, 1999):

$$A_j(x, w) = \sum_{i=1}^{n} x_i w_{ij} \qquad (1)$$

The output function uses the sigmoidal function:

$$O_j(x, w) = \frac{1}{1 + e^{-A_j(x, w)}} \qquad (2)$$

We defined the error function for the output of each neuron as:

$$E_j(x, w, d) = \left(O_j(x, w) - d_j\right)^2 \qquad (3)$$

The weights are adjusted using the method of gradient descent:

$$\Delta w_{ij} = -\eta \frac{\partial E_j}{\partial w_{ij}} \qquad (4)$$

where $x_i$ are the inputs, $w_{ij}$ are the weights, $A_j(x, w)$ is the activation, $O_j(x, w)$ are the actual outputs, $d_j$ are the expected outputs and $\eta$ is the learning rate.
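For a single output neuron, the four steps just named (weighted-sum activation, sigmoidal output, squared error and a gradient-descent weight update) can be sketched as below. The sketch assumes the standard forms of these quantities; the learning rate of 0.5, the zero starting weights and the example input are hypothetical, and the weight-update rule for hidden layers is not shown.

```python
import math


def activation(x, w):
    """Weighted sum of the inputs."""
    return sum(xi * wi for xi, wi in zip(x, w))


def output(x, w):
    """Sigmoidal output of the neuron."""
    return 1.0 / (1.0 + math.exp(-activation(x, w)))


def error(x, w, d):
    """Squared difference between actual and expected output."""
    return (output(x, w) - d) ** 2


def gradient_step(x, w, d, eta=0.5):
    """One gradient-descent update of the weights (eta is an assumed value)."""
    o = output(x, w)
    # Chain rule: dE/dw_i = 2 * (o - d) * o * (1 - o) * x_i
    grad = [2.0 * (o - d) * o * (1.0 - o) * xi for xi in x]
    return [wi - eta * gi for wi, gi in zip(w, grad)]


# Hypothetical training pattern: three symptoms present, expected output 1.
x, d = [1, 1, 0, 1], 1.0
w = [0.0, 0.0, 0.0, 0.0]
error_before = error(x, w, d)
for _ in range(200):
    w = gradient_step(x, w, d)
error_after = error(x, w, d)
```

The weight attached to the absent symptom (input 0) receives a zero gradient and is never changed, which is why only active inputs contribute to learning for a given pattern.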
For the skin diagnostic system, we defined the following firing rules: take a collection of symptoms for a disease to a node, the presence of which causes it to fire (the 1-taught set of patterns) and the absence of which prevents it from firing (the 0-taught set). If there is a tie in symptoms, then the ANN remains in the undefined state (1/0). For instance, in scabies, which has four input symptoms, the neuron is taught to fire (i.e., output 1) when the input (tiny bumps X1, itching X2, scaly X3 and on fingers X4) is 1111 or 0111 or 1101, and not to fire (i.e., output 0) when the input is 0000 or 1010 or 1110. The firing rule is depicted in Table 3. It is important to note that the decision concerning each firing rule was derived directly from the scabies condition. This condition specifies that scabies occurs on the hand and has to be itchy. These conditions are weighted higher than the other conditions. The values 1111, 0111 and 1101 imply that either all conditions are met, or at least one of tiny bumps or scaly skin is present, to meet the 50% threshold mark required for firing. This is a very important rule formulation trick, which also requires some higher-level knowledge of the disease for which the rule is formulated. The rule provides a greater degree of flexibility for determining whether the skin disease is scabies or not. This is particularly true for a case where the patient has used cream to make the skin less scaly. Omission of the scaly symptom will not affect the decision of the neural network in determining whether to suspect scabies or not. This is a more realistic situation, which makes the neural network more desirable. The firing rule gives the neuron a sense of similarity and enables it to respond 'sensibly' to symptoms not seen during training.
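One common way to give such a firing rule its "sense of similarity" is to compare an unseen pattern with the taught patterns by Hamming distance and let the nearer set decide. The sketch below assumes that generalisation scheme; the text itself only specifies the taught patterns and the undefined (1/0) state on a tie, so the distance-based comparison and the function names are assumptions.

```python
# Taught patterns for scabies, in the order X1 X2 X3 X4
# (tiny bumps, itching, scaly, on fingers), as listed in the text.
FIRE_SET = ["1111", "0111", "1101"]      # 1-taught set
NO_FIRE_SET = ["0000", "1010", "1110"]   # 0-taught set


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def firing_rule(pattern, fire_set=FIRE_SET, no_fire_set=NO_FIRE_SET):
    """Return "1" (fire), "0" (do not fire) or "1/0" (undefined on a tie).

    Assumption: an unseen pattern follows whichever taught set contains
    its nearest neighbour in Hamming distance.
    """
    to_fire = min(hamming(pattern, p) for p in fire_set)
    to_rest = min(hamming(pattern, p) for p in no_fire_set)
    if to_fire < to_rest:
        return "1"
    if to_rest < to_fire:
        return "0"
    return "1/0"


# Pattern not seen during training: itching and on fingers only.
unseen = firing_rule("0101")
```

Under this assumed scheme the unseen pattern 0101 is one bit away from 0111 and 1101 but two bits away from 0000, so the neuron fires, whereas 1011 is equally close to both sets and stays undefined.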

System implementation
Once the system algorithm has been specified, the coding of the system follows. The coding is to a large extent guided by the algorithm. Since the program is object-oriented with classes, its implementation need not be procedural. The program is implemented in C++. Testing of the system was done module by module and code segment by code segment. Once a segment was in good order, another segment was merged with it until the module was complete. During testing, code that had been retested and proven correct was reused for system development.
Stubs were used to test individual member functions, the main program, and subroutines or subprograms. In this process the lower-level modules were simulated to allow the testing of the higher-level modules. The diagnostic modules, for instance, were simulated to test the neural network's selection of its actions when the selected operation is called in the main program module where the object is instantiated. When the response was proved correct, the diagnostic module itself was fully developed. This testing method ensured that the modules worked when they were merged into a single program. After the testing and merging, all the modules that had been coded and stub-tested were retested as an integrated single unit.

Description of implementation data
The data used in the implementation and testing of the Artificial Neural Network were collected from the Olivet Clinic, Port Harcourt and from the National Skin Center for Dermatology. The center offered access to its recorded data on the dermatological research carried out up to March 2005. The data contain information on various symptoms, their diseases and the various causative organisms. They were accessed at http://www.nsc.gov.sg/. The data collected were extensive, but only selected records were used in testing the program. The selection was based on the skin diseases more common in Nigeria. How common the diseases are was verified at the Olivet Clinic, off Garison junction, Port Harcourt. Olivet is a hospital dedicated to dermatology in Port Harcourt. Table 1 shows the selected skin diseases collected from the web (http://www.nsc.gov.sg/). Table 4 shows the selected skin diseases from the Olivet Clinic, Port Harcourt.

Training of the artificial neural network
The learning process proceeds by presenting the network with a training set composed of input patterns together with the required response patterns. By comparing the actual output with the target output for a given pattern, the error is computed using equation (3). The error can then be used to alter the connection strengths between layers in order to achieve a better network response to the same input pattern in subsequent iterations. In the proposed ANN structure, a weight of 10 was initially assigned to all the symptoms, while a threshold of 50 was chosen for the first layer. In the next stage a threshold of 70 was chosen and weights were assigned based on some probabilities. The network was trained and the result was not satisfactory, although some diseases were diagnosed correctly. Following this, the threshold was reverted to 50 and weights were assigned to each symptom based on its peculiarity (this time, not all equal to 10). Finally, a threshold value of 70 was chosen again, while higher weights were assigned to each symptom based on its exclusiveness. Scabies was chosen as a test case because it has a large number of input symptoms, which matches the ANN structure implemented. Table 5 presents the results of testing the system. The results show cases where suspected and diagnosed symptoms were supplied to the system. From Table 5, a critical analysis of the results in tests 1 to 4 clearly illustrates the effect of weight adjustments on the result of the diagnosis of the skin disease. In test 1, all the symptoms were used and the system confirmed scabies. In test 2, the tiny bumps symptom was dropped and the weight at 50 confirmed that bacteria are present in scabies. This was also true for test 3. However, in test 4 a symptom was dropped and the weight was below 50, indicating that an important symptom had been dropped. Thus the diagnosis indicated that bacteria are the causative organism of scabies.

Performance analysis of the ANN system
To justify the performance of our diagnostic system, we conducted two analyses. The first is using a general performance scheme. Secondly, we carried out a number of tests at random using various symptom combinations.

A. Performance Benchmark
The proposed neural network skin disease diagnostic system (NNSDDS) architecture relies on a piece of software for easy skin disease diagnosis. The principles underlying diagnostic software are grounded in classical statistical decision theory. There are two sources that generate inputs to the diagnostic software: no disease ($H_0$) and disease ($H_1$). The goal of the diagnostic software is to classify each case as disease or no disease. Two types of errors can occur in this classification:
i. Classification of a disease as normal (false negative); and
ii. Classification of a normal as disease (false positive).
We define the probability of detection as $P_D = \Pr(\text{classify into } H_1 \mid H_1 \text{ is true})$, so that the probability of a false negative is $1 - P_D$, and the probability of a false positive as $P_F = \Pr(\text{classify into } H_1 \mid H_0 \text{ is true})$. Let the numerical values for the no disease (N) and disease (C) cases follow exponential distributions with parameters $\lambda_N$ and $\lambda_C$, $\lambda_N > \lambda_C$, respectively. Then, for a decision threshold $t$ above which a case is classified as disease, we can write the probability of detection $P_D$ and the probability of false positive $P_F$ as

$$P_D = \int_t^{\infty} \lambda_C e^{-\lambda_C x}\,dx = e^{-\lambda_C t}, \qquad P_F = \int_t^{\infty} \lambda_N e^{-\lambda_N x}\,dx = e^{-\lambda_N t}.$$

Thus $P_D$ can be expressed as a function of $P_F$ as

$$P_D = P_F^{\,r},$$

where $r = \lambda_C/\lambda_N$ is between 0 and 1. Consequently, the quality profile of most diagnostic software is characterized by a curve that relates its $P_D$ and $P_F$, known as the receiver operating characteristic (ROC) curve (Trees, 2001). The ROC curve is a function that summarizes the possible performances of a diagnostic system. It visualizes the trade-off between false positive rates and detection rates, thus facilitating the choice of a decision function. Following the work done by Huseyin and Srinivasan (2004), Fig. 6 shows sample ROC curves for various values of r. The performance analysis of the NNSDDS algorithms was carried out using the MATLAB software package (MATLAB®, 2009), and the results compared with the collected data for scabies, acne and vulgar are shown in Fig. 7, Fig. 8 and Fig. 9, respectively.
Explanation: In figures 7, 8 and 9, it can be seen that the model results compare satisfactorily with the collected data for all cases examined in the paper.
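The ROC relation used in the benchmark can be sketched numerically. The sketch assumes that the relation reads P_D = P_F^r with r = λ_C/λ_N, which is what follows when both classes are exponentially distributed and a case is classified as disease above a threshold t; the symbol t, the function names and the parameter values are assumptions introduced for illustration.

```python
import math


def rates_at_threshold(t, lam_n, lam_c):
    """Return (P_D, P_F) for exponential disease / no-disease scores and threshold t."""
    p_detect = math.exp(-lam_c * t)    # disease score exceeds t
    p_false = math.exp(-lam_n * t)     # no-disease score exceeds t
    return p_detect, p_false


def detection_probability(p_false, r):
    """ROC relation P_D = P_F ** r, with r = lam_c / lam_n between 0 and 1."""
    return p_false ** r


def roc_points(r, steps=10):
    """Sample the ROC curve at evenly spaced false positive rates."""
    return [(i / steps, detection_probability(i / steps, r)) for i in range(steps + 1)]


# Hypothetical parameters: lam_n = 2.0, lam_c = 0.5, so r = 0.25.
p_d, p_f = rates_at_threshold(1.0, lam_n=2.0, lam_c=0.5)
curve = roc_points(0.25)
```

A smaller r bends the curve further towards the top-left corner, that is, a higher detection probability for the same false positive probability.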

B. Overall Success Rating of the NNSDD System
In testing the NNSDDS, twenty different tests were carried out at random using various symptom combinations, and the results were compared with the expected results of the NNSDDS.
Where there was a match, a success was recorded. Where there was no match, a failure was recorded. The total number of tests was 20, with 18 successes and 2 failures, giving a success rate of 18/20 = 90%.
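The tally behind the overall rating can be reproduced with a short sketch. Only the counts (18 matches out of 20 tests) come from the text; the placeholder labels and the function name are hypothetical, since the individual test cases are not listed here.

```python
def success_rate(predicted, expected):
    """Return (successes, failures, percentage of matching diagnoses)."""
    successes = sum(1 for p, e in zip(predicted, expected) if p == e)
    failures = len(expected) - successes
    return successes, failures, 100.0 * successes / len(expected)


# Placeholder labels: 18 results match the expected diagnosis, 2 do not.
expected = ["match"] * 20
predicted = ["match"] * 18 + ["other"] * 2
successes, failures, rate = success_rate(predicted, expected)
```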

Conclusion
The paper presented a framework for diagnosing skin diseases using artificial neural networks. The proposed system achieved a success rate of 90% (18 of 20 tests) using the artificial neural network technique. This suggests that the ANN technique is an effective and efficient method for diagnostic problems. The features of the ANN provide a learning capability, which makes the system open-ended to new disease conditions or to variations of a known skin disease due to mutation of the causative organism. This provides great flexibility in the diagnostic system. With this flexibility, the coverage of skin conditions by the diagnostic system can be extended as new cases are learned. This makes the application useful in many conditions, even in unforeseen instances.

Recommendation
This work is recommended to human experts and dermatologists who specialize in the diagnosis and treatment of skin and related diseases. The human experts will find it useful as an aid in the decision-making process and in the confirmation of suspected cases. A non-expert will also find the work useful in areas where prompt action is required for the diagnosis of any of the skin diseases listed in the system. Medical practitioners who operate in areas where there is no specialist (dermatologist) can also rely on the system for assistance. The skin diseases covered include Scabies, Acne, Vulgari, Impetigo, Leishmaniasis, Atopic Dermatitis, Syringoma, Benign, Skin Tumour, Leprosy, and Diaper Candidiasis. Others are Folliculitis, Seborrhoeic Dermatitis, Xanthelasma, Melasma, Urticaria and Lichen simplex. All these diseases can be handled using the system developed in this research. The research work can also act as a foundation for the advancement of research into neural network applications in medical diagnosis. The work is also recommended for developers of object-oriented systems and decision-support systems.