Analysis of CT Brain Images using Radial Basis Function Neural Network

Medical image processing and analysis assists radiologists in the diagnosis process, yielding a more accurate and faster diagnosis. In this work, we have developed a neural network to classify computer tomography (CT) brain tumor images for automatic diagnosis. The system is divided into four steps, namely enhancement, segmentation, feature extraction and classification. In the first phase, an edge-based selective median filter is used to improve the visibility of the loss of the gray-white matter interface in CT brain tumor images. The second phase uses a modified version of the shift genetic algorithm for segmentation. The next phase extracts textural features using a statistical texture analysis method. These features are fed into classifiers such as BPN, Fuzzy k-NN, and the radial basis function network. The performances of these classifiers are analyzed in the final phase with receiver operating characteristic and precision-recall curves. The results show that the proposed CAD method for brain tumor diagnosis is accurate, computationally efficient and less time consuming.

Angular second moment                    ASM
Computer tomography                      CT
Contrast                                 CON
Correlation                              COR
Difference variance                      DV
Difference entropy                       DE
Entropy                                  ENT
Edge-based selective median filter       ESMF
Genetic algorithm                        GA
Information measure of correlation1      IMC1
Information measure of correlation2      IMC2
Inverse difference moment                IDM
Maximum correlation coefficient          MCC
Precision-recall                         PR
Selective median filter                  SMF
Sum average                              SA
Sum variance                             SV
Sum entropy                              SE
Shift-genetic algorithm                  sGA
Variance                                 VAR

The chances of applying image analysis techniques successfully depend very much on selecting the right problem and on having simple, high-quality images as free as possible from distracting elements such as dust, stray hairs, reflections or shadows. It is beneficial to begin with evenly illuminated samples, and structures of interest must be sufficiently distinct to allow them to be easily isolated from the background. The automation of image analysis is wholly dependent on these factors. If applications are forced to rely on continuous operator involvement to ensure that the correct structures are measured, this potentially undermines attempts to reduce the labor involved in routine identifications. Lee7, et al. further mentioned that if the skill required for the preparation of specimens prior to image analysis is comparable to that required for taxonomic analysis, then the potential application of automated image analysis would be severely restricted.
It is no coincidence that insect wings have provided the subjects for much image analysis work in the field of taxonomy. Their transparent, two-dimensional structure and obvious venation pattern make them ideal for this purpose. How often wing venation alone can sufficiently characterize a specimen for identification purposes is uncertain. The extensive feature measurement capabilities of modern image processing software suggest, however, that many potentially important or novel features may now be measured, leading to highly comprehensive descriptions. Novel characteristics described by image analysis may, in fact, feed back into traditional taxonomy. Herz17, et al. discussed that head and body structures may also provide subjects for image analysis, although their three-dimensionality may lead to distortions in two-dimensional image analysis. The extension of imaging techniques to organisms which do not possess structures amenable to image analysis may, therefore, be problematic.
The difficulties involved in consistently acquiring high-quality, in-focus images and objective feature measurements in poorly understood character spaces may restrict the application of these techniques. Selecting a set of features that captures the information required for identification is not easy. It is usually necessary to obtain as many feature measurements as possible in the hope of capturing the information required. Feature measurements must be adjusted to compensate for variation in illumination, orientation and overall body size. While illumination and orientation may be standardised using a variety of image-processing algorithms, variation in body size is most conveniently overcome using a number of formal dimensionless expressions that may be used as shape descriptors. It is questionable whether the precise meaning of any particular shape descriptor must be known before it may be used for identification: provided the descriptor consistently quantifies some aspect of shape, its meaning may be largely irrelevant.
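As a concrete illustration of such a dimensionless shape descriptor (this example is ours, not from the original work), circularity relates a region's area to its perimeter and is unchanged when the specimen is scaled:

```python
import math

def circularity(area, perimeter):
    """Dimensionless shape descriptor 4*pi*A / P**2: equals 1.0 for a
    perfect circle and decreases as the outline grows more irregular.
    Being dimensionless, it is invariant to overall body size."""
    return 4.0 * math.pi * area / perimeter ** 2

# A circle of radius r has area pi*r^2 and perimeter 2*pi*r,
# so the descriptor is 1.0 regardless of r.
r = 5.0
print(round(circularity(math.pi * r ** 2, 2 * math.pi * r), 9))  # → 1.0
```

Whether this particular descriptor is meaningful for a given identification task matters less, as argued above, than whether it quantifies shape consistently.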
The proposed computer tomography (CT) brain image analysis system is designed with four phases for automatic diagnosis, as described by Golemati5, et al.: enhancement, segmentation, feature extraction and classification. The enhancement phase reduces noise, the segmentation phase extracts the suspicious region, the feature extraction phase extracts textural features from the segmented regions, and the classification phase classifies the image.
In the enhancement phase, an edge-based selective median filter (ESMF) is used to improve the visibility of the loss of the gray-white matter interface in CT brain images; this loss is one of the early signs used in brain tumor diagnosis. The noise is removed using a selective median filter (SMF); while applying this filter, the edge pixels identified in the edge map are ignored. This phase yields a noise-free, edge-preserved CT image. The next phase uses this enhanced version to segment the suspicious regions of the brain image. The Bayesian algorithm is an established tool for soft-tissue segmentation of MRI images, so we have also used the same algorithm for classification performance comparison.
The medical image data are obtained from biomedical CT imaging devices and indicate the presence or absence of a lesion, along with the patient history. To diagnose and classify the images we have used a radial basis function network (RBFN) classifier. To our knowledge, no existing classification system for this task uses an RBFN. The use of the RBFN for classification gives a more accurate result.
A modified version of the genetic algorithm (GA), namely the shift-genetic algorithm (sGA), is used in the segmentation process; it introduces a new crossover operator based on binary shift operations. In the feature extraction phase, as discussed by Abraham and Sorwar1, the textural features are extracted using a statistical texture analysis method called the reduced gray level run length method (RGLRLM). As mentioned by Gandhi and Shah4, the Haralick features extracted by this method are fed into three different classifiers: back propagation network (BPN), Fuzzy k-NN, and RBFN. The performances of the classifiers are analyzed with ROC and PR curves. Based on the experiments and results, the proposed RGLRLM texture analysis method with RBFN achieved better performance than the others.

1.1   Shift Genetic Algorithm

A GA is a heuristic search or optimization technique for obtaining the best possible solution in a vast solution space. To apply a GA, an initial population is generated and the fitness of each member of the population is evaluated. The algorithm then iterates the following: members of the population are selected for reproduction in accordance with their fitness evaluations. The reproduction operators are then applied, which generally include a crossover operator that models the exchange of genetic material between the parent chromosomes and a mutation operator that maintains diversity and introduces new alleles into the generation, or a combination of both, to generate the offspring of the next generation. The fitness of the offspring is then evaluated, and the algorithm starts a new iteration. The algorithm stops either when a sufficiently good solution is found or after a predetermined number of iterations.
The most important parameters controlling the GA, which can significantly affect its performance, are the population size, the crossover rate and the mutation probability. Initially the images are divided into kernels of size 10 × 10 pixels. The initial population of the GA is constructed by selecting kernels at random: for each chromosome, two random numbers are generated and taken as coordinates for selecting a kernel, and these numbers are converted into binary to form the chromosome. The size of the initial population is 10, and the mean feature value of each kernel is used as its fitness value. The reproduction operator selects kernels with high probability based on roulette wheel selection, and the mutation operator generates the new population with a mutation probability of 0.03. From the new population the optimum fitness value is calculated and compared with the local optimum. If the local optimum is greater than the new value, the local optimum is taken as the global one and the next iteration continues with the old population; otherwise, the new population is copied into the old population, the new value is taken as the global optimum, and the next iteration continues with the new population. This process is repeated for 50 iterations, and the global optimum from the last iteration is used as the threshold value to segment the brain images.
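The selection-mutation loop described above can be sketched as follows. This is our simplified, illustrative reading: kernel coordinates are encoded as bit strings, the mean gray level of a kernel is its fitness, and the sGA's shift-based crossover operator is omitted since its exact definition is not given here. All names and the random test image are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sga_threshold(image, kernel=10, pop_size=10, p_mut=0.03, iters=50):
    """Evolve a population of kernel coordinates; return the best mean
    gray level found, used as the segmentation threshold."""
    rows, cols = image.shape[0] // kernel, image.shape[1] // kernel
    bits = max(rows, cols).bit_length()        # bits per coordinate

    def decode(chrom):                         # bit string -> (row, col)
        r = int(chrom[:bits], 2) % rows
        c = int(chrom[bits:], 2) % cols
        return r, c

    def fitness(chrom):                        # mean gray level of kernel
        r, c = decode(chrom)
        block = image[r * kernel:(r + 1) * kernel, c * kernel:(c + 1) * kernel]
        return float(block.mean())

    def random_chrom():
        return ''.join(rng.choice(list('01'), size=2 * bits))

    pop = [random_chrom() for _ in range(pop_size)]
    best = max(fitness(c) for c in pop)        # initial global optimum

    for _ in range(iters):
        fits = np.array([fitness(c) for c in pop])
        probs = fits / fits.sum()              # roulette-wheel selection
        idx = rng.choice(pop_size, size=pop_size, p=probs)
        new_pop = []
        for i in idx:
            chrom = list(pop[i])
            for j in range(len(chrom)):        # bit-flip mutation, p = 0.03
                if rng.random() < p_mut:
                    chrom[j] = '1' if chrom[j] == '0' else '0'
            new_pop.append(''.join(chrom))
        new_best = max(fitness(c) for c in new_pop)
        if new_best > best:                    # keep the better population
            best, pop = new_best, new_pop
    return best

img = rng.integers(0, 256, size=(100, 100)).astype(float)
t = sga_threshold(img)                         # threshold from last iteration
segmented = img > t                            # binary segmentation mask
```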

Golemati5, et al. mentioned that the radial basis function network (RBFN) can be used for approximating functions and recognizing patterns. It uses Gaussian potential functions, which are also used in networks called regularization networks. Powell used radial basis functions for exact interpolation. In interpolation we have n data points xi ∈ Rd and n real-valued numbers ti ∈ R, where i = 1,…,n. The task is to determine a function S in a linear space such that S(xi) = ti, i = 1,…,n. The interpolation function is a linear combination of basis functions:

$S(x)=\sum_{i=1}^{n}w_{i}v_{i}(x)$                   (1)

As basis functions vi, radial basis functions of the form

$v_{i}(x)=\varphi\left(\Vert x-x_{i}\Vert\right)$                   (2)

are used,

Figure 1. Shift genetic-based segmentation.

where φ is a mapping R+ → R, and the norm is the Euclidean distance. The following forms have been considered as radial basis functions:
(a) Multi-quadric function φ(r) = (r² + c)^(1/2), where c is a positive constant and r ∈ R
(b) φ(r) = r
(c) φ(r) = r²
(d) φ(r) = r³
(e) φ(r) = exp(–r²)
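A brief sketch of the exact interpolation described above, using the Gaussian form (e): the weights wi are found by solving the linear system S(xi) = ti. The data points here are made up for illustration.

```python
import numpy as np

def rbf_interpolate(X, t, phi=lambda r: np.exp(-r ** 2)):
    """Exact RBF interpolation: find weights w such that
    S(x_i) = sum_j w_j * phi(||x_i - x_j||) = t_i for every data point."""
    # Pairwise Euclidean distances between the n data points
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = phi(D)                        # n x n interpolation matrix
    w = np.linalg.solve(Phi, t)         # weights of the linear combination
    def S(x):                           # the interpolant S(x), Eq. (1)-(2)
        r = np.linalg.norm(X - x, axis=-1)
        return phi(r) @ w
    return S

X = np.array([[0.0], [1.0], [2.0]])     # three 1-D data points
t = np.array([1.0, 3.0, 2.0])
S = rbf_interpolate(X, t)
print(float(S(np.array([1.0]))))        # reproduces t_2 = 3.0 at the data point
```

For the Gaussian basis, the interpolation matrix on distinct points is positive definite, so the system is always solvable.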

It has been proved that the global basis functions may have slightly better interpolation properties than the local ones.
The architecture of the radial basis function network consists of three layers, the input, the hidden and the output layers, as shown in Fig 2. There are n input neurons and m output neurons, with the hidden layer between the input and output layers. The interconnections between the input layer and the hidden layer form hypothetical connections, and those between the hidden and output layers form weighted connections. The training algorithm is used to update the weights of these interconnections.

Table 1 illustrates the various measures for five normal and five abnormal images; these features are taken from the segmented image and used as input to the 14 × 1 RBFN classifier, as referred to by Wang and Yong9 and Haralick and Shapiro16. Angular second moment, contrast, correlation, variance, inverse difference moment, sum average, sum variance, sum entropy, entropy, difference variance, difference entropy, information measure of correlation1, information measure of correlation2 and maximum correlation coefficient are the calculated Haralick texture feature values of the CT brain image, discussed in Channin14,15, et al., that are used in the 14 × 1 weighted-sum RBFN classifier.

Table 1. Texture feature value.

2.1   Activation Function and Algorithm for Training

We used an RBFN, which needs a Gaussian activation function to compute its output; the response of this function is non-negative for all values of x. The function is defined as:
f(x) = exp(–x²)                  (3)
its derivative is given by
f′(x) = –2x exp(–x²) = –2x f(x)                   (4)
The radial basis function network differs from the back propagation network in its use of the Gaussian function. The training algorithm for the network is given as follows:
Step 1. Initialize the weights. (set to small random values).
Step 2. For each input, do steps 3-9.
Step 3. Each input unit (xi, i = 1,…,n) receives an input signal and transmits it to all units in the layer above (the hidden units).
Step 4. Calculate the radial basis function.
Step 5. Choose the centers for the radial basis functions. The centers are chosen from the set of input vectors. A sufficient number of centers have to be selected in order to ensure adequate sampling of the input vectors space.
Step 6. Calculate the output vi(xi) of the ith unit in the hidden layer:

$v_{i}(x_{i})=\exp\left(-\sum_{j=1}^{r}\left(x_{ji}-\hat{x}_{ji}\right)^{2}/\sigma_{i}^{2}\right)$                  (5)
where, xji = center of the ith RBF unit for the jth input variable, σi = width of the ith RBF unit, and x̂ji = jth variable of the input pattern.
Step 7. Initialize the weights in the output layer of the network to some small random values.
Step 8. Calculate the output of the neural network.

$y_{net}=\sum_{i=1}^{H}w_{im}v_{i}(x_{i})+w_{0}$                  (6)
where, H = number of hidden layer nodes (RBF units), ynet = output value of the mth node in the output layer for the nth incoming pattern, wim = weight between the ith RBF unit and the mth output node, and w0 = biasing term at the mth output node.
Step 9. Calculate error, and test stopping condition. The stopping condition may be the weight change, the number of epochs, etc.
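Steps 1-9 can be sketched as below. This is an illustrative reading, not the authors' implementation: the toy data, a single shared width σ, and a simple gradient update on the output weights are all our assumptions (the paper does not specify the update rule).

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_hidden(X, centers, sigma):
    """Step 6 / Eq. (5): Gaussian response of each hidden unit,
    exp(-sum_j (x_j - c_j)^2 / sigma^2), with one shared width here."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / sigma ** 2)

def train_rbfn(X, y, n_hidden=5, sigma=1.0, lr=0.05, epochs=200):
    # Step 5: choose the centers from the set of input vectors
    centers = X[rng.choice(len(X), size=n_hidden, replace=False)]
    # Steps 1 and 7: small random output weights and bias
    w = rng.normal(scale=0.1, size=n_hidden)
    w0 = 0.0
    for _ in range(epochs):                # Step 2: sweep over the inputs
        V = rbf_hidden(X, centers, sigma)  # Steps 3-6: hidden-layer outputs
        y_net = V @ w + w0                 # Step 8 / Eq. (6)
        err = y - y_net                    # Step 9: error
        w += lr * (V.T @ err) / len(X)     # gradient step on output weights
        w0 += lr * err.mean()
    return centers, w, w0

# Toy two-class data: label 1 when x1 + x2 > 0 (illustrative only)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
centers, w, w0 = train_rbfn(X, y)
pred = rbf_hidden(X, centers, 1.0) @ w + w0 > 0.5
```

Here the stopping condition of Step 9 is simply a fixed number of epochs; a weight-change tolerance would serve equally well.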

The experimental analysis follows the approach of Lu and Weng8 to evaluate the performance of the RBFN classifier, the approach of Fawcett3 for receiver operating characteristic analysis, and the approach of Davis and Goadrich11 to compute the precision-recall curve analysis.

3.1   Receiver Operating Characteristic Curve Analysis

The receiver operating characteristic (ROC) curve is one of the performance measures for classification. ROC curves measure predictive utility by showing the trade-off between the true-positive rate and the false-positive rate inherent in selecting specific thresholds on which predictions might be based. The area under this curve represents the probability that, given one positive case and one negative case, the classifier output will be higher for the positive case; it is not dependent on the choice of decision threshold. Othman and Basri13 describe a convenient way to display the diagnostic accuracy: sensitivity (or true-positive rate) is plotted against 1 − specificity (or false-positive rate) at all possible threshold values. Hand and Till6 discussed that the performance of each test is characterized in terms of its ability to identify true positives while rejecting false positives, with the following definitions.

• True positive (TP): lesions called cancer and prove to be cancer
• False positive (FP): lesions called cancer that prove to be benign
• False negative (FN): lesions called negative or benign and prove to be cancer
• True negative (TN): lesions that are called benign and prove to be benign

False positive fraction (FPF) = FP/(FP + TN) ; True positive fraction (TPF) = TP/(TP + FN) ;
True negative fraction (TNF) = TN/(TN + FP) ; False negative fraction (FNF) = FN/(TP + FN)
Note that because every actual positive results in either a true positive or a false negative, while every actual negative results in either a true negative or a false positive, TPF is the ratio of true positives (actually positive and reported positive) to actual positives, and TNF is the ratio of true negatives to actual negatives. Two other quantities of interest for performance characterization are defined in terms of the above quantities, as follows:

Sensitivity = TPF

Specificity = TNF = 1.0 – FPF
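These fractions follow directly from the four counts; a minimal sketch (the counts below are illustrative, not from the paper's experiments):

```python
def roc_fractions(tp, fp, fn, tn):
    """ROC fractions from raw counts; sensitivity = TPF, specificity = TNF."""
    tpf = tp / (tp + fn)      # true-positive fraction (sensitivity)
    tnf = tn / (tn + fp)      # true-negative fraction (specificity)
    fpf = fp / (fp + tn)      # false-positive fraction = 1 - specificity
    fnf = fn / (tp + fn)      # false-negative fraction = 1 - sensitivity
    return tpf, tnf, fpf, fnf

# e.g. 40 cancers called correctly, 5 missed; 50 benign correct, 10 false alarms
tpf, tnf, fpf, fnf = roc_fractions(tp=40, fp=10, fn=5, tn=50)
print(round(tpf, 3), round(tnf, 3))   # → 0.889 0.833
```

Note that TPF + FNF = 1 and TNF + FPF = 1, matching the identities in the text.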
Choosing a value of threshold c defines an 'operating point', at which the test has a particular combination of sensitivity and specificity. A plot of TPF vs FPF for all possible operating points is the ROC curve for test X, which makes explicit the trade-off between sensitivity and specificity for the test. Both TPF and FPF range from 0 to 1, so the ROC is often plotted within a unit square. It is useful to note that a test that 'guesses', that is, randomly assigns a value of true or false to each event, has a locus of operating points along the diagonal from the lower left to the upper right corner of the unit square. This 'guesswork' line is often included as a reference.
Macskassy and Provost12 discussed that a ROC curve allows us to explore the relationship between the sensitivity and specificity of a clinical test for a variety of different cut points, thus allowing the determination of an optimal cut point. To determine the presence or absence of a disease, one often has to carry out a test which provides a result on a continuous measure. From this it is necessary to decide if the disease is present or absent, so a cut point is selected: above this cut point the disease is claimed present, and below it the disease is claimed absent. Any test will make some diagnostic errors. Sensitivity is the probability of diagnosing the disease when it is actually present (the true positive rate). Specificity is the probability of identifying the disease as absent when it is truly absent (the true negative rate).
Ideally, a CAD system wants both sensitivity and specificity to be one. Unfortunately, changing the cut point to try to increase either sensitivity or specificity will usually result in a decrease in the other measure. To make the ROC graph, the X-axis is 1 minus the specificity (the false positive rate) and the Y-axis is the sensitivity (the true positive rate).
An index of the goodness of the test is the area under the curve; a perfect test has area 1.0, whilst a non-discriminating test (one which falls on the diagonal) has area 0.5. Streiner and Norman18 discuss this in more detail and provide examples.
When the different possible errors that can be made by the classifier have different 'costs', selecting the appropriate operating point on the ROC curve can maximize 'profits'. In practical application, this requires that the underlying parameters of the classifier be easily manipulated to facilitate selection of the ROC operating point. Bradley2 mentioned that the AUCs are estimated by using the trapezoid rule for the discrete operating points.
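The trapezoid-rule estimate is straightforward to compute; a minimal sketch with made-up operating points:

```python
def auc_trapezoid(fpf, tpf):
    """Area under a discrete ROC curve by the trapezoid rule.
    Operating points must be sorted by increasing false-positive fraction."""
    area = 0.0
    for i in range(1, len(fpf)):
        # area of the trapezoid between consecutive operating points
        area += (fpf[i] - fpf[i - 1]) * (tpf[i] + tpf[i - 1]) / 2.0
    return area

# The 'guesswork' diagonal scores 0.5; a perfect classifier scores 1.0.
print(auc_trapezoid([0.0, 0.5, 1.0], [0.0, 0.5, 1.0]))  # → 0.5
print(auc_trapezoid([0.0, 0.0, 1.0], [0.0, 1.0, 1.0]))  # → 1.0
```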
This type of curve fitting is generally done for medical imaging studies when operating points are obtained by presenting a reader with normal and abnormal images in random order; the reader is asked to rank each image on a discrete ordinal scale of 5 or 6 categories ranging from definitely normal to definitely abnormal. This is known as a confidence rating. The ROC points are obtained by successively considering broader and broader categories as abnormal: images rated at or above a given threshold are labeled abnormal, while any images rated below the threshold are labeled normal. Figure 3 shows the ROC curves comparing the classification performances of the proposed system. Kadam10, et al. mentioned that UT and MRI were used in the earlier system, but the proposed system uses CT brain images for classification.

Figure 3. Receiver operating characteristic analysis of the classifiers.

3.2   Precision-recall Curve Analysis

Receiver operating characteristic curves are commonly used to present results for binary decision problems in machine learning. An important difference between ROC space and precision-recall (PR) space is the visual representation of the curves; looking at PR curves can expose differences between algorithms. In PR space, one plots recall on the x-axis and precision on the y-axis. The metrics are calculated as:
Precision = TP / (TP+FP)
Recall = TP / (TP+FN)
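These two metrics follow directly from the counts defined earlier; a minimal sketch (the counts below are illustrative, not from the paper's experiments):

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from classification counts, as defined above."""
    precision = tp / (tp + fp)   # fraction of positive calls that are correct
    recall = tp / (tp + fn)      # fraction of actual positives recovered
    return precision, recall

p, r = precision_recall(tp=40, fp=10, fn=5)
print(round(p, 2), round(r, 2))   # → 0.8 0.89
```

Note that recall is identical to the true-positive fraction (sensitivity) used in ROC analysis, while precision, unlike FPF, depends on the class balance.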
Figure 4 shows the PR curves for comparison of classification performances for the proposed system and the following Table 2 shows the performance of the proposed system. Table 3 shows the result of the classification.

Figure 4. Precision-recall curve analysis of the classifiers.

Table 2. Performance of classifiers.

Table 3. Classification result of the entire CT brain image database.

Feed forward networks have been the subject of considerable research in recent years and form the basis of most present day applications. The radial basis function network is implemented to classify the images into normal and abnormal. Analysis of the proposed system with the ROC and PR curves shows that the RBFN outperforms the other classifiers.

3.3   Experimental Output

The proposed classification of CT brain images using a radial basis function neural network is developed using MATLAB® 7 (Release 14). The total framework is designed as simple, user-friendly software with a GUI, whereas the earlier system of Channin and Furst14 used decision trees and was presented as posters and demos. Initially, the user selects an image and then performs the steps of enhancement, segmentation, feature extraction and classification one by one; the results of each step can be viewed in the same window. The radiologist can use this system as a second opinion for their diagnosis. Figure 5 shows a screen shot with sample results of the CAD system.

Figure 5. Screenshots of CAD system.

The algorithm and methodology proposed will enable the easy and reliable identification of abnormalities present in the scanned region. This will allow further testing of the algorithm on the brain, as a successful implementation here would give the necessary credentials for success in other regions. In this paper, we have developed an automated brain image analysis system and have studied the literature relating to its four phases. There will always be a need to continue researching until a method is developed that classifies with 100 per cent accuracy; obviously, it is arguable whether this will ever eventuate.

However, the motivation to save human lives has inspired researchers to develop accurate and efficient methods for detection and diagnosis. The experiments using the proposed method achieved a very good classification rate of 89 per cent. A local neuro-clinic database was used to test the stability of the proposed system thoroughly. Although the RBFN has impressive capabilities, it does have limitations due to time constraints and the scope of this research: it detects and diagnoses only benign and malignant tumors and no other abnormalities or lesions. The RBFN developed is a demonstration of what a real-life diagnosis system could be. To become viable as a real-life system, the RBFN will need to overcome these limitations, and it has the potential to do so with some further research and development, as the foundation work has now been completed. Moreover, the present work will be helpful in analyzing CT brain images with improved performance in the medical sector of defence personnel.

1. Abraham & Sorwar, G. DCT-based texture classification using a soft computing approach. Malaysian J. Comp. Sci., 2004, 17(1), 13-23.

2. Bradley, A.P. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 1997, 30(7), 1145-159.

3. Fawcett, T. An introduction to ROC analysis. Pattern Recog. Letters, 2006, 27(8), 861-74.

4. Gandhi, V. & Shah, S.K. Image classification based on textural features using artificial neural network (ANN). Electronics Telecom Engg., 2004, 84, 72-77.

5. Golemati, S.; Niccolaides, A.N.; Nikita, K.S. & Stoitsis, J. A modular software system to assist interpretation of medical images–Applications to vascular ultrasound images. In the IEEE International Workshop on Imaging Systems and Techniques (IST), May 2004, pp. 135-140.

6. Hand, D.J. & Till, R.J. A simple generalization of the area under the ROC curve to multiple class classification problems. Machine Learning, 2001, 45(2), 171-86.

7. Lee, Y.; Ohkubo, M.; Sekiya, M. & Tsai, D. Medical image classification using genetic algorithm based fuzzy-logic approach. J. Electronic Imaging, 2004, 13(4), 780-88.

8. Lu, D. & Weng, Q. A survey of image classification methods and techniques for improving classification performance. Int. J. Remote Sensing, 2007, 28(5), 823-70.

9. Wang, Z.Z. & Yong, J.H. Texture analysis and classification with linear regression model based on wavelet transform. IEEE Trans. Image Proc., 2008, 17(8), 1421-430.

10. Kadam, D.B.; Gade, S.S.; Uplane, M.D. & Prasad, R.K. Neural network based brain tumor detection using MR images. Int. J. Comp. Sci. Communications, 2011, 2(2), 325-31.

11. Davis, J. & Goadrich, M. The relationship between precision-recall and ROC curves. In the ACM International Conference Proceeding 23rd International Conference on Machine learning, Pittsburgh, PA, 2006, 148, pp. 233-40.

12. Macskassy, S.A. & Provost, F. Confidence bands for ROC curves: Methods and an empirical study. In the Proceedings of the First Workshop on ROC Analysis in AI (ROCAI-2004) at ECAI-2004. Pittsburgh, PA, August 2004. pp. 61-70.

13. Othman, M.F. & Basri, M.A.M. Probabilistic neural network for brain tumor classification. In the 2nd International Conference on Intelligent Systems, Modelling and Simulation (ISMS), March 2011, pp. 136-38.

14. Channin, D.; Furst, J.D.; Lilly, L.; Limpsangsri, C.; Raicu, D.S. & Xu, D.H. Classification of tissues in computed tomography using decision trees. In the 90th Scientific Assembly and Annual Meeting of Radiology Society of North America (RSNA04), Chicago, IL, USA, 28 Nov. - 3 Dec. 2004. (Poster and Demo)

15. Xu, D.; Lee, J.; Raicu, D.S.; Furst, J.D. & Channin, D. Texture classification of normal tissues in computed tomography. In the Annual Meeting of the Society for Computer Applications in Radiology, Orlando, Florida, 2005. (Abstract)

16. Haralick, R.M. & Shapiro, L.G. Computer and robot vision. Addison Wesley Publishing Co., 1992.

17. Herz, J.; Krogh, A. & Palmer, R. Introduction to the theory of neural computation. Reading, MA: Addison-Wesley, 1991.

18. Streiner, D.L. & Norman, G.R. Health measurement scales: A practical guide to their development and use. Oxford: Oxford University Press. 2nd Ed., 1995.

Prof T. Joshva Devadas received his MTech (Computer Science and Engineering) and MSc (Computer Science). He is presently working as a Professor in the Department of Information Technology, Sethu Institute of Technology, Tamilnadu, India. He has published more than 13 research papers in national and international journals. His current research areas include: software intelligent agent-based data mining, image processing, wireless sensor networks, agent-based learning for data cleaning using machine learning algorithms, and knowledge management systems.

Dr R. Ganesan received his ME from MIT, Anna University, in 1999 and completed his PhD in 2010. He is presently working as a Professor in the Department of Electrical and Electronics Engineering at Sethu Institute of Technology, Tamilnadu, India. He has published more than 16 research papers in national and international journals. His current research areas include: neural networks, genetic algorithms, image processing, control systems, knowledge management systems and instrumentation.