LIST OF TABLES

4.1 Normal Images – Mean
4.2 Normal Images – Variance
4.3 Normal Images – Entropy
4.4 Abnormal Images – Mean
4.5 Abnormal Images – Variance
4.6 Abnormal Images – Entropy
4.7 Performance Measures without applying transforms
4.8 Performance Measures of DWT based SVM
4.9 Performance Measures of Gabor based SVM
5.1 Performance Measures – Comparison
LIST OF FIGURES

3.1 Normal Image
3.2 Abnormal Image
3.3 Three level wavelet decomposition tree
3.4 Wavelet Families
3.5 SVM principles
3.6 Linear SVM
4.1 Wavelet transform of abnormal image
4.2 Wavelet transform of normal image
4.3 Gabor sub-bands
4.4 Gabor Transform of normal image
4.5 Gabor Transform of abnormal image
LIST OF ABBREVIATIONS

MRI – Magnetic Resonance Imaging
CADe – Computer Aided Detection
CADx – Computer Aided Diagnosis
t2fs – T2 weighted FLAIR image
FLAIR – Fluid Attenuated Inversion Recovery
ANN – Artificial Neural Network
WT – Wavelet Transform
CWT – Continuous Wavelet Transform
DWT – Discrete Wavelet Transform
GT – Gabor Transform
COM – Co-Occurrence Matrix
IDM – Inverse Difference Moment
SVM – Support Vector Machine
RBF – Radial Basis Function
SLT – Statistical Learning Theory
SRM – Structural Risk Minimization
CHAPTER 1 INTRODUCTION
A physician's diagnosis is subject to several limitations. First, it is subjective: the result depends on the doctor's experience and ability. Second, it tends to miss tiny changes that the human eye cannot detect. Third, different physicians may reach different conclusions for the same medical image. Compared to physicians, computer-aided analysis has a clear advantage in avoiding such inconsistent results.
Image classification plays an important role in the fields of medical diagnosis, remote sensing, image analysis and pattern recognition. Digital image classification is the process of sorting images into a finite number of individual classes. In critical fields such as medical diagnosis, images must be classified with maximum accuracy, or misclassification may lead to incomplete treatment of the corresponding disease.
The main advantage of applying a transform to a medical image before extracting its features is the reduction of redundant data in the image. Applying transforms can also improve the accuracy and consistency of the diagnosis. First- and second-order statistics of the wavelet detail coefficients provide descriptors that can discriminate intensity properties spatially distributed throughout the image at various levels of resolution. These features are then classified using an SVM classifier, and the performance measures of the SVM classifier, such as accuracy, precision, sensitivity and specificity, are evaluated and compared with those obtained from the untransformed images.
CHAPTER 2 LITERATURE REVIEW
Image classification plays an important role in the fields of medical diagnosis, remote sensing, image analysis and pattern recognition. In critical fields like medical diagnosis, images have to be classified with maximum accuracy, or else misclassification will lead to incomplete treatment of the corresponding disease.
In order to achieve high classification accuracy, numerous methods for feature extraction and image classification have been introduced. For feature extraction, transforms such as the Wavelet, Ridgelet, Curvelet and Contourlet transforms have been used. For classification, statistical classifiers such as Feedforward Neural Networks (FNN), Feedback Neural Networks, Backpropagation Neural Networks (BPNN), Multi-Layer Perceptrons (MLP), Hybrid Hopfield Neural Networks (HHNN), Radial Basis Function Neural Networks (RBF-NN) [1] and Self-Organizing Maps (SOM), as well as kernel classifiers such as the Support Vector Machine (SVM), have been used.
One method uses the Ridgelet transform for feature extraction and classification, as explained in the paper "Ridgelet based Texture Classification of Tissues in Computed Tomography" [3] by Lindsay Semler, Lucia Dettori and Brandon Kerr of DePaul University, Chicago. The article focuses on Ridgelet-based multi-resolution texture analysis and investigates the discriminating power of several Ridgelet-based texture descriptors. Although the Ridgelet transform provided good performance, it gives good results only for texture descriptors, which are somewhat difficult to calculate. A paper titled "A Comparison of Daubechies and Gabor Wavelets for Classification of MR Images" [2] by Ulas Bagci and Li Bai of CMIAG proposes a machine learning algorithm for classifying MR images using the wavelet transform and the Gabor transform with SVM. However, 100% classification accuracy in separating normal and abnormal images was obtained only with SVM classifiers using Sigmoid and RBF kernels, which are more time consuming and less effective than an SVM classifier with a linear kernel.
An article named "A Comparison of Wavelet-based and Ridgelet-based Texture Classification of Tissues in Computed Tomography" [3] by Lindsay Semler and Lucia Dettori of DePaul University, Chicago presents a comparison of wavelet-based and Ridgelet-based algorithms for medical image classification. Tests on a large set of chest and abdomen CT images indicate that the algorithm using texture features derived from the Haar wavelet transform clearly outperforms those based on the Daubechies and Coiflet transforms. The tests also show that the Ridgelet-based algorithm is significantly more effective, but it fails to effectively recognize the point coordinates in the image that represent the tumour, and it does not provide 100% accuracy.
Another problem lies in the classification stage. Before SVM, Artificial Neural Networks (ANN) were widely used for classification of images and other applications. Many types of ANN were developed, namely Feedforward Neural Networks (FNN), Feedback Neural Networks, Backpropagation Neural Networks (BPNN), Multi-Layer Perceptrons (MLP), etc. However, all of them have their own advantages and disadvantages, and none of them provided 100% classification accuracy.
With the introduction of kernel classifiers, the SVM classifier became more common, and the Support Vector Machine (SVM) with a linear kernel provided 100% classification accuracy. A comparison of SVM and FNN was provided in the paper titled "Research on Comparison and Application of SVM and FNN Algorithm" [5] by Shaomei Yang and Qian Zhu. The paper concluded that higher recognition accuracy and faster convergence are achieved with SVM than with FNN.
Another study titled "Improved Classification of Pollen Texture Images using SVM and MLP" [4] by M. Fernandez-Delgado, P. Carrion, E. Cernadas, J.F. Galvez and Pilar Sa-Otero explored the use of more sophisticated classifiers to improve the classification stage. It shows that SVM achieved high accuracy compared to k-Nearest Neighbours (KNN) and the Multi-Layer Perceptron (MLP). KNN also requires storing the whole training set and is therefore time consuming.
CHAPTER 3 MEDICAL IMAGE CLASSIFICATION
3.1 MEDICAL IMAGES
As the project's main objective is to classify medical images, the first and foremost requirement is the input to the algorithm, i.e., medical images. Here, classification is performed on medical images obtained from Magnetic Resonance Imaging (MRI) scans. Among the different types of MRI scans, such as Diffusion MRI, Functional MRI, FLAIR (Fluid Attenuated Inversion Recovery) MRI and Interventional MRI, the images used here are FLAIR images of t2fs type.

Fluid Attenuated Inversion Recovery (FLAIR) is a pulse sequence used in magnetic resonance imaging, invented by Dr. Graeme Bydder. FLAIR can be used with both three-dimensional (3D FLAIR) and two-dimensional (2D FLAIR) imaging. The pulse sequence is an inversion recovery technique that nulls fluids. For example, it can be used in brain imaging to suppress cerebrospinal fluid (CSF) effects on the image, so as to bring out periventricular hyperintense lesions such as multiple sclerosis (MS) plaques. By carefully choosing the inversion time TI, the signal from any particular tissue can be nulled; the appropriate TI depends on the tissue's T1 via the formula TI = ln(2) × T1, i.e. a TI of roughly 70% of T1. In the case of CSF suppression, one aims for T2-weighted images.
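As a quick worked example of this formula (assuming a representative CSF T1 of about 4000 ms at 1.5 T, a typical literature value rather than a figure given in this report):

$$TI = \ln(2) \times T_1 \approx 0.693 \times 4000\ \text{ms} \approx 2772\ \text{ms},$$

which is in the range of inversion times commonly used for CSF-nulled FLAIR sequences.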
T2-weighted scans use a spin echo (SE) sequence with a long TE and a long TR. They have long been the clinical workhorse, as the spin echo sequence is less susceptible to inhomogeneities in the magnetic field. They are particularly well suited to detecting edema, as they are sensitive to water content (edema is characterized by increased water content). This project focuses on creating an algorithm for classifying brain images in particular and on detecting abnormalities in the brain such as tumours. Therefore, the images given as input to the algorithm are of two types:
1. Normal brain images (without tumour) (e.g., Fig. 3.1), and
2. Abnormal brain images (with tumour) (e.g., Fig. 3.2).

Fig. 3.1 Normal Image
Fig. 3.2 Abnormal Image
3.2 APPLYING IMAGE TRANSFORMS

3.2.1 IMAGE TRANSFORMATIONS

In many image processing applications, a key first step is applying a transform to the image. This step is usually carried out for the following reasons:
1. To reduce the redundant data in the image
2. To reduce the actual size of the image content
3. To make analysis easier
The transform of a signal is another way to represent the signal; it does not alter the information content present in the signal. Different image transforms are available, such as:

1. Cosine Transform
2. Sine Transform
3. Fourier Transform
4. Wavelet Transform
5. Ridgelet Transform
6. Curvelet Transform
7. Contourlet Transform
8. Ripplet Transform
9. Gabor Transform, etc.

Among these, the most promising are the Wavelet Transform and the Gabor Transform.
3.2.2 DISCRETE WAVELET TRANSFORM
The Wavelet Transform provides a time-frequency representation of a signal. It was developed by Morlet to overcome the shortcomings of the Short Time Fourier Transform (STFT) and the Fourier Transform (FT), and it can be used to analyze non-stationary signals. While the FT and STFT give a constant resolution at all frequencies, the Wavelet Transform uses a multi-resolution technique by which different frequencies are analyzed with different resolutions, with a coherence time proportional to the period of the signal.

Also, the Fourier transform does not provide any information about where within the signal certain frequencies occur, i.e., information about the time domain is lost. Since most real-world signals change with time, it is especially useful to characterize signals in both the time and frequency domains simultaneously. The wavelet transform is used for this reason.
The Wavelet Transform (WT) decomposes a signal into a linear sum of basis functions, which are dilated and translated wavelets. A wavelet is a function $\psi \in L^2(\mathbb{R})$ with zero average, normalized to one and centered in the neighbourhood of $t = 0$. A family of time-frequency components is obtained by translating it by $u$ and scaling it by $s$:

$$\psi_{u,s}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right)$$

Convolution of this wavelet function with the image gives the wavelet transform of the image f(x,y). The wavelet transform of the image can be analysed with respect to both the time and frequency domains; hence, the wavelet transform provides information about both domains at various resolutions. The Discrete Wavelet Transform converts a discrete-time signal into wavelets and can be expressed as

$$f(t) = \sum_{m}\sum_{n} d_m(n)\,\psi_{mn}(t),$$

where $d_m(n) = \langle f, \tilde{\psi}_{mn} \rangle$, and $\{\psi_{mn}\}$ and $\{\tilde{\psi}_{mn}\}$ are bases for $L^2(\mathbb{R})$ satisfying the biorthogonality condition

$$\langle \psi_{mn}, \tilde{\psi}_{lk} \rangle = \delta_{ml}\,\delta_{nk}.$$

In the orthogonal case, $\psi_{mn}$ can be used for both the synthesis and the analysis of wavelets.
The two-dimensional wavelet transform is carried out by applying the one-dimensional wavelet transform to the rows and columns of the two-dimensional data consecutively. It is computed by successive lowpass and highpass filtering of the discrete time-domain signal; this is called the Mallat algorithm or Mallat-tree decomposition. In figure 3.3, the signal is denoted by the sequence x[n], where n is an integer. The low pass filter is denoted by G0 and the high pass filter by H0. At each level, the high pass filter produces the detail information d[n], while the low pass filter associated with the scaling function produces the coarse approximation a[n].
Generally, there are three detail components in images, namely the horizontal, vertical and diagonal components. These three components contain all the detailed information about the high-frequency content of the image, so image features are extracted from them rather than from the approximation components (a code sketch follows figure 3.3 below).
Fig 3.3 Three level wavelet decomposition tree
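As an illustration of this decomposition, here is a minimal Python sketch using the PyWavelets library (the library choice and the stand-in image are assumptions of this sketch, not tools named in the report):

```python
import numpy as np
import pywt  # PyWavelets; assumed library choice

# Stand-in 8-bit grayscale image; in practice this would be an MR slice.
image = np.random.randint(0, 256, size=(256, 256)).astype(np.float64)

# One level of the 2D DWT with the Haar mother wavelet: cA is the coarse
# approximation a[n]; cH, cV, cD are the horizontal, vertical and diagonal
# detail sub-bands d[n] described above.
cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')

# A three-level Mallat-tree decomposition, as in figure 3.3.
coeffs = pywt.wavedec2(image, 'haar', level=3)
cA3 = coeffs[0]           # coarsest approximation
details = coeffs[1:]      # (cH, cV, cD) tuples, from coarse to fine
print(cA3.shape, [d[0].shape for d in details])
```

Features such as the mean, variance and entropy reported in chapter 4 would then be computed per detail sub-band.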
The advantages of the DWT are:
1. It is easy to implement.
2. It reduces the computation time.
3. The resources required to compute the DWT coefficients are small.
4. It needs only O(N) operations to compute; hence it is also referred to as the fast wavelet transform.
There are a number of basis functions that can be used as the mother wavelet for the wavelet transform. Since the mother wavelet produces all the wavelet functions used in the transform through translation and scaling, it determines the characteristics of the resulting wavelet transform. The families include the Haar wavelet, Daubechies wavelet, Symlet, Coiflet, biorthogonal wavelet, Morlet, Meyer wavelet and Mexican Hat wavelet, as given in fig 3.4.
Fig 3.4 Wavelet Families (a)Haar (b)Daubechies (c)Coiflet1 (d)Symlet (e)Meyer (f)Morlet (g)Mexican Hat
The Haar wavelet is the oldest and simplest wavelet available, and it is used here as the mother wavelet for classifying medical images, since it is well suited to representing the image in the time domain.
3.2.3 GABOR TRANSFORM
Gabor transforms are widely used in image analysis and computer vision. The Gabor transform provides an effective way to analyze images and has been elaborated as a framework for understanding the orientation- and spatial-frequency-selective properties of signals; it appears to be a good approximation to visual sensitivity profiles. Its important advantages are infinite smoothness and exponential decay in frequency. The Gabor transform represents an approach to characterizing a time function in terms of time and frequency simultaneously.
A joint space-frequency analysis of a signal cannot be performed by the Fourier transform, but it can easily be performed by the Short Time Fourier Transform (STFT). The STFT of a signal s(t) is given by

$$\mathrm{STFT}(\tau, \omega) = \int s(t)\, g(t-\tau)\, e^{-j\omega t}\, dt$$

Thus, the STFT of a signal is defined as the Fourier transform of the signal s(t) windowed by the function g(t−τ). The STFT with a Gaussian window is called a Gabor transform.
The Gabor transform can be regarded as a signal being convolved with a filter bank whose impulse response in the time domain is a Gaussian modulated by sine and cosine waves. The characteristics of the Gabor wavelets, especially their frequency and orientation representations, are similar to those of the human visual system, and they have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, the 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave; a standard normalized form consistent with the parameters below is

$$\psi(x, y) = \frac{f^2}{\pi\,\gamma\,\eta}\, \exp\!\left(-\alpha^2 x'^2 - \beta^2 y'^2\right) \exp\!\left(j\, 2\pi f x'\right),$$
$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta,$$

where f is the central frequency of the sinusoidal plane wave, θ is the anti-clockwise rotation of the Gaussian and the plane wave, α is the sharpness of the Gaussian along the major axis parallel to the wave, and β is the sharpness of the Gaussian along the minor axis perpendicular to the wave. The ratios γ = f/α and η = f/β are defined to keep the ratio between frequency and sharpness constant.
The Gabor filters are self-similar: all filters can be generated from one mother wavelet by dilation and rotation. Each filter has the shape of a plane wave with frequency f, restricted by a Gaussian envelope function with relative widths α and β. To extract useful features from an image, a set of Gabor filters with different frequencies and orientations is normally required.

The Gabor representation of an MR brain image M(x) is obtained by convolving the image with the family of Gabor filters:

$$G_{u,v}(x) = M(x) * \psi_{u,v}(x),$$

where $G_{u,v}$ denotes the resulting Gabor representation of the image M at orientation u and scale v. The resulting Gabor feature set thus consists of the convolution results of the input image M(x) with all of the Gabor wavelets.
The number of Gabor wavelets used varies with the application. Better results are obtained if 5 scales and 8 orientations are used for the classification of MR brain images. The extracted feature vectors are then concatenated to construct a new feature vector for classification purposes. In Figure 4.3, an MR brain image is convolved with 40 Gabor filters (8 orientations, 5 scales); each row and column shows a different scale and orientation respectively.
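A minimal sketch of such a 40-filter bank (5 scales, 8 orientations) using scikit-image, which is an assumed tool choice here rather than one stated in the report; the scale-to-frequency mapping is likewise illustrative:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import gabor_kernel  # assumed library choice

def gabor_feature_vector(image, n_scales=5, n_orientations=8):
    """Convolve an image with a 5x8 Gabor filter bank and return one
    feature value (mean magnitude) per sub-band, 40 values in all."""
    features = []
    for v in range(n_scales):
        frequency = 0.25 / (2 ** v)      # illustrative dyadic scale spacing
        for u in range(n_orientations):
            theta = u * np.pi / n_orientations
            kernel = gabor_kernel(frequency, theta=theta)
            real = ndi.convolve(image, np.real(kernel), mode='wrap')
            imag = ndi.convolve(image, np.imag(kernel), mode='wrap')
            features.append(np.mean(np.hypot(real, imag)))
    return np.array(features)

image = np.random.rand(128, 128)          # stand-in for an MR slice
print(gabor_feature_vector(image).shape)  # -> (40,)
```

In the report's pipeline, only the first sub-band (scale 1, orientation 1) turns out to be needed for the best results, as shown in chapter 4.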
3.3 FEATURE EXTRACTION
Feature extraction is an important step in image classification. It refers to extracting various characteristics of the image, either transformed or untransformed. Numerous features are available in the image for extraction. Feature extraction includes:
1. Gray scale (intensity) feature extraction
2. Shape feature extraction
3. Texture feature extraction
3.3.1 GRAY SCALE FEATURES
In digital picture processing, a two-dimensional digitized gray scale image (M×N) can be seen as M×N pixels in the two-dimensional plane XOY, where each pixel f(x,y) is expressed by its gray value.
Mean
Mean or average value represents the arithmetic mean or average of all pixels in the image f(x,y).
Variance

Variance is the second central moment, i.e. the second moment about the mean. It reflects how widely the gray scale values are spread about the mean.
Skewness

Skewness is the third standardized moment about the mean. It measures the degree of asymmetry of the data distribution about the mean.
Kurtosis

Kurtosis is the fourth standardized moment about the mean. It reflects the sharpness or flatness of the data distribution compared with a normal distribution.
Histogram

The gray histogram is a function of the gray scale value; it describes the fraction of pixels in the picture that have each gray scale value:

$$h(b) = \frac{n_b}{n},$$

where $n_b$ is the number of pixels whose gray scale value is b, and n is the total number of pixels in the picture.
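A small Python sketch of these first-order statistics (numpy/scipy only; the function name and stand-in image are mine, not from the report):

```python
import numpy as np
from scipy import stats

def gray_scale_features(image):
    """First-order (histogram-based) statistics of an 8-bit grayscale image."""
    pixels = image.astype(np.float64).ravel()
    hist = np.bincount(image.ravel(), minlength=256) / pixels.size  # h(b) = n_b / n
    nonzero = hist[hist > 0]
    return {
        'mean': pixels.mean(),
        'variance': pixels.var(),            # second central moment
        'skewness': stats.skew(pixels),      # third standardized moment
        'kurtosis': stats.kurtosis(pixels),  # fourth standardized moment
        'energy': np.sum(hist ** 2),
        'entropy': -np.sum(nonzero * np.log2(nonzero)),
    }

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(gray_scale_features(image))
```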
The features derived from the gray scale histogram are mainly the gray scale mean, gray scale variance, gray skewness, gray kurtosis, gray energy and gray entropy.

3.3.2 SHAPE FEATURES
The shape features are invariant to scaling, rotation and translation, so they can serve as object recognition features. They include the Normalized Moment of Inertia (NMI), moment invariants and sphericity.
Shape features can identify a figure well. However, as the exterior shapes of brain images are all nearly elliptical, the image cannot be distinguished by extracting shape features from the whole image. Considering the place and character of the particular pathological changes, we select the region of interest (ROI) and extract its features.
3.3.3 TEXTURE FEATURES
A co-occurrence matrix (COM) is a square matrix whose elements correspond to the relative frequency of occurrence p(i, j, d, h) of two pixel values, one with intensity i and the other with intensity j, separated by a certain distance d in a given direction h. A COM is therefore a square matrix whose dimension equals the number of gray levels in the image. The COM is scale invariant. The matrices present the relative frequency distributions of gray levels and describe how often one gray level appears in a specified spatial relationship to another gray level within the image.
The matrix is normalized by the following function:

$$p(i, j) = \frac{P(i, j)}{R},$$

where P(i, j) is the raw co-occurrence count (notation assumed here) and R is the normalizing constant, usually set to the sum of all matrix entries.
Energy

Energy is also called the Angular Second Moment. It is a measure of the homogeneity of the image and can be calculated from the normalized COM. It is a suitable measure for detecting disorder in a texture image. Higher values mean that the amplitude or intensity changes little across the image, resulting in a much sparser COM.
Contrast

Contrast is a measure of the amount of local variation in the image. It represents the degree of legibility of the image: a higher contrast value indicates a high amount of local variation, so the higher the contrast, the clearer the image.
IDM

The Inverse Difference Moment (IDM) reflects local texture changes. It is another feature related to image contrast.
Entropy

Entropy is a statistical measure of randomness that can be used to characterize the texture of the input image. The entropy of the image reflects the irregularity and complexity of its gray level distribution.
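A sketch of these co-occurrence features using scikit-image's GLCM utilities (an assumed tool choice; the entropy is computed by hand since graycoprops does not provide it, and skimage's "homogeneity" is one common IDM definition):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # assumed library choice

def texture_features(image, distance=1, angle=0.0):
    """COM-based texture features of an 8-bit grayscale image."""
    glcm = graycomatrix(image, distances=[distance], angles=[angle],
                        levels=256, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]               # the normalized COM p(i, j)
    nonzero = p[p > 0]
    return {
        'energy': graycoprops(glcm, 'ASM')[0, 0],        # angular second moment
        'contrast': graycoprops(glcm, 'contrast')[0, 0],
        'idm': graycoprops(glcm, 'homogeneity')[0, 0],   # one common IDM form
        'entropy': -np.sum(nonzero * np.log2(nonzero)),
    }

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(texture_features(image))
```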
Three important features were considered here: two gray scale features, namely the mean and variance, and one texture feature, namely entropy. These are calculated and used to create the datasets for SVM classification.
3.4 DATASET FORMATION
Prior to classification by SVM, the extracted features must be arranged into a dataset. All images used for classification were obtained from a hospital. There are 38 images in total, of which 7 were obtained from healthy persons and the remaining 31 from persons suffering from a brain tumour.

As classification by SVM consists of two phases, namely a training phase and a testing phase, the images are divided between them as follows (a sketch of this dataset assembly is given below):

For the training phase: 11 abnormal images and 3 normal images are used.
For the testing phase: 20 abnormal images and 4 normal images are used.
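A minimal sketch of assembling such a dataset (the three-column mean/variance/entropy layout follows the report, while the random stand-in features, label convention and array names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in feature rows (mean, variance, entropy) per image; in the real
# pipeline these come from the feature-extraction step described above.
def fake_features(n_images):
    return rng.random((n_images, 3))

# Training set: 11 abnormal + 3 normal images (labels: +1 abnormal, -1 normal).
X_train = np.vstack([fake_features(11), fake_features(3)])
y_train = np.array([1] * 11 + [-1] * 3)

# Testing set: 20 abnormal + 4 normal images.
X_test = np.vstack([fake_features(20), fake_features(4)])
y_test = np.array([1] * 20 + [-1] * 4)
print(X_train.shape, X_test.shape)  # (14, 3) (24, 3)
```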
3.5 CLASSIFICATION BY SVM

3.5.1 STATISTICAL LEARNING THEORY
Learning can be thought of as inferring regularities from a set of training examples, and various learning algorithms allow the extraction of such regularities. If learning has been successful, these intrinsic regularities will be captured in the values of some parameters of a learning machine. Geometry provides a very intuitive background for understanding and solving many problems in machine learning. Statistical learning theory addresses a key question that arises when constructing predictive models from data: how to decide whether a particular model is adequate or whether a different model would produce better predictions. Whereas classical statistics typically assumes that the form of the correct model is known and the objective is to estimate its parameters, statistical learning theory presumes that the correct form is completely unknown, and the goal is to select the best model from a set of candidate mathematical forms, none of which need be correct. The theory provides a sound statistical basis for assessing model adequacy under these circumstances, which are precisely the circumstances encountered in machine learning, pattern recognition and exploratory data analysis. Estimating the performance of competing models is the central issue in statistical learning theory. Performance is measured through loss functions: the loss Q(z, α) between a data vector z and a specific model α (one with values assigned to all parameters) is a score that indicates how well α performs on z, with lower scores indicating better performance. Statistical Learning Theory forms the basis of classification by the Support Vector Machine (SVM).
3.5.2 SUPPORT VECTOR MACHINE
Support Vector Machines (SVMs) are a comparatively new supervised classification technique with its basis in Statistical Learning Theory. Following successes in machine vision fields such as character, handwritten-digit and text recognition, there has been increased interest in their application to image classification. SVMs are non-parametric, which gives them robustness comparable to Artificial Neural Networks and other non-parametric classifiers. The purpose of using SVM is to obtain acceptable results quickly and easily.
The Support Vector Machine is a theoretically well-founded machine learning method with strong results in the classification of high-dimensional datasets. A classification task usually involves training and testing data consisting of data instances. Each instance in the training set contains one "target value" (class label) and several "attributes" (features). The goal of SVM is to produce a model which predicts the target values of the data instances in the testing set, given only their attributes.
SVM functions by nonlinear projection of the training data from the input space to a feature space of higher (possibly infinite) dimension by means of a kernel function φ. SVM then identifies the linear separating hyperplane with the maximal margin in this higher-dimensional space. This enables the classification of datasets which are usually not linearly separable in the input space. In many instances, classification in high-dimensional feature spaces leads to overfitting; in SVMs, however, overfitting is controlled through the principle of structural risk minimization. The empirical risk of misclassification is minimized by maximizing the margin between the data points and the decision boundary. In practice this criterion is softened to the minimization of a cost factor involving both the complexity of the classifier and the degree to which marginal points are misclassified. The tradeoff between these factors is managed through a margin-of-error parameter that is tuned through cross-validation procedures.
The SVM paradigm in machine learning presents several advantages over other approaches:
1. Uniqueness of the solution (it is guaranteed to be the global minimum of the corresponding optimization problem),
2. Good generalization properties of the solution,
3. A rigorous theoretical foundation based on SLT and optimization theory,
4. A common formulation for the class-separable and class-non-separable problems, as well as for linear and non-linear problems (through the kernel trick), and
5. A clear geometric intuition of the classification problem.
Due to these very attractive properties, SVMs have been successfully used in a number of applications. SVM allows the construction of various learning machines by the choice of different dot products (kernels); thus the influence of the set of functions that can be implemented by a specific learning machine can be studied in a unified framework. SVM builds support vectors on the results of statistical learning theory, namely on the Structural Risk Minimization principle, which guarantees high generalization ability. There is reason to believe that the decision rules constructed by the support vector algorithm reflect the regularities of the data rather than the limitations of the learning machine.
Fig 3.5 SVM principle
An SVM finds the best separating (maximal margin) hyperplane between two classes of training samples in the feature space, which is in line with optimizing bounds on the generalization error. The playground for SVM is the feature space H, a Reproducing Kernel Hilbert Space (RKHS), where the mapped patterns reside (Φ: X → H). It is not necessary to know the mapping Φ itself analytically, but only its kernel, i.e., the value of the inner products of the mappings of all the samples: K(x1, x2) = ⟨Φ(x1), Φ(x2)⟩ for all x1, x2 ∈ X. Through the "kernel trick", it is possible to transform a nonlinear classification problem into a linear one in a higher (possibly infinite) dimensional space H. Once the patterns are mapped into the feature space, and provided that the problem is separable for the given model (kernel), the target of the classification task is to find the maximal margin hyperplane.
The principle of Structural Risk Minimization (SRM) is an inductive principle used in machine learning. Commonly, a generalized model must be selected from a finite data set, with the consequent problem of overfitting: the model becomes too strongly tailored to the particularities of the training set and generalizes poorly to new data. The SRM principle addresses this problem by balancing the model's complexity against its success at fitting the training data. The functions used to project the data from the input space to the feature space are called kernels:

$$K(x_i, x_j) \equiv \varphi(x_i)^{T} \varphi(x_j)$$

is called the kernel function.
3.5.3 KERNEL CLASSIFIERS
Depending on the kernel function used, SVM classifiers are divided into different types, namely:
1. Linear SVM
2. Polynomial SVM
3. Radial Basis Function (RBF) SVM
4. Sigmoid SVM

The four basic kernel functions are:

- Linear: $K(x_i, x_j) = x_i^{T} x_j$
- Polynomial: $K(x_i, x_j) = (\gamma\, x_i^{T} x_j + r)^{d}$, $\gamma > 0$
- Radial basis function (RBF): $K(x_i, x_j) = \exp(-\gamma\, \lVert x_i - x_j \rVert^{2})$, $\gamma > 0$
- Sigmoid: $K(x_i, x_j) = \tanh(\gamma\, x_i^{T} x_j + r)$

Here γ, r and d are kernel parameters.
In this project, a linear SVM classifier is used for the classification of MR brain images.
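In scikit-learn terms (an assumed implementation choice; the report itself cites LIBSVM [12], on which scikit-learn's SVC is built), the four kernels map directly onto the `kernel` parameter, reusing the X_train/y_train arrays from the dataset sketch in section 3.4:

```python
from sklearn.svm import SVC                      # assumed implementation choice
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One classifier per kernel type listed above; gamma, degree and coef0
# correspond to the kernel parameters gamma, d and r respectively.
classifiers = {
    'linear':     SVC(kernel='linear'),
    'polynomial': SVC(kernel='poly', degree=3, gamma=1.0, coef0=0.0),
    'rbf':        SVC(kernel='rbf', gamma=1.0),
    'sigmoid':    SVC(kernel='sigmoid', gamma=1.0, coef0=0.0),
}

# Features on very different scales (e.g. mean vs. entropy) benefit from
# standardization before the SVM.
model = make_pipeline(StandardScaler(), classifiers['linear'])
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```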
3.5.4 LINEAR SUPPORT VECTOR MACHINE
The task of image classification maps onto the following problem. Consider the problem of classifying m points in the n-dimensional real space R^n, represented by the m×n matrix A, according to the membership of each point A_i in the class A+ or A−, as specified by a given m×m diagonal matrix D with plus ones or minus ones along its diagonal. For this problem, the standard support vector machine with a linear kernel is given by the following quadratic program with parameter ν > 0, written here in the standard soft-margin form with slack vector y and e a vector of ones:

$$\min_{\omega,\, b,\, y}\; \nu\, e^{T} y + \tfrac{1}{2}\,\omega^{T}\omega \quad \text{subject to} \quad D(A\omega + e b) + y \ge e,\quad y \ge 0.$$

Here ω is the normal to the bounding planes and b determines their location relative to the origin. The plane ω^T x + b = +1 bounds the class A+ points, possibly with some error, and the plane ω^T x + b = −1 bounds the class A− points, also possibly with some error. The linear separating surface is the plane ω^T x + b = 0, which lies midway between the bounding planes.
Fig.3.6 Linear SVM
3.6 PERFORMANCE OF SVM
The performance of the linear Support Vector Machine (SVM) classifier must be measured to determine how well it classifies the MR images. This is done by calculating the performance measures of the classifier, which are:

1. Accuracy / classification rate / efficiency
2. Precision
3. Sensitivity / recall
4. Specificity
Accuracy

Accuracy is the probability that the diagnostic test is performed correctly, i.e., the fraction of all cases that are classified correctly.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision

Precision is the probability that a case classified as positive is truly positive, i.e., the fraction of positive classifications that are correct.

Precision = TP / (TP + FP)

Sensitivity

Sensitivity (true positive fraction, or recall) is the probability that a diagnostic test is positive, given that the person has the disease.

Sensitivity = TP / (TP + FN)

Specificity

Specificity (true negative fraction) is the probability that a diagnostic test is negative, given that the person does not have the disease.

Specificity = TN / (TN + FP)

where:
TP (True Positives): correctly classified positive cases,
TN (True Negatives): correctly classified negative cases,
FP (False Positives): incorrectly classified negative cases, and
FN (False Negatives): incorrectly classified positive cases.
CHAPTER 4 RESULTS
4.1 RESULTS FOR TRANSFORMS
The Discrete Wavelet transform with Haar function and Gabor transform were applied to the MR images, both normal and abnormal. The following results were obtained.
4.1.1 DISCRETE WAVELET TRANSFORM
Fig.4.1 Wavelet transform of abnormal image
Fig.4.2 Wavelet transform of normal image
4.1.2 GABOR TRANSFORM
Fig.4.3 Gabor sub-bands
Fig.4.4 Gabor Transform of normal image
Fig.4.5 Gabor Transform of abnormal image
4.2 RESULTS FOR FEATURE EXTRACTION
The feature extraction results, i.e. the mean, variance and entropy values for both normal and abnormal images, without and with applying the wavelet and Gabor transforms, are given below.
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 60.74703 | 0.037695 | 0.093213 | 0.000317 | 0.251645 |
| Image 2 | 56.40564 | 0.046558 | 0.091089 | 0.001172 | 0.235427 |
| Image 3 | 56.32641 | 0.051931 | 0.059011 | 0.005503 | 0.286015 |
| Image 4 | 58.95343 | 0.049337 | 0.066793 | 0.001343 | 0.287435 |
| Image 5 | 68.64449 | 0.056753 | 0.104955 | 0.00174 | 0.339758 |
| Image 6 | 70.68972 | 0.062788 | 0.102326 | 0.000313 | 0.323999 |
| Image 7 | 56.61366 | 0.046875 | 0.094604 | 0.000342 | 0.271618 |

Table 4.1 Normal Images – Mean
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 2891.725 | 62.92573 | 50.4192 | 2.032886 | 0.044661 |
| Image 2 | 2369.369 | 57.73664 | 45.22733 | 1.953733 | 0.040062 |
| Image 3 | 2861.059 | 69.58309 | 67.54331 | 3.643807 | 0.068496 |
| Image 4 | 3059.756 | 69.86778 | 71.42149 | 3.549834 | 0.061923 |
| Image 5 | 4193.412 | 81.61674 | 73.89571 | 3.67788 | 0.087613 |
| Image 6 | 4173.711 | 80.50006 | 73.31604 | 3.333723 | 0.074757 |
| Image 7 | 2005.429 | 54.35772 | 40.09178 | 2.547767 | 0.049764 |

Table 4.2 Normal Images – Variance

| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 0.04601 | 1.22917 | 1.348253 | 1.525145 | 7.067558 |
| Image 2 | 0.046103 | 1.243307 | 1.3554 | 1.528097 | 6.928274 |
| Image 3 | 0.042084 | 1.235241 | 1.299876 | 1.497687 | 7.117588 |
| Image 4 | 0.042005 | 1.224813 | 1.29799 | 1.492563 | 7.132964 |
| Image 5 | 0.046667 | 1.190325 | 1.259527 | 1.465836 | 7.210374 |
| Image 6 | 0.046762 | 1.191948 | 1.25965 | 1.474173 | 7.186631 |
| Image 7 | 0.04601 | 1.216871 | 1.287355 | 1.49454 | 7.067558 |

Table 4.3 Normal Images – Entropy
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 44.6699 | 0.043616 | 0.071741 | 0.000916 | 0.275158 |
| Image 2 | 36.6242 | 0.053906 | 0.077075 | 7.32E-05 | 0.28023 |
| Image 3 | 57.9026 | 0.045923 | 0.083789 | 0.002222 | 0.247304 |
| Image 4 | 55.8079 | 0.038306 | 0.077734 | 0.001074 | 0.263427 |
| Image 5 | 51.7139 | 0.044006 | 0.077087 | 0.000183 | 0.276598 |
| Image 6 | 55.1247 | 0.046 | 0.067322 | 0.000346 | 0.267245 |
| Image 7 | 54.1993 | 0.052643 | 0.061106 | 0.001516 | 0.28876 |
| Image 8 | 53.3235 | 0.047933 | 0.073222 | 0.000732 | 0.312288 |
| Image 9 | 49.8209 | 0.052948 | 0.073639 | 0.00294 | 0.332287 |
| Image 10 | 56.3098 | 0.04066 | 0.058034 | 0.001801 | 0.285401 |
| Image 11 | 57.4865 | 0.046356 | 0.065684 | 0.002879 | 0.27992 |
| Image 12 | 76.994 | 0.057016 | 0.104016 | 0.002879 | 0.325076 |
| Image 13 | 75.3437 | 0.052534 | 0.099584 | 0.000225 | 0.305669 |
| Image 14 | 75.7451 | 0.042781 | 0.098295 | 0.001991 | 0.30544 |
| Image 15 | 72.9491 | 0.048465 | 0.09534 | 0.000563 | 0.294071 |
| Image 16 | 72.0666 | 0.045297 | 0.108248 | 0.001828 | 0.292423 |
| Image 17 | 67.9041 | 0.050343 | 0.088654 | 0.000814 | 0.280234 |
| Image 18 | 63.5098 | 0.0573 | 0.089111 | 0.000464 | 0.288493 |
| Image 19 | 67.2230 | 0.047266 | 0.088062 | 0.000537 | 0.300335 |
| Image 20 | 65.9541 | 0.045044 | 0.094971 | 0.004272 | 0.294079 |
| Image 21 | 66.4657 | 0.054529 | 0.095325 | 8.54E-05 | 0.293742 |
| Image 22 | 62.9119 | 0.055029 | 0.093604 | 0.001172 | 0.282132 |
| Image 23 | 66.3750 | 0.05575 | 0.097644 | 0.001379 | 0.301768 |
| Image 24 | 68.7174 | 0.057751 | 0.096448 | 0.004602 | 0.319182 |
| Image 25 | 36.8495 | 0.001372 | 0.008499 | 0.001082 | 0.151476 |
| Image 26 | 36.5772 | 0.002877 | 0.015546 | 0.002349 | 0.146255 |
| Image 27 | 36.4418 | 0.003194 | 0.010505 | 0.002323 | 0.135219 |
| Image 28 | 35.7769 | 0.005754 | 0.013725 | 0.001056 | 0.117068 |
| Image 29 | 34.1989 | 0.00293 | 0.004408 | 0.001795 | 0.132999 |
| Image 30 | 35.6062 | 0.006202 | 0.001874 | 0.00161 | 0.13789 |
| Image 31 | 36.5106 | 0.003946 | 0.004025 | 0.000699 | 0.155908 |

Table 4.4 Abnormal Images – Mean
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 3074.957 | 45.2334 | 23.89255 | 1.681724 | 0.226409 |
| Image 2 | 2216.829 | 33.04906 | 16.64941 | 1.577053 | 0.22223 |
| Image 3 | 2919.207 | 58.86982 | 44.82321 | 1.835159 | 0.203457 |
| Image 4 | 3453.95 | 57.30016 | 37.80715 | 1.713261 | 0.216925 |
| Image 5 | 3469.815 | 53.62047 | 27.9559 | 1.799635 | 0.228751 |
| Image 6 | 3096.86 | 54.13341 | 41.61895 | 2.36158 | 0.221209 |
| Image 7 | 3280.115 | 59.26085 | 40.41683 | 2.638464 | 0.24328 |
| Image 8 | 3468.613 | 59.65723 | 39.96407 | 3.090099 | 0.267676 |
| Image 9 | 3396.621 | 60.73324 | 39.6609 | 3.361088 | 0.301476 |
| Image 10 | 3013.183 | 67.5673 | 68.50105 | 3.282204 | 0.249265 |
| Image 11 | 3262.737 | 62.39801 | 56.1362 | 2.669302 | 0.244138 |
| Image 12 | 4768.797 | 80.14766 | 70.45372 | 2.669302 | 0.275816 |
| Image 13 | 4533.509 | 71.62744 | 60.21357 | 3.232478 | 0.262391 |
| Image 14 | 4685.575 | 78.90971 | 63.72544 | 3.317919 | 0.268795 |
| Image 15 | 4513.721 | 72.547 | 62.16007 | 3.181601 | 0.256495 |
| Image 16 | 4579.436 | 69.05656 | 62.94815 | 3.01611 | 0.24539 |
| Image 17 | 4306.119 | 60.29156 | 54.93299 | 2.56416 | 0.229173 |
| Image 18 | 2904.708 | 70.198 | 55.69006 | 2.925376 | 0.242471 |
| Image 19 | 3307.889 | 76.85706 | 64.99517 | 3.422935 | 0.250287 |
| Image 20 | 3327.111 | 68.24504 | 65.98338 | 3.216113 | 0.239318 |
| Image 21 | 3560.139 | 61.61814 | 62.5934 | 2.89766 | 0.235636 |
| Image 22 | 3186.652 | 51.54247 | 46.8791 | 2.492735 | 0.222 |
| Image 23 | 2802.463 | 66.34095 | 48.72643 | 2.943655 | 0.249485 |
| Image 24 | 3172.501 | 79.34072 | 60.32123 | 3.514329 | 0.26923 |
| Image 25 | 2641.164 | 75.10117 | 52.95186 | 3.766885 | 0.323131 |
| Image 26 | 2780.292 | 72.00747 | 51.93181 | 3.444566 | 0.302872 |
| Image 27 | 2851.483 | 63.2016 | 46.25931 | 2.906308 | 0.278063 |
| Image 28 | 2927.764 | 45.76717 | 35.66286 | 2.058119 | 0.233298 |
| Image 29 | 2283.569 | 57.25194 | 41.55681 | 2.828104 | 0.265601 |
| Image 30 | 2417.582 | 66.79382 | 48.52081 | 3.271357 | 0.288454 |
| Image 31 | 2511.656 | 83.11073 | 52.94149 | 3.973229 | 0.33414 |

Table 4.5 Abnormal Images – Variance
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 0.04601 | 1.22975 | 1.360388 | 1.518533 | 7.074133 |
| Image 2 | 0.046057 | 1.229486 | 1.354789 | 1.522715 | 7.102217 |
| Image 3 | 0.046057 | 1.234505 | 1.358728 | 1.52438 | 6.980089 |
| Image 4 | 0.046057 | 1.231195 | 1.347056 | 1.531343 | 7.038835 |
| Image 5 | 0.04601 | 1.222984 | 1.343129 | 1.525853 | 7.08164 |
| Image 6 | 0.042044 | 1.237614 | 1.314423 | 1.506982 | 7.080376 |
| Image 7 | 0.042044 | 1.224281 | 1.3119 | 1.49547 | 7.149141 |
| Image 8 | 0.042163 | 1.220224 | 1.287376 | 1.482197 | 7.21587 |
| Image 9 | 0.042084 | 1.217412 | 1.29642 | 1.48085 | 7.249478 |
| Image 10 | 0.042044 | 1.233267 | 1.305855 | 1.500817 | 7.13097 |
| Image 11 | 0.042005 | 1.233912 | 1.308709 | 1.509876 | 7.122963 |
| Image 12 | 0.046667 | 1.183855 | 1.259021 | 1.474565 | 7.178054 |
| Image 13 | 0.04662 | 1.196428 | 1.283083 | 1.485851 | 7.143073 |
| Image 14 | 0.04662 | 1.20201 | 1.280835 | 1.488458 | 7.133883 |
| Image 15 | 0.046667 | 1.214072 | 1.288439 | 1.495205 | 7.099794 |
| Image 16 | 0.046667 | 1.2072 | 1.29658 | 1.498235 | 7.120152 |
| Image 17 | 0.046715 | 1.215082 | 1.306171 | 1.510551 | 7.086348 |
| Image 18 | 0.04601 | 1.207481 | 1.274256 | 1.485508 | 7.113188 |
| Image 19 | 0.046057 | 1.198508 | 1.272535 | 1.470825 | 7.146436 |
| Image 20 | 0.04601 | 1.200482 | 1.270475 | 1.489269 | 7.136099 |
| Image 21 | 0.04601 | 1.198974 | 1.27809 | 1.487759 | 7.142772 |
| Image 22 | 0.04615 | 1.208821 | 1.28574 | 1.49064 | 7.142772 |
| Image 23 | 0.046103 | 1.198707 | 1.271244 | 1.485709 | 7.148212 |
| Image 24 | 0.04601 | 1.190959 | 1.257168 | 1.466409 | 7.185435 |
| Image 25 | 0.969726 | 0.798299 | 0.804175 | 0.877173 | 3.950262 |
| Image 26 | 0.962768 | 0.777796 | 0.791559 | 0.859575 | 3.90922 |
| Image 27 | 0.958183 | 0.769407 | 0.783881 | 0.857869 | 3.824544 |
| Image 28 | 0.947184 | 0.764631 | 0.771277 | 0.837862 | 3.692486 |
| Image 29 | 0.972916 | 0.802867 | 0.822998 | 0.89641 | 3.939474 |
| Image 30 | 0.971439 | 0.805303 | 0.81776 | 0.886377 | 3.95897 |
| Image 31 | 0.973666 | 0.812416 | 0.813293 | 0.888803 | 3.988039 |

Table 4.6 Abnormal Images – Entropy
4.3 RESULTS FOR SVM PERFORMANCE MEASUREMENT
After the feature extraction of normal and abnormal images, the extracted results are classified using the SVM classifier with linear kernel and the performance of SVM classifier is measured.
The performance measured without applying transforms is shown below,
| Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|
| 73.33 | 73.33 | 100 | 0 |

Table 4.7 Performance Measures without applying transforms
4.3.1 WAVELET PERFORMANCE MEASURES
| Sub-band | Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|---|
| Horizontal | 73.3 | 73.3 | 100 | 0 |
| Vertical | 73.3 | 76.9 | 90.9 | 25 |
| Diagonal | 73.3 | 73.3 | 100 | 0 |

Table 4.8 Performance Measures of DWT based SVM
4.3.2 GABOR PERFORMANCE MEASURES
| Sub band | Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|---|
| Sub band 1 | 100 | 100 | 100 | 100 |
| Sub band 2 | 73.33 | 76.9 | 90.91 | 25 |
| Sub band 3 | 73.33 | 78.57 | 100 | 25 |
| Sub band 4 | 73.33 | 78.57 | 100 | 25 |
| Sub band 5 | 80 | 78.57 | 100 | 25 |
| Sub band 6 | 73.33 | 73.33 | 100 | 0 |
| Sub band 7 | 73.33 | 73.33 | 100 | 0 |
| Sub band 8 | 73.33 | 78.57 | 100 | 25 |
| Sub band 9 | 80 | 78.57 | 100 | 25 |
| Sub band 10 | 73.33 | 73.33 | 100 | 0 |
| Sub band 11 | 73.33 | 73.33 | 100 | 0 |
| Sub band 12 | 73.33 | 73.33 | 100 | 0 |
| Sub band 13 | 80 | 78.57 | 100 | 25 |
| Sub band 14 | 86.67 | 100 | 81.81 | 100 |
| Sub band 15 | 80 | 78.57 | 100 | 25 |
| Sub band 16 | 73.33 | 73.33 | 100 | 0 |
| Sub band 17 | 73.33 | 73.33 | 100 | 0 |
| Sub band 18 | 73.33 | 78.57 | 90.9 | 0 |
| Sub band 19 | 80 | 78.57 | 90.9 | 0 |
| Sub band 20 | 73.33 | 73.33 | 100 | 0 |
| Sub band 21 | 73.33 | 73.33 | 100 | 0 |
| Sub band 22 | 80 | 78.57 | 100 | 25 |
| Sub band 23 | 80 | 78.57 | 100 | 25 |
| Sub band 24 | 73.33 | 73.33 | 100 | 0 |
| Sub band 25 | 73.33 | 73.33 | 100 | 0 |
| Sub band 26 | 73.33 | 73.33 | 100 | 0 |
| Sub band 27 | 80 | 78.57 | 100 | 25 |
| Sub band 28 | 80 | 83.33 | 90.9 | 50 |
| Sub band 29 | 80 | 78.57 | 100 | 25 |
| Sub band 30 | 73.33 | 73.33 | 100 | 0 |
| Sub band 31 | 73.33 | 73.33 | 100 | 0 |
| Sub band 32 | 80 | 78.57 | 100 | 25 |
| Sub band 33 | 80 | 78.57 | 100 | 25 |
| Sub band 34 | 73.33 | 73.33 | 100 | 0 |
| Sub band 35 | 73.33 | 73.33 | 100 | 0 |
| Sub band 36 | 73.33 | 73.33 | 100 | 0 |
| Sub band 37 | 73.33 | 73.33 | 100 | 0 |
| Sub band 38 | 80 | 78.57 | 100 | 25 |
| Sub band 39 | 73.33 | 73.33 | 100 | 0 |
| Sub band 40 | 73.33 | 73.33 | 100 | 0 |

Table 4.9 Performance Measures of Gabor based SVM
CHAPTER 5 COMPARISON OF RESULTS AND ANALYSIS
| Performance measures | Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|---|
| Without transform | 73.33 | 73.33 | 100 | 0 |
| With DWT – Horizontal | 80 | 100 | 100 | 100 |
| With DWT – Vertical | 86.67 | 84.6 | 100 | 50 |
| With DWT – Diagonal | 86.67 | 100 | 100 | 100 |
| With Gabor transform (Sub-band 1) | 100 | 100 | 100 | 100 |

Table 5.1 Performance Measures – Comparison
After obtaining the results, they are compared with one another to determine which approach gives the best performance. From the results, the following can be inferred:

1. If images are classified directly, without applying any transform, a classification accuracy of 73.33% is obtained, but the specificity is 0. Since specificity measures the recognition of negative (normal) cases, this means that the SVM classifier fails to recognize the normal images when it is fed features from untransformed images as inputs.

2. If images are classified by extracting features through the wavelet transform, a classification accuracy of 73.33%, a precision of 76.9% and a specificity of 25% are obtained. This means that the SVM classifier still misclassifies about three quarters of the normal images when it is fed features from wavelet-transformed images as inputs.

3. If images are classified by extracting features through the Gabor transform with 5 scales and 8 orientations, a classification accuracy of 100%, a precision of 100% and a specificity of 100% are obtained, even in the first sub-band itself (scale 1, orientation 1). This means that the SVM classifier performs far better when it is fed simply the first sub-band of the Gabor-extracted features.
CHAPTER 6 CONCLUSION
Real-world data is generally imperfect for two reasons: the data can be incomplete, lacking necessary information, and the data may be inaccurate, containing noise or even wrong information. With such imperfect real-world data, how to extract the specific image features is still a major open issue. Feature extraction and selection are essential in image classification. Once the features have been extracted, feature selection, which aims at improving classification accuracy and removing redundant features, plays an important role, because it directly influences the behaviour of the SVM. In this project, the proposed approach based on a linear SVM has demonstrated great potential and usefulness in MRI image classification. From the above analysis, it is clear that the medical images are perfectly classified, with 100% classification accuracy, when they are given to a Gabor transform based Support Vector Machine (SVM) with a linear kernel. In addition to 100% classification accuracy, it provides high recognition efficiency and fast convergence compared to the discrete wavelet transform based SVM with a linear kernel. The results also show that the Gabor-extracted features give greater resolution to the images in the classification process. All these factors make the Gabor transform based SVM with a linear kernel more efficient than the other classifiers considered for medical image classification.
REFERENCES
[1] Antonio V. Netto, Cesar B. Castanon, C. P. De Almeida, João E. S. B. Neto, Maria Cristina F. de Oliveira Osvaldo, 'Ridgelet based Texture Classification of Tissues in Computed Tomography', Machine Vision and Applications, Springer-Verlag, 2008.
[2] Li Bai, Ulas Bagci, 'A Comparison of Daubechies and Gabor Wavelets for Classification of MR Images', Collaborative Medical Image Analysis Group (CMIAG), 2007.
[3] Lindsay Semler, Lucia Dettori, 'A Comparison of Wavelet-based and Ridgelet-based Texture Classification of Tissues in Computed Tomography', Springer-Verlag, 2008, pp. 21-29.
[4] P. Carrion, E. Cernadas, M. Fernandez-Delgado, J. F. Galvez, Pilar Sa-Otero, 'Improved Classification of Pollen Texture Images using SVM and MLP', Springer-Verlag, 2008, pp. 115-119.
[5] Qian Zhu, Shaomei Yang, 'Comparison and Application of SVM and FNN Algorithm', Vol. 1, 2007.
[6] El-Sayed A. El-Dahshan, Abdel-Badeeh M. Salem, Tamer H. Younis, 'A Hybrid Technique for Automatic MRI Brain Images Classification', Studia Univ. Babes-Bolyai, Informatica, Vol. 54, 2009.
[7] M. Fletcher-Heath, D. B. Goldgof, L. O. Hall, F. R. Murtagh, 'Automatic segmentation of non-enhancing brain tumors in magnetic resonance images', Artificial Intelligence in Medicine, Vol. 21, 2001, pp. 43-63.
[8] Wei-Li Zhang, Xi-Zhao Wang, 'Feature Extraction and Classification for Human Brain CT Images', International Conference on Machine Learning and Cybernetics, Hong Kong, 2007.
[9] R. M. Haralick, K. Shanmugam, I. Dinstein, 'Textural Features for Image Classification', IEEE Transactions on Systems, Man, and Cybernetics, Vol. 3, 1973, pp. 610-621.
[10] V. N. Vapnik, 'The Nature of Statistical Learning Theory', Springer-Verlag, New York, 1995.
[11] Chih-Wei Hsu, Chih-Chung Chang, Chih-Jen Lin, 'A Practical Guide to Support Vector Classification', http://www.csie.ntu.edu.tw/~cjlin, 2009.
[12] C. Chang, C. Lin, 'LIBSVM: a library for support vector machines', 2001.
[13] Y. Ireaneus Anna Rejani, S. Thamarai Selvi, 'Early Detection of Breast Cancer using SVM Classifier Technique', International Journal on Computer Science and Engineering, Vol. 1, 2009, pp. 127-130.
[14] S. Mika, K.-R. Muller, 'An Introduction to Kernel-Based Learning Algorithms', IEEE Transactions on Neural Networks, Vol. 12, 2001, pp. 181-201.
[15] V. Vapnik, 'Statistical Learning Theory', Wiley, New York, 1998.
[16] Ahmet Sertbas, Niyazi Kilic, Onur Osman, Osman N. Ucan, Pelin Gorgel, 'Mammographic Mass Classification Using Wavelet Based Support Vector Machine', Journal of Electrical & Electronics Engineering, Vol. 9, 2009, pp. 867-875.