LIST OF TABLES

4.1 Normal Images – Mean
4.2 Normal Images – Variance
4.3 Normal Images – Entropy
4.4 Abnormal Images – Mean
4.5 Abnormal Images – Variance
4.6 Abnormal Images – Entropy
4.7 Performance Measures without applying transforms
4.8 Performance Measures of DWT based SVM
4.9 Performance Measures of Gabor based SVM
5.1 Performance Measures – Comparison
LIST OF FIGURES

3.1 Normal Image
3.2 Abnormal Image
3.3 Three level wavelet decomposition tree
3.4 Wavelet Families
3.5 SVM principles
3.6 Linear SVM
4.1 Wavelet transform of abnormal image
4.2 Wavelet transform of normal image
4.3 Gabor sub-bands
4.4 Gabor Transform of normal image
4.5 Gabor Transform of abnormal image
LIST OF ABBREVIATIONS

MRI – Magnetic Resonance Imaging
CADe – Computer Aided Detection
CADx – Computer Aided Diagnosis
t2fs – T2 weighted FLAIR image
FLAIR – Fluid Attenuated Inversion Recovery
ANN – Artificial Neural Network
WT – Wavelet Transform
CWT – Continuous Wavelet Transform
DWT – Discrete Wavelet Transform
GT – Gabor Transform
COM – Co-Occurrence Matrix
IDM – Inverse Difference Moment
SVM – Support Vector Machine
RBF – Radial Basis Function
SLT – Statistical Learning Theory
SRM – Structural Risk Minimization
CHAPTER 1 INTRODUCTION
A physician's diagnosis is subject to several limitations. First, it is subjective: the result depends on the doctor's experience and ability. Second, it tends to miss tiny changes that the human eye cannot detect. Third, different physicians may reach different conclusions for the same medical image. Compared to physicians, computer-aided analysis has a clear advantage in avoiding such inconsistent results.
Image classification plays an important role in the fields of medical diagnosis, remote sensing, image analysis and pattern recognition. Digital image classification is the process of sorting images into a finite number of individual classes. In critical fields such as medical diagnosis, images must be classified with maximum accuracy, or misclassification may lead to incomplete treatment of the corresponding disease.
The main advantage of applying a transform to a medical image before extracting its features is the reduction of redundant data in the image. Applying transforms can also improve the accuracy and consistency of the diagnosis. First- and second-order statistics of the wavelet detail coefficients provide descriptors that can discriminate intensity properties spatially distributed throughout the image at various levels of resolution. These features are then classified using an SVM classifier, and the performance measures of the SVM classifier, such as accuracy, precision, sensitivity and specificity, are evaluated and compared with those obtained from the untransformed images.
CHAPTER 2 LITERATURE REVIEW
Image classification plays an important role in the fields of medical diagnosis, remote sensing, image analysis and pattern recognition. In critical fields like medical diagnosis, images have to be classified with maximum accuracy, or else misclassification will lead to incomplete treatment of the corresponding disease.
In order to achieve high classification accuracy, numerous methods for feature extraction and image classification have been introduced. For feature extraction, transforms such as the Wavelet, Ridgelet, Curvelet and Contourlet transforms have been used. For classification, statistical classifiers such as Feedforward Neural Networks (FNN), Feedback Neural Networks, Backpropagation Neural Networks (BPNN), Multi-Layer Perceptrons (MLP), Hybrid Hopfield Neural Networks (HHNN), Radial Basis Function Neural Networks (RBF-NN) [1] and Self-Organizing Maps (SOM), as well as kernel classifiers such as the Support Vector Machine (SVM), have been used.
One method uses the Ridgelet transform for feature extraction and classification, as explained in the paper "Ridgelet based Texture Classification of Tissues in Computed Tomography" [3] by Lindsay Semler, Lucia Dettori and Brandon Kerr of DePaul University, Chicago. The article focuses on Ridgelet-based multi-resolution texture analysis and investigates the discriminating power of several Ridgelet-based texture descriptors. Although the Ridgelet transform provided good performance, it gives good results only for texture descriptors, which are somewhat difficult to calculate. A paper titled "A Comparison of Daubechies and Gabor Wavelets for Classification of MR Images" [2] by Ulas Bagci and Li Bai of CMIAG proposes a machine learning algorithm for classifying MR images using the wavelet transform and the Gabor transform with SVM. However, 100% classification accuracy in separating normal and abnormal images was obtained only with SVM classifiers using Sigmoid and RBF kernels, which are more time consuming and less effective than an SVM classifier with a linear kernel.
An article named "A Comparison of Wavelet-based and Ridgelet-based Texture Classification of Tissues in Computed Tomography" [3] by Lindsay Semler and Lucia Dettori of DePaul University, Chicago presents a comparison of wavelet-based and Ridgelet-based algorithms for medical image classification. Tests on a large set of chest and abdomen CT images indicate that the algorithm using texture features derived from the Haar wavelet transform clearly outperforms those based on the Daubechies and Coiflet transforms. The tests also show that the Ridgelet-based algorithm is significantly more effective, but it fails to effectively recognize the point coordinates in the image that represent the tumour, and it does not provide 100% accuracy.
Another problem lies in the classification stage. Before SVM, Artificial Neural Networks (ANN) were widely used for classification of images and other applications. Many types of ANN were developed, namely Feedforward Neural Networks (FNN), Feedback Neural Networks, Backpropagation Neural Networks (BPNN), Multi-Layer Perceptrons (MLP), etc. However, all of them have their own advantages and disadvantages, and none of them provided 100% classification accuracy.
With the introduction of kernel classifiers, the SVM classifier became more common, and the Support Vector Machine (SVM) with a linear kernel provided 100% classification accuracy. A comparison of SVM and FNN was provided in the paper titled "Research on Comparison and Application of SVM and FNN Algorithm" [5] by Shaomei Yang and Qian Zhu. The paper concluded that higher recognition accuracy and faster convergence are achieved with SVM than with FNN.
Another study titled "Improved Classification of Pollen Texture Images using SVM and MLP" [4] by M. Fernandez-Delgado, P. Carrion, E. Cernadas, J.F. Galvez and Pilar Sa-Otero explored the use of more sophisticated classifiers to improve the classification stage. It shows that SVM achieved high accuracy compared to k-Nearest Neighbours (KNN) and the Multi-Layer Perceptron (MLP). KNN also requires storing the whole training set and is therefore time consuming.
CHAPTER 3 MEDICAL IMAGE CLASSIFICATION
3.1 MEDICAL IMAGES
As the project's main objective is to classify medical images, the first and foremost requirement is the input to the algorithm, i.e., medical images. Here, classification is performed on medical images obtained from Magnetic Resonance Imaging (MRI) scans. Among the different types of MRI scans, such as Diffusion MRI, Functional MRI, FLAIR (Fluid Attenuated Inversion Recovery) MRI and Interventional MRI, the images used here are FLAIR images of t2fs type.

Fluid Attenuated Inversion Recovery (FLAIR) is a pulse sequence used in magnetic resonance imaging, invented by Dr. Graeme Bydder. FLAIR can be used with both three-dimensional (3D FLAIR) and two-dimensional (2D FLAIR) imaging. The pulse sequence is an inversion recovery technique that nulls fluids. For example, it can be used in brain imaging to suppress cerebrospinal fluid (CSF) effects on the image, so as to bring out periventricular hyperintense lesions such as multiple sclerosis (MS) plaques. By carefully choosing the inversion time TI, the signal from any particular tissue can be nulled; the appropriate TI depends on the tissue's T1 via the formula TI = ln(2) × T1, i.e. a TI of roughly 70% of T1. In the case of CSF suppression, one aims for T2-weighted images.
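As a quick worked example of this formula (assuming a representative CSF T1 of about 4000 ms at 1.5 T, a typical literature value rather than a figure given in this report):

$$TI = \ln(2) \times T_1 \approx 0.693 \times 4000\ \text{ms} \approx 2772\ \text{ms},$$

which is in the range of inversion times commonly used for CSF-nulled FLAIR sequences.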
T2-weighted scans use a spin echo (SE) sequence with a long TE and a long TR. They have long been the clinical workhorse, as the spin echo sequence is less susceptible to inhomogeneities in the magnetic field. They are particularly well suited to detecting edema, as they are sensitive to water content (edema is characterized by increased water content). This project focuses on creating an algorithm for classifying brain images in particular and on detecting abnormalities in the brain such as tumours. Therefore, the images given as input to the algorithm are of two types:
1. Normal brain images (without tumour) (e.g., Fig. 3.1), and
2. Abnormal brain images (with tumour) (e.g., Fig. 3.2).

Fig. 3.1 Normal Image
Fig. 3.2 Abnormal Image
3.2 APPLYING IMAGE TRANSFORMS

3.2.1 IMAGE TRANSFORMATIONS

In many image processing applications, a key first step is applying a transform to the image. This step is usually carried out for the following reasons:
1. To reduce the redundant data in the image
2. To reduce the actual size of the image content
3. To make analysis easier
The transform of a signal is another way to represent the signal; it does not alter the information content present in the signal. Different image transforms are available, such as:

1. Cosine Transform
2. Sine Transform
3. Fourier Transform
4. Wavelet Transform
5. Ridgelet Transform
6. Curvelet Transform
7. Contourlet Transform
8. Ripplet Transform
9. Gabor Transform, etc.

Among these, the most promising are the Wavelet Transform and the Gabor Transform.
3.2.2 DISCRETE WAVELET TRANSFORM
The Wavelet Transform provides a time-frequency representation of a signal. It was developed by Morlet to overcome the shortcomings of the Short Time Fourier Transform (STFT) and the Fourier Transform (FT), and it can be used to analyze non-stationary signals. While the FT and STFT give a constant resolution at all frequencies, the Wavelet Transform uses a multi-resolution technique by which different frequencies are analyzed with different resolutions, with a coherence time proportional to the period of the signal.

Also, the Fourier transform does not provide any information about where within the signal certain frequencies occur, i.e., information about the time domain is lost. Since most real-world signals change with time, it is especially useful to characterize signals in both the time and frequency domains simultaneously. The wavelet transform is used for this reason.
The Wavelet Transform (WT) decomposes a signal into a linear sum of basis functions, which are dilated and translated wavelets. A wavelet is a function $\psi \in L^2(\mathbb{R})$ with zero average, normalized to one and centered in the neighbourhood of $t = 0$. A family of time-frequency components is obtained by translating it by $u$ and scaling it by $s$:

$$\psi_{u,s}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right)$$

Convolution of this wavelet function with the image gives the wavelet transform of the image f(x,y). The wavelet transform of the image can be analysed with respect to both the time and frequency domains; hence, the wavelet transform provides information about both domains at various resolutions. The Discrete Wavelet Transform converts a discrete-time signal into wavelets and can be expressed as

$$f(t) = \sum_{m}\sum_{n} d_m(n)\,\psi_{mn}(t),$$

where $d_m(n) = \langle f, \tilde{\psi}_{mn} \rangle$, and $\{\psi_{mn}\}$ and $\{\tilde{\psi}_{mn}\}$ are bases for $L^2(\mathbb{R})$ satisfying the biorthogonality condition

$$\langle \psi_{mn}, \tilde{\psi}_{lk} \rangle = \delta_{ml}\,\delta_{nk}.$$

In the orthogonal case, $\psi_{mn}$ can be used for both the synthesis and the analysis of wavelets.
The two-dimensional wavelet transform is carried out by applying the one-dimensional wavelet transform to the rows and columns of the two-dimensional data consecutively. It is computed by successive lowpass and highpass filtering of the discrete time-domain signal; this is called the Mallat algorithm or Mallat-tree decomposition. In figure 3.3, the signal is denoted by the sequence x[n], where n is an integer. The low pass filter is denoted by G0 and the high pass filter by H0. At each level, the high pass filter produces the detail information d[n], while the low pass filter associated with the scaling function produces the coarse approximation a[n].
Generally, there are three detail components in images, namely the horizontal, vertical and diagonal components. These three components contain all the detailed information about the high-frequency content of the image, so image features are extracted from them rather than from the approximation components (a code sketch follows figure 3.3 below).
Fig 3.3 Three level wavelet decomposition tree
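As an illustration of this decomposition, here is a minimal Python sketch using the PyWavelets library (the library choice and the stand-in image are assumptions of this sketch, not tools named in the report):

```python
import numpy as np
import pywt  # PyWavelets; assumed library choice

# Stand-in 8-bit grayscale image; in practice this would be an MR slice.
image = np.random.randint(0, 256, size=(256, 256)).astype(np.float64)

# One level of the 2D DWT with the Haar mother wavelet: cA is the coarse
# approximation a[n]; cH, cV, cD are the horizontal, vertical and diagonal
# detail sub-bands d[n] described above.
cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')

# A three-level Mallat-tree decomposition, as in figure 3.3.
coeffs = pywt.wavedec2(image, 'haar', level=3)
cA3 = coeffs[0]           # coarsest approximation
details = coeffs[1:]      # (cH, cV, cD) tuples, from coarse to fine
print(cA3.shape, [d[0].shape for d in details])
```

Features such as the mean, variance and entropy reported in chapter 4 would then be computed per detail sub-band.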
The advantages of the DWT are:
1. It is easy to implement.
2. It reduces the computation time.
3. The resources required to compute the DWT coefficients are small.
4. It needs only O(N) operations to compute; hence it is also referred to as the fast wavelet transform.
There are a number of basis functions that can be used as the mother wavelet for the wavelet transform. Since the mother wavelet produces all the wavelet functions used in the transform through translation and scaling, it determines the characteristics of the resulting wavelet transform. The families include the Haar wavelet, Daubechies wavelet, Symlet, Coiflet, biorthogonal wavelet, Morlet, Meyer wavelet and Mexican Hat wavelet, as given in fig 3.4.
Fig 3.4 Wavelet Families (a)Haar (b)Daubechies (c)Coiflet1 (d)Symlet (e)Meyer (f)Morlet (g)Mexican Hat
The Haar wavelet is the oldest and simplest wavelet available, and it is used here as the mother wavelet for classifying medical images, since it is well suited to representing the image in the time domain.
3.2.3 GABOR TRANSFORM
Gabor transforms are widely used in image analysis and computer vision. The Gabor transform provides an effective way to analyze images and has been elaborated as a framework for understanding the orientation- and spatial-frequency-selective properties of signals; it appears to be a good approximation to visual sensitivity profiles. Its important advantages are infinite smoothness and exponential decay in frequency. The Gabor transform represents an approach to characterizing a time function in terms of time and frequency simultaneously.
A joint space-frequency analysis of a signal cannot be performed by the Fourier transform, but it can easily be performed by the Short Time Fourier Transform (STFT). The STFT of a signal s(t) is given by

$$\mathrm{STFT}(\tau, \omega) = \int s(t)\, g(t-\tau)\, e^{-j\omega t}\, dt$$

Thus, the STFT of a signal is defined as the Fourier transform of the signal s(t) windowed by the function g(t−τ). The STFT with a Gaussian window is called a Gabor transform.
The Gabor transform can be regarded as a signal being convolved with a filter bank whose impulse response in the time domain is a Gaussian modulated by sine and cosine waves. The characteristics of the Gabor wavelets, especially their frequency and orientation representations, are similar to those of the human visual system, and they have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, the 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave; a standard normalized form consistent with the parameters below is

$$\psi(x, y) = \frac{f^2}{\pi\,\gamma\,\eta}\, \exp\!\left(-\alpha^2 x'^2 - \beta^2 y'^2\right) \exp\!\left(j\, 2\pi f x'\right),$$
$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta,$$

where f is the central frequency of the sinusoidal plane wave, θ is the anti-clockwise rotation of the Gaussian and the plane wave, α is the sharpness of the Gaussian along the major axis parallel to the wave, and β is the sharpness of the Gaussian along the minor axis perpendicular to the wave. The ratios γ = f/α and η = f/β are defined to keep the ratio between frequency and sharpness constant.
The Gabor filters are self-similar: all filters can be generated from one mother wavelet by dilation and rotation. Each filter has the shape of a plane wave with frequency f, restricted by a Gaussian envelope function with relative widths α and β. To extract useful features from an image, a set of Gabor filters with different frequencies and orientations is normally required.

The Gabor representation of an MR brain image M(x) is obtained by convolving the image with the family of Gabor filters:

$$G_{u,v}(x) = M(x) * \psi_{u,v}(x),$$

where $G_{u,v}$ denotes the resulting Gabor representation of the image M at orientation u and scale v. The resulting Gabor feature set thus consists of the convolution results of the input image M(x) with all of the Gabor wavelets.
The number of Gabor wavelets used varies with the application. Better results are obtained if 5 scales and 8 orientations are used for the classification of MR brain images. The extracted feature vectors are then concatenated to construct a new feature vector for classification purposes. In Figure 4.3, an MR brain image is convolved with 40 Gabor filters (8 orientations, 5 scales); each row and column shows a different scale and orientation respectively.
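A minimal sketch of such a 40-filter bank (5 scales, 8 orientations) using scikit-image, which is an assumed tool choice here rather than one stated in the report; the scale-to-frequency mapping is likewise illustrative:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import gabor_kernel  # assumed library choice

def gabor_feature_vector(image, n_scales=5, n_orientations=8):
    """Convolve an image with a 5x8 Gabor filter bank and return one
    feature value (mean magnitude) per sub-band, 40 values in all."""
    features = []
    for v in range(n_scales):
        frequency = 0.25 / (2 ** v)      # illustrative dyadic scale spacing
        for u in range(n_orientations):
            theta = u * np.pi / n_orientations
            kernel = gabor_kernel(frequency, theta=theta)
            real = ndi.convolve(image, np.real(kernel), mode='wrap')
            imag = ndi.convolve(image, np.imag(kernel), mode='wrap')
            features.append(np.mean(np.hypot(real, imag)))
    return np.array(features)

image = np.random.rand(128, 128)          # stand-in for an MR slice
print(gabor_feature_vector(image).shape)  # -> (40,)
```

In the report's pipeline, only the first sub-band (scale 1, orientation 1) turns out to be needed for the best results, as shown in chapter 4.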
3.3 FEATURE EXTRACTION
Feature extraction is an important step in image classification. It refers to extracting various characteristics of the image, either transformed or untransformed. Numerous features are available in the image for extraction. Feature extraction includes:
1. Gray scale (intensity) feature extraction
2. Shape feature extraction
3. Texture feature extraction
3.3.1 GRAY SCALE FEATURES
In digital picture processing, a two-dimensional digitized gray scale image (M×N) can be seen as M×N pixels in the two-dimensional plane XOY, where each pixel f(x,y) is expressed by its gray value.
Mean
Mean or average value represents the arithmetic mean or average of all pixels in the image f(x,y).
Variance

Variance is the second central moment, i.e. the second moment about the mean. It reflects how widely the gray scale values are spread about the mean.
Skewness

Skewness is the third standardized moment about the mean. It measures the degree of asymmetry of the data distribution about the mean.
Kurtosis

Kurtosis is the fourth standardized moment about the mean. It reflects the sharpness or flatness of the data distribution compared with a normal distribution.
Histogram

The gray histogram is a function of the gray scale value; it describes the fraction of pixels in the picture that have each gray scale value:

$$h(b) = \frac{n_b}{n},$$

where $n_b$ is the number of pixels whose gray scale value is b, and n is the total number of pixels in the picture.
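A small Python sketch of these first-order statistics (numpy/scipy only; the function name and stand-in image are mine, not from the report):

```python
import numpy as np
from scipy import stats

def gray_scale_features(image):
    """First-order (histogram-based) statistics of an 8-bit grayscale image."""
    pixels = image.astype(np.float64).ravel()
    hist = np.bincount(image.ravel(), minlength=256) / pixels.size  # h(b) = n_b / n
    nonzero = hist[hist > 0]
    return {
        'mean': pixels.mean(),
        'variance': pixels.var(),            # second central moment
        'skewness': stats.skew(pixels),      # third standardized moment
        'kurtosis': stats.kurtosis(pixels),  # fourth standardized moment
        'energy': np.sum(hist ** 2),
        'entropy': -np.sum(nonzero * np.log2(nonzero)),
    }

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(gray_scale_features(image))
```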
The features derived from the gray scale histogram are mainly the gray scale mean, gray scale variance, gray skewness, gray kurtosis, gray energy and gray entropy.

3.3.2 SHAPE FEATURES
The shape features are invariant to scaling, rotation and translation, so they can serve as object recognition features. They include the Normalized Moment of Inertia (NMI), moment invariants and sphericity.
Shape features can identify a figure well. However, as the exterior shapes of brain images are all nearly elliptical, the image cannot be distinguished by extracting shape features from the whole image. Considering the place and character of the particular pathological changes, we select the region of interest (ROI) and extract its features.
3.3.3 TEXTURE FEATURES
A co-occurrence matrix (COM) is a square matrix whose elements correspond to the relative frequency of occurrence p(i, j, d, h) of two pixel values, one with intensity i and the other with intensity j, separated by a certain distance d in a given direction h. A COM is therefore a square matrix whose dimension equals the number of gray levels in the image. The COM is scale invariant. The matrices present the relative frequency distributions of gray levels and describe how often one gray level appears in a specified spatial relationship to another gray level within the image.
The matrix is normalized by the following function:

$$p(i, j) = \frac{P(i, j)}{R},$$

where P(i, j) is the raw co-occurrence count (notation assumed here) and R is the normalizing constant, usually set to the sum of all matrix entries.
Energy

Energy is also called the Angular Second Moment. It is a measure of the homogeneity of the image and can be calculated from the normalized COM. It is a suitable measure for detecting disorder in a texture image. Higher values mean that the amplitude or intensity changes little across the image, resulting in a much sparser COM.
Contrast

Contrast is a measure of the amount of local variation in the image. It represents the degree of legibility of the image: a higher contrast value indicates a high amount of local variation, so the higher the contrast, the clearer the image.
IDM

The Inverse Difference Moment (IDM) reflects local texture changes. It is another feature related to image contrast.
Entropy

Entropy is a statistical measure of randomness that can be used to characterize the texture of the input image. The entropy of the image reflects the irregularity and complexity of its gray level distribution.
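A sketch of these co-occurrence features using scikit-image's GLCM utilities (an assumed tool choice; the entropy is computed by hand since graycoprops does not provide it, and skimage's "homogeneity" is one common IDM definition):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # assumed library choice

def texture_features(image, distance=1, angle=0.0):
    """COM-based texture features of an 8-bit grayscale image."""
    glcm = graycomatrix(image, distances=[distance], angles=[angle],
                        levels=256, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]               # the normalized COM p(i, j)
    nonzero = p[p > 0]
    return {
        'energy': graycoprops(glcm, 'ASM')[0, 0],        # angular second moment
        'contrast': graycoprops(glcm, 'contrast')[0, 0],
        'idm': graycoprops(glcm, 'homogeneity')[0, 0],   # one common IDM form
        'entropy': -np.sum(nonzero * np.log2(nonzero)),
    }

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(texture_features(image))
```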
Three important features were considered here: two gray scale features, namely the mean and variance, and one texture feature, namely entropy. These are calculated and used to create the datasets for SVM classification.
3.4 DATASET FORMATION
Prior to classification by SVM, the extracted features must be arranged into a dataset. All images used for classification were obtained from a hospital. There are 38 images in total, of which 7 were obtained from healthy persons and the remaining 31 from persons suffering from a brain tumour.

As classification by SVM consists of two phases, namely a training phase and a testing phase, the images are divided between them as follows (a sketch of this dataset assembly is given below):

For the training phase: 11 abnormal images and 3 normal images are used.
For the testing phase: 20 abnormal images and 4 normal images are used.
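A minimal sketch of assembling such a dataset (the three-column mean/variance/entropy layout follows the report, while the random stand-in features, label convention and array names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in feature rows (mean, variance, entropy) per image; in the real
# pipeline these come from the feature-extraction step described above.
def fake_features(n_images):
    return rng.random((n_images, 3))

# Training set: 11 abnormal + 3 normal images (labels: +1 abnormal, -1 normal).
X_train = np.vstack([fake_features(11), fake_features(3)])
y_train = np.array([1] * 11 + [-1] * 3)

# Testing set: 20 abnormal + 4 normal images.
X_test = np.vstack([fake_features(20), fake_features(4)])
y_test = np.array([1] * 20 + [-1] * 4)
print(X_train.shape, X_test.shape)  # (14, 3) (24, 3)
```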
3.5 CLASSIFICATION BY SVM

3.5.1 STATISTICAL LEARNING THEORY
Learning can be thought of as inferring regularities from a set of training examples, and various learning algorithms allow the extraction of such regularities. If learning has been successful, these intrinsic regularities will be captured in the values of some parameters of a learning machine. Geometry provides a very intuitive background for understanding and solving many problems in machine learning. Statistical learning theory addresses a key question that arises when constructing predictive models from data: how to decide whether a particular model is adequate or whether a different model would produce better predictions. Whereas classical statistics typically assumes that the form of the correct model is known and the objective is to estimate its parameters, statistical learning theory presumes that the correct form is completely unknown, and the goal is to select the best model from a set of candidate mathematical forms, none of which need be correct. The theory provides a sound statistical basis for assessing model adequacy under these circumstances, which are precisely the circumstances encountered in machine learning, pattern recognition and exploratory data analysis. Estimating the performance of competing models is the central issue in statistical learning theory. Performance is measured through loss functions: the loss Q(z, α) between a data vector z and a specific model α (one with values assigned to all parameters) is a score that indicates how well α performs on z, with lower scores indicating better performance. Statistical Learning Theory forms the basis of classification by the Support Vector Machine (SVM).
3.5.2 SUPPORT VECTOR MACHINE
Support Vector Machines (SVMs) are a comparatively new supervised classification technique with its basis in Statistical Learning Theory. Following successes in machine vision fields such as character, handwritten-digit and text recognition, there has been increased interest in their application to image classification. SVMs are non-parametric, which gives them robustness comparable to Artificial Neural Networks and other non-parametric classifiers. The purpose of using SVM is to obtain acceptable results quickly and easily.
The Support Vector Machine is a theoretically well-founded machine learning method with strong results in the classification of high-dimensional datasets. A classification task usually involves training and testing data consisting of data instances. Each instance in the training set contains one "target value" (class label) and several "attributes" (features). The goal of SVM is to produce a model which predicts the target values of the data instances in the testing set, given only their attributes.
SVM functions by nonlinear projection of the training data from the input space to a feature space of higher (possibly infinite) dimension by means of a kernel function φ. SVM then identifies the linear separating hyperplane with the maximal margin in this higher-dimensional space. This enables the classification of datasets which are usually not linearly separable in the input space. In many instances, classification in high-dimensional feature spaces leads to overfitting; in SVMs, however, overfitting is controlled through the principle of structural risk minimization. The empirical risk of misclassification is minimized by maximizing the margin between the data points and the decision boundary. In practice this criterion is softened to the minimization of a cost factor involving both the complexity of the classifier and the degree to which marginal points are misclassified. The tradeoff between these factors is managed through a margin-of-error parameter that is tuned through cross-validation procedures.
The SVM paradigm in machine learning presents several advantages over other approaches:
1. Uniqueness of the solution (it is guaranteed to be the global minimum of the corresponding optimization problem),
2. Good generalization properties of the solution,
3. A rigorous theoretical foundation based on SLT and optimization theory,
4. A common formulation for the class-separable and class-non-separable problems, as well as for linear and non-linear problems (through the kernel trick), and
5. A clear geometric intuition of the classification problem.
Due to these very attractive properties, SVMs have been successfully used in a number of applications. SVM allows the construction of various learning machines by the choice of different dot products (kernels); thus the influence of the set of functions that can be implemented by a specific learning machine can be studied in a unified framework. SVM builds support vectors on the results of statistical learning theory, namely on the Structural Risk Minimization principle, which guarantees high generalization ability. There is reason to believe that the decision rules constructed by the support vector algorithm reflect the regularities of the data rather than the limitations of the learning machine.
Fig 3.5 SVM principle
An SVM finds the best separating (maximal margin) hyperplane between two classes of training samples in the feature space, which is in line with optimizing bounds on the generalization error. The playground for SVM is the feature space H, a Reproducing Kernel Hilbert Space (RKHS), where the mapped patterns reside (Φ: X → H). It is not necessary to know the mapping Φ itself analytically, but only its kernel, i.e., the value of the inner products of the mappings of all the samples: K(x1, x2) = ⟨Φ(x1), Φ(x2)⟩ for all x1, x2 ∈ X. Through the "kernel trick", it is possible to transform a nonlinear classification problem into a linear one in a higher (possibly infinite) dimensional space H. Once the patterns are mapped into the feature space, and provided that the problem is separable for the given model (kernel), the target of the classification task is to find the maximal margin hyperplane.
The principle of Structural Risk Minimization (SRM) is an inductive principle used in machine learning. Commonly, a generalized model must be selected from a finite data set, with the consequent problem of overfitting: the model becomes too strongly tailored to the particularities of the training set and generalizes poorly to new data. The SRM principle addresses this problem by balancing the model's complexity against its success at fitting the training data. The functions used to project the data from the input space to the feature space are called kernels:

$$K(x_i, x_j) \equiv \varphi(x_i)^{T} \varphi(x_j)$$

is called the kernel function.
3.5.3 KERNEL CLASSIFIERS
Depending on the kernel function used, SVM classifiers are divided into different types, namely:
1. Linear SVM
2. Polynomial SVM
3. Radial Basis Function (RBF) SVM
4. Sigmoid SVM

The four basic kernel functions are:

- Linear: $K(x_i, x_j) = x_i^{T} x_j$
- Polynomial: $K(x_i, x_j) = (\gamma\, x_i^{T} x_j + r)^{d}$, $\gamma > 0$
- Radial basis function (RBF): $K(x_i, x_j) = \exp(-\gamma\, \lVert x_i - x_j \rVert^{2})$, $\gamma > 0$
- Sigmoid: $K(x_i, x_j) = \tanh(\gamma\, x_i^{T} x_j + r)$

Here γ, r and d are kernel parameters.
In this project, a linear SVM classifier is used for the classification of MR brain images.
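In scikit-learn terms (an assumed implementation choice; the report itself cites LIBSVM [12], on which scikit-learn's SVC is built), the four kernels map directly onto the `kernel` parameter, reusing the X_train/y_train arrays from the dataset sketch in section 3.4:

```python
from sklearn.svm import SVC                      # assumed implementation choice
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One classifier per kernel type listed above; gamma, degree and coef0
# correspond to the kernel parameters gamma, d and r respectively.
classifiers = {
    'linear':     SVC(kernel='linear'),
    'polynomial': SVC(kernel='poly', degree=3, gamma=1.0, coef0=0.0),
    'rbf':        SVC(kernel='rbf', gamma=1.0),
    'sigmoid':    SVC(kernel='sigmoid', gamma=1.0, coef0=0.0),
}

# Features on very different scales (e.g. mean vs. entropy) benefit from
# standardization before the SVM.
model = make_pipeline(StandardScaler(), classifiers['linear'])
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```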
3.5.4 LINEAR SUPPORT VECTOR MACHINE
The task of image classification maps onto the following problem. Consider the problem of classifying m points in the n-dimensional real space R^n, represented by the m×n matrix A, according to the membership of each point A_i in the class A+ or A−, as specified by a given m×m diagonal matrix D with plus ones or minus ones along its diagonal. For this problem, the standard support vector machine with a linear kernel is given by the following quadratic program with parameter ν > 0, written here in the standard soft-margin form with slack vector y and e a vector of ones:

$$\min_{\omega,\, b,\, y}\; \nu\, e^{T} y + \tfrac{1}{2}\,\omega^{T}\omega \quad \text{subject to} \quad D(A\omega + e b) + y \ge e,\quad y \ge 0.$$

Here ω is the normal to the bounding planes and b determines their location relative to the origin. The plane ω^T x + b = +1 bounds the class A+ points, possibly with some error, and the plane ω^T x + b = −1 bounds the class A− points, also possibly with some error. The linear separating surface is the plane ω^T x + b = 0, which lies midway between the bounding planes.
Fig.3.6 Linear SVM
3.6 PERFORMANCE OF SVM
The performance of the linear Support Vector Machine (SVM) classifier must be measured to determine how well it classifies the MR images. This is done by calculating the performance measures of the classifier, which are:

1. Accuracy / classification rate / efficiency
2. Precision
3. Sensitivity / recall
4. Specificity
Accuracy

Accuracy is the probability that the diagnostic test is performed correctly, i.e., the fraction of all cases that are classified correctly.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision

Precision is the probability that a case classified as positive is truly positive, i.e., the fraction of positive classifications that are correct.

Precision = TP / (TP + FP)

Sensitivity

Sensitivity (true positive fraction, or recall) is the probability that a diagnostic test is positive, given that the person has the disease.

Sensitivity = TP / (TP + FN)

Specificity

Specificity (true negative fraction) is the probability that a diagnostic test is negative, given that the person does not have the disease.

Specificity = TN / (TN + FP)

where:
TP (True Positives): correctly classified positive cases,
TN (True Negatives): correctly classified negative cases,
FP (False Positives): incorrectly classified negative cases, and
FN (False Negatives): incorrectly classified positive cases.
CHAPTER 4 RESULTS
4.1 RESULTS FOR TRANSFORMS
The Discrete Wavelet transform with Haar function and Gabor transform were applied to the MR images, both normal and abnormal. The following results were obtained.
4.1.1 DISCRETE WAVELET TRANSFORM
Fig.4.1 Wavelet transform of abnormal image
Fig.4.2 Wavelet transform of normal image
4.1.2 GABOR TRANSFORM
Fig.4.3 Gabor sub-bands
Fig.4.4 Gabor Transform of normal image
Fig.4.5 Gabor Transform of abnormal image
4.2 RESULTS FOR FEATURE EXTRACTION
The feature extraction results, i.e. the mean, variance and entropy values for both normal and abnormal images, without and with applying the wavelet and Gabor transforms, are given below.
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 60.74703 | 0.037695 | 0.093213 | 0.000317 | 0.251645 |
| Image 2 | 56.40564 | 0.046558 | 0.091089 | 0.001172 | 0.235427 |
| Image 3 | 56.32641 | 0.051931 | 0.059011 | 0.005503 | 0.286015 |
| Image 4 | 58.95343 | 0.049337 | 0.066793 | 0.001343 | 0.287435 |
| Image 5 | 68.64449 | 0.056753 | 0.104955 | 0.00174 | 0.339758 |
| Image 6 | 70.68972 | 0.062788 | 0.102326 | 0.000313 | 0.323999 |
| Image 7 | 56.61366 | 0.046875 | 0.094604 | 0.000342 | 0.271618 |

Table 4.1 Normal Images – Mean
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 2891.725 | 62.92573 | 50.4192 | 2.032886 | 0.044661 |
| Image 2 | 2369.369 | 57.73664 | 45.22733 | 1.953733 | 0.040062 |
| Image 3 | 2861.059 | 69.58309 | 67.54331 | 3.643807 | 0.068496 |
| Image 4 | 3059.756 | 69.86778 | 71.42149 | 3.549834 | 0.061923 |
| Image 5 | 4193.412 | 81.61674 | 73.89571 | 3.67788 | 0.087613 |
| Image 6 | 4173.711 | 80.50006 | 73.31604 | 3.333723 | 0.074757 |
| Image 7 | 2005.429 | 54.35772 | 40.09178 | 2.547767 | 0.049764 |

Table 4.2 Normal Images – Variance

| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 0.04601 | 1.22917 | 1.348253 | 1.525145 | 7.067558 |
| Image 2 | 0.046103 | 1.243307 | 1.3554 | 1.528097 | 6.928274 |
| Image 3 | 0.042084 | 1.235241 | 1.299876 | 1.497687 | 7.117588 |
| Image 4 | 0.042005 | 1.224813 | 1.29799 | 1.492563 | 7.132964 |
| Image 5 | 0.046667 | 1.190325 | 1.259527 | 1.465836 | 7.210374 |
| Image 6 | 0.046762 | 1.191948 | 1.25965 | 1.474173 | 7.186631 |
| Image 7 | 0.04601 | 1.216871 | 1.287355 | 1.49454 | 7.067558 |

Table 4.3 Normal Images – Entropy
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 44.6699 | 0.043616 | 0.071741 | 0.000916 | 0.275158 |
| Image 2 | 36.6242 | 0.053906 | 0.077075 | 7.32E-05 | 0.28023 |
| Image 3 | 57.9026 | 0.045923 | 0.083789 | 0.002222 | 0.247304 |
| Image 4 | 55.8079 | 0.038306 | 0.077734 | 0.001074 | 0.263427 |
| Image 5 | 51.7139 | 0.044006 | 0.077087 | 0.000183 | 0.276598 |
| Image 6 | 55.1247 | 0.046 | 0.067322 | 0.000346 | 0.267245 |
| Image 7 | 54.1993 | 0.052643 | 0.061106 | 0.001516 | 0.28876 |
| Image 8 | 53.3235 | 0.047933 | 0.073222 | 0.000732 | 0.312288 |
| Image 9 | 49.8209 | 0.052948 | 0.073639 | 0.00294 | 0.332287 |
| Image 10 | 56.3098 | 0.04066 | 0.058034 | 0.001801 | 0.285401 |
| Image 11 | 57.4865 | 0.046356 | 0.065684 | 0.002879 | 0.27992 |
| Image 12 | 76.994 | 0.057016 | 0.104016 | 0.002879 | 0.325076 |
| Image 13 | 75.3437 | 0.052534 | 0.099584 | 0.000225 | 0.305669 |
| Image 14 | 75.7451 | 0.042781 | 0.098295 | 0.001991 | 0.30544 |
| Image 15 | 72.9491 | 0.048465 | 0.09534 | 0.000563 | 0.294071 |
| Image 16 | 72.0666 | 0.045297 | 0.108248 | 0.001828 | 0.292423 |
| Image 17 | 67.9041 | 0.050343 | 0.088654 | 0.000814 | 0.280234 |
| Image 18 | 63.5098 | 0.0573 | 0.089111 | 0.000464 | 0.288493 |
| Image 19 | 67.2230 | 0.047266 | 0.088062 | 0.000537 | 0.300335 |
| Image 20 | 65.9541 | 0.045044 | 0.094971 | 0.004272 | 0.294079 |
| Image 21 | 66.4657 | 0.054529 | 0.095325 | 8.54E-05 | 0.293742 |
| Image 22 | 62.9119 | 0.055029 | 0.093604 | 0.001172 | 0.282132 |
| Image 23 | 66.3750 | 0.05575 | 0.097644 | 0.001379 | 0.301768 |
| Image 24 | 68.7174 | 0.057751 | 0.096448 | 0.004602 | 0.319182 |
| Image 25 | 36.8495 | 0.001372 | 0.008499 | 0.001082 | 0.151476 |
| Image 26 | 36.5772 | 0.002877 | 0.015546 | 0.002349 | 0.146255 |
| Image 27 | 36.4418 | 0.003194 | 0.010505 | 0.002323 | 0.135219 |
| Image 28 | 35.7769 | 0.005754 | 0.013725 | 0.001056 | 0.117068 |
| Image 29 | 34.1989 | 0.00293 | 0.004408 | 0.001795 | 0.132999 |
| Image 30 | 35.6062 | 0.006202 | 0.001874 | 0.00161 | 0.13789 |
| Image 31 | 36.5106 | 0.003946 | 0.004025 | 0.000699 | 0.155908 |

Table 4.4 Abnormal Images – Mean
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 3074.957 | 45.2334 | 23.89255 | 1.681724 | 0.226409 |
| Image 2 | 2216.829 | 33.04906 | 16.64941 | 1.577053 | 0.22223 |
| Image 3 | 2919.207 | 58.86982 | 44.82321 | 1.835159 | 0.203457 |
| Image 4 | 3453.95 | 57.30016 | 37.80715 | 1.713261 | 0.216925 |
| Image 5 | 3469.815 | 53.62047 | 27.9559 | 1.799635 | 0.228751 |
| Image 6 | 3096.86 | 54.13341 | 41.61895 | 2.36158 | 0.221209 |
| Image 7 | 3280.115 | 59.26085 | 40.41683 | 2.638464 | 0.24328 |
| Image 8 | 3468.613 | 59.65723 | 39.96407 | 3.090099 | 0.267676 |
| Image 9 | 3396.621 | 60.73324 | 39.6609 | 3.361088 | 0.301476 |
| Image 10 | 3013.183 | 67.5673 | 68.50105 | 3.282204 | 0.249265 |
| Image 11 | 3262.737 | 62.39801 | 56.1362 | 2.669302 | 0.244138 |
| Image 12 | 4768.797 | 80.14766 | 70.45372 | 2.669302 | 0.275816 |
| Image 13 | 4533.509 | 71.62744 | 60.21357 | 3.232478 | 0.262391 |
| Image 14 | 4685.575 | 78.90971 | 63.72544 | 3.317919 | 0.268795 |
| Image 15 | 4513.721 | 72.547 | 62.16007 | 3.181601 | 0.256495 |
| Image 16 | 4579.436 | 69.05656 | 62.94815 | 3.01611 | 0.24539 |
| Image 17 | 4306.119 | 60.29156 | 54.93299 | 2.56416 | 0.229173 |
| Image 18 | 2904.708 | 70.198 | 55.69006 | 2.925376 | 0.242471 |
| Image 19 | 3307.889 | 76.85706 | 64.99517 | 3.422935 | 0.250287 |
| Image 20 | 3327.111 | 68.24504 | 65.98338 | 3.216113 | 0.239318 |
| Image 21 | 3560.139 | 61.61814 | 62.5934 | 2.89766 | 0.235636 |
| Image 22 | 3186.652 | 51.54247 | 46.8791 | 2.492735 | 0.222 |
| Image 23 | 2802.463 | 66.34095 | 48.72643 | 2.943655 | 0.249485 |
| Image 24 | 3172.501 | 79.34072 | 60.32123 | 3.514329 | 0.26923 |
| Image 25 | 2641.164 | 75.10117 | 52.95186 | 3.766885 | 0.323131 |
| Image 26 | 2780.292 | 72.00747 | 51.93181 | 3.444566 | 0.302872 |
| Image 27 | 2851.483 | 63.2016 | 46.25931 | 2.906308 | 0.278063 |
| Image 28 | 2927.764 | 45.76717 | 35.66286 | 2.058119 | 0.233298 |
| Image 29 | 2283.569 | 57.25194 | 41.55681 | 2.828104 | 0.265601 |
| Image 30 | 2417.582 | 66.79382 | 48.52081 | 3.271357 | 0.288454 |
| Image 31 | 2511.656 | 83.11073 | 52.94149 | 3.973229 | 0.33414 |

Table 4.5 Abnormal Images – Variance
| Image | Without Transform | Wavelet Horizontal | Wavelet Vertical | Wavelet Diagonal | Gabor (Sub-band 1) |
|---|---|---|---|---|---|
| Image 1 | 0.04601 | 1.22975 | 1.360388 | 1.518533 | 7.074133 |
| Image 2 | 0.046057 | 1.229486 | 1.354789 | 1.522715 | 7.102217 |
| Image 3 | 0.046057 | 1.234505 | 1.358728 | 1.52438 | 6.980089 |
| Image 4 | 0.046057 | 1.231195 | 1.347056 | 1.531343 | 7.038835 |
| Image 5 | 0.04601 | 1.222984 | 1.343129 | 1.525853 | 7.08164 |
| Image 6 | 0.042044 | 1.237614 | 1.314423 | 1.506982 | 7.080376 |
| Image 7 | 0.042044 | 1.224281 | 1.3119 | 1.49547 | 7.149141 |
| Image 8 | 0.042163 | 1.220224 | 1.287376 | 1.482197 | 7.21587 |
| Image 9 | 0.042084 | 1.217412 | 1.29642 | 1.48085 | 7.249478 |
| Image 10 | 0.042044 | 1.233267 | 1.305855 | 1.500817 | 7.13097 |
| Image 11 | 0.042005 | 1.233912 | 1.308709 | 1.509876 | 7.122963 |
| Image 12 | 0.046667 | 1.183855 | 1.259021 | 1.474565 | 7.178054 |
| Image 13 | 0.04662 | 1.196428 | 1.283083 | 1.485851 | 7.143073 |
| Image 14 | 0.04662 | 1.20201 | 1.280835 | 1.488458 | 7.133883 |
| Image 15 | 0.046667 | 1.214072 | 1.288439 | 1.495205 | 7.099794 |
| Image 16 | 0.046667 | 1.2072 | 1.29658 | 1.498235 | 7.120152 |
| Image 17 | 0.046715 | 1.215082 | 1.306171 | 1.510551 | 7.086348 |
| Image 18 | 0.04601 | 1.207481 | 1.274256 | 1.485508 | 7.113188 |
| Image 19 | 0.046057 | 1.198508 | 1.272535 | 1.470825 | 7.146436 |
| Image 20 | 0.04601 | 1.200482 | 1.270475 | 1.489269 | 7.136099 |
| Image 21 | 0.04601 | 1.198974 | 1.27809 | 1.487759 | 7.142772 |
| Image 22 | 0.04615 | 1.208821 | 1.28574 | 1.49064 | 7.142772 |
| Image 23 | 0.046103 | 1.198707 | 1.271244 | 1.485709 | 7.148212 |
| Image 24 | 0.04601 | 1.190959 | 1.257168 | 1.466409 | 7.185435 |
| Image 25 | 0.969726 | 0.798299 | 0.804175 | 0.877173 | 3.950262 |
| Image 26 | 0.962768 | 0.777796 | 0.791559 | 0.859575 | 3.90922 |
| Image 27 | 0.958183 | 0.769407 | 0.783881 | 0.857869 | 3.824544 |
| Image 28 | 0.947184 | 0.764631 | 0.771277 | 0.837862 | 3.692486 |
| Image 29 | 0.972916 | 0.802867 | 0.822998 | 0.89641 | 3.939474 |
| Image 30 | 0.971439 | 0.805303 | 0.81776 | 0.886377 | 3.95897 |
| Image 31 | 0.973666 | 0.812416 | 0.813293 | 0.888803 | 3.988039 |

Table 4.6 Abnormal Images – Entropy
4.3 RESULTS FOR SVM PERFORMANCE MEASUREMENT
After the feature extraction of normal and abnormal images, the extracted results are classified using the SVM classifier with linear kernel and the performance of SVM classifier is measured.
The performance measured without applying transforms is shown below,
| Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|
| 73.33 | 73.33 | 100 | 0 |

Table 4.7 Performance Measures without applying transforms
4.3.1 WAVELET PERFORMANCE MEASURES
| Sub-band | Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|---|
| Horizontal | 73.3 | 73.3 | 100 | 0 |
| Vertical | 73.3 | 76.9 | 90.9 | 25 |
| Diagonal | 73.3 | 73.3 | 100 | 0 |

Table 4.8 Performance Measures of DWT based SVM
4.3.2 GABOR PERFORMANCE MEASURES
| Sub band | Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|---|
| Sub band 1 | 100 | 100 | 100 | 100 |
| Sub band 2 | 73.33 | 76.9 | 90.91 | 25 |
| Sub band 3 | 73.33 | 78.57 | 100 | 25 |
| Sub band 4 | 73.33 | 78.57 | 100 | 25 |
| Sub band 5 | 80 | 78.57 | 100 | 25 |
| Sub band 6 | 73.33 | 73.33 | 100 | 0 |
| Sub band 7 | 73.33 | 73.33 | 100 | 0 |
| Sub band 8 | 73.33 | 78.57 | 100 | 25 |
| Sub band 9 | 80 | 78.57 | 100 | 25 |
| Sub band 10 | 73.33 | 73.33 | 100 | 0 |
| Sub band 11 | 73.33 | 73.33 | 100 | 0 |
| Sub band 12 | 73.33 | 73.33 | 100 | 0 |
| Sub band 13 | 80 | 78.57 | 100 | 25 |
| Sub band 14 | 86.67 | 100 | 81.81 | 100 |
| Sub band 15 | 80 | 78.57 | 100 | 25 |
| Sub band 16 | 73.33 | 73.33 | 100 | 0 |
| Sub band 17 | 73.33 | 73.33 | 100 | 0 |
| Sub band 18 | 73.33 | 78.57 | 90.9 | 0 |
| Sub band 19 | 80 | 78.57 | 90.9 | 0 |
| Sub band 20 | 73.33 | 73.33 | 100 | 0 |
| Sub band 21 | 73.33 | 73.33 | 100 | 0 |
| Sub band 22 | 80 | 78.57 | 100 | 25 |
| Sub band 23 | 80 | 78.57 | 100 | 25 |
| Sub band 24 | 73.33 | 73.33 | 100 | 0 |
| Sub band 25 | 73.33 | 73.33 | 100 | 0 |
| Sub band 26 | 73.33 | 73.33 | 100 | 0 |
| Sub band 27 | 80 | 78.57 | 100 | 25 |
| Sub band 28 | 80 | 83.33 | 90.9 | 50 |
| Sub band 29 | 80 | 78.57 | 100 | 25 |
| Sub band 30 | 73.33 | 73.33 | 100 | 0 |
| Sub band 31 | 73.33 | 73.33 | 100 | 0 |
| Sub band 32 | 80 | 78.57 | 100 | 25 |
| Sub band 33 | 80 | 78.57 | 100 | 25 |
| Sub band 34 | 73.33 | 73.33 | 100 | 0 |
| Sub band 35 | 73.33 | 73.33 | 100 | 0 |
| Sub band 36 | 73.33 | 73.33 | 100 | 0 |
| Sub band 37 | 73.33 | 73.33 | 100 | 0 |
| Sub band 38 | 80 | 78.57 | 100 | 25 |
| Sub band 39 | 73.33 | 73.33 | 100 | 0 |
| Sub band 40 | 73.33 | 73.33 | 100 | 0 |

Table 4.9 Performance Measures of Gabor based SVM
CHAPTER 5 COMPARISON OF RESULTS AND ANALYSIS
| Performance measures | Accuracy | Precision | Sensitivity | Specificity |
|---|---|---|---|---|
| Without transform | 73.33 | 73.33 | 100 | 0 |
| With DWT – Horizontal | 80 | 100 | 100 | 100 |
| With DWT – Vertical | 86.67 | 84.6 | 100 | 50 |
| With DWT – Diagonal | 86.67 | 100 | 100 | 100 |
| With Gabor transform (Sub-band 1) | 100 | 100 | 100 | 100 |

Table 5.1 Performance Measures – Comparison
After obtaining the results, they are compared with one another to determine which approach gives the best performance. From the results, the following can be inferred:

1. If images are classified directly, without applying any transform, a classification accuracy of 73.33% is obtained, but the specificity is 0. Since specificity measures the recognition of negative (normal) cases, this means that the SVM classifier fails to recognize the normal images when it is fed features from untransformed images as inputs.

2. If images are classified by extracting features through the wavelet transform, a classification accuracy of 73.33%, a precision of 76.9% and a specificity of 25% are obtained. This means that the SVM classifier still misclassifies about three quarters of the normal images when it is fed features from wavelet-transformed images as inputs.

3. If images are classified by extracting features through the Gabor transform with 5 scales and 8 orientations, a classification accuracy of 100%, a precision of 100% and a specificity of 100% are obtained, even in the first sub-band itself (scale 1, orientation 1). This means that the SVM classifier performs far better when it is fed simply the first sub-band of the Gabor-extracted features.
CHAPTER 6 CONCLUSION
Real-world data is generally imperfect for two reasons: the data can be incomplete, lacking necessary information, and the data may be inaccurate, containing noise or even wrong information. With such imperfect real-world data, how to extract the specific image features is still a major open issue. Feature extraction and selection are essential in image classification. Once the features have been extracted, feature selection, which aims at improving classification accuracy and removing redundant features, plays an important role, because it directly influences the behaviour of the SVM. In this project, the proposed approach based on a linear SVM has demonstrated great potential and usefulness in MRI image classification. From the above analysis, it is clear that the medical images are perfectly classified, with 100% classification accuracy, when they are given to a Gabor transform based Support Vector Machine (SVM) with a linear kernel. In addition to 100% classification accuracy, it provides high recognition efficiency and fast convergence compared to the discrete wavelet transform based SVM with a linear kernel. The results also show that the Gabor-extracted features give greater resolution to the images in the classification process. All these factors make the Gabor transform based SVM with a linear kernel more efficient than the other classifiers considered for medical image classification.
REFERENCES
[1] Antonio V. Netto, Cesar B. Castanon, C. P. De Almeida, João E. S. B. Neto, Maria Cristina F. de Oliveira Osvaldo, 'Ridgelet based Texture Classification of Tissues in Computed Tomography', Machine Vision and Applications, Springer-Verlag, 2008.
[2] Li Bai, Ulas Bagci, 'A Comparison of Daubechies and Gabor Wavelets for Classification of MR Images', Collaborative Medical Image Analysis Group (CMIAG), 2007.
[3] Lindsay Semler, Lucia Dettori, 'A Comparison of Wavelet-based and Ridgelet-based Texture Classification of Tissues in Computed Tomography', Springer-Verlag, 2008, pp. 21-29.
[4] P. Carrion, E. Cernadas, M. Fernandez-Delgado, J. F. Galvez, Pilar Sa-Otero, 'Improved Classification of Pollen Texture Images using SVM and MLP', Springer-Verlag, 2008, pp. 115-119.
[5] Qian Zhu, Shaomei Yang, 'Comparison and Application of SVM and FNN Algorithm', Vol. 1, 2007.
[6] El-Sayed A. El-Dahshan, Abdel-Badeeh M. Salem, Tamer H. Younis, 'A Hybrid Technique for Automatic MRI Brain Images Classification', Studia Univ. Babes-Bolyai, Informatica, Vol. 54, 2009.
[7] M. Fletcher-Heath, D. B. Goldgof, L. O. Hall, F. R. Murtagh, 'Automatic segmentation of non-enhancing brain tumors in magnetic resonance images', Artificial Intelligence in Medicine, Vol. 21, 2001, pp. 43-63.
[8] Wei-Li Zhang, Xi-Zhao Wang, 'Feature Extraction and Classification for Human Brain CT Images', International Conference on Machine Learning and Cybernetics, Hong Kong, 2007.
[9] R. M. Haralick, K. Shanmugam, I. Dinstein, 'Textural Features for Image Classification', IEEE Transactions on Systems, Man, and Cybernetics, Vol. 3, 1973, pp. 610-621.
[10] V. N. Vapnik, 'The Nature of Statistical Learning Theory', Springer-Verlag, New York, 1995.
[11] Chih-Wei Hsu, Chih-Chung Chang, Chih-Jen Lin, 'A Practical Guide to Support Vector Classification', http://www.csie.ntu.edu.tw/~cjlin, 2009.
[12] C. Chang, C. Lin, 'LIBSVM: a library for support vector machines', 2001.
[13] Y. Ireaneus Anna Rejani, S. Thamarai Selvi, 'Early Detection of Breast Cancer using SVM Classifier Technique', International Journal on Computer Science and Engineering, Vol. 1, 2009, pp. 127-130.
[14] S. Mika, K.-R. Muller, 'An Introduction to Kernel-Based Learning Algorithms', IEEE Transactions on Neural Networks, Vol. 12, 2001, pp. 181-201.
[15] V. Vapnik, 'Statistical Learning Theory', Wiley, New York, 1998.
[16] Ahmet Sertbas, Niyazi Kilic, Onur Osman, Osman N. Ucan, Pelin Gorgel, 'Mammographic Mass Classification Using Wavelet Based Support Vector Machine', Journal of Electrical & Electronics Engineering, Vol. 9, 2009, pp. 867-875.