Lung cancer image classification in Python using LIDC dataset.
View the Project on GitHub yeexunwei/lung-cancer-image-classification
Images are processed using local feature descriptors and transformation methods before being fed into classifiers.
- config.py - global variables
- preprocessing.py - preprocessing methods
- image_processing.py - image transformation methods
- import_data.py - read and convert raw data
- data_lidc.py - generates features from LIDC dataset
- main.py - train models
- Models Comparison.ipynb - models comparison

The data, sourced from cancerimagingarchive.net, consists of 1018 labelled CT scan cases.
| ![]() |
|---|
| Dataset CT scan slices. |
Data in DICOM format is read into arrays.
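
A minimal sketch of this step, assuming the `pydicom` package is used as the reader (the project does not state which library `import_data.py` relies on). The function names `to_hounsfield` and `load_slice` are hypothetical, and the rescale to Hounsfield units is a common CT convention rather than something confirmed for this project.

```python
import numpy as np


def to_hounsfield(pixels, slope=1.0, intercept=0.0):
    """Convert raw stored CT values to Hounsfield units: raw * slope + intercept."""
    return np.asarray(pixels, dtype=np.float32) * float(slope) + float(intercept)


def load_slice(path):
    """Read one DICOM file and return its pixel data as a 2-D float array.

    Assumption: pydicom is installed. It is imported lazily so that the
    conversion helper above can be used without it.
    """
    import pydicom  # assumption: not confirmed as the project's reader

    ds = pydicom.dcmread(path)
    # Fall back to an identity rescale when the tags are missing.
    slope = getattr(ds, "RescaleSlope", 1.0)
    intercept = getattr(ds, "RescaleIntercept", 0.0)
    return to_hounsfield(ds.pixel_array, slope, intercept)
```

Calling `load_slice` on each file of a scan and stacking the results with `np.stack` would give one 3-D array per case.
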
| ![]() |
|---|
| Flow of data to classifiers. |
The K-means algorithm is used to group features extracted from the images. Transformed images are fed directly into the classifiers. A comparison is made for each of the local feature descriptors and image transformation methods in the diagram.
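
The grouping step can be sketched as follows. This is an illustration under assumptions, not the project's code: it uses a plain NumPy implementation of Lloyd's k-means, and it assumes each image is then summarised as a histogram of cluster assignments (a bag-of-features encoding), which the text above does not spell out. The names `fit_codebook` and `encode_image` are hypothetical, and the random descriptors stand in for real ones.

```python
import numpy as np


def nearest_centroid(points, centroids):
    """Index of the closest centroid for each row of `points`."""
    # Squared Euclidean distances, shape (n_points, n_centroids).
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def fit_codebook(descriptors, k, n_iter=20, seed=0):
    """Cluster descriptors pooled from many images into k centroids."""
    points = np.asarray(descriptors, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_centroid(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def encode_image(descriptors, centroids):
    """Normalised histogram of cluster assignments for one image."""
    points = np.asarray(descriptors, dtype=np.float64)
    labels = nearest_centroid(points, centroids)
    hist = np.bincount(labels, minlength=len(centroids)).astype(np.float64)
    return hist / hist.sum()


# Stand-in data: 200 descriptors of length 8 pooled from several images.
rng = np.random.default_rng(1)
pooled = rng.normal(size=(200, 8))
codebook = fit_codebook(pooled, k=5)
image_vector = encode_image(pooled[:40], codebook)  # fixed-length classifier input
```

Whatever the number of descriptors per image, `image_vector` has length `k`, which is what lets a standard classifier consume it.
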
| One example of an image transformation: the wavelet transform. |
| ![]() |
|---|
| Best accuracy was obtained after the 3rd wavelet transformation and LBP clustering. |
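
One possible reading of this combination is sketched below in NumPy. It assumes that "3rd wavelet transformation" means the approximation band after three levels of a Haar wavelet, and that LBP means the basic 3x3 local binary pattern. Neither the wavelet family nor the LBP variant is stated here, so both are assumptions, and all function names are hypothetical.

```python
import numpy as np


def haar_level(img):
    """One level of a 2-D Haar transform; returns (approximation, details)."""
    img = np.asarray(img, dtype=np.float64)
    # Crop to even dimensions so pixels pair up into 2x2 blocks.
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2]
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    details = (
        (a - b + c - d) / 2.0,  # differences across columns
        (a + b - c - d) / 2.0,  # differences across rows
        (a - b - c + d) / 2.0,  # diagonal differences
    )
    return approx, details


def haar_approx(img, levels=3):
    """Approximation band after `levels` repeated Haar decompositions."""
    approx = img
    for _ in range(levels):
        approx, _details = haar_level(approx)
    return approx


def lbp_codes(img):
    """Basic 3x3 local binary pattern code for every interior pixel."""
    img = np.asarray(img, dtype=np.float64)
    center = img[1:-1, 1:-1]
    h, w = center.shape
    # Eight neighbours, clockwise from the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros((h, w), dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[dy : dy + h, dx : dx + w]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes."""
    hist = np.bincount(lbp_codes(img).ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


# Stand-in for a CT slice; real input would come from the DICOM data.
rng = np.random.default_rng(0)
ct_slice = rng.integers(0, 255, size=(64, 64))
approx = haar_approx(ct_slice, levels=3)  # 64x64 -> 8x8
features = lbp_histogram(approx)
```

Each Haar level halves both image dimensions, so three levels reduce a 64x64 slice to 8x8 before the LBP codes are computed.
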
| ![]() |
|---|
| Screenshot of the Flask app running. |
This was my first time experimenting on a large dataset. Make use of a data pipeline for clean and reusable code. Try Hadoop to handle insufficient memory.