Lung cancer image classification in Python using LIDC dataset.
View the Project on GitHub yeexunwei/lung-cancer-image-classification
Images are processed using local feature descriptors and image transformation methods before being fed into classifiers.
- config.py: global variables
- preprocessing.py: preprocessing methods
- image_processing.py: image transformation methods
- import_data.py: read and convert raw data
- data_lidc.py: generates features from LIDC dataset
- main.py: train models
- Models Comparison.ipynb: models comparison

The data, sourced from cancerimagingarchive.net, consists of 1018 labelled CT scan cases.
Dataset CT scan slices.
Data in DICOM format is read into arrays.
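A minimal sketch of this step, assuming the third-party pydicom package is used to parse the files (import_data.py in the repository may do this differently). The names `load_scan` and `stack_slices` are hypothetical, as is the assumption that each scan sits in its own folder of `.dcm` files.

```python
from pathlib import Path

import numpy as np


def stack_slices(slices):
    """Stack (z_position, 2-D array) pairs into one 3-D volume, ordered by z."""
    ordered = sorted(slices, key=lambda item: item[0])
    return np.stack([pixels for _, pixels in ordered])


def load_scan(folder):
    """Read every .dcm file in `folder` into a (slices, rows, cols) array."""
    import pydicom  # third-party; imported here so stack_slices works without it

    slices = []
    for path in Path(folder).glob("*.dcm"):
        ds = pydicom.dcmread(path)
        # ImagePositionPatient[2] is the slice position along the scan axis
        slices.append((float(ds.ImagePositionPatient[2]), ds.pixel_array))
    return stack_slices(slices)
```

Sorting by slice position matters because file names in a scan folder do not necessarily follow anatomical order.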
Flow of data to classifiers.
The k-means algorithm is used to group the features extracted from images. Transformed images are fed directly into the classifiers. The diagram compares each local feature descriptor and image transformation method.
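One common way to turn grouped local features into classifier input is a bag-of-features encoding: cluster all descriptors into a codebook, then describe each image by a histogram of its nearest codewords. The sketch below assumes that approach and uses a plain NumPy version of Lloyd's k-means as a stand-in for whatever clustering implementation the repository uses. `fit_codebook`, `assign`, and `encode` are hypothetical names.

```python
import numpy as np


def assign(descriptors, centroids):
    """Index of the nearest centroid for every descriptor."""
    # Squared distance from every descriptor to every centroid, shape (n, k)
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def fit_codebook(descriptors, k, n_iter=20, seed=0):
    """Lloyd's k-means: returns k centroids (the codebook)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct descriptors chosen at random
    start = rng.choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[start].astype(float)
    for _ in range(n_iter):
        labels = assign(descriptors, centroids)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def encode(descriptors, centroids):
    """Normalised histogram of codeword assignments for one image."""
    labels = assign(descriptors, centroids)
    hist = np.bincount(labels, minlength=len(centroids)).astype(float)
    return hist / max(hist.sum(), 1.0)
```

Every image then yields a fixed-length vector of size `k`, regardless of how many descriptors were extracted from it, which is what most classifiers expect.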
One example of an image transformation: the wavelet transform.
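To illustrate what a wavelet transform does to a slice, here is a minimal sketch of a 2-D Haar decomposition in NumPy. The choice of the Haar wavelet is an assumption; the repository may use a different wavelet or a library such as PyWavelets. `haar_dwt2` and `haar_approximation` are hypothetical names.

```python
import numpy as np


def haar_dwt2(image):
    """One level of the 2-D Haar transform; returns (ll, (lh, hl, hh)).

    First letter: filter applied along each row (l = low-pass, h = high-pass).
    Second letter: filter applied down each column.
    """
    x = np.asarray(image, dtype=float)
    # Trim to even height and width so pixels pair up cleanly
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]
    # Along rows: scaled sums (low-pass) and differences (high-pass) of pixel pairs
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # Down columns: the same pairing applied to both intermediate results
    ll = (lo[0::2] + lo[1::2]) / np.sqrt(2)
    lh = (lo[0::2] - lo[1::2]) / np.sqrt(2)
    hl = (hi[0::2] + hi[1::2]) / np.sqrt(2)
    hh = (hi[0::2] - hi[1::2]) / np.sqrt(2)
    return ll, (lh, hl, hh)


def haar_approximation(image, levels):
    """Apply haar_dwt2 repeatedly, keeping only the approximation band."""
    ll = np.asarray(image, dtype=float)
    for _ in range(levels):
        ll, _details = haar_dwt2(ll)
    return ll
```

Each level halves the height and width of the approximation band, so `haar_approximation(image, 3)` on a 512 x 512 slice yields a 64 x 64 array.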
Best accuracy was obtained after the 3rd wavelet transformation and LBP clustering.
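LBP (local binary patterns) encodes each pixel by comparing it with its neighbours. The sketch below shows the basic 3x3 variant in NumPy as an illustration only; the radius, neighbour count, and library used in the repository are not stated here, so treat those choices as assumptions. `lbp_codes` and `lbp_histogram` are hypothetical names.

```python
import numpy as np

# Offsets of the 8 neighbours, clockwise from the top-left pixel
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Basic 3x3 LBP code (0-255) for every interior pixel."""
    x = np.asarray(image, dtype=float)
    h, w = x.shape
    center = x[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(_OFFSETS):
        # Same-sized window shifted towards this neighbour
        neighbour = x[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit where the neighbour is at least as bright as the centre
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes for one image."""
    codes = lbp_codes(image)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

The per-pixel codes or their histograms can then serve as the local features that are grouped by clustering.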
Screenshot of the Flask app running.
This was my first time experimenting with a large dataset. Next time, I would use a data pipeline for clean and reusable code, and try Hadoop to work around insufficient memory.