Portfolio

End-to-end Automatic Speech Recognition Systems - PyTorch

Published: November 19, 2021

Implements four deep learning architectures (MLP, CNN, RNN, ANN) to parse audio sentences (feature extraction).
- scrape, clean and preprocess audio data
- experiment with four architectures for both the extractor and classifier layers
- RNN extractor with RNN classifier performs best

Car Marketplace Data Mining

Published: November 14, 2021

Data analysis of the used car market, with extensive code for data cleaning and visualization. Predicts car prices from car features and identifies the features that significantly increase car sales.
- data cleaning and feature engineering
- predict car sales price
- identify significant car features to increase sales

LDA Topic Modeling on Movie Reviews - Spark

Published: November 11, 2021

LDA topic modeling of movie reviews with Spark.
- text preprocessing, tokenizing
- LDA modeling
- LDA visualizations

Interpretable Machine Learning

Published: April 07, 2021

Write-up on interpretable machine learning. Explains inherently interpretable machine learning models and the LIME algorithm.
- importance of interpretability
- what is interpretability
- inherently interpretable models (linear regression, decision tree)
- Local Interpretable Model-Agnostic Explanations (LIME) algorithm
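The core idea behind LIME can be sketched in a few lines. This is a simplified illustration for continuous features, not the write-up's own code and not the `lime` package: it assumes NumPy is available, and the names `lime_explain`, `black_box`, `scale` and `kernel_width` are hypothetical. It perturbs one instance, queries the black-box model, weights the samples by proximity, and fits a weighted linear surrogate whose slopes serve as the local explanation.

```python
import numpy as np

def lime_explain(predict, x, n_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a locally weighted linear surrogate around the instance x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # 1. Perturb the instance with Gaussian noise.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    # 2. Query the black-box model on the perturbed samples.
    y = predict(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Solve weighted least squares for slopes plus an intercept column.
    A = np.hstack([Z, np.ones((n_samples, 1))])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1], coef[-1]

# Hypothetical black-box model: linear, so the surrogate should recover it.
def black_box(Z):
    return 3.0 * Z[:, 0] - 2.0 * Z[:, 1]

slopes, intercept = lime_explain(black_box, [1.0, 2.0])
```

For a non-linear model the slopes would only describe behaviour near the chosen instance, which is the "local" part of the method.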

Download here

Lung Cancer Image Classification

Published: December 10, 2019

Comparative study of image feature extraction methods on lung cancer image classification.
- image feature extraction (feature descriptors) and image transformation of lung CT images
- clustering of image descriptors
- train classifiers and evaluate

Online Shopper Intention - SAS Enterprise Miner

Published: December 04, 2019

Uses SAS Enterprise Miner to classify online shoppers as potential buyers.
- data exploration using Graph, Stat, Cluster nodes
- data preprocessing with Replacement, Sampling, Partition nodes
- modeling with Regression, Neural Network, Decision Tree

Aerial Bombing in World War II - Tableau

Published: October 30, 2019

Visualization of bombing events during WWII.
- visualize bombing activities, industries bombed and aircraft evolution
- apply Sankey chart, map graph, scatter plot, line graph and bar chart

Next-Word Predictor

Published: May 24, 2019

Predicts the next word with TensorFlow. Implements RNN and LSTM networks to develop four models for various languages.
- collect data from the Twitter API
- LSTM model
- Django frontend

Search Engine of Cyber-Security Data

Published: May 17, 2019

Search engine built with Lucene.
- rank search results using TF-IDF scores
- users provide relevance feedback to improve retrieval effectiveness
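TF-IDF ranking can be illustrated with a small standard-library sketch. It is not Lucene's scoring formula and not the project's code: the whitespace tokenizer, the `log(N / df)` weighting, the sample documents and the function names are all assumptions chosen for brevity.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()

def rank(query, docs):
    """Return indices of matching documents, best TF-IDF score first."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    scored = []
    for i, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        # Sum of (relative term frequency * inverse document frequency).
        score = sum(tf[t] / len(tokens) * idf.get(t, 0.0) for t in tokenize(query))
        if score > 0:
            scored.append((score, i))
    # Highest score first; ties broken by document order.
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]

# Hypothetical sample documents.
docs = [
    "malware analysis report",
    "phishing email campaign targets banks",
    "ransomware malware malware outbreak",
]
ranking = rank("malware", docs)
```

Here the third document ranks above the first because "malware" makes up a larger share of its terms, and the second document is dropped because it does not match.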

Job Recommendation with ALS

Published: March 02, 2019

A job recommendation engine with implicit feedback. Two models were developed: the first uses content-based filtering, and the second implements Collaborative Filtering for Implicit Feedback.
- text processing and vectorization
- recommendation based on job and user similarity
- Alternating Least Squares (ALS) algorithm using implicit feedback
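As a rough sketch of how ALS handles implicit feedback, the block below follows the usual confidence-weighted formulation: interaction counts become a binary preference plus a confidence weight, and user and item factors are solved alternately in closed form. It assumes NumPy, and the interaction matrix, the hyperparameters (`factors`, `alpha`, `reg`) and the function names are illustrative values rather than anything taken from the project.

```python
import numpy as np

def implicit_als(R, factors=2, alpha=40.0, reg=0.1, iterations=10, seed=0):
    """Alternating least squares on confidence-weighted binary preferences."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)   # preference: 1 if any interaction was observed
    C = 1.0 + alpha * R         # confidence grows with the interaction count
    X = rng.normal(scale=0.1, size=(n_users, factors))
    Y = rng.normal(scale=0.1, size=(n_items, factors))
    ridge = reg * np.eye(factors)
    for _ in range(iterations):
        # Fix item factors, solve a small ridge regression for each user.
        for u in range(n_users):
            YtC = Y.T * C[u]
            X[u] = np.linalg.solve(YtC @ Y + ridge, YtC @ P[u])
        # Fix user factors, solve for each item.
        for i in range(n_items):
            XtC = X.T * C[:, i]
            Y[i] = np.linalg.solve(XtC @ X + ridge, XtC @ P[:, i])
    return X, Y

def als_loss(R, X, Y, alpha=40.0, reg=0.1):
    """Regularized, confidence-weighted squared error that ALS minimizes."""
    P = (R > 0).astype(float)
    C = 1.0 + alpha * R
    err = P - X @ Y.T
    return float((C * err ** 2).sum() + reg * ((X ** 2).sum() + (Y ** 2).sum()))

# Hypothetical user-by-job interaction counts (e.g. views or applications).
R = np.array([[3, 1, 0, 0],
              [2, 0, 1, 0],
              [0, 0, 4, 2],
              [0, 1, 0, 5]], dtype=float)
X, Y = implicit_als(R, iterations=5)
scores = X @ Y.T   # higher score = stronger predicted preference
```

Recommendations for a user would then be the highest-scoring jobs in that user's row of `scores` that the user has not already interacted with.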

Predictive Maintenance of Forklifts using Sensor Data

Published: September 29, 2018

Combines LSTM and EDM models to address anomaly classification and prediction in time series data, working with sensor data from forklifts used in storage and retrieval systems. Predictors are based on variance and median methods for handling anomalies.
- predict machine component failures using historical data
- train a classification model to improve the performance of outlier and breakout detection
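One common median-based way to flag outliers in sensor readings is the modified z-score built on the median absolute deviation (MAD). The sketch below shows that general technique only. It is an assumption about what a "median method" might look like, not the project's actual predictor, and the threshold, function name and sample readings are hypothetical.

```python
import statistics

def median_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    # Median absolute deviation: a robust alternative to the standard deviation.
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, nothing can be scored
    # 0.6745 rescales MAD to be comparable to a standard deviation
    # when the data are roughly normally distributed.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical sensor readings with one obvious spike.
readings = [10.1, 10.3, 9.9, 10.0, 10.2, 25.0, 10.1, 9.8]
flagged = median_outliers(readings)
```

Because the median and MAD are barely affected by the spike itself, the reading at index 5 is flagged while ordinary fluctuations are left alone.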

YEE Xun Wei
