LDA topic modeling of movie reviews with Spark.
A topic modeling pipeline that uses Spark NLP for preprocessing and Spark MLlib’s LDA to extract topics from the data.
Generates the topic words associated with a textual review. A latent Dirichlet allocation (LDA) model is used to discover topics in the reviews.
Data from Amazon movie reviews
topic_modeling.ipynb
Overview of LDA topic modeling and visualization.
topic_modeling.py
Implementation of Spark data preprocessing and modeling.
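For orientation, the preprocessing and modeling steps can be sketched roughly as follows. This is a minimal illustration, not the code in topic_modeling.py. It swaps Spark NLP for MLlib's built-in feature transformers to keep the example short, and the function names, the "review" column name, and all parameter values are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple


def topic_words(
    vocabulary: Sequence[str],
    term_indices: Sequence[int],
    term_weights: Sequence[float],
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Map the term indices of one LDA topic back to vocabulary words.

    Hypothetical helper: keeps the first `top_n` (word, weight) pairs in the
    order given (describeTopics returns terms by decreasing weight).
    """
    pairs = [(vocabulary[i], w) for i, w in zip(term_indices, term_weights)]
    return pairs[:top_n]


def fit_lda_topics(reviews_df, text_col: str = "review", k: int = 10,
                   max_iter: int = 20, vocab_size: int = 5000):
    """Fit tokenizer -> stop-word removal -> bag-of-words -> LDA.

    `reviews_df` is assumed to be a Spark DataFrame with a string column
    named by `text_col`. Returns one list of (word, weight) pairs per topic.
    """
    # Imported here so topic_words() can be used without a Spark install.
    from pyspark.ml import Pipeline
    from pyspark.ml.clustering import LDA
    from pyspark.ml.feature import CountVectorizer, RegexTokenizer, StopWordsRemover

    # Split on non-word characters and drop very short tokens.
    tokenizer = RegexTokenizer(inputCol=text_col, outputCol="tokens",
                               pattern="\\W+", minTokenLength=3)
    remover = StopWordsRemover(inputCol="tokens", outputCol="filtered")
    # LDA expects term-count vectors, so build a bag-of-words representation.
    vectorizer = CountVectorizer(inputCol="filtered", outputCol="features",
                                 vocabSize=vocab_size)
    lda = LDA(k=k, maxIter=max_iter, featuresCol="features", seed=42)

    model = Pipeline(stages=[tokenizer, remover, vectorizer, lda]).fit(reviews_df)
    vocabulary = model.stages[2].vocabulary   # fitted CountVectorizerModel
    lda_model = model.stages[3]               # fitted LDAModel

    rows = lda_model.describeTopics(maxTermsPerTopic=10).collect()
    return [
        topic_words(vocabulary, row["termIndices"], row["termWeights"], top_n=10)
        for row in rows
    ]
```

Calling fit_lda_topics(reviews_df, k=10) on a DataFrame of reviews would return ten lists of weighted words, one per topic. The number of topics and the vocabulary size are tuning choices and would need to be adjusted for the actual dataset.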