Getting Started

Topic Modeling Documentation

A topic modeling toolkit built on Gensim that uses Latent Dirichlet Allocation (LDA) for document analysis and topic discovery.

Key Features

  • Document preprocessing and filtering
  • TF-IDF-based term filtering
  • Bigram detection
  • LDA model training with customizable parameters
  • Topic visualization
  • Model persistence and loading
  • Coherence score evaluation

Prerequisites

  • Python 3.8+
  • pandas
  • numpy
  • scikit-learn
  • scipy
  • matplotlib
  • gensim
  • pyLDAvis
  • spacy
  • nltk
  • bertopic
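One way to confirm that the environment meets these requirements is a small standard-library check such as the sketch below. The `PREREQUISITES` mapping and the `missing_prerequisites` helper are hypothetical names introduced here, not part of the project. The mapping assumes the usual import names, which match the package names except for scikit-learn (imported as `sklearn`).

```python
import sys
from importlib.util import find_spec

# Package name (as listed above) -> import name. Assumed mapping.
PREREQUISITES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "gensim": "gensim",
    "pyLDAvis": "pyLDAvis",
    "spacy": "spacy",
    "nltk": "nltk",
    "bertopic": "bertopic",
}


def missing_prerequisites(packages=PREREQUISITES):
    """Return the names of packages whose import module cannot be found."""
    return [name for name, module in packages.items() if find_spec(module) is None]


# Report the Python version requirement and any packages that are not installed.
python_ok = sys.version_info >= (3, 8)
print("Python 3.8+:", "yes" if python_ok else "no")
print("Missing packages:", missing_prerequisites() or "none")
```

This only checks that each module can be located. It does not import the packages or verify their versions.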