Features:

1. Advanced NLP Preprocessing:

  • URL, email, and HTML tag removal
  • Tokenization with POS tagging
  • Lemmatization with WordNet POS mapping
  • Stop word removal
  • Feature extraction (text length, spam indicators, special characters, etc.)
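The cleaning and feature-extraction steps listed above could look roughly like the sketch below. It uses only the standard library; the regular expressions, the SPAM_WORDS list, and the function names are illustrative assumptions, not the application's actual code. The penn_to_wordnet helper shows the usual mapping from Penn Treebank POS tags to the single-letter WordNet POS codes that NLTK's WordNetLemmatizer.lemmatize accepts as its pos argument.

```python
import re
import string

# Illustrative patterns (assumptions, not the application's own).
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
HTML_TAG_RE = re.compile(r"<[^>]+>")

# Hypothetical indicator words, chosen only for this example.
SPAM_WORDS = ("free", "winner", "urgent", "click", "prize")


def clean_text(text):
    """Strip HTML tags, URLs, and email addresses, then collapse whitespace."""
    text = HTML_TAG_RE.sub(" ", text)  # tags first, so URLs inside attributes go too
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS code ('a', 'v', 'r', or 'n')."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"      # default to noun


def extract_features(text):
    """Compute a few simple numeric features from a raw email body."""
    words = re.findall(r"[a-z']+", clean_text(text).lower())
    letters = [ch for ch in text if ch.isalpha()]
    return {
        "text_length": len(text),
        "word_count": len(words),
        "url_count": len(URL_RE.findall(text)),
        "exclamation_count": text.count("!"),
        "special_char_count": sum(ch in string.punctuation for ch in text),
        "uppercase_ratio": (
            sum(ch.isupper() for ch in letters) / len(letters) if letters else 0.0
        ),
        "spam_word_count": sum(word in SPAM_WORDS for word in words),
    }


sample = "<p>WINNER! Click http://spam.example/win or mail bob@example.com</p>"
cleaned = clean_text(sample)
features = extract_features(sample)
```

Removing HTML tags before URLs keeps a closing tag from being swallowed by the URL pattern when the two are adjacent.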

2. Multiple ML Models:

  • Random Forest
  • SVM with linear kernel
  • Naive Bayes
  • Logistic Regression
  • Gradient Boosting
  • XGBoost
  • Neural Network (MLP)
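One way to organize the models listed above is a name-to-estimator dictionary, which also maps naturally onto a selection dropdown. The sketch below is an assumption about how that might be done; build_models and every hyperparameter shown are illustrative choices rather than the application's settings. XGBoost is treated as optional so the rest still works when it is not installed.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def build_models(random_state=42):
    """Return a dict of display name -> untrained scikit-learn style classifier."""
    models = {
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
        # probability=True enables predict_proba on the SVM.
        "SVM (linear)": SVC(
            kernel="linear", probability=True, random_state=random_state
        ),
        "Naive Bayes": MultinomialNB(),
        "Logistic Regression": LogisticRegression(
            max_iter=1000, random_state=random_state
        ),
        "Gradient Boosting": GradientBoostingClassifier(random_state=random_state),
        "Neural Network (MLP)": MLPClassifier(
            hidden_layer_sizes=(64,), max_iter=500, random_state=random_state
        ),
    }
    try:
        # Optional dependency: skipped if xgboost is not installed.
        from xgboost import XGBClassifier

        models["XGBoost"] = XGBClassifier(
            eval_metric="logloss", random_state=random_state
        )
    except ImportError:
        pass
    return models


models = build_models()
```

Note that MultinomialNB expects non-negative inputs, so it pairs with count or TF-IDF vectors but not with standardized numeric features.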

3. GUI Features:

  • Modern interface with tabs
  • Email input with sample emails
  • Real-time analysis with feature extraction
  • Model selection dropdown
  • Performance metrics display
  • Visualization capabilities
  • Dataset loading functionality

4. Advanced Features:

  • Spam indicator detection
  • Feature importance visualization
  • Multiple vectorization techniques
  • Cross-validation ready
  • Real-time predictions with probabilities
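As a rough illustration of cross-validation and probability output, the sketch below chains a TF-IDF vectorizer and a logistic regression in a scikit-learn pipeline. The toy emails, the choice of vectorizer and model, and the two-fold split are assumptions made to keep the example small; they are not taken from the application.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Tiny made-up dataset: 1 = spam, 0 = ham.
texts = [
    "Win a free prize now, click here",
    "Urgent: claim your free reward today",
    "You are a winner, free cash inside",
    "Click now for a limited free offer",
    "Are we still meeting for lunch tomorrow",
    "Please review the attached project report",
    "Can you send me the meeting notes",
    "The report is due on Friday afternoon",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Vectorizer and classifier travel together, so each CV fold refits both.
pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)

# Stratified 2-fold cross-validation (a real dataset would use more folds).
scores = cross_val_score(pipeline, texts, labels, cv=2, scoring="accuracy")

# Fit on all data, then score one new email.
pipeline.fit(texts, labels)
ham_probability, spam_probability = pipeline.predict_proba(
    ["Claim your free prize now"]
)[0]
```

Keeping the vectorizer inside the pipeline prevents vocabulary from the held-out fold leaking into training. The probability columns follow the sorted class labels, so index 0 is ham and index 1 is spam.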

How to Use:

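  1. Install the dependencies and run the application. tkinter is part of the Python standard library and cannot be installed with pip; on some Linux distributions it comes from a separate system package such as python3-tk.

pip install nltk scikit-learn pandas numpy matplotlib seaborn xgboost

python spam_classifier.py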

  2. Load a dataset (CSV format with 'text' and 'label' columns)
  3. Train models using the "Train Models" button
  4. Analyze emails by typing or loading samples
  5. View results, including:
    • Spam/Ham prediction
    • Confidence probabilities
    • Extracted features
    • Spam indicators found

Dataset Format:

The classifier expects a CSV file with:

  • text: Email content
  • label: 1 for spam, 0 for ham
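A small check of this format before training can catch malformed files early. The sketch below is an assumption about how such validation could be written with the standard csv module; load_dataset and its error messages are hypothetical, and pandas.read_csv would serve equally well for the reading step.

```python
import csv
import io

REQUIRED_COLUMNS = ("text", "label")


def load_dataset(csv_file):
    """Read (texts, labels) from an open CSV file, validating the expected format."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))

    texts, labels = [], []
    for row_number, row in enumerate(reader, start=1):
        label = (row["label"] or "").strip()
        if label not in ("0", "1"):
            raise ValueError(
                f"data row {row_number}: label must be 0 or 1, got {label!r}"
            )
        texts.append(row["text"] or "")
        labels.append(int(label))
    return texts, labels


# In-memory example; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    'text,label\n"Win a free prize now!",1\n"Lunch at noon?",0\n'
)
texts, labels = load_dataset(sample)
```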

Customization:

You can easily:

  • Add more spam indicators in the AdvancedTextPreprocessor class
  • Modify feature extraction parameters
  • Add new ML models to the ensemble
  • Customize the GUI appearance
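For the first item, adding indicators might be as simple as extending a list. The internals of AdvancedTextPreprocessor are not shown here, so the class below is a stand-in written only for this illustration: the spam_indicators attribute, the find_spam_indicators method, and the phrases themselves are all hypothetical.

```python
import re


class AdvancedTextPreprocessor:
    """Stand-in for illustration only; the real class may be organized differently."""

    # Hypothetical default indicators.
    spam_indicators = ["free", "winner", "act now"]

    def find_spam_indicators(self, text):
        """Return the indicator phrases that occur in the text as whole words."""
        lowered = text.lower()
        return [
            phrase
            for phrase in self.spam_indicators
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
        ]


class CustomPreprocessor(AdvancedTextPreprocessor):
    # Extend the defaults instead of editing the base class in place.
    spam_indicators = AdvancedTextPreprocessor.spam_indicators + [
        "wire transfer",
        "limited time",
    ]


found = CustomPreprocessor().find_spam_indicators(
    "Limited time offer: claim your FREE gift"
)
```

Subclassing keeps the custom phrases separate from the defaults, which makes them easier to remove if they turn out to cause false positives.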

The application covers the full email spam classification workflow: batch training on a labeled dataset and real-time analysis of individual emails.

 
