An end-to-end ML system that detects phishing websites with 97.11% accuracy, protecting users from fraudulent sites through real-time API inference.
I built a complete phishing detection pipeline around a Random Forest model that reaches 97.11% accuracy. The system processes URL features, trains on 30 website attributes, and serves predictions through a FastAPI service, all containerized with Docker for scalable deployment.
Phishing websites steal sensitive user data by mimicking legitimate sites. Traditional detection methods are often slow and fail to catch sophisticated attacks, leaving users vulnerable.
I developed an ML pipeline that analyzes 30 URL and metadata features with a Random Forest classifier, achieving 97.11% accuracy, and deployed it behind a real-time FastAPI service for instant protection.
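To make the training step concrete, here is a minimal sketch of fitting a Random Forest on 30 features with scikit-learn. It is not the project's actual training code: the data is synthetic, the assumption that each feature takes the values -1, 0, or 1 is mine, and the labeling rule is a toy one, so the printed accuracy says nothing about the 97.11% figure reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

N_FEATURES = 30  # matches the 30 website attributes described above

# Synthetic stand-in data (assumption): each feature is -1, 0, or 1.
rng = np.random.default_rng(42)
X = rng.integers(-1, 2, size=(1000, N_FEATURES))

# Toy labeling rule so the model has something to learn:
# label 1 if the feature sum is positive, otherwise -1.
y = np.where(X.sum(axis=1) > 0, 1, -1)

# Hold out 20% of the rows to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy on synthetic data: {accuracy:.4f}")
```

With the real dataset, `X` and `y` would come from the ingestion step instead of the random generator; the split, fit, and scoring calls would stay the same.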
End-to-End ML Pipeline
High Accuracy (97.11%)
Real-Time Inference API
Containerized with Docker
Multi-Model Training
Feature-Rich Analysis
Architecture: Modular pipeline with separate components for ingestion, validation, training, and inference. FastAPI provides a lightweight, high-performance API.
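The separation of stages can be sketched in plain Python as below. This is an illustration of the modular idea only, not the repository's code: the function names, the `Sample` type, the allowed value set, and the threshold-based stand-in for the trained model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

N_FEATURES = 30
ALLOWED_VALUES = {-1, 0, 1}  # assumption about how features are encoded


@dataclass
class Sample:
    features: List[int]
    label: int  # 1 or -1; the meaning of each value is an assumption


def ingest(lines: Sequence[str]) -> List[Sample]:
    """Ingestion: parse comma-separated rows; the last column is the label."""
    samples = []
    for line in lines:
        *features, label = (int(v) for v in line.split(","))
        samples.append(Sample(features, label))
    return samples


def validate(samples: Sequence[Sample]) -> List[Sample]:
    """Validation: drop rows with the wrong width or out-of-range values."""
    return [
        s for s in samples
        if len(s.features) == N_FEATURES
        and all(v in ALLOWED_VALUES for v in s.features)
        and s.label in (-1, 1)
    ]


def train(samples: Sequence[Sample]) -> Callable[[List[int]], int]:
    """Training stand-in: a threshold on the feature sum, NOT a Random Forest."""
    sums = {1: [], -1: []}
    for s in samples:
        sums[s.label].append(sum(s.features))
    if not sums[1] or not sums[-1]:
        raise ValueError("need at least one sample per class")
    mean_pos = sum(sums[1]) / len(sums[1])
    mean_neg = sum(sums[-1]) / len(sums[-1])
    threshold = (mean_pos + mean_neg) / 2
    sign = 1 if mean_pos >= mean_neg else -1

    def predict(features: List[int]) -> int:
        """Inference: classify one feature vector."""
        return 1 if sign * (sum(features) - threshold) > 0 else -1

    return predict


# Two well-formed rows plus one malformed row that validation removes.
lines = [
    ",".join(["1"] * N_FEATURES + ["1"]),
    ",".join(["-1"] * N_FEATURES + ["-1"]),
    "1,0,1",
]
samples = validate(ingest(lines))
model = train(samples)
print(len(samples), model([1] * N_FEATURES), model([-1] * N_FEATURES))
```

Because each stage only depends on the output of the previous one, a stage can be swapped out (for example, replacing the stand-in `train` with a real classifier) without touching the others, and the inference function is what an API handler would call.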
I was the sole developer for this project, responsible for everything from data analysis and model training to API development and containerization.
The dataset was sourced from the UCI Machine Learning Repository.
# Clone and setup
git clone https://github.com/sujeetgund/phishing-website-detection.git
cd phishing-website-detection
pip install -r requirements.txt

# Run API server
uvicorn run_api:app --reload
I'm passionate about building ML systems that solve real-world security problems. Let's discuss how we can collaborate.