📊 Project Overview
- Datasets analysed: U.S. store locations and Google reviews
- Methods & tooling: Python (Pandas, NumPy, Matplotlib, Seaborn) plus NLP (NLTK, BERT via 🤗 Transformers)
- Analytic focus areas:
  - Store distribution across states
  - Review-based sentiment scoring
  - Category breakdown (Women’s, Men’s, Kids’, Accessories)
  - Average ratings & review volumes
  - Weekly operating-hour patterns
  - Topic modelling & word-frequency analysis
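
As a rough illustration of the "average ratings & review volumes" focus area, the sketch below aggregates ratings per brand segment with Pandas. The column names (`segment`, `state`, `rating`), the `summarise_ratings` helper, and the toy rows are all assumptions made for this example; they are not the project's actual schema or data.

```python
import pandas as pd


def summarise_ratings(reviews: pd.DataFrame) -> pd.DataFrame:
    """Return the mean rating and review count for each brand segment."""
    return (
        reviews.groupby("segment")["rating"]
        .agg(avg_rating="mean", review_count="count")
        .round(2)
    )


# Hypothetical toy rows, invented for illustration only.
reviews = pd.DataFrame(
    {
        "segment": ["fast_fashion", "fast_fashion", "other", "other", "other"],
        "state": ["PA", "NJ", "PA", "AZ", "AZ"],
        "rating": [2.0, 3.0, 4.0, 3.0, 5.0],
    }
)

summary = summarise_ratings(reviews)
print(summary)
```

Grouping by `state` instead of `segment` would give a comparable per-state view, assuming the real data carries a state column.
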
🔍 Key Insights
| Theme | Key Takeaways |
|---|---|
| Store strategy | Fast-fashion brands outnumber competitors but trail in customer satisfaction. |
| State opportunities | 📈 Growth potential identified in PA, NJ, and AZ. |
| Customer sentiment | Fast-fashion avg rating 2.79⭐ vs. other brands 3.64⭐. Reviews focus on price & variety; negatives cite quality and service. |
| NLP findings | Fast-fashion reviews cluster around “price”, “cheap”, “return”; traditional retail around “tailoring”, “bespoke”, “sustainability”. |
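
A word-frequency pass of the kind behind the NLP findings can be sketched with the standard library alone. This is a simplified stand-in, not the project's pipeline: the `top_words` helper, the small stop-word set, and the sample reviews are invented for illustration, and a fuller stop-word list (for example NLTK's) would be the more realistic choice.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set (an assumption, not the project's list).
STOP_WORDS = {"the", "a", "and", "but", "was", "is", "to", "of", "for", "very"}


def top_words(reviews, n=3):
    """Count lowercase word tokens across reviews, skipping stop words."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical review snippets, made up for this example.
sample_reviews = [
    "Cheap price but the return was slow",
    "Great price and cheap basics",
    "Return policy is fine, price is low",
]

print(top_words(sample_reviews))
```

Running the same function separately on each brand segment's reviews would allow the top terms to be compared side by side.
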
🧠 Tech Stack
| Layer | Tools & Libraries |
|---|---|
| Language | Python (Jupyter Notebook) |
| Data wrangling | Pandas, NumPy |
| Viz | Matplotlib, Seaborn |
| NLP / ML | NLTK, Scikit-learn, BERT |
| Outputs | Custom plots (bars, KDEs), topic-model dashboards |



