Topic Modeling of Reviews
(126725, 8)
| | review | review_datetime | data_source | app_name | upvote_count | total_comments | app_rating | sentiment |
|---|---|---|---|---|---|---|---|---|
| 0 | uber eats for owls? will they ever come out wi... | 2025-04-20 21:51:15 | | UberEats | 1.0 | 2.0 | NaN | Neutral |
| 1 | serious question yall is it worth going out to... | 2025-04-20 21:41:21 | | UberEats | 1.0 | 1.0 | NaN | Neutral |
| 2 | ubereats charged me for a successful chargebac... | 2025-04-20 20:50:04 | | UberEats | 1.0 | 2.0 | NaN | Negative |
| 3 | ubereats driver scammed me by buying half the ... | 2025-04-20 20:48:13 | | UberEats | 1.0 | 9.0 | NaN | Negative |
| 4 | ubereats why you do this? family went out of t... | 2025-04-20 20:19:15 | | UberEats | 1.0 | 3.0 | NaN | Negative |
Found 3 unique apps: UberEats, DoorDash, GrubHub
Coherence Score vs Number of Topics
    for app in unique_apps:
        print(f"\nProcessing app: {app}")

        # Select this app's reviews and preprocess them
        app_reviews = df[df['app_name'] == app]['review']
        processed_reviews = preprocess_reviews(app_reviews)
        #print(len(processed_reviews))

        # Train one LDA model per topic count and plot the coherence scores
        coherence_scores, models = compute_coherence_values(processed_reviews, start=2, limit=20, step=1)
        plot_coherence_scores(coherence_scores, app_name=app)

        # Pick the topic count with the highest coherence score
        best_num_topics, best_score = max(coherence_scores, key=lambda x: x[1])
        print(f"Best number of topics for {app}: {best_num_topics} with coherence score {best_score:.4f}")

        # Retrieve the model trained with that topic count
        for num, model, corpus, dictionary in models:
            if num == best_num_topics:
                best_model, best_corpus, best_dictionary = model, corpus, dictionary
                break

        visualize_topics_pyldavis(best_model, best_corpus, best_dictionary, app_name=app)

    print("\n✅ All Apps Processed Successfully!")
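The selection step in the loop can be illustrated in isolation. The definition of `compute_coherence_values` is not shown in this report, so the sketch below is only an assumption about its shape: `sweep_topic_counts`, `pick_best`, and `toy_score` are hypothetical names, and `toy_score` is a stand-in for training an LDA model and measuring its coherence. Since the logs show models trained for 2 through 20 topics with `limit=20`, the sketch treats `limit` as inclusive.

```python
from typing import Callable, List, Tuple

ScoreList = List[Tuple[int, float]]


def sweep_topic_counts(score_fn: Callable[[int], float],
                       start: int = 2, limit: int = 20, step: int = 1) -> ScoreList:
    """Return (num_topics, score) pairs for start..limit, with limit included."""
    return [(k, score_fn(k)) for k in range(start, limit + 1, step)]


def pick_best(scores: ScoreList) -> Tuple[int, float]:
    """Return the pair with the highest score (the first one wins on ties)."""
    return max(scores, key=lambda pair: pair[1])


def toy_score(num_topics: int) -> float:
    """Hypothetical stand-in for 'train LDA, then compute coherence'."""
    return 1.0 / (1 + abs(num_topics - 7))


scores = sweep_topic_counts(toy_score)
best_k, best_score = pick_best(scores)
print(f"Tried {len(scores)} topic counts; best is {best_k}")
```

Picking the single highest score is simple, but coherence curves are often noisy, so it may be worth comparing the chosen value against the plotted curve before settling on it.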
Processing app: UberEats
Preprocessing reviews...
Completed preprocessing 64153 reviews.
Computing coherence scores...
Training LDA with 2 topics...
Training LDA with 3 topics...
Training LDA with 4 topics...
Training LDA with 5 topics...
Training LDA with 6 topics...
Training LDA with 7 topics...
Training LDA with 8 topics...
Training LDA with 9 topics...
Training LDA with 10 topics...
Training LDA with 11 topics...
Training LDA with 12 topics...
Training LDA with 13 topics...
Training LDA with 14 topics...
Training LDA with 15 topics...
Training LDA with 16 topics...
Training LDA with 17 topics...
Training LDA with 18 topics...
Training LDA with 19 topics...
Training LDA with 20 topics...
Completed coherence score calculation.
Best number of topics for UberEats: 15 with coherence score 0.5595
Preparing pyLDAvis visualization for UberEats...
Saved pyLDAvis visualization to UberEats_topics.html 🚀
Processing app: DoorDash
Preprocessing reviews...
Completed preprocessing 53719 reviews.
Computing coherence scores...
Training LDA with 2 topics...
Training LDA with 3 topics...
Training LDA with 4 topics...
Training LDA with 5 topics...
Training LDA with 6 topics...
Training LDA with 7 topics...
Training LDA with 8 topics...
Training LDA with 9 topics...
Training LDA with 10 topics...
Training LDA with 11 topics...
Training LDA with 12 topics...
Training LDA with 13 topics...
Training LDA with 14 topics...
Training LDA with 15 topics...
Training LDA with 16 topics...
Training LDA with 17 topics...
Training LDA with 18 topics...
Training LDA with 19 topics...
Training LDA with 20 topics...
Completed coherence score calculation.
Best number of topics for DoorDash: 6 with coherence score 0.5708
Preparing pyLDAvis visualization for DoorDash...
Saved pyLDAvis visualization to DoorDash_topics.html 🚀
Processing app: GrubHub
Preprocessing reviews...
Completed preprocessing 8844 reviews.
Computing coherence scores...
Training LDA with 2 topics...
Training LDA with 3 topics...
Training LDA with 4 topics...
Training LDA with 5 topics...
Training LDA with 6 topics...
Training LDA with 7 topics...
Training LDA with 8 topics...
Training LDA with 9 topics...
Training LDA with 10 topics...
Training LDA with 11 topics...
Training LDA with 12 topics...
Training LDA with 13 topics...
Training LDA with 14 topics...
Training LDA with 15 topics...
Training LDA with 16 topics...
Training LDA with 17 topics...
Training LDA with 18 topics...
Training LDA with 19 topics...
Training LDA with 20 topics...
Completed coherence score calculation.
Best number of topics for GrubHub: 4 with coherence score 0.5687
Preparing pyLDAvis visualization for GrubHub...
Saved pyLDAvis visualization to GrubHub_topics.html 🚀
✅ All Apps Processed Successfully!
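The per-app results are scattered through the log output above. As a rough sketch, the relevant lines can be collected into one summary with regular expressions. The `summarize` function and its field names are hypothetical, and pairing each review count with the next "Best number of topics" line assumes the log keeps the order shown above.

```python
import re

# Log lines copied from the run output above
LOG = """\
Completed preprocessing 64153 reviews.
Best number of topics for UberEats: 15 with coherence score 0.5595
Completed preprocessing 53719 reviews.
Best number of topics for DoorDash: 6 with coherence score 0.5708
Completed preprocessing 8844 reviews.
Best number of topics for GrubHub: 4 with coherence score 0.5687
"""

COUNT_RE = re.compile(r"Completed preprocessing (\d+) reviews\.")
BEST_RE = re.compile(
    r"Best number of topics for (\w+): (\d+) with coherence score ([\d.]+)"
)


def summarize(log: str) -> list:
    """Pair each review count with the following best-topic line, in order."""
    counts = [int(n) for n in COUNT_RE.findall(log)]
    best = BEST_RE.findall(log)
    return [
        {"app": app, "reviews": n, "num_topics": int(k), "coherence": float(score)}
        for n, (app, k, score) in zip(counts, best)
    ]


summary = summarize(LOG)
for row in summary:
    print(f"{row['app']:<9} {row['reviews']:>6} reviews  "
          f"{row['num_topics']:>2} topics  coherence {row['coherence']:.4f}")
```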