Every day, millions of people post reviews, comments, tweets, and feedback.
How do companies analyze what people REALLY think when there are 10,000 reviews to read?
You're about to learn:
• How computers "read" sentiment (happy, angry, sad)
• Why AI sometimes gets it completely wrong
• How to detect when text analysis is biased
By the end, you'll know which methods to trust... and which to question.
You work for a restaurant chain.
Last month, you got 5,000 customer reviews on Yelp, Google, and social media.
Your boss asks: "Are customers happy or unhappy? I need to know by tomorrow."
You CAN'T read 5,000 reviews manually.
So you turn to AI...
But there's a catch: Different AI methods give you DIFFERENT answers.
Which one do you trust?
Which review is POSITIVE? Which is NEGATIVE?
Now let's see how different AI methods analyze these...
Sounds simple, right?
The dictionary method counts the positive words and scores this review as positive.
But wait... read the review again. Is it ACTUALLY positive?
A HUMAN reads this and knows: This person is being sarcastic.
"Disgusting" is used ironically - they mean it was SO GOOD it was almost too much.
But the dictionary method doesn't understand sarcasm.
It just counts words.
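The "just counts words" idea can be sketched in a few lines of Python. The word lists and example reviews below are tiny, made-up stand-ins (real lexicons such as AFINN or VADER score thousands of words):

```python
# A minimal dictionary-based sentiment scorer. The word lists are tiny,
# made-up examples; real lexicons contain thousands of scored words.
POSITIVE = {"amazing", "good", "great", "delicious", "love"}
NEGATIVE = {"disgusting", "bad", "terrible", "awful", "hate"}

def dictionary_sentiment(text: str) -> int:
    """Return (#positive words - #negative words); context is ignored."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(dictionary_sentiment("The food was great and the staff were amazing"))  # 2
print(dictionary_sentiment("This was not good"))  # 1 (negation ignored, so it scores positive)
```

The second call shows the core weakness: "not good" still scores positive, because the counter never looks at the words around "good".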
So... what's better?
For each review, guess which method will get it RIGHT
Which method gets this RIGHT?
Which method should you use?
Look at these two movie reviews:
Run both through a sentiment analyzer. What scores do they get? Which is rated more "positive"?
Many widely used sentiment analyzers are trained on product reviews from sites like Amazon and Yelp.
• What KIND of language appears in product reviews?
• Is "brilliantly deconstructs late capitalism" the kind of phrase that appears in Amazon reviews?
• Whose vocabulary is the AI trained on?
Imagine a company uses sentiment analysis to decide which employee feedback to prioritize.
Both employees are giving POSITIVE feedback. Who gets heard? Who gets ignored? What happens when AI is trained on one group's language patterns?
If you were building a sentiment analyzer, what would you do differently to reduce bias?
Ideas to consider: Training data sources, multiple models for different audiences, human oversight, transparency about limitations
Guess the sentiment:
Beyond positive/negative sentiment, text classification is also used to detect problems like toxic content.
The stakes are HIGH:
• False negatives: Toxic content stays up, harms users
• False positives: Legitimate speech gets censored
This is why human oversight still matters.
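The two error types can be made concrete with a toy tally. The labels and predictions below are invented for illustration:

```python
# Tallying the two error types for a hypothetical toxicity classifier.
# "labels" are the human moderators' decisions; "preds" are the model's calls.
labels = ["toxic", "ok", "toxic", "ok", "ok", "toxic"]
preds  = ["toxic", "toxic", "ok", "ok", "ok", "toxic"]

# False negative: actual toxicity the model let through (stays up, harms users)
false_negatives = sum(l == "toxic" and p == "ok" for l, p in zip(labels, preds))
# False positive: legitimate speech the model removed (censorship)
false_positives = sum(l == "ok" and p == "toxic" for l, p in zip(labels, preds))

print(false_negatives, false_positives)  # 1 1
```

Lowering the model's removal threshold shrinks false negatives but inflates false positives, which is exactly the tension between user safety and free expression.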
Who gets censored?
Research from 2024 found that toxicity detectors disproportionately flag certain groups' ordinary speech as toxic.
Meanwhile, they MISS: coded hate speech, dog whistles (subtle discriminatory language), and sealioning (bad-faith questioning).
What did you find about toxicity detector bias?
Training data matters. If a toxicity detector was trained on flagged comments from 2010-2015, what language patterns did moderators flag back then?
• Whose speech was considered "toxic"?
• Who had the power to decide what's "acceptable"?
• How have norms changed since then?
Scenario: You run content moderation for a social platform. Your AI auto-removes 10,000 comments per day for "toxicity."
But you discover it's removing comments from marginalized groups describing their discrimination.
• What's the harm of false positives in this case?
• What's the harm of false negatives (letting actual toxicity through)?
• How do you balance freedom of expression with user safety?
Design a better moderation system:
• Should AI auto-remove or just flag for human review?
• How would you reduce bias in the training data?
• What role should community input play?
• Should different communities have different moderation standards?
After analyzing thousands of text datasets, here's what works best:
STEP 1: Unsupervised Topic Modeling
Discover themes in your data automatically (no labeling required)
Use: BERTopic or LDA
Output: "There are 15 topics in this dataset"
STEP 2: LLM Labeling
Ask an LLM to label each topic
Use: GPT-4, Claude, etc.
Output: Topic 1 = "Shipping complaints", Topic 2 = "Product quality praise"
STEP 3: Train Supervised Classifier
Now that you know the categories, label 100-500 examples per category. Train a fast, cheap supervised ML model. Use this for future classification at scale.
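The three steps above can be walked through end to end with deliberately simplified stand-ins: a real pipeline would use BERTopic or LDA for step 1, an actual LLM API call for step 2, and a properly trained classifier for step 3. Every document, keyword, and label below is made up for illustration:

```python
# A toy end-to-end sketch of the three-step workflow.
STOPWORDS = {"the", "and", "was", "for", "a", "my", "took"}

def words_of(doc):
    return {w.strip(",.!") for w in doc.lower().split()} - STOPWORDS

# STEP 1 (stand-in for topic modeling): greedily group docs that share words.
def discover_topics(docs):
    clusters = []
    for doc in docs:
        w = words_of(doc)
        for c in clusters:
            if w & c["words"]:
                c["docs"].append(doc)
                c["words"] |= w
                break
        else:
            clusters.append({"docs": [doc], "words": set(w)})
    return clusters

# STEP 2 (stand-in for LLM labeling): a real system would prompt an LLM
# with sample docs from each cluster; here we fake the reply with a rule.
def llm_label(cluster):
    return ("Shipping complaints" if "box" in cluster["words"]
            else "Product quality praise")

# STEP 3 (stand-in for a supervised classifier): memorize each label's
# vocabulary, then classify new text by word overlap.
def classify(text, labeled_clusters):
    w = words_of(text)
    return max(labeled_clusters, key=lambda lab: len(w & labeled_clusters[lab]))

docs = [
    "Package arrived late and the box was damaged",
    "Shipping took three weeks, box crushed",
    "Great quality, the fabric feels premium",
    "Excellent quality for the price",
]
clusters = discover_topics(docs)
labeled = {llm_label(c): c["words"] for c in clusters}
print(classify("my box never arrived", labeled))  # Shipping complaints
```

The point of the sketch is the shape of the pipeline: the expensive, flexible tools (topic modeling, LLMs) run once to define categories, and the cheap classifier then handles every future document.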
This is how many modern companies do text analysis in 2026.
What gets lost in automation?
You just automated customer support ticket categorization.
What did the AI miss?
What information do humans capture that AI doesn't?
Consider: Emotional urgency, context, frustration, power dynamics
Can sentiment analysis capture all of this?
Imagine your company automates text analysis to:
For each use case: What's the benefit of automation? What's the risk? Who is harmed when the AI gets it wrong?
Create your own "Human-in-the-Loop" policy:
For text analysis in high-stakes situations (hiring, content moderation, etc.):
• What percentage should be human-reviewed?
• What triggers should escalate to human review?
• How do you audit the AI's decisions?
• When should you NOT use automation at all?
Write a 3-5 sentence policy:
Dictionary method:
✅ Fast, cheap, transparent
❌ Misses sarcasm, context, negation
Use when: Simple sentiment at large scale
Supervised ML classifier:
✅ Learns patterns, handles context better
❌ Requires training data, expensive setup
Use when: You have labeled examples, need speed + accuracy
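As a rough illustration of the supervised approach, here is a minimal Naive Bayes classifier trained on a handful of invented labeled reviews. A real system would use hundreds of examples per category and a library such as scikit-learn; this sketch only shows the mechanics:

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled training reviews (real training sets are much larger).
train = [
    ("the food was great and service was amazing", "pos"),
    ("absolutely loved it great portions", "pos"),
    ("terrible service and the food was cold", "neg"),
    ("awful experience never coming back", "neg"),
]

word_counts = defaultdict(Counter)                    # label -> word frequencies
label_counts = Counter(label for _, label in train)
for text, label in train:
    word_counts[label].update(text.split())

vocab_size = len({w for counts in word_counts.values() for w in counts})

def predict(text):
    """Pick the label with the highest log-probability under Naive Bayes."""
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / len(train))        # prior
        total = sum(word_counts[label].values())
        for w in text.split():
            # add-one (Laplace) smoothing so unseen words don't zero out a label
            score += math.log((word_counts[label][w] + 1) / (total + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("great food amazing service"))  # pos
```

Unlike the dictionary method, the model learns word-label associations from the labeled data itself, which is why it needs training examples up front.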
LLM classification:
✅ Best accuracy, explains reasoning, zero-shot
❌ Expensive, slower
Use when: Accuracy > cost, need explanations
Topic modeling → LLM labeling → Supervised classifier
The power to analyze millions of texts comes with responsibility.
Use it wisely.
You now know:
✅ How sentiment analysis works (3 methods)
✅ When each method fails
✅ How to detect bias in text classification
✅ The professional workflow for text analysis
Next up: Lesson 7 - Topic Modeling