Discourse Analysis Research Pipeline

Platform Setup

Choose the platform you want to collect data from, then enter your credentials.

1 — Select a Platform

▶️

YouTube

Video comments & metadata. Best for tracking mainstream political commentary channels.

API key required

🟠

Reddit

Posts & comments across subreddits. Excellent for tracking fringe-to-mainstream migration.

API key required

🦋

Bluesky

Posts & replies via AT Protocol. Growing platform with strong political discourse communities.

App password required

📋 How to get a YouTube API Key

Go to console.cloud.google.com
Create a new project (or select an existing one)
Click APIs & Services → Enable APIs
Search for and enable YouTube Data API v3
Go to APIs & Services → Credentials
Click Create Credentials → API Key
Copy the key and paste it below

⚠️ Quota note: Free tier gives 10,000 units/day. Each search costs ~100 units.
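The quota note can be turned into a rough budget before collecting. This is a minimal sketch and not the tool's own accounting: it assumes one search request per term per year, and it assumes each page of comments costs one unit. The function name `estimate_units` is made up for illustration.

```python
# Rough YouTube Data API quota estimate (a sketch, not the tool's own accounting).
DAILY_QUOTA = 10_000    # free-tier units per day, from the quota note
SEARCH_COST = 100       # units per search request, from the quota note
COMMENT_PAGE_COST = 1   # assumption: one unit per page of comments fetched


def estimate_units(n_terms, n_years, videos_per_search, comment_pages_per_video=1):
    """Estimate units for one search per term per year plus comment fetches."""
    searches = n_terms * n_years
    comment_pages = searches * videos_per_search * comment_pages_per_video
    return searches * SEARCH_COST + comment_pages * COMMENT_PAGE_COST


units = estimate_units(n_terms=4, n_years=10, videos_per_search=25)
print(units, "units, about", round(units / DAILY_QUOTA, 2), "days of quota")
```

With four terms over ten years and 25 videos per search, the estimate is 5,000 units, or about half of one day's free quota under these assumptions.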

YouTube Data API v3 Key *

Your key is stored only locally for this session and is sent only to your own backend server.

📋 How to get Reddit API credentials

Log in to Reddit, then go to reddit.com/prefs/apps
Scroll down and click Create App (or Create Another App)
Give it a name (e.g. discourse_research)
Select script as the app type
Set the redirect URI to http://localhost:8080
Click Create App
Your Client ID is the string under the app name. The Client Secret is labelled "secret".

Reddit Client ID *

Reddit Client Secret *

Subreddits to search (separate with +)

Start broad, then add or remove subreddits to focus your search. Example subreddits: ukpolitics, conspiracy, casualuk (entered as ukpolitics+conspiracy+casualuk).
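A small sketch of how free-form subreddit input could be cleaned into the "+"-separated form. The helper name `normalise_subreddits` is hypothetical. The Reddit error name in Common Problems suggests the backend uses PRAW, which accepts combined names such as "a+b", but that is an inference rather than something this page confirms.

```python
import re


def normalise_subreddits(raw):
    """Turn free-form input into a '+'-joined subreddit string."""
    ordered, seen = [], set()
    # Accept '+', commas, or whitespace as separators.
    for part in re.split(r"[+,\s]+", raw):
        # Drop an optional leading 'r/' or '/r/' prefix.
        name = re.sub(r"^/?r/", "", part.strip(), flags=re.IGNORECASE)
        key = name.lower()
        if name and key not in seen:  # skip blanks and case-insensitive repeats
            seen.add(key)
            ordered.append(name)
    return "+".join(ordered)


print(normalise_subreddits("ukpolitics, conspiracy r/casualuk"))
```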

📋 How to get Bluesky credentials

Log in to Bluesky at bsky.app
Go to Settings → Privacy and Security → App Passwords
Click Add App Password
Give it a name (e.g. discourse_research) and click Create App Password
Copy the password shown (it will only be displayed once) and paste it below

ℹ️ App passwords are separate from your main Bluesky password. They can be revoked at any time from the same settings page without affecting your account.

Bluesky Handle *

Your full Bluesky handle, including .bsky.social

App Password *

The app password you created above, not your main account password
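Because a missing .bsky.social suffix is a common cause of login errors, the handle can be tidied before it is used. This is an illustrative sketch only: `normalise_handle` is a hypothetical helper, and appending bsky.social to a bare username is an assumption that does not apply to accounts using a custom domain as their handle.

```python
def normalise_handle(raw, default_domain="bsky.social"):
    """Strip a leading '@' and add the default domain to bare usernames."""
    handle = raw.strip().lstrip("@").lower()
    if not handle:
        raise ValueError("handle is empty")
    # Assumption: a handle without any dot is a bare bsky.social username.
    if "." not in handle:
        handle = f"{handle}.{default_domain}"
    return handle


print(normalise_handle("@Alice"))
```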

2 — Session Title & Time Window

Session Title *

A short descriptive title. A folder named [title]_[date]_[time] is created automatically to store all data and results from this session.
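The [title]_[date]_[time] pattern could be produced along these lines. The exact date and time formats and the way the title is sanitised are assumptions; the tool may format its folder names differently.

```python
import re
from datetime import datetime


def session_folder_name(title, now=None):
    """Build a '[title]_[date]_[time]' folder name (formats are assumed)."""
    now = now or datetime.now()
    # Replace each run of non-alphanumeric characters with one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", title.strip()).strip("_")
    return f"{slug}_{now:%Y-%m-%d}_{now:%H%M%S}"


print(session_folder_name("UK framing: pilot", datetime(2024, 5, 1, 9, 30, 0)))
```

For the fixed timestamp above, this prints UK_framing_pilot_2024-05-01_093000.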

Start Year *

Recommended: 2014 for a full 10-year window

End Year *

Max Posts / Videos per Term per Year

Higher values give richer data but slow collection and use more API quota.

Max Comments per Post
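One way to picture how the time window and the per-year cap interact is to split the window into one request per year. The sketch below builds yearly UTC windows and a YouTube Data API v3 search URL without sending anything. Whether the tool collects this way is an assumption, the helper names are hypothetical, and the cap of 50 reflects the API's per-request maximum, so a larger setting would need paging.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def yearly_windows(start_year, end_year):
    """Yield (year, start, end) as RFC 3339 UTC strings; end is exclusive."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    for year in range(start_year, end_year + 1):
        start = datetime(year, 1, 1, tzinfo=timezone.utc)
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
        yield year, start.strftime(fmt), end.strftime(fmt)


def search_url(term, after, before, max_results, api_key):
    """Build (but do not send) a video search request for one term and window."""
    params = {
        "part": "snippet",
        "type": "video",
        "q": term,
        "publishedAfter": after,
        "publishedBefore": before,
        "maxResults": min(max_results, 50),  # one request returns at most 50
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


for year, after, before in yearly_windows(2014, 2016):
    print(year, search_url("two-tier policing", after, before, 25, "YOUR_API_KEY"))
```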

Search Terms

Enter the terms you want to track. The system will search for these across the selected platform and time window.

Enter Search Terms

💡 Tip: Enter one term per line. You can include variant spellings or related phrases; the system searches for each one separately. For this research, terms such as "great replacement", "real british people" and "two-tier policing" are good starting points.

Search Terms (one per line) *

Each term is searched independently. Variants (e.g. "great replacement" and "replacement theory") give richer longitudinal coverage.
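A minimal sketch of how one-term-per-line input might be parsed into a clean list. The function `parse_terms` is hypothetical, and the case-insensitive de-duplication is an assumption about sensible behaviour rather than a description of the tool.

```python
def parse_terms(text):
    """Split textarea input into unique, whitespace-normalised search terms."""
    terms, seen = [], set()
    for line in text.splitlines():
        term = " ".join(line.split())  # trim and collapse internal whitespace
        key = term.lower()
        if term and key not in seen:   # skip blank lines and repeated terms
            seen.add(key)
            terms.append(term)
    return terms


raw = "great replacement\n  replacement theory \n\nGreat Replacement\n"
print(parse_terms(raw))
```

Each term in the resulting list would then be searched on its own, so the two variants here produce two separate result sets.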

⚠️ A note on sensitive terms: This tool is designed for academic research into discursive normalisation. The terms you enter may surface disturbing content. Results are stored locally on your own machine and are not sent anywhere beyond the platform APIs.

Suggested Terms for This Research

🌍 Transnational (Global)

great replacement
replacement theory
demographic replacement
white replacement

🇬🇧 Local (UK)

real british people
ordinary british people
genuine british
legacy british

⚖️ Procedural

two-tier policing
two tier justice
one rule for
unequal treatment

Data Collection

Once configured, start the collection process. This runs in the background — you can monitor progress below.

Collection Summary

Complete the Setup and Terms steps first, then return here to begin collecting.

Live Log

Analysis

Run NLP analysis on collected data. This generates sentiment scores, framing classification (critical / neutral / affirmative) and pattern detection.

What the analysis produces

📉 Sentiment Distribution

Proportion of positive / neutral / negative comments over time

🔖 Framing Classification

Whether terms appear with critical distancing, neutrally, or with affirmative endorsement — the core normalisation indicator

📈 Normalisation Over Time

Framing ratio tracked month by month — shows the marked-to-unmarked transition
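As an illustration of a month-by-month framing ratio, the sketch below computes the share of affirmative uses among all non-neutral uses. This definition of the ratio is an assumption, since the page does not state the formula. The label names follow the three framing classes listed for the analysis.

```python
from collections import Counter, defaultdict


def affirmative_share_by_month(records):
    """records: iterable of (month 'YYYY-MM', label) pairs.

    Returns {month: affirmative / (affirmative + critical)}, or None for a
    month that has no affirmative or critical items.
    """
    counts = defaultdict(Counter)
    for month, label in records:
        counts[month][label] += 1
    shares = {}
    for month in sorted(counts):
        critical = counts[month]["critical"]
        affirmative = counts[month]["affirmative"]
        total = critical + affirmative
        shares[month] = affirmative / total if total else None
    return shares


sample = [
    ("2020-01", "critical"), ("2020-01", "critical"),
    ("2020-01", "critical"), ("2020-01", "affirmative"),
    ("2020-02", "affirmative"), ("2020-02", "neutral"),
]
print(affirmative_share_by_month(sample))
```

A share that rises over successive months would correspond to the marked-to-unmarked shift described above.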

☁️ Word Cloud

Most frequent terms in the collected corpus
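The counts behind a word cloud can be sketched with a simple frequency table. The tokenisation rule and the very short stopword list are illustrative assumptions; the tool's own preprocessing is not described on this page.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "and", "for", "that", "this", "with", "are", "was"}


def top_terms(comments, n=10):
    """Return the n most frequent words across all comments."""
    counts = Counter()
    for text in comments:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Ignore stopwords and very short tokens.
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)


print(top_terms(["The policing debate", "Policing is the debate of policing"], n=2))
```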

Live Log

Results

Download your data files and view visualisations generated by the analysis.

Click Refresh Results to load available outputs.

Help & Guide

Everything you need to get started.

Getting Started — Step by Step

Step 1 — First-time setup (one time only)

You need Python 3 installed. Check by opening Spotlight (⌘ Space), typing Terminal, and running:

    python3 --version

If you see a version number (e.g. Python 3.11.x) you are ready. If not, download Python 3 from python.org/downloads/macos — this is the only time you will need Terminal.

Step 2 — Start the tool

In Finder, open the folder containing your downloaded files. You will see a file called Launch Discourse Tool.command.

First time only: Right-click (or Control-click) the .command file and choose Open, then click Open in the security dialog. After that, you can double-click it normally every time.

A black terminal window will appear, install any missing packages automatically, and open this interface in your browser. Keep that black window open while you use the tool — closing it stops the server.

Step 3 — Use the tool

The browser interface opens automatically. Follow the tabs from left to right: Setup → Search Terms → Collect → Analyse → Results.

Platform Credentials

YouTube: Requires a free Google Cloud account. Full step-by-step instructions appear when you select YouTube in the Setup tab.
Reddit: Requires a free Reddit account. Full instructions appear when you select Reddit in the Setup tab.
Bluesky: Requires a Bluesky account and an app password (not your main password). Go to Settings → Privacy and Security → App Passwords in Bluesky to create one. Full instructions appear in the Setup tab.

Understanding the Results

Framing Over Time chart: This is the most important output for normalisation research. A rising affirmative line together with a falling critical line indicates that the term is moving from marked (used with distancing) to unmarked (used without distancing), which is the core signal of discursive normalisation.
Toxicity Over Time chart: Shows mean negativity score per month. Spikes often correspond to real-world events.
CSV files: The full dataset with all scores and classifications, for further analysis in Excel or R.
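For further analysis outside Excel or R, the exported CSV can also be summarised in Python. The sketch below recomputes a mean score per month. The column names `date` and `negativity` are hypothetical placeholders, so check the header row of your own file and adjust them.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def monthly_mean(csv_file, date_col="date", score_col="negativity"):
    """Average a numeric column per 'YYYY-MM' month (column names assumed)."""
    by_month = defaultdict(list)
    for row in csv.DictReader(csv_file):
        month = row[date_col][:7]  # assumes ISO dates such as 2020-01-03
        by_month[month].append(float(row[score_col]))
    return {month: mean(scores) for month, scores in sorted(by_month.items())}


# In-memory demo data standing in for a real export.
demo = io.StringIO("date,negativity\n2020-01-03,0.2\n2020-01-20,0.4\n2020-02-01,0.9\n")
print(monthly_mean(demo))
```

To run it on a real export, open the file with `open(path, newline="", encoding="utf-8")` and pass the file object in place of the demo data.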

Common Problems

"Server offline" in sidebar: The launcher window has been closed. Double-click Launch Discourse Tool.command again to restart.
Security warning on first launch: Right-click the .command file and choose Open, then confirm. This is a one-time step — after that, double-clicking works normally.
YouTube quota exceeded: You have used your 10,000 daily API units. Wait until the next day or reduce the number of terms / years.
Reddit "prawcore.exceptions": Check your Client ID and Client Secret are entered correctly.
Bluesky "AtProtocolError": Check your handle includes .bsky.social and that you are using an app password, not your main account password.
No results after analysis: Make sure collection completed successfully before running analysis.
