Earnings Sentiment & Price Reaction Analyser

An NLP pipeline that scores earnings call transcripts using domain-specific financial sentiment lexicons and tests whether management tone predicts short-term abnormal equity returns — replicating the methodology of academic finance NLP research.

Research question

Does positive (negative) sentiment in earnings call transcripts predict positive (negative) abnormal stock returns in the 1, 3, and 5 trading days following the call?

Pipeline architecture

SEC EDGAR 8-K filings + synthetic fallback
              │
              ▼
┌─────────────────────────┐
│  1. Transcript          │  SEC EDGAR API → text extraction
│     Collection          │  → synthetic fallback for unavailable filings
│     fetcher.py          │
└──────────┬──────────────┘
           │  raw transcript text
           ▼
┌─────────────────────────┐
│  2. Sentiment           │  Loughran-McDonald financial lexicon
│     Analysis            │  + VADER rule-based scorer
│     analyser.py         │  → composite score + label
└──────────┬──────────────┘
           │  sentiment scores per earnings event
           ▼
┌─────────────────────────┐
│  3. Price Reaction      │  Event study: abnormal returns
│     Analysis            │  → correlation test + t-test
│     price_reaction.py   │  → interactive dashboard
└─────────────────────────┘

Methods

Sentiment scoring

Two complementary approaches are combined into a composite score:

Loughran-McDonald (LM) Financial Lexicon: a domain-specific word list calibrated for financial text. Words like "uncertainty" and "risk" are treated as negative in this context, whereas general-purpose sentiment models typically treat them as neutral. Standard in academic finance NLP. Weight: 60% of the composite score.

VADER (rule-based): handles negation ("not profitable"), intensifiers ("significantly exceeded"), and capitalisation. A fast and interpretable baseline. Weight: 40% of the composite score.

Composite score: 0.6 × LM_net + 0.4 × VADER_compound ∈ [-1, 1]
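As a rough illustration of the weighting above, the sketch below blends an LM net score with a VADER compound score. It assumes LM_net is defined as (positive hits - negative hits) / (positive hits + negative hits), which the README does not state. The four-word sets are placeholders for the real lexicon, and the function names are hypothetical rather than taken from `analyser.py`.

```python
import re

# Placeholder word sets for illustration only; the real Loughran-McDonald
# lists are far larger and are not reproduced here.
LM_POSITIVE = {"strong", "improved", "exceeded", "growth"}
LM_NEGATIVE = {"decline", "loss", "weak", "impairment"}


def lm_net_score(text: str) -> float:
    """Assumed definition: (pos - neg) / (pos + neg), or 0.0 with no hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(tok in LM_POSITIVE for tok in tokens)
    neg = sum(tok in LM_NEGATIVE for tok in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def composite_score(lm_net: float, vader_compound: float) -> float:
    """Weighted blend from the formula above: 0.6 * LM_net + 0.4 * VADER."""
    return 0.6 * lm_net + 0.4 * vader_compound
```

Because both inputs lie in [-1, 1] and the weights sum to 1, the blend stays in [-1, 1] as the formula claims. The VADER compound score is passed in as a plain float so the sketch does not depend on any particular VADER package.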

Event study

Abnormal return: AR(t) = R_stock(t) - R_market(t) (market-adjusted model)
Cumulative abnormal return: CAR[1,N] = Σ AR(t+k) for k = 1, ..., N, where t is the earnings call date
Event windows: 1-day, 3-day, 5-day post-earnings
Statistical tests: Pearson correlation + independent samples t-test (positive vs negative sentiment groups)
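The abnormal return and CAR definitions above can be sketched in a few lines. This is a minimal version that assumes daily returns are already aligned by trading day in plain lists; the function names and the `event_index` parameter are hypothetical, not the interface of `price_reaction.py`.

```python
def abnormal_returns(stock_returns, market_returns):
    """Market-adjusted AR(t) = R_stock(t) - R_market(t), day by day."""
    if len(stock_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    return [rs - rm for rs, rm in zip(stock_returns, market_returns)]


def car(ar_series, event_index, n_days):
    """Sum AR over the n_days trading days after event_index.

    The event day itself is excluded, matching the CAR[1,N] window.
    """
    window = ar_series[event_index + 1 : event_index + 1 + n_days]
    if len(window) < n_days:
        raise ValueError("not enough post-event observations")
    return sum(window)
```

Calling `car(ar, event_index, n)` with n set to 1, 3, and 5 gives the three event windows listed above.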

Modules

Module 1 — Transcript Collection (`1_collection/fetcher.py`)

Queries SEC EDGAR submissions API for recent 8-K filings
Extracts text from filing documents
Falls back to deterministic synthetic transcripts where SEC data is unavailable
Outputs: output/transcripts/transcripts.csv
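To make the first step concrete, here is a hedged sketch of pulling recent 8-K entries from the EDGAR submissions endpoint. It assumes the public `data.sec.gov/submissions/CIK##########.json` layout, where `filings.recent` holds parallel arrays, and it uses the standard library rather than `requests` to stay dependency-free. The function names are hypothetical and the real `fetcher.py` may work differently.

```python
import json
import urllib.request

# CIK is zero-padded to 10 digits in the submissions URL.
SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik:010d}.json"


def fetch_submissions(cik: int, user_agent: str) -> dict:
    """Download the submissions JSON for one company (network call).

    SEC asks automated clients to identify themselves via User-Agent.
    """
    req = urllib.request.Request(
        SUBMISSIONS_URL.format(cik=cik), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def recent_8k_filings(submissions: dict, limit: int = 4) -> list[dict]:
    """Filter the column-oriented 'recent' block down to 8-K rows.

    Rows are returned in the order the API lists them.
    """
    recent = submissions["filings"]["recent"]
    rows = zip(
        recent["form"],
        recent["accessionNumber"],
        recent["filingDate"],
        recent["primaryDocument"],
    )
    hits = [
        {"accession": acc, "filed": filed, "document": doc}
        for form, acc, filed, doc in rows
        if form == "8-K"
    ]
    return hits[:limit]
```

Keeping the download and the filtering in separate functions means the filter can be exercised on a saved JSON response without touching the network, which is also a natural place to trigger the synthetic fallback when no 8-K rows come back.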

Module 2 — Sentiment Analysis (`2_sentiment/analyser.py`)

Scores each transcript with LM lexicon and VADER
Computes composite score and categorical label (positive/neutral/negative)
Outputs: output/sentiment/sentiment_scores.csv, ticker_sentiment_summary.csv
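The labelling and per-ticker aggregation steps might look roughly like the sketch below. The README does not give the cut-offs for positive/neutral/negative, so the ±0.05 threshold here is an assumption borrowed from the convention commonly used with VADER compound scores. The summary fields and function names are likewise hypothetical.

```python
from collections import defaultdict
from statistics import mean


def label(score: float, threshold: float = 0.05) -> str:
    """Map a composite score to a category; threshold is an assumed value."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def ticker_summary(rows):
    """Aggregate (ticker, composite_score) pairs into one record per ticker."""
    by_ticker = defaultdict(list)
    for ticker, score in rows:
        by_ticker[ticker].append(score)
    return {
        ticker: {
            "n_calls": len(scores),
            "mean_composite": mean(scores),
            "label": label(mean(scores)),
        }
        for ticker, scores in by_ticker.items()
    }
```

A symmetric dead zone around zero keeps near-neutral transcripts out of the positive and negative groups that are later compared in the t-test.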

Module 3 — Price Reaction Analysis (`3_analysis/price_reaction.py`)

Joins sentiment scores to forward abnormal returns
Runs correlation analysis and t-tests across event windows
Generates interactive Plotly dashboard
Outputs: output/analysis/event_study_results.csv, statistical_summary.csv, sentiment_analysis_dashboard.html
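For readers who want to see what the two tests compute, the sketch below works out the Pearson correlation coefficient and the pooled-variance two-sample t statistic with the standard library only. It returns test statistics, not p-values. The project lists scipy as a dependency, and `scipy.stats.pearsonr` and `scipy.stats.ttest_ind` would supply p-values as well. The function names here are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation between sentiment scores and CARs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def pooled_t_statistic(group_a, group_b):
    """Independent-samples t statistic under an equal-variance assumption."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
```

In the pipeline's terms, `group_a` and `group_b` would be the CARs of the positive-sentiment and negative-sentiment events for one window, and the pair of tests would be repeated for the 1-, 3-, and 5-day windows.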

How to run

1. Install dependencies

pip install -r requirements.txt

2. Run the full pipeline

python run_pipeline.py

3. View results

open output/analysis/sentiment_analysis_dashboard.html

Or run modules individually:

python 1_collection/fetcher.py
python 2_sentiment/analyser.py
python 3_analysis/price_reaction.py

Outputs

File	Description
`output/transcripts/transcripts.csv`	Raw transcript text per ticker per quarter
`output/sentiment/sentiment_scores.csv`	Per-transcript LM, VADER, and composite scores
`output/sentiment/ticker_sentiment_summary.csv`	Aggregated sentiment per ticker
`output/analysis/event_study_results.csv`	Sentiment joined to forward CAR
`output/analysis/statistical_summary.csv`	Correlation and t-test results
`output/analysis/sentiment_analysis_dashboard.html`	Interactive dashboard

Dependencies

yfinance>=0.2.28
pandas>=2.0.0
numpy>=1.24.0
scipy>=1.10.0
plotly>=5.0.0
requests>=2.28.0

Limitations and extensions

Known limitations:

Synthetic transcripts are used where SEC filings are unavailable; results based on synthetic data should be read as a demonstration of the methodology only, not as empirical findings
Simple market-adjusted abnormal returns (no beta estimation)
No correction for multiple testing across windows
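One way the multiple-testing limitation could be addressed is a Holm step-down adjustment across the three window p-values. This is a sketch of a standard technique, not something the pipeline currently does, and `holm_adjust` is a hypothetical helper name.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

Passing the 1-, 3-, and 5-day p-values for a given test and comparing the adjusted values to the chosen significance level controls the family-wise error rate across windows. Holm is never more conservative than a plain Bonferroni correction.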

Natural extensions:

Replace synthetic transcripts with a paid transcript provider (Refinitiv, Bloomberg, Motley Fool)
Add FinBERT transformer-based scoring for comparison
Incorporate conference call Q&A section separately from prepared remarks
Extend to analyst report sentiment

Author

Atrija Haldar (LinkedIn)
MSc Engineering, Technology and Business Management, University of Leeds
