Finance News Sentiment

Project Proposal

A brief description of our project

Using supervised machine learning, we create a model that takes financial headlines as input and analyzes current sentiment in the financial industry. The analysis is particularly interesting in the context of the Covid pandemic: our initial assumption is that a majority of financial headlines carry some kind of negative sentiment. Here is a link to our hosted website.

Why will our final sentiment analysis be useful to users?

Using financial headlines in csv format as input, users get a sentiment classification for each headline.
Using additional Stock News API data (date and tickers), a stock list csv file, and our Tableau dashboard template, users can perform additional analysis on the following:
- Sentiment analysis over a given time period
- Sentiment analysis by company, sector, and industry
- Emotional analysis by keywords
  - Keywords are grouped in Tableau by the feelings they express to create visualizations. Feelings Resource Website
- Word Blast
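
As a rough illustration of the time-period and sector breakdowns listed above, the sketch below counts sentiment labels per month and per sector in plain Python. The record layout (date, sector, sentiment) and the sample rows are assumptions made for illustration, not the project's data or schema; in the project itself these views are built in Tableau.

```python
from collections import Counter, defaultdict

# Hypothetical classified headlines: (ISO date, sector, sentiment label).
# These rows are made up for illustration only.
records = [
    ("2020-03-02", "Technology", "negative"),
    ("2020-03-15", "Energy", "negative"),
    ("2020-03-20", "Technology", "positive"),
    ("2020-04-01", "Energy", "negative"),
    ("2020-04-09", "Technology", "positive"),
]


def count_sentiment_by(rows, key_func):
    """Return {group: Counter of sentiment labels} for the given grouping key."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[key_func(row)][row[2]] += 1
    return dict(counts)


# Group by month (first 7 characters of an ISO date, e.g. "2020-03").
by_month = count_sentiment_by(records, lambda r: r[0][:7])
# Group by sector.
by_sector = count_sentiment_by(records, lambda r: r[1])

for month, counts in sorted(by_month.items()):
    print(month, dict(counts))
```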

Data sources

Technologies

Python Pandas
PySpark
Tableau
Google Colab
scikit-learn
JavaScript
HTML/CSS
w3schools.com
cloudtables.com

Machine Learning Steps

Get test data containing financial headlines and their sentiment classification, and save it as csv
Get API key
Read Test Data in as DataFrame
Tokenization - Break financial headlines into a list of words
CountVectorizer - Generate the term frequency vectors
Inverse Document Frequency - Down-weights features that appear frequently in the corpus
Split Data - Train 80%, Test 20%
Hypertuning and Model - Run the machine learning notebook. The notebook runs the data through both Logistic Regression and Naive Bayes models, tunes hyperparameters with ParamGridBuilder and CrossValidator, and lets users evaluate the models using area under ROC, accuracy, and F1 score.
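
To make the tokenization, term frequency, and inverse document frequency steps above concrete, here is a minimal plain-Python sketch. It is not the project's notebook code, which uses PySpark; the sample headlines and helper names are assumptions, and the IDF formula used is the smoothed form log((N + 1) / (df + 1)) documented for Spark's IDF.

```python
import math
import re
from collections import Counter

# Made-up headlines for illustration only.
headlines = [
    "Stocks fall as pandemic fears grow",
    "Tech stocks rally on strong earnings",
    "Oil prices fall on weak demand",
]


def tokenize(text):
    """Lowercase a headline and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_counts(tokens):
    """Term-frequency vector as a {term: count} mapping."""
    return Counter(tokens)


def inverse_document_frequency(tokenized_docs):
    """Smoothed IDF per term: log((N + 1) / (df + 1))."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log((n_docs + 1) / (df + 1)) for term, df in doc_freq.items()}


tokenized = [tokenize(h) for h in headlines]
idf = inverse_document_frequency(tokenized)

# TF-IDF weight of each term in each headline.
tfidf = [
    {term: count * idf[term] for term, count in term_counts(tokens).items()}
    for tokens in tokenized
]
```

In this toy corpus "stocks" appears in two of the three headlines and "pandemic" in only one, so "stocks" receives the lower IDF weight, which is the down-weighting the step describes.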

Using Machine Learning Model

Get financial headlines from Stock News API or use alternative dataset
Read dataset in as DataFrame
Run the model from above to retrieve sentiment classification for each headline

import numpy as np
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator, MulticlassClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

# Logistic regression estimator; trainDF is the 80% training split from the steps above
lr = LogisticRegression(maxIter=10)

# Grid of regularization strengths and elastic-net mixing ratios to search
paramGrid_lr = ParamGridBuilder() \
    .addGrid(lr.regParam, np.linspace(0.3, 0.01, 10).tolist()) \
    .addGrid(lr.elasticNetParam, np.linspace(0.3, 0.8, 6).tolist()) \
    .build()

# 5-fold cross-validation over the grid, scored with the evaluator's default metric
crossval_lr = CrossValidator(estimator=lr,
                             estimatorParamMaps=paramGrid_lr,
                             evaluator=MulticlassClassificationEvaluator(),
                             numFolds=5)
cvModel_lr = crossval_lr.fit(trainDF)

# Training summary of the best model found by the search
best_model_lr = cvModel_lr.bestModel.summary
best_model_lr.predictions.columns
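
The accuracy and F1 score named in the Machine Learning Steps can be worked out by hand from true and predicted labels. The sketch below does this in plain Python for a binary case; the label encoding (1 = positive, 0 = negative), the sample labels, and the helper names are assumptions for illustration. In the notebook these metrics would come from the PySpark evaluators, and a multiclass evaluator may report an F1 averaged across labels, so its value can differ from the single-class F1 shown here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (true positives, false positives, false negatives, true negatives)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
```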

Data Analysis

Get date and ticker information from Stock News API or use alternative dataset
Use Stock List from NASDAQ with key ticker information
Run the functions notebook to clean and merge the financial sentiment classification, date, and detailed ticker information to perform analysis
Save merged DataFrame as csv
Open the csv in Tableau using the provided template to view the data visualizations.
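
The merge step above could look roughly like the following pandas sketch. The column names (`date`, `ticker`, `sentiment`, `sector`), the tickers, and the sample rows are all hypothetical; the functions notebook in the repository is the actual implementation and may differ.

```python
import pandas as pd

# Hypothetical classified headlines; tickers and column names are made up.
sentiment = pd.DataFrame({
    "date": ["2020-03-02", "2020-03-15", "2020-04-01"],
    "ticker": ["AAA", "BBB", "AAA"],
    "sentiment": ["negative", "negative", "positive"],
})

# Hypothetical excerpt of a stock list keyed by ticker.
stock_list = pd.DataFrame({
    "ticker": ["AAA", "BBB"],
    "sector": ["Technology", "Energy"],
})

# A left join keeps every headline even if its ticker is missing from the stock list.
merged = sentiment.merge(stock_list, on="ticker", how="left")
merged["date"] = pd.to_datetime(merged["date"])

# Without a path, to_csv returns the csv text; pass a file name to write a file for Tableau.
csv_text = merged.to_csv(index=False)
print(csv_text)
```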

Slidedeck

Financial News Analysis

Team

Team Member	Github username
Adriana Icasiano	adriana-icasiano
Paul Feliciano	pfeliciano1
Alberto Gonzalez	dalismo
Abayomi Olujobi	bay0624
Lovensky Lubin	Lubinl

