Performed exploratory data analysis on covid tweets collected from 25-07-2020, with #covid19 involved. #6

Open
errpv78 wants to merge 1 commit into ksksksks-dev:main from errpv78:main

errpv78 commented Oct 13, 2020

Approach
Opening columns.txt and reviewing the columns, their types, and their descriptions.
Loading the dataset and viewing a sample, its dimensions, and a single row.
Checking the frequency distribution and summary statistics of integer columns such as user_followers and user_friends.
Checking for null and missing values and cleaning the data.
Exploring the unique values in each column, their frequencies, and the maximum value lengths to better understand the distribution of the data.
Deriving columns that make grouping easier, for example adding a tweet_date column taken from the date column (which holds both date and time) to examine the frequency distribution of tweets by date.
Plotting tweet lengths, measured in words, to see how they vary.
Building and training a BERT sentiment analysis model; the link to the separate training file is included below.
Preprocessing the tweet text to filter out unnecessary words and characters, adding a sentiment column to the data frame, and saving the data frame to a new CSV file.
Checking the overall sentiment distribution among tweets.
Subsetting the data frame with conditions on columns to understand the distribution within columns.
Visualizing sentiments among the top values of other columns and the frequency distribution of sentiments with respect to those columns.
Exploring the hashtags column to understand the different hashtags and their relation to sentiments and other columns.
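
For reviewers who want a quick feel for a few of the steps above (deriving tweet_date, measuring tweet length in words, preprocessing text, and relating hashtags to sentiment), here is a minimal standard-library sketch. It is not the code in this PR: the sample rows, the `text` column name, the `YYYY-MM-DD HH:MM:SS` timestamp format, the cleaning rules, and the helper names are all assumptions made for illustration, and hashtags are pulled from the tweet text with a regex rather than read from the hashtags column.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample rows. The "text" column name, the timestamp format,
# and the sentiment labels are assumptions, not taken from the dataset.
rows = [
    {"date": "2020-07-25 12:27:21",
     "text": "Stay safe everyone! #COVID19 https://t.co/abc",
     "sentiment": "positive"},
    {"date": "2020-07-25 12:30:02",
     "text": "@user cases rising again #covid19 #lockdown",
     "sentiment": "negative"},
    {"date": "2020-07-26 08:15:40",
     "text": "Wear a mask. #COVID19",
     "sentiment": "positive"},
]


def tweet_date(timestamp):
    """Keep only the date part of a 'YYYY-MM-DD HH:MM:SS' timestamp."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date().isoformat()


def extract_hashtags(text):
    """Return the lowercased hashtags found in the tweet text."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def clean_text(text):
    """Illustrative cleaning: drop URLs, mentions, hashtags, and non-letters."""
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)       # mentions and hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)   # digits and punctuation
    return " ".join(text.lower().split())


# Frequency of tweets per day, based on the derived tweet_date value.
tweets_per_day = Counter(tweet_date(r["date"]) for r in rows)

# Tweet length measured in whitespace-separated words of the raw text.
lengths_in_words = [len(r["text"].split()) for r in rows]

# How often each hashtag appears together with each sentiment label.
hashtag_sentiment = Counter(
    (tag, r["sentiment"]) for r in rows for tag in extract_hashtags(r["text"])
)

cleaned = [clean_text(r["text"]) for r in rows]
```

In a data frame workflow the same helpers could be applied column-wise, with the counters replaced by group-by counts; the plain-Python version is used here only to keep the sketch dependency-free.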
