sobcza11/NLP_HK_Security_Law

OVERVIEW

After the First Opium War ended in 1842, the Qing Dynasty ceded Hong Kong ("HK") to the United Kingdom ("UK"). After World War II & still under British rule, HK became a global financial center & its population developed a hybrid culture, maintaining their Chinese ethos while adopting British principles, notably Common Law.

In 1984, the UK & China signed the Sino-British Joint Declaration, whereby the UK agreed to return HK to China in 1997 under “one country, two systems” ("OcTs"). Under the mutually agreed OcTs, from 1997 to 2047 HK would not participate in China’s socialist system, & HK's capitalist system, Common Law & way of life would remain unchanged. In 1997, the transfer was completed & HK was handed back to China under OcTs.

In 2020, the Standing Committee of China's National People's Congress unanimously passed the National Security Law. Put simply, this law criminalizes secession, subversion of state power, terrorism & collusion with foreign entities in HK. The law was not enacted by the HK Legislative Council, which Article 23 of the HK Basic Law defines as the governing body to enact such law. This dispute over perceived legal authority led to notable protests in HK ("HK Protests").

DESCRIPTION

This Natural Language Processing ("NLP") Data Science project uses the Data Science Method ("DSM") to identify variances in sentiment during the HK Protests using a hybrid of automatic & rules-based systems. The data was obtained by scraping selected international newspapers & Twitter.

The goal of the DSM is not to find or present an answer to this geo-political situation but to pragmatically present the facts as they exist; these facts are created by humans & rest in the unstructured text of newspapers & social media outlets. Therefore, the initial goal is to build the following:

  • Sentiment & Topic Analysis

To reiterate, the goal of this project is not to establish a position on who is right. The goal is to work through the DSM to see where certain groups stand & on what footing.
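To make the rules-based half of the hybrid sentiment approach concrete, here is a minimal sketch in Python. It is not the project's actual code: the word lists, the function names `rule_based_sentiment` & `label`, & the single negation rule are all assumptions chosen for illustration. It scores a text by summing word polarities & flips the polarity of a word that directly follows a negation.

```python
import re

# Hypothetical mini-lexicon for illustration only; the word lists
# actually used in the notebooks are not shown in this README.
LEXICON = {
    "peaceful": 1, "support": 1, "freedom": 1,
    "violent": -1, "crackdown": -1, "fear": -1,
}
NEGATIONS = {"not", "no", "never"}


def rule_based_sentiment(text):
    """Sum word polarities, flipping a word that directly follows a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        weight = LEXICON.get(token, 0)
        if weight and i > 0 and tokens[i - 1] in NEGATIONS:
            weight = -weight  # "not peaceful" counts as negative
        score += weight
    return score


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for headline in ["Protesters held a peaceful march",
                     "The crackdown was violent",
                     "The rally was not peaceful"]:
        print(headline, "->", label(rule_based_sentiment(headline)))
```

In a hybrid setup, a score like this would typically sit alongside the output of an automatic (learned) model, for example as an extra feature or as a tie-breaker; how the two are combined in this project is not stated here.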

CONTENTS

Here you will find the following folders, each with a description of its contents:

  • data
    • This folder contains the scraped data in Excel format
  • notebooks
    • This folder contains the source code written in Python
  • reports
    • This folder contains the Report & Presentation
      • The Presentation was prepared as a slide deck; thus, I suggest downloading it & viewing it as a PDF rather than on GitHub

About

Unsupervised Learning Capstone
