---
output: github_document
always_allow_html: true
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# speechbr <img src="man/figures/hexlogo.png" align="right" width = "120px"/>
<!-- badges: start -->
[CRAN package page](https://cran.r-project.org/package=speechbr)
<!-- badges: end -->
## Overview
The goal of `{speechbr}` is to democratize access to the speeches of Brazilian federal deputies, that is, to their ideas and thoughts.
The data comes from the [Discursos e Notas Taquigráficas](https://www2.camara.leg.br/atividade-legislativa/discursos-e-notas-taquigraficas) page of the [Câmara dos Deputados](https://www.camara.leg.br/).
## Observation
The released version on CRAN only covers speeches before 2022. To access speeches after 2021-12-31, use the development version.
## Installation
You can install the released version of `{speechbr}` from [CRAN](https://cran.r-project.org/) with:
```{r eval=FALSE, error=FALSE, message=FALSE, warning=FALSE}
install.packages("speechbr")
```
You can install the development version of `{speechbr}` from [GitHub](https://github.com/) with:
```{r eval=FALSE, error=FALSE, message=FALSE, warning=FALSE}
# install.packages("devtools")
devtools::install_github("dcardosos/speechbr")
```
## Example
An example that builds a dataset by searching for the term "tecnologia" between 2021-09-01 and 2021-10-01:
```{r example, warning=FALSE, message=FALSE}
library(speechbr)

tab <- speechbr::speech_data(
  keyword = "tecnologia",
  start_date = "2021-09-01",
  end_date = "2021-10-01"
)

dplyr::glimpse(tab)
```
The other parameters are `party` (political party), `speaker` (speaker's name) and `uf` (state acronym). Each defaults to an empty string (`""`).
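As a sketch of how these filters combine with the keyword search, the call below narrows the earlier query to one state and one party. The values `"SP"` and `"PT"` are illustrative assumptions, not taken from the package documentation, and the call needs network access, so it is shown without output:

```r
library(speechbr)

# Same keyword and date range as the example above, restricted with the
# optional filters. "SP" and "PT" are assumed example values for `uf` and `party`.
tab_sp_pt <- speechbr::speech_data(
  keyword = "tecnologia",
  start_date = "2021-09-01",
  end_date = "2021-10-01",
  uf = "SP",
  party = "PT"
)
```

Leaving a filter out keeps its empty default, so it places no restriction on the search.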
A simple application of the dataset, a word cloud:
```{r example_2, eval = FALSE}
# install.packages(c("wordcloud2", "tidytext", "stopwords", "abjutils"))
library(magrittr) # provides %>% in case it is not already attached

# Portuguese stop words, plus extra terms that are frequent but uninformative
stop_words <- tidytext::get_stopwords("pt")
others_words <- c("nao", "ter", "termos", "r", "fls", "sr", "ja", "sao",
                  "porque", "aqui", "ha", "ser", "ano", "presidente", "tambem")

tab %>%
  tibble::rowid_to_column("id") %>%
  dplyr::select(id, discurso) %>%
  tidytext::unnest_tokens(word, discurso) %>%         # one row per word
  dplyr::filter(!grepl("[0-9]", word)) %>%            # drop tokens containing digits
  dplyr::mutate(word = abjutils::rm_accent(word)) %>%
  dplyr::anti_join(stop_words, by = "word") %>%
  dplyr::count(word, sort = TRUE) %>%
  dplyr::filter(n > 5, !word %in% others_words) %>%
  wordcloud2::wordcloud2()
```
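To see what the counting step of the pipeline above produces before it reaches `wordcloud2()`, here is a minimal base R sketch on a toy data frame. The two sentences in `toy` are made up for illustration and only stand in for the `discurso` column; the real pipeline also removes stop words and accents, which this sketch skips.

```r
# Toy stand-in for the `discurso` column returned by speech_data();
# these two sentences are invented for illustration only.
toy <- data.frame(
  discurso = c("A tecnologia muda o Brasil",
               "Tecnologia e ensino para o Brasil"),
  stringsAsFactors = FALSE
)

# Lowercase, split on anything that is not a letter, and drop empty tokens.
words <- unlist(strsplit(tolower(toy$discurso), "[^[:alpha:]]+"))
words <- words[nchar(words) > 0]

# Count each word and sort from most to least frequent.
freq <- sort(table(words), decreasing = TRUE)
head(freq, 5)
```

A word-and-count table of this shape is what the pipeline passes to `wordcloud2::wordcloud2()`.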
### Example of the dataset
```{r example_3, echo=FALSE}
tab %>%
  head(2) %>%
  knitr::kable()
```
## How to cite
[DOI: 10.5281/zenodo.5921104](https://doi.org/10.5281/zenodo.5921104)