-
Notifications
You must be signed in to change notification settings - Fork 4
Expand file tree
/
Copy path01-intro-to-git-github.Rmd
More file actions
201 lines (157 loc) · 8.71 KB
/
01-intro-to-git-github.Rmd
File metadata and controls
201 lines (157 loc) · 8.71 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
---
editor_options:
markdown:
wrap: 80
---
```{r, echo = F}
require(tidyverse)
require(vembedr)
```
# An Introduction to Git And GitHub
Git is a version control system that helps manage projects, especially projects
that involve multiple people. GitHub is a hosting service for these Git
projects (in GitHub-speak, projects are housed in **repositories**). Here, we
will outline how to set up Git, create your own GitHub account, and connect
them to you local R environment to easily publish and share the code you
develop. A great article on the benefits of using Git and GitHub has been
written by Jennifer Bryan, which you can find
[here](https://happygitwithr.com/big-picture.html) and we highly recommend
reading!
In fact, Jennifer Bryan has developed what we feel is the best guide for using
Git and GitHub out there, [Happy Git and GitHub for the
useR](https://happygitwithr.com/index.html). So as not to reinvent the wheel,
this outline will mostly just direct you to that body of work.
## Setting up Git and GitHub
Matt made an excellent YouTube video (also based on Jenny Bryan's work) on how
to set up your own GitHub account, install Git, and allow them to interact with
your personal R environment. It's also a great primer for how Git and GitHub
work in R Studio.
```{r}
embed_url("https://www.youtube.com/watch?v=Y4sOTFV3FAM&t=1s") %>%
use_align("center")
```
**Key takeaways:**
1) Create a GitHub account. (see <https://happygitwithr.com/github-acct.html>)
<br>
2) Install Git on you local computer. (see
<https://happygitwithr.com/install-git.html>) <br>
3) In the R Studio Terminal, connect your GitHub account to your local
computer. (see <https://happygitwithr.com/hello-git.html>)
## Developing, viewing, and editing repositories
As mentioned in Matt's video (\~7:53), creating a repository is the first step
when starting a brand-new Git project. Essentially, a **repository** is the
main folder in which all the files and code related to a project live. For any
repository on GitHub, there are two ways of copying it to your local computer:
**cloning** and **forking**.
### Cloning
Cloning a repository maps a local version of that repository to your computer,
allowing you to sync both the local (your computer) and remote (what's on
GitHub) versions. This type of copying is useful for workflows where you are
the only one working on them. If you clone a repository that you did not create
yourself, you will be unable to make changes to the 'main' repository hosted on
GitHub unless the developer has given your account collaborator approval, but
it *will* allow you to refresh and update what's on your local computer if
changes have been made to what's on the 'main' repository on GitHub.
However, for a repository that you have created yourself or are a collaborator
on, cloning allows two-way syncing; you are able to make changes to the GitHub
repository as well as update what's on your local computer if the GitHub
repository changes.
After editing, adding, or removing pieces of the project, you can re-sync those
changes by committing and then pushing the project back to GitHub. A **commit**
essentially packages and saves the changes you made; it also allows you to
comment on what exactly you changed in the repository. Meanwhile, the **push**
command sends those changes that were packaged by the commmit to GitHub and
updates the remote repository on GitHub accordingly. Because cloning creates a
direct connection between the GitHub 'main' repository and what you've cloned
to your local computer, you can only push changes to repositories that are
located in your own GitHub account, or to repositories that the developer of
the repository has explicitly made you a collaborator on.
**How to clone and update your GitHub**:
1) Create a repository on GitHub.
(<https://happygitwithr.com/new-github-first.html#make-a-repo-on-github-2>)
<br>
2) Clone the repository to your computer. (see
<https://happygitwithr.com/rstudio-git-github.html#clone-the-test-github-repository-to-your-computer-via-rstudio>)
<br>
3) After changes have been made, commit and push them to your GitHub. (see
<https://happygitwithr.com/rstudio-git-github.html#make-local-changes-save-commit>)
### Forking
Unlike cloning, forking a repository creates a totally separate, parallel copy
of a repository that is housed within your own personal GitHub account.
Specifically, if you've made changes to the code in a forked repository then
committed and pushed those changes, it does not automatically make those same
changes to the 'main' repository that it was forked from on GitHub. Instead,
all changes are saved and stored only within the repository hosted within your
own GitHub account.
Forking essentially creates a line of defense for changes in a workflow that
would otherwise disrupt the workflows of other people working within that
'main' repository project. For the 'main' repository to be updated with the
changes you made in your forked version, you must submit a pull request.
A **pull request** is an invitation to the owner of the 'main' repository to
review the changes you've made to the code and potentially merge them into
their 'main' repository. If the owner/reviewer finds that the code you've
developed would be a good addition to the workflow and/or that the code has not
diverged/disrupted others' workflows, they can approve the pull request, and
the 'main' repository will be updated accordingly.
Matt and a former grad student, Bryce Pulver, made another excellent video that
walks through an example collaborative workflow using forking:
```{r}
embed_url("https://www.youtube.com/watch?v=a_YY_PIDeq8") %>%
use_align("center")
```
**How to fork and submit a pull request**
1) Fork the repository on GitHub, then clone the forked repository to your
local environment. (see
<https://happygitwithr.com/fork-and-clone.html#fork-and-clone-without-usethis>)
<br>
2) Create a direct connection with the main repository to keep your personal
repository up-to-date. (see
<https://happygitwithr.com/fork-and-clone.html#fork-and-clone-finish>) <br>
3) After changes have been made to the repository on your local computer,
commit and push them to the forked repository on your personal GitHub. (see
<https://happygitwithr.com/rstudio-git-github.html#make-local-changes-save-commit>,
or Bryce at \~25:45) <br>
4) Go to GitHub in your web browser. Submit a pull request by going to the main
repository and selecting and creating a pull request. (Bryce does this at
\~31:40)
## Making changes to repository visibility
When you create a new repository you can choose whether you'd like its
visibility to be public or private. Within the lab our default expectation is
public, in order to make our work open and reproducible. However, in certain
circumstances repositories will be need to be private or visibility will need
to change during the lifecycle of a project.
Before you make a change to the visibility of a repository, it's important to
make sure that all of your forks are up to date and to warn any of your
collaborators that you are going to make this change. Changing visibility
***severs*** connections between all forks and the repository, so this is
important to prepare for.
### Reconnecting severed repositories
You can follow these steps if the connection between your fork and the upstream
repository is severed:
1. **Rename your fork.** This will enable you to have this (old) fork exist at
the same time as a new fork in step 2.
2. **Create a second fork of the repository.**
3. **Remove all files from the new fork.**
4. **Clone your existing fork locally using this command in the command
prompt:** `git clone --bare https://your_original_fork_url.`
5. **Move into the newly created folder from step 4** using
`cd your_original_fork_folder.`
6. **Run the following command.** It will push the branch and commit history
into the new fork: `git push --mirror https://new_fork_url.`
*Instructions adapted from [this
post](https://dev.to/ismailpe/duplicating-a-github-repo-with-history-3223) and
input from B Steele.*
## Additional resources
Below is a list of additional resources about GitHub and collaborative
workflows.
• GitHub website (<https://docs.github.com/en>).
• Forked workflows
(<https://www.atlassian.com/git/tutorials/comparing-workflows/forking-workflow>).
• Using Git and GitHub for team collaboration
(<https://medium.com/anne-kerrs-blog/using-git-and-github-for-team-collaboration-e761e7c00281>).
• Starting a new group project on GitHub
(<https://www.digitalcrafts.com/blog/learn-how-start-new-group-project-github>).
```{r, echo = F}
knitr::wrap_rmd('01-intro-to-git-github.Rmd', width = 80, backup = NULL)
#note, this will not wrap text that are prefaced by any special characters (like bullets!)
```