Emrys Koenigsmann edited this page May 7, 2024 · 1 revision

Introduction to GitHub

Purpose of this Guide

The purpose of this guide is to give an introductory overview of GitHub as a solution for VIMS researchers, students and faculty. This guide provides a basic overview of the benefits of GitHub, how to get started, and links to detailed documentation for more advanced concepts.

What is GitHub?

GitHub stands as a cornerstone in the world of software development, yet its utility extends far beyond. For the new users at VIMS, imagine a digital lab notebook that not only meticulously records every change made to your research code and data but also enables seamless collaboration with colleagues across the globe. GitHub is this and much more—it's a powerful platform for hosting code, managing projects, and fostering collaboration among researchers and developers alike. Built on Git, the most widely used version control system, GitHub provides an interface that makes tracking changes, reviewing code, and contributing to projects both efficient and straightforward.

Benefits of GitHub

GitHub is not just a platform for software developers; it is a hub for hosting code, managing projects, and facilitating collaboration across diverse fields and organizations. It is built on Git, the most popular version control system, which lets users track, record, and manage changes to code or project files over time.

  • Collaborative Environment: GitHub's collaborative features, such as forks and pull requests, make it easier for teams to work together on projects, share insights, and improve code quality through peer review.
  • Version Control: With Git at its core, GitHub provides a robust system for tracking changes to projects. This means every alteration to a program, dataset or model is recorded, allowing researchers to understand the evolution of their work and revert to previous versions if needed.
  • Code Sharing and Publishing: GitHub makes sharing projects with the global scientific community straightforward. Researchers can easily publish their code, making it accessible for others to use, cite, and build upon.
  • Project Management Tools: Beyond code hosting, GitHub offers project management tools like issue trackers and project boards, helping research teams organize tasks, prioritize work, and track progress efficiently.
  • Data Security and Privacy: GitHub offers various levels of privacy settings, including private repositories for sensitive projects that can be shared with select collaborators and public repositories for open science initiatives.
  • GitHub Organizations: More advanced management of code standards, policies and team management can be done with Organizations. This includes member management, team creation, and more advanced code review.

Basic Concepts of Git and GitHub

Before diving into the practical use of GitHub, it's important to understand some underlying concepts of Git, the version control system that GitHub is built upon.

Version Control with Git

  • What is Version Control?
    Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. It allows multiple people to work on the same project without conflicting changes.
  • Why Git?
    Git is the most widely used modern version control system in the world today. It's a distributed version control system, meaning that every developer's working copy of the code is also a repository that can contain the full history of all changes. For more information on version control and Git, check out the Git Handbook.

Repositories

  • What is a Repository?
    A GitHub repository can be thought of as a project's folder. It contains all of the project files (including documentation) and stores each file's revision history. Repositories can have multiple collaborators and can be either public or private.

Branches and Commits

  • Branching:
    Branches allow you to develop features, fix bugs, or safely experiment with new ideas in a contained area of your repository. The default branch in GitHub is called main.
  • Making a Commit:
    A commit is a snapshot of your repository at a specific point in time. When you commit, you are essentially taking a photo of your project's currently staged changes. Commit messages capture the history of your changes, so other contributors can understand what you’ve done and why.
  • Branches and Commits in GitHub:
    Learn more about branching and committing in GitHub through the GitHub branching guide and the committing changes guide.
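
As a rough illustration of branches and commits, the shell sketch below creates a throwaway repository, records one commit on main, and then records a second commit on a separate branch. The directory, file names, branch name, and identity values are placeholders chosen for this example, not anything prescribed by GitHub or VIMS.

```shell
# Work in a throwaway directory so nothing else on your machine is affected.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q

# Placeholder identity, set for this repository only.
git config user.name "Example Researcher"
git config user.email "researcher@example.edu"

# First commit: a snapshot of the staged README.
echo "Notes for a sample project" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main            # make sure the default branch is called main

# Create a branch for an experiment and commit there.
git checkout -q -b experiment
echo "trial analysis" > analysis.txt
git add analysis.txt
git commit -q -m "Add trial analysis"

# main still has one commit; experiment has two.
git log --oneline main
git log --oneline experiment
```

Because the second commit lives only on the experiment branch, main is unchanged until the branch is merged.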

Installing Git and GitHub Desktop

Before you can start using GitHub to its full potential, you'll need to have Git installed on your computer. For those who prefer a graphical user interface (GUI) instead of the command line, GitHub Desktop offers a user-friendly way to commit changes, create branches, and collaborate with others on projects.

Installing Git

Git is a free and open-source distributed version control system that's essential for using GitHub. Here's how to install Git on different operating systems:

  • Windows: The official Git website provides an easy-to-use installer for Windows users. You can download and install Git from Git for Windows.
  • macOS: For macOS users, Git can be installed through the standalone installer or using a package manager like Homebrew. Instructions and download links are available at Git for macOS.
  • Linux: Linux users can install Git from their distribution's package manager. For example, on Ubuntu or Debian-based distributions, you can use sudo apt-get install git (installing packages normally requires administrator rights). For detailed instructions for various Linux distributions, visit Git for Linux.
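
Whichever installer you use, a quick way to confirm that Git is available is to ask it for its version from a terminal. This is a minimal check, and the exact version number printed will differ from machine to machine.

```shell
# Print the installed Git version; an error here usually means Git is not on your PATH.
git --version
```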

Installing GitHub Desktop

GitHub Desktop provides a visual interface to manage your repositories, commit changes, open pull requests, and collaborate with others. GitHub Desktop is available for both Windows and macOS:

  • Windows & macOS: You can download GitHub Desktop directly from the GitHub Desktop website. The site detects your operating system and offers the correct version for your system.

Further Reading:

  • For comprehensive instructions and troubleshooting during the installation process of Git, refer to the Git installation guide on the official Git documentation.
  • To learn more about GitHub Desktop and its features, check out the GitHub Desktop documentation.

Getting Started with GitHub

Setting Up a GitHub Account

  1. Sign Up: Begin by visiting GitHub's sign-up page. You'll be prompted to enter a username, email address, and password. Choose a username that reflects your professional identity, as this will be visible to collaborators and the public. You will need to use your VIMS email address to access VIMS resources on GitHub.
  2. Verify Your Email: After signing up, GitHub will send a verification email to confirm your account. Click the verification link to activate your account.
  3. Personalize Your Profile: Adding information like your real name, a profile picture, and bio can make your profile more recognizable to collaborators. Navigate to your profile settings to customize these elements.

For a detailed guide on creating and setting up your GitHub account, refer to the official documentation.

Navigating the GitHub Interface

Once your account is set up, take some time to explore the GitHub interface. Key areas to familiarize yourself with include:

  • Dashboard: Your dashboard is the first screen you see upon logging in. It shows activity from repositories you're interested in and provides quick access to your work.
  • Repositories: This is where your projects live. Each repository contains all of the project files and records its history of changes.
  • Issues: Issues are used to track ideas, enhancements, tasks, or bugs for work on GitHub. They're a key component of collaborative project management.
  • Pull Requests: Pull requests let you tell others about changes you've pushed to a branch in a repository. They are central to GitHub's collaborative model, allowing others to review and contribute to your projects.

The GitHub Quickstart Guide offers comprehensive insights into navigating and utilizing GitHub's interface efficiently.

Creating Your First Repository

A repository (or "repo") is where your project's files and revision history are stored. To create a new repo:

  1. Create New Repository: Click the "+" icon in the top-right corner of GitHub and select "New repository."
  2. Repository Name: Choose a name that clearly describes your project. If you're working on a specific research project, consider naming it after that project.
  3. Description (Optional): Providing a brief description of your repository can help others understand the purpose of your project.
  4. Privacy Settings: Decide whether your repository will be public (anyone can see this repository) or private (you choose who can see and commit).
  5. Initialize the Repository: Optionally, you can initialize your repository with a README, .gitignore, and license. A README is highly recommended as it provides information about your project.
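
The steps above are carried out in the GitHub web interface. To give a feel for what a README and a .gitignore do, the sketch below builds a similar starting point in a throwaway local repository. The file contents and ignore patterns are made-up examples.

```shell
# Throwaway repository for the demonstration.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q

# A README describing the project (contents are placeholders).
printf '# sample-project\n\nShort description of the project.\n' > README.md

# A .gitignore listing patterns that Git should not track.
printf '*.log\nscratch/\n' > .gitignore

# Create files that match the ignore patterns.
echo "temporary output" > run.log
mkdir scratch
echo "draft" > scratch/notes.txt

# Only README.md and .gitignore are reported as untracked;
# run.log and the scratch folder are ignored.
git status --short
```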

For more detailed instructions on creating a repository, check out the GitHub documentation on repositories.

Making Your First Commit

Committing is the process of saving your changes to the repository's history. Here's how to make your first commit:

  1. Create a File: In your new repository, click "Add file" > "Create new file."
  2. Name Your File: Give your file a name. If it's a README, name it README.md.
  3. Edit File: Add information about your project or any text to the file.
  4. Commit Changes: Scroll down, enter a commit message that describes your changes, and click "Commit new file."

For further details on making commits, consult the GitHub documentation on commits.

Working with Repositories

This section walks you through cloning a repository, making changes, and then pushing those changes back to the repository on GitHub.

Cloning a Repository

Cloning a repository means creating a local copy on your computer of a repository that exists on GitHub. This allows you to work on your project's files locally. There are many ways to clone and work with your repositories, and this guide only covers the most basic methods. Please refer to GitHub's documentation on cloning a repository for more advanced information.

Steps to Clone a Repository

  1. Find the Repository: Navigate to the main page of the repository on GitHub.
  2. Copy the URL: Click on the "Code" button and copy the URL shown under "Clone with HTTPS".
  3. Clone using Git: Open your terminal, navigate to the directory where you want the repository to be copied, and type git clone, followed by the URL you copied.
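
The clone step can be sketched as follows. So that the example runs without network access, it clones a small stand-in repository created on disk; with a real project you would set repo_url to the HTTPS URL copied from the "Code" button instead (something of the form https://github.com/OWNER/REPO.git, where OWNER and REPO are placeholders).

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# Stand-in for a GitHub repository so the sketch runs offline.
# With a real project, skip this and use the URL copied from GitHub.
git init -q --bare sample-project.git
repo_url="$work_dir/sample-project.git"

# Clone into a new folder named sample-project.
git clone -q "$repo_url" sample-project

# The clone remembers where it came from as the remote called "origin".
cd sample-project
git remote -v
```

Git may warn that the stand-in repository is empty; that is expected here and does not occur with a project that already has commits.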

Making Changes and Committing

Editing Files:
Once you have cloned the repository, you can begin making changes to the files. Use your favorite text editor or IDE to edit, add, or delete files.

Staging Changes:
After making changes, you need to stage them using Git. Staging is like preparing a snapshot of your changes before committing them.

  • To stage changes, use the command git add . to stage all new and modified files in the current directory, or git add <file> to stage a specific file.

Committing Changes:
With your changes staged, it's time to commit them. A commit is a snapshot of your repository at a particular point in time.

  • To commit your changes, type git commit -m "Your meaningful commit message" in the terminal. Replace "Your meaningful commit message" with a brief description of what your changes entail.
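
Put together, the edit, stage, and commit cycle might look like the sketch below. It uses a throwaway repository, and the file names, commit messages, and identity values are invented for illustration.

```shell
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.edu"

# Edit: create two files.
echo "station,temperature" > data.csv
echo "analysis steps" > notes.txt

# Stage only data.csv, then inspect what is staged and what is not.
git add data.csv
git status --short        # data.csv is staged (A), notes.txt is untracked (??)

# Commit the staged snapshot; notes.txt is left out of this commit.
git commit -q -m "Add station temperature data"

# Stage everything else in the current directory and commit again.
git add .
git commit -q -m "Add analysis notes"

git log --oneline
```

Staging files separately like this lets you group related changes into their own commits, even when you edited several files at once.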

Check out Committing changes to your project for more on committing.

Pushing Changes

What is Pushing?
Pushing is the act of sending your committed changes to a remote repository on GitHub. It updates the repository with your latest commits.

How to Push Changes:

  1. Open your terminal.
  2. Navigate to your repository's directory.
  3. Type git push origin main to push your changes to the main branch. Replace main with the name of the branch you're pushing to if different.
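
The sketch below shows a push from start to finish. To keep it runnable offline, a bare repository on disk stands in for the repository on GitHub; the push command itself is the same one you would use against a real GitHub remote. Names and identity values are placeholders.

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# Stand-in for the repository on GitHub (a bare repository on disk).
git init -q --bare remote.git
git clone -q "$work_dir/remote.git" local-copy
cd local-copy
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.edu"

# Make and commit a change locally.
echo "first draft" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main

# Send the commit to the main branch of the remote named origin.
git push -q origin main

# The remote now lists the same commit as the local branch.
git ls-remote origin main
```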

For more information on pushing changes, visit Pushing commits to a remote repository.

Best Practices for Repository Management

  • Regularly Pull Changes: If you're working collaboratively, regularly pull changes from the remote repository to keep your local copy up to date.
  • Use Meaningful Commit Messages: This helps others understand the purpose of your changes and track project history effectively.
  • Manage Branches Wisely: Use branches to isolate development work without affecting other branches in the repository.
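
To illustrate the first practice, the sketch below simulates two collaborators sharing one repository: one pushes a change and the other pulls it before starting new work. The collaborator names, files, and the on-disk stand-in for GitHub are all assumptions made for this example.

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# Stand-in for the shared repository on GitHub, so the sketch runs offline.
git init -q --bare shared.git

# First collaborator publishes an initial commit on main.
git clone -q shared.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.edu"
echo "draft 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git branch -M main
git push -q origin main

# Second collaborator clones the project. --branch selects main explicitly,
# because the stand-in's default branch may not be main.
cd "$work_dir"
git clone -q --branch main shared.git bob

# Alice pushes another change...
cd "$work_dir/alice"
echo "draft 2" >> notes.txt
git commit -q -am "Revise notes"
git push -q origin main

# ...and Bob pulls it before starting new work.
cd "$work_dir/bob"
git pull -q origin main
git log --oneline
```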

Collaboration on GitHub

Collaboration is at the heart of GitHub's design, enabling multiple people to work together on projects from anywhere in the world. This section will cover how to use forks, pull requests, and issues to collaborate effectively on GitHub.

Forking a Repository

Forking a repository allows you to create a personal copy of someone else's project. This copy is independent of the original repository and resides in your GitHub account. You can make changes in your fork without affecting the original repository.

How to Fork a Repository:

  1. Navigate to the repository you want to fork on GitHub.
  2. In the top-right corner of the page, click the "Fork" button.
  3. Once the process is complete, you'll be taken to your copy of the repository.

When to Fork:
Forking is useful when you want to contribute to a project you don't have write access to. It's also used to start your own projects based on the work of others.

Pull Requests

A pull request (PR) is a way to propose changes you've made in a repository. When you open a PR, you're requesting that someone review and pull your contribution into their branch.

Creating Pull Requests:

  1. Push your changes to your fork.
  2. Navigate to the original repository where you want your changes to be merged.
  3. Click the "Pull Request" button.
  4. Select your fork and the branch with your changes.
  5. Add a title and description to explain your changes.
  6. Click "Create Pull Request."
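
Step 1 above happens on the command line. One common way to do it, sketched below, is to put the change on its own branch and push that branch to your fork. A bare repository on disk stands in for the fork so the example runs offline, and the branch name, file contents, and identity are placeholders. The pull request itself is then opened in the GitHub web interface.

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# Offline stand-in for your fork on GitHub.
git init -q --bare fork.git
git clone -q fork.git my-fork
cd my-fork
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.edu"

# Starting point: one commit on main, already present in the fork.
echo "Sample project" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main
git push -q origin main

# Put the proposed change on its own branch.
git checkout -q -b fix-readme-wording
echo "Sample project for station data" > README.md
git commit -q -am "Clarify README description"

# Push the branch to the fork; -u records origin/fix-readme-wording as its upstream.
git push -q -u origin fix-readme-wording

git branch -vv
```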

Best Practices for Pull Requests:

  • Provide a clear and detailed description of your changes and the reasons for them.
  • Make sure your PR is focused on a specific issue or feature to facilitate the review process.
  • Engage in the discussion if reviewers have questions or feedback.

Issues

The Issues feature is a great way to keep track of tasks, enhancements, and bugs for your projects. They're a place where you can discuss project-related problems and decide on the best ways to tackle them.

Creating and Managing Issues:

  1. In the repository, navigate to the "Issues" tab.
  2. Click "New Issue" to create a new one.
  3. Provide a title and description for the issue. Be as specific as possible.
  4. Assign labels to categorize the issue (e.g., bug, feature request).
  5. Submit the new issue.

Collaborating Through Issues:
You can mention other users to bring them into a conversation, assign issues to specific collaborators, and tag issues in pull requests to link them together.

GitHub Organizations

GitHub Organizations give groups, companies, and open-source projects a shared platform for collaborative work. They act as a hub for collective project management, allowing multiple users to work under a single organization with different permission levels across its projects.

What is a GitHub Organization?

An organization on GitHub can include multiple members and repositories. It provides a shared space where teams can collaborate across many projects at once. Permissions and access can be managed and structured according to the organization's needs.

To create a new organization, please contact VIMS ITNS by opening a support ticket. Please provide your department, team or lab name that will be using the organization, as well as your preferred acronym. ITNS will need two “owners” who will be responsible for managing the new GitHub Organization as well.

Details for setting up an organization can be found in the GitHub documentation on organizations.

Teams and Permissions

Teams
Within an organization, you can create teams to organize group members according to their role or the project they are working on. Teams can have their own repositories, and access levels can be customized for each team.

Permissions
GitHub Organizations provide several levels of permissions, which can be assigned to teams or individual users. These permissions control the level of access to the organization's repositories and settings.

User Management

Organization owners can invite users to become members, manage user roles, and handle other administrative tasks.

Explore more about teams and permissions in the GitHub documentation on setting up and managing organization teams.

Repository Management

Repositories within an organization are owned by the organization itself, rather than by individual members. This allows for seamless collaboration and ensures the continuity of the project even as individual members come and go. Access to each repository is controlled through team permissions and individual collaborator settings.

Learn how to manage organization repositories with the guide on Repository permission levels for an organization.

Best Practices for GitHub Organizations

  • Use Teams Effectively: Organize members into teams for better collaboration and control over repository access.
  • Consistent Naming Conventions: Adopt a standard naming convention for your repositories and teams for clarity and organization.
  • Implement Access Controls: Regularly review and update permissions to ensure that members have appropriate access.
  • Leverage Organization Features: Utilize features like project boards, issue templates, and automated workflows to enhance collaboration.

Best Practices and Tips for Using GitHub

Adopting best practices on GitHub not only improves the quality of your work but also facilitates better collaboration among team members. Below are some quick tips that can help you and your team work more effectively.

Clear and Consistent Commit Messages

Why It's Important:
Clear commit messages communicate the changes made in that commit to other team members (and to your future self). They are crucial for understanding the history of a project.

Tips:

  • Begin with a short, concise summary of the commit (ideally less than 50 characters).
  • Use the imperative mood ("Add feature" not "Added feature" or "Adds feature").
  • If necessary, follow the summary with a more detailed explanation.
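
As an example of these tips, the sketch below writes a commit with a short imperative summary followed by a longer explanation. Passing -m twice makes Git store the second text as a separate paragraph in the message body. The file, wording, and identity are invented for illustration.

```shell
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.edu"

echo "station,temperature,salinity" > stations.csv
git add stations.csv

# First -m: short summary in the imperative mood.
# Second -m: optional body explaining what changed and why.
git commit -q \
  -m "Add salinity column to station data" \
  -m "Rows without a salinity reading are left empty so that missing values stay distinguishable from zero."

# Show the summary line and the body separately.
git log -1 --format=%s
git log -1 --format=%b
```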

Structured Branching Strategy

Why It's Important:
A well-structured branching strategy can prevent chaos in your repository, especially as more people contribute.

Tips:

  • Use the main or master branch for stable and deployable code.
  • Develop features in feature branches, which are then merged back into the main code line.
  • Establish naming conventions for branches (e.g., feature/, bugfix/, hotfix/).
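
A minimal local sketch of this workflow is shown below: a feature is developed on a branch with a feature/ prefix, merged back into main, and the branch is then deleted. On GitHub the merge would usually happen through a pull request rather than a local git merge. The branch and file names are placeholders.

```shell
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.edu"

# Stable starting point on main.
echo "Sample project" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main

# Develop the feature on a branch named with a feature/ prefix.
git checkout -q -b feature/add-plot-script
echo "plot placeholder" > plot.txt
git add plot.txt
git commit -q -m "Add plot script placeholder"

# Merge the finished feature back into main, then delete the branch.
git checkout -q main
git merge -q feature/add-plot-script
git branch -q -d feature/add-plot-script

git log --oneline main
```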

Pull Request (PR) Best Practices

Why It's Important:
PRs are a key part of collaboration. They allow the team to review code and discuss changes before they are integrated into the main project.

Tips:

  • Keep PRs small and focused to make review easier.
  • Provide context and link to related issues in the PR description.
  • Before submitting a PR, rebase your branch on the latest main or master branch to ensure a clean merge.
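
The rebase tip can be sketched as follows. A bare repository on disk stands in for GitHub, and the commit that lands on main while you work is simulated from the same clone; in practice it would come from a collaborator. Branch names, files, and identity are placeholders.

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"
git init -q --bare shared.git                    # offline stand-in for GitHub
git clone -q shared.git project
cd project
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.edu"

# Starting point: one commit on main, published to the remote.
echo "base" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main
git push -q origin main

# Your work happens on a feature branch.
git checkout -q -b feature/methods
echo "methods" > methods.txt
git add methods.txt
git commit -q -m "Add methods section"

# Meanwhile main moves forward on the remote (simulated here).
git checkout -q main
echo "license" > LICENSE.txt
git add LICENSE.txt
git commit -q -m "Add license"
git push -q origin main

# Before opening the PR: fetch the latest main and replay your commits on top of it.
git checkout -q feature/methods
git fetch -q origin
git rebase -q origin/main

git log --oneline
```

After the rebase, the feature branch contains everything on main plus your own commit, so it can be merged without an extra merge step. Rebasing rewrites the branch's commits, so it is best done before others have based work on that branch.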

Code Review

Why It's Important:
Code reviews help maintain code quality and share knowledge across the team.

Tips:

  • Review code for functionality, style, and design.
  • Provide constructive feedback.
  • Avoid large code reviews when possible to keep them manageable.

Use Issues and Milestones

Why It's Important:
Issues allow you to track tasks, enhancements, bugs, and other types of work within GitHub.

Tips:

  • Be descriptive when creating an issue.
  • Use labels and milestones to organize issues.
  • Close issues once they are resolved.

Automate Repetitive Tasks

Why It's Important:
Automation saves time and prevents human errors.

Tips:

  • Use GitHub Actions for continuous integration and deployment.
  • Automate code linting and tests.
  • Set up project boards for automated task management.

Documentation

Why It's Important:
Good documentation is key to open source and collaborative projects because it explains how to use and contribute to the project.

Tips:

  • Include a README file with an introduction to the project and instructions for setting it up and contributing.
  • Use a CONTRIBUTING file to detail the process for submitting changes.
  • Comment your code where necessary and keep the comments up to date.

Resources for Further Learning

Official GitHub Learning Resources

  • GitHub Docs: This comprehensive resource is the first place you should look for detailed documentation on every aspect of GitHub.
  • GitHub Learning Lab: Interactive courses to learn GitHub by doing. It covers everything from GitHub basics to advanced topics.
  • GitHub Guides: A collection of short guides on the different features of GitHub.

William & Mary Resources

  • LinkedIn Learning: Formerly Lynda.com. LinkedIn Learning is an online educational platform for business, technology-related, and creative skills through expert-led course videos. W&M login required.
  • Coursera: Introduction to Git and GitHub is a free, online class for learning about Git repositories and GitHub.

Additional Resources

The Turing Way is an open source, open collaboration, and community-driven handbook to reproducible, ethical and collaborative data science. Below are the two sections of the handbook most relevant to GitHub.