Creating a Repository

Last updated on 2024-03-04

Estimated time: 25 minutes

Overview

Questions

  • Where does Git store information?

Objectives

  • Create a local Git repository.
  • Describe the purpose of the .git directory.

The Git jargon


Git comes with a lot of specialised words for doing version control.

Word cloud for Git from https://thoughtbot.com/blog/recommending-blog-posts

We will place these words using the workflow below as a template. We will relate the version control actions we can perform to specific Git verb commands. These verbs record your changes as they move between the Git spaces associated with your folder.

Workflow will show actions, git verb commands, and spaces.

In this episode, we are going to learn how to initialize Git to create a Local Repository in our folder, also known as the Working directory or Workspace.

Initialize a Local Repository in your Workspace with the git init command verb

Let’s start with a new R project in RStudio.

PREREQUISITES

To start, you need to be out of any R project. In RStudio, close your project from File > Close Project. You can confirm this in the upper right corner, which should show Project: (None).

Create a local repository


Once Git is configured, we can start using it.

We will continue with the story of Wolfman and Dracula, who are investigating a disease outbreak and building a situational report.

wolfman and dracula using computers for data analysis
Image by Bing, 2023, CC BY 4.0, created with Bing Image Creator powered by DALL·E 3

First, let’s create a new project folder for our work. You can create the project any way you like. Here we are going to use functions from the usethis package.

If using RStudio desktop, the project is opened in a new session. Otherwise, the working directory and active project are changed:

R

usethis::create_project(path = "cases")

OUTPUT

✔ Creating 'cases/'
✔ Setting active project to 'C:/~/cases'
✔ Creating 'R/'
✔ Writing 'cases.Rproj'
✔ Adding '.Rproj.user' to '.gitignore'
✔ Opening 'C:/~/cases/' in new RStudio session
✔ Setting active project to '<no active project>'

Then we tell Git to make cases a repository -- a place where Git can store versions of our files:

R

usethis::use_git()

OUTPUT

✔ Setting active project to 'C:/~/cases'
✔ Initialising Git repo
✔ Adding '.Rhistory', '.Rdata', '.httr-oauth', '.DS_Store', '.quarto' to '.gitignore'
There are 2 uncommitted files:
* '.gitignore'
* 'cases.Rproj'
Is it ok to commit them?

Remember that each recorded set of changes is a commit. You can make these two files, .gitignore and cases.Rproj, part of the first commit. Select the affirmative option to agree!

OUTPUT

✔ Adding files
✔ Making a commit with message 'Initial commit'
• A restart of RStudio is required to activate the Git pane
Restart now?

Agree to restart your session to activate the Git pane in RStudio:

The Git tab in the Environments pane shows the status of your repository.

The Git tab is in the Environments pane, usually in the upper right corner of the RStudio IDE.

In this and the next episodes you’ll learn the function of all those buttons at the top of the Git tab!

It is important to note that usethis::use_git() will create a repository that can include subdirectories and their files—there is no need to create separate repositories nested within the cases repository, whether subdirectories are present from the beginning or added later. Also, note that the creation of the cases directory and its initialization as a repository are completely separate processes.

This step is known as git init because it initialises your Git repository.

Checklist

Set up Git once per computer. Initialize Git once per project.

Find the new files of a local repository


If we look at the Files tab in the Output pane to show the directory’s contents, it appears that nothing has changed.

But the “cogwheel” button gives us access to “More file commands”. Click Show hidden files to show everything. We can see that Git has created a hidden directory within cases called .git:

Show hidden files in a Local repository.

The .git directory gives the repository its identity. It is also known as the Local Repository, or “Local Repo”.

The .git folder is a hidden folder in a Local repository.

Git uses this special subdirectory to store all the information about the project, including the tracked files and sub-directories located within the project’s directory. If we ever delete the .git subdirectory, we will lose the project’s history.

From the Console to the Terminal

Now, we are going to use the RStudio Terminal. The Terminal tab is next to the Console tab.

Click on the Terminal tab and a new terminal session will be created (if there isn’t one already).

Visual appearance of the Terminal.

Alternatively, we can run the ls -a command in the RStudio Terminal to see the hidden directory called .git/:

BASH

$ ls -a

OUTPUT

./   .git/       .Rhistory     cases.Rproj
../  .gitignore  .Rproj.user/  R/

Important!

The .git directory is the Local Repository. This is one of the Git spaces we talked about in the introduction of this episode!

Git stores all of its repository data (and your coming changes!) in the .git directory.
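To make this more concrete, here is a minimal sketch that looks inside a freshly initialised .git directory. It assumes a Bash terminal with Git installed, and it works in a throwaway directory (the variable name demo is ours, not part of the lesson) so that the cases project is left untouched.

```shell
# Work in a throwaway directory so the real 'cases' project is untouched.
demo="$(mktemp -d)"
cd "$demo"
git init -q      # creates the hidden .git directory

ls -a .git       # typically includes HEAD, config, objects/ and refs/

# HEAD records which branch is currently checked out.
cat .git/HEAD
```

The exact listing can vary between Git versions, but HEAD, objects/ and refs/ should be there. You rarely need to edit these files by hand.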

Initialize a Local Repository in your Workspace with the git init command verb

Check the status


To interact with Git, we can also use the RStudio Terminal.

In the RStudio Terminal, we can check that everything is set up correctly by asking Git to tell us the git status of our project:

BASH

$ git status

OUTPUT

On branch main

nothing to commit, working tree clean

If you are using a different version of git, the exact wording of the output might be slightly different.
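As a small illustration of how the status report reacts to the Workspace, the sketch below compares git status before and after a file is created. It assumes a Bash terminal and uses a throwaway repository; the file name report.txt is hypothetical. The --short option asks Git for a compact, one-line-per-file report.

```shell
# Throwaway repository, separate from the 'cases' project.
demo="$(mktemp -d)"
cd "$demo"
git init -q

before="$(git status --short)"   # empty: a fresh repository has nothing to report
touch report.txt                 # hypothetical new file in the Workspace
after="$(git status --short)"    # '?? report.txt': Git sees the file but does not track it yet

printf 'before: [%s]\nafter:  [%s]\n' "$before" "$after"
```

The ?? marker means the file is untracked: it exists in the Workspace but is not yet part of any commit.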

Checklist

Use the git init command to initialize a Local Repository in your Workspace. Use git status to check the status of the repository.

The steps done with usethis can also be done with commands in the Terminal. For example, instead of usethis::use_git() in the Console you can use git init in the Terminal. However, we prefer the first one given its explicit messages, interactivity, and warnings to prevent errors!
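For comparison, a rough Terminal equivalent of the usethis steps might look like the sketch below. It is an approximation under stated assumptions, not what usethis runs internally: it works in a throwaway directory, writes a one-line .gitignore by hand, does not create a .Rproj file, and passes a placeholder name and email inline so that the commit succeeds even on a machine where Git is not configured yet.

```shell
# Throwaway parent directory so the real 'cases' project is untouched.
demo="$(mktemp -d)"
cd "$demo"

mkdir cases      # create the project folder (no cases.Rproj is written here)
cd cases
git init -q      # initialise the Local Repository: creates cases/.git

# use_git() fills in .gitignore for us; here we add a single entry by hand.
printf '.Rhistory\n' > .gitignore

git add .gitignore
# Placeholder identity, passed inline only so the sketch runs anywhere.
git -c user.name="Dracula" -c user.email="dracula@example.org" \
    commit -q -m "Initial commit"

git --no-pager log --oneline   # one commit with the message 'Initial commit'
```

Each line here corresponds to something usethis did for us with a confirmation prompt, which is why the R functions are the gentler route.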

Git has a verb command similar to the help() function in R.

Always remember that if you forget the subcommands or options of a git command, you can access the relevant list of options by typing git <command> -h or access the corresponding Git manual by typing git <command> --help, e.g.:

BASH

$ git config -h
$ git config --help

While viewing the manual, remember the : is a prompt waiting for commands and you can press Q to exit the manual.

More generally, you can get the list of available git commands and further resources of the Git manual by typing:

BASH

$ git help

For complementary resources, refer to the Git Cheatsheets for Quick Reference inside this tutorial website.

Places to Create Git Repositories

Along with tracking information about cases (the project we have already created), Dracula would also like to track information about interventions. Despite Wolfman’s concerns, Dracula creates an interventions project inside his cases project and initializes Git. Dracula uses a sequence of commands in the RStudio Console:

R

usethis::create_project(path = "interventions")
usethis::use_git()

Is the usethis::use_git() command, run inside the interventions subdirectory, required for tracking files stored in the interventions subdirectory?

No. Dracula does not need to make the interventions subdirectory a Git repository because the cases repository can track any files, sub-directories, and subdirectory files under the cases directory. Thus, in order to track all information about interventions, Dracula only needed to add the interventions subdirectory to the cases directory.
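We can check this behaviour with a small sketch. It assumes a Bash terminal and recreates a cases repository in a throwaway directory; the file name notes.txt is hypothetical.

```shell
# Throwaway copy of the scenario, separate from the real 'cases' project.
demo="$(mktemp -d)"
cd "$demo"
mkdir cases
cd cases
git init -q                      # only the outer directory becomes a repository

mkdir interventions              # a plain subdirectory, no second 'git init'
touch interventions/notes.txt    # hypothetical file

git status --short               # prints '?? interventions/'
```

The outer repository already notices the new subdirectory, even though interventions has no .git directory of its own.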

Additionally, Git repositories can interfere with each other if they are “nested”: the outer repository will try to version-control the inner repository. Therefore, it’s best to create each new Git repository in a separate directory. To be sure that there is no conflicting repository in the directory, check the output of git status from the Terminal. If it looks like the following, you are good to go to create a new repository as shown above:

BASH

$ git status

OUTPUT

fatal: Not a git repository (or any of the parent directories): .git
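The same check can be scripted. The sketch below is one possible approach, assuming a Bash terminal: the helper function inside_repo is hypothetical (our own name), and it relies on git rev-parse --is-inside-work-tree, which succeeds only when the directory belongs to a repository. GIT_CEILING_DIRECTORIES is set so that Git does not search above the throwaway directory.

```shell
# Hypothetical helper: succeeds if the directory already belongs to a repository.
inside_repo() {
  git -C "$1" rev-parse --is-inside-work-tree >/dev/null 2>&1
}

demo="$(mktemp -d)"
# Stop Git from searching above the throwaway directory, so the result
# does not depend on where temporary files live on this machine.
export GIT_CEILING_DIRECTORIES="$demo"

mkdir "$demo/cases" "$demo/cases/interventions" "$demo/elsewhere"
git -C "$demo/cases" init -q     # only 'cases' is a repository

for dir in cases cases/interventions elsewhere; do
  if inside_repo "$demo/$dir"; then
    echo "$dir: already inside a repository"
  else
    echo "$dir: safe to initialise"
  fi
done
```

Note that cases/interventions is reported as already inside a repository: Git searches the parent directories too, which is exactly why a nested git init is unnecessary.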

Actually, if you try to create a new project using usethis within the cases repository, you will get this message:

R

usethis::create_project(path = "interventions")

OUTPUT

New project 'interventions' is nested inside an existing project './', which is rarely a good idea.
If this is unexpected, the here package has a function, `here::dr_here()` that reveals why './' is regarded as a project.
Do you want to create anyway?

Using the R functions from the usethis package can be less error-prone!

Lastly, Dracula used a Bash command in the Terminal to create a subdirectory.

BASH

$ mkdir interventions    # make a subdirectory cases/interventions

If you are interested to learn more about Bash commands, we invite you to read this tutorial on Bash commands!


Key Points

  • Use usethis::use_git() to initialize a repository.
  • Git stores all of its repository data in the .git directory.
  • Use git status in the Terminal to check the status of a repository.