use git init to initialize a git repo in your project’s
root directory
view contents of the project folder before and after git
initialization
Open a terminal emulator and use cd to navigate to the
root directory of the project that you would like to turn into a git
repository.
You can use the chunk of code below, where
<path/to/your/project> should be replaced with an
appropriate absolute or relative file path to the root directory of your
project.
cd <path/to/your/project>
Once you have navigated to the correct directory, use the
ls command to list files in the directory. Take note of
what files are included. Then use the ls -a command to list
all files in the directory.
Now initialize a git repository using the following command.
git init
Again, use the ls and ls -a commands to
list files in the directory. What has changed?
The output of ls should be the same as before. However,
when you use ls -a you should now see a new directory
called .git.
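The before-and-after comparison can be rehearsed in a throwaway directory first. This is only a sketch: the scratch directory and the file name analysis.R are assumptions made for illustration, not part of the course project.

```shell
# Scratch directory so the experiment cannot touch a real project.
demo="$(mktemp -d)"
cd "$demo"
touch analysis.R     # hypothetical project file

ls                   # lists analysis.R
ls -a                # also lists . and .. but no .git yet

git init -q          # -q suppresses the informational message

ls                   # unchanged: still only analysis.R
ls -a                # now includes the hidden .git directory
```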
The .git directory is where all relevant information for
version control is stored. There are several files and subdirectories
included in this directory. In general, you should not need to worry
about what is happening in this directory. However, you should note
that:
the only thing that makes something a git repository is the presence
of this .git directory
deleting this directory will remove the project from version
control
if your project is also stored in a SharePoint/Dropbox folder, you
need to make sure that the contents of this directory stay synchronized
across the computers that you work with. Strange behavior can result from
incompletely synchronized .git folders.
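The first two points can be checked in a scratch directory. The sketch below assumes a temporary directory created just for this test; do not run the rm line inside a project whose history you want to keep.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Inside a repository, git can report where its metadata lives.
git rev-parse --git-dir      # prints .git

# Removing .git leaves project files alone but discards all history.
rm -rf .git
if git rev-parse --git-dir >/dev/null 2>&1; then
  state="still a repository"
else
  state="no longer a repository"
fi
echo "$state"
```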
If you would like to read more about what is included in the
.git folder, you can check out the resources below:
Before you start, make sure that your terminal emulator is still in
the correct directory by using pwd. If you are not in the
correct folder, use cd to navigate to the root directory of
your folder.
Once you are in the correct directory, execute the following
command:
git status
The output will look something like this:
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
.Renviron
.Rhistory
...
...
Note that git is telling you that there are files in this directory
that are not being tracked.
To stage files for a new commit, we generally use the command
git add <file_name1> <file_name2> <...>
Exercise: Use this command to stage all of the R
scripts in the project for commit.
We could use the following command:
git add *.R
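If this command is executed from the project's root directory, then the
shell expands *.R to every file with the .R extension in that directory.
Quoting the pattern (git add '*.R') passes it to git unexpanded, so that
git also matches .R files in subdirectories.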
Or if all your R scripts are contained in a directory, e.g., named
source, you could add the folder.
git add source
Pattern matches can be useful for adding multiple files, but keep two
things in mind:
First, consider avoiding commits that bundle multiple files
implementing unrelated changes. It may be better to split the files
across different commits.
Second, the behavior of * is somewhat unexpected in this
circumstance:
git add *
This will add all files in your project except any dot-files
(i.e., files whose name begins with .). This may miss some
important files.
To add all files correctly use:
git add --all
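The difference between the two commands can be seen in a scratch repository. This is a sketch under the assumption of a bash-like shell with default globbing settings; the file name script.R is made up for the test.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
touch script.R .Rhistory          # one regular file and one dot-file

git add *                         # the shell expands * to script.R only
after_star="$(git status --short)"
echo "$after_star"                # script.R is staged, .Rhistory is still untracked (??)

git add --all                     # git walks the directory itself, dot-files included
git status --short                # both files are now staged
```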
Before committing, stage the analysis directory for commit as
well.
git add analysis
Then make the commit using the following command:
git commit -m "<add-your-message-here>"
Be sure to replace <add-your-message-here> with an
informative message!
Take a final look at git status. What has changed?
Run the following command to view the project history:
git log --oneline
Second commit
In this exercise, we will:
make a modification to a file
use git diff to see what modifications have been made
to a file
commit a new version of the file
use git log to look at the history of a project
Open the source/04_data_visualization.R script in the R
Studio editor. Change something about the plot code. For example, you
could add a different theme to a plot:
plot1 = covid %>%
  mutate(county = factor(county),
         county = fct_reorder(county, concentration)) %>%
  ggplot(aes(county, log(concentration), group = county, fill = county)) +
  geom_boxplot() +
  # apply the complete theme first so that it does not reset the tweaks below
  theme_bw() +
  theme(legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1))
Save the file in R Studio. Confirm that the file has saved by
ensuring that the name of the file has changed from red to black. (You
would be surprised how much confusion can be caused by forgetting to
save files.)
Take a look at git status. What does it have to say
about the modified file?
Use the following command to view the difference between the new
version of the file and the version on the current commit.
git diff source/04_data_visualization.R
If all of the changes fit on one screen, the differences will simply
print to the console.
If there are A LOT of changes, then you can scroll through the
differences using enter, page up/down, or the arrow keys. When you
are done browsing the changes, press q to return to the
command line.
Next, stage the file for commit:
git add source/04_data_visualization.R
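One detail worth knowing at this point: once a change is staged, plain git diff no longer shows it, and git diff --staged is needed instead. A minimal sketch in a scratch repository follows; the file script.R and the "Demo User" identity are placeholders, not part of the course project.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, scratch repo only
git config user.email "demo@example.com"

echo 'x <- 1' > script.R
git add script.R
git commit -q -m "Add script"

echo 'x <- 2' > script.R       # modify the tracked file
git diff script.R              # shows the unstaged change

git add script.R
git diff script.R              # prints nothing: the change is now staged
git diff --staged script.R     # shows the staged change instead
```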
Exercise: We are now going to do something a little
bit scary. We are going to run the git commit command but
forget to include a message 😱
git commit
More than likely you will see the following on the
screen:
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
But if you try to type, nothing happens. What the…
The exercise is simple: escape!
What happened here? When you forget to include a commit message, git
launches a text editor program on your computer. The default text editor
that git uses is vim, a
powerful, but painfully minimal and more painfully complex editor.
The key part of this solution is: DON’T PANIC. We will get through
this together.
Follow these precise steps:
type i
you should see -- INSERT -- at the bottom of the
screen
now when you type you will actually see letters appear!
use the keyboard as normal to type your commit message
notice that you cannot use the mouse to select the cursor's position,
you have to use the arrow keys
when you have finished your message press esc
you should no longer see -- INSERT -- at the
bottom
type :x (colon and the x key)
you will see :x appear at the bottom of the screen
press enter to save the commit message and quit
vim
This will complete the commit. But what the heck just happened?!
In vim, we use i to enter INSERT mode,
which is where we actually get to type things.
Pressing esc exited from INSERT mode (back
to what is called NORMAL mode).
The colon key takes us into command-line mode. This is where we can
execute a host of commands (e.g., find and replace text, save a
document, etc…). Think of this like a text-based tool bar for
vim.
In command-line mode, we execute the command x which is
to save and close the document.
This is exactly what git was asking us to do to complete the commit –
save a message.
If all this was too scary for you, you have two options:
Option 1: abandon the commit! Before you do anything else in
vim type :q!, which will quit vim without
saving. This abandons the commit. Then you can use the one-liner
git commit -m "<with-a-message-this-time-dummy>" to
complete the commit.
Option 2: configure a different text editor for git! Read more here.
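As a rough sketch of Option 2, the editor is controlled by the core.editor setting. The choice of nano here is an assumption; substitute any editor installed on your machine. The example sets the value for a single scratch repository; adding --global would apply it to every repository.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Repository-local setting; add --global to apply it everywhere.
git config core.editor "nano"   # nano is an assumption; any installed editor works
git config core.editor          # prints the configured editor
```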
Third commit
In this exercise, we will:
add a temporary file to a new commit
demonstrate renaming files using git mv in a new
commit
delete a file using git rm in a new commit
In R Studio, open a new file, add whatever you want to the file, and
save it in the source directory. Name the file
foo.R.
Use the following commands to commit this new file:
git add source/foo.R
git commit -m "Add temporary file to demonstrate renaming and removing"
To rename the file use:
git mv source/foo.R source/bar.R
Check git status to see what git thinks about this
change! To commit, run:
git commit -m "Rename temporary file"
To remove the file use:
git rm source/bar.R
Check git status to see what git thinks about this
change! To commit, run:
git commit -m "Remove temporary file"
Exercise: Repeat the above process of creating a new
file in the source directory named foo.R. Add
it to a new commit using git add and
git commit. Now suppose you forget to use
git mv to rename the file and instead use plain old
mv:
mv source/foo.R source/bar.R
Check git status to see what git thinks has happened.
What do you need to do to complete the process of committing the renamed
file?
Notice that git status should show
source/foo.R as being deleted and show
source/bar.R as an Untracked file.
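One way to finish the job is to stage both the deletion and the new file, after which git can pair them up as a rename. The sketch below replays the situation in a scratch repository (the identity and file contents are placeholders) and uses git add --all restricted to the source directory.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, scratch repo only
git config user.email "demo@example.com"
mkdir source
echo '# temporary file' > source/foo.R
git add source/foo.R
git commit -q -m "Add temporary file"

mv source/foo.R source/bar.R     # plain mv: git is not told about the rename
git status --short               # foo.R deleted, bar.R untracked

git add --all source             # stage the deletion and the new file together
git status --short               # git typically reports this as a rename
git commit -q -m "Rename temporary file"
```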
Exercise: In the previous exercise, you again
renamed the file source/foo.R to source/bar.R.
Now create a commit that removes source/bar.R but
does not delete the file from your local directory.
This can be accomplished using the following (not very intuitively
named) option:
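A minimal sketch of this step, assuming the option meant here is --cached, which removes a file from git's index while leaving the copy on disk untouched. The scratch repository, identity, and commit messages are placeholders.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, scratch repo only
git config user.email "demo@example.com"
mkdir source
echo '# temporary file' > source/bar.R
git add source/bar.R
git commit -q -m "Add temporary file"

git rm --cached source/bar.R     # remove from the index only; keep the file on disk
git commit -q -m "Stop tracking temporary file"

ls source                        # bar.R is still there
git status --short               # source/ is listed as untracked (??)
```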
Using ls should confirm that source/bar.R
has not been deleted from your directory.
Checking git status should show that
source/bar.R is now treated as any other untracked file.
For example, it could be added back to a new commit.
Ignoring files
In this exercise, we will:
create a .gitignore file to ignore certain files
After completion of the previous exercise, you should have a file
named source/bar.R that is untracked by git. If you did not
complete the previous exercise or do not have such a file available,
then create one by running:
touch source/bar.R
Now we are ready to create our .gitignore file. In R
Studio, select File: New File: Text file.
Save the file as .gitignore by clicking File: Save as:
and type .gitignore.
Now add the following text to the .gitignore file
# don't want credentials in git!
.Renviron
# ignoring this file to see behavior
source/bar.R
Save the file when you are done adding the text.
Now check git status again. You should no longer see
source/bar.R appear as an untracked file.
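If you are ever unsure why a file is being ignored, git check-ignore can tell you which rule matched. The following sketch rebuilds the situation in a scratch repository with the same two ignore rules; the empty files stand in for the real ones.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
mkdir source
touch source/bar.R .Renviron

# Same two rules as in the .gitignore above, without the comments.
printf '%s\n' '.Renviron' 'source/bar.R' > .gitignore

git status --short                 # only .gitignore is listed as untracked
git check-ignore -v source/bar.R   # reports the .gitignore line that matched
```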
Exercise: You may have also noticed that git is now
treating the .gitignore file itself as an untracked file.
Here’s what should be an easy exercise by now (hopefully!) – make a new
commit that includes the .gitignore file
We can execute the following commands:
git add .gitignore
git commit -m "Add .gitignore to repo"
Pushing files to GitHub
In this exercise, we will:
create an empty GitHub repository
use git remote add to create a link between our local
repository and the GitHub repository
create ssh credentials to allow us to push to GitHub
use git push to push files to GitHub
Follow these instructions to create an empty GitHub
repository:
From the GitHub dashboard, click on the + symbol in the
upper right-hand corner and select New repository
Give the repository a name
DO NOT check the Add README file
box
DO NOT add a .gitignore
DO NOT choose a license
Click Create repository
Now go back to your R Studio terminal emulator and run the following
command to add the GitHub repository as a remote named
origin to your local repository:
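A sketch of that command, using the same ssh-style address format that appears later in these exercises. The user and repository names are hypothetical placeholders, and the scratch repository exists only to make the example self-contained; in the exercise you would run just the git remote add line from your project's root directory.

```shell
# Hypothetical placeholders: substitute your own GitHub user name and repository.
github_user="your-user-name"
github_repo="your-repo-name"

demo="$(mktemp -d)"     # scratch repository for illustration only
cd "$demo"
git init -q

git remote add origin "git@github.com:${github_user}/${github_repo}"
```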
Confirm that the remote was added as expected by listing the remotes
for your repository:
git remote -v
Now we are ready to push, except that we need to prove to GitHub that
we have permission to push to a repository that we own. To do this we
will use private/public
keys.
First, check if any keys exist already:
ls -al ~/.ssh
Look for files named
id_rsa.pub
id_ecdsa.pub
id_ed25519.pub
If you do not have these files or if the .ssh directory
does not exist, then you need to create a key pair, for example with:
ssh-keygen -t ed25519 -C "<github_account@email.com>"
Be sure to replace <github_account@email.com> with
the email address you used to create your GitHub account.
When prompted to
"Enter a file in which to save the key", press
Enter to accept the default file location.
For simplicity, when prompted to enter a passphrase, just press
Enter twice to use no passphrase. If you would like the
additional security of a passphrase, read more about configuring
authentication agents.
Re-run the ls -al ~/.ssh command to confirm the key was
created successfully. You should see a file with a .pub
ending displayed.
Now head back to your web browser and GitHub. Navigate to the GitHub
dashboard (i.e., your landing page once you sign in).
Click on the icon in the top right corner of the page (this will
be your picture, if you’ve uploaded one to your profile, otherwise it
will be a pixelated image).
Select Settings from the dropdown menu.
When the settings menu opens, on the left side of the screen,
click “SSH and GPG keys”.
Click the green button “New SSH Key”
Give the key a title (can be arbitrary)
Open the file ~/.ssh/id_<something>.pub in R
Studio.
Recall that ~ means your HOME directory.
To remind yourself of what this directory is you can run
echo $HOME in the terminal.
You may need to ensure that hidden files are shown in your file
finder to see this file in order to open it in R Studio.
Copy the contents of the file to the Key box on GitHub
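To test the connection, return to the terminal and run
ssh -T git@github.com. If the key was added successfully, you should
see a message like:
Hi <user>! You've successfully authenticated, but GitHub does not provide shell access.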
Now we are finally ready to push to GitHub. Use the command:
git push origin main
If you get an error about no branch named main, then
try:
git push origin master
Once you have pushed, refresh your browser on your GitHub repository.
You should be able to see your code!
Practicing add/commit/push
In this exercise, we will:
modify README.md file for your GitHub repository
add, commit, and push README.md to your GitHub
repository
Exercise: Open the README.md using R
Studio.
Add the following passage somewhere in the document.
This repo houses test code from the SISMID 2024 course Hitchhikers Guide to reproducibility.
Make a new commit that includes the README.md and then
push the commit to GitHub.
To add and commit the file:
git add README.md
git commit -m "Add README to repo"
To push to GitHub:
# replace main with master as appropriate
git push origin main
Creating and merging branches
In this series of exercises, we will:
create a new branch
make a commit on the new branch
merge the new branch with main
delete the new branch
Exercise: Create a branch named plot.
Check out this branch and make some more minor modifications to the
figure generated by source/04_data_visualization.R. Create
a commit with the new changes.
To create a new branch we run:
git branch plot
To move the HEAD pointer to the new branch we run:
git checkout plot
Note that there is an option that can be used as a shortcut to do
both of these steps at the same time!
git checkout -b plot
Now you can modify the plotting file as you please. E.g., you might
change the axis labels. Save the file when you are done, then stage and
commit it with git add and git commit.
Use git log --oneline to confirm that the plot
branch is now 1 commit ahead of main.
Now that we have made a commit on the plot branch, let’s
pause to see how we would view the older version of the code that is
still on the main branch.
To do this, we simply move the HEAD pointer by checking
out main.
git checkout main
View the contents of source/04_data_visualization.R in R
Studio. You should see the older version of the code.
Exercise: Merge plot into
main.
Be sure that you have main checked out. If you are not
sure, use git branch to see. The branch with the asterisk
is your current branch. If you are not on main, use
git checkout main to switch branches.
Once you are satisfied that you are on the main branch
run:
git merge plot
Check git log --oneline to confirm that the
main pointer has moved appropriately.
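The whole branch-and-merge cycle can be replayed in a scratch repository to see the pointers move. This is a sketch with placeholder file contents and identity; git rev-list --count is used here as a compact way to count how many commits one branch is ahead of another.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, scratch repo only
git config user.email "demo@example.com"
echo 'plot(1:10)' > plot.R
git add plot.R
git commit -q -m "Add plot script"
git branch -M main                        # make sure the branch is called main

git checkout -q -b plot                   # create and switch to plot
echo 'plot(1:10, xlab = "index")' > plot.R
git commit -q -a -m "Label the x axis"

ahead="$(git rev-list --count main..plot)"
echo "plot is ${ahead} commit(s) ahead of main"

git checkout -q main
git merge -q plot                         # fast-forward: main moves up to plot
git log --oneline --decorate -1           # main and plot now sit on the same commit
```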
Resolving merge conflicts
In this exercise, we will:
create a new branch
make a commit on the new branch
checkout the main branch
make a commit on the main branch
attempt to merge when conflicts are present
resolve conflicts and complete the merge
We are going to create a new branch specifically to work on our
README again. Let’s call the branch
readme:
git branch readme
git checkout readme
Open README.md in R Studio. Modify the line:
This repo houses test code from the SISMID 2024 course Hitchhikers Guide to reproducibility.
to read
This repo houses test code from the *FANTASTIC* SISMID 2024 course Hitchhikers Guide to reproducibility.
Save the file (and knit it, if your README is generated from an
R Markdown source), then commit the changes:
# stage the updated README
git add README.md
git commit -m "Update README with thoughts on course."
Now checkout the main branch. When you open
README.md you should see that the text has reverted to
This repo houses test code from the SISMID 2024 course Hitchhikers Guide to reproducibility.
Change it to
This repo houses test code from the *absolutely miserable* SISMID 2024 course Hitchhikers Guide to reproducibility.
Save the file (and knit it, if your README is generated from an
R Markdown source), then commit the changes:
# stage the updated README
git add README.md
git commit -m "Update README with new thoughts on course."
Exercise: Attempt to merge readme into
main. Resolve the merge conflicts by selecting your true
feelings for this course. Commit the result.
There will be conflicts in README.md. When you open the file you
should see something like this:
<<<<<<< HEAD
This repo houses test code from the *absolutely miserable* SISMID 2024 course Hitchhikers Guide to reproducibility.
=======
This repo houses test code from the *FANTASTIC* SISMID 2024 course Hitchhikers Guide to reproducibility.
>>>>>>> readme
Remember that we are on the main branch so
HEAD is currently pointing to main. Thus, the
text that appears first is that from the main branch; the
text on the bottom is from readme.
Express your true feelings by removing the text that you do not want
to keep (and the extra symbols). Save the file and commit the
result:
git add README.md
git commit -m "Resolve my conflict(ed feelings)"
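To practice the mechanics without touching your real README, the conflict can be reproduced in a scratch repository. The sketch below uses shortened placeholder text and identity; git diff --name-only --diff-filter=U is one way to list the files that are still in conflict.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, scratch repo only
git config user.email "demo@example.com"
echo 'A course.' > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main

git checkout -q -b readme
echo 'A FANTASTIC course.' > README.md
git commit -q -a -m "Praise the course"

git checkout -q main
echo 'A miserable course.' > README.md
git commit -q -a -m "Complain about the course"

# Both branches changed the same line, so the merge stops with an error status.
git merge readme || echo "merge stopped: conflicts to resolve"
conflicted="$(git diff --name-only --diff-filter=U)"
echo "$conflicted"                        # lists README.md as unmerged
cat README.md                             # shows the conflict markers

echo 'A FANTASTIC course.' > README.md    # keep one side, markers removed
git add README.md
git commit -q -m "Resolve conflict"
```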
Pull requests and upstream tracking
This is a longer exercise involving a partner. Designate one
partner as User A and the other as User B.
First, User B will:
fork User A’s existing repository
clone the repository to create a local repository
make changes to the local repository
push changes to GitHub
submit a pull request
Next, User A will:
add User B’s repository as a remote
fetch User B’s repository
merge User B’s changes to their local repository
push their changes to GitHub
Then, we will use upstream tracking by User B of User
A’s repository. User A will:
make changes to their local repository
push those changes to GitHub
Finally, User B will:
add User A’s repository as an upstream remote
fetch User A’s repository
merge User A’s changes into their local repository
push their changes to GitHub
Specific instructions are given in the subheadings below.
User A: Confirm GitHub repository is accessible
User A should already have a GitHub repository associated
with the exercises above. Share your GitHub user name and your GitHub
repository name with User B.
User B: Fork and clone the repository
User B should now fork and clone User A’s
repository on GitHub using the following steps.
To do this, User B should navigate to
https://github.com/<user_a_name>/<user_a_repo>
and click “Fork” to create a fork of User A’s repository.
replace <user_a_name> with User A’s
GitHub user name
replace <user_a_repo> with User A’s
GitHub repository name
Recall that this creates a copy of User A’s GitHub
repository on User B’s GitHub. This repository is now viewable
at
https://github.com/<user_b_name>/<user_b_repo>.
User B is now going to clone their copy
(not User A’s copy!) of the GitHub repository.
To do this, User B should use cd in their
terminal to navigate to a directory where they wish to download their
fork of the repository.
User B should execute
# make sure this is User B's repo
# and NOT User A's!!!
git clone git@github.com:<user_b_name>/<user_b_repo>
this will clone User B’s fork of the repository
be sure to use the
git@github.com:<user_b_name>/<user_b_repo>
syntax and not the https://github.com/<user_b_name>/<user_b_repo>
syntax.
You can confirm what web address was used to add the remote by
executing git remote -v.
If the output of git remote -v shows that you
accidentally used https:// syntax in your
git clone command then User B should remove the
origin remote using git remote remove origin
and then re-add the remote using “ssh”-style syntax:
git remote add origin git@github.com:<user_b_name>/<user_b_repo>.
User B should confirm that a folder called
<user_b_repo> was added to the current working
directory of their terminal.
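The remote repair described above (for the case where git remote -v shows an https:// address) can be rehearsed safely in a scratch repository, since adding and removing remotes does not contact GitHub. The user and repository names below are hypothetical placeholders.

```shell
# Hypothetical placeholders for User B's account and fork.
user_b_name="user-b"
user_b_repo="forked-repo"

demo="$(mktemp -d)"
cd "$demo"
git init -q
git remote add origin "https://github.com/${user_b_name}/${user_b_repo}"   # the mistake

git remote -v                     # shows the https:// address
git remote remove origin
git remote add origin "git@github.com:${user_b_name}/${user_b_repo}"
git remote -v                     # now shows the ssh-style address
```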
User B: Update the repository and submit a pull request
User B will now make a new branch and make updates to
User A’s repo on that branch. User B should complete
the following steps:
Create and checkout a new branch called feature by
executing git checkout -b feature
recall that this simultaneously creates and checks out a new branch
called feature
confirm that User B has switched to the new branch by
executing git branch. You should see a star next to
feature
Modify something about the analysis/final_report.Rmd
script. E.g., modify section heading names or the title
field in the yaml header.
After completing changes, appropriately use git add
and git commit to make a new commit along the
feature branch.
Push the feature branch to GitHub.
git push origin feature
Submit a pull request to User A’s repository.
the pull request should request that User B’s
feature branch be merged into User A’s
main branch.
User A: test out pull request code
User A will now fetch the code submitted by
User B, test it out, and eventually merge it into their
main branch, thereby closing the pull request. To do this
User A should