Research compendium
Last updated on 2024-12-03
Estimated time: 30 minutes
Overview
Questions
- How do you create a research compendium for an R project?
- How do I make it easier for users and collaborators to participate in my project?
- What features are related to sustainable software?
Objectives
- Adapt a research compendium template with files and folders organized logically with rcompendium.
- Add community files with usethis so that users can seek support and contribute.
- Identify your project features related to sustainable software.
What is a research compendium?
A research compendium collects all digital parts of a research project, including data, code, and texts (protocols, reports, questionnaires, metadata). We create this collection in such a way that reproducing all results is straightforward (The Turing Way Community, 2022).
Using a template ensures that you have all the required files from the beginning of your project.
We understand that creativity can be “messy” sometimes. You will be able to handle it in the present, but your collaborators (and the future you) may have problems understanding it. Reproducibility is as much about the humans that interact with the code as the machines that need to run it (Campitelli and Corrales, 2022).
Let’s code
Create an RStudio Project
Go to Project, which is in the top right corner of RStudio, and select New Project... Then follow these steps:
- Select New directory,
- Select New project, and
- Check the [x] Create a git repository option
Stop! Find a name!
Don’t use projectname as your R project name! Create a new one, thinking about your current research project.
Your projectname must follow some rules for everything to work. It must:
- contain only ASCII letters, numbers, and dots “.” (it cannot have a hyphen “-”)
- have at least two characters
- start with a letter (not a number)
- not end with a dot “.”
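The naming rules above can be checked before you create the project. The following is a minimal sketch in base R. The function name is_valid_project_name() is hypothetical (it is not part of rcompendium or RStudio), and the regular expression only encodes the four rules listed in this lesson.

```r
# Hypothetical helper (not part of rcompendium): checks the four naming
# rules listed above with a single regular expression.
is_valid_project_name <- function(name) {
  # ^[A-Za-z]       starts with an ASCII letter
  # [A-Za-z0-9.]*   continues with ASCII letters, numbers, or dots only
  # [A-Za-z0-9]$    ends with a letter or number (so never with a dot)
  # Two mandatory positions imply a length of at least two characters.
  # perl = TRUE keeps the character ranges ASCII-only in every locale.
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name, perl = TRUE)
}

is_valid_project_name("malaria.trial2024")  # TRUE
is_valid_project_name("my-project")         # FALSE: contains a hyphen
is_valid_project_name("2024study")          # FALSE: starts with a number
is_valid_project_name("study.")             # FALSE: ends with a dot
```

The example names are made up for illustration. Replace them with the name you have in mind for your own project.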
Create a research compendium
To create a new research compendium run:
R
rcompendium::new_compendium()
This function will create new files and folders as a template. You can sort the contents of the folder by size to identify its components.
We will explore the content of each new element during the workshop.
This function will also create the GitHub repository for your project. This step will open a new tab in your browser.
Add community files
We are going to add more files to the default template. For this, we are going to use a package with helper functions called {usethis}.
To add community files, run:
R
usethis::use_tidy_github()
This function is a convenience wrapper that adds four template files in a new folder called .github/:
- SUPPORT.md with resources to seek support.
- CONTRIBUTING.md with contributing guidelines.
- issue_template.md with steps on how to report issues.
- CODE_OF_CONDUCT.md with guidelines to foster an environment of inclusiveness and to explicitly discourage inappropriate behaviour.
These four files follow the tidyverse standards. You can edit them in Markdown to fit the content and purposes of your specific project.
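To confirm that the community files were created, you can look for them from the R console. The following is a minimal sketch in base R. The function find_community_files() is hypothetical (it is not part of usethis), and it assumes the four file names listed in this lesson. It searches .github/ recursively in case a file sits in a subfolder.

```r
# Hypothetical helper (not part of usethis): reports which of the four
# community files named in this lesson exist under .github/.
find_community_files <- function(project_dir = ".") {
  expected <- c("SUPPORT.md", "CONTRIBUTING.md",
                "issue_template.md", "CODE_OF_CONDUCT.md")
  github_dir <- file.path(project_dir, ".github")
  # Search recursively, since a file may sit in a subfolder of .github/.
  # list.files() returns an empty vector if the folder does not exist.
  found <- basename(list.files(github_dir, recursive = TRUE, all.files = TRUE))
  # Named logical vector: TRUE where the file was found.
  stats::setNames(expected %in% found, expected)
}

# Run from the root of your project
find_community_files()
```

Any file reported as FALSE was not found, which may mean that usethis::use_tidy_github() was run from a different working directory.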
Prerequisite
Now commit and push your changes using git.
Git reminders
We use git commit to capture a snapshot of the project’s currently staged changes. We use git add to ‘stage’ changes that we will store in a commit. We use git push to upload local repository content to a remote repository.
- You can use Git with RStudio to perform these tasks.
- From the “Review changes” pane:
  - Go to the “History” tab in the top left.
  - Notice that each commit has an ID under the SHA/hash column.
- Go to GitHub:
  - Identify where this ID, called SHA/hash, is located.
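The same stage, commit, and push sequence can also be run from the R console. The following is a rough sketch, assuming the git command-line tool is installed and on your PATH. The function commit_changes() is hypothetical, and it calls git through base R's system2(). It returns the SHA/hash of the new commit, which is the same ID shown in the History tab and on GitHub.

```r
# Hypothetical helper: stage all changes, commit them, optionally push,
# and return the SHA/hash of the new commit.
# Assumes the `git` command-line tool is installed and on the PATH.
commit_changes <- function(repo, message, push = FALSE) {
  run_git <- function(args, ...) {
    # -C runs git as if it had been started in the `repo` directory
    system2("git", c("-C", shQuote(repo), args), ...)
  }
  check <- function(status) {
    if (!identical(as.integer(status), 0L)) stop("git command failed")
  }
  check(run_git(c("add", "--all")))                               # stage changes
  check(run_git(c("commit", "--quiet", "-m", shQuote(message))))  # snapshot staged changes
  if (push) check(run_git("push"))                                # upload to the remote
  run_git(c("rev-parse", "HEAD"), stdout = TRUE)                  # SHA/hash of the commit
}

# Example call (commented out so nothing is committed by accident):
# sha <- commit_changes(".", "Add community files", push = TRUE)
```

Pushing only works when the repository already has a remote configured, which is why push defaults to FALSE in this sketch.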
Where are community files visible?
GitHub automatically recognizes these files and adds them as hyperlinks in specific places.
- Go to the About section in the upper right corner of your repository to read the Code of conduct.
- Go to the Issues tab on the navigation bar at the top of your repository on GitHub. You will find a link to the issue templates you added there.
- Press the "Get started" button on the right to write on top of the template. In the lower right corner, the Contributing and Support files are accessible under the Helpful resources subtitle.
These community files are also known as community health files.
Discussion
Do you find the links to the Community files visible enough on GitHub?
Have you ever found them in a different place in the past?
Checklist
Sustainable software features
Software is sustainable when it is easier to maintain and extend than to replace. This ease depends on:
- the quality of the software,
- the skills of the potential maintainers, and
- how much the user community is willing to invest to keep the software up to date.
Features like a Research compendium template and Version control increase the quality of the software.
- A Research compendium follows Project organization good practices. This gives a logical and familiar structure to the project.
- Version control follows the Keep track of changes good practice. It records the project’s history and how one or multiple contributors wrote code and made decisions.
Additionally, Community files follow Collaboration good practices. They address gaps in the community of users by making it easier to participate and by explaining how to interact with maintainers.
Testimonial
Is a data analysis also considered a piece of software?
Nick Huber, from the blog Towards Data Science, concludes that data analysis best practices/tools are starting to strongly resemble practices/tools from software engineering.
The repository of this lesson also came from a template that looks like a derivative of a research compendium, which in turn looks like a piece of software, such as an R package.
Key Points
- Use rcompendium templates to reuse all the files and folders a research project needs.
- Use usethis to add complementary community files to a research project.
- Version control, Research compendium, and Community files are features related to Sustainable software.