Set up your capstone project repository
Step 1: Create a new repository on GitHub
Open the GitHub Organisation for the course https://github.com/rbtl-fs24
Create a new repository and name it project-USERNAME. Replace USERNAME with your GitHub username. Avoid using spaces.
Step 2: Set up the folder structure
Clone the project-USERNAME repository to Posit Cloud.
Open your project-USERNAME repository on Posit Cloud.
Create an R folder and, inside it, create a new file and save it as 01-data_download.R.
Create another file inside R and save it as 02-data_cleaning.R.
Create a data folder and, inside it, create three sub-folders named raw, processed, and final.
Create a docs folder inside the root folder.
Create a Quarto document inside docs and name it index.qmd.
Add all files to a commit, commit the changes with a meaningful commit message, and push the changes to GitHub.
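The folder structure above can also be created from the R console instead of the Files tab. The following is a minimal sketch in base R, assuming the working directory is the root of your project-USERNAME repository; it is one possible way to do it, not a required part of the assignment.

```r
# Sketch: build the Step 2 folder structure with base R.
# Assumption: the working directory is the root of project-USERNAME.

# Folders: R/, data/raw, data/processed, data/final, docs/
folders <- c("R", file.path("data", c("raw", "processed", "final")), "docs")
for (folder in folders) {
  # recursive = TRUE also creates the parent data/ folder when needed
  dir.create(folder, recursive = TRUE, showWarnings = FALSE)
}

# Empty files: the two R scripts and the Quarto report
files <- c(
  file.path("R", "01-data_download.R"),
  file.path("R", "02-data_cleaning.R"),
  file.path("docs", "index.qmd")
)

# Only create files that do not exist yet, so existing work is never overwritten
new_files <- files[!file.exists(files)]
file.create(new_files)
```

Note that Git does not track empty folders, so data/raw, data/processed, and data/final will only show up on GitHub once each of them contains at least one file.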
You will use the 01-data_download.R file to download your data from the Google Sheet that you have established for your capstone project. Use the googlesheets4 package to access the data. Do not clean the data, but save it in the data/raw folder using the write_csv() function. Give the file a meaningful name that describes your project.
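As an illustration, the download step could look like the sketch below. The function name download_raw_data(), the sheet URL, and the file name are hypothetical placeholders; only read_sheet() from googlesheets4 and write_csv() from readr are taken from the instructions above.

```r
# R/01-data_download.R (sketch)
library(googlesheets4)
library(readr)

# Hypothetical helper: reads a Google Sheet and saves it unchanged as a CSV file.
download_raw_data <- function(sheet_url, path) {
  # read_sheet() may ask you to authorise access to your Google account
  raw_data <- read_sheet(sheet_url)

  # No cleaning here: the raw data is written exactly as it was read
  write_csv(raw_data, path)

  invisible(raw_data)
}

# Example call with placeholder values; replace both with your own:
# download_raw_data(
#   sheet_url = "<URL of your Google Sheet>",
#   path = "data/raw/<meaningful-project-name>-raw.csv"
# )
```

Wrapping the two calls in a function is a design choice of this sketch; calling read_sheet() and write_csv() directly at the top level of the script works just as well.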
You will use the 02-data_cleaning.R file to process your data into a state that is ready for analysis. Do not access the data from Google Drive, but rather use the read_csv() function to access your data from the data/raw folder. Save the processed, analysis-ready data in the data/processed folder using the write_csv() function. Again, give the file a meaningful name that describes your project.
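A minimal sketch of the cleaning script is shown below. The function name clean_raw_data() and both cleaning steps are placeholders chosen for illustration; the steps your own data needs will differ. The sketch assumes the readr and dplyr packages are installed.

```r
# R/02-data_cleaning.R (sketch)
library(readr)
library(dplyr)

# Hypothetical helper: reads the raw CSV, applies cleaning steps,
# and writes the analysis-ready data to a new CSV file.
clean_raw_data <- function(raw_path, processed_path) {
  # Read from data/raw, not from Google Drive
  raw_data <- read_csv(raw_path, show_col_types = FALSE)

  processed_data <- raw_data |>
    rename_with(tolower) |>  # placeholder step: lower-case column names
    distinct()               # placeholder step: drop exact duplicate rows

  write_csv(processed_data, processed_path)

  invisible(processed_data)
}

# Example call with placeholder file names; replace both with your own:
# clean_raw_data(
#   raw_path = "data/raw/<meaningful-project-name>-raw.csv",
#   processed_path = "data/processed/<meaningful-project-name>-processed.csv"
# )
```

Keeping the raw file untouched and writing a separate processed file means the cleaning can be rerun at any time without downloading the data again.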
You will use the index.qmd file to write your final report. This will be a Quarto document that will contain your analysis and visualizations. You will need to access the processed data from the data/processed folder using the read_csv() function.
You will save the data underlying each displayed summary table and visualization in the data/final folder. This requires that each visualization and table is generated from a single object that needs no further transformation inside the code that creates the visualization or table. Use the write_csv() function to save that object in the data/final folder. Name each file after the label you have chosen for the code chunk. See example below.
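The following sketch shows one way this could look for a single figure. It is an illustration under stated assumptions, not the course's official example: the label fig-mean-by-group, the columns group and value, and the small stand-in dataset are all hypothetical. In index.qmd the code would sit inside an R code chunk, with the first line setting the chunk label. The relative path assumes the code runs from the docs folder, where index.qmd is stored.

```r
#| label: fig-mean-by-group

library(readr)
library(dplyr)
library(ggplot2)

# Stand-in for the processed data. In your report, replace this with
# read_csv() on your own file in data/processed.
processed_data <- tibble(
  group = c("a", "a", "b"),
  value = c(1, 3, 5)
)

# Final object: everything the figure needs, with no further
# transformation inside the plotting code below.
fig_mean_by_group <- processed_data |>
  group_by(group) |>
  summarise(mean_value = mean(value))

# Assumption: the code runs from docs/, so data/final is one level up.
final_dir <- file.path("..", "data", "final")
dir.create(final_dir, recursive = TRUE, showWarnings = FALSE)  # no-op if it exists

# Save the data underlying the figure, named after the chunk label
write_csv(fig_mean_by_group, file.path(final_dir, "fig-mean-by-group.csv"))

# The figure is drawn directly from the saved object
ggplot(fig_mean_by_group, aes(x = group, y = mean_value)) +
  geom_col()
```

Because the plot is drawn from exactly the object that was written to disk, the CSV file in data/final contains every value shown in the figure. The same pattern applies to summary tables.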
Step 3: Create a README.md
Navigate to the Files tab in the bottom right window of RStudio.
Open the data/processed folder.
Click on the “Blank File” button to create a new file.
Select the option “Text file”.
Enter the name “README.md” in the field and click OK.
Go to: https://raw.githubusercontent.com/rbtl-dev/metadata-readme-template/main/README.md
Copy the content that’s displayed in your browser and paste it into the README.md file you have just created.
Add all files to a commit, commit the changes with a meaningful commit message, and push the changes to GitHub.
Step 4: Create a data dictionary
Open the following link to a Google Drive folder: https://drive.google.com/drive/u/0/folders/1LW5j1n4Xv3Tyb4JMsUHUDmIYbJ-int68
Open the capstone-project folder.
Create a new folder and use your GitHub username as the name of the folder.
Enter your folder.
Create a new spreadsheet and name it dictionary.
Add two column names to the spreadsheet: variable_name and description.
This data dictionary will be used for your processed, analysis-ready data in data/processed. You will need to describe each variable in your dataset. The variable_name column should contain the name of the variable in your dataset, and the description column should contain a description of the variable. Use one sentence per variable and do not use commas (,) in the description.
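If you want to check your dictionary against your processed data, a small helper like the one sketched below could be used. The function check_dictionary() and the stand-in data are hypothetical and written in base R; the check is optional and not part of the assignment.

```r
# Hypothetical helper: lists problems found when comparing a data
# dictionary with the dataset it describes.
check_dictionary <- function(data, dictionary) {
  issues <- character(0)

  # Every variable in the dataset needs a row in the dictionary
  missing_vars <- setdiff(names(data), dictionary$variable_name)
  if (length(missing_vars) > 0) {
    issues <- c(
      issues,
      paste("No description for:", paste(missing_vars, collapse = " "))
    )
  }

  # Descriptions should not contain commas
  has_comma <- grepl(",", dictionary$description, fixed = TRUE)
  if (any(has_comma)) {
    issues <- c(
      issues,
      paste(
        "Comma in description of:",
        paste(dictionary$variable_name[has_comma], collapse = " ")
      )
    )
  }

  issues
}

# Usage with small stand-in data (hypothetical variables)
example_data <- data.frame(id = 1:2, weight_kg = c(0.5, 1.2))
example_dictionary <- data.frame(
  variable_name = c("id", "weight_kg"),
  description = c(
    "Unique identifier of the observation.",
    "Weight of the sample in kilograms."
  )
)

# Returns an empty character vector when no problems are found
check_dictionary(example_data, example_dictionary)
```

With your own files, both arguments could be read with read_csv() from the data/processed folder once the dictionary has been added there.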
Once you have completed your dictionary, you will need to download it as a .csv file and add it to the data/processed folder in your project-USERNAME repository. You can do that by:
- Using the upload button in the Files tab in RStudio on Posit Cloud.
- Opening the data/processed folder in your project-USERNAME repository on GitHub and uploading the file there. If you do this, then you need to pull the changes to your local repository on Posit Cloud.
Step 5: Submit homework assignment
- Add all files to the commit, commit the changes with a meaningful commit message, and push the changes to GitHub.
- Open an issue on GitHub in your project-USERNAME repository and tag the course instructor @larnsce.