date | module | topic |
---|---|---|
22 February 2024 | 1 | Welcome & get ready for the course |
29 February 2024 | 2 | Data science lifecycle & Exploratory data analysis using visualization |
07 March 2024 | 3 | Data transformation with dplyr |
14 March 2024 | 4 | Data import & Data organization in spreadsheets |
21 March 2024 | 5 | Conditions & Dates & Tables |
28 March 2024 | 6 | Data types & Vectors & Pivoting |
04 April 2024 | Easter Break | |
11 April 2024 | 7 | Joining tables & Creating and publishing scholarly articles with Quarto and GitHub pages |
18 April 2024 | 8 | Waste Research |
25 April 2024 | 9 | Research Design |
02 May 2024 | 10 | Questionnaires |
09 May 2024 | Auffahrt Break | |
16 May 2024 | 11 | Pre-test and logistics |
23 May 2024 | Data collection | |
30 May 2024 | 12 | Data analysis & report writing |
06 June 2024 | Project Submission Deadline | |
13 June 2024 | Exam | |
Course Overview
Thank you for your interest in this course. Your course instructors, Prof. Elizabeth Tilley and Lars Schöbitz, are looking forward to meeting you.
We will meet on Thursdays in CAB G 59 from 12:15 to 15:00. There is no Moodle page for this course. Everything you need will be published through this website.
Course Information
This course provides learners with skills in using the collection of R tidyverse packages as a tool for data analysis, reproducible research and communication. Lectures are delivered as participatory live coding, in which students learn to write code through code-along exercises. We will use publicly available data related to waste management, air quality, and sanitation. Students will also learn how to find help on their own, so that they can build on these skills and apply them to their own data analysis projects.
Topics include:
- The data science life-cycle
- Data organization in spreadsheets
- Exploratory data analysis using visualization
- Concept of tidy data and data tidying
- Data transformation and descriptive statistics
- Data communication using the Quarto open-source scientific and technical publishing system
- Theory and foundations of field-based research
- Research Design and implications for analysis
Learning Goals
- Be able to use a common set of data science tools (R, RStudio IDE, Git, GitHub, tidyverse, Quarto) to illustrate and communicate the results of data analysis projects.
- Learn to use the Quarto file format and the RStudio IDE visual editing mode to produce scholarly documents with citations, footnotes, cross-references, figures, and tables.
- Be able to design a questionnaire to collect information that can be analysed to answer a waste-related research question that is relevant for Zurich.
- Understand the main challenges associated with managing different types of waste, and how they differ between Europe and Africa.
Textbooks and Materials
We will rely entirely on open-source and open-access material for this course. “R for Data Science” by Hadley Wickham serves as complementary reading and learning material. Additional readings will consist of blog posts, journal articles, and reports. All required readings and class material will be provided through this website.
Course Calendar
Weekly Structure
Assignment submission: Wednesdays, by 23:59 at the latest.
day | activity |
---|---|
Monday | |
Tuesday | Student hours from 14:00 to 16:00 (CET) |
Wednesday | Assignment submission, by 23:59 (CET) at the latest |
Thursday | Lecture from 12:15 to 15:00 (CET) |
Friday | |
Performance assessment
The performance assessment and resulting grading scheme are shown below.
- End-of-semester exam: 50 points
- Compulsory continuous performance assessment: 50 points, of which
  - Homework assignments: 20 points (n = 10)
  - Capstone project: 30 points, of which
    - Technical parts of submitted report: 20 points (we will communicate what we expect)
    - Intellectual framing of results: 10 points (we will communicate what we expect)
Table 1 shows the conversion from points to grades. Grades follow the ETH Zurich grading system. Points are rounded to the nearest grade, for example:
- 97 points = 5.75
- 93 points = 5.75
- 92 points = 5.50
- 45 points = 4.00
- 44 points = 3.50
grade | points |
---|---|
6.00 | 100 |
5.75 | 95 |
5.50 | 90 |
5.25 | 85 |
5.00 | 80 |
4.75 | 75 |
4.50 | 70 |
4.25 | 60 |
4.00 | 50 |
3.50 | 40 |
3.00 | 30 |
2.50 | 20 |
2.00 | 10 |
1.00 | 0 |
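The rounding rule can be sketched in a few lines of code. This is only an illustration, written in Python for brevity, and not an official grading tool. The names `GRADE_TABLE` and `points_to_grade` are hypothetical. The sketch assumes that a tie between two thresholds goes to the higher grade, which is inferred from the example of 45 points = 4.00 and is not stated explicitly.

```python
# Thresholds copied from Table 1 as (points, grade) pairs.
GRADE_TABLE = [
    (100, 6.00), (95, 5.75), (90, 5.50), (85, 5.25), (80, 5.00),
    (75, 4.75), (70, 4.50), (60, 4.25), (50, 4.00), (40, 3.50),
    (30, 3.00), (20, 2.50), (10, 2.00), (0, 1.00),
]


def points_to_grade(points: float) -> float:
    """Return the grade whose point threshold is nearest to `points`."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # Sort key: distance to the threshold first; on a tie, the higher
    # threshold wins (assumption inferred from 45 points = 4.00).
    _, grade = min(
        GRADE_TABLE,
        key=lambda row: (abs(points - row[0]), -row[0]),
    )
    return grade


if __name__ == "__main__":
    # Reproduce the worked examples listed above the table.
    for pts in (97, 93, 92, 45, 44):
        print(pts, "points =", f"{points_to_grade(pts):.2f}")
```

Under that tie-breaking assumption, the sketch reproduces the five worked examples given above the table.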
End-of-semester exam
There is a 2-hour final written exam, which assesses the technical skills taught during the course. It contains programming exercises using the R programming language. Success in the exam depends on the effort put into the compulsory continuous performance assessment. The exam is worth 50 points.
Compulsory continuous performance assessment
Homework assignments: Each week will have at least one homework assignment. Homework assignments are delivered as Quarto documents with instructions and sample code. Students are required to submit their work through GitHub. Ten assignments are graded pass/fail, with 2 points for each passed assignment and 20 points in total.
Capstone Project: A final capstone project provides students with an opportunity to apply their skills and techniques to real-world data sets. Each student will collect their own data for this project, using either a survey-based tool (Google Forms) or an observational study (Google Sheets).
Detailed instructions for the completion of the capstone project will be provided. The project report will be delivered as a Quarto document, and students are asked to submit their work through GitHub. The capstone project is worth 30 points.
Readings: Every week, additional readings will be provided that support students in learning the underlying concepts taught during class. Readings are not graded.
Policies
Class attendance
We hope that you can attend class in person. If you cannot attend a class, we expect you to contact us and let us know. The class will be live streamed and recorded so that you can watch it from home; however, we will not accommodate two-way communication.
If you miss a class, we expect you to work through the material of the class using the recording of the live stream.
AI Policy
We expect you to use AI tools in this class (e.g. perplexity.ai, ChatGPT, etc.). Some assignments may require it. Learning to use AI is an emerging skill that we want you to embrace.
Be aware of the limits of these tools:
- Minimum-effort prompts will yield low-quality results. Refine your prompts to get good outcomes. This will take work.
- Don’t trust anything the tool says. Unless you know the answer or know how to check it, assume it is wrong. You will be responsible for any errors or omissions provided by the tool. It works best for topics you understand.
- AI is a tool that you need to acknowledge using. Include links to your prompts and explain how you used AI to complete an assignment. Failure to do so violates academic integrity policies.
- Be thoughtful about when this tool is useful. Don’t use it if it isn’t appropriate for the case or circumstance.
Code of Conduct
This course follows the ETH Respect Code of Conduct. If you have not yet read this Code of Conduct, please familiarize yourself with it. If you experience inappropriate behaviour from us or any of your classmates, you will find contact and advice services here: https://respekt.ethz.ch/en/contact-and-advice-services.html