This is an instructor’s guide to conducting replication projects in courses. Replication-based courses benefit the field in ways that some of the authors of this book have discussed previously (e.g., Hawkins et al., 2018; Frank & Saxe, 2012), and they can also benefit the students who take them. In this guide, we describe these benefits, explore ways to modify a course depending on student level and resources, and provide guidelines and examples to help you set up the logistics of your course.
Over the years, we have observed many ways in which our replication-based courses benefited students above and beyond a more traditional lecture and problem set-based course. Some of these benefits include:
A project-based course may look very different depending on student level (undergraduate vs. graduate/post-doc level) and the resources available at your institution, namely TA support and course funding (for data collection). For most of this guide, we will assume that you have a similar setup to ours (i.e., that you teach at the graduate/post-doc level and have course funding and TAs to support the course), but we have also spent some time considering ways to adjust the course to fit different student levels and availability of resources (see “Scenarios for different course layouts”).
If it is your first time teaching this course, you may want to decide ahead of time whether your course will mainly focus on content, or whether you will cover both content and relevant practical skills. For instance, if the course is for undergraduate students, you may decide to focus mainly on content, whereas if the course is for graduate students, they may find it more useful if the course covers both content and practical skills they can use in their research.
Another important consideration is how long your course will be. Depending on whether your university operates on quarters or semesters, the pace of the course will differ. For Psych 251, since we are on the quarter system, we use the 10-week schedule shown below. However, we have also adapted this schedule to a 16-week course given that it better represents a majority of other institutions’ academic calendars. At the end of this chapter, we give a set of sample class schedules.
Depending on your course format and teaching philosophy, you may have preferred grading criteria. As a point of reference, in Psych 251, we wanted the grade to reflect both the assignments (problem sets and project components) and actual course attendance and participation. In addition, because the replication project is a central part of the course, we weighted the project components slightly more than the problem sets:
40%: Problem sets (four, at 10% each)
50%: Final project components, including presentations, data collection, analysis, and writeup
10%: Attendance and participation in class
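As a small illustration of how these weights combine, here is a minimal Python sketch. The weights come from the breakdown above; the component names, the 0-100 scoring scale, and the example scores are our own assumptions for illustration, not part of the Psych 251 gradebook.

```python
# Weights from the grading breakdown above.
# Component names and the 0-100 scale are illustrative assumptions.
WEIGHTS = {
    "problem_sets": 0.40,   # four problem sets at 10% each
    "final_project": 0.50,  # presentations, data collection, analysis, writeup
    "participation": 0.10,  # attendance and participation in class
}


def course_grade(scores):
    """Return the weighted course grade from component scores (0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores for one hypothetical student.
example_scores = {"problem_sets": 90.0, "final_project": 80.0, "participation": 100.0}
print(round(course_grade(example_scores), 2))  # 0.4*90 + 0.5*80 + 0.1*100 = 86.0
```

If you use different weights, changing the `WEIGHTS` dictionary is enough, as long as the values still sum to 1.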
For our course, we usually receive around US$1,000 in course funding from the Psychology Department. In addition, when students from other departments are enrolled, we have been lucky to receive additional funding from those departments to further support the course. Still, making sure that the course funds cover all students’ projects is one of the most challenging parts of the course. Assuming you have a budget to work with, here are some lessons we’ve learned along the way regarding budgeting (and if you don’t have any funding, please refer to the section titled “Course Funding (or lack thereof)” under “Scenarios for different course layouts”):
Once all projects have been approved as within-budget, we encourage you to create a shared spreadsheet containing each student’s name, so that they can fill in the details of their replication project. Ultimately, this will help ensure that students are paying fair wages to their participants and keep track of how the course funds are being divided up.
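The same bookkeeping can be sketched in code. The following is a minimal Python example, assuming that each project is described by a sample size, a study length in minutes, an hourly wage, and a platform fee rate. All field names and numbers below are hypothetical; check the actual fee structure of whichever recruitment platform you use.

```python
def project_cost(n_participants, minutes_per_participant, hourly_wage, platform_fee_rate):
    """Estimate one project's data-collection cost.

    All parameters are illustrative assumptions: platform_fee_rate is the
    fraction the recruitment platform adds on top of participant payments.
    """
    pay_per_participant = minutes_per_participant * hourly_wage / 60
    return n_participants * pay_per_participant * (1 + platform_fee_rate)


def remaining_budget(course_budget, projects):
    """Return the budget left after paying for every listed project."""
    return course_budget - sum(project_cost(**p) for p in projects)


# Two made-up projects, as they might appear as rows in the shared spreadsheet.
projects = [
    {"n_participants": 100, "minutes_per_participant": 5,
     "hourly_wage": 12.0, "platform_fee_rate": 0.2},
    {"n_participants": 60, "minutes_per_participant": 10,
     "hourly_wage": 12.0, "platform_fee_rate": 0.2},
]

for p in projects:
    print(round(project_cost(**p), 2))
print(round(remaining_budget(1000.0, projects), 2))
```

Computing the per-participant payment from the hourly wage and the expected study length makes it easy to see whether a proposed payment amounts to a fair wage.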
Now that we have covered the standard format of the course, we turn to ways in which this format can be tweaked to fit different needs and resources. We have organized this section into two main categories: student level, and course resources (such as TAs and course funding).
While Psych 251 at Stanford is geared towards graduate students (and is currently a required class for entering first-year graduate students in the Psychology Department), we also accept advanced undergraduate students as well as graduate students from other departments (e.g., Education, Human-Computer Interaction, Philosophy, Computer Science). On the first day of our course, we tell students that they should be comfortable with two of the three following topics:
Some knowledge of psychological experimentation & subject matter
Statistical programming: things like functions and variables
Basic statistics like ANOVA and t-test
If students are only comfortable with one of the three topics above, we warn them ahead of time that the course may demand more time from them than it does from the average student.
Now, if you are planning on tailoring this course to undergraduate students, chances are that they have had less exposure to these topics overall, so there are multiple ways to calibrate the course accordingly:
Prerequisites: Require students to have completed courses that cover at least two of the three topics mentioned above (i.e., any two of the following: a psychology class, a class that covers statistical programming, and a class that covers basic statistics).
Pace: Unlike Psych 251, where the entire course lasts only 10 weeks, a class for undergraduates may benefit from a slower pace, allowing more time to cover the foundational principles before diving into the project. For instance, the course could be held over multiple academic semesters/quarters, with the project goal of Course #1 being choosing and planning the replication study, and the project goal of Course #2 being the execution and interpretation of the replication.
Pair-Group-Based Projects: In our course, each student is required to conduct their own replication project. However, this structure may be overwhelming for undergraduate students, who may have less confidence taking on an entire replication project by themselves. One option that may alleviate this pressure is to have students conduct these projects in pairs or small teams, so that they can collectively draw on each other’s strengths. When assigning these pairs or teams, it may be especially helpful to try to ensure a relatively even balance of students who are confident in each of the three areas outlined above (psychology, statistical programming, basic statistics).
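If you would like to automate the team balancing suggested above, one simple option is a greedy assignment. The sketch below is only one possible heuristic, not a procedure we have used in Psych 251. It assumes you have collected, for example via a first-day survey, the areas each student feels confident in. The student names and area labels are hypothetical.

```python
from math import ceil


def form_teams(confidence, team_size):
    """Greedily split students into teams of roughly equal size.

    confidence maps each student name to the set of areas they feel
    confident in. Students with the broadest skills are placed first; each
    student joins the currently smallest team, and ties go to the team
    that gains the most new areas from that student.
    """
    n_teams = ceil(len(confidence) / team_size)
    teams = [[] for _ in range(n_teams)]
    covered = [set() for _ in range(n_teams)]
    ordered = sorted(confidence, key=lambda name: (-len(confidence[name]), name))
    for name in ordered:
        skills = confidence[name]
        best = min(
            range(n_teams),
            key=lambda t: (len(teams[t]), -len(skills - covered[t]), t),
        )
        teams[best].append(name)
        covered[best] |= skills
    return teams


# Hypothetical survey responses.
confidence = {
    "Ana": {"psychology", "statistics"},
    "Ben": {"programming"},
    "Chi": {"psychology"},
    "Dev": {"programming", "statistics"},
    "Eli": {"statistics"},
    "Fay": {"psychology", "programming"},
}
for team in form_teams(confidence, team_size=3):
    print(team)
```

A greedy pass like this does not guarantee that every team covers all three areas, so it is worth checking the result by hand before announcing teams.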
Now that we’ve offered a few suggestions to address different student levels, let’s dive into the issue of course resources.
We think there are two main ways in which your course may have different resources from our model: in terms of course assistance (i.e., teaching assistants), and in terms of course funding for student projects. We’ll explore ways to work around each of these in this section.
As a point of comparison, Psych 251 is generally allocated 2-3 teaching assistants for about 36 students, which comes out to about 12-18 students per TA. Since a project-based course requires individual attention and feedback, we would recommend against a student-TA ratio that is much higher than that. That means that if you know you will have just one TA for the class, you should think about reducing the enrollment cap accordingly. But what if you have no TAs? With some adjustments, there are still ways you can make the course work sans-TA; we outline a few ideas below:
Peer grading: As an instructor with no TAs, the area that will require the biggest lift in terms of time and attention is grading. One way to overcome this is to introduce a peer-grading system, in which students grade each other’s work. If you choose this route, two things that may encourage fair grading among your students are to 1) distribute a clear and specific rubric that reduces the amount of subjectivity in the grading process as much as possible, and 2) anonymize the assignments so that students do not know whose assignment they are grading. If possible, it may again be beneficial to assign grading pairs that consist of students who are relatively knowledgeable in different areas, so that they can provide feedback that addresses weak points in each other’s work.
Collective troubleshooting: The second most time-intensive area you will have to make up for is troubleshooting for students who run into issues implementing their projects, anything from getting GitHub and RMarkdown up and running on their devices to trouble with data collection on Mechanical Turk. One way to encourage communal support among your students is to set up a central discussion board for the course (e.g., Piazza or a course channel on Slack) where students can publicly (but anonymously, if desired) post issues they are running into. Then, you can offer extra credit to students who help troubleshoot these issues, in order to further incentivize collective troubleshooting. There will likely still be issues that cannot be addressed by the students, but this system at least frees up your time to focus your attention on those that only you can address.
Single class-wide project: Finally, if the collective grading and troubleshooting methods outlined above do not cut down on enough time, you could consider walking through a single replication project as a class. This approach does cancel out some of the benefits of a project-based course we mentioned at the start: the project will likely no longer fit each student’s specific research interests, so there may be less benefit in terms of student interest and usefulness for their program of research. The other two benefits, realism and intuition (especially if the project is discussed in the context of other replication findings), still stand. To make a single-project course work, you could have students nominate studies they would like to replicate as a class, and then have them vote on the final choice. Once the target study has been selected, every student can individually carry out all the steps of the project, including preregistering and writing up the analysis script. Then, setting up and running the data collection phase can happen during class, and once data have been collected, you can distribute them to the students so that they can run them through their analysis scripts and interpret the result. Whether you choose to have students grade each other’s work or grade their work yourself, the fact that the project is standardized should cut down on a lot of the time you would otherwise spend learning about the details of every individual project.
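For the peer-grading option above, the assignment of graders and the anonymization can be scripted. The following Python sketch shows one simple scheme that we are assuming for illustration: it shuffles the roster and has each student grade the next person in the shuffled order, so nobody grades their own work and every submission gets exactly one grader. It does not match graders by area of expertise. The function names and the `S001`-style codes are hypothetical.

```python
import random


def assign_peer_graders(students, seed=None):
    """Map each grader to the student whose work they grade.

    Assumes student names are unique. Shuffles the roster and assigns each
    student the next person in the shuffled cycle.
    """
    if len(students) < 2:
        raise ValueError("peer grading needs at least two students")
    rng = random.Random(seed)
    order = list(students)
    rng.shuffle(order)
    return {grader: order[(i + 1) % len(order)] for i, grader in enumerate(order)}


def anonymous_codes(students, seed=None):
    """Give each student a random submission code to use in place of a name."""
    rng = random.Random(seed)
    order = list(students)
    rng.shuffle(order)
    return {name: f"S{i:03d}" for i, name in enumerate(order, start=1)}


roster = ["Ana", "Ben", "Chi", "Dev", "Eli"]  # hypothetical names
assignment = assign_peer_graders(roster, seed=1)
codes = anonymous_codes(roster, seed=2)
for grader, author in assignment.items():
    print(f"{grader} grades submission {codes[author]}")
```

Using different seeds for the two functions keeps the submission codes from revealing the grading order. Keep the mapping from codes to names private until grades are final.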
In addition to availability of TAs, another way in which your course may be different from ours is in terms of course funding. If you have little or no funding for your course (even after reaching out to relevant members of your department or institution), we suggest the following adjustments:
Pair-Group-Based Projects: Similarly to suggestion #3 for addressing different student levels, one option for limited course budgets is to have students conduct the replication projects as pairs or teams to reduce the cost of data collection. This structure may have the added benefit of encouraging students to problem-solve together. Alternatively, each student in the pairs or teams could complete each step of the replication individually (e.g., writing up the report, analyzing the data, interpreting the result), which would ensure that each student takes full responsibility for every step of the project. This structure may also provide opportunities for interesting discussions at the end of the course around analytic reproducibility, especially if students in the same teams (with the same dataset) differed in the conclusions they drew about the replication outcome.
Funding from Advisors: In some cases, students come to us with target studies that require more funding than we are able to allocate, but that they feel particularly invested in (e.g., because of how relevant the study is to their line of research). Once we rule out other ways of making the study fit our budget (e.g., dropping extra control conditions, running a subset of the study), we often ask students whether their advisor would be willing to fund the study. We have found that advisors are often willing to do this, especially if the replication could serve an important role in the development of the student’s research program. Similarly, one way to reduce the burden on a limited course budget would be to encourage all students to first ask their advisors about whether they would be willing to fund part or all of the data collection for the replication. While chances are that some advisors will be unwilling or unable to do this, there should still be a meaningful reduction in the number of projects the course will need to fund.
Reproduce a Replication: The suggestions above apply if you at least have some amount of course funding, but what if you have no funding at all? While there are obvious limitations to this solution, one suggestion is to have students reproduce past public replications. For instance, our course GitHub page contains public repositories of all past replication projects that have been conducted in our course. Since the data for each replication project are available in these repositories, you could provide each of your students with a dataset and the original paper associated with it, and assign them to reproduce the results of the replication. Students should then be able to follow each step of the replication project described below (e.g., writing the report, identifying the key analysis, running the analysis). This format will only work if students do not view the original final replication reports that are posted publicly for their project, so it may be necessary to be clear about this at the beginning of the course.
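When students reproduce an existing replication, they will need some rule for deciding whether their result matches the reported one. One common convention, which we assume here for illustration, is to count a value as reproduced if it agrees with the reported value at the reported rounding precision. The function name and the numbers below are hypothetical.

```python
def matches_reported(reproduced, reported, decimals=2):
    """Return True if reproduced agrees with reported at the reported precision.

    A value reported to `decimals` places could come from any number within
    half a unit of the last reported digit, so that is the tolerance used.
    The small constant guards against floating-point noise at the boundary.
    """
    tolerance = 0.5 * 10 ** (-decimals)
    return abs(reproduced - reported) <= tolerance + 1e-12


# Made-up example: a report gives t = 2.31 and a student's script gives 2.3149.
print(matches_reported(2.3149, 2.31))  # True: differs by less than 0.005
print(matches_reported(2.36, 2.31))    # False: differs by 0.05
```

Disagreements flagged by a check like this can be a useful starting point for the discussions of analytic reproducibility mentioned above.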
For those of you who are working with a different course format (whether in terms of student level or course resources), we hope these suggestions were useful. If you try out a new idea in your course and find it helpful, we would be thrilled if you shared it with us!
The sample syllabi laid out below are categorized along the following decisions: 1) Material: whether the course focuses on just content or both content and skills, and 2) Duration: whether the course is 10 weeks long or 16 weeks long.
For undergraduate instructors, we have labelled advanced topics in purple. We expect that these topics are best suited for advanced undergraduate students. As for content around statistics (e.g., Estimation, Inference), instructors should decide how much of this content to teach, depending on how well previous classes have prepared their students.
| Week | Day | Topic | Chapter |
|---|---|---|---|
| 1 | W | Experiments and theories | 2 |
| 2 | F | Tidyverse Tutorial continued (with TAs) | |
| 3 | M | Measurement, Reliability, and Validity | 7 |
| 3 | W | Design of Experiments | 8 |
| 4 | W | Experiments 1: Simple survey experiments using Qualtrics | |
| 4 | F | Experiments 2: Project-specific Implementation (TAs) | |
| 5 | F | Sample Size Planning | |
| 6 | W | Midterm Presentations 1 | |
| 6 | F | Midterm Presentations 2 | |
| 8 | F | Exploratory Data Analysis Workshop | |
| 9 | M | Sampling, Representativeness, and Generalizability | 3 |
| 9 | W | Data and Participant Ethics | |
| 9 | F | Authorship and Research Ethics | |
| 10 | W | Final Project Presentations 1 | |
| 10 | F | Final Project Presentations 2 | |

| Week | Day | Topic | Chapter |
|---|---|---|---|
| 1 | W | Experiments and theories | 1 |
| 1 | F | Replication and reproducibility | 2 |
| 2 | F | Design of experiments 1 | 8 |
| 3 | M | Design of experiments 2 | 8 |
| 5 | F | Introduction to statistics | |
| 9 | M | Ethics: Data and Participants | 3 |
| 9 | W | Ethics: Authorship and Research | 3 |
| 9 | F | Ethics: Open discussion | 3 |

| Week | Day | Topic | Chapter |
|---|---|---|---|
| 1 | 2 | Experiments and theories | 2 |
| 3 | 2 | Tidyverse Tutorial continued (with TAs) | |
| 4 | 1 | Measurement, Reliability, and Validity | 7 |
| 4 | 2 | Design of Experiments | 8 |
| 6 | 1 | Experiments 1: Simple survey experiments using Qualtrics | |
| 6 | 2 | Experiments 2: Project-specific Implementation (TAs) | |
| 8 | 1 | Sample Size Planning | |
| 9 | 1 | Midterm Presentations 1 | |
| 9 | 2 | Midterm Presentations 2 | |
| 12 | 2 | Exploratory Data Analysis Workshop | |
| 13 | 1 | Sampling, Representativeness, and Generalizability | 3 |
| 13 | 2 | Data and Participant Ethics | |
| 14 | 1 | Authorship and Research Ethics | |
| 16 | 1 | Final Project Presentations 1 | |
| 16 | 2 | Final Project Presentations 2 | |

| Week | Day | Topic | Chapter |
|---|---|---|---|
| 2 | 1 | Experiments and theories | 1 |
| 2 | 2 | Replication and reproducibility | 2 |
| 3 | 2 | Design of experiments 1 | 8 |
| 4 | 1 | Design of experiments 2 | 8 |
| 8 | 1 | Introduction to statistics | |
| 14 | 1 | Ethics: Data and Participants | 3 |
| 14 | 2 | Ethics: Authorship and Research | 3 |
| 15 | 1 | Ethics: Open discussion | 3 |