appendix A — Instructor’s guide

A.1 Introduction

This is an instructor’s guide to conducting replication projects in courses. Beyond benefiting the field in ways that some of the authors of this book have discussed previously (e.g., Hawkins et al. 2018; Frank and Saxe 2012), replication-based courses can also benefit the students who take them. In this guide, we will describe these benefits, explore ways in which courses may be modified depending on student level and resources, and provide some guidelines and examples to help you set up the logistics of your course.

A.2 Why teach a project-based course?

Over the years, we have observed many ways in which our replication-based courses benefited students above and beyond a more traditional lecture and problem set-based course. Some of these benefits include:

Student interest: Since each student is free to replicate a study that is aligned with their research interests, they can apply course methods and lessons directly to a project that interests them.
Usefulness: If this course is taught in the first year of the program (as recommended), students may use their replication project as a way to establish robustness of a phenomenon before building studies on top of it.
Realism: Practice datasets that are typically provided for course exercises lack the complexity and messiness of real data. By conducting a replication project and dealing with real data, students learn to apply the tools provided in the course in a way that more closely demonstrates their usefulness beyond the course.
Intuition: Presentations of replication outcomes across the class, along with a discussion of which factors seemed to predict these outcomes, help students develop a better intuition for how likely studies are to replicate when they read the literature.
Perspective: Frustrating experiences with ambiguity (whether regarding experimental methods, materials, or analyses) can motivate students to adopt best practices for their own future studies.

A project-based course may look very different depending on student level (undergraduate vs graduate/post-doc level) and on the resources available at your institution for a course like this, namely TA support and course funding (for data collection). For most of this guide, we will assume that you have a similar setup to ours (i.e., you teach at the graduate/post-doc level and have course funding and TAs to support the course), but we have also spent some time considering ways to adjust the course to fit different student levels and availability of resources (see “Scenarios for different course layouts”).

A.3 Logistics

A.3.1 Syllabus considerations

If it is your first time teaching this course, you may want to decide ahead of time whether your course will mainly focus on content, or whether you will cover both content and relevant practical skills. For instance, if the course is for undergraduate students, you may decide to focus mainly on content, whereas if the course is for graduate students, they may find it more useful if the course covers both content and practical skills they can use in their research.

Another important consideration is how long your course will be. Depending on whether your university operates on quarters or semesters, the pace of the course will differ. For Psych 251, since we are on the quarter system, we use the 10-week schedule shown below. However, we have also adapted this schedule to a 16-week course given that it better represents a majority of other institutions’ academic calendars. At the end of this appendix, we give a set of sample class schedules.

A.3.2 Grading

Depending on your course format and teaching philosophy, you may have preferred grading criteria. As a point of reference, in Psych 251, we wanted to encompass both the assignments (problem sets and project components) and course attendance and participation. In addition, because the replication project is a central part of the course, we weighted the project components slightly more than the problem sets:

40%: Problem sets (four, at 10% each)
50%: Final project components, including presentations, data collection, analysis, and writeup
10%: Attendance and participation in class
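As a minimal sketch of how this weighting combines into a course grade, the Python snippet below applies the three weights listed above to component scores on a 0-100 scale. The function name, the score scale, and the example student are our own illustrative assumptions, not part of the Psych 251 materials.

```python
# Weights come from the grading breakdown above; everything else is illustrative.
WEIGHTS = {"problem_sets": 0.40, "final_project": 0.50, "participation": 0.10}


def final_grade(scores: dict[str, float]) -> float:
    """Combine component scores (each on a 0-100 scale) into a weighted grade."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: the four problem sets are averaged into one score,
# which is equivalent to weighting each problem set at 10%.
problem_set_scores = [90, 80, 100, 70]
example = {
    "problem_sets": sum(problem_set_scores) / len(problem_set_scores),
    "final_project": 88,
    "participation": 100,
}

print(round(final_grade(example), 1))
```

If you use different components or weights, only the `WEIGHTS` dictionary needs to change, provided the weights still sum to 1.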

A.3.3 Course budget

For our course, we usually receive around US$1,000 for course funding from the Psychology Department. In addition, when students from other departments are enrolled, we have been lucky to receive additional funding from those departments as well, to further support the course. Still, making sure that the course funds cover all students’ projects is one of the most challenging parts of the course. Assuming you have a budget to work with, here are some lessons we’ve learned along the way regarding budgeting (and if you don’t have any funding, please refer to the section titled “Course Funding” under “Scenarios for different course layouts”):

Before students pick their study to replicate, provide them with an estimate of how many participant hours they will be able to receive for their project
As soon as students pick a study for their replication project, help each student run a power analysis to confirm that replicating the study would be within the budget (TAs can help with this)
If a student feels strongly about a study that does not fit within the budget, consider the following ways to adjust the study: (1) Can the study be made shorter by cutting out unnecessary measures? (2) If it is a multi-trial study, can the number of trials be reduced? (3) Would their advisors be willing to provide additional funding? (4) Can the study be run on university participant pools?
As mentioned above, if there are students from other departments who are enrolled in your course, one possibility to obtain more funding is to reach out to the heads of those departments to see whether they would be willing to help support your course.
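The power-analysis and budget check described above can be sketched roughly as follows. This is a simplified illustration that assumes a two-group, between-subjects comparison of means and uses a normal approximation rather than an exact $t$-test calculation, so it slightly underestimates the required sample size; a dedicated power analysis tool should be preferred for real planning. The effect size, study length, hourly wage, platform fee, and per-student allocation are all hypothetical values chosen for the example.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-group comparison of means.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


def data_collection_cost(
    n_total: int,
    minutes_per_participant: float,
    hourly_wage: float,
    platform_fee_rate: float,
) -> float:
    """Participant payments plus a proportional platform fee, in US$."""
    participant_hours = n_total * minutes_per_participant / 60
    return participant_hours * hourly_wage * (1 + platform_fee_rate)


# Hypothetical inputs: d = 0.5, a 10-minute study, US$12/hour, a 25% fee,
# and a per-student allocation of US$100.
n_total = 2 * n_per_group(0.5)
cost = data_collection_cost(n_total, minutes_per_participant=10,
                            hourly_wage=12.0, platform_fee_rate=0.25)
per_student_budget = 100.0

print(f"N = {n_total}, estimated cost = ${cost:.2f}")
print("within budget" if cost <= per_student_budget else "over budget: consider the adjustments above")
```

When the estimate comes out over budget, the adjustments listed above (shortening the study, reducing trials, advisor funding, or a university participant pool) map directly onto the inputs of `data_collection_cost`.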

Once all projects have been approved as within-budget, we encourage you to create a shared spreadsheet containing each student’s name, so that they can fill in the details of their replication project. Ultimately, this will help ensure that students are paying fair wages to their participants and keep track of how the course funds are being divided up.
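To illustrate the kind of check such a spreadsheet makes possible, here is a small Python sketch that computes each project's implied hourly wage and the total funds committed. The column names, the wage floor, and the example rows are hypothetical, and platform fees are left out for simplicity.

```python
from dataclasses import dataclass


@dataclass
class ProjectRow:
    """One row of the (hypothetical) shared budget spreadsheet."""
    student: str
    n_participants: int
    minutes_per_participant: float
    payment_per_participant: float  # US$, excluding platform fees

    @property
    def hourly_wage(self) -> float:
        return self.payment_per_participant * 60 / self.minutes_per_participant

    @property
    def total_cost(self) -> float:
        return self.n_participants * self.payment_per_participant


def audit(rows: list[ProjectRow], course_budget: float, min_hourly_wage: float):
    """Return (students paying below the wage floor, total committed, remaining funds)."""
    underpaid = [r.student for r in rows if r.hourly_wage < min_hourly_wage]
    total = sum(r.total_cost for r in rows)
    return underpaid, total, course_budget - total


# Hypothetical rows and thresholds.
rows = [
    ProjectRow("Student A", n_participants=60, minutes_per_participant=10, payment_per_participant=2.00),
    ProjectRow("Student B", n_participants=100, minutes_per_participant=5, payment_per_participant=0.50),
]
underpaid, total, remaining = audit(rows, course_budget=1000.0, min_hourly_wage=10.0)

print("Below wage floor:", underpaid)
print(f"Committed: ${total:.2f}, remaining: ${remaining:.2f}")
```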

A.4 Scenarios for different course layouts

Now that we have covered the standard format of the course, we turn our attention to ways in which this format can be tweaked to fit different needs and resources. We have organized this section into two main categories: student level and course resources (such as TAs and course funding).

A.4.1 Student level

While Psych 251 at Stanford is geared toward graduate students (and is currently a required class for entering first-year graduate students in the Psychology Department), we also accept advanced undergraduate students as well as graduate students from other departments (e.g., education, human-computer interaction, philosophy, computer science). On the first day of our course, we tell students that they should be comfortable with two of the three following topics:

Some knowledge of psychology experimentation subject matter
Statistical programming: things like functions and variables
Basic statistics like ANOVA and $t$-tests

If students are comfortable with only one of the three topics above, we warn them ahead of time that the course may demand more time from them than from the average student.

Now, if you are planning to tailor this course to undergraduate students, chances are that they have had less exposure to these topics overall, so there are multiple ways to calibrate the course accordingly:

Prerequisites: Require students to have completed courses that cover at least two of the three topics mentioned above (i.e., a psychology class, a class that covers statistical programming, a class that covers basic statistics, any two of the three).

Pace: Unlike Psych 251, where the entire course only lasts 10 weeks, a class for undergraduates may benefit from a slower pace, allowing more time to cover the foundational principles before diving into the project. For instance, the course could be held over multiple academic semesters/quarters, with the project goal of Course #1 being choosing and planning the replication study, and the project goal of Course #2 being the execution and interpretation of the replication.

Pair-based or group-based projects: In our course, each student is required to conduct their own replication project. However, this structure may be overwhelming for undergraduate students, who may have less confidence taking on an entire replication project by themselves. One option that may alleviate this pressure is to have students conduct these projects as pairs or as small teams, so that they can collectively draw on each other’s strengths. When assigning these pairs or teams, it may be especially helpful to try to ensure a relatively even balance of students who are confident in each of the three areas outlined above (psychology, statistical programming, basic statistics).

A.4.2 Course resources

We think there are two main ways in which your course may have different resources from our model: in terms of course assistance (i.e., teaching assistants), and in terms of course funding for student projects. We’ll explore ways to work around each of these in this section.

Teaching assistants. As a point of comparison, two to three teaching assistants are generally allocated to Psych 251 (which enrolls about 25-35 students); this typically comes out to about 10-12 students per TA. We are very lucky to have this level of TA support. Since a project-based course requires individual attention and feedback, having fewer TAs than this entails a lot of work for the instructor. What if you have no TAs? With some adjustments, there are still ways you can make the course work sans-TA; we outline a few ideas below:

Peer grading. As an instructor with no TAs, the area that will require the biggest lift in terms of time and attention is grading. One way to overcome this is to introduce a peer-grading system, in which students grade each other’s work. If you choose this route, two things that may encourage fair grading among your students are to (1) distribute a clear and specific rubric that reduces the amount of subjectivity in the grading process as much as possible, and (2) anonymize the assignments so that students do not know whose assignment they are grading. If possible, it may again be beneficial to assign grading pairs that consist of students who are relatively knowledgeable in different areas, so that they can provide feedback that addresses weak points in each other’s work.
Collective troubleshooting. The second most time-intensive area you will have to make up for is the troubleshooting you may have to do for students who run into issues implementing their projects, anywhere from getting GitHub and R Markdown up and running on their devices to trouble with data collection on Amazon Mechanical Turk. One way to encourage communal support among your students is to set up a central discussion board for the course (e.g., Piazza or a course channel on Slack) where students can publicly (but anonymously, if desired) post issues they are running into. Then, you can offer extra credit to students who help troubleshoot these issues, in order to further incentivize collective troubleshooting. There will likely still be issues that cannot be addressed by the students, but this system at least frees up your time to focus your attention on those that only you can address.
Single class-wide project. Finally, if the peer grading and collective troubleshooting methods outlined above do not save enough time, you could consider walking through a single replication project as a class.¹ To make a single-project course work, you could have students nominate studies they would like to replicate as a class, and then have them vote on the final choice. Once the target study has been selected, every student can individually carry out all the steps of the project, including preregistering and writing up the analysis script. Then, setting up and running the data collection phase can happen during class, and once data has been collected, you can distribute it to the students for them to run it through their analysis script and interpret the result. Whether you choose to have students grade each other’s work or grade their work yourself, the fact that the project is standardized should cut down on a lot of the time you would otherwise spend learning about the details of every individual project.

¹ This approach does cancel out some of the benefits of a project-based course we mentioned at the start. Namely, the project will likely no longer fit each student’s specific research interest, so there may be less benefit in terms of student interest and usefulness for their program of research. The benefits of realism and intuition (especially if the project is discussed in the context of other replication findings) still stand.

Course funding. In addition to availability of TAs, another way in which your course may be different from ours is in terms of course funding. If you have little or no funding for your course (even after reaching out to relevant members of your department or institution), we suggest the following adjustments:

Pair- or group-based projects. As with the third suggestion for addressing different student levels, one option for limited course budgets is to have students conduct the replication projects as pairs or teams to reduce the cost of data collection. This structure may have the added benefit of encouraging students to problem-solve together. Alternatively, each student in the pairs or teams could complete each step of the replication individually (e.g., writing up the report, analyzing the data, interpreting the result), which would ensure that each student takes full responsibility for every step of the project. This structure may also provide opportunities for interesting discussions at the end of the course around analytic reproducibility, especially if students in the same teams (with the same dataset) differed in the conclusions they drew about the replication outcome.
Funding from advisors. In some cases, students come to us with target studies that require more funding than we are able to allocate, but that they feel particularly invested in (e.g., because of how relevant the study is to their line of research). Once we rule out other ways of making the study fit our budget (e.g., dropping extra control conditions, running a subset of the study), we often ask students whether their advisor would be willing to fund the study. We have found that advisors are often willing to do this, especially if the replication could serve an important role in the development of the student’s research program. Similarly, one way to reduce the burden on a limited course budget would be to encourage all students to first ask their advisors about whether they would be willing to fund part or all of the data collection for the replication. While chances are that some advisors will be unwilling or unable to do this, there should still be a meaningful reduction in the number of projects the course will need to fund.
Reproduce a replication. The suggestions above apply if you at least have some amount of course funding, but what if you have no funding at all? While there are obvious limitations to this solution, one suggestion is to have students reproduce past public replications. For instance, our course GitHub page contains public repositories of all past replication projects that have been conducted in our course. Since the data for each replication project is available in these repositories, you could provide each of your students with a dataset and the original paper associated with it, and assign them to reproduce the results of the replication. Students should then be able to follow each step of the replication project described below (e.g., writing the report, identifying the key analysis, running the analysis). This format will only work if students do not view the original final replication reports that are posted publicly for their project, so it may be necessary to be clear about this at the beginning of the course.

For those of you who are working with a different course format (whether in terms of student level or course resources), we hope these suggestions are useful. If you try out a new idea in your course and find it helpful, we would be thrilled if you shared it with us!

A.5 Sample course schedules

The sample syllabi laid out below are categorized along the following decisions: (1) material: whether the course focuses on just content or both content and skills, and (2) duration: whether the course is 10 weeks long or 16 weeks long.

For instructors of undergraduate courses, we have labeled advanced topics in purple. We expect that these topics are best suited for advanced undergraduate students. As for content around statistics (e.g., estimation, inference), instructors should decide how much of this content to teach, depending on students’ prior preparation.

A sample 10-week syllabus with both skills and content materials:

Week	Day	Topic	Chapter	Appendix
1	M	Class introduction	1
1	W	Theories	2
1	F	Version control		B
2	M	Reproducible reports	14	C
2	W	Tidyverse tutorial		D
2	F	Tidyverse tutorial continued (with TAs)
3	M	Measurement, reliability, and validity	8
3	W	Design of experiments	9
3	F	Sampling	10
4	M	Project management	13
4	W	Experiments 1: Simple survey experiments using Qualtrics
4	F	Experiments 2: Project-specific implementation (TAs)
5	M	Estimation	5
5	W	Inference	6
5	F	Sample size planning
6	M	Survey design
6	W	Midterm presentations 1
6	F	Midterm presentations 2
7	M	Preregistration	11
7	W	Meta-analysis	16
7	F	Open science	3
8	M	Visualization 1	15	E
8	W	Visualization 2
8	F	Exploratory data analysis workshop
9	M	Sampling, representativeness, and generalizability	4
9	W	Data and participant ethics	12
9	F	Authorship and research ethics
10	M	Open discussion	17
10	W	Final project presentations 1
10	F	Final project presentations 2

A sample 10-week syllabus with only content materials:

Week	Day	Topic	Chapter
1	M	Class introduction	1
1	W	Theories	2
1	F	Replication and reproducibility	3
2	M	Open science
2	W	Measurement	8
2	F	Design of experiments 1	9
3	M	Design of experiments 2
3	W	Sampling	10
3	F	Experimental strategy
4	M	Preregistration	11
4	W	Data collection	12
4	F	Visualization 1	15
5	M	Visualization 2
5	W	Midterm exam
5	F	Introduction to statistics
6	M	Estimation 1	5
6	W	Estimation 2
6	F	Inference 1	6
7	M	Inference 2
7	W	Models 1	7
7	F	Models 2
8	M	Meta-analysis	16
8	W	Project management	13
8	F	[Instructor-specific topics]
9	M	Sampling, representativeness, and generalizability	4
9	W	Data and participant ethics	12
9	F	Authorship and research ethics
10	M	Conclusion	17
10	W	Conclusion
10	F	Final exam

A sample 16-week syllabus with both skills and content materials:

Week	Day	Topic	Chapter	Appendix
1	1	Class introduction	1
1	2	Theories	2
2	1	Version control		B
2	2	Reproducible reports	14	C
3	1	Tidyverse tutorial		D
3	2	Tidyverse tutorial continued (with TAs)
4	1	Measurement, reliability, and validity	8
4	2	Design of experiments	9
5	1	Sampling	10
5	2	Project management	13
6	1	Experiments 1: Simple survey experiments using Qualtrics
6	2	Experiments 2: Project-specific implementation (TAs)
7	1	Estimation	5
7	2	Inference	6
8	1	Sample size planning
8	2	Survey design
9	1	Midterm presentations 1
9	2	Midterm presentations 2
10	1	Preregistration	11
10	2	Meta-analysis	16
11	1	Open science	3
11	2	Visualization 1	15	E
12	1	Visualization 2
12	2	Exploratory data analysis workshop
13	1	Sampling, representativeness, and generalizability	4
13	2	Data and participant ethics	12
14	1	Authorship and research ethics
14	2	[Instructor-specific topics]
15	1	Open discussion	17
15	2	Open discussion
16	1	Final project presentations 1
16	2	Final project presentations 2

A sample 16-week syllabus with only content materials:

Week	Day	Topic	Chapter
1	1	Class introduction	1
1	2	Theories	2
2	1	Replication and reproducibility	3
2	2	Open science
3	1	Measurement	8
3	2	Design of experiments 1	9
4	1	Design of experiments 2
4	2	Sampling	10
5	1	Experimental strategy
5	2	Preregistration	11
6	1	Data collection	12
6	2	Visualization 1	15
7	1	Visualization 2
7	2	Midterm exam
8	1	Introduction to statistics
8	2	Estimation 1	5
9	1	Estimation 2
9	2	Inference 1	6
10	1	Inference 2
10	2	Models 1	7
11	1	Models 2
11	2	Meta-analysis	16
12	1	Project management	13
12	2	[Instructor-specific topics]
13	1	[Instructor-specific topics]
13	2	Sampling, representativeness, and generalizability	4
14	1	Data and participant ethics
14	2	Authorship and research ethics
15	1	Ethics: Open discussion
15	2	Conclusion	17
16	1	Conclusion
16	2	Final exam