I’ve speculated a bit whether instead of grades, instructors/lecturers/professors could write a couple of paragraphs on the student’s progress throughout the entire semester. Companies, government offices, graduate school committees, etc., could then use Large Language Models (AI) to comb through students’ written report cards to determine whether the student was a good fit for their specific opportunity.
There are a lot of flaws with this approach. It doesn’t scale well with classroom size. The instructor might not have a good sense of all the material the student has grasped. And students might be hesitant to ask questions, concerned that doing so could indicate their trouble understanding the material.
The motivation, though, is that a single midterm grade (like the one we just had) tells a limited story of a student’s understanding of the material. So I’m always interested in how we can do better. How can we better align “learning” with “evaluation”?
Hopefully it can be a helpful exercise to think critically about just how informative a single midterm grade is about your understanding of the material.
Let’s start a bit informally, by thinking about why a midterm grade is not “perfectly” informative of your understanding of the material. One thing to note is that the midterm only had 18 questions on it. And 18 questions is surely a small number relative to all the questions that could have been asked on a midterm. Maybe the 18 questions that we saw on the midterm were really good for you. Maybe you really knew how to tackle those questions. Or maybe you just happened to get stumped on some of those questions, but if they had been replaced by a few other questions, you would have done really well.
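To get a feel for how much the choice of 18 questions alone can matter, here is a rough simulation sketch. Everything except the 18 is made up for illustration: it assumes a hypothetical pool of 200 questions that could have been asked, and a hypothetical student who can answer 140 of them. It then draws many different 18-question midterms from that pool and looks at how the score moves around.

```python
import random

POOL_SIZE = 200    # assumption: number of questions that could have been asked
KNOWN = 140        # assumption: how many of those this student can answer
EXAM_LENGTH = 18   # from the text: the midterm had 18 questions


def simulate_scores(n_exams, seed=0):
    """Score the same student on n_exams randomly drawn 18-question midterms."""
    rng = random.Random(seed)
    # 1 marks a question the student can answer, 0 marks one they cannot.
    pool = [1] * KNOWN + [0] * (POOL_SIZE - KNOWN)
    # Each midterm is 18 distinct questions drawn from the pool.
    return [sum(rng.sample(pool, EXAM_LENGTH)) for _ in range(n_exams)]


scores = simulate_scores(1000)
print("lowest score: ", min(scores), "out of", EXAM_LENGTH)
print("highest score:", max(scores), "out of", EXAM_LENGTH)
print("average score:", sum(scores) / len(scores))
```

The student's knowledge is identical in every simulated midterm; only the draw of questions changes, and the score changes with it.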
We also know that we don’t take midterms in a vacuum. Maybe the week of October 16th was really relaxing for you. You didn’t have any other midterms. Your roommates were gone for the week. You were getting those 9-10 hours of sleep. So you felt great walking into the midterm. Or maybe you were one of those students with three other midterms, and your roommate was sick that week, so you walked into the midterm absolutely exhausted. Whatever you experienced on October 16th, it was just one of the possible Wednesdays that you could have had. If your roommate doesn’t get sick and your accounting class reschedules its midterm for the following week, your Wednesday feels different and you likely would have performed differently.
From a little bit of brainstorming, it’s clear that our “performance” depends on the questions that we get and the particular Wednesday on which we tackle these questions. Change either of these components, and our performance changes.
We can represent this visually with the following figure, which depicts a sample space and a random variable. Here the sample space is the “product” of the set of all possible questions, $\Omega_1$, with the set of all possible Wednesdays, $\Omega_2$. Each element of the sample space is a pair of elements $(\omega_1, \omega_2)$, where the first element is a specific question and the second element is a specific Wednesday. The random variable $X$ then tells us whether the student would have gotten that specific question, $\omega_1$, correct on that specific Wednesday, $\omega_2$.
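The same structure can also be sketched in a few lines of code. This is only an illustration under made-up assumptions: the three questions, the two kinds of Wednesday, and the set of pairs the student gets right are all hypothetical stand-ins for the much larger sets $\Omega_1$ and $\Omega_2$. The sketch also assumes $X$ is coded as 1 for a correct answer and 0 otherwise.

```python
from itertools import product

# Hypothetical, tiny stand-ins for the sets described in the text.
questions = ["q1", "q2", "q3"]         # Omega_1: possible questions
wednesdays = ["rested", "exhausted"]   # Omega_2: possible Wednesdays

# The sample space is the product of the two sets: every (question, Wednesday) pair.
sample_space = list(product(questions, wednesdays))

# Made-up outcomes for one imaginary student: the pairs they would answer correctly.
correct_pairs = {("q1", "rested"), ("q1", "exhausted"), ("q2", "rested")}


def X(omega):
    """Random variable: 1 if question omega[0] is answered correctly on Wednesday omega[1], else 0."""
    return 1 if omega in correct_pairs else 0


for omega in sample_space:
    print(omega, "->", X(omega))
```

Notice that $X$ is just a function on the sample space: hand it a (question, Wednesday) pair and it returns a number. In this made-up example the student gets "q2" right only on the rested Wednesday, which is exactly the kind of dependence on both components described above.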