In this set of notes we’re going to:

  1. Highlight our approach to learning

<aside>

1.

The typical undergraduate class proceeds sequentially. You cover one topic, say the origins of World War One, and then move on to a different but related topic, like the events and issues that arose during the war. Having discussed that topic, you move on to a third, like the results and ramifications of WW1. Only in a final essay, perhaps, or in a class debate, are you asked to make sense of the entire sequence of topics.

There is certainly a sequential aspect to the material covered in this class too. We first went over the basics of Python: how to create a variable, how to define a function, etc. Then we discussed how to manipulate data in Python: how to create frequency tables, how to filter data, etc. And last week, we introduced a conceptual overview of probability theory: the idea of a sample space, the essence of a random variable. And so it’s natural to try to make sense of things as concepts within their own separate groups.


But this class is a little bit different from other classes (maybe you would say a lot a bit different!), in the sense that we don’t present concepts in their entirety up front. There’s too much going on to be able to absorb an entire concept in one class period. With varying levels of success, we’ve instead developed our understanding of topics by layering. We introduce an initial concept. We see another concept. And then we apply the new concept to the initial concept, thereby furthering our understanding of the initial concept. It’s perhaps a little different from how other classes are structured, but if done well, it allows us to cover concepts in greater depth. You further your understanding of a concept by exploring how it relates to other concepts.

One way to think about this class is that we learn more about concepts over the course of the semester by understanding the various relationships between them.


In line with this approach to learning, let’s consider the following setup, where we have a random variable $X$ which maps from the sample space into some set of real numbers. As we’ve discussed previously, given a random variable $X$, we can construct a related function $X^{-1}$ which maps from the power set of the real numbers into the power set of the sample space. Given this setup, we can ask: how does the idea of $X^{-1}$ relate to the idea of filtering?
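Before working through an answer, it may help to have the usual definition of $X^{-1}$ written out. The symbol $\Omega$ for the sample space and $B$ for a set of real numbers are notation assumed here; the notes above do not name them.

$$
X^{-1}(B) = \{\, \omega \in \Omega : X(\omega) \in B \,\}, \qquad B \subseteq \mathbb{R}
$$

In words, $X^{-1}$ takes a set of real numbers and returns the set of outcomes that $X$ sends into that set.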


Thought #1: Nothing!

There is no relation between these two concepts. $X^{-1}$ kinda “undoes” the random variable, and filtering is a data manipulation process that allows you to drop rows. These are distinct ideas; there is no relation.

… 30 seconds later

Thought #2: Something!

They do seem like different concepts but they have to be related. We wouldn’t be asked this question if they weren’t related. But I have no idea.

…5-10 minutes later

Thought #3: They’re pretty related

We know that via filtering, we can select a subset of the dataset.

For example, we can create a subset of the dataset that consists only of statistics for players who have averaged at least five points per game, as follows: `df[df['PTS'] >= 5]`

We can also use filtering to select a subset of a column.

For example, we can keep just the names of players who average at least five points per game: `df['Player'][df['PTS'] >= 5]`
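To see what these two filters do mechanically, here is a minimal sketch. The four-row table is made up for illustration (the player labels and point values are hypothetical, not real WNBA statistics); only the column names `Player` and `PTS` come from the notes.

```python
import pandas as pd

# Hypothetical data for illustration only; not real WNBA statistics.
df = pd.DataFrame({
    "Player": ["A", "B", "C", "D"],
    "PTS": [12.5, 3.0, 5.0, 4.9],
})

mask = df["PTS"] >= 5        # boolean Series: one True/False per row
rows = df[mask]              # every column, but only rows where mask is True
names = df["Player"][mask]   # only the Player column, for those same rows

print(mask.tolist())   # [True, False, True, False]
print(names.tolist())  # ['A', 'C']
```

The comparison `df["PTS"] >= 5` does the real work: it produces a True/False label for each row, and indexing with that label keeps the True rows.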

If we think of the sample space as the set of WNBA players and Points as our random variable, $X$, then given a subset of point values, $X^{-1}$ should return the set of players whose points fall in that subset.
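One way to check this claim is to compute $X^{-1}$ directly from its definition and compare it with the result of filtering. The sketch below assumes a made-up four-player table (hypothetical labels and values, not real WNBA statistics), and the helper name `preimage` is introduced here purely for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only; not real WNBA statistics.
df = pd.DataFrame({
    "Player": ["A", "B", "C", "D"],
    "PTS": [12.5, 3.0, 5.0, 4.9],
})

# The random variable X as a plain mapping: player -> points.
X = dict(zip(df["Player"], df["PTS"]))

def preimage(X, in_B):
    """Return the set of outcomes whose X-value lies in B.

    B is described by the predicate in_B (value -> True/False).
    """
    return {omega for omega, value in X.items() if in_B(value)}

# B = [5, infinity), i.e. "at least five points per game"
via_preimage = preimage(X, lambda pts: pts >= 5)
via_filter = set(df["Player"][df["PTS"] >= 5])

print(via_preimage == via_filter)  # True
```

Under these assumptions, the boolean condition inside the filter plays the role of the subset of point values, and the filtered `Player` column is the set that $X^{-1}$ returns.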
