So I guess it’s basketball season. For me, basketball is tied up with gender equity debates. I remember that what little discussion of gender equity took place at my high school largely centered on the boys’ and girls’ basketball teams. For example, people asked whether the girls’ basketball team received as much attention or support as the boys’. Of course, it did not.
I remember a particularly heated discussion when it was decided that the desirable Friday and Saturday night game times would be split between the boys and the girls (until then, the boys had laid exclusive claim to these nights), so that the girls would have the opportunity to play in front of larger audiences.
Cue male privilege. Predictable and yet disheartening arguments about whether women’s sports were interesting enough to be given these time slots ensued. The consensus (at least among the boys) was that they apparently were not. At the time, I didn’t really understand the role that biased perceptions and dominant misogynistic discourses were playing in these discussions. Nowadays, I’m a lot more sensitive to such things.
A couple of weeks ago, I decided to take a closer look at what role the media might play in reinforcing the way we view men’s and women’s sports. College basketball provides an interesting place to compare the way the media presents these two, since both men and women compete, and are in theory supposed to be given equal (if separate) opportunities to do so. Thus, I went to NCAA.com and downloaded all of the articles for Division I men’s and women’s basketball from the past 10 years. This gave me two corpora: one with 3451 articles about men’s basketball (containing about 2.75 million words) and one with 1825 articles about women’s basketball (containing about 1.2 million words). The disparity in the number of articles is itself a reflection of the lower status women’s basketball continues to have to this day.
From this data (and using methods which I describe more thoroughly below), I generated the plot above which compares the two sets of data in terms of how frequent different words within them are.
You’ll notice that there are no names of players, coaches, or teams which you might have expected to see. This is because I was not interested in the actors per se but rather in the way in which various writers presented those actors and the competitions themselves. Thus, I removed names (more on how below).
I ranked the remaining words according to how much more frequent they were in one corpus than in the other, correcting for the fact that the two corpora are of different sizes. The top 20 for each corpus are presented in the plot. While the plot is itself interesting to look at and speculate about, it’s really important to consider in more depth why these particular words appear characteristic of coverage of either men’s or women’s basketball. The answers are often not as straightforward as you might assume.
Some observations about the keywords
Much of what we see in the plot is not terribly surprising. There are numerous gender-specific words dominating the top spaces in the women’s articles and many of the middle positions for the men. It’s nonetheless interesting to consider that gender-specific terms are even more key for the women than for the men. In other words, for female-specific words like she there’s a greater difference between the articles about men’s and women’s basketball than there is for male-specific words like he. This seems to be caused by the fact that men’s basketball is an all-male zone, with not only the players but also the other major actors like the coaches, referees, commentators, etc. being male. Hence, words for women rarely show up. In contrast, many of the coaches and other actors in women’s basketball are men, so male-specific words do show up in the women’s articles.
The presence of the word girls in the top 20 is also quite striking, especially since the corresponding boys does not appear in the men’s list. We might expect the term girls to be applied to the players, and it sometimes is, usually in quotations from coaches and the players themselves, as in this excerpt:
“We had some young guards step up today and they really had a great game,” Carey said. “They may be young but those girls know the defense and they know what we do.” Iowa State coach Bill Fennelly said 10 of his team’s 11 players are in their first or second year. “It was a great experience for our girls and I know they learned a lot,” said Fennelly, whose team went 2-1 at the tournament.
However, even more common are discussions of the need to get young girls interested in sports and of the ways female college athletes engage in this work. One example is Mobolaji Akiode, who founded the nonprofit organization Hope 4 Girls to create opportunities for young girls in Nigeria to engage in sports.
Looking at the men’s side, the top position is the word pound, which was used to describe male players in sentences like this: “He’s a 6-foot-9, 230 pound sophomore”. While female players were described according to their height, there isn’t a single occurrence of them being described by their weight in my data. It is interesting to speculate both about why women are not described in this way and about what effects this has on the way their competitions are perceived. The authors of these articles are probably responding to a wider taboo about women’s weight when they omit female players’ weights. However, when men are described according to their weight, do they not often appear more powerful and forbidding? Is any perception that female athletes are less powerful or that their competitions are less intense or physical reinforced by the absence of discussion of their weight? I would also love to know what female athletes themselves feel about this.
Some of the men’s keywords also point to a tendency to portray men’s college basketball as big business and worthy of serious analysis to a greater extent than women’s college basketball. While it must be the case that women’s coaches and schools reach agreements and that people associated with women’s basketball are paid salaries, and money is made or lost in the sport, the NCAA.com articles do not discuss this side of women’s basketball. However, keywords like agreement, million, and hired all point to the greater attention afforded the business of men’s college basketball. It appears that NCAA.com feels that the hiring, firing, and dealing of actors in men’s basketball are newsworthy but not in women’s basketball.
In a similar vein, the keyness of words like stats and opinion suggests a tendency to engage in more analysis of men’s basketball, with writers discussing stats, offering predictions about outcomes, etc. This type of discussion of women’s basketball is less common.
Of course, this is not all positive for men’s basketball. There is also a tendency to present negative events like infractions as newsworthy in NCAA.com articles, as suggested by keywords like suspended, suspension, and violations. While it’s likely the case that male players are suspended more frequently than female players, it may also be that men’s suspensions and violations are seen as more newsworthy than women’s, perhaps because they feed into the type of analysis that I mentioned above. For example, if one team has numerous players that are suspended, this might impact an analyst’s predictions about the outcome of a game.
Some notes on method
My program scraped the NCAA.com archives for Division I men’s and women’s basketball. The program downloaded all of the articles to my computer, and then cleaned them so that only the text of the article and the title were left in each file. I then used TagAnt to tag the part of speech of each word. I eliminated all words tagged as proper nouns or plural proper nouns (this eliminated most names). However, like any part-of-speech tagger, TagAnt is not a perfect tool, so some names remained. I manually eliminated them from my later analysis.
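To make the name-removal step concrete, here is a minimal sketch of filtering proper nouns out of tagged text. It rests on assumptions that are not stated above: that the tagged text is a sequence of `word_TAG` tokens separated by whitespace, and that proper nouns carry one of the tags `NP`, `NPS`, `NNP`, or `NNPS`. The function name `drop_proper_nouns` is hypothetical, and this is not necessarily how the original program did it.

```python
# Assumed tag names for singular and plural proper nouns; adjust to
# whatever tagset the tagger actually emits.
PROPER_NOUN_TAGS = {"NP", "NPS", "NNP", "NNPS"}

def drop_proper_nouns(tagged_text):
    """Return the words of `tagged_text` whose tag is not a proper-noun tag.

    Assumes whitespace-separated tokens of the form word_TAG.
    """
    kept = []
    for token in tagged_text.split():
        # Split on the last underscore so words containing "_" survive.
        word, sep, tag = token.rpartition("_")
        if not sep:
            # No tag attached: keep the token unchanged.
            kept.append(token)
        elif tag not in PROPER_NOUN_TAGS:
            kept.append(word)
    return kept
```

Anything the tagger mislabels (a name tagged as a common noun, say) slips through a filter like this, which is why a manual pass is still needed afterwards.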
I then computed the frequency, normed frequency (frequency / total words in corpus *1000), contextual diversity (% of articles that a word appears in), and keyness (% difference in normed frequencies) of each word in both corpora.
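As an illustration, these statistics could be computed along the following lines. This is a sketch under assumptions: each corpus is represented as a list of articles, each article as a list of word tokens, and the keyness formula shown (difference relative to the second corpus, times 100) is one common reading of "% difference" rather than a confirmed detail of my calculation. The names `corpus_stats` and `percent_diff` are made up for the example.

```python
from collections import Counter

def corpus_stats(articles):
    """Per-word statistics for one corpus.

    `articles` is a list of token lists, one list per article (assumed layout).
    """
    freq = Counter()      # raw frequency of each word
    doc_freq = Counter()  # number of articles containing each word
    for tokens in articles:
        freq.update(tokens)
        doc_freq.update(set(tokens))  # count each word once per article
    total_words = sum(freq.values())
    n_articles = len(articles)
    return {
        word: {
            "freq": count,
            # occurrences per 1000 words
            "normed": count / total_words * 1000,
            # % of articles the word appears in
            "diversity": doc_freq[word] / n_articles * 100,
        }
        for word, count in freq.items()
    }

def percent_diff(normed_a, normed_b):
    """% difference of a normed frequency in corpus A relative to corpus B.

    Assumed formula: (A - B) / B * 100. Returns infinity when the word
    never occurs in corpus B.
    """
    if normed_b == 0:
        return float("inf")
    return (normed_a - normed_b) / normed_b * 100
```

Because the normed frequencies are per 1000 words, the comparison is not distorted by one corpus being more than twice the size of the other.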
Once I had computed the statistics, I limited my analysis to words that were significantly key (had a chi-square test with a p-value of less than .01) and were found in at least 2% of the articles in one of the two corpora. This was done to ensure that the words I looked at were both highly characteristic of one corpus (that is, had a high keyness value) and reasonably frequent at least within that corpus.
From this reduced set, I looked at the top 20 keywords. I checked each keyword to ensure that its keyness was in fact a genuine difference and not, as was sometimes the case, the result of some repeated bit of text (in particular, titles of other articles often showed up, and this often led to a word being artificially key). I also looked at the actual texts to get a sense of how the word was being used. For example, who was being described as girls in the articles on women’s basketball?
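For readers who want to see what such a filter involves, here is a sketch using only the Python standard library. It assumes a plain Pearson chi-square test on a 2x2 table (the word versus all other words, in one corpus versus the other) with no continuity correction; whether a correction was applied is not specified above. It also assumes the word occurs at least once across the two corpora. The names `chi_square_p` and `keep_word` are hypothetical.

```python
import math

def chi_square_p(freq_a, total_a, freq_b, total_b):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) on the 2x2 table word / other words x corpus A / corpus B.

    Returns (chi2, p). Assumes freq_a + freq_b > 0.
    """
    observed = [
        [freq_a, total_a - freq_a],
        [freq_b, total_b - freq_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [freq_a + freq_b, (total_a - freq_a) + (total_b - freq_b)]
    grand_total = total_a + total_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def keep_word(p, diversity_a, diversity_b, alpha=0.01, min_diversity=2.0):
    """Keep a word if it is significantly key and appears in at least
    `min_diversity` percent of the articles in one of the two corpora."""
    return p < alpha and max(diversity_a, diversity_b) >= min_diversity
```

With corpora of millions of words, even small differences can pass a significance test, which is one reason to pair the p-value threshold with the contextual diversity requirement.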