In the past month we’ve seen the unfolding of another chapter in the bloody conflict between Israelis and Palestinians in Gaza. Since this conflict is, for me and probably many of my readers, taking place halfway around the world, the primary way we come to understand it is through reading accounts of it in news reports. However, even allegedly ‘objective’ news reporting can’t help but make choices that reveal particular ideological biases.
Examining media portrayals of Gaza
To explore how this conflict is being presented in two US news outlets, CNN and FOX, I’ve used techniques from corpus linguistics (see my notes on methods below) to extract words that are key to coverage of the Gaza conflict. In other words, these words are used more frequently in coverage about Gaza than in other contexts where other topics are being discussed. I then compared all of these words in FOX and CNN’s articles.
The plot above shows the results of this. Words in the middle are important to coverage of Gaza, but they are shared by both FOX and CNN. As you move to the left, you find words that are more characteristic of CNN’s coverage. That is, they are less frequent in FOX than in CNN. For example, the word teen is used about 0.43 times per 1000 words in FOX but 1.37 times per 1000 words in CNN or about 216% more frequently in CNN than FOX. To the right of the plot, you’ll find words that are more characteristic of FOX’s coverage of Gaza. For example, the word militants is used about 0.81 times per 1000 words in CNN but about 2.30 times per 1000 words in FOX or about 185% more frequently in FOX than CNN.
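The arithmetic behind these comparisons can be sketched in a few lines. This is a minimal illustration, not the script used for the plot; the function names are my own, and the inputs are the rounded per-1000-word rates quoted above.

```python
def per_thousand(count, total_words):
    """Normalize a raw count to occurrences per 1000 words."""
    return 1000 * count / total_words


def percent_more(rate_a, rate_b):
    """How much more frequent rate_a is than rate_b, as a percentage."""
    return 100 * (rate_a - rate_b) / rate_b


# Rounded rates quoted in the text (per 1000 words).
teen_more_in_cnn = percent_more(1.37, 0.43)
militants_more_in_fox = percent_more(2.30, 0.81)

print(f"teen: {teen_more_in_cnn:.0f}% more frequent in CNN")
print(f"militants: {militants_more_in_fox:.0f}% more frequent in FOX")
```

With the rounded rates, this gives roughly 219% and 184%, close to the 216% and 185% reported above. The small gaps presumably come from rounding the rates before dividing.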
While the x-axis (or left-right orientation) of the plot shows relative frequency in FOX and CNN, the plot also incorporates information about the actual frequency through the size of each word. For example, the word israel is quite large because it is the most frequent word in this set, occurring about 14.69 times per 1000 words in the FOX articles (and less frequently in the CNN corpus: 10.10 times per 1000 words). In contrast, the word political is quite small because it is relatively infrequent, occurring only 0.47 times per 1000 words in the CNN corpus (and less in FOX’s articles).
There is a lot to say about the differences between CNN and FOX’s coverage of these events. I want to make just a couple of broad observations in this post.
One of the most notable differences between CNN’s and FOX’s coverage is the presence of words that denote religious, racial, and ethnic identity in FOX’s articles. Words like jewish, islamic, and arab indicate that FOX is presenting the conflict as motivated by ethnicity and religion to a much greater extent than CNN. In particular, FOX presents the perpetrators of violence as islamic militants, and it labels Hamas the islamic militant group. You can see this in the concordance lines below (these lines show most of the occurrences of the word in context, sorted according to what follows islamic).
In contrast, CNN uses islamic less frequently than FOX does. It occasionally labels Hamas a militant islamic group or organization, though less often than FOX, and it avoids the phrase islamic militants that FOX uses. You can see this in the concordance lines below.
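For readers unfamiliar with concordance lines, here is a minimal sketch of how a keyword-in-context listing sorted by the following words could be produced. The `kwic` function and the sample sentences are made up for illustration; they are not drawn from the CNN or FOX articles, and a dedicated concordancer would handle tokenization more carefully.

```python
import re


def kwic(text, keyword, window=4):
    """Return (left, keyword, right) contexts, sorted by what follows the keyword."""
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    # Sorting on the right-hand context groups recurring phrases together.
    return sorted(lines, key=lambda line: line[2])


# Invented sample text, for illustration only.
sample = (
    "The Islamic militant group fired rockets. "
    "Officials blamed Islamic Jihad for the attack. "
    "An Islamic militant group claimed responsibility."
)

concordance_lines = kwic(sample, "islamic")
for left, keyword, right in concordance_lines:
    print(f"{left:>35} | {keyword} | {right}")
```

Sorting by the right-hand context is what makes repeated phrases such as "islamic militant group" line up next to one another, which is why patterns are easy to spot in the listings.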
For its part, CNN has focused on the human toll of the violence. Its coverage seems to suggest that it finds things like injury reports more newsworthy or more relevant than FOX does. For example, both injured and dead are substantially more frequent in the CNN articles than in the FOX articles and this corresponds to a tendency for CNN to report figures about the numbers of dead and injured as a result of the violence.
While there is of course a great deal more that could be said about this topic, I think in general we can say that people who read FOX’s account of the violence are more likely to be exposed to a conflict presented as an ethnic and religious conflict, perhaps even with a fairly clear ‘bad guy’: the militants. In contrast, people who read CNN’s coverage are likely to be exposed to stories about the human toll of the violence. This includes the injuries and deaths that Palestinians have sustained.
Additional notes on methodological choices
I began by collecting news articles from both CNN.com and FoxNews.com through Bing’s News aggregator, using the search term “israel”. After screening out explicitly marked opinion articles (to the extent that they are clearly marked), I was left with 94 articles from CNN and 90 articles from FOX.
In order to understand the language of reporting on Gaza generally, I compared both the articles from CNN and the articles from FOX to a corpus of general US English: the American National Corpus (ANC). This allowed me to pull out not only the words that were specific to CNN or FOX but also words that are important to the topic of the violence in Gaza but used in both venues’ reporting (the graphic has a middle section with words from both). I limited my set of keywords to those that were significantly more frequent in either CNN or FOX than in the ANC and that appeared in at least 20% of the articles in either corpus. Finally, I used a stop list to remove a standard set of function words (for example, the, and, she, he, they) and also eliminated words that pertained not to the coverage of Gaza but rather to the venue itself (for example, fox or cnn).
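The filtering steps above can be sketched roughly as follows. Note the assumptions: I do not name the significance test above, so this sketch substitutes the log-likelihood (G2) statistic that is common in keyword analysis, with 3.84 as the cutoff for p < 0.05 at one degree of freedom. The function names, parameters, and example counts are all hypothetical.

```python
import math


def log_likelihood(count_target, size_target, count_ref, size_ref):
    """Log-likelihood (G2) for one word in a target corpus vs. a reference corpus."""
    total_count = count_target + count_ref
    total_size = size_target + size_ref
    expected_target = size_target * total_count / total_size
    expected_ref = size_ref * total_count / total_size
    g2 = 0.0
    for observed, expected in ((count_target, expected_target),
                               (count_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def is_keyword(word, count_target, size_target, count_ref, size_ref,
               doc_share, stop_words, threshold=3.84, min_doc_share=0.20):
    """Apply the three filters: stop list, article coverage, and keyness."""
    if word in stop_words:
        return False
    if doc_share < min_doc_share:  # must appear in at least 20% of articles
        return False
    overused = count_target / size_target > count_ref / size_ref
    g2 = log_likelihood(count_target, size_target, count_ref, size_ref)
    return overused and g2 > threshold


# Hypothetical counts: a 100,000-word news corpus vs. a 1,000,000-word reference.
stop_words = {"the", "and", "she", "he", "they", "fox", "cnn"}
print(is_keyword("militants", 230, 100_000, 50, 1_000_000, 0.5, stop_words))
print(is_keyword("the", 6_000, 100_000, 60_000, 1_000_000, 1.0, stop_words))
```

A word passes only if it clears all three filters, so a very frequent function word or a word concentrated in a handful of articles is dropped even when its keyness score is high.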
After pulling out the 182 words that were key to the coverage of Gaza generally, I compared the frequency of all of these words in CNN and FOX. This is where the horizontal positioning in the plot at the top comes from. Words that are more frequent in CNN than in FOX are plotted to the left, and words that are more frequent in FOX than in CNN are plotted to the right. This plotting technique for comparing word frequencies is inspired in part by Drew Conway’s “better word cloud”.
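One way to turn the two rates into a plot layout is sketched below. The exact scaling behind the plot is not spelled out here, so treat both formulas as assumptions: a log ratio for the horizontal position (negative values lean CNN, positive values lean FOX) and a square-root scale for the font size. The function names and the `min_size` and `scale` parameters are hypothetical.

```python
import math


def x_position(rate_cnn, rate_fox):
    """Log2 ratio of FOX to CNN rates: negative is left (CNN), positive is right (FOX).

    Assumes both rates are greater than zero.
    """
    return math.log2(rate_fox / rate_cnn)


def font_size(rate_cnn, rate_fox, min_size=8, scale=6):
    """Scale font size with the higher of the two rates (square root damps extremes)."""
    return min_size + scale * math.sqrt(max(rate_cnn, rate_fox))


# Per-1000-word rates quoted in the post: (CNN, FOX).
rates = {
    "teen": (1.37, 0.43),
    "militants": (0.81, 2.30),
    "israel": (10.10, 14.69),
}

layout = {
    word: (x_position(cnn, fox), font_size(cnn, fox))
    for word, (cnn, fox) in rates.items()
}

for word, (x, size) in sorted(layout.items(), key=lambda item: item[1][0]):
    print(f"{word:>10}  x={x:+.2f}  size={size:.1f}")
```

A log ratio is symmetric, so a word twice as frequent in CNN sits as far left of center as a word twice as frequent in FOX sits to the right, and words shared evenly by both outlets fall near the middle.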