Linguistics 203: Handout
Week 1: September 25, 2002


Using Corpora to Test Claims about Idioms


I. Riehemann (2001) cites Jackendoff (1997: 170) as claiming that the idiom 'raise hell' is syntactically inflexible, and specifically that it cannot be passivized. But the following all appear in the NYT corpus:

  1. So much hell was raised that the biologists threw up their hands in surrender.
  2. ...the internal investigation was reopened "in part because of the hell that Plitman raised about Newcomb's role in Leatherneck."
  3. Few folks in the Apple speculated on the hell that would have been raised by George Steinbrenner if the Yankees had been similarly robbed at Camden Yards.
  4. ...but how much hell can you raise at Jack-in-the-box?
  5. All that was really raised was a little hell and a lot of dust.
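
A first-pass search for examples like these can be sketched as follows. This is only an illustration, not the query actually run on the NYT corpus: the function name classify and the two patterns are assumptions made for this sketch. It flags sentences containing both a form of 'raise' and the noun 'hell' and records which comes first. Noun-first hits are candidates for passives, relative clauses, and wh-questions; every hit still has to be checked by hand, since a verb-first hit such as example 5 is not an ordinary active either.

```python
import re

# Hypothetical patterns, chosen for illustration only.
RAISE = re.compile(r"\brais(?:e|es|ed|ing)\b")
HELL = re.compile(r"\bhell\b")

def classify(sentence):
    """Return 'noun_first', 'verb_first', or None if either word is missing."""
    s = sentence.lower()
    verb = RAISE.search(s)
    noun = HELL.search(s)
    if verb is None or noun is None:
        return None
    # Linear order is only a crude proxy for syntactic flexibility.
    return "noun_first" if noun.start() < verb.start() else "verb_first"

examples = [
    "So much hell was raised that the biologists threw up their hands in surrender.",
    "...but how much hell can you raise at Jack-in-the-box?",
    "All that was really raised was a little hell and a lot of dust.",
]
for ex in examples:
    print(classify(ex), "|", ex)
```

Word order was chosen over a parser here because it needs no annotation, at the cost of more hand checking.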

II. Koopman and Sportiche (1991) make the following claim, which is repeated by Richards (2001):

Consider 'the cat...out of the bag'. This idiom virtually always occurs in a phrase headed by a verb, but the verb is not part of the idiom. It is sometimes listed as 'let the cat out of the bag', but Riehemann (2001) found that 23 of 48 occurrences in the NYT corpus did not contain 'let'. Of these 23, 14 were headed by 'is' or 'was', but other verbs (such as 'get') also occurred.

III. Idioms with double passives (e.g. 'take advantage') allow both an inner passive ('advantage was taken of X') and an outer passive ('X was taken advantage of'). When the idiomatic noun is modified, they overwhelmingly use the inner passive and virtually never the outer passive. Nunberg et al. (1994) searched a very large corpus of American newspapers for outer passives of 'take advantage' and found over 1200 exemplars. Of these, only the following three had anything between 'taken' and 'advantage':

  1. It would be very hard to enforce, and it will be taken unfair advantage of.
  2. Not even six Cochise fielding errors were taken full advantage of.
  3. She is pretty and she also has a personality, but does not wish to be taken such advantage of and hold the left-handed compliments, too.

In contrast, the same corpus included only 71 examples of inner passives of 'take advantage', but in 47 of these, 'advantage' was preceded by an adjective and/or quantifier. For example:

  1. Maximum advantage is taken of the natural beauties of the place.
  2. Full advantage is taken of facilities nearby.
  3. No undue advantage is taken nor any dangerous weapon used.
  4. In the Wanderer/Alberich scene, imaginative advantage was taken of Tom Fox's physical stature.
  5. But further advantage has been taken of the opportunity.
  6. Greater advantage can be taken of federal funds available through the use of locally raised matching money.
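
The contrast between the two passives can be made concrete with a little arithmetic on the counts reported above. The sketch below uses only those counts; the variable names are ours. Two caveats: the outer-passive total is given only as "over 1200", so the rate computed for it is an upper bound, and the two counts use slightly different criteria (anything between 'taken' and 'advantage' versus a preceding adjective and/or quantifier).

```python
# Counts as reported in this handout.
outer_total = 1200    # "over 1200": a lower bound on the true total
outer_modified = 3
inner_total = 71
inner_modified = 47

def pct(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole

outer_rate = pct(outer_modified, outer_total)   # upper bound on the true rate
inner_rate = pct(inner_modified, inner_total)

print(f"outer passives with a modified noun: at most {outer_rate:.2f}%")
print(f"inner passives with a modified noun: {inner_rate:.1f}%")
```

So roughly two thirds of the inner passives have a modified noun, against at most a quarter of one percent of the outer passives.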

In 66 percent of the (almost 6000) occurrences of 'take advantage' in which exactly one word occurred between 'take' and 'advantage', that word was 'full', and in another 7 percent, it was 'unfair'. In contrast, we found no other word appearing in this context with a frequency greater than 0.4 percent.
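
A tally of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to obtain the figures above: the pattern, the function name intervening_word_shares, and the four sample sentences are all invented for the sketch. It finds forms of 'take' followed by exactly one word and then 'advantage', and reports each intervening word's share of the hits.

```python
import re
from collections import Counter

# Hypothetical pattern: a form of 'take', exactly one word, then 'advantage'.
PATTERN = re.compile(
    r"\b(?:take|takes|took|taken|taking)\s+(\w+)\s+advantage\b",
    re.IGNORECASE,
)

def intervening_word_shares(sentences):
    """Map each intervening word to its percentage share of all hits."""
    counts = Counter()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            counts[match.group(1).lower()] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: 100.0 * n / total for word, n in counts.items()}

# Invented toy sentences, NOT corpus data.
sample = [
    "They took full advantage of the delay.",
    "He takes unfair advantage of his position.",
    "We should take full advantage of the offer.",
    "She took advantage of the sale.",  # no intervening word: not counted
]
shares = intervening_word_shares(sample)
for word, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {share:.1f}%")
```

On a real corpus the same function would be run over every sentence, and the resulting shares compared with the 66 percent and 7 percent figures above.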

REFERENCES

Jackendoff, Ray S. (1997) The Architecture of the Language Faculty. Cambridge, Mass: MIT Press.

Koopman, Hilda and Dominique Sportiche (1991) The position of subjects. Lingua 85: 211-258.

Nunberg, Geoffrey, Ivan A. Sag, & Thomas Wasow (1994) Idioms. Language 70(3): 491-538.

Richards, Norvin (2001) An idiomatic argument for lexical decomposition. Linguistic Inquiry 32: 183-192.

Riehemann, Susanne Z. (2001) A Constructional Approach to Idioms and Word Formation. Stanford University dissertation.



Last modified: January 20, 2003