The Million Word March

What defines a word? Lexicographers and other experts don’t always agree

  • By Anika Gupta
  • Smithsonian.com, September 24, 2008
 
Dictionary
(iStockPhoto.com)

It used to be that the expert source on what was or wasn't a word was that school-day staple: the dictionary. American Heritage, Webster's Third, the Oxford English: there were a few trusted players in the game.

But what if those players are losing their edge?

Take the word "staycation." Staycation, which means to spend a vacation at home, recently appeared in the New York Times, USA Today and MSNBC. But it isn't likely to appear anytime soon in a dictionary. The same goes for "bracketology" (the science of NCAA March Madness betting), Facebook and Wikipedia.

"We try to cover the most salient words," says Joe Pickett, executive editor of the American Heritage Dictionary. "What does the educated layperson need to know?"

The people who make dictionaries are known as lexicographers ("authors or editors of a dictionary." Thanks, Merriam-Webster). And they have a time-tested method for choosing which new words to certify and which ones to toss before the next edition or update of a dictionary's Web site.

Groups of editors at a dictionary watch specific subject areas, logging the hits a new word gets. A "hit" is a mention in a book, newspaper or Web site. Then they put the hits in a database and compare the new terms to words they already have. So although Facebook, being a brand name, doesn't qualify, every word in Shakespeare's plays does – including cap-a-pie ("from head to foot") and fardel ("burden"). Being the granddaddy of creative linguistics, Shakespeare invented more than 1,700 words. All of them appear in an unabridged dictionary.

Dictionaries reject words for being too technical (even the most die-hard "Grey's Anatomy" fan will never need to know what a mammosomatotroph is) or for being too young (staycation).

They don't count brand names (Coke, Facebook, Wikipedia) or most foreign words and phrases.

"We aren't trying to be Wikipedia," Pickett said.

So who is? Who's keeping track of, counting and sorting the words English speakers use on an everyday basis?

The Global Language Monitor, based in Austin, Tex., has been tracking words for the past five years. Using its own teams of experts and its own algorithm, it says English adds a new word every 98 minutes. By its count, there are more than 900,000 English words in the world, and the one-millionth will appear sometime in April 2009.


In contrast, most standard dictionaries have about 200,000 words, unabridged dictionaries about 600,000.

But the Monitor is so sure of its numbers it has started a Million Word March, a countdown to the one-millionth word.
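
The Monitor's figures lend themselves to a quick back-of-the-envelope check. The sketch below is not the Monitor's algorithm, which the article does not describe; it only takes the stated rate of one new word every 98 minutes at face value and works out how many words that adds between this article's publication date and the end of April 2009. The end date and the function names are assumptions made for illustration.

```python
from datetime import date

MINUTES_PER_DAY = 24 * 60
MINUTES_PER_NEW_WORD = 98  # rate quoted by the Monitor


def words_per_day(minutes_per_word: float = MINUTES_PER_NEW_WORD) -> float:
    """New words per day implied by one new word every `minutes_per_word` minutes."""
    return MINUTES_PER_DAY / minutes_per_word


def words_added(start: date, end: date,
                minutes_per_word: float = MINUTES_PER_NEW_WORD) -> float:
    """Words added between two dates at a constant rate (a simplifying assumption)."""
    return (end - start).days * words_per_day(minutes_per_word)


published = date(2008, 9, 24)  # publication date of this article
target = date(2009, 4, 30)     # assumption: "sometime in April 2009" read as month's end

added = words_added(published, target)
implied_count_at_publication = 1_000_000 - added

print(f"New words per day: {words_per_day():.1f}")
print(f"Words added by the target date: {added:,.0f}")
print(f"Implied count at publication: {implied_count_at_publication:,.0f}")
```

At that pace English gains a little under 15 words a day, or only a few thousand over the seven months in question. For the April 2009 forecast to hold, the Monitor's running tally at the time of writing would have to sit far closer to one million than "more than 900,000" suggests on its face.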

"We went back to the Middle English and saw that the definition of a word was 'a thought spoken,'" said Paul JJ Payack, president and chief word analyst at the Monitor, "which means if I say a word, and you understand me, it's a real word."

Payack counts staycation, Facebook and Wikipedia as words. But he also follows some of the old rules. For example, words that are both noun and verb, such as "water," are counted only once. He doesn't count all the names there are for chemicals, because there are hundreds of thousands.

Once the Monitor identifies a word, it tracks it over time, watching to see where the word appears. Based on that measurement, it decides if the word has "momentum": basically, whether it's becoming more popular or is a one-hit wonder of the linguistic world.

At first glance, this seems a lot like a dictionary's system.

"It's the same as the old [method], just recognizing the new reality," Payack said. The Monitor's method gives a lot more weight to online citations.

But is Payack's "new reality," well, real? He claims that the fast flow of information and the advent of global English have changed the way people use words, and that the gap between the words people use and the words that appear in dictionaries might be on the rise.

"It turns out that once something enters the Internet, it's like an echo chamber," said Payack. Since the first web browser appeared in 1991, the Internet has added a lot of words to the English language—dot-com, blog—and it's added these words fast. The Web has also taken existing words to new ears.

"Back in the mid-'90s, getting a couple of thousand browser hits for a word made us inclined to enter it; now the threshold has changed," Pickett said. "You can find so much evidence for obscure words and expressions."

But dictionaries are used to playing catch-up. After all, it's hard to define a word before it's coined.

Payack says the Internet isn't the most pressing challenge to traditional word-counting methodology. That, in his opinion, is "global English."

English has nearly 400 million native speakers, putting it second in the world, but it has 1.3 billion speakers overall, making it the world's most widely understood language, explains Payack. It's spoken by over 300 million people in India as a second language, and by at least that many second-language speakers in China.

"Anyone who speaks English right now feels like they own it," Payack says. For example, look at the adjective "brokeback." After director Ang Lee called his movie about two cowboys who fall in love "Brokeback Mountain," the word "brokeback" wormed its way into the English vernacular as a synonym for "gay." Although "brokeback" may be past its glory days in the United States, the word, with this new meaning, is still popular in China, Payack said. It appears on blogs and Web sites, which means it has momentum, which means it's a word.

"Nowadays we have much more human traffic going in all directions around the world," said Salikoko Mufwene, a linguistics professor at the University of Chicago, who has studied the development of regional dialects. Whether or not Chinese-inspired words will become part of American English, for example, "depends on how regularly Americans are going to interact with Asians in English," he said.

And if they did, would Americans become, on average, more verbose? Average Americans use about 7,500 words a day and know about 20,000 total. Even Shakespeare only knew about 60,000.

So the number of words in the English language will always be many, many more than any one person knows or uses.

Both Mufwene and American Heritage's Pickett said English could very well have a million words already. Counting words, after all, is an imprecise science.

It's also not the dictionary's science. The job of dictionaries has always been, Mufwene said, "to reflect how people speak, not to teach them how to speak." If the dictionary reflection grows narrower, it can still be valuable.

"You need people to edit the dictionary and take responsibility for it, so that it's reliable," Pickett said. "And I don't think that's going to change."


Comments (9)

This is an excellent time for conscientious people to expand the language in a thoughtful, conscious way. Perhaps we should go out of our way and coin some words that apply to situations in the world that are not addressed or are ignored because of people's basic self-interest (which is forgivable, after all, as an evolutionary form of survival thought, but can now be added to with free will). How about words that apply to the greenification of the energy supply on this planet? Or the need to reclaim vast swaths of land containing ecosystems, including entire river systems from ocean to the mountains, in order to renativize and defootprint them? Or the need to pull people out of primitive mindsets that let them believe their actions on a local scale don't multiply and magnify on a global scale, along with everybody else's actions, to create calamities, i.e., global warming, ocean dead spots, dust bowls? Oh, and there's another: the failure of developing countries to look back on America's well-documented failures, dust bowl etc., and to plan for that. Okay, I'm done.

Posted by Kory T. on May 11,2009 | 01:17 AM

So... this means the millionth word is gonna be here by the end of the day today?

Posted by MATRIX on April 30,2009 | 03:17 PM

'Average Americans do not know 20,000 words. 2,000 would be a stretch for most people.' Indeed!

Posted by Trevor B on April 19,2009 | 06:52 AM

Average Americans do not know 20,000 words. 2,000 would be a stretch for most people.

Posted by Ben Evans on January 5,2009 | 06:29 AM

Last year I married my husband who happens to be Jewish. He said I was the one in the family that uses Yiddish the most. He also said I use it correctly the least. I have since then made it a point to understand the definitions of Yiddish words and find their roots and usage. I think words nowadays are becoming more technologically related. For example; friends text message me with sayings such as, "LOL", and "TTFN" which mean laughing out loud and ta ta for now. The funny thing will be to see how advanced our language will become in the future as far as this new age kind of techno-talk we use without really realizing we do. In addition, I found out that UCLA offers a course in Ebonics, which is very interesting due to the cultural understanding and background!

Posted by Amy Schlossberg on October 22,2008 | 10:26 PM

Very interesting. Just as words get added as they gain "momentum", I guess, we also need someone to watch out for words that are losing it, and hence to sort of "de-notify" them, else imagine a world where this language has, say, 4 million "words" and the average person continues to know only about 20,000 - since the capacity of the human brain is unlikely to change anytime soon, it would make for a world where people feel dumber by the day!!

Posted by Amitabh Jaipuria on October 12,2008 | 01:20 AM

Two words in use in my family are: shrammed - when you are so cold your bones feel as if they are aching, and twitten - a paved alley between buildings. I believe they both originate from Sussex, England, where my mother (now 85) was brought up by her grandmother, born in 1855.

Posted by Lyn Hewitt-Jones on October 8,2008 | 11:50 AM

Yiddish words, long used in conversations, are now appearing in various English language writings, e.g., nosh, putz, goniff, shikseh, etc. Will such as these be added?

Posted by hpuziss on October 6,2008 | 05:52 PM

Perhaps Smithsonian and GLM should arrange a joint celebration of this wonderful event. Unless I'm woefully ignorant, early next spring, English will become the first language in human history to have over one-million words. --Mike Perry, Untangling Tolkien

Posted by Mike Perry on October 6,2008 | 03:56 PM

The way one looks at this topic really depends on one's point of view. It's similar to the idea of "traditional writing versus text messaging"--there's really no right or wrong answer; there are only different points of view.

Posted by Tori Myers on September 30,2008 | 02:08 PM


