What Is a Frequency List?

By Matt Hubbard
Updated: May 23, 2024

A frequency list is a tool for quantitative linguistic analysis: a listing of every item that appears in a chosen block of text, together with how often each one occurs. Linguistic analysis is a cross-disciplinary field that studies the structure of language and how it is used. Combining elements of anthropology, mathematics, computer science and logic, linguistic analysis is used for projects such as mechanical translation, cryptography and deciphering ancient writings.

Frequency lists can be listings of words or of letters. Letter frequencies typically are used in cryptography. One of the simplest codes is a substitution cipher, where each letter is replaced with another letter or symbol. For example, the message "attack at dawn" might be encoded as "zoozhl zo azqp." The benefit of substitution ciphers is that they don’t require a code book, but the weakness is that they can be cracked by comparing the frequency of letters and letter combinations within the message to a frequency list of common usage.
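
The weakness described above can be sketched in a few lines of Python. The example below uses a randomly generated substitution key (not the one behind the "zoozhl zo azqp" example), and the helper names are hypothetical. It shows that the cipher changes which letters appear but not how often they appear, which is exactly what a frequency list exploits.

```python
from collections import Counter
import random
import string

def make_key(seed=0):
    """Build a random one-to-one letter substitution (hypothetical helper)."""
    shuffled = list(string.ascii_lowercase)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(string.ascii_lowercase, shuffled))

def encode(message, key):
    """Replace each letter using the key; spaces and punctuation pass through."""
    return "".join(key.get(ch, ch) for ch in message.lower())

def count_profile(text):
    """Sorted letter counts, ignoring which letter each count belongs to."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return sorted(counts.values(), reverse=True)

key = make_key()
plain = "attack at dawn"
secret = encode(plain, key)

print(secret)
# The two profiles are identical: the most common plaintext letter is
# still the most common ciphertext letter, just under a different name.
print(count_profile(plain))
print(count_profile(secret))
```

With a message this short the profile says little, but over a long message the most frequent ciphertext letters can be matched against a frequency list of ordinary usage.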

In Arthur Conan Doyle’s The Adventure of the Dancing Men, the fictional detective Sherlock Holmes uses frequency analysis to crack a substitution cipher. Historically, codemakers tried various tricks to make their ciphers more difficult to crack with a frequency list: rolling ciphers, in which the substitution used depended on a letter’s position within the message; eliminating or encoding spaces so that word frequencies couldn’t be used; and keeping messages short and avoiding expected words so that code-breakers wouldn’t have enough of a sample for frequency analysis. Ultimately, any cipher of this kind can be broken with a large enough sample, which is why more sophisticated encryption protocols have become standard.
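
One well-known example of a position-dependent cipher is the Vigenère cipher, used here as a stand-in for the "rolling" idea; the article does not name a specific scheme. In this minimal sketch, the keyword "lemon" and the function names are assumptions chosen for illustration, and the keyword is assumed to contain only lowercase letters.

```python
import string

ALPHABET = string.ascii_lowercase

def _shift_text(message, keyword, direction):
    """Shift each letter by an amount taken from the keyword, cycling by position."""
    out = []
    used = 0  # counts letters only, so spaces do not advance the keyword
    for ch in message.lower():
        if ch in ALPHABET:
            shift = ALPHABET.index(keyword[used % len(keyword)])
            out.append(ALPHABET[(ALPHABET.index(ch) + direction * shift) % 26])
            used += 1
        else:
            out.append(ch)
    return "".join(out)

def rolling_encode(message, keyword):
    return _shift_text(message, keyword, +1)

def rolling_decode(secret, keyword):
    return _shift_text(secret, keyword, -1)

secret = rolling_encode("attack at dawn", "lemon")
print(secret)                           # lxfopv ef rnhr
print(rolling_decode(secret, "lemon"))  # attack at dawn
```

The four occurrences of "a" in the message become "l", "o", "e" and "n", so a single-letter frequency count no longer lines up directly with a frequency list. With enough text and a short keyword, the pattern can still be recovered.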

Frequency lists of words and word types are also used in ancient language studies. When Jean-Francois Champollion translated the Rosetta Stone in the 1820s, his process used a mixture of comparing frequencies and transliterations to piece together the hieroglyphic language. Studies have shown that for ancient languages, as for modern English, a core vocabulary of 1,500 to 2,000 words covers 85-90 percent of common texts, a level that allows the reader to expand his or her vocabulary from context.
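
The coverage figure mentioned above is simple to compute once a frequency list exists: add up the counts of the most common words and divide by the total number of words. The sketch below assumes a hypothetical `coverage` function and a tiny made-up sample, so the resulting number illustrates only the calculation, not the 85-90 percent finding.

```python
from collections import Counter

def coverage(words, top_n):
    """Fraction of the text accounted for by its top_n most frequent words."""
    if not words:
        return 0.0
    counts = Counter(words)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(words)

# Made-up sample; a real study would use a large body of text.
words = "the cat sat on the mat and the dog sat on the log".split()

# The three most common words ("the", "sat", "on") cover 8 of 13 words.
print(round(coverage(words, 3), 3))
```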

Zipf’s law, named for Harvard linguistics professor George Kingsley Zipf, is an empirical observation on the behavior of frequency rankings. It states that the frequency of an event is inversely proportional to the ranking of the event, so the second most common item appears about half as often as the most common one, the third about a third as often, and so on. The event is generally a word or letter in a linguistic frequency list, but Zipf’s law has been generalized to cover other phenomena such as city populations and corporate earnings.
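
In its simplest form the law can be written as frequency ≈ C / rank, where C is the count of the top-ranked item. One informal way to check it is to multiply each count by its rank and see whether the products stay roughly constant. The counts below are invented to fit the law exactly, and the function names are assumptions for this sketch; real data follows the pattern only approximately.

```python
def zipf_expected(top_count, rank):
    """Count predicted for a given rank under an idealized Zipf's law."""
    return top_count / rank

def rank_frequency_products(counts):
    """Multiply each count by its rank; near-constant values suggest a Zipf-like list."""
    ordered = sorted(counts, reverse=True)
    return [rank * count for rank, count in enumerate(ordered, start=1)]

# Invented counts that follow the law exactly, for illustration only.
counts = [1200, 600, 400, 300, 240]

print(zipf_expected(1200, 4))           # 300.0
print(rank_frequency_products(counts))  # [1200, 1200, 1200, 1200, 1200]
```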

A frequency list is an important tool in projects to help computers make sense of spoken and written language. Mechanical translation, the use of computers to translate documents from one language to another, is one example. Another example is Watson, the natural language supercomputer that was showcased as a contestant on the television game show Jeopardy! in February 2011. Frequencies both of words and of usage types are incorporated into the programming of such systems as a tool for finding meaning.
