Tag Archives: linguistics

Practical Corpus Linguistics by Martin Weisser

Practical Corpus Linguistics is a great introduction to analyzing language data with hands-on exercises using free software and websites. For anyone interested in textual analysis, corpus linguistics, and digital humanities, this book will get you started on the basics. There are other introductions to corpus linguistics available, but this appears to be the only one that is designed more as a how-to guide than as a theoretical overview.

Chapters include collecting and cleaning data, concordancing, querying mega corpora online, frequency analysis, keywords in context, and part-of-speech tagging. There are even chapters on regular expressions and XML. Each chapter features several exercises for you to try out, as well as solutions to and comments on the exercises. There is also a companion website for the book with more exercises as well as updates.

The BYU website was changed shortly after the book was published, so you will need to check the companion website for instructions on using the new interface for accessing the Corpus of Contemporary American English. However, the old website is still available. You will also need to download the free software, AntConc, and create a free account at the BNCweb site to follow along with the exercises.

If you prefer to analyze languages other than English, check out a simple analysis of telenovela Spanish phrases I carried out using DownThemAll to collect transcripts, Python to clean the text files, and AntConc to find the most common phrases.
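For a rough idea of what the cleaning and phrase-counting steps can look like in Python alone, here is a minimal sketch. It is not the script used for the telenovela analysis: the function names, the assumption that transcripts mark sound cues in [square brackets], and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter


def clean_text(raw: str) -> str:
    """Lowercase the text and strip bracketed notes and punctuation."""
    # Assumption: sound cues and stage directions appear in [square brackets].
    text = re.sub(r"\[[^\]]*\]", " ", raw)
    # Keep word characters (including accented letters), whitespace and apostrophes.
    text = re.sub(r"[^\w\s']", " ", text.lower())
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def top_ngrams(text: str, n: int = 2, k: int = 5):
    """Return the k most frequent n-word phrases with their counts."""
    tokens = text.split()
    # Zip n staggered copies of the token list to form overlapping n-grams.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in ngrams).most_common(k)


# Invented sample line, standing in for a real transcript file.
sample = "¡No puede ser! [música] No puede ser, mi amor. ¿Qué pasa, mi amor?"
cleaned = clean_text(sample)
print(top_ngrams(cleaned, n=2, k=3))
```

AntConc's clusters and n-grams tools do the same kind of counting with a graphical interface, so a script like this is mostly useful when you want to batch-process many files first.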

More Corpus Linguistics

The 8-week Corpus Linguistics MOOC at FutureLearn is another great introduction to the methodology. Both Lancaster and Birmingham run Corpus Linguistics summer schools in the UK every year, if you’re able to travel to them.

If you’re interested in learning more programming for humanities research (usually in Python), try the lessons at Programming Historian and check out my list of Digital Humanities tools. Martin Weisser also wrote Essential Programming for Linguists (using Perl), which you might be interested in.

Mundolingua: Museum of Languages and Linguistics in Paris

I was recently in Paris to present at a conference and I was finally able to check out Mundolingua, a museum of languages and linguistics that opened last year. It’s on Rue Servandoni in the 6th, just south of Saint Sulpice.

Mundolingua


The first fun/nerdy thing to play with is this interactive IPA chart. Press the button and hear how the consonants are pronounced.

Push all the buttons!

There are plenty of videos to watch and activities to do to learn about the world’s languages and the various topics that encompass linguistics.

Sexy syntax


You can also just hang out in the back and read language books, or play language games downstairs. Check out the Scrabble rug (available online here)! They’ve made tiles for English, French, Spanish, Russian and Arabic so far.

Will someone please buy me this rug?


They even have a small cinema room where you can watch DVDs. The toilet is multilingual too.

Peepee place

Mundolingua costs 7€ for adults and it’s open 7 days a week.

Non-Linguists, Please Stop Trying to Do or Talk About Linguistics Without the Help of Actual Linguists

Ben Zimmer has a wonderful article on “When physicists do linguistics” over at the Boston Globe, which can perhaps be best summarized by this comic from xkcd:

Joking aside, I am happy that other disciplines have an interest in language. However, I hate it when other disciplines try to do linguistic research and fail because they do not involve any actual linguists in the research. I agree completely when Zimmer says that there is a “need for better communication between disciplines that previously had little to do with each other.” Communication among related fields could use a little boost too, because it isn’t just physicists who publish papers that contradict linguistic research. Psychologists, speech pathologists, and cognitive scientists have been doing it wrong for a while too, especially when it comes to multilingual and cultural aspects of language acquisition.

Linguistics seems to be the field that everyone thinks they can do without any special training. Most people wouldn’t think of talking about chemistry or mathematics without actually having studied those subjects. Yet everyone seems to think they are experts on language simply because they speak a language (their native language) or because they have learned another language. Sorry, but those abilities do not make you a qualified linguist, nor do they give you the right to talk about language without checking facts or to teach language as if you were an experienced teacher. I know how to drive a car, but I don’t go around pretending to be a certified mechanic or giving advice to others on how to fix their own cars.

Robert Lane Greene’s book, You Are What You Speak: Grammar Grouches, Language Laws, and the Politics of Identity, is about this phenomenon. People believe, and repeat, such ridiculous things as “this language has eleventy billion words for X” or “this language is primitive but that language is logical” all the time. Even worse, respected authors repeat these myths in their articles and books, such as Bill Bryson in The Mother Tongue, and so they are repeated again and again without anyone questioning whether they are true or not. These myths are dangerous because a lot of them are based on ethnocentrism and the perceived superiority of the way we speak compared to everyone else.

Please, do yourself a favor and study language seriously instead of repeating myths. Talk to actual linguists, read books written by actual linguists or whose authors talked to actual linguists. In addition to You Are What You Speak, you can start with Language Myths (for a general overview), Vocabulary Myths (for language learners/teachers, which I previously posted about), and the “truth-squad” blog Language Log. But most importantly, always question what is written about language even if it is published by best-selling authors or academic researchers because they may not be linguists at all.

Update 26/02/13: And another one! Ugh. “Why speaking English can make you poor when you retire” about research done by a behavioural economist. Hey, that’s not linguistics! ::sigh:: At least the article quotes my hero, John McWhorter.

Update 15/03/15: So glad I’m not the only one who complains about this: If you’re not a linguist, don’t do linguistic research by @EvilJoeMcVeigh

Books on French Linguistics and Sociolinguistics (in English)

For any students interested in French linguistics or sociolinguistics, here are the books that I recommend for an introduction as well as a more in-depth explanation. You don’t necessarily need to have a background in linguistics to be able to understand everything, especially for the first three books.

Exploring the French Language by R. Anthony Lodge, Nigel Armstrong, Yvette M. L. Ellis and Jane F. Shelton

French: A Linguistic Introduction by Zsuzsanna Fagyal, Douglas Kibbee, and Frederic Jenkins

The French Language Today: A Linguistic Introduction by Adrian Battye, Marie-Anne Hintze, and Paul Rowlett

A Sociolinguistic History of Parisian French by R. Anthony Lodge

French: From Dialect to Standard by R. Anthony Lodge

Unfortunately a few of these books are a bit more expensive (mostly because they only exist in hardcover). Hopefully you can access them electronically through your library.

Social and Linguistic Change in European French by Nigel Armstrong and Tim Pooley

Social and Stylistic Variation in Spoken French by Nigel Armstrong

Sociolinguistic Variation in Contemporary French edited by Kate Beeching, Nigel Armstrong, and Françoise Gadet

Recommendations for books written in French to follow.

The Power of Babel by John McWhorter

The Power of Babel is a book about the natural history of language that I read recently while getting over my Christmas cold. (As you have probably noticed from the lack of website updates, I’m still recovering and not doing much besides sleeping and reading.) The book is rather inexpensive at Amazon though it is not available for Kindle, which unfortunately seems to be the case for many language and linguistics books.

Since I found the book to be rather entertaining and insightful, here are some interesting factoids from a few chapters.

  • The future tense in Romance languages derives from combining the infinitive of the main verb with the conjugated forms of have in Latin. I will love was amare habeo in Latin and it transformed into amerò in Italian. So having to learn various endings for all six person and number combinations in Italian, French, Spanish, etc.? Thanks, Latin! Inflections are transformed this way in many languages, but thankfully English had a simpler process with fewer endings overall (did became -ed for all six, for example).
  • Much like inflections, tones developed over time from sound changes to distinguish meaning between words. In Vietnamese, for example, tones did not originally exist, but then final consonants wore off many words, changing the sound of the preceding vowel. Now it is these tones that distinguish meanings instead of the final consonants. Inflections and tones were not present in the earliest forms of language and they are not necessary to human communication. They are merely accidental changes of words and sounds that produced a more complicated form of the language.
  • The Normans who invaded England in 1066 did not speak the standardized Parisian French that many people think of, but rather the Norman dialect. The “French” words borrowed at that time were actually the Norman pronunciations, where Norman had k and ei but Parisian had sh and oi (compare carbon/aveir and charbon/avoir). This is also why Montréal is not Montroyal: it was settled by people from Northwestern France rather than Paris.
  • Most people know that double negatives used to be grammatically correct in English, but there are other features of contemporary non-standard dialects that are in fact closer to Early Modern English than today’s standard English is. Even though thou went out of fashion by 1700, the singular you did not, and its corresponding past-tense form of be was, in fact, was. Letters written by educated people in the 1800s indicate that “you was” was the standard, and it was only because prescriptive grammarians decided that it didn’t sound correct that they stamped it out of modern English by rewriting grammar books.
  • One of the few examples of Scots that still exists, or at least is recognizable, in modern-day English is auld lang syne, literally old long since or “days of yore.”
  • The human proto-language (if you believe that there was one) was very similar to today’s creoles in that the grammar was much simpler: no inflections or tones, or even relative clauses, because these complex features developed due to sound changes and the fact that most languages became written instead of only spoken.
  • And of course, my favorite part: the acknowledgement that French is actually two languages, written and spoken. McWhorter mentions a few of the parallels (nous vs. on, ne…pas vs. pas, est vs. c’est) and how textbooks do not do a very good job of informing the learner that the gap between these two is wider than for most other languages. Written French was codified centuries ago and rarely changes, but the spoken form is highly dynamic, even for non-colloquial speech by the educated. It should be no wonder that c’est, rather than est, was the basis for is in French-based creoles (se in Haitian Creole), because that is what people always heard in everyday speech.