Arabic Vocabulary

Learning any language means learning new vocabulary. If the language you are learning, in our case Arabic, differs considerably from the languages you already know, its words will probably not resemble any words familiar to you.
I remember very well how I started learning Arabic many years ago (in 1976). First, the words sounded unfamiliar; second, the new words all seemed to resemble each other (kabîr, SaGîr, jadîd, qadîm, etc.).
There is much more to say about learning vocabulary, but the final conclusion is: without vocabulary knowledge you cannot really use the language.

Here I want to quote Karin Ryding, an authority in the field of teaching Arabic:
Learning useful, frequent, and widely understood vocabulary should be the primary task of the beginning Arabic student, relying on the linguistic and cultural sophistication of the teacher to select the most pragmatic and practical items to work with. Vocabulary is the most essential tool that Arabic learners can use to construct meaning and it provides the context and anchor for grammatical structure. (Ryding, chapter 18, 'Teaching and Learning Vocabulary', in her book 'Teaching and Learning Arabic as a Foreign Language: A Guide for Teachers').


I have made available a variety of tools for vocabulary learning:
  1. Basic Vocabulary
  2. Top Media Vocabulary
  3. Downloadable vocabulary lists
  4. MEMRISE courses: Arabic Basic Vocabulary, Arabic Vocabulary in Arabian Nights, Top Media Vocabulary, Media Arabic Text Collection (MATC) glosses, MATC Advanced Level glosses
  5. List of phraseology (collocations and multiword expressions) in the Media Arabic Text Collection (MATC)
  6. Information on Arabic dictionaries

1 Basic Vocabulary
For the texts of the Classical Arabic Text Collection (CATC) we assume the user has a basic vocabulary knowledge of Modern Standard Arabic. Our 'definition' of basic vocabulary knowledge is that the user knows (at least) the 2000 most frequent Arabic words from the Frequency Dictionary of Buckwalter and Parkinson.
You can read more about this on the explainer page of the CATC. From that page you can also download a PDF version of the Basic Vocabulary (BV).


2 Top Media Vocabulary
For the texts of the MATC I assume the user has a basic vocabulary knowledge of Modern Standard Arabic. My 'definition' of basic vocabulary knowledge for Media Arabic is that the user knows (at least) the 2600 most frequent Arabic words related to media topics. This list was selected from the Frequency Dictionary of Buckwalter and Parkinson.
You can read more about this on the explainer page of the MATC. From that page you can also download several versions of the Top Media Vocabulary (TMV).
According to that page, this selection covers 95% of the content of normal news reports, by which we mean factual reports and political news, but not background articles, analyses, interviews or columns. Section 4 below, on the Memrise platform, describes a group of Memrise courses for learning and memorising the words of the TMV.

There is a strong overlap between the Basic Vocabulary and the Top Media Vocabulary.
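
For readers who want to check such figures against their own word lists, here is a minimal sketch in Python. It rests on an assumption: that 'coverage' means the share of running words in a text that appear in the vocabulary list. The explainer page may define it differently. The function names and the transliterated toy words are hypothetical and are not taken from the real lists.

```python
def coverage(tokens, known_words):
    """Share of running words (tokens) that appear in the known-word list."""
    if not tokens:
        return 0.0
    known = set(known_words)
    hits = sum(1 for token in tokens if token in known)
    return hits / len(tokens)


def overlap(list_a, list_b):
    """Words that two vocabulary lists have in common."""
    return set(list_a) & set(list_b)


# Hypothetical toy data in transliteration; NOT the real BV or TMV lists.
top_media_vocabulary = ["ra'îs", "hukûma", "jadîd", "kabîr", "qâla"]
basic_vocabulary = ["kabîr", "SaGîr", "jadîd", "qadîm", "qâla"]
news_report = [
    "qâla", "ra'îs", "hukûma", "jadîd", "qâla",
    "ra'îs", "kabîr", "hukûma", "mufâwaDât", "jadîd",
]

# 9 of the 10 running words are in the toy media list.
print(f"{coverage(news_report, top_media_vocabulary):.0%}")
print(sorted(overlap(basic_vocabulary, top_media_vocabulary)))
```

A real check would first have to reduce every word in a text to its dictionary form, which this sketch leaves out.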

3 Downloadable vocabulary lists
Both vocabulary lists mentioned above are intended to encourage users of the CATC and MATC to acquire a basic vocabulary before they start reading authentic Arabic texts. The lists also serve as reference lists for us (the text collectors): they relieve us of having to gloss every word in a text, which would be an unmanageable effort.
So we only provide a gloss explaining the meaning of a word (in a specific context) if that word is not in the reference list.
With each text of the CATC and the MATC we provide a downloadable list of all glosses presented in that text (in Excel and PDF format). You only have to press the Download button at the top of the screen.

There is one essential difference between the vocabulary of the CATC and the MATC. In the CATC we have decided to gloss new words only, that is, only at the first occurrence of a word. The user is strongly advised to actively learn the words of every text before proceeding to the next text, because a word that occurs again is no longer treated as new and receives no gloss. Actively learning the words is made easier by a MEMRISE course, as described in section 4 below.

4 MEMRISE courses
MEMRISE is a platform for vocabulary-building training. Words are presented and the student's progress is tested while learning. On this platform I have created several courses. The courses aim at learning the words by storing them in long-term memory. Obviously this requires frequent use of the platform.

The first course is actually a set of four courses that together contain all 2000 words of the Arabic Basic Vocabulary mentioned above. I divided the words over four courses so that every user can start at their own level. My advice is to start with the course Arabic Basic Vocabulary 1 (ABV1). If it is clear that you already master the 500 most frequent (and very basic!) words, you can proceed to the next course, ABV2, and so on.

The second MEMRISE course contains all the glosses provided with the texts of the Classical Arabic Text Collection (CATC). Words are grouped in different levels containing 30 words each. The goal of this course was described above.

The Memrise courses mentioned above cover the vocabulary of the CATC. A similar set of courses is available for the MATC.
The first MATC tool is a group of 5 courses (grouped together in the group Top Media Arabic Vocabulary 2600 words) containing all words of the Top Media Vocabulary, a list of 2560 words as described above. This link will lead you to the first course.
Another Memrise course contains all words that were glossed in the reading texts of the Intermediate (First) Level of the MATC from the subsets PAL (Palestine) and TUN (Tunisia). This is the course Media Arabic Text Collection Intermediate Level, which contains 400 words, one for each of the 400 unique glosses created to explain words that are not part of the TMV.
You should use this course to memorise these 400 words (in addition to the 2600 words of the TMV) if you want to proceed to the Advanced Level of the MATC.
The vocabulary build-up of the Advanced Level of the MATC is based on the same principle as the CATC, i.e. a cumulative reference list. This means that a gloss for a new word (i.e. a word that has not occurred before in the MATC) is given only once, when the word occurs for the first time. If the word occurs again in another text, you are supposed to know it. To help you memorise these new words, another Memrise course was created: Media Arabic Text Collection Adv Level (growing). The word 'growing' in brackets means the list will grow with every new text that is added to the MATC AL.
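
The cumulative principle can be illustrated with a minimal sketch in Python. This is only an illustration of the idea, not the procedure actually used to prepare the collections. The function name and the transliterated toy words are hypothetical, and the sketch assumes that each text has already been split into dictionary forms.

```python
def select_glosses(texts, reference_list):
    """Return, for each text in order, the words that would receive a gloss.

    A word is glossed only if it is not in the reference list and has not
    been glossed in an earlier position in the collection.
    """
    already_known = set(reference_list)
    glosses_per_text = []
    for tokens in texts:
        new_words = []
        for word in tokens:
            if word not in already_known:
                new_words.append(word)
                # Glossed once; from here on the reader is supposed to know it.
                already_known.add(word)
        glosses_per_text.append(new_words)
    return glosses_per_text


# Hypothetical toy data in transliteration; NOT the real reference list.
reference_list = ["qâla", "kabîr", "jadîd"]
texts = [
    ["qâla", "wazîr", "kabîr", "wazîr"],
    ["wazîr", "jadîd", "safîr"],
]

for number, glosses in enumerate(select_glosses(texts, reference_list), start=1):
    print(number, glosses)
```

In this toy example 'wazîr' is glossed in the first text only, so a reader who skips the first text meets it in the second text without a gloss. This is why learning the words of each text before moving on matters.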

To summarise the collection of Memrise courses:
Learners who go through the whole sequence of the Media Arabic Text Collection should in the end have acquired 3280 words using the following vocabulary lists (including Memrise courses):
Learners with an interest in Classical Arabic, i.e. users of the CATC, should at the present stage (November 22, with all 20 Alf Layla wa Layla texts available) have acquired 3300 words using the following vocabulary lists (including Memrise courses):

5 List of phraseology in the MATC
Phraseology consists of combinations of two or more words that frequently occur together. On the importance of learning this special category of vocabulary I again quote Karin Ryding:
Vocabulary learning is not just word learning. It is learning predictable chunks of language that are meaningful, frequent, and useful for both self-expression and for reading and listening skills... She (Moon) refers to the "idiom principle" of structural patterning, that is, "a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments" (Moon 1998, 42, quoting Sinclair 1987, 319-325). (Ryding p. 198).


So phraseology is important, and Media Arabic has many frequent combinations and expressions. I have written about this on the page introducing the MATC.
I have brought together all phraseology occurring in the MATC in a single Excel table, which can be downloaded from the page on collocations.
It should be mentioned that phraseology is far less frequent in literary texts, and especially in the Classical Arabic literary texts of the CATC.

6 Information on Arabic dictionaries
The ultimate tool for learning vocabulary is obviously the dictionary. As a lexicographer and dictionary maker, I can of course only recommend the dictionaries I have worked on myself.
You can read all about the Arabic-Dutch (and Dutch-Arabic) dictionary here and the Arabic-English (and English-Arabic) dictionary of Oxford University Press here.
As indicated, I am not neutral, but I am confident every academically trained teacher of Arabic as a second language will agree that these dictionaries are the best available at this moment (2022).


Tressy Arts, editor-in-chief of the OAD, and yours truly in 2014, when the OAD was presented.