Phraseology in the MATC

In the Media Arabic Text Collection (MATC), phraseology is a key component: every text has a tab dedicated to it. These tabs distinguish two subcategories of phraseology: collocations and multi-word expressions (MWEs). You can read more about this on the MATC explanation page.

If you want to speak and write Arabic correctly, you need a solid command of phraseology. It is also an important step towards higher levels of proficiency.

To help you benefit from the phraseology in the MATC, I have collected all the phraseology data from the texts and merged them into one Excel table. Since the MATC is still growing, this database will grow accordingly, and the downloadable file will be updated on a regular basis. Updates will be announced on the MATC News page.

To let users get the most out of this data, I have made the original Excel database available for download. I ask you to respect the Creative Commons licence conditions mentioned at the bottom of this page. You can download the file here.
In addition, I have made a PDF available for download for those who are not familiar with working in Excel.

What can you do with the Phraseology file? (not exhaustive)



Description of the columns in the database
A Arabic collocation (or MWE)
B English translation
C source text
D type of phraseology (if not MWE, it is a collocation; see below for an explanation of the various categories of collocations)
E first keyword (the database is sorted by this field; you can change the sort order yourself)
F root of first keyword
G second keyword (can be empty)
H root of second keyword
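
For readers who prefer to work with the data programmatically, the column layout above can be mirrored in a small record type. The following is only a sketch: the names PhraseEntry, row_to_entry, by_type and by_root are hypothetical, and it assumes each row has already been read from the Excel file (or a CSV export of it) as a list of up to eight cell values. Apart from the Arabic phrases and their type labels, the sample cell values (source, keywords, roots) are placeholders and not taken from the actual table.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class PhraseEntry:
    """One row of the phraseology table, columns A to H."""
    arabic: str      # A: Arabic collocation (or MWE)
    english: str     # B: English translation
    source: str      # C: source text
    ptype: str       # D: type of phraseology
    key1: str        # E: first keyword
    root1: str       # F: root of first keyword
    key2: str = ""   # G: second keyword (can be empty)
    root2: str = ""  # H: root of second keyword


def row_to_entry(cells: Sequence[Optional[str]]) -> PhraseEntry:
    """Map up to eight cell values (A..H) onto a PhraseEntry.

    Missing or empty cells become empty strings.
    """
    padded = [(c or "").strip() for c in cells] + [""] * 8
    return PhraseEntry(*padded[:8])


def by_type(entries: List[PhraseEntry], ptype: str) -> List[PhraseEntry]:
    """Return the entries whose type column equals ptype."""
    return [e for e in entries if e.ptype == ptype]


def by_root(entries: List[PhraseEntry], root: str) -> List[PhraseEntry]:
    """Return the entries in which either keyword has the given root."""
    return [e for e in entries if root in (e.root1, e.root2)]


# Illustrative rows only: source, keywords and roots are placeholders.
sample_rows = [
    ["سَنَّ قانونًا", "to enact a law", "(sample)", "NOV", "قانون", "قنن", "سنّ", "سنن"],
    ["إِجْراءٌ صارِمٌ", "a strict measure", "(sample)", "NA", "إجراء", "جري", "صارم", "صرم"],
]
entries = [row_to_entry(r) for r in sample_rows]

for e in by_type(entries, "NOV"):
    print(e.arabic, "|", e.english)
```

Filtering by root rather than by keyword can be useful when you want to see all phraseology around one word family at once.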

Remarks
Frequent collocations may occur more than once in the merged version. You can read this as an indication of their frequency and importance.
By default the database is sorted by the first keyword. The first keyword is generally the most specific word, i.e. the lemma under which you would look up the combination in a dictionary.
The second keyword of an MWE may be chosen more or less arbitrarily.
An MWE may have only one keyword if its other constituents are function words.
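
The first remark suggests reading repeated rows as a rough frequency signal. A minimal sketch of how such a count could be made is given here; the function name collocation_counts is hypothetical, and the input is simply the list of noun + noun examples given further down this page rather than the real table.

```python
from collections import Counter
from typing import Iterable


def collocation_counts(phrases: Iterable[str]) -> Counter:
    """Count how often each phrase occurs, ignoring empty cells."""
    return Counter(p.strip() for p in phrases if p and p.strip())


# The NN examples from this page; this phrase is listed twice there.
draft_law = "مَشْروعُ قانونٍ"
nn_examples = [
    "اتِّخاذُ إِجْراءاتٍ",
    "سَنُّ القانونِ",
    draft_law,
    "بِموجِبِ القانونِ",
    "إِقْرارُ قانونٍ",
    "إِقْرارُ القانونِ",
    draft_law,
    "تَنْفيذُ القانونِ",
]

counts = collocation_counts(nn_examples)
for phrase, n in counts.most_common(3):
    print(n, phrase)
```

Note that this counts exact strings, so the indefinite and definite variants of a phrase are treated as different items.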

You can download the Excel file here or the PDF here.

More about Collocations


Collocations represent the vast majority of phraseology items in the MATC. I have added information about the different categories of collocations in the 'Type' column of the phraseology table. To explain these categories, I will first give a short introduction to collocations.

Collocations are combinations of two words that are related to each other. There are grammatical collocations and lexical collocations. The first type consists, for example, of a verb and a preposition; I do not treat this type in the MATC. The second type, lexical collocations, consists of two words related to each other on the basis of their meanings. English examples are 'to commit a crime' and 'to perpetrate a crime', which combine a verb with a noun as its object. Another example is 'a vicious crime', in which the same noun combines with an adjective.
In media language, lexical collocations are very frequent. When laws are mentioned, for example, they are often violated or maintained (law as object of a verb). A law can forbid or allow something (law as subject of a verb), and a law can be severe or controversial (law modified by an adjective). All these lexical collocations exist in Arabic too. Based on the part of speech (POS) of their two constituents, they can be classified into different categories.

In the introduction above we have already seen:
NOV, noun as object + verb
NSV, noun as subject + verb
NA, noun + adjective

These three types are the most frequent in Media Arabic (and probably in language in general). But there are more categories:

Less frequent categories:

In the following section I will present a few examples for every category. Where the table contains them, the examples are combinations with its two most frequent nouns: qânûn ('law') and 'ijrâ' ('measure').

NOV, noun as object + verb
فَرَضَ إِجْراءاتٍ
اِتَّخَذَ إِجْراءاتٍ
سَنَّ قانونًا

NSV, noun as subject + verb
حَظَرَ القانونُ
كَشَفَ التَّقْريرُ أَنَّ
وَقَعَتْ الحادِثَةُ
انْدَلَعَتْ الحَرْب
اسْتَغْرَقَتْ الزّيارَةُ

NA, noun + adjective
إِجْراءاتٌ قانونيَّةٌ
إِجْراءاتٌ لازِمَةٌ
إِجْراءٌ صارِمٌ
إِجْراءٌ أَمْنيٌّ
إِجْراءاتٌ اسْتِثْنائيَّةٌ
القانونُ الدَّوْليُّ
قانونٌ صادِرٌ عَنْ
القانونُ الِانْتِخابيُّ
القانونُ الأَساسيُّ

NN, noun + noun (construct phrase, iDâfa)
اتِّخاذُ إِجْراءاتٍ
سَنُّ القانونِ
مَشْروعُ قانونٍ
بِموجِبِ القانونِ
إِقْرارُ قانونٍ
إِقْرارُ القانونِ
مَشْروعُ قانونٍ
تَنْفيذُ القانونِ
In the table you will find two different categories, N1N2 and N2N1. N1N2 means that the first word of the construct phrase is given in the first keyword column of the table (Key1) and the second word in the second keyword column (Key2). With N2N1 it is the other way round: the second word of the construct phrase (in general the most specific word) is given in the first column. This first column corresponds to the dictionary lemma under which I would include the collocation. So the difference between N2N1 and N1N2 is only relevant for dictionary makers ;-).
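
The N1N2 / N2N1 distinction can be stated as a tiny rule. The sketch below is only an illustration of that rule: the function name construct_keys is hypothetical, and the choice of N2N1 for the example phrase is an assumption made for the sake of the example, not a value read from the table.

```python
from typing import Tuple


def construct_keys(first_word: str, second_word: str, ptype: str) -> Tuple[str, str]:
    """Return (Key1, Key2) for a two-word construct phrase.

    N1N2: the first word of the phrase is Key1.
    N2N1: the second word of the phrase is Key1.
    """
    if ptype == "N1N2":
        return first_word, second_word
    if ptype == "N2N1":
        return second_word, first_word
    raise ValueError(f"not a construct-phrase type: {ptype!r}")


# 'draft law': assuming N2N1, the entry is filed under the word for 'law'.
key1, key2 = construct_keys("مشروع", "قانون", "N2N1")
print("Key1:", key1, "| Key2:", key2)
```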

NPN, noun (masdar) + preposition + noun
المُطالَبَةُ بِحُقوقٍ
الدِّفاع عَنْ الحُقوقِ
حُكْمٌ بِالْإِعْدامِ
التَّوَصُّلُ إِلَى حَلٍّ
الإِصابَةُ بِرَصاصٍ

NPV, noun as indirect object + verb with preposition
قامَ بِإِجْراءٍ
وافَقَ عَلَى قانونٍ
تَعَرَّضَ لِلتَّمْييزِ
أُصيبَ بِجُروحٍ
اكْتَرَثَ بِحُقوقِ الإِنْسانِ
اعْتَرَفَ بِدَوْلَةٍ
أُصيبَ بِرَصاصٍ
أَثَّرَ عَلَى السَّلامَةِ

VAdv, verb + adverb
بَقِيَ ساريًا
ماتَ جوعًا

AAdv, adjective + adverb
كَبير جِدًّا
مُتَأَّخِر قَليلًا
These are two theoretical examples; at present there are no collocations of this category in the list.

AN, adjective + noun
أُحاديّ الجانِبِ

APN, adjective (participle) + preposition + noun
مُتَّهَمٌ بِالتَّوَرُّطِ
مُتَأَثِّرٌ بِجُروحٍ
مُسْتَوْرِدٌ مِنْ دُوَلٍ

NAdv, noun + adverb (circumstantial accusative)
الوُقوعُ فَريسَةً
المُلاحَقَة قَضائيًّا

PAIR, a pair of two words, more or less synonyms, frequently mentioned together
الأَمْن والِاسْتِقْرار

Collocations combined

You will understand that collocations of different categories can be combined.
A noun can collocate with a verb (NOV, NSV, NPV) and with an adjective (NA). So this can lead to a combined collocation, for example:
NOV+NA = Verb + Noun(object of verb) + Adjective (modifying the noun)
Obviously the same applies to the categories NSV and NPV.
And other combinations are possible:
N1N2+NA = Noun + Noun (construct phrase) + Adjective (modifying noun1 or noun2)
N1N2+N1N2 = Noun + Noun + Noun (construct phrase of 3 components)
NA+NA = Noun + Adjective + Adjective (noun modified by 2 adjectives)
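
A combined label such as NOV+NA can be taken apart mechanically. The following sketch assumes that combined types are written with a plus sign, as in the formulas above; the names CATEGORIES, split_combined and describe are hypothetical, and the short descriptions are paraphrased from the category headings on this page.

```python
from typing import Dict, List

# Category labels used on this page, with short descriptions.
CATEGORIES: Dict[str, str] = {
    "NOV": "noun as object + verb",
    "NSV": "noun as subject + verb",
    "NA": "noun + adjective",
    "NN": "noun + noun (construct phrase)",
    "N1N2": "construct phrase, first noun as first keyword",
    "N2N1": "construct phrase, second noun as first keyword",
    "NPN": "noun (masdar) + preposition + noun",
    "NPV": "noun as indirect object + verb with preposition",
    "VAdv": "verb + adverb",
    "AAdv": "adjective + adverb",
    "AN": "adjective + noun",
    "APN": "adjective (participle) + preposition + noun",
    "NAdv": "noun + adverb",
    "PAIR": "pair of near-synonyms",
}


def split_combined(label: str) -> List[str]:
    """Split a combined label like 'NOV+NA' into its component categories."""
    parts = [p.strip() for p in label.split("+")]
    unknown = [p for p in parts if p not in CATEGORIES]
    if unknown:
        raise ValueError(f"unknown category label(s): {unknown}")
    return parts


def describe(label: str) -> str:
    """Spell out each component of a (possibly combined) label."""
    return "; ".join(f"{p} = {CATEGORIES[p]}" for p in split_combined(label))


print(describe("NOV+NA"))
print(split_combined("NOV+NN+NA"))
```

Rejecting unknown labels, rather than passing them through silently, makes typing errors in the type column easier to spot.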

Here I will present a few examples taken from texts in the MATC:
NOV+NA
فَتَحَ تَحْقيقًا قَضائيًّا

N1N2+NA
أستاذ العلوم السياسية

N1N2+N1N2
ٱنْسِحابِ قُوّاتِ ٱلِٱحْتِلالِ
قرار مجلس الأمن

NA+NA
المؤسسات الإقليمية والدولية
الجهود الدبلوماسية مستمرة

In addition you might want to know that combinations of more than two collocations are possible too. Here I will present one example of a series of 3 collocations: NOV+NN+NA (verb+noun+noun+adjective):
اتَّخَذَ إجْراءاتِ أَمْنٍ صارِمةً



Arabic Media Text Collection by Jan Hoogland is licensed under  CC BY-NC-ND 4.0