
Adawat

Adawat, Arabic Language Toolkit

Adawat is a toolkit that provides multiple functions, such as:

  • Tashkeel
    • tashkeel : vocalize text; we recommend using mishkal-console instead.
    • tashkeel with suggestions for every word.
    • reduce : strip unnecessary tashkeel from a vocalized text
    • strip : remove all harakat and shadda
    • compare : compare tashkeel between the input text and the automatically vocalized text
  • Transformation and Conversion
    • romanize : convert Arabic-script text to a Latin representation
    • arabize : convert transliterated Arabic text to Arabic script
    • inverse : reverse text
    • numbers to words : convert a numeric value to words
    • normalize : normalize letters in Arabic text
    • unshape : unshape Arabic letters
  • Analysis and generation
    • stem : morphological analysis of given texts
    • tokenize : tokenize a text to words
    • wordtag : classify words into (nouns, verbs, stopwords)
    • affixate : generate all word forms by affixation
  • Extraction
    • collocation : extract collocations from text
    • language : detect Arabic and Latin clauses in text
    • named : extract named entities from text
    • numbered : extract numbered clauses from text
  • Miscellaneous
    • affixate : generate all word forms by affixation
    • poetry : format poetry texts to columns poetry
    • random : get a random text
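
To make the "strip" function concrete, here is a minimal Python sketch of removing all harakat and shadda from a vocalized text. It is not Adawat's own code: the helper name strip_tashkeel and the chosen Unicode range (U+064B to U+0652) are assumptions made for illustration, and a real tool may handle more marks.

```python
import re

# Assumed set of marks to remove: U+064B..U+0652, i.e. the three tanwin
# forms, fatha, damma, kasra, shadda and sukun.
TASHKEEL = re.compile("[\u064B-\u0652]")


def strip_tashkeel(text: str) -> str:
    """Return text with harakat and shadda removed (hypothetical helper)."""
    return TASHKEEL.sub("", text)


# kaf + fatha, ta + fatha, ba + fatha (the vocalized word "kataba")
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
stripped = strip_tashkeel(vocalized)
print(stripped)
```

Letters and any non-Arabic characters pass through unchanged, so the same helper can be applied to mixed Arabic and Latin text.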
Licensed under CC BY-SA 4.0