Persian Language Resources Based on Dependency Grammar
Mohammad Sadegh Rasooli
email@example.com
November 2012

Outline
Iran and Persian Language: An overview
Challenges in Persian Language Processing
Persian Resources Based on Dependency Grammar

Iran and Persian Language: An overview

Meaning
Iran: Land of nobles.
Persia: Land of the Persian people.
Persian (Parsi): People from the Aryan (Arian) tribe.
Arya (Aria): Noble (the people who lived on the plateau of Iran).
Persian language: The language spoken by the Persian people.

Iran Map through History
http://en.wikipedia.org/wiki/Greater-Iran

Iran Ethno-religious Distribution

Persian Language in History
First known as the Pahlavi language, written in Pahlavi script.
Pahlavi script is very similar to Indian scripts.

Persian Language in History
After Islam, Pahlavi script was replaced by Arabic script with 4 additional characters.
Now, Arabic script is also used on Iran's official flag.
In the middle:
On the horizontal sides:

What is Farsi
In standard Arabic there is no p sound.
For 2 centuries, Iran was governed by Arab governors.
Parsi became Farsi so that Arab people could pronounce it more easily.
Prophet Mohammad: Even if knowledge is in the skies, people from Fars will gain that knowledge (Behar-al-anvar, 1, 195).

Persian Language
An Indo-European language.
Written in Arabic script, from right to left.
Spoken by about 100 million people.
Persian is now the official language of Iran, Afghanistan and Tajikistan.
In Tajikistan, it is written in Cyrillic script, e.g. /naezdik/.

Challenges in Persian Language Processing

Challenges
Lack of annotated data
Colloquial language
Orthography
Morphology
Syntax

Lack of Annotated Data
For many open problems in NLP, there is no available Persian corpus.
Rule-based models for Persian have not led to promising results.

Colloquial Language
Most people use it in speech and even in informal writing.
/miXAhaed/ (he wants) becomes /miXAd/
/miSaevaed/ (it becomes) becomes /miSe/

Orthography
Diacritics are usually not written (except for manual disambiguation): /ae/ /e/ /o/
The written form /s r/ can be read as:
/sor/: slippery
/saer/: head
/ser/: secret
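
The diacritics ambiguity above can be illustrated with a minimal sketch that works on the slide's Latin transliteration rather than on Persian script. It assumes that the three short vowels /ae/, /e/ and /o/ are simply dropped in writing, which is a simplification (for instance, it ignores how word-initial vowels are written). The function names are hypothetical and are not part of any tool described in the talk.

```python
from collections import defaultdict

# Short vowels that are normally left unwritten (assumption taken from the slide).
# "ae" must be removed before "e" so the digraph is not split.
SHORT_VOWELS = ("ae", "e", "o")


def written_skeleton(transliteration):
    """Return the transliteration with the unwritten short vowels removed."""
    for vowel in SHORT_VOWELS:
        transliteration = transliteration.replace(vowel, "")
    return transliteration


def group_by_skeleton(words):
    """Group pronunciations that collapse to the same written form."""
    groups = defaultdict(list)
    for word in words:
        groups[written_skeleton(word)].append(word)
    return dict(groups)


readings = group_by_skeleton(["sor", "saer", "ser"])
print(readings)
```

All three pronunciations fall into the same group, which is the ambiguity a reader or an NLP system has to resolve from context.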

Orthography
Some characters have more than one encoding.
Affixes are written in multiple shapes (depending on the writer's style):
/ / I say
/ / / Libraries

Semi-space (zero-width non-joiner) is used to attach the parts of a single word, but many people (even experts) do not use it properly.
vs. /mey/ means wine in Persian
I say vs. I say wine
vs. /taer/ means wet in Persian
better vs. good wet

People do not regularly use punctuation between phrases.
Example (no punctuation, no diacritics): /to/ /ketAb/
/ketAb/ /e/ /to/: Your book
/ketAb/ , /to/: book, you
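
The encoding and semi-space issues above might be handled in a preprocessing step along the following lines. This is only a sketch under stated assumptions: the two character pairs (Arabic Yeh vs. Farsi Yeh, Arabic Kaf vs. Keheh) are common examples of double encodings chosen here for illustration, not a list given in the talk, and the prefix rule is deliberately naive. The function names are hypothetical.

```python
import re

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, the "semi-space"

# Arabic code points often typed in place of their Persian counterparts
# (illustrative subset, an assumption of this sketch).
ARABIC_TO_PERSIAN = {
    "\u064a": "\u06cc",  # ARABIC LETTER YEH  -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF  -> ARABIC LETTER KEHEH
}
_TRANSLATION_TABLE = str.maketrans(ARABIC_TO_PERSIAN)

MI_PREFIX = "\u0645\u06cc"  # the verbal prefix /mi/ (meem + Farsi yeh)


def unify_characters(text):
    """Map the variant code points above to a single Persian code point."""
    return text.translate(_TRANSLATION_TABLE)


def attach_mi_prefix(text):
    """Replace '/mi/ + space(s)' at the start of a token with '/mi/ + ZWNJ'.

    Naive on purpose: it cannot tell the prefix from the noun /mey/ (wine),
    which is spelled the same way.
    """
    pattern = r"(?<!\S)" + MI_PREFIX + r" +(?=\S)"
    return re.sub(pattern, MI_PREFIX + ZWNJ, text)


def normalize(text):
    """Unify characters first, so the prefix rule sees a single spelling."""
    return attach_mi_prefix(unify_characters(text))
```

Because the prefix and the word for wine share one spelling, a rule like `attach_mi_prefix` will over-apply; deciding correctly would need lexical or syntactic information of the kind the later slides motivate.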

Orthography
Some Arabic characters have the same pronunciation in Persian: /s/ /t/ /z/
This causes ambiguity in speech processing, spell checking, etc.

Morphology
Persian has rich morphology, though not as rich as that of Arabic and Turkish.
/tehrAnihAyeSan/: Theirs that are from Tehran
/zadeaemeSAn/: I have hit them
Arabic words cause irregularity in nouns and verbs.
Verbs are the most challenging problem in Persian morphology.
Types of Persian verbs:
Simple
Prefix verb
Compound verb
Prefix compound verb
Prepositional phrase verb

Morphology
Usually, each verb has two lemmas: 1) present and 2) past.
/goft/ -to speak- (past)
/gu/ -to speak- (present)
Verbs (when inflected) can have more than one token:
/goft/: He told
/gofte aest/: He has told
/gofte Xahaed Sod/: It will be told

Compound verbs: a noun (non-verbal element) with a light verb:
: speaking
: to do
: to speak
Compound verbs can have long-distance dependencies (other words can appear between the non-verbal element and the light verb):
I spoke with you
Non-verbal elements can also be inflected:
I spoke with you a lot

Syntax
Two major problems:
Pro-drop: subjects can be omitted easily.
Free word order: usually SOV, but other orders are acceptable. Lots of crossings in syntactic trees.

Persian Resources Based on Dependency Grammar

Motivation
We developed a spell checker, but there was no syntactic analysis.
There were no syntactic treebanks or lexicons.
We decided to create:
A verb valency lexicon (Rasooli et al., 2011), which records what types of complements each verb takes. More than 4000 verb entries.
A syntactic treebank.
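
The "crossings in syntactic trees" mentioned under Syntax are what a dependency treebank records as non-projective arcs. The sketch below shows one common way to detect them; the head-array representation (one head index per token, 0 for the artificial root) is an assumption made here for illustration and is not necessarily the format of the treebank described in the talk. The function name is hypothetical.

```python
def crossing_arcs(heads):
    """Return all pairs of dependency arcs that cross each other.

    heads[i] is the 1-based position of the head of token i + 1;
    0 stands for the artificial root placed before the first token.
    """
    # Represent every arc by its left and right end point.
    arcs = [(min(head, dep), max(head, dep))
            for dep, head in enumerate(heads, start=1)]
    crossings = []
    for i, (a, b) in enumerate(arcs):
        for c, d in arcs[i + 1:]:
            # Two arcs cross when exactly one end point of one arc
            # lies strictly inside the span of the other.
            if a < c < b < d or c < a < d < b:
                crossings.append(((a, b), (c, d)))
    return crossings


# A tree without crossings: tokens 1 and 3 both depend on token 2 (the root).
print(crossing_arcs([2, 0, 2]))

# A tree with crossings: token 1 depends on 3, token 2 depends on 4.
print(crossing_arcs([3, 4, 0, 3]))
```

The pairwise check is quadratic in sentence length, which is usually acceptable for treebank statistics; a tree is projective exactly when the returned list is empty.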

References
Oflazer, Kemal, Bilge Say, Dilek Zeynep Hakkani-Tür, and Gökhan Tür. "Building a Turkish treebank." Treebanks (2003): 261-277.
Rasooli, Mohammad Sadegh, Amirsaeid Moloodi, Manouchehr Kouhestani, and Behrouz Minaei-Bidgoli. "A syntactic valency lexicon for Persian verbs: The first steps towards Persian dependency treebank." In 5th Language & Technology Conference (LTC): Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 227-231. 2011.
Zeman, Daniel, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský, and Jan Hajič. "HamleDT: To parse or not to parse?" In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey. 2012.