Persian Language Resources Based on Dependency Grammar


Mohammad Sadegh Rasooli
rasooli@cs.columbia.edu
November 2012

Outline
Iran and Persian Language: An Overview
Challenges in Persian Language Processing
Persian Resources Based on Dependency Grammar


Iran and Persian Language: An Overview

Meaning
Iran: Land of nobles.
Persia: Land of the Persian people.
Persian (Parsi): People from the Aryan (Arian) tribe.
Arya (Aria): Noble (the people who lived on the plateau of Iran).
Persian language: The language spoken by the Persian people.

Iran Map through History
http://en.wikipedia.org/wiki/Greater-Iran

Iran Ethno-religious Distribution

Persian Language in History
First known as the Pahlavi language, written in Pahlavi script.
Pahlavi script is very similar to Indian scripts.

Persian Language in History
After Islam, Pahlavi script was replaced by Arabic script with 4 additional characters.
Arabic script is now also used on Iran's official flag, both in the middle and on the horizontal sides.

What is Farsi
Standard Arabic has no “p” sound.
For 2 centuries, Iran was governed by Arab governors.
Parsi became Farsi so that Arabic speakers could pronounce it more easily.
Prophet Mohammad: Even if knowledge is in the skies, people from Fars will gain that knowledge (Behar-al-anvar, 1, 195).

Persian Language
An Indo-European language.
Written in Arabic script, from right to left.
Spoken by about 100 million people.
Persian is now the official language of Iran, Afghanistan and Tajikistan.
In Tajikistan, it is written in Cyrillic script, e.g. /naezdik/.

Challenges in Persian Language Processing

Challenges
Lack of annotated data
Colloquial language
Orthography
Morphology
Syntax

Lack of Annotated Data
For many open problems in NLP, no Persian corpus is available.
Rule-based models for Persian have not led to promising results.

Colloquial Language
Most people use it in speech and even in informal writing.
Formal /miXAhaed/ (he wants), colloquial /miXAd/
Formal /miSaevaed/ (it becomes), colloquial /miSe/

Orthography
Diacritics (/ae/, /e/, /o/) are usually omitted, except for manual disambiguation.
The same written form /s r/ can be read as:
/sor/: slippery
/saer/: head
/ser/: secret
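The diacritics problem can be illustrated with a minimal sketch over the slide's own transliterations. It assumes that the short vowels are exactly /ae/, /e/ and /o/ as listed above, and that dropping them approximates the undiacritized written form; the function names are hypothetical.

```python
from collections import defaultdict

# Short vowels as transliterated on the slide. "ae" comes first so that
# its "e" is not stripped on its own, which would leave a stray "a".
SHORT_VOWELS = ("ae", "e", "o")


def skeleton(translit):
    """Approximate the undiacritized written form of a transliterated word."""
    for vowel in SHORT_VOWELS:
        translit = translit.replace(vowel, "")
    return translit


def group_by_written_form(words):
    """Group transliterated words that collapse to the same written form."""
    groups = defaultdict(list)
    for word in words:
        groups[skeleton(word)].append(word)
    return dict(groups)


readings = group_by_written_form(["sor", "saer", "ser"])
print(readings)  # {'sr': ['sor', 'saer', 'ser']}
```

All three readings collapse to one written form, which is why a tagger or spell checker has to rely on context to choose between them.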

Orthography
Some characters have more than one encoding.
Affixes are written in multiple shapes, depending on the writer's style. Examples: the written variants of “I say” and of “Libraries”.

Orthography
The semi-space (zero-width non-joiner) is used to attach the parts of a single word, but many people (even experts) do not use it properly.
/mey/ means “wine” in Persian: “I say” vs. “I say wine”.
/taer/ means “wet” in Persian: “better” vs. “good wet”.

Orthography
People do not regularly use punctuation between phrases.
Example (no punctuation, no diacritics): /to/ /ketAb/
/ketAb/ /e/ /to/: “Your book”
/ketAb/ , /to/: “book, you”
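The encoding and semi-space issues above can be sketched as a small normalization step. This is a rough illustration, not the method used in the talk: the two code-point pairs are common Arabic/Persian look-alikes chosen here as an assumption, and the prefix-joining heuristic is hypothetical. Because the same letters can also spell /mey/ (“wine”), blind joining can be wrong.

```python
# Arabic code points that are often typed in place of the Persian letters.
ARABIC_TO_PERSIAN = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}
_TRANSLATION = str.maketrans(ARABIC_TO_PERSIAN)

ZWNJ = "\u200C"           # zero-width non-joiner (the "semi-space")
MI_PREFIX = "\u0645\u06CC"  # the letters meem + Farsi yeh


def normalize_chars(text):
    """Map look-alike Arabic code points to their Persian counterparts."""
    return text.translate(_TRANSLATION)


def attach_prefix(tokens, prefix=MI_PREFIX):
    """Join a free-standing prefix to the following token with a ZWNJ.

    Naive heuristic: it cannot tell the verbal prefix from the noun that
    is spelled the same way, so real systems need context to decide.
    """
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == prefix and i + 1 < len(tokens):
            out.append(prefix + ZWNJ + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


# Usage: normalize first, then repair the missing semi-space.
raw = "\u0645\u064A \u06AF\u0648\u064A\u0645"  # typed with Arabic yeh and a space
tokens = normalize_chars(raw).split(" ")
print(attach_prefix(tokens))
```

Normalizing code points before any dictionary lookup keeps two visually identical spellings from being treated as different words.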

Orthography
Some Arabic characters have the same pronunciation in Persian: /s/, /t/, /z/.
This causes ambiguity in speech processing, spell checking, etc.

Morphology
Persian has rich morphology, though not as rich as Arabic and Turkish.
/tehrAnihAyeSan/: “Theirs that are from Tehran”
/zadeaemeSAn/: “I have hit them”
Arabic words cause irregularity in nouns and verbs.

Morphology
Verbs are the most challenging problem in Persian morphology.
Types of Persian verbs:
Simple
Prefix verb
Compound verb
Prefix compound verb
Prepositional phrase verb
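Returning to the homophone letters noted under Orthography: one way a spell checker might handle them is to map every letter in a class to a single representative and compare the resulting keys. The sketch below is an assumption-laden illustration. The slide's own letter examples did not survive, so the letter sets here are the standard ones for /s/, /t/ and /z/, supplied for illustration, and the function names are hypothetical.

```python
# Letters that share one pronunciation in Persian (assumed sets, see lead-in).
HOMOPHONE_CLASSES = {
    "s": "\u0633\u0635\u062B",        # seen, sad, theh
    "t": "\u062A\u0637",              # teh, tah
    "z": "\u0632\u0630\u0636\u0638",  # zain, thal, dad, zah
}

# Map each letter to the first letter of its class.
_TO_REPRESENTATIVE = {
    ord(letter): letters[0]
    for letters in HOMOPHONE_CLASSES.values()
    for letter in letters
}


def phonetic_key(word):
    """Collapse homophone letters so same-sounding spellings compare equal."""
    return word.translate(_TO_REPRESENTATIVE)


def may_be_confused(a, b):
    """True if two different spellings differ only in homophone letters."""
    return a != b and phonetic_key(a) == phonetic_key(b)


# Two spellings that differ only in the letter used for /s/.
print(may_be_confused("\u0635\u062F", "\u0633\u062F"))  # True
```

A key like this only proposes candidates; choosing the intended word still needs a lexicon and context.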


Morphology
Usually, each verb has two lemmas: 1) present and 2) past.
/goft/: to speak (past)
/gu/: to speak (present)
Verbs, when inflected, can have more than one token:
/goft/: “He told”
/gofte aest/: “He has told”
/gofte Xahaed Sod/: “It will be told”

Morphology
Compound verbs: a noun (non-verbal element) with a light verb: “speaking” + “to do” = “to speak”.
Compound verbs can have long-distance dependencies: other words can occur between the non-verbal element and the light verb, as in “I spoke with you”.

Morphology
Non-verbal elements can also be inflected, as in “I spoke with you a lot”.
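The two-lemma and compound-verb points above can be sketched with naive prefix matching. The /goft/ and /gu/ pair comes from the slide; the compound-verb tokens (/sohbaet/, /kaerdaem/) are hypothetical transliterations chosen for illustration, since the slide's Persian examples did not survive. Prefix matching will over-match in practice, so treat this as a toy rather than a morphological analyzer.

```python
# Past lemma -> present lemma (the pair given on the slide).
VERB_LEMMAS = {"goft": "gu"}


def lemmas_for(token):
    """Return (past, present) if the token starts with a known lemma."""
    for past, present in VERB_LEMMAS.items():
        if token.startswith(past) or token.startswith(present):
            return (past, present)
    return None


def find_compound(tokens, nonverbal, light_stem):
    """Locate a compound verb whose parts may be separated by other words.

    Returns (i, j): the positions of the non-verbal element and of the
    light verb, or None if no such pair is found. Matching by prefix lets
    an inflected non-verbal element still be recognized.
    """
    for i, token in enumerate(tokens):
        if token.startswith(nonverbal):
            for j in range(i + 1, len(tokens)):
                if tokens[j].startswith(light_stem):
                    return (i, j)
    return None


# Multi-token verb from the slide: both tokens belong to one verb form.
print(lemmas_for("gofte"))  # ('goft', 'gu')

# Hypothetical sentence with two words between the compound's parts.
sentence = ["sohbaet", "bA", "to", "kaerdaem"]
print(find_compound(sentence, "sohbaet", "kaerd"))  # (0, 3)
```

The gap between the two returned positions is exactly the long-distance dependency that a dependency treebank has to annotate.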

Syntax
Two major problems:
Pro-drop: subjects can be omitted easily.
Free word order: usually SOV, but other orders are acceptable, which leads to many crossings in syntactic trees.

Persian Resources Based on Dependency Grammar

Motivation
We developed a spell checker, but there was no syntactic analysis.
There were no syntactic treebanks or lexicons.
We decided to create:
A verb valency lexicon (Rasooli et al., 2011), which records what types of complements each verb takes. It has more than 4000 verb entries.
A syntactic treebank.
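A valency lexicon entry of the kind described above could be modeled roughly as follows. This is a hypothetical data layout, not the format of the published lexicon: the field names, the complement labels and the example entry are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Complement:
    kind: str                # e.g. "subject", "object" (assumed labels)
    obligatory: bool = True


@dataclass
class VerbEntry:
    past_lemma: str
    present_lemma: str
    nonverbal: str = ""      # non-verbal element, for compound verbs
    complements: list = field(default_factory=list)

    def obligatory_kinds(self):
        """Kinds of complement this verb requires."""
        return [c.kind for c in self.complements if c.obligatory]


def missing_complements(entry, observed_kinds, pro_drop=True):
    """List obligatory complements that were not observed in a sentence.

    With pro_drop=True a missing subject is not reported, since subjects
    can be omitted freely in Persian.
    """
    missing = []
    for kind in entry.obligatory_kinds():
        if kind in observed_kinds:
            continue
        if pro_drop and kind == "subject":
            continue
        missing.append(kind)
    return missing


# Illustrative entry built on the /goft/ - /gu/ lemma pair.
say = VerbEntry(
    past_lemma="goft",
    present_lemma="gu",
    complements=[
        Complement("subject"),
        Complement("object"),
        Complement("prepositional_object", obligatory=False),
    ],
)
print(missing_complements(say, {"prepositional_object"}))  # ['object']
```

Keeping optional and obligatory complements apart is what lets such a lexicon flag an incomplete clause without penalizing a dropped subject.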

References
Oflazer, Kemal, Bilge Say, Dilek Zeynep Hakkani-Tür, and Gökhan Tür. “Building a Turkish treebank.” Treebanks (2003): 261-277.
Rasooli, Mohammad Sadegh, Amirsaeid Moloodi, Manouchehr Kouhestani, and Behrouz Minaei-Bidgoli. “A syntactic valency lexicon for Persian verbs: The first steps towards Persian dependency treebank.” In 5th Language & Technology Conference (LTC): Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 227-231. 2011.
Zeman, Daniel, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský, and Jan Hajič. “HamleDT: To parse or not to parse.” In Proceedings of the Eighth Conference on International Language Resources and Evaluation (LREC’12), Istanbul, Turkey. 2012.
