Talks by researchers David Martinez and Meladel Mistica (2011/10/14)

There will be two talks tomorrow at the Ixa Group seminar. Both speakers are coming from Australia, but David is well known to us, since he is a former Ixa member. He is visiting us once again.

Place: Room 3.2 of the Faculty
Date: Friday, October 14
Time: 15:00
Title: Word classes in Indonesian: A linguistic reality or a convenient fallacy in natural language processing?
Speaker: Meladel Mistica (Australian National University)


In this talk I will present work on Indonesian (Bahasa Indonesia) and on the claim that there is no noun-verb distinction within the language as it is spoken in regions such as Riau and Jakarta. We test this claim for the language as it is written by a variety of Indonesian speakers, using empirical methods traditionally used in part-of-speech induction.
In this study we use only morphological patterns that we generate from a pre-existing morphological analyser. We find that once the distribution of the data points in our experiments matches the distribution of the text from which we gather our data, we obtain results that show a significant distinction between the class of nouns and the class of verbs in Indonesian. Furthermore, the results suggest that word classes may be labelled using morphological features alone, an approach that could be applied to out-of-vocabulary items.

Title: Text classification of patient reports and event-modifier identification for the biomedical literature
Speaker: David Martinez (NICTA, National ICT Australia)

The first short talk describes the implementation and evaluation of a system for classifying pathology reports for the Royal Melbourne Hospital, which relied on document-level annotations obtained from the medical workflow. We observed that a basic machine learning framework with linguistic features has the potential to make an impact on their process.
The second talk describes our work on modifiers of biomedical events over the BioNLP-2009 dataset. Our system combines a simple bag-of-words method with two grammar-based approaches, namely the English Resource Grammar and the RASP parser. We interpret the output of the respective parsers via MRS (Minimal Recursion Semantics) and feed the resulting representations into a machine learner. Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.
