Syntax-Morphosyntax

Aditza+izena Unitate Fraseologikoak gaztelaniatik euskarara: azterketa eta tratamendu konputazionala // Verb+Noun Multiword Expressions: A linguistic analysis for identification and translation

Multiword expressions (MWEs) are idiomatic word combinations that are specific to each language. For Natural Language Processing (NLP) tools to produce good-quality results, such expressions must be handled properly, but this task poses several difficulties, among them the lack of word-for-word translatability. In this thesis, we have carried out a linguistic analysis of verb+noun MWEs to help address two important problems that they raise in the field of NLP: on the one hand, automatically identifying MWEs in corpora, and on the other, those MWEs in Spanish and Basque

The DISRPT 2019 Shared Task on Elementary Discourse Unit Segmentation and Connective Detection

In 2019, we organized the first iteration of a shared task dedicated to the underlying units used in discourse parsing across formalisms: the DISRPT Shared Task on Elementary Discourse Unit Segmentation and Connective Detection. In this paper we review the data included in the task, which cover 2.6 million manually annotated tokens from 15 datasets in 10 languages, survey and compare submitted systems, and report on system performance on each task for both annotated and plain-tokenized versions of the data.

Literal occurrences of Multiword Expressions: rare birds that cause a stir

Multiword expressions can have both idiomatic and literal occurrences. For instance, pulling strings can be understood either as making use of one’s influence, or literally. Distinguishing these two cases has been addressed in linguistic and psycholinguistic studies, and is also considered one of the major challenges in MWE processing. We suggest that literal occurrences should be considered in both semantic and syntactic terms, which motivates their study in a treebank.

Unitate Fraseologikoen agerpen literalak, urre baina urri (Literal occurrences of multiword expressions: golden but scarce)

Many multiword expressions can be understood both idiomatically and literally. For example, the Basque expression ziria sartu can have two meanings depending on the context: deceiving someone, or literally inserting a peg somewhere. In this work, we report on a multilingual corpus-based study, and we show, on the one hand, that such word combinations are very rarely used literally in practice, and on the other, that the idiomatic-literal distinction

Ayuda de las tecnologías lingüísticas en la investigación en Humanidades Digitales (How language technologies help research in the Digital Humanities)

The digital approach to the study of the humanities offers new opportunities for collaboration, the reuse of tools, and the multimodal dissemination of these studies. New activities, objects of study, and research techniques have given rise to new ways of reading, writing, reviewing, searching, sorting, describing, and teaching. All of this can pose a considerable handicap when entering the Digital Humanities, but the use of language technologies and the help or collaboration of humanities infrastructures such as CLARIN

Saying no but meaning yes: negation and sentiment analysis in Basque

In this work, we have analyzed the effects of negation on semantic orientation in Basque. The analysis shows that negation markers can strengthen, weaken, or have no effect on the sentiment orientation of a word or a group of words. Using the Constraint Grammar formalism, we have designed and evaluated a set of linguistic rules to formalize these three phenomena. The results show that two of the phenomena, strengthening and no change, are identified accurately, and the third, weakening, with acceptable results.
