MUSTER: Multimodal processing of Spatial and TEmporal expRessions: Toward Understanding Space and Time in Language Enhanced by Vision.

Short description, required if the project has no logo (eu):

Multimodal processing of spatial and temporal expressions: toward understanding space and time using language and vision.

The MUSTER project aims to study multimodal representations of words and sentences. Distributional representations at the word and sentence level, which are built by analyzing very large corpora, have received a strong boost in recent years. MUSTER's interest is to go beyond distributional representations, and to that end it proposes adding visual signals. The project therefore seeks to combine representations extracted from images and videos with distributional representations of text. We are convinced that perception-based signals are helpful in semantic tasks such as measuring the semantic similarity and relatedness of words, or disambiguating word senses.

Short description, required if the project has no logo (en):

Multimodal processing of Spatial and TEmporal expRessions: Toward Understanding Space and Time in Language Enhanced by Vision.

Description (en):

MUSTER is a fundamental pilot research project which introduces a new multi-modal framework for the machine-readable representation of meaning. The focus of MUSTER lies on exploiting visual and perceptual input in the form of images and videos, coupled with the textual modality, to build structured multi-modal semantic representations for the recognition of objects and actions and of their spatial and temporal relations. The MUSTER project will investigate whether such novel multi-modal representations improve the performance of automated understanding of human language. MUSTER starts from the current state-of-the-art platform for language representation learning, known as text embeddings, but introduces the visual modality to provide contextual world knowledge that text-only models lack but that humans draw on when understanding language. MUSTER will propose a new pilot framework for joint representation learning from text and vision data, tailored for spatial and temporal language processing. The constructed framework will be evaluated on a series of semantic tasks which closely mimic the processes of human language acquisition and understanding.

MUSTER will rely on recent advances in multiple research disciplines spanning natural language processing, computer vision, machine learning, representation learning, and human language technologies, working together on building structured machine-readable multi-modal representations of spatial and temporal language phenomena.

Within this framework, MUSTER will focus on building semantic representations of nouns and verbs based on distributional information from text corpora and on implicit knowledge encoded in knowledge bases. It will also integrate those models with information based on visual features. Finally, it will evaluate the new combined representation models on semantic tasks such as semantic textual similarity, word sense disambiguation, and spatial role labeling.
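As a rough illustration of what combining text-based and visual representations and evaluating them on a similarity task can look like, the sketch below fuses two vectors per word by weighted concatenation and compares words with cosine similarity. This is only one simple fusion strategy and is not taken from the project itself: the function names, the `alpha` weight, and the toy vectors are all hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(text_vec, visual_vec, alpha=0.5):
    """Weighted concatenation of a text vector and a visual vector.

    alpha is a hypothetical mixing weight: 1.0 keeps only the text signal,
    0.0 keeps only the visual signal.
    """
    text_part = [alpha * x for x in l2_normalize(text_vec)]
    visual_part = [(1 - alpha) * x for x in l2_normalize(visual_vec)]
    return text_part + visual_part

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy, made-up vectors standing in for corpus-derived and image-derived features.
text_vectors = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
visual_vectors = {"cat": [0.8, 0.2, 0.0], "dog": [0.7, 0.3, 0.0], "car": [0.0, 0.1, 0.9]}

fused = {w: fuse(text_vectors[w], visual_vectors[w]) for w in text_vectors}

sim_cat_dog = cosine(fused["cat"], fused["dog"])
sim_cat_car = cosine(fused["cat"], fused["car"])
print(f"cat~dog: {sim_cat_dog:.3f}  cat~car: {sim_cat_car:.3f}")
```

In a real evaluation, the similarity scores produced this way would be compared against human similarity judgments; the toy vectors here only show the mechanics.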

Short description, required if the project has no logo (es):

Multi-modal processing of spatial and temporal expressions: toward understanding space and time in language, aided by vision.

Description (es):

The MUSTER project aims to create a framework for the multimodal representation of words and their senses. MUSTER will explore the use of visual and perceptual information derived from images and videos, together with text-based representations. The project will thus study ways of fusing information from several modalities (text, images, and videos) into a single multimodal representation. It will also study whether these new multimodal representations help in understanding human language. To that end, MUSTER proposes to evaluate multimodal representations of words and sentences on semantic tasks such as computing semantic similarity or word sense disambiguation.

Official code:

PCIN-2015-226

Principal investigator:

Patrick Gallinari

Institution:

CHIST-ERA MINECO

Start date:

2016/01/01

End date:

2018/12/31

Group's principal investigator:

Aitor Soroa

IXA members:

Eneko Agirre

Aitor Soroa

Contract:
