Do Multilingual Language Models Think Better in English?

Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external machine translation system before running inference. However, these improvements can be attributed to the use of a separate translation system, which is typically trained on large amounts of parallel data not seen by the language model. In this work, we introduce a new approach called self-translate that leverages the few-shot translation capabilities of multilingual language models. This allows us to analyze the effect of translation in isolation. Experiments over 5 tasks show that self-translate consistently outperforms direct inference, demonstrating that language models are unable to leverage their full multilingual potential when prompted in non-English languages. Our code is available at https://github.com/juletx/self-translate.
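The difference between direct inference and self-translate can be sketched in a few lines. This is a minimal illustration, not the authors' implementation (which is in the repository linked above). It assumes a hypothetical `generate` callable that maps a prompt string to the model's text completion, and the few-shot prompt layout and helper names are illustrative assumptions as well.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: any function that maps a prompt to a text completion.
Generate = Callable[[str], str]


def build_translation_prompt(
    examples: Sequence[Tuple[str, str]], source_lang: str, text: str
) -> str:
    """Build a few-shot translation prompt (the layout is an assumption)."""
    lines = []
    for source_sentence, english_sentence in examples:
        lines.append(f"{source_lang}: {source_sentence}")
        lines.append(f"English: {english_sentence}")
    lines.append(f"{source_lang}: {text}")
    lines.append("English:")
    return "\n".join(lines)


def self_translate(
    generate: Generate,
    examples: Sequence[Tuple[str, str]],
    source_lang: str,
    text: str,
) -> str:
    """Translate `text` into English with the language model itself."""
    completion = generate(build_translation_prompt(examples, source_lang, text))
    stripped = completion.strip()
    # Keep only the first generated line; the model may continue the pattern.
    return stripped.splitlines()[0].strip() if stripped else ""


def direct_inference(generate: Generate, task_prompt: str, text: str) -> str:
    """Baseline: run the task on the original, non-English input."""
    return generate(task_prompt.format(input=text))


def self_translate_inference(
    generate: Generate,
    examples: Sequence[Tuple[str, str]],
    source_lang: str,
    task_prompt: str,
    text: str,
) -> str:
    """Self-translate: translate with the same model, then run the task in English."""
    english_text = self_translate(generate, examples, source_lang, text)
    return generate(task_prompt.format(input=english_text))
```

Because the same model performs both the translation and the task, any gap between `self_translate_inference` and `direct_inference` cannot be attributed to an external translation system or its parallel training data.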

Authors (Ixa members):

Mikel Artetxe

Gorka Azkune

Julen Etxaniz

Oier López de Lacalle

Aitor Soroa

Authors:

Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier López de Lacalle, Mikel Artetxe

Public files:

2024.naacl-short.46.pdf

Year:

2024

Article reference:

In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 550–564, Mexico City, Mexico. Association for Computational Linguistics.

Publication type:

Journal, conference, book, book chapter, or invited talk

Detailed publication type (argitalpen_sailkapen_ohia):

Conference without ISBN

Conference rating:

SCIE class 1

URL (preferably DOI):

https://aclanthology.org/2024.naacl-short.46
