Toward Low-Resource Languages Machine Translation: A Language-Specific Fine-Tuning With LoRA for Specialized Large Language Models
Journal
IEEE Access
ISSN
2169-3536
Date Issued
2025
Author(s)
Xiao Liang
Tien-Ping Tan
Donghong Qin
DOI
10.1109/ACCESS.2025.3549795
Abstract
In the field of computational linguistics, addressing machine translation (MT) challenges for low-resource languages remains crucial, as these languages often lack extensive data compared to high-resource languages. General large language models (LLMs), such as GPT-4 and Llama, primarily trained on monolingual corpora, face significant challenges in translating low-resource languages, often resulting in subpar translation quality. This study introduces Language-Specific Fine-Tuning with Low-Rank Adaptation (LSFTL), a method that enhances translation for low-resource languages by optimizing the multi-head attention and feed-forward networks of Transformer layers through low-rank matrix adaptation. LSFTL preserves the majority of the model parameters while selectively fine-tuning key components, thereby maintaining stability and enhancing translation quality. Experiments on non-English-centered low-resource Asian languages demonstrated that LSFTL improved COMET scores by 1-3 points compared to specialized multilingual machine translation models. Additionally, LSFTL's parameter-efficient approach allows smaller models to achieve performance comparable to their larger counterparts, highlighting its significance in making machine translation systems more accessible and effective for low-resource languages.
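
As a rough illustration of the adapter placement the abstract describes, the sketch below uses the Hugging Face PEFT library to attach LoRA adapters to both the multi-head attention and feed-forward projections of a Llama-style model. This is not the authors' released code: the base checkpoint, rank, scaling factor, and module names (q_proj through down_proj) are assumptions that vary across architectures. LoRA itself learns a low-rank update BA that is added to each frozen weight matrix W, so only the small adapter matrices are trained.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

# Hypothetical base checkpoint; the paper's actual models may differ.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,             # rank of the low-rank update matrices (assumed value)
    lora_alpha=32,    # scaling factor for the update (assumed value)
    lora_dropout=0.05,
    # Target both the attention projections and the feed-forward network,
    # mirroring the components the abstract says LSFTL adapts. These
    # module names are Llama-style and differ for other architectures.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable

Because the base weights stay frozen and only the adapter matrices update, the bulk of the model's parameters is preserved, which is consistent with the abstract's point that smaller models can be adapted cheaply while maintaining stability.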
