Title: Efficient Chinese-Malay Speech-Text Translation via Layer-Freezing Adaptation of Multimodal Foundation Models
Authors: Xiao Liang; Jasmina Khaw Yen Min; Liew Soung Yue; Tien-Ping Tan; Donghong Qin
Date issued: 2025-09-30
DOI: 10.1109/ACCESS.2025.3568474
URI: https://dspace-cris.utar.edu.my/handle/123456789/11406
Type: journal-article
Language: en
Keywords: Translation; Adaptation models; Computational modeling; Cultural differences; Phonetics; Multilingual; Data models; Transformers; Sensitivity analysis; Foundation models; Chinese-Malay translation; parameter-efficient fine-tuning; layer freezing; multimodal corpus; low-resource languages

Abstract: This paper addresses Chinese-Malay speech-to-text translation (S2TT), a task involving a crucial yet under-resourced language pair in computational linguistics. We introduce Layer-Freezing Adaptive Fine-Tuning (LFAFT), a parameter-efficient strategy that selectively freezes and unfreezes Transformer layers to optimize model adaptation. LFAFT achieves an 11.8% relative improvement in BLEU-4 scores while reducing trainable parameters by 45% compared to full fine-tuning. Using our newly constructed Chinese-Malay parallel corpus, our approach improves BLEU scores from 1.86 to 9.30 (+7.44 points) compared to existing Chinese-Malay speech translation systems. This work not only establishes the first large-scale Chinese-Malay S2TT dataset but also presents an efficient adaptation method that makes low-resource speech translation more accessible and computationally feasible.
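
For readers unfamiliar with layer freezing, the snippet below is a minimal, hypothetical sketch of the general technique in PyTorch with Hugging Face Transformers: freeze all parameters of a pretrained speech-to-text model, then unfreeze only a subset of layers before fine-tuning. The checkpoint (openai/whisper-small) and the choice to unfreeze the top decoder layers are illustrative assumptions; the abstract does not specify which model LFAFT adapts or which layers it freezes and unfreezes.

    # Illustrative sketch of selective Transformer layer freezing.
    # NOTE: the model checkpoint and the frozen/unfrozen split are
    # hypothetical; they are not taken from the LFAFT paper.
    from transformers import AutoModelForSpeechSeq2Seq

    model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-small")

    # Step 1: freeze every parameter in the pretrained model.
    for param in model.parameters():
        param.requires_grad = False

    # Step 2: unfreeze only the top N decoder layers (assumed choice).
    N_TRAINABLE = 4
    for layer in model.model.decoder.layers[-N_TRAINABLE:]:
        for param in layer.parameters():
            param.requires_grad = True

    # Report the resulting trainable-parameter fraction.
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"Trainable parameters: {trainable}/{total} ({trainable / total:.1%})")

Only the unfrozen layers receive gradient updates during fine-tuning, which is what reduces the trainable-parameter count relative to full fine-tuning.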