Toward Optimum Transformer Model for Sequence-to-sequence Data Transformation under Low-resource Computation Constraint
Yaya Heryadi, Bambang Dwi Wijanarko, Dina Fitria Murad, Cuk Tho, Kiyota Hashimoto

Last modified: 2022-06-09

Abstract


Accurate language translation applications running on low-resource computing devices such as smartphones are instrumental in supporting the tourism industry. The main challenge is optimizing the performance of a machine translation model for devices with limited computing resources. The vanilla transformer model is well known as a state-of-the-art neural machine translation model. However, its drawback is its large number of model parameters, which may make it unsuitable for low-resource computing devices. This paper presents findings from efforts to optimize a vanilla transformer with an encoder-decoder stack depth of 2 by exploring several activation functions using a fine-tuning approach. The pre-trained transformer model is fine-tuned on a Bahasa Indonesia-Sundanese parallel corpus to address the machine translation task. The experimental results show that, among the tested vanilla transformer models, Sigmoid yields the highest performance (0.993 average training accuracy and 0.987 average testing similarity) and GeLU the lowest (0.987 average training accuracy and 0.980 average testing similarity).
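In a vanilla transformer, the activation function under comparison sits in the position-wise feed-forward sublayer, FFN(x) = act(x W1 + b1) W2 + b2. The sketch below, a minimal NumPy illustration and not the authors' implementation, shows the five activations from the keywords as they would be swapped into such a sublayer; the function and parameter names are illustrative assumptions.

```python
import numpy as np

def relu(x):
    # max(0, x), elementwise
    return np.maximum(0.0, x)

def sigmoid(x):
    # 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def gelu(x):
    # tanh approximation of GELU (Hendrycks & Gimpel, 2016)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi)
                                    * (x + 0.044715 * x ** 3)))

def elu(x, alpha=1.0):
    # identity for x > 0, alpha * (e^x - 1) otherwise
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def feed_forward(x, w1, b1, w2, b2, activation=relu):
    """Position-wise feed-forward sublayer with a pluggable activation."""
    return activation(x @ w1 + b1) @ w2 + b2
```

Fine-tuning the model with each candidate then amounts to rebuilding this sublayer with a different `activation` argument while keeping the rest of the encoder-decoder stack fixed.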

Keywords


ReLU, Sigmoid, Tanh, GeLU, ELU, transformer model, machine translation, Indonesian language, Sundanese language.
