Machine Translation Post Editing

In the spring semester of 2025, I took a Machine Translation Post Editing course taught by Prof. Adam Wooten. Here, I'd like to show you some of my work from the class.

During this course, we studied many different aspects of Neural Machine Translation, from using controlled languages and style guides to ensure quality NMT output to preparing and cleaning data for NMT training. For our final project, we created proposals for custom-trained NMT engines and then attempted to train the engines ourselves using SYSTRAN. I learned a great deal from the mistakes I made, and I consider the project a success.

The NMT engine I customized translates Dungeons and Dragons modules from Japanese into English. Below, you will find documents for my project proposal, the testing and training data that I aligned by hand in Trados, and my lessons-learned presentation, which covers my project and its findings.

Click the button below to open the Google Drive folder containing the aforementioned project files.

Summary of Findings

I was able to improve the BLEU score of my NMT engine from its original 17.99 to 27.65 after five rounds of training. For training, I used .TMX files with segments I aligned myself in Trados from the English and Japanese translations of the Dungeons and Dragons Player's and Dungeon Master's rulebooks.
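For readers unfamiliar with the metric: BLEU scores machine output by how much its n-grams overlap with a reference translation, with a penalty for output that is too short. The sketch below is a minimal, unsmoothed illustration of the idea, not the scorer SYSTRAN uses (production scorers add tokenization and smoothing); it returns a value in [0, 1], so multiply by 100 for the 0-100 scale reported above.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Minimal sentence-level BLEU: clipped n-gram precision for n = 1..4,
    geometric mean, and a brevity penalty for short hypotheses."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngram_counts(hyp, n)
        ref_ngrams = ngram_counts(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(count, ref_ngrams[g]) for g, count in hyp_ngrams.items())
        total = sum(hyp_ngrams.values())
        precisions.append(clipped / total if total else 0.0)
    if min(precisions) == 0:
        return 0.0  # unsmoothed: any n-gram level with no matches zeroes the score
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * geo_mean
```

A perfect match scores 1.0, a hypothesis sharing no words with the reference scores 0.0, and partial overlaps fall in between.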

In the future, to improve the engine's BLEU score, I would try to find more diverse training data, perhaps by hand-translating sections of existing English modules or by finding additional glossaries and corpora online.

While the improvement seemed negligible to me at first, I was still able to post-edit the output at a rate of about 1,000 English words an hour. At that rate, I could work through 8,000 English words of output a day, four times the output of a translator working from scratch at a pace of 2,000 words a day. So, while the output still contained errors, I consider it a success.
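The throughput comparison above can be spelled out as a quick back-of-the-envelope calculation (the 8-hour workday is my assumption):

```python
# Post-editing throughput vs. translating from scratch.
POST_EDIT_WORDS_PER_HOUR = 1000   # measured post-editing pace
HOURS_PER_DAY = 8                 # assumed workday length
FROM_SCRATCH_WORDS_PER_DAY = 2000 # typical pace without MT

words_per_day = POST_EDIT_WORDS_PER_HOUR * HOURS_PER_DAY
speedup = words_per_day / FROM_SCRATCH_WORDS_PER_DAY
print(words_per_day, speedup)  # 8000 words/day, a 4x speedup
```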