You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Efficient Machine Translation Corpus Generation with LLM Integration
Introduction
This README outlines a groundbreaking approach to Machine Translation (MT) corpus generation, developed and implemented by aiXplain Inc. This method leverages Large Language Models (LLMs) to enhance the quality and efficiency of MT corpora, addressing the inherent challenges in traditional corpus generation methods.
Abstract
This project presents an innovative approach to machine translation (MT) corpus generation, blending human-in-the-loop post-editing with the utilization of Large Language Models (LLMs). This methodology enhances both the efficiency and quality of MT processes. We've built on our previous work, which focused on real-time training of custom MT quality estimation metrics informed by linguist updates, and expanded it to include the capabilities of LLMs. This integration leads to a more efficient and accurate translation process, including features like LLM-Enhanced Translation Synthesis, LLM-Assisted Annotation Analysis, LLM-Driven Pseudo Labeling, and LLM Translation Recommendation System. The source code and a demo video of this project are available for community use.