— Let the human interpreter train with AI, not be trained by it.

Observation:
In practice, some interpreters, especially when working into English, tend to ask LLMs questions like:
- “Can I say it this way?”
- “Does this sentence sound right?”
- “Is my tone appropriate?”
Caution:
This instinct to consult AI for quick validation skips a critical developmental step in interpreter training: Self-Annotation.
Interpreting students should first be trained to review their own performance and annotate their mistakes (a minimal annotation sketch follows this list). This includes:
- Identifying what the problem is (e.g., grammar, omission, addition, excessive fillers).
- Assessing how serious the problem is.
- Categorizing the type of error.
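To make this concrete, here is a minimal sketch (in Python) of what a self-annotation record might look like. The field names, categories, and severity scale are illustrative assumptions rather than an established annotation standard; the point is simply that each mistake gets a location, a type, a severity, and the student's own diagnosis.

from dataclasses import dataclass, field
from typing import List

# Illustrative categories only; a course may define its own taxonomy.
ERROR_CATEGORIES = {"grammar", "omission", "addition", "filler", "mistranslation"}

@dataclass
class ErrorAnnotation:
    timestamp: str   # where in the recording the issue occurs, e.g. "03:42"
    category: str    # one of ERROR_CATEGORIES
    severity: int    # 1 = minor slip, 3 = meaning-distorting (hypothetical scale)
    note: str        # the student's own diagnosis of what went wrong

@dataclass
class SelfReview:
    source_segment: str
    my_rendition: str
    annotations: List[ErrorAnnotation] = field(default_factory=list)

    def add(self, timestamp: str, category: str, severity: int, note: str) -> None:
        assert category in ERROR_CATEGORIES, f"unknown category: {category}"
        self.annotations.append(ErrorAnnotation(timestamp, category, severity, note))

# Example: one annotated problem from a consecutive rendition.
review = SelfReview("Speaker's remarks on quarterly results", "(my recorded rendition)")
review.add("03:42", "omission", 2, "Dropped the second figure in the comparison.")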
Why This Matters:
The ability to empathize with the audience and make independent judgments about language quality is essential for interpreters. This process builds a systematic and objective standard for self-evaluation—an indispensable part of professional growth.

1. LLMs Are Not Interpreters: Understanding the Limits of Transformers
Key Distinction:
Generative AI is not an interpreter, and is not trained for interpretation or translation.
While it may possess a vast command of vocabulary and excellent English fluency, it lacks the specialized training and professional mindset that interpreting demands.
Interpreting is not about word-for-word conversion. It’s shaped by multiple variables, including speaker speed, client preferences, and the mode of interpreting (simultaneous vs. consecutive).
Interpreters often make strategic linguistic decisions:
- In simultaneous interpreting, condensing 2–3 utterances into a single, efficient output is often necessary.
- In consecutive interpreting, interpreters may reorganize the discourse structure: bringing the conclusion forward, summarizing key points (e.g., “first… second… third…”), and making the message more digestible for the audience.
Gap in Capability: No Prompt Can Bridge What AI Cannot Do.
Interpretation involves discourse-level restructuring, cognitive compression, and audience awareness.
Even if you explicitly instruct the AI, that does not guarantee performance, because these interpreting strategies require complex cognitive judgments, including:
- Prioritizing information
- Real-time decision-making under pressure
- Empathizing with listener needs
Overlooked Issue: LLMs are people pleasers
LLMs are overly encouraging by design. They are trained to be helpful and affirming, which means:
When you ask, “Is this interpretation good?”, they are very likely to respond positively.
This creates a false sense of accuracy, depriving interpreters of the critical feedback they actually need.
Intuition:
Treating AI as an evaluator in interpreting is misleading. At most, LLMs can serve as conversation partners after self-assessment, or help generate alternative phrasings, but not as substitutes for human cognitive judgment in interpreting tasks.

2. Quality Evaluation Must Flow from Expertise to Model—Not the Other Way Around
Fundamental Question — Who should be evaluating whom?
A key concern is the emerging practice of allowing AI to evaluate human interpreting output. This, the author argues, is a serious misstep.
Why This Is Problematic:
Language quality assessment must come from someone with greater experience, higher skill, or at least comparable training.
In real-world training settings, this role is fulfilled by:
- A professional instructor evaluating a student.
- Peers engaging in collaborative review and constructive dialogue to improve output quality.
No, Your LLM Cannot Fulfill This Role Reliably:
LLMs do not possess robust interpreting competence. Their output is not consistently accurate or context-aware, making them an unreliable judge of nuanced human performance.
Trusting them to assess human interpreting quality could lead to:
- False confidence in subpar interpretations.
- Overlooking context-specific decisions that only trained interpreters would understand.
The Reverse Is More Productive:
While AI shouldn’t be the judge, it can serve as a flexible and expansive source of alternatives that interpreters can critically engage with.
An example of a productive interaction: ask the LLM to generate multiple translation variants of a sentence (e.g., 5–10 options with diverse syntax and diction); a minimal sketch of this step follows the list below.
The interpreter then:
- Evaluates each version.
- Identifies which are better and why.
- Reflects on what makes an expression more appropriate, efficient, or elegant.
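As a sketch only: the snippet below asks an LLM for several alternative renderings of one sentence, leaving all evaluation to the interpreter. It assumes the OpenAI Python client with an API key in the environment; the model name, prompt wording, and example sentence are placeholders, and any chat-style LLM interface could be substituted.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

sentence = "The committee postponed the decision pending further review."  # placeholder

prompt = (
    "Give 7 alternative English renderings of the following sentence, "
    "varying syntax and diction. Number them and do not rank or praise them:\n"
    + sentence
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0.9,      # higher temperature encourages more varied phrasing
)

# From here on, the judging is the interpreter's job, not the model's.
print(response.choices[0].message.content)

The design choice matters: the prompt explicitly tells the model not to rank the variants, so the critical comparison described above stays with the student.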
This approach, applied with critical thinking and professional guidance, can strengthen students’ syntactic flexibility, sharpen their linguistic intuition and critical reasoning, and expand their personal language database as they absorb useful structures and discard ineffective ones.