User guide · Part 9 bAIbel AV
English
Draft

TM Alignment

You often already have a document and its finished translation, but no translation memory linking them. TM Alignment turns that pair into memory: it matches the source to its translation, segment by segment, and writes the result as a TMX file you can reuse. This guide explains what the feature is for and the options it offers, above all the three alignment methods.

What it is for

A translation memory is only as useful as the material it holds. Every document you have already translated is potential memory, provided you can pair the two sides up. TM Alignment does exactly that: it takes a source file and its translation and produces reusable translation-memory pairs. You can then load the result as a memory in any project and benefit from the matches, as in guide 1.
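To make "translation-memory pairs" concrete, here is a minimal sketch of what a TMX file holds, built with Python's standard library. It is an illustration only, not how TM Alignment writes its output: the function name build_tmx, the tool name in the header, and the sample sentences are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree's name for the xml:lang attribute; it serialises as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang, tgt_lang):
    """Return a TMX 1.4 element tree for a list of (source, target) segments."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(
        tmx,
        "header",
        {
            "creationtool": "example-script",  # hypothetical tool name
            "creationtoolversion": "0.1",
            "segtype": "sentence",
            "o-tmf": "plain-text",
            "adminlang": "en",
            "srclang": src_lang,
            "datatype": "plaintext",
        },
    )
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit (tu) per pair, one variant (tuv) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


pairs = [
    ("Open the file.", "Ouvrez le fichier."),
    ("Save your work.", "Enregistrez votre travail."),
]
tmx_text = ET.tostring(build_tmx(pairs, "en", "fr"), encoding="unicode")
print(tmx_text)
```

Each pair becomes one translation unit with a source variant and a target variant, which is the shape any TMX-aware tool expects when it loads the file as a memory.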

Starting a run

Open TM Alignment and choose Create New Alignment Run. Like extraction, each run is saved so you can resume it later.

Figure 1. The TM Alignment Runs list. Create a new run, or resume one in progress.

Step 1 — Choose the alignment type

After naming the run, you pick how it will align. This is the most important choice, because the three methods suit different files and have different trade-offs.

Type | How it works | Best for
Lightweight (LLM Alignment) | Uses a model to match the two sides. Quick to set up, and includes the privacy controls. Does not handle PDFs. | Most everyday documents, especially confidential ones.
External App (Advanced Alignment) | Uses a dedicated external server. More powerful, and it does handle PDFs. | Large or difficult alignments, and PDF sources.
Structural Alignment | Matches the two files by their structure, with no model at all. | Already-structured files such as XLIFF or XML.

Figure 2. Choosing the alignment type. Lightweight for everyday and confidential work, External for power and PDFs, Structural for structured files.
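The idea behind Structural Alignment, matching by structure with no model, can be sketched in a few lines. The guide does not describe the actual matching rules, so the approach here (pair text found at the same position when both sides share the same element path) is an assumption, as are the function names and the sample XML.

```python
import xml.etree.ElementTree as ET


def text_nodes(root):
    """Yield (path, text) for every element with non-empty text, in document order."""

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            yield here, text
        for child in elem:
            yield from walk(child, here)

    yield from walk(root, "")


def align_by_structure(source_xml, target_xml):
    """Pair texts position by position, keeping only pairs whose element paths agree."""
    src = list(text_nodes(ET.fromstring(source_xml)))
    tgt = list(text_nodes(ET.fromstring(target_xml)))
    return [
        (s_text, t_text)
        for (s_path, s_text), (t_path, t_text) in zip(src, tgt)
        if s_path == t_path
    ]


source_xml = "<doc><title>Welcome</title><p>Press Start.</p></doc>"
target_xml = "<doc><title>Bienvenue</title><p>Appuyez sur Démarrer.</p></doc>"
pairs = align_by_structure(source_xml, target_xml)
print(pairs)
```

This only works when the translation kept the structure of the source, which is why the method suits files such as XLIFF or XML. If one side gains or loses an element, a positional sketch like this one silently drops or misses pairs.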

Step 2 — Configure files and settings

Now you set the languages, the output, the files, and — for the Lightweight type — the model and privacy.

Option | What it does
Source / Target Language | The two languages. Detected automatically for XLIFF and similar; set by hand for other files.
Output Mode | Pool everything into one TMX file, or write a separate TMX for each pair of files.
Extraction LLM | For Lightweight alignment, the model that does the matching. Choose it with Change….
Anonymizer / Obfuscation / Boilerplate | The privacy profiles, applied before text reaches the model and restored afterwards.
Alignment Prompt | The prompt that guides the model. Required for Lightweight alignment.
HTML/XML Filter (Optional) | An XPath or CSS selector to align only part of a web page or XML file.
Minimum words/characters per batch | How much text is matched at a time.
Source Files / Target Files | The documents to align. Add them from disk or by URL.

Lightweight keeps the privacy controls

The Lightweight type sends text to a model, so it carries the full privacy toolkit. Attach an Anonymizer or Numerical Obfuscation profile and your confidential content is masked before alignment and restored after — the same protection as everywhere else.
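The mask-then-restore principle can be illustrated with a small sketch. It is not the Anonymizer or Numerical Obfuscation implementation, whose rules this guide does not describe. The placeholder format, the number pattern, and the function names are assumptions chosen for the example.

```python
import re

# A simple pattern for integers and numbers with . or , separators (an assumption).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def mask_numbers(text):
    """Replace each number with a placeholder; return the masked text and the mapping."""
    mapping = {}

    def swap(match):
        token = f"[[NUM{len(mapping)}]]"
        mapping[token] = match.group(0)
        return token

    return NUMBER.sub(swap, text), mapping


def restore(text, mapping):
    """Put the original values back in place of their placeholders."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


original = "Invoice 4471 totals 1,250.00 EUR."
masked, mapping = mask_numbers(original)
print(masked)                    # the text a model would see
print(restore(masked, mapping))  # the text after restoration
```

The model only ever sees the placeholders, and the mapping that turns them back into real values never leaves your machine.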

Figure 3. Configuring files and settings. Languages, output mode, the model and privacy for Lightweight runs, and the source and target files.
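As an illustration of what the HTML/XML Filter option does, the sketch below uses an XPath expression to keep only the content area of a small page and ignore the navigation. It relies on the limited XPath subset in Python's standard library. The selector syntax the application accepts may differ, and the sample markup and the function name select_text are assumptions.

```python
import xml.etree.ElementTree as ET


def select_text(xml_text, xpath):
    """Return the text of every element matched by an ElementTree XPath expression."""
    root = ET.fromstring(xml_text.strip())
    return ["".join(elem.itertext()).strip() for elem in root.findall(xpath)]


page = """
<html>
  <body>
    <div class="nav">Home | About</div>
    <div class="content"><p>First paragraph.</p><p>Second paragraph.</p></div>
  </body>
</html>
"""

# Keep only the paragraphs inside the content block; the navigation is skipped.
selected = select_text(page, ".//div[@class='content']/p")
print(selected)
```

Applying the same selector to the source and the target means menus, footers, and other page furniture never reach the alignment.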

Step 3 — Pair the files

If you added several files on each side, you tell the wizard which source goes with which target. Auto-pair by Name matches them by filename, or you can select a source and a target and choose Pair Selected. A status line warns you if any file is still unpaired.
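Pairing by filename can be pictured with the sketch below. The guide does not say how Auto-pair by Name compares names, so the rule used here (strip a trailing language code such as _en or -fr, then compare what is left) is an assumption, as are the function names and sample filenames.

```python
import re
from pathlib import Path

# Trailing language code such as "_en", "-fr" or ".de-CH" (a heuristic, not the app's rule).
LANG_SUFFIX = re.compile(r"[._-][a-z]{2}(?:-[A-Z]{2})?$")


def pairing_key(filename):
    """Reduce a filename to the part that source and target are expected to share."""
    stem = Path(filename).stem
    return LANG_SUFFIX.sub("", stem).lower()


def auto_pair(sources, targets):
    """Return (pairs, unpaired): matched (source, target) tuples and leftover files."""
    by_key = {pairing_key(name): name for name in targets}
    pairs, unpaired = [], []
    for source in sources:
        match = by_key.pop(pairing_key(source), None)
        if match is None:
            unpaired.append(source)
        else:
            pairs.append((source, match))
    unpaired.extend(by_key.values())  # targets that no source claimed
    return pairs, unpaired


sources = ["manual_en.docx", "faq_en.docx", "notes.docx"]
targets = ["manual_fr.docx", "faq_fr.docx"]
pairs, unpaired = auto_pair(sources, targets)
print(pairs)
print(unpaired)
```

Whatever rule the wizard really applies, consistent file naming on both sides makes automatic pairing more reliable. Files left in the unpaired list are the ones you would pair by hand.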

Figure 4. Pairing files. Auto-pair by name, or pair them by hand, until every file has its partner.

Step 4 — Align, review, and save

Start the alignment and watch its progress — elapsed time, pairs processed, and segments aligned. When it finishes, a summary shows how many pairs and segments were produced.

Use View Results to read the aligned pairs side by side and check that the matches are sound. Then Finish & Save TMX writes the memory to disk, as one pooled file or as a set of files, depending on the output mode you chose.
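The difference between the two output modes can be shown as a small planning function. The output filenames here are hypothetical: the guide does not state how the application names the files it writes, and plan_outputs is a name invented for the example.

```python
from pathlib import Path


def plan_outputs(file_pairs, pooled, pooled_name="aligned.tmx"):
    """Map each output TMX filename to the file pairs whose segments go into it."""
    if pooled:
        # Pooled mode: every pair contributes to the same file.
        return {pooled_name: list(file_pairs)}
    # Separate mode: one TMX per pair, named after the source file (an assumption).
    return {f"{Path(source).stem}.tmx": [(source, target)] for source, target in file_pairs}


file_pairs = [("manual_en.docx", "manual_fr.docx"), ("faq_en.docx", "faq_fr.docx")]
print(plan_outputs(file_pairs, pooled=True))
print(plan_outputs(file_pairs, pooled=False))
```

Pooling suits a single project memory. Separate files keep each document's segments apart, which helps if you later want to load or discard them individually.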

Adjusting an alignment

The results viewer is for checking, not editing. If a pair needs correcting, save the TMX and open it in the TM editor, where you can adjust entries directly.
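If a saved TMX is large, a quick check outside the application can point you to entries worth opening in the TM editor. The sketch below reads a TMX document and flags pairs whose two sides differ greatly in length. The ratio threshold, the function names, and the sample data are assumptions, and a lopsided pair is only a hint, not proof of a bad match.

```python
import xml.etree.ElementTree as ET

# ElementTree's name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment texts for every tu that has both languages."""
    pairs = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


def lopsided(pairs, max_ratio=3.0):
    """Flag pairs where one side is empty or far longer than the other."""
    flagged = []
    for source, target in pairs:
        shorter, longer = sorted((len(source), len(target)))
        if shorter == 0 or longer / shorter > max_ratio:
            flagged.append((source, target))
    return flagged


sample = (
    '<tmx version="1.4"><header creationtool="example" creationtoolversion="0.1" '
    'segtype="sentence" o-tmf="plain-text" adminlang="en" srclang="en" datatype="plaintext"/>'
    "<body>"
    '<tu><tuv xml:lang="en"><seg>Open the file.</seg></tuv>'
    '<tuv xml:lang="fr"><seg>Ouvrez le fichier.</seg></tuv></tu>'
    '<tu><tuv xml:lang="en"><seg>Save.</seg></tuv>'
    '<tuv xml:lang="fr"><seg>Enregistrez votre travail avant de fermer la fenêtre.</seg></tuv></tu>'
    "</body></tmx>"
)
pairs = read_pairs(sample, "en", "fr")
print(lopsided(pairs))
```

Here the second pair is flagged because a five-character source sits opposite a full sentence, the kind of entry you would then correct in the TM editor.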

Figure 5. Alignment in progress, with live counts for pairs and segments.
Figure 6. The finished run. Review the aligned pairs side by side, then save the TMX.

Terminology used in this guide

TM Alignment
The feature that pairs a source document with its translation to produce translation-memory entries.
TMX
The standard file format for translation memory, which an alignment run produces.
Lightweight / External / Structural
The three alignment methods: model-based, external-server, and structure-based.
Output mode
Whether all pairs are pooled into one TMX file or written as separate files.
Pairing
Telling the wizard which source file matches which target file.