PDFs vs. Other Formats for Translation

Why do translators sigh at PDFs but welcome Word docs (most of the time)? Choosing the right format for your localization workflow.

If you send a PDF to a translator, they might sigh. “Can I have the source file?” Why? Because PDFs are “final” documents, not “working” documents.

1. The Extraction Problem

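To translate a PDF, you first have to extract the text, and the extracted text is rarely clean.

  • Hard Returns: Text extracted from a PDF usually carries a hard line break at the end of every visual line, not just at the end of each paragraph. “The cat sat on the” [Line Break] “mat.”
  • Translation Memory: Translation software (CAT tools) splits text into segments at these breaks, so it sees two sentence fragments instead of one sentence. Fragments rarely match earlier translations stored in the translation memory.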
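
As a rough illustration of the cleanup this forces, here is a minimal Python sketch that rejoins hard-wrapped lines before the text goes into a CAT tool. It assumes paragraphs are separated by blank lines and that every single line break inside a paragraph is a layout artifact; the function name `unwrap_hard_returns` is made up for this example, and words hyphenated across lines are not repaired.

```python
import re


def unwrap_hard_returns(extracted: str) -> str:
    """Join layout-only line breaks; keep blank-line paragraph breaks."""
    # Assumption: one or more blank lines mark a real paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", extracted.strip())
    # Inside a paragraph, collapse every run of whitespace
    # (including single newlines) into one space.
    joined = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(joined)


if __name__ == "__main__":
    raw = "The cat sat on the\nmat.\n\nIt stayed there\nall day."
    print(unwrap_hard_returns(raw))
```

Real extractions are messier (headers, footers, and columns get mixed in), which is why a heuristic like this is a stopgap rather than a substitute for the source file.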

2. The Layout Problem

German text often runs up to about 30% longer than the English source.

  • Word/HTML: The text reflows. The document gets longer.
  • PDF: The text box has a fixed size, so longer text overflows it and gets cut off. Translating a PDF often requires “Desktop Publishing” (DTP) work to resize boxes and fix the layout manually.
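
To make the overflow arithmetic concrete, here is a small Python sketch that estimates whether a translated string will still fit a fixed text box. It assumes the roughly 30% expansion figure mentioned above and measures capacity in characters, which is a simplification (real layouts depend on font and glyph widths). The function name and parameters are hypothetical.

```python
def estimate_overflow(source_text: str, box_capacity: int,
                      expansion_percent: int = 30) -> int:
    """Return the estimated number of characters that will not fit (0 if it fits)."""
    # Integer ceiling of len * (100 + percent) / 100, avoiding float rounding.
    expected_length = (len(source_text) * (100 + expansion_percent) + 99) // 100
    return max(0, expected_length - box_capacity)


if __name__ == "__main__":
    label = "Save your changes now"
    # A box sized exactly for the English text leaves no room for expansion.
    print(estimate_overflow(label, box_capacity=len(label)))
```

With these assumptions, a 20-character English string is estimated at 26 characters after translation, so a box sized for exactly 20 characters would need DTP attention.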

3. When PDF is the Right Choice

  • Review: Send the PDF for the final visual check. The translator needs to see the text in context (e.g., to see if a caption matches the image).
  • Reference: Always provide the PDF alongside the source file so the translator knows what the final output should look like.

Conclusion

Generate from source, review in PDF. Don’t use PDF as the input for translation; use it as the output.

Source-driven. MergeCanvas generates PDFs from structured JSON data, making it easy to swap languages at the data level before the PDF is ever created.
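
A minimal sketch of what swapping languages at the data level can look like, written in Python. The payload shape, the `STRINGS` table, and the `build_payload` function are all hypothetical and do not reflect MergeCanvas's actual schema or API; the point is only that the translated strings are chosen before any PDF is rendered.

```python
import json

# Hypothetical string table: one entry per locale, same keys in each.
STRINGS = {
    "en": {"title": "Invoice", "due": "Payment due"},
    "de": {"title": "Rechnung", "due": "Zahlung fällig"},
}


def build_payload(locale: str, fallback: str = "en") -> str:
    """Build the JSON handed to a PDF generator for one locale."""
    # Start from the fallback strings, then overlay the requested locale,
    # so missing translations show the fallback text instead of failing.
    strings = {**STRINGS[fallback], **STRINGS.get(locale, {})}
    return json.dumps({"locale": locale, "strings": strings}, ensure_ascii=False)


if __name__ == "__main__":
    print(build_payload("de"))
```

Because the layout is generated after the strings are chosen, the German version gets its own text flow instead of being squeezed into boxes sized for English.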