<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>DOCX on Jingyuan</title>
        <link>https://jingyuan-zheng.github.io/tags/docx/</link>
        <description>Recent content in DOCX on Jingyuan</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en-US</language>
        <copyright>Jingyuan Zheng</copyright>
        <lastBuildDate>Mon, 25 May 2026 10:00:00 +0200</lastBuildDate><atom:link href="https://jingyuan-zheng.github.io/tags/docx/index.xml" rel="self" type="application/rss+xml" /><item>
            <title>Translate Documents on Mac from Finder: PDFs, Word Files, Images, and Transcripts</title>
            <link>https://jingyuan-zheng.github.io/p/translate-document-quick-action/</link>
            <pubDate>Mon, 25 May 2026 10:00:00 +0200</pubDate>
            <guid>https://jingyuan-zheng.github.io/p/translate-document-quick-action/</guid>
            <description>&lt;img src=&#34;https://jingyuan-zheng.github.io/img/translate-document-quick-action/featured.png&#34; alt=&#34;Featured image of post Translate Documents on Mac from Finder: PDFs, Word Files, Images, and Transcripts&#34; /&gt;&lt;p&gt;This post introduces Translate Document Quick Action, a Mac workflow that translates PDFs, Word files, Markdown, images, and audio transcripts directly from Finder.&lt;/p&gt;&#xA;&lt;p&gt;I have open-sourced &lt;strong&gt;Translate Document Quick Action&lt;/strong&gt;, a small macOS-focused tool for translating everyday documents directly from Finder, with the core translation workers kept as plain Python scripts.&lt;/p&gt;&#xA;&lt;p&gt;The project is here: &lt;a class=&#34;link&#34; href=&#34;https://github.com/Jingyuan-Zheng/translate-document-quick-action&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;&#xA;    &gt;Jingyuan-Zheng/translate-document-quick-action&lt;/a&gt;.&lt;/p&gt;&#xA;&lt;h2 id=&#34;why-i-built-it&#34;&gt;Why I Built It&#xA;&lt;/h2&gt;&lt;p&gt;Translation tasks rarely arrive in one clean format. One day it is a PDF report, the next day it is a Word document, a Markdown note, a screenshot, or an audio recording that first needs a transcript.&lt;/p&gt;&#xA;&lt;p&gt;Most tools handle one piece of that workflow. 
This project tries to make the common cases feel like one action: select a file in Finder, run the Quick Action, and get translated output next to the original file.&lt;/p&gt;&#xA;&lt;p&gt;It also works from the command line, so the same workers can be used outside Finder or on non-macOS systems where the dependencies are available.&lt;/p&gt;&#xA;&lt;h2 id=&#34;what-it-supports&#34;&gt;What It Supports&#xA;&lt;/h2&gt;&lt;p&gt;The current version supports:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;strong&gt;PDF&lt;/strong&gt; translation through &lt;code&gt;pdf2zh-next&lt;/code&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;DOCX&lt;/strong&gt; translation by editing Word XML in place, preserving the original package structure and media references&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Markdown&lt;/strong&gt; translation with common Markdown structure protection&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;TXT&lt;/strong&gt; translation with line-preserving output&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Images&lt;/strong&gt; through macOS Vision OCR or an optional &lt;code&gt;manga-image-translator&lt;/code&gt; adapter&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Audio and video transcripts&lt;/strong&gt; through the MacWhisper &lt;code&gt;mw&lt;/code&gt; CLI, with optional transcript translation&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;Output files are written next to the input file and existing files are not overwritten. 
Monolingual outputs use a target-language suffix such as &lt;code&gt;_CN.docx&lt;/code&gt;; bilingual outputs include both language codes, such as &lt;code&gt;_EN_CN.docx&lt;/code&gt;.&lt;/p&gt;&#xA;&lt;h2 id=&#34;translation-output-examples&#34;&gt;Translation Output Examples&#xA;&lt;/h2&gt;&lt;p&gt;For PDFs, the bilingual output uses &lt;code&gt;pdf2zh-next&lt;/code&gt;&amp;rsquo;s alternating-page dual PDF mode, which keeps the original page layout readable while adding the translated version.&lt;/p&gt;&#xA;&lt;div class=&#34;post-figure&#34;&gt;&#xA;    &lt;img src=&#34;https://jingyuan-zheng.github.io/img/translate-document-quick-action/featured.png&#34; alt=&#34;Bilingual PDF translation output&#34;&gt;&#xA;    &lt;div class=&#34;caption&#34;&gt;Figure 1: PDF bilingual output keeps the translated and original pages easy to compare.&lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;For Markdown and TXT, the bilingual output is interleaved, which is useful when reviewing paragraph-level translation quality.&lt;/p&gt;&#xA;&lt;div class=&#34;post-figure&#34;&gt;&#xA;    &lt;img src=&#34;https://jingyuan-zheng.github.io/img/translate-document-quick-action/txt-output.png&#34; alt=&#34;TXT translation output&#34;&gt;&#xA;    &lt;div class=&#34;caption&#34;&gt;Figure 2: TXT output keeps the source file easy to inspect line by line.&lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;div class=&#34;post-figure&#34;&gt;&#xA;    &lt;img src=&#34;https://jingyuan-zheng.github.io/img/translate-document-quick-action/markdown-bilingual.png&#34; alt=&#34;Markdown bilingual translation output&#34;&gt;&#xA;    &lt;div class=&#34;caption&#34;&gt;Figure 3: Markdown bilingual output preserves common document structure.&lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;DOCX translation inserts the translated paragraph after the original paragraph while preserving media and layout references where possible.&lt;/p&gt;&#xA;&lt;div class=&#34;post-figure&#34;&gt;&#xA;    &lt;img 
src=&#34;https://jingyuan-zheng.github.io/img/translate-document-quick-action/docx-bilingual.png&#34; alt=&#34;DOCX bilingual translation output&#34;&gt;&#xA;    &lt;div class=&#34;caption&#34;&gt;Figure 4: DOCX bilingual output preserves the original document structure for review.&lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;Image translation can use a lightweight macOS Vision OCR engine for clean screenshots, diagrams, and slides. It recognizes the text, translates it, and redraws the translated text into the detected text boxes.&lt;/p&gt;&#xA;&lt;div class=&#34;post-figure&#34;&gt;&#xA;    &lt;img src=&#34;https://jingyuan-zheng.github.io/img/translate-document-quick-action/image-bilingual.png&#34; alt=&#34;Image bilingual translation output&#34;&gt;&#xA;    &lt;div class=&#34;caption&#34;&gt;Figure 5: Image bilingual output places the original and translated images side by side.&lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;h2 id=&#34;engines-and-privacy-choices&#34;&gt;Engines and Privacy Choices&#xA;&lt;/h2&gt;&lt;p&gt;The text translation workers currently support Google and Bing web endpoints, plus a local &lt;strong&gt;Ollama&lt;/strong&gt; adapter. The Google and Bing options are convenient, but they are not official paid APIs, so they may be rate-limited or change behavior upstream without notice.&lt;/p&gt;&#xA;&lt;p&gt;For sensitive documents, I would use a local backend such as Ollama or replace the adapter with an official translation API. 
The tool is intentionally structured so the file handling and translation backend are separate concerns.&lt;/p&gt;&#xA;&lt;h2 id=&#34;installation&#34;&gt;Installation&#xA;&lt;/h2&gt;&lt;p&gt;The basic setup is a Python environment:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;&#xA;&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1&#xA;&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2&#xA;&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3&#xA;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&#xA;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python3 -m venv .venv&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nb&#34;&gt;source&lt;/span&gt; .venv/bin/activate&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;pip install -r requirements.txt&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&#xA;&lt;/div&gt;&#xA;&lt;/div&gt;&lt;p&gt;PDF translation needs &lt;code&gt;pdf2zh-next&lt;/code&gt; installed separately:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;&#xA;&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1&#xA;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&#xA;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;uv tool install --python 
python3.13 &lt;span class=&#34;s2&#34;&gt;&amp;#34;pdf2zh-next==2.6.4&amp;#34;&lt;/span&gt; --with &lt;span class=&#34;s2&#34;&gt;&amp;#34;BabelDOC==0.5.16&amp;#34;&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&#xA;&lt;/div&gt;&#xA;&lt;/div&gt;&lt;p&gt;Finder Quick Actions can then be installed with:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;&#xA;&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1&#xA;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&#xA;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python3 macos/install_quick_actions.py&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&#xA;&lt;/div&gt;&#xA;&lt;/div&gt;&lt;h2 id=&#34;cli-examples&#34;&gt;CLI Examples&#xA;&lt;/h2&gt;&lt;p&gt;Translate TXT, Markdown, and DOCX:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;&#xA;&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1&#xA;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&#xA;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python3 scripts/translate_document_worker.py --engine google --lang-out zh --mode both file.txt notes.md 
paper.docx&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&#xA;&lt;/div&gt;&#xA;&lt;/div&gt;&lt;p&gt;Translate an image with the lightweight macOS Vision OCR path:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;&#xA;&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1&#xA;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&#xA;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python3 scripts/translate_image_worker.py --image-engine simple-macos --text-engine google --lang-in auto --lang-out zh --mode both image.png&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&#xA;&lt;/div&gt;&#xA;&lt;/div&gt;&lt;p&gt;Transcribe audio or video and translate the transcript:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;&#xA;&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1&#xA;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&#xA;&lt;td class=&#34;lntd&#34;&gt;&#xA;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python3 scripts/translate_audio_worker.py --operation both --engine google --lang-out zh --mode dual interview.m4a&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&#xA;&lt;/div&gt;&#xA;&lt;/div&gt;&lt;h2 id=&#34;practical-notes&#34;&gt;Practical Notes&#xA;&lt;/h2&gt;&lt;p&gt;This is a practical automation 
tool, not a promise that every complex document will translate perfectly.&lt;/p&gt;&#xA;&lt;p&gt;DOCX translation covers normal body text, headers, footers, footnotes, endnotes, and comments. Very complex Word features such as SmartArt, embedded objects, equations, or unusual text boxes may need additional testing.&lt;/p&gt;&#xA;&lt;p&gt;The lightweight &lt;code&gt;simple-macos&lt;/code&gt; image engine is best for clean screenshots, slides, and diagrams. It is not AI inpainting. For manga or complex backgrounds, the optional &lt;code&gt;manga-image-translator&lt;/code&gt; adapter is a better fit.&lt;/p&gt;&#xA;&lt;p&gt;If your daily translation work jumps between PDFs, Word documents, Markdown notes, screenshots, and recordings, this project gives you one place to start instead of a pile of one-off scripts.&lt;/p&gt;&#xA;&lt;h2 id=&#34;related-posts&#34;&gt;Related Posts&#xA;&lt;/h2&gt;&lt;ul&gt;&#xA;&lt;li&gt;For fully local AI text translation on Apple Silicon, see &lt;a class=&#34;link&#34; href=&#34;https://jingyuan-zheng.github.io/p/mac-lite-translator-apple-silicon-mlx/&#34; &gt;Mac-Lite-Translator&lt;/a&gt;.&lt;/li&gt;&#xA;&lt;li&gt;For multilingual typing and academic symbols on macOS, see &lt;a class=&#34;link&#34; href=&#34;https://jingyuan-zheng.github.io/p/abc-custom-keyboard/&#34; &gt;ABC Custom Keyboard&lt;/a&gt;.&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;</description>
        </item></channel>
</rss>
