From office-docx-skills
Use when editing `.docx` or Word documents that need visible Track Changes, revision marks, 修订模式, insertions, deletions, reviewer-facing redlines, or preserved change history.
How this skill is triggered — by the user, by Claude, or both
Slash command
/office-docx-skills:docx-tracked-changesThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Edit `.docx` files programmatically so that changes appear as tracked revisions in Microsoft Word — deletions show as red strikethrough, insertions as red underline, with author/date metadata in the margin. This uses direct XML manipulation of the OOXML `w:ins` / `w:del` elements via `lxml`, NOT `python-docx`'s high-level API (which has no tracked-change support).
Edit .docx files programmatically so that changes appear as tracked revisions in Microsoft Word — deletions show as red strikethrough, insertions as red underline, with author/date metadata in the margin. This uses direct XML manipulation of the OOXML w:ins / w:del elements via lxml, NOT python-docx's high-level API (which has no tracked-change support).
pip install python-docx lxml
python-docx >= 1.2.0 — used only for initial inspection (paragraph styles, run formatting). NOT used for the actual tracked-change writes.lxml >= 5.0 — the real workhorse; parses and modifies word/document.xml inside the .docx zip.Both are pure-Python installs, no system-level dependencies.
.docx and wants to SEE the changes in Word (修订模式 / Track Changes).docx is insufficient because the change history matterspython-docx directly.doc (old binary format) — not supportedThis skill can be used alone or together with other skills in office-docx-skills.
word-default-formatting when revised text should also follow the package's default Word formatting profile.word-formula-writing when formula-related edits should be represented with Word-native editable equations and visible revision marks.A .docx is a zip containing word/document.xml. Tracked changes use two wrapper elements around w:r (run) elements inside w:p (paragraph):
<!-- Deletion: wraps original runs, text tag becomes w:delText -->
<w:del w:id="102" w:author="Claude" w:date="2026-04-17T10:00:00Z">
<w:r>
<w:rPr><!-- original formatting preserved --></w:rPr>
<w:delText xml:space="preserve">old text here</w:delText>
</w:r>
</w:del>
<!-- Insertion: wraps new runs with cloned formatting -->
<w:ins w:id="103" w:author="Claude" w:date="2026-04-17T10:00:00Z">
<w:r>
<w:rPr><!-- cloned from original run --></w:rPr>
<w:t xml:space="preserve">new text here</w:t>
</w:r>
</w:ins>
Key rules:
w:del and w:ins are direct children of w:p, siblings of w:pPrw:id (integer, must not collide with existing IDs in the document)w:author and w:date populate the margin annotation in Wordw:del, text elements MUST be w:delText (not w:t)w:ins, text elements are normal w:tw:rPr) must be cloned from original runs to preserve font/size/boldThe reusable implementation lives in tracked_change_editor.py beside this skill. Copy that helper into your working directory when you want code you can import directly.
from tracked_change_editor import TrackedChangeEditor
editor = TrackedChangeEditor('input.docx', author='Your Name')
for i, p in enumerate(editor.body_paras):
if 'target text' in editor._get_para_text(p):
editor.replace_paragraph_text(i, 'replacement text')
break
editor.save('output.docx')
For section-aware paragraph insertion, use insert_paragraph_after_with_tracked_change() after locating the correct section and anchor paragraph:
editor.insert_paragraph_after_with_tracked_change(ref_entry_index, "New reference entry text")
tracked_change_editor.py handles the important mechanics:
w:ins / w:del nodesw:rPr.docxeditor = TrackedChangeEditor('input.docx', author='Your Name')
# Find target paragraph by text content
for i, p in enumerate(editor.body_paras):
if 'target text' in editor._get_para_text(p):
editor.replace_paragraph_text(i, 'replacement text')
break
editor.save('output.docx') # Always save to a NEW file first
If you need to make 7 edits, do them ALL in a single script run against a single TrackedChangeEditor instance, then save once. Do NOT:
_revised.docx_revised.docx (OVERWRITES script A's work)If you must split across multiple script runs, each subsequent run MUST load from the previous output:
# Round 1
editor = TrackedChangeEditor('original.docx')
# ... edits ...
editor.save('revised.docx')
# Round 2 — load from revised, NOT original
editor2 = TrackedChangeEditor('revised.docx')
# ... more edits ...
editor2.save('revised.docx')
Why this matters: Loading from the original file discards all tracked changes from prior rounds. This is the single most common mistake — it silently drops work with no error.
When inserting a new paragraph (e.g., a reference entry), NEVER search the entire document for a text match. Body text citations contain the same author names as reference entries. A naive search like if 'Acharya' in text will match a body paragraph first and insert the reference into the middle of Section 2.
Correct approach — find the section boundary first:
# Find where References section starts
refs_start_idx = None
for i, p in enumerate(editor.body_paras):
t = editor._get_para_text(p)
if t.strip() == 'References':
refs_start_idx = i
break
# Search ONLY within References
for i in range(refs_start_idx + 1, len(editor.body_paras)):
t = editor._get_para_text(editor.body_paras[i])
if t.startswith('Acharya') and 'Johnson' in t:
# Insert after this paragraph
break
Why this matters: Author names appear in both in-text citations and reference entries. Without section-aware search, new reference paragraphs end up in the body text — a corruption that's hard to spot until the user opens Word.
When replacing text in academic manuscripts or reference-preserving rewrites, preserve every (Author Year) citation from the original unless the user explicitly asks to delete, merge, or rewrite citations. Academic papers have a reference list that should match in-text citations; dropping a citation can orphan the reference entry or weaken the argument.
Correct approach — extract before rewriting:
import re
old_text = editor._get_para_text(editor.body_paras[idx])
old_cites = re.findall(r'\([^)]*\d{4}[a-z]?\)', old_text)
# Write new_text ensuring every item in old_cites appears
# After writing, verify:
for c in old_cites:
assert c in new_text, f'LOST CITATION: {c}'
Why this matters: A rewritten introduction that drops (Malmendier and Nagel 2011; Choi, Gao, and Jiang 2020) silently breaks the paper's citation integrity. The reference list still contains these entries but nothing in the text points to them. Reviewers and copy-editors will flag this.
If a paragraph already contains w:ins/w:del from prior edits, _collect_direct_runs only grabs direct w:r children — runs inside existing w:ins/w:del wrappers are skipped. This avoids nesting revisions but means the replacement only covers the "accepted" portion of text. For paragraphs with heavy prior revisions, accept all changes in Word first, then re-edit.
Never overwrite the original. Save as _revised.docx, let the user verify in Word, then replace manually.
python-docx / lxml cannot refresh Word's Table of Contents. After editing, the user must open in Word → right-click TOC → Update Field.
The _find_max_id() method scans all existing w:ins, w:del, w:rPrChange, w:pPrChange, w:sectPrChange elements. New IDs increment from the max. If IDs collide, Word may corrupt the revision history.
_get_default_rpr clones the w:rPr from the first run in the paragraph. This works for paragraphs with uniform formatting. For paragraphs with mixed formatting (e.g., bold "Keywords:" followed by normal text), do surgical per-run replacement instead of whole-paragraph replacement.
Saving via lxml produces minor XML cosmetic differences (quote style " vs ', line endings \r\n vs \n). Content and formatting are preserved. File size may differ due to zip compression level — this is harmless.
After saving, open in Word and confirm:
w:ins with your author name and print each one's first 100 chars. If the count is less than expected, prior edits were lost (see Gotcha 0).python3 verify_tracked_changes.py output.docx --author Codex
verify_tracked_changes.py is included beside this skill and prints insertion/deletion counts plus the first 100 characters of each tracked change.
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub wintersdragon-c/office-docx-skills --plugin office-docx-skills