How do I install Word Documents (DOCX)?

Install Word Documents (DOCX) with a single command: npx mdskills install anthropics/docx-documents. This downloads the skill files into your project and your AI agent picks them up automatically.

What platforms support Word Documents (DOCX)?

Word Documents (DOCX) works with Claude Code. Skills use the open SKILL.md format which is compatible with any AI coding agent that reads markdown instructions.


Word Documents (DOCX)

Name: Word Documents (DOCX): AI Agent Skill
Rating: 9 (1 review)
Author: anthropics

Verified

Plugin, File Processing, Beginner friendly

Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of "Word doc", "word document", ".docx", or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a "report", "memo", "letter", "template", or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation.

by @anthropics, 95 downloads, updated 2/19/2026

Add this skill

npx mdskills install anthropics/docx-documents


Skill Advisor: 9.0

Comprehensive Word document manipulation with detailed XML editing patterns and validation workflows

+ Provides extensive docx-js code examples with critical pitfalls clearly marked
+ Covers full lifecycle from creation to XML-level editing with validation steps
+ Documents smart quote handling, table sizing, and cross-platform compatibility issues
- References external Python scripts without showing their implementation or error handling

SKILL.md


---
name: docx
description: "Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of \"Word doc\", \"word document\", \".docx\", or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a \"report\", \"memo\", \"letter\", \"template\", or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation."
license: Proprietary. LICENSE.txt has complete terms
---

# DOCX creation, editing, and analysis

## Overview

A .docx file is a ZIP archive containing XML files.

## Quick Reference

| Task | Approach |
|------|----------|
| Read/analyze content | `pandoc` or unpack for raw XML |
| Create new document | Use `docx-js` - see Creating New Documents below |
| Edit existing document | Unpack → edit XML → repack - see Editing Existing Documents below |

### Converting .doc to .docx

Legacy `.doc` files must be converted before editing:

```bash
python scripts/office/soffice.py --headless --convert-to docx document.doc
```

### Reading Content

```bash
# Text extraction with tracked changes
pandoc --track-changes=all document.docx -o output.md

# Raw XML access
python scripts/office/unpack.py document.docx unpacked/
```

### Converting to Images

```bash
python scripts/office/soffice.py --headless --convert-to pdf document.docx
pdftoppm -jpeg -r 150 document.pdf page
```

### Accepting Tracked Changes

To produce a clean document with all tracked changes accepted (requires LibreOffice):

```bash
python scripts/accept_changes.py input.docx output.docx
```

---

## Creating New Documents

Generate .docx files with JavaScript, then validate. Install: `npm install -g docx`

### Setup
```javascript
const fs = require('fs'); // needed for writeFileSync below
const { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell, ImageRun,
        Header, Footer, AlignmentType, PageOrientation, LevelFormat, ExternalHyperlink,
        TableOfContents, HeadingLevel, BorderStyle, WidthType, ShadingType,
        VerticalAlign, PageNumber, PageBreak } = require('docx');

const doc = new Document({ sections: [{ children: [/* content */] }] });
// Serialize the document to a buffer, then write it to disk
Packer.toBuffer(doc).then(buffer => fs.writeFileSync("doc.docx", buffer));
```

### Validation
After creating the file, validate it. If validation fails, unpack, fix the XML, and repack.
```bash
python scripts/office/validate.py doc.docx
```

### Page Size

```javascript
// CRITICAL: docx-js defaults to A4, not US Letter
// Always set page size explicitly for consistent results
sections: [{
  properties: {
    page: {
      size: {
        width: 12240,   // 8.5 inches in DXA
        height: 15840   // 11 inches in DXA
      },
      margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 } // 1 inch margins
    }
  },
  children: [/* content */]
}]
```

**Common page sizes (DXA units, 1440 DXA = 1 inch):**

| Paper | Width | Height | Content Width (1" margins) |
|-------|-------|--------|---------------------------|
| US Letter | 12,240 | 15,840 | 9,360 |
| A4 (default) | 11,906 | 16,838 | 9,026 |
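
The arithmetic behind the table can be checked with a small helper. This is a minimal sketch: `inchesToDxa` and `contentWidth` are hypothetical names, not part of docx-js. Only the 1440 DXA per inch factor and the page dimensions come from the table.

```javascript
// Hypothetical helpers (not part of docx-js) for DXA arithmetic.
const DXA_PER_INCH = 1440;

// Convert inches to DXA, rounded to a whole unit.
function inchesToDxa(inches) {
  return Math.round(inches * DXA_PER_INCH);
}

// Content width = page width minus left and right margins (all in DXA).
function contentWidth(pageWidth, marginLeft, marginRight) {
  return pageWidth - marginLeft - marginRight;
}

const letterWidth = inchesToDxa(8.5); // 12240
const oneInch = inchesToDxa(1);       // 1440
console.log(contentWidth(letterWidth, oneInch, oneInch)); // 9360 (US Letter)
console.log(contentWidth(11906, oneInch, oneInch));       // 9026 (A4)
```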

**Landscape orientation:** docx-js swaps width/height internally, so pass portrait dimensions and let it handle the swap:
```javascript
size: {
  width: 12240,   // Pass SHORT edge as width
  height: 15840,  // Pass LONG edge as height
  orientation: PageOrientation.LANDSCAPE  // docx-js swaps them in the XML
},
// Content width = 15840 - left margin - right margin (uses the long edge)
```
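
The content-width rule in the comment above can be sketched as a function. `usableWidth` is a hypothetical helper, not a docx-js API. It assumes the size is given in portrait form (short edge as `width`), as the landscape note describes.

```javascript
// Hypothetical helper: horizontal space available for content, in DXA.
// `size` holds portrait dimensions: width = short edge, height = long edge.
function usableWidth(size, margin, landscape) {
  // In landscape the long edge (height) runs horizontally.
  const horizontalEdge = landscape ? size.height : size.width;
  return horizontalEdge - margin.left - margin.right;
}

const letter = { width: 12240, height: 15840 };
const margins = { left: 1440, right: 1440 };
console.log(usableWidth(letter, margins, false)); // 9360
console.log(usableWidth(letter, margins, true));  // 12960
```

The landscape result is the value to use for a full-width table in a landscape section.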

### Styles (Override Built-in Headings)

Use Arial as the default font (universally supported). Keep titles black for readability.

```javascript
const doc = new Document({
  styles: {
    default: { document: { run: { font: "Arial", size: 24 } } }, // 12pt default
    paragraphStyles: [
      // IMPORTANT: Use exact IDs to override built-in styles
      { id: "Heading1", name: "Heading 1", basedOn: "Normal", next: "Normal", quickFormat: true,
        run: { size: 32, bold: true, font: "Arial" },
        paragraph: { spacing: { before: 240, after: 240 }, outlineLevel: 0 } }, // outlineLevel required for TOC
      { id: "Heading2", name: "Heading 2", basedOn: "Normal", next: "Normal", quickFormat: true,
        run: { size: 28, bold: true, font: "Arial" },
        paragraph: { spacing: { before: 180, after: 180 }, outlineLevel: 1 } },
    ]
  },
  sections: [{
    children: [
      new Paragraph({ heading: HeadingLevel.HEADING_1, children: [new TextRun("Title")] }),
    ]
  }]
});
```

### Lists (NEVER use unicode bullets)

```javascript
// ❌ WRONG - never manually insert bullet characters
new Paragraph({ children: [new TextRun("• Item")] })  // BAD
new Paragraph({ children: [new TextRun("\u2022 Item")] })  // BAD

// ✅ CORRECT - use numbering config with LevelFormat.BULLET
const doc = new Document({
  numbering: {
    config: [
      { reference: "bullets",
        levels: [{ level: 0, format: LevelFormat.BULLET, text: "•", alignment: AlignmentType.LEFT,
          style: { paragraph: { indent: { left: 720, hanging: 360 } } } }] },
      { reference: "numbers",
        levels: [{ level: 0, format: LevelFormat.DECIMAL, text: "%1.", alignment: AlignmentType.LEFT,
          style: { paragraph: { indent: { left: 720, hanging: 360 } } } }] },
    ]
  },
  sections: [{
    children: [
      new Paragraph({ numbering: { reference: "bullets", level: 0 },
        children: [new TextRun("Bullet item")] }),
      new Paragraph({ numbering: { reference: "numbers", level: 0 },
        children: [new TextRun("Numbered item")] }),
    ]
  }]
});

// ⚠️ Each reference creates INDEPENDENT numbering
// Same reference = continues (1,2,3 then 4,5,6)
// Different reference = restarts (1,2,3 then 1,2,3)
```

### Tables

**CRITICAL: Tables need dual widths** - set both `columnWidths` on the table AND `width` on each cell. Without both, tables render incorrectly on some platforms.

```javascript
// CRITICAL: Always set table width for consistent rendering
// CRITICAL: Use ShadingType.CLEAR (not SOLID) to prevent black backgrounds
const border = { style: BorderStyle.SINGLE, size: 1, color: "CCCCCC" };
const borders = { top: border, bottom: border, left: border, right: border };

new Table({
  width: { size: 9360, type: WidthType.DXA }, // Always use DXA (percentages break in Google Docs)
  columnWidths: [4680, 4680], // Must sum to table width (DXA: 1440 = 1 inch)
  rows: [
    new TableRow({
      children: [
        new TableCell({
          borders,
          width: { size: 4680, type: WidthType.DXA }, // Also set on each cell
          shading: { fill: "D5E8F0", type: ShadingType.CLEAR }, // CLEAR not SOLID
          margins: { top: 80, bottom: 80, left: 120, right: 120 }, // Cell padding (internal, not added to width)
          children: [new Paragraph({ children: [new TextRun("Cell")] })]
        })
      ]
    })
  ]
})
```

**Table width calculation:**

Always use `WidthType.DXA`; `WidthType.PERCENTAGE` breaks in Google Docs.

```javascript
// Table width = sum of columnWidths = content width
// US Letter with 1" margins: 12240 - 2880 = 9360 DXA
width: { size: 9360, type: WidthType.DXA },
columnWidths: [7000, 2360]  // Must sum to table width
```

**Width rules:**
- **Always use `WidthType.DXA`** - never `WidthType.PERCENTAGE` (incompatible with Google Docs)
- Table width must equal the sum of `columnWidths`
- Cell `width` must match corresponding `columnWidth`
- Cell `margins` are internal padding - they reduce content area, not add to cell width
- For full-width tables: use content width (page width minus left and right margins)
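
One way to satisfy the sum rule when a width does not divide evenly is to give the rounding remainder to the last column. The sketch below illustrates that idea under stated assumptions: `columnWidthsFor` and `widthsAreConsistent` are hypothetical helpers, not docx-js functions, and the ratio-based split is just one possible layout policy.

```javascript
// Hypothetical helper: split a table width (DXA) into columns by ratio.
function columnWidthsFor(tableWidth, ratios) {
  const total = ratios.reduce((sum, r) => sum + r, 0);
  const widths = ratios.map((r) => Math.floor((tableWidth * r) / total));
  // Give the rounding remainder to the last column so the sum matches exactly.
  const used = widths.reduce((sum, w) => sum + w, 0);
  widths[widths.length - 1] += tableWidth - used;
  return widths;
}

// Check the width rules: columns sum to the table width, cells match columns.
function widthsAreConsistent(tableWidth, columnWidths, cellWidths) {
  const sum = columnWidths.reduce((s, w) => s + w, 0);
  return sum === tableWidth &&
    cellWidths.length === columnWidths.length &&
    cellWidths.every((w, i) => w === columnWidths[i]);
}

console.log(columnWidthsFor(9360, [1, 1, 1])); // [ 3120, 3120, 3120 ]
console.log(columnWidthsFor(9026, [1, 1, 1])); // [ 3008, 3008, 3010 ]
```

The returned array can be passed as `columnWidths`, with each entry reused as the matching cell `width`.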

### Images

```javascript
// CRITICAL: type parameter is REQUIRED
new Paragraph({
  children: [new ImageRun({
    type: "png", // Required: png, jpg, jpeg, gif, bmp, svg
    data: fs.readFileSync("image.png"),
    transformation: { width: 200, height: 150 },
    altText: { title: "Title", description: "Desc", name: "Name" } // All three required
  })]
})
```
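
When the source image is larger than the space available, the `transformation` values need to be scaled together to avoid distortion. A minimal sketch, assuming the values are pixel dimensions and that the image's natural size is already known; `fitToWidth` is a hypothetical helper, not part of docx-js.

```javascript
// Hypothetical helper: scale an image down to a maximum width,
// keeping the aspect ratio. Images that already fit are left unchanged.
function fitToWidth(naturalWidth, naturalHeight, maxWidth) {
  if (naturalWidth <= maxWidth) {
    return { width: naturalWidth, height: naturalHeight };
  }
  const scale = maxWidth / naturalWidth;
  return { width: maxWidth, height: Math.round(naturalHeight * scale) };
}

console.log(fitToWidth(800, 600, 200)); // { width: 200, height: 150 }
```

The result has the same shape as the `transformation` object in the example above.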

### Page Breaks

```javascript
// CRITICAL: PageBreak must be inside a Paragraph
new Paragraph({ children: [new PageBreak()] })

// Or use pageBreakBefore
new Paragraph({ pageBreakBefore: true, children: [new TextRun("New page")] })
```

### Table of Contents

```javascript
// CRITICAL: Headings must use HeadingLevel ONLY - no custom styles
new TableOfContents("Table of Contents", { hyperlink: true, headingStyleRange: "1-3" })
```

### Headers/Footers

```javascript
sections: [{
  properties: {
    page: { margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 } } // 1440 = 1 inch
  },
  headers: {
    default: new Header({ children: [new Paragraph({ children: [new TextRun("Header")] })] })
  },
  footers: {
    default: new Footer({ children: [new Paragraph({
      children: [new TextRun("Page "), new TextRun({ children: [PageNumber.CURRENT] })]
    })] })
  },
  children: [/* content */]
}]
```

### Critical Rules for docx-js

- **Set page size explicitly** - docx-js defaults to A4; use US Letter (12240 x 15840 DXA) for US documents
- **Landscape: pass portrait dimensions** - docx-js swaps width/height internally; pass short edge as `width`, long edge as `height`, and set `orientation: PageOrientation.LANDSCAPE`
- **Never use `\n`** - use separate Paragraph elements
- **Never use unicode bullets** - use `LevelFormat.BULLET` with numbering config
- **PageBreak must be in Paragraph** - standalone creates invalid XML
- **ImageRun requires `type`** - always specify png/jpg/etc
- **Always set table `width` with DXA** - never use `WidthType.PERCENTAGE` (breaks in Google Docs)
- **Tables need dual widths** - `columnWidths` array AND cell `width`, both must match
- **Table width = sum of columnWidths** - for DXA, ensure they add up exactly
- **Always add cell margins** - use `margins: { top: 80, bottom: 80, left: 120, right: 120 }` for readable padding
- **Use `ShadingType.CLEAR`** - never SOLID for table shading
- **TOC requires HeadingLevel only** - no custom styles on heading paragraphs
- **Override built-in styles** - use exact IDs: "Heading1", "Heading2", etc.
- **Include `outlineLevel`** - required for TOC (0 for H1, 1 for H2, etc.)

---

## Editing Existing Documents

**Follow all 3 steps in order.**

### Step 1: Unpack
```bash
python scripts/office/unpack.py document.docx unpacked/
```
Extracts XML, pretty-prints, merges adjacent runs, and converts smart quotes to XML entities (`&#x201C;` etc.) so they survive editing. Use `--merge-runs false` to skip run merging.

### Step 2: Edit XML

Edit files in `unpacked/word/`. See XML Reference below for patterns.

**Use "Claude" as the author** for tracked changes and comments, unless the user explicitly requests use of a different name.

**Use the Edit tool directly for string replacement. Do not write Python scripts.** Scripts introduce unnecessary complexity. The Edit tool shows exactly what is being replaced.

**CRITICAL: Use smart quotes for new content.** When adding text with apostrophes or quotes, use XML entities to produce smart quotes:
```xml
<!-- Use these entities for professional typography -->
<w:t>Here&#x2019;s a quote: &#x201C;Hello&#x201D;</w:t>
```
| Entity | Character |
|--------|-----------|
| `&#x2018;` | ‘ (left single) |
| `&#x2019;` | ’ (right single / apostrophe) |
| `&#x201C;` | “ (left double) |
| `&#x201D;` | ” (right double) |
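
The entity mapping in the table can be expressed as a lookup, which may help when preparing replacement text before pasting it into the XML. This is an illustrative sketch only: `toXmlText` is a hypothetical helper, and it is not how `unpack.py` performs its conversion.

```javascript
// Smart-quote characters mapped to the numeric entities from the table above.
const SMART_QUOTE_ENTITIES = {
  "\u2018": "&#x2018;", // left single
  "\u2019": "&#x2019;", // right single / apostrophe
  "\u201C": "&#x201C;", // left double
  "\u201D": "&#x201D;", // right double
};

// Hypothetical helper: escape XML specials first, then encode smart quotes,
// so the ampersands in the generated entities are not escaped again.
function toXmlText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/[\u2018\u2019\u201C\u201D]/g, (ch) => SMART_QUOTE_ENTITIES[ch]);
}

console.log(toXmlText("Here\u2019s a quote: \u201CHello\u201D"));
// Here&#x2019;s a quote: &#x201C;Hello&#x201D;
```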

**Adding comments:** Use `comment.py` to handle boilerplate across multiple XML files (text must be pre-escaped XML):
```bash
python scripts/comment.py unpacked/ 0 "Comment text with &amp; and &#x2019;"
python scripts/comment.py unpacked/ 1 "Reply text" --parent 0  # reply to comment 0
python scripts/comment.py unpacked/ 0 "Text" --author "Custom Author"  # custom author name
```
Then add markers to document.xml (see Comments in XML Reference).

### Step 3: Pack
```bash
python scripts/office/pack.py unpacked/ output.docx --original document.docx
```
Validates with auto-repair, condenses XML, and creates DOCX. Use `--validate false` to skip.

**Auto-repair will fix:**
- `durableId` >= 0x7FFFFFFF (regenerates valid ID)
- Missing `xml:space="preserve"` on `<w:t>` with whitespace

**Auto-repair won't fix:**
- Malformed XML, invalid element nesting, missing relationships, schema violations
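
The `durableId` threshold can be illustrated with a short check. This sketch assumes the ID is stored as a hexadecimal string of up to 8 digits; `durableIdIsValid` is a hypothetical name, and the actual logic inside `pack.py` is not shown in this skill.

```javascript
// Threshold taken from the auto-repair rule above.
const DURABLE_ID_LIMIT = 0x7fffffff;

// Assumption: durableId is a hex string such as "1A2B3C4D".
function durableIdIsValid(hexId) {
  if (!/^[0-9A-Fa-f]{1,8}$/.test(hexId)) return false;
  return parseInt(hexId, 16) < DURABLE_ID_LIMIT;
}

console.log(durableIdIsValid("1A2B3C4D")); // true
console.log(durableIdIsValid("7FFFFFFF")); // false (equal to the limit)
console.log(durableIdIsValid("FFFFFFFF")); // false
```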

### Common Pitfalls

- **Replace entire `<w:r>` elements**: When adding tracked changes, replace the whole `<w:r>...</w:r>` block with `<w:del>...<w:ins>...` as siblings. Don't inject tracked change tags inside a run.
- **Preserve `<w:rPr>` formatting**: Copy the original run's `<w:rPr>` block into your tracked change runs to maintain bold, font size, etc.

---

## XML Reference

### Schema Compliance

- **Element order in `<w:pPr>`**: `<w:pStyle>`, `<w:numPr>`, `<w:spacing>`, `<w:ind>`, `<w:jc>`, `<w:rPr>` last
- **Whitespace**: Add `xml:space="preserve"` to `<w:t>` with leading/trailing spaces
- **RSIDs**: Must be 8-digit hex (e.g., `00AB1234`)
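
The whitespace and RSID rules are easy to check mechanically. A minimal sketch, with hypothetical helper names; it assumes lowercase hex digits are acceptable in an RSID, which the rule above does not state either way.

```javascript
// RSIDs must be exactly 8 hexadecimal digits, e.g. 00AB1234.
function isValidRsid(value) {
  return /^[0-9A-Fa-f]{8}$/.test(value);
}

// <w:t> needs xml:space="preserve" when the text starts or ends with whitespace.
function needsSpacePreserve(text) {
  return /^\s|\s$/.test(text);
}

// Hypothetical helper: emit a <w:t> element. The text must already be XML-escaped.
function textElement(escapedText) {
  const attr = needsSpacePreserve(escapedText) ? ' xml:space="preserve"' : "";
  return `<w:t${attr}>${escapedText}</w:t>`;
}

console.log(isValidRsid("00AB1234")); // true
console.log(textElement(" days."));   // <w:t xml:space="preserve"> days.</w:t>
console.log(textElement("60"));       // <w:t>60</w:t>
```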

### Tracked Changes

**Insertion:**
```xml
<w:ins w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
  <w:r><w:t>inserted text</w:t></w:r>
</w:ins>
```

**Deletion:**
```xml
<w:del w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
  <w:r><w:delText>deleted text</w:delText></w:r>
</w:del>
```

**Inside `<w:del>`**: Use `<w:delText>` instead of `<w:t>`, and `<w:delInstrText>` instead of `<w:instrText>`.

**Minimal edits** - only mark what changes:
```xml
<!-- Change "30 days" to "60 days" -->
<w:r><w:t>The term is </w:t></w:r>
<w:del w:id="1" w:author="Claude" w:date="...">
  <w:r><w:delText>30</w:delText></w:r>
</w:del>
<w:ins w:id="2" w:author="Claude" w:date="...">
  <w:r><w:t>60</w:t></w:r>
</w:ins>
<w:r><w:t> days.</w:t></w:r>
```
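
The deletion and insertion pair in the minimal-edit pattern can be generated from a template, which makes the structure explicit. This is a sketch for illustration only: `trackedReplacement` is a hypothetical function, all text arguments are assumed to be XML-escaped already, and it omits the `<w:rPr>` copying and `xml:space="preserve"` handling that real runs may need.

```javascript
// Hypothetical helper: build <w:del> and <w:ins> siblings for a replacement.
// Text arguments must already be XML-escaped.
function trackedReplacement({ oldText, newText, delId, insId, author = "Claude", date }) {
  const attrs = (id) => `w:id="${id}" w:author="${author}" w:date="${date}"`;
  return [
    `<w:del ${attrs(delId)}>`,
    `  <w:r><w:delText>${oldText}</w:delText></w:r>`,
    `</w:del>`,
    `<w:ins ${attrs(insId)}>`,
    `  <w:r><w:t>${newText}</w:t></w:r>`,
    `</w:ins>`,
  ].join("\n");
}

console.log(trackedReplacement({
  oldText: "30",
  newText: "60",
  delId: 1,
  insId: 2,
  date: "2025-01-01T00:00:00Z",
}));
```

The unchanged text before and after ("The term is " and " days.") stays in its own ordinary runs, as in the XML example.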

**Deleting entire paragraphs/list items** - when removing ALL content from a paragraph, also mark the paragraph mark as deleted so it merges with the next paragraph. Add `<w:del/>` inside `<w:pPr><w:rPr>`:
```xml
<w:p>
  <w:pPr>
    <w:numPr>...</w:numPr>  <!-- list numbering if present -->
    <w:rPr>
      <w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z"/>
    </w:rPr>
  </w:pPr>
  <w:del w:id="2" w:author="Claude" w:date="2025-01-01T00:00:00Z">
    <w:r><w:delText>Entire paragraph content being deleted...</w:delText></w:r>
  </w:del>
</w:p>
```
Without the `<w:del/>` in `<w:pPr><w:rPr>`, accepting changes leaves an empty paragraph/list item.

**Rejecting another author's insertion** - nest deletion inside their insertion:
```xml
<w:ins w:author="Jane" w:id="5">
  <w:del w:author="Claude" w:id="10">
    <w:r><w:delText>their inserted text</w:delText></w:r>
  </w:del>
</w:ins>
```

**Restoring another author's deletion** - add insertion after (don't modify their deletion):
```xml
<w:del w:author="Jane" w:id="5">
  <w:r><w:delText>deleted text</w:delText></w:r>
</w:del>
<w:ins w:author="Claude" w:id="10">
  <w:r><w:t>deleted text</w:t></w:r>
</w:ins>
```

### Comments

After running `comment.py` (see Step 2), add markers to document.xml. For replies, use `--parent` flag and nest markers inside the parent's.

**CRITICAL: `<w:commentRangeStart>` and `<w:commentRangeEnd>` are siblings of `<w:r>`, never inside `<w:r>`.**

```xml
<!-- Comment markers are direct children of w:p, never inside w:r -->
<w:commentRangeStart w:id="0"/>
<w:del w:id="1" w:author="Claude" w:date="2025-01-01T00:00:00Z">
  <w:r><w:delText>deleted</w:delText></w:r>
</w:del>
<w:r><w:t> more text</w:t></w:r>
<w:commentRangeEnd w:id="0"/>
<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r>

<!-- Comment 0 with reply 1 nested inside -->
<w:commentRangeStart w:id="0"/>
  <w:commentRangeStart w:id="1"/>
  <w:r><w:t>text</w:t></w:r>
  <w:commentRangeEnd w:id="1"/>
<w:commentRangeEnd w:id="0"/>
<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="0"/></w:r>
<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr><w:commentReference w:id="1"/></w:r>
```

### Images

1. Add image file to `word/media/`
2. Add relationship to `word/_rels/document.xml.rels`:
```xml
<Relationship Id="rId5" Type=".../image" Target="media/image1.png"/>
```
3. Add content type to `[Content_Types].xml`:
```xml
<Default Extension="png" ContentType="image/png"/>
```
4. Reference in document.xml:
```xml
<w:drawing>
  <wp:inline>
    <wp:extent cx="914400" cy="914400"/>  <!-- EMUs: 914400 = 1 inch -->
    <a:graphic>
      <a:graphicData uri=".../picture">
        <pic:pic>
          <pic:blipFill><a:blip r:embed="rId5"/></pic:blipFill>
        </pic:pic>
      </a:graphicData>
    </a:graphic>
  </wp:inline>
</w:drawing>
```
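
The `cx` and `cy` values are in EMUs, so sizes given in inches need converting before they go into `<wp:extent>`. A small sketch using the 914400 EMU per inch factor from the comment above; `inchesToEmu` and `extentElement` are hypothetical helpers.

```javascript
// Conversion factor from the XML comment above: 914400 EMU = 1 inch.
const EMU_PER_INCH = 914400;

// Convert inches to EMUs, rounded to a whole unit.
function inchesToEmu(inches) {
  return Math.round(inches * EMU_PER_INCH);
}

// Hypothetical helper: build the <wp:extent> element for a given size in inches.
function extentElement(widthInches, heightInches) {
  return `<wp:extent cx="${inchesToEmu(widthInches)}" cy="${inchesToEmu(heightInches)}"/>`;
}

console.log(extentElement(1, 1));   // <wp:extent cx="914400" cy="914400"/>
console.log(extentElement(2, 1.5)); // <wp:extent cx="1828800" cy="1371600"/>
```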

---

## Dependencies

- **pandoc**: Text extraction
- **docx**: `npm install -g docx` (new documents)
- **LibreOffice**: PDF conversion (auto-configured for sandboxed environments via `scripts/office/soffice.py`)
- **Poppler**: `pdftoppm` for images

