Use this skill whenever the user wants to create, read, edit, or manipulate HWP/HWPX documents (한글 문서). Triggers include: any mention of 'HWP', 'HWPX', '한글', '한컴', 'Hancom', '.hwp', '.hwpx', or requests to produce Korean-standard documents with equations, tables, multi-column layouts, or government/education formatting. Also use when extracting text or equations from HWP/HWPX files, converting between HWP/HWPX and Markdown, working with HWP equation scripts, or generating Korean exam papers (시험지). If the user asks for a document in HWP/HWPX format, use this skill. Do NOT use for DOCX, PDF, or general document tasks unrelated to HWP/HWPX.
Slash command: /hwpx:hwpx
A .hwpx file is a ZIP archive containing XML files — Hancom Office's Open XML format (한글 2014+). The older .hwp format is OLE2 binary (한글 97–2014).
| Task | Approach |
|---|---|
| Read/analyze content | python scripts/reader.py doc.hwpx or unpack for raw XML |
| Create new document | python scripts/generator.py — see Creating New Documents below |
| Edit existing document | Unzip → edit XML → rezip — see Editing Existing Documents below |
| Convert HWPX → Markdown | python scripts/reader.py doc.hwpx output.md |
| Convert Markdown → HWPX | python scripts/generator.py output.hwpx "Title" "body text" |
| Convert legacy .hwp | pip install pyhwp && hwp5txt doc.hwp or LibreOffice |
| Validate HWPX | python scripts/validate.py doc.hwpx |
Legacy .hwp files (OLE2 binary) cannot be directly edited as XML:
# Text extraction only (no formatting)
pip install pyhwp
hwp5txt document.hwp > output.txt
# Via LibreOffice (limited equation support)
libreoffice --headless --convert-to docx document.hwp
Best quality path: Open in 한글 program → Save As HWPX → use reader.py
# Markdown extraction with equations preserved as LaTeX
python scripts/reader.py document.hwpx output.md
# Raw XML access
unzip document.hwpx -d unpacked/
cat unpacked/Contents/section0.xml
python scripts/validate.py document.hwpx
Generate .hwpx files with Python using scripts/generator.py.
python scripts/generator.py output.hwpx "문서 제목" "본문 내용" [A4|B4] [1|2]
- Page size: A4 (default), B4 (Korean exam paper standard)
- Columns: 1 (default), 2 (newspaper-style, for exam papers)

from generator import generate_hwpx, build_text_para, build_equation_para, build_table, build_empty_para
# Simple document
generate_hwpx("output.hwpx", "제목", "본문 텍스트\n두 번째 줄")
# With inline equations (use $...$ in text)
generate_hwpx("output.hwpx", "수학 문제", "이차방정식 $x^2 + 3x + 2 = 0$을 풀어라.")
# B4 2-column exam paper
generate_hwpx("exam.hwpx", "2024학년도 수학 시험", body_text, "B4", 2)
For complex documents, build sections manually:
from generator import make_section, build_text_para, build_equation_para, build_table, build_empty_para
import zipfile
paras = []
paras.append(build_text_para("1. 다음 방정식을 풀어라."))
paras.append(build_empty_para())
paras.append(build_equation_para(r"\frac{x+1}{x-1} = 3"))
paras.append(build_empty_para())
paras.append(build_table([["x", "y"], ["1", "2"], ["3", "4"]]))
body = "\n".join(paras)
# Then use make_section() and assemble the ZIP
<!-- A4 (default) -->
<hp:pagePr landscape="WIDELY" width="59528" height="84188">
<hp:margin header="4252" footer="4252" gutter="0"
left="8504" right="8504" top="5668" bottom="4252"/>
</hp:pagePr>
<!-- B4 JIS (Korean exam paper standard) -->
<hp:pagePr landscape="WIDELY" width="72851" height="103181">
<hp:margin header="4252" footer="4252" gutter="0"
left="8504" right="8504" top="7086" bottom="5668"/>
</hp:pagePr>
Units: 1 HWP unit = 1/7200 inch. 8504 units ≈ 30mm.
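The unit rule above can be checked with a small conversion helper. This is a minimal sketch; the function names `mm_to_hwp` and `hwp_to_mm` are hypothetical and not part of `generator.py`. Values in the snippets above may differ from the rounded result by one unit.

```python
MM_PER_INCH = 25.4
HWP_UNITS_PER_INCH = 7200  # from the text: 1 HWP unit = 1/7200 inch


def mm_to_hwp(mm: float) -> int:
    """Convert millimetres to HWP units, rounded to the nearest unit."""
    return round(mm / MM_PER_INCH * HWP_UNITS_PER_INCH)


def hwp_to_mm(units: int) -> float:
    """Convert HWP units back to millimetres."""
    return units / HWP_UNITS_PER_INCH * MM_PER_INCH


print(mm_to_hwp(30))   # 8504, the left/right margin used above
print(mm_to_hwp(210))  # 59528, the A4 width used above
```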
<!-- 2-column newspaper style -->
<hp:colPr id="" type="NEWSPAPER" layout="LEFT" colCount="2" sameSz="1" sameGap="2268"/>
<!-- Font declaration in header.xml -->
<hh:fontface lang="HANGUL" fontCnt="2">
<hh:font id="0" face="신명중명조" type="TTF" isEmbedded="0"/>
<hh:font id="1" face="나눔고딕" type="TTF" isEmbedded="0"/>
</hh:fontface>
<!-- Character property: 10pt body text -->
<hh:charPr id="0" height="1000" textColor="#000000">
<hh:fontRef hangul="0" latin="0"/>
</hh:charPr>
<!-- Character property: 11pt bold -->
<hh:charPr id="2" height="1100" textColor="#000000">
<hh:fontRef hangul="0" latin="0"/>
<hh:bold/>
</hh:charPr>
height units: 100 = 1pt. 1000 = 10pt, 1100 = 11pt.
CRITICAL: HWP equations are NOT LaTeX. They use a completely different syntax.
| LaTeX | HWP Equation Script | Notes |
|---|---|---|
| \frac{a}{b} | {a} over {b} | Fraction |
| \sqrt{x} | sqrt {x} | Square root |
| \text{cm} | "cm" | Text in quotes |
| \mathrm{log} | "log" | Roman text |
| \left( | left ( | Left delimiter |
| \right) | right ) | Right delimiter |
| \left\{ | left lbrace | Left curly brace |
| \right\} | right rbrace | Right curly brace |
| \{ | lbrace | Standalone curly brace |
| \} | rbrace | Standalone curly brace |
| \left\| | LEFT \| | Absolute value (UPPERCASE LEFT/RIGHT) |
| \right\| | RIGHT \| | Absolute value |
| \to | `->` | Arrow with backtick spacing |
| \cdot | cdot | Dot product (NOT bullet!) |
| \cdots | `cdots` | Ellipsis with backtick spacing |
| \overline{AB} | rm bar{AB} | Line segment (rm for geometry) |
| \vec{AB} | vec{rm AB it} | Vector |
| \triangle ABC | rm triangle ABC | Triangle |
| \angle ABC | rm ANGLE ABC | Angle |
| \quad | ~~ | Wide space |
| \qquad | ~~~~ | Very wide space |
| \, \; \: \! | ~ | Thin space |
| \alpha | alpha | Greek (no backslash) |
| \sin | `sin` | Function with backtick spacing |
| \therefore | therefore~ | With trailing space |
| \because | because~ | With trailing space |
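A few rows of the table above can be expressed as regex rewrites. This is only an illustrative sketch, not the converter inside `generator.py`: the name `latex_to_hwp_sketch` is hypothetical, it covers only fractions, square roots, `\text`, `\cdot`, and a handful of Greek letters, and it assumes the braces are not nested.

```python
import re

# Small, illustrative subset of Greek letter names (assumption: extend as needed).
GREEK = ("alpha", "beta", "pi", "sigma", "theta")


def latex_to_hwp_sketch(latex: str) -> str:
    """Rewrite a few simple LaTeX constructs into HWP equation script."""
    s = latex
    # \frac{a}{b} -> {a} over {b}   (non-nested braces only)
    s = re.sub(r"\\frac\{([^{}]*)\}\{([^{}]*)\}", r"{\1} over {\2}", s)
    # \sqrt{x} -> sqrt {x}
    s = re.sub(r"\\sqrt\{([^{}]*)\}", r"sqrt {\1}", s)
    # \text{cm} -> "cm"
    s = re.sub(r"\\text\{([^{}]*)\}", r'"\1"', s)
    # \cdot -> cdot, leaving \cdots untouched
    s = re.sub(r"\\cdot(?![A-Za-z])", "cdot", s)
    # \alpha -> alpha (drop the backslash)
    s = re.sub(r"\\(" + "|".join(GREEK) + r")(?![A-Za-z])", r"\1", s)
    return s


print(latex_to_hwp_sketch(r"\frac{x+1}{x-1} = 3"))  # {x+1} over {x-1} = 3
```

Nested expressions such as `\frac{\sqrt{x}}{2}` need real brace matching, which this sketch deliberately leaves out.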
These rules are mandatory for Korean education documents (시험지, 교과서).
HWP equations render letters in italic by default. Use rm (roman/upright) explicitly where required.
| Target | Example | HWP Script |
|---|---|---|
| Geometry vertices | A, B, C, P, Q | rmA, rmB, rmABCD |
| Triangle | △ABC | rm triangle ABC |
| Line segment | AB̄ | rm bar{AB} or bar{rmPQ} |
| Angle | ∠ABC | rm ANGLE ABC |
| Units | cm, kg, L | `rmcm (with ` spacing before) |
| Probability symbols | P, C, H, B, N, E | {rmP}, {rmC}, {rmN}, {rmE} |
| Congruence conditions | SSS, SAS | rmSSS, rmSAS |
| Numbers | 1, 2, 3.14 | rm{1}, rm{3.14} (auto-handled) |
| Target | Example | HWP Script |
|---|---|---|
| Variables | a, b, x, y | a, b, x, y (no prefix needed) |
| Negative after inequality | x < -2 | x<it-2 (use it instead of space) |
| Negative after limit arrow | lim(x→-2) | lim_{x->it-2} |
| Expression | HWP Script |
|---|---|
| Permutation ₙPᵣ | _{it n}{rmP}_{it r} |
| Combination ₙCᵣ | _{it n}{rmC}_{it r} |
| Probability P(X=r) | {rmP}{it(X=r)} |
| Binomial B(n,p) | {rmB}{it(n,~p)} |
| Normal N(m,σ²) | {rmN}{it(m,~sigma^2)} |
| Expected value E(X) | {rmE}{it(X)} |
| Position | Symbol | Example |
|---|---|---|
| Before units | ` | 150`rmkg |
| After comma | ~ | (a,~b) |
| Around cdots | ` | `cdots` |
| After therefore/because | ~ | therefore~a=b |
| Ordered pairs | ~ | (a,~b) |
| Set elements | ~ | LEFT { a,~b,~c RIGHT } |
| Between points | ~ | rmP,~Q |
| Cases alignment | ~~ | cases{ax+b~~&(x ne 1)} |
| Coordinate comma | ` | rmA(-2,`-1) |
| Trig/log functions | ` | `sin`, `log` |
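The spacing rules in the table above are easy to get wrong by hand, so small string helpers can apply them consistently. A minimal sketch follows; the helper names are hypothetical and not part of `generator.py`.

```python
def with_unit(value, unit: str) -> str:
    """Number followed by a roman unit, separated by the backtick space."""
    return f"{value}`rm{unit}"


def spaced_list(items) -> str:
    """Join items with ',~' as used for ordered pairs and set elements."""
    return ",~".join(str(item) for item in items)


def ordered_pair(a, b) -> str:
    """Ordered pair with the '~' space after the comma."""
    return f"({spaced_list([a, b])})"


print(with_unit(150, "kg"))    # 150`rmkg
print(ordered_pair("a", "b"))  # (a,~b)
```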
document.hwpx (ZIP)
├── META-INF/
│ └── container.xml # OPF container (root file path)
├── Contents/
│ ├── content.hpf # OPF package manifest (file list)
│ ├── header.xml # Document settings (fonts, styles, paragraph properties)
│ ├── section0.xml # Main content (paragraphs, equations, images, tables)
│ └── section1.xml # Additional sections (optional)
├── BinData/ # Image file storage
│ ├── image1.png
│ └── image2.png
├── Preview/
│ └── PrvText.txt # Preview text
├── settings.xml # Application settings
└── mimetype # "application/hwp+zip"
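The layout above can be inspected with Python's standard `zipfile` module. The sketch below lists entries in archive order and reports whether each one is stored uncompressed; `describe_hwpx` is a hypothetical helper name, and the in-memory archive stands in for a real `.hwpx` file.

```python
import io
import zipfile


def describe_hwpx(source) -> list[tuple[str, bool]]:
    """Return (entry name, stored uncompressed?) for each ZIP entry, in archive order."""
    with zipfile.ZipFile(source) as zf:
        return [
            (info.filename, info.compress_type == zipfile.ZIP_STORED)
            for info in zf.infolist()
        ]


# Demo on an in-memory archive; pass a path such as "document.hwpx" for a real file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("mimetype", "application/hwp+zip", compress_type=zipfile.ZIP_STORED)
    zf.writestr("Contents/section0.xml", "<placeholder/>", compress_type=zipfile.ZIP_DEFLATED)
buf.seek(0)

entries = describe_hwpx(buf)
for name, stored in entries:
    print(name, "stored" if stored else "compressed")
```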
hp = http://www.hancom.co.kr/hwpml/2011/paragraph (body content)
hh = http://www.hancom.co.kr/hwpml/2011/head (header/styles)
hs = http://www.hancom.co.kr/hwpml/2011/section (sections)
hc = http://www.hancom.co.kr/hwpml/2011/core (core)
ha = http://www.hancom.co.kr/hwpml/2011/app (application)
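These prefixes map directly onto a namespace dictionary for `xml.etree.ElementTree`. The sketch below pulls paragraph text out of a section document. It assumes the section root is `hs:sec`, and `extract_text` is a hypothetical name rather than the logic of `reader.py`. Paragraphs nested inside table cells would be reported both on their own and as part of the enclosing paragraph.

```python
import xml.etree.ElementTree as ET

NS = {
    "hp": "http://www.hancom.co.kr/hwpml/2011/paragraph",
    "hs": "http://www.hancom.co.kr/hwpml/2011/section",
}


def extract_text(section_xml: str) -> list[str]:
    """Return the concatenated <hp:t> text of every <hp:p> in the section."""
    root = ET.fromstring(section_xml)
    lines = []
    for para in root.iter(f"{{{NS['hp']}}}p"):
        parts = [t.text or "" for t in para.iterfind(".//hp:t", NS)]
        lines.append("".join(parts))
    return lines


# Assumed minimal section document for demonstration.
SAMPLE = (
    '<hs:sec xmlns:hs="http://www.hancom.co.kr/hwpml/2011/section" '
    'xmlns:hp="http://www.hancom.co.kr/hwpml/2011/paragraph">'
    '<hp:p id="0"><hp:run charPrIDRef="1"><hp:t>문제 텍스트입니다.</hp:t></hp:run></hp:p>'
    "</hs:sec>"
)

print(extract_text(SAMPLE))
```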
<hp:p id="0" paraPrIDRef="0" styleIDRef="0"
pageBreak="0" columnBreak="0" merged="0">
<hp:run charPrIDRef="1">
<hp:t>문제 텍스트입니다.</hp:t>
</hp:run>
</hp:p>
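When a paragraph like the one above is built from arbitrary text, the text has to be XML-escaped first. The sketch below shows one way to do that with `html.escape`; `text_para_xml` is a hypothetical helper and not the `build_text_para` function from `generator.py`.

```python
from html import escape


def text_para_xml(text: str, para_id: int = 0, char_pr: int = 1) -> str:
    """Build a single-run <hp:p> element with the text XML-escaped."""
    return (
        f'<hp:p id="{para_id}" paraPrIDRef="0" styleIDRef="0" '
        f'pageBreak="0" columnBreak="0" merged="0">'
        f'<hp:run charPrIDRef="{char_pr}">'
        f"<hp:t>{escape(text, quote=False)}</hp:t>"
        f"</hp:run></hp:p>"
    )


print(text_para_xml("a < b & c"))
```

Without escaping, a literal `<` or `&` in the body text would make the section XML malformed.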
CRITICAL: An empty <hp:t/> MUST follow every equation element.
<hp:p id="0" paraPrIDRef="0" styleIDRef="0"
pageBreak="0" columnBreak="0" merged="0">
<hp:run charPrIDRef="1">
<hp:t>이차방정식 </hp:t>
</hp:run>
<hp:run charPrIDRef="1">
<hp:equation id="0" zOrder="0" numberingType="EQUATION"
textWrap="TOP_AND_BOTTOM" textFlow="BOTH_SIDES" lock="0"
dropcapstyle="None" version="Equation Version 60" baseLine="61"
textColor="#000000" baseUnit="1100" lineMode="CHAR" font="HYhwpEQ">
<hp:sz width="14000" widthRelTo="ABSOLUTE"
height="3000" heightRelTo="ABSOLUTE" protect="0"/>
<hp:pos treatAsChar="1" affectLSpacing="0" flowWithText="1"
allowOverlap="0" holdAnchorAndSO="0"
vertRelTo="PARA" horzRelTo="PARA"
vertAlign="TOP" horzAlign="LEFT"
vertOffset="0" horzOffset="0"/>
<hp:outMargin left="56" right="56" top="0" bottom="0"/>
<hp:shapeComment>수식입니다.</hp:shapeComment>
<hp:script>x^rm{2}+rm{3}x+rm{2}=rm{0}</hp:script>
</hp:equation>
<hp:t/>
</hp:run>
<hp:run charPrIDRef="1">
<hp:t>을 풀어라.</hp:t>
</hp:run>
</hp:p>
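The `<hp:t/>` rule can be checked mechanically. The sketch below counts equations that are not immediately followed by an `<hp:t>` sibling in the same run. It tests only for the presence of the element, and `equations_missing_trailing_t` is a hypothetical name, not part of `validate.py`.

```python
import xml.etree.ElementTree as ET

HP = "http://www.hancom.co.kr/hwpml/2011/paragraph"


def equations_missing_trailing_t(xml_fragment: str) -> int:
    """Count <hp:equation> elements whose next sibling is not <hp:t>."""
    root = ET.fromstring(xml_fragment)
    missing = 0
    for run in root.iter(f"{{{HP}}}run"):
        children = list(run)
        for i, child in enumerate(children):
            if child.tag != f"{{{HP}}}equation":
                continue
            nxt = children[i + 1] if i + 1 < len(children) else None
            if nxt is None or nxt.tag != f"{{{HP}}}t":
                missing += 1
    return missing


# Assumed minimal fragments for demonstration.
GOOD = (
    f'<hp:p xmlns:hp="{HP}"><hp:run>'
    "<hp:equation><hp:script>x</hp:script></hp:equation><hp:t/>"
    "</hp:run></hp:p>"
)
BAD = GOOD.replace("<hp:t/>", "")

print(equations_missing_trailing_t(GOOD), equations_missing_trailing_t(BAD))
```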
Key equation attributes:
- version="Equation Version 60": CRITICAL. Equation renderer version. If it is an empty string, complex equations are not displayed.
- baseLine="61": Baseline offset (default in the 한글 program)
- vertRelTo="PARA", horzRelTo="PARA": CRITICAL. Position relative to the paragraph. Equation rendering fails with PAPER/COLUMN.
- vertAlign="TOP", vertOffset="0": Aligned to the top of the paragraph, offset 0
- outMargin top="0" bottom="0": Top and bottom margins 0 (한글 default)
- <hp:shapeComment>수식입니다.</hp:shapeComment>: Comment kept for compatibility with the 한글 program (the text means "This is an equation.")
- width: Equation width (HWP units, ~600 per character)
- height: Equation height (default 3000)
- treatAsChar="1": Inline equation
- font="HYhwpEQ": HWP equation font (required)

CRITICAL: Table cells use borderFillIDRef="3" for standard solid-line borders.
<hp:tbl id="0" zOrder="0" numberingType="TABLE"
textWrap="TOP_AND_BOTTOM" textFlow="BOTH_SIDES" lock="0"
dropcapstyle="None" pageBreak="CELL" repeatHeader="1"
rowCnt="2" colCnt="3" cellSpacing="0" borderFillIDRef="3" noAdjust="0">
<hp:sz width="42000" widthRelTo="ABSOLUTE"
height="3600" heightRelTo="ABSOLUTE" protect="0"/>
<hp:pos treatAsChar="1" .../>
<hp:outMargin left="0" right="0" top="0" bottom="0"/>
<hp:inMargin left="0" right="0" top="0" bottom="0"/>
<hp:tr>
<hp:tc name="" header="0" hasMargin="0" protect="0" editable="0"
dirty="0" borderFillIDRef="3">
<hp:subList id="" textDirection="HORIZONTAL" lineWrap="BREAK"
vertAlign="CENTER" ...>
<hp:p ...><hp:run charPrIDRef="0"><hp:t>Cell</hp:t></hp:run></hp:p>
</hp:subList>
<hp:cellAddr colAddr="0" rowAddr="0"/>
<hp:cellSpan colSpan="1" rowSpan="1"/>
<hp:cellSz width="14000" height="1800"/>
<hp:cellMargin left="141" right="141" top="141" bottom="141"/>
</hp:tc>
</hp:tr>
</hp:tbl>
Table width calculation: Default total width 42000. Column widths = total_width / n_cols.
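The width rule can be sketched as follows. The text only states total_width / n_cols; handing the integer remainder to the leftmost columns is an assumption made here so that the widths always sum to the total, and `column_widths` is a hypothetical name.

```python
def column_widths(n_cols: int, total_width: int = 42000) -> list[int]:
    """Split total_width into n_cols integer widths that sum to total_width."""
    if n_cols < 1:
        raise ValueError("n_cols must be at least 1")
    base, extra = divmod(total_width, n_cols)
    # The first `extra` columns absorb the remainder, one unit each.
    return [base + 1 if i < extra else base for i in range(n_cols)]


print(column_widths(3))  # [14000, 14000, 14000], matching cellSz width="14000" above
```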
Place the image file in BinData/ inside the ZIP, then register it in Contents/content.hpf:
<opf:item id="img1" href="BinData/image1.png" media-type="image/png"/>
<hp:pic id="0" zOrder="0" numberingType="PICTURE"
textWrap="TOP_AND_BOTTOM" textFlow="BOTH_SIDES">
<hp:imgRect>
<hp:orgSz width="28000" height="28000"/>
<hp:curSz width="0" height="0"/>
</hp:imgRect>
<hp:sz width="28000" widthRelTo="ABSOLUTE"
height="28000" heightRelTo="ABSOLUTE" protect="0"/>
<hp:pos treatAsChar="1" .../>
<hp:img bright="0" contrast="0" effect="REAL_PIC" binaryItemIDRef="img1"/>
</hp:pic>
Endnotes auto-place at document end — useful for exam solutions:
<hp:run charPrIDRef="1">
<hp:ctrl>
<hp:endNote number="1" suffixChar="41" instId="2000000000">
<hp:subList id="" textDirection="HORIZONTAL" lineWrap="BREAK" vertAlign="TOP">
<hp:p ...>
<hp:run charPrIDRef="0">
<hp:ctrl>
<hp:autoNum num="1" numType="ENDNOTE">
<hp:autoNumFormat type="DIGIT"/>
</hp:autoNum>
</hp:ctrl>
</hp:run>
<hp:run charPrIDRef="2">
<hp:t>정답: 42</hp:t>
</hp:run>
</hp:p>
</hp:subList>
</hp:endNote>
</hp:ctrl>
</hp:run>
<!-- Page break -->
<hp:p id="0" paraPrIDRef="0" styleIDRef="0" pageBreak="1" columnBreak="0" merged="0">
<hp:run charPrIDRef="0"><hp:t/></hp:run>
</hp:p>
<!-- Column break -->
<hp:p id="0" paraPrIDRef="0" styleIDRef="0" pageBreak="0" columnBreak="1" merged="0">
<hp:run charPrIDRef="0"><hp:t/></hp:run>
</hp:p>
Register all files:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<opf:package xmlns:opf="http://www.idpf.org/2007/opf" unique-identifier="bookid" version="1.0">
<opf:manifest>
<opf:item id="header" href="Contents/header.xml" media-type="application/xml"/>
<opf:item id="section0" href="Contents/section0.xml" media-type="application/xml"/>
<opf:item id="img1" href="BinData/image1.png" media-type="image/png"/>
</opf:manifest>
<opf:spine>
<opf:itemref idref="header"/>
<opf:itemref idref="section0"/>
</opf:spine>
</opf:package>
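Since every image reference has to resolve to a manifest entry, a cross-check between the two files is a natural test. The sketch below compares `binaryItemIDRef` values against manifest `id`s. It assumes the image element is `hp:img` as written in this document, and `unregistered_image_refs` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
HP = "http://www.hancom.co.kr/hwpml/2011/paragraph"


def unregistered_image_refs(content_hpf: str, section_xml: str) -> set[str]:
    """Return binaryItemIDRef values that have no matching manifest item id."""
    manifest_ids = {
        item.get("id") for item in ET.fromstring(content_hpf).iter(f"{{{OPF}}}item")
    }
    refs = {
        img.get("binaryItemIDRef")
        for img in ET.fromstring(section_xml).iter(f"{{{HP}}}img")
    }
    return {ref for ref in refs if ref is not None and ref not in manifest_ids}


# Assumed minimal documents for demonstration.
MANIFEST = (
    f'<opf:package xmlns:opf="{OPF}"><opf:manifest>'
    '<opf:item id="img1" href="BinData/image1.png" media-type="image/png"/>'
    "</opf:manifest></opf:package>"
)
SECTION = (
    f'<hp:p xmlns:hp="{HP}"><hp:run><hp:pic>'
    '<hp:img binaryItemIDRef="img1"/>'
    "</hp:pic></hp:run></hp:p>"
)

print(unregistered_image_refs(MANIFEST, SECTION))
```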
Follow all 3 steps in order.
mkdir unpacked
unzip document.hwpx -d unpacked/
Edit files in unpacked/Contents/. Key files:
- section0.xml: main body content
- header.xml: fonts, styles, paragraph properties

Use the Edit tool directly for string replacement. Do not write Python scripts.
cd unpacked
zip -r ../output.hwpx . -x ".*"
CRITICAL: The mimetype file must be the first entry and stored uncompressed:
cd unpacked
zip -0 ../output.hwpx mimetype
zip -r ../output.hwpx . -x mimetype ".*"
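Where the `zip` CLI is unavailable, the same packing order can be reproduced with Python's `zipfile` module. This is a sketch of the repack step only; `repack_hwpx` is a hypothetical helper, and the output path is assumed to lie outside the source directory.

```python
import os
import tempfile
import zipfile


def repack_hwpx(src_dir: str, out_path: str) -> None:
    """Zip src_dir into out_path with mimetype first and stored uncompressed."""
    with zipfile.ZipFile(out_path, "w") as zf:
        # mimetype must be the first entry and must not be compressed.
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for folder, dirs, files in os.walk(src_dir):
            dirs[:] = sorted(d for d in dirs if not d.startswith("."))
            for name in sorted(files):
                if name.startswith("."):
                    continue  # skip dotfiles, like -x ".*" in the zip command
                full = os.path.join(folder, name)
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written as the first entry
                zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "unpacked")
    os.makedirs(os.path.join(src, "Contents"))
    with open(os.path.join(src, "mimetype"), "w", encoding="ascii") as f:
        f.write("application/hwp+zip")
    with open(os.path.join(src, "Contents", "section0.xml"), "w", encoding="utf-8") as f:
        f.write("<placeholder/>")
    out = os.path.join(tmp, "output.hwpx")
    repack_hwpx(src, out)
    with zipfile.ZipFile(out) as zf:
        first = zf.infolist()[0]
        names = zf.namelist()
        first_is_stored = first.compress_type == zipfile.ZIP_STORED
```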
python scripts/reader.py document.hwpx output.md
Extracts text, equations (as LaTeX $...$), and images.
python scripts/generator.py output.hwpx "Title" "Body text with $equations$"
Supports inline $...$ equations auto-converted to HWP equation script.
| Method | Text | Equations | Tables | Images | Formatting |
|---|---|---|---|---|---|
| hwp5txt (HWP5) | ✅ | ❌ | △ | ❌ | ❌ |
| hwp5html → pandoc | ✅ | △ | ✅ | ✅ | △ |
| LibreOffice → pandoc | ✅ | ❌ | ✅ | ✅ | △ |
| reader.py (HWPX) | ✅ | ✅ LaTeX | ✅ | ✅ | △ |
Best quality path: HWP → save as HWPX in 한글 → reader.py (equations preserved as LaTeX)
Critical rules:
- version="Equation Version 60": the equation renderer version is required. If it is an empty string, equations are not displayed in the body.
- vertRelTo="PARA", horzRelTo="PARA": equation position is relative to the paragraph. Rendering fails with PAPER/COLUMN.
- <hp:t/> after equations: without it, equations won't display.
- end_pos +1 when converting \sqrt, \text, \mathrm: skip the closing } after _match_brace to avoid mismatched braces.
- No space after ^ or _: v^ rm{2} is a parse error. Write v^rm{2} or v^{rm{2}} with no space.
- html.escape() for XML content.
- <hp:colPr> inside <hp:secPr>: multi-column fails otherwise.
- Roman type (rm) where required, such as geometry vertices, units, and probability symbols.
- cdot not bullet: always use cdot for dot product.
- A backtick (`) between number and unit is mandatory.

Troubleshooting (causes and fixes):
- version="" (empty string): version="Equation Version 60" is required.
- vertRelTo="PAPER": change to vertRelTo="PARA". horzRelTo="COLUMN": change to horzRelTo="PARA".
- <hp:t/> after the <hp:equation> element.
- No space is allowed after ^ or _. v^rm{2} (correct), v^ rm{2} (wrong).
- When converting \sqrt, \text: end_pos = m.end() + len(content) + 1 (skips the closing }).
- html.escape() for XML content.
- binaryItemIDRef doesn't match content.hpf id.
- <hp:colPr> not inside <hp:secPr>.
- pageBreak="1" attribute on <hp:p>.
- columnBreak="1" attribute on <hp:p>.
- borderFillIDRef not set to "3" (solid line).
- cdot for dot product, never bullet.

Dependencies:
- pip install pyhwp (HWP5 binary reading only)
npx claudepluginhub hohyon-ryu/claude-hwpx-plugin --plugin hwpx