Skill

Workflow

Generate a full academic paper from metadata using the EasyPaper Python SDK. Collects metadata interactively if it is not provided, then generates the paper directly.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/easypaper:easypaper-paper-from-metadata

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this skill when the user wants to generate an academic paper from metadata. This skill handles both metadata collection and paper generation in one unified workflow.

SKILL.md

359 lines · ~4.6k tokens

Stats

Language: Python

Parent stars: 4

Parent forks: 4

Maintenance: Good

Last Commit: Apr 21, 2026

Workflow

Phase 1: Check for Existing Metadata

Ask user if they have complete metadata:
- "Do you already have a complete metadata file (JSON) or would you like me to collect it interactively?"
- If user provides metadata (file path or JSON object), validate it against examples/meta.json structure
- IMPORTANT: Convert all relative paths in metadata to absolute paths before validation
- If metadata is provided but incomplete, proceed to collection phase for missing fields
- If no metadata provided, proceed to collection phase

Phase 2: Collect Metadata (if needed)

If metadata is missing or incomplete, collect all required fields interactively:

Required Fields (collect one by one with examples):

Title
- Prompt: "Please provide the title of your paper."
- Example: "Artificial intelligence tools expand scientists' impact but contract science's focus"
- Validation: Non-empty string, 10-200 characters
Idea/Hypothesis
- Prompt: "What is the core research question or hypothesis of your paper? Describe the main idea you want to explore."
- Example: "The study hypothesizes a dual effect of AI adoption in science: while AI tools increase individual scientists' productivity, citations, and career advancement, they simultaneously narrow the collective scope of scientific exploration..."
- Validation: Non-empty, at least 50 characters
Method
- Prompt: "Describe the methodology used in your research. Include details about experimental design, data collection, analysis methods, and any tools or frameworks used."
- Example: "The study analyzes 41,298,433 papers across biology, medicine, chemistry, physics, materials science, and geology (1980-2025), primarily from OpenAlex..."
- Validation: Non-empty, should describe research approach
Data
- Prompt: "What data sources, datasets, or materials did you use? Describe the data collection process and any preprocessing steps."
- Example: "Primary bibliometric data come from OpenAlex, covering 41,298,433 papers in six natural science disciplines from 1980 to 2025..."
- Validation: Non-empty, should describe data sources
Experiments/Results
- Prompt: "Describe your experimental results, findings, or main outcomes. Include key metrics, comparisons, and interpretations."
- Example: "Main findings show a clear individual-versus-collective divergence. Individual-level outcomes: Productivity: AI-using scientists publish 3.02x more papers..."
- Validation: Non-empty, should describe results
References
- Prompt: "Provide your references in BibTeX format or as a list of structured citations. You can provide them one by one or paste multiple at once."
- Example:
```
[
  "Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47-60 (2023).",
  "LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436-444 (2015)."
]
```
- Validation: Non-empty array, at least 3 references recommended
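
The validation rules for the required fields can be sketched as a small checker. This is a minimal sketch, not part of the EasyPaper SDK: the function name `validate_required_fields` is hypothetical, and the dictionary keys are taken from the Metadata Structure section of this skill.

```python
from typing import Any


def validate_required_fields(meta: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid.

    Hypothetical helper for illustration, not an EasyPaper API.
    """
    errors: list[str] = []

    title = str(meta.get("title") or "").strip()
    if not 10 <= len(title) <= 200:
        errors.append("title: must be 10-200 characters")

    idea = str(meta.get("idea_hypothesis") or "").strip()
    if len(idea) < 50:
        errors.append("idea_hypothesis: must be at least 50 characters")

    # method, data, and experiments only need to be non-empty strings.
    for key in ("method", "data", "experiments"):
        if not str(meta.get(key) or "").strip():
            errors.append(f"{key}: must be non-empty")

    refs = meta.get("references")
    if not isinstance(refs, list) or not refs:
        errors.append("references: must be a non-empty array")

    return errors
```

The "at least 3 references recommended" rule is left out on purpose: it is a recommendation, so it fits better as a warning shown to the user than as a hard error.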

Optional Fields (with smart defaults):

Style Guide (Venue): "Which venue or style guide? Options: NeurIPS, ICML, ICLR, ACL, AAAI, COLM, Nature, or custom." (Default: "Nature")
Target Pages: "What is your target page count?" (Default: 20)
Template Path:
- Prompt: "Do you have a custom LaTeX template? If yes, provide the absolute path to the template file or directory."
- IMPORTANT: Must be an absolute path. If user provides relative path, convert it to absolute path using os.path.abspath() or pathlib.Path.resolve().
- Example: /Users/username/papers/templates/nature.zip or /home/user/templates/custom.tex
- Validation: Path must exist and be absolute
- Default: null
Compile PDF: "Should the paper be compiled to PDF? (yes/no)" (Default: true)
Enable Review: "Enable VLM-based review and iterative improvement? (yes/no)" (Default: true)
Max Review Iterations: "Maximum number of review iterations (if review is enabled):" (Default: 3)
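
One way to apply these defaults is a plain dictionary merge in which user-supplied values win. This is a sketch under assumptions: `OPTIONAL_DEFAULTS` and `apply_optional_defaults` are hypothetical names, and the keys follow the Metadata Structure section rather than any confirmed SDK schema.

```python
from typing import Any

# Defaults as listed for the optional fields in this skill.
OPTIONAL_DEFAULTS: dict[str, Any] = {
    "style_guide": "Nature",
    "target_pages": 20,
    "template_path": None,
    "compile_pdf": True,
    "enable_vlm_review": True,
    "max_review_iterations": 3,
}


def apply_optional_defaults(meta: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of meta with missing optional fields filled in."""
    merged = dict(OPTIONAL_DEFAULTS)
    merged.update(meta)  # user-supplied values override the defaults
    return merged
```

Returning a copy keeps the user's original answers untouched, which makes it easier to show exactly which values were defaulted during the review phase.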

Advanced Options (optional):

Figures:

- Prompt: "Do you have figures to include? For each figure, provide: ID, absolute file path, caption, and description."
- Format: Array of objects with id, file_path (must be absolute), caption, description
- IMPORTANT: All file_path values must be absolute paths. Convert relative paths to absolute using os.path.abspath() or pathlib.Path.resolve().
- Example:
```
[
  {
    "id": "fig:architecture",
    "file_path": "/Users/username/papers/figures/architecture.png",
    "caption": "System architecture diagram",
    "description": "Shows the overall system design"
  }
]
```
- Validation: Each file_path must be absolute and file must exist
- Default: empty array
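
The figure validation rule (absolute path, file exists) can be checked with pathlib alone. A minimal sketch; `check_figures` is a hypothetical helper name, not an EasyPaper function.

```python
from pathlib import Path
from typing import Any


def check_figures(figures: list[dict[str, Any]]) -> list[str]:
    """Return one error string per figure whose file_path is unusable."""
    errors: list[str] = []
    for i, fig in enumerate(figures):
        label = fig.get("id") or f"figures[{i}]"
        raw = fig.get("file_path")
        if not raw:
            errors.append(f"{label}: file_path is missing")
            continue
        path = Path(raw)
        if not path.is_absolute():
            errors.append(f"{label}: file_path is not absolute: {raw}")
        elif not path.is_file():
            errors.append(f"{label}: file does not exist: {raw}")
    return errors
```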

Tables: Array of table objects (Default: empty array)
Code Repository:

- Prompt: "Do you want to include code from a repository? Provide type (local_dir/git_repo) and absolute path or URL."
- For local_dir type: absolute path is required
- For git_repo type: URL is required
- IMPORTANT: If the type is local_dir and the user provides a relative path, convert it to an absolute path using os.path.abspath() or pathlib.Path.resolve().
- Example for local_dir:
```
{
  "type": "local_dir",
  "path": "/Users/username/projects/my_code",
  "on_error": "fallback"
}
```
- Example for git_repo:
```
{
  "type": "git_repo",
  "url": "https://github.com/user/repo.git",
  "ref": "main"
}
```
- Validation: For local_dir, path must be absolute and the directory must exist
- Default: null
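
The per-type requirements for the code repository can be expressed as a small dispatch on the type field. This is an illustrative sketch; `check_code_repository` is a hypothetical name, and treating any other type value as an error is an assumption.

```python
from pathlib import Path
from typing import Any, Optional


def check_code_repository(repo: Optional[dict[str, Any]]) -> list[str]:
    """Validate a code_repository object; an empty list means it is usable."""
    if repo is None:  # default: no repository attached
        return []
    kind = repo.get("type")
    if kind == "local_dir":
        raw = repo.get("path")
        if not raw:
            return ["local_dir requires a path"]
        path = Path(raw)
        if not path.is_absolute():
            return [f"path is not absolute: {raw}"]
        if not path.is_dir():
            return [f"directory does not exist: {raw}"]
        return []
    if kind == "git_repo":
        return [] if repo.get("url") else ["git_repo requires a url"]
    # Assumption: only the two types named in this skill are accepted.
    return [f"unknown repository type: {kind!r}"]
```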

Output Directory:

- Prompt: "Where should the generated paper be saved? Provide an absolute path to the output directory."
- IMPORTANT: Must be an absolute path. If the user provides a relative path, convert it to an absolute path using os.path.abspath() or pathlib.Path.resolve().
- Example: /Users/username/papers/output/my_paper or /home/user/output/output_20250120
- Validation: Path must be absolute (the directory can be created if it does not exist)
- Default: {current_working_directory}/output_{timestamp} (converted to absolute)
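
The default output directory could be built as follows. This is a sketch: `default_output_dir` is a hypothetical helper, and the timestamp format is an assumption inferred from the example name output_20250120, which matches `%Y%m%d`.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_dir(now: Optional[datetime] = None) -> Path:
    """Build {cwd}/output_{timestamp} as an absolute path.

    The %Y%m%d timestamp format is an assumption, not confirmed by the SDK.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d")
    return (Path.cwd() / f"output_{stamp}").resolve()
```

With a date-only stamp, two runs on the same day share a directory; a finer-grained format would be needed if each run must get its own.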

Phase 3: Review and Confirm

Before generating:

- Display summary of all collected metadata
- Verify all paths are absolute: Check that template_path, figures[].file_path, code_repository.path (if local_dir), and output_dir are all absolute paths. Convert any relative paths found.
- Ask "Would you like to modify any field? (yes/no)"
- Optionally save metadata to metadata.json file (with all absolute paths)
- Get final confirmation: "Ready to generate paper? (yes/no)"

Phase 4: Generate Paper

Check environment:
- Ensure easypaper package is installed (use setup-environment skill if needed)
- Check if config file exists or ask user for config path
- IMPORTANT: Config path should be absolute. If relative, convert to absolute.

Import and initialize EasyPaper:

```
from easypaper import EasyPaper, PaperMetaData, PaperGenerationRequest
from pathlib import Path

# Convert config path to absolute if needed
config_path = Path("configs/openrouter.yaml").resolve()  # or user-provided path

# Initialize with config
ep = EasyPaper(config_path=str(config_path))
```

Obtain SDK inputs (prefer loading from file):
- When user has a metadata file (recommended): Parse with request = PaperGenerationRequest.model_validate_json_file(path), then:
  - paper_metadata = request.to_metadata()
  - options = request.to_generate_options()
- Convert relative paths to absolute for template_path, figures[].file_path, code_repository.path (if local_dir), and output_dir.
- When metadata is from interactive collection or a dict: Build PaperMetaData from collected content, and build runtime options for generate().
- examples/meta.json is fully supported: Treat it as a full PaperGenerationRequest example.
- Validate required fields are present.

Generate paper:

```
# Assumes `ep` is the EasyPaper instance created above and that this code
# runs inside an async function (or an environment with top-level await).
from pathlib import Path
from easypaper import PaperGenerationRequest

request = PaperGenerationRequest.model_validate_json_file("metadata.json")
paper_metadata = request.to_metadata()
options = request.to_generate_options()

# Resolve output_dir to absolute if present
if options.get("output_dir"):
    options["output_dir"] = str(Path(options["output_dir"]).resolve())

result = await ep.generate(metadata=paper_metadata, **options)
```

OR use streaming for progress updates:

```
# paper_metadata and options come from the request loaded above
async for event in ep.generate_stream(paper_metadata, **options):
    print(f"{event.get('phase', '')}: {event.get('message', '')}")
```
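
The snippets above use `await` and `async for` at the top level, which only works inside a coroutine or an environment with top-level await. The sketch below shows the same event-consumption loop driven by `asyncio.run`. To stay runnable without the SDK it uses a stand-in generator, `fake_generate_stream`, with made-up events; only the `phase` and `message` keys come from this skill.

```python
import asyncio
from typing import Any, AsyncIterator


async def fake_generate_stream() -> AsyncIterator[dict[str, Any]]:
    """Stand-in for ep.generate_stream(...); the events are invented."""
    for phase, message in [("planning", "outline ready"), ("writing", "draft ready")]:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield {"phase": phase, "message": message}


async def collect_progress(stream: AsyncIterator[dict[str, Any]]) -> list[str]:
    """Print each progress event and return the printed lines."""
    lines: list[str] = []
    async for event in stream:
        line = f"{event.get('phase', '')}: {event.get('message', '')}"
        print(line)
        lines.append(line)
    return lines


if __name__ == "__main__":
    asyncio.run(collect_progress(fake_generate_stream()))
```

In real use, the stand-in would be replaced by the stream returned from the EasyPaper instance.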

Report results:
- Show generation status
- List output files: paper.tex, references.bib, paper.pdf (if compiled)
- Provide absolute file paths
- Show summary: word count, sections generated, etc.
- Select final PDF with strict priority:
  1. result.pdf_path (authoritative)
  2. result.output_path/iteration_*_final/**/*.pdf
  3. latest result.output_path/iteration_* directory PDF
  4. result.output_path/paper.pdf fallback
- If no PDF exists, explicitly report final PDF unavailable and include compile errors.
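
The PDF priority order can be sketched as a single function. This is an illustration, not SDK code: `select_final_pdf` is a hypothetical name, it takes `result.pdf_path` and `result.output_path` as plain arguments, and it interprets "latest" iteration directory as the most recently modified one, which is an assumption.

```python
from pathlib import Path
from typing import Optional


def select_final_pdf(pdf_path: Optional[str], output_path: str) -> Optional[Path]:
    """Pick the final PDF following the four-step priority order."""
    # 1. The path reported by the result is authoritative.
    if pdf_path and Path(pdf_path).is_file():
        return Path(pdf_path)

    out = Path(output_path)

    # 2. Any PDF under an iteration_*_final directory.
    finals = sorted(out.glob("iteration_*_final/**/*.pdf"))
    if finals:
        return finals[-1]

    # 3. A PDF in the most recently modified iteration_* directory
    #    (assumption: "latest" means newest modification time).
    iteration_dirs = sorted(
        (d for d in out.glob("iteration_*") if d.is_dir()),
        key=lambda d: d.stat().st_mtime,
        reverse=True,
    )
    for directory in iteration_dirs:
        pdfs = sorted(directory.rglob("*.pdf"))
        if pdfs:
            return pdfs[-1]

    # 4. Fallback: paper.pdf at the top of the output directory.
    fallback = out / "paper.pdf"
    return fallback if fallback.is_file() else None
```

A return value of `None` corresponds to the "final PDF unavailable" case, at which point the compile errors should be reported.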

Typesetter execution mode:

- Prefer the in-process Typesetter (SDK self-contained) when the peer agent is available.
- Fall back to the HTTP Typesetter endpoint (AGENTSYS_SELF_URL) if the peer is unavailable.

Path Handling Rules

CRITICAL: All paths in metadata must be absolute paths. Follow these rules:

When collecting paths from user:
- Always ask for absolute paths explicitly
- If user provides relative path, convert immediately using:
```
from pathlib import Path
absolute_path = Path(relative_path).resolve()
```
When reading metadata from file:
- After loading JSON, scan for all path fields and convert relative paths to absolute
- Path fields to check: template_path, figures[].file_path, code_repository.path (if local_dir), output_dir
When saving metadata:
- Save with all absolute paths (never save relative paths)
Path fields that must be absolute:
- template_path: LaTeX template file/directory path
- figures[].file_path: Figure image file paths
- code_repository.path: Local code repository directory path (if type is local_dir)
- output_dir: Output directory path
- config_path: EasyPaper configuration file path
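
Applied to a loaded metadata dictionary, the rules above amount to one pass over the listed path fields. A minimal sketch for hand-written metadata: `absolutize_metadata_paths` is a hypothetical helper, and the extra `expanduser()` call (so that `~` is handled) is an addition not mentioned in this skill. config_path is not part of the metadata object, so it is not handled here.

```python
from pathlib import Path
from typing import Any


def _abs(value: str) -> str:
    """Expand ~ and resolve against the current working directory."""
    return str(Path(value).expanduser().resolve())


def absolutize_metadata_paths(meta: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of meta with the listed path fields made absolute."""
    out = dict(meta)

    for key in ("template_path", "output_dir"):
        if out.get(key):
            out[key] = _abs(out[key])

    if out.get("figures"):
        out["figures"] = [
            {**fig, "file_path": _abs(fig["file_path"])} if fig.get("file_path") else dict(fig)
            for fig in out["figures"]
        ]

    # Only local_dir repositories carry a filesystem path.
    repo = out.get("code_repository")
    if repo and repo.get("type") == "local_dir" and repo.get("path"):
        out["code_repository"] = {**repo, "path": _abs(repo["path"])}

    return out
```

Relative paths are resolved against the current working directory, so the conversion should run from the directory the user had in mind when typing them.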

Metadata Structure

The metadata should match the structure in examples/meta.json, but with absolute paths:

```
{
  "title": "...",
  "idea_hypothesis": "...",
  "method": "...",
  "data": "...",
  "experiments": "...",
  "references": [...],
  "style_guide": "Nature",
  "target_pages": 20,
  "template_path": "/absolute/path/to/template.zip",
  "compile_pdf": true,
  "enable_vlm_review": true,
  "max_review_iterations": 3,
  "figures": [
    {
      "id": "fig:example",
      "file_path": "/absolute/path/to/figure.png",
      "caption": "...",
      "description": "..."
    }
  ],
  "tables": [],
  "code_repository": {
    "type": "local_dir",
    "path": "/absolute/path/to/code",
    "on_error": "fallback"
  },
  "output_dir": "/absolute/path/to/output"
}
```

Alternative: Generate Metadata from a Materials Folder

Choosing between SDK one-shot and Claude-driven interactive build: If you want Claude Code to walk the folder with its own tools and co-author the metadata in conversation (with cross-skill help from paperhub-*, pdf, exa-search, sequential-thinking), use the dedicated interactive-metadata-build skill (slash command /easypaper-metadata-build) instead. The interactive skill produces the same PaperMetaData JSON shape, so this skill consumes its output without changes.

The section below describes the SDK one-shot path: a single Python call that does everything autonomously, no user dialogue. Best for batch / CI / regression eval.

When the user has a folder of research materials (code, data, PDFs, images, notes, BibTeX) instead of a ready-made metadata.json, EasyPaper can scan the folder and synthesize PaperMetaData automatically.

When to use

- User says "I have a project folder / experiment directory" instead of providing structured metadata.
- Folder typically contains a mix of .py, .csv, .json, .md, .tex, .bib, .png, .pdf, etc.

Quick path

```
from easypaper import EasyPaper
from pathlib import Path

ep = EasyPaper(config_path=str(Path("configs/openrouter.yaml").resolve()))

# Step 1: Scan folder → PaperMetaData (with prose, figures, tables, venue inference)
metadata = await ep.generate_metadata_from_folder(
    str(Path("path/to/materials").resolve()),
    max_figures=12,              # hard cap; LLM picks best subset if exceeded
    max_tables=12,
    vision_enrich_figures=True,  # default True; uses vision model per figure
    # vision_model="gpt-4o",     # defaults to the config model
    # max_vision_figures=8,      # optional: cap vision API calls
)

# Step 2: Generate paper from the synthesized metadata
result = await ep.generate(metadata, compile_pdf=True)
```

What the pipeline does

1. Scan & classify: FolderScanner walks the directory and classifies files by extension (IMAGE, PDF, BIB, TEXT, CODE, DATA, CONFIG).
2. Extract fragments: Each file type has a dedicated extractor (ImageExtractor, DataExtractor, PDFExtractor with LLM, etc.) producing ExtractedFragment objects. All paths are stored as POSIX paths relative to materials_root.
3. Synthesize prose: MetadataSynthesizer (one LLM call) merges fragments into the four prose fields (idea_hypothesis, method, data, experiments) plus title, and infers style_guide, target_pages, and per-asset section placement.
4. Deduplicate & curate: Figures and tables are deduplicated by path; if the count exceeds max_figures / max_tables, an LLM curator selects the minimal supporting subset (rule-based fallback if no LLM is available).
5. Vision enrichment: For each retained figure (after curation), a vision model receives a downscaled JPEG and writes a 2–4 sentence description, replacing the raw "Image file: …" placeholder. Results are cached on disk by content hash (~/.cache/easypaper/figure_vision/ or EASYPAPER_CACHE_DIR).

Key parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `max_figures` | `None` (no cap) | Hard upper bound on figure count |
| `max_tables` | `None` (no cap) | Hard upper bound on table count |
| `vision_enrich_figures` | `True` | Run vision model on each retained figure |
| `vision_model` | same as `model_name` | Multimodal model id (e.g. `gpt-4o`) |
| `max_vision_figures` | `None` (all) | Max vision API calls |
| `max_vision_long_edge` | `896` | Downscale longest image edge before JPEG encode |
| `vision_cache_dir` | auto | Disk cache for vision descriptions |

Path handling difference

Unlike hand-written metadata (where all paths must be absolute), the folder pipeline sets metadata.materials_root to the folder's absolute path and stores figures[].file_path / tables[].file_path as relative POSIX paths. Downstream generation resolves them via materials_root. Do not manually convert these to absolute paths.
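
The idea of resolving a stored relative POSIX path against materials_root can be illustrated as follows. The SDK's own resolution logic is not shown in this skill, so this is only a sketch of the concept; `resolve_asset_path` is a hypothetical name.

```python
from pathlib import Path, PurePosixPath


def resolve_asset_path(materials_root: str, file_path: str) -> Path:
    """Join a stored POSIX-relative asset path onto materials_root."""
    relative = PurePosixPath(file_path)
    if relative.is_absolute():
        # Hand-written metadata already uses absolute paths.
        return Path(file_path)
    return Path(materials_root).joinpath(*relative.parts)
```

Because the stored paths stay relative, the whole materials folder can be moved and only materials_root needs to change.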

Cost control

- Vision runs only on figures that survived curation, not on every image in the folder.
- Use max_vision_figures to hard-cap the number of vision API calls.
- Cached descriptions (keyed by file content SHA-256) prevent repeated billing across runs.
- vision_enrich_figures=False disables vision entirely (useful for tests or non-multimodal models).
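
The content-hash caching idea can be sketched with hashlib. This illustrates the technique only: `content_key` and `cached_description` are hypothetical names, the one-text-file-per-hash layout is an assumption, and `describe` stands in for the billable vision call.

```python
import hashlib
from pathlib import Path
from typing import Callable


def content_key(image_path: Path) -> str:
    """SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(image_path.read_bytes()).hexdigest()


def cached_description(
    image_path: Path, cache_dir: Path, describe: Callable[[Path], str]
) -> str:
    """Return a cached description, calling describe() only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: one UTF-8 text file per content hash.
    entry = cache_dir / f"{content_key(image_path)}.txt"
    if entry.is_file():
        return entry.read_text(encoding="utf-8")
    text = describe(image_path)  # the billable vision call would happen here
    entry.write_text(text, encoding="utf-8")
    return text
```

Keying on file content rather than file name means a renamed or copied figure still hits the cache, while an edited figure gets a fresh description.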

Best Practices

- Progressive disclosure: Start with required fields, then optional, then advanced
- Examples: Always provide examples when asking for input (use absolute paths in examples)
- Validation: Validate each field immediately and ask for correction if invalid
- Path conversion: Always convert relative paths to absolute paths immediately upon collection
- Reference: Always reference examples/meta.json when users ask about structure (but note paths should be absolute). Treat it as a full PaperGenerationRequest sample and use to_metadata() + to_generate_options().
- Final artifact selection: Always prefer result.pdf_path; never assume the first PDF in output dir is final.
- Flexibility: Allow users to provide metadata in different formats (paste full JSON, or answer questions)
- Direct import: Use from easypaper import EasyPaper, PaperMetaData, PaperGenerationRequest directly - no API calls needed
- Config handling: Ask user for config path if not found, or use default configs/dev.yaml (convert to absolute)

Error Handling

- If the easypaper package is not installed, use the setup-environment skill
- If the config file is missing, ask the user for its path or create a default (ensure an absolute path)
- If metadata validation fails, show the specific errors and ask for corrections
- If path validation fails (file/directory doesn't exist), show a clear error and ask for the correct absolute path
- If generation fails, show the error message and suggest fixes
- If a relative path is detected, automatically convert it to absolute and inform the user
