Skill

import-bibtex

Full BibTeX import and abstract enrichment pipeline running entirely in Cowork. Parses BibTeX files, creates research libraries, enriches missing abstracts via OpenRouter/Gemini, deduplicates, and inserts structured citations into Supabase. Use this skill whenever the user says "import my BibTeX," "upload my bibliography," "import this into my research library," "process this BibTeX," "add these papers," "start a new research project" with an attached file, "load my references," or any request to import academic references from a BibTeX file into the research pipeline. Also trigger when a .bib file is uploaded or when the user mentions BibTeX in the context of getting started with the research pipeline.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/research-pipeline:import-bibtex

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Parse, enrich, deduplicate, and load a BibTeX file into the research pipeline — entirely

SKILL.md

234 lines · ~1.9k tokens

Stats

LanguagePython

Parent stars0

MaintenanceGood

Last CommitMay 12, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

Import BibTeX

Parse, enrich, deduplicate, and load a BibTeX file into the research pipeline — entirely within Cowork.

Prerequisites

Supabase MCP — for creating libraries and inserting citations
Google Drive MCP — for uploading the annotated bibliography
Rube MCP — for calling OpenRouter API (abstract enrichment via Gemini)
If Rube is unavailable, use WebFetch against the OpenRouter API directly

Step 1: Receive the File

The user will either:

Upload a .bib file directly in the conversation
Paste BibTeX content as text
Reference a file path

If the file is uploaded, read it from the uploads path. If pasted, capture the full text.

Step 2: Create or Select Library

Ask the user what to name this research project, or let them pick an existing library.

New library:

INSERT INTO research_libraries (name, description, metadata)
VALUES ('{name}', '{description}', '{"created_by": "dorian"}'::jsonb)
RETURNING id, name

Existing library:

SELECT id, name FROM research_libraries ORDER BY created_at DESC

Store the library_id for all subsequent operations.

Step 3: Parse BibTeX

Use the bundled parser script at references/bibtex-parser.py:

python3 ${CLAUDE_PLUGIN_ROOT}/skills/research-pipeline/references/bibtex-parser.py /path/to/input.bib /tmp/parsed_citations.json

Or parse inline if the file is small (<50 entries):

Split the content on @type{ patterns. For each entry extract:

citation_key — the key after @type{
entry_type — article, misc, inproceedings, etc.
title, authors, year, journal, abstract, doi, arxiv_id, url
bibtex_raw — the original entry text

Deduplicate by citation_key + abstract signature. Report to the user: "Found [X] entries, [Y] unique after deduplication, [Z] already have abstracts."

Step 4: Enrich Missing Abstracts

For entries without meaningful abstracts (empty, placeholder text, or missing entirely):

Batch Strategy

Group entries into batches of 10-15 for efficient LLM processing. For each batch, construct a prompt and send to OpenRouter via Rube MCP:

Via Rube MCP (preferred):

Use RUBE_REMOTE_WORKBENCH or RUBE_MULTI_EXECUTE_TOOL to call the OpenRouter chat completions endpoint:

POST https://openrouter.ai/api/v1/chat/completions
Authorization: Bearer {OPENROUTER_API_KEY}
Content-Type: application/json

{
  "model": "google/gemini-3-pro-preview",
  "messages": [
    {"role": "system", "content": "...enrichment prompt..."},
    {"role": "user", "content": "...batch of entries..."}
  ]
}

Via direct LLM call (fallback):

If Rube is unavailable, process entries directly in conversation. For each entry missing an abstract:

Check if it has a DOI or URL
If DOI exists, try to fetch metadata from CrossRef: https://api.crossref.org/works/{doi}
If the API returns an abstract, use it directly
If no API abstract available, generate a 2-4 sentence academic abstract based on the title, authors, journal, and any other available metadata

Enrichment Prompt

For each batch sent to the LLM:

You are processing BibTeX entries that need abstracts. For each entry:

1. If a meaningful abstract exists, preserve it exactly
2. If the abstract is missing or placeholder text:
   a. If a DOI or URL is provided, describe what the paper likely covers based on
      the title, authors, journal, and year
   b. Generate a 2-4 sentence academic abstract that:
      - States the central aim or problem
      - Synthesizes likely main arguments or methods
      - Highlights potential implications
   c. Mark generated abstracts with [AI-generated] prefix

Output each entry as JSON with fields: citation_key, abstract, abstract_source
where abstract_source is "original", "crossref", or "generated"

Progress Reporting

After each batch completes, report progress: "Enriched batch [N] of [M] — [X] abstracts found via API, [Y] generated, [Z] preserved."

Step 5: Insert into Supabase

For each parsed and enriched citation, insert into the citations table:

INSERT INTO citations (
  library_id, citation_key, entry_type, title, authors, year,
  journal, abstract, doi, arxiv_id, url, bibtex_raw,
  verification_status, source
) VALUES (
  {library_id}, '{citation_key}', '{entry_type}', '{title}', '{authors}',
  {year}, '{journal}', '{abstract}', '{doi}', '{arxiv_id}', '{url}',
  '{bibtex_raw}', 'unverified', 'bibtex_import'
)
ON CONFLICT (library_id, citation_key) DO UPDATE SET
  abstract = EXCLUDED.abstract,
  updated_at = now()

Use ON CONFLICT ... DO UPDATE so re-imports update abstracts without creating duplicates.

Process in batches of 50 for efficiency.

Step 6: Upload to Google Drive

Generate an annotated bibliography markdown file:

# Annotated Bibliography: {library_name}
Generated: {date}
Total citations: {count}

## Citations

### {citation_key_1}
**{title}** ({year})
{authors}
*{journal}*
DOI: {doi}

{abstract}

---

### {citation_key_2}
...

Upload to Google Drive in the user's preferred folder. Store the Google Drive link.

Step 7: Summary Report

Present the final results:

BibTeX Import Complete
══════════════════════

Library: {name} (ID: {library_id})
Total entries parsed:    {total}
Duplicates removed:      {dupes}
Citations inserted:      {inserted}
  ├─ With original abstracts:  {original_count}
  ├─ Abstracts from CrossRef:  {crossref_count}
  ├─ AI-generated abstracts:   {generated_count}
  └─ No abstract available:    {no_abstract_count}

Annotated Bibliography: {google_drive_link}

Next steps:
  → "Verify my citations" — run 4-layer reference validation
  → "Find more papers" — discover related literature
  → "Synthesize my research" — build thematic maps + perspective

Handling Large Files (200+ entries)

For large BibTeX files:

Parse all entries first (this is fast, no LLM needed)
Insert all entries into Supabase immediately with whatever abstracts exist
Then enrich missing abstracts in background batches
Update Supabase as each batch completes

This way the user has their library available immediately and can start synthesis while enrichment continues. Report progress periodically: "Enrichment progress: [X]/[Y] entries processed..."

Error Handling

If a BibTeX entry can't be parsed (malformed braces, etc.), skip it and log the citation key in a "skipped entries" list
If OpenRouter/Gemini is unavailable, insert entries without enrichment and mark them for later processing
If Supabase insert fails for a specific entry, log it and continue with the rest
Always report what worked and what didn't at the end

import-bibtex

Invocation

Context Preview

SKILL.md

import-bibtex

Invocation

Context Preview

SKILL.md

Import BibTeX

Prerequisites

Step 1: Receive the File

Step 2: Create or Select Library

Step 3: Parse BibTeX

Step 4: Enrich Missing Abstracts

Batch Strategy

Enrichment Prompt

Progress Reporting

Step 5: Insert into Supabase

Step 6: Upload to Google Drive

Step 7: Summary Report

Handling Large Files (200+ entries)

Error Handling

Similar Skills

Import BibTeX

Prerequisites

Step 1: Receive the File

Step 2: Create or Select Library

Step 3: Parse BibTeX

Step 4: Enrich Missing Abstracts

Batch Strategy

Enrichment Prompt

Progress Reporting

Step 5: Insert into Supabase

Step 6: Upload to Google Drive

Step 7: Summary Report

Handling Large Files (200+ entries)

Error Handling

Similar Skills