From antigravity-awesome-skills
Token-efficient web research protocol that prioritizes minimal fetching. Guides API-based GitHub repo analysis, layered URL scraping, and search queries.
Slash command
/antigravity-awesome-skills:efficient-web-research
A protocol for accessing web content in the most token-efficient, accurate, and structured way — using the right tool at the right depth, and stopping as soon as the question is answerable.
Fetch the minimum needed to answer. Skim before you dive. Stop when you can answer.
Every unnecessary fetch wastes tokens and adds noise. This skill enforces a layered approach where you escalate fetch depth only when shallower layers fail.
Before fetching anything, identify what kind of input you received:
| Input Type | Example | Go To |
|---|---|---|
| GitHub repo URL | github.com/user/repo | GitHub Protocol |
| Specific page URL | docs.python.org/3/library/os | URL Protocol |
| Topic / query (no URL) | "how does RAFT consensus work" | Search Protocol |
| Multiple URLs | List of links | Multi-URL Protocol |
| PDF / file link | .pdf, .txt, .md URL | File Protocol |
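The routing table above can be sketched as a small classifier. This is only an illustration: the function names, the URL-detection regex, and the extension list are assumptions, and the heuristic treats a token as a URL only when it has a scheme or a path.

```python
import re
from urllib.parse import urlparse

# Assumed extension list, taken from the file types this document mentions.
FILE_EXTENSIONS = (".pdf", ".txt", ".md", ".csv", ".json", ".yaml")

# Heuristic: dotted host, optional scheme, optional path.
URL_TOKEN = re.compile(r"^(https?://)?[\w-]+(\.[\w-]+)+(/\S*)?$", re.IGNORECASE)


def find_urls(text: str) -> list:
    """Return whitespace-separated tokens that look like URLs."""
    urls = []
    for token in text.split():
        token = token.strip(",;()<>\"'")
        has_scheme = token.lower().startswith(("http://", "https://"))
        # Require a scheme or a path so words like "node.js" are not URLs.
        if URL_TOKEN.match(token) and (has_scheme or "/" in token):
            urls.append(token)
    return urls


def classify_input(text: str) -> str:
    """Map raw user input to one of the protocol names in the table."""
    urls = find_urls(text)
    if not urls:
        return "Search Protocol"
    if len(urls) > 1:
        return "Multi-URL Protocol"
    url = urls[0]
    parsed = urlparse(url if "://" in url else "https://" + url)
    if parsed.netloc.lower() in ("github.com", "www.github.com"):
        return "GitHub Protocol"
    if parsed.path.lower().endswith(FILE_EXTENSIONS):
        return "File Protocol"
    return "URL Protocol"
```

GitHub is checked before the file extension so that a `blob` link to a `.md` file still goes through the API route.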
Use when the input is a GitHub URL (repo, file, PR, issue, etc.)
github.com/{owner}/{repo} → Repo root
github.com/{owner}/{repo}/tree/{branch} → Directory
github.com/{owner}/{repo}/blob/{branch}/{path} → Single file
github.com/{owner}/{repo}/issues/{n} → Issue
github.com/{owner}/{repo}/pull/{n} → Pull request
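The five URL shapes above could be parsed with one regular expression. This is a minimal sketch: `parse_github_url` and the `kind` labels are hypothetical names, and branch names that contain a slash cannot be separated from the path this way.

```python
import re
from typing import Optional

GITHUB_URL = re.compile(
    r"^(?:https?://)?(?:www\.)?github\.com/"
    r"(?P<owner>[^/]+)/(?P<repo>[^/]+)"
    r"(?:/(?P<kind>tree|blob|issues|pull)/(?P<rest>.+))?/?$"
)

# Labels for the URL shapes; the names are illustrative only.
KIND_NAMES = {
    None: "repo",
    "tree": "directory",
    "blob": "file",
    "issues": "issue",
    "pull": "pull_request",
}


def parse_github_url(url: str) -> Optional[dict]:
    """Split a GitHub URL into its parts, or return None if unrecognized."""
    match = GITHUB_URL.match(url.strip())
    if match is None:
        return None
    kind = match.group("kind")
    info = {
        "owner": match.group("owner"),
        "repo": match.group("repo"),
        "kind": KIND_NAMES[kind],
    }
    rest = match.group("rest")
    if kind in ("tree", "blob"):
        # Assumes the branch is the first path segment (no slashes in it).
        branch, _, path = rest.partition("/")
        info["branch"] = branch
        info["path"] = path
    elif kind in ("issues", "pull"):
        number = rest.split("/")[0]
        if not number.isdigit():
            return None
        info["number"] = int(number)
    return info
```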
Always prefer the GitHub API. It returns clean JSON — no HTML parsing needed.
# Repo metadata (name, description, language, stars, topics)
GET https://api.github.com/repos/{owner}/{repo}
# File tree (see what files exist — very cheap)
GET https://api.github.com/repos/{owner}/{repo}/git/trees/{ref}?recursive=1
# Single file content (base64 encoded)
GET https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={ref}
# README only (usually enough to understand the repo)
GET https://api.github.com/repos/{owner}/{repo}/readme
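A small sketch of how these endpoints might be assembled and how a contents response could be decoded. The helper names are assumptions. The decoder relies on the contents endpoint returning a JSON object with `content` and `encoding` fields, and no request is actually sent here.

```python
import base64
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def repo_url(owner: str, repo: str) -> str:
    """Repo metadata endpoint."""
    return f"{API_ROOT}/repos/{owner}/{repo}"


def tree_url(owner: str, repo: str, ref: str) -> str:
    """Recursive file tree for a branch, tag, or commit SHA."""
    return f"{repo_url(owner, repo)}/git/trees/{ref}?recursive=1"


def contents_url(owner: str, repo: str, path: str, ref: str) -> str:
    """Single file content; the path is percent-encoded but keeps slashes."""
    return f"{repo_url(owner, repo)}/contents/{quote(path)}?ref={quote(ref, safe='')}"


def readme_url(owner: str, repo: str) -> str:
    """README endpoint."""
    return f"{repo_url(owner, repo)}/readme"


def decode_content(payload: dict) -> str:
    """Decode the base64 `content` field of a contents or README response."""
    if payload.get("encoding") != "base64":
        raise ValueError("expected a base64-encoded payload")
    # b64decode skips the newlines that wrap the encoded text.
    return base64.b64decode(payload["content"]).decode("utf-8")
```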
Layer 1 (always do first):
→ Fetch repo metadata + README only
→ Can you answer the user's question now? YES → STOP. NO → continue.
Layer 2 (only if needed):
→ Fetch file tree to understand structure
→ Identify the 1-3 most relevant files based on the question
→ Can you answer now? YES → STOP. NO → continue.
Layer 3 (last resort):
→ Fetch specific relevant files only (never fetch all files)
→ Prioritize: main entry point, config files, key modules
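The escalation pattern in the three layers above can be expressed as a loop that stops at the first layer able to answer. This is a sketch under assumptions: `run_layers`, the layer labels, and the stand-in fetchers are hypothetical, and in practice the "can you answer?" check is a judgment call rather than a string match.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Layer = Tuple[str, Callable[[], str]]


def run_layers(
    layers: Sequence[Layer],
    can_answer: Callable[[List[str]], bool],
) -> Tuple[Optional[str], List[str]]:
    """Run fetch layers in order and stop once the question is answerable."""
    context: List[str] = []
    for name, fetch in layers:
        context.append(fetch())
        if can_answer(context):
            return name, context  # deeper layers are never fetched
    return None, context


# Usage with stand-in fetchers (no network access).
calls: List[str] = []


def stub(name: str, text: str) -> Callable[[], str]:
    def fetch() -> str:
        calls.append(name)
        return text
    return fetch


layers = [
    ("layer1_metadata_readme", stub("layer1", "README: install with pip install example")),
    ("layer2_file_tree", stub("layer2", "src/main.py\nsetup.cfg")),
    ("layer3_specific_files", stub("layer3", "def main(): ...")),
]
stopped_at, context = run_layers(layers, lambda ctx: any("install" in t for t in ctx))
```

Because each fetcher is only called when the loop reaches it, a question answered by the README costs one layer's worth of tokens.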
Use when the user gives a specific non-GitHub URL (docs, articles, blogs, etc.)
| Site type | Likely works with | Notes |
|---|---|---|
| Static docs / MDN / ReadTheDocs | read_url_content | Fast, clean, cheap |
| News articles / blogs | read_url_content | Usually fine |
| SPAs / React/Next.js apps | browser_subagent | JS-rendered |
| Auth-gated pages | browser_subagent | Needs login |
| Raw GitHub files (raw.githubusercontent) | read_url_content | Direct text |
Layer 1 — Skim
→ Fetch the URL with read_url_content
→ Read only headings (H1, H2, H3) and first paragraph
→ Does this page contain what the user needs? NO → try a different URL or search. YES → continue.
Layer 2 — Targeted Extract
→ If the page has anchor links (e.g. /docs/page#section), fetch with the anchor
→ Extract only the relevant section (200–500 tokens max)
→ Can you answer? YES → STOP.
Layer 3 — Full Fetch
→ Fetch full page, strip boilerplate (nav, footer, ads, cookie banners, sidebars)
→ Cap at 2000 tokens. Summarize before passing to answer.
Layer 4 — Browser Subagent (last resort only)
→ Use ONLY if read_url_content returns empty, garbled, or JS-placeholder content
→ Instruct subagent: "Navigate to [URL], wait for content to load, extract [specific section]"
→ Do NOT use browser_subagent for static pages — it's expensive
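The skim, cap, and fallback checks in the layers above might look like the following once a page has been fetched as Markdown text. Everything here is an assumption made for illustration: the helper names, the 4-characters-per-token estimate, the 50-character threshold, and the marker phrases. `read_url_content` and `browser_subagent` are not called.

```python
import re

# H1-H3 headings in Markdown; H4 and deeper are ignored.
HEADING = re.compile(r"^(#{1,3})[ \t]+(.*\S)[ \t]*$", re.MULTILINE)

# Assumed phrases that often indicate a JS-only placeholder page.
JS_MARKERS = ("enable javascript", "javascript is required", "javascript is disabled")


def skim(markdown: str) -> dict:
    """Layer 1: keep only the headings and the first paragraph."""
    headings = [m.group(2) for m in HEADING.finditer(markdown)]
    first_paragraph = ""
    # Assumes paragraphs are separated by blank lines.
    for block in re.split(r"\n\s*\n", markdown):
        block = block.strip()
        if block and not block.startswith("#"):
            first_paragraph = block
            break
    return {"headings": headings, "first_paragraph": first_paragraph}


def cap_tokens(text: str, max_tokens: int = 2000, chars_per_token: int = 4) -> str:
    """Layer 3: truncate to a rough token budget (character-based estimate)."""
    return text[: max_tokens * chars_per_token]


def needs_browser(text: str, min_chars: int = 50) -> bool:
    """Layer 4 gate: True when the fetched text is empty or a JS placeholder."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return True
    lowered = stripped.lower()
    return any(marker in lowered for marker in JS_MARKERS)
```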
Always remove before using fetched content:
Extract and keep:
Use when the user gives a topic, question, or query — not a specific URL.
Do NOT search the raw user query. Transform it first:
Raw: "how to deploy fastapi on aws"
Sharpened: "fastapi AWS deployment tutorial 2024"
Raw: "python async vs threads"
Sharpened: "Python asyncio vs threading performance comparison"
Raw: "best way to structure react project"
Sharpened: "React project folder structure best practices"
Query sharpening rules:
1. Run search_web with the sharpened query
2. Get results (titles + snippets)
3. Scan titles + snippets ONLY — do not fetch yet
4. Pick the TOP 1-2 most relevant results (max 3 in complex cases)
5. Skip results from: forums (if docs exist), aggregator blogs, paywalled sites
6. Prefer: official docs, GitHub repos, well-known tech blogs, academic sources
Apply the URL Protocol (above) to each selected URL. Process results one at a time — only fetch the second URL if the first didn't answer the question.
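Steps 3 to 6 and the one-at-a-time rule can be sketched as a filter, a ranking, and a sequential loop. The host lists, the scoring weights, and all names below are hypothetical. The sketch also simplifies "skip forums (if docs exist)" to "always skip the listed hosts".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple
from urllib.parse import urlparse


@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str


# Hypothetical example lists; tune them for the topic at hand.
PREFERRED_DOMAINS = ("github.com", "docs.python.org", "developer.mozilla.org")
SKIPPED_DOMAINS = ("reddit.com", "quora.com")


def host_of(url: str) -> str:
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def in_domains(host: str, domains: Sequence[str]) -> bool:
    return any(host == d or host.endswith("." + d) for d in domains)


def score(result: SearchResult, terms: Sequence[str]) -> int:
    """Count query terms in title + snippet, with a bonus for preferred hosts."""
    text = f"{result.title} {result.snippet}".lower()
    hits = sum(term in text for term in terms)
    bonus = 2 if in_domains(host_of(result.url), PREFERRED_DOMAINS) else 0
    return hits + bonus


def pick_results(results: List[SearchResult], query: str, limit: int = 2) -> List[SearchResult]:
    """Scan titles and snippets only; return at most 3 results to fetch."""
    terms = query.lower().split()
    kept = [r for r in results if not in_domains(host_of(r.url), SKIPPED_DOMAINS)]
    ranked = sorted(kept, key=lambda r: score(r, terms), reverse=True)
    return ranked[: min(limit, 3)]


def answer_sequentially(
    picked: Sequence[SearchResult],
    fetch: Callable[[str], str],
    can_answer: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Fetch one result at a time; stop at the first that answers."""
    for result in picked:
        text = fetch(result.url)
        if can_answer(text):
            return result.url, text
    return None
```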
Use when the user provides a list of URLs to compare or summarize.
1. Skim all URLs first (Layer 1 fetch for each)
2. Group by relevance to the user's question
3. Deep-fetch only the most relevant 1-3 URLs
4. Summarize each in 3-5 sentences before combining
5. Never dump raw content from multiple pages — always summarize per-source first
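The five steps above could be wired together as follows. This is a sketch with injected callables: `skim`, `deep_fetch`, and `summarize` are placeholders for whatever tools are available, and the keyword-count relevance measure is an assumption.

```python
from typing import Callable, List, Sequence, Tuple


def research_urls(
    urls: Sequence[str],
    question_terms: Sequence[str],
    skim: Callable[[str], str],
    deep_fetch: Callable[[str], str],
    summarize: Callable[[str], str],
    max_deep: int = 3,
) -> List[Tuple[str, str]]:
    """Skim every URL, deep-fetch the most relevant, summarize per source."""
    # Step 1: cheap skim of every URL.
    skims = {url: skim(url).lower() for url in urls}

    # Step 2: rank by how many question terms appear in the skim.
    def relevance(url: str) -> int:
        return sum(term.lower() in skims[url] for term in question_terms)

    ranked = sorted(urls, key=relevance, reverse=True)

    # Step 3: deep-fetch only the top few, skipping irrelevant pages.
    chosen = [url for url in ranked[:max_deep] if relevance(url) > 0]

    # Steps 4-5: one summary per source, never raw page content.
    return [(url, summarize(deep_fetch(url))) for url in chosen]
```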
Use when the URL points directly to a file (PDF, .txt, .md, .csv, etc.)
.md / .txt / .csv → read_url_content works directly, read full content
.pdf → Use browser_subagent or a PDF extraction tool; extract text only
.json / .yaml → read_url_content, parse structure, summarize schema + key values
| Anti-pattern | Why it's bad | Do this instead |
|---|---|---|
| Fetching full page for a simple fact | Wastes thousands of tokens | Use snippet or targeted anchor |
| Using browser_subagent for static sites | Very expensive | Use read_url_content first |
| Searching with the raw user query | Vague results | Sharpen query first |
| Fetching 5+ search results | Token explosion | Max 3, stop when answered |
| Dumping raw HTML into context | Noisy, wasteful | Always strip to Markdown |
| Fetching "just in case" | Unnecessary tokens | Only fetch what's needed to answer |
| Re-fetching the same URL | Redundant | Cache result in context, reuse |
| Fetching entire GitHub repo | Extremely wasteful | README + targeted files only |
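For the "re-fetching the same URL" row, a small in-memory cache is one way to reuse results. This is an illustrative sketch: `FetchCache` is a hypothetical helper, and treating URLs that differ only by `#fragment` as the same page is an assumption that suits static pages.

```python
from typing import Callable, Dict
from urllib.parse import urldefrag


class FetchCache:
    """Wrap a fetch function so each page is fetched at most once."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._store: Dict[str, str] = {}

    def get(self, url: str) -> str:
        # The fragment is not sent to the server, so drop it from the key.
        key = urldefrag(url).url
        if key not in self._store:
            self._store[key] = self._fetch(key)
        return self._store[key]
```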
Input received
│
├─ GitHub URL?
│ ├─ Fetch README + metadata via API
│ ├─ Answered? → STOP
│ ├─ Need more? → Fetch file tree, pick 1-3 files
│ └─ Still need more? → Fetch specific files only
│
├─ Specific URL?
│ ├─ Try read_url_content → skim headings
│ ├─ Answered? → STOP
│ ├─ Need more? → Targeted section fetch
│ ├─ Still need more? → Full fetch, stripped
│ └─ JS-rendered / broken? → browser_subagent (last resort)
│
├─ Topic/query?
│ ├─ Sharpen query
│ ├─ search_web → scan snippets
│ ├─ Snippet enough? → Answer from snippet, STOP
│ ├─ Need more? → Fetch top 1 result (targeted)
│ └─ Still need more? → Fetch the second result (targeted)
│
└─ List of URLs?
├─ Skim all (Layer 1 each)
├─ Deep fetch top 1-3 relevant ones
└─ Summarize per-source, then combine
After fetching, structure your response as:
Source: [URL or "Web search for: query"]
Summary: [2-5 sentences of what was found]
Answer: [Direct answer to user's question]
Confidence: [High / Medium / Low — based on source quality]
For multiple sources:
Source 1: ...
Source 2: ...
Combined Answer: ...
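One way to keep responses in this shape is to render them from a small data structure. The `Finding` class and `render` function are hypothetical. The numbered per-source summary lines and the confidence line in the multi-source case are assumptions, since the template above leaves those details open.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_LEVELS = ("High", "Medium", "Low")


@dataclass
class Finding:
    source: str   # URL, or 'Web search for: query'
    summary: str  # 2-5 sentences of what was found


def render(findings: List[Finding], answer: str, confidence: str) -> str:
    """Format findings using the single-source or multi-source layout."""
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"confidence must be one of {CONFIDENCE_LEVELS}")
    if not findings:
        raise ValueError("at least one finding is required")
    if len(findings) == 1:
        only = findings[0]
        lines = [
            f"Source: {only.source}",
            f"Summary: {only.summary}",
            f"Answer: {answer}",
        ]
    else:
        lines = []
        for index, finding in enumerate(findings, start=1):
            lines.append(f"Source {index}: {finding.source}")
            lines.append(f"Summary {index}: {finding.summary}")
        lines.append(f"Combined Answer: {answer}")
    lines.append(f"Confidence: {confidence}")
    return "\n".join(lines)
```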
Never output:
| Operation | Approximate token cost | When to use |
|---|---|---|
| GitHub README fetch | ~300–800 tokens | Always first for repos |
| GitHub API metadata | ~200 tokens | Always for repos |
| Skim (headings only) | ~100–200 tokens | Always first for URLs |
| Targeted section fetch | ~300–600 tokens | When skim isn't enough |
| Full page fetch (stripped) | ~1000–2000 tokens | Only when targeted fails |
| browser_subagent | ~2000–5000 tokens | Last resort only |
| Search snippet scan | ~300–500 tokens | Always before fetching |
Rule of thumb: If you're about to spend >2000 tokens on a fetch, ask yourself if there's a cheaper path first.
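The cost table and the rule of thumb can be turned into a quick pre-flight check. The operation keys and function names below are made up for illustration; the numbers are the approximate ranges from the table.

```python
from typing import Dict, Sequence, Tuple

# (low, high) approximate token cost per operation, from the table above.
TOKEN_COST: Dict[str, Tuple[int, int]] = {
    "github_readme": (300, 800),
    "github_metadata": (200, 200),
    "skim": (100, 200),
    "targeted_section": (300, 600),
    "full_page": (1000, 2000),
    "browser_subagent": (2000, 5000),
    "search_snippets": (300, 500),
}


def estimate(plan: Sequence[str]) -> Tuple[int, int]:
    """Return the (low, high) total token estimate for a list of operations."""
    low = sum(TOKEN_COST[op][0] for op in plan)
    high = sum(TOKEN_COST[op][1] for op in plan)
    return low, high


def should_reconsider(op: str, threshold: int = 2000) -> bool:
    """True when an operation may cost more than the rule-of-thumb limit."""
    return TOKEN_COST[op][1] > threshold
```

For example, `estimate(["github_metadata", "github_readme"])` gives the expected range for Layer 1 of the GitHub Protocol, and `should_reconsider("browser_subagent")` flags the one operation that can exceed 2000 tokens.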
browser_subagent for these, which is slower and more expensive.