XHS Browsing

This skill should be used when the user asks to "browse xiaohongshu", "scrape xiaohongshu", "get news from xiaohongshu", "extract xiaohongshu posts", "浏览小红书" (browse Xiaohongshu), "小红书新闻" (Xiaohongshu news), or "小红书热点" (Xiaohongshu trending topics), or needs to automate content extraction from xiaohongshu.com (小红书). It provides the workflow, page structure knowledge, and JavaScript extraction patterns for reliably browsing and extracting post content from xiaohongshu.com using agent-browser.

Invocation

Slash command

/xhs-news:xhs-browsing

Supporting Files

references/extract-scripts.md

SKILL.md

134 lines · ~1.2k tokens

Stats

Parent stars: 0

Maintenance: Fair

Last commit: Feb 17, 2026

XHS Browsing Skill

Automate browsing xiaohongshu.com to search for topics, extract post content, and compile structured summaries using agent-browser.

Prerequisites

The agent-browser plugin must be installed and available
A saved session state file OR willingness to log in manually in headed mode

Core Workflow

1. Session Management

First-time login:

Open xiaohongshu in headed mode with a named session:

agent-browser --headed --session-name xiaohongshu open https://www.xiaohongshu.com/explore

Inform the user: "Please log in to xiaohongshu in the browser window. Tell me when done."

After login confirmation, save state:

agent-browser --session-name xiaohongshu state save xiaohongshu-auth.json

Returning sessions:

Load saved state and open the site:

agent-browser --session-name xiaohongshu state load xiaohongshu-auth.json
agent-browser --session-name xiaohongshu open https://www.xiaohongshu.com/explore

If the page shows a login prompt, fall back to the first-time login flow.

2. Search for Topics

Navigate to the search results with a URL that contains the percent-encoded keyword. Always quote the URL to prevent shell expansion:

agent-browser --session-name xiaohongshu open 'https://www.xiaohongshu.com/search_result?keyword=<ENCODED_KEYWORD>&source=web_search_result_notes&type=51'

Wait for the page to load:

agent-browser --session-name xiaohongshu wait --load networkidle
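
The keyword must be percent-encoded before it replaces <ENCODED_KEYWORD> in the search URL. A minimal JavaScript sketch of that step follows. The helper name buildSearchUrl is hypothetical, and the extra escaping of apostrophes is an added precaution (an assumption, not part of the original workflow) so the result stays safe inside single shell quotes.

```javascript
// Hypothetical helper (not part of agent-browser): builds the search URL
// used in this step from a raw keyword.
function buildSearchUrl(keyword) {
  // encodeURIComponent percent-encodes spaces, '&', '=', and non-ASCII text.
  // It leaves "'" untouched, so escape that too for single-quoted shell use.
  const encoded = encodeURIComponent(keyword.trim()).replace(/'/g, '%27');
  return 'https://www.xiaohongshu.com/search_result'
    + `?keyword=${encoded}&source=web_search_result_notes&type=51`;
}

// Example: print the URL for a mixed English/Chinese keyword.
console.log(buildSearchUrl('AI 新闻'));
```

The printed URL can be pasted, single-quoted, into the open command shown in this step.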

3. Extract Post List

Use JavaScript evaluation to extract post metadata from the search results page. Use eval --stdin with heredoc to avoid shell quoting issues.

Refer to references/extract-scripts.md for the complete extraction scripts.

The key extraction pattern targets section.note-item elements and extracts the title, author, like count, and the a.cover link href (which includes the xsec_token parameter needed for access).
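
As an illustration of that pattern, here is a rough sketch only; it is not the script from references/extract-scripts.md. The section.note-item and a.cover selectors come from the text above, while the selectors for title, author, and like count ('.title', '.author', '.count') are assumptions that would need to be checked against the live page. The function name extractPostList is hypothetical.

```javascript
// Sketch under stated assumptions: the three field selectors below are
// guesses, not confirmed class names from xiaohongshu.com.
const ASSUMED_SELECTORS = { title: '.title', author: '.author', likes: '.count' };

// Returns one plain object per post card found under `root`.
function extractPostList(root, selectors = ASSUMED_SELECTORS) {
  // Read trimmed text from a child element, or '' when it is missing.
  const text = (el, sel) => {
    const node = el.querySelector(sel);
    return node ? node.textContent.trim() : '';
  };
  return Array.from(root.querySelectorAll('section.note-item')).map((item, index) => {
    const cover = item.querySelector('a.cover');
    return {
      index, // position on the page, usable later for clicking the card
      title: text(item, selectors.title),
      author: text(item, selectors.author),
      likes: text(item, selectors.likes),
      // The href carries the xsec_token parameter mentioned above.
      href: cover ? cover.getAttribute('href') : null,
    };
  });
}
```

When run through eval --stdin, the last expression could be JSON.stringify(extractPostList(document)) so that the result comes back as a single string.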

4. Open and Extract Individual Posts

Critical: Do NOT navigate directly to /explore/<id> URLs — these will return 404 due to anti-scraping protection. Instead, click post cards via JavaScript from the search results page:

agent-browser --session-name xiaohongshu eval --stdin <<'EVALEOF'
document.querySelectorAll('section.note-item')[INDEX].querySelector('a.cover').click();
'clicked'
EVALEOF

Wait 3 seconds for the modal to load, then extract content from the detail overlay:

agent-browser --session-name xiaohongshu eval --stdin <<'EVALEOF'
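// Wrapped in an IIFE so repeated runs on the same page do not collide on top-level const declarations.
(() => {
  // Try the known detail-overlay selectors in order of specificity.
  const noteContainer = document.querySelector('.note-detail-mask')
    || document.querySelector('[class*="note-detail"]')
    || document.querySelector('.note-scroller');
  const content = noteContainer ? noteContainer.innerText : 'not found';
  // Cap the returned text at 3000 characters.
  return content.substring(0, 3000);
})();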
EVALEOF

Close the modal with Escape before opening the next post:

agent-browser --session-name xiaohongshu press Escape
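
The click, wait, extract, and close cycle above repeats for every post, so it can help to see it as one loop. The following is a sketch under stated assumptions, not a tested driver: visitPosts, run, and sleep are hypothetical names, and run(args, stdin) stands for whatever mechanism actually invokes agent-browser with the given arguments and standard input. The 1 to 3 second pause follows the rate-limit guidance in this skill.

```javascript
// Flags shared by every command in this skill.
const SESSION = ['--session-name', 'xiaohongshu'];

// Script that clicks the post card at a given index on the results page.
const clickScript = (index) =>
  `document.querySelectorAll('section.note-item')[${index}].querySelector('a.cover').click();\n'clicked'`;

// Script that reads text from the detail overlay (same selectors as above).
const READ_SCRIPT = `(() => {
  const el = document.querySelector('.note-detail-mask')
    || document.querySelector('[class*="note-detail"]')
    || document.querySelector('.note-scroller');
  return (el ? el.innerText : 'not found').substring(0, 3000);
})()`;

// Random pause between 1000 and 3000 milliseconds, inclusive.
const pauseMs = () => 1000 + Math.floor(Math.random() * 2001);

// `run` and `sleep` are injected, hypothetical callbacks (both synchronous).
function visitPosts(indices, run, sleep) {
  const results = [];
  for (const index of indices) {
    run([...SESSION, 'eval', '--stdin'], clickScript(index)); // open the modal
    sleep(3000); // wait for the modal to load
    const text = run([...SESSION, 'eval', '--stdin'], READ_SCRIPT);
    results.push({ index, text });
    run([...SESSION, 'press', 'Escape'], ''); // close the modal
    sleep(pauseMs()); // pause before the next post
  }
  return results;
}
```

Under Node.js, run could be a thin wrapper around child_process.execFileSync('agent-browser', args, { input: stdin, encoding: 'utf8' }), and sleep any blocking wait. Injecting both keeps the loop easy to dry-run without a browser.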

5. Post Prioritization

Sort posts by engagement (like count) and prioritize:

Posts with high like counts (>50)
Posts from recent timeframes (today, yesterday)
Posts with informative titles (skip ads, irrelevant content)

Aim to extract 8-12 high-quality posts per search session.
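
The like-count part of this prioritization can be sketched in JavaScript as follows. This is a minimal illustration that covers only the like threshold and the cap of 12 posts; recency and title quality still need a manual or separate check. The names parseLikes and prioritize are hypothetical, and the handling of abbreviated counts such as "1.2万" (万 meaning ten thousand), "w", or "k" is an assumption about how counts may be displayed.

```javascript
// Convert a displayed like count into a number.
// Assumption: counts may be abbreviated with 万/w (x10000) or k (x1000).
function parseLikes(text) {
  const match = /([\d.]+)\s*(万|w|k)?/i.exec(String(text));
  if (!match) return 0; // e.g. an empty string or a bare label
  const value = parseFloat(match[1]);
  if (Number.isNaN(value)) return 0;
  const unit = (match[2] || '').toLowerCase();
  const factor = unit === '万' || unit === 'w' ? 10000 : unit === 'k' ? 1000 : 1;
  return Math.round(value * factor);
}

// Keep posts above the like threshold, highest first, capped at `limit`.
function prioritize(posts, { minLikes = 50, limit = 12 } = {}) {
  return posts
    .map((post) => ({ ...post, likeCount: parseLikes(post.likes) }))
    .filter((post) => post.likeCount > minLikes)
    .sort((a, b) => b.likeCount - a.likeCount)
    .slice(0, limit);
}
```

The input is expected to be an array of objects with a likes field holding the displayed text, such as the output of the post list extraction step.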

6. Compile Summary

Organize extracted content into a structured Markdown report with these sections:

Header: Topic, date, source
Major Events: Biggest news items with details
Industry Trends: Broader patterns and analysis
Technology Updates: New releases, benchmarks, technical developments
Notable Opinions: Interesting viewpoints from posts and comments
Summary Table: Quick-reference table of key topics

Save the report as a Markdown file with naming convention: xhs-<topic>-<YYYYMMDD>.md
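
The naming convention and section list above could be generated as follows. This is a sketch only: reportFilename and reportSkeleton are hypothetical helpers, and the slug rule (lowercase, whitespace replaced by hyphens) and the default source label are assumptions, since the skill does not specify them.

```javascript
// Section headings taken from the list above, in order.
const SECTIONS = [
  'Major Events',
  'Industry Trends',
  'Technology Updates',
  'Notable Opinions',
  'Summary Table',
];

// Zero-padded [YYYY, MM, DD] in local time.
const pad = (n) => String(n).padStart(2, '0');
const dateParts = (d) => [String(d.getFullYear()), pad(d.getMonth() + 1), pad(d.getDate())];

// Builds xhs-<topic>-<YYYYMMDD>.md; the slug rule is an assumption.
function reportFilename(topic, date = new Date()) {
  const slug = topic.trim().toLowerCase().replace(/\s+/g, '-');
  return `xhs-${slug}-${dateParts(date).join('')}.md`;
}

// Builds an empty Markdown report with the header fields and section headings.
function reportSkeleton(topic, date = new Date(), source = 'xiaohongshu.com') {
  return [
    `# ${topic}`,
    '',
    `Date: ${dateParts(date).join('-')}`,
    `Source: ${source}`,
    '',
    ...SECTIONS.flatMap((name) => [`## ${name}`, '']),
  ].join('\n');
}
```

Non-ASCII topics pass through the slug unchanged, so a Chinese topic name stays readable in the filename.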

Important Notes

Always use --session-name xiaohongshu for session persistence across commands
Always quote URLs containing special characters with single quotes
Use eval --stdin <<'EVALEOF' for all JavaScript to avoid shell escaping issues
The xsec_token in search result links is session-bound; do not reuse it across sessions
Post detail modals overlay the search results page; press Escape to return
Rate limit: wait 1-3 seconds between post extractions to avoid triggering anti-bot measures

Additional Resources

Reference Files

references/extract-scripts.md — Complete JavaScript extraction code snippets for post lists and post content
