From redaxo-search-it
Generates and manages Search It indexes in REDAXO: reindex articles, DB columns, files/PDFs; console commands like search_it:reindex/clearCache; cronjobs; plaintext/PDF conversion. Use for stale results or custom indexing.
Slash command: `/redaxo-search-it:search-it-indexing`
Search It stores all searchable content in database tables (`rex_tmp_search_it_index`, `rex_tmp_search_it_keywords`). Content must be indexed before it appears in search results.
By default:
- pdftotext (must be installed on server)

To generate the index from the backend: Backend > Search It > Generate index > "Generate index" button.
```bash
php redaxo/bin/console search_it:reindex      # full reindex
php redaxo/bin/console search_it:clearCache   # clear search cache only
```
```php
use FriendsOfRedaxo\SearchIt\SearchIt;

$search = new SearchIt();
$search->generateIndex(); // full reindex
```
```php
$search = new SearchIt();

// Single article (optionally for specific language)
$search->indexArticle(42);                            // all languages
$search->indexArticle(42, rex_clang::getCurrentId()); // current language only

// Database column
$search->indexColumn(rex::getTable('my_table'), 'description');

// File
$search->indexFile('document.pdf');
```
```php
$search = new SearchIt();
$search->deleteIndex();    // drop entire index (requires regeneration)
$search->deleteCache();    // clear cached search results
$search->deleteKeywords(); // clear similarity keyword index
```
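Reindexing a single article does not by itself guarantee that cached search results are refreshed, so it can be convenient to pair `indexArticle()` with `deleteCache()`. The following is a minimal sketch, not part of Search It: the helper name `reindexArticleAndClearCache` is hypothetical, and it only assumes that the object passed in exposes the `indexArticle()` and `deleteCache()` methods shown above.

```php
<?php
/**
 * Hypothetical helper (not part of Search It): reindex one article and
 * then clear the search cache so stale cached results are not served.
 *
 * $search is assumed to expose indexArticle() and deleteCache(),
 * as the SearchIt object in the examples above does.
 */
function reindexArticleAndClearCache(object $search, int $articleId, ?int $clangId = null): void
{
    if ($clangId === null) {
        $search->indexArticle($articleId);           // all languages
    } else {
        $search->indexArticle($articleId, $clangId); // one language only
    }

    // Cached results may still reflect the old article content.
    $search->deleteCache();
}
```

Inside REDAXO this could be called as `reindexArticleAndClearCache(new SearchIt(), 42);`. Because the helper only relies on the two method names, it can also be exercised with a stub object outside REDAXO.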
Search It hooks into REDAXO extension points via EventHandler and automatically re-indexes articles when they are saved, published or deleted. This happens for:
- `ART_ADDED`, `ART_UPDATED`, `ART_DELETED`
- `ART_STATUS` (online/offline toggle)
- `SLICE_ADDED`, `SLICE_UPDATED`, `SLICE_DELETED`
- `MEDIA_ADDED`, `MEDIA_UPDATED`

No manual action is needed for article content changes. But if you change backend settings (e.g. add a new DB column source), you must trigger a full reindex.
In the backend under Cronjob addon, two cronjob types are available:
Useful for sites with frequently changing external data sources (e.g. DB columns filled by imports).
Articles are fetched via HTTP (or socket), rendered, then converted to plaintext. The PlaintextConverter strips HTML, applies CSS selector exclusions, runs regex replacements and optionally parses Textile.
Configure in backend: Settings > Plaintext settings:
- Excluded CSS selectors, e.g. `nav, .no-search, footer` (content in these elements is not indexed)

For custom preprocessing, register a callback on the `SEARCH_IT_PLAINTEXT` extension point:
```php
rex_extension::register('SEARCH_IT_PLAINTEXT', function (rex_extension_point $ep) {
    $text = $ep->getSubject();
    // Remove specific content before indexing
    $text = preg_replace('/<div class="no-index">.*?<\/div>/s', '', $text);
    return $text;
});
```
Return an array to control further processing:
```php
return ['text' => $cleanedText, 'process' => true];
// process = true: standard plaintext conversion still runs after your hook
// process = false: use your text as-is, skip built-in conversion
```
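To make the array return form concrete, here is a minimal sketch of a callback body written as a plain function so it can be tried outside REDAXO. The function name `stripNoIndexBlocks` is hypothetical. The `no-index` class and the `text`/`process` keys are taken from the examples above. The registration is shown only as a comment.

```php
<?php
/**
 * Hypothetical callback body: removes <div class="no-index"> blocks and
 * returns the array form, leaving the built-in conversion enabled.
 */
function stripNoIndexBlocks(string $html): array
{
    // Non-greedy match: stops at the first closing </div> after the opening tag.
    $cleaned = preg_replace('/<div class="no-index">.*?<\/div>/s', '', $html);

    return [
        'text' => $cleaned ?? $html, // preg_replace returns null on error; fall back to the input
        'process' => true,           // standard plaintext conversion still runs afterwards
    ];
}

// Inside REDAXO this could be wired up as follows (not executed here):
// rex_extension::register('SEARCH_IT_PLAINTEXT', function (rex_extension_point $ep) {
//     return stripNoIndexBlocks($ep->getSubject());
// });
```

Because the pattern is non-greedy, a `no-index` block that contains nested `<div>` elements would only be removed up to the first inner `</div>`. For such markup the CSS selector exclusions in the backend settings are likely the more robust option.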
Requires pdftotext (from poppler-utils) on the server:
```bash
apt-get install poppler-utils   # Debian/Ubuntu
```
Search It uses PdfConverter to extract text from PDF files in the media pool. Enable file indexing and add pdf to the allowed extensions in backend settings.
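Since PDF indexing depends on `pdftotext` being present, it can help to check for the binary before relying on PDF results. The following is a rough sketch in plain PHP and not part of Search It. The helper `isBinaryOnPath` is hypothetical and only scans the directories in `PATH`. How Search It itself locates `pdftotext` is not covered here.

```php
<?php
/**
 * Hypothetical helper: returns true if an executable file with the given
 * name exists in one of the directories listed in PATH (or in $path, if given).
 */
function isBinaryOnPath(string $binary, ?string $path = null): bool
{
    $path = $path ?? (getenv('PATH') ?: '');

    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $binary;
        if (is_file($candidate) && is_executable($candidate)) {
            return true;
        }
    }

    return false;
}

if (!isBinaryOnPath('pdftotext')) {
    // Note: the web server user may see a different PATH than your shell.
    error_log('pdftotext not found on PATH; PDF files may not be indexed');
}
```

Running the check from the same context that performs the indexing (console, cronjob, or web request) gives the most meaningful result, because each of them may use a different `PATH`.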
| Table | Purpose |
|---|---|
| `rex_tmp_search_it_index` | Main index (plaintext, metadata per article/column/file) |
| `rex_tmp_search_it_keywords` | Keywords for similarity search |
| `rex_tmp_search_it_cache` | Cached search results |
| `rex_tmp_search_it_cacheindex_ids` | Links cache entries to index entries |
| `rex_tmp_search_it_stats_searchterms` | Search term statistics |
Tables use the tmp_ prefix because they are regenerable – the index can always be rebuilt from source content.
- Do not call `generateIndex()` on every page load: it is an expensive operation. Only call it from the console, a cronjob, or the backend.
- After `indexArticle()`, if the search cache contains stale results, call `deleteCache()` as well.
- Without `poppler-utils` on the server, PDF files are silently skipped because `pdftotext` is not available.