From data-annotation
Initialize a local git repository for a dataset, with the conventional Hugging Face layout, LFS configuration, license, and a stub dataset card. Use when the user wants to start a new dataset repo, or as a step inside `hf-setup`.
Slash command: /data-annotation:init-dataset-repo
Creates a clean local git repository structured for Hugging Face hosting. Does not push anywhere — `hf-setup` handles remote creation and push.
Target path: defaults to `<workspace>/dataset-repo/` if a shape-dataset workspace exists, otherwise ask.
License: defaults to `mit` if the user hasn't decided (flag this in the card as a TODO).
<dataset-name>/
├── .gitattributes # LFS rules
├── .gitignore # excludes scratch/, .env, __pycache__/
├── LICENSE # full license text
├── README.md # dataset card stub
└── data/ # empty; prepared splits go here later
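The layout above can be scaffolded with a few shell commands. The following is a minimal sketch, not the skill's actual implementation: the function name `scaffold_layout` and the `my-dataset` directory are hypothetical, and the script writes into a temporary directory so it can be tried safely.

```shell
#!/usr/bin/env bash
set -euo pipefail

# scaffold_layout <repo-path>: create the skeleton files and the empty data/ directory.
# (Hypothetical helper name, used only for illustration.)
scaffold_layout() {
  local repo="$1"
  mkdir -p "$repo/data"
  # Placeholders only; real contents are written by later steps.
  touch "$repo/.gitattributes" "$repo/LICENSE" "$repo/README.md"
  # Ignore rules named in the layout comments above.
  printf '%s\n' 'scratch/' '.env' '__pycache__/' > "$repo/.gitignore"
}

# Try it in a throwaway directory.
demo_root="$(mktemp -d)"
scaffold_layout "$demo_root/my-dataset"
ls -A "$demo_root/my-dataset"
```

Git does not track empty directories, so `data/` will be absent from the first commit unless a placeholder file is added to it.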
*.parquet filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.jsonl filter=lfs diff=lfs merge=lfs -text
*.csv filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.wav filter=lfs diff=lfs merge=lfs -text
*.mp3 filter=lfs diff=lfs merge=lfs -text
If certain formats are known to be small in this dataset, the user can prune later — better to over-include LFS rules at init time.
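Because every rule above has the same shape, the file can be generated from a list of extensions, which also makes later pruning a one-word change. This is a sketch under that assumption; `lfs_exts` and `write_gitattributes` are hypothetical names, and the output goes to a temporary file rather than a real repo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Extensions taken from the rules listed above.
lfs_exts=(parquet arrow bin h5 zip tar.gz jsonl csv png jpg wav mp3)

# write_gitattributes <file>: emit one LFS rule per extension.
# (Hypothetical helper name, used only for illustration.)
write_gitattributes() {
  local out="$1" ext
  : > "$out"   # truncate or create the file
  for ext in "${lfs_exts[@]}"; do
    printf '*.%s filter=lfs diff=lfs merge=lfs -text\n' "$ext" >> "$out"
  done
}

attrs_file="$(mktemp)"
write_gitattributes "$attrs_file"
cat "$attrs_file"
```

To prune a format that is known to be small in a given dataset, remove its entry from `lfs_exts` and regenerate the file.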
Minimal HF dataset card with YAML frontmatter and section headers. Every value is <!-- TODO --> until prep produces real metadata. The full population happens in hf-setup.
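One way such a stub might look is sketched below. The source does not list the exact fields, so the frontmatter keys (`license`, `pretty_name`, `task_categories`, `language`, `size_categories`) and the section titles are assumptions chosen from commonly used Hugging Face dataset card metadata. `write_card_stub` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# write_card_stub <file>: write a README.md stub where every value is a TODO.
# The frontmatter keys and section titles are illustrative assumptions.
write_card_stub() {
  cat > "$1" <<'EOF'
---
license: <!-- TODO -->
pretty_name: <!-- TODO -->
task_categories: <!-- TODO -->
language: <!-- TODO -->
size_categories: <!-- TODO -->
---

# <!-- TODO: dataset name -->

## Dataset Description

<!-- TODO -->

## Dataset Structure

<!-- TODO -->

## Licensing

<!-- TODO -->
EOF
}

card_file="$(mktemp)"
write_card_stub "$card_file"
head -n 7 "$card_file"
```

The quoted heredoc delimiter (`'EOF'`) keeps the shell from expanding anything inside the stub, so the placeholders are written verbatim.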
1. `mkdir -p <path>/data`
2. Write `.gitattributes`, `.gitignore`, `LICENSE`, `README.md`.
3. `git init` and `git lfs install` inside the repo.
4. `git add -A && git commit -m "Initialize dataset repo skeleton"`.
Do not add any HF remote here. `hf-setup` does that after asking public/private.
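The git part of this procedure can be wrapped in a single function that also checks that no remote has been configured. This is a sketch, assuming the skeleton files already exist in the target directory and that `git`, `git-lfs`, and a git identity (`user.name`, `user.email`) are available. `init_git_repo` is a hypothetical name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# init_git_repo <repo-path>: run the git steps on an already-scaffolded directory.
# (Hypothetical helper name, used only for illustration.)
init_git_repo() {
  local repo="$1"
  git -C "$repo" init
  git -C "$repo" lfs install
  git -C "$repo" add -A
  git -C "$repo" commit -m "Initialize dataset repo skeleton"
  # No remote should exist at this point; hf-setup adds it later.
  if [ -n "$(git -C "$repo" remote)" ]; then
    echo "unexpected remote configured in $repo" >&2
    return 1
  fi
}

# Usage, with <path> pointing at the scaffolded directory:
#   init_git_repo "<path>"
```

The function is only defined here, not called, so sourcing the script has no side effects until it is invoked on a real path.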
npx claudepluginhub danielrosehill/claude-code-plugins --plugin data-annotation