From posthog
Changes sync configuration of an existing data warehouse schema: sync_type, incremental_field, primary_key_columns, CDC mode, or sync_frequency. Use when switching from full refresh to incremental, fixing failing sync diagnoses, or adjusting sync cadence.
Slash command: /posthog:tuning-incremental-sync-config
A sync's configuration lives on the `ExternalDataSchema` and can be changed any time via
external-data-schemas-partial-update. Most changes are non-destructive (take effect on the next sync), but a few
(switching sync_type, changing primary keys) require careful handling to avoid corrupting the synced data.
If the user is setting up a brand-new source, use setting-up-a-data-warehouse-source instead — configuration is
chosen at creation time there.
| Tool | Purpose |
|---|---|
| `external-data-schemas-retrieve` | Current `sync_type`, `incremental_field`, PKs, `sync_frequency` |
| `external-data-schemas-incremental-fields-create` | Refresh candidate incremental fields from the live source |
| `external-data-schemas-partial-update` | Apply the config change |
| `external-data-schemas-reload` | Trigger a sync with the new config |
| `external-data-schemas-resync` | Wipe and re-import from scratch when the change invalidates existing data |
| `external-data-schemas-delete-data` | Drop the synced table while keeping the schema entry |
| `external-data-sources-check-cdc-prerequisites-create` | Pre-flight Postgres CDC (only when switching to/from CDC) |
| `external-data-sources-webhook-info-retrieve` | Current webhook state (when switching to/from `sync_type=webhook`) |
| `external-data-sources-create-webhook-create` | Register a webhook after switching a schema to `sync_type=webhook` |
| `external-data-sources-update-webhook-inputs-create` | Rotate a webhook signing secret |
| `external-data-sources-delete-webhook-create` | Unregister webhook when switching schemas off `sync_type=webhook` |
From the partial-update endpoint:
| Field | Values | Notes |
|---|---|---|
| `sync_type` | `full_refresh`, `incremental`, `append`, `cdc`, `webhook` | Source must support the target type; check via incremental-fields |
| `incremental_field` | Column name from the source | Must appear in `incremental_fields` list for the schema |
| `incremental_field_type` | `datetime`, `date`, `timestamp`, `integer`, `numeric`, `objectid` | Must match the column's real type |
| `primary_key_columns` | Array of column names | Required for CDC. Used for upsert dedup on incremental |
| `cdc_table_mode` | `consolidated`, `cdc_only`, `both` | Only meaningful when `sync_type=cdc` |
| `sync_frequency` | `1min`, `5min`, `15min`, `30min`, `1hour`, `6hour`, `12hour`, `24hour`, `7day`, `30day`, `never` | Applies to all non-CDC types |
| `sync_time_of_day` | `HH:MM:SS` | When `sync_frequency` is daily/weekly-scale |
| `should_sync` | `true` / `false` | Pause the schema without deleting it |
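The allowed values in the table can be checked locally before a payload is sent. The following is a minimal sketch, not part of PostHog's tooling: `validate_payload` and `ALLOWED` are hypothetical names, and the value lists are copied from the table above rather than read from the API, so they may drift from what the endpoint accepts.

```python
from typing import Any

# Allowed values copied from the field table (assumed complete, not fetched from the API).
ALLOWED = {
    "sync_type": ("full_refresh", "incremental", "append", "cdc", "webhook"),
    "incremental_field_type": ("datetime", "date", "timestamp", "integer", "numeric", "objectid"),
    "cdc_table_mode": ("consolidated", "cdc_only", "both"),
    "sync_frequency": ("1min", "5min", "15min", "30min", "1hour", "6hour",
                       "12hour", "24hour", "7day", "30day", "never"),
}


def validate_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    # Enumerated fields must use one of the documented values.
    for key, value in payload.items():
        if key in ALLOWED and value not in ALLOWED[key]:
            problems.append(f"{key}: {value!r} is not one of {list(ALLOWED[key])}")
    # primary_key_columns is documented as an array of column names.
    pks = payload.get("primary_key_columns")
    if pks is not None and not (isinstance(pks, list) and all(isinstance(c, str) for c in pks)):
        problems.append("primary_key_columns must be an array of column names")
    # should_sync is documented as true / false.
    if "should_sync" in payload and not isinstance(payload["should_sync"], bool):
        problems.append("should_sync must be true or false")
    return problems
```

This only catches typos and shape errors. Whether the source actually supports the target `sync_type` still has to be confirmed against the live source.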
Always start with external-data-schemas-retrieve({id}). Understanding the current state prevents mistakes like
"fixing" an incremental_field that's actually correct.
Note:
- `sync_type`, `incremental_field`, `incremental_field_type`, `primary_key_columns`
- `status` (don't tune a schema that's currently Running; wait or cancel first)
- `last_synced_at` (so you can tell if the next sync worked)
- `latest_error` if present (the error often tells you exactly what to change)

Call `external-data-schemas-incremental-fields-create({id})`. Even though the operation name says "create", it
re-reads the source and returns the current candidate fields. Use it to confirm that the field you want to set actually
exists on the source and to see which sync types are now available for this table.
The response:

    {
      "incremental_fields": [{"field": "updated_at", "type": "datetime", ...}, ...],
      "incremental_available": true,
      "append_available": true,
      "cdc_available": true,
      "full_refresh_available": true,
      "detected_primary_keys": ["id"],
      "available_columns": [...]
    }
If your target incremental_field isn't in the list, tell the user — they need to either pick a different field or
change the source table to add one.
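The check described above can be written as a small lookup over the response. This is a sketch under the assumption that the response has exactly the shape shown earlier (`incremental_fields` entries with `field` and `type`, plus `incremental_available` / `append_available` flags); `find_incremental_field` is a hypothetical helper name, not an existing tool.

```python
def find_incremental_field(response: dict, field: str, sync_type: str = "incremental"):
    """Return (True, field_type) if the field is usable, else (False, reason)."""
    # Availability flags are assumed to be named "<sync_type>_available".
    if not response.get(f"{sync_type}_available", False):
        return False, f"{sync_type} is not available for this table"
    # Look for the requested column among the candidate fields.
    for candidate in response.get("incremental_fields", []):
        if candidate.get("field") == field:
            return True, candidate.get("type")
    return False, f"{field!r} is not in the candidate incremental fields"
```

When the lookup succeeds, the returned type is the natural value for `incremental_field_type`. When it fails, the reason string is what to relay to the user.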
Call external-data-schemas-partial-update({id}, {...changed fields}).
Only send the fields that are actually changing. Partial update means unspecified fields stay as they are.
Examples:

Switch from full_refresh to incremental:

    {
      "sync_type": "incremental",
      "incremental_field": "updated_at",
      "incremental_field_type": "datetime"
    }

Change sync frequency to hourly:

    {"sync_frequency": "1hour"}

Fix wrong PK on a CDC table:

    {"primary_key_columns": ["tenant_id", "order_id"]}

Pause a schema:

    {"should_sync": false}
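Sending only the fields that change can be done by diffing the desired config against the retrieved one. A minimal sketch, assuming both are plain dictionaries keyed by the field names used in this document; `changed_fields` is a hypothetical helper.

```python
_MISSING = object()  # sentinel so a key absent from `current` always counts as changed


def changed_fields(current: dict, desired: dict) -> dict:
    """Keep only the entries of `desired` whose value differs from `current`."""
    return {
        key: value
        for key, value in desired.items()
        if current.get(key, _MISSING) != value
    }
```

An empty result means there is nothing to send, which is a useful signal that the "fix" was already in place.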
This is the step that's easy to get wrong. Some config changes invalidate the synced data; others don't.
Changes that DON'T invalidate existing data:
- `sync_frequency`, `sync_time_of_day`: scheduling only.
- `should_sync`: on/off.
- `cdc_table_mode` in most cases: the next sync starts writing to the new shape, but historical consolidated rows stay valid.
- Switching between `incremental` and `full_refresh` with the same `incremental_field`: the next sync just re-runs fresh.
- Switching to or from `sync_type: "webhook"`: the synced data stays valid; only the ingestion path changes. Remember to register or unregister the webhook (see sections below) alongside the `sync_type` change.

Changes that MAY invalidate existing data and need a resync:
- Changing `incremental_field` to a different column: the high-water mark is from the old column and won't match. Without a resync you'll miss rows that were updated between the two fields' histories.
- Changing `primary_key_columns`: existing rows may be deduplicated incorrectly against new PK definitions.
- Switching from `full_refresh` to `append`: the existing rows don't have the version-history shape that append expects.
- Switching from `append` to `full_refresh`: opposite problem; you'll end up with duplicate historical versions.
- Switching to or from `cdc`: the table shape changes fundamentally.

When the change invalidates data, the clean flow is:
1. `external-data-schemas-partial-update` with the new config
2. `external-data-schemas-resync` to wipe and re-import under the new config

Or equivalently, `external-data-schemas-delete-data` → `external-data-schemas-reload`. delete-data + reload is
cleaner when the table is large and the user wants to start from zero.
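The rules above can be encoded as a checklist. The sketch below is one reading of them, not PostHog's own logic: `resync_reasons` is a hypothetical helper, it treats setting `incremental_field` for the first time as harmless (an assumption), and any transition not named in the lists is reported as not needing a resync.

```python
def resync_reasons(current: dict, changes: dict) -> list[str]:
    """Return the reasons a resync is advisable; an empty list means none matched."""
    reasons = []
    old_type = current.get("sync_type")
    new_type = changes.get("sync_type", old_type)
    if old_type != new_type:
        if "cdc" in (old_type, new_type):
            reasons.append("switching to or from cdc changes the table shape")
        elif {old_type, new_type} == {"full_refresh", "append"}:
            reasons.append("full_refresh and append store rows in different shapes")
    # Assumption: only a change from one existing column to another moves the high-water mark.
    old_field = current.get("incremental_field")
    new_field = changes.get("incremental_field", old_field)
    if old_field is not None and new_field is not None and new_field != old_field:
        reasons.append("incremental_field changed, so the high-water mark no longer matches")
    # Any difference in primary keys is treated conservatively as needing a resync.
    if "primary_key_columns" in changes:
        old_pks = list(current.get("primary_key_columns") or [])
        if list(changes["primary_key_columns"] or []) != old_pks:
            reasons.append("primary_key_columns changed, so existing rows may be deduplicated incorrectly")
    return reasons
```

Returning reasons instead of a boolean gives the agent something concrete to show the user before it wipes data.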
For non-destructive changes, call external-data-schemas-reload({id}) to pick up the new config immediately rather
than waiting for the schedule.
Wait a moment, then external-data-schemas-retrieve({id}) to confirm status = Running then Completed. Report
last_synced_at and any new latest_error.
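The verification step amounts to polling until a new sync has finished. A minimal sketch, assuming `retrieve` is a caller-supplied function that returns the schema as a dictionary with the `status`, `last_synced_at`, and `latest_error` fields mentioned above; `wait_for_sync` and its parameters are hypothetical, and only the `Running` and `Completed` statuses named in this document are assumed.

```python
import time
from typing import Callable


def wait_for_sync(
    retrieve: Callable[[], dict],
    previous_last_synced_at,
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until a sync newer than `previous_last_synced_at` completes or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        schema = retrieve()
        # A fresh run is recognised by Completed status plus a changed last_synced_at.
        finished = (
            schema.get("status") == "Completed"
            and schema.get("last_synced_at") != previous_last_synced_at
        )
        if finished or time.monotonic() >= deadline:
            return {
                "completed": finished,
                "status": schema.get("status"),
                "last_synced_at": schema.get("last_synced_at"),
                "latest_error": schema.get("latest_error"),
            }
        sleep(poll_s)
```

Comparing `last_synced_at` against the value seen before the reload avoids mistaking the previous run's `Completed` status for the new one.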
To switch from full refresh to incremental:

1. Call `incremental-fields-create` to confirm the desired field exists and `incremental_available: true`.
2. `partial-update`: `{sync_type: "incremental", incremental_field, incremental_field_type}`.

To enable CDC on a Postgres schema:

1. Call `external-data-sources-check-cdc-prerequisites-create` on the parent source. Only proceed if `valid: true`.
2. Call `incremental-fields-create` to confirm `cdc_available: true` and see `detected_primary_keys`.
3. `partial-update`: `{sync_type: "cdc", primary_key_columns: [...], cdc_table_mode: "consolidated"}`.
4. Call `external-data-schemas-resync` after the update. Warn the user this wipes existing data.

Source dropped the `updated_at` column. Sync has been failing with "column does not exist".

1. Call `incremental-fields-create` to see what fields remain.
2. Pick a replacement field (or switch to `full_refresh` if none are suitable).
3. `partial-update` with the new field + type (or new `sync_type`).
4. `reload` to retry.

To change primary keys:

1. `partial-update`: `{primary_key_columns: [...]}`.
2. Follow with `resync`, and warn the user.

To change sync frequency:

1. `partial-update`: `{sync_frequency: "1hour"}`.

Switching to `sync_type: "webhook"` only works for sources that implement `WebhookSource` (today: Stripe) and tables where `supports_webhooks: true`
from `incremental-fields-create`.
1. Call `incremental-fields-create` to confirm `supports_webhooks: true` for the table.
2. `partial-update`: `{sync_type: "webhook"}`.
3. If no webhook is registered yet (check with `webhook-info-retrieve`), call `external-data-sources-create-webhook-create({source_id})` to register it.
4. Keep a `sync_frequency` set (e.g. `24hour`): it acts as a safety-net reconciliation in case any webhook delivery is missed.

To switch away from `sync_type: "webhook"`:

1. `partial-update`: `{sync_type: "incremental"}` (or whatever bulk type is appropriate) with the required `incremental_field` + `incremental_field_type`.
2. Once no schema on the source still uses `sync_type: "webhook"`, call `external-data-sources-delete-webhook-create({source_id})` to unregister. Leaving an orphaned webhook registered on the source side just means events will be received and dropped: not harmful, but messy.

The source's signing secret (e.g. Stripe's `whsec_...`) was rotated, and payloads are now failing signature verification.

1. Call `external-data-sources-update-webhook-inputs-create({source_id}, {inputs: {signing_secret: "whsec_..."}})`.

To pause or resume a schema:

- Pause: `partial-update`: `{should_sync: false}`. Schema stops syncing but stays configured.
- Resume: `partial-update`: `{should_sync: true}`, then `reload` for an immediate run.

Tips:

- Retrieve the current state first. `partial-update` doesn't complain if you set a field to the value it already had, but you might be about to change something you didn't realize was already set.
- Refresh candidate fields before changing `sync_type` or `incremental_field`. The `incremental-fields-create` response tells you what's available right now, which can be different from what was available at creation (e.g. CDC may have been enabled for the team since).
- Don't switch to `sync_type: "cdc"` without running `check-cdc-prerequisites-create` first. The sync will just fail immediately.
- If the schema is currently Running, wait or call `external-data-schemas-cancel` before applying the change. Updating config mid-sync can leave the incremental high-water mark inconsistent.
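The ordering constraints scattered through this document (cancel a running sync, pre-flight CDC, refresh candidate fields, update, then resync or reload, then verify) can be summarised as a call plan. This is a sketch only: `plan_tool_calls` is a hypothetical helper that returns tool names in order and calls nothing, it does not cover webhook registration, and the caller is assumed to stop if a pre-flight check fails.

```python
def plan_tool_calls(current: dict, changes: dict, resync: bool) -> list[str]:
    """Return the tool names to call, in order, for a config change."""
    calls = []
    # Updating config mid-sync can leave the high-water mark inconsistent.
    if current.get("status") == "Running":
        calls.append("external-data-schemas-cancel")
    # CDC must pass its prerequisites check before the switch.
    if changes.get("sync_type") == "cdc" and current.get("sync_type") != "cdc":
        calls.append("external-data-sources-check-cdc-prerequisites-create")
    # Re-read the source whenever the sync type or incremental field is changing.
    if "sync_type" in changes or "incremental_field" in changes:
        calls.append("external-data-schemas-incremental-fields-create")
    calls.append("external-data-schemas-partial-update")
    # Wipe and re-import when the change invalidates data, otherwise just re-run.
    calls.append("external-data-schemas-resync" if resync else "external-data-schemas-reload")
    # Confirm the outcome.
    calls.append("external-data-schemas-retrieve")
    return calls
```

For a frequency-only change on an idle schema the plan collapses to update, reload, retrieve, which matches the non-destructive path described earlier.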