Skill

tuning-incremental-sync-config

Changes sync configuration of an existing data warehouse schema: sync_type, incremental_field, primary_key_columns, CDC mode, or sync_frequency. Use when switching from full refresh to incremental, fixing failing sync diagnoses, or adjusting sync cadence.

PostgreSQL

SQL

data-engineering

database

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/posthog:tuning-incremental-sync-config

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

A sync's configuration lives on the `ExternalDataSchema` and can be changed any time via

SKILL.md

260 lines · ~3.7k tokens

Stats

LanguagePython

Stars54

Forks8

MaintenanceExcellent

Last CommitJun 17, 2026

Actions

View Source View Plugin View on GitHub View README

Tuning incremental sync config

A sync's configuration lives on the ExternalDataSchema and can be changed any time via external-data-schemas-partial-update. Most changes are non-destructive (take effect on the next sync), but a few (switching sync_type, changing primary keys) require careful handling to avoid corrupting the synced data.

When to use this skill

The user wants to change how an already-connected table is synced
A diagnosis flagged the incremental field or primary key as wrong
The table is syncing too often / not often enough
Switching an incremental table to CDC (or vice versa)
The source table was changed on the other side (new columns, dropped columns) and the sync config needs to catch up

If the user is setting up a brand-new source, use setting-up-a-data-warehouse-source instead — configuration is chosen at creation time there.

Available tools

Tool	Purpose
`external-data-schemas-retrieve`	Current sync_type, incremental_field, PKs, sync_frequency
`external-data-schemas-incremental-fields-create`	Refresh candidate incremental fields from the live source
`external-data-schemas-partial-update`	Apply the config change
`external-data-schemas-reload`	Trigger a sync with the new config
`external-data-schemas-resync`	Wipe and re-import from scratch when the change invalidates existing data
`external-data-schemas-delete-data`	Drop the synced table while keeping the schema entry
`external-data-sources-check-cdc-prerequisites-create`	Pre-flight Postgres CDC (only when switching to/from CDC)
`external-data-sources-webhook-info-retrieve`	Current webhook state (when switching to/from sync_type=webhook)
`external-data-sources-create-webhook-create`	Register a webhook after switching a schema to sync_type=webhook
`external-data-sources-update-webhook-inputs-create`	Rotate a webhook signing secret
`external-data-sources-delete-webhook-create`	Unregister webhook when switching schemas off sync_type=webhook

The fields you can tune

From the partial-update endpoint:

Field	Values	Notes
`sync_type`	`full_refresh`, `incremental`, `append`, `cdc`, `webhook`	Source must support the target type — check via incremental-fields
`incremental_field`	Column name from the source	Must appear in `incremental_fields` list for the schema
`incremental_field_type`	`datetime`, `date`, `timestamp`, `integer`, `numeric`, `objectid`	Must match the column's real type
`primary_key_columns`	Array of column names	Required for CDC. Used for upsert dedup on incremental
`cdc_table_mode`	`consolidated`, `cdc_only`, `both`	Only meaningful when sync_type=cdc
`sync_frequency`	`1min`, `5min`, `15min`, `30min`, `1hour`, `6hour`, `12hour`, `24hour`, `7day`, `30day`, `never`	Applies to all non-CDC types
`sync_time_of_day`	`HH:MM:SS`	When sync_frequency is daily/weekly-scale
`should_sync`	`true` / `false`	Pause the schema without deleting it

Workflow

Step 1 — Read the current config

Always start with external-data-schemas-retrieve({id}). Understanding the current state prevents mistakes like "fixing" an incremental_field that's actually correct.

Note:

Current sync_type, incremental_field, incremental_field_type, primary_key_columns
Current status (don't tune a schema that's currently Running — wait or cancel first)
last_synced_at (so you can tell if the next sync worked)
latest_error if present (the error often tells you exactly what to change)

Step 2 — If changing sync_type or incremental_field, refresh candidates

Call external-data-schemas-incremental-fields-create({id}). Even though the operation name says "create", it re-reads the source and returns the current candidate fields — use it to confirm the field you want to set actually exists on the source and which sync types are now available for this table.

The response:

{
  "incremental_fields": [{"field": "updated_at", "type": "datetime", ...}, ...],
  "incremental_available": true,
  "append_available": true,
  "cdc_available": true,
  "full_refresh_available": true,
  "detected_primary_keys": ["id"],
  "available_columns": [...]
}

If your target incremental_field isn't in the list, tell the user — they need to either pick a different field or change the source table to add one.

Step 3 — Apply the change

Call external-data-schemas-partial-update({id}, {...changed fields}).

Only send the fields that are actually changing. Partial update means unspecified fields stay as they are.

Examples:

// Switch from full_refresh to incremental
{
  "sync_type": "incremental",
  "incremental_field": "updated_at",
  "incremental_field_type": "datetime"
}

// Change sync frequency to hourly
{"sync_frequency": "1hour"}

// Fix wrong PK on a CDC table
{"primary_key_columns": ["tenant_id", "order_id"]}

// Pause a schema
{"should_sync": false}

Step 4 — Decide whether existing data is still valid

This is the step that's easy to get wrong. Some config changes invalidate the synced data; others don't.

Changes that DON'T invalidate existing data:

sync_frequency, sync_time_of_day — scheduling only
should_sync — on/off
cdc_table_mode in most cases — next sync will start writing to the new shape, but historical consolidated rows stay valid
Switching between incremental and full_refresh with the same incremental_field — next sync just re-runs fresh
Switching to or from sync_type: "webhook" — the synced data stays valid; only the ingestion path changes. Remember to register or unregister the webhook (see sections below) alongside the sync_type change.

Changes that MAY invalidate existing data and need a resync:

Changing incremental_field to a different column — the high-water mark is from the old column and won't match. Without a resync you'll miss rows that were updated between the two fields' histories.
Changing primary_key_columns — existing rows may be deduplicated incorrectly against new PK definitions.
Switching from full_refresh to append — the existing rows don't have the version-history shape that append expects.
Switching from append to full_refresh — opposite problem; you'll end up with duplicate historical versions.
Switching to/from cdc — the table shape changes fundamentally.

When the change invalidates data, the clean flow is:

external-data-schemas-partial-update with the new config
Warn the user this is destructive
external-data-schemas-resync to wipe and re-import under the new config

Or equivalently, external-data-schemas-delete-data → external-data-schemas-reload. delete-data + reload is cleaner when the table is large and the user wants to start from zero.

Step 5 — Trigger and confirm

For non-destructive changes, call external-data-schemas-reload({id}) to pick up the new config immediately rather than waiting for the schedule.

Wait a moment, then external-data-schemas-retrieve({id}) to confirm status = Running then Completed. Report last_synced_at and any new latest_error.

Specific common changes

Switching full_refresh → incremental

incremental-fields-create to confirm the desired field exists and incremental_available: true.
partial-update: {sync_type: "incremental", incremental_field, incremental_field_type}.
No data wipe needed — next sync just switches strategy. If the source is growing fast, the next incremental sync is the cheap one.

Switching incremental → cdc (Postgres only)

Run external-data-sources-check-cdc-prerequisites-create on the parent source. Only proceed if valid: true.
incremental-fields-create to confirm cdc_available: true and see detected_primary_keys.
partial-update: {sync_type: "cdc", primary_key_columns: [...], cdc_table_mode: "consolidated"}.
Resync required — CDC tables have a different shape. Trigger external-data-schemas-resync after the update. Warn the user this wipes existing data.

Fixing a stale incremental field after schema drift

Source dropped the updated_at column. Sync has been failing with "column does not exist".

incremental-fields-create to see what fields remain.
Pick a replacement (or switch to full_refresh if none are suitable).
partial-update with the new field + type (or new sync_type).
reload to retry.

Changing primary keys on a CDC table

partial-update: {primary_key_columns: [...]}.
Resync required — existing CDC tombstones and upsert keys won't match the new PK definition, leading to row duplication or missed updates.
resync, warn the user.

Changing sync_frequency

partial-update: {sync_frequency: "1hour"}.
No reload needed — the next scheduled sync picks up the new cadence. Or reload manually if the user wants to confirm nothing broke.

Switching a schema to `sync_type: "webhook"`

Only works for sources that implement WebhookSource (today: Stripe) and tables where supports_webhooks: true from incremental-fields-create.

incremental-fields-create to confirm supports_webhooks: true for the table.
partial-update: {sync_type: "webhook"}.
If the source doesn't already have a webhook registered (check with webhook-info-retrieve), call external-data-sources-create-webhook-create({source_id}) to register it.
No resync required — the schema's existing bulk-synced data stays, and the webhook becomes the primary ingestion path once the next reconciliation finishes.
Keep sync_frequency set (e.g. 24hour) — it acts as a safety-net reconciliation in case any webhook delivery is missed.

Switching off `sync_type: "webhook"`

partial-update: {sync_type: "incremental"} (or whatever bulk type is appropriate) with the required incremental_field + incremental_field_type.
If no other schemas on the source are still using sync_type: "webhook", call external-data-sources-delete-webhook-create({source_id}) to unregister. Leaving an orphaned webhook registered on the source side just means events will be received and dropped — not harmful, but messy.
If other schemas on the source are still on webhook, leave the webhook registered — it's shared across all webhook-type schemas on the source.

Rotating a webhook signing secret

The source's signing secret (e.g. Stripe's whsec_...) was rotated, and payloads are now failing signature verification.

Grab the new secret from the source's dashboard.
external-data-sources-update-webhook-inputs-create({source_id}, {inputs: {signing_secret: "whsec_..."}}).
No reload needed — the next inbound webhook payload will verify against the new secret.

Pausing a schema

partial-update: {should_sync: false}. Schema stops syncing but stays configured.
To resume later: partial-update: {should_sync: true}, then reload for an immediate run.

Important notes

Read before you write. Always retrieve the current config first. partial-update doesn't complain if you set a field to the value it already had, but you might be about to change something you didn't realize was already set.
Not every sync_type is available on every schema. The incremental-fields-create response tells you what's available right now, which can be different from what was available at creation (e.g. CDC may have been enabled for the team since).
Wipe when the shape changes. Switching sync strategy often changes the physical table. If you don't resync, you'll be mixing row shapes and queries will return garbage.
CDC needs prerequisites. Never switch to sync_type: "cdc" without running check-cdc-prerequisites-create first. The sync will just fail immediately.
Don't touch a Running schema. If the schema is currently running, either wait for it to finish or external-data-schemas-cancel before applying the change. Updating config mid-sync can leave the incremental high-water mark inconsistent.
Sync frequency is cheap to change. Encourage experimentation there. Sync_type and incremental_field are expensive to change — encourage care.
Webhooks are registered at the source level, not the schema level. Multiple webhook-type schemas on the same source share one webhook registration. Only delete the webhook when the last webhook-type schema on that source is being switched away, otherwise other schemas stop receiving pushes.

tuning-incremental-sync-config

Popularity

Invocation

Context Preview

SKILL.md

tuning-incremental-sync-config

Popularity

Invocation

Context Preview

SKILL.md

Tuning incremental sync config

When to use this skill

Available tools

The fields you can tune

Workflow

Step 1 — Read the current config

Step 2 — If changing sync_type or incremental_field, refresh candidates

Step 3 — Apply the change

Step 4 — Decide whether existing data is still valid

Step 5 — Trigger and confirm

Specific common changes

Switching full_refresh → incremental

Switching incremental → cdc (Postgres only)

Fixing a stale incremental field after schema drift

Changing primary keys on a CDC table

Changing sync_frequency

Switching a schema to sync_type: "webhook"

Switching off sync_type: "webhook"

Rotating a webhook signing secret

Pausing a schema

Important notes

Similar Skills

Tuning incremental sync config

When to use this skill

Available tools

The fields you can tune

Workflow

Step 1 — Read the current config

Step 2 — If changing sync_type or incremental_field, refresh candidates

Step 3 — Apply the change

Step 4 — Decide whether existing data is still valid

Step 5 — Trigger and confirm

Specific common changes

Switching full_refresh → incremental

Switching incremental → cdc (Postgres only)

Fixing a stale incremental field after schema drift

Changing primary keys on a CDC table

Changing sync_frequency

Switching a schema to sync_type: "webhook"

Switching off sync_type: "webhook"

Rotating a webhook signing secret

Pausing a schema

Important notes

Similar Skills

Switching a schema to `sync_type: "webhook"`

Switching off `sync_type: "webhook"`

Switching a schema to `sync_type: "webhook"`

Switching off `sync_type: "webhook"`