From thinking-frameworks-skills
Detects and removes duplicate transactions across overlapping bank, credit-card, and brokerage statement imports using a stable composite key (account_id, date ±1d, amount_cents, description_normalized). Emits lists of new transactions, suppressed duplicates with reasons, and suspicious near-duplicates needing human review.
Slash command: /thinking-frameworks-skills:transaction-deduplicator
Statement imports often overlap. If a January statement covers December 15 → January 14 and the December statement covers November 15 → December 15, the same December 15 transaction appears in both. This skill identifies those duplicates without losing legitimate same-day, same-amount, same-merchant repeat charges (e.g., two coffees in one day).
The caller provides:
- incoming: array of newly extracted transactions: {date, post_date, account_id, amount_cents, description_raw, source}.
- existing: array of transactions already in the store, with the same fields plus id.

A duplicate is identified by the tuple:
(account_id, amount_cents, description_normalized, |date_a - date_b| <= 1 day)
- description_normalized uses the same normalization as the categorizer (uppercase, strip vendor codes, strip geo, collapse spaces, drop dates).
- The 1-day window tolerates date vs post_date mismatches between two sources.
- amount_cents is signed. Keying on abs(amount_cents) would let a refund match its original purchase; with the signed amount, refunds (opposite sign) are never duplicates of purchases.
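As a rough illustration of the key, the sketch below normalizes a description and builds the signed-amount tuple. The regular expressions are hypothetical stand-ins: the categorizer's actual normalization rules are not shown here, so the vendor-code, date, and geo patterns are assumptions, and `normalize_description` and `dedupe_key` are illustrative names.

```python
import re

# Hypothetical patterns; the real categorizer's rules may differ.
_VENDOR_CODE = re.compile(r"\*\S+")                        # e.g. "*AB12CD"
_DATE = re.compile(r"\b\d{1,2}/\d{1,2}(?:/\d{2,4})?\b")    # e.g. "12/15" or "12/15/2025"
_TRAILING_GEO = re.compile(r"\s+[A-Z]{2}$")                # trailing two-letter region code (assumed)


def normalize_description(raw: str) -> str:
    """Uppercase, strip vendor codes and dates, collapse spaces, strip trailing geo."""
    text = raw.upper()
    text = _VENDOR_CODE.sub(" ", text)
    text = _DATE.sub(" ", text)
    text = " ".join(text.split())          # collapse runs of whitespace
    text = _TRAILING_GEO.sub("", text)
    return text


def dedupe_key(tx: dict) -> tuple:
    """Date-independent part of the composite key; amount_cents stays signed."""
    return (
        tx["account_id"],
        tx["amount_cents"],
        normalize_description(tx["description_raw"]),
    )


purchase = {"account_id": "chk_1", "amount_cents": -450,
            "description_raw": "Blue Bottle  Coffee 12/15 Oakland CA"}
refund = dict(purchase, amount_cents=450)

# The refund has the opposite sign, so its key differs from the purchase's key.
keys_differ = dedupe_key(purchase) != dedupe_key(refund)
```

The date is left out of the tuple on purpose: the 1-day tolerance is a range comparison, so it is applied as a filter after the exact-match lookup rather than as part of a hashable key.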
Dedupe Progress:
- [ ] Step 1: Index existing transactions by (account_id, signed_amount, normalized_desc)
- [ ] Step 2: For each incoming, look up the index
- [ ] Step 3: Filter index hits by date proximity (≤ 1 day)
- [ ] Step 4: If no hit, mark as new
- [ ] Step 5: If exactly one hit, mark as duplicate of that id
- [ ] Step 6: If multiple hits, run the multi-instance same-day rule
- [ ] Step 7: Surface near-duplicates (different amount or desc) for review
Step 1: Build existing_by_key[(account_id, amount_cents, description_normalized)] = [tx, …].
Step 2: For each incoming transaction, compute its key tuple and look up the bucket.
Step 3: For each candidate in the bucket, keep only those with |incoming.date − candidate.date| ≤ 1 day. Use min(date, post_date) on each side if post_date exists.
Step 4 (no hit): Mark decision: "new". The bookkeeper will append it to transactions.json.
Step 5 (exactly one hit): Mark decision: "duplicate" and link duplicate_of: <existing_id>. Do not import.
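Steps 1 through 5 can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: it assumes each record already carries a `description_normalized` field and that `date` and `post_date` are `datetime.date` values. The `"multiple"` decision is a hypothetical placeholder for handing off to the multi-instance rule in Step 6.

```python
from collections import defaultdict
from datetime import date, timedelta


def effective_date(tx):
    # Step 3: use the earlier of date and post_date when post_date is present.
    post = tx.get("post_date")
    return min(tx["date"], post) if post else tx["date"]


def key_of(tx):
    # Signed amount_cents, so refunds never collide with purchases.
    return (tx["account_id"], tx["amount_cents"], tx["description_normalized"])


def build_index(existing):
    # Step 1: bucket existing transactions by key.
    index = defaultdict(list)
    for tx in existing:
        index[key_of(tx)].append(tx)
    return index


def classify(incoming_tx, index):
    # Step 2: exact-key lookup.
    bucket = index.get(key_of(incoming_tx), [])
    # Step 3: keep candidates within one day.
    hits = [
        c for c in bucket
        if abs(effective_date(incoming_tx) - effective_date(c)) <= timedelta(days=1)
    ]
    if not hits:
        return {"decision": "new"}                                       # Step 4
    if len(hits) == 1:
        return {"decision": "duplicate", "duplicate_of": hits[0]["id"]}  # Step 5
    # Placeholder: more than one hit goes to the multi-instance rule (Step 6).
    return {"decision": "multiple", "candidates": [c["id"] for c in hits]}
```

Indexing first keeps each lookup close to constant time, so the date filter only runs over the handful of candidates that already share account, amount, and description.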
Step 6 (multiple hits): When the existing store already has N transactions with the identical key on the same day, and the incoming batch contains M transactions with the same key on that day:
- M ≤ N → all incoming are considered duplicates of existing ones (1:1 pairing in date order).
- M > N → the first N incoming are duplicates; the remaining M − N are new transactions (legitimate same-day repeat charges, e.g., two coffees, gas-station pre-auth + final).

This rule preserves real repeat charges while still suppressing overlap-import duplicates.
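The M versus N pairing might look like the sketch below. It assumes the caller has already grouped both sides by identical key and day; `pair_same_day` is an illustrative name, not part of the skill.

```python
def pair_same_day(incoming_group, existing_group):
    """Pair M incoming with N existing transactions that share key and day.

    The first min(M, N) incoming are duplicates (1:1, in date order);
    any surplus incoming are treated as new, legitimate repeat charges.
    """
    ordered_in = sorted(incoming_group, key=lambda tx: tx["date"])
    ordered_ex = sorted(existing_group, key=lambda tx: tx["date"])
    decisions = []
    for i, tx in enumerate(ordered_in):
        if i < len(ordered_ex):
            decisions.append({"incoming": tx, "decision": "duplicate",
                              "duplicate_of": ordered_ex[i]["id"]})
        else:
            decisions.append({"incoming": tx, "decision": "new"})
    return decisions
```

With one coffee already stored and two in the incoming batch, the first incoming is paired with the stored one and the second is imported as new.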
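Step 7: A near-duplicate matches on everything except amount or description and falls within 1 day. These commonly arise when, for example, a pending charge is later finalized with a different amount or merchant description.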
Emit these to review[] with both records side-by-side and a suggested action: keep_incoming_drop_existing | keep_existing_drop_incoming | keep_both | merge.
Compute a similarity score on near-misses:
near_dup_score = 0.4*amount + 0.4*description + 0.2*date, where each term is the similarity of that field between the two records.
Surface for review when 0.7 ≤ near_dup_score < 0.95. A score of 0.95 or above is treated as a duplicate; a score below 0.7 is treated as independent.
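The weights and thresholds above are given, but how each component similarity is computed is not. The sketch below fills that gap with assumptions: relative difference for amounts, `difflib.SequenceMatcher` for descriptions, and a simple step function for dates. Treat all three component functions as hypothetical choices, not the skill's definition.

```python
from datetime import date
from difflib import SequenceMatcher


def amount_similarity(a: int, b: int) -> float:
    # Assumed: 1.0 for equal amounts, 0.0 for opposite signs, else relative closeness.
    if a == b:
        return 1.0
    if (a < 0) != (b < 0):
        return 0.0
    return 1.0 - abs(a - b) / max(abs(a), abs(b))


def description_similarity(a: str, b: str) -> float:
    # Assumed: character-level ratio in [0, 1] from the standard library.
    return SequenceMatcher(None, a, b).ratio()


def date_similarity(a: date, b: date) -> float:
    # Assumed: same day scores 1.0, one day apart 0.5, anything further 0.0.
    days = abs((a - b).days)
    return {0: 1.0, 1: 0.5}.get(days, 0.0)


def near_dup_score(inc: dict, ex: dict) -> float:
    return (0.4 * amount_similarity(inc["amount_cents"], ex["amount_cents"])
            + 0.4 * description_similarity(inc["description_normalized"],
                                           ex["description_normalized"])
            + 0.2 * date_similarity(inc["date"], ex["date"]))


def classify_score(score: float) -> str:
    # Thresholds from the text: >= 0.95 duplicate, [0.7, 0.95) review, < 0.7 independent.
    if score >= 0.95:
        return "duplicate"
    if score >= 0.7:
        return "review"
    return "independent"
```

Because the component functions here are assumptions, scores from this sketch will not necessarily reproduce the values in the output example.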
{
  "new": [
    { "id": "tx_20260115_017", "decision": "new" }
  ],
  "duplicates": [
    {
      "incoming_index": 4,
      "decision": "duplicate",
      "duplicate_of": "tx_20251220_003",
      "reason": "exact key match within 1 day window"
    }
  ],
  "review": [
    {
      "incoming_index": 12,
      "matched_existing_id": "tx_20260108_005",
      "near_dup_score": 0.86,
      "diff": {
        "amount_cents": [-4500, -4583],
        "description_raw": ["AMAZON PENDING", "AMZN MKTP US*AB12CD"]
      },
      "suggested_action": "keep_incoming_drop_existing",
      "rationale": "incoming is the finalized charge (post_date set, definite merchant code)"
    }
  ],
  "summary": {
    "incoming_total": 142,
    "new_count": 96,
    "duplicate_count": 44,
    "review_count": 2
  }
}
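In the example above the summary counts add up (96 + 44 + 2 = 142), which suggests every incoming transaction lands in exactly one of the three lists. Assuming that invariant holds in general, a small consistency check could be sketched like this; `build_summary` is an illustrative helper, not part of the skill.

```python
def build_summary(result: dict, incoming_total: int) -> dict:
    """Build the summary block and verify every incoming row is accounted for."""
    summary = {
        "incoming_total": incoming_total,
        "new_count": len(result["new"]),
        "duplicate_count": len(result["duplicates"]),
        "review_count": len(result["review"]),
    }
    accounted = (summary["new_count"]
                 + summary["duplicate_count"]
                 + summary["review_count"])
    # Assumed invariant: each incoming transaction appears in exactly one list.
    if accounted != incoming_total:
        raise ValueError(
            f"{accounted} transactions classified, expected {incoming_total}"
        )
    return summary
```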
- Show description_raw strings to the human; do not show the normalized form.
- […] existing.
- Record duplicate_of so the user can trace why a transaction did not appear in the new import.