From kyvos
Identify statistical anomalies, outliers, and unusual patterns in datasets. Use when users ask to find anomalies, detect outliers, identify unusual patterns, spot irregularities, or analyze data for unexpected behavior. Supports time-series analysis, distribution-based detection, and pattern recognition for numerical and categorical data.
Slash command
/kyvos:anomaly-detector
This skill identifies anomalies in data using multiple statistical methods. It can detect unusual values in numerical data, unexpected shifts in time-series data, and rare occurrences in categorical data.
For numeric columns, anomalies are typically values that fall far from the central tendency of the data.
The Z-score method works best for data that is approximately normally distributed. It measures how many standard deviations a data point lies from the mean.
# Assumes data is in a pandas DataFrame 'df' and we're checking 'value' column
z_scores = (df['value'] - df['value'].mean()) / df['value'].std()
anomalies = df[abs(z_scores) > 3]
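As a minimal, self-contained sketch of the Z-score check above: the helper name `zscore_anomalies`, its `threshold` parameter, and the sample readings are assumptions made for illustration, not part of the skill itself. The guard for a zero or undefined standard deviation is an added safety check.

```python
import pandas as pd

def zscore_anomalies(df: pd.DataFrame, column: str = "value",
                     threshold: float = 3.0) -> pd.DataFrame:
    """Return rows whose absolute Z-score exceeds `threshold` (hypothetical helper)."""
    series = df[column]
    std = series.std()  # pandas default: sample standard deviation (ddof=1)
    if pd.isna(std) or std == 0:
        # Constant or single-value column: Z-scores are undefined, flag nothing
        return df.iloc[0:0]
    z_scores = (series - series.mean()) / std
    return df[z_scores.abs() > threshold]

# Invented sample: 19 readings near 10 and one extreme reading
sample = pd.DataFrame({"value": [10.0] * 10 + [11.0] * 9 + [100.0]})
flagged = zscore_anomalies(sample)
print(flagged)
```

With this sample only the reading of 100 is flagged. Because the extreme value also inflates the mean and the standard deviation, very small samples may never reach a threshold of 3.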
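The IQR (interquartile range) method relies on quartiles, so extreme values have little influence on its thresholds, and it does not assume a normal distribution, making it suitable for skewed data. A value is flagged as an anomaly when it falls below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR, where IQR = Q3 - Q1.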
# Assumes data is in a pandas DataFrame 'df' and we're checking 'value' column
Q1 = df['value'].quantile(0.25)
Q3 = df['value'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
anomalies = df[(df['value'] < lower_bound) | (df['value'] > upper_bound)]
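The IQR bounds can be worked through on a small sample. This is a sketch only: `iqr_bounds`, the multiplier parameter `k`, and the ten sample values are hypothetical names and data chosen for illustration.

```python
import pandas as pd

def iqr_bounds(series: pd.Series, k: float = 1.5):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR (hypothetical helper)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Invented sample: nine small values and one large one
sample = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]})
lower, upper = iqr_bounds(sample["value"])
flagged = sample[(sample["value"] < lower) | (sample["value"] > upper)]
print(lower, upper)
print(flagged)
```

Here Q1 = 3.25 and Q3 = 7.75 (pandas interpolates linearly by default), so IQR = 4.5 and the fences are -3.5 and 14.5. Only the value 50 lies outside them.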
Percentile thresholds are a simple way to identify extreme values: anomalies are defined as the values that fall in the top or bottom X% of the data.
# Identify values in the bottom 1% or top 1%
anomalies = df[(df['value'] < df['value'].quantile(0.01)) |
               (df['value'] > df['value'].quantile(0.99))]
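One property of percentile cut-offs is worth seeing on data: they flag a roughly fixed share of rows whether or not anything unusual is present. The sketch below assumes a hypothetical helper `percentile_anomalies` and an invented, evenly spaced sample.

```python
import pandas as pd

def percentile_anomalies(df: pd.DataFrame, column: str = "value",
                         lower_q: float = 0.01, upper_q: float = 0.99) -> pd.DataFrame:
    """Return rows below the `lower_q` quantile or above the `upper_q` quantile."""
    low = df[column].quantile(lower_q)
    high = df[column].quantile(upper_q)
    return df[(df[column] < low) | (df[column] > high)]

# Invented sample: the integers 1..1000, with no genuinely unusual values
sample = pd.DataFrame({"value": range(1, 1001)})
flagged = percentile_anomalies(sample)
print(len(flagged))
```

The sample contains nothing unusual, yet 20 of the 1000 rows are flagged (the 10 smallest and the 10 largest). The output is therefore better read as "most extreme values" than as confirmed anomalies.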
For time-series data, anomalies can be sudden spikes/dips or deviations from a recurring pattern (seasonality).
The moving-average method flags values that deviate significantly from a rolling average, which smooths out short-term noise.
# Assumes 'df' has a datetime index and a 'value' column
# Calculate 7-period moving average
df['moving_average'] = df['value'].rolling(window=7).mean()
# Calculate deviation from moving average
df['deviation'] = df['value'] - df['moving_average']
# Identify points with a large deviation (e.g., > 3 standard deviations of the deviation)
anomaly_threshold = df['deviation'].std() * 3
anomalies = df[abs(df['deviation']) > anomaly_threshold]
For categorical data, anomalies are often categories that appear with unusually low frequency.
Frequency analysis identifies categories that are rare compared to the others.
# Assumes 'df' has a 'category' column
frequency = df['category'].value_counts(normalize=True)
# Identify categories that make up less than 1% of the data
rare_categories = frequency[frequency < 0.01].index.tolist()
anomalies = df[df['category'].isin(rare_categories)]
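The frequency check can be tried on a small invented sample. `rare_category_rows` and its `min_share` parameter are hypothetical names used only in this sketch.

```python
import pandas as pd

def rare_category_rows(df: pd.DataFrame, column: str = "category",
                       min_share: float = 0.01):
    """Return (rows in rare categories, per-category shares) for `column`."""
    # value_counts(normalize=True) gives each category's share; NaN is excluded by default
    frequency = df[column].value_counts(normalize=True)
    rare = frequency[frequency < min_share].index
    return df[df[column].isin(rare)], frequency

# Invented sample: 200 rows, where category "C" appears once (0.5% of rows)
sample = pd.DataFrame({"category": ["A"] * 120 + ["B"] * 79 + ["C"]})
flagged, frequency = rare_category_rows(sample)
print(frequency)
print(flagged)
```

Returning the shares alongside the flagged rows makes it easy to state how rare each flagged category actually is.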
Do NOT automatically remove anomalies. Instead:
Always report how anomalies were identified and handled.
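One way to follow this guidance is to mark anomalies in a separate column and return a short record of the method used, rather than dropping rows. The sketch below uses the IQR rule as an example; the helper `flag_iqr_anomalies`, the `is_anomaly` column, and the report fields are all assumptions for illustration.

```python
import pandas as pd

def flag_iqr_anomalies(df: pd.DataFrame, column: str = "value", k: float = 1.5):
    """Return (copy of df with a boolean 'is_anomaly' column, summary report dict)."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr

    flagged = df.copy()  # no rows are removed; the input is left unchanged
    flagged["is_anomaly"] = (flagged[column] < lower) | (flagged[column] > upper)

    report = {
        "method": f"IQR (k={k})",
        "column": column,
        "lower_bound": float(lower),
        "upper_bound": float(upper),
        "rows_checked": int(len(flagged)),
        "rows_flagged": int(flagged["is_anomaly"].sum()),
        "handling": "flagged only; no rows removed",
    }
    return flagged, report

# Invented sample data
sample = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]})
flagged, report = flag_iqr_anomalies(sample)
print(report)
```

The report dictionary can be included in the final answer so the reader sees which method and thresholds produced the flags and that the data was left intact.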
npx claudepluginhub ki-kyvos/kyvos-plugins --plugin kyvos