Job Matching Skill v3.16 (EXP-173: Location proximity clusters for Korean districts)
143 skills, 50+ similarity pairs, 17 domain categories (incl. blockchain, security).
Score Weights (Validated — EXP-017)
| Component | Weight | Score Range | Description |
|---|---|---|---|
| Skill match | 40% | 0-100 | Core skill alignment with tiered matching |
| Experience fit | 20% | 0-100 | Career stage and experience level alignment |
| Company culture fit | 15% | 0-100 | Cultural values and work environment matching |
| Career stage alignment | 15% | 0-100 | Professional development stage compatibility |
| Location/work/salary/employment fit | 10% | 0-100 | Work type, location, salary preference, and employment type alignment |
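The weighted combination in the table can be sketched as below. The function name `overallScore` is hypothetical, the component keys mirror the ones used in the output format, and the sketch deliberately leaves out the gating and domain adjustments described later.

```javascript
// Component weights from the table, kept as integer percentages so the
// sum does not pick up floating-point drift.
const WEIGHTS = {
  skills: 40,
  experience: 20,
  culture: 15,
  career_stage: 15,
  location: 10,
};

// Combine 0-100 component scores into a single 0-100 overall score.
function overallScore(scores) {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    total += scores[name] * weight;
  }
  return Math.round(total / 100);
}

const example = overallScore({
  skills: 80, experience: 75, culture: 90, career_stage: 85, location: 100,
});
console.log(example); // 83  (32 + 15 + 13.5 + 12.75 + 10 = 83.25)
```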
Discrimination Requirements (EXP-028)
After scoring, these rules must hold:
- HIGH group: score ≥ 70
- MEDIUM group: score ≤ 65
- HIGH min − MED max gap ≥ 15
- LOW group: score ≤ 25
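A minimal sketch of how these four rules might be checked after a scoring run. The function name and the input shape (three non-empty arrays of scores) are assumptions made for illustration.

```javascript
// Returns the list of violated rules; an empty list means the run passes.
// Each group is assumed to be a non-empty array of overall scores.
function checkDiscrimination({ high, medium, low }) {
  const violations = [];
  const highMin = Math.min(...high);
  const medMax = Math.max(...medium);
  const lowMax = Math.max(...low);

  if (highMin < 70) violations.push(`HIGH min ${highMin} < 70`);
  if (medMax > 65) violations.push(`MEDIUM max ${medMax} > 65`);
  if (highMin - medMax < 15) violations.push(`HIGH-MED gap ${highMin - medMax} < 15`);
  if (lowMax > 25) violations.push(`LOW max ${lowMax} > 25`);
  return violations;
}

console.log(checkDiscrimination({ high: [78, 85], medium: [50, 60], low: [10, 20] })); // []
```

Note that the gap rule is stricter than the two thresholds alone: a HIGH minimum of 72 and a MEDIUM maximum of 65 satisfy the first two rules but leave a gap of only 7.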
Skill-Gated Scoring (EXP-021, tuned EXP-037, EXP-165)
When skill score < 40, all non-skill components are dampened by a quadratic gate multiplier:
- gate = 0.12 + 0.88 × (skillScore / 40)² for skill < 40; gate = 1.0 for skill ≥ 40
- At skill=0: gate=0.12, skill=10: gate=0.175, skill=20: gate=0.34, skill=40: gate=1.0
- This prevents unrelated jobs from scoring high on experience/culture/location alone
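The gate formula translates directly into code. The sketch below assumes the multiplier is applied to each non-skill component score; `skillGate` and `gatedScore` are hypothetical names.

```javascript
// Quadratic gate: 0.12 at skill 0, rising to 1.0 at skill 40 and above.
function skillGate(skillScore) {
  if (skillScore >= 40) return 1.0;
  const ratio = skillScore / 40;
  return 0.12 + 0.88 * ratio * ratio;
}

// Dampen a non-skill component (experience, culture, ...) by the gate.
function gatedScore(componentScore, skillScore) {
  return componentScore * skillGate(skillScore);
}

for (const skill of [0, 10, 20, 40]) {
  console.log(skill, skillGate(skill).toFixed(3)); // 0.120, 0.175, 0.340, 1.000
}
```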
Job Coverage Gate (EXP-168)
When skill score is above the quadratic gate threshold (≥ 40) but job coverage is below 60%, an additional 0.75 dampening is applied to non-skill components. This catches cases where shared infrastructure skills (AWS, Docker, PostgreSQL) inflate the skill score for fundamentally mismatched domain jobs (e.g., a React/Node.js candidate vs a Python/Django job that happens to use the same infrastructure).
- effectiveGate = skillGate × coverageGate
- coverageGate = 0.75 when skill ≥ 40 AND jobCoverage < 60%; otherwise 1.0
- Experience scoring considers range upper bounds (e.g., "3~7년" with 5 years experience = 95)
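The two gates can be combined as in the sketch below. It assumes `jobCoverage` is a fraction between 0 and 1 (the share of the job's listed skills that the candidate covers); the source states only the 60% threshold, so that definition is an assumption.

```javascript
// Quadratic skill gate, repeated here so the example is self-contained.
function skillGate(skillScore) {
  if (skillScore >= 40) return 1.0;
  const ratio = skillScore / 40;
  return 0.12 + 0.88 * ratio * ratio;
}

// 0.75 only when the skill score passes the gate but coverage is thin.
function coverageGate(skillScore, jobCoverage) {
  return skillScore >= 40 && jobCoverage < 0.6 ? 0.75 : 1.0;
}

function effectiveGate(skillScore, jobCoverage) {
  return skillGate(skillScore) * coverageGate(skillScore, jobCoverage);
}

console.log(effectiveGate(50, 0.5)); // 0.75
console.log(effectiveGate(50, 0.8)); // 1
```

Because the coverage gate only fires at skill ≥ 40, where the skill gate is already 1.0, the two dampeners never stack: the effective gate is either the quadratic value, 0.75, or 1.0.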
Primary Domain Alignment (EXP-024, tuned EXP-037, expanded EXP-104)
When the job's primary technology stack has zero overlap with the candidate's core domain skills, the skill score is penalized by 40% (multiplied by 0.60).
Primary domains detected (EXP-049: framework-aware, EXP-104: full 122-skill coverage):
- js/ts: React, Next.js, Vue, Nuxt, Svelte, Angular, Node.js, Express, NestJS, React Native, Deno, Bun, Remix, Astro, Fastify, Koa, Hono, Vite, Tailwind, Vercel, tRPC, Storybook, Jest, Cypress, Prisma, Drizzle, TypeORM, Sequelize, Mongoose, Redux, Zustand, Recoil, MobX, Vuex, Pinia, Electron, Capacitor, Ionic, Sentry, Firebase, Supabase, GraphQL, REST API, gRPC
- python: Python, Django, Flask, FastAPI, TensorFlow, PyTorch, ML, LLM, LangChain, MLOps, Computer Vision, NLP, HuggingFace, Fine-tuning, Stable Diffusion, RAG, Prompt Engineering, Vector Database
- java: Java, Spring, Spring Boot, JPA, Jetpack Compose, Kotlin
- cloud: AWS, GCP, Azure, AWS Lambda, AWS S3, AWS SQS, DynamoDB, CloudWatch
- devops: Docker, Kubernetes, Terraform, Ansible, Jenkins, GitHub Actions, Linux, Nginx, CI/CD, Datadog, Grafana, Prometheus
- data: PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch, Oracle, MSSQL, Kafka, RabbitMQ, Spark, Hadoop, Airflow, dbt, BigQuery, Snowflake, R
- rust: Rust, Tauri
- go: Go
- swift: Swift, SwiftUI
- c#: C#, .NET, ASP.NET
- c++: C++
- dart: Dart, Flutter
- ruby: Ruby, Rails
- php: PHP, Laravel
- game: Unity, Unreal
- design: Figma
This prevents infrastructure-only overlap from inflating scores for jobs in completely different primary tech stacks. EXP-104: Previously only ~45 skills had domain mappings; now all 122 skills from skill-inference.js are covered, ensuring the domain penalty correctly applies for jobs requiring Vite/Tailwind/Jest, Drizzle/TypeORM, Electron/Tauri, Grafana/Prometheus, Unity/Unreal, etc.
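A rough sketch of the domain penalty, using an abbreviated skill map. How a job's "primary" stack is identified is not spelled out here, so the sketch makes a labeled assumption: cloud, devops, and data count as shared infrastructure, and every other domain counts as primary. All names are hypothetical.

```javascript
// Abbreviated skill -> domain map; the full lists appear above.
const DOMAIN_OF = {
  "React": "js/ts", "Node.js": "js/ts", "Next.js": "js/ts",
  "Python": "python", "Django": "python", "FastAPI": "python",
  "Java": "java", "Spring Boot": "java",
  "AWS": "cloud", "Docker": "devops", "Kubernetes": "devops",
  "PostgreSQL": "data", "Redis": "data",
};

// ASSUMPTION: these domains are treated as infrastructure, not primary.
const INFRA_DOMAINS = new Set(["cloud", "devops", "data"]);

function primaryDomains(skills) {
  const domains = new Set();
  for (const skill of skills) {
    const domain = DOMAIN_OF[skill];
    if (domain && !INFRA_DOMAINS.has(domain)) domains.add(domain);
  }
  return domains;
}

// Multiply the skill score by 0.60 when the primary domains do not overlap.
function applyDomainPenalty(skillScore, jobSkills, candidateSkills) {
  const jobDomains = primaryDomains(jobSkills);
  const candidateDomains = primaryDomains(candidateSkills);
  if (jobDomains.size === 0) return skillScore; // job has no primary stack to compare
  const overlaps = [...jobDomains].some((d) => candidateDomains.has(d));
  return overlaps ? skillScore : skillScore * 0.6;
}

// React/Node.js candidate vs Python/Django job sharing AWS and Docker:
console.log(applyDomainPenalty(60, ["Python", "Django", "AWS", "Docker"],
                                   ["React", "Node.js", "AWS", "Docker"])); // about 36
```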
Technology Similarity Map
Tier 1: Exact Equivalents (100%)
- TypeScript ↔ JavaScript
- React ↔ Next.js
- Vue ↔ Nuxt.js
- PostgreSQL ↔ MySQL ↔ SQL
- Docker ↔ Container
- Kubernetes ↔ K8s (alias)
- spring_boot ↔ Spring Boot (alias)
Tier 2: Strong Compatibility (75%)
- Spring ↔ Spring Boot
- Express ↔ Node.js ↔ NestJS
- FastAPI ↔ Python ↔ Django ↔ Flask (EXP-074: same-language web framework cross-similarity)
- AWS ↔ GCP ↔ Azure ↔ Cloud
- Java ↔ Kotlin (JVM interoperable — EXP-062)
- React ↔ React Native (shared React paradigm — EXP-062)
- GraphQL ↔ REST API (API paradigms — EXP-064)
- Jenkins ↔ GitHub Actions (CI/CD — EXP-064)
- Terraform ↔ Ansible (IaC/config management — EXP-064)
- Kafka ↔ RabbitMQ (message queues — EXP-064)
- TensorFlow ↔ PyTorch (ML frameworks — EXP-064)
- LLM ↔ Machine Learning ↔ PyTorch ↔ TensorFlow (AI/ML ecosystem — EXP-097)
- LangChain ↔ LLM (LLM orchestration — EXP-097)
- RAG ↔ LLM ↔ Vector Database (retrieval-augmented generation — EXP-097)
- Computer Vision ↔ Machine Learning ↔ PyTorch ↔ TensorFlow (CV is ML subfield — EXP-097)
- NLP ↔ Machine Learning ↔ LLM (NLP is ML subfield — EXP-097)
- HuggingFace ↔ PyTorch ↔ TensorFlow ↔ LLM (model hosting — EXP-097)
- MLOps ↔ Machine Learning ↔ Docker ↔ Kubernetes (ML+DevOps — EXP-097)
- Elasticsearch ↔ Redis (real-time data stores — EXP-064)
- Oracle ↔ MSSQL (enterprise RDBMS — EXP-064)
- Dart ↔ Flutter (Flutter's language — EXP-096, promoted from TIER3)
- Angular ↔ TypeScript (Angular mandates TypeScript — EXP-096)
- JPA ↔ Spring ↔ Java (ORM ecosystem — EXP-088)
- DevOps ↔ Docker ↔ Kubernetes ↔ Terraform ↔ CI/CD (DevOps umbrella — EXP-088)
- AWS Lambda/S3/SQS ↔ AWS (AWS services to parent cloud — EXP-088)
Tier 3: Partial Overlap (25%)
- React ↔ Vue ↔ Svelte ↔ Angular (frontend frameworks — EXP-074: Angular added)
- Node.js ↔ Python (server-side)
- AWS ↔ Docker (cloud/containers)
- Docker ↔ Kubernetes (container ecosystem — EXP-062)
- Kubernetes ↔ Container
- SQL ↔ MongoDB (data handling)
- Docker ↔ Terraform (DevOps provisioning — EXP-064)
- Nginx ↔ Docker (infrastructure/deployment — EXP-064)
- Spark ↔ Hadoop (big data ecosystem — EXP-064)
- DevOps ↔ Jenkins ↔ GitHub Actions (CI/CD tools — EXP-088)
- AWS Lambda ↔ Docker ↔ Kubernetes (compute models — EXP-088)
- AWS S3 ↔ BigQuery ↔ Snowflake (data pipeline — EXP-088)
- AWS SQS ↔ Kafka ↔ RabbitMQ (messaging — EXP-088)
- Figma ↔ React ↔ Angular ↔ Vue (design-frontend overlap — EXP-088)
- LLM ↔ RAG ↔ HuggingFace (LLM ecosystem — EXP-097)
- Prompt Engineering ↔ LLM (prompting is LLM-specific — EXP-097)
- Fine-tuning ↔ PyTorch ↔ TensorFlow ↔ Machine Learning (fine-tuning uses ML frameworks — EXP-097)
- Stable Diffusion ↔ PyTorch ↔ Computer Vision (generative AI — EXP-097)
- Vector Database ↔ Elasticsearch ↔ Redis ↔ MongoDB (vector/search/NoSQL overlap — EXP-097)
- MLOps ↔ Terraform ↔ CI/CD (MLOps shares infra with DevOps — EXP-097)
- LangChain ↔ Python ↔ TypeScript (LangChain runs on Python/TS — EXP-097)
- Pandas ↔ Spark (data processing — EXP-064)
- GraphQL ↔ gRPC (modern API protocols — EXP-064)
- MongoDB ↔ Redis (NoSQL stores — EXP-064)
Tier 4: Context-Based Matches (50%)
- Domain-specific associations
- Technology stack relationships
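One way to represent the tiers is a list of groups per weight, as sketched below with a small subset of the pairs. Two assumptions: each "A ↔ B ↔ C" line is treated as a fully connected group, and when a pair appears in several tiers the highest weight wins. Tier 4 is left out because no concrete pairs are listed for it.

```javascript
// Abbreviated tiers; weights are the percentages from the headings.
const TIERS = [
  { weight: 1.0, groups: [["TypeScript", "JavaScript"], ["React", "Next.js"], ["Kubernetes", "K8s"]] },
  { weight: 0.75, groups: [["Express", "Node.js", "NestJS"], ["Java", "Kotlin"], ["React", "React Native"]] },
  { weight: 0.25, groups: [["React", "Vue", "Svelte", "Angular"], ["Node.js", "Python"]] },
];

const norm = (name) => name.trim().toLowerCase();

// Returns 1.0, 0.75, 0.25, or 0 for a pair of skill names.
function similarity(a, b) {
  const x = norm(a);
  const y = norm(b);
  if (x === y) return 1.0;
  let best = 0;
  for (const { weight, groups } of TIERS) {
    for (const group of groups) {
      const names = group.map(norm);
      if (names.includes(x) && names.includes(y)) best = Math.max(best, weight);
    }
  }
  return best;
}

console.log(similarity("Java", "Kotlin")); // 0.75
console.log(similarity("React", "Vue"));   // 0.25
```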
Job Intent Classification
Categorize job postings by technical domain:
- Development: 개발, development, engineer, programmer
- Data: 데이터, data, analytics, AI/ML
- Management: 매니저, manager, leader, pm
- Design: 디자인, design, ui/ux
- Sales: 영업, sales, business development
- Research: 연구, research, scientist, r&d
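A keyword-based classifier along these lines is sketched below. It returns every matching category, since how ties are resolved is not specified here. One design choice is an assumption: Latin keywords must start a word, so that "pm" does not fire inside "development"; Korean keywords use plain substring matching.

```javascript
const INTENT_KEYWORDS = {
  Development: ["개발", "development", "engineer", "programmer"],
  Data: ["데이터", "data", "analytics", "ai/ml"],
  Management: ["매니저", "manager", "leader", "pm"],
  Design: ["디자인", "design", "ui/ux"],
  Sales: ["영업", "sales", "business development"],
  Research: ["연구", "research", "scientist", "r&d"],
};

const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// `text` must already be lowercased.
function hasKeyword(text, keyword) {
  if (!/[a-z]/.test(keyword)) return text.includes(keyword); // Korean keyword
  return new RegExp(`(?<![a-z])${escapeRegExp(keyword)}`).test(text);
}

// Returns matching categories in the order they are listed above.
function classifyIntent(title) {
  const text = title.toLowerCase();
  return Object.entries(INTENT_KEYWORDS)
    .filter(([, keywords]) => keywords.some((k) => hasKeyword(text, k)))
    .map(([category]) => category);
}

console.log(classifyIntent("백엔드 개발자"));            // ["Development"]
console.log(classifyIntent("Product Development Lead")); // ["Development"]
console.log(classifyIntent("Data Engineer"));            // ["Development", "Data"]
```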
Company Culture Keywords (EXP-043, EXP-048)
Culture keywords are extracted from job listing text by the scraper (see skills/job-scraping/SKILL.md):
- Innovative: 혁신, 도전, 창의, 크리에이티브, creative, innovation, 실험, experiment
- Collaborative: 협업, 팀워크, 소통, 협력, collaborat*, teamwork, 함께, 공동, 수평적, 가로형, 크로스 펑셔널
- Fast-paced: 빠른, agile, 실시간, 스타트업, fast-paced, 릴리즈, 스프린트, sprint
- Structured: 체계, 프로세스, systematic, QA, 품질관리, 코드리뷰, code review, 가이드라인
- Learning-focused: 성장, 학습, learning, 교육, 스터디, 멘토링, 세미나, 사내강의, 도서지원
- Autonomous: 자율, 독립, autonomous, 자기주도, 오너십, 자유도, 주도적
- Work-life balance: 워라밸, 워크라이프밸런스, WLB, 유연근무, 시차출근, 자유출퇴근, 연차, 리프레시, 가족친화
When culture_keywords is empty/null, culture score defaults to 50 (neutral). When present, score is based on overlap with candidate's cultural_preferences. Unknown experience, career_stage, and location/work_type also default to 50 — missing data should not inflate scores (EXP-051).
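The neutral default can be sketched as follows. The overlap formula is an assumption: the text only says the score is based on overlap, so this sketch uses the share of the job's culture keywords that the candidate also lists. Category names are assumed to be already normalized (for example "innovative").

```javascript
const NEUTRAL = 50;

// ASSUMPTION: score = percentage of the job's culture keywords that appear
// in the candidate's cultural_preferences. Missing data stays neutral.
function cultureScore(jobKeywords, candidatePrefs) {
  if (!jobKeywords || jobKeywords.length === 0) return NEUTRAL;
  const prefs = new Set(candidatePrefs ?? []);
  const shared = jobKeywords.filter((k) => prefs.has(k)).length;
  return Math.round((100 * shared) / jobKeywords.length);
}

console.log(cultureScore(null, ["innovative"]));                          // 50
console.log(cultureScore(["innovative", "structured"], ["innovative"]));  // 50
console.log(cultureScore(["innovative", "collaborative"],
                         ["innovative", "collaborative", "autonomous"])); // 100
```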
Experience Scoring (EXP-076)
| Job Experience | Candidate Years | Score | Notes |
|---|---|---|---|
| 신입 | 0-1 | 95 | Perfect: new graduate |
| 신입 | 2-3 | 65 | Junior: overqualified |
| 신입 | 4+ | 40 | Senior: poor fit |
| 신입가능 / 신입 가능 | 0-1 | 95 | New grad welcome: perfect |
| 신입가능 / 신입 가능 | 2-3 | 80 | Junior: good |
| 신입가능 / 신입 가능 | 4-7 | 70 | Mid: acceptable |
| 신입가능 / 신입 가능 | 8+ | 50 | Senior: overqualified |
| 신입·경력 / 신입/경력 | any | 85 | Both welcome: broad match |
| 경력무관 | any | 80 | Experience not a factor |
| 경력 (bare) | 0 | 30 | No experience: poor fit |
| 경력 (bare) | 1 | 60 | Minimal: acceptable |
| 경력 (bare) | 3 | 80 | Junior: good fit |
| 경력 (bare) | 5-10 | 90 | Mid/Senior: great fit |
| 경력 (bare) | 15+ | 75 | Very senior: overqualified |
| 3~7년 | 5 | 95 | In range |
| 3년 이상 | 5 | 90 | Meets minimum |
| 3년 이상 | 2 | 70 | Below minimum |
| unknown | any | 50 | Neutral default |
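Part of the table can be expressed as a lookup, sketched below under stated assumptions: `years` is a whole number, and cases the table does not cover (out-of-range years, more than one year short of a minimum, and the bare 경력 label, for which only sample points are given) return `null` rather than a guessed value.

```javascript
const NEUTRAL = 50;

// Pick the score of the first band whose upper bound covers `years`.
function band(years, bands) {
  for (const [maxYears, score] of bands) {
    if (years <= maxYears) return score;
  }
  return null;
}

function scoreExperience(label, years) {
  const text = (label ?? "").replace(/\s+/g, "");
  if (text === "") return NEUTRAL; // unknown requirement

  if (text.includes("신입가능")) return band(years, [[1, 95], [3, 80], [7, 70], [Infinity, 50]]);
  if (/신입[·\/]경력/.test(text)) return 85;
  if (text === "경력무관") return 80;
  if (text === "신입") return band(years, [[1, 95], [3, 65], [Infinity, 40]]);

  const range = text.match(/^(\d+)~(\d+)년$/); // e.g. "3~7년"
  if (range) {
    const min = Number(range[1]);
    const max = Number(range[2]);
    return years >= min && years <= max ? 95 : null;
  }

  const minimum = text.match(/^(\d+)년이상$/); // e.g. "3년 이상"
  if (minimum) {
    const min = Number(minimum[1]);
    if (years >= min) return 90;
    return years === min - 1 ? 70 : null; // only the one-year-short row exists
  }
  return null; // not covered by this sketch
}

console.log(scoreExperience("3~7년", 5));    // 95
console.log(scoreExperience("신입 가능", 5)); // 70
```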
Title-Based Skill Inference (EXP-052)
When job.skills is empty or has <2 entries (common from LinkedIn/partial scrapes), explicit technology keywords are extracted from the job title to improve matching accuracy.
Rules:
- Only extract explicit technology mentions (React, Java, Python, etc.) — do NOT infer from role names
- Korean equivalents supported: 리액트→React, 파이썬→Python, 스프링→Spring, etc.
- Title-inferred skills supplement (not replace) explicit skills
- Not used when job.skills already has ≥2 entries
Example: A job with title: "React/TypeScript 프론트엔드" and skills: [] gets effective skills [React, TypeScript] — matching score reflects actual domain alignment instead of defaulting to neutral 50.
Discrimination impact: Without title inference, a React job with no skills and a Java job with no skills both score ~50. With inference, the React job scores HIGH and the Java job scores LOW for a JS candidate — correct discrimination is restored.
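The rules above might be implemented along these lines. The keyword table is a small illustrative subset, and `effectiveSkills` is a hypothetical name. Latin keywords are matched as whole words (an assumption) so that "java" does not fire inside "javascript".

```javascript
// Keyword -> canonical skill. Korean aliases are the ones named in the
// rules; the English entries are an illustrative subset.
const TITLE_KEYWORDS = [
  ["react", "React"], ["리액트", "React"],
  ["typescript", "TypeScript"], ["javascript", "JavaScript"],
  ["java", "Java"],
  ["python", "Python"], ["파이썬", "Python"],
  ["spring", "Spring"], ["스프링", "Spring"],
];

// `title` must already be lowercased. The keywords above contain no regex
// metacharacters, so they are interpolated without escaping.
function titleHas(title, keyword) {
  if (!/[a-z]/.test(keyword)) return title.includes(keyword);
  return new RegExp(`(?<![a-z0-9])${keyword}(?![a-z0-9])`).test(title);
}

// Supplement (never replace) explicit skills when fewer than 2 are present.
function effectiveSkills(job) {
  const explicit = job.skills ?? [];
  if (explicit.length >= 2) return explicit;
  const title = (job.title ?? "").toLowerCase();
  const merged = [...explicit];
  for (const [keyword, skill] of TITLE_KEYWORDS) {
    if (titleHas(title, keyword) && !merged.includes(skill)) merged.push(skill);
  }
  return merged;
}

console.log(effectiveSkills({ title: "React/TypeScript 프론트엔드", skills: [] }));
// ["React", "TypeScript"]
```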
Salary Preference Alignment (EXP-084)
The 10% Location/Work/Salary component also includes salary preference matching. It applies only when the candidate has preferences.salary_range: {min, max} and the job has salary_min/salary_max populated.
Location Proximity Clusters (EXP-173)
Location matching uses proximity clusters to provide partial credit for nearby districts, not just exact string matches.
Scoring:
- Exact match (candidate pref is substring of job location): +15
- Same cluster (e.g., 강남↔역삼, 판교↔분당, 홍대↔마포): +10
- Adjacent clusters (e.g., 강남↔성수, 강남↔판교): +5
- No proximity (e.g., 강남↔부산): +0
Clusters:
- gangnam: 강남, 역삼, 삼성, 논현, 신사, 청담, 압구정, 선릉, 대치, 도곡, 개포, 일원, 수서
- pangyo: 판교, 분당, 정자, 수내, 미금, 서현, 이매, 야탑, 구미
- cbd: 여의도, 영등포, 당산, 문래, 신길
- downtown: 광화문, 을지로, 종로, 서울역, 명동, 충무로
- hongdae: 홍대, 신촌, 마포, 합정, 망원, 상수, 연남, 공덕, DMC
- seongsu: 성수, 용산, 건대, 왕십리, 한남, 이태원
- guro: 구로, 가산, 독산, 신도림, 관악, 신림, 봉천
- incheon: 인천, 송도, 부평, 일산, 파주, 김포
- suwon: 수원, 평촌, 안양, 동탄, 화성, 천안
- daejeon: 대전, 세종, 유성, 둔산
Adjacent pairs: gangnam↔pangyo, gangnam↔seongsu, cbd↔guro, hongdae↔seongsu, downtown↔hongdae, downtown↔cbd, gangnam↔suwon, pangyo↔suwon
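A sketch of the proximity lookup using the clusters and adjacent pairs listed above. District detection by plain substring search is an assumption, and `locationBonus` is a hypothetical name.

```javascript
const CLUSTERS = {
  gangnam: ["강남", "역삼", "삼성", "논현", "신사", "청담", "압구정", "선릉", "대치", "도곡", "개포", "일원", "수서"],
  pangyo: ["판교", "분당", "정자", "수내", "미금", "서현", "이매", "야탑", "구미"],
  cbd: ["여의도", "영등포", "당산", "문래", "신길"],
  downtown: ["광화문", "을지로", "종로", "서울역", "명동", "충무로"],
  hongdae: ["홍대", "신촌", "마포", "합정", "망원", "상수", "연남", "공덕", "DMC"],
  seongsu: ["성수", "용산", "건대", "왕십리", "한남", "이태원"],
  guro: ["구로", "가산", "독산", "신도림", "관악", "신림", "봉천"],
  incheon: ["인천", "송도", "부평", "일산", "파주", "김포"],
  suwon: ["수원", "평촌", "안양", "동탄", "화성", "천안"],
  daejeon: ["대전", "세종", "유성", "둔산"],
};

const ADJACENT = [
  ["gangnam", "pangyo"], ["gangnam", "seongsu"], ["cbd", "guro"],
  ["hongdae", "seongsu"], ["downtown", "hongdae"], ["downtown", "cbd"],
  ["gangnam", "suwon"], ["pangyo", "suwon"],
];

// First cluster with a district name contained in `text`, or null.
function clusterOf(text) {
  for (const [name, districts] of Object.entries(CLUSTERS)) {
    if (districts.some((d) => text.includes(d))) return name;
  }
  return null;
}

function locationBonus(preference, jobLocation) {
  if (jobLocation.includes(preference)) return 15; // exact match
  const a = clusterOf(preference);
  const b = clusterOf(jobLocation);
  if (!a || !b) return 0;
  if (a === b) return 10; // same cluster
  const adjacent = ADJACENT.some(([x, y]) => (x === a && y === b) || (x === b && y === a));
  return adjacent ? 5 : 0;
}

console.log(locationBonus("강남", "서울 역삼동")); // 10
console.log(locationBonus("강남", "성남시 판교")); // 5
```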
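Scoring breakdown for the Location/work/salary/employment component (base 50):
- Location match: up to +15 (exact +15, same cluster +10, adjacent cluster +5)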
- Work type match: +15
- Salary alignment: -20 to +20
- Employment type alignment: -15 to +5 (EXP-085)
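Putting the breakdown together, a minimal sketch. Clamping to 0-100 is an assumption based on the component's stated score range, since the maximum adjustments sum to 105.

```javascript
const clamp = (value, lo, hi) => Math.min(hi, Math.max(lo, value));

// Each argument is the signed adjustment from the breakdown:
// location 0..15, workType 0 or 15, salary -20..20, employment -15..5.
function fitScore({ location = 0, workType = 0, salary = 0, employment = 0 } = {}) {
  return clamp(50 + location + workType + salary + employment, 0, 100);
}

console.log(fitScore());                                                      // 50
console.log(fitScore({ location: 15, workType: 15, salary: 20, employment: 5 })); // 100
console.log(fitScore({ salary: -20, employment: -15 }));                      // 15
```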
Employment Type Alignment (EXP-085)
Jobs are classified by employment_type: regular (정규직, default), contract (계약직/파견), intern (인턴), freelance (프리랜서). All three post-processors extract this field.
Scoring:
- Match with candidate preference: +5
- Contract job, candidate doesn't want contract: -10
- Intern job, candidate doesn't want intern: -15
- No preference specified or no employment_type: neutral (0)
Backward compatible — jobs without employment_type data score neutrally.
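These rules can be sketched as a small function. Representing the candidate's preference as a list of acceptable employment types is an assumption; the source does not show the preference format.

```javascript
// jobType: "regular" | "contract" | "intern" | "freelance" | undefined
// preferredTypes: hypothetical list of types the candidate accepts
function employmentAdjustment(jobType, preferredTypes) {
  if (!jobType || !preferredTypes || preferredTypes.length === 0) return 0;
  if (preferredTypes.includes(jobType)) return 5;
  if (jobType === "contract") return -10;
  if (jobType === "intern") return -15;
  return 0; // other mismatches: no penalty is documented
}

console.log(employmentAdjustment("regular", ["regular"]));  // 5
console.log(employmentAdjustment("contract", ["regular"])); // -10
console.log(employmentAdjustment(undefined, ["regular"]));  // 0
```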
Salary alignment logic (EXP-084):
- Ranges overlap: +5 to +20 (proportional to overlap ratio)
- Job below candidate min: -5 to -20 (proportional to gap)
- Job above candidate max: +5 (slight positive — above expectations is acceptable)
- No salary data on either side: 0 (neutral, backward compatible)
Example: Candidate wants 5000-8000만원. Job offering 6000-7000 → good overlap (+10). Job offering 2500-3500 → below range (-11). Job offering 9000-12000 → above range (+5).
This prevents a 3000만원 job from scoring identically to a 7000만원 job when the candidate explicitly prefers 5000-8000.
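The salary rules can be sketched as below. The exact proportionality constants are not given, so the coefficients here are assumptions chosen to reproduce the three worked examples: overlap is measured against the width of the candidate's range, and the shortfall against the candidate's minimum. Amounts are in 만원.

```javascript
const clamp = (value, lo, hi) => Math.min(hi, Math.max(lo, value));

// candidate: { min, max } from preferences.salary_range
// job: { min, max } from salary_min / salary_max
function salaryAdjustment(candidate, job) {
  if (!candidate || !job || job.min == null || job.max == null) return 0;

  if (job.max < candidate.min) {
    // ASSUMPTION: penalty grows with the shortfall relative to candidate.min.
    const gapRatio = (candidate.min - job.max) / candidate.min;
    return -clamp(5 + Math.round(20 * gapRatio), 5, 20);
  }
  if (job.min > candidate.max) return 5; // above expectations

  // ASSUMPTION: bonus grows with overlap relative to the candidate's range.
  const overlap = Math.min(candidate.max, job.max) - Math.max(candidate.min, job.min);
  const width = candidate.max - candidate.min;
  const ratio = width > 0 ? overlap / width : 1;
  return clamp(5 + Math.round(15 * ratio), 5, 20);
}

const wants = { min: 5000, max: 8000 };
console.log(salaryAdjustment(wants, { min: 6000, max: 7000 }));  // 10
console.log(salaryAdjustment(wants, { min: 2500, max: 3500 }));  // -11
console.log(salaryAdjustment(wants, { min: 9000, max: 12000 })); // 5
```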
Matching Workflow
- Parse job → extract skills, experience range, culture keywords, work type, location
- Load candidate → from data/resume/master.yaml (skill_summary, experience_years, career_stage, preferences, cultural_preferences)
- Score each component using weights above, with skill-gate and domain alignment adjustments
- Verify discrimination — HIGH/MED/LOW groups must satisfy gap requirements
- Output match report with scores, matched/missing skills, and recommendations
Output Format
{
  "job_id": "JOB-001",
  "overall_score": 83,
  "components": {
    "skills": { "score": 80, "weighted": 32, "matched": ["Node.js", "TypeScript"], "missing": ["Docker"] },
    "experience": { "score": 75, "weighted": 15 },
    "culture": { "score": 90, "weighted": 14, "matched": ["innovative", "collaborative"] },
    "career_stage": { "score": 85, "weighted": 13 },
    "location": { "score": 100, "weighted": 10 }
  },
  "recommendations": ["Docker 경험 추가 학습 권장"]
}
Career Stage Mapping
| Years | Stage |
|---|---|
| 0-1 | entry |
| 1-3 | junior |
| 3-7 | mid |
| 7-12 | senior |
| 12+ | lead |
Career Score Modifiers
- Specific range detected (e.g., "3~7년", "5년"): base score (85 for match)
- Bare 경력 / no specific range: -10 (75 for match) — less informative than explicit ranges
- 무관 (open to all): -15 (70 for match) — accepts all career stages, lowest specificity
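The stage mapping and the specificity modifiers can be sketched together. Because the ranges in the table share endpoints, the boundaries are treated as lower-inclusive here (exactly 3 years maps to mid); that choice is an assumption. The specificity labels `range`, `bare`, and `open` are hypothetical names for the three cases above.

```javascript
// ASSUMPTION: lower-inclusive boundaries, e.g. 3 years -> "mid".
function careerStage(years) {
  if (years < 1) return "entry";
  if (years < 3) return "junior";
  if (years < 7) return "mid";
  if (years < 12) return "senior";
  return "lead";
}

const MATCH_BASE = 85;
const SPECIFICITY_MODIFIER = {
  range: 0,  // explicit range such as "3~7년" or "5년"
  bare: -10, // bare 경력 with no specific range
  open: -15, // 무관, open to all stages
};

// Score for a stage match, adjusted by how specific the requirement is.
function careerMatchScore(specificity) {
  return MATCH_BASE + SPECIFICITY_MODIFIER[specificity];
}

console.log(careerStage(5));           // "mid"
console.log(careerMatchScore("bare")); // 75
```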