Replace fuzzy string matching with semantic AI matching to fix false
positives where similar-sounding but different companies were matched
(e.g., "Families First CiC" incorrectly matching "FAMILIES AGAINST
CONFORMITY LTD").
Changes:
- Add ICompanyNameMatcherService interface and AICompanyNameMatcherService
implementation using Claude Sonnet 4 for semantic company name comparison
- Add SemanticMatchResult and related models for AI match results
- Update CompanyVerifierService to use AI matching with fuzzy fallback
- Add detection for public sector employers, charities, and self-employed
entries that cannot be verified via Companies House
- Update tests to work with the new AI matcher integration
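The AI-first, fuzzy-fallback flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's C# code: `MatchResult`, `match_company`, `fuzzy_match`, and the 0.85 threshold are hypothetical, and the AI matcher is passed in as a plain callable so that no real API is assumed.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Optional


@dataclass
class MatchResult:
    is_match: bool
    confidence: float
    source: str  # "ai" or "fuzzy" (hypothetical labels)


def fuzzy_match(candidate: str, official: str, threshold: float = 0.85) -> MatchResult:
    """Character-level similarity; the threshold is an assumed value."""
    ratio = SequenceMatcher(None, candidate.lower(), official.lower()).ratio()
    return MatchResult(ratio >= threshold, ratio, "fuzzy")


def match_company(
    candidate: str,
    official: str,
    ai_matcher: Callable[[str, str], Optional[MatchResult]],
) -> MatchResult:
    """Prefer the semantic matcher; fall back to fuzzy matching if it
    raises or returns no result."""
    try:
        result = ai_matcher(candidate, official)
        if result is not None:
            return result
    except Exception:
        # Any matcher failure (timeout, bad response, ...) triggers the fallback.
        pass
    return fuzzy_match(candidate, official)
```

Passing the matcher in as a callable keeps the fallback path testable without network access, which is presumably the role the interface plays in the real service.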
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add audit logging system for tracking CV uploads, processing, deletion,
report views, and PDF exports for billing/reference purposes
- Show the current processing stage on the dashboard instead of a generic "Processing" label
- Add delete button for CV checks on dashboard
- Fix duplicate primary key error in CompanyCache (race condition)
- Fix DbContext concurrency in Dashboard (concurrent delete/load operations)
- Fix ProcessCVCheckJob to handle deleted records gracefully
- Fix duplicate flags in verification report by deduplicating on Title+Description
- Remove internal cache notes from verification results
- Add EF migrations for ProcessingStage and AuditLog table
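The flag deduplication on Title+Description amounts to keeping the first flag seen for each (title, description) pair. A minimal Python sketch, assuming a hypothetical `Flag` shape (the real model and its extra fields are not shown in this commit):

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Flag:
    title: str
    description: str
    severity: str = "info"  # hypothetical extra field, ignored by the key


def dedupe_flags(flags: Iterable[Flag]) -> List[Flag]:
    """Keep the first flag for each (title, description) pair, preserving order."""
    seen = set()
    unique = []
    for flag in flags:
        key = (flag.title, flag.description)
        if key not in seen:
            seen.add(key)
            unique.append(flag)
    return unique
```

Keying on the two text fields alone means that flags differing only in other fields collapse into one, which matches the stated intent of removing visible duplicates from the report.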
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use IDbContextFactory pattern to create isolated DbContext instances
for each cache operation, making parallel verification thread-safe.
Changes:
- Add IDbContextFactory<ApplicationDbContext> registration
- Update CompanyVerifierService to use factory for cache operations
- Update tests with InMemoryDatabaseRoot for shared test data
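The idea behind the factory pattern is that each cache operation gets its own short-lived context instead of sharing one across threads. The sketch below shows the same shape in Python with `sqlite3` as a stand-in; it is an analogy under stated assumptions, not EF Core code, and the table and function names are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_connection_factory(db_path):
    """Return a callable that opens a fresh connection on every call."""
    def create():
        # The timeout lets concurrent writers wait for the lock instead of failing.
        return sqlite3.connect(db_path, timeout=30)
    return create


def init_cache(factory):
    conn = factory()
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS company_cache "
            "(number TEXT PRIMARY KEY, name TEXT)"
        )
        conn.commit()
    finally:
        conn.close()


def cache_company(factory, number, name):
    """One isolated connection per cache operation; nothing is shared."""
    conn = factory()
    try:
        conn.execute(
            "INSERT OR REPLACE INTO company_cache (number, name) VALUES (?, ?)",
            (number, name),
        )
        conn.commit()
    finally:
        conn.close()


def cache_all(factory, companies, workers=4):
    """Write all (number, name) pairs in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda item: cache_company(factory, *item), companies))


def count_cached(factory):
    conn = factory()
    try:
        return conn.execute("SELECT COUNT(*) FROM company_cache").fetchone()[0]
    finally:
        conn.close()
```

Because every worker opens and closes its own connection, no connection object is ever touched by two threads, which is the property the factory registration is meant to provide for the parallel verification path.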
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Features:
- Add UK institution recognition (170+ universities)
- Add diploma mill detection (100+ blacklisted institutions)
- Add education verification service with date plausibility checks
- Add local file storage option (no Azure required)
- Add default admin user seeding on startup
- Enhance Serilog logging with file output
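As an illustration of what a date plausibility check for education entries might look like, here is a small Python sketch. The specific rules (end before start, start in the future, a 10-year maximum duration) are assumptions chosen for the example; the rules actually implemented by the education verification service are not listed in this commit.

```python
from datetime import date
from typing import List


def check_education_dates(
    start: date, end: date, today: date, max_years: int = 10
) -> List[str]:
    """Return a list of plausibility issues; an empty list means no issue found.

    The rules and the max_years default are hypothetical.
    """
    issues = []
    if end < start:
        issues.append("end date precedes start date")
    elif (end - start).days > max_years * 366:
        issues.append("duration unusually long")
    if start > today:
        issues.append("start date is in the future")
    return issues
```

Taking `today` as a parameter rather than reading the clock inside the function keeps the check deterministic in unit tests.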
Security fixes:
- Fix path traversal vulnerability in LocalFileStorageService
- Fix open redirect in login endpoint (use LocalRedirect)
- Fix password validation message (12 chars, not 6)
- Fix login to use HTTP POST endpoint (avoid Blazor cookie issues)
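The first two fixes follow well-known patterns: resolve a requested file name and confirm it stays under the storage root, and only redirect to URLs that are local to the site. The Python sketch below illustrates both under stated assumptions. The function names are hypothetical, and `is_local_url` is a simplified approximation of the kind of check that `LocalRedirect` relies on, not a reimplementation of it.

```python
import os


def resolve_within_root(root: str, relative_name: str) -> str:
    """Resolve relative_name under root; reject anything that escapes root."""
    root_real = os.path.realpath(root)
    target = os.path.realpath(os.path.join(root_real, relative_name))
    # commonpath equals the root only when target is the root or lies beneath it.
    if os.path.commonpath([root_real, target]) != root_real:
        raise ValueError("path escapes storage root: %r" % relative_name)
    return target


def is_local_url(url: str) -> bool:
    """Accept only site-relative paths such as "/dashboard".

    Rejects absolute URLs and the "//host" and "/\\host" forms, which
    browsers may treat as pointing at another host.
    """
    if not url or url[0] != "/":
        return False
    if len(url) > 1 and url[1] in "/\\":
        return False
    return True


def safe_redirect_target(return_url: str, default: str = "/") -> str:
    """Fall back to a default page when the requested target is not local."""
    return return_url if is_local_url(return_url) else default
```

Resolving the path before comparing matters: a plain string-prefix check on the unresolved path can be bypassed with `..` segments or symlinks.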
Code improvements:
- Add CancellationToken propagation to CV parser
- Add shared helpers (JsonDefaults, DateHelpers, ScoreThresholds)
- Add IUserContextService for user ID extraction
- Parallelize company verification in ProcessCVCheckJob
- Add 28 unit tests for education verification
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>