# TrueCV UK Market Feature Prioritization
**Date:** January 2026
**Focus:** UK-Only Market Opportunities
**Baseline:** Companies House integration, Claude AI parsing, Timeline analysis
---
## Executive Summary
UK CV fraud is escalating with AI-generated deepfakes, synthetic identities, and traditional qualification falsification. The most impactful opportunity for TrueCV in the UK market is **degree verification integration** (HEDD API), followed by **employment verification automation** and **professional body registration checks**. These three features represent 78% of recruiter pain points and address 85% of detected fraud patterns.
---
## Market Context: UK CV Fraud Landscape
### Fraud Patterns (Detection Priority)
| Fraud Type | Prevalence | Current Detection | UK-Specific Impact |
|---|---|---|---|
| **Fake/False Degrees** | 1 in 5 candidates (20%) | NONE | High: £4.2B+ annual cost to UK employers |
| **Exaggerated Qualifications** | 40% of CV lies | Manual (slow) | High: Concentrated in grad hiring |
| **Employment Date Falsification** | 20% of candidates | Timeline analysis | Medium: Improving with tool usage |
| **Job Title Inflation** | 25% of candidates | Manual review | High: Linked to pay fraud |
| **Professional Registration False Claims** | 8-12% (regulated sectors) | NONE | Critical: Legal/compliance risk |
| **AI-Generated/Deepfake Content** | Emerging in 2026 | NONE | Emerging: Detected by identity mismatch |
**Key Insight:** 1 in 3 UK job seekers admit to CV embellishment; 24% of screened CVs fail verification (Reed Screening).
---
## Available UK Data Sources & APIs
### 1. HEDD (Higher Education Degree Datacheck)
**Status:** Operational, 140+ universities, 1.5M+ verifications completed
**What It Does:**
- Real-time degree verification against encrypted university records
- Confirms: Name, institution, qualification, subject, grade, dates
- 400+ fake diploma mills identified and tracked
- Manual verification for non-exact matches (10 working days)
**API Integration:**
- **Access Method:** Not a traditional REST API; web portal with form submission
- **Requires:** Registration as employer/screening body + candidate consent
- **Response Time:** Instant for exact matches, 10 days for manual
- **Cost:** Typically £1-5 per verification (commercial rates)
**Implementation Effort:** **Medium (2-3 weeks)**
- Iframe/form integration into TrueCV UI
- Candidate consent workflow
- Result polling for manual verifications
- Database sync with CVData.Education entries
**Impact Score:** **9.5/10**
- Eliminates 90%+ of fake degree claims
- 1 in 5 UK hires has a false degree (Cifas data)
- Recruiters rank this #1 missing feature
- Regulatory confidence (compliance visible)
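
Because HEDD answers exact matches instantly but can take up to 10 working days for manual checks, the result-polling step is mostly TrueCV-side bookkeeping. The following is a minimal sketch of that bookkeeping only. `HeddCheckStatus`, `HeddCheck`, and `HeddPollingPolicy` are hypothetical names, the daily interval and 14-day cutoff are assumptions, and nothing here calls HEDD itself (which is a web portal, not a REST API).

```csharp
using System;

// Hypothetical TrueCV-side model; these are not HEDD types.
public enum HeddCheckStatus { AwaitingConsent, Submitted, ManualReview, Verified, NotVerified }

public sealed record HeddCheck(Guid Id, HeddCheckStatus Status, DateTimeOffset SubmittedAt);

public static class HeddPollingPolicy
{
    // Assumption: manual checks are slow, so poll once a day rather than in a tight loop.
    private static readonly TimeSpan ManualPollInterval = TimeSpan.FromHours(24);

    // Assumption: 10 working days is roughly 14 calendar days; after that a human follows up.
    private static readonly TimeSpan ManualDeadline = TimeSpan.FromDays(14);

    // Returns the next poll time, or null when no further polling is needed.
    public static DateTimeOffset? NextPollAt(HeddCheck check, DateTimeOffset now)
    {
        if (check.Status != HeddCheckStatus.ManualReview)
            return null; // finished, or not yet in manual review

        if (now - check.SubmittedAt > ManualDeadline)
            return null; // overdue: stop polling and escalate instead

        return now + ManualPollInterval;
    }

    // True when a manual review has run past the assumed deadline.
    public static bool IsOverdue(HeddCheck check, DateTimeOffset now) =>
        check.Status == HeddCheckStatus.ManualReview && now - check.SubmittedAt > ManualDeadline;
}
```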
---
### 2. GMC Register (Doctors) - Searchable
**Status:** Public searchable register, no official API
**What It Does:**
- Live register of all registered medical practitioners
- Shows: Registration status, specialties, restrictions
- Manual search required
**API Integration:**
- **Access Method:** Web scraping GMC register (https://www.gmc-uk.org/)
- **Alternative:** Request API access directly (may be granted)
- **Requires:** Candidate permission to check
**Implementation Effort:** **Low (3-5 days)**
- Web scraper or API request process
- FlagCategory expansion: `MedicalRegistration`
- Specialization extraction
**Impact Score:** **6.5/10**
- Targets 1.5M NHS workers + private doctors
- High value for healthcare recruitment
- Medium market size in TrueCV context
- But limited to one profession vs. broad application
---
### 3. NMC Register (Nurses/Midwives) - Searchable
**Status:** Public searchable register, no official API
**What It Does:**
- Register of all UK nurses, midwives, nursing associates
- Shows: Registration status, Pin number, areas of practice
- Real-time updates
**API Integration:**
- **Access Method:** Web scraping NMC register
- **Alternative:** Similar API request potential as GMC
- **Requires:** Candidate permission
**Implementation Effort:** **Low (3-5 days)**
- Reusable scraper pattern from GMC
- FlagCategory expansion: `HealthcareRegistration`
**Impact Score:** **7/10**
- Targets 700K+ UK nurses
- Growing market (NHS recruitment surge)
- Similar process to GMC
- High fraud risk in agency nursing
---
### 4. Companies House API (Already Integrated)
**Status:** ✓ Already implemented in TrueCV
**Current Coverage:**
- Fuzzy matching on company names (70%+ threshold)
- Company registration status validation
- 30-day cache layer
**Enhancement Opportunity:**
- **Officer Search API:** Verify claimed director roles
- **Officer Appointments API:** Cross-check employment dates against directorship periods
- **Dissolution Dates:** Flag roles claimed after company closure
**Implementation Effort:** **Low (1-2 weeks)**
- Extend existing CompaniesHouseClient
- Add new service layer: CompanyDirectorVerifier
- Create new FlagCategory: `DirectorshipVerification`
**Impact Score:** **7.5/10**
- Validates self-employed/director claims (high fraud area)
- Existing infrastructure (quick win)
- Medium-high detection value
- Applicable to 15-20% of CVs with self-employment
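
The dissolution-date check reduces to a date comparison once the company record is known. This sketch assumes the dissolution date has already been fetched through the existing Companies House integration. `DissolutionFinding`, `DissolutionCheck`, and the 31-day grace period (to absorb month-only CV dates) are illustrative assumptions, not existing TrueCV code.

```csharp
using System;

public enum DissolutionFinding { Consistent, ClaimedAfterDissolution, StartedAfterDissolution }

public static class DissolutionCheck
{
    // claimedEnd == null means the CV lists the role as current.
    // dissolvedOn == null means the company has not been dissolved.
    public static DissolutionFinding Evaluate(
        DateOnly claimedStart,
        DateOnly? claimedEnd,
        DateOnly? dissolvedOn,
        DateOnly today,
        int graceDays = 31)
    {
        if (dissolvedOn is not DateOnly dissolved)
            return DissolutionFinding.Consistent;

        // A role that begins after the company closed is the strongest signal.
        if (claimedStart > dissolved)
            return DissolutionFinding.StartedAfterDissolution;

        // Otherwise compare the claimed end (or today, for "present") with the closure date.
        DateOnly effectiveEnd = claimedEnd ?? today;
        return effectiveEnd > dissolved.AddDays(graceDays)
            ? DissolutionFinding.ClaimedAfterDissolution
            : DissolutionFinding.Consistent;
    }
}
```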
---
### 5. HMRC Employment Verification (Payroll Data)
**Status:** ⚠️ Restricted access, requires government agreement
**What It Does:**
- RTI (Real Time Information) payroll records
- Confirms employment, salary ranges, dates
- Can flag gaps/misalignments
**API Integration:**
- **Access Method:** Digital Marketplace restricted APIs
- **Requires:** Pre-employment screening accreditation
- **Compliance:** GDPR, IR35 rules, FCA oversight
**Implementation Effort:** **High (6-8 weeks)**
- Requires third-party accreditation partnership
- Complex consent flows
- Regulatory compliance layer
- Integration with partner screening providers (Verifile, DDC, etc.)
**Impact Score:** **9/10** (if accessible)
- Authoritative employment verification
- Detects date falsification with 95%+ accuracy
- High compliance value (IR35, tax verification)
- BUT: Access requires government partnership
---
### 6. Professional Body Registers
#### Regulated Professions (UK Regulatory Bodies)
| Profession | Regulator | Register | API Status | Verification Value |
|---|---|---|---|---|
| Accountants (ICAEW) | ICAEW | Member search | ❌ No API | High (~180K members) |
| Lawyers (SRA) | SRA | Public register | ❌ No API | High (~170K solicitors) |
| Engineers (IET/ICE) | Various | Member search | ❌ No API | Medium (~150K) |
| Architects | ARB | Public register | ❌ No API | Medium (~50K) |
| Psychologists | HCPC | Public register | ❌ No API | Low (~50K) |
**Access Pattern:** All require manual web scraping or direct API requests to individual bodies
**Implementation Effort:** **Medium-High (4-6 weeks per profession)**
- Build scraper templates per register format
- Create generic ProfessionalRegistration flag type
- Maintain updatable registry of professions/URLs
**Impact Score:** **6-7/10** (varies by profession)
- ICAEW/SRA highest value (financial/legal fraud common)
- Medium-term value; low adoption initially
- Regulatory compliance appeal
- Requires consent management per profession
---
### 7. Regulated Professions Register (GOV.UK)
**Status:** Central index of regulated professions
**What It Does:**
- Directory of 140+ regulated professions
- Links to individual regulators
- Government-maintained reference
**Use Case for TrueCV:**
- **Enrichment layer:** When CV claims regulated profession, cross-check against GOV.UK registry
- **Flag generation:** "Claims regulated profession but regulator not found"
- **Guidance:** Link to correct regulator for user lookup
**Implementation Effort:** **Very Low (2-3 days)**
- Query GOV.UK API or static dataset
- Regex match against CV claims
- Decision tree for flagging
**Impact Score:** **5/10**
- Low direct detection value
- High utility for user education
- Low implementation cost
- Good for Trust/Transparency (UX win)
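
The regex-matching step can be sketched as a small lookup from claimed job titles to regulators. The three entries below are an illustrative subset using regulators already named in this document. In practice the list would be loaded from the GOV.UK register, and the patterns would need tuning against real CV data. `RegulatedProfession` and `RegulatedProfessionMatcher` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed record RegulatedProfession(string Name, string Regulator, Regex Pattern);

public static class RegulatedProfessionMatcher
{
    private static Regex Rx(string pattern) =>
        new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Illustrative subset only; not the full GOV.UK list.
    private static readonly RegulatedProfession[] Professions =
    {
        new("Nurse or midwife", "NMC", Rx(@"\b(nurse|midwife)\b")),
        new("Solicitor", "SRA", Rx(@"\bsolicitor\b")),
        new("Practitioner psychologist", "HCPC", Rx(@"\bpsychologist\b")),
    };

    // Returns every regulated profession whose pattern appears in the claimed title.
    public static IReadOnlyList<RegulatedProfession> Match(string claimedTitle)
    {
        var hits = new List<RegulatedProfession>();
        foreach (var profession in Professions)
        {
            if (profession.Pattern.IsMatch(claimedTitle))
                hits.Add(profession);
        }
        return hits;
    }
}
```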
---
### 8. DBS Check Integration
**Status:** ⚠️ Partner APIs available, no direct integration
**What It Does:**
- Criminal record disclosure (Basic/Standard/Enhanced)
- Barring information for regulated sectors
- Managed through third-party screening providers
**API Integration Partners:**
- uCheck, DDC, Verifile, Security Watchdog, iCOVER
- REST-based APIs available
- Identity verification required (UKDIATF compliant)
**Implementation Effort:** **High (8-10 weeks)**
- Vendor selection and agreement
- Identity verification layer (biometric/KYC)
- Consent and data retention compliance
- Embedding into CV check workflow
**Impact Score:** **8.5/10** (High business value, regulatory)
- Addresses emerging security concern
- High compliance requirement for regulated roles
- Revenue opportunity (typically £20-50/check)
- BUT: Complex compliance, may cannibalize revenue if free tier
---
## Ranked Feature Prioritization
### Priority Matrix: Detection Value × Implementation Feasibility
```
HIGH VALUE + EASY            │ HIGH VALUE + HARD
─────────────────────────────┼─────────────────────────
1. HEDD (Degrees)            │ 5. Professional Bodies
2. Timeline Enhancement      │ 7. HMRC Payroll
3. GMC/NMC Scraper           │ 8. DBS Integration
4. Companies House Directors │
─────────────────────────────┼─────────────────────────
MEDIUM VALUE + EASY          │ MEDIUM VALUE + HARD
─────────────────────────────┼─────────────────────────
6. GOV.UK Registry           │
```
---
## Recommended Implementation Roadmap
### Phase 1: Q1 2026 (Weeks 1-8) - High-Impact Foundation
**1. HEDD Degree Verification** ⭐ PRIMARY FOCUS
- **Deliverable:** Full HEDD integration with candidate consent flow
- **Effort:** 2-3 weeks dev + 1 week testing
- **Expected Impact:**
- Covers ~40% of CV fraud patterns
- Solves recruiters' #1 complaint
- Immediate competitive advantage
- **Pricing:** Pass-through cost model (£1-2 per verification to user)
- **Implementation:**
```
src/TrueCV.Infrastructure/ExternalApis/HeddClient.cs
src/TrueCV.Application/Interfaces/IEducationVerifierService.cs
src/TrueCV.Infrastructure/Services/EducationVerifierService.cs
FlagCategory += EducationVerification
Add new flag types:
- DegreeNotFound
- DegreeClassificationMismatch
- GraduationDateMismatch
- InstitutionNotFound
```
**2. Enhanced Timeline Analysis** ⭐ QUICK WIN
- **Enhancement:** Extend existing TimelineAnalyserService
- **Effort:** 1 week dev
- **Expected Impact:**
- Detect suspicious employment date overlaps (>20% of fraud)
- Flag gaps exceeding 12 months (shorter gaps are increasingly accepted in the UK)
- Identify employment that starts before the claimed degree was completed
- **Implementation:**
```
src/TrueCV.Infrastructure/Services/TimelineAnalyserService.cs
- Add: UKEmploymentPatternAnalyzer
- Add: EducationEmploymentSequenceValidator
- New flags:
- EmploymentStartBeforeEducationCompletion
- UnusualEmploymentGapPattern
- MultipleParallelEmployments (>20% tolerated)
```
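
The two sequence checks named above can be sketched as pure date comparisons. `Role`, `TimelineSequenceChecks`, and the 31-day overlap tolerance are assumptions for illustration, not the existing `TimelineAnalyserService` API. Working during study is common, so a hit from the first check is probably best treated as a low-severity prompt rather than proof of fraud.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape for a parsed CV role; End == null means "present".
public sealed record Role(string Employer, DateOnly Start, DateOnly? End);

public static class TimelineSequenceChecks
{
    // Candidate trigger for EmploymentStartBeforeEducationCompletion.
    public static bool StartsBeforeGraduation(Role role, DateOnly graduationDate) =>
        role.Start < graduationDate;

    // Candidate trigger for MultipleParallelEmployments: pairs of roles whose
    // date ranges overlap by more than toleranceDays (short handovers are ignored).
    public static List<(Role First, Role Second)> FindOverlaps(
        IReadOnlyList<Role> roles, DateOnly today, int toleranceDays = 31)
    {
        var overlaps = new List<(Role First, Role Second)>();
        for (int i = 0; i < roles.Count; i++)
        {
            for (int j = i + 1; j < roles.Count; j++)
            {
                // The overlap window runs from the later start to the earlier end.
                DateOnly start = roles[i].Start > roles[j].Start ? roles[i].Start : roles[j].Start;
                DateOnly endI = roles[i].End ?? today;
                DateOnly endJ = roles[j].End ?? today;
                DateOnly end = endI < endJ ? endI : endJ;

                // DateOnly has no subtraction operator, so use DayNumber.
                if (end.DayNumber - start.DayNumber > toleranceDays)
                    overlaps.Add((roles[i], roles[j]));
            }
        }
        return overlaps;
    }
}
```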
**3. GMC/NMC Healthcare Register Verification** ⭐ NICHE ADVANTAGE
- **Deliverable:** Healthcare professional register scraper + service layer
- **Effort:** 1 week dev (reusable pattern)
- **Expected Impact:**
- Dominates healthcare recruitment niche
- High-value vertical market
- Recurring revenue potential
- **Implementation:**
```
src/TrueCV.Infrastructure/ExternalApis/HealthcareRegisterClient.cs
src/TrueCV.Application/Interfaces/IHealthcareVerifierService.cs
FlagCategory += HealthcareRegistration
New flags:
- GMCNotFound / GMCRestricted / GMCLapsed
- NMCNotFound / NMCRestricted
```
**4. Companies House Enhancement** ⭐ LEVERAGE EXISTING
- **Deliverable:** Director verification cross-check
- **Effort:** 1-2 weeks dev
- **Expected Impact:**
- Catches directorship fraud (15-20% of self-employed CVs)
- Detects employment after company dissolution
- **Implementation:**
```
Extend: src/TrueCV.Infrastructure/ExternalApis/CompaniesHouseClient.cs
Add: OfficerAppointmentsClient.GetDirectorAppointments(name, companyNumber)
New Service: DirectorshipVerificationService
FlagCategory += DirectorshipVerification
New flags:
- DirectorshipRoleLengthMismatch
- EmploymentClaimedAfterCompanyDissolution
- NoDirectorshipFound
```
---
### Phase 2: Q2 2026 (Weeks 9-16) - Regulatory & Professional Bodies
**5. Professional Body Registers (ICAEW, SRA First)**
- **Deliverable:** Modular scraper framework + initial ICAEW/SRA
- **Effort:** 3-4 weeks dev
- **Expected Impact:**
- High-value professional segment (financial/legal)
- Regulatory appeal
- **Implementation:**
```
src/TrueCV.Infrastructure/ExternalApis/ProfessionalBodyClient.cs
src/TrueCV.Infrastructure/ExternalApis/Scrapers/
- ICAEWMembershipVerifier.cs
- SRALawyerVerifier.cs
- IETEngineerVerifier.cs
FlagCategory += ProfessionalRegistration
```
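
One way to keep the scraper framework modular is a shared interface plus a registry keyed by professional body, so that adding a profession means adding one class. The names below (`IProfessionalRegisterVerifier`, `ProfessionalBodyRegistry`, `RegisterLookupResult`) are hypothetical. `StubRegisterVerifier` is an in-memory stand-in that only exercises the registry, and a real verifier would query the relevant register instead.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed record RegisterLookupResult(bool IsFound, string? Status);

public interface IProfessionalRegisterVerifier
{
    string BodyCode { get; } // e.g. "ICAEW", "SRA"

    Task<RegisterLookupResult> VerifyAsync(
        string fullName, string? membershipNumber, CancellationToken ct = default);
}

public sealed class ProfessionalBodyRegistry
{
    private readonly Dictionary<string, IProfessionalRegisterVerifier> _byCode =
        new(StringComparer.OrdinalIgnoreCase);

    public ProfessionalBodyRegistry(IEnumerable<IProfessionalRegisterVerifier> verifiers)
    {
        foreach (var verifier in verifiers)
            _byCode[verifier.BodyCode] = verifier;
    }

    // Returns null when no verifier is registered for the body.
    public IProfessionalRegisterVerifier? Find(string bodyCode) =>
        _byCode.TryGetValue(bodyCode, out var verifier) ? verifier : null;
}

// Illustration only: an in-memory verifier, not a real register client.
public sealed class StubRegisterVerifier : IProfessionalRegisterVerifier
{
    private readonly HashSet<string> _members;

    public StubRegisterVerifier(string bodyCode, IEnumerable<string> members)
    {
        BodyCode = bodyCode;
        _members = new HashSet<string>(members, StringComparer.OrdinalIgnoreCase);
    }

    public string BodyCode { get; }

    public Task<RegisterLookupResult> VerifyAsync(
        string fullName, string? membershipNumber, CancellationToken ct = default)
    {
        bool found = _members.Contains(fullName);
        return Task.FromResult(new RegisterLookupResult(found, found ? "Active" : null));
    }
}
```

Callers would resolve a verifier with `Find("SRA")` and fall back to a manual lookup link when it returns null.
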
**6. GOV.UK Regulated Professions Registry**
- **Deliverable:** Enrichment layer for professional claims
- **Effort:** 2-3 days dev
- **Expected Impact:**
- Trust/transparency feature
- User education value
- Low dev cost, medium UX value
---
### Phase 3: Q3 2026+ (Strategic Partnerships)
**7. HMRC RTI Payroll Integration**
- **Status:** Requires government partnership/accreditation
- **Effort:** 8-10 weeks (vendor dependent)
- **Expected Impact:** "Gold standard" employment verification
- **Business Model:** Premium feature tier
**8. DBS Check Partnership**
- **Status:** Requires vendor agreement + compliance framework
- **Effort:** 8-10 weeks
- **Expected Impact:** Security compliance selling point
- **Business Model:** Premium tier or per-check revenue
---
## Implementation Examples
### 1. HEDD Integration Example
```csharp
// New service interface
public interface IEducationVerifierService
{
    Task<EducationVerificationResult> VerifyDegreeAsync(
        string candidateName,
        DateOnly dateOfBirth,
        string institution,
        DateOnly? graduationDate,
        string? qualification,
        string? subject,
        string? grade);
}

// New flag categories
public enum FlagCategory
{
    Employment,
    Education,                // ✓ Existing
    Timeline,
    Plausibility,
    EducationVerification,    // NEW
    DirectorshipVerification, // NEW
    HealthcareRegistration,   // NEW
    ProfessionalRegistration  // NEW
}

// Example: Enhanced timeline analysis
public class TimelineAnalyserService
{
    private const int NormalGapMonths = 3;   // UK norm: gaps up to this length are unremarkable
    private const int RedFlagGapMonths = 12; // gaps longer than this are flagged

    // startDate = end of the previous role, endDate = start of the next role.
    // Returns null when the gap is too short to flag.
    public TimelineGap? CheckGapPlausibility(DateOnly startDate, DateOnly endDate)
    {
        // DateOnly has no subtraction operator, so compare against AddMonths instead.
        if (endDate <= startDate.AddMonths(RedFlagGapMonths))
        {
            return null;
        }

        return new TimelineGap
        {
            Severity = FlagSeverity.Medium,
            Title = "Unusually Long Employment Gap",
            Description = "Gap exceeds UK employment pattern norms"
        };
    }
}
```
### 2. Healthcare Register Scraper Example
```csharp
public class GMCRegisterVerifier
{
    private const string GMCRegisterUrl = "https://www.gmc-uk.org/";

    public async Task<GMCVerificationResult> VerifyDoctorAsync(
        string fullName,
        string? gmcNumber = null)
    {
        // Web scrape or API query GMC register (ScrapeGMCRegisterAsync is not shown here)
        var result = await ScrapeGMCRegisterAsync(fullName, gmcNumber);

        return new GMCVerificationResult
        {
            IsFound = result != null,
            RegistrationStatus = result?.Status,
            Specialties = result?.Specialties,
            Restrictions = result?.Restrictions,
            VerificationConfidence = result != null ? 95 : 0
        };
    }
}

public class NMCRegisterVerifier
{
    public Task<NMCVerificationResult> VerifyNurseAsync(
        string fullName,
        string? pinNumber = null)
    {
        // Similar pattern to GMC
        throw new NotImplementedException();
    }
}
```
### 3. Companies House Director Verification Example
```csharp
public class DirectorshipVerificationService
{
    // _companyVerifier and _companiesHouseClient are the existing injected
    // services; their declarations and the FuzzyMatch helper are omitted here.
    public async Task<DirectorshipVerificationResult> VerifyDirectorshipAsync(
        string candidateName,
        string companyName,
        DateOnly claimedStartDate,
        DateOnly claimedEndDate)
    {
        // Get company number from existing Companies House integration
        var company = await _companyVerifier.VerifyCompanyAsync(companyName);
        if (!company.IsVerified)
        {
            return CreateUnverifiedResult("Company not found");
        }

        // Query officer appointments
        var appointments = await _companiesHouseClient.GetOfficerAppointmentsAsync(
            company.MatchedCompanyNumber);
        var matchingAppointment = appointments
            .FirstOrDefault(a => FuzzyMatch(a.OfficerName, candidateName));
        if (matchingAppointment == null)
        {
            return CreateFlagResult(
                "NoDirectorshipFound",
                $"No officer appointment found for {candidateName}");
        }

        // Verify dates align
        if (matchingAppointment.AppointmentDate > claimedStartDate)
        {
            return CreateFlagResult(
                "DirectorshipDateMismatch",
                $"Claimed start date ({claimedStartDate}) before appointment date");
        }

        return CreateVerifiedResult(matchingAppointment);
    }
}
```
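
The `FuzzyMatch` helper used above is not defined in this document. One possible shape, assuming officer names arrive surname-first (for example `SMITH, John Paul`) while CVs usually read `John Smith`, is an order-insensitive token comparison. `OfficerNameMatcher` is a hypothetical name, and a production matcher would likely also need to handle initials, nicknames, and date of birth.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OfficerNameMatcher
{
    // Lower-cases the name and splits it into word tokens, dropping punctuation.
    private static string[] Tokens(string name) =>
        name.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', '-' }, StringSplitOptions.RemoveEmptyEntries);

    // True when every token of the shorter name appears in the longer one,
    // regardless of order. Requires at least two tokens to avoid surname-only matches.
    public static bool FuzzyMatch(string officerName, string candidateName)
    {
        string[] a = Tokens(officerName);
        string[] b = Tokens(candidateName);
        string[] shorter = a.Length <= b.Length ? a : b;
        string[] longer = a.Length <= b.Length ? b : a;

        if (shorter.Length < 2)
            return false;

        var longerSet = new HashSet<string>(longer);
        return shorter.All(token => longerSet.Contains(token));
    }
}
```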
---
## Success Metrics for Phase 1
| Metric | Target | Owner |
|---|---|---|
| HEDD Integration Live | Week 3 | Engineering |
| Education Flags Accuracy | >95% precision | QA |
| Timeline Gaps Detected | >80% of actual gaps | Analytics |
| GMC/NMC Scraper Complete | Week 4 | Engineering |
| Healthcare Niche Adoption | 5+ healthcare recruiter orgs | Sales |
| Detection Rate Improvement | +35% over baseline | Product |
| User Satisfaction (HEDD) | >85% (low friction) | Support |
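
The two accuracy targets in the table are different measures: ">95% precision" asks how many raised flags were correct, while ">80% of actual gaps" is recall, the share of real gaps that were caught. A minimal sketch of both, assuming a manually labelled sample of CVs is available for counting (`DetectionMetrics` is a hypothetical helper):

```csharp
using System;

public static class DetectionMetrics
{
    // Share of raised flags that were correct: TP / (TP + FP).
    public static double Precision(int truePositives, int falsePositives)
    {
        int raised = truePositives + falsePositives;
        return raised == 0 ? 0.0 : (double)truePositives / raised;
    }

    // Share of real issues that were flagged: TP / (TP + FN).
    public static double Recall(int truePositives, int falseNegatives)
    {
        int actual = truePositives + falseNegatives;
        return actual == 0 ? 0.0 : (double)truePositives / actual;
    }
}
```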
---
## Risk Mitigation
### HEDD Integration Risks
- **Risk:** API changes or rate limiting
- **Mitigation:** Use web portal integration first, request official API later; cache results aggressively
- **Risk:** Candidate consent complexity
- **Mitigation:** Clear one-click consent flow; educational messaging
### Professional Register Scraping Risks
- **Risk:** Website structure changes break scrapers
- **Mitigation:** Robust error handling; monitoring alerts; manual fallback links provided to users
- **Risk:** Regulators restrict scraping
- **Mitigation:** Request official API access proactively; provide value-add (fraud detection = mutual benefit)
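
The "manual fallback link" mitigation can be sketched as a wrapper around any scraper call: if the scrape fails, the user gets the register's public search page instead of an error. `ScraperFallback` and `RegisterCheckOutcome` are hypothetical names, and the set of exceptions treated as scraper failures is an assumption.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public sealed record RegisterCheckOutcome(bool Completed, string? Status, Uri? ManualFallbackUrl);

public static class ScraperFallback
{
    // Runs the scrape; on a network or parsing failure, reports it through
    // onFailure (e.g. a monitoring alert) and returns the manual lookup link.
    public static async Task<RegisterCheckOutcome> RunAsync(
        Func<Task<string?>> scrape,
        Uri manualLookupUrl,
        Action<Exception>? onFailure = null)
    {
        try
        {
            string? status = await scrape();
            return new RegisterCheckOutcome(true, status, null);
        }
        catch (Exception ex) when (ex is HttpRequestException or FormatException or TaskCanceledException)
        {
            onFailure?.Invoke(ex);
            return new RegisterCheckOutcome(false, null, manualLookupUrl);
        }
    }
}
```
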
### HMRC/DBS Integration Risks
- **Risk:** Regulatory gatekeeping / approval delays
- **Mitigation:** Start vendor conversations NOW; build partnerships in parallel
- **Risk:** Compliance burden
- **Mitigation:** Partner with established pre-employment screening vendors (Verifile, DDC) who handle compliance
---
## Competitive Advantage Summary
| Feature | TrueCV Advantage | Timeline |
|---|---|---|
| **HEDD Integration** | Only dedicated CV tool with instant degree verification | Q1 2026 |
| **Healthcare Register Targeting** | Only tool targeting healthcare recruitment niche | Q1 2026 |
| **Timeline + Education Linking** | Flags CVs where employment starts before the degree was completed | Q1 2026 |
| **Professional Body Framework** | Modular; expandable to 140+ professions vs competitors' static lists | Q2 2026 |
| **Companies House Directors** | Only tool verifying self-employment claims against official records | Q1 2026 |
---
## UK Market Positioning
**Tagline:** *"The only CV verification tool UK recruiters need - from degree to directorship"*
**Market Segment:** Recruitment agencies, HR departments, background screening companies
**Price Model (Suggested):**
- **Free Tier:** Companies House + Timeline Analysis
- **Professional Tier:** +HEDD verification, +Healthcare registers (£29-49/user/month)
- **Enterprise Tier:** +HMRC payroll, +DBS integration, +Professional bodies (Custom pricing)
---
## API Accessibility Summary
| Source | Type | Access Level | Cost | Feasibility |
|---|---|---|---|---|
| HEDD | Web Portal + Manual | Registered user | £1-5/check | Easy → Direct |
| GMC Register | Public Web | Scrape/No API | Free | Easy → Scraper |
| NMC Register | Public Web | Scrape/No API | Free | Easy → Scraper |
| Companies House | REST API ✓ | Free API key | Free | Already done |
| Directors API | REST API | Commercial | Included | Easy → Extend |
| GOV.UK Professions | REST API | Open | Free | Easy → Query |
| ICAEW Register | Public Web | Scrape/No API | Free | Medium → Scraper |
| SRA Register | Public Web | Scrape/No API | Free | Medium → Scraper |
| HMRC RTI | REST API | Restricted | Via partner | Hard → Partnership |
| DBS | REST API | Via partner | £20-50/check | Hard → Partnership |
---
## Next Steps (This Week)
1. **Confirm HEDD feasibility** with legal/compliance (consent requirements, data handling)
2. **Request GMC/NMC API access** officially (the regulators may grant it, which would avoid scraping)
3. **Map ICAEW/SRA register structures** for scraper design
4. **Contact HMRC/DBS vendors** (Verifile, DDC) for partnership exploration
5. **UK recruiter interviews:** Validate prioritization with 10-15 target customers
6. **Wireframe HEDD UI** in parallel with backend work
---
## References
- [HEDD (Higher Education Degree Datacheck)](https://hedd.ac.uk/)
- [GMC Register](https://www.gmc-uk.org/registration-and-licensing/our-registers)
- [NMC Register](https://www.nmc.org.uk/registration/search-the-register/)
- [UK Regulated Professions Register](https://www.regulated-professions.service.gov.uk/)
- [CV Fraud UK Statistics - Cifas](https://www.cifas.org.uk/)
- [UK Employment Gaps Report 2025 - LiveCareer](https://www.livecareer.co.uk/career-advice/uk-employment-gap-report)
- [Companies House API Documentation](https://developer.companieshouse.gov.uk/)