# Phase 1 Technical Implementation Guide (Q1 2026)

**Timeline:** 8 weeks

**Target:** 4 high-impact features live in production

---
## Feature 1: HEDD Degree Verification Integration
### Overview
Real-time integration with HEDD (Higher Education Degree Datacheck) to verify UK degrees against the records of 140+ universities.

**Current Baseline:** RealCV parses education entries from the CV using Claude AI

**Gap:** No verification against actual university records

**Value:** Targets detection of 90%+ of fake UK degree claims
### Architecture
```
CVParserService (existing)
↓ extracts education data
EducationVerifierService (NEW)
├── IEducationVerifierService interface
├── HeddClient (web integration)
├── HeddConsentManager (workflow)
└── EducationFlag generation
↓ stores verification results
CVCheck (database) + new EducationVerification flags
↓
Report & UI
```
### Phase 1a: Create Infrastructure (Days 1-5)
#### File 1: `src/RealCV.Infrastructure/Configuration/HeddSettings.cs`
```csharp
namespace RealCV.Infrastructure.Configuration;
public class HeddSettings
{
public required string BaseUrl { get; set; }
public required string ApiKey { get; set; } // Registration credentials
public int TimeoutSeconds { get; set; } = 30;
public bool RequireConsentAcknowledgment { get; set; } = true;
}
```
#### File 2: `src/RealCV.Infrastructure/ExternalApis/HeddClient.cs`
```csharp
using System.Net.Http.Json;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using RealCV.Infrastructure.Configuration;
namespace RealCV.Infrastructure.ExternalApis;
public sealed class HeddClient
{
private readonly HttpClient _httpClient;
private readonly ILogger<HeddClient> _logger;
private readonly HeddSettings _settings;
private static readonly JsonSerializerOptions JsonOptions = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
PropertyNameCaseInsensitive = true,
DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
};
public HeddClient(
HttpClient httpClient,
IOptions<HeddSettings> settings,
ILogger<HeddClient> logger)
{
_httpClient = httpClient;
_logger = logger;
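_settings = settings.Value;
_httpClient.BaseAddress = new Uri(_settings.BaseUrl);
_httpClient.Timeout = TimeSpan.FromSeconds(_settings.TimeoutSeconds);
// NOTE: _settings.ApiKey is not yet attached to outgoing requests. Add the
// authentication header here once the HEDD credential scheme is confirmed.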
}
public async Task<HeddVerificationResponse?> VerifyDegreeAsync(
HeddVerificationRequest request,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(request);
_logger.LogDebug(
"Submitting degree verification to HEDD for {CandidateName} from {Institution}",
request.CandidateName, request.Institution);
try
{
var content = new StringContent(
JsonSerializer.Serialize(request, JsonOptions),
Encoding.UTF8,
"application/json");
var response = await _httpClient.PostAsync(
"/api/verify/degree",
content,
cancellationToken);
if (!response.IsSuccessStatusCode)
{
_logger.LogWarning(
"HEDD verification failed with status {StatusCode}",
response.StatusCode);
return null;
}
var result = await response.Content.ReadFromJsonAsync<HeddVerificationResponse>(
JsonOptions,
cancellationToken);
_logger.LogInformation(
"HEDD verification completed: {Status}",
result?.VerificationStatus ?? "Unknown");
return result;
}
catch (HttpRequestException ex)
{
_logger.LogError(ex, "HEDD API request failed");
throw;
}
}
public async Task<HeddManualVerificationStatus?> CheckManualVerificationStatusAsync(
string referenceId,
CancellationToken cancellationToken = default)
{
_logger.LogDebug("Checking HEDD manual verification status: {ReferenceId}", referenceId);
try
{
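// Escape the reference so unexpected characters cannot alter the request path.
var response = await _httpClient.GetAsync(
$"/api/verify/status/{Uri.EscapeDataString(referenceId)}",
cancellationToken);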
if (!response.IsSuccessStatusCode)
{
_logger.LogWarning(
"Failed to check verification status {ReferenceId}",
referenceId);
return null;
}
var result = await response.Content.ReadFromJsonAsync<HeddManualVerificationStatus>(
JsonOptions,
cancellationToken);
return result;
}
catch (HttpRequestException ex)
{
_logger.LogError(ex, "Failed to check HEDD status for {ReferenceId}", referenceId);
throw;
}
}
}
// Request/Response DTOs
public sealed record HeddVerificationRequest
{
[JsonPropertyName("candidateName")]
public required string CandidateName { get; init; }
[JsonPropertyName("dateOfBirth")]
public required string DateOfBirth { get; init; } // YYYY-MM-DD
[JsonPropertyName("institution")]
public required string Institution { get; init; }
[JsonPropertyName("qualificationLevel")]
public required string QualificationLevel { get; init; } // "Bachelor's", "Master's", "PhD"
[JsonPropertyName("subject")]
public string? Subject { get; init; }
[JsonPropertyName("classificationOrGrade")]
public string? ClassificationOrGrade { get; init; } // "First", "Upper Second", "2:1", etc.
[JsonPropertyName("graduationYear")]
public required string GraduationYear { get; init; } // YYYY
[JsonPropertyName("consentAcknowledgment")]
public required bool ConsentAcknowledgment { get; init; }
}
public sealed record HeddVerificationResponse
{
[JsonPropertyName("referenceId")]
public required string ReferenceId { get; init; }
[JsonPropertyName("verificationStatus")]
public required string VerificationStatus { get; init; } // "Verified", "Manual", "Unverified"
[JsonPropertyName("institutionMatch")]
public bool? InstitutionMatch { get; init; }
[JsonPropertyName("qualificationMatch")]
public bool? QualificationMatch { get; init; }
[JsonPropertyName("graduationYearMatch")]
public bool? GraduationYearMatch { get; init; }
[JsonPropertyName("classificationMatch")]
public bool? ClassificationMatch { get; init; }
[JsonPropertyName("verifiedInstitution")]
public string? VerifiedInstitution { get; init; }
[JsonPropertyName("verifiedQualification")]
public string? VerifiedQualification { get; init; }
[JsonPropertyName("notes")]
public string? Notes { get; init; }
[JsonPropertyName("estimatedManualReviewDate")]
public DateTime? EstimatedManualReviewDate { get; init; }
}
public sealed record HeddManualVerificationStatus
{
[JsonPropertyName("referenceId")]
public required string ReferenceId { get; init; }
[JsonPropertyName("status")]
public required string Status { get; init; } // "Pending", "Verified", "UnableToVerify"
[JsonPropertyName("resolvedAt")]
public DateTime? ResolvedAt { get; init; }
[JsonPropertyName("verificationDetails")]
public string? VerificationDetails { get; init; }
}
```
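To see roughly what `VerifyDegreeAsync` sends over the wire, the sketch below serializes a small object with the same serializer options as `HeddClient.JsonOptions`. The anonymous object is only a stand-in for `HeddVerificationRequest`, and its values are made up; the point is that property names become camelCase and a null `Subject` is left out of the payload because of `WhenWritingNull`.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Same options as HeddClient.JsonOptions.
var options = new JsonSerializerOptions
{
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    PropertyNameCaseInsensitive = true,
    DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
};

// Illustrative stand-in for HeddVerificationRequest (values are invented).
var request = new
{
    CandidateName = "Jane Doe",
    Institution = "University of Leeds",
    Subject = (string?)null,          // optional field left unset on purpose
    GraduationYear = "2017"
};

string json = JsonSerializer.Serialize(request, options);

// The payload uses camelCase keys and contains no "subject" entry.
Console.WriteLine(json);
```

Because null members are dropped rather than sent as `null`, the receiving side cannot distinguish "not provided" from "explicitly empty"; that is usually what you want for optional fields such as `subject` and `classificationOrGrade`.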
#### File 3: `src/RealCV.Application/Interfaces/IEducationVerifierService.cs`
```csharp
using RealCV.Application.Models;
namespace RealCV.Application.Interfaces;
public interface IEducationVerifierService
{
/// <summary>
/// Verify education entry against HEDD database
/// </summary>
Task<EducationVerificationResult> VerifyEducationEntryAsync(
string fullName,
DateOnly dateOfBirth,
string institution,
string qualification,
string? subject,
string? grade,
DateOnly graduationDate,
CancellationToken cancellationToken = default);
/// <summary>
/// Check status of manual verification (for entries not instantly matched)
/// </summary>
Task<EducationManualVerificationStatus?> CheckVerificationStatusAsync(
string referenceId,
CancellationToken cancellationToken = default);
}
```
#### File 4: `src/RealCV.Application/Models/EducationVerificationResult.cs`
```csharp
namespace RealCV.Application.Models;
public sealed record EducationVerificationResult
{
/// <summary>
/// HEDD reference ID for tracking/follow-up
/// </summary>
public required string ReferenceId { get; init; }
/// <summary>
/// Overall verification result
/// </summary>
public required VerificationStatus Status { get; init; }
/// <summary>
/// Field-level verification results
/// </summary>
public required EducationFieldMatches FieldMatches { get; init; }
/// <summary>
/// Verified information returned from HEDD
/// </summary>
public EducationVerifiedData? VerifiedData { get; init; }
/// <summary>
/// If manual verification required, estimated completion date
/// </summary>
public DateTime? ManualReviewEstimatedDate { get; init; }
/// <summary>
/// Additional notes from verification process
/// </summary>
public string? Notes { get; init; }
/// <summary>
/// Confidence score (0-100) for the verification
/// </summary>
public int ConfidenceScore { get; init; }
}
public enum VerificationStatus
{
/// <summary>
/// All fields matched exactly against university records
/// </summary>
Verified,
/// <summary>
/// Submitted for manual university verification (10 working days)
/// </summary>
PendingManualReview,
/// <summary>
/// Could not be verified or manual review failed
/// </summary>
Unverified
}
public sealed record EducationFieldMatches
{
public bool? InstitutionMatched { get; init; }
public bool? QualificationMatched { get; init; }
public bool? GraduationYearMatched { get; init; }
public bool? GradeMatched { get; init; }
public bool? SubjectMatched { get; init; }
}
public sealed record EducationVerifiedData
{
public string? VerifiedInstitution { get; init; }
public string? VerifiedQualification { get; init; }
public int? VerifiedGraduationYear { get; init; }
public string? VerifiedGrade { get; init; }
public string? VerifiedSubject { get; init; }
}
public sealed record EducationManualVerificationStatus
{
public required string ReferenceId { get; init; }
public required ManualVerificationStatus Status { get; init; }
public DateTime? CompletedAt { get; init; }
public string? Details { get; init; }
}
public enum ManualVerificationStatus
{
Pending,
Verified,
UnableToVerify
}
```
### Phase 1b: Implement Service Layer (Days 6-10)
#### File 5: `src/RealCV.Infrastructure/Services/EducationVerifierService.cs`
```csharp
using Microsoft.Extensions.Logging;
using RealCV.Application.Interfaces;
using RealCV.Application.Models;
using RealCV.Infrastructure.ExternalApis;
namespace RealCV.Infrastructure.Services;
public sealed class EducationVerifierService : IEducationVerifierService
{
private readonly HeddClient _heddClient;
private readonly ILogger<EducationVerifierService> _logger;
public EducationVerifierService(
HeddClient heddClient,
ILogger<EducationVerifierService> logger)
{
_heddClient = heddClient;
_logger = logger;
}
public async Task<EducationVerificationResult> VerifyEducationEntryAsync(
string fullName,
DateOnly dateOfBirth,
string institution,
string qualification,
string? subject,
string? grade,
DateOnly graduationDate,
CancellationToken cancellationToken = default)
{
ArgumentException.ThrowIfNullOrWhiteSpace(fullName);
ArgumentException.ThrowIfNullOrWhiteSpace(institution);
ArgumentException.ThrowIfNullOrWhiteSpace(qualification);
_logger.LogDebug(
"Verifying education for {FullName}: {Institution} - {Qualification}",
fullName, institution, qualification);
var request = new HeddVerificationRequest
{
CandidateName = fullName,
DateOfBirth = dateOfBirth.ToString("yyyy-MM-dd"),
Institution = institution,
QualificationLevel = NormalizeQualification(qualification),
Subject = subject,
ClassificationOrGrade = grade,
GraduationYear = graduationDate.Year.ToString(),
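// NOTE: consent is asserted unconditionally here. Pass the candidate's recorded
// consent through once the HeddConsentManager workflow exists.
ConsentAcknowledgment = true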
};
try
{
var response = await _heddClient.VerifyDegreeAsync(request, cancellationToken);
if (response is null)
{
_logger.LogWarning(
"HEDD verification returned null for {FullName}",
fullName);
return CreateUnverifiedResult();
}
return MapToEducationVerificationResult(response);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error during education verification for {FullName}", fullName);
throw;
}
}
public async Task<EducationManualVerificationStatus?> CheckVerificationStatusAsync(
string referenceId,
CancellationToken cancellationToken = default)
{
ArgumentException.ThrowIfNullOrWhiteSpace(referenceId);
_logger.LogDebug("Checking verification status for reference: {ReferenceId}", referenceId);
try
{
var status = await _heddClient.CheckManualVerificationStatusAsync(
referenceId,
cancellationToken);
if (status is null)
{
return null;
}
return new EducationManualVerificationStatus
{
ReferenceId = status.ReferenceId,
Status = status.Status switch
{
"Verified" => ManualVerificationStatus.Verified,
"UnableToVerify" => ManualVerificationStatus.UnableToVerify,
_ => ManualVerificationStatus.Pending
},
CompletedAt = status.ResolvedAt,
Details = status.VerificationDetails
};
}
catch (Exception ex)
{
_logger.LogError(ex, "Error checking verification status for {ReferenceId}", referenceId);
throw;
}
}
private static EducationVerificationResult MapToEducationVerificationResult(
HeddVerificationResponse response)
{
var status = response.VerificationStatus switch
{
"Verified" => VerificationStatus.Verified,
"Manual" => VerificationStatus.PendingManualReview,
_ => VerificationStatus.Unverified
};
var confidenceScore = CalculateConfidenceScore(response);
return new EducationVerificationResult
{
ReferenceId = response.ReferenceId,
Status = status,
FieldMatches = new EducationFieldMatches
{
InstitutionMatched = response.InstitutionMatch,
QualificationMatched = response.QualificationMatch,
GraduationYearMatched = response.GraduationYearMatch,
GradeMatched = response.ClassificationMatch
},
VerifiedData = new EducationVerifiedData
{
VerifiedInstitution = response.VerifiedInstitution,
VerifiedQualification = response.VerifiedQualification
// VerifiedGraduationYear, VerifiedGrade and VerifiedSubject stay null:
// HeddVerificationResponse does not carry those values.
},
ManualReviewEstimatedDate = response.EstimatedManualReviewDate,
Notes = response.Notes,
ConfidenceScore = confidenceScore
};
}
private static int CalculateConfidenceScore(HeddVerificationResponse response)
{
return response.VerificationStatus switch
{
"Verified" => 100,
"Manual" => 50,
_ => 0
};
}
private static string NormalizeQualification(string qualification)
{
return qualification.ToLowerInvariant() switch
{
var q when q.Contains("bachelor") => "Bachelor's",
var q when q.Contains("master") => "Master's",
var q when q.Contains("phd") || q.Contains("doctorate") => "PhD",
var q when q.Contains("hnd") => "HND",
var q when q.Contains("diploma") => "Diploma",
_ => qualification
};
}
private static EducationVerificationResult CreateUnverifiedResult()
{
return new EducationVerificationResult
{
ReferenceId = Guid.NewGuid().ToString(),
Status = VerificationStatus.Unverified,
FieldMatches = new EducationFieldMatches(),
ConfidenceScore = 0,
Notes = "Verification service did not return a usable result"
};
}
}
```
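The behaviour of `NormalizeQualification` is easiest to judge on sample input. The sketch below copies the helper as a local function and runs it over a few invented qualification strings; note that abbreviations such as "BSc" match none of the substrings and are passed through unchanged, which may or may not be what HEDD expects.

```csharp
using System;

// Copy of the private NormalizeQualification helper, exposed for experimentation.
static string NormalizeQualification(string qualification) =>
    qualification.ToLowerInvariant() switch
    {
        var q when q.Contains("bachelor") => "Bachelor's",
        var q when q.Contains("master") => "Master's",
        var q when q.Contains("phd") || q.Contains("doctorate") => "PhD",
        var q when q.Contains("hnd") => "HND",
        var q when q.Contains("diploma") => "Diploma",
        _ => qualification
    };

// Invented sample inputs.
string[] samples =
{
    "Bachelor of Science",
    "Master of Business Administration",
    "Doctorate in Physics",
    "BSc (Hons) Computer Science"   // abbreviation: falls through unchanged
};

foreach (var sample in samples)
{
    Console.WriteLine($"{sample} -> {NormalizeQualification(sample)}");
}
```

If abbreviated degrees are common in parsed CVs, extending the switch with cases for "bsc", "ba", "msc" and similar would be a natural follow-up.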
### Phase 1c: Database & Flag Integration (Days 11-12)
#### Update: `src/RealCV.Domain/Enums/FlagCategory.cs`
```csharp
namespace RealCV.Domain.Enums;
public enum FlagCategory
{
Employment,
Education,
Timeline,
Plausibility,
EducationVerification, // NEW
DirectorshipVerification, // NEW (for Phase 1d)
HealthcareRegistration // NEW (for Phase 1e)
}
```
#### New File: `src/RealCV.Infrastructure/Services/EducationFlagGenerator.cs`
```csharp
using RealCV.Application.Models;
using RealCV.Domain.Entities;
using RealCV.Domain.Enums;
namespace RealCV.Infrastructure.Services;
// Only static helpers live here, so the class itself is static.
public static class EducationFlagGenerator
{
public static CVFlag? GenerateEducationVerificationFlag(
EducationVerificationResult verificationResult,
EducationEntry claimedEducation)
{
// Verified = no flag
if (verificationResult.Status == VerificationStatus.Verified)
{
return null;
}
// Unverified = high severity flag
if (verificationResult.Status == VerificationStatus.Unverified)
{
return new CVFlag
{
Category = FlagCategory.EducationVerification,
Severity = FlagSeverity.High,
Title = "Degree Verification Failed",
Description = $"Could not verify degree from {claimedEducation.Institution} " +
$"({claimedEducation.Qualification}). " +
$"Reference: {verificationResult.ReferenceId}",
ScoreImpact = -40
};
}
// PendingManualReview = medium severity flag (temporary)
if (verificationResult.Status == VerificationStatus.PendingManualReview)
{
var reviewDate = verificationResult.ManualReviewEstimatedDate?
.ToString("dd MMM yyyy") ?? "soon";
return new CVFlag
{
Category = FlagCategory.EducationVerification,
Severity = FlagSeverity.Medium,
Title = "Degree Under Manual Review",
Description = $"Degree from {claimedEducation.Institution} submitted for " +
$"manual university verification. Expected completion: {reviewDate}. " +
$"Reference: {verificationResult.ReferenceId}",
ScoreImpact = -15
};
}
return null;
}
public static CVFlag? GenerateFieldMismatchFlag(
EducationFieldMatches matches,
EducationEntry claimed)
{
// Check for specific field mismatches
if (matches.InstitutionMatched == false)
{
return new CVFlag
{
Category = FlagCategory.EducationVerification,
Severity = FlagSeverity.High,
Title = "Institution Name Mismatch",
Description = $"Claimed institution '{claimed.Institution}' does not match " +
"verified university records. Verify exact institution name.",
ScoreImpact = -35
};
}
if (matches.GraduationYearMatched == false)
{
return new CVFlag
{
Category = FlagCategory.EducationVerification,
Severity = FlagSeverity.High,
Title = "Graduation Date Mismatch",
Description = $"Claimed graduation year ({claimed.EndDate?.Year}) does not match " +
"verified university records.",
ScoreImpact = -30
};
}
if (matches.QualificationMatched == false)
{
return new CVFlag
{
Category = FlagCategory.EducationVerification,
Severity = FlagSeverity.Medium,
Title = "Qualification Mismatch",
Description = $"Claimed qualification '{claimed.Qualification}' does not match " +
"verified university records.",
ScoreImpact = -25
};
}
return null;
}
}
```
### Phase 1d: Companies House Enhancement - Director Verification
#### File: `src/RealCV.Infrastructure/ExternalApis/CompaniesHouseDirectorsClient.cs`
```csharp
using System.Net.Http.Json;
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.Extensions.Logging;
using RealCV.Infrastructure.ExternalApis;
namespace RealCV.Infrastructure.ExternalApis;
public sealed class CompaniesHouseDirectorsClient
{
private readonly HttpClient _httpClient;
private readonly ILogger<CompaniesHouseDirectorsClient> _logger;
private static readonly JsonSerializerOptions JsonOptions = new()
{
PropertyNamingPolicy = JsonNamingPolicy.SnakeCaseLower,
PropertyNameCaseInsensitive = true
};
public CompaniesHouseDirectorsClient(
HttpClient httpClient,
ILogger<CompaniesHouseDirectorsClient> logger)
{
_httpClient = httpClient;
_logger = logger;
}
public async Task<List<DirectorAppointment>?> GetDirectorAppointmentsAsync(
string companyNumber,
CancellationToken cancellationToken = default)
{
ArgumentException.ThrowIfNullOrWhiteSpace(companyNumber);
_logger.LogDebug("Fetching director appointments for company: {CompanyNumber}", companyNumber);
try
{
var requestUrl = $"/company/{companyNumber}/officers";
var response = await _httpClient.GetAsync(requestUrl, cancellationToken);
if (!response.IsSuccessStatusCode)
{
_logger.LogWarning(
"Failed to fetch officers for {CompanyNumber}: {StatusCode}",
companyNumber, response.StatusCode);
return null;
}
var result = await response.Content.ReadFromJsonAsync<OfficersResponse>(
JsonOptions,
cancellationToken);
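// Companies House reports officer_role in lower case (e.g. "director"), so compare
// case-insensitively. Officers without a parsable appointed_on date are skipped
// because DirectorAppointment.AppointmentDate is non-nullable.
return result?.Items?
.Where(o => string.Equals(o.OfficerRole, "director", StringComparison.OrdinalIgnoreCase) ||
string.Equals(o.OfficerRole, "secretary", StringComparison.OrdinalIgnoreCase))
.Select(o => new { Officer = o, AppointedOn = ParseDate(o.AppointedOn) })
.Where(x => x.AppointedOn.HasValue)
.Select(x => new DirectorAppointment
{
OfficerId = x.Officer.Id,
OfficerName = x.Officer.Name,
OfficerRole = x.Officer.OfficerRole,
AppointmentDate = x.AppointedOn!.Value,
ResignationDate = ParseDate(x.Officer.ResignedOn),
IsActive = x.Officer.ResignedOn == null
})
.ToList() ?? [];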
}
catch (HttpRequestException ex)
{
_logger.LogError(ex, "Error fetching officers for {CompanyNumber}", companyNumber);
throw;
}
}
private static DateOnly? ParseDate(string? dateString)
{
if (string.IsNullOrWhiteSpace(dateString) ||
!DateOnly.TryParse(dateString, out var date))
{
return null;
}
return date;
}
}
public sealed record DirectorAppointment
{
public required string OfficerId { get; init; }
public required string OfficerName { get; init; }
public required string OfficerRole { get; init; }
public required DateOnly AppointmentDate { get; init; }
public DateOnly? ResignationDate { get; init; }
public bool IsActive { get; init; }
}
// Companies House API Response DTOs
public sealed record OfficersResponse
{
public List<Officer>? Items { get; init; }
}
public sealed record Officer
{
public required string Id { get; init; }
public required string Name { get; init; }
public required string OfficerRole { get; init; }
public string? AppointedOn { get; init; }
public string? ResignedOn { get; init; }
}
```
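The role filter and date parsing can be tried without a live API call. The sketch below runs them over an invented officers payload using `JsonDocument`; the field names mirror the snake_case DTOs above, the values are made up, and `ParseIsoDate` is a hypothetical helper. It compares `officer_role` case-insensitively and parses dates with a fixed `yyyy-MM-dd` format, which is an assumption about how the dates are formatted.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: returns null when the property is missing or not yyyy-MM-dd.
static DateOnly? ParseIsoDate(JsonElement officer, string property) =>
    officer.TryGetProperty(property, out var value) &&
    DateOnly.TryParseExact(value.GetString(), "yyyy-MM-dd", out var date)
        ? date
        : (DateOnly?)null;

// Invented payload shaped like the OfficersResponse / Officer DTOs.
const string sampleJson = """
{
  "items": [
    { "name": "DOE, Jane", "officer_role": "director", "appointed_on": "2019-03-01" },
    { "name": "ROE, Richard", "officer_role": "secretary", "appointed_on": "2018-06-15" },
    { "name": "POE, Alex", "officer_role": "Director" }
  ]
}
""";

using var doc = JsonDocument.Parse(sampleJson);

var directors = doc.RootElement.GetProperty("items").EnumerateArray()
    .Where(o => string.Equals(
        o.GetProperty("officer_role").GetString(),
        "director",
        StringComparison.OrdinalIgnoreCase))   // matches "director" and "Director"
    .Select(o => new
    {
        Name = o.GetProperty("name").GetString(),
        AppointedOn = ParseIsoDate(o, "appointed_on")
    })
    .ToList();

foreach (var d in directors)
{
    Console.WriteLine($"{d.Name}: appointed {d.AppointedOn?.ToString("yyyy-MM-dd") ?? "unknown"}");
}
```

The third entry has no `appointed_on`, so its date comes back as null; the production mapping has to decide whether to skip such officers or carry a nullable date through to `DirectorAppointment`.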
#### File: `src/RealCV.Application/Interfaces/IDirectorshipVerifierService.cs`
```csharp
namespace RealCV.Application.Interfaces;
public interface IDirectorshipVerifierService
{
Task<DirectorshipVerificationResult> VerifyDirectorshipAsync(
string candidateName,
string companyName,
DateOnly claimedStartDate,
DateOnly? claimedEndDate,
CancellationToken cancellationToken = default);
}
public sealed record DirectorshipVerificationResult
{
public required bool IsVerified { get; init; }
public required string ClaimedCompany { get; init; }
public required string ClaimedRole { get; init; }
public required DateOnly ClaimedStartDate { get; init; }
public DateOnly? ClaimedEndDate { get; init; }
public string? VerifiedOfficerName { get; init; }
public string? VerifiedRole { get; init; }
public DateOnly? VerifiedAppointmentDate { get; init; }
public DateOnly? VerifiedResignationDate { get; init; }
public string? Notes { get; init; }
public int ConfidenceScore { get; init; }
}
```
#### File: `src/RealCV.Infrastructure/Services/DirectorshipVerifierService.cs`
```csharp
using FuzzySharp;
using Microsoft.Extensions.Logging;
using RealCV.Application.Interfaces;
using RealCV.Infrastructure.ExternalApis;
namespace RealCV.Infrastructure.Services;
public sealed class DirectorshipVerifierService : IDirectorshipVerifierService
{
private readonly CompanyVerifierService _companyVerifier;
private readonly CompaniesHouseDirectorsClient _directorsClient;
private readonly ILogger<DirectorshipVerifierService> _logger;
private const int FuzzyNameThreshold = 75;
public DirectorshipVerifierService(
CompanyVerifierService companyVerifier,
CompaniesHouseDirectorsClient directorsClient,
ILogger<DirectorshipVerifierService> logger)
{
_companyVerifier = companyVerifier;
_directorsClient = directorsClient;
_logger = logger;
}
public async Task<DirectorshipVerificationResult> VerifyDirectorshipAsync(
string candidateName,
string companyName,
DateOnly claimedStartDate,
DateOnly? claimedEndDate,
CancellationToken cancellationToken = default)
{
ArgumentException.ThrowIfNullOrWhiteSpace(candidateName);
ArgumentException.ThrowIfNullOrWhiteSpace(companyName);
_logger.LogDebug(
"Verifying directorship: {CandidateName} at {CompanyName}",
candidateName, companyName);
// Step 1: Verify company exists
var companyVerification = await _companyVerifier.VerifyCompanyAsync(
companyName,
claimedStartDate,
claimedEndDate);
if (!companyVerification.IsVerified)
{
_logger.LogDebug(
"Company not verified for directorship check: {CompanyName}",
companyName);
return CreateUnverifiedResult(
candidateName,
companyName,
claimedStartDate,
claimedEndDate,
"Company not found in Companies House");
}
// Step 2: Get directors for verified company
var appointments = await _directorsClient.GetDirectorAppointmentsAsync(
companyVerification.MatchedCompanyNumber!,
cancellationToken);
if (appointments is null || appointments.Count == 0)
{
_logger.LogDebug(
"No directors found for company {CompanyNumber}",
companyVerification.MatchedCompanyNumber);
return CreateUnverifiedResult(
candidateName,
companyName,
claimedStartDate,
claimedEndDate,
"No director appointments found");
}
// Step 3: Fuzzy match candidate name against directors
var matchedDirector = FindBestNameMatch(candidateName, appointments);
if (matchedDirector is null)
{
_logger.LogDebug(
"No name match found for {CandidateName} in {CompanyNumber}",
candidateName, companyVerification.MatchedCompanyNumber);
return CreateUnverifiedResult(
candidateName,
companyName,
claimedStartDate,
claimedEndDate,
$"Name '{candidateName}' not found in director records");
}
// Step 4: Validate dates
var dateValidation = ValidateDates(
claimedStartDate,
claimedEndDate,
matchedDirector);
if (!dateValidation.IsValid)
{
_logger.LogWarning(
"Date mismatch for directorship: claimed {ClaimedStart}-{ClaimedEnd}, " +
"actual {ActualStart}-{ActualEnd}",
claimedStartDate, claimedEndDate,
matchedDirector.AppointmentDate, matchedDirector.ResignationDate);
return CreateDateMismatchResult(
candidateName,
companyName,
claimedStartDate,
claimedEndDate,
matchedDirector,
dateValidation.Reason);
}
// Step 5: Success - directorship verified
_logger.LogInformation(
"Directorship verified: {CandidateName} at {CompanyName}",
candidateName, companyName);
return new DirectorshipVerificationResult
{
IsVerified = true,
ClaimedCompany = companyName,
ClaimedRole = "Director",
ClaimedStartDate = claimedStartDate,
ClaimedEndDate = claimedEndDate,
VerifiedOfficerName = matchedDirector.OfficerName,
VerifiedRole = matchedDirector.OfficerRole,
VerifiedAppointmentDate = matchedDirector.AppointmentDate,
VerifiedResignationDate = matchedDirector.ResignationDate,
Notes = "Directorship verified against Companies House records",
ConfidenceScore = 100
};
}
private static DirectorAppointment? FindBestNameMatch(
string candidateName,
List<DirectorAppointment> appointments)
{
var matches = appointments
.Select(a => new
{
Appointment = a,
Score = Fuzz.Ratio(
candidateName.ToUpperInvariant(),
a.OfficerName.ToUpperInvariant())
})
.Where(m => m.Score >= FuzzyNameThreshold)
.OrderByDescending(m => m.Score)
.FirstOrDefault();
return matches?.Appointment;
}
private static DateValidation ValidateDates(
DateOnly claimedStart,
DateOnly? claimedEnd,
DirectorAppointment actual)
{
// Claimed start before actual appointment
if (claimedStart < actual.AppointmentDate)
{
return new DateValidation
{
IsValid = false,
Reason = $"Claimed start date ({claimedStart}) " +
$"before actual appointment ({actual.AppointmentDate})"
};
}
// Claimed end after actual resignation, or role claimed as ongoing after a resignation
if (actual.ResignationDate.HasValue &&
(!claimedEnd.HasValue || claimedEnd.Value > actual.ResignationDate.Value))
{
var claimedEndText = claimedEnd?.ToString() ?? "ongoing";
return new DateValidation
{
IsValid = false,
Reason = $"Claimed end date ({claimedEndText}) " +
$"after actual resignation ({actual.ResignationDate})"
};
}
return new DateValidation { IsValid = true };
}
private static DirectorshipVerificationResult CreateUnverifiedResult(
string candidateName,
string companyName,
DateOnly claimedStartDate,
DateOnly? claimedEndDate,
string reason)
{
return new DirectorshipVerificationResult
{
IsVerified = false,
ClaimedCompany = companyName,
ClaimedRole = "Director",
ClaimedStartDate = claimedStartDate,
ClaimedEndDate = claimedEndDate,
Notes = reason,
ConfidenceScore = 0
};
}
private static DirectorshipVerificationResult CreateDateMismatchResult(
string candidateName,
string companyName,
DateOnly claimedStartDate,
DateOnly? claimedEndDate,
DirectorAppointment actual,
string reason)
{
return new DirectorshipVerificationResult
{
IsVerified = false,
ClaimedCompany = companyName,
ClaimedRole = "Director",
ClaimedStartDate = claimedStartDate,
ClaimedEndDate = claimedEndDate,
VerifiedOfficerName = actual.OfficerName,
VerifiedRole = actual.OfficerRole,
VerifiedAppointmentDate = actual.AppointmentDate,
VerifiedResignationDate = actual.ResignationDate,
Notes = reason,
ConfidenceScore = 30
};
}
private sealed record DateValidation
{
public required bool IsValid { get; init; }
public string? Reason { get; init; }
}
}
```
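One practical wrinkle with the fuzzy match: Companies House typically lists officers as "SURNAME, Forenames", while CVs use "Forename Surname", so a character-level ratio on the raw strings can score well below the threshold for the same person. A minimal sketch of a normalisation step that could run before `Fuzz.Ratio` follows; `NormalizeOfficerName` is a hypothetical helper, not part of the codebase.

```csharp
using System;

// Hypothetical helper: "DOE, Jane Alice" -> "JANE ALICE DOE".
static string NormalizeOfficerName(string name)
{
    // Split once on the first comma: [surname, forenames].
    var parts = name.Split(',', 2, StringSplitOptions.TrimEntries);
    var reordered = parts.Length == 2
        ? $"{parts[1]} {parts[0]}"   // forenames first, then surname
        : name;                      // no comma: keep the order as given

    // Collapse repeated whitespace and upper-case for comparison.
    var tokens = reordered.Split(' ', StringSplitOptions.RemoveEmptyEntries);
    return string.Join(' ', tokens).ToUpperInvariant();
}

string fromRegister = NormalizeOfficerName("DOE, Jane Alice");
string fromCv = NormalizeOfficerName("jane   doe");

Console.WriteLine(fromRegister); // JANE ALICE DOE
Console.WriteLine(fromCv);       // JANE DOE
```

Middle names still differ between the two forms, so a threshold such as `FuzzyNameThreshold` remains necessary after normalisation.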
### Phase 1e: Enhanced Timeline Analysis
#### File: `src/RealCV.Infrastructure/Services/EnhancedTimelineAnalyserService.cs`
```csharp
using Microsoft.Extensions.Logging;
using RealCV.Application.Models;
using RealCV.Domain.Entities;
using RealCV.Domain.Enums;
namespace RealCV.Infrastructure.Services;
public sealed class EnhancedTimelineAnalyserService
{
private readonly ILogger<EnhancedTimelineAnalyserService> _logger;
private const int NormalGapMonths = 3; // UK norm
private const int RedFlagGapMonths = 6;
private const int SuspiciousGapMonths = 12;
public EnhancedTimelineAnalyserService(
ILogger<EnhancedTimelineAnalyserService> logger)
{
_logger = logger;
}
public List<CVFlag> AnalyzeEducationEmploymentSequence(
List<EducationEntry> education,
List<EmploymentEntry> employment)
{
var flags = new List<CVFlag>();
foreach (var emp in employment)
{
// Check if employment started before education ended
var conflictingEducation = education
.Where(e => e.EndDate.HasValue && emp.StartDate <= e.EndDate.Value.AddMonths(1))
.ToList();
foreach (var edu in conflictingEducation)
{
// DateOnly has no subtraction operator, so compare day numbers instead.
var timeBetween = emp.StartDate.DayNumber - edu.EndDate!.Value.DayNumber;
if (timeBetween < 0)
{
flags.Add(new CVFlag
{
Category = FlagCategory.Timeline,
Severity = FlagSeverity.Medium,
Title = "Employment Overlaps Education",
Description = $"Employment at {emp.CompanyName} started " +
$"{Math.Abs(timeBetween)} days before completing degree " +
$"from {edu.Institution}. " +
$"Education end: {edu.EndDate:yyyy-MM}, " +
$"Employment start: {emp.StartDate:yyyy-MM}",
ScoreImpact = -20
});
}
}
}
flags.AddRange(DetectAnomalousEmploymentPatterns(employment));
return flags;
}
private List<CVFlag> DetectAnomalousEmploymentPatterns(
List<EmploymentEntry> employment)
{
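var flags = new List<CVFlag>();
// Compare neighbours in chronological order; CVs often list the most recent role first.
var ordered = employment.OrderBy(e => e.StartDate).ToList();
for (int i = 0; i < ordered.Count - 1; i++)
{
var current = ordered[i];
var next = ordered[i + 1];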
if (!current.EndDate.HasValue || next.StartDate < current.EndDate.Value)
{
// Overlapping employment
flags.Add(new CVFlag
{
Category = FlagCategory.Timeline,
Severity = current.EndDate == next.StartDate ? FlagSeverity.Low : FlagSeverity.Medium,
Title = "Overlapping Employment Periods",
Description = $"Employment at {current.CompanyName} and {next.CompanyName} " +
$"overlap. End date: {current.EndDate:yyyy-MM}, " +
$"Next start: {next.StartDate:yyyy-MM}",
ScoreImpact = current.EndDate == next.StartDate ? -10 : -25
});
continue;
}
// DateOnly has no subtraction operator, so compare day numbers instead.
var gapDays = next.StartDate.DayNumber - current.EndDate.Value.DayNumber;
var gapMonths = gapDays / 30;
if (gapMonths > SuspiciousGapMonths)
{
flags.Add(new CVFlag
{
Category = FlagCategory.Timeline,
Severity = FlagSeverity.Low,
Title = "Extended Employment Gap",
Description = $"{gapMonths}-month gap between {current.CompanyName} " +
$"(ended {current.EndDate:yyyy-MM}) and {next.CompanyName} " +
$"(started {next.StartDate:yyyy-MM}). " +
$"Note: UK employment gaps becoming more common (24% of workforce in 2025)",
ScoreImpact = -10
});
}
}
return flags;
}
public List<CVFlag> AnalyzeGraduationEmploymentGaps(
List<EducationEntry> education,
List<EmploymentEntry> employment)
{
var flags = new List<CVFlag>();
foreach (var edu in education.Where(e => e.EndDate.HasValue))
{
var firstEmployment = employment
.Where(e => e.StartDate >= edu.EndDate.Value)
.OrderBy(e => e.StartDate)
.FirstOrDefault();
if (firstEmployment is null)
{
continue; // No employment recorded after education
}
// DateOnly has no subtraction operator, so compare day numbers instead.
var gapDays = firstEmployment.StartDate.DayNumber - edu.EndDate.Value.DayNumber;
var gapMonths = gapDays / 30;
// Large gap between graduation and first job
if (gapMonths > RedFlagGapMonths && gapMonths < SuspiciousGapMonths)
{
flags.Add(new CVFlag
{
Category = FlagCategory.Timeline,
Severity = FlagSeverity.Low,
Title = "Extended Gap After Graduation",
Description = $"{gapMonths}-month gap between graduation " +
$"({edu.EndDate:yyyy-MM}) and first employment " +
$"({firstEmployment.StartDate:yyyy-MM}). Verify reason if claimed.",
ScoreImpact = -5
});
}
}
return flags;
}
}
```
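The gap thresholds above are expressed in months, but the code derives months as `days / 30`, which drifts from calendar months near the boundaries. The sketch below contrasts the two for one invented pair of dates; `WholeMonthsBetween` is a hypothetical helper and a simplification (it ignores end-of-month edge cases).

```csharp
using System;

// Hypothetical helper: whole calendar months elapsed between two dates.
static int WholeMonthsBetween(DateOnly start, DateOnly end)
{
    int months = (end.Year - start.Year) * 12 + (end.Month - start.Month);
    if (end.Day < start.Day)
    {
        months--; // the final month has not completed yet
    }
    return months;
}

// Invented dates: one day short of a full year.
var graduated = new DateOnly(2020, 7, 15);
var firstJob = new DateOnly(2021, 7, 14);

int byDays = (firstJob.DayNumber - graduated.DayNumber) / 30; // 364 / 30
int byCalendar = WholeMonthsBetween(graduated, firstJob);

Console.WriteLine($"days/30: {byDays}, calendar months: {byCalendar}");
```

Here the `days / 30` estimate reports 12 months while the calendar count reports 11, which is enough to move an entry across the `SuspiciousGapMonths` boundary. Whichever definition is chosen, using it consistently in both gap checks keeps the flags predictable.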
### Phase 1f: Dependency Injection & Integration (Days 13-14)
#### Update: `src/RealCV.Infrastructure/DependencyInjection.cs`
```csharp
// Add to existing DependencyInjection class:
services.Configure<HeddSettings>(configuration.GetSection("Hedd"));
services.AddHttpClient<HeddClient>()
.SetHandlerLifetime(TimeSpan.FromMinutes(5));
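// CompaniesHouseDirectorsClient issues relative requests ("/company/{number}/officers"),
// so its HttpClient needs a BaseAddress and API credentials. Apply the same
// configuration the existing Companies House client registration uses.
services.AddHttpClient<CompaniesHouseDirectorsClient>()
.SetHandlerLifetime(TimeSpan.FromMinutes(5));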
services.AddScoped<IEducationVerifierService, EducationVerifierService>();
services.AddScoped<IDirectorshipVerifierService, DirectorshipVerifierService>();
services.AddScoped<EnhancedTimelineAnalyserService>();
```
#### Update: `src/RealCV.Infrastructure/Jobs/ProcessCVCheckJob.cs`
Add education and directorship verification to the processing pipeline:
```csharp
public async Task ExecuteAsync(Guid cvCheckId, CancellationToken cancellationToken)
{
    // ... existing code ...

    // Step 3: Verify education entries (NEW)
    var educationFlags = await VerifyEducationAsync(cvData, cvCheck, cancellationToken);
    flags.AddRange(educationFlags);

    // Step 4: Enhanced timeline analysis (UPDATED)
    var enhancedTimeline = _enhancedTimelineService.AnalyzeEducationEmploymentSequence(
        cvData.Education,
        cvData.Employment);
    flags.AddRange(enhancedTimeline);

    // Step 5: Verify directorship claims (NEW)
    var directorshipFlags = await VerifyDirectorshipsAsync(
        cvData.Employment, cancellationToken);
    flags.AddRange(directorshipFlags);

    // ... rest of processing ...
}

private async Task<List<CVFlag>> VerifyEducationAsync(
    CVData cvData,
    CVCheck cvCheck,
    CancellationToken cancellationToken)
{
    var flags = new List<CVFlag>();

    foreach (var edu in cvData.Education)
    {
        // Entries without an end date cannot be matched against a graduation record
        if (!edu.EndDate.HasValue)
            continue;

        try
        {
            var result = await _educationVerifier.VerifyEducationEntryAsync(
                cvData.FullName,
                // Estimated DOB (assumes the candidate is 30). A wrong DOB may cause a
                // false mismatch, so prefer a candidate-supplied value when available.
                DateOnly.FromDateTime(DateTime.Now).AddYears(-30),
                edu.Institution,
                edu.Qualification ?? "Unknown",
                edu.Subject,
                edu.Grade,
                edu.EndDate.Value,
                cancellationToken);

            var verificationFlag = EducationFlagGenerator.GenerateEducationVerificationFlag(
                result, edu);

            if (verificationFlag is not null)
            {
                flags.Add(verificationFlag);
            }
        }
        // Let cancellation propagate; log other failures and continue with the next entry
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // cvCheckId is not in scope here, so use the entity's Id (assumed property)
            _logger.LogError(ex, "Error verifying education for CV {CheckId}", cvCheck.Id);
        }
    }

    return flags;
}

private async Task<List<CVFlag>> VerifyDirectorshipsAsync(
    List<EmploymentEntry> employment,
    CancellationToken cancellationToken)
{
    var flags = new List<CVFlag>();

    // Only entries that mention "director" in the title or description are checked
    foreach (var emp in employment.Where(e =>
        e.Description?.Contains("director", StringComparison.OrdinalIgnoreCase) == true ||
        e.JobTitle?.Contains("director", StringComparison.OrdinalIgnoreCase) == true))
    {
        try
        {
            var result = await _directorshipVerifier.VerifyDirectorshipAsync(
                // Use company name from employment record
                emp.CompanyName,
                emp.StartDate,
                emp.EndDate,
                cancellationToken);

            if (!result.IsVerified)
            {
                flags.Add(new CVFlag
                {
                    Category = FlagCategory.DirectorshipVerification,
                    Severity = FlagSeverity.Medium,
                    Title = "Directorship Verification Failed",
                    Description = result.Notes,
                    ScoreImpact = -30
                });
            }
        }
        // Let cancellation propagate; log other failures and continue with the next entry
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            _logger.LogWarning(ex, "Error verifying directorship for {CompanyName}",
                emp.CompanyName);
        }
    }

    return flags;
}
```
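The `Contains("director")` filter above also matches words such as "directory", so a job description mentioning "Active Directory" would be sent to directorship verification and could pick up a -30 flag. A minimal sketch of a stricter whole-word check follows. `DirectorshipClaimDetector` and `MentionsDirector` are hypothetical names, and whether a whole-word match is the right trigger for verification is an assumption rather than something this guide specifies.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of RealCV): whole-word match for "director" / "directors".
public static class DirectorshipClaimDetector
{
    // \b word boundaries stop "directory" and "directorate" from matching.
    private static readonly Regex DirectorWord = new(
        @"\bdirectors?\b",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);

    public static bool MentionsDirector(string? jobTitle, string? description) =>
        (jobTitle is not null && DirectorWord.IsMatch(jobTitle)) ||
        (description is not null && DirectorWord.IsMatch(description));
}
```

The `Where` clause in `VerifyDirectorshipsAsync` could then become `employment.Where(e => DirectorshipClaimDetector.MentionsDirector(e.JobTitle, e.Description))`. Titles such as "Art Director" would still match, so a failed lookup is a prompt for review rather than proof of a false claim.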
### Phase 1g: Testing & QA (Days 15-16)
#### Test File: `tests/RealCV.Tests/Services/EducationVerifierServiceTests.cs`
```csharp
using Microsoft.Extensions.Logging;
using Moq;
using Xunit;
using RealCV.Application.Models;
using RealCV.Infrastructure.ExternalApis;
using RealCV.Infrastructure.Services;

namespace RealCV.Tests.Services;

public class EducationVerifierServiceTests
{
    private readonly Mock<HeddClient> _mockHeddClient;
    private readonly EducationVerifierService _service;

    public EducationVerifierServiceTests()
    {
        // NOTE: Moq can only mock interfaces and overridable members of unsealed
        // classes. HeddClient is declared sealed in File 2, so this setup assumes
        // it is unsealed with a virtual VerifyDegreeAsync, or wrapped in an interface.
        _mockHeddClient = new Mock<HeddClient>();
        _service = new EducationVerifierService(
            _mockHeddClient.Object,
            new Mock<ILogger<EducationVerifierService>>().Object);
    }

    [Fact]
    public async Task VerifyEducationEntryAsync_WithValidDegree_ReturnsVerified()
    {
        // Arrange: HEDD confirms institution, qualification and graduation year
        var heddResponse = new HeddVerificationResponse
        {
            ReferenceId = "REF-123",
            VerificationStatus = "Verified",
            InstitutionMatch = true,
            QualificationMatch = true,
            GraduationYearMatch = true
        };

        _mockHeddClient
            .Setup(x => x.VerifyDegreeAsync(
                It.IsAny<HeddVerificationRequest>(),
                It.IsAny<CancellationToken>()))
            .ReturnsAsync(heddResponse);

        // Act
        var result = await _service.VerifyEducationEntryAsync(
            "John Smith",
            new DateOnly(1990, 1, 1),
            "University of Oxford",
            "Bachelor of Science",
            "Computer Science",
            "First",
            new DateOnly(2012, 6, 1));

        // Assert
        Assert.Equal(VerificationStatus.Verified, result.Status);
        Assert.Equal(100, result.ConfidenceScore);
        Assert.Equal("REF-123", result.ReferenceId);
    }

    // Skipped so that an empty body is not reported as a passing test
    [Fact(Skip = "Placeholder: unverified-degree case not yet implemented")]
    public Task VerifyEducationEntryAsync_WithUnverifiedDegree_ReturnsFlagGenerated()
    {
        // Similar test for unverified case
        return Task.CompletedTask;
    }
}
```
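Because `HeddClient` is declared `sealed`, Moq cannot create a proxy for it. One alternative is to construct a real `HeddClient` over an `HttpClient` whose handler returns canned responses. The sketch below shows such a handler using only `System.Net.Http`. `StubHttpMessageHandler` is a hypothetical test double, and the JSON body a test would pass to it is an assumption about the HEDD response shape.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical test double: answers every request with a fixed status code and JSON body.
public sealed class StubHttpMessageHandler : HttpMessageHandler
{
    private readonly HttpStatusCode _statusCode;
    private readonly string _json;

    public StubHttpMessageHandler(HttpStatusCode statusCode, string json)
    {
        _statusCode = statusCode;
        _json = json;
    }

    // URI of the most recent request, so tests can assert on the endpoint that was called.
    public Uri? LastRequestUri { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        LastRequestUri = request.RequestUri;

        var response = new HttpResponseMessage(_statusCode)
        {
            Content = new StringContent(_json, Encoding.UTF8, "application/json")
        };

        return Task.FromResult(response);
    }
}
```

A test would create `new HttpClient(new StubHttpMessageHandler(HttpStatusCode.OK, json))` and pass it to the `HeddClient` constructor together with its other dependencies. That keeps the JSON deserialization inside `HeddClient` under test as well.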
---
## Features 2-4: Parallel Development
### Timeline Summary
- **Feature 1 (HEDD):** Days 1-16 (primary focus, 2 engineers)
- **Feature 2 (GMC/NMC):** Days 5-12 (secondary, 1 engineer) - Scraper pattern
- **Feature 3 (Companies House Enhancement):** Days 8-14 (1 engineer) - API extension
- **Feature 4 (Enhanced Timeline):** Days 10-14 (1 engineer) - Logic extension
### Staffing Recommendation
- **Lead Engineer:** HEDD integration (full-time, 2 weeks)
- **Backend Engineer 2:** Healthcare registers + timeline (concurrent, weeks 1-2)
- **Backend Engineer 3:** Companies House enhancement (weeks 2-3)
- **QA Engineer:** Validation & testing (weeks 2-3)
---
## Configuration Required
### `appsettings.json` Addition
```json
{
"Hedd": {
"BaseUrl": "https://api.hedd.ac.uk",
"ApiKey": "YOUR_HEDD_API_KEY",
"TimeoutSeconds": 30,
"RequireConsentAcknowledgment": true
}
}
```
### Environment Variables
- `HEDD_API_KEY` - HEDD registration credentials
- `HEDD_BASE_URL` - HEDD API endpoint (default: production)
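
.NET's default environment-variable provider binds nested keys only when the name uses a double underscore, such as `Hedd__ApiKey`. The flat names `HEDD_API_KEY` and `HEDD_BASE_URL` therefore need an explicit mapping onto the `Hedd` section. A minimal sketch of one way to do that is below. `HeddEnvironmentOverrides` is a hypothetical helper, and the target keys assume the `Hedd` section shown in `appsettings.json`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of RealCV): translates flat environment variable
// names into configuration keys under the "Hedd" section.
public static class HeddEnvironmentOverrides
{
    private static readonly (string EnvName, string ConfigKey)[] Mappings =
    {
        ("HEDD_API_KEY", "Hedd:ApiKey"),
        ("HEDD_BASE_URL", "Hedd:BaseUrl"),
    };

    // The lookup is injected so the mapping can be tested without touching the real environment.
    public static Dictionary<string, string?> Read(Func<string, string?> getEnvironmentVariable)
    {
        var overrides = new Dictionary<string, string?>();

        foreach (var (envName, configKey) in Mappings)
        {
            var value = getEnvironmentVariable(envName);

            // Unset or blank variables leave the appsettings.json value in place.
            if (!string.IsNullOrWhiteSpace(value))
            {
                overrides[configKey] = value;
            }
        }

        return overrides;
    }
}
```

At startup the result could be added after the JSON sources so that it takes precedence, for example `builder.Configuration.AddInMemoryCollection(HeddEnvironmentOverrides.Read(Environment.GetEnvironmentVariable));`. Renaming the variables to `Hedd__ApiKey` and `Hedd__BaseUrl` would avoid the mapping altogether.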
---
## Database Migration
Create migration for storing verification results:
```bash
dotnet ef migrations add AddEducationAndDirectorshipVerification --project src/RealCV.Infrastructure --startup-project src/RealCV.Web
```
Add optional columns to the `CVCheck` entity:
- `HeddReferenceId` - Track pending manual reviews
- `DirectorshipVerificationStatus` - Cache directorship results
---
## Validation Checklist
- [ ] HEDD credentials configured and tested
- [ ] Education verification returns proper flag categories
- [ ] Directorship verification cross-checks Companies House
- [ ] Enhanced timeline detects education/employment overlaps
- [ ] All flags generate with correct severity levels
- [ ] Error handling is graceful (timeouts, API failures)
- [ ] Logging captures all verification attempts
- [ ] Tests passing (>90% coverage)
- [ ] Documentation updated
- [ ] Demo prepared for stakeholders
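
For the error-handling item in the checklist, one possible shape for "graceful" is a wrapper that applies a timeout to an external call and returns a fallback value instead of failing the whole CV check. The sketch below uses only the base class library. `VerificationCallGuard` and `WithFallbackAsync` are hypothetical names, and whether a timed-out check should fall back silently or raise its own flag is a policy decision this guide does not settle.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of RealCV): runs an external call with a timeout
// and returns a fallback value on timeout or HTTP failure.
public static class VerificationCallGuard
{
    public static async Task<T> WithFallbackAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        T fallback,
        CancellationToken cancellationToken = default)
    {
        // Linked source: cancels when the caller cancels or when the timeout elapses.
        using var timeoutSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        timeoutSource.CancelAfter(timeout);

        try
        {
            return await operation(timeoutSource.Token);
        }
        catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
        {
            // Cancellation that did not come from the caller is treated as a timeout.
            return fallback;
        }
        catch (HttpRequestException)
        {
            // Network or HTTP-level failure.
            return fallback;
        }
    }
}
```

Cancellation requested by the caller is not caught, so it still propagates and stops the job as usual.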