SEO schema cleanup + blog index update

Removed 42 deprecated/restricted schema blocks across 21 files:
- FAQPage removed from all commercial pages (Google restricted FAQ rich results to government and health sites, Aug 2023)
- HowTo removed from all pages (Google removed HowTo rich results, Sep 2023)
- Compliance guide: author @type corrected from Organization to Person
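
The removal described above could be scripted roughly as follows. This is a minimal sketch, not the tool actually used for this commit: the function name `strip_deprecated_schema` and the regex are assumptions, and it only handles JSON-LD blocks whose top-level `@type` is one of the listed types.

```python
import json
import re
from typing import Tuple

# Types this commit treats as deprecated/restricted (from the commit message).
DEPRECATED_TYPES = {"FAQPage", "HowTo"}

# One JSON-LD <script> element, plus trailing spaces and the newline after it.
JSON_LD_BLOCK = re.compile(
    r'<script\s+type="application/ld\+json"\s*>(.*?)</script>[ \t]*\n?',
    re.DOTALL | re.IGNORECASE,
)


def strip_deprecated_schema(html: str) -> Tuple[str, int]:
    """Return (cleaned_html, number_of_blocks_removed). Hypothetical helper."""
    removed = 0

    def replace(match: "re.Match[str]") -> str:
        nonlocal removed
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            return match.group(0)  # leave unparseable blocks untouched
        schema_type = data.get("@type") if isinstance(data, dict) else None
        types = schema_type if isinstance(schema_type, list) else [schema_type]
        if DEPRECATED_TYPES.intersection(t for t in types if isinstance(t, str)):
            removed += 1
            return ""  # drop the whole <script> element
        return match.group(0)

    return JSON_LD_BLOCK.sub(replace, html), removed
```

Running it over each page and summing the second return value would give a count comparable to the "42 blocks" figure. Blocks that nest a deprecated type inside `@graph` are deliberately not handled here.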

Blog index:
- New article cards: ai-web-scraping-2026, web-scraping-lead-generation-uk
- Stats updated: 55+ articles -> 57+, 2025 Content -> 2026 Content
- Featured article date updated to March 2026
- Blog schema updated with new BlogPosting entries
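
For illustration, the new BlogPosting entries might be generated along these lines. This is a sketch under stated assumptions: `BASE_URL`, the helper names, and the slug-derived placeholder headlines are hypothetical. Only the two slugs come from this commit.

```python
import json

# Hypothetical base URL; the real site path is not shown in this commit.
BASE_URL = "https://example.com/blog/articles/"

# Slugs of the two article cards added in this commit.
NEW_SLUGS = ["ai-web-scraping-2026", "web-scraping-lead-generation-uk"]


def blog_posting(slug: str) -> dict:
    """Build one BlogPosting entry; the headline is a placeholder derived from the slug."""
    return {
        "@type": "BlogPosting",
        "headline": slug.replace("-", " ").title(),  # placeholder, not the real title
        "url": BASE_URL + slug,
    }


def blog_schema(posts: list) -> dict:
    """Wrap BlogPosting entries in a schema.org Blog via its blogPost property."""
    return {"@context": "https://schema.org", "@type": "Blog", "blogPost": posts}


def render_json_ld(schema: dict) -> str:
    """Serialize the schema as a JSON-LD <script> element."""
    body = json.dumps(schema, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(render_json_ld(blog_schema([blog_posting(s) for s in NEW_SLUGS])))
```

A real entry would also carry fields such as `datePublished` and `author`, which are left out here because their values are not stated in this commit.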
Peter Foster
2026-03-08 10:48:11 +00:00
parent 790ffef935
commit 62e69542b0
21 changed files with 40 additions and 867 deletions

@@ -545,37 +545,6 @@ $read_time = 9;
<script src="../../assets/js/main.js"></script>
<script src="../../assets/js/cro-enhancements.js"></script>
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is advanced statistical validation in data pipelines?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Advanced statistical validation uses techniques such as z-score analysis, interquartile range checks, Kolmogorov-Smirnov tests, and distribution comparison to detect anomalies in data pipelines that simple rule-based checks miss. It catches issues like distributional drift, unexpected skew, or out-of-range values that only become visible when compared to historical baselines."
}
},
{
"@type": "Question",
"name": "What tools are best for data quality validation in Python?",
"acceptedAnswer": {
"@type": "Answer",
"text": "The most widely used Python tools for data quality validation are Great Expectations (comprehensive rule-based validation with HTML reports), Pandera (schema validation for DataFrames), Deequ (Amazon's library for large-scale validation), and dbt tests for SQL-based pipelines. Great Expectations is the most popular choice for production data pipelines in UK data teams."
}
},
{
"@type": "Question",
"name": "How do you validate data quality automatically in a pipeline?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Automated data quality validation involves: (1) defining schema and type constraints, (2) setting statistical thresholds based on historical baselines, (3) running validation checks as pipeline steps, (4) routing failed records to a quarantine layer, and (5) alerting the data team via Slack or email. Tools like Great Expectations or dbt can run these checks natively within Airflow or Prefect workflows."
}
}
]
}
</script>
</body>
</html>