fix: clean Apply Now URLs and disable TED demo scraper

- Strip tracking query params from find_tender URLs (?origin=SearchResults)
- Disable TED EU scraper (requires browser automation, was using demo data)
- Update 220 find_tender database records with clean URLs
- Delete 4 TED demo records from database
- Add URL_FIX_SUMMARY.md documentation

All 615 tenders now have direct links to tender detail pages.
Fixes Apply Now button UX issue.
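
For reference, the URL cleanup and source ID extraction described above can be sketched as two small functions. This is a minimal illustration that mirrors the logic in the diff, not the scraper's actual structure: the helper names `cleanNoticeUrl` and `extractSourceId`, the `BASE_URL` constant, and the sample href are all hypothetical.

```javascript
// Site root as it appears in the scraper diff.
const BASE_URL = "https://www.find-tender.service.gov.uk";

// Hypothetical helper: build an absolute URL and drop everything from "?" onward.
function cleanNoticeUrl(rawHref) {
  const rawUrl = rawHref.startsWith("http") ? rawHref : BASE_URL + rawHref;
  return rawUrl.split("?")[0];
}

// Hypothetical helper: pull the notice ID out of a /Notice/<ID> path.
// Falls back to the full URL when the pattern is not found, as the diff does.
function extractSourceId(noticeUrl) {
  const match = noticeUrl.match(/\/Notice\/([A-Z0-9-]+)/);
  return match ? match[1] : noticeUrl;
}

// Made-up href in the shape the search results page is assumed to produce.
const sampleHref = "/Notice/012345-2026?origin=SearchResults";
const cleanUrl = cleanNoticeUrl(sampleHref);

console.log(cleanUrl);                  // https://www.find-tender.service.gov.uk/Notice/012345-2026
console.log(extractSourceId(cleanUrl)); // 012345-2026
```

The previous pattern, `/\/([A-Z0-9-]+)$/`, is anchored to the end of the string, so it cannot match a URL that still ends in `?origin=SearchResults`. In that case the source ID fell back to the full URL.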
Peter Foster
2026-02-15 13:18:50 +00:00
parent bba8f97bbe
commit 6ca3e9c576
3 changed files with 82 additions and 186 deletions

@@ -49,10 +49,12 @@ async function scrapeTenders() {
 const titleLink = element.find('.search-result-header a').first();
 const title = titleLink.text().trim();
 const rawHref = titleLink.attr('href') || '';
-const noticeUrl = rawHref.startsWith('http') ? rawHref : 'https://www.find-tender.service.gov.uk' + rawHref;
+const rawUrl = rawHref.startsWith("http") ? rawHref : "https://www.find-tender.service.gov.uk" + rawHref;
+// Strip query parameters to get clean notice URL
+const noticeUrl = rawUrl.split("?")[0];
 // Extract source ID from URL
-const urlMatch = noticeUrl.match(/\/([A-Z0-9-]+)$/);
+const urlMatch = noticeUrl.match(/\/Notice\/([A-Z0-9-]+)/);
 const sourceId = urlMatch ? urlMatch[1] : noticeUrl;
 const authority = element.find('.search-result-sub-header').text().trim();