fix: clean Apply Now URLs and disable TED demo scraper
- Strip tracking query params from find_tender URLs (?origin=SearchResults)
- Disable TED EU scraper (requires browser automation, was using demo data)
- Update 220 find_tender database records with clean URLs
- Delete 4 TED demo records from database
- Add URL_FIX_SUMMARY.md documentation

All 615 tenders now have direct links to tender detail pages.
Fixes Apply Now button UX issue.
@@ -49,10 +49,12 @@ async function scrapeTenders() {
 const titleLink = element.find('.search-result-header a').first();
 const title = titleLink.text().trim();
 const rawHref = titleLink.attr('href') || '';
-const noticeUrl = rawHref.startsWith('http') ? rawHref : 'https://www.find-tender.service.gov.uk' + rawHref;
+const rawUrl = rawHref.startsWith("http") ? rawHref : "https://www.find-tender.service.gov.uk" + rawHref;
+// Strip query parameters to get clean notice URL
+const noticeUrl = rawUrl.split("?")[0];

 // Extract source ID from URL
-const urlMatch = noticeUrl.match(/\/([A-Z0-9-]+)$/);
+const urlMatch = noticeUrl.match(/\/Notice\/([A-Z0-9-]+)/);
 const sourceId = urlMatch ? urlMatch[1] : noticeUrl;

 const authority = element.find('.search-result-sub-header').text().trim();
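To try the URL cleanup in isolation, the two steps in the diff above (strip the query string, then pull the notice ID out of the path) can be sketched as standalone functions. This is a minimal sketch, not the committed code. The helper names `cleanNoticeUrl` and `extractSourceId` and the sample notice ID are hypothetical. It uses the standard `URL` class instead of `split("?")`, so it also drops any `#fragment`, which the commit itself does not do.

```javascript
// Base URL taken from the diff; used to resolve relative hrefs.
const BASE_URL = 'https://www.find-tender.service.gov.uk';

// Hypothetical helper: resolve a possibly relative href against BASE_URL
// and remove the query string and fragment.
function cleanNoticeUrl(rawHref) {
  const url = new URL(rawHref, BASE_URL);
  url.search = ''; // drops e.g. ?origin=SearchResults
  url.hash = '';   // assumption: fragments are not needed either
  return url.toString();
}

// Hypothetical helper: same regex and fallback as the diff.
// Returns the ID after /Notice/, or the whole URL if no ID is found.
function extractSourceId(noticeUrl) {
  const match = noticeUrl.match(/\/Notice\/([A-Z0-9-]+)/);
  return match ? match[1] : noticeUrl;
}

// Example with a made-up notice ID.
const cleaned = cleanNoticeUrl('/Notice/012345-2024?origin=SearchResults');
console.log(cleaned);
console.log(extractSourceId(cleaned));
```

Anchoring the regex on `/Notice/` rather than on the end of the string means the ID is still found if something follows it in the URL. The fallback to the full URL keeps `sourceId` non-empty when the pattern does not match.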