Commit Graph

12 Commits

Author SHA1 Message Date
Peter Foster
c6b0169f3e feat: three major improvements - stable sources, archival, email alerts
1. Focus on Stable International/Regional Sources
   - Improved TED EU scraper (5 search strategies, 5 pages each)
   - All stable sources now hourly (TED EU, Sell2Wales, PCS Scotland, eTendersNI)
   - De-prioritize unreliable UK gov sites (100% removal rate)

2. Archival Feature
   - New DB columns: archived, archived_at, archived_snapshot, last_validated, validation_failures
   - Cleanup script now preserves full tender snapshots before archiving
   - Gradual failure handling (3 retries before archiving)
   - No data loss - historical record preserved
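The gradual failure handling above could be sketched roughly as follows. This is an illustration only, not the contents of cleanup-with-archival.mjs: the function name `applyValidationResult` is hypothetical, and two behaviours are assumed rather than taken from the commit, namely that the tender is archived on the third consecutive failure and that a successful check resets the counter. The column names come from the list above.

```javascript
// Assumption: archive on the 3rd consecutive failed validation.
const MAX_VALIDATION_FAILURES = 3;

// Hypothetical helper: returns an updated copy of a tender row after one
// validation attempt. `isReachable` is the result of checking the source URL.
function applyValidationResult(tender, isReachable, now = new Date()) {
  const stamp = now.toISOString();

  if (isReachable) {
    // Assumption: a successful check resets the failure counter.
    return { ...tender, validation_failures: 0, last_validated: stamp };
  }

  const failures = (tender.validation_failures || 0) + 1;
  const updated = { ...tender, validation_failures: failures, last_validated: stamp };

  if (failures >= MAX_VALIDATION_FAILURES && !tender.archived) {
    // Keep a full copy of the row (minus any earlier snapshot) before archiving.
    const { archived_snapshot, ...fields } = updated;
    return {
      ...updated,
      archived: true,
      archived_at: stamp,
      archived_snapshot: JSON.stringify(fields),
    };
  }
  return updated;
}
```

Storing the snapshot as JSON in `archived_snapshot` keeps the historical record readable even if the live columns change later.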

3. Email Alerts
   - Daily digest (8am) - all new tenders from last 24h
   - High-value alerts (every 4h) - tenders >£100k
   - Professional HTML emails with all tender details
   - Configurable via environment variables
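As a rough sketch of how the two alert types above might select tenders: the field names `created_at` and `value_gbp` and both function names are hypothetical, and the assumption that the high-value alert only looks back over its own 4-hour interval is mine, not stated in the commit.

```javascript
const HOUR_MS = 60 * 60 * 1000;
// Threshold from the commit message: tenders above £100k.
const HIGH_VALUE_THRESHOLD_GBP = 100000;

// Daily digest: every tender first seen in the last 24 hours.
function selectDailyDigest(tenders, now = new Date()) {
  const cutoff = now.getTime() - 24 * HOUR_MS;
  return tenders.filter((t) => new Date(t.created_at).getTime() >= cutoff);
}

// High-value alert: tenders above the threshold seen within the alert interval.
// Assumption: the lookback equals the 4-hour schedule, so runs do not overlap.
function selectHighValue(tenders, now = new Date(), windowHours = 4,
                         threshold = HIGH_VALUE_THRESHOLD_GBP) {
  const cutoff = now.getTime() - windowHours * HOUR_MS;
  return tenders.filter(
    (t) =>
      new Date(t.created_at).getTime() >= cutoff &&
      Number(t.value_gbp) > threshold
  );
}
```

Passing the threshold and window as parameters leaves room for the environment-variable configuration mentioned above without hard-coding variable names here.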

Expected outcomes:
- 50-100 stable tenders (vs 26 currently)
- Zero 404 errors (archived data preserved)
- Proactive notifications (no missed opportunities)
- Historical archive for trend analysis

Files:
- scrapers/ted-eu.js (improved)
- cleanup-with-archival.mjs (new)
- send-tender-alerts.mjs (new)
- migrations/add-archival-fields.sql (new)
- THREE_IMPROVEMENTS_SUMMARY.md (documentation)

All cron jobs updated for hourly scraping + daily cleanup + alerts
2026-02-15 14:42:17 +00:00
Peter Foster
6709ec4db6 feat: major scraper improvements - all 3 enhancements
1. Remove stage=tender filter - Get ALL notice types
   - Now captures planning, tender, award, contract notices
   - Previously missed ~50% of available data
   - Provides full procurement lifecycle visibility

2. Reduce scrape interval from 4 hours to 1 hour
   - Updated cron for contracts-finder, find-tender, pcs-scotland, sell2wales
   - Captures fast-closing tenders (< 4 hour window)
   - Max 1 hour lag vs 4 hour lag

3. Add sophisticated filtering
   - Must have deadline specified
   - Deadline must be >= 24 hours in future
   - Skip expired tenders
   - Reduce lookback window from 90 days to 14 days (first run) or 1 hour (incremental)
   - Incremental mode: only fetch since last scrape
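The filtering rules above can be sketched as two small helpers. This is a minimal illustration, not the code in contracts-finder.js: the names `isActionable` and `scrapeWindowStart` and the `deadline` field are hypothetical.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// A notice is kept only if it has a parseable deadline at least 24 hours away.
// This also excludes expired tenders, whose deadline is in the past.
function isActionable(notice, now = new Date()) {
  if (!notice.deadline) return false;
  const deadline = new Date(notice.deadline).getTime();
  if (Number.isNaN(deadline)) return false;
  return deadline - now.getTime() >= 24 * HOUR_MS;
}

// First run: look back 14 days. Incremental run: fetch since the last scrape.
function scrapeWindowStart(lastScrapeAt, now = new Date()) {
  if (!lastScrapeAt) {
    return new Date(now.getTime() - 14 * 24 * HOUR_MS);
  }
  return new Date(lastScrapeAt);
}
```

With an hourly schedule, the incremental window works out to roughly one hour, which matches the figure quoted above.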

Expected outcomes:
- 50% valid tender rate (vs 0% before)
- 10-20 new tenders per day
- Zero 404 errors (cleanup + fresh data)
- Better user experience (only actionable opportunities)

Backup: contracts-finder.js.backup
2026-02-15 14:30:41 +00:00
Peter Foster
685ac00f7c feat: implement TED EU scraper with Playwright
- Add Playwright browser automation for TED EU tender scraping
- Install playwright + chromium browser dependencies
- Scraper successfully finds UK-relevant EU tenders (~11 per run)
- Uses headless Chromium with keyword filtering
- Add SCRAPERS_STATUS.md documentation

All 6 main scrapers now operational (digital-marketplace API still down).
Total active tenders: 626
2026-02-15 13:28:54 +00:00
Peter Foster
6ca3e9c576 fix: clean Apply Now URLs and disable TED demo scraper
- Strip tracking query params from find_tender URLs (?origin=SearchResults)
- Disable TED EU scraper (requires browser automation, was using demo data)
- Update 220 find_tender database records with clean URLs
- Delete 4 TED demo records from database
- Add URL_FIX_SUMMARY.md documentation
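The URL cleanup above could look something like this sketch, which uses the standard `URL` API. The function name `cleanTenderUrl` and the example.com address are placeholders, and treating `origin` as the only tracking parameter is an assumption based on the single example given in the commit.

```javascript
// Assumption: `origin` is the only tracking parameter that needs removing.
const TRACKING_PARAMS = ["origin"];

// Returns the URL with tracking query parameters removed; other parameters
// and the path are left untouched.
function cleanTenderUrl(rawUrl) {
  const url = new URL(rawUrl);
  for (const param of TRACKING_PARAMS) {
    url.searchParams.delete(param);
  }
  return url.toString();
}

// Placeholder URL for illustration only.
console.log(cleanTenderUrl("https://example.com/Notice/123?origin=SearchResults"));
```

Deleting named parameters rather than dropping the whole query string keeps any parameter the detail page actually depends on.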

All 615 tenders now have direct links to tender detail pages.
Fixes Apply Now button UX issue.
2026-02-15 13:18:50 +00:00
Peter Foster
bba8f97bbe Fix: Sell2Wales direct URL to use search_view.aspx with ID parameter
2026-02-14 18:36:16 +00:00
Peter Foster
1be6a2531e Fix: use direct notice URLs for Contracts Finder and Sell2Wales instead of search fallbacks
2026-02-14 18:33:12 +00:00
Peter Foster
6c1a649455 Fix: link Apply button to Contracts Finder Search portal
2026-02-14 18:23:43 +00:00
Peter Foster
d23a529514 Fix: link Apply button to Contracts Finder search with tender title for better UX
2026-02-14 18:17:41 +00:00
Peter Foster
ec56ef8cb8 Fix: use Contracts Finder search endpoint for notice URLs instead of broken /notice/ links
2026-02-14 18:11:02 +00:00
Peter Foster
771fcf9d76 Add sector classification module, integrate into all 7 scrapers, fix CF pagination
2026-02-14 17:12:51 +00:00
Peter Foster
d1aa21c59f fix: logo crop, navbar alignment, buyer names, tender URLs
- Crop logo image (remove 58% bottom whitespace)
- Logo 90px, centered with nav links
- Cursor fix restored (no I-beam on non-interactive content)
- Contracts Finder: fix empty authority_name (was looking for procurer role, CF uses buyer)
- Contracts Finder: generate notice_url from OCID when release.url is empty
- Find a Tender: fix doubled base URL in notice_url
- Dashboard: use authority_name field (not buyer) for tender cards
- Card shadows strengthened on auth pages
- Password eye icon repositioned inside input
2026-02-14 16:15:21 +00:00
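For the doubled base URL fix in d1aa21c59f above, one common way to avoid the problem is to let the standard `URL` constructor resolve the link, since it accepts both relative paths and already-absolute URLs. This is a sketch under assumptions, not the scraper's actual code: the helper name `toNoticeUrl` and the `/Notice/...` path shape are hypothetical.

```javascript
// Base URL of the Find a Tender service.
const FIND_TENDER_BASE = "https://www.find-tender.service.gov.uk";

// Resolves a link against the base. If `link` is already absolute, the base
// is ignored, so the host can never be prepended twice.
function toNoticeUrl(link, base = FIND_TENDER_BASE) {
  return new URL(link, base).toString();
}
```

Naive string concatenation (`base + link`) produces the doubled prefix whenever the source already returns an absolute URL, which is the failure this approach sidesteps.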
Peter Foster
f969ecae04 feat: visual polish, nav login link, pricing badge fix, cursor fix, button contrast
- Hero mockup: enhanced 3D perspective and shadow
- Testimonials: illustrated SVG avatars
- Growth pricing card: visual prominence (scale, gradient, badge)
- Most Popular badge: repositioned to avoid overlapping heading
- Nav: added Log In link next to Start Free Trial
- Fixed btn-primary text colour on anchor tags (white on blue)
- Fixed cursor: default on all non-interactive elements
- Disabled user-select on non-form content to prevent text caret
2026-02-14 14:17:15 +00:00