tenderpilot/SCRAPERS_STATUS.md
Peter Foster 685ac00f7c feat: implement TED EU scraper with Playwright
- Add Playwright browser automation for TED EU tender scraping
- Install playwright + chromium browser dependencies
- Scraper successfully finds UK-relevant EU tenders (~11 per run)
- Uses headless Chromium with keyword filtering
- Add SCRAPERS_STATUS.md documentation

All 6 main scrapers now operational (digital-marketplace API still down).
Total active tenders: 626
2026-02-15 13:28:54 +00:00

TenderRadar Scrapers - All Working

Date: 2026-02-15
Status: ALL MAIN SCRAPERS OPERATIONAL (digital-marketplace disabled)

Summary

6 out of 6 main scrapers working
1 scraper disabled (digital-marketplace - API down)
📊 Total tenders: 626

Active Scrapers

| Source | Count | Status | Technology |
|---|---|---|---|
| contracts_finder | 364 | Working | JSON API |
| find_tender | 220 | Working | HTML scraping |
| ted_eu | 11 | NEWLY IMPLEMENTED | Playwright browser automation |
| etendersni | 11 | Working | HTML scraping |
| pcs_scotland | 10 | Working | HTML scraping |
| sell2wales | 10 | Working | HTML scraping |

Scraper Details

contracts_finder (364 tenders)

  • JSON API via OCDS format
  • Direct notice URLs with UUIDs
  • Production-ready
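
As a rough illustration of what consuming an OCDS feed involves, the sketch below flattens a release package into tender rows. The input field names (`releases`, `ocid`, `buyer.name`, `tender.title`, `tender.tenderPeriod.endDate`, `tender.value`) come from the OCDS release schema. The function name `releasesToTenders`, the output row shape, and the sample data are assumptions made for this example, not the actual scraper code.

```javascript
// Sketch only: flatten an OCDS release package into tender rows.
// The output row shape is an assumption, not the project's real schema.
function releasesToTenders(releasePackage, source = 'contracts_finder') {
  const releases = Array.isArray(releasePackage?.releases) ? releasePackage.releases : [];
  return releases
    .filter((r) => r && r.tender && r.tender.title) // skip releases without a usable tender block
    .map((r) => ({
      source,
      ocid: r.ocid ?? null,
      title: r.tender.title.trim(),
      description: r.tender.description ?? null,
      buyer: r.buyer?.name ?? null,
      deadline: r.tender.tenderPeriod?.endDate ?? null,
      valueAmount: r.tender.value?.amount ?? null,
      currency: r.tender.value?.currency ?? null,
    }));
}

// Made-up sample package, for illustration only.
const sample = {
  releases: [
    {
      ocid: 'ocds-example-0001',
      buyer: { name: 'Example Council' },
      tender: {
        title: ' Grounds maintenance ',
        tenderPeriod: { endDate: '2026-03-01T12:00:00Z' },
        value: { amount: 50000, currency: 'GBP' },
      },
    },
    { ocid: 'ocds-example-0002' }, // no tender block, so it is skipped
  ],
};

console.log(releasesToTenders(sample));
```

Missing optional fields become `null` rather than `undefined`, which tends to map more cleanly onto nullable database columns.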

find_tender (220 tenders)

  • HTML scraping with cheerio
  • Recent fix: Strips tracking query params
  • Production-ready
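
The query-param fix might look something like the sketch below. It assumes the whole query string (and any fragment) is tracking noise and can be dropped. The helper name `stripQuery` and the example URL are hypothetical, and the real code in scrapers/find-tender.js may differ.

```javascript
// Sketch only: drop the query string and fragment from a notice URL.
// Assumes nothing in the query string is needed to reach the notice.
function stripQuery(rawUrl) {
  const url = new URL(rawUrl); // throws a TypeError on malformed input
  url.search = '';
  url.hash = '';
  return url.toString();
}

// Illustrative URL, not a real notice.
console.log(stripQuery('https://example.test/Notice/000123-2026?origin=SearchResults&p=1#top'));
```

Using the built-in `URL` class instead of splitting on `?` avoids edge cases such as a `?` appearing inside a fragment.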

ted_eu (11 tenders) - NEWLY IMPLEMENTED

  • Technology: Playwright headless browser automation
  • Search: UK keyword filtering
  • Performance: Scans 3 pages, finds ~11 UK-relevant EU tenders
  • Production-ready
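
A minimal sketch of the approach described above: a headless Chromium session that walks a fixed number of result pages and keeps only titles matching UK keywords. The keyword list, the `buildPageUrl` callback, and the `rowSelector` value are all assumptions, since this document does not show the real search URL or page structure. The Playwright calls themselves (`chromium.launch`, `browser.newPage`, `page.goto`, `page.$$eval`) are standard API.

```javascript
// Assumed keyword list; the list used by the real scraper is not shown here.
const UK_KEYWORDS = ['united kingdom', 'uk', 'england', 'scotland', 'wales', 'northern ireland'];

function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Whole-word, case-insensitive match, so "UK" does not match "Ukraine".
function isUkRelevant(text, keywords = UK_KEYWORDS) {
  const pattern = new RegExp(`\\b(?:${keywords.map(escapeRegExp).join('|')})\\b`, 'i');
  return pattern.test(text);
}

// buildPageUrl(n) and rowSelector are hypothetical and supplied by the caller.
async function scrapeTed({ buildPageUrl, rowSelector, maxPages = 3 }) {
  // Loaded lazily so the keyword filter can be used without a browser installed.
  const { chromium } = require('playwright');
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    const matches = [];
    for (let n = 1; n <= maxPages; n += 1) {
      await page.goto(buildPageUrl(n), { waitUntil: 'domcontentloaded' });
      // Collect link text and href from each result row.
      const rows = await page.$$eval(`${rowSelector} a`, (links) =>
        links.map((a) => ({ title: a.textContent.trim(), url: a.href })));
      matches.push(...rows.filter((row) => isUkRelevant(row.title)));
    }
    return matches;
  } finally {
    await browser.close(); // always release the browser, even on errors
  }
}
```

Keeping the filter as a separate pure function makes it easy to unit-test and to widen the keyword list later without touching the browser code.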

etendersni, pcs_scotland, sell2wales

  • All working with direct tender URLs
  • Production-ready

Disabled Scrapers

digital-marketplace

  • Status: API timeout
  • Reason: Endpoint does not respond within the 30 s request timeout
  • Action: Monitor for service restoration
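
Monitoring for restoration could be as simple as a periodic probe with the same 30 s limit. The sketch below is one possible shape. The function name `probeEndpoint` and its return shape are assumptions, the endpoint URL is left to the caller because it is not given in this document, and `fetchImpl` defaults to the `fetch` built into Node 18+.

```javascript
// Sketch only: probe an endpoint and give up after timeoutMs.
async function probeEndpoint(url, { timeoutMs = 30_000, fetchImpl = fetch } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    return { up: response.ok, status: response.status };
  } catch (err) {
    // Both an aborted request (timeout) and a network failure land here.
    return { up: false, status: null, timedOut: controller.signal.aborted };
  } finally {
    clearTimeout(timer); // avoid leaving a pending timer after a fast response
  }
}
```

A scheduled job could call `probeEndpoint(url)` and re-enable the scraper once `up` is `true` on several consecutive checks.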

Recent Changes (2026-02-15)

  1. Fixed find_tender - Removed tracking params from 220 URLs
  2. Implemented ted_eu - Full Playwright browser automation
  3. Installed Playwright + Chromium - 167MB download complete
  4. Cleaned database - Removed 4 demo records
  5. Updated Apply Now URLs - 100% working across all sources

Dependencies

  • axios, cheerio, playwright, pg, dotenv
  • Chromium browser (via Playwright)

Performance

  • Total scrape time: 5-10 minutes for all sources
  • Database: PostgreSQL on VPS localhost
  • Storage: 626 active tenders
  • Cron schedule: Every 4 hours
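
For a scheduled run over every source, a sequential runner along the following lines would keep one failing source from aborting the rest. This is a sketch under assumptions: `runAll`, the report shape, and the stub scrapers are hypothetical and stand in for the real modules under scrapers/.

```javascript
// Sketch only: run scrapers one after another and tally results per source.
async function runAll(scrapers) {
  const report = [];
  for (const [source, scrape] of Object.entries(scrapers)) {
    const startedAt = Date.now();
    try {
      const tenders = await scrape();
      report.push({ source, ok: true, count: tenders.length, ms: Date.now() - startedAt });
    } catch (err) {
      // Record the failure and continue with the next source.
      report.push({ source, ok: false, count: 0, error: err.message, ms: Date.now() - startedAt });
    }
  }
  return report;
}

// Stub scrapers, for illustration only.
const stubs = {
  contracts_finder: async () => [{ title: 'A' }, { title: 'B' }],
  digital_marketplace: async () => { throw new Error('timeout'); },
};

runAll(stubs).then((report) => console.table(report));
```

Running sources sequentially is slower than running them in parallel, but it keeps load on each site low and makes per-source timings easy to read.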

Files Modified

  1. scrapers/find-tender.js - Strip query params
  2. scrapers/ted-eu.js - Playwright implementation
  3. package.json - Added Playwright dependency
  4. Database - 220 URLs cleaned, 11 new TED tenders added

Next Steps (Optional)

  1. Monitor digital-marketplace API
  2. Expand TED keyword search
  3. Consider additional UK procurement sources