- Set up daily cron job (3am UTC) for tender URL validation
- Verified dashboard filtering (API already filters status=open)
- Completed full cleanup: 97 valid tenders, 529 removed (84% removal rate)
- Added comprehensive setup documentation in CLEANUP_SETUP.md
- Updated cleanup script to check ALL open tenders (removed the 100-tender limit)
TenderRadar Cleanup - Setup Complete
Date: 2026-02-15 14:17 GMT
Summary
✅ Daily cleanup job configured
✅ Dashboard filtering verified
✅ Initial cleanup completed
Results
Database Status (After Full Cleanup)
- Total tenders: 626
- Open (valid URLs): 97 (~16%)
- Closed (removed): 529 (~84%)
Removal rate: 84% of scraped tenders were already removed from source websites!
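The percentages above follow directly from the three counts. A quick check, where `pct` is a hypothetical helper and not part of the project:

```javascript
// Counts taken from the "Database Status" list above.
const total = 626;
const open = 97;
const closed = 529;

// Hypothetical helper: share of `part` in `whole`, as a percentage.
function pct(part, whole) {
  return (part / whole) * 100;
}

console.log(open + closed === total);      // the two groups account for every tender
console.log(pct(open, total).toFixed(1));   // share still open
console.log(pct(closed, total).toFixed(1)); // share removed at source
```

Unrounded, the shares are roughly 15.5% open and 84.5% closed, which is why the figures are quoted as approximate.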
Current Valid Tenders
The dashboard will show 97 tenders with working Apply Now buttons, distributed across:
- TED EU: 11 ✅
- Contracts Finder: ~40-50 (many removed early)
- Find Tender: Active tenders
- eTendersNI: 11 ✅
- PCS Scotland: 10 ✅
- Sell2Wales: 10 ✅
Configuration
1. Daily Cron Job ✅
0 3 * * * cd /home/peter/tenderpilot && /usr/bin/node cleanup-invalid-tenders.mjs >> logs/cleanup.log 2>&1
What it does:
- Runs daily at 3am UTC
- Checks all "open" tender URLs
- Marks removed tenders as "closed"
- Keeps database in sync with source websites
- Logs to /home/peter/tenderpilot/logs/cleanup.log
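The steps in the list above can be sketched roughly as follows. This is not the contents of cleanup-invalid-tenders.mjs. The names `nextStatus`, `findRemoved`, and `fetchStatus` are hypothetical, and so is the rule that only HTTP 404 or 410 counts as "removed".

```javascript
// Hypothetical sketch; the real cleanup script may decide differently.
// Assumption: a source that answers 404 or 410 has removed the tender.
const REMOVED_STATUSES = new Set([404, 410]);

// Decide the new status for one tender from the HTTP status of its URL.
// Anything other than a definite "gone" keeps the tender open, so a
// transient 500 or a timeout (null) never closes a live tender.
function nextStatus(httpStatus) {
  return REMOVED_STATUSES.has(httpStatus) ? "closed" : "open";
}

// Check every open tender and collect the ids that should be closed.
// `fetchStatus` is injected (url -> Promise<number | null>) so the loop
// can be exercised without network access.
async function findRemoved(openTenders, fetchStatus) {
  const removedIds = [];
  for (const tender of openTenders) {
    const httpStatus = await fetchStatus(tender.url);
    if (nextStatus(httpStatus) === "closed") removedIds.push(tender.id);
  }
  return removedIds;
}
```

Treating only definite "gone" responses as removals is a conservative choice: a source site that is briefly down would otherwise close every tender it hosts.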
2. Dashboard Filtering ✅
API endpoint: /api/tenders (in server.js)
Automatic filtering:
WHERE status = 'open'
AND (deadline IS NULL OR deadline > NOW())
Result: Dashboard shows only 97 tenders with valid, working URLs
No changes needed - API already filters correctly!
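For reference, the same rule as the WHERE clause can be written as a plain predicate. This is an illustrative sketch only: `isVisibleOnDashboard` and the tender object shape are assumptions, not code from server.js.

```javascript
// Hypothetical mirror of the SQL filter:
//   status = 'open' AND (deadline IS NULL OR deadline > NOW())
// `tender.deadline` is assumed to be null or an ISO 8601 date string.
function isVisibleOnDashboard(tender, now = new Date()) {
  if (tender.status !== "open") return false;
  if (tender.deadline === null) return true;
  // Strictly greater, matching `deadline > NOW()` in the query.
  return new Date(tender.deadline) > now;
}

const now = new Date("2026-02-15T14:17:00Z");
console.log(isVisibleOnDashboard({ status: "open", deadline: null }, now));
console.log(isVisibleOnDashboard({ status: "open", deadline: "2026-03-01T00:00:00Z" }, now));
console.log(isVisibleOnDashboard({ status: "open", deadline: "2026-02-01T00:00:00Z" }, now));
console.log(isVisibleOnDashboard({ status: "closed", deadline: null }, now));
```

A tender marked "closed" by the cleanup job is hidden regardless of its deadline, and an open tender with a past deadline is hidden even if its URL still works.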
Cron Schedule Summary
All TenderRadar cron jobs on VPS:
0 */4 * * * - Contracts Finder scraper (every 4 hours)
10 */4 * * * - Find Tender scraper (every 4 hours)
20 */4 * * * - PCS Scotland scraper (every 4 hours)
30 */4 * * * - Sell2Wales scraper (every 4 hours)
20 5 * * * - TED EU scraper (daily at 05:20)
30 5 * * * - eTendersNI scraper (daily at 05:30)
0 7 * * * - Email digest (daily at 7am)
0 3 * * * - Cleanup invalid tenders (NEW - daily at 3am)
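To see when each entry in the schedule above fires, the minute and hour fields can be expanded into clock times. The sketch below is a deliberately small illustration, not a cron parser: `expandField` and `dailyRunTimes` are hypothetical helpers that handle only plain numbers and the `*/step` form used in this schedule.

```javascript
// Expand one cron field that is either a plain number or "*/step".
// `max` is the exclusive upper bound (60 for minutes, 24 for hours).
function expandField(field, max) {
  if (field.startsWith("*/")) {
    const step = Number(field.slice(2));
    const values = [];
    for (let v = 0; v < max; v += step) values.push(v);
    return values;
  }
  return [Number(field)];
}

// List the HH:MM times at which a job runs each day.
function dailyRunTimes(minuteField, hourField) {
  const pad = (n) => String(n).padStart(2, "0");
  const times = [];
  for (const hour of expandField(hourField, 24)) {
    for (const minute of expandField(minuteField, 60)) {
      times.push(`${pad(hour)}:${pad(minute)}`);
    }
  }
  return times;
}

console.log(dailyRunTimes("0", "*/4"));  // Contracts Finder scraper
console.log(dailyRunTimes("30", "*/4")); // Sell2Wales scraper
console.log(dailyRunTimes("0", "3"));    // Cleanup job
```

The four-hourly scrapers start at hours 0, 4, 8, 12, 16, and 20, so the 03:00 cleanup does not share a start time with any of them.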
Log Files
- Cleanup logs: /home/peter/tenderpilot/logs/cleanup.log
- Scraper logs: /home/peter/tenderpilot/scraper.log
- Manual cleanup runs: /home/peter/tenderpilot/cleanup-full-*.log
Monitoring
Check cleanup effectiveness:
# View recent cleanup log
tail -50 ~/tenderpilot/logs/cleanup.log
# Check current database status
psql tenderpilot -c "SELECT status, COUNT(*) FROM tenders GROUP BY status;"
# See what dashboard shows
psql tenderpilot -c "SELECT COUNT(*) FROM tenders WHERE status='open' AND (deadline IS NULL OR deadline > NOW());"
Next Steps (Optional)
- ✅ Daily cleanup job - DONE
- ✅ Dashboard filtering - VERIFIED WORKING
- ⏳ Reduce scrape interval from 4 hours to 1 hour (captures more fast-closing tenders)
- ⏳ Add more notice types to scrapers (not just stage=tender)
- ⏳ Monitor cleanup.log for removal rate patterns
Files Created
- /home/peter/tenderpilot/cleanup-invalid-tenders.mjs - Cleanup script
- /home/peter/tenderpilot/TENDER_CLEANUP_SUMMARY.md - Problem analysis
- /home/peter/tenderpilot/CLEANUP_SETUP.md - This setup documentation