tenderpilot/check-urls.mjs
Peter Foster c6b0169f3e feat: three major improvements - stable sources, archival, email alerts
1. Focus on Stable International/Regional Sources
   - Improved TED EU scraper (5 search strategies, 5 pages each)
   - All stable sources now hourly (TED EU, Sell2Wales, PCS Scotland, eTendersNI)
   - De-prioritize unreliable UK gov sites (100% removal rate)

2. Archival Feature
   - New DB columns: archived, archived_at, archived_snapshot, last_validated, validation_failures
   - Cleanup script now preserves full tender snapshots before archiving
   - Gradual failure handling (3 retries before archiving)
   - No data loss - historical record preserved
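
A minimal sketch of how the gradual failure handling above might work. It assumes that "3 retries before archiving" means a tender is archived on its third consecutive failed validation, that a successful check resets the counter, and that last_validated is updated on every attempt. The function name applyValidationResult and the plain-object tender shape are hypothetical. Only the column names come from this commit message, and the real cleanup-with-archival.mjs may differ.

```javascript
// Assumed threshold: archive on the third consecutive failed validation.
const MAX_VALIDATION_FAILURES = 3;

// Hypothetical helper: returns an updated copy of `tender` after one
// validation attempt. `tender` is a plain object whose keys mirror the
// new DB columns (archived, archived_at, archived_snapshot,
// last_validated, validation_failures).
function applyValidationResult(tender, urlIsReachable, now = new Date()) {
  if (urlIsReachable) {
    // Assumption: a successful check resets the failure counter.
    return { ...tender, validation_failures: 0, last_validated: now };
  }

  const failures = (tender.validation_failures ?? 0) + 1;
  if (failures < MAX_VALIDATION_FAILURES) {
    // Not enough failures yet: record the attempt and keep the tender live.
    return { ...tender, validation_failures: failures, last_validated: now };
  }

  // Threshold reached: snapshot the tender as it was before archiving,
  // leaving out any older snapshot so snapshots do not nest.
  const { archived_snapshot, ...current } = tender;
  return {
    ...tender,
    validation_failures: failures,
    last_validated: now,
    archived: true,
    archived_at: now,
    archived_snapshot: JSON.stringify(current),
  };
}
```

Keeping the decision in a pure function separates the retry policy from the database writes, so the policy can be checked without a live connection.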

3. Email Alerts
   - Daily digest (8am) - all new tenders from last 24h
   - High-value alerts (every 4h) - tenders >£100k
   - Professional HTML emails with all tender details
   - Configurable via environment variables
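
The two alert rules above can be sketched as simple filters. This is only an illustration: the field names created_at and value_gbp and the environment variable ALERT_MIN_VALUE_GBP are hypothetical, and send-tender-alerts.mjs may select tenders differently (for example, directly in SQL).

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed env var name; falls back to the £100k threshold from the commit message.
const HIGH_VALUE_THRESHOLD_GBP =
  Number(process.env.ALERT_MIN_VALUE_GBP) || 100_000;

// Daily digest: tenders first seen within the 24 hours before `now`.
// `created_at` is a hypothetical field holding a Date or ISO string.
function selectDigestTenders(tenders, now = new Date()) {
  const cutoff = now.getTime() - DAY_MS;
  return tenders.filter(t => new Date(t.created_at).getTime() >= cutoff);
}

// High-value alert: tenders strictly above the threshold.
// `value_gbp` is a hypothetical field; missing values never match.
function selectHighValueTenders(tenders, threshold = HIGH_VALUE_THRESHOLD_GBP) {
  return tenders.filter(t => Number(t.value_gbp) > threshold);
}
```

For example, selectHighValueTenders(selectDigestTenders(rows)) would give the high-value subset of the last day's tenders.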

Expected outcomes:
- 50-100 stable tenders (vs 26 currently)
- Zero 404 errors (archived data preserved)
- Proactive notifications (no missed opportunities)
- Historical archive for trend analysis

Files:
- scrapers/ted-eu.js (improved)
- cleanup-with-archival.mjs (new)
- send-tender-alerts.mjs (new)
- migrations/add-archival-fields.sql (new)
- THREE_IMPROVEMENTS_SUMMARY.md (documentation)

All cron jobs updated for hourly scraping + daily cleanup + alerts
2026-02-15 14:42:17 +00:00


import pg from "pg";

// Read the connection string from the environment instead of hardcoding credentials.
// DATABASE_URL is an assumed variable name, e.g. postgresql://user:password@localhost:5432/tenderpilot
const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL
});

try {
  // Check for search URLs. ILIKE is case-insensitive, so one pattern covers "search" and "Search".
  const searchCheck = await pool.query(
    "SELECT source, notice_url FROM tenders WHERE notice_url ILIKE $1 LIMIT 10",
    ["%search%"]
  );
  console.log("=== URLs containing Search ===");
  // The query is capped by LIMIT 10, so this is the number of rows shown, not the total.
  console.log("Count (max 10):", searchCheck.rows.length);
  searchCheck.rows.forEach(row => {
    console.log(row.source + ": " + row.notice_url);
  });

  // Check all sources
  const sourceCounts = await pool.query(
    "SELECT source, COUNT(*) AS count FROM tenders GROUP BY source ORDER BY count DESC"
  );
  console.log("\n=== Tenders by source ===");
  sourceCounts.rows.forEach(row => {
    console.log(row.source + ": " + row.count);
  });
} finally {
  // Always release the pool, even if a query fails.
  await pool.end();
}