feat: three major improvements - stable sources, archival, email alerts

1. Focus on Stable International/Regional Sources
- Improved TED EU scraper (5 search strategies, 5 pages each)
- All stable sources now hourly (TED EU, Sell2Wales, PCS Scotland, eTendersNI)
- De-prioritize unreliable UK gov sites (100% removal rate)

2. Archival Feature
- New DB columns: archived, archived_at, archived_snapshot, last_validated, validation_failures
- Cleanup script now preserves full tender snapshots before archiving
- Gradual failure handling (3 retries before archiving)
- No data loss - historical record preserved

3. Email Alerts
- Daily digest (8am) - all new tenders from last 24h
- High-value alerts (every 4h) - tenders >£100k
- Professional HTML emails with all tender details
- Configurable via environment variables

Expected outcomes:
- 50-100 stable tenders (vs 26 currently)
- Zero 404 errors (archived data preserved)
- Proactive notifications (no missed opportunities)
- Historical archive for trend analysis

Files:
- scrapers/ted-eu.js (improved)
- cleanup-with-archival.mjs (new)
- send-tender-alerts.mjs (new)
- migrations/add-archival-fields.sql (new)
- THREE_IMPROVEMENTS_SUMMARY.md (documentation)

All cron jobs updated for hourly scraping + daily cleanup + alerts
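
The "gradual failure handling" item in the commit message can be read as a small state transition on each tender row. The sketch below is one possible interpretation, not the code in `cleanup-with-archival.mjs`: the helper name `applyValidationResult` is hypothetical, and it assumes a tender is archived once `validation_failures` reaches 3 and that the snapshot is stored as a JSON string. Only the column names come from the commit message.

```javascript
// Hypothetical helper; not taken from cleanup-with-archival.mjs.
// Assumption: archive on the 3rd consecutive failed validation.
const MAX_VALIDATION_FAILURES = 3;

function applyValidationResult(tender, urlIsLive, now = new Date()) {
  if (urlIsLive) {
    // A successful check resets the failure counter.
    return { ...tender, validation_failures: 0, last_validated: now };
  }

  const failures = (tender.validation_failures ?? 0) + 1;
  if (failures < MAX_VALIDATION_FAILURES) {
    // Not enough failures yet: record the failure and keep the tender active.
    return { ...tender, validation_failures: failures, last_validated: now };
  }

  // Threshold reached: snapshot the row as it was, then mark it archived.
  const { archived_snapshot, ...current } = tender;
  return {
    ...tender,
    validation_failures: failures,
    last_validated: now,
    archived: true,
    archived_at: now,
    archived_snapshot: JSON.stringify(current),
  };
}
```

Keeping the transition as a pure function makes the threshold easy to test separately from the database update that would persist the result.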
29
fix-urls.mjs
Normal file
@@ -0,0 +1,29 @@
import pg from 'pg';

const pool = new pg.Pool({
  connectionString: 'postgresql://tenderpilot:jqrmilIBr6imtT0fKS01@localhost:5432/tenderpilot'
});

console.log('Fixing find_tender URLs (removing query params)...');

const result = await pool.query(
  "UPDATE tenders SET notice_url = split_part(notice_url, '?', 1) WHERE source = 'find_tender' AND notice_url LIKE '%?%' RETURNING id, notice_url"
);

console.log(`✓ Fixed ${result.rowCount} find_tender URLs`);
if (result.rows.length > 0) {
  console.log('Sample fixed URLs:');
  result.rows.slice(0, 3).forEach(row => {
    console.log(` - ${row.notice_url}`);
  });
}

console.log('\nDeleting TED demo data...');
const deleteResult = await pool.query(
  "DELETE FROM tenders WHERE source = 'ted_eu' RETURNING id"
);

console.log(`✓ Deleted ${deleteResult.rowCount} TED demo records`);

console.log('\nDatabase cleanup complete!');
await pool.end();
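
The `UPDATE` in this script uses PostgreSQL's `split_part(notice_url, '?', 1)`, which keeps everything before the first `?`, and the `LIKE '%?%'` filter restricts it to rows that contain one. To check expected results without touching the database, the same rule can be mirrored in JavaScript. This is a sketch only: `stripQuery` is a hypothetical helper and the URLs are made-up examples, not real tender notices.

```javascript
// Hypothetical helper mirroring split_part(notice_url, '?', 1):
// return the part of the URL before the first '?', or the URL
// unchanged when it contains no '?'.
function stripQuery(noticeUrl) {
  const i = noticeUrl.indexOf('?');
  return i === -1 ? noticeUrl : noticeUrl.slice(0, i);
}

// Made-up example URLs for illustration.
const samples = [
  'https://example.com/Notice/000001-2024?origin=SearchResults&p=1',
  'https://example.com/Notice/000002-2024',
];

for (const url of samples) {
  console.log(`${url} -> ${stripQuery(url)}`);
}
```

Like `split_part`, this also discards any `#fragment` that follows the query string, since everything after the first `?` is dropped.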