ukaiautomation/robots.txt
root b6e39fe0c2 Security hardening + new tools deployment
- Hide Apache version (ServerTokens Prod)
- Add Permissions-Policy header
- Remove deprecated X-XSS-Protection
- Consolidate security headers to .htaccess only (remove duplicates from PHP)
- Deploy free tools: robots-analyzer, data-converter
- Deploy tools announcement blog post
- Update sitemap with new tools and blog post
2026-02-05 04:11:15 +00:00
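The header changes listed in the commit message can be sketched as an Apache config fragment. The directive names (`ServerTokens`, `Header` from mod_headers) are standard Apache, but the Permissions-Policy value below is an illustrative assumption, not the site's actual configuration; note that `ServerTokens` is only valid in the server config, not in .htaccess.

```apache
# httpd.conf (server config only -- ServerTokens is not allowed in .htaccess):
ServerTokens Prod

# .htaccess, with mod_headers enabled:
<IfModule mod_headers.c>
    # Illustrative policy; the site's real value may differ
    Header always set Permissions-Policy "geolocation=(), camera=(), microphone=()"
    # Remove the deprecated X-XSS-Protection header if set elsewhere
    Header always unset X-XSS-Protection
</IfModule>
```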


# UK Data Services - robots.txt
# https://ukdataservices.co.uk
User-agent: *
Allow: /
# Block sensitive directories and files
Disallow: /includes/
Disallow: /assets/
Disallow: /admin/
Disallow: /logs/
Disallow: /vendor/
Disallow: /config/
Disallow: /database/
Disallow: /docker/
Disallow: /redis/
# Block configuration and handler files
Disallow: /*-handler.php
Disallow: /*.log$
Disallow: /*.inc$
Disallow: /*.sql$
Disallow: /*.sh$
Disallow: /*.bak$
Disallow: /db-config.php
Disallow: /.email-config.php
Disallow: /.recaptcha-config.php
# Block query string URLs to prevent duplicate content
Disallow: /*?*
# Allow important static assets for rendering
Allow: /assets/css/*.css
Allow: /assets/js/*.js
Allow: /assets/images/*.webp
Allow: /assets/images/*.png
Allow: /assets/images/*.jpg
Allow: /assets/images/*.svg
# Crawl-delay for respectful crawling (note: Googlebot ignores Crawl-delay)
Crawl-delay: 1
# No per-bot groups: a "User-agent: Googlebot" group containing only
# "Allow: /" would replace the global "*" group for that crawler and
# discard every Disallow rule above, so all bots share the "*" group.
# Sitemaps (position-independent; apply to all crawlers)
Sitemap: https://ukdataservices.co.uk/sitemap.xml
Sitemap: https://ukdataservices.co.uk/sitemap-index.xml
Sitemap: https://ukdataservices.co.uk/sitemap-blog.xml
Sitemap: https://ukdataservices.co.uk/sitemap-services.xml
Sitemap: https://ukdataservices.co.uk/sitemap-tools.xml
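The Allow/Disallow interplay in this file (blanket `Disallow: /assets/` plus specific `Allow` rules, and the `/*?*` query-string block) depends on Google's longest-match precedence and wildcard extensions, which not all parsers implement. The helper below is an illustrative sketch of those documented matching rules, not a real library API:

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    # Robots patterns: '*' matches any character sequence,
    # a trailing '$' anchors the end of the URL path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(url_path: str, rules: list[tuple[str, str]]) -> bool:
    # rules: (directive, pattern) pairs from one user-agent group.
    # Longest matching pattern wins; ties go to Allow; no match => allowed.
    best = None  # ((pattern_length, is_allow), directive)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(url_path):
            key = (len(pattern), directive == "Allow")
            if best is None or key > best[0]:
                best = (key, directive)
    return best is None or best[1] == "Allow"

# A subset of the rules from the file above
RULES = [
    ("Disallow", "/assets/"),
    ("Allow", "/assets/css/*.css"),
    ("Disallow", "/*?*"),
]

print(is_allowed("/assets/css/main.css", RULES))   # True: longer Allow wins
print(is_allowed("/assets/fonts/x.woff2", RULES))  # False: only Disallow matches
print(is_allowed("/services?page=2", RULES))       # False: query-string rule
```

This is why `Allow: /assets/css/*.css` (17 characters) overrides `Disallow: /assets/` (8 characters) for stylesheets even though both patterns match.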