User-agent: *
Allow: /web/image/
Allow: /web/content/
Allow: /web/assets/
Allow: /llms.txt
Allow: /llms-txt
Disallow: /odoo/
Disallow: /web/
Disallow: /web/dataset/
Disallow: /mail/
Disallow: /longpolling/
Disallow: /bus/
Disallow: /my/
Disallow: /shop/checkout/
Disallow: /shop/address/
Disallow: /shop/payment/
Disallow: /shop/confirmation/
Disallow: /shop/cart/
Disallow: /client-portal/
Disallow: /private/
Disallow: /appointment/
Disallow: /calendar/
Disallow: /contactus-thank-you/
Disallow: /thank-you/
Disallow: /your-task-has-been-submitted/
# FIX: Was /blog//feed (double slash, never matched). Now correct:
Disallow: /blog/*/feed
Disallow: /blog/feed
# FIX v3.0: Block paginated blog post URLs to prevent the noindex trap.
# /blog/slug/page/2 was triggering Mainlayout pager noindex logic.
# Google was selecting /page/2 as canonical instead of the clean URL.
Disallow: /blog/*/page/
Disallow: /?search=
Disallow: /?filter=
Disallow: /?sort=
Disallow: /?order=
# NOTE: /?tag= removed because blog tag pages aid topical discovery.
# Use a noindex,follow meta robots tag on tag templates instead.
Disallow: /?view_type=
Disallow: /?enable_editor=
Disallow: /?debug=
# FIX: /?page= REMOVED. It was blocking all blog pagination beyond page 1.
# Only block page= when combined with debug/editor params:
Disallow: /?page=*&debug=
Disallow: /?page=*&enable_editor=
# Site is English-only; the lang param creates duplicates (intentional block).
Disallow: /?lang=
Disallow: /?currency=
Disallow: /?*session_id=

# =============================================================
# 2) Block unwanted commercial/SEO tool crawlers & scrapers
# FIX v2.0: Bytespider moved to AI section (ByteDance, TikTok)
# FIX v2.0: AI2Bot moved to AI section (Allen Institute, academic)
# =============================================================
User-agent: AhrefsBot
User-agent: MJ12bot
User-agent: SemrushBot
User-agent: DotBot
User-agent: DataForSeoBot
User-agent: PetalBot
User-agent: img2dataset
User-agent: QuillBot-com
User-agent: MyCentralAIScraperBot
User-agent: SBIntuitionsBot
User-agent: DigitalOceanGenAI-Crawler
User-agent: Scrapy
User-agent: PiplBot
Disallow: /

# =============================================================
# 3) AI / LLM crawlers: optimised for max citation visibility
# FIX v2.0: Phantom agents removed
# FIX v2.0: Bytespider + AI2Bot moved here from blocked section
# FIX v2.1: ChatGPT-User/2.0 removed; a version suffix is not valid
#           in the robots.txt spec, and ChatGPT-User covers all versions
# FIX v3.0: /blog/*/page/ added for consistency with other sections
# =============================================================
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: Claude-User
User-agent: Google-Extended
User-agent: PerplexityBot
User-agent: Perplexity-User
User-agent: GrokBot
User-agent: DuckAssistBot
User-agent: MistralAI-User
User-agent: YouBot
User-agent: cohere-ai
User-agent: CCBot
User-agent: Applebot
User-agent: Applebot-Extended
User-agent: Amazonbot
User-agent: FacebookBot
User-agent: Meta-ExternalAgent
User-agent: Diffbot
User-agent: Bytespider
User-agent: ByteDance-User
User-agent: AI2Bot
User-agent: AI2Bot-Dolma
Allow: /
Allow: /llms.txt
Allow: /llms-txt
Allow: /web/image/
Allow: /web/content/
Allow: /web/assets/
Disallow: /odoo/
Disallow: /web/
Disallow: /web/dataset/
Disallow: /my/
Disallow: /appointment/
Disallow: /shop/checkout/
Disallow: /shop/address/
Disallow: /shop/payment/
Disallow: /shop/confirmation/
Disallow: /shop/cart/
Disallow: /client-portal/
Disallow: /private/
Disallow: /mail/
Disallow: /longpolling/
Disallow: /bus/
Disallow: /calendar/
Disallow: /contactus-thank-you/
Disallow: /thank-you/
Disallow: /your-task-has-been-submitted/
Disallow: /blog/*/feed
Disallow: /blog/feed
# FIX v3.0: Block paginated blog post URLs
Disallow: /blog/*/page/
Disallow: /?search=
Disallow: /?filter=
Disallow: /?sort=
Disallow: /?order=
Disallow: /?view_type=
Disallow: /?enable_editor=
Disallow: /?debug=
Disallow: /?page=*&debug=
Disallow: /?page=*&enable_editor=
Disallow: /?lang=
Disallow: /?currency=
Disallow: /?*session_id=

# =============================================================
# 4) Googlebot
# =============================================================
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-News
Allow: /
Allow: /web/image/
Allow: /web/content/
Allow: /web/assets/
Allow: /social_instagram/
Allow: /llms.txt
Allow: /llms-txt
Disallow: /odoo/
Disallow: /web/
Disallow: /web/dataset/
Disallow: /mail/
Disallow: /longpolling/
Disallow: /bus/
Disallow: /my/
Disallow: /appointment/
Disallow: /shop/checkout/
Disallow: /shop/address/
Disallow: /shop/payment/
Disallow: /shop/confirmation/
Disallow: /shop/cart/
Disallow: /client-portal/
Disallow: /private/
Disallow: /calendar/
Disallow: /contactus-thank-you/
Disallow: /thank-you/
Disallow: /your-task-has-been-submitted/
Disallow: /blog/*/feed
Disallow: /blog/feed
# FIX v3.0: Block paginated blog post URLs (fixes noindex/canonical trap)
Disallow: /blog/*/page/
Disallow: /?search=
Disallow: /?filter=
Disallow: /?sort=
Disallow: /?order=
Disallow: /?view_type=
Disallow: /?enable_editor=
Disallow: /?debug=
Disallow: /?page=*&debug=
Disallow: /?page=*&enable_editor=
Disallow: /?lang=
Disallow: /?currency=
Disallow: /?*session_id=

# =============================================================
# 5) Bingbot
# =============================================================
User-agent: Bingbot
User-agent: msnbot
User-agent: BingPreview
Allow: /
Allow: /web/image/
Allow: /web/content/
Allow: /web/assets/
Allow: /llms.txt
Allow: /llms-txt
Disallow: /odoo/
Disallow: /web/
Disallow: /web/dataset/
Disallow: /mail/
Disallow: /longpolling/
Disallow: /bus/
Disallow: /my/
Disallow: /appointment/
Disallow: /shop/checkout/
Disallow: /shop/address/
Disallow: /shop/payment/
Disallow: /shop/confirmation/
Disallow: /shop/cart/
Disallow: /client-portal/
Disallow: /private/
Disallow: /calendar/
Disallow: /contactus-thank-you/
Disallow: /thank-you/
Disallow: /your-task-has-been-submitted/
Disallow: /blog/*/feed
Disallow: /blog/feed
# FIX v3.0: Block paginated blog post URLs
Disallow: /blog/*/page/
Disallow: /?search=
Disallow: /?filter=
Disallow: /?sort=
Disallow: /?order=
Disallow: /?view_type=
Disallow: /?enable_editor=
Disallow: /?debug=
Disallow: /?page=*&debug=
Disallow: /?page=*&enable_editor=
Disallow: /?lang=
Disallow: /?currency=
Disallow: /?*session_id=

# =============================================================
# 6) Sitemap
# =============================================================
Sitemap: https://www.rightpathway.com.au/sitemap.xml