What is the Robots.txt Tester and Validator Tool?
The Robots.txt tester tool is designed to check that your robots.txt file is accurate and free of errors. Robots.txt is a file that is part of your website and which provides crawling rules for search engine robots, to ensure that your website is crawled (and indexed) correctly and the most important data on your website is indexed first.
This tool is simple to use and gives you a report in seconds – just type in your full website URL, followed by /robots.txt (e.g. yourwebsite.com/robots.txt), and click the ‘check’ button. Our robots.txt checker will find any mistakes (such as typos, syntax errors and ‘logic’ errors) and give you tips for optimizing your robots.txt file.
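If you’re not sure what a well-formed file looks like, here is a minimal example of the kind of content the checker expects to find at that URL (the path shown is just a placeholder):

```
User-agent: *
Disallow: /admin/
```

This tells every crawler (User-agent: *) not to crawl anything under /admin/, while leaving the rest of the site open.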
Why Do I Need to Check My Robots.txt File?
Problems with your robots.txt file – or not having a robots.txt file at all – can have a negative impact on your SEO, and your website may not rank as well in search engine results pages (SERPs). This is because of the risk of non-relevant content being crawled before, or instead of, the important content.
Checking the file before your website is crawled means you can avoid issues such as all of your website content being crawled and indexed, rather than just the pages you want indexed. For example, if you have a page that you only want visitors to access after filling in a subscription form, or a members-only login page, but don’t exclude it in your robots.txt file, it could end up being indexed.
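For instance, to keep a hypothetical members-only login page and subscriber content out of crawlers’ reach, your file could include rules along these lines (the paths are illustrative):

```
User-agent: *
Disallow: /members/login/
Disallow: /subscriber-content/
```

Note that Disallow only stops compliant crawlers from fetching those pages; for content that must never appear in search results, a noindex tag on the page itself is a more reliable safeguard.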
What Do the Errors and Warnings Mean?
There’s a range of errors that can affect your robots.txt file, as well as some ‘best practice’ warnings that you might see when you check your file. Errors are issues that can affect your SEO and should be fixed; warnings are less critical and act as advice on how to improve your robots.txt file.
Errors you may see include:
Invalid URL – You’ll see this error if your robots.txt file is missing completely.
Potential wildcard error – Although technically a warning rather than an error, if you see this message it’s usually because your robots.txt file contains a wildcard (*) in the Disallow field (e.g. Disallow: /*.rss). This is a best practice issue – Google allows wildcards in the Disallow field but it’s not a recommended practice.
Generic and specific user-agents in the same block of code – This is a syntax error in your robots.txt file and should be corrected to avoid problems with crawling your website.
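Both of these issues can be seen side by side in a short sketch (the paths are hypothetical). The fix for mixing generic and specific user-agents is to give each its own block:

```
# Problematic: generic and specific user-agents mixed in one block,
# plus a wildcard in the Disallow field
User-agent: *
User-agent: Googlebot
Disallow: /*.rss

# Corrected: separate blocks, with a plain path prefix instead of a wildcard
User-agent: Googlebot
Disallow: /feeds/

User-agent: *
Disallow: /feeds/
```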
Warnings you may see include:
Allow: / – Using the Allow directive isn’t going to damage your ranking or affect your website, but it’s not standard practice. Major robots including Google and Bing will accept this directive, but not all crawlers do – and generally speaking, it’s best to make your robots.txt file compatible with all crawlers, not just the big ones.
Field name capitalization – While field names are not necessarily case sensitive, some crawlers may require capitalization, so it’s a good idea to capitalize field names such as User-agent and Disallow.
Sitemap support – Many robots.txt files include the location of the website’s sitemap. Although this is not part of the original robots.txt standard, Google and Bing both support the Sitemap field.
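Putting these warnings together, a file that avoids them might look like this (the domain is a placeholder). An empty Disallow value is the standard way to permit all crawling without using Allow: /, and field names such as User-agent and Sitemap are capitalized:

```
User-agent: *
Disallow:

Sitemap: https://yourwebsite.com/sitemap.xml
```

Keeping the Sitemap line is a judgment call: it isn’t part of the original standard, but major crawlers such as Google and Bing will use it.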
How Do I Fix Errors in My Robots.txt File?
Fixing the errors in your robots.txt file depends on the platform that you use. If you use WordPress, it is advisable to use a plugin such as WordPress Robots.txt Optimization or Robots.txt Editor. If you connect your website to Google Search Console, you’re also able to edit your robots.txt file there.
Some website builders, like Wix, don’t allow you to edit your robots.txt file directly, but do allow you to add noindex tags for specific pages.