Robots.txt

What Is Robots.txt?

Robots.txt is a file that tells search engine spiders not to crawl certain pages or sections of a website. Most major search engines (including Google, Bing and Yahoo) recognize and honor robots.txt requests.

Why Is Robots.txt Important?

Most websites don’t need a robots.txt file.

That’s because Google can usually find and index all of the important pages on your site.

And Google will automatically NOT index pages that aren’t important or that are duplicate versions of other pages.

That said, there are 3 main reasons that you’d want to use a robots.txt file (I’ll share a sample file covering all three right after this list).

Block Non-Public Pages: Sometimes you have pages on your site that you don’t want indexed. For example, you might have a staging version of a page. Or a login page. These pages need to exist. But you don’t want random people landing on them. This is a case where you’d use robots.txt to block these pages from search engine crawlers and bots.

Maximize Crawl Budget: If you’re having a tough time getting all of your pages indexed, you might have a crawl budget problem. By blocking unimportant pages with robots.txt, Googlebot can spend more of your crawl budget on the pages that actually matter.

Prevent Indexing of Resources: Using meta directives can work just as well as Robots.txt for preventing pages from getting indexed. However, meta directives don’t work well for multimedia resources, like PDFs and images. That’s where robots.txt comes into play.
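For example, here’s what a simple robots.txt file covering all three cases might look like. The paths (/staging/, /login/, /pdfs/) are made-up placeholders; yours will depend on your site’s structure:

User-agent: *
Disallow: /staging/
Disallow: /login/
Disallow: /pdfs/

Each Disallow line blocks crawlers from any URL whose path starts with that value.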

The bottom line? Robots.txt tells search engine spiders not to crawl specific pages on your website.

You can check how many pages you have indexed in Google Search Console.

[Screenshot: Google Search Console – indexed pages report]

If the number matches the number of pages that you want indexed, you don’t need to bother with a robots.txt file.

But if that number is higher than you expected (and you notice indexed URLs that shouldn’t be indexed), then it’s time to create a robots.txt file for your website.

Best Practices

Create a Robots.txt File

Your first step is to actually create your robots.txt file.

Since robots.txt is a plain text file, you can create one in any text editor, including Windows Notepad.

And no matter how you ultimately make your robots.txt file, the format is exactly the same:

User-agent: X
Disallow: Y

User-agent is the specific bot that you’re talking to.

And everything that comes after “Disallow” is the page or section that you want to block.

Here’s an example:

User-agent: googlebot
Disallow: /images

This rule would tell Googlebot not to crawl the /images folder of your website. (Disallow rules match by prefix, so this also blocks everything inside that folder, like /images/logo.png.)

You can also use an asterisk (*) to speak to any and all bots that stop by your website.

Here’s an example:

User-agent: *
Disallow: /images

The “*” tells any and all spiders to NOT crawl your images folder.

This is just one of many ways to use a robots.txt file. This helpful guide from Google has more info on the different rules you can use to block or allow bots from crawling different pages of your site.

[Screenshot: “Useful rules” table from Google’s robots.txt guide]
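To give you a flavor of those rules, here are a few of the most common patterns (the directory and bot names are placeholders):

Block one directory for every bot:

User-agent: *
Disallow: /private/

Block one specific bot from the entire site:

User-agent: ExampleBot
Disallow: /

Allow everything (an empty Disallow blocks nothing):

User-agent: *
Disallow:

Point crawlers at your sitemap:

Sitemap: https://example.com/sitemap.xml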

Make Your Robots.txt File Easy to Find

Once you have your robots.txt file, it’s time to make it live.

Crawlers only look for your robots.txt file in one place: the root directory of your domain.

So make sure to place your file at:

https://example.com/robots.txt

(Note that the robots.txt filename is case sensitive. So make sure the file is named “robots.txt”, all lowercase.)
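Once it’s live, a quick sanity check is to confirm the file actually loads from that URL. Here’s a minimal sketch using Python’s standard library (swap example.com for your own domain):

import urllib.request

# Fetch robots.txt from the site root and confirm it responds
with urllib.request.urlopen("https://example.com/robots.txt") as response:
    print(response.status)                 # expect 200 if the file is live
    print(response.read().decode()[:300])  # print the first 300 characters

If you get a 404 here, crawlers can’t see your rules either.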

Check for Errors and Mistakes

It’s REALLY important that your robots.txt file is set up correctly. One mistake and your entire site could get deindexed.

Fortunately, you don’t need to hope that your code is set up right. Google has a nifty Robots Testing Tool that you can use:

[Screenshot: robots.txt testing tool results]

It shows you your robots.txt file… and any errors and warnings that it finds:

[Screenshot: robots.txt errors and warnings]

As you can see, we block spiders from crawling our WP admin page.

We also use robots.txt to block crawling of WordPress auto-generated tag pages (to limit duplicate content).
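If you want a second opinion outside of Google’s tool, Python’s standard library ships with a robots.txt parser you can point at any live file. A minimal sketch (the domain and paths are placeholders):

import urllib.robotparser

# Load and parse the live robots.txt file
parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()

# Ask whether a given user-agent is allowed to crawl specific URLs
print(parser.can_fetch("Googlebot", "https://example.com/wp-admin/"))   # False if blocked
print(parser.can_fetch("Googlebot", "https://example.com/blog/post/"))  # True if allowed

One caveat: urllib.robotparser follows the original robots.txt spec, so it can disagree with Google’s parser on edge cases like wildcards. Treat it as a quick check, not a replacement for Google’s tool.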

Robots.txt vs. Meta Directives

Why would you use robots.txt when you can block pages at the page level with the “noindex” meta tag?

Like I mentioned earlier, the noindex tag is tricky to implement on multimedia resources, like videos and PDFs.

Also, if you have thousands of pages that you want to block, it’s sometimes easier to block the entire section of that site with robots.txt instead of manually adding a noindex tag to every single page.
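To make the comparison concrete: the page-by-page approach means adding <meta name="robots" content="noindex"> to the head of every page you want excluded. With robots.txt, a single rule can cover an entire section (the /archive/ path is just a placeholder):

User-agent: *
Disallow: /archive/

Just remember the two do different things: robots.txt blocks crawling, while noindex lets the page be crawled but keeps it out of the index.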

There are also edge cases where you don’t want to waste crawl budget: Google has to crawl a page before it can even see the noindex tag on it.

That said:

Outside of those three edge cases, I recommend using meta directives instead of robots.txt. They’re easier to implement. And there’s less chance of a disaster happening (like blocking your entire site).

Learn More

Learn about robots.txt files: A helpful guide from Google on how its crawlers use and interpret robots.txt.

What is a Robots.txt File? (An Overview for SEO + Key Insight): A no-fluff video on different use cases for robots.txt.
