What is robots.txt?
Robots are often used by search engines to categorize websites. When a site owner wishes to give instructions to web robots, they place a text file called robots.txt in the root of the web site. This text file contains the instructions in a specific format.
The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned.
Robots that choose to follow the instructions try to fetch this file and read the instructions before fetching any other file from the website.
A robots.txt file on a website functions as a request that specified robots ignore specified files or directories when crawling the site. If this file doesn't exist, web robots assume that the website owner does not wish to place any limitations on crawling the site.
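To make the format concrete, here is a minimal robots.txt sketch (the /private/ path is just a placeholder for this example, not something your site needs):
# Rules below apply to all crawlers
User-agent: *
# Request that nothing under /private/ be crawled
Disallow: /private/
# Everything else may be crawled
Allow: /
Each User-agent line starts a group of rules, and the Disallow/Allow lines list URL path prefixes that matching robots should skip or may fetch.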
Before we start, you have to know about custom robots header tags. In Blogger, you are going to deal with the following custom robots header tags.
1. all – If you set this tag, crawlers are not bound by any constraints. They can freely crawl, index and expose your content.
2. noindex – Not all blogs are for public notice. Even if you don't share the URL of your personal blog with anybody, chances are people will come to it from search results. In such a scenario, you can use the noindex tag, as it prevents search engines from indexing the pages.
3. nofollow – Nofollow and dofollow tags are for outbound links. Dofollow is the default robot tag for all your outbound links, which means search engines can follow the pages you linked to. If you don't want search bots to look through your links, adding a nofollow tag will help.
4. none – none combines the features of both the noindex and nofollow tags. The crawlers will neither index your pages nor skim through the links.
5. noarchive – You might have noticed a Cached label with most of the website links on SERPs. It shows that Google has stored a copy of your site on their servers to display in case it goes down. The noarchive tag turns off the cached version in search pages.
6. nosnippet – The text snippets in search results help people find what's on the webpage. If you want to keep the content exclusive, you can turn this header tag on.
7. noodp – The Open Directory Project, or DMOZ, is a human-edited directory of websites. Google sometimes uses information from there. You can turn that off with this tag if you want to.
8. notranslate – Do you want to disable translation on your site? Then use notranslate for that exact purpose.
9. noimageindex – If you allow Google to index your images, people may steal them and use them on their own websites. To prevent that, you can keep the images deindexed using the noimageindex tag. Also read this Image SEO guide.
10. unavailable_after – In Blogger, you will get a date field right next to this tag. The webpage will be deindexed after that time.
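For context, these settings are called header tags because Blogger delivers them to crawlers along with each page. A rough sketch of the HTTP response a crawler might see for a page set to noindex and nofollow, assuming the tags go out as X-Robots-Tag headers (the HTTP equivalent of a robots meta tag):
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
X-Robots-Tag: noindex, nofollow
A crawler only sees these tags when it actually fetches the page. That is exactly why "Indexed, though blocked by robots.txt" happens: a URL blocked in robots.txt is never fetched, so Google can still index it from links elsewhere without ever reading its noindex tag.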
Now I think you have learned enough about custom robots header tags to fix our main issue. Let's fix it.
How to fix Indexed, though blocked by robots.txt?
Step 1: Log in to blogger.com and choose the blog for which you want to fix the "Indexed, though blocked by robots.txt" issue.
Step 2: Then go to Settings >> Search preferences. There you will see two settings:
1. Custom robots.txt
2. Custom robots header tags
Step 3: To set "Custom robots.txt", click "Edit", select "Yes" for "Enable custom robots.txt content?", and generate the sitemap code for your blog from the ctrlq.org site.
Step 4: You will get "XML Sitemap code" like this for your site. Despite the name, this is robots.txt content: the Sitemap line points search engines to your blog's Atom feed, with max-results=500 covering up to 500 posts per feed page.
Example:
# Blogger Sitemap generated on 2020.03.01
User-agent: *
Disallow: /search/
Allow: /
Sitemap: https://www.techvigyaan.com/atom.xml?redirect=false&start-index=1&max-results=500
Step 5: Now just add these two lines below "Disallow: /search/" in your sitemap code.
Disallow: /category/
Disallow: /tag/
Your code will now look like this:
# Blogger Sitemap generated on 2020.03.01
User-agent: *
Disallow: /search/
Disallow: /category/
Disallow: /tag/
Allow: /
Sitemap: https://www.techvigyaan.com/atom.xml?redirect=false&start-index=1&max-results=500
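To see what these rules do, here are some hypothetical URLs on this blog and how a crawler that honors robots.txt would treat them (the exact paths are made up for illustration):
https://www.techvigyaan.com/search/label/seo (skipped: matches Disallow: /search/)
https://www.techvigyaan.com/category/blogging (skipped: matches Disallow: /category/)
https://www.techvigyaan.com/tag/blogger (skipped: matches Disallow: /tag/)
https://www.techvigyaan.com/2020/03/some-post.html (crawled: covered by Allow: /)
Your real post and page URLs stay fully crawlable; only the search, category and tag listings are blocked.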
Step 6: Now paste this code into the given text box and click the Save button.
Step 7: Now it's time to set "Custom robots header tags". Click the Edit option and select Yes to enable header tags.
Step 8: Now you will see a number of header tags. Set them by following the same settings I chose (refer to the image given below) and hit Save changes.

Final Step 9: Now go to Google Search Console >> Coverage and start the validation process to fix it.
Final Words:
That's it, guys; that's all you have to do to fix this issue.
LET ME KNOW IF IT WORKS FOR YOU.
If you still have any problem, feel free to write to us.
For now this is #BharatSharma signing off. Thank You ...Enjoy.
HEY GUYS ONE SMALL REQUEST TO ALL OF YOU, JUST NEED YOUR SUPPORT TO GROW MY YOUTUBE CHANNEL #TECHVIGYAAN, IF YOU LIKE MY WORK, PLEASE SUBSCRIBE TO OUR CHANNEL, THANK YOU...!!
You May Also Read: -
- How To Fix data-vocabulary.org Schema Deprecated Error
- 1000+ Backlinks in One Click For Your Website
- How to Select Target Country on New Webmaster Tools
- How to Create a Blog on Blogger Complete Guide for Beginners
- 10 Best Free WordPress Hosting Site for Startups in 2019
- Top 60+ DoFollow Social Bookmarking Site for SEO in 2019
- 25 Best Ping Sites to Rank up your Websites 2019
- 111 Adsense High CPC Keywords [May 2019]
- How to Create Google Analytic Tool Account Full Detail