WordPress robots.txt File for SEO
WordPress, straight out of the box, is an excellent platform that is feature-rich, easy to use and highly customizable. But guess what: its navigation structure is absolutely horrible in the eyes of search engines, because many of the features that make user navigation so easy and intuitive have the exact opposite effect on search engine spiders/bots. The same post or page can be reached through multiple links under various names (category, tag, date archive and feed URLs, for example), which can produce a pile of the ever-dreaded duplicate content that search engines such as Google can penalize your site for. Unless you want your pages in the supplemental index, read on to learn one more SEO “must-do” to keep your site ranking strong in the SERPs.
By properly using a robots.txt file, we can tell search engine bots where to look and where not to waste their time. The process is quite easy: you simply instruct the bots not to look in places where they won’t find any valuable content, such as the wp-admin folder and most of wp-content. The file itself should go in your site’s root directory and, guess what, it should be named robots.txt (I know, too obvious).
Here is a sample robots.txt file that you can use for your WordPress-powered site, because guess what… WordPress doesn’t place a robots.txt file in your root directory by default. Yep, that’s right: unless you have added one, there will not be one there at all. So here you go (this one will work just fine, but you can obviously customize it to suit your specific needs by adding more directories to disallow access to):
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads
# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*
# Google AdSense
User-agent: Mediapartners-Google
Disallow:
Allow: /*
# Internet Archiver Wayback Machine
User-agent: ia_archiver
Disallow: /
# digg mirror
User-agent: duggmirror
Disallow: /
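Before uploading a file like this, it can help to sanity-check which URLs it actually blocks. Below is a minimal sketch using Python's standard-library `urllib.robotparser`. A few caveats: that module does plain prefix matching, does not understand `*` wildcards inside paths, and applies the first matching rule in file order, so the sketch only uses a wildcard-free subset of the rules above and will not behave exactly like Googlebot. The helper names (`build_parser`, `check`) and the example URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A wildcard-free subset of the sample file above.
# urllib.robotparser treats every path as a literal prefix.
RULES = """\
User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Allow: /wp-content/uploads

User-agent: ia_archiver
Disallow: /
"""


def build_parser(rules_text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def check(parser, agent, paths):
    """Map each path to True (crawlable) or False (blocked) for one bot."""
    return {path: parser.can_fetch(agent, path) for path in paths}


parser = build_parser(RULES)

# Hypothetical URLs on a WordPress site.
paths = [
    "/wp-admin/options.php",
    "/wp-content/uploads/photo.jpg",
    "/2009/01/hello-world/",
]

results = check(parser, "Googlebot", paths)
for path, allowed in results.items():
    print(("allow " if allowed else "block ") + path)
```

With these rules, a generic bot is blocked from the wp-admin URL but allowed to fetch the upload and the post, while `ia_archiver` is blocked from everything. Rules that rely on wildcards, such as `/*?`, need to be checked with a tool that supports them, for example the robots.txt report in Google Search Console.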
This is by no means the only step you should take to optimize your WordPress site, but it is one good step in the right direction. I personally believe that combining proper use of meta tags with a robots.txt file is the best overall approach. The robots.txt file should be used primarily as a mechanism to restrict which folders bots crawl, but it can also be used to show bots which areas they should specifically look in; I will write more on that topic another time. I hope you find this article useful. Please feel free to comment and share any best practices you have in this area.