WordPress Robots.txt file for SEO

Having your WordPress robots.txt file properly set up for SEO can help your rankings in the search results and bring you more traffic.

WordPress, straight out of the box, is an excellent platform that is feature rich, easy to use and highly customizable. But guess what: it comes with no robots.txt file, and its navigation structure is horrible in the eyes of search engines, because many of the features that make navigation so easy and intuitive for users have the exact opposite effect on search engine spiders/bots. Being able to reach the same post or page through multiple links of various names (category, feed, trackback and comment URLs, for example) can produce a pile of the ever-dreaded duplicate content, which search engines such as Google may penalize or filter out of their results. Unless you want your pages in the supplemental index, read on to learn how a properly configured WordPress robots.txt file can be one more SEO “must-do” to keep your site SERP’ing strong.

By properly utilizing a WordPress robots.txt file, we can tell search engine bots where to look and where to not waste their time.

WordPress robots.txt

Making a WordPress robots.txt file is quite easy: you give the bots a robots.txt file that tells them not to look in places where they won’t find any valuable content, such as the wp-admin folder and parts of wp-content. The file itself should go in your site’s root directory and, guess what, it should be named robots.txt (I know, too obvious).
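
To make the “root directory” point concrete, here is a minimal Python sketch of where a crawler looks for the file for any given page. The function name robots_txt_url and the example address are my own illustrations, not part of WordPress; the only assumption is the usual convention that bots request /robots.txt at the root of each host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address a crawler would request for page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host. The path, query and fragment are
    # dropped because the file is looked up at the root of the host.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_txt_url("http://www.example.com/2009/04/hello-world/?p=1"))
```

Because only the scheme and host are kept, a blog running on a subdomain is checked against the robots.txt at the root of that subdomain, not the one on the main domain.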

Here is a sample robots.txt file that you can use for your WordPress-powered site because, guess what, WordPress doesn’t come with one by default. Yep, that’s right: unless you have added a robots.txt file yourself, there will not be one there at all. So, here you go (this one will work just fine, but you can obviously customize it to suit your specific needs by adding more directories to disallow access to):

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

# Internet Archiver Wayback Machine
User-agent: ia_archiver
Disallow: /

# digg mirror
User-agent: duggmirror
Disallow: /
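
If you want to sanity-check rules like these before uploading them, one option is Python’s standard-library urllib.robotparser. The sketch below is only an approximation: to my knowledge that parser matches plain path prefixes and does not interpret the * wildcards Google understands, so it uses a simplified subset of the sample without the wildcard lines. The site address and the helper names build_parser and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Simplified subset of the sample file. Wildcard lines are left out on
# purpose, since this parser is assumed to do plain prefix matching only.
SAMPLE_RULES = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Allow: /wp-content/uploads

User-agent: ia_archiver
Disallow: /
"""

SITE = "http://www.example.com"  # hypothetical site address


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Report whether user_agent may fetch path on SITE under the rules."""
    return parser.can_fetch(user_agent, SITE + path)


parser = build_parser(SAMPLE_RULES)
checks = [
    ("Googlebot", "/wp-admin/options.php"),
    ("Googlebot", "/wp-content/uploads/2009/04/photo.jpg"),
    ("Googlebot", "/2009/04/hello-world/"),
    ("ia_archiver", "/2009/04/hello-world/"),
]
for agent, path in checks:
    verdict = "allowed" if is_allowed(parser, agent, path) else "blocked"
    print(f"{agent:12} {path:40} {verdict}")
```

A check like this only tells you how one particular parser reads the file; each search engine applies its own matching rules, so treat the result as a rough guide.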

Having a WordPress robots.txt file is by no means the only step you should take to optimize your WordPress site, but it is a good step in the right direction. I personally believe that combining the proper use of WordPress SEO meta tags with a robots.txt file is the best overall approach. The robots.txt file should be used primarily as a mechanism to restrict which folders are crawled, but it can also be used to show bots which areas they should specifically look in; I will write more on that topic another time. I hope you find this article useful. Please feel free to comment and share any best practices you have in this area, or any questions you may have about WordPress robots.txt files.


24 Responses to “WordPress Robots.txt file for SEO”

  1. Comment by jeffsawyer posted on April 2nd, 2009

    Interesting tip. I’ll have to look into doing this for my sites.

  2. Comment by ragy B. Garagnon posted on June 2nd, 2009

    Informative post. I use a plugin, “Platinum SEO”, on my site. I can set no-follow options on all modules. It works very well.

  3. Comment by ark posted on August 26th, 2009

    This is very helpful. Thanks.

  4. Comment by ews_blog posted on September 30th, 2009

    Excellent tip! One question: if I have WordPress on a subdomain (e.g. http://subdomain.mydomain.com), how should I configure the robots.txt? Thanks

    • Comment by Jesse posted on September 30th, 2009

      Hi, and thanks for stopping by. If you have WordPress on a subdomain you do it exactly the same way. As long as the robots.txt file is in the root folder of the site served at that subdomain, it makes no difference what address the bots use to get there; the end result is the same.

  5. Comment by PRO-Webs, Inc posted on November 8th, 2009

    The structure is solid, but a couple of things I noticed. I am pretty sure that you cannot use an allow command in your robots.txt… only disallow, lack of which is allow.

    Also, wildcards are not supported by all engines, so it becomes necessary to use the user agent to specify rules with wildcards for just those that do… Otherwise one can make quite a mess with MSN =-) as they, in all of their wisdom, do not support wildcards.

  6. Comment by Jesse posted on November 8th, 2009

    Correct, not all search engines support wildcards, but the big G does :-) and I have no problem being properly indexed by Bing or Yahoo. While this example is not “technically” perfect, it does work for my uses and accomplishes what I need done.

    If the big 3 would get together and agree on a set of standards that were universal across the board I would jump on board right away. Until then we’re all stuck just doing what works instead of what is 100% correct.

  7. Comment by dean posted on November 9th, 2009

    Maybe I need to modify my robots.txt again so it will be more SEO-friendly. Thank you very much.

  8. Comment by Jesse posted on November 9th, 2009

    Thanks for stopping by, glad you found this post useful.

  11. Comment by Joe Jones posted on February 15th, 2010

    Admin Daily,
    I have one website that's a WordPress site of only 2 web pages. I used all the disallows you have listed, but Google's keyword list for the site had “disallow”, “user”, and “agent” as the most frequently used keywords for the site. How do I stop Google from using words in the robots.txt file as a source of keywords?

    Thanks

  12. Comment by Jesse posted on February 15th, 2010

    That is very odd, let me know what your domain is or use the contact form
    here and let me know and I will take a peek at it and see why this could be
    happening.

  13. Comment by Bogan Marketing posted on March 3rd, 2010

    Hey Jesse,

    Nice post, I went into this today too, you can check out my take on using robots.txt for silos in WordPress here

    thanks

  14. Comment by Jesse posted on March 3rd, 2010

    Thanks for stopping by, I'll check out your article as well.

  15. Comment by bhavin posted on September 22nd, 2010

    Why the heck are you adding your site link?
    I need a plain robots.txt.

    • Comment by Jesse posted on September 22nd, 2010

      Can you elaborate? I don’t understand what you are trying to say.

  16. Comment by Phil posted on November 10th, 2010

    Jesse this is great info. Thank you for sharing with the community. I’m putting up a site that will not have any use for google ads or media partners. How do I disallow them?

    Disallow: / or Disallow: * or Disallow: /*

    Also, can you elaborate on the links you have included in the file? Never seen them in a robots text file.

    Thanks again

    • Comment by Jesse posted on November 10th, 2010

      Thanks Phil, I appreciate the feedback.

      To disallow something you just change it to Disallow: / and that will take care of everything from your site root folder down.

      The links you see aren’t supposed to be there; it seems that a plugin I use is messing with the code. Thanks for the heads up, I’m going to sort that out right now :)

  17. Comment by Mobile Themes World posted on April 20th, 2011

    Thanks for this valuable information. By the way, do I have to add a sitemap for a subdomain in robots.txt and Webmaster Tools?

    • Comment by Jesse posted on April 23rd, 2011

      If you are using a manually created robots.txt file (instead of the virtual robots.txt file many plugins make), you will want to add a sitemap reference to the file for every domain in use, because each subdomain is considered a separate domain by search engines.

  18. Comment by Karl posted on October 12th, 2011

    Hi Jesse, thanks for the article. One question: what are the reasons you used the links in the robots.txt, especially for the feed? Is that to avoid duplicate content?

    • Comment by Jesse posted on October 12th, 2011

      Sorry lol… those were not supposed to be in there, a plugin that I use to automatically add in affiliate links based on keywords on the site had added those… It was supposed to be excluded from that. I have corrected this page and you shouldn’t see those links in there any more :)

