How to prevent a web page from being indexed by search engines

At work, there are times when we need to publish web pages with embargoed content for review by stakeholders. Often it’s too much hassle to add even basic password protection to keep out unwanted visitors. All you really want is to make sure the page doesn’t appear in search results until you want it to. I’ve gotten the panicked phone call to urgently hide a web page too many times, so now I’m documenting how to hide a web page from search engines.

From my 5 minutes of research, there are two simple ways to prevent search engines like Google, Bing, and Yahoo from indexing a page:

  1. Use robots.txt to disallow access to a path on your website
  2. Add a robots meta tag to the web page with noindex

For the first method, a simple text file named robots.txt needs to be added to the root folder of your website with the following content:
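Assuming the page lives at the path mentioned below, a minimal robots.txt would look like this:

```
# Applies to all crawlers
User-agent: *
# Ask crawlers not to fetch the embargoed page
Disallow: /path/of/webpage-to-hide.html
```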

This essentially tells any well-behaved robot crawling the site not to fetch the page at /path/of/webpage-to-hide.html. This method is not a good idea for hiding embargoed content, though, because robots.txt is itself a publicly viewable file: it points out exactly what you don’t want people to see.

For the second method, the following tag needs to be inserted between the head tags of the web page you don’t want to be indexed.
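Using the standard robots meta tag with the noindex directive, it looks like this:

```html
<head>
  <!-- Tells all crawlers not to include this page in search results -->
  <meta name="robots" content="noindex">
</head>
```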

Using a robots meta tag by itself is a good option, as you would need to know the page existed in the first place to discover that we don’t want it indexed.

For more details on using the robots.txt and robots meta tag, check out the related links.

Related Links