At work, there are times when we need to publish web pages with embargoed content for review by stakeholders. Often it’s too much hassle to add even basic password protection to the page to keep out unwanted visitors. All you really want is to make sure the page doesn’t appear in search results until you want it to. I’ve gotten the panicked phone call to urgently hide a web page too many times, so now I’m going to document how to hide a web page from search engines.
From my five minutes of research, there are two simple ways to prevent search engines like Google, Bing and Yahoo from indexing a page:
- Use robots.txt to disallow access to a path on your website
- Add a robots meta tag to the web page
For the first method, a plain text file named robots.txt needs to be added to the root folder of your website with the following content:
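Based on the path used as the example later in this post, the file would contain something like this (the path is just a placeholder for the page you want to hide):

```text
User-agent: *
Disallow: /path/of/webpage-to-hide.html
```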
This essentially tells any robot crawling the site not to access the page at /path/of/webpage-to-hide.html. Strictly speaking, robots.txt blocks crawling rather than indexing, so a page that other sites link to can still show up in results. More importantly, this method is a bad idea for hiding embargoed content because robots.txt is a publicly viewable file: it would point out exactly what you don’t want people to see.
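If you want to sanity-check a disallow rule before relying on it, Python’s standard library ships a parser for the robots exclusion format. A minimal sketch, assuming the placeholder path above and a hypothetical example.com site:

```python
# Check which URLs a robots.txt rule blocks, using the standard library.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# Feed the rules directly as lines instead of fetching a live robots.txt.
rp.parse([
    "User-agent: *",
    "Disallow: /path/of/webpage-to-hide.html",
])

# The hidden page is disallowed for all crawlers...
print(rp.can_fetch("*", "https://example.com/path/of/webpage-to-hide.html"))  # False
# ...while everything else is still crawlable.
print(rp.can_fetch("*", "https://example.com/other-page.html"))  # True
```

This only tells you what well-behaved crawlers will do; nothing stops a human from reading the file itself, which is the weakness described above.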
For the second method, the following tag needs to be inserted between the head tags of the web page you don’t want to be indexed.
<meta name="robots" content="noindex, nofollow">
Using a robots meta tag by itself is a good option, as someone would need to know the page existed in the first place to discover that we don’t want it indexed.
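To double-check that a page actually carries the tag, here’s a minimal sketch using Python’s standard-library HTML parser; the sample HTML string is a hypothetical stand-in for your page’s markup:

```python
# Detect a "noindex" robots meta tag in a page's HTML.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; values may be None.
        if tag == "meta":
            d = dict(attrs)
            name = (d.get("name") or "").lower()
            content = (d.get("content") or "").lower()
            if name == "robots" and "noindex" in content:
                self.noindex = True

html = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
parser = RobotsMetaParser()
parser.feed(html)
print(parser.noindex)  # True when the tag is present
```

In practice you would feed it the fetched HTML of the page you published, before the embargo matters.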
For more details on the robots.txt file and the robots meta tag, check out the related links.
- Block access to your content – Google Search Console Help
- Robots Exclusion Standard – Wikipedia