Ben Wong is a Web/iOS Developer from Brisbane, Australia. When he's working he is usually coding in HTML, CSS, JavaScript, ASP.NET (C#/VB.NET) or SQL. He enjoys working with Umbraco CMS. When he's not coding he's probably on a basketball court.

How to prevent a web page from being indexed by search engines

At work, there are times when we need to publish web pages with embargoed content for review by stakeholders. Often it’s too much hassle to add even basic password protection to the page to keep out unwanted visitors. All you really want is to make sure it doesn’t appear in search results until you’re ready. I’ve gotten the panicked phone call to urgently hide a web page too many times, so now I’m going to document how to hide a web page from search engines.

From my 5 minutes of research there are 2 simple ways to prevent search engines like Google, Bing and Yahoo from indexing a page.

  1. Use robots.txt to disallow access to a path on your website
  2. Add a robots meta tag to the web page with noindex

For the first method a simple text file named robots.txt needs to be added to the root folder of your web site with the following content:

User-agent: *
Disallow: /path/of/webpage-to-hide.html

This tells any robot crawling the site not to request the page at /path/of/webpage-to-hide.html. This method is not a good idea for hiding embargoed content because robots.txt is itself a publicly viewable file; it would just point out what you don’t want people to see. It’s also worth noting that Disallow only blocks crawling: a page can still end up in search results if other sites link to it.

For the second method, the following tag needs to be inserted between the head tags of the web page you don’t want to be indexed.

<meta name="robots" content="noindex, nofollow">

Using a robots meta tag by itself is a good option, as a visitor would need to know the page existed in the first place to discover that we don’t want it indexed. One caveat: for the meta tag to work, the page must not also be blocked in robots.txt, since crawlers have to fetch the page to see the tag.

For more details on using the robots.txt and robots meta tag, check out the related links.

Related Links

Always specify the radix parameter for JavaScript parseInt

This week I learned how important the radix parameter of JavaScript’s parseInt is. I implemented my own custom date string parser and discovered that older versions of Firefox behaved differently when running the following line.

var month = parseInt('08');

All the latest browsers (IE, Chrome, Firefox, Safari) I tested returned 8, but older versions of Firefox returned 0: they treated the leading zero as an octal prefix, and since 8 isn’t a valid octal digit, parsing stopped after the '0'. My solution was to set the radix to 10.

var month = parseInt('08', 10);

Mozilla Developer Network’s JavaScript parseInt page recommends always specifying the radix parameter to guarantee predictable behaviour.
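The difference is easy to demonstrate. A minimal sketch (modern engines follow ES5, which dropped the octal guessing for parseInt, so the first line now returns 8 everywhere):

```javascript
// Without a radix, pre-ES5 engines guessed the base from the string's
// prefix: a leading "0" meant octal, so parseInt('08') parsed the '0',
// stopped at the invalid octal digit '8', and returned 0.
// Passing the radix removes the guesswork.
var month = parseInt('08', 10);  // 8 in every engine
var flags = parseInt('ff', 16);  // 255: the radix also lets you opt in to hex
var bad   = parseInt('abc', 10); // NaN: no parsable digits in base 10

console.log(month, flags, isNaN(bad));
```

Specifying the radix costs nothing and makes the intended base explicit to the next person reading the code.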

How to use ASP.NET FileUpload’s PostedFile.InputStream

This week I started working on a new file uploader for the company website admin. The current version uses the HttpPostedFile SaveAs method to save the file to a temporary directory before uploading to the web server using FTP. I decided to make the new version use HttpPostedFile’s InputStream property to transfer the file in one go instead.

Seemed straightforward enough, but I hit a snag.

My first attempt at the code to upload the file looked something like this:

byte[] buffer = new byte[postedFile.InputStream.Length];
postedFile.InputStream.Read(buffer, 0, (int)postedFile.InputStream.Length);

It seemed to work, but it turned out that while the uploaded image file ended up being the right size, the image itself was corrupted.

After reading a bunch of MSDN articles, forum posts, blog posts and StackOverflow answers, I discovered that the seek position needed to be reset before reading the stream.

So the code needed to be like this:

byte[] buffer = new byte[postedFile.InputStream.Length];
postedFile.InputStream.Seek(0, SeekOrigin.Begin);
postedFile.InputStream.Read(buffer, 0, (int)postedFile.InputStream.Length);

Strictly speaking, Stream.Read isn’t guaranteed to fill the buffer in a single call, so checking its return value (or looping until the buffer is full) would make this more robust.

Useful .NET libraries for console applications

In the past couple of weeks at work I’ve been rewriting a .NET console application that imports data and runs as a scheduled task on a server. I’ve had to work on a few of these sorts of apps as a web developer, so I thought I’d share my knowledge.

These types of applications aren’t particularly exciting to work on, but they’re almost always critical and are required to run correctly and reliably. Luckily there are some useful libraries that ease the burden.

Here are some that I’ve used in the past couple of weeks.

NLog

Logging is very important for applications that run as scheduled tasks, because when one fails you need some way of determining why. NLog is very simple to use and rich in features. I’ve set it up to send an email on failure and also configured log file archiving.
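As a sketch, an NLog.config along these lines wires up both behaviours; the file paths, addresses and SMTP server here are placeholders, not values from the original setup:

```xml
<?xml version="1.0" encoding="utf-8"?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <targets>
    <!-- Daily log file, keeping a week of archives -->
    <target name="file" xsi:type="File"
            fileName="logs/importer.log"
            archiveEvery="Day" maxArchiveFiles="7" />
    <!-- Email on errors; SMTP details are placeholders -->
    <target name="mail" xsi:type="Mail"
            smtpServer="smtp.example.com"
            from="importer@example.com" to="dev@example.com"
            subject="Importer failed" />
  </targets>
  <rules>
    <logger name="*" minlevel="Info" writeTo="file" />
    <logger name="*" minlevel="Error" writeTo="mail" />
  </rules>
</nlog>
```

The rules section is what makes this convenient: everything from Info up goes to the file, while only Error and above triggers an email.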

FileHelpers

FileHelpers is a brilliant library that simplifies the job of parsing import files. It takes a little time to learn initially, but still much less time than writing the parsing code yourself.

DotNetZip

Sometimes import files need to be unzipped first. DotNetZip seemed like the simplest of the .NET zip libraries to use. It worked and didn’t require many lines of code; can’t beat that.

NDesk.Options

The data importer I was working on required command line arguments to trigger the various ways I needed to run it: the type of import and different time spans. That’s where a command line argument parser like NDesk.Options comes in handy. The way it works seems a little quirky, but it does the job.