How To Keep Bad Visitors Off Your Web Site
If you’re like most web site owners, you probably want all the traffic you can get. The problem is that, as in real life, some visitors can be unruly. They usually fall into one of two categories: bad “bots” and bad individual (human) visitors.
What are bad “bots”? First, “bots” are robots: automated programs that crawl the web for various purposes. The “good” bots generally visit your site to catalog it so that search engines can display it in their results. Bingbot and Googlebot are examples. Generally, you want these “good” bots to visit you because they help bring desirable traffic to your site. However, there may be some content on your site that you don’t want indexed, and you can use a robots.txt file to tell robots which files to avoid. Good robots should always obey a properly formatted robots.txt file; bad bots are free to ignore it.
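As a rough illustration of how a well-behaved bot reads those rules, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the example.com URLs, and the is_allowed helper are made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask every bot to stay out of two folders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a rule-following bot may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/index.html",
                "https://example.com/private/report.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

With these sample rules, the home page is allowed and the page under /private/ is not. Keep in mind that this check is voluntary: it only shows what a polite bot would do.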
Bad bots have ulterior motives. They can have a relatively low impact on your site, or they can be truly destructive. Some of the things they do include:
- Stealing bandwidth by repeatedly requesting files on your site, which slows down page load times for your intended visitors and which your web host (and ultimately you) has to pay for
- Hitting your site often enough to create a Denial of Service attack, preventing anyone from accessing it, or hurting your search engine rankings
- Scanning your site so they can “tattle” on you if someone thinks you’ve stolen their content
- Actually stealing your content, including pictures, articles, and other kinds of intellectual property. In some cases, entire web sites have been ripped off.
- Looking for vulnerabilities in your site so they can alert their operator that your site might be a good candidate for hacking
How can you tell if bad bots are accessing your web site? One way is by examining your server logs. Unfortunately, these are rather arcane and difficult to read unless you have special software for that purpose. If you are running a WordPress site, there is a nice plugin called Wordfence whose user interface for examining visitor traffic makes it easy to find visitors who are bent on mischief.
What should you look for? One of the most common tricks bad bots use is to try to load pages that don’t exist on your site, causing “404” (not found) errors. There are situations where 404s can be generated legitimately, such as visitors looking for web pages that you once had on your site but have since deleted or moved. Ideally you should use 301 permanent redirects for these pages to take your visitors to the correct address for the replacement page, or at least to some other valid page on your site. However, if you see visitors trying to access pages that you never created, or trying to execute PHP scripts that are not part of your site, you know they’re up to no good. These users should be blocked permanently from your site.
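If you want to try this kind of log check yourself, the idea can be sketched in a few lines of Python. This is only a sketch under assumptions: it assumes your server writes the widely used Common Log Format (your host's format may differ), the sample log lines and IP addresses are invented, and the threshold of three 404s is an arbitrary number chosen for the example.

```python
import re
from collections import Counter

# Assumed format: Common Log Format, e.g.
# 203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /page HTTP/1.1" 404 162
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def count_404s_by_ip(lines):
    """Count how many 404 (not found) responses each IP address caused."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            counts[match.group("ip")] += 1
    return counts


def suspicious_ips(lines, threshold=3):
    """Return the IPs with at least `threshold` 404s (threshold is arbitrary)."""
    counts = count_404s_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Invented sample data using reserved documentation IP ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /wp-config.php.bak HTTP/1.1" 404 162',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /shell.php HTTP/1.1" 404 162',
    '203.0.113.7 - - [10/Oct/2023:13:55:38 +0000] "GET /admin/setup.php HTTP/1.1" 404 162',
    '198.51.100.20 - - [10/Oct/2023:13:57:02 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.20 - - [10/Oct/2023:13:57:09 +0000] "GET /old-page.html HTTP/1.1" 404 162',
]

if __name__ == "__main__":
    print(suspicious_ips(SAMPLE_LOG))
```

On the sample data, only 203.0.113.7 is flagged: it requested three scripts that don't exist, while the other visitor hit a single stale link, which is the kind of legitimate 404 a redirect would fix.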
Another thing bad visitors often try to do is to break into your site so they can gain control over it. Most content management systems, such as WordPress, Joomla, and Drupal, have an administrator login page. Even if the login link is not displayed on any of your web pages, you can be sure that hackers know it is there and how to find it. You should never use the default login name (such as “admin”), and you should use a strong password. Even so, people will try to break in if you don’t put effective measures in place to stop them. One of the best techniques is to use a plugin that allows you to immediately lock out anyone who uses the wrong login name. For example, on a WordPress site, you should block anyone who tries to log in as “admin”. Of course, it is possible to lock yourself out of your own site, but with a little care this can be prevented.
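To make the lockout idea concrete, here is a small Python sketch of the logic. It is not how Wordfence or any other plugin is actually implemented; the LoginGuard class, its method names, and the IP addresses are all hypothetical. The list of trusted addresses is one assumed way of avoiding locking yourself out.

```python
class LoginGuard:
    """Hypothetical lockout logic: block any IP that tries a banned login name."""

    def __init__(self, banned_usernames, trusted_ips=()):
        self.banned_usernames = {name.lower() for name in banned_usernames}
        # Assumption: your own addresses go here so a typo can't lock you out.
        self.trusted_ips = set(trusted_ips)
        self.blocked_ips = set()

    def check_attempt(self, ip, username):
        """Return True if the attempt may go on to the normal password check."""
        if ip in self.blocked_ips:
            return False
        if username.lower() in self.banned_usernames and ip not in self.trusted_ips:
            # A banned name from an untrusted address: lock the address out.
            self.blocked_ips.add(ip)
            return False
        return True


if __name__ == "__main__":
    guard = LoginGuard(["admin"], trusted_ips=["192.0.2.10"])
    print(guard.check_attempt("203.0.113.7", "admin"))    # banned name, gets blocked
    print(guard.check_attempt("203.0.113.7", "alice"))    # same IP, still blocked
    print(guard.check_attempt("198.51.100.20", "alice"))  # ordinary attempt
```

In this sketch a trusted address that types “admin” is not blocked; the attempt simply continues to the ordinary password check, which fails if no such account exists.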
If your site is hacked, it can be a real headache. You will lose visitors (and business), and your site will probably be de-indexed by the major search engines. It can take weeks to get re-indexed, during which time your site might as well be offline. So be careful! The best thing to do is to stop hackers before they get in. If you are running a CMS, or any kind of active code on your site, you absolutely must have an effective security solution in place. So far, we are pretty happy with Wordfence for WordPress sites. Cloudflare is another promising possibility.
If you are running a static HTML site, you are considerably less vulnerable to hack attacks, but these kinds of sites are becoming much less common except for basic information sites. However, they are still vulnerable to bad bots that want to steal your content or keep it from reaching your intended users, so you should still pay attention to that risk.
In summary, you need to pay close attention to what your visitors are doing on your site as part of your overall security plan. You’ll probably be surprised at how much malevolent behavior you will find going on, once you start looking at your logs. It is truly mind-boggling. If you don’t want to pay attention to the details yourself, you can hire a third party to do it for you. For example, Weebly is a popular online CMS that gives you a totally “hands off” solution, if that is attractive to you. It can be a good solution for those wanting simple sites and not requiring a great deal of custom work.