randys.org

Wasting your precious bandwidth since 1999

Google Friendly URLs with PHP and Apache

Dynamic websites are essential for content-heavy sites, and chances are you’re not going to create one static page for every page, article, or product item stored in your database. That means passing parameters to a script so it can pull the right content from the database. Until recently, most people didn’t give much thought to those nasty URLs full of ampersands (&) and equal signs, or to how they affect spidering by search engines. While these URLs are perfectly valid, they tend not to get indexed by Google (and other search engines) unless you submit each URL for indexing (which can cost you money). And these days, if you’re not in Google, you’re not being found.

Enter PHP and Apache

If you’re using PHP and Apache, you can still be dynamic and be spidered by all the popular search engines fairly easily. Of course, this depends on your hosting provider and how much control they give you over .htaccess files. A .htaccess file lets you override some Apache settings on a per-directory basis. Have a look at the Apache manual for a full listing of the configuration directives the Apache web server supports.

Now for the goods. Here’s a basic rundown of how search engine friendly URLs can be used in a dynamic way. Let’s say you have a catalog of items, and your script takes three parameters: ‘section’, ‘action’, and ‘item’. Normally, when passing parameters to a script, your URL will look something like this:

/index.php?section=widgets&action=view&item=327

With the proper directives in an Apache .htaccess file and some simple PHP scripting, you can turn that URL into something like this:

/catalog/section/widgets/action/view/item/327

Looks like a bunch of folders, but it’s not. ‘catalog’ is actually a file (a PHP file) without the extension, and with the help of a .htaccess file, Apache treats it like a PHP script. Here’s what the .htaccess file looks like:

    <Files catalog>
        ForceType application/x-httpd-php
    </Files>

That’s it. Nothing too complicated. Now for the trickier part, grabbing those parameters from the URL with PHP.

There’s a server variable called ‘PATH_INFO’ that gets passed to the script. It contains the entire string after the catalog file (including that first slash). To grab it in a PHP script, you’d use the $_SERVER global variable (a la $_SERVER['PATH_INFO']). The idea is to explode the string into an array and loop through it, pairing each parameter name with the value that follows it. Here’s a small function to do so:

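    <?php
    // Turn a PATH_INFO string such as "/section/widgets/action/view/item/327"
    // into an associative array of name => value pairs.
    function getArgs()
    {
        $args = array();
        // PATH_INFO is not set when nothing follows the script name.
        $path = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
        $params = explode("/", $path);

        // Index 0 is the empty string before the leading slash, so start at 1.
        // A trailing name with no value after it is skipped.
        for ($i = 1; $i + 1 < count($params); $i = $i + 2)
        {
            $args[$params[$i]] = $params[$i + 1];
        }
        return $args;
    }
    ?>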

When used on the above example, this function will return an array like so:

    Array (
        [section] => widgets
        [action] => view
        [item] => 327
    )
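
If you want to try the parsing without a web server, here’s a minimal sketch of the same idea that takes the path as an argument instead of reading $_SERVER. The name parsePathInfo is made up for this example; it isn’t part of PHP or of the function above.

```php
<?php
// Hypothetical helper: same idea as getArgs(), but the path is passed in
// so it can be run from the command line.
function parsePathInfo($pathInfo)
{
    $args = array();
    // Trim the slashes so there is no empty leading or trailing segment.
    $trimmed = trim($pathInfo, '/');
    if ($trimmed === '') {
        return $args;
    }
    $params = explode('/', $trimmed);

    // Walk the segments two at a time: name, then value.
    for ($i = 0; $i + 1 < count($params); $i += 2) {
        $args[$params[$i]] = $params[$i + 1];
    }
    return $args;
}

$args = parsePathInfo('/section/widgets/action/view/item/327');
echo $args['section'], ' ', $args['action'], ' ', $args['item'], "\n";
// prints "widgets view 327"
```

Note that every value comes back as a string, so cast ‘item’ to an integer before using it in a database query.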



Implementing this into a current site shouldn’t be too hard. The ‘catalog’ script would simply be a wrapper script where you can pass the required values into existing code.
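
As a rough sketch of what such a wrapper might do, assume the existing code reads its parameters from $_GET and lives in index.php (both are assumptions; adapt them to your site). The helper name mergeArgs and the default values are made up for this example. The idea is to keep only the parameters the old script expects, fill in defaults for the rest, and hand them over the way the old script already wants them.

```php
<?php
// Hypothetical wrapper helper: keep only the parameters the existing
// script expects and fill in defaults for anything missing.
function mergeArgs(array $args, array $defaults)
{
    $merged = $defaults;
    foreach ($defaults as $name => $default) {
        if (isset($args[$name]) && $args[$name] !== '') {
            $merged[$name] = $args[$name];
        }
    }
    return $merged;
}

// In the real 'catalog' file this array would come from getArgs().
$fromUrl  = array('section' => 'widgets', 'item' => '327', 'bogus' => 'x');
$defaults = array('section' => 'home', 'action' => 'view', 'item' => '');

// Expose the cleaned values where the existing code already looks for them.
$_GET = array_merge($_GET, mergeArgs($fromUrl, $defaults));

// include 'index.php'; // assumed name of the existing script
```

Anything not listed in the defaults (like ‘bogus’ here) is dropped, which keeps stray URL segments out of the old code.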

If you find that your web host doesn’t allow use of .htaccess files, or they don’t allow the <Files> directive to be used, you can do one of two things (and I highly recommend the first).

1) Change hosts! I use Dreamhost and they are extremely flexible, have an awesome support team and have a very good deal for $9.95/mo.

2) Just use a normal PHP script (i.e. catalog.php) and use the same function. While this method works, I’ve read that some search engines see this as a hack/cheat and might not spider the pages anyway. However, if you are a reputable business, you can usually contact them and get your pages indexed.
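
Whichever route you take, the links in your own pages have to be written in the new form too. Here’s a minimal sketch of a helper for that, which works the same whether the script is called ‘catalog’ or ‘catalog.php’. The name buildFriendlyUrl is made up for this example.

```php
<?php
// Hypothetical helper: build a "/script/name/value/..." URL from parameters.
function buildFriendlyUrl($script, array $params)
{
    $url = '/' . $script;
    foreach ($params as $name => $value) {
        // rawurlencode escapes spaces and other characters that are not
        // safe inside a path segment.
        $url .= '/' . rawurlencode($name) . '/' . rawurlencode($value);
    }
    return $url;
}

echo buildFriendlyUrl('catalog', array('section' => 'widgets', 'action' => 'view', 'item' => 327)), "\n";
// prints "/catalog/section/widgets/action/view/item/327"
echo buildFriendlyUrl('catalog.php', array('section' => 'widgets')), "\n";
// prints "/catalog.php/section/widgets"
```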

Other Web Servers

If you’re running IIS (god forbid), there is hope for you. There is at least one ISAPI filter available that mimics the Apache mod_rewrite module, and it can help you do the same thing (although it won’t be quite as easy). The drawbacks are that a) the filter isn’t free and b) if you’re in a shared environment, you can’t use it. There are other ways to manipulate your IIS settings to use search engine friendly URLs, but they all have their downsides. Bottom line: stop using IIS.

As for other servers, I can’t really say. Chances are there are some tweaks that can be made to the configuration and/or scripts. If you know of any, please drop me a line and I’ll try to get it posted.


All content Copyright © Randy Sesser | Hosted by WebFaction