2012-01-08

PHP Programming: Clean URLs

I've spent the past two weekends re-writing old, horrible, ugly code that I wrote as a kid. I still have a lot of work ahead of me, and there's still no guarantee that the code I'm writing now will be great, but it'll be better than the stuff I wrote before. I'm not a web developer by any means, but it won't keep me from having some nerdy fun on the weekend. I've learned some tricks, and figured I'd share them here in a few short entries. Today, it's so-called "Clean" URLs.

In most dynamic web scripting languages such as PHP, dynamic pages can be accessed by passing parameters through a query string. As an example: http://foo.somesite.com/article.php?p=42

You've undoubtedly seen dynamic websites that don't pass query strings, but instead generate nice, friendly URLs that look more like this: http://foo.somesite.com/article/Hello_World

Clean URLs are easier to link to, better for search engine optimization, and, when implemented properly, can offer a few security benefits as well. The project I'm working on is a custom CMS that was my very first Apache/PHP/MySQL project. It ended up getting kind of popular among a niche crowd, so I left it online in all its ugly glory. I have an article list script that can filter articles by category, and another script to display the articles. I'll just call them list.php and article.php here for the sake of discussion.

Removing .php from the URL

The first part of the problem is that I'd rather not have ".php" show up in the URL. I want "article" and "list" to look like sub-directories of my site.

After playing around with a bunch of promising hacks that tended to work fine, including mod_rewrite, I decided that the most elegant way to tackle this was to rely on Apache's "FilesMatch" directive. I created symbolic links from list.php and article.php to strip the .php extension. You could just as easily rename them, though. Then I added the FilesMatch directive to .htaccess for "article" and "list" as shown below. Make sure "AllowOverride All" is set in your Apache configuration, too.
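
A directive along these lines does the job. Treat it as a sketch rather than my exact file: it assumes mod_php, where the application/x-httpd-php type hands the file to PHP. Other setups (PHP-FPM, for example) need a different handler, and if you go the symlink route, Apache also has to be allowed to follow symlinks in that directory.

```apache
# .htaccess: run the extensionless files "article" and "list" as PHP scripts.
# Assumes mod_php; the type name below is what mod_php registers.
<FilesMatch "^(article|list)$">
    ForceType application/x-httpd-php
</FilesMatch>
```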



At this point, pulling up http://foo.somesite.com/list will try to run "list" as a PHP script. In my case, it shows a list of articles.

Parsing URI parameters with PHP

The next part is mostly easy. As far as I know, Apache will let you tack extra path segments onto the URL of any PHP script, like this: http://foo.somesite.com/index.php/testing/1/2/3/4/5. The page may not render well (most likely because relative links to stylesheets and images now resolve against the longer path), but the script still executes. Above, we got list to show a list of articles, so now it's time to set up filtering. In my script, the last element of the URL (split by /) is treated as the category. The following URL would show a category of "Widgets": http://foo.somesite.com/list/Widgets
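
If you want to poke at those extra segments on their own, Apache normally exposes them to the script as $_SERVER['PATH_INFO']. Here is a minimal sketch, assuming that variable is set on your server (it depends on the Apache configuration). The function name extra_path_segments is made up for illustration, and it takes the server array as a parameter so it can be tried from the command line.

```php
<?php
// Hypothetical helper: split the extra path after the script name into segments.
// $server stands in for $_SERVER so the function can be tried without a web server.
function extra_path_segments(array $server): array
{
    // PATH_INFO is assumed to hold the part of the URL after the script name.
    $pathInfo = isset($server['PATH_INFO']) ? $server['PATH_INFO'] : '';
    // Drop the empty pieces produced by leading, trailing, or doubled slashes.
    $parts = array_filter(explode('/', $pathInfo), 'strlen');
    // Re-index so the result is a plain 0-based list.
    return array_values($parts);
}

// For /index.php/testing/1/2/3/4/5 the extra path would be "/testing/1/2/3/4/5".
print_r(extra_path_segments(array('PATH_INFO' => '/testing/1/2/3/4/5')));
```

In a real script you would pass $_SERVER itself. I stuck with REQUEST_URI in my own code, but either approach gets you the same last segment.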

To avoid the problem of getting a category of "list" or "list.php" (remember, I chose to symlink my .php files), I included logic to check whether the category matches the script name. If so, it assumes that it should instead display the Table of Contents. You'll also see that I set the title based on what happens here. As the category names are stored in the database and displayed with spaces, I use str_replace to replace underscores in the URI with spaces. This means that I have to ensure spaces are turned into underscores elsewhere in the code and in all links to my site. That's beyond the scope of this article, though. What matters is that a URL of http://foo.somesite.com/list/Arts_and_Crafts would pull up a category of "Arts and Crafts".

// The category is the last path segment of the requested URI.
$segments=explode('/', urldecode($_SERVER['REQUEST_URI']));
$category=end($segments); // end() wants a variable, not a function result
// Skip the filter when the last segment is empty or is the script itself.
if(($category) && ($category != 'list.php' && $category != 'list'))
{
    $rawcat=str_replace('_', ' ', $category);
    $list=getlist($rawcat);
    $title=$rawcat." Articles";
}
else
{
    $toc=TRUE;
    $list=getlist();
    $title="Table of Contents";
}

From here, the rest of my PHP code can query the database for articles. getlist() is a function that I wrote. If its parameter is null, it gets all of the articles; otherwise, it (safely!) queries the database for a single category.
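
I'm not going to paste my real getlist() here, but "safely" boils down to never gluing the category into the SQL string. Below is a rough sketch of that idea using PDO prepared statements. The function name getlist_sketch, the articles table, and its columns are all invented for the example, and it uses an in-memory SQLite database (which needs the pdo_sqlite extension) instead of MySQL purely so it runs on its own.

```php
<?php
// Hypothetical stand-in for getlist(); the table and column names are made up.
function getlist_sketch(PDO $db, $category = null)
{
    if ($category === null) {
        // No category given: return every article title.
        $stmt = $db->query('SELECT title FROM articles ORDER BY title');
    } else {
        // The category travels as a bound parameter, never as part of the SQL text.
        $stmt = $db->prepare('SELECT title FROM articles WHERE category = ? ORDER BY title');
        $stmt->execute(array($category));
    }
    // FETCH_COLUMN returns a flat array of the first column (the titles).
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Throwaway in-memory SQLite database, only so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE articles (title TEXT, category TEXT)');
$db->exec("INSERT INTO articles VALUES ('Knitting 101', 'Arts and Crafts')");
$db->exec("INSERT INTO articles VALUES ('Widget Repair', 'Widgets')");

print_r(getlist_sketch($db, 'Arts and Crafts'));
```

Because the category is bound rather than concatenated, a URL segment stuffed with quotes is just an unknown category name as far as the database is concerned.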

I used similar code in the article.php script, which queries the database for the title of an article, so I get nice-looking URLs for individual articles as well, such as:

http://foo.somesite.com/article/How_to_narfle_the_Garthok
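
For the links themselves, the space-to-underscore conversion has to run in the other direction too. Here is a minimal sketch of the pair. The helper names title_to_slug and slug_to_title are hypothetical, not from my code, and the approach assumes titles never contain real underscores, since those would come back as spaces.

```php
<?php
// Hypothetical helpers; the names are not from the original code.
function title_to_slug($title)
{
    // Spaces become underscores, then the result is percent-encoded for a URL path.
    return rawurlencode(str_replace(' ', '_', $title));
}

function slug_to_title($slug)
{
    // Reverse the steps: decode first, then turn underscores back into spaces.
    return str_replace('_', ' ', rawurldecode($slug));
}

// Build a link for an article title.
echo 'http://foo.somesite.com/article/' . title_to_slug('How to narfle the Garthok'), "\n";
```

The same pair works for category names in the list script.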
