Friday, 3 May 2013

Clean Up your Dirty URLs in Drupal

Having installed Drupal, the website-building Content Management System (CMS), on a cloud server, I had some issues getting Clean URLs to work. This post describes how I got rid of the dirty URLs on my Drupal-hosted websites.

Dirty URLs

Default URLs for a Drupal-hosted site include ?q= because it triggers a database query for whatever comes after the ?q=. Removing these characters, or at least not displaying them in the URL, makes search engines friendlier towards your site and may improve your pages' rankings. A further step is to add aliases to pages so there is a human-readable name instead of an id.

Apache mod_rewrite

Apache is the software that serves up webpages from my server and is a standard component of my CentOS 6.2 LAMP installation. You can add and configure a number of different modules for Apache, and an important one used by Clean URLs is mod_rewrite, which translates URLs according to a set of rules (e.g. remove '?q=').
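To make the idea of "translating URLs according to a set of rules" a bit more concrete, here is a rough sketch that imitates the mapping with sed. It is not Apache's own code or Drupal's actual rule set; the function names and the sample paths are made up for illustration.

```shell
#!/bin/sh
# Imitate, with sed, the kind of mapping a rewrite rule performs.
# to_dirty and to_clean are hypothetical helper names, not Apache commands.

# Turn a clean request path such as /node/1 into the ?q= form.
to_dirty() {
    printf '%s\n' "$1" | sed -e 's|^/\(.*\)$|/index.php?q=\1|'
}

# Turn a ?q= path back into its clean form.
to_clean() {
    printf '%s\n' "$1" | sed -e 's|^/index\.php?q=|/|' -e 's|^/?q=|/|'
}

to_dirty /node/1
to_clean '/?q=node/1'
```

The visitor only ever sees the clean form; the translation back to the ?q= form happens on the server.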

To check if this is installed you need to log into your server with ssh and find the httpd.conf file, which in this instance was at /etc/httpd/conf/httpd.conf. Edit this httpd.conf file using a text editor, e.g. nano, and scroll down (line 190 on my OS) to check for the line beginning LoadModule rewrite_module modules/. If it is commented out with a # then remove the # and restart Apache with service httpd restart. Finally, to check it's working, run the command apachectl -M and look for rewrite_module under Loaded Modules.
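If you would rather not scroll through the file, the same check can be sketched as a small shell function. This is only an illustration: rewrite_module_state is a name I made up, and the default path is the one from my server, so yours may differ.

```shell
#!/bin/sh
# Report whether rewrite_module is enabled, commented out, or missing
# in an Apache config file. rewrite_module_state is a hypothetical helper.

rewrite_module_state() {
    # Default to the httpd.conf location used in this post.
    conf="${1:-/etc/httpd/conf/httpd.conf}"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    if grep -Eq '^[[:space:]]*LoadModule[[:space:]]+rewrite_module[[:space:]]' "$conf"; then
        echo enabled
    elif grep -Eq '^[[:space:]]*#[[:space:]]*LoadModule[[:space:]]+rewrite_module[[:space:]]' "$conf"; then
        echo commented
    else
        echo missing
    fi
}
```

Running rewrite_module_state with no argument checks the default path. A result of "commented" means the # still needs removing and Apache restarting.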

I opened my browser and found that although the ?q= form of a URL worked, the same URL without the ?q= didn't, so Clean URLs wasn't working yet.

A little extra reading suggested another change to the httpd.conf file: change AllowOverride None to AllowOverride All within the <Directory "/var/www/html"> section. For my installation the line to be changed was line number 338, but Ctrl-W lets you search for a word such as override in nano. Lots of webpages go into more detail about the Rewrite code itself; however, this already existed, along with lots of other stuff, in /var/www/html/.htaccess. The change to AllowOverride allows this .htaccess file to take effect for the DocumentRoot, i.e. /var/www/html.
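Because httpd.conf usually contains more than one AllowOverride line, it is easy to change the wrong one. As a sketch, the function below prints the value set inside one particular <Directory> block. The name allow_override_for is hypothetical, and the defaults are the paths from my installation.

```shell
#!/bin/sh
# Print the AllowOverride value set inside one <Directory> block of an
# Apache config file. allow_override_for is a hypothetical helper.

allow_override_for() {
    dir="${1:-/var/www/html}"
    conf="${2:-/etc/httpd/conf/httpd.conf}"
    awk -v dir="$dir" '
        # Skip comment lines so commented-out examples are ignored.
        /^[ \t]*#/ { next }
        # Entering a <Directory ...> block: note whether it is the one we want.
        /^[ \t]*<Directory[ \t]/ {
            path = $2
            gsub(/[">]/, "", path)
            inside = (path == dir)
            next
        }
        # Leaving a block.
        /^[ \t]*<\/Directory>/ { inside = 0; next }
        # Inside the wanted block: print everything after the directive name.
        inside && $1 == "AllowOverride" {
            $1 = ""
            sub(/^[ \t]+/, "")
            print
        }
    ' "$conf"
}
```

On my server I would expect allow_override_for to print None before the edit and All after it.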

Changing this and retrying the browser URL change worked, so I went back to my site and, still logged in as user 1 (full admin privileges), went to Configuration -> Clean URLs and hit the test button. This time it passed and I could finally get to the Enable Clean URLs check box.

URL Alias

Now my homepage URL no longer showed the ?q=, which was an improvement. However, my About page was still addressed by its id when I'd like it to be addressed by name.

This is where URL Aliases come into effect: giving the About page a URL alias allowed me to use a human-readable name in its address in place of the id.

Job done! Phew - not as hard as I was imagining or as hard as the posts I found out there made it.  Of course my situation was quite vanilla as it was a new and simple installation so this solution is not a cure-all.  However this is a simple explanation that currently isn't out there.

Hope it helps someone!