Friday, February 13, 2009

Quick guide to UTF-8 in PHP regular expressions

A lot of different issues come up with any sort of Unicode support in applications, mainly because things get much more complicated than with a single-byte encoding such as ASCII. UTF-8 is the most common encoding in use for Unicode support. Many in-depth documents have been written about this, and I encourage you to read them if you have to do anything more than modify a couple of lines of code. To modify regular expressions in PHP for UTF-8 characters, here are a few steps that can get you off the ground quickly:

* Use the preg functions; the ereg functions do not support Unicode and are going away in PHP 6.
* To enable Unicode for a regex, add the /u modifier after the closing delimiter, e.g.

if(preg_match("/[^[:space:]a-zA-Z0-9]{1,}/u", $string)) {

* Find a good table of Unicode characters and their hexadecimal code points. Use this table to look up the Unicode value; it should be in the format U+NNN, where NNN represents a variable-length hexadecimal number. For example, the euro symbol, €, is U+20AC. To use this in a regular expression, use the \x{NNN} format, e.g. \x{20AC}.

* For example, to add € to the allowed characters in the above expression, use the following:

if(preg_match("/[^[:space:]a-zA-Z0-9\x{20AC}]{1,}/u", $string)) {
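The steps above can be tried out with a short script. This is only a sketch: the helper name has_disallowed_chars and the sample strings are made up for illustration, and it assumes PHP 7 or later (for the type declarations and the "\u{...}" string escape). Because the character class is negated, a match means the string contains something outside the allowed set.

```php
<?php
// Hypothetical helper (not from any library): returns true when $string
// contains at least one character outside the allowed set.
function has_disallowed_chars(string $string): bool
{
    // Single quotes stop PHP from interpreting the backslash, so PCRE
    // itself receives \x{20AC}. The trailing "u" tells PCRE to treat
    // both the pattern and the subject as UTF-8.
    return preg_match('/[^[:space:]a-zA-Z0-9\x{20AC}]/u', $string) === 1;
}

// "\u{20AC}" is the euro sign, "\u{00A3}" is the pound sign.
var_dump(has_disallowed_chars("Price 100 \u{20AC}")); // bool(false): euro is allowed
var_dump(has_disallowed_chars("Price 100 \u{00A3}")); // bool(true): pound is not
```

Without the u modifier, PCRE should reject \x{20AC} outright, since a value that large does not fit in a single byte, so the modifier is not optional once code points above 0xFF appear in the pattern.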

Wednesday, February 4, 2009

Comparing email options for small/medium organizations

Often we'll have clients that need mail set up for their domain as almost an afterthought. The primary use of the domain is to serve a website to a global audience, and the e-mail addresses will primarily be used for customer support or communication among a small team. Usually, it's tacked on to the end of a list of requirements for server configuration, e.g. "oh yeah, and we'll need a mail server too." Generally, "mail server" in this context means that they want SMTP, POP, IMAP, a webmail interface and an administrative interface that allows them to manage their own user accounts and aliases. Unfortunately, this setup is a little more effort than just doing a "yum install mailserver", so this is generally what we recommend for mail server installation:

Don't do it.
Use a hosted solution instead. Email is a headache to administer and is now standardized enough that it can be offered as a commodity. Standard edition Google Apps for business is free and meets most needs well, and there are many other good solutions out there. However, this doesn't work for everyone, due to confidentiality concerns, customization needs, etc.


If you have the horsepower, run Zimbra.
It's free, mostly open source, and provides all of the functionality that I mentioned above, along with several other features like calendaring. Unfortunately, it requires a lot of resources (especially memory) to run all of its separate components, so it usually needs its own server, which is overkill for a lot of installations with fewer than 50 users.

Postfix + Dovecot + Squirrelmail + postfixadmin
If you don't have the resources to run Zimbra and need to run an in-house mail server, this is probably your best bet for putting together a cost-effective system. Unfortunately, it requires more setup time than the other solutions, but you can easily run it on a low-powered server or in a virtual instance.

Qmail + Courier + Squirrelmail + qmailadmin
I'm not a big fan of qmail, but it does have a fair number of supporters and I support a couple of different installations, so I'll mention it. Similar to the Postfix bundle, it doesn't require many server resources, but it can take a fair bit of setup time. However, qmail is generally more difficult to administer, less well supported, etc., so I would recommend using Postfix as your SMTP server instead.