This week I’ve been having fun using uTidyLib, a Python wrapper for HTML Tidy. All was working swimmingly until I hooked it up to a custom form validation function in Django. The Python process on my Mac kept crashing, and I couldn’t work out why, since the same code worked fine from the CLI.
After looking at the type of the data coming from Django and seeing that it was unicode, I realised the crash might have something to do with what was being passed in:
Python 2.5.1 (r251:54863, Feb 4 2008, 21:48:13)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tidy
>>> tidy.parseString(u'EPIC FAIL')
Bus error
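So it would seem that uTidyLib’s unicode handling is somewhat subpar: passing it a unicode object rather than a plain byte string is enough to take down the whole interpreter.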
Naturally I’ll raise a bug report as soon as I get a mo’!
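In the meantime, one possible workaround is to encode the unicode data to a byte string before it gets anywhere near tidy. This is only a sketch: to_utf8 is a helper name I've made up, UTF-8 is an assumed encoding, and I'm assuming the crash really is down to the unicode object rather than something else in the form data.

```python
# Hypothetical helper (not part of uTidyLib): make sure tidy only ever
# receives an encoded byte string, never a unicode object.
TEXT_TYPE = type(u"")  # `unicode` on Python 2, `str` on Python 3


def to_utf8(markup, encoding="utf-8"):
    """Encode unicode text to bytes; pass byte strings through untouched."""
    if isinstance(markup, TEXT_TYPE):
        return markup.encode(encoding)
    return markup


# Assumed usage inside the validation function:
#   import tidy
#   document = tidy.parseString(to_utf8(form_data))
```

Byte strings pass straight through, so the same call should be safe from the CLI as well as from Django.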
Another quick tip if you’re using uTidyLib on a Mac: it can’t find the tidy library by default until you either symlink the built-in dynamic library file as a .so file or apply the patch found here (credit to Evil Rob for finding this out). The symlink looks like this:
ln -s /usr/lib/libtidy.dylib /usr/lib/libtidy.so
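To check whether the symlink is needed (or has worked), something like the following can report which library names actually load through ctypes. It's a rough sketch: loadable_names is a made-up helper, and the candidate names are my guesses rather than the exact list uTidyLib tries internally.

```python
import ctypes


def loadable_names(candidates):
    """Return the subset of library names that ctypes can load."""
    found = []
    for name in candidates:
        try:
            ctypes.CDLL(name)
        except OSError:
            # The dynamic loader could not find or open this name.
            continue
        found.append(name)
    return found


# Assumed candidate names; adjust to whatever your platform provides.
print(loadable_names(["libtidy.so", "libtidy.dylib"]))
```

If libtidy.dylib loads but libtidy.so doesn't, that would point to the missing symlink.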