Muffinresearch Labs by Stuart Colville

FAIL of the week: uTidyLib unicode error | Comments (3)

Posted in Code, Linux/Unix on 17th July 2008, 4:38 pm by Stuart

This week I’ve been having fun using uTidyLib, a Python wrapper for HTML Tidy. All was working swimmingly until I hooked it up to a custom form validation function in Django. The Python process on my Mac kept crashing, and I couldn’t work out why, since the same code was working fine from the CLI.

After looking at the type of the data coming from Django and seeing that it was unicode, I realised the crash might be down to what was being passed in:

Python 2.5.1 (r251:54863, Feb  4 2008, 21:48:13)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tidy
>>> tidy.parseString(u'EPIC FAIL')
Bus error

So it would seem that uTidyLib’s unicode handling is somewhat subpar. I had planned to raise a bug report as soon as I got a mo’, but one has already been raised.
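
One way to sidestep the crash might be to encode unicode input to a byte string before it ever reaches uTidyLib. The following is only a minimal sketch, assuming the bus error is triggered by passing a unicode object rather than a byte string; `to_utf8_bytes` is a hypothetical helper, not part of uTidyLib, and UTF-8 is an assumed encoding.

```python
# Hypothetical helper, not part of uTidyLib.
# type(u'') is `unicode` on Python 2 and `str` on Python 3.
TEXT_TYPE = type(u'')


def to_utf8_bytes(markup, encoding='utf-8'):
    """Return markup as an encoded byte string.

    Text (unicode) input is encoded; byte strings are returned untouched.
    """
    if isinstance(markup, TEXT_TYPE):
        return markup.encode(encoding)
    return markup
```

The result could then be handed to `tidy.parseString` in place of the raw unicode value coming from Django.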

Another quick tip if you’re using uTidyLib on a Mac: it can’t find the tidy library by default until you either symlink the built-in dynamic library file as a .so file or apply the patch found here (credit to Evil Rob for finding this out), e.g.:

ln -s /usr/lib/libtidy.dylib /usr/lib/libtidy.so
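
To check whether the symlink is needed in the first place, something like the following could be used to see which library file names can actually be loaded. It is a rough sketch only: `first_loadable` is a hypothetical helper, and the candidate names in the usage comment are illustrative guesses, not the list uTidyLib itself tries.

```python
import ctypes


def first_loadable(names, loader=ctypes.CDLL):
    """Return the first name in `names` that the loader can open, else None."""
    for name in names:
        try:
            loader(name)
        except OSError:
            # The loader could not find or open this name; try the next one.
            continue
        return name
    return None


# Example (names are assumptions for illustration):
# first_loadable(['libtidy.so', 'libtidy.dylib'])
```

If only the .dylib name loads, the symlink above (or the patch) is what makes the .so name resolvable.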


Comments:

1. On July 17th, 2008 at 9:39 pm sil said:

Blimey. Happens to me as well on Ubuntu. The uTidyLib people need a boot in the arse to shape up.

2. On July 18th, 2008 at 1:46 am Robert Lofthouse said:

uTidyLib has uber epic fail ;) I agree with Stuart, I mean come on…these are fairly simple things to fix. Pull your finger out guys ;)

3. On July 29th, 2008 at 9:17 pm Working around uTidyLib’s unicode handling by Stuart Colville said:

[...] A couple of weeks back I was giving uTidyLib a hard time for exploding when passed a unicode string. (see FAIL of the week: uTidyLib unicode error). [...]










Using Loggerhead with mod_wsgi | Comments (0)

Here’s a post I wrote over on the Project Fondue Blog about our use of Loggerhead with mod_wsgi under Apache. Loggerhead is the rather nice branch viewer for Bazaar branches, as used on Launchpad.net.

If you’re not already subscribed to the Project Fondue blog feed then I can recommend it, as there should be some interesting posts coming out of there in the coming months (yes I’m unashamedly biased!).

Ubuntu: Turn off changing workspace with mouse wheel | Comments (1)

I found changing the workspace with the mouse wheel really annoying. To disable it, go to System => Preferences => CompizConfig (available if the compizconfig-settings-manager package is installed) and uncheck “Viewport Switcher”, which is under the “Desktop” heading.


© Copyright 2004-10 Stuart Colville, all rights reserved. May contain traces of Muffin. Powered by WordPress. Hosting by Slicehost.com This page was baked in 0.612s.