Jonathan Hitchcock's blog

Yes, we can.

Senator Barack Obama has won the race to become the nominee of the Democratic Party in the United States of America. Today, Hillary Clinton threw her weight behind him:

"Today as I suspend my campaign, I congratulate him on the victory he has won and the extraordinary race he has run," Mrs Clinton said. "I endorse him and throw my full support behind him and I ask all of you to join me in working as hard for Barack Obama as you have for me."

I can't really tell you why I'm so hopeful for Obama. Maybe it's because people I really respect have come out and endorsed him - several times - breaking their usual non-political stances in several instances. But I've read about him, and I've watched his speeches (transcript here), and I'm optimistic about where America is going.

If you haven't seen the video before, watch it now: yes, we can.

America, so unwell, but still able to thrill us when you get it right...

Just do it

If you have any spare clothes (and I know you do), or any spare cash (and I know you do), please, PLEASE, help out. There are thousands of people left with no homes, and no possessions but what they're wearing. Most supermarkets should have donation boxes, but for more ideas of how to help, check out the Treatment Action Campaign site, and there's even a Facebook group for you.

Thanks.

Phoblog: Weekend in Stanford

A champagne breakfast out of town in Stanford in the Overberg. Nothing more relaxing.

Pictured: Danni Davies, and our bottle of Pongrácz. Not pictured: Matth Gair, Emily Davies, Shannon Morreira, James Reeler, and yours truly.

phoblogged image

phoblogged image

Phoblog: The mighty Stephen Hawking

You can't really make him out, but that is totally my man MC Hawking up there on stage. He's presenting a talk entitled Universe. It's awesome.

phoblogged image

May GeekDinner

As I said on the GeekDinner announcement list:

Since the last GeekDinner was held at the end of March, and since we hold the GeekDinners bimestrially, it seems we are due another one at the end of May. This is the eightthhth GeekDinner, and we're calling it "Happy Habanero". What with the habanero being the national vegetable of Azerbaijan, we're going to hold the dinner on Azerbaijan's Republic Day, which, according to wikipedia, is Wednesday, May 28th.

The venue for this dinner is Mel's Kitchen, in Rondebosch Village, just off Klipfontein Road.

As usual, you can sign up, and check on the other details, on the wiki page.

We're now in our second year of GeekDinners, and they seem to be going strong. We have a good model, mostly sustainable, although it is slightly dependent on the core group of organisers to get things moving. We have a solid set of regular attendees that should provide the dinners with enough momentum to continue, though, should anything happen, and I'm very positive about the future of the dinners. We're also managing, for the most part, to keep talks short and interesting to all comers - I know that newcomers always worry that everything's going to be "too hardcore techie", but honestly, it's not about microchips and "ones and zeroes". My favourite talks have been about hippies and the buttons on car radios. So, please, if you haven't been before, why not come along, meet some new people, share some free wine, and enjoy some excellent food.

The slideshow karaoke has become a regular feature of our dinners, and is always one of the most entertaining parts. The way it works is, somebody prepares a set of slides on any topic they want (we've had "Etiquette when dealing with British Royalty", "Common problems with cement tiles", and "A primer on lesser known Norse gods"). Somebody else then presents a talk based on these slides without any prior knowledge of the topic, or of the content of the slides - always to amusing effect. This time, Darb is preparing the slides, and we have yet to find a volunteer to present them. If you're keen, do volunteer. If not, maybe you have something interesting you'd like to talk about anyway - we have no volunteers for speakers yet.

If I've sold you, sign up on the wiki, and we'll see you there!

Phoblogging

An apology

The sharp-eyed among you will have noticed that my blog has suddenly become one of those sites. I apologise for flooding your feed readers with pictures of seals, but let me explain.

A justification

The whole "let me post random photos from my life on my blog" thing was more an exercise in "how easy would it be to make a phoblog?" than a desire to share what my shoes look like. I will admit that when I took the sunset photo, I thought "this would be a really good thing to share with the world", because, let's admit it, Cape Town is one ridiculously beautiful city, and people need to hear that. But that got me thinking how easy it would be to make a photo shareable, and here is what I came up with.

An explanation

There is actually a function on my phone labelled "blog this", but I think it sends the image (or whatever) to a Sony-sponsored blogging site, and I'm frankly not interested in that. I wanted to solve the problem academically, for the general case, and as a side effect solve it for my specific case - I run this blog in a Drupal instance on my own server hosted with Layered Tech.

A discussion

So, the various ways to get information from my phone to my server were MMS, email, some form of push to a web-page, or bluetooth/cable upload to a laptop/desktop which will send it on. The last option defeated the point - I wanted to be able to blog a photo from anywhere, using nothing but my phone. Using the web-page push is what the "blog this" function does, but for my specific case, I'd have to write a custom application for the phone, which was way more effort than I wanted to expend. Sending an MMS would require me to have a GSM modem listening somewhere to receive it, and had the added disadvantage of requiring that the images got resized down. So, it seems, the best way to get the information from my phone to my server was to simply send an email (with images attached).

A technical discussion

The rest of this post describes the technical details of what happens to the email when it arrives at my server.

As an overview: I catch mail meant for the phoblog using a procmail recipe, and pipe the mail to a python script, which parses the message and pulls out the relevant parts, constructing the body text, creating thumbnails of the images and saving them in the right place. Having deconstructed the message and constructed the blog post, it passes the bits (title, body, and publication date, which it extracts from the EXIF information in the photos) to a PHP script, which hooks into the Drupal API and actually creates the blog post.

The PHP script is necessary, since there's no other way to hook into the Drupal API. I could do something like faking a bunch of HTTP GETs and POSTs, and passing the information in as if I was actually blogging it from the web interface, but that's even more clunky than simply piping it into a PHP script. The question then arises why I couldn't write the whole thing in PHP, and save myself the expense of running two scripts requiring two different interpreters, but frankly, trying to get PHP to do what is necessary would end in such an inelegant, ugly, hackish result that it just wouldn't be worth it.

An added advantage to separating the Python parser and the PHP script is that you can replace the PHP script with one that injects an entry into a different blogging platform, and it'll still work fine. So, somebody could write a script that talks to Wordpress, and simply drop it into place.


The injector (the PHP script)

The PHP script needs to hook into the Drupal API, so we first need to bootstrap into the Drupal environment. First we fake some HTTP headers in the $_SERVER array so that Drupal knows which site is being "requested" (Drupal does some clever multi-site stuff based on which URL is being requested). Then we change to the Drupal base directory (defined as a constant at the top), include the bootstrap code (also defined at the top), and then simply run the drupal_bootstrap() function:

<?php 
// Defined as a constant, could/should be passed as an option or loaded from a config file:
define('PHOBLOG_DRUPAL_URI', 'http://vhata.net/');
// Fairly standard for Drupal installations, but as above:
define('PHOBLOG_DRUPAL_ROOT', '/usr/share/drupal');
define('PHOBLOG_DRUPAL_BOOTSTRAP', 'includes/bootstrap.inc');

// Fake the necessary HTTP headers that Drupal needs:
$drupal_base_url = parse_url(PHOBLOG_DRUPAL_URI);
$_SERVER['HTTP_HOST'] = $drupal_base_url['host'];
$_SERVER['PHP_SELF'] = $drupal_base_url['path'].'/index.php';
$_SERVER['REQUEST_URI'] = $_SERVER['SCRIPT_NAME'] = $_SERVER['PHP_SELF'];
$_SERVER['REMOTE_ADDR'] = NULL;
$_SERVER['REQUEST_METHOD'] = NULL;

// Change to Drupal root dir.
chdir(PHOBLOG_DRUPAL_ROOT);

require_once(PHOBLOG_DRUPAL_BOOTSTRAP);
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

?>

Now we are running in a Drupal environment. The next step is to collect the information that we want to insert as a blog entry. We take the title and publish date from the arguments passed to the script, and then do a loop to read the body from standard input:

<?php 
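// The publish date and title arrive as command-line arguments:
$date = $_SERVER['argv'][1];
$subject = $_SERVER['argv'][2];

// The body arrives on standard input; read it line by line:
$fp = fopen('php://stdin', 'r');
$body = "";
while ($line = fgets($fp, 4096)) {
        $body .= $line;
}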

?>

I may be wrong, but I don't think there are any sanitization problems in the above code. Let me know if you can see any. I'm pretty sure I don't need to escape anything, since I pass all variables as-is to Drupal, which does full sanitization before using them. Anyway, the final step is to simply call the Drupal node_save() function to save the blog post as a node (passing it some default values):

<?php 
node_save((object)(array('created' => $date,
        'title' => $subject,
        'body' => $body,
        'teaser' => $body,
        'format' => '3',
        'uid' => 1,
        'type' => 'blog',
        'status' => 1,
        'comment' => 2,
        'promote' => 0,
        'sticky' => 0)));

?>

My only worry there is that I specify that the format is type 3 (unfiltered HTML) - this might leave the phoblogger open to code-injection exploits. I should probably specify type 1, filtered HTML, to make sure that nobody can accidentally blog something nasty.

So, that's the PHP script that injects the entry into Drupal. The other part of the system is, of course, the Python script that parses the email in the first place.

The parser (the Python script)

I'm not posting the full script here, for a number of reasons, mostly to do with it "not being finished yet". It works, but it doesn't do everything it should (including a complete security check, since that's kinda hard to implement on emails, which can be faked). Suffice it to say, it uses the optparse, ConfigParser and logging modules to be nicely configurable, runnable, and debuggable, and all that. But, yeah, I'm still embarrassed about it, and won't post source code until I think it's good-looking enough for public consumption. What I will post here is bits of Python code that demonstrate the actual meat of the thing - how I deconstruct and process the email that I receive.

The basic steps I perform are:

  1. Break up the email and extract the bits I need from it.
  2. Process each attachment part:
    • Text attachments get HTML-ified
    • HTML attachments get inserted as-is
    • Image attachments get thumbnailed, and the thumbnails and originals get stored somewhere web-accessible, and a chunk of HTML that references them gets created.
  3. Send the results of this processing to the injector script with the right subject and date.
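
The three steps above can be sketched roughly as follows. This is a minimal Python 3 illustration rather than the actual script (the snippets in this post are Python 2): the helper name `render_part`, the sample message, and the placeholder image markup are all assumptions, and the image branch only stands in for the thumbnailing described further down.

```python
from email.message import EmailMessage


def render_part(part):
    """Return an HTML fragment for one MIME part (hypothetical helper)."""
    maintype = part.get_content_maintype()
    if part.get_content_type() == "text/html":
        # HTML parts are inserted as-is.
        return part.get_content()
    if maintype == "text":
        # Plain text gets HTML-ified: newlines become line breaks.
        return part.get_content().replace("\n", "<br />\n")
    if maintype == "image":
        # Placeholder only: the real script thumbnails and stores the image.
        return '<img src="%s" alt="phoblogged image" />' % part.get_filename()
    return ""  # anything else is ignored in this sketch


# A made-up message with one text part and one (fake) image attachment.
msg = EmailMessage()
msg["Subject"] = "Camps Bay"
msg.set_content("Windy.\nBut pretty.")
msg.add_attachment(b"not really a jpeg", maintype="image",
                   subtype="jpeg", filename="sunset.jpg")

body = [render_part(part) for part in msg.iter_parts()]
print("".join(body))
```

The resulting list of fragments is what would then be handed to the injector along with the subject and date.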
Breaking up the email is trivial using the email module in python:

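import sys
import email
import email.header
import email.utils

# The mail arrives on standard input, piped in by procmail.
msg = email.message_from_file(sys.stdin)
# Decode any MIME-encoded words in the subject into a single unicode string.
subject = u''.join(unicode(part, encoding or 'us-ascii') for part, encoding in email.header.decode_header(msg.get('subject')))
# Keep only the address part of the From header.
msgfrom = email.utils.getaddresses([msg.get('from')])[0][1]
msgid = msg.get('message-id')

# For a multipart message, get_payload() returns the list of parts.
for piece in msg.get_payload():
   processpiece(piece)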

As you can see, no regular expressions needed to match headers, do MIME decoding, or break up an email address. You can even give it a list of all the different stupid formats for addresses that mail clients seem to use these days, and it will understand them:

>>> getaddresses(["jonathan@vhata.net", '"Jonathan III" <vhata@clug.org.za>', 'pope@vatican.org (Benedict)'])
[('', 'jonathan@vhata.net'),
 ('Jonathan III', 'vhata@clug.org.za'),
 ('Benedict', 'pope@vatican.org')]

I break each attachment up and send them to the processpiece() function one at a time.
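
For anyone following along on Python 3, the same header handling and splitting might look roughly like this. It is only a sketch: the raw message is made up, and it assumes the `policy.default` parsing policy, which decodes encoded headers for you, and uses `walk()` rather than `get_payload()` so that nested multipart containers are skipped.

```python
import email
from email import policy
from email.utils import getaddresses

# A made-up message; the real script reads the mail from standard input.
RAW = (
    "From: \"Jonathan III\" <vhata@clug.org.za>\n"
    "Subject: =?utf-8?q?Caf=C3=A9_sunset?=\n"
    "Message-ID: <1234@example.org>\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=\"XX\"\n"
    "\n"
    "--XX\n"
    "Content-Type: text/plain; charset=\"us-ascii\"\n"
    "\n"
    "Taken from the beach.\n"
    "--XX\n"
    "Content-Type: image/jpeg\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Disposition: attachment; filename=\"sunset.jpg\"\n"
    "\n"
    "/9j/4AAQ\n"
    "--XX--\n"
)

msg = email.message_from_string(RAW, policy=policy.default)
subject = str(msg["subject"])  # the policy decodes the encoded words
msgfrom = getaddresses([str(msg["from"])])[0][1]

# walk() visits every part; skipping containers leaves the actual pieces.
pieces = [p for p in msg.walk() if not p.is_multipart()]
print(subject, msgfrom, [p.get_content_type() for p in pieces])
```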

Inside the processpiece() function, I can get at the content-type of the chunk I'm processing by using the get_content_type() method:

>>> piece.get_content_type()
'image/jpeg'
>>> piece.get_content_maintype()
'image'
>>> piece.get_content_subtype()
'jpeg'

and I can use this to work out what I want to do with the chunk. I can also get the chunk in its raw form (i.e. decoded from the MIME transport encoding that email uses) by simply calling get_payload() on it with decode=True:

payload = piece.get_payload(decode=True)

If it's text, I simply replace all the newlines with HTML line breaks:

payload = payload.replace("\n", "<br />\n")
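
One caveat with a bare replace is that any `<` or `&` in the text part ends up in the post as markup. A slightly more defensive variant, sketched here in Python 3 with a hypothetical helper name `text_to_html`, escapes the text before adding the line breaks:

```python
import html


def text_to_html(text):
    """Escape HTML-special characters, then turn newlines into <br /> tags."""
    return html.escape(text).replace("\n", "<br />\n")


print(text_to_html("1 < 2\nfish & chips"))
```

Escaping first matters: doing it after the replace would also escape the `<br />` tags that were just inserted.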

The difficult case is, of course, when it's an image. Here, I use the Python Imaging Library to process the image. I extract the EXIF timestamp and turn it into a datetime structure, so that I can create a hierarchical directory tree to store the images. Then, I construct a thumbnail filename and create the thumbnail:

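import datetime
import os
import StringIO

import Image  # Python Imaging Library

# TARGETDIR, THUMBSIZE and EXIF_DATETIME are constants defined elsewhere in the script.
payload = piece.get_payload(decode=True)
image = Image.open(StringIO.StringIO(payload))

# EXIF timestamps look like "2008:05:17 18:03:22".
timestamp = datetime.datetime.strptime(image._getexif()[EXIF_DATETIME], "%Y:%m:%d %H:%M:%S")
self.entrystamp = timestamp

# Store images in a year/month/day directory tree.
targetdir = "%04d/%02d/%02d" % (timestamp.year, timestamp.month, timestamp.day)
try:
   os.makedirs("%s/%s" % (TARGETDIR, targetdir), 0755)
except OSError:
   # The directory probably exists already.
   pass

# Normalise the extension to lower case and derive the thumbnail name.
fname = piece.get_filename()
(rootname, ext) = os.path.splitext(fname)
ext = ext.lower()
fname = "%s%s" % (rootname, ext)
thumbname = "%s-thumb%s" % (rootname, ext)

# Save the original, then a thumbnail made from a copy of it.
image.save("%s/%s/%s" % (TARGETDIR, targetdir, fname))
os.chmod("%s/%s/%s" % (TARGETDIR, targetdir, fname), 0644)
image2 = image.copy()
image2.thumbnail([THUMBSIZE,THUMBSIZE])
image2.save("%s/%s/%s" % (TARGETDIR, targetdir, thumbname))
os.chmod("%s/%s/%s" % (TARGETDIR, targetdir, thumbname), 0644)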

Then I return a templated chunk of text to dump into the blog post. Easy as pie.
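
The post does not show the template itself, so the following is purely illustrative: a Python 3 sketch of how such a chunk could be built with `string.Template`. The base URL, the markup, the alt text, and the helper name `image_html` are all assumptions.

```python
import html
from string import Template

# Hypothetical values: the post does not show the real template or base URL.
BASEURL = "/phoblog"
IMAGE_TEMPLATE = Template(
    '<a href="$base/$dir/$full"><img src="$base/$dir/$thumb" alt="$alt" /></a>'
)


def image_html(targetdir, fname, thumbname, alt="phoblogged image"):
    """Build the HTML fragment linking a thumbnail to its full-size image."""
    return IMAGE_TEMPLATE.substitute(
        base=BASEURL,
        dir=html.escape(targetdir),
        full=html.escape(fname),
        thumb=html.escape(thumbname),
        alt=html.escape(alt),
    )


print(image_html("2008/05/17", "sunset.jpg", "sunset-thumb.jpg"))
```

Escaping the filename is worthwhile here because it comes straight from the incoming email.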

The last step is to pipe the individually formatted pieces to the injector script, passing it the date (extracted from the EXIF information above) and subject as parameters:

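import subprocess

# strftime("%s") gives seconds since the epoch; it is a platform extension, not standard C.
injector = subprocess.Popen([ADDCMD, entrystamp.strftime("%s"), "Phoblog: %s" % subject],stdin=subprocess.PIPE)
for piece in body:
   injector.stdin.write(piece)
# communicate() closes stdin and waits for the injector to finish.
injector.communicate()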

And off it goes.

Some concerns

First and foremost, security is a problem. If I'm sending an email from my phone, anybody can send the same email from their own phone - there is no identification in the email. One way around this would be to require a keyword in the subject before accepting it. This is security by obscurity - anybody who gets hold of the keyword will be in. I can decrease this risk by forcing some sort of hash on the keyword. For example, if the keyword was "pilates", I could require that the number of letters in the name of the current day be appended to that: "pilates6" on a Sunday, "pilates7" on a Tuesday. This slightly decreases the risk, but not much. There are other, even cleverer variations on this theme, but they are all basically just security by obscurity. A better way would be to use authenticated SMTP, and only accept phoblog messages that were authenticated through my own SMTP server, and I think I might implement this, unless I can think of a flaw in the idea.
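
To make the keyword idea concrete, here is a small Python 3 sketch. It counts the letters in the weekday's name, which is what the two examples work out to (6 for Sunday, 7 for Tuesday). The function names are hypothetical, and, as noted above, this remains security by obscurity rather than real authentication.

```python
import datetime

KEYWORD = "pilates"  # the example keyword from the text
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def expected_token(day):
    """Keyword followed by the number of letters in the weekday's name."""
    weekday = DAY_NAMES[day.weekday()]  # date.weekday(): Monday is 0
    return "%s%d" % (KEYWORD, len(weekday))


def subject_is_authorised(subject, day):
    """True if the subject contains the day's token as a separate word."""
    return expected_token(day) in subject.split()


sunday = datetime.date(2008, 6, 8)
print(expected_token(sunday))
print(subject_is_authorised("Camps Bay pilates6", sunday))
```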

Another problem is that I might lay myself open to HTML/javascript/etc injections, but I think this will be allayed if I solve the problem above.

A conclusion

This has been a somewhat rambling, somewhat disjointed explication, but I hope it gives you the general gist of what I did. If I ever look at the script again, maybe I'll fix it up properly, and make it publicly available. I even registered phoblog.za.net but that's taking some time. Meantime, enjoy piccies.

Phoblog: Camps Bay

My sister and I went to Camps Bay for the afternoon. It was extremely windy, and we both got thoroughly exfoliated, but at the end of it, we had this magnificent sunset as a reward. I don't think there's a prettier beach in the world than our west coast.

phoblogged image

phoblogged image

Phoblog: Lord Bo

Siamese cats are born snobby. This is Bo, and he's been sick, and looks it, but he still manages to carry himself with an air of utter aloofness.

phoblogged image

phoblogged image

Phoblog: Seal at the Waterfront

There's a little platform in the Waterfront harbour with steps up to it from the water, so the seals can climb up and sun themselves. This guy spent ten minutes grooming himself for us.

phoblogged image

phoblogged image

phoblogged image

Theremin

I want a theremin.

There, I said it. Why do more people not know about these things? The theremin was "one of the earliest electronic musical instruments, and the first musical instrument played without being touched". Basically, it has two aerials which detect how close the player's hands are - one controls the pitch of the note, and one controls the intensity. So, if you wave at it, it'll squeal back at you. If you've ever listened to music and waved your hands in a sort of "visualisation" of the song, you've got an inkling of what it's like to play the theremin, although, of course, I'm sure it's closer to trying to play the trombone. This should make it clear how it works.

If you don't think you've heard a theremin before, think again: they've been used in songs by Muse, The Pixies, Portishead, the Rolling Stones, System of a Down, Led Zeppelin, and the Beach Boys, among others.

A little known fact is that some of the most prolific players of the theremin are, in fact, cats. It's true! Humans can do a pretty decent job too, though, as this rendition of Gnarls Barkley's song "Crazy" shows. Predictably, however, our new machine overlords manage to take our achievements and do a much better job.

In conclusion, if you simply refuse to buy me a fez, I will graciously accept a theremin instead.
