Muffinresearch Labs by Stuart Colville

PHP: Multiple DNS Queries Using fopen

Posted in Code on 12th August 2009, 6:03 pm by Stuart

Whilst working on some inherited PHP code that used fopen, I noticed an interesting comment in the PHP manual pointing out that fopen makes a DNS lookup on every request. Take the following code as an example:

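<?php
$handle = fopen("http://muffinresearch.co.uk/robots.txt", "r");
if ($handle === false) {
    die("Unable to open URL"); // fopen returns false on failure
}
$contents = stream_get_contents($handle); // PHP5+ ONLY
fclose($handle);
echo $contents;
?>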

Capturing the traffic with Wireshark and calling that script three times, I saw three DNS lookups, because fopen doesn’t make use of any DNS lookup cache:

[Figure: Wireshark dialogue showing 3 DNS queries for muffinresearch.co.uk]

This isn’t only a problem for fopen; I found the same behaviour with file_get_contents.
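For comparison, here is a minimal sketch of the equivalent fetch with file_get_contents. The helper name fetch_body and the 5 second timeout are assumptions made for illustration only; they are not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): fetch a URL or local
// path with file_get_contents and return null instead of false on failure.
function fetch_body($url)
{
    // Assumed 5 second timeout so a failed lookup does not block for long.
    $context = stream_context_create(array(
        'http' => array('timeout' => 5),
    ));
    $contents = @file_get_contents($url, false, $context);
    return $contents === false ? null : $contents;
}

$body = fetch_body("http://muffinresearch.co.uk/robots.txt");
echo $body === null ? "Fetch failed\n" : $body;
?>
```

Like fopen, this goes through PHP’s stream wrappers, which is consistent with it showing the same lookup behaviour.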

The comment in the manual suggests using gethostbyname, which uses the DNS cache. You can then pass the resulting IP address to fopen in place of the hostname. However, as soon as you try to fetch something that is served from a name-based virtual host, this approach fails. Several sites on the same server share one IP address, so if you contact the server by IP address alone it has no hostname to match against and will simply serve content from the default virtual host (on Apache, the first one loaded, which is usually the conf that happens to be first alphabetically).
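The gethostbyname approach can be sketched roughly as follows. The url_with_ip helper is hypothetical, and sending an explicit Host header through a stream context is my own assumption about one possible way round the virtual host problem for plain HTTP; it is not something the manual comment proposes.

```php
<?php
// Hypothetical helper: swap the hostname in a URL for an IP address.
function url_with_ip($url, $ip)
{
    $parts = parse_url($url);
    $port  = isset($parts['port']) ? ':' . $parts['port'] : '';
    $path  = isset($parts['path']) ? $parts['path'] : '/';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';
    return $parts['scheme'] . '://' . $ip . $port . $path . $query;
}

$url  = "http://muffinresearch.co.uk/robots.txt";
$host = parse_url($url, PHP_URL_HOST);

// gethostbyname returns the IPv4 address, or the unmodified hostname on failure.
$ip = gethostbyname($host);

if ($ip === $host) {
    echo "Could not resolve $host\n";
} else {
    // Assumption: request by IP but send the original hostname in the Host
    // header, so a name-based virtual host can still be matched.
    $context = stream_context_create(array(
        'http' => array('header' => "Host: $host\r\n", 'timeout' => 5),
    ));
    $handle = @fopen(url_with_ip($url, $ip), "r", false, $context);
    if ($handle === false) {
        echo "Unable to open URL\n";
    } else {
        echo stream_get_contents($handle);
        fclose($handle);
    }
}
?>
```

Without the Host header the request carries only the IP address, which is the case described above where the server falls back to its default virtual host.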

A Solution

The cURL library (php5-curl is the package you’ll need on Ubuntu) is the preferred way of fetching content with PHP, mainly because it gives you far greater control over requests.

The other big benefit of using cURL is that it makes use of the DNS cache, so we can save a DNS lookup on repeated calls to the same hostname:

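<?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://muffinresearch.co.uk/robots.txt");
    // Return the response as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch);
    if ($output === false) {
        // curl_exec returns false on failure; curl_error describes why.
        echo "cURL error: " . curl_error($ch);
    } else {
        echo $output;
    }
    curl_close($ch);
?>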

Running that script three times now results in only one DNS request (I manually cleared the DNS cache with sudo /etc/init.d/networking restart beforehand).

[Figure: Wireshark dialogue showing 1 DNS query for muffinresearch.co.uk]
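If you don’t have Wireshark to hand, cURL can report how long name resolution took for each transfer. The sketch below is not the test I ran above: it reuses a single handle within one script run, and the helper name fetch_with_lookup_times and the 10 second timeout are assumptions for illustration.

```php
<?php
// Hypothetical helper: fetch each URL with one shared cURL handle and
// collect the name lookup time (in seconds) reported for each request.
function fetch_with_lookup_times(array $urls)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10); // assumed overall timeout

    $times = array();
    foreach ($urls as $url) {
        curl_setopt($ch, CURLOPT_URL, $url);
        if (curl_exec($ch) === false) {
            $times[] = null; // request failed, no timing recorded
            continue;
        }
        // Time until name resolving was complete for the last transfer.
        $times[] = curl_getinfo($ch, CURLINFO_NAMELOOKUP_TIME);
    }
    curl_close($ch);
    return $times;
}

$url = "http://muffinresearch.co.uk/robots.txt";
foreach (fetch_with_lookup_times(array($url, $url, $url)) as $i => $seconds) {
    $shown = $seconds === null ? "failed" : $seconds . "s";
    echo "Request " . ($i + 1) . ": name lookup " . $shown . "\n";
}
?>
```

A lookup time that drops to near zero after the first request would suggest the later requests did not need a fresh DNS query, though a packet capture is still the more direct evidence.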

Conclusion

If you’re using PHP to fetch data from the web, cURL is a much more powerful solution than relying on fopen or file_get_contents. What’s more, if you’re frequently fetching a lot of data from the same hosts, your scripts will run faster because they make only the minimum number of DNS requests.

Comments

1. On August 12th, 2009 at 6:28 pm Yoan said:

Yet another good reason to use cURL; thanks for the demo.

2. On August 13th, 2009 at 8:46 am Stuart Colville said:

@Yoan: Exactly!

3. On August 18th, 2009 at 12:53 pm milkfilk said:

nscd restart will also clear the DNS. Or “nscd -i hosts” (invalidate hosts). Sometimes it’s better in case networking drops your SSH session or does something else you didn’t mean to.

Cool blog, was looking at your Puppet post.

4. On July 1st, 2010 at 12:59 am Drupal developer said:

First time to see the difference of the above methods.
Thanks.











© Copyright 2004-10 Stuart Colville, all rights reserved.