Saturday, February 20, 2010

Why you should use an autoload function in PHP

Managed languages like Java and C# don't make you think much about loading classes, because class loading is built into the platform. C and C++ programmers, on the other hand, have always had to guard against accidentally including the same file more than once in a build. An easy way around that is to wrap each include in
#ifndef CONSTANT
#include "myfile.h"
#endif
and placing
#ifndef CONSTANT
#define CONSTANT
#endif
into 'myfile.h'. This is a good system for a compiled language that only needs to evaluate these expressions once at build time.

PHP doesn't use this method because it has the handy include_once and require_once statements, which prevent you from loading the same file more than once. But unlike a compiled language, PHP re-evaluates these expressions at run time, every time a file containing one or more of them is loaded. That is where the wonderful little __autoload function, together with the autoload support in the Standard PHP Library (SPL) introduced in PHP 5, comes in to enhance the speed and uniformity of your PHP code.

__autoload is a magic function that you define yourself. PHP calls it whenever your code needs a class that has not been loaded yet, giving you one last chance to load it.

If you define the __autoload function like so,
function __autoload ($classname)
{
    require('/path/to/my/classes/'.$classname.'.php');
}
you no longer need to add
require_once('/path/to/my/classes/MyClass.php');
into your files, because the first time that PHP encounters
$mine = new MyClass();
or
MyClass::staticMethodCall();
it will automatically call the __autoload function that you defined earlier, as if you had written
__autoload('MyClass');
PHP doesn't do this EVERY time it encounters these calls, just the first time. Thus, you no longer need to add the require_once('/path/to/my/classes/MyClass.php'); to any files at all.
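To see the "only the first time" behaviour for yourself, here is a minimal sketch. It registers the loader through spl_autoload_register rather than defining __autoload, because __autoload was removed in PHP 8.0. The temporary directory, the DemoGreeter class and the $loaderCalls counter are all made up for the demonstration.

```php
<?php
// Hypothetical demo setup: write one class file into a throwaway directory.
$classDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/DemoGreeter.php',
    '<?php class DemoGreeter { public static function hello() { return "hello"; } }'
);

$loaderCalls = array();

// Does the same job as the __autoload function above, but also records each call.
spl_autoload_register(function ($classname) use ($classDir, &$loaderCalls) {
    $loaderCalls[] = $classname;
    $file = $classDir . '/' . $classname . '.php';
    if (is_file($file)) {
        require $file;
    }
});

$first  = new DemoGreeter();    // class unknown: PHP calls the loader
$second = new DemoGreeter();    // class already defined: no loader call
$text   = DemoGreeter::hello(); // still no loader call

echo count($loaderCalls), "\n"; // prints 1
```

The loader runs once for DemoGreeter, no matter how many times the class is used afterwards.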

Why is __autoload a good thing?

The primary reason is that it improves the performance of your scripts: PHP no longer has to check whether a file has already been loaded, as it does every time you call require_once or include_once. Moreover, you no longer have to load a class file just because you MIGHT need it during the execution of your script, because PHP will let you know if it is needed, when it is needed.
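The second point can be sketched like this, again with spl_autoload_register standing in for __autoload. The PdfReport and CsvReport classes and the $format switch are hypothetical. Only the class that is actually instantiated gets its file loaded.

```php
<?php
// Hypothetical demo setup: two class files, only one of which will be needed.
$classDir = sys_get_temp_dir() . '/autoload_lazy_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/PdfReport.php',
    '<?php class PdfReport { public function render() { return "pdf"; } }'
);
file_put_contents(
    $classDir . '/CsvReport.php',
    '<?php class CsvReport { public function render() { return "csv"; } }'
);

spl_autoload_register(function ($classname) use ($classDir) {
    $file = $classDir . '/' . $classname . '.php';
    if (is_file($file)) {
        require $file;
    }
});

$format = 'csv';
$report = ($format === 'pdf') ? new PdfReport() : new CsvReport();
$output = $report->render();

// Passing false as the second argument asks class_exists() not to trigger autoloading.
$pdfLoaded = class_exists('PdfReport', false);
$csvLoaded = class_exists('CsvReport', false);

var_dump($pdfLoaded, $csvLoaded); // bool(false), bool(true)
```

With a block of require_once lines at the top of the script, both files would have been read and compiled regardless of which branch ran.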

Of course, if you are sure that a class is not yet loaded, and that you will positively need that class during the execution of your script, you should by all means use the require() function to include your file. But from personal experience this is something that rarely happens among files that contain classes. For instance if you have a class that extends another class, you know for sure that the other class will be needed, but do you know for sure that it has not already been loaded? Usually not, because typically, you would be extending that parent class with at least one other child class. But I guess this is not always the case, so you should do what you think is best.

Advanced Usage

__autoload also makes it possible to change the include directory for a class based on some identifier in the class name,
function __autoload ($classname)
{
    if (strpos($classname, 'MyNamespace') !== false) //haystack first, then needle
    {
         require('/some/other/path/to/my/classes/'.$classname.'.php');
    }
    else
    {
         require('/path/to/my/classes/'.$classname.'.php');
    }
}
or translate a class name into a file path

function __autoload ($classname)
{
    //you could also replace '\\', if you are using namespacing in PHP 5.3 or greater
    require('/path/to/my/classes/'.str_replace('_', '/', $classname).'.php');
}
There are more techniques along the same lines, such as trying several file extensions.
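Both ideas can be combined. The following is only a sketch: the helper names demo_class_base_dir and demo_class_file are invented for illustration, the directories are the placeholder paths used above, and the loader is registered with spl_autoload_register because __autoload no longer exists in PHP 8.0 and later. Keeping the name-to-path logic in plain functions makes it easy to check without touching the file system.

```php
<?php
// Hypothetical helper: pick a base directory from an identifier in the class name.
function demo_class_base_dir($classname)
{
    if (strpos($classname, 'MyNamespace') !== false) {
        return '/some/other/path/to/my/classes/';
    }
    return '/path/to/my/classes/';
}

// Hypothetical helper: turn underscores and namespace separators into directory separators.
function demo_class_file($classname)
{
    $relative = str_replace(array('\\', '_'), '/', ltrim($classname, '\\'));
    return demo_class_base_dir($classname) . $relative . '.php';
}

spl_autoload_register(function ($classname) {
    $file = demo_class_file($classname);
    if (is_file($file)) {
        require $file;
    }
});

echo demo_class_file('MyNamespace_Db_Connection'), "\n";
// prints /some/other/path/to/my/classes/MyNamespace/Db/Connection.php
echo demo_class_file('Util\\Text\\Slug'), "\n";
// prints /path/to/my/classes/Util/Text/Slug.php
```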

What if I need more than one __autoload function in my script?

One of the greatest things about SPL is that it provides a way to define more than one autoload function, using spl_autoload_register. Once you register anything with spl_autoload_register, PHP stops calling __autoload on its own, so if you already have an __autoload function you will need to register it too, before registering any additional functions.
spl_autoload_register('__autoload');
spl_autoload_register('my_other__autoload');
Of course, if you do this, your autoloaders should either check that the file exists in the expected path before requiring it, or use include instead of require. Otherwise a missing file triggers a fatal error in the first loader and the next one never gets called. Additionally, spl_autoload_register accepts any 'callable' value, meaning that you can use a method from a class as an autoload function as well.
//for a static method
spl_autoload_register(array('MyAlreadyLoadedClass', 'autoloader'));
or
//and for a method of an instantiated object
spl_autoload_register(array($object, 'someAutoLoader'));
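Here is a small sketch of two chained loaders registered as object methods. The DemoDirectoryLoader class, its labels and the temporary directories are hypothetical. Each loader checks that the file exists before requiring it, so a miss in the first loader simply falls through to the second.

```php
<?php
// Hypothetical loader class: one instance per directory.
final class DemoDirectoryLoader
{
    public static $trace = array(); // records which loaders were asked, in order

    private $label;
    private $dir;

    public function __construct($label, $dir)
    {
        $this->label = $label;
        $this->dir   = $dir;
    }

    public function load($classname)
    {
        self::$trace[] = $this->label;
        $file = $this->dir . '/' . $classname . '.php';
        if (is_file($file)) {
            require $file; // safe: we only get here if the file exists
        }
    }
}

// Demo setup: the class file lives only in the second directory.
$dirA = sys_get_temp_dir() . '/autoload_a_' . uniqid();
$dirB = sys_get_temp_dir() . '/autoload_b_' . uniqid();
mkdir($dirA);
mkdir($dirB);
file_put_contents($dirB . '/OnlyInB.php', '<?php class OnlyInB {}');

spl_autoload_register(array(new DemoDirectoryLoader('A', $dirA), 'load'));
spl_autoload_register(array(new DemoDirectoryLoader('B', $dirB), 'load'));

$object = new OnlyInB();

echo implode(',', DemoDirectoryLoader::$trace), "\n"; // prints A,B
```

Loaders are tried in the order they were registered, and PHP stops as soon as one of them has defined the class.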

So, what if I just need a simple autoloader?

There is an awesome feature in SPL that allows you to tell PHP where to look for class files by default. Once you call spl_autoload_register() with no arguments, every time the PHP runtime encounters a class that is not yet loaded it calls the spl_autoload function, which in turn looks in the include_path for files named after the lowercased class name. It uses the file extensions defined by the spl_autoload_extensions function, with .inc and .php set by default.

So, how do I use this to create a simple loader? If your classes are in files named after the class (in lowercase, which matters on a case-sensitive file system), and they are all in the same folder, simply add that folder to the include_path and register the default loader
set_include_path(get_include_path() . PATH_SEPARATOR . '/path/to/my/classes/');
spl_autoload_register();
Now every time PHP encounters a class that is not yet loaded, it calls spl_autoload, which looks in each of the include_path folders for a file named myclass.inc or myclass.php. This method may be slightly faster than a hand-written __autoload function, because it is native to the PHP runtime. And if you want spl_autoload to look for an additional file extension, just call
spl_autoload_extensions(spl_autoload_extensions() . ',.class.php');
And spl_autoload will look for files that also end with .class.php.
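Put together, a run of the default loader might look like the sketch below. The temporary directory and the DemoInvoice class are made up. Note two details: the file name is lowercase, because spl_autoload lowercases the class name before searching, and spl_autoload_register() is called with no arguments so that spl_autoload becomes the active loader.

```php
<?php
// Hypothetical demo setup: one class stored under a lowercase, custom-extension file name.
$classDir = sys_get_temp_dir() . '/spl_autoload_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/demoinvoice.class.php',
    '<?php class DemoInvoice { public function total() { return 42; } }'
);

// Tell PHP where to look and which extensions to try.
set_include_path(get_include_path() . PATH_SEPARATOR . $classDir);
spl_autoload_extensions(spl_autoload_extensions() . ',.class.php');

// With no arguments, this registers the built-in spl_autoload function.
spl_autoload_register();

$invoice = new DemoInvoice();
$total   = $invoice->total();

echo $total, "\n"; // prints 42
```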

SPL is chock full of goodies, but the autoload functionality is, in my opinion, one of its most useful additions. Depending on how many files your scripts include, you may well see a performance improvement from using these methods.

4 comments:

  1. Hello fella.

    Dude I must thank you for this excellent post. Really. I was struggling for find some useful information like you just posted here.

    Thanks a lot. Again

  2. Auto-load does slow things down a bit, but for the gains it's very much worth it. I tried writing a framework without auto-load, and it was no fun trying to include all the dependent files, especially when you have namespaces. Also, when auto-loading, be sure to use require instead of require_once.

  3. I completely disagree and am firmly against using auto loaders. It does seem like I stand alone in this though.

    Your arguments for using them are both related to bad programming practice in my mind. You said, "it improves the performance of your scripts by preventing PHP from checking if the file has already been loaded or not" and also that, "you no longer have to load a class file just because you MIGHT need it during the execution of you script".

    Neither of these should ever happen, though. As a programmer, it's my responsibility to know what's loaded and what's needed. If my code includes the same file in two different situations, then I should probably fix whatever caused that situation in the first place; it's probably a bug.

    I should also know if I used a class. Including a class just in case I might need it someday is silly.

    Auto loaders also add a level of obscurity to your code that is often confusing. If you're loading a class that is not in an 'include' or 'require' statement at the top of your code how do you know where it came from? This is particularly difficult to figure out in some of the large Zend Framework projects I've worked on.

    Maybe I'm crazy.

    Replies
    1. You are not crazy. When to use autoload depends on context. It has its tradeoffs; everything is a tradeoff in computer science and programming. You stated the tradeoff, which is worth knowing.
