07.2005

FEATURES
The Iterator Pattern: Making Manipulating Object Collections Easy, by Jason E. Sweat
PHP Library for Permissions Management: A Generic Permissions Management PHP Library, by Simone Grassi and Bernhard Gaul
Change Your Life with Version Control: An Introduction to Subversion, by Clay Loveless
Creating a Simple Image Gallery, by Martin Psinas

DEPARTMENTS
EDITORIAL: pear upgrade Home_Residence
WHAT’S NEW
TIPS & TRICKS: Input Filtering, Part 1: Why Filter?, by Ben Ramsey
TEST PATTERN: Not Just Nouns, by Marcus Baker
PRODUCT REVIEW: FPDF: PDF Generation Library, by Peter B. MacIntyre
SECURITY CORNER: Theory, by Chris Shiflett
Exit(0); Forget Viagra, Get a Regex!, by Marco Tabini

Download this month’s code at: http://www.phparch.com/code/
EDITORIAL
pear upgrade Home_Residence

Last month, my wife, daughter, and I moved out of our (what we’d come to refer to as ghetto) apartment, and into our first house. I hate moving. I hate packing every little thing I own into boxes, and disassembling the furniture. I hate trying to wedge the n-hundred pound sofa-bed out the all-too-narrow doors, and trying to move the refrigerator without breaking any of the ceramic tiles that make up the kitchen floor (“oops”).

Then, after many hours of what seemed like endless stair-climbing, box taping, keep-or-toss decision making, and one too many not-as-fun-as-it-sounds Tetris-like games of van and truck packing (“Can we get the rest in this trip? What if we move this box, and put the chairs in the other truck? I think we need one of those tall skinny pieces.”), the process is reversed, and we’re left with the joyous tasks of unloading, more stair climbing, stacking, new-paint scrape-avoidance, more narrow-door squeezing, trying to remember how to re-assemble the customer-assembled furniture, and a basement full of boxes that were poorly labeled (in haste).

To top it all off, my genius (and by “genius,” I mean moronic) telephone company somehow couldn’t figure out how to reconnect our phone properly, no matter how many times we “call[ed] back in three hours.” As a result, we spent a full week offline; it took me days to catch up on email.

Fortunately, I have wonderful friends and family, some of whom traveled over 1000 km to help us get the house ready (and visit us, of course), who worked for nothing more than pizza, cold beer, and a sincere “thank you.”

As much as I hate all things related to moving, and am glad it’s over, there’s a great joy that accompanies moving into our first house: our own first house. I see a parallel between upgrading our home, and upgrading my development and production environments.

PHP 5.1 is on the horizon, and while I truly hate the stress that upgrading a production environment brings (no matter how well tested), I’m always (ok, usually) left with a similar joy of a successful upgrade, better performance, and new features. As always, we’re developing in exciting times!

This month, we have a special treat for you: a chapter from our soon-to-be-released php|architect’s Guide to Design Patterns, by Jason Sweat. In it, he’ll show you the ins and outs of Iterators in PHP, whether self-constructed, or built on a foundation like PHP 5’s Standard PHP Library (SPL). The piece is literally full of code, and it’s sure to whet your appetite for more design pattern goodness.

Security Corner is back, rounding out our full lineup of columns, and Peter has reviewed the newest version of FPDF, a library that, if you haven’t used, you’ve probably heard of.

Happy reading! Now, I must go back to painting, landscaping, plastering, wiring, cleaning, organizing and unpacking.
php|architect
Volume IV - Issue 7
July, 2005

Publisher: Marco Tabini
Editorial Team: Arbi Arzoumani, Peter MacIntyre, Eddie Peloke
Graphics & Layout: Aleksandar Ilievski
Managing Editor: Emanuela Corso
News Editor: Leslie Hill, [email protected]
Authors: Marcus Baker, Bernhard Gaul, Simone Grassi, Clay Loveless, Peter B. MacIntyre, Martin Psinas, Ben Ramsey, Chris Shiflett, Jason E. Sweat

php|architect (ISSN 1709-7169) is published twelve times a year by Marco Tabini & Associates, Inc., P.O. Box 54526, 1771 Avenue Road, Toronto, ON M5M 4N5, Canada. Although all possible care has been placed in assuring the accuracy of the contents of this magazine, including all associated source code, listings and figures, the publisher assumes no responsibilities with regards of use of the information contained herein or in all associated material.

Contact Information:
General mailbox: [email protected]
Editorial: [email protected]
Subscriptions: [email protected]
Sales & advertising: [email protected]
Technical support: [email protected]

Copyright © 2003-2005 Marco Tabini & Associates, Inc. All Rights Reserved
July 2005
●
PHP Architect
●
www.phparch.com
What’s New

PHP 5.1 Beta 2
Php.net announces the release of PHP 5.1 beta2. "PHP 5.1 Beta 2 is now available! A lot of work has been put into this upcoming release and we believe it is ready for public testing. Some of the key improvements of PHP 5.1 include:
• PDO (PHP Data Objects) - A new native database abstraction layer providing performance, ease-of-use, and flexibility.
• Significantly improved language performance mainly due to the new Zend Engine II execution architecture.
• The PCRE extension has been updated to PCRE 5.0.
• Many more improvements including lots of new functionality & many bug fixes, especially in regards to SOAP, streams and SPL.
• See the bundled NEWS file for a more complete list of changes.
Everyone is encouraged to start playing with this beta, although it is not yet recommended for mission-critical production use."
Check out all the latest info at php.net.

phpBB 2.0.16
phpBB.com has released the latest version of their open source bulletin board package. What's new? Phpbb.com lists the changes as:
• Fixed critical issue with highlighting
• Url descriptions able to be wrapped over more than one line again
• Fixed bug with eAccelerator in admin_ug_auth.php
• Check new_forum_id for existence in modcp.php
• Prevent uploading avatars with no dimensions
• Fixed bug in usercp_register.php, forcing avatar file removal without updating avatar informations within the database
• Fixed bug in admin re-authentication redirect for servers not having index.php as one of their default files set
Visit phpbb.com for all the latest info.

FUDforum 2.6.14RC2
Fudforum.org announces their latest release: "The 2nd release candidate for 2.6.14 is now out, aside from a number of bug fixes few important developments were done as well.
• FUDforum can now make use of PDO Database driver for PHP 5.0/5.1 with support for MySQL, PostgreSQL and SQLite backends.
• FUDforum can now be installed on systems running PHP 5.1, the few BC changes introduced by this release are now being accommodated.
• The temporary table usage is now optional, which means forum install no longer requires this permission to be available."
To grab the latest release or for more info, visit fudforum.org.

phpReports 0.4.1
phpReports report generator has announced the latest release, version 0.4.1. According to the announcement, this release includes: "The setPageSize(size) and getPageSize() methods were added to the PHPReportMaker object. Now you can specify the page size using code like "$oRpt = new PHPReportMaker(); $oRpt->setPageSize(30);". This method overrides the XML value."
Visit http://phpreports.sourceforge.net/ for more information or to download.

AjaxAC 0.4.1
Do you have a project which requires the use of AJAX? Check out the latest release of AjaxAC, a "PHP framework which can be used to develop, create, and generate AJAX applications". According to the announcement, version 0.4.1 includes: "The ArithmeJax sample application was created. The JavaScript escape() was replaced with encodeURIComponent(). The hook name generator was changed to include __ in front of the hookname due to an IE6 compatibility error. The redundant AjaxAC class was removed and this functionality was moved to the AjaxACApplication class. All examples were updated to reflect the removal of the main AjaxAC class."
Grab the latest release from http://ajax.zervaas.com.au/

SOLAR 0.5.0
Solar.php announces the latest release of their "simple object library and application repository" version 0.5.0. paul-m-jones.com announces some of the highlights as:
• Unit tests for Solar_Base, _Cache, _Error, and _Locale
• End-user documentation (not just API docs) for those same classes, plus the overarching Solar class itself
For all the highlights, visit solarphp.com.
Check out some of the hottest new releases from PEAR.
XML_RPC 1.3.1 A PEAR-ified version of Useful Inc's XML-RPC for PHP. It has support for HTTP/HTTPS transport, proxies and authentication. This release is security related, and solves the recently discovered, and widespread remote-code-execution vulnerability. All users are strongly encouraged to upgrade immediately.
Translation2 2.0.0beta7
This class provides an easy way to retrieve all the strings for a multilingual site from a data source (i.e. db). The following containers are provided, more will follow:
• PEAR::DB
• PEAR::MDB
• PEAR::MDB2
• gettext
• XML
• PEAR::DB_DataObject (experimental)
It is designed to reduce the number of queries to the db, caching the results when possible. An Admin class is provided to easily manage translations (add/remove a language, add/remove a string). Currently, the following decorators are provided:
• CacheLiteFunction (for file-based caching)
• CacheMemory (for memory-based caching)
• DefaultText (to replace empty strings with their keys)
• ErrorText (to replace empty strings with a custom error text)
• Iconv (to switch from/to different encodings)
• Lang (resort to fallback languages for empty strings)
• SpecialChars (replace html entities with their hex codes)
• UTF-8 (to convert UTF-8 strings to ISO-8859-1)
Mail 1.1.5 PEAR's Mail:: package defines the interface for implementing mailers under the PEAR hierarchy, and provides supporting functions useful in multiple mailer backends. Currently supported are native PHP mail() function, sendmail and SMTP. This package also provides a RFC 822 Email address list validation utility class.
HTML_QuickForm_advmultiselect 0.4.0 The HTML_QuickForm_advmultiselect package adds an element to the HTML_QuickForm package that is two select boxes next to each other emulating a multi-select.
DB_ldap 1.1.1 The PEAR::DB_ldap class provides a DB compliant interface to LDAP servers.
php|architect Releases New Design Patterns Book
We're proud to announce the release of php|architect's Guide to PHP Design Patterns, the latest release in our Nanobook series. You have probably heard a lot about Design Patterns, a technique that helps you design rock-solid solutions to practical problems that programmers everywhere encounter in their day-to-day work. Even though there has been a lot of buzz, however, no-one has yet come up with a comprehensive resource on design patterns for PHP developers, until today. Author Jason E. Sweat's book php|architect's Guide to PHP Design Patterns is the first comprehensive guide to design patterns designed specifically for the PHP developer. This book includes coverage of 16 design patterns with a specific eye to their applications in PHP when building complex web applications, both in PHP 4 and PHP 5 (where appropriate, sample code for both versions of the language is provided). For more information, visit http://www.phparch.com/shop_product.php?itemid=96.
Looking for a new PHP Extension? Check out some of the latest offerings from PECL.
ibm_db2 1.0.2 This extension supports IBM DB2 Universal Database, IBM Cloudscape, and Apache Derby databases.
yaz 1.0.3 This extension implements a Z39.50 client for PHP using the YAZ toolkit.
pecl_http 0.9.0
• Building absolute URIs
• RFC compliant HTTP redirects
• RFC compliant HTTP date handling
• Parsing of HTTP headers and messages
• Caching by "Last-Modified" and/or ETag (with 'on the fly' option for ETag generation from buffered output)
• Sending data/files/streams with (multiple) ranges support
• Negotiating user preferred language/charset
• Convenient request functions built upon libcurl
• HTTP auth hooks (Basic)
• PHP5 classes: HttpUtil, HttpResponse, HttpRequest, HttpRequestPool, HttpMessage
runkit 0.3.0 Replace, rename, and remove user defined functions and classes. Define customized superglobal variables for general purpose use. Execute code in restricted environment (sandboxing).
TIPS & TRICKS
Input Filtering, Part 1: Why Filter? by Ben Ramsey
This year has seen an increased focus on PHP security, and this is good for the language, developers, and business community. One phrase that comes to mind when discussing secure coding practices is Chris Shiflett’s mantra of “filter input, escape output.” While we know what this means in a general sense, practical examples elude us, so for the next three months, Tips & Tricks will give practical suggestions for input filtering, chock full of code examples.
Filter input. What does that mean? Well, in short, it means what it says, but there’s something deeper hidden behind these words, something sinister. Yes, these words mean user input cannot be trusted. For that matter, no input, regardless of its source (forms, RSS feeds, cookies, etc.), is trustworthy. In fact, the level of distrust in input must be so high that you no longer accept anything from these sources at face value. Always verify the input data to ensure it’s the expected, genuine article.

But why is this so hard to do? Is it because we innately want to trust people and other sources? Heavens, no! It’s hard because programmers are naturally lazy. Filtering input means writing more code, writing smarter code. For those who wish to finish a project quickly, this is daunting, and so they quickly scribble down some code (if, in fact, code can be scribbled) and deploy a release hoping to catch the problems in later bugfix (sometimes called security) releases. This can, however, cause great problems in the meantime, not the least of which could consist of SQL injection or cross-site scripting (XSS)… or just plain bad data.

Ensuring against bad data through filtering input is what we’ll focus on over the next three installments of Tips & Tricks. So, come along with me, and before we’re finished, you’ll be cynical and distrustful with the best of them, no longer able to trust input of any kind, and, thus, security-conscious.
Why Filter Input?

Input is bad. In fact, it’s evil. Just get that through your head, and you’ll be off to a great start. Input is evil because its source cannot be trusted and the type of data expected is not always the type received, and all the client-side validation scripts in the world can’t stop input coming from another source completely unvalidated. What do I mean by “another source”? I mean another form on another Web site that makes use of your form (often referred to as a spoofed form) for some insidious means, or someone or some script posting by any number of alternative means.

Let’s take, for example, the form in Listing 1, which is located at the imaginary URL http://example.net/form.html. (We’ll continue to come back to this form during the next few months; don’t worry: the code will be included in each column.) Now, this is a form we’ve all seen; it asks for a name and contact information. No doubt, you’ve used a similar form in the past, and there’s nothing wrong with this form, but there are a few assumptions often made about it.

[Listing 1: the contact form at http://example.net/form.html]

One assumption is that the maxlength attribute of the fields prevents a user from entering more text than allowed. This is wrong. While a Web browser can correctly prevent a user from doing so through this particular form, there’s nothing to stop the re-creation of this form on another server and using it to submit a much longer string of data.

Another assumption is that the user may pick states only from among the options listed in the state drop-down field. Again, this is wrong and for the same reasons. The Web browser might prevent said user from entering other values when using this form, but if recreated, the sky’s the limit.

We’re starting to see a pattern emerge. A Web form/application is safe only when used properly. This is obvious. But if used improperly, then processing scripts can receive any and all kinds of input. Still, let’s look at two more assumptions about this form, just for the heck of it. This form has a set number of fields. Does that mean these are the only fields that can be submitted? No! Also, can we assume that the processing script (process_form.php in this case) can only receive submissions from this form? The answer, again, is no.

The form in Listing 2 illustrates why these assumptions are wrong. This form lives on another server, for example, at http://evil.example.net/form-spoof.html.

[Listing 2: the spoofed form at http://evil.example.net/form-spoof.html]

The first thing to notice about this form is that there are no maxlength attributes. Well, for one, these are hidden fields that don’t use the maxlength attribute, but that’s not important. The fields don’t have to be hidden, and, either way, a devious miscreant may enter as much data as he pleases. Secondly, the state field now has a value of “The Shire.” Wait a minute… that wasn’t in our option list, but it doesn’t matter because it’ll post just fine. Thirdly, this form includes a new field: the junk field. This doesn’t do much now, but consider a server where register_globals is enabled and variables aren’t initialized, and think about what it can do.

The Referer Question

Invariably, the question now arises: But what about the Referer? Yes, what about it? I can check it, right?

Sure, go ahead, but it’ll bite you in the end. It is a common misconception that every request includes a Referer header and that the value of this header always represents the origin of the request. In truth and practice, the origin of the request is always the client. The client can be a Web browser or it can be a script that resides on a server, somewhere. It may or may not choose to include a Referer header in requests. The Referer, when included, may or may not indicate the previously requested parent resource. In fact, some proxy servers have been known to modify or drop the Referer header altogether, thus blocking entire offices and even ISPs from viewing Web sites programmed to check for it.

All this amounts to the fact that Referer is highly unreliable as a means of protecting Web applications from outside posting. Furthermore, it is not as important to ensure input comes from a specific place as it is that the input received conforms to expectations. Nevertheless, we’ll take a look at how scripts use Referer to block requests from other sites:

    if (strcmp($_SERVER['HTTP_REFERER'], 'http://example.net/form.html') == 0) {
        // It came from the right place, so let's process it
    }

Now, this snippet of code will properly thwart a form such as the one in Listing 2 from posting to process_form.php, so long as the client includes a Referer header that doesn’t match, but mischievous users aren’t in the business of being foiled by clients. Let’s consider another means of posting and take a look at Listing 3.

Listing 3
    <?php
    // The opening lines are reconstructed: loading PEAR's HTTP_Request and
    // the target URL are assumptions based on the surrounding text.
    require_once 'HTTP/Request.php';

    $req = new HTTP_Request('http://example.net/process_form.php');
    $req->setMethod(HTTP_REQUEST_METHOD_POST);
    $req->addHeader('Referer', 'http://example.net/form.html');
    $req->addPostData('name', 'Gandalf the Grey');
    $req->addPostData('state', 'Middle-earth');
    $req->addPostData('email', 'Olorin I was in my youth');
    $response = $req->sendRequest();
    ?>

The code in Listing 3 is similar to that found in Listing 2 in that it posts to process_form.php from a different location and bypasses all the local constraints placed on it (e.g. maxlength and any client-side scripting). However, Listing 3 is different because it doesn’t rely on a Web browser and, thus, can modify any part of the request. In this case, PEAR::HTTP_Request generates a valid POST request, while adding a Referer header. Thus, the script successfully posts to process_form.php because it sends a valid Referer header with a value that process_form.php expects.
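To make the weakness concrete, here is a minimal sketch of the Referer check wrapped in a function that receives the server variables as a parameter, so it can be exercised without a live request. The function name referer_matches() and the three sample arrays are hypothetical and are not part of the original listings.

```php
<?php
// Hypothetical helper: true when the Referer header equals the expected URL.
function referer_matches(array $server, $expected)
{
    return isset($server['HTTP_REFERER'])
        && strcmp($server['HTTP_REFERER'], $expected) === 0;
}

$expected = 'http://example.net/form.html';

// A browser that really submitted the form on example.net.
$browser = array('HTTP_REFERER' => 'http://example.net/form.html');

// A script that simply claims to come from that form.
$forged = array('HTTP_REFERER' => 'http://example.net/form.html');

// A legitimate visitor behind a proxy that drops the header.
$proxied = array();

var_dump(referer_matches($browser, $expected)); // bool(true)
var_dump(referer_matches($forged, $expected));  // bool(true)
var_dump(referer_matches($proxied, $expected)); // bool(false)
```

On the server, the forged request is indistinguishable from the genuine one, while the honest visitor behind the proxy is turned away. The check fails in both directions, which is why the rest of the column concentrates on the input itself.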
Now You’re Getting It

And so, we must filter the input. It’s that simple. We cannot be sure the input comes from the proper location, nor are we sure it is exactly what we want. In fact, we’re pretty sure it’s not. Feeling distrustful yet? Good. Great, even. Do not trust input from users, from anywhere. This is why it’s important to ensure that input received is input expected.
[Listing 4: a sanitizing function applied to the $_POST array]

Listing 5
    <?php
    // The function signature and loop header are reconstructed; the name of
    // the first parameter ($input) is an assumption.
    function filter($input, $allowed) {
        $validated = array();
        foreach ($input as $key => $value) {
            if (in_array($key, $allowed)) {
                $validated[$key] = $value;
            }
        }
        return $validated;
    }

    $white_list = array(
        'name', 'street', 'city', 'state', 'postal', 'phone', 'email'
    );
    $clean = filter($_POST, $white_list);
    ?>
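As a rough illustration of how the field whitelist in Listing 5 behaves, the sketch below runs a spoofed submission through a filter() function of the same shape. The parameter names, the $spoofed_post array, and its sample values are assumptions made for this example. The third argument to in_array() requests a strict comparison; this is an addition to the listing, suggested because on PHP versions before 8 a loose comparison treats the integer key 0 as equal to any non-numeric string in the whitelist.

```php
<?php
// Field whitelist in the spirit of Listing 5 (parameter names are assumed).
function filter(array $input, array $allowed)
{
    $validated = array();
    foreach ($input as $key => $value) {
        // Strict comparison: a key must be identical to a whitelisted name.
        if (in_array($key, $allowed, true)) {
            $validated[$key] = $value;
        }
    }
    return $validated;
}

// Stand-in for $_POST as a spoofed form might fill it.
$spoofed_post = array(
    'name'  => 'Frodo Baggins',
    'state' => 'The Shire',
    'junk'  => 'unexpected field',
    0       => 'numeric field name',
);

$white_list = array(
    'name', 'street', 'city', 'state', 'postal', 'phone', 'email'
);

$clean = filter($spoofed_post, $white_list);
print_r(array_keys($clean)); // only "name" and "state" survive
```

Note that "The Shire" still gets through: a field whitelist limits which fields are accepted, not what they contain.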
The approach we’ll take to filter input is often called a “whitelist” approach (as opposed to a “blacklist” approach). Instead of using a blacklist to tell our script what kind of input we won’t allow (e.g. input coming from somewhere other than form.html, as in the Referer example), we’ll use a whitelist to tell it exactly what to allow. This is actually a much simpler approach because, now, we don’t have to think of the myriad kinds of data an attacker might try to submit to our script. Instead, we need only know what we want to receive and ensure that the received input matches up.

Capturing and Taming Input

Now, let’s talk about capturing some of this evil input. There are a few places we’ll consider looking for input: $_GET, $_POST, and $_COOKIE. We’ll not look in $_REQUEST, though it does contain the values from each of these superglobal arrays. In short, we want to know the exact scope of the input, so we’ll use the specific superglobal for the location we expect to find it. For example, $_REQUEST['name'] could refer to $_GET['name'], $_POST['name'], or even $_COOKIE['name'], so we want to be sure it’s coming from the correct location, which is POST in this case.

Luckily for us, PHP has already done the work of capturing the input. In process_form.php, the values passed from form.html (or from wherever the request was submitted) are in $_POST. But the data in $_POST, you’ll remember, is still evil data. We must first filter it.

There’s more than one way, however, to filter form input, and I won’t pretend that my suggestions are any more than what they are: suggestions. They are not the right way, but they are a way, and these tips are sure to help control input and provide a foundation on which to build. What’s important is to write code with a security-conscious mindset, and part of that mindset includes being wary of input.

Now, to keep track of our good data, we’ll store everything that’s considered clean (as in: it conforms to expectations) to the aptly named $clean array, which will somewhat mimic everything that’s in $_POST, without all the evil tendencies.

One approach that I often see is a sanitizing function that gets applied to the $_POST array, as seen in Listing 4. While this type of approach removes harmful characters, it does not provide a whitelist solution. Instead, it blacklists potentially harmful characters (control characters) and escapes the input (with htmlentities()), which is not a part of the filtering process. We’re only concerned with filtering the input at this point, so we want the raw data: filtered, but raw. Escaping will take place during the output stage, which isn’t covered here.

A whitelist approach defines the valid range of characters/numbers, the acceptable values (of a select field, for example), and the allowed fields. For now, let’s take a look at defining the allowed fields to ensure we receive and process nothing more than expected. Listing 5 gives a whitelist example for defining the allowed fields.

First, we use the $white_list array to define the allowed fields. Then, we run the $_POST array through the filter() function, using $white_list as a model. What’s returned to the $clean array is the expected input. Anything unexpected is left back in $_POST where it safely remains excluded from the rest of the script.

This is a very simple approach that does not include any further input checking, for now, though I hope it is evident how this approach adds a level of flexibility to the filtering process. For example, imagine a $post_white_list, $get_white_list, or even $rss_white_list. Now, it becomes clear that this simple example can expand to filter anything:

    $post_clean = filter($_POST, $post_white_list);
    $get_clean = filter($_GET, $get_white_list);
    $rss_clean = filter($rss, $rss_white_list);
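The same whitelist idea extends from field names to the acceptable values of a select field, as mentioned above. The following is only a sketch of that idea, not the approach the column goes on to take: the helper name filter_choice() and the list of state codes are hypothetical.

```php
<?php
// Hypothetical list of the values offered by the state drop-down.
$allowed_states = array('GA', 'NY', 'TN');

// Returns the value when it is one of the allowed options, null otherwise.
function filter_choice($value, array $allowed)
{
    if (is_string($value) && in_array($value, $allowed, true)) {
        return $value;
    }
    return null;
}

$clean = array();

// A value taken from the real option list is kept.
$clean['state'] = filter_choice('GA', $allowed_states);

// A value from a spoofed form is rejected.
$rejected = filter_choice('The Shire', $allowed_states);

var_dump($clean['state']); // string(2) "GA"
var_dump($rejected);       // NULL
```

Because the function compares against a fixed list, it never has to anticipate what an attacker might send; anything outside the list is simply not accepted.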
In next month’s column, I’ll revisit this same code and discuss strategies for defining the data type for each field.

Wrap Up

By now, you should be fully convinced that all input is evil and why it’s important to filter all incoming data. When it comes to input, there are no guarantees as to the origin of the data or the type received. Whether working with GET, POST, cookies, RSS feeds, or the like, always filter input, regardless. Tune in next month when we’ll wrestle more input to ensure input received is input expected.
About the Author

Ben Ramsey is a Technology Manager for Hands On Network in Atlanta, Georgia. He is an author, Principal member of the PHP Security Consortium, and Zend Certified Engineer. Ben lives just north of Atlanta with his wife Liz and dog Ashley. You may contact him at [email protected] or read his blog at http://benramsey.com/.

To Discuss this article: http://forums.phparch.com/238
The Iterator Pattern
by Jason E. Sweat, author of php|architect’s Guide to PHP Design Patterns

You have probably heard a lot about Design Patterns, a technique that helps you design rock-solid solutions to practical problems that programmers everywhere encounter in their day-to-day work. Even though there has been a lot of buzz, however, no-one has yet come up with a comprehensive resource on design patterns for PHP developers, until today. In this excerpt from Jason E. Sweat's book php|architect's Guide to PHP Design Patterns, you'll learn about the Iterator pattern, whether custom-built, or with PHP 5's new Standard PHP Library.
REQUIREMENTS
PHP: 5
OS: Any
Code Directory: iterator

Object-Oriented Programming encapsulates application logic in classes. Classes, in turn, are instantiated as objects, and each individual object has a distinct identity and state. Individual objects are a useful way to organize your code, but often you want to work with a group of objects, or a collection. A set of result rows from a SQL query is a collection. A collection need not be homogeneous either. A Window object in a graphical user interface framework could collect any number of control objects: a Menu, a Slider, and a Button, among others. Moreover, the implementation of a collection can vary: a PHP array is a collection, but so are a hash table, a linked list, a stack, and a queue.

The Problem: How can one easily manipulate any collection of objects?

The Solution: Use the Iterator pattern to provide uniform access to the contents of a collection.

You may not realize it, but you use the Iterator pattern every day; it’s embodied in PHP’s array type and rich set of array manipulation functions. (Indeed, given the combination of the native array type in the language and a host of flexible functions designed to work with this native type, you need a pretty compelling reason not to use arrays as your means of manipulating collections of objects.)
Here’s native array iteration in PHP:

    $test = array('one', 'two', 'three');
    $output = '';
    reset($test);
    do {
        $output .= current($test);
    } while (next($test));
    echo $output; // produces 'onetwothree'
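One caveat about this loop is worth a quick sketch. next() returns the value of the element it moves to, so a loop that tests that return value also stops when it meets an element that evaluates to false, such as the string '0'. The variant below, which tests key() against null instead, is offered as one possible workaround; the sample array is made up for the illustration.

```php
<?php
$test = array('one', '0', 'three');

// Stops early: next() returns the string '0', which evaluates to false.
$short = '';
reset($test);
do {
    $short .= current($test);
} while (next($test));

// Runs to the end: key() returns null only once the internal pointer has
// moved past the last element.
$full = '';
for (reset($test); key($test) !== null; next($test)) {
    $full .= current($test);
}

echo $short, "\n"; // one
echo $full, "\n";  // one0three
```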
The reset() function restarts iteration to the beginning of the array; current() returns the value of the current element; and next() advances to the next element in the array and returns the new current() value. When you advance past the end of the array, next() returns false. Using these iteration methods, the internal implementation of a PHP array is irrelevant to you.

Iterator couples the object-oriented programming principles of encapsulation and polymorphism. Using Iterator, you can manipulate the objects in a collection without explicitly knowing how the collection is implemented or what the collection contains (what kinds of objects). Iterator provides a similar interface to different concrete iteration implementations, which do contain the details of how to manipulate a specific collection, including which items to show (filtering) and in what order (sorting).

Let’s create a simple object to manipulate in a collection. (Though this example is in PHP 5, Iterators are not unique to PHP 5 and most of the examples in this chapter work in PHP 4 as well, albeit with a healthy amount of reference operators added.) The object, Lendable, represents media such as movies and albums and is intended to be part of a web site or service to let users review or lend portions of their media collection to other users. (For this example, do not concern yourself with persistence and the like.) Let’s start with the code in Listing 1 as the basis for the class and write some tests.

To implement the requirements of this initial test, create a class with a few public attributes and some methods to toggle the values of these attributes, such as that in Listing 2.

Lendable is a good, generic start. Let’s extend it to track items like DVDs or CDs.
Figure 1
Media extends Lendable and tracks details about specific media, including the name of the item, the year it was released, and what type of item it is. See Listing 3.

To keep things simple, Media has three public instance variables, Media::name, Media::year, and Media::type. The constructor takes two arguments and stores the first in $name and the second in $year. The constructor also allows an optional third parameter to specify type (which defaults to "dvd").

Given individual objects to manipulate, you can now create a container to hold them: a Library. Like a regular library, Library should be able to add, remove and count the items in the collection. Eventually, Library should also permit access to individual items (objects) in the collection (which is shown momentarily in the Sample Code section of this chapter). For right now, let's build a test case for Library:

    class LibraryTestCase extends UnitTestCase {
        function TestCount() {
            $lib = new Library;
            $this->assertEqual(0, $lib->count());
        }
    }

It's easy enough to write a class that satisfies this test:

    class Library {
        function count() {
            return 0;
        }
    }
Now add some interesting features to the test:

    class LibraryTestCase extends UnitTestCase {
        function TestCount() { /* ... */ }
        function TestAdd() {
            $lib = new Library;
            $lib->add('one');
            $this->assertEqual(1, $lib->count());
        }
    }

An easy way to implement add() is to piggyback on PHP's flexible array functions: you can add items to an array instance variable and use count() to return the number of items in the collection.

    class Library {
        protected $collection = array();
        function count() {
            return count($this->collection);
        }
        function add($item) {
            $this->collection[] = $item;
        }
    }

Library is now a collection, but it provides no way to retrieve or manipulate the individual members of the collection. Let's move on to the purpose of the article, implementation of the Iterator design pattern. The following UML class diagram shows the GoF Iterator pattern with the Media and Library classes used to make the example concrete. (GoF is the Gang of Four: Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, authors of the famous and definitive Design Patterns: Elements of Reusable Object-Oriented Software.) Your collection class must provide a FactoryMethod to create an instance of your Iterator. Iterator classes define an interface of first() to go to the beginning of a collection, next() to move to the next item in sequence as you iterate, currentItem() to retrieve the current item from the collection as you iterate, and isDone() to indicate when you have iterated over the entire collection.

Listing 1

    <?php
    // The opening lines of this listing did not survive in this copy;
    // the class name and the start of TestCheckout() are assumed.
    class LendableTestCase extends UnitTestCase {
        function TestCheckout() {
            $item = new Lendable;
            $this->assertFalse($item->borrower);
            $item->checkout('John');
            $this->assertEqual('borrowed', $item->status);
            $this->assertEqual('John', $item->borrower);
        }
        function TestCheckin() {
            $item = new Lendable;
            $item->checkout('John');
            $item->checkin();
            $this->assertEqual('library', $item->status);
            $this->assertFalse($item->borrower);
        }
    }
    ?>

Listing 2

    <?php
    class Lendable {
        // Initial values assumed from the tests in Listing 1.
        public $status = 'library';
        public $borrower = '';

        public function checkout($borrower) {
            $this->status = 'borrowed';
            $this->borrower = $borrower;
        }
        public function checkin() {
            $this->status = 'library';
            $this->borrower = '';
        }
    }
    ?>

Listing 3

    <?php
    class Media extends Lendable {
        public $name;
        public $year;
        public $type;

        function __construct($name, $year, $type='dvd') {
            $this->name = $name;
            $this->type = $type;
            $this->year = (int)$year;
        }
    }
    ?>

Listing 4

    <?php
    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        function TestGetGofIterator() {
            $this->assertIsA($it = $this->lib->getIterator(), 'LibraryGofIterator');
            $this->assertFalse($it->isdone());
            $this->assertIsA($first = $it->currentItem(), 'Media');
            $this->assertEqual('name1', $first->name);
            $this->assertFalse($it->isdone());

            $this->assertTrue($it->next());
            $this->assertIsA($second = $it->currentItem(), 'Media');
            $this->assertEqual('name2', $second->name);
            $this->assertFalse($it->isdone());

            $this->assertTrue($it->next());
            $this->assertIsA($third = $it->currentItem(), 'Media');
            $this->assertEqual('name3', $third->name);
            $this->assertFalse($it->next());
            $this->assertTrue($it->isdone());
        }
    }
    ?>

Listing 5

    <?php
    class LibraryGofIterator {
        protected $collection;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function first() {
            reset($this->collection);
        }
        function next() {
            return (false !== next($this->collection));
        }
        function isDone() {
            return (false === current($this->collection));
        }
        function currentItem() {
            return current($this->collection);
        }
    }
    ?>

Listing 6

    <?php
    // The class header did not survive in this copy; the class name is assumed.
    class LibraryGofExternalIterator {
        protected $key = 0;
        protected $collection;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function first() {
            $this->key=0;
        }
        function next() {
            return (++$this->key < $this->collection->count());
        }
        function isDone() {
            return ($this->key >= $this->collection->count());
        }
        function currentItem() {
            return $this->collection->get($this->key);
        }
    }
    ?>

Listing 7

    <?php
    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        // Method name assumed; the opening lines did not survive in this copy.
        function TestMediaIteratorUsage() {
            $this->assertIsA(
                $it = $this->lib->getIterator('media')
                ,'LibraryIterator');
            $output = '';
            while ($item = $it->next()) {
                $output .= $item->name;
            }
            $this->assertEqual('name1name2name3', $output);
        }
    }
    ?>

Listing 8

    <?php
    class LibraryIterator {
        protected $collection;
        protected $first = true;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function next() {
            if ($this->first) {
                $this->first = false;
                return current($this->collection);
            }
            return next($this->collection);
        }
    }
    ?>
“Use the Iterator pattern to provide uniform access to the contents of a collection.”
In the next section, we are going to create the LibraryGofIterator class as an example of a direct implementation of the GoF Iterator design pattern (see Figure 1).

Sample Code

The first step in implementing the GoF Iterator pattern within Library is to write a new test case for the new concrete Iterator. Since each test method will be manipulating a Library filled with Media instances, you can employ the UnitTestCase::setUp() method to populate a variable with a Library in a known state for each test. (For the purposes of this article, treat UnitTestCase as a generic unit testing suite. The associated code does, however, serve to illustrate how Library should perform.) Start by adding the Library::getIterator() method as a FactoryMethod for instances of the LibraryGofIterator class.

    class IteratorTestCase extends UnitTestCase {
        protected $lib;
        function setup() {
            $this->lib = new Library;
            $this->lib->add(new Media('name1', 2000));
            $this->lib->add(new Media('name2', 2002));
            $this->lib->add(new Media('name3', 2001));
        }
        function TestGetGofIterator() {
            $this->assertIsA($it = $this->lib->getIterator()
                ,'LibraryGofIterator');
        }
    }
Here's the implementation.

    class Library {
        // ...
        function getIterator() {
            return new LibraryGofIterator($this->collection);
        }
    }

The getIterator() method passes the Library's $collection to the constructor of the new concrete iterator. This technique has two important implications: each iterator is independent, so multiple iterators can operate at the same time. Additionally, the iterator operates on the collection as it existed at the time the iterator was requested. If another item is added to the collection at any time later, you must request another iterator to display it (at least in this implementation).

Continue enhancing the test suite by adding assertions to the TestGetGofIterator() method to match the Iterator design pattern. The isDone() method should only be true if you've iterated over the entire collection. If the iterator's just been created, isDone() should obviously return false to indicate it's okay to iterate.

    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        function TestGetGofIterator() {
            $this->assertIsA($it = $this->lib->getIterator()
                ,'LibraryGofIterator');
            $this->assertFalse($it->isdone());
        }
    }
As usual with Test Driven Development (TDD), implement the simplest possible code that satisfies your test case:

    class LibraryGofIterator {
        function isDone() {
            return false;
        }
    }

So, what should happen during the first iteration? currentItem() should return the first Media object added in the IteratorTestCase::setUp() method and isDone() should continue to be false, since two additional items remain to be iterated over.

    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        function TestGetGofIterator() {
            $this->assertIsA($it = $this->lib->getIterator()
                ,'LibraryGofIterator');
            $this->assertFalse($it->isdone());
            $this->assertIsA($first = $it->currentItem(), 'Media');
            $this->assertEqual('name1', $first->name);
            $this->assertFalse($it->isdone());
        }
    }

It's critical that LibraryGofIterator receives the $collection in the constructor (see the minimal implementation of Library above) and returns the current() item of that array from the currentItem() method.

    class LibraryGofIterator {
        protected $collection;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function currentItem() {
            return current($this->collection);
        }
        function isDone() {
            return false;
        }
    }

What should happen in the next iteration? The next() method should change what item is returned by the currentItem() method. This test captures that expected behavior:

    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        function TestGetGofIterator() {
            $this->assertIsA($it = $this->lib->getIterator(), 'LibraryGofIterator');
            $this->assertFalse($it->isdone());
            $this->assertIsA($first = $it->currentItem(), 'Media');
            $this->assertEqual('name1', $first->name);
            $this->assertFalse($it->isdone());

            $this->assertTrue($it->next());
            $this->assertIsA($second = $it->currentItem(), 'Media');
            $this->assertEqual('name2', $second->name);
            $this->assertFalse($it->isdone());
        }
    }

Piggybacking again on PHP's array functions, use next() on the array.

    class LibraryGofIterator {
        protected $collection;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function currentItem() {
            return current($this->collection);
        }
        function next() {
            return next($this->collection);
        }
        function isDone() {
            return false;
        }
    }

The third iteration looks much like the others, except the isDone() method must return true. You also want next() to indicate success of moving to the next iteration. With small modifications to the next() and isDone() methods, all of the tests pass. (See Listings 4 and 5.)

There's just one problem with the Iterator test case: it doesn't reflect how iterators are typically used. Yes, it tests all of the features of the Iterator pattern, but application code uses the Iterator in a much simpler way. So, the next step is to write a test to run more realistic code.

    class IteratorTestCase extends UnitTestCase {
        protected $lib;
        function setup() { /* ... */ }
        function TestGetGofIterator() { /* ... */ }
        function TestGofIteratorUsage() {
            $output = '';
            for ($it=$this->lib->getIterator(); !$it->isDone(); $it->next()){
                $output .= $it->currentItem()->name;
            }
            $this->assertEqual('name1name2name3', $output);
        }
    }

Listing 9

    <?php
    // Only the closing lines of this listing survived in this copy; the cases
    // are reconstructed from the iterator types requested in Listings 10 and 12.
    class Library {
        // ...
        function getIterator($type=false) {
            switch (strtolower($type)) {
            case 'media':
                $iterator_class = 'LibraryIterator';
                break;
            case 'available':
                $iterator_class = 'LibraryAvailableIterator';
                break;
            case 'released':
                $iterator_class = 'LibraryReleasedIterator';
                break;
            default:
                $iterator_class = 'LibraryGofIterator';
            }
            return new $iterator_class($this->collection);
        }
    }
    ?>

Listing 10

    <?php
    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        // Method name assumed; the opening lines did not survive in this copy.
        function TestAvailableIteratorUsage() {
            $this->lib->add($dvd = new Media('test', 1999));
            $this->lib->add(new Media('name4', 1999));
            $this->assertIsA(
                $it = $this->lib->getIterator('available')
                ,'LibraryAvailableIterator');
            $output = '';
            while ($item = $it->next()) {
                $output .= $item->name;
            }
            $this->assertEqual('name1name2name3testname4', $output);
            $dvd->checkOut('Jason');
            $it = $this->lib->getIterator('available');
            $output = '';
            while ($item = $it->next()) {
                $output .= $item->name;
            }
            $this->assertEqual('name1name2name3name4', $output);
        }
    }
    ?>
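The earlier claim that each iterator is independent, and that it sees the collection as it was when requested, follows from PHP copying arrays by value when they are passed to the constructor. The following is a minimal sketch of that behavior, not the article's test code: Media and Library are cut down to the bare minimum here, and the variable names are illustrative.

```php
<?php
// Minimal stand-ins for the article's classes, just enough to run the demo.
class Media {
    public $name;
    function __construct($name) { $this->name = $name; }
}

class Library {
    protected $collection = array();
    function add($item) { $this->collection[] = $item; }
    function getIterator() {
        // Arrays are passed by value, so each iterator gets its own copy.
        return new LibraryGofIterator($this->collection);
    }
}

class LibraryGofIterator {
    protected $collection;
    function __construct($collection) { $this->collection = $collection; }
    function first() { reset($this->collection); }
    function next() { return (false !== next($this->collection)); }
    function isDone() { return (false === current($this->collection)); }
    function currentItem() { return current($this->collection); }
}

$lib = new Library;
$lib->add(new Media('name1'));
$lib->add(new Media('name2'));

$a = $lib->getIterator();
$b = $lib->getIterator();
$a->next();                      // advance only the first iterator
$lib->add(new Media('name3'));   // added after both iterators were created

$seenByA = $a->currentItem()->name;   // second item
$seenByB = $b->currentItem()->name;   // still the first item

// $b never sees the item added later...
$count = 0;
for ($b->first(); !$b->isDone(); $b->next()) { $count++; }

// ...but an iterator requested afterwards does.
$fresh = $lib->getIterator();
$freshCount = 0;
for ($fresh->first(); !$fresh->isDone(); $fresh->next()) { $freshCount++; }

echo "$seenByA $seenByB $count $freshCount\n";
```

The snapshot only covers the array itself: the Media objects inside it are shared, so a checkout() made through one iterator's item is visible through every other iterator.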
Listing 11

    <?php
    class LibraryAvailableIterator {
        protected $collection = array();
        protected $first = true;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function next() {
            if ($this->first) {
                $this->first = false;
                $ret = current($this->collection);
            } else {
                $ret = next($this->collection);
            }
            if ($ret && 'library' != $ret->status) {
                return $this->next();
            }
            return $ret;
        }
    }
    ?>
So far, the implementation of Iterator copies an array (the collection) and uses PHP's internal pointer to track the iteration. You can also implement the Iterator by keeping track of the collection index by yourself. This requires a new accessor method in Library to fetch an object by key.

    class Library {
        // ...
        function get($key) {
            if (array_key_exists($key, $this->collection)) {
                return $this->collection[$key];
            }
        }
    }

Also, you'd pass $this (the library itself) to the constructor instead of $this->collection (the array containing the Media collection) in the Library::getIterator() method. The "external" iterator would then just track a pointer internally to know which element of the Library collection it's currently referencing, and would use the reference to the Library passed in the constructor to call the get() method to retrieve the current object. The implementation seen in Listing 6 assumes that your collection array is indexed starting with 0 and is completely sequential.

A Variant Iterator API

While the foregoing code is a complete implementation of the Iterator pattern as described by GoF, you may find the four-method API a bit cumbersome. If so, you can collapse next(), currentItem(), and isDone() into just next() by having the latter either advance and return the current item from the collection or return false if the entire collection has been processed. Listing 7 shows one way to write a test for this variation of the API.

Notice the simplified control structure for looping. next() returns an object or false, allowing you to perform the assignment inside the while loop conditional. The next few examples explore variations of the Iterator pattern using the smaller interface. As a convenience, change the Library::getIterator() method to a parameterized FactoryMethod so you can get either the four-method iterator or the two-method iterator (next() and reset()) from that single method.

    class Library {
        // ...
        function getIterator($type=false) {
            switch (strtolower($type)) {
            case 'media':
                $iterator_class = 'LibraryIterator';
                break;
            default:
                $iterator_class = 'LibraryGofIterator';
            }
            return new $iterator_class($this->collection);
        }
    }
Here, Library::getIterator() now accepts a parameter to select what kind of iterator to return. The default is LibraryGofIterator (so the existing tests still pass). Passing the string media to the method creates and returns a LibraryIterator instead. This is some code to implement LibraryIterator:

    class LibraryIterator {
        protected $collection;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function next() {
            return next($this->collection);
        }
    }
Oops! The dreaded test failure! What caused this? Somehow, the first iteration was skipped, and that's a bug. To fix the error, return current() for the first call of the next() method.

Listing 12

    <?php
    class IteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        // Method name assumed; the opening lines did not survive in this copy.
        function TestReleasedIteratorUsage() {
            $this->lib->add(new Media('second', 1999));
            $this->lib->add(new Media('first', 1989));
            $this->assertIsA(
                $it = $this->lib->getIterator('released')
                ,'LibraryReleasedIterator');
            $output = array();
            while ($item = $it->next()) {
                $output[] = $item->name .'-'. $item->year;
            }
            $this->assertEqual(
                'first-1989 second-1999 name1-2000 name3-2001 name2-2002'
                ,implode(' ',$output));
        }
    }
    ?>
Listing 13

    <?php
    // Lines 1 to 7 did not survive in this copy; the class name and the
    // property declarations are assumed.
    class LibraryReleasedExternalIterator {
        protected $collection;
        protected $sorted_keys;
        protected $key = -1;

        function __construct($collection) {
            $this->collection = $collection;
            $sort_funct = create_function(
                '$a,$b,$c=false',
                'static $collection;
                if ($c) {
                    $collection = $c;
                    return;
                }
                return ($collection->get($a)->year -
                    $collection->get($b)->year);');
            $sort_funct(null,null,$this->collection);
            $this->sorted_keys = $this->collection->keys();
            usort($this->sorted_keys, $sort_funct);
        }

        function next() {
            if (++$this->key >= $this->collection->count()) {
                return false;
            } else {
                return $this->collection->get($this->sorted_keys[$this->key]);
            }
        }
    }
    ?>
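Listing 13 depends on create_function(), which was deprecated in PHP 7.2 and removed in PHP 8.0. On PHP 5.3 or later, the same key-sorting idea can be sketched with a closure that captures the collection through `use`, so the static-variable trick is no longer needed. This is a sketch under assumptions, not the author's code: the class name SortedKeyIterator is hypothetical, and Library::keys() is called by Listing 13 but never shown in the article, so its body here is a guess.

```php
<?php
class Media {
    public $name;
    public $year;
    function __construct($name, $year) {
        $this->name = $name;
        $this->year = (int)$year;
    }
}

class Library {
    protected $collection = array();
    function add($item) { $this->collection[] = $item; }
    function count() { return count($this->collection); }
    function get($key) {
        if (array_key_exists($key, $this->collection)) {
            return $this->collection[$key];
        }
    }
    // Assumed helper: Listing 13 calls keys(), but the article never shows it.
    function keys() { return array_keys($this->collection); }
}

// Hypothetical external iterator: it sorts the keys, never the collection.
class SortedKeyIterator {
    protected $collection;
    protected $sorted_keys;
    protected $key = -1;

    function __construct(Library $collection) {
        $this->collection = $collection;
        $this->sorted_keys = $collection->keys();
        // The closure captures the library, replacing the static-variable trick.
        usort($this->sorted_keys, function ($a, $b) use ($collection) {
            return $collection->get($a)->year - $collection->get($b)->year;
        });
    }

    function next() {
        // Walking the key list also copes with gaps or string keys.
        if (++$this->key >= count($this->sorted_keys)) {
            return false;
        }
        return $this->collection->get($this->sorted_keys[$this->key]);
    }
}

$lib = new Library;
$lib->add(new Media('name1', 2000));
$lib->add(new Media('name2', 2002));
$lib->add(new Media('name3', 2001));
$lib->add(new Media('second', 1999));
$lib->add(new Media('first', 1989));

$output = array();
$it = new SortedKeyIterator($lib);
while ($item = $it->next()) {
    $output[] = $item->name . '-' . $item->year;
}
$line = implode(' ', $output);
echo $line, "\n";
```

Because next() indexes into the sorted key list rather than counting from 0, the same approach would also serve a plain external iterator over a collection whose keys are not sequential.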
The code in Listing 8 corrects our logic error and provides a streamlined while loop iterator.

Filtering Iterator

With Iterators, you can do more than just present each item of the collection; you can also select which items are presented. Let's modify the Library::getIterator() to allow two additional iterator types (Listing 9).

The LibraryAvailableIterator class should only iterate over items that have a status of "library" (recall that the checkOut() method changes the status to "borrowed"). The test in Listing 10 creates a new Media instance and stores it in the variable $dvd. The first highlighted assertEqual() assertion verifies that the new item is present when iterating with LibraryAvailableIterator. Next, the test uses the checkOut() method and verifies that the new item is missing from the display.

The code to implement filtering (Listing 11) is very similar to the LibraryIterator::next(), except filtering is done prior to returning the item. If the current item does not match the filter criteria, the code returns $this->next() instead.
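The recursive skip described above can also be written as a loop, with the filter criterion passed in as a callback so that one class serves several filters. The following is a minimal sketch assuming PHP 5.4 or later; PredicateIterator and Item are hypothetical names used only for this illustration and are not part of the article's code.

```php
<?php
// Hypothetical stand-in for the article's Media/Lendable objects.
class Item {
    public $name;
    public $status;
    function __construct($name, $status = 'library') {
        $this->name = $name;
        $this->status = $status;
    }
}

// Hypothetical two-method-style iterator: next() returns an item or false.
class PredicateIterator {
    protected $collection;
    protected $accept;
    protected $first = true;

    function __construct(array $collection, callable $accept) {
        $this->collection = $collection;   // array copy, as in the article
        $this->accept = $accept;
    }

    function next() {
        // Keep advancing until an item passes the filter or the array ends.
        do {
            if ($this->first) {
                $this->first = false;
                $item = reset($this->collection);
            } else {
                $item = next($this->collection);
            }
        } while ($item !== false && !call_user_func($this->accept, $item));
        return $item;
    }
}

$items = array(
    new Item('name1'),
    new Item('test', 'borrowed'),
    new Item('name2'),
);
$it = new PredicateIterator($items, function ($item) {
    return 'library' == $item->status;
});

$output = '';
while ($item = $it->next()) {
    $output .= $item->name;
}
echo $output, "\n";
```

The loop avoids one level of recursion per skipped item, which matters little for a small media library but keeps the call stack flat when long runs of items are filtered out.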
Listing 14

    <?php
    class SplIteratorTestCase extends UnitTestCase {
        protected $lib;
        function setup() {
            $this->lib = new ForeachableLibrary;
            $this->lib->add(new Media('name1', 2000));
            $this->lib->add(new Media('name2', 2002));
            $this->lib->add(new Media('name3', 2001));
        }
        function TestForeach() {
            $output = '';
            foreach($this->lib as $item) {
                $output .= $item->name;
            }
            $this->assertEqual('name1name2name3', $output);
        }
    }
    ?>

Listing 15

    <?php
    // The class header did not survive in this copy; "extends Library" is
    // assumed because the test in Listing 14 calls add() on this class.
    class ForeachableLibrary extends Library implements Iterator {
        protected $valid;
        function current() {
            return current($this->collection);
        }
        function next() {
            $this->valid = (false !== next($this->collection));
        }
        function key() {
            return key($this->collection);
        }
        function valid() {
            return $this->valid;
        }
        function rewind() {
            $this->valid = (false !== reset($this->collection));
        }
    }
    ?>
Sorting Iterator

An iterator can do more than show all or a portion of the collection. An iterator can also show the collection in a specific order. Let's create an iterator that sorts the Media in the collection by release date. For a test (Listing 12), add some Media instances with dates older than those of the items added in the setUp() method. If the iterator works, these older items should be sorted to the beginning of the iteration.

This test uses the items in each iteration slightly differently: instead of just appending the $name values in a string, a string is formed from both the $name and $year properties, which is then appended to an $output array. The implementation of LibraryReleasedIterator is nearly identical to LibraryIterator, except for one additional line in the constructor.

    class LibraryReleasedIterator extends LibraryIterator {
        function __construct($collection) {
            usort($collection, create_function('$a,$b', 'return ($a->year - $b->year);'));
            $this->collection = $collection;
        }
    }

The usort() statement sorts the $collection array prior to iteration. You can avoid copying all of the other code for the class by simply inheriting from the LibraryIterator class itself.

Is it possible to use an external iterator to accomplish this same sorted iteration? Yes, but you must pull a few tricks to accomplish it. See Listing 13.

Key here is the creation of a utility function for performing the sort. The sorting function needs to have access to the collection so it can fetch members for comparison. However, because the generated function is used in a usort(), you don't have the option of passing the collection as an additional parameter. Instead, you can use the trick shown in Listing 13 to store a reference to the collection inside the function prior to calling it with usort(). What you're sorting is the list of keys for the collection. When usort() is complete, the keys will be sorted in order by the year attribute of each object in the collection.

In the next() method, an object in the collection is accessed via the get() method, but indirectly through the $sorted_keys mapping. If you recall the external version of the GoF-style iterator, arrays with gaps or strings in the keys could be problematic. This same trick could be used for a simple external iterator to alleviate the problem of gaps in the sequence of keys.

"Iterator couples the object-oriented programming principles of encapsulation and polymorphism."

SPL Iterator

No article on the Iterator design pattern and PHP would be complete without discussing the "Standard PHP Library" (SPL) iterator. The while loop structure used so far is very compact and usable, but PHP coders may be more comfortable with the foreach structure for array iteration. Wouldn't it be nice to use a collection directly in a foreach loop? That's exactly what the SPL iterator is for. (Even though this article has been written entirely for PHP 5, the following SPL code is the only code that works solely in PHP 5, and then only if you've compiled PHP 5 with SPL enabled. Harry Fuecks wrote a nice article introducing the SPL and covering the SPL iterator; see http://www.sitepoint.com/article/php5-standard-library.)

Using SPL is essentially a completely different way to implement iteration, so let's start over with a new unit test case and a new class, the ForeachableLibrary, and Listing 14.

ForeachableLibrary is the collection that implements the SPL Iterator interface. You have to implement five functions to create an SPL iterator: current(), next(), key(), valid(), and rewind(). key() returns the current index of your collection. rewind() is like reset(): iteration begins at the start of your collection. See Listing 15.

Here we just implement the required functions working on our $collection attribute. (If you don't implement all five functions and you add the implements Iterator to your class definition, PHP will generate a fatal error.) The tests pass, and everything is happy. There's just one problem: the implementation is limited to one style of iteration, so sorting or filtering is impossible. Can anything be done to rectify this? Yes! You can apply the Strategy pattern and delegate the SPL iterator's five functions to another object.

Listing 16 is a test for PolymorphicForeachableLibrary. The only difference between this case and the test for SplIteratorTestCase is the class of the $this->lib attribute created in the setUp() method. That makes sense: the two classes must behave identically.

Listing 16

    <?php
    // The class header did not survive in this copy; the test case name is assumed.
    class PolySplIteratorTestCase extends UnitTestCase {
        protected $lib;
        function setup() {
            $this->lib = new PolymorphicForeachableLibrary;
            $this->lib->add(new Media('name1', 2000));
            $this->lib->add(new Media('name2', 2002));
            $this->lib->add(new Media('name3', 2001));
        }
        function TestForeach() {
            $output = '';
            foreach($this->lib as $item) {
                $output .= $item->name;
            }
            $this->assertEqual('name1name2name3', $output);
        }
    }
    ?>

Listing 17 contains PolymorphicForeachableLibrary. Library is extended to get the collection manipulation methods. The SPL methods are added, too, all delegating to the $iterator attribute, which is created in rewind().

Listing 17

    <?php
    class PolymorphicForeachableLibrary extends Library implements Iterator {
        protected $iterator;
        function current() {
            return $this->iterator->current();
        }
        function next() {
            return $this->iterator->next();
        }
        function key() {
            return $this->iterator->key();
        }
        function valid() {
            return $this->iterator->valid();
        }
        function rewind() {
            $this->iterator = new StandardLibraryIterator($this->collection);
            $this->iterator->rewind();
        }
    }
    ?>

Below is the code for the StandardLibraryIterator. The code in Listing 18 should look familiar: essentially, it's a copy of the five SPL functions from the ForeachableLibrary class.
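For readers on a current PHP release, the delegation just described can be condensed into one runnable sketch. It assumes PHP 8.0 or later, because the return types below are written to match the built-in Iterator interface (without them, PHP 8.1 and later report deprecation notices); the library keeps its own add() here instead of extending the article's Library so that the block stays self-contained.

```php
<?php
class Media {
    public $name;
    public $year;
    function __construct($name, $year) {
        $this->name = $name;
        $this->year = (int)$year;
    }
}

// Strategy object: walks its own copy of the array in insertion order.
class StandardLibraryIterator implements Iterator {
    protected $collection;
    protected $valid = false;
    function __construct(array $collection) {
        $this->collection = $collection;
    }
    function current(): mixed { return current($this->collection); }
    function next(): void { $this->valid = (false !== next($this->collection)); }
    function key(): mixed { return key($this->collection); }
    function valid(): bool { return $this->valid; }
    function rewind(): void { $this->valid = (false !== reset($this->collection)); }
}

// Context object: foreach talks to the library, which forwards every call.
class PolymorphicForeachableLibrary implements Iterator {
    protected $collection = array();
    protected $iterator;
    function add($item) { $this->collection[] = $item; }
    function current(): mixed { return $this->iterator->current(); }
    function next(): void { $this->iterator->next(); }
    function key(): mixed { return $this->iterator->key(); }
    function valid(): bool { return $this->iterator->valid(); }
    function rewind(): void {
        // A new delegate per loop, so each foreach starts from a fresh snapshot.
        $this->iterator = new StandardLibraryIterator($this->collection);
        $this->iterator->rewind();
    }
}

$lib = new PolymorphicForeachableLibrary;
$lib->add(new Media('name1', 2000));
$lib->add(new Media('name2', 2002));
$lib->add(new Media('name3', 2001));

$seen = array();
foreach ($lib as $key => $item) {
    $seen[] = $key . ':' . $item->name;
}
$line = implode(' ', $seen);
echo $line, "\n";
```

Since foreach always calls rewind() first, the delegate is guaranteed to exist before current(), key(), next(), or valid() is forwarded to it.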
Listing 18

    <?php
    class StandardLibraryIterator {
        // Property declarations reconstructed; the opening lines did not
        // survive in this copy.
        protected $valid;
        protected $collection;
        function __construct($collection) {
            $this->collection = $collection;
        }
        function current() {
            return current($this->collection);
        }
        function next() {
            $this->valid = (false !== next($this->collection));
        }
        function key() {
            return key($this->collection);
        }
        function valid() {
            return $this->valid;
        }
        function rewind() {
            $this->valid = (false !== reset($this->collection));
        }
    }
    ?>

The tests pass. OK, the code is more complex now, but how does it support additional iterator types? Let's add a test for a "released" version of the iterator to see how additional iterator types work in this design.

The test case in Listing 19 should look familiar, too, as it's very similar to the previous "release" iterator, but using the foreach control structure to loop. The new iteratorType() method (Listing 20) lets you switch which style of iterator you want to use. (Since the iterator type isn't chosen during the instantiation of the object and because you can choose a different iterator type on-the-fly by calling the iteratorType() method again, the code is actually implementing the State pattern, rather than the Strategy pattern.)

Listing 19

    <?php
    // The class header did not survive in this copy; the test case and
    // method names are assumed.
    class PolySplIteratorTestCase extends UnitTestCase {
        function setup() { /* ... */ }
        function TestReleasedForeach() {
            $this->lib->add(new Media('second', 1999));
            $this->lib->add(new Media('first', 1989));
            $output = array();
            $this->lib->iteratorType('Released');
            foreach($this->lib as $item) {
                $output[] = $item->name .'-'. $item->year;
            }
            $this->assertEqual(
                'first-1989 second-1999 name1-2000 name3-2001 name2-2002'
                ,implode(' ',$output));
        }
    }
    ?>

Listing 20

    <?php
    class PolymorphicForeachableLibrary extends Library implements Iterator {
        // Property declarations and the constructor header are reconstructed;
        // the opening lines did not survive in this copy.
        protected $iterator_type;
        protected $iterator;
        function __construct() {
            $this->iteratorType();
        }
        function iteratorType($type=false) {
            switch(strtolower($type)) {
            case 'released':
                $this->iterator_type = 'ReleasedLibraryIterator';
                break;
            default:
                $this->iterator_type = 'StandardLibraryIterator';
            }
            $this->rewind();
        }
        // ...
        function rewind() {
            $type = $this->iterator_type;
            $this->iterator = new $type($this->collection);
            $this->iterator->rewind();
        }
    }
    ?>

    class ReleasedLibraryIterator extends StandardLibraryIterator {
        function __construct($collection) {
            usort($collection, create_function('$a,$b', 'return ($a->year - $b->year);'));
            $this->collection = $collection;
        }
    }

You can easily implement ReleasedLibraryIterator by extending StandardLibraryIterator and overriding the constructor to add the sorting of the incoming array. And with that you have a working PolymorphicForeachableLibrary.

Issues

Iterators are a nice way to standardize working with collections of objects in your applications. The examples here have been based on arrays, but the ability to work on non-array based collections with an identical interface is powerful. The ability to use collections in the foreach control structure is indeed cool. The only unfortunate issue with the SPL implementation is the significant potential for namespace clashing with "Iterator". How much PHP 4 object-oriented code has some sort of an Iterator class as a base class for the libraries' iterators? Of those, how many define the five required methods in the same capacity? Perhaps implements Foreachable would have been a less intrusive name. If you choose to use the SPL, you should investigate the other supported iterators, like RecursiveArrayIterator and numerous other flavors.
About the Author

Jason has been an IT professional for over ten years. He is currently an application developer and intranet webmaster for a Fortune 100 company. He has written several tutorials and articles for the Zend website, and has recently contributed to the Wrox "PHP Graphics" handbook. He is also the author of "php|architect's Guide to PHP Patterns." He resides in Iowa with his wife and two children. Jason can be contacted at [email protected].
To Discuss this article: http://forums.phparch.com/233
PHP Library for Permission Management
FEATURE
by Simone Grassi and Bernhard Gaul
A generic library to manage permissions is what you would want for many projects. It should have a generic interface to populate the permissions database and manage permission needs so that they can be easily personalized. To achieve this, we created a PHP library generic enough to satisfy the needs of most projects and also provided a Flash user interface to manage permissions that is ready for deployment with any web project.
Whether you are a big software house or a single developer in need of certain functionality, you want to find good libraries that you can use directly, via Application Programming Interfaces (APIs), without having to modify their code. This is possible by using object-oriented development and design patterns, which are both commonly used to create re-usable code for common tasks.
REQUIREMENTS
PHP: 4.x
OS: Tested in Linux
Other Software: Apache, MySQL, Flash plug-in
Code Directory: permissions
What Do We Need to Manage?

The first objective of our library is to be able to manage permissions for many different types of projects. The main elements are users and permissions. As far as users are concerned, you usually need to manage groups and roles. Permissions are created as a flat list, and each application using the library can have its own, independent permissions list. Those permissions must then be applied to objects (e.g. read or write permission on a specific document). To do this effectively, we introduced an object category entity that allows grouping objects by type. So, we end up with:

• Users, groups and roles
• Permissions
• Objects and object categories

Users can be part of one or more groups, and each group can have a parent group, allowing the creation of group hierarchies. Later, we will see why we need roles and how they associate permissions to objects.

In an application that requires permissions, you will usually have many different objects to which these permissions may be applied. Each object will probably have a unique key (a database primary key) within its category. Using object categories allows us to apply permissions to single objects by specifying the object category as well as the unique id of the object. The permissions database will use the same unique id of the object that is used in the application database. Our library assumes that these unique identifiers are integers.

Permissions to Users and Groups

A user can be assigned permissions directly. If, for example, a READ permission is assigned directly to a user, that user will be granted this permission in any case, on all objects of any possible object category. Groups, on the other hand, are useful in different ways: first, to allow the creation of hierarchies, using subgroups; second, if a set of permissions is assigned to a group then every user associated with the group will be granted those permissions. It will be enough to change the permissions assigned to the group to change those associated with the individual users.

How do you associate permissions with users and groups? There are three different possible ways:

• Directly: the assigned permission is valid on all objects of all categories.
• Relative to a category of objects: the permission applies to all objects that are part of this category of objects.
• Relative to a single object: valid only for the specific object within a specific category.

The objective is to satisfy the requirements of different scenarios. As an example, let's imagine we have an application that manages users' access to documents within different folders. In this scenario, there are at least two categories of objects: documents and folders. Assignment of a permission directly to a user or group of users is useful, for example, to allow the administrator to have write permissions on all folders and documents (that is, on all objects of all categories of objects). Assigning permissions to a single category of objects allows, for instance, assigning read and write permissions to a single user on all folders, creating a sort of folder administrator. Finally, assigning permissions on a single object gives specific access, like read or write, on a single folder (e.g. a specific user may only access a specified folder with write permission).

Permissions can be assigned to groups in the same way as to users: directly to the group, relative to a specific category of objects, or to a single object. This allows the creation of permission profiles, and users associated with a given group will inherit this profile (there could be, for instance, a group of folder administrators or document administrators).
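The three ways of assigning a permission, plus inheritance from a group, can be sketched in a few lines. This is only an illustration of the rules described above, not the library's API: the array layout, the function names grant_covers() and is_allowed(), and the sample ids are all assumptions made for this example.

```php
<?php
// Hypothetical model, not the library's API: each grant is
// array(permission, object category, object id); null stands for "any".
function grant_covers(array $grant, $permission, $category, $objectId) {
    list($grantedPermission, $grantedCategory, $grantedObject) = $grant;
    if ($grantedPermission !== $permission) {
        return false;
    }
    if ($grantedCategory === null) {
        return true;   // direct: every object of every category
    }
    if ($grantedCategory !== $category) {
        return false;
    }
    // Category-wide when no object id is recorded, otherwise one object only.
    return $grantedObject === null || $grantedObject === $objectId;
}

function is_allowed(array $userGrants, array $groupGrants,
                    $permission, $category, $objectId) {
    // Users inherit every grant held by the groups they belong to.
    foreach (array_merge($userGrants, $groupGrants) as $grant) {
        if (grant_covers($grant, $permission, $category, $objectId)) {
            return true;
        }
    }
    return false;
}

// The three scenarios from the folders-and-documents example.
$administrator = array(array('WRITE', null, null));
$folderAdmins  = array(array('READ', 'folder', null),
                       array('WRITE', 'folder', null));   // a group profile
$singleFolder  = array(array('WRITE', 'folder', 7));

$checks = array(
    is_allowed($administrator, array(), 'WRITE', 'document', 42),
    is_allowed(array(), $folderAdmins, 'WRITE', 'folder', 3),
    is_allowed(array(), $folderAdmins, 'WRITE', 'document', 3),
    is_allowed($singleFolder, array(), 'WRITE', 'folder', 7),
    is_allowed($singleFolder, array(), 'WRITE', 'folder', 8),
);
var_dump($checks);
```

Object ids are integers here because the library assumes integer identifiers; the real implementation stores grants in database tables rather than in arrays.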
Figure 1
Why Groups are Not Enough?
We have seen that we can assign permissions to both users and groups (in a general way on all objects, only on a category of objects, or on a single object). For many projects, though, this will not be enough. What is needed is the possibility to assign a set of permissions to a user or group, on a specific object or category of objects. To account for this need, we introduced the concept of roles.

Take a simple example: within a sample application where users use and manage folders and documents, there could be the need for a publisher role, defined as a user allowed to “publish” documents, that is, to add documents to folders. A publisher would need to have access permission on all folders to create new documents within them. This role could be assigned on all folders, in which case the role is assigned to the user relative to a category of objects (the folder category). Or a user could be defined as publisher for just a single folder. In this case, the role is assigned to the user only on the specific object (a folder that is part of the folder category).

As you can see from Figure 1, the difference between groups and roles is how those entities are related to users. A user either is or is not part of a group, but a role is assigned to a user relative to an object category (or relative to a single object). Note that Figure 1 does not show the “object”. This entity is undefined, as every application will have its own entities that act as objects. We provided three fields, though, to associate object categories with objects: object_id_table, object_id_fieldname and object_id_description_fieldname. For each object category, object_id_table is the name of the table that stores the objects that are part of this category. The object_id_fieldname field stores the name of the field that is the primary key of the object table. Finally, the object_id_description_fieldname field stores the name of the field that holds a description (or title) of the object. The use of these fields allows the developer to retrieve information about a single object directly from the table created by the application that uses the permission library.

The Library
Like many other libraries, our permissions implementation is a single class and can be used by “client” software through a single API. All data is stored in a database (we used MySQL), and PEAR::DB_DataObject is used as the database abstraction layer, which makes it easy to move the library to other databases. Configuration is simple: a few parameters are enough to enable DataObject to communicate with the database.

An Example Application: Folders and Documents
The code included with php|architect allows you to see the library in action. The example implementation shows permissions management in a small application. All we need to manage are folders and documents. We have users, and each of them has different needs. To accommodate these needs, we will use the different capabilities of the library. The first thing that example.php (see the zip file that accompanies this article) needs to do is instantiate the class:

    $auth = &new authorization_manager($username);

Passing the username is sufficient; it allows the class to retrieve permissions information about that user. With the object created, you can see the API in action:
    while ($doc_db->fetch()) {
        $docs[$i]['document_name'] = $doc_db->document_name;
        // ...
        $i++;
    }
This while statement loops over the result of a query, visiting every document that matches the query conditions. Within the loop, an array is prepared that contains the document_id, the document_name and information about permissions. To determine whether the current user has READ permission, we just call authorize(). The first parameter is the requested permission, and the second parameter is the object category (OBJECT_TYPE_DOCUMENT). The object id, in this situation, is the document_id. The call returns true if the current user has this permission, and false otherwise.

In the example, you can see two tables: the first is a list of folders; the second is a list of documents from the current folder. The permissions are represented by RWA (read, write, add elements) for folders and RW (read, write) for documents. When the letter x is present, it means the current user does not have this permission.
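The RWA and RW strings in the two tables can be built from a series of yes/no permission checks. The sketch below shows one way to do that; it is an illustration only, not code from example.php. The permission_flags() helper is hypothetical, the permissions are plain strings rather than the library's own constants, and the closure (which requires PHP 5.3 or later) is a stand-in for a real call to authorize(). It also assumes that an x replaces the missing letter in place, which the article does not spell out.

```php
<?php
// Hypothetical helper: builds an "RWA"-style string like the ones in the
// example tables. $check is any callable that answers "does the current user
// have this permission?"; in the article's code that question is put to
// authorize().

function permission_flags(array $letters, $check)
{
    $flags = '';
    foreach ($letters as $letter => $permission) {
        // Keep the letter when the permission is granted, otherwise show "x".
        $flags .= call_user_func($check, $permission) ? $letter : 'x';
    }
    return $flags;
}

// Stand-in for the real check: this user may read and add elements, not write.
$granted = array('READ', 'ADD_ELEMENTS');
$check = function ($permission) use ($granted) {
    return in_array($permission, $granted, true);
};

$folderLetters   = array('R' => 'READ', 'W' => 'WRITE', 'A' => 'ADD_ELEMENTS');
$documentLetters = array('R' => 'READ', 'W' => 'WRITE');

echo permission_flags($folderLetters, $check), "\n";   // RxA
echo permission_flags($documentLetters, $check), "\n"; // Rx
```

Passing the check in as a callable keeps the display logic independent of how permissions are resolved, so the same helper works whether the answer comes from a direct permission, a group, or a role.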
By changing the user with the drop-down menu, you can see how different permissions are assigned to different users. See the database entries to fully understand how this is implemented with groups, roles, and direct permissions. Let’s see, user by user, how permissions are assigned. The superuser user has direct permission to READ, WRITE and ADD_ELEMENTS on everything.
Listing 1
Folder & Document permission example