Posts Tagged ‘php’
RSS feed for AlertBox
For those of you concerned with web usability, there are few better sources than Jakob Nielsen’s AlertBox.
Although his site is quite infrequently updated, it contains very informative posts. Unfortunately and ironically, the site does not have an RSS feed.
After an especially boring afternoon, I decided to see if I couldn’t do something about that, so with the use of this PHP Feed Generator class and some HTML parsing, I came up with the following script to generate an RSS feed from the AlertBox website:
Just download the FeedWriter class from the given link, put the following code and FeedWriter.php in the same directory, upload both to a web server with PHP support, and voilà, you have an AlertBox RSS feed.
Enjoy!
<?php
date_default_timezone_set("UTC");

class FetchAlertbox {
    private $contents = array();

    // Scrape the Alertbox index page into a list of url/title/date entries
    private function fetch() {
        $page = file_get_contents("http://www.useit.com/alertbox/index.html");
        // Isolate the first <ul>...</ul>, which holds the article list
        $startList = stripos ( $page, '<ul>' );
        $page = substr ( $page, $startList, stripos ( $page, '</ul>', $startList ) - $startList + strlen ( '</ul>' ) );
        // Patch up the markup so that DOMDocument can parse it
        $page = str_ireplace ( '<p>', '', $page );
        $page = preg_replace ( "/<li>(.*)/i", '<li>$1</li>', $page );
        $page = preg_replace ( "@</?strong>@i", "", $page );
        // loadHTML is an instance method; @ silences warnings about sloppy HTML
        $p = new DOMDocument();
        @$p->loadHTML($page);
        foreach ( $p->getElementsByTagName("li") as $article ) {
            // The first link pointing to an .html file is the article itself
            $url = '';
            foreach ( $article->getElementsByTagName("a") as $link ) {
                if ( preg_match ( "@\w+\.html@i", $link->getAttribute('href') ) ) {
                    $url = 'http://www.useit.com/alertbox/' . $link->getAttribute('href');
                    break;
                }
            }
            if ( $url === '' ) {
                continue;
            }
            // Collapse whitespace in the list item text
            $contents = str_replace ( "\n", " ", $article->textContent );
            $contents = preg_replace ( "/\s{2,}/", " ", $contents );
            // The publication date is the last parenthesized part of the text
            $date = 0;
            $possibleStartDate = strrpos ( $contents, '(' );
            if ( $possibleStartDate !== false ) {
                $timestring = substr ( $contents, $possibleStartDate + 1, strpos ( $contents, ')', $possibleStartDate ) - $possibleStartDate - 1 );
                $parsedDate = strtotime ( $timestring );
                if ( $parsedDate !== false ) {
                    $date = $parsedDate;
                    $contents = str_replace ( "($timestring)", "", $contents );
                }
            }
            $this->contents[] = array ( 'url' => $url, 'title' => trim ( $contents ), 'date' => $date );
        }
        return $this->contents;
    }

    public function fetchWithSummaries($type = "RSS2", $limit = 10, $start = 0) {
        if ( empty ( $this->contents ) ) { $this->fetch(); }
        include ( "FeedWriter.php" );
        $feedType = RSS2;
        $feedDate = DATE_RSS;
        switch ( $type ) {
            case 'RSS1':
                $feedType = RSS1;
                break;
            case 'ATOM':
                $feedType = ATOM;
                $feedDate = DATE_ATOM;
                break;
        }
        $feed = new FeedWriter($feedType);
        $feed->setTitle("Alertbox");
        $feed->setLink("http://www.useit.com/alertbox/");
        $feed->setDescription("Current Issues in Web Usability - Bi-weekly column by Dr. Jakob Nielsen, principal, Nielsen Norman Group");
        if ( $type === 'ATOM' ) {
            $feed->setChannelElement('updated', date($feedDate, $this->contents[0]["date"]));
        } else {
            $feed->setChannelElement('pubDate', date($feedDate, $this->contents[0]["date"]));
        }
        $feed->setChannelElement('language', 'en-us');
        $feed->setChannelElement('author', array('name' => 'Dr. Jakob Nielsen'));
        if ( $type === "RSS1" ) {
            $feed->setChannelAbout("http://www.useit.com/alertbox/");
        }
        for ( $i = $start; $i < $start + $limit && $i < count ( $this->contents ); $i++ ) {
            // Fetch each article to extract its summary
            $p = new DOMDocument();
            @$p->loadHTML ( file_get_contents ( $this->contents[$i]["url"] ) );
            $blockquotes = $p->getElementsByTagName("blockquote");
            if ( $blockquotes->length > 0 ) {
                $this->contents[$i]["summary"] = trim ( str_replace ( "Summary:", "", $blockquotes->item(0)->textContent ) );
            } else {
                // No summary blockquote: use the text between the headline and the first paragraph
                $contents = $p->saveHTML();
                $endHeadline = stripos ( $contents, "</h1>" );
                $this->contents[$i]["summary"] = trim ( strip_tags ( substr ( $contents, $endHeadline + strlen ( "</h1>" ), stripos ( $contents, "<p>" ) - ( $endHeadline + strlen ( "</h1>" ) ) ) ) );
            }
            $item = $feed->createNewItem();
            $item->setTitle($this->contents[$i]["title"]);
            $item->setLink($this->contents[$i]["url"]);
            $item->setDate($this->contents[$i]["date"]);
            $item->setDescription($this->contents[$i]["summary"]);
            $feed->addItem($item);
            // echo date ( "F jS, Y", $this->contents[$i]["date"] ) . ": " . $this->contents[$i]["title"] . " -> " . $this->contents[$i]["url"] . "\n";
            // echo "\t" . $this->contents[$i]["summary"] . "\n\n";
        }
        // genarateFeed is spelled this way in FeedWriter.php
        $feed->genarateFeed();
    }
}

$s = new FetchAlertbox();
$s->fetchWithSummaries(isset($_GET['type']) ? strtoupper ( $_GET['type'] ) : 'RSS2', isset($_GET['limit']) && ctype_digit($_GET['limit']) ? (int) $_GET['limit'] : 10);
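The part of the script that is easiest to get wrong is pulling the date out of each list item, so here is that step on its own as a minimal sketch. The function name splitTitleAndDate and the sample title are made up for illustration, and the sketch assumes, like the script above, that the date is the last parenthesized chunk of the text.

```php
<?php
date_default_timezone_set('UTC');

// Hypothetical helper: split "Some title (July 20, 2009)" into a title
// and a Unix timestamp. The date is 0 when no parseable date is found.
function splitTitleAndDate(string $text): array {
    // Collapse newlines and runs of whitespace into single spaces
    $text = trim(preg_replace('/\s{2,}/', ' ', str_replace("\n", ' ', $text)));
    $result = array('title' => $text, 'date' => 0);

    $open = strrpos($text, '(');
    if ($open === false) {
        return $result;
    }
    $close = strpos($text, ')', $open);
    if ($close === false) {
        return $result;
    }
    $timestring = substr($text, $open + 1, $close - $open - 1);
    $parsed = strtotime($timestring);
    if ($parsed === false) {
        return $result;
    }
    // Only strip the parenthesized part once we know it really is a date
    $result['title'] = trim(str_replace("($timestring)", '', $text));
    $result['date'] = $parsed;
    return $result;
}

$entry = splitTitleAndDate("Mobile Usability\n   (July 20, 2009)");
echo $entry['title'] . ' / ' . date('Y-m-d', $entry['date']) . "\n";
```

Unlike the script, this version also checks that a closing parenthesis exists before calling substr.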
Using WordPress as a website backend
How often have you found yourself creating a great new website design for a friend, family or a client, just to realize after all your HTML and CSS is done that the thing will need an administration panel? Way too often, the administration comes as an afterthought, and thus it ends up being incomplete, hard to use or in some cases absent. Two months down the track you end up being contacted again to do some “site updates”, and these just keep coming.
Making a good administration panel takes a bit of work, and it takes planning to ensure that the client can update the parts of the site as he/she wishes, and more importantly, that your interface gets all the information it needs. If your design has a thumbnail for every post, your administration panel has to provide the opportunity to upload an image and attach it to a post.
For years, I made these admin pages myself, until I read a post on WordPress custom themes. The post was about how to write a theme for WordPress, but it hinted at the possibility of having a standalone site that used WordPress as its backend, thus providing a full-fledged, well-designed and familiar administration interface to any site. At first I was skeptical of the idea because I thought WordPress was way too inflexible to allow for the variety of pages I make. I was wrong. You see, if you really think about it, all you need to make most websites are two things: posts in categories and dynamic text/HTML blocks.
To give you an idea on how flexible WordPress is, and how it can be used, consider the following website which I am currently working on. The site is for a sound production company, and will contain the following:
- Short posts giving updates on what the studio is working on at the moment. These have to have a short text to be shown on the front page, a feature image and a rich-text full article.
- An image feed showing images of the sound studio
- A page with the various projects the studio has completed. Each project has an image, a description and one or more music tracks. Each track has two variants, a streaming-quality MP3 and a high-quality WAV.
- An about page with two separate columns, one on the sound designer, and one on the company
- A list of previous clients with a short description and a link to the client’s website
At first, this seems outside the scope of WordPress which only deals with Posts and Pages, but let’s have a closer look.
The posts are clearly just regular WordPress posts. Let us put them in a category called “Frontpage”.
The image feed is just a stream of images. We here have two choices, use WordPress’ media library, and attach all images to be shown in the feed to a static page, or use a flickr stream. I opted for the latter, but WordPress could have handled this fine!
Each music project can be a blog post (you heard me right) in the category “Projects”. We can then attach images and songs to the post using the WordPress media library. The two versions of each music file can be given the same name, and we can use the MIME type to distinguish between them.
The two text blocks on the about page can be two WordPress Pages (let’s call them ‘designer’ and ‘studio’).
Finally, the list of clients can be implemented using a set of links in a link category.
So, how would we go about converting our static HTML into a fully dynamic site with a complete administration interface? Let’s dig into some code.
Getting access to WordPress from your code
The first step to a well-integrated site is to include the following code at the top of any page that needs to access WordPress functions. Naturally this does not need to be included in included/required files.
define('WP_USE_THEMES', false);
require('./wp-blog-header.php');
This gives you access to a whole set of WordPress functions (http://codex.wordpress.org/Function_Reference) that will aid us in integrating your site with WordPress. Unfortunately, the WordPress API is not very well structured, and naming conventions are a bit all over the place, but we’ll make do.
Getting content from WordPress
From the WordPress function list, there are a couple of terms you need to become accustomed to in order to start using the API. First of all, “The Loop”.
Now, we will not actually be using the loop to display any of our pages for two reasons: “The Loop” is a magic thing that mysteriously figures out what posts/pages to display, and is associated with some magic methods such as get_the_author which magically contain data about the “current” post, whatever that is. Second, it provides very little flexibility for selecting only certain posts/pages.
I will not go into the details of “The Loop” here, I will just say that it is a while loop that most WordPress templates use to print blog posts, pages, etc. to abstract away the backend query. There are several methods in the WordPress API that depend on being used in The Loop, and these usually contain “the” in the function name. Avoid these!
Next, in WordPress, pages are posts. Special types of posts, but posts nonetheless. This means that if you fetch a page, the various fields available will be the exact same as those for post, and they will be named post_title, post_content, etc…
When printing the content of posts (that is, any field that has a rich text field as input), WordPress depends on you running the data through another magical function: wpautop. This function automatically adds p tags where it thinks is appropriate to mimic the appearance in TinyMCE in the admin panel. Always put post_content through this function, otherwise your output is going to look very weird indeed.
Finally, the WordPress API usually returns objects or lists of objects. This is very convenient for most uses, but it also means that you have to take care in those cases where it doesn’t. One such method that you will probably be using is wp_get_attachment_image_src; this function actually returns a numerically indexed array.
Most functions that return objects return all the fields outlined in the appropriate table in this database diagram: http://codex.wordpress.org/Database_Description. Note that almost all models will contain an ID field which comes in very handy. Most other columns are prefixed by the name of the table, and this prefix is also used in the object attributes.
Getting posts
When getting blog posts, the main function to think about is get_posts. This function has a plethora of configuration options, but usually, you will only need the numberposts option, and maybe offset. In the case of the website I was developing, I wanted just the posts in a given category:
foreach ( get_posts ( 'numberposts=7&category=4' ) as $post ) {
    echo $post -> post_title;
}
This is a very simple example which only prints the title of each post, but you get the drift.
If you want a full version of a post given its ID, you would use the quite similar get_post (http://codex.wordpress.org/Function_Reference/get_post) function. This function takes a post ID, and returns an object representing that post. This object tells you nothing about the author or any attached images, so these will have to be fetched separately as such:
$post = get_post ( $_GET['p'] ); // This is probably quite insecure. Sanitize your input!
$author = get_userdata ( $post -> post_author ); // Here we should do some error checking on $post first
$images = get_children ( array(
    'post_type' => 'attachment',
    'post_parent' => $post -> ID,
    'post_mime_type' => 'image'
) );
There is a bit of voodoo going on here, so let’s take it step by step.
The first line should be pretty straightforward, we simply get the appropriate post object by its ID (which we take from the query string).
The next line is also quite simple, we fetch the user data of the user with the ID matching that of the author of the post.
Now we get to the strange bit; get_children. You see, WordPress treats almost everything as posts. Even attachments, no matter the type, are considered posts, and are part of the page/post hierarchy. Thus, to get the attachments of a post or page, we are actually getting all the children of the given object of the type ‘attachment’. I have also added a filter on ‘post_mime_type’ to ensure we only get images. Notice how WordPress sometimes uses strings as arguments, and other times uses arrays? Turns out you can usually get away with both approaches… Someone should really write a wrapper class to sort out that mess, but until then, we’ll have to deal with it. The good part though is that the process for getting the images for a page is exactly the same, just replace get_post with get_page!
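To make the string-versus-array point concrete, here is a small sketch in plain PHP of how the two argument styles can be made equivalent. The normalizeArgs helper is something I made up for illustration; WordPress ships its own wp_parse_args function for this job, and I am not claiming it is implemented like this.

```php
<?php
// Hypothetical helper (not part of WordPress): accept either a
// query-string style argument or an array, and always return an
// array merged over the given defaults.
function normalizeArgs($args, array $defaults = array()): array {
    if (is_string($args)) {
        // 'a=1&b=2' becomes array('a' => '1', 'b' => '2')
        parse_str($args, $parsed);
        $args = $parsed;
    }
    return array_merge($defaults, (array) $args);
}

$fromString = normalizeArgs('numberposts=7&category=4', array('offset' => 0));
$fromArray  = normalizeArgs(array('numberposts' => 7, 'category' => 4), array('offset' => 0));

// Same keys and loosely equal values, although the string form yields strings
var_dump($fromString == $fromArray);
```

Note that values coming from the string form are strings ('7' rather than 7), which is one reason the array form is usually the safer choice.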
The most interesting part though is showing an image you’ve fetched. WordPress “conveniently” provides user-customizable thumbnails for all uploaded images. Unfortunately, these tend to be cropped in weird ways, and are very unpredictable and unlikely to look nice. When printing an image, you have a choice between several formats, amongst others ‘thumbnail’ (the default) and ‘full’. The only one that gives you an uncropped image is ‘full’, but this will give you the image in its original resolution. True, the users can edit and scale the images in the admin panel, but how many end-users can you expect to do that? Unfortunately there is no way around that at the moment AFAIK, but one happy day…
Anyway, until then, you have a choice of two functions for outputting your images: wp_get_attachment_image and wp_get_attachment_image_src. They both take the same arguments, but the difference is that the first one returns a full ‘img’ HTML tag with the alt, title, width and height attributes already set, whereas the second one just returns the image URL, the width and the height as a numerically indexed array that you can decide what to do with. They both take the ID of the image as the first parameter, and the size you want as the second. Here, you can either give a predetermined size such as ‘thumbnail’, get the full image with ‘full’ or get a cropped thumbnail that fits inside a certain box by passing an array of two values, width and height, as such: array ( 64, 64 ). If you want to get the image description and title yourself, those are stored in the object you used to get the ID for wp_get_attachment_image_*, i.e. in $images[$i].
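As a rough sketch of what you might do with that numerically indexed array, the snippet below builds an img tag by hand. The buildImageTag helper and the example URL are made up; the only assumption about WordPress is that the array holds the URL, width and height at indexes 0, 1 and 2.

```php
<?php
// Hypothetical helper: turn the (url, width, height) array into an <img> tag.
// $src is assumed to have the shape wp_get_attachment_image_src returns.
function buildImageTag(array $src, string $alt = ''): string {
    list($url, $width, $height) = $src;
    return sprintf(
        '<img src="%s" width="%d" height="%d" alt="%s">',
        htmlspecialchars($url, ENT_QUOTES),
        $width,
        $height,
        htmlspecialchars($alt, ENT_QUOTES)
    );
}

// Stand-in data; inside WordPress this would be something like
// buildImageTag ( wp_get_attachment_image_src ( $image -> ID, 'full' ), $image -> post_title )
$tag = buildImageTag(array('http://example.com/uploads/studio.jpg', 64, 64), 'Mixing desk');
echo $tag . "\n";
```

On a real site you would also want to check that wp_get_attachment_image_src actually returned an array before passing it on, since it returns false when no image is found.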
Other attachments (mp3s for instance)
When it comes to getting other post attachments, this is actually quite trivial once you know how to fetch images. Instead of using ‘post_mime_type’ => ‘image’, you simply use another MIME type. On the site I am developing, I will use the MIME type for MP3, which is ‘audio/mpeg’ as far as WordPress can tell (you can see this in the WordPress admin panel -> Media). I would therefore substitute ‘post_mime_type’ => ‘image’ with ‘post_mime_type’ => ‘audio/mpeg’. Simple as that!
To get the direct URL for a non-image attachment, you can use the wp_get_attachment_url function. As for getting the high quality version of a file, this is just a matter of selecting an attachment with the same title (i.e. the name of the file without the extension), but a different MIME type.
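Matching the streaming and high-quality versions of a track could look roughly like the sketch below. It works on plain objects carrying the post_title and post_mime_type fields, so it runs without WordPress; the groupTracksByTitle helper, the sample data and the ‘audio/wav’ type are assumptions on my part (check the Media screen for the MIME type your install actually reports).

```php
<?php
// Hypothetical helper: group attachment objects by title, then by MIME type,
// so that both variants of a track end up side by side.
function groupTracksByTitle(array $attachments): array {
    $tracks = array();
    foreach ($attachments as $attachment) {
        $tracks[$attachment->post_title][$attachment->post_mime_type] = $attachment;
    }
    return $tracks;
}

// Stand-in data; on a real site this would come from get_children()
$attachments = array(
    (object) array('ID' => 11, 'post_title' => 'intro', 'post_mime_type' => 'audio/mpeg'),
    (object) array('ID' => 12, 'post_title' => 'intro', 'post_mime_type' => 'audio/wav'),
    (object) array('ID' => 13, 'post_title' => 'outro', 'post_mime_type' => 'audio/mpeg'),
);

$tracks = groupTracksByTitle($attachments);
foreach ($tracks as $title => $variants) {
    echo $title . ': ' . implode(', ', array_keys($variants)) . "\n";
}
```

Each entry can then be handed to wp_get_attachment_url by ID to get the download links.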
Dealing with pages/editable content boxes
Now, for the boxes on the about page which the administrators of the page should be allowed to edit. This is as easy as just creating two new pages in the WordPress admin and noting down the name you use. Back in your code, you can then use the following snippet to print the content of the box/page:
$page = get_page_by_title ( 'About box left' );
echo wpautop ( $page -> post_content );
By now this should look familiar. We are simply fetching the page by its title, then passing the page’s content through wpautop and echoing the result.
Lists of links with descriptions
Our final challenge for this site will be to fetch the list of clients. We’ve already determined that we are going to use the WordPress Links library because this provides exactly the fields we need, a title, a URL and a short description. However, if you start looking through the WordPress API for anything related to links, you will come up empty handed. The reason for this is that in their wisdom, WordPress decided to call links “bookmarks” in their API for the sake of clarity. The function we are looking for here is called get_bookmarks, and again we may specify lots of parameters. In our case, however, we are only concerned with one of them; category. Since we may want to add other links later that should not show up in the clients list, we create a link category from the WordPress admin and note down the category ID. In my setup it was 3, and so my code to get the links/bookmarks becomes:
foreach ( get_bookmarks ( array ( 'category' => 3 ) ) as $link ) {
    echo '<a href="' . $link -> link_url . '">' . $link -> link_name . '</a>';
    echo '<blockquote><p>' . $link -> link_description . '</p></blockquote>';
}
Of course, this is a simplified version of the end result, but it should give you enough of an idea to get you on your way.
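One thing the simplified loop leaves out is escaping, which matters as soon as a client name contains an ampersand or a quote. Here is a sketch using plain PHP's htmlspecialchars on objects with the same link_url, link_name and link_description fields; the renderClientList helper and the sample client are invented for the example.

```php
<?php
// Hypothetical helper: render link objects as HTML with every field escaped.
function renderClientList(array $links): string {
    $html = '';
    foreach ($links as $link) {
        $html .= '<a href="' . htmlspecialchars($link->link_url, ENT_QUOTES) . '">'
               . htmlspecialchars($link->link_name, ENT_QUOTES) . '</a>';
        $html .= '<blockquote><p>'
               . htmlspecialchars($link->link_description, ENT_QUOTES)
               . '</p></blockquote>';
    }
    return $html;
}

// Stand-in data; on a real site this would be the result of get_bookmarks()
$links = array(
    (object) array(
        'link_url' => 'http://example.com/',
        'link_name' => 'Smith & Sons',
        'link_description' => 'Radio jingles',
    ),
);
$clientHtml = renderClientList($links);
echo $clientHtml . "\n";
```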
Final thoughts
As you have now seen, this entire page can now be administered fully through WordPress with its quite good admin panel, and the user won’t even think twice about WordPress really being a blogging tool. In fact, neither should you, because as you can see, it is more than flexible enough to be used for quite complex websites. Your users will be happy with a comfortable admin interface, and you won’t have to touch a single piece of admin code!
Inline website administration
Almost all modern websites require some sort of administration, and this usually involves creating a separate administration page where articles can be added and users managed. Lately, I’ve been making quite a few new websites that will be released in the upcoming year, and all of these have been quite simple sites with a single user and where the administration consists mainly of adding simple news updates and updating page text. For these sites, a full blown administration panel is not necessary, and is also quite inconvenient as the user will have to go back and forth to see the results. So, what are the alternatives?
(Live examples are not available at the moment, but might come later)
AJAX driven, on-page administration
Here, the user (the person administering the website) is allowed to edit content on the same page as the content through a rich text area in a popup, and the text is then changed afterwards to reflect the user’s edits.
The simplest, and in my experience most flexible way of doing this is through named fields. Each block of text on the site gets its own unique name, and is linked to a plain text file on the server. In my small site setups, I usually use a structure like this:
/
    pages/
        about.inc.php
        projects.inc.php
        bio.inc.php
    api.php
    page.php
    index.php
The .inc.php files can either be plain text or contain PHP code. The most important thing is that they have a unique name. The other files then usually look something like this (simplified for clarity – remember security and error checking!):
<?php
// page.php
function printBlock($name) {
    if ( !file_exists ( 'pages/' . $name . '.inc.php' ) ) return;
    echo '<div id="' . $name . '" class="editable">';
    require 'pages/' . $name . '.inc.php';
    echo '</div>';
}

// api.php
require 'page.php';
$action = $_GET['a'];
$block = $_GET['e'];
switch ( $action ) {
    case 'get':
        printBlock ( $block );
        break;
    case 'post':
        // Save the new content to the file backing this block
        file_put_contents ( 'pages/' . $block . '.inc.php', $_POST['content'] );
        break;
}

// index.php
require 'page.php';
?>
<!-- HTML structure -->
<!-- Then, whenever you're printing a block or page that should be editable, call printBlock -->
<?php printBlock ( 'about' ); ?>
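Since api.php takes the block name straight from the query string, one piece of the "remember security" part is making sure that name cannot point outside the pages directory. A minimal sketch of such a check follows; the blockPath helper is hypothetical and not part of the setup above.

```php
<?php
// Hypothetical helper: map a block name to its file inside $dir,
// or return null when the name contains anything suspicious.
function blockPath(string $name, string $dir = 'pages'): ?string {
    // Letters, digits, dashes and underscores only: no dots, no slashes.
    // \z anchors at the very end, so a trailing newline is rejected too.
    if (!preg_match('/^[A-Za-z0-9_-]+\z/', $name)) {
        return null;
    }
    return $dir . '/' . $name . '.inc.php';
}

var_dump(blockPath('about'));      // a path inside pages/
var_dump(blockPath('../config'));  // rejected
```

This only covers the file name. Because saved content is later pulled in with require, the 'post' action would also have to be restricted to a logged-in administrator, or anyone could write PHP code to the server.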
Next, you will have to make some sort of JavaScript hook to make all editable areas editable. I like to use a combination of CKEditor, a simplified version of lightbox and jQuery so the end result looks something like this when a user double clicks on a box with the editable class:
Upon saving, jQuery sends an AJAX request to the api.php file with the updated contents, and also changes the contents of the block on the page using the .html() method on the element with the same ID as the block name.
In-line administration
On some sites, popup boxes simply won’t cut it. In fact, they might even become a bit cumbersome when working with news articles and such where you might want a live preview of the article as you’re typing it. Earlier, one had to have a rich text editor with a “Preview” button, but now we have a much better tool available: contentEditable. This awesome attribute allows you to tell the browser to allow the user to change the contents of an element on your page at will. Consider these screenshots that illustrate adding a new news post on a page utilizing this attribute for administration:
As you can see, this is a very simple way of creating and editing posts – and immediately seeing how it would look on the page. The major drawback is that you cannot easily accept rich inline content such as images and video, or even simple text formatting. On the other hand, such features often clutter the articles anyway. On this site, I have overcome this by allowing file attachments that are placed beneath the article based on their type (images are shown in a gallery strip, videos are embedded, etc.) Text formatting is achieved through a markdown-like syntax handled by JavaScript. There is no rich text logic in the backend.
Using contentEditable is quite simple. All you have to do is use JavaScript’s setAttribute/removeAttribute functions on any element you want to be editable. Set the attribute to true when you want it turned on, and remove it when you want it off. Apart from this, everything is quite straight-forward and very similar to the previous method of popup administration. JavaScript sends the new content to the backend, which saves it and returns the HTML rendering of the content as it would be displayed when loading the front page regularly. JavaScript then swaps the editable post area with the HTML from the server and disables editing on it.
Rounding up
Both these techniques provide quite intuitive and easy-to-access administration equivalents to classical admin-panel interfaces. They are not especially complex to build either, though they provide the user with a more comfortable and usable way to manage their sites. If you have any questions regarding these techniques, don’t hesitate to use the comment field below or e-mail me at jon <you know what goes here> thesquareplanet <and you know this one as well> com.
Developing for the modern web
Web development today is a constant struggle between three major stakeholders: the customer, the designer and the developer. The customer tries to push through his or her (often distorted and silly) mental image of the website, the designer wants to be original and creative, producing lots of intricate designs with fancy visual effects, and the developer attempts desperately to explain to both the customer and the designer why what they’re doing is a bad idea (heavy background images, crammed pages, no whitespace, confusing visual effects…). The developers aren’t all good either, though: they tend to put as many fancy tricks and solutions into the final product as they can, often resulting in exotic bugs in various browsers and usually ungraceful downgrading™. In all of this, one stakeholder is often wholly forgotten, even though it is probably the most important one: the users.
Users often don’t know the first thing about how the web works. They don’t care whether the site is optimized for Firefox, Internet Explorer, Chrome or Safari (in fact, they probably don’t even know what a browser is…). The users want a site that is visually appealing, but not distracting – informative, but not cluttered – clear, but not over-simplified – and most importantly, one that is responsive. When a user does something, they should begin to see something happening within 0.1 seconds (http://www.useit.com/alertbox/timeframes.html) to feel as though they aren’t being slowed down by the site itself. Furthermore, the total loading time for whatever action the user initiates should be less than a second for the user not to fall out of his or her “flow”. Way too many websites violate these simple rules, causing the site to feel unresponsive, and the users are likely to jump to the next site on their list.
In this post, I hope to show you how to make your website faster – mainly through optimizing the initial page load. In order to do this, there are three steps that need to be taken: Combine, Compress and Communicate. Repeat after me: Combine, Compress and Communicate.
Combine
Many developers seem to think (albeit erroneously) that many small files are better than a few large ones. This might seem intuitive since a smaller file downloads faster than a large one, and you would think they could all be gotten out of the way quicker. The truth is quite different. Because of how HTTP works, the browser has to issue a separate request to the server for every single file, which adds quite a bit of overhead when several files have to be downloaded. Also, modern browsers limit the number of simultaneous downloads (typically to around six per host), so downloading all of your small files goes even slower. Add to this the sequential nature of JavaScript: when the browser hits a script (external or not), it stops loading the page and doesn’t continue until the script has been downloaded and interpreted.
Therefore, you should work to combine as many of your files as possible. Don’t jump to put all your scripts and styles inline, however (you will understand why in Communicate). Instead, you should attempt to combine all your CSS files into one, all your JavaScript into another, and all your images into a third. Ideally, you should need no more than three external files on your site. So, how do you go about doing this?
CSS and JavaScript
Combining CSS and JS files shouldn’t itself be a problem. Open up a text editor, copy-paste all of your CSS or JS into one file, save it and upload. You should probably still keep the separate files for readability though. Of course, modern web applications are usually a bit more complicated. For instance, you might have a stylesheet that is only included on pages with ads on them, or a JavaScript file that is only needed on your front page. In these cases, you should look into using a combinator. One of the best sites describing the techniques of combining is this one. The mod_concat plugin for Apache2 provides several advantages over traditional scripting approaches, especially with regards to communication (as will be discussed later).
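If you would rather not copy-paste by hand, the combining step is easy to script. The following is a bare-bones sketch of such a build step, not a replacement for a proper combinator: the combineFiles function and the file names are made up, and it simply concatenates the files in the order given.

```php
<?php
// Hypothetical build step: concatenate several CSS (or JS) files into one.
// Each part is preceded by a comment naming the file it came from.
function combineFiles(array $sources, string $target): bool {
    $combined = '';
    foreach ($sources as $source) {
        $combined .= '/* ' . basename($source) . " */\n"
                   . file_get_contents($source) . "\n";
    }
    return file_put_contents($target, $combined) !== false;
}

// Demo with throwaway files in the system temp directory
$dir = sys_get_temp_dir();
file_put_contents("$dir/reset.css", "body { margin: 0; }");
file_put_contents("$dir/layout.css", "#main { width: 960px; }");

combineFiles(array("$dir/reset.css", "$dir/layout.css"), "$dir/site.css");
echo file_get_contents("$dir/site.css");
```

Order matters for both CSS and JavaScript, so list the sources in the same order the page used to include them.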
Images
All your images should be done as sprites. Ideally, you should even be able to put every single image on your site into a single png image. Do this, and you will substantially reduce the loading time of your site. For an introduction to CSS sprites, have a look here.
Compress
All your CSS and JS files should be compressed to reduce overall download size. Again, it is usually a good idea to keep the original, uncompressed versions of the files, and re-compress the files whenever you change them. For CSS, I recommend the YUI Compressor (http://www.refresh-sf.com/yui/). It does JavaScript as well, but Google’s recently released Closure Compiler seems to be even more effective at compressing it. You can find it at http://closure-compiler.appspot.com/home. With the Closure Compiler, you can also select the advanced mode, which will decrease the total file size even more, but will mess up your files’ external API. This means that any functions you define inside your files won’t be available from the outside by the same name. The internal workings of the file will be preserved though.
Apart from minimizing the files, you should also compress them using something like GZip which is natively supported by several browsers. To see how to do this automatically with Apache2, have a look at http://www.cyberciti.biz/tips/speed-up-apache-20-web-access-or-downloads-with-mod_deflate.html.
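To get a feel for how much gzip buys you on top of minification, you can try it from PHP itself. This sketch assumes the zlib extension is available and uses a made-up, deliberately repetitive stylesheet, so the ratio you see will be better than on real files.

```php
<?php
// A made-up stylesheet; repetitive text like CSS compresses very well
$css = str_repeat(".box { margin: 0; padding: 0; border: 1px solid #ccc; }\n", 200);

// gzencode produces the same gzip format browsers accept for Content-Encoding: gzip
$gzipped = gzencode($css, 9);

printf("original: %d bytes, gzipped: %d bytes\n", strlen($css), strlen($gzipped));
```

For pages generated by PHP, calling ob_start('ob_gzhandler') at the top of the script is one way to have the output compressed when the browser supports it.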
Communicate
OK, so all of your files are combined and compressed, and you’ve never seen the CSS and JavaScript download so quickly. How can it possibly go any faster? Quite simple – by preventing the browser from having to download the files at all. Modern browsers include a lot of caching technology to prevent them from downloading unnecessary data from the server. The problem is that many web servers do not communicate properly the states of the files, and the browsers can thus not determine if a file has changed or not; and therefore they download the file just to make sure. So, what should you do?
First of all, you need to tell your web server to send out as much data as possible about your file. This especially applies to dynamic files such as those created by PHP. Have a look here for a more thorough discussion of this topic.
Second, files that are GZipped by Apache don’t always get an expiration date, causing the browser to re-download the file on every page load. To overcome this problem, have a look at the first answer on this page
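For dynamic files, "telling the browser as much as possible" mostly comes down to sending a Last-Modified header and answering with 304 Not Modified when the browser already has the current version. Below is a sketch of the decision logic as two small functions; the names isNotModified and lastModifiedHeader are my own, and a complete solution would also look at ETag and Cache-Control.

```php
<?php
// Hypothetical helper: can we answer with 304 Not Modified?
// $ifModifiedSince is the raw If-Modified-Since request header, or null.
function isNotModified(?string $ifModifiedSince, int $lastModified): bool {
    if ($ifModifiedSince === null) {
        return false;
    }
    $since = strtotime($ifModifiedSince);
    return $since !== false && $since >= $lastModified;
}

// Hypothetical helper: format a timestamp as a Last-Modified header line
function lastModifiedHeader(int $timestamp): string {
    return 'Last-Modified: ' . gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// In a real script this would be something like:
//   $modified = filemtime(__FILE__);
//   if (isNotModified($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $modified)) {
//       http_response_code(304);
//       exit;
//   }
//   header(lastModifiedHeader($modified));
$modified = gmmktime(12, 0, 0, 1, 15, 2010);
echo lastModifiedHeader($modified) . "\n";
```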
Final thoughts
In the course of this post, I hope I have given you an overview of what can be done to speed up the loading time of web pages, and enough pointers to keep you going in your quest for the best speed your website can achieve. This is an ever-expanding topic, and new techniques are always appearing, so you should attempt as best you can to keep up to speed (pun intended) on the newest advances in the field.
Happy speeding!
Browse the web with PHP
Every so often, you come across a website that you would like to check regularly. Usually, this website is placed behind some sort of login, and therefore, you think, you might just as well forget it. A while ago, I found myself in the same situation. My university in Oslo published grades online, but gave you no warning when the exam results were published, so you had to check every now and then to see if you had any new ones. I figured that this was a bit bothersome, and wanted to find a way around it.
There are several scripts and browser plugins out there that can check a page for updates on a regular basis, and notify you when something changes. The problem is that this site required you to log in first by submitting a form, and then navigate to the relevant page. I therefore decided to write a PHP class (or actually two) that would allow me to browse the web as though through a browser: submitting forms and clicking links.
The result was the two classes Browser (http://www.phpclasses.org/browse/package/5450.html) and RemoteForm (http://www.phpclasses.org/browse/package/5449.html). The latter is a class that takes a form and parses out any input fields, selects and textareas and their respective default values. It then allows you to set values for these fields and submit the form – returning the resulting URL. The Browser class is one layer above, and depends on the RemoteForm class for handling form submission. It allows you to start a browser session and then navigate by simulating clicks on links through XPath selection.
See how simple it is to submit a search form on Wikipedia:
<?php
require 'browser.class.php';

/**
 * The long way to the PHP Reference Manual...
 */

/**
 * New browser object
 */
$b = new Browser ( );

/**
 * Navigate to the first url
 */
$b -> navigate ( 'http://en.wikipedia.org/wiki/Main_Page' );

/**
 * Search for php
 */
$b -> submitForm (
        $b -> getForm ( "//form[@id='searchform']" ) -> setAttributeByName ( 'search', 'php' ),
        'fulltext'
    )
    -> click ( "//a[@title='PHP']" )     // Click the PHP search result
    -> click ( "PHP Reference Manual" ); // Click the link to the ref

echo $b -> getSource(); // Output the source
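To give an idea of what the form-parsing step involves, here is a rough sketch of pulling field names and default values out of a form with DOMDocument and XPath. This is not the RemoteForm source, just an illustration of the idea; the extractFormFields function and the sample markup are made up.

```php
<?php
// Hypothetical helper: return name => default value for the inputs,
// textareas and selects of the first form matching $formXPath.
function extractFormFields(string $html, string $formXPath): array {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // @ silences warnings about sloppy markup
    $xpath = new DOMXPath($doc);

    $form = $xpath->query($formXPath)->item(0);
    if ($form === null) {
        return array();
    }

    $fields = array();
    $query = './/input[@name] | .//textarea[@name] | .//select[@name]';
    foreach ($xpath->query($query, $form) as $element) {
        $name = $element->getAttribute('name');
        if ($element->nodeName === 'textarea') {
            $fields[$name] = $element->textContent;
        } elseif ($element->nodeName === 'select') {
            // Default is the option marked as selected, if any
            $selected = $xpath->query('.//option[@selected]', $element)->item(0);
            $fields[$name] = $selected ? $selected->getAttribute('value') : '';
        } else {
            $fields[$name] = $element->getAttribute('value');
        }
    }
    return $fields;
}

$html = '<form id="searchform">'
      . '<input name="search" value="">'
      . '<input type="submit" name="fulltext" value="Search">'
      . '<select name="ns"><option value="0" selected>Articles</option></select>'
      . '</form>';
$fields = extractFormFields($html, "//form[@id='searchform']");
print_r($fields);
```

From there, submitting the form is a matter of overriding some of these values and sending them to the form's action URL.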
Setting up a virtual development server
As a web developer, I often come up with interesting new concepts that I want to try out. Occasionally, these require more than simply HTML, CSS and JavaScript, at which point I need to begin uploading my PHP (my language of choice) files to a remote server running Apache, test the page there, make adjustments in my local code, upload and test again. This is quite slow compared to the very efficient development cycle of plain old HTML where you can preview what you’re doing instantly in the browser.
Whilst some IDEs have support for FTP uploading directly from the editor, this still means you have to wait for the upload to complete. Also, if you want to delete files or rename folders, it often requires you to start up a separate FTP client anyway. Wouldn’t it be great if you could work with your PHP (or whatever server-side language you prefer) files directly on your computer, and access them directly through your browser without any intermediate steps? Just as if it was static HTML…
There are two ways to do this. One is to install all the server-side software on your own computer and set it up so that it uses the directory you work from as its document root. The other, which I will be telling you how to set up, is to run a virtual server on your box. The reason I prefer this approach is that it keeps a separation between your own computer and the server, and at the same time allows you to set up your server to match the server you will be deploying your application on.
So, first of all, grab a copy of Sun’s VirtualBox. This piece of software allows you to set up virtual computers running whatever OS you want. Next, download the ISO containing your favorite server OS (I have chosen Arch Linux, but this guide should apply to most Linux-based OSs, and the guiding principles should be applicable to any server OS). After installing VirtualBox, create a new Virtual Machine (VM). You can name it anything you want, and set how much RAM it should have, its hard-drive size and various other parameters. Usually the defaults are fine. When your OS has finished downloading, right-click your newly created VM in VirtualBox and select Settings → Storage → Click the image with a CD icon → Click on the small folder icon with a green flick on it (the Virtual Media Manager) → Click “Add” in the new window that pops up and select your ISO → Select the image that appears in the list and click “Select” → Click OK.
Next, we need to do some low-level dirty work so that the host OS (your computer) can connect to the guest OS (the server) through, for instance, port 80 (HTTP) and port 22 (SSH).
- Select “Network” in the settings dialog for your VM
- In “Adapter 1”, make sure the drop-down has “NAT” selected.
- Click advanced
- Set the adapter type to PCnet-PCI II
- Click OK and close VirtualBox completely
- Open up the VirtualBox configuration file for your VM in notepad or similar (On my Windows 7 install, it is located in C:\Users\<username>\.VirtualBox\Machines\<Name of VM>\<Name of VM>.xml)
- At the top where it says: “<ExtraData>”, append the following code:
<ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/apache/GuestPort" value="80"/>
<ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/apache/HostPort" value="8888"/>
<ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/apache/Protocol" value="TCP"/>
<ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/GuestPort" value="22"/>
<ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/HostPort" value="2222"/>
<ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/Protocol" value="TCP"/>
- Save the file
What we just did was to tell VirtualBox that we want to forward port 8888 on the host to port 80 on the guest, and similarly port 2222 to port 22. The reason we had to change the network adapter type to PCnet-PCI II in the fourth step was that, as you can see from the strings you added to the XML, they reference “/pcnet/”, which only works on the PCnet-type cards. If you use the Intel-based ones, you need to find the shorthand for those (shouldn’t be too much of a hassle).
All right, so now we have the VM itself sorted out. Next, we need the server up and running. Time to start up your VM for the first time. This guide will not go through the actual OS install as it is way outside its scope, but generally, you don’t need any GUI stuff, and should select any server software you’ll need if you get the choice.
Next, you should install the VirtualBox guest OS additions. Under “Devices” in the VM window, select “Install Guest Additions”. This will download and mount an ISO image with the install files for most guest OSs. For a more in-depth explanation see this link. From Linux, mount the CD and “cd” to the CD directory (pun actually not intended…), then run “sudo sh ./<script-relevant-for-your-architecture>”, for example “sudo sh ./VBoxLinuxAdditions-x86.run”. This should compile and install the relevant modules. You will also have to add two modules to your startup process: “vboxadd” and “vboxvfs”. The first is the base system for the VirtualBox Guest Additions, and the second one is the file system controller that allows you to access the shared folders set by the host. Some OSs also have these things available through repositories. In Arch for instance, they are in the package “virtualbox-additions” in community. To install, just type “pacman -S virtualbox-additions”.
Under Arch, edit “/etc/rc.conf” and add the two said modules to the MODULES array (i.e. “MODULES=(vboxadd vboxvfs)”). Since you’re there, you might want to add “httpd” and “sshd” to your DAEMONS list as well. You should also add the following to your “/etc/hosts.allow”: “httpd: ALL” and “sshd: ALL”. This allows the host to connect on those ports.
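Each forwarding rule consists of the same three keys (GuestPort, HostPort and Protocol) under a rule name of your choosing. If you need more rules, a small helper can spell out the entries for you. This is only a sketch: portForwardItems() is a hypothetical function, and the device name is a parameter precisely because it differs between adapter types.

```php
<?php
/**
 * Illustrative sketch: build the three <ExtraDataItem> entries for one
 * NAT port-forwarding rule. portForwardItems() is a hypothetical helper.
 */
function portForwardItems(
    string $rule,
    int $hostPort,
    int $guestPort,
    string $device = 'pcnet',
    string $protocol = 'TCP'
): array {
    $base = "VBoxInternal/Devices/$device/0/LUN#0/Config/$rule";
    $settings = array(
        'GuestPort' => $guestPort,
        'HostPort'  => $hostPort,
        'Protocol'  => $protocol,
    );

    $items = array();
    foreach ($settings as $key => $value) {
        $items[] = sprintf(
            '<ExtraDataItem name="%s/%s" value="%s"/>',
            htmlspecialchars($base, ENT_QUOTES),
            $key,
            htmlspecialchars((string) $value, ENT_QUOTES)
        );
    }
    return $items;
}

// Example: the rule name and ports here are assumptions; adapt them to your setup.
echo implode("\n", portForwardItems('postgres', 5432, 5432)), "\n";
```

Calling portForwardItems('apache', 8888, 80) reproduces the first three lines added to the XML file above.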
So now that the guest has its additions, we need to install the server software. I won’t go through the specifics here, but in my case, I installed Apache2 with PHP and PostgreSQL.
And so, to tie it all together: At this point, you have a working server, and after a reboot going to “http://localhost:8888/” on your host should take you to the default start page in whatever web server you’ve set up. You should also be able to connect to SSH if you’ve set that up. Thus far though, you will still have to transfer your files to the server to test them there. This is where VirtualBox’s shared folders come in.
In the VM window, select “Devices” → “Shared Folders”. Here, add your development folder as a new shared folder with full access and click OK. If you run a GUI guest you should now be able to access the folder as a network drive. If not, however, you need to do some more console magic. To get the folder to mount automatically in Linux, all you have to do is add the following line to “/etc/fstab”:
"<Name of shared folder> /srv/http/ vboxsf defaults 0 0"
The name of the shared folder is stated in the Shared Folders dialog we opened earlier.
Next, run “sudo mount -a” to mount the new folder. This should allow you to navigate to “/srv/http” on your guest OS and see all the files in your development folder. Finally, set up your web server to have “/srv/http/” as its document root, and you should be able to access any of your projects at “http://localhost:8888/path/to/file/from/development/folder/” from your host the instant you save a file, with all the bells and whistles of a fully fledged web server.
If you experience Apache serving you old versions of a file even though you KNOW you’ve made a change, edit the Apache config (“/etc/httpd/conf/httpd.conf” on Arch), and uncomment the line saying “EnableSendFile off” and restart Apache.
Enjoy your new upload-free development environment!
Drop-in PHP folder gallery
Update 21/11/09: The script now supports thumbnails for video and text, as well as timecodes for video and audio. FFmpeg is required, though.
Have you ever uploaded a bunch of images to a new folder on your web server to share them with others, and ended up with just a listing of clickable filenames? No previews or thumbnails of anything.
Well, I’ve found myself in that position too many times, and decided to create a standalone drop-in PHP gallery file that pulls thumbs from images, videos and text files and displays them in an easily scannable format. The thought behind it was that it should be a single PHP file that could be dropped in the directory and left there without any more work.
The file is not complete, but it does work with images so far, and video and text support is half-way there.
You can see the alpha version here: http://jon.thesquareplanet.com/index.gallery.phps
To see it in action, see: http://jon.thesquareplanet.com/bond/toga/
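For the curious, the core idea can be sketched in a few lines. This is not the gallery script itself: listImages() and renderGallery() are hypothetical names, the extension list is an assumption, and the sketch simply lets the browser scale the images instead of generating real thumbnails on the server.

```php
<?php
/**
 * Illustrative sketch: list the image files in a directory, sorted by name.
 * listImages() and the extension list are assumptions, not the gallery's code.
 */
function listImages(string $dir, array $extensions = array('jpg', 'jpeg', 'png', 'gif')): array
{
    $entries = scandir($dir);
    if ($entries === false) {
        return array(); // Directory could not be read
    }

    $images = array();
    foreach ($entries as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        $ext  = strtolower(pathinfo($entry, PATHINFO_EXTENSION));
        if (is_file($path) && in_array($ext, $extensions, true)) {
            $images[] = $entry;
        }
    }
    sort($images, SORT_NATURAL | SORT_FLAG_CASE);
    return $images;
}

/**
 * Illustrative sketch: turn the file names into a strip of linked previews.
 */
function renderGallery(array $images): string
{
    $html = '';
    foreach ($images as $image) {
        $src = htmlspecialchars(rawurlencode($image), ENT_QUOTES);
        $alt = htmlspecialchars($image, ENT_QUOTES);
        // The browser does the scaling here; real thumbnails would be resized server-side.
        $html .= "<a href=\"$src\"><img src=\"$src\" alt=\"$alt\" width=\"150\"></a>\n";
    }
    return $html;
}
```

Dropped into a folder as index.php, something like echo renderGallery(listImages(__DIR__)); would already give a scannable overview, at the cost of sending full-size images to the browser.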




