Tech That!

The rantings of a mad scientist

University posts

You may notice that every now and then a post appears that seems forced, or not in line with the kinds of posts I usually make (such as the JavaScript and DOM references post). This is because one of my courses at Bond University is running a Blog Assessment in which we are given certain topics to blog about every now and then. Since these assignments are compulsory, I cannot refrain from writing them, even though they clutter the rest of the blog entries.

Oh well, just thought I would explain the issue for those who are confused by these seemingly random posts.

Written by Jon Gjengset

November 3, 2009 at 15:03

Posted in University

Inline website administration

Almost all modern websites require some sort of administration, and this usually involves creating a separate administration page where articles can be added and users managed. Lately, I’ve been making quite a few new websites that will be released in the upcoming year, and all of these have been quite simple sites with a single user, where the administration consists mainly of adding simple news updates and updating page text. For these sites, a full-blown administration panel is not necessary, and is also quite inconvenient, as the user has to go back and forth to see the results. So, what are the alternatives?

(Live examples are not available at the moment, but might come later)

AJAX driven, on-page administration

Here, the user (the person administering the website) edits content on the same page as the content itself, through a rich text area in a popup. The text on the page is then updated to reflect the user’s edits.

The simplest, and in my experience most flexible way of doing this is through named fields. Each block of text on the site gets its own unique name, and is linked to a plain text file on the server. In my small site setups, I usually use a structure like this:

/
 pages/
  about.inc.php
  projects.inc.php
  bio.inc.php
 api.php
 page.php
 index.php

The .inc.php files can either be plain text or contain PHP code. The most important thing is that they have a unique name. The remaining files then usually look something like this (simplified for clarity – remember security and error checking!):

<?php

// page.php
// Print the named block inside an editable <div>; print nothing if it does not exist.
function printBlock($name) {
    $file = 'pages/' . $name . '.inc.php';
    if ( !file_exists ( $file ) ) return;
    echo '<div id="' . $name . '" class="editable">';
    require $file;
    echo '</div>';
}

// api.php
require 'page.php';
$action = $_GET['a'];
// basename() drops any directory part, so the name cannot point outside pages/
$block = basename ( $_GET['e'] );
switch ( $action ) {
    case 'get':
        printBlock ( $block );
        break;
    case 'post':
        // Restrict this to a logged-in administrator: the saved file is later executed by require
        file_put_contents ( 'pages/' . $block . '.inc.php', $_POST['content'] );
        break;
}

// index.php
require 'page.php';
?>
<!-- HTML structure -->
<!-- Then, whenever you're printing a block or page that should be editable, call printBlock -->
<?php printBlock ( 'about' ); ?>

Next, you will have to make some sort of JavaScript hook to make all editable areas editable. I like to use a combination of CKEditor, a simplified version of lightbox and jQuery so the end result looks something like this when a user double clicks on a box with the editable class:

Upon saving, jQuery sends an AJAX request to the api.php file with the updated contents, and also changes the contents of the block on the page using the .html() method on the element with the same ID as the block name.

In-line administration

On some sites, popup boxes simply won’t cut it. In fact, they might even become a bit cumbersome when working with news articles and such where you might want a live preview of the article as you’re typing it. Earlier, one had to have a rich text editor with a “Preview” button, but now we have a much better tool available: contentEditable. This awesome attribute allows you to tell the browser to allow the user to change the contents of an element on your page at will. Consider these screenshots that illustrate adding a new news post on a page utilizing this attribute for administration:

[Screenshots: before adding the article; adding a title; editing the post body; after saving the new post]

As you can see, this is a very simple way of creating and editing posts – and immediately seeing how it would look on the page. The major drawback is that you cannot easily accept rich inline content such as images and video, or even simple text formatting. On the other hand, such features often clutter the articles anyway. On this site, I have overcome this by allowing file attachments that are placed beneath the article based on their type (images are shown in a gallery strip, videos are embedded, etc.) Text formatting is achieved through a markdown-like syntax handled by JavaScript. There is no rich text logic in the backend.

Using contentEditable is quite simple. All you have to do is call the DOM’s setAttribute/removeAttribute methods on any element you want to be editable. Set the attribute to “true” when you want editing turned on, and remove it when you want it off. Apart from this, everything is quite straightforward and very similar to the previous method of popup administration. JavaScript sends the new content to the backend, which saves it and returns the HTML rendering of the content as it would be displayed when loading the front page regularly. JavaScript then swaps the editable post area with the HTML from the server and disables editing on it.

Rounding up

Both these techniques provide quite intuitive and easy-to-access administration equivalents to classical admin-panel interfaces. They are not especially complex to build either, though they provide the user with a more comfortable and usable way to manage their sites. If you have any questions regarding these techniques, don’t hesitate to use the comment field below or e-mail me at jon <you know what goes here> thesquareplanet <and you know this one as well> com.

Written by Jon Gjengset

December 30, 2009 at 01:12

Developing for the modern web

Web development today is a constant struggle between three major stakeholders: the customer, the designer and the developer. The customer tries to push through his or her (often distorted and silly) mental image of the website. The designer wants to be original and creative, producing lots of intricate designs with fancy visual effects. The developer desperately attempts to explain to both the customer and the designer why what they’re doing is a bad idea (heavy background images, crammed pages, no whitespace, confusing visual effects…). The developers aren’t all good either though – they tend to put as many fancy tricks and solutions into the final product as they can, often resulting in exotic bugs in various browsers and, usually, ungraceful degradation. In all of this, one stakeholder is often wholly forgotten, even though it is probably the most important one: the users.

Users often don’t know the first thing about how the web works. They don’t care whether the site is optimized for Firefox, Internet Explorer, Chrome or Safari (in fact, they probably don’t even know what a browser is…). The users want a site that is visually appealing, but not distracting – informative, but not cluttered – clear, but not over-simplified – and most importantly, one that is responsive. When a user does something, they should begin to see something happening within 0.1 seconds (http://www.useit.com/alertbox/timeframes.html) to feel as though they aren’t being slowed down by the site itself. Furthermore, the total loading time for whatever action the user initiates should be less than a second for the user not to fall out of his or her “flow”. Way too many websites violate these simple rules, causing the site to feel unresponsive, and the users are then likely to jump to the next site on their list.

In this post, I hope to show you how to make your website faster – mainly through optimizing the initial page load. In order to do this, there are three steps that need to be taken: Combine, Compress and Communicate. Repeat after me: Combine, Compress and Communicate.

Combine

Many developers seem to think (albeit erroneously) that many small files are better than few large ones. This might seem intuitive since a smaller file downloads faster than a large one, and you would think they could all be gotten out of the way quicker. The truth is quite different. Because of the way HTTP works, the browser has to issue a separate request to the server for every single file, which adds quite a bit of overhead when many files have to be downloaded. Also, modern browsers limit the number of simultaneous downloads (typically to about six per host), meaning downloading all of your small files will go even slower. On top of this comes the sequential nature of JavaScript: the browser stops loading the page once it hits a piece of JavaScript (external or not), and doesn’t continue until the script has finished downloading and has been interpreted.

Therefore, you should work to combine as many of your files as possible. Don’t jump to put all your scripts and styles inline, however (you will understand why in Communicate). Instead, you should attempt to combine all your CSS files into one, all your JavaScript into another, and all your images into a third. Ideally, you should need no more than three external files on your site. So, how do you go about doing this?

CSS and JavaScript

Combining CSS and JS files shouldn’t in itself be a problem. Open up a text editor, copy-paste all of your CSS or JS into one file, save it and upload. You should probably still keep the separate files for readability though. Of course, modern web applications are usually a bit more complicated. For instance, you might have a stylesheet that is only included on pages with ads on them, or a JavaScript file that is only needed on your front page. In these cases, you should look into using a combinator. One of the best sites describing the techniques of combining is this one. The mod_concat plugin for Apache2 provides several advantages over traditional scripting approaches, especially with regards to communication (as will be discussed later).
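To give a rough idea of what a scripted combinator does, here is a minimal sketch that joins a list of files into a single string. The function name combineFiles and the file names in the usage comment are made up for this example, and this is not how mod_concat works internally; a real setup would also send proper caching headers.

```php
<?php
// Hypothetical helper: join several CSS (or JS) files into one response body.
function combineFiles(array $paths) {
    $out = '';
    foreach ($paths as $path) {
        if (!is_readable($path)) {
            continue; // skip missing files instead of failing the whole request
        }
        // A comment with the file name keeps the combined output debuggable.
        $out .= '/* ' . basename($path) . " */\n" . file_get_contents($path) . "\n";
    }
    return $out;
}

// Example usage (file names are placeholders):
// header('Content-Type: text/css');
// echo combineFiles(array('css/reset.css', 'css/layout.css', 'css/ads.css'));
```

The list of files can be built per page, so a stylesheet that is only needed on pages with ads is simply left out of the list elsewhere.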

Images

All your images should be done as sprites. Ideally, you should even be able to put every single image on your site into a single png image. Do this, and you will substantially reduce the loading time of your site. For an introduction to CSS sprites, have a look here.

Compress

All your CSS and JS files should be compressed to reduce overall download size. Again, it is usually a good idea to keep the original, uncompressed versions of the files, and re-compress the files whenever you change them. For CSS, I recommend the YUI Compressor (http://www.refresh-sf.com/yui/). It does JavaScript as well, but Google’s recently released Closure Compiler seems to be even more effective at compressing it. You can find it at http://closure-compiler.appspot.com/home. With the Closure Compiler, you can also select the advanced mode, which will decrease the total file size even more, but will mess up your files’ external API. This means that any functions you define inside your files won’t be available from the outside by the same name. The internal workings of the file will be preserved though.

Apart from minimizing the files, you should also compress them using something like GZip which is natively supported by several browsers. To see how to do this automatically with Apache2, have a look at http://www.cyberciti.biz/tips/speed-up-apache-20-web-access-or-downloads-with-mod_deflate.html.
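Apache’s mod_deflate takes care of this for you, but the effect is easy to see from PHP as well. The sketch below assumes PHP’s zlib extension is available; maybeGzip is a name made up for this example, and the check of the Accept-Encoding header is deliberately simplistic.

```php
<?php
// Sketch only: gzip a response body when the browser says it accepts gzip.
// Requires PHP's zlib extension. maybeGzip() is a hypothetical helper.
function maybeGzip($body, $acceptEncoding) {
    // Simplification: a plain substring check that ignores q-values such as "gzip;q=0"
    if (stripos($acceptEncoding, 'gzip') === false) {
        return array($body, array());
    }
    return array(
        gzencode($body, 9),
        array('Content-Encoding: gzip', 'Vary: Accept-Encoding'),
    );
}

// Repetitive text such as CSS compresses very well.
$css = str_repeat(".box { margin: 0; padding: 0; border: 0; }\n", 200);
list($out, $headers) = maybeGzip($css, 'gzip, deflate');
printf("original: %d bytes, gzipped: %d bytes\n", strlen($css), strlen($out));
```

The returned headers would be sent with header() before echoing the body.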

Communicate

OK, so all of your files are combined and compressed, and you’ve never seen the CSS and JavaScript download so quickly. How can it possibly go any faster? Quite simple – by preventing the browser from having to download the files at all. Modern browsers include a lot of caching technology to prevent them from downloading unnecessary data from the server. The problem is that many web servers do not communicate properly the states of the files, and the browsers can thus not determine if a file has changed or not; and therefore they download the file just to make sure. So, what should you do?

First of all, you need to tell your web server to send out as much data as possible about your file. This especially applies to dynamic files such as those created by PHP. Have a look here for a more thorough discussion of this topic.

Second, files that are GZipped by Apache don’t always get an expiration date, causing the browser to re-download the file on every page load. To overcome this problem, have a look at the first answer on this page.
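For dynamic files, the “communication” boils down to telling the browser when the content last changed, and answering with 304 Not Modified when its copy is still current. Here is a minimal sketch of that decision; cacheResponse is a name invented for this example, and it only looks at Last-Modified (ETags and Expires are left out).

```php
<?php
// Decide how to answer a request for content last changed at $mtime (Unix timestamp).
// $ifModifiedSince is the raw If-Modified-Since request header, or null if absent.
// Returns array(status code, headers to send). Hypothetical helper, not a library call.
function cacheResponse($mtime, $ifModifiedSince = null) {
    $headers = array(
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT',
    );
    $since = $ifModifiedSince !== null ? strtotime($ifModifiedSince) : false;
    if ($since !== false && $since >= $mtime) {
        return array(304, $headers); // browser copy is still current: send no body
    }
    return array(200, $headers);
}

// Example usage at the top of a PHP page:
// $since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : null;
// list($status, $headers) = cacheResponse(filemtime(__FILE__), $since);
// foreach ($headers as $h) header($h);
// if ($status === 304) { header('HTTP/1.1 304 Not Modified'); exit; }
```

In a real page, $mtime should reflect the newest piece of data the page depends on, not just the script file itself.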

Final thoughts

In the course of this post, I hope I have given you an overview of what can be done to speed up the loading time of web pages, and enough pointers to keep you going in your quest for the best speed your website can achieve. This is an ever-expanding topic, and new techniques are always appearing, so you should attempt as best you can to keep up to speed (pun intended) on the newest advances in the field.

Happy speeding!

Written by Jon Gjengset

December 17, 2009 at 19:57

Browse the web with PHP

Every so often, you come across a website that you would like to check regularly. Usually, this website is placed behind some sort of login, and therefore, you think, you might just as well forget it. A while ago, I found myself in the same situation. My university in Oslo published grades online, but gave you no warning when the exam results were published, so you had to check every now and then to see if you had any new ones. I figured that this was a bit bothersome, and wanted to find a way around it.

There are several scripts and browser plugins out there that can check a page for updates on a regular basis, and notify you when something changes. The problem is that this site required you to log in first by submitting a form, and then navigate to the relevant page. I therefore decided to write a PHP class (or actually two) that would allow me to browse the web as though through a browser: submitting forms and clicking links.

The result was the two classes Browser (http://www.phpclasses.org/browse/package/5450.html) and RemoteForm (http://www.phpclasses.org/browse/package/5449.html). The latter is a class that takes a form and parses out any input fields, selects and textareas and their respective default values. It then allows you to set values for these fields and submit the form – returning the resulting URL. The Browser class is one layer above, and depends on the RemoteForm class for handling form submission. It allows you to start a browser session and then navigate by simulating clicks on links through XPath selection.
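To illustrate the idea behind the form parsing, here is a minimal sketch using PHP’s DOM extension. It is not the actual RemoteForm code: formDefaults is a name made up for this example, and it ignores details such as unchecked checkboxes and radio buttons.

```php
<?php
// Collect name => default value for the inputs, textareas and selects of the
// first form matching $xpath. Returns an empty array if no form matches.
function formDefaults($html, $xpath) {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // real-world HTML is rarely valid; silence parser warnings
    $xp = new DOMXPath($doc);
    $forms = $xp->query($xpath);
    if ($forms === false || $forms->length === 0) {
        return array();
    }
    $fields = array();
    $query = './/input[@name] | .//textarea[@name] | .//select[@name]';
    foreach ($xp->query($query, $forms->item(0)) as $el) {
        $name = $el->getAttribute('name');
        if ($el->nodeName === 'input') {
            $fields[$name] = $el->getAttribute('value');
        } elseif ($el->nodeName === 'textarea') {
            $fields[$name] = $el->textContent;
        } else {
            // select: use the selected option, falling back to the first one
            $opt = $xp->query('.//option[@selected]', $el)->item(0);
            if ($opt === null) {
                $opt = $xp->query('.//option', $el)->item(0);
            }
            if ($opt === null) {
                $fields[$name] = '';
            } else {
                $fields[$name] = $opt->hasAttribute('value')
                    ? $opt->getAttribute('value')
                    : $opt->textContent;
            }
        }
    }
    return $fields;
}
```

The resulting array can be modified and then sent as a GET or POST request to the form’s action URL, which is essentially what submitting the form means.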

See how simple it is to submit a search form on Wikipedia:

<?php
require 'browser.class.php';

/**
 * The long way to the PHP Reference Manual...
 */

// New browser object
$b = new Browser ( );

// Navigate to the first URL
$b -> navigate ( 'http://en.wikipedia.org/wiki/Main_Page' );

// Search for php
$b -> submitForm (
        $b -> getForm ( "//form[@id='searchform']" )
           -> setAttributeByName ( 'search', 'php' ),
        'fulltext'
    )
   -> click ( "//a[@title='PHP']" )       // Click the PHP search result
   -> click ( "PHP Reference Manual" );   // Click the link to the ref

echo $b -> getSource(); // Output the source

Written by Jon Gjengset

December 16, 2009 at 17:59

Setting up a virtual development server

As a web developer, I often come up with interesting new concepts that I want to try out. Occasionally, these require more than simply HTML, CSS and JavaScript, at which point I need to begin uploading my PHP (my language of choice) files to a remote server running Apache, test the page there, make adjustments in my local code, upload and test again. This is quite slow compared to the very efficient development cycle of plain old HTML where you can preview what you’re doing instantly in the browser.

Whilst some IDEs have support for FTP uploading directly from the editor, this still means you have to wait for the upload to complete. Also, if you want to delete files or rename folders, it often requires you to start up a separate FTP client anyway. Wouldn’t it be great if you could work with your PHP (or whatever server-side language you prefer) files directly on your computer, and access them directly through your browser without any intermediate steps? Just as if it was static HTML…

There are two ways you can do this; one is to install all the server-side software on your own computer and set it up so that it points to the directory you work from as its directory root. The other, which I will be telling you how to set up, is to run a virtual server on your box. The reason I prefer this approach is that it keeps a separation between your own computer and the server, and at the same time allows you to set up your server to match the server you will be deploying your application on.

So, first of all, grab a copy of Sun’s VirtualBox. This piece of software allows you to set up virtual computers running whatever OS you want it to. Next, download the ISO containing your favorite server OS (I have chosen Arch Linux, but this guide should apply to most Linux-based OSs, and the guiding principles should be applicable to any server OS). After installing VirtualBox, create a new Virtual Machine (VM). You can name it anything you want, and set how much RAM it should have, its hard-drive size and various other parameters. Usually the defaults are fine. When your OS has finished downloading, right-click your newly created VM in VirtualBox and select settings → Storage → Click the image with a CD icon → Click on the small folder icon with a green flick on it (The Virtual Media Manager) → Click add in the new window that pops up and select your ISO → Select the image that appears in the list and click “Select” → Click OK

Next, we need to do some low-level dirty stuff to make sure the host OS (your computer) can connect to the guest OS (the server) through, for instance, port 80 (HTTP) and port 22 (SSH).

  1. Select “Network” in the settings dialog for your VM
  2. In “Adapter 1”, make sure the drop-down has “NAT” selected.
  3. Click advanced
  4. Set the adapter type to PCnet-PCI II
  5. Click OK and close VirtualBox completely
  6. Open up the VirtualBox configuration file for your VM in notepad or similar (On my Windows 7 install, it is located in C:\Users\<username>\.VirtualBox\Machines\<Name of VM>\<Name of VM>.xml)
  7. At the top where it says: “<ExtraData>”, append the following code:
    <ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/apache/GuestPort" value="80"/>
    <ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/apache/HostPort" value="8888"/>
    <ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/apache/Protocol" value="TCP"/>
    <ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/GuestPort" value="22"/>
    <ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/HostPort" value="2222"/>
    <ExtraDataItem name="VBoxInternal/Devices/pcnet/0/LUN#0/Config/ssh/Protocol" value="TCP"/>
  8. Save the file

What we just did was to tell VirtualBox that we want to forward port 8888 on the host to port 80 on the guest, and similarly port 2222 to port 22. The reason we had to change the network adapter type to PCnet-PCI II in step 4 is that the strings you added to the XML reference “/pcnet/”, which only works with the PCnet-type cards. If you use the Intel-based ones, you need to find the shorthand for those (shouldn’t be too much of a hassle).

All right, so now we have the VM itself sorted out. Next, we need to get the server up and running. Time to start up your VM for the first time. This guide will not go through the actual OS install as it is way outside its scope, but generally, you don’t need any GUI stuff, and should select any server software you’ll need if you get the choice.

Next, you should install the VirtualBox guest OS additions. Under “Devices” in the VM window, select Install Guest Additions. This will download and mount an ISO image with the install files for most guest OSs. For a more in-depth explanation see this link. From Linux, mount the CD and “cd” to the CD directory (pun actually not intended…), then run “sudo sh ./<script-relevant-for-your-architecture>” – for example “sudo sh ./VBoxLinuxAdditions-x86.run”. This should compile and install the relevant modules. You will also have to add two modules to your startup process: “vboxadd” and “vboxvfs”. The first is the base system for the VirtualBox Guest Additions, and the second one is the file system controller that allows you to access the shared folders set by the host. Some OSs also have these things available through repositories. In Arch for instance, the relevant packages are in the package “virtualbox-additions” in community. To install, just type “pacman -S virtualbox-additions”.

Under Arch, edit “/etc/rc.conf” and add the two said modules to the MODULES array (i.e. “MODULES=(vboxadd vboxvfs)”). Since you’re there, you might want to add “httpd” and “sshd” to your DAEMONS list as well. You should also add the following to your “/etc/hosts.allow”: “httpd: ALL” and “sshd: ALL”. This allows the host to connect on those ports.

So now that the guest has its additions, we need to install the server software. I won’t go through the specifics here, but in my case, I installed Apache2 with PHP and PostgreSQL.

And so, to tie it all together: At this point, you have a working server, and after a reboot, going to “http://localhost:8888/” on your host should take you to the default start page of whatever web server you’ve set up. You should also be able to connect over SSH (on port 2222 on the host) if you’ve set that up. Thus far though, you will still have to transfer your files to the server to test them there. This is where VirtualBox’s shared folders come in.

In the VM window, select “Devices” → “Shared Folders”. Here, add your development folder as a new shared folder with full access and click OK. If you run a GUI guest you should now be able to access the folder as a network drive. If not, however, you need to do some more console magic. To get the folder to mount automatically in Linux, all you have to do is add the following line to “/etc/fstab”

<Name of shared folder>    /srv/http/    vboxsf    defaults    0    0

The name of the shared folder is stated in the Shared Folders dialog we opened earlier.

Next, run “sudo mount -a” to mount the new folder. This should allow you to navigate to “/srv/http” on your guest OS and see all the files in your development folder. Finally, set up your web server to have “/srv/http/” as its document root, and you should be able to access any of your projects at “http://localhost:8888/path/to/file/from/development/folder/” from your host the instant you save a file with all the bells and whistles of a fully-fledged web server.

If you experience Apache serving you old versions of a file even though you KNOW you’ve made a change, edit the Apache config (“/etc/httpd/conf/httpd.conf” on Arch), and uncomment the line saying “EnableSendFile off” and restart Apache.

Enjoy your new upload-free development environment!

Written by Jon Gjengset

December 14, 2009 at 19:37

Downloading a mms://video stream

Have you ever wanted to watch a video online, only to find that a slow connection or frequent dropouts made the stream impossible to watch? In these cases, there is rarely a “Download” button that allows you to download the entire thing and watch it in full when it’s done. Evidently, this is a real-world application of Murphy’s law.

Here at Bond University, some lectures are streamed and saved for later viewing. These are available to all students from an online interface. The problem is that these videos are streamed in the proper sense of the word, even when the lecture has been completed. That is, they do not simply play as you download; the browser has to play back a continuous video stream. This causes problems on slow connections: the browser cannot keep up, has to stop the video all the time, and must request that the server restart from a previous point in time. Even when sitting in the on-campus accommodation, the connection or the server (I don’t know which) is too slow to cope with playing these videos in real time, and as such, it is a nightmare trying to watch any of these lectures.

The stream plays in Windows Media Player and uses a protocol called mms:// (Microsoft Media Server). VLC and MPlayer can both play this as well, but take ages to load for some reason.

In my frustration, I decided to find out how to download the stream so I could play it without delay, and rewind and fast-forward as much as I wanted. Turns out this is not as straightforward as one would expect with streams.

First of all, the file is never present as a file from the server, only as a stream. This means that you cannot download the video faster than the actual length of the video. A two hour lecture therefore takes at least two hours to download. Furthermore, you cannot simply right-click the video and attempt to get the URL and download that because this will only give you a tiny text file with more URLs.

So, here is what you have to do:

  1. Get Mplayer
  2. Go to the page with the streaming video on it and right click the video
  3. Select properties and copy the URL from the window that opens
  4. Open a new tab in your browser and navigate to the given URL
  5. Press Ctrl+S, or otherwise save the page
  6. Open the downloaded file with notepad, you should see something like this:
    [Reference]
    Ref1=http://straumod.nrk.no/disk02/Lovebakken/2009-09-11/?MSWMExt=.asf
    Ref2=http://10.103.0.56:80/disk02/Lovebakken/2009-09-11/?MSWMExt=.asf
  7. Copy either of the URLs
  8. Start mplayer with the following parameters: “-dumpstream -dumpfile stream.wmv <URL>”
    For those of you who are not familiar with MPlayer and run Windows, here is how you do that:

    • Press Win+R or press the Start menu and click “Run”
    • Type “cmd” and press enter
    • In the new window, type “cd \”, press enter, type “mkdir stream”, press enter, type “cd stream” – The previous commands made a new folder in the root of your main drive called “stream”
    • Next, to run mplayer: type “C:\Program Files\mplayer (including the opening quote, but without a closing one) and press tab, then type \mplayer and press tab again
    • Write a space, followed by the parameters written above (starting with “-dumpstream”), replacing “<URL>” with the URL you copied in step 7
    • Press enter and wait
    • When the program finishes (i.e. the last line says something like “C:\stream>”), you should find the video in the folder “C:\stream” as “stream.wmv”.
    • Rename, play and enjoy!
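If you need to do this for many lectures, steps 6 to 8 are easy to script. The sketch below pulls the URLs out of the saved reference file and builds the MPlayer command line from step 8. The function name streamUrls and the file name in the usage comment are made up for this example, and I am assuming the file always has the RefN=URL layout shown above.

```php
<?php
// Extract the RefN=... URLs from the contents of the saved reference file.
function streamUrls($contents) {
    preg_match_all('/^\s*Ref\d+=(\S+)/m', $contents, $matches);
    return $matches[1];
}

// Example usage (the file name is a placeholder):
// $urls = streamUrls(file_get_contents('lecture.asx'));
// if (count($urls) > 0) {
//     // escapeshellarg() keeps characters such as ? and = in the URL away from the shell
//     $cmd = 'mplayer -dumpstream -dumpfile stream.wmv ' . escapeshellarg($urls[0]);
//     passthru($cmd);
// }
```

As with the manual approach, the download still takes as long as the lecture itself.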

Written by Jon Gjengset

December 13, 2009 at 14:43

My web

We all use the web differently, and all have the pages we cannot live without. Following is a list of the top websites I cannot live without. What are yours?

Google Apps

Google Apps is a service that allows you to manage Google services for your own domains. For instance – I own several domains (thesquareplanet.com, casa-rioja.com, awknard.com, nerd-geek.com …), and host them all myself. I also used to host my own mail and calendar, but when I discovered Google Apps, I stopped doing that. Not only would I lose all mail that was sent to me if my server or internet connection went down, but I also had to constantly maintain security and other enhancements. With Google Apps, you can add all of your domains into a central system, and manage e-mail addresses, users, calendars and all sorts of other Google hosted services. Best thing is – it’s free! (as long as you’re non-profit that is)

I now have the GMail and GCalendar interfaces on my own domains, and can use all of their features as if I was using my Google Account.

Facebook

Although Facebook receives quite a bit of critique with regards to privacy issues, and many abandon it entirely because of this, I feel that it is truly an amazing tool on the web that should be taken advantage of. As with everything else one does online, one has to be conscious that everything that is put out can, and probably will, be used against you, but as long as this is clear, Facebook brings a lot to the table. It allows you to organize your social life, and easily stay in touch with everyone you know (and some you don’t). By using Facebook, you also create an online presence which is extremely important in this increasingly online world.

Google Reader

If you follow anything online, be it a blog, a forum, a news site, a video feed or acoolname.com, you should get a feed reader. It allows you to merge feeds from all your article sources, and view them in one grand, unified list, sorted and grouped to your heart’s desire. Google Reader is an online feed reader which provides a clean interface for you to check updates in your online world from anywhere, at any time. All you need is a Google account, and you’re on your way!

What else do I use?

  • Vimeo – Upload & watch videos
  • YouTube – Upload & watch videos
  • GrooveShark – Listen to music online for free
  • Songza – Listen to music online for free
  • XMarks – Synchronize your bookmarks and passwords across browsers and computers, and access them from anywhere
  • StumbleUpon – Find new, interesting pages on the web

Written by Jon Gjengset

November 25, 2009 at 22:40

SPeeDY – The new HTTP?

Google have been coming out with a lot of cool new projects lately – from Chrome and Chrome OS to Wave and Social Search. Following this innovative trend, they have now announced that they’re working on a possible replacement for HTTP. Actually, it is not as much a replacement as it is an augmentation or “fix”. SPDY will still be using the headers and basic structure of HTTP, but will treat that structure quite differently and introduce several enhancements to make it more efficient and more suitable for the contemporary web context.

Google argues that we need a new protocol for web traffic because of the way the web has changed over the last decade. Nowadays, pages use both multimedia and several externally linked files – something which HTTP was not optimized for. More specifically, HTTP has the following shortcomings:

  • It does not allow several resources to be fetched at the same time through a single HTTP connection
  • It does not allow push-like behavior from the server to the client
  • It lacks native, and compulsory, compression of packet contents – especially headers
  • It is stateless – HTTP does not “remember” anything from previous exchanges. This causes redundant information, such as certain almost static headers, to be resent unnecessarily

Although several other projects have tried to come up with an appropriate replacement for the HTTP protocol, Google now believe that they have found one that is suitable. The SPDY project aims to accomplish several things; for example, a 50% decrease in latency for online request-response cycles and near-transparent transfer from the old technology to the new one.

This latter point is quite interesting, and has been the reason why many other similar proposals have failed. Far too many protocols attempt to reinvent the wheel, but Google has decided to retain the TCP protocol as the underlying transportation agent, and to minimize the impact on developers and end-users. This is achieved by the SPDY protocol avoiding any changes to the way the data is handled on both end-points, only the protocol in between, so that web developers won’t have to change a thing on the server side. The only changes that are needed are in the browsers and the web server itself.

The reduction in latency will be achieved primarily by enforcing compression on headers and body, slicing away unnecessary header fields and allowing several resources to be fetched over a single TCP connection to avoid packet overhead. Google has also decided to take some steps to improve the overall quality of the protocol as a whole by introducing SSL as the rule, and non-SSL as the exception (if it will be allowed at all), as well as cutting down the protocol definition so that the implementation will be much simpler.

Overall, SPDY promises a lot. All that remains is to see whether server and browser developers will join the cause and develop working implementations of the draft for testing. Knowing Google, they will probably add support for it to Chrome and release an “experimental” web server soon.

One major obstacle to its adoption, though, is that carrying multiple resources in a single TCP stream, and moving away from the strict request/response cycle of current HTTP sessions, means web servers will need substantial changes to allow for this kind of behavior. Hopefully Google will release their testing server to the public soon so we can start to see test implementations of the technology, and how hard or easy it will be to implement.

For more info on the technology, have a look at the SPDY Whitepaper.

Written by Jon Gjengset

November 25, 2009 at 21:50

Surveying Social Search

Google recently published a new feature on their Labs Search Experiments page: Google Social Search. When you enable this service, Google attempts to locate your friends and connections through online communities like Twitter, Facebook, Gmail and Google Reader, and presents you with search results that are relevant not only to your search phrase, but also to your social circle. For instance, searching for a restaurant will give you not only the regular search results for that restaurant, but also blog posts, tweets and other publicly available messages from your friends, and their friends, about what you searched for. This may be a review, a comment, images or anything else that Google finds relevant.

In order for this to work correctly, you must tell Google how to find your friends, and this is usually done through your Google Profile. There, you can add links to pages that represent you online, and Google will then go through these to attempt to find connections to other people. Results from these people will then be integrated into your search results.

It is hard to say what kind of an impact this will have on searching in the future, but I believe it will make search results much more relevant to the individual. We are more likely to listen to someone we know about a product or a service than to believe the words of a total stranger. The real roadblock that needs to be overcome is properly determining a user’s contacts and friends. Because several systems are closed to the public (Facebook for instance), a lot of connections are invisible to these kinds of crawlers. In my opinion, this is very unfortunate, and I think it is sad that people feel such a need for privacy that they hide their friends, pictures and info from the public eye. As Google Social Search currently depends solely on publicly available connections and information, it cannot get any better than what is publicized. It is limited by the raw data it can fetch, and at the moment, that is unfortunately too little for the social search results to be relevant enough. That is, at least, my experience so far with trying out Google Social Search. Hopefully this will change in the future, and bring truly individually adapted search results.

Written by Jon Gjengset

November 21, 2009 at 16:23

Jump-starting tricks for aspiring web developers

So, you want to make websites, do you? Becoming a web developer is both very easy, and very hard at the same time. Mocking up a simple page online with some text and images is easy. Not only are there several WYSIWYG website editors (What You See Is What You Get) out there, but there are also several websites that allow you to create your page online directly through point and click. This is not web development.

Furthermore, if all you want to do is make a design for a web page in Photoshop, you are not a web developer, you are a web designer. Although many web developers tend to be web designers and vice-versa, one does not imply the other. A web designer creates a site design, a web developer implements that design – there is nothing more to it. If you create your designs and implement them, you are a web developer as well as a web designer.

There are, of course, several degrees of web development; from basic HTML and CSS to fully-fledged PHP/ASP/<insert programming language here> web applications, but this is not the topic of this post. In this post I intend to give you, as an aspiring web developer, a couple of shortcuts, strategies, tricks and gotchas that I have found during my six years of development experience at the time of writing. This is by no means a complete guide to becoming a web developer, but more of a reference document to get you past the various obstacles browser developers, web standards, faulty documentation and operating systems have put in place to make our life a bit more interesting.

So, without further ado:

The only 5 tags you’ll ever need

HTML and XHTML both contain a large number of tags. Too many, in fact, for them all to be useful in most cases. Remember, XHTML (and HTML to a large degree) aims to describe the structure of the content, and that is what we have all the tags for: to make the content readily available to screen-readers, text-to-speech engines, search engines and the like. When prototyping a design, however, you rarely need all of them, or subtleties such as the difference between a <strong> and a <b> tag. In fact, you probably shouldn’t even care about the difference between an <em> and a <strong>; you would probably substitute them both with a <span> anyway. Although you should put in the appropriate tags when you begin to develop larger websites, or when you begin to polish the smaller ones, you will find that there are only 5 tags you really need when building an initial design.

  • <span> – The fundamental inline element
  • <div> – The fundamental block element
  • <a> – The link
  • <img> – The image
  • <ul> – For making lists

Although coding your site using only these tags may be considered bad practice for the reasons explained above, they will in fact get you started quickly and substantially reduce the number of tags you have to keep in your head when you’ve just started making web pages.

Do note that these are only content tags, and not the additional meta tags such as <link>, <style>, <script> and <meta> that you will also need to style and animate your site.

Reset your styles

The #1 reason why your designs break when you open them in a different browser from the one you initially developed and tested in is default margins and paddings. Every browser has its own definition of what padding and margin each element should have if you don’t specify any, and consequently, when you move to another browser, all your elements become slightly smaller or larger, and your design collapses into an unrecognizable heap of divs for no apparent reason.

Although reset stylesheets (google it) have recently become quite popular, I often find them unnecessary as they tend to reset too much – giving you more work. Instead, I just use a good old rule which simply resets the margin and padding, and nothing more:

* { margin: 0; padding: 0 }

Try putting this in your document before you begin, and you’ll find cross-browser design becomes a whole lot easier!

Understand the box model

Way too many web developers don’t understand the difference between margin and padding, or how these are rendered together with the border of the element, much less how to calculate the total dimensions of the element. The fact of the matter is that this is essential to being able to create potent web sites. It is also the alpha and omega of many of the CSS hacks you will encounter through your web development career.

A simple Google search reveals several images and sites trying to explain it, and one of the first results explains it quite simply:

Learn it by heart – it will save you much hassle and confusion!
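
As a worked example of the arithmetic, the sketch below adds up the space an element occupies under the standard W3C box model, where width and height describe the content area only. The function name boxSize and the shape of its argument are made up for this post, and it assumes the same padding, border and margin on all four sides.

```javascript
// Total space taken by an element under the standard (content-box) model.
// Assumes one value per property, applied to all four sides.
function boxSize({ width, height, padding = 0, border = 0, margin = 0 }) {
  const extra = 2 * (padding + border); // left + right, or top + bottom
  return {
    // What is drawn: content + padding + border
    visibleWidth: width + extra,
    visibleHeight: height + extra,
    // What is reserved in the layout: the visible box plus its margins
    // (vertical margins of adjacent blocks may collapse; ignored here)
    totalWidth: width + extra + 2 * margin,
    totalHeight: height + extra + 2 * margin,
  };
}

// A 300x100 box with 10px padding, a 1px border and 20px margin:
const size = boxSize({ width: 300, height: 100, padding: 10, border: 1, margin: 20 });
// size.visibleWidth is 322, size.totalWidth is 362
```

Note that IE6 in quirks mode (that is, without a Doctype) counts padding and border as part of the width instead, which is one more reason to keep these numbers straight.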

Face the truth – learn to program

If you’re going to make any decent web site, you will have to learn how to program. And I’m not talking about plain ol’ HTML, I’m talking about at least Javascript, and preferably a proper server-side language such as PHP or ASP. Javascript allows you to manipulate your site dynamically to make your site a lot more interactive. For instance, you can use it to show a date-picker for an input field, validate form input without going through the server (though for security reasons you should ALWAYS check the data on the server as well), update the page behind the scenes without the user having to refresh the page (AJAX) or make elements on your page fly all over the place. All this power, however, becomes nothing but a fun topping when you consider a server-side language.

Where Javascript deals with the page the user is seeing, server-side languages allow you to store data the user submits, add dynamic content to your site and track user sessions (i.e. let users log in and have their own preferences and personalized pages). Dynamic content can be anything from a simple “Quote of the day” to letting users add articles, listing recently added articles and showing all articles using the same HTML, with the scripting language filling in the content.

Personally, I prefer PHP to ASP, but this is completely up to you – Just learn one because you will have to!

Use a JavaScript library

Javascript is a wonderful thing, but it is also quite awkward. In order to do even simple animations, you need to write a lot of code. In addition, Javascript lacks the broad set of tools that often comes with larger, self-contained languages. There are many libraries out there that attempt to “fix” Javascript in one way or another. Some simply make it easier to manipulate the DOM (Document Object Model) and do animations (jQuery is a typical, and extremely popular, library that aims to do this), whilst others go all the way and manipulate Javascript’s native methods and objects to make them more powerful, more usable and more flexible. A good example of the latter is the Javascript framework/library MooTools. For larger, more complex Javascript-enabled projects, such a framework is often preferred over the lightweight jQuery equivalents. Pick the tool for the task at hand.

Know your clients

When developing a website, you need to know what browsers and resolutions you are developing for. If you are making a site for a design bureau, you can usually assume that they will have high resolution screens of at least 1280×1024 or 1600×1200, and you should design your site accordingly. In such cases a fluid design layout might be worth considering to allow your users to utilize the full resolution of their screens. More commonly though, you will be designing for the majority of users, and the majority do not have resolutions of that scale. Too many users still use a 1024×768 resolution, or even smaller, and consequently we have to take this into consideration. Usually, making a page between 900px and 950px wide makes the site viewable for most users, and at the same time forces you to avoid extraneous information.

Also, determine whether you should support IE6 or not. This is a major issue, as supporting IE6 requires a lot more work than not supporting it because of its blatant disregard for standards and, at times, ridiculous implementations of them.

A word of caution: If a designer hands you a design with a page-width wider than 950px, don’t just scale it down and get on your way! This will not only distort the design, but it will also make all the text smaller, which you should avoid at all costs. Make sure all text on your site is easily readable, and try to keep to a maximum of 15 words per line. Anything longer makes the text hard to read, especially on high resolutions!

A second word of caution: Not everyone has Javascript enabled! Make sure you either provide a usable scaled-down version of your site without JS, or that you warn the user that your site requires Javascript. Don’t just leave it up to them to find that nothing works.

Prototype, then validate

If you’re told to make a design, make a quick and dirty mock-up. That is not to say it should not look like the design, but that you shouldn’t care too much whether your prototype validates or uses the correct tags. Those kinds of things can be changed later if the design is approved. Designers change their mind all the time, and you don’t want to spend a lot of time on something they are going to change or remove entirely at the next signpost.

When you’re prototyping, however, do try to make the site look similar in all target browsers. If you make it work in only one browser and OK the design, you might find, when told to build the site to the design you OK’d, that what you did in that one browser cannot be done in another due to different implementations of the standard. Then what are you going to do? You OK’d the design, remember?

Follow standards, but not blindly

Standards are a good thing – no question about it. The problem is that, at times, they can be a bit too strict, especially when you are trying to make your pages render correctly cross-browser.

For instance, IE6 only supports the CSS :hover pseudo-class on <a> elements, and as such, you may be forced to put divs, or other block-level elements, inside an <a> tag, which will not validate. It will seldom create any problem in any other browser, so theoretically you could just ignore the validation warnings. The problem is that all too many people are concerned about whether their code validates or not. Bottom line is, if it works, and you’re confident that it works and will continue to work cross-browser, the warning is really not that important.

That said, if you can follow the standard and validate your site, do so. It is a form of guarantee that your site design should not be broken by future browsers that might have different rendering engines or follow newer standards. Often, when your code doesn’t validate, it is quite simple to fix the issue. The fix also tends to come at the expense of a degraded experience to IE6 users…. Oh well, that’s sad isn’t it?

Plan for security, but delay the implementation

When developing web applications, security should be a major concern. Unfortunately, it is often completely overlooked, or applied seemingly haphazardly; “Oh, I think I’ll just put in a strip_tags here and we’ll be good to go!” Not thinking about, and planning for, security can cost you very dearly in the end.

On the other hand, security takes a lot of time and effort to implement, and in the initial stages of web development – when prototyping a new concept or design – you don’t really want to bother with that kind of thing. The danger is that once the prototype is complete, you decide to skip the extra work and simply continue work on the prototype with all its hacks, shortcuts and security holes. Don’t do it! Instead, plan your security measures thoroughly from the beginning. Do not simply say: “We’ll run strip_tags on all output and addslashes on all input”, but set out an abstraction layer which allows you to secure all input and output, no matter where it’s going. Decide on security policies and content rules beforehand, instead of patching your old code when someone breaks your system.

This might seem like a lot of work, and it is! It is, however, also very necessary to prevent all sorts of nasty security breaches. The upside of course is that you do not have to implement these security measures when prototyping. In fact, you SHOULDN’T implement them at this stage. Prototyping is about coming up with something workable quickly to see if it works as intended and according to plan. This does not require excessive security precautions. Just remember to put them in when you begin developing the real system.
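
A minimal sketch of what the abstraction layer described above could look like, in Javascript rather than PHP to keep the examples in one language. The names escapeHtml and render are hypothetical; the point is only that every value passes through one escaping function on its way into the page, instead of ad-hoc calls scattered around the code.

```javascript
// One place that knows how to make a value safe for HTML output.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Templates use {name} placeholders; every substituted value is escaped.
// Unknown placeholders are left untouched.
function render(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? escapeHtml(values[key])
      : match
  );
}

const html = render("<p>Hello, {name}!</p>", {
  name: '<script>alert("x")</script>',
});
// html is '<p>Hello, &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;!</p>'
```

This only covers output into HTML text and attribute values; other destinations (database queries, URLs, Javascript strings) each need their own rule in the same layer.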

Use as few images as possible and merge those you can

Not everyone has a fast internet connection. In fact, there are still those out there browsing the web on a 56kbit modem, and although they are a minority, it should tell us that the excessive use of images and other external media is not exactly taking care of your users. Rather the contrary. That is not to say that you should never use images on a site; in fact, images are usually essential to a visually appealing website. The danger is excessive use of media.

A common misconception about external media on web pages is that as long as one uses small images, everything is fine. The truth is that the fewer files the browser has to fetch, the better. Every new request to the server comes with overhead and delay waiting for the server to send its response. Where you can, merge together your images and use CSS sprites to display only the part you wish to show. This saves you the per-request overhead and often improves the overall compression of your images.
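
To illustrate, here is a small sketch that computes the CSS needed to show one icon out of a sprite sheet. It assumes a hypothetical image icons.png in which equally sized icons are laid out left to right in a single row; the function name spriteStyle is likewise invented.

```javascript
// Build the CSS declarations for the icon at position `index` (0-based)
// in a one-row sprite sheet of equally sized icons.
function spriteStyle(index, iconWidth, iconHeight, sheetUrl) {
  // Shift the sheet left so the wanted icon lines up with the element.
  const offsetX = index === 0 ? 0 : -index * iconWidth;
  return [
    `width: ${iconWidth}px`,
    `height: ${iconHeight}px`,
    `background: url(${sheetUrl}) ${offsetX}px 0 no-repeat`,
  ].join("; ");
}

const css = spriteStyle(2, 16, 16, "icons.png");
// css is "width: 16px; height: 16px; background: url(icons.png) -32px 0 no-repeat"
```

The element acts as a 16×16 window onto the sheet, and the negative horizontal offset decides which icon shows through it.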

Don’t make a mess – modularize

When you develop websites, it is all too easy to put everything in one file. When prototyping, this makes your development speed much higher, but as your system grows it will soon become slow and unmanageable. Instead, try to split your system into logically separated units. A good place to start is to implement an MVC-oriented framework for your site.
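
As a rough illustration of that split, here is a toy example in Javascript. It is not a framework, and all names in it (ArticleModel, renderArticleList, ArticleController) are invented: the model only stores data, the view only turns data into markup, and the controller connects the two.

```javascript
// Model: owns the data, knows nothing about HTML.
class ArticleModel {
  constructor() {
    this.articles = [];
  }
  add(title) {
    this.articles.push({ id: this.articles.length + 1, title });
  }
  all() {
    return this.articles.slice();
  }
}

// View: turns data into markup, knows nothing about storage.
// (Titles are assumed to be safe here; real output should be escaped.)
function renderArticleList(articles) {
  const items = articles.map((a) => `<li>${a.title}</li>`).join("");
  return `<ul>${items}</ul>`;
}

// Controller: handles a request by combining model and view.
class ArticleController {
  constructor(model) {
    this.model = model;
  }
  create(title) {
    this.model.add(title);
    return this.list();
  }
  list() {
    return renderArticleList(this.model.all());
  }
}

const controller = new ArticleController(new ArticleModel());
controller.create("First post");
const page = controller.create("Second post");
// page is "<ul><li>First post</li><li>Second post</li></ul>"
```

Each piece can now be changed or tested on its own: swapping the list for a table touches only the view, and moving the articles into a database touches only the model.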

Avoid the lazy fallbacks

If you can’t understand how to do something the first time around, don’t fall back to the easy solutions.

  • Tables are for tabular data
  • Absolute positioning is for overlay windows
  • Frames are an absolute no-no

Instead, try to learn something new, and ask someone who knows more about the problem than you do to help you out.

When asking for help

If you are not already a programmer, do not make the mistake, as many do, of demanding answers or asking for a piece of code instead of advice. There are plenty of skillful people out there willing to help, but they will not help you if:

  • You do not ask nicely
  • You do not provide source code and other relevant material for them to examine for problems
  • You ask for the complete solution
    • Rather than saying “I need a script that does x”, try to make one that does it yourself, and then ask: “I’m trying to do x, and have come up with y. I seem to be stuck because of z. Has anyone got a suggestion about how I might accomplish this?” where y is your source code and z is your problem.

Essential hacks and gotchas:

In the world of web development, there are some gotchas that are very very common, but are not too often explained. I will try to bring up some of those here:

The “overflow: hidden” fix:

Say you have the following code:

<div>
<div style="float: left; height:200px;"></div>
<div style="float:right; height: 300px;"></div>
</div>

How tall would you guess the parent div to be? 200? 300?

It will in fact be 0px tall. The reason is that the browser does not count floated elements when calculating the height of their parent. This “bug” becomes very apparent if you, for instance, try to position something absolutely a certain distance from the bottom within the parent div, because that element would really be positioned from the bottom of the tallest NON-FLOAT element within the parent.

There are several fixes for this problem, but the most common (and cleanest) method is simply to put the style “overflow: hidden” or “overflow: auto” on the parent. Any overflow value other than “visible” makes the parent establish a new block formatting context, so the browser takes the floating children into account and correctly sets the height of the parent.

Absolute positioning

Consider the following code:

<div style="margin: 50px;">
<div style="position: absolute; top:50px; left: 50px;"></div>
</div>

Where would you say the child div would be positioned in relation to the browser viewport? At (100, 100)? You would be wrong. The correct answer is at (50,50). The reason for this is that absolute positioning is considered relative to the first parent element that has been given a “non-flow” position. That is, any element that has its “position” style set to something other than “static”. If no such parent exists, it is positioned in relation to the viewport. A quick fix is simply to set the style declaration “position: relative” on the parent you want to absolutely position the child in relation to. Because relative positioning does in fact not move the element at all unless left and top are specified, the rest of the page is not affected, and we can get on with our work.

Do note, however, that doing this means that ALL absolutely positioned children of the element that receives “position: relative” will be positioned relative to that element. Thus, you cannot have two child elements where one should be relative to the viewport and one relative to the parent.
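
The lookup rule can be written out as a few lines of Javascript. This is a simplified model, not how any browser is implemented: elements are plain objects with a hypothetical parent link, a position value and their own page coordinates (x, y), and borders and scrolling are ignored.

```javascript
// Find the element an absolutely positioned node is placed relative to:
// the nearest ancestor whose position is not "static", or null for the
// viewport (strictly, the initial containing block).
function containingBlock(node) {
  let ancestor = node.parent;
  while (ancestor && (ancestor.position || "static") === "static") {
    ancestor = ancestor.parent;
  }
  return ancestor || null;
}

// Resolve top/left into page coordinates under the simplified model.
function resolvePosition(node) {
  const block = containingBlock(node);
  const origin = block ? { x: block.x, y: block.y } : { x: 0, y: 0 };
  return { x: origin.x + node.left, y: origin.y + node.top };
}

// In both cases the parent sits at (50, 50) because of its margin.
const staticParent = { parent: null, position: "static", x: 50, y: 50 };
const relativeParent = { parent: null, position: "relative", x: 50, y: 50 };

const insideStatic = resolvePosition({ parent: staticParent, top: 50, left: 50 });
const insideRelative = resolvePosition({ parent: relativeParent, top: 50, left: 50 });
// insideStatic is { x: 50, y: 50 }, insideRelative is { x: 100, y: 100 }
```

With a static parent the child ends up at (50, 50) measured from the viewport; giving the parent “position: relative” moves the origin to the parent, so the same top and left values land at (100, 100).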

Full height column backgrounds

A common design layout is one which has multiple vertical columns (often two or three) next to each other. In the designs, these columns are almost always the same height, with the background color extending to the very bottom of the elements. The problem is that whilst this is very much possible in Photoshop, in HTML an element is only as large as its content or as large as you specify it to be. And unfortunately, CSS does not allow you to specify an element to be as tall as another element. Therefore, if you have multiple columns, and at least one of them has content that will vary in height, you will inevitably end up with the columns being different heights. If you then set a background color on each of them, you will notice that the background color is only drawn on the element, thereby making it very obvious that the columns are not the same height.

So, how do you make the columns appear to both be as high as the tallest column? Faux columns. In essence, this approach depends on the parent being as tall as the tallest of the children – which it will be unless we’re floating the columns; in which case we can apply the “overflow: hidden” trick. A background image containing all column backgrounds is then applied to the parent and repeated down its entire height. Read the link for full implementation details.

Working with IE

Let’s face it – IE makes our lives terrible. With IE8, Microsoft is starting to get it right, but IE7 and particularly IE6 get a lot of things wrong. Luckily, the fact that they don’t follow the standards also gives us several methods of giving specifically tailored instructions to IE. There are two ways of approaching IE specific hacks – the quick, simple way, and the longer but better way:

The quick fix

Targeting IE6 in CSS: Prefix your selector with “* html ”, as in “* html #menu { width: 200px }” (where #menu is just an example selector)

Targeting IE7 and 6 in CSS: Prefix the property with a star (no space between the star and the property), as in “*width: 200px” – Note that this hack will invalidate your CSS!

The good fix

Use the Internet Explorer only Conditional Comments

Jumping pages

Okay, so you’ve made your site perfect. It is centered, looks beautiful and navigation is smooth as a kitten’s hair. Just one more page to test – the “About us” page with lots of text… As you click the link, your entire page jumps slightly to the left. You try to figure out why, and notice that you now have a scrollbar which means the center of the page has moved, and so, dutifully following your CSS, the browser has moved your page to the new center.

The solution to this annoying problem is to make sure that the scrollbars are always present, but are grayed out when there is nothing below the fold. And how do you do that? Like this:

html {
	height: 100%;
}
body {
	min-height: 100.1%;
	overflow-y: auto;
}

Centering

CSS has many ways of centering, to the confusion and irritation of most developers:

text-align: center – Set on a parent element, this centers its inline content horizontally

vertical-align: middle – Should be applied to an inline element, and sometimes centers the element vertically if you’re lucky (press “sometimes” for more details)

margin: 0 auto – Centers a block-level element by setting its left and right margins to the same value. Here, IE of course has to mess up the beauty, and requires the parent to have “text-align: center” in order for it to work. Remember to reset the text-align to left inside the element!

position: absolute – This one requires a bit more explanation. The idea here is that you position an element in the center of another by moving it 50% from the top and 50% from the left, and then move it back by half its dimensions using a negative margin. For example – if you wanted to center an element that is 400×200px within its parent, you would first of all set “position: relative” or similar on the parent (see the Absolute positioning headline further up in this post), and then you would set the following styles on the element itself:

position: absolute;
/* Center vertically: pull up by half the 200px height */
top: 50%;
margin-top: -100px;
/* Center horizontally: pull left by half the 400px width */
left: 50%;
margin-left: -200px;

If this does not make sense, read it again.
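
The negative margins are simply half the element's dimensions. A small sketch of the calculation, with the function name centeringStyle made up for the occasion:

```javascript
// CSS for centering a fixed-size element inside a positioned parent.
function centeringStyle(width, height) {
  return [
    "position: absolute",
    "top: 50%",
    `margin-top: ${-height / 2}px`, // pull up by half the height
    "left: 50%",
    `margin-left: ${-width / 2}px`, // pull left by half the width
  ].join("; ");
}

const style = centeringStyle(400, 200);
// style is "position: absolute; top: 50%; margin-top: -100px; left: 50%; margin-left: -200px"
```

For the 400×200px element this gives a top margin of -100px and a left margin of -200px. The technique only works when the element's size is known in advance.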

display: table-cell – This quite new method relies on CSS’s ability to render any element as if it were a table cell, and then uses the vertical alignment property of a table cell to center the content. It is especially good for centering text! View it here.

Use a Doctype

A Doctype is not something one puts in simply to make the page validate; it actually has an effect as well. Some browsers determine whether to render the page according to the standard or not based on the presence of a Doctype. Therefore, put it in!

Final words

This non-exhaustive list contains my experiences from web development so far, and will probably be expanded upon by both me and hopefully some of you (use the comment field below!). Use it for what it’s worth, and avoid the pitfalls that are all too easy to fall into when one is new in the field. If you have any questions or comments regarding these notes, feel free to post your comment below, and I’ll do my best to give you a proper response!

Happy coding!

Recovering data from a Mac drive from Linux

The Mac of a friend of mine crashed the other day – complete hard drive failure. He turned it in to the Apple store, and they decided to give him a new disk because they said they couldn’t recover the old one. Somehow, he managed to talk them into leaving him with the old drive so he could try to get at least some of his data back. My friend then came to me, and asked me to have a look at what I could find.

He had already tried to hook the drive into another Mac, but it just froze every time he tried to enter a folder on the drive. The same happened when I used my SATA docking station, and attempted to either access or repair the drive through a piece of software called MacDrive. It seemed as though all hope was lost.

I decided to give it another shot from Linux, just to see if I could somehow avoid the corrupt files, and copy only those that could be properly read. Turns out, Linux didn’t even break a sweat when seeing the corrupted hard drive. All I had to do ( using Arch Linux that is ) was this:

  1. Attach drive to my SATA dock
  2. Mount the drive: “mount /dev/sdg2 /mnt/ext”
    • This assumes the external drive is sdg, check “fdisk -l” to find attached drives
    • Mac-formatted drives have at least two partitions ( one is the boot partition ). Therefore, you must mount partition two ( or whichever is formatted as HFS+ )
    • Because Apple decided to use journaling on their HFS+ drives, these are not writable from Linux. Either live with it, or insert the drive into a Mac, open up the terminal ( usually Applications/Utilities/Terminal ) and run “diskutil disableJournal /dev/disk#” where # is the drive number. Find the drive number by running “diskutil list”. For more info, see: http://castyour.net/node/40
  3. Run “cd /mnt/ext/”
  4. Navigate to whatever folder you want to copy and run “cp -R <folder> <destination>”
  5. The cp command will then dutifully copy all the files it can read properly into your destination folder, and tell you if it can’t read a file. It will automatically skip them.
  6. Your files are saved!

Written by Jon Gjengset

November 15, 2009 at 19:29