Chapter 6. Going Offline

If you know HTML, CSS, and JavaScript, you already have the tools you need to develop Android applications. This hands-on book shows you how to use these open source web standards to design and build apps that can be adapted for any Android device -- without having to use Java.

There’s a feature of HTML5 called the offline application cache that allows users to run web apps even when they are not connected to the internet. It works like this: when a user navigates to your web app, the browser downloads and stores all the files it needs to display the page (HTML, CSS, JavaScript, images, etc.). The next time the user navigates to your web app, the browser will recognize the URL and serve the files out of the local application cache instead of pulling them across the network.

The Basics of the Offline Application Cache

The main component of the offline application cache is a cache manifest file that you host on your web server. I’m going to use a simple example to explain the concepts involved, and then I’ll show you how to apply what you’ve learned to the Kilo example we’ve been working on.

A manifest file is just a simple text document that lives on your web server and is sent to the user’s device with a content type of text/cache-manifest. The manifest contains a list of files that a user’s device must download and save in order for the app to function offline. Consider a web directory containing the following files:

index.html
logo.jpg
scripts/demo.js
styles/screen.css
        

In this case, index.html is the page that users will load when they visit your application in the browser. The other files are referenced from within index.html. To make everything available offline, create a file named demo.manifest in the same directory as index.html. Here’s a directory listing showing the added file:

demo.manifest
index.html
logo.jpg
scripts/demo.js
styles/screen.css
        

Next, add the following lines to demo.manifest:

CACHE MANIFEST
index.html
logo.jpg
scripts/demo.js
styles/screen.css
        

The paths in the manifest are relative to the location of the manifest file. You can also use absolute URLs like so (don’t bother creating this just yet; you’ll see how to apply this to your app shortly):

CACHE MANIFEST
http://www.example.com/index.html
http://www.example.com/logo.jpg
http://www.example.com/scripts/demo.js
http://www.example.com/styles/screen.css
        

Now that the manifest file is created, you need to link to it by adding a manifest attribute to the HTML tag inside index.html:

<html manifest="demo.manifest">

You must serve the manifest file with the text/cache-manifest content type or the browser will not recognize it. If you are using the Apache web server or a compatible web server, you can accomplish this by adding an .htaccess file to your web directory with the following line:

AddType text/cache-manifest .manifest

Note

If the .htaccess file doesn’t work for you, please refer to the portion of your web server documentation that pertains to MIME types. You must associate the file extension .manifest with the MIME type of text/cache-manifest. If your web site is hosted by a web hosting provider, your provider may have a control panel for your web site where you can add the appropriate MIME type. I’ll also show you an example that uses a PHP script in place of the .htaccess file a little later on in this chapter (because PHP can set the MIME type in code, you won’t need to configure the web server to do that).

Our offline application cache is now in working order. The next time a user browses to http://example.com/index.html, the page and its resources will load normally over the network (you’d replace example.com/index.html with the URL of your web app). In the background, all the files listed in the manifest will be downloaded locally. Once the download completes and he refreshes the page, he’ll be accessing the local files only. He can now disconnect from the internet and continue to access the web app.

Now that the user is accessing our files locally on his device, we have a new problem: how does he get updates when changes are made to the web site?

When the user does have access to the internet and navigates to the URL of your web app, his browser checks the manifest file on our site to see if it still matches the local copy. If the remote manifest has changed, the browser downloads all the files listed in it. It downloads these in the background to a temporary cache.

Note

The comparison between the local manifest and the remote manifest is a byte-by-byte comparison of the file contents (including comments and blank lines). The file modification timestamp or changes to any of the resources themselves are irrelevant when determining whether or not changes have been made.
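To make the comparison rule concrete, here is a minimal JavaScript sketch. The function name manifestHasChanged is hypothetical and only models the check described above as a plain string comparison; it is not a browser API.

```javascript
// Hypothetical helper: models the browser's "has the manifest changed?" check
// as a plain comparison of the cached and freshly fetched manifest text.
function manifestHasChanged(cachedText, fetchedText) {
    return cachedText !== fetchedText;
}

var cached = 'CACHE MANIFEST\n# version 1\nindex.html\n';
var commentEdited = 'CACHE MANIFEST\n# version 2\nindex.html\n';

// Editing only a comment still counts as a change...
console.log(manifestHasChanged(cached, commentEdited)); // true
// ...while identical text does not, even if index.html itself was edited.
console.log(manifestHasChanged(cached, cached)); // false
```

This is why bumping a version comment in the manifest is a common way to force clients to re-download an app whose file list has not changed.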

If something goes wrong during the download (e.g., the user loses his internet connection), then the partially downloaded temporary cache is automatically discarded and the previous one remains in effect. If the download is successful, the new local files will be used the next time the user launches the app.

Note

Remember that when a manifest is updated, the download of the new files takes place in the background after the initial launch of the app. This means that even after the download completes, the user will still be working with the old files. In other words, the currently loaded page and all of its related files don’t auto-magically reload when the download completes. The new files that were downloaded in the background will not become visible until the user relaunches the app.

This is very similar to standard desktop app update behavior. You launch an app, it tells you that updates are available, you click download updates, the download completes, and you are prompted to relaunch the app for the updates to take effect.

If you wanted to implement this sort of behavior in your app, you would listen for the updateready event of the window.applicationCache object, as described in the section called “The JavaScript Console”, and notify the user however you like.
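As a rough sketch of that idea, the following JavaScript wires a notification to the updateready event. The function name notifyWhenUpdateReady and the message text are hypothetical; the sketch assumes it is handed an object that behaves like window.applicationCache, in a browser that still exposes it.

```javascript
// Hypothetical helper: calls notifyUser once a new version of the app
// has finished downloading. appCache is expected to behave like
// window.applicationCache (an event target with a swapCache method).
function notifyWhenUpdateReady(appCache, notifyUser) {
    appCache.addEventListener('updateready', function () {
        // Switch to the newly downloaded cache; the page already on screen
        // keeps its old files until the user reloads.
        appCache.swapCache();
        notifyUser('A new version is available. Reload to use it.');
    }, false);
}

// In a browser that supports the application cache, you might call:
// notifyWhenUpdateReady(window.applicationCache, function (msg) { alert(msg); });
```

Passing the cache object in as a parameter, rather than reaching for the global directly, keeps the function easy to try out with a stand-in object.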

Online Whitelist and Fallback Options

It is possible to force the browser to always access certain resources over the network (this process is known as whitelisting). This means that the browser will not cache them locally, and that they will not be available when the user is offline. To specify a resource as online only, you use the NETWORK: keyword (the trailing : is essential) in the manifest file like so:

CACHE MANIFEST
index.html
scripts/demo.js
styles/screen.css

NETWORK:
logo.jpg
        

Here, I’ve whitelisted logo.jpg by moving it into the NETWORK section of the manifest file. When the user is offline, the image will show up as a broken image link (Figure 6.1, “Whitelisted images will show up as broken links when the user is offline.”). When he is online, it will appear normally (Figure 6.2, “Whitelisted images will show up normally when the user is online.”).

Figure 6.1. Whitelisted images will show up as broken links when the user is offline.


Figure 6.2. Whitelisted images will show up normally when the user is online.


If you don’t want offline users to see the broken image, you can use the FALLBACK keyword to specify a fallback resource like so:

CACHE MANIFEST
index.html
scripts/demo.js
styles/screen.css

FALLBACK:
logo.jpg offline.jpg
        

Now, when the user is offline, he’ll see offline.jpg (Figure 6.3, “Fallback images will show up when the user is offline.”) and when he’s online, he’ll see logo.jpg (Figure 6.4, “Hosted images will show up normally when the user is online.”).

Note

It's worth noting that you don't have to additionally list offline.jpg in the CACHE MANIFEST section. It will automatically be stored locally by virtue of being listed in the FALLBACK section of the manifest.

Figure 6.3. Fallback images will show up when the user is offline.


Figure 6.4. Hosted images will show up normally when the user is online.


This becomes even more useful when you consider that you can specify a single fallback for multiple resources by using a partial path. Let’s say I add an images directory to my web site and put some files in it:

/demo.manifest
/index.html
/images/logo.jpg
/images/logo2.jpg
/images/offline.jpg
/scripts/demo.js
/styles/screen.css
        

I can now tell the browser to fall back to offline.jpg for anything contained in the images directory like so:

CACHE MANIFEST
index.html
scripts/demo.js
styles/screen.css

FALLBACK:
images/ images/offline.jpg
        

Now, when the user is offline, he’ll see offline.jpg (Figure 6.5, “A single fallback image will show up in place of multiple images when the user is offline.”) and when he’s online, he’ll see logo.jpg and logo2.jpg (Figure 6.6, “Hosted images will show up normally when the user is online.”).
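To illustrate how a partial path stands in for several resources, here is a small JavaScript model of the matching. Everything in it (the fallbacks array, the findFallback function) is hypothetical and only mimics the behavior described above; the longest-prefix rule for overlapping entries is my reading of the spec, so treat it as an assumption.

```javascript
// Hypothetical model of FALLBACK matching: each entry maps a URL prefix
// to the cached resource served in its place when the user is offline.
var fallbacks = [
    { prefix: 'images/', fallback: 'images/offline.jpg' }
];

// Returns the fallback for a URL, or null if no prefix matches.
// When several prefixes match, the longest one is used.
function findFallback(url, entries) {
    var best = null;
    entries.forEach(function (entry) {
        if (url.indexOf(entry.prefix) === 0 &&
                (best === null || entry.prefix.length > best.prefix.length)) {
            best = entry;
        }
    });
    return best ? best.fallback : null;
}

console.log(findFallback('images/logo.jpg', fallbacks));  // images/offline.jpg
console.log(findFallback('images/logo2.jpg', fallbacks)); // images/offline.jpg
console.log(findFallback('scripts/demo.js', fallbacks));  // null
```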

Figure 6.5. A single fallback image will show up in place of multiple images when the user is offline.


Figure 6.6. Hosted images will show up normally when the user is online.


Whether you should add resources to the NETWORK or FALLBACK sections of the manifest file depends on the nature of your application. Keep in mind that the offline application cache is primarily intended to store apps locally on a device. It’s not really meant to be used to decrease server load, increase performance, etc.

In most cases you should be listing all of the files required to run your app in the manifest file. If you have a lot of dynamic content and you are not sure how to reference it in the manifest, your app is probably not a good fit for the offline application cache and you might want to consider a different approach (a client-side database, perhaps).
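If it helps to see the sections side by side, the following JavaScript sketch splits a manifest into its parts. The parseManifest function is hypothetical and far simpler than what a browser does. It also recognizes a CACHE: header, which the spec defines for returning to the default section but which the examples in this chapter do not use.

```javascript
// Hypothetical parser: splits manifest text into its three sections.
// Entries that appear before any section header belong to CACHE.
function parseManifest(text) {
    var sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
    var current = 'CACHE';
    var lines = text.split(/\r?\n/);
    if (lines[0].trim() !== 'CACHE MANIFEST') {
        throw new Error('First line must be CACHE MANIFEST');
    }
    lines.slice(1).forEach(function (raw) {
        var line = raw.trim();
        if (line === '' || line.charAt(0) === '#') {
            return; // skip blank lines and comments
        }
        var header = /^(CACHE|NETWORK|FALLBACK):$/.exec(line);
        if (header) {
            current = header[1]; // the trailing colon marks a section header
            return;
        }
        sections[current].push(line);
    });
    return sections;
}

var demo = parseManifest([
    'CACHE MANIFEST',
    'index.html',
    'scripts/demo.js',
    '',
    'NETWORK:',
    'logo.jpg',
    '',
    'FALLBACK:',
    'images/ images/offline.jpg'
].join('\n'));

console.log(demo.CACHE.join(', '));    // index.html, scripts/demo.js
console.log(demo.NETWORK.join(', '));  // logo.jpg
console.log(demo.FALLBACK.join(', ')); // images/ images/offline.jpg
```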

Creating a Dynamic Manifest File

Now that we’re comfortable with how the offline app cache works, let’s apply it to the Kilo example we’ve been working on. Kilo consists of quite a few files and manually listing them all in a manifest file would be a pain. Plus, a single typo would invalidate the entire manifest file and prevent the application from working offline.

To address this issue, we’re going to write a little PHP file that reads the contents of the application directory (and sub directories) and creates the file list for us. Create a new file in your Kilo directory named manifest.php and add the following code:

<?php
  header('Content-Type: text/cache-manifest'); // (1)
  echo "CACHE MANIFEST\n"; // (2)

  $dir = new RecursiveDirectoryIterator("."); // (3)
  foreach (new RecursiveIteratorIterator($dir) as $file) { // (4)
    if ($file->isFile() && // (5)
        $file != "./manifest.php" &&
        strpos($file, '/.') === false &&
        substr($file->getFilename(), 0, 1) != ".") {
      echo $file . "\n"; // (6)
    }
  }
?>

1

I’m using the PHP header function to output this file with the text/cache-manifest content type. Doing this is an alternative to using an .htaccess file to specify the content type for the manifest file. In fact, you can remove the .htaccess file you created in the section called “The Basics of the Offline Application Cache”, if you are not using it for any other purpose.

2

As I mentioned earlier in this chapter, the first line of a cache manifest file must be CACHE MANIFEST. As far as the browser is concerned, this is the first line of the document; the PHP file runs on the web server, and the browser only sees the output of commands that emit text, such as echo.

3

This line creates an object called $dir, which enumerates all the files in the current directory. It does so recursively, which means that if you have any files in subdirectories, it will find them, too.

4

Each time the program passes through this loop, it sets the variable $file to an object that represents one of the files in the current directory. In English, this line would be “Each time through, set the file variable to the next file found in the current directory or its subdirectories.”

5

The if statement here checks to make sure that the file is actually a file (and not a directory or symbolic link). It also ignores the file named manifest.php, any file whose name starts with a . (such as .htaccess), and any file contained in a directory whose name begins with a . (such as .svn).

Note

The leading ./ is part of the file’s full path; the . refers to the current directory and the / separates elements of the file’s path. So there’s always a ./ that appears before the filename in the output. However, when I check for a leading . in the filename I use the getFilename function, which returns the filename without the leading path. This way, I can detect files beginning with a . even if they are buried in a subdirectory.

6

Here’s where I display each file’s name.
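If the PHP condition in callout 5 is hard to read, the same filter can be written as a small standalone function. This JavaScript version is only an illustration; the name shouldList is hypothetical, and it leaves out the isFile check because it works on path strings rather than real files.

```javascript
// Hypothetical JavaScript version of the PHP filter, for illustration only.
// path is a relative path as the iterator reports it, such as './kilo.js'.
function shouldList(path) {
    var filename = path.substring(path.lastIndexOf('/') + 1);
    return path !== './manifest.php' &&   // never list the manifest itself
        path.indexOf('/.') === -1 &&      // skip anything inside a dot-directory
        filename.charAt(0) !== '.';       // skip dot-files such as .htaccess
}

console.log(shouldList('./kilo.js'));             // true
console.log(shouldList('./manifest.php'));        // false
console.log(shouldList('./.htaccess'));           // false
console.log(shouldList('./themes/.svn/entries')); // false
```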

To the browser, manifest.php will look like this:

CACHE MANIFEST
./index.html
./jqtouch/jqtouch.css
./jqtouch/jqtouch.js
./jqtouch/jqtouch.transitions.js
./jqtouch/jquery.js
./kilo.css
./kilo.js
./themes/apple/img/backButton.png
./themes/apple/img/blueButton.png
./themes/apple/img/cancel.png
./themes/apple/img/chevron.png
./themes/apple/img/grayButton.png
./themes/apple/img/listArrowSel.png
./themes/apple/img/listGroup.png
./themes/apple/img/loading.gif
./themes/apple/img/on_off.png
./themes/apple/img/pinstripes.png
./themes/apple/img/selection.png
./themes/apple/img/thumb.png
./themes/apple/img/toggle.png
./themes/apple/img/toggleOn.png
./themes/apple/img/toolbar.png
./themes/apple/img/toolButton.png
./themes/apple/img/whiteButton.png
./themes/apple/theme.css
./themes/jqt/img/back_button.png
./themes/jqt/img/back_button_clicked.png
./themes/jqt/img/button.png
./themes/jqt/img/button_clicked.png
./themes/jqt/img/chevron.png
./themes/jqt/img/chevron_circle.png
./themes/jqt/img/grayButton.png
./themes/jqt/img/loading.gif
./themes/jqt/img/on_off.png
./themes/jqt/img/rowhead.png
./themes/jqt/img/toggle.png
./themes/jqt/img/toggleOn.png
./themes/jqt/img/toolbar.png
./themes/jqt/img/whiteButton.png
./themes/jqt/theme.css
        

Note

Try loading the page yourself in a browser (be sure to load it with an HTTP URL such as http://localhost/~YOURUSERNAME/manifest.php). If you see a lot more files in your listing, you may have some extraneous files from the jQTouch distribution. The files LICENSE.txt, README.txt, and sample.htaccess are safe to delete, as are the directories demos and extensions. If you see a number of directories named .svn, you may also safely delete them (unless you have put your working directory under the SVN version control system, in which case these files are important). Files beginning with a . will not be visible in the Mac OS X Finder or Linux File Manager (but you can work with them at the command line).

Now open index.html and add a reference to manifest.php in the opening <html> tag like so:

<html manifest="manifest.php">
        

Now that the manifest is generated dynamically, let’s modify it so that its contents change when any of the files in the directory change (remember that the client will only re-download the application if the manifest’s contents have changed). Here is the modified manifest.php:

<?php
  header('Content-Type: text/cache-manifest');
  echo "CACHE MANIFEST\n";

  $hashes = ""; // (1)

  $dir = new RecursiveDirectoryIterator(".");
  foreach (new RecursiveIteratorIterator($dir) as $file) {
    if ($file->isFile() &&
        $file != "./manifest.php" &&
        strpos($file, '/.') === false &&
        substr($file->getFilename(), 0, 1) != ".")
    {
      echo $file . "\n";
      $hashes .= md5_file($file); // (2)
    }
  }
  echo "# Hash: " . md5($hashes) . "\n"; // (3)
?>
        

1

Here, I’m initializing a string that will hold the hashed values of the files.

2

On this line I’m computing the hash of each file using PHP’s md5_file function (Message-Digest algorithm 5), and appending it to the end of the $hashes string. Any change, however small, to the file will also change the results of the md5_file function. The hash is a 32-character string, such as "4ac3c9c004cac7785fa6b132b4f18efc".

3

Here’s where I take the big string of hashes (all of the 32-character strings for each file concatenated together), and compute an MD5 hash of the string itself. This gives us a short string (32 characters instead of 32 multiplied by the number of files) that’s printed out as a comment (beginning with the comment symbol #).

From the viewpoint of the client browser, there’s nothing special about this line. It’s a comment, and the client browser ignores it. However, if one of the files is modified, this line will change, which means the manifest has changed.

Here’s an example of what the manifest looks like with this change (some of the lines have been truncated for brevity):

CACHE MANIFEST
./index.html
./jqtouch/jqtouch.css
./jqtouch/jqtouch.js
...
./themes/jqt/img/toolbar.png
./themes/jqt/img/whiteButton.png
./themes/jqt/theme.css
# Hash: ddaf5ebda18991c4a9da16c10f4e474a

            

The net result of all of this business is that changing a single character inside of any file in the entire directory tree will insert a new hash string into the manifest. This means that any edits we do to any Kilo files will essentially modify the manifest file, which in turn will trigger a download the next time a user launches the app. Pretty nifty, eh?
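To see the hash trick in isolation, here is a minimal sketch of the same idea in JavaScript for Node.js, using its built-in crypto module. The buildManifest function and the in-memory file contents are hypothetical stand-ins for the directory walk in manifest.php.

```javascript
var crypto = require('crypto');

// Returns the MD5 hex digest of a string.
function md5(data) {
    return crypto.createHash('md5').update(data).digest('hex');
}

// Hypothetical helper: files maps each path to its contents.
// Builds the manifest text, ending with a comment that changes
// whenever the contents of any file change.
function buildManifest(files) {
    var lines = ['CACHE MANIFEST'];
    var hashes = '';
    Object.keys(files).sort().forEach(function (path) {
        lines.push(path);
        hashes += md5(files[path]);
    });
    lines.push('# Hash: ' + md5(hashes));
    return lines.join('\n') + '\n';
}

var before = buildManifest({ './index.html': '<h1>Kilo</h1>', './kilo.js': 'var a = 1;' });
var after = buildManifest({ './index.html': '<h1>Kilo</h1>', './kilo.js': 'var a = 2;' });

// The file lists are identical, but one changed character alters the hash line.
console.log(before === after); // false
```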

Debugging

It can be tough to debug apps that use the offline application cache because there’s very little visibility into what is going on. You find yourself constantly wondering if your files have downloaded, or if you are viewing remote or local resources. Plus, switching your device between online and offline modes is not the snappiest procedure and can really slow down the develop, test, debug cycle.

One thing you can do to help determine what’s going on when things aren’t playing nice is to set up some console logging in JavaScript.

Note

If you want to see what’s happening from the web server’s perspective, you can monitor its log files. For example, if you are running a web server on a Mac or Linux computer, you can open the command line (see Using the Command Line), and run these commands. The $ is the shell prompt, which you should not type:

$ cd /var/log/apache2/
$ tail -f access?log

This will display the web server’s log entries, showing information such as the date and time a document was accessed, as well as the name of the document. When you are done, press Control-C to stop following the log.

The ? on the second line will match any character; on Ubuntu Linux, the filename is access.log and on the Mac it is access_log. If you are on another version of Linux or on Windows, the name of the file and its location may be different.

The JavaScript Console

Adding the following JavaScript to your web apps during development will make your life a lot easier, and can actually help you internalize the process of what is going on. The following script will send feedback to the console and free you from having to constantly refresh the browser window:

Note

You can store this in a .js file such as debug.js and refer to it in your HTML document via the script element’s src attribute, as in <script type="text/javascript" src="debug.js"></script>.

// Convenience array of status values (1)
var cacheStatusValues = [];
cacheStatusValues[0] = 'uncached';
cacheStatusValues[1] = 'idle';
cacheStatusValues[2] = 'checking';
cacheStatusValues[3] = 'downloading';
cacheStatusValues[4] = 'updateready';
cacheStatusValues[5] = 'obsolete';

// Listeners for all possible events (2)
var cache = window.applicationCache;
cache.addEventListener('cached', logEvent, false);
cache.addEventListener('checking', logEvent, false);
cache.addEventListener('downloading', logEvent, false);
cache.addEventListener('error', logEvent, false);
cache.addEventListener('noupdate', logEvent, false);
cache.addEventListener('obsolete', logEvent, false);
cache.addEventListener('progress', logEvent, false);
cache.addEventListener('updateready', logEvent, false);

// Log every event to the console
function logEvent(e) { // (3)
    var online, status, type, message;
    online = (navigator.onLine) ? 'yes' : 'no';
    status = cacheStatusValues[cache.status];
    type = e.type;
    message = 'online: ' + online;
    message += ', event: ' + type;
    message += ', status: ' + status;
    if (type == 'error' && navigator.onLine) {
        message += ' (prolly a syntax error in manifest)';
    }
    console.log(message); // (4)
}

// Swap in newly downloaded files when update is ready
window.applicationCache.addEventListener(
    'updateready',
    function(){
        window.applicationCache.swapCache();
        console.log('swap cache has been called');
    },
    false
);

// Check for manifest changes every 10 seconds
setInterval(function(){cache.update()}, 10000);
            

This might look like a lot of code, but there really isn’t that much going on here:

1

The first 7 lines are just me setting up an array of status values for the application cache object. There are 6 possible values defined by the HTML5 spec, and here I’m mapping their integer values to a short description (for example, status 3 means “downloading”). I’ve included them to make the logging more descriptive down in the logEvent function.

2

On the next chunk of code, I’m setting up an event listener for every possible event defined by the spec. Each one calls the logEvent function.

3

The logEvent function takes the event as input and makes a few simple calculations in order to compose a descriptive log message. Note that if the event type is error and the user is online, then there is probably a syntax error in the remote manifest. Syntax errors are extremely easy to make in the manifest because all of the paths have to be valid. If you rename or move a file but forget to update the manifest, future updates will fail.

Caution

Using a dynamic manifest file helps avoid syntax errors. However, you have to watch out for including a file (such as in a .svn subdirectory) that the server can’t serve up due to permissions. This will make even a dynamic manifest fail, since the file ends up being unreadable.

4

Once I have my message composed, I send it to the console.
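Because a renamed or moved file is such an easy mistake to make (see callout 3), it can be worth checking a hand-written manifest against the files you actually have. The following JavaScript is a rough sketch of such a check. The findMissingEntries function is hypothetical, and it assumes a manifest that contains only plain CACHE entries (FALLBACK lines, which hold two paths each, would be reported as missing).

```javascript
// Hypothetical check: returns the manifest entries that do not appear
// in knownPaths, so renamed or moved files can be caught before deploying.
function findMissingEntries(manifestText, knownPaths) {
    return manifestText.split(/\r?\n/).slice(1).map(function (line) {
        return line.trim();
    }).filter(function (line) {
        // ignore blank lines, comments, and section headers
        return line !== '' &&
            line.charAt(0) !== '#' &&
            !/^(CACHE|NETWORK|FALLBACK):$/.test(line);
    }).filter(function (entry) {
        return knownPaths.indexOf(entry) === -1;
    });
}

var manifest = 'CACHE MANIFEST\nindex.html\nlogo.jpg\nscripts/demo.js\n';
var onDisk = ['index.html', 'logo.png', 'scripts/demo.js'];

// logo.jpg is listed in the manifest but only logo.png exists.
console.log(findMissingEntries(manifest, onDisk).join(', ')); // logo.jpg
```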

You can view the console messages in Chrome by selecting View→Developer→JavaScript Console and then clicking Console if it was not automatically selected.

If you load the web page in your browser and then open the console, you’ll see new messages appear every ten seconds (Figure 6.7, “The console.log() function can be used to send debugging messages to the JavaScript console.”). If you don’t see anything, change the contents of one of the files (or the name of a file) and reload the page in your browser twice. I strongly encourage you to play around with this until you have a feel for what’s going on. You can tinker around with the manifest (e.g., change the contents and save it, rename it, or move it to another directory) and watch the results of your actions pop into the console like magic.

Figure 6.7. The console.log() function can be used to send debugging messages to the JavaScript console.


What You’ve Learned

In this chapter, we’ve learned how to give users access to a web app, even when they have no connection to the internet. With this new addition to our programming toolbox, we now have the ability to create an offline app that is virtually indistinguishable from a native application downloaded from the Android Market.

Of course, a pure web app such as this is still limited by the security constraints that exist for all web apps. For example, a web app can’t access the Address Book, the camera, vibration, or the accelerometer on the phone. In the next chapter, I’ll address these issues and more with the assistance of an open source project called PhoneGap.

Site last updated on: November 17, 2010 at 11:11:58 AM PST