MODx Evolution: Creating a Simple Custom Snippet Part 4 – Optimize with Caching

Part 0: Introduction
Part 1: Write and Test the PHP Script
Part 2: Create the Snippet
Part 3 – MODx API, Placeholders, Chunks

In the fourth part of this brief series on creating a custom MODx Evolution snippet, we’re going to do some optimizing to make our snippet more efficient.  So far the snippet works as expected and pulls in the feeds as needed, but it’s wasteful.  The way things are set up now, every time someone refreshes the page, the script fetches a new copy of the RSS feed from the coding pad, or whatever source you’re pulling your feeds from. On a heavily trafficked site where people visit many pages (and so keep re-running the snippet call in the sidebar), this is resource heavy both on your side, slowing your website down, and on the web server that you’re pulling the feeds from.  We can mitigate this by implementing caching.

Wikipedia defines a cache as:

…a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache. In other words, a cache operates as a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, it can be used in the future by accessing the cached copy rather than re-fetching or recomputing the original data.

There are several ways we could implement caching, and this is a whole PHP topic in and of itself.  Some options that I considered were using the Cache Lite package from the PEAR library or the Zend_Cache class.  However, I am going to do something pretty basic for caching with this snippet, and hopefully explain it in a way that will be clear 🙂  This will be mostly PHP coding with one small visit to the MODx API.

We need a couple of things to implement caching.  We need

  1. a storage location for our cache,
  2. a way to identify and distinguish the items stored in our cache to prevent overwriting,
  3. a cache lifetime.

The first one is easy: I am simply going to use the built-in MODx cache directory.  In your site manager, navigate to Elements->Manage Files.  Click on the assets folder, and in it you should see a folder called cache.

[Image: snippettut4_1]

If I look in this cache folder there are just a few files at the moment

[Image: snippettut4_2]

This is where we’re going to store our cached XML files so that we’re not constantly retrieving the feed contents from the source web server.  Instead, once the feed has been fetched, we want its contents to be stored for a predefined period of time, and only once the cache expires do we fetch the feed afresh.

The statement in our snippet that fetches the feed is this:


$content = file_get_contents($feed_url);

We need to replace this with a conditional statement that says: check the cache location and see whether there’s already a cached file that hasn’t expired. If there is, set the value of $content to the contents of that file.  If there is no cached file or it has expired, fetch the feed from the source site, set that as the value of $content, delete the expired cached file if it exists, and save the retrieved feed into the cache.

OK, so let’s translate this statement into PHP code, shall we?  I have added plenty of comments so that the code makes more sense.


....

//CACHING
//The cached feed contents should only last for a specific time and then expire, so we need to register the current time.
$currentTime = microtime(true);

....

To specify the cache location that we selected, we’ll use the MODx API one more time: $modx->config['base_path'] gives us the base path of the site on the file system, and we append the cache directory to it.

For part 2 of our requirements outlined above, I am going to prefix the cached files with “rff” (for rssfeedFetcher) and then, to separate the feed from each source from the others so that there’s no overwriting, I will actually use the md5 hash of the url as the filename.


....

//We now set a location for the cache file
//We prefix it with rff and use the md5 hash of the url to allow creation of a cache file for each feed we're calling to avoid overwriting

$cache = $modx->config['base_path'] . "assets/cache/rff_" . md5($feed_url);

//First check for an existing version of the file, and then check to see whether or not it has expired.

// I am going to use 2 hours (7200 seconds) as my cache duration
if(file_exists($cache) && filemtime($cache) > (time()-7200)) {

 //If there's a valid cache file, load its data.
 $content = file_get_contents($cache);
} else {
//If there's no valid cache file, grab a live version of the data and save it to a temporary file.
 $content = file_get_contents($feed_url);
 $tempName = tempnam($modx->config['base_path'] . "assets/cache", 'rff');
 file_put_contents($tempName, $content);

 // Once the file is complete, I want to copy it to a permanent file
 //first I have to delete the old cached file if it exists (use @ to suppress error message for the first time the website loads)
 @unlink($cache);
 rename($tempName, $cache);
}

....
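
If you’d like to try the same logic outside of MODx, here is a minimal sketch that wraps it in a standalone function. The names rffCachedFetch and $fetcher are hypothetical and not part of the snippet; the fetcher is passed in as a callable so the example runs without a network connection, and it assumes PHP 5.4 or later. It also adds one thing the snippet above doesn’t do: if the live fetch fails, it falls back to the stale cached copy when one exists.

```php
<?php
// Hypothetical helper (not part of the snippet): returns the feed content
// from the cache when it is still fresh, otherwise fetches a live copy.
// $fetcher is any callable that returns the feed body as a string, or false on failure.
function rffCachedFetch($feedUrl, $cacheDir, $lifetime, callable $fetcher) {
    $cache = rtrim($cacheDir, '/') . '/rff_' . md5($feedUrl);

    // Cache hit: the file exists and is younger than $lifetime seconds.
    if (file_exists($cache) && filemtime($cache) > (time() - $lifetime)) {
        return file_get_contents($cache);
    }

    // Cache miss: fetch a live copy.
    $content = $fetcher($feedUrl);
    if ($content === false) {
        // The fetch failed, so fall back to the stale copy if there is one.
        return file_exists($cache) ? file_get_contents($cache) : false;
    }

    // Write to a temporary file first and then move it into place,
    // so that nobody reads a half-written cache file.
    // (On POSIX systems rename() replaces an existing target file.)
    $tempName = tempnam($cacheDir, 'rff');
    file_put_contents($tempName, $content);
    rename($tempName, $cache);

    return $content;
}

// Usage with a stand-in fetcher that counts how often it is called.
$calls = 0;
$fetcher = function ($url) use (&$calls) {
    $calls++;
    return '<rss><channel><title>Demo</title></channel></rss>';
};

// A throwaway directory stands in for assets/cache.
$dir = sys_get_temp_dir() . '/rff_demo_' . uniqid();
mkdir($dir);

$first  = rffCachedFetch('http://example.com/feed.xml', $dir, 7200, $fetcher);
$second = rffCachedFetch('http://example.com/feed.xml', $dir, 7200, $fetcher);
```

The second call should be served from the cache file, so the stand-in fetcher only runs once. In the snippet itself, the fetcher would simply be file_get_contents.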

If we now save the snippet and load our Test Feed page for the first time, there shouldn’t be much of a difference: the page will still take some time to load since there’s nothing cached yet.  However, if we now refresh the page, or navigate to another part of the site and then back to that page, you should notice that the sidebar loads significantly faster.

If we now look at the contents of our cache folder, you’ll notice two new additions:

[Image: snippettut4_3]

You can see our two new cache files that weren’t there before, and they have the rff_ prefix as we specified in our code.

This is a pretty simple implementation of caching but it works fine and makes the browsing experience much more pleasant for the site visitor as well as more optimal for your webserver and the source webserver.  For part 3 of our requirements, we’ve hard coded the cache lifetime into the snippet, but as before, you can make this an optional parameter in your snippet call and then set a fallback duration that the caching lifetime will default to.  I’m confident that if you’ve followed this series through you now know how to make that happen.
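
As a rough sketch of that idea, here is one way the fallback could look. The parameter name cacheLifetime and the helper rffResolveLifetime are my own hypothetical names, not something defined earlier in the series; the 7200 second default is the value we hard coded above.

```php
<?php
// Hypothetical helper: resolves the cache lifetime (in seconds) from an
// optional snippet parameter, falling back to a default.
function rffResolveLifetime($param = null, $default = 7200) {
    // Snippet parameters arrive as strings, so cast to an integer;
    // a missing, non-numeric, or non-positive value yields the default.
    $seconds = (int) $param;
    return ($seconds > 0) ? $seconds : $default;
}

// At the top of the snippet, next to the other default parameter values.
// $cacheLifetime is only set if the snippet call passed it in.
$cacheLifetime = rffResolveLifetime(isset($cacheLifetime) ? $cacheLifetime : null);

// The freshness check would then use the variable instead of the literal 7200:
// filemtime($cache) > (time() - $cacheLifetime)
```

With this in place, adding something like &cacheLifetime=`3600` to the snippet call would cache each feed for one hour, while calls without the parameter keep the two hour default. Remember that fetchFeed() would also need the value passed in as an extra argument.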

Here’s the full code for our snippet so far.


<?php
//default parameter values
$limit = (isset($limit)) ? $limit : 10;
$cssChunk = (isset($cssChunk)) ? $cssChunk : 'rssfeedFetcherCSS';
//leave $tplChunk as null when it isn't passed, so that the default template chunk is used
$tplChunk = (isset($tplChunk)) ? $tplChunk : null;

//Inject the CSS code into the head section
$modx->regClientCSS($modx->getChunk($cssChunk));

if (!function_exists('fetchFeed')) {
    function fetchFeed($feed_url, $limit, $cssChunk, $tplChunk){
        global $modx;

        //create a variable to hold the output
        $output = '';

        //retrieve file and return as string
        //$content = file_get_contents($feed_url);

        //CACHING
        //The cached feed contents should only last for a specific time and then expire, so we need to register the current time.
        $currentTime = microtime(true);

        //We now set a location for the cache file
        //We prefix it with rff and use the md5 hash of the url to allow creation of a cache file for each feed we're calling to avoid overwriting
        $cache = $modx->config['base_path'] . "assets/cache/rff_" . md5($feed_url);

        //First check for an existing version of the file, and then check to see whether or not it has expired.
        //I'm going to use 2 hours (7200 seconds) as my cache duration
        if (file_exists($cache) && filemtime($cache) > (time()-7200)) {

            //If there's a valid cache file, load its data.
            $content = file_get_contents($cache);

        } else {

            //If there's no valid cache file, I grab a live version of the data and save it to a temporary file.
            $content = file_get_contents($feed_url);
            $tempName = tempnam($modx->config['base_path'] . "assets/cache", 'rff');
            file_put_contents($tempName, $content);

            //Once the file is complete, I want to copy it to a permanent file
            //first I have to delete the old cached file if it exists (suppress error message for first pass through)
            @unlink($cache);
            rename($tempName, $cache);
        }

        try {
            //all is good, we parse the feed
            $feeditems = new SimpleXMLElement($content);

            //set up a counter to determine how many items are displayed
            $i = 0;

            //use the default template chunk unless one was passed in
            $tpl = (!isset($tplChunk)) ? 'rssfeedFetchertpl' : $tplChunk;

            //iterate over the items in the channel and get the title and link of each item
            foreach ($feeditems->channel->item as $entry) {
                if ($i < $limit) {
                    $modx->setPlaceholder("fftitle", $entry->title);
                    $modx->setPlaceholder("fflink", $entry->link);
                    $output .= $modx->parseChunk($tpl, $modx->placeholders, '[+', '+]');
                }
                $i++;
            }
        } catch (Exception $e) {
            //some error occurred, we output an error message and a description of the error
            $output .= 'An error occurred.  The feed ' . $feed_url . ' could not be read: ' . $e->getMessage();
        }
        return $output;
    }
}
//call the function
return fetchFeed($feed_url, $limit, $cssChunk, $tplChunk);

If you want a numerical measure of the performance improvement this little caching system brings to the snippet, you can add a little code to check how long the script takes to execute on the first pass, and then again after the content has been cached.  An example of a script you can use is here: http://www.tipsntutorials.com/tips/PHP/74
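
If you just want something quick, a minimal timing sketch using microtime() could look like the one below. This is not the script from the link above, only an illustration; the usleep() call is a stand-in for the work being measured, which in our case would be the call to fetchFeed().

```php
<?php
// Record the start time as a float (seconds, with microsecond precision).
$start = microtime(true);

// Stand-in for the work being measured; in the snippet this would be
// the call to fetchFeed(...).
usleep(50000); // pause for 50 milliseconds

// The difference between now and the start time is the elapsed time in seconds.
$elapsed = microtime(true) - $start;
printf("Executed in %.4f seconds\n", $elapsed);
```

Run it once with an empty cache and once after the feed has been cached, and compare the two numbers.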

Well, that’s it for today.  Hope this post has been helpful to you not just for your MODx snippet creation skills but also overall for your PHP coding. If you have other ideas on how caching could be implemented better or differently, please don’t hesitate to share them in the comments!

More Reading

PHP.net – We’ve used a few PHP functions that you may not be familiar with, such as rename(), tempnam(), unlink(), etc.  For any of these that you don’t know, read up on them and make sure you understand their syntax for your own use and reference.
