SharePoint Caching Techniques

Controlling the Client (Browser) Cache

In any web application, caching can be both a benefit and a burden. We use caching to serve pages and objects repeatedly without having to spend resources re-rendering items that have previously been processed.

There are also times when reprocessing previously rendered items is desired, either because the data changes frequently or because it must be handled differently for every request or user. Sometimes just an individual control, dataset, or other object needs to be stored for reuse; other times the entire rendered page output needs to be stored and served on successive requests. Items may be cached in many places, including the server, the client, and the network devices that may sit between the two. ASP.NET uses the System.Web.HttpResponse.Cache object to manipulate the HTTP cache headers that control how a page is cached. The locations where an item may be cached are as follows:

Server: The page is cached on the server that processed it so that additional requests for the same item are served from cache instead of being reprocessed, conserving server resources.

Public: The page’s content can be cached on the client and on shared (proxy) caches that may reside on networks between the client and server in an effort to conserve network bandwidth.

Private: The page is cached only on the client browser; shared (proxy) caches are not permitted to store the content.

The System.Web.HttpCacheability enumeration is used to specify which combination of the above locations, and which devices in the communication chain, may cache an item.

NoCache: Sets the Cache-Control: no-cache header. Without a field name, the directive applies to the entire request, and a shared (proxy server) cache must force a successful revalidation with the origin web server before satisfying the request. With a field name, the directive applies only to the named field; the rest of the response may be supplied from a shared cache.

Private: Default value. Sets Cache-Control: private to specify that the response is cacheable only on the client and not by shared (proxy server) caches.

Server: Specifies that the response is cached only at the origin server. Similar to the NoCache option: clients receive a Cache-Control: no-cache directive, but the document is cached on the origin server. Equivalent to ServerAndNoCache.

ServerAndNoCache: Applies the settings of both Server and NoCache to indicate that the content is cached at the server but all others are explicitly denied the ability to cache the response.

Public: Sets Cache-Control: public to specify that the response is cacheable by clients and shared (proxy) caches.

ServerAndPrivate: Indicates that the response is cached at the server and at the client but nowhere else. Proxy servers are not allowed to cache the response.

Table 1 HttpCacheability Options – source: MSDN
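
These headers can be set from a page's code-behind through the Response.Cache object. A minimal sketch (the duration is illustrative):

// Allow clients and shared (proxy) caches to store this response for 10 minutes.
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetExpires(DateTime.UtcNow.AddMinutes(10));
Response.Cache.SetMaxAge(TimeSpan.FromMinutes(10));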

Meta tags in the HTML header can also be used to control the cache settings of a page. However, this technique is often unreliable because most shared (proxy) cache devices inspect only the HTTP headers and not the HTML content, so these settings are frequently ignored.

SharePoint Publishing Cache

The publishing features of SharePoint include the SharePoint publishing cache. This cache is a wrapper around the ASP.NET cache settings that provides an interface for controlling the HTTP cache headers and how the cache is varied. However, this cache only affects publishing pages in SharePoint and has no effect on non-publishing pages.

The SharePoint publishing cache settings are defined in a Cache Profile at the root of the site collection. The cache profile can then be assigned at an individual site level or for a particular page layout.

SharePoint BLOB Cache

Binary Large Objects (BLOBs) can be cached separately from publishing items using the SharePoint BLOB cache. The BLOB cache is a disk-based cache that serves files from the web server's file system instead of reading the BLOB from the database, reducing the load on the SQL server. When a content object is updated, the BLOB cache is notified so that it refreshes the cached item from the database on the next request. The BLOB cache is configured in each web application's web.config file.
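
The configuration lives in the BlobCache element of web.config; a representative sketch is below (the location, file-type pattern, and maximum size in GB are illustrative values):

<BlobCache location="C:\BlobCache\14" path="\.(gif|jpg|jpeg|png|css|js)$" maxSize="10" enabled="true" />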

Layouts Folder Cache

When SharePoint provisions a web application in IIS, it maps several folders as virtual directories. One of the most heavily used is the …/TEMPLATE/LAYOUTS folder, which is mapped as /_layouts. When the virtual directory is created, it is given a content expiration that tells public and private caches to cache items for 365 days without making another request to check for updates. This can cause issues with custom JS and CSS files deployed to this location.

Site Pages

Any page that is not a SharePoint publishing page but is still a content page residing in a SharePoint document library is referred to as a site page. By default, site pages are set for private cache with an expiration date 15 days in the past. This forces the client to validate that the page has not changed before rendering it from its cache.

Server-Side Cache Techniques

ASP.NET Cache Object

The ASP.NET cache stores objects in a dictionary so that they can easily be retrieved again by key name. Objects placed into the cache can be expired automatically or explicitly when needed. This cache is exposed through the System.Web.Caching.Cache object, which is available in several ways, including Page.Cache, Context.Cache, and HttpRuntime.Cache. (Response.Cache, by contrast, controls the HTTP cache headers discussed above, not this object cache.) The ASP.NET cache object is unique to each server in the farm and is not shared between servers, so it is possible to have a completely different version of an object from server to server.
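
A minimal sketch of the common read-through pattern against HttpRuntime.Cache (the key name and loader method are illustrative):

// using System.Data; using System.Web; using System.Web.Caching;
public static DataTable GetReportData()
{
    // Try the cache first; the indexer returns null on a miss.
    var report = HttpRuntime.Cache["ReportData"] as DataTable;
    if (report == null)
    {
        report = LoadReportFromDatabase(); // hypothetical expensive call
        // Cache for 10 minutes with no sliding expiration.
        HttpRuntime.Cache.Insert("ReportData", report, null,
            DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
    }
    return report;
}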

SharePoint Web Part Cache

SharePoint Web Parts can use their own caching technique. What sets the SharePoint web part cache apart from the ASP.NET cache is that a web part can easily be cached for all users or varied per user. Web part cache items can also be stored in a database so that all servers share the same cache data. To use the SharePoint web part cache, the web part must derive from Microsoft.SharePoint.WebPartPages.WebPart. To manipulate cache items for the web part, use the PartCacheRead, PartCacheWrite, and PartCacheInvalidate methods. This cache is stored per web part instance (and per user, if chosen), not per web part type.
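
A minimal sketch inside a web part derived from Microsoft.SharePoint.WebPartPages.WebPart, assuming the common PartCache* signatures (the key, timeout, and rendering method are illustrative):

protected override void RenderWebPart(HtmlTextWriter output)
{
    // Read from the shared (all users) cache; use Storage.Personal to vary per user.
    var html = PartCacheRead(Storage.Shared, "ReportHtml") as string;
    if (html == null)
    {
        html = BuildReportHtml(); // hypothetical expensive rendering
        PartCacheWrite(Storage.Shared, "ReportHtml", html, TimeSpan.FromMinutes(5));
    }
    output.Write(html);
}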

Partial Caching

Partial caching is a technique for caching different elements on a page with different cache schedules or cache variation rules. It is effective when the page itself is not cached but some of its controls are, or when the page is cached but various elements must be cached differently than the page itself. Partial caching is applied declaratively using the <%@ OutputCache %> directive on a page or control, or programmatically by decorating the control's class with the [PartialCaching] attribute.
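
A minimal sketch of the programmatic form (the control name is illustrative):

// Cache this control's rendered output for 60 seconds. Equivalent to placing
// <%@ OutputCache Duration="60" VaryByParam="None" %> in the .ascx markup.
[PartialCaching(60)]
public partial class NewsTickerControl : System.Web.UI.UserControl
{
    // Control logic as usual; ASP.NET serves the cached output until it expires.
}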

Donut Caching

Donut caching is used instead of partial caching when an element should not be cached at all even though the page itself is cached (on the server). When using the SharePoint publishing cache, the entire rendered page is placed into the cache and served identically to all site visitors. This causes problems for HTML elements that must vary by user or by other logic. Donut caching excludes a portion of the rendered page from the cache so that it is reprocessed on every request even when the page itself is cached. It is implemented with a delegate method that matches the signature of HttpResponseSubstitutionCallback and is registered with the Response object. This can be done either declaratively, using the ASP.NET Substitution control, or programmatically, by registering the delegate method directly with the Response object's WriteSubstitution method.

Method 1 – Declaratively using the ASP.NET Substitution Control:

  1. Create a new ASCX user control with code-behind.
  2. In the code-behind, create a new static method with a signature of: public static string <YourMethodName>(HttpContext context)
  3. In the markup, create a new <asp:Substitution /> server control as a placeholder for what will not be cached. Set the MethodName property to the name of the method created in step (2): MethodName="<YourMethodName>".
  4. Implement the desired logic in the new method and return the resulting HTML that will be displayed on the page.
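
A minimal sketch of Method 1 (the control and method names are illustrative):

// Markup in the .ascx; the Substitution control marks the region excluded from cache:
//   <asp:Substitution ID="UserGreeting" runat="server" MethodName="RenderUserName" />
public partial class GreetingControl : System.Web.UI.UserControl
{
    // Matches HttpResponseSubstitutionCallback; runs on every request,
    // even when the rest of the page is served from cache.
    public static string RenderUserName(HttpContext context)
    {
        return HttpUtility.HtmlEncode(context.User.Identity.Name);
    }
}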

Method 2 – Programmatically Through Code:

  1. Create a new static method in your class with a signature of: public static string <YourMethodName>(HttpContext context)
  2. In a method outside of the new static method (OnLoad, OnPreRender, Render, etc.), register the substitution by calling: Context.Response.WriteSubstitution(<YourMethodName>);
  3. Implement the desired logic in the new method and return the resulting HTML that will be displayed on the page.
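
A minimal sketch of Method 2 (the method name is illustrative):

// The callback must match HttpResponseSubstitutionCallback: static, taking an
// HttpContext and returning the HTML string to emit.
public static string RenderUserGreeting(HttpContext context)
{
    return HttpUtility.HtmlEncode("Hello, " + context.User.Identity.Name);
}

protected override void OnLoad(EventArgs e)
{
    base.OnLoad(e);
    // Register the callback; its output replaces this point in the cached response.
    Context.Response.WriteSubstitution(RenderUserGreeting);
}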

Note: when using donut caching, client-side cacheability cannot be used; in effect, every request must contact the server.

Rendering complete or dynamic controls in the HttpResponseSubstitutionCallback

When using donut caching, you can render complete controls, or a set of controls loaded dynamically, using a proxy control. This control uses either method 1 or 2 above, instantiates the controls to be rendered within the HttpResponseSubstitutionCallback, and returns the HTML result of their rendering. The blog at http://webnet.web44.net/advanced-donut-caching-using-dynamically-loaded-controls/ has more details about this approach, but a simple example of dynamically loading a control and rendering its output using donut caching is shown below:

// Requires: using System.IO; using System.Text; using System.Web; using System.Web.UI;
public static string MyMethodName(HttpContext context)
{
    var output = new StringBuilder(10000);
    // A transient Page instance serves only as a host for loading the user control;
    // it is never added to the response.
    using (var page = new Page())
    using (var ctl = page.LoadControl("~/_controltemplates/somecontrol.ascx"))
    using (var writer = new StringWriter(output))
    using (var htmlWriter = new HtmlTextWriter(writer))
    {
        ctl.DataBind();
        // Render the control's HTML into the StringBuilder rather than the page.
        ctl.RenderControl(htmlWriter);
    }
    return output.ToString();
}

Over the years that I’ve worked on teams developing software, I regularly hear of the “three-legged stool”, a metaphor for balancing three opposing aspects of the development process. The metaphor works well because if any leg of a stool is shorter than the others, the stool can stay upright, but any pressure can cause it to fall over.

Neglecting one of them entirely can leave the stool unable to stand at all. These three legs are quality, features, and schedule. The motivation behind choosing these three aspects is to help business stakeholders, managers, and developers make smart decisions about the compromises we inevitably make on projects. In practice, however, I find that business owners will always choose features, managers will always choose schedule, and developers will always choose quality.

The reason behind this is simply a human desire for self-preservation. Business stakeholders’ performance is measured by what features they deliver, managers are measured by their ability to deliver on time, and developers are often measured by their ability to write solid code and deliver it with minimal defects. Since most organizations are structured so that developers report to managers, who in turn report to the business, this places an uneven emphasis on adding new features at the expense of consistent time to market with new features of high quality.

As a developer or architect, the list of features and the desired release date are often out of our hands, even with a backlog and a fairly well-followed set of agile processes. Rather than continue this trend, I believe that even though business stakeholders are supposed to know the business best, they are often not well versed enough in technology to understand the implications of ignoring requests for refactoring or to make smart decisions about technology selection. Developers are still the core of the value chain in this aspect of the development process. Rather than argue this point, I’d like to propose a list of four priorities that developers can regularly consider when making technology and design decisions. If you follow this set of priorities, you’ll find yourself regularly making the best decisions for all parties.

  1. Can I implement the feature?
  2. Does the design allow extension of features along planned integration points?
  3. Is the design flexible enough to allow unplanned changes to be made?
  4. Is the design, and are the technologies, the most modern available?

Can I implement the feature?

If you don’t make the software do what the business wants it to, you’ve failed as a developer. It doesn’t matter if you’ve used to most forward thinking technologies and patterns, and written unit tests to give you 110% code coverage (chuckle). If you don’t deliver to market what your business expects, users won’t have compelling reasons to use the software, your business stakeholders won’t have much to sell, and you probably won’t be seen as having much value in the company at one point or another. This assumes that the business is choosing the right features, and designing them to be usable, for success in the marketplace. If you don’t trust the business to make the right decisions here, I’d suggest you join another company you can be passionate about where you believe in the vision and leadership there. Above all else, evaluate design decisions and technology selections by their ability to enable you to meet the requirements and make the feature work. Usability and quality are included in this priority. I’m of the opinion that one should never deliver features that are unusable or of bad quality to the market. If a feature is hard to use, users won’t use it, or worse yet use it as an example of bad design to laugh at with their peers. If a feature doesn’t work, they will say bad things to their friends about your company and kill your brand. Enough said.

Does the design allow extension of features along planned integration points?

When you build a software product, unless you want to see a competitor blow you out of the water soon after you ship your first version of a feature, you’ve got to have a backlog of new features to add in future releases. I’m of the opinion that this list of future features should be readily available to developers to provide transparency into the direction of the business. While it’s true that an inexperienced developer may look at a backlog and be tempted to make design decisions that extend the schedule without adding value needed in that feature or release, a quick read of YAGNI and a revisiting of that principle during design decisions should be all it takes to avoid this. The downside of not letting developers see the backlog of future features is much more nefarious: they can make design decisions that make it difficult, if not impossible, to support features planned for the short- to medium-term future. There is a delicate balance here, but the heading says it well: make decisions in your design that allow extension of features along integration points you really have planned. Stakeholders should have a way to identify features on the backlog that are uncertain (nice to have), or a duration of time past the current release after which features should not be considered in design decisions. If you’re not sure, don’t do it.

Is the design flexible enough to allow unplanned changes to be made?

I often encounter developers new to OO, data-driven design, and contract-first development who go crazy with decoupling and make their designs overly complicated to implement with little value. They add interfaces to classes with trivial logic to enable unit testing (thus adding unnecessary assets to maintain, test, and revisit during refactoring), use solution-specific patterns to implement logic that really belongs in one method, and create complicated multi-step ETL processes to cover up for bad database design. As a developer, you should try to select patterns and technologies that follow industry best practices for enabling refactoring, testing, and feature integration, but only to a level sufficient to support the first two priorities. This priority is closely related to the one above, with a crucial difference: the second priority has a business stakeholder driving the decision for flexibility, while this one is driven by a developer’s desire to use patterns that make the code easy to maintain or rapid to extend but that will be difficult to explain to business stakeholders and management. Usually it’s easy to convince management or stakeholders that you have to design an interface to support integration with data providers they have planned integration with in their backlog. It’s not as easy to convince them that a set of your classes needs interfaces for passing into a processing pipeline of some sort when the processing could be done in one method and there are no plans to modify the pipeline’s order. Read the Tradable Quality Hypothesis from Martin Fowler to help you convince management of the importance of supporting activities that fall into this priority when you feel the need for them.

Is the design, and are the technologies, the most modern available?

Last on your set of priorities should be using new technologies. ASP.NET MVC 3, jQuery Templates, OData, Ruby, and F# may offer significant advantages over past technologies or patterns, but if using them doesn’t result in more success with the first three priorities, it’s hard to make a logical case for them. There are times when using a technology is necessary for marketing purposes, and in that case its use becomes a feature itself (thus bumping it to priority 1), but otherwise there are many times when an older version of a database, framework, or language is more than sufficient to enable the business to consistently deliver quality value to customers. Unless your organization has a handle on the first three priorities, you’ve got work to do before tackling this one.