Crawlable Javascript Pages with KnockoutJS
For sites or applications that wish to have their pages visible in search engines, or accessible by browsers not using javascript (screen readers for example), it is important to be crawlable. This means the server can produce pure html output that represents the content of the page or application. In modern client (javascript) focused projects this takes a bit of effort.
Google has defined a standard for HTML4 browsers that allows the server to render page content specific to the hash values in links (#anchor). Normally these hash values are not sent to the server, as they are meant for navigation within a page. In HTML4 browsers, however, they are also used to trigger javascript updates to a page, resulting in unique content for each hash value. HTML5 adds a history API that allows javascript to alter page contents based on normal URLs, rather than needing hash values to indicate new content. In this case the server simply needs to be able to render every URL the application or page uses in its links, even if the content would be generated by javascript when used within an HTML5 browser.
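As a rough illustration of the Google scheme (which the rest of this post does not use): a crawler that sees a URL containing `#!` requests the same URL with the fragment moved into an `_escaped_fragment_` query parameter, and the server answers that request with plain HTML. The helper name below is hypothetical, and the escaping is deliberately simplified to a few ASCII characters, so treat it as a sketch of the idea rather than a faithful implementation of the specification.

```javascript
// Sketch of the URL rewrite a crawler performs under Google's AJAX
// crawling scheme. toEscapedFragmentUrl is a hypothetical name; the
// escaping is simplified and only covers a few ASCII characters.
function toEscapedFragmentUrl(url) {
  var idx = url.indexOf('#!');
  if (idx === -1) {
    return url; // no hash-bang, nothing to rewrite
  }
  var base = url.slice(0, idx);
  var fragment = url.slice(idx + 2);
  // Percent-encode control characters, space, '#', '%', '&' and '+'.
  var escaped = fragment.replace(/[\x00-\x20#%&+]/g, function (ch) {
    return encodeURIComponent(ch);
  });
  // Append to an existing query string if the URL already has one.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + escaped;
}
```

The server then has to recognise `_escaped_fragment_` on every page and map it back to content, which is the extra work the HTML5 approach in this post avoids.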
In this post I will cover the use of HTML5 history API to present a dynamic site, while still supporting server side rendering of all content for crawling, and while defining the content only once. KnockoutJS is used as the client-side javascript library for this post, as it is low overhead, and approachable, while keeping most of the focus on the content. For the server side I will be using Rails 3.1, but the technique applies equally to any server or client frameworks.
The Components
For this post I will be presenting one page in detail and describing how the system works for multiple pages. This site uses this technique and can be viewed as a larger example.
- Server Side Controller
- Server Side Layout
- Server Side View
- Client Side Logic
- Server Side Routes
The general technique is to render the content on the server into a string. This content is then either rendered into the overall HTML page when served by the server, or inserted into the page when navigating using javascript. The History API is used to provide smooth navigation on the client side. A variation on this technique is to render all pages into javascript templates on initial page load, allowing fully autonomous navigation between pages. This would work for relatively small static sites, or for sites that are largely data driven, where the templates would be small and the per-page data fetched as part of the navigation.
For initial page load you can either render the content on the server and then re-render it on the client for consistency, or only render pages in the client on navigation. In the case of Knockout, bindings are established on page load, so this example re-renders the content after page load to allow the bindings to operate automatically after each navigation.
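The template variation mentioned above could be sketched as a small in-memory store that is filled once on page load and consulted on every navigation. Everything here is an assumption for illustration: the `createPageStore` name, the `find` method, and the choice to key pages by path. The page objects reuse the three fields the server sends as JSON later in this post.

```javascript
// Hypothetical in-memory store for the "render everything up front"
// variation: pages are keyed by path and looked up without a request.
function createPageStore(pages) {
  return {
    // Return the stored page for a path, or null so the caller can
    // fall back to asking the server.
    find: function (path) {
      return Object.prototype.hasOwnProperty.call(pages, path) ? pages[path] : null;
    }
  };
}

var store = createPageStore({
  '/home': { cur: 'home', page_title: 'TechnoMage', content: '<h1>Home</h1>' },
  '/about': { cur: 'about', page_title: 'TechnoMage - about', content: '<h1>About</h1>' }
});
```

A navigation handler would call `store.find(path)` first and only issue a request when it gets `null` back.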
The Server Side Controller
The server side rendering of content relies on the ability to render content as HTML to a string, then use that string either as JSON for an AJAX request or directly in the HTML response to a direct request.
class HomeController < ApplicationController
...
def index
@page_title = 'TechnoMage'
@page_tag = 'home'
html = render_to_string(:format => :html,
:file => 'home/index.html.erb', :layout => false)
respond_to do |format|
format.html { render :html => html }
format.json { render :json => json_for_content(html)}
end
end
def about
@page_title = 'TechnoMage - about'
@page_tag = 'about'
html = render_to_string(:format => :html,
:file => 'home/about.html.erb', :layout => false)
respond_to do |format|
format.html { render :html => html }
format.json { render :json => json_for_content(html)}
end
end
end
When rendering content to JSON, some additional information is passed from the server to update the page. The page title is updated on each navigation, and the current page tag is passed so that any navigation menu can be updated. When the content is rendered to HTML, these values are embedded in the page as well and used when javascript re-renders it. When javascript is disabled, or the page is read by a crawler such as a search engine, only the server-rendered title and navigation classes apply.
class ApplicationController < ActionController::Base
...
def json_for_content html
{:cur=>@page_tag,
:page_title => @page_title,
:content => html}.to_json.html_safe
end
helper_method :json_for_content
end
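For reference, the JSON body for the about page would look roughly like the string below. The field names come from json_for_content; the exact escaping of the HTML depends on the Rails JSON encoder, so the string is only an approximation. The `isPagePayload` guard is a hypothetical addition that checks the three fields before they are handed to the view model.

```javascript
// Approximate JSON body for the about page. The field names match
// json_for_content; the content is the sample about view.
var body = '{"cur":"about","page_title":"TechnoMage - about",' +
           '"content":"<h1>About</h1>\\n<p>This is a sample page view containing just HTML.</p>"}';

// Hypothetical guard: true only when all three fields are strings.
function isPagePayload(data) {
  return !!data &&
    typeof data.cur === 'string' &&
    typeof data.page_title === 'string' &&
    typeof data.content === 'string';
}

var payload = JSON.parse(body);
```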
The Server Side Layout
In Rails when a page is rendered to HTML a layout is used to render common elements of all or related pages. In this example the layout provides the navigation menu and handles creating the Knockout bindings for content and setting up the History API to provide javascript based navigation.
You will notice in several places in this layout that the server side rendering substitutes in data for things like the page title, the paths for each link's href, and the classes applied to navigation elements so the current page is marked as current. These all support access by crawlers, javascript disabled browsers, and screen readers, so that they are presented with complete content. In addition to these static elements there are KnockoutJS bindings that update the same elements dynamically during javascript based navigation. The page title is the exception: since it lives in the document head, we set it explicitly upon navigation.
<!DOCTYPE html>
<html>
<head>
<title><%= @page_title %></title>
<%= stylesheet_link_tag "application" %>
<%= javascript_include_tag "application" %>
<%= csrf_meta_tags %>
</head>
<body class="blog" data-bind="attr: {class: cat}">
<div id="topbar">
<a href="<%= root_path %>"
onclick="return show_home();"><div id="logo"></div></a>
<div id="nav">
<%= link_to "Home", root_path,
:class=>"first #{'current' if @page_tag=='home'}",
:onclick=>"return show_home();",
:"data-bind"=>"css: {current: isCurrent('home')}" %>
<%= link_to "About", about_path, :onclick=>"return show_about()",
:class=>"middle #{'current' if @page_tag=='about'}",
:"data-bind"=>"css: {current: isCurrent('about')}" %>
...
</div>
</div>
<div id="content" data-bind="dom: rendered_content()">
<% @content = yield %>
<%= @content %>
</div>
</body>
</html>
<script>
jQuery(function() {
// Copy the html into KO data and re-render to allow KO to modify
// it in the future, while still falling back on plain html if
// js is not enabled
var js = <%= json_for_content @content %>;
ko.mapping.fromJS(js, {}, model);
ko.applyBindings(model);
//
// Handle history changes
//
var History = window.History;
if (History.enabled) {
History.Adapter.bind(window,'statechange',function() {
var state = History.getState();
load(state.url);
});
}
});
</script>
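The layout refers to a `model` object, an `isCurrent` function, and a custom `dom` binding that are not listed in this post. The following is only a guess at their minimal shape, assuming the model holds observables named after the JSON fields and that the `dom` binding simply replaces the element's HTML. `createPageModel` and `installDomBinding` are hypothetical names, and `ko` is passed in as a parameter purely to make the Knockout dependency explicit.

```javascript
// Sketch only: the post's real model and `dom` binding are not shown.
function createPageModel(ko) {
  var model = {
    cur: ko.observable(''),        // tag of the current page
    page_title: ko.observable(''), // title of the current page
    content: ko.observable('')     // HTML string for the content area
  };
  // Used by the css bindings on the navigation links.
  model.isCurrent = function (tag) {
    return model.cur() === tag;
  };
  // Used by the dom binding on the content div.
  model.rendered_content = function () {
    return model.content();
  };
  return model;
}

function installDomBinding(ko) {
  ko.bindingHandlers.dom = {
    // Knockout calls update once initially and again whenever an
    // observable read while evaluating the binding changes.
    update: function (element, valueAccessor) {
      element.innerHTML = ko.utils.unwrapObservable(valueAccessor());
    }
  };
}
```

In the browser this would be wired up with something like `installDomBinding(ko); var model = createPageModel(ko);` before the layout script runs. Knockout's built-in `html` binding behaves similarly and may be enough for simple cases.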
The Server Side View
The server side views are just normal HTML views in Rails. They contain no special content for this technique.
<h1>Home</h1>
<p>This is a sample page view containing just HTML.</p>
<h1>About</h1>
<p>This is a sample page view containing just HTML.</p>
The Client Side Logic
The client side logic to navigate to a page is presented below. It uses the history.js library which provides a cross-browser interface to the HTML5 history API while accounting for various browser specific differences.
The main function is the goto function, which navigates to a new URL using the history API. There are helper functions for the various page URLs, based on the Rails named route helpers, which are statically generated into the application.js.erb file as part of the asset pipeline support in Rails 3.1.
function goto(path) {
var History = window.History;
if (History.enabled) {
History.pushState(null, null, path);
return false;
}
return true;
}
function show_home() {
return goto('/');
}
function show_about() {
return goto('/about');
}
...
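If the list of pages grows, the per-page helpers could be produced from a single table of routes instead of being written out one by one. This is a sketch under assumptions: `makeNavHelpers` and the shape of the `routes` object are hypothetical, and the `navigate` argument stands in for the goto function above.

```javascript
// Hypothetical generator for the show_* helpers. `routes` maps a page
// name to its path; `navigate` is a function like goto above.
function makeNavHelpers(routes, navigate) {
  var helpers = {};
  function add(name, path) {
    helpers['show_' + name] = function () {
      // Pass the return value through so onclick="return ...;" still
      // cancels or allows the default link navigation.
      return navigate(path);
    };
  }
  for (var name in routes) {
    if (Object.prototype.hasOwnProperty.call(routes, name)) {
      add(name, routes[name]);
    }
  }
  return helpers;
}
```

With `var nav = makeNavHelpers({ home: '/', about: '/about' }, goto);` a link's handler would become `onclick="return nav.show_about();"`.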
The client side logic to respond to a URL and present the contents is presented below.
History.js calls the statechange callback in the layout when the goto function above requests a navigation. This ensures that the URL bar of the browser, the presented content, and any bookmarks are all in sync. That callback calls this load function to actually load the content for the requested path. This function uses an AJAX request to get the JSON version of the content and supporting data from the server, then uses KnockoutJS to update the page's DOM, and finally sets the page title. Note that in this case we replace the root path '/' with '/home' so the .json extension is properly formatted. This requires that the server be configured to reply with the same results for both '/' and '/home'.
function load(path) {
if (path.match(/\/$/)) {
path = path + 'home';
}
jQuery.get(path+'.json', function(data) {
ko.mapping.fromJS(data, {}, model);
document.title = model.page_title();
});
}
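The path rewrite inside load can be pulled out into a small function, which makes the '/' to '/home' rule easy to check on its own. `jsonPathFor` is a hypothetical name, and the sketch assumes the path carries no query string; it works the same whether it is given a bare path or a full URL.

```javascript
// Hypothetical helper: turn a page path or URL into the address of its
// JSON version, mirroring the rewrite inside load above.
function jsonPathFor(path) {
  if (/\/$/.test(path)) {
    path = path + 'home'; // '/' has no name to hang '.json' on
  }
  return path + '.json';
}
```

Here `jsonPathFor('/')` gives `'/home.json'` and `jsonPathFor('/about')` gives `'/about.json'`, so load could call `jQuery.get(jsonPathFor(path), ...)`.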
The Server Side Routes
The final component is the server side routes that map URLs to controllers and actions. The routes below support the sample home and about pages used above. These are standard Rails 3.1 routes and are only included for completeness.
get "about", :to => "home#about", :as => :about
get 'home', :to => "home#index", :as => :home
root :to => 'home#index', :as => :root
Recap
This post provides a complete set of Rails 3.1 server side code and the required client side code, using the KnockoutJS and history.js libraries, to create a site that is crawlable by search engines, supports raw HTML as a fallback or for applications like screen readers, and, when javascript is available, presents seamless navigation between pages in HTML5 browsers.
While history.js does provide support for hash based javascript navigation on the client, supporting this style of URL and navigation greatly complicates both the client and the server. Google has defined a means for sites to use hash navigation and still provide search engines with content to index. It is overly complex but should be considered if you need to support HTML4 browsers and search engine crawling. History.js also provides a means to support HTML4 without the Google protocol, which is cleaner than the Google option but still more complex than the HTML5 version presented here.