SEO tips for JavaScripters: better if you know it (a recap)

What gets indexed by search engines? Static pages are, of course, but what about content generated by JavaScript (e.g. loaded via AJAX)?

Since the end of 2011, Google has been indexing comments loaded via AJAX (e.g. Facebook, Disqus, …).

The latest specification is at https://developers.google.com/webmasters/ajax-crawling/docs/specification. Reading https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot, this means that the web server must provide a snapshot of the requested page, and it has to produce it somehow.

All of this was due to a limitation on what JavaScript could change in the browser address bar: without reloading the page, only the part of the URL after the hash sign (#) could be replaced. Thus the hashbang (#!) was introduced, made famous (sometimes infamous) by Twitter, which gives the opportunity to index pages with AJAX-loaded content. BUT it demands that the server provide a static version of the AJAX page when the search engine crawler requests it with ?_escaped_fragment_=… (the ellipsis stands for the escaped string that follows the hashbang).
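To make the mapping concrete, here is a minimal sketch of how a hashbang URL turns into the URL the crawler requests. The helper name toEscapedFragmentUrl and the example.com address are made up for illustration, and the escaping covers only %, #, & and +, which is a simplification of the rules in the specification.

```javascript
// Hypothetical helper: maps a hashbang URL to the URL a crawler would request.
function toEscapedFragmentUrl(url) {
  var index = url.indexOf('#!');
  if (index === -1) {
    return url; // no hashbang, nothing to map
  }
  var base = url.slice(0, index);
  var fragment = url.slice(index + 2);
  // Join with '&' if the base already has a query string, with '?' otherwise.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  // Simplified escaping (an assumption): only %, #, & and + are percent-encoded.
  var escaped = fragment.replace(/[%#&+]/g, function (ch) {
    return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
  });
  return base + separator + '_escaped_fragment_=' + escaped;
}

console.log(toEscapedFragmentUrl('http://example.com/page#!key=value'));
// prints "http://example.com/page?_escaped_fragment_=key=value"
```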

With Java, all is fine. There is PhantomJS, but it is not a Node.js module (it is a desktop app). PHP has a JavaScript interpreter (V8, through the v8js extension), but it does not manipulate the DOM, so that would have to be implemented. Just an idea: maybe it is possible to load the standard browser environment (window and document) via the registerExtension method. Life is hard.
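Whatever tool ends up rendering the snapshot, the server first has to recognize the crawler's request. A rough sketch of that check, assuming a JavaScript server where the request URL is available as a string and the standard URL class is present (as in Node.js); getSnapshotFragment is a hypothetical name, and producing the snapshot itself is left out.

```javascript
// Hypothetical helper: returns the decoded hashbang fragment when the request
// is a crawler snapshot request, or null for an ordinary request.
function getSnapshotFragment(requestUrl) {
  // The base is only there so that relative URLs like '/page?x=1' can be parsed.
  var parsed = new URL(requestUrl, 'http://localhost');
  if (!parsed.searchParams.has('_escaped_fragment_')) {
    return null;
  }
  return parsed.searchParams.get('_escaped_fragment_');
}

console.log(getSnapshotFragment('/page?_escaped_fragment_=key=value'));
// prints "key=value"
```

When the result is not null, the server would answer with the static version of the page for that fragment instead of the bare AJAX shell.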

But now there is HTML5 and pushState … what? The same thing, but better. It simply removes the limitation on rewriting the browser address bar and lets JavaScript manage the history. Say:

var data = null;
var title = 'view video';
var url = '/listvideo/video1';
window.history.pushState(data,title,url);

This snippet replaces the URL in the address bar; it can then be followed by code that injects HTML with a video tag loaded via AJAX. Let's say the video is loaded in a layer: when the close cross is clicked, the click handler does another history.pushState() and removes the layer, so the user sees the list of videos again, while the browser's back button goes back through the history. Cool, isn't it? … Wait a moment: for the back button to work, an event handler is needed, something like:

window.addEventListener("popstate", function(event){
  // location.pathname contains the new current path
  // event.state contains the data passed to pushState
  showPage(location.pathname);
});
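The open and close handlers described above could be sketched roughly as follows. This is only an illustration: createOverlayNavigation is a made-up name, the history object and the render function are passed in as parameters (in a browser they would be window.history and something like showPage), and the state objects are arbitrary.

```javascript
// Hypothetical factory: `historyApi` is window.history in a browser,
// `render` is whatever draws the page for a given URL.
function createOverlayNavigation(historyApi, render, listUrl) {
  return {
    // For the click on a video link: add a history entry, then draw the overlay.
    open: function (videoUrl) {
      historyApi.pushState({ overlay: true }, '', videoUrl);
      render(videoUrl);
    },
    // For the click on the close cross: add another entry, then show the list.
    close: function () {
      historyApi.pushState({ overlay: false }, '', listUrl);
      render(listUrl);
    }
  };
}
```

In a page this could be wired up as createOverlayNavigation(window.history, showPage, '/listvideo'), with open and close called from the respective click handlers.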

Well, what search engines want is for the server to provide the whole page (both the base page and the overlay) in response to the requested URL. But this can be done with client-side code (read: JavaScript)!
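One way to do it on the client is to derive what to draw from the path alone, so that any URL can rebuild both the base page and the overlay. A minimal sketch, assuming the /listvideo/video1 URL layout from the snippet above; resolveView and the shape of the object it returns are hypothetical.

```javascript
// Hypothetical resolver: turns a path into a description of what to draw.
function resolveView(pathname) {
  // Drop empty segments so '/listvideo/' and '/listvideo' are treated alike.
  var segments = pathname.split('/').filter(function (segment) {
    return segment !== '';
  });
  if (segments[0] !== 'listvideo') {
    return { page: 'unknown', overlay: null };
  }
  // A second segment means a video overlay on top of the list.
  return { page: 'listvideo', overlay: segments.length > 1 ? segments[1] : null };
}

console.log(resolveView('/listvideo/video1'));
// prints { page: 'listvideo', overlay: 'video1' }
```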

For browsers that do not support the new history API, the content has to be filled in via AJAX on every page load: the server provides the base page, a check is done on location.href, and the right content is loaded via AJAX (e.g. showPage(location.href)). From an SEO perspective, it is vital that the pushState URL matches the href of the clicked element. (An intrusive example to show it: with <a href=/url/video1/ onclick=action>, action() must call pushState with url=/url/video1/.)
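A less intrusive version of the same idea, as a sketch: the handler reads the URL from the clicked link itself, so the pushState URL and the href cannot drift apart, and it steps aside when the history API is missing. The names supportsHistoryApi and onLinkClick are made up, and window and the page-drawing function are passed in as parameters purely for illustration.

```javascript
// Feature check: true only when history.pushState is available.
function supportsHistoryApi(win) {
  return Boolean(win.history && typeof win.history.pushState === 'function');
}

// Hypothetical click handler; `showPage` is whatever loads the content via AJAX.
function onLinkClick(win, event, showPage) {
  if (!supportsHistoryApi(win)) {
    return false; // old browser: let the link load the page normally
  }
  event.preventDefault();
  // Take the URL from the link itself so it always matches the href.
  var url = event.currentTarget.getAttribute('href');
  win.history.pushState(null, '', url);
  showPage(url);
  return true;
}
```

In a page it could be attached with link.addEventListener('click', function (event) { onLinkClick(window, event, showPage); }).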

No more snapshots provided by the server: everything is handled by the search engines. Bing and Google agree on this: client-side code is executed before indexing.

From an SEO perspective: HTML5? Use it.