Towards Page Segmentation SEO – Ignore at Your Own Peril
The basic concept of page segmentation (PS) analysis by the search engines to reduce on-page noise levels isn’t particularly new: related patents and studies go way back to the mid-90s and earlier.
To recapitulate, page segmentation is a method of analysis deployed by search engines to determine any given web page’s structure and assign hierarchies of relevance by different regions or blocks. The idea is to ignore or demote less relevant segments (“noise”) in favor of those that will actually address and satisfy a surfer’s search query (“signal”).
One typical segment found on most web pages is the navigation block. While it may, for example, point search engine spiders to a web site’s internal pages, as a general rule it’s hardly relevant to the page’s content proper. At the very least, it will feature many elements that won’t really help the search engine in deciding whether that page deserves being ranked for a given query.
By way of an (admittedly very simplistic) illustration, think keyword density as an example. Generic terms like “About Us”, “Privacy Policy”, “Contact”, “TOS” etc. obviously don’t contribute a lot of useful data towards determining what the page is actually about. Thus, by either ignoring navigation segments or dampening their score or impact on a page’s relevancy analysis, the signal to noise ratio can be improved.
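To make the dampening idea a bit more tangible, here is a minimal sketch in Python. It assumes the page has already been split into labeled blocks; the block labels, the weights in `BLOCK_WEIGHTS` and the function names are all made up for illustration and don't reflect what any search engine actually uses.

```python
import re
from collections import Counter

# Hypothetical block weights, chosen for illustration only.
BLOCK_WEIGHTS = {"main": 1.0, "navigation": 0.1, "footer": 0.1}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def weighted_term_scores(blocks, weights=BLOCK_WEIGHTS):
    """Return each term's share of the weighted term mass.

    blocks: list of (block_type, text) pairs.
    Block types missing from `weights` count with weight 1.0.
    """
    scores = Counter()
    for block_type, text in blocks:
        w = weights.get(block_type, 1.0)
        for term in tokenize(text):
            scores[term] += w
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()} if total else {}

page = [
    ("navigation", "About Us Privacy Policy Contact TOS"),
    ("main", "Page segmentation helps search engines separate signal from noise"),
]

flat = weighted_term_scores(page, weights={})  # every block counts fully
damped = weighted_term_scores(page)            # navigation is dampened

print(round(flat["privacy"], 3), round(damped["privacy"], 3))
print(round(flat["segmentation"], 3), round(damped["segmentation"], 3))
```

With the navigation block dampened, generic terms like “privacy” lose most of their share while content terms like “segmentation” gain, which is the improved signal to noise ratio in miniature.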
Again, this is merely a very basic outline – in actual reality, things get way more complex than that. Take multi-topic pages as an example: news portals, online newspapers and magazines, radio and tv sites, social bookmarking sites, article directory index pages and, in fact, the vast majority of blogs will feature a very heterogeneous mix of headlines, text excerpts or snippets and even vast blocks of text and multimedia content. This makes it both impossible and actually undesirable to work from a simplistic “Page X is about Topic Y” formula when it should really read “Page X is about Topics Y, Z, A, E, K and P” – whereas intelligent page segmentation can help pinpoint and organize topical focus areas or blocks.
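As a rough illustration of the “Page X is about Topics Y, Z, ...” view, the following Python sketch derives a dominant term per block instead of one label for the whole page. The block names, the tiny `STOPWORDS` set and the `top_terms` helper are assumptions invented for this example; real topic detection is obviously far more involved.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a real resource).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "for", "on", "is", "as"}

def top_terms(text, n=1):
    """Return the n most frequent non-stopword terms in a block of text."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return [term for term, _ in Counter(tokens).most_common(n)]

# Hypothetical blocks of a multi-topic portal page.
blocks = {
    "sports": "Football results: football league table and football transfers",
    "finance": "Stock markets rally as stock prices rise and stock indexes close higher",
}

topics_by_block = {name: top_terms(text) for name, text in blocks.items()}
print(topics_by_block)
```

Treating each block on its own keeps the sports and finance segments from blurring into one muddled page-level topic.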
Yet another major area where PS becomes critical is the issue of distinguishing between advertisement and editorial content: banner ads, sponsored links, and even third-party search results can considerably screw up page content analysis unless they are properly identified as such.
For an even more sophisticated approach, search engines could conceivably sync behavioral metrics (BM) with PS to aid them in determining which segments of a given web page will typically be read or analyzed by visitors (and, thus, deemed relevant to their search query), much like heat map technology is used to determine where to position advertising for maximum impact.
What’s more, even link analysis and evaluation may be impacted by PS. Rather than expound on this aspect myself, I’d rather quote Dave Harry who sums it up very neatly:
To begin with, page segmentation can help bolster link analysis methods such as page rank, HITS and their ilk. Or so the story goes. Consider a page with a variety of semantically or not so related content, complete with links (internal or external). Traditional analysis tells us that the page is treated as a whole and thus link relevance can be effected from a lack of focused theme. If search engines can begin to break out blocks of information, independent of the whole, new valuations can be had for links from within a single document. In short there could be more link juice to go around.
Another interesting element would be the ability to build links to a multi-semantic page with diverse anchor texts. Many times in SEO one creates target page(s) built around terms and builds related links to that page. This has always made ecommerce SEO a struggle between clicks to purchase and SEO readiness as far as structural elements and ‘landing pages’ are concerned. Page segmentation methods mean we could build more diversified link profiles to a given page (such as a main category page in the case of the ecommerce example).
Think of links from a block-to-page level and page-to-block (instead of say PageRank which is page-to-page). One can see how greater relevance from link analysis can be had.
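The block-to-page idea from the passage quoted above can be sketched in a few lines of Python. This is a toy comparison under stated assumptions, not how PageRank or any production link analysis works: `outlink_shares`, the block labels and the weights are hypothetical. It contrasts splitting a page's outgoing “vote” equally among all links (the page-to-page view) with splitting it according to the block each link sits in.

```python
from collections import defaultdict

def outlink_shares(links, block_weights=None):
    """Split a page's outgoing vote among its link targets.

    links: list of (target_url, block_type) pairs.
    block_weights: optional dict of block_type -> weight. With None, every
    link counts equally, as in a plain page-to-page model.
    """
    if block_weights is None:
        weights = [1.0 for _ in links]
    else:
        weights = [block_weights.get(block, 1.0) for _, block in links]
    total = sum(weights)
    if total == 0:
        return {}
    shares = defaultdict(float)
    for (target, _), w in zip(links, weights):
        shares[target] += w / total
    return dict(shares)

# Hypothetical links found on one page, tagged with their source block.
links = [
    ("/about", "navigation"),
    ("/privacy", "navigation"),
    ("/contact", "navigation"),
    ("https://example.com/segmentation-study", "main"),
]

page_level = outlink_shares(links)
block_level = outlink_shares(links, {"main": 1.0, "navigation": 0.1})

for target in page_level:
    print(target, round(page_level[target], 3), round(block_level[target], 3))
```

In the block-aware split, the single editorial link ends up with most of the vote instead of a flat quarter, which is one way to read “more link juice to go around”.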
As yet, the actionable SEO implications of this approach are far from clear. While it’s long been considered standard wisdom to position your page’s most relevant content (including keywords) above the fold and in your title tag, and to use H-tags to reinforce the message, dealing with multi-topical content is an entirely different ball game.
For a first range of great suggestions on how to deal with PS from an SEO point of view, see Dave Harry’s post quoted above.
In any case it doesn’t require an Einstein to predict that this particular area of search technology is up for some very extensive testing by a great number of leading SEOs within the foreseeable future – ignore their findings at your own peril!
Sources
Bill Slawski of SEO by the Sea is, of course, the #1 guru if not “grandfather” of search engine patent analysis and always an excellent read. With regard to page segmentation, check out this post of his to get started:
Search Engines, Web Page Segmentation, and the Most Important Block
For what’s currently the best general overview and intro to this subject (linking to tons of resources as well), see Dave Harry’s highly recommended piece:
“How search engines could get granular”
A “black hat” approach to page segmentation, currently under development, is Mosaic Cloaking, for which please see our own piece:
Mosaic Cloaking – Pushing IP Delivery to the Next Level of Black Hat SEO
And as a final reminder: in last week’s post on bounce rates etc. we argued for devoting more attention to BM and bounce rates, and to the prospect of SEO Surfbot nets emerging as a consequence:
SEO Bounce Rates, Behavioral Metrics and the Birth of SEO Surfbot Nets
[ Keywords: behavioral metrics, bm, link analysis, mosaic cloaking, page segmentation, ps, search engine marketing, search engine optimization, SEM, SEO ]
Trackback link: http://fantomaster.com/fantomNews/archives/2009/01/08/towards-page-segmentation-seo-ignore-at-your-own-peril/trackback/