Understanding Hilltop
Abstract:
Copied from patent:
Ranking scheme for broad queries that places the most authoritative
pages on the query topic at the top of the ranking. Based
on “expert documents” – subset of the pages
on the web identified as directories of links to various sources.
Results are ranked based on the match between the query and
relevant descriptive text for hyperlinks on expert pages.
Overview
Copied from patent:
The algorithm is based on the assumption that the number and quality of
sources referring to a page is a good measure of the page’s
quality. The key difference consists in the fact that we are
only considering expert sources – pages that have been
created with the specific purpose of directing people towards
resources. In response to a query, we first compute a list
of the most relevant experts on the query topic. Then, we
identify relevant links within the selected set of experts,
and follow them to identify target web pages. The targets
are then ranked according to the number and relevance of non-affiliated
experts that point to them. Thus the score of a target page
reflects the collective opinion of the best independent experts
on the query topic. Hilltop is tuned for result accuracy and
not query coverage.
Process:
- Detecting Host Affiliation
a. Using a union-find algorithm, they group hosts that:
i. Share the same rightmost non-generic suffix
ii. Have the first 3 octets of an IP address in common
b. If the lookup maps two hosts to the same value, they are affiliated;
otherwise, they are non-affiliated.
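The grouping step above can be sketched with a small union-find. This is only an illustration: the GENERIC_SUFFIXES set, the key prefixes, and the function names are assumptions made for the example, not something taken from the patent.

```python
# Illustrative, non-exhaustive set of generic suffixes (assumption).
GENERIC_SUFFIXES = {"com", "org", "net", "edu", "gov", "co", "uk", "ca"}


def affiliation_keys(host, ip):
    """Return the keys that make two hosts affiliated when they share one."""
    tokens = host.lower().split(".")
    # Strip generic suffixes from the right to reach the non-generic token.
    while tokens and tokens[-1] in GENERIC_SUFFIXES:
        tokens.pop()
    keys = []
    if tokens:
        keys.append("name:" + tokens[-1])
    keys.append("ip:" + ".".join(ip.split(".")[:3]))  # first 3 octets
    return keys


class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def build_affiliations(hosts):
    """hosts: dict mapping hostname -> IP address string."""
    uf = UnionFind()
    for host, ip in hosts.items():
        for key in affiliation_keys(host, ip):
            uf.union("host:" + host, key)
    return uf


def affiliated(uf, h1, h2):
    """Two hosts are affiliated if the lookup maps them to the same value."""
    return uf.find("host:" + h1) == uf.find("host:" + h2)
```

Because each host is merged with its keys rather than with other hosts directly, affiliation is transitive: a host that shares a name with one host and an IP block with another ends up in the same group as both.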
- Selecting the Experts
- Process the search engine database and select:
i. Pages with out-degree greater than a threshold
ii. Test if these point to distinct non-affiliated hosts
iii. Every such page is considered an expert page
iv. Check if there is a broad classification available.
If there is, they can distinguish between:
1. random collections of links
2. categorized resource directories
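A minimal sketch of the expert test described above. The threshold value k=5, the requirement of at least k distinct groups, and the affiliation_of callable (host to affiliation group) are assumptions made for the example; the notes above only say "greater than threshold" and "distinct non-affiliated hosts".

```python
from urllib.parse import urlsplit


def is_expert(out_links, affiliation_of, k=5):
    """Decide whether a page qualifies as an expert.

    out_links: list of absolute URLs found on the page.
    affiliation_of: hypothetical callable mapping a hostname to its
        affiliation group (for example, a union-find lookup).
    k: illustrative threshold (assumption).
    """
    if len(out_links) <= k:  # out-degree must exceed the threshold
        return False
    groups = set()
    for url in out_links:
        host = urlsplit(url).hostname
        if host is not None:  # skip URLs without a host part
            groups.add(affiliation_of(host))
    # The links must reach at least k mutually non-affiliated hosts.
    return len(groups) >= k
```

A page with many links that all point to the same host (or to affiliated hosts) fails the second test, which is what keeps ordinary site navigation pages out of the expert set.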
- Indexing the Experts
- Create an inverted index to map keywords
to the experts on which they occur.
- Only index text contained in “key phrases”:
i. Title
ii. Headings
iii. Anchor text
- This index is organized as a list of match positions,
each determined by an occurrence of a keyword within a key phrase
of an expert page.
- Store the list of URLs within every expert
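The indexing step could look roughly like this. The input layout (a dict with "phrases" and "links" per expert) and the shape of a match position are assumptions chosen for the example.

```python
from collections import defaultdict


def build_index(experts):
    """Build an inverted index over the key phrases of expert pages.

    experts: dict mapping expert URL -> {"phrases": [(kind, text), ...],
                                         "links": [target URLs]}
    where kind is "title", "heading" or "anchor" (hypothetical layout).
    """
    index = defaultdict(list)  # keyword -> list of match positions
    links = {}                 # expert URL -> URLs found on that expert
    for url, page in experts.items():
        for phrase_id, (kind, text) in enumerate(page["phrases"]):
            for offset, word in enumerate(text.lower().split()):
                # A match position: which expert, which key phrase,
                # what kind of phrase, and where in the phrase.
                index[word].append((url, phrase_id, kind, offset))
        links[url] = list(page["links"])
    return index, links
```

Only key phrases are indexed, so body text on the expert page never produces a match.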
- User types in a query
- The algorithm determines a list of N experts
that are most relevant to the query
- Computing the Expert Score
- Rank results by selectively following
relevant links from experts and assigning an authority score
to each page. To qualify experts:
i. All query keywords should occur in the document
ii. Assign a score to each expert based on the number and importance
of key phrases that contain query keywords, as well as how
closely they match the query
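A simplified sketch of an expert score along these lines. It is not the formula from the patent: the LEVEL_WEIGHT values and the coverage and fullness factors are assumptions that only illustrate "number and importance of key phrases" and "how much they match the query".

```python
# Illustrative importance weights per key phrase kind (assumption).
LEVEL_WEIGHT = {"title": 16, "heading": 6, "anchor": 1}


def expert_score(phrases, query):
    """Score one expert for a query.

    phrases: list of (kind, text) key phrases of the expert.
    query: list of query keywords.
    """
    terms = {t.lower() for t in query}
    if not terms:
        return 0.0
    seen = set()
    score = 0.0
    for kind, text in phrases:
        words = text.lower().split()
        matched = terms.intersection(words)
        if not matched:
            continue
        seen |= matched
        coverage = len(matched) / len(terms)  # share of the query covered
        fullness = len(matched) / len(words)  # share of the phrase that is query terms
        score += LEVEL_WEIGHT[kind] * coverage * fullness
    # Experts that do not contain every query keyword are disqualified.
    return score if seen == terms else 0.0
```

With these weights a single title match outweighs many anchor-text matches, which mirrors the idea that some key phrases are more important than others.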
- Computing the Target Score
i. Consider the top N experts and examine the pages they link
to (targets)
ii. A target needs to be pointed to by at least 2 experts
iii. For every target that qualifies, a score is calculated
from the number and relevance of the experts pointing to it,
as well as the relevance of the phrases
1. Connect every expert with the target it links to (directed edge),
and qualify these edges
2. For each query keyword, define an “edge score”
3. Check for affiliations between expert pages that point to the same
target, so that affiliated experts only count once.
4. To compute the target score, use the sum of the edge scores of all
edges incident to it
iv. The list of targets is ranked by target score
1. This can be filtered by testing if the query keywords are present
in the targets
2. Filter by matching query keywords against each target to compute
a match score with content analysis, and combine the target score
with the match score before ranking targets.
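The target scoring steps can be sketched as follows. The edge scores are taken as given inputs (already summed over query keywords), group_of is a hypothetical callable returning an expert's affiliation group, and keeping only the best edge per group is one assumed way of making affiliated experts count once. Counting groups rather than raw experts for the minimum of 2 is also an assumption.

```python
from collections import defaultdict


def target_scores(edges, group_of, min_experts=2):
    """Rank targets by the summed edge scores of independent experts.

    edges: list of (expert, target, edge_score) tuples.
    group_of: hypothetical callable mapping an expert to its
        affiliation group.
    """
    best = defaultdict(dict)  # target -> {group: best edge score}
    for expert, target, score in edges:
        group = group_of(expert)
        # Affiliated experts count once: keep the strongest edge per group.
        if score > best[target].get(group, 0.0):
            best[target][group] = score
    scores = {
        target: sum(by_group.values())
        for target, by_group in best.items()
        if len(by_group) >= min_experts  # needs at least 2 independent experts
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A match score from content analysis of the target page, as described in the last step, could be multiplied into or added to each target score before the final sort.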
What this means for SEOs
- Identify potential general and categorical experts in
your industry
- Make sure your site is listed in at least 2 directories
and potential experts
- Make sure the experts use your keywords in the listing:
title, header tags, and outbound anchor text
- The more experts link to you, the higher the target
score
- Experts should use your target keywords often, as it
will increase their expert score and help your site
- Make sure that experts that link to you are non-affiliated,
otherwise they will only count once
- Make sure your target keywords are on your target page,
as they will be matched with the query and the expert
page
- The category of your listing on the expert site will
also be used to determine the category of your site and
what it will be allowed to rank for
- Google is probably using DMOZ as a main expert, as they
can easily use the data and it qualifies with everything
mentioned in this paper.






