Understanding Hilltop

Abstract:

Copied from patent:

A ranking scheme for broad queries that places the most authoritative
pages on the query topic at the top of the ranking. It is based
on “expert documents”: a subset of the pages
on the web identified as directories of links to various sources.
Results are ranked based on the match between the query and the
relevant descriptive text for hyperlinks on expert pages.

Overview

Copied from patent:

The algorithm is based on the assumption that the number and quality of
sources referring to a page is a good measure of the page’s
quality. The key difference from other link-based approaches is that we are
only considering expert sources: pages that have been
created with the specific purpose of directing people towards
resources. In response to a query, we first compute a list
of the most relevant experts on the query topic. Then, we
identify relevant links within the selected set of experts,
and follow them to identify target web pages. The targets
are then ranked according to the number and relevance of non-affiliated
experts that point to them. Thus the score of a target page
reflects the collective opinion of the best independent experts
on the query topic. Hilltop is tuned for result accuracy and
not query coverage.

Process:

  1. Detecting Host Affiliation
    a. Using a union-find algorithm, they group hosts that either:
      i. Share the same rightmost non-generic suffix, or
      ii. Share the first 3 octets of an IP address
    b. If the lookup maps two hosts to the same value, they are affiliated;
      otherwise, they are non-affiliated.
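
The grouping in step 1 can be sketched with a small union-find structure. This is only an illustration under stated assumptions: the `GENERIC_TOKENS` set, the `name:` and `ip:` key prefixes, and the function names are hypothetical and not taken from the patent, and the generic-token list is deliberately incomplete.

```python
class UnionFind:
    """Minimal union-find with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


# Assumption: a tiny, illustrative set of "generic" hostname tokens.
GENERIC_TOKENS = {"www", "com", "org", "net", "edu", "gov", "co", "uk"}


def name_key(host):
    """Rightmost hostname token that is not generic."""
    for token in reversed(host.lower().split(".")):
        if token not in GENERIC_TOKENS:
            return "name:" + token
    return "name:" + host.lower()


def ip_key(ip):
    """First 3 octets of an IPv4 address."""
    return "ip:" + ".".join(ip.split(".")[:3])


def build_affiliations(host_ips):
    """host_ips: dict mapping a hostname to a list of its IPv4 addresses."""
    uf = UnionFind()
    for host, ips in host_ips.items():
        uf.union(host, name_key(host))
        for ip in ips:
            uf.union(host, ip_key(ip))
    return uf


def affiliated(uf, host_a, host_b):
    """Two hosts are affiliated if the lookup maps them to the same value."""
    return uf.find(host_a) == uf.find(host_b)
```

Because each host is unioned with its name key and its IP keys, affiliation becomes transitive: two hosts that share nothing directly still end up in one group if a third host links them.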

  2. Selecting the Experts
    1. Process the search engine database and select:
      i. Pages with out-degree greater than a threshold
      ii. Test whether these pages point to distinct non-affiliated
        hosts
      iii. Every page that passes is considered an expert page
      iv. Check whether a broad classification is available.
        If there is, they can distinguish between:
        1. random collections of links
        2. categorized resource directories
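
The selection test above might look like the following sketch. The threshold `k=5`, the function name `is_expert`, and the `group_of` callback (which maps a host to its affiliation group) are assumptions made for illustration; the source does not give a threshold value.

```python
from urllib.parse import urlsplit


def is_expert(out_links, group_of, k=5):
    """Return True if a page looks like an expert page.

    out_links: list of URLs the page links to.
    group_of:  callable mapping a hostname to its affiliation group id.
    k:         hypothetical threshold; not a value from the source.
    """
    unique_links = set(out_links)
    # The out-degree must be greater than the threshold.
    if len(unique_links) <= k:
        return False
    # Count the distinct affiliation groups the links point to.
    groups = set()
    for url in unique_links:
        host = urlsplit(url).hostname
        if host:
            groups.add(group_of(host))
    return len(groups) >= k
```

A page with many links that all resolve to one affiliation group fails the test, which is what keeps site-internal navigation pages out of the expert set.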

  3. Indexing the Experts
    1. Create an inverted index to map keywords
      to the experts on which they occur.
    2. Only index text contained in “key phrases”:
      i. Title
      ii. Headings
      iii. Anchor text
    3. The index lists are organized as match positions,
      each determined by an occurrence of a keyword within a key phrase
      of an expert page.
    4. Store the list of URLs within every expert
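
A minimal sketch of such an index follows. The `KeyPhrase` record, the tuple layout of a match position, and both function names are hypothetical; the source only states that keywords map to match positions inside key phrases of expert pages.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyPhrase:
    kind: str    # "title", "heading" or "anchor"
    text: str    # the phrase itself
    urls: tuple  # URLs within the expert that this phrase describes


def build_index(experts):
    """experts: dict mapping an expert URL to its list of KeyPhrase records.

    Returns a dict: keyword -> list of (expert, phrase_index, word_offset).
    """
    index = defaultdict(list)
    for expert, phrases in experts.items():
        for phrase_index, phrase in enumerate(phrases):
            for offset, word in enumerate(phrase.text.lower().split()):
                index[word].append((expert, phrase_index, offset))
    return dict(index)


def experts_matching_all(index, query):
    """Experts whose key phrases contain every keyword of the query."""
    per_word = [{expert for expert, _, _ in index.get(word, [])}
                for word in query.lower().split()]
    return set.intersection(*per_word) if per_word else set()
```

Only key-phrase text is indexed here, so body text on an expert page never contributes a match.
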
  4. Query Processing
    1. User types in a query
    2. The algorithm determines a list of N experts
      that are most relevant to the query
    3. Results are ranked by selectively following
      relevant links from the experts and assigning an authority score
      to each page
    4. Computing the Expert Score. To qualify experts:
      i. All query keywords should occur in the document
      ii. Assign a score to each expert based on the number and importance
        of key phrases that contain query keywords, as well as how
        closely they match the query
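
As a rough illustration of the expert score, the sketch below weights each matching key phrase by its type and by how well it matches the query. It is a simplification, not the formula from the patent: the `LEVEL_WEIGHT` values, the `coverage` and `fullness` terms, and the function name are all assumptions.

```python
# Hypothetical weights: the source only says key phrases differ in importance.
LEVEL_WEIGHT = {"title": 16.0, "heading": 6.0, "anchor": 1.0}


def expert_score(key_phrases, query):
    """key_phrases: list of (kind, text) pairs taken from one expert page."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    seen = set()
    score = 0.0
    for kind, text in key_phrases:
        words = text.lower().split()
        matched = query_terms.intersection(words)
        if not matched:
            continue
        seen |= matched
        # Share of the query keywords that this phrase contains.
        coverage = len(matched) / len(query_terms)
        # Share of the phrase's words that are query keywords.
        fullness = sum(w in query_terms for w in words) / len(words)
        score += LEVEL_WEIGHT[kind] * coverage * fullness
    # An expert qualifies only if every query keyword occurs somewhere.
    return score if seen == query_terms else 0.0
```

Under these assumed weights, a title that exactly matches a two-word query contributes far more than an anchor containing only one of the words.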

    5. Computing the Target Score
      i. Consider the top N experts and examine the pages they link
        to (targets)
      ii. A target needs to be pointed to by at least 2 experts
      iii. For every target that qualifies, a score is calculated
        from the number and relevance of the experts pointing to it,
        as well as the relevance of the key phrases:
        1. Connect every expert with the targets it links to (directed edges),
          and qualify these edges
        2. For each query keyword, define an “edge score”
        3. Check for affiliations between expert pages that point to the same target
        4. To compute the target score, sum the edge scores of all edges
          incident to it
      iv. The list of targets is ranked by target score
        1. This can be filtered by testing whether the query keywords are present
          in the targets
        2. Alternatively, match the query keywords against each target to compute
          a match score by content analysis, and combine the target score
          with the match score before ranking the targets.
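
The target scoring steps can be sketched as follows. Edge scores are assumed to be computed already. Keeping only the best edge per affiliation group is one reading of the affiliation check (affiliated experts count once), and counting the 2-expert minimum per group rather than per page is a further assumption. The function name and the `group_of` callback are hypothetical.

```python
from collections import defaultdict


def target_scores(edges, group_of, min_experts=2):
    """Rank targets by the sum of their incoming edge scores.

    edges:    iterable of (expert, target, edge_score) triples.
    group_of: callable mapping an expert to its affiliation group id.
    """
    best = defaultdict(dict)  # target -> {group id: best edge score}
    for expert, target, score in edges:
        if score <= 0:
            continue  # this edge does not qualify
        group = group_of(expert)
        if score > best[target].get(group, 0):
            best[target][group] = score
    # Keep targets with enough independent experts, then sum the edges.
    scores = {target: sum(by_group.values())
              for target, by_group in best.items()
              if len(by_group) >= min_experts}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With this reading, two directories run by the same organization add nothing beyond the stronger of their two links.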

What this means for SEOs

  1. Identify potential general and categorical experts in
    your industry
  2. Make sure your site is listed in at least 2 directories
    and potential experts
  3. Make sure the experts use your keywords in the listing:
    title, header tags, and outbound anchor text
  4. The more experts link to you, the higher the target
    score
  5. Experts should use your target keywords often, as this
    will increase their expert score and help your site
  6. Make sure that the experts that link to you are non-affiliated,
    otherwise they will only count once
  7. Make sure your target keywords are on your target page,
    as they will be matched with the query and the expert
    page
  8. The category of your listing on the expert site will
    also be used to determine the category of your site and
    what it will be allowed to rank for
  9. Google is probably using DMOZ as a main expert, as they
    can easily use the data and it meets every criterion
    mentioned in this paper.
