http://www.google.com/relatedlinks/

We have added an example here: AuthorityDomains.com Google Related Links

Google recently launched a new program of “related” links: you add a snippet of JavaScript to your site, and Google serves a list of relevant links to other websites.

When webmasters set the links up, they can select whether they want to display related searches, related web pages, or related news stories.

What is the source of this data?

Google doesn’t mention how they compile this information. From a brief look, it looks like they use their standard contextual algorithms to find other related sites and queries. The related sites seem to come from people paying for Google AdWords, although this is tough to confirm, as Google decides what the keywords are (and they may not appear on the page).

So basically, based on the page where the code is, Google determines a few keywords that are relevant, then searches for related sites and related queries.

Is it beneficial to add this code to your site?

We don’t see any real benefit in adding this code, as all it’ll do is give people many ways OUT of your site.

It is useful in one way: it lets you see how Google categorizes your site, and which keywords it believes your site is about when it chooses what to present as relevant.

Other than that, all it’s doing is giving Google free advertising.

3 Ways to Promote your Site in 2006

Promoting your site is not a “static” business. The web changes and renews itself about every 6 months, with new services, algorithms, and technologies coming out all the time.

For you to “stay on top” or to “get on top” you need to be up-to-date with the industry. It is easy to continue using old-school internet marketing techniques, and then wonder why your site isn’t performing better.

What’s the answer? Bring your internet marketing techniques up-to-date. This may include tuning in to new internet trends, or simply revisiting old concepts with a fresh view.

How? Read below and you will get 3 ideas to add to your internet marketing campaign for 2006.

Media Buys

SEO is still highly reliant on link popularity – for your site to rank you must have relevant, powerful links pointing to it. This is a no-brainer, right? But, the catch-22 is, if you are selling a product or service, how can you naturally gain these links?

If you are an already established brand, if you are a large corporation, or if you have a product that is “newsworthy”, this article is not for you. If you have a small website / product that you are trying to promote, and don’t have all the benefits of a large, branded corporation behind you, read on.

Let’s use the example of traditional advertising. When people want to promote a product “offline” they have to rely on magazine, newspaper, and TV ads, and other forms of advertisement, to gain exposure for their product.

The difference on the web is, it’s difficult to identify the best places to buy links. Are the sites penalized? Will they bring you buyers? Will the links improve your rankings?

Knowing where to buy advertising from is very important. Here are some of the things you should consider:

* The site MUST be relevant. This will ensure that you will get buyers from people who see your links. Additionally, Google and Yahoo (and soon MSN) look for links that are relevant to your site and boost your rankings for those. Non-relevant links may even be DETRIMENTAL to your SEO efforts.

* Where is the site ranking for popular keywords? Look at the meta title of the site, type a few of the words from the title in the major engines – is the site ranking in the top 20? How competitive are those keywords? If it is ranking well for your target keywords, there is a good chance that the site has a good reputation in the engines and would be well worth paying for links.

* Will the links be identified as paid ads? If so, they are worthless in terms of boosting your search engine placement. The engines can easily identify these links and will NOT give you credit for them. Make sure your links are not identified as “paid” or “sponsored”.

Buying links from relevant, high ranking sites will have many benefits. Make sure you do it correctly and it will help to bring you customers, both from other sites and from the engines.

Social Bookmarking

As the web has grown and become more commercialized, new tools / services have appeared to allow people to have more participation in their web surfing experience.

Social Bookmarking allows people to “vote” for the sites that they like and keep records of them for themselves and others. This is a form of referral that doesn’t require meeting people face to face.

For example, if you wish to find an SEO, you may ask your friends who they’ve worked with before, and what their experiences were. Then, you are more likely to go with someone that others recommended instead of someone totally new about whom you don’t know anything.

Social tagging allows you to receive virtual referrals from others, so if you don’t know anyone that’s used an SEO, you can see what other people have picked and who they recommend.

There are many objectives for social bookmarking; we’ve only given you one example of how it may be used.

So to stay on top, you need to learn about how social tagging works and how you can integrate its principles into your site promotion campaign. Feel free to read our Social Bookmarking and SEO article for more information on how this works.

Offering unique content

What differentiates you from everyone else? Why should someone buy from you, when they can buy from any other site?

One way is to offer something no one else offers in your industry. Unique content doesn’t just have to be information. For example, you could be offering video tutorials, industry news, or even web tools that your users would find helpful.

If you were looking to buy a video conferencing solution, would you buy from a site that offers little information and nothing unique – a “packaged” standard site? Or would you prefer to buy from a site that has many pages of information about video conferencing, how it’s used, tutorials, and the benefits of their services, and perhaps even offers a free tool as part of those services?

Unique content will also improve your link popularity. People will be more likely to link to you if you are constantly publishing new articles, or if you are offering a free tool that they can download or link to.

If you do a search in Google for “link popularity tool”, www.marketleap.com is #1. Why? They offer a free tool that anybody can use – and it automatically links back to their site. This has given them over 30,000 links in Yahoo.

Why not identify a tool or service for your industry that no one else offers? It is KEY for you to offer something unique – regardless of the format.

Why waste more time?

Go to delicious and digg.com to create accounts, start brainstorming a new tool or writing new articles. The more you engage in these techniques, the more visitors your site will have from the search engines and from other sites.

Social Bookmarking and SEO

Social Bookmarking is the practice of saving bookmarks to a public Web site and “tagging” them with keywords. Basically, it uses the concept of “bookmarking” but extends it to include internet technology. So instead of bookmarking sites on your local computer, you bookmark sites that you like through web services. This way, your bookmarking process remains the same; the only difference is where the bookmarks are stored. Storing them on the internet gives other people access to your bookmarks, and allows people to create “group” categories in which they can collaborate to create a list of resources about a particular topic.

Also, the sites where you bookmark allow you to “tag” your entries. Once you select a site to bookmark, you can also assign a few words, or “tags”, to your bookmark, which allow the bookmarking service to easily categorize the link that you added. This way, if other people search the service for a similar topic, they can find a link to the sites you have bookmarked by keyword, person, or popularity. Tagging allows you to share the benefit of your research and favorite sites by allowing others to find them. However, if you wish, you can also choose to keep your bookmarks private.

The technology behind social bookmarking is not complex, which means the threshold to participate is low, both for Web sites offering such services and for users. As the landscape for online resources changes and new systems of classifying those resources emerge and mature, the design and function of databases themselves may ultimately be changed to accommodate new ways of managing information.

Most Popular Social Bookmark Communities

del.icio.us – Del.icio.us allows you to register for an account, and then you can immediately begin bookmarking. Adding buttons to your toolbars facilitates the process of tagging as you surf, and quickly accessing your bookmark account. If enough people bookmark a page, it will then rank higher when people search for topics that appear in the tags.

Digg – Digg is unique in that it allows people to submit articles, and other users “digg” an article if they found it helpful. The more times your article is “dugg”, the higher it’ll appear in the rankings. This can result in a large amount of traffic to your site.

Technorati – You can add Technorati tags to your blog posts or pages. You simply add a link that uses the specific Technorati tag that you want, including the required tags on the link itself, and Technorati will pick those up.

Social Bookmarking Effects on SEO

Social bookmarking is already having an effect on search engine optimization. Search engines can use the data available to them through social bookmarking by counting the bookmarks as votes, using tags to help categorize data, and using bookmarks to enhance link popularity counts.

Votes

By spidering social bookmarking sites and identifying how many sites have bookmarked or voted for your site, the engines can use this data as a modifier for their rankings. If you get a lot of bookmarks, this could then increase your SE rankings.

Link Popularity

If many sites have bookmarked your pages, this means that there will be links to your site from other pages (social bookmarking sites). The engines can then use this data for their link popularity counts. Additionally, if you rank well in the social bookmarking site, this will give your site additional exposure, thus allowing other people to find and link to your site.

Tagging

Based on how users tag your site, the engines can use this data for their semantic analysis. So, if your article is tagged with the word “computers”, the engines can add this data to the info they have about your site and improve your rankings in the “computers” keyword category.

Crawling

The engines crawl sites such as Technorati and del.icio.us often. So, if people bookmark or vote for your site, it’ll help your page get crawled more quickly.

What Now?

First of all, visit all these sites and spend some time there learning about the sites and how they work. Read their help pages and get to know them.

Then, you can create an account and bookmark your pages from your own account.

Additionally, you can encourage other people to bookmark or vote for your site by adding those links / buttons to your pages. Visit this page to get the relevant code:

http://ekstreme.com/seo/socialbookmarkingcode.php

You can add these to the code of your site so people can instantly “digg” your site or add it to their del.icio.us bookmarks.

Here’s another one:
http://www.toprankresults.com/tools/social-bookmark.php

Social Bookmarking Link List: You will want to subscribe to many, if not all of these services.

I’ve come across many sites where Google estimates that more pages are indexed than actually exist on the server.

For example, one of our clients has a site with about 10,000 informational pages. Google says that their site has over 200K pages indexed.

If this happens to you, here’s what you can do:

Go over the pages you can find in the Google index. Make notes of the ones that seem odd to you in any way and inspect those pages manually. Use a server header checker, and make sure they return a “200 OK” code (or perhaps “304 Not Modified”). If Google has pages listed as belonging to your site, but they are not there, request a removal. Obviously, you can’t do this with 200k pages, but if you have a smaller site, this would work for you.
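If you have more than a handful of URLs to check, the header check above can be scripted. Here is a minimal sketch in Python, assuming only the standard library; the function names and the three labels ("ok", "gone", "inspect") are our own choices for illustration, not part of any Google tool. Note that urlopen follows redirects on its own, so a redirected URL reports the status of the page it ends up on.

```python
import urllib.error
import urllib.request


def classify_status(code):
    """Label a status code the way the manual check above treats it."""
    if code in (200, 304):
        return "ok"
    if code in (404, 410):
        return "gone"      # page is not there: candidate for a removal request
    return "inspect"       # server errors or anything else unexpected


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request to url.

    Connection failures (urllib.error.URLError) are not caught here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def audit(urls, fetch=fetch_status):
    """Map each URL to a (status code, label) pair."""
    results = {}
    for url in urls:
        code = fetch(url)
        results[url] = (code, classify_status(code))
    return results
```

Calling audit() with the list of URLs you noted from the site: query gives you a quick table of which ones deserve a manual look.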

Then, go over your site with a fine-tooth comb. Even try to break it by entering URLs that you know aren’t there, and/or URLs that look a lot like those that are there (e.g. “.htm” instead of “.html”, “index” instead of “index.php”, or “filename.htm/foldername/filename.htm”). Whenever you manage to get a page shown on a URL that shouldn’t show anything, close that hole.
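The look-alike URLs described above can be generated rather than typed by hand. A minimal sketch, assuming Python's standard urllib.parse module; the function name near_miss_urls and the three rewrite rules are our own illustration of the examples in the text, not an exhaustive list.

```python
from urllib.parse import urlsplit, urlunsplit


def near_miss_urls(url):
    """Return URLs that resemble url but should not serve real content."""
    parts = urlsplit(url)
    path = parts.path
    candidates = set()

    # Swap ".html" and ".htm" endings.
    if path.endswith(".html"):
        candidates.add(path[:-1])
    elif path.endswith(".htm"):
        candidates.add(path + "l")

    # Drop the extension from the last segment ("index.php" -> "index").
    head, _, last = path.rpartition("/")
    if "." in last:
        candidates.add(head + "/" + last.rsplit(".", 1)[0])

    # Append the path to itself ("/dir/a.htm" -> "/dir/a.htm/dir/a.htm").
    if last:
        candidates.add(path + path)

    candidates.discard(path)
    return sorted(
        urlunsplit((parts.scheme, parts.netloc, p, "", "")) for p in candidates
    )
```

Each URL it returns should answer with a "404 Not Found". Any that comes back with real content is a hole to close.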

You may want to also write to Google explaining the situation to make sure nothing happens to the site. A simple letter explaining the situation may help get the site reviewed and added to the “safe” list.


Today someone asked me about URLs that show with spaces in Google. When they do a site:domain.com query, they find URLs from their site that have an added space. When they check the coding and the pages, they are coded correctly without spaces.

If you find this, DON’T WORRY.

Google adds spaces in the SERPs so that long URLs don’t break the layout on lower-resolution monitors. But Google has the pages correctly indexed.

Understanding TrustRank

Disclaimer: This is our INTERPRETATION of the data that
we read in this paper. As we are not programmers, mathematicians,
or IR specialists, we have used our knowledge of marketing
and SEO to extrapolate meaning and ideas from the information
in this paper. Feel free to email us if you disagree with
our conclusions or if you would like to give us your own
ideas.

Abstract

“Web spam pages use various techniques to achieve
higher-than-deserved rankings in a search engine’s
results. While human experts can identify spam, it is too
expensive to manually evaluate a large number of pages. Instead,
we propose techniques to semi automatically separate reputable,
good pages from spam.

We first select a small set of seed pages to be evaluated
by an expert. Once we manually identify the reputable seed
pages, we use the link structure of the web to discover other
pages that are likely to be good. In this paper we discuss
possible ways to implement the seed selection and the discovery
of good pages. We present results of experiments run on the
World Wide Web indexed by AltaVista and evaluate the performance
of our techniques. Our results show that we can
effectively filter out spam from a significant fraction of
the web, based on a good seed set of less than 200 sites.”

  1. Preliminaries
    a. Web Model

    i. Web is modeled as a graph consisting of pages and a set of directed links that connect pages.
    ii. Self links and multiple links from same site are removed

    b. PageRank
    i. The proposed algorithm relies on PageRank (the importance of a page
    influences, and is influenced by, the importance
    of other pages)
    ii. PageRank assigns a static score to
    each page, but a biased PageRank version may break this
    rule. A non-zero static score can be assigned to a set
    of special pages only. The score of these pages is then
    spread during the iterations to the pages they point
    to.

  2. Assessing Trust
    a. Oracle and Trust Functions

    i. Oracle assigns value: 0 if page is bad, 1 if page
    is good
    ii. As this is expensive and time consuming, the oracle should only review a
    subset of pages
    iii. Approximate isolation of the good set: good pages seldom link to bad pages
    iv. Trust Function: yields a range of values between 0 and 1. It should give
    the probability that a page is good.

    b. Ordered Trust Property: The Trust function should
    predict the likelihood of a page being good, so
    the results can be ranked by their trust value (high
    probability
    of being good means pages get ranked higher in a
    list, and vice versa)

    c. Threshold Trust Property: if a page receives
    a score above a given threshold, it is considered good

  3. Evaluation Metrics
    a. Pairwise Orderedness: signals if a bad page received
    an equal or higher trust score than a good page (violation
    of ordered trust property). This evaluates the accuracy
    of the trust function T

    b. Precision: fraction of good among all pages in X that
    have a trust score above a threshold

    c. Recall: ratio between the number of good pages with
    a trust score above a threshold and the total number
    of good pages in X
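To make these three metrics concrete, here is a small sketch in Python. It is our own simplification, not the paper's code: trust maps each page in the sample X to its trust score, oracle maps each page to 1 (good) or 0 (bad), and pairwise orderedness is computed only over pairs of one good and one bad page. All names and values are hypothetical.

```python
def precision(trust, oracle, threshold):
    """Fraction of good pages among those scoring above the threshold."""
    above = [p for p in trust if trust[p] > threshold]
    if not above:
        return 0.0
    return sum(oracle[p] for p in above) / len(above)


def recall(trust, oracle, threshold):
    """Fraction of all good pages that score above the threshold."""
    good = [p for p in trust if oracle[p] == 1]
    if not good:
        return 0.0
    return sum(1 for p in good if trust[p] > threshold) / len(good)


def pairwise_orderedness(trust, oracle):
    """Fraction of (good, bad) pairs where the good page scores higher."""
    good = [p for p in trust if oracle[p] == 1]
    bad = [p for p in trust if oracle[p] == 0]
    pairs = [(g, b) for g in good for b in bad]
    if not pairs:
        return 1.0
    # A violation: the bad page scores as high as, or higher than, the good one.
    violations = sum(1 for g, b in pairs if trust[b] >= trust[g])
    return (len(pairs) - violations) / len(pairs)


trust = {"a": 0.9, "b": 0.6, "c": 0.7, "d": 0.1}
oracle = {"a": 1, "b": 1, "c": 0, "d": 0}
print(precision(trust, oracle, 0.5))        # 2 of the 3 pages above 0.5 are good
print(recall(trust, oracle, 0.5))           # both good pages are above 0.5
print(pairwise_orderedness(trust, oracle))  # bad page c outranks good page b
```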

  4. Computing Trust
    a. Ignorant Trust Function: For pages not reviewed and
    given a value by an oracle, they are given a value of ½ which
    means that no data is known for those pages

    b. Trust Propagation: The oracle is invoked to check a
    random selection of L pages. Then, expecting that good
    pages only link to good pages, we assign a score of 1
    to all pages that are reachable from a page with positive
    trust in M or fewer steps ( 1 and 2 steps gave the best
    results)

    i. The problem with this is that sometimes
    good pages link to bad pages. The further away we are
    from good pages, the less certain we are that a page
    is good.

    c. Trust Attenuation: Essential to reduce trust the
    further we are from seed pages

    i. Trust Dampening: the trust factor is reduced the
    further away we are from a good site. So if good seed
    A has a score of 1 and links to site B, B gets a score
    of b < 1; if B links to site C, C gets a score of b * b
    (reduced more the further away you are from the good site)

    ii. Trust Splitting: This handles pages with multiple outlinks. That is, if a
    good page has only a handful of outlinks, then it is likely that the pointed
    pages are also good. However, if a good page has hundreds of outlinks, it is
    more probable that some of them will point to bad pages.

    1. Trust score is split amongst the outbound links
    according to how many there are. So if a good seed has
    2 outbound links, its trust score of 1 is split into 2, so
    each page gets 0.5 trust points.

    2. The actual score of the page will be the sum of the score fractions received
    through its inlinks. The more “credit” it receives, the likelier it is to be
    good.

    iii. Trust splitting can be combined with trust
    dampening.
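A small worked example may help here. The sketch below is our own simplified illustration in Python, not the paper's formulation: each page passes on its trust divided by its number of outlinks (splitting), multiplied by a dampening factor, for a fixed number of steps. The link graph, the factor of 0.8, and the function name are all hypothetical.

```python
def propagate(links, seeds, damping, steps):
    """Spread seed trust along links for a fixed number of steps.

    Each page passes damping * (trust it just received / its outlink count)
    to every page it links to. A page's total is the sum of what it received.
    """
    total = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        received = {}
        for page, score in frontier.items():
            targets = links.get(page, [])
            for target in targets:
                share = damping * score / len(targets)
                received[target] = received.get(target, 0.0) + share
        for page, score in received.items():
            total[page] = total.get(page, 0.0) + score
        frontier = received
    return total


# Seed A links to B and C; B and C both link to D.
links = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
scores = propagate(links, {"A": 1.0}, damping=0.8, steps=2)
print(scores)
```

With a dampening factor of 1 this is pure splitting, and B and C each get the 0.5 from the example above. With 0.8 they each get 0.4, and D collects 0.32 from each of its two inlinks.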

  5. The TrustRank Algorithm
    a. Select Seeds: used to identify desirable pages for the
    seed set (the most useful in identifying good pages).
    Needs to be relatively small.

      i. Inverse PageRank: based on the number of outbound links
      (the more outlinks a page has, the more likely it is
      to be picked); the importance of a page depends
      on its outlinks, not on its inlinks.
      ii. High PageRank: Preference is given to pages
      with high PageRank, as they are more likely to
      link to other high-PageRank pages.

    b. Generate a corresponding order of the seeds according
    to their desirability as seeds

    c. Select Good Seeds: invokes oracle...so person reviews
    those sites and gives them value.

    d. Normalize static score distribution vector: this caps
    trust at a maximum of 1

    e. Compute TrustRank scores: uses biased pagerank computation,
    with the uniform distribution factor being replaced.

      i. Uses trust dampening and splitting, where the trust
      score is split amongst a page’s neighbors and dampened by
      a factor
      ii. TrustRank “refines” the original
      scores given by the oracle according to the structure
      of links, since it has more information to use.

    f. Unreferenced pages have score of 0, unless
    they are selected as seeds.

    g. Pages can be organized first by PageRank,
    and only pages with high enough PageRank
    are used to compute TrustRank;
    otherwise it’s a waste of resources.
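The scoring step (e) can be sketched as a biased PageRank loop. This is a minimal Python illustration under our own assumptions, not the authors' implementation: the good seeds are taken as already approved by the oracle, the static score is spread evenly over them so it sums to 1, and the dampening factor alpha and the number of iterations are parameters we picked. The graph and page names are hypothetical.

```python
def trustrank(links, good_seeds, alpha=0.85, iterations=20):
    """Biased PageRank: each round, t = alpha * (split trust) + (1 - alpha) * d."""
    pages = set(links) | set(good_seeds)
    pages |= {q for targets in links.values() for q in targets}

    # Static score distribution d: non-zero only for good seeds, sums to 1.
    d = {p: (1.0 / len(good_seeds) if p in good_seeds else 0.0) for p in pages}

    t = dict(d)
    for _ in range(iterations):
        nxt = {p: (1 - alpha) * d[p] for p in pages}
        for p, targets in links.items():
            if not targets:
                continue
            # Splitting (divide by outlink count) and dampening (times alpha).
            share = alpha * t[p] / len(targets)
            for q in targets:
                nxt[q] += share
        t = nxt
    return t


links = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": [],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
scores = trustrank(links, good_seeds={"seed"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph the two spam pages link to each other, but no trusted page links to them, so they end with a score of 0. Trust also shrinks with each step away from the seed.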

    Conclusion:

    Basically, this enables them to modify PageRank. PageRank
    can be easily manipulated as it doesn’t care about
    quality. By using a combination of both, the basic PageRank
    formula can be used (is cheap to use and works well), then
    modified according to trust factors.

    Adding human interaction enables them to then compute
    scores automatically. People manually review sites and
    assign a trust score. Then, this trust score is split amongst
    its outbound links using the algorithm. So the trust score
    of other sites, even if they are not manually reviewed,
    is then based upon the trust score received from other
    sites (with a max of 1). Sites with a higher trust score
    can then rank higher.

    What this means for SEO’s

    • Try to identify good sites in your industry. These
      sites were chosen by number of outbound links as well
      as by high page rank scores. Remember that those pages would’ve
      been reviewed by a person, so only select sites that
      are genuinely valuable.
    • Good sites are bound to be ranking in the SERPs,
      as they will have high trust scores, which modify their
      PageRank and push out spam sites
    • Try to get links from those good sites, or at
      least from pages that are receiving links from good sites.
    • Use up to 3 levels of separation from the good
      sites
    • The more links you receive from good sites, the
      higher your TrustScore.
    • If you have too many links from bad sites, it’ll
      lower your score. Bad sites can be sites considered “unworthy” by
      human reviewers, or sites that received low points from
      other sites
    • Avoid having too many links from bad sites, as
      the more you have, the more it’ll work against you
      based on your “trust score”.
    • Try to only have links from trusted sites, and
      to have as few links as possible from bad sites
    • The higher your trust score, the higher you rank, as
      it’ll modify your Page Rank score positively.
    • Page Rank is still used, so you still need to
      get links, but try to get links mainly from good sites.
    • They are “collapsing” multiple links
      from one URL, and only counting it as one link
    • Self links are removed and only links from external
      sites are taken into account

    Make sure this is an important aspect of your SEO campaign,
    for these trusted links enable you to get past the sandbox
    and to boost your rankings significantly.

    Articles

    Understanding Kleinberg’s Hubs and Authorities

    Definitions:

    Abundance Problem: The number of pages that could reasonably
    be returned as relevant is far too large for a human user
    to digest. To provide effective search methods under these
    conditions, one needs a way to filter from a huge collection
    of relevant pages, a small set of the most authoritative
    or definitive ones.

    Abstract

    “Hyperlinks encode a considerable amount of latent
    human judgment. By creating links to another page, the creator
    of that link has “conferred authority” on the
    target page. Links afford us the opportunity to find potential
    authorities purely through the pages that point to them.

    In this work we propose a link-based model for the conferral
    of authority, and show how it leads to a method that consistently
    identifies relevant, authoritative www pages for broad
    search topics. Our model is based on the relationship that
    exists between the authorities for a topic and those pages
    that link to many related authorities - we refer to pages
    of this latter type as hubs. We observe that a certain natural
    type of equilibrium exists between hubs and authorities in
    the graph developed by the link structure, and we exploit
    this to develop an algorithm that identifies both types of
    pages simultaneously. The algorithm operates on focused
    subgraphs of the www that we construct from the output of a
    text-based www search engine; our technique for constructing
    such subgraphs is designed to produce small collections of pages
    likely to contain the most authoritative pages for a given topic.”

    Constructing a Focused Subgraph of the WWW

    1. query string q
    2. build a subset Sq of pages that is:
      a. relatively small
      b. rich in relevant pages
      c. likely to contain most of the strongest authorities
    3. collect the t highest ranked pages for query q from
      a search engine (root set Rq)
    4. Expand Rq along the links that enter and leave it
    5. Sq = expanding Rq by including any page pointed
      to by a page in Rq, and any page that points to a page
      in Rq (setting a max amount of pages any given page can
      bring in to the set Sq)
    6. Sq is base set of q
    7. Intrinsic links (links from within the same site)
      are deleted from the graph, keeping only links to and
      from external domains
    8. Gq is graph Sq minus intrinsic links
    9. Pages are then given a value as an authority weight
      or a hub weight. If a page points to many pages with
      large authority values, then it receives a high hub value;
      if a page is pointed to by many pages with large hub
      values, then it should receive a large authority value.
    10. Select the top C authorities and the top C
      hubs as the result
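Steps 9 and 10 can be sketched as a short loop. This is a minimal Python illustration under our own assumptions, not Kleinberg's code: links stands for the graph Gq (intrinsic links already removed), every page starts with a weight of 1, and the weights are normalized each round so their squares sum to 1. The fixed number of iterations and the page names are hypothetical.

```python
import math


def hits(links, iterations=20):
    """Return (hub, authority) weights for every page in the link graph."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority weight: sum of the hub weights of pages pointing to it.
        auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                auth[q] += hub[p]

        # Hub weight: sum of the authority weights of pages it points to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}

        # Normalize so the squared weights sum to 1.
        for weights in (auth, hub):
            norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
            for p in weights:
                weights[p] /= norm

    return hub, auth


links = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a1"]}
hub, auth = hits(links)
top_authorities = sorted(auth, key=auth.get, reverse=True)[:2]
top_hubs = sorted(hub, key=hub.get, reverse=True)[:2]
print(top_authorities, top_hubs)
```

Here a1 comes out as the strongest authority because all three hubs point to it, and h1 and h2 outrank h3 as hubs because they point to both authorities.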

    Summary

    Basically, they are creating small subsections used for link analysis to rank
    pages according to their authority status. To do this, they match a query with
    X amount of top ranked pages. Then, instead of ranking those, they use it to
    build a subset of relevant pages. Then, they add to that subset of pages Y
    amount of pages that link to or from the previous subset. Then, based on how
    these pages link to each other, they determine the hub & authority status
    of each of the pages. This is defined by linkage structure – if a page
    has a high number of links OUT to a page that has a large number of Links IN,
    it’s considered a hub. If a page has a large number of links IN from
    a large number of pages that have many links OUT, it’s considered an
    Authority. Then, rankings are organized by the amount of links in from hubs
    to the authorities, and this score is used to rank the authorities.

    What this means for SEO’s

    • Make sure your site has links from sites that are relevant to your
      keywords.
    • Identify the “hubs” for your keywords – sites
      that have a lot of outbound links, and get as many links
      as possible from those sites.
    • If possible, identify “authorities” in
      your keyword sector – sites with many links in from
      various hubs – and get links from those sites.

    While reading a sports article (Yahoo News) on Ricky Williams’s latest drug problems in the NFL, I noticed some sponsored links that caught my eye: they had thumbnails next to each result.

    I have never seen this before. I showed the page to others on different datacenters, and nobody else seems to be seeing it yet.
    Here is the screenshot. I think it’s a really great idea. Using images near PPC ads has always increased revenue in my studies.

    Yahoo PPC Images or Thumbnails

    I am now seeing these all over the place. http://sports.yahoo.com/nba/recap;_ylt=ApHh02RB5nXIApS9gUcc3928vLYF?gid=2006041114&prov=ap This page shows me the following ads down at the bottom.

    More Sponsored Results Images

    Articles

    Understanding History Data

    Disclaimer: This is
    our INTERPRETATION of the data that we read in the patent
    application that was published by Google. As we are not programmers,
    mathematicians, or IR specialists, we have used our knowledge
    of marketing and SEO to extrapolate meaning and ideas from
    the information in the patent application. Feel free to email
    us if you disagree with our conclusions or if you would like
    to give us your own ideas.

    Abstract

    “A system identifies a document and obtains one or more
    types of history data associated with the document. The system
    may generate a score for the document based, at least in part,
    on the one or more types of history data.”

    What this patent application means for SEO’s

    Google may be using all the data mentioned in this patent
    application to enhance relevancy, attack spam, and identify
    when freshness or staleness is desirable for a given
    query. To accomplish this, they are looking at information
    about keywords, target demographics, how people surf on a
    website, content updates, and other such aspects. They use
    as much data as they have access to in order to build
    a more thorough picture of the site. Then, they use
    all that information to modify the rankings of the site
    for the given queries.

    Here are some suggestions to improve your website:

    • Domain Registration: make sure all your information
      is valid, that your site, server, and dns server are
      not connected to other spam sites or many other sites
      you own. Register the domain for at least 3 years.
      Basically, you shouldn’t just register domains
      when you think about them – you should have
      a domain registration strategy to distribute how the
      domains are registered, where they’re hosted,
      which DNS to use, etc.
    • Sandbox: “Newer sites scored lower because
      of number of back links. However, when inception dates
      are considered, scores of documents may be modified”.
      This implies that they are giving an advantage to
      older domains, as well as domains that are trusted
      or are authorities. They use the example of a new
      site with 10 back links and an old site with 100 back
      links; the rate of link growth is higher for the new
      site. This could indicate spam or popularity. However,
      if new domains contain “news” or seasonal
      information, according to the query, they can then
      rank quickly, even if they are new. Also, if they
      have a rapid increase of links, and these links are
      on authority sites, it may indicate that it’s
      an important site and may be worth giving points to
      so it ranks quickly, regardless of inception date.
      For some queries, older documents may be more favorable
      than new ones, so age can be considered and document’s
      score modified by a difference between documents’
      age and average age.
    • Content Changes: they’re looking at how often you
      add pages to your site, how often you update pages
      of your site, and the ratio for each. This allows
      them to determine if your site is fresh or stale,
      and allows them to compare this ratio to other sites
      in the same keyword sector. Thus, determine a content
      update ratio and stick to it – try not to establish
      hefty goals that you can’t fulfill, as your
      ratio and rankings may suffer.
    • Your Page in the SERP’s: they may be tracking
      click-through ratios and time spent on site (between
      the time the visitor clicks on your link on the SERP’s
      and returns to check the next result), so it may be time
      to employ a copywriter to make your keyword-rich titles
      very attractive, so that people click on
      your link. Once they’re on your site, make sure
      you offer valuable content so they stay. Also, don’t
      throw up pages as simple entry points without valuable
      content, as the user will just click back quickly
      to the SERP’s and negatively impact your site.
    • Traffic & User Behavior: if you provide Google
      with traffic information for your site through Google
      Analytics, try and make sure you are presenting favorable
      information. It is unknown how Google will work traffic
      data into the SERP’s, as they will have this
      info for some sites but not for others. It may bias
      the rankings for sites that give them traffic info,
      as there is more info available. However, if the info
      is negative, it may negatively affect that site. They
      may apply an “unknown” factor if they
      are unaware of traffic data for a site – but
      they’d have to work into the formula a way of
      handling sites that do and sites that don’t
      offer traffic data. By having access to logs, they
      can determine how valuable your site is according to how
      much traffic you receive from other engines and links,
      how long people stay on your site, what pages they
      visit the most, etc.
    • Freshness vs. Staleness: In order to improve their
      relevancy, they are using lots of different types
      of information to determine if the query requires
      fresh or stale results, and if your site is fresh
      or stale. So try to determine if your keywords favor
      freshness or staleness, and make sure your content
      is updated at that rate.
    • Behavior of links: this allows them to monitor
      many things: If links are added too quickly it may
      be spam or “burstiness”. If they are added
      slowly then the site may not be worth linking to.
      If the anchor text changes, the content may have changed,
      so the links may not need to pass PageRank. For each
      query they determine an average of link growth and
      if your site deviates too much its scoring will be
      adjusted accordingly. If you are trying to rank for
      keywords where staleness is preferable, it’s
      best that your link growth be slow. If your
      links appear on authority or trusted sites, and remain
      over time, with few changes, this may be favorable
      as it shows your site continues to be trusted. If
      the link text is too random or too synthetic, it may
      indicate manual intervention. Analyze anchor text
      for other authority sites and mimic their anchor text
      behavior. The number of links from unrelated sites
      is monitored to determine if there are changes over
      time – you don’t want a rapid increase
      in non-related links as it can mean buying links or
      too many reciprocal links. Try to keep your link growth
      steady, not too fast, not too slow. Do whatever you
      can to encourage people not to take your links down
      – run a report every month to see who is linking
      to you, get their email address from their site, and
      email them if you see that your link drops to encourage
      them to put the link back on. If you want to switch
      links and anchor text (if you have control), keep
      a base of links that only keeps increasing, and rotate
      a small percentage. Where the link is placed is also
      important – if it is part of the body it is
      considered more valuable. If you are buying text links,
      make sure the links are included in the body and are
      not easily identified as paid or sponsored ads.
    • User Generated Data: your site will gain points
      if many people bookmark it or refer to it in other
      ways. If it’s emailed to Gmail users, if people
      add it to their bookmarks, or return to it often through
      the address bar, it’ll be considered a valuable
      site. This information can perhaps be used to modify
      TrustRank.
    • Ranking History: spikes in ranking may reflect
      active SEO and potential spam techniques. This could
      trigger a manual review or it could automatically
      adjust the score accordingly. A filter may be applied
      to prevent sites from ranking too quickly for too
      many keywords. How a site performs when it ranks is
      also taken into account, by determining how often people
      click on the page and how long they stay. Ranking
      behavior of other sites is taken into account and
      compared. Thus, try to keep your SEO gradual –
      don’t apply too much at once, as it could result
      in drastic improvements and, later on, in penalties.
      Keep everything slow and steady, even your efforts
      at ranking.
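
The link-growth point in the list above (compare a site's link growth with the average for the query) can be sketched roughly as follows. This is not Google's formula: the function name `relative_link_growth`, the simple relative-difference measure, and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean

def relative_link_growth(site_new_links, peer_new_links):
    """Hypothetical measure: how far a site's monthly link growth sits
    from the average growth of other sites ranking for the same query.

    0.0 means "exactly average"; 5.0 means "five times the average above it".
    """
    avg = mean(peer_new_links)
    if avg <= 0:
        raise ValueError("peer average must be positive for this sketch")
    return (site_new_links - avg) / avg

# Made-up numbers: new links per month for four competing sites.
peers = [10, 12, 8, 10]

steady = relative_link_growth(11, peers)  # close to the average
burst = relative_link_growth(60, peers)   # far above the average
```

A site like `steady` would look normal under this toy measure, while `burst` is the kind of spike the article warns about.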

    Abstract:

    A system identifies a document and obtains one or
    more types of history data associated with the document. The
    system may generate a score for the document based, at least
    in part, on the one or more types of history data.

    Claims:

    Scoring a document based on history data:

    1. Inception date
      a. Scoring based upon inception date (modified
      positively or negatively according to query)
      1. For a set of many documents:
        • determining an age of each of the documents based on inception date
        • determining the average of those ages
        • scoring based on the difference between each age and the average age
      2. Elapsed time measured from the inception date, which can be:
        • when a search engine first discovers the document
        a. date document was registered can be used
        b. time stamp by server hosting the document
        • when search engine first discovers link to the document
        • when the document includes at least a predetermined number of pages

      Age of domain is a factor, especially
      when relating to query. Queries that are new and time-sensitive,
      such as hurricanes, will give bonus points to newer sites,
      whereas older queries that don’t change may give bonus
      points to older sites.
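
As a rough sketch of the age comparison described above, the snippet below computes each document's age, the average age, and an adjustment based on the difference. The function name `age_adjustments`, the `weight` parameter, the per-year scaling, and the sample dates are assumptions for illustration; the patent does not publish an actual formula.

```python
from datetime import date

def age_adjustments(inception_dates, today, prefers_fresh, weight=0.1):
    """Hypothetical score adjustment per document, based on the difference
    between the document's age and the average age of the set."""
    ages = {doc: (today - d).days for doc, d in inception_dates.items()}
    average_age = sum(ages.values()) / len(ages)
    # Older than average gives a positive difference; the sign is flipped
    # when the query is assumed to favor newer documents.
    sign = -1.0 if prefers_fresh else 1.0
    return {
        doc: sign * weight * (age - average_age) / 365.0
        for doc, age in ages.items()
    }

docs = {
    "old.example": date(2000, 1, 1),
    "new.example": date(2005, 1, 1),
}
today = date(2006, 1, 1)

fresh = age_adjustments(docs, today, prefers_fresh=True)    # e.g. a hurricane query
stable = age_adjustments(docs, today, prefers_fresh=False)  # e.g. a query that rarely changes
```

With a time-sensitive query the newer document gets the bonus; with a stable query the older one does.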

      2. How content of the document changes
        over time

      a. Frequency at which content changes over time
        1. Average time between changes
        2. Number of changes in a time period
        3. Comparison of a rate of change in current time period with a rate of change in a previous time period
      b. Amount by which the content changes over time
        1. Frequency
        2. Amount
          • new pages associated with a document within a time period
          • ratio of a number of new pages vs. total number of pages
          • percentage of the content of the document that has changed during time period
            a. weighting different portions of the content differently based on perceived importance of portions
      c. When document changes
        1. Date when content last changed
        2. Average date of change
          • difference between when content last changed and average date of change
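
A few of the content-change measures listed above can be written down as simple ratios. The sketch below is only an illustration: the function names, the weights, and the sample numbers are assumptions, and how (or whether) such values feed into a score is not stated in the claims.

```python
def new_page_ratio(new_pages, total_pages):
    """Ratio of pages added in the period to the total number of pages."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return new_pages / total_pages

def change_rate_trend(changes_current, changes_previous):
    """Rate of change in the current period relative to the previous one.
    Above 1.0 means the content is changing faster than before."""
    if changes_previous <= 0:
        raise ValueError("changes_previous must be positive")
    return changes_current / changes_previous

def weighted_change(portions):
    """Share of content changed, weighting portions by perceived importance.

    `portions` is a list of (weight, fraction_changed) pairs; the weights
    are hypothetical, e.g. body text weighted above a footer.
    """
    total_weight = sum(weight for weight, _ in portions)
    return sum(weight * changed for weight, changed in portions) / total_weight

ratio = new_page_ratio(5, 50)                          # 5 new pages on a 50-page site
trend = change_rate_trend(6, 3)                        # 6 changes this month vs. 3 last month
changed = weighted_change([(3.0, 0.5), (1.0, 0.0)])    # body half rewritten, footer untouched
```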

      Understanding Hilltop

      Abstract:

      Copied from patent:

      Ranking scheme for broad queries that places the most authoritative
      pages on the query topic at the top of the ranking. Based
      on “expert documents” – subset of the pages
      on the web identified as directories of links to various sources.
      Results are ranked based on the match between the query and
      relevant descriptive text for hyperlinks on expert pages.

      Overview

      Copied from patent:

      Algo is based on assumption that the number and quality of
      sources referring to a page is a good measure of the page’s
      quality. The key difference consists in the fact that we are
      only considering expert sources – pages that have been
      created with the specific purpose of directing people towards
      resources. In response to a query, we first compute a list
      of the most relevant experts on the query topic. Then, we
      identify relevant links within the selected set of experts,
      and follow them to identify target web pages. The targets
      are then ranked according to the number and relevance of non-affiliated
      experts that point to them. Thus the score of a target page
      reflects the collective opinion of the best independent experts
      on the query topic. Hilltop is tuned for result accuracy and
      not query coverage.

      Process:

      1. Detecting Host Affiliation

      a. Using union-find algorithm, they group hosts that:

        i. Share the same rightmost non-generic suffix
        ii. Have an IP address in common (first 3 octets)

      b. If the lookup maps two hosts to same value, then they are affiliated,
      otherwise, they are non-affiliated.
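
The affiliation step above can be sketched with a small union-find structure. This is a minimal illustration under stated assumptions: the `GENERIC` token list is a tiny made-up sample, the helper names (`site_key`, `ip_prefix`, `build_affiliations`, `affiliated`) are hypothetical, and the hosts and IP addresses are invented.

```python
GENERIC = {"com", "org", "net", "edu", "co", "uk"}  # tiny illustrative list only

def site_key(host):
    """Rightmost non-generic token of a hostname (simplified)."""
    tokens = host.lower().split(".")
    while len(tokens) > 1 and tokens[-1] in GENERIC:
        tokens.pop()
    return tokens[-1]

def ip_prefix(ip):
    """First three octets of an IPv4 address."""
    return ".".join(ip.split(".")[:3])

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b

def build_affiliations(hosts):
    """hosts maps hostname -> IP. Each host is merged with its name key
    and its IP prefix, so hosts sharing either end up in one set."""
    uf = UnionFind()
    for host, ip in hosts.items():
        uf.union(("host", host), ("name", site_key(host)))
        uf.union(("host", host), ("ip", ip_prefix(ip)))
    return uf

def affiliated(uf, host_a, host_b):
    """Two hosts are affiliated if the lookup maps them to the same value."""
    return uf.find(("host", host_a)) == uf.find(("host", host_b))

hosts = {
    "www.example.com": "192.0.2.10",
    "example.co.uk": "198.51.100.5",     # same name key as www.example.com
    "mirror.sample.org": "192.0.2.77",   # same first 3 octets as www.example.com
    "unrelated.net": "203.0.113.9",
}
uf = build_affiliations(hosts)
```

Note that affiliation is transitive here: `example.co.uk` and `mirror.sample.org` land in the same group because both are tied to `www.example.com`.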

      1. Selecting the Experts
      1. Process search engine database and select:

        i. Pages with out-degree greater than threshold
        ii. Test if these point to distinct non-affiliated
        hosts
        iii. Every such page is considered an expert page
        iv. Check if there is a broad classification available.
        If there is, they can distinguish between:

          1. random collections of links
          2. categorized resource directories
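
A rough sketch of the expert test described above: a page qualifies when it has enough outbound links and those links reach enough distinct non-affiliated hosts. The function names, the `threshold` value, and the crude `crude_group` stand-in for a real affiliation lookup are all assumptions for illustration.

```python
from urllib.parse import urlparse

def crude_group(host):
    """Hypothetical stand-in for an affiliation lookup: second-to-last
    label of the hostname (e.g. 'one' for 'a.one.com')."""
    return host.split(".")[-2]

def is_expert(out_links, group_of=crude_group, threshold=5):
    """True if the page has more than `threshold` outbound links and they
    point to at least `threshold` distinct non-affiliated host groups."""
    if len(out_links) <= threshold:
        return False
    groups = {group_of(urlparse(url).hostname) for url in out_links}
    return len(groups) >= threshold

directory = ["http://one.com/", "http://two.org/a", "http://three.net/"]
self_links = ["http://a.one.com/", "http://b.one.com/", "http://c.one.com/"]

directory_is_expert = is_expert(directory, threshold=2)
self_links_is_expert = is_expert(self_links, threshold=2)
```

The second page has the same number of links but they all lead to one affiliated group, so it does not count as an expert in this sketch.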

      1. Indexing the Experts
      1. Create inverted index to map keywords
        to experts on which they occur.
      2. Only index text contained in “key phrases”:

        i. Title
        ii. Headings
        iii. Anchor text

        1. This list is organized into match positions,
          determined by the occurrence of the keyword within a key phrase
          in the expert page.
        2. Store list of URL’s within every expert
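
The indexing step above amounts to an inverted index restricted to key phrases. The sketch below is an assumption-laden illustration: the function name `build_expert_index`, the tuple layout, and the sample expert page are invented, and real tokenization would be more careful than a lowercase split.

```python
from collections import defaultdict

def build_expert_index(experts):
    """Map each keyword to the places it occurs in expert key phrases.

    `experts` maps an expert URL to a list of (phrase_type, text) pairs,
    where phrase_type is "title", "heading" or "anchor". Each index entry
    is a match position: (expert_url, phrase_type, word_position).
    """
    index = defaultdict(list)
    for url, phrases in experts.items():
        for phrase_type, text in phrases:
            for position, word in enumerate(text.lower().split()):
                index[word].append((url, phrase_type, position))
    return dict(index)

expert_url = "http://dir.example/cars"
experts = {
    expert_url: [
        ("title", "Used Car Directory"),
        ("anchor", "Cheap used car parts"),
    ],
}
index = build_expert_index(experts)
```

Body text that is not a title, heading, or anchor is simply never passed in, which mirrors the "only index key phrases" rule.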
      1. Query Processing
        1. User types in query
        2. Algo determines a list of N experts
          that are most relevant to query
        3. Computing the Expert Score
        4. Rank results by selectively following
          relevant links from experts and assigning authority score
          to each page. To qualify experts:

        i. All query keywords should occur in document
        ii. Assign score to each expert with number and importance
        of key phrases that contain query keywords, as well as how
        much they match the query

          1. Computing the Target Score


          i. Consider top N experts and examine pages they link
          to (targets)
          ii. Target needs to be pointed to by at least 2 experts
          iii. For every target that qualifies, a score is calculated
          with the number and relevance of the experts pointing to it,
          as well as the relevance of the phrases

            1. Connect every expert with the target it links to (directed edge),
            and qualify these edges
            2. For each query keyword, define an “edge score”
            3. Check for affiliations between expert pages that point to same target.
            4. To compute target score, use sum of the edge scores of all edges
            incident to it

          iv. List of targets is ranked by computing Target score

            1. This can be filtered by testing if the query keywords are present
            in the targets
            2. Filter by matching query keywords against each target to compute
            match score with content analysis, and combine target score
            with match score before ranking targets.
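
The target-score steps above can be sketched as follows. This is a simplified illustration, not the published scoring: each edge is given a single precomputed score (rather than one per query keyword), affiliated experts pointing at the same target are collapsed to their best edge, and the names `target_scores`, `group_of`, and the sample data are assumptions.

```python
from collections import defaultdict

def target_scores(edges, group_of, min_experts=2):
    """Sum edge scores per target, counting affiliated experts once.

    `edges` is a list of (expert, target, edge_score) tuples.
    `group_of` maps each expert to its affiliation group.
    Targets pointed to by fewer than `min_experts` non-affiliated
    experts are dropped.
    """
    best = defaultdict(dict)  # target -> affiliation group -> best edge score
    for expert, target, score in edges:
        group = group_of[expert]
        if score > best[target].get(group, 0.0):
            best[target][group] = score
    return {
        target: sum(by_group.values())
        for target, by_group in best.items()
        if len(by_group) >= min_experts
    }

edges = [
    ("dirA", "site1", 3.0),
    ("dirA-mirror", "site1", 2.0),  # affiliated with dirA, so only the best edge counts
    ("dirB", "site1", 1.5),
    ("dirA", "site2", 3.0),         # only one expert group points here
]
group_of = {"dirA": "A", "dirA-mirror": "A", "dirB": "B"}

scores = target_scores(edges, group_of)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data `site2` never ranks, because only one independent expert links to it, which is the point behind getting listed on at least two non-affiliated experts.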

          What this means for SEO’s

          1. Identify potential general and categorical experts in
            your industry
          2. Make sure your site is listed in at least 2 directories
            and potential experts
          3. Make sure the experts use your keywords in the listing:
            title, header tags, and outbound anchor text
          4. The more experts link to you, the higher the target
            score
          5. Experts should have your target keywords often, as it
            will increase their expert score and help your site
          6. Make sure that experts that link to you are non-affiliated,
            otherwise they will only count once
          7. Make sure your target keywords are on your target page,
            as they will be matched with the query and the expert
            page
          8. The category of your listing on the expert site will
            also be used to determine the category of your site and
            what it will be allowed to rank for
          9. Google is probably using DMOZ as a main expert, as they
            can easily use the data and it qualifies with everything
            mentioned in this paper.
