Understanding History Data
Disclaimer: This is
our INTERPRETATION of the data that we read in the patent
application that was published by Google. As we are not programmers,
mathematicians, or IR specialists, we have used our knowledge
of marketing and SEO to extrapolate meaning and ideas from
the information in the patent application. Feel free to email
us if you disagree with our conclusions or if you would like
to give us your own ideas.
Abstract
“A system identifies a document and obtains one or more
types of history data associated with the document. The system
may generate a score for the document based, at least in part,
on the one or more types of history data.”
What this patent application means for SEOs
Google may be using all the data mentioned in this patent
application to enhance relevancy, attack spam, and identify
when freshness or staleness is desirable for a given
query. To accomplish this, they look at information
about keywords, target demographics, how people surf on a
website, content updates, and other such aspects. They use
as much information as they have access to in order to build
a more thorough picture of the site, and then use
that information to modify the site’s rankings for
the given queries.
Here are some suggestions to improve your website:
- Domain Registration: make sure all your information
is valid, and that your site, server, and DNS server are
not connected to spam sites or to many other sites
you own. Register the domain for at least 3 years.
Basically, you shouldn’t just register domains
when you think about them; you should have
a domain registration strategy that covers how the
domains are registered, where they’re hosted,
which DNS to use, etc.
- Sandbox: “Newer sites scored lower because
of number of back links. However, when inception dates
are considered, scores of documents may be modified”.
This implies that they are giving an advantage to
older domains, as well as domains that are trusted
or are authorities. They use the example of a new
site with 10 back links and an old site with 100 back
links; the rate of link growth is higher for the new
site. This could indicate spam or popularity. However,
if new domains contain “news” or seasonal
information, according to the query, they can then
rank quickly, even if they are new. Also, if they
have a rapid increase of links, and these links are
on authority sites, it may indicate that it’s
an important site and may be worth giving points to
so it ranks quickly, regardless of inception date.
For some queries, older documents may be more favorable
than new ones, so age can be considered and a document’s
score modified by the difference between the document’s
age and the average age.
- Content Changes: they’re looking at how often you
add pages to your site, how often you update pages
of your site, and the ratio for each. This allows
them to determine if your site is fresh or stale,
and allows them to compare this ratio to other sites
in the same keyword sector. Thus, determine a content
update ratio and stick to it – try not to establish
hefty goals that you can’t fulfill, as your
ratio and rankings may suffer.
- Your Page in the SERPs: they may be tracking
click-through ratios and time spent on site (the time between
the visitor clicking your link on the SERPs
and returning to check the next result). It may be time
to employ a copywriter to make your keyword-rich titles
attractive enough that people click on
your link. Once they’re on your site, make sure
you offer valuable content so they stay. Also, don’t
throw up pages as simple entry points without valuable
content, as the user will just click back quickly
to the SERPs, which may negatively impact your site.
- Traffic & User Behavior: if you provide Google
with traffic information for your site through Google
Analytics, try to make sure you are presenting favorable
information. It is unknown how Google will work traffic
data into the SERP’s, as they will have this
info for some sites but not for others. It may bias
the rankings for sites that give them traffic info,
as there is more info available. However, if the info
is negative, it may negatively affect that site. They
may apply an “unknown” factor if they
are unaware of traffic data for a site – but
they’d have to work into the formula a way of
handling sites that do and sites that don’t
offer traffic data. With access to logs, they can
determine how valuable your site is according to how
much traffic you receive from other engines and links,
how long people stay on your site, what pages they
visit the most, etc.
- Freshness vs. Staleness: In order to improve their
relevancy, they are using lots of different types
of information to determine if the query requires
fresh or stale results, and if your site is fresh
or stale. So try to determine if your keywords favor
freshness or staleness, and make sure your content
is updated at that rate.
- Behavior of links: this allows them to monitor
many things. If links are added too quickly, it may
indicate spam or “burstiness”. If they are added
slowly, the site may not be worth linking to.
If the anchor text changes, the content may have changed,
so the links may not need to pass PageRank. For each
query they determine an average rate of link growth, and
if your site deviates too much its scoring will be
adjusted accordingly. If you are trying to rank for
keywords where staleness is preferable, it’s
best to keep your link growth slow. If your
links appear on authority or trusted sites, and remain
over time, with few changes, this may be favorable
as it shows your site continues to be trusted. If
the link text is too random or too synthetic, it may
indicate manual intervention. Analyze anchor text
for other authority sites and mimic their anchor text
behavior. The number of links from unrelated sites
is monitored to determine if there are changes over
time; you don’t want a rapid increase
in non-related links, as it can suggest bought links or
too many reciprocal links. Try to keep your link growth
steady, not too fast, not too slow. Do whatever you
can to encourage people not to take your links down
– run a report every month to see who is linking
to you, get their email address from their site, and
email them if you see that your link drops to encourage
them to put the link back on. If you want to switch
links and anchor text (if you have control), keep
a base of links that only keeps increasing, and rotate
a small percentage. Where the link is placed is also
important – if it is part of the body it is
considered more valuable. If you are buying text links,
make sure the links are included in the body and are
not easily identified as paid or sponsored ads.
- User Generated Data: your site will gain points
if many people bookmark it or refer to it in other
ways. If it’s emailed to Gmail users, if people
add it to their bookmarks, or if they return to it often through
the address bar, it’ll be considered a valuable
site. This information can perhaps be used to modify
TrustRank.
- Ranking History: spikes in ranking may reflect
active SEO and potential spam techniques. This could
trigger a manual review or it could automatically
adjust the score accordingly. A filter may be applied
to prevent sites from ranking too quickly for too
many keywords. How a site performs when it ranks is
also taken into account, by determining how often people
click on the page and how long they stay. Ranking
behavior of other sites is taken into account and
compared. Thus, try to keep your SEO gradual –
don’t apply too much at once, as it could result
in drastic improvements and, later on, in penalties.
Keep everything slow and steady, even your efforts
at ranking.
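To make the "Behavior of links" point more concrete, here is a minimal sketch of how a site's link growth could be compared with the average for competing sites on the same query. This is our own illustration, not Google's method: the function names, the sample link totals, and the use of standard deviations as the measure of "deviating too much" are all assumptions.

```python
import math
from statistics import mean, pstdev


def link_growth_rate(cumulative_links):
    """Average number of new links per period, from cumulative link totals."""
    if len(cumulative_links) < 2:
        raise ValueError("need at least two observations")
    deltas = [later - earlier
              for earlier, later in zip(cumulative_links, cumulative_links[1:])]
    return mean(deltas)


def growth_deviation(site_rate, peer_rates):
    """Standard deviations between a site's growth rate and the peer average.

    Using standard deviations is our assumption; the patent application
    does not say how deviation is measured.
    """
    average = mean(peer_rates)
    spread = pstdev(peer_rates)
    difference = site_rate - average
    if spread == 0:
        # All peers grow at the same rate, so any difference is an outlier.
        return 0.0 if difference == 0 else math.copysign(math.inf, difference)
    return difference / spread


# Hypothetical monthly link totals for sites competing on one query.
peers = [link_growth_rate(totals)
         for totals in ([100, 108, 116], [50, 60, 70], [200, 212, 224])]
steady = growth_deviation(link_growth_rate([10, 20, 30]), peers)
bursty = growth_deviation(link_growth_rate([10, 60, 110]), peers)
```

A site that gains links at roughly the same pace as its peers gets a deviation near zero, while a sudden burst produces a large positive value. Where the cutoff for "too much" sits is unknown to us.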
Claims
Scoring a document based on history data:
- Inception date
a. Scoring based upon inception date (modified
positively or negatively according to query)
- For a document that includes many documents:
• determining an age of each of the documents based on inception date
• determining the average of those ages
• scoring based on the difference between each age and the average age
- elapsed time measured from the inception date
• when a search engine first discovers the document
a. date document was registered can be used
b. time stamp by server hosting the document
- when search engine first discovers link to the document
- when the document includes at least a predetermined number of pages
|
Age of domain is a factor, especially
when relating to query. Queries that are new and time-sensitive,
such as hurricanes, will give bonus points to newer sites,
whereas older queries that don’t change may give bonus
points to older sites.
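The inception date claims above (age of each document, average age, score based on the difference) can be sketched as follows. This is only our reading of the claims: the function name, the `prefer_fresh` flag, the `weight` parameter, and the linear adjustment are assumptions we introduce for illustration.

```python
from datetime import date


def age_adjustments(inception_dates, today, prefer_fresh, weight=1.0):
    """Score adjustment per document from its age relative to the average age.

    inception_dates maps a document name to its inception date.
    prefer_fresh is True for queries where newer documents should gain.
    The linear formula and the weight are illustrative assumptions.
    """
    if not inception_dates:
        raise ValueError("need at least one document")
    ages = {doc: (today - inception).days
            for doc, inception in inception_dates.items()}
    average_age = sum(ages.values()) / len(ages)
    # Older than average gains for "stale" queries, loses for "fresh" ones.
    direction = -1.0 if prefer_fresh else 1.0
    return {doc: direction * weight * (age - average_age)
            for doc, age in ages.items()}


# Hypothetical documents and dates.
docs = {
    "new-site": date(2006, 4, 20),
    "mid-site": date(2006, 4, 10),
    "old-site": date(2006, 3, 31),
}
stale_query = age_adjustments(docs, date(2006, 4, 30), prefer_fresh=False)
fresh_query = age_adjustments(docs, date(2006, 4, 30), prefer_fresh=True)
```

For a time-sensitive query the newest document receives the positive adjustment, and for a query that does not change the oldest one does, which matches the hurricane example above.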
- How content of the document changes over time
a. Frequency at which content changes over time
1. Average time between changes
2. Number of changes in a time period
3. Comparison of a rate of change in current time period with a rate of change in a previous time period
b. Amount by which the content changes over time
1. Frequency
2. Amount
• new pages associated with a document within a time period
• ratio of a number of new pages vs. total number of pages
• percentage of the content of the document that has changed during time period
a. weighting different portions of the content differently based on perceived importance of portions
c. When document changes
1. Date when content last changed
2. Average date of change
• difference between when content last changed and average date of change
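Two of the content change measures listed above, the ratio of new pages to total pages and the weighted percentage of content that changed, can be sketched like this. The function names, the sample numbers, and the choice of a weighted average are our own assumptions. The claims only say that portions may be weighted differently by perceived importance.

```python
def new_page_ratio(new_pages, total_pages):
    """Ratio of new pages in a time period to the total number of pages."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return new_pages / total_pages


def weighted_change(portions):
    """Share of content changed, weighting portions by perceived importance.

    portions is a sequence of (weight, fraction_changed) pairs, where
    fraction_changed runs from 0.0 (untouched) to 1.0 (fully rewritten).
    """
    portions = list(portions)
    total_weight = sum(weight for weight, _ in portions)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weight * changed for weight, changed in portions) / total_weight


# Hypothetical site: 5 new pages out of 50 in the period.
ratio = new_page_ratio(5, 50)
# Hypothetical page: body text (weight 3) half rewritten, footer (weight 1) untouched.
changed = weighted_change([(3.0, 0.5), (1.0, 0.0)])
```

Tracking these numbers for your own site over each period is one way to check whether you are keeping to the content update ratio you set.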
Posted by AJ Ghergich on Thursday, April 6, 2006 at 3:01 pm
Filed under SEO