SportsGuy
Staff
Joined: Aug 30, 2002
# Posts: 3600
|
Posted: 2008-Jul-31 17:37
So there's an unofficial "rule" that the engines don't follow links on a page after the first 100 or so they find.
Not sure how real it is, but for the sake of this conversation, let's assume it's real.
The question is:
If you have 175 links on a page, and you nofollow the first 125, for example, does the crawler skip them and follow your other 50 links as if they were the ONLY 50 links on the page (outbound, of course)?
Mostly a theory question, but one I found intriguing and didn't have an immediate answer for.
Please discuss.
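To make the two possible outcomes concrete, here is a minimal sketch in Python. It does not describe how any real engine behaves: the 100-link cutoff and both counting behaviours are assumptions taken from the question above, and the function name `links_followed` is made up for illustration.

```python
def links_followed(nofollow_flags, limit=100, nofollow_counts_toward_limit=True):
    """Return the 1-based positions of links a hypothetical crawler would follow.

    nofollow_flags: list of booleans, True where the link carries rel="nofollow".
    limit: the assumed cutoff from the unofficial "rule" (not a known value).
    """
    followed = []
    seen = 0  # links counted against the limit so far
    for position, is_nofollow in enumerate(nofollow_flags, start=1):
        if is_nofollow and not nofollow_counts_toward_limit:
            continue  # hypothesis B: nofollowed links are ignored entirely
        seen += 1
        if seen > limit:
            break  # past the assumed cutoff, nothing else is followed
        if not is_nofollow:
            followed.append(position)
    return followed


# 175 links, the first 125 nofollowed, as in the question
flags = [True] * 125 + [False] * 50

hypothesis_a = links_followed(flags, nofollow_counts_toward_limit=True)
hypothesis_b = links_followed(flags, nofollow_counts_toward_limit=False)
print(len(hypothesis_a), len(hypothesis_b))  # 0 50
```

Under hypothesis A (nofollowed links still use up the quota) none of the 50 plain links is reached; under hypothesis B (nofollowed links are skipped before counting) all 50 are. Which of the two, if either, matches reality is exactly what an experiment would have to show.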
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Jul-31 18:38
I have a URL that has never been spidered.
Wanna try an experiment?
|
 |
SportsGuy
Staff
Joined: Aug 30, 2002
# Posts: 3600
|
Posted: 2008-Jul-31 23:30
Say, that might be fun.
So, how would we do this?
Maybe create a small site with 101 pages, one index page with 101 links (one to each page), and say, 50 of them nofollowed - the 1st fifty on the page.
Then we watch to see if any of the other 51 pages get indexed in the SERPs.
Could that work?
We could even just dupe the content on most of the pages, as we won't care about results other than indexing, and the deduping should filter after the indexing... (open to thoughts here).
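A rough sketch of how such an index page could be generated, in Python. The file names (`page-1.html` and so on) and the function name `build_index` are placeholders invented for this example; only the counts (101 links, the first 50 nofollowed) come from the post above.

```python
from html import escape


def build_index(page_count=101, nofollow_first=50):
    """Build an index page linking to page-1.html .. page-N.html.

    The first `nofollow_first` links get rel="nofollow"; the rest are plain.
    File names are hypothetical placeholders.
    """
    items = []
    for n in range(1, page_count + 1):
        href = escape(f"page-{n}.html", quote=True)
        rel = ' rel="nofollow"' if n <= nofollow_first else ""
        items.append(f'<li><a href="{href}"{rel}>Page {n}</a></li>')
    body = "\n".join(items)
    return (
        "<!DOCTYPE html>\n<html><head><title>Index</title></head><body>\n"
        f"<ul>\n{body}\n</ul>\n</body></html>\n"
    )


index_html = build_index()
```

Writing `index_html` to a file gives one page with 101 links in a fixed order, so any later change in which pages get indexed can be traced back to link position and the nofollow attribute rather than to hand-editing mistakes.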
|
 |
Prowler
Staff
Joined: Aug 14, 2000
# Posts: 1788
|
Posted: 2008-Aug-01 05:55
I am afraid we would bring another factor into this experiment if we duplicate the content, so we should avoid that as far as we can. I have a "large" site in the making with unique content. It is not up yet. All pages are created on the fly, sourced from a large database. Each page has one unique URL (it is designed that way from the ground up, with unique headers for each page).
We can conduct the experiment on this site along with Hampstead's.
We will not display the link here so as not to skew the results.
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-01 08:57
A simple test on the 100 link "rule" could be to add a link page to an existing site with a link to the green URL at position 105 (or whatever), then see if Google visits.
We could do the same test with another green URL using nofollows and see what happens.
I've got quite a few unused URLs.
|
 |
g1smd
Staff
Joined: Jul 28, 2002
# Posts: 10418
|
Posted: 2008-Aug-01 10:35
Even if links have nofollow they still have to be read by the indexer.
|
 |
SportsGuy
Staff
Joined: Aug 30, 2002
# Posts: 3600
|
Posted: 2008-Aug-01 12:30
Agree G1 - they are still read & catalogued, but I want to know if there's any truth to the 100 links theory, and if nofollow influences it.
I think it's the nofollow bit that is critical here, as that's the instructive bit to the engine.
Prowler - agreed on the dupe content issue - I simply wanted to avoid suggesting someone create a unique 101 page website to test this.
So, given Hampstead's simplified idea (passes the sniff test, but I am, admittedly, before coffee here... ), it would be relatively easy to set up.
Would anyone care to join me in joint testing on multiple domains so we can lend some depth to this experiment?
...and we could take this offline, if appropriate, then report back on the findings in a couple weeks.
We can start with PM and move to e-mail rather than post mails here.
Duane
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-01 12:53
The experiment could look at whether the links are on topic and whether they are reciprocated too.
My guess is that if the parent site is a respected site and the links are on topic, they will be followed.
If the links are off topic or reciprocated, I would expect the links to be either not followed or only followed occasionally during a deep index cycle.
With a bit of thought and planning, we may be able to turn the theory into a rule based on empirical data.
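One way to keep track of the variables mentioned so far is to list every combination up front. The sketch below is only a planning aid; the factor names are labels chosen for this example, not an agreed spec.

```python
from itertools import product

# Factors raised in the thread so far; the names are illustrative labels.
FACTORS = {
    "on_topic": (True, False),
    "reciprocated": (True, False),
    "nofollow_on_earlier_links": (True, False),
}


def build_matrix(factors=FACTORS):
    """Return one dict per combination of factor values."""
    names = list(factors)
    combos = product(*(factors[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


matrix = build_matrix()
print(len(matrix))  # 8
```

Eight combinations means eight separate test pages if every factor is to be isolated, which is a useful thing to know before anyone volunteers URLs.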
|
 |
beth_lk
Staff
Joined: Jun 23, 2004
# Posts: 1211
|
Posted: 2008-Aug-01 23:19
I find this experiment very interesting and hope you all post your findings here as things happen.
I am wondering if the test site will or will not also have a site map ?
|
 |
Prowler
Staff
Joined: Aug 14, 2000
# Posts: 1788
|
Posted: 2008-Aug-02 06:33
In my case, I won't set up any sitemap, so that we know the engines have no other way of finding the URLs. The domain has been up for more than a year with little more than an index page without any links. When I am set, I will PM all who are interested in this experiment, starting with SportsGuy.
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-02 09:41
Firstly we need a trusted site to initiate the link.
Any ideas?
Here perhaps?
|
 |
Prowler
Staff
Joined: Aug 14, 2000
# Posts: 1788
|
Posted: 2008-Aug-02 09:50
Trusted sites ? No Problem. Let us not use a high profile site as this would again skew the results. It is enough if we get a link from one run-of-the-mill general purpose directory to initiate the link.
As far as possible we conduct the tests under "sterile" conditions.
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-02 10:20
Perhaps we should draw up a spec for the test so that we can all agree on what is being tested and the modus operandi.
I'm happy to knock up a quick brief that we can discuss and develop if you like.
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-04 09:37
The first thing we need to do is to ascertain whether or not the 100 link "rule" exists.
To test this, I have a green URL that has never been associated with a hosting account.
I will set up some hosting for this and create a holding page with a small amount of unique content.
I will then create a link page with 100 plus links. The link to the green URL will be in position 110 (open to suggestions).
I would like to have this page hosted on a different IP block, hopefully on a trusted site.
Volunteers?
Thoughts please?
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-04 09:39
Afterthought:
The link page will have 200 random links.
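A small sketch of how the link page could be built and checked, in Python. `TARGET_URL` and the filler URLs are placeholders (the real green URL is deliberately not shown), and the filler links here are sequential rather than random; the helper names are invented for this example.

```python
from html.parser import HTMLParser

TARGET_URL = "http://green-url.example/"  # placeholder, not the real test URL
FILLER = "http://filler-{}.example/"       # placeholder filler links


def build_link_list(total=200, target_position=110):
    """Return `total` URLs with TARGET_URL at the 1-based `target_position`."""
    if not 1 <= target_position <= total:
        raise ValueError("target_position must be within 1..total")
    urls = [FILLER.format(i) for i in range(1, total)]  # total - 1 fillers
    urls.insert(target_position - 1, TARGET_URL)
    return urls


def render(urls):
    """Render a bare list of links with no other content."""
    rows = "\n".join(f'<a href="{u}">{u}</a><br>' for u in urls)
    return f"<html><body>\n{rows}\n</body></html>"


class HrefCollector(HTMLParser):
    """Collect href values in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


page = render(build_link_list())
collector = HrefCollector()
collector.feed(page)
position = collector.hrefs.index(TARGET_URL) + 1
print(len(collector.hrefs), position)  # 200 110
```

Parsing the finished page back and confirming the position guards against an off-by-one error, which matters when the whole test hinges on the link sitting just past 100.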
|
 |
Prowler
Staff
Joined: Aug 14, 2000
# Posts: 1788
|
Posted: 2008-Aug-04 11:46
>> Perhaps we should draw up a spec for the test so that we can all agree on what is being tested and the modus operandi.
Ok.
>> I will set up some hosting for this and create a holding page with a small amount of unique content.
Ok.
>> I will then create a link page with 100 plus links. The link to the green URL will be in position 110 (open to suggestions).
Aha. What is the rule on this? Do we use a directory kind of structure here? Will this page contain some content at the top, or will it be just a page of plain links?
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-04 13:34
I would be inclined to start with a simple list of links.
Reason being that the "rule" could be subject to filters which can be tripped algorithmically. For example, if the page is seen as being of interest to Google with lots of nice new content, they may be inclined to crawl more deeply first time round.
If it is a simple site map type page, we may find that Google doesn't bother crawling past 100 links.
Of course we may not, but it would help establish a base level where the page is of no great interest to Google.
|
 |
SportsGuy
Staff
Joined: Aug 30, 2002
# Posts: 3600
|
Posted: 2008-Aug-04 15:09
This is a great conversation guys & gals.
So far, sounds all good - and I'll add that we may want to consider running this on a page WITH content that is already indexed - just changing the number of outgoing links to see what gets followed.
This would obviously necessitate an established site - which I can contribute if needed.
It seems there are a few ways we could approach this, so I agree with Prowler's suggestion of a spec so we can outline the work and cover the points to stay on track. In the end, we may have multiple experiments here, each with slight variations on things to consider.
|
 |
Prowler
Staff
Joined: Aug 14, 2000
# Posts: 1788
|
Posted: 2008-Aug-04 15:57
One more thing: as far as the rest of the world is concerned, no one even knows that such a page exists. So the 'honey trap' designed for Google must also write a log, so that we can all see and track how things are progressing. Just a basic access_log with time, UA, IP and the referrer information (in case of a human visitor, i.e. one of us) will do.
Once we are set, we will consolidate the rules / roles here.
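Reading such a log could look roughly like the Python sketch below. It assumes the server writes the common Apache "combined" log format, which the thread does not actually specify, and the two sample lines are invented purely to exercise the parser; they are not results from any test.

```python
import re

# Apache "combined" log format (an assumption; adjust to the real server).
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<ua>[^"]*)"'
)


def parse_line(line):
    """Return time, UA, IP, referrer and request for one log line, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {k: m.group(k) for k in ("time", "ua", "ip", "referrer", "request")}


def crawler_hits(lines, ua_token="Googlebot"):
    """Keep entries whose user agent mentions `ua_token` (case-insensitive)."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and ua_token.lower() in entry["ua"].lower():
            hits.append(entry)
    return hits


# Invented sample lines using documentation-only IP ranges.
sample = [
    '192.0.2.10 - - [04/Aug/2008:16:00:00 +0000] "GET /links.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '198.51.100.7 - - [04/Aug/2008:16:05:00 +0000] "GET /links.html HTTP/1.1" '
    '200 5120 "http://forum.example/thread" "Mozilla/5.0 (Windows; U) Firefox/3.0"',
]
hits = crawler_hits(sample)
```

A user-agent match alone does not prove a visit came from Google, since the string can be spoofed, so the IP column is worth checking before counting a hit.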
|
 |
Hampstead
Joined: Feb 20, 2001
# Posts: 2012
|
Posted: 2008-Aug-04 16:07
Yes - we need to formalise this. There are variants of the 100 link rule.
Most talk about same site links, but some also talk about external links too.
|
 |