Google Search Engine Trick Has Been Revealed:

Google is the number one search engine on the web. Still, many of us wonder how Google can search the web for us and return results in 0.07 to 1 second.
Let's have a look at how it manages its website, robots, servers and spiders:

Google has a very good database server. I prefer not to call it a database server, though; it is really a cache server. I will go through everything one step at a time.

How Google crawls a page?

Google is not allowed to log into a web server and read its pages directly. It also does not index sites by hand. What Google does is first index a website that has been submitted to it through Google Webmaster, AddURL or any other direct Google submission method.

Next, while crawling the website, it grabs all the external and internal links and saves them to the cache database. Google has software called spiders that crawl the website. A spider then starts crawling the external links found on the first website, and it goes on crawling external and internal links one after another.
Example: Suppose you have a website called website.com with several external links such as externalweb.com, externallink2.com etc. The spider will crawl your website, move on to externalweb.com and externallink2.com, and then move on to the external links of those sites. It goes on crawling all the sites that are linked together.
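
To make the link-following idea concrete, here is a small sketch in Python. It does not touch the network: the LINKS table is a made-up link graph built from the example names above (another-site.com is my own addition), and the breadth-first order is my assumption about how a spider might walk it, not Google's actual code.

```python
from collections import deque

# Hypothetical link graph: each site maps to the sites it links to.
LINKS = {
    "website.com": ["externalweb.com", "externallink2.com"],
    "externalweb.com": ["externallink2.com", "another-site.com"],
    "externallink2.com": ["website.com"],
    "another-site.com": [],
}


def crawl(start, links):
    """Return the sites in the order a breadth-first spider would visit them."""
    seen = {start}          # sites already queued, so nothing is crawled twice
    queue = deque([start])  # sites waiting to be crawled
    order = []
    while queue:
        site = queue.popleft()
        order.append(site)
        for target in links.get(site, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


print(crawl("website.com", LINKS))
```

The seen set is what stops the spider from looping forever when two sites link to each other.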

Secondly, the spider saves the details it has grabbed to different storage servers. Links, text descriptions and titles go to one storage server, and images and videos go to different servers. However, Google does not store the complete website! It saves only the links and source code of your website and, of course, images. I will come to images later on.

Google Web Spiders

So, searches are handled as follows:
Web search handled by: www.google.com
Image search handled by: images.google.com
and other subdomains handle other searches such as video, news etc. Also, the spider is an application that simply separates the text, images, links and title of a web page and sends each to the server assigned to it.
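
The separating step can be sketched with Python's standard html.parser module. The class name PageSplitter, the split_page helper and the sample page are all hypothetical; this only illustrates pulling a page apart into title, links, images and text, and says nothing about how Google really does it.

```python
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Collects the title, links, image sources and text of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.images = []
        self.text = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        data = data.strip()
        if not data:
            return  # skip whitespace between tags
        if self._in_title:
            self.title += data
        else:
            self.text.append(data)


def split_page(html):
    """Return the separated parts; each part could go to its own server."""
    parser = PageSplitter()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "links": parser.links,
        "images": parser.images,
        "text": parser.text,
    }


SAMPLE = """<html><head><title>My Site</title></head>
<body><p>Hello world</p><a href="http://externalweb.com">A friend</a>
<img src="/logo.png"></body></html>"""

page = split_page(SAMPLE)
print(page)
```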

Have you ever tried searching for images on google.com? Does it show the same results as images.google.com? Of course not!

Now have a look at the caching tricks.
There was a time when Google cached whole websites, and users could recover backups of their content by searching the Google cache. Google has changed now, and it no longer stores your content in full. It stores only some text, links and a few smaller images. It does not keep (cache) large images on its storage servers. If you want to test this right now, do the following:
Change your WordPress theme or the body color of a page on your website, then search for your website in Google and click on ‘Cached’. Can you see the old template that was there two minutes ago? No, right? Google stopped caching templates, large images and videos to save server storage space. If you want to check this directly, access the link below:

http://webcache.googleusercontent.com/search?q=cache:www.yourwebsitelink.com

Google stores text and a few images, tabs and links. That’s all. All the other CSS/JS files are served from your hosting server.

Now have a look at their image caching storage system. As I mentioned above, the spiders deliver all the text, links and descriptions to one server and images to a different server. Also, an image is not stored at the exact size it has on the user’s website. It is resized to 250*250 pixels. The image is delivered to the caching server along with its name, size and link details.
Example: Go to images.google.com, search for eminem, hover the mouse over one of the eminem images, then right-click and select View Image. You will see a small image such as:

http://t3.gstatic.com/images?q=tbn:ANd9GcTq9BOk9biNCro0nuk9wV2dcEWohslU7lQsBZ-QaNNnLYR98Gmm
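
Here is a rough sketch of what such a thumbnail record could look like. Only the 250*250 size and the name, size and link details come from the description above. The ThumbnailRecord type, the make_thumbnail_record function and the choice to keep the aspect ratio and never enlarge small images are my own assumptions.

```python
from dataclasses import dataclass

THUMB_BOX = 250  # the 250*250 pixel size mentioned above


@dataclass
class ThumbnailRecord:
    name: str         # image file name
    source_link: str  # where the full-size image lives
    width: int        # thumbnail width in pixels
    height: int       # thumbnail height in pixels


def make_thumbnail_record(name, source_link, width, height, box=THUMB_BOX):
    # Assumption: shrink to fit inside the box, keep the aspect ratio,
    # and never enlarge an image that is already small enough.
    scale = min(box / width, box / height, 1.0)
    return ThumbnailRecord(
        name,
        source_link,
        max(1, round(width * scale)),
        max(1, round(height * scale)),
    )


record = make_thumbnail_record(
    "photo.jpg", "http://website.com/photo.jpg", 1000, 500
)
print(record.width, record.height)  # 250 125
```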

So, Google holds nothing but the links and a few text details of each website.

Just think about it for a minute. You have a website with a template, images, links and a background color, and it takes 2 to 3 minutes to load completely, yes? Okay. Now create a page in plain HTML without any background content or images. Just have 10 hyperlinks and 10 paragraphs. Check the speed of both pages in your browser.
Example: Find the difference yourself.

http://www.timesmangalore.com/webguru//google.htm

The Google search page embedded on my site page.

My original WordPress website with database query:

http://www.timesmangalore.com/webguru//?s=linux+commands

Can you see the difference? The first link loaded on my end in 0.10 seconds and the second one took 20 to 50 seconds. Both files are on the same server, and yet you can see the difference.
Now access the link at: http://www.timesmangalore.com/webguru//myfile.htm
This file has only text content and it loaded within 0.12 seconds even though I am on a shared server.

All Google is doing is caching web links, titles and descriptions and storing them in the database. Google also caches its results! For example, if you search for “I love Google”, it takes about half a second to show results, and if you search again, you get the same results in 0.07 seconds!
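
The repeated-search speed-up is what you would expect from a result cache. Below is a minimal sketch of the idea. The CachedSearch class and the slow_lookup function are hypothetical stand-ins, and normalizing the query to lower case is my own assumption.

```python
class CachedSearch:
    """Keeps the results of earlier queries so repeats skip the backend."""

    def __init__(self, backend):
        self._backend = backend  # the slow lookup function
        self._cache = {}         # normalized query -> stored results
        self.backend_calls = 0   # how often the slow path was taken

    def search(self, query):
        # Assumption: case and extra spaces do not change the query.
        key = " ".join(query.lower().split())
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(key)
        return self._cache[key]


def slow_lookup(query):
    # Hypothetical stand-in for the expensive database query.
    return [f"result for {query}"]


engine = CachedSearch(slow_lookup)
first = engine.search("I love Google")
second = engine.search("i love  google")
print(engine.backend_calls)  # 1
```

The second search never reaches slow_lookup, so it can come back much faster than the first.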

The differences between our websites and Google's are:
1. We have multiple CSS and JS files.
2. We have large or many images on our sites.
3. We have multiple paragraphs and maybe Flash on our websites.
4. We are on shared hosting.
5. We store the complete data in the database and not a summary.
All of the above is the opposite of Google!

Google has a different caching database server for each country, and it shows different results for each country.
Example:

http://www.google.com/search?q=eminem

http://www.google.in/search?q=eminem

http://www.google.com.hk/search?q=eminem

Search method:
When a user types something in the search box and clicks on ‘Search’, the request goes to the Google database for the Google domain in use, such as google.co.in, .com or .hk. The request then runs a query on the server.
Example: If you search for ‘Hollywood’, the script runs a query that looks for ‘Hollywood’ in the database where all website titles, links and descriptions are stored. It then returns a result such as:

Movies | Reviews | Movie Times | Hollywood News | Hollywood.com

Hollywood.com is the only 100% pure entertainment online source for movies, movie reviews, movie times and hollywood news.
www.hollywood.com/ - Cached - Similar
This does not mean that http://www.hollywood.com is the only good website for the keyword; it is simply how Google ranks sites before providing results.
The script commands the database server along these lines: show this user a list of site links where the sites have the keyword ‘Hollywood’ in their titles, links and descriptions.
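
That command can be sketched as a plain SQL query using Python's built-in sqlite3 module. The sites table, its columns and the sample rows are made up for illustration. Matching the keyword in any one of the three fields is my assumption, and the sketch does no ranking at all.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory table of (title, link, description) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sites (title TEXT, link TEXT, description TEXT)")
    conn.executemany("INSERT INTO sites VALUES (?, ?, ?)", rows)
    return conn


def search(conn, keyword):
    """Return links of sites whose title, link or description has the keyword."""
    pattern = f"%{keyword}%"
    cur = conn.execute(
        "SELECT link FROM sites "
        "WHERE title LIKE ? OR link LIKE ? OR description LIKE ? "
        "ORDER BY link",
        (pattern, pattern, pattern),
    )
    return [row[0] for row in cur.fetchall()]


# Hypothetical sample rows.
ROWS = [
    (
        "Movies | Reviews | Movie Times | Hollywood News | Hollywood.com",
        "www.hollywood.com",
        "Movies, movie reviews, movie times and hollywood news.",
    ),
    ("Linux commands", "website.com", "A list of shell commands."),
]

conn = build_index(ROWS)
print(search(conn, "HollyWood"))
```

SQLite's LIKE ignores case for plain ASCII letters, which is why ‘HollyWood’ still finds the Hollywood row.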

Google has a BIG IP, and from there the query is redirected to the respective servers so that Google can manage server overload. From all these explanations, we can see that Google has very well maintained servers, technology and ideas with which to rule the web. It may seem to us that Google is infinite, but it is not at all. It is just amazing at its business.

That’s all.

Please comment if you like my article, or contact me at mesam6@gmail.com.


One Comment

  1. Posted April 3, 2011 at 3:42 pm

    Hi, thank you for your post, it helped me a lot figuring out many things.