Saturday, November 10, 2012

Google Search Architecture (Part-2)

Thanks for the response to the first article on the "Google Search Architecture". Even though this is quite an old topic, I enjoyed your comments on the Blogspot/Wordpress blogs, and they motivate me to write more posts of this kind.

The first part can be found in

In the first part we looked at some of the components, namely MapReduce and GFS (Google File System), and their API/software abstractions.

Major Modifications in Google Architecture

1. The MapReduce-based batch pipeline used to build the search index was time-consuming, so a major revamp of the architecture was made and it was replaced with Caffeine.

2. GFS has also been replaced by Colossus (the name means a large statue), which is used across all of the "Google web Services". GFS was designed around batch operations, whereas Colossus is built for "Real-time services"[1]. Colossus is also used for Google's cloud storage service, which is comparable to Amazon "S3".
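The difference between batch processing and real-time updates mentioned above can be illustrated with a toy inverted index. This is only a sketch to show the idea, not how Caffeine or Colossus work internally; the function names `build_index_batch` and `update_index_incremental` are hypothetical and chosen just for this example.

```python
from collections import defaultdict


def build_index_batch(documents):
    """Batch style: rebuild the whole inverted index from all documents."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def update_index_incremental(index, documents, doc_id, new_text):
    """Incremental style: touch only the entries affected by one document."""
    old_text = documents.get(doc_id, "")
    # Remove the document from the words it used to contain.
    for word in set(old_text.lower().split()):
        index[word].discard(doc_id)
        if not index[word]:
            del index[word]
    # Add the document under the words it contains now.
    for word in new_text.lower().split():
        index[word].add(doc_id)
    documents[doc_id] = new_text


if __name__ == "__main__":
    docs = {1: "google file system", 2: "hadoop file system"}
    index = build_index_batch(docs)
    update_index_incremental(index, docs, 1, "google colossus")
    print(sorted(index["file"]))
```

In the batch version every change means reprocessing all documents, while the incremental version does work proportional to the size of the changed document only.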

Incarnation of Google Architecture

Google published papers describing this architecture, and based on them major software firms such as "Facebook" and others use the same kind of architecture.

Hadoop was created by Doug Cutting and Michael J. Cafarella. Doug, who was working at Yahoo at the time, named it after his son's toy elephant. It was originally developed to support distribution for the Nutch search engine project.

Hadoop cluster
Hadoop Architecture

1. Hadoop includes the "Hadoop Distributed File System" (HDFS), an open source file system written in Java that is similar to GFS.

2. Hadoop's MapReduce engine is an open source implementation of Google's MapReduce model, while HDFS plays the role of GFS.
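To give a feel for the MapReduce model that Hadoop implements, here is a minimal single-machine sketch of a word count. It does not use the Hadoop API; the helper names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this illustration, and a real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["google file system", "hadoop file system"]
    print(word_count(docs))
```

The point of the split is that each `map_phase` call depends on one document only and each `reduce_phase` call on one key only, which is what lets a framework spread the work over a cluster.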

Besides Facebook and Yahoo!, many other organizations use Hadoop to run large distributed computations.

