Google’s new web search technology is now public. Caffeine, as they call it, has been the brains behind Google Search since yesterday. It provides 50% fresher results, which is great news for an information-crazy public that thrives on getting new information from the net.
Illustration: Suppose I wanted to search for an event that took place a few hours ago. The old Google Search was an utter failure at this, while Google News picked the story up within a few minutes of it being reported. So I used to head over to News and type in the keyword to get the information I was looking for. With the Caffeinated Google Search, I can get current news in Search itself within a few hours.
What makes the difference in Caffeine?
The old Google Search index was organized in layers. To update the index, an entire layer had to be refreshed, which was time-consuming, and the indexing was done for the entire web in one single stretch. This is what caused the delay before a new webpage showed up on Google Search.
Caffeine is different, and its approach is similar to the RISC philosophy in microprocessors: many small, simple operations instead of a few large ones. It does multiple mini updates on an hourly basis. The idea is to scout for new data and, whenever it finds some, feed it into the index and go back out looking for more.
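To make the difference concrete, here is a toy sketch in Python. It is only an illustration of the two update styles described above, not Google's actual design: the names build_layer and IncrementalIndex, and the sample pages, are made up for this example.

```python
from collections import defaultdict


def build_layer(docs):
    """Batch style: rebuild a whole inverted index from every page in the layer."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


class IncrementalIndex:
    """Incremental style: each new page goes into the live index as soon as it is found."""

    def __init__(self):
        self.index = defaultdict(set)

    def add(self, doc_id, text):
        for word in text.lower().split():
            self.index[word].add(doc_id)

    def search(self, word):
        # .get() avoids creating empty entries for words that were never indexed
        return self.index.get(word.lower(), set())


# Hypothetical pages, for illustration only
layer = {"a": "old news story", "b": "weather report"}
batch = build_layer(layer)

live = IncrementalIndex()
for doc_id, text in layer.items():
    live.add(doc_id, text)

# A fresh page appears on the web
layer["c"] = "breaking news"
live.add("c", "breaking news")        # searchable right away
stale = set(batch.get("news", set()))  # batch index has not seen page "c" yet
batch = build_layer(layer)             # page "c" shows up only after a full rebuild
```

In the batch version the new page stays invisible until the whole layer is rebuilt, while the incremental version can answer a search for it immediately after the small update.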
Here are some Caffeine facts and stats from the Google blog:
Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.
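As a quick sanity check on those numbers, the arithmetic can be worked out in a few lines of Python. This treats the quoted figures as exact, although the blog says "nearly" 100 million gigabytes and "more than" 40 miles, so the results are rough estimates only. The variable names are my own.

```python
# Figures taken from the quote above, treated as exact for the estimate
TOTAL_GB = 100_000_000      # "nearly 100 million gigabytes"
IPOD_COUNT = 625_000        # "625,000 of the largest iPods"
STACK_MILES = 40            # "more than 40 miles"
METERS_PER_MILE = 1609.344

# Implied capacity of a single iPod
gb_per_ipod = TOTAL_GB / IPOD_COUNT

# Implied length of a single iPod in the end-to-end stack, in centimeters
cm_per_ipod = STACK_MILES * METERS_PER_MILE / IPOD_COUNT * 100

print(f"{gb_per_ipod:.0f} GB per iPod")
print(f"{cm_per_ipod:.1f} cm per iPod")
```

That works out to 160 GB and roughly 10.3 cm per device, so the iPod comparison in the quote is at least internally consistent.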
via Tech Pedia