Google suspends real-time search
A lot of things happened this past week. Google opened its new service Google+ to a small set of people (don’t bother trying to get an invite; they seem to have closed it to new users for now), and also changed the layout of its search page and calendar.
As a “side effect”, Google real-time search is apparently gone. Mashable confirms this in an article and offers some insight. Apparently, the agreement between Twitter and Google has ended, and Google has not extended it for now, planning instead to focus on its own service Google+. Google states that it’s still crawling the publicly accessible pages from Twitter, which is of course hardly the same as having access to the Twitter stream.
I’m actually not that surprised, as Google’s real-time search always felt like a half-hearted attempt. I think the challenges of real-time search are somewhat different from those of web search.
Web pages are relatively static, both in content and relevancy. In real-time search, however, the information changes very rapidly and has a high probability of becoming obsolete quickly. This leads to quite different technological challenges, so it’s probably hard to fit real-time search into Google’s existing infrastructure. The amount of cacheable information is also very limited.
Real-time search requires real-time relevancy measures. It doesn’t really make sense to just show the most recent messages matching your query. For popular events, there might be thousands of hits, swamping any really relevant hit after a few minutes or even seconds.
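To make this a bit more concrete, here is a rough sketch of one possible real-time relevancy measure: an engagement count that is discounted by the age of the message. The `retweets` and `time` fields, the function names, and the ten-minute half-life are all assumptions made up for this example; this is not how Google or Twimpact actually rank results.

```python
def decayed_score(retweets, age_seconds, half_life=600.0):
    """Engagement weight that halves every `half_life` seconds (assumed value)."""
    return (1 + retweets) * 0.5 ** (age_seconds / half_life)


def rank_hits(hits, now, half_life=600.0):
    """Sort hits so that fresh *and* widely shared messages come first.

    Each hit is a dict with hypothetical keys "retweets" and "time"
    (seconds since some epoch).
    """
    return sorted(
        hits,
        key=lambda h: decayed_score(h["retweets"], now - h["time"], half_life),
        reverse=True,
    )


hits = [
    {"id": "a", "retweets": 0, "time": 990},   # very recent, nobody shared it
    {"id": "b", "retweets": 50, "time": 400},  # ten minutes old, widely shared
    {"id": "c", "retweets": 3, "time": 900},   # in between
]
ranked = rank_hits(hits, now=1000)
```

With these numbers, the widely shared message still outranks the brand-new one even though it is ten minutes old, which is the behavior a pure "most recent first" list cannot give you.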
Displaying a list of all hits doesn’t make sense. Often, you have many near identical hits, and some form of aggregation would really be useful. Google is doing something like this for news, but news lives on a much slower timescale than real-time search.
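One simple way to aggregate near-identical hits is to compare messages by their word sets and group those that overlap strongly. The sketch below uses Jaccard similarity with a greedy grouping pass; the normalization rules, the function names, and the 0.7 threshold are assumptions for illustration, and a real system would need something that scales better than comparing against every existing group.

```python
import re


def tokens(text):
    """Normalize a message into a set of words, ignoring links, RT markers and mentions."""
    text = re.sub(r"https?://\S+", " ", text.lower())  # drop links
    text = re.sub(r"\brt\b|@\w+", " ", text)           # drop "RT" and @mentions
    return frozenset(re.findall(r"[a-z0-9']+", text))


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_hits(messages, threshold=0.7):
    """Greedily put each message into the first group whose representative is similar enough."""
    groups = []  # list of (representative token set, [messages])
    for msg in messages:
        toks = tokens(msg)
        for rep, members in groups:
            if jaccard(rep, toks) >= threshold:
                members.append(msg)
                break
        else:
            groups.append((toks, [msg]))
    return [members for _, members in groups]


groups = group_hits([
    "Google drops real-time search",
    "RT @alice: Google drops real-time search",
    "Google drops real-time search http://example.com/abc",
    "Lunch was great today",
])
```

Here the three variants of the same headline end up in one group, so a result page could show one representative plus a count instead of three almost identical lines.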
Naturally, these are also topics we’re very much interested in at Twimpact. For example, our retweet-based trending and user Twimpact score are a good starting point for getting a better estimate of relevancy (this is currently demoed at our Japanese trending site). We’re also moving to an infrastructure that does most of the analysis in memory to deal with the real-time requirements. This allows us to process literally thousands of messages in real time with relatively modest hardware requirements. You can get a glimpse at beta.twimpact.com.
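To give an idea of what in-memory analysis of this kind can look like, here is a minimal sketch of a sliding-window retweet counter. It is not Twimpact’s actual implementation; the class name, the window length, and the assumption that events arrive in time order are all invented for this example.

```python
from collections import Counter, deque


class SlidingRetweetCounter:
    """Counts retweets per original tweet over the last `window_seconds`, entirely in memory."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.events = deque()    # (timestamp, tweet_id), oldest first
        self.counts = Counter()  # tweet_id -> retweets inside the window

    def add(self, timestamp, tweet_id):
        """Record one retweet; timestamps are assumed to be non-decreasing."""
        self.events.append((timestamp, tweet_id))
        self.counts[tweet_id] += 1
        self._expire(timestamp)

    def _expire(self, now):
        """Drop events that have fallen out of the window and adjust the counts."""
        while self.events and self.events[0][0] <= now - self.window:
            _, old_id = self.events.popleft()
            self.counts[old_id] -= 1
            if self.counts[old_id] == 0:
                del self.counts[old_id]

    def top(self, n=10):
        """Return the n most retweeted tweets in the current window."""
        return self.counts.most_common(n)


counter = SlidingRetweetCounter(window_seconds=60)
counter.add(0, "a")
counter.add(10, "a")
counter.add(20, "b")
counter.add(65, "b")  # the retweet of "a" at t=0 has now expired
trending = counter.top(1)
```

Each update touches only the new event and the few that expire, so memory use is bounded by the number of retweets inside the window rather than by the whole history.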
As far as I remember, Google had plans to incorporate, for example, the social graph of a user to refine the search results, but I don’t know how far that went. Let’s see whether they take the time to create something better. In the interim, a site like Topsy gives you a more comprehensive real-time search feature than Twitter’s own search.
^MB
Some more links on the later state of Google’s real-time search: