Monday 27 April, 2009

Google preventing automated searches



It looks like there has been an increase in the number of automated searches being performed on Google, and Google is not happy about it.

What fascinated me is the Google calculator. I had just started writing a Python script for calculation-related tasks, and what I found is that there are now fewer ways to run automated searches against Google.


Let's have a look at how Google prevents automated searches.

If we open http://www.google.com and search for "5 mm in inches", we get the converted result. A close look at the URL pattern reveals the following.

  • The new search pattern of Google is
    http://www.google.com/#hl=en&q=5+mm+in+inches&btnG=Google+Search&aq=0&oq=5+mm+in+in&fp=CGM4k02K5DI
    The use of a URL fragment (the # anchor, here #hl) instead of a normal query string is interesting.
    Another interesting thing is that the search keyword appears in two parameters, q and oq.
    q is the actual query used to fetch the result, and oq is the query as you typed it. The following image makes this even clearer.


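  • Another interesting thing is that if we try to open the same URL via Python's urllib2 interface, we get the Google home page. This is presumably because everything after the # is a fragment, which the client never sends to the server.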

  • Apart from that, the best-known URL pattern for any search,
    http://www.google.com/search?q=google+search

    returns a 403 Forbidden when accessed via Python's urllib2 interface.
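
To see the q and oq parameters side by side, here is a minimal sketch that splits the URL from the first point into its parts. It is written for Python 3 and uses urllib.parse, which is my assumption (the urllib2 mentioned in this post is Python 2). The function name split_search_url is my own and purely illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# The URL quoted in the first bullet, copied as-is.
URL = ("http://www.google.com/#hl=en&q=5+mm+in+inches&btnG=Google+Search"
       "&aq=0&oq=5+mm+in+in&fp=CGM4k02K5DI")


def split_search_url(url):
    """Return (path, query, fragment_params) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # parse_qs decodes '+' as a space and returns a list per key;
    # keep only the first value of each key for readability.
    params = {key: values[0] for key, values in parse_qs(parts.fragment).items()}
    return parts.path, parts.query, params


path, query, params = split_search_url(URL)
print(repr(path))     # '/'
print(repr(query))    # ''  (nothing before the '#' carries the search)
print(params["q"])    # 5 mm in inches
print(params["oq"])   # 5 mm in in
```

The path is just "/" and the query string is empty: all search parameters live in the fragment, so a plain HTTP client that requests this URL only ever asks the server for the home page.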


So it looks like Google is narrowing the ways people can perform automated searches on its engine.
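
The status code returned for the /search pattern can be checked with a small sketch like the one below. It assumes Python 3's urllib.request and urllib.error (the successors of urllib2). The names build_search_url and fetch_status and the opener parameter are hypothetical names of mine. I show no expected output for a live request, because the response depends on how Google treats the request at that moment.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def build_search_url(query):
    """Build the classic /search URL for a query (hypothetical helper)."""
    # urlencode turns spaces into '+', matching the pattern quoted in the post.
    return "http://www.google.com/search?" + urlencode({"q": query})


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, including error codes such as 403.

    The opener argument is an assumption of this sketch: it lets you swap in
    a different callable for urlopen, for example when testing offline.
    """
    try:
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return err.code


print(build_search_url("google search"))
# http://www.google.com/search?q=google+search
```

Calling fetch_status(build_search_url("google search")) performs the real request. A return value of 403 would match what I saw with urllib2.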


In the next couple of days, I will try to find out whether there are still some holes left and will discuss my findings here in more detail. Follow my blog to make sure you don't miss anything. Comments are most welcome.
