Using Python’s pycurl (cURL) and re (Regular Expression) libraries, it’s possible to write a script that will check the Google ranking of a specific domain for a specific search term.

To check for and install Python 2.4 and the py-curl library on Mac OS X:

Follow these instructions to install MacPorts if it hasn’t been installed yet, then open a new Terminal window and enter the following command to see a listing of all installed ports:

sudo port installed

If ‘python24‘ and ‘py-curl‘ are not listed amongst the installed ports, install them by entering:

sudo port install python24
sudo port install py-curl

To check for and install Python 2.4 and the pycurl library on Ubuntu Linux:

Open a new Terminal window and enter the following command to install Python and the pycurl library (you’ll be notified if they’ve already been installed):

sudo apt-get install python2.4
sudo apt-get install python-pycurl

To run the rankcheck.py script:

Download Geekology’s version of this script here, or copy the code below to create your own rankcheck.py script file:

#!/usr/bin/python
 
"""
 
 This script accepts Domain, Search String and Google Locale arguments, then returns
 which Search String results page for the Google Locale the Domain appears on.
 
 
 Usage example:
 
  rankcheck {domain} {searchstring} {locale}
 
 
 Output example:
 
  rankcheck geekology.co.za 'bash scripting' .co.za
   - The domain 'geekology.co.za' is listed in position 2 (page 1) for the search 'bash+scripting' on google.co.za
 
"""
 
__author__    = "Willem van Zyl (willem@geekology.co.za)"
__version__   = "$Revision: 1.5 $"
__date__      = "$Date: 2009/02/10 12:10:24 $"
__license__   = "GPLv3"
 
import sys, pycurl, re
 
# check if all arguments were specified and whether help was requested:
if len(sys.argv) < 4:
  if len(sys.argv) == 1:
    print "usage: rankcheck DOMAIN SEARCHSTRING LOCALE";
    print "`rankcheck --help' for more information."
    sys.exit()
  elif sys.argv[1] == '--help':
    print "usage: rankcheck DOMAIN SEARCHSTRING LOCALE"
    print "Check the Search String page ranking of a Domain on a specific Google Locale"
    print "\nExample: rankcheck geekology.co.za 'bash scripting' .co.za"
    print "\nReport bugs to <willem@geekology.co.za>."
    sys.exit()
  else:
    print "usage: rankcheck DOMAIN SEARCHSTRING LOCALE";
    print "`rankcheck --help' for more information."
    sys.exit()
 
 
# some initial setup:
USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 6.0)'
FIND_DOMAIN = sys.argv[1]
SEARCH_STRING = sys.argv[2].replace(' ', '+')
LOCALE = sys.argv[3]
 
# check if the locale is valid:
if sys.argv[3] == '.co.za':
  SEARCH_COUNTRY = '&meta=cr%3DcountryZA'
elif sys.argv[3] == '.co.uk':
  SEARCH_COUNTRY = '&meta=cr%3DcountryUK'
elif sys.argv[3] == '.com':
  SEARCH_COUNTRY = ''
else:
  print "Only the '.com', '.co.uk' and '.co.za' locales are allowed."
  sys.exit()
 
ENGINE_URL = 'http://www.google' + LOCALE + '/search?q=' + SEARCH_STRING + SEARCH_COUNTRY
 
 
# define class to store result:
class RankCheck:
  def __init__(self):
    self.contents = ''
 
  def body_callback(self, buf):
    self.contents = self.contents + buf
 
 
# instantiate curl and result objects:
rankRequest = pycurl.Curl()
rankCheck = RankCheck();
 
 
# set up curl:
rankRequest.setopt(pycurl.USERAGENT, USER_AGENT)
rankRequest.setopt(pycurl.FOLLOWLOCATION, 1)
rankRequest.setopt(pycurl.AUTOREFERER, 1)
rankRequest.setopt(pycurl.WRITEFUNCTION, rankCheck.body_callback)
rankRequest.setopt(pycurl.COOKIEFILE, '')
rankRequest.setopt(pycurl.HTTPGET, 1)
rankRequest.setopt(pycurl.REFERER, '')
 
# run curl:
for i in range(0, 10):
  rankRequest.setopt(pycurl.URL, ENGINE_URL + '&start=' + str(i * 10))
  rankRequest.perform()
 
# close curl:
rankRequest.close()
 
 
# collect the search results
html = rankCheck.contents
counter = 0
result = 0
 
url=unicode(r'(<h3 class=r><a href=")((https?):((//))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)')
 
for google_result in re.finditer(url, html):
  # print m.group()
  this_url = google_result.group()
  this_url = this_url[21:]
  counter += 1
 
  google_url_regex = re.compile("((https?):((//))+([\w\d:#@%/;$()~_?\+-=\\\.&])*" + FIND_DOMAIN + "+([\w\d:#@%/;$()~_?\+-=\\\.&])*)")
  google_url_regex_result = google_url_regex.match(this_url)
  if google_url_regex_result:
    result = counter
    break
 
 
# show results
if result == 0:
  print " - The domain '" + FIND_DOMAIN + "' wasn't listed in the first 10 pages for the search '" + SEARCH_STRING + "' on google" + LOCALE
else:
  print " - The domain '" + FIND_DOMAIN + "' is listed in position " + str(result) + " (page " + str((result / 10) + 1) + ") for the search '" + SEARCH_STRING + "' on google" + LOCALE

Open a new Terminal window and navigate to the folder containing the script, then execute it by entering:

python ./rankcheck.py {domain} '{search string}' {locale}

… filling in the Domain, Search String and Locale that you want to check.

Because the Python script file starts with ‘#!/usr/bin/python‘, you’ll be able to execute it from the command line without invoking the python executeable if you set execute permissions on the file:

sudo chmod 744 rankcheck.py
 
./rankcheck.py {domain} '{search string}' {locale}
Share this article: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Twitter
  • GatorPeeps
  • Digg
  • Reddit
  • muti.co.za
  • DZone
  • del.icio.us
  • StumbleUpon
  • Technorati
  • Ma.gnolia
  • Slashdot

Related posts:

  1. Checking your internal and external IP Addresses on a Unix machine
  2. How to regain country-specific searches on Google
  3. Uninstall Google Software Update on Mac OS X
  4. Advanced Google search hacks and tricks
  5. Transfer your business emails to Google Apps with Google Email Uploader for Mac