MediaWiki and

A while back I had a MediaWiki install with the hopes that a wiki could be formed for the Omaha community, by the Omaha community (sound familiar?). Anyway, I never really did much with it, and a few days ago a professor from Creighton contacted me about my domain and pooling resources.

He has created a site which now redirects there. He is having students flesh it out, and I came into the picture to help set up some bots to manage the content.

It turns out there is a cool framework for MediaWiki bots called the "Python Wikipedia Robot Framework," written in Python. I got its scripts working on my machine, then turned my attention to writing a bot that would do a word count on every page and add a stub notice to any page under a given threshold.

I had forgotten how awesome Python is. It really is a good language; I just wish I had call to use it more often. Anyway, here is my Python bot for that framework. You can grab a file version here.

# -*- coding: utf-8  -*-
"""
-----// Stub Adder //------------------------------------------------------
Version: 1.0
Author: John Hobbs
Contact: [email protected]

This bot will iterate through all pages of the wiki and append a generic
stub notice ('{{Stub}}') to them if they do not have one already and have
under a given number of "words" in them.  Words, here, are counted as _any_
series of characters separated by a space.  The default maximum number of
words that the bot will work on is 5, so it is recommended that you pass it
a more realistic value.

Run the bot with no page argument to have your change be done on all pages
of the wiki. If that takes too long to work in one stroke, run:

    python stub_adder.py Pagename

to do all pages starting at Pagename.

There are two command line options:

-dryrun
    This will check and notify you but will not actually change anything.
-words=XX
    This is the word threshold. Replace XX with the biggest wordcount that
    you want the bot to append stubs to.
"""
import wikipedia
import pagegenerators

def workon(page):
    try:
        text = page.get()
    except wikipedia.IsRedirectPage:
        return  # Skip redirects; they never need a stub notice.
    jmh_tokens = text.split(' ')
    if len(jmh_tokens) <= jmh_count and -1 == text.find('Stub}}'):
        text += '\n{{Stub}}'
        if jmh_dryrun:
            print '--// MATCH: [['+page.title()+']] -> Dry Run, No Change //--'
        else:
            page.put(text)
            print '--// MATCH: [['+page.title()+']] -> Stub Added //--'

try:
    start = []
    jmh_dryrun = False
    jmh_count = 5
    for arg in wikipedia.handleArgs():
        if arg.startswith("-words="):
            temp = arg.split('=')
            jmh_count = int(temp[1])
        elif arg.startswith("-dryrun"):
            jmh_dryrun = True
        else:
            start.append(arg)  # Anything else is (part of) the starting page title.
    if start:
        start = " ".join(start)
    else:
        start = "!"  # "!" sorts first, so the bot walks the whole wiki.
    mysite = wikipedia.getSite()
    basicgenerator = pagegenerators.AllpagesPageGenerator(start=start)
    generator = pagegenerators.PreloadingGenerator(basicgenerator)
    for page in generator:
        workon(page)
finally:
    wikipedia.stopme()
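The matching rule itself is easy to try out without a wiki or the framework. Here is a minimal standalone sketch of that check (`needs_stub` is a hypothetical helper I'm using for illustration, not part of the bot):

```python
def needs_stub(text, threshold=5):
    """Mirror the bot's test: page is short and has no stub marker yet."""
    words = text.split(' ')  # "words" are just space-separated tokens
    return len(words) <= threshold and text.find('Stub}}') == -1

# A five-word page with no stub qualifies:
assert needs_stub('Just a short page here.')
# A short page that already carries {{Stub}} is left alone:
assert not needs_stub('Tiny page. {{Stub}}')
# Pages over the threshold are left alone too:
assert not needs_stub('one two three four five six')
```

Note that splitting on a single space means runs of whitespace or newlines inflate the count a little, which is fine here since the threshold is only a rough cutoff.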