Tuesday, February 10, 2015

structuring Python's main()

You would not imagine that such a seemingly trivial topic deserves a blog post of its own. Until you see this:

def main(argv=None):
    if argv is None:
        argv = sys.argv
    opt, args = parser.parse_args(argv[1:])

...and this

if __name__ == "__main__":

repeatedly in the code, and you start thinking this is a pattern somebody defined elsewhere. And indeed it is: find Guido's original article here.

Of the issues the original post addresses, I will focus only on the ones shown above, leaving out the details of argument parsing and of displaying the usage message.

We are after a function that's easy to invoke in other contexts, e.g. from the interactive Python prompt.

Let's add print statements for opt and args and run the code above from the command line:

c:\Temp>c:\Python27\python test.py arg1 arg2
['arg1', 'arg2']

Now let's call it from the interactive shell, expecting the same behavior:

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import test
>>> test.main(['arg1', 'arg2'])
>>> test.main(['arg_that_nobody_cares_of', 'arg1', 'arg2'])
['arg1', 'arg2']
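
To make the asymmetry concrete, here is a minimal sketch of the original pattern. It assumes a bare OptionParser() with no options defined (the post does not show how parser is built), and main() returns args instead of printing them so that the two calls can be compared.

```python
import sys
from optparse import OptionParser

# Assumption: a parser with no options; the real script may define some.
parser = OptionParser()


def main(argv=None):
    # Original pattern: fall back to sys.argv, then drop element 0.
    if argv is None:
        argv = sys.argv
    opt, args = parser.parse_args(argv[1:])
    return args


# Called with only the arguments you care about: the first one is dropped.
lost_first = main(['arg1', 'arg2'])

# Called with a placeholder at index 0: both arguments survive.
with_placeholder = main(['arg_that_nobody_cares_of', 'arg1', 'arg2'])
```

Here lost_first comes out as ['arg2'] and with_placeholder as ['arg1', 'arg2']: the function quietly discards whatever sits at index 0, so a caller has to know about that convention.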

You'll also feel the smell once you try to document it:

def main(argv=None):
    """Parse argv and print the results as parsed by optparse.OptionParser.parse_args(), after dropping argv[0].

    If calling it from another script, make sure the args you care about start at index 1.

    Keyword arguments:
    argv -- list of options and arguments. If None, it defaults to sys.argv.
    """

I would argue that main() should exhibit the same behavior when called the same way from different contexts and expose a consistent interface. Here is the proposal:

def main(opts_args=[]):                      # (1)
    opt, args = parser.parse_args(opts_args) # (2)

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))             # (3)

(1) opts_args defaults to [] instead of None, because calling parser.parse_args(None) actually parses sys.argv[1:] (see optparse.py), and you don't want that when calling main() from another script.
(2) If you want to use the script name in the help message, __file__ should do; you don't need sys.argv[0].
(3) All sys code is in one place.
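
Note (1) is easy to check in isolation. The sketch below assumes a bare OptionParser() and temporarily replaces sys.argv with a made-up command line, then compares what parse_args() returns for None and for [].

```python
import sys
from optparse import OptionParser

# Assumption: a parser with no options defined.
parser = OptionParser()

saved_argv = sys.argv
# Made-up command line, standing in for whatever the caller was started with.
sys.argv = ['test.py', 'from_command_line']
try:
    # None makes optparse fall back to sys.argv[1:].
    _, args_none = parser.parse_args(None)
    # An empty list is parsed as given: no arguments at all.
    _, args_empty = parser.parse_args([])
finally:
    sys.argv = saved_argv
```

With None the parser picks up 'from_command_line' from the surrounding process, while with [] it sees nothing. That is why the default of [] keeps main() independent of how the calling script itself was started.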