Plumbing with Snakes 3: Extending the Trachea

Posting? On Christmas Eve? What sort of a Scrooge operation is he running over there? Don't worry, I prep these well in advance; it's currently the start of November as I write this, and by the time it goes live I'll likely be deep in festivities. If you're wondering about the titles, by the way, the running theme is the process by which a python kills and eats its prey, but each one also refers to the work. Last week, Coils within Coils described how we built a tight stranglehold around our program so it only behaves in well controlled ways. This week we're looking at how to swallow whatever a user might throw at us, so the focus is on validating and sanitizing input, namely the arguments the program was run with, so that we have strict control over what the program can be asked to do. Just bear in mind that the boilerplate is meant to be program agnostic, so the aim here is to ensure the arguments make sense and aren't being passed off as something else. The program itself will still have to do its own specific sanitization, though the existence of this basic level allows it to make assumptions that simplify the process a lot.

So if you've followed along correctly your files should look like this in order of running:

#!/usr/bin/env python3

# run.py
import sys

# Check python version, display error and exit if wrong
try:
    if sys.version_info.major != 3:
        raise RuntimeError("run with wrong Python version")
except RuntimeError as e:
    print("{} (0x01) Program only supports Python 3.".format(type(e).__name__))
    sys.exit(0x01)

import err

# Run main program
if __name__ == "__main__":
    err.main(sys.argv)

#!/usr/bin/env python3

# err.py
import sys
import app

def throw(error: type[BaseException], code: str, message: str) -> None:
    """
    Generic error throwing function for use within the program
    :param error: The type of error to be thrown.
    :param code: The error code to be provided upon exit.
    :param message: The message to be presented to the user.
    """
    try:
        raise error from None
    except error as e:
        print(f"{type(e).__name__} ({code}) {message}")
        sys.exit(int(code, 16))

def main(argv: list[str]) -> None:
    return_code = app.main(argv)
    return

# Throw an error if run directly.
try:
    assert __name__ != "__main__"
except AssertionError:
    throw(RuntimeError, "0x02", "Please run from run.py.")

#!/usr/bin/env python3

# app.py

def main(argv: list[str]) -> int:
    """
    Main function
    :param argv: A list of arguments containing the program name and the user choice.
    :return: An exit code signifying the status upon completion.
    """
    return 0

# Throw an error if run directly.
try:
    assert __name__ != "__main__"
except AssertionError:
    from err import throw
    throw(RuntimeError, "0x02", "Please run from run.py.")

If you give these a quick test you should be able to confirm they do indeed do nothing very quickly, but as frameworks go it's got some nice potential for expansion. To sanitize the input before it is sent to the core program we need to intercept it en route, and the best place to do this is in the err.main function, since this has direct access to the error handler and is handed the input before the program starts. Let's set out some basic parameters. If you read the sys.argv documentation you'll see that it holds the command line arguments as a list of strings. The splitting itself normally happens before Python gets involved, with the shell breaking the command on whitespace while respecting quotes, and this solves a bunch of minor issues straight out of the box. The shell also interprets many special characters itself before the program ever sees them, and there are no issues handling a null input, as argv[0] is always present even if it is only an empty string. There is a temptation to limit arguments to alphanumerics, but sometimes filenames or similar are passed to programs, so it is wise to allow special characters and weed them out where inappropriate within the core program.
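To see the splitting in action without leaving Python, the standard library's shlex.split approximates how a POSIX shell breaks a command line into the strings that end up in sys.argv. This is only an illustration: the command line below is made up, and real shells (and Windows in particular) differ in the details.

```python
import shlex

# A made-up command line, as a user might type it at a POSIX shell prompt.
command_line = 'run.py "my file.txt" --verbose'

# shlex.split follows POSIX-style quoting rules, so the quoted filename
# stays together as a single argument even though it contains a space.
argv = shlex.split(command_line)

print(argv)       # ['run.py', 'my file.txt', '--verbose']
print(len(argv))  # 3
```

The point to take away is that a space inside quotes does not produce an extra argument, which is one reason filenames with special characters need to be allowed through at this level.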

At this stage the main concerns are length based: that the number of arguments passed isn't too high or too low, and that the arguments themselves are a sensible length. Given how much even this depends on the program itself, I tend to take a light touch approach and build a simple, extendable length checking system that can be modified to protect the program. In the general sense the shortest an argument can be is an empty string, which is fine. I leave the upper end open, since character limits can cause unusual behaviour that is better left for someone to configure specifically, though a character limit of 255 would not be unreasonable for most cases. The number of arguments can be similarly difficult to pin down, so I usually set a lower limit which accounts for just the script name and can be increased for mandatory arguments.

def main(argv: list[str]) -> None:
    """
    Core error finding process, handle arguments.
    :param argv: The list of arguments program was run with.
    """
    # Handle arguments and throw an error if there are too many or not enough.
    try:
        assert 0 < len(argv) < 2
    except AssertionError:
        if len(argv) > 1:
            throw(SyntaxError, "0x03", "Undefined argument.")
        else:
            throw(SyntaxError, "0x04", "Please provide an argument.")

    return_code = app.main(argv)
    return

Notice how I've used the fact that the assertion won't fail on a 1, plus a simple condition, to differentiate between too many and too few arguments. In this case you could also write it as assert len(argv) == 1, but that hides the expandability. As described above I've avoided putting in a character limit for the arguments, though this could easily be achieved with a for loop over argv. Other than that it looks like Python has already solved a lot of the input problems for us, but there is still a lot more that can be done with this function. For example, we currently still do nothing with those return codes, not that there are any yet.
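If you did want that character limit, a minimal sketch of the for loop might look like the following. The names MAX_ARG_LENGTH and find_oversized and the error code 0x08 are my own placeholders rather than part of the boilerplate, and 255 is just the figure suggested earlier.

```python
MAX_ARG_LENGTH = 255  # assumed limit; tune it to suit the program


def find_oversized(argv: list[str], limit: int = MAX_ARG_LENGTH) -> list[int]:
    """Return the positions of every argument longer than the limit."""
    oversized = []
    for position, argument in enumerate(argv):
        if len(argument) > limit:
            oversized.append(position)
    return oversized


# Inside err.main this could sit next to the argument count check, e.g.
#     if find_oversized(argv):
#         throw(SyntaxError, "0x08", "Argument too long.")  # 0x08 is hypothetical
print(find_oversized(["run.py", "short", "x" * 300]))  # [2]
```

Returning the positions rather than a simple True or False leaves room to tell the user which argument was the problem.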

There are lots of ways you could arrange your exit codes. You could just come up with a number whenever you think of a new unrecoverable error, you could try to put similar errors together, or you could group them so the first digit defines the general error and the rest is a more specific description of the problem. There's a lot of room for movement, though it is typically standardized that the 0 code is reserved for successful completion, and it pays to use a plain integer return code that maps directly to the exit code for simplicity. The most important thing, though, is that when documenting the program you include information on all the known error codes. The more specific you can be the better, as it will help people to diagnose problems if they know exactly what went wrong; the typical "wrap the whole program in a try-catch" principle will only leave your users lost and confused. So let's add in some method to automatically translate those return codes into error messages.

def main(argv: list[str]) -> None:
    """
    Core error finding process, handle arguments.
    :param argv: The list of arguments program was run with.
    """
    # Handle arguments and throw an error if there are too many or not enough.
    try:
        assert 0 < len(argv) < 2
    except AssertionError:
        if len(argv) > 1:
            throw(SyntaxError, "0x03", "Undefined argument.")
        else:
            throw(SyntaxError, "0x04", "Please provide an argument.")

    return_code = app.main(argv)

    # Handle the main program return codes.
    match return_code:
        case 0:
            print("(0x00) Program completed successfully.")
        case None:
            throw(RuntimeError, "0x07", "Critical program failure.")
        case _:
            throw(RuntimeError, "0x06", "Unknown error occurred.")
    return

So here we've set up an extendable match statement that inspects the return code and responds appropriately. Bear in mind that match needs Python 3.10 or later, which is stricter than the version check in run.py. It's pretty basic in that it only really detects successful completion or a return code that was never set, but you could imagine expanding this to add whatever errors you might need. If you wanted to make it really flexible you could add a guarded case, case num if num > 0, to pick up all the different errors, but you would also need to supply the specific error type and a message to describe it, running hex(return_code) to produce the exit code. To do this, though, you'd need to pass the error and message back along with the return code, which means rewriting the app.main function and implementing a system to fail out of the program entirely, so I'll leave that up to you to figure out. A good alternative could be to simply import the throw function when needed and throw the error in place, as sys.exit() should end the program as a whole. Where this wouldn't be wise is if there are in-flight changes that need to be corrected on exit; those sorts of scenarios require handling a little differently, so it's very much up to you how you do it.

Okay, so what else is there? It looks like there's not much left to do except build a program, but there is one small job left. Most programs run from the command line can be cancelled with a particular key combination, usually Ctrl+C, which is really useful if you've input the wrong information or realised the program is no longer needed. Python surfaces this as a KeyboardInterrupt, and it is relatively easy to handle with a small adjustment to our err.main function:

def main(argv: list[str]) -> None:
    """
    Core error finding process, handle arguments, watch for keyboard interrupts and check return codes.
    :param argv: The list of arguments program was run with.
    """
    # Handle arguments and throw an error if there are too many or not enough.
    try:
        assert 0 < len(argv) < 2
    except AssertionError:
        if len(argv) > 1:
            throw(SyntaxError, "0x03", "Undefined argument.")
        else:
            throw(SyntaxError, "0x04", "Please provide an argument.")

    # Run the main program and listen for keyboard interrupts.
    return_code = None
    try:
        return_code = app.main(argv)
    except KeyboardInterrupt:
        throw(KeyboardInterrupt, "0x05", "Program exited by user.")

    # Handle the main program return codes.
    match return_code:
        case 0:
            print("(0x00) Program completed successfully.")
        case None:
            throw(RuntimeError, "0x07", "Critical program failure.")
        case _:
            throw(RuntimeError, "0x06", "Unknown error occurred.")
    return

Obviously this way the handler won't become active until the initialization has taken place, but my experience is that this happens faster than a human can react anyway. That about concludes everything there is to discuss in this boilerplate code. Hopefully it's given you some insights into good code writing and the supporting infrastructure an application requires. If you want to reuse this code elsewhere, adapt it or develop it, feel free, and have a happy holidays!
