Advent of Code should be active by now, and that's got me thinking about code and how I haven't really covered it here at all. A quick warning first: I don't intend to teach the basics of how to code here. I'm really not the right person to do that, so I'm going to assume at least a reasonable level of ability with code in general and familiarity with Python specifically. If you don't have that, there are many great resources to look into, and Python is one of the easiest languages to learn when getting started with coding.
There are many, many more; these are the ones I've personally worked with, and they're really good at helping you develop the right habits. I also understand that even once you know something like Python, getting started writing your own code is quite difficult. It can feel like it's impossible to break out of the box you've been taught in, as there is an incredible hump to get over: how do you turn this into something useful without guidance? If this is you, you're going to find this hard to hear, but remember I've been where you are now: you are the problem, and you have to just get over this. It's not all that harsh though. The difficulty mostly comes from an inability to think outside the box, because you've never had the opportunity before. I broke through it like so:
I would recommend actually making programs that do these things, and expanding on them, to ensure you properly understand the implications and complexity of what this means. From here many basic-level programmers never really evolve much. There's so much you can do with intuitions like these, and it's very possible to build lots of functional code for all manner of purposes, but this is still very much the domain of the entry-level developer. It's just that for many jobs out there no more sophistication is required. If your heart is set on being a developer though, you will need to push these boundaries further.
How to do this can be difficult to know without some guidance. I think the best way is to look back at what you've done so far for answers on how to develop it. So far what we've talked about is building data structures to manipulate data, but these structures are quite simple, and there is a lot more you can do here. To expand beyond this point, I found one of the greatest tools was a free course with some exceptional teachers: Harvard University's CS50. By following it you will go back over a lot of the basics you know, which is important for building a solid foundation, and in the process you will learn the concepts in a lower-level language (C) and tackle some very complex problems. In solving those problems you will begin to learn about complex data structures, memory management, linked lists, hash tables and trees, even getting a first look at object-oriented programming and lambda functions: what they are and when you should and shouldn't use them. These are the things you will need to be good at to tackle later puzzles in the Advent of Code, and you will be well on your way to becoming a senior developer. If you have been able to master all that so far, the code I'm about to share should be fairly trivial to understand, and will hopefully help you to think carefully about what you want a program to do when you build it, because proper planning is nine tenths of success. If you haven't achieved that yet, it's not that important for this, but hopefully it gives you a roadmap of learning that I wish I had when I started out.
So, you're about to build some sort of application in Python, how do we start? Pretty simply really: let's make a file with a .py extension so we know it's a Python script.
~> touch app.py
touch
is a Linux command that updates the timestamp of a file, or just creates it if it didn't exist. It's the simplest form of editing and is about as close to poking a file as a computer gets. If you then used ls
you'd be able to see the file exists. Alright, so we have our file and it should be readable to Python, what do we put in it? The first thing I always add is what some people call a shebang.
#!/usr/bin/env python3
What this does is define the executable used to run the file. # starts a comment in Python (and in most shells), so the interpreter ignores the line; the ! is the bang, and together the two characters tell the system that what follows is the program to execute this file with. That is the absolute path to env, which then looks up python3 on your PATH. The 3 could be omitted, but I prefer to be explicit since Python 3 is fundamentally incompatible with python2, and that has been getting phased out for longer than I have been coding. So what this simple line means for the user is that the system knows how to execute the code: once the file has been marked executable (chmod +x app.py), I can run ./app.py from the directory it is in, or whatever name you gave it, and it will run, though you could simply use python3 app.py instead, calling the interpreter manually. So what next? Well, you could literally just write code from there and have it run automatically. However, this isn't very proper and can be prone to abuse or cause unexpected behaviour, and we of course intend to be better than that. If you've programmed in C or similar you may have encountered int main(void){} as the entry point you have to define so the compiler doesn't throw a tantrum, and we can do something similar here.
if __name__ == "__main__":
So there's a lot to unpack there. First, it's not a function definition, it's an if statement. Don't worry about that; C throws a tantrum because it looks for a specific function, and if that doesn't exist it doesn't know how to start the program. Python doesn't really care: it just runs the file from top to bottom. What this condition does is ensure the code inside only runs if this file was the one directly executed, rather than imported by another file, preventing unexpected code execution. The interpreter sets __name__ for every module: it is "__main__" for the file being run and the module's own name when the file is imported. Python doesn't really have a proper concept of public versus private variables and methods. Names wrapped in double underscores are ones the language reserves for its own special attributes and methods, so treat them as things you really shouldn't touch unless you know what you're doing. Alright, so let's finish expanding this out and making it a bit more complete.
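If you'd like to see that condition's two outcomes for yourself, here is a minimal sketch. The module name demo_app, the seen_name variable and the observed_names helper are all made up for illustration; the only real machinery is the standard library's runpy and importlib, which I'm using to mimic "run directly" and "imported" from a single script.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Hypothetical one-line module that just records the name it was given.
SOURCE = "seen_name = __name__\n"


def observed_names(module_name: str = "demo_app") -> tuple[str, str]:
    """Return the __name__ seen when run directly, then when imported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / f"{module_name}.py"
        path.write_text(SOURCE)

        # Run the file the way `python3 demo_app.py` would.
        as_script = runpy.run_path(str(path), run_name="__main__")["seen_name"]

        # Load the same file the way an import statement would.
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        return as_script, module.seen_name


print(observed_names())  # ('__main__', 'demo_app')
```

Same file, two different names, which is exactly what the if statement is checking for.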
#!/usr/bin/env python3


# app.py

import sys


def main(argv: list[str]) -> int:
    """
    Main function
    :param argv: A list of arguments containing the program name and the user choice.
    :return: An exit code signifying the status upon completion.
    """
    return 0


if __name__ == "__main__":
    main(sys.argv)
There's a lot more going on now, so let's explain it all. First, there are a couple of blank lines after the shebang. This is mostly for keeping the linter happy; apparently you can get a little weirdness if you don't do this, but I've never really encountered it. Below that we have a comment, simply to help keep track of which file we're in. I find this sort of thing a handy reference point when debugging with many libraries and files. Then we have an import statement, which you should be familiar with, and it imports the sys module. After that we have a function definition, which we'll skip over for a moment (the interpreter doesn't run the body yet either) before we reach our condition. Not a lot new here, but it now calls the main function with the argv list from the sys module, so we know why that was imported. So let's go back and have a look at that main definition.
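To make the shape of that argv list a bit more concrete, here's a small sketch. The describe_args function and the example arguments are my own invention for illustration; the one real fact it leans on is that the first item of sys.argv is the program name and everything after it is what the user typed.

```python
def describe_args(argv: list[str]) -> str:
    """Summarise an argv-style list: program name first, user arguments after."""
    program, *user_args = argv
    if not user_args:
        return f"{program} was run with no arguments"
    return f"{program} was run with: {', '.join(user_args)}"


# In app.py the real call would be describe_args(sys.argv).
print(describe_args(["app.py"]))
print(describe_args(["app.py", "--fast", "input.txt"]))
```

Passing the list in as a parameter, rather than reading sys.argv inside the function, also means you can test the function with any arguments you like.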
def main(argv: list[str]) -> int:
So that first line is full of some weird stuff, namely the : list[str] after argv and the -> int before the final colon. What on earth is that? Those are what you might call type hints or annotations. Declaring types is much more common in other languages; Python doesn't need it at all, and doesn't enforce the hints when the program runs. The reason it's here is that it makes the function much easier to understand, as you know from looking at it what it wants and what you will receive from it, and because of that I'd recommend using it wherever you write code. If you recognised the next section as a multi-line comment you'd be nearly right; in this case it's what is known as a docstring. It is something I learned about through Java, and its purpose is to tell you about the function: what it does, plus a description of its parameters and return values. An interesting note about the way Python returns multiple values is that it uses a tuple, and you may even be aware you can assign multiple variables at once from the output of a function to grab the values directly from that tuple. As such, if you return multiple values you can declare it like so: -> tuple[bool, str] (or Tuple[bool, str] from the typing module on versions before Python 3.9).
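Here's a quick sketch of both ideas together: a hinted function that returns two values, and the unpacking that grabs them. The parse_choice function is hypothetical, I've just made it up as an example, and the lowercase tuple[...] form assumes Python 3.9 or newer.

```python
def parse_choice(raw: str) -> tuple[bool, str]:
    """Return (is_yes, cleaned_text) for a hypothetical yes/no answer."""
    cleaned = raw.strip().lower()
    return cleaned in {"y", "yes"}, cleaned


# Unpack the returned tuple straight into two variables.
is_yes, text = parse_choice("  Yes ")
print(is_yes, text)  # True yes

# The hints are stored on the function, but nothing checks them at run time.
print(parse_choice.__annotations__)
```

If you want the hints actually checked, that's a job for an external tool such as a type checker rather than the interpreter.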
Why do we need a docstring? Well, that brings us to a bit of a contentious issue in coding, but one that we have to tackle if we ever want to be considered a good developer: comments. Let me first start by stating that I loathe comments. Not everyone does, but I do know a lot of developers feel the same; given the choice we'd rather not have them at all. However, there's nothing worse than having to debug code with no documentation or instructions; it's like trying to skewer a fly in the dark with darts. Writing documentation is often even worse, as there aren't many things more dull. Then trying to interpret the documentation and find which bits of code it refers to? Ugh! What a nightmare! All that can be avoided with a few comments, and since they get thrown away when the code is compiled they don't even slow anything down, so what's the harm? The ideas, thoughts and philosophies surrounding comments could fill the Encyclopaedia Britannica. Everyone has their own opinions and companies often enforce their own policies on it, but what's the best way to approach it? Sadly the answer is going to be slightly different for every person, but there does tend to be a lot of common ground. I'd encourage you to look into "Clean Code".
Clean code is simple, obvious, and formulaic to read; because everything is well-ordered and sensibly labelled, the code is often said to be "self-documenting". Now stop right there. You're probably thinking this means good, well-written code should have no comments. False, but when it's done right few will be needed. Be cautious of leaving no comments at all, or of assuming that what the code does is obvious; sometimes a little help is still necessary. Beyond those insights I'd rather leave the explanation of good comment writing to wiser folks than I (see the following links), but in terms of the docstring, because it can literally be used to automatically generate documentation for your code, I would advise you to get used to writing them. They can also take most of the work out of writing comments for you: simply have each function's docstring describe what that function does, and all the processes should remain obvious. If they don't, the need for a new comment will arise, and here's where they get really useful: when that happens it's usually a good sign your function has grown too cumbersome and should be split.
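As a small taste of why docstrings beat ordinary comments for documentation, here's a sketch showing that the text stays attached to the function and can be read back by tools. I'm using inspect.getdoc from the standard library; the doc_lines variable is just my name for the result.

```python
import inspect


def main(argv: list[str]) -> int:
    """
    Main function
    :param argv: A list of arguments containing the program name and the user choice.
    :return: An exit code signifying the status upon completion.
    """
    return 0


# inspect.getdoc() returns the docstring with its indentation tidied up.
doc_lines = inspect.getdoc(main).splitlines()
print(doc_lines[0])  # Main function
```

The built-in help(main) reads the same text, and documentation generators work from it in much the same way.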
Thoughts on comments:
The last line of the function should look fairly familiar: it's a return statement, and just as the declaration states, an integer is returned. Of course, this function doesn't yet do anything, and the entire return could be omitted as it's not used anywhere. Something I learned from working with bash scripts is that programs tend to return a code to the shell when they are complete. These vary for each program, but by convention 0 is a successful execution and anything other than 0 is typically some sort of error. That error can be useful when automating other tools that depend on the exit status, so if you plumb them in to only run when the exit code is 0 you can prevent cascade errors and weird unexpected behaviour. A return statement only works inside a function, so you can't have our condition simply return the 0 and complete the program, as there's nowhere for it to return to; the exit status has to be handed to the shell another way (sys.exit is the standard route), and there are a few wrinkles to that which you'll start to pick up when we go into error handling. By returning a 0 from the function, though, I can pass the information that the program has executed successfully back to the boilerplate, where it can be interpreted and appropriately handled by a function we are yet to create that does set the program's exit code.
Okay, so that covers the basic construction of the boilerplate code to run an app; maybe now we can start filling in some of those blanks. Let's test it first though. Fire it up and sure enough it should just execute without any great issues. Only hang on a sec: we want this thing to be safe, users aren't always the brightest of people, and it's not uncommon for them to run it with Python 2. Our shebang should handle that if the file is run directly, but if the interpreter is called manually the shebang is ignored, so we'd best put in some sort of protection for that. If you have a poke around in the sys module you should find some things that can help. Give this a whirl:
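To show what an exit status looks like from the outside, here's a rough sketch that plays the part of the shell: it starts a child Python process and reads back the code that process finished with. The exit_status_of helper is hypothetical and purely for illustration; it isn't the function this series goes on to build, it just leans on subprocess.run and sys.exit from the standard library.

```python
import subprocess
import sys


def exit_status_of(snippet: str) -> int:
    """Run a snippet in a child Python process and return its exit status."""
    completed = subprocess.run([sys.executable, "-c", snippet])
    return completed.returncode


print(exit_status_of("import sys; sys.exit(0)"))  # 0
print(exit_status_of("import sys; sys.exit(3)"))  # 3
```

A calling script or tool can branch on that number in the same way, which is what makes a meaningful return value from main worth the effort.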
if sys.version_info.major != 3:
    sys.exit(1)
It's a tad rudimentary, but it should do the job. Put it somewhere near the top of the file so it's one of the first things the interpreter sees and hopefully it should work. "Hopefully" is kind of a key word there, because if you try to run it with Python 2 you will probably get something like this. Drat!
~> python2 app.py
  File "app.py", line 10
    def main(argv: list[str]) -> int:
                 ^
SyntaxError: invalid syntax
The problem is python2
compiles the entire file before running any of it, and errors out as soon as it sees something it doesn't understand. In a way you could say that worked, as it errored out rather than ran, but it's not very proper and the error is not at all descriptive. This is bad form: we need to catch Python 2 before it has a chance to choke on a file incompatible with it, so we can raise a meaningful error. This post is getting overly long now, so we'll pick that up in the next one. See you then!