Writing robust software is hard, even if you pick the right tools. There are a lot of things that have to be balanced: implementation speed, code clarity, extensibility and structure, performance, ease of infrastructure implementation, and so, so many other things. This is why, even though Facebook was initially written in PHP, they later extended and modified the runtime so much that it became a language of its own. PHP is known for making it easy to hack a bunch of things together and get a working prototype, but, as the software matures, the ability to cut corners becomes a nuisance.

And as the software gets bigger and bigger, you need to introduce abstraction layers that take care of underlying stuff in a manner that’s easy to reason about (which is another hard problem in and of itself). A wrong abstraction can be much worse than code duplication.

In my opinion, the usability of tools like sh and bash falls off very fast once you need more than a hundred lines of code with logic. One of the major downfalls is that putting code snippets into functions isn’t foolproof: you don’t even get a guarantee that a function receives the required number of arguments out of the box, so you have to check that yourself. The other major downfall is that you can’t return more complex data than a string1.

The way that everything operates on strings also means that the language is very quirky. For loops can break when the data contains unexpected characters; commands can interpret parameters as options; and if you quote something incorrectly, a variable’s value can end up interpreted as a call to another script. A lot of pitfalls.
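
Two of those pitfalls, sketched in a few lines (the file names here are made up, but the behaviour isn’t):

file='my notes.txt'
rm $file         # unquoted: word-splits into rm 'my' 'notes.txt'
rm "$file"       # quoted: removes the single file 'my notes.txt'

pattern='-v'
grep $pattern log.txt        # grep parses -v as an option, not a pattern
grep -- "$pattern" log.txt   # -- ends option parsing; quotes stop splitting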

Combining the above shows that if you pick shell scripts to get things done, everything may suddenly become harder. This is exactly why I wanted to try it out. The idea had been bouncing around my head for a very long time, but I wanted to build something that I would actually use over a longer period, to see the bugs that would arise with prolonged usage. Because of that, I struggled for some time to find the best candidate. Until I started blogging.

Rewriting blogit

blogit is a wonderful piece of infrastructure showing the power of *NIX; if you have the right mindset, you can use the available tools for many different purposes. Using make to convert a bunch of .md files into a functioning blog with RSS and Atom feeds definitely isn’t something that Stuart Feldman had in mind when he wrote it. But it’s perfectly capable of doing that.

Unfortunately for me, I don’t maintain any makefiles, so using and extending this one wasn’t easy. I changed a few small things, but I soon hit a wall. I could keep modifying the existing compilation procedure, but without a proper understanding of makefiles, I couldn’t do so efficiently. Alternatively, there was the rewrite option.

I quickly realized that I could finally start this experiment of using shell scripting to its full capacity and beyond. Thus, I started thinking about how I could achieve the same things that the current makefile was doing, and more. After a couple of days of planning, I had a brief outline of the script architecture in my mind.

The only thing that was left was to start writing.

I won’t describe the process of developing the script, since it followed the usual rule of software development: some ideas were excellent, some needed a bit of work, and some I had to rethink from scratch. It took me about a week to write something that works and has more features than the initial makefile had, all while providing a much more readable codebase. At the time of writing this article, I have a total of 916 lines of pure sh code that does what I want. There’s no documentation yet, but the code-writing part is more or less finished.

Lessons learned

Before I start writing more, I have a confession to make - I kind-of lied to you; the whole part about why using sh is a bad idea reflects my opinions from before this experiment. I formed those mostly when I was just starting to play around with programming and tried to build a few things with bash. Since then, I learned a lot about building good programs, so some of those opinions changed, and some didn’t.

Tooling

Available tooling is a very important part of choosing a technology. You can have the best developers and equipment, but people are fallible and make mistakes. Machines (usually) don’t. If you can programmatically check both the code’s syntax validity and whether it performs the right job, you can ensure the code doesn’t break and works in different environments.

Let’s see how this tooling looks in the world of shell scripts.

Linting

Linting the code is a very obvious concept for many programmers. But a lot of people seem very surprised that shell scripting has a tool like that too. It’s shellcheck, and it’s invaluable when writing anything more than a pipeline.

I discovered it around two, maybe three years ago, and since then writing scripts has become a lot easier. No more weird behaviours that I don’t understand - if I write anything that will bite my sorry ass, I get a warning. I can choose to ignore those warnings, but I only do so when I absolutely need to.

Writing blog.sh was the first time I tried to create something bigger with shell since I discovered shellcheck, and I have to say that it covers most of the pitfalls that you can encounter. By default, it warns you about every unquoted variable where quotes are preferred. It also tells you about subtler things, like how to properly write a loop that reads from a file, where a standard for loop may not work.
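
A sketch of what that warning is about, with a hypothetical posts.txt:

# a naive loop word-splits, so a file name like 'my post.md'
# becomes two iterations:
for post in $(cat posts.txt); do
    echo "$post"
done

# shellcheck (SC2013) points you to reading line by line instead;
# IFS= and -r keep whitespace and backslashes intact:
while IFS= read -r post; do
    echo "$post"
done < posts.txt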

Overall, a great tool that provides a lot of good tips.

Debugger

Another very important thing is a debugger. No matter the language or framework, I always try to get one working. The ability to inspect the code as it is running, call functions to see their output, set up watchers and breakpoints, and do other interesting things is crucial to removing bugs from the code reliably. It is also very helpful in understanding codebases that have no documentation. Whenever I use some new tool, I often step into its code just to see what happens under the hood. Even if that only gives me a level of understanding that dips a little below the surface, it is often enough to make me much more productive.

And one year ago, I learned that there is a thing called bashdb, which is a debugger for bash. It doesn’t work with sh as far as I can tell, so I can’t verify how useful it would be for blog.sh, but I’ve used it a few times already, and it’s good.
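
From my memory, a session looks roughly like this - the commands mirror gdb’s (the script name is made up):

bashdb ./myscript.sh    # start the script under the debugger
# then, at the bashdb prompt:
#   break 42            set a breakpoint at line 42
#   continue            run until the breakpoint is hit
#   print $some_var     inspect a variable at that point
#   step                execute the next statement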

Testing

Automatic testing is a huge leg up in software development. Having a way to determine whether portions of the code do what they are supposed to do allows you to make changes and add new things with confidence. It turns out that shell can have that kind of guarantee too2; there are a couple of different programs that provide this capability, like bats-core or shunit2. I only discovered them while writing this article, so I can’t say much about them, but Nikita Sobolev wrote a quite in-depth analysis of a few of them, so you can check that out.
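
To give you a taste, a bats-core test file is just shell with a @test annotation; the slugify function and its file here are hypothetical, not something from blog.sh:

#!/usr/bin/env bats

@test "slugify turns a title into a file name" {
    . ./lib/slugify.sh
    result="$(slugify 'Hello, World!')"
    [ "$result" = 'hello-world' ]
}

Each @test body is an ordinary shell snippet, and a non-zero exit status marks the test as failed, so plain [ ] comparisons double as assertions.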

I’ll definitely try to include some tests as a part of blog.sh, as I already expect this to serve as a great way to maintain quality.

Coding

Aside from everything outside the language that helps you develop, a very crucial thing is the language itself. Even if Brainfuck had the best tooling in the programming world, most people would probably choose something else to develop their code in, since the language itself is a huge pain.

Shells ease development in a few ways, but how helpful are those features?

Submodules

You can easily split your code into a few files; you just need to tell the shell to read them. This alone is a huge help; getting around a file that’s over nine hundred lines long is a huge cognitive burden in and of itself. Unfortunately, sh lacks a way to put different files into namespaces. bash is a bit better in that regard, since you can use : in function names, allowing identifiers like namespace:path:function, but it is still a hack and not a native feature. It won’t pick up that you’re currently in namespace:path, so all calls have to use full names. Because of that, you either bother with namespacing and write full names everywhere, or you just make sure that no names overlap.
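
In practice it looks like this (the file and function names are made up):

# main.sh
. ./lib/feed.sh     # the dot command runs the file in the current shell
. ./lib/posts.sh

feed_generate_rss   # prefixes like feed_ and posts_ stand in for namespaces
posts_render_all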

Another shortcoming of this lack of standard modules is that you can’t really keep anything encapsulated in its own world. Let’s say that you want to create a module that, for some reason, needs to track the number of times it was used. You can create a variable that stores the count, but there is no way to ensure that only this module can modify it.
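
A sketch of that problem with a hypothetical counter module:

# counter.sh - the module's state has to live in a global
counter_uses=0

counter_bump() {
    counter_uses=$((counter_uses + 1))
}

# nothing stops code elsewhere in the project from doing this:
counter_uses=9000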

Using variables

By default, variables that you define have global scope. This:

one() {
    one='first variable'    # no local keyword, so this is a global
    two
}

two() {
    echo "$zero $one"
}
zero='zeroth variable'
one

will output zeroth variable first variable. This ability to reach through different parts of code is very powerful and might be tempting, but using global variables is a sure way to get yourself into a lot of trouble down the line. It also means that this:

one() {
    one='first variable'
    two
    echo "$zero $one"
}

two() {
    zero='changed variable'
    one='another changed variable'
}
zero='zeroth variable'
one

will result in changed variable another changed variable being printed to the terminal. Because of that, naming variables becomes a game of searching the whole project for duplicates. If you want to avoid that, both sh and bash give you the local keyword to modify a variable’s scope. Modifying the example above:

one() {
    local one='first variable'
    two
    echo "$zero $one"
}

two() {
    local zero='changed variable'
    local one='another changed variable'
}
zero='zeroth variable'
one

results in zeroth variable first variable. This is nice, but unfortunately the default behaviour can give you some grief if you forget to scope the variable. Perhaps this could be an optional lint rule in shellcheck?

Thanks Benutzername for pointing out that local exists ;)

Using functions

Using functions is the bread and butter of programming. Shell has those too, but with a caveat: they are treated like commands and scripts, not like functions. This means that you can’t declare required arguments; a function just receives whatever was passed as the positional parameters $1, $2, and so on. This forces you to check argument validity and existence yourself, every time. Let’s take this example:

test_1() {
    echo "'$1'"
}

test_1 'test_1'
test_1

will result in

'test_1'
''

being printed. There is no way to force the shell to require an argument. If you want that, you need to check it manually and error out. A really unhandy way to do things, and, in my opinion, one of the main reasons that shell code can’t scale well. Before you get to two hundred lines of code, you’ll be fed up with using $1 and $2 everywhere.
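
The most compact way I know to do that manual check is the ${1:?message} parameter expansion, which prints the message and aborts the script when the parameter is unset or empty; a sketch with a made-up function:

greet() {
    # ${1:?msg} aborts a non-interactive shell if $1 is missing
    : "${1:?greet: expected a name argument}"
    echo "Hello, $1"
}

greet 'world'   # prints: Hello, world
greet           # errors out and exits with a non-zero status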

Handling more complex data

While programming different things, I often want or need more complex data structures. Most of the time it’s just very convenient to have a way of associating different things using one variable. Structs, objects, hashmaps - all of them are very useful for that. Shell just doesn’t have them (bash 4 added associative arrays, but sh has nothing of the sort). If you only need a thing inside a loop, you can stuff values into different variables and call it a day. But if you want things available throughout the whole codebase, you have to either use global variables or cache values in files. The second concept is an interesting one, but I don’t think it would scale very well. Your users may also complain that you write to disk too often, which may be a problem for SSDs.
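
The file-cache idea fits in a few lines - every “field” becomes a file under a cache directory (the names here are hypothetical, not how blog.sh does it):

cache_dir="${TMPDIR:-/tmp}/blog.sh-cache"
mkdir -p "$cache_dir"

cache_set() { printf '%s' "$2" > "$cache_dir/$1"; }
cache_get() { cat "$cache_dir/$1" 2>/dev/null; }

cache_set 'post_count' '42'
cache_get 'post_count'   # prints: 42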

A summary

I have to say that using shell to write blog.sh wasn’t hard, but it wasn’t straightforward either. While I was able to resolve each and every code issue within a couple of minutes, and architectural issues with an hour or two of thinking, I certainly don’t recommend that anyone write something this big with this technology. The arbitrary limit of a hundred lines of complex code still somewhat stands; you can stretch it to maybe two hundred, or even three hundred if you’re feeling brave, but I’d just use Python at that point.

The available tooling alleviates some of the issues that you may encounter, but, as I have said - you won’t fix the core language issues with linters and debuggers.


  1. Unless you want to return JSON strings and parse them with jq, which is a gross idea, and you’re still returning strings ¯\_(ツ)_/¯↩︎

  2. Although I think that if you’re writing something this complex and frequently changing, you really should just use a different language, unless you want to experiment like me. Really, shell wasn’t meant to lift such heavy weights.↩︎

Tags: english technical programming blogsh shell

Posted on: 2021-05-02, last edited on: 2021-11-11, written by: Kavelach