Assembling MicroContainers

TLDR: Building the smallest possible hello-world Docker container.

Introduction
From Scratch
Multi-phase Assault
Librarian's Revenge
Assembling Elfs
And We're Done

Introduction

It's been about a year since I posted properly, and I'm still pretty busy, but I'd like to get back into this, so I went out and found an old project from a couple of years ago that might interest some of you. Word of warning: it gets a little frenetic and technical towards the end, but I'll let old me do the talking:

So back when I first started playing with automation, I started like many do with a hello-world docker container image. It's a simple push-the-button-and-away-it-goes type of thing, dull compared to what I do now. After that I figured that to learn more I'd go simple and build up: start from a basic container and develop it into a really complex system. I quickly discovered that in docker the difficulty scale works the other way around. The complex is just a few commands, but simple, oh boy, you better get a PhD in software. Obviously, I put the idea down after countless failed attempts and much confusion, but I've revisited it a few times since then, and it became really fascinating.

From Scratch

So to build simple, the obvious thing is you need a basic container, and the most basic of basic is the one everything was built from, the FROM scratch container, which is literally just empty, with no OS, no files, not even a shell. Just the host's kernel, which every container shares, and a container full of tumbleweeds, and frankly like that it is the epitome of uselessness. But the hello-world docker container image is built on this, and if you read their docs they'll tell you all you need is a standalone binary program. They aren't wrong, but it is very easy to misinterpret what that means.

So the program sounds like the complicated bit, you have to build something, haha! Nope. Building something is easy; it's those other two words that are the problem. Binary means you have to compile it down to machine code, and standalone means it can't rely on anything else. Hmm, well, we'll get back to that in a minute. So, compiling stuff: it's not exactly Python but it's not exactly that complex either, all you need is a compiler. See, there's the issue, a compiler is a thing you are relying on. It doesn't exist in an empty container. Okay, so we'll just install one, good idea, except how do you do that with no shell? So you google around and find out that scratch containers copy in tarballs from outside and extract them to lay down a filesystem, and that's how they build something. This is actually how operating systems typically start, in a process called bootstrapping. It works and you can do that; it's not exactly easy but it's doable.

Only once you've got your compiler and you compile your program (and let's face it, by this point you've already gone off to find out how makefiles work because it's just easier), you take a look at your container and see how chunky it is. It's hardly just a binary anymore. I mean, the thing is a pig to load, and you wonder, surely there must be a better way to do this? So you google compiler and container, and discover, huh, neat, GCC has a container just for this and it's way smaller than yours. So you switch to that instead and it is more functional, but you start having nagging doubts: this is way more than just a scratch container, this is effectively cheating. So you go away to ponder what to do.

Multi-phase Assault

You come back much later, after learning about multi-stage builds, eager to have another crack at it. You do your compiling in your GCC container, then copy the binary across to your scratch container. Now we're getting somewhere, or so you think...

So you fire up your compose file, which you now have to simplify the constant container rebuilds, and it happily compiles, copies over the new file, and flat fails. It gives a reason though, just a confusing one: it says it can't find the file. So you try it a bunch of different ways but nothing works, and you start googling. You don't find much, but it does pop up in a lot of blog posts about making the smallest possible docker image. You begin to realize this is essentially what you are aiming for, so you have a good read, and here is where the story gets interesting.

Interesting stuff so you dig a bit deeper, maybe try emulating what they did, it still doesn't work, but you come across some really interesting stuff:

No answers yet but you don't want to stop cos you're learning some cool tricks for making really small programs, then bam!

Librarian's Revenge

https://bxbrenden.github.io/

It takes a while to process but the answer is in there: it turns out libraries are more than just a thing you import. I had assumed somewhere in the back of my head that they were like program files stored on some website somewhere that got pulled when required. When I thought about it I realized that theory has more than a few holes, but I was almost right: they're installed locally. Call a library without installing it somewhere for the program to find and it'll just fail, so you can't just copy binaries across and expect them to work. The only reason they ever do is because either the program bundles up the libraries for you or the libraries also exist on the other machine. And that's how you make the binary too, just calling library after library until you find the one that converts what you have into ones and zeroes.

Now of course the trick here is that we're talking about Rust, and it has static libraries, a brilliant idea. It makes it much simpler to move programs, with a massive headache on the language development side. But I'm working in C, and yeah, I called stdio.h, so I have to rationalize that somehow, and C isn't built like Rust. Simple solution? Drop to assembly. So I do that and many other things besides, and guess what... it still doesn't work. You check with ldd and it reports static libraries, which is fine, so what's the problem? And further down the rabbit hole we go...

https://blog.oddbit.com/post/2015-02-05-creating-minimal-docker-images/

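Enter into the ring the Linux dynamic runtime loader. This is, as the name suggests, a loader that dynamically links at run time, woo! The binary names the loader it wants, and if that loader isn't in the container, the binary won't start. It turned out I needed another command to spot that library.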

https://eli.thegreenplace.net/2011/08/25/load-time-relocation-of-shared-libraries
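
I didn't note down which command that was, so purely as an illustration, here is a C sketch that looks for the same thing by hand. It scans a 64-bit ELF image for a PT_INTERP program header, which is where a binary names the loader it wants. The names elf64_interp, Ehdr64 and Phdr64 are made up for this example, and it assumes a little-endian host and the standard ELF64 layout:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PT_INTERP_TYPE 3u   /* program header type that names the loader */

/* ELF64 file header, 64 bytes (hypothetical local copy of the layout). */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Ehdr64;

/* ELF64 program header, 56 bytes. */
typedef struct {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr;
    uint64_t p_filesz, p_memsz, p_align;
} Phdr64;

/* Return the loader path stored in the image, or NULL if there is none
   (or if the image is too short or malformed to tell). */
const char *elf64_interp(const unsigned char *image, size_t size)
{
    Ehdr64 eh;
    if (size < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0)
        return NULL;
    if (eh.e_phoff > size || eh.e_phentsize < sizeof(Phdr64))
        return NULL;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        uint64_t off = eh.e_phoff + (uint64_t)i * eh.e_phentsize;
        Phdr64 ph;
        if (off > size || size - off < sizeof ph)
            return NULL;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_INTERP_TYPE)
            continue;
        /* The path must lie inside the image and be NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > size ||
            size - ph.p_offset < ph.p_filesz)
            return NULL;
        if (image[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(image + ph.p_offset);
    }
    return NULL;   /* no PT_INTERP: nothing for a loader to do */
}
```

Read a binary into a buffer and pass it in. A non-NULL result is the path of a loader that would have to exist inside the container, and a FROM scratch container doesn't have one.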

Assembling Elfs

Next came the process of trying to find the best way to extricate that bundle of joy from the program so it statically links and the program will run. That's a lot easier said than done, and I haven't even covered having to adapt from nasm x86-i386 to nasm x86-64; here's a hint, change the e's to r's in the registers. Clearly, scratch containers are not for the weak-willed, and you will learn a lot about ELF files. For the record, for anyone that gets this far, compiling with the -static flag should also work. Digging further into it I found this and started writing the binaries myself to sidestep all the garbage the toolchain adds.

https://tuket.github.io/notes/asm/elf64_hello_world/

Here is how to make a tiny program in x86-64 assembly (NASM), with the ELF header written out by hand:

BITS 64
org  0x08048000

ehdr:                                                 ; Elf64_Ehdr
              db      0x7F, "ELF"                     ;   e_ident
     times 12 db      0
              dw      2                               ;   e_type
              dw      62                              ;   e_machine
      times 4 db      0
              dq      _start                          ;   e_entry
              dq      phdr - $$                       ;   e_phoff
     times 14 db      0
              dw      phdrsize                        ;   e_phentsize
phdr:                                                 ; Elf64_Phdr
              dd      1                               ;   e_phnum ; p_type
              dd      7                               ;   p_flags
              dq      0                               ;   p_offset
              dq      $$                              ;   p_vaddr
              dq      $$                              ;   p_paddr
              dq      filesize                        ;   p_filesz
              dq      filesize                        ;   p_memsz
              dq      0x1000                          ;   p_align

phdrsize      equ     $ - phdr

; your program here

filesize      equ       $ - $$
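
If the offsets in that listing are hard to follow, here is a rough C sketch that writes the same header bytes into a buffer, one field at a time. The function and macro names are invented for this example and it only builds the header, not the program. The thing to notice is that the program header starts at byte 56, on top of the last 8 bytes of the ELF header, so its p_type of 1 doubles as e_phnum and the whole lot fits in 112 bytes:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOAD_ADDR 0x08048000u              /* same value as the org line  */
#define PHDR_OFF  56u                      /* phdr starts at e_phnum      */
#define PHDR_SIZE 56u                      /* size of one Elf64_Phdr      */
#define HDR_SIZE  (PHDR_OFF + PHDR_SIZE)   /* 112 bytes of header overall */

/* Store the low n bytes of v at p, least significant byte first. */
static void put_le(unsigned char *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Fill buf with the overlapped ELF + program header for a file that is
   file_size bytes long in total. Returns the header size, or 0 if buf
   is too small. */
size_t build_overlapped_header(unsigned char *buf, size_t cap,
                               uint64_t file_size)
{
    if (cap < HDR_SIZE)
        return 0;
    memset(buf, 0, HDR_SIZE);

    memcpy(buf, "\x7f" "ELF", 4);               /* e_ident, rest stays 0 */
    put_le(buf + 16, 2, 2);                     /* e_type: executable    */
    put_le(buf + 18, 62, 2);                    /* e_machine: x86-64     */
    put_le(buf + 24, LOAD_ADDR + HDR_SIZE, 8);  /* e_entry: code follows */
    put_le(buf + 32, PHDR_OFF, 8);              /* e_phoff               */
    put_le(buf + 54, PHDR_SIZE, 2);             /* e_phentsize           */

    put_le(buf + 56, 1, 4);          /* p_type = load; also e_phnum = 1  */
    put_le(buf + 60, 7, 4);          /* p_flags: read, write, execute    */
    put_le(buf + 64, 0, 8);          /* p_offset                         */
    put_le(buf + 72, LOAD_ADDR, 8);  /* p_vaddr                          */
    put_le(buf + 80, LOAD_ADDR, 8);  /* p_paddr                          */
    put_le(buf + 88, file_size, 8);  /* p_filesz                         */
    put_le(buf + 96, file_size, 8);  /* p_memsz                          */
    put_le(buf + 104, 0x1000, 8);    /* p_align                          */
    return HDR_SIZE;
}
```

Calling build_overlapped_header(buf, sizeof buf, 166) on a buffer of at least 112 bytes returns 112, and whatever code you append after those bytes is where e_entry points.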

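And here is a simple hello world program in assembly. It goes through the legacy int 0x80 interface, so the syscall numbers are the 32-bit ones (4 for write, 1 for exit) rather than the ones the 64-bit syscall instruction uses: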

  SECTION .text
_start:
  mov rax,4                     ; syscall 4 write
  xor rbx,rbx                   ; set rbx to 0
  inc rbx                       ; increment to 1, to stdout
  mov rcx,msg                   ; message
  mov rdx,len                   ; length
  int 0x80                      ; call

  xor rax,rax                   ; set rax to 0
  inc rax                       ; increment to 1, syscall 1 exit
  mov bl,0                      ; exit code
  int 0x80                      ; call

  SECTION .data
msg:  db "Hello, World!", 10
len:  equ $ - msg
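
For comparison with where I started in C, this is roughly the closest you get to that listing without touching stdio.h. It's only a sketch: hello_write is a name I made up for it, and it still goes through libc's write wrapper, so unlike the assembly it needs -static (or a libc in the container) to run FROM scratch:

```c
#include <unistd.h>

static const char msg[] = "Hello, World!\n";

/* Write the greeting straight to a file descriptor with write(2),
   skipping stdio entirely. Returns the number of bytes written,
   or -1 on error. */
long hello_write(int fd)
{
    return (long)write(fd, msg, sizeof msg - 1);
}
```

Calling hello_write(1) sends the message to stdout, the same job as the first int 0x80 in the assembly.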

And We're Done

The binary is 166 bytes, the container is 11,264 bytes, and compressed it comes in at 1,180 bytes. It's built from scratch around a standalone, entirely statically linked binary written in assembly, and it prints "Hello, World!" when run. docker-compose takes 25s to run, including pulling the image required. A standard docker load will import it and docker run will start it; here is the container compressed as a txz:

hello-docker-image.tar

So that's another weird project finished.

Additionally, for tackling it from a Windows perspective, there is this:

https://gpfault.net/posts/asm-tut-0.txt.html

I was also able to write a hello world that was 4 bytes shorter by leaning heavily on logical operators, incrementers and only writing the bits of the register you need. The reason is size: those instructions have shorter encodings than loading a full-width immediate into a register, and the fewer bits you write, the fewer bytes the instruction takes. Hopefully you were able to follow all that and learn something interesting.
