Docker, Jails and All That Jazz

Different means of isolating applications and processes have existed for decades, starting from chroot, through FreeBSD jails and Solaris zones, to the most recent invention – Docker. The point is that all of those technologies serve slightly different purposes, even though people insist on arranging them into a linear, progressing hierarchy. For instance, chroots (or chroot jails) are meant to restrict the trapped user to a sub-directory, ideally isolated from the root of the host filesystem. It is a common, lightweight approach to isolating file access in FTP server daemons since it is fairly easy to implement. However, it lacks certain abstractions available in more advanced tools like jails and zones. So far, I have used chroots, Docker and FreeBSD jails, and I would like to share some key observations I made regarding these technologies.

Traditional chroot jails are extremely basic, yet useful partly for that very reason. If all you want is a nested directory hierarchy for building software packages or similar purposes, chroots are good enough. Unfortunately, they do not prevent a privileged process from regaining access to the host environment via a second chroot call, nor do they limit resource consumption by the chroot’ed application. Therefore, care should be taken when using chroots in an Internet-facing environment. The FTP context is moderately safe since a chroot-trapped user is limited by the FTP protocol. In general, chroots are most useful when they constitute a subsystem in a larger, more complex tool.

FreeBSD jails were developed as an improvement to the venerable UNIX chroot technology. In practice, they contain a complete FreeBSD userland, fully transparent to the host system, yet restricted by the host system kernel from within. From the outside they’re no different from an unpacked system tarball or disc image. The great thing about jails, though, is that they’re an integral part of FreeBSD and as such can be easily managed in a multitude of ways. Resource handling? Easy via rctl. Installing packages? The pkg package manager has a “-j” flag specifically for that purpose. What about enabling and starting/stopping daemons? Again, service has a “-j” flag and so does sysrc, which controls lines in /etc/rc.conf. The only tricky bit is network management, but this too can be easily scripted due to the modular nature of FreeBSD. The main issue with jails is that they’re a FreeBSD-only technology. This unfortunately restricts adoption, but so would any other technology requiring an intimate relationship with the operating system.
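To make the host-side workflow above a bit more concrete, here is a minimal sketch in Python that only assembles the relevant command lines and prints them; nothing is executed. The jail name "www", the nginx package and service, and the 1g memory limit are hypothetical placeholders, and the rctl rule assumes that resource accounting is enabled on the host.

```python
import shlex


def jail_commands(jail, package, service, memory_limit):
    """Return host-side commands for managing one jail (nothing is executed)."""
    return [
        # Install a package inside the jail from the host.
        ["pkg", "-j", jail, "install", "-y", package],
        # Enable the daemon in the jail's /etc/rc.conf.
        ["sysrc", "-j", jail, f"{service}_enable=YES"],
        # Start the daemon inside the jail.
        ["service", "-j", jail, service, "start"],
        # Add a resource rule: deny allocations above the memory limit.
        ["rctl", "-a", f"jail:{jail}:memoryuse:deny={memory_limit}"],
    ]


if __name__ == "__main__":
    # "www", "nginx" and "1g" are illustrative values, not from the article.
    for argv in jail_commands("www", "nginx", "nginx", "1g"):
        print(shlex.join(argv))
```

Keeping the commands as plain argument lists makes it easy to review them first and hand them to a scripting or configuration management tool later.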

Docker is a lot more modern and, similarly to FreeBSD jails, relies on features specific to a single platform, namely Linux. It leverages kernel namespaces for isolation, cgroups for resource limits and a union filesystem (overlay2, previously aufs) for managing the underlying disk volume. Compared to FreeBSD jails, Docker containers are a lot more abstracted and communication is typically carried out through the Docker daemon. In the case of Docker volumes, I find it slightly distressing that the usual way to reach the data is via the daemon. Another thing that differentiates Docker from FreeBSD jails is the default access level. The root user in a Docker container has absolute control over the data in the container and, on top of that, can perform network-related actions which should be reserved for the host system. The biggest issue, however, arises when mounting host directories in the container. Unless user namespaces are remapped, root inside the container counts as root on the host as far as file ownership goes, so a container user who becomes root has root access to the mounted host directories. Frankly, FreeBSD jails entail similar risks, hence the official warning in the Handbook:

Important:

Jails are a powerful tool, but they are not a security panacea. While it is not possible for a jailed process to break out on its own, there are several ways in which an unprivileged user outside the jail can cooperate with a privileged user inside the jail to obtain elevated privileges in the host environment.

Most of these attacks can be mitigated by ensuring that the jail root is not accessible to unprivileged users in the host environment. As a general rule, untrusted users with privileged access to a jail should not be given access to the host environment.

 

Despite all that, and although it is the most recent of the aforementioned technologies, Docker quickly gained significant appeal (aka hype) and is now both widely used and integrated with many open-source tools and platforms like OpenStack, Cockpit and GitLab. In fact, nowadays it is very popular to provide ready-to-use Docker images as part of the standard software package distribution. Security, as usual, is an afterthought since spinning up Docker containers is so trivial that it should be illegal.

The question still stands, though – Why are some technologies considered superior to others? From this competition I would immediately exclude chroot, since it is more of a tool for developing higher-level isolation concepts. However, what about Docker and FreeBSD jails? Why is Docker considered better? A general problematic trend I see in software development is trying to make tools do things they were not designed to do. Yes, oftentimes the product is successful and we’re amazed by the tool’s robustness. Alas, when it fails, we complain about its uselessness. That’s rather unfair, no? Jails are an extension of the chroot concept and are meant to hold long-running processes like database servers, Web servers, etc. You create a ZFS sub-volume, unpack a complete FreeBSD userland into it and route it to the Internet via the host packet filter. Thanks to PF’s design, it is easy to control TCP/IP traffic down to a single packet. The ZFS sub-volume can be selectively backed up via per-volume snapshots, stored on another ZFS array or re-deployed at will. Hell, if we want, we can make it transient and completely obliterate it after the application has finished running, a la Docker. The FreeBSD package building infrastructure, poudriere, makes heavy use of semi-transient jails. All of this sounds so trivial, n’est-ce pas? Yet, the adoption compared to Docker is simply not there, because Docker was heavily hyped and demonstrated to be easier for the average user than figuring out FreeBSD + jails.

This is where Docker wins – widespread integration and adoption. It is theoretically much easier (and faster!) for an average Joe to roll out a Docker container from an official pre-built image. Basically, almost any server appliance can be deployed as a Docker container. That doesn’t mean that Docker is necessarily better. Applications won’t work equally well on all container images due to differences in configuration and library versioning (say, CentOS vs Debian). More complex setups still require customization, and writing Dockerfiles is no easier or less clunky than deploying FreeBSD jails. What is more, after some tinkering, I have to admit that setting up a FreeBSD jail with iocage was far easier than finding my way around Docker. The file system transparency inherent to jails is a big boon here. Regardless, configuration management tools like Chef or Ansible will help equally well in both cases. Furthermore, Docker was not designed for permanent storage from the get-go, but rather received this feature much later. Docker volumes are quite stable, but I still find it more convenient to create a ZFS volume and share it with the container rather than play around with Docker volumes. Again, this approach is much more transparent. Nevertheless, thanks to the overall hype, Docker has already been tested and used in production many times over.

In conclusion, both Docker and FreeBSD jails have their respective niches in the IT ecosystem. For quick testing and deployment of lightweight applications (simple Python daemons, etc.) – Docker. For full-blown processes like a PostgreSQL or MySQL server, which require a dedicated IP or a mix of daemons – jails. If jails are not available, there is always KVM or a similar hardware virtualization platform, which likewise provides a complete system environment. The important thing is to understand how to choose the tool most suitable for a given use case. There is no better or best, after all.

DevOps Oversimplified

To begin with, DevOps is one of those newfangled expressions in the IT ecosystem which are currently hip and trendy, but cause a grimace on the faces of older system administrators. Before, we relied on cron and the flimsy magic of throwaway shell scripts to manage system daemons. Now, we have Ansible, Docker and other sexy new tools which bridge the gap between software development and deployment. I still have some reservations towards Docker, though I really enjoy using Ansible in my pseudo-sysadmin workflow. The truth is that all of these tools have their place and are astonishingly powerful when used correctly. As a person who has worked with multiple UNIX-like and non-UNIX operating systems, I can definitely claim that your mileage may vary. Different scenarios require different tools and no tool is universal enough to work on all operating systems. Portability is a bit of an Eldorado which we all aim for, though never really reach. The other aspect is skepticism. I shudder every time I see adoption before comprehension. People notice something is cool, then just take it and use it right away, completely ignoring the How and What if. That is essentially what this article is all about!

Paper magazines are a bit of a dying concept, especially in the IT world. Stack Overflow is a Google / DuckDuckGo search away (to the left from your local booze supplier, can’t miss it!). Nevertheless, magazines are an important part of my childhood and buying them in a way supports the open-source cause. Therefore, I purchase an issue from time to time. Unfortunately…

The article on Amazon Web Services (AWS) and managing virtual machine images looked interesting at a glance. I have never worked with AWS, therefore I decided to give it a go. The first thing that hit me was the choice of Ubuntu 16.04 LTS as the guest operating system. I have nothing against Ubuntu, as I use it at work as well, but what most people forget is that Debian and Ubuntu have quirks which other Linux distributions lack. Most importantly, though – What is the use case? That is the first question which should be addressed when choosing a distribution. In fact, I would argue that if the article is meant to be useful professionally, CentOS, Red Hat or openSUSE are better picks. That’s being needlessly pedantic, of course. Chances are the addressee of the article has used Ubuntu before and has some experience with it.

Later, I found the phrase “the package manager holds <package>”. Personally, I have never seen this expression before. Everyone talks/writes/types about repositories (repos). The only thing the package manager holds is a cache of the repository state, so that it knows which packages are available without having to ping the repository every single time. Again, that’s merely nitpicking. The thing that really disturbed me appeared several lines later:


pip install awscli

In 99% of cases, this command will fail, as it is typically executed from a regular user’s login session. A regular user normally has no write access to the system-wide Python library directories, and for perfectly valid reasons. If it somehow works, however (say, because it was run with elevated privileges), we’re in deep trouble. The worst that can happen is overwriting system-level Python libraries which the distribution’s own tools depend on. For a more Windows-like reference, imagine altering the default DirectX installation. Pandemonium. A slightly better way to do this is:


pip install --user awscli

This will guarantee that libraries are installed into the user’s $HOME/.local/lib sub-directory and the wrapper scripts into $HOME/.local/bin. An even better way to do it is:


python2 -m pip install --user awscli

This way, we specify the Python version for which we want to download the library, and our approach is portable across multiple UNIX-like operating systems. pip, as executed via “pip install”, is merely a wrapper script, which may or may not be offered by a Linux distribution and may or may not point to Python 2. Hell, in some cases it might even be buggy! The assumptions in the article went downhill from there. They made it really difficult for me to trust anything the writer stated unless I tested it thoroughly. The Internet is full of AWS horror stories already.
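To see which interpreter a “-m pip” call is bound to and where “--user” packages would end up, a small sketch using only the Python standard library can help. The helper name pip_user_install_command is made up for illustration; the reported paths depend on the platform and on environment variables such as PYTHONUSERBASE.

```python
import site
import sys


def pip_user_install_command(package):
    """Build a 'python -m pip install --user' call for the running interpreter."""
    # sys.executable is the interpreter running this script, so the package
    # is installed for exactly this Python version.
    return [sys.executable, "-m", "pip", "install", "--user", package]


if __name__ == "__main__":
    print("interpreter:       ", sys.executable)
    print("user base:         ", site.getuserbase())
    print("user site-packages:", site.getusersitepackages())
    print("command:           ", " ".join(pip_user_install_command("awscli")))
```

Running the script with each installed interpreter (python2, python3, and so on) shows that every one of them has its own user site-packages directory, which is why naming the interpreter explicitly matters.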

What this article demonstrates, and what I think is extremely hurtful to the Linux ecosystem, are bad practices and naive assumptions. Every Linux distribution is an Ubuntu, every shell is Bash, etc. All of it because people are encouraged to disregard core aspects of UNIX system management. One of my recent woes was when a company provided printer drivers as a URL to a shell installation script. The recommended procedure was to download the script via cURL and pipe it directly to Bash with elevated privileges. Absolute madness!
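A saner routine is to download the script to a file, read it, and compare its checksum against one published by the vendor before running anything. The sketch below shows only the checksum step and assumes that the vendor publishes a SHA-256 digest at all; the function names and the demo script content are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path, expected_hex):
    """Compare the file digest with the published one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


if __name__ == "__main__":
    # Demo with a throwaway file standing in for a downloaded installer.
    with tempfile.NamedTemporaryFile(delete=False) as script:
        script.write(b"echo installing drivers\n")
    published = hashlib.sha256(b"echo installing drivers\n").hexdigest()
    print("digest matches:", is_trusted(script.name, published))
    os.remove(script.name)
```

A matching checksum only proves that the file is the one the vendor published, not that it is safe, so reading the script before running it with elevated privileges is still part of the routine.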

OpenSUSE – a Server Perspective

This entry is a bit of an update to How I migrate(d) to OpenSUSE and Why, but also a short story about my experience with openSUSE since I started using it to power our servers a couple of months ago. My general impression of openSUSE hasn’t really changed much, but at least now it’s supported by real-life experience.

I work as a junior system administrator / Python software developer / database administrator / other. I am primarily responsible for data storage and the well-being of our network infrastructure. When I started, the main operating system was Ubuntu Server 16.04 LTS, with some machines accidentally running Ubuntu Server 17.10, which hit end-of-life soon after. Since I had to migrate the 17.10 machines, I considered switching our infrastructure to a more mature server environment like CentOS or openSUSE. Ubuntu Server 16.04 LTS is actually quite solid and I have nothing against it. However, the sysadmin culture favors other operating systems, and Debian-based distributions tend to introduce Debian-specific tweaks which are non-portable. I worked with CentOS before, as it was running on our structural biology workstations back when I was still a biologist. As a server platform CentOS is simply incredible. Many of the open-source projects developing server applications and appliances target CentOS primarily because of this. Unfortunately, CentOS packages are typically very stale, missing certain improvements unless they’re backported. Also, all of the mentioned server applications are installed from manually downloaded RPM packages, which puts the extra burden of having to track them on the administrator. What if there existed a platform that is neither stale, nor requires manually tracking the additional installed software…? It turns out such a platform indeed exists!

Fast-forward a couple of months and almost all of our KVM virtual machines run openSUSE Leap 42.3 or 15.0. It wasn’t without hurdles, of course. Firstly, I absolutely loathe the idea of a GUI-first installer, especially when it’s coded in a scripting language like Ruby. In addition, due to its complexity, it is prone to breakage and lags horribly over the serial console. However, to be perfectly fair, it also handles quite unusual partitioning schemes and a fine-grained selection of installed packages. After the first boot everything runs smoothly. The firewall is properly set up to allow only SSH connections, so remote management is a given. My advice: create a single image and clone it, remembering to regenerate the SSH host keys, otherwise the OpenSSH client will complain. Also, change the IP address if it’s static or else the router will reject connections. For that purpose YaST2 is the perfect tool. In fact, thanks to additional modules (available in the repositories) it aids full system management, from drafting firewall rules to setting up Samba and database instance control. It has saved my life many times already and frankly, it is one of the features which brought me to openSUSE in the first place. What I also really like about openSUSE is the vastness of additional repositories. Our workflow revolves around time-series data and for that purpose we often rely on tools from the TICK and ELK software stacks. Of course, the stacks need to be relatively up-to-date so that we have access to the latest stable code base. Regular repositories in long-term support distributions never offer that (except for FreeBSD, I suppose). Moreover, for ease of use and upgrading we require repositories, not one-off DEB/RPM packages. The openSUSE Open Build Service provides exactly that! SQL database servers and extensions? The server:database repository. HTTP servers? The server:http repository. And so on and so forth.

In essence, it is as simple as adding the repository URL either through YaST2 or on the command line through zypper, and we’re good to go. No more rogue packages polluting our long-running systems. Finally, the more standard software packages available in the official openSUSE repositories are of an acceptable vintage. I was slightly disappointed by Python 3.4 in Leap 42.3, since I tend to overuse features introduced in Python 3.5. However, openSUSE Leap 15.0 is already out and I will migrate once I am sure it is stable enough.
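As a rough illustration of the command-line route, the sketch below assembles the zypper calls for one Open Build Service project without executing them. The download URL layout (each “:” in the project name becoming “:/”) and the distribution directory name “openSUSE_Leap_15.0” are assumptions to verify against the actual repository listing; the alias and the package name are hypothetical.

```python
def obs_repo_url(project, distro):
    """Build the assumed download URL for an Open Build Service project."""
    # Assumption: "server:database" is published under "server:/database".
    path = project.replace(":", ":/")
    return f"https://download.opensuse.org/repositories/{path}/{distro}/"


def zypper_commands(project, distro, package):
    """Return zypper calls that add the repository and install one package."""
    alias = project.replace(":", "_")  # hypothetical local alias
    return [
        ["zypper", "addrepo", "--refresh", obs_repo_url(project, distro), alias],
        ["zypper", "refresh"],
        ["zypper", "install", package],
    ]


if __name__ == "__main__":
    # "postgresql" is an illustrative package name, not from the article.
    for argv in zypper_commands("server:database", "openSUSE_Leap_15.0", "postgresql"):
        print(" ".join(argv))
```

Once the repository is registered this way, its packages are tracked and upgraded together with the rest of the system, which is the whole point compared to one-off RPM downloads.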

To sum up, I can definitely recommend openSUSE Leap as a server platform. With fantastic tools such as YaST2 and the Open Build Service, it is a pleasure to administer. Btrfs is making progress, however I would avoid it unless file system level compression is necessary. Unfortunately, it often is. As openSUSE is strongly against ZFS due to clashing licenses, Btrfs is almost the only option available. Alongside Leap, I tried Tumbleweed as a developer platform. Alas, it rolls too fast for me and the frequent massive package upgrades are simply overwhelming. Also, software stacks can break if they aren’t completely in sync. Despite the fact that the openSUSE project sports the amazing openQA testing platform, some bugs still get through. Nevertheless, running openSUSE in both development AND deployment is extremely tempting! Definitely beats the Fedora/CentOS combo.

Game Design Series – Jurassic Park (NES)

To prepare myself for my future game developer career, I decided to play through some gaming classics for various Nintendo (GameBoy, NES) and SEGA (Saturn, Dreamcast, Genesis) consoles and analyze them thoroughly. The truth is that many amazing gameplay elements were invented way back between the 1970s and the 1990s and haven’t appeared since. It’s a real shame, because frankly speaking they were groundbreaking. In my analyses I will try to focus on game difficulty, graphics, interesting gameplay aspects and the overall appeal of the game. First off is Jurassic Park for the Nintendo Entertainment System.

Official Title: Jurassic Park
Release Year: 1993
Developer: Ocean Software

Game main menu – super scary!

Synopsis
The game follows the plot of the movie by Steven Spielberg and the techno-thriller by Michael Crichton (which has an entirely different feel than the movie). You are Dr. Alan Grant and your task is to escape from the now wild Jurassic Park located on Isla Nublar. Along the way you have to save Tim and Lex (grandchildren of John Hammond) from being eaten alive by the legendary predator T-Rex or trampled by a stampede of Triceratops.

Alan Grant in front of the Park gate

Graphics
Jurassic Park features an isometric view produced by sprites drawn at an angle from various sides. Interestingly, the collision box of some of them was defined only by the sprite’s base, allowing the game’s protagonist or his enemies to vanish behind obstacles. The color palette is crisp, though it consists primarily of gray, red and different shades of green. It definitely looks better than early NES games. Projectiles are animated and so are the various dinosaurs infesting the Park. The main menu screen, featuring a vicious-looking T-Rex en face with dripping saliva, is worth an extra mention. Unfortunately, the impressive visuals would occasionally tax the NES hardware, causing graphical glitches and oddities.

Gameplay
In order to successfully escape from the Park, Alan needs to complete various tasks, ranging from saving Tim and Lex to unlocking computer terminals. A major part of the game is collecting turquoise-gray dinosaur eggs in order to reveal key cards, and collecting different types of ammo to combat the vicious dinos. There are several species of dinosaurs, each with a different behavior pattern. Compsognathus individuals are small and easy to kill as they always trot in a straight line towards Grant. Velociraptors are much faster and can actually outrun the player when charging. They also do much more damage on contact. Somewhat sadly, all of the dinos drop only basic ammunition (swamp green). Bolas rounds (red), penetrating rounds (gray) and upgraded rounds (green) need to be collected from the ground in designated spots. An interesting aspect of the game is the mystery boxes with a question mark on top. They provide extra lives or health packs, or contain deadly booby traps. What I appreciate the most is the fact that the game does not follow the standard “stage(s) + boss fight” pattern. In fact, there are only 2 real boss fights against the T-Rex. The gameplay is well-balanced with a mix of regular collection stages, boss fights, puzzles and dynamic rescue missions. In total, there are 6 levels with clear briefing screens explaining the tasks in each level.

Difficulty
Jurassic Park is one of those NES games which seem hard at first, but become easier and easier as the player memorizes enemy attack patterns, locations of health packs, etc. In addition, it is not as overwhelming as, for instance, Castlevania or Ninja Gaiden. Jurassic Park is definitely a beatable title, though admittedly the T-Rex levels can be quite annoying.

Closing Remarks
While the core of the game (collecting eggs and shooting dinos) is fairly standard among NES titles, the addition of rescue missions and unusual boss fights feels refreshing. I believe that even platformers would profit from such gameplay mix-ins. Actually, they’re often fun regardless of the genre.

Games Then and Now

Inspired by the talks from Brenda and John Romero, I decided to write a short piece on the evolution of gaming. I will not focus on specific time periods, however, as the industry progressed through subsequent phases quite fluidly. Rather, I will try to draw a comparison between then (1980-1990) and now (201x). I was born at the end of the 1980s, therefore I still managed to get to know the amazing Nintendo Entertainment System (NES) first-hand. This will be the starting point of our journey, though I will mention other consoles and gaming systems when relevant.

To begin with, the NES and the original Nintendo Gameboy were amazing systems. Such a variety and richness of games as on these two platforms had never been seen before. I didn’t own a Gameboy myself, because they were quite expensive, but some of my childhood friends had them so I would often borrow one to play a bit. Also, back then it was perfectly natural for kids to meet in small groups and game in turns. My favorites were Donkey Kong Land and Super Mario Land. Both were quite difficult, but the enjoyment was enormous regardless! I did have a Famiclone (a clone of the Japanese Famicom) as these were extremely popular in East-Central Europe. Of course, the cartridges were also Famicom imitations and the system itself (branded Pegasus) would never run any of the original NES games without a special converter. I had no idea about that when I was young since it was easy to get games from local flea markets anyway. I remember playing Contra and Rescue Rangers 2 for hours on end until I had memorized the entire play-through perfectly. Many of the games then were platformers, beat-em-ups, racing games or sports games in general. Regardless of the genre, twitch reflexes were a must! Also, most of the games didn’t have password-based checkpoints, so once dead, the player had to start from the very beginning. The replay value was in the difficulty of a game and the necessity to master it in order to complete it and beat the final boss. From today’s perspective this sounds terribly tedious, but the motivation behind making games was also different. They were supposed to bring fun and excitement in its purest form. Beating a game was intended as the supreme reward for mastering it and honestly, it really felt rewarding back then. DOS games were slightly different due to the lack of a proper controller pad. They weren’t as fast-paced as NES games, but you could actually save the game state in some of them. Regardless, they still posed a considerable challenge.

One of the final bosses in Teenage Mutant Ninja Turtles 2 (NES)

Game design is an interesting topic when it comes to the NES, DOS, the Nintendo Gameboy and other platforms from that era. Since games had to fit on a single cartridge or diskette (or multiple diskettes, of course), they could not contain information about the entire state of the game, but rather a set of procedures to draw pixels in the correct positions at the correct times. As a result, programmers had to implement various hacks to define object boundaries or increase the number of available colors. This caused graphical glitches when the bitmaps were too big or allowed the player to abuse the shape of an object to their advantage. Also, forget tutorial levels, help menus, maps, etc. Some games were packaged with a manual or booklet, which introduced the game world or explained basic gameplay aspects, but very rarely would a game provide any help features at all. The player had to explore the game to understand it fully and complete it.

A screenshot from Final Fantasy XV (PS4)

Fast-forward several decades and games look and feel entirely different. Firstly, they are a lot more graphically appealing and realistic so we are no longer expected to use our imagination to complete the mental image of a character. Almost everything is WYSIWYG (What You See Is What You Get). That helps with immersion a lot! On the down side, gore and violence are a lot more explicit and traumatizing (think, the Dead Space franchise). Game mechanics haven’t changed much, since even nowadays every game has a “core” which defines its gameplay. However, because games are no longer limited by diskettes or computer memory, developers often mix genres and implement novel gameplay aspects which were unknown in the past. In addition, the player is often gradually introduced to the game world so that he or she is not overwhelmed by the game from the very beginning. Finally, there is a major shift towards developing games in franchises or series to generate sustainable revenue and not as one-off hits. This, of course, puts pressure on developers and emphasizes the use of pre-purchase bonuses or advertising to make sure the game sells.

The differences between games then and now don’t mean that games used to be better or worse than modern ones. The evolution of games merely expresses the growth of the industry. Nowadays, gaming is more approachable so that everyone can enjoy it. To us, veterans of the early Nintendo and Sega consoles, modern games might seem boring or too easy, though that is only our perspective. In addition, when I recently returned to Castlevania and Teenage Mutant Ninja Turtles II (both on the NES) I realized how unnecessarily frustrating games used to be due to technical limitations. In the end, to each their own. Since I have a lot less time nowadays, I prefer casual games and not the challenging monsters of the past. However, I did find Dark Souls enjoyable, to be perfectly honest.

We Are Developers 2018 – Day 3

Finally, day 3 of the Congress. My morning preparations were the same as on the previous day – water, food and loads of coffee to get my gears running. I was locked & loaded for 8 whopping talks. Since it would take me hours to write about all of them, I will only briefly summarize each.

First off was Philipp Krenn from Elastic, talking about the ELK stack (Elasticsearch + Logstash + Kibana). Apparently, the stack has a new member called Beats. It helps with creating handlers for specific types of data streams (file-based, metrics, network packets, etc.). I feel like that feature was missing in the current composition of the stack, though it only makes the stack bigger and more complex. I was actually investigating the use of Logstash + Elasticsearch + Grafana for sorting, filtering and cherry-picking log messages, but the maintenance overhead was a bit too much. I settled on Telegraf + InfluxDB (a time-series storage back-end with an SQL-like query language) + Grafana. Telegraf’s logparser plugin stands in for Logstash and InfluxDB proved to be an extremely robust storage solution. In addition, Grafana’s handling of Elasticsearch records was too rigid (pun intended) for our use case. So in general, it’s a “no”, but I’ll keep my log files open for new options in case/when our framework grows.

Catalina Butnaru (right) show-casing various AI assessment frameworks

Second up, Catalina Butnaru on AI, though from an ethics perspective. Frankly, I am allergic to ethics and to including it in discussions about AI, because ethics often derails or postpones progress. However, Catalina nailed it. Her talk was extremely appealing and real. I learned that ethical considerations should not go into the “wontfix” bucket and that they genuinely affect all of us. Well done!

Next, Joe Sepi from IBM talked about getting involved in open-source communities and helping build better software together. His recollection was quite personal, because he had to suffer from the same prejudices all of us fear when delving into an alien new project, framework or programming language. The take-home message? Never give up! Fork, commit, send PRs, make software better. Together.

I skipped Martin Wezowski’s talk to save my (metaphorically) dying stomach, but made it to the presentation from Angie Jones (Twitter). She’s an incredibly engaging speaker and the points she raised really resonated with me. All of us write (or should write!) unit and functional tests. However, how do you test a machine learning algorithm or neural network? How do you simulate a client of a shop app or a human target of an image recognition module? It turns out that when dealing with people, machine learning can prove finicky and extremely error-prone. Actually, to the point where it’s funny. Until we begin discussing morbid matters like “How many kids need to jump in front of an autonomous car for it to slide off a cliff and kill its passengers? 2? 5? 6?” or “Why does an image recognition application recognize people of darker skin tone as gorillas? Was there racial bias in selecting the test image sets?” 10 points to Angie Jones for the important lesson!

The next talk was given by Diana Vysoka, a young developer advocate working for the We Are Developers World Congress organization. On one hand, I feel quite old seeing teenagers get into programming. On the other hand, that’s encouraging in terms of our civilization’s future. Listening to people like her makes me still want to live on this planet.

Eric Steinberger (right) making convolutional neural networks plain and simple!

If Diana is a rising star, Eric Steinberger has already been one for some time. A math and IT prodigy who can explain extremely complex concepts in such simple words that even a fart like me can comprehend them. He believes that AGI (Artificial General Intelligence) is possible and I believe him. After all, how do we define the requirements for AGI, compared to a standard neural network, which can already be purposed for almost any task? Obviously, we should aim higher than simple bio-mimicry. As humans we’re flawed and our potential is limited. Let’s not unnecessarily handicap the development of AI!

Finally, the last talk. Enter Joel Spolsky, the co-founder of Stack Overflow! I attended his talk last year and was ready for more awesomeness. Joel delivered. Continuously. His anecdotes and stories gave a perfect closure to the Congress. It’s great to be a software developer and meet so many amazing people in one place. See you there next year!

We Are Developers 2018 – Day 2

Day 2 of the We Are Developers World Congress is up (at least for me, since I don’t have enough stamina for both the after-party and another full day of talks). Compared to day 1, I made some progress on the food and water front. The local grocery store, Hofer, proved extremely useful. Armed with bacon buns and non-sparkling water I was ready for more developer-flavored bliss!

Alas, the first presentation was slightly disappointing. Instead of a talk about accelerated learning, I got a lecture on how learning works, from which I learned nothing. Thankfully, the second talk fully compensated for the shortcomings of the first one. Enter Brenda Romero – one of the legends of game development (think Wizardry 1-8). This talk was doubly important for me, because I would really love to join the game development “circus”, but I’m not yet sure whether I have the guts (or a “more-than-mellow” liver). I’m still not sure, but the take-home message was crystal clear – just do it! Brenda had a lot of important things to say regarding not giving up and not taking comments from others too personally. The audience can be brutal and vicious, and the gaming industry itself is tough. At least I know what I’m up against!

20180517_100610.jpg

Brenda Romero (centre) talking about her childhood toy assembling endeavors

Numero tertio was a continuation of game development goodness. I originally intended to attend the AI talk by Lassi Kurkijarvi, but... John Romero. I don’t think I need to say more to anyone who has at least heard of Quake or Doom. It was not a replay of last year’s talk, mind you! Rather, we got a full story of Doom’s development, which to me was both interesting and inspiring. John Romero is an amazing game developer and the pace at which he, John Carmack and other programmers at id Software produced Doom was simply dazzling. While modern games are of course a lot more complex, developers from the early 1990s didn’t have the tools we now possess, such as SDKs or version control.

20180517_110340

John Romero (centre) on developing and shipping Doom

 

Later on, it just spiraled! I lost track of the talks a bit, since there was some major reshuffling in the schedule. The presentation from Tessa Mero on ChatOps at Cisco was quite interesting. I do use Slack and various IRC clients, but there is definitely a greater need for ChatOps and its integration with the software development cycle. I wasn’t fully aware of that, to be completely honest. Next, Tereza Iofciu from mytaxi gave us a tour of machine learning and showed us the importance of computer algorithms in predictive cab distribution planning. It wasn’t about self-driving cars or reducing manpower, but rather about reducing the load on drivers and improving customer satisfaction. Computer-accelerated supply-demand, so to speak.

In the afternoon I took an accidental detour to a book-signing event hosted by John and Brenda Romero. Not only did I get a chance to talk to them personally (*heavy breathing!*), but I also got a copy of Masters of Doom signed (*more heavy breathing!*). John said that if I read it, I’ll definitely get into game development professionally. I’m completely embracing the idea as I type this. One of the last talks I attended was given by Yan Cui on how he used the Akka actor model implementation (together with Netty) to solve latency issues in a mobile multiplayer game (an MMO, specifically). Obviously, it was a success and his convincing speech makes me want to try it out. It’s about concurrency, but without the overhead of traditional multiprocessing and/or multithreading. Although I don’t code in C# just yet, there is an Akka-like actor library for Python, which was recently recommended to me.

20180517_154747.jpg

Yan Cui (centre) explaining message relays in the actor model of concurrent programming

In summary, it was great to meet like-minded folks and actually talk to fellow game developers, who like challenges and don’t shy away from trying out new approaches to software design. Perhaps that’s what I’m looking for – challenges? Stay tuned for more exciting impressions from day 3 of the Congress!

 

We Are Developers 2018 – Day 1

To begin with, I attended the We Are Developers World Congress last year (2017) and I was quite amazed by it. I got to see John Romero, the legend of game development and author of titles such as Wolfenstein 3D, Doom and Quake I. Actually, the congress inspired me so much that I decided to finally part with my scientific career and pursue a life as a software developer and/or system administrator (a bit of both in reality). To the point, though. The We Are Developers World Congress is a fairly novel venture and even the Internet knows very little about it outside of the main website and a few scattered blog posts. It hasn’t become a tradition just yet, so media coverage is patchy at best. Considering how quickly it is growing in its grandeur (2000 attendees last year, 8000 registered attendees this year!), I decided to cover it myself.

wearedevelopers_2017_logo

wearedevelopers_2018_logo

The logo from the 2017 edition (above) and the logo from the 2018 edition (below)

The Congress started with a treat already – a fireside chat between Monty Munford and Stephen Gary Wozniak (Steve Wozniak, The Great Woz). It was intended as a casual interview, but The Woz proved to be exactly the person depicted in Jobs, the 2013 movie with Ashton Kutcher. Steve Wozniak is extremely chatty and simply adores talking about himself, so it was only natural for him to dominate the discussion, slightly to the detriment of the “chat” aspect of the event. I enjoyed it nevertheless. Many important points were raised – the economy of social media (Should we not get a fair share of the profit made by Facebook and Google off our personal data?), the “I” in “Artificial Intelligence” (It’s not really “intelligence” if it’s programmed!), Elon Musk (Tesla fails to deliver, year after year…), etc. It was somewhat surprising to witness that Steve Wozniak hasn’t really changed since the crazy ventures of his teen years with Steven Paul Jobs. Quite the amazing spirit!

20180516_102109.jpg

Monty Munford (left) having a fireside chat with the Great Woz (right)

The fireside chat was followed by an interesting talk from Joseph Sirosh from Microsoft. He talked about the various machine learning tools offered as part of Microsoft’s Azure hosting platform. To be honest, I am extremely skeptical regarding Microsoft’s ideas, especially when it concerns open-source software, supposedly open to the public. Microsoft has a disappointing track record of using the “embrace, extend, extinguish” tactic against promising software projects and a sinusoidal quality trend of its flagship product – Windows. Accordingly, I took the “with a bucket of salt” approach. The mood among other attendees was similarly negative. Unnecessarily, though! Azure’s machine learning tools seemed very promising in the end. I am considering using them for some of my projects.

After the lunch break I joined the Headless CMS track, and after the initial, slightly disappointing talk, I was enthusiastic about Jeremiah Lee and his JSON API idea. REST APIs are a big part of the Web nowadays, ever increasingly so. We do need a slightly more elaborate and efficient data format standard built on top of the venerable JSON. At that point I realized that unlike the Web development track last year, this time programming language animosities were absent. The implementation is irrelevant to the standard if we all agree on its importance! The last talk in the Headless CMS track I attended was given by Kaz Sato from Google. The topic was machine learning again, but this time leveraging Google’s AutoML platform and TensorFlow. Machine learning is actually one of the main themes of this year’s edition of the We Are Developers World Congress. It’s very clear that we need it!
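For the curious, here is roughly what a structured format layered on top of plain JSON can look like. This is only a sketch, assuming the talk referred to the JSON:API specification as I understand it (a top-level "data" member, resources identified by "type" and "id", related resources shipped under "included"). The article and author values, as well as the resolve helper, are made up for illustration.

```python
import json

# A hypothetical document in the JSON:API shape: one primary resource
# plus the related resource it points to, delivered in the same response.
document = {
    "data": {
        "type": "articles",
        "id": "1",
        "attributes": {"title": "We Are Developers 2018"},
        "relationships": {
            "author": {"data": {"type": "people", "id": "9"}}
        },
    },
    "included": [
        {"type": "people", "id": "9", "attributes": {"name": "Example Author"}}
    ],
}


def resolve(document, relationship):
    """Look up a related resource in 'included' by its (type, id) pair."""
    linkage = document["data"]["relationships"][relationship]["data"]
    for resource in document["included"]:
        if (resource["type"], resource["id"]) == (linkage["type"], linkage["id"]):
            return resource
    return None


author = resolve(document, "author")
payload = json.dumps(document)  # still ordinary JSON on the wire
print(author["attributes"]["name"])
```

The appeal is that every client, whatever language it is written in, can follow the same conventions to find related data instead of learning a new ad-hoc layout for each API.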

20180516_113233.jpg

Joseph Sirosh (centre), showcasing MS Azure AI services and APIs

To sum up, based on the various talks I attended, I am beginning to form a vision regarding the future of computers. We started with humongous, clunky mainframes and progressed into the personal computer era with the contributions from Steve Wozniak, Steve Jobs and many others. However, the dichotomy returns. Computers turn into mobile “enabling” devices, which aid us in our daily tasks and ease our interaction with the world (and each other). Heaps of data at our fingertips! However, we need a back-end, an infrastructure of powerful servers to store data and organize it in an accessible way. In between is of course a robust network interface which carries the data from the back-end to us, the clients/users.

 

How I migrate(d) to OpenSUSE and Why

I’m a die-hard FreeBSD fan. I simply love it! It rubs me the right (UNIX) way. Through trials and tribulations I managed to make it do things it was possibly not designed to do. ZFS? Amazeballs. Cool factor over 9000! However, all of that came at a tremendous cost in energy and time. I reached a point where I don’t want to spend time manually configuring everything and needing to invent ways of automating things which should work out-of-the-box. Furthermore, most FreeBSD tools are not compatible with other operating systems, so learning FreeBSD (or any other BSD variant, for that matter) locks me into FreeBSD. Despite many incompatibilities between distributions, this is not the case with Linux. On a side note, the ZFS on Linux project was a great idea. The Linux ecosystem badly needed a mature storage-oriented filesystem, such as ZFS. BTRFS, to me at least, “is not there yet”. Other tools, such as containers, were reinvented in so many different ways that Linux has outpaced FreeBSD many times over. Importantly, Linux tools were tested in many more real-life scenarios and are in general more streamlined. For automation, this is crucial. Again, I don’t want to tinker with virtually every tool I intend to use. Neither do I want to read pages and pages of technical documents to get a simple container running. More so, I should not be forced to, since that’s terribly unproductive. Finally, I like to run the same operating system on most of my computers (be it i386, x86_64 or ARM). FreeBSD support for many desktop and laptop subsystems is spotty at best…

Enter OpenSUSE!

green_lizard

Cute lizard stock photo. Courtesy of the Interweb.

Seemingly, OpenSUSE addresses all of the above issues. True, ZFS support is not reliable and there are no plans to change that. The problem is, as always, licensing. BTRFS is still buggy enough to throw a surprise blow where it hurts the most. Personally, I don’t run RAID 5/6 setups, but that’s BTRFS’ biggest weakness right now. That and occasional “oh shit!” moments. Regardless, I think I’ll need to get used to it. Lots of backups, coffee and prayer – the bread & butter of a sysadmin. On the up side, this is virtually the only concern I have regarding OpenSUSE.

The clear positives:

  • Centralized system management via YaST2 (printers, bootloader, kernel parameters, virtual machines, databases, network servers, etc.). A command-line interface is also available for headless appliances. This is absolutely indispensable.
  • Access to extra software packages via semi-official repositories. Every tool or framework I needed was easily found. This is a much more scalable approach than the Debian/Ubuntu way of downloading ready .deb packages from vendors and having to watch out for updates. Big plus.
  • Impressive versatility. OpenSUSE is theoretically a desktop-oriented platform, though thanks to the many frameworks it offers, it works equally well on servers. In addition, there is the developer-centric rolling-release flavor, Tumbleweed, which tries to follow upstream projects closely. Very important when relying on core libraries like pandas or numpy in Python.

So far, I’ve switched my main desktop machines over to OpenSUSE, but I’m also testing its capabilities as a KVM host and database server. Wish me luck!

Why Golang is not for me…

Recently, I decided the time has come to progress my not-yet-existent game developer career. I always wanted to write games and there are a lot of great old-school games which deserve reiterations using modern technologies. After some discussions with my wife (big kudos to her!) and getting properly inspired by DOS-era gems and jewels, I was ready to pick a language. I’m quite confident in my Python skills; however, for games I’d rather use one of the mid- to heavy-weight contestants like Java, C#, C or C++. Despite having some experience in C, I find pure C too simplistic and heavily procedural, and unfortunately it doesn’t provide enough tools to build rich graphical applications. Sure, I could try nuklear.h or similar headers for drawing shapes. That’s sufficient for menus, though not for the entire project. Clearly, C is more suited for number-crunching subroutines. C++ is way too complex for me, though of course most games are written in C++, since rendering libraries are coded in C++ and so are game engines. That makes perfect sense. Something easier perhaps? C# is a Microsoft thing and I would like my game(s) to be easily accessible on all platforms. That left me with Java and a new contender – Go.

golang_gopher

Funky gopher on a funky horse – courtesy of the Web

The Golang project officially began in 2009 and has managed to garner quite some appeal throughout the years. It’s not a Google toy anymore. For instance, CloudFlare uses it in their Railgun project (circa 4000 lines of code, last time I checked). Other notable examples include the entire TICK stack for time-based metrics (Telegraf, InfluxDB, Chronograf, Kapacitor) and Grafana (visualization platform for various database back-ends like InfluxDB, MySQL, Elasticsearch, etc.). I even found a 3D game engine advertised as programmed in Go (~50% was written in C, though). Since it appeared that Go is here to stay and is slowly establishing its position as one of the mainstream languages, I decided to at least take a look at it. Sadly, the more I read about it, the less inclined I was to code in it. The emphasis on concurrency is both important and useful; however, I feel the language is severely lacking in many respects.

 

No time for classes

Thanks to my Python background I am well accustomed to object-oriented programming and I consider it relevant to writing DRY code. It’s not always the best approach, though in most cases it provides means of maintaining modular programs. We know that modular is good, because it allows us to exchange bits and pieces without breaking APIs. There was a bit of a switch from old-style classes in Python 2 to new-style classes in Python 3, which seemed to be inspired by Java. However, Python went one step ahead and introduced multiple inheritance, purposely omitted from Java. As it tends to be quite confusing, I avoid it, rather than abuse it. Pure C, the ancestor of many modern languages, lacks classes and they were never introduced in subsequent revisions of the C standard. It stands to reason, as C++ came along in the 1980s and expanded the successful formula of C with multiple useful features, including object-oriented paradigms. Also, back in the day, procedural programming was sufficient and even nowadays it is perfectly adequate for system-level programming. Unfortunately, Go’s design follows C rather than C++. As a result, it demonstrates a strong procedural focus, lacking the means of data encapsulation I am used to. Forget classes, object hierarchies, clean polymorphism, operator overloading, etc. To me that’s a step backwards, not forward. It means that Go will suffer from the very same general limitations as C.
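To show why I find multiple inheritance in Python confusing, here is a small sketch of the classic "diamond" situation. The class names (Entity, Drawable, Movable, Sprite) are hypothetical and only serve the illustration. The thing to watch is the order in which Python walks the parents, the so-called method resolution order (MRO).

```python
class Entity:
    def describe(self):
        return "entity"


class Drawable(Entity):
    def describe(self):
        return "drawable " + super().describe()


class Movable(Entity):
    def describe(self):
        return "movable " + super().describe()


class Sprite(Drawable, Movable):
    """Inherits from two parents that share a common base (the diamond)."""


# The MRO decides which describe() runs first and what super() means in each class.
mro_names = [cls.__name__ for cls in Sprite.__mro__]
description = Sprite().describe()

print(mro_names)
print(description)
```

Note that super() inside Drawable does not call Entity here, but Movable, because the lookup follows the MRO of Sprite rather than the parent written in the class definition. It is perfectly logical once you know the rule, yet easy to trip over, which is exactly why I steer clear of it.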

 

The emperor’s new clothes

One of the major aspects of a language is its syntax. Python wins against many more performant languages, because it’s simple, encourages the use of a clean and consistent coding style, and makes reading other people’s code a breeze. In fact, so does C (in a way) if it’s not abused. The reason why Java was successful upon its release was that it closely followed the syntax of C and C++. It was meant as a portable, cross-platform language with a familiar look to encourage existing programmers to switch. One could code in C, C++ and Java, covering a multitude of use cases effortlessly. In addition, Java Virtual Machines support other languages like Scala, Clojure, Groovy and Jython for even more potent combinations. In contrast, Go was inspired by C, though it completely overhauled the standard C-like syntax for no apparent reason. This leads to confusion, and to the need to unlearn old but useful habits and invest resources in learning a completely foreign language. At this point I’m hardly motivated.

 

Simple == useful?

As I mentioned earlier, Go selectively omits many modern and potentially useful language features like classes. It was originally advertised as a simple-to-understand systems programming language to make the life of people at Google easier. Yet, it locks prospective programmers into a, one could even say, dumbed-down C/C++ syntax, which is alien to other languages. It is true that C++ is a monster of a language due to its scope. However, it is perfectly viable to establish the use of subsets or dialects to make it easier to understand. What I mean is that it would be more useful for prospective programmers to learn a language with more features than to have to re-invent these features in a band-aid manner as they become more and more comfortable with the language.

 

Conclusions

While my general impression of Go is largely negative, I do not by any means consider it a useless language. Quite the opposite! It managed to provide the server space with a number of useful engines and applications for networking, data storage and visualization. Actually, in some cases these pieces of software are more robust than existing solutions in C/C++. Personally, I find that quite impressive. However, I still believe the arguments against Go are valid. I would rather continue learning Java or even go straight for C++, and I recommend that others do the same.