GNU/Linux and Its Users

I decided to devote this entry to a reflection I made recently while participating in discussions in the Facebook Linux group. I realized, as many have before me, that there is a strong correlation between the maturity and complexity of an operating system and the savviness of its users. The BSDs and the more demanding GNU/Linux distributions like CRUX, Arch and Gentoo attract experienced computer users, while Ubuntu and its derivatives mostly entice beginners. As few people express the need to learn the Unix Way, beginner-oriented operating systems (this includes Windows and MacOS X, of course) are far more popular. Consequently, they garner stronger commercial support from hardware and software companies, as they constitute a market for new products.

The truth is, we have all been beginners once. What’s more, unless we’re old enough to remember the ancestral iterations of the original UNIX operating system (I’m not!), we were MacOS X or Windows users long before switching to a modern Unix-like operating system. Alas, as such we have been tainted with a closed-source mindset, encouraging us to take no responsibility for our computers and to solve system-level problems with mundane trial-and-error hackery. Not only is such a mindset counter-productive, it also hampers technological progress. Computers are becoming increasingly crucial in our everyday lives, and a certain degree of computer literacy and awareness is simply mandatory. Open-source technologies encourage a switch to a more modern mindset, entailing information sharing, discussions and learning various computer skills in general. The sooner we accustom ourselves to this mindset, the faster we can move on.

The current problem in the GNU/Linux community (much less so in non-Linux Unix communities) is that the entry barrier is being continuously lowered to speed up the influx of new users. Unfortunately, many of these users are complete beginners not only in terms of Unices, but also in terms of using computers in general. With them the closed-source mentality is carried over, and we, the more experienced users, have to deal with it. Some of us provide help, while others are annoyed by the constant nagging. The responsibility to educate newbies and encourage them to embrace the open-source mindset (explained above) lies with us. However, often they don’t want to be educated. They want the instant gratification they received when using Windows or MacOS X, because someone convinced them that GNU/Linux can be a drop-in replacement for their former commercial OS. They want tutorial-grade, easy-to-follow answers to unclear, badly formulated questions. Better yet, they want them now, served on a silver platter. We all love helping newbies, but we shouldn’t encourage them to remain lazy. Otherwise, we’ll eventually share the fate of Windows or MacOS X as just another mainstream platform. I cannot speak for everyone, though I would personally prefer GNU/Linux to continue its evolution as a tech-aware platform of the future.

Just Linux Things

As a follow-up to my previous post, I noticed that a lot of recent open-source technologies are extremely Linux-centric. It all started around the time GNOME 3 and systemd were introduced. Both heavily rely on facilities present only in Linux-based operating systems, making them difficult to port to other Unix platforms like the BSDs. These technologies are also strongly promoted to entice prospective developers. While I understand the need for platform-centric efficiency inherently tied to Linux-specific features (cgroups, etc.), it is also important not to ignore the rest of the IT ecosphere. Yes, I mean even Windows, which is normally not considered on equal terms with Unix, but is relevant when talking about C# or .NET applications.

A recent example of this trend is Docker. Containers are now the new pink and everyone wants a piece of the pie. Docker has barely reached release 1.x, yet some companies already make claims about “widespread adoption”. I thought the industry preferred stable, tested-for-years solutions. I find this new craze odd, to say the least. As expected, Docker is a Linux thing. While the containers are indeed OS-level on GNU/Linux systems, much like LXC (Linux Containers), they’re not on Windows nor on MacOS X. Oddly enough, the latter uses the xhyve virtual machine manager (based on FreeBSD’s bhyve). Therefore, even though the developer interfaces are similar or even identical, the engines running underneath carry a substantial overhead on non-Linux systems. At that point one might ask whether native, more established solutions aren’t already available and better suited for multi-container setups. On FreeBSD we have Jails and a ton of Jail managers to make one’s life easier. I have a feeling that Jails on FreeBSD would do a lot better than Docker containers on GNU/Linux. Not to mention that a FreeBSD base system is a lot slimmer than whatever one could consider a base system in the GNU/Linux world. Their “widespread adoption” is lacking only because most of the world’s servers run GNU/Linux.
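
For a taste of how little machinery the native approach needs, here is a minimal jail definition (path, hostname and address are made up for illustration) that the stock tools understand without any extra software:

# /etc/jail.conf
www {
    path = "/usr/local/jails/www";
    host.hostname = "www.example.org";
    ip4.addr = 192.168.1.10;
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown";
    mount.devfs;
}

# with the jail's filesystem in place:
service jail onestart www    # or enable jails in rc.conf and use "start"
jls                          # list running jails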

Another weird trend I notice is identifying everything Linux-related with Ubuntu, as if Ubuntu were the only True Linux Distribution. I often read articles that claim to touch on Linux, but in reality discuss Ubuntu. A square is a rectangle, but not all rectangles are squares! That’s so obvious, no? This “Linux = Ubuntu” assumption hurts the whole ecosystem quite a lot. People learn how to use Ubuntu and think they’ve mastered Linux. Then they’re dropped into a den full of Gentoos and CentOSes, and they end up suffering. Ouch! One fallout of this worrying trend is that people deploy Ubuntu in places where a more lightweight GNU/Linux distribution would be far more suitable. That’s one of the reasons why Docker eventually switched its base images from Ubuntu to Alpine Linux. With the wealth and diversity of the GNU/Linux ecosystem, one doesn’t even need to look far.

It’s Time for FreeBSD!

For the last couple of weeks I have been delving deeper into the arcane arts of FreeBSD, paying extra attention to containers (jails), local/remote package distribution (poudriere) and storage utilities (ZFS RAID, mirrors). Truth be told, with every ounce of practical knowledge gained I grew more impressed by this Unix-like operating system. It’s nothing short of amazing, really! The irony lies in the fact that FreeBSD is an underdog in the Unix world (less so than OpenBSD or Illumos, but still), despite the fact that it excels as a server environment and established, years ago, many of the technologies currently in focus (process isolation, efficient networking and firewalls, data storage, etc.). GNU/Linux is picking up the pace, but it still has a long way to go.

I feel traditional Unices got the “base system + additional applications” separation right. One might think it’s just a matter of personal taste – the “order vs chaos” debate that has been going on for ages. The truth is that this separation is not only reasonable, but also extremely useful once one begins to treat the operating system as more than a mere Internet browser or music player. Order is paramount to organizing data and securing it against ill intent or hardware failure. I really appreciate the use of /usr/local, /var/db, /var/cache and other typical Unix volumes on FreeBSD, as it makes the system more predictable, and therefore hassle-free. When we have a multitude of systems to care about, hassle-free becomes a necessity. With new container technologies like Docker this is a realistic scenario – 1 host system serving N guest systems. One doesn’t need to run a server farm to get a taste of that.

This is basically where (and when) FreeBSD comes in. It’s a neatly organized Unix-like system with great storage capabilities (ZFS), process isolation for guest systems (jails and bhyve), network firewalling (ipfw, pf, etc.) and package distribution (synth, poudriere, etc.). And everything is fully integrated! The “pkg” package manager knows what jails are and can easily install programs into them. Poudriere coordinates the building of new packages inside jail containers, so that the host system is never compromised. These packages can later be distributed remotely via HTTP/HTTPS/FTP or locally via the file protocol. Such low-level integration is somewhat foreign to the GNU/Linux world, though among server distributions like OpenSUSE, CentOS or Ubuntu Server it is constantly improving.
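
A rough sketch of that workflow, with illustrative jail and port names:

# build packages in a clean jail with poudriere
poudriere jail -c -j build110 -v 11.0-RELEASE   # create the build jail
poudriere ports -c                              # fetch the default ports tree
poudriere bulk -j build110 www/nginx            # build nginx plus dependencies

# pkg can also install straight into a running jail, by name or id
pkg -j myjail install nginx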

Still, whenever I think about the divide between BSD and GNU/Linux, I see a tall brick wall that both sides are struggling to tear down. FreeBSD wants to become more desktop-oriented, while GNU/Linux is trying to reinforce its server roots. It’s difficult to tell whether this is good or bad. The BSDs do indeed excel as server systems, as recently highlighted in a NASA study. GNU/Linux is more suited for heavy computation and leisure. The brick wall has plenty of nicks, yet it stands strong. Maybe there is a “third option”? Why not let each do the job it does best? What I mean to say is that FreeBSD has its place in the world, and the time is ripe to truly begin to appreciate it!

FreeBSD-Debian ZFS Migration

Since the Zettabyte File System (ZFS) is steadily becoming more stable on non-Solaris and non-FreeBSD systems, I decided to put the data pool created for the previous entry to the test. In principle, it should be possible to migrate a pool from one operating system to another. Imagine the following scenario – a company is getting new hardware and/or new IT experts and needs to migrate to a different OS. In my case it was from FreeBSD to Debian and vice versa. All data volumes were located in a single pool, though depending on the size of the company it might be several pools instead. Before even thinking of migrating, it is important to make sure that all I/O related to the pool(s) to be migrated has stopped. When the coast is clear, we can “zpool export <pool>” and begin our exodus to another operating system.
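
In practice the hand-off boils down to a few commands (pool name as in the example below):

# on the old system, after stopping every service that touches the pool
zpool export zdata

# on the new system: list importable pools first, then import
zpool import          # shows pools visible on the attached drives
zpool import zdata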

From FreeBSD to Debian
After exporting the zdata pool I installed Debian Testing/Stretch onto the system-bound SSD drive. ZFS is not part of the base installation, hence all pool imports need to happen after the system is ready and the zfs kernel module has been built from the zfs-dkms and spl-dkms packages. apt resolves all dependencies properly, so the only weak link is potential build issues with ZFS on GNU/Linux. Should no problems occur, we can proceed with importing the ZFS pool. GNU/Linux is cautious and warns the user about volumes whose mountpoints conflict with existing system directories. Those will not be mounted, even if the pool itself is imported successfully. Thankfully, conflicts can be resolved instantly by using a transition partition/drive to move data around. Once that’s done, our ZFS pool is ready for new writes. Note that the content of /usr/local/ will undergo major changes, as FreeBSD uses it for storing installed ports/packages and their configurations. In addition, /var/db will contain the pkg sqlite database with all registered packages. While this does not specifically interfere with either apt or Debian (apt configurations live in /var/lib and cached .deb packages in /var/cache/apt/archives), it’s something to take note of.
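
A condensed sketch of the Debian side, assuming Stretch with the contrib repository enabled (package names as they were at the time):

# pull in the DKMS sources; the kernel module gets compiled on installation
apt install linux-headers-amd64 spl-dkms zfs-dkms zfsutils-linux
modprobe zfs

# import the pool; -f may be needed since it was last used on another system
zpool import -f zdata
zfs mount -a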

From Debian to FreeBSD
Here, the migration is slightly smoother. The “bsdinstall” FreeBSD installer is designed in a more server-centric fashion (and ZFS is integral to the base system), so the ZFS pool can be connected and imported even before the first boot into the new system. The downside is that FreeBSD does not warn about “overmounting” system partitions from the zdata pool, so it’s relatively easy to bork the fresh installation. Also, /var/cache will contain loads of unwanted directories, and /usr/src, /usr/obj, /usr/ports and /usr/local need to be populated anew, just like during a brand new FreeBSD installation.
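
To avoid overmounting the fresh installation, importing under an alternate root is a sensible precaution (a sketch, not something bsdinstall does on its own):

# inspect the pool under /mnt instead of letting its datasets mount over
# the live system
zpool import -f -o altroot=/mnt zdata

# once the mountpoints check out, re-import normally
zpool export zdata
zpool import zdata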

Either way, the migration process is not too difficult and definitely not horrendously time-consuming. Should the user/administrator keep PostgreSQL, MySQL or other SQL databases in /var/db, extra steps might be needed to ensure forward and backward compatibility of the database packages. In the end, it’s a matter of knowing what each OS places where. FreeBSD is structured in a very intuitive and safe (from an administrator’s point of view) way. Debian, just like any other GNU/Linux distribution, is a bit more chaotic, hence more caution is required. Both are good in their own regard, hence my incentive for migration testing.
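
For the databases themselves, a plain SQL dump taken before the export is the safest currency between differing package versions; a cautious sketch for PostgreSQL (paths are examples):

# before exporting the pool: dump all databases to a plain-text SQL file
pg_dumpall -U postgres > /zdata/backups/all-databases.sql

# after the migration, with PostgreSQL reinstalled on the target OS
psql -U postgres -f /zdata/backups/all-databases.sql postgres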

FreeBSD – SSD + 2xHDD ZFS Installation

I recently got an extra 2 TB hard drive for my mighty (cough, cough, maybe some 9-10 years ago) HP Z200 workstation running FreeBSD 11.0-RELEASE, so I decided to finally build a proper 2-drive RAID (Redundant Array of Independent Disks) mirror. I read the zfs and zpool manual pages (manpages) thoroughly on top of the related FreeBSD Handbook chapters and got to work. Since that PC also houses a 160 GB SSD, some tinkering was required. The main issue is that SSD drives rely on TRIM to learn which blocks are no longer in use, which helps the controller with wear leveling. UFS has solid TRIM support, whereas ZFS’s TRIM support was still maturing at the time. Initially, I thought of having two separate ZFS pools – zroot for root-on-ZFS and boot snapshots on the SSD, and zdata for high-volume data partitions like /usr and /var on the 2-drive array. However, after careful consideration I came up with a simpler partitioning scheme:

160 GB Intel SSD (MBR -> BSD):
  141 GB   freebsd-ufs    (TRIM enabled; mounted as “/”)
    8 GB   freebsd-swap

zdata mirrored array on 1.5 TB Seagate Barracuda + 2 TB WD Caviar Green (GPT):
  1.32 TB  freebsd-zfs    (on each drive)
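
For the UFS root, TRIM can be requested when the filesystem is created or toggled later on an unmounted filesystem (the device name is an example; bsdinstall can also flip this switch during partitioning):

newfs -t -U /dev/ada0s1a         # -t enables TRIM, -U enables soft updates
tunefs -t enable /dev/ada0s1a    # or switch it on afterwards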

With such a partitioning scheme I lost boot snapshots, though it was a lot easier to install the OS, as I could rely entirely on the standard FreeBSD installation procedure (bsdinstall). First, I performed a standard installation via bsdinstall onto the SSD. Next, I created a 2-drive ZFS pool and named it “zdata”, following the Handbook. I made sure that the parent partitions like /usr and /var were mounted from the SSD and only the variable, expandable sub-directories like /var/db, /usr/ports, /usr/src, /usr/local, etc. were placed on the ZFS pool. Since each of those required a parent directory in the ZFS pool, I used /zdata/usr and /zdata/var, respectively. That way the /usr and /var mountpoints did not get overridden with empty /usr and /var directories from the ZFS pool. This protects the core system from getting wiped if one of the ZFS drives fails. In addition, the system can be reinstalled easily and the ZFS pool added later without major setbacks. The trick is that all ports are installed to /usr/local and the package manager database lives in the /var/db directory. Flexible, easy and extremely well documented.
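
In command form, the layout looks roughly like this (device names and the exact dataset split are illustrative):

# mirrored pool across the two data drives
zpool create zdata mirror /dev/ada1p1 /dev/ada2p1

# parent datasets stay under /zdata, so they never shadow the SSD's /usr and /var
zfs create zdata/usr
zfs create zdata/var

# only the bulky, expandable sub-directories live on the pool
zfs create -o mountpoint=/usr/local zdata/usr/local
zfs create -o mountpoint=/usr/ports zdata/usr/ports
zfs create -o mountpoint=/usr/src   zdata/usr/src
zfs create -o mountpoint=/var/db    zdata/var/db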

Just to clarify, the above is no rocket science and can be done very easily with the tools immediately available in the core FreeBSD installation. This should really be highlighted more, as apart from the descendants of Solaris, FreeBSD is the only operating system that offers such capabilities out of the box. GNU/Linux systems have their own RAID and volume management tools, but they’re definitely not as established as ZFS. The closest GNU/Linux counterpart to ZFS is btrfs, as it too combines a volume manager with a file system. However, key features like RAID-5/6 are still unstable, and hardly any GNU/Linux distribution dares to offer a btrfs-only setup.
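
For comparison, the closest GNU/Linux equivalent of the mirror above would look something like this (device names are examples):

# btrfs mirrors both data and metadata across the two drives
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/data
btrfs filesystem show /mnt/data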

Unix and Software Design

Getting it right in software design is tricky business. It takes more than a skillful programmer and a reasonable style guide. For all of the shortcomings to be ironed out, we also need users to test the software and share their feedback. Moreover, some schools of thought are much closer to getting it right than others. I work with both Unix-like operating systems and Windows on a daily basis. In my personal experience, Unix software is designed much better, and there are good reasons for that. I’ll try to give some examples of badly designed software and explain why Unix applications simply rock.

The very core of Unix is the C programming language. This imposes a certain way of thinking about how software should work and how to avoid common pitfalls. Though simple and very efficient, C is an unforgiving language. By default it lacks advanced object-oriented concepts and exception handling. Therefore, past Unix programmers had to swiftly establish good software design practices. As a result, Unix software is less error-prone and easier to debug. Also, C teaches how to combine small functions and modules into bigger structures in order to write more elaborate software. While modern Unix is vastly different from the early Unix, good practices remained a driving force, as the people behind them are still around or have left an everlasting impression. It is also important to note that the graphical user interface (the X Window System) was added to Unix much later and the system itself functions perfectly fine without it.

Windows is entirely different, as it was born from more recent concepts, when bitmapped displays were prevalent and the graphical user interface (GUI) began to matter. This high-level approach impacts software design greatly. Windows software is specifically GUI-centred and as such emphasizes the use of UIs much more. Obviously, it’s a matter of dispute, though personally I believe that good software grows from a solid command-line core. GUIs should be used when needed, not as a lazy default. To put it a bit into perspective…

My research group uses a very old piece of software for managing lab journals. It’s a GUI to a database manager that accesses remotely hosted journals. Each experiment is a database record consisting of text and image blocks. From the error prompts I have encountered thus far, I judge that the whole thing is written in C#. That’s not the problem, though. The main issue is that the software is awfully slow and prints the most useless error messages ever. My personal favorite is “cannot authenticate credentials”. Not only is that obvious when one cannot log in, it also contains no information as to why the login attempt failed. Was the username or password wrong? Lack of access due to server issues? Maybe the user forgot to connect to the Internet at all? Each of these should have a separate pop-up message with an optional suggestion on how to fix the issue. “Contact your system administrator” not being one of them!

On Deprecating Software

In the open-source world software comes and goes, much like animal and plant species in the biological world. The reasons vary. Software A was written a long time ago, when computers severely lacked performance. It could not adjust to modern programming paradigms easily and had to be forked and rewritten as software B. Another case – developers were few and at one point they lost interest in software C. Years later someone dug up the project, noticed its many uses and decided to breathe new life into it as software D. The story that everyone talks about nowadays follows an entirely different scenario, though.

Once upon a time, there was a Unix sound system called OSS (Open Sound System). It aligned with the Unix style of device handling and was easy to understand. In fact, it was the first sound system that could be called “advanced”. FreeBSD still relies on a modified version 4 of OSS and it’s perfectly fine for daily use. Then came Linux, based on Unix paradigms, though not Unix itself. In very general terms, it did a lot of things differently and required extra abstraction layers for its sound implementation. OSS was considered cumbersome and too low-level to be worthwhile in the long run. Thus, ALSA (Advanced Linux Sound Architecture) was born. For a long while OSS and ALSA co-existed, until OSS was intentionally deprecated. Interestingly, many of the drawbacks of OSS were addressed in OSS v4, making the arguments against it rather moot. However, Linux dominated the open-source world to the point that all OSS-using Unix-based or Unix-like operating systems were marginalized. As a consequence, developers of new sound software primarily targeted ALSA. When I compare it to OSS, there are things it does better and things it does worse. The added abstraction layers theoretically simplify configuration. After all, not everyone needs to know how to address sound I/O at the hardware level. However, the same abstraction makes troubleshooting harder when sound I/O is misconfigured out of the box.
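
The “Unix style of device handling” is quite literal here: under OSS the sound card is an ordinary device node, so standard tools suffice (a sketch on FreeBSD):

cat music.raw > /dev/dsp    # play raw PCM by writing straight to the device
mixer vol 80                # adjust levels with the stock mixer(8) utility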

Fast forward a few years and some developers now notice that even ALSA is cumbersome and too low-level. Due to the rapid expansion of GNU/Linux into the desktop ecosystem, user expectations have changed and various system components have to follow suit. That includes the sound stack. Lennart Poettering observed that the current solution [ALSA] is flawed and that implementing high-level sound features, such as dynamic multi-output setups or mixing, on top of it is difficult. However, he decided not to (or couldn’t?) fix the underlying problems, but rather built a layer on top of ALSA. Such an abstraction is likely to add problems rather than subtract them. On one hand, configuration becomes more intuitive and potentially easier to adjust. On the other hand, the lower-level system (ALSA) still exists and the problems it causes are not addressed, merely circumvented. Regardless, many projects decided to switch their sound backend from ALSA to the “new cool kid” PulseAudio entirely – Skype, Steam, BlueZ and recently also Firefox, for instance.

Curiously enough, replacing ALSA with PulseAudio effectively only streamlines configuration on desktop computers. It’s not a game-changer that magically solves all of the problems attributed to ALSA or OSS, contrary to the claims PulseAudio proponents make. Can OSS or ALSA handle sound output device hot-plugging? Yes. Can volumes be adjusted easily on a per-application basis? Yes. Can multiple applications play sound to the same output? Yes, indeed! Frankly, instead of broken layers on top of broken layers, I would rather see a fix to the underlying components. Still, PulseAudio is here to stay, and we need to find ways of dealing with it. My favorite is the apulse shim, which provides a PulseAudio-like backend for applications and directs all output to ALSA. It’s simple and it just works.
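
Usage is as simple as prefixing the command:

apulse firefox    # firefox sees a PulseAudio-like API, sound goes to ALSA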

The big question I would like to pose, though, is whether we should really keep on deprecating software so frivolously. For the majority of cases, both ALSA and OSS can do pretty much the same. Do we then really need something as complex as PulseAudio? Why not a simplified backend, so that application developers live happier lives? Food for thought, I believe.