The (Necessary?) GNU/Linux Fragmentation

I would like to share with you a story of my recent struggles with Debian. They’re partially my fault, but also partially due to the way Debian handles network management, which is quite different from how other GNU/Linux distributions do it.

The story begins with me being happy with a regular desktop install, powered by XFCE4, but then wanting to switch to the less distracting Openbox. I installed Openbox + extras like the tint2 panel, nitrogen (background/wallpaper setter) and other lightweight alternatives to XFCE4 components. While I was sweeping up XFCE4 leftovers, “apt autoremove” removed way too many packages, including network-manager. I was instantly left with no network connection and no means of restoring it, as I learned later. By default network management on Debian is handled by the ifupdown scripts, which “up” interfaces listed in /etc/network/interfaces and direct them to dhclient to get the DHCP lease or assign a static IP address. Incidentally, the ifupdown utilities on their own have no means of directing wireless interfaces to wpa_supplicant for WPA-encrypted networks; as far as I can tell, that depends on extra hooks shipped with the wpasupplicant package. Nowadays, this is handled by network-manager, which “Just Works”. network-manager uses wpa_supplicant to handle WPA-encryption (in addition to many other things), whilst performing the rest of network management itself. This is quite different from running wpa_supplicant directly, which simply failed in my case due to a known regression.
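
For reference, the ifupdown configuration mentioned above boils down to short stanzas in /etc/network/interfaces. Below is a minimal sketch that only prints such stanzas and does not touch the system. The interface names (eth0, eth1) and all addresses are placeholders, and the helper function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print ifupdown stanzas for /etc/network/interfaces.
# Interface names and addresses are placeholders (assumptions).

# Stanza for an interface configured via DHCP.
# Argument: interface name.
dhcp_stanza() {
    printf 'auto %s\niface %s inet dhcp\n' "$1" "$1"
}

# Stanza for an interface with a static IPv4 address.
# Arguments: interface, address, netmask, gateway.
static_stanza() {
    printf 'auto %s\niface %s inet static\n    address %s\n    netmask %s\n    gateway %s\n' \
        "$1" "$1" "$2" "$3" "$4"
}

dhcp_stanza eth0
static_stanza eth1 192.168.1.10 255.255.255.0 192.168.1.1
```

Once a stanza like this is in /etc/network/interfaces, running “ifup eth0” as root should bring the interface up without network-manager being installed.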

It’s quite sad to see that Debian, despite moving from init scripts to systemd for boot + service management, still insists on configuring network interfaces via shell scripts (the mentioned ifupdown tools), while a mainstream solution in the form of network-manager is available! Why is it recommended as a “Just Works” alternative yet not offered by default? On Red Hat based distributions (say Fedora, CentOS, etc.) the matter is really simple – you get network-manager and you’re good to go out-of-the-box. That stands to reason, though, as Network Manager is a Red Hat project. Still, the “Just Works” approach baffles (and even disturbs) me greatly. “Just Works” sounds like a catch-phrase typical of commercial operating systems. Since they mainly target desktop and laptop computers, it’s enough if an Ethernet interface and/or a wireless interface “just work”. What about servers with multiple NICs, routers, gateways, NATs and VMs, each having its own IP address or set thereof? What then? Oh, right, we can write systemd units for each interface and control them via systemctl. Or use the ip utilities for a similar purpose. Or the deprecated ifconfig, which we shouldn’t use, but still can because it’s in the repositories of many distributions. Alternatively, perform DHCP requests via one of the available clients – dhclient, dhcpcd, etc. We end up with a hodgepodge of programs that are best left to their own devices due to incomplete documentation and/or unclear configuration means, and each GNU/Linux distribution has its own set of base utilities.

Personally, I feel that’s where the BSDs succeed. You get a clearly separated base system with well-documented and easily-configurable tools that are maintained as a whole. Network interface configuration borders on trivial. More so, the installer handles wireless connections almost seamlessly. Why is it so difficult on GNU/Linux? At this point, I believe the GNU/Linux community would profit greatly by agreeing on a common “base system”. Red Hat’s systemd is the first step to the unification of the ecosystem. While I am strongly opposed to systemd, because it gives merely the illusion of improved efficacy by simplifying configuration and obfuscating details, GNU/Linux should be a bit stronger on common standards, at least for system-level utilities.

GNU/Linux and Its Users

I decided to devote this entry to a reflection I made recently, while participating in discussions in the Facebook Linux group. I realized, as many have before me, that there is a strong correlation between the maturity and complexity of an operating system and the savviness of its users. The BSDs and more demanding GNU/Linux distributions like CRUX, Arch and Gentoo attract experienced computer users, while Ubuntu and its derivatives entice beginners mostly. As few express the need to learn the Unix Way, beginner-oriented operating systems (this includes Windows and MacOS X, of course) are far more popular. Consequently, they garner stronger commercial support from hardware and software companies as they constitute a market for new products.

The truth is, we have all been beginners once. What is more, unless we’re old enough to remember the ancestral iterations of the original UNIX operating system (I’m not!), we were MacOS X or Windows users long before switching to a modern Unix-like operating system. Alas, as such we have been tainted with a closed-source mindset, encouraging us to take no responsibility for our computers and solve system-level problems with mundane trial-and-error hackery. Not only is such a mindset counter-productive, but it also hampers technological progress. Computers are becoming increasingly crucial in our everyday lives and a certain degree of computer-literacy and awareness is simply mandatory. Open-source technologies encourage a switch to a more modern mindset, entailing information sharing, discussions and learning various computer skills in general. The sooner we accustom ourselves to this mindset, the faster we can move on.

The current problem in the GNU/Linux community (much less in non-Linux Unix communities) is that the entry barrier is being continuously lowered so as to yield a speedier influx of users. Unfortunately, many of these users are complete beginners not only in terms of Unices, but also in terms of using computers in general. With them the closed-source mentality is carried over and we, the more experienced users, have to deal with it. Some [experienced users] provide help, while others are annoyed with the constant nagging. The responsibility to educate newbies and encourage them to embrace the open-source mindset (explained above) lies with us. However, they don’t want to. They want the instant gratification they received when using Windows or MacOS X, because someone convinced them that GNU/Linux can be a drop-in replacement for their former commercial OS. They want tutorial-grade, easy-to-follow answers to unclear, badly formulated questions. Better yet, they want them now, served on a silver platter. We all love helping newbies, but we shouldn’t encourage them to remain lazy. Otherwise, we’ll eventually share the fate of Windows or MacOS X as just another mainstream platform. I cannot speak for everyone, though I would personally prefer GNU/Linux to continue its evolution as a tech-aware platform of the future.

Software Design Be Like…

I recently stumbled upon this very accurate article from the always-fantastic Dedoimedo. I don’t agree with the notion that replacing legacy software with new software is done only because old software is considered bad. Oftentimes we glorify software that was never written properly and that, throughout the years, accumulated a load of even uglier cruft as a means of patching core functionalities. At some point a rewrite is a given. However, I do fully agree with the observation that a lot of software nowadays is written with an unhealthy focus on developer experience, rather than user experience. Also, continuity in design should be treated as sacred.

One of the things that ticks me off (much like it did in the case of the Dedoimedo author) is when developers emphasize how easy it is for prospective developers to continue working on their software. Not how useful it is to the average Joe, but how time-efficient it is to add code. It’s nice, but should not be emphasized so. Among the many characteristics a piece of software may have, I appreciate usefulness the most. Even in video games, mind you! Anything that makes the user spend less time on repetitive tasks is worth its weight in lines of code (or something of that sort). Way too often, features that developers consider useful are not so useful to regular users. Also, too many features are a bad thing. Build a core program with essential features only and mix in a scripting language to add features that users may require on a per-user basis. See: Vim, Emacs, Chimera, PyMOL, etc. Success guaranteed!

Another matter is software continuity. Backward-compatibility should be considered mandatory. It should be the number one priority of any software project as it’s the very reason people keep getting our software. FreeBSD is strong on backward-compatibility and that’s why it’s such a rock-solid operating system. New frameworks are good. New frameworks that break or remove important functionalities are bad. The user is king and he/she should always be able to perform their work without obstructions or having to re-learn something. Forcing users to re-learn every now and then is a *great* way of thinning out the user base. One of the top positions on my never-do list.

Finally, an important aspect of software design is good preparation. Do we want our software to be extensible? Do we want it to run fast and be multi-threaded? Certain features should be considered exhaustively from the very beginning of the project. Otherwise, we end up adding lots of unsafe glue code or hackish solutions that will break every time the core engine is updated. Also, never underestimate documentation. It’s not only for us, but also for the team leader and all of the future developers who *will* eventually work on our project. Properly describing I/O makes for an easier job for everyone!

Just Linux Things

As a follow-up to my previous post, I noticed that a lot of the recent open-source technologies are extremely Linux-centric. It all started around the time GNOME3 and systemd were introduced. Both heavily rely on facilities present in Linux-based operating systems, making them difficult to port to other Unix platforms like the BSDs. These technologies are also strongly promoted to entice prospective developers. While I understand the need for platform-centric efficiency inherently tied to Linux-specific features (cgroups, etc.), it is also important not to ignore the rest of the IT ecosphere. Yes, I mean even Windows, which is normally not considered on equal terms with Unix, but is relevant when talking about C# or .NET applications.

A recent example of this trend is Docker. Containers are now the new pink and everyone wants to get a piece of the pie. Docker barely reached release 1.x, yet some companies already make claims about “widespread adoption”. I thought industries preferred stable and tested-for-years solutions. I find this new craze odd, to say the least. As expected, Docker is a Linux thing. While the containers are indeed OS-level on GNU/Linux systems, much like LXC (Linux Containers), they are not on Windows or MacOS X. Oddly enough, the latter uses a Unix virtual machine manager, xhyve (based on FreeBSD’s bhyve). Therefore, despite the fact that developer interfaces are similar or even identical, the engines running underneath will have a substantial overhead on non-Linux systems. At that point one might consider whether native and more established solutions are not already available and more suitable for multi-container setups. On FreeBSD we have Jails and a ton of Jail managers to make one’s life easier. I have a feeling that Jails on FreeBSD would do a lot better than Docker containers on GNU/Linux. Not to mention that a FreeBSD base system is a lot slimmer than whatever one could consider a base system in the GNU/Linux world. “Widespread adoption” seems to be lacking, because most of the world’s servers run GNU/Linux.

Another weird trend I notice is identifying everything Linux-related with Ubuntu, as if Ubuntu were the only True Linux Distribution. I often read articles that claim to touch on Linux, but in reality discuss Ubuntu. A square is a rectangle, but not all rectangles are squares! That’s so obvious, no? This “Linux = Ubuntu” assumption hurts the whole ecosystem quite a lot. People learn how to use Ubuntu and think they’ve mastered Linux. Then, they’re dropped into a den full of Gentoos and CentOS’, and they end up suffering. Ouch! A fallout of this worrying trend is the fact that people deploy Ubuntu in places where a more lightweight GNU/Linux distribution would be a lot more suitable. That’s one of the reasons why Docker eventually switched from Ubuntu to Alpine Linux. With the wealth and diversity of the GNU/Linux ecosystem, one doesn’t even need to go far.

It’s Time for FreeBSD!

For the last couple of weeks I have been delving deeper into the arcane arts of FreeBSD, paying extra attention to containers (jails), local/remote package distribution (poudriere) and storage utilities (ZFS RAID, mirrors). Truth be told, with every ounce of practical knowledge I was increasingly impressed by this Unix-like operating system. It’s nothing short of amazing, really! The irony lies in the fact that FreeBSD is an underdog in the Unix world (less than OpenBSD or Illumos, but still), despite the fact that it excels as a server environment and established many technologies currently in focus (process isolation, efficient networking and firewalls, data storage, etc.) years ago already. GNU/Linux is picking up pace, but it still has a long way to go.

I feel traditional Unices got the “base system + additional applications” separation right. One might think it’s just a matter of personal taste – the “order vs chaos” debate going on for ages. The truth is that this separation is not only reasonable, but also extremely useful when one begins to treat the operating system as more than a mere Internet browser or music player. Order is paramount to organizing and securing data from ill-intent or hardware failure. I really appreciate the use of /usr/local, /var/db, /var/cache and other typical Unix volumes on FreeBSD, as it makes the system more predictable, and therefore hassle-free. When we have a multitude of systems to care about, hassle-free becomes a necessity. With the new container technologies like Docker it’s a realistic scenario – 1 host system serving N guest systems. One doesn’t need to run a server farm to get a taste of that.

This is basically where (and when) FreeBSD comes in. It’s a neatly organized Unix-like system with great storage capabilities (ZFS), process isolation for guest systems (jails and bhyve), network routing (ipfw, pf, etc.) and package distribution (synth, poudriere, etc.). And everything is fully integrated! The “pkg” package manager knows what jails are and do, and can easily install programs to them. Poudriere coordinates building new packages, using jail containers so that the host system is not compromised. These packages can later be distributed via HTTP/HTTPS/FTP remotely or locally via the file protocol. Such low-level integration is somewhat foreign to the GNU/Linux world, though among server distributions like OpenSUSE, CentOS or Ubuntu Server it is constantly improving.
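
To give a rough idea of what this integration looks like on the command line, here is a minimal dry-run sketch. It only prints the commands instead of running them. The jail names (110amd64, webjail), the package (nginx) and the package list path are hypothetical, and I assume the ports tree is simply called “default”.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Jail names, versions and paths are hypothetical examples.

# Commands to create a poudriere build jail and ports tree, then
# build every port listed in a file.
# Arguments: jail name, FreeBSD version, package list file.
poudriere_plan() {
    echo "poudriere jail -c -j $1 -v $2"
    echo "poudriere ports -c -p default"
    echo "poudriere bulk -j $1 -p default -f $3"
}

# Command to install a package into a running jail from the host.
# Arguments: jail name, package name.
pkg_in_jail() {
    echo "pkg -j $1 install $2"
}

poudriere_plan 110amd64 11.0-RELEASE /usr/local/etc/poudriere.d/pkglist
pkg_in_jail webjail nginx
```

The point of the sketch is that the host-side tools take the jail as just another argument, so no separate tooling has to live inside each guest.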

Still, whenever I think about the divide between BSD and GNU/Linux, I see a tall brick wall that both sides are struggling to tear down. FreeBSD wants to become more desktop-oriented, while GNU/Linux is trying to reinforce its server roots. Difficult to tell whether this is good or bad. The BSDs do indeed excel as server systems, as recently highlighted in a NASA study. GNU/Linux is more suited for heavy computation and leisure. The brick wall has plenty of nicks, yet it stands strong. Maybe there is a “third option”? Why not let each do the job they do best? What I mean to say is that FreeBSD has its place in the world and the time is ripe to truly begin to appreciate it!

FreeBSD-Debian ZFS Migration

Since the Zettabyte File System (ZFS) is steadily getting more and more stable on non-Solaris and non-FreeBSD systems, I decided to put my data pool created for the previous entry to the test. In principle, it should be possible to migrate a pool from one operating system to another. Imagine the following scenario – a company is getting new hardware and/or new IT experts and needs to migrate to a different OS. In my case it was from FreeBSD to Debian and vice versa. All data volumes were located in a single pool, but depending on the size of the company, it might be several pools instead. Before even thinking of migrating it is first important to make sure that all I/O related to the pool(s) to be migrated was stopped. When the coast is clear we can “zpool export <pool>” and begin our exodus to another operating system.
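
The export step can be sketched as a short dry-run script. It prints the commands rather than executing them, and the function name export_plan is made up for illustration; “zdata” is the pool used throughout this entry.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Argument: name of the pool to export.
export_plan() {
    # Check pool health before touching anything.
    echo "zpool status $1"
    # Sample pool I/O five times at one-second intervals to confirm it is idle.
    echo "zpool iostat $1 1 5"
    # Record datasets and mountpoints for comparison after the import.
    echo "zfs list -r -o name,mountpoint $1"
    # Unmount all datasets and mark the pool as exported.
    echo "zpool export $1"
}

export_plan zdata
```

Keeping the dataset listing around is handy later, since the mountpoints are exactly what tends to clash on the target system.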

From FreeBSD to Debian
After exporting the zdata pool I installed Debian Testing/Stretch onto the system-bound SSD drive. ZFS is not part of the base installation, hence all pool imports need to be done after the system is ready and the zfs kernel module is built from the zfs-dkms and spl-dkms packages. apt resolves all dependencies properly so the only weak link is potential issues with building ZFS on GNU/Linux. Should no problems occur, we can proceed with importing the ZFS pool. GNU/Linux is cautious and warns the user about duplicate partitions/volumes. Those will not be mounted, even if the pool itself is imported successfully. Thankfully, conflicts can be resolved instantly by using a transition partition/drive to move data around. Once that’s done, our ZFS pool is ready for new writes. Notice that the content of /usr/local/ will undergo major changes as FreeBSD uses it for storing installed ports/packages and their configurations. In addition, /var/db will contain the pkg sqlite database with all registered packages. While this does not specifically interfere with either apt or Debian (apt configurations are in /var/lib and cached .deb packages in /var/cache/apt/archives), it’s important to take notice of.
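
On the Debian side, the sequence might look roughly like the dry-run sketch below, which prints the commands without running them. Two assumptions of mine: the zfsutils-linux package name for the userland tools (the text above only names zfs-dkms and spl-dkms), and importing with -N first so that nothing is mounted until the mountpoints have been inspected.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Argument: name of the pool to import.
import_plan_debian() {
    # Build the kernel module; zfsutils-linux is an assumed package name.
    echo "apt install zfs-dkms zfsutils-linux"
    echo "modprobe zfs"
    # Without arguments this only lists pools available for import.
    echo "zpool import"
    # Import the pool but do not mount any of its file systems yet.
    echo "zpool import -N $1"
    # Inspect mountpoints that might clash with the Debian layout.
    echo "zfs get -r mountpoint $1"
    # Mount everything once conflicts are sorted out.
    echo "zfs mount -a"
}

import_plan_debian zdata
```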

From Debian to FreeBSD
Here, the migration is slightly smoother. The “bsdinstall” FreeBSD installer is designed in a more server-centric fashion (and ZFS is integral to the base system) so the ZFS pool can be connected and imported even before the first boot into the new system. The downside is that FreeBSD does not warn about “overmounting” system partitions from the zdata pool so it’s relatively easy to bork the fresh installation. Also, /var/cache will contain loads of unwanted directories and /usr/src, /usr/obj, /usr/ports and /usr/local need to be populated anew just like during a brand new FreeBSD installation.
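
One way to sidestep the overmounting problem is to import the pool under an alternate root first. This is a hedged dry-run sketch, not what I actually ran: it prints the commands only, and the /mnt alternate root and the function name are assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Arguments: pool name, alternate root directory.
import_plan_freebsd() {
    # Import with all mountpoints prefixed by the alternate root,
    # so nothing lands on top of the live /usr or /var.
    echo "zpool import -R $2 $1"
    # Review where each dataset would normally be mounted.
    echo "zfs list -r -o name,mountpoint,mounted $1"
    # After fixing mountpoints, re-import the pool at its real locations.
    echo "zpool export $1"
    echo "zpool import $1"
}

import_plan_freebsd zdata /mnt
```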

Either way, the migration process is not too difficult and definitely not horrendously time-consuming. Should the user/administrator have PostgreSQL, MySQL or other SQL-like databases in /var/db, extra steps might need to be taken to ascertain forward and backward compatibility of the database packages. In the end, it’s a matter of knowing what each OS places where. FreeBSD is structured in a very intuitive and safe (from an administrator’s point of view) way. Debian, just like any other GNU/Linux distribution, is a bit more chaotic, hence more caution is required. Both are good in their own regard, which was my incentive for migration testing.

FreeBSD – SSD + 2xHDD ZFS Installation

I recently got an extra 2 TB hard drive for my mighty (cough, cough, maybe some 9-10 years ago) HP Z200 workstation running FreeBSD 11.0-RELEASE so I decided to finally build a proper 2-drive RAID (Redundant Array of Independent Disks) mirror. I read the zfs and zpool manual pages (manpages) thoroughly on top of the related FreeBSD Handbook chapters and got to work. Since I also have a 160GB SSD inside that PC, some tinkering was required. The main issue was that SSDs make use of TRIM for improved block device balancing. UFS provides TRIM support, but ZFS does not. Initially, I thought of having two separate ZFS pools – zroot for root-on-zfs and boot snapshots on the SSD and zdata for high volume data partitions like /usr and /var on the 2-drive array. However, after careful consideration I came up with a simpler partitioning scheme:

160GB Intel SSD:
MBR -> BSD:
    141G   freebsd-ufs    (TRIM enabled; mounted as “/”)
    8G     freebsd-swap

zdata mirrored array on 1.5T Seagate Barracuda + 2T WD Caviar Green:
GPT:
    1.32T  freebsd-zfs    (on each drive)

With such a partitioning scheme I lost boot snapshots, though it was a lot easier to install the OS as I could rely on the standard FreeBSD installation procedure (bsdinstall) entirely. First, I performed a standard installation via bsdinstall onto the SSD. Next, I created a 2-drive ZFS pool and named it “zdata” following the Handbook. I made sure that all parent partitions like /usr and /var were mounted from the SSD and only the variable and expandable sub-directories like /var/db, /usr/ports, /usr/src, /usr/local, etc. were placed on the ZFS pool. Since each of those required a parent directory in the ZFS pool, I used /zdata/usr and /zdata/var, respectively. That way the /usr and /var mountpoints did not get overridden with empty /usr and /var directories from the ZFS pool. This protects the core system from getting wiped if one of the ZFS drives fails. In addition, the system can be reinstalled easily and the ZFS pool added later without major setbacks. The trick is that all ports are installed to /usr/local and the package manager database is in the /var/db directory. Flexible, easy and extremely well documented.
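
The pool layout described above can be sketched as a dry-run script that prints the commands without running them. The device names (/dev/ada1p1, /dev/ada2p1) are hypothetical stand-ins for the freebsd-zfs partitions on the two drives, and the function name is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Arguments: pool name, first partition, second partition.
pool_plan() {
    # Two-way mirror; the pool itself mounts at /<pool> by default.
    echo "zpool create $1 mirror $2 $3"
    # Parent datasets stay under /<pool>/usr and /<pool>/var,
    # so the real /usr and /var keep coming from the SSD.
    echo "zfs create $1/usr"
    echo "zfs create $1/var"
    # Only the sub-directories are mounted into the live tree.
    echo "zfs create -o mountpoint=/usr/local $1/usr/local"
    echo "zfs create -o mountpoint=/usr/ports $1/usr/ports"
    echo "zfs create -o mountpoint=/usr/src $1/usr/src"
    echo "zfs create -o mountpoint=/var/db $1/var/db"
}

pool_plan zdata /dev/ada1p1 /dev/ada2p1
```

One caveat worth keeping in mind: a dataset mounted over a non-empty directory hides whatever was there, so existing content (for example in /var/db) would presumably need to be copied into the dataset first.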

Just to clarify, the above is no rocket science and can be done very easily with the tools immediately available in the core FreeBSD installation. This should really be highlighted more as apart from the descendants of Solaris, FreeBSD is the only operating system that offers such capabilities out-of-the-box. GNU/Linux systems have their own RAID and volume management tools, but they’re definitely not as established as ZFS. The GNU/Linux alternative to ZFS is btrfs, as it too combines a volume manager with a file system. However, key features like RAID-5/6 are still unstable and no GNU/Linux distribution offers btrfs-only setups.