OSS is dead. Long live OSS!

April 8th, 2007

A question I am often asked is: “OSS is deprecated, so why are you still developing and maintaining it?”

This is a short and simple question. However, the answer is neither short nor simple. First of all we need to go back some 10 years in time.

Once upon a time in Linux there was a tiny sound subsystem called VoxWare (formerly known as the Linux Sound Driver). It was maintained by me and released under the GPL for Linux (and under the BSD license for FreeBSD and some other Unix variants). That piece of code was included in the Linux kernel source tree. I was working on the code “just for fun”. However, it became too difficult to work on the sound stuff in my spare time while also working on some Windows projects for my living. I was contacted by 4Front Technologies and we decided to make our living with a commercial version of OSS.

Unfortunately it took too long to find the proper procedure to support the GPL/BSD version and the commercial one from the same source tree. So a well known Linux distribution vendor got irritated and hired another person to create another version of OSS for them (without even asking me to do that). The result was rather different from my plans for the future, so I had to quit as the maintainer of sound for Linux.

Since that moment the kernel (OSS/Free) and the commercial OSS versions have been maintained by different teams. Unfortunately the Linux kernel version of the API got frozen to the OSS 3.8 version while we continued the development of the official API. In addition the OSS/Free version was (unfortunately) restructured so that most of the common (device independent) code was duplicated in the individual low level drivers. This made it impossible to keep the kernel drivers up to date with the development made to the official OSS version. The result was that the kernel drivers got frozen to the 3.8 version forever, unfortunately.

Then a couple of years later a group of fearless programmers created an entirely different, incompatible and Linux-only sound API called ALSA. They pushed it to the Linux kernel tree and the old OSS/Free version was declared “deprecated”. It was supposed to become more advanced than OSS/Free 3.8. It was released under the GPL (only) so it seemed to be the right thing. However application programmers didn’t like the ALSA API and continued to use OSS instead. It was necessary to declare OSS “deprecated” to push application developers to support ALSA instead of OSS.

However even that was not enough. Application developers still preferred OSS. This was bad for ALSA because they had to provide OSS emulation. In addition the kernel level OSS emulation bypassed some features (such as dmix) that ALSA has implemented at the library level. So the OSS emulation was later implemented at the library level. However providing OSS emulation in ALSA caused some side effects. Developers of audio applications still refused to convert to ALSA because the OSS API was still available. So an even more aggressive policy was needed.

So far the pro-ALSA Borgs have managed to get Linux distributions to compile most audio enabled applications with just the ALSA plugins enabled (all OSS support is stripped). In some cases the distributions even try to prevent users from removing ALSA and installing OSS by keeping ALSA’s mixer interface busy (the Gnome/GTK mixer applet is immediately relaunched if it gets killed). Or the kernel may have been modified to keep parts of the kernel’s sound core included even if sound support is completely disabled in the kernel’s configuration. “We are the ALSA project. Your system will be assimilated. Resistance is futile”. Has anybody ever heard about “freedom of choice”?

ALSA was officially included in the Linux 2.6.0 kernel that was released more than 3 years ago (December 2003). If ALSA is as great as they claim then shouldn’t it have completely replaced OSS in all applications during that time? Apparently that has not happened so far. Will it happen during the next three years? I don’t think so.

There is a relatively small community of ALSA believers who have written most of the currently available ALSA applications (usually called ALSA this or ALSA that). Older applications still support OSS in addition to ALSA. Some newer ones are ALSA-only because their developers have been told that the OSS API will disappear tomorrow. However the ALSA API is still almost completely undocumented (three years after its release), so how can anybody expect that programmers could develop good applications based on it?

A funny detail is that even some key developers of ALSA now suggest that developers use the Jack API instead of alsa-lib (btw, Jack has a fully functional OSS plugin). Somehow this is starting to smell like Emperor’s New Clothes.

Back to the subject. The latest Linux 2.6.20 kernel still includes the obsolete, 10+ year old OSS version. It’s being killed (for a very good reason). However it looks like we are getting a very long funeral. ALSA too has OSS emulation. In fact there are two redundant versions of it: one in the kernel and another implemented at the library level. Both of them emulate only the now obsolete 3.8 API version. This is the dead and deprecated OSS.

However this is not the only OSS. We at 4Front have continued working on Open Sound System for all the past years. It has become the real Common Unix and Linux Sound Solution (CULSS). In addition to Linux it’s now the official sound subsystem for all the Unix variants (other than MacOS). However for many Linux diehards it’s not an alternative because:

  • It’s not GPLed (yet). Instead it’s a commercial product by some evil capitalist pigs.
  • It’s not in the Linux kernel source tree so it doesn’t exist.
  • It’s being used also by the public enemies of Linux.
  • It’s “binary only”.

For the above reasons the benefits of OSS are widely ignored:

  • It’s based on the widely known Unix/POSIX/Linux device model.
  • It’s fully documented (OTOH some parts of the documentation are still under construction).
  • The API is simple and compact which makes it very easy to use for programmers.
  • It has been there for 15 years so practically all applications already support it.
  • It’s kernel only.
  • It’s designed to work under general purpose operating systems such as Linux and Unix. There is no need to use any special real time enabled kernels (they can be used but it’s not a requirement).
  • The limitations and “idiosyncrasies” referred to by ALSA’s marketing propaganda were fixed years ago.
  • Fully dynamic minor/major device number allocation which permits unlimited number of audio/MIDI/mixer devices.
  • New device naming that makes applications immune to changes in the device configuration (installing and removing devices).
  • Transparent virtual mixing that makes it possible for any number of applications to share the same physical audio device(s). This also works for recording and full duplex.
  • Powerful device enumeration support.

Then we have ALSA which is:

  • Not documented. Use the Source, Luke!
  • The API is not compatible/similar with anything else (past, present or future).
  • Very thin device abstraction.
  • The API is designed for low/zero latency which makes it very challenging to use in normal applications that don’t have any latency requirements.
  • Requires redundant library layers in addition to the kernel space code (alsa-lib, Jack). This causes increased memory requirements in embedded systems.
  • Has an enormous number of functions (1500+ a couple of years ago). The majority of the calls have not been used by any applications (and many applications use different functions than any others). The massive number of unnecessary library functions increases the memory footprint even further. And what about the CPU consumption? And will anybody ever be able to document (or even test) all of them?
  • There are multiple (redundant) transfer methods for audio. How does the programmer know which one should be used with given hardware?
  • Some devices use interleaved channels (for stereo and multichannel) while some others use non-interleaved.
  • Static minor number assignment that wastes the available device/card space. The number of cards, devices and subdevices possible in the system is limited.
  • Strange configuration file mechanism that requires a degree in LISP programming to understand it.
  • Sharing of devices is based on the dmix feature that nobody but experts can configure properly.
  • The API is based on callbacks which requires deep programming knowledge from the developers. Gotos have been considered harmful for decades. Callbacks are even worse (in fact they are a re-incarnation of the famous come-from statement).

So which one should be declared as deprecated? As we are talking about APIs the right authority to make the decision is the application developers. They have their “freedom of choice”.

Actually it’s not nice to compare OSS against ALSA in this way so I won’t continue any further. However they have done the same for years (see ALSA’s web page (before they remove that stuff)). So I couldn’t resist. At least we have given them three years of time to discover and fix the above problems but nothing seems to have happened. And I didn’t even mention MIDI yet. Maybe I should do it next…

Regards,

Hannu

Why OSS is OSS?

April 5th, 2007

Hi again guys!

Over the past years OSS has been criticized because it doesn’t support this and that feature of this and that sound card. Instead it limits applications to some common subset of features found on every sound card. In general this is true. However is it actually a problem?

Using sound in applications is in many ways similar to using networking. For example a web browser simply tells the networking software to connect to a given TCP port of a given server (name/address). Networking software then builds the connection and that’s it. Web browsers don’t try to find out which network interface card is installed in the system. They don’t try to find if the device has some hardware parameters that could be tweaked to get better performance. In fact the networking software doesn’t even let them do that (such changes would ruin the performance of the other applications using the same NIC). Instead the NIC settings are managed by the kernel’s network core/drivers which have better knowledge about the local network.

This is known as the “black box” model. A web browser application sees only the networking subsystem box. There may be some control switches on the front panel of the box. However they are related to the current stream/socket. The actual NIC devices are located on the back side of the box. There may also be some control switches for the devices but the application cannot see them. Instead they can be used by the system operator (if necessary).

The OSS API is based on this black box approach too. The application doesn’t need to worry about the device parameters at all. Instead it just tells the OSS box to create an audio stream with given parameters (sampling rate, number of bits and number of channels). After that the application programmer can focus on doing his/her job. OSS will automatically perform the required conversions in software if the device doesn’t support the requested format itself. When the stream is running the programmer can use the usual POSIX/Unix/Linux system calls such as read(), write() and select()/poll() to feed more data.
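As a rough illustration of the paragraph above, here is a minimal sketch in C of a playback stream opened through the device file interface. The device path passed in by the caller (typically something like /dev/dsp), the helper names (oss_open_playback, fill_square, play_tone), the square wave and the chosen sampling rate are all assumptions made for this example; only the ioctl names come from <sys/soundcard.h>.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Hypothetical helper: open a device for playback and ask for
 * 16 bit native endian samples with the given channel count and rate.
 * Returns a file descriptor, or -1 if anything fails. */
int oss_open_playback(const char *path, int rate, int channels)
{
    int fd = open(path, O_WRONLY);
    if (fd == -1)
        return -1;

    int fmt = AFMT_S16_NE;
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, &channels) == -1 ||
        ioctl(fd, SNDCTL_DSP_SETFMT, &fmt) == -1 ||
        fmt != AFMT_S16_NE ||
        ioctl(fd, SNDCTL_DSP_SPEED, &rate) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: fill buf with an interleaved square wave.
 * Returns the number of samples (frames * channels) written. */
size_t fill_square(int16_t *buf, size_t frames, int channels, int rate, int freq)
{
    assert(channels > 0 && rate > 0 && freq > 0);
    size_t half_period = (size_t)(rate / (2 * freq)); /* frames per half cycle */
    if (half_period < 1)
        half_period = 1;
    for (size_t i = 0; i < frames; i++) {
        int16_t s = ((i / half_period) % 2 == 0) ? 8000 : -8000;
        for (int c = 0; c < channels; c++)
            buf[i * (size_t)channels + (size_t)c] = s;
    }
    return frames * (size_t)channels;
}

/* Play about one second of a 440 Hz tone using plain write() calls.
 * Returns 0 on success and -1 if the device cannot be used. */
int play_tone(const char *path)
{
    int fd = oss_open_playback(path, 48000, 2);
    if (fd == -1)
        return -1;

    int16_t buf[4800 * 2]; /* 0.1 s of stereo audio */
    size_t n = fill_square(buf, 4800, 2, 48000, 440);
    for (int i = 0; i < 10; i++) {
        if (write(fd, buf, n * sizeof buf[0]) == -1) {
            close(fd);
            return -1;
        }
    }
    close(fd);
    return 0;
}
```

A program would simply call play_tone("/dev/dsp") from main(). Note that the sketch never asks which sound card is installed; it only states the stream parameters it wants.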

The black box model makes OSS audio applications very simple and robust. About 95% of applications don’t need to use any advanced techniques to get audio working (unfortunately this seems to be very difficult to accept by the developers). Practically all audio applications I have examined do completely unnecessary things that don’t make anything better. In fact in many cases such applications will simply break in systems that have slightly different sound cards (usually some of the high end professional ones).

Of course there are special applications that need to be fully aware about the hardware details. For example it doesn’t make any sense to send encoded digital bit streams (such as AC3) to an ordinary analog output device that cannot decode them. Or an audio recorder/editor application probably should prevent the user from recording a 24bit/192kHz/5.1 file from a (modem) device that supports only 8bits/8kHz/mono. However such applications are rare (about 5% of all applications doing audio). They are usually audio oriented and designed by programmers with good audio knowledge.
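For the rare hardware-aware application described above, a capability check might look roughly like the following sketch. It assumes the SNDCTL_DSP_GETFMTS call and the AFMT_* constants from <sys/soundcard.h>; the helper names are made up for this example, and exactly which formats a given OSS version reports (native ones only, or converted ones too) should be checked against its documentation.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>

/* Hypothetical pure helper: does the format mask contain fmt? */
int mask_has_format(int mask, int fmt)
{
    assert(fmt != 0);
    return (mask & fmt) == fmt;
}

/* Hypothetical helper: ask the device which sample formats it reports.
 * Returns 1 if fmt is listed, 0 if it is not, -1 if the query failed. */
int device_supports_format(int fd, int fmt)
{
    int mask = 0;
    if (ioctl(fd, SNDCTL_DSP_GETFMTS, &mask) == -1)
        return -1;
    return mask_has_format(mask, fmt);
}
```

A player that wants to pass through an AC3 bit stream could call device_supports_format(fd, AFMT_AC3) first and fall back to decoding in software when the answer is not 1.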

The majority of all applications doing audio require just a very limited number of features from the sound subsystem. Applications that require anything more are rather rare. For this reason the OSS API has been divided into a small subset of fundamental core functions that are easy to use and a wider set of less frequently used functions. This makes it very easy to learn and master OSS. At the same time programmers seeking some challenge can get it.

The above was the first half of my “mission”. The second half is that I was introduced to Unix in 1984. I have always liked its powerful capability to do mighty things with just a small set of carefully selected features. I have also done some MS-DOS (since the introduction of the HP150 microcomputer) and Windows programming during my life before Linux but I never liked that.

I think the above explains why OSS became what it is now. There are some fundamental rules I have tried to follow as much as possible:

o Use the familiar Unix/POSIX/Linux device/file API as much as possible. There are millions of programmers who already know how to use this interface. Developing an entirely new interface that is not compatible with anything earlier will just cause massive confusion and make it difficult to get programs working with the good old legacy API.

o Keep the number of features as small as possible. In this way the API can (hopefully) be properly tested and documented. A compact API is also easier for application developers to understand. Add new API features only if it’s absolutely necessary. In this way OSS currently has something like 200 ioctl functions. Even this may be too much but fortunately most applications need just a handful of them. There are good chances that we can actually document and test all these functions within a reasonable amount of time. Compare this to some competing API that has some 1500+ (and counting) functions. I would be very surprised if all of them ever get documented (or even used by any applications).

o Don’t require the application programmers to use nasty programming techniques such as callbacks or mutexes/semaphores. Instead let the programs work in a fully linear manner. Callbacks and semaphores are features that belong to the kernel code, not to the applications.

o Use the kernel’s device file interface (open, close, read, write, ioctl, select/poll) instead of some library interface. This interface uses very weak binding which makes it possible to use OSS enabled applications in systems that don’t have OSS installed (of course sound doesn’t work but the other features of the application can still be used). Library based applications in turn don’t even start if the required sound library is not installed (unless the program uses nasty dynamic linking to load some sound plugin).

o Let applications do only the things they should do. For example try to prevent ordinary audio applications from changing the global output volume (that would disturb all the concurrently running applications). Instead provide a way to change just their own output level.

o Provide full backward and forward compatibility of the API. Applications developed for any earlier versions of the OSS API should run without recompile under the very latest OSS version. Equally well applications developed and compiled for the latest OSS version should run under any earlier OSS version. This is possible if the application designer uses the suggested default actions when the new features are not available in the older system (ioctl returns errno=EINVAL).

o Keep the API endianness neutral. Applications should work without modifications if they are compiled under a big endian or a little endian architecture.

o Make the API 64/32 bit neutral. Applications compiled using a 32 bit compiler should run without any problems under a 64 bit compiled kernel.
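Two of the rules above, the device file interface with its weak binding and the EINVAL convention for missing features, can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and SNDCTL_DSP_GETODELAY is used purely as a stand-in for "some call an older OSS version might not know".

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Weak binding: a failed open() just means "run without sound".
 * Nothing has to be linked against a sound library. */
int open_audio_optional(const char *path)
{
    return open(path, O_WRONLY);
}

/* Call an int-valued ioctl. If the driver answers EINVAL (taken here to
 * mean the feature is unknown to this OSS version), store the suggested
 * default instead. Returns 0 when *value is usable, -1 on other errors. */
int ioctl_or_default(int fd, unsigned long request, int *value, int fallback)
{
    assert(value != NULL);
    if (ioctl(fd, request, value) != -1)
        return 0;
    if (errno == EINVAL) {
        *value = fallback;
        return 0;
    }
    return -1;
}

/* Returns 1 when sound is available and 0 when the program should
 * carry on silently. On success *fd_out holds the open device. */
int start_audio(const char *path, int *fd_out, int *delay_bytes)
{
    int fd = open_audio_optional(path);
    if (fd == -1) {
        *fd_out = -1;
        return 0;
    }
    /* Stand-in for an optional feature: fall back to 0 if unsupported. */
    if (ioctl_or_default(fd, SNDCTL_DSP_GETODELAY, delay_bytes, 0) == -1) {
        close(fd);
        *fd_out = -1;
        return 0;
    }
    *fd_out = fd;
    return 1;
}
```

The application keeps running in both failure cases; it merely disables its audio features when start_audio() returns 0.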

That is pretty much all of it. Comments are welcome and I will try to answer them as well as I can. You may wonder why I have only talked about audio but not MIDI. MIDI is an entirely different beast and I will discuss it later.

Blogging starting up - please wait..

March 22nd, 2007

Why am I blogging?

The reason is that I started working on OSS almost exactly 15 years ago. I had got a Sound Blaster 1.5 card a few months earlier and decided to write a Minix driver for it. I knew that there was something called Linux (I had seen the famous message by Linus on comp.os.minix and I was studying at the University of Helsinki at the same time as him). However it took several months before I finally got Linux to work with my SCSI disk (due to some IRQ assignment problems).

So I got the Minix driver working after a few weeks of hacking. For obvious reasons its performance was not acceptable and playback was just clicking. Later in the summer I had got Linux installed and the driver converted for it. The initial version of the Sound Blaster driver for Linux was released at the end of August 1992. This was the beginning of my never ending odyssey with Linux/Unix sound.

Why sound and why Linux/Unix?

The story started about 15 years earlier during the late 70’s. I heard a radio documentary about computer and electronic music. I was very impressed by the compositions made with Music V. I immediately knew it was something I would like to do. Then more than ten years later I saw a sound card made by some small Singapore based company called Creative Labs. I bought the card immediately and the SDK a couple of weeks later. That was in 1990 or was it 1991.

After some hacking under MS-DOS I found out that it was more challenging to get applications to talk to the card than it was to write the application itself. The SDK required use of techniques such as interrupts or callbacks. That was so lame that I had to start looking for some other approach. I had used Unix (HP-UX) since 1984 but the only alternative that worked on a PC (at that time) was the very expensive Xenix. I had heard about Minix from my friends and decided to try it. Btw, there was some other guy who ordered Minix before me in the same bookstore in Helsinki and I think that guy was Linus (I didn’t know him yet at that time).

After all the Linux sound driver got released. Initially it supported only Linux and the 8 bit mono Sound Blaster 1.0/1.5 cards (at that time nothing else was available). Then after a while a stereo card (Sound Blaster Pro) and a 16 bit stereo card (Pro Audio Spectrum 16) were introduced. A bit later came the Gravis Ultrasound one.

About the same time I found out that there were a few other PC Unix operating systems such as 386BSD (or was it BSD386), SCO Xenix386 and Novell Unixware. It was natural to expand the Sound Blaster only Linux driver to support multiple sound card architectures and operating systems. The driver got renamed to VoxWare (I didn’t know that some startup company (that was later acquired by Netscape) had registered the same name).

In autumn 1995 I was contacted by Dev Mazumdar who had made a SB Pro (MCA) driver for AIX. Soon after that we decided to join forces and I became an employee of Dev’s 4Front Technologies. Our initial plan was to port VoxWare to AIX and start selling the product for AIX and the other commercial Unix versions. Part of the plan was to continue supporting OSS as an open source project for Linux and BSD. However our first announcement was actually for Linux. We released USS (Unix Sound System) for Linux during summer 1996. Unfortunately that name irritated the owners of the Unix trade mark who suggested that we call it Open Sound System instead. So OSS was born.

Now 15 years later we have finally released OSS 4.0 which is the biggest milestone in the history of the product. That means 10 years of work since the OSS/Free drivers that are (still) included in the kernel source tree were frozen. The majority of the API changes have been developed during the past 5 years. It has taken 2.5 years to rewrite the core functionality of OSS to be compatible with the latest interfaces provided by different operating systems.

OSS 4.0 is very close to the original idea I had when I started working on sound drivers. It just took 15 years to get it done. Most of this time was spent on implementing drivers for dozens of sound cards and many different operating systems. However a significant amount of time has also been spent on working with the OSS application developers and on finding ways to make their applications work even better. But now it’s finished and OSS 4.0 is what it is. More about that later…