Friday, October 2, 2009

RPM Hell

End users these days are rarely exposed to "rpm". They'll probably be using yum or PackageKit (or a GUI-based software installer). RPM, FYI, is the backend package management tool used by Fedora, SUSE, Mandriva, etc. (together known as the RPM-based distributions). And it sucks big time!

Although RPM has come a long way from what it was in the 1990s and early 2000s, it stands on poor foundations and is unlikely to get any better soon. Having worked on Fedora-ARM for over half a year now, I can see the glaring issues that it faces. Fedora has done a hell of a nice job of mitigating those, but when you start to dig deeper, you realise how filthy the RPM world is.

Here is a link to a VERY old, pre-yum-days paper, "RPM Hell". It's a Google cache link, since the site that hosted the original paper was down. Here is the original paper.
Now the interesting thing is that most of the issues mentioned in this paper STILL EXIST and are very real!

The following issues stand out:

1) Package management is pathetic. If I want to install a terminal-based editor, in all probability I'll end up installing a whole load of GTK and X crap that I am never going to use. The main reason is that RPM packages are monolithic. Large packages like emacs typically install all the GUI files needed to support X, even on a headless server. So we have a (needless) dependency, emacs -> gtk. gtk in turn pulls in LOTS of dependencies of its own that are not really required, and I end up with a bloated installation. This particular problem can be solved by splitting emacs into two packages, emacs-common and emacs-gtk, so that I just need to install emacs-common. But we'll soon realise that there are many capabilities a piece of software can provide, and we can't keep making a separate package for each; we'd end up with an unwieldy distribution. So RPM should have a way of internally defining capabilities and installing only those that are really required.

While we were working on the Fedora-ARM project and making a basic rootfs for F-11, we had to build almost half of the GNOME packages to satisfy dependencies. WTH!! GNOME shouldn't even come into the picture yet! But that's what happens when you are working with monolithic packages!
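
To make the dependency blow-up from point 1 concrete, here is a minimal Python sketch. The dependency graph is entirely made up (the package names and edges are assumptions for illustration, not real Fedora metadata), and `install_set` is a hypothetical helper, not something RPM or yum provides. It just walks the graph and collects everything that would get dragged in.

```python
# Hypothetical dependency graph: names and edges are illustrative only,
# not taken from any real repository.
DEPS = {
    "emacs": ["glibc", "ncurses", "gtk"],   # monolithic package
    "emacs-common": ["glibc", "ncurses"],   # terminal-only split
    "emacs-gtk": ["emacs-common", "gtk"],   # GUI split
    "gtk": ["glib", "pango", "cairo", "libX11"],
    "pango": ["glib", "cairo", "fontconfig"],
    "cairo": ["libX11", "fontconfig"],
    "glib": ["glibc"],
    "libX11": ["glibc"],
    "fontconfig": ["glibc"],
    "ncurses": ["glibc"],
    "glibc": [],
}


def install_set(package, deps=DEPS):
    """Return the package plus everything it transitively depends on."""
    seen = set()
    stack = [package]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(deps.get(name, []))
    return seen


monolithic = install_set("emacs")
split = install_set("emacs-common")
print(len(monolithic), sorted(monolithic))
print(len(split), sorted(split))
```

In this toy graph the monolithic emacs drags in the whole GTK/X stack, while emacs-common needs only a couple of base libraries. The split helps here, but only because somebody did the split by hand for this one capability.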

2) There's no way to install multiple versions of a piece of software. Now this is not really an RPM problem; it is an inherent Linux issue. Rules like "all executables go to /usr/bin" make it difficult to install multiple versions, since there will be file conflicts.
So, if "A" needs B-1.0.0 but "C" needs B-2.0.0, you have to choose between A and C. Fedora has greatly alleviated this problem by making sure every package that requires "B" requires the same version of "B". There are a lot of rebuilds, etc. for this purpose. And they are partly justified (if we don't force packages to move to newer versions, they never will, and we'll end up with 10 versions of each package installed on our system). But sometimes you just need a way around, and there's no way you can do it.
E.g., there was a time when we had a working package built against Python 2.4 and the new distributions had Python 2.6. There were some difficulties because of which we couldn't rebuild the package against Python 2.6. If only there could be multiple versions installed! But sigh! We had to devise some dirty workarounds to get stuff working.
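
The file-conflict part of point 2 can be sketched in a few lines of Python. Everything here is an assumption for illustration: the package "B", its two versions, the file lists and the `file_conflicts` helper are all made up, and this is not how RPM itself is implemented. It only shows why two versions that both want the same path under /usr/bin cannot sit side by side.

```python
from collections import defaultdict

# Hypothetical packages: names, versions and file lists are made up.
PACKAGES = {
    ("B", "1.0.0"): ["/usr/bin/b", "/usr/lib/libb.so.1"],
    ("B", "2.0.0"): ["/usr/bin/b", "/usr/lib/libb.so.2"],
}


def file_conflicts(to_install, packages=PACKAGES):
    """Map every path claimed by more than one package to its claimants."""
    owners = defaultdict(list)
    for pkg in to_install:
        for path in packages[pkg]:
            owners[path].append(pkg)
    return {path: pkgs for path, pkgs in owners.items() if len(pkgs) > 1}


conflicts = file_conflicts([("B", "1.0.0"), ("B", "2.0.0")])
print(conflicts)
```

In this made-up example the two libraries can coexist because their file names carry the version, but both versions claim /usr/bin/b, so only one of them can be installed.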

There are many other issues that yum intelligently hides. Most of them are highlighted in the "RPM Hell" link above. Guess it's time to rethink RPM?


Vedang said...

I don't think "rethinking" RPM will solve the problem. In the end, dependencies are always going to be a problem with free software, because free means more choice, which automatically means choice problems. I'm sure debs also suffer from this, but to a lesser extent because:
a) apt-get has been around much longer than interfaces for rpms.
b) Debian/Ubuntu has a much richer/deeper repository than RH/Fedora (IMHO)
c) The Debian community is just so much cooler! (:P sorry, couldn't resist!)

Jitesh Shah said...

@vedang you'll be surprised at how rich Fedora's repos are, and you'll be surprised how cool the Red Hat community is :D

Anyway, that's not the problem at all. I have worked a bit on Debian package management too, and AFAIK Ubuntu chooses to subdivide packages, e.g. they have emacs-common, emacs-gtk, etc. That's at a finer level than RPM, but it doesn't solve the inherent issue with package management; it just mitigates it. (Fedora maintainers can do the same by playing around with spec files, but it wouldn't really be a "big gain".)

Also, "choice" isn't the problem. You may have 100 editors to choose from, but "RPM hell" will be seen when you try to install one of them :)

Anonymous said...

The overwhelmingly vast majority of 'RPM Hell' is self-inflicted, and has everything to do with repositories that are populated with rude packages and nothing to do with the very format of the packaging itself. How lazy to jump to any other conclusion!

If you mix shady repositories with packages that break compatibility, then you should expect problems. I'm disappointed so many people cop this 'RPM Hell' label as a slur against the OS or the packaging format instead of owning up to the real problem. As if a CPIO archive and some meta-data somehow will ruin civilization!

Packages bringing in too many dependencies? Of course this is preventable, but that's a lazy packager issue, and you will find you need to install an extra 10 MB of dependencies if you're missing certain libraries, etc. Ultimately, even on a 20 GB VM root disk, an extra 10 MB isn't significant, but it does provide something for people to worry at. Somehow this is a package format issue again, as if CPIO+Dependencies automagically inflate themselves. Again, how bizarre!