2008-01-27

Stone-Age Sysadmin Sunday: Dealing with CPIO Archives

Some truly revolutionary things have been around in UNIX for a very long time because they're just that good. Some are around as-is: find, netstat or ifconfig for example. Others have been improved upon ad infinitum, like vim (a better, sleeker version of vi that somehow stays true to its vi roots).


Then, you have ugly things that just won't die. Case in point: cpio. It was originally meant as a stream archiver for tape storage, but any sysadmin worth their salt will eventually run across some silly software package or archive file that's in cpio format. And inevitably, getting data out of that archive will be important for one reason or another to avoid some "crisis." Cpio is legacy for many a good reason, but you will be forced to square off against it one of these days, and it's best not to enter the arena unarmed. This week will be a shorter, more succinct edition of Sysadmin Sunday, as we delve into cpio.

Also, I started writing this article several weeks back.  I usually queue up several articles ahead of time, especially the weekly Sysadmin Sunday feature.  I had no intention of actually having to mess with cpio, but just like talking about what to do in the event of a catastrophic system failure often leads to having to deal with one, writing about cpio did not come without its consequences.

I was forced to deal with cpio last week at work. It seems there are some old scripts that keep using cpio for archiving data. I found out that the version of cpio that we had running wouldn't acknowledge files owned by users or groups with a UID or GID higher than 65,535. That number is the largest value that fits in 16 bits, so this is the typical "no 32-bit UID/GID support" problem that got fixed in the REST OF UNIX back in the 1990s sometime. While poking around, I found that the earliest version of cpio was actually written by a caveman named Ungh in abacus-native machine code. The dude had serious skills. Apparently, though, no one's bothered to update it in the last few millennia.
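If you suspect you're about to hit the same wall, it may help to find out which files have IDs above that limit before you archive anything. Here's a rough sketch, assuming a find that accepts numeric -uid and -gid tests with a +N argument (GNU find does). The function name count_high_ids is made up for this example.

```shell
# Hypothetical helper: count files under a directory whose owner or group
# ID is above 65535, the limit described above.
# Assumes find supports -uid/-gid with +N ("greater than N"), as GNU find does.
count_high_ids() {
    find "$1" \( -uid +65535 -o -gid +65535 \) -print 2>/dev/null | wc -l
}

# Example: check the current directory tree.
count_high_ids .
```

Anything other than zero means at least some files could get skipped by an old cpio like the one I ran into.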

Create a cpio archive
(not recommended... haha!)
cpio expects standard input to bear a list of files to archive. The best way to do this is to pipe find output into it. Keep in mind that cpio does NOT strip leading slashes. If you restore a cpio archive that was created with absolute paths, it WILL overwrite data starting at the root directory. For this reason, I recommend you cd to the root directory and run a find with a relative path from root instead of running find with an absolute path.

$ cd / && find path/to/archive | cpio -o > file.cpio

Extract files from a cpio archive
(so you can put them on something more modern, like 5 1/4" floppies)

The simplest way to extract from cpio is like this:
$ cpio -id < file.cpio

Occasionally, someone will compress the file with the UNIX compress command, which will give you a file ending in .Z. Use zcat for this:
$ zcat file.cpio.Z | cpio -id

And similarly, if you run across the bizarre gzipped cpio file, you can do this (on systems that don't ship gzcat, gzip -dc or gunzip -c does the same job):
$ gzcat file.cpio.gz | cpio -id
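For the curious, here's a full round trip with a gzipped archive, sketched in a scratch directory with made-up file names. It uses gzip -dc rather than gzcat, on the assumption that gzip itself is installed even where the gzcat name isn't.

```shell
# Scratch area with a tiny tree to archive (all names are made up).
workdir=$(mktemp -d)
mkdir -p "$workdir/src/data" "$workdir/out"
echo "payload" > "$workdir/src/data/file.txt"

# Create the archive and compress it in one pipeline.
(cd "$workdir/src" && find data | cpio -o | gzip > "$workdir/demo.cpio.gz") 2>/dev/null

# gzip -dc decompresses to stdout, feeding cpio just like gzcat would.
(cd "$workdir/out" && gzip -dc "$workdir/demo.cpio.gz" | cpio -id) 2>/dev/null

# Read the restored file back to confirm it survived the trip.
restored=$(cat "$workdir/out/data/file.txt")
echo "$restored"
```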

(Props to [jdoublep] for the hilarious post title)
