Kuro5hin.org: technology and culture, from the trenches
Designing object extensions for the Unix Operating Systems

By xL in Technology
Mon Feb 24, 2003 at 11:40:42 AM EST
Tags: Software (all tags)

There is no operating system out there that has fascinated geeks around the world as much as Unix has. I, too, have been swayed by this fascination. There's something in there that makes Unix feel like second nature in a way, even when you are still learning to work with it. The simplicity of the basic Unix design is what makes it so pervasive. What I have been looking for are methods to elevate this simplicity to a realm of applications that is currently devoid of this simplistic beauty: the desktop. Although the current generation of Desktop Environments is starting to reach a level of usability, this goal was not reached without sacrificing most of the Unix mindset along the way.


What I really want to tackle is the braindead way we have so far been trying to create object-oriented layers on top of our type-agnostic filesystem layer. I think we have been looking too much at the way Windows has been dealing with these things. As of late, some of those stupidities have even made it into Apple's MacOS X (although luckily not all of them). Here are my main areas of concern:

Encoding the filetype into the filename is lame

No matter how you put it, file extensions are crippled and confusing. Users don't want to know about them. Hiding them from view, as we've seen on Windows, is a bad idea. A file's mime-type and a reference to the application that produced it should be part of its regular attributes. The original MacOS hit gold in this department, its filesystem kept track of creator and type attributes. Apple chose not to implement this scheme on their MacOS X products, I suspect because it was actually causing mostly trouble on their old OS: Users communicated with other users and computers that had no clue about the 4 character attributes, rendering the scheme a burden. I think they were correct to dump the scheme in its current form, but instead of moving to the bad habits of their competitors, they should have elevated that scheme to the internet, where MIME is king. If the Operating System keeps type-information natively as MIME-types, typing is implicit with internet downloads and email attachments.

Absolute paths are a horrible pain and softlinks are a bad aspirin

Why, in the name of all the creatures in Richard Stallman's beard, are we still compiling applications with all kinds of absolute path references in this enlightened 21st century? Do we actively hate our users, do we want them to suffer figuring out whether we want our mandatory configuration files installed in /etc, /usr/local/etc, /usr/local/application/etc or whatever other evil place our nerdy little brains can come up with? We should find better things to do with our lives. Softlinks can help us a bit, but only so that we can see our filesystem degenerate into an angry fruit salad of links.

I want to fight this horror on two fronts: First, let's keep applications and their static datafiles in the same place by using directories as application containers. MacOS X has gone this way and I think it is really effective. We've been spreading our applications all around the filesystem like spraying housecats long enough. Secondly, I want to create a mechanism for having multiple user-controlled path definitions (in the style of $PATH) as an intrinsic part of filesystem navigation.

A last battle against absolute path references is a way to make softlinks act as hardlinks. Apple's alias system on the original MacOS is a good point of reference.

And then what?

All the ideas I will work out here have been put into a proof-of-concept application kit dubbed grace. The ideas are simple enough to put into kernel space, which I think is the only really sane place to get new paradigms enforced consistently. I will work this out as we go along. Note that none of the ideas outlined are truly original. They can be found in other operating systems and mostly stem from the eighties or earlier. My main purpose is getting these ideas together and seeing what they can do in the context of Unix. Only if these ideas can be made compatible with the ideas behind Unix will they be able to serve as a way forward for the open source movement. The implementation suggestions outlined here are simple enough to make it easy to adapt existing software. Most of the essential groundwork for these ideas has already been done, as they rely heavily on a filesystem that lends itself well to creative abuse. This heavier reliance on filesystem flexibility is already surfacing with the increasing popularity of maildir, which has proven that the relatively simple filesystem semantics are much better able to deal with traditional death traps like logging than the complex structures we traditionally have this urge to build on top of them. Filesystem coders are picking up on this, making a lot of the disadvantages in storage size and performance possibly related to these ideas go away.

I. Implementing extended attributes

The currently available desktop environments use XML files to store generic attributes about files in a directory. Unfortunately, Gnome's nautilus file manager has started doing this in the user's home directory since Gnome2, which means all file attributes are private by definition. There is nothing intrinsically wrong with XML files for attributes, except where they need support from a low level, which is the case with mime-types and creator-ids. Attributes that remain wholly inside the scope of userland can be in any complex format. In any case, these attributes should be kept in the same directory as the files, so that directories that are shared between users can still carry the benefits of attributes.

To make it easy for low-level code to work with mime-types and other small attributes, the soft-link is a construct that is easy to abuse as a carrier of small bits of information. An attribute of a file is thus a directory entry with the same name as the file, but with a translation that makes it possible to know the attribute's name and that allows it to co-exist with the file. Any scheme will do. For the reference implementation I chose the following scheme:

AttributePath (path, fileName, attributeType) = path + "." + fileName + ":<" + attributeType + ">"

DirectoryAttributePath (path, attributeType) = path + ".:<" + attributeType + ">"

It is best to keep attributeTypes short. Ideally, the scheme should stick to a maximum of 4 characters for the type, so that no unreasonable size limitations start surfacing. Applications that write files can now set their mime-type by creating a <mime> attribute. These files should also be accompanied by an <appl> attribute that contains a reference back to the application binary. In this phase, this will be an absolute path. In short, a typical directory would look something like this:

 ./
 ../
 .README:<mime> -> text/plain
 .README:<appl> -> /bin/less
 README
 .install.sh:<mime> -> application/bourne-shell
 .install.sh:<appl> -> /bin/sh
 install.sh
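The naming scheme and the symlink trick above can be sketched in a few lines of Python. This is only an illustration, not the grace code: the function names (attribute_path, set_attribute, get_attribute) are made up here, and the sketch assumes a POSIX filesystem that allows dangling symlinks and the characters ":", "<" and ">" in filenames.

```python
import os

def attribute_path(directory, file_name, attribute_type):
    # AttributePath = path + "." + fileName + ":<" + attributeType + ">"
    return os.path.join(directory, "." + file_name + ":<" + attribute_type + ">")

def directory_attribute_path(directory, attribute_type):
    # DirectoryAttributePath = path + ".:<" + attributeType + ">"
    return os.path.join(directory, ".:<" + attribute_type + ">")

def set_attribute(directory, file_name, attribute_type, value):
    # The attribute value is stored as the target of a (dangling) symlink.
    link = attribute_path(directory, file_name, attribute_type)
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(value, link)

def get_attribute(directory, file_name, attribute_type):
    # Returns the stored value, or None if the attribute is not set.
    try:
        return os.readlink(attribute_path(directory, file_name, attribute_type))
    except OSError:
        return None
```

With these helpers, set_attribute("/usr/src/app", "README", "mime", "text/plain") would create the .README:<mime> entry from the listing above, and get_attribute would read it back with a single readlink call, without ever opening the file itself.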

The reference shell application should show us this:

 [/usr/src/app]% ls
 --rw-  pi        458.0 K Text     README
 --rwx  pi         19.4 K Script   install.sh
 [/usr/src/app]% ls -m
 --rw-   458.0 K text/plain                README
 --rwx    19.4 K application/bourne-shell  install.sh

Ideally, if a file is somehow without attributes, the system should be able to fall back to resolving file extensions or using a mime-equivalent of the "magic file" to figure things out.
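A rough Python sketch of that fallback order, under some assumptions: the function name resolve_mime_type and the application/octet-stream default are invented for illustration, the extension lookup leans on Python's standard mimetypes table rather than the mime.db file used by the reference code, and the "magic file" step is left out entirely.

```python
import mimetypes
import os

def resolve_mime_type(directory, file_name, default="application/octet-stream"):
    # 1. Prefer the explicit <mime> attribute stored as a symlink.
    link = os.path.join(directory, "." + file_name + ":<mime>")
    try:
        return os.readlink(link)
    except OSError:
        pass
    # 2. Fall back to guessing from the file extension.
    guessed, _encoding = mimetypes.guess_type(file_name)
    # 3. Give up with a generic type (a content-based "magic" check
    #    could be slotted in before this point).
    return guessed or default
```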

II. Introducing path volumes

A path volume is a collection of directories that is stuck under a volume name. As with the Unix shell's $PATH environment variable, files under such a volume can be resolved from one or more directories. The system adds a layer on top of normal readdir operations to combine multiple directories as overlays. When reading directories or files, the paths are searched right to left. When writing or creating a file, this is attempted left-to-right. When passing a path argument to a system routine handling files or directories, a reference to a file inside a path volume looks like this:

VolumePath = volumeName + ":" + path

This may look familiar to people who know the Amiga OS. The system reads these paths from the environment, where volume "foo:" references the variable $PATH_FOO, which contains a list of colon-separated paths. An example:

 [/etc/myapp]% ls
 --r--  root       1020   Text     master.cf
 --r--  root        4.0 K Text     buttons.cf
 [/etc/myapp]% cd /home/pi/.etc/myapp
 [/home/pi/.etc/myapp]% ls
 --rw-  pi          340   Text     styles.cf
 [/home/pi/.etc/myapp]% cd etc:myapp
 [etc:myapp]% ls
 --r--  root       1020   Text     master.cf
 --r--  root        4.0 K Text     buttons.cf
 --rw-  pi          340   Text     styles.cf
 [etc:myapp]% assign
 bin:        /bin
             /usr/bin
             /usr/local/bin
             /home/pi/bin
 lib:        /lib
             /usr/lib
             /usr/local/lib
             /home/pi/lib
 etc:        /etc
             /usr/etc
             /usr/local/etc
             /home/pi/.etc
 src:        /usr/src
             /home/pi/src
 mp3:        /home/mp3
             /home/pi/News/Pan
 home:       /home/pi/

I don't think it would be a wise idea to allow recursive volume references without good research into the security implications of such a venture. Programs with elevated privileges should take care that no untrusted paths are part of a volume they access for configuration or direction.

III. A brief venture into the creative abuse of directories

Since directories can contain properties, it is relatively easy to start using directories for more than just hierarchical storage. As in MacOS X, a directory could be defined as an application object. Objects within this directory can be used as executable connection points with the world. Even better, these connection points can help in making application objects self-describing with regards to handling of file types. Primarily, this makes it possible to "apply" a file object to an application object (i.e. open it), but other methods can be defined within the system. Let's look at the hypothetical directory layout of an editor application:

 myApp/
   .:<mime> -> system/application
   .mime/
     text/
       plain/
         :open
       html/
         :open
   :run
   .icon/
     toolbar/
       .back.png:<mime> -> image/png
       back.png
       .reload.png:<mime> -> image/png
       reload.png
       .stop.png:<mime> -> image/png
       stop.png
   .text/
     en_US/
       data.po
     en_UK/
       data.po
     nl/
       data.po
   .info/
     version -> 1.0.0
     author -> Pim van Riezen <pi@madscience.nl>

In a desktop environment, if an icon representing a file is to be dragged over the icon representing this application, the system needs to perform a few steps. First it checks if there is a :mime executable directly inside the application's directory. If not, it takes the left half of the file's mimetype (if it was text/plain, the left half is "text") and sees if there's a :open executable inside a subdirectory with the left half's name in the subdirectory .mime. If that fails, too, it will check with the full mime type (.mime/text/plain/:open in our example). If there's still nothing found, the file manager can ring alarm bells.
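The lookup order described above could be sketched like this in Python. The function name find_open_handler is hypothetical, and treating "executable" as "regular file with an execute permission bit we can use" is an assumption of the sketch.

```python
import os

def find_open_handler(app_dir, mime_type):
    # Split "text/plain" into its left half ("text") and right half ("plain").
    major, _, minor = mime_type.partition("/")
    candidates = [
        os.path.join(app_dir, ":mime"),                         # catch-all in the app directory
        os.path.join(app_dir, ".mime", major, ":open"),         # left half only
        os.path.join(app_dir, ".mime", major, minor, ":open"),  # full mime type
    ]
    for candidate in candidates:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Nothing found: the file manager can ring alarm bells.
    return None
```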

The system code that handles application launching needs a bit of smartness. An application needs to have knowledge of the exact path to its directory alongside the current working directory, so that it can access its own resources. When implemented properly (which is still beyond the scope of the current reference implementation), this removes another need for absolute paths.

IV. The quest for the holy creator application

If we want to adequately trace a file back to its creating application, we need some way of pointing to an application without reverting back to those evil absolute path descriptions. Ideally, we should be able to move an application object around without breaking the link between the application and the objects it created. A partial solution would be to reference back to the application from within the context of a path volume (i.e. set the <appl> attribute of a file just written by vi to "bin:vi"), but this still leaves a nasty after-taste of having to specifically designate certain directories for application objects. In a desktop environment, a user may not be inclined to think this way; ideally the user should be free to organize his filesystem any way he likes without keeping track of complex path bindings.

Apple's original MacOS used a desktop database file to keep these relationships intact, as a sort of reverse filing system. The major disadvantage of such an approach is that this database and the main filesystem can easily get out of sync. There is no way around keeping such a database, but an effort must be made to ensure that regular operations on the filesystem ideally do not need synchronization.

Luckily, Unix has one way of finding a filesystem object back without knowing its absolute path: The inode. With applications defined as directories, a reference to the inode of this directory is sufficient for the application to get to all its resources, regardless of where on the filesystem it is, as long as it still exists. An extra attribute with a unique random id for the application object can be used to prevent a reused inode from colliding later on.
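As a sketch of the inode-plus-random-id idea in Python: the helper names tag_application and reference_matches are made up, the device number from stat stands in for a volume id, and the random id is kept in a .:<apid> directory attribute. Note that this only checks whether a candidate directory is still the same application object; finding a directory from its inode number alone has no portable userland API, which is one reason the lookup would have to live in a lower layer.

```python
import os
import secrets

def tag_application(app_dir, name):
    # Give the application directory a random id, stored as a directory
    # attribute (a dangling symlink named ".:<apid>").
    apid = name + "/" + secrets.token_hex(4)
    link = os.path.join(app_dir, ".:<apid>")
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(apid, link)
    info = os.stat(app_dir)
    # The device number is used here in place of a volume id (assumption).
    return {"device": info.st_dev, "inode": info.st_ino, "apid": apid}

def reference_matches(candidate_dir, reference):
    # True only if the directory has the same inode AND the same random id,
    # so a reused inode does not collide with an old reference.
    try:
        info = os.stat(candidate_dir)
        apid = os.readlink(os.path.join(candidate_dir, ".:<apid>"))
    except OSError:
        return False
    return (info.st_dev, info.st_ino, apid) == (
        reference["device"], reference["inode"], reference["apid"])
```

Renaming or moving the directory within the same filesystem keeps the inode, so the reference stays valid without touching any database.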

Heavily using the filesystem for this database allows us to use the path volume overlay features to combine system-installed applications with applications installed by the local user. Here's a hypothetical layout:

/
  .:<volm> -> 7a34bb4f

 /Applications/Gimp/ (inode=00018d23)
   .:<mime> -> system/application
   .:<apid> -> gimp/1f72a105
   :run
   .mime/
     image/
       :open

 appdb:
   .apid/
     gimp/
       default -> 1f72a105
       1f72a105 -> 7a34bb4f/00018d23
   .mime/
     image/
       :open -> gimp

These links inside appdb: can be set up when the application is run for the first time. The default bindings for certain mime-types should only be set if there are no current bindings. In other situations, the application should either ask or just shut up.
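A minimal Python sketch of that first-run registration, assuming appdb: has already been resolved to a plain writable directory. The function name register_application and its parameters are hypothetical, and leaving an existing "default" id link alone is my own extension of the "only if there are no current bindings" rule.

```python
import os

def _replace_link(link, target):
    # Create or overwrite a symlink that carries a small piece of data.
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(target, link)

def register_application(appdb_dir, name, unique_id, location, mime_majors):
    # .apid/<name>/<unique_id> -> <volume id>/<inode>
    app_ids = os.path.join(appdb_dir, ".apid", name)
    os.makedirs(app_ids, exist_ok=True)
    _replace_link(os.path.join(app_ids, unique_id), location)
    # .apid/<name>/default -> <unique_id>, only if nothing is bound yet.
    default = os.path.join(app_ids, "default")
    if not os.path.lexists(default):
        os.symlink(unique_id, default)
    # .mime/<major>/:open -> <name>, only if there is no current binding.
    for major in mime_majors:
        mime_dir = os.path.join(appdb_dir, ".mime", major)
        os.makedirs(mime_dir, exist_ok=True)
        binding = os.path.join(mime_dir, ":open")
        if not os.path.lexists(binding):
            os.symlink(name, binding)
```

Calling register_application(appdb, "gimp", "1f72a105", "7a34bb4f/00018d23", ["image"]) would produce the appdb: layout shown above; a second image application registering later would not steal the image binding.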

Conclusion

By taking a couple of ideas from the seventies and eighties and combining them with an Operating System from the sixties, we may be able to create an environment that offers users tremendous benefits while staying true and compatible to the tremendous environmental legacy of keeping Unix around. By making mimetypes an intrinsic part of the low level system, even applications that have nothing to do with the desktop benefit. Applications like webservers, just like desktop file managers, have to do a lot of second guessing or need our pampering to get the information right. If the power of data-streams with mime-types is extended to include the Unix pipe paradigm, even more interesting things can happen even on the commandline. Getting these ideas rolling would be a good start, though. The reference code is nice to get an idea about how things could look from an application's perspective, but the infrastructure needs to be there in a lower layer, either the standard C library or the kernel.

The example code

The example code is built with a small C++ library called grace, which actually covers most of the infrastructure for handling mime and path volumes. You can get it by downloading its sources from http://lab.madscience.nl/grace-0.6.tar.gz and building src/libgrace. There is currently no automatic install (go figure), so copy grace/lib/libgrace.a and grace/include/grace/*.h to /usr/local/lib and /usr/local/include/grace respectively in order to get the other code to build. That code can be downloaded from http://lab.madscience.nl/gracesh-0.1.tar.gz. It is an example application that offers a shell commandline. The only commands you have are "ls", "cd", "assign" and "exit", but that shouldn't stop you from having fun. To get it to display mimetypes correctly (including fallback to file extensions if a file has no <mime> attribute), set the environment variable PATH_ETC to something like /etc:/usr/etc:/home/you/.etc and copy the mime.db file to the .etc directory in your homedir.

Note that this version omits the dot in front of attribute entries. So to set the mime-type of file foo to text/plain, do a ln -s text/plain "foo:<mime>" and you are set.

Designing object extensions for the Unix Operating Systems | 118 comments (85 topical, 33 editorial, 0 hidden)
interoperability (5.00 / 3) (#17)
by speek on Sun Feb 23, 2003 at 03:30:13 PM EST

If you send a file, how do you make sure the mimetype goes with it? You need universal application support for encoding and decoding mimetypes with sent files, which doesn't currently exist. Thus, we're always falling back on file extensions, which seem to generally work ok. I certainly prefer NOT to hide them from view.

--
al queda is kicking themsleves for not knowing about the levees

Not a problem with email. (5.00 / 1) (#20)
by i on Sun Feb 23, 2003 at 04:05:23 PM EST

Not a problem with http either. For everything else, send an archive (tarball/whatever Windows format is in fashion today) which includes metadata in an agreed-upon format. Compliant systems will know how to import such archives; non-compliant ones will just ignore the metadata.

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
non-compliant apps are the problem (none / 0) (#23)
by speek on Sun Feb 23, 2003 at 05:05:46 PM EST

I remember it being a problem for Macs downloading files to get the file association right a few years back. I bet that's fixed now that they recognize file extensions.

--
al queda is kicking themsleves for not knowing about the levees
[ Parent ]

Ah, you mean (none / 0) (#26)
by i on Sun Feb 23, 2003 at 05:22:23 PM EST

importing from a non-compliant app. Yes, one has to cope somehow, but fallback to extension is the last resort. Better try and guess the mime type from content (magic numbers etc).

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
OO Unix? (4.50 / 2) (#18)
by porkchop_d_clown on Sun Feb 23, 2003 at 03:31:28 PM EST

Have you looked at the changes made in NextStep/OS X? Not quite object oriented, but not CPM either.

Alternately, perhaps you want to look at Plan 9?

(The problem with Plan 9 is that Lucent has a restrictive license on it).


--
You can lead a horse to water, but you can't make him go off the high dive.


Lexical File Names in Plan 9 (none / 0) (#107)
by srichman on Mon Feb 24, 2003 at 09:48:01 PM EST

Agreed. The Plan 9 folks have spent a lot of time thinking about what's braindead in the unix filesystem (and the rest of the OS) and how to fix it. A good starting point would be Rob Pike's paper "Lexical File Names in Plan 9, or, Getting Dot-Dot Right".

[ Parent ]
Why? (4.00 / 1) (#19)
by i on Sun Feb 23, 2003 at 03:50:21 PM EST

This is a radical departure from the tradition. If you feel it's necessary, why not  do things properly from the very beginning (i.e. clone BeOS :) — arbitrary metadata within the file, mandatory mime-type attribute, and SQL-ish search interface to the filesystem?

and we have a contradicton according to our assumptions and the factor theorem

The capability is already in the linux kernel (none / 0) (#25)
by nusuth on Sun Feb 23, 2003 at 05:20:24 PM EST

Reiserfs is designed from ground up to have very good small file performance. The idea is making the filesystem some form of database where filenames are keys and their content is values. Obviously internal fragmentation and seek times make this very impractical on other file systems, not on reiserfs.

With reiserfs 4 (destined for 2.6 IIRC) all elements of the filesystem-as-database concept will be complete. Not only is small file performance further improved, but the file system will also support plugins, in userspace too (if I got that right. I don't know all that much about this stuff.) Plugins may supply necessary bindings for mime types, SQLish access interfaces or ACLs etc.

Of course linux may continue reiserfs support and people may continue to use it as a traditional filesystem. The fact that someone wrote infrastructure doesn't mean it will be utilized. But I think a brave linux distro will do, and all will have to follow when that happens.

Actually I was planning to make an editorial comment that without a reiserfs comparison, this article is incomplete. We will just have to discuss that topically now.

[ Parent ]

Good. (none / 0) (#28)
by i on Sun Feb 23, 2003 at 05:30:50 PM EST

I thought Reiser has these capabilities but wasn't quite sure. I like the fact it has. Somebody really has to design a little brave distro around it. Or just a packager perhaps.

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
That is somewhat debatable (none / 0) (#30)
by nusuth on Sun Feb 23, 2003 at 05:57:25 PM EST

I thought Reiser has these capabilities but wasn't quite sure. I like the fact it has.

It doesn't really have any additional semantics to bind metadata and file. That has to be imposed on the filesystem, possibly by using plugins.

Somebody really has to design a little brave distro around it. Or just a packager perhaps.

I don't see why one wouldn't. Modifying a shell, a file browser and the default file dialog, and writing a metadata server and live update plugin, will get most of the beloved BeOS's capabilities without too much work or breaking compatibility.

AFAICT what Reiser guys are planning is a bit more radical (like having etc/hosts instead of etc/hosts and every URL->IP matching is in a separate file in /etc/hosts) but the implementation does not demand it.

[ Parent ]

Doing the radical thing (none / 0) (#44)
by xL on Mon Feb 24, 2003 at 03:30:09 AM EST

Has been done before. Problem is, most of the time people measure an OS either by how well it runs Photoshop or how well it runs emacs (ok I'm stretching it but you get the point). Radical systems require extensive porting. I think it's better to evolve an existing system slowly, so that emacs will keep working :).

[ Parent ]
Please let UNIX die with dignity. (1.00 / 4) (#21)
by Mr Hogan on Sun Feb 23, 2003 at 04:33:03 PM EST


--
Life is food and rape, then tilt.

Too late. Linux is already here. [nt] (none / 0) (#105)
by porkchop_d_clown on Mon Feb 24, 2003 at 03:25:16 PM EST


--
You can lead a horse to water, but you can't make him go off the high dive.


[ Parent ]
Configuration in OS X (3.50 / 2) (#24)
by David McCabe on Sun Feb 23, 2003 at 05:17:52 PM EST

Mac OS X does not store configurations in the application bundle. It stores them in ~/Library/Preferences/, in a plist whose file name is the UUID of the application.

Plist is a NeXTSTEP file format that looks like this (IIRC):

{
  org.dmccabe.myapp.someNumber = 45
  org.dmccabe.myapp.someString = "Hello"
  org.dmccabe.myapp.anArray = [some,syntax,for,arrays]
}

But anyway, it's a list of names in a non-flat namespace with typed values. Every application has a name that is supposed to start with the owner's domain name, just like with Java.

Doesn't use Plist anymore (none / 0) (#34)
by fluffy grue on Mon Feb 24, 2003 at 12:01:25 AM EST

Now it uses an XML variant of Plist, which is rather ugly. One of those things which is XML in syntax but not in spirit.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

MacOS 9 didn't either (none / 0) (#45)
by xL on Mon Feb 24, 2003 at 03:32:42 AM EST

Preferences are better outside the application anyway, did I assert they shouldn't? It's other resources (toolbar icons, translations, all the stuff we're currently cramming in /usr/share and private libraries) that could find their place there.

[ Parent ]
That I agree with you on (none / 0) (#60)
by fluffy grue on Mon Feb 24, 2003 at 04:37:51 AM EST

Application bundles are great.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

inodes (5.00 / 3) (#29)
by swr on Sun Feb 23, 2003 at 05:50:00 PM EST

Luckily, Unix has one way of finding a filesystem object back without knowing its absolute path: The inode. With applications defined as directories, a reference to the inode of this directory is sufficient for the application to get to all its resources, regardless of where on the filesystem it is, as long as it still exists.

Ugh... Please, no. Restoring from backups, copying files from another system, etc. won't work if you use inode numbers.

Inode numbers are low-level filesystem information. Useful to the filesystem, not to much else.

If files were objects, the File class would have had its inode field declared private.



A matter of fixing it in the right place (3.00 / 2) (#43)
by xL on Mon Feb 24, 2003 at 03:26:02 AM EST

Psst! Directory entries already point to inodes! The trick is, you don't see this. You just open "foo" and it opens. You restore "foo" from backup and it still opens. Did you notice the inode changed?

The way to get this right would be to either have the app-to-inode mappings update when you install an application file/directory, or include path references alongside the inode-reference to sidestep the backup problem.

The inodes, I agree, should not leak to userland.

[ Parent ]

I have three words for you... (5.00 / 2) (#74)
by fluffy grue on Mon Feb 24, 2003 at 05:13:52 AM EST

"Rebuilding the desktop..."
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Yah (5.00 / 1) (#81)
by xL on Mon Feb 24, 2003 at 05:27:59 AM EST

I've been bursting my branes on that one, there is no way to prevent this possibility 100%. This suggests the only way to pull this off would be a "unique object id" mapping in the filesystem layer, which would kill interoperability (although fallback to absolute path references for filesystems that do not support such a feature would still be possible).

It's nicer to have something that could still work with NFSv3, though.

[ Parent ]

UNIX doesn't encode types into filename (4.00 / 2) (#35)
by fluffy grue on Mon Feb 24, 2003 at 12:03:03 AM EST

UNIX uses magic numbers and hashbangs to determine the filetype.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators


No it doesn't (4.50 / 2) (#42)
by xL on Mon Feb 24, 2003 at 03:20:50 AM EST

Only file(1) does. The rest of your applications behave like nice little Windows apps. Including your desktop. Magic is not really the answer anyway, it is not guaranteed to render correct results (not every file format can easily be distinguished) and it requires relatively intensive analysis of the file, which doesn't really work well if you've got a whole directory full of them.

[ Parent ]
Um, most programs use magic numbers (4.50 / 2) (#46)
by fluffy grue on Mon Feb 24, 2003 at 03:36:13 AM EST

The only notable exception I can think of is Apache, which still uses extensions for some braindead reason. Most applications I've seen don't even look at the filename except for printing out the error message to say "I can't parse this."

Also, most modern file formats are quite easily distinguishable within the first 4-8 bytes.

Also, who said anything about a desktop? I don't even run a DE (what with it being totally optional in nice, modular OSes). But the DE could easily just call file in order to determine what it is. If Gnome or KDE use the extension to form associations, that's a problem with Gnome or KDE, not with UNIX.


--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Care to name one (4.00 / 2) (#47)
by xL on Mon Feb 24, 2003 at 03:49:22 AM EST

that is not specifically an app to handle lots of datatypes (like, say ImageMagick)? It strikes me that the magicfile only comes with file(1). No library, no headers. Must not be really popular then?

Yes, you don't have to scan the whole file to figure out what it is, but magic is not foolproof and you still have to effectively scan all files, which may be miles away from your directory on the disk, which will not really help performance if you want to get the type for a whole dir full of them. Slower than just reading stuff from in the directory, I am sure.

Also, magic is central. New apps with new datatypes would need to register somehow. Implicit types don't need registration (and that goes for file suffixes as well as mime-tags).

[ Parent ]

Ah I get your point (4.00 / 1) (#48)
by xL on Mon Feb 24, 2003 at 03:52:06 AM EST

What you meant was, most programs are completely type agnostic. Yes, that's Unix. Nothing to do with the magicfile, though, which not many programs really use. Programs that expect input of a certain type just read what you feed them and complain if it's not what they want. That's hardly good interfacing, even from a text shell perspective.

[ Parent ]
well, (none / 0) (#49)
by pb on Mon Feb 24, 2003 at 04:00:46 AM EST

That's about the best anyone can do without knowing the type of the file beforehand.  Which is also why people use file extensions.

With your solution, what do you do about mislabeled file types?  Would your app also complain that the file was labeled with the wrong attribute?
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

Mislabelling is the same problem as before (none / 0) (#51)
by xL on Mon Feb 24, 2003 at 04:08:17 AM EST

A file can also have the wrong suffix. Programs can respond the same.

[ Parent ]
So... (none / 0) (#56)
by pb on Mon Feb 24, 2003 at 04:24:56 AM EST

You get all of the problems of file extensions, and none of the simplicity?

I think I see why this wasn't already a Unix 'feature'...
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

Only less likely (none / 0) (#58)
by xL on Mon Feb 24, 2003 at 04:30:39 AM EST

Since the labelling of the file is outside of the user's greasy little hands during normal conditions, the mislabelling problem doesn't surface as often as it can on suffix-based typing. You can also set the permissions on a file wrong, so it is a miracle that file permissions are still a part of Unix even if they have all the disadvantages of an unprotected file system, according to your logic.

[ Parent ]
heh. (none / 0) (#68)
by pb on Mon Feb 24, 2003 at 04:53:50 AM EST

But file permissions are built into the file system; would your system also not allow the user to set the file type, then?

It's sounding less and less like a Unix by the minute...

Incidentally, there have been many attempts to replace, supplant, or supersede good ol' Unix file permissions for various reasons, ranging from AFS to ext2's extended attributes.  And they all have their pluses and minuses, but none of them are quite so flexible and so simple as is the original (especially when coupled with a proper use of groups).

I think that Unix file permissions could use a bit of a rewrite, and I'd be happy to help out with that. Just as soon as I can figure out what would be substantially better, but without a substantial increase in complexity and maintenance.
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

The way I'd extend UNIX perms: (none / 0) (#70)
by fluffy grue on Mon Feb 24, 2003 at 05:04:07 AM EST

Keep the existing bitmask, but add an ACL (in metadata) which overrides it. Not too different than what NT does, actually, except it'd suck less. :)
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

not too different from AFS (none / 0) (#75)
by pb on Mon Feb 24, 2003 at 05:13:56 AM EST

AFS has ACLs at the directory level, and it basically ignores Unix file permissions.

I think ACLs can be quite handy, but you have to be careful with them; it's easy to misuse ACLs as well. I've seen many badly configured Novell Netware networks that have extra and useless ACLs scattered about for everything...

Also, ACLs aren't too different from Unix's groups and permissions.  I'd hope that there would be a simpler solution in between; maybe someone will come up with one someday...

A few of the extra bits used in AFS might be handy as well; AFS provides rlidwka, which is quite a bit more than rwx.  I'm not convinced that they're all handy, but a couple more bits could be nice.  :)
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

User-defined groups? (none / 0) (#79)
by fluffy grue on Mon Feb 24, 2003 at 05:23:40 AM EST

The ACL itself wouldn't have to be part of the file's metadata... it could, say, be a pointer to another file which just describes the ACL. So you could do something like "aclset ~/.myACLs/fojar *.jpg" and then they'd just link the files' ACL metadata to the inode of 'fojar,' so then editing 'fojar' would change the ACL of all the files which reference it.

That's sort of how I set up the ACL mechanism in a crappy MUCK-type game server I never got around to finishing. You could have a script's permissions set to something like: bob !alice .friends !.foes, and then 'friends' is an entry in the ACL groups table which says something like 'larry roger alice' and 'foes' would contain 'richard bill bob', so then that script's ACL would mean, basically, 'let bob run it, but none of the other foes, and let all of my friends run it, except alice.'

It stored the ACLs as properties within the property hierarchy of the object itself. (An object was basically a nested hierarchy of key/value pairs.) So, like, in that case, /ACL/groups/foes = richard bill bob and /ACL/exec = bob !alice .friends !.foes (I also did away with UIDs altogether, and just used the username as the UID).
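To make the semantics concrete, here is a minimal sketch in C of how such an ACL string could be evaluated. It is not the code from that server: the names (acl_groups, acl_allows) are made up for illustration, and it assumes entries are checked left to right, the first matching entry decides, and no match means "deny". That ordering is what makes 'bob' win over his membership in 'foes'.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical group table, mirroring the example groups in the text. */
struct acl_group {
    const char *name;
    const char *members;   /* space-separated user names */
};

static const struct acl_group acl_groups[] = {
    { "friends", "larry roger alice" },
    { "foes",    "richard bill bob"  },
};

/* True if the space-separated list contains the len-byte word. */
static bool list_has(const char *list, const char *word, size_t len)
{
    while (*list != '\0') {
        size_t n = strcspn(list, " ");
        if (n == len && strncmp(list, word, len) == 0)
            return true;
        list += n;
        list += strspn(list, " ");
    }
    return false;
}

/* True if user belongs to the group whose name is the len-byte string name. */
static bool group_has(const char *name, size_t len, const char *user)
{
    size_t i;
    for (i = 0; i < sizeof acl_groups / sizeof acl_groups[0]; i++) {
        if (strlen(acl_groups[i].name) == len &&
            strncmp(acl_groups[i].name, name, len) == 0)
            return list_has(acl_groups[i].members, user, strlen(user));
    }
    return false;   /* unknown group: nobody is a member */
}

/*
 * Evaluate an ACL such as "bob !alice .friends !.foes" for one user.
 * Assumption: entries are checked left to right and the first entry that
 * matches the user decides; with no match at all, access is denied.
 */
bool acl_allows(const char *acl, const char *user)
{
    while (*acl != '\0') {
        size_t n = strcspn(acl, " ");
        const char *tok = acl;
        size_t len = n;
        bool deny = false, match;

        if (len > 0 && *tok == '!') { deny = true; tok++; len--; }
        if (len > 0 && *tok == '.')
            match = group_has(tok + 1, len - 1, user);      /* group entry */
        else
            match = (len > 0 && strlen(user) == len &&
                     strncmp(tok, user, len) == 0);         /* user entry */
        if (match)
            return !deny;
        acl += n;
        acl += strspn(acl, " ");
    }
    return false;
}
```

With the example ACL, acl_allows() lets bob and larry through and turns away alice and richard, which matches the plain-English reading above.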

It was all just a proof of concept though. But I don't think the overhead of that sort of mechanism really matters on today's user desktop systems. (Obviously it'd suck for a huge fileserver which is under a constant maximal load though.)
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Yup, it's the added overhead and complexity. (none / 0) (#85)
by pb on Mon Feb 24, 2003 at 05:48:10 AM EST

The overhead is most likely why AFS only added ACLs to directories.  I think there should be a middle ground with enough functionality and less overhead; there's nothing wrong with adding a few more bits, but once you start adding variable-sized lists, you run into the whole metadata goop problem, and by then you might as well do something like the MacOS resource/data fork...

However, for a fileserver (or a volume) that is under a constant maximal load, you'd probably want to have some mechanism to turn off ACLs when they aren't needed.  (or, in your example, just not set them in the first place)
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

Oh but you'd be allowed (none / 0) (#72)
by xL on Mon Feb 24, 2003 at 05:08:05 AM EST

to change a file's type, too. You're just not setting the filetype implicitly when you save a document, the application does that for you. Just like you set your file's permissions somewhere other than in an application's save dialog.

[ Parent ]
hmm? (none / 0) (#76)
by pb on Mon Feb 24, 2003 at 05:17:14 AM EST

So when I create something and save it with my application, what sets the file permissions?

The application does, right?
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

bash(1) umask(2) [nt] (none / 0) (#78)
by xL on Mon Feb 24, 2003 at 05:22:59 AM EST



[ Parent ]
um, no... (none / 0) (#83)
by pb on Mon Feb 24, 2003 at 05:42:00 AM EST

bash is a shell; umask is a C API call that masks off bits from an open call (for example to make your files not writable for everyone else).

This still wouldn't stop, say, an editor from making scripts executable, or from making important configuration files not writable.

And what's to stop an app from setting the umask to 000, anyhow?

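#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(void) {
    int fd;
    umask(0000);    /* clear the mask: no permission bits get filtered out */
    /* with the mask cleared, foo.test really is created with mode 0777 */
    fd = open("foo.test",O_CREAT|O_WRONLY,0777);
    if (fd < 0)
        return 1;   /* could not create the file */
    write(fd,"test",4);
    close(fd);
    return 0;
}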

Oops, that's right.  Nothing.
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

huh? (none / 0) (#84)
by xL on Mon Feb 24, 2003 at 05:47:06 AM EST

You're missing the point. He asked how you determine with what permissions your applications write their files. You do that with the umask command in bash(1) which calls umask(2). Yes, an individual application can be coded to override this or emit nasal daemons, but my point was that you're not doing it in the save dialog. It's under the default control of the OS, which takes hints from the user or the application.
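For what it's worth, here is a small C sketch of that default at work: the application asks for a mode, and the umask it inherited (or was given) decides what actually lands on disk. The helper name mode_after_umask is made up for this example, and the function sets the mask itself only so the effect can be seen in one place; normally the mask would simply be inherited from the shell.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create path asking for mode requested while the process umask is mask,
 * and return the permission bits the file actually got, or (mode_t)-1 on
 * error. The previous umask is restored before returning.
 */
mode_t mode_after_umask(const char *path, mode_t mask, mode_t requested)
{
    struct stat st;
    mode_t old = umask(mask);          /* umask(2) returns the previous mask */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, requested);
    int ok;

    umask(old);                        /* put the caller's mask back */
    if (fd < 0)
        return (mode_t)-1;
    ok = (fstat(fd, &st) == 0);
    close(fd);
    return ok ? (st.st_mode & 0777) : (mode_t)-1;
}
```

On an ordinary local filesystem, asking for 0666 under a umask of 022 should come back as 0644: the bits in the mask are removed from the requested mode, no save dialog involved.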

[ Parent ]
Right. (none / 0) (#86)
by pb on Mon Feb 24, 2003 at 05:50:16 AM EST

The application does that for you.

By this point, I fail to see what fundamental difference we were bickering about; maybe it'll make more sense in the morning.  :)

Cheers.
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

No, programs which aren't type-agnostic do too (5.00 / 2) (#50)
by fluffy grue on Mon Feb 24, 2003 at 04:04:40 AM EST

Pretty much all image-manipulation programs, both at the commandline and GUI level, use magic numbers for stuff. Like, all of the things in the netpbm and imagemagick collection use magic numbers exclusively. In the case of netpbm it's only to say "Hey, this isn't really a JPEG," but in the case of imagemagick, it actually uses the magic number to decide which code path to take.

mplayer is another example of something which uses magic numbers for format detection.

Galeon also uses magic numbers. I just copied a JPEG file to the file /tmp/asdf.test and it displayed the JPG just fine.

So did GIMP. Granted, GIMP uses the extension for detection of output format, but for input format it uses the magic number.

Same with xv.
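To show what "using the magic number" boils down to, here is a minimal sketch in C. It is not taken from any of the programs above; the function names are invented, and it only knows the leading signature bytes of JPEG, PNG and GIF, where a real magic database covers far more formats.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Guess a MIME type from the first bytes of a file's contents.
 * Anything unrecognized is reported as application/octet-stream.
 */
const char *sniff_mime(const unsigned char *buf, size_t len)
{
    static const unsigned char jpeg[] = { 0xFF, 0xD8, 0xFF };
    static const unsigned char png[]  = { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n' };

    if (len >= sizeof jpeg && memcmp(buf, jpeg, sizeof jpeg) == 0)
        return "image/jpeg";
    if (len >= sizeof png && memcmp(buf, png, sizeof png) == 0)
        return "image/png";
    if (len >= 6 && (memcmp(buf, "GIF87a", 6) == 0 || memcmp(buf, "GIF89a", 6) == 0))
        return "image/gif";
    return "application/octet-stream";
}

/* Read the first few bytes of path and sniff them; NULL if it can't be opened. */
const char *sniff_file(const char *path)
{
    unsigned char buf[8];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return NULL;
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    return sniff_mime(buf, n);
}
```

Note that the file name never enters into it, which is why a JPEG copied to /tmp/asdf.test still comes out as image/jpeg.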

So now that I've given a few applications which use magic numbers, you tell me a few applications (not broken DEs like Gnome or KDE, but actual applications) which use the file extension to determine which file type it is.

And anyway, what does application behavior have to do with the OS in general? You're proposing fixing the supposed reliance on extensions in UNIX, which has nothing to do with UNIX itself, but with the applications written for UNIX.

Well-behaved Windows applications use the MIME type of a file to determine what it is; unfortunately, Windows itself (win9x at least, dunno about XP) only provides a mapping of extension to MIME type.

Also, I find it interesting that you mention MacOS X as an OO OS which UNIX should strive to become. You do realize, of course, that MacOS X is UNIX, right? It's a fancy DE with library-level file typing information. And it's not because of the resource type specifier provided by HFS+, because it works just fine on UFS as well, and it works for files which were downloaded from the web without a resource fork. (Which wasn't the case for classic MacOS, by the way.)
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Impressive (none / 0) (#52)
by xL on Mon Feb 24, 2003 at 04:14:18 AM EST

Didn't gather that so many applications relied on magic. As said, I would have expected a standard library for the magic file to have been around if it were so prevalent.

You are right that this is a problem in the application domain. My gripe is that a little bit of support from the OS core layer (i.e. libc or the kernel) could make it easier for those applications to handle things consistently. Under this scheme, you can get many existing applications to "do the right thing", potentially without even recompiling. Making, say, vi grok the Gnome VFS would be much more trouble.

Ultimately, applications relying on magic are still feasible under the mime scheme; just bind your image application to image/* and let it figure out the rest for itself.

[ Parent ]

That's up to the DE (none / 0) (#57)
by fluffy grue on Mon Feb 24, 2003 at 04:29:07 AM EST

"Hey, there's this file which maps to image/somethingIdunno. Okay, uh, send it to display I guess." Also, Gnome and KDE both provide mechanisms for their respective apps to bind themselves to the file formats they support (with system-wide defaults). It's no more roundabout than the Desktop file in classic MacOS.

"Didn't gather that so many applications relied on magic. As said, I would have expected a standard library for the magic file to have been around if it were so prevalent."
There is. It's called exec(). That's the UNIX way - why use a library which needs lots of linkages and so on when you could just let external commands tell you? Just run file | magic2mime. Then you don't need to recompile all your applications whenever the magic-number "library" gets updated.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

That could work (5.00 / 1) (#61)
by xL on Mon Feb 24, 2003 at 04:39:37 AM EST

Although I still think magic is not elegant for general purposes, more of a great aid when all else fails. Desktop environments currently in use already use meta-data, so its perceived merits should really not be up for debate. The "why do we need all this?" is something you could be discussing with the authors of Gnome and KDE as well. What I am trying to bring up with this article is that, if we agree that applications are moving this way, we are doing it in a clumsy way where it could be done in a way that is more in sync with Unix.

If you hate desktops, there's nothing you really want from meta-data. The path volumes concept is something that _can_ be relevant in the commandline environment, though. Does that idea ignite the same neophobic neurons for you? How about application bundle directories?

[ Parent ]

Why do you call me neophobic? (3.00 / 2) (#67)
by fluffy grue on Mon Feb 24, 2003 at 04:53:18 AM EST

Did I indicate that I'm scared of change? Or are you just trying to insult me?

Anyway. Path volumes have actually existed in various UNIXen in the past. AIX has something just like it, and some shells provide it at the shell level. Which is where it should be.

Also, I think that you are having trouble seeing past the end of your own nose. Do you really think that the things you've rambled about in your article are new thoughts in the UNIX world? Did you even look for existing libraries for, say, regular expressions?

Also, what's "more in sync with UNIX" is to call exec() and use pipes, or write a simple library for parsing magic number databases or whatever. Which is what the various DEs already do. You've already mentioned Gnome-VFS by name so I know you know it exists. So what's wrong with that? Yeah, it's tightly-integrated into Gnome. Gnome wants to keep everything integrated, rather than making 2734987298472389487239 different separate packages which every Gnome app relies on. (As a result, there's just 2734987298472389487239 separate Gnome libraries instead. But that's a different matter, since at least they're all distributed as a whole, unless you run Debian.)

Basically, everything you're trying to solve is already a part of the various other projects which are trying to solve these very same issues. You're bringing nothing new to the table, while simultaneously making erroneous claims about the various user-friendly UNIX projects not thinking about these issues.

Did you even do any basic research at all before you wrote your article? I mean, you said, and I quote, "Didn't gather that so many applications relied on magic." Which indicates to me that you didn't look, you only assumed that UNIX apps use file extensions, which just plain isn't true.

Again, the only UNIX app I know of which uses extensions rather than magic numbers is Apache, and it probably does that more for efficiency reasons than anything.

In fact, do you want me to show you various screenshots of applications parsing a JPEG file with a non-JPEG extension just fine?

BTW, the extension is there for the user's benefit, so they know what to expect.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Didn't mean it as an insult (none / 0) (#71)
by xL on Mon Feb 24, 2003 at 05:04:35 AM EST

You just strike me as someone who likes Unix "just the way it is". I'm a neophobe myself on many levels, about many things. It's an essential survival strategy, best put into words by the saying "if it ain't broken, don't fix it". Perhaps I should have labelled it as "pragmatic".

I have consistently taken no credit for any of the ideas I mentioned in this article; heck, I have even said where I yanked them from: MacOS 9, AmigaOS, MacOS X and MacOS 9 respectively. And that was my entire point: I didn't want to inflame you, but wanted to point out that arguing whether meta-data is stupid or not is really not relevant to the article's intentions.

Your points against meta-data are valid. No need to supply me with photographic evidence for something I already believed you were right about a couple of inches up this tree. If my clumsy way of getting that point across hit the wrong nerve, I apologize.

[ Parent ]

They're not points against meta-data (none / 0) (#73)
by fluffy grue on Mon Feb 24, 2003 at 05:11:00 AM EST

They're points for pointing out major inaccuracies in your article (namely that applications use the file extension to determine filetype).

Anyway. Where I think UNIX really does need some major reworking is in its GUI input layer (not in the GUI itself; I think X is just fine the way it is, what with its flexibility and so on). I've ranted about that plenty elsewhere though (and also in various private emails to various people). But that's not a reworking of anything in UNIX so much as a refactoring of existing subsystems to add new flexibility.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

And just to be inconsistent.. (none / 0) (#77)
by xL on Mon Feb 24, 2003 at 05:20:18 AM EST

..and beat the dead horse just after saying I want it gone: As long as a type is visible when the user sees its representation. When you can display a filename, you can display its type. And, most of the time, people will be using the file extensions anyway, like they did on the Mac. The impact on users would really not be that radical, but they would have more freedom to manipulate things the way they prefer.

Some current practises are already not really compatible with file extensions, by the way. Take backup files (file.txt.bak, or file.txt~) or shared libraries (libfoo.so.0.2.5). Others are really not that well-suited to pure magic-based interpretation. Not all text files are just text files, they may have special meaning to an application, even though they share a common syntax with other file types just like it. Magic works well for multimedia files, as cited in your many examples, but like suffixes or even meta-data, it is not 100% perfect. All that one can do is cover as many bases as possible.
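A tiny C sketch of why those names trip up suffix-based typing, assuming the usual naive rule of "everything after the last dot" (the function name is invented for this example):

```c
#include <stddef.h>
#include <string.h>

/*
 * Naive suffix-based typing: everything after the last '.' in the name.
 * Returns NULL when there is no dot, or only a leading dot as in ".profile".
 */
const char *naive_suffix(const char *filename)
{
    const char *dot = strrchr(filename, '.');

    if (dot == NULL || dot == filename)
        return NULL;
    return dot + 1;
}
```

For file.txt.bak this yields "bak", for file.txt~ it yields "txt~", and for libfoo.so.0.2.5 it yields "5"; none of those says anything useful about what is inside the file.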

[ Parent ]

Right... (none / 0) (#80)
by fluffy grue on Mon Feb 24, 2003 at 05:26:38 AM EST

But if the extension is purely there for the user's benefit, and the applications (including the DE) use metadata or magic or whatever to determine the actual type of the file, then the person could easily save their file as whatever.jpg.bak and it'd work both for them and for the applications.

Which it does, just fine, in UNIX, now.

Also, which fileformats do you actually use which aren't identifiable by magic number? I haven't seen any new file format in years which doesn't have a well-defined header of some sort.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Hmm i'd say scripts (none / 0) (#82)
by xL on Mon Feb 24, 2003 at 05:32:44 AM EST

if it weren't for the hash-bang. Various XML-based formats would probably collide on pure magic alone without more parsing.

I would vote for most of the files in /etc, too. Resolv.conf has a distinct format. It absolutely belongs in the text/* hierarchy, but text/plain is really off in that case as much as it would be for an html file (which is technically also plaintext).

[ Parent ]

XML is recognized as XML (4.00 / 1) (#106)
by fluffy grue on Mon Feb 24, 2003 at 03:48:47 PM EST

And isn't the point to XML that anything which groks XML can grok any XML? ;) Also, there's the DOCTYPE tag, which is there specifically to say which exact form of XML it is. And XML DOCTYPE and MIME type don't get along; XML is supposed to be nothing but text/xml.

text/plain and text/html are totally distinct formats (just because it doesn't use 8-bit characters doesn't mean it's plaintext!), but I wouldn't consider /etc/resolv.conf to be distinct from text/plain. The "proper" solution is to change all config files to XML or similar, though in most cases (like /etc/resolv.conf) that's total overkill.

In any case, you'd directly edit /etc/resolv.conf in a plain text editor, right? If you were to double-click on /etc/resolv.conf, it means you're going to be editing it directly, and probably don't want to go into a high-level system config thing. The high-level system config thing (which is intended to abstract the user away from /etc/resolv.conf and so on) would already know how to handle /etc/resolv.conf. I see no reason for the DE to handle /etc/resolv.conf differently from, say, /etc/apache/httpd.conf or whatever.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Unmagicable files (none / 0) (#100)
by cpt kangarooski on Mon Feb 24, 2003 at 01:47:52 PM EST

It would depend on the specific file format involved, but split files could qualify. This could be remedied by adding headers to the file to indicate that it was a part of a file, and information about the type of file it was, but in that case, you'd want to make very sure that every joining tool knew to strip the new headers out.

Personally, I don't mind magic as a tool of last resort, but I prefer metadata to take priority, since, being outside of the data 'band' as it were, it can carry lots of useful information without conflicting with the actual data.

'Course, I like forked or streamed files for similar reasons.

--
All my posts including this one are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
[ Parent ]

BTW (none / 0) (#69)
by fluffy grue on Mon Feb 24, 2003 at 04:55:54 AM EST

I already said elsewhere that I agreed that application bundles are a good thing. You don't have to assume that just because I disagree with you on one thing means that I disagree with you on everything.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

That doesn't seem to work (none / 0) (#95)
by KnightStalker on Mon Feb 24, 2003 at 12:08:01 PM EST

$ file online.png
online.png: PNG image data, 48 x 48, 8-bit/color RGBA, non-interlaced
$ file online.png | magic2mime
application/octet-stream

This is the default Debian sid install of magic2mime. It had the same problem with nearly all other types -- all it seems to understand are "text/plain", "application/octet-stream" and "message/rfc822".

[ Parent ]

Nevermind (none / 0) (#97)
by KnightStalker on Mon Feb 24, 2003 at 12:12:09 PM EST

file -b online.png | magic2mime
image/png

Read the docs, Luke... :-)

[ Parent ]

Why do you need a creator attribute? (4.33 / 3) (#36)
by GGardner on Mon Feb 24, 2003 at 01:04:53 AM EST

I just uploaded a bunch of digital pictures to my machine, viewed them all with a web browser, modified a couple of them with the GIMP, resized them all with ImageMagick, and e-mailed them out. What should the creator attribute be for these? It has little useful meaning. The unix "file" command understands more magic numbers than you'll ever need, and provides all the functionality you need, without mucking about in the filesystem.

Your article alludes (I think) to a recent Outlook virus where a file extension looked innocent, but really contained a dangerous script. The real problem here isn't that type info was encoded as a filename suffix, but that the type info was in two places, and inconsistent. The only way to remove this inconsistency is to have the type info in one place, and that place is in the file proper, where it can't be lost.

Finally, I agree with your issues with hierarchical filesystems. Hierarchical databases went out of style 30 years ago. Hans Reiser is making noises about some really interesting stuff for ReiserFS version 6, but that's a ways off -- http://www.namesys.com/whitepaper

For GUIs (3.00 / 2) (#53)
by juahonen on Mon Feb 24, 2003 at 04:18:43 AM EST

The single most often mentioned use for the creator attribute I've seen is with GUI applications. If the file stored creator application information, a GUI would then "know" which application it should use to open the file. If you have only GIMP, then your system could seldom use the information. But if you're using many different image manipulation programs, a different one for each purpose, the GUI would "remember" the right one for the right image.

There's also some use for security. Say, if the OS is configured to store the last program modifying the file as "creator" then you'd be able to see if some program has been tampering with the image. Imagine finding "Outlook" as the creator for an image file. That should ring some bells.

"The only way to remove this inconsistency is to have the type info in one place, and that place is in the file proper, where it can't be lost."

I don't think that would be a good idea. You'd have to rewrite all the base Unix systems anew to understand the text/plain in front of the file. And the MIME type would have to be coded the same way for all files, I think, ruining OS compatibility. And since the MIME type is stored in email and HTTP headers, the information would not be lost. Forget the extension and use the MIME type. If it says it's a picture, then it is a picture, and if an image viewer doesn't show it, it's a broken picture. The only program which would not send MIME types would be FTP and perhaps some other programs. I don't remember whether DCC (from IRC) sends MIME types.

But the point is, you cannot make a change that'd propagate to systems that are beyond your control.



[ Parent ]
Not in the file (none / 0) (#62)
by fluffy grue on Mon Feb 24, 2003 at 04:39:49 AM EST

Do it as metadata.
--
"Is a hyperlink" is a hyperlink.
"Is not a quine" is not a quine.

Cats: Nature's entropy generators

[ Parent ]

Please, no creator attribute (none / 0) (#89)
by vrt3 on Mon Feb 24, 2003 at 08:39:00 AM EST

I'm all for some kind of type attribute, but I'd hate a creator attribute. I decide which application I'm going to use based on what I want to do, more than the type of the file. What I want to do varies greatly over the life cycle of a file.

Take an image file for example. To create it or edit it, I would use The Gimp, so that's what would be in the creator attribute. But most times if I click it in a file manager, I simply want to view it. In those cases, loading The Gimp is way too much overhead.

In fact, I'm quite content with the current state of affairs concerning file types.
When a man wants to murder a tiger, it's called sport; when the tiger wants to murder him it's called ferocity. -- George Bernard Shaw
[ Parent ]

A way to satisfy this (3.00 / 1) (#90)
by xL on Mon Feb 24, 2003 at 10:03:24 AM EST

Would be to define distinct edit and view methods for a given type, combined with a desktop preference for either editing or viewing, possibly depending on full or partial mimetypes:

image/* -> view
text/* -> edit
text/html -> view

Not much changes in configurability compared with current schemes. I think Gnome2 already carries the distinction between View and Open methods.
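As a rough illustration of how such a table could be consulted, here is a C sketch. The rule table is just the three example lines above, the names are invented, and it assumes that an exact type beats a whole-family wildcard no matter how the rules are ordered; no existing desktop's implementation is being described here.

```c
#include <stddef.h>
#include <string.h>

struct mime_rule {
    const char *pattern;   /* a full MIME type, or a whole family ending in a star */
    const char *action;    /* "view" or "edit" */
};

/* The three example rules from the list above. */
static const struct mime_rule rules[] = {
    { "image/*",   "view" },
    { "text/*",    "edit" },
    { "text/html", "view" },
};

/*
 * Return the preferred action for a MIME type, or NULL if no rule applies.
 * Assumption: an exact match beats a wildcard rule regardless of the order
 * in which the rules are listed; among wildcards, the first one wins.
 */
const char *preferred_action(const char *mime)
{
    const char *wildcard_hit = NULL;
    size_t i;

    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        const char *p = rules[i].pattern;
        size_t plen = strlen(p);

        if (strcmp(p, mime) == 0)
            return rules[i].action;             /* full MIME type matches */
        if (plen >= 2 && strcmp(p + plen - 2, "/*") == 0 &&
            strncmp(p, mime, plen - 1) == 0 && wildcard_hit == NULL)
            wildcard_hit = rules[i].action;     /* family prefix matches */
    }
    return wildcard_hit;
}
```

So text/plain falls through to the text/* rule and gets "edit", while text/html is caught by its own more specific rule and gets "view".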

[ Parent ]

I need a creator attribute (4.00 / 1) (#99)
by cpt kangarooski on Mon Feb 24, 2003 at 01:42:28 PM EST

My situation is this: I read a lot. Oftentimes, the files that I read are unformatted text or html formatted text. I have no desire whatsoever to alter these files; I want to read them as-is. Furthermore, certain programs, such as web browsers, are much better for reading text since they tend to not attempt to edit the data being displayed. E.g. in a lot of browsers you can mash the space key to page down, as opposed to the behavior in a word processor, where doing so would insert spaces into the text.

However, I also have a lot of other files in the same formats that I need to edit. Those I want to open in an editing program right away.

Without a creator binding of some type, and without changing the filetypes (which would be a lie), and mindful that most editors will open and permit editing of, but not saving to the same file of, a read-only file... how do I manage to double-click the icons of each and have them open in the appropriate programs?

Now, I will agree that it's important to be able to override the default handling temporarily. And that it's important to be able to trivially alter the binding as desired. But I think that both issues can be solved without our having to eliminate this lovely capability altogether.

Certainly I regard it as a massive PITA to have to explicitly open files in Windows via the 'Open With' command as opposed to a simple double click. Both due to the time savings and because I would then virtually never have to think and remember what program was appropriate to open what file.

--
All my posts including this one are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
[ Parent ]

That's certainly a consideration (none / 0) (#102)
by Kal on Mon Feb 24, 2003 at 02:22:04 PM EST

"how do I manage to double-click the icons of each and have them open in the appropriate programs?"

This is certainly a consideration, but I'm not sure how much it applies to unix. Most of the interaction with a unix machine is via the shell, and while it does support a mouse in some manner, I don't think I've seen one that supports double clicking. Personally, I just type vim foo.html or netscape foo.html depending on what I want to do.

[ Parent ]
Unix?! (none / 0) (#108)
by cpt kangarooski on Mon Feb 24, 2003 at 11:16:36 PM EST

Well, although I abhor Unix, avoid having to suffer through using Unix, and doubt that Unix could ever be improved enough to be fit for human consumption, there are GUIs for Unix.

As for shells, why the hell shouldn't I be able to use a mouse in meaningful ways in a shell? Using a mouse to select items in a shell, or trigger actions, or show contextual options etc, sounds like a good idea to me, even aside from the ordinary copy and paste operations one does with text.

Nevertheless, these kinds of features will pretty certainly never appear in a shell. People will continue suffering through tcsh or bash or something. The inability of most Unix users to look beyond the technology of the mid-70's is exactly what dooms it to being a fundamentally worthless OS, IMO.

--
All my posts including this one are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
[ Parent ]

I'm sure that's a perfectly valid opinion (none / 0) (#109)
by Kal on Tue Feb 25, 2003 at 01:33:12 AM EST

I'm sure that opinion is valid, it just happens to be one I intensely disagree with. Is operating a computer via a text interface for everyone? Certainly not, but for those who know how to use it well it's far more powerful than a standard GUI.

"there are GUIs for Unix"

Sure there are. Unfortunately, most of them suck. So far the only ones I've found to be helpful for me are PWM and ION.

"As for shells, why the hell shouldn't I be able to use a mouse in meaningful ways in a shell?"

You already can use it in meaningful ways, you can select text and easily paste that text to a command line. What more do you really need?

"People will continue suffering through tcsh or bash or something."

I'm intensely curious why you consider it suffering? It's probably going to end up just being a personal thing because I detest having to navigate most GUI interfaces, the majority of the unix GUIs especially.

[ Parent ]
-1, Not Really About Unix. (4.50 / 4) (#37)
by pb on Mon Feb 24, 2003 at 02:12:18 AM EST

Your article frequently mentions MacOS X when comparing to "Unix"; I conclude that your beef with "Unix" is more of an issue with the state of current Unix distributions and actually has nothing to do with "Unix" (the loose standard for an OS Kernel API) itself.

And as to your comment about filetypes, well, we encode the type in the filename because it's convenient for us humans.  But most Unix distributions also include a command called 'file' that's designed to more precisely determine the type of a file.

I'd also like to mention something about most of the rest of your suggestions: they all sound quite interesting from an academic perspective, but they didn't work for MacOS at all.  The type/creator attributes were a huge pain in the ass--nothing but a great reason to learn to use ResEdit or Norton Utilities for the Mac.

As for MIME Types, people generally build this sort of support into GUI file browser apps like gfm and kfm and the like, and not into the inodes. Why?  Well, maybe the sort of people who would mess with inodes are the same sort of people who remember the DWIM debacle and would much rather explicitly specify their applications themselves. Maybe these people realize that "GIMP" is not always the answer, especially if you're not running X at the moment--and "convert" (from ImageMagick) isn't always the answer when you are running X!

In any case, just because I don't want or need such added bloaty cruftiness in a Unix, it doesn't mean that other people don't either.  So I wish you luck with your GUI-centric pet peeves, but I was really hoping for a proposal about, say, improving the shell interface for pipes or something... something more on the level of a Unix enhancement than a web browser enhancement.
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall

Pure unix (4.33 / 3) (#40)
by xL on Mon Feb 24, 2003 at 03:14:36 AM EST

You're making a lot of points in small space. Let me try to address them:

My beef is indeed not with 'Unix' as in the kernel/APIs. I would not have gone out of my way to find the answer within the existing constraints of Unix if I thought there was anything wrong with that. My beef is with the way people have taken all the legacy cruft for granted, while the only real beauty of ideas is in the kernel.

Filetypes in the suffix are convenient to some humans; for most they are no more than a reference point. Contrary to what you say, they really _are_ the only way used by today's Unix applications to determine those filetypes. The magic database used by file is not used by Gnome, KDE or your web browser. They only look at the extension. You think this arrangement is convenient. Lots of people do; they are used to it and it's not unworkable, just inelegant. I know there are many people who don't really feel much for it.

As I stated in the article, the MacOS had problems with its filetyping largely when exchanging files with the outside world. The internet killed the scheme. With MIME, this disadvantage is not there.

Using the "duality of vision" example of Outlook actually works in favor of abolishing file extensions. You see, webbrowsers and email clients generally already give MIME-information the upper hand in determining what to do with content. I don't see how introducing mime-types would create problems that file extensions don't have. Now, too, I reckon you are not opening your .GIF image in gimp on the console. That has absolutely nothing to do with typing but all with the decision of the shell (be it an X11 filemanager or a text console) what to do with it.

What meta-information can offer is in fact more freedom. The current suffix -> filetype -> application scheme only leaves room for one default alternative. Meta-information (specifically creator tags) allows multiple programs handling the same type information to co-exist, and per-file preference over the default application to use for any action on a file.

The "bloaty cruft" as you describe is already there. It is called GNOME. If you want a Desktop, you add complexity. The issue at hand here is not whether we should add complexity at all (obviously we do since a large number of people want to use the desktop), but rather how can we keep this complexity to a minimum. The environment-specific databases that current desktops use are way more complex than this.

[ Parent ]

On MIME Types, magic, and reinventing the wheel. (none / 0) (#54)
by pb on Mon Feb 24, 2003 at 04:21:42 AM EST

Actually, many apps including GNOME and KDE can or do use magic numbers to determine MIME types.
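For anyone unfamiliar with the idea, here is a minimal Python sketch of magic-number sniffing. It is only an illustration: the table and the function name `sniff_mime` are made up for this example, and the real magic databases used by file and the desktops are far larger and support offsets and masks.

```python
# Hypothetical, deliberately tiny magic table (assumption for illustration).
# Each entry pairs the leading bytes of a format with its MIME type.
MAGIC_TABLE = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]


def sniff_mime(data: bytes, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the first bytes of a file, ignoring its name."""
    for magic, mime in MAGIC_TABLE:
        if data.startswith(magic):
            return mime
    return default
```

Note that the filename never enters into it, so a GIF saved as "holiday.txt" is still reported as image/gif.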

And you're right, at the console, I have no reason to try to "execute" a file without an application. Although perhaps a "chooser" app would be handy if I didn't know what apps were installed on the system. (although there are other apps that can tell you what's on the system...)

Fortunately I don't have KDE or GNOME installed on my system at all.  I don't want it, I don't compile it, and I'd hate to see that cruft moved out of the applications and into the kernel (although I suppose that in that case, I'd just turn off the kernel options for bizarre MIME stuff).

However, I see nothing wrong with writing a library for adding support for this sort of thing; in fact, I'm sure that if you looked, you'd have a few to choose from.
---
"See what the drooling, ravening, flesh-eating hordes^W^W^W^WKuro5hin.org readers have to say."
-- pwhysall
[ Parent ]

Users. (5.00 / 5) (#55)
by baldnik on Mon Feb 24, 2003 at 04:23:52 AM EST

Do we actively hate our users (...) ?

Yup.

User specific configuration (none / 0) (#88)
by squigly on Mon Feb 24, 2003 at 07:41:08 AM EST

I like the ideas here, especially application bundles, but how does this work with user specific data?  

Presumably we will occasionally want to delete users, and their local config files.  This implies putting user specific config data in the user's home dir.  This seems to be the wrong place.  We still have an absolute path albeit rooted in a relative path.

For the love of all that is good and holy... (5.00 / 6) (#91)
by Talez on Mon Feb 24, 2003 at 10:04:22 AM EST

Why the hell don't we just invent a metadata "wrapping paper" for all file objects as a stopgap and start designing REAL FILESYSTEMS.

See BeFS as a point of reference.

Wanna know how flexible that filesystem was? The email client was a bunch of folder windows with the metadata attributes all enabled. Reply was on a context sensitive menu that brought up the Compose Mail window.

It was heaven on a hard disk.

I miss that filesystem. If it were a woman she would have long legs, perfect breasts, long silky hair, bright blue eyes and perfect white teeth.

I would make love to her constantly.

Si in Googlis non est, ergo non est

BFS did most of this (none / 0) (#96)
by jonr on Mon Feb 24, 2003 at 12:08:04 PM EST

I miss BFS, it simply worked.

[ Parent ]
"If it were a woman ... (none / 0) (#101)
by Bill Melater on Mon Feb 24, 2003 at 02:16:11 PM EST

... she would have long legs, perfect breasts, long silky hair, bright blue eyes and perfect white teeth."

Be sure to back her up regularly.

[ Parent ]

Preaching to the Reprobate (5.00 / 3) (#93)
by avdi on Mon Feb 24, 2003 at 11:52:26 AM EST

Excellent article.  You're going to get torn to bits for it.  I love my Linux box, even as I deplore these very same brain-dead attributes that you point out.  Unfortunately, I've noticed that whenever an article recommending progressive changes to the basic UNIX formula is posted to a public forum, the author catches all kinds of flak because "it works for me, so why change it?"  Never underestimate the power of UNIX to turn a bright-eyed, bleeding-edge, questing young geek into an overnight neo-luddite proclaiming "K&R said it, I believe it, and that settles it!".  Ironically enough, by all reports Kernighan and Ritchie and the other fathers of UNIX never fell into this curious state of stubborn nostalgia, preferring to move on to research major improvements on the UNIX model.  Their devotees, unfortunately, did not follow.

Nevertheless, carry on, and don't let the codgers get you down.  One day sense will prevail, and people will admit that UNIX actually leaves room for certain improvements.  And in the meanwhile, let me recommend this article by Hans Reiser, of ReiserFS fame: http://www.namesys.com/whitepaper.html.  He's got a lot of good ideas about how filesystems ought to be.

--
Now leave us, and take your fish with you. - Faramir

Sorry for not spotting this earlier (none / 0) (#94)
by Ubiq on Mon Feb 24, 2003 at 11:58:34 AM EST

...traditional death traps like logging...

Perhaps you meant locking?



Yes that should've been "locking" (none / 0) (#98)
by xL on Mon Feb 24, 2003 at 12:36:16 PM EST

Not only that, but I saw it before and then forgot to make a note. I suck.

[ Parent ]
Locking is a necessary evil, along with versioning (5.00 / 1) (#111)
by gainax on Tue Feb 25, 2003 at 03:58:59 AM EST

Locking and versioning are necessary evils in this (multi-user/multi-processing!) world.  They are a royal pain in the ass, but they will have to be integrated into applications at some point.  So what do you do when you've got 6 people wanting to edit the same document?

[ Parent ]
Locking is a necessary evil (none / 0) (#118)
by xL on Fri Feb 28, 2003 at 02:01:56 AM EST

So the trick is to make sure you only use it when necessary. Maildir is a system that sidesteps locking for mailboxes by using the filesystem's own atomic features. Under this system, multiple clients can have write access to the same mailbox without problems.
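To make the Maildir trick concrete, here is a rough Python sketch of the delivery step. The function names and the exact unique-name format are my own assumptions for illustration; the essential part is writing the message under tmp/ and then moving it into new/ with a single rename, which the filesystem performs atomically within one filesystem.

```python
import itertools
import os
import socket
import time

_counter = itertools.count()  # distinguishes deliveries within one process


def make_maildir(root: str) -> None:
    """Create the three standard Maildir subdirectories."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)


def deliver(root: str, message: bytes) -> str:
    """Deliver one message without taking any lock; returns its final path."""
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    # Assumed name format: time, pid and a counter make collisions unlikely.
    unique = f"{int(time.time())}.P{os.getpid()}Q{next(_counter)}.{host}"
    tmp_path = os.path.join(root, "tmp", unique)
    new_path = os.path.join(root, "new", unique)
    # "x" mode fails if the name already exists, so nothing is overwritten.
    with open(tmp_path, "xb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())
    # Readers only scan new/, so they see either the whole message or nothing.
    os.rename(tmp_path, new_path)
    return new_path
```

Two writers never touch the same file, which is why no lock is needed.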

If 6 people need to work on the same file at the same time, there are also ways to see if excessive locking can be sidestepped. Chances are that the document has so many editors for one of the following reasons:

  • They're all working on different parts
  • They're all adding new data
Both these scenarios can be designed without locking by splitting the file. In the first, split it into regions (if it's a C file, create multiple files for different functions inside, if it's an HTML file it could be split by paragraph). The second scenario can be satisfied by using new files instead of appending to the existing file.

What remains is full random access. No way to sidestep locking here.

[ Parent ]

Creator codes are evil (4.66 / 6) (#103)
by gidds on Mon Feb 24, 2003 at 02:42:38 PM EST

I quite agree with your ideas about typing; encoding types within filenames is a hack (albeit a useful one). Apple's scheme of type codes was a basically good idea. They need to be more accessible and malleable, though.

But the idea of a creator code is fundamentally flawed IMO. It encodes something into the file that's independent of it; it assumes (and enforces) the idea that only one application should be able to open each file. This is clearly wrong; it worked well when each app had its own private file type(s), but these days there are hundreds of applications that each work on any of the basic types: images (GIFs, JPEGs, PNGs, &c), plain text, marked-up text (HTML &c), sound (WAVs, AIFFs, MP3s &c), video (MPEGs &c), archives (ZIPs &c), and many more. When I launch an image file, for example, I want it to use my lightweight image viewer. I neither know nor care whether it was created in PhotoLine, saved from my browser, generated from an app of my own, or whatever; regardless of how it was created, I want it to open in the same app. The same applies to most of my files. Mac OS 9 drove me wild in this respect, and I spent much time and effort stripping out creator codes from files after creating them.

Yes, there needs to be a way to work out how to open a given file, but encoding this in the file itself is the wrong way. Instead, the system should know what to use to open a file; it should be able to map each type to an application. Mac OS X gets this pretty much right IMO, keeping track of a `preferred' app for each type, but also letting you select from the range of available apps when you want to. Other OSs do something similar, whether you have to create the list of preferred apps manually, or implicitly by installing apps in a particular order or location. But in all cases, this is something that should be controlled by the user, not the file itself.
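As a rough sketch of what I mean by the system (not the file) holding the mapping, here is a small Python example. The class name `HandlerRegistry` and its methods are invented for this illustration and do not describe how Mac OS X or any other OS actually stores its preferences.

```python
class HandlerRegistry:
    """Maps each MIME type to the apps that can open it, plus a user choice."""

    def __init__(self):
        self._apps = {}       # mime type -> list of app names, in install order
        self._preferred = {}  # mime type -> app name chosen by the user

    def register(self, mime, app):
        """Record that an installed app can handle a type."""
        apps = self._apps.setdefault(mime, [])
        if app not in apps:
            apps.append(app)

    def set_preferred(self, mime, app):
        """Let the user pick the default app for a type."""
        if app not in self._apps.get(mime, []):
            raise ValueError(f"{app} is not registered for {mime}")
        self._preferred[mime] = app

    def candidates(self, mime):
        """All apps the user may choose from for this type."""
        return list(self._apps.get(mime, []))

    def resolve(self, mime):
        """The user's preferred app, else the first registered one, else None."""
        if mime in self._preferred:
            return self._preferred[mime]
        apps = self._apps.get(mime, [])
        return apps[0] if apps else None
```

The file carries only its type; which app opens it is decided by `resolve`, and the user can still pick anything from `candidates` for a one-off.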

In short: type codes good, creator codes bad. Apps should know about files, but files should not know about apps.

Andy/

Amen (none / 0) (#112)
by Josh A on Tue Feb 25, 2003 at 08:21:29 AM EST

Mac OS 9 drove me wild in this respect

Yeah it did. I've gotten into the habit of almost never double-clicking files, because the creator code rarely had anything to do with my intentions.

OS X goes a little ways with this, but I don't even like having "preferred" apps. All right, an exception: I prefer to open PDFs in Acrobat rather than Preview. But, a gif or a jpeg? Depends on whether I just want to view it, or whether I want to edit it, etc. And an .html file? Do I want to open it in GoLive or BBEdit at this moment?

Thank $deity I can drag files to application icons, both in the dock and in MaxMenus. With OS 9, I had to use GoMac.

---
Thank God for Canada, if only because they annoy the Republicans so much. – Blarney


[ Parent ]
The filesystem itself is the problem (5.00 / 2) (#104)
by riptalon on Mon Feb 24, 2003 at 02:55:02 PM EST

Reiser4 promises to do most of the above and much more. While Reiser4 implements plugins that should allow the filesystem to behave in almost any way you want, just the simple change of allowing files to also be directories (as Multics did) goes a long way to improving things. Any number of attributes can then be implemented as files "within" the file. Problems such as these need to be fixed at the level of the filesystem, by creating a flexible filesystem that does not place artificial constraints on the user. Only then can you attempt to build a better environment on top of it. Hacking around trying to build better structures on top of the weak base of present POSIX filesystems is never going to be the answer.



spreading applications over the filesystem (4.00 / 1) (#114)
by mka on Tue Feb 25, 2003 at 03:20:24 PM EST

makes sense for several reasons:
  • Simple $PATH. To enable invoking your application from the command line you do not have to maintain a list of all the application directories. Now I have to work in an OS which stores the applications in their own folders somewhere in C:\Program Files and I really miss the possibility to invoke all the programs from the command line.
  • Some application data are read-only and platform-independent and may be shared in a heterogeneous network, whereas sharing ELF32/ix86 binaries with e.g. MacOS X makes no sense. This makes networked installations much simpler (imagine a networked installation of an application contained in its own directory when the application needs to access some read-write data at runtime).
  • Cooperation among applications. It is nice to have a single /etc/profile shared among all Bourne-family shells. I agree that most applications have their own configuration files and don't care about the others, but for creating integrated environments consisting of replaceable components it is necessary to have a common place for storing all the configuration.


As per your first point... (none / 0) (#115)
by skyknight on Tue Feb 25, 2003 at 10:35:58 PM EST

Make a single directory into which you put links to your programs, put that directory in your path, and voila, you can invoke all of your favorite programs from the command line. It's a little crufty, but it certainly works.
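Something along these lines would do it, sketched in Python. The helper name `link_into_bin` and the directory layout are just assumptions for the example; a couple of `ln -s` commands in a shell achieve the same thing.

```python
import os


def link_into_bin(bin_dir, program_path, name=None):
    """Create (or refresh) a symlink in bin_dir pointing at program_path."""
    os.makedirs(bin_dir, exist_ok=True)
    link = os.path.join(bin_dir, name or os.path.basename(program_path))
    # Only replace an existing symlink; if a regular file is in the way,
    # os.symlink below raises FileExistsError instead of clobbering it.
    if os.path.islink(link):
        os.remove(link)
    os.symlink(os.path.abspath(program_path), link)
    return link
```

With that one directory on your $PATH, each application can live in its own folder and still be invoked by name.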

It's not much fun at the top. I envy the common people, their hearty meals and Bruce Springsteen and voting. --SIGNOR SPAGHETTI
[ Parent ]
Alternatives from Debian... (none / 0) (#117)
by Alhazred on Thu Feb 27, 2003 at 11:06:43 AM EST

Debian Linux's "alternatives" package is designed to manage just such a system. It is also widely deployed in Red Hat based systems like Mandrake and I think also Suse.

Alternatives provides a fairly simple management tool that allows you to provide various applications under a generic name. This allows you to both place applications anywhere in the file system and let all users get at them, and to provide "generic functionality", thus /usr/bin/editor can be invoked on different systems and SOMETHING equivalent will always run, it may be vi on one box and pico on another, but at least your script or application will get reasonable behaviour, and an RPM or PKG can install a replacement set of functionality without breaking anything (thus the vi package can overload /usr/bin/editor to invoke vi instead of pico).

This is all accomplished with symlinks. To be honest I'm not sure what the original poster's phobia about symlinks is. Why not simply use an alternatives-like system to bind each mime-type/action to a handler application using symlinks? This gives you the best of both worlds, you can place applications and their configuration wherever the network and system best need them, yet point to things in a way that won't break even when you move them around.
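A minimal Python sketch of what such a binding could look like follows. To be clear, this is not how Debian's alternatives tool is implemented; the directory layout (root/type/subtype/action) and the function names are assumptions I made up for the illustration.

```python
import os


def bind_handler(root, mime, action, handler_path):
    """Point root/<type>/<subtype>/<action> at handler_path via a symlink."""
    link_dir = os.path.join(root, *mime.split("/"))
    os.makedirs(link_dir, exist_ok=True)
    link = os.path.join(link_dir, action)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(handler_path, tmp)
    # Renaming over the old link swaps the binding in one atomic step,
    # so a reader never finds the action unbound halfway through an update.
    os.replace(tmp, link)
    return link


def lookup_handler(root, mime, action):
    """Return the handler bound to (mime, action), or None if unbound."""
    link = os.path.join(root, *mime.split("/"), action)
    return os.readlink(link) if os.path.islink(link) else None
```

For example, `bind_handler(root, "image/png", "open", "/usr/bin/viewer")` followed later by a second call with a different path re-points the "open" action without touching any application or file.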
That is not dead which may eternal lie And with strange aeons death itself may die.
[ Parent ]

The Unix Philosophy (4.00 / 1) (#116)
by obvious on Wed Feb 26, 2003 at 03:56:19 PM EST

These are some interesting ideas, but I found the introductory paragraph a little misleading. The article really was about applying the Mac OS philosophy to Unix. It seemed that it would be about applying the Unix philosophy to Gnome and KDE, which is what I would really like to see.

I would like to see a desktop environment built on the Unix philosophy of small tools that can be used together to solve a problem, and I'd like to see a file manager that really embraces the Unix file system. I'm not sure exactly how this could be accomplished, but there must be some way---some holy grail of interface design out there that we're missing. Gnome and KDE are too busy trying to be familiar to Windows users to find it.

Right now I prefer using a simple window manager, rather than a desktop environment. I've never found a GUI file manager I like better than the command line, and desktop environments don't really offer that much more, and they generally come at the cost of flexibility.

That's not to say it has to be this way. I think there is some way to create a GUI that doesn't take away the inherent flexibility of a command line. I hope I or someone else finds it sooner or later.

Designing object extensions for the Unix Operating Systems | 118 comments (85 topical, 33 editorial, 0 hidden)