On Mon, Mar 07, 2005 at 04:13:50PM +0100, flaig@xxxxxxxxxxxxxxx wrote:
> > > So far, all approaches to data storage and organisation have finally
> > > resorted to the age-old file model, simply because it works well, and
> > > especially to the unixish "everything's a file" approach, simply because
> > > it works best -- so far. Look at the Mac, they used to have this pretty
> > > nifty system with two different "forks" for data and code in each file
> > > and internal coding for icons and all that stuff; where is it now in
> > > OS X? Gone. Plain monolithic files, their types determined by extensions,
> > > seem to do the job best. Even the internet is file-based to such an
> > > extent that web browsers and file browsers are mostly identical (look at
> > > Konqueror).
> >
> > If I were Apple, I would have done away with the resource fork model
> > because it's a redundant primitive. It's nothing more than another type
> > of container, but the filesystem already has containers: directories.
>
> Hrmpf... yes, of course, but forks -- or the "chunks" implemented by
> some guy in Modula-2 for the Atari ST back in 1987, with an arbitrary
> number of non-sequentially arranged parts per file -- make it
> mandatory to use this. I think Raoul's suggestion for storing
> meta-information is quite cool. This could be used for compression,
> indexing, encryption etc., all on the system level. But the way it
> was, it really seems to have been redundant!
The solution is really quite obvious if you don't think about
filesystems. Just forget that there is a filesystem.
> > I don't think web browsers and file browsers are identical. Indeed, HTTP
> > URLs have a fileish structure, but they are different in many ways:
> >
> > - A node need not be a file, or a directory. http://fu/ and
> > http://fu/bar can both have content. In a traditional FS 'fu' must
> > be a directory, and can have no content.
>
> Maybe I have misunderstood something, but I always thought that
> accessing the "content" of http://fu/ was just a conventionalized
> shorthand for http://fu/index.html ? Or is "index.html" required only
> because servers are based on file systems?
Indeed, "index.html" isn't specified anywhere in the HTTP specification.
It's just how Apache maps a non-leaf node to the filesystem by default.
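
To make the mapping concrete, here is a minimal Python sketch of how a
file-backed server might turn a URL path into a file path. It is not
Apache's actual code: map_to_file, is_dir and index_name are names
invented for this illustration, and the set of directories is made up.

```python
import posixpath


def map_to_file(url_path, is_dir, index_name="index.html"):
    """Map a URL path to a file path the way a file-backed server might.

    is_dir is a hypothetical callback that says whether a path is a
    directory in the backing store. index_name is a server setting,
    not something HTTP defines.
    """
    # Normalise the path: drop trailing slashes and duplicate separators.
    path = posixpath.normpath("/" + url_path.lstrip("/"))
    if is_dir(path):
        # A non-leaf node may have content in HTTP but not in a
        # filesystem, so substitute a conventional file inside it.
        return posixpath.join(path, index_name)
    return path


dirs = {"/", "/fu"}  # made-up directory layout
print(map_to_file("/fu/", dirs.__contains__))     # /fu/index.html
print(map_to_file("/fu/bar", dirs.__contains__))  # /fu/bar
```

The point is only that the index file name lives in the server's mapping
step. A server that kept its content somewhere other than a filesystem
would have no need for it.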
> > > (2) There must be privacy. Data will have to be stamped in such a way
> > > that the rights of authors, authorized readers and the riffraff can be
> > > distinguished clearly. In face of the contemporary situation, this may be
> > > the most important point of all.
> >
> > ACL style permissions are not something that will happen in Unununium.
> > Instead, privacy and security will be based on the assumption that if a
> > program has something, it can use it. For this to be effective, there
> > must not be a global shared namespace (like a filesystem). Instead,
> > functions (there are only functions in Unununium; a traditional program
> > is a function bound to the filesystem namespace) are given references to
> > the things they need directly through parameters.
> >
> > To illustrate, I might play a song in Linux by running
> > "mpg123 music/grassgreen.mp3". I'm telling mpg123 where to find the
> > thing I want to play by giving it a reference to a string. It then uses
> > this string as a key to a global namespace (the filesystem), but it could
> > actually use any string at all. If mpg123 were running as root this
> > could be very dangerous.
> >
> > In contrast, in Unununium I would do something more like
> > "mpg123(grassgreen)". The expression "grassgreen" is not a reference to
> > a string, but a reference to the thing which I want to play. There is no
> > global namespace, so the mpg123 function can't access anything but
> > "grassgreen".
>
> Yeah, that's the way EQUUS does it (using Erlang/Lisp-style "atoms" as
> references), though fettered to the Unix file system...
>
> If I may quote from my own web page:
>
> # As said before, EQUUS programs consist of functions only. The starting point
> # of a program is a function whose name is identical to the program's file
> # name.
> #
> # [ophis@naglfar]$ cat hello.q
> # function hello.q[ A B C D ] -> [ A B C D ]
> #
> # [ophis@naglfar]$ equus hello.q Nice to see you!
> # Program result = [ (A=)[Nice] (B=)[to] (C=)[see] (D=)[you!]].
> (http://www.incitatus.net/tutorial.html#argumentpassing)
>
> Of course, when reducing a program design to a single complex
> function, you cannot avoid side effects. So if I want to write a
> letter, I call my text processing function (which is, I remember,
> called whenever any piece of software desires formatted input), and
> this function will create an object which survives after the function
> has terminated, i.e. the program ended. (It cannot be returned to the
> user as it could be returned to a calling function, of course.)
>
> Then I find out that I need another copy of the letter and invoke the
> text processing function again, passing the reference to the former
> letter as an argument. So far so good. But now Mr. Bad Guy places a
> trojan on my wonderful machine which I am dumb enough to execute, with
> the result that my letter is mailed out to everyone whom it does not
> concern. Or overwritten with four-letter words or whatever -- using
> side effects too.
>
> Could you please enlighten me about how persistence and security can
> be reconciled on the "if a program has something, it can use it"
> basis?
I'm not sure I understand the situation here. The "text processing
function" does indeed return the letter. This might be to the user, or
it might be to a calling function. It's the same thing.

Nothing can protect you from yourself. If you are dumb enough to:

- install a trojan
- invoke it with a reference to something important

then you are screwed. Don't give your bank account information to
strangers.

I don't see what this has to do with side effects, or how this is a
problem specific to Unununium's security model. Unununium offers far
superior protection from this sort of thing because not only must one
invoke the trojan, but one must also give the trojan a reference to
something important.

Consider the traditional case where I install some arbitrary code and
run it. That code can then access *anything* to which I have access,
simply by passing arbitrary strings to open(2).
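
A rough Python sketch of the two styles, to show the shape of the
difference. Everything here is invented for illustration
(GLOBAL_NAMESPACE, open_by_name, both player functions and the byte
strings), and Python itself cannot enforce the second style, since any
Python function can still reach module globals. In a system with no
global namespace, the second player would have nothing else to reach.

```python
import io

# Stand-in for the global filesystem namespace (contents are made up).
GLOBAL_NAMESPACE = {
    "music/grassgreen.mp3": b"la la la",
    "private/letter.txt": b"private letter",
}


def open_by_name(name):
    """Like open(2): any caller can turn any string into an object."""
    return io.BytesIO(GLOBAL_NAMESPACE[name])


def player_with_names(name):
    """Traditional style: receives a string and resolves it globally."""
    # A trojan can resolve any other name too; nothing stops this.
    stolen = open_by_name("private/letter.txt").read()
    return open_by_name(name).read(), stolen


def player_with_reference(song):
    """Reference style: receives the object itself, with no name to resolve."""
    return song.read()


print(player_with_names("music/grassgreen.mp3"))
# (b'la la la', b'private letter')
print(player_with_reference(io.BytesIO(b"la la la")))
# b'la la la'
```

In the first case the caller handed over one name and the function read
two objects. In the second, what the function can touch is exactly what
it was given.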
> > Another way is to stream the logs over a network to another machine with
> > compatible hardware. This isn't as useful for personal backup, but it
> > would be a killer feature for servers. Using this method one could have
> > a hot spare for every machine ready to go in the case that a production
> > server dies from hardware failure or power loss. Usually this takes a
> > ton of work for each application, but this would be a general solution
> > that would work for any application, even those running in POSIX
> > emulation.
>
> AND you could use this for keeping track of versions too, couldn't you?
Somewhat. The log could be treated as a series of versions of the entire
state of the machine. Generally, though, one wants to version individual
objects. The persistence mechanism's notion of an object is a RAM page,
which probably isn't what the user thinks of as an object.
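
As a toy illustration of why the log gives whole-machine versions rather
than per-object ones, here is a Python sketch. The log format (a list of
page number and contents pairs) and the replay function are assumptions
made up for this example, not Unununium's actual persistence format.

```python
def replay(log, upto=None):
    """Rebuild page contents from a log of (page_number, data) writes.

    Replaying only the first `upto` entries yields the machine state as
    it was at that point in the log.
    """
    pages = {}
    for page_no, data in log[:upto]:
        pages[page_no] = data  # later writes to a page replace earlier ones
    return pages


# Hypothetical log: page 0 happens to hold a letter, page 1 a song.
log = [(0, b"letter v1"), (1, b"song"), (0, b"letter v2")]

print(replay(log, 2))  # {0: b'letter v1', 1: b'song'}
print(replay(log))     # {0: b'letter v2', 1: b'song'}
```

Each replay point is a version of every page at once. Nothing in the log
says that page 0 is "the letter", so rolling back just that object needs
knowledge the page-level mechanism doesn't have.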
> > Unununium will soon remove the barrier between hard disk and RAM from
> > applications. Eventually, I'd like to remove the barrier between the
> > local machine and remote machines as well. This will require some
> > portable representation of objects. When this is implemented, the same
> > serialization could be used to make portable backups of individual
> > objects.
>
> Well, in nature there *are* syncytia -- associations or clusters of
> cells without separating membranes, often of enormous size, with many
> thousands of nuclei cooperating --, but in general common
> multicellularity prevails. As a life scientist, I am a bit prejudiced
> by that. As one of my bioinformatics students once realized, growing
> very pale: "The power to do everything implies the power to get
> everything wrong..."
>
> (see above)
...and the power to never go wrong implies the inability to do anything.
What are you saying? Should we all live in isolation?

If a computer node is a nucleus, then I guess a namespace is a cell. It
seems you think distributed computing implies a giant shared namespace
to which all nodes have equal access. This isn't how it works at all.
Each node, and in fact, each function running on that node, has a
reference to only a very small part of the state of the entire machine
(organism, in the cellular analogy). This is multicellularity. Can a
small pain in my toe trigger the circulation to my arms to stop and make
them fall off? Let's hope not.