Be Newsletter
Issue 52, December 4, 1996
BE ENGINEERING INSIGHTS: The S in FS
Architecture
By Cyril Meurillon
Perhaps you've heard: There's a complete rewrite of the Be
file system going on. In last week's Engineering Insights
column, Dominic Giampaolo described how Be files are
organized and represented in the new system and how database
features are merged into this representation. In other
words, last week we saw the "file" part of the file system;
this week, I'm going to show you the "system."
Loadable File Systems: The File System Protocol
If we zoom out a bit, we'll see that the Be file system
proper is actually just one implementation of the more
general Be file system protocol (FSP).
The file system protocol is an API for a set of common file
operations (open, read, write, and so on). To support a
particular "brand" of file system, one simply implements the
FSP API to do whatever the file system expects to be done.
In addition to our native file system, we've implemented (or
are implementing) this protocol for DOS, HFS, NFS, and
ISO9660. When you (the developer or user) want to talk to a
file that resides on a DOS volume, the kernel simply loads
the DOS FSP add-on, just as it would load an add-on to talk
to a specific graphics card or other device. In fact,
implementing an FSP is mostly an exercise in writing a
kernel add-on: The type of work that's required is
comparable to writing a new device driver. It's not for the
hobbyist, but it's not terribly difficult (unless, of
course, you care about efficiency, reliability, and
integrity).
The communication between the kernel and a file system (that
is, an FSP add-on) is "message-based." When a program makes
a file system call, such as open(), read(), or mkdir(), the
kernel maps the call to a specific FSP message and then
"sends" the message to the file system. The part of the
kernel that provides this mapping is called the file system
independent layer (FSIL).
Currently, FSP messages are simply function calls; thus, for
example, when your program calls open(), the FSIL maps the
call to (and invokes) an FSP function called op_open(). This
function-call-as-message interface works well with add-ons
-- but we also want to extend the interface to make it more
flexible. For example, we plan to allow user-level programs
to act as file systems. Communication, in this case, could
be through a pipe or port.
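To make the function-call-as-message idea concrete, here is a minimal sketch of how a file system independent layer might route an open request to whichever add-on serves a volume. Only the op_open name and its argument list come from the FSP itself; the fs_ops table, the volume record, fsil_open(), and the demo file system are hypothetical names invented for this illustration, not the actual kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Signature taken from the FSP prototypes; everything else is hypothetical. */
typedef int op_open(void *ns, void *node, int omode, void **cookie);

/* Hypothetical table of entry points exported by one FSP add-on. */
struct fs_ops {
    op_open *open;
};

/* Hypothetical per-volume record kept by the FSIL. */
struct volume {
    const struct fs_ops *ops;   /* which file system serves this volume */
    void *ns;                   /* that file system's private volume data */
};

/* FSIL side: turn an open request into the matching FSP "message". */
int fsil_open(struct volume *vol, void *node, int omode, void **cookie)
{
    assert(vol != NULL && vol->ops != NULL);
    if (vol->ops->open == NULL)
        return -1;              /* this file system has no open operation */
    return vol->ops->open(vol->ns, node, omode, cookie);
}

/* A demo file system whose op_open only counts how often it is called. */
int demo_open_calls;

static int demo_open(void *ns, void *node, int omode, void **cookie)
{
    (void)ns; (void)node; (void)omode;
    demo_open_calls++;
    *cookie = NULL;             /* the demo keeps no per-open state */
    return 0;
}

const struct fs_ops demo_ops = { demo_open };
```

Because the FSIL only ever goes through the table, swapping the direct call for a write to a pipe or port (as a user-level file system would need) would be confined to fsil_open() in a design like this one.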
Design Goals
While designing the file system protocol and the file system
independent layer, we had three goals:
- Parallelism. We wanted a given file system to be able to
handle more than one FSP message at a time -- it should be
able to delete file A while writing to file B. To aid in
this, we needed to design the FSIL such that it did as
little serialization (of FSP messages) as possible.
Of course, some operations need to be serialized in order to
maintain structural consistency and re-entrancy; but only
the file system implementation (at the FSP level) should
know which calls need to be serialized. The FSIL shouldn't
make any assumptions. For example, imagine what would happen
if the FSIL assumed that reading and closing a file were
mutually exclusive operations that needed to be serialized:
In other words, if you tried to close a file that you were
reading, the FSIL would hold the close() message until the
read() completed. This is fine for normal files -- but it
doesn't work for a UNIX-style pipe in which read() is often
blocked waiting for more data to show up. If the FSIL
withheld the close() call while the read() call was blocked,
your application (and, possibly, the entire file system)
would deadlock.
- Re-entrancy. The FSIL must be thread-safe. Thread safety
applies most notably to the identification of "nodes" (files
or directories). How does the FSIL tell a file system which
node an operation is meant to apply to? Names can't be used
since there's no guarantee that the name will stay the same:
For example, some other thread might rename the file we're
currently reading -- and we don't want our read to fail
simply because the open file has been renamed! We could
serialize all the calls to a particular node, but that
violates our goal of maximum parallelism.
Instead, the FSIL maintains a "node structure" that's
independent of the node name. This structure is passed with
every file system operation. When can the FSIL free a node
structure? It isn't safe to do it after deleting a node
because some other thread may still hold the node open.
The solution we came up with is to provide a reference count
on the nodes. Every time a node is opened (or passed as a
parameter to an FSP function), its reference count is
incremented. Then, when the node is closed (or after the FSP
function returns), the reference count is decremented. The
node is freed only when the reference count reaches 0.
- Thorough Generality. We wanted the FSP to be general
enough to handle a broad variety of file systems: DOS, HFS,
NFS, ISO9660, and others. And in addition to these "real"
file systems, we also wanted to be able to implement
"virtual" file systems. Instead of representing real files
stored somewhere on a disk, virtual file systems provide
"services." For example, the /dev file system provides
access to hardware devices; the /pipe file system gives
access to UNIX-like pipes. Virtual file systems are
potentially the most interesting part of the new system.
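The node reference counting described under Re-entrancy can be sketched in a few lines. This is an illustration under stated assumptions, not the FSIL's actual code: the fsil_node structure, node_create(), node_get(), node_put(), and the nodes_freed counter are all hypothetical, and C11 atomics stand in for whatever synchronization the kernel really uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical FSIL node structure; the field name is invented. */
struct fsil_node {
    atomic_int ref_count;   /* opens plus FSP calls currently using the node */
};

atomic_int nodes_freed;     /* bookkeeping for the demonstration only */

struct fsil_node *node_create(void)
{
    struct fsil_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    atomic_init(&n->ref_count, 0);
    return n;
}

/* Taken on open, and before an FSP function is handed the node. */
void node_get(struct fsil_node *n)
{
    atomic_fetch_add(&n->ref_count, 1);
}

/* Dropped on close, and after the FSP function returns. */
void node_put(struct fsil_node *n)
{
    int old = atomic_fetch_sub(&n->ref_count, 1);
    assert(old > 0);
    if (old == 1) {         /* this call dropped the last reference */
        free(n);
        atomic_fetch_add(&nodes_freed, 1);
    }
}
```

If one thread still holds the node open when another finishes with it (or deletes the file), the second thread's node_put() leaves the structure alone; it's released only by whichever caller drops the final reference.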
Some FSP Calls
I'm getting itchy... Let's look at some of the FSP calls:
typedef int op_open(void *ns, void *node, int omode,
                    void **cookie);
typedef int op_close(void *ns, void *node, void *cookie);
typedef int op_free_cookie(void *ns, void *node,
                           void *cookie);
typedef int op_read(void *ns, void *node, void *cookie,
                    off_t pos, void *buf, size_t *len);
typedef int op_write(void *ns, void *node, void *cookie,
                     off_t pos, const void *buf, size_t *len);
typedef int op_ioctl(void *ns, void *node, void *cookie,
                     int cmd, void *buf, size_t len);
In these prototypes, ns designates the "name space," or
volume, that the node is on; this is necessary since a
single FSP implementation (that is, for a particular file
system) can serve more than one volume. The cookie argument
is what allows the file system to distinguish between
different clients of the same file. Every time open() is
called, the file system has the opportunity to allocate a
cookie. The FSIL passes the cookie as an argument to the FSP
calls with every subsequent operation on this open file.
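As an illustration of the cookie's role, here is a sketch of a toy in-memory file system whose open and read functions follow the op_open and op_read prototypes. The toy_node and toy_cookie structures are hypothetical, and so is the convention that *len holds the requested size on entry and the number of bytes copied on return; the prototypes suggest it, but it isn't spelled out here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical node of a toy in-memory file system. */
struct toy_node {
    const char *data;
    size_t      size;
};

/* Hypothetical per-open state; the FSIL only ever sees it as a void *. */
struct toy_cookie {
    int    omode;        /* mode this client opened the file with */
    size_t bytes_read;   /* running total for this client only */
};

/* Follows the op_open prototype: allocate one cookie per open. */
int toy_open(void *ns, void *node, int omode, void **cookie)
{
    struct toy_cookie *c = malloc(sizeof *c);
    (void)ns; (void)node;
    if (c == NULL)
        return -1;
    c->omode = omode;
    c->bytes_read = 0;
    *cookie = c;
    return 0;
}

/* Follows the op_read prototype. Assumed convention: *len is the
   request on entry and the number of bytes copied on return. */
int toy_read(void *ns, void *node, void *cookie,
             off_t pos, void *buf, size_t *len)
{
    struct toy_node *n = node;
    struct toy_cookie *c = cookie;
    size_t avail = 0;
    (void)ns;
    assert(n != NULL && c != NULL);
    if (pos < 0)
        return -1;
    if ((size_t)pos < n->size)
        avail = n->size - (size_t)pos;
    if (*len > avail)
        *len = avail;        /* clamp the request to what's left */
    if (*len > 0)
        memcpy(buf, n->data + pos, *len);
    c->bytes_read += *len;   /* state private to this one open */
    return 0;
}
```

Two clients that open the same node get two distinct cookies, so the bytes_read total of one is untouched by the reads of the other.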
The operation names make the mapping between system calls
and file system operations fairly obvious.
For example, the system call open() is turned into the
op_open() FSP function. op_free_cookie() is the only
operation that's not self-explanatory: It frees the cookie
allocated by op_open(). Why do we need a separate function
for this? You might think that we could free the cookie in
op_close() -- but that would actually be wrong. If
op_close() freed the cookie, there would be a race condition
with ongoing op_read() and op_write() calls operating on the
same cookie. The solution, once again, is to reference count
the cookie. A separate operation op_free_cookie() is invoked
after op_close() when the last operation on the cookie
returns.
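The ordering of op_close() and op_free_cookie() can be sketched as follows. Only the two typedefs come from the FSP; the open_file record, file_init(), file_begin_op(), file_end_op(), file_close(), and the demo callbacks are hypothetical, and the sketch assumes the FSIL stops handing out the cookie to new operations once close has been requested.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Signatures from the FSP prototypes. */
typedef int op_close(void *ns, void *node, void *cookie);
typedef int op_free_cookie(void *ns, void *node, void *cookie);

/* Hypothetical FSIL record for one open file. */
struct open_file {
    void *ns, *node, *cookie;
    op_close       *close;
    op_free_cookie *free_cookie;
    atomic_int      refs;   /* 1 for the open itself + 1 per call in flight */
};

void file_init(struct open_file *f, void *cookie,
               op_close *cl, op_free_cookie *fc)
{
    f->ns = NULL;
    f->node = NULL;
    f->cookie = cookie;
    f->close = cl;
    f->free_cookie = fc;
    atomic_init(&f->refs, 1);
}

/* Wrapped around every op_read(), op_write(), and so on. */
void file_begin_op(struct open_file *f)
{
    atomic_fetch_add(&f->refs, 1);
}

void file_end_op(struct open_file *f)
{
    int old = atomic_fetch_sub(&f->refs, 1);
    assert(old > 0);
    if (old == 1)           /* last user of the cookie is gone */
        f->free_cookie(f->ns, f->node, f->cookie);
}

/* close: notify the file system now, free the cookie later if needed. */
int file_close(struct open_file *f)
{
    int err = f->close(f->ns, f->node, f->cookie);
    file_end_op(f);         /* drop the reference held by the open */
    return err;
}

/* Demo callbacks that just record what happened. */
int close_calls, free_calls;

int demo_close(void *ns, void *node, void *cookie)
{
    (void)ns; (void)node; (void)cookie;
    close_calls++;
    return 0;
}

int demo_free_cookie(void *ns, void *node, void *cookie)
{
    (void)ns; (void)node;
    free(cookie);
    free_calls++;
    return 0;
}
```

If a read is still in flight when the file is closed, file_close() sends op_close() right away (which lets a pipe wake its blocked reader), and the cookie is freed only when that read's file_end_op() runs.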
I hope you've gotten an idea of the kinds of issues we dealt
with while designing the FSIL and the FSP. The FSP API
probably won't be released until after DR9, but no doubt the
subject will show up in the news groups before then. If you
have any questions, don't hesitate to contact me at
cyril@be.com.
News from the Front
By William Adams
Last week was Thanksgiving. Beyond the fact that it
ushers in the heaviest shopping day of the year, it reminds
me that I am grateful for many things.
My 19-month-old likes keyboards and mice.
My wife is a programmer.
I have plenty of computers.
I'm allowed to program as much as I want.
And my favorite computer now has a MIDI synthesizer in
software!
In case you missed that, Marc Ferguson, who spelled out the
whats and whyfors of MIDI in a previous newsletter, has
released the software synthesizer for DR8.2. There are two
packages you can install. One is relatively small and will
get you hearing as soon as possible. The other is quite
large and you can download it at a more leisurely pace.
ftp://ftp.be.com/pub/dr8_update/simple_midi_small.tgz
In this package you will find two things:
simple-midi          The actual player
general.midi.small   A smaller version of the instruments (315 K)
Place the general.midi.small file into /boot/system/bin.data
and rename it general.midi. Put the simple-midi file
anywhere you like, possibly /boot/apps. You should do a
setfile simple-midi once it's installed.
Alternatively, you can get the bigger package, which
contains the higher-quality instruments.
ftp://ftp.be.com/pub/dr8_update/simple_midi_large.tgz
simple-midi          The actual player
general.midi         The full-bodied instruments (5 MB)
Again, place the general.midi file into
/boot/system/bin.data and put the simple-midi file anywhere
you like.
This is a drag-and-drop interface for playing MIDI files.
Simply drag a MIDI file onto the icon and it starts playing.
You can also play files by starting the application and
using the Open button. You should only execute a single MIDI
player at a time. If you try tricks like copying the file
and playing multiple files that way, you might not get what
you expect.
Personally, I'm not a MIDI person. I don't do instruments,
but I can certainly appreciate high-quality music (for a
computer) when I hear it. I played with quite a few MIDI
sequences and they sound great. And since the instruments
will be the same on all machines that install this package,
the music will sound relatively similar on yours, too.
This is not the final effort for software MIDI on the BeOS.
This is merely a placeholder for what will come in DR9. The
synthesizer doesn't provide you with an API so that you can
use it from within your own application; this will show up
in DR9. But for those who've been dying to hear some
thumpin' sounds out of their machines without having to buy
a hardware synthesizer, your day has come.
FROM YOUR BENCH
I thought I was a pretty good programmer and that I could
help other programmers to get serious things done using the
BeOS. One thing I've learned in recent years is that no
matter how good you are, the Internet will bring someone to
your door who is much better. I believe the BeOS has
attracted extremely skilled, highly motivated programmers
who crank out the most awesome stuff, and honestly, I find
myself learning from them as often as I humbly try to teach
them.
If you don't check the BeWare(TM) and ftp.be.com sites often,
you should. There are new offerings coming out all the time.
I've mentioned a number of developer offerings in the past,
and I'll continue to try and highlight things as they come
by.
Of particular note this week are two applications that have
been progressing over the past few months. You can check the
What's New page at ftp://ftp.be.com/pub/contrib/whatsnew.txt
to see where they are.
- BetMap. This is a drag-and-drop paint program, the likes
of which you've probably never seen. I said the same thing
when it first came out, and it's still true. Only now it has
an API so you can participate in this add-ons-gone-wild
world.
- Zonic. This is a very useful and very clean layout engine
and widget set. I think it demonstrates the ease with which
multiple interfaces can be created for the BeOS without too
much work. You should get the demo and play with it. "Really
neat" is probably the best description.
APOLOGIES AND CHEERLEADING
As many people have asked me via e-mail, "Why
aren't the samples available when the newsletter comes out?"
No one to blame but me. Excuses... Excuses... Excuses... The
standard refrains apply, and I'll try to do better in the
future. There's only me doing these things, and things do
get busy, but there is hope. One of the busy things I've
been doing is interviewing many candidates to jump into the
pit with me. It goes well, and you should have some new
targets... I mean new voices of support soon.
Remember Macworld? It's now one week closer than the last
time I mentioned it. Thump, thump, thump. Hear that? That's
the beat of the Be Drum. It's pushing you to develop that
code as fast as you can so that we can show your wares at
what promises to be an illuminating, highly visible event
for the BeOS and its developers.
BE MARKETING MUTTERINGS: Comdex, Sore Feet But No Sore Heads
By Alex Osadzinski
I'm not a great fan of Comdex, so much so that I've avoided
attending or exhibiting there for six years. But when
Motorola graciously offered us two demo stations in their
PowerPC Pavilion at Comdex this year, the tightwad in me
couldn't resist. Accordingly, Maureen Miller, our trade show
manager, Ron Theis, our Webmaster, and I spent five days
showing the BeOS on a BeBox(TM) and on a Power Computing
machine to thousands and thousands of people.
It was actually fun.
This is quite amazing.
Most people work for two kinds of income: Actual money and
psychic income. Psychic income is the satisfaction you get
from doing your job. Comdex was fun because no matter how
many times we gave the BeOS demo, we got tremendous psychic
income from the reactions of the audience. It was also fun
because Las Vegas appears to have outgrown the show. It was
possible to get a cab without waiting for two hours. It was
possible to get a table at a restaurant. And it was possible
to get a hotel room, albeit for a price that made one's
eyeballs bleed.
We survived the show by working hard and by avoiding the
traditional trade show excesses of late-night parties and
lots of drinking. Our feet hurt, but not our heads. I will
confess that all three of us succumbed to the allure of the
craps table but, incredibly, all won Big Money (well, OK,
Some Money). This bodes well and probably means that there's
good karma surrounding Be employees these days. The casino
system seems to function on the basis that, while we cheered
and rejoiced at winning $15 (our threshold of excitement was
fairly low), portly gentlemen in plaid pants smoking noxious
cigars were losing $100 at a time.
The most pleasant and gratifying part of the show was that
over half the people coming to our booth knew our company's
name, knew roughly what we're trying to achieve, and wanted
to see a demo. This is a huge change from, say, May of this
year, when most visitors to a trade show booth had never
heard of us. The rampant speculation in the press about our
future seems to have had one benefit, at least. It was also
nice to meet with many Be developers and put faces to e-mail
addresses. Nicer still, many of the developers are working
on real projects, having moved on from simply experimenting
with the BeOS.
Were there any other interesting things at Comdex? I don't
know! Our booth was never quiet enough to allow us to slip
away to check out the rest of the show.
CoffeeBean
By Jean-Louis Gassée
I just flew back from Maui by way of Pittsburgh,
Pennsylvania. Using this itinerary to raise questions about
my sanity is fruitless -- I'm engaged in work on an
alternative personal computer platform to begin with. But,
seriously, I had a good reason. After taking my family of
French pilgrims to celebrate the ultimate American holiday
in a locale more fitting our (vastly exaggerated) hedonistic
roots, I flew to Carnegie Mellon for the reading of Paul
Clip's thesis, CoffeeBean, a Java virtual machine
implemented on the BeOS under the tutelage of Professor Adam
Beguelin.
It was good and useful fun. Developing applications for a
platform still in its infancy isn't for everyone. Writing
system-level code is living even more dangerously,
especially at a time when the system is still changing at a
fast pace from one revision to another. So, Paul Clip's
project wasn't only a test of his intellect and manhood, but
also a way to push our product in areas that aren't
necessarily probed in other developments. Overall, Paul was
generous in his comments, so much so I won't repeat the most
enthusiastic ones. His work pointed to several
weaknesses (in California, "areas for improvement"). Some
have to do with differences with Sun's views of the world.
For instance, Java expects 64-bit signed integers, which are
unavailable in the current version of the C++ compiler on
our system. There's also a difference between the systems
used by the two worlds for priorities and synchronization.
An opportunity for us to roadtest our kernel in subtle
scheduling situations. Paul's work also fingered a problem
other Be developers have discovered, this one related to
hardware. Not all multithreading situations are dealt with
equally well by our system. When we moved from the AT&T
Hobbit processor to the PowerPC, the 604 was unavailable and
projected to be prohibitively expensive (for us at least).
The 601 was clearly an ephemeral product. Only the 603 had
the future and the price we liked. There was one glitch,
however: Motorola stated that multiprocessor systems
couldn't be implemented with the 603. On closer examination,
this techno-marketing statement was based on the absence of
cache-coherency hardware on the 603. Loosely speaking, cache
coherency is a function by which one processor can advise
others of the "pollution" of data contained in its cache,
thus preventing its colleagues from reliance on now-invalid
copies of the same data in their own caches. We decided we
could work around that problem, mostly in software, and we
produced working 603-based dual-processor hardware. In most
cases, the workaround we designed imposes an invisible
performance penalty. Now, imagine a situation where two
threads, one to each processor, work on the same data. This
can give rise to sizable overhead when the caches have to be
constantly updated. One can construct cases when two
processors perform more slowly than when one BeBox CPU is
turned off. Fortunately, in real life, the system performs
loosely coupled tasks most of the time, and the 603 penalty
isn't a factor. The 604 is now available, without much of
the earlier price penalty exacted on our small company with
its limited purchase power; it features cache-coherency
hardware and thus does away with the limitations of the 603
in MP applications. Most, if not all, PowerMac licensees have
604 dual-processor hardware but not much system software to
take advantage of it, a situation we intend to deal with
promptly.
Back to Carnegie Mellon, we're grateful for Prof. Beguelin's
hospitality, we thank Paul Clip for his dissecting
dissertation of our system, and we've already incorporated
several of his suggestions into the next release of our
system.
BeDevTalk Summary
BeDevTalk is an unmoderated discussion group that's a forum
for the exchange of technical information, suggestions,
questions, mythology, and suspicions. In this column, we
summarize some of the active threads, listed by their
subject lines as they appear, verbatim, in the group.
To subscribe to BeDevTalk, visit the mailing list page on
our Web site: http://www.be.com/aboutbe/mailinglists.html.
-----WEEK 2---------------------------
Subject: Better thread control
More discussion on simulating Amiga "signals" using
semaphores (or benaphores) and thread calls. Various
scenarios were hypothesized; some were answered with source
code examples.
-----WEEK 2---------------------------
Subject: [CALL FOR VOTE ?!] Active window != Frontmost
window
This discussion, which was initially about different ways to
let the user place/tile/layer windows on the desktop, veered
into a debate on the meaning and enforcement of "multi-
user," as well as a search for the perfect real-world
analogy for user-settable GUI preferences.
Combining the two thoughts, we have your boss's wife peeking
over your shoulder as she tries to hijack your car. No one
pointed out the blatant sexism in "the boss's wife," but the
thread is young.
-----WEEK 2---------------------------
Subject: VMM the horror continues
AKA: DR9 And VMM
A heated debate on whether VM is really a good thing. No
clear consensus, but a number of folks pointed out that you
have to keep the concept of VM separate from that of
swapping -- it's possible to have "good" VM but "bad"
swapping. In particular, an unbounded swap file is bad.
-----NEW---------------------------
Subject: Shutdown a BeBox
A simple question -- "How do I shutdown my BeBox from within
a program" -- was answered, and then the thread went on to
discuss whether programmatically shutting down or powering
off is a desirable feature, and, if allowed, how it should
be coordinated with a UPS. In general, soft-off is seen as a
nice feature that can easily be abused; it was proposed that
the mechanism be protected by root authentication (once
multi-user is implemented).
-----NEW---------------------------
Subject: Two comments on AppSketcher
Some initial remarks about Marc Verstaen's AppSketcher led
to a wider discussion of UI modeling engines (M. Verstaen
pointed out that AppSketcher isn't intended to be such an
engine). Should the UI for an app be hand-tweaked, and the
coordinates hard-coded, or is this in itself an indication
of poor design? It was offered that a good layout engine can
figure out where things should go based on relative sizes
and positions without the programmer having to supply any
coordinate values. Others balk at the thought of automated
UI design. It ended in tears.
-----NEW---------------------------
Subject: new file system
AKA: Newsletter #51 - File Details
AKA: Regarding journaling and DR9 FS.
This thread collected comments on Dominic Giampaolo's
article in last week's newsletter. Some folks are
disappointed in the decision to retain a hierarchical file
system, but there was approbation (and expectation) for the
IFS aspects of the system. (See Cyril Meurillon's article in
this issue for more information.) The final thread (of the
above) offered suggestions for fine-tuning the journaling
feature.
-----NEW---------------------------
Subject: Preferences and version numbers
Different schemes for storing and encoding of version
denoters were proposed. The thread became quite precise in
its detail: The exact format of version numbers and the
structure that would store version information were
discussed. The one debatable subject was whether version
strings should be human-readable.
-----NEW---------------------------
Subject: ImageViewer's drag-to-save protocol?
AKA: Proposal for Browser drag-save protocol v0.1
How does ImageViewer do its drag-to-save thing? A number of
listeners (well, a couple, anyway) were interested in
emulating this behavior. There were some guesses about what
was happening, and then Peter Potrebic parted the curtains:
- User starts dragging a selection in ImageViewer. It
doesn't know the destination so it uses its own data format
for the image.
- The image is dropped on the Browser. The Browser doesn't
understand the message so it replies with a
B_MESSAGE_NOT_UNDERSTOOD.
- ImageViewer receives the B_MESSAGE_NOT_UNDERSTOOD ...
[creates] a temp file and replies back to the sender.
- The Browser gets this reply. It knows what to do with
files (refs) so it copies the data file to the appropriate
place.
You can find example code in the RRRRRRRRaster app:
ftp://ftp.be.com/pub/Samples/Rras_sdk.tgz
A generalization of the drag-to-save mechanism was
proposed and debated.
-----NEW---------------------------
Subject: international locales
This thread introduced the notion of internationalizing the
BeOS in general, and creating localized calendars
specifically. It was conceded that the Chinese calendar is
notoriously difficult.