BE ENGINEERING INSIGHTS: Changes in the BeOS Driver API
By Cyril Meurillon -- cyril@be.com

Oh no, you think, yet another article about drivers. Are they crazy about drivers at Be, or what? Ouaire iz ze beauty in driverz? The truth is that I would have loved to write about another (hotter) topic, one that has kept me very busy for the past few months, but my boss said I couldn't (flame him at cyrilsboss@be.com ;-). I guess I'll have to wait until it becomes public information. In the meantime, please be a good audience, and continue reading my article.

Before I get on with the meat of the subject, I'd like to stress that the following information pertains to our next release, BeOS Release 4. Because R4 is still in the making, most of what you read here is subject to change in the details, or even in the broad outlines. Don't write code today based on the following. It is provided to you mostly as a hint of what R4 will contain, and where we're going after that.

Introduction of Version Control

That's it. We finally realized that our driver API was not perfect, and that there was room for future improvements, or "additions." That's why we'll introduce version control in the driver API for R4. Every driver built then and thereafter will contain a version number that tells which API the driver complies with.

In concrete terms, the version number is a driver global variable that's exported and checked by the device file system at load time. It is declared as follows:

    #define B_CUR_DRIVER_API_VERSION 2
    extern _EXPORT int32 api_version;

In your driver code, you'll need to add the following definition:

    #include <Drivers.h>
    ...
    int32 api_version = B_CUR_DRIVER_API_VERSION;

Driver API version 2 refers to the new (R4) API. Version 1 is the R3 API. If the driver API changes, we will bump the version number to 3. Newly built drivers will have to comply with the new API and declare 3 as their API version number. Old driver binaries will still declare an old version (1 or 2), forcing the device file system to translate them to the newer API (3). This incurs only a negligible overhead in loading drivers.

But, attendez, vous say. What about pre-R4 drivers, which don't declare what driver API they comply with? Well, devfs treats drivers without a version number as complying with the first version of the API -- the one documented today in the Be Book. Et voila.
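To make the fallback rule concrete, here is a minimal sketch in C of how a loader could decide which API version to assume for a driver. It is an illustration only: effective_api_version(), needs_translation() and the idea of handing the loader a pointer to the exported symbol are hypothetical, not devfs internals, and the standard int32_t stands in for Be's int32.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors B_CUR_DRIVER_API_VERSION from the article (2 for R4). */
#define CUR_DRIVER_API_VERSION 2

/*
 * 'exported' points at the driver's api_version symbol, or is NULL when
 * the driver binary does not export one (a pre-R4 driver).
 * Hypothetical helper: the real lookup is done by the device file system.
 */
int32_t effective_api_version(const int32_t *exported)
{
    /* No exported version number means "first version of the API". */
    return exported != NULL ? *exported : 1;
}

/* Nonzero when the driver speaks an older API and must be translated. */
int needs_translation(const int32_t *exported)
{
    return effective_api_version(exported) < CUR_DRIVER_API_VERSION;
}
```

With this sketch, a driver that exports nothing is handled exactly like one that declares version 1, which is the behavior described above.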
New Entries in the

I know you're all dying to learn what's new in the R4 driver API... Here it is, revealed to you exclusively! We'll introduce scatter-gather and (a real) select in R4, and add a few entries in the device_hooks structure to let drivers deal with the new calls.

Scatter-gather

As discreetly announced by Trey in his article at http://www.be.com/aboutbe/benewsletter/volume_II/Issue35.html, we've added two new system calls, well known to the community of UNIX programmers:

    struct iovec {
        void *iov_base;
        size_t iov_len;
    };
    typedef struct iovec iovec;

    extern ssize_t readv_pos(int fd, off_t pos, const iovec *vec, size_t count);
    extern ssize_t writev_pos(int fd, off_t pos, const iovec *vec, size_t count);

These calls let you read and write multiple buffers to/from a file or a device. They initiate an IO on the device pointed to by fd, starting at position pos, using the count buffers described in the array vec.

One may think this is equivalent to issuing multiple simple reads and writes to the same file descriptor -- and, from a semantic standpoint, it is. But not when you look at performance!

Most devices that use DMA are capable of "scatter-gather." It means that the DMA can be programmed to handle, in one shot, buffers that are scattered throughout memory. Instead of programming N separate IOs, each pointing to a single buffer, only one IO needs to be programmed, with a vector of pointers that describe the scattered buffers. The result is higher bandwidth.

At a lower level, we've added two entries in the
    typedef status_t (*device_readv_hook)(void *cookie, off_t position,
        const iovec *vec, size_t count, size_t *numBytes);
    typedef status_t (*device_writev_hook)(void *cookie, off_t position,
        const iovec *vec, size_t count, size_t *numBytes);

    typedef struct {
        ...
        device_readv_hook readv;    /* scatter-gather read from the device */
        device_writev_hook writev;  /* scatter-gather write to the device */
    } device_hooks;

Notice that the syntax is very similar to that of the single read and write hooks:

    typedef status_t (*device_read_hook)(void *cookie, off_t position,
        void *data, size_t *numBytes);
    typedef status_t (*device_write_hook)(void *cookie, off_t position,
        const void *data, size_t *numBytes);

Only the descriptions of the buffers differ.

Devices that can take advantage of scatter-gather should implement these hooks. Other drivers can simply declare them NULL. When a

Select

I'm not breaking the news either with this one. Trey
announced in his article last week the coming of

    extern int select(int nbits, struct fd_set *rbits, struct fd_set *wbits,
        struct fd_set *ebits, struct timeval *timeout);

rbits, wbits and ebits are bit vectors. Each bit represents a file descriptor to watch for a particular event:

* rbits: wait for input to be available (read returns something immediately without blocking)
* wbits: wait for output to drain (write of 1 byte does not block)
* ebits: wait for exceptions.
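For readers who haven't used select() before, here is a short sketch of the read case described in the list above. It is written against the POSIX select() found on UNIX systems, on the assumption that the R4 call behaves the same way; wait_readable() and pipe_becomes_readable() are hypothetical helper names chosen for this illustration.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until 'fd' has input available or 'timeout_ms' elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rbits;
    struct timeval timeout;

    FD_ZERO(&rbits);
    FD_SET(fd, &rbits);            /* set the bit for the descriptor to watch */
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument: highest watched descriptor plus one. */
    return select(fd + 1, &rbits, NULL, NULL, &timeout);
}

/*
 * Demonstration on a pipe: nothing to read at first, then one byte
 * is written and the read end becomes ready.
 * Returns 0 if select() behaved as expected, -1 otherwise.
 */
int pipe_becomes_readable(void)
{
    int fds[2];
    int before, after;

    if (pipe(fds) != 0)
        return -1;

    before = wait_readable(fds[0], 0);       /* empty pipe: times out */
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    after = wait_readable(fds[0], 100);      /* one byte pending: ready */

    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

The same pattern extends to wbits and ebits by filling in the third and fourth arguments instead of passing NULL.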
Here are the two hooks we added to the device_hooks structure:

    struct selectsync;
    typedef struct selectsync selectsync;

    typedef status_t (*device_select_hook)(void *cookie, uint8 event,
        uint32 ref, selectsync *sync);
    typedef status_t (*device_deselect_hook)(void *cookie, uint8 event,
        selectsync *sync);

    #define B_SELECT_READ       1
    #define B_SELECT_WRITE      2
    #define B_SELECT_EXCEPTION  3

    typedef struct {
        ...
        device_select_hook select;      /* start select */
        device_deselect_hook deselect;  /* stop select */
    } device_hooks;
with the sync and ref it was passed in the select hook. This
happens typically at interrupt time, when input buffers are
filled or when output buffers drain. Another place where
The deselect hook is called to indicate that the file
descriptor shouldn't be watched any more, as the result of
one or more events on a watched file descriptor, or of a
timeout. It is a serious mistake to call
Drivers that don't implement select() should declare these
hooks NULL.

Introduction of "Bus Managers"

Another big addition to R4 is the notion of "bus managers."
Arve wrote a good article on this, which you'll find at:

Bus managers are loadable modules that drivers can use to access a hardware bus. For example, the R3 kernel calls that drivers used looked like this:

    extern long get_nth_pci_info(long index, pci_info *info);
    extern long read_pci_config(uchar bus, uchar device, uchar function,
        long offset, long size);
    extern void write_pci_config(uchar bus, uchar device, uchar function,
        long offset, long size, long value);
    ...

Now, they're encapsulated in the PCI bus manager. The same happened for the ISA, SCSI and IDE bus related calls. More busses will come. This makes the kernel a lot more modular and lightweight, as only the code handling the busses actually present is loaded in memory.

A New Organization for the Drivers Directory

In R3,

That's why we've broken down these directories into subdirectories that help the device file system locate drivers when new devices are opened.
For example, the serial driver publishes the following devices:

    ports/serial1
    ports/serial2

It lives under

If "fred", a driver, wishes to publish a ports/XYZ device, then it should set up this symbolic link:

If a driver publishes devices in more than one directory, then it must set up a symbolic link in every directory it publishes in. For example, if driver "foo" publishes:

    fred/bar/machin
    greg/bidule

then it should come with the following symbolic links:

    ../add-ons/kernel/drivers/dev/fred/bar/foo -> ../../../bin/foo
    ../add-ons/kernel/drivers/dev/greg/foo -> ../../bin/foo

This new organization speeds up device name resolution a
lot. Imagine that we're trying to find the driver that
serves the device "

Future Directions

You see that the driver world has undergone many changes in
BeOS Release 4. All this is nice, but there are other
features that did not make it in, which we'd like to
implement in future releases. Perhaps the most important one
is asynchronous IO. The asynchronous

Thanks to the driver API versioning, we'll have no problems
throwing the necessary hooks into the

BE ENGINEERING INSIGHTS: Higher-Performance Display
By Jean-Baptiste Quéru -- jbq@be.com

In application writing, the Interface Kit (and the Application Server which runs underneath the Kit) are responsible for handling all the display that finally goes on screen. They provide a nice, reasonably fast way to develop a good GUI for your application.

Sometimes, however, they aren't fast enough, especially for
game writing. Using a windowed-mode BDirectWindow sometimes
helps (or doesn't slow things down, in any case), but you
still have to cooperate with other applications whose
windows can suddenly overlap yours or want to use the
graphics accelerator exactly when you need it. Switching to
a full-screen The Looks quite exciting, hey? Unfortunately, all is not
perfect.
Here is a code snippet, ready for you to use and customize:

    #include <Application.h>
    #include <WindowScreen.h>
    #include <string.h>

    typedef long (*blit_hook)(long,long,long,long,long,long);
    typedef long (*sync_hook)();

    class NApplication:public BApplication {
    public:
        NApplication();
        bool is_quitting; // So that the WindowScreen knows what to do
                          // when disconnected.
    private:
        bool QuitRequested();
        void ReadyToRun();
    };

    class NWindowScreen:public

There are some traps to be aware of before you begin playing
with the

* About
* You should call
* You should call
* You should neither call

Choosing a good color_space and a good framebuffer size:

* You should be aware that in R3.x some drivers do not support 16 bpp, and some others do not support 32 bpp. You should also know that some graphics cards do not allow you to choose an arbitrary framebuffer size; some will not accept a framebuffer wider than 1600 or 2048, or higher than 2048, and some can only use a small set of widths.
* I recommend not using a framebuffer wider than the display area (except for temporary development reasons or if you don't care about compatibility issues). It's also a good idea not to use the full graphics card memory but to leave 1kB to 4kB unused (for the hardware cursor).
* Here are some height limits you should not break if you want your program to be compatible with the mentioned cards:
    * in a B_8_BIT_640x480 space:  640x1632  all 1MB cards
    * in a B_8_BIT_800x600 space:  800x1305  all 1MB cards
    * in a B_16_BIT_640x480 space: 640x1635  all 2MB cards
    * in a B_16_BIT_800x600 space: 800x1308  all 2MB cards
    * in a B_32_BIT_640x480 space: 640x1636  all 4MB cards
    * in a B_32_BIT_800x600 space: 800x1309  all 4MB cards
* Although the Be Book says that
* One of the keys to high performance: the graphics card hooks must be treated with special attention. If there is a sync function (hook number 10), all other hooks can be asynchronous. Be careful to call the sync hook when it's needed (e.g., to synchronize hardware acceleration and framebuffer access, or to finish all hardware accelerations before page-flipping or before being disconnected from the screen).
* While R3 does not support any form of multiple monitors,
future releases will. You should keep in mind that a
* In R3.x,
* About 15/16bpp: We have discovered the bugs in the R3 drivers that affected 15/16bpp WindowScreens with ViRGE and Matrox cards. There are some updated drivers available at:

Also be aware that some drivers do not support both 15bpp and 16bpp. Even worse, the old Matrox driver would use a 15bpp screen when asked for 16bpp. Update your drivers!
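The framebuffer height limits listed earlier in this article can be reproduced with a little arithmetic: take the card's memory, set aside the area left for the hardware cursor, and divide by the number of bytes in one row. The sketch below assumes a 4kB reserve, because that is the value that matches every entry in the list; the article itself only says "1kB to 4kB," so treat the constant as an inference. max_framebuffer_height() is a hypothetical helper, not part of any Be API.

```c
#include <stdint.h>

/* Memory left unused for the hardware cursor (assumed: the 4kB upper bound). */
#define RESERVED_BYTES 4096u

/*
 * Largest framebuffer height, in rows, that fits in 'card_bytes' of video
 * memory for rows of 'width' pixels at 'bytes_per_pixel' bytes each.
 * Returns 0 when the arguments make no sense.
 */
uint32_t max_framebuffer_height(uint32_t card_bytes, uint32_t width,
                                uint32_t bytes_per_pixel)
{
    uint32_t bytes_per_row = width * bytes_per_pixel;

    if (bytes_per_row == 0 || card_bytes <= RESERVED_BYTES)
        return 0;

    /* Integer division rounds down: a partial row does not count. */
    return (card_bytes - RESERVED_BYTES) / bytes_per_row;
}
```

For example, a 1MB card in an 8-bit, 640-pixel-wide space gives (1048576 - 4096) / 640 = 1632 rows, the first entry in the list.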

DEVELOPERS' WORKSHOP: Yet Another Locking Article
By Stephen Beaulieu -- hippo@be.com

"Developers' Workshop" is a weekly feature that provides
answers to our developers' questions, or topic requests.
To submit a question, visit
http://www.be.com/developers/suggestion_box.html.
It is funny, but somewhat fitting that many times the
Newsletter article you intend to write is not really the
Newsletter article you end up writing. With the best of
intentions, I chose to follow a recent trend in articles and
talk about multithreaded programming and locking down
critical sections of code and resources. The vehicle for my
discussion was to be a Multiple-Reader Single-Writer locking
class in the mode of

In the hopes of this being my first short Newsletter
article, I will leave the details of the class to the sample
code. For once it was carefully prepared ahead of time and
is reasonably commented. I will briefly point out two neat
features of the class before heading into a short discussion
of locking styles. The first function to look at is the

The stack_base method is not infallible, and needs to be
backed up by

The other functions to look at are the

I want to take a little space to discuss locking
philosophies and their trade-offs. The two opposing views
can be presented briefly as "Lock Early, Lock Often" and
"Lock Only When and Where Necessary." These philosophies sit
on opposite ends of the spectrum of ease of use and
efficiency, and both have their adherents in the company
(understanding that most engineers here fall comfortably in
the middle ground.)
The "Lock Early, Lock Often" view rests on the idea that if
you are uncertain exactly where you need to lock, it is
better to be extra sure that you lock your resources. It
advises that all locking classes should support "nested"
calls to

The main advantage of the "Lock Early, Lock Often" strategy
is its simplicity. It is very easy to add locking to your
applications: create an Autolock at the top of all your
functions and be assured that it will do its magic. The
downside of this philosophy is that the lock itself needs to
get smarter and to hold onto state information, which can
cause some inefficiencies in space and speed.
At the other end of the spectrum is the "Lock Only When and
Where Necessary." This philosophy asserts that programmers
using the "Lock Early, Lock Often" strategy do not
understand the locking requirements of their applications,
and that is essentially a bug just waiting to happen. In
addition, the overhead added to applications by locking when
it is unnecessary (say, in a function that is only called
from within another function that already holds the lock)
and by using an additional class to manage the lock makes
the application larger and less efficient. This view instead
requires programmers to really design their applications and
to fully understand the implications of the locking
mechanisms chosen.
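One common way to apply the "Lock Only When and Where Necessary" view is to split each operation into a public function that takes the lock and an internal helper that assumes the lock is already held. The sketch below shows the idea with a plain POSIX mutex rather than any Be locking class; that substitution, and all the names in it (counter_add, add_unlocked, and so on), are assumptions made for illustration.

```c
#include <pthread.h>

/* A plain mutex: no owner tracking, no nesting count. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

/* Internal helper. The caller must already hold counter_lock. */
static void add_unlocked(long n)
{
    counter += n;
}

/* Public entry point: locks exactly once around the update. */
void counter_add(long n)
{
    pthread_mutex_lock(&counter_lock);
    add_unlocked(n);
    pthread_mutex_unlock(&counter_lock);
}

/* Public entry point: reads the value under the lock. */
long counter_get(void)
{
    long value;

    pthread_mutex_lock(&counter_lock);
    value = counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}

/*
 * Adds n twice as one atomic step and returns the new value.
 * It calls the unlocked helper instead of counter_add(): calling
 * counter_add() here would try to take the lock a second time,
 * which a plain non-recursive mutex does not permit.
 */
long counter_add_twice(long n)
{
    long value;

    pthread_mutex_lock(&counter_lock);
    add_unlocked(n);
    add_unlocked(n);
    value = counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}
```

The cost of this style is the one the paragraph above describes: the programmer has to know, for every function, whether the lock is held on entry. The payoff is a lock that carries no state about its holders.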
So, which is correct? I think it often depends on the
tradeoffs you are willing to make. With locks with only a
single owner, the state information needed is very small,
and usually the lock's system for determining if a thread
holds the lock is fairly efficient (see the

The MultiLocker sample code provided leans far to the
efficiency side. The class itself allows multiple readers to
acquire the lock, but does not allow these readers to make
nested

The class does have a debug mode where state information is
kept about readers so you can be sure that you are not
performing nested

The sample code can be found at:
ftp://ftp.be.com/pub/samples/support_kit/MultiLock.zip
The class should be pretty efficient, and you are free to
use it and make adjustments as necessary. My thanks go out
to Pierre and George from the app_server team, for the
original lock on which this is based, and for their
assistance with (and insistence on) the efficiency concerns.
And, if it is, are we wrong to focus on it? Can we pace off
enough running room to launch the virtuous ascending spiral
of developers begetting users begetting developers? Is the
A/V space large enough to swing a cat and ignite a platform?
Perhaps there's another way to look at the platform
question, one that's brought to mind by the latest turn of
Apple's fortunes. Back in 1985, Apple had a bad episode: The
founders were gone, the new Mac wasn't taking off and the
establishment was dissing Apple as a toy company with a toy
computer. The advice kept pouring in: reposition the
company, refocus, go back to your roots, find a niche where
you have a distinctive advantage. One seer wanted to
position Apple as a supplier of Graphics-Based Business
Systems, another wanted to make the company the Education
Computer Company. Steve Jobs, before taking a twelve year
sabbatical, convinced Apple to buy 20% of Adobe, and thus
began the era of desktop publishing and the Gang of Four
(Apple, Adobe, Aldus and Canon).
Apple focused on publishing, and is still focused on
publishing (as evidenced by the other Steve -- Ballmer --
ardently promoting NT as *the* publishing platform). Does
that make Apple a publishing niche player? Not really. iMac
buyers are not snapping up the "beetle" Mac for publishing,
they just want a nice general-purpose computer. Although
Apple is still thrown into the publishing bin, the Mac has
always strived to be an everyday personal computer, and the
numbers show that this isn't mere delusion: For example,
Macs outsell Photoshop ten to one. But let's assume that at
the company's zenith, publishing made up as much as 25% of
Apple sales. Even then, with higher margin CPUs, Apple
couldn't live on publishing alone, hence the importance of a
more consumer-oriented product such as the iMac and hence,
not so incidentally, the importance of keeping Microsoft
Office on the platform.
The question of the viability of an A/V strategy stems from
us being thrown into the same sort of bin as our noble
hardware predecessor. But at Be we have an entirely
different business model. A hardware company such as Apple
can't survive on a million units per year. Once upon a time
it could, but those were the salad days of expensive
computers and 66% gross margins. We, on the other hand, have
a software-only business model and will do extremely well
with less than a million units per year--and so will our
developers. As a result, the virtuous spiral will ignite
(grab a cat).
More important -- and here we share Apple's
"niche-yet-general" duality -- the question may be one that
never needs to be answered: While BeOS shows its unique
abilities in A/V, we're also starting to see applications
for everyday personal computing. I'm writing this column on
Gobe Productive and e-mailing it to the prose-thetic surgeon
using Mail-It, both purchased using NetPositive and
SoftwareValet.
Copyright ©1998 Be, Inc. Be is a registered trademark, and BeOS, BeBox, BeWare, GeekPort, the Be logo and the BeOS logo are trademarks of Be, Inc. All other trademarks mentioned are the property of their respective owners.