This todo is primarily targeted at the Hurd proper and applications that rely on the Hurd interfaces.
- psmisc
The tools provided by the psmisc package are Linux-centric. killall and pstree, for instance, require Linux's proc file system but could just as easily use Hurd's libps.
- tmpfs
- ppp
- unionfs/stowfs
- supermount translator
Related: ?KnownHurdLimits
From Marcus, 2002:
- xkb driver for console (for international users)
- kbd leds in console (well, in general, Roland's new driver in oskit for that crap)
- fixing fakeroot (it's buggy)
- fixing tmpfs (it's buggy, Neal says it's Mach's fault)
- adding posix shared memory (requires the io_close call to be implemented)
- adding posix file locking (requires the io_close call to be implemented)
- testing
- find + various filesystems (are inode numbers for . and .. sane?)
- ext2fs with other block sizes than 4096
- --help and --version and --usage in all programs
- I have seen ^V in some --help output, might be argp bug
- Verify that all options are documented clearly, and that no unimplemented options appear
- Is the short and long description in the help output correct?
- Is the return value of all programs correct (eg, does main() return a sane value)
- Is the suid bit correctly set for all installed programs?
- Translators
- Does settrans -g work? -fg?
- Does fsysopts work? Does setting options with fsysopts work?
- Does stat() work on all translated nodes and give proper data?
- What about chown, chmod (some translators should pass this through to the underlying node, esp in /dev!)
- Does statfs give correct data?
- Are all inode numbers and link counts correct?
- We also should have a "make check" test suite. We can add this once Jeff finished his automake patches
- pick up the other things
- pthread, definitely. Now that we are so close
- new console is basically done
- needs integration of course
- X switching support
- there is certainly more to do ...
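The find item in the list above asks whether the inode numbers reported for . and .. are sane. A minimal sketch of such a check in C, assuming only POSIX stat(); the helper names same_file and dot_entries_sane are made up for this illustration and are not part of the Hurd:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Fixed buffer size; PATH_MAX is deliberately avoided, as it is not
   defined on GNU/Hurd.  */
enum { PATH_BUF_SIZE = 4096 };

/* Return 1 if both paths resolve to the same file (same device and
   inode), 0 if they differ, and -1 if either stat() call fails.  */
int
same_file (const char *a, const char *b)
{
  struct stat sa, sb;

  if (stat (a, &sa) != 0 || stat (b, &sb) != 0)
    return -1;
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Check that DIR/. is DIR itself and that DIR/.. is PARENT.
   Returns 1 if both hold, 0 if not, -1 on error.  */
int
dot_entries_sane (const char *dir, const char *parent)
{
  char dot[PATH_BUF_SIZE], dotdot[PATH_BUF_SIZE];
  int n1, n2, r1, r2;

  n1 = snprintf (dot, sizeof dot, "%s/.", dir);
  n2 = snprintf (dotdot, sizeof dotdot, "%s/..", dir);
  if (n1 < 0 || n1 >= (int) sizeof dot || n2 < 0 || n2 >= (int) sizeof dotdot)
    return -1;

  r1 = same_file (dot, dir);
  r2 = same_file (dotdot, parent);
  if (r1 < 0 || r2 < 0)
    return -1;
  return r1 && r2;
}
```

Running dot_entries_sane on directories below each translator of interest (ext2fs, tmpfs, and so on) would give a quick first answer before involving find itself.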
Wolfgang's list of easy tasks, July 28, 2002:
| Difficulty | Task |
|---|---|
| 0 | Check if all programs handle options (at least --help, --version and --usage; don't forget about the shell scripts) |
| 1 | Check if all translators handle fsysopts |
| 1 | Check if all translators respond to "settrans -g" |
| 1 | More tests of this kind |
| 2 | Fix those of the above that don't work as intended |
| 2 | Document (in doc/hurd.texi) all undocumented programs (translators as well as programs in utils/ and sutils/ and some others) |
| 1 | Find a POSIX test suite, run it on GNU/Hurd, report the results |
| 1 | Find more useful test suites to run |
| 3 | Update INSTALL-cross |
| 2 | Check if all the store classes in libstore work (we have many of them, look into the Makefile) |
| 4 | Fix those that don't work |
| 2 | Document all still undocumented store classes |
| 2 | The console is pretty new code, it told me it wants to get tested |
Where difficulty 0 means trivial and 4 means tricky; the difficulty has nothing to do with the importance.
This is a collection of resources concerning user-space device drivers.
Also see device drivers and IO systems, and driver glue code.
Issues
IRQs
Can be modeled using RPCs.
Security considerations: IRQ sharing.
Omega0 paper defines an interface.
As can be read in the Mach 3 Kernel Principles, there is an event object facility in Mach that can be used for having user-space tasks react to IRQs. However, at least in GNU Mach, that code (kern/eventcount.c) doesn't seem functional at all and isn't integrated properly in the kernel.
DMA
Security considerations.
- I/O MMU.
I/O Ports
- Security considerations.
PCI and other buses
- Security considerations: sharing.
Latency of doing RPCs
- GNU Mach is said to have a high overhead when doing RPC calls.
Plan
Examine what other systems are doing.
L4
Hurd on L4: deva, fabrica
Minix 3
Start with a simple driver and implement the needed infrastructure (see Issues above) as needed.
http://savannah.nongnu.org/projects/user-drivers/
Some (unfinished?) code written by Robert Millan in 2003: PC keyboard and parallel port drivers, using libtrivfs.
Documentation
An Architecture for Device Drivers Executing as User-Level Tasks, 1993, David B. Golub, Guy G. Sotomayor, Freeman L. Rawson, III
Performance Measurements of the Multimedia Testbed on Mach 3.0: Experience Writing Real-Time Device Drivers, Servers, and Applications, 1993, Roger B. Dannenberg, David B. Anderson, Tom Neuendorffer, Dean Rubine, Jim Zelenka
User Level IPC and Device Management in the Raven Kernel, 1993, D. Stuart Ritchie, Gerald W. Neufeld
Creating User-Mode Device Drivers with a Proxy, 1997, Galen C. Hunt
The APIC Approach to High Performance Network Interface Design: Protected DMA and Other Techniques, 1997, Zubin D. Dittia, Guru M. Parulkar, Jerome R. Cox, Jr.
The Fluke Device Driver Framework, 1999, Kevin Thomas Van Maren
Omega0: A portable interface to interrupt hardware for L4 system, 2000, Jork Löser, Michael Hohmuth
Userdev: A Framework For User Level Device Drivers In Linux, 2000, Hari Krishna Vemuri
User Mode Drivers, 2002, Bryce Nakatani
Towards Untrusted Device Drivers, 2003, Ben Leslie, Gernot Heiser
Encapsulated User-Level Device Drivers in the Mungi Operating System, 2004, Ben Leslie, Nicholas FitzRoy-Dale, Gernot Heiser
Linux Kernel Infrastructure for User-Level Device Drivers, 2004, Peter Chubb
Initial Evaluation of a User-Level Device Driver, 2004, Kevin Elphinstone, Stefan Götz
User-level Device Drivers: Achieved Performance, 2005, Ben Leslie, Peter Chubb, Nicholas FitzRoy-Dale, Stefan Götz, Charles Gray, Luke Macpherson, Daniel Potts, Yueting Shen, Kevin Elphinstone, Gernot Heiser
Virtualising PCI, 2006, Myrto Zehnder, Peter Chubb
Microdrivers: A New Architecture for Device Drivers, 2007, Vinod Ganapathy, Arini Balakrishnan, Michael M. Swift, Somesh Jha
External Projects
There must be some blocking / dead-locking problem in term:
# w | grep [t]sch
tschwing p1 192.168.10.60: Tue 8PM 0:03 2172 /bin/bash
tschwing p2 192.168.10.60: Tue 4PM 40hrs 689 emacs
tschwing p3 192.168.10.60: 8:52PM 11:37 15307 /bin/bash
tschwing p0 192.168.10.60: 6:42PM 11:47 8104 /bin/bash
tschwing p8 192.168.10.60: 8:27AM 0:02 16510 /bin/bash
Now open a new screen window, or login shell, or...
# ps -Af | tail
[...]
tschwinge 16538 676 p6 0:00.08 /bin/bash
root 16554 128 co 0:00.09 ps -Af
root 16555 128 co 0:00.01 tail
bash is started (on p6), but never makes it to the shell prompt; it doesn't
even start to execute .bash_profile / .bashrc. The next shell started, on
the next available pseudoterminal, will work without problems.
The term on p6 has already been running before:
# ps -Af | grep [t]typ6
root 6871 3 - 5:45.86 /hurd/term /dev/ptyp6 pty-master /dev/ttyp6
In this situation, w will sometimes report erroneous values for IDLE
for the process using that terminal.
Killed that term instance, and things were fine again.
All this reproducibly happens while running the GDB testsuite.
Have a freshly started shell blocking on such a term instance.
$ ps -F hurd-long -p 1766 -T -Q
PID TH# UID PPID PGrp Sess TH Vmem RSS %CPU User System Args
1766 0 3 1 1 6 131M 1.14M 0.0 0:28.85 5:40.91 /hurd/term /dev/ptyp3 pty-master /dev/ttyp3
0 0.0 0:05.76 1:08.48
1 0.0 0:00.00 0:00.01
2 0.0 0:06.40 1:11.52
3 0.0 0:05.76 1:09.89
4 0.0 0:05.42 1:06.74
5 0.0 0:05.50 1:04.25
... and after 5:45 h:
$ ps -F hurd-long -p 21987 -T -Q
PID TH# UID PPID PGrp Sess TH Vmem RSS %CPU User System Args
21987 1001 676 21987 21987 2 148M 2.03M 0.0 0:00.02 0:00.07 /bin/bash
0 0.0 0:00.02 0:00.07
1 0.0 0:00.00 0:00.00
$ ps -F hurd-long -p 1766 -T -Q
PID TH# UID PPID PGrp Sess TH Vmem RSS %CPU User System Args
1766 0 3 1 1 6 131M 1.14M 0.0 0:29.04 5:42.38 /hurd/term /dev/ptyp3 pty-master /dev/ttyp3
0 0.0 0:05.76 1:08.48
1 0.0 0:00.00 0:00.01
2 0.0 0:06.41 1:11.90
3 0.0 0:05.82 1:10.28
4 0.0 0:05.52 1:07.06
5 0.0 0:05.52 1:04.63
$ sudo gdb /hurd/term 1766
[sudo] password for tschwinge:
GNU gdb (GDB) 7.0-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /hurd/term...Reading symbols from /usr/lib/debug/hurd/term...done.
(no debugging symbols found)...done.
Attaching to program `/hurd/term', pid 1766
[New Thread 1766.1]
[New Thread 1766.2]
[New Thread 1766.3]
[New Thread 1766.4]
[New Thread 1766.5]
[New Thread 1766.6]
Reading symbols from /lib/libhurdbugaddr.so.0.3...Reading symbols from /usr/lib/debug/lib/libhurdbugaddr.so.0.3...
[System doesn't respond anymore, but no kernel crash.]
Mach interfaces do not allow for proper resource accounting when a server allocates resources on behalf of a client.
Mach can't do a good job at resource management, as it doesn't have enough information about how resources are used: which data is important and which is discardable, for example.
These issues are what Neal Walfield is working on with his new kernel, Viengoos.
Examples
The canonical TODO file from the CVS archive.
Just like in other Unix systems one can, for example, use fdisk or parted
to manage hard disks' partition tables. After doing changes to a disk's
partition table, the kernel has to be instructed to reinitialize its internal
data structures: where does a partition begin, where does it end, etc.
With fdisk and friends this is done on Linux with the BLKRRPART IOCTL,
which is used to tell the kernel to reread the disk's partition table.
parted also uses this interface on Linux, but for GNU Hurd, the corresponding
function, libparted/arch/gnu.c (gnu_disk_commit), doesn't do anything at all.
The infrastructure in GNU Mach is already there,
linux/src/drivers/block/ide.c (ide_ioctl) <BLKRRPART> and
linux/src/drivers/scsi/sd_ioctl.c (sd_ioctl) <BLKRRPART>, but the IOCTL needs
to be routed from libparted through glibc's Hurd IOCTL interface,
through Hurd's libstore, to GNU Mach.
This is not a huge project, and actually one that is suitable for someone who wants to start with hacking the system.
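For reference, the Linux side of this is small. A minimal sketch, assuming a Linux system that provides <linux/fs.h>; the function name reread_partition_table is made up here, and nothing equivalent is claimed to exist on the Hurd yet:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>           /* BLKRRPART; Linux-specific */

/* Ask the Linux kernel to re-read the partition table of the block
   device at PATH.  Returns 0 on success, -1 on failure, with errno
   left as set by open() or ioctl().  */
int
reread_partition_table (const char *path)
{
  int fd = open (path, O_RDONLY);
  int ret, saved_errno;

  if (fd < 0)
    return -1;

  ret = ioctl (fd, BLKRRPART);
  saved_errno = errno;          /* close() must not clobber the error */
  close (fd);
  errno = saved_errno;

  return ret < 0 ? -1 : 0;
}
```

On the Hurd, the open question described above is how a request like this travels from the caller through glibc and libstore down to the BLKRRPART handlers that GNU Mach already has.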
Even though we don't need an /etc/fstab for mounting filesystems
(passive translators to the rescue; they have problems of their
own, see the critique), we still need this file for fsck -a
and swapon -a to function.
Given an a.out executable that only does raise (SIGABRT), invoking that
one...
... against crash-dump-core will...
... not overwrite existing core files.
Is this reasonable? Linux does overwrite them, for example.
... show big variances in running-time behavior:
$ TIMEFORMAT='real %R user %U system %S'
$ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
Aborted (core dumped)
real 1.350 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 21:59 core
$ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
Aborted (core dumped)
real 22.771 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 21:59 core
$ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
Aborted (core dumped)
real 1.367 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:00 core
$ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
Aborted (core dumped)
real 5.789 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:00 core
$ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
Aborted (core dumped)
real 22.664 user 0.010 system 0.000
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:01 core
... produce a huge core file:
$ du -hs core
17M core
On Linux, the core file occupies 76 KiB of disk space, which seems much more reasonable.
... produce an invalid core file:
$ gdb a.out core
warning: core file may not match specified executable file.
[New Thread 76651]
warning: Wrong size fpregset in core file.
Reading symbols from /lib/libc.so.0.3...[...]
Core was generated by `./a.out'.
Program terminated with signal 6, Aborted.
warning: Wrong size fpregset in core file.
#0 0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
Cannot access memory at address 0x17
Probably the crash server code and GDB are out of sync.
... against crash-suspend will...
... not work at all:
$ CRASHSERVER=/servers/crash-suspend ./a.out
$ [returns to the shell and doesn't suspend]
... show big variances in running-time behavior:
$ TIMEFORMAT='real %R user %U system %S'
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 1.381 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 1.332 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 21.228 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:04 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 1.323 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:05 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 22.279 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:05 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 1.362 user 0.000 system 0.000
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 21.110 user 0.000 system 0.000
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
$ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted (core dumped)
real 1.350 user 0.000 system 0.020
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
... can reliably crash GNU Mach:
This happens if a core file is already present (and won't get overwritten; see above). I reproduced this three times.
$ TIMEFORMAT='real %R user %U system %S'
$ time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
Aborted
real 2.856 user 0.000 system 0.010
-rw------- 1 tschwinge tschwinge 17031168 Jul 7 22:08 core
panic: zalloc: zone kalloc.8192 exhausted
Kernel Breakpoint trap, eip 0x20020a77
Stopped at 0x20020a76: int $3
db> trace
0x20020a76(2006aba8,4d0f7e9c,200209b0,0,0)
0x20020a4d(2006b094,2006ae40,2000,20016803,4a5f4114)
0x2002bca5(49a03564,1,0,9,1000)
0x20022f4c(2000,4a5f45d4,4a84879c,49a46564,4ac43e78)
0x20021e65(4ac43e78,4a5f45d4,4a5f4114,0,0)
0x2005309d(2106ba9c,3,38,28,1783)
Bad frame pointer: 0x2106ba78
$ addr2line -i -f -e /boot/gnumach-xen 0x20020a76 0x20020a4d 0x2002bca5 0x20022f4c 0x20021e65 0x2005309d
Debugger
/home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:105
panic
/home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:148
zalloc
/home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/zalloc.c:470
kalloc
/home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/kalloc.c:185
ipc_kobject_server
/home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/ipc_kobject.c:76
mach_msg_trap
/home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/ipc/mach_msg.c:1367
This is a collection of resources concerning Berkeley Packet Filters.
Documentation
Wikipedia: Berkeley Packet Filter
The Packet Filter: An Efficient Mechanism for User-level Network Code, 1987, Jeffrey C. Mogul, Richard F. Rashid, Michael J. Accetta
The BSD Packet Filter: A New Architecture for User-level Packet Capture, 1992, Steven McCanne, Van Jacobson
Protocol Service Decomposition for High-Performance Networking, 1993, Chris Maeda, Brian N. Bershad
Efficient Packet Demultiplexing for Multiple Endpoints and Large Messages, 1994, Masanobu Yuhara, Brian N. Bershad, Chris Maeda, J. Eliot B. Moss
... and many more
Implementation
?HurdFr
Git repository: http://rcs-git.duckcorp.org/hurdfr/bpf.git/
The patch for GNU Mach is expected to be complete and functional, the translator less so -- amongst others, there are unresolved issues concerning support of ?IOCTLs.
GNU Savannah bug #25054 -- Kernel panic with eth-multiplexer
GNU Savannah patch #6619 -- pfinet uses the virtual interface
GNU Savannah patch #6620 -- pfinet changes its filter rules with its IP address
GNU Savannah patch #6621 -- pfinet sets the mach device into the promiscuous mode
GNU Savannah patch #6622 -- pfinet uses the BPF filter
GNU Savannah patch #6851 -- fix a bug in BPF
This is a collection of resources concerning device drivers and I/O systems in general.
Also see user-space device drivers, and driver glue code.
Documentation
An I/O System for Mach 3.0, 1991, Alessandro Forin, David Golub, Brian Bershad
Linux Device Driver Emulation in Mach, 1996, Shantanu Goel, Dan Duchamp
Eliminating receive livelock in an interrupt-driven kernel, 1997, Jeffrey C. Mogul, K. K. Ramakrishnan
IO-Lite: A Unified I/O Buffering and Caching System, 1997, Vivek S. Pai, Peter Druschel, Willy Zwaenepoel
The Flux OSKit: A substrate for kernel and language research, 1997, Bryan Ford, Godmar Back, Greg Benson, Jay Lepreau, Albert Lin, Olin Shivers
Reuse Linux Device Drivers in Embedded Systems, 1998, Chi-wei Yang, Paul C. H. Lee, Ruei-Chuan Chang
THINK: A Software Framework for Component-based Operating System Kernels, 2002, Jean-Philippe Fassino, Jean-Bernard Stefani, Julia Lawall, Gilles Muller
An I/O Architecture for Microkernel-Based Operating Systems, 2003, Hermann Haertig, Jork Löser, Frank Mehnert, Lars Reuther, Martin Pohlack, Alexander Warg
High-Speed I/O: The Operating System as a Signalling Mechanism, 2003, Matthew Burnside, Angelos D. Keromytis
Unmodified device driver reuse and improved system dependability via virtual machines, 2004, Joshua Levasseur, Volkmar Uhlig, Jan Stoess, Stefan Götz
External Projects
Project UDI, a multi-company effort to define a Uniform Driver Interface
- Unofficial OSKit source on Savannah
Mach-like
It might be possible to integrate these systems' device drivers, as they're expected to mostly be using the same interfaces as the current in-kernel Mach drivers are.
OSF Mach
Darwin
GNU Emacs mostly works; however, there are a few issues.
- dired on a directory hangs. (Use C-g C-g to break the unresponsive operation.)
- Configuration in src/s/:
  - gnu.h uses bsd-common.h. gnu-kfreebsd.h uses gnu-linux.h -- we probably should too.
  - gnu-linux.h makes a few things depend on /proc (also see HAVE_PROCFS) -- either resort to our own ways, or enhance our procfs accordingly.
- sysdep.c
- Got a hang when compiling GNU Emacs 23, when it was compiling .el to .elc files. Looked like busy-looping inside glibc. This was not reproducible so far.
- Debian emacs23_23.1+1-2, grubber, (probably) busy-looping in ext2fs on /media/data when resuming emacs23 build in ~/tmp/emacs/emacs23-*/ (dpkg-buildpackage -B -uc -nc 2>&1 | tee L). No modifications to emacs23-* so far, I think. Hangs always in the same place, it seems, and reproducible. Tarred to emacs23-23.1+1.tar.bz2 (beware: empty and zero-permission files: emacs23-23.1+1/.pc/debian-site-init-el.diff/lisp/site-init.el, emacs23-23.1+1/.pc/autofiles.diff/src/config.in~). At hang-time: the rootfs is fine (syncfs -c -s / works; syncfs involving /media/data hangs). Plan: GDB on that ext2fs, and see what's hanging / locked.
The canonical tasks file from the CVS archive.
syncfs is a tiny wrapper around the file_syncfs RPC.
Its functionality should be merged into GNU coreutils' sync program, see
GNU Savannah task #6614.
To run tmpfs as a regular user, /servers/default-pager must be executable by that user. By default it seems to be set to read/write.
$ sudo chmod ugo+x /servers/default-pager
Then I get this error:
tmpfs: /build/mbanck/hurd-20060825/build-tree/hurd/tmpfs/dir.c:62: diskfs_get_directs: Assertion `__builtin_offsetof (struct tmpfs_dirent, name) >= __builtin_offsetof (struct dirent, d_name)' failed.
I rearranged struct tmpfs_dirent in tmpfs.h to line up with struct dirent. Now the assert at line 62 of dir.c passes.
struct tmpfs_dirent
{
  struct tmpfs_dirent *next;
  struct disknode *dn;
+ char padding[3];
  uint8_t namelen;
  char name[0];
};
Now ls works on an empty directory. You can touch files and run `ls' on them. mkdir and rmdir work too. fsysopts works. df works.
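The effect of the padding added to struct tmpfs_dirent above can be checked with offsetof. The sketch below uses two stand-in structs (the names dirent_plain and dirent_padded are made up; only the member layout is taken from the note above) and a C99 flexible array member in place of name[0]:

```c
#include <stddef.h>
#include <stdint.h>

struct disknode;                /* opaque here; only pointers are used */

/* Layout before the change.  */
struct dirent_plain
{
  struct dirent_plain *next;
  struct disknode *dn;
  uint8_t namelen;
  char name[];
};

/* Layout with the three padding bytes added.  */
struct dirent_padded
{
  struct dirent_padded *next;
  struct disknode *dn;
  char padding[3];
  uint8_t namelen;
  char name[];
};

/* Offset of the name member in the original layout.  */
size_t
plain_name_offset (void)
{
  return offsetof (struct dirent_plain, name);
}

/* Offset of the name member once the padding is in place.  */
size_t
padded_name_offset (void)
{
  return offsetof (struct dirent_padded, name);
}
```

With 4-byte pointers this moves name from offset 9 to offset 12. Whether that satisfies the assertion depends on where d_name sits in the libc's struct dirent, so comparing against offsetof (struct dirent, d_name) on the target system is the real test.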
creating a symlink fails.
old patch to get symlinks working:
http://www.mail-archive.com/bug-hurd@gnu.org/msg11844.html
--- node.c.orig 2005-07-24 09:56:39.000000000 -0400
+++ node.c 2005-07-24 09:55:46.000000000 -0400
@@ -330,6 +330,7 @@
 create_symlink_hook (struct node *np, const char *target)
 {
   assert (np->dn->u.lnk == 0);
+  np->dn_stat.st_size = strlen (target);
   if (np->dn_stat.st_size > 0)
     {
       const size_t size = np->dn_stat.st_size + 1;
@@ -337,6 +338,7 @@
       if (np->dn->u.lnk == 0)
         return ENOSPC;
       memcpy (np->dn->u.lnk, target, size);
+      np->dn->type = DT_LNK;
       adjust_used (size);
       recompute_blocks (np);
     }
@@ -380,8 +382,6 @@
 error_t
 diskfs_truncate (struct node *np, off_t size)
 {
-  if (np->allocsize <= size)
-    return 0;

   if (np->dn->type == DT_LNK)
     {
@@ -392,6 +392,9 @@
       return 0;
     }

+  if (np->allocsize <= size)
+    return 0;
+
   assert (np->dn->type == DT_REG);

   if (default_pager == MACH_PORT_NULL)
now symlinks work.
can't write data to a file
miscellaneous notes:
diskfs_disk_name could be NULL, but it is "swap"
using default_pager_object_set_size (np->dn->u.reg.memobj, size); to truncate and grow.
why are our blocks 512? shouldn't it be something else? or at least settable? or does ?libdiskfs demand this?
diskfs_get_filemap_pager_struct (struct node *np) returns null.
shouldn't it return default_pager?
<antrik> hde: what's the status on tmpfs?
<hde> Broke
<hde> k0ro traced the errors like the assert show above to a pager problem.
See the pager cannot handle request from multiple ports and tmpfs sends
request using two differ ports, so to fix it the pager needs to be hacked
to support multiple requests.
<hde> You can enable debugging in the pager by changing a line from dprintf
to ddprintf I can tell you how if you want.
<antrik> and changing tmpfs to use a single port isn't possible?...
<hde> antrik, I am not sure.
<hde> IIRC k0ro was saying it cannot be changed and I cannot recall his
reasons why.
<sdschulze> antrik: Doing it the quick&dirty way, I'd just use an N-ary
tree for representing the directory structure and mmap one new page (or
more) for each file.
<hde> sdschulze, What are you talking about?
<sdschulze> hde: about how I would implement tmpfs
<hde> O
<azeem> sdschulze: you don't need to reimplement it, just fix it :)
<sdschulze> azeem: Well, it seems a bit more difficult than I considered.
<sdschulze> I had assumed it was implemented the way I described.
<hde> O and the assert above gets triggered if you don't have a
default-pager setup on /servers/default-pager
<hde> the dir.c:62 assert that is.
<azeem> hde: you sure? I think I have one
<hde> I am almost sure.
<azeem> mbanck@beethoven:~$ showtrans /servers/default-pager
<azeem> /hurd/proxy-defpager
<azeem> isn't that enough?
<hde> It is suppose to be.
<hde> Try it as root
<hde> I was experiecing alot of bugs as a normal user, but according to
marcus it is suppose to work as root, but I was getting alot of hangs.
<azeem> hde: same issue, sudo doesn't work
<hde> sucky, well then there are alot of bugs. =)
<azeem> eh, no
<azeem> I still get the dir.c assert
<sdschulze> me too
<sdschulze> Without it, I already get an error message trying to set tmpfs
as an active translator.
<hde> I think I found the colprit.
<hde> default_pager_object_set_size --> This is were tmpfs is hanging.
<hde> mmm Hangs on the message to the default-pager.
<hde> Well it looks like tmpfs is sending a message to the default-pager,
the default-pager then receives the message and, checks the seqno. I
checked the mig gen code and noticed that the seqno is the reply port, it
this does not check out then the default pager is put into a what it
seems infinte condition_wait hoping to get the correct seqno.
<hde> Now I am figuring out how to fix it, and debugging some more.
<marco_g> hde: Still working on tmpfs?
<hde> Yea
<marco_g> Did you fix a lot already?
<hde> No, just trying to narrow down the reason why we cannot write file
greater then 4.5K.
<marco_g> ahh
<marco_g> What did you figure out so far?
<hde> I used the quick marcus fix for the reading assert.
<marco_g> reading assert?
<hde> Yea you know ls asserted.
<marco_g> oh? :)
<hde> Because, the offsets changed in sturct dirent in libc.
<hde> They added 64 bit checks.
<hde> So marcus suggested a while ago on bug-hurd to just add some padding
arrays to the struct tmpfs_dirent.
<hde> And low and behold it works.
<marco_g> Oh, that fix.
<hde> Yup
<hde> marco_g, I have figured out that tmpfs sends a message to the
default-pager, the default-pager does receive the message, but then
checks the seqno(The reply port) and if it is not the same as the
default-pagers structure->seqno then she waits hoping to get the correct
one. Unfortantly it puts the pager into a infinite lock and never come
out of it.
<marco_g> hde: That sucks...
<marco_g> But at least you know what the problem is.
<hde> marco_g, Yea, now I am figuring out how to fix it.
<hde> Which requires more debugging lol.
<hde> There is also another bug, default_pager_object_set_size in
<hde> mach-defpager does never return when called and makes tmpfs hang. I
<hde> will have a closer look at this later this week.
<hde> Cool, now that I have two pagers running, hopefully I will have less
system crashes.
<marcus> running more than one pager sounds like trouble to me, but maybe
hde means something different than I think
<hde> Well the other pager is only for tmpfs to use.
<hde> So I can debug the pager without messing with the entire system.
<hde> marcus, I am trying ti figure out why diskfs_object_set_size waits
forever. This way when the pager becomes locked forever I can turn it
off and restart it. When I was doing this with only one mach-defpager
running the system would crash.
<marcus> hde: how were you able to start two default pagers??
<hde> Well you most likely will not think my way of doing it was correct,
and I am also not sure if it is lol. I made my hacked version not stop
working if one is alreay started.
<hde> See, the default-pager has a function called
default_pager_object_set_size this sets the size for a memory object,
well it checks the seqno for each object if it is wrong it goes into a
condition_wait, and waits for another thread to give it a correct seqno,
well this never happens.
<hde> Thus, you get a hung tmpfs and default-pager.
<hde> pager_memcpy (pager=0x0, memobj=33, offset=4096, other=0x20740,
size=0x129df54, prot=3) at pager-memcpy.c:43
<hde> bddebian, See the problem?
<bddebian> pager=0x0?
<hde> Yup
<hde> Now wtf is the deal, I must debug.
<hde> -- Function: struct pager * diskfs_get_filemap_pager_struct
<hde> (struct node *NP)
<hde> Return a `struct pager *' that refers to the pager returned by
<hde> diskfs_get_filemap for locked node NP, suitable for use as an
<hde> argument to `pager_memcpy'.
<hde> That is failing.
<hde> If it is not one thing it is another.
<bddebian> All of Mach fails ;-)
<hde> It is alot of work to make a test program that uses libdiskfs.
<bing> to run tmpfs as a regular user, /servers/default-pager must be
executable by that user. by default it seems to be set to read/write.
<bing> $ sudo chmod ugo+x /servers/default-pager
<bing> you can see the O_EXEC in tmpfs.c
<bing> maybe this is just a debian packaging problem
<bing> it's probably a fix to native-install i'd guess
<bing> tmpfs is failing on default_pager_object_create with -308, which
means server died
<bing> i'm running it as a regular user, so it gets it's pager from
/servers/default-pager
<bing> and showtrans /servers/default-pager shows /hurd/proxy-defpager
<bing> so i'm guessing that's the server that died
<bing> this is about /hurd/tmpfs
<bing> a filesystem in memory
<bing> such that each file is it's own memory object
<andar> what does that mean exactly? it differs from a "ramdisk"?
<bing> instead of the whole fs being a memory object
<andar> it only allocates memory as needed?
<bing> each file is it's own
<bing> andar: yeah
<bing> it's not ext2 or anything
<andar> yea
<bing> it's tmpfs :-)
<bing> first off, echo "this" > that
<bing> fails
<bing> with a hang
<bing> on default_pager_object_create
<andar> so writing to the memory object fails
<bing> well, it's on the create
<andar> ah
<bing> and it returns -308
<bing> which is server died
<bing> in mig-speak
<bing> but if i run it as root
<bing> things behave differently
<bing> it gets passed the create
<bing> but then i don't know what
<bing> i want to make it work for the regular user
<bing> it doesn't work as root either, it hangs elsewhere
<andar> but it at least creates the memory object
<bing> that's the braindump
<bing> but it's great for symlinks!
<andar> do you know if it creates it?
<bing> i could do stowfs in it
<antrik> bing: k0ro (I think) analized the tmpfs problem some two years ago
or so, remember?...
<antrik> it turns out that it broke due to some change in other stuff
(glibc I think)
<antrik> problem was something like getting RPCs to same port from two
different sources or so
<antrik> and the fix to that is non-trivial
<antrik> I don't remember in what situations it broke exactly, maybe when
writing larger files?
<bing> antrik: yeah i never understood the explanation
<bing> antrik: right now it doesn't write any files
<bing> the change in glibc was to struct dirent
<antrik> seems something more broke in the meantime :-(
<antrik> ah, right... but I the main problem was some other change
<antrik> (or maybe it never really worked, not sure anymore)
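The hang hde describes in the log above comes from a sequence-number wait: a request is held back until the object's expected sequence number matches the one carried by the message. A minimal sketch of that pattern, using POSIX threads rather than the condition_wait call named in the log; struct seq_object, seqno_wait and seqno_release are illustrative names, not the default pager's actual code:

```c
#include <pthread.h>

struct seq_object
{
  pthread_mutex_t lock;
  pthread_cond_t changed;
  unsigned int seqno;           /* next sequence number to be served */
};

#define SEQ_OBJECT_INITIALIZER \
  { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Block until it is SEQNO's turn.  Returns with OBJ->lock held.  If no
   other thread ever advances OBJ->seqno to SEQNO, this waits forever,
   which is the kind of hang described in the log.  */
void
seqno_wait (struct seq_object *obj, unsigned int seqno)
{
  pthread_mutex_lock (&obj->lock);
  while (obj->seqno != seqno)
    pthread_cond_wait (&obj->changed, &obj->lock);
}

/* Finish the current request: advance the sequence number, wake all
   waiters so the next one in line can proceed, and drop the lock.  */
void
seqno_release (struct seq_object *obj)
{
  obj->seqno++;
  pthread_cond_broadcast (&obj->changed);
  pthread_mutex_unlock (&obj->lock);
}
```

In this scheme every request must pass through seqno_release exactly once. A request that arrives with a number the object never reaches, for example because it was counted against a different port, leaves its thread parked in seqno_wait.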
Copying baseGNU to the virtual disk works. Even booting got through but when I try to run native-install it never gets to the very end. First time it froze on sed package, the other time on sysv-rc.
How much memory did you configure for the QEMU system? It may simply be -- I've seen this myself -- that the system runs out of memory, as at the native-install stage (I think at least) swap is not yet configured and enabled. What I've been doing is: boot (with -s), MAKEDEV hdWHATEVER in /dev/ for the swap device, run /hurd/mach-defpager, followed by swapon /dev/hdWHATEVER. Does this help?
Thank You very much, more memory solved the freezing.
$ settrans --create --active ramdisk0 /hurd/storeio -T copy zero:32M
$ mkfs.ext2 -F -b 4096 ramdisk0
[...]
$ settrans --active --orphan ramdisk0 /hurd/ext2fs.static ramdisk0
$ df -h ramdisk0/
df: Warning: cannot read table of mounted file systems
Filesystem Size Used Avail Use% Mounted on
- 32M 1.1M 30M 4% /media/data/home/tschwinge/ramdisk0
This uses settrans and ?storeio to create a ramdisk of 32 MiB by routing
a zero store of that size through the copy store and connecting
that to the ramdisk0 node. It then creates an ext2 filesystem on it and replaces the
translator running on the ramdisk0 node with an instance of the ext2fs
translator running on the same node (?translator
stacking).
It is an open Hurd issue why this only works with
ext2fs.static, but not the dynamically linked ext2fs (settrans:
/hurd/ext2fs: Translator died).
A (better) alternative would be using the tmpfs
translator, but that one is broken at the moment.
The GNU Hurd is under active development. Because of that, there is no stable version. We distribute the Hurd sources only through CVS at present.
Although it is possible to bootstrap the GNU/Hurd system from the sources by cross-compiling and installing the system software and the basic applications, this is a difficult process. It is not recommended that you do this. Instead, you should get a binary distribution of the GNU/Hurd, which comes with all the GNU software precompiled and an installation routine which is easy to use.
The Debian project has committed to providing such a binary distribution. Debian GNU/Hurd is currently under development and available in the unstable branch of the Debian archive.
Introduction
- What Is the GNU Hurd - A Brief Description
- Advantages
- History
- Logo
- Status
- ?KnownHurdLimits
- ?Translation - Localized sites about the Hurd
- Donate
- ?SeenHurd - Media references
- ?Shopping - Hurd Gear
- ?FunnyHurd - From a different Herd
- FAQ
Understanding
- Introductory Material
- Architecture
- Towards a New Strategy of OS Design by Thomas Bushnell, BSG.
- Critique - Analysis
- Hurd Hacking Guide
- Concepts
Using
- Running
- Distrib -- Distributions
- Public Hurd Boxen
- Neighborhurds and Subhurds
Common Problems
- Console
- ?Xfree86 -- ?DebianX -- ?DebianXorg
- ?GNUstep
- ?XattrHurd: Setting translators under GNU/Linux
- ?SerialConsole: Setting up a serial console.
Contributing
Open Issues
Developer References
- Rules
- Trackers
- Toolchain
- RPC Interfaces
- Libraries
- libpager
- libstore
- libchannel
- libhello example -- Hurd library example
- libnetfs -- short introductory material
- IO Path
- Porting
- Debugging
- Hurd Sourcecode Reference: Searchable and browsable index of the code.
- Networking
Sometimes it may already be helpful to capture a translator's stdout and
stderr, for example in this situation where pfinet was
silently dying all the time, without any console output:
$ sudo settrans -fgap /servers/socket/2 /bin/sh -c '/hurd/pfinet -i eth0 -a [...] > /tmp/stdout 2> /tmp/stderr'
$ [...]
$ cat /tmp/stdout
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
TCP: Hash tables configured (ehash 65536 bhash 65536)
$ cat /tmp/stderr
pfinet: ../../hurd.work/pfinet/ethernet.c:196: ethernet_xmit: Unexpected error: (os/device) invalid IO size.
(Trying to run GDB in this case was of no help -- due to a bug in GDB (supposedly) it wouldn't catch the fault.)
Be aware that both stdout and stderr will be block buffered and no
longer line buffered when doing such a redirection, so you'll have to arrange
for appropriate fflushes on these, or force them to be line buffered again
using the appropriate glibc magic (setvbuf). Otherwise you'll see text in
the output files only when glibc herself decides to flush (after some KiB
of text) or after the translator exits.
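Both remedies mentioned above are short. A minimal sketch using only standard C stdio; the helper names make_line_buffered and log_line are made up for this example:

```c
#include <stdio.h>

/* Switch STREAM to line buffering, so each completed line is written out
   as soon as its newline is seen.  Call this before the first output on
   STREAM.  Returns 0 on success, nonzero on failure.  */
int
make_line_buffered (FILE *stream)
{
  return setvbuf (stream, NULL, _IOLBF, BUFSIZ);
}

/* Alternative: leave the buffering mode alone, but flush explicitly after
   each message that matters.  Returns 0 on success, -1 on failure.  */
int
log_line (FILE *stream, const char *msg)
{
  if (fputs (msg, stream) == EOF || fputc ('\n', stream) == EOF)
    return -1;
  return fflush (stream) == 0 ? 0 : -1;
}
```

In a translator, make_line_buffered (stdout) would go at the very start of main, before anything is printed; log_line is the option when changing the buffering mode of the stream is not wanted.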
It is an open Hurd issue to decide / implement / fix that (all?) running (passive?) translators' output should show up on the console / syslog.
