Shared filesystems
From SOFTICE
The problem
Each UML instance requires a base filesystem to operate. In addition, each student needs some space in which to work on assignments. One gigabyte of hard drive space per instance is a reasonable minimum.
Much of this space contains redundant data: every instance contains the same libc, the same applications, and the same utilities as every other instance. This duplication wastes space. It demands an excessive level of hardware support (a commodity server may not be able to support more than forty or fifty instances), and it makes process migration between physical hosts much more difficult, because the entire filesystem must be transferred from one host to another.
A better approach would be to store most redundant data in a centralized location and share it between instances.
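To make the potential saving concrete, the following is a rough back-of-the-envelope sketch. The instance count and the one-gigabyte figure come from the text above; the split between shared and per-instance data (900 MiB and 124 MiB) is purely an assumption chosen for illustration, and the function names are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def private_storage(instances, per_instance_bytes):
    """Total disk space when every instance owns a full copy of its filesystem."""
    return instances * per_instance_bytes

def shared_storage(instances, shared_bytes, private_bytes):
    """Total disk space when one shared copy serves all instances."""
    return shared_bytes + instances * private_bytes

# Assumed figures: 50 instances at 1 GiB each. In the shared case, 900 MiB
# of that is treated as common data and 124 MiB as per-instance data.
full = private_storage(50, 1 * GIB)
shared = shared_storage(50, 900 * MIB, 124 * MIB)

print(f"private copies: {full / GIB:.1f} GiB")
print(f"shared base:    {shared / GIB:.1f} GiB")
```

The real ratio depends entirely on how much of each filesystem is actually identical across instances, which would have to be measured.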
Possible solutions
Copy-on-write overlay files
Copy-on-write (COW) overlay files provide a separate location in which to store changes to a filesystem: the COW file records all changes while the base filesystem remains unchanged (and can be kept read-only). This allows the base filesystem to be shared between multiple instances even though each instance requires read-write access to its root filesystem. Although COW files appear to file utilities (ls, etc.) to be the same size as the base filesystem, they are "sparse" files and occupy much less disk space.
The primary problem with COW files is this: they require the base filesystem to be absolutely stable. Each COW file stores only the blocks that differ from the base filesystem as it was when the COW file was created, so if the base filesystem is modified, every COW file built on it becomes invalid.
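The difference between the apparent size of a sparse file and the space it actually occupies can be observed directly. The sketch below is only an illustration of sparseness in general: it creates an ordinary empty sparse file (the name demo.cow is hypothetical, and the file is not a real UML COW file) and assumes a Linux host whose filesystem supports sparse files.

```python
import os
import tempfile

def apparent_and_allocated(path):
    """Return (apparent size, bytes actually allocated) for a file."""
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units.
    return st.st_size, st.st_blocks * 512

def make_sparse_file(path, size):
    """Create a file whose apparent size is `size` without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)

with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.cow")  # hypothetical name
    make_sparse_file(demo, 1024 ** 3)     # 1 GiB apparent size
    apparent, allocated = apparent_and_allocated(demo)
    print(f"apparent:  {apparent} bytes")
    print(f"allocated: {allocated} bytes")
```

Tools that report apparent size show the first number; tools that report disk usage show the second.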
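One way a management script could guard against this is to record a fingerprint of the base image when a COW file is created and to compare it before each boot. The sketch below assumes that size and modification time are an adequate fingerprint; the helper names are hypothetical, and this is not a description of UML's own checking.

```python
import os
import tempfile

def fingerprint(path):
    """Return (size, mtime in nanoseconds) of the base image."""
    st = os.stat(path)
    return st.st_size, st.st_mtime_ns

def cow_still_valid(base_path, recorded):
    """True if the base image still matches the fingerprint taken at COW creation."""
    return fingerprint(base_path) == recorded

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "root_fs")  # stand-in for a base image
    with open(base, "wb") as f:
        f.write(b"\0" * 4096)

    recorded = fingerprint(base)         # taken when the COW file is created
    valid_before = cow_still_valid(base, recorded)

    with open(base, "ab") as f:          # simulate a modification of the base
        f.write(b"\0" * 512)
    valid_after = cow_still_valid(base, recorded)

    print(valid_before, valid_after)
```

A content hash would be a stricter check, at the cost of reading the whole image each time.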
UnionFS
The UnionFS project works at a higher level than the COW file approach, operating on files and directories rather than on disk blocks; it can combine two filesystems of disparate type and size into a single virtual filesystem.
UnionFS is still under development and may not yet have the quality and stability required for this project. It is also unknown whether UnionFS will work properly as the root filesystem of a UML instance.
Multiple filesystems
The root filesystem of a UML instance must be mounted read/write; it cannot be read-only without the use of technology such as COW files. However, this applies only to the root filesystem. If it is possible to break large amounts of information out of the root filesystem (the entire /usr hierarchy, for instance) and store it in a centralized location (such as an NFS server), the size of the files that must be moved between hosts can be dramatically reduced.
As the virtual filesystem stands now (1 June 2006), the vast majority of the used disk space is occupied by the uncompressed Linux kernel source, which properly belongs underneath /usr/src. Even without this source, /usr is 150 MiB; with it, /usr climbs to over 500 MiB. If all assignments are completed through the use of loadable kernel modules (LKMs), this source need not change and can be stored on a read-only NFS mount.
A possible arrangement is this: /home and /usr are mounted through NFS. A copy of the root filesystem image is kept on each node in the cluster (to reduce network traffic) and COW files are transferred as needed as processes migrate.
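The arrangement above might be expressed as follows. This is a minimal sketch, not a tested configuration: the kernel binary name, image paths, NFS server name, and export paths are all hypothetical, and the helper functions exist only for illustration. The ubd0=<COW file>,<backing file> form is the UML syntax for a block device backed by a COW file.

```python
def uml_command(cow_path, base_image, kernel="./linux"):
    """Build a UML command line with a per-instance COW file over a shared base image."""
    # The COW file is named first, then the shared backing image.
    return [kernel, f"ubd0={cow_path},{base_image}"]

def guest_fstab_lines(nfs_server):
    """Build fstab entries, for use inside the instance, that mount /usr and /home over NFS."""
    return [
        f"{nfs_server}:/export/usr /usr nfs ro 0 0",    # shared, read-only
        f"{nfs_server}:/export/home /home nfs rw 0 0",  # student work, read-write
    ]

# Hypothetical paths and host name.
cmd = uml_command("/var/uml/student1.cow", "/var/uml/root_fs")
fstab = guest_fstab_lines("nfs.example.org")

print(" ".join(cmd))
print("\n".join(fstab))
```

Under this layout only the COW file has to follow an instance when it migrates, since the base image is already present on every node and /usr and /home live on the NFS server.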

