The question of how RAM is used on RAM-only systems arose when Tin Hat was first designed. The issue came up again and I decided to test three ways of setting up a RAM-only system to lay the question to rest:
1) The traditional initial ramdisk image, initrd. Here one puts down an ext2 filesystem onto a file via a loopback device and populates it. The filesystem is then unmounted and the file gzipped. On boot, it is used by the bootloader as the initial root filesystem. The kernel must be configured with CONFIG_BLK_DEV_INITRD=y and with CONFIG_BLK_DEV_RAM_SIZE larger than the size of the ext2 filesystem. When the system boots, "df" reports that the root filesystem is on /dev/ram0 (the ramdisk) with its size fixed to that of the ext2 filesystem you created. It cannot be resized, which fixes the division between RAM set aside for the filesystem and RAM used for processes. "free" reports used RAM as the fixed RAM set aside for the ramdisk plus the RAM used for processes.
2) The newer initial ramfs image, initramfs. Here one populates a directory, and then creates a compressed cpio archive which is expanded into ramfs upon boot and becomes the root filesystem. The kernel must be configured with CONFIG_BLK_DEV_INITRD=y, but one does not need to set CONFIG_BLK_DEV_RAM_SIZE, nor does one need to set CONFIG_TMPFS=y. When the system is up, "df" does not report the root filesystem and one cannot interact with it by doing things like "mount --bind / dir". Also, the distinction between RAM set aside for the filesystem and RAM used for processes is blurred: "free" reports total usage without distinction, i.e. used RAM = RAM used for files (as reported by "du") plus RAM used for processes.
3) Bootstrapping into tmpfs. In this case, one uses either an initrd or initramfs image to get into a small RAM-only environment that sets up a tmpfs filesystem, unpacks the new root filesystem into it from some image like a squashfs on the boot device, and finally does a switch_root to tmpfs. Two images are needed: the first is the initrd/initramfs and the second is the image which will be decompressed into tmpfs. Here the kernel must be configured for initrd/initramfs, but in addition needs CONFIG_TMPFS=y. When the system is up, "df" reports the root filesystem mounted as tmpfs at the size set by the initrd/initramfs when it was mounted, and "free" reports the total RAM usage without distinction, as for an initramfs above. Unlike #2 above, one can interact with the root filesystem. For example, one can set (or reset) a limit on how much RAM is set aside for the root filesystem by doing "mount -o remount,size=512m /" and one can do "mount --bind / dir".
Note: It is possible to configure the kernel to support an initramfs (CONFIG_BLK_DEV_INITRD=y) but not support tmpfs (CONFIG_TMPFS=n).
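The three setups can also be told apart from inside a running program, not only with "df". The following is a minimal sketch, assuming a Linux system with glibc; the helper names fs_kind and path_fs_kind are hypothetical, and the magic numbers are the ones defined in <linux/magic.h>. One would expect "/" to classify as ext2 on the ramdisk system, ramfs on the initramfs system, and tmpfs after the switch_root, but that is an expectation and not a result measured here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>   /* statfs(2), Linux-specific */

/* Filesystem magic numbers as defined in <linux/magic.h>. */
#define MAGIC_RAMFS 0x858458f6UL
#define MAGIC_TMPFS 0x01021994UL
#define MAGIC_EXT2  0x0000ef53UL   /* shared by ext2, ext3 and ext4 */

/* Map a statfs f_type value to a short name (hypothetical helper). */
const char *fs_kind(unsigned long f_type)
{
    /* Mask to 32 bits in case f_type was sign-extended on a 64-bit build. */
    switch (f_type & 0xffffffffUL) {
    case MAGIC_RAMFS: return "ramfs";
    case MAGIC_TMPFS: return "tmpfs";
    case MAGIC_EXT2:  return "ext2";
    default:          return "other";
    }
}

/* Classify the filesystem mounted at path; returns NULL if statfs fails. */
const char *path_fs_kind(const char *path)
{
    struct statfs sb;

    assert(path != NULL);
    if (statfs(path, &sb) != 0)
        return NULL;
    return fs_kind((unsigned long)sb.f_type);
}
```

Calling path_fs_kind("/") on each of the three systems gives a quick check of which kind of root filesystem is actually in use.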
I wanted to test these with respect to three aspects of memory usage by processes: 1) memory allocation on the heap, 2) memory allocation on the stack, 3) memory needed for text. I didn't expect any difference between stack and heap allocation, but I particularly wanted to test whether ramdisk and ramfs do "execution in place", i.e. whether they avoid copying the process's text from the filesystem to page memory, as is done when the filesystem is on a "real" block device, e.g. a hard drive. To check this I wrote three programs:
1) heap.c which forks into the background and then uses glibc's malloc to request memory from the heap. It sits in a tight loop until killed.
For example, one executes
~ # heap 4000
to request 4000 pages of RAM (4K each). Here's what pmap gives:
~ # pmap 317
317: heap 4000
08048000 4K r-x-- /bin/heap
08049000 4K r---- /bin/heap
0804a000 4K rw--- /bin/heap
b6e93000 16012K rw--- [ anon ] <-------- 16000K = 4000 pages
b7e36000 1232K r-x-- /lib/libc-2.8.so
b7f6a000 8K r---- /lib/libc-2.8.so
b7f6c000 4K rw--- /lib/libc-2.8.so
b7f6d000 16K rw--- [ anon ]
b7f71000 4K r-x-- [ anon ]
b7f72000 108K r-x-- /lib/ld-2.8.so
b7f8d000 4K r---- /lib/ld-2.8.so
b7f8e000 4K rw--- /lib/ld-2.8.so
bfc79000 84K rw--- [ stack ]
total 17488K
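The allocation step of such a heap test can be sketched as follows. This is not the original heap.c: the function name heap_grab_pages is hypothetical, the fork into the background and the tight loop are left out, and touching each page after malloc is an assumption about how one would make sure the pages are really backed by RAM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate npages pages with malloc and write one byte to every page so
 * that the kernel has to back the allocation with real memory.
 * Returns the buffer (the caller frees it) or NULL on failure.
 */
void *heap_grab_pages(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* Refuse requests whose byte count would overflow size_t. */
    if (npages > SIZE_MAX / (size_t)page)
        return NULL;

    size_t bytes = npages * (size_t)page;
    unsigned char *buf = malloc(bytes);
    if (buf == NULL)
        return NULL;

    for (size_t i = 0; i < bytes; i += (size_t)page)
        buf[i] = 1;   /* touching the page forces it to be faulted in */
    return buf;
}
```

A test program built on this would call heap_grab_pages(4000) in the forked child and then loop without freeing the buffer, so that the anonymous mapping stays visible to pmap.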
2) stack.c which works similarly, but the allocation is done on the stack. It links against glibc. One caveat: even though we set ulimit -s 0, it's easy to get a stack overflow. One can run multiple instances of stack to repeatedly allocate stack memory. E.g. with
~ # stack 1000
pmap gives
~ # pmap 336
336: stack 1000
08048000 4K r-x-- /bin/stack
08049000 4K r---- /bin/stack
0804a000 4K rw--- /bin/stack
b7df8000 8K rw--- [ anon ]
b7dfa000 1232K r-x-- /lib/libc-2.8.so
b7f2e000 8K r---- /lib/libc-2.8.so
b7f30000 4K rw--- /lib/libc-2.8.so
b7f31000 16K rw--- [ anon ]
b7f35000 4K r-x-- [ anon ]
b7f36000 108K r-x-- /lib/ld-2.8.so
b7f51000 4K r---- /lib/ld-2.8.so
b7f52000 4K rw--- /lib/ld-2.8.so
bf767000 4012K rw--- [ stack ] <-------- 4000K = 1000 pages
total 5412K
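The stack variant can be sketched in the same spirit. Again this is not the original stack.c: stack_touch_pages is a hypothetical name, the 4096-byte page size is assumed from the 4K pages in the pmap output, and a variable-length array is just one way to place the buffer on the stack. The request has to stay below the stack limit of the process, otherwise the overflow mentioned above is exactly what happens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096   /* assumed page size, matching the 4K pages above */

/*
 * Reserve npages pages on the stack with a variable-length array and
 * touch each page once. Returns the number of pages touched.
 */
size_t stack_touch_pages(size_t npages)
{
    /* Guard against overflow in the size computation below. */
    assert(npages < SIZE_MAX / PAGE_BYTES);

    /* The VLA lives on the stack; +1 avoids a zero-length array. */
    volatile unsigned char buf[npages * PAGE_BYTES + 1];
    size_t touched = 0;

    for (size_t i = 0; i < npages; i++) {
        buf[i * PAGE_BYTES] = 1;           /* fault the page in */
        touched += buf[i * PAGE_BYTES];    /* volatile read keeps the write */
    }
    return touched;
}
```

A long-running test would loop inside the function instead of returning, so that the buffer stays live while pmap is run against the process.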
3) mktext.c creates text.asm which is then assembled to text. This binary does NOT link against anything; rather it uses registers ebx, ecx and edx to calculate the Fibonacci sequence without allocating any memory on the stack or heap. It's just 1000 pages (i.e. 4MB) of
mov edx, ebx
add edx, ecx
mov ebx, ecx
mov ecx, edx
Crazy, no? The point is, when this is run, does the kernel copy the 4MB of text to page memory, or does it execute in place? pmap gives
~ # pmap 310
310: text
08048000 4100K r-x-- /bin/text <-------- 4000K = 1000 pages
b7fc8000 4K r-x-- [ anon ]
bfdb3000 84K rwx-- [ stack ]
total 4188K
text forks into the background when run, so we need only run a bunch of these to see what happens to our RAM usage.
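A generator along the lines of mktext.c could look like the sketch below. It is not the original program: the function names are hypothetical, the entry point, the fork and any loop around the block are omitted, and the page size of 4096 bytes is assumed. Each of the four register-to-register instructions encodes to 2 bytes in 32-bit mode, so one group is 8 bytes and 512 groups fill a page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_BYTES  4096   /* assumed page size */
#define GROUP_BYTES 8      /* four 2-byte register-to-register instructions */

/* Number of four-instruction groups needed to fill npages pages of text. */
size_t groups_for_pages(size_t npages)
{
    return npages * (PAGE_BYTES / GROUP_BYTES);
}

/*
 * Write `groups` copies of the Fibonacci step to out.
 * One step maps (ebx, ecx) to (ecx, ebx + ecx), using edx as scratch.
 * Returns 0 on success, -1 on a write error.
 */
int emit_fib_steps(FILE *out, size_t groups)
{
    assert(out != NULL);
    for (size_t i = 0; i < groups; i++) {
        if (fputs("\tmov edx, ebx\n"
                  "\tadd edx, ecx\n"
                  "\tmov ebx, ecx\n"
                  "\tmov ecx, edx\n", out) == EOF)
            return -1;
    }
    return 0;
}
```

With these helpers, emit_fib_steps(out, groups_for_pages(1000)) writes the 512000 groups that make up roughly 4MB of text once assembled.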
I have prebuilt ISOs, but if you want to rebuild them from scratch, all the necessary goodies can be found here. The README describes what scripts to run to build the ISOs. I even put the kernel images, binaries and libraries in the directory. The code to build heap, stack and text is in the "tests" directory. I grabbed busybox, pmap and the libraries off a vanilla working Gentoo system (glibc-2.8_p20080602-r1). To get busybox's configuration, just run "busybox bbconfig". To get the kernel's config, you'll find it in /proc/config.gz. I used vanilla linux-2.6.28 patched to .5.
I configured these kernels for the hardware of a VMware or qemu emulator. The ISOs expect the boot device to be a cdrom at /dev/hda. If you want to use qemu, you'll have to switch to hdc in the build-xxx.sh scripts and rebuild the ISOs. I also set CONFIG_BLK_DEV_RAM_SIZE=32768 (in K), and used 64MB of physical RAM in the emulator. Finally, the only difference between bzImage and bzImage.ne2e is that the former has ext2 execution in place support enabled, whereas the latter does not. In all other ways, they are identical kernels.
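With only 64MB of RAM in the emulator, the number to watch during the tests is used RAM, total minus free, which is the figure quoted from "free" above. A program can obtain the same totals from sysinfo(2). The sketch below assumes Linux with glibc; used_kb and ram_used_kb are hypothetical helper names, and buffers and cache are deliberately not subtracted.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/sysinfo.h>   /* sysinfo(2), Linux-specific */

/*
 * Used RAM in KB, given total and free counters that are expressed in
 * units of `unit` bytes (the mem_unit field of struct sysinfo).
 */
unsigned long long used_kb(unsigned long total, unsigned long freeram,
                           unsigned int unit)
{
    return (unsigned long long)(total - freeram) * unit / 1024;
}

/* Query the kernel; returns 0 on success and stores used RAM in *kb. */
int ram_used_kb(unsigned long long *kb)
{
    struct sysinfo si;

    assert(kb != NULL);
    if (sysinfo(&si) != 0)
        return -1;
    *kb = used_kb(si.totalram, si.freeram, si.mem_unit);
    return 0;
}
```

Sampling ram_used_kb before and after starting a batch of heap, stack or text instances gives the change in used RAM attributable to them.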
1. Ramdisk system: The ISO is project-initrd.iso. The raw results of the test can be found below. Here's a summary of points to note:
2. Initramfs system: The ISO is project-initramfs.iso. Here's a summary of points to note:
3. Initramfs->tmpfs system: The ISO is project-initramfs-tmpfs.iso. Here's a summary of points to note:
4. The control system: To compare the above results, I also looked at a "disked" version. It's a VMware image: disked.tar.bz2. Here are the results:
| Attachment | Size |
|---|---|
| initrd-heap.txt | 4.52 KB |
| initrd-stack.txt | 4.48 KB |
| initrd-text.txt | 15.58 KB |
| initramfs-heap.txt | 4.35 KB |
| initramfs-stack.txt | 4.32 KB |
| initramfs-text.txt | 15.35 KB |
| initramfs-tmpfs-heap.txt | 4.76 KB |
| initramfs-tmpfs-stack.txt | 9.9 KB |
| initramfs-tmpfs-text.txt | 15.76 KB |
| disked-heap.txt | 5.02 KB |
| disked-stack.txt | 5.04 KB |
| disked-text.txt | 15.48 KB |