Given the large number of comments I got (26!), I feel obliged to post a summary of what was said.
First, the problem:
I want to create a large file (let’s say 10 GB) to use as swap space. This file can’t be a sparse file (a file with holes; see Wikipedia if you don’t know about sparse files).
Since I’m going to mkswap it, I don’t care about the data that is actually in that file after creating it. The stupid way (but the only solution on ext3) to create it is to fill it with zeroes, which is very inefficient.
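For reference, the zero-filling approach looks roughly like the sketch below. The helper name zero_fill and the 1 MiB buffer size are my own choices for illustration, nothing more; the point is simply that every byte of the file has to be written out.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create `path` and fill it with `size` zero bytes.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int zero_fill(const char *path, off_t size)
{
    static char buf[1 << 20];          /* static storage starts out zeroed */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    off_t left = size;
    while (left > 0) {
        size_t want = left < (off_t)sizeof buf ? (size_t)left : sizeof buf;
        ssize_t n = write(fd, buf, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted, just retry */
            close(fd);
            return -1;
        }
        left -= n;                     /* handles short writes too */
    }

    if (fsync(fd) != 0) {              /* make sure the blocks really hit the disk */
        close(fd);
        return -1;
    }
    return close(fd);
}
```

For a 10 GB file this means pushing 10 GB through the page cache and onto the disk, which is exactly the cost I would like to avoid.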
Yes, it will work on ext4. A convenient program which makes this easy to use can be found at http://sandeen.fedorapeople.org/utilities/fallocate.c. It was written by Eric Sandeen, a former XFS developer who now works for Red Hat and who has been a big help making sure ext4 will be ready for Fedora and Red Hat Enterprise Linux. (Well, I guess I shouldn’t call him a former XFS developer since he still contributes patches to XFS now and then, but he’s spending rather more time on ext4 these days.)
One warning about the program: it calls the fallocate system call directly, and it doesn’t quite have the right architecture-specific magic for certain architectures which have various restrictions on how arguments need to be passed to system calls. In particular, IIRC, there will be issues on the s390 and powerpc architectures. The real right answer is to get fallocate into glibc; folks with pull in getting glibc to do the right thing, please talk to me.
Glibc does have posix_fallocate(), which implements the POSIX interface. posix_fallocate() is wired to use the fallocate system call, for sufficiently modern versions of glibc.
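A minimal sketch of the posix_fallocate() route, for the record. The helper name preallocate_posix is made up for this example. One detail worth noting is that posix_fallocate() returns the error number directly instead of setting errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create (or open) `path` and reserve `size` bytes
 * starting at offset 0. Returns 0 on success or an errno value on failure. */
int preallocate_posix(const char *path, off_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return errno;

    /* posix_fallocate() reports failure through its return value,
     * not through errno. On success the file size becomes at least `size`. */
    int err = posix_fallocate(fd, 0, size);

    close(fd);
    return err;
}
```

Calling preallocate_posix("swapfile", (off_t)10 << 30) would ask for the 10 GB file from the problem statement; whether that is fast depends on the filesystem and glibc version, as discussed in the text.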
However, posix_fallocate() is problematic for some applications: on filesystems that don’t support fallocate(), posix_fallocate() will simulate it by writing all zeros to the file. This is not necessarily the right thing to do. Some applications want fallocate() for speed reasons, and if the filesystem doesn’t support it, they want to receive an error (fallocate() itself reports EOPNOTSUPP in that case) so they can try some other fallback, which might or might not involve writing all zeros to the file.
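Here is a sketch of what such an application might do. It assumes a glibc that exposes the fallocate() wrapper (glibc 2.10 or later, declared in fcntl.h under _GNU_SOURCE); the enum and the function name try_preallocate are hypothetical. fallocate(2) reports an unsupported filesystem with EOPNOTSUPP, and a kernel without the system call with ENOSYS, so the caller gets to choose the fallback instead of having zeros written on its behalf.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum prealloc_result {
    PREALLOC_OK          = 0,   /* blocks were allocated */
    PREALLOC_UNSUPPORTED = 1,   /* filesystem or kernel can't do it */
    PREALLOC_FAILED      = -1   /* real error (ENOSPC, EBADF, ...), see errno */
};

/* Try to allocate `size` bytes from offset 0 without ever falling back
 * to writing zeros. The caller decides what to do when unsupported. */
enum prealloc_result try_preallocate(int fd, off_t size)
{
    if (fallocate(fd, 0, 0, size) == 0)
        return PREALLOC_OK;

    if (errno == EOPNOTSUPP || errno == ENOSYS)
        return PREALLOC_UNSUPPORTED;

    return PREALLOC_FAILED;
}
```

On PREALLOC_UNSUPPORTED the application can write zeros itself, skip preallocation entirely, or complain to the user, whichever makes sense for it.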
The other shortcoming with posix_fallocate() is that it doesn’t support the FALLOC_FL_KEEP_SIZE flag. What this flag allows you to do is to allocate disk blocks to the file, but not to modify the i_size parameter. This allows you to allocate space for files such as log files and mail spool files so they will be contiguous on disk, but since i_size is not modified, programs that append to the file won’t get confused, and tail -f will continue to work. For example, if you know that your log files are normally approximately 10 megs a day, you can fallocate 10 megabytes, and then the log file will be contiguous on disk, and the space is guaranteed to be there (since it is already allocated). If the log file ends up slightly smaller than 10 megs, the extra blocks will be discarded when you compress the file at the end of the day, or if you like, you can explicitly trim away the excess using ftruncate().
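The log file case could be sketched like this, again assuming a glibc with the fallocate() wrapper. The function names reserve_for_log and trim_excess are invented for the example, and the trimming step simply follows the ftruncate() suggestion above by truncating the file to its current length.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Reserve `bytes` of disk space from offset 0 without changing the
 * file's apparent size (i_size). Returns 0, or -1 with errno set
 * (EOPNOTSUPP if the filesystem can't do this). */
int reserve_for_log(int fd, off_t bytes)
{
    return fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, bytes);
}

/* Give back reserved blocks past the data actually written by
 * truncating the file to its current end. Returns 0 or -1. */
int trim_excess(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);
    if (end < 0)
        return -1;
    return ftruncate(fd, end);
}
```

With the 10 megs a day example, that would be reserve_for_log(fd, 10 * 1024 * 1024) when the log is opened, and trim_excess(fd) at rotation time if you don’t want to wait for the compression step to drop the extra blocks.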
fallocate works fine: creating a 20 GB file is almost immediate. Syncing or unmounting the filesystem is also immediate, and reading the file returns only zeros. I’m not sure how it is implemented, but it looks nice :-). However, it still doesn’t solve my initial problem: mkswap works, but not swapon:
:/tmp# touch tmp
:/tmp# /root/fallocate -l 10g tmp
:/tmp# ls -lh tmp
-rw-r--r-- 1 root root 10G Mar 3 11:01 tmp
:/tmp# du tmp
10485764 tmp
:/tmp# mkswap tmp
Setting up swapspace version 1, size = 10737414 kB
no label, UUID=a316ce8e-cf33-412b-8dc0-e10d9f2ebdbb
:/tmp# strace swapon tmp
[...]
swapon("/tmp/tmp") = -1 EINVAL (Invalid argument)
write(2, "swapon: tmp: Invalid argument\n"..., 30swapon: tmp: Invalid argument
) = 30
exit_group(-1)
(swapon works fine if the file is created normally — without using fallocate()).
Any other ideas?