On files and streams

Posted by waldner on 11 April 2010, 1:03 am

It's generally understood that doing

$ command < file
# or
$ command file   # for commands that support this

and

$ cat file | command

are essentially the same thing. (Yes, the second form is a UUOC, a "useless use of cat", but that is not relevant to the point being made here.)

For most practical purposes this is true. There are, however, some subtle differences, which stem from the fact that a file and a stream, although they have some similarities and are usually presented to the application through the same interface, are in fact two different kinds of object.

The most important difference is that a regular file can be lseek()ed, while a stream (a pipe, for instance) cannot. This matters because some programs can play tricks on files that they cannot play on streams. For example, tail and tac usually seek to the end of the input if it is a file, which is more efficient, but they cannot do that if the input is a stream. If we run strace on (GNU) tail, we can clearly see that it checks which kind of object it's dealing with (using fstat() on the descriptor corresponding to the input) and acts accordingly:

$ ls -l bigfile
-rw-r--r-- 1 waldner users 176221788 2009-10-17 14:21 bigfile
$ strace tail -n 1 bigfile
execve("/usr/bin/tail", ["tail", "-n", "1", "bigfile"], [/* 62 vars */]) = 0
...
fstat(3, {st_mode=S_IFREG|0644, st_size=176221788, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
lseek(3, 0, SEEK_END)                   = 176221788
lseek(3, 176218112, SEEK_SET)           = 176218112
read(3, "8 999978 999978 999978 999978 99"..., 3676) = 3676
fstat(1, {st_mode=S_IFCHR|0777, st_rdev=makedev(1, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8372bb5000
read(3, "", 0)                          = 0
close(3)                                = 0
write(1, "LAST LINE\n", 10)             = 10
...

Here, the input is a file (as shown by st_mode=S_IFREG) and tail can jump to the end straight away using lseek(). If we feed the input through a pipe, that is not possible:

$ cat bigfile | strace tail -n 1
execve("/usr/bin/tail", ["tail", "-n", "1"], [/* 62 vars */]) = 0
...
fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
read(0, "FIRST LINE\nSECOND LINE\nTHIRD LIN"..., 8192) = 8192
...[snip about 21500 reads like the above]
read(0, "8 999978 999978 999978 999978 99"..., 8192) = 3676
read(0, "", 8192)                       = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 2), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbf4c83b000
close(0)                                = 0
write(1, "LAST LINE\n", 10)             = 10

Here fstat() says that the input is S_IFIFO, so tail has no choice but to read the whole thing from start to end.
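
The same distinction can be observed from the shell itself, without strace. The following is only a minimal sketch: stdin_kind is a made-up helper name, and it assumes a system where /dev/stdin is available (or a shell such as bash that treats that path specially in test expressions).

```shell
# stdin_kind: hypothetical helper (not a standard utility) that reports
# what kind of object standard input is connected to.
stdin_kind() {
    if [ -f /dev/stdin ]; then
        echo "regular file"    # what fstat() reports as S_IFREG; seekable
    elif [ -p /dev/stdin ]; then
        echo "pipe"            # what fstat() reports as S_IFIFO; not seekable
    else
        echo "other"           # terminal, socket, device, ...
    fi
}

# Usage:
#   stdin_kind < bigfile       prints "regular file"
#   cat bigfile | stdin_kind   prints "pipe"
```

This mirrors the decision tail makes in the traces above: only in the first case is seeking to the end an option.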

Also note the huge difference in speed, due to the above behavior:

$ time tail -n 1 bigfile
LAST LINE

real    0m0.046s
user    0m0.002s
sys     0m0.001s
$ time cat bigfile | tail -n 1
LAST LINE

real    0m1.123s
user    0m0.139s
sys     0m0.553s

So in general, apart from avoiding the UUOC, using files and not streams/pipes is more efficient for those commands that benefit from the special properties of a file.
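
For anyone who wants to reproduce the comparison, here is a rough sketch. The file name comes from mktemp and the line count is arbitrary (both are assumptions for illustration, not the file used above), and it assumes seq is installed. Actual timings will vary with the machine and with what is already in the page cache.

```shell
# Build a throwaway file of numbered lines (size chosen arbitrarily).
bigfile=$(mktemp)
seq 1 2000000 > "$bigfile"

# Both forms give the same answer; only the amount of data read differs.
from_file=$(tail -n 1 "$bigfile")
from_pipe=$(cat "$bigfile" | tail -n 1)
echo "file: $from_file  pipe: $from_pipe"

# To compare timings, run these by hand before the cleanup below:
#   time tail -n 1 "$bigfile"
#   time cat "$bigfile" | tail -n 1
rm -f "$bigfile"
```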

However, it turns out that sometimes (hopefully less and less) streams have some advantages over files. Programs that lack support for large files (again, hopefully very few these days) are known to fail or behave incorrectly if they have to read files bigger than 2 gigabytes on a 32-bit system. This is due to the fact that using non-LFS system calls like open(), lseek(), or stat()/fstat() on such big files may produce overflow errors, making it effectively impossible for those programs to access those files normally.
Usually, to "persuade" these programs to operate on such files, two things are needed:

1. They should not have to call open() directly on the file: since they don't pass O_LARGEFILE, the call would fail. For commands that can read either from a named file or from standard input, this can be solved with redirection or a pipe, for example nonlfscommand < bigfile or cat bigfile | nonlfscommand.

2. These programs might then try to detect which kind of object their input is and, if they detect a regular file, attempt other operations like lseek(), which again would fail. The detection is usually done by calling stat() or fstat(), and a 32-bit stat() on a big file fails because st_size overflows. So we need a way to make stat() succeed, and in such a way that the program does not attempt further manipulation.

The second requirement rules out things like nonlfscommand < bigfile, because the program could still run fstat() on the actual file, which would fail.
So cat bigfile | nonlfscommand, however inefficient it might be, appears to be the winner here: fstat()ing stdin when it's a pipe returns S_IFIFO and an st_size of 0, so the call does not fail. Furthermore, since the program detects that its input is not a regular file, it won't attempt lseek() or other risky operations.
As said, however, this should not be a problem on any modern system, all of which should support large files. Perhaps the only problematic programs are legacy, binary-only ones that were built without large file support.
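
The pipe workaround described above can be wrapped in a small helper. This is just a sketch: via_pipe is an invented name, nonlfscommand stands for whatever legacy program is involved, and it assumes that cat itself was built with large file support.

```shell
# via_pipe FILE COMMAND [ARGS...]
# Hypothetical helper: run COMMAND with FILE's contents on stdin, always
# through a pipe, so that COMMAND never gets a descriptor for FILE itself.
via_pipe() {
    file=$1
    shift
    cat -- "$file" | "$@"
}

# Usage:
#   via_pipe bigfile nonlfscommand
```

Here the inefficiency of cat is the point: the command sees S_IFIFO on its standard input and has no way to stat or seek the underlying file.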

Filed under shell, tips Tagged file, pipeline, shell, stream, UUOC
