Why doesn’t “sudo echo value > target” work, and why pipe into “tee”?

Quick one here for beginners, since I’ve been asked this question quite a bit lately.

Problem and reason for it

When you type “sudo echo value > /sys/target” into a POSIX shell, the interpreter parses the command and comes up with something like this:

Program: sudo
Parameters: echo value
Input: STDIN
Output: /sys/target
Error: STDERR

Note that “sudo” is not a shell built-in command; it is a separate program which is not even installed by default on some Linux systems. When “sudo” is executed with those arguments, it runs “echo value” as root, executing the echo binary (e.g. /bin/echo) directly rather than through a shell. The result is the same as for the shell built-in: “value” is written to STDOUT.

The STDOUT for sudo, however, is not the console, since the shell set up the redirection to /sys/target before launching sudo. This leads us to the reason why your command doesn’t work.

In order to execute your original command, a few system calls are involved:

  • Fork: The shell process forks into two processes
  • Open: The child process opens /sys/target for writing
  • Dup2: The child process maps the opened file as the STDOUT descriptor (replacing the previous file assigned to that descriptor)
  • Exec: The child process calls exec to replace itself with the desired program (sudo)

Note that this all occurs in your current security context, since “sudo” has not been launched yet (it is the “exec” call that launches “sudo”). As a result, the “open” system call fails, because your user account does not have the required permissions to open the target file for writing. Neither “echo” nor “sudo” fails, because neither is ever started.
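You can watch this happen with strace (a rough sketch; /sys/target stands in for any file your user cannot write):

strace -f -e trace=openat,execve sh -c 'sudo echo value > /sys/target'

The trace shows openat() on /sys/target returning EACCES, and no execve() of sudo ever appearing.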

Solutions

The solution to this is to do the output redirection from within the security context that “sudo” gives you. There are various ways to achieve this:

tee

Tee duplicates its input stream to multiple outputs (hence the name, “T”). Play with the following examples:

echo Hello | tee
echo Hello | tee /dev/stdout
echo Hello | tee /dev/stdout /dev/stderr
echo Hello | tee /dev/stdout /dev/stdout /dev/stdout

This is commonly used for redirecting from programs run with sudo:

echo value | sudo tee /sys/target

The “echo” program does not need to be run with sudo, as it just writes its parameters to STDOUT. “tee” is run with sudo, as it needs to open the target file for writing. Note that since “tee” also copies the input data to its output, you will see your value written to the terminal when using this method.
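Two useful variations: redirect tee’s own output to /dev/null if you don’t want the value echoed back to the terminal, and pass tee the -a flag to append to the target rather than truncate it (the equivalent of “>>”):

echo value | sudo tee /sys/target > /dev/null
echo value | sudo tee -a /sys/target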

Nested shell

Another method is to have “sudo” start a shell and redirect from within that shell:

sudo sh -c 'echo value > /sys/target'

Wrapping the command in single quotes prevents environment variables from your interactive shell from being expanded into the command; expansion instead takes place in the root shell. This may be beneficial in some cases.
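For example (assuming a default sudo configuration, which resets $USER to the target user):

sudo sh -c "echo $USER"   # double quotes: your shell expands $USER, printing your username
sudo sh -c 'echo $USER'   # single quotes: the root shell expands $USER, printing "root"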

sudoers

The previous methods use sudo, which will typically ask for a password. If you want to permit certain operations to be run without requiring manual password entry (e.g. for scripting or for hotkeys), you can put the commands into scripts, chown the scripts so they’re owned by root, then edit the “sudoers” file appropriately.
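For example, the following sudoers entry (a sketch; the user name and script path are hypothetical, and sudoers should only ever be edited via visudo) lets “alice” run one specific root-owned script without a password:

alice ALL=(root) NOPASSWD: /usr/local/sbin/set-target.sh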

setuid

You can also replace the scripts with C programs. After compiling, you must chown the programs to root and set the setuid bit on them, so that they start with root’s effective user ID automatically (the program can then call setuid() to complete the privilege change).

This method requires care since you’re allowing any user to run certain root-only operations. Be careful of what commands you permit to be run like this, take care to avoid buffer overruns/underruns, and compile in ultra-strict mode (-Wall -Wextra -pedantic -Werror -fstack-protector).
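The build and install steps would look something like this (set-target.c being a hypothetical program):

gcc -Wall -Wextra -pedantic -Werror -fstack-protector -o set-target set-target.c
sudo chown root:root set-target
sudo chmod u+s set-target   # set the setuid bit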

The infamous tar-pipe

Bulk copying

Copying files on or between Linux/Unix machines is considerably nicer than on their Windows counterparts.  Recursive copy with the standard cp utility needs only a single extra flag, where Windows often requires a separate program (xcopy) and additional flags to achieve the same task.  Copying over networks is a breeze with SCP, where Windows would complain about the shell not supporting UNCs; alternatively, you would have to map network shares to drive letters first, and keep track of which letters map to which shares.

Of course, on Windows you can drag/drop the files in a GUI much like on Linux, but the moment you go away for a coffee will be the moment that Windows freezes the operation and pops up an annoying dialog asking you if you’re sure that you want to copy some of the files.  Then half an hour later when you return, the copy is still only ten seconds in…

On the Linux front, sometimes we want to customize things a bit:

  • error handling (fail or continue?)
  • symbolic link handling (reference or duplicate?)
  • hard link handling (reference or duplicate?)
  • metadata (copy or ignore?)
  • permissions (copy or ignore?)
  • sparse files (preserve or fill?)
  • filesystem boundaries (recurse or skip?)

Additionally, copying many small files over SCP can take a very long time; SCP performs best with large files. Rather than re-invent the wheel with a whole new file copy & networking program, we can do much better with the tools that we already have, thanks to the modular and interoperable nature of software built upon the Unix philosophy.

Most (or maybe all) of these problems can be solved with rsync, but rsync is not available in all environments (e.g. managed servers, overpriced Microsoft crap).

Tar examples

A simple and highly customizable way to read a load of files is provided by the tape archiver, tar. You can tell it how to handle the various intricacies listed above, and it will then recursively read the files and write them as a single stream to its output or to a file.

[code]
Common tar options:
-c combine files into an archive
-x extract files from archive
-f <file> set archive filename (default is standard input/output)
-t list names of files in archive
-z, -j, -J use gzip / bzip2 / xz (de)compression
-v list names of files processed
-C <path> set current working directory to this path before proceeding
[/code]
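Several of the intricacies listed earlier map directly onto GNU tar options (these are GNU extensions, so availability on other tar implementations may vary):

[code]
--ignore-failed-read  continue after unreadable files instead of aborting
-h, --dereference     archive the files that symbolic links point to
--hard-dereference    archive hard links as duplicate files
-p                    preserve permissions when extracting
-S, --sparse          store sparse files efficiently
--one-file-system     do not cross filesystem boundaries when recursing
[/code]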
[code language=”bash”]tar -cf output.tar file1 file2 …[/code]
[code language=”bash”]tar -xf input.tar[/code]

By writing to the standard output, we can pass this archive through a stream compressor, e.g. gzip, bzip2.

[code language=”bash”]
tar -c file1 file2 … | gzip -c > output.tar.gz
[/code]

As this is a common use of tar, the most common compressors can also be specified as flags to tar rather than via a pipeline:

Archive and compress:

[code language=”bash”]
tar -czf output.tar.gz file1 file2 …
tar -cjf output.tar.bz2 file1 file2 …
tar -cJf output.tar.xz file1 file2 …
[/code]

Decompress and extract:

[code language=”bash”]
tar -xzf input.tar.gz
tar -xjf input.tar.bz2
tar -xJf input.tar.xz
[/code]

Tar streams can be transferred over networks to a destination computer, where a second tar instance is run. This second one receives the archive stream from the first tar instance and extracts the files onto the destination computer.  This usage of two tar instances over a pipeline has resulted in the technique being nicknamed the “tar-pipe”.

Where network speed is the bottleneck, tar can be instructed to (de)compress the streams on the fly, and offers a choice of codecs.  Note that due to the pipelined nature of this operation, any other streaming (de)compressors can also be used even if not supported by tar.

Tar-pipe examples

In its simplest form, to copy one folder tree to another:

[code language=”bash”]tar -C source/ -c . | tar -C dest/ -x[/code]

One could specify the -h parameter for the left-side tar, to have it follow symbolic links and build a link-free copy of the source in the destination, e.g. for sharing the tree with Windows users.
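For example:

[code language=”bash”]tar -C source/ -ch . | tar -C dest/ -x[/code]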

To copy the files over a network, simply wrap the second tar in an SSH call:

[code language=”bash”]tar -C source/ -c . | ssh user@host 'tar -C dest/ -x'[/code]

To copy from a remote machine, put the first tar in an SSH call instead:

[code language=”bash”]ssh user@host 'tar -C source/ -c .' | tar -C dest/ -x[/code]

SSH provides authentication and encryption, so this form can be used over insecure networks such as the internet.  The SCP utility uses SSH internally. SSH can also provide transparent compression (the -C flag), but the options provided by tar will generally be more useful.

Fast but insecure alternative: netcat

A lightweight alternative is to use netcat, which provides no authentication or encryption and should therefore only be used on trusted private networks:

[code language=”bash”]
# On the source machine
tar -C source/ -c . | nc host port
[/code]
[code language=”bash”]
# On the target machine (some netcat variants take "nc -l port", without -p)
nc -l -p port | tar -C dest/ -x
[/code]

This lightweight form is useful on ultra-low-end hardware such as the Raspberry Pi, but it is considerably less robust than the SSH tar-pipe.

Compressed tar-pipe

If the network is slow then data compression can easily be used with the tar-pipe:

[code language=”bash”]
# z = gzip (high speed)
# j = bzip2 (compromise)
# J = xz (high compression)

# example, using bzip2 (why would anyone use bzip2 vs choice of xz/gzip nowadays?)
tar -C source/ -cj . | ssh user@host 'tar -C dest/ -xj'
[/code]

To use a (de)compressor of your choice, provided it is installed on both machines:

[code language=”bash”]
tar -C source/ -c . | my-compressor | ssh user@host 'my-decompressor | tar -C dest/ -x'
[/code]

You could, for example, use a parallel implementation of a common compressor such as pigz / pbzip2 / pxz, in order to speed things up a bit.
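For example, with pigz installed on both machines:

[code language=”bash”]
tar -C source/ -c . | pigz | ssh user@host 'pigz -d | tar -C dest/ -x'
[/code]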

GNU tar also has a command-line parameter for specifying the compressor/decompressor (-I, or --use-compress-program), provided the program follows a certain set of rules: it must compress standard input to standard output, and decompress when invoked with -d.
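For example (when extracting, GNU tar invokes the given program with -d):

[code language=”bash”]
tar -I pigz -cf output.tar.gz file1 file2 …
tar -I pigz -xf input.tar.gz
[/code]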

The choice of (de)compressor and compressor settings depends on the available processing power, RAM, and network bandwidth. Copying between two modern i7 desktops over gigabit Ethernet, gzip compression should suffice; on such a fast connection, heavy compression (e.g. xz -9e) will create a CPU bottleneck. For a 100Mbit Ethernet connection or a USB 2 connection, bzip2 or xz (levels 1-3) might give better performance. On a Raspberry Pi, a bzip2 tar-pipe might end up being slower (due to CPU bottleneck) than an uncompressed tar-pipe (limited by network bandwidth).

A niche use example of tar+compression

I originally wrote this while solving a somewhat unrelated problem. From Estonia, I can remotely power on my home PC in the UK via a Raspberry Pi Wake-on-LAN server with dynamic DNS, then use reverse port forwarding to access the UK PC. To transfer a large amount of data (~1TB) from the UK PC to Estonia, the fastest method (by far) was sneakernet: i.e. copy the data to a USB disk, then have that disk posted to Estonia.

A friend back home plugged in the USB disk, which contained a couple of hundred gigabytes of his own files (which he wanted to send me), but the disk was formatted with Microsoft’s crappy FAT32 file system. After copying a load of small files to the disk, it became very slow to use; then, while trying to copy “large” files (only >4GB), it failed completely.  I recall Bill Gates supposedly once said that we’d never need more than 640kB of RAM – well apparently, he thought that a 4GB file-size limit would also be future-proof…  FAT32 also doesn’t support symbolic links, and although Microsoft’s recent versions of NTFS do, their own programs still often fail miserably when handling symbolic links to executable files.

To solve this I wanted to reformat the disk as Ext4, but keep the files that were on it. The only disk in my home PC with enough free space for the files already on the USB disk was very slow (damn Eco-friendly 5400rpm disk!), so moving the files from the USB disk to this one would take a long time. Hence, I used the left half of a tar-pipe with a parallel-gzip (pigz) compressor to copy the data from the USB disk to the very slow disk.

By compressing the data on the fly before storing it, I could fit somewhat more source data into the measly 20MB/s write speed of the slow disk, getting an effective write speed of around 50MB/s – saturating the link from the USB disk, which was one bottleneck that couldn’t be avoided.

After that was complete, I blitzed and re-formatted the USB disk as Ext4, then ran the right-half of the tar-pipe to extract the data from the slow disk back to the USB disk and resumed copying “To Estonia” files to the disk.
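In command form, the workflow looked something like this (mount points are illustrative):

[code language=”bash”]
# Left half: archive the USB disk, compress, store on the slow disk
tar -C /mnt/usb -c . | pigz > /mnt/slow/stash.tar.gz
# ...reformat the USB disk as Ext4...
# Right half: decompress from the slow disk, extract back onto the USB disk
pigz -d < /mnt/slow/stash.tar.gz | tar -C /mnt/usb -x
[/code]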

One-liner to duplicate database over network

Here is a handy little bash pipeline that I used to transfer a MariaDB database from a development system (in the USA) to the production system (in Europe).  It’s really a one-liner, but this website isn’t wide enough to display it all on one line:

mysqldump -u root --password=<password> --databases <databases> |
ssh <user@target> "mysql -u root --password=<password>"

Note that this will overwrite any same-named tables in an existing database of the same name on the target system (mysqldump emits “DROP TABLE IF EXISTS” for each table by default).

If you use public-key cryptography rather than passwords for ssh authentication, then this will run with no user input necessary at all.
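If you haven’t set up key-based authentication yet, it typically takes just two commands (assuming OpenSSH on both ends):

ssh-keygen -t ed25519
ssh-copy-id <user@target>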

For a demonstration, use the following shell scripts:

Script to create example database:

#!/usr/bin/env sh
# Create a demo database; the heredoc delimiter "quit" ends the SQL input
mysql -u root -p <<quit
create database lemon;
use lemon;
create table lime (field1 VARCHAR(32), field2 int);
insert into lime values ("zesty", 42);
quit

Script to copy example database to another system:

#!/bin/bash

# Prompts the user for input, stores input in variable, uses existing value if
# user provides no input.
# Parameters: prompt, default value, variable name, [optional: silent?]
function prompt {
        local def line silent val
        def="$2"
        [ -z "$def" ] && def="${!3}"        # fall back to the variable's current value
        [ -z "$def" ] && val="" || val=" ($def)"
        echo -n "$1$val: "
        [ "$4" ] && silent="-s" || silent=""
        read -r $silent line
        # printf -v assigns safely, without the word-splitting risks of eval
        [ -z "$line" ] && printf -v "$3" '%s' "$def" || printf -v "$3" '%s' "$line"
        [ "$silent" ] && echo ""
}

# Default values
DBNAME=lemon
DBUSER=root
DBPASS=
SSHOST=

# Get input from user
prompt "Database name" "$1" DBNAME
prompt "Database username" "$2" DBUSER
prompt "Database password" "" DBPASS "-s"
prompt "SSH target host" "$3" SSHOST

# A nice one/two-liner (well one-line if you replace the variables with useful
# values, and ditch the above code)
mysqldump -u "$DBUSER" --password="$DBPASS" --databases "$DBNAME" |
ssh "$SSHOST" "mysql -u \"$DBUSER\" --password=\"$DBPASS\""

This script assumes that an SSH server is enabled on the target machine, and that the MySQL root passwords are the same on both systems.

Then to see the copied database (run on the target system):

echo "select * from lemon.lime;" | mysql -p

Or if you can’t be bothered opening an interactive shell on the target system:

echo "select * from lemon.lime;" | ssh <user@target> "mysql -p"