How to Use the find Command in Linux

Situation

The Linux find command is great at searching for files and directories. But you can also pass the results of the search to other programs for further processing.

Solution

The Linux find command is powerful and flexible. It can search for files and directories using a whole raft of different criteria, not just filenames. For example, it can search for empty files, executable files, or files owned by a particular user. It can find and list files by their accessed or modified times, match names against regex patterns, and work with pseudo-files like named pipes (FIFO buffers). It is also recursive by default.
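A few of these criteria can be tried out safely in a throwaway directory. The following is only a sketch: the file names (empty.txt, notes.txt, run.sh, queue) are made up for illustration, and it assumes a find that supports -empty, as the GNU and BSD versions do.

```shell
#!/bin/sh
# Hypothetical sandbox: every file name here is invented for the demo.
demo=$(mktemp -d)
cd "$demo" || exit 1

: > empty.txt                   # zero-length file
printf 'hello\n' > notes.txt    # non-empty file
printf '#!/bin/sh\n' > run.sh
chmod u+x run.sh                # executable by its owner
mkfifo queue                    # named pipe (FIFO)

empty_files=$(find . -type f -empty)        # empty regular files
exec_files=$(find . -type f -perm -u+x)     # files the owner may execute
fifo_files=$(find . -type p)                # named pipes
# Regular files modified less than one day ago
recent_count=$(find . -type f -mtime -1 | wc -l | tr -d ' ')
# Regular files owned by the current user (numeric user ID)
mine_count=$(find . -type f -user "$(id -u)" | wc -l | tr -d ' ')

printf 'empty: %s\nexecutable: %s\nfifo: %s\n' \
    "$empty_files" "$exec_files" "$fifo_files"
printf 'recent: %s, owned by me: %s\n' "$recent_count" "$mine_count"
```

Each test is an ordinary find expression, so several can be combined in a single command to narrow the search further.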

All of that is fantastically useful. The humble find command really packs some power. But there’s a way to leverage that power and take things to another level. If we can take the output of the find command and use it automatically as the input of other commands, we can make something happen to the files and directories that find uncovers for us.

The principle of piping the output of one command into another command is a core characteristic of Unix-derived operating systems. The design principle of making a program do one thing and do it well, and to expect that its output could be the input of another program—even an as yet unwritten program—is often described as the “Unix philosophy.” And yet some core utilities, like mkdir, don’t accept piped input.

To address this shortcoming, the xargs command can be used to parcel up piped input and feed it to another command as though it were a set of command-line parameters for that command. This achieves almost the same thing as straightforward piping. That’s “almost the same” thing, and not “exactly the same” thing, because there can be unexpected differences with shell expansions and file name globbing.
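As a small illustration of that idea, the sketch below uses xargs to hand piped names to mkdir. The directory names are hypothetical. The second command shows one related surprise: by default xargs splits its input on whitespace, so a name containing a space becomes two arguments.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

# mkdir does not read names from standard input, so xargs turns the piped
# lines into arguments. This behaves roughly like: mkdir alpha beta gamma
printf '%s\n' alpha beta gamma | xargs mkdir

# Default splitting on whitespace: this creates "two" and "words",
# not a single directory called "two words".
printf '%s\n' 'two words' | xargs mkdir

ls
```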

Using find With xargs

We can use find with xargs to have some action performed on the files that are found. This is a long-winded way to go about it, but we could feed the files found by find into xargs, which then passes them to tar as command-line arguments to create an archive file of those files. We’ll run this command in a directory that has many help system PAGE files in it.

find ./ -name "*.page" -type f -print0 | xargs -0 tar -cvzf page_files.tar.gz

The command is made up of different elements.

  • find ./ -name "*.page" -type f -print0: The find action will start in the current directory, searching by name for files that match the "*.page" search string. Directories will not be listed because we’re specifically telling it to look for files only, with -type f. The -print0 argument tells find to end each filename with a null character instead of a newline. This means that filenames with spaces in them will be processed correctly.
  • xargs -0: The -0 argument tells xargs to expect null-terminated filenames, so it does not treat whitespace as the end of a filename.
  • tar -cvzf page_files.tar.gz: This is the command xargs is going to feed the file list from find to. The tar utility will create an archive file called “page_files.tar.gz.”
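To see the whole pipeline at work without touching real data, it can be run in a scratch directory. This is a sketch under assumptions: the help/ tree and its file names are invented, and one name deliberately contains a space to exercise -print0 and -0.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1

# Hypothetical help files; readme.txt should be left out of the archive.
mkdir -p help/sub
printf 'a\n' > help/index.page
printf 'b\n' > 'help/sub/getting started.page'
printf 'c\n' > help/readme.txt

# Same pipeline as in the article.
find ./ -name "*.page" -type f -print0 | xargs -0 tar -cvzf page_files.tar.gz

# List the archive contents and count the entries.
archived_count=$(tar -tzf page_files.tar.gz | wc -l | tr -d ' ')
spaced_count=$(tar -tzf page_files.tar.gz | grep -c 'getting started.page')
echo "archived entries: $archived_count"
```

The file with the space in its name ends up in the archive as a single entry, which is the point of pairing -print0 with -0.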

We can use ls to see the archive file that is created for us.

ls *.gz

The archive file is created for us. For this to work, all of the filenames need to be passed to tar en masse, which is what happened. All of the filenames were tagged onto the end of the tar command as a very long command line.

You can choose to have the final command run on all the file names at once or invoked once per filename. We can see the difference quite easily by having xargs pass the filenames to the line, word, and byte counting utility wc. This command passes all the filenames to wc at once. Effectively, xargs constructs a long command line for wc with each of the filenames in it.

find . -name "*.page" -type f -print0 | xargs -0 wc

The line, word, and byte counts for each file are printed, together with a total for all files.

If we use xargs’s -I (replace string) option and define a replacement string token, in this case "{}", the token is replaced in the final command by each filename in turn. This means wc is called repeatedly, once for each file.

find . -name "*.page" -type f -print0 | xargs -0 -I "{}" wc "{}"

The output isn’t nicely lined up. Each invocation of wc operates on a single file so wc has nothing to line the output up with. Each line of output is an independent line of text.

Because wc can only provide a total when it operates on multiple files at once, we don’t get the summary statistics.
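The difference between the two forms can be checked by counting output lines. The sketch below assumes three made-up .page files and sets LC_ALL=C so that the summary line from wc is labelled "total" regardless of the system language.

```shell
#!/bin/sh
export LC_ALL=C
demo=$(mktemp -d)
cd "$demo" || exit 1

# Three hypothetical files to count.
printf 'one\n' > a.page
printf 'two words\n' > b.page
printf 'three more words\n' > c.page

# One wc invocation for all files: three file lines plus a total line.
batch=$(find . -name "*.page" -type f -print0 | xargs -0 wc)

# One wc invocation per file: three independent lines and no total.
each=$(find . -name "*.page" -type f -print0 | xargs -0 -I "{}" wc "{}")

batch_lines=$(printf '%s\n' "$batch" | wc -l | tr -d ' ')
each_lines=$(printf '%s\n' "$each" | wc -l | tr -d ' ')

printf 'batch output:\n%s\n\nper-file output:\n%s\n' "$batch" "$each"
```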

The find -exec Option

The find command has a built-in method of calling external programs to perform further processing on the filenames that it returns. The -exec (execute) option has a syntax similar to but different from the xargs command.

find . -name "*.page" -type f -exec wc -c "{}" \;

This will count the bytes in the matching files. The command is made up of these elements.

  • find .: Start the search in the current directory. The find command is recursive by default, so subdirectories will be searched too.
  • -name "*.page": We’re looking for files with names that match the "*.page" search string.
  • -type f: We’re only looking for files, not directories.
  • -exec wc: We’re going to execute the wc command on the filenames that are matched with the search string.
  • -c: Any options that you want to pass to the command must be placed immediately following the command.
  • "{}": The "{}" placeholder represents each filename. Here it is the last item in the parameter list, which is where it must be when the plus sign form of -exec is used.
  • \;: A semicolon ";" is used to indicate the end of the parameter list. It must be escaped with a backslash "\" so that the shell doesn’t interpret it.

When we run that command we see the output of wc. The -c (byte count) option limits its output to the number of bytes in each file.

As you can see there is no total. The wc command is executed once per filename. By substituting a plus sign “+” for the terminating semicolon “;” we can change -exec‘s behaviour to operate on all files at once.

find . -name "*.page" -type f -exec wc -c "{}" \+

We get the summary total and neatly tabulated results that tell us all files were passed to wc as one long command line.
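One way to confirm how often -exec runs the command is to swap wc for a tiny shell snippet that reports how many filenames it received. This is a sketch only: the three empty .page files are hypothetical, and the extra "sh" after the snippet simply fills the $0 slot so that the filenames land in the positional parameters.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
: > a.page
: > b.page
: > c.page

# Semicolon form: the snippet runs once per file, with one argument each time.
per_file=$(find . -name "*.page" -type f \
    -exec sh -c 'echo "run with $# file(s)"' sh {} \;)

# Plus form: the snippet runs once, with all filenames as arguments.
batched=$(find . -name "*.page" -type f \
    -exec sh -c 'echo "run with $# file(s)"' sh {} +)

per_file_runs=$(printf '%s\n' "$per_file" | wc -l | tr -d ' ')

printf 'semicolon form:\n%s\n\nplus form:\n%s\n' "$per_file" "$batched"
```

With only three files everything fits on one command line. For a very large number of matches, find may split the plus form into more than one invocation.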

Solution type

Permanent