
$ tar xzf app_ --to-command='grep --label=$TAR_FILENAME -Hi "security alert"'
Logs/app2/user.log: 22:08:14 security alert: 10 times failed login from the same IP
tar: Exiting with failure status due to previous errors

grep exits with a non-zero status for every file that contains no match, so tar reports errors. This messes up the output, which is definitely not what we want. Therefore, we add the true command at the end to make COMMAND always return 0 and suppress those error messages.

Another point we should note is that we’ve wrapped COMMAND in single quotes. This is because the TAR_* variables are assigned during tar’s execution and passed to COMMAND. So, for example, if we double-quote COMMAND, the shell expands the $TAR_FILENAME variable when invoking the tar command:

$ tar xzf app_ --to-command="grep --label=$TAR_FILENAME -Hi 'security alert'; true"
: 22:08:14 security alert: 10 times failed login from the same IP
: 17:07:14 Security alert: 10 Permission Denied Requests from the same IP.
: 22:18:10 security alert: 10 times failed login from the same IP

As the test above shows, we have three empty filenames in the output, as the shell variable $TAR_FILENAME doesn’t exist when we start the tar command. Therefore, we need to use single quotes to prevent expanding variable names when starting tar.
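The quoting difference can be reproduced on a throwaway archive. The script below is a minimal sketch, assuming GNU tar (for --to-command) and GNU grep (for --label); the archive name demo_logs.tar.gz, the directory layout, and the log lines are all made up for the demonstration and are not the files from the article.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar (--to-command) and GNU grep (--label).
set -e

# Build a tiny throwaway archive; every name below is made up for the demo.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/app1" "$workdir/logs/app2"
printf '%s\n' 'security alert: failed login' > "$workdir/logs/app1/user.log"
printf '%s\n' 'nothing to report' > "$workdir/logs/app2/user.log"
tar czf "$workdir/demo_logs.tar.gz" -C "$workdir" logs

# Make sure the invoking shell has no TAR_FILENAME of its own.
unset TAR_FILENAME

# Single quotes: $TAR_FILENAME reaches tar unexpanded, and tar sets it for
# each member. "; true" keeps the exit status 0 when grep finds nothing.
single=$(tar xzf "$workdir/demo_logs.tar.gz" \
    --to-command='grep --label="$TAR_FILENAME" -Hi "security alert"; true')

# Double quotes: the invoking shell expands $TAR_FILENAME to an empty
# string before tar even starts, so the label is empty.
double=$(tar xzf "$workdir/demo_logs.tar.gz" \
    --to-command="grep --label=$TAR_FILENAME -Hi 'security alert'; true")

printf 'single-quoted: %s\n' "$single"
printf 'double-quoted: %s\n' "$double"

rm -rf "$workdir"
```

Comparing the two printed lines shows the member name in front of the match only in the single-quoted case.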

-i: Ignore case distinctions when matching patterns

As the output above shows, zgrep has successfully found the three “security alert” occurrences. However, if we take a closer look at the filenames in the output, we only see the tar.gz file’s name instead of the names of the log files in the archive. Next, to figure out why it happens, we need to understand how zgrep works.

First, zgrep is just a shell script:

$ file $(which zgrep)
/usr/bin/zgrep: POSIX shell script, ASCII text executable

That means we can read the source to understand how it works. Simply put, zgrep uses gzip to decompress the files to stdout and pipes the result to grep to perform the search. Basically, it’s pretty similar to the command:

$ tar xzfO app_ | grep -Hai 'security alert'

Here, we use the -O option to ask the tar command to extract files to stdout instead of disk. Therefore, zgrep can search the files’ content in a compressed archive, but it cannot tell which file inside the archive hits the match.

Looking at the grep manual (the description of --binary-files=TYPE), this seems to be because: If type is ‘without-match’, when grep discovers null input binary data it assumes that the rest of the file does not match; this is equivalent to the -I option.
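To see why the member names get lost, it can help to run the pipeline on a small archive. This is a minimal sketch, assuming GNU tar and GNU grep; it does not call zgrep itself, only the tar-to-stdout pipeline described above, and the archive name and log contents are invented for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar and GNU grep; all file names are made up.
set -e
LC_ALL=C
export LC_ALL   # keep grep's "(standard input)" label untranslated

workdir=$(mktemp -d)
mkdir -p "$workdir/logs"
printf '%s\n' 'Security alert: permission denied' > "$workdir/logs/a.log"
printf '%s\n' 'all quiet' > "$workdir/logs/b.log"
tar czf "$workdir/demo_logs.tar.gz" -C "$workdir" logs

# -O sends the content of every member to stdout as one continuous stream.
# grep therefore reads a single anonymous input and has no member name to
# print, even with -H.
hits=$(tar -xzOf "$workdir/demo_logs.tar.gz" | grep -Hai 'security alert')

printf '%s\n' "$hits"

rm -rf "$workdir"
```

The match is found, but the name in front of it describes the stream, not the log file it came from.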
This can be the most straightforward way to achieve the goal. Our example has only four small log files. However, in the real world, the tarball may contain a significant number of files. Also, the files in the archive can be much bigger than our example. Therefore, these three steps may increase the disk I/O load dramatically.
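For reference, the straightforward approach might look like the sketch below. It assumes the three steps are extract, search, and clean up, which this section does not spell out; the archive, directory names, and log lines are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes the three steps are extract, search, clean up.
set -e

# Throwaway archive with made-up content.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/app1" "$workdir/logs/app2"
printf '%s\n' 'security alert: failed login' > "$workdir/logs/app1/user.log"
printf '%s\n' 'nothing to report' > "$workdir/logs/app2/user.log"
tar czf "$workdir/demo_logs.tar.gz" -C "$workdir" logs

# Step 1: extract everything to disk (one write per file in the archive).
extract_dir=$(mktemp -d)
tar xzf "$workdir/demo_logs.tar.gz" -C "$extract_dir"

# Step 2: search the extracted tree (every file is read back from disk).
hits=$(cd "$extract_dir" && grep -rHi 'security alert' logs)

# Step 3: remove the extracted copy again.
rm -rf "$extract_dir"

printf '%s\n' "$hits"

rm -rf "$workdir"
```

Every file is written once and read once just to be deleted afterwards, which is where the extra disk load comes from on large archives.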

I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a (binary) header followed by binary data. Within the binary header is an ascii string 80 characters long. Somewhere along the way, my process of writing the files got a little messed up and I'm trying to debug this problem by inspecting how long each record actually is.

This seems extremely related, but I don't understand perl, so I haven't been able to get the accepted answer there to work.
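One way to inspect record lengths without perl might be to let grep report the byte offset of each header string and then take the differences. This is only a sketch under assumptions: it relies on GNU grep's -a, -b and -o options, and the marker RECHDR, the record sizes, and the sample file are hypothetical stand-ins for the real 80-character header string.

```shell
#!/bin/sh
# Sketch only: assumes GNU grep; RECHDR is a made-up stand-in for the
# ASCII string that appears in every record header.
set -e
LC_ALL=C
export LC_ALL   # treat the file as raw bytes

datafile=$(mktemp)
# Two fake records: an ASCII marker followed by a few binary bytes.
printf 'RECHDR\000\001\002\003' > "$datafile"           # 10 bytes
printf 'RECHDR\000\001\002\003\004\005' >> "$datafile"  # 12 bytes

# -a: treat binary as text, -b: print byte offsets, -o: only the match,
# so each output line is "<offset>:RECHDR".
offsets=$(grep -abo 'RECHDR' "$datafile" | cut -d: -f1)

# A record runs from one marker to the next (or to the end of the file).
total=$(wc -c < "$datafile")
lengths=$( { printf '%s\n' "$offsets"; echo "$total"; } |
    awk 'NR > 1 { print $1 - prev } { prev = $1 }')

printf '%s\n' "$lengths"

rm -f "$datafile"
```

If the marker does not sit at the very start of each header, the differences still give the record lengths as long as it sits at the same position in every header.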
