Sift: Grep On Steroids?

I’ve been playing around with Sift this weekend as a potentially friendlier and faster alternative to grep (not that grep is slow). Although the tool clearly has broader applications than parsing server logs, it’s very well suited to that job. If you’re currently using grep for log analysis, I’d recommend checking Sift out.

Overview

  • No dependencies. Easy, cross platform install.
  • Familiar to grep users.
  • Really fast. Handles huge files easily.
  • Perl-style RegEx, with support for multiple patterns simultaneously.
  • A single flag for searching both plain and gzipped files (e.g. sift -z Googlebot access*). With grep this takes two separate passes (grep for the plain files, zgrep for the gzipped ones). For audits of larger sites with multiple servers, this is more of a time saver than you’d think.
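
As a rough sketch of that last point, here is the two-pass grep approach next to the single sift call. The file names and log lines are made up for illustration, and the sift command only runs if sift is actually on your PATH.

```shell
# Sample data (hypothetical file names): one plain log and one gzipped rotation
logs="$(mktemp -d)"
echo 'GET / "Googlebot/2.1"' > "$logs/access.log"
echo 'GET /old "Googlebot/2.1"' | gzip > "$logs/access.log.1.gz"

# With grep: one pass for the plain file, a second pass (zgrep) for the gzipped one
plain_hits="$(grep Googlebot "$logs/access.log")"
gz_hits="$(zgrep Googlebot "$logs/access.log.1.gz")"
printf '%s\n%s\n' "$plain_hits" "$gz_hits"

# With sift: a single pass using -z, if sift is installed
if command -v sift >/dev/null 2>&1; then
  sift -z Googlebot "$logs"/access*
fi
```

On a couple of files the difference is trivial, but it adds up when every server has its own pile of rotated logs.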

The published speed benchmarks are impressive:

[Image: Sift benchmarks. Caption: And suspect!]

Installation

To try it out, you can grab the relevant installer from here. If you’re on Windows or Linux, add the utility to your PATH and you’ll be able to run it from any directory from here on out.

For Windows:

Open Control Panel\System and Security\System
Go to Advanced System Settings\Environment Variables
Copy the Sift application to C:\bin, then append "C:\bin;" to PATH
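
For Linux, something along these lines should do the same job. This is only a sketch: it assumes the extracted binary is called sift and sits in your current directory, and that you want it in ~/bin (both are my assumptions, so adjust to taste).

```shell
# Assumption: the downloaded binary was extracted to ./sift
mkdir -p "$HOME/bin"
if [ -f ./sift ]; then
  cp ./sift "$HOME/bin/sift"
  chmod +x "$HOME/bin/sift"
fi

# Put ~/bin on PATH for this session; add the same line to ~/.profile to keep it
export PATH="$HOME/bin:$PATH"
```

Open a new terminal (or source your profile) and sift should be available from any directory.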

If you’re running Homebrew on OS X:

$ brew install sift

There’s an ARM build too, which you might find useful if you use a Raspberry Pi for this sort of thing (for example, if you’re circumventing installation restrictions at work).

When you start playing around, you’ll find the syntax is pretty similar to grep:

sift -argumentflags REGEXPATTERN file/s

Flags you’ll want to consider:

  • -x Limit search to named file extensions (comma separated).
  • -R Do not recurse into directories.
  • --no-filename Do not print filename.
  • -z Search content of compressed files.
  • --stats Show statistics.
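
To see a few of these flags in action, here is a small sketch you can paste into a terminal. The directory layout and log lines are invented for the example, and the sift calls are skipped if sift isn’t on your PATH.

```shell
# Sample tree (hypothetical names): files at the top level and in a subdirectory
root="$(mktemp -d)"
mkdir -p "$root/archive"
echo 'GET / "Googlebot/2.1"'    > "$root/access.log"
echo 'note: Googlebot spike'    > "$root/notes.txt"
echo 'GET /old "Googlebot/2.1"' > "$root/archive/access.log"

if command -v sift >/dev/null 2>&1; then
  # -x log: only search files with the .log extension
  sift -x log Googlebot "$root"
  # -R: do not descend into archive/
  sift -R Googlebot "$root"
  # --no-filename plus --stats: bare matching lines followed by a summary
  sift --no-filename --stats Googlebot "$root"
else
  echo "sift not found on PATH; sample files are in $root"
fi
```

Comparing the output of the three calls is a quick way to check that each flag does what you expect before pointing sift at real logs.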

For a full list of options, including the File and Match condition options, type:

sift -h

I’ll persist with sift over grep for now, especially as the grep options I tend to use are the same in sift, which means I don’t have to relearn anything.

Check out the Git repository here, and let me know what you think – will you be trying it out for a while, or sticking to {awk|ack|ag} (the older ‘grep on steroids’)?
