Grep (COMP2041)

grep is a command-line text search tool on UNIX. The name comes from the ed editor command g/re/p: globally search for a regular expression and print the matching lines.

It takes input in the format:

grep regex file

If no file is given, grep reads from stdin.

It then prints every line that contains a match for the regex.
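As a minimal sketch of the two input modes, the commands below search a small file and then feed the same text to grep on stdin. The file name fruits.txt and its contents are made up for illustration.

```shell
# Hypothetical sample data, written to a temporary directory.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/fruits.txt"

# Form 1: grep regex file
from_file=$(grep 'an' "$tmpdir/fruits.txt")

# Form 2: no file argument, so grep reads from stdin
from_stdin=$(grep 'an' < "$tmpdir/fruits.txt")

# Both forms print the same matching line(s).
echo "$from_file"
echo "$from_stdin"

rm -r "$tmpdir"
```

Only the line containing "an" is printed; lines without a match are dropped.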

Commands

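  • [^...] = invert: match any one character not listed (^ as the first character inside brackets)
  • ^ = start of line (outside brackets; yes, it means both of these depending on context)
  • $ = end of line
  • . = any one character
  • | = or (needs grep -E; in plain grep's basic syntax | is an ordinary character)
  • * = repeat the previous item 0 or more times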

You can express any regular expression with |, * and () (plus plain concatenation) - all the other commands are syntactic sugar.
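One way to see the "syntactic sugar" point: a bracket expression such as [bc] can be rewritten using only alternation and grouping, as (b|c). The sketch below assumes grep -E (extended syntax, where | and () work unescaped) and uses made-up sample words.

```shell
# Made-up sample input, one word per line.
words=$(printf 'abd\nacd\naxd\nad\n')

# Bracket expression: b or c between a and d.
with_brackets=$(printf '%s\n' "$words" | grep -E 'a[bc]d')

# The same pattern written with only alternation and grouping.
with_alternation=$(printf '%s\n' "$words" | grep -E 'a(b|c)d')

printf '%s\n' "$with_brackets"
printf '%s\n' "$with_alternation"
```

Both commands select the same lines, which is what makes the bracket form a convenience rather than a new capability.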

Examples

  • abc = abc
  • a.c = a [any character] c (e.g. atc, abc, acc, afc, a0c)
  • [^a-z] = any one character that is not a lowercase letter (e.g. each character of 083472AD)
  • a|the = either a or the (a line containing both still matches)
  • as*k = ak, ask, assk, asssk, assssk, etc.
  • f$ = f at the end of a line - e.g. "beth likes stuff" matches, but "blue fox fibs" does not
  • ^h = h at the start of a line - e.g. "hello world" matches, but "ohai there" does not
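The examples above can be tried directly. This is a rough sketch with made-up input lines; the variable names are only for illustration, and the a|the pattern is run with grep -E on the assumption that extended syntax is wanted for an unescaped |.

```shell
# Made-up sample lines.
lines=$(printf 'hello world\nohai there\nbeth likes stuff\nblue fox fibs\n')

# ^h: lines that start with h.
starts_with_h=$(printf '%s\n' "$lines" | grep '^h')

# f$: lines that end with f.
ends_with_f=$(printf '%s\n' "$lines" | grep 'f$')

# as*k: a, then zero or more s, then k.
star=$(printf 'ak\nask\nassk\nark\n' | grep 'as*k')

# a|the: either alternative; -E enables the unescaped |.
alternation=$(printf 'the end\nno luck\na cat\n' | grep -E 'a|the')

echo "$starts_with_h"
echo "$ends_with_f"
echo "$star"
echo "$alternation"
```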

Escaping Characters

To match a character like '*' literally rather than as a command, we need to 'escape' it. We do this by placing a backslash in front of it (\*).
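A small sketch of the difference escaping makes, using made-up input. Note the single quotes around the pattern, which stop the shell from interpreting the backslash or the asterisk before grep sees them.

```shell
# Made-up sample input.
input=$(printf '2*3\n223\n23\n')

# Unescaped: * means "zero or more of the previous character" (here, 2),
# so any line containing a 3 matches.
unescaped=$(printf '%s\n' "$input" | grep '2*3')

# Escaped: \* matches a literal asterisk.
escaped=$(printf '%s\n' "$input" | grep '2\*3')

echo "$unescaped"
echo "$escaped"
```

The unescaped pattern matches all three lines, while the escaped one matches only the line that actually contains 2*3.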

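Everything except ^, -, [ and ] loses its special meaning inside brackets. In grep the backslash also becomes an ordinary character there, so [\*] matches either a backslash or an asterisk.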
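To illustrate, here is a short sketch (with made-up input) comparing . inside and outside brackets:

```shell
# Made-up sample input.
input=$(printf 'a.b\na*b\naxb\n')

# Inside brackets, . and * are ordinary characters.
literal=$(printf '%s\n' "$input" | grep 'a[.*]b')

# Outside brackets, . matches any single character.
any_char=$(printf '%s\n' "$input" | grep 'a.b')

echo "$literal"
echo "$any_char"
```

The bracketed pattern skips axb, because inside brackets the dot only matches an actual dot.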