GREP
Reference
GREP (the name comes from the ed editor command g/re/p, "globally search for a regular expression and print") is a search tool for finding strings in text files. I frequently use it to find specific pieces of code in source files.
This section of the document describes some of the ways I use GREP, and some of the helpful options and types of regular expressions which make GREP more than just a string finder.
One thing to keep in mind is that GREP is case sensitive by default.
If you want to pass multiple options to GREP, you can concatenate them after a single - (for example, -ni). The order of the options does not matter, but the options must be specified before the regular expression.
Searching through all your source files (*.?)
Since a typical project uses only two types of source files, C (.C) and header (.H), you can tell GREP to search your entire set of source files by using *.? to specify all files with single-character extensions.
For example, the following command finds all occurrences of the string PIX_LEN anywhere in your source:
grep "PIX_LEN" *.?
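A quick way to see what *.? picks up is to try it in a scratch directory. The sketch below assumes a POSIX shell with mktemp available rather than DOS, and the file names and contents are invented for illustration.

```shell
# Scratch directory with three hypothetical files; only two of them
# have single-character extensions.
dir=$(mktemp -d)
printf '%s\n' '#define PIX_LEN 8' > "$dir/defs.h"
printf '%s\n' 'int buf[PIX_LEN];' > "$dir/main.c"
printf '%s\n' 'PIX_LEN is described here' > "$dir/notes.txt"

# *.? expands to defs.h and main.c, so notes.txt is never searched.
result=$(cd "$dir" && grep "PIX_LEN" *.?)
printf '%s\n' "$result"
# defs.h:#define PIX_LEN 8
# main.c:int buf[PIX_LEN];

rm -rf "$dir"
```

Because more than one file is searched, each matching line is prefixed with the name of the file it came from.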
Passing special characters (the \ character)
If you want to search for special characters (the ones which GREP uses, or which DOS does not want to pass) you can specify them in the regular expression by putting \ before them.
For example, to search for the string "HELP" (where the quotes are part of the search), you could use the command:
grep "\"HELP\"" *.c
The \ characters will prevent DOS from removing the extra quotes and will tell GREP to treat them as quote characters, not string delimiters.
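The same escaping can be tried under a POSIX shell, which is an assumption here since the text above describes DOS. In this sketch the file menu.c and its contents are hypothetical; the second search shows \ applied to a character that GREP itself treats as special (the dot).

```shell
dir=$(mktemp -d)
printf '%s\n' 'puts("HELP");' '/* HELP text */' 'len = a.b;' 'len = aXb;' > "$dir/menu.c"

# \" passes a literal quote through the shell, so grep looks for "HELP"
# with the quotes included and skips the comment line.
quoted=$(grep "\"HELP\"" "$dir/menu.c")
printf '%s\n' "$quoted"    # puts("HELP");

# \. makes grep treat the dot as a plain dot rather than "any character",
# so aXb is not matched.
dotted=$(grep "a\.b" "$dir/menu.c")
printf '%s\n' "$dotted"    # len = a.b;

rm -rf "$dir"
```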
Matching on either case (the -i option)
Usually, when looking up something in the code, you will want to ignore the case of the string (that is, you will want to match, even if you did not get the capitalization correct). To tell GREP to ignore case, use the -i switch.
For example,
grep -i "reset" *.c
will find all occurrences of reset in any C file, even if the r is capitalized.
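To compare the two behaviours side by side, here is a small sketch that runs the search with and without -i. It assumes a POSIX shell, and io.c with its four lines is made up for the example.

```shell
dir=$(mktemp -d)
printf '%s\n' 'void Reset(void);' '#define RESET_PIN 4' 'int reset_count;' 'int restart;' > "$dir/io.c"

# Without -i only the all-lowercase spelling matches.
exact=$(grep "reset" "$dir/io.c")
printf '%s\n' "$exact"     # int reset_count;

# With -i, Reset, RESET and reset all match; restart still does not.
anycase=$(grep -i "reset" "$dir/io.c")
printf '%s\n' "$anycase"

rm -rf "$dir"
```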
Getting the line numbers for the match (the -n option)
If you are trying to fix code, you may want the line numbers for the lines which match the expression. GREP provides the -n option to give the line numbers at the start of the line.
For example:
grep -ni "reset" *.c
will find all occurrences of reset, independent of capitalization, and report the line numbers. This gives you a list of places you might want to check.
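The output format is easiest to see on a tiny file. This is only a sketch under the assumption of a POSIX shell; io.c and io2.c are hypothetical files created just for the demonstration.

```shell
dir=$(mktemp -d)
printf '%s\n' 'void Reset(void);' 'int restart;' '#define RESET_PIN 4' > "$dir/io.c"

# One file: each match is prefixed with its line number.
single=$(grep -ni "reset" "$dir/io.c")
printf '%s\n' "$single"
# 1:void Reset(void);
# 3:#define RESET_PIN 4

# Several files: the file name comes first, then the line number.
cp "$dir/io.c" "$dir/io2.c"
multi=$(cd "$dir" && grep -ni "reset" io.c io2.c)
printf '%s\n' "$multi"

rm -rf "$dir"
```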
Searching for 2 words on the same line ( "word1.*word2" )
To search for 2 words on the same line, where you know the order in which they will appear, but not what will appear between them, you can use the command:
grep -in "word1.*word2" *.c
This finds all lines containing word1, followed by any number of any characters, followed by word2. The sequence .* tells GREP to match any character (.) any number of times (*).
Note: Using * tells GREP that it can match . (any character) zero or more times. If you want to specify one or more times, use + instead. In versions of grep that default to basic regular expressions, such as GNU grep, + is only treated as special when you add the -E option.
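The difference between "zero or more" and "one or more" only shows up when the two words touch. The sketch below assumes a POSIX shell and a grep that supports -E for the + operator; the words open and close and the file calls.c are stand-ins chosen for illustration.

```shell
dir=$(mktemp -d)
printf '%s\n' 'open(f); close(f);' 'close(f); open(f);' 'openclose();' 'open(f);' > "$dir/calls.c"

# .* allows nothing at all between the words, so openclose() matches too.
# Line 2 has the words in the wrong order and is skipped.
star=$(grep -n "open.*close" "$dir/calls.c")
printf '%s\n' "$star"
# 1:open(f); close(f);
# 3:openclose();

# .+ demands at least one character in between (-E enables + here).
plus=$(grep -En "open.+close" "$dir/calls.c")
printf '%s\n' "$plus"
# 1:open(f); close(f);

rm -rf "$dir"
```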
Searching for assignments to a variable ("var *=[^=]")
To get GREP to search for all assignments to a variable called var, you can use the command:
grep "var *=[^=]" *.c
This tells GREP to match any line containing var, followed by any number of spaces ( *), followed by an equals sign, followed by any character which is not an equals sign ([^=]). That last part rules out the == comparison.
This use of the character set ([]) and its "not a member" option (the ^ as the first character in the set) can be used for several other complicated searches.
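Here is how the assignment pattern behaves on a handful of lines. Treat it as a sketch: it assumes a POSIX shell, and state.c and its contents are hypothetical.

```shell
dir=$(mktemp -d)
printf '%s\n' 'var = 5;' 'if (var == 5)' 'var=7;' 'other = var;' > "$dir/state.c"

# Lines 1 and 3 assign to var (with and without spaces around =).
# Line 2 is rejected because the character after = is another =.
# Line 4 only reads var, so there is no = after it.
assigns=$(grep -n "var *=[^=]" "$dir/state.c")
printf '%s\n' "$assigns"
# 1:var = 5;
# 3:var=7;

rm -rf "$dir"
```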
Searching for setting a specific value (" =[^=] *value")
To get GREP to search for all assignments which assign a specific value to any variable (for example, for looking for changes to a specific state), you can use the command:
grep " =[^=] *value" *.c
This tells GREP to match any occurrence of =, followed by something which is not an equals sign, followed by any number of spaces, then followed by the value of interest.
Note that the use of =[^=] to skip over equality comparisons only works because of our coding standard which surrounds equals signs with spaces. That allows GREP to find the 'non-equal-sign' character before finding the value.
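The role of the space after the equals sign is visible in a small test. This sketch assumes a POSIX shell; the value IDLE and the file fsm.c are placeholders, not names from any real project.

```shell
dir=$(mktemp -d)
printf '%s\n' 'state = IDLE;' 'if (state == IDLE)' 'mode =  IDLE;' 'state = BUSY;' > "$dir/fsm.c"

# In "state = IDLE" the space after = is consumed by [^=], then " *"
# matches zero further spaces. In "state == IDLE" the character after
# the first = is another =, so the comparison is skipped.
sets=$(grep -n " =[^=] *IDLE" "$dir/fsm.c")
printf '%s\n' "$sets"
# 1:state = IDLE;
# 3:mode =  IDLE;

rm -rf "$dir"
```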
Find all conditionals relating to TIME ("#if.*_TIME[^a-zA-Z_]")
You can get GREP to find all conditionals based on symbols ending in _TIME by specifying:
"#if.*_TIME[^a-zA-Z_]"
as the regular expression.
This tells GREP to match any line which contains #if, followed by any number of any character, followed by _TIME, followed by anything but a letter or the underscore character.
This should find any conditional compile which uses any symbol ending in _TIME, including those with trailing comments or with multiple symbols in the condition.
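As a rough check of the expression, the sketch below runs it over a few conditionals. It assumes a POSIX shell, and the symbol names and the file cfg.h are invented for the example.

```shell
dir=$(mktemp -d)
printf '%s\n' \
  '#if BOOT_TIME /* startup */' \
  '#ifdef RUN_TIME_CHECKS' \
  '#if defined(IDLE_TIME) && DEBUG' \
  '#if DEBUG' \
  '#ifdef WAKE_TIME' > "$dir/cfg.h"

# Line 2 is rejected because _TIME is followed by an underscore.
# Line 5 is also missed: [^a-zA-Z_] must match a real character,
# and nothing follows _TIME on that line.
conds=$(grep -n "#if.*_TIME[^a-zA-Z_]" "$dir/cfg.h")
printf '%s\n' "$conds"
# 1:#if BOOT_TIME /* startup */
# 3:#if defined(IDLE_TIME) && DEBUG

rm -rf "$dir"
```

If symbols can sit at the very end of a line in your code, a second search with "#if.*_TIME$" would pick up those cases.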
Find something, excluding marked lines from the search
If you mark your source code with initial characters to indicate sections which have been #if'd out, then you can make grep skip these lines by adding a check to the beginning of the search string. The following example assumes that lines to be ignored are marked with either * or > in the first column.
Here you tell grep to find any line whose first character is not * or >, followed by any number (including zero) of any character, followed by the string you are looking for (in this example, the text searching for).
grep -n "^[^*>].*searching for" *.?
If you want to restrict the search even further, to eliminate ASSERT lines, and if you know that none of the valid lines will have A as the first real character, then you can build a more exacting match by telling grep to require any number of spaces, followed by something which is neither a space nor an A.
Note that you need to tell grep that you want spaces followed by something which is not a space. Otherwise it will match anything, since it can use the last space to satisfy the "not A" requirement.
grep -n "^[^*>] *[^ A].*searching for" *.?
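Both exclusion patterns can be compared on one small file. This is a sketch that assumes a POSIX shell; the search text limit and the file calc.c are hypothetical substitutes for whatever you are actually looking for.

```shell
dir=$(mktemp -d)
printf '%s\n' \
  '  count = limit;' \
  '* count = limit;' \
  '> count = limit;' \
  '  ASSERT(limit > 0);' \
  '  total = 0;' > "$dir/calc.c"

# First pattern: drop lines marked with * or > in column one.
unmarked=$(grep -n "^[^*>].*limit" "$dir/calc.c")
printf '%s\n' "$unmarked"
# 1:  count = limit;
# 4:  ASSERT(limit > 0);

# Second pattern: also drop lines whose first non-space character is A.
no_assert=$(grep -n "^[^*>] *[^ A].*limit" "$dir/calc.c")
printf '%s\n' "$no_assert"
# 1:  count = limit;

rm -rf "$dir"
```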
The following are some commands that might be useful if you want to find files in Unix/Linux.