In AWK, the `gsub` function replaces every occurrence of a pattern within a string with another string. It modifies the target in place and returns the number of replacements made. The syntax for using `gsub` is as follows:
gsub(regexp, replacement, target)
- `regexp` is the regular expression to search for.
- `replacement` is the string to replace the matched pattern with.
- `target` is the variable, field, or array element to search and replace within. If it is omitted, `$0` is used.

Here is an example of using `gsub` in AWK:
# Replace all occurrences of "apple" with "orange" in a string
str = "apple banana cherry apple";
gsub("apple", "orange", str);
print str; # Output: orange banana cherry orange
In this example, `gsub` replaces every occurrence of "apple" with "orange" in the variable `str`. Because `gsub` changes `str` in place, printing `str` shows the modified string. The first argument is always treated as a regular expression, so you can also match more complex patterns. Here are some examples:
# Replace all digits with "X" in a string
str = "123 abc 456 def";
gsub(/[0-9]/, "X", str);
print str; # Output: XXX abc XXX def
# Remove all punctuation from a string
str = "Hello, World!";
gsub(/[[:punct:]]/, "", str);
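print str; # Output: Hello World
Note that `gsub` does not support capture groups: the replacement string cannot refer to parenthesized subexpressions with `\1`, `\2`, etc. The only special character in the replacement is `&`, which stands for the entire matched text (write `"\\&"` for a literal ampersand). To reuse captured groups, use the `gensub` function, which is a GNU awk (gawk) extension. It takes the form `gensub(regexp, replacement, how, target)`, where a `how` of `"g"` requests a global replacement, and it returns the modified string instead of changing `target` in place. Here is an example:
# Swap the order of the first and last names in a list (gawk only)
names = "John Smith, Jane Doe, Bob Johnson";
names = gensub(/(\w+)\s+(\w+)/, "\\2, \\1", "g", names);
print names; # Output: Smith, John, Doe, Jane, Johnson, Bob
In this example, the regular expression `(\w+)\s+(\w+)` matches two words separated by whitespace (`\w` and `\s` are also gawk extensions). The first word is captured in group 1, and the second word is captured in group 2. The replacement string `"\\2, \\1"` swaps the order of the two words, separated by a comma and a space; the backslashes are doubled because the replacement is written as a string literal. The string returned by `gensub` is assigned back to `names` and then printed to the console.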
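Capture-group references in the replacement string, as well as the `\w` and `\s` shorthands, are not available in every awk implementation. As a hedged alternative, the sketch below performs the same name swap using only `split`, which should work in any POSIX awk. The shell function name `swap_names` is a hypothetical wrapper chosen for illustration, and the code assumes each entry is exactly two words separated by whitespace.

```shell
# Hypothetical helper: swap "First Last" pairs in a comma-separated list.
swap_names() {
  printf '%s\n' "$1" | awk '{
    n = split($0, people, /, */)      # one array element per person
    out = ""
    for (i = 1; i <= n; i++) {
      split(people[i], parts, " ")    # parts[1] = first name, parts[2] = last name
      out = out (i > 1 ? ", " : "") parts[2] ", " parts[1]
    }
    print out
  }'
}

swap_names "John Smith, Jane Doe, Bob Johnson"
# prints: Smith, John, Doe, Jane, Johnson, Bob
```

Splitting on the delimiter is more verbose than a single regular-expression replacement, but it avoids depending on gawk-specific features.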
The `gsub` function is a powerful tool for string manipulation in AWK. You can use it to search and replace text using regular expressions and to reuse the matched text in the replacement with `&`. When a transformation needs capture groups in the replacement string, gawk's `gensub` function covers that case.
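As a closing sketch, the snippet below shows three behaviors of `gsub` in one place: it returns the number of replacements, it operates on `$0` when no target is given, and `&` in the replacement stands for the matched text. The function name `gsub_demo` and the sample input are assumptions made for illustration only.

```shell
# Hypothetical helper: wrap every run of digits on a line in angle brackets.
gsub_demo() {
  printf '%s\n' "$1" | awk '{
    n = gsub(/[0-9]+/, "<&>")   # no target given, so $0 is modified in place
    print n ": " $0             # n holds the number of replacements made
  }'
}

gsub_demo "order 66 shipped in 3 days"
# prints: 2: order <66> shipped in <3> days
```

Checking the return value is a simple way to tell whether a line was changed at all, for example to print only the lines where a substitution happened.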