2018-02-28

AWK: Get/Extract String Matching Regexp


Example of how to use AWK in order to get/extract/return parts of lines/strings, matching a "dynamic" regular expression (specified through a shell environment variable).

Source file used in this example
$ cat example.txt
Attr[27716, Field[4937190,usage=51]]
Attr[27716, Field[4937191,usage=10]]
Attr[27716, Field[4937192,usage=321]]

Two usages
# Defining the regular expression pattern as a shell variable
$ MY_PATTERN="([0-9]+),usage"

# Using the predefined awk variables RSTART and RLENGTH
# Returns the whole string matching the regexp
$ awk -v PATTERN="${MY_PATTERN}" 'match($0, PATTERN) { print substr( $0, RSTART, RLENGTH) }' example.txt
4937190,usage
4937191,usage
4937192,usage

# Using the capture group & array functionality (three-argument match(), a GNU awk extension) in order to get only the part of the regexp in parentheses
$ awk -v PATTERN="${MY_PATTERN}" 'match($0, PATTERN, array) { print array[1] }' example.txt
4937190
4937191
4937192
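
The three-argument form of match() is specific to GNU awk. Where only a POSIX awk is available, a similar result can be sketched by applying match() a second time to the matched text. The function name "extract_number" is hypothetical, and the inner pattern /[0-9]+/ assumes that the wanted part is the first run of digits inside the match:

```shell
# Hypothetical helper: print the first run of digits inside the text
# matched by the pattern given as $1 (POSIX awk only, no gawk array).
extract_number() {
  awk -v PATTERN="$1" 'match($0, PATTERN) {
    m = substr($0, RSTART, RLENGTH)        # whole match, e.g. "4937190,usage"
    if (match(m, /[0-9]+/))                # narrow it down to the digits
      print substr(m, RSTART, RLENGTH)
  }'
}

# Usage with one line of example.txt
printf '%s\n' 'Attr[27716, Field[4937190,usage=51]]' | extract_number '([0-9]+),usage'
```

For the sample line this prints 4937190. The second match() overwrites RSTART and RLENGTH, which is why the first match is saved in "m" beforehand.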


2018-02-08

Filtering and Splitting Lines With AWK & "while read"


In this example, the lines contained in a ksh/bash string variable are:
  • filtered (by awk)
  • stripped of consecutive duplicate lines (by awk)
  • then split using only the shell's built-in "while read" feature (without "cut", "awk" or other external binaries)

Example: if ${RESULT_STR} (see below) contains:
*********************** Backup Status of DB_29384_SITE1 ************************
Backup Type   Status                   Start Time       Error Message
------------- ------------------------ ---------------- ---------------------------
ARCHIVELOG    COMPLETED                2018.02.07 16:54 
ARCHIVELOG    COMPLETED WITH WARNINGS  2018.02.07 17:00 RMAN-08137: WARNING: ...
ARCHIVELOG    COMPLETED                2018.02.07 20:55 
ARCHIVELOG    COMPLETED WITH WARNINGS  2018.02.07 21:06 RMAN-08137: WARNING: ...
DB INCR       COMPLETED                2018.02.08 02:00 
ARCHIVELOG    COMPLETED                2018.02.08 03:04 
DB INCR       COMPLETED                2018.02.08 03:06 
ARCHIVELOG    COMPLETED WITH WARNINGS  2018.02.08 03:53 RMAN-08137: WARNING: ...
ARCHIVELOG    COMPLETED                2018.02.08 04:54 

AWK: Filtering & Removing Duplicate Lines
The result of
print - "${RESULT_STR}" | 
  awk '/ RMAN-| ORA-/{
    MSG=substr($0, 57)
    if(MSG!=PREV){ 
      print MSG
      PREV=MSG 
    }
  }' 
is:
("/ RMAN-| ORA-/" filters the "interesting" lines, i.e. those containing error/warning messages. "substr($0, 57)" returns the rest of the line starting at column 57, where the "Error Message" column begins. "MSG" and "PREV" are used to suppress a message that is identical to the previously printed one.)
RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process 
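
Because lines that do not match the filter never reach the comparison, "PREV" survives across them: two identical warnings separated only by rows without a message still count as consecutive. A minimal, self-contained sketch of this behaviour follows; the function name "filter_messages" is hypothetical, the rows are rebuilt with printf using the column widths of the report above (14, 25 and 17 characters), and the elided message text "..." is kept as in the listing:

```shell
# Hypothetical wrapper around the awk filter described above.
filter_messages() {
  awk '/ RMAN-| ORA-/ {
    MSG = substr($0, 57)          # text from column 57 to the end of the line
    if (MSG != PREV) {            # skip a repeat of the last printed message
      print MSG
      PREV = MSG
    }
  }'
}

# Rebuild four report rows with the column widths 14, 25 and 17;
# printf reuses the format for each group of four arguments.
printf '%-14s%-25s%-17s%s\n' \
  'ARCHIVELOG' 'COMPLETED'               '2018.02.07 16:54' '' \
  'ARCHIVELOG' 'COMPLETED WITH WARNINGS' '2018.02.07 17:00' 'RMAN-08137: WARNING: ...' \
  'ARCHIVELOG' 'COMPLETED'               '2018.02.07 20:55' '' \
  'ARCHIVELOG' 'COMPLETED WITH WARNINGS' '2018.02.07 21:06' 'RMAN-08137: WARNING: ...' |
  filter_messages
```

Only one "RMAN-08137: WARNING: ..." line is printed, although the two warnings are separated by a row without a message.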

While Read: Splitting The Input
"errNr" gets the characters up to the first space, "errTxt" receives all other characters until the end of line
print - "${RESULT_STR}" | 
  awk '/ RMAN-| ORA-/{
    MSG=substr($0, 57)
    if(MSG!=PREV){ 
      print MSG
      PREV=MSG
    }
  }' |
  while read errNr errTxt ; do
    print - "errNr: ${errNr} / errTxt: ${errTxt} "
  done
Output
errNr: RMAN-08137: / errTxt: WARNING: archived log not deleted, needed for standby or upstream capture process 

Specifying a "while read" Delimiter
The delimiter can be changed from space to colon by using "IFS=':'"
IFS=':'
print - "${RESULT_STR}" |
  awk '/ RMAN-| ORA-/{MSG=substr($0, 57); if(MSG!=PREV){ print MSG; PREV=MSG }}' |
  while read errNr errTxt ; do
    print - "errNr: ${errNr} / errTxt: ${errTxt} "
  done
Output (without ':' after RMAN-08137)
errNr: RMAN-08137 / errTxt: WARNING: archived log not deleted, needed for standby or upstream capture process
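
A plain "IFS=':'" assignment stays in effect for the rest of the script. As a hedged alternative, the assignment can be placed directly in front of "read", so that it applies to that command only. The sketch below uses the hypothetical function name "split_error" and printf instead of the ksh-only "print", so it should run in both ksh and bash. With ':' as the only delimiter, the space after the first colon remains part of "errTxt" and is removed explicitly here:

```shell
# Hypothetical helper: split each input line at the first colon.
split_error() {
  # IFS=':' is set for the read command only; the shell's IFS is untouched.
  while IFS=':' read -r errNr errTxt; do
    errTxt=${errTxt# }          # drop the single space that followed the colon
    printf 'errNr: %s / errTxt: %s\n' "$errNr" "$errTxt"
  done
}

printf '%s\n' 'RMAN-08137: WARNING: ...' | split_error
```

The last variable of "read" receives the whole remainder of the line, so the second colon (after "WARNING") stays inside "errTxt".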


2018-02-02

Subroutine/Procedure/Function that Reads from Standard Input (Pipe)

The procedure "pipe2dbg()" reads from the standard input (from a pipe), copies the input to the standard output and - if enabled - also to a debug log file.
# Enables/disables the debugging 
typeset -r C_DO_DBG="Y"

# Define the path of the debug log file
typeset -r C_DBG_FILE="/tmp/myDebugFile.log"

pipe2dbg () {
  if [[ -n "${C_DBG_FILE}" && "${C_DO_DBG}" == "Y" ]]; then
    # "IFS= read -r" and "print -r" keep leading blanks and backslashes intact
    while IFS= read -r line; do
      print -r - "${line}" | tee -a "${C_DBG_FILE}"
    done
  else
    while IFS= read -r line; do
      print -r - "${line}"
    done
  fi
}
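
Since "tee" reads its standard input itself, the per-line loop is not strictly required. The following is a minimal sketch of that idea, not a drop-in replacement: the function name "pipe2dbg_tee" is hypothetical, and the two settings receive the values used above only if they are not already set.

```shell
# Same settings as in the script above; defaults apply only if unset.
: "${C_DO_DBG:=Y}"
: "${C_DBG_FILE:=/tmp/myDebugFile.log}"

# Hypothetical variant of pipe2dbg() without a per-line loop.
pipe2dbg_tee() {
  if [ -n "${C_DBG_FILE}" ] && [ "${C_DO_DBG}" = "Y" ]; then
    tee -a "${C_DBG_FILE}"     # copy stdin to stdout and append it to the log
  else
    cat                        # debugging disabled: pass stdin through unchanged
  fi
}
```

It is used in the same way, for example: print - "some text" | pipe2dbg_tee. This variant starts one "tee" process per call instead of one per line and passes whitespace and backslashes through unchanged.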

The procedure "pipe2stderr()" reads from the standard input (from a pipe) and copies each line to the standard output, unless the line contains one of the keywords "error", "warning" or "sorry". Such lines are written to the standard error output instead (see "1>&2"), prefixed with "E: ":
pipe2stderr () {
  typeset L_ERR_STR
  while read line; do
    L_ERR_STR="$(print - "${line}" | grep -E -e 'error|warning|sorry')"
    if [[ -n "${L_ERR_STR}" ]]; then
      print - "E: ${L_ERR_STR}" 1>&2
    else
      print - "${line}"
    fi
  done
}
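
The keyword test can also be done by the shell itself with a "case" pattern, which avoids starting a "grep" process for every line. A minimal sketch under that assumption, using the hypothetical name "pipe2stderr_case" and printf so that it should run in both ksh and bash; like the version above, the comparison is case-sensitive:

```shell
# Hypothetical variant of pipe2stderr() using shell pattern matching.
pipe2stderr_case() {
  while IFS= read -r line; do
    case "${line}" in
      *error*|*warning*|*sorry*)
        printf 'E: %s\n' "${line}" 1>&2 ;;   # keyword found: write to stderr
      *)
        printf '%s\n' "${line}" ;;           # everything else: write to stdout
    esac
  done
}

printf '%s\n' 'all fine' 'an error occurred' | pipe2stderr_case
```

Here "all fine" goes to the standard output and "E: an error occurred" to the standard error output.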


Usage examples in ksh/bash scripts.

Example 1
. . .
print - "Write this to the standard output and to the debug log file" | pipe2dbg
. . .

Example 2
Writing the output of Oracle sqlplus SQL statements to the standard output, with the option to enable or disable the additional output to a debug log file:

. . .
sqlplus -SILENT /NOLOG <<EOSQL | pipe2dbg
  CONNECT <DbUser>/<Password>
  <SQL Statements>
EOSQL
. . .