Linux/grep


grep

Print only matching:

grep -o '[PATTERN]'
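
As a small illustration of -o, the sketch below prints each run of digits on its own line instead of printing the whole matching line. The sample string is made up for this example.

```shell
# Hypothetical sample input; -o emits every match on a separate line.
printf '%s\n' 'abc123def456' | grep -oE '[0-9]+'
```

This should print 123 and 456 on separate lines.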

Get IP addresses:

ifconfig | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}'

IP=$(curl -s ip.oeey.com)
echo "$IP" | grep -o '^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}$'
echo $?  # 0 if the response looks like a dotted quad, 1 otherwise
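
The pattern above checks only the shape of the address, so a string such as 999.999.999.999 also passes. A minimal sketch of a stricter check follows; is_ipv4 is a hypothetical helper name, not part of grep, and it assumes a POSIX shell.

```shell
# Hypothetical helper: succeed only for a dotted quad whose octets are 0-255.
# The body runs in a subshell so the IFS change does not leak to the caller.
is_ipv4() (
    # Shape check: four groups of 1-3 digits separated by dots.
    printf '%s\n' "$1" | grep -Eqx '([0-9]{1,3}\.){3}[0-9]{1,3}' || exit 1
    # Split on dots and range-check each octet.
    IFS=.
    set -- $1
    for octet in "$@"; do
        [ "$octet" -le 255 ] || exit 1
    done
    exit 0
)
```

It could be used as: is_ipv4 "$IP" && echo "valid"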

Ignore fully commented lines and empty lines:

grep -v -e '^\s*#' -e '^$' [file]
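
A possible portable variant, sketched here under the assumption that whitespace-only lines should be dropped too: it uses the POSIX class [[:space:]] instead of \s, which is a GNU extension. strip_comments is a hypothetical helper name and the sample lines are made up.

```shell
# Hypothetical helper: drop lines that are blank, whitespace-only,
# or whose first non-blank character is '#'. Reads files or stdin.
strip_comments() {
    grep -Ev '^[[:space:]]*(#|$)' "$@"
}

# Made-up sample input; trailing comments on real settings are kept.
printf '%s\n' '# comment' '' '   # indented comment' 'port = 22' 'host = example  # kept' |
    strip_comments
```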

OR

grep -e pattern1 -e pattern2 filename

grep -E "pattern1|pattern2" filename

grep "pattern1\|pattern2" filename  # \| in a basic regex is a GNU extension

ref: [1]
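
To see that the -e and -E forms select the same lines, here is a short sketch on made-up input. The sample function and the patterns error and warn are placeholders for this example only.

```shell
# Hypothetical sample input for the demonstration.
sample() {
    printf '%s\n' 'error: disk full' 'info: all good' 'warn: low memory'
}

sample | grep -e 'error' -e 'warn'   # one -e option per alternative
sample | grep -E 'error|warn'        # extended regex alternation
```

Both commands should print the error and warn lines and skip the info line.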

Show Non-ASCII Characters

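# [^\x00-\x7F] works in both UTF-8 and C locales:
grep -a --color='auto' -P -n "[^\x00-\x7F]" file.xml
# [\x80-\xFF] is meant as a byte range; in a UTF-8 locale -P reads it as U+0080 to U+00FF, so force the C locale:
LC_ALL=C grep -a --color='auto' -P -n "[\x80-\xFF]" file.xml

echo '소녀시대' | LC_ALL=C grep -P "[\x80-\xFF]"

grep -axv '.*' file.txt  # prints only lines that are invalid in the current locale's encoding, so valid UTF-8 input produces no output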

Sample: https://www.w3.org/2001/06/utf-8-wrong/UTF-8-test.html

ref: https://stackoverflow.com/questions/3001177/how-do-i-grep-for-all-non-ascii-characters
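
A minimal sketch that wraps the byte-oriented form in a function and tries it on made-up input. non_ascii_lines is a hypothetical helper name, and the sketch assumes a grep built with -P (PCRE) support.

```shell
# Hypothetical helper: list lines containing any byte outside 0x00-0x7F,
# prefixed with their line numbers. LC_ALL=C makes grep work on raw bytes.
non_ascii_lines() {
    LC_ALL=C grep -anP '[^\x00-\x7F]' "$@"
}

# Made-up sample: the second line contains "é" encoded as bytes C3 A9.
printf 'plain\ncaf\303\251\n' | non_ascii_lines
```

Only the second sample line should be reported.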

Detect Corrupted Unicode Characters

awk '/[^\x00-\x7F]/{ print NR ":", $0 }' file.txt

$ awk '/[^\x00-\x7F]/{ print NR ":", $0 }' file
1: Interruptor EC nÃ£o estÃ¡ em DESLOCAR
4: è¾…åŠ©é©¾é©¶å®¤é—¨å…³é—
5: Porte cab. aux. fermÃ©e
7: Ð”Ð²ÐµÑ€ÑŒ Ð°Ð¿Ð¿Ð°Ñ€Ð°Ñ‚Ð½Ð¾Ð¹ ÐºÐ°Ð¼ÐµÑ€Ñ‹ Ð·Ð°ÐºÑ€Ñ‹Ñ‚Ð°
13: é«˜åŽ‹ä¿æŠ¤æ‰‹æŸ„å‘ä¸‹
14: BarriÃ¨re descendue
16: ÐžÐ³Ñ€Ð°Ð½Ð¸Ñ‡. ÐŸÐ»Ð°Ð½ÐºÐ° Ð’Ð’Ðš Ð¾Ð¿ÑƒÑ‰.
19: Barra de separaÃ§Ã£o descida
22: DPæœªå¯åŠ¨
23: Puiss. rÃ©p. non activÃ©e
25: !!! Ð’Ð½ÐµÑˆÐ½ÑÑ Ð¼Ð¾Ñ‰Ð½Ð¾ÑÑ‚ÑŒ Ð½Ðµ Ð²ÐºÐ»ÑŽÑ‡ÐµÐ½Ð°
26: PotÃªncia Dist NÃ£o Ativada
28: PotÃªncia dist nÃ£o activada
31: æœºè½¦æœªç§»åŠ¨
33: Motor no se estÃ¡ moviendo
34: Ð›Ð¾ÐºÐ¾Ð¼Ð¾Ñ‚Ð¸Ð² Ð½ÐµÐ¿Ð¾Ð´Ð²Ð¸Ð¶ÐµÐ½
35: Auto NÃ£o se Movendo
37: A nÃ£o se move
40: æœºè½¦çŠ¶å†µå…è®¸è‡ªåŠ¨åœæœº
41: Conditions auto\npermettent arrÃªt auto
43: Ð£ÑÑ‚Ð°Ð½Ð¾Ð²ÐºÐ¸ Ð»Ð¾ÐºÐ¾Ð¼Ð¾Ñ‚Ð¸Ð²Ð°\nÐŸÑ€ÐµÐ´ÑƒÑÐ¼Ð°Ñ‚Ñ€Ð¸Ð²Ð°ÑŽÑ‚ Ð     °Ð²Ñ‚Ð¾Ð¼Ð°Ñ‚Ð¸Ñ‡ÐµÑÐºÑƒÑŽ Ð¾ÑÑ‚Ð°Ð½Ð¾Ð²ÐºÑƒ
44: CondiÃ§Ãµes da moto\nPermitem Auto Parada

Ref: https://stackoverflow.com/questions/30738924/detecting-corrupt-characters-in-utf-8-encoded-text-file

--

Assuming your locale is set to UTF-8 (check the output of the locale command), this finds lines that contain invalid UTF-8 sequences:

grep -axv '.*' file.txt

Ref: https://stackoverflow.com/questions/29465612/how-to-detect-invalid-utf8-unicode-binary-in-a-text-file
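
When the current locale is not UTF-8, one possible alternative is to let iconv validate the file instead of grep. This is a sketch using a different tool from the one above, and it only reports whether the whole file is valid, not which lines are bad. is_valid_utf8 is a hypothetical helper name.

```shell
# Hypothetical helper: exit status 0 if the file decodes as UTF-8,
# non-zero otherwise. Uses iconv rather than grep, so it does not
# depend on the current locale.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

It could be used as: is_valid_utf8 file.txt || echo "file.txt contains invalid UTF-8"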

keywords

Linux/grep
