Linux/grep
grep
Print only the matching part of each line:
grep -o '[PATTERN]'
Get IP addresses:
ifconfig | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}'
IP=`curl -s ip.oeey.com`
echo "$IP" | grep -o '^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}$'
echo $? # 0 on success, 1 on fail
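The exit-status check above can be wrapped in a small function. This is a minimal sketch: `is_ipv4` is a hypothetical helper name, and the pattern only checks the dotted-quad shape, so it does not reject out-of-range octets such as 999.

```shell
# is_ipv4 is a hypothetical helper name, not part of grep.
is_ipv4() {
    # -q: print nothing, report the result through the exit status only
    # -x: the pattern must match the whole line
    # -E: extended regex, so {1,3} and ( ) need no backslashes
    printf '%s\n' "$1" | grep -qxE '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

if is_ipv4 "192.168.0.1"; then
    echo "looks like an IPv4 address"
fi
```

Using `-q` with `-x` avoids both the `-o` output and the explicit `^`/`$` anchors when only the yes/no answer matters.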
Ignore fully commented lines:
cat [file] | grep -v "^\s*#" | grep -v "^$"
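The two filters can probably be folded into a single grep call. A sketch, assuming a grep that supports `-E` and POSIX character classes; `strip_comments` is a hypothetical helper name. Unlike `grep -v "^$"`, this version also drops lines that contain only whitespace.

```shell
# strip_comments is a hypothetical helper name.
strip_comments() {
    # -E: extended regex; -v: print the lines that do NOT match
    # Drops lines that are empty, whitespace-only, or whose first
    # non-blank character is '#'. Trailing comments are left alone.
    grep -Ev '^[[:space:]]*(#|$)' "$@"
}

# Reads the named files, or standard input when none are given.
printf 'a=1\n# comment\n\n  # indented\nb=2 # trailing\n' | strip_comments
```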
OR
grep -e pattern1 -e pattern2 filename
grep -E "pattern1|pattern2"
grep "pattern1\|pattern2"
ref: [1]
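A quick way to compare the three OR forms is to run them on the same input. The sample words below are made up for illustration. As far as I know, `\|` in a basic regex is a GNU extension, so the first two forms are the safer choice for portable scripts.

```shell
# Sample input; the words are made up for illustration.
sample=$(printf 'apple\nbanana\ncherry\n')

# Each command selects the lines containing "apple" or "cherry".
printf '%s\n' "$sample" | grep -e apple -e cherry   # several -e patterns
printf '%s\n' "$sample" | grep -E 'apple|cherry'    # extended regex alternation
printf '%s\n' "$sample" | grep 'apple\|cherry'      # basic regex, GNU-style \|
```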
Show Non ASCII Characters
grep -a --color='auto' -P -n "[\x80-\xFF]" file.xml
grep -a --color='auto' -P -n "[^\x00-\x7F]" file.xml
echo '소녀시대' | grep -P "[\x80-\xFF]"
grep -axv '.*' file.txt # prints only lines with invalid byte sequences, and only in a UTF-8 locale
Sample: https://www.w3.org/2001/06/utf-8-wrong/UTF-8-test.html
ref: https://stackoverflow.com/questions/3001177/how-do-i-grep-for-all-non-ascii-characters
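Whether `\x80-\xFF` is read as bytes or as code points appears to depend on the locale: in a UTF-8 locale, `-P` generally treats the range as U+0080 to U+00FF, which would not cover Korean text. A sketch that forces byte-wise matching, assuming GNU grep built with PCRE support; `non_ascii_lines` is a hypothetical helper name.

```shell
# non_ascii_lines is a hypothetical helper name.
non_ascii_lines() {
    # LC_ALL=C: treat the input as raw bytes, independent of the user's locale
    # -a: process the input as text even if it looks binary
    # -n: prefix each match with its line number
    # -P: Perl-compatible regex (needs GNU grep with PCRE support)
    LC_ALL=C grep -a -n -P '[^\x00-\x7F]' "$@"
}

# The second line contains the two-byte UTF-8 encoding of "é".
printf 'plain\ncaf\303\251\n' | non_ascii_lines
```

With the sample input, only the second line should be reported.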
Detect Corrupted Unicode Characters
awk '/[^\x00-\x7F]/{ print NR ":", $0 }' file.txt
$ awk '/[^\x00-\x7F]/{ print NR ":", $0 }' file
1: Interruptor EC não está em DESLOCAR
4: è¾…åŠ©é©¾é©¶å®¤é—¨å…³é—
5: Porte cab. aux. fermée
7: Дверь аппаратной камеры закрыта
13: 高压ä¿æŠ¤æ‰‹æŸ„å‘下
14: Barrière descendue
16: Огранич. Планка ВВК опущ.
19: Barra de separação descida
22: DP未å¯åŠ¨
23: Puiss. rép. non activée
25: !!! ВнешнÑÑ Ð¼Ð¾Ñ‰Ð½Ð¾ÑÑ‚ÑŒ не включена
26: Potência Dist Não Ativada
28: Potência dist não activada
31: 机车未移动
33: Motor no se está moviendo
34: Локомотив неподвижен
35: Auto Não se Movendo
37: A não se move
40: 机车状况å…许自动åœæœº
41: Conditions auto\npermettent arrêt auto
43: УÑтановки локомотива\nПредуÑматривают Ð °Ð²Ñ‚оматичеÑкую оÑтановку
44: Condições da moto\nPermitem Auto Parada
Ref: https://stackoverflow.com/questions/30738924/detecting-corrupt-characters-in-utf-8-encoded-text-file
--
Assuming you have your locale set to UTF-8 (see locale output), this works well to recognize invalid UTF-8 sequences:
grep -axv '.*' file.txt
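When the locale cannot be relied on, a cross-check that does not depend on it is to round-trip the file through `iconv`. This is a different technique from the grep command above, shown only as a sketch; `is_valid_utf8` is a hypothetical helper name, and the behaviour described assumes an `iconv` that exits non-zero on undecodable input, as glibc's does.

```shell
# is_valid_utf8 is a hypothetical helper name.
is_valid_utf8() {
    # Converting UTF-8 to UTF-8 changes nothing, but iconv stops with a
    # non-zero exit status at the first byte sequence it cannot decode.
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Build two small test files: one valid, one with a stray 0xFF byte.
good=$(mktemp)
bad=$(mktemp)
printf 'caf\303\251\n' > "$good"   # "café" encoded as valid UTF-8
printf 'caf\377\n'     > "$bad"    # 0xFF never appears in valid UTF-8

is_valid_utf8 "$good" && echo "good: valid UTF-8"
is_valid_utf8 "$bad"  || echo "bad: invalid UTF-8"
rm -f "$good" "$bad"
```

The function only answers yes or no; `grep -naxv '.*'` in a UTF-8 locale is still the way to find which lines are affected.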