
2010-10-11

Nessus XML parsing with awk

At the office, I use Nessus for automated network scanning and patch auditing. With credentials and proper tuning of the scan policy, Nessus is a powerful tool for much more than script-kiddie network scanning. That leaves me with a whole bunch of data to wade through on a weekly basis.

Usually, I only concern myself with the high-severity issues for weekly reports, then as I have time, I dig deeper into the more trivial problems. Still, this required me to manually open the scan files, filter them by severity, and export the data. I got tired of that and made a quick and really dirty XML parser (.nessus files are XML) with shell and grep. It was horrendously slow.

Andy, a fellow KC2600-er, helped me wrap my brain around some of the finer points of awk to make it more efficient. This is a slightly modified version of the one I use at work, which is part of a bigger script that does other things. I figure it's useful for others who use Nessus regularly. The script is here.

Basically, it stores the HostName tag when it encounters one, then iterates through the lines that follow, holding each temporarily until it hits a line indicating that a high-severity plugin (severity level 3) was triggered. At that point it spits out the host name and the plugin that fired. I probably could have written the whole thing in awk, but I wrapped it in a little bit of plain old shell script.
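The core of that loop can be sketched in awk alone. The tag names below are only illustrative (real .nessus files are more involved than this), but the store-then-print pattern is the same:

```shell
# A minimal sketch of the parser's logic. Splitting fields on < and >
# puts each tag's text in $3. Hold the most recent host and plugin
# name; on a severity-3 line, emit host:plugin.
awk -F'[<>]' '
  /<HostName>/   { host = $3 }              # remember the current host
  /<pluginName>/ { plugin = $3 }            # remember the last plugin seen
  /<severity>3</ { print host ":" plugin }  # high severity: report it
' <<'EOF'
<HostName>x.x.x.19</HostName>
<pluginName>MS10-062</pluginName>
<severity>3</severity>
<pluginName>Low-risk finding</pluginName>
<severity>1</severity>
EOF
```

The heredoc stands in for a scan file; only the severity-3 finding is printed, paired with the host it belongs to.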

Output looks something like this:


Windows
----------------------------------------------------
x.x.x.19:MS10-062: Vulnerability in MPEG-4 Codec Could Allow Remote Code Execution (975558)
x.x.x.19:Adobe Reader <= 9.3.4 / 8.2.4 CoolType.dll SING Font 'uniqueName' Field Parsing Overflow (APSA10-02)
x.x.x.20:MS10-066: Vulnerability in Remote Procedure Call Could Allow Remote Code Execution (982802)

Mac
----------------------------------------------------
x.x.x.8:Mac OS X AFP Shared Folders Unauthenticated Access (Security Update 2010-006) (uncredentialed check)

Linux
----------------------------------------------------
x.x.x.40:PHP 5.2 < 5.2.14 Multiple Vulnerabilities

2009-12-04

awk over non-interactive ssh sessions

I had trouble wrapping my brain around this one today. I was attempting to grab the second field of ps(1) output to display only the PID of a given process. Of course, if we had pgrep and pkill, this would be a no-brainer. But we don't.

Note: when I use grep on ps(1) output, I also pipe it through grep -v grep (which drops any line containing the pattern "grep") so that grepping for some_process doesn't end up like this:

root 751 200 0 Jun 25 ? 3:11 /usr/sbin/some_process -d
axon 2429 222 0 4:54:59 pts/1 0:00 grep some_process

First attempt:

ssh somebox "ps -ef | grep some_process | grep -v grep | awk '{print $2}' "
The output, though, was the whole line out of ps.

root 751 200 0 Jun 25 ? 3:11 /usr/sbin/some_process -d

I attempted escaping the ticks, double ticks, escaped double ticks, double quotes, and all kinds of madness. Nothing was working. I was getting either a whole ps line (as if awk wasn't even there) or syntax errors from awk.

Finally, I asked a co-worker (the biggest shell geek I know), who pointed out that the command was being frobbed by two separate shells. He gave me a somewhat complicated line to use, but I figured out an easier way. The local shell was expanding $2 inside the double quotes before ssh ever sent the command, treating it as a reference to a shell argument, so awk received an empty print statement. The solution? Escape $2.

ssh somebox "ps -ef | grep some_process | grep -v grep | awk '{print \$2}' "
751

Hooray! Hopefully this helps some poor sysadmin somewhere when the time comes to reference variables remotely in something like perl or awk.
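Incidentally, the same quoting behavior is easy to reproduce locally, with sh -c standing in for the shell that ssh spawns on the remote end:

```shell
# The outer double quotes let the current shell expand $2 (empty when the
# script has no second argument), so the inner shell runs awk '{print }'
# and the whole line comes back:
sh -c "echo 'a b c' | awk '{print $2}'"     # prints: a b c

# Escaping the dollar sign passes a literal $2 through to the inner
# shell, whose single quotes then preserve it for awk:
sh -c "echo 'a b c' | awk '{print \$2}'"    # prints: b
```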

2009-01-16

Twitter Followers/Friends from the CLI

I started getting curious on twitter. I had two questions:
  1. Who am I following that's not following me back? (i.e. can Martin Roesch hear me? The answer is no, he can't)
  2. Who is following me that I'm not following back?
Already familiar enough with the Twitter API, I threw together some quick and ugly command-line foo, and @digitaljestin wanted to know how I did it. This is REALLY ugly, and could use a lot of refinement. That said:

I'll probably program a quick stand-alone newlisp or php tool for this over the weekend. Regardless, here's how I did it on the CLI.

First, Twitter will only hand you 100 friends and followers at once. If I were going to automate this, I would poll the followers_count and following_count attributes from http://twitter.com/users/show/username.xml to figure out how many "pages" I needed to fetch.

If you have 203 followers, you will have to do three requests for follower info. Same with friends (those whom you follow). I had over 200 (but less than 300) each. So I did 3 of each request.
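Automating that page count is just ceiling division. A sketch with shell arithmetic (the count is hard-coded here, where a real script would pull followers_count out of the users/show response):

```shell
# 100 results per page, so round up: (n + 99) / 100 in integer
# arithmetic. 203 followers -> 3 pages.
followers_count=203
pages=$(( (followers_count + 99) / 100 ))
echo "$pages"    # prints: 3
```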

I'm only interested in the screen_name attribute within the XML of each. Note that I'm doing a lot of cheap grep | awk crap here, so it just builds lists of screen names without any markup. (wget needs -qO- to dump each page to stdout instead of saving it to a file.)

$ wget -qO- http://user:password@twitter.com/statuses/followers.xml \
| grep "<screen_name>" | awk -F'[<>]' '{print $3}' > followers.txt
$ wget -qO- http://user:password@twitter.com/statuses/followers.xml\?page=2 \
| grep "<screen_name>" | awk -F'[<>]' '{print $3}' >> followers.txt
$ wget -qO- http://user:password@twitter.com/statuses/followers.xml\?page=3 \
| grep "<screen_name>" | awk -F'[<>]' '{print $3}' >> followers.txt

$ wget -qO- http://user:password@twitter.com/statuses/friends.xml \
| grep "<screen_name>" | awk -F'[<>]' '{print $3}' > friends.txt
$ wget -qO- http://user:password@twitter.com/statuses/friends.xml\?page=2 \
| grep "<screen_name>" | awk -F'[<>]' '{print $3}' >> friends.txt
$ wget -qO- http://user:password@twitter.com/statuses/friends.xml\?page=3 \
| grep "<screen_name>" | awk -F'[<>]' '{print $3}' >> friends.txt

Then, I just sorted them:
$ sort friends.txt > friends-sort.txt
$ sort followers.txt > followers-sort.txt

Using diff, it's easy to tell who is not following you back, and whom you aren't following.
The < marks lines that appear only in the first file (people you follow who don't follow you). The > marks lines that appear only in the second file (people who follow you whom you don't follow back). Grepping for only the lines that start with < or > avoids all the patch-file line-offset stuff. Some diffs have flags to do this directly, but letting grep filter it works across more platforms.

$ diff friends-sort.txt followers-sort.txt | grep "[<>]" | sort
[excerpt]
< H_i_R
< Hak5
< KCWeather
< Scobleizer
< Veronica
< bacontwits
< beseKUre
< brightkite
< datalossdb
< hackadaydotcom
< ihacked
< ihackstuff
< kingpin_
< milw0rm
< mroesch
< obsessable
< om
< packetlife
< pauldotcom
< schneier
< textfiles
< wilw
< window
------------------ (split added by ax0n)
> BlackHatUSA
> Computersaurus
> HacClearwater
> HackersAlerts
> HackerspacesBot
> SOURCEBoston
> SecuritySatan
> quine
> reverz
> rsreese
> secureideas
> securitypro2009
> stopthemanga
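Since both lists are sorted, comm(1) does the same split without the patch-style markers: -23 prints lines only in the first file, -13 prints lines only in the second. A sketch with made-up names standing in for the real lists:

```shell
# Two small sorted lists standing in for the real ones.
printf 'alice\nbob\ncarol\n' > friends-sort.txt
printf 'bob\ndave\n'         > followers-sort.txt

# People you follow who don't follow you back:
comm -23 friends-sort.txt followers-sort.txt   # alice, carol
# People following you whom you don't follow back:
comm -13 friends-sort.txt followers-sort.txt   # dave
```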

2008-02-07

UNIX tip of the day: More awk recipes

Awk, as we have mentioned before, is a ridiculously handy utility that often goes under-appreciated by systems administrators and UNIX geeks alike.

A few days ago, a colleague of mine told me about some complex awk magic that he'd implemented in order to acquire not only the matching line of an input stream (in this case, a log file), but the two lines prior to that matching line, which contained some useful information about what was going on. It was an elaborate solution that worked well, but I swooped in with a much simpler recipe to do the same thing. This prints the two lines prior to the matching expression as well as the line containing the expression. Certain incarnations of grep can do the same thing (GNU grep's -B 2, for instance), but this way you can also format the lines if you know your awk-fu.

$ awk '/some-regex/{print two "\n" one "\n" $0};{two=one};{one=$0}' /some/file.log
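For instance, against a made-up log where the useful context sits just above the matching line:

```shell
# "two" holds the line two back, "one" the previous line. On a match,
# print both held lines plus the current one, oldest first.
awk '/ERROR/{print two "\n" one "\n" $0};{two=one};{one=$0}' <<'EOF'
request id 17
payload too large
ERROR: rejected
all quiet
EOF
```

This prints the ERROR line preceded by the two lines before it; nothing else in the stream comes through.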

I actually keep a bunch of handy commands and UNIX tips written down. These are things that I know I won't need to use very frequently, but know I'll eventually need again.

I'll share some more awk magic with you from within its pages.

Get only the last field of a line that matches a regex:
$ awk '/some-regex/{print $NF}' /some/file

This works because NF contains the number of fields found on the line, so $NF contains the value of the last field. Just as $1 holds the value of the first field, if NF is 5, $NF holds the value of the fifth (and last) field. I love this one.

Example:
I'll set the field separator to a / and use awk to get only the last entry from the directory structure with find:

Raw find output:
$ find .
.
./.localized
./images
./images/apache_pb.gif
./images/gradient.jpg
./images/macosxlogo.png
./images/web_share.gif
./index.html


Now with awk:
$ find . | awk -F/ '/images/{print $NF}'
images
apache_pb.gif
gradient.jpg
macosxlogo.png
web_share.gif


If you're into AIX, a lot of the configuration files are in "stanza" format: a label, followed by a bunch of data, with a blank line between records. Awk can pull just the one stanza you want from a stanza file. The example here is the /etc/security/user file on AIX, which tracks security profile information for every user on the system. The "default" stanza is an important one, as anything within it gets propagated to all users first; any deviations from the default then happen in the users' own stanzas:
# awk '/^default/,/^$/ {print}' /etc/security/user

Truth be told, stanza format and some of its variants are popular in other operating systems, but this particular awk recipe works best on AIX.
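Here's the range pattern against a toy stanza file (fabricated, rather than a real /etc/security/user), so you can see exactly where it starts and stops:

```shell
# A fabricated two-stanza file in the same shape as AIX's stanza configs.
cat > /tmp/stanza-demo <<'EOF'
default:
        umask = 022
        rlogin = true

root:
        rlogin = false
EOF

# Print from the line matching ^default through the next blank line;
# the root stanza never appears.
awk '/^default/,/^$/ {print}' /tmp/stanza-demo
```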

Have any awk-fu? Let's see some of your favorites. The comments are open!