Technology

HOW TO : Use awk to print values larger than a certain number

Quick how-to on using awk to filter results when a certain value (column) is larger than a set number.

For example, if you have a file (servers.txt) with lines in this format

a_datacenter, servers 20
error, servers xyz
b_datacenter, servers 21
c_datacenter, servers 50

and you want to show only the lines that have a server value larger than 20, you can do this in awk by running

grep datacenter servers.txt | awk '$3 > 20 {print}' | more
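
With the sample file above, this prints

b_datacenter, servers 21
c_datacenter, servers 50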

breaking down the commands

grep – filters the output down to just the lines containing datacenter

awk – $3 > 20 : take the third field (awk separates text using spaces by default) and check whether it is greater than 20

print – print the entire matching line
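
Side note: the grep is doing real work here. Without it, awk would also print the "error, servers xyz" line, because a non-numeric field is compared as a string and "xyz" > "20" is true in string order. Forcing a numeric comparison avoids that, so you can skip grep entirely:

awk '$3+0 > 20 {print}' servers.txt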

HOW TO : Parse IP Address in Windows Batch File

We had a recent challenge at work that required us to execute different actions based on which office a particular workstation was located in. Since we have unique network ranges per office, I thought this would be a good variable to key off. Just for future reference, here is how we accomplished it in a batch file. The workstations were running Windows 7.

[code]

@ECHO OFF

REM grab the third token of the "IP Address" line from netsh
FOR /f "tokens=3" %%I IN (
'netsh interface ip show address "Local Area Connection" ^| findstr /C:"IP Address"'
) DO SET ipAddress=%%I

REM "Office 1"
IF NOT x%ipAddress:10.130=%==x%ipAddress% (
ECHO "Office 1" + %ipAddress%
ECHO "do_something_else" )

REM "Office 2"
IF NOT x%ipAddress:10.140=%==x%ipAddress% (
ECHO "Office 2" + %ipAddress%
ECHO "do_something_else" )

[/code]

Details of the functions used

  • netsh interface ip show address "Local Area Connection" : extracts the IP configuration of just the LAN port
  • findstr /C:"IP Address" : returns the line containing the literal string "IP Address" (without /C:, findstr would treat the space as an OR and match lines containing either word)
  • FOR /f "tokens=3" : uses the FOR loop's parsing to extract the third token of the matching line, which is the address itself
  • IF NOT x%ipAddress:10.130=%==x%ipAddress% : uses variable substitution to strip 10.130 from the address and compares the result with the original; the two differ only when the address contains that office's prefix, which is when the block runs
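
For reference, the netsh line being parsed looks roughly like this (the address itself is a made-up example), which is why "tokens=3" picks up the IP:

[code]IP Address:                           10.130.1.25[/code]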

Update 1 : Application Development : domainScan

Following up on my post earlier this month about building a security application that scans publicly available data (Google) and reports on potential information leakage from a hostname.

I created a repo on GitHub if anyone is interested in contributing. The first thing any good developer does is check code in early and often :). The repo is at https://github.com/kudithipudi/security-domainscan

Here’s the pseudo code I put together as a framework to build on, fleshed out as a minimal Python skeleton (the Google search itself is just a stub for now)

[code]
import sys

def search_google(hostname):
    # connect to the Google API and get the result count for hostname
    # (stub for now; the actual API call is still to be written)
    return 0

def process_line(hostname):
    # look the hostname up and write the result to the log
    results = search_google(hostname)
    print("%s,%d" % (hostname, results))

def read_file(path):
    # open the file and process each line as a hostname
    with open(path) as f:
        for line in f:
            process_line(line.strip())

if __name__ == "__main__":
    read_file(sys.argv[1])
[/code]


HOW TO : Search for a record in MongoDB based on length

Quick entry for my own records.

MongoDB is one of the popular open source document databases in the NoSQL movement. One of the applications we deployed at work uses MongoDB as its internal storage engine. We ran into an issue where data was being replicated from MongoDB to MySQL and the replication stopped because of a size mismatch for an object between the two. Essentially, a record was being inserted into MySQL that was larger than the defined column length.

Here is the query we used to find the culprit objects. We used the awesome Robomongo client to connect to the MongoDB instance.

[code]db.some_table_to_search.find({$where:"this.some_column_to_search.length > 40"})[/code]

Breaking down the command

db -> Specifies the database you are trying to search

some_table_to_search -> Specifies the table (a collection, in MongoDB terms) you are trying to search

some_column_to_search -> Specifies the column (a field, in MongoDB terms) you are trying to search

In this specific example, we were looking for entries longer than 40 characters in this column.
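
A regular expression match works as well, and avoids the per-document JavaScript evaluation that $where does (using the same placeholder table and column names):

[code]db.some_table_to_search.find({some_column_to_search: /^.{41,}/})[/code]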

If you come from the traditional RDBMS world, here is a link from MongoDB comparing terminology between RDBMS and MongoDB.

http://docs.mongodb.org/manual/reference/sql-comparison/

Idea for a security application

I think the best way to learn a new (programming) language is to address a real world problem :). So here is one I want to solve in the next few months.

One of the things I like to do as part of a security evaluation process is to check the amount of public information available for a website. I frequently find information leakage from websites that people thought were secure or not publicly accessible.

The idea is to create a python script to do the following

  • Must have
    • Ingest a list of hostnames and do the following
      • Check whether they resolve to a public IP or not (see the sketch after this list)
      • If resolving to a public IP, check the amount of data being exposed by this site by doing a quick google search
      • Report on the amount of information available sorted by amount
  • Nice to have
    • take a domain name instead of hostnames, attempt a DNS zone transfer, and capture all hostnames in the domain
    • leverage Google API instead of web scraping
    • web interface to allow input and show output
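
Here is a minimal sketch of the "resolves to a public IP" check, assuming Python 3 and nothing beyond the standard library (the function name is mine):

[code]
import socket
import ipaddress

def resolves_to_public_ip(hostname):
    # return the address as a string if hostname resolves to a public IP, else None
    try:
        ip = ipaddress.ip_address(socket.gethostbyname(hostname))
    except (socket.gaierror, ValueError):
        return None  # hostname does not resolve
    return str(ip) if ip.is_global else None
[/code]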

Why Python? Well, I have been trying to learn it for some time now and I think it is time to put all that learning to use :).

Anyone interested in joining the fun?

Lessons of the trade : Handling CVV numbers

Just for my notes.. Even though the CVV numbers on a credit card look like numbers :), don’t treat them as integers in your code. Some of the numbers start with a 0, so 059 might become 59 by the time you try to process it if you capture the CVV field as an integer.

Just treat them like a string.
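
A quick illustration in Python of why the type matters:

[code]
cvv = "059"
print(int(cvv))  # prints 59 -- the leading zero is lost as an integer
print(cvv)       # prints 059 -- preserved as a string
[/code]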

And obviously you are not storing them anywhere in your application/network :). Or you might end up in the headlines like some of our retailers.

HOW TO : Convert PFX/P12 crypto objects into a java keystore

We recently needed to add a certificate that is currently in PKCS#12 format to a Java keystore at work. The typical approach would be to create an empty keystore and then import the certificate from the PKCS#12 store using the following command

[code]keytool -importkeystore -srckeystore sourceFile.p12 -srcstoretype PKCS12 -destkeystore destinationFile.jks[/code]

Note: PKCS#12 files can have the extension ".p12" or ".pfx"

The command executed without any issues, but we received the following error when we started the application server using this newly created keystore

[code]java.io.IOException: Error initializing server socket factory SSL context: Cannot recover key [/code]

It didn’t make sense, because we were able to view the certificate in the keystore and were using the right password in the configuration files.

After a lot of searching and head scratching, the team came up with the following solution ("Cannot recover key" usually means the private key's password does not match the keystore's password, and exporting and re-importing the keys lets you set both to the same value)

  1. Export the public key and private key from the PKCS#12 store using openssl.
  2. Import these keys into the java keystore (default format of JKS)

The commands used were

[code]
# export the private key from the PKCS#12 store
openssl pkcs12 -in sourcePKCS12File.p12 -nocerts -out privateKey.pem
# export the certificate (public key) from the PKCS#12 store
openssl pkcs12 -in sourcePKCS12File.p12 -nokeys -out publicCert.pem
# repackage the key pair into a fresh PKCS#12 store
openssl pkcs12 -export -out intermittentPKCS12File.p12 -inkey privateKey.pem -in publicCert.pem
# import the fresh store into a java keystore (JKS by default)
keytool -importkeystore -srckeystore intermittentPKCS12File.p12 -srcstoretype PKCS12 -destkeystore finalKeyStore.jks
[/code]
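
You can sanity-check the result by listing the contents of the new keystore:

[code]keytool -list -v -keystore finalKeyStore.jks[/code]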

HOW TO : Use grep and awk to find count of unique entries

I have used grep extensively to analyze data in log files. A good example is this post about using grep and sort to find the unique hits to a website. Here is another way to do it, using grep and awk.

Say the log file you are analyzing is in the format below and you need to count how many times each unique BundleId appears

[code]2013-02-25 12:00:06,684 ERROR [com.blahblah.sme.command.request.CustomCommand] Unable to execute AssignServiceCommand, request = '<AssignServiceToRequest><MemberId>123456</MemberId><OrderBundle><BundleId>5080</BundleId></OrderBundle></AssignServiceToRequest>'[/code]

you can use grep and awk to find the number of times a unique bundleID appears by running

[code]grep -i bundleID LOG_FILE_NAME | awk '{ split ($11,a,">"); print a[6]}' | sort | uniq -c | sort -rn[/code]

breaking down the commands

grep -i : tells grep to show only the lines from the file (LOG_FILE_NAME) containing the text bundleID, and makes the search case insensitive

awk '{ split ($11,a,">"); print a[6]}' : tells awk to grab the input from grep and take the 11th item (by default awk separates content with a space), then split that string into an array (a) using > as the delimiter. Finally, it prints the value of the array's sixth member

sort : sorts the output from awk into ascending order

uniq -c : takes the output from sort and counts the unique items, prefixing each with its count

sort -rn : takes the output from uniq and does a reverse numeric sort, so the largest counts come first

The output looked like this

[code]
173 5080</BundleId
12 5090</BundleId
8 2833</BundleId
1 2412</BundleId
1 2038</BundleId
1 1978</BundleId
1 1924</BundleId
[/code]
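
If you would rather have the counts keyed by the bare ID (without the </BundleId residue), split on the tag itself instead:

[code]grep -i bundleID LOG_FILE_NAME | awk -F'<BundleId>' '{ split ($2,a,"<"); print a[1]}' | sort | uniq -c | sort -rn[/code]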

HOW TO : Configure tcpdump to rotate capture files based on size

Quick note to self. If you are capturing traffic using tcpdump, you can rotate the capture files based on size.

[code]sudo tcpdump -i INTERFACE_TO_CAPTURE_TRAFFIC_ON -C 10 -s0 -W NO_OF_FILES_TO_ROTATE_THROUGH -w /PATH_TO_CAPTURE_FILE [/code]
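
For example, to keep five ~10 MB files of full packets from eth0 (the interface and path are placeholders):

[code]sudo tcpdump -i eth0 -C 10 -s0 -W 5 -w /tmp/capture.pcap[/code]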

explanation of the options used

-i : specify the interface you want to capture the traffic on. If not specified, tcpdump will listen on the lowest numbered interface, e.g. eth0

-C : specify the maximum size of each capture file, in units of 1,000,000 bytes. In this example, each file would be 10,000,000 bytes, or roughly 9.5 MiB

-s : specify the packet length to capture. 0 (zero) tells tcpdump to capture the entire packet

-W : specify the number of files to rotate through once the file size specified in -C is reached. The files keep rotating throughout the capture

-w : specify the path to the capture file. tcpdump appends an integer to the end of the file name based on the number of files it has to rotate through.

HOW TO : Restrict access to proxied content in Apache

If you are using the mod_proxy feature in Apache to forward requests for certain content to a backend server, but want to restrict access to that content to clients originating from certain IP addresses, you can use the Location directive in Apache.

The Location directive limits the scope of the enclosed directives by URL. It is very similar to the Directory directive, but the difference is that the controls are applied based on the URL rather than on the location of the content on disk.

In this example, I am forwarding content destined for https://kudithipudi.org/testLocation to an internal server at http://127.0.0.1:8080/testLocation, and using the Location directive to restrict access to requests originating from the IP address 10.10.10.10.

[code]

<Location /testLocation>
Order Deny,Allow
Deny from all
Allow from 10.10.10.10
</Location>

ProxyPass /testLocation http://127.0.0.1:8080/testLocation
ProxyPassReverse /testLocation http://127.0.0.1:8080/testLocation [/code]
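
Note that Order/Deny/Allow is the Apache 2.2 style of access control (available in 2.4 only through mod_access_compat). On Apache 2.4, the equivalent restriction is

[code]
<Location /testLocation>
Require ip 10.10.10.10
</Location>
[/code]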