Programming

Another day.. Another Hack

The net is up in arms about a new release of compromised data from team Ghostshell. Details of the leak can be found at http://www.theregister.co.uk/2012/08/28/team_ghostshell_megahack/ and the source of the data is at http://pastebin.com/BuabHTvr.

I thought I would put my nascent Python skills to use and write a simple script to parse through the release and download all the data, hoping to analyze it later on. It is pretty basic, but it does the job of parsing the release and downloading the content. You can get the script at https://github.com/kudithipudi/Misc-Scripts/blob/master/parseHellfire.py
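
For a rough idea of what that kind of parse-and-download script can look like, here is a minimal sketch. It is illustrative only, not the actual parseHellfire.py; the raw-paste URL format and output file names are assumptions.

[code]
# Sketch only: grab the release text, pull out any URLs it mentions,
# and download each one to a local file.
import re
import urllib.request

# Raw view of the pastebin release (URL format is an assumption)
RELEASE_URL = 'http://pastebin.com/raw/BuabHTvr'

release = urllib.request.urlopen(RELEASE_URL).read().decode('utf-8', 'ignore')

# Anything that looks like a link in the release text
urls = re.findall(r'https?://\S+', release)

for index, url in enumerate(urls):
    try:
        data = urllib.request.urlopen(url).read()
        with open('dump_%03d.txt' % index, 'wb') as out:
            out.write(data)
        print("SAVED: " + url)
    except Exception as err:
        # Plenty of the linked pastes may be dead; keep going
        print("SKIPPED: " + url + " (" + str(err) + ")")
[/code]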

Watch out for an analysis of the content soon :).

HOW TO : Use Python to look for credit card numbers

Simple script in python to look for credit card numbers in a file.

[code]

# Importing modules
import re

# Define variables
inputFile = 'test.txt'
searchPattern = r'((\D(6011|5[1-5]\d{2}|4\d{3}|3\d{3})\d{11,12}\D)|(^(6011|5[1-5]\d{2}|4\d{3}|3\d{3})\d{11,12}\D))'

tempinputFile = open(inputFile)
tempLine = tempinputFile.readline()

while tempLine:
    print("LINE: " + tempLine)
    foundContent = re.search(searchPattern, tempLine, re.IGNORECASE)
    if foundContent:
        print("FOUND: " + foundContent.group())
    tempLine = tempinputFile.readline()

tempinputFile.close()
[/code]

The script started out as a simple check for any 16-digit number that had a non-numeric character on either end. I then tweaked it a little to look for credit-card-like numbers using the regex from http://www.regular-expressions.info/creditcard.html. Finally, I added an option to match credit-card-like numbers that start at the beginning of a line (i.e. there is no non-numeric character before the credit card number).
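
For a quick sanity check, you can point the script at a file like the one generated below. The numbers are the usual publicly documented test card numbers, not real cards, and the helper is just a throwaway snippet:

[code]
# Hypothetical helper to create a test.txt for the script above
sample_lines = [
    "order id 9876, card 4111111111111111 on file",  # Visa-style test number mid-line
    "5105105105105100,refund issued",                # MasterCard-style test number at line start
    "no card data on this line",
]

with open('test.txt', 'w') as handle:
    handle.write('\n'.join(sample_lines) + '\n')
[/code]

Running the script against that file should print FOUND: lines for the first two entries and nothing for the third.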

Project PaaS : Day 2 on Google App Engine

It looks like I was able to write the application I wanted to on App Engine in 2 days!! At least in its basic form. After some help from Google, I updated the application I created yesterday (http://samurai-apps.appspot.com/) to display the User Agent string being sent by the client.

The code has been updated on GitHub at https://github.com/kudithipudi/google-app-engine/
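
For reference, the change boils down to reading the User-Agent request header in the handler. Here is a minimal sketch using webapp2 (the framework the getting started guide uses); it is illustrative, not necessarily the exact code in the repo:

[code]
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        # Read the User-Agent header sent by the client
        user_agent = self.request.headers.get('User-Agent', 'unknown')
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Your User-Agent is: ' + user_agent)

app = webapp2.WSGIApplication([('/', MainPage)], debug=True)
[/code]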

Lessons from day 2?

  • Python doesn’t like tabs :). Always use spaces to indent. I was using Notepad++ as the editor and it automatically inserts tabs when you hit enter. Why spaces? Looks like that is the best practice according to this style guide (http://www.python.org/dev/peps/pep-0008/)
  • The “Logs” console in the SDK toolkit should be your best friend. It lets you know if there is an error in your code and which line it believes the error is on.

Next, I will try to pretty it up a bit.

Isn’t it amazing that I was able to create a simple app in a matter of 2 days and host it on an “infinitely” scalable platform without even taking out my credit card?

Project PaaS : Day 1 on Google App Engine

Following up on my public resolution for this month.. I started playing with Google App Engine. A lot has been written about what it is and how it works, but in a nutshell, think of it as an environment to deploy your applications without having to worry about underlying system capacity. It provides support for Java, Python and, more recently, Google’s own Go programming language.

I chose Python, since I have been meaning to dabble in it for a while now. So without further ado, here is a link to my first application on Google App Engine:

http://samurai-apps.appspot.com/

And obviously it has to be hello world :).

How did I get here?

  1. As any good programmer would do, I first tried to find a good place to store my source code. I chose GitHub, since it seems to be the go-to place for hackers (in the good sense 🙂 ) in recent times. I opened a free account on it and created a repository called google-app-engine at https://github.com/kudithipudi/google-app-engine/ .
  2. Followed the instructions listed here https://developers.google.com/appengine/docs/python/gettingstartedpython27/ and created the helloworld script (a sketch of what it looks like is below this list).
  3. Enabled App Engine on my account by validating myself. Had to use my mobile phone to do the validation.
  4. Created an app called samurai-apps in the Google App Engine control panel
  5. Deployed the helloworld script to Google App Engine using the deploy function in the SDK tool. (Note: make sure the name of the app you create in the SDK matches the one you created in the App Engine control panel, or you will get an error stating “This application does not exist (app_id=u’xxx’).” where xxx is the name of the app in the SDK tool.)
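
For reference, the helloworld script the guide walks you through is only a few lines. A minimal sketch (illustrative, not necessarily byte-for-byte what ended up in my repo):

[code]
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        # Plain text hello world response
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello, world!')

app = webapp2.WSGIApplication([('/', MainPage)], debug=True)
[/code]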

Pretty simple eh.. 🙂

Now the challenge is to program something more useful than printing hello world :).

 

HOW TO : Sync git clients across workstations using dropbox

I have recently started using git as source control for the various scripts that I write. As I also mentioned in this post, I use Dropbox to synchronize my data across workstations. Here is my setup for synchronizing git clients across multiple workstations using the same SSH keys (note: this is not a recommended setup from a security perspective; it is recommended to generate a different SSH key pair per workstation so that one lost key doesn’t compromise your entire account).

  1. Workstation 1
    1. Create a directory under your Dropbox root that you want to use as your git home directory, say DROPBOX/git
    2. Install Git for Windows, or whatever git client you want to use
    3. Change the home path on the git client by executing [code]HOME='PATH_TO_DROPBOX/DROPBOX/git'[/code]
    4. Check if the home path has been changed by executing [code]echo $HOME[/code]
    5. Create your SSH keys and configure your public key on the git server
  2. Workstation 2
    1. Rinse and repeat steps 1 - 4 from workstation 1. You don’t need to create the SSH keys, since this client will pick up the keys that Dropbox has already synced.

HOW TO : Clear screen based on OS in python scripts

I like shiny new toys :). Even though perl is pretty powerful and more than enough for the simple tasks I get to automate from time to time, I want to start learning python and find out first hand why the whole geek community is raving about it.

As I start to write new scripts in python, I wanted to document how I used to do some things in perl and how I implemented them in python.

One of the standard features of any script I write is to “clear” the screen before starting to send output to the console. Here is the comparison between perl and python

perl

[code]system $^O eq 'MSWin32' ? 'cls' : 'clear';[/code]

python

[code]

import os

# Clear screen, based on the OS
if os.name == 'nt':
    os.system("cls")
else:
    os.system("clear")

[/code]
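
For what it is worth, Python can be just as compact as the Perl one-liner if you use a conditional expression (same behavior as the block above):

[code]
import os

# One-line equivalent of the if/else block above
os.system('cls' if os.name == 'nt' else 'clear')
[/code]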

Demonstrating the power of perl

I haven’t scripted in perl for quite some time (disadvantages of moving into management 🙂 ). Today, we had to analyze some log files at work and I thought I would dust off my scripting skills.

The source data is Apache web logs and we had to find out the number of hits from a unique IP address for a particular scenario.

Pretty simple, right? grep will do the job very well, as demonstrated in this blog post. But we had to analyze the data for a ton of servers and I really didn’t want to repeat the same command again and again. Did you know that laziness is the mother of invention :)? So I wrote a simple perl script to do the job for me. The biggest advantage of writing this perl script was not that it reduced the copy/paste work, but how much faster it ran. Details of the comparison below.

HOW 99% OF ENGINEERS WOULD DO IT

The analysis consisted of getting the web logs for the last week (some of these log files were already rotated/compressed), concatenating them into one large file and then getting the number of hits by IP for a certain condition. This can be done very simply with a couple of commands that come standard with any *nix system:

  • cp
  • cat
  • grep for each day we needed the data

The final grep command would look like this

[code] grep -i "\[20/Feb/2012" final_log | grep -i "splash.do" | grep -i productcode | cut -d' ' -f 1 - | sort | uniq -c | sort -rn > ~/2_20_2012_ip_report.log [/code]

Timing this command showed that it took ~1 minute and 22 seconds to run.

HOW THE 1% DO IT :)

I wrote this perl script (disclaimer: I am not a programmer :), so please excuse the hack code).

[code]

#!/usr/bin/perl
# Modules to load
# use strict;
use warnings;

# Variables
my $version = 0.1;

# Clear the screen
system $^O eq 'MSWin32' ? 'cls' : 'clear';

# Create one large file to parse
`cp /opt/apache/logs/access_log ~/access_log`;
`cp /opt/apache/logs/access_log.1.gz ~/access_log.1.gz`;
`cp /opt/apache/logs/access_log.2.gz ~/access_log.2.gz`;

`gunzip access_log.1.gz`;
`gunzip access_log.2.gz`;

`cat access_log.2 access_log.1 access_log > final_access_log`;

# Hostname
$hostName=`hostname`;
chomp($hostName);

print "The Hostname of the server is : $hostName \n";

# Process the log file, one line at a time, and split it into one file per day
open(INPUTFILE,"< final_access_log") || die "Couldn't open log file, exiting $!\n";

while (defined ($line = <INPUTFILE>)) {
    chomp $line;
    # Same logic as one if block per day (Feb 20 - Feb 28), just folded into a loop
    foreach $day (20 .. 28) {
        if ($line =~ m/\[$day\/Feb\/2012/) {
            open(OUTPUTFILE, ">> 2_${day}_2012_log_file") || die "Couldn't open log file, exiting $!\n";
            print OUTPUTFILE "$line\n";
            close(OUTPUTFILE);
            last;
        }
    }
}
close(INPUTFILE);

`rm final_access_log`;
`rm access_log`;
`rm access_log.1`;
`rm access_log.2`;

for ($day=0; $day < 9; $day++)
{
$outputLog = $hostName."_2_2".$day."_2012.txt";
$inputLog = "2_2".$day."_2012_log_file";

$dateString = "\\[2".$day."/Feb/2012";

print "Running the aggregator with following data\n";
print "Input File : $inputLog\n";
print "Output Log : $outputLog\n";
print "Date String: $dateString\n";

`grep -i "splash.do" $inputLog | grep -i productcode | cut -d' ' -f 1 - | sort | uniq -c | sort -rn > ~/$outputLog`;

# Cleanup after yourself
`rm $inputLog`;
}

[/code]

I wrote a smaller script to do the same job as the command line hack that I tried earlier and compared the time. First, here is the smaller script

[code]

#!/usr/bin/perl
# Modules to load
# use strict;
use warnings;

# Variables
my $version = 0.1;
# Clear the screen
system $^O eq 'MSWin32' ? 'cls' : 'clear';
open (TEMPFILE,"< final_log");

# Match date and write to another log file
while (defined ($line = <TEMPFILE>)) {
chomp $line;
if ($line =~ m/\[20\/Feb\/2012/)
{
open(OUTPUTFILE, ">> perl_speed_test_output.log");
print OUTPUTFILE "$line\n";
close(OUTPUTFILE);
next;
}
}

`grep -i "splash.do" perl_speed_test_output.log | grep -i productcode | cut -d' ' -f 1 - | sort | uniq -c | sort -rn > ~/perl_speed_test_output_ip.log`;

[/code]

Timing this script showed that it took 21 seconds to run. That is roughly a 4x (~300%) speedup over the ~82 seconds of the command line approach and, more importantly, less load (RAM utilization) on the system.

One has to love technology :).

HOW TO : for loop in bash

Quick post for my own reference down the road. The “for” loop comes in very handy when you want to perform the same task on multiple items in a bash shell.

For example, I wanted to query the DNS records of a few subdomains (blog.gogoair.com, pr.gogoair.com, tracker.gogoair.com). I could do it the normal way (the way 99% of us do 🙂 ):

[code] dig blog.gogoair.com

dig pr.gogoair.com

dig tracker.gogoair.com [/code]

Or, I can use the for loop and do this:

[code] for i in {blog,pr,tracker}.gogoair.com; do echo "$i" ; dig +short "$i"; done [/code]

Got to love technology :).. Makes you lazy!!..err I meant to say productive.

Thx to Cliff for the inspiration.

HOW TO : Combining Perl and Zoho to produce reports

This HOW TO is more for my notes. We had a request at work, where we had to parse some log files and create a graph from the data in the log files.

The log files looked like this

[bash]
0m0.107s
0m0.022s
0m0.015s
2011-01-05_02_22
0m0.102s
0m0.024s
0m0.014s
2011-01-05_02_23
[/bash]

I wrote the following perl script to get the log file to look like this:

[bash]| 0m0.107s| 0m0.022s| 0m0.015s| 2011-01-05 | 02:22

| 0m0.102s| 0m0.024s| 0m0.014s| 2011-01-05 | 02:23 [/bash]

perl script

[perl]
#!/usr/bin/perl
# Modules to load
# use strict;
use warnings;

# Variables
my $inputFile = 'input.txt';
my $version = 0.1;

my $logFile = 'parsed_input.csv';

# Sub Functions
sub Log($$$);
sub Trim($);

# Clear the screen
system $^O eq 'MSWin32' ? 'cls' : 'clear';

# Open the output log file
open(LOGFILE,"> $logFile") || die "Couldn't open $logFile, exiting $!\n";

# Open the input file
open(INPUTFILE,"< $inputFile") || die "Couldn't open $inputFile, exiting $!\n";

# Process the input file, one line at a time
while (defined ($line = <INPUTFILE>)) {
chomp $line;
# Check for blank line
if ($line =~ /^$/)
{
# Start a new line in the output
print LOGFILE "\n";
}
else
{
# Split the date and time
if ($line =~ /2011/)
{
@date = split (/_/,$line);
print LOGFILE "| $date[0] | $date[1]:$date[2]";
}
else
{
# Write the value to the output
print LOGFILE "| $line";
}
}
}
[/perl]
I then took the parsed log files and imported them into the cloud-based reporting engine provided by Zoho at http://reports.zoho.com

The final results are these reports:

SERVER1

SERVER2

Did I say, I love technology? 🙂

HOW TO : Improve Jboss startup times

We run multiple applications in Jboss at my work, and one of the applications used to take an inordinate amount of time to come up. A typical application would take < 1 minute to get deployed, and this particular application was for some reason taking ~7-8 minutes. We initially thought it was a bug in the code and gave hell to our development team :).. But on closer investigation, we found that a feature we had enabled in the Jboss server settings, which allows content to be hosted on network storage, was causing the issue.

I blogged about the feature in Jboss to follow symbolic links here (https://kudithipudi.org/2008/07/25/howto-configure-jboss-to-follow-symbolic-links/). So essentially, when Jboss started, it was scanning all the content on these network paths for applications to deploy. And traversing a network share with 1000s of directories isn’t fun :)..

We fixed it by making a simple edit to the startup script. Here’s the pseudo code for the script:

  1. Remove soft links to network share
  2. Start Jboss
  3. Put soft links to network share

And now the application starts in less than a minute :).

I guess there might be more elegant ways to do this (e.g. configure Jboss to only deploy certain applications), but this did the trick for us :).