Technology

HOW TO : Load/Stress test a Linux based server

We ran into an issue at work recently, which prompted us to do some performance testing on some of our Linux servers. The requirement was to stress test the key components of the server (CPU, RAM, HDD, network) and prove that different servers with the same configuration were capable of performing identically. Pretty simple, right? 🙂 The challenge was to find tools that could be run to stress test each of the components. There were a lot of tools for CPU and memory (RAM) testing, but not a lot for network and hard drive (HDD) testing. After searching high and low, we found a couple of tools that I wanted to document here for future reference.

HDD Testing :

I found a pretty interesting tool called Iozone, written by William Norcott (Oracle) and Don Capps. You can get the source code and builds for major OSs at http://iozone.org. Even though we installed the program using RPM, we were not able to run it without specifying the complete path.

There are a ton of options for the program, but the easiest way to run it is in automated mode, with the output going to an Excel spreadsheet (more like a glorified CSV file 🙂). Here is the command we used:

/opt/iozone/bin/iozone -a -Rb output_excel_file.xls

The “-a” tells the program to run in automated mode and the “-Rb” tells it to format the output for Excel. You can then go ahead and open the spreadsheet in Excel and create 3D graphs to check and compare the output.
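If the full automated sweep takes too long, the run can be bounded using Iozone's documented size flags. Here is a hedged sketch (the install path and output file name follow the example above; the 64k/512m bounds are just illustrative choices):

```shell
# Cap the file-size sweep so the automated run finishes faster.
# -n sets the minimum file size and -g the maximum (documented iozone flags).
IOZONE=/opt/iozone/bin/iozone
if [ -x "$IOZONE" ]; then
    "$IOZONE" -a -n 64k -g 512m -Rb quick_output.xls
    status="iozone run complete"
else
    status="iozone not found at $IOZONE, skipping"
fi
echo "$status"
```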

Network Testing :

Most of the information out there on testing the network stack of a machine suggests copying large files over a network share or via FTP. We didn’t find that was enough to really max out a gigabit port, since protocol limitations kept us from saturating the network port. After some searching, we stumbled across a tool called “ettcp” on SourceForge. ettcp itself is an offshoot of ttcp. ttcp (short for “test TCP”) was created to test network performance between two nodes. I couldn’t find any place to download ttcp itself, but you can download ettcp at http://ettcp.sourceforge.net/.

We used one server to act as a common receiver for all the servers we intended to performance test. Here are the commands we used to run the test:

RECEIVER (Common Server)
./ettcp -r -s -f M

The options are:

  • “-r” for designating the machine as the receiver
  • “-s” for sink mode (discard the received data instead of writing it out)
  • “-f M” for showing the output in megabytes

TRANSMITTER (Test Servers)
./ettcp -t -s receiver_hostname -n 10000000 -f M

The options are:

  • “-t” for designating the machine as transmitter
  • “-s receiver_hostname” to define the receiver
  • “-n” to define the number of packets to send to the receiver
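Since we ran the same transmitter command on several test servers, a tiny wrapper that prints the exact invocation for each box helps keep things consistent. A minimal sketch (the hostnames here are made-up placeholders):

```shell
# Generate the ettcp transmitter command for each test server so the
# same invocation can be pasted (or ssh'ed) on every box.
# Hostnames are placeholders, not real servers.
receiver="receiver_hostname"
for host in testsrv01 testsrv02 testsrv03; do
    cmd="./ettcp -t -s $receiver -n 10000000 -f M"
    echo "$host: $cmd"
done
```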

Twitter is not going to make it..

Pretty dramatic prognosis, eh? Especially after all the hype that Twitter is getting in the media, one would think that the whole world uses Twitter. Here’s why I think Twitter will not be such a darling 2/3 years from now. It might still exist in some shape/form (I think it is going to be acquired and integrated into a larger offering), but it won’t be the buzzword anymore.

First, the good part. Let’s take a look at some data from Compete and Alexa (companies that provide web statistics about sites) in comparison to Facebook and MySpace (the 800 lb gorillas of social networking).

Traffic Ranking :

Similar data from Alexa (where the long term data is available for free, unlike Compete)

No surprises here. We see that the ranking of Twitter has gone up drastically and is staying there..

Reach :

Interesting trend. One would expect that the reach (i.e. the number of people accessing the site) would go up as drastically as the ranking, but it is not as linear. Facebook is doing pretty well in this statistic.

Page Views :

I think this is to be expected. Twitter does not rely as much on page views due to the design and nature of the site.

Looking at all the data above, it looks like Twitter is in a good place, right? Now let’s add another dimension to the analysis. Let’s take a look at how Google views Twitter. “Google Insights” is a service offered by Google to help analyze the “search” volume of keywords across regions and timelines. Since Google has ~70% of the search engine traffic, this is a good way to observe the trends.

Here is a graph showing the “interest” starting from 2004…

Zooming into the last 12 months..

Interesting to see that while Facebook is getting more and more “interesting”, Twitter is not catching up. This might be due to two reasons.

  1. Twitter really did not promote its search function until July 2009, meaning the search engines did not have a way to scrape the site and as such did not have a lot of content to show to users.
  2. Twitter partnered with Google and Bing in October 2009 to index its postings. Again, this means that the search engines just got to the data and it might take some time for them to start showing it in the search results.

I don’t know if this is conclusive evidence that Twitter will not make it. Twitter definitely has its place, but I don’t think it is all that the media has hyped it to be. Even though the media hype has driven more and more users to Twitter, the company has yet to come out with a viable business plan. I think that businesses are flocking towards Twitter because users are… but once the fad passes, the world is also going to move on.

P.S.: I don’t think I need to mention to anyone looking at the data presented above that MySpace is on a downward spiral :). Who still uses ColdFusion to run a site?? That is so 90s!! :)

Microsoft's confusing online strategy

First there was msn.com, Microsoft’s first attempt to become a major online player. Users were told that this would become the one portal they would need to go to for all their needs. At the time, it competed with portals like Netscape and AOL.

Then came along live.com, Microsoft’s second attempt to bring all its online properties into one place. And Live had (has) some really good features (SkyDrive, Sync, Spaces), especially the revamped Hotmail, now called Live Mail. Life was good, right?

No. Microsoft then decided to spend a couple hundred million dollars to launch and promote Bing, its new search engine.

So you might be wondering, where is the confusion? msn.com is for content, live.com is for services and bing.com is the search engine. Makes sense, right? Well, for some reason, Microsoft decided to redirect anyone going to live.com (the site that was once promoted as the only home page you would ever need) to the Bing site. What about the average Joe who just wants to get to his e-mail and goes to live.com? Well, he gets redirected to Bing. And can he at least get a link to his old mail services from the Bing home page? NO!! One would think that is common sense, but then I am not smarter than the marketing/product folks at Microsoft :). Or is this just a clever way to drive more traffic to Bing to increase its exposure? I think that is the case. Take a look at the web ranking from Alexa, comparing bing.com to live.com and msn.com.

Now, if you dig deeper into the clickstream data for bing.com (http://www.alexa.com/siteinfo/bing.com#clickstream), you will see that roughly 70% of the traffic originating from live.com goes back to live.com (p.s.: this is a back-of-the-napkin calculation 🙂).

Let’s compare that to how the all-knowing Google does things. If you go to Google.com and click on “More” at the top, you can see links to all the services Google offers. If you do the same thing on Bing.com, you don’t get the big picture of what Microsoft can offer.

All I am asking Microsoft is to put a link to http://home.live.com (the home page for all live services) on Bing.

Also, it would help Microsoft to start backing up all those millions of dollars in ad money with real content in its search engine. I used “kudithipudi” (yes, I am pretty selfish 😉) as a search term in the major search engines, and here is the ranking by the number of results.

HOW TO : Convert mpg files to flv format using ffmpeg

For my documentation. Here are the command-line parameters I used to convert a video file in MPG format to FLV (Flash Video) format using ffmpeg (an open source format converter):

ffmpeg -i Original_Video.mpg -deinterlace -ar 44100 -r 25 -qmin 3 -qmax 6 Converted_Video.flv
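For reference, here is the same command assembled with each flag annotated (flag meanings per the ffmpeg documentation; the file names are just placeholders):

```shell
# Same conversion command, with each flag spelled out:
#   -i <file>     input file
#   -deinterlace  remove interlacing from the source video
#   -ar 44100     resample audio to 44.1 kHz
#   -r 25         output frame rate of 25 frames per second
#   -qmin/-qmax   bound the quantizer (lower values = higher quality)
# The output container (FLV) is inferred from the output file extension.
in_file="Original_Video.mpg"
out_file="Converted_Video.flv"
cmd="ffmpeg -i $in_file -deinterlace -ar 44100 -r 25 -qmin 3 -qmax 6 $out_file"
echo "$cmd"
```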

I will post details on where to get software and such, when I have time :).

HOW TO : Perl subfunction to unmount a partition in Linux

For my record… here’s a snippet of Perl code that can be called as a subroutine to unmount a partition in Linux. The magic is in the line “grep m{$mountPoint}, qx{/bin/mount}”, which essentially lets you check whether the partition is already mounted or not.

sub UnMountVolume($)
{
    my $mountPoint = $_[0];

    print "Unmounting $mountPoint\n";

    # Check if the mount point is currently mounted
    if ( grep m{$mountPoint}, qx{/bin/mount} )
    {
        # Let's try to unmount it
        system("/bin/umount $mountPoint");
    }
    else
    {
        print "$mountPoint is not mounted, so didn't have to do anything\n";
    }
}

As with any Perl code, I am sure there are tons of ways to do this in a more efficient and “cool” way.
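The same is-it-mounted check works directly in shell too. A minimal sketch (the function names are mine, not from the Perl above):

```shell
# Return success if the given mount point shows up in `mount` output
is_mounted() {
    /bin/mount | grep -q -- "$1"
}

# Unmount the given mount point only if it is actually mounted
unmount_volume() {
    if is_mounted "$1"; then
        /bin/umount "$1"
    else
        echo "$1 is not mounted, so didn't have to do anything"
    fi
}
```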

HOW TO : Find which interface a particular IP address is configured on

There are a ton of scripts to find how many IP addresses are configured on a system, but I could not find one which would show me which particular network interface an IP address was configured on. Here is a one-liner that will give you this information on Linux:

/sbin/ifconfig | grep -B1 10.10.10.10 | awk '{if (NR==1) print $1}'

The same script can be changed a bit to support other operating systems too. Essentially, I am doing a grep (search) for a particular IP on the output of ifconfig, which shows all the network information on the system. At the same time, I am using the -B1 option, which shows the line above the matching line. Finally, I am piping this to awk and printing the first field of the first line.
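To see what each stage of the pipeline does, here it is run against a canned sample of old-style ifconfig output instead of the live system (the interface names and addresses below are made up for illustration):

```shell
# Canned sample of `ifconfig` output (illustrative, not from a real box)
sample='eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:10.10.10.10  Bcast:10.10.10.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0'

# grep -B1 keeps the matching line plus the line above it (which holds
# the interface name); awk then prints the first field of the first line
iface=$(printf '%s\n' "$sample" | grep -B1 '10\.10\.10\.10' | awk 'NR==1 {print $1}')
echo "$iface"
```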

Travelocity down due to power outage..

Looks like Travelocity was down early today due to a power outage. Here is a screenshot of the site, right after it came back up:

According to Pingdom, the site was down for ~1 hour and 29 minutes. If they did come up at an alternate site, I personally think that is a pretty good response time. Running a high-transaction web site (and one that is as complicated as Travelocity) is no easy feat, and when you throw DR into the mix, it gets pretty nasty. The site is primarily hosted out of the EDS/Sabre/Travelocity datacenter in Tulsa, Oklahoma.