admin

Excel : Import tables from web pages

Ran into a bit of a tough nut the other day. One of my colleagues was trying to gather data from an HTML page and run some reports. He could scrape the page and copy the data into Excel, but any operation he tried on the data errored out. He tried every trick in the book (changing the column formats, etc.) but it didn’t help.

A bit of googling turned up this cool new function in Excel 2003.

Go to “Data –> Import External Data –> New Web Query” in Excel to check out this cool new method.

“F1” in Excel rules :)..

IPSec – What is it??

This is a great write-up by Stephen Friedl about the IPSec suite of protocols. I highly recommend reading it.

On a side note, I finally updated the blogging software on the site. I have been getting a lot of “blogspam” on the entries on this site. It is just amazing how far the spammers go to NOT get their message :).

Runs for 2005

For my records.. here are some of the runs I participated in this year:

April 9 | BADGERLAND STRIDERS SOUTH SHORE HALF MARATHON | 13.1 Miles | Chip Time : 2:12:31
May 26 | CHASE Corporate Challenge | 3.5 Miles
Aug 7th | Chicago Distance Classic | 13.1 Miles | Chip Time : 2:17:01
Sept 24th | Chicago Half Marathon | 13.1 Miles | Chip Time : 2:21:28
Oct 9th | Chicago LaSalle Marathon | 26.2 Miles | Chip Time : 5:14:41

As you can see.. it has been a very “slow pace” year :).

|–=Happy New Year=–|

And so ends the year of the Monkey [2004], and the year of the Rooster [2005] begins.

2004 has been quite an interesting year for me..

I moved all the way from Hawaii, back to the Mainland..
I trained and ran a marathon.
Started working at one of the most exciting tech companies.
Made new friends (no links for this unfortunately 🙂 )

In general, life was fun..

Here’s wishing that 2005 brings us all more joy, better health and love..

C Novim Godom 🙂

RRDTOOL – How to remove spikes

We use Cacti at work to graph the usage of our clients’ links. It is a pretty popular feature with our customers. A problem (well, not really; more like a gotcha) with rrdtool is the way it stores data. Here’s a quote from the rrd tutorial:
“Round robin is a technique that works with a fixed amount of data, and a pointer to the current element. Think of a circle with some dots plotted on the edge, these dots are the places where data can be stored. Draw an arrow from the center of the circle to one of the dots, this is the pointer. When the current data is read or written, the pointer moves to the next element. As we are on a circle there is no beginning nor an end, you can go on and on. After a while, all the available places will be used and the process automatically reuses old locations. This way, the database will not grow in size and therefore requires no maintenance. RRDTool works with Round Robin Databases (RRDs).”

So for counter-type data sources, rrd stores the rate of change (the difference between the last value and the current one) in the database, rather than the value itself. This creates a problem when routers are rebooted. The counters on the interfaces get cleared, and rrd, seeing a value lower than the previous one, assumes the counter wrapped around and is fooled into thinking that there is a spike in usage. This results in “spikes” in the graphs. Sometimes you see that a 128 kbps link has maxed out at 98 Mbps!!! :). The best way to stop this from happening is to set the correct min and max values for the ds names.

Coming back to Cacti again. When Cacti creates a new rrd database, it does not really give one the option to set up the maximum and minimum speeds of an interface. It defaults to a max of 100000000 (i.e. 100 Mbps). Occasionally when we have to reboot our routers, I do the following to remove the spikes:

cp filename.rrd filename.rrd.backup
Any good admin knows that before you mess with a file, you make a backup :).

rrdtool info filename.rrd | more
This lets us find the ds (data source) names.

rrdtool tune filename.rrd -a ds_name:MAXIMUM_VALUE
Set the maximum of the ds to the required value (-a is the short form of --maximum).

rrdtool dump filename.rrd > filename.xml
Export all data in the rrd to an XML file.

mv filename.rrd filename.rrd.old
Rename the rrd to make way for the new one.

rrdtool restore filename.xml filename.rrd -r
Restore the rrd from the XML file with the -r (range check) option, so any values that are higher than the new maximum value are ignored.
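If you do this often, the steps above can be wrapped into a small shell function. This is only a rough sketch of the same sequence: the function name despike_rrd and the RRDTOOL override variable are made up for illustration, and the ds name and maximum still have to come from "rrdtool info" and from the real speed of the link.

```shell
#!/bin/sh
# Sketch: clamp one data source in an rrd file and drop out-of-range samples.
# RRDTOOL is a hypothetical override so a different binary can be swapped in.
RRDTOOL="${RRDTOOL:-rrdtool}"

despike_rrd() {
    if [ "$#" -ne 3 ] || [ ! -f "$1" ]; then
        echo "usage: despike_rrd FILE.rrd DS_NAME MAXIMUM_VALUE" >&2
        return 1
    fi
    rrd="$1"                 # path to the .rrd file
    ds="$2"                  # ds name, as listed by "rrdtool info"
    max="$3"                 # new maximum for that ds
    xml="${rrd%.rrd}.xml"    # dump file placed next to the rrd

    cp "$rrd" "$rrd.backup" &&                # untouched backup first
    "$RRDTOOL" tune "$rrd" -a "$ds:$max" &&   # set the new maximum
    "$RRDTOOL" dump "$rrd" > "$xml" &&        # export all data to XML
    mv "$rrd" "$rrd.old" &&                   # make way for the new file
    "$RRDTOOL" restore "$xml" "$rrd" -r       # -r ignores out-of-range values
}
```

Called as "despike_rrd filename.rrd ds_name MAXIMUM_VALUE", it stops at the first step that fails, so a failed dump never leads to the original file being renamed.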

And the spikes are gone..
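For the curious, here is a small sketch of where such a spike comes from. It is not rrdtool's actual code, just the idea: a 32-bit counter that goes backwards is assumed to have wrapped around, so a reboot that resets it to almost zero looks like a huge amount of traffic. The function name counter_rate and the sample readings are made up for illustration.

```shell
#!/bin/sh
# Sketch: per-second rate from two readings of a 32-bit counter.
# A drop in the counter is treated as a wrap at 2^32, which is what turns
# a router reboot (counter back near 0) into an apparent burst of traffic.
counter_rate() {
    prev="$1"    # previous counter reading
    cur="$2"     # current counter reading
    secs="$3"    # seconds between the two readings
    delta=$((cur - prev))
    if [ "$delta" -lt 0 ]; then
        delta=$((delta + 4294967296))   # assume the counter wrapped at 2^32
    fi
    echo $((delta / secs))
}

counter_rate 1000000 1600000 300     # normal 5 minute poll
counter_rate 4000000000 1000 300     # counter cleared by a reboot
```

The first call prints 2000 (bytes per second, assuming the counter counts bytes), while the second prints 983227, roughly 7.9 Mbit/s of traffic that never existed. With a sensible maximum set on the ds, a value like that is rejected instead of being graphed.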

DARPA – Grand Challenge Results

DARPA announced a challenge (open to everyone) to build an autonomous ground vehicle which can travel without any human supervision on rough terrain. The track was an undisclosed path between Los Angeles and Las Vegas, passing through the Mojave desert. A total of 15 vehicles qualified for the challenge, which took place today. Unfortunately none of the vehicles came even close to finishing the 142 mile race. The farthest distance (7.2 miles) was covered by a modified Humvee, built by Red Team. Even though none of the teams could finish the race, I think it was a great event and both the government and the industry learnt a lot from it. For a complete list of the competing teams, click here.

spam stats

While following this thread on Slashdot, I came across this site by a system admin who tracks where the spam he fights on a daily basis originates from. It is further evidence against the spamming myth that “Most spam originates from outside the US”. From this site, we can see that ~35% of spam originates from the US. I am not sure how the author is getting the country of origin from the IP address. I work at an ISP and know that even though all our IP addresses are allocated to us, we in turn allocate them to our customers (located outside the US). And we don’t SWIP them as we are required to :). So if any of our customers send out spam, to the rest of the world it would look as though it is originating from the US. I wonder how much of the 35% is made up of such spammers.