For my reference: a good article about command-line usage of grep: http://arstechnica.com/open-source/news/2009/05/command-line-made-easy-five-simple-recipes-for-grep.ars. Older how-to articles on my site related to this topic are here and here.
Lessons of the Trade : Simple way to deter web scrapers
If your website is a target of web scrapers (http://en.wikipedia.org/wiki/Screen_scraping), here’s a simple way to keep them on their toes..
Every few days (or weeks), make a small change to the web page format. The trick is to make the change in such a way that your users do not notice it or are not annoyed by it. Most simple scrapers rely on screen scraping, so this confuses the hell out of them 🙂 and usually deters the most abusive ones. The smart ones would have approached you in the first place, requesting permission to scrape or asking to be authorized for an interface.
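As a rough illustration of the idea, one option is to rename a CSS class on a schedule, so that selectors hard-coded in a scraper stop matching. This is only a sketch: the function name `rotate_class`, the class name `price`, the file names and the week-based suffix are all made up for the example. The same rename would also have to be applied to the stylesheet so that users see no difference.

```shell
# rotate_class NAME SUFFIX
# Reads HTML on stdin and rewrites class="NAME" to class="NAME-SUFFIX".
# Assumes NAME and SUFFIX are plain alphanumeric strings and that the
# attribute is written exactly as class="NAME" (single class, double quotes).
rotate_class() {
    sed "s/class=\"$1\"/class=\"$1-$2\"/g"
}

# Example (hypothetical file names): derive the suffix from the year and
# week number, so the markup changes once a week.
# rotate_class price "$(date +%Y%U)" < page.html > page.rotated.html
```

A date-based suffix is easy to predict, so this only slows down the simple scrapers, which is all the trick is meant to do.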
Lessons of the trade : Troubleshooting database performance
If you have ever worked in an IT shop, you will know that the one thing you cannot escape is database performance issues. Now, I am no DBA in any way or fashion, but I thought I should record some of the common issues and the ways they have been overcome in my career so far. More for my own records than to teach anyone :).
- Network Related
  - Check if the NIC (network interface) on the DB server has a speed mismatch with the network device (most probably a switch) that it is connected to.
  - Check the latency between the application and the DB (pertains to applications connecting to the DB over WAN links).
- System Related
  - Check if a rogue process is using all the system resources.
  - Check if the disk subsystem is performing optimally.
    - Recommend using RAID 10 for transactional systems.
    - Don’t forget to check those batteries on the RAID controllers 🙂
  - Check if the DB is just running out of gas.
    - Long-term capacity trending records come in handy here.
- Database Related
  - Views are evil.. if created the wrong way, especially on data that is accessed frequently. Remember that you are now making two calls to the database: one to read the data, one to create the view.
  - Ensure your logs are not being written to the same disk subsystem as your data files (thx Ray for pointing out a typo here).
  - Indexes are good.. but only to an extent. If the size of your indexes is twice the size of your data, you have an issue.
  - Check for invalid objects. You will be surprised how many times people overlook this.
  - Sometimes it helps to flush the SGA (Oracle specific). Be aware that it will slow down the response time for a while (until the cache gets populated again).
  - Avoid excessive monitoring, especially with tools that query the system tables quite frequently. This has a negative impact on database performance.
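For the “rogue process” check in the list above, here is a quick sketch of how one might list the busiest processes on a *nix DB server. The helper name `top_consumers` is my own; it just sorts whatever `ps` prints, so treat it as a starting point rather than a monitoring tool.

```shell
# top_consumers [N]
# Reads "PID %CPU COMMAND" lines on stdin and prints the N busiest (default 5).
top_consumers() {
    sort -k 2,2nr | head -n "${1:-5}"
}

# Typical use on the DB server (the trailing "=" suppresses the header line):
# ps -eo pid=,pcpu=,comm= | top_consumers 5
```

On Linux, the CPU figure from ps is averaged over the lifetime of the process, so a short burst is easier to spot in top.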
Did you run into any strange situations and figure out a solution? Please feel free to add your comments below…
New Toy : Kindle 2.0
I have been looking at getting an e-book reader for some time now (yes, even though I don’t have time, I still like to think I can read 🙂). My sister and brother-in-law surprised me by getting me the newly released Kindle 2.0 (Thx Guys!!). I think I am one of the first few people to receive it! Here are some pictures of the unpacking and the device itself.
The Kindle in its original packing.. Getting pretty close to being as cool as Apple packaging

The most amazing thing (for me).. the USB/Power Cable.. Look how small the power brick is!!


First Thoughts??
Since I didn’t have a previous e-book reader or the Kindle 1.0, I don’t have anything to compare it to. I think the styling of the device is very sleek and sexy. I like the aluminum back (it reminds you of the first-gen iPhone). The interface is OK.. I was not very impressed with it. I love the fact that you can browse Wikipedia anytime/anywhere for free with the built-in wireless connection!!
My next task is to figure out a way to get some content onto this baby. There are several books in the public domain (like the ones on Gutenberg.org) that I would like to get onto the Kindle first. Once I feel comfortable with the unit, I will try some e-books from the Kindle store.
HOW TO : Simple perl script to replace lines in file
Nothing fancy.. but here is a simple Perl script to open a file, search for specific content in a line and replace it with some other content.
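#!/usr/bin/perl
use strict;
use warnings;

# Read source.xml and write a modified copy to modfile.xml.
open(my $source, '<', 'source.xml')
    or die "Could not open file source.xml: $!\n";
open(my $destination, '>', 'modfile.xml')
    or die "Could not open file modfile.xml: $!\n";

while (defined(my $line = <$source>)) {
    # Replace the whole line if it contains the marker (case-insensitive).
    if ($line =~ m/YYYYYYYY/i) {
        $line = "XXXXXXXXXXXXXXXXXXX\n";
    }
    print $destination $line;
}

close($source);
close($destination) or die "Could not close modfile.xml: $!\n";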
You are opening a file named source.xml, reading every line and if there is some text that matches “YYYYYYYY”, you are replacing the whole line with “XXXXXXXXXXXXXXXXXXX”. I am sure there are more elegant ways to write this :).. but this will do the trick too..
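One of those other ways, sketched in shell with awk instead of Perl. The function name `replace_lines` is just a label for this example, and unlike the Perl regex it matches the marker as a plain substring (still case-insensitive), which is the same thing for a literal marker like the one above.

```shell
# replace_lines MARKER REPLACEMENT
# Copies stdin to stdout, replacing every line that contains MARKER
# (plain substring, compared case-insensitively) with REPLACEMENT.
replace_lines() {
    awk -v pat="$1" -v repl="$2" '
        index(tolower($0), tolower(pat)) { print repl; next }
        { print }
    '
}

# Same file names as the Perl script:
# replace_lines YYYYYYYY XXXXXXXXXXXXXXXXXXX < source.xml > modfile.xml
```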
Lessons of the trade : Data purge in databases..
Quick note to myself: if you have a high-volume transactional database and are looking to purge data from one or more tables, make sure you purge the data in small chunks. If you purge a large number of rows at once, other processes trying to access the data in those tables have to go to the redo logs to read it, since the purge job will put a lock on the table. This obviously adds latency to the queries. So purge a small number of rows at a time, committing after each chunk, which forces the database to flush the redo logs.
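The loop itself can be sketched in shell. This is only an outline under assumptions: `purge_in_chunks` is my own name, and the command you pass to it is a hypothetical helper that deletes at most one batch of rows, commits, and prints how many rows it removed. How that helper talks to the database depends entirely on your DB and client tools.

```shell
# purge_in_chunks CMD [ARGS...]
# CMD is a hypothetical helper (not a real tool) that deletes at most one
# batch of rows, commits, and prints the number of rows it removed.
# The loop stops when a batch removes nothing and prints the total purged.
purge_in_chunks() {
    total=0
    while :; do
        deleted=$("$@") || return 1
        # Stop on 0 rows (or on anything that is not a number).
        [ "$deleted" -gt 0 ] 2>/dev/null || break
        total=$((total + deleted))
        # Optional pause between chunks so other sessions get a turn.
        sleep "${PURGE_PAUSE:-0}"
    done
    echo "$total"
}

# Example with a hypothetical helper script:
# PURGE_PAUSE=2 purge_in_chunks ./delete_old_rows.sh 1000
```

Keeping each chunk in its own short transaction is the point; the batch size and the pause are knobs to tune against your own workload.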
SourceForge : Project of the Month
SourceForge.net is an online community supporting open source projects by providing hosting, distribution and Subversion services. Every month they choose one project, based on popularity and activity, from the hundreds of thousands of projects hosted on SourceForge. Most of the chosen projects are a who’s who of the open source community. This is a good link to bookmark: http://sourceforge.net/community/index.php/potm/
HOW TO : Delete duplicate lines in Linux/Unix
If you have a large text file and want to quickly delete any duplicate lines, you can use the following option of sort in *nix..
sort -u filename > newfilename
If you want to get even funkier :), like only checking duplicates on a particular column, feel free to check out this link http://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html#sort-invocation.
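For the column case, a minimal sketch, assuming a delimited text file where one field acts as the key. The function names are my own. The first version uses the key options of sort together with -u; the second uses awk to keep the first line seen for each key without reordering the file.

```shell
# dedupe_by_key SEP FIELD
# Sorts stdin on one field and keeps a single line per distinct key.
dedupe_by_key() {
    sort -t "$1" -k "$2,$2" -u
}

# dedupe_keep_first SEP FIELD
# Keeps the first line seen for each key and preserves the input order.
dedupe_keep_first() {
    awk -F "$1" -v col="$2" '!seen[$col]++'
}

# Example: treat the first comma-separated column as the key.
# dedupe_by_key , 1 < filename > newfilename
```

With the sort version the output comes back sorted by the key, so use the awk version if the original line order matters.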
HOW TO : Microsoft Windows – Routing to /dev/null
Ran into an interesting issue at work today and wanted to document it. We had a rogue process in one of our applications that was trying to send e-mails via one of our mail gateways at an alarming rate. There was no customer impact, since the mail server was rejecting all the connections. But the high number of connections was putting a strain on our firewalls.
If this were Linux, we would have done something simple like adding a reject route for the mail server (the networking equivalent of /dev/null) by running “route add -host IP_ADDRESS_OF_MAIL_SERVER reject”.
A search on Google suggested that you can achieve similar results with “route ADD IP_ADDRESS_OF_MAIL_SERVER MASK 255.255.255.255 127.0.0.1”, 127.0.0.1 being the IP address of the loopback interface in this case. But when we ran the command, we got the error “incorrect gateway 127.0.0.1”. So, as far as we could tell, there is no direct way to route traffic to a null device in Microsoft Windows.
Finally, we figured out a roundabout way to achieve this. Since the main aim was to reduce the load on the firewall, we identified an unused IP in the same network as the application server and added a static route to point all traffic going to the mail server to this IP. We ran the following command: “route ADD IP_ADDRESS_OF_MAIL_SERVER MASK 255.255.255.255 UN_USED_IP_ADDRESS”
For example, if your application server is in the range 192.168.1.0/24, the mail server is 192.168.2.20, and an unused IP in the application server range is 192.168.1.10, the command would look like this: “route ADD 192.168.2.20 MASK 255.255.255.255 192.168.1.10”. You will see a lot of connections in SYN_SENT status, since the application is trying to connect to the mail server via an IP address that doesn’t exist.
Might not be the smartest way to achieve this.. but it did the trick.
Windows 7 : Installer Issues
As posted here, I have been playing around with the Beta version of Windows 7. Everything was working great until I started getting a mysterious error, “Installer stopped working”, whenever I tried to install any new software. A Google search led me to this site (http://www.sevenforums.com/general-discussion/2349-windows-installer-cant-install-any-msi-package-4.html). Here’s the solution to the issue:
start regedit
navigate to HKLM\Software\Microsoft\SQMClient\Windows\DisabledSessions
rename MachineThrottling to _MachineThrottling
