Linux

HOW TO : Sort Apache Web Logs for hits by Unique IP Addresses

 

Say you want to find out how many hits a specific page is getting from each source IP. You can use this quick collection of Linux tools to get the data:

[code]grep -i "URL_TO_CHECK" PATH_TO_APACHE_ACCESS_LOG | cut -d' ' -f 1 | sort | uniq -c | sort -rn > ~/ip_report.txt[/code]

You are using

  • grep to filter the log for the page you want the report on
  • cut to get the IP address (the first space-separated field) from each matching line
  • sort and uniq -c to group identical IP addresses and count the hits for each
  • and finally sort -rn to sort the counts in descending order

Example :

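[code]grep -i "GET / " /opt/apache/logs/access_log | cut -d' ' -f 1 | sort | uniq -c | sort -rn > ~/ip_report.txt[/code]

gets you the report of hits to the index page. Note the space after the slash: without it, "GET /" matches every GET request, because every request path starts with a slash.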
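
If you need an exact match on the request path rather than a substring match, awk can compare the path field directly. This is a minimal sketch, assuming the log is in the Common or Combined Log Format (client IP in field 1, request path in field 7). The function name count_hits_by_ip and the sample log lines are made up for illustration.

```shell
#!/bin/sh
# Count hits per client IP for one exact request path.
# Assumes Common/Combined Log Format: field 1 = client IP, field 7 = request path.
count_hits_by_ip() {
    path="$1"
    log="$2"
    awk -v p="$path" '$7 == p { print $1 }' "$log" | sort | uniq -c | sort -rn
}

# Hypothetical sample log, only here to demonstrate the function.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [07/Feb/2011:15:00:01 -0600] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [07/Feb/2011:15:00:02 -0600] "GET /about HTTP/1.1" 200 128
10.0.0.1 - - [07/Feb/2011:15:00:03 -0600] "GET / HTTP/1.1" 200 512
10.0.0.3 - - [07/Feb/2011:15:00:04 -0600] "GET / HTTP/1.1" 200 512
EOF

count_hits_by_ip "/" "$sample_log"
rm -f "$sample_log"
```

For the sample data this lists 10.0.0.1 with 2 hits ahead of 10.0.0.3 with 1, and leaves out the /about request.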

HOW TO : Find list of files used by a process in Linux

Quick howto on finding the list of files being accessed by a process in Linux. I needed to find this for troubleshooting an issue where a particular process was using an abnormally high percentage of CPU. I wanted to find out what this particular process was doing and accessing.

  1. Find the process ID (pid) of the process you want to analyze by running [code] ps -ef | grep NAME_OF_PROCESS [/code]
  2. Find the files the process is accessing at a given time by running [code] sudo ls -l /proc/PROCESS_ID/fd [/code]

For example, if I wanted to find the list of files being accessed by mysql, the steps would look like this

[code] ps -ef | grep mysqld [/code]

which would show the output as

[code]samurai@samurai:~$ ps -ef | grep mysqld
mysql     3304     1  0 Feb04 ?        00:00:23 /usr/sbin/mysqld
samurai  23389 23374  0 14:57 pts/0    00:00:00 grep --color=auto mysqld
[/code]

I can then find the list of files being used by mysql by running

[code] sudo ls -l /proc/3304/fd [/code]

which would give me

[code]
lrwx------ 1 root root 64 Feb  7 15:00 0 -> /dev/null
lrwx------ 1 root root 64 Feb  7 15:00 1 -> /var/log/mysql/error.log
lrwx------ 1 root root 64 Feb  7 15:00 10 -> socket:[4958]
lrwx------ 1 root root 64 Feb  7 15:00 11 -> /tmp/ibdu9WRh (deleted)
lrwx------ 1 root root 64 Feb  7 15:00 12 -> socket:[4959]
lrwx------ 1 root root 64 Feb  7 15:00 14 -> /var/lib/mysql/blog/wp_term_relationships.MYI
lrwx------ 1 root root 64 Feb  7 15:00 15 -> /var/lib/mysql/blog/wp_postmeta.MYI
lrwx------ 1 root root 64 Feb  7 15:00 17 -> /var/lib/mysql/blog/wp_term_relationships.MYD
lrwx------ 1 root root 64 Feb  7 15:00 18 -> /var/lib/mysql/blog/wp_term_taxonomy.MYI
lrwx------ 1 root root 64 Feb  7 15:00 2 -> /var/log/mysql/error.log
lrwx------ 1 root root 64 Feb  7 15:00 20 -> /var/lib/mysql/blog/wp_postmeta.MYD
lrwx------ 1 root root 64 Feb  7 15:00 21 -> /var/lib/mysql/blog/wp_term_taxonomy.MYD
lrwx------ 1 root root 64 Feb  7 15:00 22 -> /var/lib/mysql/blog/wp_terms.MYI
lrwx------ 1 root root 64 Feb  7 15:00 23 -> /var/lib/mysql/blog/wp_terms.MYD
lrwx------ 1 root root 64 Feb  7 15:00 3 -> /var/lib/mysql/ibdata1
lrwx------ 1 root root 64 Feb  7 15:00 4 -> /tmp/ibvANyz7 (deleted)
lrwx------ 1 root root 64 Feb  7 15:00 5 -> /tmp/ibonS0mU (deleted)
lrwx------ 1 root root 64 Feb  7 15:00 6 -> /tmp/ibcKctaH (deleted)
lrwx------ 1 root root 64 Feb  7 15:00 7 -> /tmp/ibB5DS5t (deleted)
lrwx------ 1 root root 64 Feb  7 15:00 8 -> /var/lib/mysql/ib_logfile0
lrwx------ 1 root root 64 Feb  7 15:00 9 -> /var/lib/mysql/ib_logfile1
[/code]
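
The same information can be pulled out with a small loop over /proc/PID/fd, which is handy inside scripts. This is a minimal sketch, assuming a Linux /proc filesystem. list_open_files is a hypothetical helper name, and the demo inspects the current shell ($$) so it runs without sudo. If lsof is installed, lsof -p PROCESS_ID gives a similar listing.

```shell
#!/bin/sh
# Print "FD -> target" for every file descriptor of a process.
# Needs read access to /proc/PID/fd (your own processes, or root for others).
list_open_files() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink; skip anything else (e.g. an unmatched glob).
        [ -L "$fd" ] || continue
        target=$(readlink "$fd") || continue
        printf '%s -> %s\n' "${fd##*/}" "$target"
    done
}

# Demo: list the descriptors of the current shell.
list_open_files "$$"
```

For another user's process, such as mysqld above, you would run the script with sudo and pass the pid from ps.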

HOW TO : Modify iptables rules

Quick how-to for my personal records. iptables is an open source firewall (and it does a lot more) included with most Linux distributions.

Steps to add new rule to existing configuration

  • Check the list of rules and their corresponding sequence

[code]sudo iptables -vL --line-numbers [/code]

  • Add the new rule at the required location/sequence

[code] sudo iptables -I INPUT LINE_NUMBER RULE [/code]

Example :

[code]sudo iptables -I INPUT 8 -s X.X.X.X/24 -p tcp -m state --state NEW -m tcp --dport 3128 -j ACCEPT[/code]

  • Save the configuration

[code] sudo service iptables save [/code]

Thx to Sijis for helping with the commands.
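
Since a rule in the wrong position can lock you out, it can help to print the exact command before running it. The following is a minimal sketch of such a dry-run wrapper; the run and insert_input_rule helpers and the APPLY variable are my own names, not part of iptables, and 192.0.2.0/24 is a documentation-only network standing in for X.X.X.X/24.

```shell
#!/bin/sh
# Dry-run wrapper: prints the command unless APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Insert a rule into the INPUT chain at the given line number.
insert_input_rule() {
    line_number="$1"
    shift
    run sudo iptables -I INPUT "$line_number" "$@"
}

# Printed only, because APPLY is not set to 1 here.
insert_input_rule 8 -s 192.0.2.0/24 -p tcp -m state --state NEW -m tcp --dport 3128 -j ACCEPT
```

Once the printed command looks right, rerunning the script with APPLY=1 in the environment executes it instead of printing it.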

HOW TO : Fix Jboss startup script for CentOS

Quick note for myself on fixing the default startup script provided by Jboss to work on CentOS. Thx to Shankar for finding the solution.

The default startup script ($JBOSS_HOME/bin/jboss_init_redhat.sh) that Jboss provides does not work properly on CentOS. The start option works fine, but when you try to stop Jboss, it gives you a “No JBossas is currently running” message and quits.

Here’s a quick way to fix it. Edit the jboss_init_redhat.sh file and replace

[code]JBOSSSCRIPT=$(echo $JBOSSSH | awk '{print $1}' | sed 's/\//\\\//g') [/code]

with

[code]JBOSSSCRIPT=$(echo $JBOSSSH | awk '{print $1}')[/code]
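
To see what the change does, here are both forms run against a sample value. This is only a sketch: the JBOSSSH value below is hypothetical, and the function names are mine. The sed expression escapes every slash in the script path; my assumption (not checked against the rest of the script here) is that the stop logic then fails to find the escaped path in the process list.

```shell
#!/bin/sh
# Hypothetical value; the real JBOSSSH is set earlier in jboss_init_redhat.sh.
JBOSSSH="/opt/jboss/bin/run.sh -c default"

# Original line: every "/" in the script path becomes "\/".
original_form() { echo $JBOSSSH | awk '{print $1}' | sed 's/\//\\\//g'; }

# Fixed line: the script path is kept as-is.
fixed_form() { echo $JBOSSSH | awk '{print $1}'; }

original_form   # prints \/opt\/jboss\/bin\/run.sh
fixed_form      # prints /opt/jboss/bin/run.sh
```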

HOW TO : Configure Oracle data source in Jboss

Here are some quick notes on configuring a data source for an Oracle database in Jboss. Data sources are common access points to different sources of data, provided by the application server framework to the applications running in it. These instructions are very specific to the 5.x EAP version. Jboss SOA has a pretty easy Ant-based script to configure data sources. I am not sure why Redhat didn’t think it would be good to include it as part of the EAP package too.

  1. Download the latest version of the JDBC driver from Oracle at http://www.oracle.com/technetwork/database/features/jdbc/index-091264.html . You can also get to this link by searching for “download ojdbc jar” in Google. In fact, I would recommend that, given that Oracle might change the link for future editions. You will need an Oracle account to download the driver file.
  2. Copy the driver file to $JBOSS_HOME/server/$JBOSS_PROFILE/lib
  3. Disable the default hsqldb datasource provided by Jboss. hsqldb is good for development purposes, but for any application server you want to deploy into a production environment, you need to replace it with a more robust DBMS; the choice has a major impact on performance. There are two places hsqldb is referred to in the default install
    • $JBOSS_HOME/server/$JBOSS_PROFILE/deploy/hsqldb-ds.xml
    • $JBOSS_HOME/server/$JBOSS_PROFILE/deploy/messaging/hsqldb-persistence-service.xml
    • I usually rename these files with a DO_NOT_USE prefix. You can delete them too, but I leave them around just in case.
  4. Configure the Oracle data source by copying from the sample files and configuring them
    • $JBOSS_HOME/server/$JBOSS_PROFILE/deploy/oracle-ds.xml (you can find the sample file at $JBOSS_HOME/docs/examples/oracle-ds.xml)
    • $JBOSS_HOME/server/$JBOSS_PROFILE/deploy/messaging/oracle-persistence-service.xml (you can find the sample file at $JBOSS_HOME/docs/examples/oracle-persistence-service.xml)
      • Comment out the following line in the file if you are not using clustering in the application server [code] <attribute name="ChannelFactoryName">jboss.jgroups:service=ChannelFactory</attribute> [/code]

Restart Jboss and it will create all the required tables and objects in the schema provided in the connect string. It is implied that you have created a schema in Oracle with the required privileges.
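
The file shuffling in steps 2 to 4 can be scripted. The following is a rough sketch rather than a tested installer: configure_oracle_ds is a hypothetical helper, the paths are taken from the steps above, and the profile name and driver location in the usage comment are assumptions. You still have to edit the copied XML files by hand afterwards.

```shell
#!/bin/sh
# Sketch of steps 2-4: copy the driver, set aside hsqldb, copy the Oracle samples.
# Arguments: JBoss home, profile name, path to the downloaded JDBC driver jar.
configure_oracle_ds() {
    home="$1"
    profile="$2"
    driver="$3"
    server="$home/server/$profile"

    # Step 2: make the driver available to the profile.
    cp "$driver" "$server/lib/" || return 1

    # Step 3: rename the hsqldb files with a DO_NOT_USE prefix.
    for f in "$server/deploy/hsqldb-ds.xml" \
             "$server/deploy/messaging/hsqldb-persistence-service.xml"; do
        if [ -f "$f" ]; then
            mv "$f" "${f%/*}/DO_NOT_USE_${f##*/}" || return 1
        fi
    done

    # Step 4: start from the sample files shipped with Jboss.
    cp "$home/docs/examples/oracle-ds.xml" "$server/deploy/" || return 1
    cp "$home/docs/examples/oracle-persistence-service.xml" "$server/deploy/messaging/" || return 1
}

# Example call (hypothetical profile and driver path):
# configure_oracle_ds "$JBOSS_HOME" default ~/Downloads/ojdbc6.jar
```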

HOW TO : Check web services using curl

Quick note for myself to check web services using curl ([L/U]nix utility to play with http(s) traffic)

[code] curl https://URL_TO_TEST --insecure --trace-ascii debug.txt [/code]

Comments on options :
--insecure is used if you are testing web services served over SSL using self-signed certs
--trace-ascii dumps all traffic between the client (curl in this case) and the server in human readable format
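
When you only care whether the service answers, not the full trace, curl can print just the HTTP status code. This is a minimal sketch: check_service is a hypothetical helper name, and treating any 2xx code as healthy is my assumption, so adjust it to what your service actually returns.

```shell
#!/bin/sh
# Report OK/FAIL for a URL based on its HTTP status code.
# --insecure is kept for services that use self-signed certs.
check_service() {
    url="$1"
    status=$(curl --silent --insecure --output /dev/null --write-out '%{http_code}' "$url")
    case "$status" in
        2??) echo "OK $status $url" ;;
        *)   echo "FAIL $status $url"; return 1 ;;
    esac
}

# Example call:
# check_service https://URL_TO_TEST
```

The non-zero return value on failure makes the function usable in cron jobs or monitoring scripts.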

HOW TO : Search ownership of files in Linux

Say you have a directory with a bunch of subdirectories and files, and you want to see if all the files are owned by a particular user. You can use the following set of commands

[code]ls -l -R DIRECTORY_PATH | awk '{print $3}' | grep -v USER_NAME[/code]

The commands do the following

  • ls -l -R shows the list of files and directories
  • awk prints the name of the owner of the file (it is the third column)
  • grep -v shows only the lines where the owner name doesn’t match, so any owner name in the output means some files belong to someone else

And yeah.. this works in most variants of Linux :).
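
If you also want to know which files have the wrong owner, find can do the check directly with its -user test. This is a minimal sketch: files_not_owned_by is a hypothetical helper name, and the demo directory is a throwaway created just for the example.

```shell
#!/bin/sh
# Print every file and directory under DIR that is NOT owned by OWNER.
# OWNER may be a user name or a numeric user ID.
files_not_owned_by() {
    dir="$1"
    owner="$2"
    find "$dir" ! -user "$owner"
}

# Demo on a scratch directory created by the current user.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"
files_not_owned_by "$demo_dir" "$(id -u)"
rm -rf "$demo_dir"
```

Unlike the ls pipeline, this prints the paths of the offending files rather than just the owner names. The demo prints nothing, since everything in the scratch directory belongs to the user who created it.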

HOW TO : Apache and SELinux

Quick note for future reference..

If you ever run into errors like this

[code]
Starting httpd: Warning: DocumentRoot [/var/www/html/static] does not exist
Warning: DocumentRoot [/var/www/html/static] does not exist
Warning: DocumentRoot [/var/www/html/static] does not exist
Warning: DocumentRoot [/var/www/html/static] does not exist
(13)Permission denied: httpd: could not open error log file /etc/httpd/logs/error_log.
Unable to open logs
[FAILED]
[/code]
And you are scratching your head over why Apache is throwing these errors even though the said directory and files exist, and you have the right permissions: check if you have SELinux running and being enforced.
On RHEL, you can check if SELinux is being enforced by running
[code]cat /selinux/enforce [/code]
The two values are 0 and 1: 0 means SELinux is not being enforced and 1 means it is.
You can quickly stop SELinux from being enforced temporarily by running (as root)
[code]echo 0 >/selinux/enforce [/code]
If you want to disable it permanently (i.e. survive reboots), you have to edit the file /etc/selinux/config and change the SELINUX line from enforcing to disabled.
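
The check can be wrapped in a small function for use in scripts. This is a minimal sketch: selinux_state is a hypothetical helper name, and the default path matches the RHEL release used above. Newer releases expose the same file under /sys/fs/selinux/enforce, so the path is a parameter. Where the SELinux utilities are installed, getenforce and setenforce 0 do the same job as the cat and echo commands.

```shell
#!/bin/sh
# Translate the contents of the SELinux enforce file into a word.
# The path defaults to /selinux/enforce; pass another path as $1 if needed.
selinux_state() {
    enforce_file="${1:-/selinux/enforce}"
    if [ ! -r "$enforce_file" ]; then
        echo "unavailable"
        return
    fi
    case "$(cat "$enforce_file")" in
        1) echo "enforcing" ;;
        0) echo "permissive" ;;
        *) echo "unknown" ;;
    esac
}

selinux_state
```

On a machine without SELinux (or with a different mount point) the call above prints "unavailable" rather than failing.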

Overheard : Random comments about technology

Here are some interesting titbits from an executive summary event hosted by Redhat/Intel that I attended yesterday.

We decreased the execution times for our orders from 1.5 seconds to 5 milliseconds

This from an executive managing the technology organization for a large trading company. Imagine the geekiness in accomplishing this :).

For every 450 smartphones that get activated a server is added to support them

This from an Intel executive. So if there are 500,000 Android phones being activated every day.. that’s around 1111 servers (500,000 / 450) being added every day just to serve the Android fans :).

1 in 4 servers currently runs Linux

This from a Redhat executive. If anyone doubts that Linux is mainstream.. they are living under a rock 🙂