Archive

Archive for the ‘Linux Command Line’ Category

SCREEN command use cases

May 29, 2015 Leave a comment

 
 
Sometimes we have to control job creation & termination via screen from inside a bash or shell script. Pasted below are some one-liners which you will find useful.
 
1) Create screen test
 

reynold@jackal:~/$ screen -dmS test /usr/bin/top 
reynold@jackal:~/$ screen -ls
There is a screen on:
	10222.test	(07/01/2014 02:52:58 AM)	(Detached)
1 Socket in /var/run/screen/S-reynold.

reynold@jackal:~/$ 

 
2) Terminate the screen
 

reynold@jackal:~/$ screen -S test -X "quit"
reynold@jackal:~/$ screen -ls
No Sockets found in /var/run/screen/S-reynold.

reynold@jackal:~/$ 

 

3) Start a screen job whose command output is piped. The screen creation command from step 1 won’t work in this case.
 

root@jmail7:~# screen -dmS straycustomerdirs bash -c 'cat /root/ops/reynold/straycustomerdirs.list | xargs rm -vrf $1'
root@jmail7:~#

 

4) To list all running screens,
 

screen -ls

 
5) To connect to an already running screen,
 

screen -rx SCREENNAME

 
6) To create another window (a sub-screen) inside a screen session (yeah, it sounds like a dream within a dream, as in the movie Inception 😀 ),
 

Ctrl + a + c

 
7) To list all windows inside a screen session,
 

Ctrl + a + "
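If you are driving screen from a larger automation script, the same one-liners can be built programmatically instead of by string concatenation. A minimal Python sketch (the function names are my own, not part of screen; the argv lists are meant for subprocess.run):

```python
# Build argv lists for the screen one-liners above. Pure functions,
# so the resulting commands are easy to inspect and test.

def screen_create(name, command):
    """screen -dmS <name> <command...> -- start a detached session."""
    return ["screen", "-dmS", name] + command

def screen_quit(name):
    """screen -S <name> -X quit -- terminate a session by name."""
    return ["screen", "-S", name, "-X", "quit"]

def screen_create_piped(name, shell_command):
    """Wrap a pipeline in bash -c, since plain screen cannot take a pipe."""
    return ["screen", "-dmS", name, "bash", "-c", shell_command]

if __name__ == "__main__":
    print(screen_create("test", ["/usr/bin/top"]))
```

Feed any of these lists to subprocess.run() to get the same effect as the shell one-liners.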

 

Categories: Linux Command Line

GIT Reference

May 28, 2015 Leave a comment

 
Some day-to-day git commands useful for system administrators,
 
 
1) Server side repository setup,
 

ssh reynold@git.jackal.com

 

[reynold@git ~/]$ cd public_git
[reynold@git ~/public_git]$ mkdir testrepo.git
[reynold@git ~/public_git]$ cd testrepo.git/
[reynold@git ~/public_git/testrepo.git]$ git init --bare
Initialized empty Git repository in /home/reynold/public_git/testrepo.git/
[reynold@git ~/public_git/testrepo.git]$ 

 
2) On local machine,
 

cd Projects/
git init
git config --global user.name "Reynold PJ"
git config --global user.email reynold@jackal.com
git remote add public reynold@git.jackal.com:/git/reynold/testrepo.git
git add testscripy.py
git commit -m "Added testscripy.py"
git push public master

 
Remote repository URL for reference,

git.jackal.com:/git/reynold/testrepo.git

 
 
3) Ignore local changes and reset to the one in origin/master,
 

git reset --hard origin/master
git pull origin

 

git checkout master
git merge origin/master

 
4) Create a new branch, apply your changes in that branch and push them. When you are making changes to a shared code base, it’s always recommended to make your changes in your own branch and merge it into the master branch later.
 

reynold@jackal:~/git/chef-cookbook-couchdb$ git pull
reynold@jackal:~/git/chef-cookbook-couchdb$ git checkout -b reynold
reynold@jackal:~/git/chef-cookbook-couchdb$ git add attributes/default.rb
reynold@jackal:~/git/chef-cookbook-couchdb$ git commit -m "attributes/default.rb: Removed timewindow from auto-compaction"
reynold@jackal:~/git/chef-cookbook-couchdb$ git push origin
reynold@jackal:~/git/chef-cookbook-couchdb$ 

 

Merge the changes made in the new branch ‘reynold’ into the master branch.
 

reynold@jackal:~/git/chef-cookbook-couchdb$ git checkout master
Already on 'master'
reynold@jackal:~/git/chef-cookbook-couchdb$ git pull origin master
From git.jackal.com:/git/chef-cookbook-couchdb
 * branch            master     -> FETCH_HEAD
Already up-to-date.
reynold@jackal:~/git/chef-cookbook-couchdb$ git merge reynold
Updating 0fdb954..7039a98
Fast-forward
 attributes/default.rb |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
reynold@jackal:~/git/chef-cookbook-couchdb$ git push origin master
Total 0 (delta 0), reused 0 (delta 0)
To git.jackal.com:/git/chef-cookbook-couchdb.git
   0fdb954..7039a98  master -> master
reynold@jackal:~/git/chef-cookbook-couchdb$ 
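That merge workflow can also be scripted. Below is a small Python sketch that just encodes the command sequence as data (branch and remote names are examples), so a wrapper script can replay it in order with subprocess:

```python
# Encode the branch-and-merge workflow above as a list of git argv's.
# A deployment wrapper can loop over these with subprocess.run().

def branch_merge_plan(branch, remote="origin", base="master"):
    return [
        ["git", "checkout", base],          # switch to the base branch
        ["git", "pull", remote, base],      # bring it up to date
        ["git", "merge", branch],           # merge the feature branch
        ["git", "push", remote, base],      # publish the merge
    ]

if __name__ == "__main__":
    for cmd in branch_merge_plan("reynold"):
        print(" ".join(cmd))
```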

 

Categories: GIT, Linux Command Line

Understanding traceroute using Scapy

April 18, 2015 Leave a comment

 

Scapy is a packet generator/sniffer, and in this post we will use it to understand how traceroute works. And the best part is that it’s Pythonic 😀

 

Assumptions made:

 

1) I have a test VM with the following details,

Hostname: client1.jackal.com
IP : 192.168.122.101
interface: eth0
Gateway: 192.168.122.1

2) tcpdump is installed on the test VM
3) We are doing a traceroute to Google’s public DNS IP 8.8.8.8

 

Explanation:

 

Open two terminals on your test VM. In the first one, run tcpdump with the following options,

 

root@client1:~# tcpdump -v -i eth0 -n -t icmp and port not 22

In the other terminal, type “scapy” to open its interactive interpreter,

 

root@client1:~# 
root@client1:~# scapy
>>> 

Now follow the steps outlined below,

1) Send packet 1 with ttl set as 1,

>>> send(IP(dst='8.8.8.8', ttl=1)/ICMP())
.
Sent 1 packets.
>>> 

In the tcpdump output you will see the following (steps 2, 3, etc. also show the tcpdump output captured after each send operation),

IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0xc0, ttl 64, id 18982, offset 0, flags [none], proto ICMP (1), length 56)
    192.168.122.1 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
2) Send packet 2 with ttl set as 2,

>>> send(IP(dst='8.8.8.8', ttl=2)/ICMP())
.
Sent 1 packets.
>>> 

tcpdump output,

IP (tos 0x0, ttl 2, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 253, id 51505, offset 0, flags [none], proto ICMP (1), length 56)
    10.111.44.1 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
3) Send packet 3 with ttl set as 3. Here you won’t get an “ICMP time exceeded in-transit” message, which means this router has either disabled ICMP responses or is not reachable. This is the case where traceroute shows “3 * * *”: it retries 3 times and, if there is still no response, displays “* * *”.

>>> send(IP(dst='8.8.8.8', ttl=3)/ICMP())
.
Sent 1 packets.
>>> send(IP(dst='8.8.8.8', ttl=3)/ICMP())
.
Sent 1 packets.
>>> send(IP(dst='8.8.8.8', ttl=3)/ICMP())
.
Sent 1 packets.
>>> 
IP (tos 0x0, ttl 3, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 3, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 3, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
4) Send packet 4 with ttl set as 4.

>>> send(IP(dst='8.8.8.8', ttl=4)/ICMP())
.
Sent 1 packets.
>>> 
IP (tos 0x0, ttl 4, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 252, id 62907, offset 0, flags [none], proto ICMP (1), length 96)
    182.73.11.177 > 192.168.122.101: ICMP time exceeded in-transit, length 76
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
5) Send packet 5 with ttl set as 5,

>>> send(IP(dst='8.8.8.8', ttl=5)/ICMP())
.
Sent 1 packets.
>>> 
IP (tos 0x0, ttl 5, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 250, id 25844, offset 0, flags [none], proto ICMP (1), length 96)
    182.79.247.9 > 192.168.122.101: ICMP time exceeded in-transit, length 76
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
6) Send packet 6 with ttl set as 6,

>>> send(IP(dst='8.8.8.8', ttl=6)/ICMP())
.
Sent 1 packets.
>>> 
IP (tos 0x0, ttl 6, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 247, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    72.14.223.230 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
7) Send packet 7 with ttl set as 7,

>>> send(IP(dst='8.8.8.8', ttl=7)/ICMP())
.
Sent 1 packets.
>>> 
IP (tos 0x0, ttl 7, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0xc0, ttl 246, id 31013, offset 0, flags [none], proto ICMP (1), length 56)
    72.14.237.3 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x80, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8

 
8) Send packet 8 with ttl set as 8,

>>> send(IP(dst='8.8.8.8', ttl=8)/ICMP())
.
Sent 1 packets.
>>> 
IP (tos 0x0, ttl 8, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 54, id 23662, offset 0, flags [none], proto ICMP (1), length 28)
    8.8.8.8 > 192.168.122.101: ICMP echo reply, id 0, seq 12535, length 8

This means that the source host reached the destination at the 8th hop. By default the traceroute program tries up to 30 hops, and if it is unable to reach the destination within 30 hops, it prints a host unreachable message.

The traceroute program sends an ICMP packet with the source address set to the IP of the machine running traceroute, and with the TTL initially set to 1. When the packet reaches the first router, the router decrements the TTL by 1, finds it has reached 0, and returns an “ICMP time exceeded in-transit” message to the sender address in the packet header. The sender then increments the TTL to 2 and sends again; this packet dies at the second router, where the TTL decrement brings it to 0, so that router replies with the same message instead of forwarding. The same logic applies for subsequent hops, until the packet finally reaches the destination.
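The TTL mechanics described above can be modelled in a few lines of plain Python, with no real packets involved. The router IPs below are simply the hops observed earlier in this post, with None standing in for the silent hop 3:

```python
# Toy model of traceroute's TTL logic. `path` is the ordered list of
# routers between us and the destination; None marks a router that
# never returns ICMP time-exceeded replies (the "* * *" hop).

def traceroute_sim(path, dest, max_hops=30):
    hops = []
    for ttl in range(1, max_hops + 1):
        if ttl <= len(path):
            router = path[ttl - 1]          # TTL hits 0 at this router
            hops.append(router if router else "* * *")
        else:
            hops.append(dest)               # packet reaches the target
            break
    return hops

path = ["192.168.122.1", "10.111.44.1", None, "182.73.11.177",
        "182.79.247.9", "72.14.223.230", "72.14.237.3"]
print(traceroute_sim(path, "8.8.8.8"))
```

Running this reproduces the shape of the 8-hop trace above: seven intermediate hops (one silent) and the destination at hop 8.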
 

 

To send all 8 packets at once,

>>> send(IP(dst='8.8.8.8', ttl=(1,8))/ICMP())
........
Sent 8 packets.
>>> 

 

IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0xc0, ttl 64, id 18988, offset 0, flags [none], proto ICMP (1), length 56)
        192.168.122.1 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 2, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 253, id 55537, offset 0, flags [none], proto ICMP (1), length 56)
        10.111.44.1 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 3, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 4, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 5, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 6, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 250, id 29640, offset 0, flags [none], proto ICMP (1), length 96)
        182.79.247.9 > 192.168.122.101: ICMP time exceeded in-transit, length 76
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 252, id 14334, offset 0, flags [none], proto ICMP (1), length 96)
        182.73.11.177 > 192.168.122.101: ICMP time exceeded in-transit, length 76
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 7, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 8, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 247, id 0, offset 0, flags [none], proto ICMP (1), length 56)
        72.14.223.230 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x0, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0xc0, ttl 246, id 40140, offset 0, flags [none], proto ICMP (1), length 56)
        72.14.237.3 > 192.168.122.101: ICMP time exceeded in-transit, length 36
	IP (tos 0x80, ttl 1, id 1, offset 0, flags [none], proto ICMP (1), length 28)
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
IP (tos 0x0, ttl 54, id 28109, offset 0, flags [none], proto ICMP (1), length 28)
        8.8.8.8 > 192.168.122.101: ICMP echo reply, id 0, seq 14816, length 8
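If you want to pull the responding hop out of the capture programmatically, a rough Python parser works too (the line format is assumed to match the tcpdump output above; the sample here is abbreviated from it):

```python
import re

# Extract the source IP of every "time exceeded" or "echo reply" line
# from a tcpdump capture like the one above. Rough line-based parsing.

def reply_sources(dump):
    out = []
    for line in dump.splitlines():
        m = re.search(r"(\d+\.\d+\.\d+\.\d+) > \S+: ICMP (time exceeded|echo reply)", line)
        if m:
            out.append(m.group(1))
    return out

sample = """\
    192.168.122.1 > 192.168.122.101: ICMP time exceeded in-transit, length 36
    192.168.122.101 > 8.8.8.8: ICMP echo request, id 0, seq 0, length 8
    8.8.8.8 > 192.168.122.101: ICMP echo reply, id 0, seq 12535, length 8
"""
print(reply_sources(sample))
```

Echo requests are ignored, so the result is exactly the list of routers (and finally the destination) that answered.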

Custom TLD for local network

November 19, 2013 Leave a comment

 

 

In this post I will cover the steps to set up a TLD (top level domain) that can be used on a local network. Even though I have configured DNS zones manually before, this was the first time I configured a TLD zone (even if only a local one) of my own, and it felt really cool after completing the setup 🙂

 

ASSUMPTION:

1) TLD used: “.jackal”
2) Bind version: 9
3) OS: Debian 7 (wheezy)
4) DNS/Nameserver ip: 10.111.44.221

 

SOLUTION:

1) Install bind and required packages,

apt-get install bind9 dnsutils

 

2) Insert the following into file “/etc/bind/named.conf.default-zones”,

zone "jackal." {
        type master;
        file "/etc/bind/db.jackal";
        allow-transfer { any;};
        allow-query { any;};
};

 

3) Verify configuration,

root@dns01:~# named-checkconf 
root@dns01:~#

 

4) Create the zone file for “jackal.” in “/etc/bind/db.jackal”

;
; BIND data file for TLD ".jackal"
;
$TTL	604800
@	IN	SOA	jackal. root.jackal. (
			      2		; Serial
			 604800		; Refresh
			  86400		; Retry
			2419200		; Expire
			 604800 )	; Negative Cache TTL
;
@	  IN	NS	ns1.jackal.
@	  IN	NS	ns2.jackal.
@	  IN	A	10.111.44.221
dns01	  IN 	A	10.111.44.222
apache01  IN	A	10.111.44.223
mysql01   IN	A	10.111.44.224
postfix01 IN	A	10.111.44.225
dovecot01 IN	A 	10.111.44.226
ns1	  IN	A	10.111.44.221
ns2	  IN	A 	10.111.44.221

 

5) Verify the zone file and the bind configuration, then restart the bind service.

root@dns01:/etc/bind# named-checkzone jackal. db.jackal 
zone jackal/IN: loaded serial 2
OK
root@dns01:/etc/bind# named-checkconf 
root@dns01:/etc/bind# service bind9 restart
[....] Stopping domain name service...: bind9waiting for pid 2279 to die
. ok 
[ ok ] Starting domain name service...: bind9.
root@dns01:/etc/bind#

 

6) Create a separate directory for storing zone files of domains,

mkdir /etc/bind/zones/

 

7) Use the “initdns.sh” script (listed below) for creating DNS zone entries.
NOTE: We are using domains ending with “.jackal”. Also, customize “initdns.sh” for your own use 😀

root@dns01:/# ./initdns.sh rogerjo.jackal
[*] Created zone file for rogerjo.jackal
[*] Added zone entry for rogerjo.jackal in bind configuration
root@dns01:/# named-checkzone rogerjo.jackal /etc/bind/zones/rogerjo.jackal 
zone rogerjo.jackal/IN: loaded serial 1378789827
OK
root@dns01:/# rndc reload
server reload successful
root@dns01:/#

 

 

 

initdns.sh

#!/bin/bash

if [ $# -ne 1 ];then
	echo "Usage: initdns.sh <domainname>"
	exit 1
fi

## Domain name
MYDOMAIN=$1
ZONECONFIG="/etc/bind/named.conf.default-zones"

if [ `sed -n '/^zone "'${MYDOMAIN}'."/p' ${ZONECONFIG}|wc -l` -eq 1 ];then
	echo "[ERROR] Entry for ${MYDOMAIN} already exists"
	exit 1
fi

## Nameservers
NAMESERVER1="ns1.jackal"
NAMESERVER2="ns2.jackal"

## Apache and ftp service are running on the same host
APACHE_IP="10.111.44.222"
FTP_IP="10.111.44.222"

##Mail server
SMTP_IP="10.111.44.224"
POP_IMAP_IP="10.111.44.225"

## DB Server
MYSQL_IP="10.111.44.223"

## Create zone file
cat > /etc/bind/zones/${MYDOMAIN} << EOF
\$TTL    86400
@       IN      SOA     ns.${MYDOMAIN}. root.${MYDOMAIN}. (
                        1378789827      ; Serial
                        10800   ; Refresh
                        3600    ; Retry
                        604800  ; Expire
                        10800 ) ; Minimum
${MYDOMAIN}.       IN NS   ${NAMESERVER1}.
${MYDOMAIN}.       IN NS   ${NAMESERVER2}.
${MYDOMAIN}.       IN A    ${APACHE_IP}
www.${MYDOMAIN}.   IN CNAME ${MYDOMAIN}.
${MYDOMAIN}.       IN MX  10  mx01.${MYDOMAIN}.
${MYDOMAIN}.       IN MX  10  mx02.${MYDOMAIN}.
mx01.${MYDOMAIN}.  IN A    ${SMTP_IP}
mx02.${MYDOMAIN}.  IN A    ${SMTP_IP}
pop.${MYDOMAIN}.   IN A    ${POP_IMAP_IP}
imap.${MYDOMAIN}.  IN A    ${POP_IMAP_IP}
mysql.${MYDOMAIN}. IN A    ${MYSQL_IP}
ftp.${MYDOMAIN}.   IN A    ${FTP_IP}
EOF

echo "[*] Created zone file for ${MYDOMAIN}"

## Create zone entry in bind configuration
cat >> ${ZONECONFIG} << EOF

zone "${MYDOMAIN}." {
  	type master;
	file "/etc/bind/zones/${MYDOMAIN}";
};
EOF

echo "[*] Added zone entry for ${MYDOMAIN} in bind configuration"
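The zone file that initdns.sh writes via its heredoc can equally be rendered from a template. A trimmed Python sketch of the same idea (it reuses the script’s example IPs and only a subset of the records):

```python
# Render a per-domain zone file like initdns.sh does with its heredoc.
# IPs and nameserver names mirror the example values in the script.

ZONE_TEMPLATE = """\
$TTL    86400
@       IN      SOA     ns.{d}. root.{d}. (
                        {serial}      ; Serial
                        10800   ; Refresh
                        3600    ; Retry
                        604800  ; Expire
                        10800 ) ; Minimum
{d}.       IN NS   ns1.jackal.
{d}.       IN NS   ns2.jackal.
{d}.       IN A    {apache}
www.{d}.   IN CNAME {d}.
{d}.       IN MX  10  mx01.{d}.
mx01.{d}.  IN A    {smtp}
"""

def render_zone(domain, serial, apache_ip="10.111.44.222", smtp_ip="10.111.44.224"):
    return ZONE_TEMPLATE.format(d=domain, serial=serial, apache=apache_ip, smtp=smtp_ip)

print(render_zone("rogerjo.jackal", 1378789827))
```

Write the returned string to /etc/bind/zones/<domain> and the effect is the same as the heredoc, but with the record set testable in isolation.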

 

 

 

removedns.sh

#!/bin/bash

if [ $# -ne 1 ];then
	echo "Usage: removedns.sh <domainname>"
	exit 1
fi

## Domain name
DOMAIN=$1

ZONECONFIG="/etc/bind/named.conf.default-zones"

if [ `sed -n '/^zone "'${DOMAIN}'."/p' /etc/bind/named.conf.default-zones|wc -l` -eq 1 ];then
	##Remove entries from dns configuration file
	sed -i -e '/^zone "'${DOMAIN}'."/,/^};/d' ${ZONECONFIG}
	sed -i '$d' ${ZONECONFIG}

	echo "[*] Removed zone entries from bind configuration"
else
	echo "[ERROR] ${DOMAIN} not present in bind configuration"
	exit 1
fi

#Remove zone file if it exists
if [ -f /etc/bind/zones/${DOMAIN} ];then
	rm -f /etc/bind/zones/${DOMAIN}
	echo "[*] Removed zone db file"
fi
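The sed-based stanza removal in removedns.sh can also be expressed as a pure text transform. A Python sketch of the same idea, operating on the config as a string:

```python
import re

# Same job as the sed one-liner in removedns.sh: drop the
# `zone "<domain>." { ... };` stanza from the bind config text.

def remove_zone(config_text, domain):
    pattern = r'zone "%s\." \{.*?\n\};\n?' % re.escape(domain)
    return re.sub(pattern, "", config_text, flags=re.DOTALL)

conf = '''zone "a.jackal." {
\ttype master;
\tfile "/etc/bind/zones/a.jackal";
};
zone "b.jackal." {
\ttype master;
\tfile "/etc/bind/zones/b.jackal";
};
'''
print(remove_zone(conf, "a.jackal"))
```

The non-greedy match stops at the first closing `};`, so only the targeted stanza is removed and the rest of the file is left intact.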

Create RAID 0 on Dell PERC 5/i from Linux command line using MegaCli

June 16, 2013 2 comments

 

SCENARIO: A customer has a Dell PERC 5/i RAID controller with RAID 1 already configured on two drives. His DC added two new drives of different sizes (a 1TB and a 2TB) to the RAID controller, but they weren’t visible inside the server. The fdisk command didn’t list the two new drives.

The PERC controller only exposes to the OS the drives that are configured as RAID volumes. If we want the new drives to be seen by Windows/Linux without becoming part of the existing RAID 1, we can create new RAID 0 volumes containing only the new drives.

 

 

SOLUTION:

 

1) Download the MegaCli and Lib_Utils rpms to the server from the rapidshare URLs pasted below (you cannot wget them to the server :P),

https://rapidshare.com/files/3230206587/Lib_Utils-1.00-08.noarch.rpm
http://rapidshare.com/files/565005303/MegaCli-8.01.06-1.i386.rpm

 

2) Install the Lib_Utils and MegaCli rpm packages inside the server,

rpm -ivh Lib_Utils-1.00-08.noarch.rpm
rpm -ivh MegaCli-8.01.06-1.i386.rpm

 

3) Retrieve the physical drive information using MegaCli command,

root@jackal777[/opt/MegaRAID/MegaCli]# ./MegaCli64 -PdList -a0| egrep 'Device|Firm|Inq|Coer'
Enclosure Device ID: 8
Device Id: 0
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Online, Spun Up
Inquiry Data:      WD-WMC300310248WDC WD20EFRX-68AX9N0                    80.00A80
Device Speed: Unknown 
Media Type: Hard Disk Device
Enclosure Device ID: 8
Device Id: 1
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Online, Spun Up
Inquiry Data:      WD-WMC300410955WDC WD20EFRX-68AX9N0                    80.00A80
Device Speed: Unknown 
Media Type: Hard Disk Device
Enclosure Device ID: 8
Device Id: 4
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 931.0 GB [0x74600000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Inquiry Data:      WD-WCAV5E944009WDC WD10EARS-00Y5B1                     80.00A80
Device Speed: Unknown 
Media Type: Hard Disk Device
Enclosure Device ID: 8
Device Id: 5
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Inquiry Data:       MJ0251YMG06ZAAHitachi HUA5C3020ALA640                 ME0KR5A0
Device Speed: Unknown 
Media Type: Hard Disk Device
root@jackal777[/opt/MegaRAID/MegaCli]# 
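To pick out the unconfigured drives from the -PdList output without eyeballing it, a small Python parser can collect the enclosure:slot pairs (the sample text below is abbreviated from the output above):

```python
# Scan filtered `MegaCli -PdList` output and collect the
# enclosure:slot addresses of drives still in "Unconfigured(good)" state.

def unconfigured_drives(pdlist_text):
    drives, enc, dev = [], None, None
    for line in pdlist_text.splitlines():
        line = line.strip()
        if line.startswith("Enclosure Device ID:"):
            enc = line.split(":")[1].strip()
        elif line.startswith("Device Id:"):
            dev = line.split(":")[1].strip()
        elif line.startswith("Firmware state:") and "Unconfigured(good)" in line:
            drives.append(f"{enc}:{dev}")
    return drives

sample = """\
Enclosure Device ID: 8
Device Id: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 8
Device Id: 4
Firmware state: Unconfigured(good), Spun Up
Enclosure Device ID: 8
Device Id: 5
Firmware state: Unconfigured(good), Spun Up
"""
print(unconfigured_drives(sample))
```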

 

4) We are going to use the last two drives to create the RAID 0 array. The firmware state of these two drives is “Unconfigured(good), Spun Up“. The first two drives are already configured as RAID 1. Details of the two disks are pasted below,

Disk 1:

Enclosure Device ID: 8
Device Id: 4
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 931.0 GB [0x74600000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Inquiry Data:      WD-WCAV5E944009WDC WD10EARS-00Y5B1                     80.00A80
Device Speed: Unknown 
Media Type: Hard Disk Device

Disk 2:

Enclosure Device ID: 8
Device Id: 5
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Inquiry Data:       MJ0251YMG06ZAAHitachi HUA5C3020ALA640                 ME0KR5A0
Device Speed: Unknown 
Media Type: Hard Disk Device

 

The general format for creating a RAID 0, 1 or 5 array using MegaCli is as follows,

MegaCli -CfgLdAdd -r(0|1|5) [E:S, E:S, ...] -aN

 

Where E refers to the Enclosure Device ID and S refers to the Device Id.

Now create the RAID 0 array using drives [8:4] and [8:5] as follows,

 

root@jackal777[/opt/MegaRAID/MegaCli]# ./MegaCli64 -CfgLdAdd -r0[8:4,8:5] -a0
                                     
Adapter 0: Created VD 1

Adapter 0: Configured the Adapter!!

Exit Code: 0x00
root@jackal777[/opt/MegaRAID/MegaCli]# 
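The -CfgLdAdd invocation can be assembled from a drive list programmatically; a tiny Python sketch mirroring the syntax above:

```python
# Build the MegaCli -CfgLdAdd argv from a RAID level, a list of "E:S"
# drive addresses, and an adapter number, matching the syntax above.

def cfg_ld_add(level, drives, adapter=0):
    return ["MegaCli64", "-CfgLdAdd",
            f"-r{level}[{','.join(drives)}]",
            f"-a{adapter}"]

print(" ".join(cfg_ld_add(0, ["8:4", "8:5"])))
```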

 

 

 

NOTE:

Use the MegaCli-8.01.06-1.i386.rpm from the URL above, and use the Lib_Utils-1.00-08.noarch.rpm package from 8.00.29_Linux_MegaCli.zip (downloaded from the official website). Don’t use the MegaCli-8.00.29-1.i386.rpm from 8.00.29_Linux_MegaCli.zip, because the MegaCli version contained inside that zip file doesn’t support logical drive creation; we have to use version 8.01.

 

 

REFERENCES:

http://tools.rapidsoft.de/perc/perc-cheat-sheet.html
http://www.overclock.net/t/359025/perc-5-i-raid-card-tips-and-benchmarks
http://blog.nexcess.net/2010/12/28/managing-hardware-raid-with-megacli/
http://community.spiceworks.com/how_to/show/8781-configuring-virtual-disks-on-a-perc-5-6-h700-controller
http://hwraid.le-vert.net/wiki/LSIMegaRAIDSAS
http://artipc10.vub.ac.be/wordpress/2011/09/12/megacli-useful-commands/
http://preston4tw.blogspot.in/2013/03/megacli-80216-breaks-dell-perc-5i.html
https://code.google.com/p/fastvps/downloads/detail?name=MegaCli-8.01.06-1.i386.rpm&can=2&q=

Categories: Linux Command Line, RAID

One-liners for troubleshooting Virtuozzo load issues

June 9, 2013 1 comment

 

I wish to introduce various one-liners that can be used to troubleshoot load or performance issues within a Virtuozzo node.

To begin with, we will discuss the situations that can degrade the performance of a Virtuozzo node. As is obvious, heavy usage of any of CPU, memory, disk or network will degrade the performance of a node. The same applies to Virtuozzo nodes too, but with an additional complexity: the load could also be the result of processes running inside a container. In that case we need to identify the problem container and deal with the processes inside it.

I have found these one-liners quite useful for finding out the problem container while troubleshooting Virtuozzo alerts.

 

1) Troubleshooting load issues caused by high CPU activity

 

=> Display the list of containers sorted by CPU usage,

/usr/sbin/vzlist -o ctid,laverage

OR

/usr/sbin/vzstat -t -s cpu|awk 'NF==10{print $0}'

=> Sometimes the above one-liners will not show the actual CPU usage inside the containers (possibly due to a delay in updating the stats) even though the load on the node is high. In this situation, the command pasted below will help find the CPU-intensive containers.

for i in `/usr/sbin/vzlist -H -o ctid`; do echo "CTID: ${i} `/usr/sbin/vzctl exec ${i} cat /proc/loadavg`"; done

=> List all containers whose status is not “OK”. This is quite helpful while troubleshooting load issues when the load average on the node is extremely high (above 1000).

/usr/sbin/vzstat -t|awk '{if(NF==10 && $2!="OK" && $1!="CTID")print $0}'

=> Lists the top 10 containers by the number of processes running inside them.

/usr/sbin/vzlist -H -o ctid,numproc|sort -r -n -k2|head
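The same “sort by numproc” pipeline can live inside a monitoring script. A Python sketch assuming the two-column output format of `vzlist -H -o ctid,numproc`:

```python
# The `vzlist ... | sort -r -n -k2 | head` pipeline in Python: parse
# two-column "ctid numproc" output and return the top containers by
# process count.

def top_by_numproc(vzlist_output, n=10):
    rows = []
    for line in vzlist_output.split("\n"):
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            rows.append((parts[0], int(parts[1])))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]

sample = "101 35\n102 240\n103 17\n"
print(top_by_numproc(sample, n=2))
```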

 

2) Troubleshooting load issues caused by n/w activity

 

=> Sorts containers based on socket usage

/usr/sbin/vzstat -t -s sock|awk 'NF==10{print $0}'

=> Sorts containers based on TCP sender buffer usage,

/usr/sbin/vzlist -H -o ctid,tcpsndbuf |sort -r -n -k2

=> Sorts containers based on TCP receive buffer usage,

/usr/sbin/vzlist -H -o ctid,tcprcvbuf |sort -r -n -k2

=> Sorts containers based on the highest inbound traffic (quite useful while troubleshooting n/w related attacks),

/usr/sbin/vznetstat -r |awk '$3 ~ /G/ {print $0}'|sort -r -nk3

=> Sorts containers based on the highest outbound traffic (quite useful while troubleshooting n/w related attacks),

/usr/sbin/vznetstat -r |awk '$5 ~ /G/ {print $0}'|sort -r -nk5

 

3) Troubleshooting performance issues caused by memory utilization

 

The ‘dmesg‘ command displays containers which have resource shortages. There is a possibility that such a container is the abusive one, so we need to check the processes running inside it.

[root@virtuozzo ~]# dmesg|egrep -v '(SMTP-LOG|INPUT-DROP|LIMIT-PPS-DROP|FORWARD-DROP)'
TTL=64 ID=0 PROTO=UDP SPT=68 DPT=67 LEN=556
[1101732.300833] __ratelimit: 44 callbacks suppressed
[1101732.310531] Fatal resource shortage: kmemsize, UB 12215.
[1101742.294179] Fatal resource shortage: kmemsize, UB 12215.
[1101752.277368] Fatal resource shortage: kmemsize, UB 12215.
[1101752.393226] Fatal resource shortage: kmemsize, UB 12215.
[1105092.458621] __ratelimit: 101 callbacks suppressed
[1105092.468411] Fatal resource shortage: kmemsize, UB 12215.
[root@virtuozzo ~]#

 

4) Troubleshooting load issues caused by high disk I/O activity.

 

You can install the ‘atop’ command and spot problem processes at the top of the list when sorting by disk usage (‘D’). To get more information on using ‘atop’ refer url.

The above one-liners will help you identify the problem CTID or the process (PID) responsible for the performance issue. In the second case, after finding the process id, use the ‘vzpid’ command to find the container inside which the process is running, then either renice or stop that process. In the first case, you can view the processes running inside the container using either the ‘vzps’ or ‘vztop’ command, whose usage is given below,

vztop -b -c -n 1 -E 

OR

vzps auxfww -E 

So, that is it guys. I sincerely hope you get to take away something helpful from all this.

Happy Hunting 😀

Comprehensive Analysis of /proc/user_beancounters : Virtuozzo

June 8, 2013 Leave a comment

While troubleshooting issues related to a Virtuozzo VPS, we usually come across the ‘user_beancounters’ file in the /proc directory. This file is of importance only if we use UBC or combined SLM+UBC memory mode for our Virtuozzo VPS. The resource control information about running virtual environments is contained in this file. So basically, ‘/proc/user_beancounters’ represents your VPS’s allocation of various system resources (memory). It is thus a main indicator of how well our VPS works, how stable it is, and whether there is a resource shortage. So, if you face any trouble while running or installing applications on your VPS, one good way to find the source of the problem is to take a look at this file.

Let’s dig deeper into the details of this file.

In Parallels Virtuozzo Containers virtualization technology, the resource control settings for each virtual machine are stored in the configuration file “/etc/vz/conf/XXX.conf” (where XXX is the ID of the given CT). These settings are loaded and applied to the containers during VPS startup or on events such as execution of “vzctl set CTID”. For running containers the resource control settings are exposed via “/proc/user_beancounters”. One such file exists on the node and one inside each VPS; the file on the hardware node contains the resource control settings of all running VPSs. A pictorial representation of the “/proc/user_beancounters” file inside a VPS is shown below:

[Image: sample /proc/user_beancounters output from inside a VPS]

A brief description of the various columns is given below,

UID: Indicates the ID of the container. In Virtuozzo each container is given a unique ID for ease of management.

RESOURCE: This field indicates the primary, secondary and auxiliary parameters in Virtuozzo. To get more details of these resources, refer url

HELD: Indicates the current usage of the various resources.

MAXHELD: Indicates the maximum usage of the resource since VPS startup.

BARRIER & LIMIT: Give the soft limit and hard limit of the Virtuozzo resource controls. Resource requests above the limit get denied.

FAILCNT: Shows the number of refused (denied) resource allocations since VPS startup. A non-zero value in this column indicates a resource shortage: we need to either increase that particular resource or find the process responsible for the usage and optimize it. Otherwise it can cause weird issues with services running inside the container, e.g. unexpected service outages, intermittent website issues, etc.

The following awk script can be used to list all containers with a non-zero value in the “failcnt” column, along with the resource name and the corresponding failcnt value. Save the script as “/root/failcnt.awk” or any name you like.

/root/failcnt.awk

BEGIN{ OFS=""; i=0; failcntflg=0; }
{
	# Container header lines have 7 fields and start with "UID:"
	if(NF==7 && index($1,":")>0){
		# New container: flush the previous container's failures first
		if(failcntflg==1){
			printf "\nCTID=%s",arr1[1];
			for(j=1;j<=i;j++){ printf " %s ",vector[j]; delete vector[j]; }
			failcntflg=0; i=0;
		}
		split($1,arr1,":");
		if($NF!=0){
			i++; failcntflg=1;
			vector[i] = $2" "$NF;
		}
	}
	# Plain resource lines have 6 fields; last field is failcnt
	if(NF==6 && $NF!=0){
		i++; failcntflg=1;
		vector[i] = $1" "$NF;
	}
}
END{
	# Flush the last container too
	if(failcntflg==1){
		printf "\nCTID=%s",arr1[1];
		for(j=1;j<=i;j++) printf " %s ",vector[j];
	}
	printf "\n";
}
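For comparison, here is the same failcnt report as a Python function; the sample below is a shortened, made-up beancounters excerpt in the usual 7/6-column layout:

```python
# Python equivalent of the awk script: report, per container, every
# resource in /proc/user_beancounters whose failcnt is non-zero.

def failed_counters(text):
    result, ctid = {}, None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] in ("Version:", "uid"):
            continue                        # skip version and header rows
        if fields[0].endswith(":"):         # "UID:" row starts a container
            ctid = fields[0].rstrip(":")
            fields = fields[1:]
        if len(fields) == 6 and fields[-1].isdigit() and int(fields[-1]) != 0:
            result.setdefault(ctid, []).append((fields[0], int(fields[-1])))
    return result

sample = """\
Version: 2.5
       uid  resource      held  maxheld  barrier  limit  failcnt
    13904:  kmemsize      2718  3502     8000     8800   528
            numothersock  5     9        80       80     1
            numproc       12    20       96       96     0
"""
print(failed_counters(sample))
```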

Now run the script from node as follows,

[root@adminahead ~]# awk -f /root/failcnt.awk /proc/user_beancounters

CTID=10592 lockedpages 13
CTID=13917 kmemsize 357 shmpages 4 physpages 5 oomguarpages 1 tcprcvbuf 755
CTID=13904 kmemsize 528 numothersock 1
CTID=13905 kmemsize 73 numothersock 1
CTID=13897 kmemsize 1 shmpages 4 tcprcvbuf 4751
CTID=10000000 numothersock 1986
CTID=10594 kmemsize 27 physpages 7 oomguarpages 1 tcpsndbuf 295136
CTID=12435 shmpages 4
CTID=12437 kmemsize 2 shmpages 2 tcprcvbuf 690
CTID=12441 shmpages 3
CTID=12438 shmpages 1 physpages 712 oomguarpages 73 tcpsndbuf 63
CTID=10651 physpages 15 oomguarpages 8
CTID=10611 physpages 24 oomguarpages 11
CTID=10623 numothersock 14
CTID=10570 physpages 6 oomguarpages 3
CTID=10578 physpages 517 oomguarpages 33
CTID=10603 physpages 49 oomguarpages 40
CTID=10633 physpages 87 oomguarpages 24
CTID=10610 numproc 71 physpages 2250 oomguarpages 472
[root@adminahead ~]#

As you can see from the above output, container “13917” shows the highest number of ‘failcnt’ hits. For this VPS, “kmemsize”, “shmpages”, “physpages”, “oomguarpages” and “tcprcvbuf” show non-zero failcnt values, and the first four of these resources are related to RAM. Upgrading the RAM of that VPS is a good suggestion, but that should be considered only after finding the resource-intensive process inside the container and optimizing it.

You can use the following commands to list out the memory intensive processes inside the container.

* Lists top 3 memory intensive processes,

ps -auxf | sort -nr -k 4 | head -3

OR

wget -O /root/ps_mem.py http://www.pixelbeat.org/scripts/ps_mem.py
python /root/ps_mem.py |tail -3

The “/proc/user_beancounters” file on the node can be monitored continuously to find the VPSs that are short of resources, and the corresponding VPS owner can be contacted for a resource upgrade or optimization.