Archive

Archive for the ‘Programming’ Category

Combine regular expression and conditional statements in BASH

January 1, 2012

Bash supports conditional statements. For example, the following script shows a usage message if the number of arguments is 0:

#!/usr/bin/env bash
if [ $# -eq 0 ]; then
    echo "Usage: $0 start|stop|restart"
    exit 0
fi
echo "going to run with \$1=$1"

We might also want to use a regular expression to check that $1 is start, stop or restart when $# is not 0:

#!/usr/bin/env bash
function usage() {
    echo "Usage: $0 start|stop|restart"
}
if [ $# -eq 0 ]; then
    usage
    exit 0
fi

if [[ ! $1 =~ ^(start|stop|restart)$ ]]; then
    usage
    exit 0
fi
echo "going to run with \$1=$1"

But wouldn’t it be nice if the two tests could be combined? With the bash operator ||, the above code can be written as:

#!/usr/bin/env bash
function usage() {
    echo "Usage: $0 start|stop|restart"
}   
if [ $# -eq 0 ] || [[ ! $1 =~ ^(start|stop|restart)$ ]]; then
    usage
    exit 0
fi  
echo "going to run with \$1=$1"

One more example, using the && operator instead:

#!/usr/bin/env bash
if [ -d ~/a_folder ] && [[ $1 =~ ^(install|remove)$ ]]; then
    echo "going to $1 something" 
else
    echo "Folder ~/a_folder doesn't exist or you specified the wrong parameter:"
    echo "Usage: $0 install|remove" 
    exit 0
fi  
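
As a variation on the examples above, both tests can also live inside a single [[ ]], which accepts && and || directly. The following is only a sketch: the helper name is_valid_action is made up for illustration and is not part of the original scripts, and the regular expression is kept in a variable to avoid quoting surprises.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only when its first argument is an allowed action.
is_valid_action() {
    local re='^(start|stop|restart)$'
    # Both conditions sit inside one [[ ]]; here $# counts the function's arguments.
    [[ $# -gt 0 && $1 =~ $re ]]
}

for arg in start reload ""; do
    if is_valid_action "$arg"; then
        echo "'$arg' accepted"
    else
        echo "'$arg' rejected"
    fi
done
```

Wrapping the test in a function keeps the if statement short and lets the same check be reused elsewhere in a script.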
Categories: Bash, Programming

MySQL: select data whose certain column contains the specified values only

December 23, 2011

This problem is best described with an example:

select * from mytbl;
+------+------+
| k    | v    |
+------+------+
| A    | 1    |
| A    | 2    |
| A    | 3    |
| B    | 1    |
| B    | 3    |
| C    | 1    |
| C    | 3    |
| C    | 4    |
| E    | 2    |
| E    | 1    |
| F    | 5    |
+------+------+

k, v are both char(10), not null.

I want to find the records whose v column is either 1 or 3, and only 1 or 3, so in the above example the only qualifying records are (‘B’, ‘1’) and (‘B’, ‘3’). This is a simplified version of a problem I faced at work recently. I did find someone posting a similar question on the Internet, but I forgot to bookmark the URL and I didn’t find its solutions interesting, so I pulled my hair a bit and came up with the following solution using group_concat and regexp:

select k, group_concat(distinct v order by v) as g from mytbl group by k having g regexp '^(1,?)?(3)?$';
+------+------+
| k    | g    |
+------+------+
| B    | 1,3  |
+------+------+

If the requirement changes so that v must be 1, 3 or 4 (again, 1, 3 or 4 only), the query becomes:

select k, group_concat(distinct v order by v) as g from mytbl group by k having g regexp '^(1,?)?(3,?)?(4)?$';
+------+-------+
| k    | g     |
+------+-------+
| B    | 1,3   |
| C    | 1,3,4 |
+------+-------+
Categories: mysql, Programming

Play with randomness in MySQL

December 2, 2011

Some of you might know about the rand() function in MySQL: it returns a random floating-point number v in the range 0 <= v < 1. Most of the time we use it to return results in random order. For example,

We have a people table that has the following records:

select * from people;
+-----------+
| name      |
+-----------+
| Bob       |
| Alice     |
| Kim       |
| Tom       |
| Jerry     |
| Linda     |
| Fransisco |
| Zack      |
| Peter     |
+-----------+

To pick 3 people from the list at random, we run:

mysql> select * from people order by rand() limit 3;
+------+
| name |
+------+
| Zack |
| Bob  |
| Kim  |
+------+

The above query will most likely return a different result every time. Next I am going to make it a little more interesting. Let’s say I have another table named prize which contains a list of prizes:

select * from prize;
+-----------------+
| name            |
+-----------------+
| Pencil          |
| Coffee grinder  |
| iPad            |
| GPS watch       |
| Yoga mat        |
| 2 movie tickets |
+-----------------+

What we want to do is assign a randomly picked prize to each person in the people table. The same prize can go to more than one person, since there are fewer prizes than people. Here’s the query:

select o.*, (select name from prize order by rand() limit 1) as prize from people o;

This would return something like the following (again, the result will most likely be different if you try it):

+-----------+-----------------+
| name      | prize           |
+-----------+-----------------+
| Bob       | iPad            |
| Alice     | Yoga mat        |
| Kim       | GPS watch       |
| Tom       | GPS watch       |
| Jerry     | iPad            |
| Linda     | GPS watch       |
| Fransisco | Pencil          |
| Zack      | 2 movie tickets |
| Peter     | GPS watch       |
+-----------+-----------------+

For your convenience, you can use the following SQL statements to create and populate the tables:

CREATE TABLE `people` (
  `name` varchar(30) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `people` VALUES ('Bob'),('Alice'),('Kim'),('Tom'),('Jerry'),('Linda'),('Fransisco'),('Zack'),('Peter');

CREATE TABLE `prize` (
  `name` varchar(50) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `prize` VALUES ('Pencil'),('Coffee grinder'),('iPad'),('GPS watch'),('Yoga mat'),('2 movie tickets');
Categories: mysql, Programming

A simple node.js rss parser using sax-js

November 15, 2011

The XML parser sax-js, written by isaacs (the creator of npm, the de facto package manager for Node.js), comes with a few examples that deal with local XML files. I couldn’t find one that parses XML data from a remote host (think RSS), so I decided to write one. In this example I borrowed heavily from both sax-js’s example code and the node-rss source code.

The code

cat saxrss.js

var sax=require('sax');
var http=require('http');

var callback=function(){};

exports.get_rss=function(host,port,path, cb) {
	callback=cb
	var parser = sax.parser(true)
	var item = null
	var currentTag = null
	var items=[]
	var cnt=0

	parser.onclosetag = function (tagName) {
		var tag_name=tagName.toLowerCase();
		if (tag_name === 'item' || tag_name === 'entry') {
			currentTag = item = null
			cnt++
			return
		}
		if (currentTag && currentTag.parent) {
			var p = currentTag.parent
			delete currentTag.parent
			currentTag = p
		}
	}

	parser.onopentag = function (tag) {
		var tag_name=tag.name.toLowerCase()
		if (tag_name !== 'item' && tag_name !== 'entry' && !item) return
		// start a new record for RSS <item> and Atom <entry> elements alike
		if (tag_name === 'item' || tag_name === 'entry') {
			item = tag
			items[cnt]={}
		}
		tag.parent = currentTag
		tag.children = []
		tag.parent && tag.parent.children.push(tag)
		currentTag = tag
	}

	parser.ontext = function (text) {
		if (currentTag) {
			items[cnt][currentTag.name.toLowerCase()]=text
		}
	}

	parser.onend = function () {
		callback(items)
	}

	var body='';
	http.get( { host:host, path:path, port:port }, function(res) {
		res.addListener('end', function() {
			parser.write(body).end()
		});
		res.setEncoding('utf8');
		res.on('data', function(d) {
			body+=d;
		});
	});
}

cat test1.js

var rss=require('./saxrss.js');
var host='feeds.finance.yahoo.com';
// to get finance headlines about stock AAPL
var path='/rss/2.0/headline?s=aapl&region=US&lang=en-US';

rss.get_rss(host, 80, path, function(items) {
	console.log(items);
});

To run:

node test1.js

Required node modules:

sax

References:

node-rss
sax-js

[ UPDATE 2/20/2012 ]
With the xml-simple module, the above example can be written as:

// getting xml and convert to json object using xml-simple example
var http=require('http'),
	simplexml=require('xml-simple'),
	config={host:'feeds.finance.yahoo.com', path:'/rss/2.0/headline?s=aapl&region=US&lang=en-US', port:80},
	body='';

http.get( config, function( res ) {
	res.addListener('end', function() {
		simplexml.parse(body, function(e, parsed) {
			console.log(parsed.channel.item);
			//console.log(JSON.stringify(parsed));
		});
	});
	res.setEncoding('utf8');
	res.on('data', function(d) {
		body+=d;
	});
});

To install xml-simple, run npm install -g xml-simple.

tmux techniques by example

November 14, 2011

I am a big fan of tmux – a terminal multiplexer. Think of it as a text version of a VNC client, with many more powerful features. In this post I will demo some of the tmux techniques that I use quite often.

Assumptions:

1) tmux on a GNU/Linux system
2) Default shell is BASH

Preparation:

For demo purposes I made up a script, dummytask.sh, to simulate the task(s) that we will be running in tmux windows:

#!/bin/bash
taskname="$*"   # join all arguments into a single task name
ans=''
if [ -n "$taskname" ]; then
	# keep prompting until the user presses q or Q
	while [ "$ans" != "q" ] && [ "$ans" != "Q" ]; do
		read -e -n 1 -p "Running task $taskname, to exit, press q(Q). " ans
	done
	echo "Done task $taskname."
else
	echo "Usage: $0 task."
	echo "Example1: $0 debugging"
	echo "Example2: $0 importing data"
fi

Example: create a tmux session with session name mysess

tmux new-session -s mysess -d

If the option -d (detached) is omitted, you will be taken directly to the first window titled “0:bash” once the command is executed and any commands afterwards will be entered into that window. Therefore it’s a good habit to use option -d whenever creating a new session.

Example: create a new tmux session and change the first default window title to task1

[ If the sess tmux session is currently attached, type q, Enter, exit, Enter to leave it first ]
In the first example, tmux titles the first window “0:bash” by default (it could be ksh, csh, etc., depending on the default shell setting). To use a different title, simply add the -n (name) option:

tmux new-session -s sess -d -n task1

Example: create tmux session mysess if it has not been created yet

tmux list-session 2>&1 | grep -q "^mysess:" || tmux new-session -s mysess -d

Notes: 2>&1 sends tmux’s error output into the pipe, where grep -q swallows it. This hides the message “failed to connect to server: Connection refused”, which appears when there are no tmux sessions running. -q suppresses the normal output of grep; it won’t affect the result, but using it makes the command less distracting. The regular expression ^mysess: is anchored at both ends of the name so that it won’t match a session name such as “mysessabc” by mistake. The logical operator || is just shorthand for if ! condition; then … fi.
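
tmux also provides a has-session command that reports through its exit status whether a session exists, which avoids parsing the list output. The following is a minimal sketch, assuming a tmux version in which a leading = on the target requests an exact name match (on older versions the grep approach above may be the safer choice). The function name ensure_session is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a detached session only if it does not exist yet.
ensure_session() {
    local name=$1
    # has-session exits non-zero when the session (or the tmux server) is missing.
    # The leading "=" asks for an exact match; support for it is an assumption here.
    tmux has-session -t "=$name" 2>/dev/null || tmux new-session -d -s "$name"
}

# Usage: ensure_session mysess
```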

Example: create a new window titled mywin in tmux session mysess if the window does not exist yet, creating the session first if it does not exist yet

#!/bin/bash
sess=mysess
wn=mywin

tmux list-session 2>&1 | grep -q "^$sess:" || tmux new-session -s $sess -d
tmux list-window -t $sess 2>&1 | grep -q ": $wn \[" || tmux new-window -t $sess -n $wn
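
The window check above depends on the exact layout of the list-window output (": name ["), and that layout is not the same in every tmux version. As a hedged alternative, list-windows accepts a -F format option, so the window names can be printed one per line and compared literally. The helper name ensure_window is made up for this sketch, and it assumes a tmux version that supports -F.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create window $2 in session $1 unless it already exists.
ensure_window() {
    local sess=$1 wn=$2
    # Print one window name per line, then look for an exact, literal match.
    tmux list-windows -t "$sess" -F '#{window_name}' 2>/dev/null |
        grep -Fqx -- "$wn" || tmux new-window -t "$sess" -n "$wn"
}

# Usage: ensure_window mysess mywin
```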

Example: run a script in mysess:mywin in the above example

#!/bin/bash
sess=mysess
wn=mywin

tmux list-session 2>&1 | grep -q "^$sess:" || tmux new-session -s $sess -d
tmux list-window -t $sess 2>&1 | grep -q ": $wn \[" || tmux new-window -t $sess -n $wn
tmux send-keys -t $sess:$wn "./dummytask.sh cooking" Enter

How do we know the above script is doing what we intended to do? Check out the next example.

Example: attach tmux session mysess

tmux a -t mysess

Example: run a script in the first window of the newly created tmux session

tmux new-session -s mysess -n mywin "bash dummytask.sh cooking"

This works, but there’s a problem: once you exit the program by pressing q or Q, the tmux session also terminates. This gets more annoying when a program crashes and you don’t get any debug info. A better way to handle the task is shown in the following example.

Example: run a program in a tmux window and exit to the bash shell inside the window if the program exits or crashes

#!/bin/bash
sess=mysess
wn=mywin

# duplicate session or window handling code here
# ...
tmux send-keys -t $sess:$wn "./dummytask.sh cooking" Enter

Example: how to check if a tmux session is attached

Sometimes a command should not be run inside a tmux window. Use the following code to detect an attempt to run it there:

if [ "$TERM" = "screen" -a -n "$TMUX" ]; then
    echo "This command should be run when tmux is not attached"
fi
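
The test on $TERM above assumes the value is exactly "screen"; a setup that uses a value such as screen-256color inside tmux would not pass it. A sketch that relies only on the TMUX variable, which tmux sets for processes started inside a session, could look like this (the function name inside_tmux is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the current shell is running inside tmux.
inside_tmux() {
    # TMUX is set by tmux for processes it starts; it is empty or unset elsewhere.
    [ -n "${TMUX:-}" ]
}

if inside_tmux; then
    echo "This command should be run when tmux is not attached"
else
    echo "Not inside tmux, safe to continue"
fi
```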

Example: attach a tmux session with a specific window selected

#!/bin/bash
sess=mysess

tmux list-session 2>&1 | grep -q "^$sess:" || tmux new-session -s $sess -d


wn=win0
tmux list-window -t $sess 2>&1 | grep -q ": $wn \[" || tmux new-window -t $sess -n $wn
tmux send-keys -t $sess:$wn "./dummytask.sh task 0" Enter

wn=winX
tmux list-window -t $sess 2>&1 | grep -q ": $wn \[" || tmux new-window -t $sess -n $wn
tmux send-keys -t $sess:$wn "./dummytask.sh important mission" Enter

wn=win1
tmux list-window -t $sess 2>&1 | grep -q ": $wn \[" || tmux new-window -t $sess -n $wn
tmux send-keys -t $sess:$wn "./dummytask.sh another thing" Enter


# here's the meat of this script, select window winX before attaching the session
tmux select-window -t $sess:winX && tmux a -t $sess
Categories: Programming, tmux

Node.js example 2: parallel processing

October 15, 2011

[ UPDATE 2/29/2012 ]
I just discovered a much better way to perform the same parallel processing task, thanks to an answer by Linus G Thiel to one of my Stack Overflow questions.

        var async=require('async');
        // return a random delay in milliseconds, from 1000 up to (but not including) (max+1)*1000
        function gen_rnd(max) {
                return Math.floor( (Math.random()*max+1)*1000 );
        }

        function job(lbl, cb) {
                console.log('Job %s started', lbl);
                console.time(lbl+'-timer');
                setTimeout( function() {
                        console.timeEnd(lbl+'-timer');
                        cb(null, lbl.toUpperCase());
                }, gen_rnd(5) );
        }

        console.time('all jobs');
        async.parallel([
                function(cb) { job('p1',cb) },
                function(cb) { job('p2',cb) }
        ], function(err, results) {
                console.log('do something else upon completion of p1 and p2');
                console.log('results=%j', results);
                console.timeEnd('all jobs');
        });

[ Original version of this post ]
I just wrote another example using Node.js to demonstrate the benefit of non-blocking I/O. In this example I need to do a job that depends on two processes, p1() and p2(); the job won’t start until both p1 and p2 have finished. If I run p1 and p2 one after the other, the total time before the job can start is at least T(p1)+T(p2). If I start them in parallel, that time drops to Max(T(p1), T(p2)). To make the simulation closer to real life, the execution time of p1 and p2 is random, anywhere between 1 and 6 seconds, so if you run this example you will see that sometimes p1 finishes first and sometimes p2 does. The core of this example uses events.EventEmitter.

Update 11/6/2011: using console.time and console.timeEnd to keep track of time consumed

var util=require('util');
var events=require('events');
var max_sleep=5000; // simulated maximum execution time in milliseconds for each process
var cnt=0;

function p1(jobs) {
    console.time('process-1');
    var slp=Math.floor(Math.random()*max_sleep)+1000;
    setTimeout( function() {
        cnt++;
        console.timeEnd('process-1');
        if(cnt>=2) {
            var m='from p1, cnt='+cnt;
            console.log(m);
            jobs.write(m);
        }
    }, slp );
}

function p2(jobs) {
    console.time('process-2');
    var slp=Math.floor(Math.random()*max_sleep)+1000;
    setTimeout( function() {
        cnt++;
        console.timeEnd('process-2');
        if(cnt>=2) {
            var m='from p2, cnt='+cnt;
            console.log(m);
            jobs.write(m);
        }
    }, slp );
}

function Jobs() {
    events.EventEmitter.call(this);
}

util.inherits(Jobs, events.EventEmitter);

Jobs.prototype.write=function(data) {
    this.emit('ready', data);
}

var jobs=new Jobs();

jobs.on('ready', function(data) {
    console.log('Received data: '+data);
    console.log('Job done!');
    console.timeEnd('all-processes');
});

console.time('all-processes');
p1(jobs);
p2(jobs);

Output examples:

1-
process-1: 1724ms
process-2: 3232ms
from p2, cnt=2
Received data: from p2, cnt=2
Job done!
all-processes: 3241ms

2-
process-2: 1377ms
process-1: 4803ms
from p1, cnt=2
Received data: from p1, cnt=2
Job done!
all-processes: 4805ms

Learning Node.js + socket.io – a simple streaming example.

October 14, 2011

I learned about Node.js not too long ago (Node is actually very new) and found its non-blocking I/O very interesting. Socket.io is a Node module that enables real-time communication between a web client and a server. It basically chooses the best streaming technology the client supports when a native one (WebSocket, for example) is not available. I am a total beginner with both Node and socket.io, and I wrote an example a little more complicated than the basic one on the socket.io home page. In this example, a random number is generated every 4 seconds and broadcast to the network (see the video link at the end of this post).

Server:
Install socket.io if it’s not installed:
npm install socket.io -g

var io = require('/usr/local/lib/node_modules/socket.io').listen(8080);
var t;  // I usually don't like using global variables but hope it's ok for DEMO's purpose

function rnd() {
    var num=Math.floor(Math.random()*1000);
    return num;
}
io.sockets.on('connection', function (socket) {
    t=setInterval( function() {
        var n=rnd();
        socket.broadcast.emit('stream', {n:n.toString()});
    }, 4000);
    socket.on('action', function (data) {
        console.log('received action');
        if(data.todo=='stop') {
            socket.broadcast.emit('stream', {n:'Stopped'});
            console.log('stopping timer now.');
            clearInterval(t);
        } else if(data.todo=='run') {
            // the setInterval code definitely can
            // be combined/optimized with the one above
            // again for DEMO's sake I just leave it as is
            t=setInterval( function() {
                var n=rnd();
                socket.broadcast.emit('stream', {n:n.toString()});
            }, 4000);
        }
    });
});

Client:
Note:
The client socket.io.js code can be found at /usr/local/lib/node_modules/socket.io/node_modules/socket.io-client/dist on my Ubuntu 10.10

client.html, served using Apache (I know this could be served entirely by Node.js instead, with some help from the Express web framework)

<script src="socket.io.js"></script>
<script>
    var sw='run';
  var socket = io.connect('http://192.168.1.200:8080');

  socket.on('stream', function (data) {
        document.getElementById('number').innerHTML=data.n;
  });

    function stop_timer() {
        if(sw=='run') {
            socket.emit( 'action', {todo: 'stop'} );
            sw='stop';
        } else {
            socket.emit( 'action', {todo: 'run'} );
            sw='run';   // remember the new state so the next click stops the stream again
        }
    }
</script>
<div style="border:1px solid #ccc" id="number">&nbsp;</div>
<a href="#" onclick="stop_timer();return false;">Action</a>

Here’s a quick video made with ScreenFlow (didn’t purchase so pls forgive the watermark):