Archive

Archive for April, 2011

Useful “IN” in mysql

April 29, 2011 Leave a comment

When using IN in mysql statement, the number of columns after the keyword WHERE can be more than one, for example,

SELECT * from student_info WHERE (native_language, interest) in (
    ('English', 'singing'),
    ('Spanish', 'dancing'),
    ('Franch', 'movie')
)

The same technique can be used to compare two tables with the same primary (or unique index) keys that are formed by multiple columns (also known as the composite key). For example:

SELECT * from tbl_A WHERE ( student_id, last_name, ssn4 ) 
NOT IN ( SELECT student_id, last_name, ssn4 FROM tbl_B )

Note: Columns student_id, last_name, ssn4 form the primary (or unique index) key in both table tbl_A and tbl_B, the column names don't have to be the same in tbl_B though, as long as each corresponding column holds the same type of data as the other table.

Rows that don't have the same primary keys can be easily spotted by simply swapping the table names (and adjust the column names if different) in the above example, if necessary.

Categories: mysql, Programming, Tip

Yet another way to convert strings to numbers in Perl

April 27, 2011 2 comments

OK I know what you are gonna say when you see the title of this post — it’s not necessary to convert strings to numbers, strings are numbers if they look like numbers and Perl will treat them like numbers when you want it to. But there are some situations where data cleaning is required, for example, importing data from a csv file into database and you want to make sure numeric columns contains only numbers, strings such as ‘a345′, ‘ ‘, will be converted to 0 before they are being inserted into the database. For the past I’ve been using the following home-cooked method to do the conversion

sub tonum {
	return $_[0] =~ m/^\s*([+-]?\d+(\.\d+)?)\s*$/?$1:0;
}

This functions converts strings like ' +123.00', '78 ', '-333.456', '+abc' into '+123.00', '78', '-333.456', and '0' respectively. However I found yet another way to perform the same task and it's much faster:

sub tonum2 {
	no warnings;
	return $_[0]+0;
}

How much faster can this method beat the previous one? I wrote a couple of testing scripts:

cat a.pl

#!/usr/bin/perl
use warnings;
use strict;

sub tonum {
	return $_[0] =~ m/^\s*([+-]?\d+(\.\d+)?)\s*$/?$1:0;
}
my $v;
for my $j (0..1_000_000) {
	for my $i (' 23.45', '-7  ', 'a34.56') {
		$v= tonum($i);
	}
}

---------------------------------
time perl a.pl

real 0m31.250s
user 0m31.202s
sys 0m0.012s

cat b.pl

#!/usr/bin/perl
use warnings;
use strict;

sub tonum2 {
	no warnings;
	return $_[0]+0;
}
my $v;
for my $j (0..1_000_000) {
	for my $i (' 23.45', '-7  ', 'a34.56') {
		$v= tonum2($i);
	}
}

---------------------------------
time perl b.pl

real 0m8.633s
user 0m8.617s
sys 0m0.008s

We now have our clear winner. The tricky part is tonum2 takes advantage of "no warnings" statement which suppresses the "Argument blahblah isn't numeric" warnings. It's kind of like the @ operator in PHP.

Categories: perl, Programming, Tip

Interesting results from benchmarking trim “function” in Perl

April 27, 2011 Leave a comment

I’ve been wondering if writing a customer trim function in perl will come with some performance penalty. To satisfy my itchiness, I ran the following three scripts one at a time and use handy command time to get the running times.

cat a.pl

#!/usr/bin/perl
#!/usr/bin/perl
use warnings;
use strict;

sub trim {
	my $str=shift;
	$str =~ s/^\s+|\s+$//g;
	return $str;
}

my $test_str= "  this is a test   ";

my $result;
for my $i (0..1_000_000) {
	my $copy=$test_str;
	$result=trim($copy);
}

------------------------------
time perl a.pl

real 0m18.451s
user 0m18.417s
sys 0m0.012s

cat b.pl


#!/usr/bin/perl
use warnings;
use strict;

my $test_str= "  this is a test   ";

my ($result,$copy);
for my $i (0..1_000_000) {
	$copy=$test_str;
	$copy =~ s/^\s+|\s+$//g;
	$result=$copy;
}

------------------------------
time perl b.pl

real 0m13.330s
user 0m13.313s
sys 0m0.000s

By using the regex directly saves the running time by a few seconds. And I continue to break the regex into two as follows:

cat c.pl

#!/usr/bin/perl
use warnings;
use strict;

my $test_str= "  this is a test   ";

my ($result, $copy);
for my $i (0..1_000_000) {
	$copy=$test_str;
	$copy =~ s/^\s+//g;
	$copy =~ s/\s+$//g;
	$result=$copy;
}

------------------------------
time perl c.pl

real 0m6.534s
user 0m6.524s
sys 0m0.004s

I thought the results from b.pl and c.pl should be pretty close but I was so wrong - it takes only a half of the time to run c.pl when compared to b.pl.

What I learned from the above bench mark tests:
1) Avoid trim "functions" as much as possible
2) Use the double regex method as shown in c.pl instead of one regex
3) Running time can be cut further if it's known beforehand whether there are trailing spaces (use s/\s+$//g) or leading spaces (use s/^\s+//g) only

Categories: perl, Programming

Go to parent directory in Mac OSX Finder

April 21, 2011 Leave a comment

Shortcut key: Command + UP Arrow

Found it here, 2nd comment.

Categories: mac, Tip

Running virtualbox Linux guests in headless mode

April 21, 2011 2 comments

Sometimes it’s more convenient to start virtualbox Linux guests in headless mode and access them through ssh. This guide assumes the vm to be running in headless mode
1) has ssh server installed
2) uses static ip address or dhcp but ip address is known and doesn’t change

To run vb guest in headless mode, simply run the following command

VBoxHeadless -startvm vm_name&

To ensure vm to be accessible, its network setting under VirtualBox Manager needs to be set to Bridge mode and use static ip address (or dhcp if the ip doesn't change upon reboots) for the vm. This needs to be done before the above command is run.

Wait for some moments after the headless command is issued, start pinging the vm to check if the vm is up by its ip (or use telnet vm_ip 22 if PING is blocked on the vm). SSH to it should be a piece of cake now.

If you are like me who doesn't even want to type the command ssh (Yup I wrote a one-letter bash function just to wrap the ssh call to some machines that I use on a daily basis) , you might want to turn the VBoxHeadless command into a function in bash:
1) vim ~/.bash_profile
2) append the following
function headless {
VBoxHeadless -startvm $1&
}

headless() {
        [ $# -lt 1 ] && echo "Usage: $FUNCNAME vm_name" && return
        VBoxHeadless -startvm $1&
}

3) . ~/.bash_profile
4) headless vm_name
Notes: Step 3 just needs to be run once right after the function is added to your bash profile. Before step 4 is run please make sure the vm is not already running.

Categories: linux, Tip, virtualbox

Yet another awesome iPad app

April 20, 2011 Leave a comment

Just installed it and loving it.

[ yet this app failed to impress my love ss. ]

Categories: Uncategorized

Perl DBI bind_columns vs. fetchrow_hashref bench marking

April 17, 2011 Leave a comment

Have been wondering the real difference between bind_columns and fetchrow_hashref methods while fetching data using perl DBI and here’s the findings I’ve come up with:

[04/17/2011 10:35:53] Using bind_columns method starts …
[04/17/2011 10:36:39] Using bind_columns method ends …
[04/17/2011 10:36:39] Using fetchrow_hashref method starts …
[04/17/2011 10:38:36] Using fetchrow_hashref method ends …

The clear winner is bind_columns (47 seconds, which is much less than the time taken for fetchrow_hashref method, 1 minute and 57 seconds) The table being tested has 150,000 records and it’s structure is as follows:

describe tbl_userinfo;
+----------+------------+------+-----+---------+----------------+
| Field    | Type       | Null | Key | Default | Extra          |
+----------+------------+------+-----+---------+----------------+
| id       | int(11)    | NO   | PRI | NULL    | auto_increment |
| password | char(32)   | NO   |     | NULL    |                |
| active   | tinyint(4) | NO   |     | 1       |                |
+----------+------------+------+-----+---------+----------------+

The test is done by looping each method 10 times through the script below:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Time::Local;
use POSIX qw/strftime/;

sub d {
	my $msg=shift;
	print strftime("[%m/%d/%Y %H:%M:%S] ", localtime(time)).$msg."\n";
}

sub get_records_bind {
	my ($sth, $ref_records)=@_;
	$sth->execute();
	my($id, $password, $active);
	$sth->bind_columns(undef, \$id, \$password, \$active);
	while($sth->fetch()) {
		$ref_records->{$id}={
			password=>$password,
			active=>$active,
		};
	}
}

sub get_records_hash {
	my ($sth, $ref_records)=@_;
	$sth->execute();
	while(my $ref=$sth->fetchrow_hashref()) {
		$ref_records->{$ref->{id}}={
			password=>$ref->{password},
			active=>$ref->{active}
		};
	}
}	

my $dbh=DBI->connect( "DBI:mysql:database=testdb:host=localhost", 'username', 'password') or die "Can't connect $DBI::errstr";

my $sth=$dbh->prepare("select id,password,active from tbl_userinfo");

d("Using bind_columns method starts ...");
my $r={};
for my $i (1..10) {
	%$r=();
	get_records_bind($sth, $r)
}
d("Using bind_columns method ends ...");

d("Using fetchrow_hashref method starts ...");
for my $i (1..10) {
	%$r=();
	get_records_hash($sth, $r)
}
d("Using fetchrow_hashref method ends ...");

$dbh->disconnect or warn "Disconnection failed" if $dbh;

Therefore I think it's better to use the bind_columns method whenever possible, especially when the dataset is huge.

Categories: perl, Programming

My take on rounded corners in web development

April 10, 2011 1 comment

I am not a big fan of rounded corners, when it comes to web development. Yes I know a few ways of making rounded corners ( using 4 corner images and nest in a way that makes the whole div look like a rounded corner box; using a javascript way that doesn’t require images; using a whole image that suites situations when scaling is not required ) but none of the approaches provides a neat and sounding solution and my take  on rounded corners is very simple — avoid them if you don’t have to use them. [ I will be using rounded corners again when css3 are supported in all major web browsers as making rounded corners only requires one line of css code. ]

I have to agree that rounded corners make sites look smoother but let’s not forget there are sites that don’t use rounded corners at all and I haven’t heard anyone saying their sites suck because of the lack of rounded corners. Take a look at the following sites:

http://www.google.com

http://www.facebook.com

http://www.dropbox.com

http://www.linode.com

In my opinion, making rounded corners to appear the same on all browsers ( yes compatibility issue again ) takes  away too much precious time and effort which can be used to improve functionality of a site, and it also introduces extra complexity and unnecessary div, css, js code.

 

Ref: http://www.facebook.com/note.php?note_id=143248897792

Capture tmux output – the much less painful way

April 7, 2011 4 comments

Before I found out this, I had to enter scrolling mode, using mouse to select/copy content to system clipboard. Those days have gone as the following will definitely speed things up a lot:

1) tmux a
2) ^b [
3) ^ space
4) move cursor with arrow keys and/or pageup/pagedown keys
5) ^ w
6) ^b ]
7) ^b:
8) save-buffer ~/out_file

Explanations:
2) enters copy mode
3) starts the selection
5) copies to buffer (0)
6) ends copy mode
7) enters tmux command line mode
8) use this command to save buffer (0) to file ~/out_file, specify with option [ -b buffer-index ] if needed, tmux command list-buffer will show the buffer list

Have fun!

Sources:
http://cheasy.de/tmux.pdf
http://www.openbsd.org/faq/faq7.html [ scroll to page buttom, section "Advanced tmux usage" ]

Categories: howto
Follow

Get every new post delivered to your Inbox.