Wednesday, January 5, 2011

Sorting techniques in Perl

Check out this SlideShare Presentation:

Thursday, September 2, 2010

Validating numbers in a perl script

This morning one of my perlish friends asked me for help. He had a regular expression for validating strings that contained numbers. The regex was failing and he wanted to know how to fix it. The strings he had were "12.3%", "1.25%" and similar. The numbers were disk space usage percentages, ranging from 0% to 100%, similar to what we see in the output of the df command.
He wanted to extract the numeric part so that it could be used in numeric calculations.

His regex was not taking into account that these numbers could have "optional" parts. They could be "100%", "12%", "0.8%", "0%" and so on. Here is the regex I gave him.

[root@sonash5 tmp]# perl -e '$str = "1.23%"; if ($str =~ m/^(\d{1,3}(?:\.\d+)?)\s*%$/) { print "matches, and the match is $1\n" }'
matches, and the match is 1.23
[root@sonash5 tmp]#

Are you wondering why we don't just remove the % character from the end of the string and use the rest as a number?

That's a bad idea. What if some machine gives you "75 %"? Or even "75", assuming that the values are percentages? The string could even be " 34.5%", and stray characters like these can trigger warnings or wrong results in numeric calculations. When the strings come from a source that you don't control, or may vary with the operating system, the hardware and other things, it is a good idea to validate the string against the numeric pattern that you are looking for. If the pattern is found, extract it and use it further.
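
If you would rather accept harmless variations such as "75 %", " 34.5%" or a bare "75" than reject them, the validation can be wrapped in a small function. This is only a sketch: the name extract_percent is my own, and treating a missing % sign as acceptable and rejecting anything above 100 are assumptions that may not suit your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the numeric part of a percentage string,
# or undef when the string does not look like a percentage from 0 to 100.
sub extract_percent {
    my ($str) = @_;
    return undef unless defined $str;

    # Optional leading whitespace, 1-3 digits, optional fraction,
    # optional whitespace, optional % sign, optional trailing whitespace.
    if ($str =~ m/^\s*(\d{1,3}(?:\.\d+)?)\s*%?\s*$/) {
        my $value = $1;
        return $value if $value <= 100;   # assumption: values above 100 are invalid
    }
    return undef;
}

for my $input ("12.3%", "75 %", "75", " 34.5%", "abc%", "250%") {
    my $value = extract_percent($input);
    printf "%-8s => %s\n", "'$input'", defined $value ? $value : "rejected";
}
```

The function returns undef for anything it does not recognise, so the caller can decide whether to skip the value or raise an error instead of silently computing with garbage.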

Tuesday, May 25, 2010

How do I I find doubled words in a file

Noticed that mistake in the title? Or missed it? Here's how to catch those in your text files.

perl -nl -e 'print if m/\b(\w+)\s+\1\b/' filename

Instead of the whole line, you just want to see the line number and the word that is doubled?

perl -nl -e 'print "$. : $1" if m/\b(\w+)\s+\1\b/' filename

And you want to correct these?

perl -pi -e 's/\b(\w+)\s+\1\b/$1/g' filename
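
The one-liners work line by line, so they miss a word that is repeated across a line break. Here is a rough sketch of a script version that looks at the whole text at once and ignores case. The function name doubled_words and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words that appear twice in a row in the
# given text, ignoring case and allowing a line break between the two copies.
sub doubled_words {
    my ($text) = @_;
    my @found;
    # The trailing \b stops "the theory" from being reported as a doubled "the".
    while ($text =~ m/\b(\w+)\s+\1\b/gi) {
        push @found, $1;
    }
    return @found;
}

my $sample = "How do I I find the the theory\nthat\nthat works";
print join(", ", doubled_words($sample)), "\n";
```

Because \s+ also matches a newline, "that" at the end of one line followed by "that" at the start of the next is caught as well.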

Friday, May 21, 2010

How do I get the difference between two dates that are in yyyy-mm-dd format

A few days ago one of my friends asked me how to get the difference, in number of days, between two dates stored in yyyy-mm-dd format. I was in a hurry and told him how to go about it, but couldn't show him the exact steps.

Yesterday I posed this question to the interns in our company, thinking they would benefit from solving it. I told them to read the date values from shell environment variables, and to use the programming language or tools of their choice to prepare the solution.

One guy wrote neat C code to get the answer. He had to maintain a data structure storing the number of days in each month of the year, and also a function to check whether a given year is a leap year. As a result the code grew to about 100 lines.

I asked them if they could think of some other approach, and if they were aware of how time is maintained in UNIX systems.

A Perl script that uses the Time::Local module makes the task easier to accomplish.

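#!/usr/bin/perl
use strict;
use warnings;
use Time::Local;

# Dates are read from the environment in yyyy-mm-dd format
my $date1 = $ENV{'date1'};
my $date2 = $ENV{'date2'};

my @date1_breakup = split(/-/, $date1);
my @date2_breakup = split(/-/, $date2);

# timegm() expects the month in the range 0..11, hence the "- 1".
# Working in UTC keeps every day exactly 86400 seconds long, so a
# daylight saving change cannot produce a fractional number of days.
my $date1_unix = timegm(0, 0, 0, $date1_breakup[2], $date1_breakup[1] - 1, $date1_breakup[0]);
my $date2_unix = timegm(0, 0, 0, $date2_breakup[2], $date2_breakup[1] - 1, $date2_breakup[0]);

my $diffSeconds = $date2_unix - $date1_unix;
my $diffDays = $diffSeconds / (60 * 60 * 24);
print "difference in days : $diffDays\n";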
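
For comparison, here is a shorter sketch of the same calculation using the Time::Piece module, which ships with Perl 5.10 and later. The function name days_between is my own, and the dates are hard-coded instead of being read from the environment to keep the example self-contained.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: number of days from $from to $to (both yyyy-mm-dd).
sub days_between {
    my ($from, $to) = @_;
    # strptime() parses the strings as UTC and dies on input it cannot parse
    my $t1 = Time::Piece->strptime($from, "%Y-%m-%d");
    my $t2 = Time::Piece->strptime($to,   "%Y-%m-%d");
    # Subtracting two Time::Piece objects gives a Time::Seconds object
    return ($t2 - $t1)->days;
}

print days_between("2010-04-22", "2010-05-23"), "\n";
```

Letting strptime() do the parsing avoids splitting the string by hand and the month-numbering detail that comes with Time::Local.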



A plain Bourne shell script can also do the job for us, provided the date command supports the -d option (GNU date does).

#!/bin/sh
# date1 and date2 come from the environment in yyyy-mm-dd format
date1_unix=`date -d "$date1" +%s`
date2_unix=`date -d "$date2" +%s`
diff=`expr $date2_unix - $date1_unix`
diff_days=`expr $diff / 86400` # There are 86400 seconds in a day
echo $diff_days


yogeshs@yogesh-laptop:~/temp$ export date1=2010-04-22
yogeshs@yogesh-laptop:~/temp$ export date2=2010-05-23
yogeshs@yogesh-laptop:~/temp$ ./date_diff.sh
31
yogeshs@yogesh-laptop:~/temp$


Here's what happened inside the shell script.
yogeshs@yogesh-laptop:~/temp$ sh -x date_diff.sh
+ date -d 2010-04-22 +%s
+ date1_unix=1271874600
+ date -d 2010-05-23 +%s
+ date2_unix=1274553000
+ expr 1274553000 - 1271874600
+ diff=2678400
+ expr 2678400 / 86400
+ diff_days=31
+ echo 31
31
yogeshs@yogesh-laptop:~/temp$



You still want to see how this is done using C? Do let me know and I'll share that piece of code with you.

Wednesday, March 25, 2009

Zenity

It's been ten years since I was introduced to UNIX, and there are still so many things to learn. Today while browsing my favorite UNIX forums, I saw a question: help - something doesn't work with zenity.

I had never heard of zenity, and the question made me curious to know what it was about. After following the rule "do a web search before you ask", I discovered that zenity was already present on my system! And I didn't know that. Poor me.

I never knew that we could pop up a dialog box from a shell script. Here's how.

zenity --info --title "Hi" --window-icon=none --text 'Howdy world!'

A look at the man page, and I had the answer to that forum question too.

Friday, December 26, 2008

Problems galore, Solutions aplenty

A few years ago, one of my friends asked me for advice. He was frustrated, mostly because of his job. I wanted to help, but he was on the other side of the globe! I couldn't give him a ready solution. Instead, I sent him my recipe for solving a problem. Here it is.

1. Identify the problem
This is the first step. Unless we identify the exact problem, with all its details, we won't be able to prepare a solution for it. If you can write a problem statement of no more than a few lines, that goes a long way toward making things clear.

2. Make a list of all possible solutions
Once you know what the problem is, what is bothering you, or what you are not satisfied with, think of all the possible ways in which the situation could be improved. Whether a solution looks simple or complex, far-fetched or lengthy, add it to your list, as long as it is practical.

3. Choose the best solution from the list
When you know all the possible ways, choose the best one. Which one is best depends on you and your situation, so choosing requires comparing the options and judging them against your own criteria.

4. Implement the best solution that we now know
Go ahead, it's time for action.

When I have a problem, I tend to think in this way. And it works for me. And who doesn't have problems in this world? Sometimes I think that we are here to solve problems. We are all problem-solvers.

Sunday, December 21, 2008

screen 'em

Some of the tasks that I do involve running commands that take very long to finish: some take a few hours, others take a couple of days. Obviously, you can't keep staring at the terminal till the task is done. And if you close the terminal, the command dies. So how do you run these commands? Simply running them in the background won't help, as they would be terminated when you log out.

One idea is to run them in the background using nohup.

$ nohup my_cmd &
$ exit

This starts the command in the background in such a way that it won't be stopped even if you log out. But you won't see the output at your terminal; nohup redirects it to a file named nohup.out.

Once I start the command using nohup, I can no longer interact with it, and I can't see the output at the terminal. Sure, I can still watch the output if I really want to: $ tail -f nohup.out

A better alternative in these scenarios is to use the GNU Screen utility. It offers many useful features.

* I can start the command and leave it alone. Later, I can see how it is progressing whenever I want to.

* I can start the command from one computer, and then see how it is progressing from a different computer. So I can start the task at office, then go home, and when I find time, check how it is going.

* More than one user can share a screen session. So if my colleague wants to check, he can also connect to the same session and see the progress.

* Multiple terminal sessions can be created in a screen session. So I can keep running the command in one of them, and do something else in the other.

* Screen is very helpful when the network connection is unreliable. The task keeps running even if the connection breaks.

Enough. Tell me how to start.

Check if a screen session is already running
$ screen -ls

Attach to a session that is not detached (multi-display mode)
$ screen -x

Start a new screen session
$ screen

Create a new terminal session
CTRL+a c

Toggle between two terminal sessions
CTRL+a CTRL+a

Go to the nth terminal session (n is a digit from 0 to 9)
CTRL+a n

Send the command character CTRL+a to a window
CTRL+a a

Detach from a screen
CTRL+a d

Reattach to a detached session
$ screen -r

Terminate the screen (exit every terminal session in it)
$ exit

See man screen for more information.