Friday, October 27, 2006

Matlab: Working with text data files using Perl (1)

Matlab is a great tool for mathematics-related work such as image processing (assuming that you have the appropriate toolboxes, like the Image Processing Toolbox). It is a very convenient high-level programming language in which you can do, I think, almost everything, and many things that are almost impossible in other languages (like C, Java, or Perl) are extremely easy here.

Nevertheless, there are a few things that are easy and natural in those other languages but become quite difficult in Matlab. What really annoys me, and not only me, are: (i) the lack of hash arrays, (ii) working with text files, and (iii) the lack of multithreading.

Factors (i) and (ii) are closely related. Everyone who has ever used Perl, Ruby or Python to process and mine text files knows that hashes are essential. Of course, they are useful for many other things; however, in my own experience, the vast majority of my file operations relied on hash arrays. Admittedly, you can easily work with Matlab without hash tables, and I did so for a long time. But when I recently needed to process text files in Matlab (writing, reading, creating, joining data from a few files, extracting and searching for string and numeric data), the lack of hash tables and Matlab's poor file handling became very frustrating.

Fortunately there is Perl, which is excellent at dealing with text files. Most importantly, Matlab has built-in support for running Perl scripts from within Matlab programs through the perl function:
Perl is included with MATLAB on Windows systems, and thus MATLAB users can run M-files containing the Perl function. On Unix systems, MATLAB just calls the Perl interpreter that's available with the OS - from Matlab help.
This function can save a lot of work and make working with files in Matlab very fast and convenient; in fact, as convenient as Perl is. On the same Matlab help page we read:
It is sometimes beneficial to use Perl scripts instead of MATLAB code...
Indeed, it is very beneficial, especially when working with text files. In the rest of this post I am going to show one example of how Perl can be used within Matlab while working with text files.

1. Data File

My test text file is called rectangles.txt. It contains file names (first column) and descriptions of two rectangles and one point. Each rectangle is characterized by four values: [x y width height].
The first rectangle is described in columns 2-5, the second in columns 6-9 (these are the columns the Perl script in section 5 reads). The last two columns (11, 12) contain
the position [x y] of a point.
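
To make the layout concrete, here is a minimal sketch of how one tab-separated line could be split into these pieces. The line itself is an assumption: it is assembled from the example values shown in the next section, and the real rectangles.txt may contain additional columns. The slices use the same indices as the Perl script in section 5, and the point is taken from the last two fields either way.

```perl
use strict;
use warnings;

# Hypothetical line, assembled from the example values in this post;
# the real rectangles.txt may hold extra columns.
my $line = join("\t",
    '001anv.tiff',
    '113', '302.5', '49.035541', '58.514286',      # rectangle 1: x y width height
    '274', '292.5', '90.87837491', '89.96484776',  # rectangle 2: x y width height
    '5.220703', '4.375'                            # point: x y
);

# First field is the file name, the rest are the numeric fields.
my ($name, @rec) = split /\t/, $line;

my @rect_1 = @rec[0 .. 3];   # fields right after the file name
my @rect_2 = @rec[4 .. 7];   # the next four fields
my @point  = @rec[-2, -1];   # always the last two fields

print "$name\n";
print "rect_1: @rect_1\n";
print "rect_2: @rect_2\n";
print "point:  @point\n";
```

Taking the point with negative indices means the slice still works if more columns sit between the second rectangle and the point.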

2. My task

A little Matlab function, getRects('filename'), which takes a file name, searches for it in the data file, and returns three vectors (rect_1=[x y width height]; rect_2=[x y width height]; point=[x y]) based on the information in the file, for instance:

[rect_1 rect_2 point] = getRects('001anv.tiff');
%GIVES:
rect_1 = [113 302.5 49.035541 58.514286];
rect_2 = [274 292.5 90.87837491 89.96484776];
point = [5.220703 4.375];

3. Method

To do my task, I will create a simple Perl script that takes the file name from Matlab, searches for the data, and returns the required vectors to Matlab.

4. Matlab script

Let's start by creating our Matlab script.

function [rect_1 rect_2 point]= getRects(filename)

%execute Perl script
a=perl('getRectPos.pl',filename,'rectangles.txt');
eval(a); %evaluate string returned from Perl

As can be seen, the script is very simple. It just executes the Perl program, passing two arguments to it: (i) filename, the file name to search for; (ii) 'rectangles.txt', the name of the file to search in.

The interesting part is the eval(a) operation. As described in my former post, the perl function returns a string containing the script's standard output. Hence, in order to get Matlab data structures and variables from Perl, the script has to return them as a string of Matlab code, which is then executed using the eval function.
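
The script in the next section appends values one at a time with (end+1). As a possible alternative, a whole vector can be printed as a single Matlab literal. This is only a sketch: matlab_vector is a hypothetical helper name, not something from the original script, and the values are the example numbers from section 2.

```perl
use strict;
use warnings;

# matlab_vector is a hypothetical helper, not part of the original script.
# It turns a variable name and a list of values into one Matlab assignment,
# e.g. name=[v1 v2 v3];
sub matlab_vector {
    my ($name, @values) = @_;
    return $name . '=[' . join(' ', @values) . '];';
}

# Build the string that Matlab would receive and pass to eval.
my $out = matlab_vector('rect_1', '113', '302.5', '49.035541', '58.514286')
        . matlab_vector('point',  '5.220703', '4.375');

print $out, "\n";
```

An empty list yields name=[]; so a file name that is not found still produces valid Matlab code.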

5. Perl script

Finally, we come to the Perl script itself.

#File: getRectPos.pl

#Read arguments from Matlab
my $fileName = $ARGV[0];
my $fileWithRect = $ARGV[1];


#create piece of Matlab code - declaration
#of variables.
my $out="rect_1=[]; rect_2=[]; point=[];";

my %goldHash=&readFileToHash($fileWithRect);

if ( exists $goldHash{$fileName}) {

my @r1 = @{$goldHash{$fileName}}[0..3]; #(x,y,w,h)
my @r2 = @{$goldHash{$fileName}}[4..7]; #(x,y,w,h)
my @p = @{$goldHash{$fileName}}[-2..-1]; #(x, y)

#create the rest of the Matlab code to be executed by eval
foreach my $v (@r1) { $out .= "rect_1(end+1)= $v; "; }
foreach my $v (@r2) { $out .= "rect_2(end+1)= $v; "; }
foreach my $v (@p) { $out .= "point(end+1)= $v; "; }

}

#print string to Matlab
print $out;


sub readFileToHash {
#read file values to a hash table
#key value in this hash is file name (first column)
#data is all the rest columns: 2-12
my $fname = shift;

my %temp=();

open(FG,"<$fname") || die "Cannot find file ",$fname;
while (<FG>) {
chomp;
next if /^\s*$/; #skip empty lines
my ($k,@rec) = split('\t',$_);
$temp{$k}=[@rec];
}
close FG;
return %temp;
}
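
Since everything the script prints ends up in eval, it may be worth checking that each field really looks like a number before appending it to $out. A minimal sketch, assuming plain decimal notation in the data file; is_plain_number is a hypothetical helper and not part of the script above, and the sample fields are made up for illustration:

```perl
use strict;
use warnings;

# is_plain_number is a hypothetical helper: it accepts only decimal
# numbers such as 113, 302.5, -4.375 or 1e-3, and nothing else.
sub is_plain_number {
    my ($field) = @_;
    return 0 unless defined $field;
    return ($field =~ /\A[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?\z/) ? 1 : 0;
}

# Made-up sample fields: three numbers and one string that would
# smuggle extra Matlab code into the eval call.
my @fields = ('113', '302.5', '1e-3', '4.375; disp(1)');
for my $f (@fields) {
    printf "%-16s %s\n", $f, is_plain_number($f) ? 'ok' : 'rejected';
}
```

In the script, such a check could guard each of the three foreach loops, so that a malformed field is skipped instead of being executed as Matlab code.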

6. Results

After executing the Perl script, Matlab stores the returned string in variable a. Example content of that string is:

a=
rect_1=[];rect_2=[];point=[];rect_1(end+1)= 104.000000; rect_1(end+1)= 303.000000; rect_1(end+1)= 49.035541; rect_1(end+1)= 58.514286; rect_2(end+1)= 282.964459; rect_2(end+1)= 294.000000; rect_2(end+1)= 49.035541; rect_2(end+1)= 58.514286; point(end+1)= 5.220703; point(end+1)= 4.375000;

After executing eval(a), all required vectors (rect_1, rect_2, point) are created in the Matlab environment.

7. Conclusion

Thanks to Perl and the possibility of using it within Matlab, many tasks concerning text files can be solved more easily and more conveniently.