Purpose
To avoid licensing issues when running distributed MATLAB jobs across dozens of servers, you have to use the MATLAB Compiler. Your m-scripts are packaged as an executable that runs on every workstation.
There is no restriction on the toolboxes you use in your scripts or on m-script dependencies.
To make it easier to launch and monitor your job on the cloud, you will use the Gogolist tool. It takes your MATLAB executable as an argument and manages the scheduling of your parallelized job.
With Gogolist, you break up your loops and replace them with listing files. No more:
for year = 2000:2012
    for month = 1:12
        do_my_process(year, month);
    end
end
Instead, you write a simple text file listing in which each line holds the parameters for one execution:
2000 01
2000 02
...
2012 12
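Such a listing does not have to be typed by hand. As a minimal sketch, the shell loops below write one "year month" pair per line, mirroring the nested MATLAB loops. The file name params_listing.txt is only an illustrative choice.

```shell
# Write one "year month" line per execution.
# params_listing.txt is a hypothetical file name chosen for this example.
listing=params_listing.txt
: > "$listing"                      # create or truncate the listing
for year in $(seq 2000 2012); do
    for month in $(seq 1 12); do
        printf '%d %02d\n' "$year" "$month" >> "$listing"
    done
done
wc -l < "$listing"                  # 13 years x 12 months = 156 lines
```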
Have you initialized your account to use Gogolist? If not, follow the `Quick Start (on Cersat Infrastructure)`_.
Add this environment variable to your ~/.cshrc:
setenv MCR_CACHE_ROOT "/tmp/mcr_cache_$USER"
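The setenv line above is csh syntax. If your login shell is bash or another sh-compatible shell (an assumption, since the setup described here targets csh), a rough equivalent for a ~/.bashrc-style file could look like this:

```shell
# sh/bash equivalent of the csh setenv line (assumes an sh-compatible login shell).
# ${USER:-...} falls back to `id -un` if USER happens to be unset.
export MCR_CACHE_ROOT="/tmp/mcr_cache_${USER:-$(id -un)}"
# The runtime is expected to create this directory itself; creating it here
# only checks early that the path is writable.
mkdir -p "$MCR_CACHE_ROOT"
echo "$MCR_CACHE_ROOT"
```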
A simple compilation test, which does nothing really useful.
Create a hello_world.m file containing:
function hello_world()
    disp('Hello world !');
end % function
Compile it with the following command:
MATLAB_PATH/bin/mcc -R nojvm -R nodisplay -m hello_world.m
Run the generated MATLAB executable:
./run_hello_world.sh /home/adonnante/MATLAB/R2012a
...
Hello world !
Create a read_netcdf.m file containing:
function read_netcdf(year, month, day)
    % Arguments of a compiled executable arrive as strings: convert them to numbers.
    if ischar(year)
        year = str2double(year);
    end
    if ischar(month)
        month = str2double(month);
    end
    if ischar(day)
        day = str2double(day);
    end
    nc = cl_netcdf_writer;
    datasetRoot = '/home9/begmeil/cersat/ftp/products/gridded/MWF/L3/ASCAT/Daily/Netcdf';
    fileDir = fullfile(datasetRoot, sprintf('%.4d',year), sprintf('%.2d',month), sprintf('%.2d',day));
    % Look for the daily .nc.bz2 file starting at hour 00
    tmp = dir(fullfile(fileDir, sprintf('%.4d%.2d%.2d%.2d_*.nc.bz2',year,month,day,0)));
    lstFiles = {};
    if ~isempty(tmp)
        lstFiles{1} = fullfile(fileDir, tmp(1).name);  % keep the first match only
        lstFiles{1}                                    % no semicolon: display the file path
        cc = set_file_name(nc, lstFiles{1});
        s = read(cc);
        lon = get_var_value(nc, s, 'longitude', 1, 1, 1);
        min(lon)
        max(lon)
        disp('Read OK');
    else
        disp('No data');
    end
end % function
Compile it on a compute node with the following commands:
qsub -I -l "nodes=1:cloudphys"    # or, for example: ssh br156-100
cd /path/to/your/m-script/dir
MATLAB_PATH/bin/mcc -R nojvm -R nodisplay -a /home/biblios/logiciels/matlablib/Linux/mexnc -a /home/biblios/logiciels/matlablib/Linux/netcdf -a /home/biblios/logiciels/matlablib/Lib/Common/ -m read_netcdf.m
Run the generated MATLAB executable:
./run_read_netcdf.sh /home/adonnante/MATLAB/R2012a 2012 01 01
...
ans =
/home9/begmeil/cersat/ftp/products/gridded/MWF/L3/ASCAT/Daily/Netcdf/2012/01/01/2012010100_2012010200_daily-ifremer-L3-MWF-GLO-20120103113741-01.0.nc.bz2
ans =
-179.8750
ans =
179.8750
Read OK
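The arguments after the MATLAB path are passed to the compiled function as strings, which is why the function converts them before use. Before distributing the work, it can help to replay a few parameter lines locally. The sketch below is a dry run that only prints the commands it would execute. The RUNNER variable and the replay function are hypothetical names, and MATLAB_PATH is the same placeholder as in the compile command.

```shell
# Dry run: print the command that would be executed for each parameter line.
# Remove the leading "echo" in RUNNER to actually execute the commands.
RUNNER="echo ./run_read_netcdf.sh MATLAB_PATH"   # MATLAB_PATH is a placeholder
replay() {
    # Each input line "YYYY MM DD" is split into three separate arguments.
    while read -r year month day; do
        $RUNNER "$year" "$month" "$day"
    done
}
printf '2012 01 01\n2012 01 02\n' | replay
```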
Prepare your parameter listing. Here, we want to launch the process for 6 dates:
cat date_listing.txt
2012 01 01
2012 01 02
2012 01 03
2012 01 04
2012 01 05
2012 01 06
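Since every line of the listing becomes one execution, a malformed line only shows up later as a failed job. A small pre-flight check, sketched here with awk, can flag lines that are not of the form "YYYY MM DD". The function name check_listing is hypothetical, and the sample file exists only to make the example self-contained.

```shell
# Report lines that do not consist of exactly three numeric fields: YYYY MM DD.
check_listing() {
    awk '
        NF != 3 ||
        $1 !~ /^[0-9][0-9][0-9][0-9]$/ ||
        $2 !~ /^[0-9][0-9]$/ ||
        $3 !~ /^[0-9][0-9]$/ { printf "line %d is malformed: %s\n", NR, $0; bad = 1 }
        END { exit bad }
    ' "$1"
}
# Example on a throwaway file with one deliberately malformed line.
sample=$(mktemp)
printf '2012 01 01\n2012 1 02\n' > "$sample"
check_listing "$sample" || echo "fix the listing before submitting"
rm -f "$sample"
```

On your own listing, the call would be check_listing date_listing.txt, which returns 0 only when every line is well formed.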
Launch your distributed job:
ssh cerhouse1
cd /path/to/your/executable/dir
cat date_listing.txt | /home5/begmeil/tools/gogolist/bin/gogolist.py -w /home/cercache/users/USER/tmp/workspace --stdin -e './run_read_netcdf.sh MATLAB_PATH' --qsub-options='-l nodes=1:cloudphys,mem=2gb' --split-max-lines=1 --reporting
(output)
INFO:gogolist:Job workspace : /home/cercache/users/xxxxx/tmp/workspace/20120921/000004
INFO:gogolist:Job successfully registered in monitor. Go to : http://cercloudweb/jobsmonitor/job/31631/
INFO:gogolist:Batch Manager : torque Job id : 57778[].cerhouse1.ifremer.fr
INFO:gogolist:job name:run_read_netcdf id:57778[].cerhouse1.ifremer.fr (Q:0 / R:0 / C:0 / E:0 / H:0 / W:0 / X:0) )
INFO:gogolist:No running jobs. Remaining Jobs to process : 6
INFO:gogolist:job name:run_read_netcdf id:57778[].cerhouse1.ifremer.fr (Q:0 / R:6 / C:0 / E:0 / H:0 / W:0 / X:0) )
INFO:gogolist:Remaining Jobs to process (including currently running) : 6 [2012-09-21T09:40:05Z]
INFO:gogolist: Jobs launched: 6/6 (running: 6 terminated: 0)
INFO:gogolist: Exit OK = 0 | Exit ERROR = 0 | Lines submitted = 0/6 (0.00%)
INFO:gogolist:workspace: /home/cercache/users/xxxxx/tmp/workspace/20120921/000004
INFO:gogolist:Remaining Jobs to process (including currently running) : 0 [2012-09-21T09:40:35Z]
INFO:gogolist: Jobs launched: 6/6 (running: 0 terminated: 6)
INFO:gogolist: Exit OK = 6 | Exit ERROR = 0 | Lines submitted = 6/6 (100.00%)
INFO:gogolist: exec time : mean=0:00:28, sum=0:02:48
INFO:gogolist:Job workspace: /home/cercache/users/xxxxx/tmp/workspace/20120921/000004
INFO:gogolist:Job is TERMINATED
Check your log files:
cat /home/cercache/users/xxxxx/tmp/workspace/20120921/000004/logs/* | grep "Read OK"
Read OK
Read OK
Read OK
Read OK
Read OK
Read OK
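With many parameter lines, counting is easier than reading. The sketch below counts the log files that contain the success marker and lists those that lack it, using standard grep options. LOG_DIR is a placeholder to point at the logs directory of your job workspace; the demo files and their names (job1.log, job2.log) are made up so that the example runs anywhere.

```shell
# Build a throwaway logs directory for the demonstration;
# in practice, set LOG_DIR to the logs directory of your job workspace.
LOG_DIR=$(mktemp -d)
printf 'Read OK\n' > "$LOG_DIR/job1.log"
printf 'No data\n' > "$LOG_DIR/job2.log"

# -l lists files that contain the marker, -L lists files that do not.
ok_count=$(grep -l "Read OK" "$LOG_DIR"/* | wc -l | tr -d ' ')
failed=$(grep -L "Read OK" "$LOG_DIR"/*)
echo "successful: $ok_count"
echo "without marker: $failed"
```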
OK, now let’s run your own m-scripts! :-)