The hardware in a user's virtual machine may be single processor workstations, vector machines or parallel supercomputers or any combination of those. The individual elements may all be of a single type (homogeneous) or all different (heterogeneous) or any mixture, as long as all machines used are connected through one or more networks. These networks may be as small as a LAN connecting machines in the same room, as large as the Internet connecting machines across the world or any combination. This ability to bring together diverse resources under a central control allows the PVM user to divide a problem into subtasks and assign each one to be executed on the processor architecture that is best suited for that subtask.
PVM is based on the message-passing model of parallel programming. Messages are passed between tasks over the connecting networks.
User tasks can initiate and terminate other tasks, send and receive data, and synchronize with one another using a library of message-passing routines. Tasks are dynamic (i.e., they can be started or killed during the execution of a program), and even the configuration of the virtual machine (i.e., the set of actual machines that are part of your PVM) can be changed dynamically.
This section describes how to install PVM on machines that you wish to run PVM applications on. If PVM has been previously installed on the machines, you may skip the first part of this section. Check with your system administrator to see if and where PVM has been installed.
The latest version of PVM is always available from Netlib at the URL http://www.netlib.org/pvm3/ in a file named pvm3.x.x.tar.gz. There are pointers to more PVM information in this directory. As of May 2000, the most recent version of PVM was pvm3.4.
The following set of commands download, unpack, build and install a release of PVM. Start in the directory just above PVM_ROOT, that is, just above where you wish PVM to be installed, for example $HOME or /usr/local.
% ftp ftp.netlib.org
Name: anonymous
Password: you@host.domain   (put your email address here)
ftp> bin
ftp> cd pvm3
ftp> ls pvm3*.tar.z.uu
ftp> get pvm3.4........
ftp> quit
The environment variable PVM_ROOT tells PVM where the system is installed so that PVM can find various files. It must be set in order for PVM to run. PVM also needs to know what architecture it is running on (e.g., SUN4), so that it can run the appropriate executables. To do this it checks the environment variable PVM_ARCH. A script provided with PVM figures out the correct value for PVM_ARCH.
The following examples assume that PVM is installed in $HOME/pvm3. If it is installed in another location, for example /usr/local/pvm3, please make the appropriate changes.
On each machine, if you normally use bash, sh or ksh, add the following export lines to the end of your .bashrc, setting PVM_ROOT to the directory where PVM is actually installed:
export PVM_ROOT=/usr/lib/pvm3
export PVM_DPATH=$PVM_ROOT/lib/pvmd
export MANPATH=$MANPATH:$PVM_ROOT/man
export PATH=$PATH:$PVM_ROOT/lib:$PVM_ROOT/include
To use PVM, rsh must be possible between the different nodes of the parallel virtual machine. To enable this, create a file .rhosts in your $HOME directory and write in it:
lpc1.itec.uni-klu.ac.at harald
lpc2.itec.uni-klu.ac.at harald
if you are user harald and the nodes of the parallel machine are lpc1, lpc2 and so on.
The PVM distribution comes with several example programs, which are an excellent way of learning PVM. You can find these examples in $PVM_ROOT/examples. Use these examples and their associated Makefiles as templates for your own programs.
PVM provides a very flexible environment for message passing. It supports MIMD (Multiple Instruction stream over Multiple Data stream) style parallel computation, though most programs are written in the SPMD (Single Program over Multiple Data stream) style.
NOTE: Almost all PVM calls in C are functions whose names start with pvm_ (e.g., mytid = pvm_mytid()). They return a value, usually indicating the success of the call or carrying other useful information.
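C Code myprog.c
#include "pvm3.h"
#include <stdio.h>
#include <stdlib.h>
#define NTASKS 5
int main(void)
{
    int mytid, info;
    /* enroll in PVM */
    mytid = pvm_mytid();
    /* Possibly do some work here ... */
    printf("Hello from task %d\n", mytid);
    /* exit from PVM */
    info = pvm_exit();
    /* a negative info value signals that pvm_exit() failed */
    exit(info < 0 ? EXIT_FAILURE : EXIT_SUCCESS);
}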
Compiling in C: Compile the code using the C compiler, telling it the location of the include files and linking with the PVM libraries:
% cc -o myprog myprog.c -I$PVM_ROOT/include -L$PVM_ROOT/lib/LINUX -lpvm3 -lnsl
Running the executable: Start the parallel virtual machine with the pvm command, quit the console, and run one copy of the executable. Then re-enter the PVM console and use it to run multiple (3) copies of the executable. Finally, halt the virtual machine so that all PVM daemon processes are stopped.
%(asteroid) pvm
pvm> conf
1 host, 1 data format
                    HOST     DTID     ARCH   SPEED
                asteroid    40000    LINUX    1000
pvm> quit
% myprog
Hello from task 262146
% pvm
pvm> spawn -3 -> myprog
[1]
3 successful
t40004
t40005
t40006
[1:t40005] Hello from task 262149
[1:t40005] EOF
[1:t40006] Hello from task 262150
[1:t40006] EOF
[1:t40004] Hello from task 262148
[1:t40004] EOF
[1] finished
pvm> halt
You have now seen how a simple PVM program (one without any specific message passing) can be written and compiled. You have also seen how the program is run from the command line and from the PVM console. You have even seen how multiple copies of the program are spawned from the PVM console. These things will now be explained in more detail and with more examples.
Each PVM task receives a unique task identifier (tid) from the PVM daemon. This tid is assigned when the process enrolls in PVM (on its first call to a PVM routine) and is used by the other tasks to address messages to it.
The following subsections give an example program in C. For this example, we shall assume that the user has built their own set of PVM executables on the LINUX machine cetus1a and that the program sources reside in the user's $HOME directory.
The examples given here illustrate a master/worker code that sums up the values of an integer array. The master task spawns off five worker tasks, sends each of the workers a portion of the array to be summed up, receives the partial totals from each of the workers, and adds those up for the final total. The worker tasks receive an array of integers, add up all the values in the array, and send the total back to the master process.
Figure 1: C version of master example.
#include "pvm3.h"
#include <stdio.h>
#define SIZE 1000
#define NPROCS 5
int main(void)
{
    int mytid, task_ids[NPROCS];
    int a[SIZE], results[NPROCS], sum = 0;
    int i, msgtype, num_data = SIZE/NPROCS;
    /* enroll in PVM */
    mytid = pvm_mytid();
    /* fill the array that is to be summed */
    for (i = 0; i < SIZE; i++)
        a[i] = i % 25;
    /* spawn worker tasks */
    pvm_spawn("worker", (char **)0, PvmTaskDefault, "", NPROCS, task_ids);
    /* send data to worker tasks */
    for (i = 0; i < NPROCS; i++) {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&num_data, 1, 1);
        pvm_pkint(&a[num_data*i], num_data, 1);
        pvm_send(task_ids[i], 4);
    }
    /* wait and gather results */
    msgtype = 7;
    for (i = 0; i < NPROCS; i++) {
        pvm_recv(task_ids[i], msgtype);
        pvm_upkint(&results[i], 1, 1);
        sum += results[i];
    }
    printf("The sum is %d \n",sum);
    pvm_exit();
    return 0;
}
Figure 2: C version of worker example.
#include "pvm3.h"
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int mytid;
    int i, sum, *a;
    int num_data, master;
    /* enroll in PVM */
    mytid = pvm_mytid();
    /* receive portion of array to be summed */
    pvm_recv(-1, -1);
    pvm_upkint(&num_data, 1, 1);
    a = (int *) malloc(num_data*sizeof(int));
    pvm_upkint(a, num_data, 1);
    sum = 0;
    for (i = 0; i < num_data; i++)
        sum += a[i];
    free(a);
    /* send computed sum back to master */
    master = pvm_parent();
    pvm_initsend(PvmDataRaw);
    pvm_pkint(&sum, 1, 1);
    pvm_send(master, 7);
    pvm_exit();
    return 0;
}
Figure 1 gives the C code for the master program which is contained in
the file master.c. Figure 2 shows the C code for the worker portion
of the PVM example application stored in the file worker.c. Let's
examine each program to see how the PVM calls are used.
The first line of both programs includes the PVM header file. This file defines PVM symbolic names and functions.
The master program then initializes the data that is to be summed up. Next, the worker processes are spawned.
int numt = pvm_spawn(char *task, char **argv, int flag, char *where,
                     int ntask, int *tids)
The task parameter is a string containing the name of the executable file to be run ("spawned"). Any arguments that must be sent to this program are in an array pointed to by argv. If no arguments are required by the task, then the argv parameter can be NULL. The flag parameter is used to determine the specific machine or type of architecture the spawned task is to be run on. Possible values for flag include PvmTaskDefault (PVM chooses where to run the task), PvmTaskHost (where names a particular host) and PvmTaskArch (where names an architecture type). For the full list of flag values and for the error codes, try man pvm_spawn.
In our example code, five (NPROCS) copies of the executable file "worker" will be spawned by the master program. No arguments are to be sent, and we are allowing PVM to choose which machines will be used to execute the worker code. The task ids will be placed in the array task_ids.
To send a message from one task to another, a send buffer is created to hold the data. The function pvm_initsend() creates and clears a buffer and returns a buffer identifier.
int bufid = pvm_initsend (int encoding)
Generally, pvm_initsend() must be called each time a new message is to be sent, in order to clear the default send buffer. The encoding parameter can be either PvmDataDefault, PvmDataRaw or PvmDataInPlace. The PvmDataDefault option uses XDR (External Data Representation standard) encoding for sending message data, which allows machines with different data representations to exchange information. The PvmDataRaw option does no encoding of the message data. The PvmDataInPlace option leaves the data in memory instead of copying it to the send buffer; this saves time but requires that the programmer not alter the data until the send is completed.
The send buffer needs to be packed with the data to be sent (think of packing a suitcase). The functions to pack data into the active send buffer are pvm_pkXXXX(), where XXXX indicates the type of data being packed. The data types supported by PVM (and their XXXX function designation) are byte (byte), complex (cplx), double complex (dcplx), double (double), float (float), integer (int), long (long) and short (short). The example code packs integers and uses the function pvm_pkint().
int info = pvm_pkint (int *np, int nitem, int stride)
The np parameter is a pointer to the data to be packed. The nitem parameter is the total number of items to be packed. The stride parameter is the stride (step size) to use when packing; a stride of 1 packs consecutive items. There is also a function to pack a NULL-terminated string (str) that requires only a single parameter, the pointer to the string.
A single message can be packed with any number of different data types and there is no limit on the complexity of the message. However, you should ensure that the received message is unpacked in the same order it was originally packed.
The function to send a message is pvm_send().
int info = pvm_send (int tid, int msgtag)
This function attaches an integer label of msgtag and immediately sends the contents of the send buffer to the task with the task id of tid. The msgtag is an arbitrary integer that can be used to distinguish between different messages that a task could send out.
In the example code, for each worker, the master program clears the send buffer for each new message and packs this buffer with two things: 1) the number of array elements that follow in the message and 2) the array portion to be summed. Since each consecutive item from the array a is to be sent, starting with the num_data*i position, the stride for the packing function is 1. The task_ids array that was returned from the pvm_spawn() call is used to address each different task that will receive a portion of the array. The arbitrarily chosen value `4' is the msgtag used to label the messages.
After the array portions have been distributed to the worker tasks, the workers will process the data and send back some results.
The master program must receive (pvm_recv()) a partial sum from each of the worker processes.
int bufid = pvm_recv (int tid, int msgtag)
This will receive a message from task tid with label msgtag and place it into the receive buffer with id bufid. If no message is waiting from the given task with the expected label, the function waits until a message from the proper task and with the correct label arrives. Values of `-1' for the parameters will match with any task id and/or label. In the example code, the master program is expecting a label value of `7' on messages from the worker tasks. The loop forces the messages to be received from the workers in order.
Once a message has been received, the data within must be unpacked. The unpacking functions are pvm_upkXXXX(), where XXXX corresponds to the type of data that is to be unpacked. The same XXXX extensions used in the pvm_pkXXXX() packing functions are valid for pvm_upkXXXX(). For example, since the master program is receiving an integer from its worker processes, it calls the integer unpacking function:
int info = pvm_upkint (int *np, int nitem, int stride)
The np parameter is a pointer to where the data is to be stored. The nitem parameter specifies the number of items to be unpacked, and the stride parameter specifies the stride to be used when unpacking the data. Our example code unpacks each partial result received into a different element of the results array and adds it to the running sum.
After the sum is computed and printed the master task informs the PVM daemon that it is exiting from the virtual machine.
int info = pvm_exit()
As for the worker program, after enrolling in the virtual machine, the worker tasks wait to receive their portion of the array to be summed. Using the `-1' wild cards in the pvm_recv() call indicates that the task does not care what task the message was sent from nor what message label was used. This is because the worker has not yet found out the tid of the master. Since the size of the array being sent may not be known ahead of time, after unpacking the number of data items from the message, the worker code allocates enough memory to hold the rest of the data contained in the message. The array fragment is summed up and the total is sent back to the parent.
The task id of the parent task that spawned the current task is returned by the pvm_parent() function.
int parent_id = pvm_parent()
Note that since the master program is expecting a msgtag of `7' from the worker tasks, this value must be used in the pvm_send() call.
Helpful Hint: If the set of tasks spawned will need to communicate amongst themselves, the parent task needs to send the task array tids that is generated in the pvm_spawn() function to each.
The compilation process for our example codes would be
> cc -o master master.c -I$PVM_ROOT/include -L$PVM_ROOT/lib/LINUX -lpvm3 -lnsl
> cc -o worker worker.c -I$PVM_ROOT/include -L$PVM_ROOT/lib/LINUX -lpvm3 -lnsl
Complete details for all PVM library functions can be found in the PVM 3 User's Guide and Reference Manual, available online at http://www.netlib.org/pvm3/book/node1.html. It also includes troubleshooting information.
cetus1a> pvm
pvm>
The prompt for the PVM console is pvm>. From this prompt you can add or delete computers from your virtual machine, list the current virtual machine configuration and check the status of executing tasks. The help or ? command accesses the interactive help facility.
The conf command lists the current virtual machine configuration. This list includes the host name, PVM daemon task id, architecture type, and relative speed rating. When the console is first started, only the initial host is listed:
pvm> conf
1 host, 1 data format
                    HOST     DTID     ARCH   SPEED
                 cetus1a    40000     SUN4    1000
In order to add new hosts, use the add command:
pvm> add cetus1b
....
pvm> conf
6 hosts, 1 data format
                    HOST     DTID     ARCH   SPEED
                 cetus1a    40000     SUN4    1000
                 cetus1b    80000     SUN4    1000
                 cetus1c    c0000     SUN4    1000
                 cetus1d   100000     SUN4    1000
                 cetus1e   140000     SUN4    1000
                 cetus1f   180000     SUN4    1000
If you wish to dynamically create or alter your virtual machine, you do not need to create a hostfile. Use the add and delete commands to add and remove computers from your virtual machine.
Your PVM application can be started on any of the hosts in the virtual machine. To do this from the initial host, we can either put the PVM console into the background with a control-Z followed by bg or issue the quit command. The quit command only exits the console program; all the daemons and PVM tasks are left running. You may quit and restart the PVM console any number of times without disturbing any of the PVM daemons or hosts in the configuration. Of course, if you have multiple windows on the initial host, you can leave the console program running in one window while you execute your programs from another.
pvm> quit
pvmd still running.
cetus1a>
As if it were any other UNIX executable, you need only type the name of the executable at your system prompt to start your PVM application.
cetus1a> master
The sum is 12000
While the PVM daemon is running, you may edit and recompile your PVM programs as you require. There is no need to keep halting and restarting the daemon and console programs to make alterations in your code.
When you have finished, you need to halt PVM. Bring the PVM console to the foreground with the fg command or restart it if you had quit earlier and issue the halt command.
cetus1a> pvm
pvmd already running.
pvm> halt
cetus1a>
This will terminate all PVM daemons within the virtual machine configuration that included the host issuing the halt command.
The PVM Home Page is at the URL: http://www.epm.ornl.gov/pvm/pvm_home.html. A lot of useful information can be found there, including pointers to other introductions, and information about the latest developments in PVM.
The PVM 3 User's Guide and Reference Manual is the most complete and readily available source of information on PVM. This document can be obtained from netlib@ornl.gov by sending the message send ug.ps from pvm3. It is also available via the URL: http://www.netlib.org/pvm3/ug.ps.
There is a book on PVM available from MIT Press called PVM: Parallel Virtual Machine A Users' Guide and Tutorial for Networked Parallel Computing. An online HTML version of the book can be browsed at URL: http://www.netlib.org/pvm3/book/pvm-book.html.
A convenient reference card to PVM calls is available from URL: http://www.netlib.org/pvm3/refcard.ps. It summarizes the C and Fortran interfaces in two sheets.
A UseNet newsgroup, comp.parallel.pvm, exists. The forum is
intended for users to exchange ideas, tricks, successes and problems.