The Queen's University of Belfast

Parallel Computer Centre
System Components
- pvmd3 daemon
- Runs on each host
- Provides inter-host point of contact
- Provides authentication of tasks
- Executes processes on host
- Provides fault tolerance
- Routes messages and acts as the source and sink for messages
- libpvm programming library
- Contains the message passing routines and is linked to each application component
- application components
- User's programs, containing message passing calls, which are executed as PVM tasks
PVM Terminology
- Host
- A physical machine such as a workstation
- Virtual Machine
- A combination of hosts running as a single concurrent resource
- Process
- A program together with its data, stack, etc., such as a Unix process or a node program
- Task
- A PVM process - the smallest unit of computation
- TID
- A unique identifier, within the virtual machine, which is associated with each task
- Message
- An ordered list of data to be sent between tasks
Message Passing
Send and receive
- Sending a message is a 3 step process
- Initialize a buffer using the pvmfinitsend function
- Pack the data into the buffer using pvmfpack
- Send the contents of the buffer to another process using pvmfsend or to a number of processes (multi-cast) using pvmfmcast
- Receiving a message is a 2 step process
- Call the blocking routine pvmfrecv or the non-blocking routine pvmfnrecv
- Unpack the message using pvmfunpack
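The send and receive sequences above can be illustrated with a toy model in Python. This is not the PVM API: the in-process queue standing in for the transport, the byte layout, and the function names send_message and receive_message are all assumptions made for illustration. The point it shows is that the receiver must unpack items in the same order and with the same types as the sender packed them.

```python
import queue
import struct

# Hypothetical stand-in for the transport between two tasks (not PVM).
mailbox = queue.Queue()

def send_message(values, scale):
    """Mimic the three send steps: initialise, pack, send."""
    buf = bytearray()                                 # 1. initialise an empty buffer
    buf += struct.pack("!i", len(values))             # 2. pack an item count...
    buf += struct.pack(f"!{len(values)}i", *values)   #    ...then the integers...
    buf += struct.pack("!d", scale)                   #    ...then one double
    mailbox.put(bytes(buf))                           # 3. send the buffer

def receive_message():
    """Mimic the two receive steps: blocking receive, then unpack."""
    buf = mailbox.get()                               # 1. block until a message arrives
    (count,) = struct.unpack_from("!i", buf, 0)       # 2. unpack in the order packed
    values = list(struct.unpack_from(f"!{count}i", buf, 4))
    (scale,) = struct.unpack_from("!d", buf, 4 + 4 * count)
    return values, scale

send_message([1, 2, 3], 0.5)
print(receive_message())
```

In this model a non-blocking receive would poll with mailbox.get_nowait(), which raises queue.Empty when no message has arrived, instead of waiting.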
Message Buffers
- PVM permits a user to manage multiple buffers, but
- there is only one active send and one active receive buffer per process at any given moment
- the packing, sending, receiving and unpacking routines only affect active buffers
- the developer may switch between multiple buffers for message passing
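The active-buffer rule above can be sketched as a small Python model. It is a simplification that tracks only the send side, and the class BufferManager and its method names are hypothetical, not PVM routines. It shows that packing always lands in whichever buffer is currently active, and that switching buffers leaves the other buffers untouched.

```python
class BufferManager:
    """Toy model: many buffers may exist, but only one is the active send buffer."""

    def __init__(self):
        self.buffers = {}        # buffer id -> list of packed items
        self.active_send = None  # id of the active send buffer, if any
        self._next_id = 1

    def make_buffer(self):
        """Create a new, empty buffer and return its id."""
        bufid = self._next_id
        self._next_id += 1
        self.buffers[bufid] = []
        return bufid

    def set_send_buffer(self, bufid):
        """Make bufid the active send buffer; return the previously active id."""
        if bufid not in self.buffers:
            raise KeyError(f"no such buffer: {bufid}")
        previous, self.active_send = self.active_send, bufid
        return previous

    def pack(self, item):
        """Pack an item into the active send buffer only."""
        if self.active_send is None:
            raise RuntimeError("no active send buffer")
        self.buffers[self.active_send].append(item)

mgr = BufferManager()
a = mgr.make_buffer()
b = mgr.make_buffer()
mgr.set_send_buffer(a)
mgr.pack("for task 1")
old = mgr.set_send_buffer(b)   # switch buffers; a keeps its contents
mgr.pack("for task 2")
```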
Packing Data
- In C, PVM provides a separate packing routine for each type of data to be packed
- In Fortran, PVM provides only one routine (pvmfpack) for packing data; it handles all data types
PVM Console
Functions
- The PVM console is analogous to a console on any multi-tasking computer. It is used to
- configure the virtual machine
- start pvm tasks (including the daemon)
- stop pvm tasks
- receive information and error messages
Starting the console
pvm
- The console may be started and stopped multiple times on any of the hosts on which PVM is running
- The console responds with the prompt
pvm>
- The console accepts commands from standard input
Configuration of PVM
conf
- The configuration of the virtual machine may be listed using the conf command
pvm> conf
1 host, 1 data format
HOST           DTID      ARCH    SPEED
navaho-atm     40000     SGI5    1000
- This shows a virtual machine of one host, navaho-atm, together with its pvmd task ID (DTID), architecture and relative speed rating
add
- Additional hosts may be added using the add command
pvm> add sioux-atm mohican-atm
2 successful
HOST           DTID
sioux-atm      100000
mohican-atm    140000
pvm> conf
3 hosts, 1 data format
HOST           DTID      ARCH    SPEED
navaho-atm     40000     SGI5    1000
sioux-atm      100000    SGI5    1000
mohican-atm    140000    SGI5    1000
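When the conf listing has to be consumed by a script, the table can be split into records. The following Python sketch assumes the output has the shape shown in this section (a header row beginning with HOST, followed by one whitespace-separated row per host); the function name parse_conf is hypothetical and all field values are kept as strings, since their interpretation is not stated here.

```python
def parse_conf(text):
    """Return one dict per host row found in `conf` output."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Locate the header row; everything after it is treated as table rows.
    for i, ln in enumerate(lines):
        if ln.split()[:1] == ["HOST"]:
            header = ln.split()
            rows = lines[i + 1:]
            break
    else:
        return []  # no header row found
    hosts = []
    for ln in rows:
        fields = ln.split()
        if len(fields) != len(header):
            break  # stop at the first line that is not a table row
        hosts.append(dict(zip(header, fields)))
    return hosts

sample = """\
3 hosts, 1 data format
HOST DTID ARCH SPEED
navaho-atm 40000 SGI5 1000
sioux-atm 100000 SGI5 1000
mohican-atm 140000 SGI5 1000
"""
hosts = parse_conf(sample)
print([h["HOST"] for h in hosts])
```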
delete
- Hosts may be removed from the virtual machine using the delete command
pvm> delete sioux-atm
1 successful
HOST           STATUS
sioux-atm      deleted
Executing a program
spawn
- The spawn command may be used to execute a program from the pvm console
pvm> spawn -> program-name
- The -> option directs the output from the program to the screen
Leaving PVM
quit, halt
- To exit from the console session while leaving all daemons and jobs running use the quit command
pvm> quit
- To kill all pvm processes, including the console, and then shut down PVM use the halt command
pvm> halt
Using a hostfile
- PVM may be started using a hostfile to define the virtual machine
- The hostfile lists the hosts in the user's virtual machine. In its simplest form it lists one host per line
- The first host listed must be the computer on which PVM is initially started
- For example
# Configuration used for hello/world
navaho-atm
sioux-atm
- Options such as the user ID and password (lo and pw) may be specified for each host
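The hostfile rules above can be checked before PVM is started. This Python sketch assumes the simple layout shown in the example: lines beginning with # are comments, and each remaining line is a host name optionally followed by whitespace-separated options, which are kept verbatim and not interpreted. The function names parse_hostfile and first_host_is are hypothetical.

```python
def parse_hostfile(text):
    """Return a list of (hostname, options) pairs, skipping comments and blanks."""
    hosts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, *options = line.split()
        hosts.append((name, options))
    return hosts

def first_host_is(hosts, local_name):
    """Check the rule that the first host listed is the machine starting PVM."""
    return bool(hosts) and hosts[0][0] == local_name

hostfile = """\
# Configuration used for hello/world
navaho-atm
sioux-atm
"""
entries = parse_hostfile(hostfile)
print(entries, first_host_is(entries, "navaho-atm"))
```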
Error handling
- In Fortran, the completion status of a pvm routine can generally be checked: a status >= 0 means the routine completed successfully, and a negative value indicates that an error has occurred
- When an error occurs, a message is automatically printed that shows the task ID, function and error
- Automatic error reporting may be disabled by calling pvmfserror
- Error reports may be generated manually by calling pvmfperror
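The status convention above (non-negative means success, negative means error) lends itself to a single checking helper. The Python sketch below only models that convention; it does not call PVM, and the names check_status and PvmStatusError are hypothetical.

```python
class PvmStatusError(Exception):
    """Raised when a routine reports a negative completion status."""

def check_status(info, routine):
    """Return info unchanged if it is >= 0; raise if it signals an error."""
    if info < 0:
        raise PvmStatusError(f"{routine} failed with status {info}")
    return info

# A non-negative status passes straight through.
info = check_status(0, "pvmfinitsend")
```

Wrapping every call in such a helper keeps manual error handling in one place when automatic reporting has been disabled.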
Debugging a PVM Application
- A task run by hand can be started under a serial debugger. For spawned tasks, add PVMDEBUG to the pvmfspawn arguments
- The following steps should be followed when debugging
- Run the program as a single task and debug it as any other serial program
- Run the program on a single host to eliminate network errors and errors in message passing
- Run the program across a few hosts to locate deadlock and synchronization problems
- Use XPVM
Fault Detection
- The pvmd will recover automatically, but applications must perform their own error recovery
- A dead host will remain dead, but may be reconfigured later
- The pvmd to pvmd message passing system drives all fault detection
- A fault is flagged when a message retry times out
Future Enhancements
- The next version of PVM is expected to have
- Greater scalability to enable hundreds of hosts to be used in parallel
- Parallel startup
- Better debugging: libpvm to generate trace messages
- Better utilization of high-speed networks, e.g. FDDI
- Better console program with features such as command history, command aliases and quoting
All documents are the responsibility of, and copyright, © their authors and do not represent the views of The Parallel Computer Centre, nor of The Queen's University of Belfast.
Maintained by Alan Rea, email A.Rea@qub.ac.uk