PVM: System Components, QUB

The Queen's University of Belfast

Parallel Computer Centre

System Components

pvmd3 daemon
- Runs on each host
- Provides inter-host point of contact
- Provides authentication of tasks
- Executes processes on host
- Provides fault tolerance
- Routes messages and acts as a source and sink for messages
libpvm programming library
- Contains the message passing routines and is linked to each application component
application components
- User's programs, containing message passing calls, which are executed as PVM tasks

PVM Terminology

Host
- A physical machine such as a workstation
Virtual Machine
- A combination of hosts running as a single concurrent resource
Process
- A program together with its data, stack, etc., such as a Unix process or a node program
Task
- A PVM process - the smallest unit of computation
TID
- A unique identifier, within the virtual machine, which is associated with each task
Message
- An ordered list of data to be sent between tasks

Message Passing

Send and receive

Sending a message is a three-step process
- Initialize a buffer using the pvmfinitsend function
- Pack the data into the buffer using pvmfpack
- Send the contents of the buffer to another process using pvmfsend, or to a number of processes (multicast) using pvmfmcast
Receiving a message is a two-step process
- Call the blocking routine pvmfrecv or the non-blocking routine pvmfnrecv
- Unpack the message using pvmfunpack
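
The send and receive steps above can be sketched with a toy model. The Python below is not PVM's API: the Task class, its method names and the in-memory mailboxes are assumptions made for illustration, loosely mirroring pvmfinitsend, pvmfpack, pvmfsend, pvmfmcast, pvmfnrecv and pvmfunpack. Only the non-blocking receive is modelled, since a blocking receive would simply wait until a matching message arrives.

```python
class Task:
    """Toy stand-in for a PVM task; every name here is hypothetical."""

    registry = {}  # tid -> Task, standing in for the virtual machine

    def __init__(self, tid):
        self.tid = tid
        self.mailbox = []    # messages that have arrived for this task
        self.send_buf = []   # active send buffer
        self.recv_buf = []   # active receive buffer
        Task.registry[tid] = self

    # Sending: three steps
    def initsend(self):
        self.send_buf = []                     # step 1: start with an empty buffer

    def pack(self, *items):
        self.send_buf.extend(items)            # step 2: append data in order

    def send(self, dest_tid, tag):
        # step 3: deliver a copy of the buffer to one task
        message = (self.tid, tag, list(self.send_buf))
        Task.registry[dest_tid].mailbox.append(message)

    def mcast(self, dest_tids, tag):
        for dest_tid in dest_tids:             # step 3, multicast variant
            self.send(dest_tid, tag)

    # Receiving: two steps
    def nrecv(self, tag):
        """Non-blocking receive: True if a message with this tag was waiting."""
        for i, (_, msg_tag, data) in enumerate(self.mailbox):
            if msg_tag == tag:
                del self.mailbox[i]
                self.recv_buf = list(data)     # becomes the active receive buffer
                return True
        return False

    def unpack(self, count):
        # Items come out in the order in which they were packed
        items, self.recv_buf = self.recv_buf[:count], self.recv_buf[count:]
        return items


master, worker_a, worker_b = Task(1), Task(2), Task(3)
master.initsend()
master.pack(10, 2.5, "hello")
master.mcast([2, 3], tag=7)

if worker_a.nrecv(tag=7):
    print(worker_a.unpack(3))
```

Because the message is an ordered list, the receiver has to unpack items in the same order, and with the same counts, as the sender packed them.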

Message Buffers

PVM permits a user to manage multiple buffers, but
- there is only one active send and one active receive buffer per process at any given moment
- the packing, sending, receiving and unpacking routines only affect active buffers
- the developer may switch between multiple buffers for message passing
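
The idea of one active buffer among several can be illustrated with a small model. This is a sketch in plain Python, not PVM itself: the SendBuffers class and its methods mkbuf, setsbuf and pack are hypothetical names chosen to echo PVM's buffer routines (such as pvmfmkbuf and pvmfsetsbuf), and only the send side is modelled.

```python
class SendBuffers:
    """Toy model: many buffers may exist, but only one is the active send buffer."""

    def __init__(self):
        self.buffers = {}     # bufid -> list of packed items
        self.active = None    # id of the active send buffer, if any
        self._next_id = 1

    def mkbuf(self):
        """Create a new, empty buffer and return its id."""
        bufid = self._next_id
        self._next_id += 1
        self.buffers[bufid] = []
        return bufid

    def setsbuf(self, bufid):
        """Make bufid the active send buffer; return the previously active id."""
        if bufid not in self.buffers:
            raise KeyError(f"no such buffer: {bufid}")
        old, self.active = self.active, bufid
        return old

    def pack(self, *items):
        """Packing only ever touches the active buffer."""
        if self.active is None:
            raise RuntimeError("no active send buffer")
        self.buffers[self.active].extend(items)


bufs = SendBuffers()
results, scratch = bufs.mkbuf(), bufs.mkbuf()

bufs.setsbuf(results)
bufs.pack(1, 2)
previous = bufs.setsbuf(scratch)   # switch buffers, remembering the old one
bufs.pack("tmp")
bufs.setsbuf(previous)             # switch back; earlier contents are intact
bufs.pack(3)

print(bufs.buffers[results], bufs.buffers[scratch])
```

Saving the id returned when switching buffers lets a routine borrow the active slot for its own message and then restore the caller's buffer afterwards.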

Packing Data

In C, PVM provides a separate packing routine for each type of data to be packed
In Fortran, PVM provides a single routine, pvmfpack, which handles all data types
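
The difference between the two packing styles can be sketched as follows. This is an illustration in Python using the standard struct module, not PVM code: the function names, the type codes INTEGER4 and REAL8 as used here, and the choice of big-endian encoding are all assumptions of the sketch.

```python
import struct

# C style: one routine per data type (hypothetical names).
def pack_int(values):
    return struct.pack(f">{len(values)}i", *values)

def pack_double(values):
    return struct.pack(f">{len(values)}d", *values)

# Fortran style: a single routine, with the data type passed as an argument.
INTEGER4, REAL8 = "i", "d"   # hypothetical type codes for this sketch

def pack(what, values):
    return struct.pack(f">{len(values)}{what}", *values)

def unpack(what, count, data):
    return list(struct.unpack(f">{count}{what}", data))


# Both styles produce the same bytes for the same data.
assert pack(INTEGER4, [1, 2, 3]) == pack_int([1, 2, 3])
assert pack(REAL8, [1.5, -2.0]) == pack_double([1.5, -2.0])

print(unpack(REAL8, 2, pack(REAL8, [1.5, -2.0])))
```

Either way, the receiver must unpack with the same type and item count that the sender used.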

PVM Console

Functions

The PVM console is analogous to a console on any multi-tasking computer. It is used to
- configure the virtual machine
- start PVM tasks (including the daemon)
- stop PVM tasks
- receive information and error messages

Starting the console

pvm

The console may be started and stopped multiple times on any of the hosts on which PVM is running
The console responds with the prompt

pvm>

The console accepts commands from standard input

Configuration of PVM

conf

The configuration of the virtual machine may be listed using the conf command

pvm> conf
1 host, 1 data format
HOST          DTID    ARCH  SPEED
navaho-atm    40000   SGI5  1000

This shows a configuration of one host, navaho-atm, together with its pvmd task ID (DTID), architecture and relative speed rating

add

Additional hosts may be added using the add command

pvm> add sioux-atm mohican-atm
2 successful
HOST          DTID
sioux-atm     100000
mohican-atm   140000
pvm> conf
3 hosts, 1 data format
HOST          DTID    ARCH  SPEED
navaho-atm    40000   SGI5  1000
sioux-atm     100000  SGI5  1000
mohican-atm   140000  SGI5  1000

delete

Hosts may be removed from the virtual machine using the delete command

pvm> delete sioux-atm
1 successful
HOST          STATUS
sioux-atm     deleted

Executing a program

spawn

The spawn command may be used to execute a program from the PVM console

pvm> spawn -> program-name

The -> option directs the output from the program to the screen

Leaving PVM

quit, halt

To exit from the console session while leaving all daemons and jobs running, use the quit command

pvm> quit

To kill all PVM processes, including the console, and then shut down PVM, use the halt command

pvm> halt

Using a hostfile

PVM may be started using a hostfile to define the virtual machine
The hostfile lists the hosts in the user's virtual machine. In its simplest form it lists one host per line
The first host listed must be the computer on which PVM is initially started
For example

# Configuration used for hello/world
navaho-atm
sioux-atm

Options such as the user ID (lo) and password (pw) may be specified for each host
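
To make the hostfile layout concrete, here is a minimal sketch of a reader for it, written in Python. It is not part of PVM. It assumes that lines starting with # are comments, that the host name is the first token on a line, and that any options follow as key=value tokens. That option syntax, and the user ID alice, are assumptions made for the example.

```python
def parse_hostfile(text):
    """Return a list of (hostname, options) pairs, in file order."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        name, *tokens = line.split()
        options = {}
        for token in tokens:
            key, sep, value = token.partition("=")
            # key=value becomes a string entry; a bare flag becomes True
            options[key] = value if sep else True
        hosts.append((name, options))
    return hosts


EXAMPLE = """\
# Configuration used for hello/world
navaho-atm
sioux-atm lo=alice
"""

hosts = parse_hostfile(EXAMPLE)
print(hosts[0][0])   # the first entry is the host on which PVM is started
print(hosts)
```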

Error handling

In Fortran, the completion status of a PVM routine can generally be checked: a status >= 0 means that the routine completed successfully, while a negative value indicates that an error has occurred
When an error occurs, a message is automatically printed that shows the task ID, function and error
Automatic error reporting may be disabled by calling pvmfserror
Error reports may be generated manually by calling pvmfperror
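
The status convention described above (non-negative for success, negative for an error) can be wrapped in a small helper. The sketch below is plain Python, not PVM: the check function, the PvmStyleError class and the status values used in the demonstration are hypothetical.

```python
class PvmStyleError(Exception):
    """Raised when a routine reports a negative completion status."""


def check(info, routine):
    """Return info unchanged if it signals success (>= 0); raise otherwise."""
    if info < 0:
        raise PvmStyleError(f"{routine} failed with status {info}")
    return info


# Demonstration with made-up status values.
outcomes = []
for routine, info in [("pvmfinitsend", 1), ("pvmfsend", 0), ("pvmfrecv", -1)]:
    try:
        check(info, routine)
        outcomes.append((routine, "ok"))
    except PvmStyleError as err:
        outcomes.append((routine, str(err)))

print(outcomes)
```

Checking every status at the call site matters most when automatic error reporting has been disabled, since a failure is then otherwise silent.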

Debugging a PVM Application

A task run by hand can be started under a serial debugger. To start spawned tasks under a debugger, add PVMDEBUG to the flag argument of pvmfspawn
The following steps should be followed when debugging
- Run the program as a single task and debug it as any other serial program
- Run the program on a single host to eliminate network errors and errors in message passing
- Run the program across a few hosts to locate deadlock and synchronization problems
- Use XPVM

Fault Detection

The pvmd recovers automatically from faults, but applications must perform their own error recovery
A dead host will remain dead, but may be reconfigured later
The pvmd-to-pvmd message passing system drives all fault detection
A fault is declared when message retries time out
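
Fault detection by retry time-out can be sketched generically. The Python below is an illustration only and does not reproduce the pvmd's actual algorithm. The retry count, the initial time-out and the doubling back-off are invented parameters, and send_once is a hypothetical callback that reports whether an acknowledgement arrived within the given time-out.

```python
def detect_fault(send_once, max_retries=3, first_timeout=1.0, backoff=2.0):
    """Try to reach a peer; return (alive, waited).

    alive is False once every retry has timed out, and waited is the total
    time spent in time-outs before the outcome was known.
    """
    timeout, waited = first_timeout, 0.0
    for _ in range(max_retries):
        if send_once(timeout):
            return True, waited        # acknowledged: the peer is alive
        waited += timeout              # this attempt timed out
        timeout *= backoff             # wait longer before the next retry
    return False, waited               # all retries timed out: declare a fault


# Simulated peers (no real networking or sleeping is involved).
def dead_peer(timeout):
    return False

attempts = []
def slow_peer(timeout):
    attempts.append(timeout)
    return len(attempts) >= 2          # answers on the second attempt

print(detect_fault(dead_peer))
print(detect_fault(slow_peer))
```

In this model a peer that answers on any retry is still considered alive. Only the exhaustion of all retries marks it as failed, which is the point at which an application would need to begin its own recovery.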

Future Enhancements

The next version of PVM is expected to have
- Greater scalability to enable 100s of hosts to be used in parallel
- Parallel startup
- Better debugging: libpvm to generate trace messages
- Better high speed network utilization, e.g. FDDI
- Better console program with features such as command history, command aliases and quoting

All documents are the responsibility of, and copyright, © their authors and do not represent the views of The Parallel Computer Centre, nor of The Queen's University of Belfast.
Maintained by Alan Rea, email A.Rea@qub.ac.uk
