The Queen's University of Belfast

Parallel Computer Centre
Introduction
Summary (salesman)
Binary Compatible with Y-MP architecture
EL => Entry Level: a departmental supercomputer or minisupercomputer.
Air-cooled, < 6 kW of power per cabinet - can be installed in an air-conditioned office.
Supports a full implementation of UNICOS - Cray's flavour of UNIX.
Supports all/most of Cray's software development tools.
System Specifications
CPU
- CMOS Technology
- 30ns clock period
- 64 bit architecture
- Maximum of 4 single processors or 8 dual processors
- 133 Mflops peak per CPU
- Shared registers for inter-processor communication and synchronization
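The clock period and the peak rate quoted above are linked by simple arithmetic. The sketch below converts the 30 ns period to a clock frequency and derives how many floating-point results per clock period the 133 Mflops figure implies. The variable names are illustrative, and the source does not say how that per-clock figure is achieved.

```python
# Figures taken from the CPU specification list above.
CLOCK_PERIOD_NS = 30.0        # clock period in nanoseconds
PEAK_MFLOPS_PER_CPU = 133.0   # quoted peak per CPU

# A period of p nanoseconds corresponds to 1000 / p MHz.
clock_mhz = 1000.0 / CLOCK_PERIOD_NS

# Floating-point results per clock period implied by the quoted peak
# (a derived value, not a figure stated in the source).
flops_per_clock = PEAK_MFLOPS_PER_CPU / clock_mhz

print(f"clock frequency : {clock_mhz:.1f} MHz")
print(f"flops per clock : {flops_per_clock:.2f}")
```

The result is roughly 33.3 MHz and about 4 floating-point results per clock period.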
Memory
- Shared central memory
- 4 ports per CPU
- 70 ns CMOS DRAM
- 256 - 1024 Mbytes memory (32 - 128 Mwords)
- 1.05 Gbytes/sec memory bandwidth per CPU, approximately 130 Mwords/sec
- 4.2 Gbytes/sec total bandwidth
- 64 memory banks (regardless of memory size)
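The memory figures above can be cross-checked with a few lines of arithmetic. The sketch below converts Mbytes to 64-bit Mwords and estimates per-CPU bandwidth under the assumption (not stated in the source) that each of the 4 ports moves one 64-bit word per clock period. The estimate lands slightly above the quoted 1.05 and 4.2 figures, so the assumption may not match the hardware exactly.

```python
BYTES_PER_WORD = 8        # 64-bit architecture
CLOCK_PERIOD_NS = 30.0    # from the CPU specification
PORTS_PER_CPU = 4
CPUS = 4                  # maximum number of single processors


def mbytes_to_mwords(mbytes):
    """Convert a size in Mbytes to 64-bit Mwords."""
    return mbytes / BYTES_PER_WORD


# Assumption: every port transfers one word per clock period.
words_per_sec = PORTS_PER_CPU / (CLOCK_PERIOD_NS * 1e-9)
mwords_per_sec = words_per_sec / 1e6
gbytes_per_sec = words_per_sec * BYTES_PER_WORD / 1e9
total_gbytes_per_sec = gbytes_per_sec * CPUS

print(f"256 Mbytes  = {mbytes_to_mwords(256):.0f} Mwords")
print(f"1024 Mbytes = {mbytes_to_mwords(1024):.0f} Mwords")
print(f"per CPU     : {mwords_per_sec:.1f} Mwords/sec, {gbytes_per_sec:.2f} Gbytes/sec")
print(f"all CPUs    : {total_gbytes_per_sec:.2f} Gbytes/sec")
```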
I/O
- VME based - 40 Mbytes/sec per VME
- 1 - 4 per CPU
- 1 - 4 in-cabinet (40 Gbytes max disk capacity)
- Up to 3 expansion cabinets with a max of 16 VME sub-systems (200 Gbytes max disk capacity)
- 1.05 Gbytes/sec total system I/O
- 640 Mbytes/sec total VME bandwidth
- 100 Mbytes/sec HIPPI
- 264 Mbytes/sec per CPU
Physical Characteristics
- 635 kg
- 1 square metre footprint
- 127 cm x 381 cm
- 6 kW power consumption
- Air cooled
- Operating temperature 10°C - 30°C; any air-conditioned office is suitable, with no need for an expensive controlled environment.
CPU Schematic
- Primary registers accessible directly by the functional units:
- V(ector) - 8 registers each containing 64 elements of 64 bits.
- S(calar) - 8 registers (64 bits each).
- A(ddress) - 8 registers.
- Intermediate registers B and T act as fast backup storage (a form of caching) for the primary registers: B backs the A registers, T backs the S registers.
- VM - 64 bit vector mask. Each bit refers to an element in a vector register. The mask is used to select elements from a vector.
- VL vector length - range 1 to 64 specifying the length of the vector.
- RTC - real-time clock
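The roles of VL and VM in the list above can be illustrated with a small software model. The sketch below splits a long array into strips of at most 64 elements (one vector register's worth, with the strip size playing the part of VL) and uses an integer bit mask in the role of VM to select elements. All function names are hypothetical, and the bit numbering (bit i for element i) is a simplification that may differ from the hardware's.

```python
MAX_VL = 64  # a vector register holds 64 elements


def strips(n, max_vl=MAX_VL):
    """Yield (start, vl) pairs covering n elements in chunks of at most max_vl."""
    start = 0
    while start < n:
        vl = min(max_vl, n - start)
        yield start, vl
        start += vl


def make_mask(values, predicate):
    """Build a mask as an int: bit i is set when predicate(values[i]) holds."""
    mask = 0
    for i, v in enumerate(values):
        if predicate(v):
            mask |= 1 << i
    return mask


def masked_merge(mask, a, b):
    """Element-wise select: a[i] where bit i of the mask is set, else b[i]."""
    return [a[i] if (mask >> i) & 1 else b[i] for i in range(len(a))]


def clamp_negatives(data):
    """Replace negative entries with 0.0, one strip of up to 64 elements at a time."""
    out = []
    for start, vl in strips(len(data)):
        chunk = data[start:start + vl]
        vm = make_mask(chunk, lambda x: x < 0)
        out.extend(masked_merge(vm, [0.0] * vl, chunk))
    return out


print(list(strips(150)))
print(clamp_negatives([1.0, -2.0, 3.0]))
```

For 150 elements the strips are two full vectors of 64 and a final one of 22, which is why VL can take any value from 1 to 64.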
Functional Units
- Vector: Add, Shift, Logical, 2nd Logical, Population/Parity
- Floating point: Add, Multiply and Reciprocal Approximation - operands may be a pair of vector or scalar registers.
- Scalar: Add, Shift, Logical and Population/Parity/Leading zero.
- Address: Add and multiply.
- Segmented, i.e. each operation is broken down into steps that each complete in 1 clock cycle. Pipelining - once the pipeline is full, a result is produced every clock cycle.
- Independent - Add & Multiply may occur in parallel.
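- Chaining - a result from one functional unit becomes the input to another as soon as it is produced, without waiting for the whole vector operation to finish.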
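The benefit of segmentation, pipelining and chaining can be seen in a simple cycle-count model. The sketch below is an idealised model, not a description of the actual hardware timings: the function names and the segment counts (7 for multiply, 6 for add) are hypothetical values chosen only for illustration.

```python
def unpipelined_cycles(n, stages):
    """Each element passes through every segment before the next one starts."""
    return n * stages


def pipelined_cycles(n, stages):
    """First result after `stages` cycles, then one further result per cycle."""
    if n <= 0:
        return 0
    return stages + (n - 1)


def unchained_cycles(n, stages_a, stages_b):
    """The second unit starts only after the first has produced all n results."""
    return pipelined_cycles(n, stages_a) + pipelined_cycles(n, stages_b)


def chained_cycles(n, stages_a, stages_b):
    """Each result of the first unit feeds the second unit immediately,
    so the pair behaves like one longer pipeline."""
    return pipelined_cycles(n, stages_a + stages_b)


N = 64                 # one full vector register
MUL_STAGES = 7         # hypothetical segment count
ADD_STAGES = 6         # hypothetical segment count

print("multiply, no pipelining :", unpipelined_cycles(N, MUL_STAGES))
print("multiply, pipelined     :", pipelined_cycles(N, MUL_STAGES))
print("multiply then add       :", unchained_cycles(N, MUL_STAGES, ADD_STAGES))
print("multiply chained to add :", chained_cycles(N, MUL_STAGES, ADD_STAGES))
```

With these illustrative values the counts are 448, 70, 139 and 76 cycles respectively, so chaining nearly halves the time of a multiply followed by an add.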
QUB configuration
- 4 CPUs (single processor per board)
- 1 Gbyte of memory
- 15 Gbytes disk
- Ethernet connection
- Possible upgrade to FDDI connection
- 1 I/O sub system
SOFTWARE
- Introduction to Unicos for Application Programmers (iuap) - a computer based training package for new users of the Cray
- Compilers for Fortran 77, C and Pascal
- Cray Tools - a suite of tools for measuring and improving program performance
- Application packages - PVM/Hence, NAG Libraries, Unichem, TurboKiva, MPGS.
On-line information
- Manual pages
- Document viewer - docview
- Computer based training - iuap
- explain & whatis
Running programs
- The Cray (QUB) is primarily intended for use as a batch environment
- submit all compute-intensive operations to the batch queues, including compilation of large programs
- to submit a job, write a script file and pass it to the queue system with the `qsub' command, which validates all Network Queuing System (NQS) request options
- Versions of the NAG routines which have been optimized for the Cray are available.
- Reducing compilation time - makefiles
- File compression utility - pack
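Makefiles reduce compilation time because make rebuilds a target only when it is missing or older than one of its dependencies. The sketch below models that timestamp rule in Python; the function name and the file names in the usage line are hypothetical, and real make adds much more (rules, dependency chains, and so on).

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if `target` is missing or older than any dependency.

    Every dependency is assumed to exist; a missing one raises OSError.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    if os.path.exists("solver.f"):
        print(needs_rebuild("solver.o", ["solver.f"]))
```

In a program made of many source files, only the files edited since the last build are recompiled, which matters when compilation itself has to go through the batch queues.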
All documents are the responsibility of, and copyright, © their authors and do not represent the views of The Parallel Computer Centre, nor of The Queen's University of Belfast.
Maintained by Alan Rea, email A.Rea@qub.ac.uk