| Title | Page |
|---|---|
| Preface | 5 |
| Contents | 7 |
| I Petaflop/s Computing | 14 |
| Lessons Learned from 1-Year Experience with SX-9 and Toward the Next Generation Vector Computing | 15 |
| Introduction | 15 |
| SX-9 System Overview | 16 |
| HPC Challenge Benchmark Results | 18 |
| Case Study Analysis of Memory-Conscious Tuning for SX-9 | 25 |
| Multi-Vector Cores Processor Design | 30 |
| Summary | 33 |
| References | 34 |
| BSC-CNS Research and Supercomputing Resources | 35 |
| Overview | 35 |
| Supercomputing Resources at BSC | 36 |
| MareNostrum | 36 |
| MareNostrum Performance 2008 | 38 |
| Shared Memory System | 38 |
| Backup and HSM Service | 39 |
| Spanish Supercomputing Network | 39 |
| PRACE Prototype | 40 |
| Research at BSC | 42 |
| Challenges and Opportunities of Hybrid Computing Systems | 43 |
| Introduction | 43 |
| European Context | 45 |
| Validation Scenario | 46 |
| Initial Results | 47 |
| Operational Requirements | 49 |
| Conclusions and Future Work | 51 |
| References | 51 |
| Going Forward with GPU Computing | 52 |
| Computing Needs at CEA | 52 |
| Starting the Process | 54 |
| Available Hardware | 54 |
| Choosing a Programming Language | 57 |
| CUDA | 57 |
| OpenCL | 58 |
| RapidMind | 58 |
| HMPP | 58 |
| A Remark on Languages | 60 |
| Training Sessions | 60 |
| The System Administration Side | 60 |
| The Grand Challenges Strategy | 61 |
| Foreseen Problems | 61 |
| First Results | 62 |
| Conclusion | 63 |
| Optical Interconnection Technology for the Next Generation Supercomputers | 64 |
| Introduction | 64 |
| Components and Structure | 66 |
| Performance | 67 |
| Conclusions | 69 |
| References | 69 |
| HPC Architecture from Application Perspectives | 70 |
| Introduction | 70 |
| Trend of CPU Performance | 72 |
| Architectural Challenges | 74 |
| SIMD-based Approaches | 75 |
| Conclusions | 77 |
| References | 78 |
| II Strategies | 79 |
| A Language for Fortran Source to Source Transformation | 80 |
| Compiler | 80 |
| Self Defined Transformations | 81 |
| The Transformation Language | 81 |
| Transformation Variables | 82 |
| Transformation Constructs | 82 |
| Self Defined Procedures in the Transformation Code | 83 |
| Intrinsic Procedures | 83 |
| Parsing Primitives in Parsing Mode | 84 |
| Examples | 85 |
| Concluding Remarks | 87 |
| The SX-Linux Project: A Progress Report | 88 |
| Introduction | 89 |
| Project Paths | 89 |
| Progress and Status | 91 |
| The GNU Toolchain | 91 |
| Binutils | 91 |
| GCC | 92 |
| Current Toolchain Status | 93 |
| Future Work | 94 |
| User Space and I/O Forwarding | 94 |
| Newlib | 94 |
| Future of Newlib | 95 |
| Virtualization Layer | 96 |
| I/O Forwarding | 97 |
| I/O Forwarding: Current Implementation | 98 |
| I/O Forwarding Library Status | 99 |
| Kernel | 99 |
| Kitten LWK | 100 |
| Implementation and Status | 100 |
| Bootstrapping | 100 |
| Early Introspection | 100 |
| Stack and Memory Layout | 101 |
| Interrupts | 102 |
| System Calls | 102 |
| Context Switch | 102 |
| User Space | 103 |
| Status | 103 |
| Outlook | 104 |
| References | 105 |
| Development of APIs for Desktop Supercomputing | 106 |
| Introduction | 106 |
| Client APIs for GDS | 108 |
| Client APIs | 108 |
| Script Generator API | 108 |
| Implementation of Script Generator API in AEGIS | 110 |
| Development of GDS Application of Three-dimensional Virtual Plant Vibration Simulator | 112 |
| Three-dimensional Virtual Plant Vibration Simulator | 112 |
| Development of GDS Application of Three-dimensional Virtual Plant Vibration Simulator | 113 |
| Summary | 114 |
| References | 115 |
| The Grid Middleware on SX and Its Operation for Nation-Wide Service | 117 |
| Introduction | 117 |
| Structu
|