: Michael Resch, Xin Wang, Wolfgang Bez, Erich Focht, Hiroaki Kobayashi, Sabine Roller
: Michael M. Resch, Xin Wang, Erich Focht, Hiroaki Kobayashi, Sabine Roller
: High Performance Computing on Vector Systems 2011
: Springer-Verlag
: 9783642222443
: 1
: CHF 113.90
: General, Reference Works
: English
: 184
: Watermark
: PC/MAC/eReader/Tablet
: PDF
The book presents the state of the art in high performance computing and simulation on modern supercomputer architectures. It covers general trends in hardware and software development, with a specific focus on the future of vector-based systems and heterogeneous architectures. The application contributions cover computational fluid dynamics, material science, medical applications and climate research. Innovative fields such as coupled multi-physics and multi-scale simulations are also presented. All papers were chosen from presentations given at the 13th Teraflop Workshop held in October 2010 at Tohoku University, Japan.
High Performance Computing on Vector Systems 2011
Preface
Contents
Part I: Techniques and Tools for High Performance Systems
Performance and Scalability Analysis of a Chip Multi Vector Processor
1 Introduction
2 Chip Multi Vector Processor
2.1 Structure of a Chip Multi Vector Processor
2.2 Performance Model of a Chip Multi Vector Processor
3 Performance Tuning for a Chip Multi Vector Processor
3.1 Performance Analysis Using the Roofline Model
3.2 Program Optimization
3.2.1 Loop Unrolling
3.2.2 Cache Blocking
3.2.3 Performance Tuning Strategy Based on the Roofline Model
4 Performance and Scalability Analysis
4.1 Methodology
4.2 Benchmarks
4.3 Performance Evaluation of CMVP
4.4 Performance Evaluation of CMVP with Performance Tuning
5 Conclusions
References
I/O Forwarding for Quiet Clusters
1 Introduction
2 Operating System Noise
2.1 So … Who's the Noisy Neighbour?
2.2 Impact on Applications
2.3 Mitigation
2.3.1 Silence Your System
2.3.2 Embrace Noise
2.3.3 Synchronize Noise
2.3.4 Prioritize
2.3.5 Travel Light
3 Measuring Noise
3.1 Test System
3.2 Fixed Work Quanta Benchmark
3.3 Fixed Time Quanta Benchmark
4 I/O Induced Noise
5 I/O Forwarding
5.1 I/O Forwarding Architecture
5.2 System I/O Interceptors: Libsysio
5.3 I/O Forwarding Protocol: IOD Driver and Server
5.4 Communication Framework: Portals
5.5 Using the I/O Forwarding Framework
5.6 Noise
5.7 FUSE Driver
6 Conclusion
References
A Prototype Implementation of OpenCL for SX Vector Systems
1 Introduction
2 OpenCL
3 OpenCL for SX
4 Early Evaluation and Discussions
5 Conclusions
References
Distributed Parallelization of Semantic Web Java Applications by Means of the Message-Passing Interface
1 Introduction
2 Use Case Description: Random Indexing
3 Parallelization Strategy
4 Realization by Means of MPI
5 Implementation
6 Application Performance Evaluation
7 Performance Tailoring: Hybrid MPI-Java Threads Communication Pattern
8 Final Discussion and Conclusion
References
HPC Systems at JAIST and Development of Dynamic Loop Monitoring Tools Toward Runtime Parallelization
1 Introduction
2 Information Environment and HPC Systems at JAIST
3 Development of Dynamic Loop Monitoring Tools Toward Runtime Parallelization
3.1 Background and Objectives of Dynamic Loop Monitoring Tools
3.2 Parallelism and Loop Nest Structures
3.3 Loop Nest Detection and Loop-Call Context Tree Generation
3.4 Evaluation of Our L-CCT Generation
3.4.1 Experiment
3.4.2 Results
3.5 Run-Time Data Dependence Analysis
3.5.1 Motivations and Strategies
3.5.2 Details of Our Runtime Data Dependence Analysis
3.5.3 Preliminary Evaluation of Runtime Data Dependence Analysis
4 Conclusions
References
Part II: Methods and Technologies for Large-Scale Systems
Tree Based Voxelization of STL Data
1 Introduction
2 Octree Overview
3 Mesh Generation
3.1 Intersection Algorithm and Tree Generation
3.2 Flooding
3.3 Boundary Conditions
3.4 The File Format
4 Sample Mesh
5 Outlook
References
An Adaptable Simulation Framework Based on a Linearized Octree
1 Introduction and Overall Layout of the Apes Framework
1.1 Used Technologies
1.2 Components of the Apes Suite
1.3 Distributed Computing
2 Related Work
3 The Distributed Linearized Octree
3.1 Implementation of the Element Description
3.2 Element Properties
3.3 Acting on the Tree
4 Configuration of Simulation Runs
5 Usage in Solvers
5.1 Ateles
5.2 Musubi
6 Outlook
References
High Performance Computing for Analyzing PB-Scale Data in Nuclear Experiments and Simulations
1 Introduction
2 Large-Scale Data Integrated Analysis System
3