: Michael Resch, Rainer Keller, Valentin Himmler, Bett
: Rainer Keller, Valentin Himmler, Bettina Krammer, Alexander Schulz
: Tools for High Performance Computing Proceedings of the 2nd International Workshop on Parallel Tools for High Performance Computing, July 2008, HLRS, Stuttgart
: Springer-Verlag
: 9783540685647
: 1
: CHF 47.40
: Computer Science
: English
: 202
: Watermark/DRM
: PC/MAC/eReader/Tablet
: PDF
Developing software for current and especially for future architectures will require application and library programmers to know parallel programming techniques. Multi-core processors are already available today, and processors with a dozen and more cores are on the horizon. The major driving force in hardware development, the game industry, has already shown interest in using parallel programming paradigms, such as OpenMP, for further developments. Developers therefore have to be supported in the even more complex task of programming for these new architectures. HLRS has a long-lasting tradition of providing its user community with the most up-to-date software tools. Additionally, important research and development projects are worked on at the center: among the software packages developed are the MPI correctness checker Marmot, the OpenMP validation suite, and the MPI implementations PACX-MPI and Open MPI. All of these software packages are being extended in the context of German and European community research projects, such as ParMA, the InterActive European Grid (I2G) project, and the German Collaborative Research Center (Sonderforschungsbereich 716). Furthermore, industrial collaborations, i.e. with Intel and Microsoft, allow HLRS to get its software production-grade ready. In April 2007, a European project on Parallel Programming for Multi-core Architectures, in short ParMA, was launched, with a major focus on providing and developing tools for parallel programming.
Preface
Contents
List of Contributors
I Integrated Development Environments
Sun HPC ClusterTools 7+: A Binary Distribution of Open MPI
Introduction
History
Sun-Driven features
Sun Product Activity
Pros and Cons
Future work and conclusions
References
An Integrated Environment For the Development of Parallel Applications
Introduction
Challenges
Architecture
A Simple Case Study
Future Directions
Conclusion
References
Debugging MPI Programs on the Grid using g-Eclipse
Introduction
Related Work
Overview of g-Eclipse Approach
Remote Builder
Grid Application Launchers
Trace Viewer
Conclusions and Future Work
References
II Parallel Communication and Debugging
Enhanced Memory debugging of MPI-parallel Applications in Open MPI
Introduction
Overview of Memcheck
Design and Implementation
Performance Implications
Detectable error classes and findings in actual applications
Conclusion and future work
References
MPI Correctness Checking with Marmot
Introduction
Related Work
Design of Marmot
Collaboration with other tools
Experiences with real Applications
How to install and use Marmot
Conclusion and Future Work
References
Memory Debugging in Parallel and Distributed Applications
Introduction
The Challenges of Memory Debugging in Parallel Development
Classifying Memory Errors
Detecting Memory Leaks
The MemoryScape Debugger
MemoryScape Architecture
MemoryScape Features
MemoryScape Usage Tips
MemoryScape User Case Study: SIMULIA Uses MemoryScape to Find and Fix Bugs Quickly
Future MemoryScape Product Plans
Conclusion
III Performance Analysis Tools
Sequential Performance Analysis with Callgrind and KCachegrind
Introduction
Callgrind: a Call-Graph building Online Cache Simulator
KCachegrind: Profile Visualization
Usage Example
Future Development
References
Improving Cache Utilization Using Acumem VPE
Introduction
Throughput Study of SPEC CPU 2006
First Generation Performance Tools Based on Hardware Counters
Enter: The New Performance Tool
Utilization Study of the Worst SPEC CPU 2006 Applications
Tuning Example: 179.art
Tuning Example: Revisiting the Throughput Applications
Conclusion
References
Parallel Performance Analysis Tools
The Vampir Performance Analysis Tool-Set
Introduction
Performance Analysis via Profiling or Tracing
Instrumentation with VampirTrace
Run-Time Measurement and Event Recording
Trace Visualization with Vampir and VampirServer
Related Work
Conclusions and Future Work
References
Usage of the SCALASCA toolset for scalable performance analysis of large-scale parallel applications
Introduction
Overview
Instrumentation and Measurement
Trace Analysis
Understanding Performance Behavior
Outlook
References
Evolution of a Parallel Performance System
Introduction
TAU Performance System Design and Architecture
TAU Instrumentation
TAU Measurement
TAU Analysis
Conclusion and Future Work
References
Cray Performance Analysis Tools
Introduction
The Cray Performance Analysis Tools
Conclusions and Future Work
References
Index