Basic Strategies to Improve Performance
Things to check for when parallelized loops do not perform well:
parallel startup costs (incurred at parallel directive)
avoid parallelizing small loops
watch for load imbalances (threads have different amounts of work)
unnecessary synchronization (for example critical or ordered directives)
non-cache friendly programs
memory contention - one thread repeatedly updates a cache line that other threads use for input
false sharing - two or more threads repeatedly update different data that lie in the same cache line
use private variables where possible
change size and arrangement of arrays to avoid cache misses
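The advice to avoid parallelizing small loops can be sketched with the OpenMP if clause, assuming C. The function name scale and the cutoff PARALLEL_THRESHOLD are hypothetical, chosen only for illustration; a suitable cutoff has to be measured on the target machine.

```c
/* Hypothetical cutoff: the right value depends on the machine and loop body. */
#define PARALLEL_THRESHOLD 10000

/* Multiply every element of x by a.
   The if clause keeps the loop serial when n is too small to repay
   the startup cost of a parallel region. */
void scale(double *x, long n, double a)
{
    long i;
    #pragma omp parallel for if(n > PARALLEL_THRESHOLD)
    for (i = 0; i < n; i++) {
        x[i] *= a;
    }
}
```

The result is the same whether or not the loop runs in parallel; only the overhead changes.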
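Load imbalance can often be reduced by changing the loop schedule. The sketch below, again assuming C, uses a hypothetical function triangular_total whose inner loop grows with i, so an even static split would give the threads handling large i far more work. The chunk size of 16 is an arbitrary illustrative choice.

```c
/* Add up the sums 0+1+...+i for i = 0 .. n-1.
   Iteration i costs roughly i operations, so the work is uneven. */
long triangular_total(int n)
{
    long total = 0;
    int i;
    /* dynamic scheduling hands out chunks of 16 iterations as threads
       become free, instead of fixing the split in advance */
    #pragma omp parallel for schedule(dynamic, 16) reduction(+:total)
    for (i = 0; i < n; i++) {
        long row = 0;   /* declared inside the loop, so private to each thread */
        int j;
        for (j = 0; j <= i; j++) {
            row += j;
        }
        total += row;
    }
    return total;
}
```

Dynamic scheduling adds some bookkeeping of its own, so it tends to help only when the imbalance is significant.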
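The points about unnecessary synchronization and private variables can be illustrated together. In this sketch (assuming C; both function names are hypothetical), the first version serializes every update to a shared counter with a critical directive, while the second lets each thread accumulate into its own private copy through a reduction clause.

```c
/* Slow pattern: every increment of the shared counter is serialized,
   and the counter's cache line is passed back and forth between threads. */
long count_positive_critical(const double *x, long n)
{
    long count = 0;
    long i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        if (x[i] > 0.0) {
            #pragma omp critical
            count++;
        }
    }
    return count;
}

/* Usually faster: each thread updates a private copy of count,
   and the copies are added together once when the loop ends. */
long count_positive_reduction(const double *x, long n)
{
    long count = 0;
    long i;
    #pragma omp parallel for reduction(+:count)
    for (i = 0; i < n; i++) {
        if (x[i] > 0.0) {
            count++;
        }
    }
    return count;
}
```

Both functions return the same value; the difference is only in how often threads must wait for one another.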
performance.src last modified Mar 23, 2009
© Dartmouth College