SlideShare a Scribd company logo
Chapter 3: Principles of Scalable
         Performance

•   Performance measures
•   Speedup laws
•   Scalability principles
•   Scaling up vs. scaling down



                EENG-630 Chapter 3   1
Performance metrics and
               measures

•   Parallelism profiles
•   Asymptotic speedup factor
•   System efficiency, utilization and quality
•   Standard performance measures



                 EENG-630 Chapter 3   2
Degree of parallelism
• Reflects the matching of software and
  hardware parallelism
• Discrete time function – measure, for each
  time period, the # of processors used
• Parallelism profile is a plot of the DOP as a
  function of time
• Ideally have unlimited resources

               EENG-630 Chapter 3    3
Factors affecting parallelism
                profiles
•   Algorithm structure
•   Program optimization
•   Resource utilization
•   Run-time conditions
•   Realistically limited by # of available
    processors, memory, and other
    nonprocessor resources

                 EENG-630 Chapter 3    4
Average parallelism variables
• n – homogeneous processors
• m – maximum parallelism in a profile
∀ ∆ - computing capacity of a single
  processor (execution rate only, no
  overhead)
• DOP=i – # processors busy during an
  observation period

              EENG-630 Chapter 3   5
Average parallelism
• Total amount of work performed is
  proportional to the area under the profile
  curve
                     t2
          W = ∆ ∫ DOP(t )dt
                    t1
                    m
          W = ∆ ∑ i ⋅ ti
                    i =1
               EENG-630 Chapter 3    6
Average parallelism

        1 t2
A=
    t 2 − t1 ∫t1 DOP(t )dt

      m
                   m
                          
A =  ∑ i ⋅ ti  /  ∑ ti 
     i =1        i =1 
        EENG-630 Chapter 3   7
Example: parallelism profile and
     average parallelism




          EENG-630 Chapter 3   8
Asymptotic speedup
                                                     m
         m
T (1) = ∑ ti (1) = ∑
                     m
                        Wi
                                       T (1)       ∑W         i

        i =1       i =1 ∆         S∞ =        =    i =1
                                       T (∞ )     m
          m
T ( ∞ ) = ∑ ti ( ∞ ) = ∑
                          m
                            Wi                    ∑W /i
                                                  i =1
                                                          i


          i =1         i =1 i∆        = A in the ideal case
        (response time)



                     EENG-630 Chapter 3      9
Performance measures
• Consider n processors executing m
  programs in various modes
• Want to define the mean performance of
  these multimode computers:
  – Arithmetic mean performance
  – Geometric mean performance
  – Harmonic mean performance


              EENG-630 Chapter 3   10
Arithmetic mean performance
        m
Ra = ∑ Ri / m             Arithmetic mean execution rate
                          (assumes equal weighting)
       i =1
       m
R = ∑ ( f i Ri )
 *                        Weighted arithmetic mean
 a                        execution rate
       i =1

     -proportional to the sum of the inverses of
      execution times

               EENG-630 Chapter 3           11
Geometric mean performance
        m
Rg = ∏ R       1/ m
               i
                        Geometric mean execution rate

       i =1
          m
R = ∏ Ri
 *
 g
                 fi      Weighted geometric mean
                         execution rate
        i =1
  -does not summarize the real performance since it does
   not have the inverse relation with the total time

                EENG-630 Chapter 3         12
Harmonic mean performance

 Ti = 1 / Ri      Mean execution time per instruction
                  For program i


    1 m    1 m 1
Ta = ∑ Ti = ∑          Arithmetic mean execution time
    m i =1 m i =1 Ri   per instruction




                EENG-630 Chapter 3        13
Harmonic mean performance
                            m
Rh =1 / Ta =          m               Harmonic mean execution rate
                    ∑1 / R )
                     (
                     i=1
                                 i




             1
R =
 *
 h    m
                            Weighted harmonic mean execution rate

      ∑( f
      i =1
             i   / Ri )
                      -corresponds to total # of operations divided by
                       the total time (closest to the real performance)
                          EENG-630 Chapter 3        14
Harmonic Mean Speedup
• Ties the various modes of a program to the
  number of processors used
• Program is in mode i if i processors used
• Sequential execution time T1 = 1/R1 = 1
                                 1
          S = T1 / T =
                   *

                         ∑
                             n
                          i =1
                                 f i / Ri

              EENG-630 Chapter 3            15
Harmonic Mean Speedup
     Performance




     EENG-630 Chapter 3   16
Amdahl’s Law
• Assume Ri = i, w = (α, 0, 0, …, 1- α)
• System is either sequential, with probability
  α, or fully parallel with prob. 1- α
                     n
          Sn =
               1 + (n − 1)α
• Implies S → 1/ α as n → ∞
               EENG-630 Chapter 3   17
Speedup Performance




    EENG-630 Chapter 3   18
System Efficiency
• O(n) is the total # of unit operations
• T(n) is execution time in unit time steps
• T(n) < O(n) and T(1) = O(1)

           S ( N ) = T (1) / T (n)

                   S (n) T (1)
          E ( n) =      =
                     n    nT (n)
               EENG-630 Chapter 3    19
Redundancy and Utilization
• Redundancy signifies the extent of
  matching software and hardware parallelism

         R (n) = O(n) / O(1)
• Utilization indicates the percentage of
  resources kept busy during execution
                                  O ( n)
       U ( n) = R ( n) E ( n) =
                                nT (n20
               EENG-630 Chapter 3
                                        )
Quality of Parallelism
• Directly proportional to the speedup and
  efficiency and inversely related to the
  redundancy
• Upper-bounded by the speedup S(n)

                                   3
             S ( n) E ( n)     T (1)
    Q ( n) =               =   2
                 R ( n)      nT (n)O(n)
              EENG-630 Chapter 3       21
Example of Performance
• Given O(1) = T(1) = n3, O(n) = n3 + n2log n,
  and T(n) = 4 n3/(n+3)
  S(n)    =     (n+3)/4
  E(n)    =     (n+3)/(4n)
  R(n)    =     (n + log n)/n
  U(n)    =     (n+3)(n + log n)/(4n2)
  Q(n)    =     (n+3)2 / (16(n + log n))


               EENG-630 Chapter 3       22
Standard Performance Measures
• MIPS and Mflops
  – Depends on instruction set and program used
• Dhrystone results
  – Measure of integer performance
• Whestone results
  – Measure of floating-point performance
• TPS and KLIPS ratings
  – Transaction performance and reasoning power

               EENG-630 Chapter 3     23
Parallel Processing Applications
•   Drug design
•   High-speed civil transport
•   Ocean modeling
•   Ozone depletion research
•   Air pollution
•   Digital anatomy


                EENG-630 Chapter 3   24
Application Models for Parallel
           Computers
• Fixed-load model
  – Constant workload
• Fixed-time model
  – Demands constant program execution time
• Fixed-memory model
  – Limited by the memory bound



              EENG-630 Chapter 3    25
Algorithm Characteristics
• Deterministic vs. nondeterministic
• Computational granularity
• Parallelism profile
• Communication patterns and
  synchronization requirements
• Uniformity of operations
• Memory requirement and data structures
              EENG-630 Chapter 3   26
Isoefficiency Concept
• Relates workload to machine size n needed
  to maintain a fixed efficiency
                                   workload
                 w( s )
        E=                             overhead
           w( s ) + h( s, n)

• The smaller the power of n, the more
  scalable the system

              EENG-630 Chapter 3      27
Isoefficiency Function
• To maintain a constant E, w(s) should grow
  in proportion to h(s,n)

                   E
         w( s ) =      × h( s, n)
                  1− E
• C = E/(1-E) is constant for fixed E

        f E ( n) = C × h( s, n)
               EENG-630 Chapter 3   28
Speedup Performance Laws
• Amdahl’s law
  – for fixed workload or fixed problem size
• Gustafson’s law
  – for scaled problems (problem size increases
    with increased machine size)
• Speedup model
  – for scaled problems bounded by memory
    capacity

               EENG-630 Chapter 3      29
Amdahl’s Law
• As # of processors increase, the fixed load
  is distributed to more processors
• Minimal turnaround time is primary goal
• Speedup factor is upper-bounded by a
  sequential bottleneck
• Two cases:
   DOP < n
   DOP ≥ n

               EENG-630 Chapter 3   30
Fixed Load Speedup Factor
• Case 1: DOP > n                  • Case 2: DOP < n
            Wi   i 
  ti (i ) =      n                                        Wi
            i∆                     t i ( n ) = t i (∞ ) =
                                                            i∆
            m
               Wi   i
 T ( n) = ∑         n
          i =1 i∆                     m

                        T (1)           ∑W     i
                   Sn =        =        i =i

                        T ( n)      m
                                        Wi     i 
                                   ∑i
                                   i =1
                                               n
                                                
                        EENG-630 Chapter 3            31
Gustafson’s Law
• With Amdahl’s Law, the workload cannot
  scale to match the available computing
  power as n increases
• Gustafson’s Law fixes the time, allowing
  the problem size to increase with higher n
• Not saving time, but increasing accuracy


              EENG-630 Chapter 3   32
Fixed-time Speedup
• As the machine size increases, have
  increased workload and new profile
• In general, Wi’ > Wi for 2 ≤ i ≤ m’ and   W1’
  = W1
• Assume T(1) = T’(n)



               EENG-630 Chapter 3   33
Gustafson’s Scaled Speedup
                             '
 m
            Wi  m
                                 i 
 ∑Wi = ∑ i
 i =1  i =1
                                 n
                                  
                                      + Q ( n)
         m'

         ∑W         i
                        '
                              W1 + nWn
  S ='
     n
         i =1
                            =
           m
                              W1 + Wn
         ∑W
         i =1
                    i

          EENG-630 Chapter 3               34
Memory Bounded Speedup
             Model
• Idea is to solve largest problem, limited by
  memory space
• Results in a scaled workload and higher accuracy
• Each node can handle only a small subproblem for
  distributed memory
• Using a large # of nodes collectively increases the
  memory capacity proportionally


                 EENG-630 Chapter 3      35
Fixed-Memory Speedup
• Let M be the memory requirement and W
  the computational workload: W = g(M)
• g*(nM)=G(n)g(M)=G(n)Wn
                   m*

               ∑W         i
                              *
                                        W1 + G (n)Wn
 S =
  *
  n
                   i =1
                                     =
       m*
            Wi *    i                W1 + G (n)Wn / n
       ∑ i
       i =1
                     n  + Q ( n)
                     
                     EENG-630 Chapter 3       36
Relating Speedup Models
• G(n) reflects the increase in workload as
  memory increases n times
• G(n) = 1 : Fixed problem size (Amdahl)
• G(n) = n : Workload increases n times when
  memory increased n times (Gustafson)
• G(n) > n : workload increases faster than
  memory than the memory requirement

              EENG-630 Chapter 3   37
Scalability Metrics
• Machine size (n) : # of processors
• Clock rate (f) : determines basic m/c cycle
• Problem size (s) : amount of computational
  workload. Directly proportional to T(s,1).
• CPU time (T(s,n)) : actual CPU time for
  execution
• I/O demand (d) : demand in moving the
  program, data, and results for a given run
              EENG-630 Chapter 3   38
Scalability Metrics
• Memory capacity (m) : max # of memory words
  demanded
• Communication overhead (h(s,n)) : amount of
  time for interprocessor communication,
  synchronization, etc.
• Computer cost (c) : total cost of h/w and s/w
  resources required
• Programming overhead (p) : development
  overhead associated with an application program

                EENG-630 Chapter 3     39
Speedup and Efficiency
• The problem size is the independent
  parameter
                         T ( s,1)
      S ( s, n) =
                  T ( s, n) + h( s, n)
                       S ( s, n)
           E ( s, n) =
                           n
              EENG-630 Chapter 3    40
Scalable Systems
• Ideally, if E(s,n)=1 for all algorithms and
  any s and n, system is scalable
• Practically, consider the scalability of a m/c

                     S ( s, n) TI ( s, n)
         Φ ( s, n) =            =
                     S I ( s, n) T ( s, n)


                EENG-630 Chapter 3     41
Ad

More Related Content

What's hot (20)

distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
Ashish Kumar
 
Csma cd and csma-ca
Csma cd and csma-caCsma cd and csma-ca
Csma cd and csma-ca
kazim Hussain
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelism
prashantdahake
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentation
VIKAS SINGH BHADOURIA
 
Direct Memory Access(DMA)
Direct Memory Access(DMA)Direct Memory Access(DMA)
Direct Memory Access(DMA)
Page Maker
 
Message passing in Distributed Computing Systems
Message passing in Distributed Computing SystemsMessage passing in Distributed Computing Systems
Message passing in Distributed Computing Systems
Alagappa Govt Arts College, Karaikudi
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
Priyam Pandey
 
Multi Processors And Multi Computers
 Multi Processors And Multi Computers Multi Processors And Multi Computers
Multi Processors And Multi Computers
Nemwos
 
System interconnect architecture
System interconnect architectureSystem interconnect architecture
System interconnect architecture
Gagan Kumar
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
vampugani
 
Array Processor
Array ProcessorArray Processor
Array Processor
Anshuman Biswal
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor scheduling
Vaibhav Khanna
 
Parallel processing
Parallel processingParallel processing
Parallel processing
Syed Zaid Irshad
 
2.communcation in distributed system
2.communcation in distributed system2.communcation in distributed system
2.communcation in distributed system
Gd Goenka University
 
Chpt7
Chpt7Chpt7
Chpt7
RohitKeshari
 
Basic Computer Organization and Design
Basic  Computer  Organization  and  DesignBasic  Computer  Organization  and  Design
Basic Computer Organization and Design
Aksum Institute of Technology(AIT, @Letsgo)
 
Introduction to MPI
Introduction to MPI Introduction to MPI
Introduction to MPI
Hanif Durad
 
Distributed operating system(os)
Distributed operating system(os)Distributed operating system(os)
Distributed operating system(os)
Dinesh Modak
 
Communication primitives
Communication primitivesCommunication primitives
Communication primitives
Student
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed Computing
Sayed Chhattan Shah
 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
Ashish Kumar
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelism
prashantdahake
 
Direct Memory Access(DMA)
Direct Memory Access(DMA)Direct Memory Access(DMA)
Direct Memory Access(DMA)
Page Maker
 
Multi Processors And Multi Computers
 Multi Processors And Multi Computers Multi Processors And Multi Computers
Multi Processors And Multi Computers
Nemwos
 
System interconnect architecture
System interconnect architectureSystem interconnect architecture
System interconnect architecture
Gagan Kumar
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
vampugani
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor scheduling
Vaibhav Khanna
 
2.communcation in distributed system
2.communcation in distributed system2.communcation in distributed system
2.communcation in distributed system
Gd Goenka University
 
Introduction to MPI
Introduction to MPI Introduction to MPI
Introduction to MPI
Hanif Durad
 
Distributed operating system(os)
Distributed operating system(os)Distributed operating system(os)
Distributed operating system(os)
Dinesh Modak
 
Communication primitives
Communication primitivesCommunication primitives
Communication primitives
Student
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed Computing
Sayed Chhattan Shah
 

Viewers also liked (20)

Parallel computing chapter 2
Parallel computing chapter 2Parallel computing chapter 2
Parallel computing chapter 2
Md. Mahedi Mahfuj
 
Parallel computing(2)
Parallel computing(2)Parallel computing(2)
Parallel computing(2)
Md. Mahedi Mahfuj
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwang
Sumedha
 
Bengali optical character recognition system
Bengali optical character recognition systemBengali optical character recognition system
Bengali optical character recognition system
Md. Mahedi Mahfuj
 
Observer pattern
Observer patternObserver pattern
Observer pattern
Md. Mahedi Mahfuj
 
Mediator pattern
Mediator patternMediator pattern
Mediator pattern
Md. Mahedi Mahfuj
 
Clustering manual
Clustering manualClustering manual
Clustering manual
Md. Mahedi Mahfuj
 
Parallel searching
Parallel searchingParallel searching
Parallel searching
Md. Mahedi Mahfuj
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
Md. Mahedi Mahfuj
 
Program and Network Properties
Program and Network PropertiesProgram and Network Properties
Program and Network Properties
Beekrum Duwal
 
Statistical Machine Translation for Language Localisation
Statistical Machine Translation for Language LocalisationStatistical Machine Translation for Language Localisation
Statistical Machine Translation for Language Localisation
Achchuthan Yogarajah
 
Brown Junior Public School - EQAO School Report
Brown Junior Public School - EQAO School ReportBrown Junior Public School - EQAO School Report
Brown Junior Public School - EQAO School Report
EvanSage
 
Aca2 08 new
Aca2 08 newAca2 08 new
Aca2 08 new
Sumit Mittu
 
Aca2 01 new
Aca2 01 newAca2 01 new
Aca2 01 new
Sumit Mittu
 
Map reduce
Map reduceMap reduce
Map reduce
Md. Mahedi Mahfuj
 
R with excel
R with excelR with excel
R with excel
Md. Mahedi Mahfuj
 
Apache hadoop & map reduce
Apache hadoop & map reduceApache hadoop & map reduce
Apache hadoop & map reduce
Md. Mahedi Mahfuj
 
Strategy pattern.pdf
Strategy pattern.pdfStrategy pattern.pdf
Strategy pattern.pdf
Md. Mahedi Mahfuj
 
Interviews 1
Interviews 1Interviews 1
Interviews 1
Mayela TD
 
Twitter
TwitterTwitter
Twitter
Alex J. Martin
 
Ad

Similar to Parallel computing chapter 3 (20)

CS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsCS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of Algorithms
Krishnan MuthuManickam
 
Unit 1
Unit 1Unit 1
Unit 1
Abha Damani
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
sumitbardhan
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
Akshay Dagar
 
Big Oh.ppt
Big Oh.pptBig Oh.ppt
Big Oh.ppt
Senthil Vit
 
algorithmAnalysisanddatasteucturesof.ppt
algorithmAnalysisanddatasteucturesof.pptalgorithmAnalysisanddatasteucturesof.ppt
algorithmAnalysisanddatasteucturesof.ppt
maliozer
 
Introduction of Analysis of Algorithm , asymptotic notations
Introduction of Analysis of Algorithm , asymptotic notationsIntroduction of Analysis of Algorithm , asymptotic notations
Introduction of Analysis of Algorithm , asymptotic notations
namrabsit
 
Analysis of Algorithms_Under Graduate Class Slide
Analysis of Algorithms_Under Graduate Class SlideAnalysis of Algorithms_Under Graduate Class Slide
Analysis of Algorithms_Under Graduate Class Slide
HanumatGSastry
 
Unit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdf
Unit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdfUnit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdf
Unit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdf
saiscount01
 
Data Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsData Structures- Part2 analysis tools
Data Structures- Part2 analysis tools
Abdullah Al-hazmy
 
Unit ii algorithm
Unit   ii algorithmUnit   ii algorithm
Unit ii algorithm
Tribhuvan University
 
Intro to super. advance algorithm..pptx
Intro to super.   advance algorithm..pptxIntro to super.   advance algorithm..pptx
Intro to super. advance algorithm..pptx
ManishBaranwal10
 
Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02
mansab MIRZA
 
Lecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & AlgorithmsLecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & Algorithms
haseebanjum2611
 
algorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptxalgorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptx
ChSreenivasuluReddy
 
BCSE202Lkkljkljkbbbnbnghghjghghghghghghghgh
BCSE202LkkljkljkbbbnbnghghjghghghghghghghghBCSE202Lkkljkljkbbbnbnghghjghghghghghghghgh
BCSE202Lkkljkljkbbbnbnghghjghghghghghghghgh
shivapatil54
 
jn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdfjn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdf
VinayNassa3
 
Iare ds ppt_3
Iare ds ppt_3Iare ds ppt_3
Iare ds ppt_3
AlugatiRajitha
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdf
abay golla
 
Algorithms
Algorithms Algorithms
Algorithms
yashodhaHR2
 
CS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of AlgorithmsCS8451 - Design and Analysis of Algorithms
CS8451 - Design and Analysis of Algorithms
Krishnan MuthuManickam
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
sumitbardhan
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
Akshay Dagar
 
algorithmAnalysisanddatasteucturesof.ppt
algorithmAnalysisanddatasteucturesof.pptalgorithmAnalysisanddatasteucturesof.ppt
algorithmAnalysisanddatasteucturesof.ppt
maliozer
 
Introduction of Analysis of Algorithm , asymptotic notations
Introduction of Analysis of Algorithm , asymptotic notationsIntroduction of Analysis of Algorithm , asymptotic notations
Introduction of Analysis of Algorithm , asymptotic notations
namrabsit
 
Analysis of Algorithms_Under Graduate Class Slide
Analysis of Algorithms_Under Graduate Class SlideAnalysis of Algorithms_Under Graduate Class Slide
Analysis of Algorithms_Under Graduate Class Slide
HanumatGSastry
 
Unit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdf
Unit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdfUnit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdf
Unit 1_final DESIGN AND ANALYSIS OF ALGORITHM.pdf
saiscount01
 
Data Structures- Part2 analysis tools
Data Structures- Part2 analysis toolsData Structures- Part2 analysis tools
Data Structures- Part2 analysis tools
Abdullah Al-hazmy
 
Intro to super. advance algorithm..pptx
Intro to super.   advance algorithm..pptxIntro to super.   advance algorithm..pptx
Intro to super. advance algorithm..pptx
ManishBaranwal10
 
Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02Asymptotics 140510003721-phpapp02
Asymptotics 140510003721-phpapp02
mansab MIRZA
 
Lecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & AlgorithmsLecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & Algorithms
haseebanjum2611
 
algorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptxalgorithmanalysis and effciency.pptx
algorithmanalysis and effciency.pptx
ChSreenivasuluReddy
 
BCSE202Lkkljkljkbbbnbnghghjghghghghghghghgh
BCSE202LkkljkljkbbbnbnghghjghghghghghghghghBCSE202Lkkljkljkbbbnbnghghjghghghghghghghgh
BCSE202Lkkljkljkbbbnbnghghjghghghghghghghgh
shivapatil54
 
jn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdfjn;lm;lkm';m';;lmppt of data structure.pdf
jn;lm;lkm';m';;lmppt of data structure.pdf
VinayNassa3
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdf
abay golla
 
Ad

More from Md. Mahedi Mahfuj (19)

Parallel computing(1)
Parallel computing(1)Parallel computing(1)
Parallel computing(1)
Md. Mahedi Mahfuj
 
Message passing interface
Message passing interfaceMessage passing interface
Message passing interface
Md. Mahedi Mahfuj
 
Matrix multiplication graph
Matrix multiplication graphMatrix multiplication graph
Matrix multiplication graph
Md. Mahedi Mahfuj
 
Strategy pattern
Strategy patternStrategy pattern
Strategy pattern
Md. Mahedi Mahfuj
 
Database management system chapter16
Database management system chapter16Database management system chapter16
Database management system chapter16
Md. Mahedi Mahfuj
 
Database management system chapter15
Database management system chapter15Database management system chapter15
Database management system chapter15
Md. Mahedi Mahfuj
 
Database management system chapter12
Database management system chapter12Database management system chapter12
Database management system chapter12
Md. Mahedi Mahfuj
 
Strategies in job search process
Strategies in job search processStrategies in job search process
Strategies in job search process
Md. Mahedi Mahfuj
 
Report writing(short)
Report writing(short)Report writing(short)
Report writing(short)
Md. Mahedi Mahfuj
 
Report writing(long)
Report writing(long)Report writing(long)
Report writing(long)
Md. Mahedi Mahfuj
 
Job search_resume
Job search_resumeJob search_resume
Job search_resume
Md. Mahedi Mahfuj
 
Job search_interview
Job search_interviewJob search_interview
Job search_interview
Md. Mahedi Mahfuj
 
Basic and logical implementation of r language
Basic and logical implementation of r language Basic and logical implementation of r language
Basic and logical implementation of r language
Md. Mahedi Mahfuj
 
R language
R languageR language
R language
Md. Mahedi Mahfuj
 
Big data
Big dataBig data
Big data
Md. Mahedi Mahfuj
 
Chatbot Artificial Intelligence
Chatbot Artificial IntelligenceChatbot Artificial Intelligence
Chatbot Artificial Intelligence
Md. Mahedi Mahfuj
 
Cloud testing v1
Cloud testing v1Cloud testing v1
Cloud testing v1
Md. Mahedi Mahfuj
 
Distributed deadlock
Distributed deadlockDistributed deadlock
Distributed deadlock
Md. Mahedi Mahfuj
 
Paper review
Paper review Paper review
Paper review
Md. Mahedi Mahfuj
 

Recently uploaded (20)

Building a research repository that works by Clare Cady
Building a research repository that works by Clare CadyBuilding a research repository that works by Clare Cady
Building a research repository that works by Clare Cady
UXPA Boston
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
Secondary Storage for a microcontroller system
Secondary Storage for a microcontroller systemSecondary Storage for a microcontroller system
Secondary Storage for a microcontroller system
fizarcse
 
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Alan Dix
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
Top Hyper-Casual Game Studio Services
Top  Hyper-Casual  Game  Studio ServicesTop  Hyper-Casual  Game  Studio Services
Top Hyper-Casual Game Studio Services
Nova Carter
 
DNF 2.0 Implementations Challenges in Nepal
DNF 2.0 Implementations Challenges in NepalDNF 2.0 Implementations Challenges in Nepal
DNF 2.0 Implementations Challenges in Nepal
ICT Frame Magazine Pvt. Ltd.
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Developing Product-Behavior Fit: UX Research in Product Development by Krysta...
Developing Product-Behavior Fit: UX Research in Product Development by Krysta...Developing Product-Behavior Fit: UX Research in Product Development by Krysta...
Developing Product-Behavior Fit: UX Research in Product Development by Krysta...
UXPA Boston
 
Building Connected Agents: An Overview of Google's ADK and A2A Protocol
Building Connected Agents:  An Overview of Google's ADK and A2A ProtocolBuilding Connected Agents:  An Overview of Google's ADK and A2A Protocol
Building Connected Agents: An Overview of Google's ADK and A2A Protocol
Suresh Peiris
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Computer Systems Quiz Presentation in Purple Bold Style (4).pdf
Computer Systems Quiz Presentation in Purple Bold Style (4).pdfComputer Systems Quiz Presentation in Purple Bold Style (4).pdf
Computer Systems Quiz Presentation in Purple Bold Style (4).pdf
fizarcse
 
AI and Gender: Decoding the Sociological Impact
AI and Gender: Decoding the Sociological ImpactAI and Gender: Decoding the Sociological Impact
AI and Gender: Decoding the Sociological Impact
SaikatBasu37
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
Middle East and Africa Cybersecurity Market Trends and Growth Analysis
Middle East and Africa Cybersecurity Market Trends and Growth Analysis Middle East and Africa Cybersecurity Market Trends and Growth Analysis
Middle East and Africa Cybersecurity Market Trends and Growth Analysis
Preeti Jha
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
React Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for SuccessReact Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for Success
Amelia Swank
 
Right to liberty and security of a person.pdf
Right to liberty and security of a person.pdfRight to liberty and security of a person.pdf
Right to liberty and security of a person.pdf
danielbraico197
 
accessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electricaccessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electric
UXPA Boston
 
Building a research repository that works by Clare Cady
Building a research repository that works by Clare CadyBuilding a research repository that works by Clare Cady
Building a research repository that works by Clare Cady
UXPA Boston
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
Secondary Storage for a microcontroller system
Secondary Storage for a microcontroller systemSecondary Storage for a microcontroller system
Secondary Storage for a microcontroller system
fizarcse
 
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Who's choice? Making decisions with and about Artificial Intelligence, Keele ...
Alan Dix
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
Top Hyper-Casual Game Studio Services
Top  Hyper-Casual  Game  Studio ServicesTop  Hyper-Casual  Game  Studio Services
Top Hyper-Casual Game Studio Services
Nova Carter
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Developing Product-Behavior Fit: UX Research in Product Development by Krysta...
Developing Product-Behavior Fit: UX Research in Product Development by Krysta...Developing Product-Behavior Fit: UX Research in Product Development by Krysta...
Developing Product-Behavior Fit: UX Research in Product Development by Krysta...
UXPA Boston
 
Building Connected Agents: An Overview of Google's ADK and A2A Protocol
Building Connected Agents:  An Overview of Google's ADK and A2A ProtocolBuilding Connected Agents:  An Overview of Google's ADK and A2A Protocol
Building Connected Agents: An Overview of Google's ADK and A2A Protocol
Suresh Peiris
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Computer Systems Quiz Presentation in Purple Bold Style (4).pdf
Computer Systems Quiz Presentation in Purple Bold Style (4).pdfComputer Systems Quiz Presentation in Purple Bold Style (4).pdf
Computer Systems Quiz Presentation in Purple Bold Style (4).pdf
fizarcse
 
AI and Gender: Decoding the Sociological Impact
AI and Gender: Decoding the Sociological ImpactAI and Gender: Decoding the Sociological Impact
AI and Gender: Decoding the Sociological Impact
SaikatBasu37
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptxDevOpsDays SLC - Platform Engineers are Product Managers.pptx
DevOpsDays SLC - Platform Engineers are Product Managers.pptx
Justin Reock
 
Middle East and Africa Cybersecurity Market Trends and Growth Analysis
Middle East and Africa Cybersecurity Market Trends and Growth Analysis Middle East and Africa Cybersecurity Market Trends and Growth Analysis
Middle East and Africa Cybersecurity Market Trends and Growth Analysis
Preeti Jha
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
React Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for SuccessReact Native for Business Solutions: Building Scalable Apps for Success
React Native for Business Solutions: Building Scalable Apps for Success
Amelia Swank
 
Right to liberty and security of a person.pdf
Right to liberty and security of a person.pdfRight to liberty and security of a person.pdf
Right to liberty and security of a person.pdf
danielbraico197
 
accessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electricaccessibility Considerations during Design by Rick Blair, Schneider Electric
accessibility Considerations during Design by Rick Blair, Schneider Electric
UXPA Boston
 

Parallel computing chapter 3

  • 1. Chapter 3: Principles of Scalable Performance • Performance measures • Speedup laws • Scalability principles • Scaling up vs. scaling down EENG-630 Chapter 3 1
  • 2. Performance metrics and measures • Parallelism profiles • Asymptotic speedup factor • System efficiency, utilization and quality • Standard performance measures EENG-630 Chapter 3 2
  • 3. Degree of parallelism • Reflects the matching of software and hardware parallelism • Discrete time function – measure, for each time period, the # of processors used • Parallelism profile is a plot of the DOP as a function of time • Ideally have unlimited resources EENG-630 Chapter 3 3
  • 4. Factors affecting parallelism profiles • Algorithm structure • Program optimization • Resource utilization • Run-time conditions • Realistically limited by # of available processors, memory, and other nonprocessor resources EENG-630 Chapter 3 4
  • 5. Average parallelism variables • n – homogeneous processors • m – maximum parallelism in a profile ∀ ∆ - computing capacity of a single processor (execution rate only, no overhead) • DOP=i – # processors busy during an observation period EENG-630 Chapter 3 5
  • 6. Average parallelism • Total amount of work performed is proportional to the area under the profile curve t2 W = ∆ ∫ DOP(t )dt t1 m W = ∆ ∑ i ⋅ ti i =1 EENG-630 Chapter 3 6
  • 7. Average parallelism 1 t2 A= t 2 − t1 ∫t1 DOP(t )dt  m   m  A =  ∑ i ⋅ ti  /  ∑ ti   i =1   i =1  EENG-630 Chapter 3 7
  • 8. Example: parallelism profile and average parallelism EENG-630 Chapter 3 8
  • 9. Asymptotic speedup m m T (1) = ∑ ti (1) = ∑ m Wi T (1) ∑W i i =1 i =1 ∆ S∞ = = i =1 T (∞ ) m m T ( ∞ ) = ∑ ti ( ∞ ) = ∑ m Wi ∑W /i i =1 i i =1 i =1 i∆ = A in the ideal case (response time) EENG-630 Chapter 3 9
  • 10. Performance measures • Consider n processors executing m programs in various modes • Want to define the mean performance of these multimode computers: – Arithmetic mean performance – Geometric mean performance – Harmonic mean performance EENG-630 Chapter 3 10
  • 11. Arithmetic mean performance m Ra = ∑ Ri / m Arithmetic mean execution rate (assumes equal weighting) i =1 m R = ∑ ( f i Ri ) * Weighted arithmetic mean a execution rate i =1 -proportional to the sum of the inverses of execution times EENG-630 Chapter 3 11
  • 12. Geometric mean performance m Rg = ∏ R 1/ m i Geometric mean execution rate i =1 m R = ∏ Ri * g fi Weighted geometric mean execution rate i =1 -does not summarize the real performance since it does not have the inverse relation with the total time EENG-630 Chapter 3 12
  • 13. Harmonic mean performance Ti = 1 / Ri Mean execution time per instruction For program i 1 m 1 m 1 Ta = ∑ Ti = ∑ Arithmetic mean execution time m i =1 m i =1 Ri per instruction EENG-630 Chapter 3 13
  • 14. Harmonic mean performance m Rh =1 / Ta = m Harmonic mean execution rate ∑1 / R ) ( i=1 i 1 R = * h m Weighted harmonic mean execution rate ∑( f i =1 i / Ri ) -corresponds to total # of operations divided by the total time (closest to the real performance) EENG-630 Chapter 3 14
  • 15. Harmonic Mean Speedup • Ties the various modes of a program to the number of processors used • Program is in mode i if i processors used • Sequential execution time T1 = 1/R1 = 1 1 S = T1 / T = * ∑ n i =1 f i / Ri EENG-630 Chapter 3 15
  • 16. Harmonic Mean Speedup Performance EENG-630 Chapter 3 16
  • 17. Amdahl’s Law • Assume Ri = i, w = (α, 0, 0, …, 1- α) • System is either sequential, with probability α, or fully parallel with prob. 1- α n Sn = 1 + (n − 1)α • Implies S → 1/ α as n → ∞ EENG-630 Chapter 3 17
  • 18. Speedup Performance EENG-630 Chapter 3 18
  • 19. System Efficiency • O(n) is the total # of unit operations • T(n) is execution time in unit time steps • T(n) < O(n) and T(1) = O(1) S ( N ) = T (1) / T (n) S (n) T (1) E ( n) = = n nT (n) EENG-630 Chapter 3 19
  • 20. Redundancy and Utilization • Redundancy signifies the extent of matching software and hardware parallelism R (n) = O(n) / O(1) • Utilization indicates the percentage of resources kept busy during execution O ( n) U ( n) = R ( n) E ( n) = nT (n20 EENG-630 Chapter 3 )
  • 21. Quality of Parallelism • Directly proportional to the speedup and efficiency and inversely related to the redundancy • Upper-bounded by the speedup S(n) 3 S ( n) E ( n) T (1) Q ( n) = = 2 R ( n) nT (n)O(n) EENG-630 Chapter 3 21
  • 22. Example of Performance • Given O(1) = T(1) = n3, O(n) = n3 + n2log n, and T(n) = 4 n3/(n+3) S(n) = (n+3)/4 E(n) = (n+3)/(4n) R(n) = (n + log n)/n U(n) = (n+3)(n + log n)/(4n2) Q(n) = (n+3)2 / (16(n + log n)) EENG-630 Chapter 3 22
  • 23. Standard Performance Measures • MIPS and Mflops – Depends on instruction set and program used • Dhrystone results – Measure of integer performance • Whestone results – Measure of floating-point performance • TPS and KLIPS ratings – Transaction performance and reasoning power EENG-630 Chapter 3 23
  • 24. Parallel Processing Applications • Drug design • High-speed civil transport • Ocean modeling • Ozone depletion research • Air pollution • Digital anatomy EENG-630 Chapter 3 24
  • 25. Application Models for Parallel Computers • Fixed-load model – Constant workload • Fixed-time model – Demands constant program execution time • Fixed-memory model – Limited by the memory bound EENG-630 Chapter 3 25
  • 26. Algorithm Characteristics • Deterministic vs. nondeterministic • Computational granularity • Parallelism profile • Communication patterns and synchronization requirements • Uniformity of operations • Memory requirement and data structures EENG-630 Chapter 3 26
  • 27. Isoefficiency Concept • Relates workload to machine size n needed to maintain a fixed efficiency workload w( s ) E= overhead w( s ) + h( s, n) • The smaller the power of n, the more scalable the system EENG-630 Chapter 3 27
  • 28. Isoefficiency Function • To maintain a constant E, w(s) should grow in proportion to h(s,n) E w( s ) = × h( s, n) 1− E • C = E/(1-E) is constant for fixed E f E ( n) = C × h( s, n) EENG-630 Chapter 3 28
  • 29. Speedup Performance Laws • Amdahl’s law – for fixed workload or fixed problem size • Gustafson’s law – for scaled problems (problem size increases with increased machine size) • Speedup model – for scaled problems bounded by memory capacity EENG-630 Chapter 3 29
  • 30. Amdahl’s Law • As # of processors increase, the fixed load is distributed to more processors • Minimal turnaround time is primary goal • Speedup factor is upper-bounded by a sequential bottleneck • Two cases:  DOP < n  DOP ≥ n EENG-630 Chapter 3 30
  • 31. Fixed Load Speedup Factor • Case 1: DOP > n • Case 2: DOP < n Wi i  ti (i ) = n Wi i∆   t i ( n ) = t i (∞ ) = i∆ m Wi i T ( n) = ∑ n i =1 i∆   m T (1) ∑W i Sn = = i =i T ( n) m Wi i  ∑i i =1 n   EENG-630 Chapter 3 31
  • 32. Gustafson’s Law • With Amdahl’s Law, the workload cannot scale to match the available computing power as n increases • Gustafson’s Law fixes the time, allowing the problem size to increase with higher n • Not saving time, but increasing accuracy EENG-630 Chapter 3 32
  • 33. Fixed-time Speedup • As the machine size increases, have increased workload and new profile • In general, Wi’ > Wi for 2 ≤ i ≤ m’ and W1’ = W1 • Assume T(1) = T’(n) EENG-630 Chapter 3 33
  • 34. Gustafson’s Scaled Speedup ' m Wi m i  ∑Wi = ∑ i i =1 i =1 n   + Q ( n) m' ∑W i ' W1 + nWn S =' n i =1 = m W1 + Wn ∑W i =1 i EENG-630 Chapter 3 34
  • 35. Memory Bounded Speedup Model • Idea is to solve largest problem, limited by memory space • Results in a scaled workload and higher accuracy • Each node can handle only a small subproblem for distributed memory • Using a large # of nodes collectively increases the memory capacity proportionally EENG-630 Chapter 3 35
  • 36. Fixed-Memory Speedup • Let M be the memory requirement and W the computational workload: W = g(M) • g*(nM)=G(n)g(M)=G(n)Wn m* ∑W i * W1 + G (n)Wn S = * n i =1 = m* Wi * i  W1 + G (n)Wn / n ∑ i i =1  n  + Q ( n)   EENG-630 Chapter 3 36
  • 37. Relating Speedup Models • G(n) reflects the increase in workload as memory increases n times • G(n) = 1 : Fixed problem size (Amdahl) • G(n) = n : Workload increases n times when memory increased n times (Gustafson) • G(n) > n : workload increases faster than memory than the memory requirement EENG-630 Chapter 3 37
  • 38. Scalability Metrics • Machine size (n) : # of processors • Clock rate (f) : determines basic m/c cycle • Problem size (s) : amount of computational workload. Directly proportional to T(s,1). • CPU time (T(s,n)) : actual CPU time for execution • I/O demand (d) : demand in moving the program, data, and results for a given run EENG-630 Chapter 3 38
  • 39. Scalability Metrics • Memory capacity (m) : max # of memory words demanded • Communication overhead (h(s,n)) : amount of time for interprocessor communication, synchronization, etc. • Computer cost (c) : total cost of h/w and s/w resources required • Programming overhead (p) : development overhead associated with an application program EENG-630 Chapter 3 39
  • 40. Speedup and Efficiency • The problem size is the independent parameter T ( s,1) S ( s, n) = T ( s, n) + h( s, n) S ( s, n) E ( s, n) = n EENG-630 Chapter 3 40
  • 41. Scalable Systems • Ideally, if E(s,n)=1 for all algorithms and any s and n, system is scalable • Practically, consider the scalability of a m/c S ( s, n) TI ( s, n) Φ ( s, n) = = S I ( s, n) T ( s, n) EENG-630 Chapter 3 41
  翻译: