ECE 320 - Performance Optimization and Amdahl's Law: Solutions to Assignment 1, Assignments of Computer Architecture and Organization

University of Waterloo Computer Architecture and Organization

Detailed solutions to Assignment 1 in ECE 320, focusing on performance optimization and Amdahl's Law. The solutions apply Amdahl's Law to calculate speedup, show how optimizing different portions of a program affects overall system speedup, and illustrate why common-case improvements matter most. They also examine the relationship between clock rate, MIPS, and execution time, including the conditions under which a higher MIPS rating implies faster execution.

Typology: Assignments

2024/2025

Uploaded on 02/10/2025

seannn 🇨🇦


Assignment 1 - Performance - Solutions

ECE 320

Question 1

(Question 1.17 from Hennessy and Patterson 5th Edition)

Your company has just bought a new Intel Core i5 dual-core processor, and you have been tasked with optimizing your software for this processor. You will run two applications on this processor, Application A and Application B. When run together, Application A requires 80% of the resources while Application B requires only 20%. 40% of Application A and 99% of Application B can be optimized. When a portion of a program is optimized, that portion is sped up by a factor of two. The application details are summarized in the table below.

| | Application A | Application B |
|---|---|---|
| Resource Requirement | 80% | 20% |
| Optimizable Portion | 40% | 99% |

(a) Assume only Application A is optimized. How much speedup would you achieve with Application A if it is run in isolation? How much overall system speedup would you observe?

(b) Assume only Application B is optimized. How much speedup would you achieve with Application B if it is run in isolation? How much overall system speedup would you observe?

(c) What observations can you make between the results from (a) and (b)?

Solution

Recall Amdahl’s Law, which states

$$S = \frac{1}{(1 - f) + \dfrac{f}{s}}$$

where

• $S$ represents the speedup,
• $f$ represents the fraction of the program that is optimized, and
• $s$ represents the speedup of the optimized portion.

(a) The speedup of Application A in isolation, $S_A$, is

$$S_A = \frac{1}{(1 - 0.4) + \dfrac{0.4}{2}} = 1.25$$


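The total speedup of the overall system, $S_{\text{tot-A}}$, is

$$S_{\text{tot-A}} = \frac{1}{(1 - 0.8) + \dfrac{0.8}{S_A}} = \frac{1}{0.2 + \dfrac{0.8}{1.25}} = 1.19$$

Note that $S_A$ is used as the speedup of the optimized portion, $s$, of the overall system, and Application A's 80% resource share is used as $f$.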

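(b) The same approach can be used as in (a), starting with the speedup of Application B in isolation.

$$S_B = \frac{1}{(1 - 0.99) + \dfrac{0.99}{2}} = 1.98$$

The total speedup is

$$S_{\text{tot-B}} = \frac{1}{(1 - 0.2) + \dfrac{0.2}{S_B}} = \frac{1}{0.8 + \dfrac{0.2}{1.98}} = 1.11$$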

(c) $S_{\text{tot-A}} = 1.19$ and $S_{\text{tot-B}} = 1.11$. Although 99% of Application B can be sped up, compared with only 40% of Application A, the overall speedup from optimizing B is lower because B accounts for only 20% of the system's resources. This demonstrates that, under Amdahl’s Law, the portion with the largest share $f$ of the overall execution should be chosen for optimization, favouring common-case improvements.
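The four speedups in Question 1 can be checked numerically. The sketch below is illustrative only; the helper name `amdahl_speedup` and the constant names are assumptions introduced here, while the numbers come from the problem statement.

```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the run time is sped up by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)


# Values taken from the problem statement.
SHARE_A, SHARE_B = 0.80, 0.20   # resource share of each application
OPT_A, OPT_B = 0.40, 0.99       # optimizable fraction of each application
FACTOR = 2.0                    # speedup of an optimized portion

# Speedup of each application in isolation.
s_a = amdahl_speedup(OPT_A, FACTOR)
s_b = amdahl_speedup(OPT_B, FACTOR)

# System speedup: the application's resource share plays the role of f,
# and its isolated speedup plays the role of s.
s_tot_a = amdahl_speedup(SHARE_A, s_a)
s_tot_b = amdahl_speedup(SHARE_B, s_b)

print(f"S_A = {s_a:.2f}, S_tot-A = {s_tot_a:.2f}")
print(f"S_B = {s_b:.2f}, S_tot-B = {s_tot_b:.2f}")
```

Applying the same function twice, once per level, mirrors the nested use of Amdahl’s Law in parts (a) and (b).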

Question 2

(Question 1.15 from Hennessy and Patterson 5th Edition) Assume that we make an enhancement to a computer that improves some mode of execution by a factor of ten. Enhanced mode is used 50% of the time, measured as a percentage of the execution time when the enhanced mode is in use.

(a) What percentage of the original unenhanced execution time has been converted to enhanced mode?

(b) What is the speedup we have obtained from enhanced mode?

Solution

Recall Amdahl’s Law, which states

$$S = \frac{T_{\text{old}}}{T_{\text{new}}} = \frac{1}{(1 - f) + \dfrac{f}{s}}$$

where

• $S$ represents the speedup,
• $T_{\text{old}}$ and $T_{\text{new}}$ represent the application time before and after optimization,
• $f$ represents the fraction of the program that is optimized, and
• $s$ represents the speedup of the optimized portion.

(a) Based on the problem statement, enhanced mode accounts for 50% of the execution time measured while the enhancement is in use. This means the execution time of the optimized portion is 50% of $T_{\text{new}}$. To get an expression in terms of $T_{\text{old}}$, we use $f$, which is the fraction of the original application, without enhanced mode, that can be optimized. The value $f \times T_{\text{old}}$ corresponds to the amount of execution time of the original application
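As a rough numerical check of Question 2, the sketch below works directly from the problem statement (a tenfold enhancement that occupies 50% of the enhanced execution time) rather than from the derivation above. The function names and the normalization $T_{\text{new}} = 1$ are assumptions made for illustration.

```python
def original_fraction(new_share: float, s: float) -> float:
    """Fraction f of the ORIGINAL time that was converted to enhanced mode.

    new_share is the share of the enhanced (new) execution time spent in
    enhanced mode; s is the speedup of that mode. T_new is normalized to 1.
    """
    converted_old_time = new_share * s      # enhanced time stretched back by s
    untouched_time = 1.0 - new_share        # unchanged by the enhancement
    return converted_old_time / (converted_old_time + untouched_time)


def overall_speedup(new_share: float, s: float) -> float:
    """T_old / T_new with T_new normalized to 1."""
    return new_share * s + (1.0 - new_share)


f = original_fraction(0.5, 10.0)
speedup = overall_speedup(0.5, 10.0)

# Cross-check against Amdahl's Law written in terms of f.
amdahl = 1.0 / ((1.0 - f) + f / 10.0)

print(f"f = {f:.4f}, speedup = {speedup:.2f}, Amdahl check = {amdahl:.2f}")
```

Under these assumptions, roughly 91% of the original execution time is converted to enhanced mode, and both ways of computing the speedup agree.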

(a) 1.25 times faster than that of CPU B?

(b) 1.1 times faster than that of CPU B?

Solution

Let t represent the clock cycle time of each CPU.

CPU A

| Instruction Type | CPI | IC |
|---|---|---|
| Compare | 1 | 20% × $IC_A$ |
| Branch | 2 | 20% × $IC_A$ |
| Other | 1 | 60% × $IC_A$ |

CPU B

| Instruction Type | CPI | IC |
|---|---|---|
| Compare + Branch | 2 | 20% × $IC_A$ |
| Other | 1 | 60% × $IC_A$ |

(a) The total execution time is computed using the data for each CPU in the tables above.

$$T_A = t_A \sum_{i=1}^{3} [IC_i \times CPI_i] = t_A \times IC_A (0.2 \times 1 + 0.2 \times 2 + 0.6 \times 1) = 1.2 \times IC_A \times t_A$$

$$T_B = t_B \sum_{i=1}^{2} [IC_i \times CPI_i] = 1.25\, t_A \times IC_A (0.2 \times 2 + 0.6 \times 1) = 1.25 \times IC_A \times t_A$$

Because $T_A < T_B$, CPU A is faster.

(b) The total execution time $T_A$ is the same as in (a).

$$T_B = t_B \sum_{i=1}^{2} [IC_i \times CPI_i] = 1.1\, t_A \times IC_A (0.2 \times 2 + 0.6 \times 1) = 1.1 \times IC_A \times t_A$$

Because $T_A > T_B$, CPU B is faster.
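The comparison in both parts can be reproduced with a short script. This is a sketch under the same assumptions as the solution above: instruction counts are expressed as fractions of $IC_A$, and CPU B's cycle time is a multiple of CPU A's. The names `cpu_time`, `MIX_A`, `MIX_B`, and `faster_cpu` are introduced here for illustration.

```python
def cpu_time(mix, cycle_time, instruction_count=1.0):
    """Execution time = cycle time x sum(IC_i x CPI_i).

    mix is a list of (fraction of IC_A, CPI) pairs.
    """
    return cycle_time * instruction_count * sum(frac * cpi for frac, cpi in mix)


# Instruction mixes from the tables above, as fractions of IC_A.
MIX_A = [(0.2, 1), (0.2, 2), (0.6, 1)]   # compare, branch, other
MIX_B = [(0.2, 2), (0.6, 1)]             # compare + branch, other


def faster_cpu(cycle_ratio: float, t_a: float = 1.0) -> str:
    """Return 'A' or 'B', given t_B = cycle_ratio x t_A."""
    time_a = cpu_time(MIX_A, t_a)
    time_b = cpu_time(MIX_B, cycle_ratio * t_a)
    return "A" if time_a < time_b else "B"


for ratio in (1.25, 1.1):
    print(f"t_B = {ratio} x t_A: CPU {faster_cpu(ratio)} is faster")
```

Normalizing $t_A$ and $IC_A$ to 1 is safe here because both appear as common factors in $T_A$ and $T_B$.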

Question 4

(a) Consider two competing processors. Processor A has a higher clock rate and a higher MIPS (millions of instructions per second) than Processor B. Under what conditions, if any, will Processor A always execute faster than Processor B?

(b) Suppose that there are two implementations of the same instruction set architecture, Machine A and Machine B. The table below shows their effective CPIs for a particular program and the clock cycle time for each machine.

| | Machine A | Machine B |
|---|---|---|
| Clock Cycle Time | 20 ns | 15 ns |
| CPI | 1.5 | 1.0 |

Which machine is faster for this program and by how much?

Solution

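(a) The total execution time of a program is $T_{\text{prog}} = IC \times CPI \times t_{\text{clk}}$, where $t_{\text{clk}}$ is the clock cycle time. The CPI can be expressed in terms of MIPS as follows, where $f$ represents the clock rate.

$$\text{MIPS} = \frac{f}{CPI \times 10^6} \implies CPI = \frac{f}{\text{MIPS} \times 10^6}$$

$$T_{\text{prog}} = IC \times CPI \times t_{\text{clk}} = IC \times \frac{f}{\text{MIPS} \times 10^6} \times t_{\text{clk}} = IC \times \frac{1}{\text{MIPS} \times 10^6}$$

The last step uses $f \times t_{\text{clk}} = 1$.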

Therefore, once the MIPS rating is known, the clock rate has no separate effect on overall execution time. The processor with the higher MIPS is always faster if

• the number of instructions in the program is constant, and
• both processors use the same benchmarks, ISA, compiler, and OS.
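The identity $T_{\text{prog}} = IC / (\text{MIPS} \times 10^6)$ can be checked with made-up numbers. In the sketch below, the instruction count, clock rates, and CPIs are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
import math


def mips_rating(clock_hz: float, cpi: float) -> float:
    """MIPS = f / (CPI x 10^6)."""
    return clock_hz / (cpi * 1e6)


def time_from_cpi(ic: float, cpi: float, clock_hz: float) -> float:
    """T = IC x CPI x t_clk, with t_clk = 1 / f."""
    return ic * cpi * (1.0 / clock_hz)


def time_from_mips(ic: float, mips: float) -> float:
    """T = IC / (MIPS x 10^6); the clock rate no longer appears."""
    return ic / (mips * 1e6)


IC = 1_000_000_000                                 # hypothetical instruction count
PROCESSORS = {"A": (4.0e9, 1.6), "B": (3.0e9, 2.0)}  # hypothetical (clock in Hz, CPI)

results = {}
for name, (clock_hz, cpi) in PROCESSORS.items():
    rating = mips_rating(clock_hz, cpi)
    t_cpi = time_from_cpi(IC, cpi, clock_hz)
    t_mips = time_from_mips(IC, rating)
    results[name] = (rating, t_cpi, t_mips)
    print(f"Processor {name}: {rating:.0f} MIPS, {t_cpi:.4f} s, {t_mips:.4f} s")
```

Both formulas give the same time for each processor, and with the instruction count held constant the processor with the higher MIPS rating finishes first.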

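(b) The total execution time of a program is $T_{\text{prog}} = IC \times CPI \times t_{\text{clk}}$, where $t_{\text{clk}}$ is the clock cycle time. Because Machines A and B execute the same program on the same instruction set architecture, the instruction count is the same for both; thus, $IC$ cancels in the comparison.

$$T_A = 1.5 \times IC \times 20\ \text{ns} = 30 \times IC\ \text{ns}$$

$$T_B = 1.0 \times IC \times 15\ \text{ns} = 15 \times IC\ \text{ns}$$

Since $T_B < T_A$, Machine B is faster.

$$\frac{T_A}{T_B} = \frac{30 \times IC}{15 \times IC} = 2$$

Therefore, Machine B is twice as fast as Machine A.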
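The ratio for part (b) can be confirmed in a few lines. This sketch uses the CPI and cycle-time values from the table; the function name `time_per_instruction_ns` is an assumption introduced for illustration.

```python
def time_per_instruction_ns(cpi: float, cycle_ns: float) -> float:
    """Average time per instruction in nanoseconds (CPI x clock cycle time)."""
    return cpi * cycle_ns


# Values from the table in the question.
per_instr_a = time_per_instruction_ns(cpi=1.5, cycle_ns=20.0)
per_instr_b = time_per_instruction_ns(cpi=1.0, cycle_ns=15.0)

# IC is identical for both machines, so the ratio of per-instruction
# times equals the ratio of total execution times T_A / T_B.
ratio = per_instr_a / per_instr_b

print(f"Machine A: {per_instr_a} ns per instruction")
print(f"Machine B: {per_instr_b} ns per instruction")
print(f"T_A / T_B = {ratio}")
```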
