root / spec / gcc / gcc_dvfs.txt @ 53

sim-outorder_dvfs: SimpleScalar/PISA Tool Set version 3.0 of September, 1998.
Copyright (c) 1994-1998 by Todd M. Austin.  All Rights Reserved.


Processor Parameters:
Issue Width: 4
Window Size: 16
Number of Virtual Registers: 32
Number of Physical Registers: 16
Datapath Width: 64
Total Power Consumption: 24.105
Branch Predictor Power Consumption: 1.14342  (5.17%)
 branch target buffer power (W): 1.04097
 local predict power (W): 0.0275244
 global predict power (W): 0.031332
 chooser power (W): 0.0206036
 RAS power (W): 0.0229956
Rename Logic Power Consumption: 0.0887797  (0.402%)
 Instruction Decode Power (W): 0.0038821
 RAT decode_power (W): 0.0273861
 RAT wordline_power (W): 0.00645964
 RAT bitline_power (W): 0.0486255
 DCL Comparators (W): 0.0024263
Instruction Window Power Consumption: 0.517536  (2.34%)
 tagdrive (W): 0.0186418
 tagmatch (W): 0.00697769
 Selection Logic (W): 0.00331194
 decode_power (W): 0.0131921
 wordline_power (W): 0.0176803
 bitline_power (W): 0.457732
Load/Store Queue Power Consumption: 0.201758  (0.913%)
 tagdrive (W): 0.0854673
 tagmatch (W): 0.0207657
 decode_power (W): 0.00194105
 wordline_power (W): 0.00302882
 bitline_power (W): 0.0905553
Arch. Register File Power Consumption: 0.769909  (3.48%)
 decode_power (W): 0.0273861
 wordline_power (W): 0.0176803
 bitline_power (W): 0.724843
Result Bus Power Consumption: 0.499392  (2.26%)
Total Clock Power: 10.1199  (45.8%)
Int ALU Power: 1.19732  (5.42%)
FP ALU Power: 3.66922  (16.6%)
Instruction Cache Power Consumption: 0.614638  (2.78%)
 decode_power (W): 0.186809
 wordline_power (W): 0.00542611
 bitline_power (W): 0.231588
 senseamp_power (W): 0.07296
 tagarray_power (W): 0.117856
Itlb_power (W): 0.0565504 (0.256%)
Data Cache Power Consumption: 1.80232  (8.15%)
 decode_power (W): 0.15387
 wordline_power (W): 0.0368784
 bitline_power (W): 0.749615
 senseamp_power (W): 0.58368
 tagarray_power (W): 0.278274
Dtlb_power (W): 0.193103 (0.874%)
Level 2 Cache Power Consumption: 1.23116 (5.57%)
 decode_power (W): 0.0990259
 wordline_power (W): 0.00799512
 bitline_power (W): 0.83087
 senseamp_power (W): 0.14592
 tagarray_power (W): 0.147353
sim: command line: ./sim-outorder_dvfs gcc00.O2unroll.gcc.100M.ss -funroll-loops -fforce-mem -fcse-follow-jumps -fcse-skip-blocks -fexpensive-optimizations -fstrength-reduce -fpeephole -fschedule-insns -finline-functions -fschedule-insns2 -O regclass.i -o regclass.s

sim: simulation started @ Mon Nov 30 16:18:55 2009, options follow:

sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support.  This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config                     # load configuration from a file
# -dumpconfig                 # dump configuration to a file
# -h                    false # print help message
# -v                    false # verbose operation
# -d                    false # enable debug message
# -i                    false # start in Dlite debugger
-seed                       1 # random number generator seed (0 for timer seed)
# -q                    false # initialize and terminate immediately
# -chkpt               <null> # restore EIO trace execution from <fname>
# -redir:sim           <null> # redirect simulator output to file (non-interactive only)
# -redir:prog          <null> # redirect simulated program output to file
-nice                       0 # simulator scheduling priority
-max:inst                   0 # maximum number of inst's to execute
-fastfwd                    0 # number of insts skipped before timing starts
# -ptrace              <null> # generate pipetrace, i.e., <fname|stdout|stderr> <range>
-fetch:ifqsize              4 # instruction fetch queue size (in insts)
-fetch:mplat                3 # extra branch mis-prediction latency
-fetch:speed                1 # speed of front-end of machine relative to execution core
-bpred                  bimod # branch predictor type {nottaken|taken|perfect|bimod|2lev|comb}
-bpred:bimod     2048 # bimodal predictor config (<table size>)
-bpred:2lev      1 1024 8 0 # 2-level predictor config (<l1size> <l2size> <hist_size> <xor>)
-bpred:comb      1024 # combining predictor config (<meta_table_size>)
-bpred:ras                  8 # return address stack size (0 for no return stack)
-bpred:btb       512 4 # BTB config (<num_sets> <associativity>)
# -bpred:spec_update       <null> # speculative predictors update in {ID|WB} (default non-spec)
-decode:width               4 # instruction decode B/W (insts/cycle)
-issue:width                4 # instruction issue B/W (insts/cycle)
-issue:inorder          false # run pipeline with in-order issue
-issue:wrongpath         true # issue instructions down wrong execution paths
-commit:width               4 # instruction commit B/W (insts/cycle)
-ruu:size                  16 # register update unit (RUU) size
-lsq:size                   8 # load/store queue (LSQ) size
-cache:dl1       dl1:128:32:4:l # l1 data cache config, i.e., {<config>|none}
-cache:dl1lat               1 # l1 data cache hit latency (in cycles)
-cache:dl2       ul2:1024:64:4:l # l2 data cache config, i.e., {<config>|none}
-cache:dl2lat               6 # l2 data cache hit latency (in cycles)
-cache:il1       il1:512:32:1:l # l1 inst cache config, i.e., {<config>|dl1|dl2|none}
-cache:il1lat               1 # l1 instruction cache hit latency (in cycles)
-cache:il2                dl2 # l2 instruction cache config, i.e., {<config>|dl2|none}
-cache:il2lat               6 # l2 instruction cache hit latency (in cycles)
-cache:flush            false # flush caches on system calls
-cache:icompress        false # convert 64-bit inst addresses to 32-bit inst equivalents
-mem:lat         18 2 # memory access latency (<first_chunk> <inter_chunk>)
-mem:width                  8 # memory access bus width (in bytes)
-tlb:itlb        itlb:16:4096:4:l # instruction TLB config, i.e., {<config>|none}
-tlb:dtlb        dtlb:32:4096:4:l # data TLB config, i.e., {<config>|none}
-tlb:lat                   30 # inst/data TLB miss latency (in cycles)
-res:ialu                   4 # total number of integer ALU's available
-res:imult                  1 # total number of integer multiplier/dividers available
-res:memport                2 # total number of memory system ports available (to CPU)
-res:fpalu                  4 # total number of floating point ALU's available
-res:fpmult                 1 # total number of floating point multiplier/dividers available
# -pcstat              <null> # profile stat(s) against text addr's (mult uses ok)
-bugcompat              false # operate in backward-compatible bugs mode (for testing only)

    
  Pipetrace range arguments are formatted as follows:

    {{@|#}<start>}:{{@|#|+}<end>}

  Both ends of the range are optional, if neither are specified, the entire
  execution is traced.  Ranges that start with a `@' designate an address
  range to be traced, those that start with an `#' designate a cycle count
  range.  All other range values represent an instruction count range.  The
  second argument, if specified with a `+', indicates a value relative
  to the first argument, e.g., 1000:+100 == 1000:1100.  Program symbols may
  be used in all contexts.

    Examples:   -ptrace FOO.trc #0:#1000
                -ptrace BAR.trc @2000:
                -ptrace BLAH.trc :1500
                -ptrace UXXE.trc :
                -ptrace FOOBAR.trc @main:+278
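
The range grammar described above can be made concrete with a small parser. The Python sketch below is not part of SimpleScalar: the function name `parse_ptrace_range` and the tuple layout are hypothetical, chosen only for illustration. It classifies each endpoint the way the help text describes and leaves number or symbol resolution to the simulator.

```python
def parse_ptrace_range(spec):
    """Split a pipetrace range into (start, end) endpoints.

    Each endpoint is None when omitted, otherwise a (kind, value) pair.
    Values stay strings because program symbols such as "main" are allowed.
    """
    start_text, sep, end_text = spec.partition(":")
    if not sep:
        raise ValueError("range must contain ':' between start and end")

    def endpoint(text, allow_relative):
        if not text:
            return None  # this end of the range was omitted
        prefix, rest = text[0], text[1:]
        if prefix == "@":
            kind = "address"
        elif prefix == "#":
            kind = "cycle"
        elif prefix == "+":
            if not allow_relative:
                raise ValueError("'+' is only described for the end value")
            kind = "relative"
        else:
            return ("instruction", text)  # no prefix: instruction count
        if not rest:
            raise ValueError("missing value after prefix " + prefix)
        return (kind, rest)

    return endpoint(start_text, False), endpoint(end_text, True)


# The five ranges from the examples above.
for spec in ["#0:#1000", "@2000:", ":1500", ":", "@main:+278"]:
    print(spec, "->", parse_ptrace_range(spec))
```

Keeping the values as strings is a deliberate simplification: the help text allows program symbols in every position, so converting to integers here would reject valid ranges.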

    
  Branch predictor configuration examples for 2-level predictor:
    Configurations:   N, M, W, X
      N   # entries in first level (# of shift register(s))
      W   width of shift register(s)
      M   # entries in 2nd level (# of counters, or other FSM)
      X   (yes-1/no-0) xor history and address for 2nd level index
    Sample predictors:
      GAg     : 1, W, 2^W, 0
      GAp     : 1, W, M (M > 2^W), 0
      PAg     : N, W, 2^W, 0
      PAp     : N, W, M (M == 2^(N+W)), 0
      gshare  : 1, W, 2^W, 1
  Predictor `comb' combines a bimodal and a 2-level predictor.
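
As a rough illustration of the sample predictors above, the following Python sketch derives the four values for the GAg, PAg and gshare rows, where the second-level table has 2^W entries. The helper names `two_level_args` and `format_flag` are hypothetical. Note one assumption: the sample table lists values as N, W, M, X, whereas the option summary earlier in this log documents `-bpred:2lev` as `<l1size> <l2size> <hist_size> <xor>`; the sketch emits values in the option-summary order.

```python
def two_level_args(kind, hist_width, l1size=1):
    """Return (l1size, l2size, hist_size, xor) for a sample predictor.

    Only the rows whose second-level size is 2^W are covered.
    """
    if kind == "GAg":
        n, xor = 1, 0
    elif kind == "gshare":
        n, xor = 1, 1
    elif kind == "PAg":
        n, xor = l1size, 0
    else:
        raise ValueError("unsupported predictor kind: " + kind)
    return (n, 2 ** hist_width, hist_width, xor)


def format_flag(args):
    """Render the tuple as a command-line fragment (illustrative only)."""
    return "-bpred:2lev " + " ".join(str(a) for a in args)


print(format_flag(two_level_args("gshare", 10)))
print(format_flag(two_level_args("GAg", 8)))
```

For comparison, the configuration recorded in this log is `1 1024 8 0`, that is, one history register of width 8 with 1024 second-level entries, which is more than 2^8.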

    
  The cache config parameter <config> has the following format:

    <name>:<nsets>:<bsize>:<assoc>:<repl>

    <name>   - name of the cache being defined
    <nsets>  - number of sets in the cache
    <bsize>  - block size of the cache
    <assoc>  - associativity of the cache
    <repl>   - block replacement strategy, 'l'-LRU, 'f'-FIFO, 'r'-random

    Examples:   -cache:dl1 dl1:4096:32:1:l
                -dtlb dtlb:128:4096:32:r
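
The `<config>` format above can be decoded with a few lines of Python. This is a minimal sketch, not simulator code: `CacheConfig` and `parse_cache_config` are hypothetical names, and the capacity formula (sets x block size x associativity) is the usual definition, which appears consistent with the "Cache Parameters" blocks printed later in this log.

```python
from dataclasses import dataclass

# Replacement-policy letters as listed in the help text.
REPLACEMENT = {"l": "LRU", "f": "FIFO", "r": "random"}


@dataclass(frozen=True)
class CacheConfig:
    name: str
    nsets: int
    bsize: int
    assoc: int
    repl: str

    @property
    def size_bytes(self):
        # Capacity = number of sets * block size * associativity.
        return self.nsets * self.bsize * self.assoc


def parse_cache_config(spec):
    """Parse '<name>:<nsets>:<bsize>:<assoc>:<repl>' into a CacheConfig."""
    parts = spec.split(":")
    if len(parts) != 5:
        raise ValueError("expected 5 colon-separated fields: " + spec)
    name, nsets, bsize, assoc, repl = parts
    if repl not in REPLACEMENT:
        raise ValueError("unknown replacement policy: " + repl)
    return CacheConfig(name, int(nsets), int(bsize), int(assoc), repl)


# The three cache configurations used for this run.
for spec in ["dl1:128:32:4:l", "il1:512:32:1:l", "ul2:1024:64:4:l"]:
    cfg = parse_cache_config(spec)
    print(cfg.name, cfg.size_bytes, "bytes,", REPLACEMENT[cfg.repl])
```

For the TLB options the same fields apply, with the block size field holding the page size (4096 in this run).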

    
  Cache levels can be unified by pointing a level of the instruction cache
  hierarchy at the data cache hierarchy using the "dl1" and "dl2" cache
  configuration arguments.  Most sensible combinations are supported, e.g.,

    A unified l2 cache (il2 is pointed at dl2):
      -cache:il1 il1:128:64:1:l -cache:il2 dl2
      -cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

    Or, a fully unified cache hierarchy (il1 pointed at dl1):
      -cache:il1 dl1
      -cache:dl1 ul1:256:32:1:l -cache:dl2 ul2:1024:64:2:l



sim: ** starting performance simulation **
warning: syscall: sigvec ignored
warning: syscall: sigvec ignored

    
Cache Parameters:
  Size in bytes: 16384
  Number of sets: 512
  Associativity: 4
  Block Size (bytes): 8

Access Time: 9.27925e-09
Cycle Time:  1.09081e-08

Best Ndwl (L1): 8
Best Ndbl (L1): 1
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 4
Best Ntspd (L1): 1

Time Components:
 data side (with Output driver) (ns): 8.44162
 tag side (ns): 8.55667
 decode_data (ns): 5.29318
 wordline_data (ns): 1.03507
 bitline_data (ns): 0.810785
 sense_amp_data (ns): 0.58
 decode_tag (ns): 2.37065
 wordline_tag (ns): 1.36749
 bitline_tag (ns): 0.158246
 sense_amp_tag (ns): 0.26
 compare (ns): 2.42991
 mux driver (ns): 1.6125
 sel inverter (ns): 0.357877
 data output driver (ns): 0.722579
 total data path (with output driver) (ns): 7.71904
 total tag path is set assoc (ns): 8.55667
 precharge time (ns): 1.6289

    
Cache Parameters:
  Size in bytes: 16384
  Number of sets: 512
  Associativity: 1
  Block Size (bytes): 32

Access Time: 6.07496e-09
Cycle Time:  7.99836e-09

Best Ndwl (L1): 2
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 2
Best Ntspd (L1): 2

Time Components:
 data side (with Output driver) (ns): 6.07496
 tag side (ns): 6.05737
 decode_data (ns): 2.92313
 wordline_data (ns): 1.32956
 bitline_data (ns): 0.452976
 sense_amp_data (ns): 0.58
 decode_tag (ns): 1.84499
 wordline_tag (ns): 0.825016
 bitline_tag (ns): 0.252886
 sense_amp_tag (ns): 0.26
 compare (ns): 2.30022
 valid signal driver (ns): 0.574251
 data output driver (ns): 0.789293
 total data path (with output driver) (ns): 5.28567
 total tag path is dm (ns): 6.05737
 precharge time (ns): 1.92339

    
Cache Parameters:
  Size in bytes: 16384
  Number of sets: 128
  Associativity: 4
  Block Size (bytes): 32

Access Time: 9.14093e-09
Cycle Time:  1.11718e-08

Best Ndwl (L1): 4
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 2
Best Ntspd (L1): 1

Time Components:
 data side (with Output driver) (ns): 6.05114
 tag side (ns): 7.98848
 decode_data (ns): 2.92572
 wordline_data (ns): 1.437
 bitline_data (ns): -0.0440331
 sense_amp_data (ns): 0.58
 decode_tag (ns): 1.46851
 wordline_tag (ns): 1.27791
 bitline_tag (ns): -0.0315811
 sense_amp_tag (ns): 0.26
 compare (ns): 2.29478
 mux driver (ns): 2.37376
 sel inverter (ns): 0.345094
 data output driver (ns): 1.15245
 total data path (with output driver) (ns): 4.89869
 total tag path is set assoc (ns): 7.98848
 precharge time (ns): 2.03083

    
Cache Parameters:
  Size in bytes: 262144
  Number of sets: 1024
  Associativity: 4
  Block Size (bytes): 64

Access Time: 1.44948e-08
Cycle Time:  1.76863e-08

Best Ndwl (L1): 2
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 4
Best Ntspd (L1): 1

Time Components:
 data side (with Output driver) (ns): 11.3269
 tag side (ns): 12.2049
 decode_data (ns): 4.99158
 wordline_data (ns): 2.59771
 bitline_data (ns): 0.867749
 sense_amp_data (ns): 0.58
 decode_tag (ns): 4.52586
 wordline_tag (ns): 1.24192
 bitline_tag (ns): 0.46158
 sense_amp_tag (ns): 0.26
 compare (ns): 2.17054
 mux driver (ns): 3.21212
 sel inverter (ns): 0.332908
 data output driver (ns): 2.28987
 total data path (with output driver) (ns): 9.03704
 total tag path is set assoc (ns): 12.2049
 precharge time (ns): 3.19154
Speed down!

    
Cache Parameters:
  Size in bytes: 16384
  Number of sets: 512
  Associativity: 4
  Block Size (bytes): 8

Access Time: 9.27925e-09
Cycle Time:  1.09081e-08

Best Ndwl (L1): 8
Best Ndbl (L1): 1
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 4
Best Ntspd (L1): 1

Time Components:
 data side (with Output driver) (ns): 8.44162
 tag side (ns): 8.55667
 decode_data (ns): 5.29318
 wordline_data (ns): 1.03507
 bitline_data (ns): 0.810785
 sense_amp_data (ns): 0.58
 decode_tag (ns): 2.37065
 wordline_tag (ns): 1.36749
 bitline_tag (ns): 0.158246
 sense_amp_tag (ns): 0.26
 compare (ns): 2.42991
 mux driver (ns): 1.6125
 sel inverter (ns): 0.357877
 data output driver (ns): 0.722579
 total data path (with output driver) (ns): 7.71904
 total tag path is set assoc (ns): 8.55667
 precharge time (ns): 1.6289

Cache Parameters:
  Size in bytes: 163
Processor Parameters:
Issue Width: 4
Window Size: 16
Number of Virtual Registers: 32
Number of Physical Registers: 16
Datapath Width: 64
Total Power Consumption: 5.04786
Branch Predictor Power Consumption: 0.163621  (5.37%)
 branch target buffer power (W): 0.143672
 local predict power (W): 0.00624411
 global predict power (W): 0.00709918
 chooser power (W): 0.00415536
 RAS power (W): 0.00245052
Rename Logic Power Consumption: 0.0104069  (0.341%)
 Instruction Decode Power (W): 0.000485262
 RAT decode_power (W): 0.00342327
 RAT wordline_power (W): 0.000774826
 RAT bitline_power (W): 0.00542029
 DCL Comparators (W): 0.000303288
Instruction Window Power Consumption: 0.0514899  (1.69%)
 tagdrive (W): 0.00230958
 tagmatch (W): 0.000872211
 Selection Logic (W): 0.000413992
 decode_power (W): 0.00164902
 wordline_power (W): 0.00221003
 bitline_power (W): 0.0440351
Load/Store Queue Power Consumption: 0.0227073  (0.745%)
 tagdrive (W): 0.0106829
 tagmatch (W): 0.00259571
 decode_power (W): 0.000242631
 wordline_power (W): 0.000378602
 bitline_power (W): 0.00880742
Arch. Register File Power Consumption: 0.0805164  (2.64%)
 decode_power (W): 0.00342327
 wordline_power (W): 0.00221003
 bitline_power (W): 0.0748831
Result Bus Power Consumption: 0.0624241  (2.05%)
Total Clock Power: 1.21766  (40%)
Int ALU Power: 0.149665  (4.91%)
FP ALU Power: 0.458652  (15%)
Instruction Cache Power Consumption: 0.112954  (3.71%)
 decode_power (W): 0.0233511
 wordline_power (W): 0.000678263
 bitline_power (W): 0.0289484
 senseamp_power (W): 0.03648
 tagarray_power (W): 0.0234957
Itlb_power (W): 0.00879361 (0.289%)
Data Cache Power Consumption: 0.46436  (15.2%)
 decode_power (W): 0.0192338
 wordline_power (W): 0.00460979
 bitline_power (W): 0.0791131
 senseamp_power (W): 0.29184
 tagarray_power (W): 0.0695636
Dtlb_power (W): 0.027654 (0.907%)
Level 2 Cache Power Consumption: 0.216952 (7.12%)
 decode_power (W): 0.0123782
 wordline_power (W): 0.00099939
 bitline_power (W): 0.103859
 senseamp_power (W): 0.07296
 tagarray_power (W): 0.0267553
 init_reg_sets init_reg_sets_1 fix_register reg_preferred_class reg_preferred_or_nothing regclass_init regclass reg_class_record record_address_regs reg_scan reg_scan_mark_refs
time in parse: 15.985004
time in integration: 1.252078
time in jump: 7.788488
time in cse: 23.697481
time in loop: 9.684604
time in cse2: 42.626664
time in flow: 6.960435
time in combine: 34.282143
time in sched: 9.264579
time in local-alloc: 12.156759
time in global-alloc: 12.632791
time in sched2: 7.180449
time in dbranch: 18.137133
time in shorten-branch: 0.388023
time in stack-reg: 0.000000
time in final: 7.328460
time in varconst: 0.220009
time in symout: 0.000000
time in dump: 0.000000

    
sim: ** simulation statistics **
sim_num_insn              172201155 # total number of instructions committed
sim_num_refs               68343604 # total number of loads and stores committed
sim_num_loads              44498322 # total number of loads committed
sim_num_stores         23845282.0000 # total number of stores committed
sim_num_branches           35305432 # total number of branches committed
sim_elapsed_time                213 # total simulation time in seconds
sim_inst_rate           808456.1268 # simulation speed (in insts/sec)
sim_total_insn            201110718 # total number of instructions executed
sim_total_refs             79004247 # total number of loads and stores executed
sim_total_loads            53353346 # total number of loads executed
sim_total_stores       25650901.0000 # total number of stores executed
sim_total_branches         41270254 # total number of branches executed
sim_cycle                 185517323 # total simulation time in cycles
sim_IPC                      0.9282 # instructions per cycle
sim_CPI                      1.0773 # cycles per instruction
sim_exec_BW                  1.0841 # total instructions (mis-spec + committed) per cycle
sim_IPB                      4.8775 # instruction per branch
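
The derived throughput figures above follow directly from the raw counters. The Python sketch below parses a few `name value # description` lines and recomputes them; `parse_stats` and `derived_rates` are hypothetical helper names, and the formulas are inferred from the stat descriptions and checked against the values in this log rather than taken from the simulator source.

```python
def parse_stats(text):
    """Collect 'name value # description' lines into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        body, sep, _description = line.partition("#")
        fields = body.split()
        # Skip lines without a description, option lines, and odd layouts.
        if not sep or len(fields) != 2 or fields[0].startswith("-"):
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue
    return stats


def derived_rates(s):
    return {
        "sim_IPC": s["sim_num_insn"] / s["sim_cycle"],
        "sim_CPI": s["sim_cycle"] / s["sim_num_insn"],
        "sim_exec_BW": s["sim_total_insn"] / s["sim_cycle"],
        "sim_IPB": s["sim_num_insn"] / s["sim_num_branches"],
    }


sample = """\
sim_num_insn              172201155 # total number of instructions committed
sim_num_branches           35305432 # total number of branches committed
sim_total_insn            201110718 # total number of instructions executed
sim_cycle                 185517323 # total simulation time in cycles
"""

for name, value in derived_rates(parse_stats(sample)).items():
    print(f"{name:12s} {value:.4f}")
```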
IFQ_count                 289261509 # cumulative IFQ occupancy
IFQ_fcount                 60074385 # cumulative IFQ full count
ifq_occupancy                1.5592 # avg IFQ occupancy (insn's)
ifq_rate                     1.0841 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                  1.4383 # avg IFQ occupant latency (cycle's)
ifq_full                     0.3238 # fraction of time (cycle's) IFQ was full
RUU_count                1039743711 # cumulative RUU occupancy
RUU_fcount                 20166064 # cumulative RUU full count
ruu_occupancy                5.6046 # avg RUU occupancy (insn's)
ruu_rate                     1.0841 # avg RUU dispatch rate (insn/cycle)
ruu_latency                  5.1700 # avg RUU occupant latency (cycle's)
ruu_full                     0.1087 # fraction of time (cycle's) RUU was full
LSQ_count                 416172911 # cumulative LSQ occupancy
LSQ_fcount                 16765980 # cumulative LSQ full count
lsq_occupancy                2.2433 # avg LSQ occupancy (insn's)
lsq_rate                     1.0841 # avg LSQ dispatch rate (insn/cycle)
lsq_latency                  2.0694 # avg LSQ occupant latency (cycle's)
lsq_full                     0.0904 # fraction of time (cycle's) LSQ was full
bpred_bimod.lookups        43325511 # total number of bpred lookups
bpred_bimod.updates        35305432 # total number of updates
bpred_bimod.addr_hits      30829119 # total number of address-predicted hits
bpred_bimod.dir_hits       31597275 # total number of direction-predicted hits (includes addr-hits)
bpred_bimod.misses          3708157 # total number of misses
bpred_bimod.jr_hits         2853186 # total number of address-predicted hits for JR's
bpred_bimod.jr_seen         3578281 # total number of JR's seen
bpred_bimod.jr_non_ras_hits.PP       317354 # total number of address-predicted hits for non-RAS JR's
bpred_bimod.jr_non_ras_seen.PP       965369 # total number of non-RAS JR's seen
bpred_bimod.bpred_addr_rate    0.8732 # branch address-prediction rate (i.e., addr-hits/updates)
bpred_bimod.bpred_dir_rate    0.8950 # branch direction-prediction rate (i.e., all-hits/updates)
bpred_bimod.bpred_jr_rate    0.7974 # JR address-prediction rate (i.e., JR addr-hits/JRs seen)
bpred_bimod.bpred_jr_non_ras_rate.PP    0.3287 # non-RAS JR addr-pred rate (ie, non-RAS JR hits/JRs seen)
bpred_bimod.retstack_pushes      3240187 # total number of address pushed onto ret-addr stack
bpred_bimod.retstack_pops      3023023 # total number of address popped off of ret-addr stack
bpred_bimod.used_ras.PP      2612912 # total number of RAS predictions used
bpred_bimod.ras_hits.PP      2535832 # total number of RAS hits
bpred_bimod.ras_rate.PP    0.9705 # RAS prediction rate (i.e., RAS hits/used RAS)
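
The branch predictor rates above are ratios of the counters listed just before them, as the stat descriptions state. The short Python sketch below recomputes them from the logged counts; the dictionary keys are shortened labels chosen for this example, not simulator identifiers. It also checks an observation that holds for this run, namely that `misses` equals `updates` minus `dir_hits`.

```python
# Counter values copied from the bpred_bimod statistics above.
counts = {
    "updates": 35305432,
    "addr_hits": 30829119,
    "dir_hits": 31597275,
    "misses": 3708157,
    "jr_hits": 2853186,
    "jr_seen": 3578281,
    "jr_non_ras_hits": 317354,
    "jr_non_ras_seen": 965369,
    "used_ras": 2612912,
    "ras_hits": 2535832,
}


def ratio(numerator, denominator):
    """Guarded division so an empty counter yields 0.0 instead of an error."""
    return numerator / denominator if denominator else 0.0


rates = {
    "bpred_addr_rate": ratio(counts["addr_hits"], counts["updates"]),
    "bpred_dir_rate": ratio(counts["dir_hits"], counts["updates"]),
    "bpred_jr_rate": ratio(counts["jr_hits"], counts["jr_seen"]),
    "bpred_jr_non_ras_rate": ratio(counts["jr_non_ras_hits"],
                                   counts["jr_non_ras_seen"]),
    "ras_rate": ratio(counts["ras_hits"], counts["used_ras"]),
}

for name, value in rates.items():
    print(f"{name:24s} {value:.4f}")

# In this run, direction mispredictions account for every counted miss.
print("updates - dir_hits =", counts["updates"] - counts["dir_hits"])
```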
il1.accesses              224882588 # total number of accesses
il1.hits                  210153246 # total number of hits
il1.misses                 14729342 # total number of misses
il1.replacements           14728830 # total number of replacements
il1.writebacks                    0 # total number of writebacks
il1.invalidations                 0 # total number of invalidations
il1.miss_rate                0.0655 # miss rate (i.e., misses/ref)
il1.repl_rate                0.0655 # replacement rate (i.e., repls/ref)
il1.wb_rate                  0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                 0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses               71738991 # total number of accesses
dl1.hits                   70588822 # total number of hits
dl1.misses                  1150169 # total number of misses
dl1.replacements            1149657 # total number of replacements
dl1.writebacks               351145 # total number of writebacks
dl1.invalidations                 0 # total number of invalidations
dl1.miss_rate                0.0160 # miss rate (i.e., misses/ref)
dl1.repl_rate                0.0160 # replacement rate (i.e., repls/ref)
dl1.wb_rate                  0.0049 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate                 0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses               16230656 # total number of accesses
ul2.hits                   15915676 # total number of hits
ul2.misses                   314980 # total number of misses
ul2.replacements             310884 # total number of replacements
ul2.writebacks                57635 # total number of writebacks
ul2.invalidations                 0 # total number of invalidations
ul2.miss_rate                0.0194 # miss rate (i.e., misses/ref)
ul2.repl_rate                0.0192 # replacement rate (i.e., repls/ref)
ul2.wb_rate                  0.0036 # writeback rate (i.e., wrbks/ref)
ul2.inv_rate                 0.0000 # invalidation rate (i.e., invs/ref)
itlb.accesses             224882588 # total number of accesses
itlb.hits                 224791856 # total number of hits
itlb.misses                   90732 # total number of misses
itlb.replacements             90668 # total number of replacements
itlb.writebacks                   0 # total number of writebacks
itlb.invalidations                0 # total number of invalidations
itlb.miss_rate               0.0004 # miss rate (i.e., misses/ref)
itlb.repl_rate               0.0004 # replacement rate (i.e., repls/ref)
itlb.wb_rate                 0.0000 # writeback rate (i.e., wrbks/ref)
itlb.inv_rate                0.0000 # invalidation rate (i.e., invs/ref)
dtlb.accesses              72156405 # total number of accesses
dtlb.hits                  72154217 # total number of hits
dtlb.misses                    2188 # total number of misses
dtlb.replacements              2060 # total number of replacements
dtlb.writebacks                   0 # total number of writebacks
dtlb.invalidations                0 # total number of invalidations
dtlb.miss_rate               0.0000 # miss rate (i.e., misses/ref)
dtlb.repl_rate               0.0000 # replacement rate (i.e., repls/ref)
dtlb.wb_rate                 0.0000 # writeback rate (i.e., wrbks/ref)
dtlb.inv_rate                0.0000 # invalidation rate (i.e., invs/ref)
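
The cache counters above can be cross-checked with simple arithmetic. The Python sketch below recomputes the miss rates (misses divided by accesses, per the stat descriptions) and tests two relations that hold for the numbers in this log: hits plus misses equals accesses, and the unified L2 access count equals the L1 instruction misses plus the L1 data misses plus the L1 data writebacks. The second relation is an observation about this run, not a documented guarantee.

```python
# Counts copied from the il1, dl1 and ul2 statistics above.
caches = {
    "il1": {"accesses": 224882588, "hits": 210153246,
            "misses": 14729342, "writebacks": 0},
    "dl1": {"accesses": 71738991, "hits": 70588822,
            "misses": 1150169, "writebacks": 351145},
    "ul2": {"accesses": 16230656, "hits": 15915676,
            "misses": 314980, "writebacks": 57635},
}


def miss_rate(cache):
    """Misses per access, or 0.0 for a cache that was never accessed."""
    return cache["misses"] / cache["accesses"] if cache["accesses"] else 0.0


for name, cache in caches.items():
    balanced = cache["hits"] + cache["misses"] == cache["accesses"]
    print(f"{name}.miss_rate {miss_rate(cache):.4f} balanced={balanced}")

# Traffic leaving the L1 caches: instruction misses, data misses, writebacks.
l2_traffic = (caches["il1"]["misses"]
              + caches["dl1"]["misses"]
              + caches["dl1"]["writebacks"])
print("L1 traffic:", l2_traffic, "ul2.accesses:", caches["ul2"]["accesses"])
```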
rename_power           1934584.4292 # total power usage of rename unit
bpred_power            30403550.1688 # total power usage of bpred unit
window_power           9575569.0703 # total power usage of instruction window
lsq_power              4221547.5322 # total power usage of load/store queue
regfile_power          14971664.2742 # total power usage of arch. regfile
icache_power           22613675.4127 # total power usage of icache
dcache_power           91352329.0037 # total power usage of dcache
dcache2_power          40299014.4492 # total power usage of dcache2
alu_power              113066329.8702 # total power usage of alu
falu_power             85248423.5077 # total power usage of falu
resultbus_power        11602591.6225 # total power usage of resultbus
clock_power            226342084.9251 # total power usage of clock
avg_rename_power             0.0104 # avg power usage of rename unit
avg_bpred_power              0.1639 # avg power usage of bpred unit
avg_window_power             0.0516 # avg power usage of instruction window
avg_lsq_power                0.0228 # avg power usage of lsq
avg_regfile_power            0.0807 # avg power usage of arch. regfile
avg_icache_power             0.1219 # avg power usage of icache
avg_dcache_power             0.4924 # avg power usage of dcache
avg_dcache2_power            0.2172 # avg power usage of dcache2
avg_alu_power                0.6095 # avg power usage of alu
avg_falu_power               0.4595 # avg power usage of falu
avg_resultbus_power          0.0625 # avg power usage of resultbus
avg_clock_power              1.2201 # avg power usage of clock
fetch_stage_power      53017225.5815 # total power usage of fetch stage
dispatch_stage_power   1934584.4292 # total power usage of dispatch stage
issue_stage_power      270117381.5481 # total power usage of issue stage
avg_fetch_power              0.2858 # average power of fetch unit per cycle
avg_dispatch_power           0.0104 # average power of dispatch unit per cycle
avg_issue_power              1.4560 # average power of issue unit per cycle
total_power            566382940.7581 # total power per cycle
avg_total_power_cycle        3.0530 # average total power per cycle
avg_total_power_cycle_nofp_nod2       2.3762 # average total power per cycle
avg_total_power_insn         2.8163 # average total power per insn
avg_total_power_insn_nofp_nod2       2.1920 # average total power per insn
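
The aggregate power figures above can be reproduced from the per-unit totals. The Python sketch below is a consistency check on this log only; the groupings are inferred from the numbers and are assumptions, not statements about the simulator source. In this run `total_power` equals the sum of the per-unit totals with `falu_power` left out (presumably because `alu_power` already covers it), the per-instruction average matches a division by `sim_total_insn`, and the fetch and issue stage totals match the sums shown.

```python
# Accumulated per-unit totals copied from the statistics above.
unit = {
    "rename_power": 1934584.4292,
    "bpred_power": 30403550.1688,
    "window_power": 9575569.0703,
    "lsq_power": 4221547.5322,
    "regfile_power": 14971664.2742,
    "icache_power": 22613675.4127,
    "dcache_power": 91352329.0037,
    "dcache2_power": 40299014.4492,
    "alu_power": 113066329.8702,
    "falu_power": 85248423.5077,
    "resultbus_power": 11602591.6225,
    "clock_power": 226342084.9251,
}
sim_cycle = 185517323        # total simulation time in cycles
sim_total_insn = 201110718   # instructions executed, including mis-speculated

# Inferred grouping: every unit except falu_power.
total_power = sum(v for k, v in unit.items() if k != "falu_power")
fetch_stage = unit["icache_power"] + unit["bpred_power"]
issue_stage = sum(unit[k] for k in ("alu_power", "dcache_power",
                                    "dcache2_power", "window_power",
                                    "lsq_power", "resultbus_power"))

print(f"total_power           {total_power:.4f}")
print(f"avg_total_power_cycle {total_power / sim_cycle:.4f}")
print(f"avg_total_power_insn  {total_power / sim_total_insn:.4f}")
print(f"fetch_stage_power     {fetch_stage:.4f}")
print(f"issue_stage_power     {issue_stage:.4f}")
```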
rename_power_cc1        735624.1756 # total power usage of rename unit_cc1
bpred_power_cc1        4864984.1449 # total power usage of bpred unit_cc1
window_power_cc1       6139595.1503 # total power usage of instruction window_cc1
lsq_power_cc1           588970.8488 # total power usage of lsq_cc1
regfile_power_cc1      7248538.0263 # total power usage of arch. regfile_cc1
icache_power_cc1       10311463.2669 # total power usage of icache_cc1
dcache_power_cc1       24220616.6187 # total power usage of dcache_cc1
dcache2_power_cc1      3435783.7953 # total power usage of dcache2_cc1
alu_power_cc1          12386459.8164 # total power usage of alu_cc1
resultbus_power_cc1    5383035.7028 # total power usage of resultbus_cc1
clock_power_cc1        53604695.6071 # total power usage of clock_cc1
avg_rename_power_cc1         0.0040 # avg power usage of rename unit_cc1
avg_bpred_power_cc1          0.0262 # avg power usage of bpred unit_cc1
avg_window_power_cc1         0.0331 # avg power usage of instruction window_cc1
avg_lsq_power_cc1            0.0032 # avg power usage of lsq_cc1
avg_regfile_power_cc1        0.0391 # avg power usage of arch. regfile_cc1
avg_icache_power_cc1         0.0556 # avg power usage of icache_cc1
avg_dcache_power_cc1         0.1306 # avg power usage of dcache_cc1
avg_dcache2_power_cc1        0.0185 # avg power usage of dcache2_cc1
avg_alu_power_cc1            0.0668 # avg power usage of alu_cc1
avg_resultbus_power_cc1       0.0290 # avg power usage of resultbus_cc1
avg_clock_power_cc1          0.2889 # avg power usage of clock_cc1
fetch_stage_power_cc1  15176447.4119 # total power usage of fetch stage_cc1
dispatch_stage_power_cc1  735624.1756 # total power usage of dispatch stage_cc1
issue_stage_power_cc1  52154461.9323 # total power usage of issue stage_cc1
avg_fetch_power_cc1          0.0818 # average power of fetch unit per cycle_cc1
avg_dispatch_power_cc1       0.0040 # average power of dispatch unit per cycle_cc1
avg_issue_power_cc1          0.2811 # average power of issue unit per cycle_cc1
total_power_cycle_cc1  128919767.1532 # total power per cycle_cc1
avg_total_power_cycle_cc1       0.6949 # average total power per cycle_cc1
avg_total_power_insn_cc1       0.6410 # average total power per insn_cc1
rename_power_cc2        523332.7492 # total power usage of rename unit_cc2
bpred_power_cc2        2895415.8287 # total power usage of bpred unit_cc2
window_power_cc2       3720395.9782 # total power usage of instruction window_cc2
lsq_power_cc2           405005.8616 # total power usage of lsq_cc2
regfile_power_cc2      1546166.4493 # total power usage of arch. regfile_cc2
icache_power_cc2       10311463.2669 # total power usage of icache_cc2
dcache_power_cc2       17659259.4487 # total power usage of dcache_cc2
dcache2_power_cc2      1760957.9237 # total power usage of dcache2_cc2
alu_power_cc2          6790085.5855 # total power usage of alu_cc2
resultbus_power_cc2    2999332.8551 # total power usage of resultbus_cc2
clock_power_cc2        34848590.7313 # total power usage of clock_cc2
avg_rename_power_cc2         0.0028 # avg power usage of rename unit_cc2
avg_bpred_power_cc2          0.0156 # avg power usage of bpred unit_cc2
avg_window_power_cc2         0.0201 # avg power usage of instruction window_cc2
avg_lsq_power_cc2            0.0022 # avg power usage of instruction lsq_cc2
avg_regfile_power_cc2        0.0083 # avg power usage of arch. regfile_cc2
avg_icache_power_cc2         0.0556 # avg power usage of icache_cc2
avg_dcache_power_cc2         0.0952 # avg power usage of dcache_cc2
avg_dcache2_power_cc2        0.0095 # avg power usage of dcache2_cc2
avg_alu_power_cc2            0.0366 # avg power usage of alu_cc2
avg_resultbus_power_cc2       0.0162 # avg power usage of resultbus_cc2
avg_clock_power_cc2          0.1878 # avg power usage of clock_cc2
fetch_stage_power_cc2  13206879.0957 # total power usage of fetch stage_cc2
dispatch_stage_power_cc2  523332.7492 # total power usage of dispatch stage_cc2
issue_stage_power_cc2  33335037.6527 # total power usage of issue stage_cc2
avg_fetch_power_cc2          0.0712 # average power of fetch unit per cycle_cc2
avg_dispatch_power_cc2       0.0028 # average power of dispatch unit per cycle_cc2
avg_issue_power_cc2          0.1797 # average power of issue unit per cycle_cc2
total_power_cycle_cc2  83460006.6781 # total power per cycle_cc2
avg_total_power_cycle_cc2       0.4499 # average total power per cycle_cc2
avg_total_power_insn_cc2       0.4150 # average total power per insn_cc2
rename_power_cc3        643228.7741 # total power usage of rename unit_cc3
bpred_power_cc3        5453683.6554 # total power usage of bpred unit_cc3
window_power_cc3       4040480.1416 # total power usage of instruction window_cc3
lsq_power_cc3           765929.3461 # total power usage of lsq_cc3
regfile_power_cc3      2254950.9162 # total power usage of arch. regfile_cc3
icache_power_cc3       11541684.4921 # total power usage of icache_cc3
dcache_power_cc3       24428759.4779 # total power usage of dcache_cc3
dcache2_power_cc3      5447529.1364 # total power usage of dcache2_cc3
alu_power_cc3          16858072.6744 # total power usage of alu_cc3
resultbus_power_cc3    3606322.7683 # total power usage of resultbus_cc3
clock_power_cc3        52088020.6044 # total power usage of clock_cc3
avg_rename_power_cc3         0.0035 # avg power usage of rename unit_cc3
avg_bpred_power_cc3          0.0294 # avg power usage of bpred unit_cc3
avg_window_power_cc3         0.0218 # avg power usage of instruction window_cc3
avg_lsq_power_cc3            0.0041 # avg power usage of instruction lsq_cc3
avg_regfile_power_cc3        0.0122 # avg power usage of arch. regfile_cc3
avg_icache_power_cc3         0.0622 # avg power usage of icache_cc3
avg_dcache_power_cc3         0.1317 # avg power usage of dcache_cc3
avg_dcache2_power_cc3        0.0294 # avg power usage of dcache2_cc3
avg_alu_power_cc3            0.0909 # avg power usage of alu_cc3
avg_resultbus_power_cc3       0.0194 # avg power usage of resultbus_cc3
avg_clock_power_cc3          0.2808 # avg power usage of clock_cc3
fetch_stage_power_cc3  16995368.1475 # total power usage of fetch stage_cc3
dispatch_stage_power_cc3  643228.7741 # total power usage of dispatch stage_cc3
issue_stage_power_cc3  55147093.5448 # total power usage of issue stage_cc3
avg_fetch_power_cc3          0.0916 # average power of fetch unit per cycle_cc3
avg_dispatch_power_cc3       0.0035 # average power of dispatch unit per cycle_cc3
avg_issue_power_cc3          0.2973 # average power of issue unit per cycle_cc3
total_power_cycle_cc3  127128661.9870 # total power per cycle_cc3
avg_total_power_cycle_cc3       0.6853 # average total power per cycle_cc3
avg_total_power_insn_cc3       0.6321 # average total power per insn_cc3
total_rename_access       200485660 # total number accesses of rename unit
total_bpred_access         35305432 # total number accesses of bpred unit
total_window_access       729804749 # total number accesses of instruction window
total_lsq_access           73117456 # total number accesses of load/store queue
total_regfile_access      254805585 # total number accesses of arch. regfile
total_icache_access       225608434 # total number accesses of icache
total_dcache_access        71738991 # total number accesses of dcache
total_dcache2_access       16230656 # total number accesses of dcache2
total_alu_access          180946593 # total number accesses of alu
total_resultbus_access    196891599 # total number accesses of resultbus
avg_rename_access            1.0807 # avg number accesses of rename unit
avg_bpred_access             0.1903 # avg number accesses of bpred unit
avg_window_access            3.9339 # avg number accesses of instruction window
avg_lsq_access               0.3941 # avg number accesses of lsq
avg_regfile_access           1.3735 # avg number accesses of arch. regfile
avg_icache_access            1.2161 # avg number accesses of icache
avg_dcache_access            0.3867 # avg number accesses of dcache
avg_dcache2_access           0.0875 # avg number accesses of dcache2
avg_alu_access               0.9754 # avg number accesses of alu
avg_resultbus_access         1.0613 # avg number accesses of resultbus
max_rename_access                 4 # max number accesses of rename unit
max_bpred_access                  4 # max number accesses of bpred unit
max_window_access                17 # max number accesses of instruction window
max_lsq_access                    6 # max number accesses of load/store queue
max_regfile_access               12 # max number accesses of arch. regfile
max_icache_access                 4 # max number accesses of icache
max_dcache_access                 4 # max number accesses of dcache
max_dcache2_access                7 # max number accesses of dcache2
max_alu_access                    4 # max number accesses of alu
max_resultbus_access              8 # max number accesses of resultbus
max_cycle_power_cc1         10.6408 # maximum cycle power usage of cc1
max_cycle_power_cc2          8.4431 # maximum cycle power usage of cc2
max_cycle_power_cc3          9.5628 # maximum cycle power usage of cc3
parasitic_power_cc1    92959457.1091 # parasitic power cc1
parasitic_power_cc2    92959457.1091 # parasitic power cc2
parasitic_power_cc3    92959457.1091 # parasitic power cc3
min amperage                 0.0000 # min amperage
max amperage                 5.0331 # max amperage
slow_cycles            185467323.0000 # slow cycles
fast_cycles                  0.0000 # fast cycles
sim_invalid_addrs                 0 # total non-speculative bogus addresses seen (debug var)
ld_text_base             0x00400000 # program text (code) segment base
ld_text_size                2485696 # program text (code) size in bytes
ld_data_base             0x10000000 # program initialized data segment base
ld_data_size                 287696 # program init'ed `.data' and uninit'ed `.bss' size in bytes
ld_stack_base            0x7fffc000 # program stack segment base (highest address in stack)
ld_stack_size                 16384 # program initial stack size
ld_prog_entry            0x00400140 # program entry point (initial PC)
ld_environ_base          0x7fff8000 # program environment base address address
ld_target_big_endian              0 # target executable endian-ness, non-zero if big endian
mem.page_count                  873 # total number of pages allocated
mem.page_mem                  3492k # total size of memory pages allocated
mem.ptab_misses                9978 # total first level page table misses
mem.ptab_accesses         609997197 # total page table accesses
mem.ptab_miss_rate           0.0000 # first level page table miss rate

    
Cache Parameters:
  Size in bytes: 16384
  Number of sets: 512
  Associativity: 1
  Block Size (bytes): 32

    
Access Time: 6.07496e-09
Cycle Time:  7.99836e-09

    
Best Ndwl (L1): 2
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 2
Best Ntspd (L1): 2

    
Time Components:
 data side (with Output driver) (ns): 6.07496
 tag side (ns): 6.05737
 decode_data (ns): 2.92313
 wordline_data (ns): 1.32956
 bitline_data (ns): 0.452976
 sense_amp_data (ns): 0.58
 decode_tag (ns): 1.84499
 wordline_tag (ns): 0.825016
 bitline_tag (ns): 0.252886
 sense_amp_tag (ns): 0.26
 compare (ns): 2.30022
 valid signal driver (ns): 0.574251
 data output driver (ns): 0.789293
 total data path (with output driver) (ns): 5.28567
 total tag path is dm (ns): 6.05737
 precharge time (ns): 1.92339

    
Cache Parameters:
  Size in bytes: 16384
  Number of sets: 128
  Associativity: 4
  Block Size (bytes): 32

    
Access Time: 9.14093e-09
Cycle Time:  1.11718e-08

    
Best Ndwl (L1): 4
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 2
Best Ntspd (L1): 1

    
Time Components:
 data side (with Output driver) (ns): 6.05114
 tag side (ns): 7.98848
 decode_data (ns): 2.92572
 wordline_data (ns): 1.437
 bitline_data (ns): -0.0440331
 sense_amp_data (ns): 0.58
 decode_tag (ns): 1.46851
 wordline_tag (ns): 1.27791
 bitline_tag (ns): -0.0315811
 sense_amp_tag (ns): 0.26
 compare (ns): 2.29478
 mux driver (ns): 2.37376
 sel inverter (ns): 0.345094
 data output driver (ns): 1.15245
 total data path (with output driver) (ns): 4.89869
 total tag path is set assoc (ns): 7.98848
 precharge time (ns): 2.03083

    
Cache Parameters:
  Size in bytes: 262144
  Number of sets: 1024
  Associativity: 4
  Block Size (bytes): 64

    
Access Time: 1.44948e-08
Cycle Time:  1.76863e-08

    
Best Ndwl (L1): 2
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 4
Best Ntspd (L1): 1

    
Time Components:
 data side (with Output driver) (ns): 11.3269
 tag side (ns): 12.2049
 decode_data (ns): 4.99158
 wordline_data (ns): 2.59771
 bitline_data (ns): 0.867749
 sense_amp_data (ns): 0.58
 decode_tag (ns): 4.52586
 wordline_tag (ns): 1.24192
 bitline_tag (ns): 0.46158
 sense_amp_tag (ns): 0.26
 compare (ns): 2.17054
 mux driver (ns): 3.21212
 sel inverter (ns): 0.332908
 data output driver (ns): 2.28987
 total data path (with output driver) (ns): 9.03704
 total tag path is set assoc (ns): 12.2049
 precharge time (ns): 3.19154