sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of September, 1998.
Copyright (c) 1994-1998 by Todd M. Austin. All Rights Reserved.


Processor Parameters:
Issue Width: 4
Window Size: 16
Number of Virtual Registers: 32
Number of Physical Registers: 16
Datapath Width: 64
Total Power Consumption: 22.105
Branch Predictor Power Consumption: 1.14342 (5.17%)
  branch target buffer power (W): 1.04097
  local predict power (W): 0.0275244
  global predict power (W): 0.031332
  chooser power (W): 0.0206036
  RAS power (W): 0.0229956
Rename Logic Power Consumption: 0.0887797 (0.402%)
  Instruction Decode Power (W): 0.0038821
  RAT decode_power (W): 0.0273861
  RAT wordline_power (W): 0.00645964
  RAT bitline_power (W): 0.0486255
  DCL Comparators (W): 0.0024263
Instruction Window Power Consumption: 0.517536 (2.34%)
  tagdrive (W): 0.0186418
  tagmatch (W): 0.00697769
  Selection Logic (W): 0.00331194
  decode_power (W): 0.0131921
  wordline_power (W): 0.0176803
  bitline_power (W): 0.457732
Load/Store Queue Power Consumption: 0.201758 (0.913%)
  tagdrive (W): 0.0854673
  tagmatch (W): 0.0207657
  decode_power (W): 0.00194105
  wordline_power (W): 0.00302882
  bitline_power (W): 0.0905553
Arch. Register File Power Consumption: 0.769909 (3.48%)
  decode_power (W): 0.0273861
  wordline_power (W): 0.0176803
  bitline_power (W): 0.724843
Result Bus Power Consumption: 0.499392 (2.26%)
Total Clock Power: 10.1199 (45.8%)
Int ALU Power: 1.19732 (5.42%)
FP ALU Power: 3.66922 (16.6%)
Instruction Cache Power Consumption: 0.614638 (2.78%)
  decode_power (W): 0.186809
  wordline_power (W): 0.00542611
  bitline_power (W): 0.231588
  senseamp_power (W): 0.07296
  tagarray_power (W): 0.117856
Itlb_power (W): 0.0565504 (0.256%)
Data Cache Power Consumption: 1.80232 (8.15%)
  decode_power (W): 0.15387
  wordline_power (W): 0.0368784
  bitline_power (W): 0.749615
  senseamp_power (W): 0.58368
  tagarray_power (W): 0.278274
Dtlb_power (W): 0.193103 (0.874%)
Level 2 Cache Power Consumption: 1.23116 (5.57%)
  decode_power (W): 0.0990259
  wordline_power (W): 0.00799512
  bitline_power (W): 0.83087
  senseamp_power (W): 0.14592
  tagarray_power (W): 0.147353
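
The per-unit figures in the power breakdown can be cross-checked by summing the top-level components and recomputing each unit's share. The sketch below is only an illustration: the dictionary keys are labels chosen here, and the values are copied from the breakdown. The sum comes to about 22.105 W, which agrees with the `avg_total_power_cycle` statistic reported further down in this log.

```python
# Top-level component powers (W), copied from the breakdown in this log.
# The dictionary keys are illustrative labels, not simulator identifiers.
components = {
    "branch predictor": 1.14342,
    "rename logic": 0.0887797,
    "instruction window": 0.517536,
    "load/store queue": 0.201758,
    "arch. register file": 0.769909,
    "result bus": 0.499392,
    "clock": 10.1199,
    "int ALU": 1.19732,
    "FP ALU": 3.66922,
    "instruction cache": 0.614638,
    "ITLB": 0.0565504,
    "data cache": 1.80232,
    "DTLB": 0.193103,
    "L2 cache": 1.23116,
}

# Total power is the plain sum of the top-level components.
total = sum(components.values())

# Each unit's share of the total, in percent.
shares = {name: 100.0 * watts / total for name, watts in components.items()}

print(f"total = {total:.3f} W")
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {pct:6.2f}%")
```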
sim: command line: ./sim-outorder -max:inst 10000000 bzip200.O2unroll.gcc.100M.ss input.source

sim: simulation started @ Mon Nov 30 12:03:47 2009, options follow:

sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config # load configuration from a file
# -dumpconfig # dump configuration to a file
# -h false # print help message
# -v false # verbose operation
# -d false # enable debug message
# -i false # start in Dlite debugger
-seed 1 # random number generator seed (0 for timer seed)
# -q false # initialize and terminate immediately
# -chkpt <null> # restore EIO trace execution from <fname>
# -redir:sim <null> # redirect simulator output to file (non-interactive only)
# -redir:prog <null> # redirect simulated program output to file
-nice 0 # simulator scheduling priority
-max:inst 10000000 # maximum number of inst's to execute
-fastfwd 0 # number of insts skipped before timing starts
# -ptrace <null> # generate pipetrace, i.e., <fname|stdout|stderr> <range>
-fetch:ifqsize 4 # instruction fetch queue size (in insts)
-fetch:mplat 3 # extra branch mis-prediction latency
-fetch:speed 1 # speed of front-end of machine relative to execution core
-bpred bimod # branch predictor type {nottaken|taken|perfect|bimod|2lev|comb}
-bpred:bimod 2048 # bimodal predictor config (<table size>)
-bpred:2lev 1 1024 8 0 # 2-level predictor config (<l1size> <l2size> <hist_size> <xor>)
-bpred:comb 1024 # combining predictor config (<meta_table_size>)
-bpred:ras 8 # return address stack size (0 for no return stack)
-bpred:btb 512 4 # BTB config (<num_sets> <associativity>)
# -bpred:spec_update <null> # speculative predictors update in {ID|WB} (default non-spec)
-decode:width 4 # instruction decode B/W (insts/cycle)
-issue:width 4 # instruction issue B/W (insts/cycle)
-issue:inorder false # run pipeline with in-order issue
-issue:wrongpath true # issue instructions down wrong execution paths
-commit:width 4 # instruction commit B/W (insts/cycle)
-ruu:size 16 # register update unit (RUU) size
-lsq:size 8 # load/store queue (LSQ) size
-cache:dl1 dl1:128:32:4:l # l1 data cache config, i.e., {<config>|none}
-cache:dl1lat 1 # l1 data cache hit latency (in cycles)
-cache:dl2 ul2:1024:64:4:l # l2 data cache config, i.e., {<config>|none}
-cache:dl2lat 6 # l2 data cache hit latency (in cycles)
-cache:il1 il1:512:32:1:l # l1 inst cache config, i.e., {<config>|dl1|dl2|none}
-cache:il1lat 1 # l1 instruction cache hit latency (in cycles)
-cache:il2 dl2 # l2 instruction cache config, i.e., {<config>|dl2|none}
-cache:il2lat 6 # l2 instruction cache hit latency (in cycles)
-cache:flush false # flush caches on system calls
-cache:icompress false # convert 64-bit inst addresses to 32-bit inst equivalents
-mem:lat 18 2 # memory access latency (<first_chunk> <inter_chunk>)
-mem:width 8 # memory access bus width (in bytes)
-tlb:itlb itlb:16:4096:4:l # instruction TLB config, i.e., {<config>|none}
-tlb:dtlb dtlb:32:4096:4:l # data TLB config, i.e., {<config>|none}
-tlb:lat 30 # inst/data TLB miss latency (in cycles)
-res:ialu 4 # total number of integer ALU's available
-res:imult 1 # total number of integer multiplier/dividers available
-res:memport 2 # total number of memory system ports available (to CPU)
-res:fpalu 4 # total number of floating point ALU's available
-res:fpmult 1 # total number of floating point multiplier/dividers available
# -pcstat <null> # profile stat(s) against text addr's (mult uses ok)
-bugcompat false # operate in backward-compatible bugs mode (for testing only)

Pipetrace range arguments are formatted as follows:

  {{@|#}<start>}:{{@|#|+}<end>}

Both ends of the range are optional, if neither are specified, the entire
execution is traced. Ranges that start with a `@' designate an address
range to be traced, those that start with an `#' designate a cycle count
range. All other range values represent an instruction count range. The
second argument, if specified with a `+', indicates a value relative
to the first argument, e.g., 1000:+100 == 1000:1100. Program symbols may
be used in all contexts.

  Examples: -ptrace FOO.trc #0:#1000
            -ptrace BAR.trc @2000:
            -ptrace BLAH.trc :1500
            -ptrace UXXE.trc :
            -ptrace FOOBAR.trc @main:+278
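
The range grammar described above can be illustrated with a small parser. This is a sketch written for this note, not code from the simulator: `RangeEnd`, `_parse_end`, and `parse_ptrace_range` are hypothetical names, values are kept as text because program symbols are allowed, and the choice to let a `+` end inherit the kind of the start is an assumption.

```python
from typing import NamedTuple, Optional, Tuple


class RangeEnd(NamedTuple):
    kind: str       # "address", "cycle", or "instruction"
    value: str      # kept as text, since program symbols such as "main" are allowed
    relative: bool  # True when the end was written with a leading '+'


def _parse_end(text: str, start: Optional[RangeEnd], is_end: bool) -> Optional[RangeEnd]:
    """Parse one side of the range; an empty side means 'unbounded'."""
    if not text:
        return None
    if text[0] == "@":
        return RangeEnd("address", text[1:], False)
    if text[0] == "#":
        return RangeEnd("cycle", text[1:], False)
    if text[0] == "+" and is_end:
        # Assumption: a relative end takes the kind of the start it is relative to.
        kind = start.kind if start else "instruction"
        return RangeEnd(kind, text[1:], True)
    return RangeEnd("instruction", text, False)


def parse_ptrace_range(arg: str) -> Tuple[Optional[RangeEnd], Optional[RangeEnd]]:
    """Split '<start>:<end>' and classify each side."""
    start_text, sep, end_text = arg.partition(":")
    if not sep:
        raise ValueError(f"expected '<start>:<end>', got {arg!r}")
    start = _parse_end(start_text, None, is_end=False)
    end = _parse_end(end_text, start, is_end=True)
    return start, end


for example in ("#0:#1000", "@2000:", ":1500", ":", "@main:+278"):
    print(example, "->", parse_ptrace_range(example))
```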

Branch predictor configuration examples for 2-level predictor:
  Configurations: N, M, W, X
    N # entries in first level (# of shift register(s))
    W width of shift register(s)
    M # entries in 2nd level (# of counters, or other FSM)
    X (yes-1/no-0) xor history and address for 2nd level index
  Sample predictors:
    GAg : 1, W, 2^W, 0
    GAp : 1, W, M (M > 2^W), 0
    PAg : N, W, 2^W, 0
    PAp : N, W, M (M == 2^(N+W)), 0
    gshare : 1, W, 2^W, 1
Predictor `comb' combines a bimodal and a 2-level predictor.
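
The sample predictor list can be read as a small classification rule over the four configuration values. The function below is a hypothetical helper written for illustration, taking its arguments in the command-line order `<l1size> <l2size> <hist_size> <xor>` (that is, N, M, W, X). It only covers the five samples named above and reports anything else as "other".

```python
def classify_two_level(n: int, m: int, w: int, xor: int) -> str:
    """Map a 2-level predictor config (N, M, W, X) to one of the sample names."""
    counters_for_history = 2 ** w  # 2^W counters, one per history pattern
    if xor == 1:
        return "gshare" if n == 1 and m == counters_for_history else "other"
    if n == 1:  # a single, global history register
        if m == counters_for_history:
            return "GAg"
        if m > counters_for_history:
            return "GAp"
        return "other"
    # More than one first-level entry: per-address history registers.
    if m == counters_for_history:
        return "PAg"
    if m == 2 ** (n + w):
        return "PAp"
    return "other"


# The -bpred:2lev setting listed in this log is "1 1024 8 0".
print(classify_two_level(1, 1024, 8, 0))
```

With the values from this log, 1024 second-level entries exceed 2^8 = 256, so the listed `-bpred:2lev` setting falls under the GAp sample. The run itself uses `-bpred bimod`, so that setting does not select the active predictor.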

The cache config parameter <config> has the following format:

  <name>:<nsets>:<bsize>:<assoc>:<repl>

  <name> - name of the cache being defined
  <nsets> - number of sets in the cache
  <bsize> - block size of the cache
  <assoc> - associativity of the cache
  <repl> - block replacement strategy, 'l'-LRU, 'f'-FIFO, 'r'-random

  Examples: -cache:dl1 dl1:4096:32:1:l
            -dtlb dtlb:128:4096:32:r

Cache levels can be unified by pointing a level of the instruction cache
hierarchy at the data cache hierarchy using the "dl1" and "dl2" cache
configuration arguments. Most sensible combinations are supported, e.g.,

  A unified l2 cache (il2 is pointed at dl2):
    -cache:il1 il1:128:64:1:l -cache:il2 dl2
    -cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

  Or, a fully unified cache hierarchy (il1 pointed at dl1):
    -cache:il1 dl1
    -cache:dl1 ul1:256:32:1:l -cache:dl2 ul2:1024:64:2:l
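
The `<name>:<nsets>:<bsize>:<assoc>:<repl>` format is easy to decode mechanically. The sketch below uses hypothetical names (`CacheConfig`, `parse_cache_config`) and assumes that capacity is `nsets * bsize * assoc`, which agrees with the "Size in bytes" values printed in the cache parameter blocks at the end of this log.

```python
from dataclasses import dataclass

# Replacement policy codes as listed in the help text.
REPLACEMENT = {"l": "LRU", "f": "FIFO", "r": "random"}


@dataclass(frozen=True)
class CacheConfig:
    name: str
    nsets: int
    bsize: int
    assoc: int
    repl: str

    @property
    def size_bytes(self) -> int:
        # Capacity is sets x block size x ways.
        return self.nsets * self.bsize * self.assoc


def parse_cache_config(spec: str) -> CacheConfig:
    """Parse '<name>:<nsets>:<bsize>:<assoc>:<repl>' into a CacheConfig."""
    parts = spec.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {spec!r}")
    name, nsets, bsize, assoc, repl = parts
    if repl not in REPLACEMENT:
        raise ValueError(f"unknown replacement policy {repl!r}")
    return CacheConfig(name, int(nsets), int(bsize), int(assoc), repl)


# The three cache configs used in this run.
for spec in ("il1:512:32:1:l", "dl1:128:32:4:l", "ul2:1024:64:4:l"):
    cfg = parse_cache_config(spec)
    print(cfg.name, cfg.size_bytes, "bytes,", REPLACEMENT[cfg.repl])
```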



sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn 9998718 # total number of instructions committed
sim_num_refs 7324725 # total number of loads and stores committed
sim_num_loads 3564303 # total number of loads committed
sim_num_stores 3760422.0000 # total number of stores committed
sim_num_branches 496714 # total number of branches committed
sim_elapsed_time 13 # total simulation time in seconds
sim_inst_rate 769132.1538 # simulation speed (in insts/sec)
sim_total_insn 10000001 # total number of instructions executed
sim_total_refs 7324988 # total number of loads and stores executed
sim_total_loads 3564455 # total number of loads executed
sim_total_stores 3760533.0000 # total number of stores executed
sim_total_branches 496900 # total number of branches executed
sim_cycle 11667223 # total simulation time in cycles
sim_IPC 0.8570 # instructions per cycle
sim_CPI 1.1669 # cycles per instruction
sim_exec_BW 0.8571 # total instructions (mis-spec + committed) per cycle
sim_IPB 20.1297 # instruction per branch
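
The derived throughput figures follow directly from the raw counters. As a quick illustration, with the counter values copied from this log, the ratios below reproduce `sim_IPC`, `sim_CPI`, `sim_exec_BW`, and `sim_IPB` to four decimal places.

```python
# Raw counters copied from the statistics in this log.
sim_num_insn = 9998718       # committed instructions
sim_total_insn = 10000001    # executed instructions (committed + mis-speculated)
sim_num_branches = 496714    # committed branches
sim_cycle = 11667223         # simulated cycles

ipc = sim_num_insn / sim_cycle          # instructions per cycle
cpi = sim_cycle / sim_num_insn          # cycles per instruction
exec_bw = sim_total_insn / sim_cycle    # executed instructions per cycle
ipb = sim_num_insn / sim_num_branches   # instructions per branch

print(f"IPC={ipc:.4f} CPI={cpi:.4f} exec_BW={exec_bw:.4f} IPB={ipb:.4f}")
```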
IFQ_count 46562985 # cumulative IFQ occupancy
IFQ_fcount 11602984 # cumulative IFQ full count
ifq_occupancy 3.9909 # avg IFQ occupancy (insn's)
ifq_rate 0.8571 # avg IFQ dispatch rate (insn/cycle)
ifq_latency 4.6563 # avg IFQ occupant latency (cycle's)
ifq_full 0.9945 # fraction of time (cycle's) IFQ was full
RUU_count 103819235 # cumulative RUU occupancy
RUU_fcount 592836 # cumulative RUU full count
ruu_occupancy 8.8984 # avg RUU occupancy (insn's)
ruu_rate 0.8571 # avg RUU dispatch rate (insn/cycle)
ruu_latency 10.3819 # avg RUU occupant latency (cycle's)
ruu_full 0.0508 # fraction of time (cycle's) RUU was full
LSQ_count 90636120 # cumulative LSQ occupancy
LSQ_fcount 10192125 # cumulative LSQ full count
lsq_occupancy 7.7684 # avg LSQ occupancy (insn's)
lsq_rate 0.8571 # avg LSQ dispatch rate (insn/cycle)
lsq_latency 9.0636 # avg LSQ occupant latency (cycle's)
lsq_full 0.8736 # fraction of time (cycle's) LSQ was full
bpred_bimod.lookups 496976 # total number of bpred lookups
bpred_bimod.updates 496713 # total number of updates
bpred_bimod.addr_hits 496383 # total number of address-predicted hits
bpred_bimod.dir_hits 496531 # total number of direction-predicted hits (includes addr-hits)
bpred_bimod.misses 182 # total number of misses
bpred_bimod.jr_hits 260 # total number of address-predicted hits for JR's
bpred_bimod.jr_seen 271 # total number of JR's seen
bpred_bimod.jr_non_ras_hits.PP 3 # total number of address-predicted hits for non-RAS JR's
bpred_bimod.jr_non_ras_seen.PP 6 # total number of non-RAS JR's seen
bpred_bimod.bpred_addr_rate 0.9993 # branch address-prediction rate (i.e., addr-hits/updates)
bpred_bimod.bpred_dir_rate 0.9996 # branch direction-prediction rate (i.e., all-hits/updates)
bpred_bimod.bpred_jr_rate 0.9594 # JR address-prediction rate (i.e., JR addr-hits/JRs seen)
bpred_bimod.bpred_jr_non_ras_rate.PP 0.5000 # non-RAS JR addr-pred rate (ie, non-RAS JR hits/JRs seen)
bpred_bimod.retstack_pushes 291 # total number of address pushed onto ret-addr stack
bpred_bimod.retstack_pops 281 # total number of address popped off of ret-addr stack
bpred_bimod.used_ras.PP 265 # total number of RAS predictions used
bpred_bimod.ras_hits.PP 257 # total number of RAS hits
bpred_bimod.ras_rate.PP 0.9698 # RAS prediction rate (i.e., RAS hits/used RAS)
il1.accesses 10000566 # total number of accesses
il1.hits 9999950 # total number of hits
il1.misses 616 # total number of misses
il1.replacements 237 # total number of replacements
il1.writebacks 0 # total number of writebacks
il1.invalidations 0 # total number of invalidations
il1.miss_rate 0.0001 # miss rate (i.e., misses/ref)
il1.repl_rate 0.0000 # replacement rate (i.e., repls/ref)
il1.wb_rate 0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate 0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses 7324754 # total number of accesses
dl1.hits 6262067 # total number of hits
dl1.misses 1062687 # total number of misses
dl1.replacements 1062175 # total number of replacements
dl1.writebacks 641812 # total number of writebacks
dl1.invalidations 0 # total number of invalidations
dl1.miss_rate 0.1451 # miss rate (i.e., misses/ref)
dl1.repl_rate 0.1450 # replacement rate (i.e., repls/ref)
dl1.wb_rate 0.0876 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate 0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses 1705115 # total number of accesses
ul2.hits 1075115 # total number of hits
ul2.misses 630000 # total number of misses
ul2.replacements 625904 # total number of replacements
ul2.writebacks 417340 # total number of writebacks
ul2.invalidations 0 # total number of invalidations
ul2.miss_rate 0.3695 # miss rate (i.e., misses/ref)
ul2.repl_rate 0.3671 # replacement rate (i.e., repls/ref)
ul2.wb_rate 0.2448 # writeback rate (i.e., wrbks/ref)
ul2.inv_rate 0.0000 # invalidation rate (i.e., invs/ref)
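
The cache counters above are internally consistent, which the following sketch makes explicit. The counter values are copied from this log, and the helper name `miss_rate` is chosen here for illustration. In this run the unified L2 access count equals the L1 instruction misses plus the L1 data misses plus the L1 data writebacks. That is an observation about these numbers, not a claim about how the simulator computes them.

```python
# Counters copied from the il1, dl1 and ul2 statistics in this log.
il1 = {"accesses": 10000566, "hits": 9999950, "misses": 616}
dl1 = {"accesses": 7324754, "hits": 6262067, "misses": 1062687, "writebacks": 641812}
ul2 = {"accesses": 1705115, "hits": 1075115, "misses": 630000}


def miss_rate(cache: dict) -> float:
    """Misses per reference, as in the '<cache>.miss_rate' statistic."""
    return cache["misses"] / cache["accesses"]


# Traffic arriving at the unified L2 from both L1 caches.
l2_traffic = il1["misses"] + dl1["misses"] + dl1["writebacks"]

print(f"dl1 miss rate = {miss_rate(dl1):.4f}")
print(f"ul2 miss rate = {miss_rate(ul2):.4f}")
print(f"L1-generated L2 traffic = {l2_traffic}, ul2.accesses = {ul2['accesses']}")
```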
itlb.accesses 10000566 # total number of accesses
itlb.hits 10000551 # total number of hits
itlb.misses 15 # total number of misses
itlb.replacements 0 # total number of replacements
itlb.writebacks 0 # total number of writebacks
itlb.invalidations 0 # total number of invalidations
itlb.miss_rate 0.0000 # miss rate (i.e., misses/ref)
itlb.repl_rate 0.0000 # replacement rate (i.e., repls/ref)
itlb.wb_rate 0.0000 # writeback rate (i.e., wrbks/ref)
itlb.inv_rate 0.0000 # invalidation rate (i.e., invs/ref)
dtlb.accesses 7324771 # total number of accesses
dtlb.hits 7268817 # total number of hits
dtlb.misses 55954 # total number of misses
dtlb.replacements 55826 # total number of replacements
dtlb.writebacks 0 # total number of writebacks
dtlb.invalidations 0 # total number of invalidations
dtlb.miss_rate 0.0076 # miss rate (i.e., misses/ref)
dtlb.repl_rate 0.0076 # replacement rate (i.e., repls/ref)
dtlb.wb_rate 0.0000 # writeback rate (i.e., wrbks/ref)
dtlb.inv_rate 0.0000 # invalidation rate (i.e., invs/ref)
rename_power 1035812.9798 # total power usage of rename unit
bpred_power 13340565.8411 # total power usage of bpred unit
window_power 6038210.4799 # total power usage of instruction window
lsq_power 2353957.4117 # total power usage of load/store queue
regfile_power 8982703.9897 # total power usage of arch. regfile
icache_power 7830907.5780 # total power usage of icache
dcache_power 23281018.9904 # total power usage of dcache
dcache2_power 14364262.3853 # total power usage of dcache2
alu_power 56778997.7615 # total power usage of alu
falu_power 42809561.7962 # total power usage of falu
resultbus_power 5826522.5626 # total power usage of resultbus
clock_power 118071388.2580 # total power usage of clock
avg_rename_power 0.0888 # avg power usage of rename unit
avg_bpred_power 1.1434 # avg power usage of bpred unit
avg_window_power 0.5175 # avg power usage of instruction window
avg_lsq_power 0.2018 # avg power usage of lsq
avg_regfile_power 0.7699 # avg power usage of arch. regfile
avg_icache_power 0.6712 # avg power usage of icache
avg_dcache_power 1.9954 # avg power usage of dcache
avg_dcache2_power 1.2312 # avg power usage of dcache2
avg_alu_power 4.8665 # avg power usage of alu
avg_falu_power 3.6692 # avg power usage of falu
avg_resultbus_power 0.4994 # avg power usage of resultbus
avg_clock_power 10.1199 # avg power usage of clock
fetch_stage_power 21171473.4191 # total power usage of fetch stage
dispatch_stage_power 1035812.9798 # total power usage of dispatch stage
issue_stage_power 108642969.5914 # total power usage of issue stage
avg_fetch_power 1.8146 # average power of fetch unit per cycle
avg_dispatch_power 0.0888 # average power of dispatch unit per cycle
avg_issue_power 9.3118 # average power of issue unit per cycle
total_power 257904348.2380 # total power per cycle
avg_total_power_cycle 22.1050 # average total power per cycle
avg_total_power_cycle_nofp_nod2 17.2047 # average total power per cycle
avg_total_power_insn 25.7904 # average total power per insn
avg_total_power_insn_nofp_nod2 20.0731 # average total power per insn
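
The averaged power figures can be reproduced from the cumulative totals. The sketch below copies its inputs from this log and rests on two assumptions that the numbers happen to support: the per-instruction average divides by `sim_total_insn` (executed, not committed, instructions), and the `_nofp_nod2` variant excludes `falu_power` and `dcache2_power`.

```python
# Cumulative values copied from this log.
total_power = 257904348.2380
falu_power = 42809561.7962
dcache2_power = 14364262.3853
sim_cycle = 11667223
sim_total_insn = 10000001

# Average total power per cycle and per executed instruction.
per_cycle = total_power / sim_cycle
per_insn = total_power / sim_total_insn

# Assumed meaning of "_nofp_nod2": leave out the FP ALU and the L2 cache.
per_cycle_nofp_nod2 = (total_power - falu_power - dcache2_power) / sim_cycle

print(f"per cycle           = {per_cycle:.4f}")
print(f"per instruction     = {per_insn:.4f}")
print(f"per cycle, no FP/L2 = {per_cycle_nofp_nod2:.4f}")
```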
rename_power_cc1 322969.7602 # total power usage of rename unit_cc1
bpred_power_cc1 567484.6120 # total power usage of bpred unit_cc1
window_power_cc1 2695058.8675 # total power usage of instruction window_cc1
lsq_power_cc1 566371.5895 # total power usage of lsq_cc1
regfile_power_cc1 2476550.9535 # total power usage of arch. regfile_cc1
icache_power_cc1 2442153.4392 # total power usage of icache_cc1
dcache_power_cc1 9380259.4845 # total power usage of dcache_cc1
dcache2_power_cc1 1309090.9086 # total power usage of dcache2_cc1
alu_power_cc1 4882348.4716 # total power usage of alu_cc1
resultbus_power_cc1 2221048.1326 # total power usage of resultbus_cc1
clock_power_cc1 29088334.1749 # total power usage of clock_cc1
avg_rename_power_cc1 0.0277 # avg power usage of rename unit_cc1
avg_bpred_power_cc1 0.0486 # avg power usage of bpred unit_cc1
avg_window_power_cc1 0.2310 # avg power usage of instruction window_cc1
avg_lsq_power_cc1 0.0485 # avg power usage of lsq_cc1
avg_regfile_power_cc1 0.2123 # avg power usage of arch. regfile_cc1
avg_icache_power_cc1 0.2093 # avg power usage of icache_cc1
avg_dcache_power_cc1 0.8040 # avg power usage of dcache_cc1
avg_dcache2_power_cc1 0.1122 # avg power usage of dcache2_cc1
avg_alu_power_cc1 0.4185 # avg power usage of alu_cc1
avg_resultbus_power_cc1 0.1904 # avg power usage of resultbus_cc1
avg_clock_power_cc1 2.4932 # avg power usage of clock_cc1
fetch_stage_power_cc1 3009638.0513 # total power usage of fetch stage_cc1
dispatch_stage_power_cc1 322969.7602 # total power usage of dispatch stage_cc1
issue_stage_power_cc1 21054177.4543 # total power usage of issue stage_cc1
avg_fetch_power_cc1 0.2580 # average power of fetch unit per cycle_cc1
avg_dispatch_power_cc1 0.0277 # average power of dispatch unit per cycle_cc1
avg_issue_power_cc1 1.8046 # average power of issue unit per cycle_cc1
total_power_cycle_cc1 55951670.3942 # total power per cycle_cc1
avg_total_power_cycle_cc1 4.7956 # average total power per cycle_cc1
avg_total_power_insn_cc1 5.5952 # average total power per insn_cc1
rename_power_cc2 221939.4193 # total power usage of rename unit_cc2
bpred_power_cc2 283976.4218 # total power usage of bpred unit_cc2
window_power_cc2 2183478.5414 # total power usage of instruction window_cc2
lsq_power_cc2 332709.0539 # total power usage of lsq_cc2
regfile_power_cc2 984989.2725 # total power usage of arch. regfile_cc2
icache_power_cc2 2442153.4392 # total power usage of icache_cc2
dcache_power_cc2 7307983.0981 # total power usage of dcache_cc2
dcache2_power_cc2 1049637.9156 # total power usage of dcache2_cc2
alu_power_cc2 2992850.4462 # total power usage of alu_cc2
resultbus_power_cc2 1618521.8857 # total power usage of resultbus_cc2
clock_power_cc2 20102892.6252 # total power usage of clock_cc2
avg_rename_power_cc2 0.0190 # avg power usage of rename unit_cc2
avg_bpred_power_cc2 0.0243 # avg power usage of bpred unit_cc2
avg_window_power_cc2 0.1871 # avg power usage of instruction window_cc2
avg_lsq_power_cc2 0.0285 # avg power usage of instruction lsq_cc2
avg_regfile_power_cc2 0.0844 # avg power usage of arch. regfile_cc2
avg_icache_power_cc2 0.2093 # avg power usage of icache_cc2
avg_dcache_power_cc2 0.6264 # avg power usage of dcache_cc2
avg_dcache2_power_cc2 0.0900 # avg power usage of dcache2_cc2
avg_alu_power_cc2 0.2565 # avg power usage of alu_cc2
avg_resultbus_power_cc2 0.1387 # avg power usage of resultbus_cc2
avg_clock_power_cc2 1.7230 # avg power usage of clock_cc2
fetch_stage_power_cc2 2726129.8610 # total power usage of fetch stage_cc2
dispatch_stage_power_cc2 221939.4193 # total power usage of dispatch stage_cc2
issue_stage_power_cc2 15485180.9409 # total power usage of issue stage_cc2
avg_fetch_power_cc2 0.2337 # average power of fetch unit per cycle_cc2
avg_dispatch_power_cc2 0.0190 # average power of dispatch unit per cycle_cc2
avg_issue_power_cc2 1.3272 # average power of issue unit per cycle_cc2
total_power_cycle_cc2 39521132.1188 # total power per cycle_cc2
avg_total_power_cycle_cc2 3.3874 # average total power per cycle_cc2
avg_total_power_insn_cc2 3.9521 # average total power per insn_cc2
rename_power_cc3 293223.7412 # total power usage of rename unit_cc3
bpred_power_cc3 1561288.9463 # total power usage of bpred unit_cc3
window_power_cc3 2512888.3723 # total power usage of instruction window_cc3
lsq_power_cc3 507823.2255 # total power usage of lsq_cc3
regfile_power_cc3 1601192.1431 # total power usage of arch. regfile_cc3
icache_power_cc3 2981028.8535 # total power usage of icache_cc3
dcache_power_cc3 8698101.1525 # total power usage of dcache_cc3
dcache2_power_cc3 2355155.9872 # total power usage of dcache2_cc3
alu_power_cc3 8182515.3756 # total power usage of alu_cc3
resultbus_power_cc3 1976825.7491 # total power usage of resultbus_cc3
clock_power_cc3 28948646.5377 # total power usage of clock_cc3
avg_rename_power_cc3 0.0251 # avg power usage of rename unit_cc3
avg_bpred_power_cc3 0.1338 # avg power usage of bpred unit_cc3
avg_window_power_cc3 0.2154 # avg power usage of instruction window_cc3
avg_lsq_power_cc3 0.0435 # avg power usage of instruction lsq_cc3
avg_regfile_power_cc3 0.1372 # avg power usage of arch. regfile_cc3
avg_icache_power_cc3 0.2555 # avg power usage of icache_cc3
avg_dcache_power_cc3 0.7455 # avg power usage of dcache_cc3
avg_dcache2_power_cc3 0.2019 # avg power usage of dcache2_cc3
avg_alu_power_cc3 0.7013 # avg power usage of alu_cc3
avg_resultbus_power_cc3 0.1694 # avg power usage of resultbus_cc3
avg_clock_power_cc3 2.4812 # avg power usage of clock_cc3
fetch_stage_power_cc3 4542317.7998 # total power usage of fetch stage_cc3
dispatch_stage_power_cc3 293223.7412 # total power usage of dispatch stage_cc3
issue_stage_power_cc3 24233309.8622 # total power usage of issue stage_cc3
avg_fetch_power_cc3 0.3893 # average power of fetch unit per cycle_cc3
avg_dispatch_power_cc3 0.0251 # average power of dispatch unit per cycle_cc3
avg_issue_power_cc3 2.0770 # average power of issue unit per cycle_cc3
total_power_cycle_cc3 59618690.0840 # total power per cycle_cc3
avg_total_power_cycle_cc3 5.1099 # average total power per cycle_cc3
avg_total_power_insn_cc3 5.9619 # average total power per insn_cc3
total_rename_access 9999553 # total number accesses of rename unit
total_bpred_access 496713 # total number accesses of bpred unit
total_window_access 43796129 # total number accesses of instruction window
total_lsq_access 7324796 # total number accesses of load/store queue
total_regfile_access 17532659 # total number accesses of arch. regfile
total_icache_access 10000985 # total number accesses of icache
total_dcache_access 7324754 # total number accesses of dcache
total_dcache2_access 1705115 # total number accesses of dcache2
total_alu_access 9998472 # total number accesses of alu
total_resultbus_access 13066581 # total number accesses of resultbus
avg_rename_access 0.8571 # avg number accesses of rename unit
avg_bpred_access 0.0426 # avg number accesses of bpred unit
avg_window_access 3.7538 # avg number accesses of instruction window
avg_lsq_access 0.6278 # avg number accesses of lsq
avg_regfile_access 1.5027 # avg number accesses of arch. regfile
avg_icache_access 0.8572 # avg number accesses of icache
avg_dcache_access 0.6278 # avg number accesses of dcache
avg_dcache2_access 0.1461 # avg number accesses of dcache2
avg_alu_access 0.8570 # avg number accesses of alu
avg_resultbus_access 1.1199 # avg number accesses of resultbus
max_rename_access 4 # max number accesses of rename unit
max_bpred_access 4 # max number accesses of bpred unit
max_window_access 15 # max number accesses of instruction window
max_lsq_access 4 # max number accesses of load/store queue
max_regfile_access 11 # max number accesses of arch. regfile
max_icache_access 4 # max number accesses of icache
max_dcache_access 4 # max number accesses of dcache
max_dcache2_access 4 # max number accesses of dcache2
max_alu_access 4 # max number accesses of alu
max_resultbus_access 7 # max number accesses of resultbus
max_cycle_power_cc1 12.7721 # maximum cycle power usage of cc1
max_cycle_power_cc2 9.7233 # maximum cycle power usage of cc2
max_cycle_power_cc3 10.7378 # maximum cycle power usage of cc3
parasitic_power_cc1 6787403.2765 # parasitic power cc1
parasitic_power_cc2 6787403.2765 # parasitic power cc2
parasitic_power_cc3 6787403.2765 # parasitic power cc3
min amperage 0.0000 # min amperage
max amperage 5.6515 # max amperage
slow_cycles 0.0000 # slow cycles
fast_cycles 0.0000 # fast cycles
sim_invalid_addrs 0 # total non-speculative bogus addresses seen (debug var)
ld_text_base 0x00400000 # program text (code) segment base
ld_text_size 147104 # program text (code) size in bytes
ld_data_base 0x10000000 # program initialized data segment base
ld_data_size 91520 # program init'ed `.data' and uninit'ed `.bss' size in bytes
ld_stack_base 0x7fffc000 # program stack segment base (highest address in stack)
ld_stack_size 16384 # program initial stack size
ld_prog_entry 0x00400140 # program entry point (initial PC)
ld_environ_base 0x7fff8000 # program environment base address
ld_target_big_endian 0 # target executable endian-ness, non-zero if big endian
mem.page_count 49487 # total number of pages allocated
mem.page_mem 197948k # total size of memory pages allocated
mem.ptab_misses 56335 # total first level page table misses
mem.ptab_accesses 54160690 # total page table accesses
mem.ptab_miss_rate 0.0010 # first level page table miss rate


Cache Parameters:
  Size in bytes: 16384
  Number of sets: 512
  Associativity: 4
  Block Size (bytes): 8

Access Time: 9.27925e-09
Cycle Time: 1.09081e-08

Best Ndwl (L1): 8
Best Ndbl (L1): 1
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 4
Best Ntspd (L1): 1

Time Components:
  data side (with Output driver) (ns): 8.44162
  tag side (ns): 8.55667
  decode_data (ns): 5.29318
  wordline_data (ns): 1.03507
  bitline_data (ns): 0.810785
  sense_amp_data (ns): 0.58
  decode_tag (ns): 2.37065
  wordline_tag (ns): 1.36749
  bitline_tag (ns): 0.158246
  sense_amp_tag (ns): 0.26
  compare (ns): 2.42991
  mux driver (ns): 1.6125
  sel inverter (ns): 0.357877
  data output driver (ns): 0.722579
  total data path (with output driver) (ns): 7.71904
  total tag path is set assoc (ns): 8.55667
  precharge time (ns): 1.6289

Cache Parameters:
  Size in bytes: 16384
  Number of sets: 512
  Associativity: 1
  Block Size (bytes): 32

Access Time: 6.07496e-09
Cycle Time: 7.99836e-09

Best Ndwl (L1): 2
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 2
Best Ntspd (L1): 2

Time Components:
  data side (with Output driver) (ns): 6.07496
  tag side (ns): 6.05737
  decode_data (ns): 2.92313
  wordline_data (ns): 1.32956
  bitline_data (ns): 0.452976
  sense_amp_data (ns): 0.58
  decode_tag (ns): 1.84499
  wordline_tag (ns): 0.825016
  bitline_tag (ns): 0.252886
  sense_amp_tag (ns): 0.26
  compare (ns): 2.30022
  valid signal driver (ns): 0.574251
  data output driver (ns): 0.789293
  total data path (with output driver) (ns): 5.28567
  total tag path is dm (ns): 6.05737
  precharge time (ns): 1.92339

Cache Parameters:
  Size in bytes: 16384
  Number of sets: 128
  Associativity: 4
  Block Size (bytes): 32

Access Time: 9.14093e-09
Cycle Time: 1.11718e-08

Best Ndwl (L1): 4
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 2
Best Ntspd (L1): 1

Time Components:
  data side (with Output driver) (ns): 6.05114
  tag side (ns): 7.98848
  decode_data (ns): 2.92572
  wordline_data (ns): 1.437
  bitline_data (ns): -0.0440331
  sense_amp_data (ns): 0.58
  decode_tag (ns): 1.46851
  wordline_tag (ns): 1.27791
  bitline_tag (ns): -0.0315811
  sense_amp_tag (ns): 0.26
  compare (ns): 2.29478
  mux driver (ns): 2.37376
  sel inverter (ns): 0.345094
  data output driver (ns): 1.15245
  total data path (with output driver) (ns): 4.89869
  total tag path is set assoc (ns): 7.98848
  precharge time (ns): 2.03083

Cache Parameters:
  Size in bytes: 262144
  Number of sets: 1024
  Associativity: 4
  Block Size (bytes): 64

Access Time: 1.44948e-08
Cycle Time: 1.76863e-08

Best Ndwl (L1): 2
Best Ndbl (L1): 2
Best Nspd (L1): 1
Best Ntwl (L1): 1
Best Ntbl (L1): 4
Best Ntspd (L1): 1

Time Components:
  data side (with Output driver) (ns): 11.3269
  tag side (ns): 12.2049
  decode_data (ns): 4.99158
  wordline_data (ns): 2.59771
  bitline_data (ns): 0.867749
  sense_amp_data (ns): 0.58
  decode_tag (ns): 4.52586
  wordline_tag (ns): 1.24192
  bitline_tag (ns): 0.46158
  sense_amp_tag (ns): 0.26
  compare (ns): 2.17054
  mux driver (ns): 3.21212
  sel inverter (ns): 0.332908
  data output driver (ns): 2.28987
  total data path (with output driver) (ns): 9.03704
  total tag path is set assoc (ns): 12.2049
  precharge time (ns): 3.19154
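
The four cache parameter blocks can be sanity-checked against their own time components. The sketch below encodes a relationship observed in this log only, and it should not be read as the timing model's general rule. For the direct-mapped array the access time matches the slower of the data side and the tag side. For the set-associative arrays it matches the slower of the data side and the tag side plus the data output driver. The function name and tuple layout are illustrative.

```python
# One tuple per "Cache Parameters" block, values copied from this log:
# (sets, assoc, block_bytes, data_side_ns, tag_side_ns, output_driver_ns, access_time_s)
caches = [
    (512, 4, 8, 8.44162, 8.55667, 0.722579, 9.27925e-09),
    (512, 1, 32, 6.07496, 6.05737, 0.789293, 6.07496e-09),
    (128, 4, 32, 6.05114, 7.98848, 1.15245, 9.14093e-09),
    (1024, 4, 64, 11.3269, 12.2049, 2.28987, 1.44948e-08),
]


def estimated_access_ns(assoc, data_side, tag_side, driver):
    """Observed relationship in this log (an assumption, not a general rule)."""
    if assoc == 1:
        # Direct-mapped: data is driven out in parallel with the tag check.
        return max(data_side, tag_side)
    # Set-associative: the tag comparison selects the way before the output driver.
    return max(data_side, tag_side + driver)


for sets, assoc, block, data_side, tag_side, driver, reported_s in caches:
    size = sets * assoc * block  # capacity in bytes
    estimate = estimated_access_ns(assoc, data_side, tag_side, driver)
    print(f"{size:7d} B  estimated {estimate:.5f} ns  reported {reported_s * 1e9:.5f} ns")
```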