Well, I'm using the PMU for my analysis.
Let's say that code1 is the original code, with good performance, while
code2 is the "bad" code.
I observe the following:
+------------------------+---------+---------+
|                        |  code1  |  code2  |
+------------------------+---------+---------+
| Instr. cache miss rate | 0.035%  | 0.035%  |
| Data cache miss rate   | 2.735%  | 2.745%  |
| Instr. TLB miss rate   | 0.001%  | 0.001%  |
| Data TLB miss rate     | 0.159%  | 0.339%  |
+------------------------+---------+---------+
The data cache miss rate increases only slightly, while the data TLB miss
rate more than doubles.
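
To make the comparison concrete, here is a small sketch that computes the relative change of each counter from code1 to code2. The numbers are the ones measured above; the dictionary keys and the helper name `relative_increase` are only illustrative.

```python
# Miss rates in percent, copied from the measurements above: (code1, code2).
rates = {
    "instr_cache": (0.035, 0.035),
    "data_cache": (2.735, 2.745),
    "instr_tlb": (0.001, 0.001),
    "data_tlb": (0.159, 0.339),
}


def relative_increase(code1, code2):
    """Return the fractional change from code1 to code2 (0.10 means +10%)."""
    return (code2 - code1) / code1


for name, (c1, c2) in rates.items():
    print(f"{name:12s} {relative_increase(c1, c2):+.1%}")
```

Only the data TLB line changes by a large factor; the data cache line moves by well under one percent in relative terms.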
If the problem were one of data alignment relative to a cache line,
I would expect the data cache miss rate to increase a lot, wouldn't it?
The same goes for the instructions. But I don't see that here.
From this analysis, I would conclude that the performance issue is due
to the number of Data TLB misses.
But I don't see the direct link between this measured effect and the
fact that I added some code which is never executed...
Do you know what the cost of a data TLB miss is?
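
The reason I ask: with a per-miss penalty, the impact could be estimated roughly as below. This is only a back-of-the-envelope sketch. The function name is illustrative, and `penalty_cycles=30` is a placeholder I made up, not a measured or documented value; the real cost depends on the core and on whether the page table walk hits in the cache.

```python
def extra_cycles_per_access(miss_rate_1, miss_rate_2, penalty_cycles):
    """Average extra cycles per data access caused by the additional TLB misses.

    Miss rates are fractions (0.00159 means 0.159%).
    penalty_cycles is an ASSUMED cost of one data TLB miss.
    """
    return (miss_rate_2 - miss_rate_1) * penalty_cycles


# Data TLB miss rates from the measurements: 0.159% (code1) and 0.339% (code2).
# The penalty of 30 cycles is a hypothetical placeholder.
overhead = extra_cycles_per_access(0.00159, 0.00339, penalty_cycles=30)
print(f"extra cycles per data access: {overhead:.3f}")
```

Multiplying this by the number of data accesses counted by the PMU would give the total extra cycles to compare against the observed slowdown.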
Thanks,
Thierry.