Shahaneh
2013-08-07 21:26:43 UTC
ARM11
Differences from ARM9
In terms of instruction set, the ARM11 builds on the preceding ARM9 generation. It incorporates all ARM926EJ-S features and adds the ARMv6 instructions for media support (SIMD) and for accelerating IRQ response.
Microarchitecture improvements in ARM11 cores include:
SIMD instructions which can double MPEG-4 and audio digital signal processing algorithm speed
Cache is physically addressed, solving many cache aliasing problems and reducing context switch overhead
Unaligned and mixed-endian data access is supported
Reduced heat production and lower overheating risk
Redesigned pipeline, supporting faster clock speeds (target up to 1 GHz)
Longer: 8 (vs 5) stages
Out-of-order completion for some operations (e.g. stores)
Dynamic branch prediction/folding (like XScale)
Cache misses don't block execution of non-dependent instructions
Load/store parallelism
ALU parallelism
64-bit data paths
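As a rough illustration of what the ARMv6 SIMD instructions mentioned above do, the following Python sketch models packed 8-bit addition on a 32-bit word, in the spirit of the UADD8 (wrapping) and UQADD8 (saturating) instructions. It is a behavioural model only: the function names are chosen here for illustration, the status flags that the real UADD8 updates are not modelled, and nothing is implied about timing on actual hardware.

```python
LANES = 4          # four 8-bit lanes in one 32-bit register
LANE_MASK = 0xFF


def _lanes(word: int):
    """Split a 32-bit word into four 8-bit lanes, least significant first."""
    return [(word >> (8 * i)) & LANE_MASK for i in range(LANES)]


def _join(lanes) -> int:
    """Reassemble four 8-bit lanes into one 32-bit word."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & LANE_MASK) << (8 * i)
    return word


def uadd8(a: int, b: int) -> int:
    """Lane-wise unsigned add; each lane wraps modulo 256 (flags not modelled)."""
    return _join([(x + y) & LANE_MASK for x, y in zip(_lanes(a), _lanes(b))])


def uqadd8(a: int, b: int) -> int:
    """Lane-wise unsigned add; each lane saturates at 255 instead of wrapping."""
    return _join([min(x + y, LANE_MASK) for x, y in zip(_lanes(a), _lanes(b))])


if __name__ == "__main__":
    print(hex(uadd8(0x01020304, 0x10203040)))
    print(hex(uqadd8(0xFF000001, 0x01000001)))
```

One such operation handles four samples at once and no carry crosses a lane boundary, which is the property that media and audio kernels rely on.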
JTAG debug support (for halting, stepping, breakpoints, and watchpoints) was simplified. The EmbeddedICE module was replaced with an interface which became part of the ARMv7 architecture. The hardware tracing modules (ETM and ETB) are compatible, but updated, versions of those used in the ARM9. In particular, trace semantics were updated to address parallel instruction execution and data transfers.
ARM makes an effort to promote good Verilog coding styles and techniques. This ensures semantically rigorous designs that preserve identical semantics throughout the chip design flow, which includes extensive use of formal verification techniques. Without such attention, integrating an ARM11 with third-party designs could risk exposing hard-to-find latent bugs. Because ARM cores are integrated into many different designs, using a variety of logic synthesis tools and chip manufacturing processes, the impact of their register-transfer level (RTL) quality is magnified many times.[3] The ARM11 generation focused more on synthesis than previous generations, making such concerns more of an issue.
Cores
There are four ARM11 cores:
ARM1136[4]
ARM1156, introduced Thumb-2 instructions
ARM1176, introduced security extensions[5]
ARM11MPCore, introduced multicore support
PICA200
Specification
65 nm single core[7] (max. clock frequency 400 MHz)
Pixel performance: 800 Mpixel/s
1600 Mpixel/s
Vertex performance: 15.3 Mpolygon/s
160 Mtriangle/s
Power consumption: 0.5-1.0 mW/MHz
Frame Buffer max. 4095×4095 pixels
Supported pixel formats: RGBA 4-4-4-4, RGB 5-6-5, RGBA 5-5-5-1, RGBA 8-8-8-8
Vertex program (ARB_vertex_program)
Render-to-Texture
MipMap
Bilinear texture filtering
Alpha blending
Full-scene anti-aliasing (2×2)
Polygon offset
8-bit stencil buffer
24-bit depth buffer
Single/Double/Triple buffer
DMP's MAESTRO-2G technology
per pixel lighting
procedural texture
refraction mapping
subdivision primitive
shadow
gaseous object rendering
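To make the pixel formats and frame buffer limit in the specification list more concrete, here is a minimal Python sketch that packs 8-bit channel values into each listed format and estimates frame buffer size. The bit layout (channels packed most significant first, in the order the format name gives them) and the helper names are assumptions made for illustration; the source does not document how the PICA200 actually lays pixels out in memory.

```python
# Bits per channel, in the order given by the format name.
# The packing order within the word is an assumption for this sketch.
FORMATS = {
    "RGBA4444": (4, 4, 4, 4),
    "RGB565": (5, 6, 5),
    "RGBA5551": (5, 5, 5, 1),
    "RGBA8888": (8, 8, 8, 8),
}


def pack_pixel(fmt: str, channels) -> int:
    """Pack 8-bit channel values (0-255) into one pixel word.

    Each channel is reduced to its target width by dropping low bits.
    """
    widths = FORMATS[fmt]
    if len(channels) != len(widths):
        raise ValueError(f"{fmt} expects {len(widths)} channels")
    word = 0
    for value, width in zip(channels, widths):
        if not 0 <= value <= 255:
            raise ValueError("channel values must be in 0..255")
        word = (word << width) | (value >> (8 - width))
    return word


def bytes_per_pixel(fmt: str) -> int:
    """Total pixel size in bytes for the given format."""
    return sum(FORMATS[fmt]) // 8


def framebuffer_bytes(fmt: str, width: int = 4095, height: int = 4095) -> int:
    """Memory needed for one colour buffer of the given size."""
    return width * height * bytes_per_pixel(fmt)


if __name__ == "__main__":
    print(hex(pack_pixel("RGB565", (255, 0, 0))))
    print(framebuffer_bytes("RGBA8888"))
```

Under these assumptions, a single 4095×4095 buffer in RGBA 8-8-8-8 needs 67,076,100 bytes (roughly 64 MiB), and half that in any of the 16-bit formats. Double or triple buffering multiplies the figure accordingly.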