Chip123 Technology Application Innovation Platform

The Xbox 360 CPU architecture

1#
Posted on 2007-2-9 10:39:56

The Xbox 360 system has a single chip (with 165 million transistors) for its CPU. This chip is in fact a three-way symmetric multiprocessor design. The three PowerPC cores are identical, except that they are physically reflected through the X and Y axes. Each of the CPU cores is a specialized PowerPC chip with a VMX128 extension related to (and partially compatible with) the VMX instructions in the G4 and G5 CPUs. The three CPU cores share a 1MB Level 2 cache. Each core has a 32KB Level 1 data cache and a 32KB Level 1 instruction cache. The chip's front-side bus/physical interface has a 21.6GB/second bandwidth and runs at 5.4GHz. The high-frequency clocks are generated on-chip by four phase-locked loops: two for the core clocks, two for the PHY clock.
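
As a quick sanity check on the figures above, the sketch below derives two numbers the text implies but does not state: the total on-chip cache capacity and the number of bytes moved per interface clock. The second figure assumes, purely for illustration, that 21.6GB/second is an aggregate number and that 1GB means 10^9 bytes. The constant and function names are my own and do not come from the article.

```python
# Figures quoted in the text.
CORES = 3
L1_DATA_KB = 32
L1_INSTR_KB = 32
L2_SHARED_KB = 1024          # 1MB Level 2 cache shared by all cores
FSB_BANDWIDTH_GB_S = 21.6    # assumption: aggregate figure, 1GB = 10**9 bytes
FSB_CLOCK_GHZ = 5.4


def total_cache_kb(cores=CORES):
    """Sum of every core's private L1 caches plus the shared L2, in KB."""
    return cores * (L1_DATA_KB + L1_INSTR_KB) + L2_SHARED_KB


def bytes_per_interface_clock():
    """Bytes per clock implied by dividing bandwidth by clock rate."""
    return FSB_BANDWIDTH_GB_S / FSB_CLOCK_GHZ


print(total_cache_kb())                        # 3 * 64 + 1024 = 1216
print(round(bytes_per_interface_clock(), 3))   # 4.0
```

Under those assumptions the interface moves about four bytes per clock in total; how that splits across lanes and directions is not given in the text.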

The Xbox 360 CPU chip has testing and debug functions, including tracing, configuration control, and performance monitoring features. Access to these functions is through the block in Figure 1 labeled test/debug. The block labeled Miscellaneous IO provides a JTAG port, a POST monitor, and an interface for a serial EEPROM in case patch logic configuration was needed during bring-up.

To improve manufacturing yield, the SRAM arrays used in the L1 and L2 caches support both row and column redundancy. This redundancy is enabled at chip test by burning electronic fuses (eFuses). The eFuses are one of the unique capabilities of the IBM 90nm CMOS SOI technology in which the chip is fabricated. eFuses were also used to record a unique supply voltage for each chip. Finally, to help reduce the potential impact of process variations on the operation of the PHY analog circuits, eFuses were used for parametric adjustment in the analog units.

The physical package of the chip matters, too. A crucial design goal for the CPU of a consumer electronics device is high-volume manufacturing with good yield and comparatively low cost. The package is a 2-2-2 FC-PBGA, measuring 31mm by 31mm.

The CPU core examined

The three CPU cores are the highest-frequency PowerPC cores currently available, running at 3.2GHz. Throughout, the CPU uses extensive clock gating, leaving pipelines shut down until there are instructions to be processed; this dramatically reduces power consumption under real-world loads. The basic design is a 64-bit PowerPC architecture, with the complete PowerPC ISA available.

The instruction unit is multithreaded, with two simultaneous threads. The instruction cache is 32KB. The core implements a two-issue, in-order execution microarchitecture: up to two instructions are issued per cycle, but execution within the units proceeds in program order. Execution is delayed to cover the load-use penalty without stalling the pipeline.
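
To make the in-order, two-issue rule concrete, here is a toy model. It is only a sketch of the issue rule itself and not of the real pipeline: it assumes every result is ready one cycle after issue, and it ignores the delayed-execution scheme, the two hardware threads, and per-pipe restrictions. The function name and the `(dest, sources)` instruction format are invented for illustration.

```python
def cycles_in_order_dual_issue(program, issue_width=2):
    """Count the cycles needed to issue `program` on a toy in-order machine.

    program: list of (dest, sources) tuples in program order.
    Assumption: a result can be read in the cycle after its producer issues.
    """
    ready_at = {}   # register name -> first cycle in which it can be read
    cycle = 0
    i = 0
    while i < len(program):
        issued = 0
        while i < len(program) and issued < issue_width:
            dest, sources = program[i]
            # In-order rule: if the oldest waiting instruction cannot issue,
            # nothing younger may be issued ahead of it in this cycle.
            if any(ready_at.get(src, 0) > cycle for src in sources):
                break
            ready_at[dest] = cycle + 1
            issued += 1
            i += 1
        cycle += 1
    return cycle


independent = [("r1", []), ("r2", []), ("r3", []), ("r4", [])]
chain = [("r1", []), ("r2", ["r1"]), ("r3", ["r2"]), ("r4", ["r3"])]
print(cycles_in_order_dual_issue(independent))   # 2
print(cycles_in_order_dual_issue(chain))         # 4
```

In this model four independent instructions issue in two cycles, while a four-instruction dependency chain takes four, because an in-order core cannot look past the stalled instruction to find other work.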

The L1 instruction cache (Icache) is a 32KB cache with parity error checking. It is a two-way set-associative cache with 128-byte lines. First-level translation for instruction addresses is done using a 64-entry, two-way set-associative effective-to-real address translation cache (ERAT).
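
The Icache parameters fix its geometry, as this short sketch shows. It assumes the conventional layout in which the set index is taken from the address bits directly above the line offset; the article does not say how the hardware actually indexes the cache, and the names below are illustrative.

```python
CACHE_BYTES = 32 * 1024   # 32KB L1 Icache
WAYS = 2                  # two-way set associative
LINE_BYTES = 128          # 128-byte lines

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 128 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1       # 7 bits select a byte in a line
INDEX_BITS = NUM_SETS.bit_length() - 1          # 7 bits select a set


def split_address(addr):
    """Split an address into (tag, set index, byte offset).

    Assumption: index bits sit immediately above the offset bits.
    """
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


print(NUM_SETS, OFFSET_BITS, INDEX_BITS)   # 128 7 7
print(split_address(0x12345))              # (4, 70, 69)
```

With 128 sets of 128-byte lines, addresses 16KB apart land in the same set, so a two-way cache can hold only two such lines at once before one must be evicted.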

Each of the two issued instructions can go to one of five execution pipes: Branch (which is really part of the instruction unit), Load/Store, Fixed Point, Floating Point, and VMX. Difficult instructions are implemented through microcode: at dispatch they are cracked and converted into multiple micro-ops.

The branch unit includes a 4KB two-way set-associative Branch History Table per thread.

The Fixed Point pipe actually has two units: one to handle simple operations (add/sub, cmp, logical ops, and rotate), and one to handle complex operations such as multiply/divide.

The Load/Store pipe handles access to the L1 data cache (Dcache) and the storage hierarchy. Like the L1 Icache, the L1 Dcache is a 32KB cache with parity error checking. However, it is four-way set associative. It is "store through" and provides non-blocking access so a cache miss does not hold up a subsequent hit.
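
A minimal sketch of what "store through" means: every store is forwarded to the next level of the hierarchy, so the L1 never holds the only up-to-date copy of a value. The class below is a hypothetical toy, not the real Dcache. It assumes that a store which misses does not allocate a line (the article does not say either way), and it models neither capacity, associativity, nor non-blocking behavior.

```python
class StoreThroughCache:
    """Toy store-through (write-through) cache in front of a backing store."""

    def __init__(self, backing):
        self.backing = backing   # dict standing in for the next level (e.g. L2)
        self.lines = {}          # address -> value currently held in this cache

    def load(self, addr):
        # On a miss, fetch the value from the next level and keep a copy.
        if addr not in self.lines:
            self.lines[addr] = self.backing.get(addr, 0)
        return self.lines[addr]

    def store(self, addr, value):
        # Update the local copy only if the line is already present
        # (assumption: no allocation on a store miss).
        if addr in self.lines:
            self.lines[addr] = value
        # Store-through: the write always reaches the next level.
        self.backing[addr] = value


l2 = {}
dcache = StoreThroughCache(l2)
dcache.store(0x100, 7)      # miss: goes straight through to the next level
print(l2[0x100])            # 7
print(dcache.load(0x100))   # 7
```

One consequence of this policy is that evicting a line never requires a write-back, since the next level already has the current data.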

A 64-entry, two-way set-associative ERAT handles first-level data address translation. Second-level translation for both data and instructions is handled by a 1K-entry, four-way set-associative TLB (translation lookaside buffer), which can be software-managed as well as hardware-managed.
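
The two-level lookup order can be sketched as follows. This is a simplified model under several assumptions not found in the article: 4KB pages, LRU replacement, indexing by the low page-number bits, and a plain dict (`page_table`) standing in for whatever structure backs the TLB. It also skips the segment translation step of the PowerPC architecture. All class and function names are hypothetical.

```python
from collections import OrderedDict

PAGE_BITS = 12   # assumption: 4KB pages; the text does not give a page size


class SetAssociativeCache:
    """Tiny set-associative translation cache with LRU replacement (assumed)."""

    def __init__(self, entries, ways):
        self.ways = ways
        self.num_sets = entries // ways
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def lookup(self, page):
        entry_set = self.sets[page % self.num_sets]
        if page in entry_set:
            entry_set.move_to_end(page)       # mark as most recently used
            return entry_set[page]
        return None

    def insert(self, page, frame):
        entry_set = self.sets[page % self.num_sets]
        if page not in entry_set and len(entry_set) >= self.ways:
            entry_set.popitem(last=False)     # evict the least recently used
        entry_set[page] = frame
        entry_set.move_to_end(page)


erat = SetAssociativeCache(entries=64, ways=2)     # first level: 32 sets
tlb = SetAssociativeCache(entries=1024, ways=4)    # second level: 256 sets


def translate(effective_addr, page_table):
    """Return (real_addr, level), where level names the structure that hit."""
    page = effective_addr >> PAGE_BITS
    offset = effective_addr & ((1 << PAGE_BITS) - 1)
    frame, level = erat.lookup(page), "ERAT"
    if frame is None:
        frame, level = tlb.lookup(page), "TLB"
        if frame is None:
            frame, level = page_table[page], "page table"
            tlb.insert(page, frame)
        erat.insert(page, frame)              # refill the first level
    return (frame << PAGE_BITS) | offset, level
```

Calling `translate` twice on the same address with a suitable `page_table` would report `"page table"` the first time and `"ERAT"` the second. A translation evicted from the small ERAT can still hit in the larger TLB, which is the point of having a second level.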

http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/?ca=dgr-lnxw09XBoxDesign

[ Last edited by masonchung on 2007-2-9 10:41 AM ]

2#
Posted on 2008-9-7 01:30:29
Thanks for sharing; this is very useful material.
That said, won't a 3+1-style multiprocessor architecture be difficult to program?