Intel Processor manual


A good user manual

Consumer rules should oblige the seller to give the purchaser operating instructions for the Intel Processor along with the item. Missing instructions or false information given to the customer constitute grounds for a complaint on the basis of nonconformity of the goods with the contract. In accordance with the law, a customer can receive instructions in non-paper form; graphic and electronic manuals, as well as instructional videos, have lately become common. A necessary precondition is that the instructions be unambiguous and legible.

What is an instruction?

The term originates from the Latin word “instructio”, which means organizing. An instruction manual for the Intel Processor therefore describes a process. Its purpose is to teach and to ease the start-up and use of an item or the performance of certain activities. A manual is a compilation of information about an item or a service; it is a guide.

Unfortunately, only a few customers take the time to read the manual for the Intel Processor. A good user manual introduces a number of additional functions of the purchased item and also helps to avoid most defects.

What should a perfect user manual contain?

First and foremost, a user manual for the Intel Processor should contain:
- technical data for the Intel Processor
- the name of the manufacturer and the year of construction of the Intel Processor item
- rules of operation, control, and maintenance of the Intel Processor item
- safety signs and certification marks that confirm compliance with the appropriate standards

Why don't we read the manuals?

Usually it results from a lack of time and from overconfidence about the functions of purchased items. Unfortunately, connecting and starting up the Intel Processor alone is not enough. A manual contains a number of hints concerning particular functions, safety rules, maintenance methods (including which products to use), possible defects of the Intel Processor, and ways to resolve problems. If the answer still cannot be found, the reader is directed to Intel service. Animated manuals and instructional videos have lately become quite popular among customers. These kinds of user manuals are effective; they make it more likely that a customer will go through the whole material and will not skip complicated technical information about the Intel Processor.

Why should one read the manuals?

The manuals are where we find details of the construction and capabilities of the Intel Processor item and the use of its accessories, as well as information about all of its functions and features.

After a successful purchase, one should take a moment to get to know every part of the manual. Manuals are now carefully prepared and translated so that they can be fully understood by their users. The manuals serve as an informational aid.

Table of contents for the manual

  • Page 1

    Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Developer’s Manual, March 2003. Order Number: 273411-003[...]

  • Page 2

    Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in [...]

  • Page 3

    Contents
    1 Introduction
    1.1 Intel® 80200 Processor based on Intel® XScale™ Microarchitecture High-Level Overview [...]

  • Page 4

    3.2.2.1 Page (P) Attribute Bit
    3.2.2.2 Cacheable (C), Bufferable (B), and eXtension (X) Bits
    3.2.2.3 Instru[...]

  • Page 5

    6.2.3.3 Write Miss Policy
    6.2.3.4 Write-Back Versus Write-Through
    6.2.4 Rou[...]

  • Page 6

    9.3 Programmer Model
    9.3.1 INTCTL [...]

  • Page 7

    12.5.3 Instruction Fetch Latency Mode
    12.5.4 Data/Bus Request Buffer Full Mode [...]

  • Page 8

    13.11.6.4 DBG.V
    13.11.6.5 DBG.RX [...]

  • Page 9

    14.4.10 Miscellaneous Instruction Timing
    14.4.11 Thumb* Instructions [...]

  • Page 10

    B.4.1 Instruction Cache
    B.4.1.1. Cache Miss Cost [...]

  • Page 11

    C.2.2 TAP Pins
    C.2.3 Instruction Register (IR) [...]

  • Page 12

    Figures
    1-1 Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Features
    3-1 Example of Locked Entries in TLB [...]

  • Page 13

    Tables
    2-1 Multiply with Internal Accumulate Format
    2-2 MIA{<cond>} acc0, Rm, Rs [...]

  • Page 14

    9-1 Interrupt Control Register (CP13 register 0)
    9-2 Interrupt Source Register (CP13, register 4) [...]

  • Page 15

    14-14 Semaphore Instruction Timings
    14-15 CP15 Register Access Instruction Timings [...]

  • Page 16

    [...]

  • Page 17

    Introduction 1
    1.1 Intel® 80200 Processor based on Intel® XScale™ Microarchitecture High-Level Overview
    The Intel® 80200 processor based on Intel® XScale™ microarchitecture is the next generation in the Intel® StrongARM* processor family (compliant with ARM* Architecture V5TE). It[...]

  • Page 18

    1.1.2 Features
    Figure 1-1 shows the major functional blocks of the Intel® 80200 processor. The following sections give a brief, high-level overview of these blocks.
    1.1.2.1 Multiply/Accumulate (MAC)
    The MA[...]

  • Page 19

    1.1.2.2 Memory Management
    The Intel® 80200 processor implements the Memory Management Unit (MMU) Architecture specified in the ARM Architecture Reference Manual. The MMU provides access protection and virtual to ph[...]

  • Page 20

    1.1.2.6 Power Management
    The Intel® 80200 processor supports two low power modes: idle and sleep. These modes are discussed in Section 8.3, “Power Management” on page 8-5.
    1.1.2.7 Interrupt Controller
    An in[...]

  • Page 21

    1.2 Terminology and Conventions
    1.2.1 Number Representation
    All numbers in this document can be assumed to be base 10 unless designated otherwise. In text and pseudo code descriptions, hexadecimal numb[...]

  • Page 22

    1.3 Other Relevant Documents
      • Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Datasheet, Intel Order # 273414
      • ARM Architecture Version 5TE Specification Document Number: ARM DDI[...]

  • Page 23

    Programming Model 2
    This chapter describes the programming model of the Intel® 80200 processor based on Intel® XScale™ microarchitecture, namely the implementation options and extensions to the ARM* Version 5 architecture. The ARM* Architecture Version 5TE Specification (ARM DDI 0100E) describ[...]

  • Page 24

    2.2.4 ARM* DSP-Enhanced Instruction Set
    The Intel® 80200 processor implements the ARM DSP-enhanced instruction set, which is a set of instructions that boost the performance of signal processing applications. T[...]

  • Page 25

    2.3 Extensions to ARM* Architecture
    The Intel® 80200 processor made a few extensions to the ARM Version 5 architecture to meet the needs of various markets and design requirements. The following is a list of [...]

  • Page 26

    2.3.1.1 Multiply With Internal Accumulate Format
    A new multiply format has been created to define operations on 40-bit accumulators. Table 2-1, “Multiply with Internal Accumulate Format” on page 2-4 show[...]

  • Page 27

    MIA does not support unsigned multiplication; all values in Rs and Rm are interpreted as signed data values. MIA is useful for operating on signed 16-bit data that was loaded into a general purpose register by LDRSH [...]
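The signed interpretation described above can be illustrated with a small software model. This is only a sketch, not the processor's implementation: it assumes MIA multiplies Rm by Rs as signed 32-bit values and adds the product to the 40-bit accumulator acc0 with wrap-around, and the names `to_signed` and `mia` are hypothetical helpers introduced for illustration.

```python
ACC_BITS = 40                      # width of the internal accumulator acc0
ACC_MASK = (1 << ACC_BITS) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mia(acc0, rm, rs):
    """Model (assumed semantics): acc0 = acc0 + signed(Rm) * signed(Rs), kept to 40 bits."""
    product = to_signed(rm, 32) * to_signed(rs, 32)
    return (acc0 + product) & ACC_MASK


# 0xFFFFFFFE is treated as -2, never as 4294967294.
result = mia(0, 3, 0xFFFFFFFE)
print(to_signed(result, ACC_BITS))
```

Treating the register contents as unsigned in this model would give a very different accumulator value, which is why the signed-only rule matters when mixing data loaded by different load instructions.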

  • Page 28

    The MIAxy instruction performs one 16-bit signed multiply and accumulates these to a single 40-bit accumulator. x refers to either the upper half or lower half of register Rm (multiplicand) and y refers to the up[...]
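A rough software model of the half-word selection may help here. It is a sketch under assumptions: x selects a 16-bit half of Rm and y a 16-bit half of Rs, both halves are signed, and the product wraps into a 40-bit accumulator. The helper names `half` and `miaxy` and the boolean `x_top`/`y_top` parameters are hypothetical, not part of the instruction syntax.

```python
ACC_MASK = (1 << 40) - 1           # 40-bit accumulator


def half(reg, top):
    """Return the upper (top=True) or lower 16-bit half of reg as a signed value."""
    raw = (reg >> 16) & 0xFFFF if top else reg & 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def miaxy(acc0, rm, rs, x_top, y_top):
    """Model (assumed semantics): acc0 += half(Rm, x) * half(Rs, y), kept to 40 bits."""
    return (acc0 + half(rm, x_top) * half(rs, y_top)) & ACC_MASK


rm, rs = 0x00020003, 0xFFFF0004
print(miaxy(0, rm, rs, False, False))   # lower halves: 3 * 4
print(miaxy(0, rm, rs, True, True))     # upper halves: 2 * -1, wrapped to 40 bits
```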

  • Page 29

    2.3.1.2 Internal Accumulator Access Format
    The Intel® 80200 processor defines a new instruction format for accessing internal accumulators in CP0. Table 2-5, “Internal Accumulator Access Format” on page 2-[...]

  • Page 30

    The MAR instruction moves the value in register RdLo to bits[31:0] of the 40-bit accumulator (acc0) and moves bits[7:0] of the value in register RdHi into bits[39:32] of acc0. The instruction is only execute[...]
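The bit placement described for MAR can be written out as a short model. This is an illustrative sketch only; the function name `mar` is a hypothetical helper, and the model ignores conditional execution.

```python
def mar(rdlo, rdhi):
    """Build the 40-bit acc0 value: RdLo -> bits[31:0], RdHi bits[7:0] -> bits[39:32]."""
    low = rdlo & 0xFFFFFFFF        # all 32 bits of RdLo
    high = rdhi & 0xFF             # only the low byte of RdHi is used
    return (high << 32) | low


acc0 = mar(0x12345678, 0xABCDEF9A)
print(hex(acc0))                   # the upper 24 bits of RdHi do not reach acc0
```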

  • Page 31

    2.3.2 New Page Attributes
    The Intel® 80200 processor extends the page attributes defined by the C and B bits in the page descriptors with an additional X bit. This bit allows four more attributes to be encoded w[...]

  • Page 32

    The P bit controls ECC. The TEX (Type Extension) field is present in several of the descriptor types. In the Intel® 80200 processor, only the LSB of this field is used; this is called the X bit. A Small Pag[...]

  • Page 33

    2.3.3 Additions to CP15 Functionality
    To accommodate the functionality in the Intel® 80200 processor, registers in CP15 and CP14 have been added or augmented. See Chapter 7, “Configuration” for details. At[...]

  • Page 34

    2.3.4 Event Architecture
    2.3.4.1 Exception Summary
    Table 2-11 shows all the exceptions that the Intel® 80200 processor may generate, and the attributes of each. Subsequent sections give details on each e[...]

  • Page 35

    2.3.4.3 Prefetch Aborts
    The Intel® 80200 processor detects three types of prefetch aborts: Instruction MMU abort, external abort on an instruction access, and an instruction cache parity error. These aborts are[...]

  • Page 36

    2.3.4.4 Data Aborts
    Two types of data aborts exist in the Intel® 80200 processor: precise and imprecise. A precise data abort is defined as one where R14_ABORT always contains the PC (+8) of the instruct[...]

  • Page 37

    Imprecise data aborts
      • A data cache parity error is imprecise; the extended Status field of the Fault Status Register is set to 0xb11000.
      • All external data aborts except for those generated on a data MMU t[...]

  • Page 38

    Multiple Data Aborts
    Multiple data aborts may be detected by hardware, but only the highest priority one is reported. If the reported data abort is precise, software can correct the cause of the abort and re-ex[...]

  • Page 39

    Memory Management 3
    This chapter describes the memory management unit implemented in the Intel® 80200 processor based on Intel® XScale™ microarchitecture, and is compliant with the ARM* Architecture V5TE.
    3.1 Overview
    The Intel® 80200 processor implements the Memory Management Unit (MMU) Arc[...]

  • Page 40

    3.2 Architecture Model
    3.2.1 Version 4 vs. Version 5
    ARM* MMU Version 5 Architecture introduces the support of tiny pages, which are 1 KByte in size. The reserved field in the first-level descriptor (encodi[...]

  • Page 41

    3.2.2.4 Data Cache and Write Buffer
    All of these descriptor bits affect the behavior of the Data Cache and the Write Buffer. If the X bit for a descriptor is zero, the C and B bits operate as mandated by the [...]

  • Page 42

    3.2.2.5 Details on Data Cache and Write Buffer Behavior
    If the MMU is disabled all data accesses are non-cacheable and non-bufferable. This is the same behavior as when the MMU is enabled, and a data access u[...]

  • Page 43

    3.3 Interaction of the MMU, Instruction Cache, and Data Cache
    The MMU, instruction cache, and data/mini-data cache may be enabled/disabled independently. The instruction cache can be enabled with the MMU enabled [...]

  • Page 44

    3.4 Control
    3.4.1 Invalidate (Flush) Operation
    The entire instruction and data TLB can be invalidated at the same time with one command or they can be invalidated separately. An individual entry in the d[...]

  • Page 45

    3.4.3 Locking Entries
    Individual entries can be locked into the instruction and data TLBs. See Table 7-14, “Cache Lockdown Functions” on page 7-14 for the exact commands. If a lock operation finds the vir[...]

  • Page 46

    Note: Care must be exercised here when allowing exceptions to occur during this routine whose handlers may have data that lies in a page that is trying to be locked into the TLB.
    Example 3-3. Locking Entries in[...]

  • Page 47

    3.4.4 Round-Robin Replacement Algorithm
    The line replacement algorithm for the TLBs is round-robin; there is a round-robin pointer that keeps track of the next entry to replace. The next entry to replace is the[...]
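The general idea of a round-robin pointer can be sketched in a few lines. This is a generic illustration, not the processor's TLB: the table size, the reset position of the pointer, the direction in which it advances, and the class name `RoundRobinTable` are all assumptions, and entry locking is left out.

```python
class RoundRobinTable:
    """Generic round-robin replacement: one pointer names the next slot to overwrite."""

    def __init__(self, size):
        self.entries = [None] * size
        self.pointer = 0               # assumed starting slot

    def insert(self, item):
        """Write item into the slot under the pointer, then advance with wrap-around."""
        slot = self.pointer
        self.entries[slot] = item
        self.pointer = (slot + 1) % len(self.entries)
        return slot


table = RoundRobinTable(4)             # hypothetical 4-entry table
slots = [table.insert(n) for n in range(6)]
print(slots, table.entries)
```

Round-robin needs no usage history, only a single pointer per table, which keeps the replacement choice cheap and predictable.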

  • Page 48

    [...]

  • Page 49

    Instruction Cache 4
    The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) instruction cache enhances performance by reducing the number of instruction fetches from external memory. The cache provides fast execution of cached code. Cod[...]

  • Page 50

    4.2 Operation
    4.2.1 Operation When Instruction Cache is Enabled
    When the cache is enabled, it compares every instruction request address against the addresses of instructions that it is currently holding. If [...]

  • Page 51

    4.2.3 Fetch Policy
    An instruction-cache “miss” occurs when the requested instruction is not found in the instruction fetch buffers or instruction cache; a fetch request is then made to external memory. The [...]

  • Page 52

    4.2.5 Parity Protection
    The instruction cache is protected by parity to ensure data integrity. Each instruction cache word has 1 parity bit. (The instruction cache tag is NOT parity protected.) When a parity[...]

  • Page 53

    4.2.6 Instruction Fetch Latency
    Because the Intel® 80200 processor core is clocked at a multiple of the external bus clock, and the two clocks are truly asynchronous, an exact fetch latency is difficult to der[...]

  • Page 54

    4.3 Instruction Cache Control
    4.3.1 Instruction Cache State at RESET
    After reset, the instruction cache is always disabled, unlocked, and invalidated (flushed).
    4.3.2 Enabling/Disabling
    The instruction ca[...]

  • Page 55

    4.3.3 Invalidating the Instruction Cache
    The entire instruction cache along with the fetch buffers are invalidated by writing to coprocessor 15, register 7. (See Table 7-12, “Cache Functions” on page 7-11 f[...]

  • Page 56

    4.3.4 Locking Instructions in the Instruction Cache
    Software has the ability to lock performance critical routines into the instruction cache. Up to 28 lines in each set can be locked; hardware ignores the lock[...]

  • Page 57

    Software can lock down several different routines located at different memory locations. This may cause some sets to have more locked lines than others as shown in Figure 4-2. Example 4-4 on page 4-9 shows how[...]

  • Page 58

    [...]

  • Page 59

    Branch Target Buffer 5
    Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) uses dynamic branch prediction to reduce the penalties associated with changing the flow of program execution. The Intel® 80200 processor features a branch[...]

  • Page 60

    5.1.1 Reset
    After Processor Reset, the BTB is disabled and all entries are invalidated.
    5.1.2 Update Policy
    A new entry is stored into the BTB when the following conditions are met:
      • the branch instru[...]

  • Page 61

    5.2 BTB Control
    5.2.1 Disabling/Enabling
    The BTB is always disabled out of Reset. Software can enable the BTB through a bit in a coprocessor register (see Section 7.2.2). Before enabling or disabling the B[...]

  • Page 62

    [...]

  • Page 63

    Data Cache 6
    The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) data cache enhances performance by reducing the number of data accesses to and from external memory. There are two data cache structures in the Intel® 80200 process[...]

  • Page 64

    Figure 6-1. Data Cache Organization (diagram: Set 0 through Set 31, selected by the set index; each set holds way 0 through way 31, and each way is a 32-byte cache line with a CAM tag and data) [...]

  • Page 65

    6.1.2 Mini-Data Cache Overview
    The mini-data cache is a 2-Kbyte, 2-way set associative cache; this means there are 32 sets with each set containing 2 ways. Each way of a set contains 32 bytes (one cache line) and one val[...]
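The geometry quoted above (32 sets, 2 ways, 32-byte lines) can be checked with a little arithmetic. The sketch below also splits an address into tag, set index, and byte offset; that particular bit assignment (offset in the lowest 5 bits, set index in the next 5) is a conventional assumption for a cache of this shape, not something stated in the excerpt, and `split_address` is a hypothetical helper.

```python
LINE_BYTES = 32    # bytes per cache line
NUM_SETS = 32      # sets in the mini-data cache
NUM_WAYS = 2       # ways per set

# 32 sets * 2 ways * 32 bytes = 2048 bytes, matching the stated 2-Kbyte size.
CAPACITY = LINE_BYTES * NUM_SETS * NUM_WAYS


def split_address(addr):
    """Assumed mapping: low 5 bits = byte offset, next 5 bits = set index, rest = tag."""
    offset = addr % LINE_BYTES
    set_index = (addr // LINE_BYTES) % NUM_SETS
    tag = addr // (LINE_BYTES * NUM_SETS)
    return tag, set_index, offset


print(CAPACITY)
print(split_address(0x1234))
```

Under this mapping, two addresses that differ by a multiple of 1024 bytes land in the same set and compete for its two ways.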

  • Page 66

    6.1.3 Write Buffer and Fill Buffer Overview
    The Intel® 80200 processor employs an eight entry write buffer, each entry containing 16 bytes. Stores to external memory are first placed in the write buffer and subs[...]

  • Page 67

    6.2 Data Cache and Mini-Data Cache Operation
    The following discussions refer to the data cache and mini-data cache as one cache (data/mini-data) since their behavior is the same when accessed.
    6.2.1 Operation When Cac[...]

  • Page 68

    6.2.3.2 Read Miss Policy
    The following sequence of events occurs when a cacheable (see Section 6.2.3.1, “Cacheability” on page 6-5) load operation misses the cache:
    1. The fill buffer is checked to see if an o[...]

  • Page 69

    6.2.3.3 Write Miss Policy
    A write operation that misses the cache requests a 32-byte cache line from external memory if the access is cacheable and write allocation is specified in the page. In this case the following seq[...]

  • Page 70

    6.2.4 Round-Robin Replacement Algorithm
    The line replacement algorithm for the data cache is round-robin. Each set in the data cache has a round-robin pointer that keeps track of the next line (in that set) to replac[...]

  • Page 71

    6.3 Data Cache and Mini-Data Cache Control
    6.3.1 Data Memory State After Reset
    After processor reset, both the data cache and mini-data cache are disabled, all valid bits are set to zero (invalid), and the round-robin[...]

  • Page 72

    6.3.3.1 Global Clean and Invalidate Operation
    A simple software routine is used to globally clean the data cache. It takes advantage of the line-allocate data cache operation, which allocates a line into the data cache[...]

  • Page 73

    The line-allocate command will not operate on the mini Data Cache, so system software must clean this cache by reading 2 KByte of contiguous unused data into it. This data must be unused and reserved for this purpose so th[...]

  • Page 74

    6.4 Re-configuring the Data Cache as Data RAM
    Software has the ability to lock tags associated with 32-byte lines in the data cache, thus creating the appearance of data RAM. Any subsequent access to this line alw[...]

  • Page 75

    Example 6-3. Locking Data into the Data Cache
    ; configured with C=1 and B=1
    ; R0 is the number of 32-byte lines to lock into the data cache. In this
    ; example 16 lines of data are locked into the cache.
    ; MMU and data[...]

  • Page 76

    Example 6-4. Creating Data RAM
    ; R1 contains the virtual address of a region of memory to configure as data RAM,
    ; which is aligned on a 32-byte boundary.
    ; MMU is configured so that the memory region is cacheable.
    ; R0 i[...]

  • Page 77

    Tags can be locked into the data cache by enabling the data cache lock mode bit located in coprocessor 15, register 9. (See Table 7-14, “Cache Lockdown Functions” on page 7-14 for the exact command.) Once enable[...]

  • Page 78

    6.5 Write Buffer/Fill Buffer Operation and Control
    See Section 1.2.2, “Terminology and Acronyms” on page 1-5 for a definition of coalescing. The write buffer is always enabled, which means, stores to externa[...]

  • Page 79

    Configuration 7
    This chapter describes the System Control Coprocessor (CP15) and coprocessor 14 (CP14). CP15 configures the MMU, caches, buffers and other system attributes. Where possible, the definition of CP15 follows the definition in the first generation Intel® StrongARM* products. CP14 co[...]

  • Page 80

    The format of MRC and MCR is shown in Table 7-1. cp_num is defined for CP15, CP14, CP13 and CP0. CP13 contains the interrupt controller and bus controller registers and is described in Chapter 9, “[...]

  • Page 81

    The format of LDC and STC is shown in Table 7-2. LDC and STC follow the programming notes in the ARM Architecture Reference Manual. LDC and STC transfer a single 32-bit word between a coprocessor register [...]

  • Page 82

    7-4 Configuration
    7.2 CP15 Registers
    Table 7-3 lists the CP15 registers implemented in the Intel® 80200 processor.
    Table 7-3. CP15 Registers
    Register (CRn)   Opcode_2   Access                  Description
    0                0          Read / Write-Ignored    ID
    0                1          Re[...]

  • Page 83

    7-5 Configuration
    7.2.1 Register 0: ID and Cache Type Registers
    Register 0 houses two read-only registers that are used for part identification: an ID register and a cache type register. The ID Register is selected when opcode_2=0.[...]

  • Page 84

    7-6 Configuration
    5:3   Read / Write Ignored           Instruction cache associativity = 0b101 = 32 kB
    2     Read-as-Zero / Write Ignored   Reserved
    1:0   Read / Write Ignored           Instruction cache line length = 0b10 = 8 words/line
    Table 7-5. Cache Typ[...]

  • Page 85

    7-7 Configuration
    7.2.2 Register 1: Control and Auxiliary Control Registers
    Register 1 is made up of two registers, one that is compliant with ARM Version 5 and is referenced by opcode_2 = 0x0, and the other which is specific to Int[...]

  • Page 86

    7-8 Configuration
    The mini-data cache attribute bits, in the Intel® 80200 processor Control Register, are used to control the allocation policy for the mini-data cache and whether it uses write-back caching or write-through cac[...]

  • Page 87

    7-9 Configuration
    7.2.3 Register 2: Translation Table Base Register
    7.2.4 Register 3: Domain Access Control Register
    7.2.5 Register 4: Reserved
    Register 4 is reserved. Reading and writing this register yields unpredictable results. Ta[...]

  • Page 88

    7-10 Configuration
    7.2.6 Register 5: Fault Status Register
    The Fault Status Register (FSR) indicates which fault has occurred, which could be either a prefetch abort or a data abort. Bit 10 extends the encoding of the status field f[...]

  • Page 89

    7-11 Configuration
    7.2.8 Register 7: Cache Functions
    All the functions defined in the first generation of Intel® StrongARM* appear here. The Intel® 80200 processor adds other functions as well. This register should be accesse[...]

  • Page 90

    7-12 Configuration
    Other items to note about the line-allocate command are:
    • It forces all pending memory operations to complete.
    • Bits [31:5] of Rd are used to specify the virtual address of the line to be allocated into the data c[...]
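
Since only bits [31:5] of Rd select the line, the low five bits of the address do not matter. The sketch below shows the masking this implies for a 32-byte line; the function names are hypothetical and the code only models the address arithmetic, not the command itself.

```python
LINE_BYTES = 32                 # one cache line, so bits [4:0] are the byte offset
LINE_MASK = 0xFFFFFFE0          # keeps bits [31:5] of a 32-bit address


def line_address(virtual_address):
    """Return the 32-byte-aligned line address selected by bits [31:5]."""
    return virtual_address & LINE_MASK


def same_line(address_a, address_b):
    """True when both addresses fall in the same 32-byte line."""
    return line_address(address_a) == line_address(address_b)


print(hex(line_address(0x12345678)))
print(same_line(0x1000, 0x101F), same_line(0x101F, 0x1020))
```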

  • Page 91

    7-13 Configuration
    7.2.9 Register 8: TLB Operations
    Disabling/enabling the MMU has no effect on the contents of either TLB: valid entries stay valid, locked items remain locked. All operations defined in Table 7-13 work regardless [...]

  • Page 92

    7-14 Configuration
    7.2.10 Register 9: Cache Lock Down
    Register 9 is used for locking down entries into the instruction cache and data cache. (The protocol for locking down entries can be found in Chapter 6, “Data Cache”.) Tab[...]

  • Page 93

    7-15 Configuration
    7.2.11 Register 10: TLB Lock Down
    Register 10 is used for locking down entries into the instruction TLB and data TLB. (The protocol for locking down entries can be found in Chapter 3, “Memory Management”.) Loc[...]

  • Page 94

    7-16 Configuration
    7.2.13 Register 13: Process ID
    The Intel® 80200 processor supports the remapping of virtual addresses through a Process ID (PID) register. This remapping occurs before the instruction cache, instruction TLB, data[...]

  • Page 95

    7-17 Configuration
    7.2.14 Register 14: Breakpoint Registers
    The Intel® 80200 processor contains two instruction breakpoint address registers (IBCR0 and IBCR1), one data breakpoint address register (DBR0), one configurable data mas[...]

  • Page 96

    7-18 Configuration
    7.2.15 Register 15: Coprocessor Access Register
    This register is selected when opcode_2 = 0 and CRm = 1. This register controls access rights to all the coprocessors in the system except for CP15 and CP14. Both CP1[...]

  • Page 97

    7-19 Configuration
    Table 7-20. Coprocessor Access Register
    Bits 31 to 0, most significant first:
    0 0 CP13 CP12 CP11 CP10 CP9 CP8 CP7 CP6 CP5 CP4 CP3 CP2 CP1 C[...]
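
The bit layout in Table 7-20 places one bit per coprocessor, CP13 down to CP0, in bits 13 to 0. A register value can be built as sketched below. The polarity is an assumption (a set bit is taken to mean access is permitted), because the field descriptions are cut off in this excerpt; the function name is hypothetical.

```python
def coprocessor_access_value(allowed_coprocessors):
    """Build a Coprocessor Access Register value with bit n set for each CPn listed.

    Assumption: a set bit grants access. Only CP0..CP13 have a bit; CP14 and
    CP15 are not controlled by this register according to the text.
    """
    value = 0
    for number in allowed_coprocessors:
        if not 0 <= number <= 13:
            raise ValueError(f"CP{number} has no bit in this register")
        value |= 1 << number
    return value


print(hex(coprocessor_access_value([0, 13])))
```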

  • Page 98

    7-20 Configuration
    7.3 CP14 Registers
    Table 7-21 lists the CP14 registers implemented in the Intel® 80200 processor.
    7.3.1 Registers 0-3: Performance Monitoring
    The performance monitoring unit contains a control register (P[...]

  • Page 99

    7-21 Configuration
    7.3.3 Registers 6-7: Clock and Power Management
    These registers contain functions for managing the core clock and power. Three low power modes are supported that are entered upon executing the functions listed i[...]

  • Page 100

    7-22 Configuration
    7.3.4 Registers 8-15: Software Debug
    Software debug is supported by address breakpoint registers (Coprocessor 15, register 14), serial communication over the JTAG interface and a trace buffer. Registers 8 an[...]

  • Page 101

    8-1 System Management 8
    This chapter describes the clocking and power management features of the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) along with reset details. Main features include a software controlled internal clock frequenc[...]

  • Page 102

    8-2 System Management
    The Intel® 80200 processor supports low voltage operation with a supply as low as 0.95 V. At lower voltages, not all CCLK configurations are available. See the Intel® 80200 processor Datasheet for volt[...]

  • Page 103

    8-3 System Management
    8.2 Processor Reset
    The RESET# pin must be asserted when CLK and power are applied to the processor. CLK, MCLK, and power must be present and stable before RESET# can be deasserted. To ensure reset, RESET# m[...]

  • Page 104

    8-4 System Management
    8.2.2 Reset Effect on Outputs
    After RESETOUT# is asserted, the processor’s output pins are driven to a well-defined state. Critical bus signals receive a ‘0’ or ‘1’ value, as shown in Figure 8-2. [...]

  • Page 105

    8-5 System Management
    8.3 Power Management
    The Intel® 80200 processor provides low power modes: idle and sleep, which are listed in increasing power saving order. Table 8-3 describes the attributes of each low power mo[...]

  • Page 106

    8-6 System Management
    The JTAG clock must be stopped during sleep mode. Drive a ‘0’ into the JTAG clock when not toggling it.[...]

  • Page 107

    9-1 Interrupts 9
    9.1 Introduction
    The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) supports a variety of external and internal interrupt sources. The Interrupt Control Unit (ICU) controls how the Intel® 80200 processor reacts to[...]

  • Page 108

    9-2 Interrupts
    9.3 Programmer Model
    Software has access to three registers in the ICU. INTCTL is used to enable or disable (mask) individual interrupts. As mentioned, masking of all interrupts may still be accomplished via the CPSR r[...]

  • Page 109

    9-3 Interrupts
    9.3.1 INTCTL
    INTCTL is used to specify what interrupts are disabled (masked).
    Table 9-1. Interrupt Control Register (CP13 register 0)
    Bits 31 to 0, most significant first:
    B [...]

  • Page 110

    9-4 Interrupts
    9.3.2 INTSRC
    The Interrupt Source register (INTSRC) indicates which interrupts are active. This register may be used by an ISR to determine quickly the source of an interrupt. Even if an interrupt is masked with I[...]

  • Page 111

    9-5 Interrupts
    9.3.3 INTSTR
    Systems may have differing priorities for the various interrupt cases; the ICU allows system designers to associate each internal interrupt source with one of the two internal interrupts: FIQ and IRQ. This a[...]

  • Page 112

    [...]

  • Page 113

    10-1 External Bus 10
    10.1 General Description
    The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) bus is a split bus, with separate request and data buses. It is designed primarily as the memory and I/O bus for the Intel® 80200 processor,[...]

  • Page 114

    10-2 External Bus
    An alternate configuration with a separate memory bus is also possible, shown in Figure 10-2. All signals on this bus, data and request, are sampled on the rising edge of MCLK. MCLK is created by the system and [...]

  • Page 115

    10-3 External Bus
    10.2 Signal Description
    Table 10-1. Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Bus Signals
    Signal   Width   I/O   Function
    MCLK     1       I     bus clock (note: all bus activity is triggered by the risin[...]

  • Page 116

    10-4 External Bus
    10.2.1 Request Bus
    The request bus issues read or write requests from the Intel® 80200 processor or other bus master to the chipset or memory controller. Each request takes two MCLK cycles. All signals should be s[...]

  • Page 117

    10-5 External Bus
    In addition to the alignment constraints listed above, read transactions never cross a 32-byte boundary, and write transactions never cross a 16-byte boundary. Some write case explanations. Byte and short writes [...]
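
The two boundary rules above can be checked with simple integer arithmetic. The following is a minimal sketch with hypothetical function names; it treats a transfer as a linear byte range, so it does not model wrapping bursts, and it does not cover the alignment constraints that fall outside this excerpt.

```python
READ_BOUNDARY = 32    # bytes, from the text: reads never cross a 32-byte boundary
WRITE_BOUNDARY = 16   # bytes, from the text: writes never cross a 16-byte boundary


def crosses_boundary(address, length, boundary):
    """True if the byte range [address, address + length) spans two boundary-sized blocks."""
    if length <= 0:
        raise ValueError("length must be positive")
    return address // boundary != (address + length - 1) // boundary


def respects_boundary_rule(address, length, is_write):
    """Check a linear transfer against the read or write boundary rule."""
    boundary = WRITE_BOUNDARY if is_write else READ_BOUNDARY
    return not crosses_boundary(address, length, boundary)


print(respects_boundary_rule(0x240, 32, is_write=False))  # whole 32-byte block
print(respects_boundary_rule(0x24C, 8, is_write=True))    # spans 0x250
```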

  • Page 118

    10-6 External Bus
    10.2.2 Data Bus
    Some time after a request is made on the request bus, data must be transferred for that request on the data bus. Each request has a corresponding transaction (one or more cycles) on the data bus. [...]

  • Page 119

    10-7 External Bus
    10.2.3 Critical Word First
    The CWF signal is only used during read bursts of eight words (Len = 6). CWF needs to be driven at the same time as DValid of the first data cycle of the transaction. This bit indicates[...]

  • Page 120

    10-8 External Bus
    There are eight byte enables (BE#) associated with the D bus. Each byte enable corresponds to one byte of the bus. During a write cycle, the byte enable for each byte that is being written is asserted low. More det[...]
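
Because the byte enables are active low, the pattern for a write is the inverse of the set of bytes being written. The sketch below computes an 8-bit BE# value for a write that fits within one 64-bit bus beat. It assumes a little endian system in which BE# bit n corresponds to the byte at address offset n; that mapping and the function name are assumptions, since the detailed description is truncated here.

```python
BUS_BYTES = 8  # eight byte enables, one per byte of the D bus


def byte_enables(address, size):
    """Return the active-low BE# pattern for writing `size` bytes starting at `address`.

    A 0 bit means the corresponding byte lane is being written.
    """
    lane = address % BUS_BYTES
    if size <= 0 or lane + size > BUS_BYTES:
        raise ValueError("write must fit inside one 8-byte bus beat")
    written = ((1 << size) - 1) << lane
    return ~written & 0xFF


print(format(byte_enables(0x240, 4), "08b"))  # word write to the low half
print(format(byte_enables(0x243, 1), "08b"))  # single byte in lane 3
```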

  • Page 121

    10-9 External Bus
    10.2.5 Multimaster Support
    Simple multimaster support is supplied with the Hold pin. The Hold pin causes the Intel® 80200 processor to stop issuing new requests as soon as possible (see below for timing) and to floa[...]

  • Page 122

    10-10 External Bus
    A simpler but lower performance method would be to assert Hold to the Intel® 80200 processor, wait for all outstanding transactions to complete, grant the issue bus to the alternate master (using the issue b[...]

  • Page 123

    10-11 External Bus
    10.2.6 Abort
    If for any reason a request made by the Intel® 80200 processor cannot be completed, it must be aborted. At the same time as the assertion of DValid for any data cycle of any transaction, Abort can be[...]

  • Page 124

    10-12 External Bus
    10.2.7 ECC
    Software running on the Intel® 80200 processor may configure pages in memory as being ECC protected. For such pages, the Intel® 80200 processor checks the ECC code associated with read data, and [...]

  • Page 125

    10-13 External Bus
    10.2.8 Big Endian System Configuration
    The Intel® 80200 processor supports execution in a big endian system. A system is said to be big endian if multi-byte values are accessed with the MSB at lower addresses. [...]
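
The definition of big endian given above (most significant byte at the lower address) can be made concrete with a short host-side illustration. It only shows byte ordering in memory and says nothing about how the processor or bus is configured; the function name is hypothetical.

```python
def store_word(value, big_endian):
    """Return the four bytes of a 32-bit value in increasing address order."""
    return value.to_bytes(4, "big" if big_endian else "little")


word = 0x11223344
print(store_word(word, big_endian=True).hex())   # MSB 0x11 lands at the lowest address
print(store_word(word, big_endian=False).hex())  # LSB 0x44 lands at the lowest address
```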

  • Page 126

    10-14 External Bus
    10.3 Examples
    All examples assume a 64-bit bus, in a little endian system.
    10.3.1 Simple Read Word
    In Figure 10-4, a read request for one word at address 0x240 is issued at time 10 ns. ADS# is asserted low at [...]

  • Page 127

    10-15 External Bus
    10.3.2 Read Burst, No Critical Word First
    In Figure 10-5 the request goes out the same as the last example, with the address 0x248 this time and the length 0x6, indicating an eight word cache line fill. The first d[...]

  • Page 128

    10-16 External Bus
    10.3.3 Read Burst, Critical Word First Data Return
    Figure 10-6 is the same as the last with one difference: CWF is asserted high on the first data cycle of the return data. This indicates that the data is retur[...]

  • Page 129

    10-17 External Bus
    10.3.4 Word Write
    Figure 10-7 shows a 32-bit write request to address 0x240. W/R# is high when ADS# is asserted low. Two cycles before the write data needs to be on the bus for the SDRAM, DValid is asserted by [...]

  • Page 130

    10-18 External Bus
    10.3.5 Two Word Coalesced Write
    In Figure 10-8, two store byte instructions from the instruction stream have been coalesced into a single write command in the write buffer. The bytes were stored to addre[...]

  • Page 131

    10-19 External Bus
    10.3.5.1 Write Burst
    Figure 10-9 shows a four word write caused by the eviction of a half cache line. In this case, the Len is 0x5 indicating four words. DValid is asserted for two consecutive cycles here, but th[...]

  • Page 132

    10-20 External Bus
    10.3.6 Write Burst, Coalesced
    Figure 10-10 shows a four word cache write caused by store requests coalesced in a write buffer. The Len is 0x5 indicating four words. DValid is asserted for two consecutive c[...]
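
The burst examples relate the Len code to a word count and, on the 64-bit bus assumed by these examples, to a number of data cycles. The sketch below records only the two Len codes that appear in this excerpt (0x5 for four words, 0x6 for eight words); other codes are deliberately left out rather than guessed, and the names are hypothetical.

```python
BUS_BYTES = 8    # the examples assume a 64-bit bus
WORD_BYTES = 4   # one word is 32 bits

# Only the encodings quoted in the examples; not the full Len table.
LEN_TO_WORDS = {0x5: 4, 0x6: 8}


def data_cycles(len_code):
    """Number of 64-bit data cycles needed for a burst with the given Len code."""
    words = LEN_TO_WORDS[len_code]          # raises KeyError for codes not listed here
    return words * WORD_BYTES // BUS_BYTES


print(data_cycles(0x5))  # four words
print(data_cycles(0x6))  # eight words
```

For Len = 0x5 this gives two cycles, which agrees with DValid being asserted for two consecutive cycles in the write burst examples.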

  • Page 133

    10-21 External Bus
    10.3.7 Pipelined Accesses
    The example in Figure 10-11 demonstrates the four deep pipelined nature of this bus. In this example, the Intel® 80200 processor is bus limited and is issuing requests as quickly as it ca[...]

  • Page 134

    10-22 External Bus
    10.3.8 Locked Access
    An example of a locked access is shown in Figure 10-12. Here the processor is doing an atomic read/write to address 0x240, denoted as A in the figure. The Lock signal, which is valid at th[...]

  • Page 135

    10-23 External Bus
    10.3.9 Aborted Access
    As discussed in Section 10.2.6, “Abort” on page 10-11, any request from the Intel® 80200 processor can be aborted by the chipset or memory. This might occur if there was a PCI error, o[...]

  • Page 136

    10-24 External Bus
    10.3.10 Hold
    Figure 10-14 shows an example of hold being asserted to stop new transactions being issued. The Intel® 80200 processor floats the issue bus pins and issues no transactions until HldA is deasse[...]

  • Page 137

    11-1 Bus Controller 11
    11.1 Introduction
    The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) Bus Controller Unit (BCU) is responsible for accessing off-chip memory. It initiates bus cycles as documented in Chapter 10, “External Bus[...]

  • Page 138

    11-2 Bus Controller
    11.3 Error Handling
    The BCU is able to detect and respond to two classes of errors: bus aborts and ECC errors. Information about errors is captured in a set of programmer-accessible registers: ELOG0, ELOG1,[...]

  • Page 139

    11-3 Bus Controller
    11.3.2 ECC Errors
    An ECC error occurs when the BCU reads data and notices that the associated ECC bits do not match the data. This could also happen as a result of the RMW that the BCU performs on sub bus-width [...]

  • Page 140

    11-4 Bus Controller
    Error reporting may be enabled with the BCUCTL register, described in Section 11.4.1. If enabled, single bit errors cause the BCU to assert an interrupt to the Interrupt Controller Unit (ICU). If the interru[...]

  • Page 141

    11-5 Bus Controller
    11.4 Programmer Model
    The BCU registers reside in Coprocessor 13 (CP13). They may be accessed/manipulated with the MCR, MRC, STC, and LDC instructions. The CRn field of the instruction denotes the register numbe[...]

  • Page 142

    11-6 Bus Controller
    BCUCTL.TP allows software to determine if the BCU has any pending memory transactions. This may be used to ensure that all memory operations have completed before attempting to modify system state. For example, the[...]

  • Page 143

    11-7 Bus Controller
    When ECC is enabled, the BCU only generates an interrupt on a single-bit error if BCUCTL.SR is set. When ECC is enabled, the BCU always generates an abort on a multi-bit error. The BCU repairs single bit errors[...]
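
The two reporting rules above reduce to a small decision table. The sketch below encodes just those two sentences as a pure function returning an (interrupt, abort) pair. Treating a disabled ECC unit as producing no response is an assumption, and the repair behaviour for single-bit errors is not modelled because that sentence is truncated; the function name is hypothetical.

```python
def ecc_error_response(ecc_enabled, error_bits, sr_set):
    """Return (interrupt, abort) for an ECC check result.

    error_bits: 0 for no error, 1 for a single-bit error, 2 or more for multi-bit.
    sr_set: state of BCUCTL.SR.
    """
    if error_bits < 0:
        raise ValueError("error_bits cannot be negative")
    if not ecc_enabled or error_bits == 0:
        return (False, False)   # assumption: nothing is reported in these cases
    if error_bits == 1:
        return (sr_set, False)  # interrupt only when BCUCTL.SR is set
    return (False, True)        # multi-bit errors always abort


print(ecc_error_response(True, 1, sr_set=True))
print(ecc_error_response(True, 2, sr_set=False))
```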

  • Page 144

    11-8 Bus Controller
    BCUMOD.AF affects the behavior of the BCU when it is reading a 32-byte block (a cache line-fill). If this bit is ‘0’, then the BCU always emits the 32-byte aligned address of the cache line when requesting it.[...]

  • Page 145

    11-9 Bus Controller
    11.4.2 ECC Error Registers
    The contents of these registers should only be considered valid if the corresponding bit in register BCUCTL is set. When an error is detected, the BCU selects a free ELOGx/ECARx regist[...]

  • Page 146

    11-10 Bus Controller
    The BCU does not write to these ELOGx/ECARx registers unless the corresponding BCUCTL.Ex bit is cleared, either by reset or by software. Software can generate data with incorrect ECC values for validation purpo[...]

  • Page 147

    12-1 Performance Monitoring 12
    This chapter describes the performance monitoring facility of the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE). The events that are monitored can provide performance information for compiler writers, system[...]

  • Page 148

    12-2 Performance Monitoring
    12.2 Clock Counter (CCNT; CP14 - Register 1)
    The format of CCNT is shown in Table 12-1. The clock counter is reset to ‘0’ by Performance Monitor Control Register (PMNC) or can be set to a prede[...]

  • Page 149

    12-3 Performance Monitoring
    12.3 Performance Count Registers (PMN0 - PMN1; CP14 - Register 2 and 3, Respectively)
    There are two 32-bit event counters; their format is shown in Table 12-2. The event counters are reset to ‘0’ b[...]

  • Page 150

    12-4 Performance Monitoring
    12.4 Performance Monitor Control Register (PMNC)
    The performance monitor control register (PMNC) is a coprocessor register that:
    • controls which events PMN0 and PMN1 monitor
    • detects which counter o[...]

  • Page 151

    12-5 Performance Monitoring
    12.4.1 Managing PMNC
    The following are a few notes about controlling the performance monitoring mechanism:
    • An interrupt is reported when a counter overflow flag is set and its associated interrupt enable[...]

  • Page 152

    12-6 Performance Monitoring
    12.5 Performance Monitoring Events
    Table 12-4 lists events that may be monitored by the PMU. Each of the Performance Monitor Count Registers (PMN0 and PMN1) can count any listed event. Software selects wh[...]

  • Page 153

    12-7 Performance Monitoring
    Some typical combinations of counted events are listed in this section and summarized in Table 12-5. In this section, we call such an event combination a mode.
    12.5.1 Instruction Cache Efficiency Mode
    PM[...]

  • Page 154

    12-8 Performance Monitoring
    12.5.2 Data Cache Efficiency Mode
    PMN0 totals the number of data cache accesses, which includes cacheable and non-cacheable accesses, mini-data cache access and accesses made to locations configured as da[...]

  • Page 155

    12-9 Performance Monitoring
    12.5.4 Data/Bus Request Buffer Full Mode
    The Data Cache has buffers available to service cache misses or uncacheable accesses. For every memory request that the Data Cache receives from the processor core, [...]

  • Page 156

    12-10 Performance Monitoring
    PMN1 counts the number of writeback operations emitted by the data cache. These writebacks occur when the data cache evicts a dirty line of data to make room for a newly requested line or as the result o[...]

  • Page 157

    12-11 Performance Monitoring
    12.6 Multiple Performance Monitoring Run Statistics
    Even though only two events can be monitored at any given time, multiple performance monitoring runs can be done, capturing different events from differen[...]

  • Page 158

    12-12 Performance Monitoring
    12.7 Examples
    In this example, the events selected with the Instruction Cache Efficiency mode are monitored and CCNT is used to measure total execution time. Sampling time ends when PMN0 overflows which[...]

  • Page 159

    13-1 Software Debug 13
    This chapter describes software debug and related features in the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with ARM* Architecture V5TE), namely:
    • debug modes, registers and exceptions
    • a serial debug communication link via the J[...]

  • Page 160

    13-2 Software Debug
    13.3 Introduction
    The Intel® 80200 processor debug unit, when used with a debugger application, allows software running on an Intel® 80200 processor target to be debugged. The debug unit allows the deb[...]

  • Page 161

    13-3 Software Debug
    13.4 Debug Control and Status Register (DCSR)
    The DCSR register is the main control register for the debug unit. Table 13-1 shows the format of the register. The DCSR register can be accessed in privileged[...]

  • Page 162

    13-4 Software Debug
    13.4.1 Global Enable Bit (GE)
    The Global Enable bit disables and enables all debug functionality (except the reset vector trap). Following a processor reset, this bit is clear so all debug functionality is d[...]

  • Page 163

    13-5 Software Debug
    13.4.3 Vector Trap Bits (TF, TI, TD, TA, TS, TU, TR)
    The Vector Trap bits allow instruction breakpoints to be set on exception vectors without using up any of the breakpoint registers. When a bit is set, it acts as i[...]

  • Page 164

    13-6 Software Debug
    13.5 Debug Exceptions
    A debug exception causes the processor to re-direct execution to a debug event handling routine. The Intel® 80200 processor debug architecture defines the following debug exceptions: [...]

  • Page 165

    13-7 Software Debug
    During Halt mode, software running on the Intel® 80200 processor cannot access DCSR, or any of the hardware breakpoint registers, unless the processor is in Special Debug State (SDS), described below. When a debug exc[...]

  • Page 166

    13-8 Software Debug
    13.5.2 Monitor Mode
    In monitor mode, the processor handles debug exceptions like normal ARM exceptions. If debug functionality is enabled (DCSR[31] = 1) and the processor is in Monitor mode, debug excepti[...]

  • Page 167

    13-9 Software Debug
    13.6 HW Breakpoint Resources
    The Intel® 80200 processor debug architecture defines two instruction and two data breakpoint registers, denoted IBCR0, IBCR1, DBR0, and DBR1. The instruction and data [...]

  • Page 168

    13-10 Software Debug
    13.6.2 Data Breakpoints
    The Intel® 80200 processor debug architecture defines two data breakpoint registers (DBR0, DBR1). The format of the registers is shown in Table 13-4. DBR0 is a dedicated data [...]

  • Page 169

    13-11 Software Debug
    When DBR1 is programmed as a data address mask, it is used in conjunction with the address in DBR0. The bits set in DBR1 are ignored by the processor when comparing the address of a memory access with the addre[...]
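
The masked comparison described above can be written as a single expression: the access address and DBR0 may differ only in bit positions that are set in DBR1. The sketch below models that comparison alone, with a hypothetical function name; how the breakpoint is enabled and which access types trigger it are outside this excerpt.

```python
def data_breakpoint_matches(access_address, dbr0, dbr1_mask):
    """True if the access address equals DBR0 on every bit not set in the DBR1 mask."""
    compared_bits = ~dbr1_mask & 0xFFFFFFFF     # bits set in DBR1 are ignored
    return (access_address ^ dbr0) & compared_bits == 0


# Mask 0xFF ignores the low eight bits, so any address in 0x1000..0x10FF matches.
print(data_breakpoint_matches(0x1004, 0x1000, 0xFF))
print(data_breakpoint_matches(0x1104, 0x1000, 0xFF))
```

With a mask of zero the check degenerates to an exact address comparison.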

  • Page 170

    13-12 Software Debug
    13.8 Transmit/Receive Control Register (TXRXCTRL)
    Communications between the debug handler and debugger are controlled through handshaking bits that ensure the debugger and debug handler make synchronized[...]

  • Page 171

    13.8.1 RX Register Ready Bit (RR) The debugger and debug handler use the RR bit to synchronize accesses to RX. Normally, the debugger and debug handler use a handshaking scheme that requires both sides to poll t[...]

  • Page 172

    13.8.2 Overflow Flag (OV) The Overflow flag is a sticky flag that is set when the debugger writes to the RX register while the RR bit is set. The flag is used during high-speed download to indicate that some dat[...]
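
A small software model may help make the sticky behavior concrete. It is a sketch under stated assumptions, not a description of the hardware: the struct and function names are hypothetical, and two details are assumed rather than taken from the text above (that a write arriving while RR is set is dropped, and that a handler read of RX clears RR).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the RX register with its RR and OV bits. */
struct rx_channel_model {
    uint32_t rx;  /* last value accepted from the debugger */
    bool rr;      /* RX Register Ready: unread data is present */
    bool ov;      /* Overflow: sticky, set on a write while rr is set */
};

/* Debugger side: write a word to RX. */
static void debugger_write_rx(struct rx_channel_model *ch, uint32_t data)
{
    if (ch->rr) {
        ch->ov = true;  /* sticky: this model never clears it implicitly */
        return;         /* assumption: the new word is dropped */
    }
    ch->rx = data;
    ch->rr = true;
}

/* Handler side: read RX (assumption: the read clears RR). */
static uint32_t handler_read_rx(struct rx_channel_model *ch)
{
    ch->rr = false;
    return ch->rx;
}
```

In this model a debugger performing a high-speed download would check `ov` once at the end rather than polling `rr` before every write.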

  • Page 173

    13.8.4 TX Register Ready Bit (TR) The debugger and debug handler use the TR bit to synchronize accesses to the TX register. The debugger and debug handler must poll the TR bit before accessing the TX register. T [...]

  • Page 174

    13.9 Transmit Register (TX) The TX register is the debug handler transmit buffer. The debug handler sends data to the debugger through this register. Since the TX register is accessed by the debug handle[...]

  • Page 175

    13.11 Debug JTAG Access There are four JTAG instructions used by the debugger during software debug: LDIC, SELDCSR, DBGTX and DBGRX. LDIC is described in Section 13.14, Downloading Code in the ICache. The o[...]

  • Page 176

    13.11.2 SELDCSR JTAG Register Placing the “SELDCSR” JTAG instruction in the JTAG IR selects the DCSR JTAG Data register (Figure 13-1), allowing the debugger to access the DCSR, generate an external[...]

  • Page 177

    13.11.2.1 DBG.HLD_RST The debugger uses DBG.HLD_RST when loading code into the instruction cache during a processor reset. Details about loading code into the instruction cache are in Section 13.14, Downloading[...]

  • Page 178

    13.11.2.2 DBG.BRK DBG.BRK allows the debugger to generate an external debug break and asynchronously redirect execution to a debug handling routine. A debugger sets an external debug break by scann[...]

  • Page 179

    13.11.4 DBGTX JTAG Register The DBGTX JTAG instruction selects the Debug JTAG Data register (Figure 13-3). The debugger uses the DBGTX data register to poll for breaks (internal and external) to debug mo[...]

  • Page 180

    13.11.6 DBGRX JTAG Register The DBGRX JTAG instruction selects the DBGRX JTAG Data register. The debugger uses the DBGRX data register to send data or commands to the debug handler. A Capture_DR loads [...]

  • Page 181

    13.11.6.1 RX Write Logic The RX write logic (Figure 13-6) serves 4 functions: 1) Enable the debugger write to RX - the logic ensures only new, valid data from the debugger is written to RX. In particular, when t[...]

  • Page 182

    13.11.6.2 DBGRX Data Register The bits in the DBGRX data register (Figure 13-6) are used by the debugger to send data to the processor. The data register also contains a bit to flush previously written[...]

  • Page 183

    13.11.6.4 DBG.V The debugger sets this bit to indicate the data scanned into DBG_SR[34:3] is valid data to write to RX. DBG.V is an input to the RX Write Logic and is also cleared by the RX Write Logic. When t[...]

  • Page 184

    13.12 Trace Buffer The 256 entry trace buffer provides the ability to capture control flow information to be used for debugging an application. Two modes are supported: 1. The buffer fills up completely and [...]

  • Page 185

    When the trace buffer is enabled, reading and writing to either checkpoint register has unpredictable results. When the trace buffer is disabled, writing to a checkpoint register sets the register to the value written.[...]

  • Page 186

    13.13 Trace Buffer Entries Trace buffer entries consist of either one or five bytes. Most entries are one byte messages indicating the type of control flow change. The target address of the control flo[...]

  • Page 187

    13.13.1.1 Exception Message Byte When any kind of exception occurs, an exception message is placed in the trace buffer. In an exception message byte, the message type bit (M) is always 0. The vector exception (VV[...]

  • Page 188

    13.13.1.2 Non-exception Message Byte Non-exception message bytes are used for direct branches, indirect branches, and rollovers. In a non-exception message byte, the 4-bit message type field (MMMM) speci[...]

  • Page 189

    13.13.1.3 Address Bytes Only indirect branch entries contain address bytes in addition to the message byte. Indirect branch entries always have four address bytes indicating the target of that indirect branch. Wh[...]

  • Page 190

    13.13.2 Trace Buffer Usage The Intel® 80200 processor trace buffer is 256 bytes in length. The first byte read from the buffer represents the oldest trace history information in the buffer. The last (256[...]

  • Page 191

    As the trace buffer is read, the oldest entries are read first. Reading a series of 5 (or more) consecutive “0b0000 0000” entries in the oldest entries indicates that the trace buffer has not wrapped around a[...]
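
A host-side check for this condition might look like the sketch below. It assumes the bytes have already been read out into an array in read order (oldest first) and that the run of zero entries begins at the very first byte; the function names and the threshold macro are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ZERO_RUN_THRESHOLD 5  /* "5 (or more)" consecutive zero entries */

/* Returns true when the oldest entries start with at least
 * ZERO_RUN_THRESHOLD bytes of 0b00000000, which the text above gives
 * as the sign that the buffer has not wrapped around. */
static bool trace_buffer_not_wrapped(const uint8_t *entries, size_t len)
{
    if (len < ZERO_RUN_THRESHOLD)
        return false;
    for (size_t i = 0; i < ZERO_RUN_THRESHOLD; i++) {
        if (entries[i] != 0x00)
            return false;
    }
    return true;
}

/* Index of the first non-zero byte, or len if every byte is zero. */
static size_t first_nonzero_entry(const uint8_t *entries, size_t len)
{
    size_t i = 0;
    while (i < len && entries[i] == 0x00)
        i++;
    return i;
}
```

A decoder could call `trace_buffer_not_wrapped` first and, when it returns true, begin parsing at the index returned by `first_nonzero_entry`.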

  • Page 192

    13.14 Downloading Code in the ICache On the Intel® 80200 processor, a 2K mini instruction cache, physically separate from the 32K main instruction cache, can be used as an on-chip instruction RAM. An externa[...]

  • Page 193

    13.14.2 LDIC JTAG Data Register The LDIC JTAG Data Register is selected when the LDIC JTAG instruction is in the JTAG IR. An external host can load and invalidate lines in the instruction cache through this [...]

  • Page 194

    13.14.3 LDIC Cache Functions The Intel® 80200 processor supports four cache functions that can be executed through JTAG. Two functions allow an external host to download code into the main instruction ca[...]

  • Page 195

    All packets are 33 bits in length. Bits [2:0] of the first packet specify the function to execute. For functions that require an address, bits [32:6] of the first packet specify an 8-word aligned address (Packet1[3[...]
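
As a rough illustration of the packet layout, the sketch below packs the first packet into the low 33 bits of a 64-bit integer. Several points are assumptions, since the text above is truncated: that bits [32:6] carry address bits [31:5] (27 bits, consistent with an 8-word, 32-byte alignment), and that bits [5:3] are left as zero. The function name is hypothetical and the function codes themselves are not listed here, so the caller supplies one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build the first 33-bit LDIC packet.
 * function : 3-bit function code for bits [2:0] (values not shown here)
 * address  : 32-byte (8-word) aligned address; pass 0 when the
 *            function takes no address
 * packet   : receives the packet in bits [32:0]
 * Returns false for an out-of-range code, an unaligned address,
 * or a NULL output pointer. */
static bool ldic_make_packet1(unsigned function, uint32_t address,
                              uint64_t *packet)
{
    if (function > 0x7u || (address & 0x1Fu) != 0 || packet == NULL)
        return false;
    /* Assumption: address bits [31:5] map onto packet bits [32:6]. */
    *packet = ((uint64_t)(address >> 5) << 6) | function;
    return true;
}
```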

  • Page 196

    13.14.4 Loading IC During Reset Code can be downloaded into the instruction cache through JTAG during a processor reset. This feature is used during software debug to download the debug handler prior t[...]

  • Page 197

    13.14.4.1 Loading IC During Cold Reset for Debug Figure 13-12 shows the actions necessary to download code into the instruction cache during a cold reset for debug. NOTE: In Figure 13-12 hold_rst is a signal[...]

  • Page 198

    An external host should take the following steps to load code into the instruction cache following a cold reset: • Assert the RESET# and TRST# pins: This resets the JTAG IR to IDCODE and invalidates the inst[...]

  • Page 199

    13.14.4.2 Loading IC During a Warm Reset for Debug Loading the instruction cache during a warm reset may be a slightly different situation than during a cold reset. For a warm reset, the main issue is whether the in[...]

  • Page 200

    If it is necessary to download code into the instruction cache then: 2) Assert TRST#. This clears the Halt Mode bit allowing the instruction cache to be invalidated. 3) Clear the Halt Mode bit through JTAG. Th[...]

  • Page 201

    13.14.5 Dynamically Loading IC After Reset An external host can load code into the instruction cache “on the fly” or “dynamically”. This occurs when the host downloads code while the processor is not being [...]

  • Page 202

    The following steps describe the details for downloading code: • Since the debug handler is responsible for synchronization during the code download, the handler must be executing before the host can begin[...]

  • Page 203

    13.14.5.1 Dynamic Code Download Synchronization The following pieces of code are necessary in the debug handler to implement the synchronization used during dynamic code download. The pieces must be ordered in the [...]

  • Page 204

    13.14.6 Mini Instruction Cache Overview The mini instruction cache is a smaller version of the main instruction cache (Refer to Chapter 4 for more details on the main instruction cache). It is a 2KB, 2-way set[...]

  • Page 205

    13.15 Halt Mode Software Protocol This section describes the overall debug process in Halt Mode. It describes how to start and end a debug session and details for implementing a debug handler. Intel provide[...]

  • Page 206

    13.15.1.2 Placing the Handler in Memory The debug handler is not required to be placed at a specific pre-defined address. However, there are some limitations on where the handler can be placed due to the overr[...]

  • Page 207

    13.15.2 Implementing a Debug Handler The debugger uses the debug handler to examine or modify processor state by sending commands and reading data through JTAG. The API between the debugger and debug handle[...]

  • Page 208

    13.15.2.3 Dynamic Debug Handler On the Intel® 80200 processor, the debug handler and override vector tables reside in the 2 KB mini instruction cache, separate from the main instruction cache. A “stati[...]

  • Page 209

    2. Using the Main IC The steps for downloading dynamic functions into the main instruction cache are similar to downloading into the mini instruction cache. However, using the main instruction cache has its advan[...]

  • Page 210

    13.15.2.4 High-Speed Download Special debug hardware has been added to support a high-speed download mode to increase the performance of downloads to system memory (vs. writing a block of memory using t[...]

  • Page 211

    13.15.3 Ending a Debug Session Prior to ending a debug session, the debugger should take the following actions: • Clear the DCSR (disable debug, exit Halt Mode, clear all vector traps, disable the trace buffer)[...]

  • Page 212

    13.16 Software Debug Notes/Errata 1. Trace buffer message count value on data aborts: LDR to non-PC that aborts gets counted in the exception message. But an LDR to the PC that aborts does not get counted o[...]

  • Page 213

    14 Performance Considerations This chapter describes relevant performance considerations that compiler writers, application programmers and system designers need to be aware of to efficiently use the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Archite[...]

  • Page 214

    14.2 Branch Prediction The Intel® 80200 processor implements dynamic branch prediction for the ARM* instructions B and BL and for the Thumb* instruction B. Any instruction that specifies the PC as [...]

  • Page 215

    14.4 Instruction Latencies The latencies for all the instructions are shown in the following sections with respect to their functional groups: branch, data processing, multiply, status register access, lo[...]

  • Page 216

    • Minimum Resource Latency The minimum cycle distance from the issue clock of the current multiply instruction to the issue clock of the next multiply instruction assuming the second multiply does[...]

  • Page 217

    14.4.3 Data Processing Instruction Timings Table 14-5. Branch Instruction Timings (Those not predicted by the BTB) Mnemonic Minimum Issue Latency when Branch Not Taken Minimum Issue Latency when Bra[...]

  • Page 218

    14.4.4 Multiply Instruction Timings Table 14-7. Multiply Instruction Timings (Sheet 1 of 2) Mnemonic Rs Value (Early Termination) S-Bit Value Minimum Issue Latency Minimum Result Latency[...]

  • Page 219

    UMULL Rs[31:15] = 0x00000 0 1 RdLo = 2; RdHi = 3 2 13 3 3 Rs[31:27] = 0x00 0 1 RdLo = 3; RdHi = 4 3 14 4 4 all others 0 1 RdLo = 4; RdHi = 5 4 15 5 5 1. If the next instruction needs to use the result [...]

  • Page 220

    14.4.5 Saturated Arithmetic Instructions 14.4.6 Status Register Access Instructions 14.4.7 Load/Store Instructions Table 14-10. Saturated Data Processing Instruction Timings Mnemonic Minimum Is[...]

  • Page 221

    14.4.8 Semaphore Instructions 14.4.9 Coprocessor Instructions 14.4.10 Miscellaneous Instruction Timing 14.4.11 Thumb* Instructions The timing of Thumb instructions is the same as their equivalent ARM[...]

  • Page 222

    [...]

  • Page 223

    A Compatibility: Intel® 80200 Processor vs. SA-110 This appendix highlights the differences between the first generation Intel® StrongARM* technology (SA-110) and the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE). A.1 Introduc[...]

  • Page 224

    Feature / Parameter Brief Description or Note SA-110 Intel® 80200 Processor Main Execution Pipeline Scalar, in-order execution, single issue •• • RISC Superpipeline[...]

  • Page 225

    A.3 Architecture Deviations A.3.1 Read Buffer A Read Buffer is not supported on the Intel® 80200 processor and the definition of CP15 register 9 has changed from controlli[...]

  • Page 226

    A.3.4 Write Buffer Behavior Definition of Coalescing: Coalescing means bringing together a new store operation with an existing store operation already resident in the wri[...]

  • Page 227

    A.3.6 Performance Differences There exist significant performance differences in program execution between SA-110 and the Intel® 80200 processor. If an SA-110 applicati[...]

  • Page 228

    [...]

  • Page 229

    B Optimization Guide B.1 Introduction This appendix contains optimization techniques for achieving the highest performance from the Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE). It is written for developers who are optimizing [...]

  • Page 230

    B.2 Intel® 80200 Processor Pipeline One of the biggest differences between the Intel® 80200 processor and first-generation Intel® StrongARM* processors is the pipeline. Many of the differences are s[...]

  • Page 231

    B.2.1.2. Intel® 80200 Processor Pipeline Organization The Intel® 80200 processor single-issue superpipeline consists of a main execution pipeline, MAC pipeline, and a memory access pipeline. These are sh[...]

  • Page 232

    B.2.1.3. Out Of Order Completion Sequential consistency of instruction execution relates to two aspects: first, to the order in which the instructions are completed; and second, to the order in which memory is[...]

  • Page 233

    B.2.2 Instruction Flow Through the Pipeline The Intel® 80200 processor pipeline issues a single instruction per clock cycle. Instruction execution begins at the F1 pipestage and completes at the WB pipestag[...]

  • Page 234

    B.2.3 Main Execution Pipeline B.2.3.1. F1 / F2 (Instruction Fetch) Pipestages The job of the instruction fetch stages F1 and F2 is to present the next instruction to be executed to the ID stage. Several impo[...]

  • Page 235

    B.2.3.3. RF (Register File / Shifter) Pipestage The main function of the RF pipestage is to read and write to the register file unit (RFU). It provides source data to: • EX for ALU operations • MAC for mul[...]

  • Page 236

    B.2.4 Memory Pipeline The memory pipeline consists of two stages, D1 and D2. The data cache unit, or DCU, consists of the data-cache array, mini-data cache, fill buffers, and write buffers. The memory pipeli[...]

  • Page 237

    B.3 Basic Optimizations This chapter outlines optimizations specific to ARM architecture. These optimizations have been modified to suit the Intel® 80200 processor architecture where needed. B.3.1 Conditional[...]

  • Page 238

    B.3.1.2. Optimizing Branches Branches decrease application performance by indirectly causing pipeline stalls. Branch prediction improves the performance by lessening the delay inherent in fetching a new ins[...]

  • Page 239

    P2 Percentage of times we are likely to incur a branch misprediction penalty N1_C Number of cycles to execute the if-else portion using conditional instructions assuming the if-condition to be true N2_C Number [...]

  • Page 240

    B.3.1.3. Optimizing Complex Expressions Conditional instructions should also be used to improve the code generated for complex expressions such as the C shortcut evaluation feature. Consider the following [...]

  • Page 241

    B.3.2 Bit Field Manipulation The Intel® 80200 processor shift and logical operations provide a useful way of manipulating bit fields. Bit field operations can be optimized as follows: ;Set the bit number sp[...]
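
For readers working in C rather than assembly, the same shift-and-logical idiom can be sketched as below. These helpers are illustrative only and are not taken from the manual; the names are hypothetical, and a compiler targeting ARM would typically map each one onto a single data processing instruction with a shifted operand, though that depends on the toolchain.

```c
#include <stdint.h>

/* Set one bit; bit must be in the range 0..31. */
static uint32_t set_bit(uint32_t value, unsigned bit)
{
    return value | (UINT32_C(1) << bit);
}

/* Clear one bit; bit must be in the range 0..31. */
static uint32_t clear_bit(uint32_t value, unsigned bit)
{
    return value & ~(UINT32_C(1) << bit);
}

/* Extract a field of `width` bits starting at bit `lsb`.
 * Requires lsb <= 31, width >= 1 and lsb + width <= 32. */
static uint32_t extract_field(uint32_t value, unsigned lsb, unsigned width)
{
    uint32_t mask = (width >= 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1u);
    return (value >> lsb) & mask;
}
```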

  • Page 242

    B.3.3 Optimizing the Use of Immediate Values The Intel® 80200 processor MOV or MVN instruction should be used when loading an immediate (constant) value into a register. Please refer to the ARM Architectur[...]

  • Page 243

    B.3.4 Optimizing Integer Multiply and Divide Multiplication by an integer constant should be optimized to make use of the shift operation whenever possible. ;Multiplication of R0 by 2^n mov r0, r0, LSL #n ;Mu[...]
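
The shift-based multiplication can also be expressed in C, as in the sketch below. The first helper mirrors the multiply-by-2^n case from the text; the second, multiplication by 10 as a sum of two shifts, is an illustration added here and does not come from the manual. Both function names are hypothetical.

```c
#include <stdint.h>

/* x * 2^n as a left shift; n must be less than 32.
 * The result wraps modulo 2^32, as unsigned multiplication does. */
static uint32_t mul_pow2(uint32_t x, unsigned n)
{
    return x << n;
}

/* x * 10 decomposed as x*8 + x*2, using two shifts and one add. */
static uint32_t mul_by_10(uint32_t x)
{
    return (x << 3) + (x << 1);
}
```

Modern compilers usually perform this strength reduction on their own, so writing the shifts by hand mainly matters in assembly or when inspecting generated code.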

  • Page 244

    B.3.5 Effective Use of Addressing Modes The Intel® 80200 processor provides a variety of addressing modes that make indexing an array of objects highly efficient. For a detailed description of these addr[...]

  • Page 245

    B.4 Cache and Prefetch Optimizations This chapter considers how to use the various cache memories in all their modes and then examines when and how to use prefetch to improve execution efficiencies. B.4.1 Inst[...]

  • Page 246

    B.4.1.4. Locking Code into the Instruction Cache One very important instruction cache feature is the ability to lock code into the instruction cache. Once locked into the instruction cache, the code is always[...]

  • Page 247

    B.4.2 Data and Mini Cache The Intel® 80200 processor allows the user to define memory regions whose cache policies can be set by the user (see Section 6.2.3, “Cache Policies”). Supported policies and con[...]

  • Page 248

    B.4.2.3. Read Allocate and Read-write Allocate Memory Regions Most of the regular data and the stack for your application should be allocated to a read-write allocate region. It is expected that you write[...]

  • Page 249

    B.4.2.5. Mini-data Cache The mini-data cache is best used for data structures that have short temporal lives and/or cover vast amounts of data space. Addressing these types of data spaces from the Data cach[...]

  • Page 250

    B.4.2.6. Data Alignment Cache lines begin on 32-byte address boundaries. To maximize cache line use and minimize cache pollution, data structures should be aligned on 32 byte boundaries and sized to multi[...]
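
In C11 the alignment advice can be expressed directly in a type definition. The sketch below is a minimal example, assuming a C11 compiler; the structure, its fields, and the macro name are hypothetical and chosen only so that the members add up to one 32-byte line.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32  /* cache line size stated in the text above */

/* Hypothetical record sized to fill exactly one cache line.
 * alignas on the first member raises the alignment of the whole
 * struct, and the compiler pads sizeof up to a multiple of it. */
struct sample_record {
    alignas(CACHE_LINE_BYTES) uint32_t id;  /*  4 bytes */
    uint32_t values[6];                     /* 24 bytes */
    uint8_t flags[4];                       /*  4 bytes, 32 in total */
};

/* Compile-time checks so a later field change cannot silently
 * break the layout. */
_Static_assert(alignof(struct sample_record) == CACHE_LINE_BYTES,
               "record must start on a cache line boundary");
_Static_assert(sizeof(struct sample_record) % CACHE_LINE_BYTES == 0,
               "record size must be a multiple of the cache line");
```

Note that `alignas` only covers objects with static or automatic storage; memory obtained from a general-purpose allocator needs an aligned allocation routine such as C11 `aligned_alloc`.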

  • Page 251

    B.4.2.7. Literal Pools The Intel® 80200 processor does not have a single instruction that can move all literals (a constant or address) to a register. One technique to load registers with literals in the Int[...]

  • Page 252

    B.4.3 Cache Considerations B.4.3.1. Cache Conflicts, Pollution and Pressure Cache pollution occurs when unused data is loaded in the cache and cache pressure occurs when data that is not temporal to the cur[...]

  • Page 253

    B.4.4 Prefetch Considerations The Intel® 80200 processor has a true prefetch load instruction (PLD). The purpose of this instruction is to preload data into the data and mini-data caches. Data prefetching al[...]

  • Page 254

    The Intel® 80200 processor needs seven bus clocks to process a memory request to the SDRAM (N_processor). Typical SDRAM needs 2 to 3 bus clocks to select the memory locations provided that the current [...]

  • Page 255

    B.4.4.2. Prefetch Loop Scheduling When adding prefetch to a loop which operates on arrays, it may be advantageous to prefetch ahead one, two, or more iterations. The data for future iterations is located in memory b[...]

  • Page 256

    B.4.4.6. Bandwidth Limitations Overuse of prefetches can usurp resources and degrade performance. This happens because once the bus traffic requests exceed the system resource capacity, the processor stal[...]

  • Page 257

    B.4.4.7. Cache Memory Considerations Stride, the way data structures are walked through, can affect the temporal quality of the data and reduce or increase cache conflicts. The Intel® 80200 processor data cac[...]

  • Page 258

    on a 32-byte boundary, modifications to the Year2Date fields are likely to use two write buffers when the data is written out to memory. However, we can restrict the number of write buffers that are comm[...]

  • Page 259

    B.4.4.8. Cache Blocking Cache blocking techniques, such as strip-mining, are used to improve temporal locality of the data. Given a large data set that can be reused across multiple passes of a loop, data blocki[...]

  • Page 260

    B.4.4.10. Pointer Prefetch Not all looping constructs contain induction variables. However, prefetching techniques can still be applied. Consider the following linked list traversal example: while(p) [...]
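
Since the manual's own listing is cut off here, the following is an independent sketch of the general idea rather than a reconstruction of it: while the current node is processed, the next node is requested. The `struct node` layout and `sum_list` are hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin assumed for illustration.

```c
#include <stddef.h>

/* Hypothetical list node used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Walk a linked list, hinting the next node into the cache while the
   current node is being processed. There is no induction variable, so
   the pointer itself supplies the prefetch address. */
long sum_list(const struct node *p)
{
    long total = 0;
    while (p) {
        if (p->next)
            __builtin_prefetch(p->next, 0, 3);
        total += p->value;      /* work on the current node */
        p = p->next;
    }
    return total;
}
```

The benefit depends on how much work is done per node: with very little work, the next node's data may not arrive before it is needed.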

  • Page 261

    B.4.4.11. Loop Interchange As mentioned earlier, the sequence in which data is accessed affects cache thrashing. Usually, it is best to access data in a spatially contiguous address range. However, arrays of [...]
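
As an illustration of loop interchange, the sketch below sums a two-dimensional C array in two loop orders. C stores such arrays row by row, so making the column index the inner loop gives contiguous accesses. The array dimensions and function names are hypothetical.

```c
#define ROWS 64
#define COLS 48

/* Column index in the outer loop: consecutive accesses are COLS ints
   apart, so each access may touch a different cache line. */
long sum_strided(int m[ROWS][COLS])
{
    long total = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            total += m[i][j];
    return total;
}

/* Loops interchanged: the inner loop walks along a row, so consecutive
   accesses fall in a contiguous address range. */
long sum_contiguous(int m[ROWS][COLS])
{
    long total = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            total += m[i][j];
    return total;
}
```

Interchange is safe here because addition order does not affect the integer sum; loops with dependences between iterations need to be checked before reordering.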

  • Page 262

    B.4.4.13. Prefetch to Reduce Register Pressure Prefetch can be used to reduce register pressure. When data is needed for an operation, then the load is scheduled far enough in advance to hide the load l[...]

  • Page 263

    B.5 Instruction Scheduling This chapter discusses instruction scheduling optimizations. Instruction scheduling refers to the rearrangement of a sequence of instructions for the purpose of minimizing pipel[...]

  • Page 264

    ; all other registers are in use
    sub r1, r6, r7
    mul r3, r6, r2
    mov r2, r2, LSL #2
    orr r9, r9, #0xf
    add r0, r4, r5
    ldr r6, [r0]
    add r8, r6, r8
    add r8, r8, #4
    orr r8, r8, #0xf
    ; The value in register r6 is not used af[...]

  • Page 265

    B.5.1.1. Scheduling Load and Store Double (LDRD/STRD) The Intel® 80200 processor introduces two new double word instructions: LDRD and STRD. LDRD loads 64 bits of data from an effective address into two co[...]

  • Page 266

    B.5.1.2. Scheduling Load and Store Multiple (LDM/STM) LDM and STM instructions have an issue latency of 2-20 cycles depending on the number of registers being loaded or stored. The issue latency is typicall[...]

  • Page 267

    B.5.2 Scheduling Data Processing Instructions Most Intel® 80200 processor data processing instructions have a result latency of 1 cycle. This means that the current instruction is able to use the result fr[...]

  • Page 268

    B.5.3 Scheduling Multiply Instructions Multiply instructions can cause pipeline stalls due to either resource conflicts or result latencies. The following code segment would incur a stall of 0-3 cycles [...]

  • Page 269

    B.5.4 Scheduling SWP and SWPB Instructions The SWP and SWPB instructions have a 5 cycle issue latency. As a result of this latency, the instruction following the SWP/SWPB instruction would stall for 4 cycles.[...]

  • Page 270

    B.5.5 Scheduling the MRA and MAR Instructions (MRRC/MCRR) The MRA (MRRC) instruction has an issue latency of 1 cycle, a result latency of 2 or 3 cycles depending on the destination register value being ac[...]

  • Page 271

    B.5.6 Scheduling the MIA and MIAPH Instructions The MIA instruction has an issue latency of 1 cycle. The result and resource latency can vary from 1 to 3 cycles depending on the values in the source register[...]

  • Page 272

    B.5.7 Scheduling MRS and MSR Instructions The MRS instruction has an issue latency of 1 cycle and a result latency of 2 cycles. The MSR instruction has an issue latency of 2 cycles (6 if updating the mode bi[...]

  • Page 273

    B.6 Optimizing C Libraries Many of the standard C library routines can benefit greatly by being optimized for the Intel® 80200 processor architecture. The following string and memory manipulation routines sho [...]

  • Page 274

    [...]

  • Page 275

    Appendix C Test Features The Intel® 80200 processor based on Intel® XScale™ microarchitecture (compliant with the ARM* Architecture V5TE) implements Design For Test (DFT) techniques to ensure quality and reliability. This appendix describes those techniques. C.1 Introduction Testing VLSI circuits is[...]

  • Page 276

    C.2.1 Boundary Scan Architecture Boundary scan test logic consists of a Boundary-Scan register and support logic. These are accessed through a Test Access Port (TAP). The TAP provides a simple serial interface[...]

  • Page 277

    C.2.2 TAP Pins The Intel® 80200 processor TAP is composed of four input connections (TMS, TCK, TRST# and TDI) and one output connection (TDO). These pins are described in Table C-1. The TAP [...]

  • Page 278

    C.2.3 Instruction Register (IR) The instruction register holds instruction codes shifted through the Test Data Input (TDI) pin. The instruction codes are used to select the specific test operation to be performed[...]

  • Page 279

    Table C-3. IEEE Instructions
    Instruction / Requisite | Opcode | Description
    extest, IEEE 1149.1 Required | 00000₂ | extest initiates testing of external circuitry, typically board-level interconnects and off chip circuit[...]

  • Page 280

    C.2.4 TAP Test Data Registers The Intel® 80200 processor contains a device identification register and two test data registers (Bypass and RUNBIST). Each test data register selected by the TAP controller is con[...]

  • Page 281

    C.2.5 TAP Controller The TAP controller is a 16-state synchronous finite state machine that controls the sequence of test logic operations. The TAP can be controlled via a bus master. The bus master can be eith[...]
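
A 16-state TAP controller of this kind can be modelled as a small lookup table indexed by the current state and the TMS value sampled on each rising edge of TCK. The sketch below assumes the standard IEEE 1149.1 state diagram, which the excerpt above does not reproduce in full; the identifiers (`tap_state`, `tap_step`, `tap_run`) are hypothetical.

```c
/* TAP controller states, named as in IEEE 1149.1 (assumed here). */
enum tap_state {
    TEST_LOGIC_RESET, RUN_TEST_IDLE,
    SELECT_DR, CAPTURE_DR, SHIFT_DR, EXIT1_DR, PAUSE_DR, EXIT2_DR, UPDATE_DR,
    SELECT_IR, CAPTURE_IR, SHIFT_IR, EXIT1_IR, PAUSE_IR, EXIT2_IR, UPDATE_IR,
    TAP_STATE_COUNT
};

/* next_state[s][tms]: state entered on a rising TCK edge with TMS = tms. */
static const enum tap_state next_state[TAP_STATE_COUNT][2] = {
    [TEST_LOGIC_RESET] = { RUN_TEST_IDLE, TEST_LOGIC_RESET },
    [RUN_TEST_IDLE]    = { RUN_TEST_IDLE, SELECT_DR },
    [SELECT_DR]        = { CAPTURE_DR,    SELECT_IR },
    [CAPTURE_DR]       = { SHIFT_DR,      EXIT1_DR },
    [SHIFT_DR]         = { SHIFT_DR,      EXIT1_DR },
    [EXIT1_DR]         = { PAUSE_DR,      UPDATE_DR },
    [PAUSE_DR]         = { PAUSE_DR,      EXIT2_DR },
    [EXIT2_DR]         = { SHIFT_DR,      UPDATE_DR },
    [UPDATE_DR]        = { RUN_TEST_IDLE, SELECT_DR },
    [SELECT_IR]        = { CAPTURE_IR,    TEST_LOGIC_RESET },
    [CAPTURE_IR]       = { SHIFT_IR,      EXIT1_IR },
    [SHIFT_IR]         = { SHIFT_IR,      EXIT1_IR },
    [EXIT1_IR]         = { PAUSE_IR,      UPDATE_IR },
    [PAUSE_IR]         = { PAUSE_IR,      EXIT2_IR },
    [EXIT2_IR]         = { SHIFT_IR,      UPDATE_IR },
    [UPDATE_IR]        = { RUN_TEST_IDLE, SELECT_DR },
};

/* Advance the controller by one TCK rising edge. */
enum tap_state tap_step(enum tap_state s, int tms)
{
    return next_state[s][tms ? 1 : 0];
}

/* Apply a sequence of TMS values, one per TCK rising edge. */
enum tap_state tap_run(enum tap_state s, const int *tms_bits, int count)
{
    for (int i = 0; i < count; i++)
        s = tap_step(s, tms_bits[i]);
    return s;
}
```

In this table, holding TMS high for five consecutive clocks reaches the Test-Logic-Reset state from any starting state, and the TMS sequence 0, 1, 1, 0, 0 applied from reset ends in Shift-IR.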

  • Page 282

    C.2.5.1. Test Logic Reset State In this state, test logic is disabled to allow normal operation of the Intel® 80200 processor. Test logic is disabled by loading the idcode register. No matter what the sta[...]

  • Page 283

    C.2.5.5. Shift-DR State In this controller state, the test data register, which is connected between TDI and TDO as a result of the current instruction, shifts data one bit position nearer to its serial outpu[...]

  • Page 284

    C.2.5.9. Update-DR State The Boundary-Scan register is provided with a latched parallel output. This output prevents changes at the parallel output while data is shifted in response to the extest, sam[...]
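
The separation between the shift stage and the latched parallel output can be sketched as a small software model. This is a simplified illustration, not the processor's actual register: the 8-bit length, the `scan_reg` structure, and the function names are hypothetical.

```c
#include <stdint.h>

#define REG_BITS 8  /* hypothetical register length for illustration */

struct scan_reg {
    uint8_t shift;  /* shift stage, connected between TDI and TDO */
    uint8_t latch;  /* latched parallel output */
};

/* One shift clock: the bit nearest the serial output leaves on TDO, the
   remaining bits move one position toward it, and TDI enters at the far
   end. The latched output is deliberately left untouched. */
int scan_shift(struct scan_reg *r, int tdi)
{
    int tdo = r->shift & 1;
    r->shift = (uint8_t)((r->shift >> 1) | ((tdi & 1) << (REG_BITS - 1)));
    return tdo;
}

/* Update step: copy the shifted-in value to the parallel output at once. */
void scan_update(struct scan_reg *r)
{
    r->latch = r->shift;
}
```

Because only `scan_update` writes `latch`, the parallel output holds its previous value during the whole shift sequence and changes in a single step afterwards.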

  • Page 285

    C.2.5.13. Exit1-IR State This is a temporary state. If TMS is held high on the rising edge of TCK, the controller enters the Update-IR state, which terminates the scanning process. If TMS is held low on the ri[...]

  • Page 286

    C.2.5.17. Boundary-Scan Example In the example that follows, two command actions are described. The example starts in the reset state; a new instruction is then loaded and executed. See Figure C-3 for a JTAG example[...]

  • Page 287

    Figure C-3. JTAG Example (waveform diagram; signals shown: TCK, TMS, TDI, Boundary Scan Instruction Register, Parallel Out; new instruction = 0001₂, old instruction = abcd) [...]

  • Page 288

    Figure C-4. Timing Diagram Illustrating the Loading of Instruction Register. Signals shown: TCK, TMS, Controller State, TDI, Data input to IR, IR shift-register, Parallel output of IR, Data input to TDR, TDR shift-register, Parallel ou[...]

  • Page 289

    Figure C-5. Timing Diagram Illustrating the Loading of Data Register. Signals shown: TCK, TMS, Controller State, TDI, Data input to IR, IR shift-register, Parallel output of IR, Data input to TDR, TDR shift-register, Parallel output of [...]