Intel 253668-032US manual

A good user manual

The rules should oblige the seller to give the purchaser the operating instructions for the Intel 253668-032US along with the item. Missing instructions, or false information given to the customer, constitute grounds for a complaint on the basis of nonconformity of the goods with the contract. By law, a customer can receive the instructions in non-paper form; graphic and electronic manuals, as well as instructional videos, have recently become widespread. A necessary precondition is that the instructions be unambiguous and legible.

What is an instruction?

The term originates from the Latin word „instructio”, which means organizing. An instruction manual for the Intel 253668-032US therefore contains a description of a process. Its purpose is to teach and to ease start-up, use of the item, or performance of certain activities. It is a compilation of information about an item or a service; it is a guide.

Unfortunately, only a few customers take the time to read the manual for the Intel 253668-032US. A good user manual introduces a number of additional functions of the purchased item and helps to avoid most defects.

What should a perfect user manual contain?

First and foremost, a user manual for the Intel 253668-032US should contain:
- technical data for the Intel 253668-032US
- the name of the manufacturer and the year of construction of the Intel 253668-032US
- rules of operation, control, and maintenance of the Intel 253668-032US
- safety signs and certification marks that confirm conformity with the applicable standards

Why don't we read the manuals?

Usually it results from a lack of time and from confidence that one already knows the functions of the purchased item. Unfortunately, connecting and starting up the Intel 253668-032US is not enough. A manual contains guidance on individual functions, safety rules, maintenance methods (and what means should be used), possible defects of the Intel 253668-032US, and ways of resolving problems. If one still cannot find the answer, the manual directs the user to Intel service. Animated manuals and instructional videos have recently become popular among customers. These kinds of user manuals are effective; they help ensure that a customer goes through the whole material and does not skip complicated technical information about the Intel 253668-032US.

Why should one read the manuals?

It is mostly in the manuals that we find details about the construction and capabilities of the Intel 253668-032US and its use with the respective accessories, as well as information about all its functions and features.

After purchasing an item, one should take a moment to get to know every part of the manual. Manuals today are carefully prepared and translated so that their users can fully understand them. The manuals serve as an informational aid.

Table of contents for the manual

  • Page 1

    Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1 NOTE: The Intel® 64 and IA-32 Architectures Software Developer's Manual consists of five volumes: Basic Architecture, Order Number 253665; Instruction Set Reference A-M, Order Number 25[...]

  • Page 2

    ii Vol. 3A INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INT [...]

  • Page 3

    Vol. 3A iii CONTENTS CHAPTER 1 ABOUT THIS MANUAL: 1.1 PROCESSORS COVERED IN THIS MANUAL (1-1); 1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE (1-3); 1.3 NOTATIONAL CONVENTIONS [...]

  • Page 4

    CONTENTS iv Vol. 3A 2.7.5 Controlling the Processor (2-31); 2.7.6 Reading Performance-Monitoring and Time-Stamp Counters (2-32); 2.7.6.1 Reading Counters in 64-Bit Mode [...]

  • Page 5

    Vol. 3A v CONTENTS 4.9.3 Caching Paging-Related Information about Memory Typing (4-38); 4.10 CACHING TRANSLATION INFORMATION (4-38); 4.10.1 [...]

  • Page 6

    CONTENTS vi Vol. 3A 5.8.7.1 SYSENTER and SYSEXIT Instructions in IA-32e Mode (5-31); 5.8.8 Fast System Calls in 64-bit Mode (5-32); 5.9 PRIVILEGED INSTRUCTIONS [...]

  • Page 7

    Vol. 3A vii CONTENTS 6.14 EXCEPTION AND INTERRUPT HANDLING IN 64-BIT MODE (6-22); 6.14.1 64-Bit Mode IDT (6-23); 6.14.2 64-Bit Mode Stack Frame [...]

  • Page 8

    CONTENTS viii Vol. 3A CHAPTER 8 MULTIPLE-PROCESSOR MANAGEMENT: 8.1 LOCKED ATOMIC OPERATIONS (8-2); 8.1.1 Guaranteed Atomic Operations [...]

  • Page 9

    Vol. 3A ix CONTENTS 8.7.9 Memory Ordering (8-42); 8.7.10 Serializing Instructions (8-42); 8.7.11 MICROCOD[...]

  • Page 10

    CONTENTS x Vol. 3A 9.5 MEMORY TYPE RANGE REGISTERS (MTRRS) (9-9); 9.6 INITIALIZING SSE/SSE2/SSE3/SSSE3 EXTENSIONS (9-10); 9.7 SOFTWARE INITIALIZATION FOR REAL-ADDRESS MODE OPERATION [...]

  • Page 11

    Vol. 3A xi CONTENTS CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC): 10.1 LOCAL AND I/O APIC OVERVIEW (10-1); 10.2 SYSTEM BUS VS. APIC BUS [...]

  • Page 12

    CONTENTS xii Vol. 3A 10.7.2.4 Deriving Logical x2APIC ID from the Local x2APIC ID (10-50); 10.7.2.5 Broadcast/Self Delivery Mode (10-51); 10.7.2.6 Lowest Priority Delivery Mode [...]

  • Page 13

    Vol. 3A xiii CONTENTS 11.11 MEMORY TYPE RANGE REGISTERS (MTRRS) (11-30); 11.11.1 MTRR Feature Identification (11-32); 11.11.2 Setting Memory Ranges with MTRRs [...]

  • Page 14

    CONTENTS xiv Vol. 3A 13.1.6.1 Numeric Error flag and IGNNE# (13-8); 13.2 EMULATION OF SSE/SSE2/SSE3/SSSE3/SSE4 EXTENSIONS (13-8); 13.3 SAVING AND RESTORING THE SSE/SSE2/SSE3/SSSE3/SSE4 STATE [...]

  • Page 15

    Vol. 3A xv CONTENTS 15.3 MACHINE-CHECK MSRS (15-2); 15.3.1 Machine-Check Global Control MSRs (15-3); 15.3.1.1 IA32_MCG_CAP MSR [...]

  • Page 16

    CONTENTS xvi Vol. 3A CHAPTER 16 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER: 16.1 OVERVIEW OF DEBUG SUPPORT FACILITIES (16-1); 16.2 DEBUG REGISTERS [...]

  • Page 17

    Vol. 3A xvii CONTENTS 16.9 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS) (16-43); 16.10 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) [...]

  • Page 18

    CONTENTS xviii Vol. 3A CHAPTER 18 MIXING 16-BIT AND 32-BIT CODE: 18.1 DEFINING 16-BIT AND 32-BIT PROGRAM MODULES (18-2); 18.2 MIXING 16-BIT AND 32-BIT OPERATIONS WITHIN A CODE SEGMENT (18-2); 18.3 SHARING DATA AMONG MIXED-SIZE CODE SEGMENTS [...]

  • Page 19

    Vol. 3A xix CONTENTS 19.18.6.3 Numeric Underflow Exception (#U) (19-14); 19.18.6.4 Exception Precedence (19-14); 19.18.6.5 CS and EIP For FPU Exceptions [...]

  • Page 20

    CONTENTS xx Vol. 3A 19.25 EXCEPTIONS AND/OR EXCEPTION CONDITIONS (19-28); 19.25.1 Machine-Check Architecture (19-30); 19.25.2 Priority OF Exceptions [...]

  • Page 21

    Vol. 3A xxi CONTENTS 20.5 VIRTUAL-MACHINE CONTROL STRUCTURE (20-3); 20.6 DISCOVERING SUPPORT FOR VMX (20-3); 20.7 ENABLING AND ENTERING VMX OPERATION [...]

  • Page 22

    CONTENTS xxii Vol. 3A CHAPTER 22 VMX NON-ROOT OPERATION: 22.1 INSTRUCTIONS THAT CAUSE VM EXITS (22-1); 22.1.1 Relative Priority of Faults and VM Exits (22-1); 22.1.2 Instructio[...]

  • Page 23

    Vol. 3A xxiii CONTENTS 23.3.1.3 Checks on Guest Descriptor-Table Registers (23-15); 23.3.1.4 Checks on Guest RIP and RFLAGS (23-15); 23.3.1.5 Checks on Guest Non-Register State [...]

  • Page 24

    CONTENTS xxiv Vol. 3A 24.5.6 Clearing Address-Range Monitoring (24-37); 24.6 LOADING MSRS (24-38); 24.7 VMX ABORTS [...]

  • Page 25

    Vol. 3A xxv CONTENTS 26.11 SMBASE RELOCATION (26-19); 26.11.1 Relocating SMRAM to an Address Above 1 MByte (26-20); 26.12 I/O INSTRUCTION RESTART [...]

  • Page 26

    CONTENTS xxvi Vol. 3A 27.7.1 Handling VM Exits Due to Exceptions (27-11); 27.7.1.1 Reflecting Exceptions to Guest Software (27-11); 27.7.1.2 Resuming Guest Software after Handling an Exce[...]

  • Page 27

    Vol. 3A xxvii CONTENTS CHAPTER 29 HANDLING BOUNDARY CONDITIONS IN A VIRTUAL MACHINE MONITOR: 29.1 OVERVIEW (29-1); 29.2 INTERRUPT HANDLING IN VMX OPERATION [...]

  • Page 28

    CONTENTS xxviii Vol. 3A 30.5 PERFORMANCE MONITORING (PROCESSORS BASED ON INTEL® ATOM™ MICROARCHITECTURE) (30-25); 30.6 PERFORMANCE MONITORING FOR PROCESSORS BASED ON INTEL® MICROARCHITECTURE (NEHALEM) [...]

  • Page 29

    Vol. 3A xxix CONTENTS 30.10.3 Incrementing the Time-Stamp Counter (30-77); 30.10.4 Non-Halted Reference Clockticks (30-77); 30.10.5 Cycle Counting and Opportunistic Process[...]

  • Page 30

    CONTENTS xxx Vol. 3A B.3 MSRS IN THE INTEL® ATOM™ PROCESSOR FAMILY (B-58); B.4 MSRS IN THE INTEL® MICROARCHITECTURE (NEHALEM) (B-73); B.5 MSRS IN THE PENTIUM® 4 AND INTEL® XEON® PROCESSORS [...]

  • Page 31

    Vol. 3A xxxi CONTENTS E.4.3 Processor Model Specific Error Code Field (E-21); E.4.3.1 MCA Error Type A: L3 Error (E-21); E.4.3.2 Processor Model Specific Error Code F[...]

  • Page 32

    CONTENTS xxxii Vol. 3A H.4.2 Natural-Width Read-Only Data Fields (H-10); H.4.3 Natural-Width Guest-State Fields (H-10); H.4.4 Natural-Width Host-State Fields [...]

  • Page 33

    Vol. 3A xxxiii CONTENTS FIGURES: Figure 1-1. Bit and Byte Order (1-7); Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation (1-10); Figure 2-1. IA-32 System-Level[...]

  • Page 34

    CONTENTS xxxiv Vol. 3A Figure 6-2. IDT Gate Descriptors (6-15); Figure 6-3. Interrupt Procedure Call (6-16); Figure 6-4. Stack U[...]

  • Page 35

    Vol. 3A xxxv CONTENTS Figure 10-14. Error Status Register (ESR) in x2APIC Mode (10-36); Figure 10-15. Divide Configuration Register (10-37); Figure 10-16. Initial Count and Current Count R[...]

  • Page 36

    CONTENTS xxxvi Vol. 3A Figure 14-11. IA32_THERM_STATUS Register (14-19); Figure 14-12. IA32_THERM_INTERRUPT Register (14-21); Figure 15-1. Machine-Check MSRs [...]

  • Page 37

    Vol. 3A xxxvii CONTENTS Figure 29-1. Host External Interrupts and Guest Virtual Interrupts (29-5); Figure 30-1. Layout of IA32_PERFEVTSELx MSRs (30-4); Figure 30-2. Layout of IA32_FIXED_CTR_CTRL MSR [...]

  • Page 38

    CONTENTS xxxviii Vol. 3A TABLES: Table 2-1. Action Taken By x87 FPU Instructions for Different Combinations of EM, MP, and TS (2-21); Table 2-2. Summary of System Instructions (2-27); Table 3-1. Code- and Data-Segment Types [...]

  • Page 39

    Vol. 3A xxxix CONTENTS Table 8-2. Initial APIC IDs for the Logical Processors in a System that has Two Physical Processors Supporting Dual-Core and Intel Hyper-Threading Technology (8-53); Table 8-3. Example of Possible x2APIC ID Assignment in a System that has Two Physical Processors Supporting x2APIC and Intel Hyper-Threading Technology[...]

  • Page 40

    CONTENTS xl Vol. 3A Table 13-1. Action Taken for Combinations of OSFXSR, OSXMMEXCPT, SSE, SSE2, SSE3, EM, MP, and TS (13-4); Table 13-2. Action Taken for Combinations of OSFXSR, SSSE3, SSE4, EM, and TS (13-5); Table 13-3. XSAVE Header Format [...]

  • Page 41

    Vol. 3A xli CONTENTS Table 21-4. Format of Pending-Debug-Exceptions (21-8); Table 21-5. Definitions of Pin-Based VM-Execution Controls (21-11); Table 21-6. Definitions of Primary Processor-Based VM-Exe[...]

  • Page 42

    CONTENTS xlii Vol. 3A Table 30-1. UMask and Event Select Encodings for Pre-Defined Architectural Performance Events (30-13); Table 30-2. Core Specificity Encoding within a Non-Architectural Umask (30-15); Table 30-3. Agent Specificity Encoding within a Non-Architectural Umask [...]

  • Page 43

    Vol. 3A xliii CONTENTS Table A-15. List of Metrics Available for Replay Tagging (For Replay Event Only) (A-206); Table A-16. Event Mask Qualification for Logical Processors (A-208); Table A-17. Performance Monitoring Events on Intel® Pentium® M Processors (A-214); Table A-18. Perform[...]

  • Page 44

    CONTENTS xliv Vol. 3A Table F-2. Short Message (21 Cycles) (F-2); Table F-3. Non-Focused Lowest Priority Message (34 Cycles) (F-3); Table F-4. APIC Bus Status Cycles Interpretation[...]

  • Page 45

    Vol. 3 1-1 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (order number 253668) and the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 (order number 253669) are part of a set that describe[...]

  • Page 46

    1-2 Vol. 3 ABOUT THIS MANUAL • Dual-Core Intel® Xeon® processor LV • Intel® Core™2 Duo processor • Intel® Core™2 Quad processor Q6000 series • Intel® Xeon® processor 3000, 3200 series • Intel® Xeon® processor 5000 series • Intel® Xeon® processor 5100, 5300 series • Intel® Core™2 Extreme processo[...]

  • Page 47

    Vol. 3 1-3 ABOUT THIS MANUAL The Intel® Core™ i7 processor and the Intel® Core™ i5 processor are based on the Intel® microarchitecture (Nehalem) and support Intel 64 architecture. Processors based on the Next Generation Intel Processor, codenamed Westmere, support Intel 64 architecture. P6 family, Pentium® M, Intel® Core™ Solo, [...]

  • Page 48

    1-4 Vol. 3 ABOUT THIS MANUAL Chapter 6 — Interrupt and Exception Handling. Describes the basic interrupt mechanisms defined in the Intel 64 and IA-32 architectures, shows how interrupts and exceptions relate to protection, and describes how the architecture handles each exception type. Reference information for each exception is given at the[...]

  • Page 49

    Vol. 3 1-5 ABOUT THIS MANUAL Chapter 16 — Debugging, Branch Profiles and Time-Stamp Counter. Describes the debugging registers and other debug mechanism provided in Intel 64 or IA-32 processors. This chapter also describes the time-stamp counter. Chapter 17 — 8086 Emulation. Describes the real-address and virtual-8086 modes of the IA-32 arc[...]

  • Page 50

    1-6 Vol. 3 ABOUT THIS MANUAL Chapter 30 — Performance Monitoring. Describes the Intel 64 and IA-32 architectures’ facilities for monitoring performance. Appendix A — Performance-Monitoring Events. Lists architectural performance events. Non-architectural performance events (i.e. model-specific events) are listed for each generation of[...]

  • Page 51

    Vol. 3 1-7 ABOUT THIS MANUAL means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions. 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are marked as reserved, it is essential for compa[...]
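The byte-numbering convention described in this preview (bytes of a word numbered starting from the least significant byte) can be illustrated with a short sketch. It uses Python's standard `struct` module; the sample value 0x12345678 is an arbitrary assumption chosen for illustration and does not come from the manual.

```python
import struct

# Arbitrary 32-bit sample value (an assumption for illustration only).
value = 0x12345678

# "<I" packs an unsigned 32-bit integer in little-endian order,
# so the least significant byte is stored first, as byte 0.
little = struct.pack("<I", value)

byte0 = little[0]  # least significant byte of the value
byte3 = little[3]  # most significant byte of the value

print([hex(b) for b in little])
```

Reading the packed bytes back by index shows that the lowest-numbered byte carries the low-order part of the value.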

  • Page 52

    1-8 Vol. 3 ABOUT THIS MANUAL 1.3.3 Instruction Operands When instructions are represented symbolically, a subset of assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 where: • A label is an identifier which is followed by a colon. • A mnemonic is a reserved n[...]

  • Page 53

    Vol. 3 1-9 ABOUT THIS MANUAL For example, a program can keep its code (instructions) and stack in separate segments. Code addresses would always refer to the code space, and stack addresses would always refer to the stack space. The following notation is used to specify a byte address within a segment: Segment-register:Byte-address For example, th[...]

  • Page 54

    1-10 Vol. 3 ABOUT THIS MANUAL 1.3.7 Exceptions An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code [...]

  • Page 55

    Vol. 3 1-11 ABOUT THIS MANUAL This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception: #GP(0) 1[...]

  • Page 56

    1-12 Vol. 3 ABOUT THIS MANUAL • Intel® 64 Architecture Processor Topology Enumeration: http://softwarecommunity.intel.com/articles/eng/3887.htm • Intel® Trusted Execution Technology Measured Launched Environment Programming Guide, http://www.intel.com/technology/security/index.htm • Developing Multi-threaded Applications: A Platfor[...]

  • Page 57

    Vol. 3 2-1 CHAPTER 2 SYSTEM ARCHITECTURE OVERVIEW IA-32 architecture (beginning with the Intel386 processor family) provides extensive support for operating-system and system-development software. This support offers multiple modes of operation, which include: • Real mode, protected mode, virtual 8086 mode, and system management mode. Thes[...]

  • Page 58

    2-2 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW initiates the switch from real-address mode to protected mode. If IA-32e mode operation is desired, software also initiates a switch from protected mode to IA-32e mode. 2.1 OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE System-level architecture consists of a set of registers, data structures, and instruc[...]

  • Page 59

    Vol. 3 2-3 SYSTEM ARCHITECTURE OVERVIEW Figure 2-1. IA-32 System-Level Registers and Data Structures (figure labels: Local Descriptor Table (LDT); EFLAGS Register; Control Registers CR0, CR1, CR2, CR3, CR4; Global Descriptor Table (GDT); Interrupt Descriptor Table (IDT); IDTR; GDTR; Interrupt Gate; Trap Gate; LDT Desc.; TSS Desc.; Code; Stack [...])

  • Page 60

    2-4 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW Figure 2-2. System-Level Registers and Data Structures in IA-32e Mode (figure labels: Local Descriptor Table (LDT); CR0, CR1, CR2, CR3, CR4; Global Descriptor Table (GDT); Interrupt Descriptor Table (IDT); IDTR; GDTR; Interrupt Gate; Trap Gate; LDT Desc.; TSS Desc.; Code; Stack; Current TSS [...])

  • Page 61

    Vol. 3 2-5 SYSTEM ARCHITECTURE OVERVIEW 2.1.1 Global and Local Descriptor Tables When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment descriptors. Segment descriptors provide the [...]

  • Page 62

    2-6 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The architecture also defines a set of special descriptors called gates (call gates, interrupt gates, trap gates, and task gates). These provide protected gateways to system procedures and handlers that may operate at a different privilege level than application programs and most procedures. For example,[...]

  • Page 63

    Vol. 3 2-7 SYSTEM ARCHITECTURE OVERVIEW 2. Loads the task register with the segment selector for the new task. 3. Accesses the new TSS through a segment descriptor in the GDT. 4. Loads the state of the new task from the new TSS into the general-purpose registers, the segment registers, the LDTR, control register CR3 (base address of the paging-s[...]

  • Page 64

    2-8 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The IDTR register is expanded to hold a 64-bit base address. Task gates are not supported. 2.1.5 Memory Management System architecture supports either direct physical addressing of memory or virtual memory (through paging). When physical addressing is used, a linear address is treated as a physical address[...]

  • Page 65

    Vol. 3 2-9 SYSTEM ARCHITECTURE OVERVIEW 2.1.6 System Registers To assist in initializing the processor and controlling system operations, the system architecture provides system flags in the EFLAGS register and several system registers: • The system flags and IOPL field in the EFLAGS register control task and mode switching, interrupt handlin[...]

  • Page 66

    2-10 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW On systems that support IA-32e mode, the extended feature enable register (IA32_EFER) is available. This model-specific register controls activation of IA-32e mode and other IA-32e mode operations. In addition, there are several model-specific registers that govern IA-32e mode instructions: • IA32_Ker [...]

  • Page 67

    Vol. 3 2-11 SYSTEM ARCHITECTURE OVERVIEW running program or task. SMM-specific code may then be executed transparently. Upon returning from SMM, the processor is placed back into its state prior to the SMI. • Virtual-8086 mode — In protected mode, the processor supports a quasi-operating mode known as virtual-8086 mode. This mode allows t[...]

  • Page 68

    2-12 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The VM flag in the EFLAGS register determines whether the processor is operating in protected mode or virtual-8086 mode. Transitions between protected mode and virtual-8086 mode are generally carried out as part of a task switch or a return from an interrupt or exception handler. See also: Section 17.2.[...]

  • Page 69

    Vol. 3 2-13 SYSTEM ARCHITECTURE OVERVIEW IF Interrupt enable (bit 9) — Controls the response of the processor to maskable hardware interrupt requests (see also: Section 6.3.2, “Maskable Hardware Interrupts”). The flag is set to respond to maskable hardware interrupts; cleared to inhibit maskable hardware interrupts. The IF flag does not[...]

  • Page 70

    2-14 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW changing to the state of this flag can generate unexpected exceptions in application programs. See also: Section 7.4, “Task Linking.” RF Resume (bit 16) — Controls the processor’s response to instruction-breakpoint conditions. When set, this flag temporarily disables debug exceptions (#DB) [...]

  • Page 71

    Vol. 3 2-15 SYSTEM ARCHITECTURE OVERVIEW VIP Virtual interrupt pending (bit 20) — Set by software to indicate that an interrupt is pending; cleared to indicate that no interrupt is pending. This flag is used in conjunction with the VIF flag. The processor reads this flag but never modifies it. The processor only recognizes the VIP flag [...]
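
The flag descriptions above map directly onto bit tests against an EFLAGS image. The following is a minimal sketch rather than anything prescribed by the manual: the positions of IF (bit 9), RF (bit 16), and VIP (bit 20) are the ones quoted in the text, while the IOPL, VM, and VIF positions and the helper name `decode_system_flags` are assumptions of this example.

```python
# Bit positions of selected EFLAGS system flags.
# IF, RF and VIP are quoted in the text; IOPL, VM and VIF are the
# architectural positions, included here as an assumption of this sketch.
IF_BIT = 9
IOPL_SHIFT = 12  # two-bit field occupying bits 13:12
RF_BIT = 16
VM_BIT = 17
VIF_BIT = 19
VIP_BIT = 20


def decode_system_flags(eflags: int) -> dict:
    """Return selected system flags from a 32-bit EFLAGS image."""

    def bit(n: int) -> bool:
        return bool((eflags >> n) & 1)

    return {
        "IF": bit(IF_BIT),
        "IOPL": (eflags >> IOPL_SHIFT) & 0x3,
        "RF": bit(RF_BIT),
        "VM": bit(VM_BIT),
        "VIF": bit(VIF_BIT),
        "VIP": bit(VIP_BIT),
    }
```

A value read back with PUSHF/POP could be passed to this helper to see, for instance, whether maskable interrupts are currently enabled.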

  • Page 72

    2-16 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW 2.4.1 Global Descriptor Table Register (GDTR) The GDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and the 16-bit table limit for the GDT. The base address specifies the linear address of byte 0 of the GDT; the table limit specifies the number of bytes in the t[...]

  • Page 73

    Vol. 3 2-17 SYSTEM ARCHITECTURE OVERVIEW 2.4.3 IDTR Interrupt Descriptor Table Register The IDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit specifies the number of bytes in the table.[...]

  • Page 74

    2-18 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW • The MOV CRn instructions do not check that addresses written to CR2 and CR3 are within the linear-address or physical-address limitations of the implementation. • Register CR8 is available in 64-bit mode only. The control registers are summarized below, and each architecturally defined control[...]

  • Page 75

    Vol. 3 2-19 SYSTEM ARCHITECTURE OVERVIEW When loading a control register, reserved bits should always be set to the values previously read. The flags in control registers are: PG Paging (bit 31 of CR0) — Enables paging when set; disables paging when clear. When paging is disabled, all linear addresses are treated as physical addresses. The PG f[...]

  • Page 76

    2-20 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW See also: Section 11.5.3, “Preventing Caching,” and Section 11.5, “Cache Control.” NW Not Write-through (bit 29 of CR0) — When the NW and CD flags are clear, write-back (for Pentium 4, Intel Xeon, P6 family, and Pentium processors) or write-through (for Intel486 processors) is enabled for[...]

  • Page 77

    Vol. 3 2-21 SYSTEM ARCHITECTURE OVERVIEW delayed until an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed by the new task. The processor sets this flag on every task switch and tests it when executing x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instructions. • If the TS flag is set and the EM flag (bit 2 of CR0) is clear, a dev[...]

  • Page 78

    2-22 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW EM Emulation (bit 2 of CR0) — Indicates that the processor does not have an internal or external x87 FPU when set; indicates an x87 FPU is present when clear. This flag also affects the execution of MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instructions. When the EM flag is set, execution of an x87 [...]
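
As an illustration of the CR0 flag layout described on these pages, the sketch below decodes a CR0 image into named flags. The PG (bit 31), NW (bit 29), and EM (bit 2) positions are taken from the text; the CD (bit 30) and TS (bit 3) positions and the function name `decode_cr0` are assumptions of this example.

```python
# Selected CR0 flag positions. PG, NW and EM are quoted in the text;
# CD (bit 30) and TS (bit 3) are assumed architectural positions.
CR0_FLAGS = {
    "PG": 31,  # paging
    "CD": 30,  # cache disable
    "NW": 29,  # not write-through
    "TS": 3,   # task switched
    "EM": 2,   # emulation
}


def decode_cr0(cr0: int) -> dict:
    """Map each named flag to True/False for the given CR0 image."""
    return {name: bool((cr0 >> pos) & 1) for name, pos in CR0_FLAGS.items()}
```

For example, a CR0 image with both TS and EM clear means x87 FPU instructions are not intercepted through either of these two flags.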

  • Page 79

    Vol. 3 2-23 SYSTEM ARCHITECTURE OVERVIEW flag is set, caching of the page-directory is prevented; when the flag is clear, the page-directory can be cached. This flag affects only the processor’s internal caches (both L1 and L2, when present). The processor ignores this flag if paging is not used (the PG flag in regis[...]

  • Page 80

    2-24 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW when set; when clear, processor aliases references to registers DR4 and DR5 for compatibility with software written to run on earlier IA-32 processors. See also: Section 16.2.2, “Debug Registers DR4 and DR5.” PSE Page Size Extensions (bit 4 of CR4) — Enables 4-MByte pages with 32-bit paging when s[...]

  • Page 81

    Vol. 3 2-25 SYSTEM ARCHITECTURE OVERVIEW processor will generate an invalid opcode exception (#UD) if it attempts to execute any SSE/SSE2/SSE3and instruction, with the exception of PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT. The operating system or executive must explicitly set this flag. NOTE CPUID feature [...]

  • Page 82

    2-26 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW all interrupts are enabled. This field is available in 64-bit mode. A value of 15 means all interrupts will be disabled. 2.5.1 CPUID Qualification of Control Register Flags The VME, PVI, TSD, DE, PSE, PAE, MCE, PGE, PCE, OSFXSR, and OSXMMEXCPT flags in control register CR4 are model specific. All of t[...]

  • Page 83

    Vol. 3 2-27 SYSTEM ARCHITECTURE OVERVIEW state, SSE state, or a future processor extended state) is represented by a bit in XCR0. The OS can enable future processor extended states in a forward manner by specifying the appropriate bit mask value using the XSETBV instruction according to the results of the CPUID leaf 0DH. With the exception of bit 6[...]

  • Page 84

    2-28 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW
    SLDT     Store LDT Register                 No   No
    LGDT     Load GDT Register                  No   Yes
    SGDT     Store GDT Register                 No   No
    LTR      Load Task Register                 No   Yes
    STR      Store Task Register                No   No
    LIDT     Load IDT Register                  No   Yes
    SIDT     Store IDT Register                 No   No
    MOV CRn  Load and store control registers   No   Yes
    S[...]

  • Page 85

    Vol. 3 2-29 SYSTEM ARCHITECTURE OVERVIEW 2.7.1 Loading and Storing System Registers The GDTR, LDTR, IDTR, and TR registers each have a load and store instruction for loading data into and storing data from the register: • LGDT (Load GDTR Register) — Loads the GDT base address and limit from memory into the GDTR register. • SGDT (Sto[...]

  • Page 86

    2-30 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The LMSW (load machine status word) and SMSW (store machine status word) instructions operate on bits 0 through 15 of control register CR0. These instructions are provided for compatibility with the 16-bit Intel 286 processor. Programs written to run on 32-bit IA-32 processors should not use these instru[...]

  • Page 87

    Vol. 3 2-31 SYSTEM ARCHITECTURE OVERVIEW Instructions),” for a detailed explanation of the function and use of this instruction. 2.7.3 Loading and Storing Debug Registers Internal debugging facilities in the processor are controlled by a set of 8 debug registers (DR0-DR7). The MOV instruction allows setup data to be loaded to and stored [...]

  • Page 88

    2-32 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW introduced with the Pentium Pro processor). If any non-wake events are pending during shutdown, they will be handled after the wake event from shutdown is processed (for example, A20M# interrupts). The LOCK prefix invokes a locked (atomic) read-modify-write operation when modifying a memory operan[...]

  • Page 89

    Vol. 3 2-33 SYSTEM ARCHITECTURE OVERVIEW Fixed-function performance counters record only specific events that are defined in Chapter 20, “Introduction to Virtual-Machine Extensions”, and the width/number of fixed-function counters are enumerated by CPUID leaf 0AH. The time-stamp counter is a model-specific 64-bit counter that is reset to zero[...]

  • Page 90

    2-34 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW 2.7.7.1 Reading and Writing Model-Specific Registers in 64-Bit Mode RDMSR and WRMSR require an index to specify the address of an MSR. In 64-bit mode, the index is 32 bits; it is specified using ECX. 2.7.8 Enabling Processor Extended States The XSETBV instruction is required to enable OS support [...]

  • Page 91

    Vol. 3 3-1 CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT This chapter describes the Intel 64 and IA-32 architecture’s protected-mode memory management facilities, including the physical memory requirements, segmentation mechanism, and paging mechanism. See also: Chapter 5, “Protection” (for a description of the processor’s protection me[...]

  • Page 92

    3-2 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT segment, the segment type, and the location of the first byte of the segment in the linear address space (called the base address of the segment). The offset part of the logical address is added to the base address for the segment to locate a byte within the segment. The base address plus the offset t[...]

  • Page 93

    Vol. 3 3-3 PROTECTED-MODE MEMORY MANAGEMENT storage. When using paging, each segment is divided into pages (typically 4 KBytes each in size), which are stored either in physical memory or on the disk. The operating system or executive maintains a page directory and a set of page tables to keep track of the pages. When a program (or task) a[...]

  • Page 94

    3-4 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT FFFF_FFF0H. RAM (DRAM) is placed at the bottom of the address space because the initial base address for the DS data segment after reset initialization is 0. 3.2.2 Protected Flat Model The protected flat model is similar to the basic flat model, except the segment limits are set to include only the[...]

  • Page 95

    Vol. 3 3-5 PROTECTED-MODE MEMORY MANAGEMENT More complexity can be added to this protected flat model to provide more protection. For example, for the paging mechanism to provide isolation between user and supervisor code and data, four segments need to be defined: code and data segments at privilege level 3 for the user, and code and data [...]

  • Page 96

    3-6 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT Access checks can be used to protect not only against referencing an address outside the limit of a segment, but also against performing disallowed operations in certain segments. For example, since code segments are designated as read-only segments, hardware can be used to prevent writes into code seg[...]

  • Page 97

    Vol. 3 3-7 PROTECTED-MODE MEMORY MANAGEMENT In 64-bit mode, segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address space. The processor treats the segment base of CS, DS, ES, SS as zero, creating a linear address that is equal to the effective address. The FS and GS segments are exceptions. These segment [...]

  • Page 98

    3-8 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.3.1 Intel® 64 Processors and Physical Address Space On processors that support Intel 64 architecture (CPUID.80000001:EDX[29] = 1), the size of the physical address range is implementation-specific and indicated by CPUID.80000008H:EAX[bits 7-0]. For the format of information returned in EAX, see[...]
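
To make the CPUID.80000008H encoding concrete, here is a small sketch that extracts the physical-address width from an EAX value already obtained from CPUID. The function names are hypothetical, and the linear-address width in bits 15:8 is an assumption of this example that the passage above does not state; only the bits 7-0 field is taken from the text.

```python
def physical_address_bits(eax: int) -> int:
    """Physical-address width reported in CPUID.80000008H:EAX[7:0]."""
    return eax & 0xFF


def linear_address_bits(eax: int) -> int:
    """Linear-address width, assumed to be reported in EAX[15:8]."""
    return (eax >> 8) & 0xFF


def max_physical_address(eax: int) -> int:
    """Highest addressable physical byte for the reported width."""
    return (1 << physical_address_bits(eax)) - 1
```

An EAX value of 3028H, for example, would describe a 40-bit physical address space under these assumptions.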

  • Page 99

    Vol. 3 3-9 PROTECTED-MODE MEMORY MANAGEMENT If paging is not used, the processor maps the linear address directly to a physical address (that is, the linear address goes out on the processor’s address bus). If the linear address space is paged, a second level of address translation is used to translate the linear address into a physical add[...]

  • Page 100

    3-10 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT TI (table indicator) flag (Bit 2) — Specifies the descriptor table to use: clearing this flag selects the GDT; setting this flag selects the current LDT. Requested Privilege Level (RPL) (Bits 0 and 1) — Specifies the privilege level of the selector. The privilege level can range from 0 to [...]
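
The selector fields described here can be pulled apart with a few shifts and masks. This is a minimal sketch: the TI and RPL positions are the ones given in the text, the index is assumed to occupy the remaining bits 15:3, and the names `SegmentSelector` and `decode_selector` are hypothetical.

```python
from typing import NamedTuple


class SegmentSelector(NamedTuple):
    index: int  # descriptor index within the selected table
    ti: int     # 0 selects the GDT, 1 selects the current LDT
    rpl: int    # requested privilege level, 0 to 3


def decode_selector(selector: int) -> SegmentSelector:
    """Split a 16-bit segment selector into its three fields."""
    selector &= 0xFFFF
    return SegmentSelector(
        index=selector >> 3,     # bits 15:3 (assumed layout)
        ti=(selector >> 2) & 1,  # bit 2
        rpl=selector & 0x3,      # bits 1:0
    )
```

Since each descriptor is 8 bytes, the byte offset of the descriptor within its table would be `index * 8`.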

  • Page 101

    Vol. 3 3-11 PROTECTED-MODE MEMORY MANAGEMENT For a program to access a segment, the segment selector for the segment must have been loaded in one of the segment registers. So, although a system can define thousands of segments, only 6 can be available for immediate use. Other segments can be made available by loading their segment selectors [...]

  • Page 102

    3-12 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.4.4 Segment Loading Instructions in IA-32e Mode Because ES, DS, and SS segment registers are not used in 64-bit mode, their fields (base, limit, and attribute) in segment descriptor registers are ignored. Some forms of segment load instructions are also invalid (for example, LDS, POP ES). Addres[...]

  • Page 103

    Vol. 3 3-13 PROTECTED-MODE MEMORY MANAGEMENT 3.4.5 Segment Descriptors A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compilers, linkers, loaders, or the operating sys[...]

  • Page 104

    3-14 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT to the segment limit. Offsets greater than the segment limit generate general-protection exceptions (#GP). For expand-down segments, the segment limit has the reverse function; the offset can range from the segment limit to FFFFFFFFH or FFFFH, depending on the setting of the B flag. Offsets less tha[...]

  • Page 105

    Vol. 3 3-15 PROTECTED-MODE MEMORY MANAGEMENT store its own data, such as information regarding the whereabouts of the missing segment. D/B (default operation size/default stack pointer size and/or upper bound) flag Performs different functions depending on whether the segment descriptor is an executable code segment, an expand-down data segment,[...]

  • Page 106

    3-16 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT G (granularity) flag Determines the scaling of the segment limit field. When the granularity flag is clear, the segment limit is interpreted in byte units; when the flag is set, the segment limit is interpreted in 4-KByte units. (This flag does not affect the granularity of the base address; it [...]
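
The effect of the G flag on the 20-bit limit field can be sketched as a single scaling step. This example assumes an expand-up segment and assumes that, with G set, the low 12 bits of an offset are not checked (so the scaled limit has its low 12 bits set); the helper name `effective_limit` is hypothetical.

```python
def effective_limit(limit_field: int, g_flag: bool) -> int:
    """Largest valid offset in an expand-up segment.

    limit_field is the 20-bit limit stored in the descriptor.
    With G clear the limit counts bytes; with G set it counts
    4-KByte units, and the low 12 offset bits are assumed unchecked.
    """
    limit_field &= 0xFFFFF
    if g_flag:
        return (limit_field << 12) | 0xFFF
    return limit_field
```

Under these assumptions a limit field of FFFFFH spans 1 MByte with G clear and the full 4 GBytes with G set.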

  • Page 107

    Vol. 3 3-17 PROTECTED-MODE MEMORY MANAGEMENT Stack segments are data segments which must be read/write segments. Loading the SS register with a segment selector for a nonwritable data segment generates a general-protection exception (#GP). If the size of a stack segment needs to be changed dynamically, the stack segment can be an expand-down d[...]

  • Page 108

    3-18 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT For code segments, the three low-order bits of the type field are interpreted as accessed (A), read enable (R), and conforming (C). Code segments can be execute-only or execute/read, depending on the setting of the read-enable bit. An execute/read segment might be used when constants or other static [...]
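
As a rough illustration of how the code-segment type bits combine, the sketch below decodes a 4-bit type field. It assumes that the high-order bit of the field (bit 3) distinguishes code from data segments and that A, R, and C occupy bits 0, 1, and 2 respectively; the function name `decode_code_segment_type` is hypothetical.

```python
def decode_code_segment_type(type_field: int) -> dict:
    """Decode the 4-bit type field of a code-segment descriptor."""
    type_field &= 0xF
    if not type_field & 0x8:
        # Bit 3 clear is assumed to denote a data segment.
        raise ValueError("type field does not describe a code segment")
    return {
        "accessed": bool(type_field & 0x1),    # A
        "readable": bool(type_field & 0x2),    # R: execute/read when set
        "conforming": bool(type_field & 0x4),  # C
    }
```

A type value of 0xA would therefore be a nonconforming execute/read segment that has not yet been accessed.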

  • Page 109

    Vol. 3 3-19 PROTECTED-MODE MEMORY MANAGEMENT • Task-state segment (TSS) descriptor. • Call-gate descriptor. • Interrupt-gate descriptor. • Trap-gate descriptor. • Task-gate descriptor. These descriptor types fall into two categories: system-segment descriptors and gate descriptors. System-segment descriptors point to system[...]

  • Page 110

    3-20 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT See also: Section 3.5.1, “Segment Descriptor Tables”, and Section 7.2.2, “TSS Descriptor” (for more information on the system-segment descriptors); see Section 5.8.3, “Call Gates”, Section 6.11, “IDT Descriptors”, and Section 7.2.5, “Task-Gate Descriptor” (for more informatio[...]

  • Page 111

    Vol. 3 3-21 PROTECTED-MODE MEMORY MANAGEMENT Each system must have one GDT defined, which may be used for all programs and tasks in the system. Optionally, one or more LDTs can be defined. For example, an LDT can be defined for each separate task being run, or some or all tasks can share the same LDT. The GDT is not a segment itself; instead[...]

  • Page 112

    3-22 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.5.2 Segment Descriptor Tables in IA-32e Mode In IA-32e mode, a segment descriptor table can contain up to 8192 (2^13) 8-byte descriptors. An entry in the segment descriptor table can be 8 bytes. System descriptors are expanded to 16 bytes (occupying the space of two entries). GDTR and LDTR r[...]

  • Page 113

    Vol. 3 4-1 CHAPTER 4 PAGING Chapter 3 explains how segmentation converts logical addresses to linear addresses. Paging (or linear-address translation) is the process of translating linear addresses so that they can be used to access memory or I/O devices. Paging translates each linear address to a physical address and determines, for each tra[...]

  • Page 114

    4-2 Vol. 3 PAGING paging modes. Section 4.1.3 discusses how CR0.WP, CR4.PSE, CR4.PGE, and IA32_EFER.NXE modify the operation of the different paging modes. 4.1.1 Three Paging Modes If CR0.PG = 0, paging is not used. The logical processor treats all linear addresses as if they were physical addresses. CR4.PAE and IA32_EFER.LME are ignored by th[...]

  • Page 115

    Vol. 3 4-3 PAGING linear addresses larger than 32 bits, 32-bit paging and PAE paging translate 32-bit linear addresses. Because it is used only if IA32_EFER.LME = 1, IA-32e paging is used only in IA-32e mode. (In fact, it is the use of IA-32e paging that defines IA-32e mode.) IA-32e mode has two sub-modes: • Compatibility mode. This mode use[...]
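
The way the three control bits select a paging mode can be summarized in a few lines. This is a sketch under the conditions stated in this chapter (CR0.PG, CR4.PAE, and IA32_EFER.LME); the function name `paging_mode` and the returned strings are hypothetical labels.

```python
def paging_mode(cr0_pg: bool, cr4_pae: bool, efer_lme: bool) -> str:
    """Name the paging mode selected by CR0.PG, CR4.PAE and IA32_EFER.LME."""
    if not cr0_pg:
        # PAE and LME have no effect while paging is disabled.
        return "none"
    if not cr4_pae:
        # Assumption: PG = 1 with PAE = 0 and LME = 1 cannot be entered,
        # so this branch is treated as 32-bit paging.
        return "32-bit"
    return "IA-32e" if efer_lme else "PAE"
```

The helper only classifies a register state; it does not model the restrictions on transitions between modes.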

  • Page 116

    4-4 Vol. 3 PAGING enable these modes and make transitions between them. The following items identify certain limitations and other details: • IA32_EFER.LME cannot be modified while paging is enabled (CR0.PG = 1). Attempts to do so using WRMSR cause a general-protection exception (#GP(0)). • Paging cannot be enabled (by setting CR0.PG to 1) [...]

  • Page 117

    Vol. 3 4-5 PAGING • Software can always disable paging by clearing CR0.PG with MOV to CR0. • Software can make transitions between 32-bit paging and PAE paging by changing the value of CR4.PAE with MOV to CR4. • Software cannot make transitions directly between IA-32e paging and either of the other two paging modes. It must first disabl[...]

  • Page 118

    4-6 Vol. 3 PAGING 4.1.4 Enumeration of Paging Features by CPUID Software can discover support for different paging features using the CPUID instruction: • PSE: page-size extensions for 32-bit paging. If CPUID.01H:EDX.PSE [bit 3] = 1, CR4.PSE may be set to 1, enabling support for 4-MByte pages with 32-bit paging (see Section 4.3). • PAE: [...]

  • Page 119

    Vol. 3 4-7 PAGING 4.2 HIERARCHICAL PAGING STRUCTURES: AN OVERVIEW All three paging modes translate linear addresses using hierarchical paging structures. This section provides an overview of their operation. Section 4.3, Section 4.4, and Section 4.5 provide details for the three paging modes. Every paging structure is 4096 Bytes in size a[...]

  • Page 120

    4-8 Vol. 3 PAGING and bits 20:12 identify a fourth. Again, the last identifies the page frame. (See Figure 4-8 for an illustration.) The translation process in each of the examples above completes by identifying a page frame. However, the paging structures may be configured so that translation terminates before doing so. This occurs if proce[...]

  • Page 121

    Vol. 3 4-9 PAGING corresponds to 1 TByte, linear addresses are limited to 32 bits; at most 4 GBytes of linear-address space may be accessed at any given time. 32-bit paging uses a hierarchy of paging structures to produce a translation for a linear address. CR3 is used to locate the first paging-structure, the page directory. Table 4-3 illustr [...]

  • Page 122

    4-10 Vol. 3 PAGING 32-bit paging may map linear addresses to either 4-KByte pages or 4-MByte pages. Figure 4-2 illustrates the translation process when it uses a 4-KByte page; Figure 4-3 covers the case of a 4-MByte page. The following items describe the 32-bit paging process in more detail as well as how the page size is determined: • A 4-KBy[...]

  • Page 123

    Vol. 3 4-11 PAGING Because a PDE is identified using bits 31:22 of the linear address, it controls access to a 4-MByte region of the linear-address space. Use of the PDE depends on CR4.PSE and the PDE’s PS flag (bit 7): • If CR4.PSE = 1 and the PDE’s PS flag is 1, the PDE maps a 4-MByte page (see Table 4-4). The final physical address is c[...]

  • Page 124

    4-12 Vol. 3 PAGING — Bits 31:12 are from the PTE.
    Table 4-4. Format of a 32-Bit Page-Directory Entry that Maps a 4-MByte Page
    Bit Position(s)   Contents
    0 (P)             Present; must be 1 to map a 4-MByte page
    1 (R/W)           Read/write; if 0, writes may not be allowed to the 4-MByte page referenced by this entry (depends on CPL and CR0.WP; see Sect[...]

  • Page 125

    Vol. 3 4-13 PAGING — Bits 11:0 are from the original linear address. If a paging-structure entry’s P flag (bit 0) is 0 or if the entry sets any reserved bit, the entry is used neither to reference another paging-structure entry nor to map a page. A reference using a linear address whose translation would use such a paging-structure entry [...]
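
The 32-bit paging walk for a 4-KByte page can be sketched as index arithmetic on the linear address. The split into bits 31:22, 21:12, and 11:0 follows the text; the assumption that entries are 4 bytes wide, so that the PDE address takes bits 31:12 from CR3 and bits 11:2 from the linear address, and the function names are specific to this example.

```python
def split_linear_32(linear: int) -> tuple:
    """Return (pde_index, pte_index, page_offset) for a 32-bit linear address."""
    linear &= 0xFFFFFFFF
    pde_index = (linear >> 22) & 0x3FF  # bits 31:22 select the PDE
    pte_index = (linear >> 12) & 0x3FF  # bits 21:12 select the PTE
    offset = linear & 0xFFF             # bits 11:0 pass through unchanged
    return pde_index, pte_index, offset


def pde_address_32(cr3: int, linear: int) -> int:
    """Physical address of the PDE, assuming 4-byte entries."""
    pde_index, _, _ = split_linear_32(linear)
    return (cr3 & 0xFFFFF000) | (pde_index << 2)


def final_address_4k(pte: int, linear: int) -> int:
    """Physical address for a 4-KByte page: PTE bits 31:12 plus the offset."""
    return (pte & 0xFFFFF000) | (linear & 0xFFF)
```

The sketch ignores the P flag and reserved-bit checks, which in the real walk would stop translation and raise a page fault.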

  • Page 126

    4-14 Vol. 3 PAGING — If the P flag of a PTE is 1, bit 7 is reserved. — If the P flag and the PS flag of a PDE are both 1, bit 12 is reserved. (If CR4.PSE = 0, no bits are reserved with 32-bit paging.) A reference using a linear address that is successfully translated to a physical address is performed only if allowed by the access rights of th[...]

  • Page 127

    Vol. 3 4-15 PAGING those that do neither because they are “not present”; bit 0 (P) and bit 7 (PS) are highlighted because they determine how such an entry is used. 4.4 PAE PAGING A logical processor uses PAE paging if CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 0. PAE paging translates 32-bit linear addresses to 52-bit physical address[...]

  • Page 128

    4-16 Vol. 3 PAGING ters. (This is different from the other paging modes, in which there is one hierarchy referenced by CR3.) Section 4.4.1 discusses the PDPTE registers. Section 4.4.2 describes linear-address translation with PAE paging. 4.4.1 PDPTE Registers When PAE paging is used, CR3 references the base of a 32-Byte page-directory-poi[...]

  • Page 129

    Vol. 3 4-17 PAGING Table 4-8 gives the format of a PDPTE. If any of the PDPTEs sets both the P flag (bit 0) and any reserved bit, the MOV to CR instruction causes a general-protection exception (#GP(0)) and the PDPTEs are not loaded. 1 As shown in Table 4-8, bits 2:1, 8:5, and 63:MAXPHYADDR are reserved in the PDPTEs. 4.4.2 Line[...]

  • Page 130

    4-18 Vol. 3 PAGING processor ignores bits 63:1, and there is no mapping for the 1-GByte region controlled by PDPTEi. A reference using a linear address in this region causes a page-fault exception (see Section 4.7). • If the P flag of PDPTEi is 1, a 4-KByte naturally aligned page directory is located at the physical address specified in bits 5[...]

  • Page 131

    Vol. 3 4-19 PAGING 4.4.1) A page directory comprises 512 64-bit entries (PDEs). A PDE is selected using the physical address defined as follows: — Bits 51:12 are from PDPTEi. — Bits 11:3 are bits 29:21 of the linear address. — Bits 2:0 are 0. Because a PDE is identified using bits 31:21 of the linear address, it controls access to a 2-[...]
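
The PDE selection rule for PAE paging translates into the following sketch. The bit ranges are the ones listed above; the use of bits 31:30 of the linear address to pick PDPTEi, and the function names, are assumptions of this example.

```python
ADDR_51_12 = ((1 << 52) - 1) & ~0xFFF  # mask for bits 51:12


def pdpte_index(linear: int) -> int:
    """Index i of the PDPTE register, assumed to be linear-address bits 31:30."""
    return (linear >> 30) & 0x3


def pae_pde_address(pdpte: int, linear: int) -> int:
    """Physical address of the PDE selected under PAE paging."""
    base = pdpte & ADDR_51_12            # bits 51:12 come from PDPTEi
    index = (linear >> 21) & 0x1FF       # bits 29:21 of the linear address
    return base | (index << 3)           # bits 2:0 are 0 (8-byte entries)
```

Each PDE found this way governs a 2-MByte slice of the 1-GByte region owned by its PDPTE.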

  • Page 132

    4-20 Vol. 3 PAGING
    Table 4-9. Format of a PAE Page-Directory Entry that Maps a 2-MByte Page
    Bit Position(s)   Contents
    0 (P)             Present; must be 1 to map a 2-MByte page
    1 (R/W)           Read/write; if 0, writes may not be allowed to the 2-MByte page referenced by this entry (depends on CPL and CR0.WP; see Section 4.6)
    2 (U/S)           User/supervisor; [...]

  • Page 133

    Vol. 3 4-21 PAGING A reference using a linear address that is successfully translated to a physical address is performed only if allowed by the access rights of the translation; see Section 4.6. Figure 4-7 gives a summary of the formats of CR3 and the paging-structure entries with PAE paging. For the paging structure entries, it identifies s[...]

  • Page 134

    4-22 Vol. 3 PAGING
    Table 4-11. Format of a PAE Page-Table Entry that Maps a 4-KByte Page
    Bit Position(s)   Contents
    0 (P)             Present; must be 1 to map a 4-KByte page
    1 (R/W)           Read/write; if 0, writes may not be allowed to the 4-KByte page referenced by this entry (depends on CPL and CR0.WP; see Section 4.6)
    2 (U/S)           User/supervisor; if 0,[...]

  • Page 135

    Vol. 3 4-23 PAGING that do neither because they are “not present”; bit 0 (P) and bit 7 (PS) are highlighted because they determine how a paging-structure entry is used. 4.5 IA-32E PAGING A logical processor uses IA-32e paging if CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 1. With IA-32e paging, linear addresses are translated using a hier[...]

  • Page 136

    4-24 Vol. 3 PAGING bits corresponds to 4 PBytes, linear addresses are limited to 48 bits; at most 256 TBytes of linear-address space may be accessed at any given time. IA-32e paging uses a hierarchy of paging structures to produce a translation for a linear address. CR3 is used to locate the first paging-structure, the PML4 table. Table 4-12 [...]

  • Page 137

    Vol. 3 4-25 PAGING • A 4-KByte naturally aligned page-directory-pointer table is located at the physical address specified in bits 51:12 of the PML4E (see Table 4-13). A page-directory-pointer table comprises 512 64-bit entries (PDPTEs). A PDPTE is selected using the physical address defined as follows: — Bits 51:12 are from the PML4E. ?[...]

  • Page 138

    4-26 Vol. 3 PAGING Because a PDE is identified using bits 47:21 of the linear address, it controls access to a 2-MByte region of the linear-address space. Use of the PDE depends on its PS flag (bit 7): Figure 4-9. Linear-Address Translation to a 2-MByte Page using IA-32e Paging [...]

  • Page 139

    Vol. 3 4-27 PAGING
    Table 4-13. Format of an IA-32e PML4 Entry (PML4E) that References a Page-Directory-Pointer Table
    Bit Position(s)   Contents
    0 (P)             Present; must be 1 to reference a page-directory-pointer table
    1 (R/W)           Read/write; if 0, writes may not be allowed to the 512-GByte region controlled by this en[...]

  • Page 140

    4-28 Vol. 3 PAGING • If the PDE’s PS flag is 1, the PDE maps a 2-MByte page (see Table 4-15). The final physical address is computed as follows:
    Table 4-14. Format of an IA-32e Page-Directory-Pointer-Table Entry (PDPTE) that References a Page Directory
    Bit Position(s)   Contents
    0 (P)             Present; must be 1 to reference a page dir[...]

  • Page 141

    Vol. 3 4-29 PAGING — Bits 51:21 are from the PDE. — Bits 20:0 are from the original linear address. • If the PDE’s PS flag is 0, a 4-KByte naturally aligned page table is located at the physical address specified in bits 51:12 of the PDE (see Table 4-16). A page table 2 (U/S) User/supervisor; if 0, accesses with CPL=3 are not allowe [...]
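
For the 2-MByte case, the final physical address is simply a merge of two bit ranges. The sketch below follows the two ranges stated above (bits 51:21 from the PDE, bits 20:0 from the linear address); the function name and the sample values in the usage note are hypothetical.

```python
PDE_ADDR_51_21 = ((1 << 52) - 1) & ~((1 << 21) - 1)  # mask for bits 51:21
OFFSET_20_0 = (1 << 21) - 1                          # mask for bits 20:0


def final_address_2m(pde: int, linear: int) -> int:
    """Physical address for a 2-MByte page mapped by an IA-32e PDE."""
    return (pde & PDE_ADDR_51_21) | (linear & OFFSET_20_0)
```

Flag bits in the PDE, including bit 63, fall outside the mask and so do not leak into the computed address.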

  • Page 142

    4-30 Vol. 3 PAGING comprises 512 64-bit entries (PTEs). A PTE is selected using the physical address defined as follows: — Bits 51:12 are from the PDE. — Bits 11:3 are bits 20:12 of the linear address. — Bits 2:0 are all 0. • Because a PTE is identified using bits 47:12 of the linear address, every PTE maps a 4-KByte page (see Table 4-1[...]

  • Page 143

    Vol. 3 4-31 PAGING — Bits 11:0 are from the original linear address. If a paging-structure entry’s P flag (bit 0) is 0 or if the entry sets any reserved bit, the entry is used neither to reference another paging-structure entry nor to map a page. A reference using a linear address whose translation would use such a paging-structure entry [...]
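
The IA-32e walk to a 4-KByte page uses four 9-bit indexes and a 12-bit offset. The following sketch assumes the index ranges 47:39, 38:30, 29:21, and 20:12 (the last two appear in the text above; the first two are assumed by analogy), and the function names are hypothetical.

```python
ENTRY_ADDR_51_12 = ((1 << 52) - 1) & ~0xFFF  # mask for bits 51:12


def split_linear_ia32e(linear: int) -> tuple:
    """Return (pml4_i, pdpt_i, pd_i, pt_i, offset) for a 48-bit linear address."""
    return (
        (linear >> 39) & 0x1FF,  # bits 47:39 select the PML4E
        (linear >> 30) & 0x1FF,  # bits 38:30 select the PDPTE
        (linear >> 21) & 0x1FF,  # bits 29:21 select the PDE
        (linear >> 12) & 0x1FF,  # bits 20:12 select the PTE
        linear & 0xFFF,          # bits 11:0 are the page offset
    )


def pte_address(pde: int, linear: int) -> int:
    """Physical address of the PTE: PDE bits 51:12, then index << 3."""
    pt_index = (linear >> 12) & 0x1FF
    return (pde & ENTRY_ADDR_51_12) | (pt_index << 3)
```

The same `base | (index << 3)` pattern applies at every level, since each table holds 512 eight-byte entries.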

  • Page 144

    4-32 Vol. 3 PAGING • If the P flag of a PML4E or a PDPTE is 1, the PS flag is reserved. • If the P flag and the PS flag of a PDE are both 1, bits 20:13 are reserved. • If IA32_EFER.NXE = 0 and the P flag of a paging-structure entry is 1, the XD flag (bit 63) is reserved. A reference using a linear address that is successfully translated t[...]

  • Page 145

    Vol. 3 4-33 PAGING — Data reads. Data may be read from any linear address with a valid translation for which the U/S flag (bit 2) is 1 in every paging-structure entry controlling the translation. — Data writes. Data may be written to any linear address with a valid translation for which [...]

  • Page 146

    4-34 Vol. 3 PAGING both the R/W flag and the U/S flag are 1 in every paging-structure entry controlling the translation. — Instruction fetches. • For 32-bit paging or if IA32_EFER.NXE = 0, instructions may be fetched from any linear address with a valid translation for which the U/S flag is 1 in every paging-structure entry controlling the t[...]
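
The data-access rules quoted above amount to a logical AND of the relevant flag across every entry used by the translation. The sketch below treats them as the rules for user-mode accesses, which is an assumption based on the U/S requirement; it also assumes every entry passed in is present, and the function name is hypothetical.

```python
RW_FLAG = 1 << 1  # R/W, bit 1
US_FLAG = 1 << 2  # U/S, bit 2


def user_data_access_allowed(entries: list, is_write: bool) -> bool:
    """Check a user-mode data access against all controlling entries.

    entries holds the raw paging-structure entries (for example PML4E,
    PDPTE, PDE, PTE) used by the translation; all are assumed present.
    """
    all_user = all(e & US_FLAG for e in entries)
    all_writable = all(e & RW_FLAG for e in entries)
    if is_write:
        return all_user and all_writable
    return all_user
```

A single read-only entry anywhere in the chain is therefore enough to block user-mode writes to the whole region it controls.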

  • Page 147

    Vol. 3 4-35 PAGING is 1; it is 0 if a user-mode (CPL = 3) access did so. This flag describes the access causing the page-fault exception, not the access rights specified by paging. • RSVD flag (bit 3). This flag is 1 if there is no valid translation for the linear address because a reserved bit was set in one of the paging-structure entries [...]

  • Page 148

    4-36 Vol. 3 PAGING Page-fault exceptions occur only due to an attempt to use a linear address. Failures to load the PDPTE registers with PAE paging (see Section 4.4.1) cause general-protection exceptions (#GP(0)) and not page-fault exceptions. 4.8 ACCESSED AND DIRTY FLAGS For any paging-structure entry that is used during linear-address transla[...]

  • Page 149

    Vol. 3 4-37 PAGING 4.9 PAGING AND MEMORY TYPING The memory type of a memory access refers to the type of caching used for that access. Chapter 11, “Memory Cache Control” provides many details regarding memory typing in the Intel-64 and IA-32 architectures. This section describes how paging contributes to the determination of memory typin[...]

  • Page 150

    4-38 Vol. 3 PAGING The PAT is a 64-bit MSR (IA32_PAT; MSR index 277H) comprising eight (8) 8-bit entries (entry i comprises bits 8i+7:8i of the MSR). For any access to a physical address, the table combines the memory type specified for that physical address by the MTRRs with a memory type selected from the PAT. Table 1[...]
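
The entry layout of the PAT (entry i in bits 8i+7:8i) can be expressed as a shift and mask. This is a minimal sketch; the function name is hypothetical, and the 64-bit value used in the usage note is only an example input, not a value taken from this section.

```python
def pat_entry(pat_msr: int, i: int) -> int:
    """Return the 8-bit PAT entry i, that is, bits 8i+7:8i of the MSR."""
    if not 0 <= i <= 7:
        raise ValueError("the PAT has eight entries, numbered 0 to 7")
    return (pat_msr >> (8 * i)) & 0xFF


def pat_entries(pat_msr: int) -> list:
    """Return all eight PAT entries, entry 0 first."""
    return [pat_entry(pat_msr, i) for i in range(8)]
```

For an example MSR image of 0007040600070406H, `pat_entries` yields the byte sequence 06H, 04H, 07H, 00H repeated twice.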

  • Page 151

    Vol. 3 4-39 PAGING tively. Section 4.10.3 explains how software can remove inconsistent cached information by invalidating portions of the TLBs and paging-structure caches. Section 4.10.4 describes special considerations for multiprocessor systems. 4.10.1 Translation Lookaside Buffers (TLBs) A processor may cache information about the tra[...]

  • Page 152

    4-40 Vol. 3 PAGING 4.10.1.2 Caching Translations in TLBs The processor may accelerate the paging process by caching individual translations in translation lookaside buffers (TLBs). Each entry in a TLB is an individual translation. Each translation is referenced by a page number. It contains the following information from the paging-struct[...]

  • Page 153

    Vol. 3 4-41 PAGING entries in memory. See Section 4.10.3.2 for how software can ensure that the processor uses the modified paging-structure entries. If the paging structures specify a translation using a page larger than 4 KBytes, some processors may choose to cache multiple smaller-page TLB entries for that translation. Each such TLB entry[...]

  • Page 154

    4-42 Vol. 3 PAGING — The value of the R/W flag of the PML4E. — The value of the U/S flag of the PML4E. — The value of the XD flag of the PML4E. — The values of the PCD and PWT flags of the PML4E. The following items detail how a processor may use the PML4 cache: — If the processor has a PML4-cache entry for a linear address, it may use [...]

  • Page 155

    Vol. 3 4-43 PAGING — The processor may create a PDPTE-cache entry even if there are no translations for any linear address that might use that entry. — If the processor creates a PDPTE-cache entry, the processor may retain it unmodified even if software subsequently modifies the corresponding PML4E or PDPTE in memory. • PDE cache. T[...]

  • Page 156

    4-44 Vol. 3 PAGING For example, if the R/W flag is 0 in a PML4E, then the R/W flag will be 0 in any PDPTE-cache entry for a PDPTE from the page-directory-pointer table referenced by that PML4E. This is because the R/W flag of each such PDPTE-cache entry is the logical-AND of the R/W flags in the appropriate PML4E and PDPTE. The paging-structur[...]

  • Page 157

    Vol. 3 4-45 PAGING (Any of the above steps would be skipped if the processor does not support the cache in question.) If the processor does not find a TLB or paging-structure-cache entry for the linear address, it uses the linear address to traverse the entire paging-structure hierarchy, as described in Section 4.3, Section 4.4.2, and Secti[...]

  • Page 158

    4-46 Vol. 3 PAGING 4.10.3 Invalidation of TLBs and Paging-Structure Caches As noted in Section 4.10.1 and Section 4.10.2, the processor may create entries in the TLBs and the paging-structure caches when linear addresses are translated, and it may retain these entries even after the paging structures used to create them have been modified. To[...]

  • Page 159

    Vol. 3 4-47 PAGING In addition to the instructions identified above, page faults invalidate entries in the TLBs and paging-structure caches. In particular, a page-fault exception resulting from an attempt to use a linear address will invalidate any PML4-cache, PDPTE-cache, and PDE-cache entries that would be used for that linear address as w[...]

  • Page 160

    4-48 Vol. 3 PAGING • If software using PAE paging modifies a PDPTE, it should reload CR3 with the register’s current value to ensure that the modified PDPTE is loaded into the corresponding PDPTE register (see Section 4.4.1). • If the nature of the paging structures is such that a single entry may be used for multiple purposes (see Sectio[...]

  • Page 161

    Vol. 3 4-49 PAGING in response to an attempted user-mode access) but no other adverse behavior. Such an exception will occur at most once for each affected linear address (see Section 4.10.3.1). • If a paging-structure entry is modified to change the XD flag from 1 to 0, failure to perform an invalidation may result in a “spurious” page-f[...]

  • Page 162

    4-50 Vol. 3 PAGING TLB shootdown algorithm for processors supporting the Intel 64 and IA-32 architectures: 1. Begin barrier: Stop all but one logical processor; that is, cause all but one to execute the HLT instruction or to enter a spin loop. 2. Allow the active logical processor to change the necessary paging-structure entries. 3. Allow all [...]

  • Page 163

    Vol. 3 4-51 PAGING 4.11 INTERACTIONS WITH VIRTUAL-MACHINE EXTENSIONS (VMX) The architecture for virtual-machine extensions (VMX) includes features that interact with paging. Section 4.11.1 discusses ways in which VMX-specific control transfers, called VMX transitions, specially affect paging. Section 4.11.2 gives an overview of VMX features sp[...]

  • Page 164

    4-52 Vol. 3 PAGING concurrently information for multiple address spaces in its TLBs and paging-structure caches. See Section 25.1 for details. When EPT is in use, the addresses in the paging structures are not used as physical addresses to access memory and memory-mapped I/O. Instead, they are treated as guest-physical addresses and are translate[...]

  • Page 165

    Vol. 3 4-53 PAGING segments can be mapped to pages in several ways. To implement a flat (unsegmented) addressing environment, for example, all the code, data, and stack modules can be mapped to one or more large segments (up to 4-GBytes) that share the same range of linear addresses (see Figure 3-2 in Section 3.2.2). Here, segments are essential[...]

  • Page 166

    4-54 Vol. 3 PAGING[...]

  • Page 167

    Vol. 3 5-1 CHAPTER 5 PROTECTION In protected mode, the Intel 64 and IA-32 architectures provide a protection mechanism that operates at both the segment level and the page level. This protection mechanism provides the ability to limit access to certain segments or pages based on privilege levels (four privilege levels for segments and two p[...]

  • Page 168

    5-2 Vol. 3 PROTECTION there is no control bit for turning the protection mechanism on or off. The part of the segment-protection mechanism that is based on privilege levels can essentially be disabled while still in protected mode by assigning a privilege level of 0 (most privileged) to all segment selectors and segment de[...]

  • Page 169

    Vol. 3 5-3 PROTECTION procedure. The term current privilege level (CPL) refers to the setting of this field. • User/supervisor (U/S) flag — (Bit 2 of paging-structure entries.) Determines the type of page: user or supervisor. • Read/write (R/W) flag — (Bit 1 of paging-structure entries.) Determines the type of access allowed to a page:[...]

  • Page 170

    5-4 Vol. 3 PROTECTION Many different styles of protection schemes can be implemented with these fields and flags. When the operating system creates a descriptor, it places values in these fields and flags in keeping with the particular protection style chosen for an operating system or executive. Application programs do not generally access or[...]

  • Page 171

    Vol. 3 5-5 PROTECTION The following sections describe how the processor uses these fields and flags to perform the various categories of checks described in the introduction to this chapter. 5.2.1 Code Segment Descriptor in 64-bit Mode Code segments continue to exist in 64-bit mode even though, for address calculations, the segment base is[...]

  • Page 172

    5-6 Vol. 3 PROTECTION 5.3 LIMIT CHECKING The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment. The effective value of the limit depends on the setting of the G (granularity) flag (see Figure 5-1). For data segments, the limit also depends on the E (expansion direction) fla[...]

  • Page 173

    Vol. 3 5-7 PROTECTION • A doubleword at an offset greater than the (effective-limit – 3) • A quadword at an offset greater than the (effective-limit – 7) For expand-down data segments, the segment limit has the same function but is interpreted differently. Here, the effective limit specifies the last address that is not allowed to be acc[...]
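The doubleword and quadword bullets above follow one pattern: an access of `size` bytes violates the limit when its offset is greater than the effective limit minus (`size` - 1). The sketch below captures that pattern for expand-up segments only; the function name is hypothetical, and expand-down segments, which interpret the limit differently, are deliberately not modeled.

```python
def violates_limit(offset, size, effective_limit):
    """True if a `size`-byte access at `offset` reaches past the effective
    limit of an expand-up segment (limit = last valid byte offset)."""
    return offset > effective_limit - (size - 1)

# With an effective limit of 0xFFFF, a doubleword may start at 0xFFFC at the latest.
assert not violates_limit(0xFFFC, 4, 0xFFFF)
assert violates_limit(0xFFFD, 4, 0xFFFF)
```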

  • Page 174

    5-8 Vol. 3 PROTECTION The processor examines type information at various times while operating on segment selectors and segment descriptors. The following list gives examples of typical operations where type checking is performed (this list is not exhaustive): • When a segment selector is loaded into a segment register — Certain segment re[...]

  • Page 175

    Vol. 3 5-9 PROTECTION instruction. If the descriptor type is for a code segment or call gate, a call or jump to another code segment is indicated; if the descriptor type is for a TSS or task gate, a task switch is indicated. — On a call or jump through a call gate (or on an interrupt- or exception-handler call through a trap or interrupt gate[...]

  • Page 176

    5-10 Vol. 3 PROTECTION The processor uses privilege levels to prevent a program or task operating at a lesser privilege level from accessing a segment with a greater privilege, except under controlled situations. When the processor detects a privilege level violation, it generates a general-protection exception (#GP). To carry out privilege-[...]

  • Page 177

    Vol. 3 5-11 PROTECTION example, if the DPL of a data segment is 1, only programs running at a CPL of 0 or 1 can access the segment. — Nonconforming code segment (without using a call gate) — The DPL indicates the privilege level that a program or task must be at to access the segment. For example, i[...]

  • Page 178

    5-12 Vol. 3 PROTECTION loads the segment selector into the segment register if the DPL is numerically greater than or equal to both the CPL and the RPL. Otherwise, a general-protection fault is generated and the segment register is not loaded. Figure 5-5 shows four procedures (located in code segments A, B, C, and D), each running at different[...]
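The data-segment rule stated here (DPL numerically greater than or equal to both the CPL and the RPL) can be sketched as a small predicate. The function name is hypothetical; privilege levels are the integers 0 (most privileged) through 3.

```python
def can_load_data_segment(cpl, rpl, dpl):
    """True if a data-segment selector may be loaded: the DPL must be
    numerically >= both the CPL and the RPL; otherwise #GP is raised."""
    return dpl >= max(cpl, rpl)

# A CPL-0 caller using a selector with RPL 3 is treated like privilege level 3.
assert can_load_data_segment(cpl=0, rpl=3, dpl=3)
assert not can_load_data_segment(cpl=0, rpl=3, dpl=2)
```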

  • Page 179

    Vol. 3 5-13 PROTECTION As demonstrated in the previous examples, the addressable domain of a program or task varies as its CPL changes. When the CPL is 0, data segments at all privilege levels are accessible; when the CPL is 1, only data segments at privilege levels 1 through 3 are accessible; when the CPL is 3, only data segments at privileg[...]

  • Page 180

    5-14 Vol. 3 PROTECTION • Load a data-segment register with a segment selector for a nonconforming, readable, code segment. • Load a data-segment register with a segment selector for a conforming, readable, code segment. • Use a code-segment override prefix (CS) to read a readable, code segment whose selector is already loaded in the CS re[...]

  • Page 181

    Vol. 3 5-15 PROTECTION • The target operand points to a TSS, which contains the segment selector for the target code segment. • The target operand points to a task gate, which points to a TSS, which in turn contains the segment selector for the target code segment. The following sections describe the first two types of references. See Section 7.[...]

  • Page 182

    5-16 Vol. 3 PROTECTION • The RPL of the segment selector of the destination code segment. • The conforming (C) flag in the segment descriptor for the destination code segment, which determines whether the segment is a conforming (C flag is set) or nonconforming (C flag is clear) code segment. See Section 3.4.5.1, “Code- and Data-Segment De[...]

  • Page 183

    Vol. 3 5-17 PROTECTION The RPL of the segment selector that points to a nonconforming code segment has a limited effect on the privilege check. The RPL must be numerically less than or equal to the CPL of the calling procedure for a successful control transfer to occur. So, in the example in Figure 5-7, the RPLs of segment selectors C1 and C2 [...]

  • Page 184

    5-18 Vol. 3 PROTECTION In the example in Figure 5-7, code segment D is a conforming code segment. Therefore, calling procedures in both code segment A and B can access code segment D (using either segment selector D1 or D2, respectively), because they both have CPLs that are greater than or equal to the DPL of the conforming code segment. For [...]
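The two cases for a direct far CALL or JMP (no call gate) can be summarized in a short sketch. It assumes the rules given in this chapter: for a nonconforming target the CPL must equal the DPL and the RPL must be numerically less than or equal to the CPL; for a conforming target only CPL >= DPL is required, and the sketch assumes the RPL is not checked in that case. The function name is hypothetical.

```python
def can_transfer_direct(cpl, rpl, dpl, conforming):
    """Privilege check for a direct far CALL/JMP to a code segment
    (no call gate involved). Levels are integers 0..3."""
    if conforming:
        # Conforming: callable from the same or a less privileged level.
        return cpl >= dpl
    # Nonconforming: caller must be at exactly the segment's level.
    return cpl == dpl and rpl <= cpl
```

For instance, a conforming segment with DPL 1 is reachable from CPL 2 or CPL 3, while a nonconforming segment with DPL 2 is reachable only from CPL 2.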

  • Page 185

    Vol. 3 5-19 PROTECTION 5.8.3 Call Gates Call gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level protection mechanism. Call gates are also useful for transferring program control between 16-bit and 32-bit code [...]

  • Page 186

    5-20 Vol. 3 PROTECTION Note that the P flag in a gate descriptor is normally always set to 1. If it is set to 0, a not present (#NP) exception is generated when a program attempts to access the descriptor. The operating system can use the P flag for special purposes. For example, it could be used to track the number of tim[...]

  • Page 187

    Vol. 3 5-21 PROTECTION • Target code segments referenced by a 64-bit call gate must be 64-bit code segments (CS.L = 1, CS.D = 0). If not, the reference generates a general-protection exception, #GP (CS selector). • Only 64-bit mode call gates can be referenced in IA-32e mode (64-bit mode and compatibility mode). The legacy 32-bit mode c[...]

  • Page 188

    5-22 Vol. 3 PROTECTION 5.8.4 Accessing a Code Segment Through a Call Gate To access a call gate, a far pointer to the gate is provided as a target operand in a CALL or JMP instruction. The segment selector from this pointer identifies the call gate (see Figure 5-10); the offset from the pointer is required, but not used or checked by the proces[...]

  • Page 189

    Vol. 3 5-23 PROTECTION The privilege checking rules are different depending on whether the control transfer was initiated with a CALL or a JMP instruction, as shown in Table 5-1. The DPL field of the call-gate descriptor specifies the numerically highest privilege level from which a calling procedure can access the call gate; that is, to ac[...]

  • Page 190

    5-24 Vol. 3 PROTECTION segments B and C. The dotted line shows that a calling procedure in code segment A cannot access call gate B. The RPL of the segment selector to a call gate must satisfy the same test as the CPL of the calling procedure; that is, the RPL must be less than or equal to the DPL of the call gate. In the example in Figure 5-15, [...]
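The gate-access half of the call-gate check (both the CPL and the RPL must be numerically less than or equal to the gate's DPL) fits in a small predicate. This sketch covers only access to the gate itself; the further checks against the target code segment's DPL, which differ between CALL and JMP, are left out. The function name is hypothetical.

```python
def can_access_call_gate(cpl, rpl, gate_dpl):
    """True if the caller may use the call gate: both the CPL and the
    selector's RPL must be numerically <= the DPL of the gate."""
    return max(cpl, rpl) <= gate_dpl

# A gate with DPL 3 is open to every privilege level.
assert all(can_access_call_gate(cpl, cpl, 3) for cpl in range(4))
```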

  • Page 191

    Vol. 3 5-25 PROTECTION Call gates allow a single code segment to have procedures that can be accessed at different privilege levels. For example, an operating system located in a code segment may have some services which are intended to be used by both the operating system and application software (such as procedures for handling character I/O[...]

  • Page 192

    5-26 Vol. 3 PROTECTION Each task must define up to 4 stacks: one for applications code (running at privilege level 3) and one for each of the privilege levels 2, 1, and 0 that are used. (If only two privilege levels are used [3 and 0], then only two stacks must be defined.) Each of these stacks is located in a separate segment and is identified [...]

  • Page 193

    Vol. 3 5-27 PROTECTION 3. Checks the stack-segment descriptor for the proper privileges and type and generates an invalid TSS (#TS) exception if violations are detected. 4. Temporarily saves the current values of the SS and ESP registers. 5. Loads the segment selector and stack pointer for the new stack in the SS and ESP registers. 6. Push[...]

  • Page 194

    5-28 Vol. 3 PROTECTION dure, one of the parameters can be a pointer to a data structure, or the saved contents of the SS and ESP registers may be used to access parameters in the old stack space. The size of the data items passed to the called procedure depends on the call gate size, as described in Section 5.8.3, “Call Gates.” 5.8.5.1 St[...]

  • Page 195

    Vol. 3 5-29 PROTECTION intended to execute returns from procedures that were called with a CALL instruction. It does not support returns from a JMP instruction, because the JMP instruction does not save a return instruction pointer on the stack. A near return only transfers program control within the current code segment; therefore, the p[...]

  • Page 196

    5-30 Vol. 3 PROTECTION 5. (If the RET instruction includes a parameter count operand.) Adds the parameter count (in bytes obtained from the RET instruction) to the current ESP register value, to step past the parameters on the calling procedure’s stack. The resulting ESP value is not checked against the limit of the stack segment. If the ESP v[...]

  • Page 197

    Vol. 3 5-31 PROTECTION • Stack segment — Computed by adding 24 to the value in IA32_SYSENTER_CS. • Stack pointer — Reads this from ECX. The SYSENTER and SYSEXIT instructions perform “fast” calls and returns because they force the processor into a predefined privilege level 0 state when SYSENTER is executed and into a predefined priv[...]

  • Page 198

    5-32 Vol. 3 PROTECTION When SYSEXIT transfers control to compatibility mode user code when the operand size attribute is 32 bits, the following fields are generated and bits set: • Target code segment — Computed by adding 16 to the value in IA32_SYSENTER_CS. • New CS attributes — L-bit = 0 (go to compatibility mode). • Target instructi[...]
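The selector arithmetic described for SYSEXIT with a 32-bit operand size (target code segment at IA32_SYSENTER_CS + 16, stack segment at IA32_SYSENTER_CS + 24) can be sketched as follows. The function name is hypothetical, the MSR value is passed in as a plain integer, and any adjustment of the selectors' RPL bits is intentionally left out of this sketch.

```python
def sysexit_selectors_32(ia32_sysenter_cs):
    """Return (target_cs, target_ss) as derived from IA32_SYSENTER_CS for
    SYSEXIT with a 32-bit operand size, per the offsets quoted in the text."""
    target_cs = ia32_sysenter_cs + 16
    target_ss = ia32_sysenter_cs + 24
    return target_cs, target_ss

# With IA32_SYSENTER_CS = 0x10 (an illustrative value), the user-mode
# code and stack selectors fall at 0x20 and 0x28.
assert sysexit_selectors_32(0x10) == (0x20, 0x28)
```

One consequence of this scheme is that the operating system has to lay out the relevant descriptors at fixed distances from one another in the GDT.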

  • Page 199

    Vol. 3 5-33 PROTECTION When SYSRET transfers control to 32-bit mode user code using a 32-bit operand size, the processor gets the privilege level 3 target instruction and stack pointer from: • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48]. • Target instruction — Copies the value in ECX into EIP. • Stack segment [...]

  • Page 200

    5-34 Vol. 3 PROTECTION general-protection exception (#GP) is generated. The following system instructions are privileged instructions: • LGDT — Load GDT register. • LLDT — Load LDT register. • LTR — Load task register. • LIDT — Load IDT register. • MOV (control registers) — Load and store control registers. • LMSW —[...]

  • Page 201

    Vol. 3 5-35 PROTECTION The processor automatically performs the first, second, and third checks during instruction execution. Software must explicitly request the fourth check by issuing an ARPL instruction. The fifth check (offset alignment) is performed automatically at privilege level 3 if alignment checking is turned on. Offset alignment does[...]

  • Page 202

    5-36 Vol. 3 PROTECTION 5.10.2 Checking Read/Write Rights (VERR and VERW Instructions) When the processor accesses any code or data segment it checks the read/write privileges assigned to the segment to verify that the intended read or write operation is allowed. Software can check read/write rights using the VERR (verify for reading) and[...]

  • Page 203

    Vol. 3 5-37 PROTECTION destination register and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an invalid type for the LSL instruction, the instruction does not modify the destination register and clears the ZF flag. Once loaded in the destination register, software can co[...]

  • Page 204

    5-38 Vol. 3 PROTECTION Now assume that instead of setting the RPL of the segment selector to 3, the application program sets the RPL to 0 (segment selector D2). The operating system can now access data segment D, because its CPL and the RPL of segment selector D2 are both equal to the DPL of data segment D. Because the application program is [...]

  • Page 205

    Vol. 3 5-39 PROTECTION The example in Figure 5-15 demonstrates how the ARPL instruction is intended to be used. When the operating system receives segment selector D2 from the application program, it uses the ARPL instruction to compare the RPL of the segment selector with the privilege level of the application program (represented by the code-seg[...]
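The comparison ARPL performs can be modeled in a few lines: if the RPL (low two bits) of the selector received from the caller is numerically lower than the RPL of the caller's code-segment selector, it is raised to match and ZF is set; otherwise the selector is left alone and ZF is cleared. The function below is a hypothetical software model that returns the adjusted selector together with the resulting ZF value.

```python
RPL_MASK = 0x3  # the RPL occupies bits 0-1 of a segment selector

def arpl(dest_selector, src_selector):
    """Model of ARPL: returns (new_dest_selector, zf)."""
    dest_rpl = dest_selector & RPL_MASK
    src_rpl = src_selector & RPL_MASK
    if dest_rpl < src_rpl:
        # Weaken the selector to the caller's privilege level.
        return (dest_selector & ~RPL_MASK) | src_rpl, True
    return dest_selector, False

# A level-3 caller (CS selector 0x1B) passes a selector with RPL 0:
# ARPL raises its RPL to 3 and reports the adjustment through ZF.
assert arpl(0x0010, 0x001B) == (0x0013, True)
```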

  • Page 206

    5-40 Vol. 3 PROTECTION page-fault exception mechanism. This chapter describes the protection violations which lead to page-fault exceptions. 5.11.1 Page-Protection Flags Protection information for pages is contained in two flags in a paging-structure entry (see Chapter 4): the read/write flag (bit 1) and the user/supervisor flag (bit 2). The pro[...]

  • Page 207

    Vol. 3 5-41 PROTECTION When the processor is in supervisor mode and the WP flag in register CR0 is clear (its state following reset initialization), all pages are both readable and writable (write-protection is ignored). When the processor is in user mode, it can write only to user-mode pages that are read/write accessible. User-mode pages whic[...]
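The write rules in this passage can be condensed into a small decision function. This is a simplified sketch: it treats `page_user` and `page_rw` as the already-combined U/S and R/W attributes of the translation, ignores every other protection feature, and assumes that with CR0.WP set a supervisor write to a read-only page is refused. The function and parameter names are hypothetical.

```python
def page_write_allowed(supervisor_mode, cr0_wp, page_user, page_rw):
    """Decide whether a data write to a page is permitted.

    supervisor_mode: True when the access is made in supervisor mode
    cr0_wp:          state of the WP flag in CR0
    page_user:       True for a user-mode page (U/S = 1)
    page_rw:         True for a read/write page (R/W = 1)
    """
    if supervisor_mode:
        # With WP clear, write protection is ignored for supervisor accesses.
        return page_rw or not cr0_wp
    # User mode: only user-mode pages that are read/write may be written.
    return page_user and page_rw
```

For example, `page_write_allowed(True, False, True, False)` is `True` (supervisor, WP clear, read-only page), while the same write with `cr0_wp=True` is refused.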

  • Page 208

    5-42 Vol. 3 PROTECTION exception is generated. If an exception is generated by segmentation, no paging exception is generated. Page-level protections cannot be used to override segment-level protection. For example, a code segment is by definition not writable. If a code segment is paged, setting the R/W flag for the pages to read-write does [...]

  • Page 209

    Vol. 3 5-43 PROTECTION 5.13 PAGE-LEVEL PROTECTION AND EXECUTE-DISABLE BIT In addition to page-level protection offered by the U/S and R/W flags, paging structures used with PAE paging and IA-32e paging (see Chapter 4) provide the execute-disable bit. This bit offers additional protection for data pages. An Intel 64 or IA-32 processor wi[...]

  • Page 210

    5-44 Vol. 3 PROTECTION 5.13.2 Execute-Disable Page Protection The execute-disable bit in the paging structures enhances page protection for data pages. Instructions cannot be fetched from a memory page if IA32_EFER.NXE = 1 and the execute-disable bit is set in any of the paging-structure entries used to map the page. Table 5-5 lists the valid[...]
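The fetch rule (no instruction fetch when IA32_EFER.NXE = 1 and the execute-disable bit is set in any entry used to map the page) reduces to one boolean expression. In this sketch, `xd_bits` is a hypothetical list holding the XD bit of each paging-structure entry along the translation path.

```python
def fetch_allowed(nxe, xd_bits):
    """True if an instruction fetch from the page is permitted.

    nxe:     value of IA32_EFER.NXE (True/False)
    xd_bits: XD bit of each paging-structure entry that maps the page
    """
    return not (nxe and any(xd_bits))

# One XD bit anywhere along the path is enough to block fetches,
# but only when NXE is enabled.
assert not fetch_allowed(True, [0, 0, 1, 0])
assert fetch_allowed(False, [0, 0, 1, 0])
```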

  • Page 211

    Vol. 3 5-45 PROTECTION 5.13.3 Reserved Bit Checking The processor enforces reserved bit checking in paging data structure entries. The bits being checked vary with paging mode and may vary with the size of physical address space. Table 5-8 shows the reserved bits that are checked when the execute disable bit capability is e[...]

  • Page 212

    5-46 Vol. 3 PROTECTION If execute disable bit capability is not enabled or not available, reserved bit checking in 64-bit mode includes bit 63 and additional bits. This and reserved bit checking for legacy 32-bit paging modes are shown in Table 5-10. Table 5-8. IA-32e Mode Page Level Protection Matrix with Execute-Disable Bit Capability E[...]

  • Page 213

    Vol. 3 5-47 PROTECTION 5.13.4 Exception Handling When execute disable bit capability is enabled (IA32_EFER.NXE = 1), conditions for a page fault to occur include the same conditions that apply to an Intel 64 or IA-32 processor without execute disable bit capability plus the following new condition: an instruction fetch to a linear address that[...]

  • Page 214

    5-48 Vol. 3 PROTECTION[...]

  • Page 215

    Vol. 3 6-1 CHAPTER 6 INTERRUPT AND EXCEPTION HANDLING This chapter describes the interrupt and exception-handling mechanism when operating in protected mode on an Intel 64 or IA-32 processor. Most of the information provided here also applies to interrupt and exception mechanisms used in real-address, virtual-8086 mode, and 64-bit mode.[...]

  • Page 216

    6-2 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.2 EXCEPTION AND INTERRUPT VECTORS To aid in handling exceptions and interrupts, each architecturally defined exception and each interrupt condition requiring special handling by the processor is assigned a unique identification number, called a vector. The processor uses the vector assigned to [...]

  • Page 217

    Vol. 3 6-3 INTERRUPT AND EXCEPTION HANDLING (see Section 6.2, “Exception and Interrupt Vectors”). Asserting the NMI pin signals a non-maskable interrupt (NMI), which is assigned to interrupt vector 2. Table 6-1. Protected-Mode Exceptions and Interrupts Vector No. Mnemonic Description Type Error Code Source 0 #DE Divide Error F[...]

  • Page 218

    6-4 Vol. 3 INTERRUPT AND EXCEPTION HANDLING The processor’s local APIC is normally connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC’s pins can be directed to the local APIC through the system bus (Pentium 4, Intel Core Duo, Intel Core 2, Intel Atom, and Intel Xeon processors) or the APIC serial bus ([...]

  • Page 219

    Vol. 3 6-5 INTERRUPT AND EXCEPTION HANDLING defined interrupt vectors from 0 through 255; those that can be delivered through the local APIC include interrupt vectors 16 through 255. The IF flag in the EFLAGS register permits all maskable hardware interrupts to be masked as a group (see Section 6.8.1, “Masking Maskable Hardware Interrupts” [...]

  • Page 220

    6-6 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.4.2 Software-Generated Exceptions The INTO, INT 3, and BOUND instructions permit exceptions to be generated in software. These instructions allow checks for exception conditions to be performed at points in the instruction stream. For example, INT 3 causes a breakpoint exception to be genera[...]

  • Page 221

    Vol. 3 6-7 INTERRUPT AND EXCEPTION HANDLING • Aborts — An abort is an exception that does not always report the precise location of the instruction causing the exception and does not allow a restart of the program or task that caused the exception. Aborts are used to report severe errors, such as hardware errors and inconsistent or illegal[...]

  • Page 222

    6-8 Vol. 3 INTERRUPT AND EXCEPTION HANDLING EFLAGS.OF (overflow) flag. The trap handler for this exception resolves the overflow condition. Upon return from the trap handler, program or task execution continues at the instruction following the INTO instruction. The abort-class exceptions do not support reliable restarting of the program o[...]

  • Page 223

    Vol. 3 6-9 INTERRUPT AND EXCEPTION HANDLING It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processor’s NMI-handling hardware can only be delivered through one of th[...]

  • Page 224

    6-10 Vol. 3 INTERRUPT AND EXCEPTION HANDLING is an interrupt. As with the INT n instruction (see Section 6.4.2, “Software-Generated Exceptions”), when an interrupt is generated through the INTR pin to an exception vector, the processor does not push an error code on the stack, so the exception handler may not operate correctly. The IF fl[...]

  • Page 225

    Vol. 3 6-11 INTERRUPT AND EXCEPTION HANDLING 6.8.3 Masking Exceptions and Interrupts When Switching Stacks To switch to a different stack segment, software often uses a pair of instructions, for example: MOV SS, AX MOV ESP, StackTop If an interrupt or exception occurs after the segment selector has been loaded into the SS register but befo[...]

  • Page 226

    6-12 Vol. 3 INTERRUPT AND EXCEPTION HANDLING While priority among these classes listed in Table 6-2 is consistent throughout the architecture, exceptions within each class are implementation-dependent and may vary from processor to processor. The processor first services a pending exception or interrupt from the class which has the highest pr[...]

  • Page 227

    Vol. 3 6-13 INTERRUPT AND EXCEPTION HANDLING protected mode). Unlike the GDT, the first entry of the IDT may contain a descriptor. To form an index into the IDT, the processor scales the exception or interrupt vector by eight (the number of bytes in a gate descriptor). Because there are only 256 interrupt or exception vectors, the IDT n[...]
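The scaling step (vector multiplied by eight, the size of a protected-mode gate descriptor) is illustrated below. The function name and the example base address are hypothetical; the sketch covers only the 8-byte descriptors this passage describes.

```python
GATE_DESCRIPTOR_SIZE = 8  # bytes per gate descriptor in protected mode
MAX_VECTOR = 255          # vectors range from 0 through 255

def idt_entry_address(idt_base, vector):
    """Linear address of the gate descriptor for `vector`."""
    if not 0 <= vector <= MAX_VECTOR:
        raise ValueError("vector must be in the range 0..255")
    return idt_base + vector * GATE_DESCRIPTOR_SIZE

# Vector 14 lies 112 (0x70) bytes past the start of the table.
assert idt_entry_address(0x1000, 14) == 0x1070
```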

  • Page 228

    6-14 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.11 IDT DESCRIPTORS The IDT may contain any of three kinds of gate descriptors: • Task-gate descriptor • Interrupt-gate descriptor • Trap-gate descriptor Figure 6-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The format of a task gate used in an IDT is [...]

  • Page 229

    Vol. 3 6-15 INTERRUPT AND EXCEPTION HANDLING 6.12 EXCEPTION AND INTERRUPT HANDLING The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses the exception or interrupt vector as an inde[...]

  • Page 230

    6-16 Vol. 3 INTERRUPT AND EXCEPTION HANDLING “Returning from a Called Procedure”). If the index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (see Section 7.3, “Task Switching”). 6.12.1 Exception- or Interrupt-Handler Procedures An [...]

  • Page 231

    Vol. 3 6-17 INTERRUPT AND EXCEPTION HANDLING When the processor performs a call to the exception- or interrupt-handler procedure: • If the handler procedure is going to be executed at a numerically lower privilege level, a stack switch occurs. When the stack switch occurs: a. The segment selector and stack pointer for the stack to be used by t[...]

  • Page 232

    6-18 Vol. 3 INTERRUPT AND EXCEPTION HANDLING To return from an exception- or interrupt-handler procedure, the handler must use the IRET (or IRETD) instruction. The IRET instruction is similar to the RET instruction except that it restores the saved flags into the EFLAGS register. The IOPL field of the EFLAGS register is restored only if the CP[...]

  • Page 233

    Vol. 3 6-19 INTERRUPT AND EXCEPTION HANDLING not permit transfer of execution to an exception- or interrupt-handler procedure in a less privileged code segment (numerically greater privilege level) than the CPL. An attempt to violate this rule results in a general-protection exception (#GP). The protection mechanism for exception- and interrup[...]

  • Page 234

    6-20 Vol. 3 INTERRUPT AND EXCEPTION HANDLING of the EFLAGS register on the stack. Accessing a handler procedure through a trap gate does not affect the IF flag. 6.12.2 Interrupt Tasks When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling an exception or interrupt with a separate tas[...]

  • Page 235

    Vol. 3 6-21 INTERRUPT AND EXCEPTION HANDLING 6.13 ERROR CODE When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 6-6. The error code resembles a segment selector; however, i[...]

  • Page 236

    6-22 Vol. 3 INTERRUPT AND EXCEPTION HANDLING clear, indicates that the index refers to a descriptor in the GDT or the current LDT. TI GDT/LDT (bit 2) — Only used when the IDT flag is clear. When set, the TI flag indicates that the index portion of the error code refers to a segment or gate descriptor in the LDT; when clear, it indicates[...]
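A handler could unpack a selector-style error code along the following lines. The sketch assumes the layout this section describes: EXT in bit 0, IDT in bit 1, TI in bit 2, and the segment selector index in bits 3 through 15. The function name and the dictionary keys are hypothetical.

```python
def decode_error_code(code):
    """Split a selector-style exception error code into its fields."""
    return {
        "ext": bool(code & 0x1),        # bit 0: event external to the program
        "idt": bool(code & 0x2),        # bit 1: index refers to an IDT gate
        "ti": bool(code & 0x4),         # bit 2: LDT (1) or GDT (0); used only if idt is clear
        "index": (code >> 3) & 0x1FFF,  # bits 3-15: descriptor or gate index
    }

# 0x0052 has the IDT flag set and index 10, i.e. it refers to IDT entry 10.
fields = decode_error_code(0x0052)
assert fields["idt"] and fields["index"] == 10
```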

  • Page 237

    Vol. 3 6-23 INTERRUPT AND EXCEPTION HANDLING • The stack pointer (SS:RSP) is pushed unconditionally on interrupts. In legacy modes, this push is conditional and based on a change in current privilege level (CPL). • The new SS is set to NULL if there is a change in CPL. • IRET behavior changes. • There is a new interrupt stack-switch mec[...]

  • Page 238

    6-24 Vol. 3 INTERRUPT AND EXCEPTION HANDLING ware attempts to reference an interrupt gate with a target RIP that is not in canonical form. The target code segment referenced by the interrupt gate must be a 64-bit code segment (CS.L = 1, CS.D = 0). If the target is not a 64-bit code segment, a general-protection exception (#GP) is generated w[...]

  • Page 239

    Vol. 3 6-25 INTERRUPT AND EXCEPTION HANDLING 6.14.3 IRET in IA-32e Mode In IA-32e mode, IRET executes with an 8-byte operand size. There is nothing that forces this requirement. The stack is formatted in such a way that for actions where IRET is required, the 8-byte IRET operand size works correctly. Because interrupt stack-frame pushes ar[...]

  • Page 240

    6-26 Vol. 3 INTERRUPT AND EXCEPTION HANDLING In summary, a stack switch in IA-32e mode works like the legacy stack switch, except that a new SS selector is not loaded from the TSS. Instead, the new SS is forced to NULL. 6.14.5 Interrupt Stack Table In IA-32e mode, a new interrupt stack table (IST) mechanism is available as an alternative[...]

  • Page 241

    Vol. 3 6-27 INTERRUPT AND EXCEPTION HANDLING 6.15 EXCEPTION AND INTERRUPT REFERENCE The following sections describe conditions which generate exceptions and interrupts. They are arranged in the order of vector numbers. The information contained in these sections is as follows: • Exception Class — Indicates whether the exceptio[...]

  • Page 242

    6-28 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 0—Divide Error Exception (#DE) Exception Class Fault. Description Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand. Exception Error Code None. Saved Instructi[...]

  • Page 243

    Vol. 3 6-29 INTERRUPT AND EXCEPTION HANDLING Interrupt 1—Debug Exception (#DB) Exception Class Trap or Fault. The exception handler can distinguish between traps or faults by examining the contents of DR6 and the other debug registers. Description Indicates that one or more of several debug-exception conditions has been detected. [...]

  • Page 244

    6-30 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 2—NMI Interrupt Exception Class Not applicable. Description The nonmaskable interrupt (NMI) is generated externally by asserting the processor’s NMI pin or through an NMI request set by the I/O APIC to the local APIC. This interrupt causes the NMI interrupt handler to be called. E[...]

  • Page 245

    Vol. 3 6-31 INTERRUPT AND EXCEPTION HANDLING Interrupt 3—Breakpoint Exception (#BP) Exception Class Trap. Description Indicates that a breakpoint instruction (INT 3) was executed, causing a breakpoint trap to be generated. Typically, a debugger sets a breakpoint by replacing the first opcode byte of an instruction with the op[...]

  • Page 246

    6-32 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 4—Overflow Exception (#OF) Exception Class Trap. Description Indicates that an overflow trap occurred when an INTO instruction was executed. The INTO instruction checks the state of the OF flag in the EFLAGS register. If the OF flag is set, an overflow trap is generated. Som[...]

  • Page 247

    Vol. 3 6-33 INTERRUPT AND EXCEPTION HANDLING Interrupt 5—BOUND Range Exceeded Exception (#BR) Exception Class Fault. Description Indicates that a BOUND-range-exceeded fault occurred when a BOUND instruction was executed. The BOUND instruction checks that a signed array index is within the upper and lower bounds of an array located i[...]
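The check that BOUND performs can be sketched in C. This is an illustrative model under stated assumptions: it covers the 32-bit operand form, where the memory operand holds a signed lower bound followed by a signed upper bound, and both bounds are inclusive. The type `bound_pair32` and the function `bound32_raises_br` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the BOUND memory operand (32-bit form):
 * lower bound first, upper bound second, both signed. */
typedef struct {
    int32_t lower;
    int32_t upper;
} bound_pair32;

/* Returns true when the modeled BOUND instruction would raise #BR. */
bool bound32_raises_br(int32_t index, const bound_pair32 *bounds)
{
    return index < bounds->lower || index > bounds->upper;
}
```

Because the comparison is signed, an index of -1 checked against bounds of 0 and 9 is rejected rather than treated as a large unsigned value.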

  • Page 248

    6-34 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 6—Invalid Opcode Exception (#UD) Exception Class Fault. Description Indicates that the processor did one of the following things: • Attempted to execute an invalid or reserved opcode. • Attempted to execute an instruction with an operand type that is invalid for its accom[...]

  • Page 249

    Vol. 3 6-35 INTERRUPT AND EXCEPTION HANDLING processor and earlier IA-32 processors, this exception is not generated as the result of prefetching and preliminary decoding of an invalid instruction. (See Section 6.5, “Exception Classifications,” for general rules for taking of interrupts and exceptions.) The opcodes D6 and F1 are undefined[...]

  • Page 250

    6-36 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 7—Device Not Available Exception (#NM) Exception Class Fault. Description Indicates one of the following things: The device-not-available exception is generated by any of three conditions: • The processor executed an x87 FPU floating-point instruction while the EM flag [...]

  • Page 251

    Vol. 3 6-37 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the floating-point instruction or the WAIT/FWAIT instruction that generated the exception. Program State Change A program-state change does not accompany a device-not-available fault, because the instruction that gener[...]

  • Page 252

    6-38 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 8—Double Fault Exception (#DF) Exception Class Abort. Description Indicates that the processor detected a second exception while calling an exception handler for a prior exception. Normally, when the processor detects another exception while trying to call an exception handler, t[...]

  • Page 253

    Vol. 3 6-39 INTERRUPT AND EXCEPTION HANDLING A segment or page fault may be encountered while prefetching instructions; however, this behavior is outside the domain of Table 6-5. Any further faults generated while the processor is attempting to transfer control to the appropriate fault handler could still lead to a double-fault sequence. I[...]

  • Page 254

    6-40 Vol. 3 INTERRUPT AND EXCEPTION HANDLING If the double fault occurs when any portion of the exception handling machine state is corrupted, the handler cannot be invoked and the processor must be reset.[...]

  • Page 255

    Vol. 3 6-41 INTERRUPT AND EXCEPTION HANDLING Interrupt 9—Coprocessor Segment Overrun Exception Class Abort. (Intel reserved; do not use. Recent IA-32 processors do not generate this exception.) Description Indicates that an Intel386 CPU-based system with an Intel 387 math coprocessor detected a page or segment violation while tran[...]

  • Page 256

    6-42 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 10—Invalid TSS Exception (#TS) Exception Class Fault. Description Indicates that there was an error related to a TSS. Such an error might be detected during a task switch or during the execution of instructions that use information from a TSS. Table 6-6 shows the conditions tha[...]

  • Page 257

    Vol. 3 6-43 INTERRUPT AND EXCEPTION HANDLING
    Stack segment selector index: The stack segment selector exceeds descriptor table limit.
    Stack segment selector index: The stack segment selector is NULL.
    Stack segment selector index: The stack segment descriptor is a non-data segment.
    Stack segment selector index: The stack segment is not writa[...]

  • Page 258

    6-44 Vol. 3 INTERRUPT AND EXCEPTION HANDLING This exception can be generated either in the context of the original task or in the context of the new task (see Section 7.3, “Task Switching”). Until the processor has completely verified the presence of the new TSS, the exception is generated in the context of the original task. Once the exis [...]

  • Page 259

    Vol. 3 6-45 INTERRUPT AND EXCEPTION HANDLING If an invalid TSS exception occurs during a task switch, it can occur before or after the commit-to-new-task point. If it occurs before the commit point, no program state change occurs. If it occurs after the commit point (when the segment descriptor information for the new segment selectors have bee[...]

  • Page 260

    6-46 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 11—Segment Not Present (#NP) Exception Class Fault. Description Indicates that the present flag of a segment or gate descriptor is clear. The processor can generate this exception during any of the following operations: • While attempting to load CS, DS, ES, FS, or GS registe[...]

  • Page 261

    Vol. 3 6-47 INTERRUPT AND EXCEPTION HANDLING tors for the segment selectors in a new TSS, the CS and EIP registers point to the first instruction in the new task. If the exception occurred while accessing a gate descriptor, the CS and EIP registers point to the instruction that invoked the access (for example a CALL instruction that references [...]

  • Page 262

    6-48 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 12—Stack Fault Exception (#SS) Exception Class Fault. Description Indicates that one of the following stack related conditions was detected: • A limit violation is detected during an operation that refers to the SS register. Operations that can cause a limit violation includ[...]

  • Page 263

    Vol. 3 6-49 INTERRUPT AND EXCEPTION HANDLING Program State Change A program-state change does not generally accompany a stack-fault exception, because the instruction that generated the fault is not executed. Here, the instruction can be restarted after the exception handler has corrected the stack fault condition. If a stack fault occurs dur[...]

  • Page 264

    6-50 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 13—General Protection Exception (#GP) Exception Class Fault. Description Indicates that the processor detected one of a class of protection violations called “general-protection violations.” The conditions that cause this exception to be generated comprise all the prote[...]

  • Page 265

    Vol. 3 6-51 INTERRUPT AND EXCEPTION HANDLING • Loading the CR0 register with a set NW flag and a clear CD flag. • Referencing an entry in the IDT (following an interrupt or exception) that is not an interrupt, trap, or task gate. • Attempting to access an interrupt or exception handler through an interrupt or trap gate from virtual-8086[...]

  • Page 266

    6-52 Vol. 3 INTERRUPT AND EXCEPTION HANDLING • A selector from a TSS involved in a task switch. • IDT vector number. Saved Instruction Pointer The saved contents of CS and EIP registers point to the instruction that generated the exception. Program State Change In general, a program-state change does not accompany a general-protectio[...]

  • Page 267

    Vol. 3 6-53 INTERRUPT AND EXCEPTION HANDLING • If the segment descriptor pointed to by the segment selector in the destination operand is a code segment and it has both the D-bit and the L-bit set. • If the segment descriptor from a 64-bit call gate is in non-canonical space. • If the DPL from a 64-bit call-gate is less than the CPL or [...]

  • Page 268

    6-54 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 14—Page-Fault Exception (#PF) Exception Class Fault. Description Indicates that, with paging enabled (the PG flag in the CR0 register is set), the processor detected one of the following conditions while using the page-translation mechanism to translate a linear address to a phy[...]

  • Page 269

    Vol. 3 6-55 INTERRUPT AND EXCEPTION HANDLING — The U/S flag indicates whether the processor was executing at user mode (1) or supervisor mode (0) at the time of the exception. — The RSVD flag indicates that the processor detected 1s in reserved bits of the page directory, when the PSE or PAE flags in control register CR4 are set to 1. Note[...]
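A page-fault handler typically starts by decoding these error-code flags. The sketch below assumes the conventional bit positions (P in bit 0, W/R in bit 1, U/S in bit 2, RSVD in bit 3); the enum constants, the `pf_info` type, and `decode_pf_error_code` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the page-fault error-code flags. */
enum {
    PF_P    = 1u << 0,  /* 0: page not present, 1: page-level protection violation */
    PF_WR   = 1u << 1,  /* 0: read access, 1: write access */
    PF_US   = 1u << 2,  /* 0: supervisor mode, 1: user mode */
    PF_RSVD = 1u << 3   /* 1: reserved bit set in a paging-structure entry */
};

typedef struct {
    bool protection_violation;
    bool write;
    bool user_mode;
    bool reserved_bit;
} pf_info;

/* Splits the error code pushed for #PF into named booleans. */
pf_info decode_pf_error_code(uint32_t error_code)
{
    pf_info info;
    info.protection_violation = (error_code & PF_P) != 0;
    info.write                = (error_code & PF_WR) != 0;
    info.user_mode            = (error_code & PF_US) != 0;
    info.reserved_bit         = (error_code & PF_RSVD) != 0;
    return info;
}
```

With this decoding, an error code of 6 describes a user-mode write to a page that is not present, which is the common case for demand paging.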

  • Page 270

    6-56 Vol. 3 INTERRUPT AND EXCEPTION HANDLING second page fault can occur. 1 If a page fault is caused by a page-level protection violation, the access flag in the page-directory entry is set when the fault occurs. The behavior of IA-32 processors regarding the access flag in the corresponding page-table entry is model specific and not archite[...]

  • Page 271

    Vol. 3 6-57 INTERRUPT AND EXCEPTION HANDLING description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.) Additional Exception-Handling Information Special care should be taken to ensure that an exception that occurs during an explicit stack switch does not c[...]

  • Page 272

    6-58 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 16—x87 FPU Floating-Point Error (#MF) Exception Class Fault. Description Indicates that the x87 FPU has detected a floating-point error. The NE flag in the register CR0 must be set for an interrupt 16 (floating-point error exception) to be generated. (See Section 2.5, “Contr[...]

  • Page 273

    Vol. 3 6-59 INTERRUPT AND EXCEPTION HANDLING Prior to executing a waiting x87 FPU instruction or the WAIT/FWAIT instruction, the x87 FPU checks for pending x87 FPU floating-point exceptions (as described in step 2 above). Pending x87 FPU floating-point exceptions are ignored for “non-waiting” x87 FPU instructions, which include the FNINIT [...]

  • Page 274

    6-60 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 17—Alignment Check Exception (#AC) Exception Class Fault. Description Indicates that the processor detected an unaligned memory operand when alignment checking was enabled. Alignment checks are only carried out in data (or stack) accesses (not in code fetches or system segment[...]

  • Page 275

    Vol. 3 6-61 INTERRUPT AND EXCEPTION HANDLING • AC flag in the EFLAGS register is set. • The CPL is 3 (protected mode or virtual-8086 mode). Alignment-check exceptions (#AC) are generated only when operating at privilege level 3 (user mode). Memory references that default to privilege level 0, such as segment descriptor loads, do not gener[...]
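The enabling conditions can be combined into a single predicate. The sketch below rests on assumptions not visible in this excerpt: that the third condition is the AM flag in CR0, that AM is bit 18 of CR0 and AC is bit 18 of EFLAGS, and that the operand requires natural alignment (the manual lists per-type requirements that this simplification ignores). All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_AM    (1u << 18)  /* assumed: alignment mask flag in CR0 */
#define EFLAGS_AC (1u << 18)  /* assumed: alignment check flag in EFLAGS */

/* True when alignment checking is active for the current access. */
bool alignment_checking_enabled(uint32_t cr0, uint32_t eflags, unsigned cpl)
{
    return (cr0 & CR0_AM) != 0 && (eflags & EFLAGS_AC) != 0 && cpl == 3;
}

/* Simplified model: an access faults when checking is enabled and the
 * address is not a multiple of the operand size (size must be nonzero). */
bool access_raises_ac(uint32_t cr0, uint32_t eflags, unsigned cpl,
                      uint64_t address, uint32_t size)
{
    return alignment_checking_enabled(cr0, eflags, cpl) && (address % size) != 0;
}
```

Under this model the same misaligned doubleword access faults at CPL 3 but passes silently at CPL 0, which matches the statement that references defaulting to privilege level 0 are exempt.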

  • Page 276

    6-62 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 18—Machine-Check Exception (#MC) Exception Class Abort. Description Indicates that the processor detected an internal machine error or a bus error, or that an external agent detected a bus error. The machine-check exception is model-specific, available on the Pentium and later gen[...]

  • Page 277

    Vol. 3 6-63 INTERRUPT AND EXCEPTION HANDLING For the Pentium 4, Intel Xeon, P6 family, and Pentium processors, a program-state change always accompanies a machine-check exception, and an abort class exception is generated. For abort exceptions, information about the exception can be collected from the machine-check MSRs, but the progra[...]

  • Page 278

    6-64 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 19—SIMD Floating-Point Exception (#XM) Exception Class Fault. Description Indicates the processor has detected an SSE/SSE2/SSE3 SIMD floating-point exception. The appropriate status flag in the MXCSR register must be set and the particular exception unmasked for this interrup[...]

  • Page 279

    Vol. 3 6-65 INTERRUPT AND EXCEPTION HANDLING Note that because SIMD floating-point exceptions are precise and occur immediately, the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT instruction, or another SSE/SSE2/SSE3 instruction will catch a pending unmasked SIMD floating-point exception. In situations where a SIMD floa[...]

  • Page 280

    6-66 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the SSE/SSE2/SSE3 instruction that was executed when the SIMD floating-point exception was generated. This is the faulting instruction in which the error condition was detected. Program State Change A program-state chan[...]

  • Page 281

    Vol. 3 6-67 INTERRUPT AND EXCEPTION HANDLING Interrupts 32 to 255—User Defined Interrupts Exception Class Not applicable. Description Indicates that the processor did one of the following things: • Executed an INT n instruction where the instruction operand is one of the vector numbers from 32 through 255. • Responded to an interrupt req[...]

  • Page 282

    6-68 Vol. 3 INTERRUPT AND EXCEPTION HANDLING[...]

  • Page 283

    Vol. 3 7-1 CHAPTER 7 TASK MANAGEMENT This chapter describes the IA-32 architecture’s task management facilities. These facilities are only available when the processor is running in protected mode. This chapter focuses on 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and the 16-bit TSS structure, see Se[...]

  • Page 284

    7-2 Vol. 3 TASK MANAGEMENT 7.1.2 Task State The following items define the state of the currently executing task: • The task’s current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS). • The state of the general-purpose registers. • The state of the EFLAGS register. • The sta[...]

  • Page 285

    Vol. 3 7-3 TASK MANAGEMENT 7.1.3 Executing a Task Software or the processor can dispatch a task for execution in one of the following ways: • An explicit call to a task with the CALL instruction. • An explicit jump to a task with the JMP instruction. • An implicit call (by the processor) to an interrupt-handler task. • An implicit call t[...]

  • Page 286

    7-4 Vol. 3 TASK MANAGEMENT page tables as other privilege-level-3 tasks can access code and corrupt data and the stack of other tasks. Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software defined task executed in the context of a single IA-32 architect[...]

  • Page 287

    Vol. 3 7-5 TASK MANAGEMENT The processor updates dynamic fields when a task is suspended during a task switch. The following are dynamic fields: • General-purpose register fields — State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior to the task switch. • Segment selector fields — Segment selectors stored in the ES,[...]

  • Page 288

    7-6 Vol. 3 TASK MANAGEMENT • EIP (instruction pointer) field — State of the EIP register prior to the task switch. • Previous task link field — Contains the segment selector for the TSS of the previous task (updated on a task switch that was initiated by a call, interrupt, or exception). This field (which is sometimes called the back link[...]

  • Page 289

    Vol. 3 7-7 TASK MANAGEMENT • Task switches are carried out faster if the pages containing these structures are present in memory before the task switch is initiated. 7.2.2 TSS Descriptor The TSS, like all other segments, is defined by a segment descriptor. Figure 7-3 shows the format of a TSS descriptor. TSS descriptors may only be placed [...]

  • Page 290

    7-8 Vol. 3 TASK MANAGEMENT of a TSS. Attempting to switch to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception (#TS). A larger limit is required if an I/O permission bit map is included or if the operating system stores additional data. The processor does not check for a limit greater than 67H on a task[...]
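The 67H figure follows from the size of the 32-bit TSS: the structure occupies 104 (68H) bytes, and a segment limit is the offset of the last valid byte. The C sketch below illustrates this under stated assumptions: the field layout is reproduced from the commonly documented 32-bit TSS format rather than from this excerpt, and `struct tss32` and `tss32_limit_is_valid` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the 32-bit TSS; every field is naturally aligned. */
struct tss32 {
    uint16_t prev_task_link, reserved0;
    uint32_t esp0; uint16_t ss0, reserved1;
    uint32_t esp1; uint16_t ss1, reserved2;
    uint32_t esp2; uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4, cs, reserved5, ss, reserved6;
    uint16_t ds, reserved7, fs, reserved8, gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t debug_trap;      /* bit 0 is the T (debug trap) flag */
    uint16_t io_map_base;
};

_Static_assert(sizeof(struct tss32) == 0x68, "32-bit TSS must be 104 bytes");

/* The minimum acceptable descriptor limit is the last byte offset of the TSS. */
#define TSS32_MIN_LIMIT ((uint32_t)(sizeof(struct tss32) - 1))

bool tss32_limit_is_valid(uint32_t descriptor_limit)
{
    return descriptor_limit >= TSS32_MIN_LIMIT;
}
```

An operating system that appends an I/O permission bit map or private data to the TSS would pass a correspondingly larger limit, which this check also accepts.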

  • Page 291

    Vol. 3 7-9 TASK MANAGEMENT 7.2.4 Task Register The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (see Figure 2-5). This information is copied from the TSS descriptor in the GDT for the current task. Figur[...]

  • Page 292

    7-10 Vol. 3 TASK MANAGEMENT The LTR instruction loads a segment selector (source operand) into the task register that points to a TSS descriptor in the GDT. It then loads the invisible portion of the task register with information from the TSS descriptor. LTR is a privileged instruction that may be executed only when the CPL is 0. It’s u[...]
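Since the selector given to LTR must refer to the GDT, it can help to see how a 16-bit segment selector splits into its fields. The sketch assumes the standard selector encoding (RPL in bits 1:0, table indicator in bit 2, descriptor index in bits 15:3); `selector_fields`, `decode_selector`, and `selector_refers_to_gdt` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t index;   /* descriptor index within the table */
    bool     local;   /* table indicator: false = GDT, true = LDT */
    uint8_t  rpl;     /* requested privilege level, 0 to 3 */
} selector_fields;

/* Splits a segment selector into index, table indicator, and RPL. */
selector_fields decode_selector(uint16_t selector)
{
    selector_fields f;
    f.index = (uint16_t)(selector >> 3);
    f.local = ((selector >> 2) & 1u) != 0;
    f.rpl   = (uint8_t)(selector & 3u);
    return f;
}

/* A selector loaded into the task register must name a GDT entry. */
bool selector_refers_to_gdt(uint16_t selector)
{
    return !decode_selector(selector).local;
}
```

For example, selector 28H names GDT entry 5 with RPL 0, so it passes the GDT test, while 2FH names LDT entry 5 with RPL 3 and does not.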

  • Page 293

    Vol. 3 7-11 TASK MANAGEMENT 7.2.5 Task-Gate Descriptor A task-gate descriptor provides an indirect, protected reference to a task (see Figure 7-6). It can be placed in the GDT, an LDT, or the IDT. The TSS segment selector field in a task-gate descriptor points to a TSS descriptor in the GDT. The RPL in this segment selector is not us[...]

  • Page 294

    7-12 Vol. 3 TASK MANAGEMENT to be handled by handler tasks. When an interrupt or exception vector points to a task gate, the processor switches to the specified task. Figure 7-7 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task. 7.3 TASK SWITCHING The processor transfe[...]

  • Page 295

    Vol. 3 7-13 TASK MANAGEMENT • An interrupt or exception vector points to a task-gate descriptor in the IDT. • The current task executes an IRET when the NT flag in the EFLAGS register is set. JMP, CALL, and IRET instructions, as well as interrupts and exceptions, are all mechanisms for redirecting a program. The referencing of a TSS de[...]

  • Page 296

    7-14 Vol. 3 TASK MANAGEMENT 10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task’s TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set. 11. Loads the task register with the segment selector and desc[...]

  • Page 297

    Vol. 3 7-15 TASK MANAGEMENT rules control access to a TSS, software does not need to perform explicit privilege checks on a task switch. Table 7-1 shows the exception conditions that the processor checks for when switching tasks. It also shows the exception that is generated for each check if an error is detected and the segment that the error[...]

  • Page 298

    7-16 Vol. 3 TASK MANAGEMENT The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of the floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicates that the context of the floating-point unit[...]

  • Page 299

    Vol. 3 7-17 TASK MANAGEMENT Table 7-2 shows the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. The NT flag may be modified by software executing at any privilege level. It is possible for a program to set the NT flag and execute an IRET instruc[...]

  • Page 300

    7-18 Vol. 3 TASK MANAGEMENT 7.4.1 Use of Busy Flag To Prevent Recursive Task Switching A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag in the TSS segment descriptor is provi[...]

  • Page 301

    Vol. 3 7-19 TASK MANAGEMENT In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to ensure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared. 7.5 TASK ADDRESS SPACE The address space for a task con[...]

  • Page 302

    7-20 Vol. 3 TASK MANAGEMENT and the page tables point to different pages of physical memory, then the tasks do not share physical addresses. With either method of mapping task linear address spaces, the TSSs for all tasks must lie in a shared area of the physical space, which is accessible to all tasks. This mapping is required so that the ma[...]

  • Page 303

    Vol. 3 7-21 TASK MANAGEMENT shared LDT point to segments that are mapped to a common area of the physical address space, the data and code in those segments can be shared among the tasks that share the LDT. This method of sharing is more selective than sharing through the GDT, because the sharing can be limited to specific tasks. Other tasks[...]

  • Page 304

    7-22 Vol. 3 TASK MANAGEMENT 7.7 TASK MANAGEMENT IN 64-BIT MODE In 64-bit mode, task structure and task state are similar to those in protected mode. However, the task switching mechanism available in protected mode is not supported in 64-bit mode. Task management and switching must be performed by software. The processor issues a general-[...]

  • Page 305

    Vol. 3 7-23 TASK MANAGEMENT Although hardware task-switching is not supported in 64-bit mode, a 64-bit task state segment (TSS) must exist. Figure 7-11 shows the format of a 64-bit TSS. The TSS holds information important to 64-bit mode and that is not directly related to the task-switch mechanism. This information includes: • RSPn — The[...]

  • Page 306

    7-24 Vol. 3 TASK MANAGEMENT Figure 7-11. 64-Bit TSS Format [figure: fields at byte offsets 0 through 100, including I/O Map Base Address, RSP0 through RSP2 and IST1 (lower and upper 32 bits each), and reserved bits set to 0][...]
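The layout in Figure 7-11 can be expressed as a packed C structure. This is a sketch under assumptions: it fills in fields the truncated figure does not show (IST2 through IST7 and the reserved words) from the commonly documented 64-bit TSS format, it relies on the GCC/Clang `packed` attribute because the 64-bit fields start at offset 4, and `struct tss64` and `tss64_ist_stack` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 64-bit TSS layout; packed because RSP0 begins at byte offset 4. */
struct __attribute__((packed)) tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];        /* RSP0..RSP2: stack pointers for privilege levels 0 to 2 */
    uint64_t reserved1;
    uint64_t ist[7];        /* IST1..IST7 are stored in ist[0]..ist[6] */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t io_map_base;   /* 16-bit offset to the I/O permission bit map */
};

_Static_assert(sizeof(struct tss64) == 104, "64-bit TSS must be 104 bytes");

/* Returns the stack pointer for IST index 1 to 7, or 0 for any other index.
 * Treating 0 as "no IST stack" is a convention of this sketch. */
uint64_t tss64_ist_stack(const struct tss64 *tss, unsigned ist_index)
{
    if (ist_index < 1 || ist_index > 7) {
        return 0;
    }
    return tss->ist[ist_index - 1];
}
```

The offsets produced by this structure (RSP0 at 4, IST1 at 36, I/O map base at 102) line up with the byte offsets printed along the figure.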

  • Page 307

    Vol. 3 8-1 CHAPTER 8 MULTIPLE-PROCESSOR MANAGEMENT The Intel 64 and IA-32 architectures provide mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These include: • Bus locking and/or cache coherency management for performing atomic operations on system memory. • Serializing inst[...]

  • Page 308

    8-2 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • To distribute interrupt handling among a group of processors — When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing. • To increase system performance by[...]

  • Page 309

    Vol. 3 8-3 MULTIPLE-PROCESSOR MANAGEMENT software to manage the fairness of semaphores and exclusive locking functions. The mechanisms for handling locked atomic operations have evolved with the complexity of IA-32 processors. More recent IA-32 processors (such as the Pentium 4, Intel Xeon, and P6 family processors) and Intel 64 provide a[...]

  • Page 310

    8-4 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT the hardware designer to make the LOCK# signal available in system hardware to control memory accesses among processors. For the P6 and more recent processor families, if the memory area being accessed is cached internally in the processor, the LOCK# signal is generally not asserted; instead, locking is [...]

  • Page 311

    Vol. 3 8-5 MULTIPLE-PROCESSOR MANAGEMENT 8.1.2.2 Software Controlled Bus Locking To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location. An invalid-opcode exception (#UD) is generated when the LOCK prefix is used with any other instruction or w[...]
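In portable code, locked read-modify-write operations are usually reached through a language's atomic library rather than by writing the LOCK prefix directly. The C11 sketch below is an illustration only: on x86 targets compilers commonly lower `atomic_exchange` to XCHG and `atomic_fetch_add` to a LOCK-prefixed XADD, but that lowering is a compiler choice assumed here, not something stated in this manual. The `spinlock` type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock built on an atomic exchange. */
typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
} spinlock;

/* Atomically writes 1 and reports whether the lock was previously free. */
bool spinlock_try_acquire(spinlock *lock)
{
    return atomic_exchange(&lock->locked, 1) == 0;
}

void spinlock_release(spinlock *lock)
{
    atomic_store(&lock->locked, 0);
}

/* Atomic read-modify-write increment; returns the value before the add. */
int counter_fetch_increment(atomic_int *counter)
{
    return atomic_fetch_add(counter, 1);
}
```

The default sequentially consistent ordering is used throughout, which is the most conservative choice; weaker orderings are possible but need a more careful argument.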

  • Page 312

    8-6 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT ence weakly ordered memory types (such as the WC memory type) may not be serialized. Locked instructions should not be used to ensure that data written can be fetched as instructions. NOTE The locked instructions for the current versions of the Pentium 4, Intel Xeon, P6 family, Pentium, and Intel[...]

  • Page 313

    Vol. 3 8-7 MULTIPLE-PROCESSOR MANAGEMENT The act of one processor writing data into the currently executing code segment of a second processor with the intent of having the second processor execute that data as code is called cross-modifying code. As with self-modifying code, IA-32 processors exhibit model-specific behavior when executing cro[...]

  • Page 314

    8-8 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT have cached the same area of memory from simultaneously modifying data in that area. 8.2 MEMORY ORDERING The term memory ordering refers to the order in which the processor issues reads (loads) and writes (stores) through the system bus to system memory. The Intel 64 and IA-32 architectures support severa[...]

  • Page 315

    Vol. 3 8-9 MULTIPLE-PROCESSOR MANAGEMENT among processors are explicitly required to obey program ordering through the use of appropriate locking or serializing operations (see Section 8.2.5, “Strengthening or Weakening the Memory-Ordering Model”). 8.2.2 Memory Ordering in P6 and More Recent Processor Families The Intel Core 2 Duo,[...]

  • Page 316

    8-10 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • Locked instructions have a total order. See the example in Figure 8-1. Consider three processors in a system and each processor performs three writes, one to each of three defined locations (A, B, and C). Individually, the processors perform the writes in the same program order, but because of bus [...]

  • Page 317

    Vol. 3 8-11 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3 Examples Illustrating the Memory-Ordering Principles This section provides a set of examples that illustrate the behavior of the memory-ordering principles introduced in Section 8.2.2. They are designed to give software writers an understanding of how memory ordering may affect the results of d[...]

  • Page 318

    8-12 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Section 8.2.3.2 through Section 8.2.3.7 give examples using the MOV instruction. The principles that underlie these examples apply to load and store accesses in general and to other instructions that load from or store to memory. Section 8.2.3.8 and Section 8.2.3.9 give examples using the XCHG instruction[...]

  • Page 319

    Vol. 3 8-13 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3.3 Stores Are Not Reordered With Earlier Loads The Intel-64 memory-ordering model ensures that a store by a processor may not occur before a previous load by the same processor. This is illustrated by the following example: Assume r1 == 1. • Because r1 == 1, processor 1’s store to x occurs b[...]

  • Page 320

    8-14 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT has the two loads occurring before the two stores. This would result in each load returning value 0. The fact that a load may not be reordered with an earlier store to the same location is illustrated by the following example: The Intel-64 memory-ordering model does not allow the load to be reordered with[...]
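The outcome described at the top of this page, where both loads return 0, can be explored with a small enumeration. The sketch below assumes a two-processor program of the usual shape (processor 0 stores 1 to x and then loads y into r1; processor 1 stores 1 to y and then loads x into r2, with x and y initially 0). It is a simplified model, not a description of the hardware: it treats a load passing an earlier store as the two operations swapping places in one global order. All names are hypothetical.

```c
#include <stdbool.h>

/* Counts the set bits in a small mask. */
static int count_bits(unsigned value)
{
    int n = 0;
    while (value != 0) {
        n += (int)(value & 1u);
        value >>= 1;
    }
    return n;
}

/* Returns true if some execution ends with r1 == 0 and r2 == 0.
 * When allow_load_to_pass_store is false, each processor runs in
 * program order (store, then load); when true, each processor may
 * also perform its load before its store to the other location. */
bool both_loads_can_return_zero(bool allow_load_to_pass_store)
{
    int variants = allow_load_to_pass_store ? 2 : 1;
    for (int p0_load_first = 0; p0_load_first < variants; p0_load_first++) {
        for (int p1_load_first = 0; p1_load_first < variants; p1_load_first++) {
            /* Each 4-bit mask with two bits set picks the slots used by processor 0. */
            for (unsigned mask = 0; mask < 16; mask++) {
                if (count_bits(mask) != 2) {
                    continue;
                }
                int x = 0, y = 0, r1 = -1, r2 = -1;
                int step0 = 0, step1 = 0;
                for (int slot = 0; slot < 4; slot++) {
                    if (((mask >> slot) & 1u) != 0) {
                        bool is_store = (step0 == 0) != (p0_load_first != 0);
                        if (is_store) { x = 1; } else { r1 = y; }
                        step0++;
                    } else {
                        bool is_store = (step1 == 0) != (p1_load_first != 0);
                        if (is_store) { y = 1; } else { r2 = x; }
                        step1++;
                    }
                }
                if (r1 == 0 && r2 == 0) {
                    return true;
                }
            }
        }
    }
    return false;
}
```

In this model the outcome is unreachable when every processor keeps program order and becomes reachable once loads may move ahead of earlier stores to a different location, which is the distinction the example draws.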

  • Page 321

    Vol. 3 8-15 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3.6 Stores Are Transitively Visible The memory-ordering model ensures transitive visibility of stores; stores that are causally related appear to all processors to occur in an order consistent with the causality relation. This is illustrated by the following example: Assume that r1 == 1 and r2 =[...]

  • Page 322

    8-16 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT By the principles discussed in Section 8.2.3.2, • processor 2’s first and second load cannot be reordered, • processor 3’s first and second load cannot be reordered. • If r1 == 1 and r2 == 0, processor 0’s store appears to precede processor 1’s store with respect to processor 2. • Simil[...]

  • Page 323

    Vol. 3 8-17 MULTIPLE-PROCESSOR MANAGEMENT reader should note that reordering is prevented also if the locked instruction is executed after a load or a store. The first example illustrates that loads may not be reordered with earlier locked instructions: As explained in Section 8.2.3.8, there is a total order of the executions of locked instructi[...]

  • Page 324

    8-18 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.2.4 Out-of-Order Stores For String Operations The Intel Core 2 Duo, Intel Core, Pentium 4, and P6 family processors modify the processor’s operation during the string store operations (initiated with the MOVS and STOS instructions) to maximize performance. Once the “fast string” operations initia[...]

  • Page 325

    Vol. 3 8-19 MULTIPLE-PROCESSOR MANAGEMENT 2. Stores from separate string operations (for example, stores from consecutive string operations) do not execute out of order. All the stores from an earlier string operation will complete before any store from a later string operation. 3. String operations are not reordered with other store opera[...]

  • Page 326

    8-20 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT It is possible for processor 1 to perceive that the repeated string stores in processor 0 are happening out of order. We assume that fast string operations are enabled on processor 0. In Example 8-12, processor 0 does two separate rounds of rep stosd operation of 128 doubleword stores, writing the value[...]

  • Page 327

    Vol. 3 8-21 MULTIPLE-PROCESSOR MANAGEMENT Processor 1 performs two read operations; the first read is from an address outside the 512-byte block but to be updated by processor 0, the second read is from inside the block of memory of string operation. Processor 1 cannot perceive the later store by processor 0 until it sees all the stores from th[...]

  • Page 328

    8-22 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.2.5 Strengthening or Weakening the Memory-Ordering Model The Intel 64 and IA-32 architectures provide several mechanisms for strengthening or weakening the memory-ordering model to handle special programming situations. These mechanisms include: • The I/O instructions, locking instructions, the [...]

  • Page 329

    Vol. 3 8-23 MULTIPLE-PROCESSOR MANAGEMENT as the XCHG instruction or the LOCK prefix to ensure that a read-modify-write operation on memory is carried out atomically. Locking operations typically operate like I/O operations in that they wait for all previous instructions to complete and for all buffered writes to drain to memory (see Section[...]

  • Page 330

    8-24 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT The PAT was introduced in the Pentium III processor to enhance the caching characteristics that can be assigned to pages or groups of pages. The PAT mechanism is typically used to strengthen caching characteristics at the page level with respect to the caching characteristics established by the MTRR[...]

  • Page 331

    Vol. 3 8-25 MULTIPLE-PROCESSOR MANAGEMENT • Non-privileged serializing instructions — CPUID, IRET, and RSM. When the processor serializes instruction execution, it ensures that all pending memory transactions are completed (including writes stored in its store buffer) before it executes the next instruction. Nothing can pass a serializing in[...]

  • Page 332

    8-26 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT execution is not deterministically serialized when a branch instruction is executed. 8.4 MULTIPLE-PROCESSOR (MP) INITIALIZATION The IA-32 architecture (beginning with the P6 family processors) defines a multiple-processor (MP) initialization protocol called the Multiprocessor Specification Version 1.4 [...]

  • Page 333

    Vol. 3 8-27 MULTIPLE-PROCESSOR MANAGEMENT 8.4.1 BSP and AP Processors The MP initialization protocol defines two classes of processors: the bootstrap processor (BSP) and the application processors (APs). Following a power-up or RESET of an MP system, system hardware dynamically selects one of the processors on the system bus as the BSP. The re[...]

  • Page 334

    8-28 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.4.3 MP Initialization Protocol Algorithm for Intel Xeon Processors Following a power-up or RESET of an MP system, the processors in the system execute the MP initialization protocol algorithm to initialize each of the logical processors on the system bus or coherent link domain. In the course of[...]

  • Page 335

    Vol. 3 8-29 MULTIPLE-PROCESSOR MANAGEMENT • The newly established BSP broadcasts an FIPI message to “all including self,” which the BSP and APs treat as an end of MP initialization signal. Only the processor with its BSP flag set responds to the FIPI message. It responds by fetching and executing the BIOS boot-strap code, beginning at the[...]

  • Page 336

    8-30 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT SVR EQU 0FEE000F0H APIC_ID EQU 0FEE00020H LVT3 EQU 0FEE00370H APIC_ENABLED EQU 0100H BOOT_ID DD ? COUNT EQU 00H VACANT EQU 00H 8.4.4.1 Typical BSP Initialization Sequence After the BSP and APs have been selected (by means of a hardware protocol, see Section 8.4.3, “MP Initialization Protocol Algorith[...]

  • Page 337

    Vol. 3 8-31 MULTIPLE-PROCESSOR MANAGEMENT mode address space (1-MByte space). For example, a vector of 0BDH specifies a start-up memory address of 000BD000H. 11. Enables the local APIC by setting bit 8 of the APIC spurious vector register (SVR). MOV ESI, SVR; Address of SVR MOV EAX, [ESI]; OR EAX, APIC_ENABLED; Set bit 8 to enable (0 on reset) [...]
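
The vector-to-address example on page 8-31 (vector 0BDH selecting start-up address 000BD000H) amounts to multiplying the vector by 4 KBytes. A minimal C sketch of that arithmetic follows; the function names are hypothetical and are not part of the manual's listing.

```c
#include <stdint.h>

/* Start-up address selected by an 8-bit SIPI vector: vector * 4 KBytes.
 * Function names here are illustrative only. */
static uint32_t sipi_start_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}

/* Inverse mapping: returns the vector for a 4-KByte-aligned address inside
 * the 1-MByte real-mode space, or -1 if no vector can express the address. */
static int sipi_vector_for(uint32_t address)
{
    if (address >= 0x100000u || (address & 0xFFFu) != 0)
        return -1;
    return (int)(address >> 12);
}
```

The inverse helper makes the alignment constraint explicit: only 4-KByte-aligned addresses below 1 MByte can be reached this way.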

  • Page 338

    8-32 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT MOV EAX, 000C46XXH; Load ICR encoding for broadcast SIPI IPI ; to all APs into EAX, where xx is the vector computed in step 8. 16. Waits for the timer interrupt. 17. Reads and evaluates the COUNT variable and establishes a processor count. 18. If necessary, reconfigures the APIC and continues with th[...]

  • Page 339

    Vol. 3 8-33 MULTIPLE-PROCESSOR MANAGEMENT 8.4.5 Identifying Logical Processors in an MP System After the BIOS has completed the MP initialization protocol, each logical processor can be uniquely identified by its local APIC ID. Software can access these APIC IDs in either of the following ways: • Read APIC ID for a local APIC: Code running[...]

  • Page 340

    8-34 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT during power-up and initialization is 8 bits. Bits 2:1 form a 2-bit physical package identifier (which can also be thought of as a socket identifier). In systems that configure physical processors in clusters, bits 4:3 form a 2-bit cluster ID. Bit 0 is used in the Intel Xeon processor MP to identify th[...]
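
The bit layout described on page 8-34 can be sketched as a small decoder. This is an illustration only: the struct and function names are hypothetical, and the role of bit 0 (taken here to distinguish the logical processors within a package) is an assumption, because the sentence describing it is truncated in this excerpt.

```c
#include <stdint.h>

/* Fields of the 8-bit initial APIC ID in the Intel Xeon processor MP layout. */
struct apic_id_fields {
    unsigned cluster;  /* bits 4:3 */
    unsigned package;  /* bits 2:1 */
    unsigned logical;  /* bit 0 (assumed: logical processor within the package) */
};

static struct apic_id_fields decode_initial_apic_id(uint8_t id)
{
    struct apic_id_fields f;
    f.cluster = (id >> 3) & 0x3u;
    f.package = (id >> 1) & 0x3u;
    f.logical = id & 0x1u;
    return f;
}
```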

  • Page 341

    Vol. 3 8-35 MULTIPLE-PROCESSOR MANAGEMENT 8.5 INTEL® HYPER-THREADING TECHNOLOGY AND INTEL® MULTI-CORE TECHNOLOGY Intel Hyper-Threading Technology and Intel multi-core technology are extensions to Intel 64 and IA-32 architectures that enable a single physical processor to execute two or more separate code streams (called threads) concu[...]

  • Page 342

    8-36 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT number of addressable IDs attributable to processor cores (Y) in the physical package. • Extended Processor Topology Enumeration parameters for 32-bit APIC ID: Intel 64 processors supporting CPUID leaf 0BH will assign unique APIC IDs to each logical processor in the system. CPUID leaf 0BH reports the [...]

  • Page 343

    Vol. 3 8-37 MULTIPLE-PROCESSOR MANAGEMENT During initialization, each logical processor is assigned an APIC ID that is stored in the local APIC ID register for each logical processor. If two or more processors supporting Intel Hyper-Threading Technology are present, each logical processor on the system bus is assigned a unique ID (see Section[...]

  • Page 344

    8-38 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.7 INTEL® HYPER-THREADING TECHNOLOGY ARCHITECTURE Figure 8-4 shows a generalized view of an Intel processor supporting Intel Hyper-Threading Technology, using the original Intel Xeon processor MP as an example. This implementation of the Intel Hyper-Threading Technology consists of two logica[...]

  • Page 345

    Vol. 3 8-39 MULTIPLE-PROCESSOR MANAGEMENT 8.7.1 State of the Logical Processors The following features are part of the architectural state of logical processors within Intel 64 or IA-32 processors supporting Intel Hyper-Threading Technology. The features can be subdivided into three groups: • Duplicated for each logical processor • Sh[...]

  • Page 346

    8-40 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • Debug registers (DR0, DR1, DR2, DR3, DR6, DR7) and the debug control MSRs • Machine check global status (IA32_MCG_STATUS) and machine check capability (IA32_MCG_CAP) MSRs • Thermal clock modulation and ACPI Power management control MSRs • Time stamp counter MSRs • Most of the other MSR regis[...]

  • Page 347

    Vol. 3 8-41 MULTIPLE-PROCESSOR MANAGEMENT gives software a consistent view of memory, independent of the processor on which it is running. See Section 11.11, “Memory Type Range Registers (MTRRs),” for information on setting up MTRRs. 8.7.4 Page Attribute Table (PAT) Each logical processor has its own PAT MSR (IA32_PAT). Howev[...]

  • Page 348

    8-42 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.7.7 Performance Monitoring Counters Performance counters and their companion control MSRs are shared between the logical processors within a processor core for processors based on Intel NetBurst microarchitecture. As a result, software must manage the use of these resources. The performance counter i[...]

  • Page 349

    Vol. 3 8-43 MULTIPLE-PROCESSOR MANAGEMENT 8.7.11 MICROCODE UPDATE Resources In an Intel processor supporting Intel Hyper-Threading Technology, the microcode update facilities are shared between the logical processors; either logical processor can initiate an update. Each logical processor has its own BIOS signature MSR (IA32_BIOS_SIGN_ID a[...]

  • Page 350

    8-44 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT As a consequence, the use of the WBINVD instruction can have an impact on interrupt/event response time. • INVD instruction: The entire cache hierarchy is invalidated without writing back modified data to memory. All logical processors are stopped from executing until after the invalidate operatio[...]

  • Page 351

    Vol. 3 8-45 MULTIPLE-PROCESSOR MANAGEMENT disabled on a logical processor basis. Typically, if software controlled clock modulation is going to be used, the feature must be enabled for all the logical processors within a physical processor and the modulation duty cycle must be set to the same value for each logical processor. If the duty [...]

  • Page 352

    8-46 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.8 MULTI-CORE ARCHITECTURE This section describes the architecture of Intel 64 and IA-32 processors supporting dual-core and quad-core technology. The discussion is applicable to the Intel Pentium processor Extreme Edition, Pentium D, Intel Core Duo, Intel Core 2 Duo, Dual-cor[...]

  • Page 353

    Vol. 3 8-47 MULTIPLE-PROCESSOR MANAGEMENT 8.8.3 Performance Monitoring Counters Performance counters and their companion control MSRs are shared between two logical processors sharing a processor core if the processor core supports Intel Hyper-Threading Technology and is based on Intel NetBurst microarchitecture. They are not shared between[...]

  • Page 354

    8-48 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT provided for each logical processor (see Section 8.7, “Intel® Hyper-Threading Technology Architecture,” and Section 8.8, “Multi-Core Architecture”). From a software programming perspective, control transfer of processor operation is managed at the granularity of logical processor (oper [...]

  • Page 355

    Vol. 3 8-49 MULTIPLE-PROCESSOR MANAGEMENT If the processor supports CPUID leaf 0BH, the 32-bit APIC ID can represent cluster plus several levels of topology within the physical processor package. The exact number of hierarchical levels within a physical processor package must be enumerated through CPUID leaf 0BH. Common processor families may[...]

  • Page 356

    8-50 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.9.2 Hierarchical Mapping of CPUID Extended Topology Leaf CPUID leaf 0BH provides enumeration parameters for software to identify each hierarchy of the processor topology in a deterministic manner. Each hierarchical level of the topology starting from the SMT level is represented numerically by a [...]

  • Page 357

    Vol. 3 8-51 MULTIPLE-PROCESSOR MANAGEMENT For m = 0, m < N, m ++; { cumulative_width[m] = CPUID.(EAX=0BH, ECX=m):EAX[4:0]; } BitWidth[0] = cumulative_width[0]; For m = 1, m < N, m ++; BitWidth[m] = cumulative_width[m] - cumulative_width[m-1]; Currently, only the following encodings of hierarchical level type are defined: 0 (invalid), [...]
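
The pseudocode on page 8-51 derives each level's bit width by differencing the cumulative widths reported by CPUID leaf 0BH. The C sketch below shows only that differencing step. It assumes the caller has already collected the EAX[4:0] values into an array, and the function name is hypothetical.

```c
#include <stddef.h>

/* cumulative[m] stands for CPUID.(EAX=0BH, ECX=m):EAX[4:0], gathered by the
 * caller. width[m] receives the number of APIC ID bits used by level m alone. */
static void level_bit_widths(const unsigned *cumulative, unsigned *width, size_t n)
{
    if (n == 0)
        return;
    width[0] = cumulative[0];
    for (size_t m = 1; m < n; m++)
        width[m] = cumulative[m] - cumulative[m - 1];
}
```

For example, cumulative widths of 1 (SMT) and 4 (core) would give per-level widths of 1 and 3 bits.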

  • Page 358

    8-52 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Table 8-2 shows the initial APIC IDs for a hypothetical situation with a dual processor system. Each physical package provides two processor cores, and each processor core also supports Intel Hyper-Threading Technology. Figure 8-7. Topological Relationships between Hierarchical IDs in a Hypotheti[...]

  • Page 359

    Vol. 3 8-53 MULTIPLE-PROCESSOR MANAGEMENT 8.9.3.1 Hierarchical ID of Logical Processors with x2APIC ID Table 8-3 shows an example of possible x2APIC ID assignments for a dual processor system that supports x2APIC. Each physical package provides four processor cores, and each processor core also supports Intel Hyper-Threading Technology[...]

  • Page 360

    8-54 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.9.4 Algorithm for Three-Level Mappings of APIC_ID Software can gather the initial APIC_IDs for each logical processor supported by the operating system at runtime and extract identifiers corresponding to the three levels of sharing topology (package, core, and SMT). The three-level algorithms below[...]

  • Page 361

    Vol. 3 8-55 MULTIPLE-PROCESSOR MANAGEMENT a. Query the right-shift value for the SMT level of the topology using CPUID leaf 0BH with ECX=0H as input. The number of bits to shift right on the x2APIC ID (EAX[4:0]) can distinguish different higher-level entities above SMT (e.g. processor cores) in the same physical package. This is also the width of t[...]

  • Page 362

    8-56 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Example 8-18. Support Routines for Detecting Hardware Multi-Threading and Identifying the Relationships Between Package, Core and Logical Processors 1. Detect support for Hardware Multi-Threading Support in a processor. // Returns a non-zero value if CPUID reports the presence of hardware mult[...]

  • Page 363

    Vol. 3 8-57 MULTIPLE-PROCESSOR MANAGEMENT int DeriveCore_Mask_Offsets (void) { if (!HWMTSupported()) return -1; execute cpuid with eax = 11, ECX = 0; while( ECX[15:8] ) { // level type encoding is valid If (returned level type encoding in ECX[15:8] matches CORE) { Mask_Core_shift = EAX[4:0]; // needed to distinguish different physical package[...]

  • Page 364

    8-58 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT unsigned char MaxLPIDsPerPackage(void) { if (!HWMTSupported()) return 1; execute cpuid with eax = 1 store returned value of ebx return (unsigned char) ((reg_ebx & NUM_LOGICAL_BITS) >> 16); } b. Find the size of address space for processor cores in a physical processor package. // Returns t[...]
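
The field extraction in MaxLPIDsPerPackage() can be shown on its own, separated from the CPUID call. In this sketch the EBX value from CPUID leaf 1 is passed in as a parameter, and the result of HWMTSupported() is passed as a flag. The value of NUM_LOGICAL_BITS is an assumption (a mask for EBX[23:16], consistent with the 16-bit right shift), since its definition is not visible in this excerpt.

```c
#include <stdint.h>

/* Assumed mask for EBX[23:16]; the excerpt does not show the definition. */
#define NUM_LOGICAL_BITS 0x00FF0000u

/* reg_ebx: EBX as returned by CPUID with EAX = 1 (supplied by the caller).
 * hwmt_supported: non-zero if hardware multi-threading is reported. */
static unsigned char max_lp_ids_per_package(uint32_t reg_ebx, int hwmt_supported)
{
    if (!hwmt_supported)
        return 1;
    return (unsigned char)((reg_ebx & NUM_LOGICAL_BITS) >> 16);
}
```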

  • Page 365

    Vol. 3 8-59 MULTIPLE-PROCESSOR MANAGEMENT // Returns the mask bit width of a bit field from the maximum count that bit field can represent. // This algorithm does not assume ‘address size’ to have a value equal to power of 2. // Address size for SMT_ID can be calculated from MaxLPIDsPerPackage()/MaxCoreIDsPerPackage() // Then use the r[...]
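
The comments on page 8-59 describe deriving a bit-field width from the maximum count the field must represent, without assuming the count is a power of 2. The manual's own routine is truncated in this excerpt, so the following is a substitute sketch of the same idea in portable C; the function name is hypothetical.

```c
/* Returns the number of bits needed to hold IDs 0 .. max_count-1.
 * max_count need not be a power of 2; a count of 0 or 1 needs no bits. */
static unsigned find_mask_width(unsigned max_count)
{
    unsigned width = 0;
    if (max_count == 0)
        return 0;
    max_count--;                 /* largest ID value the field must hold */
    while (max_count != 0) {
        width++;
        max_count >>= 1;
    }
    return width;
}
```

A count of 6, for instance, needs 3 bits because the largest ID is 5 (101B).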

  • Page 366

    8-60 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Software must not assume local APIC_ID values in an MP system are consecutive. Non-consecutive local APIC_IDs may be the result of hardware configurations or debug features implemented in the BIOS or OS. An identifier for each hierarchical level can be extracted from an 8-bit APIC_ID using the support r[...]
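
Extracting per-level identifiers from an 8-bit APIC_ID reduces to masking and shifting once the SMT and core field widths are known. The sketch below assumes those widths are supplied by the caller and that they sum to at most 8. The names are hypothetical, and unlike masks that leave the value in place, this version shifts each field down to a small integer.

```c
#include <stdint.h>

struct topo_ids {
    uint8_t smt;      /* logical processor within a core */
    uint8_t core;     /* core within a package */
    uint8_t package;  /* remaining upper bits */
};

/* smt_width and core_width are the bit widths of the two lowest fields. */
static struct topo_ids split_apic_id(uint8_t apic_id, unsigned smt_width,
                                     unsigned core_width)
{
    struct topo_ids t;
    unsigned id = apic_id;
    t.smt = (uint8_t)(id & ((1u << smt_width) - 1u));
    t.core = (uint8_t)((id >> smt_width) & ((1u << core_width) - 1u));
    t.package = (uint8_t)(id >> (smt_width + core_width));
    return t;
}
```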

  • Page 367

    Vol. 3 8-61 MULTIPLE-PROCESSOR MANAGEMENT example also depicts a technique to construct a mask to represent the logical processors that reside in the same core. In Example 8-21, the numerical ID value can be obtained from the value extracted with the mask by shifting it right by shift count. Algorithms below do not shift the value. The assumptio[...]

  • Page 368

    8-62 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT using OS specific APIs. // Allocate per processor arrays to store the Package_ID, Core_ID and SMT_ID for every started // processor. ThreadAffinityMask = 1; ProcessorNum = 0; while (ThreadAffinityMask != 0 && ThreadAffinityMask <= SystemAffinity) { // Check to make sure we can utilize thi[...]

  • Page 369

    Vol. 3 8-63 MULTIPLE-PROCESSOR MANAGEMENT PackageProcessorMask[0] = ProcessorMask; For (ProcessorNum = 1; ProcessorNum < NumStartedLPs; ProcessorNum++) { ProcessorMask <<= 1; For (i=0; i < PackageNum; i++) { // we may be comparing bit-fields of logical processors residing in different // packages, the code below assume package s[...]

  • Page 370

    8-64 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT } if (i == CoreNum) { //Did not match any bucket, start new bucket CoreIDBucket[i] = PackageID[ProcessorNum] | CoreID[ProcessorNum]; CoreProcessorMask[i] = ProcessorMask; CoreNum++; } } // CoreNum has the number of cores started in the OS // CoreProcessorMask[] array has the processor set of each c[...]

  • Page 371

    Vol. 3 8-65 MULTIPLE-PROCESSOR MANAGEMENT 8.10.2 PAUSE Instruction The PAUSE instruction can improve the performance of processors supporting Intel Hyper-Threading Technology when executing “spin-wait loops” and other routines where one thread is accessing a shared lock or semaphore in a tight polling loop. When executing a spin-wait l[...]

  • Page 372

    8-66 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.10.4 MONITOR/MWAIT Instruction Operating systems usually implement idle loops to handle thread synchronization. In a typical idle-loop scenario, there could be several “busy loops” and they would use a set of memory locations. An impacted processor waits in a loop and polls a memory location to d[...]

  • Page 373

    Vol. 3 8-67 MULTIPLE-PROCESSOR MANAGEMENT Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared. Software should not allow for voluntary context switches in between MONITOR/MW[...]

  • Page 374

    8-68 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT the two parameters should default to be the same (the size of the monitor triggering area is the same as the system coherence line size). Based on the monitor line sizes returned by the CPUID, the OS should dynamically allocate structures with appropriate padding. If static data structures must be used by[...]

  • Page 375

    Vol. 3 8-69 MULTIPLE-PROCESSOR MANAGEMENT JE Get_Lock PAUSE ;Short delay JMP Spin_Lock Get_Lock: MOV EAX, 1 XCHG EAX, lockvar ;Try to get lock CMP EAX, 0 ;Test if successful JNE Spin_Lock Critical_Section: <critical section code> MOV lockvar, 0 ... Continue: The spin-wait loop above uses a “test, test-and-set” technique for determ[...]
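
The “test, test-and-set” spin-wait loop on page 8-69 can be approximated in C11 for readers who do not work in assembly. This is a sketch under stated assumptions, not the manual's code: the type and function names are hypothetical, `__builtin_ia32_pause()` is the GCC/Clang builtin that emits PAUSE on x86, and the atomic exchange stands in for XCHG.

```c
#include <stdatomic.h>

#if defined(__i386__) || defined(__x86_64__)
#define cpu_relax() __builtin_ia32_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)                /* no-op on other targets */
#endif

typedef struct { atomic_int lockvar; } tts_lock;   /* 0 = free, 1 = held */

/* One test-and-set attempt (the XCHG step); returns non-zero on success. */
static int tts_try(tts_lock *l)
{
    return atomic_exchange_explicit(&l->lockvar, 1, memory_order_acquire) == 0;
}

static void tts_acquire(tts_lock *l)
{
    for (;;) {
        /* Test: spin on a plain read, pausing between polls. */
        while (atomic_load_explicit(&l->lockvar, memory_order_relaxed) != 0)
            cpu_relax();
        /* Test-and-set: attempt the exchange only when the lock looked free. */
        if (tts_try(l))
            return;
    }
}

static void tts_release(tts_lock *l)
{
    atomic_store_explicit(&l->lockvar, 0, memory_order_release);
}
```

Spinning on the read rather than on the exchange keeps the waiting processor from repeatedly issuing locked bus operations while the lock is held.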

  • Page 376

    8-70 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT // C1 handler uses a Halt instruction VOID C1Handler() { STI HLT } The MONITOR and MWAIT instructions may be considered for use in the C0 idle state loops, if MONITOR and MWAIT are supported. Example 8-25. An OS Idle Loop with MONITOR/MWAIT in the C0 Idle Loop // WorkQueue is a memory location ind[...]

  • Page 377

    Vol. 3 8-71 MULTIPLE-PROCESSOR MANAGEMENT } 8.10.6.3 Halt Idle Logical Processors If one of two logical processors is idle or in a spin-wait loop of long duration, explicitly halt that processor by means of a HLT instruction. In an MP system, operating systems can place idle processors into a loop that continuously checks the run queue for [...]

  • Page 378

    8-72 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT { MONITOR WorkQueue // Setup of eax with WorkQueue LinearAddress, // ECX, EDX = 0 IF (WorkQueue != 0) THEN { STI MWAIT // EAX, ECX = 0 } } 8.10.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources Because the logical processors, the order in which threads are disp[...]

  • Page 379

    Vol. 3 8-73 MULTIPLE-PROCESSOR MANAGEMENT • A high resolution timer within the processor (such as the local APIC timer or the time-stamp counter). For additional information, see the Intel® 64 and IA-32 Architectures Optimization Reference Manual. 8.10.6.7 Place Locks and Semaphores in Aligned, 128-Byte Blocks of Memory When software uses [...]

  • Page 380

    8-74 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT[...]

  • Page 381

    Vol. 3 9-1 CHAPTER 9 PROCESSOR MANAGEMENT AND INITIALIZATION This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, x87 FPU initialization, processor configuration, feature determination, mode switching, the MSRs (in th[...]

  • Page 382

    9-2 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic. At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enable those processors to execute self-configur[...]

  • Page 383

    Vol. 3 9-3 PROCESSOR MANAGEMENT AND INITIALIZATION CR2, CR3, CR4 00000000H 00000000H 00000000H CS Selector = F000H Base = FFFF0000H Limit = FFFFH AR = Present, R/W, Accessed Selector = F000H Base = FFFF0000H Limit = FFFFH AR = Present, R/W, Accessed Selector = F000H Base = FFFF0000H Limit = FFFFH AR = Present, R/W, Accessed SS, DS, ES, FS, GS[...]

  • Page 384

    9-4 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION LDTR, Task Register Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W DR0, DR1, DR2, DR3 00000000H 00000000H 00000000H DR6 FFFF0FF0H FFFF0FF0H [...]

  • Page 385

    Vol. 3 9-5 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.3 Model and Stepping Information Following a hardware reset, the EDX register contains component identification and revision information (see Figure 9-2). For example, the model, family, and processor type returned for the first processor in the Intel Pentium 4 family is as follows: mode[...]

  • Page 386

    9-6 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.4 First Instruction Executed The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing the software-initialization code must be[...]

  • Page 387

    Vol. 3 9-7 PROCESSOR MANAGEMENT AND INITIALIZATION The EM flag determines whether floating-point instructions are executed by the x87 FPU (EM is cleared) or a device-not-available exception (#NM) is generated for all floating-point instructions so that an exception handler can emulate the floating-point operation (EM = 1). Ordinarily, the EM f[...]

  • Page 388

    9-8 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION • It allows x87 FPU code to run on an IA-32 processor that neither has an integrated x87 FPU nor is connected to an external math coprocessor, by using a floating-point emulator. • It allows floating-point code to be executed using a special or nonstandard floating-point emulator, select[...]

  • Page 389

    Vol. 3 9-9 PROCESSOR MANAGEMENT AND INITIALIZATION 9.4 MODEL-SPECIFIC REGISTERS (MSRS) Most IA-32 processors (starting from Pentium processors) and Intel 64 processors contain model-specific registers (MSRs). A given MSR may not be supported across all families and models for Intel 64 and IA-32 processors. Some MSRs are designated as architect[...]

  • Page 390

    9-10 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION all the MTRRs must be cleared to 0, which selects the uncached (UC) memory type. See Section 11.11, “Memory Type Range Registers (MTRRs),” for detailed information on the MTRRs. 9.6 INITIALIZING SSE/SSE2/SSE3/SSSE3 EXTENSIONS For processors that contain SSE/SSE2/SSE3/SSSE3 extensi[...]

  • Page 391

    Vol. 3 9-11 PROCESSOR MANAGEMENT AND INITIALIZATION mode. The protected-mode data structures that must be loaded are described in Section 9.8, “Software Initialization for Protected-Mode Operation.” 9.7.1 Real-Address Mode IDT In real-address mode, the only system data structure that must be loaded into memory is the IDT (also called the[...]

  • Page 392

    9-12 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION modules into memory to support reliable operation of the processor in protected mode. These data structures include the following: • An IDT. • A GDT. • A TSS. • (Optional) An LDT. • If paging is to be used, at least one page directory and one page table. • A code segment that co[...]

  • Page 393

    Vol. 3 9-13 PROCESSOR MANAGEMENT AND INITIALIZATION descriptors in the GDT. Some operating systems allocate new segments and LDTs as they are needed. This provides maximum flexibility for handling a dynamic programming environment. However, many operating systems use a single LDT for all tasks, allocating GDT entries in advance. An embedded[...]

  • Page 394

    9-14 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.4 Initializing Multitasking If the multitasking mechanism is not going to be used and changes between privilege levels are not allowed, it is not necessary to load a TSS into memory or to initialize the task register. If the multitasking mechanism is going to be used and/or changes between [...]

  • Page 395

    Vol. 3 9-15 PROCESSOR MANAGEMENT AND INITIALIZATION following instructions must be located in an identity-mapped page (until such time that a branch to non-identity mapped pages can be effected). 64-bit mode paging tables must be located in the first 4 GBytes of physical-address space prior to activating IA-32e mode. This is necessary because th[...]

  • Page 396

    9-16 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.5.3 64-bit Mode and Compatibility Mode Operation IA-32e mode uses two code segment-descriptor bits (CS.L and CS.D, see Figure 3-8) to control the operating modes after IA-32e mode is initialized. If CS.L = 1 and CS.D = 0, the processor is running in 64-bit mode. With this encoding, the[...]
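
The CS.L/CS.D selection can be written as a small decision function. Only the CS.L = 1, CS.D = 0 case is stated in the visible text on page 9-16. The remaining cases in this sketch (CS.L = 0 selecting compatibility mode with CS.D choosing a 32-bit or 16-bit default size, and CS.L = 1 with CS.D = 1 being reserved) are filled in from the architecture's general definition and should be read as assumptions here. The enum and function names are hypothetical.

```c
/* Sub-mode selected by the code-segment descriptor while IA-32e mode is active. */
enum cs_mode {
    MODE_64BIT,       /* CS.L = 1, CS.D = 0 (stated in the text) */
    MODE_COMPAT_32,   /* CS.L = 0, CS.D = 1 (assumption) */
    MODE_COMPAT_16,   /* CS.L = 0, CS.D = 0 (assumption) */
    MODE_RESERVED     /* CS.L = 1, CS.D = 1 (assumption: reserved encoding) */
};

static enum cs_mode ia32e_sub_mode(int cs_l, int cs_d)
{
    if (cs_l)
        return cs_d ? MODE_RESERVED : MODE_64BIT;
    return cs_d ? MODE_COMPAT_32 : MODE_COMPAT_16;
}
```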

  • Page 397

    Vol. 3 9-17 PROCESSOR MANAGEMENT AND INITIALIZATION from 64-bit mode through compatibility mode to legacy or real mode and then back through compatibility mode to 64-bit mode. 9.9 MODE SWITCHING To use the processor in protected mode after hardware or software reset, a mode switch must be performed from real-address mode. Once in protected mod[...]

  • Page 398

    9-18 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 7. If a local descriptor table is going to be used, execute the LLDT instruction to load the segment selector for the LDT in the LDTR register. 8. Execute the LTR instruction to load the task register with a segment selector to the initial protected-mode task or to a writable area of memory [...]

  • Page 399

    Vol. 3 9-19 PROCESSOR MANAGEMENT AND INITIALIZATION 4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode: - Limit = 64 KBytes (0FFFFH) - Byte granular (G = 0) - Expand up (E = 0) - Writable (W = 1) - Present (P = 1) ?[...]

  • Page 400

    9-20 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION • Load the system registers with the necessary pointers to the data structures and the appropriate flag settings for protected-mode operation. • Switch the processor to protected mode. Figure 9-3 shows the physical memory layout for the processor following a hardware reset and the startin[...]

  • Page 401

    Vol. 3 9-21 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-3. Processor State After Reset Table 9-4. Main Initialization Steps in STARTUP.ASM Source Listing STARTUP.ASM Line Numbers Description From To 157 157 Jump (short) to the entry code in the EPROM 162 169 Construct a temporary GDT in RAM with one entry: 0 - null 1 - R/W d[...]

  • Page 402

    9-22 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.1 Assembler Usage In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools. • The ASM386 will generate the right operand size opcodes[...]

  • Page 403

    Vol. 3 9-23 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.2 STARTUP.ASM Listing Example 9-1 provides high-level sample code designed to move the processor into protected mode. This listing does not include any opcode and offset information. Example 9-1. STARTUP.ASM MS-DOS* 5.0(045-N) 386(TM) MACRO ASSEMBLER STARTUP 09:44:51 08/19/92 PAG[...]

  • Page 404

    9-24 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 28 ; RAM_START will contain the linear address of the first 29 ; free byte above the copied tables - this may be useful if 30 ; a memory manager is used. 31 32 TSS_INDEX EQU 10 33 34 ; TSS_INDEX is the index of the TSS of the first task to 35 ; run after startup 36 37 38 ;;;;;;;;;;;;;;;;;;;;;[...]

  • Page 405

    Vol. 3 9-25 PROCESSOR MANAGEMENT AND INITIALIZATION 71 SS_reg DW ? 72 SS_h DW ? 73 DS_reg DW ? 74 DS_h DW ? 75 FS_reg DW ? 76 FS_h DW ? 77 GS_reg DW ? 78 GS_h DW ? 79 LDT_reg DW ? 80 LDT_h DW ? 81 TRAP_reg DW ? 82 IO_map_base DW ? 83 TASK_STATE ENDS 84 85 ; basic structure of a descriptor 86 DESC STRUC 87 lim_0_15 DW ? 88 bas_0_15 DW ? 89 bas_16_2[...]

  • Page 406

    9-26 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 114 115 ; ------------------------- DATA SEGMENT---------------------- 116 117 ; Initially, this data segment starts at linear 0, according 118 ; to the processor’s power-up state. 119 120 STARTUP_DATA SEGMENT RW 121 122 free_mem_linear_base LABEL DWORD 123 TEMP_GDT LABEL BYTE ; must be [...]

  • Page 407

    Vol. 3 9-27 PROCESSOR MANAGEMENT AND INITIALIZATION 159 ; DS,ES address the bottom 64K of flat linear memory 160 ASSUME DS:STARTUP_DATA, ES:STARTUP_DATA 161 ; See Figure 9-4 162 ; load GDTR with temporary GDT 163 LEA EBX,TEMP_GDT ; build the TEMP_GDT in low ram, 164 MOV DWORD PTR [EBX],0 ; where we can address 165 MOV DWORD PTR [EBX]+4,0 166 MO[...]

  • Page 408

    9-28 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 201 MOV ECX, CS_BASE 202 ADD ECX, OFFSET (GDT_EPROM) 203 MOV ESI, [ECX].table_linear 204 MOV EDI,EAX 205 MOVZX ECX, [ECX].table_lim 206 MOV APP_GDT_ram[EBX].table_lim,CX 207 INC ECX 208 MOV EDX,EAX 209 MOV APP_GDT_ram[EBX].table_linear,EAX 210 ADD EAX,ECX 211 REP MOVS BYTE PTR ES:[EDI], BY[...]

  • Page 409

    Vol. 3 9-29 PROCESSOR MANAGEMENT AND INITIALIZATION 246 247 ; move the TSS 248 MOV EDI,EAX 249 MOV EBX,TSS_INDEX*SIZE(DESC) 250 MOV ECX,GDT_DESC_OFF ;build linear address for TSS 251 MOV GS,CX 252 MOV DH,GS:[EBX].bas_24_31 253 MOV DL,GS:[EBX].bas_16_23 254 ROL EDX,16 255 MOV DX,GS:[EBX].bas_0_15 256 MOV ESI,EDX 257 LSL ECX,EBX 258 INC ECX 259[...]

  • Page 410

    9-30 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 289 PUSH DWORD PTR [EDX].EIP_reg 290 MOV AX,[EDX].DS_reg 291 MOV BX,[EDX].ES_reg 292 MOV DS,AX ; DS and ES no longer linear memory 293 MOV ES,BX 294 295 ; simulate far jump to initial task 296 IRETD 297 298 STARTUP_CODE ENDS *** WARNING #377 IN 298, (PASS 2) SEGMENT CONTAINS PRIVILEGED INSTR[...]

  • Page 411

    Vol. 3 9-31 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File) FFFF FFFFH Base=0, Limit=4G START: [CS.BASE+EIP] TEMP_GDT • Jump near start FFFF 0000H • Construct TEMP_GDT • LGDT • Move to protected mode DS, ES = GDT[1] 4 GB 0 GDT [1] GDT [0] G[...]

  • Page 412

    9-32 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-5. Moving the GDT, IDT, and TSS from ROM to RAM (Lines 196-261 of List File) FFFF FFFFH GDT RAM • Move the GDT, IDT, TSS from ROM to RAM • Fix Aliases • LTR 0 RAM_START TSS IDT GDT TSS RAM IDT RAM[...]

  • Page 413

    Vol. 3 9-33 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.3 MAIN.ASM Source Code The file MAIN.ASM shown in Example 9-2 defines the data and stack segments for this application and can be substituted with the main module task written in a high-level language that is invoked by the IRET instruction executed by STARTUP.ASM. Example 9-2. MAIN.ASM [...]

  • Page 414

    9-34 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION CODE SEGMENT ER use32 PUBLIC main_start: nop nop nop CODE ENDS END main_start, ds:data, ss:stack 9.10.4 Supporting Files The batch file shown in Example 9-3 can be used to assemble the source code files STARTUP.ASM and MAIN.ASM and build the final application. Example 9-3. Batch File to Assem[...]

  • Page 415

    Vol. 3 9-35 PROCESSOR MANAGEMENT AND INITIALIZATION TABLE GDT ( LOCATION = GDT_EPROM , ENTRY = ( 10: PROTECTED_MODE_TASK , startup.startup_code , startup.startup_data , main_module.data , main_module.code , main_module.stack ) ), IDT ( LOCATION = IDT_EPROM ); MEMORY ( RESERVE = (0..3FFFH -- Area for the GDT, IDT, TSS copied from ROM , 60000H..0FF[...]

  • Page 416

    9-36 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11 MICROCODE UPDATE FACILITIES The Pentium 4, Intel Xeon, and P6 family processors have the capability to correct errata by loading an Intel-supplied data block into the processor. The data block is called a microcode update. This section describes the mechanisms the BIOS needs to provi[...]

  • Page 417

    Vol. 3 9-37 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.1 Microcode Update A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. Each microcode update is tailored for a specific list of processor signatures. A mismatch of the processor’s signature wit[...]

  • Page 418

    9-38 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION NOTE The optional extended signature table is supported starting with processor family 0FH, model 03H. Table 9-6. Microcode Update Field Definitions Field Name Offset (bytes) Length (bytes) Description Header Version 0 4 Version number of the update header. Update Revision 4 4 Uniqu[...]

  • Page 419

    Reserved 36 12 Reserved fields for future expansion Update Data 48 Data Size or 2000 Update data Extended Signature Count Data Size + 48 4 Specifies the number of extended signature structures (Processor Signature[n], processor flags[n] and checksum[n]) that exist in this microcode[...]

  • Page 420

    Table 9-7. Microcode Update Format 31 24 16 8 0 Bytes Header Version 0 Update Revision 4 Month: 8 Day: 8 Year: 16 8 Processor Signature (CPUID) 12 Res: 4 Extended Family: 8 Extended Model: 4 Reserved: 2 Type: 2 Family: 4 Model: 4 Stepping: 4 Checksum 16 Loader Revision 20 P[...]
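The field layout in Table 9-7 can be illustrated with a minimal Python sketch. It unpacks only the six header DWORDs whose byte offsets (0 through 20) are visible in the excerpt, and splits the processor signature into the bit fields listed in the table. Little-endian storage and a BCD-coded date are assumptions, and the function and key names are hypothetical.

```python
import struct


def parse_header_prefix(blob: bytes) -> dict:
    """Unpack the first six DWORDs (byte offsets 0..20) of an update header."""
    (header_version, update_revision, date,
     signature, checksum, loader_revision) = struct.unpack_from("<6I", blob, 0)
    return {
        "header_version": header_version,
        "update_revision": update_revision,
        # Date DWORD: month in bits 31:24, day in 23:16, year in 15:0.
        # Rendering the digits as hex assumes the fields are BCD coded.
        "date": "%02X/%02X/%04X" % (date >> 24, (date >> 16) & 0xFF, date & 0xFFFF),
        "signature": signature,
        "checksum": checksum,
        "loader_revision": loader_revision,
    }


def decode_signature(sig: int) -> dict:
    """Split a CPUID signature using the field widths shown in Table 9-7."""
    return {
        "stepping": sig & 0xF,                  # bits 3:0
        "model": (sig >> 4) & 0xF,              # bits 7:4
        "family": (sig >> 8) & 0xF,             # bits 11:8
        "type": (sig >> 12) & 0x3,              # bits 13:12
        "extended_model": (sig >> 16) & 0xF,    # bits 19:16
        "extended_family": (sig >> 20) & 0xFF,  # bits 27:20
    }
```

For a signature of 00000F34H, this yields family 0FH, model 3, stepping 4.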

  • Page 421

    9.11.2 Optional Extended Signature Table The extended signature table is a structure that may be appended to the end of the encrypted data when the encrypted data only supports a single processor signature (optional case). The extended signature table will always be present when the encrypted[...]

  • Page 422

    a processor signature embedded in the microcode update with the processor signature returned by CPUID will cause the BIOS to reject the update. Example 9-5 shows how to check for a valid processor signature match between the processor and microcode update. Example 9-5. Pseudo Code to Val[...]

  • Page 423

    The three platform ID bits, when read as a binary coded decimal (BCD) number, indicate the bit position in the microcode update header’s processor flags field associated with the installed processor. The processor flags in the 48-byte header and the processor flags field associated with t[...]
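The translation from the three-bit platform ID to a processor flags bit can be sketched as follows. This is an illustration only: the function names are hypothetical, and the location of the platform ID in bits 52:50 of IA32_PLATFORM_ID is an assumption not shown in the excerpt above.

```python
def extract_platform_id(platform_id_msr: int) -> int:
    """Return the 3-bit platform ID (assumed to sit in bits 52:50 of the MSR)."""
    return (platform_id_msr >> 50) & 0x7


def platform_flag(platform_id: int) -> int:
    """Turn the platform ID (0..7) into a one-hot mask for the flags field."""
    if not 0 <= platform_id <= 7:
        raise ValueError("platform ID must fit in three bits")
    return 1 << platform_id


def flags_match(processor_flags: int, platform_id: int) -> bool:
    """True when the update's processor flags include this platform."""
    return (processor_flags & platform_flag(platform_id)) != 0
```

A platform ID of 2 therefore selects bit 2 (mask 04H) of the processor flags field.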

  • Page 424

    } Else { // // Assume the Data Size has been used to calculate the // location of Update.ProcessorSignature[N] and a match // on Update.ProcessorSignature[N] has already succeeded // If (Update.ProcessorFlags[n] & Flag) { Load Update } } } 9.11.5 Microcode Update Checksum Each microcode [...]

  • Page 425

    If (ChkSum == 00000000H) Success Else Fail 9.11.6 Microcode Update Loader This section describes an update loader used to load an update into a Pentium 4, Intel Xeon, or P6 family processor. It also discusses the requirements placed on the BIOS to ensure proper loading. The update loader des[...]
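The final test of the checksum pseudo code (ChkSum == 00000000H) can be reproduced with a short Python sketch. It assumes the checksum is the 32-bit wrapping sum of every little-endian DWORD in the update, which the truncated excerpt does not spell out; the function name is hypothetical.

```python
import struct


def checksum_ok(update: bytes) -> bool:
    """Return True when all DWORDs of the update sum to zero modulo 2**32."""
    if len(update) % 4 != 0:
        return False  # the update must be a whole number of DWORDs
    dwords = struct.unpack("<%dI" % (len(update) // 4), update)
    return (sum(dwords) & 0xFFFFFFFF) == 0
```

Under this assumption, the header's checksum field holds whatever value makes the total wrap to zero.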

  • Page 426

    • ECX contains 79H (address of IA32_BIOS_UPDT_TRIG). Other requirements are: • If the update is loaded while the processor is in real mode, then the update data may not cross a segment boundary. • If the update is loaded while the processor is in real mode, then the update data may not [...]

  • Page 427

    If the processor core supports Intel Hyper-Threading Technology, the guideline described in Section 9.11.6.3 also applies. 9.11.6.5 Update Loader Enhancements The update loader presented in Section 9.11.6, “Microcode Update Loader,” is a minimal implementation that can be enhanced to pro[...]

  • Page 428

    9.11.7.1 Determining the Signature An update that is successfully loaded into the processor provides a signature that matches the update revision of the currently functioning revision. This signature is available any time after the actual update has been loaded. Requesting the signature do[...]

  • Page 429

    Example 9-10. Pseudo Code to Authenticate the Update Z ← Obtain Update Revision from the Update Header to be authenticated; X ← Obtain Current Update Signature from MSR 8BH; If (Z > X) { Load Update that is to be authenticated; Y ← Obtain New Signature from MSR 8BH; If (Z == Y) Succes[...]
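The control flow of Example 9-10 can be modeled in Python with the hardware accesses passed in as callbacks. This is a sketch under stated assumptions: `read_signature` and `load_update` are hypothetical stand-ins for reading MSR 8BH and running the update loader, and because the Else branches are cut off in the excerpt, the sketch assumes they report failure.

```python
from typing import Callable


def authenticate(update_revision: int,
                 read_signature: Callable[[], int],
                 load_update: Callable[[], None]) -> bool:
    """Mirror the Z / X / Y comparison of the authentication pseudo code."""
    current = read_signature()             # X: signature before loading
    if update_revision <= current:         # the load is attempted only if Z > X
        return False
    load_update()
    return read_signature() == update_revision  # success only when Z == Y
```

Passing callbacks keeps the comparison logic testable without access to real MSRs.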

  • Page 430

    There are no optional functions. BIOS must load the appropriate update for each processor during system initialization. A Header Version of an update block containing the value 0FFFFFFFFH indicates that the update block is unused and available for storing a new update. The BIOS is responsib[...]
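The unused-block convention (Header Version equal to 0FFFFFFFFH) lends itself to a simple scan. The sketch below is illustrative only: it assumes 2048-byte update blocks, as suggested by the "size of microcode update / 2048" step in the calling-program pseudo code, assumes each stored update occupies exactly one block, and uses hypothetical names throughout.

```python
import struct

BLOCK_SIZE = 2048            # assumed size of one update block in bytes
UNUSED_HEADER = 0xFFFFFFFF   # Header Version value that marks a free block


def first_free_block(nvram: bytes):
    """Return the index of the first unused update block, or None if all are used."""
    for index in range(len(nvram) // BLOCK_SIZE):
        (header_version,) = struct.unpack_from("<I", nvram, index * BLOCK_SIZE)
        if header_version == UNUSED_HEADER:
            return index
    return None
```

A real BIOS would also need to skip the interior blocks of updates that span more than one block, which this sketch does not attempt.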

  • Page 431

    These requirements are checked by the BIOS during the execution of the write update function of this interface. The BIOS sequentially scans through all of the update blocks in NVRAM starting with index 0. The BIOS scans until it finds an update where the processor fields in the header match th[...]

  • Page 432

    } } NOTES The platform Id bits in IA32_PLATFORM_ID are encoded as a three-bit binary coded decimal field. The platform bits in the microcode update header are individually bit encoded. The algorithm must do a translation from one format to the other prior to doing a check. When performing [...]

  • Page 433

    Example 9-12. INT 15 D042 Calling Program Pseudo-code // // We must be in real mode // If the system is not in Real mode exit // // Detect presence of Genuine Intel processor(s) that can be updated // using(CPUID) // If no Intel processors exist that can be updated exit // // Detect the presen[...]

  • Page 434

    // Do we have enough update slots for all CPUs? // If there are more blocks required to support the unique processor steppings than update blocks provided by the BIOS exit // // Do we need any update blocks at all? If not, we are done // If (NumBlocks == 0) exit // // Record updates for proces[...]

  • Page 435

    } // // Compare the Update read to that written // If (Update read != Update written) { Display Diagnostic exit } I ← I + (size of microcode update / 2048) } // // Enable Update Loading, and inform user // Issue the Update Control function with Task = Enable. 9.11.8.3 Microcode Update Functi[...]

  • Page 436

    In general, each function returns with CF cleared and AH contains the returned status. The general return codes and other constant definitions are listed in Section 9.11.8.9, “Return Codes.” The OEM error field (AL) is provided for the OEM to return additional error information speci[...]

  • Page 437

    9.11.8.6 Function 01H—Write Microcode Update Data This function integrates a new microcode update into the BIOS storage device. Table 9-14 lists the parameters and return codes for the function. Table 9-14. Parameters for the Write Update Data Function Input AX Function Code 0D042H BL S[...]

  • Page 438

    Description The BIOS is responsible for selecting an appropriate update block in the non-volatile storage for storing the new update. The BIOS is also responsible for ensuring the integrity of the information provided by the caller, including authenticating the proposed update before inco[...]

  • Page 439

    Finally, before storing the proposed update in NVRAM, the BIOS must verify the authenticity of the update via the mechanism described in Section 9.11.6, “Microcode Update Loader.” This includes loading the update into the current processor, executing the CPUID instruction, reading MSR[...]

  • Page 440

    Figure 9-8. Microcode Update Write Operation Flow [1] (flowchart; recoverable node labels: Write Microcode Update; Valid Update Header Version?; Loader Revision Match BIOS’s Loader?; Does Update Match A CPU in The System?; Does Update Checksum Correctly?; Return CPU_NOT_[...]

  • Page 441

    Figure 9-9. Microcode Update Write Operation Flow [2] (flowchart; recoverable node labels: Return INVALID_REVISION; Update Revision Newer Than NVRAM Update?; Update Pass Authenticity Test?; Return SECURITY_FAILURE; Update NVRAM Record; Return SUCCESS; Update Matching CPU Already In NVRAM?; Space[...]

  • Page 442

    9.11.8.7 Function 02H—Microcode Update Control This function enables loading of binary updates into the processor. Table 9-15 lists the parameters and return codes for the function. This control is provided on a global basis for all updates and processors. The caller can determine the[...]

  • Page 443

    The READ_FAILURE error code returned by this function has meaning only if the control function is implemented in the BIOS NVRAM. The state of this feature (enabled/disabled) can also be implemented using CMOS RAM bits where READ failure errors cannot occur. 9.11.8.8 Function 03H—Read Micr[...]

  • Page 444

    The read function enables the caller to read any microcode update data that already exists in a BIOS and make decisions about the addition of new updates. As a result of a successful call, the BIOS copies the microcode update into the location pointed to by ES:DI, with the contents of all Upda[...]

  • Page 445

    Table 9-18. Return Code Definitions Return Code Value Description SUCCESS 00H The function completed successfully. NOT_IMPLEMENTED 86H The function is not implemented. ERASE_FAILURE 90H A failure because of the inability to erase the storage device. WRITE_FAILURE 91H A failure [...]

  • Page 446


  • Page 447

    Vol. 3 10-1 CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the IA-32 processors with the Pentium processor (see Section 19.27, “Advanced Programmable Interrupt Controller (APIC)”) and is in[...]

  • Page 448

    10-2 Vol. 3 ADVANCED PR OGRAMMABLE INTERRUP T CONTR OLLER (APIC) interrupt pins (LINT0 and LINT1). The I/O devices may also be connected to an 8259-type interrupt controller that is in turn connected to the processor through one of the local interrupt pins. • Externally connected I/O devices — These interrupts originate as an edge or level asse[...]

  • Page 449

    IPIs can be sent to other processors in the system or to the originating processor (self-interrupts). When the target processor receives an IPI message, its local APIC handles the message automatically (using information included in the message such as vector number and trigger mode)[...]

  • Page 450

    also be delivered to the individual processors through the local interrupt pins; however, this mechanism is commonly not used in MP systems. Figure 10-2. Local APICs and I/O APIC When Intel Xeon Processors Are Used in Multiple-Processor Systems Figure 10-3. Local APICs and I/O [...]

  • Page 451

    The IPI mechanism is typically used in MP systems to send fixed interrupts (interrupts for a specific vector number) and special-purpose interrupts to processors on the system bus. For example, a local APIC can use an IPI to forward a fixed interrupt to another processor for servi[...]

  • Page 452

    forward extendability for future Intel platform innovations. These extensions and modifications are noted in the following sections. 10.4 LOCAL APIC The following sections describe the architecture of the local APIC and how to detect it, identify it, and determine its status. Descri[...]

  • Page 453

    Figure 10-4. Local APIC Structure (block diagram; recoverable block labels: Current Count Register; Initial Count Register; Divide Configuration Register; Version Register; Error Status Register; In-Service Register (ISR); Vector Decode; Interrupt Command Register (ICR); Acceptance Logic; Vec[3:0] & TMR Bit; Register Selec[...]

  • Page 454

    Table 10-1 shows how the APIC registers are mapped into the 4-KByte APIC register space. Registers are 32 bits, 64 bits, or 256 bits in width; all are aligned on 128-bit boundaries. All 32-bit registers should be accessed using 128-bit aligned 32-bit loads or stores. Some processors [...]

  • Page 455

    FEE0 00F0H Spurious Interrupt Vector Register Bits 0-8 Read/Write; bits 9-31 Read Only. FEE0 0100H In-Service Register (ISR); bits 0:31 Read Only. FEE0 0110H In-Service Register (ISR); bits 32:63 Read Only. FEE0 0120H In-Service Register (ISR); bits 64:95 Read Only. FEE0 0[...]

  • Page 456

    10.4.2 Presence of the Local APIC Beginning with the P6 family processors, the presence or absence of an on-chip local APIC can be detected using the CPUID instruction. When the CPUID instruction is executed with a source operand of 1 in the EAX register, bit 9 of the CPUID [...]

  • Page 457

    Vol. 3 10-11 ADVANCED PR OGRAMMABLE IN TERRUPT CON TROLLER (APIC) 1. Using the APIC global enable/disable flag in the IA32_APIC_BASE MSR (MSR address 1BH; see Figure 10-5 ): — When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an IA-32 processor without an on-chip APIC. The CPUID feature flag for the APIC (see Section 10.4.[...]

  • Page 458

    • APIC Global Enable flag, bit 11: Enables or disables the local APIC (see Section 10.4.3, “Enabling or Disabling the Local APIC”). This flag is available in the Pentium 4, Intel Xeon, and P6 family processors. It is not guaranteed to be available or available at the sa[...]

  • Page 459

    this, operating system software should avoid writing to the local APIC ID register. The value returned by bits 31-24 of the EBX register (when the CPUID instruction is executed with a source operand value of 1 in the EAX register) is always the Initial APIC ID (determined by the pla[...]
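Extracting the Initial APIC ID from the EBX value is a plain shift and mask. A minimal sketch, assuming the caller has already obtained EBX from CPUID with EAX = 1 (the function name is hypothetical):

```python
def initial_apic_id(cpuid1_ebx: int) -> int:
    """Return bits 31:24 of the EBX value reported by CPUID leaf 1."""
    return (cpuid1_ebx >> 24) & 0xFF
```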

  • Page 460

    x2APIC will introduce 32-bit ID; see Section 10.5. 10.4.7.1 Local APIC State After Power-Up or Reset Following a power-up or RESET of the processor, the state of the local APIC and its registers is as follows: • The following registers are reset to all 0s: • IRR, ISR, TMR[...]

  • Page 461

    • The mask bits for all the LVT entries are set. Attempts to reset these bits will be ignored. • (For Pentium and P6 family processors) The local APIC continues to listen to all bus messages in order to keep its arbitration ID synchronized with the rest of the system. 10.4.7.3[...]

  • Page 462

    10.5 EXTENDED XAPIC (X2APIC) The x2APIC architecture extends the xAPIC architecture (described in Section 10.4) in a backward compatible manner and provides forward extendability for future Intel platform innovations. Specifically, x2APIC • Retains all key elements of compatib[...]

  • Page 463

    Table 10-2, “x2APIC operating mode configurations” describes the possible combinations of the enable bit (EN - bit 11) and the extended mode bit (EXTD - bit 10) in the IA32_APIC_BASE MSR. Once the local APIC has been switched to x2APIC mode (EN = 1, EXTD = 1), switching ba[...]

  • Page 464

    32-bit register. Similarly, executing the WRMSR instruction with the APIC register address in ECX writes bits 0 to 31 of register EAX to bits 0 to 31 of the specified APIC register. If the register is a 64-bit register then bits 0 to 31 of register EDX are written to bits 32 to [...]
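The EDX:EAX convention described above amounts to splitting a 64-bit register value into two 32-bit halves. The following sketch shows only that arithmetic; it issues no WRMSR or RDMSR, and the helper names are hypothetical.

```python
def split_edx_eax(value: int) -> tuple:
    """Split a 64-bit register value into (EDX, EAX) halves for WRMSR."""
    eax = value & 0xFFFFFFFF           # bits 0 to 31 of the APIC register
    edx = (value >> 32) & 0xFFFFFFFF   # bits 32 to 63 of the APIC register
    return edx, eax


def join_edx_eax(edx: int, eax: int) -> int:
    """Rebuild the 64-bit value from the halves returned by RDMSR."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)
```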

  • Page 465

    0080H 008H Task Priority Register (TPR) Read/Write. Bits 7:0 are RW. Bits 31:8 are Reserved. 0090H 009H Reserved 00A0H 00AH Processor Priority Register (PPR) Read only. 00B0H 00BH EOI Register Write only. 0 is the only valid value to write. GP fault on non[...]

  • Page 466

    01F0H 01FH TMR bits 224:255 Read Only. 0200H 020H Interrupt Request Register (IRR); bits 0:31 Read Only. 0210H 021H IRR bits 32:63 Read Only. 0220H 022H IRR bits 64:95 Read Only. 0230H 023H IRR bits 96:127 Read Only. 0240H 024H IRR bits 128:159 Read Only. 0250H 0[...]
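The two address columns in the rows above follow a visible pattern: the second column is the first shifted right by four bits (0080H becomes 008H, 0200H becomes 020H). A small sketch of that relationship, with a hypothetical function name, is given below; it derives only the offset shown in the table and does not add any MSR base address.

```python
def msr_offset_from_mmio(mmio_offset: int) -> int:
    """Map an xAPIC MMIO register offset to the offset in the second column."""
    if mmio_offset % 0x10 != 0:
        raise ValueError("APIC registers are aligned on 16-byte boundaries")
    return mmio_offset >> 4
```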

  • Page 467

    10.5.1.3 Reserved Bit Checking Section 10.5.1.2 and Table 10-3 specify the reserved bit definitions for the APIC registers in x2APIC mode. Non-zero writes (by WRMSR instruction) to reserved bits of these registers will raise a general protection fault exception while reads re[...]

  • Page 468

    to enable BIOS and/or platform firmware to re-configure the x2APIC IDs in some clusters to provide for unique and non-overlapping system wide IDs before configuring the disconnected components into a single system. 10.5.2 x2APIC Register Availability The local APIC registers ca[...]

  • Page 469

    field, VM-exit MSR-load address field, and VM-entry MSR-load address field in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B). The x2APIC MSRs cannot be loaded and stored on VMX transitions. A VMX transition fails if the VMM has specified that [...]

  • Page 470

    The default value for SVR[bit 12] is clear, indicating that an EOI broadcast will be performed. The support for Directed EOI capability can be detected by means of bit 24 in the Local APIC Version Register. This feature is supported in both the xAPIC and x2APIC modes of a loc[...]

  • Page 471

    • xAPIC mode: IA32_APIC_BASE[EN]=1 and IA32_APIC_BASE[EXTD]=0 • x2APIC mode: IA32_APIC_BASE[EN]=1 and IA32_APIC_BASE[EXTD]=1 • Invalid: IA32_APIC_BASE[EN]=0 and IA32_APIC_BASE[EXTD]=1 The state corresponding to EXTD=1 and EN=0 is not valid and it is not possible to get into[...]
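The EN/EXTD combinations listed above can be decoded from a raw IA32_APIC_BASE value as in the following sketch. The bit positions (EN in bit 11, EXTD in bit 10) come from the text; the function name and the returned strings are hypothetical, and the "disabled" case (EN=0, EXTD=0) is taken from the later description of disabling the x2APIC.

```python
EN_BIT = 1 << 11    # APIC global enable flag
EXTD_BIT = 1 << 10  # x2APIC (extended) mode flag


def apic_mode(apic_base: int) -> str:
    """Classify the local APIC state from an IA32_APIC_BASE value."""
    enabled = bool(apic_base & EN_BIT)
    extended = bool(apic_base & EXTD_BIT)
    if enabled and extended:
        return "x2APIC"
    if enabled:
        return "xAPIC"
    if extended:
        return "invalid"   # EN=0 with EXTD=1 is not a valid state
    return "disabled"
```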

  • Page 472

    x2APIC After RESET The valid transitions from the xAPIC mode state are: • to the x2APIC mode by setting EXTD to 1 (resulting EN=1, EXTD=1). The physical x2APIC ID (see Figure 10-6) is preserved across this transition and the logical x2APIC ID (see Figure 10-21) is initialized [...]

  • Page 473

    x2APIC Transitions From x2APIC Mode From the x2APIC mode, the only valid x2APIC transition using IA32_APIC_BASE is to the state where the x2APIC is disabled by setting EN to 0 and EXTD to 0. The x2APIC ID (32 bits) and the legacy local xAPIC ID (8 bits) are preserved across thi[...]

  • Page 474

    Support for the x2APIC architecture can be implemented in the local APIC unit. All existing PCI/MSI capable devices and IOxAPIC unit should work with the x2APIC extensions defined in this document. The x2APIC architecture also provides flexibility to cope with the underlying fabri[...]

  • Page 475

    The extended topology enumeration leaf is intended to assist software with enumerating processor topology on systems that require 32-bit x2APIC IDs to address individual logical processors. For example, a system with greater than 256 logical processors or greater than 64 proce[...]

  • Page 476

    10.6 HANDLING LOCAL INTERRUPTS The following sections describe facilities that are provided in the local APIC for handling local interrupts. These include: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring counters, the thermal sensor, and the int[...]

  • Page 477

    Figure 10-12. Local Vector Table (LVT) (register layout diagram; recoverable field labels: Vector; Timer Mode, 0: One-shot, 1: Periodic; Delivery Mode, 000: Fixed, 100: NMI; Mask †, 0: Not Masked, 1: Masked; Address: FEE0 0350H; Value After Reset: 0001 0000H; Reserved) [...]

  • Page 478

    The setup information that can be specified in the registers of the LVT table is as follows: Vector Interrupt vector number. Delivery Mode Specifies the type of interrupt to be sent to the processor. Some delivery modes will only operate as intended when used in conjunction with[...]

  • Page 479

    Interrupt Input Pin Polarity Specifies the polarity of the corresponding interrupt pin: (0) active high or (1) active low. Remote IRR Flag (Read Only) For fixed mode, level-triggered interrupts; this flag is set when the local APIC accepts the interrupt for servicing and is reset[...]

  • Page 480

    10.6.3 Error Handling The local APIC provides an error status register (ESR) that it uses to record errors that it detects when handling interrupts (see Figure 10-13). An APIC error interrupt is generated when the local APIC sets one of the error bits in the ESR. The LVT error reg[...]

  • Page 481

    10.6.3.1 x2APIC Differences in Error Handling RDMSR and WRMSR operations to reserved addresses in the x2APIC mode will raise a GP fault. Additionally, reserved bit violations cause GP faults as detailed in Section 10.5.1.3. Beyond illegal register access and reserved bit violati[...]

  • Page 482

    If the ICR is programmed with lowest priority delivery mode then the "Re-directible IPI" bit will be set in x2APIC modes (same as legacy xAPIC behavior) and the interrupt will not be processed. Write to the ICR with both lowest priority delivery mode and illegal vector[...]

  • Page 483

    The time base for the timer is derived from the processor’s bus clock, divided by the value specified in the divide configuration register. The timer can be configured through the timer LVT entry for one-shot or periodic operation. In one-shot mode, the timer is started by prog[...]
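The relationship between the bus clock, the divide value, and the resulting timer interval can be worked through numerically. The sketch below is a rough model, not a description of any particular processor: it assumes the timer counts down from an initial count at a rate of bus clock / divide value, and all parameter names are hypothetical.

```python
def timer_interval_seconds(initial_count: int,
                           divide_value: int,
                           bus_clock_hz: float) -> float:
    """Estimate how long the timer takes to count down from initial_count."""
    if divide_value <= 0 or bus_clock_hz <= 0:
        raise ValueError("divide value and bus clock must be positive")
    tick_hz = bus_clock_hz / divide_value   # time base after division
    return initial_count / tick_hz
```

For example, with an assumed 100 MHz bus clock, a divide value of 16, and an initial count of 1,000,000, the interval is 0.16 seconds.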

  • Page 484

    10.6.5 Local Interrupt Acceptance When a local interrupt is sent to the processor core, it is subject to the acceptance criteria specified in the interrupt acceptance flow chart in Figure 10-25. If the interrupt is accepted, it is logged into the IRR register and handled by [...]

  • Page 485

    The ICR consists of the following fields. Vector The vector number of the interrupt being sent. Delivery Mode Specifies the type of IPI to be sent. This field is also known as the IPI message type field. 000 (Fixed) Delivers the interrupt specified in the vector f[...]

  • Page 486

    ability for a processor to send a lowest priority IPI is model specific and should be avoided by BIOS and operating system software. 010 (SMI) Delivers an SMI interrupt to the target processor or processors. The vector field must be programmed to 00H for future compatibil[...]

  • Page 487

    Destination Mode Selects either physical (0) or logical (1) destination mode (see Section 10.7.2, “Determining IPI Destination”). Delivery Status (Read Only) Indicates the IPI delivery status, as follows: 0 (Idle) There is currently no IPI activity for this local APIC, or the pr[...]

  • Page 488

    destination field set to FH for Pentium and P6 family processors and to FFH for Pentium 4 and Intel Xeon processors. 11: (All Excluding Self) The IPI is sent to all processors in a system with the exception of the processor sending the IPI. The APIC broadcasts a message with the ph[...]

  • Page 489

    Self Invalid X Lowest Priority, NMI, INIT, SMI, Start-Up X All Including Self Valid Edge Fixed X All Including Self Invalid 2 Level Fixed X All Including Self Invalid X Lowest Priority, NMI, INIT, SMI, Start-Up X All Excluding Self Valid Edge Fixed, Lowest Priorit[...]

  • Page 490

    10.7.1.1 ICR Operation in x2APIC Mode In x2APIC mode, the layout of the Interrupt Command Register is shown in Figure 10-17. The lower 32 bits of ICR in x2APIC mode is identical to the lower half of the All excluding Self Valid Edge All Modes 1 X All excluding Self Valid 2 Le[...]

  • Page 491

    ICR in xAPIC mode, except the Delivery Status bit is removed since it is not needed in x2APIC mode. The destination ID field is expanded to 32 bits in x2APIC mode. To send an IPI, software must set up the ICR or SELF IPI register to indicate the type of IPI message to be sent and t[...]

  • Page 492

    10.7.2 Determining IPI Destination The destination of an IPI can be one, all, or a subset (group) of the processors on the system bus. The sender of the IPI specifies the destination of an IPI with the following APIC registers and fields within the registers: • ICR Register — [...]

  • Page 493

    APICs to be addressed on the APIC bus. A broadcast to all local APICs is specified with 0FH. NOTE The number of local APICs that can be addressed on the system bus may be restricted by hardware. 10.7.2.2 Logical Destination Mode In logical destination mode, IPI destination is spe[...]

  • Page 494

    The interpretation of MDA for the two models is described in the following paragraphs. 1. Flat Model — This model is selected by programming DFR bits 28 through 31 to 1111. Here, a unique logical APIC ID can be established for up to 8 local APICs by setting a different bit in t[...]

  • Page 495

    lowest priority delivery mode is not supported in cluster mode and must not be configured by software. The hierarchical cluster destination model can be used with Pentium 4, Intel Xeon, P6 family, or Pentium processors. With this model, a hierarchical network can be created by con[...]

  • Page 496

    10-50 Vol. 3 ADVANCED PR OGRAMMABLE INTERRUP T CONTR OLLER (APIC) mode is not supported in the x2APIC mode. Hence the Destination Format Register (DFR) is eliminated in x2APIC mode. The 32-bit logical x2APIC ID field of LDR is partitioned into two sub-fields: • Cluster ID (LDR[31:16]): is the address of the destination cluster • Logical ID (LDR[...]
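The partition of the 32-bit logical x2APIC ID can be illustrated with a short sketch. The Cluster ID range LDR[31:16] is stated above; the bit range of the Logical ID sub-field is cut off in the excerpt, so the sketch assumes it occupies the remaining low 16 bits. The function name is hypothetical.

```python
def split_logical_x2apic_id(ldr: int) -> tuple:
    """Return (cluster_id, logical_id) from a 32-bit logical x2APIC ID."""
    cluster_id = (ldr >> 16) & 0xFFFF   # LDR[31:16], as stated in the text
    logical_id = ldr & 0xFFFF           # assumed: the remaining low 16 bits
    return cluster_id, logical_id
```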

  • Page 497

    10.7.2.5 Broadcast/Self Delivery Mode The destination shorthand field of the ICR allows the delivery mode to be by-passed in favor of broadcasting the IPI to all the processors on the system bus and/or back to itself (see Section 10.7.1, “Interrupt Command Register (ICR)”). T[...]

  • Page 498

    Here, the TPR value is the task priority value in the TPR (see Figure 10-26), the IRRV value is the vector number for the highest priority bit that is set in the IRR (see Figure 10-28) or 00H (if no IRR bit is set), and the ISRV value is the vector number for the highest priority[...]

  • Page 499

    The SELF IPI register is a write-only register. A RDMSR instruction with address of the SELF IPI register will raise a GP fault. The handling and prioritization of a self-IPI sent via the SELF IPI register is architecturally identical to that for an IPI sent via the ICR from a[...]

  • Page 500

    priorities of the local APICs by resetting Arb ID register of each agent to its current APIC ID value. (The Pentium 4 and Intel Xeon processors do not implement the Arb ID register.) Section 10.11, “APIC Bus Message Passing Mechanism and Protocol (P6 Family, Pentium Process[...]

  • Page 501

    Vol. 3 10-55 ADVANCED PR OGRAMMABLE IN TERRUPT CON TROLLER (APIC) 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC sets the appropriate bit in the IRR. 4. When interrupts are pending in the IRR and ISR register , the local A[...]

  • Page 502

    1. (IPIs only) It examines the IPI message to determine if it is the specified destination for the IPI as described in Section 10.7.2, “Determining IPI Destination.” If it is the specified destination, it continues its acceptance procedure; if it is not the destination, it [...]

  • Page 503

    interrupt, or one of the MP protocol IPI messages (BIPI, FIPI, and SIPI), the interrupt is sent directly to the processor core for handling. 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupt[...]

  • Page 504

    of vectors within a priority group, the vector number is often divided into two parts, with the high 4 bits of the vector indicating its priority and the low 4 bits indicating its ranking within the priority group. 10.9.3.1 Task and Processor Priorities The local APIC also defines a[...]
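The two-part split of a vector number described above reduces to two mask operations. A minimal sketch with hypothetical function names:

```python
def priority_class(vector: int) -> int:
    """High 4 bits of an 8-bit vector: its priority."""
    return (vector >> 4) & 0xF


def rank_in_class(vector: int) -> int:
    """Low 4 bits of an 8-bit vector: its ranking within the priority group."""
    return vector & 0xF
```

Vector 4AH, for instance, has priority 4 and ranking 0AH within that group.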

  • Page 505

    Its value in the PPR is computed as follows: IF TPR[7:4] ≥ ISRV[7:4] THEN PPR[7:0] ← TPR[7:0] ELSE PPR[7:4] ← ISRV[7:4] PPR[3:0] ← 0 Here, the ISRV value is the vector number of the highest priority ISR bit that is set, or 00H if no ISR bit is set. Essentially, the processor[...]
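The PPR rule above translates directly into code. The sketch below follows the IF/ELSE as written; the function name is hypothetical, and `isrv` is expected to be 00H when no ISR bit is set, as the text specifies.

```python
def processor_priority(tpr: int, isrv: int) -> int:
    """Compute PPR[7:0] from TPR[7:0] and the highest in-service vector."""
    tpr &= 0xFF
    isrv &= 0xFF
    if (tpr >> 4) >= (isrv >> 4):
        return tpr            # PPR[7:0] <- TPR[7:0]
    return isrv & 0xF0        # PPR[7:4] <- ISRV[7:4], PPR[3:0] <- 0
```

With TPR = 20H and an in-service vector of 51H, the result is 50H: the in-service class wins and the low nibble is cleared.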

  • Page 506

    10-60 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IRR contains the active interrupt requests that have been accepted, but not yet dispatched to the processor for servicing. When the local APIC accepts an interrupt, it sets the bit in the IRR that corresponds to the vector of the accepted interrupt. When the processor core is ready[...]

  • Page 507

    Vol. 3 10-61 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) bit is cleared for edge-triggered interrupts and set for level-triggered interrupts. If a TMR bit is set when an EOI cycle for its corresponding interrupt vector is generated, an EOI message is sent to all I/O APICs. 10.9.5 Signaling Interrupt Servicing Completion For all interrupt[...]

  • Page 508

    10-62 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • Loading the TPR with a value of 8 (01000B) blocks all interrupts with a priority of 8 or less while allowing all interrupts with a priority of nine or more to be recognized. • Loading the TPR with zero enables all external interrupts. • Loading the TPR with 0F (01111B) disa[...]
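
The bullets above describe a threshold on the interrupt priority, which an earlier page defines as the high 4 bits of the vector. A minimal sketch of that comparison, assuming the value loaded is a 4-bit priority class; the names `is_recognized` and `tpr_class` are hypothetical:

```python
def is_recognized(vector: int, tpr_class: int) -> bool:
    """Return True if the vector's priority exceeds the task-priority class."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in 8 bits")
    if not 0 <= tpr_class <= 0xF:
        raise ValueError("priority class must fit in 4 bits")
    # Priority of a vector = its high 4 bits; it must be strictly greater
    # than the loaded class ("8 or less" is blocked, "nine or more" passes).
    return (vector >> 4) > tpr_class
```

With a class of 8, vector 90H is recognized and vector 8FH is blocked; with a class of 0FH, no vector can be recognized.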

  • Page 509

    Vol. 3 10-63 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) There are no ordering mechanisms between direct updates of the APIC.TPR and CR8. Operating software should implement either direct APIC TPR updates or CR8 style TPR updates but not mix them. Software can use a serializing instruction (for example, CPUID) to serialize updates betwe[...]

  • Page 510

    10-64 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.11 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) The Pentium 4 and Intel Xeon processors pass messages among the local and I/O APICs on the system bus, using the system bus message passing mechanism and protocol. The P6 family and Pentium p[...]

  • Page 511

    Vol. 3 10-65 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) the bus regardless of its sender’s arbitration priority, unless more than one APIC issues an EOI message simultaneously. In the latter case, the APICs sending the EOI messages arbitrate using their arbitration priorities. If the APICs are set up to use “lowest priority” arb[...]

  • Page 512

    10-66 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.12.1 Message Address Register Format The format of the Message Address Register (lower 32-bits) is shown in Figure 10-32. Fields in the Message Address Register are as follows: 1. Bits 31-20: These bits contain a fixed value for interrupt messages (0FEEH). This value locate[...]

  • Page 513

    Vol. 3 10-67 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination mode and only the processor in the system that has the matching APIC ID is considered for delivery of that interrupt (this means no re-direction). If RH is 1 and DM is 1, the Destination ID Field is interpreted as in logical destination mode and the redirection is limit[...]
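
As a rough illustration of how the Message Address Register fields combine, the sketch below packs a destination ID together with the RH and DM bits under the fixed 0FEEH prefix. The excerpt only shows the prefix in bits 31-20; the other bit positions used here (destination ID in bits 19-12, RH in bit 3, DM in bit 2) are an assumption taken from the usual layout of Figure 10-32 and should be checked against the full manual. The function name is hypothetical.

```python
MSI_ADDRESS_PREFIX = 0xFEE << 20  # bits 31-20 hold the fixed value 0FEEH


def msi_address(dest_id: int, rh: int = 0, dm: int = 0) -> int:
    """Compose the lower 32 bits of a Message Address Register value.

    Assumed layout (not visible in the excerpt): destination ID in bits
    19-12, redirection hint (RH) in bit 3, destination mode (DM) in bit 2.
    """
    if not 0 <= dest_id <= 0xFF:
        raise ValueError("destination ID must fit in 8 bits")
    return (MSI_ADDRESS_PREFIX
            | (dest_id << 12)
            | ((rh & 1) << 3)
            | ((dm & 1) << 2))
```

Under these assumptions, a message aimed at APIC ID 1 with RH = 0 and DM = 0 uses address FEE01000H.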

  • Page 514

    10-68 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Reserved fields are not assumed to be any value. Software must preserve their contents on writes. Other fields in the Message Data Register are described below. 1. Vector: This 8-bit field contains the interrupt vector associated with the message. Values range from 010H to 0FEH[...]

  • Page 515

    Vol. 3 11-1 CHAPTER 11 MEMORY CACHE CONTROL This chapter describes the memory cache and cache control mechanisms, the TLBs, and the store buffer in Intel 64 and IA-32 processors. It also describes the memory type range registers (MTRRs) introduced in the P6 family processors and how they are used to control caching of physical memory locations. [...]

  • Page 516

    11-2 Vol. 3 MEMORY CACHE CONTROL Figure 11-2 shows the cache arrangement of the Intel Core i7 processor. Figure 11-2. Cache Structure of the Intel Core i7 Processors Table 11-1. Characteristics of the Caches, TLBs, Store Buffer, and Write Combining Buffer in Intel 64 and IA-32 Processors Cache or Buffer Characteristics Trace Cache 1 [...]

  • Page 517

    Vol. 3 11-3 MEMORY CACHE CONTROL L1 Data Cache • Pentium 4 and Intel Xeon processors (Based on Intel NetBurst microarchitecture): 8-KByte, 4-way set associative, 64-byte cache line size. • Pentium 4 and Intel Xeon processors (Based on Intel NetBurst microarchitecture): 16-KByte, 8-way set associative, 64-byte cache line size. ?[...]

  • Page 518

    11-4 Vol. 3 MEMORY CACHE CONTROL Instruction TLB (4-KByte Pages) • Pentium 4 and Intel Xeon processors (Based on Intel NetBurst microarchitecture): 128 entries, 4-way set associative. • Intel Atom processors: 32-entries, fully associative. • Intel Core i7 processor: 64-entries per thread (128-entries per core), 4-way set associ[...]

  • Page 519

    Vol. 3 11-5 MEMORY CACHE CONTROL Intel 64 and IA-32 processors may implement four types of caches: the trace cache, the level 1 (L1) cache, the level 2 (L2) cache, and the level 3 (L3) cache. See Figure 11-1. Cache availability is described below: • Intel Core i7 processor Family: The L1 cache is divided into two sections: one section is de[...]

  • Page 520

    11-6 Vol. 3 MEMORY CACHE CONTROL • Pentium 4 and Intel Xeon processors Based on Intel NetBurst microarchitecture: The trace cache caches decoded instructions (μops) from the instruction decoder and the L1 cache contains data. The L2 and L3 caches are unified data and instruction caches located on the processor chip. Dual-core processors [...]

  • Page 521

    Vol. 3 11-7 MEMORY CACHE CONTROL Processors based on Intel Core microarchitectures implement one level of instruction TLB and two levels of data TLB. The Intel Core i7 processor provides a second-level unified TLB. The store buffer is associated with the processor's instruction execution units. It allows writes to system memory and/or the internal cac[...]

  • Page 522

    11-8 Vol. 3 MEMORY CACHE CONTROL (depending on the write policy currently in force) can also write it out to memory. If the operand is to be written out to memory, it is written first into the store buffer, and then written from the store buffer to memory when the system bus is available. (Note that for the Pentium processor, write misses do n[...]

  • Page 523

    Vol. 3 11-9 MEMORY CACHE CONTROL registers to access UC memory that may have read or write side effects. • Uncacheable (UC-): Has the same characteristics as the strong uncacheable (UC) memory type, except that this memory type can be overridden by programming the MTRRs for the WC memory type. This memory type is available in processor familie[...]

  • Page 524

    11-10 Vol. 3 MEMORY CACHE CONTROL possible) and through to system memory. When writing through to memory, invalid cache lines are never filled, and valid cache lines are either filled or invalidated. Write combining is allowed. This type of cache-control is appropriate for frame buffers or when there are devices on the system bus that [...]

  • Page 525

    Vol. 3 11-11 MEMORY CACHE CONTROL 11.3.1 Buffering of Write Combining Memory Locations Writes to the WC memory type are not cached in the typical sense of the word cached. They are retained in an internal write combining buffer (WC buffer) that is separate from the internal L1, L2, and L3 caches and the store buffer. The WC buffer is not snoop[...]

  • Page 526

    11-12 Vol. 3 MEMORY CACHE CONTROL The WC memory type is weakly ordered by definition. Once the eviction of a WC buffer has started, the data is subject to the weak ordering semantics of its definition. Ordering is not maintained between the successive allocation/deallocation of WC buffers (for example, writes to WC bu[...]

  • Page 527

    Vol. 3 11-13 MEMORY CACHE CONTROL large data structure should be marked as uncacheable, or reading it will evict cached lines that the processor will be referencing again. A similar example would be a write-only data structure that is written to (to export the data to another agent), but never read by software. Such a structure can be marked as [...]

  • Page 528

    11-14 Vol. 3 MEMORY CACHE CONTROL The L1 instruction cache in P6 family processors implements only the “SI” part of the MESI protocol, because the instruction cache is not writable. The instruction cache monitors changes in the data cache to maintain consistency between the caches when instructions are modified. See Section 11.6, “Self-Mod[...]

  • Page 529

    Vol. 3 11-15 MEMORY CACHE CONTROL 11.5.1 Cache Control Registers and Bits Figure 11-3 depicts cache-control mechanisms in IA-32 processors. Other than for the matter of memory address space, these work the same in Intel 64 processors. The Intel 64 and IA-32 architectures provide the following cache-control registers and bits for use in enabli[...]

  • Page 530

    11-16 Vol. 3 MEMORY CACHE CONTROL Figure 11-3. Cache-Control Registers and Bits Available in Intel 64 and IA-32 Processors. Figure labels: Page-Directory or Page-Table Entry; TLBs; MTRRs; Physical Memory, 0 to FFFFFFFFH; CD and NW flags control overall caching of system memory; PCD and PWT flags control page-level caching; G flag controls page-level flushing of TL[...]

  • Page 531

    Vol. 3 11-17 MEMORY CACHE CONTROL Table 11-5. Cache Operating Modes. Columns: CD; NW; Caching and Read/Write Policy; L1; L2/L3 (note 1). Row CD = 0, NW = 0: Normal Cache Mode. Highest performance cache operation. • Read hits access the cache; read misses may cause replacement. • Write hits update the cache. • Only writes to shared lines and write misses update system[...]

  • Page 532

    11-18 Vol. 3 MEMORY CACHE CONTROL • NW flag, bit 29 of control register CR0: Controls the write policy for system memory locations (see Section 2.5, “Control Registers”). If the NW and CD flags are clear, write-back is enabled for the whole of system memory, but may be restricted for individual pages or regions of memory[...]

  • Page 533

    Vol. 3 11-19 MEMORY CACHE CONTROL corrupt addresses. • PCD flag in the page-directory and page-table entries: Controls caching for individual page tables and pages, respectively (see Section 4.9, “Paging and Memory Typing”). This flag only has effect when paging is enabled and the CD flag in control register CR0 is clear. The PCD fl[...]

  • Page 534

    11-20 Vol. 3 MEMORY CACHE CONTROL page-table entries) permit caching in an external L2 cache to be controlled on a page-by-page basis, consistent with the control exercised on the L1 cache of these processors. The P6 and more recent processor families do not provide these pins because the L2 cache is internal to the chip package. 11.5.2 Preceden[...]

  • Page 535

    Vol. 3 11-21 MEMORY CACHE CONTROL When normal caching is in effect, the effective memory type shown in Table 11-6 is determined using the following rules: 1. If the PCD and PWT attributes for the page are both 0, then the effective memory type is identical to the MTRR-defined memory type. 2. If the PCD flag is set, then the effective memory t[...]

  • Page 536

    11-22 Vol. 3 MEMORY CACHE CONTROL 11.5.2.2 Selecting Memory Types for Pentium III and More Recent Processor Families The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Intel Core Solo, Pentium M, Pentium 4, Intel Xeon, and Pentium III processors use the PAT to select effective page-level memory types. Here, a memory type for a page is sele[...]

  • Page 537

    Vol. 3 11-23 MEMORY CACHE CONTROL 11.5.2.3 Writing Values Across Pages with Different Memory Types If two adjoining pages in memory have different memory types, and a word or longer operand is written to a memory location that crosses the page boundary between those two pages, the operand might be written to memory twice. This action does not [...]

  • Page 538

    11-24 Vol. 3 MEMORY CACHE CONTROL 11.5.3 Preventing Caching To disable the L1, L2, and L3 caches after they have been enabled and have received cache fills, perform the following steps: 1. Enter the no-fill cache mode. (Set the CD flag in control register CR0 to 1 and the NW flag to 0.) 2. Flush all caches using the WBINVD instruction. 3. Disabl[...]

  • Page 539

    Vol. 3 11-25 MEMORY CACHE CONTROL 11.5.4 Disabling and Enabling the L3 Cache On processors based on Intel NetBurst microarchitecture, the third-level cache can be disabled by bit 6 of the IA32_MISC_ENABLE MSR. The third-level cache disable flag (bit 6 of the IA32_MISC_ENABLE MSR) allows the L3 cache to be disabled and enabled, independently of t[...]

  • Page 540

    11-26 Vol. 3 MEMORY CACHE CONTROL The CLFLUSH instruction allows selected cache lines to be flushed from memory. This instruction gives a program the ability to explicitly free up cache space, when it is known that cached sections of system memory will not be accessed in the near future. The non-temporal move instructions (MOVNTI, MOVNTQ, MOVNT[...]

  • Page 541

    Vol. 3 11-27 MEMORY CACHE CONTROL on the Intel NetBurst microarchitecture that support Intel Hyper-Threading Technology. 11.6 SELF-MODIFYING CODE A write to a memory location in a code segment that is currently cached in the processor causes the associated cache line (or lines) to be invalidated. This check is based on the physical address[...]

  • Page 542

    11-28 Vol. 3 MEMORY CACHE CONTROL To avoid problems related to implicit caching, the operating system must explicitly invalidate the cache when changes are made to cacheable data that the cache coherency mechanism does not automatically handle. This includes writes to dual-ported or physically aliased memory boards that are not detected [...]

  • Page 543

    Vol. 3 11-29 MEMORY CACHE CONTROL 11.9 INVALIDATING THE TRANSLATION LOOKASIDE BUFFERS (TLBS) The processor updates its address translation caches (TLBs) transparently to software. Several mechanisms are available, however, that allow software and hardware to invalidate the TLBs either explicitly or as a side effect of another operation. [...]

  • Page 544

    11-30 Vol. 3 MEMORY CACHE CONTROL The discussion of write ordering in Section 8.2, “Memory Ordering,” gives a detailed description of the operation of the store buffer. 11.11 MEMORY TYPE RANGE REGISTERS (MTRRS) The following section pertains only to the P6 and more recent processor families. The memory type range registers (MTRRs) provi[...]

  • Page 545

    Vol. 3 11-31 MEMORY CACHE CONTROL Reserved*: 03H; Write-through (WT): 04H; Write-protected (WP): 05H; Writeback (WB): 06H; Reserved*: 7H through FFH. NOTE: * Use of these encodings results in a general-protection exception (#GP). Figure 11-4. Mapping Physical Memory With MTRRs Table 11-8. Memory Types That Can Be Encoded in MTRRs (Contd.) 0[...]

  • Page 546

    11-32 Vol. 3 MEMORY CACHE CONTROL 11.11.1 MTRR Feature Identification The availability of the MTRR feature is model-specific. Software can determine if MTRRs are supported on a processor by executing the CPUID instruction and reading the state of the MTRR flag (bit 12) in the feature information register (EDX). If the MTRR flag is set (indicatin[...]
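
The MTRR check described above reduces to testing one bit of the EDX value returned by CPUID. A minimal sketch, assuming the caller has already obtained the EDX feature word by some platform-specific means (Python has no built-in CPUID access); the function name is hypothetical:

```python
MTRR_FLAG_BIT = 12  # MTRR flag position in the CPUID feature information (EDX)


def has_mtrr(edx: int) -> bool:
    """Return True if the MTRR feature flag is set in a CPUID EDX value."""
    return bool((edx >> MTRR_FLAG_BIT) & 1)
```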

  • Page 547

    Vol. 3 11-33 MEMORY CACHE CONTROL 11.11.2 Setting Memory Ranges with MTRRs The memory ranges and the types of memory specified in each range are set by three groups of registers: the IA32_MTRR_DEF_TYPE MSR, the fixed-range MTRRs, and the variable range MTRRs. These registers can be read and written to using the RDMSR and WRMSR instructions, resp[...]

  • Page 548

    11-34 Vol. 3 MEMORY CACHE CONTROL memory. When this flag is set, the FE flag can disable the fixed-range MTRRs; when the flag is clear, the FE flag has no effect. When the E flag is set, the type specified in the default memory type field is used for areas of memory not already mapped by either a fixed or variable MTRR. Bits 8 and 9, and bits [...]

  • Page 549

    Vol. 3 11-35 MEMORY CACHE CONTROL Figure 11-7 shows flags and fields in these registers. The functions of these flags and fields are: • Type field, bits 0 through 7: Specifies the memory type for the range (see Table 11-8 for the encoding of this field). • PhysBase field, bits 12 through (MAXPHYADDR-1): Specifies the base addre[...]

  • Page 550

    11-36 Vol. 3 MEMORY CACHE CONTROL - The width of the PhysMask field depends on the maximum physical address size supported by the processor. CPUID.80000008H reports the maximum physical address size supported by the processor. If CPUID.80000008H is not available, software may assume that the processor supports a 36-bit physical address size [...]

  • Page 551

    Vol. 3 11-37 MEMORY CACHE CONTROL NOTE It is possible for software to parse the memory descriptions that BIOS provides by using the ACPI/INT15 e820 interface mechanism. This information then can be used to determine how MTRRs are initialized (for example: allowing the BIOS to define valid memory ranges and the maximum memory range supported by [...]

  • Page 552

    11-38 Vol. 3 MEMORY CACHE CONTROL Before attempting to access these SMRR registers, software must test bit 11 in the IA32_MTRRCAP register. If SMRR is not supported, reads from or writes to registers cause general-protection exceptions. When the valid flag in the IA32_SMRR_PHYSMASK MSR is 1, accesses to the specified address range are treated[...]

  • Page 553

    Vol. 3 11-39 MEMORY CACHE CONTROL 3FFFFFH (2 MBytes to 4 MBytes), a mask value of FFFE00000H is required. Again, the 12 least-significant bits of this mask value are truncated, so that the value entered in the PhysMask field of IA32_MTRR_PHYSMASK3 is FFFE00H. This mask is chosen so that when any address in the 200000H to 3FFFFFH range is AND’[...]
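
The mask arithmetic in this example can be reproduced with a short sketch. It assumes a 36-bit physical address size (the fallback mentioned for processors without CPUID.80000008H) and a power-of-2 range size; the function names and the `maxphyaddr` parameter are hypothetical.

```python
PAGE_SHIFT = 12  # the 12 least-significant mask bits are truncated


def phys_mask_field(size: int, maxphyaddr: int = 36) -> int:
    """Return the PhysMask field value for a power-of-2 range size in bytes."""
    if size < (1 << PAGE_SHIFT) or size & (size - 1):
        raise ValueError("size must be a power of 2 of at least 4 KBytes")
    full_mask = ~(size - 1) & ((1 << maxphyaddr) - 1)
    return full_mask >> PAGE_SHIFT


def in_range(addr: int, base: int, size: int, maxphyaddr: int = 36) -> bool:
    """An address is in the range when (addr AND mask) equals (base AND mask)."""
    mask = phys_mask_field(size, maxphyaddr) << PAGE_SHIFT
    return (addr & mask) == (base & mask)
```

For the 2-MByte range starting at 200000H, `phys_mask_field(0x200000)` gives FFFE00H, matching the value quoted for IA32_MTRR_PHYSMASK3.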

  • Page 554

    11-40 Vol. 3 MEMORY CACHE CONTROL IA32_MTRR_PHYSBASE5 = 0000 0000 A000 0001H IA32_MTRR_PHYSMASK5 = 0000 000F FF80 0800H Caches A0000000-A0800000 as WC type. This MTRR setup uses the ability to overlap any two memory ranges (as long as the ranges are mapped to WB and UC memory types) to minimize the number of MTRR registers that are required t[...]

  • Page 555

    Vol. 3 11-41 MEMORY CACHE CONTROL 11.11.4 Range Size and Alignment Requirement A range that is to be mapped to a variable-range MTRR must meet the following “power of 2” size and alignment rules: 1. The minimum range size is 4 KBytes and the base address of the range must be on at least a 4-KByte boundary. 2. For ranges greater than 4 [...]

  • Page 556

    11-42 Vol. 3 MEMORY CACHE CONTROL the MTRRs according to known types of memory, including memory on devices that it auto-configures. Initialization is expected to occur prior to booting the operating system. See Section 11.11.8, “MTRR Considerations in MP Systems,” for information on initializing MTRRs in MP (multiple-processor) systems. [...]

  • Page 557

    Vol. 3 11-43 MEMORY CACHE CONTROL automatically aligns the base address and size to 4-KByte boundaries. Pseudocode for the MemTypeGet() function is given in Example 11-4. Example 11-4. MemTypeGet() Pseudocode #define MIXED_TYPES -1 /* 0 < MIXED_TYPES || MIXED_TYPES > 256 */ IF CPU_FEATURES.MTRR /* processor supports MTRRs */ THEN Align[...]

  • Page 558

    11-44 Vol. 3 MEMORY CACHE CONTROL Example 11-5. Get4KMemType() Pseudocode IF IA32_MTRRCAP.FIX AND MTRRdefType.FE /* fixed registers enabled */ THEN IF PHY_ADDRESS is within a fixed range return IA32_MTRR_FIX.Type; FI; FOR each variable-range MTRR in IA32_MTRRCAP.VCNT IF IA32_MTRR_PHYSMASK.V = 0 THEN continue; FI; IF (PHY_ADDRESS AND IA32_MTR[...]

  • Page 559

    Vol. 3 11-45 MEMORY CACHE CONTROL THEN pre_mtrr_change(); update affected MTRR; post_mtrr_change(); FI; ELSE (* try to map using a variable MTRR pair *) IF IA32_MTRRCAP.VCNT = 0 THEN return UNSUPPORTED; FI; IF conflicts with current variable ranges THEN return RANGE_OVERLAP; FI; IF no MTRRs available THEN return VAR_NOT_AVAILABLE; FI; IF BASE and [...]

  • Page 560

    11-46 Vol. 3 MEMORY CACHE CONTROL END The physical address to variable range mapping algorithm in the MemTypeSet function detects conflicts with current variable range registers by cycling through them and determining whether the physical address in question matches any of the current ranges. During this scan, the algorithm can detect whether[...]

  • Page 561

    Vol. 3 11-47 MEMORY CACHE CONTROL 4. Enter the no-fill cache mode. (Set the CD flag in control register CR0 to 1 and the NW flag to 0.) 5. Flush all caches using the WBINVD instruction. Note that on a processor that supports self-snooping (CPUID feature flag bit 27), this step is unnecessary. 6. If the PGE flag is set in control register CR4, flush[...]

  • Page 562

    11-48 Vol. 3 MEMORY CACHE CONTROL The requirement that all 4-KByte ranges in a large page are of the same memory type implies that large pages with different memory types may suffer a performance penalty, since they must be marked with the lowest common denominator memory type. The Pentium 4, Intel Xeon, and P6 family processors provide specia[...]

  • Page 563

    Vol. 3 11-49 MEMORY CACHE CONTROL 11.12.2 IA32_PAT MSR The IA32_PAT MSR is located at MSR address 277H (see Appendix B, “Model-Specific Registers (MSRs)”), and this address will remain the same on future IA-32 processors that support the PAT feature. Figure 11-9 shows the format of the 64-bit IA32_PAT MSR. The IA32_P[...]

  • Page 564

    11-50 Vol. 3 MEMORY CACHE CONTROL 11.12.3 Selecting a Memory Type from the PAT To select a memory type for a page from the PAT, a 3-bit index made up of the PAT, PCD, and PWT bits must be encoded in the page-table or page-directory entry for the page. Table 11-11 shows the possible encodings of the PAT, PCD, and PWT bits and the [...]
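
The 3-bit index described above can be sketched as follows, with the PAT bit as the most significant bit and PWT as the least significant. The second helper assumes the usual IA32_PAT layout of eight entries, one per byte, with the memory type in the low 3 bits of each; that layout comes from Figure 11-9, which is not reproduced in this excerpt, and both function names are hypothetical.

```python
def pat_index(pat: int, pcd: int, pwt: int) -> int:
    """Build the 3-bit PAT index from the PAT, PCD, and PWT bits."""
    return ((pat & 1) << 2) | ((pcd & 1) << 1) | (pwt & 1)


def pat_entry(ia32_pat: int, index: int) -> int:
    """Extract the memory-type encoding of one PAT entry.

    Assumes entry i occupies byte i of the 64-bit MSR value, with the
    type in bits 2:0 of that byte.
    """
    if not 0 <= index <= 7:
        raise ValueError("PAT index must be in the range 0..7")
    return (ia32_pat >> (8 * index)) & 0x07
```

A page with PAT = 0, PCD = 1, PWT = 1 selects entry 3; the encoding stored there is then interpreted with the PAT memory-type table.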

  • Page 565

    Vol. 3 11-51 MEMORY CACHE CONTROL The values in all the entries of the PAT can be changed by writing to the IA32_PAT MSR using the WRMSR instruction. The IA32_PAT MSR is read and write accessible (use of the RDMSR and WRMSR instructions, respectively) to software operating at a CPL of 0. Table 11-10 shows the allowable encoding of the entrie[...]

  • Page 566

    11-52 Vol. 3 MEMORY CACHE CONTROL 11.12.5 PAT Compatibility with Earlier IA-32 Processors For IA-32 processors that support the PAT, the IA32_PAT MSR is always active. That is, the PCD and PWT bits in page-table entries and in page-directory entries (that point to pages) always select a memory type for a page indirectly by selecting an[...]

  • Page 567

    Vol. 3 12-1 CHAPTER 12 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING This chapter describes those features of the Intel® MMX™ technology that must be considered when designing or enhancing an operating system to support MMX technology. It covers MMX instruction set emulation, the MMX state, aliasing of MMX registers, saving MMX sta[...]

  • Page 568

    12-2 Vol. 3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING result, the MMX register mapping is fixed and is not affected by the value in the Top Of Stack (TOS) field in the floating-point status word (bits 11 through 13). When a value is written into an MMX register using an MMX instruction, the value also appears in the corresponding floating-poi[...]

  • Page 569

    Vol. 3 12-3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING • When the EMMS instruction is executed, each tag field in the x87 FPU tag word is set to 11B (empty). • Each time an MMX instruction is executed, the TOS value is set to 000B. Execution of MMX instructions does not affect the other bits in the x87 FPU status word (bits 0 throug[...]

  • Page 570

    12-4 Vol. 3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING 12.3 SAVING AND RESTORING THE MMX STATE AND REGISTERS Because the MMX registers are aliased to the x87 FPU data registers, the MMX state can be saved to memory and restored from memory as follows: • Execute an FSAVE, FNSAVE, or FXSAVE instruction to save the MMX state to memory.[...]

  • Page 571

    Vol. 3 12-5 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING • Execute eight MOVQ instructions to save the contents of the MMX0 through MMX7 registers to memory. An EMMS instruction may then (optionally) be executed to clear the MMX state in the x87 FPU. • Execute eight MOVQ instructions to read the saved contents of MMX registers from me[...]

  • Page 572

    12-6 Vol. 3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING • System exceptions: - Invalid Opcode (#UD), if the EM flag in control register CR0 is set when an MMX instruction is executed (see Section 12.1, “Emulation of the MMX Instruction Set”). - Device not available (#NM), if an MMX instruction is executed when the TS flag in contro[...]

  • Page 573

    Vol. 3 12-7 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING When the TOS equals 2 (case B in Figure 12-2), ST0 points to the physical location R2. MM0 maps to ST6, MM1 maps to ST7, MM2 maps to ST0, and so on. Figure 12-2. Mapping of MMX Registers to x87 FPU Data Register Stack. Figure labels: MM0 through MM7; ST0, ST1, ST2, ST6, ST7; TO [...]
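
The mapping in this example follows a simple modular rule: the MMX registers are fixed to physical locations, while ST(i) is relative to TOS. A small sketch that reproduces the quoted case; the function name is hypothetical:

```python
def st_index_for_mmx(mm: int, tos: int) -> int:
    """Return i such that MMn aliases ST(i) for a given TOS value.

    MMn is fixed to physical register Rn, and ST(i) refers to physical
    register (TOS + i) mod 8, so i = (n - TOS) mod 8.
    """
    if not 0 <= mm <= 7 or not 0 <= tos <= 7:
        raise ValueError("register number and TOS must be in the range 0..7")
    return (mm - tos) % 8
```

With TOS equal to 2, this gives ST6 for MM0, ST7 for MM1, and ST0 for MM2, as stated above.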

  • Page 574

    12-8 Vol. 3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING[...]

  • Page 575

    Vol. 3 13-1 CHAPTER 13 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR EXTENDED STATES This chapter describes system programming features for instruction set extensions operating on the processor state extension known as the SSE state (XMM registers, MXCSR) and for processor extended states. Instruction set extensions oper[...]

  • Page 576

    13-2 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR guidelines for this support. Because the SSE/SSE2/SSE3/SSSE3/SSE4 extensions share the same state and experience the same sets of non-numerical and numerical exception behavior, the guidelines that apply to SSE also apply to other sets of SIMD extensions that opera[...]

  • Page 577

    Vol. 3 13-3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND To use the POPCNT instruction, software must check CPUID.1:ECX.POPCNT[bit 23] = 1 13.1.3 Checking for Support for the FXSAVE and FXRSTOR Instructions A separate check must be made to ensure that the processor supports FXSAVE and FXRSTOR. Make sure: • CPUID.1:EDX.FXSR[bit 24][...]
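
Both checks above are single-bit tests on the registers returned by CPUID leaf 1. A minimal sketch, assuming the ECX and EDX values have already been obtained by platform-specific code; the function names are hypothetical:

```python
POPCNT_BIT = 23  # CPUID.1:ECX.POPCNT
FXSR_BIT = 24    # CPUID.1:EDX.FXSR


def has_popcnt(ecx: int) -> bool:
    """Return True if CPUID.1:ECX reports POPCNT support."""
    return bool((ecx >> POPCNT_BIT) & 1)


def has_fxsr(edx: int) -> bool:
    """Return True if CPUID.1:EDX reports FXSAVE/FXRSTOR support."""
    return bool((edx >> FXSR_BIT) & 1)
```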

  • Page 578

    13-4 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR • OSFXSR and OSXMMEXCPT flags in control register CR4 • SSE/SSE2/SSE3/SSSE3/SSE4 feature flags returned by CPUID • EM, MP, and TS flags in control register CR0 Table 13-1. Action Taken for Combinations of OSFXSR, OSXMMEXCPT, SSE, SSE2, SSE3, EM, MP, a[...]

  • Page 579

    Vol. 3 13-5 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND The SIMD floating-point exception mask bits (bits 7 through 12), the flush-to-zero flag (bit 15), the denormals-are-zero flag (bit 6), and the rounding control field (bits 13 and 14) in the MXCSR register should be left in their default values of 0. This permits the application[...]

  • Page 580

    13-6 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR to a 16-byte boundary will also generate a general-protection exception, instead of a stack-segment fault exception (#SS). - Page fault (#PF). - Alignment check (#AC). When enabled, this type of alignment check operates on operands that are less than 128-bits in si[...]

  • Page 581

    Vol. 3 13-7 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND - Device not available (#NM). This exception is generated by executing a SSE/SSE2/SSE3/SSSE3/SSE4 instruction when the TS flag (bit 3) of CR0 is set to 1. Other exceptions can occur indirectly due to faulty execution of the above exceptions. 13.1.6 Providing an Hand[...]

  • Page 582

    13-8 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR 13.1.6.1 Numeric Error flag and IGNNE# SSE/SSE2/SSE3/SSE4 extensions ignore the NE flag in control register CR0 (that is, treat it as if it were always set) and the IGNNE# pin. When an unmasked SIMD floating-point exception is detected, it is always reported b[...]

  • Page 583

    Vol. 3 13-9 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND • Execute a LDMXCSR instruction to restore the state of the MXCSR register from memory. 13.4 SAVING THE SSE/SSE2/SSE3/SSSE3/SSE4 STATE ON TASK OR CONTEXT SWITCHES When switching from one task or context to another, it is often necessary to save the SSE/SSE2/SSE3/SSSE3/[...]

  • Page 584

    13-10 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR when a suspended task is resumed (using an FXRSTOR instruction). Here, the x87 FPU/MMX/SSE/SSE2/SSE3/SSE4 state must be saved as part of the task state. This approach is appropriate for preemptive multitasking operating systems, where the application cannot know [...]

  • Page 585

    Vol. 3 13-11 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND The TS flag can be set either explicitly (by executing a MOV instruction to control register CR0) or implicitly (using the IA-32 architecture’s native task switching mechanism). When the native task switching mechanism is used, the processor automatically sets the TS fl[...]

  • Page 586

    13-12 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR If a new task attempts to access an x87 FPU, MMX, XMM, or MXCSR register while the TS flag is set to 1, a device-not-available exception (#NM) is generated. The device-not-available exception handler executes the following pseudo-code. FXSAVE “To x87FPU/MMX/X[...]

  • Page 587

    Vol. 3 13-13 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND - CPUID leaf function 0DH enumerates the list of processor states (including legacy x87 FPU, SSE states and processor extended states), the offset and size of individual save area for each processor extended state. • Control register enhancement and dedicated register for ena[...]

  • Page 588

    13-14 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR The XSAVE header is 64 bytes in length and must be aligned on a 64-byte boundary. Therefore, the XSAVE/XRSTOR region must be aligned on a 64-byte boundary. The format of the header is as follows (see Table 13-3): The value of each bit in HEADER.XSTATE_BV may af[...]

  • Page 589

    Vol. 3 13-15 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND enabled), a value of "1" in the corresponding bit of HEADER.XSTATE_BV causes the processor state to be updated with contents of the save area read from the memory image. A value of "0" in HEADER.XSTATE_BV causes the processor state to be initialized by[...]

  • Page 590

    13-16 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR XSAVE, XRSTOR instructions operating on FP or SSE state will cause a #NM (Device Not Available) exception, if CR0.TS is set. Using this feature, system software can implement the “lazy restore” technique of managing x87 FPU/SSE state using either FXSAVE/FXRST [...]

  • Page 591

    Vol. 3 13-17 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND 13.8 DETECTION, ENUMERATION, ENABLING PROCESSOR EXTENDED STATE SUPPORT An OS can determine if the XSAVE/XRSTOR/XGETBV/XSETBV instructions and the XFEATURE_ENABLED_MASK register (XCR0) are available in the processor by checking that the value of CPUID.1.ECX.XSAVE is 1. Th[...]

  • Page 592

    13-18 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR instructions, and provides a more constrained list of features than using all 1's in the save mask. The advantage of using a mask value of all-bits-set-to-1 for XSAVE/XRSTOR is that it can simplify system software’s support for processor extended state man[...]

  • Page 593

    If all three requirements are met, applications can use the target new instruction set extensions. If any of the above requirements are not met, an attempt to execute an instruction operating on a processor extended state corresponding to bit offset higher than 1 in the XFEATUR[...]

  • Page 594

    13-20 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR[...]

  • Page 595

    Vol. 3 14-1 CHAPTER 14 POWER AND THERMAL MANAGEMENT This chapter describes facilities of Intel 64 and IA-32 architecture used for power management and thermal monitoring. 14.1 ENHANCED INTEL SPEEDSTEP® TECHNOLOGY Enhanced Intel SpeedStep® Technology was introduced in the Pentium M processor; it is available in Pentium 4, Intel Xeon, Int[...]

  • Page 596

    tools can access model-specific events and report the occurrences of state transitions. 14.2 P-STATE HARDWARE COORDINATION The Advanced Configuration and Power Interface (ACPI) defines performance states (P-states) that are used to facilitate system software’s ability to manage processor power consumpt[...]

  • Page 597

    • IA32_APERF MSR (0xE8) increments in proportion to actual performance, while accounting for hardware coordination of P-state and TM1/TM2; or software initiated throttling. • The MSRs are per logical processor; they measure performance only when the targeted processor is in the C0 state. • Only th[...]

  • Page 598

    // This example does not cover the additional logic or algorithms // necessary to coordinate multiple logical processors to a target P-state. TargetPstate = FindPstate(PercentPerformance); if (TargetPstate != currentPstate) { SetPState(TargetPstate); } // WRMSR of MCNT and ACNT should be performed [...]

  • Page 599

    corresponding enable mechanism is activated, the headroom is available and certain criteria are met. • The opportunistic processor performance operation is generally transparent to most application software. • System software (BIOS and Operating system) must be aware of hardware support for opportu[...]

  • Page 600

    to the OS, it may be undesirable to allow the possibility of the processor delivering increased performance that cannot be sustained after the calibration phase. System software can temporarily disengage opportunistic processor performance operation by setting bit 32 of the IA32_PERF_CTL MSR (0199H), us[...]
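
The bit manipulation described above can be sketched as follows. This is a minimal sketch that only computes the 64-bit value to be written; actually reading and writing the MSR requires privileged code that is not shown. The function names are hypothetical, and treating "clear bit 32" as the way to re-engage is an assumption inferred from the word "temporarily".

```python
IA32_PERF_CTL = 0x199          # MSR address given in the excerpt (0199H)
DISENGAGE_BIT = 32             # bit named in the excerpt
MASK_64 = 0xFFFF_FFFF_FFFF_FFFF


def disengage_opportunistic(perf_ctl: int) -> int:
    """Return the IA32_PERF_CTL value with bit 32 set."""
    return (perf_ctl | (1 << DISENGAGE_BIT)) & MASK_64


def reengage_opportunistic(perf_ctl: int) -> int:
    """Return the value with bit 32 cleared (assumed inverse operation)."""
    return perf_ctl & ~(1 << DISENGAGE_BIT) & MASK_64
```

Both helpers preserve every other bit, so the current target operating point held in the low bits is left unchanged.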

  • Page 601

    14.3.2.4 Application Awareness of Opportunistic Processor Operation (Optional) There may be situations in which an end user or application software wishes to be aware of turbo mode activity. It is possible for an application-level utility to periodically check the occurrences of opportunistic processor [...]

  • Page 602

    • When the OS timer service transfers control, the application can use RDPMC (with ECX = 4000_0001H) to read IA32_PERF_FIXED_CTR1 (MSR address 30AH) to record the unhalted core clocktick (UCC) value; followed by RDPMC (ECX=4000_0002H) to read IA32_PERF_FIXED_CTR2 (MSR address 30BH) to record the unh[...]
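
The two RDPMC selectors and MSR addresses quoted above follow a simple pattern, sketched below. The helper names are hypothetical, and the generalization to counter 0 (selector 4000_0000H, MSR 309H) is an extrapolation from the two values given in the excerpt.

```python
FIXED_COUNTER_SELECT = 0x4000_0000   # high bit pattern seen in both ECX values above
FIXED_CTR_MSR_BASE = 0x309           # extrapolated: CTR1 is 30AH, CTR2 is 30BH


def rdpmc_selector(fixed_counter: int) -> int:
    """ECX value for RDPMC that selects fixed-function counter n."""
    if fixed_counter < 0:
        raise ValueError("counter index must be non-negative")
    return FIXED_COUNTER_SELECT | fixed_counter


def fixed_counter_msr(fixed_counter: int) -> int:
    """MSR address of IA32_PERF_FIXED_CTRn."""
    if fixed_counter < 0:
        raise ValueError("counter index must be non-negative")
    return FIXED_CTR_MSR_BASE + fixed_counter
```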

  • Page 603

    Software can program the lowest four bits of IA32_ENERGY_PERF_BIAS MSR with a value from 0 - 15. The values represent a sliding scale, where a value of 0 (the default reset value) corresponds to a hint preference for highest performance and a value of 15 corresponds to the maximum energy savings. A v[...]
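
A minimal sketch of how the 4-bit hint described above could be packed into and read back from an MSR value. The function names are hypothetical; the code only manipulates integers and does not access hardware.

```python
HINT_MASK = 0xF  # lowest four bits hold the energy/performance hint


def with_energy_perf_hint(msr_value: int, hint: int) -> int:
    """Return msr_value with its low four bits replaced by hint (0..15)."""
    if not 0 <= hint <= 15:
        raise ValueError("hint must be in the range 0..15")
    return (msr_value & ~HINT_MASK) | hint


def energy_perf_hint(msr_value: int) -> int:
    """Extract the hint: 0 prefers performance, 15 prefers energy savings."""
    return msr_value & HINT_MASK
```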

  • Page 604

    Reference, A-M,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). If CPUID.05H.ECX[Bit 1] = 1, the target processor supports using interrupts as break-events for MWAIT, even when interrupts are disabled. Use this feature to measure C-state residency as follow[...]

  • Page 605

    consumption; this is in addition to the reduction offered by automatic thermal monitoring mechanisms. 4. On-die digital thermal sensor and interrupt mechanisms permit the OS to manage thermal conditions natively without relying on BIOS or other system board components. The first mechanism is not visible[...]

  • Page 606

    14.5.1 Catastrophic Shutdown Detector P6 family processors introduced a thermal sensor that acts as a catastrophic shutdown detector. This catastrophic shutdown detector was also implemented in Pentium 4, Intel Xeon and Pentium M processors. It is always enabled. When processor core temperature re[...]

  • Page 607

    Support for TM2 is indicated by CPUID.1:ECX.TM2[bit 8] = 1. 14.5.2.3 Two Methods for Enabling TM2 On processors with CPUID family/model/stepping signature encoded as 0x69n or 0x6Dn (early Pentium M processors), TM2 is enabled if the TM_SELECT flag (bit 16) of the MSR_THERM2_CTL register is set to 1 [...]
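
The two checks described above can be sketched as follows. This is a hedged illustration with hypothetical function names: it assumes the CPUID.1 ECX value and the signature have already been read elsewhere, and it interprets the trailing "n" in 0x69n and 0x6Dn as the 4-bit stepping field.

```python
TM2_BIT = 8            # CPUID.1:ECX.TM2, per the excerpt
TM_SELECT_BIT = 16     # TM_SELECT flag in MSR_THERM2_CTL, per the excerpt


def supports_tm2(cpuid_1_ecx: int) -> bool:
    """Return True if the TM2 feature flag is set."""
    return bool((cpuid_1_ecx >> TM2_BIT) & 1)


def is_early_pentium_m(signature: int) -> bool:
    """Match signatures 0x69n / 0x6Dn, ignoring the stepping nibble (assumed)."""
    return (signature & ~0xF) in (0x690, 0x6D0)


def tm_select_is_set(therm2_ctl: int) -> bool:
    """Return True if TM_SELECT (bit 16) is set in an MSR_THERM2_CTL value."""
    return bool((therm2_ctl >> TM_SELECT_BIT) & 1)
```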

  • Page 608

    14.5.2.4 Performance State Transitions and Thermal Monitoring If the thermal control circuitry (TCC) for thermal monitor (TM1/TM2) is active, writes to the IA32_PERF_CTL will effect a new target operating point as follows: • If TM1 is enabled and the TCC is engaged, the performance state transit[...]

  • Page 609

    After the second temperature sensor has been tripped, the thermal monitor (TM1/TM2) will remain engaged for a minimum time period (on the order of 1 ms). The thermal monitor will remain engaged until the processor core temperature drops below the preset trip temperature of the temperature sensor, ta[...]

  • Page 610

    interrupt enable flags in the IA32_THERM_INTERRUPT MSR are cleared (interrupts are disabled) and the thermal LVT entry is set to mask interrupts. This interrupt should be handled either by the operating system or system management mode (SMM) code. Note that the operation of the thermal monitoring mech[...]

  • Page 611

    The IA32_CLOCK_MODULATION MSR contains the following flag and field used to enable software-controlled clock modulation and to select the clock modulation duty cycle: • On-Demand Clock Modulation Enable, bit 4 - Enables on-demand software controlled clock modulation when set; disables software-cont[...]

  • Page 612

    clock modulation at the duty cycle specified by TM1 takes precedence, regardless of the setting of the on-demand clock modulation duty cycle. For Hyper-Threading Technology enabled processors, the IA32_CLOCK_MODULATION register is duplicated for each logical processor. In order for the On-demand c[...]

  • Page 613

    14.5.5.2 Reading the Digital Sensor Unlike traditional analog thermal devices, the output of the digital thermal sensor is a temperature relative to the maximum supported operating temperature of the processor. Temperature measurements returned by digital thermal sensors are always at or below TC[...]

  • Page 614

    • PROCHOT# or FORCEPR# Log (bit 3, R/WC0) - Sticky bit that indicates whether PROCHOT# or FORCEPR# has been asserted by another agent on the platform since the last clearing of this bit or a reset. If bit 3 = 1, PROCHOT# or FORCEPR# has been externally asserted. Software may c[...]

  • Page 615

    • Reading Valid (bit 31, RO) - Indicates if the digital readout in bits 22:16 is valid. The readout is valid if bit 31 = 1. Changes to temperature can be detected using two thresholds (see Figure 14-12); one is set above and the other below the current temperature. These thresholds have the capabi[...]
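
The fields named in the excerpts above (the sticky log in bit 3, the digital readout in bits 22:16 and the Reading Valid flag in bit 31) can be decoded as in the following sketch. The function and key names are hypothetical, and the readout is returned as a raw number because the excerpt only says it is relative to the maximum supported operating temperature.

```python
def decode_thermal_status(value: int) -> dict:
    """Decode selected fields from a thermal status register value."""
    reading_valid = bool((value >> 31) & 1)
    readout = (value >> 16) & 0x7F          # bits 22:16, seven bits wide
    return {
        "prochot_or_forcepr_log": bool((value >> 3) & 1),
        "reading_valid": reading_valid,
        # Only trust the readout when bit 31 says it is valid.
        "digital_readout": readout if reading_valid else None,
    }
```

A monitoring tool would typically skip or retry a sample for which `digital_readout` is `None`.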

  • Page 616

    • Critical Temperature Interrupt Enable (bit 4, R/W) - Enables the generation of an interrupt when the Critical Temperature Detector has detected a critical thermal condition. The recommended response to this condition is a system shutdown. Bit 4 = 0 disables the interrupt; bit 4 = 1 enables the i[...]

  • Page 617

    Vol. 3 15-1 CHAPTER 15 MACHINE-CHECK ARCHITECTURE This chapter describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. See Chapter 6, “Interrupt 18—Machine-Check Exception (#MC),” for more information on machine-check exceptions. A brief descripti[...]

  • Page 618

    15.2 COMPATIBILITY WITH PENTIUM PROCESSOR The Pentium 4, Intel Xeon, and P6 family processors support and extend the machine-check exception mechanism introduced in the Pentium processor. The Pentium processor reports the following machine-check errors: • data parity errors during read cycles • u[...]

  • Page 619

    Each error-reporting bank is associated with a specific hardware unit (or group of hardware units) in the processor. Use RDMSR and WRMSR to read and to write these registers. 15.3.1 Machine-Check Global Control MSRs The machine-check global control MSRs include the IA32_MCG_CAP, IA32_MCG_STATUS, and [...]

  • Page 620

    Where: • Count field, bits 7:0 - Indicates the number of hardware unit error-reporting banks available in a particular processor implementation. • MCG_CTL_P (control MSR present) flag, bit 8 - Indicates that the processor implements the IA32_MCG_CTL MSR when set; this register is absen [...]
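
A minimal sketch of decoding an IA32_MCG_CAP value, using the Count field and MCG_CTL_P flag described above together with bit 10 and bit 24 (MCG_SER_P), which other excerpts in this chapter reference. The function and key names are hypothetical; the code works on an integer already read from the MSR.

```python
def decode_mcg_cap(value: int) -> dict:
    """Decode selected IA32_MCG_CAP fields from a raw 64-bit value."""
    return {
        "bank_count": value & 0xFF,                  # bits 7:0
        "mcg_ctl_present": bool((value >> 8) & 1),   # MCG_CTL_P
        # Bit 10: the chapter ties this bit to IA32_MCi_CTL2 and the
        # architectural corrected-error count.
        "cap_bit_10": bool((value >> 10) & 1),
        "mcg_ser_p": bool((value >> 24) & 1),        # software error recovery
    }
```

The highest valid bank index is `bank_count - 1`, matching the `MAX_BANK_NUMBER ← COUNT - 1` step in the chapter's initialization pseudocode.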

  • Page 621

    Section 15.6), and IA32_MCi_STATUS MSR bits 56:55 are used to report the signaling of uncorrected recoverable errors and whether software must take recovery actions for uncorrected errors. Note that when MCG_TES_P is not set, bits 56:53 of the IA32_MCi_STATUS MSR are model-specific. If MCG_TES_P is set[...]

  • Page 622

    15.3.1.3 IA32_MCG_CTL MSR The IA32_MCG_CTL MSR is present if the capability flag MCG_CTL_P is set in the IA32_MCG_CAP MSR. IA32_MCG_CTL controls the reporting of machine-check exceptions. If present, writing 1s to this register enables machine-check features and writing all 0s disables machine-check featur[...]

  • Page 623

    encoding of 06H_1AH and onward): the operating system or executive software must not modify the contents of the IA32_MC0_CTL MSR. This MSR is internally aliased to the EBL_CR_POWERON MSR and controls platform-specific error handling features. System specific firmware (the BIOS) is responsible for the appr[...]

  • Page 624

    introduced with Intel 64 processor having CPUID DisplayFamily_DisplayModel encoding of 06H_1AH. Where: • MCA (machine-check architecture) error code field, bits 15:0 - Specifies the machine-check architecture-defined error code for the machine-check error condition detected. The machine-check architectur[...]

  • Page 625

    • If IA32_MCG_CAP[10] is 0, bits 52:38 also contain “Other Information” (in the same sense as bits 37:32). • If IA32_MCG_CAP[10] is 1, bits 52:38 are architectural (not model-specific). In this case, bits 52:38 report the value of a 15-bit counter that increments each time a corrected error is obs[...]

  • Page 626

    flag indicates that the error did not affect the processor’s state. Software restarting might be possible. • ADDRV (IA32_MCi_ADDR register valid) flag, bit 58 - Indicates (when set) that the IA32_MCi_ADDR register contains the address where the error occurred (see Section 15.3.2.3, “IA32_MCi_A[...]
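
The IA32_MCi_STATUS fields quoted in the excerpts above (MCA error code in bits 15:0, the corrected error count in bits 52:38 when IA32_MCG_CAP[10] is 1, and ADDRV in bit 58) can be decoded as in this sketch. The function and key names are hypothetical, and only fields whose positions appear in the excerpts are decoded.

```python
def decode_mci_status(status: int, mcg_cap: int) -> dict:
    """Decode selected IA32_MCi_STATUS fields from raw register values."""
    count_is_architectural = bool((mcg_cap >> 10) & 1)
    fields = {
        "mca_error_code": status & 0xFFFF,       # bits 15:0
        "addrv": bool((status >> 58) & 1),       # IA32_MCi_ADDR is valid
        "corrected_error_count": None,
    }
    if count_is_architectural:
        # Bits 52:38 form a 15-bit corrected error counter only when
        # IA32_MCG_CAP[10] is set; otherwise they are "Other Information".
        fields["corrected_error_count"] = (status >> 38) & 0x7FFF
    return fields
```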

  • Page 627

    In Table 15-2, the values in the two left-most columns are IA32_MCi_STATUS[54:53]. If a second event overwrites a previously posted event, the information (as guarded by individual valid bits) in the MCi bank is entirely from the second event. Similarly, if a first event is retained, all of the infor[...]

  • Page 628

    15.3.2.4 IA32_MCi_MISC MSRs The IA32_MCi_MISC MSR contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. The IA32_MCi_MISC_MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC[...]

  • Page 629

    • Recoverable Address LSB (bits 5:0): The lowest valid recoverable address bit. Indicates the position of the least significant bit (LSB) of the recoverable error address. For example, if the processor logs bits [43:9] of the address, the LSB sub-field in IA32_MCi_MISC is 01001b (9 decimal). For this ex[...]
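
The Recoverable Address LSB sub-field described above tells software how many low-order address bits are not meaningful. The following sketch extracts the sub-field and clears the invalid low bits of a logged address. The helper names are hypothetical, and masking the address this way is one plausible use of the field rather than a procedure quoted from the manual.

```python
def recoverable_address_lsb(misc: int) -> int:
    """Extract the Recoverable Address LSB sub-field (bits 5:0)."""
    return misc & 0x3F


def valid_address_bits(address: int, misc: int) -> int:
    """Clear the address bits below the reported LSB position."""
    lsb = recoverable_address_lsb(misc)
    return address & ~((1 << lsb) - 1)
```

With the excerpt's example value 01001b, the low nine bits of the logged address are cleared, leaving a 512-byte-aligned address.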

  • Page 630

    When IA32_MCG_CAP[10] = 1, the IA32_MCi_CTL2 MSR for each bank exists, i.e. reads and writes to these MSRs are supported. However, the signaling interface for corrected MC errors may not be supported in all banks. The layout of IA32_MCi_CTL2 is shown in Figure 15-8: • Corrected error count threshold, bi[...]

  • Page 631

    15.3.2.6 IA32_MCG Extended Machine Check State MSRs The Pentium 4 and Intel Xeon processors implement a variable number of extended machine-check state MSRs. The MCG_EXT_P flag in the IA32_MCG_CAP MSR indicates the presence of these extended registers, and the MCG_EXT_CNT field indicates the number of [...]

  • Page 632

    Table 15-5. Extended Machine Check State MSRs In Processors With Support For Intel 64 Architecture MSR Address Description IA32_MCG_RAX 180H Contains state of the RAX register at the time of the machine-check error. IA32_MCG_RBX 181H Contains state of the RBX register at the time of th[...]

  • Page 633

    When a machine-check error is detected on a Pentium 4 or Intel Xeon processor, the processor saves the state of the general-purpose registers, the R/EFLAGS register, and the R/EIP in these extended machine-check state MSRs. This information can be used by a debugger to analyze the error. These regis[...]

  • Page 634

    processor; the handler must be written to interpret P5_MC_TYPE encodings correctly. 15.4 ENHANCED CACHE ERROR REPORTING Starting with Intel Core Duo processors, cache error reporting was enhanced. In earlier Intel processors, cache status was based on the number of correction events that occurred in[...]

  • Page 635

    beyond those of threshold-based error reporting (Section 15.4). With threshold-based error reporting, software is limited to using periodic polling to query the status of hardware corrected MC errors. CMCI provides a signaling mechanism to deliver a local interrupt based on threshold values that softw[...]

  • Page 636

    CMCI interrupt delivery is configured by writing to the LVT CMCI register entry in the local APIC register space at the default address of APIC_BASE + 2F0H. A CMCI interrupt can be delivered to more than one logical processor if multiple logical processors are affected by the associated MC errors. For ex[...]

  • Page 637

    • Delivery status, bit 12 - A read-only bit that, when set, indicates that an interrupt from this source has been delivered to the processor core, but has not yet been accepted. • Mask, bit 16 - When set, inhibits reception of the interrupt. (Unlike the PerfMon LVT entry, this bit is not s[...]
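
The LVT CMCI details quoted in this chapter (register at APIC_BASE + 2F0H, delivery status in bit 12, mask in bit 16) can be sketched as follows. The helper names are hypothetical; the code computes addresses and values only and does not touch the local APIC.

```python
LVT_CMCI_OFFSET = 0x2F0       # offset from APIC_BASE, per the chapter
DELIVERY_STATUS_BIT = 12      # read-only: delivered but not yet accepted
MASK_BIT = 16                 # when set, inhibits reception of the interrupt


def lvt_cmci_address(apic_base: int) -> int:
    """Address of the LVT CMCI register for a given APIC base."""
    return apic_base + LVT_CMCI_OFFSET


def decode_lvt_cmci(value: int) -> dict:
    """Decode the delivery status and mask bits of an LVT CMCI value."""
    return {
        "delivery_pending": bool((value >> DELIVERY_STATUS_BIT) & 1),
        "masked": bool((value >> MASK_BIT) & 1),
    }


def unmask_cmci(value: int) -> int:
    """Return the register value with the mask bit cleared."""
    return value & ~(1 << MASK_BIT)
```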

  • Page 638

    b. Each thread examines the IA32_MCi_CTL2[30] indicator for each bank to determine if another thread has already claimed ownership of that bank. • If IA32_MCi_CTL2[30] has been set by another thread, this thread cannot own bank i and should proceed to step b. and examine the next machine check bank until al[...]

  • Page 639

    • Write 7FFFH to IA32_MCi_CTL2[15:0], • Read back IA32_MCi_CTL2[15:0]; the lower 15 bits (14:0) are the maximum threshold supported by the processor. b. Increase the threshold to a value below the maximum value discovered using step a. 15.5.2.3 CMCI Interrupt Handler The following describes techn[...]
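
The write-then-read-back probe described above can be sketched with injected `read` and `write` callables standing in for RDMSR and WRMSR on IA32_MCi_CTL2. Everything here is illustrative: `FakeCtl2` is a hypothetical stand-in that assumes the hardware clamps the threshold field to its supported maximum, which is how the read-back step is able to reveal that maximum.

```python
from typing import Callable


class FakeCtl2:
    """Hypothetical model of IA32_MCi_CTL2 that clamps the threshold field."""

    def __init__(self, max_threshold: int, initial: int = 0) -> None:
        self.max_threshold = max_threshold
        self.value = initial

    def write(self, value: int) -> None:
        threshold = min(value & 0x7FFF, self.max_threshold)
        self.value = (value & ~0x7FFF) | threshold

    def read(self) -> int:
        return self.value


def discover_max_threshold(read: Callable[[], int],
                           write: Callable[[int], None]) -> int:
    """Write 7FFFH to bits 15:0, read back, and return bits 14:0."""
    current = read()
    write((current & ~0xFFFF) | 0x7FFF)   # preserve the upper bits
    return read() & 0x7FFF
```

After the probe, software would write its chosen threshold, at or below the returned maximum, back into the same field.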

  • Page 640

    15.6.1 Detection of Software Error Recovery Support Software must use bit 24 of IA32_MCG_CAP (MCG_SER_P) to detect the presence of software error recovery support (see Figure 15-2). When IA32_MCG_CAP[24] is set, this indicates that the processor supports software error recovery. When this bit i[...]

  • Page 641

    • S (Signaling) flag, bit 56 - Indicates (when set) that a machine check exception was generated for the UCR error reported in this MC bank and system software needs to check the AR flag and the MCA error code fields in the IA32_MCi_STATUS register to identify the necessary recovery action for this erro[...]

  • Page 642

    IA32_MCi_STATUS register. Recovery actions for SRAO errors are MCA error code specific. The MISCV and the ADDRV flags in the IA32_MCi_STATUS register are set when the additional error information is available from the IA32_MCi_MISC and the IA32_MCi_ADDR registers. System software needs to inspect th[...]

  • Page 643

    15.6.4 UCR Error Overwrite Rules In general, the overwrite rules are as follows: • UCR errors will overwrite corrected errors. • Uncorrected (PCC=1) errors overwrite UCR (PCC=0) errors. • UCR errors are not written over previous UCR errors. • Corrected errors do not write over previous UCR error[...]
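
The four rules visible in the list above can be captured in a small lookup table. This is a sketch with hypothetical names; it encodes only the combinations stated in the excerpt and returns `None` for any pairing the excerpt does not cover, rather than guessing.

```python
from typing import Optional

CORRECTED = "corrected"
UCR = "ucr"                        # uncorrected recoverable, PCC=0
UNCORRECTED_PCC = "uncorrected"    # uncorrected, PCC=1

# (first logged error, second error) -> does the second overwrite the first?
OVERWRITE_RULES = {
    (CORRECTED, UCR): True,          # UCR errors overwrite corrected errors
    (UCR, UNCORRECTED_PCC): True,    # PCC=1 errors overwrite UCR errors
    (UCR, UCR): False,               # UCR is not written over a previous UCR
    (UCR, CORRECTED): False,         # corrected does not write over UCR
}


def second_overwrites_first(first: str, second: str) -> Optional[bool]:
    """Return True/False per the listed rules, or None if not covered."""
    return OVERWRITE_RULES.get((first, second))
```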

  • Page 644

    15.7 MACHINE-CHECK AVAILABILITY The machine-check architecture and machine-check exception (#MC) are model-specific features. Software can execute the CPUID instruction to determine whether a processor implements these features. Following the execution of the CPUID instruction, the settings of the M[...]

  • Page 645

    (* enables all MCA features *) FI (* Determine number of error-reporting banks supported *) COUNT ← IA32_MCG_CAP.Count; MAX_BANK_NUMBER ← COUNT - 1; IF (Processor Family is 6H and Processor EXTMODEL:MODEL is less than 1AH) THEN (* Enable logging of all errors except for MC0_CTL register *) FOR error-rep[...]

  • Page 646

    also write a 16-bit model-specific error code in the IA32_MCi_STATUS register depending on the implementation of the machine-check architecture of the processor. The MCA error codes are architecturally defined for Intel 64 and IA-32 processors. To determine the cause of a machine-check excepti[...]

  • Page 647

    15.9.2 Compound Error Codes Compound error codes describe errors related to the TLBs, memory, caches, bus and interconnect logic, and internal timer. A set of sub-fields is common to all compound errors. These sub-fields describe the type of access, level in the cache hierarchy, and type of request[...]

  • Page 648

    The behavior of error filtering after crossing the yellow threshold is model-specific. 15.9.2.2 Transaction Type (TT) Sub-Field The 2-bit TT sub-field (Table 15-10) indicates the type of transaction (data, instruction, or generic). The sub-field applies to the TLB, cache, and interconnect error[...]

  • Page 649

    caused the error. Eviction and snoop requests apply only to the caches. All of the other requests apply to TLBs, caches and interconnects. 15.9.2.5 Bus and Interconnect Errors The bus and interconnect errors are defined with the 2-bit PP (participation), 1-bit T (time-out), and 2-bit II (memory or I/O[...]

  • Page 650

    15.9.2.6 Memory Controller Errors The memory controller errors are defined with the 3-bit MMM (memory transaction type), and 4-bit CCCC (channel) sub-fields. The encodings for MMM and CCCC are defined in Table 15-14. 15.9.3 Architecturally Defined UCR Errors Software recoverable compound error code [...]

  • Page 651

    15-9). Their values and compound encoding format are given in Table 15-15. Table 15-16 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAO errors. For both the memory scrubbing and L3 explicit writeback errors, the ADDRV and MISCV flags in the IA32_MCi_STA[...]

  • Page 652

    IA32_MCG_STATUS register for the memory scrubbing and L3 explicit writeback errors on both the reporting and non-reporting logical processors. 15.9.3.2 Architecturally Defined SRAR Errors The following two SRAR errors are architecturally defined. • UCR Errors detected on data load; and • UCR Erro[...]

  • Page 653

    Table 15-19 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAR errors. For both the data load and instruction fetch errors, the ADDRV and MISCV flags in the IA32_MCi_STATUS register are set to indicate that the offending physical address information is availa[...]

  • Page 654

    For Instruction Fetch recoverable error, the affected logical processor should find that the RIPV flag and the EIPV Flag in the IA32_MCG_STATUS register are cleared, indicating that the error is detected at the instruction pointer saved on the stack may not be associated with this error and restarti[...]

  • Page 655

    • When multiple recoverable errors are reported and no other fatal condition (e.g., overflowed condition for SRAR error) is found for the reported recoverable errors, it is possible for system software to recover from the multiple recoverable errors by taking necessary recovery action for each individua[...]

  • Page 656

    Guidelines for writing a machine-check exception handler or a machine-error logging utility are given in the following sections. 15.10.1 Machine-Check Exception Handler The machine-check exception (#MC) corresponds to vector 18. To service machine-check exceptions, a trap gate must be added to the I[...]

  • Page 657

    generated). If this flag is clear, the processor may still be able to be restarted (for debugging purposes) but not without loss of program continuity. • For unrecoverable errors, the EIPV flag in the IA32_MCG_STATUS register indicates whether the instruction indicated by the instruction pointer push[...]

  • Page 658

    When machine-check exceptions are enabled for the Pentium processor (MCE flag is set in control register CR4), the machine-check exception handler uses the RDMSR instruction to read the error type from the P5_MC_TYPE register and the machine check address from the P5_MC_ADDR register[...]

  • Page 659

    AND PCC flag in IA32_MCi_STATUS = 1 OR RIPV flag in IA32_MCG_STATUS = 0 (* execution is not restartable *) THEN RESTARTABILITY = FALSE; return RESTARTABILITY to calling procedure; FI; Save time-stamp counter and processor ID; Set IA32_MCi_STATUS to all 0s; Execute serializing instruction (i.e., CPUID); [...]

  • Page 660

    mechanism to indicate the frequency of exceptions. A multiprocessing operating system stores the identity of the processor node incurring the exception using a unique identifier, such as the processor’s APIC ID (see Section 10.9, “Handling Interrupts”). The basic algorithm given in Example [...]

  • Page 661

    was corrected (UC=0) or uncorrected (UC=1). The MCE handler can optionally log and clear the corrected errors in the MC banks if it can implement a software algorithm to avoid the undesired race conditions with the CMCI or CMC polling handler. • For uncorrectable errors, the EIPV flag in the IA32_MCG_STA[...]

  • Page 662

    AR flag to find the type of the UCR error for software recovery and determine if software error recovery is possible. • When both the S and the AR flags are clear in the IA32_MCi_STATUS register for the UCR error (VAL=1, UC=1, EN=x and PCC=0), the error in this bank is an uncorrected no-action requi[...]

  • Page 663

    • When the OVER flag in the IA32_MCi_STATUS register is set for the SRAR error (VAL=1, UC=1, EN=1, PCC=0, S=1 and AR=1), the MCE handler cannot take recovery action as the information of the SRAR error in the IA32_MCi_STATUS register was potentially lost due to the overflow condition. Since the reco[...]

  • Page 664

    RESTARTABILITY = FALSE; FI FI; IF RESTARTABILITY = FALSE THEN Report RESTARTABILITY to console; Reset system; FI; IF MCA_BROADCAST = TRUE THEN IF ProcessorCount = MAX_PROCESSORS AND NOERROR = TRUE THEN Report RESTARTABILITY to console; Reset system; FI; Release SpinLock; Wait till ProcessorCount = MAX_PROCE[...]

  • Page 665

    IF PCC Flag in IA32_MCi_STATUS = 1 THEN (* processor context might have been corrupted *) RESTARTABILITY = FALSE; ELSE (* It is an uncorrected recoverable (UCR) error *) IF S Flag in IA32_MCi_STATUS = 0 THEN IF AR Flag in IA32_MCi_STATUS = 0 THEN (* It is an uncorrected no action required (UCNA) error *) GOTO[...]
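
The branch structure of the pseudocode above can be sketched as a classification function over a raw IA32_MCi_STATUS value. This is a hedged illustration, not the manual's handler: the function name and return labels are hypothetical, the caller is assumed to have already established VAL=1 and UC=1, the PCC position (bit 57) comes from the full register layout rather than these excerpts, and the S=1/AR=0 case is labeled SRAO based on the chapter's error taxonomy.

```python
PCC_BIT = 57   # assumption: from the full IA32_MCi_STATUS layout
S_BIT = 56     # Signaling flag, per the chapter
AR_BIT = 55    # Action Required flag (the chapter places S/AR in bits 56:55)


def classify_uncorrected_error(status: int) -> str:
    """Classify an uncorrected error (VAL=1, UC=1 assumed by the caller)."""
    pcc = (status >> PCC_BIT) & 1
    s = (status >> S_BIT) & 1
    ar = (status >> AR_BIT) & 1
    if pcc:
        return "UC"          # processor context may be corrupted
    if not s and not ar:
        return "UCNA"        # uncorrected, no action required
    if s and not ar:
        return "SRAO"        # software recoverable, action optional
    if s and ar:
        return "SRAR"        # software recoverable, action required
    return "RESERVED"        # S=0, AR=1 is not described in the excerpts
```

A handler would treat "UC" as not restartable and dispatch the three UCR classes to their respective logging or recovery paths.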

  • Page 666

    IF MISCV in IA32_MCi_STATUS THEN SAVE IA32_MCi_MISC; FI; IF ADDRV in IA32_MCi_STATUS THEN SAVE IA32_MCi_ADDR; FI; IF CLEAR_MC_BANK = TRUE THEN SET all 0 to IA32_MCi_STATUS; IF MISCV in IA32_MCi_STATUS THEN SET all 0 to IA32_MCi_MISC; FI; IF ADDRV in IA32_MCi_STATUS THEN SET all 0 to IA32_MCi_ADDR; FI; FI; CO[...]

  • Page 667

    before these errors are actually handled and processed by the MCE handler for attempted software error recovery. Example 15-5 gives pseudocode for a CMCI handler with UCR support. Example 15-5. Corrected Error Handler Pseudocode with UCR Support Corrected Error HANDLER: (* Called from CMCI handler or OS[...]

  • Page 668

    15-52 Vol. 3 MACHINE-CHECK ARCHITECTURE[...]

  • Page 669

    Vol. 3 16-1 CHAPTER 16 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Intel 64 and IA-32 architectures provide debug facilities for use in debugging code and monitoring performance. These facilities are valuable for debugging application software, system software, and multitasking operating systems. Debug support is accessed using debu[...]

  • Page 670

    instruction is an alternative way to set code breakpoints. It is especially useful when more than four breakpoints are desired, or when breakpoints are being placed in the source code. • Last branch recording facilities - Store branch records in the last branch record (LBR)[...]

  • Page 671

    • Whether the breakpoint condition was present when the debug exception was generated. The following paragraphs describe the functions of flags and fields in the debug registers. Figure 16-1. Debug Registers [bit-layout diagram not recoverable from the extracted text] [...]

  • Page 672

    16.2.1 Debug Address Registers (DR0-DR3) Each of the debug-address registers (DR0 through DR3) holds the 32-bit linear address of a breakpoint (see Figure 16-1). Breakpoint comparisons are made before physical address translation occurs. The contents of debug register DR7 furth[...]

  • Page 673

    exceptions, debug handlers should clear the register before returning to the interrupted task. 16.2.4 Debug Control Register (DR7) The debug control register (DR7) enables or disables breakpoints and sets breakpoint conditions (see Figure 16-1). The flags and fields in thi[...]

  • Page 674

    10 - Break on I/O reads or writes. 11 - Break on data reads or writes but not instruction fetches. When the DE flag is clear, the processor interprets the R/Wn bits the same as for the Intel386™ and Intel486™ processors, which is as follows: 00 - Break on instruction [...]
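
The R/Wn encodings listed above can be decoded as in the following sketch. Several details are assumptions that do not appear in this excerpt and are drawn from the general DR7 layout: that the R/W0 field occupies bits 17:16 with each later breakpoint's field four bits higher, that 01 means a break on data writes, and that 10 is undefined when the DE flag is clear. The function names are hypothetical.

```python
from typing import Optional

# Meanings of the 2-bit R/Wn field. Entries 0b10 and 0b11 are quoted from the
# excerpt; 0b00 and 0b01 are assumptions based on the general DR7 layout.
RW_MEANING = {
    0b00: "instruction execution",
    0b01: "data writes",
    0b10: "I/O reads or writes",
    0b11: "data reads or writes, not instruction fetches",
}


def rw_field(dr7: int, breakpoint: int) -> int:
    """Extract R/Wn for breakpoint n (0..3); assumes R/W0 is at bits 17:16."""
    if not 0 <= breakpoint <= 3:
        raise ValueError("breakpoint index must be 0..3")
    return (dr7 >> (16 + 4 * breakpoint)) & 0b11


def describe_breakpoint(dr7: int, breakpoint: int, de_flag: bool) -> Optional[str]:
    """Describe the break condition; None when the encoding is undefined."""
    encoding = rw_field(dr7, breakpoint)
    if encoding == 0b10 and not de_flag:
        return None  # assumed undefined when the DE flag is clear
    return RW_MEANING[encoding]
```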

  • Page 675

    the lower address bits in the debug registers. Unaligned data or I/O breakpoint addresses do not yield valid results. A data breakpoint for reading or writing data is triggered if any of the bytes participating in an access is within the range defined by a breakpoint address re[...]

  • Page 676

    16.2.6 Debug Registers and Intel® 64 Processors For Intel 64 architecture processors, debug registers DR0–DR7 are 64 bits. In 16-bit or 32-bit modes (protected mode and compatibility mode), writes to a debug register fill the upper 32 bits with zeros. Reads from a debug re[...]

  • Page 677

    16.3 DEBUG EXCEPTIONS The Intel 64 and IA-32 architectures dedicate two interrupt vectors to handling debug exceptions: vector 1 (debug exception, #DB) and vector 3 (breakpoint exception, #BP). The following sections describe how these exceptions are generated and typical exc[...]

  • Page 678

    See also: Chapter 6, “Interrupt 1—Debug Exception (#DB),” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 16.3.1.1 Instruction-Breakpoint Exception Condition The processor reports an instruction breakpoint when it attempts to exec[...]

  • Page 679

    (resume flag) in the EFLAGS register (see Section 2.3, “System Flags and Fields in the EFLAGS Register,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A). When the RF flag is set, the processor ignores instruction breakpoints. All Intel[...]

  • Page 680

    16.3.1.2 Data Memory and I/O Breakpoint Exception Conditions Data memory and I/O breakpoints are reported when the processor attempts to access a memory or I/O address specified in a breakpoint-address register (DR0 through DR3) that has been set up to detect data or I/O acces[...]

  • Page 681

    Vol. 3 16-13 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER single-step trap does not occur until after the instruction that follows the POPF instruction. The processor clears the TF flag before calling the exception handler. If the TF flag was set in a TSS at the time of a task switch, the exception occurs after the first instruction i[...]

  • Page 682

    16-14 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING OVERVIEW P6 family processors introduced the ability to set breakpoints on taken branches, interrupts, and exceptions, and to single-step from one branch to the next. This capability has been modified and extended in the[...]

  • Page 683

    Vol. 3 16-15 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER in the last branch record (LBR) stack. For more information, see Section 16.5.1, “LBR Stack”. • BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS register as a “single-step on branches” flag rather than a “si[...]

  • Page 684

    16-16 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • FREEZE_LBRS_ON_PMI flag (bit 11) — When set, the LBR stack is frozen on a hardware PMI request (e.g. when a counter overflows and is configured to trigger PMI). • FREEZE_PERFMON_ON_PMI flag (bit 12) — When set, a PMI request clears each of the “ENABLE” field of [...]

  • Page 685

    Vol. 3 16-17 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER a bug to a particular block of code before instruction single-stepping further narrows the search. If the BTF flag is set when the processor generates a debug exception, the processor clears the BTF flag along with the TF flag. The debugger must reset the BTF and TF flags befor[...]

  • Page 686

    16-18 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.6 CPL-Qualified Branch Trace Mechanism CPL-qualified branch trace mechanism is available to a subset of Intel 64 and IA-32 processors that support the branch trace storing mechanism. The processor supports the CPL-qualified branch trace mechanism if CPUID.01H:ECX[b[...]

  • Page 687

    Vol. 3 16-19 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8 LBR Stack The last branch record stack and top-of-stack (TOS) pointer MSRs are supported across Intel 64 and IA-32 processor families. However, the number of MSRs in the LBR stack and the valid range of TOS pointer value can vary between different processor familie[...]

  • Page 688

    16-20 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8.1 LBR Stack and Intel® 64 Processors LBR MSRs are 64-bits. If IA-32e mode is disabled, only the lower 32-bits of the address is recorded. If IA-32e mode is enabled, the processor writes 64-bit values into the MSR. In 64-bit mode, last branch records store 64-bit add[...]

  • Page 689

    Vol. 3 16-21 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8.3 Last Exception Records and Intel 64 Architecture Intel 64 and IA-32 processors also provide MSRs that store the branch record for the last branch taken prior to an exception or an interrupt. The location of the last exception record (LER) MSRs are model specific[...]

  • Page 690

    16-22 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER and is cleared on processor RESET and INIT. DS recording is available in real address mode. The BTS and PEBS facilities may not be available on all processors. The availability of these facilities is indicated by the BTS_UNAVAILABLE and PEBS_UNAVAILABLE flags, respectiv[...]

  • Page 691

    Vol. 3 16-23 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • PEBS absolute maximum — Linear address of the next byte past the end of the PEBS buffer. This address should be a multiple of the PEBS record size (40 bytes) plus 1. • PEBS interrupt threshold — Linear address of the PEBS record on which an interrupt is to be generat[...]

  • Page 692

    16-24 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • PEBS counter reset value — A 40-bit value that the counter is to be reset to after state information has been collected following counter overflow. This value allows state information to be collected after a preset number of events have been counted. Figure 16-6 shows the stru[...]

  • Page 693

    Vol. 3 16-25 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.9.1 DS Save Area and IA-32e Mode Operation When IA-32e mode is active (IA32_EFER.LMA = 1), the structure of the DS save area is shown in Figure 16-8. The organization of each field in IA-32e mode operation is similar to that of non-IA-32e mode operation. However, each f[...]

  • Page 694

    16-26 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER When IA-32e mode is active, the structure of a branch trace record is similar to that shown in Figure 16-6, but each field is 8 bytes in length. This makes each BTS record 24 bytes (see Figure 16-9). The structure of a PEBS record is similar to that shown in Figure 16-7, but [...]

  • Page 695

    Vol. 3 16-27 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Fields in the buffer management area of a DS save area are described in Section 16.4.9. The format of a branch trace record and a PEBS record are the same as the 64-bit record formats shown in Figures 16-9 and 16-10, with the exception that the branch predicted bit is n[...]

  • Page 696

    16-28 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER The procedures used to program IA32_DEBUG_CTRL MSR to set up a BTS buffer or a CPL-qualified BTS are described in Section 16.4.9.3 and Section 16.4.9.4. Required elements for writing a DS interrupt service routine are largely the same on processors that support using DS Save[...]

  • Page 697

    Vol. 3 16-29 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • It is recommended that the buffer size for the BTS buffer and the PEBS buffer be an integer multiple of the corresponding record sizes. • The precise event records buffer should be large enough to hold the number of precise event records that can occur while waiting for[...]
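The "integer multiple of the record size" recommendation can be sketched as a small sizing helper. The function name is hypothetical; the record sizes used in the usage note (24 bytes for a BTS record in IA-32e mode, 40 bytes for the PEBS record size quoted earlier) come from the surrounding text and may differ on other processors or modes.

```python
def fit_buffer_to_records(available_bytes: int, record_bytes: int) -> tuple[int, int]:
    """Round a buffer size down to a whole number of records.

    Returns (usable_bytes, record_count). Hypothetical helper for
    illustration only; it does not allocate or program anything.
    """
    if record_bytes <= 0:
        raise ValueError("record_bytes must be positive")
    record_count = available_bytes // record_bytes
    return record_count * record_bytes, record_count
```

For a 4096-byte region, `fit_buffer_to_records(4096, 24)` yields 170 records in 4080 bytes, and `fit_buffer_to_records(4096, 40)` yields 102 records, also in 4080 bytes.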

  • Page 698

    16-30 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 2. Set the TR and BTS flags in the IA32_DEBUGCTL for Intel Core Solo and Intel Core Duo processors or later processors (or MSR_DEBUGCTLA MSR for processors based on Intel NetBurst Microarchitecture; or MSR_DEBUGCTLB for Pentium M processors). 3. Clear the BTINT flag in the co[...]

  • Page 699

    Vol. 3 16-31 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.9.5 Writing the DS Interrupt Service Routine The BTS, non-precise event-based sampling, and PEBS facilities share the same interrupt vector and interrupt service routine (called the debug store interrupt service routine or DS ISR). To handle BTS, non-precise event-b[...]

  • Page 700

    16-32 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • The ISR must clear the mask bit in the performance counter LVT entry. • The ISR must re-enable the counters to count via IA32_PERF_GLOBAL_CTRL/IA32_PERF_GLOBAL_OVF_CTRL if it is servicing an overflow PMI due to PEBS (or via CCCR's ENABLE bit on processor based on Int[...]

  • Page 701

    Vol. 3 16-33 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.5.1 LBR Stack The last branch record stack and top-of-stack (TOS) pointer MSRs are supported across Intel Core 2, Intel Xeon and Intel Atom processor families. Four pairs of MSRs are supported in the LBR stack: • Last Branch Record (LBR) Stack — MSR_LASTBRANCH_0_FROM_IP [...]

  • Page 702

    16-34 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • Branch trace store and CPL-qualified BTS — See Section 16.4.6 and Section 16.4.5. • FREEZE_LBRS_ON_PMI flag (bit 11) — see Section 16.4.7. • FREEZE_PERFMON_ON_PMI flag (bit 12) — see Section 16.4.7. • FREEZE_WHILE_SMM_EN (bit 14) — FREEZE_WHILE_SMM_EN is [...]

  • Page 703

    Vol. 3 16-35 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Processors based on Intel microarchitecture (Nehalem) have an LBR MSR Stack as shown in Table 16-8. Table 16-8. LBR Stack Size and TOS Pointer Range 16.6.2 Filtering of Last Branch Records MSR_LBR_SELECT is cleared to zero at RESET, and LBR filtering is disabled, i.e. al[...]

  • Page 704

    16-36 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.7 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PROCESSORS BASED ON INTEL NETBURST® MICROARCHITECTURE) Pentium 4 and Intel Xeon processors based on Intel NetBurst microarchitecture provide the following methods for recording taken branches, interrupts and exce[...]

  • Page 705

    Vol. 3 16-37 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • IA32_MISC_ENABLE MSR — Indicates that the processor provides the BTS facilities. • Last branch record (LBR) stack — The LBR stack is a circular stack that consists of four MSRs (MSR_LASTBRANCH_0 through MSR_LASTBRANCH_3) for the Pentium 4 and Intel Xeon processor fa[...]

  • Page 706

    16-38 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • BTS (branch trace store) flag (bit 3) — When set, enables the BTS facilities to log BTMs to a memory-resident BTS buffer that is part of the DS save area. See Section 16.4.9, “BTS and DS Save Area.” • BTINT (branch trace interrupt) flag (bit 4) — When se[...]

  • Page 707

    Vol. 3 16-39 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER LBR MSR pair) that contains the most recent (last) branch record placed on the stack. Prior to placing a new branch record on the stack, the TOS is incremented by 1. When the TOS pointer reaches its maximum value, it wraps around to 0. See Table 16-10 and Figure 16-12. Table 1[...]
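The increment-then-store behavior of the TOS pointer can be modeled as a small circular buffer. This is a software sketch only: the class and method names are hypothetical, the initial TOS value of 0 is an assumption, and the stack size is a parameter because it varies between processor families.

```python
class LbrStackModel:
    """Toy model of a circular last-branch-record stack."""

    def __init__(self, size: int):
        self.records = [None] * size
        self.tos = 0  # assumed starting value, for illustration only

    def push(self, from_ip: int, to_ip: int) -> None:
        # TOS is incremented before the new record is stored and
        # wraps to 0 after reaching its maximum value.
        self.tos = (self.tos + 1) % len(self.records)
        self.records[self.tos] = (from_ip, to_ip)

    def newest_first(self) -> list:
        """Stored records, most recent first, skipping empty slots."""
        n = len(self.records)
        ordered = [self.records[(self.tos - i) % n] for i in range(n)]
        return [r for r in ordered if r is not None]
```

With a four-entry stack, the fifth push overwrites the oldest record, so reading back from TOS returns the four most recent branches.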

  • Page 708

    16-40 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Additional information is saved if an exception or interrupt occurs in conjunction with a branch instruction. If a branch instruction generates a trap type exception, two branch records are stored in the LBR stack: a branch record for the branch instruction followed by a bra[...]

  • Page 709

    Vol. 3 16-41 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.8 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (INTEL® CORE™ SOLO AND INTEL® CORE™ DUO PROCESSORS) Intel Core Solo and Intel Core Duo processors provide last branch interrupt and exception recording. This capability is almost identical to that found in [...]

  • Page 710

    16-42 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • Debug store (DS) feature flag (bit 21), returned by the CPUID instruction — Indicates that the processor provides the debug store (DS) mechanism, which allows BTMs to be stored in a memory-resident BTS buffer. See Section 16.4.5, “Branch Trace Store (BTS).” •[...]

  • Page 711

    Vol. 3 16-43 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.9 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS) Like the Pentium 4 and Intel Xeon processor family, Pentium M processors provide last branch interrupt and exception recording. The capability operates almost identically to that found in Pe[...]

  • Page 712

    16-44 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER — TR (trace message enable) flag (bit 6) — When set, branch trace messages are enabled. When the processor detects a taken branch, interrupt, or exception, it sends the branch record out on the system bus as a branch trace message (BTM). See Section 16[...]

  • Page 713

    Vol. 3 16-45 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER For more detail on these capabilities, see Section 16.7.3, “Last Exception Records,” and Appendix B.7, “MSRs In the Pentium M Processor.” 16.10 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) The P6 family processors provide five MSRs for[...]

  • Page 714

    16-46 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS register as a “single-step on branches” flag. See Section 16.4.3, “Single-Stepping on Branches, Exceptions, and Interrupts.” • P[...]

  • Page 715

    Vol. 3 16-47 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER tion or interrupt being generated. When an exception or interrupt occurs, the contents of the LastBranchToIP and LastBranchFromIP MSRs are copied into these registers before the to and from addresses of the exception or interrupt are recorded in the LastBranchToIP and LastB[...]

  • Page 716

    16-48 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.11 TIME-STAMP COUNTER The Intel 64 and IA-32 architectures (beginning with the Pentium processor) define a time-stamp counter mechanism that can be used to monitor and identify the relative time occurrence of processor events. The counter’s architecture includes the follo[...]

  • Page 717

    Vol. 3 16-49 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER NOTE To determine average processor clock frequency, Intel recommends the use of EMON logic to count processor core clocks over the period of time for which the average is required. See Section 30.10, “Counting Clocks,” and Appendix A, “Performance-Monitoring Event[...]

  • Page 718

    16-50 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.11.2 IA32_TSC_AUX Register and RDTSCP Support Processors based on Intel microarchitecture (Nehalem) provide an auxiliary TSC register, IA32_TSC_AUX, that is designed to be used in conjunction with IA32_TSC. IA32_TSC_AUX provides a 32-bit field that is initialized by privile[...]

  • Page 719

    Vol. 3 17-1 CHAPTER 17 8086 EMULATION IA-32 processors (beginning with the Intel386 processor) provide two ways to execute new or legacy programs that are assembled and/or compiled to run on an Intel 8086 processor: • Real-address mode. • Virtual-8086 mode. Figure 2-3 shows the relationship of these operating modes to protected mode and system[...]

  • Page 720

    17-2 Vol. 3 8086 EMULATION The following is a summary of the core features of the real-address mode execution environment as would be seen by a program written for the 8086: • The processor supports a nominal 1-MByte physical address space (see Section 17.1.1, “Address Translation in Real-Address Mode”, for specific details). This addres[...]

  • Page 721

    Vol. 3 17-3 8086 EMULATION • A single interrupt table, called the “interrupt vector table” or “interrupt table,” is provided for handling interrupts and exceptions (see Figure 17-2). The interrupt table (which has 4-byte entries) takes the place of the interrupt descriptor table (IDT, with 8-byte entries) used when handling protected[...]

  • Page 722

    17-4 Vol. 3 8086 EMULATION in real-address mode, however, the processor does not truncate such an address and uses it as a physical address. (Note, however, that for IA-32 processors beginning with the Intel486 processor, the A20M# signal can be used in real-address mode to mask address line A20, thereby mimicking the 20-bit wrap-around behav[...]
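The effect of masking address line A20 can be shown with a short calculation. This sketch uses the standard real-address-mode translation (16-bit segment shifted left by 4, plus a 16-bit offset); the function name and the `a20_masked` parameter are illustrative, not part of any real API.

```python
def real_mode_address(segment: int, offset: int, a20_masked: bool = False) -> int:
    """Address formed from a real-address-mode segment:offset pair."""
    address = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if a20_masked:
        address &= ~(1 << 20)  # force address bit 20 to 0, as A20M# would
    return address
```

For example, FFFFH:0010H produces 100000H when the address is not truncated, but 0 when A20 is masked, which reproduces the 8086-style wrap-around at 1 MByte.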

  • Page 723

    Vol. 3 17-5 8086 EMULATION • Move (MOV) instructions that move operands between general-purpose registers, segment registers, and between memory and general-purpose registers. • The exchange (XCHG) instruction. • Load segment register instructions LDS and LES. • Arithmetic instructions ADD, ADC, SUB, SBB, MUL, IMUL, DIV, IDIV, INC, DEC,[...]

  • Page 724

    17-6 Vol. 3 8086 EMULATION • Bit test and bit scan instructions BT, BTS, BTR, BTC, BSF, and BSR; the byte-set-on condition instruction SETcc; and the byte swap (BSWAP) instruction. • Double shift instructions SHLD and SHRD. • EFLAGS control instructions PUSHF and POPF. • ENTER and LEAVE control instructions. • BOUND instr[...]

  • Page 725

    Vol. 3 17-7 8086 EMULATION The interrupt vector table is an array of 4-byte entries (see Figure 17-2). Each entry consists of a far pointer to a handler procedure, made up of a segment selector and an offset. The processor scales the interrupt or exception vector by 4 to obtain an offset into the interrupt table. Following reset, the base of the[...]
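The vector-to-entry lookup described above can be sketched against a byte buffer standing in for memory. The function names and the `ivt_base` parameter are hypothetical; the sketch assumes the usual entry layout of a 16-bit offset in the low word followed by a 16-bit segment selector in the high word.

```python
import struct


def ivt_entry_offset(vector: int) -> int:
    """Byte offset of a vector's entry within the interrupt vector table."""
    return vector * 4  # each entry is a 4-byte far pointer


def read_ivt_entry(memory: bytes, vector: int, ivt_base: int = 0) -> tuple[int, int]:
    """Return (segment, offset) of the handler for the given vector."""
    offset, segment = struct.unpack_from("<HH", memory, ivt_base + ivt_entry_offset(vector))
    return segment, offset
```

Vector 21H, for instance, maps to byte offset 84H in the table, and the four bytes there are read as the handler's offset and segment.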

  • Page 726

    17-8 Vol. 3 8086 EMULATION 17.2 VIRTUAL-8086 MODE Virtual-8086 mode is actually a special type of a task that runs in protected mode. When the operating-system or executive switches to a virtual-8086-mode task, the processor emulates an Intel 8086 processor. The execution environment of the processor while in the 8086-emulation state is the s[...]

  • Page 727

    Vol. 3 17-9 8086 EMULATION 17.2.1 Enabling Virtual-8086 Mode The processor runs in virtual-8086 mode when the VM (virtual machine) flag in the EFLAGS register is set. This flag can only be set when the processor switches to a new protected-mode task or resumes virtual-8086 mode via an IRET instruction. System software cannot change the state of [...]

  • Page 728

    17-10 Vol. 3 8086 EMULATION The processor enters virtual-8086 mode to run the 8086 program and returns to protected mode to run the virtual-8086 monitor. The virtual-8086 monitor is a 32-bit protected-mode code module that runs at a CPL of 0. The monitor consists of initialization, interrupt- and exception-handling, and I/O emulation procedures[...]

  • Page 729

    Vol. 3 17-11 8086 EMULATION Paging is not necessary for a single virtual-8086-mode task, but paging is useful or necessary in the following situations: • When running multiple virtual-8086-mode tasks. Here, paging allows the lower 1 MByte of the linear address space for each virtual-8086-mode task to be mapped to a different physical address lo[...]

  • Page 730

    17-12 Vol. 3 8086 EMULATION When a task switch is used to enter virtual-8086 mode, the TSS for the virtual-8086-mode task must be a 32-bit TSS. (If the new TSS is a 16-bit TSS, the upper word of the EFLAGS register is not in the TSS, causing the processor to clear the VM flag when it loads the EFLAGS register.) The processor updates the VM fl[...]

  • Page 731

    Vol. 3 17-13 8086 EMULATION Figure 17-3. Entering and Leaving Virtual-8086 Mode (diagram; modes: Real-Address Mode, Protected Mode, Virtual-8086 Mode; blocks: Real Mode Code, Protected-Mode Tasks, Virtual-8086 Mode Tasks (8086 Programs), Virtual-8086 Monitor, Protected-Mode Interrupt and Exception Handlers; transition labels: RESET, PE=1, PE=0 or RESET, Task Switch, VM = 1, #GP Exception) [...]

  • Page 732

    17-14 Vol. 3 8086 EMULATION 17.2.6 Leaving Virtual-8086 Mode The processor can leave the virtual-8086 mode only through an interrupt or exception. The following are situations where an interrupt or exception will lead to the processor leaving virtual-8086 mode (see Figure 17-3): • The processor services a hardware interrupt generated to si[...]

  • Page 733

    Vol. 3 17-15 8086 EMULATION execution sequence after verifying that it was entered as a result of a HLT execution. See Section 17.3, “Interrupt and Exception Handling in Virtual-8086 Mode”, for information on leaving virtual-8086 mode to handle an interrupt or exception generated in virtual-8086 mode. 17.2.7 Sensitive Instructions When a[...]

  • Page 734

    17-16 Vol. 3 8086 EMULATION for another task. This differs from protected mode in which, if the CPL is less than or equal to the IOPL, I/O access is allowed without checking the I/O permission bit map. See Chapter 13, “Input/Output”, in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, for more information about[...]
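The contrast between the two modes can be sketched as a single decision function. This is an illustrative model, not processor pseudocode: the function name and parameters are hypothetical, a set bit in the bit map is taken to mean "access denied", and ports that fall beyond the supplied bit map are treated as denied.

```python
def io_access_allowed(port: int, size: int, io_bitmap: bytes,
                      cpl: int, iopl: int, virtual_8086: bool) -> bool:
    """Decide whether an I/O access of `size` bytes at `port` is permitted."""
    # Protected mode: CPL <= IOPL grants access without consulting the bit map.
    if not virtual_8086 and cpl <= iopl:
        return True
    # Otherwise every byte of the access must have a clear permission bit.
    for p in range(port, port + size):
        byte_index, bit = divmod(p, 8)
        if byte_index >= len(io_bitmap):
            return False  # assumption: ports past the bit map are denied
        if io_bitmap[byte_index] & (1 << bit):
            return False
    return True
```

In this model a virtual-8086-mode access always goes through the bit map, while a protected-mode access at CPL 0 with IOPL 0 is allowed even if the corresponding bit is set.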

  • Page 735

    Vol. 3 17-17 8086 EMULATION In virtual-8086 mode, the interrupts and exceptions are divided into three classes for the purposes of handling: • Class 1 — All processor-generated exceptions and all hardware interrupts, including the NMI interrupt and the hardware interrupts sent to the processor’s external interrupt delivery pins. All class 1[...]

  • Page 736

    17-18 Vol. 3 8086 EMULATION in the previous paragraphs. These sections describe three possible types of interrupt and exception handlers: • Protected-mode interrupt and exceptions handlers — These are the standard handlers that the processor calls through the protected-mode IDT. • Virtual-8086 monitor interrupt and exception handlers — T[...]

  • Page 737

    Vol. 3 17-19 8086 EMULATION save and restore these registers regardless of the type of segment selectors they contain (protected-mode or 8086-style). The interrupt and exception handlers, which may be called in the context of either a protected-mode task or a virtual-8086-mode task, can use the same code sequences for saving and restoring the regist[...]

  • Page 738

    17-20 Vol. 3 8086 EMULATION Interrupt and exception handlers can examine the VM flag on the stack to determine if the interrupted procedure was running in virtual-8086 mode. If so, the interrupt or exception can be handled in one of three ways: • The protected-mode interrupt or exception handler that was called can handle the interrupt or ex[...]

  • Page 739

    Vol. 3 17-21 8086 EMULATION 2. Store the EFLAGS (low-order 16 bits only), CS and EIP values of the 8086 program on the privilege-level 3 stack. This is the stack that the virtual-8086-mode task is using. (The 8086 handler may use or modify this information.) 3. Change the return link on the privilege-level 0 stack to point to the privilege-lev[...]

  • Page 740

    17-22 Vol. 3 8086 EMULATION executed must be 0, otherwise the processor does not change the state of the VM flag. 17.3.2 Class 2—Maskable Hardware Interrupt Handling in Virtual-8086 Mode Using the Virtual Interrupt Mechanism Maskable hardware interrupts are those interrupts that are delivered through the INTR# pin or through an interrupt[...]

  • Page 741

    Vol. 3 17-23 8086 EMULATION CLI instruction, the processor clears the VIF flag to request that the virtual-8086 monitor inhibit maskable hardware interrupts from interrupting program execution; when it executes the STI instruction, the processor sets the VIF flag requesting that the virtual-8086 monitor enable maskable hardware interrupts for[...]

  • Page 742

    17-24 Vol. 3 8086 EMULATION 5. Upon returning to virtual-8086 mode, the processor continues execution of the 8086 program. When the 8086 program is ready to receive maskable hardware interrupts, it executes the STI instruction to set the VIF flag (enabling maskable hardware interrupts). Prior to setting the VIF flag, the processor automatically [...]

  • Page 743

    Vol. 3 17-25 8086 EMULATION tions in virtual-8086 mode in the same manner as an Intel386 or Intel486 processor does. When this flag is set, the virtual mode extension provides the following enhancements to virtual-8086 mode: • Speeds up the handling of software-generated interrupts in virtual-8086 mode by allowing the processor to bypass the vir[...]

  • Page 744

    17-26 Vol. 3 8086 EMULATION Table 17-2. Software Interrupt Handling Methods While in Virtual-8086 Mode. Columns: Method | VME | IOPL | Bit in Redir. Bitmap* | Processor Action. Row: 1 | 0 | 3 | X | Interrupt directed to a protected-mode interrupt handler: • Switches to privilege-level 0 stack • Pushes GS, FS, DS and ES onto privilege-level 0 stack • Pushes[...]

  • Page 745

    Vol. 3 17-27 8086 EMULATION Redirecting software interrupts back to the 8086 program potentially speeds up interrupt handling because a switch back and forth between virtual-8086 mode and protected mode is not required. This latter interrupt-handling technique is particularly useful for 8086 operating systems (such as MS-DOS) that use the INT [...]

  • Page 746

    17-28 Vol. 3 8086 EMULATION rupt handler in the protected-mode IDT pointed to by the interrupt vector. See Section 17.3.1, “Class 1—Hardware Interrupt and Exception Handling in Virtual-8086 Mode”, for a complete description of this mechanism and its possible uses. 17.3.3.2 Methods 2 and 3: Software Interrupt Handling When a software int[...]

  • Page 747

    Vol. 3 17-29 8086 EMULATION 3. Clears the IF flag in the EFLAGS register to disable interrupts. 4. Clears the TF flag in the EFLAGS register. 5. Locates the 8086 program interrupt vector table at linear address 0 for the 8086-mode task. 6. Loads the CS and EIP registers with values from the interrupt vector table entry pointed to by the inte[...]

  • Page 748

    17-30 Vol. 3 8086 EMULATION cient means of handling maskable hardware interrupts that occur during a virtual-8086 mode task. Also, because the IOPL value is less than 3 and the VIF flag is enabled, the information pushed on the stack by the processor when invoking the interrupt handler is slightly different between methods 5 and 6 (see Tabl[...]

  • Page 749

    Vol. 3 17-31 8086 EMULATION It is only possible to enter virtual-8086 mode through a task switch or the execution of an IRET instruction, and it is only possible to leave virtual-8086 mode by faulting to a protected-mode interrupt handler (typically the general-protection exception handler, which in turn calls the virtual 8086-mode monitor). [...]

  • Page 750

    17-32 Vol. 3 8086 EMULATION[...]

  • Page 751

    Vol. 3 18-1 CHAPTER 18 MIXING 16-BIT AND 32-BIT CODE Program modules written to run on IA-32 processors can be either 16-bit modules or 32-bit modules. Table 18-1 shows the characteristics of 16-bit and 32-bit modules. The IA-32 processors function most efficiently when executing 32-bit program modules. They can, however, also execute 16-b[...]

  • Page 752

    18-2 Vol. 3 MIXING 16-BIT AND 32-BIT CODE 18.1 DEFINING 16-BIT AND 32-BIT PROGRAM MODULES The following IA-32 architecture mechanisms are used to distinguish between and support 16-bit and 32-bit segments and operations: • The D (default operand and address size) flag in code-segment descriptors. • The B (default stack size) flag in stack [...]

  • Page 753

    Vol. 3 18-3 MIXING 16-BIT AND 32-BIT CODE These prefixes reverse the default size selected by the D flag in the code-segment descriptor. For example, the processor can interpret the (MOV mem, reg) instruction in any of four ways: • In a 32-bit code segment: — Moves 32 bits from a 32-bit register to memory using a 32-bit effective address.[...]
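The way the prefixes reverse the D-flag default can be tabulated with a short function. The name `effective_sizes` and its parameters are illustrative; the sketch covers only the 16-bit and 32-bit cases discussed in this chapter.

```python
def effective_sizes(d_flag: bool, operand_prefix: bool = False,
                    address_prefix: bool = False) -> tuple[int, int]:
    """Return (operand_size, address_size) in bits for one instruction."""
    default = 32 if d_flag else 16
    reversed_size = 16 if d_flag else 32
    operand = reversed_size if operand_prefix else default
    address = reversed_size if address_prefix else default
    return operand, address
```

In a 32-bit code segment (D = 1) the four prefix combinations give (32, 32), (16, 32), (32, 16), and (16, 16); in a 16-bit segment the same combinations are mirrored.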

  • Page 754

    18-4 Vol. 3 MIXING 16-BIT AND 32-BIT CODE 18.3 SHARING DATA AMONG MIXED-SIZE CODE SEGMENTS Data segments can be accessed from both 16-bit and 32-bit code segments. When a data segment that is larger than 64 KBytes is to be shared among 16- and 32-bit code segments, the data that is to be accessed from the 16-bit code segments must be located wit[...]

  • Page 755

    Vol. 3 18-5 MIXING 16-BIT AND 32-BIT CODE Likewise, there are three ways for a procedure in a 32-bit code segment to safely make a call to a 16-bit code segment: • Make the call through a 16-bit call gate. Here, the EIP value at the CALL instruction cannot exceed FFFFH. • Make a 32-bit call to a 16-bit interface procedure. The interface procedur[...]

  • Page 756

    18-6 Vol. 3 MIXING 16-BIT AND 32-BIT CODE instruction (see Figure 18-1). On a 16-bit call, the processor pushes the contents of the 16-bit IP register and (for calls between privilege levels) the 16-bit SP register. The matching RET instruction must also use a 16-bit operand size to pop these 16-bit values from the stack into the 16-bit register[...]

  • Page 757

    Vol. 3 18-7 MIXING 16-BIT AND 32-BIT CODE While executing 32-bit code, if a call is made to a 16-bit code segment which is at the same or a more privileged level (that is, the DPL of the called code segment is less than or equal to the CPL of the calling code segment) through a 16-bit call gate, then the upper 16-bits of the ESP register may be unr[...]

  • Page 758

    18-8 Vol. 3 MIXING 16-BIT AND 32-BIT CODE segments can be modified to safely call procedures to 32-bit code segments in either of two ways: • Relink the CALL instruction to point to 32-bit call gates (see Section 18.4.2.2, “Passing Parameters With a Gate”). • Add a 32-bit operand-size prefix to each CALL instruction. 18.4.2.2 Passing P[...]

  • Page 759

    Vol. 3 18-9 MIXING 16-BIT AND 32-BIT CODE 18.4.5 Writing Interface Procedures Placing interface code between 32-bit and 16-bit procedures can be the solution to the following interface problems: • Allowing procedures in 16-bit code segments to call procedures with offsets greater than FFFFH in 32-bit code segments. • Matching operand-size [...]

  • Page 760

    18-10 Vol. 3 MIXING 16-BIT AND 32-BIT CODE[...]

  • Page 761

    Vol. 3 19-1 CHAPTER 19 ARCHITECTURE COMPATIBILITY Intel 64 and IA-32 processors are binary compatible. Compatibility means that, within limited constraints, programs that execute on previous generations of processors will produce identical results when executed on later processors. The compatibility constraints and any implementation dif[...]

  • Page 762

    19-2 Vol. 3 ARCHITECTURE COMPATIBILITY • Pentium D Processors — A family of dual-core Intel 64 processors that provides two processor cores in a physical package. Each core is based on the Intel NetBurst microarchitecture. • Pentium Processor Extreme Editions — A family of dual-core Intel 64 processors that provides two processor cores [...]

  • Page 763

    Vol. 3 19-3 ARCHITECTURE COMPATIBILITY original value results in a general-protection exception (#GP). So, programs that execute on the P6 family and Pentium processors cannot erroneously enable functions that may be implemented in future IA-32 processors. The P6 family and Pentium processors do not check for attempts to set reserved bits i[...]

  • Page 764

    19-4 Vol. 3 ARCHITECTURE COMPATIBILITY control and status register. These instructions and registers are designed to allow SIMD computations to be made on single-precision floating-point numbers. Several of these new instructions also operate in the MMX registers. SSE instructions and registers are described in Section 10, “Programming wit[...]

  • Page 765

    Vol. 3 19-5 ARCHITECTURE COMPATIBILITY 19.10 INTEL HYPER-THREADING TECHNOLOGY Intel Hyper-Threading Technology provides two logical processors that can execute two separate code streams (called threads) concurrently by using shared resources in a single processor core or in a physical package. This feature was introduced in the Intel Xeon [...]

  • Page 766

    19-6 Vol. 3 ARCHITECTURE COMPATIBILITY 19.13.1 Instructions Added Prior to the Pentium Processor The following instructions were added in the Intel486 processor: • BSWAP (byte swap) instruction. • XADD (exchange and add) instruction. • CMPXCHG (compare and exchange) instruction. • INVD (invalidate cache) instruction. • WBINVD (wri[...]

  • Page 767

    Vol. 3 19-7 ARCHITECTURE COMPATIBILITY • Single-bit instructions. • Bit scan instructions. • Double-shift instructions. • Byte set on condition instruction. • Move with sign/zero extension. • Generalized multiply instruction. • MOV to and from control registers. • MOV to and from test registers (now obsolete). • MOV to and from [...]

  • Page 768

    19-8 Vol. 3 ARCHITECTURE COMPATIBILITY The following flags were added to the EFLAGS register in the Pentium processor: • VIF (virtual interrupt flag), bit 19. • VIP (virtual interrupt pending), bit 20. • ID (identification flag), bit 21. The AC flag (bit 18) was added to the EFLAGS register in the Intel486 processor. 19.16.1 Using EFLAGS[...]
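The bit positions listed above translate directly into masks. The constant and function names below are illustrative choices; only the bit numbers come from the text.

```python
# Bit positions taken from the list above; the names are illustrative.
EFLAGS_BITS = {
    "AC": 18,   # added in the Intel486 processor
    "VIF": 19,  # added in the Pentium processor
    "VIP": 20,  # added in the Pentium processor
    "ID": 21,   # added in the Pentium processor
}


def flags_set(eflags: int) -> list[str]:
    """Names of the listed flags that are set in an EFLAGS image."""
    return [name for name, bit in EFLAGS_BITS.items() if eflags & (1 << bit)]
```

For instance, an EFLAGS image with bits 19 and 21 set decodes to VIF and ID.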

  • Page 769

    Vol. 3 19-9 ARCHITECTURE COMPATIBILITY XCHG BP, [BP] This code functions as the 8086 processor PUSH SP instruction on the P6 family, Pentium, Intel486, Intel386, and Intel 286 processors. 19.17.2 EFLAGS Pushed on the Stack The setting of the stored values of bits 12 through 15 (which includes the IOPL field and the NT flag) in the EFLAGS reg[...]

  • Page 770

    19-10 Vol. 3 ARCHITECTURE COMPATIBILITY math coprocessor (flag is clear) or an Intel 387 DX math coprocessor (flag is set). This bit is hardwired to 1 in the P6 family, Pentium, and Intel486 processors. The NE (Numeric Exception) flag (bit 5 of the CR0 register) is used in the P6 family, Pentium, and Intel486 processors to determine whether u[...]

  • Page 771

    Vol. 3 19-11 ARCHITECTURE COMPATIBILITY On the 32-bit x87 FPUs, the C2 flag serves as an incomplete flag for the FPTAN instruction. On the 16-bit IA-32 math coprocessors, the C2 flag is undefined for the FPTAN instruction. This difference has no impact on software, because Intel 287 or 8087 programs do not check C2 after an FPTAN instruc[...]

  • Page 772

    19-12 Vol. 3 ARCHITECTURE COMPATIBILITY Software written to run on a 16-bit IA-32 math coprocessor may not operate correctly on a 32-bit x87 FPU, if it uses the FLDENV, FRSTOR, or FXRSTOR instructions to change tags to values (other than to empty) that are different from actual register contents. The encoding in the tag word for the 32-bi[...]

  • Page 773

    Vol. 3 19-13 ARCHITECTURE COMPATIBILITY ters. The only effect may be in how software handles the tags in the tag word (see also: Section 19.18.4, “x87 FPU Tag Word”). 19.18.6 Floating-Point Exceptions This section identifies the implementation differences in exception handling for floating-point instructions in the various x87 FPUs a[...]

  • Page 774

    19-14 Vol. 3 ARCHITECTURE COMPATIBILITY The difference is apparent only to the exception handler. This difference is for IEEE Standard 754 compatibility. 19.18.6.3 Numeric Underflow Exception (#U) When the underflow exception is masked on the 32-bit x87 FPUs, the underflow exception is signaled when both the result is tiny and denormalizati[...]

  • Page 775

    Vol. 3 19-15 ARCHITECTURE COMPATIBILITY the 8087 interrupt, both exception vectors should call the floating-point-error exception handler. Some instructions in a floating-point-error exception handler may need to be deleted if they use the interrupt controller. The P6 family, Pentium, and Intel486 processors have signals that, with the a[...]

  • Page 776

    19-16 Vol. 3 ARCHITECTURE COMPATIBILITY 19.18.6.9 Alignment Check Exceptions (#AC) If alignment checking is enabled, a misaligned data operand on the P6 family, Pentium, and Intel486 processors causes an alignment check exception (#AC) when a program or procedure is running at privilege-level 3, except for the stack portion of the FSAVE/FNS[...]

  • Page 777

    Vol. 3 19-17 ARCHITECTURE COMPATIBILITY 19.18.7 Changes to Floating-Point Instructions This section identifies the differences in floating-point instructions for the various Intel FPU and math coprocessor architectures, the reason for the differences, and their impact on software. 19.18.7.1 FDIV, FPREM, and FSQRT Instructions The 32-bit x87 F[...]

  • Page 778

    19-18 Vol. 3 ARCHITECTURE COMPATIBILITY tions do not exist on the 16-bit IA-32 math coprocessors. The availability of these new instructions has no impact on existing software. 19.18.7.6 FPTAN Instruction On the 32-bit x87 FPUs, the range of the operand for the FPTAN instruction is much less restricted (|ST(0)| < 2^63) than on earlier[...]
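
The operand bound quoted above, |ST(0)| < 2^63, can be expressed as a simple range test. This is a rough C sketch under stated assumptions: the function name is hypothetical, and a `double` stands in for the 80-bit extended-real value that ST(0) actually holds, so it only approximates the hardware check.

```c
/* Hypothetical helper: 1 if x satisfies |x| < 2^63, the FPTAN operand
 * range the text gives for 32-bit x87 FPUs; 0 otherwise. NaN yields 0
 * because every ordered comparison with NaN is false. */
int fptan_operand_in_range(double x)
{
    const double limit = 0x1p63;          /* 2^63 as a C99 hex float literal */
    double magnitude = (x < 0.0) ? -x : x;
    return magnitude < limit;
}
```

A caller could use this to decide whether an argument needs range reduction before it is handed to a tangent routine.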

  • Page 779

    Vol. 3 19-19 ARCHITECTURE COMPATIBILITY arithmetic. The 16-bit IA-32 math coprocessors do report a denormal-operand exception in this situation. This difference does not affect existing software. On the 32-bit x87 FPUs, loading a denormal value that is in single- or double-real format causes the value to be converted to extended-real format.[...]

  • Page 780

    19-20 Vol. 3 ARCHITECTURE COMPATIBILITY FPUs handle all addressing and exception-pointer information, whether in protected mode or not. 19.18.7.15 FXAM Instruction With the 32-bit x87 FPUs, if the FPU encounters an empty register when executing the FXAM instruction, it does not generate combinations of C0 through C3 equal to 1101 or 1111. The 16-b[...]

  • Page 781

    Vol. 3 19-21 ARCHITECTURE COMPATIBILITY 19.18.10 WAIT/FWAIT Prefix Differences On the Intel486 processor, when a WAIT/FWAIT instruction precedes a floating-point instruction (one which itself automatically synchronizes with the previous floating-point instruction), the WAIT/FWAIT instruction is treated as a no-op. Pending floating-poi[...]

  • Page 782

    19-22 Vol. 3 ARCHITECTURE COMPATIBILITY 19.20 FPU AND MATH COPROCESSOR INITIALIZATION Table 9-1 shows the states of the FPUs in the P6 family, Pentium, Intel486 processors and of the Intel 387 math coprocessor and Intel 287 coprocessor following a power-up, reset, or INIT, or following the execution of an FINIT/FNINIT instruction. The follo[...]

  • Page 783

    Vol. 3 19-23 ARCHITECTURE COMPATIBILITY Following is an example code sequence to initialize the system and check for the presence of Intel486 SX processor/Intel 487 SX math coprocessor.
        fninit
        fstcw mem_loc
        mov ax, mem_loc
        cmp ax, 037fh
        jz Intel487_SX_Math_CoProcessor_present ;ax=037fh
        jmp Intel486_SX_microprocessor_present ;ax=ffffh
    If the[...]
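
The decision made by the `cmp`/`jz`/`jmp` tail of the sequence above can be mirrored in C once the control word has been stored to memory. This is only an illustrative sketch: the function name is hypothetical, and it assumes the caller has already obtained the 16-bit value that `fstcw` wrote after `fninit`.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the branch in the assembly sequence:
 * returns 1 when the stored control word equals 037FH (the value the
 * listing associates with an Intel 487 SX math coprocessor being
 * present), and 0 for any other value. The listing's comment expects
 * FFFFH in the coprocessor-absent case. */
int x87_present_from_control_word(uint16_t control_word)
{
    return control_word == 0x037F;
}
```

Like the original `jz`/`jmp` pair, this treats every value other than 037FH as "not present" rather than testing for FFFFH explicitly.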

  • Page 784

    19-24 Vol. 3 ARCHITECTURE COMPATIBILITY 19.21 CONTROL REGISTERS The following sections identify the new control registers and control register flags and fields that were introduced to the 32-bit IA-32 in various processor families. See Figure 2-6 for the location of these flags and fields in the control registers. The Pentium III processor i[...]

  • Page 785

    Vol. 3 19-25 ARCHITECTURE COMPATIBILITY
      • NE — Numeric error. Enables the normal mechanism for reporting floating-point numeric errors.
      • WP — Write protect. Write-protects read-only pages against supervisor-mode accesses.
      • AM — Alignment mask. Controls whether alignment checking is performed. Operates in conjunction with the AC [...]

  • Page 786

    19-26 Vol. 3 ARCHITECTURE COMPATIBILITY 19.22.1.2 Global Pages The new PGE (page global enable) flag in control register CR4, bit 7, provides a mechanism for preventing frequently used pages from being flushed from the translation lookaside buffer (TLB). When this flag is set, frequently used pages (such as pages containing kernel procedure[...]
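
As a small illustration of the bit position named above (PGE is bit 7 of CR4), the sketch below manipulates a copy of the register value in C. The names are hypothetical, and the code deliberately works on a plain integer image: actually reading or writing CR4 requires privileged instructions that are outside the scope of this example.

```c
#include <stdint.h>

#define CR4_PGE (UINT32_C(1) << 7)   /* page global enable, bit 7 per the text */

/* Hypothetical helper: returns a CR4 image with PGE set or cleared. */
uint32_t cr4_with_pge(uint32_t cr4_image, int enable)
{
    return enable ? (cr4_image | CR4_PGE) : (cr4_image & ~CR4_PGE);
}

/* Hypothetical helper: 1 if PGE is set in the image, else 0. */
int cr4_pge_enabled(uint32_t cr4_image)
{
    return (cr4_image & CR4_PGE) != 0;
}
```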

  • Page 787

    Vol. 3 19-27 ARCHITECTURE COMPATIBILITY 19.22.4 Changes in Segment Descriptor Loads On the Intel386 processor, loading a segment descriptor always causes a locked read and write to set the accessed bit of the descriptor. On the P6 family, Pentium, and Intel486 processors, the locked read and write occur only if the bit is not already set[...]

  • Page 788

    19-28 Vol. 3 ARCHITECTURE COMPATIBILITY are enabled (the DE flag is set), attempts to reference registers DR4 or DR5 will result in an invalid-opcode exception (#UD). 19.24 RECOGNITION OF BREAKPOINTS For the Pentium processor, it is recommended that debuggers execute the LGDT instruction before returning to the program being debugged to ens[...]

  • Page 789

    Vol. 3 19-29 ARCHITECTURE COMPATIBILITY may not be implemented or implemented differently in future processors. The MCE flag in control register CR4 enables the machine-check exception. When this bit is clear (which it is at reset), the processor inhibits generation of the machine-check exception.
      • General-protection exception (#GP, interrup[...]

  • Page 790

    19-30 Vol. 3 ARCHITECTURE COMPATIBILITY 19.25.1 Machine-Check Architecture The Pentium Pro processor introduced a new architecture to the IA-32 for handling and reporting on machine-check exceptions. This machine-check architecture (described in detail in Chapter 15, “Machine-Check Architecture”) greatly expands the ability of the proc[...]

  • Page 791

    Vol. 3 19-31 ARCHITECTURE COMPATIBILITY 19.26.3 IDT Limit The LIDT instruction can be used to set a limit on the size of the IDT. A double-fault exception (#DF) is generated if an interrupt or exception attempts to read a vector beyond the limit. Shutdown then occurs on the 32-bit IA-32 processors if the double-fau[...]

  • Page 792

    19-32 Vol. 3 ARCHITECTURE COMPATIBILITY
      • The remote read delivery mode provided in the 82489DX and local APIC for Pentium processors is not supported in the local APIC in the Pentium 4, Intel Xeon, and P6 family processors.
      • For the 82489DX, in the lowest priority delivery mode, all the target local APICs specified by the destination fiel[...]

  • Page 793

    Vol. 3 19-33 ARCHITECTURE COMPATIBILITY 19.28.1 P6 Family and Pentium Processor TSS When the virtual mode extensions are enabled (by setting the VME flag in control register CR4), the TSS in the P6 family and Pentium processors contains an interrupt redirection bit map, which is used in virtual-8086 mode to redirect interrupts back to an 8086 p[...]

  • Page 794

    19-34 Vol. 3 ARCHITECTURE COMPATIBILITY than 0DFFFH, the Intel486 processor will not wrap around and access incorrect locations within the TSS for I/O port validation and the P6 family and Pentium processors will not experience general-protection exceptions (#GP). Figure 19-1 demonstrates the different areas accessed by the Intel486 and the [...]

  • Page 795

    Vol. 3 19-35 ARCHITECTURE COMPATIBILITY data cache and L2 cache of the P6 family processors. In the Intel486 processor, setting these flags to (00B) enables write-through for the cache. External system hardware can force the Pentium processor to disable caching or to use the write-through cache policy should that be required. In the P6 family p[...]

  • Page 796

    19-36 Vol. 3 ARCHITECTURE COMPATIBILITY 19.29.2 Disabling the L3 Cache A unified third-level (L3) cache in processors based on Intel NetBurst microarchitecture (see Section 11.1, “Internal Caches, TLBs, and Buffers”) provides the third-level cache disable flag, bit 6 of the IA32_MISC_ENABLE MSR. The third-level cache disable flag all[...]

  • Page 797

    Vol. 3 19-37 ARCHITECTURE COMPATIBILITY 19.30.3 Enabling and Disabling Paging Paging is enabled and disabled by loading a value into control register CR0 that modifies the PG flag. For backward and forward compatibility with all IA-32 processors, Intel recommends that the following operations be performed when enabling or disabling paging[...]

  • Page 798

    19-38 Vol. 3 ARCHITECTURE COMPATIBILITY
      • The initial stack pointer is FFFCH (32-bit operand) or FFFEH (16-bit operand) and will wrap around to 0H as a result of the POP operation. The result of the memory write is implementation-specific. For example, in P6 family processors, the result of the memory write is SS:0H plus any scaled index and[...]
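
The wraparound described in this bullet is ordinary 16-bit modular arithmetic. The following C sketch, with a hypothetical function name, assumes a 16-bit stack pointer and models only the pointer update performed by POP, not the implementation-specific memory write the text goes on to discuss.

```c
#include <stdint.h>

/* Hypothetical model: value of a 16-bit stack pointer after a POP that
 * removes `operand_bytes` bytes (2 for a 16-bit operand, 4 for a 32-bit
 * operand). The cast truncates the sum to 16 bits, so it wraps modulo
 * 65,536. */
uint16_t sp_after_pop(uint16_t sp, unsigned operand_bytes)
{
    return (uint16_t)(sp + operand_bytes);
}
```

With the starting values from the text, `sp_after_pop(0xFFFC, 4)` and `sp_after_pop(0xFFFE, 2)` both evaluate to 0.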

  • Page 799

    Vol. 3 19-39 ARCHITECTURE COMPATIBILITY 19.32 MIXING 16- AND 32-BIT SEGMENTS The features of the 16-bit Intel 286 processor are an object-code compatible subset of those of the 32-bit IA-32 processors. The D (default operation size) flag in segment descriptors indicates whether the processor treats a code or data segment as a 16-bit or 32-bit s[...]

  • Page 800

    19-40 Vol. 3 ARCHITECTURE COMPATIBILITY 19.33.1 Segment Wraparound On the 8086 processor, an attempt to access a memory operand that crosses offset 65,535 or 0FFFFH or offset 0 (for example, moving a word to offset 65,535 or pushing a word when the stack pointer is set to 1) causes the offset to wrap around modulo 65,536 or 010000H. With the In[...]
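
The two examples given for the 8086 (a word moved to offset 65,535, and a word pushed with the stack pointer at 1) can be worked through with modulo-65,536 arithmetic. The C sketch below is illustrative only; the function names are hypothetical and it models just the 8086 offset calculation described in the text, not the behavior of later processors.

```c
#include <stdint.h>

/* Hypothetical model: offset of the second byte of a word access that
 * starts at `offset`, wrapping modulo 65,536 (010000H) as on the 8086. */
uint16_t second_byte_offset(uint16_t offset)
{
    return (uint16_t)(offset + 1);
}

/* Hypothetical model: 16-bit stack pointer after pushing a word, again
 * wrapping modulo 65,536. */
uint16_t sp_after_push_word(uint16_t sp)
{
    return (uint16_t)(sp - 2);
}
```

Under this model, a word written at offset 0FFFFH has its second byte at offset 0, and pushing a word with the stack pointer at 1 leaves the stack pointer at 0FFFFH, so the pushed word itself straddles the wrap.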

  • Page 801

    Vol. 3 19-41 ARCHITECTURE COMPATIBILITY with the exception of “fast string” store operations (see Section 8.2.4, “Out-of-Order Stores For String Operations”). The Pentium processor has two store buffers, one corresponding to each of the pipelines. Writes in these buffers are always written to memory in the order they were generate[...]

  • Page 802

    19-42 Vol. 3 ARCHITECTURE COMPATIBILITY memory. If the access does split across a cache line, it locks the bus and accesses system memory. I/O reads are never reordered in front of buffered memory writes on an IA-32 processor. This ensures an update of all memory locations before reading the status from an I/O device. 19.35 BUS LOCKING The In[...]

  • Page 803

    Vol. 3 19-43 ARCHITECTURE COMPATIBILITY sors. The following sections describe these model-specific extensions. The CPUID instruction indicates the availability of some of the model-specific features. 19.37.1 Model-Specific Registers The Pentium processor introduced a set of model-specific registers (MSRs) for use in controlling hardware functi[...]

  • Page 804

    19-44 Vol. 3 ARCHITECTURE COMPATIBILITY Earlier IA-32 processors (such as the Intel486 and Pentium processors) used the KEN# (cache enable) pin and external logic to maintain an external memory map and signal cacheable accesses to the processor. The MTRR mechanism simplifies hardware designs by eliminating the KEN# pin and the external logic[...]

  • Page 805

    Vol. 3 19-45 ARCHITECTURE COMPATIBILITY The performance-monitoring counters are useful for debugging programs, optimizing code, diagnosing system failures, or refining hardware designs. See Chapter 30, “Performance Monitoring,” for more information on these counters. 19.38 TWO WAYS TO RUN INTEL 286 PROCESSOR TASKS When porting 16-bit pr[...]

  • Page 806

    19-46 Vol. 3 ARCHITECTURE COMPATIBILITY[...]