Intel IA-32 manual

Index of manual pages

  • Page 1

    IA-32 Intel® Architecture Software Developer’s Manual Volume 3A: System Programming Guide, Part 1 NOTE: The IA-32 Intel Architecture Software Developer's Manual consists of five volumes: Basic Architecture, Order Number 253665; Instruction Set Reference A-M, Order Number 253666; Instruction Set Refe[...]

  • Page 2

    INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY [...]

  • Page 3

    CONTENTS FOR VOLUME 3A AND 3B
    CHAPTER 1 ABOUT THIS MANUAL
    1.1 IA-32 PROCESSORS COVERED IN THIS MANUAL
    1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE
    1.3 NOTATIONAL CONVENTIONS [...]

  • Page 4

    2.6.7 Reading and Writing Model-Specific Registers
    2.6.7.1 Reading and Writing Model-Specific Registers in 64-Bit Mode
    CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT
    3.1 MEMORY MANAGEMENT OVERVIEW [...]

  • Page 5

    CHAPTER 4 PROTECTION
    4.1 ENABLING AND DISABLING SEGMENT AND PAGE PROTECTION
    4.2 FIELDS AND FLAGS USED FOR SEGMENT-LEVEL AND PAGE-LEVEL PROTECTION
    4.2.1 Code Segment Descriptor in 64-bit Mode [...]

  • Page 6

    CHAPTER 5 INTERRUPT AND EXCEPTION HANDLING
    5.1 INTERRUPT AND EXCEPTION OVERVIEW
    5.2 EXCEPTION AND INTERRUPT VECTORS
    5.3 SOURCES OF INTERRUPTS [...]

  • Page 7

    Interrupt 16—x87 FPU Floating-Point Error (#MF)
    Interrupt 17—Alignment Check Exception (#AC)
    Interrupt 18—Machine-Check Exception (#MC)
    Interrupt 19—SIMD Floa[...]

  • Page 8

    7.5.4 MP Initialization Example
    7.5.4.1 Typical BSP Initialization Sequence
    7.5.4.2 Typical AP Initialization Sequence [...]

  • Page 9

    7.11.6.3 Halt Idle Logical Processors
    7.11.6.4 Potential Usage of MONITOR/MWAIT in C1 Idle Loops
    7.11.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources [...]

  • Page 10

    8.10 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS)
    8.10.1 Bus Message Formats
    8.11 MESSAGE SIGNALLED INTERRUPTS [...]

  • Page 11

    9.11.6.4 Update in a System Supporting Dual-Core Technology
    9.11.6.5 Update Loader Enhancements
    9.11.7 Update Signature and Verification [...]

  • Page 12

    10.11.3.1 Base and Mask Calculations with Intel EM64T
    10.11.4 Range Size and Alignment Requirement
    10.11.4.1 MTRR Precedences [...]

  • Page 13

    CHAPTER 13 POWER AND THERMAL MANAGEMENT
    13.1 ENHANCED INTEL SPEEDSTEP® TECHNOLOGY
    13.1.1 Software Interface For Initiating Performance State Transitions
    13.2 THERMAL MONITORING AND PROTECTION [...]

  • Page 14

    15.2 VIRTUAL-8086 MODE
    15.2.1 Enabling Virtual-8086 Mode
    15.2.2 Structure of a Virtual-8086 Task [...]

  • Page 15

    17.6. STREAMING SIMD EXTENSIONS (SSE)
    17.7. STREAMING SIMD EXTENSIONS 2 (SSE2)
    17.8. STREAMING SIMD EXTENSIONS 3 (SSE3)
    17.9. HYPER-TH[...]

  • Page 16

    17.17.7.12. FXTRACT Instruction
    17.17.7.13. Load Constant Instructions
    17.17.7.14. FSETPM Instruction [...]

  • Page 17

    17.29.1. Large Pages
    17.29.2. PCD and PWT Flags
    17.29.3. Enabling and Disabling Paging [...]

  • Page 18

    18.5.7.1 Last Exception Records and Intel EM64T
    18.5.8 Branch Trace Store (BTS)
    18.5.8.1 Detection of the BTS Facilities [...]

  • Page 19

    18.11 PERFORMANCE MONITORING AND HYPER-THREADING TECHNOLOGY
    18.11.1 ESCR MSRs
    18.11.2 CCCR MSRs [...]

  • Page 20

    20.7 VM-EXIT CONTROL FIELDS
    20.7.1 VM-Exit Controls
    20.7.2 VM-Exit Controls for MSRs [...]

  • Page 21

    22.3.2.1 Loading Guest Control Registers, Debug Registers, and MSRs
    22.3.2.2 Loading Guest Segment Registers and Descriptor-Table Registers
    22.3.2.3 Loading Guest RIP, RSP, and RFLAGS [...]

  • Page 22

    24.3.2 Exiting From SMM
    24.4 SMRAM
    24.4.1 SMRAM State Save Map [...]

  • Page 23

    CHAPTER 25 VIRTUAL-MACHINE MONITOR PROGRAMMING CONSIDERATIONS
    25.1 VMX SYSTEM PROGRAMMING OVERVIEW
    25.2 SUPPORTING PROCESSOR OPERATING MODES IN GUEST ENVIRONMENTS [...]

  • Page 24

    26.3.5.1 Initialization of Virtual TLB
    26.3.5.2 Response to Page Faults
    26.3.5.3 Response to Uses of INVLPG [...]

  • Page 25

    APPENDIX C MP INITIALIZATION FOR P6 FAMILY PROCESSORS
    C.1 OVERVIEW OF THE MP INITIALIZATION PROCESS FOR P6 FAMILY PROCESSORS
    C.2 MP INITIALIZATION PROTOCOL ALGORITHM [...]

  • Page 26

    H.3.4 32-Bit Host-State Field
    H.4 NATURAL-WIDTH FIELDS
    H.4.1 Natural-Width Control Fields [...]

  • Page 27

    Figure 3-23. Format of Page-Directory Entries for 4-MByte Pages and 36-Bit Physical Addresses
    Figure 3-24. IA-32e Mode Paging Structures (4-KByte Pages)
    Figure 3-25. IA-32e Mode Paging Structures[...]

  • Page 28

    Figure 7-6. Topological Relationships between Hierarchical IDs in a Hypothetical MP Platform
    Figure 8-1. Relationship of Local APIC and I/O APIC In Single-Processor Systems [...]

  • Page 29

    Figure 11-2. Mapping of MMX Registers to x87 FPU Data Register Stack
    Figure 12-1. Example of Saving the x87 FPU, MMX, SSE, and SSE2 State During an Operating-System Controlled Task Switch
    Figure 13-1. Processor Modulation Through Stop-Clock Mechanism [...]

  • Page 30

    Figure 18-23. MSR_IFSB_CTL6, Address: 107D2H; MSR_IFSB_CNTR7, Address: 107D3H
    Figure 18-24. PerfEvtSel0 and PerfEvtSel1 MSRs
    Figure 18-25. CESR MSR (Pentium Processor Only) [...]

  • Page 31

    Table 6-1. Exception Conditions Checked During a Task Switch
    Table 6-2. Effect of a Task Switch on Busy Flag, NT Flag, Previous Task Link Field, and TS Flag
    Table 7-1. Initial APIC IDs for the Logical Processors in a Sy[...]

  • Page 32

    Table 11-3. Effect of the MMX, x87 FPU, and FXSAVE/FXRSTOR Instructions on the x87 FPU Tag Word
    Table 12-1. Action Taken for Combinations of OSFXSR, OSXMMEXCPT, SSE, SSE2, SSE3, EM, MP, and TS [...]

  • Page 33

    Table 23-1. Exit Qualification for Debug Exceptions
    Table 23-2. Exit Qualification for Task Switch
    Table 23-3. Exit Qualification for Control-Register Accesses [...]

  • Page 34

    Table F-3. Non-Focused Lowest Priority Message (34 Cycles)
    Table F-4. APIC Bus Status Cycles Interpretation
    Table G-1. Memory Types Used For VMCS Access[...]

  • Page 35

    1 About This Manual[...]

  • Page 36

    [...]

  • Page 37

    CHAPTER 1 ABOUT THIS MANUAL
    The IA-32 Intel® Architecture Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (order number 253668) and the IA-32 Intel® Architecture Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 (order number 253669) are part of a set that describes [...]

  • Page 38

    1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE
    A description of this manual’s content follows:
    Chapter 1 — About This Manual. Gives an overview of all five volumes of the IA-32 Intel Architecture Software Developer’s Manual. It also describes the notational conventions in these manuals and lists relate[...]

  • Page 39

    level, including: task switching, exception handling, and compatibility with existing system environments.
    Chapter 12 — SSE, SSE2 and SSE3 System Programming. Describes those aspects of SSE/SSE2/SSE3 extensions that must be handled and considered at the system programming level, including task switching, exceptio[...]

  • Page 40

    Chapter 25 — Virtual-Machine Monitor Programming Considerations. Describes programming considerations for VMMs. VMMs manage virtual machines (VMs).
    Chapter 26 — Virtualization of System Resources. Describes the virtualization of the system resources. These include: debugging facilities, address translation[...]

  • Page 41

    1.3.1 Bit and Byte Order
    In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. IA-32 proces-[...]
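
The bit-numbering rule above can be tried out in a few lines of Python. This is a minimal sketch, not part of the manual: the function names are hypothetical, and the byte layout assumes little-endian storage (least significant byte at the lowest address), which is how IA-32 processors store multi-byte values.

```python
def bit_value(position):
    """Return the numerical value of a set bit: 2 raised to its position."""
    return 1 << position


def set_bit_positions(value):
    """List the positions of set bits, numbered from the right starting at 0."""
    return [pos for pos in range(value.bit_length()) if value & bit_value(pos)]


def to_memory_bytes(value, size):
    """Lay a value out as it would sit in memory, lowest address first.

    Assumption: little-endian byte order, as used by IA-32 processors.
    """
    return value.to_bytes(size, byteorder="little")
```

For instance, `set_bit_positions(0b10010)` reports bits 1 and 4, whose values 2 and 16 add up to 18.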

  • Page 42

    1.3.3 Instruction Operands
    When instructions are represented symbolically, a subset of the IA-32 assembly language is used. In this subset, an instruction has the following format:
    label: mnemonic argument1, argument2, argument3
    where:
    • A label is an identifier which is followed by a colon.
    • A mnemonic is a r[...]
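
To make the `label: mnemonic argument1, argument2, argument3` format concrete, here is a small parsing sketch. It is only an illustration under simple assumptions: the regular expression, the function name `parse_instruction`, and the rule that a label is a plain identifier are not taken from the manual.

```python
import re

# Optional "label:", then a mnemonic, then an optional comma-separated operand list.
_INSTRUCTION = re.compile(
    r"^\s*(?:(?P<label>[A-Za-z_]\w*)\s*:\s*)?"
    r"(?P<mnemonic>[A-Za-z]\w*)"
    r"(?:\s+(?P<args>.*\S))?\s*$"
)


def parse_instruction(line):
    """Split a symbolic instruction into (label, mnemonic, operands)."""
    match = _INSTRUCTION.match(line)
    if match is None:
        raise ValueError("not a symbolic instruction: %r" % line)
    args = match.group("args")
    operands = [arg.strip() for arg in args.split(",")] if args else []
    return match.group("label"), match.group("mnemonic"), operands
```

A line without a label, such as `HLT`, yields `None` for the label and an empty operand list.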

  • Page 43

    1.3.4 Hexadecimal and Binary Numbers
    Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Base 2 (binary) numbers are represente[...]
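
A short sketch of how this notation could be read by a program. The H suffix is described above; the B suffix for binary and the fallback to decimal are assumptions here, since the excerpt is cut off before it states the binary rule. The function name is hypothetical.

```python
def parse_intel_number(text):
    """Convert a number written in the manual's notation to an int.

    A trailing H marks hexadecimal. A trailing B is assumed to mark
    binary; anything else is assumed to be decimal.
    """
    token = text.strip().upper()
    if token.endswith("H"):
        return int(token[:-1], 16)
    if token.endswith("B"):
        return int(token[:-1], 2)
    return int(token, 10)
```

The H check comes first so that a hexadecimal value whose last digit is B, such as `0BH`, is not mistaken for a binary number.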

  • Page 44

    1.3.7 Exceptions
    An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code repo[...]

  • Page 45

    be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception.
    #GP(0)
    1.4 RELATED LITERATURE
    Literature related to IA-32 processors is listed on-line at this link: http://developer.intel.com/design/processor/ Some of the documents listed at this web site[...]

  • Page 46

    [...]

  • Page 47

    2 System Architecture Overview[...]

  • Page 48

    [...]

  • Page 49

    CHAPTER 2 SYSTEM ARCHITECTURE OVERVIEW
    IA-32 architecture (beginning with the Intel386 processor family) provides extensive support for operating-system and system-development software. This support offers multiple modes of operation, which include:
    • Real mode, protected mode, virtual 8086 mode, and system management mode. Th[...]

  • Page 50

    2.1 OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE
    IA-32 system-level architecture consists of a set of registers, data structures, and instructions designed to support basic system-level operations such as memory management, interrupt and exception handling, task management, and control of multiple processor[...]

  • Page 51

    Figure 2-1. IA-32 System-Level Registers and Data Structures (figure labels: Local Descriptor Table (LDT), EFLAGS Register, Control Registers CR0 through CR4, Global Descriptor Table (GDT), Interrupt Descriptor Table (IDT), IDTR, GDTR, Interrupt Gate, Trap Gate, LDT Desc., TSS Desc., Code, Stack) [...]

  • Page 52

    Figure 2-2. System-Level Registers and Data Structures in IA-32e Mode (figure labels: Local Descriptor Table (LDT), CR0 through CR4, Global Descriptor Table (GDT), Interrupt Descriptor Table (IDT), IDTR, GDTR, Interrupt Gate, Trap Gate, LDT Desc., TSS Desc., Code, Stack, Current TSS) [...]

  • Page 53

    2.1.1 Global and Local Descriptor Tables
    When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment descriptors. Segment descriptors provide the [...]
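
Which of the two tables a memory access goes through is chosen by the segment selector. The sketch below decodes a selector, assuming the standard IA-32 selector layout (index in bits 15:3, table indicator in bit 2, requested privilege level in bits 1:0); that layout is not spelled out in the excerpt above, and the function name is hypothetical.

```python
def decode_selector(selector):
    """Break a 16-bit segment selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return {
        "index": selector >> 3,                       # descriptor number in the table
        "table": "LDT" if selector & 0x4 else "GDT",  # table indicator (TI) bit
        "rpl": selector & 0x3,                        # requested privilege level
    }
```

For example, selector 0008H names entry 1 of the GDT at privilege level 0.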

  • Page 54

    For example, a CALL to a call gate can provide access to a procedure in a code segment that is at the same or a numerically lower privilege level (more privileged) than the current code segment. To access a procedure through a call gate, the calling procedure supplies the selector for the call gate. T[...]
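
Because a numerically lower level is the more privileged one, comparisons between levels are easy to get backwards. The helper below is a hypothetical sketch that only captures the ordering stated above, for levels 0 through 3; it does not model the full set of call-gate checks.

```python
def is_same_or_more_privileged(target_level, current_level):
    """True if target_level is at the same or a numerically lower level."""
    for level in (target_level, current_level):
        if level not in (0, 1, 2, 3):
            raise ValueError("privilege levels range from 0 to 3")
    return target_level <= current_level
```

With this ordering, a level-3 procedure reaching level-0 code through a call gate passes the check, while the reverse direction does not.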

  • Page 55

    A task can also be accessed through a task gate. A task gate is similar to a call gate, except that it provides access (through a segment selector) to a TSS rather than a code segment.
    2.1.3.1 Task-State Segments in IA-32e Mode
    Hardware task switches are not supported in IA-32e mode. However, TSSs con[...]

  • Page 56

    The location of pages (sometimes called page frames) in physical memory is contained in two types of system data structures: page directories and page tables. Both structures reside in physical memory (see Figure 2-1). The base physical address of the page directory is contained in control register CR3.[...]
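
The two-level lookup described above can be sketched as a toy walk over a dictionary. The sketch assumes classic 32-bit paging with 4-KByte pages (10-bit directory index, 10-bit table index, 12-bit offset, present flag in bit 0 of each entry); large pages, PAE, and IA-32e paging are ignored. `physical_memory` is a hypothetical stand-in for RAM, and `LookupError` merely stands in for a page fault.

```python
PAGE_PRESENT = 0x1        # bit 0 of a directory or table entry
FRAME_MASK = 0xFFFFF000   # upper 20 bits: 4-KByte-aligned base address


def split_linear_address(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear, cr3, physical_memory):
    """Walk a toy page directory and page table held in a dict.

    physical_memory maps the physical address of each 4-byte entry
    to the entry's value.
    """
    dir_index, table_index, offset = split_linear_address(linear)

    pde = physical_memory.get((cr3 & FRAME_MASK) + 4 * dir_index, 0)
    if not pde & PAGE_PRESENT:
        raise LookupError("page-directory entry not present")

    pte = physical_memory.get((pde & FRAME_MASK) + 4 * table_index, 0)
    if not pte & PAGE_PRESENT:
        raise LookupError("page-table entry not present")

    return (pte & FRAME_MASK) | offset
```

CR3 supplies the starting point of the walk; every later address is derived from an entry read along the way.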

  • Page 57

    • The GDTR, LDTR, and IDTR registers contain the linear addresses and sizes (limits) of their respective tables. See also: Section 2.4, “Memory-Management Registers.”
    • The task register contains the linear address and size of the TSS for the current task. See also: Section 2.4, “Memory-Manage[...]

  • Page 58

    2.1.7 Other System Resources
    Besides the system registers and data structures described in the previous sections, system architecture provides the following additional resources:
    • Operating system instructions (see also: Section 2.6, “System Instruction Summary”).
    • Performance-monitoring cou[...]

  • Page 59

    The processor is placed in real-address mode following power-up or a reset. The PE flag in control register CR0 then controls whether the processor is operating in real-address or protected mode. See also: Section 9.9, “Mode Switching.” The VM flag in the EFLAGS register determines whether the proce[...]
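
How the two flags combine can be sketched as a small decision function. The bit positions (PE is bit 0 of CR0, VM is bit 17 of EFLAGS) are the architectural ones but are not given in the excerpt above; the function name is hypothetical, and system management mode and IA-32e mode are deliberately left out.

```python
CR0_PE = 1 << 0       # protection enable flag in CR0
EFLAGS_VM = 1 << 17   # virtual-8086 mode flag in EFLAGS


def operating_mode(cr0, eflags):
    """Name the operating mode implied by CR0.PE and EFLAGS.VM."""
    if not cr0 & CR0_PE:
        return "real-address"
    if eflags & EFLAGS_VM:
        return "virtual-8086"
    return "protected"
```

The VM flag only matters once PE is set, which is why PE is tested first.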

  • Page 60

    2.3 SYSTEM FLAGS AND FIELDS IN THE EFLAGS REGISTER
    The system flags and IOPL field of the EFLAGS register control I/O, maskable hardware interrupts, debugging, task switching, and the virtual-8086 mode (see Figure 2-4). Only privileged code (typically operating system or executive code) should be al[...]

  • Page 61

    The IOPL is also one of the mechanisms that control the modification of the IF flag and the handling of interrupts in virtual-8086 mode when virtual mode extensions are in effect (when CR4.VME = 1). See also: Chapter 13, “Input/Output,” in the IA-32 Intel® Architecture Software Developer[...]

  • Page 62

    VIF Virtual Interrupt (bit 19) — Contains a virtual image of the IF flag. This flag is used in conjunction with the VIP flag. The processor only recognizes the VIF flag when either the VME flag or the PVI flag in control register CR4 is set and the IOPL is less than 3. (The VME flag enables the virtual[...]
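
The condition under which VIF is recognized can be written as a single predicate. In this sketch the VIF position (bit 19) comes from the text above; the IOPL field position (bits 13:12 of EFLAGS) and the PVI position (bit 1 of CR4) are the architectural values but are not stated in this excerpt. The helper names are hypothetical.

```python
EFLAGS_VIF = 1 << 19   # virtual interrupt flag
IOPL_SHIFT = 12        # IOPL occupies bits 13:12 of EFLAGS
CR4_VME = 1 << 0       # virtual-8086 mode extensions
CR4_PVI = 1 << 1       # protected-mode virtual interrupts


def iopl(eflags):
    """Extract the 2-bit I/O privilege level field from EFLAGS."""
    return (eflags >> IOPL_SHIFT) & 0x3


def vif_recognized(cr4, eflags):
    """True when CR4.VME or CR4.PVI is set and IOPL is less than 3."""
    return bool(cr4 & (CR4_VME | CR4_PVI)) and iopl(eflags) < 3
```

With IOPL equal to 3 the predicate is false regardless of the CR4 flags.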

  • Page 63

    2.4.1 Global Descriptor Table Register (GDTR)
    The GDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and the 16-bit table limit for the GDT. The base address specifies the linear address of byte 0 of the GDT; the table limit specifies the number of bytes in the tab[...]
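
A sketch of the 6-byte value that holds a protected-mode base and limit in memory. It assumes the usual pseudo-descriptor layout (16-bit limit in the low two bytes, followed by the 32-bit base, little-endian) and 8-byte descriptors with a limit equal to the table size in bytes minus one; neither detail is visible in the truncated excerpt, and the function names are hypothetical.

```python
import struct

_PSEUDO_DESCRIPTOR = "<HI"   # little-endian: 16-bit limit, then 32-bit base


def pack_pseudo_descriptor(base, limit):
    """Build the 6-byte memory image of a 32-bit base and 16-bit limit."""
    return struct.pack(_PSEUDO_DESCRIPTOR, limit, base)


def unpack_pseudo_descriptor(raw):
    """Return (base, limit) from a 6-byte memory image."""
    limit, base = struct.unpack(_PSEUDO_DESCRIPTOR, raw)
    return base, limit


def descriptor_count(limit):
    """Number of whole 8-byte descriptors covered by a table limit."""
    return (limit + 1) // 8
```

A table with three descriptors therefore has a limit of 17H (23), one less than its 24-byte size.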

  • Page 64

    2.4.3 IDTR Interrupt Descriptor Table Register
    The IDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit specifies the number of bytes in the tabl[...]

  • Page 65

    The control registers are summarized below, and each architecturally defined control field in these control registers is described individually. In Figure 2-6, the width of the register in 64-bit mode is indicated in parentheses (except for CR0).
    • CR0 — Contains system control flags that contro[...]

  • Page 66

    When loading a control register, reserved bits should always be set to the values previously read. The flags in control registers are:
    PG Paging (bit 31 of CR0) — Enables paging when set; disables paging when clear. When paging is disabled, all linear addresses are treated as physical addresses. The[...]
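
The advice about reserved bits amounts to a read-modify-write: start from the value previously read and change only the flag of interest. The sketch below models this on a plain integer; the function names are hypothetical, and no real register access takes place.

```python
CR0_PG = 1 << 31   # paging flag, bit 31 of CR0


def with_flag(register_value, flag, enabled):
    """Return a 32-bit register image with one flag changed and all other bits kept."""
    if enabled:
        result = register_value | flag
    else:
        result = register_value & ~flag
    return result & 0xFFFFFFFF


def paging_enabled(cr0):
    """True when the PG flag is set in a CR0 image."""
    return bool(cr0 & CR0_PG)
```

Building the new value from a constant instead would overwrite whatever the reserved bits held when the register was read.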

  • Page 67

    NW Not Write-through (bit 29 of CR0) — When the NW and CD flags are clear, write-back (for Pentium 4, Intel Xeon, P6 family, and Pentium processors) or write-through (for Intel486 processors) is enabled for writes that hit the cache and invalidation cycles are enabled. See Table 10-5 for detailed[...]

  • Page 68

    • If the TS flag is set and the MP flag (bit 1 of CR0) and EM flag are clear, an #NM exception is not raised prior to the execution of an x87 FPU WAIT/FWAIT instruction.
    • If the EM flag is set, the setting of the TS flag has no effect on the execution of x87 FPU/MMX/SSE/SSE2/SSE3 instructions. T[...]

  • Page 69

    FPU or math coprocessor present in the system. Table 2-1 shows the interaction of the EM, MP, and TS flags. Also, when the EM flag is set, execution of an MMX instruction causes an invalid-opcode exception (#UD) to be generated (see Table 11-1). Thus, if an IA-32 processor incorporates MMX technology[...]
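
The interaction of the three flags can be summarized in code. Table 2-1 itself is not part of this excerpt, so the rules below are an assumption based on the commonly documented behavior: x87 FPU instructions raise #NM when EM or TS is set, WAIT/FWAIT raises #NM only when both MP and TS are set, and MMX instructions raise #UD when EM is set. The bit positions of EM (bit 2) and TS (bit 3) and the function names are likewise not taken from the text above.

```python
CR0_MP = 1 << 1   # monitor coprocessor
CR0_EM = 1 << 2   # emulation
CR0_TS = 1 << 3   # task switched


def x87_raises_nm(cr0):
    """An x87 FPU instruction raises #NM when EM or TS is set."""
    return bool(cr0 & (CR0_EM | CR0_TS))


def wait_raises_nm(cr0):
    """WAIT/FWAIT raises #NM only when both MP and TS are set."""
    return bool(cr0 & CR0_MP) and bool(cr0 & CR0_TS)


def mmx_exception(cr0):
    """Exception raised before an MMX instruction executes, or None."""
    if cr0 & CR0_EM:
        return "#UD"
    if cr0 & CR0_TS:
        return "#NM"
    return None
```

With TS set and both MP and EM clear, `wait_raises_nm` is false while `x87_raises_nm` is true.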

  • Page 70

    VME Virtual-8086 Mode Extensions (bit 0 of CR4) — Enables interrupt- and exception-handling extensions in virtual-8086 mode when set; disables the extensions when clear. Use of the virtual mode extensions can improve the performance of virtual-8086 applications by eliminating the overhead of callin[...]

  • Page 71

    When enabling the global page feature, paging must be enabled (by setting the PG flag in control register CR0) before the PGE flag is set. Reversing this sequence may affect program correctness, and processor performance will be impacted. See also: Section 3.12, “Translation Lookaside Buffers (TLBs).”[...]

  • Page 72

    2.5.1 CPUID Qualification of Control Register Flags
    The VME, PVI, TSD, DE, PSE, PAE, MCE, PGE, PCE, OSFXSR, and OSXMMEXCPT flags in control register CR4 are model specific. All of these flags (except the PCE flag) can be qualified with the CPUID instruction to determine if they are implemented on [...]
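
One way to picture this qualification is a lookup from each CR4 flag to the CPUID feature bit that indicates support. The mapping below is an assumption drawn from the standard feature-flag positions in EDX after CPUID with EAX = 1; it is not given in this excerpt and should be checked against the CPUID reference before use. The function takes an already-obtained EDX value, since Python cannot execute CPUID directly.

```python
# CR4 flag name -> bit position of the qualifying feature flag in EDX
# (CPUID with EAX = 1). Assumed mapping, see the lead-in above.
QUALIFYING_EDX_BIT = {
    "VME": 1, "PVI": 1,   # both qualified by the VME feature flag
    "TSD": 4,             # TSC feature flag
    "DE": 2,
    "PSE": 3,
    "PAE": 6,
    "MCE": 7,
    "PGE": 13,
    "OSFXSR": 24,         # FXSR feature flag
    "OSXMMEXCPT": 25,     # SSE feature flag
}


def cr4_flag_implemented(flag, cpuid_edx):
    """True if the feature bit qualifying the given CR4 flag is set in EDX."""
    if flag not in QUALIFYING_EDX_BIT:
        raise ValueError("no CPUID qualification known for %s" % flag)
    return bool((cpuid_edx >> QUALIFYING_EDX_BIT[flag]) & 1)
```

PCE is left out of the table on purpose, matching the exception noted in the text.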

  • Page 73

    2.6.1 Loading and Storing System Registers
    The GDTR, LDTR, IDTR, and TR registers each have a load and store instruction for loading data into and storing data from the register:
    • LGDT (Load GDTR Register) — Loads the GDT base address and limit from memory into the GDTR register.
    • SGDT (Store G[...]

  • Page 74

    • SLDT (Store LDT Register) — Stores the LDT segment selector from the LDTR register into memory or a general-purpose register.
    • LTR (Load Task Register) — Loads segment selector and segment descriptor for a TSS from memory into the task register. (The segment selector operand can also be [...]

  • Page 75

    Vol. 3A 2-27 SYSTEM ARCHITECTURE OVERVIEW Offset Is Within Limits (LSL Instruction),” for a detailed explanation of the function and use of this instruction. The VERR (verify for reading) and VERW (verify for writing) instructions verify if a selected segment is readable or writable, respectively, at a given CPL. See Section 4.10.2, “Checki[...]

  • Page 76

    2-28 Vol. 3A SYSTEM ARCHITECTURE OVERVIEW Hardware may respond to this signal in a number of ways. An indicator light on the front panel may be turned on. An NMI interrupt for recording diagnostic information may be generated. Reset initialization may be invoked (note that the BINIT# pin was introduced with the Pentium Pro processor). If an[...]

  • Page 77

    Vol. 3A 2-29 SYSTEM ARCHITECTURE OVERVIEW See Section 18.10, “Performance Monitoring Overview,” and Section 18.9, “Time-Stamp Counter,” for more information about the performance monitoring and time-stamp counters. The RDTSC instruction was introduced into the IA-32 architecture with the Pentium processor. The RDPMC instruction wa[...]

  • Page 78

    2-30 Vol. 3A SYSTEM ARCHITECTURE OVERVIEW[...]

  • Page 79

    3 Protected-Mode Memory Management[...]

  • Page 80

    [...]

  • Page 81

    Vol. 3A 3-1 CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT This chapter describes the IA-32 architecture’s protected-mode memory management facilities, including the physical memory requirements, segmentation mechanism, and paging mechanism. See also: Chapter 4, “Protection” (for a description of the processor’s protection mechanism) an[...]

  • Page 82

    3-2 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT If paging is not used, the linear address space of the processor is mapped directly into the physical address space of the processor. The physical address space is defined as the range of addresses that the processor can generate on its address bus. Because multitasking computing systems commonly defin[...]

  • Page 83

    Vol. 3A 3-3 PROTECTED-MODE MEMORY MANAGEMENT If the page being accessed is not currently in physical memory, the processor interrupts execution of the program (by generating a page-fault exception). The operating system or executive then reads the page into physical memory from the disk and continues executing the program. When paging is impl[...]

  • Page 84

    3-4 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT More complexity can be added to this protected flat model to provide more protection. For example, for the paging mechanism to provide isolation between user and supervisor code and data, four segments need to be defined: code and data segments at privilege level 3 for the user, and code and data se[...]

  • Page 85

    Vol. 3A 3-5 PROTECTED-MODE MEMORY MANAGEMENT 3.2.3 Multi-Segment Model A multi-segment model (such as the one shown in Figure 3-4) uses the full capabilities of the segmentation mechanism to provide hardware-enforced protection of code, data structures, and programs and tasks. Here, each program (or task) is given its own table of segment descr[...]

  • Page 86

    3-6 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.2.4 Segmentation in IA-32e Mode In IA-32e mode, the effects of segmentation depend on whether the processor is running in compatibility mode or 64-bit mode. In compatibility mode, segmentation functions just as it does using legacy 16-bit or 32-bit protected mode semantics. In 64-bit mode, segmenta[...]

  • Page 87

    Vol. 3A 3-7 PROTECTED-MODE MEMORY MANAGEMENT 3.3.1 Physical Address Space for Processors with Intel® EM64T On processors that support Intel EM64T (CPUID.80000001.EDX[29] = 1), the size of the physical address range is implementation-specific and indicated by CPUID.80000001H. The physical address size supported by a given implementation is ava[...]

  • Page 88

    3-8 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT If paging is not used, the processor maps the linear address directly to a physical address (that is, the linear address goes out on the processor’s address bus). If the linear address space is paged, a second level of address translation is used to translate the linear address into a physical add[...]

  • Page 89

    Vol. 3A 3-9 PROTECTED-MODE MEMORY MANAGEMENT TI (table indicator) flag (Bit 2) — Specifies the descriptor table to use: clearing this flag selects the GDT; setting this flag selects the current LDT. Requested Privilege Level (RPL) (Bits 0 and 1) — Specifies the privilege level of the selector. The privilege level can range from 0 to 3, wit[...]
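
The selector fields named in this excerpt can be illustrated with a minimal Python sketch. It is not code from the manual: the function name `decode_selector` is hypothetical, and the index field (bits 3 through 15) is assumed from the standard selector layout because the excerpt above is truncated.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return {
        "index": selector >> 3,     # bits 3-15: descriptor index (assumed layout)
        "ti": (selector >> 2) & 1,  # bit 2: 0 selects the GDT, 1 the current LDT
        "rpl": selector & 0b11,     # bits 0-1: requested privilege level, 0..3
    }


print(decode_selector(0x002B))
```

The value 0x002B is only an example selector chosen for illustration.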

  • Page 90

    3-10 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT can be available for immediate use. Other segments can be made available by loading their segment selectors into these registers during program execution. Every segment register has a “visible” part and a “hidden” part. (The hidden part is sometimes referred to as a “descriptor cache” o[...]

  • Page 91

    Vol. 3A 3-11 PROTECTED-MODE MEMORY MANAGEMENT 3.4.4 Segment Loading Instructions in IA-32e Mode Because ES, DS, and SS segment registers are not used in 64-bit mode, their fields (base, limit, and attribute) in segment descriptor registers are ignored. Some forms of segment load instructions are also invalid (for example, LDS, POP ES). Addr[...]

  • Page 92

    3-12 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.4.5 Segment Descriptors A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compilers, linkers, loaders, or the operating system or [...]

  • Page 93

    Vol. 3A 3-13 PROTECTED-MODE MEMORY MANAGEMENT segment limit has the reverse function; the offset can range from the segment limit to FFFFFFFFH or FFFFH, depending on the setting of the B flag. Offsets less than the segment limit generate general-protection exceptions. Decreasing the value in the segment limit field for an expand-down segment all[...]

  • Page 94

    3-14 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT segment. (This flag should always be set to 1 for 32-bit code and data segments and to 0 for 16-bit code and data segments.) • Executable code segment. The flag is called the D flag and it indicates the default length for effective addresses and operands referenced by instructions in the segm[...]

  • Page 95

    Vol. 3A 3-15 PROTECTED-MODE MEMORY MANAGEMENT L (64-bit code segment) flag In IA-32e mode, bit 21 of the second doubleword of the segment descriptor indicates whether a code segment contains native 64-bit code. A value of 1 indicates instructions in this code segment are executed in 64-bit mode. A value of 0 indicates the instructions in this [...]
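
As a rough illustration of the L flag described above, the sketch below reads bit 21 of a descriptor's second doubleword. The function name and the sample values are hypothetical; the position of the D flag (bit 22) and the treatment of L = 1 with D = 1 as reserved are assumptions taken from the standard descriptor layout, since the excerpt is truncated before it reaches them.

```python
L_BIT = 21  # L flag: bit 21 of the second (high) doubleword
D_BIT = 22  # D flag: bit 22 of the same doubleword (assumed, not in the excerpt)


def code_segment_mode(high_dword: int) -> str:
    """Classify a code segment in IA-32e mode from its high doubleword."""
    l_flag = (high_dword >> L_BIT) & 1
    d_flag = (high_dword >> D_BIT) & 1
    if l_flag and d_flag:
        return "reserved"
    if l_flag:
        return "64-bit"
    return "compatibility, 32-bit default" if d_flag else "compatibility, 16-bit default"


print(code_segment_mode(0x00209A00))  # example value with L set and D clear
```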

  • Page 96

    3-16 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT Stack segments are data segments which must be read/write segments. Loading the SS register with a segment selector for a nonwritable data segment generates a general-protection exception (#GP). If the size of a stack segment needs to be changed dynamically, the stack segment can be an expand-down d[...]

  • Page 97

    Vol. 3A 3-17 PROTECTED-MODE MEMORY MANAGEMENT 3.5 SYSTEM DESCRIPTOR TYPES When the S (descriptor type) flag in a segment descriptor is clear, the descriptor type is a system descriptor. The processor recognizes the following types of system descriptors: • Local descriptor-table (LDT) segment descriptor. • Task-state segment (TSS) descript[...]

  • Page 98

    3-18 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT See also: Section 3.5.1, “Segment Descriptor Tables”, and Section 6.2.2, “TSS Descriptor” (for more information on the system-segment descriptors); see Section 4.8.3, “Call Gates”, Section 5.11, “IDT Descriptors”, and Section 6.2.5, “Task-Gate Descriptor” (for more informatio[...]

  • Page 99

    Vol. 3A 3-19 PROTECTED-MODE MEMORY MANAGEMENT Each system must have one GDT defined, which may be used for all programs and tasks in the system. Optionally, one or more LDTs can be defined. For example, an LDT can be defined for each separate task being run, or some or all tasks can share the same LDT. The GDT is not a segment itself; instead, [...]

  • Page 100

    3-20 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.5.2 Segment Descriptor Tables in IA-32e Mode In IA-32e mode, a segment descriptor table can contain up to 8192 (2^13) 8-byte descriptors. An entry in the segment descriptor table can be 8 bytes. System descriptors are expanded to 16 bytes (occupying the space of two entries). GDTR and LDTR registe[...]

  • Page 101

    Vol. 3A 3-21 PROTECTED-MODE MEMORY MANAGEMENT accessed for a long time. See Section 3.12, “Translation Lookaside Buffers (TLBs)”, for more information on the TLBs. 3.6.1 Paging Options Paging is controlled by three flags in the processor’s control registers: • PG (paging) flag. Bit 31 of CR0 (available in all IA-32 processors beginni[...]

  • Page 102

    3-22 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.6.2 Page Tables and Directories in the Absence of Intel EM64T The information that the processor uses to translate linear addresses into physical addresses (when paging is enabled) is contained in four data structures: • Page directory — An array of 32-bit page-directory entries (PDEs) contain[...]

  • Page 103

    Vol. 3A 3-23 PROTECTED-MODE MEMORY MANAGEMENT 3.7.1 Linear Address Translation (4-KByte Pages) Figure 3-12 shows the page directory and page-table hierarchy when mapping linear addresses to 4-KByte pages. The entries in the page directory point to page tables, and the entries in a page table point to pages in physical memory. This pag[...]

  • Page 104

    3-24 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT To select the various table entries, the linear address is divided into three sections: • Page-directory entry — Bits 22 through 31 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a page table. • Page-table entry — Bits 12 thro[...]
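
The three-way split of a 32-bit linear address for 4-KByte pages can be sketched as follows. This is an illustrative example only: `split_linear_4k` is a hypothetical name, and the page-table field (bits 12 through 21) and page offset (bits 0 through 11) are filled in from the standard layout because the excerpt is cut off.

```python
def split_linear_4k(linear: int) -> tuple:
    """Return (directory index, table index, page offset) for 4-KByte paging."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit linear address")
    directory = (linear >> 22) & 0x3FF  # bits 22-31: entry in the page directory
    table = (linear >> 12) & 0x3FF      # bits 12-21: entry in the page table (assumed)
    offset = linear & 0xFFF             # bits 0-11: byte within the 4-KByte page (assumed)
    return directory, table, offset


directory, table, offset = split_linear_4k(0x12345678)
print(hex(directory), hex(table), hex(offset))
```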

  • Page 105

    Vol. 3A 3-25 PROTECTED-MODE MEMORY MANAGEMENT NOTE (For the Pentium processor only.) When enabling or disabling large page sizes, the TLBs must be invalidated (flushed) after the PSE flag in control register CR4 has been set or cleared. Otherwise, incorrect page translation might occur due to the processor using outdated page translation info[...]

  • Page 106

    3-26 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.7.6 Page-Directory and Page-Table Entries Figure 3-14 shows the format for the page-directory and page-table entries when 4-KByte pages and 32-bit physical addresses are being used. Figure 3-15 shows the format for the page-directory entries when 4-MByte pages and 32-bit physical addresses are [...]

  • Page 107

    Vol. 3A 3-27 PROTECTED-MODE MEMORY MANAGEMENT (Page-directory entries for 4-KByte page tables) — Specifies the physical address of the first byte of a page table. The bits in this field are interpreted as the 20 most-significant bits of the physical address, which forces page tables to be aligned on 4-KByte boundaries. (Page-directory entries [...]
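
Because the base-address field holds only the 20 most-significant bits, the page-table address is recovered by masking off the low 12 bits of the entry. The sketch below assumes a 32-bit page-directory entry for a 4-KByte page table; the function name and the sample entry value are hypothetical.

```python
def page_table_base(pde: int) -> int:
    """Extract the 4-KByte-aligned page-table address from a 32-bit PDE."""
    if not 0 <= pde <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit page-directory entry")
    # Bits 12-31 are the 20 most-significant address bits; the low 12 bits
    # of the physical address are implicitly zero (4-KByte alignment).
    return pde & 0xFFFFF000


base = page_table_base(0x00345007)
print(hex(base), base % 4096 == 0)
```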

  • Page 108

    3-28 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3. Invalidate the current page-table entry in the TLB (see Section 3.12, “Translation Lookaside Buffers (TLBs)”, for a discussion of TLBs and how to invalidate them). 4. Return from the page-fault handler to restart the interrupted program (or task). Read/write (R/W) flag, bit 1 Specifies the [...]

  • Page 109

    Vol. 3A 3-29 PROTECTED-MODE MEMORY MANAGEMENT This flag is a “sticky” flag, meaning that once set, the processor does not implicitly clear it. Only software can clear this flag. The accessed and dirty flags are provided for use by memory management software to manage the transfer of pages and page tables into and out of physical memory. NOT[...]

  • Page 110

    3-30 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT in the TLB when register CR3 is loaded or a task switch occurs. This flag is provided to prevent frequently used pages (such as pages that contain kernel or other operating system or executive code) from being flushed from the TLB. Only software can set or clear this flag. For page-directory entries[...]

  • Page 111

    Vol. 3A 3-31 PROTECTED-MODE MEMORY MANAGEMENT When the PAE paging mechanism is enabled, the processor supports two sizes of pages: 4-KByte and 2-MByte. As with 32-bit addressing, both page sizes can be addressed within the same set of paging tables (that is, a page-directory entry can point to either a 2-MByte page or a page table that in tur[...]

  • Page 112

    3-32 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT To select the various table entries, the linear address is divided into three sections: • Page-directory-pointer-table entry—Bits 30 and 31 provide an offset to one of the 4 entries in the page-directory-pointer table. The selected entry provides the base physical address of a page directory.[...]
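
With PAE enabled and 4-KByte pages, the same 32-bit linear address is cut differently. The sketch below is illustrative: only the page-directory-pointer field (bits 30 and 31) appears in the excerpt, and the remaining fields (bits 21 through 29, 12 through 20, and 0 through 11) are assumed from the standard PAE layout. The function name is hypothetical.

```python
def split_linear_pae_4k(linear: int) -> tuple:
    """Return (PDPT index, directory index, table index, offset) for PAE 4-KByte paging."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit linear address")
    pdpt = (linear >> 30) & 0x3         # bits 30-31: one of 4 PDPT entries
    directory = (linear >> 21) & 0x1FF  # bits 21-29: page-directory entry (assumed)
    table = (linear >> 12) & 0x1FF      # bits 12-20: page-table entry (assumed)
    offset = linear & 0xFFF             # bits 0-11: byte within the page (assumed)
    return pdpt, directory, table, offset


print(split_linear_pae_4k(0xC0201003))
```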

  • Page 113

    Vol. 3A 3-33 PROTECTED-MODE MEMORY MANAGEMENT CR4 has no effect on the page size when PAE is enabled.) With the PS flag set, the linear address is divided into three sections: • Page-directory-pointer-table entry—Bits 30 and 31 provide an offset to an entry in the page-directory-pointer table. The selected entry provides the base physical [...]

  • Page 114

    3-34 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.8.5 Page-Directory and Page-Table Entries With Extended Addressing Enabled Figure 3-20 shows the format for the page-directory-pointer-table, page-directory, and page-table entries when 4-KByte pages and 36-bit extended physical addresses are being used. Figure 3-21 shows the format for the p[...]

  • Page 115

    Vol. 3A 3-35 PROTECTED-MODE MEMORY MANAGEMENT Figure 3-20. Format of Page-Directory-Pointer-Table, Page-Directory, and Page-Table Entries for 4-KByte Pages with PAE Enabled (bit-field diagram not reproduced) [...]

  • Page 116

    3-36 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT The base physical address in an entry specifies the following, depending on the type of entry: • Page-directory-pointer-table entry — the physical address of the first byte of a 4-KByte page directory. • Page-directory entry — the physical address of the first byte of a 4-KByte page table or [...]

  • Page 117

    Vol. 3A 3-37 PROTECTED-MODE MEMORY MANAGEMENT Access (A) and dirty (D) flags (bits 5 and 6) are provided for table entries that point to pages. Bits 9, 10, and 11 in all the table entries for the physical address extension are available for use by software. (When the present flag is clear, bits 1 through 63 are available to software.) All bits in[...]

  • Page 118

    3-38 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT Figure 3-23 shows the format for the page-directory entries when 4-MByte pages and 36-bit physical addresses are being used. Section 3.7.6, “Page-Directory and Page-Table Entries” describes the functions of the flags and fields in bits 0 through 11. Figure 3-22. Linear Address Tr[...]

  • Page 119

    Vol. 3A 3-39 PROTECTED-MODE MEMORY MANAGEMENT 3.10 PAE-ENABLED PAGING IN IA-32E MODE Intel EM64T 64-bit extensions expand physical address extension (PAE) paging structures to potentially support mapping a 64-bit linear address to a 52-bit physical address. In the first implementation of Intel EM64T, PAE paging structures support translatio[...]

  • Page 120

    3-40 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.10.2 IA-32e Mode Linear Address Translation (2-MByte Pages) Figure 3-25 shows the PML4 table, page-directory-pointer, and page-directory hierarchy when mapping linear addresses to 2-MByte pages in IA-32e mode. This method can be used to address up to 2^27 pages, which spans a linear address s[...]

  • Page 121

    Vol. 3A 3-41 PROTECTED-MODE MEMORY MANAGEMENT • Page-directory entry — Bits 29:21 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a 2-MByte page. • Page offset — Bits 20:0 provide an offset to a physical address in the page. 3.10.3 Enhanced Paging Data Structures Figure 3-26 [...]
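
The IA-32e split for 2-MByte pages can be sketched in the same style. Only the page-directory field (bits 29:21) and page offset (bits 20:0) appear in this excerpt; the PML4 field (bits 47:39) and page-directory-pointer field (bits 38:30) are assumed from the standard four-level layout. The function name is hypothetical.

```python
def split_linear_ia32e_2m(linear: int) -> tuple:
    """Return (PML4 index, PDPT index, directory index, offset) for 2-MByte pages."""
    if not 0 <= linear < (1 << 48):
        raise ValueError("expected the low 48 bits of a linear address")
    pml4 = (linear >> 39) & 0x1FF       # bits 47:39 (assumed)
    pdpt = (linear >> 30) & 0x1FF       # bits 38:30 (assumed)
    directory = (linear >> 21) & 0x1FF  # bits 29:21: page-directory entry
    offset = linear & 0x1FFFFF          # bits 20:0: byte within the 2-MByte page
    return pml4, pdpt, directory, offset


# Build an address from known field values and split it again.
address = (1 << 39) | (2 << 30) | (3 << 21) | 0x12345
print(split_linear_ia32e_2m(address))
```

Three 9-bit indexes select one of 2^27 pages, and each page covers 2^21 bytes, which together account for the 48-bit linear address.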

  • Page 122

    3-42 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT Except for bit 63, functions of the flags in these entries are as described in Section 3.7.6, “Page-Directory and Page-Table Entries”. The differences are: • A PML4 table entry and a page-directory-pointer-table entry are added. • Entries are increased from 32 bits to 64 bits. • The maxim[...]

  • Page 123

    Vol. 3A 3-43 PROTECTED-MODE MEMORY MANAGEMENT • The base physical address field in each entry is extended to 28 bits if the processor’s implementation supports a 40-bit physical address. • Bits 62:52 are available for use by system programmers. • Bit 63 is the execute-disable bit if the execute-disable bit feature is supported in the[...]

  • Page 124

    3-44 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT If the execute disable bit is enabled in an IA-32 processor, the reserved bits in paging data structures for legacy 32-bit mode and 64-bit mode are shown in Table 3-5. Table 3-4. Reserved Bit Checking When Execute Disable Bit is Disabled Mode Paging Mode Paging Structure Check Bits 32-bit 4-KBy[...]

  • Page 125

    Vol. 3A 3-45 PROTECTED-MODE MEMORY MANAGEMENT 3.11 MAPPING SEGMENTS TO PAGES The segmentation and paging mechanisms provided in the IA-32 architecture support a wide variety of approaches to memory management. When segmentation and paging are combined, segments can be mapped to pages in several ways. To implement a flat (unsegmented) address[...]

  • Page 126

    3-46 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.12 TRANSLATION LOOKASIDE BUFFERS (TLBS) The processor stores the most recently used page-directory and page-table entries in on-chip caches called translation lookaside buffers or TLBs. The P6 family and Pentium processors have separate TLBs for the data and instruction caches. Also, the P6 famil[...]

  • Page 127

    Vol. 3A 3-47 PROTECTED-MODE MEMORY MANAGEMENT • Implicitly by executing a task switch, which automatically changes the contents of the CR3 register. The INVLPG instruction is provided to invalidate a specific page-table entry in the TLB. Normally, this instruction invalidates only an individual TLB entry; however, in some cases, it may invali[...]

  • Page 128

    3-48 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT[...]

  • Page 129

    4 Protection[...]

  • Page 130

    [...]

  • Page 131

    Vol. 3A 4-1 CHAPTER 4 PROTECTION In protected mode, the IA-32 architecture provides a protection mechanism that operates at both the segment level and the page level. This protection mechanism provides the ability to limit access to certain segments or pages based on privilege levels (four privilege levels for segments and two privilege levels [...]

  • Page 132

    4-2 Vol. 3A PROTECTION that is based on privilege levels can essentially be disabled while still in protected mode by assigning a privilege level of 0 (most privileged) to all segment selectors and segment descriptors. This action disables the privilege level protection barriers between segments, but other protection checks such as limit[...]

  • Page 133

    Vol. 3A 4-3 PROTECTION • Read/write (R/W) flag — (Bit 1 of a page-directory or page-table entry.) Determines the type of access allowed to a page: read only or read-write. Figure 4-1 shows the location of the various fields and flags in the data, code, and system-segment descriptors; Figure 3-6 shows the location of the RPL (or CPL) f[...]

  • Page 134

    4-4 Vol. 3A PROTECTION Many different styles of protection schemes can be implemented with these fields and flags. When the operating system creates a descriptor, it places values in these fields and flags in keeping with the particular protection style chosen for an operating system or executive. Application programs do not generally acce[...]

  • Page 135

    Vol. 3A 4-5 PROTECTION 4.3 LIMIT CHECKING The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment. The effective value of the limit depends on the setting of the G (granularity) flag (see Figure 4-1). For data segments, the limit also depends on the E (expansion direction) fl[...]
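
How the G flag changes the effective limit can be shown with a short sketch. The excerpt is truncated before it gives the details, so the scaling rule used here (a 20-bit limit field, multiplied by 4 KBytes with the low 12 bits treated as ones when G is set) is an assumption based on the standard descriptor semantics. The function name is hypothetical.

```python
def effective_limit(limit_field: int, g_flag: bool) -> int:
    """Compute the effective segment limit from the 20-bit limit field."""
    if not 0 <= limit_field <= 0xFFFFF:
        raise ValueError("the limit field is 20 bits wide")
    if g_flag:
        # Assumed rule: 4-KByte granularity, low 12 offset bits are not checked.
        return (limit_field << 12) | 0xFFF
    return limit_field  # byte granularity


print(hex(effective_limit(0xFFFFF, True)))
print(hex(effective_limit(0xFFFFF, False)))
```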

  • Page 136

    4-6 Vol. 3A PROTECTION For expand-down data segments, the segment limit has the same function but is interpreted differently. Here, the effective limit specifies the last address that is not allowed to be accessed within the segment; the range of valid offsets is from (effective-limit + 1) to FFFFFFFFH if the B flag is set and from (effective-li[...]
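
The valid offset range for an expand-down segment can be checked with a small sketch. It follows the rule stated above and assumes the upper bound is FFFFH when the B flag is clear, since the excerpt is cut off at that point. The function name is hypothetical, and only a single-byte access is modeled.

```python
def expand_down_offset_valid(offset: int, eff_limit: int, b_flag: bool) -> bool:
    """Check a one-byte access against an expand-down segment's limits."""
    upper = 0xFFFFFFFF if b_flag else 0xFFFF  # upper bound depends on the B flag
    # Valid offsets run from (effective limit + 1) up to the upper bound.
    return eff_limit < offset <= upper


print(expand_down_offset_valid(0x1000, 0x0FFF, True))   # first valid byte
print(expand_down_offset_valid(0x0FFF, 0x0FFF, True))   # the limit itself is not accessible
```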

  • Page 137

    Vol. 3A 4-7 PROTECTION • When a segment selector is loaded into a segment register — Certain segment registers can contain only certain descriptor types, for example: — The CS register only can be loaded with a selector for a code segment. — Segment selectors for code segments that are not readable or for system segments cannot be loaded [...]

  • Page 138

    4-8 Vol. 3A PROTECTION — On a call or jump through a call gate (or on an interrupt- or exception-handler call through a trap or interrupt gate), the processor automatically checks that the segment descriptor being pointed to by the gate is for a code segment. — On a call or jump to a new task through a task gate (or on an interrupt- or except[...]

  • Page 139

    Vol. 3A 4-9 PROTECTION The processor uses privilege levels to prevent a program or task operating at a lesser privilege level from accessing a segment with a greater privilege, except under controlled situations. When the processor detects a privilege level violation, it generates a general-protection exception (#GP). To carry out privilege-le[...]

  • Page 140

    4-10 Vol. 3A PROTECTION — Nonconforming code segment (without using a call gate) — The DPL indicates the privilege level that a program or task must be at to access the segment. For example, if the DPL of a nonconforming code segment is 0, only programs running at a CPL of 0 can access the segment. — Call gate — The DPL indicates the numer[...]

  • Page 141

    Vol. 3A 4-11 PROTECTION 4.6 PRIVILEGE LEVEL CHECKING WHEN ACCESSING DATA SEGMENTS To access operands in a data segment, the segment selector for the data segment must be loaded into the data-segment registers (DS, ES, FS, or GS) or into the stack-segment register (SS). (Segment registers can be loaded with the MOV, POP, LDS, LES, LFS, LGS, [...]

  • Page 142

    4-12 Vol. 3A PROTECTION 4. The procedure in code segment D should be able to access data segment E because code segment D’s CPL is numerically less than the DPL of data segment E. However, the RPL of segment selector E3 (which the code segment D procedure is using to access data segment E) is numerically greater than the DPL of data segment E[...]
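
The case described in item 4 follows from the data-segment rule that both the CPL and the RPL must be numerically less than or equal to the DPL. A minimal sketch of that rule follows; the function name and the privilege values in the example are illustrative and are not taken from the manual's figure.

```python
def data_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """Privilege check when loading a data-segment register."""
    for level in (cpl, rpl, dpl):
        if level not in range(4):
            raise ValueError("privilege levels range from 0 to 3")
    # The weaker (numerically larger) of CPL and RPL must not exceed the DPL.
    return max(cpl, rpl) <= dpl


# A privileged caller using a selector whose RPL is weaker than the DPL is refused.
print(data_access_allowed(cpl=0, rpl=3, dpl=2))
```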

  • Page 143

    Vol. 3A 4-13 PROTECTION 4.6.1 Accessing Data in Code Segments In some instances it may be desirable to access data structures that are contained in a code segment. The following methods of accessing data in code segments are possible: • Load a data-segment register with a segment selector for a nonconforming, readable, code segment. • Load[...]

  • Page 144

    4-14 Vol. 3A PROTECTION A JMP or CALL instruction can reference another code segment in any of four ways: • The target operand contains the segment selector for the target code segment. • The target operand points to a call-gate descriptor, which contains the segment selector for the target code segment. • The target operand points to a [...]

  • Page 145

    Vol. 3A 4-15 PROTECTION • The DPL of the segment descriptor for the destination code segment that contains the called procedure. • The RPL of the segment selector of the destination code segment. • The conforming (C) flag in the segment descriptor for the destination code segment, which determines whether the segment is a conforming (C fl[...]

  • Page 146

    4-16 Vol. 3A PROTECTION The RPL of the segment selector that points to a nonconforming code segment has a limited effect on the privilege check. The RPL must be numerically less than or equal to the CPL of the calling procedure for a successful control transfer to occur. So, in the example in Figure 4-7, the RPLs of segment selectors C1 and C2[...]

  • Page 147

    Vol. 3A 4-17 PROTECTION In the example in Figure 4-7, code segment D is a conforming code segment. Therefore, calling procedures in both code segment A and B can access code segment D (using either segment selector D1 or D2, respectively), because they both have CPLs that are greater than or equal to the DPL of the conforming code segment. For co[...]
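
The rules for direct transfers (without a gate) to nonconforming and conforming code segments can be sketched together. The nonconforming rule (CPL equal to the DPL, with the RPL no greater than the CPL) combines this page's text with the preceding page; the function name and example values are hypothetical.

```python
def direct_transfer_allowed(cpl: int, rpl: int, target_dpl: int, conforming: bool) -> bool:
    """Privilege check for a far JMP or CALL straight to a code segment."""
    if conforming:
        # Conforming segment: the caller may be equally or less privileged;
        # the RPL is not checked.
        return cpl >= target_dpl
    # Nonconforming segment: privilege levels must match exactly,
    # and the selector's RPL must not be weaker than the CPL.
    return cpl == target_dpl and rpl <= cpl


print(direct_transfer_allowed(cpl=3, rpl=3, target_dpl=1, conforming=True))
print(direct_transfer_allowed(cpl=3, rpl=3, target_dpl=1, conforming=False))
```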

  • Page 148

    4-18 Vol. 3A PROTECTION 4.8.3 Call Gates Call gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level protection mechanism. Call gates are also useful for transferring program control between 16-bit and 32-bit code s[...]

  • Page 149

    Vol. 3A 4-19 PROTECTION Note that the P flag in a gate descriptor is normally always set to 1. If it is set to 0, a not present (#NP) exception is generated when a program attempts to access the descriptor. The operating system can use the P flag for special purposes. For example, it could be used to track the number of times the gate is used. [...]

  • Page 150

    4-20 Vol. 3A PROTECTION • Target code segments referenced by a 64-bit call gate must be 64-bit code segments (CS.L = 1, CS.D = 0). If not, the reference generates a general-protection exception, #GP (CS selector). • Only 64-bit mode call gates can be referenced in IA-32e mode (64-bit mode and compatibility mode). The legacy 32-bit mod[...]

  • Page 151

    Vol. 3A 4-21 PROTECTION Figure 4-10. Call-Gate Mechanism. Figure 4-11. Privilege Check for Control Transfer with Call Gate. (Diagrams not reproduced.) [...]

  • Page 152

    4-22 Vol. 3A PROTECTION The privilege checking rules are different depending on whether the control transfer was initiated with a CALL or a JMP instruction, as shown in Table 4-1. The DPL field of the call-gate descriptor specifies the numerically highest privilege level from which a calling procedure can access the call gate; that is, to acce[...]
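
The difference between CALL and JMP through a call gate can be sketched as below. Table 4-1 is not reproduced in this excerpt, so the rules coded here are an assumption based on the standard call-gate checks: both the CPL and the RPL must be numerically no greater than the gate's DPL; a CALL may reach a more privileged nonconforming segment, while a JMP may only reach one at the same privilege level. All names are hypothetical.

```python
def gate_transfer_allowed(cpl: int, rpl: int, gate_dpl: int,
                          target_dpl: int, target_conforming: bool,
                          is_call: bool) -> bool:
    """Privilege check for a CALL or JMP through a call gate (assumed rules)."""
    # The gate itself must be accessible from the caller's CPL and selector RPL.
    if max(cpl, rpl) > gate_dpl:
        return False
    if is_call or target_conforming:
        # The target may be equally or more privileged than the caller.
        return target_dpl <= cpl
    # JMP to a nonconforming segment cannot change the privilege level.
    return target_dpl == cpl


# An application (CPL 3) reaching ring-0 code through a gate with DPL 3.
print(gate_transfer_allowed(3, 3, 3, 0, False, is_call=True))
print(gate_transfer_allowed(3, 3, 3, 0, False, is_call=False))
```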

  • Page 153

    Vol. 3A 4-23 PROTECTION Call gates allow a single code segment to have procedures that can be accessed at different privilege levels. For example, an operating system located in a code segment may have some services which are intended to be used by both the operating system and application software (such as procedures for handling character I/[...]

  • Page 154

    4-24 Vol. 3A PROTECTION Each task must define up to 4 stacks: one for applications code (running at privilege level 3) and one for each of the privilege levels 2, 1, and 0 that are used. (If only two privilege levels are used [3 and 0], then only two stacks must be defined.) Each of these stacks is located in a separate segment and is identified [...]

  • Page 155

    Vol. 3A 4-25 PROTECTION 4. Temporarily saves the current values of the SS and ESP registers. 5. Loads the segment selector and stack pointer for the new stack in the SS and ESP registers. 6. Pushes the temporarily saved values for the SS and ESP registers (for the calling procedure) onto the new stack (see Figure 4-13). 7. Copies the number [...]

  • Page 156

    4-26 Vol. 3A PROTECTION 4.8.5.1 Stack Switching in 64-bit Mode Although protection-check rules for call gates are unchanged from 32-bit mode, stack-switch changes in 64-bit mode are different. When stacks are switched as part of a 64-bit mode privilege-level change through a call gate, a new SS (stack segment) descriptor is not loaded; 64-bit[...]

  • Page 157

    Vol. 3A 4-27 PROTECTION from the stack into the EIP register, it checks that the pointer does not exceed the limit of the current code segment. On a far return at the same privilege level, the processor pops both a segment selector for the code segment being returned to and a return instruction pointer from the stack. Under normal conditions, [...]

  • Page 158

    4-28 Vol. 3A PROTECTION new CPL (excluding conforming code segments), the segment register is loaded with a null segment selector. See the description of the RET instruction in Chapter 3, Instruction Set Reference, of the IA-32 Intel Architecture Software Developer’s Manual, Volume 2, for a detailed description of the privilege level [...]

  • Page 159

    Vol. 3A 4-29 PROTECTION MSRs and general-purpose registers eliminates all memory accesses except when fetching the target code. Any additional state that needs to be saved to allow a return to the calling procedure must be saved explicitly by the calling procedure or be predefined through programming conventions. 4.8.7.1 SYSENTER and SYSEXIT Inst[...]

  • Page 160

    4-30 Vol. 3A PROTECTION When SYSEXIT transfers control to compatibility mode user code when the operand size attribute is 32 bits, the following fields are generated and bits set: • Target code segment — Computed by adding 16 to the value in IA32_SYSENTER_CS. • New CS attributes — L-bit = 0 (go to compatibility mode). • Target instr[...]

  • Page 161

    Vol. 3A 4-31 PROTECTION When SYSRET transfers control to 64-bit mode user code using REX.W, the processor gets the privilege level 3 target instruction and stack pointer from: • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48] + 16. • Target instruction — Copies the value in RCX into RIP. • Stack segment — IA[...]

  • Page 162

    4-32 Vol. 3A PROTECTION 4.9 PRIVILEGED INSTRUCTIONS Some of the system instructions (called “privileged instructions”) are protected from use by application programs. The privileged instructions control system functions (such as the loading of system registers). They can be executed only when the CPL is 0 (most privileged). If on[...]

  • Page 163

    Vol. 3A 4-33 PROTECTION 3. Checking if the pointer offset exceeds the segment limit. 4. Checking if the supplier of the pointer is allowed to access the segment. 5. Checking the offset alignment. The processor automatically performs the first, second, and third checks during instruction execution. Software must explicitly request the four[...]

  • Page 164

    4-34 Vol. 3A PROTECTION 4.10.2 Checking Read/Write Rights (VERR and VERW Instructions) When the processor accesses any code or data segment it checks the read/write privileges assigned to the segment to verify that the intended read or write operation is allowed. Software can check read/write rights using the VERR (verify for reading) and VER[...]

  • Page 165

    Vol. 3A 4-35 PROTECTION 5. If the privilege level and type checks pass, loads the unscrambled limit (the limit scaled according to the setting of the G flag in the segment descriptor) into the destination register and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an inval[...]

  • Page 166

    4-36 Vol. 3A PROTECTION Now assume that instead of setting the RPL of the segment selector to 3, the application program sets the RPL to 0 (segment selector D2). The operating system can now access data segment D, because its CPL and the RPL of segment selector D2 are both equal to the DPL of data segment D. Because the application program i[...]

  • Page 167

    Vol. 3A 4-37 PROTECTION application program (represented by the code-segment selector pushed onto the stack). If the RPL is less than the application program’s privilege level, the ARPL instruction changes the RPL of the segment selector to match the privilege level of the application program (segment selector D1). Using this instructi[...]
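
The RPL adjustment performed by ARPL can be modeled in a few lines. This is a sketch under stated assumptions: the function name and the tuple return value are illustrative, and the ZF result reflects the instruction's documented behavior of setting ZF when the RPL is changed, a detail the excerpt above does not spell out.

```python
def arpl(dest_selector: int, src_selector: int) -> tuple[int, bool]:
    """Model ARPL: raise the RPL of dest_selector to that of src_selector.

    Returns (new_selector, zf). The RPL occupies bits 1:0 of a selector.
    src_selector stands for the caller's code-segment selector.
    """
    dest_rpl = dest_selector & 0x3
    src_rpl = src_selector & 0x3
    if dest_rpl < src_rpl:
        # Numerically smaller RPL means more privilege: weaken it to match.
        return ((dest_selector & 0xFFFC) | src_rpl), True
    return dest_selector, False
```

A selector passed in with RPL 0 by a privilege-level-3 caller comes back with RPL 3, so later access checks are made on the caller's behalf rather than the operating system's.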

  • Page 168

    4-38 Vol. 3A PROTECTION 4.11.1 Page-Protection Flags Protection information for pages is contained in two flags in a page-directory or page-table entry (see Figure 3-14): the read/write flag (bit 1) and the user/supervisor flag (bit 2). The protection checks are applied to both first- and second-level page tables (that is, page directories an[...]

  • Page 169

    Vol. 3A 4-39 PROTECTION read/write accessible. User-mode pages which are read/write or read-only are readable; supervisor-mode pages are neither readable nor writable from user mode. A page-fault exception is generated on any attempt to violate the protection rules. The P6 family, Pentium, and Intel486 processors allow user-mode pages to be wri[...]
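
The user-mode rules above can be expressed as a small check on the two protection flags (read/write, bit 1; user/supervisor, bit 2). The sketch below is illustrative only: the names are hypothetical, and combining the page-directory and page-table entries by requiring both to permit the access is an assumption based on the statement that checks apply at both levels.

```python
RW_FLAG = 1 << 1  # read/write flag (bit 1): 1 = writable
US_FLAG = 1 << 2  # user/supervisor flag (bit 2): 1 = user-mode page


def entry_allows_user(entry: int, is_write: bool) -> bool:
    """Check one paging entry against a user-mode (CPL 3) access."""
    if not entry & US_FLAG:
        return False  # supervisor pages are not accessible from user mode
    if is_write:
        return bool(entry & RW_FLAG)  # writes need the read/write flag
    return True  # user pages are always readable


def user_access_allowed(pde: int, pte: int, is_write: bool) -> bool:
    """Assumption: both levels must allow the access; otherwise #PF."""
    return entry_allows_user(pde, is_write) and entry_allows_user(pte, is_write)
```

A write to a page whose page-table entry is user/read-only returns False here, which corresponds to the page-fault exception described in the text.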

  • Page 170

    4-40 Vol. 3A PROTECTION Page-level protection can be used to enhance segment-level protection. For example, if a large read-write data segment is paged, the page-protection mechanism can be used to write-protect individual pages. NOTE: * If CR0.WP = 1, access type is determined by the R/W flags of the page-directory and page-table entries. IF C[...]

  • Page 171

    Vol. 3A 4-41 PROTECTION While the execute disable bit capability does not introduce new instructions, it does require operating systems to use a PAE-enabled environment and establish a page-granular protection policy for memory pages. If the execute disable bit of a memory page is set, that page can be used only as data. An attempt to execute [...]

  • Page 172

    4-42 Vol. 3A PROTECTION tures. Execute-disable bit protection can be activated using the execute-disable bit at any level of the paging structure, irrespective of the corresponding entry in other levels. When execute-disable-bit protection is not activated, the page can be used as code or data. In legacy PAE-enabled mode, Table 4-7 and Tabl[...]

  • Page 173

    Vol. 3A 4-43 PROTECTION 4.13.3 Reserved Bit Checking The processor enforces reserved bit checking in paging data structure entries. The bits being checked vary with paging mode and may vary with the size of physical address space. Table 4-9 shows the reserved bits that are checked when the execute disable bit capability is enabled (CR4.PAE [...]

  • Page 174

    4-44 Vol. 3A PROTECTION Table 4-10. Reserved Bit Checking With Execute-Disable Bit Capability Not Enabled 4.13.4 Exception Handling When execute disable bit capability is enabled (IA32_EFER.NXE = 1), conditions for a page fault to occur include the same conditions that apply to an IA-32 processor without execute disable bit capability plus[...]

  • Page 175

    5 Interrupt and Exception Handling[...]

  • Page 176

    [...]

  • Page 177

    Vol. 3A 5-1 CHAPTER 5 INTERRUPT AND EXCEPTION HANDLING This chapter describes the processor’s interrupt and exception-handling mechanism when operating in protected mode. Most of the information provided here also applies to interrupt and exception mechanisms used in real-address, virtual-8086 mode, and 64-bit mode. Chapter 15, “8086 Em[...]

  • Page 178

    5-2 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.2 EXCEPTION AND INTERRUPT VECTORS To aid in handling exceptions and interrupts, each IA-32 architecture-defined exception and each interrupt condition that requires special handling by the processor is assigned a unique identification number, called a vector. The processor uses the vector ass [...]

  • Page 179

    Vol. 3A 5-3 INTERRUPT AND EXCEPTION HANDLING Table 5-1. Protected-Mode Exceptions and Interrupts
    Vector No. | Mnemonic | Description | Type | Error Code | Source
    0 | #DE | Divide Error | Fault | No | DIV and IDIV instructions.
    1 | #DB | RESERVED | Fault/Trap | No | For Intel use only.
    2 | - | NMI Interrupt | Interrupt | No | Nonmaskable external interrupt.
    3 | #BP | Breakpoint | T[...]

  • Page 180

    5-4 Vol. 3A INTERRUPT AND EXCEPTION HANDLING The processor’s local APIC is normally connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC’s pins can be directed to the local APIC through the system bus (Pentium 4 and Intel Xeon processors) or the APIC serial bus (P6 family and Pentium processors). The I/[...]

  • Page 181

    Vol. 3A 5-5 INTERRUPT AND EXCEPTION HANDLING 5.4 SOURCES OF EXCEPTIONS The processor receives exceptions from three sources: • Processor-detected program-error exceptions. • Software-generated exceptions. • Machine-check exceptions. 5.4.1 Program-Error Exceptions The processor generates one or more exceptions when it detects program error[...]

  • Page 182

    5-6 Vol. 3A INTERRUPT AND EXCEPTION HANDLING • Faults: A fault is an exception that can generally be corrected and that, once corrected, allows the program to be restarted with no loss of continuity. When a fault is reported, the processor restores the machine state to the state prior to the beginning of execution of the faulting instruc[...]

  • Page 183

    Vol. 3A 5-7 INTERRUPT AND EXCEPTION HANDLING For trap-class exceptions, the return instruction pointer points to the instruction following the trapping instruction. If a trap is detected during an instruction which transfers execution, the return instruction pointer reflects the transfer. For example, if a trap is detected while executing a JMP[...]

  • Page 184

    5-8 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.7 NONMASKABLE INTERRUPT (NMI) The nonmaskable interrupt (NMI) can be generated in either of two ways: • External hardware asserts the NMI pin. • The processor receives a message on the system bus (Pentium 4 and Intel Xeon processors) or the APIC serial bus (P6 family and Pentium processors) wi[...]

  • Page 185

    Vol. 3A 5-9 INTERRUPT AND EXCEPTION HANDLING 5.8.1 Masking Maskable Hardware Interrupts The IF flag can disable the servicing of maskable hardware interrupts received on the processor’s INTR pin or through the local APIC (see Section 5.3.2, “Maskable Hardware Interrupts”). When the IF flag is clear, the processor inhibits interrupt s[...]

  • Page 186

    5-10 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Manual, Volume 2A, for a detailed description of the operations these instructions are allowed to perform on the IF flag. 5.8.2 Masking Instruction Breakpoints The RF (resume) flag in the EFLAGS register controls the response of the processor to instruction-breakpoint conditions (see the descript[...]

  • Page 187

    Vol. 3A 5-11 INTERRUPT AND EXCEPTION HANDLING While priority among these classes listed in Table 5-2 is consistent throughout the architecture, exceptions within each class are implementation-dependent and may vary from processor to processor. The processor first services a pending exception or interrupt from the class which has the highest p[...]

  • Page 188

    5-12 Vol. 3A INTERRUPT AND EXCEPTION HANDLING re-generated when the interrupt handler returns execution to the point in the program or task where the exceptions and/or interrupts occurred. 5.10 INTERRUPT DESCRIPTOR TABLE (IDT) The interrupt descriptor table (IDT) associates each exception or interrupt vector with a gate descriptor for the proce[...]

  • Page 189

    Vol. 3A 5-13 INTERRUPT AND EXCEPTION HANDLING 5.11 IDT DESCRIPTORS The IDT may contain any of three kinds of gate descriptors: • Task-gate descriptor • Interrupt-gate descriptor • Trap-gate descriptor Figure 5-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The format of a task gate used in an IDT i[...]

  • Page 190

    5-14 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.12 EXCEPTION AND INTERRUPT HANDLING The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses the exception or interrupt vector as an ind[...]

  • Page 191

    Vol. 3A 5-15 INTERRUPT AND EXCEPTION HANDLING through Section 4.8.6, “Returning from a Called Procedure”). If index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (see Section 6.3, “Task Switching”). 5.12.1 Exception- or Interrupt-H[...]

  • Page 192

    5-16 Vol. 3A INTERRUPT AND EXCEPTION HANDLING When the processor performs a call to the exception- or interrupt-handler procedure: • If the handler procedure is going to be executed at a numerically lower privilege level, a stack switch occurs. When the stack switch occurs: a. The segment selector and stack pointer for the stack to be used by th[...]

  • Page 193

    Vol. 3A 5-17 INTERRUPT AND EXCEPTION HANDLING To return from an exception- or interrupt-handler procedure, the handler must use the IRET (or IRETD) instruction. The IRET instruction is similar to the RET instruction except that it restores the saved flags into the EFLAGS register. The IOPL field of the EFLAGS register is restored only if the [...]

  • Page 194

    5-18 Vol. 3A INTERRUPT AND EXCEPTION HANDLING An attempt to violate this rule results in a general-protection exception (#GP). The protection mechanism for exception- and interrupt-handler procedures is different in the following ways: • Because interrupt and exception vectors have no RPL, the RPL is not checked on implicit calls to exception a[...]

  • Page 195

    Vol. 3A 5-19 INTERRUPT AND EXCEPTION HANDLING 5.12.2 Interrupt Tasks When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling an exception or interrupt with a separate task offers several advantages: • The entire context of the interrupted program or task is saved automatically. •[...]

  • Page 196

    5-20 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Figure 5-5. Interrupt Task Switch (figure labels: IDT, Task Gate, TSS Selector, GDT, TSS Descriptor, TSS Base Address, Interrupt Vector, TSS for Interrupt-Handling Task)[...]

  • Page 197

    Vol. 3A 5-21 INTERRUPT AND EXCEPTION HANDLING 5.13 ERROR CODE When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 5-6. The error code resembles a segment selector; however, instead[...]
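
A handler that receives such an error code might decode it as sketched below. The bit positions (EXT in bit 0, IDT in bit 1, TI in bit 2, selector index in bits 15:3) follow the architecture's standard error-code layout and are stated here as an assumption, since Figure 5-6 is not reproduced in this excerpt. The function and key names are hypothetical.

```python
def decode_error_code(code: int) -> dict:
    """Split a segment-related exception error code into its fields.

    Assumed layout: bit 0 = EXT, bit 1 = IDT, bit 2 = TI,
    bits 15:3 = segment selector index.
    """
    return {
        "ext": bool(code & 0x1),       # caused by an event external to the program
        "idt": bool(code & 0x2),       # index refers to a gate in the IDT
        "ti": bool(code & 0x4),        # 0 = GDT, 1 = LDT (meaningful when idt is False)
        "index": (code >> 3) & 0x1FFF,  # descriptor index within the table
    }
```

For instance, an error code of 0x12 decodes to IDT entry 2 with the EXT and TI flags clear.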

  • Page 198

    5-22 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.14 EXCEPTION AND INTERRUPT HANDLING IN 64-BIT MODE In 64-bit mode, interrupt and exception handling is similar to what has been described for non-64-bit modes. The following are the exceptions: • All interrupt handlers pointed to by the IDT are in 64-bit code (this does not apply to the SMI handl[...]

  • Page 199

    Vol. 3A 5-23 INTERRUPT AND EXCEPTION HANDLING In 64-bit mode, the IDT index is formed by scaling the interrupt vector by 16. The first eight bytes (bytes 7:0) of a 64-bit mode interrupt gate are similar but not identical to legacy 32-bit interrupt gates. The type field (bits 11:8 in bytes 7:4) is described in Table 3-2. The Interrupt Stack T [...]
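
The vector scaling can be shown with a one-line computation. The sketch assumes 8-byte gate descriptors outside 64-bit mode (the usual protected-mode layout) alongside the 16-byte scaling stated above. The function name is hypothetical.

```python
def idt_entry_offset(vector: int, long_mode: bool) -> int:
    """Byte offset of a vector's gate descriptor from the IDT base.

    64-bit mode scales the vector by 16 (per the text); protected mode
    is assumed to scale by 8, the size of a legacy gate descriptor.
    """
    if not 0 <= vector <= 255:
        raise ValueError("interrupt vectors range from 0 to 255")
    return vector * (16 if long_mode else 8)
```

The page-fault vector (14) therefore sits at offset 224 in a 64-bit IDT and at offset 112 in a protected-mode IDT.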

  • Page 200

    5-24 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.14.3 IRET in IA-32e Mode In IA-32e mode, IRET executes with an 8-byte operand size. There is nothing that forces this requirement. The stack is formatted in such a way that for actions where IRET is required, the 8-byte IRET operand size works correctly. Because interrupt stack-frame pushes are a[...]

  • Page 201

    Vol. 3A 5-25 INTERRUPT AND EXCEPTION HANDLING In summary, a stack switch in IA-32e mode works like the legacy stack switch, except that a new SS selector is not loaded from the TSS. Instead, the new SS is forced to NULL. 5.14.5 Interrupt Stack Table In IA-32e mode, a new interrupt stack table (IST) mechanism is available as an alternative t[...]

  • Page 202

    5-26 Vol. 3A INTERRUPT AND EXCEPTION HANDLING The IST mechanism provides up to seven IST pointers in the TSS. The pointers are referenced by an interrupt-gate descriptor in the interrupt-descriptor table (IDT); see Figure 5-7. The gate descriptor contains a 3-bit IST index field that provides an offset into the IST section of the TSS. Using t[...]
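
How the 3-bit IST index selects one of the seven pointers can be sketched as follows. Two details are assumptions taken from the architecture's descriptor and TSS layouts rather than from this excerpt: the IST field occupies bits 34:32 of the gate descriptor's low quadword, and IST1 starts at byte offset 24H in the 64-bit TSS with 8 bytes per pointer. An index of 0 is treated as "IST not used". All names are hypothetical.

```python
IST1_TSS_OFFSET = 0x24  # assumed offset of IST1 within the 64-bit TSS
IST_POINTER_SIZE = 8    # each IST entry is a 64-bit stack pointer


def ist_index(gate_low_qword: int) -> int:
    """Extract the 3-bit IST index (assumed to be bits 34:32)."""
    return (gate_low_qword >> 32) & 0x7


def ist_pointer_offset(index: int):
    """TSS byte offset of the selected IST pointer, or None if index is 0."""
    if not 0 <= index <= 7:
        raise ValueError("IST index is a 3-bit field")
    if index == 0:
        return None  # assumption: 0 selects the ordinary stack-switch path
    return IST1_TSS_OFFSET + (index - 1) * IST_POINTER_SIZE
```

A gate with IST index 3 would, under these assumptions, take its stack pointer from TSS offset 0x34.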

  • Page 203

    Vol. 3A 5-27 INTERRUPT AND EXCEPTION HANDLING Interrupt 0: Divide Error Exception (#DE) Exception Class Fault. Description Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand. Exception Error Code None. Saved Instruction Pointer S[...]

  • Page 204

    5-28 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 1: Debug Exception (#DB) Exception Class Trap or Fault. The exception handler can distinguish between traps or faults by examining the contents of DR6 and the other debug registers. Description Indicates that one or more of several debug-exception conditions has been detected. Whether t[...]

  • Page 205

    Vol. 3A 5-29 INTERRUPT AND EXCEPTION HANDLING Interrupt 2: NMI Interrupt Exception Class Not applicable. Description The nonmaskable interrupt (NMI) is generated externally by asserting the processor’s NMI pin or through an NMI request set by the I/O APIC to the local APIC. This interrupt causes the NMI interrupt handler to be called. Excepti[...]

  • Page 206

    5-30 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 3: Breakpoint Exception (#BP) Exception Class Trap. Description Indicates that a breakpoint instruction (INT 3) was executed, causing a breakpoint trap to be generated. Typically, a debugger sets a breakpoint by replacing the first opcode byte of an instruction with the opcode for th[...]

  • Page 207

    Vol. 3A 5-31 INTERRUPT AND EXCEPTION HANDLING Interrupt 4: Overflow Exception (#OF) Exception Class Trap. Description Indicates that an overflow trap occurred when an INTO instruction was executed. The INTO instruction checks the state of the OF flag in the EFLAGS register. If the OF flag is set, an overflow trap is generated. Some arit[...]

  • Page 208

    5-32 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 5: BOUND Range Exceeded Exception (#BR) Exception Class Fault. Description Indicates that a BOUND-range-exceeded fault occurred when a BOUND instruction was executed. The BOUND instruction checks that a signed array index is within the upper and lower bounds of an array located in memor[...]

  • Page 209

    Vol. 3A 5-33 INTERRUPT AND EXCEPTION HANDLING Interrupt 6: Invalid Opcode Exception (#UD) Exception Class Fault. Description Indicates that the processor did one of the following things: • Attempted to execute an invalid or reserved opcode. • Attempted to execute an instruction with an operand type that is invalid for its accompanying opcod[...]

  • Page 210

    5-34 Vol. 3A INTERRUPT AND EXCEPTION HANDLING The opcodes D6 and F1 are undefined opcodes that are reserved by the IA-32 architecture. These opcodes, even though undefined, do not generate an invalid opcode exception. The UD2 instruction is guaranteed to generate an invalid opcode exception. Exception Error Code None. Saved Instruction Pointer[...]

  • Page 211

    Vol. 3A 5-35 INTERRUPT AND EXCEPTION HANDLING Interrupt 7: Device Not Available Exception (#NM) Exception Class Fault. Description Indicates one of the following things: The device-not-available exception is generated by any of three conditions: • The processor executed an x87 FPU floating-point instruction while the EM flag in control [...]

  • Page 212

    5-36 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the floating-point instruction or the WAIT/FWAIT instruction that generated the exception. Program State Change A program-state change does not accompany a device-not-available fault, because the instruction that ge[...]

  • Page 213

    Vol. 3A 5-37 INTERRUPT AND EXCEPTION HANDLING Interrupt 8: Double Fault Exception (#DF) Exception Class Abort. Description Indicates that the processor detected a second exception while calling an exception handler for a prior exception. Normally, when the processor detects another exception while trying to call an exception handler, the [...]

  • Page 214

    5-38 Vol. 3A INTERRUPT AND EXCEPTION HANDLING If another exception occurs while attempting to call the double-fault handler, the processor enters shutdown mode. This mode is similar to the state following execution of an HLT instruction. In this mode, the processor stops executing instructions until an NMI interrupt, SMI interrupt, har[...]

  • Page 215

    Vol. 3A 5-39 INTERRUPT AND EXCEPTION HANDLING Interrupt 9: Coprocessor Segment Overrun Exception Class Abort. (Intel reserved; do not use. Recent IA-32 processors do not generate this exception.) Description Indicates that an Intel386 CPU-based system with an Intel 387 math coprocessor detected a page or segment violation while transferring th[...]

  • Page 216

    5-40 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 10: Invalid TSS Exception (#TS) Exception Class Fault. Description Indicates that there was an error related to a TSS. Such an error might be detected during a task switch or during the execution of instructions that use information from a TSS. Table 5-6 shows the conditions that cause an[...]

  • Page 217

    Vol. 3A 5-41 INTERRUPT AND EXCEPTION HANDLING This exception can be generated either in the context of the original task or in the context of the new task (see Section 6.3, “Task Switching”). Until the processor has completely verified the presence of the new TSS, the exception is generated in the context of the original task. Once the existenc[...]

  • Page 218

    5-42 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Exception Error Code An error code containing the segment selector index for the segment descriptor that caused the violation is pushed onto the stack of the exception handler. If the EXT flag is set, it indicates that the exception was caused by an event external to the currently running program [...]

  • Page 219

    Vol. 3A 5-43 INTERRUPT AND EXCEPTION HANDLING Interrupt 11: Segment Not Present (#NP) Exception Class Fault. Description Indicates that the present flag of a segment or gate descriptor is clear. The processor can generate this exception during any of the following operations: • While attempting to load CS, DS, ES, FS, or GS registers. [Detect[...]

  • Page 220

    5-44 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers normally point to the instruction that generated the exception. If the exception occurred while loading segment descriptors for the segment selectors in a new TSS, the CS and EIP registers point to the first instruction in the new[...]

  • Page 221

    Vol. 3A 5-45 INTERRUPT AND EXCEPTION HANDLING Interrupt 12: Stack Fault Exception (#SS) Exception Class Fault. Description Indicates that one of the following stack related conditions was detected: • A limit violation is detected during an operation that refers to the SS register. Operations that can cause a limit violation include stack-ori[...]

  • Page 222

    5-46 Vol. 3A INTERRUPT AND EXCEPTION HANDLING exception. The stack fault handler should thus not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. The exception handler should check all segment registers before trying to resume the new task; otherwise, genera[...]

  • Page 223

    Vol. 3A 5-47 INTERRUPT AND EXCEPTION HANDLING Interrupt 13: General Protection Exception (#GP) Exception Class Fault. Description Indicates that the processor detected one of a class of protection violations called “general-protection violations.” The conditions that cause this exception to be generated comprise all the protection violatio[...]

  • Page 224

    5-48 Vol. 3A INTERRUPT AND EXCEPTION HANDLING • Loading the CR0 register with a set NW flag and a clear CD flag. • Referencing an entry in the IDT (following an interrupt or exception) that is not an interrupt, trap, or task gate. • Attempting to access an interrupt or exception handler through an interrupt or trap gate from virtual-8086 [...]

  • Page 225

    Vol. 3A 5-49 INTERRUPT AND EXCEPTION HANDLING • A selector from a TSS involved in a task switch. • IDT vector number. Saved Instruction Pointer The saved contents of CS and EIP registers point to the instruction that generated the exception. Program State Change In general, a program-state change does not accompany a general-protection ex[...]

  • Page 226

    5-50 Vol. 3A INTERRUPT AND EXCEPTION HANDLING • If the segment descriptor from a 64-bit call gate is in non-canonical space. • If the DPL from a 64-bit call-gate is less than the CPL or than the RPL of the 64-bit call-gate. • If the upper type field of a 64-bit call gate is not 0x0. • If an attempt is made to load a null selector in th [...]

  • Page 227

    Vol. 3A 5-51 INTERRUPT AND EXCEPTION HANDLING Interrupt 14: Page-Fault Exception (#PF) Exception Class Fault. Description Indicates that, with paging enabled (the PG flag in the CR0 register is set), the processor detected one of the following conditions while using the page-translation mechanism to translate a linear address to a physical ad[...]

  • Page 228

    5-52 Vol. 3A INTERRUPT AND EXCEPTION HANDLING - The RSVD flag indicates that the processor detected 1s in reserved bits of the page directory, when the PSE or PAE flags in control register CR4 are set to 1. (The PSE flag is only available in the Pentium 4, Intel Xeon, P6 family, and Pentium processors, and the PAE flag is only available on [...]
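
A page-fault handler typically begins by decoding the flags of the page-fault error code, including the RSVD flag described above. The following sketch assumes the standard bit assignment (P in bit 0, W/R in bit 1, U/S in bit 2, RSVD in bit 3), which is not shown in this excerpt. Flags in higher bits are left out, and the names are hypothetical.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the low four flags of a page-fault (#PF) error code.

    Assumed layout: bit 0 = P, bit 1 = W/R, bit 2 = U/S, bit 3 = RSVD.
    """
    return {
        "protection_violation": bool(code & 0x1),  # P: 0 = page not present
        "write": bool(code & 0x2),                 # W/R: 1 = access was a write
        "user_mode": bool(code & 0x4),             # U/S: 1 = fault in user mode
        "reserved_bit": bool(code & 0x8),          # RSVD: 1s found in reserved bits
    }
```

An error code of 0x7, for example, describes a user-mode write that violated page-level protection on a present page.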

  • Page 229

    Vol. 3A 5-53 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers generally point to the instruction that generated the exception. If the page-fault exception occurred during a task switch, the CS and EIP registers may point to the first instruction of the new task (as described in the following [...]

  • Page 230

    5-54 Vol. 3A INTERRUPT AND EXCEPTION HANDLING When executing this code on one of the 32-bit IA-32 processors, it is possible to get a page fault, general-protection fault (#GP), or alignment check fault (#AC) after the segment selector has been loaded into the SS register but before the ESP register has been loaded. At this point, the two part[...]

  • Page 231

    Vol. 3A 5-55 INTERRUPT AND EXCEPTION HANDLING Interrupt 16: x87 FPU Floating-Point Error (#MF) Exception Class Fault. Description Indicates that the x87 FPU has detected a floating-point error. The NE flag in the register CR0 must be set for an interrupt 16 (floating-point error exception) to be generated. (See Section 2.5, “Control Register[...]

  • Page 232

    5-56 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Prior to executing a waiting x87 FPU instruction or the WAIT/FWAIT instruction, the x87 FPU checks for pending x87 FPU floating-point exceptions (as described in step 2 above). Pending x87 FPU floating-point exceptions are ignored for “non-waiting” x87 FPU instructions, which include the FNI[...]

  • Page 233

    Vol. 3A 5-57 INTERRUPT AND EXCEPTION HANDLING Interrupt 17: Alignment Check Exception (#AC) Exception Class Fault. Description Indicates that the processor detected an unaligned memory operand when alignment checking was enabled. Alignment checks are only carried out in data (or stack) accesses (not in code fetches or system segment accesses). A[...]

  • Page 234

    5-58 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Alignment-check exceptions (#AC) are generated only when operating at privilege level 3 (user mode). Memory references that default to privilege level 0, such as segment descriptor loads, do not generate alignment-check exceptions, even when caused by a memory reference made from privilege level 3.[...]

  • Page 235

    Vol. 3A 5-59 INTERRUPT AND EXCEPTION HANDLING Interrupt 18: Machine-Check Exception (#MC) Exception Class Abort. Description Indicates that the processor detected an internal machine error or a bus error, or that an external agent detected a bus error. The machine-check exception is model-specific, available only on the Pentium 4, Intel Xeo[...]

  • Page 236

    5-60 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Program State Change The machine-check mechanism is enabled by setting the MCE flag in control register CR4. For the Pentium 4, Intel Xeon, P6 family, and Pentium processors, a program-state change always accompanies a machine-check exception, and an abort class exception is generated. For abort exc[...]

  • Page 237

    Vol. 3A 5-61 INTERRUPT AND EXCEPTION HANDLING Interrupt 19: SIMD Floating-Point Exception (#XF) Exception Class Fault. Description Indicates the processor has detected an SSE/SSE2/SSE3 SIMD floating-point exception. The appropriate status flag in the MXCSR register must be set and the particular exception unmasked for this interrupt to be gen[...]

  • Page 238

    5-62 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Note that because SIMD floating-point exceptions are precise and occur immediately, the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT instruction, or another SSE/SSE2/SSE3 instruction will catch a pending unmasked SIMD floating-point exception. In situations where a SIMD[...]

  • Page 239

    Vol. 3A 5-63 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the SSE/SSE2/SSE3 instruction that was executed when the SIMD floating-point exception was generated. This is the faulting instruction in which the error condition was detected. Program State Change A program-state chan[...]

  • Page 240

    5-64 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupts 32 to 255: User Defined Interrupts Exception Class Not applicable. Description Indicates that the processor did one of the following things: • Executed an INT n instruction where the instruction operand is one of the vector numbers from 32 through 255. • Responded to an interrupt r[...]

  • Page 241

    6 Task Management[...]

  • Page 242

    [...]

  • Page 243

    Vol. 3A 6-1 CHAPTER 6 TASK MANAGEMENT This chapter describes the IA-32 architecture’s task management facilities. These facilities are only available when the processor is running in protected mode. This chapter focuses on 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and the 16-bit TSS structure, see Section 6.[...]

  • Page 244

    6-2 Vol. 3A TASK MANAGEMENT 6.1.2 Task State The following items define the state of the currently executing task: • The task’s current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS). • The state of the general-purpose registers. • The state of the EFLAGS register. • The stat[...]

  • Page 245

    Vol. 3A 6-3 TASK MANAGEMENT 6.1.3 Executing a Task Software or the processor can dispatch a task for execution in one of the following ways: • An explicit call to a task with the CALL instruction. • An explicit jump to a task with the JMP instruction. • An implicit call (by the processor) to an interrupt-handler task. • An implicit call[...]

  • Page 246

    6-4 Vol. 3A TASK MANAGEMENT Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software defined task executed in the context of a single IA-32 architecture task. 6.2 TASK MANAGEMENT DATA STRUCTURES The processor defines five data structures for handlin[...]

  • Page 247

    Vol. 3A 6-5 TASK MANAGEMENT The processor updates dynamic fields when a task is suspended during a task switch. The following are dynamic fields: • General-purpose register fields: State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior to the task switch. • Segment selector fields: Segment selectors stored in the ES,[...]

  • Page 248

    6-6 Vol. 3A TASK MANAGEMENT • EFLAGS register field: State of the EFLAGS register prior to the task switch. • EIP (instruction pointer) field: State of the EIP register prior to the task switch. • Previous task link field: Contains the segment selector for the TSS of the previous task (updated on a task switch that was initiated b[...]

  • Page 249

    Vol. 3A 6-7 TASK MANAGEMENT 6.2.2 TSS Descriptor The TSS, like all other segments, is defined by a segment descriptor. Figure 6-3 shows the format of a TSS descriptor. TSS descriptors may only be placed in the GDT; they cannot be placed in an LDT or the IDT. An attempt to access a TSS using a segment selector with its TI flag set (which ind[...]

  • Page 250

    6-8 Vol. 3A TASK MANAGEMENT The base, limit, and DPL fields and the granularity and present flags have functions similar to their use in data-segment descriptors (see Section 3.4.5, “Segment Descriptors”). When the G flag is 0 in a TSS descriptor for a 32-bit TSS, the limit field must have a value equal to or greater than 67H, one byte l[...]

  • Page 251

    Vol. 3A 6-9 TASK MANAGEMENT 6.2.4 Task Register The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (see Figure 2-5). This information is copied from the TSS descriptor in the GDT for the current task. Figu[...]

  • Page 252

    6-10 Vol. 3A TASK MANAGEMENT The LTR instruction loads a segment selector (source operand) into the task register that points to a TSS descriptor in the GDT. It then loads the invisible portion of the task register with information from the TSS descriptor. LTR is a privileged instruction that may be executed only when the CPL is 0. It[...]

  • Page 253

    Vol. 3A 6-11 TASK MANAGEMENT 6.2.5 Task-Gate Descriptor A task-gate descriptor provides an indirect, protected reference to a task (see Figure 6-6). It can be placed in the GDT, an LDT, or the IDT. The TSS segment selector field in a task-gate descriptor points to a TSS descriptor in the GDT. The RPL in this segment selector is not used. [...]

  • Page 254

    6-12 Vol. 3A TASK MANAGEMENT Figure 6-7 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task. 6.3 TASK SWITCHING The processor transfers execution to another task in one of four cases: • The current program, task, or procedure executes a JMP or CALL instruction to a TSS d[...]

  • Page 255

    Vol. 3A 6-13 TASK MANAGEMENT JMP, CALL, and IRET instructions, as well as interrupts and exceptions, are all mechanisms for redirecting a program. The referencing of a TSS descriptor or a task gate (when calling or jumping to a task) or the state of the NT flag (when executing an IRET instruction) determines whether a task switch occurs. The pr[...]

  • Page 256

    6-14 Vol. 3A TASK MANAGEMENT 10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task’s TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set. 11. Loads the task register with the segment selector[...]

  • Page 257

    Vol. 3A 6-15 TASK MANAGEMENT When switching tasks, the new task does not inherit its privilege level from the suspended task. The new task begins executing at the privilege level specified in the CPL field of the CS register, which is loaded from the TSS. Because tasks are isolated by their separate address spaces and T[...]

  • Page 258

    6-16 Vol. 3A TASK MANAGEMENT The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of the floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicates that the context of the floating-point [...]

  • Page 259

    Vol. 3A 6-17 TASK MANAGEMENT Table 6-2 shows the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. The NT flag may be modified by software executing at any privilege level. It is possible for a program to set the NT flag and execute an IRET instructio[...]

  • Page 260

    6-18 Vol. 3A TASK MANAGEMENT 6.4.1 Use of Busy Flag To Prevent Recursive Task Switching A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag in the TSS segment descriptor is provided [...]

  • Page 261

    Vol. 3A 6-19 TASK MANAGEMENT In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to ensure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared. 6.5 TASK ADDRESS SPACE The address space for a task consi[...]

  • Page 262

    6-20 Vol. 3A TASK MANAGEMENT that the mapping of TSS addresses does not change while the processor is reading and updating the TSSs during a task switch. The linear address space mapped by the GDT also should be mapped to a shared area of the physical space; otherwise, the purpose of the GDT is defeated. Figure 6-9 shows how the linear address s[...]

  • Page 263

    Vol. 3A 6-21 TASK MANAGEMENT • Through segment descriptors in distinct LDTs that are mapped to common addresses in linear address space: If this common area of the linear address space is mapped to the same area of the physical address space for each task, these segment descriptors permit the tasks to share segments. Such segment descri[...]

  • Page 264

    6-22 Vol. 3A TASK MANAGEMENT Figure 6-10. 16-Bit TSS Format (figure: 16-bit fields at byte offsets 42 down to 0, in order: Task LDT Selector, DS Selector, SS Selector, CS Selector, ES Selector, DI, SI, BP, SP, BX, DX, CX, AX, FLAG Word, IP (Entry Point), SS2, SP2, SS1, SP1, SS0, SP0, Previous Task Link)[...]

  • Page 265

    Vol. 3A 6-23 TASK MANAGEMENT 6.7 TASK MANAGEMENT IN 64-BIT MODE In 64-bit mode, task structure and task state are similar to those in protected mode. However, the task switching mechanism available in protected mode is not supported in 64-bit mode. Task management and switching must be performed by software. The processor issues a gener[...]

  • Page 266

    6-24 Vol. 3A TASK MANAGEMENT Figure 6-11. 64-Bit TSS Format (figure: 32-bit rows at byte offsets 100 down to 0; fields shown: I/O Map Base Address; RSP0, RSP1, and RSP2 (lower and upper 32 bits each); reserved bits, set to 0; IST1 (lower 32 bits); IST1 (upper[...]

  • Page 267

    7 Multiple-Processor Management[...]

  • Page 268

    [...]

  • Page 269

    Vol. 3A 7-1 CHAPTER 7 MULTIPLE-PROCESSOR MANAGEMENT The IA-32 architecture provides several mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These mechanisms include: • Bus locking and/or cache coherency management for performing atomic operations on system memory. • Seriali[...]

  • Page 270

    7-2 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • To distribute interrupt handling among a group of processors: When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing. • To increase system performance by e[...]

  • Page 271

    Vol. 3A 7-3 MULTIPLE-PROCESSOR MANAGEMENT The mechanisms for handling locked atomic operations have evolved as the complexity of IA-32 processors has evolved. As such, more recent IA-32 processors (such as the Pentium 4, Intel Xeon, and P6 family processors) provide a more refined locking mechanism than earlier IA-32 processors. These are describe[...]

  • Page 272

    7-4 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT For the Pentium 4, Intel Xeon, and P6 family processors, if the memory area being accessed is cached internally in the processor, the LOCK# signal is generally not asserted; instead, locking is only applied to the processor’s caches (see Section 7.1.4, “Effects of a LOCK Operation on Internal Proce[...]

  • Page 273

    Vol. 3A 7-5 MULTIPLE-PROCESSOR MANAGEMENT 7.1.2.2 Software Controlled Bus Locking To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location. An invalid-opcode exception (#UD) is generated when the LOCK prefix is used with any other instruction o[...]

  • Page 274

    7-6 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Locked instructions should not be used to ensure that data written can be fetched as instructions. NOTE The locked instructions for the current versions of the Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors allow data written to be fetched as instructions. However, Intel recommends [...]

  • Page 275

    Vol. 3A 7-7 MULTIPLE-PROCESSOR MANAGEMENT To write cross-modifying code and ensure that it is compliant with current and future versions of the IA-32 architecture, the following processor synchronization algorithm must be implemented: (* Action of Modifying Processor *) Memory_Flag ← 0; (* Set Memory_Flag to value other than 1 *) Store m[...]

  • Page 276

    7-8 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT To allow optimizing of instruction execution, the IA-32 architecture allows departures from the strong-ordering model, called processor ordering, in Pentium 4, Intel Xeon, and P6 family processors. These processor-ordering variations allow performance enhancing operations such as allowing reads to go ahead [...]

  • Page 277

    Vol. 3A 7-9 MULTIPLE-PROCESSOR MANAGEMENT 4. Writes can be buffered. 5. Writes are not performed speculatively; they are only performed for instructions that have actually been retired. 6. Data from buffered writes can be forwarded to waiting reads within the processor. 7. Reads or writes cannot pass (be carried out ahead of) I/O instructions[...]

  • Page 278

    7-10 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.2.3 Out-of-Order Stores For String Operations in Pentium 4, Intel Xeon, and P6 Family Processors The Pentium 4, Intel Xeon, and P6 family processors modify the processor’s operation during the string store operations (initiated with the MOVS and STOS instructions) to maximize performance. Once the [...]

  • Page 279

    Vol. 3A 7-11 MULTIPLE-PROCESSOR MANAGEMENT • The initial operation counter (ECX) must be equal to or greater than 64. • Source and destination must not overlap by less than a cache line (64 bytes, Pentium 4 and Intel Xeon processors; 32 bytes P6 family and Pentium processors). • The memory type for both source and destination addresses must [...]

  • Page 280

    7-12 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Program synchronization can also be carried out with serializing instructions (see Section 7.4). These instructions are typically used at critical procedure or task boundaries to force completion of all previous instructions before a jump to a new section of code or a context switch occurs. Like the I/O[...]

  • Page 281

    Vol. 3A 7-13 MULTIPLE-PROCESSOR MANAGEMENT It is recommended that software written to run on Pentium 4, Intel Xeon, and P6 family processors assume the processor-ordering model or a weaker memory-ordering model. The Pentium 4, Intel Xeon, and P6 family processors do not implement a strong memory-ordering model, except when using the UC memory t[...]

  • Page 282

    7-14 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.4 SERIALIZING INSTRUCTIONS The IA-32 architecture defines several serializing instructions. These instructions force the processor to complete all modifications to flags, registers, and memory by previous instructions and to drain all buffered writes to memory before the next instruction is fetched a[...]

  • Page 283

    Vol. 3A 7-15 MULTIPLE-PROCESSOR MANAGEMENT • When an instruction is executed that enables or disables paging (that is, changes the PG flag in control register CR0), the instruction should be followed by a jump instruction. The target instruction of the jump instruction is fetched with the new setting of the PG flag (that is, paging is enabled [...]

  • Page 284

    7-16 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • Intel Xeon processors with family, model, and stepping IDs up to F09H: The selection of the BSP and APs (see Section 7.5.1, “BSP and AP Processors”) is handled through arbitration on the system bus, using BIPI and FIPI messages (see Section 7.5.3, “MP Initialization Protocol Algorithm [...]

  • Page 285

    Vol. 3A 7-17 MULTIPLE-PROCESSOR MANAGEMENT • All devices in the system that are capable of delivering interrupts to the processors must be inhibited from doing so for the duration of the MP initialization protocol. The time during which interrupts must be inhibited includes the window between when the BSP issues an INIT-SIPI-SIPI sequence to [...]

  • Page 286

    7-18 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • The remainder of the processors (which were not selected as the BSP) are designated as APs. They leave their BSP flags in the clear state and enter a “wait-for-SIPI state.” • The newly established BSP broadcasts an FIPI message to “all including self,” which the BSP and APs treat as an end[...]

  • Page 287

    Vol. 3A 7-19 MULTIPLE-PROCESSOR MANAGEMENT The following constants and data definitions are used in the accompanying code examples. They are based on the addresses of the APIC registers as defined in Table 8-1. ICR_LOW EQU 0FEE00300H SVR EQU 0FEE000F0H APIC_ID EQU 0FEE00020H LVT3 EQU 0FEE00370H APIC_ENABLED EQU 0100H BOOT_ID DD ? COUNT EQU 00H V[...]

  • Page 288

    7-20 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT space (1-MByte space). For example, a vector of 0BDH specifies a start-up memory address of 000BD000H. 11. Enables the local APIC by setting bit 8 of the APIC spurious vector register (SVR). MOV ESI, SVR ; Address of SVR MOV EAX, [ESI] OR EAX, APIC_ENABLED ; Set bit 8 to enable (0 on reset) MOV [ESI], [...]
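
    The start-up address arithmetic and the SVR enable bit described above can be checked with a small C sketch. The helper names sipi_start_address and svr_with_apic_enabled are hypothetical, introduced only for illustration; the code works on plain integers and does not touch APIC hardware.

```c
#include <stdint.h>

#define APIC_ENABLED 0x0100u  /* bit 8 of the spurious vector register */

/* A SIPI vector VV selects the 4-KByte page at 000VV000H in the first 1 MByte.
   (Hypothetical helper name, not part of the original listing.) */
static uint32_t sipi_start_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}

/* Returns the given SVR value with the enable bit (bit 8) set,
   mirroring the OR EAX, APIC_ENABLED step. (Hypothetical helper name.) */
static uint32_t svr_with_apic_enabled(uint32_t svr)
{
    return svr | APIC_ENABLED;
}
```

    With these helpers, a vector of 0BDH maps to 000BD000H, matching the example in the text.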

  • Page 289

    Vol. 3A 7-21 MULTIPLE-PROCESSOR MANAGEMENT 16. Waits for the timer interrupt. 17. Reads and evaluates the COUNT variable and establishes a processor count. 18. If necessary, reconfigures the APIC and continues with the remaining system diagnostics as appropriate. 7.5.4.2 Typical AP Initialization Sequence When an AP receives the SIPI, it begi[...]

  • Page 290

    7-22 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.5.5 Identifying Logical Processors in an MP System After the BIOS has completed the MP initialization protocol, each logical processor can be uniquely identified by its local APIC ID. Software can access these APIC IDs in either of the following ways: • Read APIC ID for a local APIC: Code running[...]

  • Page 291

    Vol. 3A 7-23 MULTIPLE-PROCESSOR MANAGEMENT For P6 family processors, the APIC ID that is assigned to a processor during power-up and initialization is 4 bits (see Figure 7-2). Here, bits 0 and 1 form a 2-bit processor (or socket) identifier and bits 2 and 3 form a 2-bit cluster ID. 7.6 HYPER-THREADING AND MULTI-CORE TECHNOLOGY Hyper-Threading[...]
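
    The 4-bit P6 family APIC ID layout described above (bits 0 and 1 for the processor or socket identifier, bits 2 and 3 for the cluster ID) can be decoded as in the following C sketch. The function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Bits 0-1 of the 4-bit P6 family APIC ID: processor (socket) identifier.
   (Hypothetical helper name.) */
static unsigned p6_processor_id(uint8_t apic_id)
{
    return apic_id & 0x3u;
}

/* Bits 2-3 of the 4-bit P6 family APIC ID: cluster ID.
   (Hypothetical helper name.) */
static unsigned p6_cluster_id(uint8_t apic_id)
{
    return (apic_id >> 2) & 0x3u;
}
```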

  • Page 292

    7-24 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.7 DETECTING HARDWARE MULTI-THREADING SUPPORT AND TOPOLOGY Use the CPUID instruction to detect the presence of hardware multi-threading support in a physical processor. The following can be interpreted: • Hardware Multi-Threading feature flag (CPUID.1:EDX[28] = 1): Indicates when set that[...]

  • Page 293

    Vol. 3A 7-25 MULTIPLE-PROCESSOR MANAGEMENT 7.7.2 Initializing Dual-Core IA-32 Processors The initialization process for an MP system that contains dual-core IA-32 processors is the same as for conventional MP systems (see Section 7.5, “Multiple-Processor (MP) Initialization”). A logical processor in one core is selected as the BSP; other logi [...]

  • Page 294

    7-26 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.8 INTEL® HYPER-THREADING TECHNOLOGY ARCHITECTURE Figure 7-4 shows a generalized view of an IA-32 processor supporting Hyper-Threading Technology, using the Intel Xeon processor MP as an example. This implementation of the Hyper-Threading Technology consists of two logical processors (each r[...]

  • Page 295

    Vol. 3A 7-27 MULTIPLE-PROCESSOR MANAGEMENT 7.8.1 State of the Logical Processors The following features are part of the architectural state of logical processors within IA-32 processors supporting Hyper-Threading Technology. The features can be subdivided into three groups: • Duplicated for each logical processor • Shared by logical proce[...]

  • Page 296

    7-28 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • Debug registers (DR0, DR1, DR2, DR3, DR6, DR7) and the debug control MSRs • Machine check global status (IA32_MCG_STATUS) and machine check capability (IA32_MCG_CAP) MSRs • Thermal clock modulation and ACPI Power management control MSRs • Time stamp counter MSRs • Most of the other MSR reg[...]

  • Page 297

    Vol. 3A 7-29 MULTIPLE-PROCESSOR MANAGEMENT of memory, independent of the processor on which it is running. See Section 10.11, “Memory Type Range Registers (MTRRs),” for information on setting up MTRRs. 7.8.4 Page Attribute Table (PAT) Each logical processor has its own PAT MSR (IA32_CR_PAT). However, as described in Section 10.[...]

  • Page 298

    7-30 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT The performance counter interrupts, events, and precise event monitoring support can be set up and allocated on a per thread (per logical processor) basis. See Section 18.14, “Performance Monitoring and Hyper-Threading Technology,” for a discussion of performance monitoring in the Intel Xeon [...]

  • Page 299

    Vol. 3A 7-31 MULTIPLE-PROCESSOR MANAGEMENT 7.8.12 Self Modifying Code IA-32 processors supporting Hyper-Threading Technology support self-modifying code, where data writes modify instructions cached or currently in flight. They also support cross-modifying code, where on an MP system writes generated by one processor modify instructions cached[...]

  • Page 300

    7-32 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Entries in the TLBs are tagged with an ID that indicates the logical processor that initiated the translation. This tag applies even for translations that are marked global using the page global feature for memory paging. When a logical processor performs a TLB invalidation operation, only the TLB[...]

  • Page 301

    Vol. 3A 7-33 MULTIPLE-PROCESSOR MANAGEMENT vector tables for one or both of the logical processors. Typically in MP systems, the LINT0 and LINT1 pins are not used to deliver interrupts to the logical processors. Instead all interrupts are delivered to the local processors through the I/O APIC. • A20M# pin: On an IA-32 processor, the A20M# p[...]

  • Page 302

    7-34 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.9.2 Memory Type Range Registers (MTRR) MTRR is shared between two logical processors sharing a processor core if the physical processor supports Hyper-Threading Technology. MTRR is not shared between logical processors located in different cores or different physical packages. IA-32 architecture[...]

  • Page 303

    Vol. 3A 7-35 MULTIPLE-PROCESSOR MANAGEMENT 7.10 PROGRAMMING CONSIDERATIONS FOR HARDWARE MULTI-THREADING CAPABLE PROCESSORS In a multi-threading environment, there may be certain hardware resources that are physically shared at some level of the hardware topology. In the multi-processor systems, typically bus and memory sub-systems are physic[...]

  • Page 304

    7-36 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT The values of valid APIC_IDs need not be contiguous across package or core boundaries. 7.10.2 Identifying Logical Processors in an MP System For any IA-32 processor, system hardware establishes an initial APIC ID that is unique for each logical processor following power-up or RESET (see S[...]

  • Page 305

    Vol. 3A 7-37 MULTIPLE-PROCESSOR MANAGEMENT Table 7-2 shows the initial APIC IDs for a hypothetical situation with a dual processor system, with each physical package providing two processor cores and each processor core also supporting Hyper-Threading Technology. Table 7-1. Initial APIC IDs for the Logical Processors in a System that has Four MP-[...]

  • Page 306

    7-38 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.10.3 Algorithm for Three-Level Mappings of APIC_ID Software can gather the initial APIC_IDs for each logical processor supported by the operating system at runtime (footnote 4) and extract identifiers corresponding to the three levels of sharing topology (package, core, and SMT). The algorithms below focus on a n[...]

  • Page 307

    Vol. 3A 7-39 MULTIPLE-PROCESSOR MANAGEMENT unsigned int HWMTSupported(void) { try { // verify cpuid instruction is supported execute cpuid with eax = 0 to get vendor string execute cpuid with eax = 1 to get feature flag and signature } except (EXCEPTION_EXECUTE_HANDLER) { return 0; // CPUID is not supported; So HW Multi-threading capability i[...]

  • Page 308

    7-40 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT store returned value of eax return (unsigned) ((reg_eax >> 26) + 1); } else // must be a single-core processor return 1; } 4. Extract the initial APIC ID of a logical processor. #define INITIAL_APIC_ID_BITS 0xFF000000 // EBX[31:24] initial APIC ID // Returns the 8-bit unique initial APIC ID for the[...]
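
    The two bit-field extractions in the listing above can be tried on plain register values, without executing CPUID. The following C sketch makes that assumption explicit: the caller supplies the EBX value from CPUID with EAX = 1, and the EAX value that the source shifts right by 26. Both function names are hypothetical.

```c
#include <stdint.h>

#define INITIAL_APIC_ID_BITS 0xFF000000u  /* EBX[31:24]: initial APIC ID */

/* Extracts the 8-bit initial APIC ID from an EBX value returned by
   CPUID with EAX = 1. (Hypothetical helper name.) */
static uint8_t initial_apic_id_from_ebx(uint32_t ebx)
{
    return (uint8_t)((ebx & INITIAL_APIC_ID_BITS) >> 24);
}

/* Applies the source's (reg_eax >> 26) + 1 computation to a caller-supplied
   EAX value to obtain the number of cores per package.
   (Hypothetical helper name.) */
static unsigned cores_per_package_from_eax(uint32_t reg_eax)
{
    return (unsigned)((reg_eax >> 26) + 1u);
}
```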

  • Page 309

    Vol. 3A 7-41 MULTIPLE-PROCESSOR MANAGEMENT 6. Extract a sub ID given a full ID, maximum sub ID value and shift count. // Returns the value of the sub ID, this is not a zero-based value unsigned char GetSubID(unsigned char Full_ID, unsigned char MaxSubIDValue, unsigned char Shift_Count) { MaskWidth = FindMaskWidth(MaxSubIDValue); MaskBits = ((uch[...]
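
    Because the listing above is cut off, the following C sketch is a reconstruction under stated assumptions rather than the manual's code: it assumes the mask width is the number of bits needed to encode the maximum sub ID count, and that the sub ID is returned in place (not shifted down to bit 0), as the comment in the listing says. The names find_mask_width and get_sub_id are hypothetical.

```c
#include <stdint.h>

/* Number of bits needed to encode max_count distinct values
   (assumed behavior; returns 0 when max_count is 0 or 1). */
static unsigned find_mask_width(unsigned max_count)
{
    unsigned width = 0;
    while ((1u << width) < max_count)
        width++;
    return width;
}

/* Keeps only the sub-ID bits of full_id, starting at shift_count.
   The result is left in place, so it is not a zero-based value. */
static uint8_t get_sub_id(uint8_t full_id, unsigned max_sub_id_value,
                          unsigned shift_count)
{
    unsigned width = find_mask_width(max_sub_id_value);
    unsigned mask = (0xFFu << shift_count) ^ (0xFFu << (shift_count + width));
    return (uint8_t)(full_id & mask & 0xFFu);
}
```

    For example, with two logical processors per core (width 1, shift 0) and two cores per package (width 1, shift 1), an APIC ID of 07H yields an SMT sub ID of 01H and a core sub ID of 02H.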

  • Page 310

    7-42 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT CORE_ID, assuming the number of physical packages in each node of a clustered system is symmetric. • Assemble the three-level identifiers of SMT_ID, CORE_ID, PACKAGE_IDs into arrays for each enabled logical processor. This is shown in Example 7-3a. • To detect the number of physical packages: use[...]

  • Page 311

    Vol. 3A 7-43 MULTIPLE-PROCESSOR MANAGEMENT Example 7-3 Compute the Number of Packages, Cores, and Processor Relationships in a MP System a) Assemble lists of PACKAGE_ID, CORE_ID, and SMT_ID of each enabled logical processor //The BIOS and/or OS may limit the number of logical processors available to applications // after system boot. The bel[...]

  • Page 312

    7-44 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT The algorithm below assumes there is symmetry across package boundary if more than one socket is populated in an MP system. // Bucket Package IDs and compute processor mask for every package. PackageNum = 1; PackageIDBucket[0] = PackageID[0]; ProcessorMask = 1; PackageProcessorMask[0] = ProcessorMask; For[...]

  • Page 313

    Vol. 3A 7-45 MULTIPLE-PROCESSOR MANAGEMENT If ((PackageID[ProcessorNum] | CoreID[ProcessorNum]) == CoreIDBucket[i]) { CoreProcessorMask[i] |= ProcessorMask; Break; // found in existing bucket, skip to next iteration } } if (i == CoreNum) { //Did not match any bucket, start new bucket CoreIDBucket[i] = PackageID[ProcessorNum] | CoreID[ProcessorNum][...]
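
    The bucketing loops above follow one pattern: compare each logical processor's ID against the buckets found so far, OR its bit into the matching mask, and open a new bucket when nothing matches. A minimal C sketch of that pattern for package IDs follows. The function name, the 64-processor limit, and the output-array convention are assumptions made for this illustration, not details from the manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Groups logical processors by package ID and returns the number of
   distinct packages. bucket_out[i] receives the ID of bucket i and
   mask_out[i] one bit per logical processor in that bucket. Both output
   arrays must hold at least n entries; at most 64 processors are handled. */
static size_t bucket_package_ids(const uint8_t *package_id, size_t n,
                                 uint8_t *bucket_out, uint64_t *mask_out)
{
    size_t count = 0;
    for (size_t p = 0; p < n && p < 64; p++) {
        uint64_t bit = (uint64_t)1 << p;
        size_t i;
        for (i = 0; i < count; i++) {
            if (bucket_out[i] == package_id[p]) {
                mask_out[i] |= bit;   /* found in an existing bucket */
                break;
            }
        }
        if (i == count) {             /* no match: start a new bucket */
            bucket_out[count] = package_id[p];
            mask_out[count] = bit;
            count++;
        }
    }
    return count;
}
```

    The same routine could serve for cores by passing the combined PackageID | CoreID value for each logical processor, as the listing does.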

  • Page 314

    7-46 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.11.2 PAUSE Instruction The PAUSE instruction improves the performance of IA-32 processors supporting Hyper-Threading Technology when executing “spin-wait loops” and other routines where one thread is accessing a shared lock or semaphore in a tight polling loop. When executing a spin-wait [...]

  • Page 315

    Vol. 3A 7-47 MULTIPLE-PROCESSOR MANAGEMENT 7.11.4 MONITOR/MWAIT Instruction Operating systems usually implement idle loops to handle thread synchronization. In a typical idle-loop scenario, there could be several “busy loops” and they would use a set of memory locations. An impacted processor waits in a loop and polls a memory location[...]

  • Page 316

    7-48 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared. Software should not allow for voluntary context switches in between MONITOR/MW[...]

  • Page 317

    Vol. 3A 7-49 MULTIPLE-PROCESSOR MANAGEMENT The two values above bear no relationship to cache line size in the system and software should not make any assumptions to that effect. Within a single-cluster system, the two parameters should default to be the same (the size of the monitor triggering area is the same as the system coherence line[...]

  • Page 318

    7-50 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT PAUSE ; Short delay JMP Spin_Lock Get_Lock: MOV EAX, 1 XCHG EAX, lockvar ; Try to get lock CMP EAX, 0 ; Test if successful JNE Spin_Lock Critical_Section: <critical section code> MOV lockvar, 0 ... Continue: The spin-wait loop above uses a “test, test-and-set” technique for determining the ava[...]
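
    For readers working in C rather than assembly, the “test, test-and-set” loop above can be sketched with C11 atomics. This is an illustrative translation, not the manual's code: cpu_relax is a hypothetical, empty stand-in for the PAUSE instruction (on x86 compilers the _mm_pause intrinsic from <immintrin.h> could be used there instead), and the memory orderings are a design choice of this sketch.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the PAUSE instruction; intentionally empty
   so the sketch stays portable. */
static void cpu_relax(void) { }

static void spin_lock(atomic_int *lockvar)
{
    for (;;) {
        /* Test: spin on plain reads until the lock looks free. */
        while (atomic_load_explicit(lockvar, memory_order_relaxed) != 0)
            cpu_relax();
        /* Test-and-set: atomic exchange, like XCHG EAX, lockvar. */
        if (atomic_exchange_explicit(lockvar, 1, memory_order_acquire) == 0)
            return;
    }
}

/* Releases the lock, like MOV lockvar, 0. */
static void spin_unlock(atomic_int *lockvar)
{
    atomic_store_explicit(lockvar, 0, memory_order_release);
}
```

    Spinning on reads first keeps the waiting processor from issuing locked exchanges while the lock is held, which is the point of the technique.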

  • Page 319

    Vol. 3A 7-51 MULTIPLE-PROCESSOR MANAGEMENT The MONITOR and MWAIT instructions may be considered for use in the C0 idle state loops, if MONITOR and MWAIT are supported. Example 7-6 An OS Idle Loop with MONITOR/MWAIT in the C0 Idle Loop // WorkQueue is a memory location indicating there is a thread // ready to run. A non-zero value for WorkQueu[...]

  • Page 320

    7-52 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT other logical processors in the physical package. For this reason, halting idle logical processors optimizes the performance (footnote 5). If all logical processors within a physical package are halted, the processor will enter a power-saving state. 7.11.6.4 Potential Usage of MONITOR/MWAIT in C1 Idle Loops An o[...]

  • Page 321

    Vol. 3A 7-53 MULTIPLE-PROCESSOR MANAGEMENT 7.11.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources Because the logical processors share execution resources, the order in which threads are dispatched to logical processors for execution can affect the overall efficiency of a system. The following guidelines are recommended for schedu[...]

  • Page 322

    7-54 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT[...]

  • Page 323

    8 Advanced Programmable Interrupt Controller (APIC)[...]

  • Page 324

    [...]

  • Page 325

    Vol. 3A 8-1 CHAPTER 8 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the IA-32 processors with the Pentium processor (see Section 17.26, “Advanced Programmable Interrupt Controller (APIC)”) and is included[...]

  • Page 326

    8-2 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Local APICs can receive interrupts from the following sources: • Locally connected I/O devices: These interrupts originate as an edge or level asserted by an I/O device that is connected directly to the processor’s local interrupt pins (LINT0 and LINT1). The I/O devices may al[...]

  • Page 327

    Vol. 3A 8-3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Xeon processors) or on the APIC bus (for Pentium and P6 family processors). See Section 8.2, “System Bus Vs. APIC Bus.” IPIs can be sent to other IA-32 processors in the system or to the originating processor (self-interrupts). When the target processor receives an IPI message, i[...]

  • Page 328

    8-4 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) processors through the local interrupt pins; however, this mechanism is commonly not used in MP systems. Figure 8-2. Local APICs and I/O APIC When Intel Xeon Processors Are Used in Multiple-Processor Systems Figure 8-3. Local APICs and I/O APIC When P6 Family Processors Are Used[...]

  • Page 329

    Vol. 3A 8-5 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IPI mechanism is typically used in MP systems to send fixed interrupts (interrupts for a specific vector number) and special-purpose interrupts to processors on the system bus. For example, a local APIC can use an IPI to forward a fixed interrupt to another processor for servicin[...]

  • Page 330

    8-6 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.1 The Local APIC Block Diagram Figure 8-4 gives a functional block diagram for the local APIC. Software interacts with the local APIC by reading and writing its registers. APIC registers are memory-mapped to a 4-KByte region of the processor’s physical address space with an init[...]

  • Page 331

    Vol. 3A 8-7 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Figure 8-4. Local APIC Structure (figure: block diagram; labeled blocks: Current Count Register, Initial Count Register, Divide Configuration Register, Version Register, Error Status Register, In-Service Register (ISR), Vector Decode, Interrupt Command Register (ICR), Acceptance Logic, Vec[3:0] & TMR Bit, Register Select[...]

  • Page 332

    8-8 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 8-1 shows how the APIC registers are mapped into the 4-KByte APIC register space. Registers are 32 bits, 64 bits, or 256 bits in width; all are aligned on 128-bit boundaries. All 32-bit registers should be accessed using 128-bit aligned 32-bit loads or stores. Some processors [...]

  • Page 333

    Vol. 3A 8-9 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.2 Presence of the Local APIC Beginning with the P6 family processors, the presence or absence of an on-chip local APIC can be detected using the CPUID instruction. When the CPUID instruction is executed with a source operand of 1 in the EAX register, bit 9 of the CPUID feature [...]

  • Page 334

    8-10 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.3 Enabling or Disabling the Local APIC The local APIC can be enabled or disabled in either of two ways: 1. Using the APIC global enable/disable flag in the IA32_APIC_BASE MSR (MSR address 1BH; see Figure 8-5): - When IA32_APIC_BASE[11] is 0, the processor is functionally equi[...]

  • Page 335

    Vol. 3A 8-11 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.4 Local APIC Status and Location The status and location of the local APIC are contained in the IA32_APIC_BASE MSR (see Figure 8-5). MSR bit functions are described below: • BSP flag, bit 8: Indicates if the processor is the bootstrap processor (BSP). See Section 7.5, “[...]
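
    The two IA32_APIC_BASE flags named in this part of the manual (the BSP flag in bit 8 and the global enable/disable flag in bit 11) can be tested on a raw MSR value as in the C sketch below. The function names are hypothetical, and reading the MSR itself (RDMSR at address 1BH, a privileged operation) is outside the scope of the sketch.

```c
#include <stdint.h>

/* Bit 8 of IA32_APIC_BASE: set when the processor is the bootstrap
   processor (BSP). (Hypothetical helper name.) */
static int apic_base_is_bsp(uint64_t msr)
{
    return (int)((msr >> 8) & 1u);
}

/* Bit 11 of IA32_APIC_BASE: APIC global enable/disable flag.
   (Hypothetical helper name.) */
static int apic_base_is_enabled(uint64_t msr)
{
    return (int)((msr >> 11) & 1u);
}
```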

  • Page 336

    8-12 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.6 Local APIC ID At power up, system hardware assigns a unique APIC ID to each local APIC on the system bus (for Pentium 4 and Intel Xeon processors) or on the APIC bus (for P6 family and Pentium processors). The hardware assigned APIC ID is based on system topology and includes enco[...]

  • Page 337

    Vol. 3A 8-13 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.7.1 Local APIC State After Power-Up or Reset Following a power-up or RESET of the processor, the state of the local APIC and its registers is as follows: • The following registers are reset to all 0s: • IRR, ISR, TMR, ICR, LDR, and TPR • Timer initial count and timer current c[...]

  • Page 338

    8-14 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.7.3 Local APIC State After an INIT Reset (“Wait-for-SIPI” State) An INIT reset of the processor can be initiated in either of two ways: • By asserting the processor’s INIT# pin. • By sending the processor an INIT IPI (an IPI with the delivery mode set to INIT). Upon [...]

  • Page 339

    Vol. 3A 8-15 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.5 HANDLING LOCAL INTERRUPTS The following sections describe facilities that are provided in the local APIC for handling local interrupts. These include: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring counters, the thermal sensor, and the in[...]

  • Page 340

    8-16 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) monitor register and its associated interrupt were introduced in the Pentium 4 and Intel Xeon processors. As shown in Figure 8-8, some of these fields and flags are not available (and reserved) for some entries. Figure 8-8. Local Vector Table (LVT) 31 0 7 Vector Timer Mo[...]

  • Page 341

    Vol. 3A 8-17 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The setup information that can be specified in the registers of the LVT is as follows: Vector: Interrupt vector number. Delivery Mode: Specifies the type of interrupt to be sent to the processor. Some delivery modes will only operate as intended when used in conjunction wi[...]

  • Page 342

    8-18 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Remote IRR Flag (Read Only): For fixed mode, level-triggered interrupts; this flag is set when the local APIC accepts the interrupt for servicing and is reset when an EOI command is received from the processor. The meaning of this flag is undefined for edge-triggered interrupts and o[...]

  • Page 343

    Vol. 3A 8-19 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.5.3 Error Handling The local APIC provides an error status register (ESR) that it uses to record errors that it detects when handling interrupts (see Figure 8-9). An APIC error interrupt is generated when the local APIC sets one of the error bits in the ESR. The LVT error register a[...]

  • Page 344

    8-20 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ESR is a write/read register. A write (of any value) to the ESR must be done just prior to reading the ESR to update the register. This initial write causes the ESR contents to be updated with the latest error status. Back-to-back writes clear the ESR register. After an error [...]

  • Page 345

    Vol. 3A 8-21 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The time base for the timer is derived from the processor’s bus clock, divided by the value specified in the divide configuration register. The timer can be configured through the timer LVT entry for one-shot or periodic operation. In one-shot mode, the timer is started by [...]
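
    The time-base relationship above (bus clock divided by the divide-configuration value) translates directly into arithmetic for choosing a count. The C sketch below is illustrative only: the function names, the 100 MHz bus clock in the usage note, and the microsecond interval parameter are assumptions, and the caller must pass a non-zero divide value.

```c
#include <stdint.h>

/* Timer tick rate in Hz: the bus clock divided by the value selected in
   the divide configuration register. divide_value must be non-zero.
   (Hypothetical helper name.) */
static uint64_t apic_timer_hz(uint64_t bus_clock_hz, unsigned divide_value)
{
    return bus_clock_hz / divide_value;
}

/* Number of timer ticks that elapse in interval_us microseconds, which is
   the count a one-shot or periodic interval of that length would need.
   (Hypothetical helper name.) */
static uint64_t apic_timer_ticks(uint64_t bus_clock_hz, unsigned divide_value,
                                 uint64_t interval_us)
{
    return apic_timer_hz(bus_clock_hz, divide_value) * interval_us / 1000000u;
}
```

    For an assumed 100 MHz bus clock and a divide value of 16, the timer ticks at 6.25 MHz, so a 1 ms interval corresponds to 6250 ticks.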

  • Página 346

    8-22 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.5.5 Local Interrupt Acceptance When a local interrupt is sent to the processor core, it is subject to the acceptance criteria specified in the interrupt acceptance flow chart in Figure 8-17. If the interrupt is accepted, it is logged into the IRR register and handled by the proc[...]

  • Página 347

    Vol. 3A 8-23 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ICR consists of the following fields. Vector: The vector number of the interrupt being sent. Delivery Mode: Specifies the type of IPI to be sent. This field is also known as the IPI message type field. 000 (Fixed): Delivers the interrupt specified in the vector field to the target [...]

  • Página 348

    8-24 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) send a lowest priority IPI is model specific and should be avoided by BIOS and operating system software. 010 (SMI): Delivers an SMI interrupt to the target processor or processors. The vector field must be programmed to 00H for future compatibility. 011 (Reserved) 100 (NMI): D[...]

  • Página 349

    Vol. 3A 8-25 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Destination Mode: Selects either physical (0) or logical (1) destination mode (see Section 8.6.2, “Determining IPI Destination”). Delivery Status (Read Only): Indicates the IPI delivery status, as follows: 0 (Idle): There is currently no IPI activity for this local APIC, or the pre[...]

  • Página 350

    8-26 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) sors and to FFH for Pentium 4 and Intel Xeon processors. 11: (All Excluding Self) The IPI is sent to all processors in a system with the exception of the processor sending the IPI. The APIC broadcasts a message with the physical destination mode and destination field set to 0xF[...]
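The ICR fields described on the preceding pages can be illustrated by composing the low doubleword of the register. This is a sketch under stated assumptions: the helper name and dictionaries are hypothetical, and the bit positions (vector 7:0, delivery mode 10:8, destination mode 11, level 14, trigger mode 15, destination shorthand 19:18) are taken from the ICR layout as commonly documented and should be checked against the ICR figure in the manual.

```python
# Delivery mode encodings (3-bit field); 011 is reserved and omitted.
DELIVERY_MODES = {
    "fixed": 0b000,
    "lowest_priority": 0b001,
    "smi": 0b010,
    "nmi": 0b100,
    "init": 0b101,
    "startup": 0b110,
}

# Destination shorthand encodings (2-bit field).
SHORTHANDS = {
    "none": 0b00,
    "self": 0b01,
    "all_including_self": 0b10,
    "all_excluding_self": 0b11,
}


def encode_icr_low(vector, delivery_mode, logical_dest=False,
                   level_assert=True, level_triggered=False,
                   shorthand="none"):
    """Hypothetical helper: build the low 32 bits of an ICR value."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in 8 bits")
    value = vector
    value |= DELIVERY_MODES[delivery_mode] << 8
    value |= int(logical_dest) << 11      # 0 = physical, 1 = logical
    value |= int(level_assert) << 14      # 1 = assert
    value |= int(level_triggered) << 15   # 0 = edge, 1 = level
    value |= SHORTHANDS[shorthand] << 18
    return value


# Fixed IPI, vector 40H, broadcast to all processors except the sender.
icr_low = encode_icr_low(0x40, "fixed", shorthand="all_excluding_self")
```

The delivery status bit (bit 12) is read-only and is deliberately left out of the encoder.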

  • Página 351

    Vol. 3A 8-27 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)
    All Excluding Self | Valid | Edge | Fixed, Lowest Priority (1, 4), NMI, INIT, SMI, Start-Up | X
    All Excluding Self | Invalid (2) | Level | Fixed, Lowest Priority (4), NMI, INIT, SMI, Start-Up | X
    NOTES: 1. The ability of a processor to send a lowest priority IPI is model specific. 2. For these[...]

  • Página 352

    8-28 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.6.2 Determining IPI Destination The destination of an IPI can be one, all, or a subset (group) of the processors on the system bus. The sender of the IPI specifies the destination of an IPI with the following APIC registers and fields within the registers: • ICR Register — The [...]

  • Página 353

    Vol. 3A 8-29 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) NOTE The number of local APICs that can be addressed on the system bus may be restricted by hardware. 8.6.2.2 Logical Destination Mode In logical destination mode, IPI destination is specified using an 8-bit message destination address (MDA), which is entered in the destination field [...]

  • Página 354

    8-30 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The interpretation of MDA for the two models is described in the following paragraphs. 1. Flat Model — This model is selected by programming DFR bits 28 through 31 to 1111. Here, a unique logical APIC ID can be established for up to 8 local APICs by se[...]
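The flat model can be sketched as a bitmask match. This is an illustration under assumptions, not the manual's text: the function names are hypothetical, and it assumes each local APIC owns one bit of an 8-bit logical APIC ID and that a local APIC accepts a message when the bitwise AND of the MDA and its logical APIC ID is non-zero.

```python
def flat_logical_id(index):
    """Logical APIC ID for the index-th local APIC (0..7): one bit each."""
    if not 0 <= index < 8:
        raise ValueError("flat model supports at most 8 unique logical IDs")
    return 1 << index


def flat_model_accepts(mda, logical_apic_id):
    """True if a local APIC with this logical ID is targeted by the MDA."""
    return (mda & logical_apic_id & 0xFF) != 0


# An MDA of 0000_0101B targets the APICs that own bit 0 and bit 2.
mda = 0b0000_0101
targets = [i for i in range(8) if flat_model_accepts(mda, flat_logical_id(i))]
```

Under the same assumption, an MDA of FFH would select every local APIC, which gives a broadcast.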

  • Página 355

    Vol. 3A 8-31 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.6.2.3 Broadcast/Self Delivery Mode The destination shorthand field of the ICR allows the delivery mode to be by-passed in favor of broadcasting the IPI to all the processors on the system bus and/or back to itself (see Section 8.6.1, “Interrupt Command Register (ICR)”). Three d[...]

  • Página 356

    8-32 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Here, the TPR value is the task priority value in the TPR (see Figure 8-18), the IRRV value is the vector number for the highest priority bit that is set in the IRR (see Figure 8-20) or 00H (if no IRR bit is set), and the ISRV value is the vector number for the highest priority bit[...]

  • Página 357

    Vol. 3A 8-33 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Section 8.10, “APIC Bus Message Passing Mechanism and Protocol (P6 Family, Pentium Processors),” describes the APIC bus arbitration protocols and bus message formats, while Section 8.6.1, “Interrupt Command Register (ICR),” describes the INIT level de-assert IPI message. Not[...]

  • Página 358

    8-34 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 4. When interrupts are pending in the IRR and ISR register, the local APIC dispatches them to the processor one at a time, based on their priority and the current task and processor priorities in the TPR and PPR (see Section 8.8.3.1, “Task and Processor Priorities”). 5. When a [...]

  • Página 359

    Vol. 3A 8-35 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 1. (IPIs only) It examines the IPI message to determine if it is the specified destination for the IPI as described in Section 8.6.2, “Determining IPI Destination.” If it is the specified destination, it continues its acceptance procedure; if it is not the destination, it dis[...]

  • Página 360

    8-36 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC looks for an open slot in one of its two pending interrupt queues contained in the IRR and ISR registers (se[...]

  • Página 361

    Vol. 3A 8-37 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.8.3.1 Task and Processor Priorities The local APIC also defines a task priority and a processor priority that it uses in determining the order in which interrupts should be handled. The task priority is a software selected value between 0 and 15 (see Figure 8-18) that is written i[...]

  • Página 362

    8-38 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Its value in the PPR is computed as follows:
    IF TPR[7:4] ≥ ISRV[7:4]
        THEN PPR[7:0] ← TPR[7:0]
        ELSE PPR[7:4] ← ISRV[7:4]
             PPR[3:0] ← 0
    Here, the ISRV value is the vector number of the highest priority ISR bit that is set, or 00H if no ISR bit is set. Essentially, the processo[...]
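The PPR rule above translates directly into a few lines of code. The sketch below is illustrative; the function name is hypothetical, and `tpr` and `isrv` are plain 8-bit integers standing in for the TPR contents and the ISRV value.

```python
def processor_priority(tpr, isrv):
    """Compute PPR[7:0] from TPR[7:0] and ISRV[7:0] per the rule above."""
    tpr &= 0xFF
    isrv &= 0xFF
    if (tpr >> 4) >= (isrv >> 4):
        # Task priority class dominates: PPR takes the whole TPR value.
        return tpr
    # In-service interrupt class dominates: upper nibble from ISRV,
    # lower nibble forced to zero.
    return isrv & 0xF0


ppr_task_wins = processor_priority(0x35, 0x20)
ppr_isr_wins = processor_priority(0x20, 0x47)
```

With TPR = 35H and ISRV = 20H the task priority class (3) is at least the in-service class (2), so the PPR equals the TPR; with TPR = 20H and ISRV = 47H the in-service class (4) wins and the sub-class bits are cleared.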

  • Página 363

    Vol. 3A 8-39 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IRR contains the active interrupt requests that have been accepted, but not yet dispatched to the processor for servicing. When the local APIC accepts an interrupt, it sets the bit in the IRR that corresponds to the vector of the accepted interrupt. When the processor core is ready[...]

  • Página 364

    8-40 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.8.5 Signaling Interrupt Servicing Completion For all interrupts except those delivered with the NMI, SMI, INIT, ExtINT, the start-up, or INIT-Deassert delivery mode, the interrupt handler must include a write to the end-of-interrupt (EOI) register (see Figure 8-21). This write m[...]

  • Página 365

    Vol. 3A 8-41 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) the TPR. The IC, however, is considered implementation-dependent with the underlying priority mechanisms subject to change. The CR8, by contrast, is part of the Intel EM64T architecture. Software can depend on this definition remaining unchanged. Figure 8-22 shows the layout of C[...]

  • Página 366

    8-42 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The vector number for the spurious-interrupt vector is specified in the spurious-interrupt vector register (see Figure 8-23). The functions of the fields in this register are as follows: Spurious Vector: Determines the vector number to be delivered to the processor when the local [...]

  • Página 367

    Vol. 3A 8-43 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.10 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) The Pentium 4 and Intel Xeon processors pass messages among the local and I/O APICs on the system bus, using the system bus message passing mechanism and protocol. The P6 family and Pentium processors[...]

  • Página 368

    8-44 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination and message during device configuration, allocating one or more non-shared messages to each MSI capable function.” The capabilities mechanism provided by the PCI Local Bus Specification is used to identify and configure MSI capable PCI devices. Among other fields, this [...]

  • Página 369

    Vol. 3A 8-45 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • When RH is 1 and the logical destination mode is active in a system using a flat addressing model, the Destination ID field must be set so that bits set to 1 identify processors that are present and enabled to receive the interrupt. • If RH is set to 1 and the logical destination[...]

  • Página 370

    8-46 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Reserved fields are not assumed to be any value. Software must preserve their contents on writes. Other fields in the Message Data Register are described below. 1. Vector — This 8-bit field contains the interrupt vector associated with the message. Values range from 010H to 0FEH [...]

  • Página 371

    Vol. 3A 8-47 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) d. 100B (NMI) — Deliver the signal to all the agents listed in the destination field. The vector information is ignored. NMI is an edge triggered interrupt regardless of the Trigger Mode Setting. e. 101B (INIT) — Deliver this signal to all the agents listed in the destination fi[...]

  • Página 372

    8-48 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)[...]

  • Página 373

    9 Processor Management and Initialization[...]

  • Página 374

    [...]

  • Página 375

    Vol. 3A 9-1 CHAPTER 9 PROCESSOR MANAGEMENT AND INITIALIZATION This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, x87 FPU initialization, processor configuration, feature determination, mode switching, the MSRs (in th [...]

  • Página 376

    9-2 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic. At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enable those processors to execute self-configu[...]

  • Página 377

    Vol. 3A 9-3 PROCESSOR MANAGEMENT AND INITIALIZATION
    Table 9-1. IA-32 Processor States Following Power-up, Reset, or INIT
    Register | Pentium 4 and Intel Xeon Processor | P6 Family Processor | Pentium Processor
    EFLAGS (1) | 00000002H | 00000002H | 00000002H
    EIP | 0000FFF0H | 0000FFF0H | 0000FFF0H
    CR0 | 60000010H (2) | 60000010H (2) | 60000010H (2)
    CR2, CR3, CR4 | 00000000H | 000000[...]

  • Página 378

    9-4 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION
    MXCSR | Pwr up or Reset: 1F80H; INIT: Unchanged | Pentium III processor only - Pwr up or Reset: 1F80H; INIT: Unchanged | NA
    GDTR, IDTR | Base = 00000000H, Limit = FFFFH, AR = Present, R/W | Base = 00000000H, Limit = FFFFH, AR = Present, R/W | Base = 00000000H, Limit = FFFFH, AR = Present, R/W
    LDTR, Task Register [...]

  • Página 379

    Vol. 3A 9-5 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.3 Model and Stepping Information Following a hardware reset, the EDX register contains component identification and revision information (see Figure 9-2). For example, the model, family, and processor type returned for the first processor in the Intel Pentium 4 family i[...]
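The identification value left in EDX after reset can be unpacked into its fields with shifts and masks. This is a hedged sketch: the function name is hypothetical, the field positions (stepping 3:0, model 7:4, family 11:8, type 13:12, extended model 19:16, extended family 27:20) follow the commonly documented layout and should be checked against Figure 9-2, and the sample value uses an arbitrary stepping of 0.

```python
def decode_signature(edx):
    """Split a processor signature value into its bit fields."""
    return {
        "stepping": edx & 0xF,
        "model": (edx >> 4) & 0xF,
        "family": (edx >> 8) & 0xF,
        "type": (edx >> 12) & 0x3,
        "extended_model": (edx >> 16) & 0xF,
        "extended_family": (edx >> 20) & 0xFF,
    }


# Illustrative value: family 1111B, model 0000B, type 00B, stepping 0.
fields = decode_signature(0x00000F00)
```

The same layout is returned in EAX by CPUID with an input value of 1, so one decoder can serve both cases.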

  • Página 380

    9-6 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.4 First Instruction Executed The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing the software-initialization code must be loc[...]
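The reset address can be checked with simple arithmetic. The sketch below assumes the hidden CS base after reset is FFFF0000H (a value not shown in this excerpt) and combines it with the EIP reset value 0000FFF0H from Table 9-1; the constant names are illustrative.

```python
CS_BASE_AT_RESET = 0xFFFF0000   # assumed hidden base of CS after reset
EIP_AT_RESET = 0x0000FFF0       # EIP value listed in Table 9-1
ADDRESS_SPACE = 1 << 32         # 4 GBytes of 32-bit physical address space


def reset_vector():
    """Physical address of the first instruction fetched after reset."""
    return (CS_BASE_AT_RESET + EIP_AT_RESET) % ADDRESS_SPACE


first_fetch = reset_vector()
bytes_below_top = ADDRESS_SPACE - first_fetch
```

Only 16 bytes remain between the reset vector and the top of the address space, so the code placed there typically holds little more than a jump into the rest of the EPROM.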

  • Página 381

    Vol. 3A 9-7 PROCESSOR MANAGEMENT AND INITIALIZATION The EM flag determines whether floating-point instructions are executed by the x87 FPU (EM is cleared) or a device-not-available exception (#NM) is generated for all floating-point instructions so that an exception handler can emulate the floating-point operation (EM = 1). Ordinarily, the[...]

  • Página 382

    9-8 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION To emulate floating-point instructions, the EM, MP, and NE flags in control register CR0 should be set as shown in Table 9-3. Regardless of the value of the EM bit, the Intel486 SX processor generates a device-not-available exception (#NM) upon encountering any floating-point instruction. [...]

  • Página 383

    Vol. 3A 9-9 PROCESSOR MANAGEMENT AND INITIALIZATION 9.4 MODEL-SPECIFIC REGISTERS (MSRS) The Pentium 4, Intel Xeon, P6 family, and Pentium processors contain model-specific registers (MSRs). These registers are by definition implementation specific; that is, they are not guaranteed to be supported on future IA-32 processors and/or to have th[...]

  • Página 384

    9-10 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.6 INITIALIZING SSE/SSE2/SSE3 EXTENSIONS For processors that contain SSE/SSE2/SSE3 extensions, steps must be taken when initializing the processor to allow execution of these instructions. 1. Check the CPUID feature flags for the presence of the SSE/SSE2/SSE3 extensions (respectively: EDX b [...]
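Step 1 amounts to testing three feature bits. The sketch below is a minimal illustration, assuming the CPUID function 1 feature flags have already been read into two integers; the function name is hypothetical, and the bit positions used (EDX bit 25 for SSE, EDX bit 26 for SSE2, ECX bit 0 for SSE3) are the commonly documented ones, since the excerpt is cut off before it lists them.

```python
SSE_BIT_EDX = 25    # assumed: CPUID.1:EDX bit 25 reports SSE
SSE2_BIT_EDX = 26   # assumed: CPUID.1:EDX bit 26 reports SSE2
SSE3_BIT_ECX = 0    # assumed: CPUID.1:ECX bit 0 reports SSE3


def sse_support(edx, ecx):
    """Report which SSE-family extensions the feature flags advertise."""
    return {
        "sse": bool((edx >> SSE_BIT_EDX) & 1),
        "sse2": bool((edx >> SSE2_BIT_EDX) & 1),
        "sse3": bool((ecx >> SSE3_BIT_ECX) & 1),
    }


# Illustrative flag values: SSE and SSE2 present, SSE3 absent.
support = sse_support(edx=(1 << 25) | (1 << 26), ecx=0)
```

Detection alone is not sufficient: the remaining steps of the procedure still have to enable the extensions before the instructions can execute.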

  • Página 385

    Vol. 3A 9-11 PROCESSOR MANAGEMENT AND INITIALIZATION 9.7.1 Real-Address Mode IDT In real-address mode, the only system data structure that must be loaded into memory is the IDT (also called the “interrupt vector table”). By default, the address of the base of the IDT is physical address 0H. This address can be changed by using the LIDT ins[...]

  • Página 386

    9-12 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION • If paging is to be used, at least one page directory and one page table. • A code segment that contains the code to be executed when the processor switches to protected mode. • One or more code modules that contain the necessary interrupt and exception handlers. Software initialization c[...]

  • Página 387

    Vol. 3A 9-13 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.2 Initializing Protected-Mode Exceptions and Interrupts Software initialization code must at a minimum load a protected-mode IDT with a gate descriptor for each exception vector that the processor can generate. If interrupt or trap gates are used, the gate descriptors can all point to th[...]

  • Página 388

    9-14 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION After the processor has switched to protected mode, the LTR instruction can be used to load a segment selector for a TSS descriptor into the task register. This instruction marks the TSS descriptor as busy, but does not perform a task switch. The processor can, however, use the TSS to loca[...]

  • Página 389

    Vol. 3A 9-15 PROCESSOR MANAGEMENT AND INITIALIZATION 64-bit mode consistency checks fail in the following circumstances: • An attempt is made to enable or disable IA-32e mode while paging is enabled. • IA-32e mode is enabled and an attempt is made to enable paging prior to enabling physical-address extensions (PAE). • IA-32e mode is activ[...]

  • Página 390

    9-16 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Compatibility mode execution is selected on a code-segment basis. This mode allows legacy applications to coexist with 64-bit applications running in 64-bit mode. An operating system running in IA-32e mode can execute existing 16-bit and 32-bit applications by clearing their code-segment de[...]

  • Página 391

    Vol. 3A 9-17 PROCESSOR MANAGEMENT AND INITIALIZATION 9.9 MODE SWITCHING To use the processor in protected mode after hardware or software reset, a mode switch must be performed from real-address mode. Once in protected mode, software generally does not need to return to real-address mode. To run software written to run in real-address mode [...]

  • Página 392

    9-18 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 6. Execute the LTR instruction to load the task register with a segment selector to the initial protected-mode task or to a writable area of memory that can be used to store TSS information on a task switch. 7. After entering protected mode, the segment registers continue to hold the contents[...]

  • Página 393

    Vol. 3A 9-19 PROCESSOR MANAGEMENT AND INITIALIZATION 4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode:
    — Limit = 64 KBytes (0FFFFH)
    — Byte granular (G = 0)
    — Expand up (E = 0)
    — Writable (W = 1)
    — Present (P = 1)
    — [...]

  • Página 394

    9-20 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.10 INITIALIZATION AND MODE SWITCHING EXAMPLE This section provides an initialization and mode switching example that can be incorporated into an application. This code was originally written to initialize the Intel386 processor, but it will execute successfully on the Pentium 4, Intel Xeon, P[...]

  • Página 395

    Vol. 3A 9-21 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-3. Processor State After Reset
    Table 9-4. Main Initialization Steps in STARTUP.ASM Source Listing
    STARTUP.ASM Line Numbers: From | To | Description
    157 | 157 | Jump (short) to the entry code in the EPROM
    162 | 169 | Construct a temporary GDT in RAM with one entry: 0 - null, 1 - R/W data segm[...]

  • Página 396

    9-22 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.1 Assembler Usage In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools. • The ASM386 will generate the right operand size opcodes a[...]

  • Página 397

    Vol. 3A 9-23 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.2 STARTUP.ASM Listing Example 9-1 provides high-level sample code designed to move the processor into protected mode. This listing does not include any opcode and offset information. Example 9-1. STARTUP.ASM MS-DOS* 5.0(045-N) 386(TM) MACRO ASSEMBLER STARTUP 09:44:51 08/19/92 PAGE 1 MS[...]

  • Página 398

    9-24 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION
    32 TSS_INDEX EQU 10
    33
    34 ; TSS_INDEX is the index of the TSS of the first task to
    35 ; run after startup
    36
    37
    38 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    39
    40 ; ------------------------- STRUCTURES and EQU ---------------
    41 ; structures for system data
    42
    43 ; TSS struct[...]

  • Página 399

    Vol. 3A 9-25 PROCESSOR MANAGEMENT AND INITIALIZATION
    79 LDT_reg DW ?
    80 LDT_h DW ?
    81 TRAP_reg DW ?
    82 IO_map_base DW ?
    83 TASK_STATE ENDS
    84
    85 ; basic structure of a descriptor
    86 DESC STRUC
    87 lim_0_15 DW ?
    88 bas_0_15 DW ?
    89 bas_16_23 DB ?
    90 access DB ?
    91 gran DB ?
    92 bas_24_31 DB ?
    93 DESC ENDS
    94
    95 ; structure for use with LGDT and LIDT [...]

  • Página 400

    9-26 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION
    126
    127 ; scratch areas for LGDT and LIDT instructions
    128 TEMP_GDT_SCRATCH TABLE_REG <>
    129 APP_GDT_RAM TABLE_REG <>
    130 APP_IDT_RAM TABLE_REG <>
    131 ; align end_data
    132 fill DW ?
    133
    134 ; last thing in this segment - should be on a dword boundary
    135 end_data LABEL BYTE
    136[...]

  • Página 401

    Vol. 3A 9-27 PROCESSOR MANAGEMENT AND INITIALIZATION
    175 MOV EBX,CR0
    176 OR EBX,PE_BIT
    177 MOV CR0,EBX
    178
    179 ; clear prefetch queue
    180 JMP CLEAR_LABEL
    181 CLEAR_LABEL:
    182
    183 ; make DS and ES address 4G of linear memory
    184 MOV CX,LINEAR_SEL
    185 MOV DS,CX
    186 MOV ES,CX
    187
    188 ; do board specific initialization
    189 ;
    190 ;
    191 ; ......
    192 ; [...]

  • Página 402

    9-28 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION
    222 MOV ECX, CS_BASE
    223 ADD ECX, OFFSET (IDT_EPROM)
    224 MOV ESI, [ECX].table_linear
    225 MOV EDI,EAX
    226 MOVZX ECX, [ECX].table_lim
    227 MOV APP_IDT_ram[EBX].table_lim,CX
    228 INC ECX
    229 MOV APP_IDT_ram[EBX].table_linear,EAX
    230 MOV EBX,EAX
    231 ADD EAX,ECX
    232 REP MOVS BYTE PTR ES:[EDI], BYTE[...]

  • Página 403

    Vol. 3A 9-29 PROCESSOR MANAGEMENT AND INITIALIZATION
    271
    272 ;assume no LDT used in the initial task - if necessary,
    273 ;code to move the LDT could be added, and should resemble
    274 ;that used to move the TSS
    275
    276 ; load task register
    277 LTR BX ; No task switch, only descriptor loading
    278 ; See Figure 9-6
    279 ; load minimal set of register[...]

  • Página 404

    9-30 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File) FFFF FFFFH Base=0, Limit=4G START: [CS.BASE+EIP] TEMP_GDT • Jump near start FFFF 0000H • Construct TEMP_GDT • LGDT • Move to protected mode DS, ES = GDT[1] 4 GB 0 GDT [1] GDT [0] GDT_[...]

  • Página 405

    Vol. 3A 9-31 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-5. Moving the GDT, IDT, and TSS from ROM to RAM (Lines 196-261 of List File) FFFF FFFFH GDT RAM • Move the GDT, IDT, TSS from ROM to RAM • Fix Aliases • LTR 0 RAM_START TSS IDT GDT TSS RAM IDT RAM[...]

  • Página 406

    9-32 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-6. Task Switching (Lines 282-296 of List File) GDT RAM RAM_START TSS RAM IDT RAM GDT Alias IDT Alias DS EIP EFLAGS CS SS 0 ES ESP • SS = TSS.SS • ESP = TSS.ESP • PUSH TSS.EFLAG • PUSH TSS.CS • PUSH TSS.EIP • ES = TSS.ES • DS = TSS.DS • IRET GDT[...]

  • Página 407

    Vol. 3A 9-33 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.3 MAIN.ASM Source Code The file MAIN.ASM shown in Example 9-2 defines the data and stack segments for this application and can be substituted with the main module task written in a high-level language that is invoked by the IRET instruction executed by STARTUP.ASM. Example 9-2. MAIN[...]

  • Página 408

    9-34 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Example 9-4. Build File
    INIT_BLD_EXAMPLE;
    SEGMENT
        *SEGMENTS(DPL = 0)
      , startup.startup_code(BASE = 0FFFF0000H)
    ;
    TASK
        BOOT_TASK(OBJECT = startup, INITIAL,DPL = 0, NOT INTENABLED)
      , PROTECTED_MODE_TASK(OBJECT = main_module,DPL = 0, NOT INTENABLED)
    ;
    TABLE
        GDT (
            LOCATION = GDT_EPROM
          , ENTRY = ([...]

  • Página 409

    Vol. 3A 9-35 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11 MICROCODE UPDATE FACILITIES The Pentium 4, Intel Xeon, and P6 family processors have the capability to correct errata by loading an Intel-supplied data block into the processor. The data block is called a microcode update. This section describes the mechanisms the BIOS needs to prov[...]

  • Página 410

    9-36 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.1 Microcode Update A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. Each microcode update is tailored for a specific list of processor signatures. A mismatch of the processor’s sign[...]

  • Página 411

    Vol. 3A 9-37 PROCESSOR MANAGEMENT AND INITIALIZATION
    Table 9-6. Microcode Update Field Definitions
    Field Name | Offset (bytes) | Length (bytes) | Description
    Header Version | 0 | 4 | Version number of the update header.
    Update Revision | 4 | 4 | Unique version number for the update, the basis for the update signature provided by the processor to indicate th[...]

  • Página 412

    9-38 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION
    Total Size | 32 | 4 | Specifies the total size of the microcode update in bytes. It is the summation of the header size, the encrypted data size and the size of the optional extended signature table.
    Reserved | 36 | 12 | Reserved fields for future expansion
    Update Data | 48 | Data Size or 2000 | Update data
    Exte[...]

  • Página 413

    Vol. 3A 9-39 PROCESSOR MANAGEMENT AND INITIALIZATION
    Checksum[n] | Data Size + 76 + (n * 12) | 4 | Used by utility software to decompose a microcode update into multiple microcode updates where each of the new updates is constructed without the optional Extended Processor Signature Table. To calculate the Checksum, substitute the Primary Processor Sig[...]

  • Página 414

    9-40 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.2 Optional Extended Signature Table The extended signature table is a structure that may be appended to the end of the encrypted data when the encrypted data only supports a single processor signature (optional case). The extended signature table will always be present when the encrypte[...]

  • Página 415

    Vol. 3A 9-41 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.3 Processor Identification Each microcode update is designed for a specific processor or set of processors. To determine the correct microcode update to load, software must ensure that one of the processor signatures embedded in the microcode update matches the 32-bit processor sig[...]

  • Página 416

    9-42 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.4 Platform Identification In addition to verifying the processor signature, the intended processor platform type must be determined to properly target the microcode update. The intended processor platform type is determined by reading the IA32_PLATFORM_ID register (MSR 17H). This 6[...]

  • Página 417

    Vol. 3A 9-43 PROCESSOR MANAGEMENT AND INITIALIZATION Example 9-6. Pseudo Code Example of Processor Flags Test
    Flag ← 1 << IA32_PLATFORM_ID[52:50]
    If (Update.HeaderVersion == 00000001h)
    {
        If (Update.ProcessorFlags & Flag)
        {
            Load Update
        }
        Else
        {
            //
            // Assume the Data Size has been used to calculate the
            // location of Update.ProcessorSign[...]
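The first part of Example 9-6 can be mirrored in a few lines. This is a sketch only: the function names are hypothetical, and `platform_id_msr` stands for a 64-bit IA32_PLATFORM_ID value that has already been read by some other means.

```python
def platform_flag(platform_id_msr):
    """Flag <- 1 << IA32_PLATFORM_ID[52:50], as in Example 9-6."""
    platform_id = (platform_id_msr >> 50) & 0x7   # bits 52:50
    return 1 << platform_id


def update_matches_platform(processor_flags, platform_id_msr):
    """True if the update's Processor Flags include this platform."""
    return bool(processor_flags & platform_flag(platform_id_msr))


# Illustrative MSR value whose bits 52:50 hold the platform ID 010B.
msr_value = 0b010 << 50
flag = platform_flag(msr_value)
```

Because the platform ID is three bits wide, the resulting flag always falls within the low eight bits of the Processor Flags field.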

  • Página 418

    9-44 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Example 9-7. Pseudo Code Example of Checksum Test
    N ← 512
    If (Update.DataSize != 00000000H)
        N ← Update.TotalSize / 4
    ChkSum ← 0
    For (I ← 0; I < N; I++)
    {
        ChkSum ← ChkSum + MicrocodeUpdate[I]
    }
    If (ChkSum == 00000000H)
        Success
    Else
        Fail
    9.11.6 Microcode Update Loader This section d[...]
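Example 9-7 can be turned into a small runnable check. The sketch below is illustrative: the function name is hypothetical, the update is assumed to be available as a bytes object, and the Data Size field is assumed to sit at header offset 28 (Total Size at offset 32 is given in Table 9-6). The sum is taken modulo 2^32 to model 32-bit wraparound.

```python
import struct

DATA_SIZE_OFFSET = 28    # assumed offset of the Data Size field
TOTAL_SIZE_OFFSET = 32   # offset of the Total Size field (Table 9-6)


def checksum_ok(update):
    """Return True if the DWORD sum of the update is zero modulo 2**32."""
    (data_size,) = struct.unpack_from("<I", update, DATA_SIZE_OFFSET)
    (total_size,) = struct.unpack_from("<I", update, TOTAL_SIZE_OFFSET)
    n = 512                      # default: 2048 bytes, i.e. 512 DWORDs
    if data_size != 0:
        n = total_size // 4
    dwords = struct.unpack_from("<%dI" % n, update, 0)
    return (sum(dwords) & 0xFFFFFFFF) == 0


def make_sample_update():
    """Build a synthetic 2048-byte block whose checksum balances to zero."""
    block = bytearray(2048)
    struct.pack_into("<I", block, 0, 1)            # header version
    struct.pack_into("<I", block, 4, 0x12345678)   # made-up revision
    partial = sum(struct.unpack("<512I", block)) & 0xFFFFFFFF
    struct.pack_into("<I", block, 16, (-partial) & 0xFFFFFFFF)
    return bytes(block)
```

The synthetic block contains no real microcode; it exists only so that the check can be exercised end to end.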

  • Página 419

    Vol. 3A 9-45 PROCESSOR MANAGEMENT AND INITIALIZATION The loader shown in Example 9-8 assumes that update is the address of a microcode update (header and data) embedded within the code segment of the BIOS. It also assumes that the processor is operating in real mode. The data may reside anywhere in memory, aligned on a 16-byte boundary, tha[...]

  • Página 420

    9-46 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.6.3 Update in a System Supporting Intel Hyper-Threading Technology Intel Hyper-Threading Technology has implications on the loading of the microcode update. The update must be loaded for each core in a physical processor. Thus, for a processor supporting Hyper-Threading Technology,[...]

  • Página 421

    Vol. 3A 9-47 PROCESSOR MANAGEMENT AND INITIALIZATION CPUID returns a value in a model specific register in addition to its usual register return values. The semantics of CPUID cause it to deposit an update ID value in the 64-bit model-specific register at address 08BH (IA32_BIOS_SIGN_ID). If no update is present in the processor, the value in[...]

  • Página 422

    9-48 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION The IA32_BIOS_SIGN_ID register is used to report the microcode update signature when CPUID executes. The signature is returned in the upper DWORD (Table 9-11). 9.11.7.2 Authenticating the Update An update may be authenticated by the BIOS using the signature primitive, described above, and[...]
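Extracting the signature from the 64-bit register value is a single shift. The sketch below is a minimal illustration with hypothetical function names; it assumes the MSR value has already been read, and that software cleared the register before executing CPUID so that a zero upper DWORD means no update is loaded.

```python
def update_signature(bios_sign_id):
    """Upper DWORD of a 64-bit IA32_BIOS_SIGN_ID value."""
    return (bios_sign_id >> 32) & 0xFFFFFFFF


def update_present(bios_sign_id):
    """Assumes the MSR was zeroed before CPUID deposited the update ID."""
    return update_signature(bios_sign_id) != 0


# Illustrative register value carrying a made-up signature of 0000002EH.
sample_msr = 0x0000002E_00000000
signature = update_signature(sample_msr)
```

A BIOS could compare this value with the Update Revision field of the update it just loaded to confirm that the load took effect.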

  • Página 423

    Vol. 3A 9-49 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8 Pentium 4, Intel Xeon, and P6 Family Processor Microcode Update Specifications This section describes the interface that an application can use to dynamically integrate processor-specific updates into the system BIOS. In this discussion, the application is referred to as the calling [...]

  • Página 424

    9-50 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION update blocks for each microcode update. In an MP system, a common microcode update may be sufficient for each socket in the system. For IA-32 processors earlier than family 0FH and model 03H, the microcode update is 2 KBytes. An MP-capable BIOS that supports multiple steppings must allocate[...]

  • Página 425

    Vol. 3A 9-51 PROCESSOR MANAGEMENT AND INITIALIZATION
    {
        If ((Update.ProcessorSignature[N] == Processor Signature) &&
            (Update.ProcessorFlags[N] & Platform Bits))
        {
            Load Update.UpdateData into the Processor;
            Verify update was correctly loaded into the processor
            Go on to next processor
            Break;
        }
        N ← N + 1
    }
    I ← I + (Update.TotalSize / 20[...]

  • Página 426

    9-52 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION • The calling program should read any update data that already exists in the BIOS in order to make decisions about the appropriateness of loading the update. The BIOS must refuse to overwrite a newer update with an older version. The update header contains information about version and proc[...]

  • Página 427

    Vol. 3A 9-53 PROCESSOR MANAGEMENT AND INITIALIZATION
    For each processor
    {
        If ((this is a unique processor stepping) AND
            (we have a unique update in the database for this processor))
        {
            Checksum the update from the database;
            If Checksum fails
                exit
            NumBlocks ← NumBlocks + size of microcode update / 2048
        }
    }
    //
    // Do we have enough update slots for a[...]

  • Página 428

    9-54 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION
    }
    //
    // Verify the update was loaded correctly
    //
    Issue the ReadUpdate function
    If an error occurred
    {
        Display Diagnostic
        exit
    }
    //
    // Compare the Update read to that written
    //
    If (Update read != Update written)
    {
        Display Diagnostic
        exit
    }
    I ← I + (size of microcode update / 2048)
    }
    //
    // Enab[...]

  • Página 429

    Vol. 3A 9-55 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8.4 INT 15H-based Interface Intel recommends that a BIOS interface be provided that allows additional microcode updates to be added to system flash. The INT15H interface is the Intel-defined method for doing this. The program that calls this interface is responsible for providing three[...]

  • Página 430

    9-56 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Description In order to assure that the BIOS function is present, the caller must verify the carry flag, the return code, and the 64-bit signature. The update count reflects the number of 2048-byte blocks available for storage within one non-volatile RAM. The loader version number refers to[...]

  • Página 431

    Vol. 3A 9-57 PROCESSOR MANAGEMENT AND INITIALIZATION Description The BIOS is responsible for selecting an appropriate update block in the non-volatile storage for storing the new update. This BIOS is also responsible for ensuring the integrity of the information provided by the caller, including authenticating the proposed update before i[...]

  • Página 432

    9-58 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION If no unused update blocks are available and the above criteria are not met, the BIOS can overwrite update block(s) for a processor stepping that is no longer present in the system. This can be done by scanning the update blocks and comparing the processor steppings, identified in the MP Sp[...]

  • Página 433

    Vol. 3A 9-59 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-8. Microcode Update Write Operation Flow [1] Flow chart labels: Write Microcode Update; Valid Update Header Version?; Loader Revision Match BIOS’s Loader?; Does Update Match a CPU in the System?; Does Update Checksum Correctly?; Yes; No; Return CPU_NOT_PRES[...]

  • Página 434

    9-60 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-9. Microcode Update Write Operation Flow [2] Flow chart labels: Return INVALID_REVISION; Update Revision Newer Than NVRAM Update?; Update Pass Authenticity Test?; Return SECURITY_FAILURE; Update NVRAM Record; Return SUCCESS; Update Matching CPU Already In NVRAM?; Yes; Sp[...]

  • Page 435

    Vol. 3A 9-61 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8.7 Function 02H—Microcode Update Control This function enables loading of binary updates into the processor. Table 9-15 lists the parameters and return codes for the function. Description This control is provided on a global basis for all updates and processors. The caller can d[...]

  • Page 436

    9-62 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION The READ_FAILURE error code returned by this function has meaning only if the control function is implemented in the BIOS NVRAM. The state of this feature (enabled/disabled) can also be implemented using CMOS RAM bits where READ failure errors cannot occur. 9.11.8.8 Function 03H—Read[...]

  • Page 437

    Vol. 3A 9-63 PROCESSOR MANAGEMENT AND INITIALIZATION Description The read function enables the caller to read any microcode update data that already exists in a BIOS and make decisions about the addition of new updates. As a result of a successful call, the BIOS copies the microcode update into the location pointed to by ES:DI, with the contents[...]

  • Page 438

    9-64 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION UPDATE_NUM_INVALID 99H The update number exceeds the maximum number of update blocks implemented by the BIOS. NOT_EMPTY 9AH The specified update block is a subsequent block in use to store a valid microcode update that spans multiple blocks. The specified block is not a header block and is [...]

  • Page 439

    10 Memory Cache Control[...]

  • Page 440

    [...]

  • Page 441

    Vol. 3A 10-1 CHAPTER 10 MEMORY CACHE CONTROL This chapter describes the IA-32 architecture’s memory cache and cache control mechanisms, the TLBs, and the store buffer. It also describes the memory type range registers (MTRRs) found in the P6 family processors and how they are used to control caching of physical memory locations. 10.1 [...]

  • Page 442

    10-2 Vol. 3A MEMORY CACHE CONTROL Table 10-1. Characteristics of the Caches, TLBs, Store Buffer, and Write Combining Buffer in IA-32 Processors Cache or Buffer Characteristics Trace Cache 1 - Pentium 4 and Intel Xeon processors: 12 Kμops, 8-way set associative. - Pentium M processor: not implemented. - P6 family and Pentium processors: not i[...]

  • Page 443

    Vol. 3A 10-3 MEMORY CACHE CONTROL The IA-32 processors implement four types of caches: the trace cache, the level 1 (L1) cache, the level 2 (L2) cache, and the level 3 (L3) cache (see Figure 10-1). The use of these caches differs among the Pentium 4, Intel Xeon, and P6 family processors, as follows: • Pentium 4 and Intel Xeon processors — The [...]

  • Page 444

    10-4 Vol. 3A MEMORY CACHE CONTROL The trace cache in the Pentium 4 and Intel Xeon processors is an integral part of the Intel NetBurst microarchitecture and is available in all execution modes: protected mode, system management mode (SMM), and real-address mode. The L1, L2, and L3 caches are also available in all execution modes; however, use of [...]

  • Page 445

    Vol. 3A 10-5 MEMORY CACHE CONTROL When the processor attempts to write an operand to a cacheable area of memory, it first checks if a cache line for that memory location exists in the cache. If a valid cache line does exist, the processor (depending on the write policy currently in force) can write the operand into the cache instead of writing i[...]

  • Page 446

    10-6 Vol. 3A MEMORY CACHE CONTROL NOTE The behavior of FP and SSE/SSE2 operations on operands in UC memory is implementation dependent. In some implementations, accesses to UC memory may occur more than once. To ensure predictable behavior, use loads and stores of general purpose registers to access UC memory that may have read or write side eff[...]

  • Page 447

    Vol. 3A 10-7 MEMORY CACHE CONTROL memory. When writing through to memory, invalid cache lines are never filled, and valid cache lines are either filled or invalidated. Write combining is allowed. This type of cache-control is appropriate for frame buffers or when there are devices on the system bus that access system memory, but do not perf[...]

  • Page 448

    10-8 Vol. 3A MEMORY CACHE CONTROL 10.3.1 Buffering of Write Combining Memory Locations Writes to the WC memory type are not cached in the typical sense of the word cached. They are retained in an internal write combining buffer (WC buffer) that is separate from the internal L1, L2, and L3 caches and the store buffer. The WC buffer is no[...]

  • Page 449

    Vol. 3A 10-9 MEMORY CACHE CONTROL The only elements of WC propagation to the system bus that are guaranteed are those provided by transaction atomicity. For example, with a P6 family processor, a completely full WC buffer will always be propagated as a single 32-bit burst transaction using any chunk order. In a WC buffer eviction where the da[...]

  • Page 450

    10-10 Vol. 3A MEMORY CACHE CONTROL For a description of these instructions and their intended use, see Section 10.5.5, “Cache Management Instructions.” 10.4 CACHE CONTROL PROTOCOL The following section describes the cache control protocol currently defined for the IA-32 architecture. This protocol is used by the Pentium 4, Intel Xeon, P6[...]

  • Page 451

    Vol. 3A 10-11 MEMORY CACHE CONTROL • Cache control and memory ordering instructions — The IA-32 architecture provides several instructions that control the caching of data, the ordering of memory reads and writes, and the prefetching of data. These instructions allow software to control the caching of specific data structures, to control mem[...]

  • Page 452

    10-12 Vol. 3A MEMORY CACHE CONTROL Figure 10-2. Cache-Control Registers and Bits Available in IA-32 Processors. Figure labels: Page-Directory or Page-Table Entry; TLBs; MTRRs; Physical Memory, 0 to FFFFFFFFH; CD and NW flags control overall caching of system memory; PCD and PWT flags control page-level caching; G flag controls page-level flushing of TLBs; MTRRs cont[...]

  • Page 453

    Vol. 3A 10-13 MEMORY CACHE CONTROL Table 10-5. Cache Operating Modes CD NW Caching and Read/Write Policy L1 L2/L3 1 0 0 Normal Cache Mode. Highest performance cache operation. - Read hits access the cache; read misses may cause replacement. - Write hits update the cache. - Only writes to shared lines and write misses update system memory. -[...]

  • Page 454

    10-14 Vol. 3A MEMORY CACHE CONTROL • NW flag, bit 29 of control register CR0 — Controls the write policy for system memory locations (see Section 2.5, “Control Registers”). If the NW and CD flags are clear, write-back is enabled for the whole of system memory, but may be restricted for individual pages or regions of memory by other[...]
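
As a rough illustration of how the CD and NW flags combine, the sketch below decodes a CR0 value into a cache operating mode. The NW bit position (29) comes from the text. The CD bit position (30), the mode labels, and the function name are assumptions recalled from Intel's published CR0 layout and Table 10-5, not taken from this excerpt.

```python
CR0_NW = 1 << 29  # NW flag, bit 29 of CR0 (stated in the text)
CR0_CD = 1 << 30  # CD flag, bit 30 of CR0 (assumption: not visible in this excerpt)


def cache_mode(cr0: int) -> str:
    """Return a descriptive label for the CD/NW combination in a CR0 value.

    The labels are illustrative summaries, not official mode names.
    """
    cd = bool(cr0 & CR0_CD)
    nw = bool(cr0 & CR0_NW)
    if not cd and not nw:
        return "normal"            # caching and write-back enabled
    if not cd and nw:
        return "invalid"           # CD=0, NW=1 is not a legal setting
    if cd and not nw:
        return "no-fill"           # caches still hit, but no new line fills
    return "no-fill, coherency not maintained"  # CD=1, NW=1


if __name__ == "__main__":
    for value in (0, CR0_NW, CR0_CD, CR0_CD | CR0_NW):
        print(f"CR0={value:#010x}: {cache_mode(value)}")
```

Only the two flag bits are examined, so the function can be applied to a full CR0 image without masking it first.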

  • Page 455

    Vol. 3A 10-15 MEMORY CACHE CONTROL • Memory type range registers (MTRRs) (introduced in P6 family processors) — Control the type of caching used in specific regions of physical memory. Any of the caching types described in Section 10.3, “Methods of Caching Available,” can be selected. See Section 10.11, “Memory Type Range Regist[...]

  • Page 456

    10-16 Vol. 3A MEMORY CACHE CONTROL 10.5.2.1 Selecting Memory Types for Pentium Pro and Pentium II Processors The Pentium Pro and Pentium II processors do not support the PAT. Here, the effective memory type for a page is selected with the MTRRs and the PCD and PWT bits in the page-table or page-directory entry for the page. Table 10-6 descr[...]

  • Page 457

    Vol. 3A 10-17 MEMORY CACHE CONTROL 4. Setting the PCD and PWT flags to opposite values is considered model-specific for the WP and WC memory types and architecturally-defined for the WB, WT, and UC memory types. 10.5.2.2 Selecting Memory Types for Pentium 4, Intel Xeon, and Pentium III Processors The Pentium 4, Intel Xeon, and Pentium III pro[...]

  • Page 458

    10-18 Vol. 3A MEMORY CACHE CONTROL 10.5.2.3 Writing Values Across Pages with Different Memory Types If two adjoining pages in memory have different memory types, and a word or longer operand is written to a memory location that crosses the page boundary between those two pages, the operand might be written to memory twice. This action doe[...]

  • Page 459

    Vol. 3A 10-19 MEMORY CACHE CONTROL 3. Disable the MTRRs and set the default memory type to uncached or set all MTRRs for the uncached memory type (see the discussion of the TYPE field and the E flag in Section 10.11.2.1, “IA32_MTRR_DEF_TYPE MSR”). The caches must be flushed (step 2) after the CD flag is set to ensure system m[...]

  • Page 460

    10-20 Vol. 3A MEMORY CACHE CONTROL modified lines (such as during testing or fault recovery where cache coherency with main memory is not a concern), software should use the WBINVD instruction. The WBINVD instruction first writes back any modified lines in all the internal caches, then invalidates the contents of the L1, L2, and L3 caches.[...]

  • Page 461

    Vol. 3A 10-21 MEMORY CACHE CONTROL 10.5.6.1 Adaptive Mode Adaptive mode facilitates L1 data cache sharing between logical processors. When running in adaptive mode, the L1 data cache is shared across logical processors in the same core if: • CR3 control registers for logical processors sharing the cache are identical. • The same paging mode i[...]

  • Page 462

    10-22 Vol. 3A MEMORY CACHE CONTROL For Intel486 processors, a write to an instruction in the cache will modify it in both the cache and memory, but if the instruction was prefetched before the write, the old version of the instruction could be the one executed. To prevent the old instruction from being executed, flush the instruction prefetch [...]

  • Page 463

    Vol. 3A 10-23 MEMORY CACHE CONTROL cache hierarchy now or as soon as possible, in anticipation of its use. The instructions provide different variations of the hint that allow selection of the cache level into which data will be read. The PREFETCHh instructions can help reduce the long latency typically associated with reading data from memor[...]

  • Page 464

    10-24 Vol. 3A MEMORY CACHE CONTROL 10.10 STORE BUFFER IA-32 processors temporarily store each write (store) to memory in a store buffer. The store buffer improves processor performance by allowing the processor to continue executing instructions without having to wait until a write to memory and/or to a cache is complete. It also allows writ[...]

  • Page 465

    Vol. 3A 10-25 MEMORY CACHE CONTROL ization software should then set the MTRRs to a specific, system-defined memory map. Typically, the BIOS (basic input/output system) software configures the MTRRs. The operating system or executive is then free to modify the memory map using the normal page-level cacheability attributes. In a multiproce[...]

  • Page 466

    10-26 Vol. 3A MEMORY CACHE CONTROL 10.11.1 MTRR Feature Identification The availability of the MTRR feature is model-specific. Software can determine if MTRRs are supported on a processor by executing the CPUID instruction and reading the state of the MTRR flag (bit 12) in the feature information register (EDX). If the MTRR flag is set (indica[...]

  • Page 467

    Vol. 3A 10-27 MEMORY CACHE CONTROL • WC (write combining) flag, bit 10 — The write-combining (WC) memory type is supported when set; the WC type is not supported when clear. Bit 9 and bits 11 through 63 in the IA32_MTRRCAP MSR are reserved. If software attempts to write to the IA32_MTRRCAP MSR, a general-protection exception (#GP) is gen[...]
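
The field layout of IA32_MTRRCAP can be sketched as a small decoder. Bit 10 (WC) and the reserved bits come from the text above. The VCNT field (bits 0 through 7) and the FIX flag (bit 8) are assumptions based on Intel's published register layout, since their descriptions are not visible in this excerpt. The function name and the sample value are hypothetical.

```python
def decode_mtrrcap(value: int) -> dict:
    """Split a raw IA32_MTRRCAP value into its documented fields."""
    return {
        "vcnt": value & 0xFF,            # number of variable-range MTRR pairs (assumed bits 0-7)
        "fix": bool(value & (1 << 8)),   # fixed-range MTRRs supported (assumed bit 8)
        "wc": bool(value & (1 << 10)),   # write-combining type supported (bit 10, from the text)
    }


if __name__ == "__main__":
    # 0x508 is an illustrative value only: VCNT=8, FIX=1, WC=1.
    print(decode_mtrrcap(0x508))
```

Because the MSR is read-only, a decoder like this is all software needs; attempting to write the register raises #GP as noted above.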

  • Page 468

    10-28 Vol. 3A MEMORY CACHE CONTROL • FE (fixed MTRRs enabled) flag, bit 10 — Fixed-range MTRRs are enabled when set; fixed-range MTRRs are disabled when clear. When the fixed-range MTRRs are enabled, they take priority over the variable-range MTRRs when overlaps in ranges occur. If the fixed-range MTRRs are disabled, the variable-range MTR[...]

  • Page 469

    Vol. 3A 10-29 MEMORY CACHE CONTROL For the P6 family processors, the prefix for the fixed range MTRRs is MTRRfix. 10.11.2.3 Variable Range MTRRs The Pentium 4, Intel Xeon, and P6 family processors permit software to specify the memory type for eight variable-size address ranges, using a pair of MTRRs for each range. The first entry in each pa[...]

  • Page 470

    10-30 Vol. 3A MEMORY CACHE CONTROL • PhysBase field, bits 12 through (MAXPHYADDR-1) — Specifies the base address of the address range. This 24-bit value, in the case where MAXPHYADDR is 36 bits, is extended by 12 bits at the low end to form the base address (this automatically aligns the address on a 4-KByte boundary). • PhysMask field, b[...]

  • Page 471

    Vol. 3A 10-31 MEMORY CACHE CONTROL All other bits in the IA32_MTRR_PHYSBASEn and IA32_MTRR_PHYSMASKn registers are reserved; the processor generates a general-protection exception (#GP) if software attempts to write to them. Some mask values can result in ranges that are not continuous. In such ranges, the area not mapped by the mask value i[...]
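
To make the remark about non-continuous ranges concrete, the sketch below applies the usual matching rule for a variable-range pair: an address falls in the range when the address ANDed with the mask equals the base ANDed with the mask. That rule is recalled from Intel's description of the PhysMask field rather than from the truncated text here, and the function name and sample values are hypothetical. Base and mask are given as full physical-address values for a 36-bit MAXPHYADDR.

```python
def in_variable_range(addr: int, base: int, mask: int) -> bool:
    """True when addr is covered by the (base, mask) pair."""
    return (addr & mask) == (base & mask)


if __name__ == "__main__":
    # Continuous 64-MByte range starting at 64 MBytes.
    base, mask = 0x0400_0000, 0xF_FC00_0000
    print(in_variable_range(0x0400_0000, base, mask))  # start of range
    print(in_variable_range(0x07FF_FFFF, base, mask))  # last byte of range
    print(in_variable_range(0x0800_0000, base, mask))  # just past the range

    # A mask with bit 28 cleared leaves a hole: it matches 0-64 MBytes and
    # 256-320 MBytes, but nothing in between.
    holey_mask = 0xF_EC00_0000
    print(in_variable_range(0x1000_0000, 0, holey_mask))
    print(in_variable_range(0x0400_0000, 0, holey_mask))
```

A mask whose set bits are not contiguous from the top down produces exactly this kind of split range, which is why such masks are usually avoided.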

  • Page 472

    10-32 Vol. 3A MEMORY CACHE CONTROL 10.11.3 Example Base and Mask Calculations The examples in this section apply to processors that support a maximum physical address size of 36 bits. The base and mask values entered in variable-range MTRR pairs are 24-bit values that the processor extends to 36 bits. For example, to enter a base address of 2[...]

  • Page 473

    Vol. 3A 10-33 MEMORY CACHE CONTROL The following settings for the MTRRs will yield the proper mapping of the physical address space for this system configuration. IA32_MTRR_PHYSBASE0 = 0000 0000 0000 0006H IA32_MTRR_PHYSMASK0 = 0000 000F FC00 0800H Caches 0-64 MByte as WB cache type. IA32_MTRR_PHYSBASE1 = 0000 0000 0400 0006H IA32_MTRR_PHYSMASK[...]
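
The first pair of values above can be reproduced with a short calculation. This is a minimal sketch that assumes a 36-bit MAXPHYADDR, a power-of-two range size, and a base aligned to that size. The type encoding 06H for WB and the valid flag in bit 11 of the mask register are inferred from the listed values (0006H and 0800H) and from Intel's register layout. The function name is hypothetical.

```python
WB = 0x06          # write-back type encoding (matches the ...0006H base values)
VALID = 1 << 11    # valid flag in IA32_MTRR_PHYSMASKn (matches the ...0800H mask values)


def variable_mtrr_pair(base: int, size: int, mem_type: int, maxphyaddr: int = 36):
    """Return (PHYSBASE, PHYSMASK) register values for one variable range."""
    if size < 4096 or size & (size - 1):
        raise ValueError("size must be a power of two of at least 4 KBytes")
    if base % size:
        raise ValueError("base must be aligned to the range size")
    addr_bits = (1 << maxphyaddr) - 1
    if base + size - 1 > addr_bits:
        raise ValueError("range exceeds the physical address space")
    physbase = base | mem_type
    physmask = (addr_bits & ~(size - 1)) | VALID
    return physbase, physmask


if __name__ == "__main__":
    mbyte = 1 << 20
    base_reg, mask_reg = variable_mtrr_pair(0, 64 * mbyte, WB)
    print(f"PHYSBASE0 = {base_reg:016X}H")
    print(f"PHYSMASK0 = {mask_reg:016X}H")
```

The alignment check reflects the usual constraint that a single pair can describe only a naturally aligned power-of-two block; other layouts need several pairs.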

  • Page 474

    10-34 Vol. 3A MEMORY CACHE CONTROL Caches 96-100 MByte as WB cache type. IA32_MTRR_PHYSBASE3 = 0000 0000 0400 0000H IA32_MTRR_PHYSMASK3 = 0000 00FF FFC0 0800H Caches 64-68 MByte as UC cache type. IA32_MTRR_PHYSBASE4 = 0000 0000 00F0 0000H IA32_MTRR_PHYSMASK4 = 0000 00FF FFF0 0800H Caches 15-16 MByte as UC cache type. IA32_MTRR_PHYSBASE5 = 0000 0[...]

  • Page 475

    Vol. 3A 10-35 MEMORY CACHE CONTROL d. If two or more variable memory ranges match and the memory types are WT and WB, the WT memory type is used. e. For overlaps not defined by the above rules, processor behavior is undefined. 3. If no fixed or variable memory range matches, the processor uses the default memory type. 10.11.5 MTRR Initializati[...]
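
The precedence rules for overlapping variable ranges can be sketched as a small resolver. Rules d, e, and 3 appear in the text above. The earlier rules (identical types yield that type; UC wins over any other type) are recalled from Intel's documentation and should be treated as assumptions here, as should the function name and the choice to return `None` for the undefined case. Fixed-range priority is not modeled.

```python
from typing import Iterable, Optional


def effective_variable_type(matches: Iterable[str], default_type: str) -> Optional[str]:
    """Resolve the memory type for an address from the variable ranges it matches.

    Returns None when the overlap is one for which processor behavior is undefined.
    """
    types = set(matches)
    if not types:
        return default_type      # rule 3: nothing matched, use the default type
    if len(types) == 1:
        return types.pop()       # assumed rule: identical types yield that type
    if "UC" in types:
        return "UC"              # assumed rule: UC takes precedence
    if types == {"WT", "WB"}:
        return "WT"              # rule d
    return None                  # rule e: undefined overlap


if __name__ == "__main__":
    print(effective_variable_type([], "UC"))
    print(effective_variable_type(["WB", "WT"], "UC"))
    print(effective_variable_type(["WB", "WC"], "UC"))
```

Returning `None` rather than guessing keeps the undefined case visible to the caller, which can then reject the proposed range layout.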

  • Page 476

    10-36 Vol. 3A MEMORY CACHE CONTROL 10.11.7 MTRR Maintenance Programming Interface The operating system maintains the MTRRs after booting and sets up or changes the memory types for memory-mapped devices. The operating system should provide a driver and application programming interface (API) to access and set the MTRRs. The function calls Mem[...]

  • Page 477

    Vol. 3A 10-37 MEMORY CACHE CONTROL The pseudocode for the Get4KMemType() function in Example 10-1 7 obtains the memory type for a single 4-KByte range at a given physical address. The sample code determines whether a PHY_ADDRESS falls within a fixed range by comparing the address with the known fixed ranges: 0 to 7FFFFH (64-KByte regions), 8[...]

  • Page 478

    10-38 Vol. 3A MEMORY CACHE CONTROL FI; IF IA32_MTRRCAP.FIX is set AND range can be mapped using a fixed-range MTRR THEN pre_mtrr_change(); update affected MTRR; post_mtrr_change(); FI; ELSE (* try to map using a variable MTRR pair *) IF IA32_MTRRCAP.VCNT = 0 THEN return UNSUPPORTED; FI; IF conflicts with current variable ranges THEN return RANGE_O[...]

  • Page 479

    Vol. 3A 10-39 MEMORY CACHE CONTROL The physical address to variable range mapping algorithm in the MemTypeSet function detects conflicts with current variable range registers by cycling through them and determining whether the physical address in question matches any of the current ranges. During this scan, the algorithm can detect whether an[...]

  • Page 480

    10-40 Vol. 3A MEMORY CACHE CONTROL 6. If the PGE flag is set in control register CR4, flush all TLBs by clearing that flag. 7. If the PGE flag is clear in control register CR4, flush all TLBs by executing a MOV from control register CR3 to another register and then a MOV from that register back to CR3. 8. Disable all range registers (by clearing t[...]

  • Page 481

    Vol. 3A 10-41 MEMORY CACHE CONTROL The Pentium 4, Intel Xeon, and P6 family processors provide special support for the physical memory range from 0 to 4 MBytes, which is potentially mapped by both the fixed and variable MTRRs. This support is invoked when a Pentium 4, Intel Xeon, or P6 family processor detects a large page overlapping the fi[...]

  • Page 482

    10-42 Vol. 3A MEMORY CACHE CONTROL 10.12.2 IA32_CR_PAT MSR The IA32_CR_PAT MSR is located at MSR address 277H (see Appendix B, “Model-Specific Registers (MSRs)”), and this address will remain the same on future IA-32 processors that support the PAT feature. Figure 10-7 shows the format of the 64-bit IA32_CR_PAT MSR. Th[...]

  • Page 483

    Vol. 3A 10-43 MEMORY CACHE CONTROL 10.12.3 Selecting a Memory Type from the PAT To select a memory type for a page from the PAT, a 3-bit index made up of the PAT, PCD, and PWT bits must be encoded in the page-table or page-directory entry for the page. Table 10-11 shows the possible encodings of the PAT, PCD, and PWT bits and the P[...]
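
The index calculation described above can be illustrated with a short lookup. This is a sketch under stated assumptions: the bit order (PAT as the most significant bit, PWT as the least), the one-byte-per-entry layout of IA32_CR_PAT, the type encodings, and the power-on value 0007040600070406H are all recalled from Intel's documentation rather than from the truncated text. The function and constant names are hypothetical.

```python
# Assumed encodings of a PAT entry's low three bits (2 and 3 are reserved).
PAT_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB", 7: "UC-"}

# Assumed power-on value of IA32_CR_PAT.
DEFAULT_PAT = 0x0007_0406_0007_0406


def pat_index(pat: int, pcd: int, pwt: int) -> int:
    """Combine the PAT, PCD, and PWT page bits into the 3-bit PAT index."""
    return ((pat & 1) << 2) | ((pcd & 1) << 1) | (pwt & 1)


def pat_memory_type(pat_msr: int, pat: int, pcd: int, pwt: int) -> str:
    """Look up the memory type selected by a page's PAT, PCD, and PWT bits."""
    entry = (pat_msr >> (8 * pat_index(pat, pcd, pwt))) & 0x7
    return PAT_TYPES.get(entry, "reserved")


if __name__ == "__main__":
    for bits in ((0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)):
        print(bits, pat_memory_type(DEFAULT_PAT, *bits))
```

With the assumed default value, the four combinations of PCD and PWT select the same types whether the PAT bit is 0 or 1, because the upper four entries repeat the lower four.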

  • Page 484

    10-44 Vol. 3A MEMORY CACHE CONTROL The values in all the entries of the PAT can be changed by writing to the IA32_CR_PAT MSR using the WRMSR instruction. The IA32_CR_PAT MSR is read and write accessible (use of the RDMSR and WRMSR instructions, respectively) to software operating at a CPL of 0. Table 10-10 shows the allowable encoding of th[...]

  • Page 485

    Vol. 3A 10-45 MEMORY CACHE CONTROL 10.12.5 PAT Compatibility with Earlier IA-32 Processors For IA-32 processors that support the PAT, the IA32_CR_PAT MSR is always active. That is, the PCD and PWT bits in page-table entries and in page-directory entries (that point to pages) always select a memory type for a page indirectly by selec[...]

  • Page 486

    10-46 Vol. 3A MEMORY CACHE CONTROL[...]

  • Page 487

    11 Intel® MMX™ Technology System Programming[...]

  • Page 488

    [...]

  • Page 489

    Vol. 3A 11-1 CHAPTER 11 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING This chapter describes those features of the Intel® MMX™ technology that must be considered when designing or enhancing an operating system to support MMX technology. It covers MMX instruction set emulation, the MMX state, aliasing of MMX registers, saving MMX state, t[...]

  • Page 490

    11-2 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING When a value is written into an MMX register using an MMX instruction, the value also appears in the corresponding floating-point register in bits 0 through 63. Likewise, when a floating-point value is written into a floating-point register by the x87 FPU, the low 64 bits of that value al[...]

  • Page 491

    Vol. 3A 11-3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING Execution of MMX instructions does not affect the other bits in the x87 FPU status word (bits 0 through 10 and bits 14 and 15) or the contents of the other x87 FPU registers that comprise the x87 FPU state (the x87 FPU control word, instruction pointer, data pointer, or opcode regis[...]

  • Page 492

    11-4 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING 11.3 SAVING AND RESTORING THE MMX STATE AND REGISTERS Because the MMX registers are aliased to the x87 FPU data registers, the MMX state can be saved to memory and restored from memory as follows: • Execute an FSAVE, FNSAVE, or FXSAVE instruction to save the MMX state to memor[...]

  • Page 493

    Vol. 3A 11-5 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING NOTE The IA-32 architecture does not support scanning the x87 FPU tag word and then only saving valid entries. 11.4 SAVING MMX STATE ON TASK OR CONTEXT SWITCHES When switching from one task or context to another, it is often necessary to save the MMX state. As a general rule, if the e[...]

  • Page 494

    11-6 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING • Other exceptions can occur indirectly due to the faulty execution of the exception handlers for the above exceptions. 11.5.1 Effect of MMX Instructions on Pending x87 Floating-Point Exceptions If an x87 FPU floating-point exception is pending and the processor encounters an MMX in[...]

  • Page 495

    Vol. 3A 11-7 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING Figure 11-2. Mapping of MMX Registers to x87 FPU Data Register Stack. Figure labels: MM0 through MM7; ST0, ST1, ST2, ST6, ST7; TOS; x87 FPU “push”; x87 FPU “pop”; Case A: TOS=0; Case B: TOS=2; Outer circl[...]

  • Page 496

    11-8 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING[...]

  • Page 497

    12 SSE, SSE2 and SSE3 System Programming[...]

  • Page 498

    [...]

  • Page 499

    Vol. 3A 12-1 CHAPTER 12 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING This chapter describes features of the streaming SIMD extensions (SSE), streaming SIMD extensions 2 (SSE2) and streaming SIMD extensions 3 (SSE3) that must be considered when designing or enhancing an operating system to support the Pentium III, Pentium 4, and Intel Xeon processors.[...]

  • Page 500

    12-2 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING 12.1.2 Checking for SSE/SSE2/SSE3 Extension Support If the processor attempts to execute an unsupported SSE/SSE2/SSE3 instruction, the processor will generate an invalid-opcode exception (#UD). Before an operating system or executive attempts to use SSE/SSE2/SSE3 extensions, it should check tha[...]

  • Page 501

    Vol. 3A 12-3 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING NOTE The OSFXSR and OSXMMEXCPT bits in control register CR4 must be set by the operating system. The processor has no other way of detecting operating-system support for the FXSAVE and FXRSTOR instructions or for handling SIMD floating-point exceptions. 3. Clear CR0.EM[bit 2] = 0. This action d[...]

  • Page 502

    12-4 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING The SIMD floating-point exception mask bits (bits 7 through 12), the flush-to-zero flag (bit 15), the denormals-are-zero flag (bit 6), and the rounding control field (bits 13 and 14) in the MXCSR register should be left in their default values of 0. This permits the application to determine [...]

  • Page 503

    Vol. 3A 12-5 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING • System Exceptions: — Invalid-opcode exception (#UD). This exception is generated when executing SSE/SSE2/SSE3 instructions under the following conditions: • SSE/SSE2/SSE3 feature flags returned by CPUID are set to 0. This condition does not affect the CLFLUSH instruction. • The CLFSH fea[...]

  • Page 504

    12-6 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING same conditions that cause x87 FPU floating-point error exceptions (#MF) to be generated for x87 FPU instructions. Each of these exceptions can be masked, in which case the processor returns a reasonable result to the destination operand without invoking an exception handler. However, if a[...]

  • Page 505

    Vol. 3A 12-7 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING In some cases, applications can only save the XMM and MXCSR registers in the following way: • Execute eight MOVDQ instructions to save the contents of the XMM0 through XMM7 registers to memory. • Execute a STMXCSR instruction to save the state of the MXCSR register to memory. In some cases,[...]

  • Page 506

    12-8 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING • The operating system can take the responsibility for automatically saving the x87 FPU, MMX, XMM, and MXCSR registers as part of the task switch process (using an FXSAVE instruction) and automatically restoring the state of the registers when a suspended task is resumed (using an FXRSTOR in[...]

  • Page 507

    Vol. 3A 12-9 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING On a task switch, the operating system task switching code must execute the following pseudo-code to set the TS flag according to the current owner of the x87 FPU/MMX/SSE/SSE2/SSE3 state. If the new task (task B in this example) is not the current owner of this state, the TS flag is set to 1; [...]

  • Page 508

    12-10 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING • Restores the x87 FPU, MMX, XMM, or MXCSR registers from the new task’s save area for the x87 FPU/MMX/SSE/SSE2/SSE3 state. • Updates the current x87 FPU/MMX/SSE/SSE2/SSE3 state owner to be the current task. • Clears the TS flag.[...]
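
The deferred save and restore scheme outlined above can be modeled in a few lines. This is a simulation sketch, not operating-system code: the class and method names are hypothetical, a dictionary stands in for the register state, and the step that saves the previous owner's registers before the restore is an assumption, since the corresponding bullet is not visible in this excerpt.

```python
class LazyFpuSwitcher:
    """Toy model of TS-flag-driven lazy saving of x87 FPU/MMX/SSE state."""

    def __init__(self):
        self.ts = False       # models the TS flag in CR0
        self.owner = None     # task whose state is live in the registers
        self.current = None   # task currently running
        self.live = {}        # stand-in for the live register contents
        self.save_area = {}   # per-task saved state

    def task_switch(self, new_task):
        """Set TS to 1 if the new task does not own the live state, else clear it."""
        self.current = new_task
        self.ts = new_task != self.owner

    def write_reg(self, name, value):
        """Model an x87 FPU/MMX/SSE instruction that writes a register."""
        if self.ts:
            self._device_not_available()
        self.live[name] = value

    def read_reg(self, name):
        """Model an x87 FPU/MMX/SSE instruction that reads a register."""
        if self.ts:
            self._device_not_available()
        return self.live.get(name)

    def _device_not_available(self):
        """Model the #NM handler."""
        if self.owner is not None:
            # Assumed step: save the previous owner's state (FXSAVE).
            self.save_area[self.owner] = dict(self.live)
        # Restore the new task's state (FXRSTOR), update the owner, clear TS.
        self.live = dict(self.save_area.get(self.current, {}))
        self.owner = self.current
        self.ts = False


if __name__ == "__main__":
    cpu = LazyFpuSwitcher()
    cpu.task_switch("A")
    cpu.write_reg("xmm0", 1)
    cpu.task_switch("B")        # B never touches the registers
    cpu.task_switch("A")        # A still owns the state, so TS stays clear
    print(cpu.read_reg("xmm0"), cpu.save_area)
```

The point of the design is visible in the usage: task B runs without using the registers, so no save or restore is ever performed on its behalf.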

  • Page 509

    13 Power and Thermal Management[...]

  • Page 510

    [...]

  • Page 511

    Vol. 3A 13-1 CHAPTER 13 POWER AND THERMAL MANAGEMENT This chapter describes facilities of IA-32 architecture used for power management and thermal monitoring. 13.1 ENHANCED INTEL SPEEDSTEP® TECHNOLOGY Enhanced Intel SpeedStep® Technology was introduced in the Pentium M processor; it is available in Pentium 4, Intel Xeon, Intel® Core[...]

  • Page 512

    13-2 Vol. 3A POWER AND THERMAL MANAGEMENT 13.2 P-STATE HARDWARE COORDINATION The Advanced Configuration and Power Interface (ACPI) defines performance states (P-states) that are used to facilitate system software’s ability to manage processor power consumption. Different P-states correspond to different performance levels that are applied while t[...]

  • Page 513

    Vol. 3A 13-3 POWER AND THERMAL MANAGEMENT If P-states are exposed by the BIOS as hardware coordinated, software is expected to confirm processor support for P-state hardware coordination feedback and use the feedback mechanism to make P-state decisions. The OSPM is expected to reset the MSRs (execute WRMSR with 0 to these MSRs individually) a[...]

  • Page 514

    13-4 Vol. 3A POWER AND THERMAL MANAGEMENT 13.3 MWAIT EXTENSIONS FOR ADVANCED POWER MANAGEMENT IA-32 processors may support a number of C-states 1 that reduce power consumption for inactive states. Intel Core Solo and Intel Core Duo processors support both deeper C-states and MWAIT extensions that can be used by the OS to implement power managemen[...]

  • Page 515

    Vol. 3A 13-5 POWER AND THERMAL MANAGEMENT 13.4 THERMAL MONITORING AND PROTECTION The IA-32 architecture provides the following mechanisms for monitoring temperature and controlling thermal power: 1. The catastrophic shutdown detector forces processor execution to stop if the processor’s core temperature rises above a preset limit. 2. Au[...]

  • Page 516

    13-6 Vol. 3A POWER AND THERMAL MANAGEMENT 13.4.1 Catastrophic Shutdown Detector P6 family processors introduced a thermal sensor that acts as a catastrophic shutdown detector. This catastrophic shutdown detector was also implemented in Pentium 4, Intel Xeon and Pentium M processors. It is always enabled. When processor core temperature reach[...]

  • Page 517

    Vol. 3A 13-7 POWER AND THERMAL MANAGEMENT MSR_THERM2_CTL register is set to 1 (Figure 13-3) and bit 3 of the IA32_MISC_ENABLE register is set to 1. Following a power-up or reset, the TM_SELECT flag may be cleared. BIOS is required to enable either TM1 or TM2. Operating systems and applications must not disable mechanisms that enable TM1 or TM2. [...]

  • Page 518

    13-8 Vol. 3A POWER AND THERMAL MANAGEMENT • If TM1 is enabled and the TCC is engaged, the performance state transition can commence before the TCC is disengaged. • If TM2 is enabled and the TCC is engaged, the performance state transition specified by a write to the IA32_PERF_CTL will commence after the TCC has disengaged. 13.4.2.5 Thermal St [...]

  • Page 519

    Vol. 3A 13-9 POWER AND THERMAL MANAGEMENT • High-Temperature Interrupt Enable flag, bit 0 — Enables an interrupt to be generated on the transition from a low-temperature to a high-temperature when set; disables the interrupt when clear (R/W). • Low-Temperature Interrupt Enable flag, bit 1 — Enables an interrupt to be generated on the [...]

  • Page 520

    13-10 Vol. 3A POWER AND THERMAL MANAGEMENT The IA32_CLOCK_MODULATION MSR contains the following flag and field used to enable software-controlled clock modulation and to select the clock modulation duty cycle: • On-Demand Clock Modulation Enable, bit 4 — Enables on-demand software controlled clock modulation when set; disables software-cont[...]
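
A value for IA32_CLOCK_MODULATION can be composed as sketched below. The enable flag in bit 4 comes from the text. The duty-cycle field in bits 1 through 3, and its interpretation as steps of 12.5%, are assumptions recalled from Intel's documentation because the field description is cut off in this excerpt. The function names are hypothetical.

```python
ENABLE_BIT = 1 << 4   # On-Demand Clock Modulation Enable (bit 4, from the text)
DUTY_SHIFT = 1        # assumed position of the duty-cycle field (bits 1-3)
DUTY_MASK = 0x7


def encode_clock_modulation(enable: bool, duty_steps: int) -> int:
    """Build a register value; duty_steps is 1..7, each step assumed to be 12.5%."""
    if not 1 <= duty_steps <= 7:
        raise ValueError("duty_steps must be between 1 and 7")
    value = duty_steps << DUTY_SHIFT
    if enable:
        value |= ENABLE_BIT
    return value


def duty_cycle_percent(value: int) -> float:
    """Return the duty cycle selected by a register value, in percent."""
    return ((value >> DUTY_SHIFT) & DUTY_MASK) * 12.5


if __name__ == "__main__":
    reg = encode_clock_modulation(True, 4)
    print(f"{reg:#04x} -> {duty_cycle_percent(reg)}% duty cycle")
```

A field value of 0 is rejected here because it is treated as reserved under the assumed layout.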

  • Page 521

    Vol. 3A 13-11 POWER AND THERMAL MANAGEMENT 13.4.4 Detection of Thermal Monitor and Software Controlled Clock Modulation Facilities The ACPI flag (bit 22) of the CPUID feature flags indicates the presence of the IA32_THERM_STATUS, IA32_THERM_INTERRUPT, IA32_CLOCK_MODULATION MSRs, and the xAPIC thermal LVT entry. The TM1 flag (bit 29) of[...]

  • Page 522

    13-12 Vol. 3A POWER AND THERMAL MANAGEMENT been asserted since a previous RESET or the last time software cleared the bit. Software may clear this bit by writing a zero. • PROCHOT# or FORCEPR# Event (bit 2, RO) — Indicates whether PROCHOT# or FORCEPR# is being asserted. If bit 2 = 1, PROCHOT# or FORCEPR# has been asserted. • PROCHOT# or [...]

  • Page 523

    Vol. 3A 13-13 POWER AND THERMAL MANAGEMENT • Thermal Threshold #2 Log (bit 9, R/WC0) — Sticky bit that indicates whether the Thermal Threshold #2 has been reached since the last clearing of this bit or a reset. If bit 9 = 1, the Thermal Threshold #2 has been reached. Software may clear this bit by writing a zero. • Digital Readout (bits 22[...]
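
Reading these fields out of a raw IA32_THERM_STATUS value might look like the sketch below. The Thermal Threshold #2 Log position (bit 9) and its write-zero-to-clear behavior come from the text. The extent of the Digital Readout field (bits 22 through 16) is an assumption recalled from Intel's documentation, since the text is cut off after "bits 22". The function names are hypothetical.

```python
THRESHOLD2_LOG = 1 << 9   # sticky log bit, cleared by writing 0 (from the text)
READOUT_SHIFT = 16        # assumed low bit of the Digital Readout field
READOUT_MASK = 0x7F       # assumed 7-bit field (bits 22:16)


def decode_therm_status(value: int) -> dict:
    """Extract the threshold #2 log bit and the digital readout."""
    return {
        "threshold2_log": bool(value & THRESHOLD2_LOG),
        "digital_readout": (value >> READOUT_SHIFT) & READOUT_MASK,
    }


def value_to_clear_threshold2_log(value: int) -> int:
    """Return the value to write back so that only the bit 9 log is cleared."""
    return value & ~THRESHOLD2_LOG


if __name__ == "__main__":
    raw = (25 << READOUT_SHIFT) | THRESHOLD2_LOG
    print(decode_therm_status(raw))
    print(decode_therm_status(value_to_clear_threshold2_log(raw)))
```

Preserving the other bits when clearing matters because the register contains further sticky log bits that a blanket write of zero would also clear.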

  • Page 524

    13-14 Vol. 3A POWER AND THERMAL MANAGEMENT • THERMTRIP# Interrupt Enable (bit 2, R/W) — When a catastrophic cooling failure occurs, the processor will automatically shut down. Bit 2 = 0 disables the feature; bit 2 = 1 enables the feature. • FORCEPR# Interrupt Enable (bit 3, R/W) — When a source external to the processor asserts PROCHOT#, t[...]

  • Page 525

    14 Machine Check Architecture[...]

  • Page 526

    [...]

  • Page 527

    Vol. 3A 14-1 CHAPTER 14 MACHINE-CHECK ARCHITECTURE This chapter describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. See Chapter 5, “Interrupt 18—Machine-Check Exception (#MC),” for more information on machine-check exceptions. A brief description of[...]

  • Page 528

    14-2 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.3 MACHINE-CHECK MSRS Machine check MSRs in the Pentium 4, Intel Xeon, and P6 family processors consist of a set of global control and status registers and several error-reporting register banks (see Figure 14-1). Each error-reporting bank is associated with a specific hardware unit (or group of hardware [...]

  • Page 529

    Vol. 3A 14-3 MACHINE-CHECK ARCHITECTURE Where: • Count field, bits 0 through 7 — Indicates the number of hardware unit error-reporting banks available in a particular processor implementation. • MCG_CTL_P (control MSR present) flag, bit 8 — Indicates that the processor implements the IA32_MCG_CTL MSR when set; this register is absent w[...]
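
The two fields described above can be extracted from a raw IA32_MCG_CAP value as follows. This is a minimal sketch covering only the fields visible in the text (Count in bits 0 through 7 and MCG_CTL_P in bit 8). The function name and the sample value are hypothetical.

```python
def decode_mcg_cap(value: int) -> dict:
    """Extract the bank count and the MCG_CTL_P flag from IA32_MCG_CAP."""
    return {
        "count": value & 0xFF,                 # number of error-reporting banks
        "mcg_ctl_p": bool(value & (1 << 8)),   # IA32_MCG_CTL MSR is implemented
    }


if __name__ == "__main__":
    caps = decode_mcg_cap(0x105)   # illustrative value: 5 banks, MCG_CTL present
    print(caps)
    if caps["mcg_ctl_p"]:
        print("IA32_MCG_CTL may be accessed")
```

Software would typically check `mcg_ctl_p` before touching IA32_MCG_CTL and use `count` as the loop bound when scanning the error-reporting banks.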

  • Page 530

    14-4 Vol. 3A MACHINE-CHECK ARCHITECTURE Where: • Count field, bits 0 through 7 — Indicates the number of hardware unit error-reporting banks available in a particular processor implementation. • MCG_CTL_P (register present) flag, bit 8 — Indicates that the MCG_CTL register is present when set and absent when clear. Bits 9 through 63 a[...]

  • Página 531

    Vol. 3A 14-5 MACHINE-CHECK ARCHITECTURE 14.3.1.4 IA32_MCG_CTL MSR The IA32_MCG_CTL MSR (called the MCG_CTL MSR in P6 family processors) is present if the capability flag MCG_CTL_P is set in the IA32_MCG_CAP MSR (or the MCG_CAP MSR). IA32_MCG_CTL (or MCG_CTL) controls the reporting of machine-check exceptions. If present, writing 1s to this regi[...]

  • Página 532

    14-6 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.3.2.2 IA32_MCi_STATUS MSRs Each IA32_MCi_STATUS MSR (called MCi_STATUS in P6 family processors) contains information related to a machine-check error if its VAL (valid) flag is set (see Figure 14-6). Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to th[...]

  • Página 533

    Vol. 3A 14-7 MACHINE-CHECK ARCHITECTURE where the error occurred. Do not read these registers if they are not implemented in the processor. • MISCV (IA32_MCi_MISC register valid) flag, bit 59 — Indicates (when set) that the IA32_MCi_MISC register contains additional information regarding the error. When clear, this flag indicates that[...]

  • Página 534

    14-8 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.3.2.4 IA32_MCi_MISC MSRs The IA32_MCi_MISC MSR (called the MCi_MISC MSR in the P6 family processors) contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. The IA32_MCi_MISC_MSR is either not implemented or does not contain a[...]

  • Página 535

    Vol. 3A 14-9 MACHINE-CHECK ARCHITECTURE In processors with support for Intel EM64T, 64-bit machine check state MSRs are aliased to the legacy MSRs. In addition, there may be registers beyond IA32_MCG_MISC. These may include up to five reserved MSRs (IA32_MCG_RESERVED[1:5]) and save-state MSRs for registers introduced in 64-bit mode. See Tab[...]

  • Página 536

    14-10 Vol. 3A MACHINE-CHECK ARCHITECTURE When a machine-check error is detected on a Pentium 4 or Intel Xeon processor, the processor saves the state of the general-purpose registers, the R/EFLAGS register, and the R/EIP in these extended machine-check state MSRs. This information can be used by a debugger to analyze the error. These registers [...]

  • Página 537

    Vol. 3A 14-11 MACHINE-CHECK ARCHITECTURE 14.3.3 Mapping of the Pentium Processor Machine-Check Errors to the Machine-Check Architecture The Pentium processor reports machine-check errors using two registers: P5_MC_TYPE and P5_MC_ADDR. The Pentium 4, Intel Xeon, and P6 family processors map these registers to the IA32_MCi_STATUS and IA32_MC [...]

  • Página 538

    14-12 Vol. 3A MACHINE-CHECK ARCHITECTURE Example 14-19. Machine-Check Initialization Pseudocode Check CPUID Feature Flags for MCE and MCA support IF CPU supports MCE THEN IF CPU supports MCA THEN IF (IA32_MCG_CAP.MCG_CTL_P = 1) (* IA32_MCG_CTL register is present *) THEN IA32_MCG_CTL ← FFFFFFFFFFFFFFFFH; (* enables all MCA features *) FI (* De[...]

  • Página 539

    Vol. 3A 14-13 MACHINE-CHECK ARCHITECTURE FOR error-reporting banks (0 through MAX_BANK_NUMBER) DO (Optional for BIOS and OS) Log valid errors (OS only) IA32_MCi_STATUS ← 0; OD FI FI FI Setup the Machine Check Exception (#MC) handler for vector 18 in IDT Set the MCE bit (bit 6) in CR4 register to enable Machine-Check Exceptions FI 14.6. INTERP[...]
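
    The visible steps of the initialization pseudocode in Example 14-19 can be modelled against a dictionary standing in for the MSR space. This is only a sketch under stated assumptions: the MSR addresses (179H, 17BH, and 401H with a stride of 4 per bank) are not given in the excerpt and are supplied here as assumptions, the CPUID checks and the elided middle part of the pseudocode are left out, and real code would use RDMSR/WRMSR and MOV to CR4 at privilege level 0.

```python
# Assumed MSR addresses; they do not appear in the excerpt above.
IA32_MCG_CAP = 0x179
IA32_MCG_CTL = 0x17B
IA32_MC0_STATUS = 0x401
BANK_STRIDE = 4          # assumed spacing between successive banks' MSRs

CR4_MCE = 1 << 6         # MCE flag is bit 6 of CR4, as the pseudocode states


def init_machine_check(msrs: dict, cr4: int) -> int:
    """Apply the visible initialization steps to a simulated MSR map.

    Returns the new CR4 value with the MCE bit set.
    """
    cap = msrs[IA32_MCG_CAP]
    bank_count = cap & 0xFF               # Count field, bits 0 through 7

    if (cap >> 8) & 1:                    # MCG_CTL_P: IA32_MCG_CTL is present
        msrs[IA32_MCG_CTL] = 0xFFFFFFFFFFFFFFFF

    # Clear every bank's status register (logging of valid errors omitted).
    for bank in range(bank_count):
        msrs[IA32_MC0_STATUS + BANK_STRIDE * bank] = 0

    return cr4 | CR4_MCE
```

    Installing the #MC handler at vector 18 is outside the scope of this model and would precede setting CR4.MCE on real hardware.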

  • Página 540

    14-14 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.6.2 Compound Error Codes Compound error codes describe errors related to the TLBs, memory, caches, bus and interconnect logic, and internal timer. A set of sub-fields is common to all compound errors. These sub-fields describe the type of access, level in the memory hierarchy, and type of reque[...]

  • Página 541

    Vol. 3A 14-15 MACHINE-CHECK ARCHITECTURE For example, the error code ICACHEL1_RD_ERR is constructed from the form: {TT}CACHE{LL}_{RRRR}_ERR, where {TT} is replaced by I, {LL} is replaced by L1, and {RRRR} is replaced by RD. The 2-bit TT sub-field (Table 14-5) indicates the type of transaction (data, instruction, or generic). The sub-field appli[...]
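
    The naming form {TT}CACHE{LL}_{RRRR}_ERR described above is a simple string template. A minimal sketch, with a hypothetical function name, that fills it from the three sub-field mnemonics (the bit encodings of the sub-fields come from tables not reproduced here and are deliberately not modelled):

```python
def cache_error_mnemonic(tt: str, ll: str, rrrr: str) -> str:
    """Build a cache-hierarchy error mnemonic of the form {TT}CACHE{LL}_{RRRR}_ERR."""
    return f"{tt}CACHE{ll}_{rrrr}_ERR"
```

    Calling it with the mnemonics from the text, `cache_error_mnemonic("I", "L1", "RD")`, reproduces ICACHEL1_RD_ERR.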

  • Página 542

    14-16 Vol. 3A MACHINE-CHECK ARCHITECTURE The 4-bit RRRR sub-field (see Table 14-7) indicates the type of action associated with the error. Actions include read and write operations, prefetches, cache evictions, and snoops. Generic error is returned when the type of error cannot be determined. Generic read and generic write are returned when [...]

  • Página 543

    Vol. 3A 14-17 MACHINE-CHECK ARCHITECTURE 14.6.3 Machine-Check Error Codes Interpretation Appendix E, “Interpreting Machine-Check Error Codes,” provides information on interpreting the MCA error code, model-specific error code, and other information error code fields. For P6 family processors, information has been included on decoding exte[...]

  • Página 544

    14-18 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.7.1 Machine-Check Exception Handler The machine-check exception (#MC) corresponds to vector 18. To service machine-check exceptions, a trap gate must be added to the IDT. The pointer in the trap gate must point to a machine-check exception handler. Two approaches can be taken to designing the exc[...]

  • Página 545

    Vol. 3A 14-19 MACHINE-CHECK ARCHITECTURE • The MCIP flag in the IA32_MCG_STATUS register indicates whether a machine-check exception was generated. Before returning from the machine-check exception handler, software should clear this flag so that it can be used reliably by an error logging utility. The MCIP flag also detects recursion. T[...]

  • Página 546

    14-20 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.7.3 Pentium Processor Machine-Check Exception Handling To make the machine-check exception handler portable to the Pentium 4, Intel Xeon, P6 family, and Pentium processors, checks can be made (using CPUID) to determine the processor type. Then based on the processor type, machine-check exceptions ca[...]

  • Página 547

    Vol. 3A 14-21 MACHINE-CHECK ARCHITECTURE AND RIPV flag in IA32_MCG_STATUS = 0 (* execution is not restartable *) THEN RESTARTABILITY = FALSE; return RESTARTABILITY to calling procedure; FI; Save time-stamp counter and processor ID; Set IA32_MCi_STATUS to all 0s; Execute serializing instruction (i.e., CPUID); FI; OD; FI; If the processor suppor[...]

  • Página 548

    14-22 Vol. 3A MACHINE-CHECK ARCHITECTURE The basic algorithm given in Example 14-21 can be modified to provide more robust recovery techniques. For example, software has the flexibility to attempt recovery using information unavailable to the hardware. Specifically, the machine-check exception handler can, after logging, carefully analyze the e[...]

  • Página 549

    15 8086 Emulation[...]

  • Página 550

    [...]

  • Página 551

    Vol. 3A 15-1 CHAPTER 15 8086 EMULATION IA-32 processors (beginning with the Intel386 processor) provide two ways to execute new or legacy programs that are assembled and/or compiled to run on an Intel 8086 processor: • Real-address mode. • Virtual-8086 mode. Figure 2-3 shows the relationship of these operating modes to protected mode and[...]

  • Página 552

    15-2 Vol. 3A 8086 EMULATION The following is a summary of the core features of the real-address mode execution environment as would be seen by a program written for the 8086: • The processor supports a nominal 1-MByte physical address space (see Section 15.1.1, “Address Translation in Real-Address Mode”, for specific details). This addre[...]

  • Página 553

    Vol. 3A 15-3 8086 EMULATION 8-byte entries) used when handling protected-mode interrupts and exceptions. Interrupt and exception vector numbers provide an index to entries in the interrupt table. Each entry provides a pointer (called a “vector”) to an interrupt- or exception-handling procedure. See Section 15.1.4, “Interrupt and Exc [...]

  • Página 554

    15-4 Vol. 3A 8086 EMULATION behavior of the 8086 processor.) Care should be taken to ensure that A20M# based address wrapping is handled correctly in multiprocessor based systems. The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value [...]
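
    The address wrapping mentioned above is easiest to see with numbers. The sketch below assumes the usual real-address-mode translation (16-bit segment shifted left by 4, plus a 16-bit offset) and models A20M# as forcing address bit 20 to zero; the function name and the `a20_masked` parameter are hypothetical, and the sketch restricts itself to 16-bit offsets.

```python
A20_BIT = 1 << 20  # address line 20, the first bit above the 1-MByte boundary


def real_mode_linear(segment: int, offset: int, a20_masked: bool = False) -> int:
    """Form a real-address-mode linear address from a segment:offset pair.

    With a20_masked=True, bit 20 is cleared, which reproduces the 8086-style
    wrap-around to the bottom of memory for addresses at or above 100000H.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    linear = (segment << 4) + offset
    if a20_masked:
        linear &= ~A20_BIT
    return linear
```

    With FFFFH:0010H, for example, the unmasked result is 100000H, while masking bit 20 wraps it to 0.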

  • Página 555

    Vol. 3A 15-5 8086 EMULATION • Logical instructions AND, OR, XOR, and NOT. • Decimal instructions DAA, DAS, AAA, AAS, AAM, and AAD. • Stack instructions PUSH and POP (to general-purpose registers and segment registers). • Type conversion instructions CWD, CDQ, CBW, and CWDE. • Shift and rotate instructions SAL, SHL, SHR, SAR, ROL,[...]

  • Página 556

    15-6 Vol. 3A 8086 EMULATION • ENTER and LEAVE control instructions. • BOUND instruction. • CPU identification (CPUID) instruction. • System instructions CLTS, INVD, WBINVD, INVLPG, LGDT, SGDT, LIDT, SIDT, LMSW, SMSW, RDMSR, WRMSR, RDTSC, and RDPMC. Execution of any of the other IA-32 architecture instructions (not given in the[...]

  • Página 557

    Vol. 3A 15-7 8086 EMULATION (For backward compatibility to Intel 8086 processors, the default base address and limit of the interrupt vector table should not be changed.) Table 15-1 shows the interrupt and exception vectors that can be generated in real-address mode and virtual-8086 mode, and in the Intel 8086 processor. See Chapter 5, ?[...]

  • Página 558

    15-8 Vol. 3A 8086 EMULATION Table 15-1. Real-Address Mode Exceptions and Interrupts
    Vector No. | Description | Real-Address Mode | Virtual-8086 Mode | Intel 8086 Processor
    0 | Divide Error (#DE) | Yes | Yes | Yes
    1 | Debug Exception (#DB) | Yes | Yes | No
    2 | NMI Interrupt | Yes | Yes | Yes
    3 | Breakpoint (#BP) | Yes | Yes | Yes
    4 | Overflow (#OF) | Yes | Yes | Yes
    5 | BOUND Ran[...]

  • Página 559

    Vol. 3A 15-9 8086 EMULATION 15.2.1 Enabling Virtual-8086 Mode The processor runs in virtual-8086 mode when the VM (virtual machine) flag in the EFLAGS register is set. This flag can only be set when the processor switches to a new protected-mode task or resumes virtual-8086 mode via an IRET instruction. System software cannot change the state of [...]

  • Página 560

    15-10 Vol. 3A 8086 EMULATION The 8086 operating-system services consist of a kernel and/or operating-system procedures that the 8086 program makes calls to. These services can be implemented in either of the following two ways: • They can be included in the 8086 program. This approach is desirable for either of the following reasons: — The[...]

  • Página 561

    Vol. 3A 15-11 8086 EMULATION • When sharing the 8086 operating-system services or ROM code that is common to several 8086 programs running as different 8086-mode tasks. • When redirecting or trapping references to memory-mapped I/O devices. 15.2.4 Protection within a Virtual-8086 Task Protection is not enforced between the segments of an 8[...]

  • Página 562

    15-12 Vol. 3A 8086 EMULATION Figure 15-3. Entering and Leaving Virtual-8086 Mode (diagram labels: Virtual-8086 Monitor; Real Mode Code; Protected-Mode Tasks; Virtual-8086 Mode Tasks (8086 Programs); Protected-Mode Interrupt and Exception Handlers; Protected Mode; Virtual-8086 Mode; Real-Address Mode; transitions: Task Switch, VM = 1, RESET, PE=1, PE=0 or RESET, #GP Exception, CALL [...])

  • Página 563

    Vol. 3A 15-13 8086 EMULATION 15.2.6 Leaving Virtual-8086 Mode The processor can leave the virtual-8086 mode only through an interrupt or exception. The following are situations where an interrupt or exception will lead to the processor leaving virtual-8086 mode (see Figure 15-3): • The processor services a hardware interrupt generated to sign[...]

  • Página 564

    15-14 Vol. 3A 8086 EMULATION 15.2.7 Sensitive Instructions When an IA-32 processor is running in virtual-8086 mode, the CLI, STI, PUSHF, POPF, INT n, and IRET instructions are sensitive to IOPL. The IN, INS, OUT, and OUTS instructions, which are sensitive to IOPL in protected mode, are not sensitive in virtual-8086 mode. The CPL is always 3 w[...]

  • Página 565

    Vol. 3A 15-15 8086 EMULATION 15.2.8.2 Memory-Mapped I/O In systems which use memory-mapped I/O, the paging facilities of the processor can be used to generate exceptions for attempts to access I/O ports. The virtual-8086 monitor may use paging to control memory-mapped I/O in these ways: • Map part of the linear address space of each task tha[...]

  • Página 566

    15-16 Vol. 3A 8086 EMULATION The method the processor uses to handle class 2 and 3 interrupts depends on the setting of the following flags and fields: • IOPL field (bits 12 and 13 in the EFLAGS register) — Controls how class 3 software interrupts are handled when the processor is in virtual-8086 mode (see Section 2.3, “System Flags and [...]
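
    Since the IOPL field occupies bits 12 and 13 of EFLAGS, a monitor that holds a saved EFLAGS image can read it with a shift and a mask. A minimal sketch with hypothetical names:

```python
IOPL_SHIFT = 12                   # IOPL occupies bits 12 and 13 of EFLAGS
IOPL_MASK = 0b11 << IOPL_SHIFT


def iopl(eflags: int) -> int:
    """Return the 2-bit I/O privilege level stored in an EFLAGS image."""
    return (eflags & IOPL_MASK) >> IOPL_SHIFT
```

    An image of 3202H, for example, yields an IOPL of 3, and 0202H yields 0.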

  • Página 567

    Vol. 3A 15-17 8086 EMULATION 15.3.1 Class 1—Hardware Interrupt and Exception Handling in Virtual-8086 Mode In virtual-8086 mode, the Pentium, P6 family, Pentium 4, and Intel Xeon processors handle hardware interrupts and exceptions in the same manner as they are handled by the Intel486 and Intel386 processors. They invoke the protected-mode [...]

  • Página 568

    15-18 Vol. 3A 8086 EMULATION Interrupt and exception handlers can examine the VM flag on the stack to determine if the interrupted procedure was running in virtual-8086 mode. If so, the interrupt or exception can be handled in one of three ways: • The protected-mode interrupt or exception handler that was called can handle the interrupt or [...]

  • Página 569

    Vol. 3A 15-19 8086 EMULATION The virtual-8086 monitor runs at privilege level 0, like the protected-mode interrupt and exception handlers. It is commonly closely tied to the protected-mode general-protection exception (#GP, vector 13) handler. If the protected-mode interrupt or exception handler calls the virtual-8086 monitor to handle[...]

  • Página 570

    15-20 Vol. 3A 8086 EMULATION 15.3.1.3 Handling an Interrupt or Exception Through a Task Gate When an interrupt or exception vector points to a task gate in the IDT, the processor performs a task switch to the selected interrupt- or exception-handling task. The following actions are carried out as part of this task switch: 1. The EFLAGS registe[...]

  • Página 571

    Vol. 3A 15-21 8086 EMULATION available or not enabled, maskable hardware interrupts are handled as class 1 interrupts. Here, if VIF and VIP flags are needed, the virtual-8086 monitor can implement them in software. Existing 8086 programs commonly set and clear the IF flag in the EFLAGS register to enable and disable maskable hardware interrupts,[...]

  • Página 572

    15-22 Vol. 3A 8086 EMULATION 3. The virtual-8086 monitor should read the VIF flag in the EFLAGS register. — If the VIF flag is clear, the virtual-8086 monitor sets the VIP flag in the EFLAGS image on the stack to indicate that there is a deferred interrupt pending and returns to the protected-mode handler. — If the VIF flag is set, the vi[...]

  • Página 573

    Vol. 3A 15-23 8086 EMULATION 15.3.3 Class 3—Software Interrupt Handling in Virtual-8086 Mode When the processor receives a software interrupt (an interrupt generated with the INT n instruction) while in virtual-8086 mode, it can use any of six different methods to handle the interrupt. The method selected depends on the settings of the VME fla[...]

  • Página 574

    15-24 Vol. 3A 8086 EMULATION Table 15-2. Software Interrupt Handling Methods While in Virtual-8086 Mode Method VME IOPL Bit in Redir. Bitmap* Processor Action 1 0 3 X Interrupt directed to a protected-mode interrupt handler: - Switches to privilege-level 0 stack - Pushes GS, FS, DS and ES onto privilege-level 0 stack - Pushes SS, ESP, EFLAGS, C[...]

  • Página 575

    Vol. 3A 15-25 8086 EMULATION Redirecting software interrupts back to the 8086 program potentially speeds up interrupt handling because a switch back and forth between virtual-8086 mode and protected mode is not required. This latter interrupt-handling technique is particularly useful for 8086 operating systems (such as MS-DOS) that use the INT[...]

  • Página 576

    15-26 Vol. 3A 8086 EMULATION 15.3.3.2 Methods 2 and 3: Software Interrupt Handling When a software interrupt occurs in virtual-8086 mode and the method 2 or 3 conditions are present, the processor generates a general-protection exception (#GP). Method 2 is enabled when the VME flag is set to 0 and the IOPL value is less than 3. Here the IOPL v[...]
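
    For the VME = 0 case, the selection between methods 1 and 2 depends only on IOPL: method 1 applies when IOPL is 3 (per Table 15-2), and method 2 applies when IOPL is less than 3. The sketch below covers only that case; the function name is hypothetical, and the methods that require VME = 1 are not modelled because their conditions are not reproduced in this excerpt.

```python
def software_interrupt_method_vme0(iopl: int) -> int:
    """Return the handling method (1 or 2) for INT n in virtual-8086 mode with VME = 0.

    Method 1: interrupt goes to a protected-mode handler (IOPL = 3).
    Method 2: processor raises a general-protection exception (IOPL < 3).
    """
    if not 0 <= iopl <= 3:
        raise ValueError("IOPL is a 2-bit field")
    return 1 if iopl == 3 else 2
```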

  • Página 577

    Vol. 3A 15-27 8086 EMULATION 6. Loads the CS and EIP registers with values from the interrupt vector table entry pointed to by the interrupt vector number. Only the 16 low-order bits of the EIP are loaded and the 16 high-order bits are set to 0. The interrupt vector table is assumed to be at linear address 0 of the current virtual-8086 task. 7[...]
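
    Step 6 above can be illustrated with a small lookup over a byte image of the task's low memory. The sketch assumes the 8086-style table layout of 4-byte entries (a 16-bit offset followed by a 16-bit segment, little-endian) starting at linear address 0; the function name and the `memory` parameter are hypothetical.

```python
import struct
from typing import Tuple


def load_vector(memory: bytes, vector: int) -> Tuple[int, int]:
    """Return (CS, EIP) for an interrupt vector from a table at linear address 0.

    Each entry is assumed to be 4 bytes: 16-bit offset, then 16-bit segment.
    The offset is returned as an EIP whose 16 high-order bits are 0.
    """
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector number must be in the range 0..255")
    offset, segment = struct.unpack_from("<HH", memory, vector * 4)
    eip = offset & 0xFFFF   # only the 16 low-order bits are loaded
    return segment, eip
```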

  • Página 578

    15-28 Vol. 3A 8086 EMULATION 15.4 PROTECTED-MODE VIRTUAL INTERRUPTS The IA-32 processors (beginning with the Pentium processor) also support the VIF and VIP flags in the EFLAGS register in protected mode by setting the PVI (protected-mode virtual interrupt) flag in the CR4 register. Setting the PVI flag allows applications running at privile[...]

  • Página 579

    16 Mixing 16-Bit and 32-Bit Code[...]

  • Página 580

    [...]

  • Página 581

    Vol. 3A 16-1 CHAPTER 16 MIXING 16-BIT AND 32-BIT CODE Program modules written to run on IA-32 processors can be either 16-bit modules or 32-bit modules. Table 16-1 shows the characteristics of 16-bit and 32-bit modules. The IA-32 processors function most efficiently when executing 32-bit program modules. They can, however, also execute 16-bi [...]

  • Página 582

    16-2 Vol. 3A MIXING 16-BIT AND 32-BIT CODE 16.1 DEFINING 16-BIT AND 32-BIT PROGRAM MODULES The following IA-32 architecture mechanisms are used to distinguish between and support 16-bit and 32-bit segments and operations: • The D (default operand and address size) flag in code-segment descriptors. • The B (default stack size) flag in stack-[...]

  • Página 583

    Vol. 3A 16-3 MIXING 16-BIT AND 32-BIT CODE These prefixes reverse the default size selected by the D flag in the code-segment descriptor. For example, the processor can interpret the (MOV mem, reg) instruction in any of four ways: • In a 32-bit code segment: — Moves 32 bits from a 32-bit register to memory using a 32-bit effective addre[...]
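
    The way the size prefixes reverse the D-flag default can be written as a small truth table. The following sketch, with hypothetical names, returns the effective operand and address sizes for one instruction given the code segment's D flag and whether an operand-size or address-size prefix is present.

```python
from typing import Tuple


def effective_sizes(d_flag: int,
                    operand_prefix: bool = False,
                    address_prefix: bool = False) -> Tuple[int, int]:
    """Return (operand_bits, address_bits) for a single instruction.

    d_flag is the D flag of the code-segment descriptor (1 = 32-bit default,
    0 = 16-bit default). Each prefix flips the corresponding default.
    """
    default_bits = 32 if d_flag else 16
    flipped_bits = 16 if d_flag else 32
    operand_bits = flipped_bits if operand_prefix else default_bits
    address_bits = flipped_bits if address_prefix else default_bits
    return operand_bits, address_bits
```

    In a 32-bit code segment with no prefixes this gives (32, 32); adding only the operand-size prefix gives (16, 32), matching the kind of variation the MOV example describes.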

  • Página 584

    16-4 Vol. 3A MIXING 16-BIT AND 32-BIT CODE A stack that spans less than 64 KBytes can be shared by both 16- and 32-bit code segments. This class of stacks includes: • Stacks in expand-up segments with the G (granularity) and B (big) flags in the stack-segment descriptor clear. • Stacks in expand-down segments with the G and B flags clear.[...]

  • Página 585

    Vol. 3A 16-5 MIXING 16-BIT AND 32-BIT CODE These methods of transferring program control overcome the following architectural limitations imposed on calls between 16-bit and 32-bit code segments: • Pointers from 16-bit code segments (which by default can only be 16 bits) cannot be used to address data or code located beyond FFFFH in a 32-bi[...]

  • Página 586

    16-6 Vol. 3A MIXING 16-BIT AND 32-BIT CODE While executing 32-bit code, if a call is made to a 16-bit code segment which is at the same or a more privileged level (that is, the DPL of the called code segment is less than or equal to the CPL of the calling code segment) through a 16-bit call gate, then the upper 16-bits of the ESP register may[...]

  • Página 587

    Vol. 3A 16-7 MIXING 16-BIT AND 32-BIT CODE 16.4.2.1 Controlling the Operand-Size Attribute For a Call Three things can determine the operand-size of a call: • The D flag in the segment descriptor for the calling code segment. • An operand-size instruction prefix. • The type of call gate (16-bit or 32-bit), if a call is made through a call[...]

  • Página 588

    16-8 Vol. 3A MIXING 16-BIT AND 32-BIT CODE 16.4.3 Interrupt Control Transfers A program-control transfer caused by an exception or interrupt is always carried out through an interrupt or trap gate (located in the IDT). Here, the type of the gate (16-bit or 32-bit) determines the operand-size attribute used in the implicit call to the exception [...]

  • Página 589

    Vol. 3A 16-9 MIXING 16-BIT AND 32-BIT CODE The interface procedure becomes more complex if any of these rules are violated. For example, if a 16-bit procedure calls a 32-bit procedure with an entry point beyond FFFFH, the interface procedure will need to provide the offset to the entry point. The mapping between 16- and 32-bit addresses is only[...]

  • Página 590

    16-10 Vol. 3A MIXING 16-BIT AND 32-BIT CODE[...]

  • Página 591

    17 IA-32 Architecture Compatibility[...]

  • Página 592

    [...]

  • Página 593

    Vol. 3A 17-1 CHAPTER 17 IA-32 ARCHITECTURE COMPATIBILITY All IA-32 processors are binary compatible. Compatibility means that, within certain limited constraints, programs that execute on previous generations of IA-32 processors will produce identical results when executed on later IA-32 processors. The compatibility constraints and any implem[...]

  • Página 594

    17-2 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.2. RESERVED BITS Throughout this manual, certain bits are marked as reserved in many register and memory layout descriptions. When bits are marked as undefined or reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknow[...]

  • Página 595

    Vol. 3A 17-3 IA-32 ARCHITECTURE COMPATIBILITY 2. Execute the CPUID instruction. The CPUID instruction (added to the IA-32 in the Pentium processor) indicates the presence of new features directly. See Chapter 14, “Processor Identification and Feature Determination,” in the IA-32 Intel® Architecture Software Developer’s Manual, Volume[...]

  • Página 596

    17-4 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY ming for conversion to integer. The remaining two instructions (MONITOR and MWAIT) accelerate synchronization of threads. SSE3 instructions are described in Chapter 12, “Programming with Streaming SIMD Extensions 3 (SSE3),” in the IA-32 Intel® Architecture Software Developer’s Manual, V [...]

  • Página 597

    Vol. 3A 17-5 IA-32 ARCHITECTURE COMPATIBILITY 17.12.1 Instructions Added Prior to the Pentium Processor The following instructions were added in the Intel486 processor: • BSWAP (byte swap) instruction. • XADD (exchange and add) instruction. • CMPXCHG (compare and exchange) instruction. • INVD (invalidate cache) instruction. • WBINVD[...]

  • Página 598

    17-6 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY • Bit scan instructions. • Double-shift instructions. • Byte set on condition instruction. • Move with sign/zero extension. • Generalized multiply instruction. • MOV to and from control registers. • MOV to and from test registers (now obsolete). • MOV to and from debug registers. ?[...]

  • Página 599

    Vol. 3A 17-7 IA-32 ARCHITECTURE COMPATIBILITY • VIP (virtual interrupt pending), bit 20. • ID (identification flag), bit 21. The AC flag (bit 18) was added to the EFLAGS register in the Intel486 processor. 17.15.1 Using EFLAGS Flags to Distinguish Between 32-Bit IA-32 Processors The following bits in the EFLAGS register that can be used to[...]

  • Página 600

    17-8 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.16.2 EFLAGS Pushed on the Stack The setting of the stored values of bits 12 through 15 (which includes the IOPL field and the NT flag) in the EFLAGS register by the PUSHF instruction, by interrupts, and by exceptions is different with the 32-bit IA-32 processors than with the 8086 and Intel 286 p[...]

  • Página 601

    Vol. 3A 17-9 IA-32 ARCHITECTURE COMPATIBILITY As on the Intel 286 and Intel386 processors, the MP (monitor coprocessor) flag (bit 1 of register CR0) determines whether the WAIT/FWAIT instructions or waiting-type floating-point instructions trap when the context of the x87 FPU is different from that of the currently-executing task. If the MP [...]

  • Página 602

    17-10 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY is reserved on these processors. The addition of the SF flag on a 32-bit x87 FPU has no impact on software. Existing exception handlers need not change, but may be upgraded to take advantage of the additional information. 17.17.3 x87 FPU Control Word Only affine closure is supported for infin[...]

  • Página 603

    Vol. 3A 17-11 IA-32 ARCHITECTURE COMPATIBILITY 17.17.5.1 NANS The 32-bit x87 FPUs distinguish between signaling NaNs (SNaNs) and quiet NaNs (QNaNs). These x87 FPUs only generate QNaNs and normally do not generate an exception upon encountering a QNaN. An invalid-operation exception (#I) is generated only upon encountering a SNaN, except for [...]

  • Página 604

    17-12 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.6.2 NUMERIC OVERFLOW EXCEPTION (#O) On the 32-bit x87 FPUs, when the numeric overflow exception is masked and the rounding mode is set to chop (toward 0), the result is the largest positive or smallest negative number. The 16-bit IA-32 math coprocessors do not signal the overflow exception[...]

  • Página 605

    Vol. 3A 17-13 IA-32 ARCHITECTURE COMPATIBILITY 16-bit IA-32 math coprocessors, it takes precedence over all other exceptions. This difference causes no impact on existing software, but some unneeded normalization of denormalized operands is prevented on the Intel486 processor and Intel 387 math coprocessor. 17.17.6.5 CS AND EIP FOR FPU EXCEPTIO[...]

  • Página 606

    17-14 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.6.8 INVALID OPERATION EXCEPTION ON DENORMALS An invalid-operation exception is not generated on the 32-bit x87 FPUs upon encountering a denormal value when executing a FSQRT, FDIV, or FPREM instruction or upon conversion to BCD or to integer. The operation proceeds by first normalizing t[...]

  • Página 607

    Vol. 3A 17-15 IA-32 ARCHITECTURE COMPATIBILITY 17.17.6.14 FLOATING-POINT ERROR EXCEPTION (#MF) In real mode and protected mode (not including virtual-8086 mode), interrupt vector 16 must point to the floating-point exception handler. In virtual-8086 mode, the virtual-8086 monitor can be programmed to accommodate a different location of the[...]

  • Página 608

    17-16 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.7.5 FUCOM, FUCOMP, AND FUCOMPP INSTRUCTIONS When executing the FUCOM, FUCOMP, and FUCOMPP instructions, the 32-bit x87 FPUs perform unordered compare according to IEEE Standard 754. These instructions do not exist on the 16-bit IA-32 math coprocessors. The availability of these new instruc[...]

  • Página 609

    Vol. 3A 17-17 IA-32 ARCHITECTURE COMPATIBILITY 16-bit IA-32 math coprocessors do report a denormal-operand exception in this situation. This difference does not affect existing software. On the 32-bit x87 FPUs, loading a denormal value that is in single- or double-real format causes the value to be converted to extended-real format. Loading a [...]

  • Página 610

    17-18 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.7.15 FXAM INSTRUCTION With the 32-bit x87 FPUs, if the FPU encounters an empty register when executing the FXAM instruction, it does not generate combinations of C0 through C3 equal to 1101 or 1111. The 16-bit IA-32 math coprocessors may generate these combi[...]

  • Página 611

    Vol. 3A 17-19 IA-32 ARCHITECTURE COMPATIBILITY 17.17.11 Operands Split Across Segments and/or Pages On the P6 family, Pentium, and Intel486 processor FPUs, when the first half of an operand to be written is inside a page or segment and the second half is outside, a memory fault can cause the first half to be stored but not the second half. In[...]

  • Página 612

    17-20 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY coprocessor keeps its ERROR# output in inactive state after hardware reset; the Intel 387 coprocessor keeps its ERROR# output in active state after hardware reset. Upon hardware reset or execution of the FINIT/FNINIT instruction, the Intel 387 math coprocessor signals an error condition. The P[...]

  • Página 613

    Vol. 3A 17-21 IA-32 ARCHITECTURE COMPATIBILITY cmp ax, 037fh jz Intel487_SX_Math_CoProcessor_present ;ax=037fh jmp Intel486_SX_microprocessor_present ;ax=ffffh If the Intel 487 SX math coprocessor is not present, the following code can be run to set the CR0 register for the Intel486 SX processor. mov eax, cr0 and eax, fffffffdh ;make MP=0 or ea[...]

  • Página 614

    17-22 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY The content of CR4 is 0H following a hardware reset. Control register CR4 was introduced in the Pentium processor. This register contains flags that enable certain new extensions provided in the Pentium processor: • VME — Virtual-8086 mode extensions. Enables support for a virtual interrupt fl[...]

  • Página 615

    Vol. 3A 17-23 IA-32 ARCHITECTURE COMPATIBILITY 17.21. MEMORY MANAGEMENT FACILITIES The following sections describe the new memory management facilities available in the various IA-32 processors and some compatibility differences. 17.21.1 New Memory Management Control Flags The Pentium Pro processor introduced three new memory management fea[...]

  • Página 616

    17-24 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY the data cache; in the Intel486 processor, they implement a write-through strategy. See Table 10-5 for a comparison of these bits on the P6 family, Pentium, and Intel486 processors. For complete information on caching, see Chapter 10, “Memory Cache Control.” 17.21.3 Descriptor Types and C[...]

  • Página 617

    Vol. 3A 17-25 IA-32 ARCHITECTURE COMPATIBILITY On the P6 family and Pentium processors, reserved bits 11, 12, 14 and 15 are hard-wired to 0. On the Intel486 processor, however, bit 12 can be set. See Table 9-1 for the different settings of this register following a power-up or hardware reset. 17.22.3 Debug Registers DR4 and DR5 Although the D[...]

  • Página 618

    17-26 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY tecture has been added for handling and reporting on hardware errors. See Chapter 14, “Machine-Check Architecture,” for a detailed description of the new conditions. The following exceptions and/or exception conditions were added to the IA-32 with the Pentium processor: • Machine-check except[...]

  • Página 619

    Vol. 3A 17-27 IA-32 ARCHITECTURE COMPATIBILITY 17.24.1 Machine-Check Architecture The Pentium Pro processor introduced a new architecture to the IA-32 for handling and reporting on machine-check exceptions. This machine-check architecture (described in detail in Chapter 14, “Machine-Check Architecture”) greatly expands the ability of the [...]

  • Página 620

    17-28 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.25.3 IDT Limit The LIDT instruction can be used to set a limit on the size of the IDT. A double-fault exception (#DF) is generated if an interrupt or exception attempts to read a vector beyond the limit. Shutdown then occurs on the 32-bit IA-32 processors if the double-fault handler vector is b[...]

  • Página 621

    Vol. 3A 17-29 IA-32 ARCHITECTURE COMPATIBILITY • For the 82489DX, in the lowest priority delivery mode, all the target local APICs specified by the destination field participate in the lowest priority arbitration. For the local APIC, only those local APICs which have free interrupt slots will participate in the lowest priority arbitration. 17.[...]

  • Página 622

    17-30 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.27.1 P6 Family and Pentium Processor TSS When the virtual mode extensions are enabled (by setting the VME flag in control register CR4), the TSS in the P6 family and Pentium processors contain an interrupt redirection bit map, which is used in virtual-8086 mode to redirect interrupts back to an[...]

  • Page 623

    general-protection exceptions (#GP). Figure 17-1 demonstrates the different areas accessed by the Intel486 and the P6 family and Pentium processors. 17.28. CACHE MANAGEMENT The P6 family processors include two levels of internal caches: L1 (level 1) and L2 (level 2). The L1 cache is divided into an i[...]

  • Page 624

    External system hardware can force the Pentium processor to disable caching or to use the write-through cache policy should that be required. In the P6 family processors, the MTRRs can be used to override the CD and NW flags (see Table 10-6). The P6 family and Pentium processors support page-level [...]

  • Page 625

    cache to be disabled and enabled, independently of the L1 and L2 caches (see Section 10.5.4, “Disabling and Enabling the L3 Cache”). 17.29. PAGING This section identifies enhancements made to the paging mechanism and implementation differences in the paging mechanism for various IA-32 processors.[...]

  • Page 626

    The sequence bounded by the MOV and JMP instructions should be identity mapped (that is, the instructions should reside on a page whose linear and physical addresses are identical). For the P6 family processors, the MOV CR0, REG instruction is serializing, so the jump operation is not required. How[...]

  • Page 627

    17.30.2 Error Code Pushes The Intel486 processor implements the error code pushed on the stack as a 16-bit value. When pushed onto a 32-bit stack, the Intel486 processor only pushes 2 bytes and updates ESP by 4. The P6 family and Pentium processors’ error code is a full 32 bits with the upper 16 [...]
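
    Because the Intel486 processor writes only 2 bytes of the 4-byte stack slot, a handler that must behave the same across these processors can avoid depending on the upper half of the slot. A minimal sketch, assuming the handler has already read the pushed slot into a 32-bit integer; the name error_code_low16 is hypothetical:

```c
#include <stdint.h>

/*
 * Hypothetical helper: keeps only the low 16 bits of the stack slot that
 * holds the error code, ignoring whatever the upper 16 bits contain.
 */
static uint16_t error_code_low16(uint32_t pushed_slot)
{
    return (uint16_t)(pushed_slot & 0xFFFFu);
}
```

    The mask makes the result independent of any stale data in the upper 2 bytes of the slot.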

  • Page 628

    The 32-bit processors also have descriptors for TSS segments, call gates, interrupt gates, and trap gates that support the 32-bit architecture. Both kinds of descriptors can be used in the same system. For those segment descriptors common to both 16- and 32-bit processors, clear bits in the reserv[...]

  • Page 629

    An exception to this behavior occurs when a stack access is data aligned, and the stack pointer is pointing to the last aligned piece of data that size at the top of the stack (ESP is FFFFFFFCH). When this data is popped, no segment limit violation occurs and the stack pointer will wrap around to 0.[...]
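
    The wrap-around in this case follows from 32-bit unsigned arithmetic. The sketch below models only the pointer update, not the segment limit check; the name esp_after_pop is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical model of the stack-pointer update for a pop: ESP advances by
 * the operand size, and 32-bit unsigned arithmetic wraps modulo 2^32.
 */
static uint32_t esp_after_pop(uint32_t esp, uint32_t operand_bytes)
{
    return esp + operand_bytes;
}
```

    Popping a 4-byte operand with ESP at FFFFFFFCH therefore leaves ESP at 0.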

  • Page 630

    way of ensuring ordering between routines that produce weakly-ordered results and routines that consume this data. No re-ordering of reads occurs on the Pentium processor, except under the condition noted in Section 7.2.1, “Memory Ordering in the Intel® Pentium® and Intel486™ Processors,” and [...]

  • Page 631

    bus to send the interrupt vector to the processor. After receiving the interrupt request signal, the processor asserts LOCK# to ensure that no other data appears on the data bus until the interrupt vector is received. This bus locking does not occur on the P6 family processors. 17.35. BUS HOLD Unlik[...]

  • Page 632

    17.36.3 Memory Type Range Registers Memory type range registers (MTRRs) are a new feature introduced into the IA-32 in the Pentium Pro processor. MTRRs allow the processor to optimize memory operations for different types of memory, such as RAM, ROM, frame buffer memory, and memory-mapped I/O[...]

  • Page 633

    17.36.5 Performance-Monitoring Counters The P6 family and Pentium processors provide two performance-monitoring counters for use in monitoring internal hardware operations. These counters are event counters that can be programmed to count a variety of different types of events, such as the numbe[...]

  • Page 634

    17-42 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY[...]

  • Page 635

    INTEL SALES OFFICES ASIA PACIFIC Australia Intel Corp. Level 2 448 St Kilda Road Melbourne VIC 3004 Australia Fax:613-9862 5599 China Intel Corp. Rm 709, Shaanxi Zhongda Int'l Bldg No.30 Nandajie Street Xian AX710002 China Fax:(86 29) 7203 356 Intel Corp. Rm 2710, Metropolitan Tower 68 Zourong Rd Chongqing CQ 400015 China Intel Corp. [...]

  • Page 636

    Intel Corp. 999 CANADA PLACE, Suite 404, #11 Vancouver BC V6C 3E2 Canada Fax:604-844-2813 Intel Corp. 2650 Queensview Drive, Suite 250 Ottawa ON K2B 8H6 Canada Fax:613-820-5936 Intel Corp. 190 Attwell Drive, Suite 500 Rexdale ON M9W 6H8 Canada Fax:416-675-2438 Intel Corp. 171 St. Clair Ave. E, Suite 6 Toronto ON Canada Intel[...]