Intel IA-32 user manual
- View the user manual online or download it
- 636 pages
- 2.9 MB
A good user manual
The rules oblige the seller to supply the buyer with the Intel IA-32 user manual along with the goods. A missing manual, or incorrect information given to the consumer, is grounds for a complaint that the device does not conform to the contract. The law allows the manual to be supplied in a form other than paper, and this is now common: the Intel IA-32 manual may be provided in graphic or electronic form, or as instructional videos. The only condition is that it be legible and understandable.
What is a user manual?
The word comes from the Latin "instructio", meaning to arrange. The Intel IA-32 user manual therefore describes the steps of a procedure. Its purpose is to instruct and to make it easier to start up and use the equipment or to carry out specific actions. A user manual is a collection of information about an object or service, a guide.
Unfortunately, few users take the time to read the user manual, even though a good one not only introduces a number of additional features of the purchased device but also helps avoid most failures.
So what should the perfect manual contain?
First of all, the Intel IA-32 user manual should contain:
- information on the technical specifications of the Intel IA-32 device
- the manufacturer's name and the year of manufacture of the Intel IA-32
- instructions for using, adjusting, and maintaining the Intel IA-32 equipment
- safety markings and certificates confirming compliance with the relevant standards
Why don't we read user manuals?
Usually this is due to a lack of time and to the assumption that the functions of the purchased equipment are already clear. Unfortunately, connecting and starting the Intel IA-32 is not enough. The user manual contains guidelines on specific features, safety, maintenance methods (including the products to use), possible Intel IA-32 faults, and ways to solve common problems during use. Finally, the manual gives the contact details for Intel service in case the proposed solutions do not work. User manuals in the form of animations and instructional videos are currently very popular and are often preferred to a printed brochure. This type of manual lets the user watch the whole instructional video without skipping the specifications and complicated technical descriptions of the Intel IA-32, as tends to happen with the paper version.
Why read the user manual?
First of all, it answers questions about the structure and capabilities of the Intel IA-32 device and the use of various accessories, and it provides a range of information for taking full advantage of all features and conveniences.
After a successful purchase, take a moment to familiarize yourself with every part of the Intel IA-32 user manual. Manuals today are carefully prepared and translated so that they are not only understandable to users but also fulfil their basic function of informing and helping.
Table of contents of the user manual
-
Page 1
IA-32 Intel® Architecture Software Developer’s Manual Volume 3A: System Programming Guide, Part 1 NOTE: The IA-32 Intel Architecture Software Developer's Manual consists of five volumes: Basic Architecture, Order Number 253665; Instruction Set Reference A-M, Order Number 253666; Instruction Set Refe[...]
-
Page 2
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY [...]
-
Page 3
Vol. 3A iii
CONTENTS FOR VOLUME 3A AND 3B
CHAPTER 1 ABOUT THIS MANUAL
1.1 IA-32 PROCESSORS COVERED IN THIS MANUAL (p. 1-1)
1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE (p. 1-2)
1.3 NOTATIONAL CONVENTIONS [...]
-
Page 4
2.6.7 Reading and Writing Model-Specific Registers (p. 2-29)
2.6.7.1 Reading and Writing Model-Specific Registers in 64-Bit Mode (p. 2-29)
CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT
3.1 MEMORY MANAGEMENT OVERVIEW [...]
-
Page 5
CHAPTER 4 PROTECTION
4.1 ENABLING AND DISABLING SEGMENT AND PAGE PROTECTION (p. 4-1)
4.2 FIELDS AND FLAGS USED FOR SEGMENT-LEVEL AND PAGE-LEVEL PROTECTION (p. 4-2)
4.2.1 Code Segment Descriptor in 64-bit Mode [...]
-
Page 6
CHAPTER 5 INTERRUPT AND EXCEPTION HANDLING
5.1 INTERRUPT AND EXCEPTION OVERVIEW (p. 5-1)
5.2 EXCEPTION AND INTERRUPT VECTORS (p. 5-2)
5.3 SOURCES OF INTERRUPTS [...]
-
Page 7
Interrupt 16—x87 FPU Floating-Point Error (#MF) (p. 5-55)
Interrupt 17—Alignment Check Exception (#AC) (p. 5-57)
Interrupt 18—Machine-Check Exception (#MC) (p. 5-59)
Interrupt 19—SIMD Floa[...]
-
Page 8
7.5.4 MP Initialization Example (p. 7-18)
7.5.4.1 Typical BSP Initialization Sequence (p. 7-19)
7.5.4.2 Typical AP Initialization Sequence [...]
-
Page 9
7.11.6.3 Halt Idle Logical Processors (p. 7-52)
7.11.6.4 Potential Usage of MONITOR/MWAIT in C1 Idle Loops (p. 7-52)
7.11.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources [...]
-
Page 10
8.10 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) (p. 8-42)
8.10.1 Bus Message Formats (p. 8-43)
8.11 MESSAGE SIGNALLED INTERRUPTS [...]
-
Page 11
9.11.6.4 Update in a System Supporting Dual-Core Technology (p. 9-46)
9.11.6.5 Update Loader Enhancements (p. 9-46)
9.11.7 Update Signature and Verification (p. 9[...]
-
Page 12
10.11.3.1 Base and Mask Calculations with Intel EM64T (p. 10-33)
10.11.4 Range Size and Alignment Requirement (p. 10-34)
10.11.4.1 MTRR Precedences [...]
-
Page 13
CHAPTER 13 POWER AND THERMAL MANAGEMENT
13.1 ENHANCED INTEL SPEEDSTEP® TECHNOLOGY (p. 13-1)
13.1.1 Software Interface For Initiating Performance State Transitions (p. 13-1)
13.2 THERMAL MONITORING AND PROTECTION [...]
-
Page 14
15.2 VIRTUAL-8086 MODE (p. 15-7)
15.2.1 Enabling Virtual-8086 Mode (p. 15-9)
15.2.2 Structure of a Virtual-8086 Task [...]
-
Page 15
17.6. STREAMING SIMD EXTENSIONS (SSE) (p. 17-3)
17.7. STREAMING SIMD EXTENSIONS 2 (SSE2) (p. 17-3)
17.8. STREAMING SIMD EXTENSIONS 3 (SSE3) (p. 17-3)
17.9. HYPER-TH[...]
-
Page 16
17.17.7.12. FXTRACT Instruction (p. 17-17)
17.17.7.13. Load Constant Instructions (p. 17-17)
17.17.7.14. FSETPM Instruction [...]
-
Page 17
17.29.1. Large Pages (p. 17-34)
17.29.2. PCD and PWT Flags (p. 17-34)
17.29.3. Enabling and Disabling Paging [...]
-
Page 18
18.5.7.1 Last Exception Records and Intel EM64T (p. 18-19)
18.5.8 Branch Trace Store (BTS) (p. 18-19)
18.5.8.1 Detection of the BTS Facilities [...]
-
Page 19
18.11 PERFORMANCE MONITORING AND HYPER-THREADING TECHNOLOGY (p. 18-60)
18.11.1 ESCR MSRs (p. 18-61)
18.11.2 CCCR MSRs [...]
-
Page 20
20.7 VM-EXIT CONTROL FIELDS (p. 20-14)
20.7.1 VM-Exit Controls (p. 20-14)
20.7.2 VM-Exit Controls for MSRs [...]
-
Page 21
22.3.2.1 Loading Guest Control Registers, Debug Registers, and MSRs (p. 21-14)
22.3.2.2 Loading Guest Segment Registers and Descriptor-Table Registers (p. 21-16)
22.3.2.3 Loading Guest RIP, RSP, and RFLAGS [...]
-
Page 22
24.3.2 Exiting From SMM (p. 26-4)
24.4 SMRAM (p. 26-4)
24.4.1 SMRAM State Save Map [...]
-
Page 23
CHAPTER 25 VIRTUAL-MACHINE MONITOR PROGRAMMING CONSIDERATIONS
25.1 VMX SYSTEM PROGRAMMING OVERVIEW (p. 23-1)
25.2 SUPPORTING PROCESSOR OPERATING MODES IN GUEST ENVIRONMENTS [...]
-
Page 24
26.3.5.1 Initialization of Virtual TLB (p. 24-6)
26.3.5.2 Response to Page Faults (p. 24-7)
26.3.5.3 Response to Uses of INVLPG [...]
-
Page 25
APPENDIX C MP INITIALIZATION FOR P6 FAMILY PROCESSORS
C.1 OVERVIEW OF THE MP INITIALIZATION PROCESS FOR P6 FAMILY PROCESSORS (p. C-1)
C.2 MP INITIALIZATION PROTOCOL ALGORITHM [...]
-
Page 26
H.3.4 32-Bit Host-State Field (p. H-6)
H.4 NATURAL-WIDTH FIELDS (p. H-6)
H.4.1 Natural-Width Control Fields [...]
-
Page 27
Figure 3-23. Format of Page-Directory Entries for 4-MByte Pages and 36-Bit Physical Addresses (p. 3-38)
Figure 3-24. IA-32e Mode Paging Structures (4-KByte Pages) (p. 3-40)
Figure 3-25. IA-32e Mode Paging Structures[...]
-
Page 28
Figure 7-6. Topological Relationships between Hierarchical IDs in a Hypothetical MP Platform (p. 7-36)
Figure 8-1. Relationship of Local APIC and I/O APIC In Single-Processor Systems [...]
-
Page 29
Figure 11-2. Mapping of MMX Registers to x87 FPU Data Register Stack (p. 11-7)
Figure 12-1. Example of Saving the x87 FPU, MMX, SSE, and SSE2 State During an Operating-System Controlled Task Switch (p. 12-9)
Figure 13-1. Processor Modulation Through Stop-Clock Mechanism [...]
-
Page 30
Figure 18-23. MSR_IFSB_CTL6, Address: 107D2H; MSR_IFSB_CNTR7, Address: 107D3H (p. 18-70)
Figure 18-24. PerfEvtSel0 and PerfEvtSel1 MSRs (p. 18-71)
Figure 18-25. CESR MSR (Pentium Processor Only) [...]
-
Page 31
Table 6-1. Exception Conditions Checked During a Task Switch (p. 6-15)
Table 6-2. Effect of a Task Switch on Busy Flag, NT Flag, Previous Task Link Field, and TS Flag (p. 6-17)
Table 7-1. Initial APIC IDs for the Logical Processors in a Sy[...]
-
Page 32
Table 11-3. Effect of the MMX, x87 FPU, and FXSAVE/FXRSTOR Instructions on the x87 FPU Tag Word (p. 11-4)
Table 12-1. Action Taken for Combinations of OSFXSR, OSXMMEXCPT, SSE, SSE2, SSE3, EM, MP, and TS1 [...]
-
Page 33
Table 23-1. Exit Qualification for Debug Exceptions (p. 22-5)
Table 23-2. Exit Qualification for Task Switch (p. 22-6)
Table 23-3. Exit Qualification for Control-Register Accesses [...]
-
Page 34
Table F-3. Non-Focused Lowest Priority Message (34 Cycles) (p. F-3)
Table F-4. APIC Bus Status Cycles Interpretation (p. F-5)
Table G-1. Memory Types Used For VMCS Access (p. G-2)[...]
-
Page 35
1 About This Manual[...]
-
Page 36
[...]
-
Page 37
Vol. 3A 1-1
CHAPTER 1 ABOUT THIS MANUAL
The IA-32 Intel® Architecture Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (order number 253668) and the IA-32 Intel® Architecture Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 (order number 253669) are part of a set that describes [...]
-
Page 38
1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE
A description of this manual’s content follows:
Chapter 1 — About This Manual. Gives an overview of all three volumes of the IA-32 Intel Architecture Software Developer’s Manual. It also describes the notational conventions in these manuals and lists relate[...]
-
Page 39
level, including: task switching, exception handling, and compatibility with existing system environments.
Chapter 12 — SSE, SSE2 and SSE3 System Programming. Describes those aspects of SSE/SSE2/SSE3 extensions that must be handled and considered at the system programming level, including task switching, exceptio[...]
-
Page 40
Chapter 25 — Virtual-Machine Monitoring Programming Considerations. Describes programming considerations for VMMs. VMMs manage virtual machines (VMs).
Chapter 26 — Virtualization of System Resources. Describes the virtualization of the system resources. These include: debugging facilities, address translation[...]
-
Page 41
1.3.1 Bit and Byte Order
In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. IA-32 proces-[...]
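The bit-numbering rule in this preview can be illustrated with a short Python sketch. The helper names `bit_value` and `set_bit_positions` are hypothetical and are not taken from the manual; the sketch only assumes that bit 0 is the rightmost (least significant) bit, as the text states.

```python
def bit_value(position):
    """Return the numerical value of a single set bit (bit 0 is the rightmost bit)."""
    if position < 0:
        raise ValueError("bit position must be non-negative")
    return 1 << position  # two raised to the power of the bit position


def set_bit_positions(value):
    """List the positions of all set bits in a non-negative integer, lowest first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return [pos for pos in range(value.bit_length()) if value & bit_value(pos)]
```

For instance, `set_bit_positions(0b10010001)` walks the value from right to left and reports positions 0, 4, and 7.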
-
Page 42
1.3.3 Instruction Operands
When instructions are represented symbolically, a subset of the IA-32 assembly language is used. In this subset, an instruction has the following format:
label: mnemonic argument1, argument2, argument3
where:
• A label is an identifier which is followed by a colon.
• A mnemonic is a r[...]
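As a rough illustration of the `label: mnemonic argument1, argument2, argument3` format, the following Python sketch splits a symbolic instruction into its parts. The function name `parse_instruction` and the regular expression are assumptions made for this example; the manual's full rules for each field are cut off in the preview, so this is only a minimal approximation.

```python
import re

# Hypothetical pattern: an optional identifier followed by a colon at the start of the line.
_LABEL = re.compile(r"^\s*([A-Za-z_]\w*)\s*:\s*(.*)$")


def parse_instruction(line):
    """Split 'label: mnemonic arg1, arg2, arg3' into (label, mnemonic, arguments)."""
    label = None
    match = _LABEL.match(line)
    if match:
        label, line = match.group(1), match.group(2)
    parts = line.strip().split(None, 1)  # mnemonic, then the rest of the line
    if not parts:
        raise ValueError("no mnemonic found")
    mnemonic = parts[0]
    arguments = [arg.strip() for arg in parts[1].split(",")] if len(parts) > 1 else []
    return label, mnemonic, arguments
```

Only a colon that directly follows the first identifier is treated as a label separator, so an operand such as `ES:[EBX]` later in the line is left intact.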
-
Page 43
1.3.4 Hexadecimal and Binary Numbers
Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Base 2 (binary) numbers are represente[...]
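The hexadecimal notation described here is easy to convert mechanically. The Python sketch below, with the hypothetical helper name `parse_hex_h`, handles only the H-suffixed form; the binary rule is truncated in this preview, so it is deliberately left out.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def parse_hex_h(text):
    """Convert a number written as hexadecimal digits followed by H (e.g. F82EH) to an int."""
    token = text.strip().upper()
    if len(token) < 2 or not token.endswith("H"):
        raise ValueError("expected hexadecimal digits followed by H")
    digits = token[:-1]
    if not set(digits).issubset(HEX_DIGITS):
        raise ValueError("invalid hexadecimal digit in " + repr(text))
    return int(digits, 16)
```

Applied to the manual's own example, `parse_hex_h("F82EH")` yields the same integer as the Python literal `0xF82E`.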
-
Page 44
1.3.7 Exceptions
An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code repo[...]
-
Page 45
be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception.
#GP(0)
1.4 RELATED LITERATURE
Literature related to IA-32 processors is listed on-line at this link: http://developer.intel.com/design/processor/ Some of the documents listed at this web site[...]
-
Page 46
1-10 Vol. 3A ABOUT THIS MANUAL[...]
-
Page 47
2 System Architecture Overview[...]
-
Page 48
[...]
-
Page 49
CHAPTER 2 SYSTEM ARCHITECTURE OVERVIEW
IA-32 architecture (beginning with the Intel386 processor family) provides extensive support for operating-system and system-development software. This support offers multiple modes of operation, which include:
• Real mode, protected mode, virtual 8086 mode, and system management mode. Th[...]
-
Page 50
2.1 OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE
IA-32 system-level architecture consists of a set of registers, data structures, and instructions designed to support basic system-level operations such as memory management, interrupt and exception handling, task management, and control of multiple processor[...]
-
Page 51
Figure 2-1. IA-32 System-Level Registers and Data Structures (diagram labels: Local Descriptor Table (LDT), EFLAGS Register, Control Registers CR0 through CR4, Global Descriptor Table (GDT), Interrupt Descriptor Table (IDT), IDTR, GDTR, Interrupt Gate, Trap Gate, LDT Descriptor, TSS Descriptor, Code, Stack) [...]
-
Page 52
Figure 2-2. System-Level Registers and Data Structures in IA-32e Mode (diagram labels: Local Descriptor Table (LDT), Control Registers CR0 through CR4, Global Descriptor Table (GDT), Interrupt Descriptor Table (IDT), IDTR, GDTR, Interrupt Gate, Trap Gate, LDT Descriptor, TSS Descriptor, Current TSS, Code, Stack) [...]
-
Page 53
2.1.1 Global and Local Descriptor Tables
When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment descriptors. Segment descriptors provide the [...]
-
Page 54
For example, a CALL to a call gate can provide access to a procedure in a code segment that is at the same or a numerically lower privilege level (more privileged) than the current code segment. To access a procedure through a call gate, the calling procedure supplies the selector for the call gate. T[...]
-
Page 55
A task can also be accessed through a task gate. A task gate is similar to a call gate, except that it provides access (through a segment selector) to a TSS rather than a code segment.
2.1.3.1 Task-State Segments in IA-32e Mode
Hardware task switches are not supported in IA-32e mode. However, TSSs con[...]
-
Page 56
The location of pages (sometimes called page frames) in physical memory is contained in two types of system data structures: page directories and page tables. Both structures reside in physical memory (see Figure 2-1). The base physical address of the page directory is contained in control register CR3.[...]
-
Page 57
• The GDTR, LDTR, and IDTR registers contain the linear addresses and sizes (limits) of their respective tables. See also: Section 2.4, “Memory-Management Registers.”
• The task register contains the linear address and size of the TSS for the current task. See also: Section 2.4, “Memory-Manage[...]
-
Page 58
2.1.7 Other System Resources
Besides the system registers and data structures described in the previous sections, system architecture provides the following additional resources:
• Operating system instructions (see also: Section 2.6, “System Instruction Summary”).
• Performance-monitoring cou[...]
-
Page 59
The processor is placed in real-address mode following power-up or a reset. The PE flag in control register CR0 then controls whether the processor is operating in real-address or protected mode. See also: Section 9.9, “Mode Switching.” The VM flag in the EFLAGS register determines whether the proce[...]
-
Page 60
2.3 SYSTEM FLAGS AND FIELDS IN THE EFLAGS REGISTER
The system flags and IOPL field of the EFLAGS register control I/O, maskable hardware interrupts, debugging, task switching, and the virtual-8086 mode (see Figure 2-4). Only privileged code (typically operating system or executive code) should be al[...]
-
Page 61
The IOPL is also one of the mechanisms that controls the modification of the IF flag and the handling of interrupts in virtual-8086 mode when virtual mode extensions are in effect (when CR4.VME = 1). See also: Chapter 13, “Input/Output,” in the IA-32 Intel® Architecture Software Developer[...]
-
Page 62
VIF Virtual Interrupt (bit 19) — Contains a virtual image of the IF flag. This flag is used in conjunction with the VIP flag. The processor only recognizes the VIF flag when either the VME flag or the PVI flag in control register CR4 is set and the IOPL is less than 3. (The VME flag enables the virtual[...]
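The recognition condition for the VIF flag can be written as a small predicate. In this Python sketch, bit 19 for VIF and bit 0 of CR4 for VME come from these page previews; the positions assumed for PVI (bit 1 of CR4) and IOPL (EFLAGS bits 12 and 13) are not shown in the excerpt and are labeled as assumptions. The function names are hypothetical.

```python
VIF_BIT = 19       # from the text: VIF is bit 19 of EFLAGS
IOPL_SHIFT = 12    # assumption: IOPL occupies EFLAGS bits 12-13
CR4_VME = 1 << 0   # from the text: VME is bit 0 of CR4
CR4_PVI = 1 << 1   # assumption: PVI is bit 1 of CR4


def vif_recognized(eflags, cr4):
    """True when the processor recognizes VIF: (VME or PVI set) and IOPL below 3."""
    iopl = (eflags >> IOPL_SHIFT) & 0b11
    return bool(cr4 & (CR4_VME | CR4_PVI)) and iopl < 3


def vif_value(eflags):
    """Extract the VIF flag (0 or 1) from an EFLAGS image."""
    return (eflags >> VIF_BIT) & 1
```

The predicate works on plain integer images of the registers, so it can be exercised without touching real hardware state.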
-
Page 63
2.4.1 Global Descriptor Table Register (GDTR)
The GDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and the 16-bit table limit for the GDT. The base address specifies the linear address of byte 0 of the GDT; the table limit specifies the number of bytes in the tab[...]
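To make the two GDTR fields concrete, the sketch below packs a 16-bit limit and a 32-bit or 64-bit base into a byte string. The field widths come from the text; the ordering (limit first, then base, both little-endian) is an assumption about the in-memory image and is not shown in this excerpt. The name `pack_gdtr_image` is hypothetical, and the exact meaning of the limit value is truncated in the preview, so the sketch treats it as an opaque 16-bit number.

```python
def pack_gdtr_image(base, limit, long_mode=False):
    """Pack a 16-bit table limit followed by a 32-bit (or 64-bit) base address."""
    base_size = 8 if long_mode else 4  # 64-bit base in IA-32e mode, else 32-bit
    if not 0 <= limit <= 0xFFFF:
        raise ValueError("limit must fit in 16 bits")
    if not 0 <= base <= (1 << (8 * base_size)) - 1:
        raise ValueError("base does not fit in %d bits" % (8 * base_size))
    # Assumed layout: limit first, then base, both little-endian.
    return limit.to_bytes(2, "little") + base.to_bytes(base_size, "little")
```

Under these assumptions the image is 6 bytes long in protected mode and 10 bytes long in IA-32e mode.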
-
Page 64
2.4.3 IDTR Interrupt Descriptor Table Register
The IDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit specifies the number of bytes in the tabl[...]
-
Page 65
The control registers are summarized below, and each architecturally defined control field in these control registers is described individually. In Figure 2-6, the width of the register in 64-bit mode is indicated in parentheses (except for CR0).
• CR0 — Contains system control flags that contro[...]
-
Page 66
When loading a control register, reserved bits should always be set to the values previously read. The flags in control registers are:
PG Paging (bit 31 of CR0) — Enables paging when set; disables paging when clear. When paging is disabled, all linear addresses are treated as physical addresses. The[...]
-
Page 67
NW Not Write-through (bit 29 of CR0) — When the NW and CD flags are clear, write-back (for Pentium 4, Intel Xeon, P6 family, and Pentium processors) or write-through (for Intel486 processors) is enabled for writes that hit the cache and invalidation cycles are enabled. See Table 10-5 for detailed[...]
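A CR0 value can be decoded flag by flag once the bit positions are known. In the Python sketch below, the positions of PG (31), NW (29), and MP (1) are quoted in these page previews; the positions used for PE, EM, TS, and CD are assumptions based on general knowledge of the architecture and do not appear in the excerpt. The names `CR0_FLAGS` and `decode_cr0` are hypothetical.

```python
CR0_FLAGS = {
    "PE": 0,   # assumption: not given in the excerpt
    "MP": 1,   # from the text: bit 1 of CR0
    "EM": 2,   # assumption
    "TS": 3,   # assumption
    "NW": 29,  # from the text: bit 29 of CR0
    "CD": 30,  # assumption
    "PG": 31,  # from the text: bit 31 of CR0
}


def decode_cr0(value):
    """Return a dict mapping each known CR0 flag name to True (set) or False (clear)."""
    return {name: bool((value >> bit) & 1) for name, bit in CR0_FLAGS.items()}
```

Decoding `0x80000001`, for example, reports PG and PE as set and every other listed flag as clear.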
-
Page 68
• If the TS flag is set and the MP flag (bit 1 of CR0) and EM flag are clear, an #NM exception is not raised prior to the execution of an x87 FPU WAIT/FWAIT instruction.
• If the EM flag is set, the setting of the TS flag has no effect on the execution of x87 FPU/MMX/SSE/SSE2/SSE3 instructions. T[...]
-
Page 69
FPU or math coprocessor present in the system. Table 2-1 shows the interaction of the EM, MP, and TS flags. Also, when the EM flag is set, execution of an MMX instruction causes an invalid-opcode exception (#UD) to be generated (see Table 11-1). Thus, if an IA-32 processor incorporates MMX technology[...]
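The interaction of the EM, MP, and TS flags can be summarized as two boolean rules. Table 2-1 itself is not reproduced in this preview, so the Python sketch below is an assumption reconstructed from the surrounding bullets and general knowledge of the architecture: an x87 FPU instruction raises #NM when EM or TS is set, and WAIT/FWAIT raises #NM only when both MP and TS are set. The function names are hypothetical.

```python
def x87_raises_nm(em, ts):
    """Assumed rule: an x87 FPU instruction raises #NM if EM or TS is set."""
    return bool(em or ts)


def wait_raises_nm(mp, ts):
    """Assumed rule: WAIT/FWAIT raises #NM only when both MP and TS are set."""
    return bool(mp and ts)


def interaction_table():
    """Enumerate all (EM, MP, TS) combinations with the resulting behaviour."""
    rows = []
    for em in (False, True):
        for mp in (False, True):
            for ts in (False, True):
                rows.append((em, mp, ts, x87_raises_nm(em, ts), wait_raises_nm(mp, ts)))
    return rows
```

These rules agree with the bullets quoted in the previews: with TS set and MP clear, WAIT/FWAIT does not raise #NM, and with EM set, changing TS does not alter the outcome for x87 instructions.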
-
Page 70
VME Virtual-8086 Mode Extensions (bit 0 of CR4) — Enables interrupt- and exception-handling extensions in virtual-8086 mode when set; disables the extensions when clear. Use of the virtual mode extensions can improve the performance of virtual-8086 applications by eliminating the overhead of callin[...]
-
Page 71
When enabling the global page feature, paging must be enabled (by setting the PG flag in control register CR0) before the PGE flag is set. Reversing this sequence may affect program correctness, and processor performance will be impacted. See also: Section 3.12, “Translation Lookaside Buffers (TLBs).”[...]
-
Page 72
2.5.1 CPUID Qualification of Control Register Flags
The VME, PVI, TSD, DE, PSE, PAE, MCE, PGE, PCE, OSFXSR, and OSXMMEXCPT flags in control register CR4 are model specific. All of these flags (except the PCE flag) can be qualified with the CPUID instruction to determine if they are implemented on [...]
-
Page 73
2.6.1 Loading and Storing System Registers
The GDTR, LDTR, IDTR, and TR registers each have a load and store instruction for loading data into and storing data from the register:
• LGDT (Load GDTR Register) — Loads the GDT base address and limit from memory into the GDTR register.
• SGDT (Store G[...]
-
Page 74
• SLDT (Store LDT Register) — Stores the LDT segment selector from the LDTR register into memory or a general-purpose register.
• LTR (Load Task Register) — Loads segment selector and segment descriptor for a TSS from memory into the task register. (The segment selector operand can also be [...]
-
Page 75
Offset Is Within Limits (LSL Instruction),” for a detailed explanation of the function and use of this instruction. The VERR (verify for reading) and VERW (verify for writing) instructions verify if a selected segment is readable or writable, respectively, at a given CPL. See Section 4.10.2, “Checki[...]
-
Page 76
Hardware may respond to this signal in a number of ways. An indicator light on the front panel may be turned on. An NMI interrupt for recording diagnostic information may be generated. Reset initialization may be invoked (note that the BINIT# pin was introduced with the Pentium Pro processor). If an[...]
-
Page 77
See Section 18.10, “Performance Monitoring Overview,” and Section 18.9, “Time-Stamp Counter,” for more information about the performance monitoring and time-stamp counters. The RDTSC instruction was introduced into the IA-32 architecture with the Pentium processor. The RDPMC instruction wa[...]
-
Page 78
2-30 Vol. 3A SYSTEM ARCHITECTURE OVERVIEW[...]
-
Page 79
3 Protected-Mode Memory Management[...]
-
Page 80
[...]
-
Page 81
CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT
This chapter describes the IA-32 architecture’s protected-mode memory management facilities, including the physical memory requirements, segmentation mechanism, and paging mechanism. See also: Chapter 4, “Protection” (for a description of the processor’s protection mechanism) an[...]
-
Page 82
If paging is not used, the linear address space of the processor is mapped directly into the physical address space of processor. The physical address space is defined as the range of addresses that the processor can generate on its address bus. Because multitasking computing systems commonly defin[...]
-
Page 83
If the page being accessed is not currently in physical memory, the processor interrupts execution of the program (by generating a page-fault exception). The operating system or executive then reads the page into physical memory from the disk and continues executing the program. When paging is impl[...]
-
Page 84
More complexity can be added to this protected flat model to provide more protection. For example, for the paging mechanism to provide isolation between user and supervisor code and data, four segments need to be defined: code and data segments at privilege level 3 for the user, and code and data se[...]
-
Page 85
Vol. 3A 3-5 PROTECTED-MODE MEMORY MANAGEMENT 3.2.3 Multi-Segment Model A multi-segment model (such as the one shown in Figure 3-4) uses the full capabilities of the segmentation mechanism to provide hardware-enforced protection of code, data structures, and programs and tasks. Here, each program (or task) is given its own table of segment descr[...]
-
Page 86
3-6 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.2.4 Segmentation in IA-32e Mode In IA-32e mode, the effects of segmentation depend on whether the processor is running in compatibility mode or 64-bit mode. In compatibility mode, segmentation functions just as it does using legacy 16-bit or 32-bit protected mode semantics. In 64-bit mode, segmenta[...]
-
Page 87
Vol. 3A 3-7 PROTECTED-MODE MEMORY MANAGEMENT 3.3.1 Physical Address Space for Processors with Intel® EM64T On processors that support Intel EM64T (CPUID.80000001.EDX[29] = 1), the size of physical address range is implementation-specific and indicated by CPUID.80000001H. The physical address size supported by a given implementation is ava[...]
-
Page 88
3-8 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT If paging is not used, the processor maps the linear address directly to a physical address (that is, the linear address goes out on the processor’s address bus). If the linear address space is paged, a second level of address translation is used to translate the linear address into a physical add[...]
-
Page 89
Vol. 3A 3-9 PROTECTED-MODE MEMORY MANAGEMENT TI (table indicator) flag (Bit 2) — Specifies the descriptor table to use: clearing this flag selects the GDT; setting this flag selects the current LDT. Requested Privilege Level (RPL) (Bits 0 and 1) — Specifies the privilege level of the selector. The privilege level can range from 0 to 3, wit[...]
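The selector fields described above can be illustrated with a minimal Python sketch. The TI and RPL bit positions come from the text; the index field in bits 3 through 15 is taken from the standard selector layout rather than from this excerpt, and the names `SegmentSelector` and `decode_selector` are hypothetical.

```python
from typing import NamedTuple


class SegmentSelector(NamedTuple):
    index: int  # bits 3-15: descriptor index (layout assumed, not shown in the excerpt)
    ti: int     # bit 2: 0 selects the GDT, 1 selects the current LDT
    rpl: int    # bits 0-1: requested privilege level, 0 to 3


def decode_selector(selector: int) -> SegmentSelector:
    """Split a 16-bit segment selector into its index, TI, and RPL fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a segment selector is a 16-bit value")
    return SegmentSelector(
        index=selector >> 3,
        ti=(selector >> 2) & 1,
        rpl=selector & 3,
    )


# Illustrative value only: 0x0027 decodes to index 4, TI = 1 (LDT), RPL = 3.
example = decode_selector(0x0027)
```

A selector such as 0x0008 decodes to index 1 in the GDT with RPL 0, which is how many flat-model systems refer to their first usable descriptor.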
-
Page 90
3-10 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT can be available for immediate use. Other segments can be made available by loading their segment selectors into these registers during program execution. Every segment register has a “visible” part and a “hidden” part. (The hidden part is sometimes referred to as a “descriptor cache” o[...]
-
Page 91
Vol. 3A 3-11 PROTECTED-MODE MEMORY MANAGEMENT 3.4.4 Segment Loading Instructions in IA-32e Mode Because ES, DS, and SS segment registers are not used in 64-bit mode, their fields (base, limit, and attribute) in segment descriptor registers are ignored. Some forms of segment load instructions are also invalid (for example, LDS, POP ES). Addr[...]
-
Page 92
3-12 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.4.5 Segment Descriptors A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compilers, linkers, loaders, or the operating system or [...]
-
Page 93
Vol. 3A 3-13 PROTECTED-MODE MEMORY MANAGEMENT segment limit has the reverse function; the offset can range from the segment limit to FFFFFFFFH or FFFFH, depending on the setting of the B flag. Offsets less than the segment limit generate general-protection exceptions. Decreasing the value in the segment limit field for an expand-down segment all[...]
-
Page 94
3-14 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT segment. (This flag should always be set to 1 for 32-bit code and data segments and to 0 for 16-bit code and data segments.) • Executable code segment. The flag is called the D flag and it indicates the default length for effective addresses and operands referenced by instructions in the segm[...]
-
Page 95
Vol. 3A 3-15 PROTECTED-MODE MEMORY MANAGEMENT L (64-bit code segment) flag In IA-32e mode, bit 21 of the second doubleword of the segment descriptor indicates whether a code segment contains native 64-bit code. A value of 1 indicates instructions in this code segment are executed in 64-bit mode. A value of 0 indicates the instructions in this [...]
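As a rough illustration of the L flag, the following Python sketch classifies a code-segment descriptor from its second doubleword. The position of the L flag (bit 21) comes from the text; the D flag position (bit 22) and the treatment of L = 1 with D = 1 as a reserved combination are assumptions drawn from the general descriptor layout, and `code_segment_mode` is a hypothetical name.

```python
L_FLAG = 1 << 21  # bit 21 of the second doubleword (from the text)
D_FLAG = 1 << 22  # D flag position: an assumption, not stated in this excerpt


def code_segment_mode(second_dword: int) -> str:
    """Describe how IA-32e mode would treat a code segment, given the
    second doubleword of its descriptor."""
    l_set = bool(second_dword & L_FLAG)
    d_set = bool(second_dword & D_FLAG)
    if l_set and not d_set:
        return "64-bit"
    if l_set and d_set:
        return "reserved"  # assumed: this combination is not a valid 64-bit segment
    if d_set:
        return "compatibility mode, 32-bit default"
    return "compatibility mode, 16-bit default"
```

Only the two flags are examined here; a real loader would also validate the type, S, and P fields before using the descriptor.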
-
Page 96
3-16 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT Stack segments are data segments which must be read/write segments. Loading the SS register with a segment selector for a nonwritable data segment generates a general-protection exception (#GP). If the size of a stack segment needs to be changed dynamically, the stack segment can be an expand-down d[...]
-
Page 97
Vol. 3A 3-17 PROTECTED-MODE MEMORY MANAGEMENT 3.5 SYSTEM DESCRIPTOR TYPES When the S (descriptor type) flag in a segment descriptor is clear, the descriptor type is a system descriptor. The processor recognizes the following types of system descriptors: • Local descriptor-table (LDT) segment descriptor. • Task-state segment (TSS) descript[...]
-
Page 98
3-18 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT See also: Section 3.5.1, “Segment Descriptor Tables”, and Section 6.2.2, “TSS Descriptor” (for more information on the system-segment descriptors); see Section 4.8.3, “Call Gates”, Section 5.11, “IDT Descriptors”, and Section 6.2.5, “Task-Gate Descriptor” (for more informatio[...]
-
Page 99
Vol. 3A 3-19 PROTECTED-MODE MEMORY MANAGEMENT Each system must have one GDT defined, which may be used for all programs and tasks in the system. Optionally, one or more LDTs can be defined. For example, an LDT can be defined for each separate task being run, or some or all tasks can share the same LDT. The GDT is not a segment itself; instead, [...]
-
Page 100
3-20 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.5.2 Segment Descriptor Tables in IA-32e Mode In IA-32e mode, a segment descriptor table can contain up to 8192 (2^13) 8-byte descriptors. An entry in the segment descriptor table can be 8 bytes. System descriptors are expanded to 16 bytes (occupying the space of two entries). GDTR and LDTR registe[...]
-
Page 101
Vol. 3A 3-21 PROTECTED-MODE MEMORY MANAGEMENT accessed for a long time. See Section 3.12, “Translation Lookaside Buffers (TLBs)”, for more information on the TLBs. 3.6.1 Paging Options Paging is controlled by three flags in the processor’s control registers: • PG (paging) flag. Bit 31 of CR0 (available in all IA-32 processors beginni[...]
-
Page 102
3-22 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.6.2 Page Tables and Directories in the Absence of Intel EM64T The information that the processor uses to translate linear addresses into physical addresses (when paging is enabled) is contained in four data structures: • Page directory — An array of 32-bit page-directory entries (PDEs) contain[...]
-
Page 103
Vol. 3A 3-23 PROTECTED-MODE MEMORY MANAGEMENT 3.7.1 Linear Address Translation (4-KByte Pages) Figure 3-12 shows the page directory and page-table hierarchy when mapping linear addresses to 4-KByte pages. The entries in the page directory point to page tables, and the entries in a page table point to pages in physical memory. This pag[...]
-
Page 104
3-24 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT To select the various table entries, the linear address is divided into three sections: • Page-directory entry — Bits 22 through 31 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a page table. • Page-table entry — Bits 12 thro[...]
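The three-way split of the linear address can be sketched in Python as follows. The page-directory field (bits 22 through 31) is stated in the text; the page-table field is assumed to end at bit 21 and the page offset to occupy bits 0 through 11, which is the conventional 10/10/12 layout for 4-KByte pages. The function name is hypothetical.

```python
from typing import Tuple


def split_linear_address_4k(linear: int) -> Tuple[int, int, int]:
    """Return (page-directory index, page-table index, page offset)
    for a 32-bit linear address mapped with 4-KByte pages."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit linear address")
    pde_index = (linear >> 22) & 0x3FF  # bits 22-31
    pte_index = (linear >> 12) & 0x3FF  # bits 12-21 (assumed upper bound)
    offset = linear & 0xFFF             # bits 0-11 (assumed)
    return pde_index, pte_index, offset
```

For instance, the address 0xC0101234 selects page-directory entry 0x300, page-table entry 0x101, and byte 0x234 within the page.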
-
Page 105
Vol. 3A 3-25 PROTECTED-MODE MEMORY MANAGEMENT NOTE (For the Pentium processor only.) When enabling or disabling large page sizes, the TLBs must be invalidated (flushed) after the PSE flag in control register CR4 has been set or cleared. Otherwise, incorrect page translation might occur due to the processor using outdated page translation info[...]
-
Page 106
3-26 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.7.6 Page-Directory and Page-Table Entries Figure 3-14 shows the format for the page-directory and page-table entries when 4-KByte pages and 32-bit physical addresses are being used. Figure 3-15 shows the format for the page-directory entries when 4-MByte pages and 32-bit physical addresses are [...]
-
Page 107
Vol. 3A 3-27 PROTECTED-MODE MEMORY MANAGEMENT (Page-directory entries for 4-KByte page tables) — Specifies the physical address of the first byte of a page table. The bits in this field are interpreted as the 20 most-significant bits of the physical address, which forces page tables to be aligned on 4-KByte boundaries. (Page-directory entries [...]
-
Page 108
3-28 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3. Invalidate the current page-table entry in the TLB (see Section 3.12, “Translation Lookaside Buffers (TLBs)”, for a discussion of TLBs and how to invalidate them). 4. Return from the page-fault handler to restart the interrupted program (or task). Read/write (R/W) flag, bit 1 Specifies the [...]
-
Page 109
Vol. 3A 3-29 PROTECTED-MODE MEMORY MANAGEMENT This flag is a “sticky” flag, meaning that once set, the processor does not implicitly clear it. Only software can clear this flag. The accessed and dirty flags are provided for use by memory management software to manage the transfer of pages and page tables into and out of physical memory. NOT[...]
-
Page 110
3-30 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT in the TLB when register CR3 is loaded or a task switch occurs. This flag is provided to prevent frequently used pages (such as pages that contain kernel or other operating system or executive code) from being flushed from the TLB. Only software can set or clear this flag. For page-directory entries[...]
-
Page 111
Vol. 3A 3-31 PROTECTED-MODE MEMORY MANAGEMENT When the PAE paging mechanism is enabled, the processor supports two sizes of pages: 4-KByte and 2-MByte. As with 32-bit addressing, both page sizes can be addressed within the same set of paging tables (that is, a page-directory entry can point to either a 2-MByte page or a page table that in tur[...]
-
Page 112
3-32 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT To select the various table entries, the linear address is divided into three sections: • Page-directory-pointer-table entry—Bits 30 and 31 provide an offset to one of the 4 entries in the page-directory-pointer table. The selected entry provides the base physical address of a page directory.[...]
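A minimal Python sketch of the PAE address split for 4-KByte pages is given below. Only the page-directory-pointer field (bits 30 and 31) is visible in the text; the remaining fields (page directory in bits 21 through 29, page table in bits 12 through 20, offset in bits 0 through 11) are assumed from the usual 2/9/9/12 PAE layout, and the function name is hypothetical.

```python
from typing import Tuple


def split_linear_address_pae_4k(linear: int) -> Tuple[int, int, int, int]:
    """Return (PDPT index, page-directory index, page-table index, offset)
    for a 32-bit linear address under PAE with 4-KByte pages."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit linear address")
    pdpt_index = (linear >> 30) & 0x3   # bits 30-31: one of 4 entries
    pd_index = (linear >> 21) & 0x1FF   # bits 21-29 (assumed)
    pt_index = (linear >> 12) & 0x1FF   # bits 12-20 (assumed)
    offset = linear & 0xFFF             # bits 0-11 (assumed)
    return pdpt_index, pd_index, pt_index, offset
```

The index fields shrink from 10 bits to 9 because PAE entries are 64 bits wide, so a 4-KByte table holds 512 entries instead of 1024.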
-
Page 113
Vol. 3A 3-33 PROTECTED-MODE MEMORY MANAGEMENT CR4 has no effect on the page size when PAE is enabled.) With the PS flag set, the linear address is divided into three sections: • Page-directory-pointer-table entry—Bits 30 and 31 provide an offset to an entry in the page-directory-pointer table. The selected entry provides the base physical [...]
-
Page 114
3-34 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.8.5 Page-Directory and Page-Table Entries With Extended Addressing Enabled Figure 3-20 shows the format for the page-directory-pointer-table, page-directory, and page-table entries when 4-KByte pages and 36-bit extended physical addresses are being used. Figure 3-21 shows the format for the p[...]
-
Page 115
Vol. 3A 3-35 PROTECTED-MODE MEMORY MANAGEMENT Figure 3-20. Format of Page-Directory-Pointer-Table, Page-Directory, and Page-Table Entries for 4-KByte Pages with PAE Enabled[...]
-
Page 116
3-36 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT The base physical address in an entry specifies the following, depending on the type of entry: • Page-directory-pointer-table entry — the physical address of the first byte of a 4-KByte page directory. • Page-directory entry — the physical address of the first byte of a 4-KByte page table or [...]
-
Page 117
Vol. 3A 3-37 PROTECTED-MODE MEMORY MANAGEMENT Access (A) and dirty (D) flags (bits 5 and 6) are provided for table entries that point to pages. Bits 9, 10, and 11 in all the table entries for the physical address extension are available for use by software. (When the present flag is clear, bits 1 through 63 are available to software.) All bits in[...]
-
Page 118
3-38 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT Figure 3-23 shows the format for the page-directory entries when 4-MByte pages and 36-bit physical addresses are being used. Section 3.7.6, “Page-Directory and Page-Table Entries” describes the functions of the flags and fields in bits 0 through 11. Figure 3-22. Linear Address Tr[...]
-
Page 119
Vol. 3A 3-39 PROTECTED-MODE MEMORY MANAGEMENT 3.10 PAE-ENABLED PAGING IN IA-32E MODE Intel EM64T 64-bit extensions expand physical address extension (PAE) paging structures to potentially support mapping a 64-bit linear address to a 52-bit physical address. In the first implementation of Intel EM64T, PAE paging structures support translatio[...]
-
Page 120
3-40 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.10.2 IA-32e Mode Linear Address Translation (2-MByte Pages) Figure 3-25 shows the PML4 table, page-directory-pointer, and page-directory hierarchy when mapping linear addresses to 2-MByte pages in IA-32e mode. This method can be used to address up to 2^27 pages, which spans a linear address s[...]
-
Page 121
Vol. 3A 3-41 PROTECTED-MODE MEMORY MANAGEMENT • Page-directory entry — Bits 29:21 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a 2-MByte page. • Page offset — Bits 20:0 provide an offset to a physical address in the page. 3.10.3 Enhanced Paging Data Structures Figure 3-26 [...]
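The IA-32e split for 2-MByte pages can be sketched in Python as follows. The page-directory field (bits 29:21) and page offset (bits 20:0) are given in the text; the PML4 field (bits 47:39) and page-directory-pointer field (bits 38:30) are assumed from the standard four-level layout, and the function name is hypothetical. Bits above 47 are ignored here for simplicity.

```python
from typing import Tuple


def split_linear_address_ia32e_2m(linear: int) -> Tuple[int, int, int, int]:
    """Return (PML4 index, PDPT index, page-directory index, page offset)
    for a linear address mapped with 2-MByte pages in IA-32e mode."""
    if not 0 <= linear < 1 << 64:
        raise ValueError("expected a 64-bit linear address")
    pml4_index = (linear >> 39) & 0x1FF  # bits 47:39 (assumed)
    pdpt_index = (linear >> 30) & 0x1FF  # bits 38:30 (assumed)
    pd_index = (linear >> 21) & 0x1FF    # bits 29:21
    offset = linear & 0x1FFFFF           # bits 20:0
    return pml4_index, pdpt_index, pd_index, offset
```

Three 9-bit indexes give 2^27 distinct page-directory entries, which matches the page count quoted for this translation method.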
-
Page 122
3-42 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT Except for bit 63, functions of the flags in these entries are as described in Section 3.7.6, “Page-Directory and Page-Table Entries”. The differences are: • A PML4 table entry and a page-directory-pointer-table entry are added. • Entries are increased from 32 bits to 64 bits. • The maxim[...]
-
Page 123
Vol. 3A 3-43 PROTECTED-MODE MEMORY MANAGEMENT • The base physical address field in each entry is extended to 28 bits if the processor’s implementation supports a 40-bit physical address. • Bits 62:52 are available for use by system programmers. • Bit 63 is the execute-disable bit if the execute-disable bit feature is supported in the[...]
-
Page 124
3-44 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT If the execute disable bit is enabled in an IA-32 processor, the reserved bits in paging data structures for legacy 32-bit mode and 64-bit mode are shown in Table 3-5. Table 3-4. Reserved Bit Checking When Execute Disable Bit is Disabled Mode Paging Mode Paging Structure Check Bits 32-bit 4-KBy[...]
-
Page 125
Vol. 3A 3-45 PROTECTED-MODE MEMORY MANAGEMENT 3.11 MAPPING SEGMENTS TO PAGES The segmentation and paging mechanisms provided in the IA-32 architecture support a wide variety of approaches to memory management. When segmentation and paging are combined, segments can be mapped to pages in several ways. To implement a flat (unsegmented) address[...]
-
Page 126
3-46 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT 3.12 TRANSLATION LOOKASIDE BUFFERS (TLBS) The processor stores the most recently used page-directory and page-table entries in on-chip caches called translation lookaside buffers or TLBs. The P6 family and Pentium processors have separate TLBs for the data and instruction caches. Also, the P6 famil[...]
-
Page 127
Vol. 3A 3-47 PROTECTED-MODE MEMORY MANAGEMENT • Implicitly by executing a task switch, which automatically changes the contents of the CR3 register. The INVLPG instruction is provided to invalidate a specific page-table entry in the TLB. Normally, this instruction invalidates only an individual TLB entry; however, in some cases, it may invali[...]
-
Page 128
3-48 Vol. 3A PROTECTED-MODE MEMORY MANAGEMENT[...]
-
Page 129
4 Protection[...]
-
Page 130
[...]
-
Page 131
Vol. 3A 4-1 CHAPTER 4 PROTECTION In protected mode, the IA-32 architecture provides a protection mechanism that operates at both the segment level and the page level. This protection mechanism provides the ability to limit access to certain segments or pages based on privilege levels (four privilege levels for segments and two privilege levels [...]
-
Page 132
4-2 Vol. 3A PROTECTION that is based on privilege levels can essentially be disabled while still in protected mode by assigning a privilege level of 0 (most privileged) to all segment selectors and segment descriptors. This action disables the privilege level protection barriers between segments, but other protection checks such as limit[...]
-
Page 133
Vol. 3A 4-3 PROTECTION • Read/write (R/W) flag — (Bit 1 of a page-directory or page-table entry.) Determines the type of access allowed to a page: read only or read-write. Figure 4-1 shows the location of the various fields and flags in the data, code, and system-segment descriptors; Figure 3-6 shows the location of the RPL (or CPL) f[...]
-
Page 134
4-4 Vol. 3A PROTECTION Many different styles of protection schemes can be implemented with these fields and flags. When the operating system creates a descriptor, it places values in these fields and flags in keeping with the particular protection style chosen for an operating system or executive. Application programs do not generally acce[...]
-
Page 135
Vol. 3A 4-5 PROTECTION 4.3 LIMIT CHECKING The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment. The effective value of the limit depends on the setting of the G (granularity) flag (see Figure 4-1). For data segments, the limit also depends on the E (expansion direction) fl[...]
-
Page 136
4-6 Vol. 3A PROTECTION For expand-down data segments, the segment limit has the same function but is interpreted differently. Here, the effective limit specifies the last address that is not allowed to be accessed within the segment; the range of valid offsets is from (effective-limit + 1) to FFFFFFFFH if the B flag is set and from (effective-li[...]
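The expand-down rule can be sketched in Python as shown below. The lower bound (effective limit + 1) and the FFFFFFFFH upper bound with the B flag set come from the text; the FFFFH upper bound with the B flag clear follows the description of expand-down segments in Chapter 3. The sketch checks a single byte offset only and ignores operand size, which is a simplifying assumption, and the function name is hypothetical.

```python
def expand_down_offset_ok(offset: int, effective_limit: int, b_flag: bool) -> bool:
    """Return True if a one-byte access at `offset` falls inside an
    expand-down data segment with the given effective limit."""
    upper_bound = 0xFFFFFFFF if b_flag else 0xFFFF
    # Valid offsets run from (effective_limit + 1) up to the upper bound.
    return effective_limit < offset <= upper_bound
```

With an effective limit of 0FFFH, offset 1000H is the first valid byte, while offset 0FFFH itself would raise an exception on real hardware.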
-
Page 137
Vol. 3A 4-7 PROTECTION • When a segment selector is loaded into a segment register — Certain segment registers can contain only certain descriptor types, for example: — The CS register only can be loaded with a selector for a code segment. — Segment selectors for code segments that are not readable or for system segments cannot be loaded [...]
-
Page 138
4-8 Vol. 3A PROTECTION — On a call or jump through a call gate (or on an interrupt- or exception-handler call through a trap or interrupt gate), the processor automatically checks that the segment descriptor being pointed to by the gate is for a code segment. — On a call or jump to a new task through a task gate (or on an interrupt- or except[...]
-
Page 139
Vol. 3A 4-9 PROTECTION The processor uses privilege levels to prevent a program or task operating at a lesser privilege level from accessing a segment with a greater privilege, except under controlled situations. When the processor detects a privilege level violation, it generates a general-protection exception (#GP). To carry out privilege-le[...]
-
Page 140
4-10 Vol. 3A PROTECTION — Nonconforming code segment (without using a call gate) — The DPL indicates the privilege level that a program or task must be at to access the segment. For example, if the DPL of a nonconforming code segment is 0, only programs running at a CPL of 0 can access the segment. — Call gate — The DPL indicates the numer[...]
-
Page 141
Vol. 3A 4-11 PROTECTION 4.6 PRIVILEGE LEVEL CHECKING WHEN ACCESSING DATA SEGMENTS To access operands in a data segment, the segment selector for the data segment must be loaded into the data-segment registers (DS, ES, FS, or GS) or into the stack-segment register (SS). (Segment registers can be loaded with the MOV, POP, LDS, LES, LFS, LGS, [...]
-
Page 142
4-12 Vol. 3A PROTECTION 4. The procedure in code segment D should be able to access data segment E because code segment D’s CPL is numerically less than the DPL of data segment E. However, the RPL of segment selector E3 (which the code segment D procedure is using to access data segment E) is numerically greater than the DPL of data segment E[...]
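The data-segment check discussed in this example can be summarized in a short Python sketch: access is allowed only when the DPL is numerically greater than or equal to both the CPL and the RPL. The function name is hypothetical, and the privilege values used in the usage note are illustrative rather than taken from the figure.

```python
def data_segment_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """Return True if loading a data-segment selector would pass the
    privilege check: DPL >= max(CPL, RPL), numerically."""
    for name, level in (("CPL", cpl), ("RPL", rpl), ("DPL", dpl)):
        if level not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be in the range 0 to 3")
    return max(cpl, rpl) <= dpl
```

A procedure at CPL 0 using a selector with RPL 3 to reach a segment with DPL 2 fails the check, even though the same procedure would succeed with an RPL of 0, 1, or 2. This is the situation item 4 describes.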
-
Page 143
Vol. 3A 4-13 PROTECTION 4.6.1 Accessing Data in Code Segments In some instances it may be desirable to access data structures that are contained in a code segment. The following methods of accessing data in code segments are possible: • Load a data-segment register with a segment selector for a nonconforming, readable, code segment. • Load[...]
-
Page 144
4-14 Vol. 3A PROTECTION A JMP or CALL instruction can reference another code segment in any of four ways: • The target operand contains the segment selector for the target code segment. • The target operand points to a call-gate descriptor, which contains the segment selector for the target code segment. • The target operand points to a [...]
-
Page 145
Vol. 3A 4-15 PROTECTION • The DPL of the segment descriptor for the destination code segment that contains the called procedure. • The RPL of the segment selector of the destination code segment. • The conforming (C) flag in the segment descriptor for the destination code segment, which determines whether the segment is a conforming (C fl[...]
-
Page 146
4-16 Vol. 3A PROTECTION The RPL of the segment selector that points to a nonconforming code segment has a limited effect on the privilege check. The RPL must be numerically less than or equal to the CPL of the calling procedure for a successful control transfer to occur. So, in the example in Figure 4-7, the RPLs of segment selectors C1 and C2[...]
-
Page 147
Vol. 3A 4-17 PROTECTION In the example in Figure 4-7, code segment D is a conforming code segment. Therefore, calling procedures in both code segment A and B can access code segment D (using either segment selector D1 or D2, respectively), because they both have CPLs that are greater than or equal to the DPL of the conforming code segment. For co[...]
-
Page 148
4-18 Vol. 3A PROTECTION 4.8.3 Call Gates Call gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level protection mechanism. Call gates are also useful for transferring program control between 16-bit and 32-bit code s[...]
-
Page 149
Vol. 3A 4-19 PROTECTION Note that the P flag in a gate descriptor is normally always set to 1. If it is set to 0, a not present (#NP) exception is generated when a program attempts to access the descriptor. The operating system can use the P flag for special purposes. For example, it could be used to track the number of times the gate is used. [...]
-
Page 150
4-20 Vol. 3A PROTECTION • Target code segments referenced by a 64-bit call gate must be 64-bit code segments (CS.L = 1, CS.D = 0). If not, the reference generates a general-protection exception, #GP (CS selector). • Only 64-bit mode call gates can be referenced in IA-32e mode (64-bit mode and compatibility mode). The legacy 32-bit mode[...]
-
Page 151
Vol. 3A 4-21 PROTECTION Figure 4-10. Call-Gate Mechanism Figure 4-11. Privilege Check for Control Transfer with Call Gate[...]
-
Page 152
4-22 Vol. 3A PROTECTION The privilege checking rules are different depending on whether the control transfer was initiated with a CALL or a JMP instruction, as shown in Table 4-1. The DPL field of the call-gate descriptor specifies the numerically highest privilege level from which a calling procedure can access the call gate; that is, to acce[...]
-
Page 153
Vol. 3A 4-23 PROTECTION Call gates allow a single code segment to have procedures that can be accessed at different privilege levels. For example, an operating system located in a code segment may have some services which are intended to be used by both the operating system and application software (such as procedures for handling character I/[...]
-
Page 154
4-24 Vol. 3A PROTECTION Each task must define up to 4 stacks: one for applications code (running at privilege level 3) and one for each of the privilege levels 2, 1, and 0 that are used. (If only two privilege levels are used [3 and 0], then only two stacks must be defined.) Each of these stacks is located in a separate segment and is identified [...]
-
Page 155
Vol. 3A 4-25 PROTECTION 4. Temporarily saves the current values of the SS and ESP registers. 5. Loads the segment selector and stack pointer for the new stack in the SS and ESP registers. 6. Pushes the temporarily saved values for the SS and ESP registers (for the calling procedure) onto the new stack (see Figure 4-13). 7. Copies the number [...]
-
Page 156
4-26 Vol. 3A PROTECTION 4.8.5.1 Stack Switching in 64-bit Mode Although protection-check rules for call gates are unchanged from 32-bit mode, stack-switch changes in 64-bit mode are different. When stacks are switched as part of a 64-bit mode privilege-level change through a call gate, a new SS (stack segment) descriptor is not loaded; 64-bit[...]
-
Page 157
Vol. 3A 4-27 PROTECTION from the stack into the EIP register, it checks that the pointer does not exceed the limit of the current code segment. On a far return at the same privilege level, the processor pops both a segment selector for the code segment being returned to and a return instruction pointer from the stack. Under normal conditions, [...]
-
Page 158
4-28 Vol. 3A PROTECTION new CPL (excluding conforming code segments), the segment register is loaded with a null segment selector. See the description of the RET instruction in Chapter 3, Instruction Set Reference, of the IA-32 Intel Architecture Software Developer’s Manual, Volume 2, for a detailed description of the privilege level [...]
-
Page 159
Vol. 3A 4-29 PROTECTION MSRs and general-purpose registers eliminates all memory accesses except when fetching the target code. Any additional state that needs to be saved to allow a return to the calling procedure must be saved explicitly by the calling procedure or be predefined through programming conventions. 4.8.7.1 SYSENTER and SYSEXIT Inst[...]
-
Page 160
4-30 Vol. 3A PROTECTION When SYSEXIT transfers control to compatibility mode user code when the operand size attribute is 32 bits, the following fields are generated and bits set: • Target code segment — Computed by adding 16 to the value in IA32_SYSENTER_CS. • New CS attributes — L-bit = 0 (go to compatibility mode). • Target instr[...]
-
Page 161
Vol. 3A 4-31 PROTECTION When SYSRET transfers control to 64-bit mode user code using REX.W, the processor gets the privilege level 3 target instruction and stack pointer from: • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48] + 16. • Target instruction — Copies the value in RCX into RIP. • Stack segment — IA[...]
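The selector arithmetic for the 64-bit SYSRET case can be illustrated with a small Python sketch. It only extracts bits 63:48 of a value standing in for IA32_STAR and adds 16, as the text describes; it does not model RPL handling or any other part of the instruction. The function name and the sample MSR value are hypothetical.

```python
def sysret_64bit_target_cs(ia32_star: int) -> int:
    """Compute the target code-segment selector value for SYSRET to
    64-bit mode: IA32_STAR[63:48] + 16."""
    if not 0 <= ia32_star < 1 << 64:
        raise ValueError("IA32_STAR is a 64-bit MSR")
    base_selector = (ia32_star >> 48) & 0xFFFF  # bits 63:48
    return base_selector + 16


# Hypothetical MSR contents: bits 63:48 hold 0x0023.
example_cs = sysret_64bit_target_cs(0x0023_0010_0000_0000)
```

Because each descriptor occupies 8 bytes, adding 16 moves two entries further into the descriptor table, so the operating system has to lay out its user-mode descriptors to match.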
-
Page 162
4-32 Vol. 3A PROTECTION 4.9 PRIVILEGED INSTRUCTIONS Some of the system instructions (called “privileged instructions”) are protected from use by application programs. The privileged instructions control system functions (such as the loading of system registers). They can be executed only when the CPL is 0 (most privileged). If on[...]
-
Page 163
Vol. 3A 4-33 PROTECTION 3. Checking if the pointer offset exceeds the segment limit. 4. Checking if the supplier of the pointer is allowed to access the segment. 5. Checking the offset alignment. The processor automatically performs first, second, and third checks during instruction execution. Software must explicitly request the four[...]
-
Page 164
4-34 Vol. 3A PROTECTION 4.10.2 Checking Read/Write Rights (VERR and VERW Instructions) When the processor accesses any code or data segment it checks the read/write privileges assigned to the segment to verify that the intended read or write operation is allowed. Software can check read/write rights using the VERR (verify for reading) and VER[...]
-
Page 165
Vol. 3A 4-35 PROTECTION 5. If the privilege level and type checks pass, loads the unscrambled limit (the limit scaled according to the setting of the G flag in the segment descriptor) into the destination register and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an inval[...]
-
Page 166
4-36 Vol. 3A PROTECTION Now assume that instead of setting the RPL of the segment selector to 3, the application program sets the RPL to 0 (segment selector D2). The operating system can now access data segment D, because its CPL and the RPL of segment selector D2 are both equal to the DPL of data segment D. Because the application program i[...]
-
Page 167
Vol. 3A 4-37 PROTECTION application program (represented by the code-segment selector pushed onto the stack). If the RPL is less than the application program’s privilege level, the ARPL instruction changes the RPL of the segment selector to match the privilege level of the application program (segment selector D1). Using this instructi[...]
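The adjustment that ARPL performs can be modeled with a brief Python sketch. It follows the rule in the text: when the RPL of the selector passed in by the caller is numerically less than the RPL of the caller's code-segment selector, the RPL is raised to match. The returned boolean stands in for the ZF flag, which ARPL sets when it makes an adjustment; the function name and sample selectors are hypothetical.

```python
from typing import Tuple


def arpl(dest_selector: int, src_selector: int) -> Tuple[int, bool]:
    """Model ARPL: return (possibly adjusted destination selector,
    flag that is True when an adjustment was made)."""
    dest_rpl = dest_selector & 3
    src_rpl = src_selector & 3
    if dest_rpl < src_rpl:
        # Raise the RPL numerically to the caller's privilege level.
        adjusted = (dest_selector & 0xFFFC) | src_rpl
        return adjusted, True
    return dest_selector, False
```

Calling `arpl(0x0010, 0x001B)` turns a selector with RPL 0 into 0x0013, so a later access through it is checked against privilege level 3 rather than the operating system's own level.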
-
Page 168
4-38 Vol. 3A PROTECTION 4.11.1 Page-Protection Flags Protection information for pages is contained in two flags in a page-directory or page-table entry (see Figure 3-14): the read/write flag (bit 1) and the user/supervisor flag (bit 2). The protection checks are applied to both first- and second-level page tables (that is, page directories an[...]
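How the two flags combine across the page directory and page table can be sketched in Python. The bit positions come from the text; the combination rule used here (a page is user-accessible only if both levels mark it as user, and user-writable only if both levels also mark it read/write) is an assumption based on the usual combined-protection behavior for user-mode accesses. Supervisor accesses and the CR0.WP flag are not modeled, and the function name is hypothetical.

```python
RW_FLAG = 1 << 1  # read/write flag, bit 1
US_FLAG = 1 << 2  # user/supervisor flag, bit 2


def user_mode_access(pde: int, pte: int) -> str:
    """Return the access a user-mode program gets to a page, given the
    page-directory entry and page-table entry that map it."""
    if not (pde & US_FLAG and pte & US_FLAG):
        return "none"        # supervisor at either level blocks user access
    if pde & RW_FLAG and pte & RW_FLAG:
        return "read/write"
    return "read-only"       # read-only at either level wins
```

This is why an operating system can write-protect a single page by clearing the R/W flag in its page-table entry while leaving the page-directory entry read/write.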
-
Page 169
Vol. 3A 4-39 PROTECTION read/write accessible. User-mode pages which are read/write or read-only are readable; supervisor-mode pages are neither readable nor writable from user mode. A page-fault exception is generated on any attempt to violate the protection rules. The P6 family, Pentium, and Intel486 processors allow user-mode pages to be wri[...]
-
Page 170
4-40 Vol. 3A PROTECTION Page-level protection can be used to enhance segment-level protection. For example, if a large read-write data segment is paged, the page-protection mechanism can be used to write-protect individual pages. NOTE: * If CR0.WP = 1, access type is determined by the R/W flags of the page-directory and page-table entries. IF C[...]
-
Page 171
Vol. 3A 4-41 PROTECTION While the execute disable bit capability does not introduce new instructions, it does require operating systems to use a PAE-enabled environment and establish a page-granular protection policy for memory pages. If the execute disable bit of a memory page is set, that page can be used only as data. An attempt to execute [...]
-
Page 172
4-42 Vol. 3A PROTECTION tures. Execute-disable bit protection can be activated using the execute-disable bit at any level of the paging structure, irrespective of the corresponding entry in other levels. When execute-disable-bit protection is not activated, the page can be used as code or data. In legacy PAE-enabled mode, Table 4-7 and Tabl[...]
-
Page 173
Vol. 3A 4-43 PROTECTION 4.13.3 Reserved Bit Checking The processor enforces reserved bit checking in paging data structure entries. The bits being checked vary with paging mode and may vary with the size of physical address space. Table 4-9 shows the reserved bits that are checked when the execute disable bit capability is enabled (CR4.PAE [...]
-
Page 174
4-44 Vol. 3A PROTECTION Table 4-10. Reserved Bit Checking With Execute-Disable Bit Capability Not Enabled 4.13.4 Exception Handling When execute disable bit capability is enabled (IA32_EFER.NXE = 1), conditions for a page fault to occur include the same conditions that apply to an IA-32 processor without execute disable bit capability plus[...]
-
Page 175
5 Interrupt and Exception Handling[...]
-
Page 176
[...]
-
Page 177
Vol. 3A 5-1 CHAPTER 5 INTERRUPT AND EXCEPTION HANDLING This chapter describes the processor’s interrupt and exception-handling mechanism when operating in protected mode. Most of the information provided here also applies to interrupt and exception mechanisms used in real-address, virtual-8086 mode, and 64-bit mode. Chapter 15, “8086 Em[...]
-
Page 178
5-2 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.2 EXCEPTION AND INTERRUPT VECTORS To aid in handling exceptions and interrupts, each IA-32 architecture-defined exception and each interrupt condition that requires special handling by the processor is assigned a unique identification number, called a vector. The processor uses the vector ass [...]
-
Page 179
Vol. 3A 5-3 INTERRUPT AND EXCEPTION HANDLING
Table 5-1. Protected-Mode Exceptions and Interrupts
Vector No. | Mnemonic | Description | Type | Error Code | Source
0 | #DE | Divide Error | Fault | No | DIV and IDIV instructions.
1 | #DB | RESERVED | Fault/Trap | No | For Intel use only.
2 | — | NMI Interrupt | Interrupt | No | Nonmaskable external interrupt.
3 | #BP | Breakpoint | T[...]
-
Page 180
5-4 Vol. 3A INTERRUPT AND EXCEPTION HANDLING The processor’s local APIC is normally connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC’s pins can be directed to the local APIC through the system bus (Pentium 4 and Intel Xeon processors) or the APIC serial bus (P6 family and Pentium processors). The I/[...]
-
Page 181
Vol. 3A 5-5 INTERRUPT AND EXCEPTION HANDLING 5.4 SOURCES OF EXCEPTIONS The processor receives exceptions from three sources: • Processor-detected program-error exceptions. • Software-generated exceptions. • Machine-check exceptions. 5.4.1 Program-Error Exceptions The processor generates one or more exceptions when it detects program error[...]
-
Page 182
5-6 Vol. 3A INTERRUPT AND EXCEPTION HANDLING • Faults — A fault is an exception that can generally be corrected and that, once corrected, allows the program to be restarted with no loss of continuity. When a fault is reported, the processor restores the machine state to the state prior to the beginning of execution of the faulting instruc[...]
-
Page 183
Vol. 3A 5-7 INTERRUPT AND EXCEPTION HANDLING For trap-class exceptions, the return instruction pointer points to the instruction following the trapping instruction. If a trap is detected during an instruction which transfers execution, the return instruction pointer reflects the transfer. For example, if a trap is detected while executing a JMP[...]
-
Page 184
5-8 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.7 NONMASKABLE INTERRUPT (NMI) The nonmaskable interrupt (NMI) can be generated in either of two ways: • External hardware asserts the NMI pin. • The processor receives a message on the system bus (Pentium 4 and Intel Xeon processors) or the APIC serial bus (P6 family and Pentium processors) wi[...]
-
Page 185
Vol. 3A 5-9 INTERRUPT AND EXCEPTION HANDLING 5.8.1 Masking Maskable Hardware Interrupts The IF flag can disable the servicing of maskable hardware interrupts received on the processor’s INTR pin or through the local APIC (see Section 5.3.2, “Maskable Hardware Interrupts”). When the IF flag is clear, the processor inhibits interrupts[...]
-
Page 186
5-10 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Manual, Volume 2A, for a detailed description of the operations these instructions are allowed to perform on the IF flag. 5.8.2 Masking Instruction Breakpoints The RF (resume) flag in the EFLAGS register controls the response of the processor to instruction-breakpoint conditions (see the descript[...]
-
Page 187
Vol. 3A 5-11 INTERRUPT AND EXCEPTION HANDLING While priority among these classes listed in Table 5-2 is consistent throughout the architecture, exceptions within each class are implementation-dependent and may vary from processor to processor. The processor first services a pending exception or interrupt from the class which has the highest p[...]
-
Page 188
5-12 Vol. 3A INTERRUPT AND EXCEPTION HANDLING re-generated when the interrupt handler returns execution to the point in the program or task where the exceptions and/or interrupts occurred. 5.10 INTERRUPT DESCRIPTOR TABLE (IDT) The interrupt descriptor table (IDT) associates each exception or interrupt vector with a gate descriptor for the proce[...]
-
Page 189
Vol. 3A 5-13 INTERRUPT AND EXCEPTION HANDLING 5.11 IDT DESCRIPTORS The IDT may contain any of three kinds of gate descriptors: • Task-gate descriptor • Interrupt-gate descriptor • Trap-gate descriptor Figure 5-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The format of a task gate used in an IDT i[...]
-
Page 190
5-14 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.12 EXCEPTION AND INTERRUPT HANDLING The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses the exception or interrupt vector as an ind[...]
-
Page 191
Vol. 3A 5-15 INTERRUPT AND EXCEPTION HANDLING through Section 4.8.6, “Returning from a Called Procedure”). If the index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (see Section 6.3, “Task Switching”). 5.12.1 Exception- or Interrupt-H[...]
-
Page 192
5-16 Vol. 3A INTERRUPT AND EXCEPTION HANDLING When the processor performs a call to the exception- or interrupt-handler procedure: • If the handler procedure is going to be executed at a numerically lower privilege level, a stack switch occurs. When the stack switch occurs: a. The segment selector and stack pointer for the stack to be used by th[...]
-
Page 193
Vol. 3A 5-17 INTERRUPT AND EXCEPTION HANDLING To return from an exception- or interrupt-handler procedure, the handler must use the IRET (or IRETD) instruction. The IRET instruction is similar to the RET instruction except that it restores the saved flags into the EFLAGS register. The IOPL field of the EFLAGS register is restored only if the [...]
-
Page 194
5-18 Vol. 3A INTERRUPT AND EXCEPTION HANDLING An attempt to violate this rule results in a general-protection exception (#GP). The protection mechanism for exception- and interrupt-handler procedures is different in the following ways: • Because interrupt and exception vectors have no RPL, the RPL is not checked on implicit calls to exception a[...]
-
Page 195
Vol. 3A 5-19 INTERRUPT AND EXCEPTION HANDLING 5.12.2 Interrupt Tasks When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling an exception or interrupt with a separate task offers several advantages: • The entire context of the interrupted program or task is saved automatically. •[...]
-
Page 196
5-20 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Figure 5-5. Interrupt Task Switch (diagram labels: Interrupt Vector, IDT, Task Gate, TSS Selector, GDT, TSS Descriptor, TSS Base Address, TSS for Interrupt-Handling Task)[...]
-
Page 197
Vol. 3A 5-21 INTERRUPT AND EXCEPTION HANDLING 5.13 ERROR CODE When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 5-6. The error code resembles a segment selector; however, instead[...]
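As a rough illustration of the selector-like layout mentioned above, the following Python sketch splits a segment-related error code into its fields. The bit positions (EXT in bit 0, IDT in bit 1, TI in bit 2, index in bits 15:3) are taken from the standard IA-32 error-code format rather than from the truncated text here, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class SegmentErrorCode(NamedTuple):
    ext: bool   # bit 0: event was external to the running program
    idt: bool   # bit 1: index refers to a gate descriptor in the IDT
    ti: bool    # bit 2: when idt is clear, True selects the LDT, False the GDT
    index: int  # bits 15:3: descriptor index (or IDT vector when idt is set)


def decode_segment_error_code(code: int) -> SegmentErrorCode:
    """Split a 16-bit segment-related error code into its fields."""
    return SegmentErrorCode(
        ext=bool(code & 0x1),
        idt=bool(code & 0x2),
        ti=bool(code & 0x4),
        index=(code >> 3) & 0x1FFF,
    )
```

A handler model could use the `idt` and `ti` fields to decide which descriptor table the `index` refers to.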
-
Page 198
5-22 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.14 EXCEPTION AND INTERRUPT HANDLING IN 64-BIT MODE In 64-bit mode, interrupt and exception handling is similar to what has been described for non-64-bit modes. The following are the exceptions: • All interrupt handlers pointed to by the IDT are in 64-bit code (this does not apply to the SMI handl[...]
-
Page 199
Vol. 3A 5-23 INTERRUPT AND EXCEPTION HANDLING In 64-bit mode, the IDT index is formed by scaling the interrupt vector by 16. The first eight bytes (bytes 7:0) of a 64-bit mode interrupt gate are similar but not identical to legacy 32-bit interrupt gates. The type field (bits 11:8 in bytes 7:4) is described in Table 3-2. The Interrupt Stack T [...]
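The scaling rule above can be sketched in a few lines of Python. The 16-byte scale for 64-bit mode comes from the text; the 8-byte scale for legacy protected mode is the standard gate-descriptor size and is stated here as an assumption, as are the function and constant names.

```python
LEGACY_GATE_SIZE = 8      # bytes per IDT gate in protected mode (assumed from the standard layout)
LONG_MODE_GATE_SIZE = 16  # bytes per IDT gate in 64-bit mode (from the text above)


def idt_entry_offset(vector: int, long_mode: bool) -> int:
    """Return the byte offset of a vector's gate descriptor from the IDT base."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in the range 0..255")
    scale = LONG_MODE_GATE_SIZE if long_mode else LEGACY_GATE_SIZE
    return vector * scale
```

For example, the page-fault vector (14) lands at offset 224 in 64-bit mode and at offset 112 under the assumed legacy scale.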
-
Page 200
5-24 Vol. 3A INTERRUPT AND EXCEPTION HANDLING 5.14.3 IRET in IA-32e Mode In IA-32e mode, IRET executes with an 8-byte operand size. There is nothing that forces this requirement. The stack is formatted in such a way that for actions where IRET is required, the 8-byte IRET operand size works correctly. Because interrupt stack-frame pushes are a[...]
-
Page 201
Vol. 3A 5-25 INTERRUPT AND EXCEPTION HANDLING In summary, a stack switch in IA-32e mode works like the legacy stack switch, except that a new SS selector is not loaded from the TSS. Instead, the new SS is forced to NULL. 5.14.5 Interrupt Stack Table In IA-32e mode, a new interrupt stack table (IST) mechanism is available as an alternative t[...]
-
Page 202
5-26 Vol. 3A INTERRUPT AND EXCEPTION HANDLING The IST mechanism provides up to seven IST pointers in the TSS. The pointers are referenced by an interrupt-gate descriptor in the interrupt-descriptor table (IDT); see Figure 5-7. The gate descriptor contains a 3-bit IST index field that provides an offset into the IST section of the TSS. Using t[...]
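A minimal sketch of how the 3-bit IST index might map onto the 64-bit TSS, assuming the standard layout in which IST1 starts at byte offset 36 and each pointer occupies 8 bytes (consistent with Figure 6-11). Treating index 0 as "IST not used" is also taken from the standard behaviour rather than from the truncated text, and the names are hypothetical.

```python
from typing import Optional

IST1_OFFSET = 36      # byte offset of IST1 in the 64-bit TSS (assumed from the standard layout)
IST_POINTER_SIZE = 8  # each IST pointer is a 64-bit value


def ist_pointer_offset(ist_index: int) -> Optional[int]:
    """Return the TSS byte offset of the selected IST pointer.

    Returns None for index 0, which is assumed to mean that the gate
    does not use the IST mechanism.
    """
    if not 0 <= ist_index <= 7:
        raise ValueError("the IST index is a 3-bit field (0..7)")
    if ist_index == 0:
        return None
    return IST1_OFFSET + IST_POINTER_SIZE * (ist_index - 1)
```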
-
Page 203
Vol. 3A 5-27 INTERRUPT AND EXCEPTION HANDLING Interrupt 0—Divide Error Exception (#DE) Exception Class Fault. Description Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand. Exception Error Code None. Saved Instruction Pointer S[...]
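The two conditions in the description can be modelled for one concrete case. The sketch below assumes the unsigned 32-bit form of DIV, where a 64-bit dividend (EDX:EAX) is divided by a 32-bit divisor and the quotient must fit in 32 bits; the function name is hypothetical and the signed IDIV case is not covered.

```python
def div32_raises_de(dividend: int, divisor: int) -> bool:
    """Model the #DE conditions for an unsigned DIV with a 32-bit divisor.

    dividend stands for the 64-bit value in EDX:EAX; the quotient must
    fit in the 32-bit destination (EAX).
    """
    if divisor == 0:
        return True  # division by zero
    quotient = dividend // divisor
    return quotient > 0xFFFFFFFF  # result not representable in 32 bits
```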
-
Page 204
5-28 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 1—Debug Exception (#DB) Exception Class Trap or Fault. The exception handler can distinguish between traps and faults by examining the contents of DR6 and the other debug registers. Description Indicates that one or more of several debug-exception conditions has been detected. Whether t[...]
-
Page 205
Vol. 3A 5-29 INTERRUPT AND EXCEPTION HANDLING Interrupt 2—NMI Interrupt Exception Class Not applicable. Description The nonmaskable interrupt (NMI) is generated externally by asserting the processor’s NMI pin or through an NMI request set by the I/O APIC to the local APIC. This interrupt causes the NMI interrupt handler to be called. Excepti[...]
-
Page 206
5-30 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 3—Breakpoint Exception (#BP) Exception Class Trap. Description Indicates that a breakpoint instruction (INT 3) was executed, causing a breakpoint trap to be generated. Typically, a debugger sets a breakpoint by replacing the first opcode byte of an instruction with the opcode for th[...]
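The byte-replacement technique described above can be sketched on an in-memory copy of code. This is only a model: a real debugger would write into the debuggee's address space through an operating-system interface. The one-byte INT 3 encoding (0xCC) is the standard opcode; the helper names are hypothetical.

```python
INT3_OPCODE = 0xCC  # one-byte encoding of the INT 3 instruction


def set_breakpoint(code: bytearray, address: int) -> int:
    """Overwrite the first opcode byte at address with INT 3.

    Returns the original byte so that it can be restored later.
    """
    saved = code[address]
    code[address] = INT3_OPCODE
    return saved


def clear_breakpoint(code: bytearray, address: int, saved: int) -> None:
    """Restore the byte that set_breakpoint replaced."""
    code[address] = saved
```

Because the opcode is a single byte, it can replace the first byte of any instruction without overwriting the instruction that follows.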
-
Page 207
Vol. 3A 5-31 INTERRUPT AND EXCEPTION HANDLING Interrupt 4—Overflow Exception (#OF) Exception Class Trap. Description Indicates that an overflow trap occurred when an INTO instruction was executed. The INTO instruction checks the state of the OF flag in the EFLAGS register. If the OF flag is set, an overflow trap is generated. Some arit[...]
-
Page 208
5-32 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 5—BOUND Range Exceeded Exception (#BR) Exception Class Fault. Description Indicates that a BOUND-range-exceeded fault occurred when a BOUND instruction was executed. The BOUND instruction checks that a signed array index is within the upper and lower bounds of an array located in memor[...]
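The check that BOUND performs can be written as a one-line predicate. The sketch assumes both bounds are inclusive, which matches the usual description of the instruction; the function name is hypothetical.

```python
def bound_raises_br(index: int, lower: int, upper: int) -> bool:
    """Return True when a signed index falls outside [lower, upper]."""
    return index < lower or index > upper
```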
-
Page 209
Vol. 3A 5-33 INTERRUPT AND EXCEPTION HANDLING Interrupt 6—Invalid Opcode Exception (#UD) Exception Class Fault. Description Indicates that the processor did one of the following things: • Attempted to execute an invalid or reserved opcode. • Attempted to execute an instruction with an operand type that is invalid for its accompanying opcod[...]
-
Page 210
5-34 Vol. 3A INTERRUPT AND EXCEPTION HANDLING The opcodes D6 and F1 are undefined opcodes that are reserved by the IA-32 architecture. These opcodes, even though undefined, do not generate an invalid opcode exception. The UD2 instruction is guaranteed to generate an invalid opcode exception. Exception Error Code None. Saved Instruction Pointer[...]
-
Page 211
Vol. 3A 5-35 INTERRUPT AND EXCEPTION HANDLING Interrupt 7—Device Not Available Exception (#NM) Exception Class Fault. Description Indicates one of the following things: The device-not-available exception is generated by any of three conditions: • The processor executed an x87 FPU floating-point instruction while the EM flag in control [...]
-
Page 212
5-36 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the floating-point instruction or the WAIT/FWAIT instruction that generated the exception. Program State Change A program-state change does not accompany a device-not-available fault, because the instruction that ge[...]
-
Page 213
Vol. 3A 5-37 INTERRUPT AND EXCEPTION HANDLING Interrupt 8—Double Fault Exception (#DF) Exception Class Abort. Description Indicates that the processor detected a second exception while calling an exception handler for a prior exception. Normally, when the processor detects another exception while trying to call an exception handler, the [...]
-
Page 214
5-38 Vol. 3A INTERRUPT AND EXCEPTION HANDLING If another exception occurs while attempting to call the double-fault handler, the processor enters shutdown mode. This mode is similar to the state following execution of an HLT instruction. In this mode, the processor stops executing instructions until an NMI interrupt, SMI interrupt, har[...]
-
Page 215
Vol. 3A 5-39 INTERRUPT AND EXCEPTION HANDLING Interrupt 9—Coprocessor Segment Overrun Exception Class Abort. (Intel reserved; do not use. Recent IA-32 processors do not generate this exception.) Description Indicates that an Intel386 CPU-based system with an Intel 387 math coprocessor detected a page or segment violation while transferring th [...]
-
Page 216
5-40 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupt 10—Invalid TSS Exception (#TS) Exception Class Fault. Description Indicates that there was an error related to a TSS. Such an error might be detected during a task switch or during the execution of instructions that use information from a TSS. Table 5-6 shows the conditions that cause an[...]
-
Page 217
Vol. 3A 5-41 INTERRUPT AND EXCEPTION HANDLING This exception can be generated either in the context of the original task or in the context of the new task (see Section 6.3, “Task Switching”). Until the processor has completely verified the presence of the new TSS, the exception is generated in the context of the original task. Once the existenc[...]
-
Page 218
5-42 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Exception Error Code An error code containing the segment selector index for the segment descriptor that caused the violation is pushed onto the stack of the exception handler. If the EXT flag is set, it indicates that the exception was caused by an event external to the currently running program [...]
-
Page 219
Vol. 3A 5-43 INTERRUPT AND EXCEPTION HANDLING Interrupt 11—Segment Not Present (#NP) Exception Class Fault. Description Indicates that the present flag of a segment or gate descriptor is clear. The processor can generate this exception during any of the following operations: • While attempting to load CS, DS, ES, FS, or GS registers. [Detect[...]
-
Page 220
5-44 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers normally point to the instruction that generated the exception. If the exception occurred while loading segment descriptors for the segment selectors in a new TSS, the CS and EIP registers point to the first instruction in the new[...]
-
Page 221
Vol. 3A 5-45 INTERRUPT AND EXCEPTION HANDLING Interrupt 12—Stack Fault Exception (#SS) Exception Class Fault. Description Indicates that one of the following stack related conditions was detected: • A limit violation is detected during an operation that refers to the SS register. Operations that can cause a limit violation include stack-ori[...]
-
Page 222
5-46 Vol. 3A INTERRUPT AND EXCEPTION HANDLING exception. The stack fault handler should thus not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. The exception handler should check all segment registers before trying to resume the new task; otherwise, genera[...]
-
Page 223
Vol. 3A 5-47 INTERRUPT AND EXCEPTION HANDLING Interrupt 13—General Protection Exception (#GP) Exception Class Fault. Description Indicates that the processor detected one of a class of protection violations called “general-protection violations.” The conditions that cause this exception to be generated comprise all the protection violatio[...]
-
Page 224
5-48 Vol. 3A INTERRUPT AND EXCEPTION HANDLING • Loading the CR0 register with a set NW flag and a clear CD flag. • Referencing an entry in the IDT (following an interrupt or exception) that is not an interrupt, trap, or task gate. • Attempting to access an interrupt or exception handler through an interrupt or trap gate from virtual-8086 [...]
-
Page 225
Vol. 3A 5-49 INTERRUPT AND EXCEPTION HANDLING • A selector from a TSS involved in a task switch. • IDT vector number. Saved Instruction Pointer The saved contents of CS and EIP registers point to the instruction that generated the exception. Program State Change In general, a program-state change does not accompany a general-protection ex[...]
-
Page 226
5-50 Vol. 3A INTERRUPT AND EXCEPTION HANDLING • If the segment descriptor from a 64-bit call gate is in non-canonical space. • If the DPL from a 64-bit call-gate is less than the CPL or than the RPL of the 64-bit call-gate. • If the upper type field of a 64-bit call gate is not 0x0. • If an attempt is made to load a null selector in th [...]
-
Page 227
Vol. 3A 5-51 INTERRUPT AND EXCEPTION HANDLING Interrupt 14—Page-Fault Exception (#PF) Exception Class Fault. Description Indicates that, with paging enabled (the PG flag in the CR0 register is set), the processor detected one of the following conditions while using the page-translation mechanism to translate a linear address to a physical ad[...]
-
Page 228
5-52 Vol. 3A INTERRUPT AND EXCEPTION HANDLING — The RSVD flag indicates that the processor detected 1s in reserved bits of the page directory, when the PSE or PAE flags in control register CR4 are set to 1. (The PSE flag is only available in the Pentium 4, Intel Xeon, P6 family, and Pentium processors, and the PAE flag is only available on [...]
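The page-fault error code uses its own flag layout, of which only the RSVD flag is visible in the truncated text above. The sketch below assumes the standard assignment (P in bit 0, W/R in bit 1, U/S in bit 2, RSVD in bit 3, I/D in bit 4); the function and key names are hypothetical.

```python
def decode_page_fault_error_code(code: int) -> dict:
    """Decode the low five flag bits of a page-fault error code."""
    return {
        "protection_violation": bool(code & 0x01),  # P: 0 = page not present
        "write_access": bool(code & 0x02),          # W/R: 0 = read
        "user_mode": bool(code & 0x04),             # U/S: 0 = supervisor mode
        "reserved_bit_set": bool(code & 0x08),      # RSVD: 1s found in reserved bits
        "instruction_fetch": bool(code & 0x10),     # I/D: fault on instruction fetch
    }
```

For instance, an error code of 6 would describe a user-mode write to a page that is not present.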
-
Page 229
Vol. 3A 5-53 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers generally point to the instruction that generated the exception. If the page-fault exception occurred during a task switch, the CS and EIP registers may point to the first instruction of the new task (as described in the following [...]
-
Page 230
5-54 Vol. 3A INTERRUPT AND EXCEPTION HANDLING When executing this code on one of the 32-bit IA-32 processors, it is possible to get a page fault, general-protection fault (#GP), or alignment check fault (#AC) after the segment selector has been loaded into the SS register but before the ESP register has been loaded. At this point, the two part[...]
-
Page 231
Vol. 3A 5-55 INTERRUPT AND EXCEPTION HANDLING Interrupt 16—x87 FPU Floating-Point Error (#MF) Exception Class Fault. Description Indicates that the x87 FPU has detected a floating-point error. The NE flag in the register CR0 must be set for an interrupt 16 (floating-point error exception) to be generated. (See Section 2.5, “Control Register[...]
-
Page 232
5-56 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Prior to executing a waiting x87 FPU instruction or the WAIT/FWAIT instruction, the x87 FPU checks for pending x87 FPU floating-point exceptions (as described in step 2 above). Pending x87 FPU floating-point exceptions are ignored for “non-waiting” x87 FPU instructions, which include the FNI[...]
-
Page 233
Vol. 3A 5-57 INTERRUPT AND EXCEPTION HANDLING Interrupt 17—Alignment Check Exception (#AC) Exception Class Fault. Description Indicates that the processor detected an unaligned memory operand when alignment checking was enabled. Alignment checks are only carried out in data (or stack) accesses (not in code fetches or system segment accesses). A[...]
-
Page 234
5-58 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Alignment-check exceptions (#AC) are generated only when operating at privilege level 3 (user mode). Memory references that default to privilege level 0, such as segment descriptor loads, do not generate alignment-check exceptions, even when caused by a memory reference made from privilege level 3.[...]
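The enabling conditions can be combined into a single predicate. The privilege-level requirement comes from the text above; the requirement that both the AM flag in CR0 and the AC flag in EFLAGS be set is the standard rule and is stated here as an assumption. The sketch also simplifies the per-type alignment table to natural alignment (address divisible by operand size), and the function name is hypothetical.

```python
def alignment_check_raised(address: int, operand_size: int,
                           cr0_am: bool, eflags_ac: bool, cpl: int) -> bool:
    """Model when a data access would raise #AC.

    Assumes natural alignment: the address must be a multiple of the
    operand size in bytes.
    """
    checking_enabled = cr0_am and eflags_ac and cpl == 3
    misaligned = address % operand_size != 0
    return bool(checking_enabled and misaligned)
```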
-
Page 235
Vol. 3A 5-59 INTERRUPT AND EXCEPTION HANDLING Interrupt 18—Machine-Check Exception (#MC) Exception Class Abort. Description Indicates that the processor detected an internal machine error or a bus error, or that an external agent detected a bus error. The machine-check exception is model-specific, available only on the Pentium 4, Intel Xeo[...]
-
Page 236
5-60 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Program State Change The machine-check mechanism is enabled by setting the MCE flag in control register CR4. For the Pentium 4, Intel Xeon, P6 family, and Pentium processors, a program-state change always accompanies a machine-check exception, and an abort class exception is generated. For abort exc[...]
-
Page 237
Vol. 3A 5-61 INTERRUPT AND EXCEPTION HANDLING Interrupt 19—SIMD Floating-Point Exception (#XF) Exception Class Fault. Description Indicates the processor has detected an SSE/SSE2/SSE3 SIMD floating-point exception. The appropriate status flag in the MXCSR register must be set and the particular exception unmasked for this interrupt to be gen[...]
-
Page 238
5-62 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Note that because SIMD floating-point exceptions are precise and occur immediately, the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT instruction, or another SSE/SSE2/SSE3 instruction will catch a pending unmasked SIMD floating-point exception. In situations where a SIMD[...]
-
Page 239
Vol. 3A 5-63 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the SSE/SSE2/SSE3 instruction that was executed when the SIMD floating-point exception was generated. This is the faulting instruction in which the error condition was detected. Program State Change A program-state chan[...]
-
Page 240
5-64 Vol. 3A INTERRUPT AND EXCEPTION HANDLING Interrupts 32 to 255—User Defined Interrupts Exception Class Not applicable. Description Indicates that the processor did one of the following things: • Executed an INT n instruction where the instruction operand is one of the vector numbers from 32 through 255. • Responded to an interrupt r[...]
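The vector assignments listed in the reference pages above can be collected into a small lookup table. The mnemonics and classes are those given in the page headings; treating vector 15 and vectors 20 through 31 as reserved, and labelling vectors 32 through 255 as type "Interrupt", are assumptions based on the usual contents of Table 5-1. The names in the sketch are hypothetical.

```python
from typing import Optional, Tuple

# vector -> (mnemonic or name, exception class), as listed in the pages above
ARCHITECTURAL_VECTORS = {
    0: ("#DE", "Fault"), 1: ("#DB", "Fault/Trap"), 2: ("NMI", "Interrupt"),
    3: ("#BP", "Trap"), 4: ("#OF", "Trap"), 5: ("#BR", "Fault"),
    6: ("#UD", "Fault"), 7: ("#NM", "Fault"), 8: ("#DF", "Abort"),
    9: ("Coprocessor Segment Overrun", "Abort"), 10: ("#TS", "Fault"),
    11: ("#NP", "Fault"), 12: ("#SS", "Fault"), 13: ("#GP", "Fault"),
    14: ("#PF", "Fault"), 16: ("#MF", "Fault"), 17: ("#AC", "Fault"),
    18: ("#MC", "Abort"), 19: ("#XF", "Fault"),
}


def describe_vector(vector: int) -> Tuple[str, Optional[str]]:
    """Return (label, class) for an interrupt or exception vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in the range 0..255")
    if vector in ARCHITECTURAL_VECTORS:
        return ARCHITECTURAL_VECTORS[vector]
    if vector >= 32:
        return ("user-defined", "Interrupt")
    return ("reserved", None)  # assumed: 15 and 20..31 are reserved
```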
-
Page 241
6 Task Management[...]
-
Page 242
[...]
-
Page 243
Vol. 3A 6-1 CHAPTER 6 TASK MANAGEMENT This chapter describes the IA-32 architecture’s task management facilities. These facilities are only available when the processor is running in protected mode. This chapter focuses on 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and the 16-bit TSS structure, see Section 6.[...]
-
Page 244
6-2 Vol. 3A TASK MANAGEMENT 6.1.2 Task State The following items define the state of the currently executing task: • The task’s current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS). • The state of the general-purpose registers. • The state of the EFLAGS register. • The stat[...]
-
Page 245
Vol. 3A 6-3 TASK MANAGEMENT 6.1.3 Executing a Task Software or the processor can dispatch a task for execution in one of the following ways: • An explicit call to a task with the CALL instruction. • An explicit jump to a task with the JMP instruction. • An implicit call (by the processor) to an interrupt-handler task. • An implicit call[...]
-
Page 246
6-4 Vol. 3A TASK MANAGEMENT Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software defined task executed in the context of a single IA-32 architecture task. 6.2 TASK MANAGEMENT DATA STRUCTURES The processor defines five data structures for handlin[...]
-
Page 247
Vol. 3A 6-5 TASK MANAGEMENT The processor updates dynamic fields when a task is suspended during a task switch. The following are dynamic fields: • General-purpose register fields — State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior to the task switch. • Segment selector fields — Segment selectors stored in the ES,[...]
-
Page 248
6-6 Vol. 3A TASK MANAGEMENT • EFLAGS register field — State of the EFLAGS register prior to the task switch. • EIP (instruction pointer) field — State of the EIP register prior to the task switch. • Previous task link field — Contains the segment selector for the TSS of the previous task (updated on a task switch that was initiated b[...]
-
Page 249
Vol. 3A 6-7 TASK MANAGEMENT 6.2.2 TSS Descriptor The TSS, like all other segments, is defined by a segment descriptor. Figure 6-3 shows the format of a TSS descriptor. TSS descriptors may only be placed in the GDT; they cannot be placed in an LDT or the IDT. An attempt to access a TSS using a segment selector with its TI flag set (which ind[...]
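The TI-flag rule above can be illustrated with a small selector decoder. The field positions (RPL in bits 1:0, TI in bit 2, index in bits 15:3) are the standard segment-selector layout; the function names are hypothetical.

```python
from typing import Tuple


def decode_selector(selector: int) -> Tuple[int, int, int]:
    """Return (index, ti, rpl) for a 16-bit segment selector."""
    index = (selector >> 3) & 0x1FFF
    ti = (selector >> 2) & 0x1   # 0 = GDT, 1 = current LDT
    rpl = selector & 0x3
    return index, ti, rpl


def tss_selector_targets_gdt(selector: int) -> bool:
    """A TSS selector is only acceptable when its TI flag is clear (GDT)."""
    _, ti, _ = decode_selector(selector)
    return ti == 0
```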
-
Page 250
6-8 Vol. 3A TASK MANAGEMENT The base, limit, and DPL fields and the granularity and present flags have functions similar to their use in data-segment descriptors (see Section 3.4.5, “Segment Descriptors”). When the G flag is 0 in a TSS descriptor for a 32-bit TSS, the limit field must have a value equal to or greater than 67H, one byte l[...]
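The 67H figure follows from simple arithmetic: a limit is the offset of the last valid byte, so a structure of 104 bytes needs a limit of at least 103 (67H). The sketch below covers only the byte-granular case (G flag 0) described above; the 104-byte minimum size is the standard 32-bit TSS size and is stated as an assumption, as are the names.

```python
MIN_TSS32_SIZE = 104                  # assumed minimum size of a 32-bit TSS in bytes
MIN_TSS32_LIMIT = MIN_TSS32_SIZE - 1  # limit is the last valid offset: 0x67


def tss32_limit_is_valid(limit: int) -> bool:
    """Check a byte-granular (G = 0) limit against the minimum TSS size."""
    return limit >= MIN_TSS32_LIMIT
```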
-
Page 251
Vol. 3A 6-9 TASK MANAGEMENT 6.2.4 Task Register The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (see Figure 2-5). This information is copied from the TSS descriptor in the GDT for the current task. Figu[...]
-
Page 252
6-10 Vol. 3A TASK MANAGEMENT The LTR instruction loads a segment selector (source operand) into the task register that points to a TSS descriptor in the GDT. It then loads the invisible portion of the task register with information from the TSS descriptor. LTR is a privileged instruction that may be executed only when the CPL is 0. It[...]
-
Page 253
Vol. 3A 6-11 TASK MANAGEMENT 6.2.5 Task-Gate Descriptor A task-gate descriptor provides an indirect, protected reference to a task (see Figure 6-6). It can be placed in the GDT, an LDT, or the IDT. The TSS segment selector field in a task-gate descriptor points to a TSS descriptor in the GDT. The RPL in this segment selector is not used. [...]
-
Page 254
6-12 Vol. 3A TASK MANAGEMENT Figure 6-7 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task. 6.3 TASK SWITCHING The processor transfers execution to another task in one of four cases: • The current program, task, or procedure executes a JMP or CALL instruction to a TSS d[...]
-
Page 255
Vol. 3A 6-13 TASK MANAGEMENT JMP, CALL, and IRET instructions, as well as interrupts and exceptions, are all mechanisms for redirecting a program. The referencing of a TSS descriptor or a task gate (when calling or jumping to a task) or the state of the NT flag (when executing an IRET instruction) determines whether a task switch occurs. The pr[...]
-
Page 256
6-14 Vol. 3A TASK MANAGEMENT 10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task’s TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set. 11. Loads the task register with the segment selector[...]
-
Page 257
Vol. 3A 6-15 TASK MANAGEMENT When switching tasks, the new task does not inherit its privilege level from the suspended task. The new task begins executing at the privilege level specified in the CPL field of the CS register, which is loaded from the TSS. Because tasks are isolated by their separate address spaces and T[...]
-
Page 258
6-16 Vol. 3A TASK MANAGEMENT The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicates that the context of the floating-point [...]
-
Page 259
Vol. 3A 6-17 TASK MANAGEMENT Table 6-2 shows the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. The NT flag may be modified by software executing at any privilege level. It is possible for a program to set the NT flag and execute an IRET instructio[...]
-
Page 260
6-18 Vol. 3A TASK MANAGEMENT 6.4.1 Use of Busy Flag To Prevent Recursive Task Switching A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag in the TSS segment descriptor is provided [...]
-
Page 261
Vol. 3A 6-19 TASK MANAGEMENT In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to ensure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared. 6.5 TASK ADDRESS SPACE The address space for a task consi[...]
-
Page 262
6-20 Vol. 3A TASK MANAGEMENT that the mapping of TSS addresses does not change while the processor is reading and updating the TSSs during a task switch. The linear address space mapped by the GDT also should be mapped to a shared area of the physical space; otherwise, the purpose of the GDT is defeated. Figure 6-9 shows how the linear address s[...]
-
Page 263
Vol. 3A 6-21 TASK MANAGEMENT • Through segment descriptors in distinct LDTs that are mapped to common addresses in linear address space — If this common area of the linear address space is mapped to the same area of the physical address space for each task, these segment descriptors permit the tasks to share segments. Such segment descri[...]
-
Page 264
6-22 Vol. 3A TASK MANAGEMENT Figure 6-10. 16-Bit TSS Format (16-bit fields with byte offsets, highest first: Task LDT Selector 42, DS Selector 40, SS Selector 38, CS Selector 36, ES Selector 34, DI 32, SI 30, BP 28, SP 26, BX 24, DX 22, CX 20, AX 18, FLAG Word 16, IP (Entry Point) 14, SS2 12, SP2 10, SS1 8, SP1 6, SS0 4, SP0 2, Previous Task Link 0)[...]
-
Page 265
Vol. 3A 6-23 TASK MANAGEMENT 6.7 TASK MANAGEMENT IN 64-BIT MODE In 64-bit mode, task structure and task state are similar to those in protected mode. However, the task switching mechanism available in protected mode is not supported in 64-bit mode. Task management and switching must be performed by software. The processor issues a gener[...]
-
Page 266
6-24 Vol. 3A TASK MANAGEMENT Figure 6-11. 64-Bit TSS Format (diagram: 32-bit rows at byte offsets 0 through 100 holding the I/O Map Base Address, reserved bits set to 0, and RSP0, RSP1, RSP2, and IST1 onward as lower and upper 32-bit halves)[...]
-
Page 267
7 Multiple-Processor Management[...]
-
Page 268
[...]
-
Page 269
Vol. 3A 7-1 CHAPTER 7 MULTIPLE-PROCESSOR MANAGEMENT The IA-32 architecture provides several mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These mechanisms include: • Bus locking and/or cache coherency management for performing atomic operations on system memory. • Seriali[...]
-
Page 270
7-2 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • To distribute interrupt handling among a group of processors — When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing. • To increase system performance by e[...]
-
Page 271
Vol. 3A 7-3 MULTIPLE-PROCESSOR MANAGEMENT The mechanisms for handling locked atomic operations have evolved as the complexity of IA-32 processors has evolved. As such, more recent IA-32 processors (such as the Pentium 4, Intel Xeon, and P6 family processors) provide a more refined locking mechanism than earlier IA-32 processors. These are describe[...]
-
Page 272
7-4 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT For the Pentium 4, Intel Xeon, and P6 family processors, if the memory area being accessed is cached internally in the processor, the LOCK# signal is generally not asserted; instead, locking is only applied to the processor’s caches (see Section 7.1.4, “Effects of a LOCK Operation on Internal Proce[...]
-
Page 273
Vol. 3A 7-5 MULTIPLE-PROCESSOR MANAGEMENT 7.1.2.2 Software Controlled Bus Locking To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location. An invalid-opcode exception (#UD) is generated when the LOCK prefix is used with any other instruction o[...]
-
Page 274
7-6 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Locked instructions should not be used to ensure that data written can be fetched as instructions. NOTE The locked instructions for the current versions of the Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors allow data written to be fetched as instructions. However, Intel recommends [...]
-
Page 275
Vol. 3A 7-7 MULTIPLE-PROCESSOR MANAGEMENT To write cross-modifying code and ensure that it is compliant with current and future versions of the IA-32 architecture, the following processor synchronization algorithm must be implemented: (* Action of Modifying Processor *) Memory_Flag ← 0; (* Set Memory_Flag to value other than 1 *) Store m[...]
-
Page 276
7-8 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT To allow optimizing of instruction execution, the IA-32 architecture allows departures from the strong-ordering model, called processor ordering, in Pentium 4, Intel Xeon, and P6 family processors. These processor-ordering variations allow performance enhancing operations such as allowing reads to go ahead [...]
-
Page 277
Vol. 3A 7-9 MULTIPLE-PROCESSOR MANAGEMENT 4. Writes can be buffered. 5. Writes are not performed speculatively; they are only performed for instructions that have actually been retired. 6. Data from buffered writes can be forwarded to waiting reads within the processor. 7. Reads or writes cannot pass (be carried out ahead of) I/O instructions[...]
-
Page 278
7-10 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.2.3 Out-of-Order Stores For String Operations in Pentium 4, Intel Xeon, and P6 Family Processors The Pentium 4, Intel Xeon, and P6 family processors modify the processor's operation during the string store operations (initiated with the MOVS and STOS instructions) to maximize performance. Once the [...]
-
Page 279
Vol. 3A 7-11 MULTIPLE-PROCESSOR MANAGEMENT • The initial operation counter (ECX) must be equal to or greater than 64. • Source and destination must not overlap by less than a cache line (64 bytes, Pentium 4 and Intel Xeon processors; 32 bytes, P6 family and Pentium processors). • The memory type for both source and destination addresses must [...]
-
Page 280
7-12 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Program synchronization can also be carried out with serializing instructions (see Section 7.4). These instructions are typically used at critical procedure or task boundaries to force completion of all previous instructions before a jump to a new section of code or a context switch occurs. Like the I/O[...]
-
Page 281
Vol. 3A 7-13 MULTIPLE-PROCESSOR MANAGEMENT It is recommended that software written to run on Pentium 4, Intel Xeon, and P6 family processors assume the processor-ordering model or a weaker memory-ordering model. The Pentium 4, Intel Xeon, and P6 family processors do not implement a strong memory-ordering model, except when using the UC memory t[...]
-
Page 282
7-14 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.4 SERIALIZING INSTRUCTIONS The IA-32 architecture defines several serializing instructions. These instructions force the processor to complete all modifications to flags, registers, and memory by previous instructions and to drain all buffered writes to memory before the next instruction is fetched a[...]
-
Page 283
Vol. 3A 7-15 MULTIPLE-PROCESSOR MANAGEMENT • When an instruction is executed that enables or disables paging (that is, changes the PG flag in control register CR0), the instruction should be followed by a jump instruction. The target instruction of the jump instruction is fetched with the new setting of the PG flag (that is, paging is enabled [...]
-
Page 284
7-16 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • Intel Xeon processors with family, model, and stepping IDs up to F09H — The selection of the BSP and APs (see Section 7.5.1, “BSP and AP Processors”) is handled through arbitration on the system bus, using BIPI and FIPI messages (see Section 7.5.3, “MP Initialization Protocol Algorithm [...]
-
Page 285
Vol. 3A 7-17 MULTIPLE-PROCESSOR MANAGEMENT • All devices in the system that are capable of delivering interrupts to the processors must be inhibited from doing so for the duration of the MP initialization protocol. The time during which interrupts must be inhibited includes the window between when the BSP issues an INIT-SIPI-SIPI sequence to [...]
-
Page 286
7-18 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • The remainder of the processors (which were not selected as the BSP) are designated as APs. They leave their BSP flags in the clear state and enter a “wait-for-SIPI state.” • The newly established BSP broadcasts an FIPI message to “all including self,” which the BSP and APs treat as an end[...]
-
Page 287
Vol. 3A 7-19 MULTIPLE-PROCESSOR MANAGEMENT The following constants and data definitions are used in the accompanying code examples. They are based on the addresses of the APIC registers as defined in Table 8-1. ICR_LOW EQU 0FEE00300H SVR EQU 0FEE000F0H APIC_ID EQU 0FEE00020H LVT3 EQU 0FEE00370H APIC_ENABLED EQU 0100H BOOT_ID DD ? COUNT EQU 00H V[...]
-
Page 288
7-20 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT space (1-MByte space). For example, a vector of 0BDH specifies a start-up memory address of 000BD000H. 11. Enables the local APIC by setting bit 8 of the APIC spurious vector register (SVR). MOV ESI, SVR ; Address of SVR MOV EAX, [ESI] OR EAX, APIC_ENABLED ; Set bit 8 to enable (0 on reset) MOV [ESI], [...]
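The vector-to-address example on this page (0BDH giving 000BD000H) can be sketched in C as follows. This is only an illustration under the assumption that the start-up address is the 8-bit vector multiplied by 4 KBytes; the function name `sipi_start_address` is hypothetical and does not come from the manual.

```c
#include <stdint.h>

/* Hypothetical helper (not from the manual): maps an 8-bit start-up
   vector to a 4-KByte-aligned address in the first 1 MByte,
   for example 0xBD -> 0x000BD000. */
static uint32_t sipi_start_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}
```

Because the vector is 8 bits wide, the result is always below 1 MByte, which is consistent with the "1-MByte space" wording above.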
-
Page 289
Vol. 3A 7-21 MULTIPLE-PROCESSOR MANAGEMENT 16. Waits for the timer interrupt. 17. Reads and evaluates the COUNT variable and establishes a processor count. 18. If necessary, reconfigures the APIC and continues with the remaining system diagnostics as appropriate. 7.5.4.2 Typical AP Initialization Sequence When an AP receives the SIPI, it begi[...]
-
Page 290
7-22 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.5.5 Identifying Logical Processors in an MP System After the BIOS has completed the MP initialization protocol, each logical processor can be uniquely identified by its local APIC ID. Software can access these APIC IDs in either of the following ways: • Read APIC ID for a local APIC — Code running[...]
-
Page 291
Vol. 3A 7-23 MULTIPLE-PROCESSOR MANAGEMENT For P6 family processors, the APIC ID that is assigned to a processor during power-up and initialization is 4 bits (see Figure 7-2). Here, bits 0 and 1 form a 2-bit processor (or socket) identifier and bits 2 and 3 form a 2-bit cluster ID. 7.6 HYPER-THREADING AND MULTI-CORE TECHNOLOGY Hyper-Threading[...]
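The 4-bit P6 family APIC ID layout described on this page (bits 0 and 1 for the processor or socket identifier, bits 2 and 3 for the cluster ID) can be decoded as in the sketch below. The type and function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the two fields named in the text. */
typedef struct {
    uint8_t processor_id; /* APIC ID bits 1:0 */
    uint8_t cluster_id;   /* APIC ID bits 3:2 */
} p6_apic_id_fields;

/* Splits a 4-bit P6 family APIC ID into its two 2-bit fields. */
static p6_apic_id_fields decode_p6_apic_id(uint8_t apic_id)
{
    p6_apic_id_fields f;
    f.processor_id = (uint8_t)(apic_id & 0x3u);
    f.cluster_id = (uint8_t)((apic_id >> 2) & 0x3u);
    return f;
}
```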
-
Page 292
7-24 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.7 DETECTING HARDWARE MULTI-THREADING SUPPORT AND TOPOLOGY Use the CPUID instruction to detect the presence of hardware multi-threading support in a physical processor. The following can be interpreted: • Hardware Multi-Threading feature flag (CPUID.1:EDX[28] = 1) — Indicates when set that[...]
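As a minimal sketch of the feature-flag test named above, the function below checks bit 28 of an EDX value that the caller has already obtained from CPUID with EAX = 1. How CPUID is executed is compiler-specific and is deliberately left out; the function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* cpuid1_edx: the EDX value returned by CPUID with EAX = 1,
   supplied by the caller. Returns true when bit 28 is set. */
static bool hw_multithreading_flag(uint32_t cpuid1_edx)
{
    return ((cpuid1_edx >> 28) & 1u) != 0;
}
```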
-
Page 293
Vol. 3A 7-25 MULTIPLE-PROCESSOR MANAGEMENT 7.7.2 Initializing Dual-Core IA-32 Processors The initialization process for an MP system that contains dual-core IA-32 processors is the same as for conventional MP systems (see Section 7.5, “Multiple-Processor (MP) Initialization”). A logical processor in one core is selected as the BSP; other logi [...]
-
Page 294
7-26 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.8 INTEL® HYPER-THREADING TECHNOLOGY ARCHITECTURE Figure 7-4 shows a generalized view of an IA-32 processor supporting Hyper-Threading Technology, using the Intel Xeon processor MP as an example. This implementation of the Hyper-Threading Technology consists of two logical processors (each r[...]
-
Page 295
Vol. 3A 7-27 MULTIPLE-PROCESSOR MANAGEMENT 7.8.1 State of the Logical Processors The following features are part of the architectural state of logical processors within IA-32 processors supporting Hyper-Threading Technology. The features can be subdivided into three groups: • Duplicated for each logical processor • Shared by logical proce[...]
-
Page 296
7-28 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT • Debug registers (DR0, DR1, DR2, DR3, DR6, DR7) and the debug control MSRs • Machine check global status (IA32_MCG_STATUS) and machine check capability (IA32_MCG_CAP) MSRs • Thermal clock modulation and ACPI Power management control MSRs • Time stamp counter MSRs • Most of the other MSR reg[...]
-
Page 297
Vol. 3A 7-29 MULTIPLE-PROCESSOR MANAGEMENT of memory, independent of the processor on which it is running. See Section 10.11, “Memory Type Range Registers (MTRRs),” for information on setting up MTRRs. 7.8.4 Page Attribute Table (PAT) Each logical processor has its own PAT MSR (IA32_CR_PAT). However, as described in Section 10.[...]
-
Page 298
7-30 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT The performance counter interrupts, events, and precise event monitoring support can be set up and allocated on a per thread (per logical processor) basis. See Section 18.14, “Performance Monitoring and Hyper-Threading Technology,” for a discussion of performance monitoring in the Intel Xeon [...]
-
Page 299
Vol. 3A 7-31 MULTIPLE-PROCESSOR MANAGEMENT 7.8.12 Self Modifying Code IA-32 processors supporting Hyper-Threading Technology support self-modifying code, where data writes modify instructions cached or currently in flight. They also support cross-modifying code, where on an MP system writes generated by one processor modify instructions cached[...]
-
Page 300
7-32 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Entries in the TLBs are tagged with an ID that indicates the logical processor that initiated the translation. This tag applies even for translations that are marked global using the page global feature for memory paging. When a logical processor performs a TLB invalidation operation, only the TLB[...]
-
Page 301
Vol. 3A 7-33 MULTIPLE-PROCESSOR MANAGEMENT vector tables for one or both of the logical processors. Typically in MP systems, the LINT0 and LINT1 pins are not used to deliver interrupts to the logical processors. Instead all interrupts are delivered to the local processors through the I/O APIC. • A20M# pin — On an IA-32 processor, the A20M# p[...]
-
Page 302
7-34 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.9.2 Memory Type Range Registers (MTRR) MTRR is shared between two logical processors sharing a processor core if the physical processor supports Hyper-Threading Technology. MTRR is not shared between logical processors located in different cores or different physical packages. IA-32 architecture[...]
-
Page 303
Vol. 3A 7-35 MULTIPLE-PROCESSOR MANAGEMENT 7.10 PROGRAMMING CONSIDERATIONS FOR HARDWARE MULTI-THREADING CAPABLE PROCESSORS In a multi-threading environment, there may be certain hardware resources that are physically shared at some level of the hardware topology. In the multi-processor systems, typically bus and memory sub-systems are physic[...]
-
Page 304
7-36 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT The value of valid APIC_IDs need not be contiguous across package boundary or core boundaries. 7.10.2 Identifying Logical Processors in an MP System For any IA-32 processor, system hardware establishes an initial APIC ID that is unique for each logical processor following power-up or RESET (see S[...]
-
Page 305
Vol. 3A 7-37 MULTIPLE-PROCESSOR MANAGEMENT Table 7-2 shows the initial APIC IDs for a hypothetical situation with a dual processor system, with each physical package providing two processor cores and each processor core also supporting Hyper-Threading Technology. Table 7-1. Initial APIC IDs for the Logical Processors in a System that has Four MP-[...]
-
Page 306
7-38 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.10.3 Algorithm for Three-Level Mappings of APIC_ID Software can gather the initial APIC_IDs for each logical processor supported by the operating system at runtime and extract identifiers corresponding to the three levels of sharing topology (package, core, and SMT). The algorithms below focus on a n[...]
-
Page 307
Vol. 3A 7-39 MULTIPLE-PROCESSOR MANAGEMENT unsigned int HWMTSupported(void) { try { // verify cpuid instruction is supported execute cpuid with eax = 0 to get vendor string execute cpuid with eax = 1 to get feature flag and signature } except (EXCEPTION_EXECUTE_HANDLER) { return 0; // CPUID is not supported; So HW Multi-threading capability i[...]
-
Page 308
7-40 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT store returned value of eax return (unsigned) ((reg_eax >> 26) + 1); } else // must be a single-core processor return 1; } 4. Extract the initial APIC ID of a logical processor. #define INITIAL_APIC_ID_BITS 0xFF000000 // EBX[31:24] initial APIC ID // Returns the 8-bit unique initial APIC ID for the[...]
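The two fragments on this page can be written as pure C functions that take register values supplied by the caller. This is a sketch, not the manual's listing: the function names are hypothetical, and the assumption that `reg_eax` comes from CPUID leaf 4 and that the initial APIC ID comes from EBX of CPUID leaf 1 is not visible in the excerpt itself.

```c
#include <stdint.h>

#define INITIAL_APIC_ID_BITS 0xFF000000u /* EBX[31:24]: initial APIC ID */

/* Assumption: cpuid4_eax is EAX from CPUID leaf 4; bits 31:26 hold the
   core count minus one, matching "(reg_eax >> 26) + 1" in the text. */
static unsigned cores_per_package(uint32_t cpuid4_eax)
{
    return (unsigned)(cpuid4_eax >> 26) + 1u;
}

/* Assumption: cpuid1_ebx is EBX from CPUID leaf 1. */
static uint8_t initial_apic_id(uint32_t cpuid1_ebx)
{
    return (uint8_t)((cpuid1_ebx & INITIAL_APIC_ID_BITS) >> 24);
}
```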
-
Page 309
Vol. 3A 7-41 MULTIPLE-PROCESSOR MANAGEMENT 6. Extract a sub ID given a full ID, maximum sub ID value and shift count. // Returns the value of the sub ID, this is not a zero-based value unsigned char GetSubID(unsigned char Full_ID, unsigned char MaxSubIDValue, unsigned char Shift_Count) { MaskWidth = FindMaskWidth(MaxSubIDValue); MaskBits = ((uch[...]
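Since the listing above is cut off, the following is one possible way to complete the idea in C, not the manual's exact code. It assumes that the mask width is the number of bits needed to encode values 0 to MaxSubIDValue - 1 and that the sub ID is returned in place (not shifted down), as the "not a zero-based value" comment suggests. Both function bodies are assumptions.

```c
#include <stdint.h>

/* Assumed behavior: number of bits needed to represent 0 .. max_count-1. */
static unsigned find_mask_width(unsigned max_count)
{
    unsigned width = 0;
    if (max_count == 0)
        return 0;
    max_count--;
    while (max_count != 0) {
        width++;
        max_count >>= 1;
    }
    return width;
}

/* Returns the sub ID field left in its original bit position.
   shift_count is assumed to be smaller than 8. */
static uint8_t get_sub_id(uint8_t full_id, uint8_t max_sub_id_value,
                          uint8_t shift_count)
{
    unsigned width = find_mask_width(max_sub_id_value);
    unsigned mask = (0xFFu << shift_count) ^ (0xFFu << (shift_count + width));
    return (uint8_t)(full_id & mask & 0xFFu);
}
```

For example, with two logical processors per core and four cores per package, the SMT field would occupy bit 0 and the core field bits 2:1 of the initial APIC ID.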
-
Page 310
7-42 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT CORE_ID, assuming the number of physical packages in each node of a clustered system is symmetric. • Assemble the three-level identifiers of SMT_ID, CORE_ID, PACKAGE_IDs into arrays for each enabled logical processor. This is shown in Example 7-3a. • To detect the number of physical packages: use[...]
-
Page 311
Vol. 3A 7-43 MULTIPLE-PROCESSOR MANAGEMENT Example 7-3 Compute the Number of Packages, Cores, and Processor Relationships in a MP System a) Assemble lists of PACKAGE_ID, CORE_ID, and SMT_ID of each enabled logical processors //The BIOS and/or OS may limit the number of logical processors available to applications // after system boot. The bel[...]
-
Page 312
7-44 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT The algorithm below assumes there is symmetry across package boundary if more than one socket is populated in an MP system. // Bucket Package IDs and compute processor mask for every package. PackageNum = 1; PackageIDBucket[0] = PackageID[0]; ProcessorMask = 1; PackageProcessorMask[0] = ProcessorMask; For[...]
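The bucketing idea in the truncated pseudocode above can be sketched in C as follows. This is an illustrative version, not the manual's listing: the function name, the `MAX_CPUS` limit, and the array-based interface are assumptions. It groups logical processors by PACKAGE_ID and builds one bit mask of logical processors per package.

```c
#include <stdint.h>

#define MAX_CPUS 32 /* assumed upper bound, so one 32-bit mask suffices */

/* package_id[i] is the PACKAGE_ID of logical processor i (caller-supplied).
   bucket[] and mask[] must each hold MAX_CPUS entries.
   Returns the number of distinct packages found. */
static unsigned bucket_packages(const uint8_t *package_id, unsigned num_cpus,
                                uint8_t *bucket, uint32_t *mask)
{
    unsigned package_num = 0;
    for (unsigned cpu = 0; cpu < num_cpus && cpu < MAX_CPUS; cpu++) {
        uint32_t cpu_bit = (uint32_t)1 << cpu;
        unsigned i;
        for (i = 0; i < package_num; i++) {
            if (bucket[i] == package_id[cpu]) {
                mask[i] |= cpu_bit; /* found in an existing bucket */
                break;
            }
        }
        if (i == package_num) { /* no match: start a new bucket */
            bucket[i] = package_id[cpu];
            mask[i] = cpu_bit;
            package_num++;
        }
    }
    return package_num;
}
```

The same loop shape could be reused for cores by bucketing on the combined package and core identifier, as the next page of the excerpt does.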
-
Page 313
Vol. 3A 7-45 MULTIPLE-PROCESSOR MANAGEMENT If ((PackageID[ProcessorNum] | CoreID[ProcessorNum]) == CoreIDBucket[i]) { CoreProcessorMask[i] |= ProcessorMask; Break; // found in existing bucket, skip to next iteration } } if (i == CoreNum) { //Did not match any bucket, start new bucket CoreIDBucket[i] = PackageID[ProcessorNum] | CoreID[ProcessorNum][...]
-
Page 314
7-46 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT 7.11.2 PAUSE Instruction The PAUSE instruction improves the performance of IA-32 processors supporting Hyper-Threading Technology when executing “spin-wait loops” and other routines where one thread is accessing a shared lock or semaphore in a tight polling loop. When executing a spin-wait [...]
-
Page 315
Vol. 3A 7-47 MULTIPLE-PROCESSOR MANAGEMENT 7.11.4 MONITOR/MWAIT Instruction Operating systems usually implement idle loops to handle thread synchronization. In a typical idle-loop scenario, there could be several “busy loops” and they would use a set of memory locations. An impacted processor waits in a loop and polls a memory location[...]
-
Page 316
7-48 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared. Software should not allow for voluntary context switches in between MONITOR/MW[...]
-
Page 317
Vol. 3A 7-49 MULTIPLE-PROCESSOR MANAGEMENT These above two values bear no relationship to cache line size in the system and software should not make any assumptions to that effect. Within a single-cluster system, the two parameters should default to be the same (the size of the monitor triggering area is the same as the system coherence line[...]
-
Page 318
7-50 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT PAUSE ; Short delay JMP Spin_Lock Get_Lock: MOV EAX, 1 XCHG EAX, lockvar ; Try to get lock CMP EAX, 0 ; Test if successful JNE Spin_Lock Critical_Section: <critical section code> MOV lockvar, 0 ... Continue: The spin-wait loop above uses a “test, test-and-set” technique for determining the ava[...]
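For readers more comfortable with C than assembly, the same “test, test-and-set” pattern might look like the sketch below, using C11 atomics. This is an assumption-laden illustration rather than a translation of the manual's listing: the `cpu_relax` macro is hypothetical, relies on GCC/Clang inline assembly to emit PAUSE on x86, and falls back to a no-op elsewhere.

```c
#include <stdatomic.h>

/* Hypothetical macro: emits PAUSE on x86 with GCC/Clang, otherwise nothing. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define cpu_relax() __asm__ __volatile__("pause")
#else
#define cpu_relax() ((void)0)
#endif

static void spin_lock(atomic_int *lockvar)
{
    for (;;) {
        /* Test: poll with plain loads until the lock looks free. */
        while (atomic_load_explicit(lockvar, memory_order_relaxed) != 0)
            cpu_relax();
        /* Test-and-set: try to take the lock with an atomic exchange. */
        if (atomic_exchange_explicit(lockvar, 1, memory_order_acquire) == 0)
            return;
    }
}

static void spin_unlock(atomic_int *lockvar)
{
    atomic_store_explicit(lockvar, 0, memory_order_release);
}
```

Polling with ordinary loads before attempting the exchange keeps the waiting thread from issuing repeated locked operations while the lock is held.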
-
Page 319
Vol. 3A 7-51 MULTIPLE-PROCESSOR MANAGEMENT The MONITOR and MWAIT instructions may be considered for use in the C0 idle state loops, if MONITOR and MWAIT are supported. Example 7-6 An OS Idle Loop with MONITOR/MWAIT in the C0 Idle Loop // WorkQueue is a memory location indicating there is a thread // ready to run. A non-zero value for WorkQueu[...]
-
Page 320
7-52 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT other logical processors in the physical package. For this reason, halting idle logical processors optimizes the performance. If all logical processors within a physical package are halted, the processor will enter a power-saving state. 7.11.6.4 Potential Usage of MONITOR/MWAIT in C1 Idle Loops An o[...]
-
Page 321
Vol. 3A 7-53 MULTIPLE-PROCESSOR MANAGEMENT 7.11.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources Because the logical processors share execution resources, the order in which threads are dispatched to logical processors for execution can affect the overall efficiency of a system. The following guidelines are recommended for schedu[...]
-
Page 322
7-54 Vol. 3A MULTIPLE-PROCESSOR MANAGEMENT[...]
-
Page 323
8 Advanced Programmable Interrupt Controller (APIC)[...]
-
Page 324
[...]
-
Page 325
Vol. 3A 8-1 CHAPTER 8 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the IA-32 processors with the Pentium processor (see Section 17.26, “Advanced Programmable Interrupt Controller (APIC)”) and is included[...]
-
Page 326
8-2 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Local APICs can receive interrupts from the following sources: • Locally connected I/O devices — These interrupts originate as an edge or level asserted by an I/O device that is connected directly to the processor’s local interrupt pins (LINT0 and LINT1). The I/O devices may al[...]
-
Page 327
Vol. 3A 8-3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Xeon processors) or on the APIC bus (for Pentium and P6 family processors). See Section 8.2, “System Bus Vs. APIC Bus.” IPIs can be sent to other IA-32 processors in the system or to the originating processor (self-interrupts). When the target processor receives an IPI message, i[...]
-
Page 328
8-4 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) processors through the local interrupt pins; however, this mechanism is commonly not used in MP systems. Figure 8-2. Local APICs and I/O APIC When Intel Xeon Processors Are Used in Multiple-Processor Systems Figure 8-3. Local APICs and I/O APIC When P6 Family Processors Are Used[...]
-
Page 329
Vol. 3A 8-5 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IPI mechanism is typically used in MP systems to send fixed interrupts (interrupts for a specific vector number) and special-purpose interrupts to processors on the system bus. For example, a local APIC can use an IPI to forward a fixed interrupt to another processor for servicin[...]
-
Page 330
8-6 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.1 The Local APIC Block Diagram Figure 8-4 gives a functional block diagram for the local APIC. Software interacts with the local APIC by reading and writing its registers. APIC registers are memory-mapped to a 4-KByte region of the processor’s physical address space with an init[...]
-
Page 331
Vol. 3A 8-7 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Figure 8-4. Local APIC Structure. Block labels: Current Count Register, Initial Count Register, Divide Configuration Register, Version Register, Error Status Register, In-Service Register (ISR), Vector Decode, Interrupt Command Register (ICR), Acceptance Logic, Vec[3:0] & TMR Bit, Register Select[...]
-
Page 332
8-8 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 8-1 shows how the APIC registers are mapped into the 4-KByte APIC register space. Registers are 32 bits, 64 bits, or 256 bits in width; all are aligned on 128-bit boundaries. All 32-bit registers should be accessed using 128-bit aligned 32-bit loads or stores. Some processors [...]
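The layout rule above (a 4-KByte register space with every register on a 128-bit boundary) can be illustrated with a few lines of C. The offsets below are derived from the constants in the MP initialization example earlier in this excerpt (for instance ICR_LOW EQU 0FEE00300H); the base value 0xFEE00000 is inferred from those constants, and the macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define APIC_BASE       0xFEE00000u /* inferred from the example constants */
#define APIC_ID_OFFSET  0x020u
#define SVR_OFFSET      0x0F0u
#define ICR_LOW_OFFSET  0x300u
#define LVT3_OFFSET     0x370u

/* A register offset is plausible if it lies inside the 4-KByte space
   and is aligned on a 128-bit (16-byte) boundary. */
static bool apic_offset_is_valid(uint32_t offset)
{
    return offset < 0x1000u && (offset & 0xFu) == 0;
}

/* Physical address of a register given the APIC base and its offset. */
static uint32_t apic_register_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```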
-
Page 333
Vol. 3A 8-9 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.2 Presence of the Local APIC Beginning with the P6 family processors, the presence or absence of an on-chip local APIC can be detected using the CPUID instruction. When the CPUID instruction is executed with a source operand of 1 in the EAX register, bit 9 of the CPUID feature [...]
-
Page 334
8-10 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.3 Enabling or Disabling the Local APIC The local APIC can be enabled or disabled in either of two ways: 1. Using the APIC global enable/disable flag in the IA32_APIC_BASE MSR (MSR address 1BH; see Figure 8-5): — When IA32_APIC_BASE[11] is 0, the processor is functionally equi[...]
-
Page 335
Vol. 3A 8-11 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.4 Local APIC Status and Location The status and location of the local APIC are contained in the IA32_APIC_BASE MSR (see Figure 8-5). MSR bit functions are described below: • BSP flag, bit 8 ⎯ Indicates if the processor is the bootstrap processor (BSP). See Section 7.5, “[...]
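A small sketch of how software might test the two IA32_APIC_BASE bits mentioned in this excerpt (the BSP flag in bit 8 and the global enable flag in bit 11) is shown below. It works on an MSR value that the caller has already read; reading the MSR itself requires privileged code and is not shown. The macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define IA32_APIC_BASE_BSP    (1ull << 8)  /* BSP flag */
#define IA32_APIC_BASE_ENABLE (1ull << 11) /* APIC global enable flag */

/* msr: value of IA32_APIC_BASE (MSR address 1BH), supplied by the caller. */
static bool apic_base_is_bsp(uint64_t msr)
{
    return (msr & IA32_APIC_BASE_BSP) != 0;
}

static bool apic_base_is_enabled(uint64_t msr)
{
    return (msr & IA32_APIC_BASE_ENABLE) != 0;
}
```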
-
Page 336
8-12 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.6 Local APIC ID At power up, system hardware assigns a unique APIC ID to each local APIC on the system bus (for Pentium 4 and Intel Xeon processors) or on the APIC bus (for P6 family and Pentium processors). The hardware assigned APIC ID is based on system topology and includes enco[...]
-
Page 337
Vol. 3A 8-13 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.7.1 Local APIC State After Power-Up or Reset Following a power-up or RESET of the processor, the state of the local APIC and its registers is as follows: • The following registers are reset to all 0s: • IRR, ISR, TMR, ICR, LDR, and TPR • Timer initial count and timer current c[...]
-
Page 338
8-14 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.4.7.3 Local APIC State After an INIT Reset (“Wait-for-SIPI” State) An INIT reset of the processor can be initiated in either of two ways: • By asserting the processor’s INIT# pin. • By sending the processor an INIT IPI (an IPI with the delivery mode set to INIT). Upon [...]
-
Page 339
Vol. 3A 8-15 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.5 HANDLING LOCAL INTERRUPTS The following sections describe facilities that are provided in the local APIC for handling local interrupts. These include: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring counters, the thermal sensor, and the in[...]
-
Page 340
8-16 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) monitor register and its associated interrupt were introduced in the Pentium 4 and Intel Xeon processors. As shown in Figure 8-8, some of these fields and flags are not available (and reserved) for some entries. Figure 8-8. Local Vector Table (LVT) [...]
-
Page 341
Vol. 3A 8-17 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The setup information that can be specified in the registers of the LVT table is as follows: Vector: Interrupt vector number. Delivery Mode: Specifies the type of interrupt to be sent to the processor. Some delivery modes will only operate as intended when used in conjunction wi[...]
-
Page 342
8-18 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Remote IRR Flag (Read Only): For fixed mode, level-triggered interrupts; this flag is set when the local APIC accepts the interrupt for servicing and is reset when an EOI command is received from the processor. The meaning of this flag is undefined for edge-triggered interrupts and o[...]
-
Page 343
Vol. 3A 8-19 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.5.3 Error Handling The local APIC provides an error status register (ESR) that it uses to record errors that it detects when handling interrupts (see Figure 8-9). An APIC error interrupt is generated when the local APIC sets one of the error bits in the ESR. The LVT error register a[...]
-
Page 344
8-20 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ESR is a write/read register. A write (of any value) to the ESR must be done just prior to reading the ESR to update the register. This initial write causes the ESR contents to be updated with the latest error status. Back-to-back writes clear the ESR register. After an error [...]
-
Page 345
Vol. 3A 8-21 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The time base for the timer is derived from the processor’s bus clock, divided by the value specified in the divide configuration register. The timer can be configured through the timer LVT entry for one-shot or periodic operation. In one-shot mode, the timer is started by [...]
-
Page 346
8-22 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.5.5 Local Interrupt Acceptance When a local interrupt is sent to the processor core, it is subject to the acceptance criteria specified in the interrupt acceptance flow chart in Figure 8-17. If the interrupt is accepted, it is logged into the IRR register and handled by the proc[...]
-
Page 347
Vol. 3A 8-23 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ICR consists of the following fields. Vector: The vector number of the interrupt being sent. Delivery Mode: Specifies the type of IPI to be sent. This field is also known as the IPI message type field. 000 (Fixed): Delivers the interrupt specified in the vector field to the target [...]
-
Page 348
8-24 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) send a lowest priority IPI is model specific and should be avoided by BIOS and operating system software. 010 (SMI): Delivers an SMI interrupt to the target processor or processors. The vector field must be programmed to 00H for future compatibility. 011 (Reserved) 100 (NMI): D[...]
-
Page 349
Vol. 3A 8-25 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Destination Mode: Selects either physical (0) or logical (1) destination mode (see Section 8.6.2, “Determining IPI Destination”). Delivery Status (Read Only): Indicates the IPI delivery status, as follows: 0 (Idle): There is currently no IPI activity for this local APIC, or the pre[...]
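To make the ICR field descriptions more concrete, the sketch below packs a few of the fields into the low doubleword of the ICR. The field encodings (000 Fixed, 010 SMI, 100 NMI; destination mode 0 or 1; shorthand 11 for All Excluding Self) come from the excerpt. The bit positions (vector in bits 7:0, delivery mode in bits 10:8, destination mode in bit 11, destination shorthand in bits 19:18) are taken from the ICR figure in the full manual, which this excerpt does not reproduce, so treat them as an assumption here. All identifiers are hypothetical.

```c
#include <stdint.h>

enum { ICR_FIXED = 0x0, ICR_SMI = 0x2, ICR_NMI = 0x4 };      /* delivery mode */
enum { ICR_DEST_PHYSICAL = 0, ICR_DEST_LOGICAL = 1 };        /* destination mode */
enum { ICR_SH_NONE = 0x0, ICR_SH_ALL_EXCL_SELF = 0x3 };      /* shorthand */

/* Builds the low 32 bits of an ICR value from the selected fields.
   Other fields (level, trigger mode) are left at 0 in this sketch. */
static uint32_t icr_low(uint8_t vector, unsigned delivery_mode,
                        unsigned dest_mode, unsigned shorthand)
{
    return (uint32_t)vector
         | ((uint32_t)(delivery_mode & 0x7u) << 8)
         | ((uint32_t)(dest_mode & 0x1u) << 11)
         | ((uint32_t)(shorthand & 0x3u) << 18);
}
```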
-
Page 350
8-26 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) sors and to FFH for Pentium 4 and Intel Xeon processors. 11: (All Excluding Self) The IPI is sent to all processors in a system with the exception of the processor sending the IPI. The APIC broadcasts a message with the physical destination mode and destination field set to 0xF[...]
-
Page 351
Vol. 3A 8-27 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) All Excluding Self | Valid | Edge | Fixed, Lowest Priority (notes 1, 4), NMI, INIT, SMI, Start-Up | X. All Excluding Self | Invalid (note 2) | Level | Fixed, Lowest Priority (note 4), NMI, INIT, SMI, Start-Up | X. NOTES: 1. The ability of a processor to send a lowest priority IPI is model specific. 2. For these[...]
-
Page 352
8-28 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.6.2 Determining IPI Destination The destination of an IPI can be one, all, or a subset (group) of the processors on the system bus. The sender of the IPI specifies the destination of an IPI with the following APIC registers and fields within the registers: • ICR Register — The [...]
-
Page 353
Vol. 3A 8-29 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) NOTE The number of local APICs that can be addressed on the system bus may be restricted by hardware. 8.6.2.2 Logical Destination Mode In logical destination mode, IPI destination is specified using an 8-bit message destination address (MDA), which is entered in the destination field [...]
-
Page 354
8-30 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The interpretation of MDA for the two models is described in the following paragraphs. 1. Flat Model — This model is selected by programming DFR bits 28 through 31 to 1111. Here, a unique logical APIC ID can be established for up to 8 local APICs by se[...]
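The flat model can be pictured with a short C sketch. The excerpt is cut off before it states the matching rule, so the following relies on the rule given in the full manual: each local APIC sets a different bit in its 8-bit logical APIC ID, and an APIC is selected when the bit-wise AND of the MDA and its logical ID is non-zero. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assigns a one-hot logical APIC ID to one of up to 8 local APICs
   (apic_index 0..7; higher values wrap in this sketch). */
static uint8_t flat_logical_id(unsigned apic_index)
{
    return (uint8_t)(1u << (apic_index & 7u));
}

/* Assumed matching rule: selected if (MDA AND logical ID) is non-zero. */
static bool flat_mda_matches(uint8_t mda, uint8_t logical_id)
{
    return (mda & logical_id) != 0;
}
```

With this scheme an MDA of 0x0A addresses the APICs with indices 1 and 3 at once, and an MDA of 0xFF addresses all eight.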
-
Page 355
Vol. 3A 8-31 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.6.2.3 Broadcast/Self Delivery Mode The destination shorthand field of the ICR allows the delivery mode to be bypassed in favor of broadcasting the IPI to all the processors on the system bus and/or back to itself (see Section 8.6.1, “Interrupt Command Register (ICR)”). Three d[...]
-
Page 356
8-32 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Here, the TPR value is the task priority value in the TPR (see Figure 8-18), the IRRV value is the vector number for the highest priority bit that is set in the IRR (see Figure 8-20) or 00H (if no IRR bit is set), and the ISRV value is the vector number for the highest priority bit[...]
-
Page 357
Vol. 3A 8-33 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Section 8.10, “APIC Bus Message Passing Mechanism and Protocol (P6 Family, Pentium Processors),” describes the APIC bus arbitration protocols and bus message formats, while Section 8.6.1, “Interrupt Command Register (ICR),” describes the INIT level de-assert IPI message. Not[...]
-
Page 358
8-34 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 4. When interrupts are pending in the IRR and ISR register, the local APIC dispatches them to the processor one at a time, based on their priority and the current task and processor priorities in the TPR and PPR (see Section 8.8.3.1, “Task and Processor Priorities”). 5. When a [...]
-
Page 359
Vol. 3A 8-35 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 1. (IPIs only) It examines the IPI message to determine if it is the specified destination for the IPI as described in Section 8.6.2, “Determining IPI Destination.” If it is the specified destination, it continues its acceptance procedure; if it is not the destination, it dis[...]
-
Page 360
8-36 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC looks for an open slot in one of its two pending interrupt queues contained in the IRR and ISR registers (se[...]
-
Page 361
Vol. 3A 8-37 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.8.3.1 Task and Processor Priorities The local APIC also defines a task priority and a processor priority that it uses in determining the order in which interrupts should be handled. The task priority is a software selected value between 0 and 15 (see Figure 8-18) that is written i[...]
-
Page 362
8-38 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Its value in the PPR is computed as follows: IF TPR[7:4] ≥ ISRV[7:4] THEN PPR[7:0] ← TPR[7:0] ELSE PPR[7:4] ← ISRV[7:4]; PPR[3:0] ← 0 Here, the ISRV value is the vector number of the highest priority ISR bit that is set, or 00H if no ISR bit is set. Essentially, the processo[...]
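The PPR rule quoted above can be sketched in Python. This is only an illustration under stated assumptions: `compute_ppr` is a hypothetical helper name, both inputs are treated as 8-bit values, and `isrv` is passed as 00H when no ISR bit is set.

```python
def compute_ppr(tpr: int, isrv: int) -> int:
    """Return the 8-bit PPR value derived from TPR and ISRV.

    Hypothetical helper: mirrors the IF/ELSE rule in the text.
    """
    tpr &= 0xFF
    isrv &= 0xFF
    # Compare the priority classes held in bits 7:4.
    if (tpr >> 4) >= (isrv >> 4):
        return tpr            # PPR[7:0] <- TPR[7:0]
    return isrv & 0xF0        # PPR[7:4] <- ISRV[7:4], PPR[3:0] <- 0


if __name__ == "__main__":
    # TPR class 2 is lower than in-service class 3, so the ISR class wins.
    print(hex(compute_ppr(0x20, 0x35)))
```

In this sketch, the sub-class bits of the TPR survive only when the task-priority class is at least as high as the in-service class.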
-
Page 363
Vol. 3A 8-39 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IRR contains the active interrupt requests that have been accepted, but not yet dispatched to the processor for servicing. When the local APIC accepts an interrupt, it sets the bit in the IRR that corresponds to the vector of the accepted interrupt. When the processor core is ready[...]
-
Page 364
8-40 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.8.5 Signaling Interrupt Servicing Completion For all interrupts except those delivered with the NMI, SMI, INIT, ExtINT, the start-up, or INIT-Deassert delivery mode, the interrupt handler must include a write to the end-of-interrupt (EOI) register (see Figure 8-21). This write m[...]
-
Page 365
Vol. 3A 8-41 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) the TPR. The IC, however, is considered implementation-dependent with the underlying priority mechanisms subject to change. The CR8, by contrast, is part of the Intel EM64T architecture. Software can depend on this definition remaining unchanged. Figure 8-22 shows the layout of C[...]
-
Page 366
8-42 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The vector number for the spurious-interrupt vector is specified in the spurious-interrupt vector register (see Figure 8-23). The functions of the fields in this register are as follows: Spurious Vector Determines the vector number to be delivered to the processor when the local [...]
-
Page 367
Vol. 3A 8-43 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 8.10 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) The Pentium 4 and Intel Xeon processors pass messages among the local and I/O APICs on the system bus, using the system bus message passing mechanism and protocol. The P6 family and Pentium processors[...]
-
Page 368
8-44 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination and message during device configuration, allocating one or more non-shared messages to each MSI capable function.” The capabilities mechanism provided by the PCI Local Bus Specification is used to identify and configure MSI capable PCI devices. Among other fields, this [...]
-
Page 369
Vol. 3A 8-45 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • When RH is 1 and the logical destination mode is active in a system using a flat addressing model, the Destination ID field must be set so that bits set to 1 identify processors that are present and enabled to receive the interrupt. • If RH is set to 1 and the logical destination[...]
-
Page 370
8-46 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Reserved fields are not assumed to be any value. Software must preserve their contents on writes. Other fields in the Message Data Register are described below. 1. Vector — This 8-bit field contains the interrupt vector associated with the message. Values range from 010H to 0FEH [...]
-
Page 371
Vol. 3A 8-47 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) d. 100B (NMI) — Deliver the signal to all the agents listed in the destination field. The vector information is ignored. NMI is an edge triggered interrupt regardless of the Trigger Mode Setting. e. 101B (INIT) — Deliver this signal to all the agents listed in the destination fi[...]
-
Page 372
8-48 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)[...]
-
Page 373
9 Processor Management and Initialization[...]
-
Page 374
[...]
-
Page 375
Vol. 3A 9-1 CHAPTER 9 PROCESSOR MANAGEMENT AND INITIALIZATION This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, x87 FPU initialization, processor configuration, feature determination, mode switching, the MSRs (in th [...]
-
Page 376
9-2 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic. At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enable those processors to execute self-configu[...]
-
Page 377
Vol. 3A 9-3 PROCESSOR MANAGEMENT AND INITIALIZATION Table 9-1. IA-32 Processor States Following Power-up, Reset, or INIT Register Pentium 4 and Intel Xeon Processor P6 Family Processor Pentium Processor EFLAGS 1 00000002H 00000002H 00000002H EIP 0000FFF0H 0000FFF0H 0000FFF0H CR0 60000010H 2 60000010H 2 60000010H 2 CR2, CR3, CR4 00000000H 000000[...]
-
Page 378
9-4 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION MXCSR Pwr up or Reset: 1F80H INIT: Unchanged Pentium III processor only- Pwr up or Reset: 1F80H INIT: Unchanged NA GDTR, IDTR Base = 00000000H Limit = FFFFH AR = Present, R/W Base = 00000000H Limit = FFFFH AR = Present, R/W Base = 00000000H Limit = FFFFH AR = Present, R/W LDTR, Task Register [...]
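As an illustration of how to read Table 9-1, the CR0 reset value 60000010H can be decoded into flag names. This is a minimal sketch, assuming the standard IA-32 bit positions for the CR0 flags (they are not listed in the table itself); `CR0_FLAGS` and `cr0_flags_set` are hypothetical names.

```python
# Assumed CR0 flag bit positions (standard IA-32 layout, not taken from Table 9-1).
CR0_FLAGS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}


def cr0_flags_set(value: int) -> list:
    """Return the names of the CR0 flags set in value, lowest bit first."""
    return [name for bit, name in sorted(CR0_FLAGS.items())
            if (value >> bit) & 1]


if __name__ == "__main__":
    # Reset value from Table 9-1.
    print(cr0_flags_set(0x60000010))
```

Under these assumptions the reset value has only ET, NW, and CD set, so the processor comes out of reset in real-address mode (PE clear) with paging off (PG clear).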
-
Page 379
Vol. 3A 9-5 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.3 Model and Stepping Information Following a hardware reset, the EDX register contains component identification and revision information (see Figure 9-2). For example, the model, family, and processor type returned for the first processor in the Intel Pentium 4 family i[...]
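The identification value left in EDX can be split into its fields with a few shifts and masks. This is a sketch only: it assumes the commonly documented layout (stepping in bits 3:0, model in bits 7:4, family in bits 11:8, processor type in bits 13:12), ignores any extended family or model fields, and uses an input value invented purely for illustration. `decode_signature` is a hypothetical name.

```python
def decode_signature(edx: int) -> dict:
    """Split a reset-time EDX identification value into its basic fields.

    Assumed layout: stepping 3:0, model 7:4, family 11:8, type 13:12.
    """
    return {
        "stepping": edx & 0xF,
        "model": (edx >> 4) & 0xF,
        "family": (edx >> 8) & 0xF,
        "type": (edx >> 12) & 0x3,
    }


if __name__ == "__main__":
    # 0x0F12 is a made-up example value, not one quoted from the manual.
    print(decode_signature(0x0F12))
```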
-
Page 380
9-6 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.4 First Instruction Executed The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing the software-initialization code must be loc[...]
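The arithmetic behind the reset address can be checked in a few lines. This is a sketch under stated assumptions: a 32-bit physical address space, the EIP reset value 0000FFF0H from Table 9-1, and a code-segment base of FFFF0000H at reset (the same base the build file example later assigns to the startup code; treating it as the CS base here is an assumption). All names are hypothetical.

```python
ADDRESS_BITS = 32                      # assumption: 32 physical address lines
TOP_OF_MEMORY = 1 << ADDRESS_BITS      # one past the uppermost physical address
RESET_VECTOR = TOP_OF_MEMORY - 16      # "16 bytes below" the top

CS_BASE_AT_RESET = 0xFFFF0000          # assumed hidden CS base after reset
EIP_AT_RESET = 0x0000FFF0              # EIP value listed in Table 9-1


def reset_fetch_address(cs_base: int = CS_BASE_AT_RESET,
                        eip: int = EIP_AT_RESET) -> int:
    """Return base + offset, wrapped to the physical address width."""
    return (cs_base + eip) & (TOP_OF_MEMORY - 1)


if __name__ == "__main__":
    print(hex(RESET_VECTOR), hex(reset_fetch_address()))
```

Both routes give the same address, which is why the initialization EPROM has to be mapped at the top of the physical address space.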
-
Page 381
Vol. 3A 9-7 PROCESSOR MANAGEMENT AND INITIALIZATION The EM flag determines whether floating-point instructions are executed by the x87 FPU (EM is cleared) or a device-not-available exception (#NM) is generated for all floating-point instructions so that an exception handler can emulate the floating-point operation (EM = 1). Ordinarily, the[...]
-
Page 382
9-8 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION To emulate floating-point instructions, the EM, MP, and NE flags in control register CR0 should be set as shown in Table 9-3. Regardless of the value of the EM bit, the Intel486 SX processor generates a device-not-available exception (#NM) upon encountering any floating-point instruction. [...]
-
Page 383
Vol. 3A 9-9 PROCESSOR MANAGEMENT AND INITIALIZATION 9.4 MODEL-SPECIFIC REGISTERS (MSRS) The Pentium 4, Intel Xeon, P6 family, and Pentium processors contain model-specific registers (MSRs). These registers are by definition implementation specific; that is, they are not guaranteed to be supported on future IA-32 processors and/or to have th[...]
-
Page 384
9-10 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.6 INITIALIZING SSE/SSE2/SSE3 EXTENSIONS For processors that contain SSE/SSE2/SSE3 extensions, steps must be taken when initializing the processor to allow execution of these instructions. 1. Check the CPUID feature flags for the presence of the SSE/SSE2/SSE3 extensions (respectively: EDX b [...]
-
Page 385
Vol. 3A 9-11 PROCESSOR MANAGEMENT AND INITIALIZATION 9.7.1 Real-Address Mode IDT In real-address mode, the only system data structure that must be loaded into memory is the IDT (also called the “interrupt vector table”). By default, the address of the base of the IDT is physical address 0H. This address can be changed by using the LIDT ins[...]
-
Page 386
9-12 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION • If paging is to be used, at least one page directory and one page table. • A code segment that contains the code to be executed when the processor switches to protected mode. • One or more code modules that contain the necessary interrupt and exception handlers. Software initialization c[...]
-
Page 387
Vol. 3A 9-13 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.2 Initializing Protected-Mode Exceptions and Interrupts Software initialization code must at a minimum load a protected-mode IDT with a gate descriptor for each exception vector that the processor can generate. If interrupt or trap gates are used, the gate descriptors can all point to th[...]
-
Page 388
9-14 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION After the processor has switched to protected mode, the LTR instruction can be used to load a segment selector for a TSS descriptor into the task register. This instruction marks the TSS descriptor as busy, but does not perform a task switch. The processor can, however, use the TSS to loca[...]
-
Page 389
Vol. 3A 9-15 PROCESSOR MANAGEMENT AND INITIALIZATION 64-bit mode consistency checks fail in the following circumstances: • An attempt is made to enable or disable IA-32e mode while paging is enabled. • IA-32e mode is enabled and an attempt is made to enable paging prior to enabling physical-address extensions (PAE). • IA-32e mode is activ[...]
-
Page 390
9-16 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Compatibility mode execution is selected on a code-segment basis. This mode allows legacy applications to coexist with 64-bit applications running in 64-bit mode. An operating system running in IA-32e mode can execute existing 16-bit and 32-bit applications by clearing their code-segment de[...]
-
Page 391
Vol. 3A 9-17 PROCESSOR MANAGEMENT AND INITIALIZATION 9.9 MODE SWITCHING To use the processor in protected mode after hardware or software reset, a mode switch must be performed from real-address mode. Once in protected mode, software generally does not need to return to real-address mode. To run software written to run in real-address mode [...]
-
Page 392
9-18 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 6. Execute the LTR instruction to load the task register with a segment selector to the initial protected-mode task or to a writable area of memory that can be used to store TSS information on a task switch. 7. After entering protected mode, the segment registers continue to hold the contents[...]
-
Page 393
Vol. 3A 9-19 PROCESSOR MANAGEMENT AND INITIALIZATION 4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode: — Limit = 64 KBytes (0FFFFH) — Byte granular (G = 0) — Expand up (E = 0) — Writable (W = 1) — Present (P = 1) — [...]
-
Page 394
9-20 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.10 INITIALIZATION AND MODE SWITCHING EXAMPLE This section provides an initialization and mode switching example that can be incorporated into an application. This code was originally written to initialize the Intel386 processor, but it will execute successfully on the Pentium 4, Intel Xeon, P[...]
-
Page 395
Vol. 3A 9-21 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-3. Processor State After Reset Table 9-4. Main Initialization Steps in STARTUP.ASM Source Listing STARTUP.ASM Line Numbers Description From To 157 157 Jump (short) to the entry code in the EPROM 162 169 Construct a temporary GDT in RAM with one entry: 0 - null 1 - R/W data segm[...]
-
Page 396
9-22 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.1 Assembler Usage In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools. • The ASM386 will generate the right operand size opcodes a[...]
-
Page 397
Vol. 3A 9-23 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.2 STARTUP.ASM Listing Example 9-1 provides high-level sample code designed to move the processor into protected mode. This listing does not include any opcode and offset information. Example 9-1. STARTUP.ASM MS-DOS* 5.0(045-N) 386(TM) MACRO ASSEMBLER STARTUP 09:44:51 08/19/92 PAGE 1 MS[...]
-
Page 398
9-24 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 32 TSS_INDEX EQU 10 33 34 ; TSS_INDEX is the index of the TSS of the first task to 35 ; run after startup 36 37 38 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 39 40 ; ------------------------- STRUCTURES and EQU --------------- 41 ; structures for system data 42 43 ; TSS struct[...]
-
Page 399
Vol. 3A 9-25 PROCESSOR MANAGEMENT AND INITIALIZATION 79 LDT_reg DW ? 80 LDT_h DW ? 81 TRAP_reg DW ? 82 IO_map_base DW ? 83 TASK_STATE ENDS 84 85 ; basic structure of a descriptor 86 DESC STRUC 87 lim_0_15 DW ? 88 bas_0_15 DW ? 89 bas_16_23 DB ? 90 access DB ? 91 gran DB ? 92 bas_24_31 DB ? 93 DESC ENDS 94 95 ; structure for use with LGDT and LIDT [...]
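The DESC structure in listing lines 86-93 is an 8-byte little-endian record, which can be mirrored with Python's `struct` module. This is a sketch rather than part of STARTUP.ASM: `pack_descriptor` is a hypothetical helper, and it assumes the usual split of the `gran` byte (limit bits 19:16 in the low nibble, granularity and size flags in the high nibble).

```python
import struct


def pack_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack a segment descriptor in the field order of the DESC structure.

    flags supplies the high nibble of the gran byte (assumed G, D/B, L, AVL);
    the low nibble is filled with limit bits 19:16.
    """
    lim_0_15 = limit & 0xFFFF
    bas_0_15 = base & 0xFFFF
    bas_16_23 = (base >> 16) & 0xFF
    gran = (flags & 0xF0) | ((limit >> 16) & 0x0F)
    bas_24_31 = (base >> 24) & 0xFF
    # DW, DW, DB, DB, DB, DB -> two 16-bit words followed by four bytes.
    return struct.pack("<HHBBBB", lim_0_15, bas_0_15, bas_16_23,
                       access & 0xFF, gran, bas_24_31)


if __name__ == "__main__":
    # Flat 4-GByte read/write data segment: base 0, limit FFFFFH, G=1, D=1.
    print(pack_descriptor(0, 0xFFFFF, 0x92, 0xC0).hex())
```

The example values correspond to the kind of flat R/W data segment the temporary GDT uses; the access byte 92H (present, DPL 0, writable data) is a conventional choice, not one quoted from the listing.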
-
Page 400
9-26 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 126 127 ; scratch areas for LGDT and LIDT instructions 128 TEMP_GDT_SCRATCH TABLE_REG <> 129 APP_GDT_RAM TABLE_REG <> 130 APP_IDT_RAM TABLE_REG <> 131 ; align end_data 132 fill DW ? 133 134 ; last thing in this segment - should be on a dword boundary 135 end_data LABEL BYTE 136[...]
-
Page 401
Vol. 3A 9-27 PROCESSOR MANAGEMENT AND INITIALIZATION 175 MOV EBX,CR0 176 OR EBX,PE_BIT 177 MOV CR0,EBX 178 179 ; clear prefetch queue 180 JMP CLEAR_LABEL 181 CLEAR_LABEL: 182 183 ; make DS and ES address 4G of linear memory 184 MOV CX,LINEAR_SEL 185 MOV DS,CX 186 MOV ES,CX 187 188 ; do board specific initialization 189 ; 190 ; 191 ; ...... 192 ; [...]
-
Page 402
9-28 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 222 MOV ECX, CS_BASE 223 ADD ECX, OFFSET (IDT_EPROM) 224 MOV ESI, [ECX].table_linear 225 MOV EDI,EAX 226 MOVZX ECX, [ECX].table_lim 227 MOV APP_IDT_ram[EBX].table_lim,CX 228 INC ECX 229 MOV APP_IDT_ram[EBX].table_linear,EAX 230 MOV EBX,EAX 231 ADD EAX,ECX 232 REP MOVS BYTE PTR ES:[EDI], BYTE[...]
-
Page 403
Vol. 3A 9-29 PROCESSOR MANAGEMENT AND INITIALIZATION 271 272 ;assume no LDT used in the initial task - if necessary, 273 ;code to move the LDT could be added, and should resemble 274 ;that used to move the TSS 275 276 ; load task register 277 LTR BX ; No task switch, only descriptor loading 278 ; See Figure 9-6 279 ; load minimal set of register[...]
-
Page 404
9-30 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File) [Figure: memory map from 0 to FFFF FFFFH (4 GB) with START at [CS.BASE+EIP] near FFFF 0000H and a TEMP_GDT holding GDT[0] and GDT[1] (Base=0, Limit=4G; DS, ES = GDT[1]). Steps shown: jump near start; construct TEMP_GDT; LGDT; move to protected mode.] GDT_[...]
-
Page 405
Vol. 3A 9-31 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-5. Moving the GDT, IDT, and TSS from ROM to RAM (Lines 196-261 of List File) [Figure: memory map from 0 to FFFF FFFFH with GDT RAM, IDT RAM, and TSS RAM placed above RAM_START. Steps shown: move the GDT, IDT, TSS from ROM to RAM; fix aliases; LTR.][...]
-
Page 406
9-32 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-6. Task Switching (Lines 282-296 of List File) [Figure: memory map with GDT RAM, IDT RAM, and TSS RAM above RAM_START, the GDT alias and IDT alias, and the registers DS, EIP, EFLAGS, CS, SS, ES, ESP. Steps shown: SS = TSS.SS; ESP = TSS.ESP; PUSH TSS.EFLAG; PUSH TSS.CS; PUSH TSS.EIP; ES = TSS.ES; DS = TSS.DS; IRET.] GDT[...]
-
Page 407
Vol. 3A 9-33 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.3 MAIN.ASM Source Code The file MAIN.ASM shown in Example 9-2 defines the data and stack segments for this application and can be substituted with the main module task written in a high-level language that is invoked by the IRET instruction executed by STARTUP.ASM. Example 9-2. MAIN[...]
-
Page 408
9-34 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Example 9-4. Build File INIT_BLD_EXAMPLE; SEGMENT *SEGMENTS(DPL = 0) , startup.startup_code(BASE = 0FFFF0000H) ; TASK BOOT_TASK(OBJECT = startup, INITIAL,DPL = 0, NOT INTENABLED) , PROTECTED_MODE_TASK(OBJECT = main_module,DPL = 0, NOT INTENABLED) ; TABLE GDT ( LOCATION = GDT_EPROM , ENTRY = ([...]
-
Page 409
Vol. 3A 9-35 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11 MICROCODE UPDATE FACILITIES The Pentium 4, Intel Xeon, and P6 family processors have the capability to correct errata by loading an Intel-supplied data block into the processor. The data block is called a microcode update. This section describes the mechanisms the BIOS needs to prov[...]
-
Page 410
9-36 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.1 Microcode Update A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. Each microcode update is tailored for a specific list of processor signatures. A mismatch of the processor’s sign[...]
-
Page 411
Vol. 3A 9-37 PROCESSOR MANAGEMENT AND INITIALIZATION Table 9-6. Microcode Update Field Definitions Field Name Offset (bytes) Length (bytes) Description Header Version 0 4 Version number of the update header. Update Revision 4 4 Unique version number for the update, the basis for the update signature provided by the processor to indicate th[...]
-
Page 412
9-38 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Total Size 32 4 Specifies the total size of the microcode update in bytes. It is the summation of the header size, the encrypted data size and the size of the optional extended signature table. Reserved 36 12 Reserved fields for future expansion Update Data 48 Data Size or 2000 Update data Exte[...]
-
Page 413
Vol. 3A 9-39 PROCESSOR MANAGEMENT AND INITIALIZATION Checksum[n] Data Size + 76 + (n * 12) 4 Used by utility software to decompose a microcode update into multiple microcode updates where each of the new updates is constructed without the optional Extended Processor Signature Table. To calculate the Checksum, substitute the Primary Processor Sig[...]
-
Page 414
9-40 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.2 Optional Extended Signature Table The extended signature table is a structure that may be appended to the end of the encrypted data when the encrypted data only supports a single processor signature (optional case). The extended signature table will always be present when the encrypte[...]
-
Page 415
Vol. 3A 9-41 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.3 Processor Identification Each microcode update is designed for a specific processor or set of processors. To determine the correct microcode update to load, software must ensure that one of the processor signatures embedded in the microcode update matches the 32-bit processor sig[...]
-
Page 416
9-42 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.4 Platform Identification In addition to verifying the processor signature, the intended processor platform type must be determined to properly target the microcode update. The intended processor platform type is determined by reading the IA32_PLATFORM_ID register (MSR 17H). This 6[...]
-
Page 417
Vol. 3A 9-43 PROCESSOR MANAGEMENT AND INITIALIZATION Example 9-6. Pseudo Code Example of Processor Flags Test Flag ← 1 << IA32_PLATFORM_ID[52:50] If (Update.HeaderVersion == 00000001h) { If (Update.ProcessorFlags & Flag) { Load Update } Else { // // Assume the Data Size has been used to calculate the // location of Update.ProcessorSign[...]
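The first branch of the processor flags test in Example 9-6 can be sketched in Python. The sketch covers only the primary `ProcessorFlags` comparison, not the extended signature table path that the pseudocode goes on to describe; `platform_flag` and `update_matches_platform` are hypothetical names, and the MSR value is passed in as a plain integer because reading MSR 17H requires privileged code.

```python
def platform_flag(platform_id_msr: int) -> int:
    """Return 1 << IA32_PLATFORM_ID[52:50] for a 64-bit MSR value."""
    platform_id = (platform_id_msr >> 50) & 0x7   # bits 52:50
    return 1 << platform_id


def update_matches_platform(processor_flags: int, platform_id_msr: int) -> bool:
    """True when the update's processor flags include this platform's bit."""
    return bool(processor_flags & platform_flag(platform_id_msr))


if __name__ == "__main__":
    msr = 2 << 50                                  # platform ID 2 (made-up value)
    print(platform_flag(msr), update_matches_platform(0x04, msr))
```

Because the three-bit platform ID selects one of eight bit positions, a single update can target several platform types by setting several bits in its processor flags field.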
-
Page 418
9-44 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Example 9-7. Pseudo Code Example of Checksum Test N ← 512 If (Update.DataSize != 00000000H) N ← Update.TotalSize / 4 ChkSum ← 0 For (I ← 0; I < N; I++) { ChkSum ← ChkSum + MicrocodeUpdate[I] } If (ChkSum == 00000000H) Success Else Fail 9.11.6 Microcode Update Loader This section d[...]
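A runnable counterpart to the checksum test in Example 9-7 might look like the sketch below. Assumptions are labeled in the code: the update is supplied as a little-endian byte string, Total Size sits at header offset 32 as listed in Table 9-6, and Data Size is taken to sit at offset 28 (that row is not visible in this excerpt). `checksum_ok` is a hypothetical name.

```python
import struct

DATA_SIZE_OFFSET = 28    # assumption: Data Size field offset (row not shown here)
TOTAL_SIZE_OFFSET = 32   # Total Size field offset, per Table 9-6


def checksum_ok(update: bytes) -> bool:
    """Sum the update as 32-bit DWORDs; a valid update sums to zero."""
    data_size, = struct.unpack_from("<I", update, DATA_SIZE_OFFSET)
    total_size, = struct.unpack_from("<I", update, TOTAL_SIZE_OFFSET)
    # N <- 512 DWORDs by default, else TotalSize / 4.
    n = 512 if data_size == 0 else total_size // 4
    dwords = struct.unpack_from("<%dI" % n, update, 0)
    return (sum(dwords) & 0xFFFFFFFF) == 0


if __name__ == "__main__":
    # Build a 2048-byte dummy update whose DWORDs sum to zero.
    blob = bytearray(2048)
    struct.pack_into("<II", blob, 0, 1, 0x12)                  # made-up header values
    struct.pack_into("<I", blob, 16, (-(1 + 0x12)) & 0xFFFFFFFF)
    print(checksum_ok(bytes(blob)))
```

The sum is masked to 32 bits because the pseudocode accumulates into a DWORD, so overflow wraps around instead of growing without bound as a Python integer would.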
-
Page 419
Vol. 3A 9-45 PROCESSOR MANAGEMENT AND INITIALIZATION The loader shown in Example 9-8 assumes that update is the address of a microcode update (header and data) embedded within the code segment of the BIOS. It also assumes that the processor is operating in real mode. The data may reside anywhere in memory, aligned on a 16-byte boundary, tha[...]
-
Page 420
9-46 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.6.3 Update in a System Supporting Intel Hyper-Threading Technology Intel Hyper-Threading Technology has implications on the loading of the microcode update. The update must be loaded for each core in a physical processor. Thus, for a processor supporting Hyper-Threading Technology,[...]
-
Page 421
Vol. 3A 9-47 PROCESSOR MANAGEMENT AND INITIALIZATION CPUID returns a value in a model specific register in addition to its usual register return values. The semantics of CPUID cause it to deposit an update ID value in the 64-bit model-specific register at address 08BH (IA32_BIOS_SIGN_ID). If no update is present in the processor, the value in[...]
-
Page 422
9-48 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION The IA32_BIOS_SIGN_ID register is used to report the microcode update signature when CPUID executes. The signature is returned in the upper DWORD (Table 9-11). 9.11.7.2 Authenticating the Update An update may be authenticated by the BIOS using the signature primitive, described above, and[...]
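Extracting the signature from a 64-bit IA32_BIOS_SIGN_ID value is a single shift. The sketch below takes the MSR contents as a plain integer, since reading MSR 08BH needs privileged code. `update_signature` and `revision_matches` are hypothetical names, and comparing the signature directly with the header's Update Revision field is an assumption suggested by Table 9-6 calling that field the basis for the signature.

```python
def update_signature(bios_sign_id: int) -> int:
    """Return the upper DWORD of a 64-bit IA32_BIOS_SIGN_ID value."""
    return (bios_sign_id >> 32) & 0xFFFFFFFF


def revision_matches(bios_sign_id: int, update_revision: int) -> bool:
    """Assumed check: the reported signature equals the update's revision."""
    return update_signature(bios_sign_id) == (update_revision & 0xFFFFFFFF)


if __name__ == "__main__":
    msr_value = 0x0000001200000000     # made-up MSR contents for illustration
    print(hex(update_signature(msr_value)))
```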
-
Page 423
Vol. 3A 9-49 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8 Pentium 4, Intel Xeon, and P6 Family Processor Microcode Update Specifications This section describes the interface that an application can use to dynamically integrate processor-specific updates into the system BIOS. In this discussion, the application is referred to as the calling [...]
-
Page 424
9-50 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION update blocks for each microcode update. In a MP system, a common microcode update may be sufficient for each socket in the system. For IA-32 processors earlier than family 0FH and model 03H, the microcode update is 2 KBytes. An MP-capable BIOS that supports multiple steppings must allocate[...]
-
Page 425
Vol. 3A 9-51 PROCESSOR MANAGEMENT AND INITIALIZATION { If ((Update.ProcessorSignature[N] == Processor Signature) && (Update.ProcessorFlags[N] & Platform Bits)) { Load Update.UpdateData into the Processor; Verify update was correctly loaded into the processor Go on to next processor Break; } N ← N + 1 } I ← I + (Update.TotalSize / 20[...]
-
Page 426
9-52 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION • The calling program should read any update data that already exists in the BIOS in order to make decisions about the appropriateness of loading the update. The BIOS must refuse to overwrite a newer update with an older version. The update header contains information about version and proc[...]
-
Page 427
Vol. 3A 9-53 PROCESSOR MANAGEMENT AND INITIALIZATION For each processor { If ((this is a unique processor stepping) AND (we have a unique update in the database for this processor)) { Checksum the update from the database; If Checksum fails exit NumBlocks ← NumBlocks + size of microcode update / 2048 } } // // Do we have enough update slots for a[...]
-
Page 428
9-54 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION } // // Verify the update was loaded correctly // Issue the ReadUpdate function If an error occurred { Display Diagnostic exit } // // Compare the Update read to that written // If (Update read != Update written) { Display Diagnostic exit } I ← I + (size of microcode update / 2048) } // // Enab[...]
-
Page 429
Vol. 3A 9-55 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8.4 INT 15H-based Interface Intel recommends that a BIOS interface be provided that allows additional microcode updates to be added to system flash. The INT15H interface is the Intel-defined method for doing this. The program that calls this interface is responsible for providing three[...]
-
Page 430
9-56 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Description In order to assure that the BIOS function is present, the caller must verify the carry flag, the return code, and the 64-bit signature. The update count reflects the number of 2048-byte blocks available for storage within one non-volatile RAM. The loader version number refers to[...]
-
Page 431
Vol. 3A 9-57 PROCESSOR MANAGEMENT AND INITIALIZATION Description The BIOS is responsible for selecting an appropriate update block in the non-volatile storage for storing the new update. The BIOS is also responsible for ensuring the integrity of the information provided by the caller, including authenticating the proposed update before i[...]
-
Page 432
9-58 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION If no unused update blocks are available and the above criteria are not met, the BIOS can overwrite update block(s) for a processor stepping that is no longer present in the system. This can be done by scanning the update blocks and comparing the processor steppings, identified in the MP Sp[...]
-
Page 433
Vol. 3A 9-59 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-8. Microcode Update Write Operation Flow [1] [Flowchart; box labels: Write Microcode Update; Valid Update Header Version?; Loader Revision Match BIOS’s Loader?; Does Update Match a CPU in the System?; Does Update Checksum Correctly?; Return CPU_NOT_PRES[...]
-
Page 434
9-60 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-9. Microcode Update Write Operation Flow [2] [Flowchart; box labels: Update Revision Newer Than NVRAM Update?; Return INVALID_REVISION; Update Pass Authenticity Test?; Return SECURITY_FAILURE; Update NVRAM Record; Return SUCCESS; Update Matching CPU Already In NVRAM?; Sp[...]
-
Page 435
Vol. 3A 9-61 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8.7 Function 02H—Microcode Update Control This function enables loading of binary updates into the processor. Table 9-15 lists the parameters and return codes for the function. Description This control is provided on a global basis for all updates and processors. The caller can d[...]
-
Page 436
9-62 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION The READ_FAILURE error code returned by this function has meaning only if the control function is implemented in the BIOS NVRAM. The state of this feature (enabled/disabled) can also be implemented using CMOS RAM bits where READ failure errors cannot occur. 9.11.8.8 Function 03H—Read[...]
-
Page 437
Vol. 3A 9-63 PROCESSOR MANAGEMENT AND INITIALIZATION Description The read function enables the caller to read any microcode update data that already exists in a BIOS and make decisions about the addition of new updates. As a result of a successful call, the BIOS copies the microcode update into the location pointed to by ES:DI, with the contents[...]
-
Page 438
9-64 Vol. 3A PROCESSOR MANAGEMENT AND INITIALIZATION UPDATE_NUM_INVALID 99H The update number exceeds the maximum number of update blocks implemented by the BIOS. NOT_EMPTY 9AH The specified update block is a subsequent block in use to store a valid microcode update that spans multiple blocks. The specified block is not a header block and is [...]
-
Page 439
10 Memory Cache Control[...]
-
Page 440
[...]
-
Page 441
Vol. 3A 10-1 CHAPTER 10 MEMORY CACHE CONTROL This chapter describes the IA-32 architecture’s memory cache and cache control mechanisms, the TLBs, and the store buffer. It also describes the memory type range registers (MTRRs) found in the P6 family processors and how they are used to control caching of physical memory locations. 10.1 [...]
-
Page 442
10-2 Vol. 3A MEMORY CACHE CONTROL Table 10-1. Characteristics of the Caches, TLBs, Store Buffer, and Write Combining Buffer in IA-32 Processors Cache or Buffer Characteristics Trace Cache 1 - Pentium 4 and Intel Xeon processors: 12 Kμops, 8-way set associative. - Pentium M processor: not implemented. - P6 family and Pentium processors: not i[...]
-
Page 443
Vol. 3A 10-3 MEMORY CACHE CONTROL The IA-32 processors implement four types of caches: the trace cache, the level 1 (L1) cache, the level 2 (L2) cache, and the level 3 (L3) cache (see Figure 10-1). The use of these caches differs between the Pentium 4, Intel Xeon, and P6 family processors, as follows: • Pentium 4 and Intel Xeon processors — The [...]
-
Page 444
10-4 Vol. 3A MEMORY CACHE CONTROL The trace cache in the Pentium 4 and Intel Xeon processors is an integral part of the Intel NetBurst microarchitecture and is available in all execution modes: protected mode, system management mode (SMM), and real-address mode. The L1, L2, and L3 caches are also available in all execution modes; however, use of [...]
-
Page 445
Vol. 3A 10-5 MEMORY CACHE CONTROL When the processor attempts to write an operand to a cacheable area of memory, it first checks if a cache line for that memory location exists in the cache. If a valid cache line does exist, the processor (depending on the write policy currently in force) can write the operand into the cache instead of writing i[...]
-
Page 446
10-6 Vol. 3A MEMORY CACHE CONTROL NOTE The behavior of FP and SSE/SSE2 operations on operands in UC memory is implementation dependent. In some implementations, accesses to UC memory may occur more than once. To ensure predictable behavior, use loads and stores of general purpose registers to access UC memory that may have read or write side eff[...]
-
Page 447
Vol. 3A 10-7 MEMORY CACHE CONTROL memory. When writing through to memory, invalid cache lines are never filled, and valid cache lines are either filled or invalidated. Write combining is allowed. This type of cache-control is appropriate for frame buffers or when there are devices on the system bus that access system memory, but do not perf[...]
-
Page 448
10-8 Vol. 3A MEMORY CACHE CONTROL 10.3.1 Buffering of Write Combining Memory Locations Writes to the WC memory type are not cached in the typical sense of the word cached. They are retained in an internal write combining buffer (WC buffer) that is separate from the internal L1, L2, and L3 caches and the store buffer. The WC buffer is no[...]
-
Page 449
Vol. 3A 10-9 MEMORY CACHE CONTROL The only elements of WC propagation to the system bus that are guaranteed are those provided by transaction atomicity. For example, with a P6 family processor, a completely full WC buffer will always be propagated as a single 32-bit burst transaction using any chunk order. In a WC buffer eviction where the da[...]
-
Page 450
10-10 Vol. 3A MEMORY CACHE CONTROL For a description of these instructions and their intended use, see Section 10.5.5, “Cache Management Instructions.” 10.4 CACHE CONTROL PROTOCOL The following section describes the cache control protocol currently defined for the IA-32 architecture. This protocol is used by the Pentium 4, Intel Xeon, P6[...]
-
Page 451
Vol. 3A 10-11 MEMORY CACHE CONTROL • Cache control and memory ordering instructions — The IA-32 architecture provides several instructions that control the caching of data, the ordering of memory reads and writes, and the prefetching of data. These instructions allow software to control the caching of specific data structures, to control mem[...]
-
Page 452
10-12 Vol. 3A MEMORY CACHE CONTROL Figure 10-2. Cache-Control Registers and Bits Available in IA-32 Processors [Figure; labels: page-directory or page-table entry, TLBs, MTRRs, physical memory from 0 to FFFFFFFFH; CD and NW flags control overall caching of system memory; PCD and PWT flags control page-level caching; G flag controls page-level flushing of TLBs; MTRRs cont[...]
-
Page 453
Vol. 3A 10-13 MEMORY CACHE CONTROL Table 10-5. Cache Operating Modes CD NW Caching and Read/Write Policy L1 L2/L3 1 0 0 Normal Cache Mode. Highest performance cache operation. - Read hits access the cache; read misses may cause replacement. - Write hits update the cache. - Only writes to shared lines and write misses update system memory. -[...]
-
Page 454
10-14 Vol. 3A MEMORY CACHE CONTROL • NW flag, bit 29 of control register CR0 — Controls the write policy for system memory locations (see Section 2.5, “Control Registers”). If the NW and CD flags are clear, write-back is enabled for the whole of system memory, but may be restricted for individual pages or regions of memory by other[...]
-
Page 455
Vol. 3A 10-15 MEMORY CACHE CONTROL • Memory type range registers (MTRRs) (introduced in P6 family processors) — Control the type of caching used in specific regions of physical memory. Any of the caching types described in Section 10.3, “Methods of Caching Available,” can be selected. See Section 10.11, “Memory Type Range Regist [...]
-
Page 456
10-16 Vol. 3A MEMORY CACHE CONTROL 10.5.2.1 Selecting Memory Types for Pentium Pro and Pentium II Processors The Pentium Pro and Pentium II processors do not support the PAT. Here, the effective memory type for a page is selected with the MTRRs and the PCD and PWT bits in the page-table or page-directory entry for the page. Table 10-6 descr[...]
-
Page 457
Vol. 3A 10-17 MEMORY CACHE CONTROL 4. Setting the PCD and PWT flags to opposite values is considered model-specific for the WP and WC memory types and architecturally-defined for the WB, WT, and UC memory types. 10.5.2.2 Selecting Memory Types for Pentium 4, Intel Xeon, and Pentium III Processors The Pentium 4, Intel Xeon, and Pentium III pro[...]
-
Page 458
10-18 Vol. 3A MEMORY CACHE CONTROL 10.5.2.3 Writing Values Across Pages with Different Memory Types If two adjoining pages in memory have different memory types, and a word or longer operand is written to a memory location that crosses the page boundary between those two pages, the operand might be written to memory twice. This action doe[...]
-
Page 459
Vol. 3A 10-19 MEMORY CACHE CONTROL 3. Disable the MTRRs and set the default memory type to uncached or set all MTRRs for the uncached memory type (see the discussion of the TYPE field and the E flag in Section 10.11.2.1, “IA32_MTRR_DEF_TYPE MSR”). The caches must be flushed (step 2) after the CD flag is set to ensure system m[...]
-
Page 460
10-20 Vol. 3A MEMORY CACHE CONTROL modified lines (such as during testing or fault recovery where cache coherency with main memory is not a concern), software should use the WBINVD instruction. The WBINVD instruction first writes back any modified lines in all the internal caches, then invalidates the contents of the L1, L2, and L3 caches.[...]
-
Page 461
Vol. 3A 10-21 MEMORY CACHE CONTROL 10.5.6.1 Adaptive Mode Adaptive mode facilitates L1 data cache sharing between logical processors. When running in adaptive mode, the L1 data cache is shared across logical processors in the same core if: • CR3 control registers for logical processors sharing the cache are identical. • The same paging mode i[...]
-
Page 462
10-22 Vol. 3A MEMORY CACHE CONTROL For Intel486 processors, a write to an instruction in the cache will modify it in both the cache and memory, but if the instruction was prefetched before the write, the old version of the instruction could be the one executed. To prevent the old instruction from being executed, flush the instruction prefetch [...]
-
Page 463
Vol. 3A 10-23 MEMORY CACHE CONTROL cache hierarchy now or as soon as possible, in anticipation of its use. The instructions provide different variations of the hint that allow selection of the cache level into which data will be read. The PREFETCHh instructions can help reduce the long latency typically associated with reading data from memor[...]
-
Page 464
10-24 Vol. 3A MEMORY CACHE CONTROL 10.10 STORE BUFFER IA-32 processors temporarily store each write (store) to memory in a store buffer. The store buffer improves processor performance by allowing the processor to continue executing instructions without having to wait until a write to memory and/or to a cache is complete. It also allows writ[...]
-
Page 465
Vol. 3A 10-25 MEMORY CACHE CONTROL ization software should then set the MTRRs to a specific, system-defined memory map. Typically, the BIOS (basic input/output system) software configures the MTRRs. The operating system or executive is then free to modify the memory map using the normal page-level cacheability attributes. In a multiproce[...]
-
Page 466
10-26 Vol. 3A MEMORY CACHE CONTROL 10.11.1 MTRR Feature Identification The availability of the MTRR feature is model-specific. Software can determine if MTRRs are supported on a processor by executing the CPUID instruction and reading the state of the MTRR flag (bit 12) in the feature information register (EDX). If the MTRR flag is set (indica[...]
-
Page 467
Vol. 3A 10-27 MEMORY CACHE CONTROL • WC (write combining) flag, bit 10 — The write-combining (WC) memory type is supported when set; the WC type is not supported when clear. Bit 9 and bits 11 through 63 in the IA32_MTRRCAP MSR are reserved. If software attempts to write to the IA32_MTRRCAP MSR, a general-protection exception (#GP) is gen[...]
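The IA32_MTRRCAP layout can be illustrated with a small decoder. This is only a sketch: the excerpt itself shows just the WC flag (bit 10); the VCNT field (bits 0 through 7) and the FIX flag (bit 8) are taken from the architecture's register definition, and the function name `decode_mtrrcap` is hypothetical. The raw value is assumed to have been read with RDMSR at CPL 0 by some other means.

```python
def decode_mtrrcap(value: int) -> dict:
    """Split a raw IA32_MTRRCAP value into its documented fields.

    Assumed layout: VCNT = bits 0-7, FIX = bit 8, WC = bit 10.
    """
    return {
        "vcnt": value & 0xFF,            # number of variable-range MTRR pairs
        "fix": bool((value >> 8) & 1),   # fixed-range MTRRs supported
        "wc": bool((value >> 10) & 1),   # write-combining type supported
    }


if __name__ == "__main__":
    # Example raw value chosen for illustration only.
    print(decode_mtrrcap(0x508))
```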
-
Page 468
10-28 Vol. 3A MEMORY CACHE CONTROL • FE (fixed MTRRs enabled) flag, bit 10 — Fixed-range MTRRs are enabled when set; fixed-range MTRRs are disabled when clear. When the fixed-range MTRRs are enabled, they take priority over the variable-range MTRRs when overlaps in ranges occur. If the fixed-range MTRRs are disabled, the variable-range MTR[...]
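A minimal sketch of decoding IA32_MTRR_DEF_TYPE follows. The FE flag position (bit 10) is stated above; the TYPE field (bits 0 through 7), the E flag (bit 11), and the memory-type encodings are assumptions taken from the architecture's register definition rather than from this excerpt. The name `decode_def_type` is hypothetical.

```python
# Memory-type encodings used by the MTRRs (other values are reserved).
MTRR_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB"}


def decode_def_type(value: int) -> dict:
    """Split a raw IA32_MTRR_DEF_TYPE value into TYPE, FE, and E."""
    return {
        "type": MTRR_TYPES.get(value & 0xFF, "reserved"),  # default memory type
        "fe": bool((value >> 10) & 1),   # fixed-range MTRRs enabled
        "e": bool((value >> 11) & 1),    # MTRRs enabled
    }


if __name__ == "__main__":
    print(decode_def_type(0xC06))
```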
-
Page 469
Vol. 3A 10-29 MEMORY CACHE CONTROL For the P6 family processors, the prefix for the fixed range MTRRs is MTRRfix. 10.11.2.3 Variable Range MTRRs The Pentium 4, Intel Xeon, and P6 family processors permit software to specify the memory type for eight variable-size address ranges, using a pair of MTRRs for each range. The first entry in each pa[...]
-
Page 470
10-30 Vol. 3A MEMORY CACHE CONTROL • PhysBase field, bits 12 through (MAXPHYADDR-1) — Specifies the base address of the address range. This 24-bit value, in the case where MAXPHYADDR is 36 bits, is extended by 12 bits at the low end to form the base address (this automatically aligns the address on a 4-KByte boundary). • PhysMask field, b[...]
-
Page 471
Vol. 3A 10-31 MEMORY CACHE CONTROL All other bits in the IA32_MTRR_PHYSBASEn and IA32_MTRR_PHYSMASKn registers are reserved; the processor generates a general-protection exception (#GP) if software attempts to write to them. Some mask values can result in ranges that are not continuous. In such ranges, the area not mapped by the mask value i[...]
-
Page 472
10-32 Vol. 3A MEMORY CACHE CONTROL 10.11.3 Example Base and Mask Calculations The examples in this section apply to processors that support a maximum physical address size of 36 bits. The base and mask values entered in variable-range MTRR pairs are 24-bit values that the processor extends to 36-bits. For example, to enter a base address of 2[...]
-
Page 473
Vol. 3A 10-33 MEMORY CACHE CONTROL The following settings for the MTRRs will yield the proper mapping of the physical address space for this system configuration.
IA32_MTRR_PHYSBASE0 = 0000 0000 0000 0006H
IA32_MTRR_PHYSMASK0 = 0000 000F FC00 0800H
Caches 0-64 MByte as WB cache type.
IA32_MTRR_PHYSBASE1 = 0000 0000 0400 0006H
IA32_MTRR_PHYSMASK[...]
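The base and mask values listed above can be reproduced with a short calculation. The sketch below assumes a 36-bit MAXPHYADDR, a power-of-two range size of at least 4 KBytes, a base aligned to that size, and the valid flag in bit 11 of the mask register; the helper names `mtrr_pair` and `fmt_msr` are hypothetical.

```python
def mtrr_pair(base: int, size: int, mem_type: int, maxphyaddr: int = 36):
    """Return (PHYSBASE, PHYSMASK) values for one variable-range MTRR pair."""
    if size < 0x1000 or size & (size - 1):
        raise ValueError("size must be a power of two, at least 4 KBytes")
    if base % size:
        raise ValueError("base must be aligned to the range size")
    addr_bits = (1 << maxphyaddr) - 1
    phys_base = (base & addr_bits & ~0xFFF) | (mem_type & 0xFF)
    phys_mask = (~(size - 1) & addr_bits) | (1 << 11)  # bit 11 = valid flag
    return phys_base, phys_mask


def fmt_msr(value: int) -> str:
    """Format a 64-bit value in the grouped hexadecimal style used above."""
    digits = f"{value:016X}"
    return " ".join(digits[i:i + 4] for i in range(0, 16, 4)) + "H"


if __name__ == "__main__":
    WB = 6
    MB = 1 << 20
    base0, mask0 = mtrr_pair(0, 64 * MB, WB)
    print("PHYSBASE0 =", fmt_msr(base0))
    print("PHYSMASK0 =", fmt_msr(mask0))
```

Ranges that are not a power of two in size (for example 96 MBytes) need more than one pair, which is why the listing above uses several registers.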
-
Page 474
10-34 Vol. 3A MEMORY CACHE CONTROL Caches 96-100 MByte as WB cache type.
IA32_MTRR_PHYSBASE3 = 0000 0000 0400 0000H
IA32_MTRR_PHYSMASK3 = 0000 00FF FFC0 0800H
Caches 64-68 MByte as UC cache type.
IA32_MTRR_PHYSBASE4 = 0000 0000 00F0 0000H
IA32_MTRR_PHYSMASK4 = 0000 00FF FFF0 0800H
Caches 15-16 MByte as UC cache type.
IA32_MTRR_PHYSBASE5 = 0000 0[...]
-
Page 475
Vol. 3A 10-35 MEMORY CACHE CONTROL d. If two or more variable memory ranges match and the memory types are WT and WB, the WT memory type is used. e. For overlaps not defined by the above rules, processor behavior is undefined. 3. If no fixed or variable memory range matches, the processor uses the default memory type. 10.11.5 MTRR Initializati[...]
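The precedence rules for overlapping variable ranges can be sketched as a small resolver. Rules d and e and step 3 are quoted above; the earlier rules (a single matching type is used as-is, and UC wins over any other type) fall on the preceding page and are included here as assumptions from the architecture's rule list. The function name `resolve_variable_overlap` is hypothetical.

```python
def resolve_variable_overlap(matching_types, default_type="UC"):
    """Pick the effective type when variable ranges overlap at one address.

    matching_types: memory types ("UC", "WC", "WT", "WP", "WB") of every
    variable range that matches the address.
    """
    unique = set(matching_types)
    if not unique:
        return default_type        # step 3: no range matched
    if len(unique) == 1:
        return unique.pop()        # all matching ranges agree
    if "UC" in unique:
        return "UC"                # assumed rule: UC takes precedence
    if unique == {"WT", "WB"}:
        return "WT"                # rule d
    raise ValueError("undefined overlap: " + ", ".join(sorted(unique)))  # rule e


if __name__ == "__main__":
    print(resolve_variable_overlap(["WB", "WT"]))
```

Raising an exception for rule e is a design choice of this sketch; it makes configuration mistakes visible instead of guessing at undefined hardware behavior.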
-
Page 476
10-36 Vol. 3A MEMORY CACHE CONTROL 10.11.7 MTRR Maintenance Programming Interface The operating system maintains the MTRRs after booting and sets up or changes the memory types for memory-mapped devices. The operating system should provide a driver and application programming interface (API) to access and set the MTRRs. The function calls Mem[...]
-
Page 477
Vol. 3A 10-37 MEMORY CACHE CONTROL The pseudocode for the Get4KMemType() function in Example 10-17 obtains the memory type for a single 4-KByte range at a given physical address. The sample code determines whether a PHY_ADDRESS falls within a fixed range by comparing the address with the known fixed ranges: 0 to 7FFFFH (64-KByte regions), 8[...]
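The fixed-range comparison described above can be sketched as a lookup that maps an address below 1 MByte to a fixed-range MTRR and the 8-bit field inside it. The 16-KByte and 4-KByte region boundaries and the register names are assumptions taken from the architecture's fixed-range layout (the excerpt is cut off after the first range); `fixed_range_slot` is a hypothetical helper.

```python
# (start address, region size, registers covering eight regions each)
FIXED_RANGES = [
    (0x00000, 0x10000, ["IA32_MTRR_FIX64K_00000"]),
    (0x80000, 0x04000, ["IA32_MTRR_FIX16K_80000", "IA32_MTRR_FIX16K_A0000"]),
    (0xC0000, 0x01000, [f"IA32_MTRR_FIX4K_{a:05X}"
                        for a in range(0xC0000, 0x100000, 0x8000)]),
]


def fixed_range_slot(phys_addr: int):
    """Return (register name, field index 0-7) or None above 1 MByte."""
    if not 0 <= phys_addr < 0x100000:
        return None
    # Walk from the highest range down so the first start <= address wins.
    for start, granule, registers in reversed(FIXED_RANGES):
        if phys_addr >= start:
            region = (phys_addr - start) // granule
            return registers[region // 8], region % 8
    return None


if __name__ == "__main__":
    print(fixed_range_slot(0xB8000))
```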
-
Page 478
10-38 Vol. 3A MEMORY CACHE CONTROL
FI;
IF IA32_MTRRCAP.FIX is set AND range can be mapped using a fixed-range MTRR
THEN
    pre_mtrr_change();
    update affected MTRR;
    post_mtrr_change();
FI;
ELSE (* try to map using a variable MTRR pair *)
    IF IA32_MTRRCAP.VCNT = 0
    THEN return UNSUPPORTED;
    FI;
    IF conflicts with current variable ranges
    THEN return RANGE_O[...]
-
Page 479
Vol. 3A 10-39 MEMORY CACHE CONTROL The physical address to variable range mapping algorithm in the MemTypeSet function detects conflicts with current variable range registers by cycling through them and determining whether the physical address in question matches any of the current ranges. During this scan, the algorithm can detect whether an[...]
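The scan over variable-range registers can be sketched as follows. The match test used here (address AND mask equals base AND mask, with the valid flag in bit 11 of the mask register) is an assumption drawn from the architecture's description of PhysBase and PhysMask; the function names are hypothetical and the register values are passed in as plain integers.

```python
def in_variable_range(addr: int, phys_base: int, phys_mask: int) -> bool:
    """True if addr falls in the range described by one MTRR pair."""
    if not phys_mask & (1 << 11):      # valid flag clear: pair is unused
        return False
    mask = phys_mask & ~0xFFF          # drop flag bits, keep address bits
    return (addr & mask) == (phys_base & mask)


def matching_types(addr: int, pairs):
    """Cycle through (PHYSBASE, PHYSMASK) pairs and collect matching types."""
    return [base & 0xFF for base, mask in pairs
            if in_variable_range(addr, base, mask)]


if __name__ == "__main__":
    pairs = [(0x0000000006, 0xFFC000800)]   # 0-64 MByte, WB (type 6)
    print(matching_types(0x03FFF000, pairs))
```

A result with more than one entry signals an overlap, which is the condition the conflict check in MemTypeSet is looking for.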
-
Page 480
10-40 Vol. 3A MEMORY CACHE CONTROL 6. If the PGE flag is set in control register CR4, flush all TLBs by clearing that flag. 7. If the PGE flag is clear in control register CR4, flush all TLBs by executing a MOV from control register CR3 to another register and then a MOV from that register back to CR3. 8. Disable all range registers (by clearing t[...]
-
Page 481
Vol. 3A 10-41 MEMORY CACHE CONTROL The Pentium 4, Intel Xeon, and P6 family processors provide special support for the physical memory range from 0 to 4 MBytes, which is potentially mapped by both the fixed and variable MTRRs. This support is invoked when a Pentium 4, Intel Xeon, or P6 family processor detects a large page overlapping the fi[...]
-
Page 482
10-42 Vol. 3A MEMORY CACHE CONTROL 10.12.2 IA32_CR_PAT MSR The IA32_CR_PAT MSR is located at MSR address 277H (see Appendix B, “Model-Specific Registers (MSRs)”), and it will remain at this address on future IA-32 processors that support the PAT feature. Figure 10-7 shows the format of the 64-bit IA32_CR_PAT MSR. Th[...]
-
Page 483
Vol. 3A 10-43 MEMORY CACHE CONTROL 10.12.3 Selecting a Memory Type from the PAT To select a memory type for a page from the PAT, a 3-bit index made up of the PAT, PCD, and PWT bits must be encoded in the page-table or page-directory entry for the page. Table 10-11 shows the possible encodings of the PAT, PCD, and PWT bits and the P[...]
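The index selection can be sketched as below. The 3-bit index built from PAT, PCD, and PWT is stated above; the placement of each 3-bit entry in the low bits of successive bytes of IA32_CR_PAT, the type encodings, and the power-on value 0007040600070406H are assumptions taken from the architecture's PAT definition rather than from this excerpt. Function names are hypothetical.

```python
PAT_ENCODINGS = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB", 7: "UC-"}
PAT_POWER_ON_DEFAULT = 0x0007040600070406  # assumed reset value of IA32_CR_PAT


def pat_index(pat: int, pcd: int, pwt: int) -> int:
    """Combine the three page-entry bits into the PAT entry number (0-7)."""
    return ((pat & 1) << 2) | ((pcd & 1) << 1) | (pwt & 1)


def pat_memory_type(msr_value: int, pat: int, pcd: int, pwt: int) -> str:
    """Look up the memory type selected by a page's PAT, PCD, and PWT bits."""
    entry = (msr_value >> (8 * pat_index(pat, pcd, pwt))) & 0x7
    return PAT_ENCODINGS.get(entry, "reserved")


if __name__ == "__main__":
    for bits in [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]:
        print(bits, pat_memory_type(PAT_POWER_ON_DEFAULT, *bits))
```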
-
Page 484
10-44 Vol. 3A MEMORY CACHE CONTROL The values in all the entries of the PAT can be changed by writing to the IA32_CR_PAT MSR using the WRMSR instruction. The IA32_CR_PAT MSR is read and write accessible (use of the RDMSR and WRMSR instructions, respectively) to software operating at a CPL of 0. Table 10-10 shows the allowable encoding of th[...]
-
Page 485
Vol. 3A 10-45 MEMORY CACHE CONTROL 10.12.5 PAT Compatibility with Earlier IA-32 Processors For IA-32 processors that support the PAT, the IA32_CR_PAT MSR is always active. That is, the PCD and PWT bits in page-table entries and in page-directory entries (that point to pages) always select a memory type for a page indirectly by selec[...]
-
Page 486
10-46 Vol. 3A MEMORY CACHE CONTROL[...]
-
Page 487
11 Intel® MMX™ Technology System Programming[...]
-
Page 488
[...]
-
Page 489
Vol. 3A 11-1 CHAPTER 11 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING This chapter describes those features of the Intel® MMX™ technology that must be considered when designing or enhancing an operating system to support MMX technology. It covers MMX instruction set emulation, the MMX state, aliasing of MMX registers, saving MMX state, t[...]
-
Page 490
11-2 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING When a value is written into an MMX register using an MMX instruction, the value also appears in the corresponding floating-point register in bits 0 through 63. Likewise, when a floating-point value is written into a floating-point register by an x87 FPU instruction, the low 64 bits of that value al[...]
-
Page 491
Vol. 3A 11-3 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING Execution of MMX instructions does not affect the other bits in the x87 FPU status word (bits 0 through 10 and bits 14 and 15) or the contents of the other x87 FPU registers that comprise the x87 FPU state (the x87 FPU control word, instruction pointer, data pointer, or opcode regis[...]
-
Page 492
11-4 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING 11.3 SAVING AND RESTORING THE MMX STATE AND REGISTERS Because the MMX registers are aliased to the x87 FPU data registers, the MMX state can be saved to memory and restored from memory as follows: • Execute an FSAVE, FNSAVE, or FXSAVE instruction to save the MMX state to memor[...]
-
Page 493
Vol. 3A 11-5 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING NOTE The IA-32 architecture does not support scanning the x87 FPU tag word and then only saving valid entries. 11.4 SAVING MMX STATE ON TASK OR CONTEXT SWITCHES When switching from one task or context to another, it is often necessary to save the MMX state. As a general rule, if the e[...]
-
Page 494
11-6 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING • Other exceptions can occur indirectly due to the faulty execution of the exception handlers for the above exceptions. 11.5.1 Effect of MMX Instructions on Pending x87 Floating-Point Exceptions If an x87 FPU floating-point exception is pending and the processor encounters an MMX in[...]
-
Page 495
Vol. 3A 11-7 INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING Figure 11-2. Mapping of MMX Registers to x87 FPU Data Register Stack [figure: two register rings, Case A: TOS=0 and Case B: TOS=2, each showing MM0 through MM7 against the ST registers, with the x87 FPU “push” and “pop” directions marked] Outer circl[...]
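The two cases in the figure can be reproduced with modular arithmetic. The sketch assumes that MMn is fixed to physical x87 data register Rn and that ST(i) names physical register (TOS + i) mod 8, which is how the aliasing is defined in the architecture; the function names are hypothetical.

```python
def st_for_mmx(mmx: int, tos: int) -> int:
    """Return i such that ST(i) aliases MMX register MM<mmx> for a given TOS."""
    return (mmx - tos) % 8


def mmx_for_st(st: int, tos: int) -> int:
    """Return n such that MM<n> aliases stack register ST(st) for a given TOS."""
    return (tos + st) % 8


if __name__ == "__main__":
    for tos in (0, 2):  # Case A and Case B in the figure
        mapping = [f"MM{n}=ST{st_for_mmx(n, tos)}" for n in range(8)]
        print(f"TOS={tos}:", " ".join(mapping))
```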
-
Page 496
11-8 Vol. 3A INTEL® MMX™ TECHNOLOGY SYSTEM PROGRAMMING[...]
-
Page 497
12 SSE, SSE2 and SSE3 System Programming[...]
-
Page 498
[...]
-
Page 499
Vol. 3A 12-1 CHAPTER 12 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING This chapter describes features of the streaming SIMD extensions (SSE), streaming SIMD extensions 2 (SSE2) and streaming SIMD extensions 3 (SSE3) that must be considered when designing or enhancing an operating system to support the Pentium III, Pentium 4, and Intel Xeon processors.[...]
-
Page 500
12-2 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING 12.1.2 Checking for SSE/SSE2/SSE3 Extension Support If the processor attempts to execute an unsupported SSE/SSE2/SSE3 instruction, the processor will generate an invalid-opcode exception (#UD). Before an operating system or executive attempts to use SSE/SSE2/SSE3 extensions, it should check tha[...]
-
Page 501
Vol. 3A 12-3 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING NOTE The OSFXSR and OSXMMEXCPT bits in control register CR4 must be set by the operating system. The processor has no other way of detecting operating-system support for the FXSAVE and FXRSTOR instructions or for handling SIMD floating-point exceptions. 3. Clear CR0.EM[bit 2] = 0. This action d[...]
-
Page 502
12-4 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING The SIMD floating-point exception mask bits (bits 7 through 12), the flush-to-zero flag (bit 15), the denormals-are-zero flag (bit 6), and the rounding control field (bits 13 and 14) in the MXCSR register should be left in their default values of 0. This permits the application to determine [...]
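The MXCSR bit positions named above can be pulled apart with a small decoder. The bit positions come from the text; the order of the six mask bits (IM, DM, ZM, OM, UM, PM) and the rounding-control encodings are assumptions taken from the MXCSR register definition, and `decode_mxcsr` is a hypothetical name.

```python
MASK_NAMES = ("IM", "DM", "ZM", "OM", "UM", "PM")       # bits 7 through 12
ROUNDING = ("nearest", "down", "up", "toward zero")    # bits 13 and 14


def decode_mxcsr(value: int) -> dict:
    """Decode the control fields of a raw MXCSR value."""
    return {
        "daz": bool((value >> 6) & 1),                   # denormals-are-zero
        "masks": {name: bool((value >> (7 + i)) & 1)
                  for i, name in enumerate(MASK_NAMES)},
        "rounding": ROUNDING[(value >> 13) & 0x3],
        "fz": bool((value >> 15) & 1),                   # flush-to-zero
    }


if __name__ == "__main__":
    print(decode_mxcsr(0x1F80))
```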
-
Page 503
Vol. 3A 12-5 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING • System Exceptions: — Invalid-opcode exception (#UD). This exception is generated when executing SSE/SSE2/SSE3 instructions under the following conditions: • SSE/SSE2/SSE3 feature flags returned by CPUID are set to 0. This condition does not affect the CLFLUSH instruction. • The CLFSH fea[...]
-
Page 504
12-6 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING same conditions that cause x87 FPU floating-point error exceptions (#MF) to be generated for x87 FPU instructions. Each of these exceptions can be masked, in which case the processor returns a reasonable result to the destination operand without invoking an exception handler. However, if a[...]
-
Page 505
Vol. 3A 12-7 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING In some cases, applications can only save the XMM and MXCSR registers in the following way: • Execute eight MOVDQ instructions to save the contents of the XMM0 through XMM7 registers to memory. • Execute a STMXCSR instruction to save the state of the MXCSR register to memory. In some cases,[...]
-
Page 506
12-8 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING • The operating system can take the responsibility for automatically saving the x87 FPU, MMX, XMM, and MXCSR registers as part of the task switch process (using an FXSAVE instruction) and automatically restoring the state of the registers when a suspended task is resumed (using an FXRSTOR in[...]
-
Page 507
Vol. 3A 12-9 SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING On a task switch, the operating system task switching code must execute the following pseudo-code to set the TS flag according to the current owner of the x87 FPU/MMX/SSE/SSE2/SSE3 state. If the new task (task B in this example) is not the current owner of this state, the TS flag is set to 1; [...]
-
Page 508
12-10 Vol. 3A SSE, SSE2 AND SSE3 SYSTEM PROGRAMMING • Restores the x87 FPU, MMX, XMM, or MXCSR registers from the new task’s save area for the x87 FPU/MMX/SSE/SSE2/SSE3 state. • Updates the current x87 FPU/MMX/SSE/SSE2/SSE3 state owner to be the current task. • Clears the TS flag.[...]
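The TS-flag handling and the handler steps listed above can be modeled in a few lines. This is a toy simulation, not operating-system code: the class and method names are hypothetical, a dictionary stands in for the per-task FXSAVE areas, and saving the previous owner's state inside the handler is an assumption about the part of the procedure that is cut off in this excerpt.

```python
class FpuStateTracker:
    """Toy model of lazy x87 FPU/MMX/SSE state switching."""

    def __init__(self):
        self.owner = None       # task whose state is live in the registers
        self.ts = False         # models CR0.TS
        self.save_area = {}     # task -> saved state (stands in for FXSAVE)
        self.registers = None   # stands in for the live register contents

    def task_switch(self, new_task):
        # TS is set only when the new task does not own the live state.
        self.ts = new_task != self.owner

    def use_fpu(self, task, value=None):
        # The first FPU/MMX/SSE instruction with TS set raises #NM.
        if self.ts:
            self._device_not_available(task)
        if value is not None:
            self.registers = value
        return self.registers

    def _device_not_available(self, task):
        if self.owner is not None:
            self.save_area[self.owner] = self.registers  # save old owner
        self.registers = self.save_area.get(task)        # restore new task
        self.owner = task                                # update owner
        self.ts = False                                  # clear TS


if __name__ == "__main__":
    fpu = FpuStateTracker()
    fpu.task_switch("A")
    fpu.use_fpu("A", 1.5)
    fpu.task_switch("B")        # B never touches the FPU: no save happens
    fpu.task_switch("A")
    print(fpu.ts, fpu.use_fpu("A"))
```

The point of the design is visible in the usage: a task that never executes an FPU, MMX, or SSE instruction costs no save or restore at all.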
-
Page 509
13 Power and Thermal Management[...]
-
Page 510
[...]
-
Page 511
Vol. 3A 13-1 CHAPTER 13 POWER AND THERMAL MANAGEMENT This chapter describes facilities of IA-32 architecture used for power management and thermal monitoring. 13.1 ENHANCED INTEL SPEEDSTEP® TECHNOLOGY Enhanced Intel SpeedStep® Technology was introduced in the Pentium M processor; it is available in Pentium 4, Intel Xeon, Intel® Core[...]
-
Page 512
13-2 Vol. 3A POWER AND THERMAL MANAGEMENT 13.2 P-STATE HARDWARE COORDINATION The Advanced Configuration and Power Interface (ACPI) defines performance states (P-states) that are used to facilitate system software’s ability to manage processor power consumption. Different P-states correspond to different performance levels that are applied while t[...]
-
Page 513
Vol. 3A 13-3 POWER AND THERMAL MANAGEMENT If P-states are exposed by the BIOS as hardware coordinated, software is expected to confirm processor support for P-state hardware coordination feedback and use the feedback mechanism to make P-state decisions. The OSPM is expected to reset the MSRs (execute WRMSR with 0 to these MSRs individually) a[...]
-
Page 514
13-4 Vol. 3A POWER AND THERMAL MANAGEMENT 13.3 MWAIT EXTENSIONS FOR ADVANCED POWER MANAGEMENT IA-32 processors may support a number of C-states that reduce power consumption for inactive states. Intel Core Solo and Intel Core Duo processors support both deeper C-states and MWAIT extensions that can be used by the OS to implement power managemen[...]
-
Page 515
Vol. 3A 13-5 POWER AND THERMAL MANAGEMENT 13.4 THERMAL MONITORING AND PROTECTION The IA-32 architecture provides the following mechanisms for monitoring temperature and controlling thermal power: 1. The catastrophic shutdown detector forces processor execution to stop if the processor’s core temperature rises above a preset limit. 2. Au[...]
-
Page 516
13-6 Vol. 3A POWER AND THERMAL MANAGEMENT 13.4.1 Catastrophic Shutdown Detector P6 family processors introduced a thermal sensor that acts as a catastrophic shutdown detector. This catastrophic shutdown detector was also implemented in Pentium 4, Intel Xeon and Pentium M processors. It is always enabled. When processor core temperature reach[...]
-
Page 517
Vol. 3A 13-7 POWER AND THERMAL MANAGEMENT MSR_THERM2_CTL register is set to 1 (Figure 13-3) and bit 3 of the IA32_MISC_ENABLE register is set to 1. Following a power-up or reset, the TM_SELECT flag may be cleared. BIOS is required to enable either TM1 or TM2. Operating systems and applications must not disable mechanisms that enable TM1 or TM2. [...]
-
Page 518
13-8 Vol. 3A POWER AND THERMAL MANAGEMENT • If TM1 is enabled and the TCC is engaged, the performance state transition can commence before the TCC is disengaged. • If TM2 is enabled and the TCC is engaged, the performance state transition specified by a write to the IA32_PERF_CTL will commence after the TCC has disengaged. 13.4.2.5 Thermal St [...]
-
Page 519
Vol. 3A 13-9 POWER AND THERMAL MANAGEMENT • High-Temperature Interrupt Enable flag, bit 0 — Enables an interrupt to be generated on the transition from a low-temperature to a high-temperature when set; disables the interrupt when clear (R/W). • Low-Temperature Interrupt Enable flag, bit 1 — Enables an interrupt to be generated on the [...]
-
Page 520
13-10 Vol. 3A POWER AND THERMAL MANAGEMENT The IA32_CLOCK_MODULATION MSR contains the following flag and field used to enable software-controlled clock modulation and to select the clock modulation duty cycle: • On-Demand Clock Modulation Enable, bit 4 — Enables on-demand software controlled clock modulation when set; disables software-cont[...]
-
Page 521
Vol. 3A 13-11 POWER AND THERMAL MANAGEMENT 13.4.4 Detection of Thermal Monitor and Software Controlled Clock Modulation Facilities The ACPI flag (bit 22) of the CPUID feature flags indicates the presence of the IA32_THERM_STATUS, IA32_THERM_INTERRUPT, IA32_CLOCK_MODULATION MSRs, and the xAPIC thermal LVT entry. The TM1 flag (bit 29) of[...]
-
Page 522
13-12 Vol. 3A POWER AND THERMAL MANAGEMENT been asserted since a previous RESET or the last time software cleared the bit. Software may clear this bit by writing a zero. • PROCHOT# or FORCEPR# Event (bit 2, RO) — Indicates whether PROCHOT# or FORCEPR# is being asserted. If bit 2 = 1, PROCHOT# or FORCEPR# has been asserted. • PROCHOT# or [...]
-
Page 523
Vol. 3A 13-13 POWER AND THERMAL MANAGEMENT • Thermal Threshold #2 Log (bit 9, R/WC0) — Sticky bit that indicates whether the Thermal Threshold #2 has been reached since the last clearing of this bit or a reset. If bit 9 = 1, the Thermal Threshold #2 has been reached. Software may clear this bit by writing a zero. • Digital Readout (bits 22[...]
-
Page 524
13-14 Vol. 3A POWER AND THERMAL MANAGEMENT • THERMTRIP# Interrupt Enable (bit 2, R/W) — When a catastrophic cooling failure occurs, the processor will automatically shut down. Bit 2 = 0 disables the feature; bit 2 = 1 enables the feature. • FORCEPR# Interrupt Enable (bit 3, R/W) — When a source external to the processor asserts PROCHOT#, t[...]
-
Page 525
14 Machine Check Architecture[...]
-
Page 526
[...]
-
Page 527
Vol. 3A 14-1 CHAPTER 14 MACHINE-CHECK ARCHITECTURE This chapter describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. See Chapter 5, “Interrupt 18—Machine-Check Exception (#MC),” for more information on machine-check exceptions. A brief description of[...]
-
Page 528
14-2 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.3 MACHINE-CHECK MSRS Machine check MSRs in the Pentium 4, Intel Xeon, and P6 family processors consist of a set of global control and status registers and several error-reporting register banks (see Figure 14-1). Each error-reporting bank is associated with a specific hardware unit (or group of hardware [...]
-
Page 529
Vol. 3A 14-3 MACHINE-CHECK ARCHITECTURE Where: • Count field, bits 0 through 7 — Indicates the number of hardware unit error-reporting banks available in a particular processor implementation. • MCG_CTL_P (control MSR present) flag, bit 8 — Indicates that the processor implements the IA32_MCG_CTL MSR when set; this register is absent w[...]
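The Count field and MCG_CTL_P flag described above can be decoded as sketched below, and the count then used to enumerate the per-bank registers. The bank layout (four consecutive MSRs per bank starting at address 400H) is an assumption taken from the architecture's MSR list, not from this excerpt; the function names are hypothetical.

```python
def decode_mcg_cap(value: int) -> dict:
    """Decode the Count field and MCG_CTL_P flag of a raw IA32_MCG_CAP value."""
    return {
        "count": value & 0xFF,                # number of error-reporting banks
        "mcg_ctl_p": bool((value >> 8) & 1),  # IA32_MCG_CTL implemented
    }


def bank_msrs(bank: int) -> dict:
    """Return assumed MSR addresses for one error-reporting bank."""
    base = 0x400 + 4 * bank
    return {"CTL": base, "STATUS": base + 1, "ADDR": base + 2, "MISC": base + 3}


if __name__ == "__main__":
    cap = decode_mcg_cap(0x105)  # illustrative raw value
    for i in range(cap["count"]):
        print(i, {k: f"{v:X}H" for k, v in bank_msrs(i).items()})
```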
-
Page 530
14-4 Vol. 3A MACHINE-CHECK ARCHITECTURE Where: • Count field, bits 0 through 7 — Indicates the number of hardware unit error-reporting banks available in a particular processor implementation. • MCG_CTL_P (register present) flag, bit 8 — Indicates that the MCG_CTL register is present when set and absent when clear. Bits 9 through 63 a[...]
-
Page 531
Vol. 3A 14-5 MACHINE-CHECK ARCHITECTURE 14.3.1.4 IA32_MCG_CTL MSR The IA32_MCG_CTL MSR (called the MCG_CTL MSR in P6 family processors) is present if the capability flag MCG_CTL_P is set in the IA32_MCG_CAP MSR (or the MCG_CAP MSR). IA32_MCG_CTL (or MCG_CTL) controls the reporting of machine-check exceptions. If present, writing 1s to this regi[...]
-
Page 532
14-6 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.3.2.2 IA32_MCi_STATUS MSRs Each IA32_MCi_STATUS MSR (called MCi_STATUS in P6 family processors) contains information related to a machine-check error if its VAL (valid) flag is set (see Figure 14-6). Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to th[...]
-
Page 533
Vol. 3A 14-7 MACHINE-CHECK ARCHITECTURE where the error occurred. Do not read these registers if they are not implemented in the processor. • MISCV (IA32_MCi_MISC register valid) flag, bit 59 — Indicates (when set) that the IA32_MCi_MISC register contains additional information regarding the error. When clear, this flag indicates that[...]
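A status register value can be unpacked as sketched below. Only the MISCV position (bit 59) and the existence of the VAL flag appear in this excerpt; the remaining flag positions (VAL 63, OVER 62, UC 61, EN 60, ADDRV 58, PCC 57) and the error-code fields (bits 0 through 15 and 16 through 31) are assumptions taken from the architecture's register figure. `decode_mci_status` is a hypothetical name.

```python
STATUS_FLAGS = {"val": 63, "over": 62, "uc": 61, "en": 60,
                "miscv": 59, "addrv": 58, "pcc": 57}


def decode_mci_status(value: int) -> dict:
    """Decode the flags and error codes of a raw IA32_MCi_STATUS value."""
    decoded = {name: bool((value >> bit) & 1) for name, bit in STATUS_FLAGS.items()}
    decoded["mca_error_code"] = value & 0xFFFF
    decoded["model_specific_code"] = (value >> 16) & 0xFFFF
    return decoded


if __name__ == "__main__":
    raw = (1 << 63) | (1 << 61) | (1 << 59) | 0x0111   # illustrative value
    status = decode_mci_status(raw)
    if status["val"]:
        print(f"error code {status['mca_error_code']:04X}H, "
              f"MISC valid: {status['miscv']}")
```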
-
Page 534
14-8 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.3.2.4 IA32_MCi_MISC MSRs The IA32_MCi_MISC MSR (called the MCi_MISC MSR in the P6 family processors) contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. The IA32_MCi_MISC MSR is either not implemented or does not contain a[...]
-
Page 535
Vol. 3A 14-9 MACHINE-CHECK ARCHITECTURE In processors with support for Intel EM64T, 64-bit machine check state MSRs are aliased to the legacy MSRs. In addition, there may be registers beyond IA32_MCG_MISC. These may include up to five reserved MSRs (IA32_MCG_RESERVED[1:5]) and save-state MSRs for registers introduced in 64-bit mode. See Tab[...]
-
Page 536
14-10 Vol. 3A MACHINE-CHECK ARCHITECTURE When a machine-check error is detected on a Pentium 4 or Intel Xeon processor, the processor saves the state of the general-purpose registers, the R/EFLAGS register, and the R/EIP in these extended machine-check state MSRs. This information can be used by a debugger to analyze the error. These registers [...]
-
Page 537
Vol. 3A 14-11 MACHINE-CHECK ARCHITECTURE 14.3.3 Mapping of the Pentium Processor Machine-Check Errors to the Machine-Check Architecture The Pentium processor reports machine-check errors using two registers: P5_MC_TYPE and P5_MC_ADDR. The Pentium 4, Intel Xeon, and P6 family processors map these registers to the IA32_MCi_STATUS and IA32_MC [...]
-
Page 538
14-12 Vol. 3A MACHINE-CHECK ARCHITECTURE
Example 14-19. Machine-Check Initialization Pseudocode
Check CPUID Feature Flags for MCE and MCA support
IF CPU supports MCE
THEN
    IF CPU supports MCA
    THEN
        IF (IA32_MCG_CAP.MCG_CTL_P = 1) (* IA32_MCG_CTL register is present *)
        THEN
            IA32_MCG_CTL ← FFFFFFFFFFFFFFFFH; (* enables all MCA features *)
        FI
        (* De[...]
-
Page 539
Vol. 3A 14-13 MACHINE-CHECK ARCHITECTURE
FOR error-reporting banks (0 through MAX_BANK_NUMBER)
DO
    (Optional for BIOS and OS) Log valid errors
    (OS only) IA32_MCi_STATUS ← 0;
OD
FI
FI
FI
Setup the Machine Check Exception (#MC) handler for vector 18 in IDT
Set the MCE bit (bit 6) in CR4 register to enable Machine-Check Exceptions
FI
14.6. INTERP[...]
-
Page 540
14-14 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.6.2 Compound Error Codes Compound error codes describe errors related to the TLBs, memory, caches, bus and interconnect logic, and internal timer. A set of sub-fields is common to all compound errors. These sub-fields describe the type of access, level in the memory hierarchy, and type of reque[...]
-
Page 541
Vol. 3A 14-15 MACHINE-CHECK ARCHITECTURE For example, the error code ICACHEL1_RD_ERR is constructed from the form: {TT}CACHE{LL}_{RRRR}_ERR, where {TT} is replaced by I, {LL} is replaced by L1, and {RRRR} is replaced by RD. The 2-bit TT sub-field (Table 14-5) indicates the type of transaction (data, instruction, or generic). The sub-field appli[...]
-
Page 542
14-16 Vol. 3A MACHINE-CHECK ARCHITECTURE The 4-bit RRRR sub-field (see Table 14-7) indicates the type of action associated with the error. Actions include read and write operations, prefetches, cache evictions, and snoops. Generic error is returned when the type of error cannot be determined. Generic read and generic write are returned when [...]
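The construction of a mnemonic such as ICACHEL1_RD_ERR from the TT, LL, and RRRR sub-fields can be sketched as a decoder for cache-hierarchy errors. The binary layout assumed here (0000 0001 RRRR TTLL) and the sub-field encodings are taken from the architecture's compound error code tables, which are not reproduced in this excerpt; `decode_cache_error` is a hypothetical name.

```python
TT = {0: "I", 1: "D", 2: "G"}                       # transaction type
LL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}           # memory hierarchy level
RRRR = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
        5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}  # request type


def decode_cache_error(code: int):
    """Return the mnemonic for a cache-hierarchy MCA error code, else None."""
    if (code & 0xFF00) != 0x0100:
        return None                      # not a cache-hierarchy error
    tt = TT.get((code >> 2) & 0x3)
    ll = LL[code & 0x3]
    rrrr = RRRR.get((code >> 4) & 0xF)
    if tt is None or rrrr is None:
        return None                      # reserved sub-field encoding
    return f"{tt}CACHE{ll}_{rrrr}_ERR"


if __name__ == "__main__":
    print(decode_cache_error(0x0111))
```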
-
Page 543
Vol. 3A 14-17 MACHINE-CHECK ARCHITECTURE 14.6.3 Machine-Check Error Codes Interpretation Appendix E, “Interpreting Machine-Check Error Codes,” provides information on interpreting the MCA error code, model-specific error code, and other information error code fields. For P6 family processors, information has been included on decoding exte[...]
-
Page 544
14-18 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.7.1 Machine-Check Exception Handler The machine-check exception (#MC) corresponds to vector 18. To service machine-check exceptions, a trap gate must be added to the IDT. The pointer in the trap gate must point to a machine-check exception handler. Two approaches can be taken to designing the exc[...]
-
Page 545
Vol. 3A 14-19 MACHINE-CHECK ARCHITECTURE • The MCIP flag in the IA32_MCG_STATUS register indicates whether a machine-check exception was generated. Before returning from the machine-check exception handler, software should clear this flag so that it can be used reliably by an error logging utility. The MCIP flag also detects recursion. T[...]
-
Page 546
14-20 Vol. 3A MACHINE-CHECK ARCHITECTURE 14.7.3 Pentium Processor Machine-Check Exception Handling To make the machine-check exception handler portable to the Pentium 4, Intel Xeon, P6 family, and Pentium processors, checks can be made (using CPUID) to determine the processor type. Then based on the processor type, machine-check exceptions ca[...]
-
Page 547
Vol. 3A 14-21 MACHINE-CHECK ARCHITECTURE
AND RIPV flag in IA32_MCG_STATUS = 0 (* execution is not restartable *)
THEN
    RESTARTABILITY = FALSE;
    return RESTARTABILITY to calling procedure;
FI;
Save time-stamp counter and processor ID;
Set IA32_MCi_STATUS to all 0s;
Execute serializing instruction (i.e., CPUID);
FI;
OD;
FI;
If the processor suppor[...]
-
Page 548
14-22 Vol. 3A MACHINE-CHECK ARCHITECTURE The basic algorithm given in Example 14-21 can be modified to provide more robust recovery techniques. For example, software has the flexibility to attempt recovery using information unavailable to the hardware. Specifically, the machine-check exception handler can, after logging, carefully analyze the e[...]
-
Page 549
15 8086 Emulation[...]
-
Page 550
[...]
-
Page 551
Vol. 3A 15-1 CHAPTER 15 8086 EMULATION IA-32 processors (beginning with the Intel386 processor) provide two ways to execute new or legacy programs that are assembled and/or compiled to run on an Intel 8086 processor: • Real-address mode. • Virtual-8086 mode. Figure 2-3 shows the relationship of these operating modes to protected mode and[...]
-
Page 552
15-2 Vol. 3A 8086 EMULATION The following is a summary of the core features of the real-address mode execution environment as would be seen by a program written for the 8086: • The processor supports a nominal 1-MByte physical address space (see Section 15.1.1, “Address Translation in Real-Address Mode”, for specific details). This addre[...]
-
Page 553
Vol. 3A 15-3 8086 EMULATION 8-byte entries) used when handling protected-mode interrupts and exceptions. Interrupt and exception vector numbers provide an index to entries in the interrupt table. Each entry provides a pointer (called a “vector”) to an interrupt- or exception-handling procedure. See Section 15.1.4, “Interrupt and Exc [...]
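As a rough illustration of how a vector number indexes the real-address mode interrupt table, the sketch below assumes the 8086-style layout of 4-byte entries (a 16-bit offset followed by a 16-bit segment) with the table starting at address 0. The function name `ivt_entry` and the sample memory contents are hypothetical and are not taken from the manual.

```python
import struct

def ivt_entry(memory, vector, base=0):
    """Return (segment, offset) stored for a real-address mode vector.

    Assumes 4-byte entries: little-endian 16-bit offset, then 16-bit segment.
    """
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in 0..255")
    addr = base + vector * 4              # each entry occupies 4 bytes
    offset, segment = struct.unpack_from("<HH", memory, addr)
    return segment, offset

# Hypothetical 1-KByte table with one handler installed for vector 21H.
memory = bytearray(1024)
struct.pack_into("<HH", memory, 0x21 * 4, 0x1234, 0xF000)
segment, offset = ivt_entry(memory, 0x21)
```

The vector number is simply scaled by the entry size, which is why the table for 256 vectors fits in the first 1 KByte of memory under these assumptions.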
-
Page 554
15-4 Vol. 3A 8086 EMULATION behavior of the 8086 processor.) Care should be taken to ensure that A20M#-based address wrapping is handled correctly in multiprocessor-based systems. The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value [...]
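The address wrapping mentioned here can be sketched with a small calculation. The example assumes the usual real-address mode translation (16-bit segment shifted left by 4 bits, plus a 16-bit offset) and models A20M# as a simple mask of address bit 20; the function name `real_mode_linear` and its `a20_masked` parameter are hypothetical.

```python
def real_mode_linear(segment, offset, a20_masked=False):
    """Compute a real-address mode linear address from segment:offset.

    When a20_masked is True, bit 20 is forced to 0, which emulates the
    1-MByte wraparound of the 8086.
    """
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if a20_masked:
        linear &= 0xFFFFF                 # keep only address bits 0..19
    return linear

# FFFF:0010 is the first byte above 1 MByte unless bit 20 is masked.
above_1mb = real_mode_linear(0xFFFF, 0x0010)
wrapped = real_mode_linear(0xFFFF, 0x0010, a20_masked=True)
```

Without masking, the sum can reach just under 10FFF0H, so software written for the 8086 that relies on wraparound to address 0 behaves differently unless bit 20 is masked.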
-
Page 555
Vol. 3A 15-5 8086 EMULATION • Logical instructions AND, OR, XOR, and NOT. • Decimal instructions DAA, DAS, AAA, AAS, AAM, and AAD. • Stack instructions PUSH and POP (to general-purpose registers and segment registers). • Type conversion instructions CWD, CDQ, CBW, and CWDE. • Shift and rotate instructions SAL, SHL, SHR, SAR, ROL,[...]
-
Page 556
15-6 Vol. 3A 8086 EMULATION • ENTER and LEAVE control instructions. • BOUND instruction. • CPU identification (CPUID) instruction. • System instructions CLTS, INVD, WBINVD, INVLPG, LGDT, SGDT, LIDT, SIDT, LMSW, SMSW, RDMSR, WRMSR, RDTSC, and RDPMC. Execution of any of the other IA-32 architecture instructions (not given in the[...]
-
Page 557
Vol. 3A 15-7 8086 EMULATION (For backward compatibility to Intel 8086 processors, the default base address and limit of the interrupt vector table should not be changed.) Table 15-1 shows the interrupt and exception vectors that can be generated in real-address mode and virtual-8086 mode, and in the Intel 8086 processor. See Chapter 5, “[...]
-
Page 558
15-8 Vol. 3A 8086 EMULATION Table 15-1. Real-Address Mode Exceptions and Interrupts
Vector No. | Description | Real-Address Mode | Virtual-8086 Mode | Intel 8086 Processor
0 | Divide Error (#DE) | Yes | Yes | Yes
1 | Debug Exception (#DB) | Yes | Yes | No
2 | NMI Interrupt | Yes | Yes | Yes
3 | Breakpoint (#BP) | Yes | Yes | Yes
4 | Overflow (#OF) | Yes | Yes | Yes
5 | BOUND Ran[...]
-
Page 559
Vol. 3A 15-9 8086 EMULATION 15.2.1 Enabling Virtual-8086 Mode The processor runs in virtual-8086 mode when the VM (virtual machine) flag in the EFLAGS register is set. This flag can only be set when the processor switches to a new protected-mode task or resumes virtual-8086 mode via an IRET instruction. System software cannot change the state of [...]
-
Page 560
15-10 Vol. 3A 8086 EMULATION The 8086 operating-system services consist of a kernel and/or operating-system procedures that the 8086 program makes calls to. These services can be implemented in either of the following two ways: • They can be included in the 8086 program. This approach is desirable for either of the following reasons: — The[...]
-
Page 561
Vol. 3A 15-11 8086 EMULATION • When sharing the 8086 operating-system services or ROM code that is common to several 8086 programs running as different 8086-mode tasks. • When redirecting or trapping references to memory-mapped I/O devices. 15.2.4 Protection within a Virtual-8086 Task Protection is not enforced between the segments of an 8[...]
-
Page 562
15-12 Vol. 3A 8086 EMULATION Figure 15-3. Entering and Leaving Virtual-8086 Mode (diagram of the transitions between real-address mode, protected mode, and virtual-8086 mode) [...]
-
Page 563
Vol. 3A 15-13 8086 EMULATION 15.2.6 Leaving Virtual-8086 Mode The processor can leave the virtual-8086 mode only through an interrupt or exception. The following are situations where an interrupt or exception will lead to the processor leaving virtual-8086 mode (see Figure 15-3): • The processor services a hardware interrupt generated to sign[...]
-
Page 564
15-14 Vol. 3A 8086 EMULATION 15.2.7 Sensitive Instructions When an IA-32 processor is running in virtual-8086 mode, the CLI, STI, PUSHF, POPF, INT n, and IRET instructions are sensitive to IOPL. The IN, INS, OUT, and OUTS instructions, which are sensitive to IOPL in protected mode, are not sensitive in virtual-8086 mode. The CPL is always 3 w[...]
-
Page 565
Vol. 3A 15-15 8086 EMULATION 15.2.8.2 Memory-Mapped I/O In systems which use memory-mapped I/O, the paging facilities of the processor can be used to generate exceptions for attempts to access I/O ports. The virtual-8086 monitor may use paging to control memory-mapped I/O in these ways: • Map part of the linear address space of each task tha[...]
-
Page 566
15-16 Vol. 3A 8086 EMULATION The method the processor uses to handle class 2 and 3 interrupts depends on the setting of the following flags and fields: • IOPL field (bits 12 and 13 in the EFLAGS register) — Controls how class 3 software interrupts are handled when the processor is in virtual-8086 mode (see Section 2.3, “System Flags and [...]
-
Page 567
Vol. 3A 15-17 8086 EMULATION 15.3.1 Class 1—Hardware Interrupt and Exception Handling in Virtual-8086 Mode In virtual-8086 mode, the Pentium, P6 family, Pentium 4, and Intel Xeon processors handle hardware interrupts and exceptions in the same manner as they are handled by the Intel486 and Intel386 processors. They invoke the protected-mode [...]
-
Page 568
15-18 Vol. 3A 8086 EMULATION Interrupt and exception handlers can examine the VM flag on the stack to determine if the interrupted procedure was running in virtual-8086 mode. If so, the interrupt or exception can be handled in one of three ways: • The protected-mode interrupt or exception handler that was called can handle the interrupt or [...]
-
Page 569
Vol. 3A 15-19 8086 EMULATION The virtual-8086 monitor runs at privilege level 0, like the protected-mode interrupt and exception handlers. It is commonly closely tied to the protected-mode general-protection exception (#GP, vector 13) handler. If the protected-mode interrupt or exception handler calls the virtual-8086 monitor to handle[...]
-
Page 570
15-20 Vol. 3A 8086 EMULATION 15.3.1.3 Handling an Interrupt or Exception Through a Task Gate When an interrupt or exception vector points to a task gate in the IDT, the processor performs a task switch to the selected interrupt- or exception-handling task. The following actions are carried out as part of this task switch: 1. The EFLAGS registe[...]
-
Page 571
Vol. 3A 15-21 8086 EMULATION available or not enabled, maskable hardware interrupts are handled as class 1 interrupts. Here, if VIF and VIP flags are needed, the virtual-8086 monitor can implement them in software. Existing 8086 programs commonly set and clear the IF flag in the EFLAGS register to enable and disable maskable hardware interrupts,[...]
-
Page 572
15-22 Vol. 3A 8086 EMULATION 3. The virtual-8086 monitor should read the VIF flag in the EFLAGS register. — If the VIF flag is clear, the virtual-8086 monitor sets the VIP flag in the EFLAGS image on the stack to indicate that there is a deferred interrupt pending and returns to the protected-mode handler. — If the VIF flag is set, the vi[...]
-
Page 573
Vol. 3A 15-23 8086 EMULATION 15.3.3 Class 3—Software Interrupt Handling in Virtual-8086 Mode When the processor receives a software interrupt (an interrupt generated with the INT n instruction) while in virtual-8086 mode, it can use any of six different methods to handle the interrupt. The method selected depends on the settings of the VME fla[...]
-
Page 574
15-24 Vol. 3A 8086 EMULATION Table 15-2. Software Interrupt Handling Methods While in Virtual-8086 Mode Method VME IOPL Bit in Redir. Bitmap* Processor Action 1 0 3 X Interrupt directed to a protected-mode interrupt handler: - Switches to privilege-level 0 stack - Pushes GS, FS, DS and ES onto privilege-level 0 stack - Pushes SS, ESP, EFLAGS, C[...]
-
Page 575
Vol. 3A 15-25 8086 EMULATION Redirecting software interrupts back to the 8086 program potentially speeds up interrupt handling because a switch back and forth between virtual-8086 mode and protected mode is not required. This latter interrupt-handling technique is particularly useful for 8086 operating systems (such as MS-DOS) that use the INT[...]
-
Page 576
15-26 Vol. 3A 8086 EMULATION 15.3.3.2 Methods 2 and 3: Software Interrupt Handling When a software interrupt occurs in virtual-8086 mode and the method 2 or 3 conditions are present, the processor generates a general-protection exception (#GP). Method 2 is enabled when the VME flag is set to 0 and the IOPL value is less than 3. Here the IOPL v[...]
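The way the three settings select one of the six methods can be sketched as a small decision function. Only the conditions for methods 1 and 2 are visible in this excerpt; the conditions used below for methods 3 through 6 follow the commonly published form of Table 15-2 and should be treated as an assumption to check against the full table. The function name `software_interrupt_method` is hypothetical.

```python
def software_interrupt_method(vme, iopl, redirect_bit):
    """Return the method number (1..6) for an INT n in virtual-8086 mode.

    vme          -- VME flag in CR4 (True/False)
    iopl         -- IOPL field of EFLAGS (0..3)
    redirect_bit -- bit for this vector in the redirection bitmap
                    (ignored when VME is clear)
    """
    if not 0 <= iopl <= 3:
        raise ValueError("IOPL must be in 0..3")
    if not vme:
        # Methods 1 and 2: the bitmap is not consulted.
        return 1 if iopl == 3 else 2      # method 2 raises #GP
    if redirect_bit:
        # Assumed: bit set means the interrupt goes to protected mode.
        return 4 if iopl == 3 else 3      # method 3 raises #GP
    # Assumed: bit clear means redirection back to the 8086 program.
    return 5 if iopl == 3 else 6

methods = [
    software_interrupt_method(False, 3, 0),
    software_interrupt_method(False, 0, 0),
    software_interrupt_method(True, 2, 1),
    software_interrupt_method(True, 3, 1),
    software_interrupt_method(True, 3, 0),
    software_interrupt_method(True, 0, 0),
]
```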
-
Page 577
Vol. 3A 15-27 8086 EMULATION 6. Loads the CS and EIP registers with values from the interrupt vector table entry pointed to by the interrupt vector number. Only the 16 low-order bits of the EIP are loaded and the 16 high-order bits are set to 0. The interrupt vector table is assumed to be at linear address 0 of the current virtual-8086 task. 7[...]
-
Page 578
15-28 Vol. 3A 8086 EMULATION 15.4 PROTECTED-MODE VIRTUAL INTERRUPTS The IA-32 processors (beginning with the Pentium processor) also support the VIF and VIP flags in the EFLAGS register in protected mode by setting the PVI (protected-mode virtual interrupt) flag in the CR4 register. Setting the PVI flag allows applications running at privile[...]
-
Page 579
16 Mixing 16-Bit and 32-Bit Code[...]
-
Page 580
[...]
-
Page 581
Vol. 3A 16-1 CHAPTER 16 MIXING 16-BIT AND 32-BIT CODE Program modules written to run on IA-32 processors can be either 16-bit modules or 32-bit modules. Table 16-1 shows the characteristics of 16-bit and 32-bit modules. The IA-32 processors function most efficiently when executing 32-bit program modules. They can, however, also execute 16-bi [...]
-
Page 582
16-2 Vol. 3A MIXING 16-BIT AND 32-BIT CODE 16.1 DEFINING 16-BIT AND 32-BIT PROGRAM MODULES The following IA-32 architecture mechanisms are used to distinguish between and support 16-bit and 32-bit segments and operations: • The D (default operand and address size) flag in code-segment descriptors. • The B (default stack size) flag in stack-[...]
-
Page 583
Vol. 3A 16-3 MIXING 16-BIT AND 32-BIT CODE These prefixes reverse the default size selected by the D flag in the code-segment descriptor. For example, the processor can interpret the (MOV mem, reg) instruction in any of four ways: • In a 32-bit code segment: — Moves 32 bits from a 32-bit register to memory using a 32-bit effective addre[...]
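The interaction between the D flag and the size prefixes can be sketched as follows. The example assumes that each prefix simply toggles the corresponding default between 16 and 32 bits, as the text describes; the function name `effective_sizes` and its parameters are hypothetical.

```python
def effective_sizes(d_flag, operand_prefix=False, address_prefix=False):
    """Return (operand_size, address_size) in bits for one instruction.

    d_flag         -- D flag of the code-segment descriptor (True = 32-bit)
    operand_prefix -- an operand-size prefix is present
    address_prefix -- an address-size prefix is present
    """
    default = 32 if d_flag else 16
    reversed_size = 16 if d_flag else 32
    operand = reversed_size if operand_prefix else default
    address = reversed_size if address_prefix else default
    return operand, address

# The four interpretations available inside a 32-bit code segment.
in_32bit_segment = [
    effective_sizes(True),
    effective_sizes(True, operand_prefix=True),
    effective_sizes(True, address_prefix=True),
    effective_sizes(True, operand_prefix=True, address_prefix=True),
]
```

The same function with `d_flag=False` gives the mirror-image cases for a 16-bit code segment.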
-
Page 584
16-4 Vol. 3A MIXING 16-BIT AND 32-BIT CODE A stack that spans less than 64 KBytes can be shared by both 16- and 32-bit code segments. This class of stacks includes: • Stacks in expand-up segments with the G (granularity) and B (big) flags in the stack-segment descriptor clear. • Stacks in expand-down segments with the G and B flags clear.[...]
-
Page 585
Vol. 3A 16-5 MIXING 16-BIT AND 32-BIT CODE These methods of transferring program control overcome the following architectural limitations imposed on calls between 16-bit and 32-bit code segments: • Pointers from 16-bit code segments (which by default can only be 16 bits) cannot be used to address data or code located beyond FFFFH in a 32-bi[...]
-
Page 586
16-6 Vol. 3A MIXING 16-BIT AND 32-BIT CODE While executing 32-bit code, if a call is made to a 16-bit code segment which is at the same or a more privileged level (that is, the DPL of the called code segment is less than or equal to the CPL of the calling code segment) through a 16-bit call gate, then the upper 16 bits of the ESP register may[...]
-
Page 587
Vol. 3A 16-7 MIXING 16-BIT AND 32-BIT CODE 16.4.2.1 Controlling the Operand-Size Attribute For a Call Three things can determine the operand-size of a call: • The D flag in the segment descriptor for the calling code segment. • An operand-size instruction prefix. • The type of call gate (16-bit or 32-bit), if a call is made through a call[...]
-
Page 588
16-8 Vol. 3A MIXING 16-BIT AND 32-BIT CODE 16.4.3 Interrupt Control Transfers A program-control transfer caused by an exception or interrupt is always carried out through an interrupt or trap gate (located in the IDT). Here, the type of the gate (16-bit or 32-bit) determines the operand-size attribute used in the implicit call to the exception [...]
-
Page 589
Vol. 3A 16-9 MIXING 16-BIT AND 32-BIT CODE The interface procedure becomes more complex if any of these rules are violated. For example, if a 16-bit procedure calls a 32-bit procedure with an entry point beyond FFFFH, the interface procedure will need to provide the offset to the entry point. The mapping between 16- and 32-bit addresses is only[...]
-
Page 590
16-10 Vol. 3A MIXING 16-BIT AND 32-BIT CODE[...]
-
Page 591
17 IA-32 Architecture Compatibility[...]
-
Page 592
[...]
-
Page 593
Vol. 3A 17-1 CHAPTER 17 IA-32 ARCHITECTURE COMPATIBILITY All IA-32 processors are binary compatible. Compatibility means that, within certain limited constraints, programs that execute on previous generations of IA-32 processors will produce identical results when executed on later IA-32 processors. The compatibility constraints and any implem[...]
-
Page 594
17-2 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.2. RESERVED BITS Throughout this manual, certain bits are marked as reserved in many register and memory layout descriptions. When bits are marked as undefined or reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknow[...]
-
Page 595
Vol. 3A 17-3 IA-32 ARCHITECTURE COMPATIBILITY 2. Execute the CPUID instruction. The CPUID instruction (added to the IA-32 in the Pentium processor) indicates the presence of new features directly. See Chapter 14, “Processor Identification and Feature Determination,” in the IA-32 Intel® Architecture Software Developer’s Manual, Volume[...]
-
Page 596
17-4 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY ming for conversion to integer. The remaining two instructions (MONITOR and MWAIT) accelerate synchronization of threads. SSE3 instructions are described in Chapter 12, “Programming with Streaming SIMD Extensions 3 (SSE3),” in the IA-32 Intel® Architecture Software Developer’s Manual, V [...]
-
Page 597
Vol. 3A 17-5 IA-32 ARCHITECTURE COMPATIBILITY 17.12.1 Instructions Added Prior to the Pentium Processor The following instructions were added in the Intel486 processor: • BSWAP (byte swap) instruction. • XADD (exchange and add) instruction. • CMPXCHG (compare and exchange) instruction. • INVD (invalidate cache) instruction. • WBINVD[...]
-
Page 598
17-6 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY • Bit scan instructions. • Double-shift instructions. • Byte set on condition instruction. • Move with sign/zero extension. • Generalized multiply instruction. • MOV to and from control registers. • MOV to and from test registers (now obsolete). • MOV to and from debug registers. •[...]
-
Page 599
Vol. 3A 17-7 IA-32 ARCHITECTURE COMPATIBILITY • VIP (virtual interrupt pending), bit 20. • ID (identification flag), bit 21. The AC flag (bit 18) was added to the EFLAGS register in the Intel486 processor. 17.15.1 Using EFLAGS Flags to Distinguish Between 32-Bit IA-32 Processors The following bits in the EFLAGS register that can be used to[...]
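The bit positions quoted in this chapter can be collected into a small decoder for a saved EFLAGS image. The positions of AC, VIP, ID, and the IOPL field come from the text; the positions assumed for NT (bit 14), VM (bit 17), and VIF (bit 19) are the standard IA-32 assignments and are not stated in this excerpt. The names `EFLAGS_BITS` and `decode_eflags` are hypothetical.

```python
# Single-bit system flags and their bit positions in EFLAGS.
EFLAGS_BITS = {
    "NT": 14,    # assumed standard position
    "VM": 17,    # assumed standard position
    "AC": 18,
    "VIF": 19,   # assumed standard position
    "VIP": 20,
    "ID": 21,
}

def decode_eflags(value):
    """Decode selected system flags from a 32-bit EFLAGS image."""
    flags = {name: bool((value >> bit) & 1)
             for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0x3   # two-bit field, bits 12 and 13
    return flags

# Hypothetical image: VM set, IOPL = 3, everything else clear.
decoded = decode_eflags(0x00023000)
```

On real hardware the ID and AC flags are tested by attempting to toggle them; this sketch only decodes an image that has already been saved.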
-
Page 600
17-8 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.16.2 EFLAGS Pushed on the Stack The setting of the stored values of bits 12 through 15 (which includes the IOPL field and the NT flag) in the EFLAGS register by the PUSHF instruction, by interrupts, and by exceptions is different with the 32-bit IA-32 processors than with the 8086 and Intel 286 p[...]
-
Page 601
Vol. 3A 17-9 IA-32 ARCHITECTURE COMPATIBILITY As on the Intel 286 and Intel386 processors, the MP (monitor coprocessor) flag (bit 1 of register CR0) determines whether the WAIT/FWAIT instructions or waiting-type floating-point instructions trap when the context of the x87 FPU is different from that of the currently-executing task. If the MP [...]
-
Page 602
17-10 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY is reserved on these processors. The addition of the SF flag on a 32-bit x87 FPU has no impact on software. Existing exception handlers need not change, but may be upgraded to take advantage of the additional information. 17.17.3 x87 FPU Control Word Only affine closure is supported for infin[...]
-
Page 603
Vol. 3A 17-11 IA-32 ARCHITECTURE COMPATIBILITY 17.17.5.1 NANS The 32-bit x87 FPUs distinguish between signaling NaNs (SNaNs) and quiet NaNs (QNaNs). These x87 FPUs only generate QNaNs and normally do not generate an exception upon encountering a QNaN. An invalid-operation exception (#I) is generated only upon encountering a SNaN, except for [...]
-
Page 604
17-12 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.6.2 NUMERIC OVERFLOW EXCEPTION (#O) On the 32-bit x87 FPUs, when the numeric overflow exception is masked and the rounding mode is set to chop (toward 0), the result is the largest positive or smallest negative number. The 16-bit IA-32 math coprocessors do not signal the overflow exception[...]
-
Page 605
Vol. 3A 17-13 IA-32 ARCHITECTURE COMPATIBILITY 16-bit IA-32 math coprocessors, it takes precedence over all other exceptions. This difference causes no impact on existing software, but some unneeded normalization of denormalized operands is prevented on the Intel486 processor and Intel 387 math coprocessor. 17.17.6.5 CS AND EIP FOR FPU EXCEPTIO[...]
-
Page 606
17-14 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.6.8 INVALID OPERATION EXCEPTION ON DENORMALS An invalid-operation exception is not generated on the 32-bit x87 FPUs upon encountering a denormal value when executing a FSQRT, FDIV, or FPREM instruction or upon conversion to BCD or to integer. The operation proceeds by first normalizing t[...]
-
Page 607
Vol. 3A 17-15 IA-32 ARCHITECTURE COMPATIBILITY 17.17.6.14 FLOATING-POINT ERROR EXCEPTION (#MF) In real mode and protected mode (not including virtual-8086 mode), interrupt vector 16 must point to the floating-point exception handler. In virtual-8086 mode, the virtual-8086 monitor can be programmed to accommodate a different location of the[...]
-
Page 608
17-16 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.7.5 FUCOM, FUCOMP, AND FUCOMPP INSTRUCTIONS When executing the FUCOM, FUCOMP, and FUCOMPP instructions, the 32-bit x87 FPUs perform unordered compare according to IEEE Standard 754. These instructions do not exist on the 16-bit IA-32 math coprocessors. The availability of these new instruc[...]
-
Page 609
Vol. 3A 17-17 IA-32 ARCHITECTURE COMPATIBILITY 16-bit IA-32 math coprocessors do report a denormal-operand exception in this situation. This difference does not affect existing software. On the 32-bit x87 FPUs, loading a denormal value that is in single- or double-real format causes the value to be converted to extended-real format. Loading a [...]
-
Page 610
17-18 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.17.7.15 FXAM INSTRUCTION With the 32-bit x87 FPUs, if the FPU encounters an empty register when executing the FXAM instruction, it does not generate combinations of C0 through C3 equal to 1101 or 1111. The 16-bit IA-32 math coprocessors may generate these combi[...]
-
Page 611
Vol. 3A 17-19 IA-32 ARCHITECTURE COMPATIBILITY 17.17.11 Operands Split Across Segments and/or Pages On the P6 family, Pentium, and Intel486 processor FPUs, when the first half of an operand to be written is inside a page or segment and the second half is outside, a memory fault can cause the first half to be stored but not the second half. In[...]
-
Page 612
17-20 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY coprocessor keeps its ERROR# output in inactive state after hardware reset; the Intel 387 coprocessor keeps its ERROR# output in active state after hardware reset. Upon hardware reset or execution of the FINIT/FNINIT instruction, the Intel 387 math coprocessor signals an error condition. The P[...]
-
Page 613
Vol. 3A 17-21 IA-32 ARCHITECTURE COMPATIBILITY
cmp ax, 037fh
jz Intel487_SX_Math_CoProcessor_present ;ax=037fh
jmp Intel486_SX_microprocessor_present ;ax=ffffh
If the Intel 487 SX math coprocessor is not present, the following code can be run to set the CR0 register for the Intel486 SX processor.
mov eax, cr0
and eax, fffffffdh ;make MP=0
or ea[...]
-
Page 614
17-22 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY The content of CR4 is 0H following a hardware reset. Control register CR4 was introduced in the Pentium processor. This register contains flags that enable certain new extensions provided in the Pentium processor: • VME — Virtual-8086 mode extensions. Enables support for a virtual interrupt fl[...]
-
Page 615
Vol. 3A 17-23 IA-32 ARCHITECTURE COMPATIBILITY 17.21. MEMORY MANAGEMENT FACILITIES The following sections describe the new memory management facilities available in the various IA-32 processors and some compatibility differences. 17.21.1 New Memory Management Control Flags The Pentium Pro processor introduced three new memory management fea[...]
-
Page 616
17-24 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY the data cache; in the Intel486 processor, they implement a write-through strategy. See Table 10-5 for a comparison of these bits on the P6 family, Pentium, and Intel486 processors. For complete information on caching, see Chapter 10, “Memory Cache Control.” 17.21.3 Descriptor Types and C[...]
-
Page 617
Vol. 3A 17-25 IA-32 ARCHITECTURE COMPATIBILITY On the P6 family and Pentium processors, reserved bits 11, 12, 14 and 15 are hard-wired to 0. On the Intel486 processor, however, bit 12 can be set. See Table 9-1 for the different settings of this register following a power-up or hardware reset. 17.22.3 Debug Registers DR4 and DR5 Although the D[...]
-
Page 618
17-26 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY tecture has been added for handling and reporting on hardware errors. See Chapter 14, “Machine-Check Architecture,” for a detailed description of the new conditions. The following exceptions and/or exception conditions were added to the IA-32 with the Pentium processor: • Machine-check except[...]
-
Page 619
Vol. 3A 17-27 IA-32 ARCHITECTURE COMPATIBILITY 17.24.1 Machine-Check Architecture The Pentium Pro processor introduced a new architecture to the IA-32 for handling and reporting on machine-check exceptions. This machine-check architecture (described in detail in Chapter 14, “Machine-Check Architecture”) greatly expands the ability of the [...]
-
Page 620
17-28 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.25.3 IDT Limit The LIDT instruction can be used to set a limit on the size of the IDT. A double-fault exception (#DF) is generated if an interrupt or exception attempts to read a vector beyond the limit. Shutdown then occurs on the 32-bit IA-32 processors if the double-fault handler vector is b[...]
-
Page 621
Vol. 3A 17-29 IA-32 ARCHITECTURE COMPATIBILITY • For the 82489DX, in the lowest priority delivery mode, all the target local APICs specified by the destination field participate in the lowest priority arbitration. For the local APIC, only those local APICs which have free interrupt slots will participate in the lowest priority arbitration. 17.[...]
-
Page 622
17-30 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.27.1 P6 Family and Pentium Processor TSS When the virtual mode extensions are enabled (by setting the VME flag in control register CR4), the TSS in the P6 family and Pentium processors contains an interrupt redirection bit map, which is used in virtual-8086 mode to redirect interrupts back to an[...]
-
Page 623
Vol. 3A 17-31 IA-32 ARCHITECTURE COMPATIBILITY general-protection exceptions (#GP). Figure 17-1 demonstrates the different areas accessed by the Intel486 and the P6 family and Pentium processors. 17.28. CACHE MANAGEMENT The P6 family processors include two levels of internal caches: L1 (level 1) and L2 (level 2). The L1 cache is divided into an i[...]
-
Page 624
17-32 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY External system hardware can force the Pentium processor to disable caching or to use the write-through cache policy should that be required. In the P6 family processors, the MTRRs can be used to override the CD and NW flags (see Table 10-6). The P6 family and Pentium processors support page-level [...]
-
Page 625
Vol. 3A 17-33 IA-32 ARCHITECTURE COMPATIBILITY cache to be disabled and enabled, independently of the L1 and L2 caches (see Section 10.5.4, “Disabling and Enabling the L3 Cache”). 17.29. PAGING This section identifies enhancements made to the paging mechanism and implementation differences in the paging mechanism for various IA-32 processors.[...]
-
Page 626
17-34 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY The sequence bounded by the MOV and JMP instructions should be identity mapped (that is, the instructions should reside on a page whose linear and physical addresses are identical). For the P6 family processors, the MOV CR0, REG instruction is serializing, so the jump operation is not required. How[...]
-
Page 627
Vol. 3A 17-35 IA-32 ARCHITECTURE COMPATIBILITY 17.30.2 Error Code Pushes The Intel486 processor implements the error code pushed on the stack as a 16-bit value. When pushed onto a 32-bit stack, the Intel486 processor only pushes 2 bytes and updates ESP by 4. The P6 family and Pentium processors’ error code is a full 32 bits with the upper 16 [...]
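One defensive way for a handler to cope with this difference is to use only the low 16 bits of the 32-bit stack slot. The sketch below assumes that the upper half of the slot cannot be relied on when the Intel486 processor writes only 2 bytes; the function name `portable_error_code` and the sample values are hypothetical.

```python
def portable_error_code(stack_slot):
    """Return the error code from a 32-bit stack slot, ignoring the
    upper 16 bits, which the Intel486 processor does not write."""
    return stack_slot & 0xFFFF

# Hypothetical slot contents: stale data in the upper half on an Intel486,
# zeros in the upper half on a later processor.
code_486 = portable_error_code(0xDEAD0010)
code_later = portable_error_code(0x00000010)
```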
-
Page 628
17-36 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY The 32-bit processors also have descriptors for TSS segments, call gates, interrupt gates, and trap gates that support the 32-bit architecture. Both kinds of descriptors can be used in the same system. For those segment descriptors common to both 16- and 32-bit processors, clear bits in the reserv[...]
-
Page 629
Vol. 3A 17-37 IA-32 ARCHITECTURE COMPATIBILITY An exception to this behavior occurs when a stack access is data aligned, and the stack pointer is pointing to the last aligned piece of data that size at the top of the stack (ESP is FFFFFFFCH). When this data is popped, no segment limit violation occurs and the stack pointer will wrap around to 0.[...]
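The wraparound described here is ordinary 32-bit modular arithmetic on the stack pointer. A minimal sketch, assuming a 4-byte pop and a 32-bit ESP (the function name `esp_after_pop` is hypothetical):

```python
def esp_after_pop(esp, size=4):
    """Return the 32-bit stack pointer after popping `size` bytes."""
    return (esp + size) & 0xFFFFFFFF      # ESP is a 32-bit register

# Popping the last aligned doubleword at the top of the address space.
new_esp = esp_after_pop(0xFFFFFFFC)
```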
-
Page 630
17-38 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY way of ensuring ordering between routines that produce weakly-ordered results and routines that consume this data. No re-ordering of reads occurs on the Pentium processor, except under the condition noted in Section 7.2.1, “Memory Ordering in the Intel® Pentium® and Intel486™ Processors,” and [...]
-
Page 631
Vol. 3A 17-39 IA-32 ARCHITECTURE COMPATIBILITY bus to send the interrupt vector to the processor. After receiving the interrupt request signal, the processor asserts LOCK# to ensure that no other data appears on the data bus until the interrupt vector is received. This bus locking does not occur on the P6 family processors. 17.35. BUS HOLD Unlik[...]
-
Page 632
17-40 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY 17.36.3 Memory Type Range Registers Memory type range registers (MTRRs) are a new feature introduced into the IA-32 in the Pentium Pro processor. MTRRs allow the processor to optimize memory operations for different types of memory, such as RAM, ROM, frame buffer memory, and memory-mapped I/O[...]
-
Page 633
Vol. 3A 17-41 IA-32 ARCHITECTURE COMPATIBILITY 17.36.5 Performance-Monitoring Counters The P6 family and Pentium processors provide two performance-monitoring counters for use in monitoring internal hardware operations. These counters are event counters that can be programmed to count a variety of different types of events, such as the numbe[...]
-
Page 634
17-42 Vol. 3A IA-32 ARCHITECTURE COMPATIBILITY[...]
-
Page 635
INTEL SALES OFFICES ASIA PACIFIC Australia Intel Corp. Level 2 448 St Kilda Road Melbourne VIC 3004 Australia Fax:613-9862 5599 China Intel Corp. Rm 709, Shaanxi Zhongda Int'l Bldg No.30 Nandajie Street Xian AX710002 China Fax:(86 29) 7203 356 Intel Corp. Rm 2710, Metropolitan Tower 68 Zourong Rd Chongqing CQ 400015 China Intel Corp. [...]
-
Page 636
Intel Corp. 999 CANADA PLACE, Suite 404, #11 Vancouver BC V6C 3E2 Canada Fax:604-844-2813 Intel Corp. 2650 Queensview Drive, Suite 250 Ottawa ON K2B 8H6 Canada Fax:613-820-5936 Intel Corp. 190 Attwell Drive, Suite 500 Rexdale ON M9W 6H8 Canada Fax:416-675-2438 Intel Corp. 171 St. Clair Ave. E, Suite 6 Toronto ON Canada Intel[...]