Large pages via Contiguous PTEs and partial occupancy

AndyGlew edited this page Mar 18, 2020 · 3 revisions

Large pages can be structural large pages or Contiguous PTE large pages.

This page discusses an issue that arises when the PTEs in the contiguous PTE group that forms a Contiguous PTE large page are inconsistent with one another.

(TBD: awkward wording because it is awkward in this wiki to create a page that can be referenced by both singular and plural versions of its name. See Wish: alternative syntax for wiki page names.)

E.g. years ago I got a patent, now expired I believe, on what the RISC-V community is now calling NAPOT pages. This is essentially an application of the "1 extra bit trick", which can indicate any power of two with only one extra bit. With the NAPOT encoding, any naturally aligned group of PTEs can be used to indicate a large page that need only occupy one TLB entry, given appropriate TLB indexing and tagging. (Note: an implementation does not need to support all powers of two: e.g. it could be NAPO4, or only one or a few sizes. Unused bits can be used for other purposes.)
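
To make the "1 extra bit trick" concrete, here is a minimal sketch of one way such an encoding can work. The field layout and the names `napot_encode` and `napot_decode` are assumptions made for illustration; this is not the bit layout of any particular ISA. The idea is that a naturally aligned base has known-zero low bits, so those bits (plus one extra) can carry the size as a run of trailing ones.

```python
from typing import Tuple


def napot_encode(base_ppn: int, log2_pages: int) -> int:
    """Pack an aligned base page number and a power-of-two size into a
    field that is one bit wider than the page number (hypothetical layout)."""
    size_mask = (1 << log2_pages) - 1
    if base_ppn & size_mask:
        raise ValueError("base is not naturally aligned to the requested size")
    # The low bits of an aligned base are zero, so reuse them:
    # log2_pages trailing ones, terminated by a zero bit.
    return (base_ppn << 1) | size_mask


def napot_decode(field: int) -> Tuple[int, int]:
    """Recover (base_ppn, log2_pages) by counting trailing ones."""
    log2_pages = 0
    while (field >> log2_pages) & 1:
        log2_pages += 1
    # Drop the extra bit, then clear the bits that held the size code.
    base_ppn = (field >> 1) & ~((1 << log2_pages) - 1)
    return base_ppn, log2_pages
```

For example, `napot_encode(0x40, 4)` describes a 16-page group starting at page 0x40, and `napot_decode` of that value gives back `(0x40, 4)`. A NAPO4-style variant would simply reject odd values of `log2_pages`.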

E.g. the Contiguous bit in ARMv8's page tables. (Less flexible than NAPOT, but same idea with respect to using contiguous PTEs.)

The question arises: what happens when the multiple PTEs in such a contiguous PTE group are not consistent? E.g. some might have completely different addresses, permissions, or cache types.

The usual rule is to tell software "don't do that", while hardware must be able to detect such inconsistencies before installation in the TLB, or equivalently handle them safely and securely.
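
As a rough illustration of what "detect such inconsistencies" could mean, here is a sketch of a check over a whole group. The `PTE` fields and the rule that PTE i of the group must map physical page base + i are assumptions for the example (other schemes store the same encoded value in every PTE); real hardware would do this in the page walker or avoid reading the whole group at all.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PTE:
    valid: bool
    ppn: int          # physical page number of this 4K-style base page
    perms: str        # hypothetical encoding, e.g. "rw-"
    cache_type: str   # hypothetical encoding, e.g. "WB"


def group_is_consistent(group: Sequence[PTE]) -> bool:
    """True if the group could safely be installed as one large page."""
    n = len(group)
    if n == 0 or n & (n - 1):
        return False                      # group size must be a power of two
    first = group[0]
    if first.ppn % n:
        return False                      # physical base must be naturally aligned
    for i, pte in enumerate(group):
        if not pte.valid:
            return False
        if pte.ppn != first.ppn + i:
            return False                  # addresses must be contiguous
        if (pte.perms, pte.cache_type) != (first.perms, first.cache_type):
            return False                  # attributes must match
    return True
```

The cost of this approach is the point of the rest of this page: checking every PTE in the group on a TLB miss means reading all of them, which is what the bitmask scheme avoids.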

Here's another possibility:

The TLB entry for such a contiguous PTE large page might have a bitmask whose bits correspond to the component pages.

A TLB miss installs the large page TLB entry, but with only one bit set in the bitmask.

TLB lookups take into account the bitmask.

A new TLB miss might access a different page in the contiguous PTE set. When installing, look up to see if there is already a large page TLB entry that covers the new page. If there is, set the corresponding bit.

I.e. merge TLB entries.
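
The install, lookup, and merge steps above can be sketched as a small software model. Everything here is an assumption for illustration: the fixed group size `GROUP_LOG2`, the names `LargeEntry` and `MergingTLB`, the dict standing in for an associative tag match, and the choice to replace an entry when a newly walked PTE disagrees with it. Capacity, eviction, permissions, and invalidation are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional

GROUP_LOG2 = 4                  # assumed: 16 base pages per contiguous PTE group
GROUP_PAGES = 1 << GROUP_LOG2


@dataclass
class LargeEntry:
    base_ppn: int               # physical page number of the group's first page
    present: int = 0            # bit i set => component page i has been walked


class MergingTLB:
    def __init__(self) -> None:
        # Keyed by the group-aligned virtual page number (the tag).
        self.entries: Dict[int, LargeEntry] = {}

    def lookup(self, vpn: int) -> Optional[int]:
        """Return the PPN on a hit, or None on a miss."""
        entry = self.entries.get(vpn >> GROUP_LOG2)
        idx = vpn & (GROUP_PAGES - 1)
        if entry is None or not (entry.present >> idx) & 1:
            return None         # tag miss, or this component page not yet seen
        return entry.base_ppn + idx

    def install(self, vpn: int, ppn: int) -> None:
        """Install one walked PTE, merging into a covering entry if possible."""
        tag, idx = vpn >> GROUP_LOG2, vpn & (GROUP_PAGES - 1)
        base_ppn = ppn - idx
        entry = self.entries.get(tag)
        if entry is not None and entry.base_ppn == base_ppn:
            entry.present |= 1 << idx                    # merge: set one more bit
        else:
            # No covering entry, or the new PTE disagrees with it:
            # start over with only this page marked present.
            self.entries[tag] = LargeEntry(base_ppn, 1 << idx)
```

In this model only pages whose PTEs have actually been walked can hit, so an inconsistent PTE elsewhere in the group is never used without having been read.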

This works best if the bitmask is part of an associative CAM lookup. The lookup need not be fully associative, but the bitmask wants to be either CAM-matched or part of the tags.

It can also work with a probe/re-probe TLB.

Such TLB entry merging can be performed in other situations, such as sharing TLB entries between independent hardware threads that happen to share those parts of the address space. (I may have a patent on that too, also probably expired.)