Linux Huge Page

When a process uses memory, the operating system, with help from the CPU, keeps track of which RAM belongs to that process. For efficiency, RAM is managed in fixed-size chunks of 4K bytes (the default value on many platforms). Those chunks are called pages. Pages can be swapped to disk, among other things.

Since process address spaces are virtual in Linux, the CPU and the operating system have to remember which page belongs to which process and where it is stored, i.e. translate virtual addresses to physical addresses. The mappings are kept in page tables, and the translation is sped up by the TLB (Translation Lookaside Buffer), a cache of virtual-to-physical translations. The TLB is typically a very scarce resource on a processor, so operating systems try to make the best use of its limited number of entries. Huge pages reduce the number of pages needed to map the same amount of memory, so fewer TLB entries are used and translation is faster.
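To make the saving concrete, here is a rough sketch of the arithmetic. The TLB size of 64 entries is a made-up number for illustration only; real TLBs differ per processor and often have separate entries for different page sizes.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(working_set_bytes, page_size_bytes):
    """Number of pages (and so translations) needed to map a working set."""
    return -(-working_set_bytes // page_size_bytes)  # ceiling division


def tlb_reach(entries, page_size_bytes):
    """Bytes of memory that `entries` TLB entries can cover at once."""
    return entries * page_size_bytes


if __name__ == "__main__":
    tlb_entries = 64  # hypothetical TLB size, not taken from any real CPU
    for name, size in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB)):
        print(name, "pages needed for 1 GiB:", pages_needed(GIB, size))
        print(name, "TLB reach in KiB:", tlb_reach(tlb_entries, size) // KIB)
```

With these assumptions, mapping 1 GiB takes 262144 pages of 4 KiB but only 512 pages of 2 MiB.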

#How to check if the running kernel supports huge pages?

Checking the kernel config shows whether huge pages are supported.

weng@weng-u1604:~$ cat /boot/config-$(uname -r) | grep HUGETLB
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
weng@weng-u1604:~$ 
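The same check can be scripted. The sketch below parses kernel config text and assumes that CONFIG_HUGETLB_PAGE=y together with CONFIG_HUGETLBFS=y means huge pages are usable; the function names are made up for this example, and the sample text is the output shown above.

```python
SAMPLE_CONFIG = """\
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
"""


def hugetlb_options(config_text):
    """Return a dict of the HUGETLB-related options found in kernel config text."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if "HUGETLB" in key:
            options[key] = value
    return options


def supports_hugetlb(config_text):
    """Assumption: both options set to 'y' means hugetlb pages are available."""
    options = hugetlb_options(config_text)
    return (options.get("CONFIG_HUGETLB_PAGE") == "y"
            and options.get("CONFIG_HUGETLBFS") == "y")


if __name__ == "__main__":
    print(supports_hugetlb(SAMPLE_CONFIG))
```

On a real system the text would come from the /boot/config-$(uname -r) file used above, if the distribution ships it.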

#How to check the supported huge page sizes using the “hugeadm” tool?

weng@weng-u1604:~$ hugeadm --page-sizes-all
2097152
weng@weng-u1604:~$ 

hugeadm prints sizes in bytes, so the above output shows that the only supported huge page size is 2MB.
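If hugeadm is not installed, the sizes the kernel has registered can usually be read from sysfs, where each size has a directory such as hugepages-2048kB under /sys/kernel/mm/hugepages. The sketch below relies on that directory naming; the helper names are hypothetical.

```python
import os
import re

_DIR_RE = re.compile(r"^hugepages-(\d+)kB$")


def sizes_from_names(names):
    """Convert directory names like 'hugepages-2048kB' to sizes in bytes."""
    sizes = []
    for name in names:
        match = _DIR_RE.match(name)
        if match:
            sizes.append(int(match.group(1)) * 1024)
    return sorted(sizes)


def supported_sizes(root="/sys/kernel/mm/hugepages"):
    """List huge page sizes in bytes; empty list if the directory is missing."""
    if not os.path.isdir(root):
        return []
    return sizes_from_names(os.listdir(root))


if __name__ == "__main__":
    print(supported_sizes())
```

On the machine used in this post, this would be expected to report the same 2097152 bytes as hugeadm.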

#How to check the current huge pages status?

weng@weng-u1604:~$ cat /proc/meminfo | grep HugePages
AnonHugePages:    425984 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
weng@weng-u1604:~$ 

where:

HugePages_Total is the size of the pool of huge pages.

HugePages_Free is the number of huge pages in the pool that are not yet allocated.

HugePages_Rsvd is short for “reserved,” and is the number of huge pages for which a commitment to allocate from the pool has been made, but no allocation has yet been made. Reserved huge pages guarantee that an application will be able to allocate a huge page from the pool of huge pages at fault time.

HugePages_Surp is short for “surplus,” and is the number of huge pages in the pool above the value in /proc/sys/vm/nr_hugepages. The maximum number of surplus huge pages is controlled by /proc/sys/vm/nr_overcommit_hugepages.

The above output shows that no huge pages have been created so far.
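For scripts that need these counters, a small parser is enough. This is only a sketch: it keeps every /proc/meminfo line whose name contains "Huge" and takes the first number on the line. The sample text is the output shown above.

```python
SAMPLE_MEMINFO = """\
AnonHugePages:    425984 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
"""


def parse_hugepage_counters(meminfo_text):
    """Return {name: integer value} for the huge-page lines of /proc/meminfo."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or "Huge" not in key:
            continue
        fields = rest.split()
        if fields:
            counters[key.strip()] = int(fields[0])  # drop the optional "kB" unit
    return counters


def pages_in_use(counters):
    """Huge pages currently allocated out of the pool."""
    return counters["HugePages_Total"] - counters["HugePages_Free"]


if __name__ == "__main__":
    counters = parse_hugepage_counters(SAMPLE_MEMINFO)
    print(counters["HugePages_Total"], pages_in_use(counters))
```

To inspect the live system, pass the contents of /proc/meminfo instead of the sample string.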

#How to create huge pages? There are two ways:

  1. Add kernel boot parameters, e.g. "hugepagesz=2M hugepages=16"
  2. Create them on the fly (as root), e.g. 'echo 16 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages'
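The on-the-fly method can also be scripted. The sketch below builds the per-node sysfs path used above (or the system-wide one under /sys/kernel/mm/hugepages when no node is given) and reads the value back after writing, since the kernel may allocate fewer pages than requested when memory is fragmented. The function names are hypothetical, and set_nr_hugepages needs root.

```python
def nr_hugepages_path(size_kb=2048, node=None):
    """Path of the nr_hugepages control file, system-wide or for one NUMA node."""
    leaf = f"hugepages/hugepages-{size_kb}kB/nr_hugepages"
    if node is None:
        return f"/sys/kernel/mm/{leaf}"
    return f"/sys/devices/system/node/node{node}/{leaf}"


def pool_bytes(count, size_kb=2048):
    """Memory set aside by a pool of `count` huge pages."""
    return count * size_kb * 1024


def set_nr_hugepages(count, size_kb=2048, node=None):
    """Request `count` huge pages and return how many the kernel reports."""
    path = nr_hugepages_path(size_kb, node)
    with open(path, "w") as f:
        f.write(str(count))
    with open(path) as f:
        return int(f.read())


if __name__ == "__main__":
    print(nr_hugepages_path(node=0))
    print(pool_bytes(16) // (1024 * 1024), "MiB")
```

A pool of 16 pages of 2048 kB takes 32 MiB of RAM away from normal use, whether or not any application maps it.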

Verify that they were created:

weng@weng-u1604:~$ cat /proc/meminfo | grep Huge
AnonHugePages:    561152 kB
HugePages_Total:      16
HugePages_Free:       16
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
weng@weng-u1604:~$ 
weng@weng-u1604:~$ hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152       16       16       16        *
weng@weng-u1604:~$ 

#How to use these huge pages?

First, create a directory to serve as the mount point, then mount hugetlbfs on it as shown below:

weng@weng-u1604:~$ sudo mkdir /mnt/huge
weng@weng-u1604:~$ sudo mount -t hugetlbfs nodev /mnt/huge
weng@weng-u1604:~$ ls /mnt/huge

Now /mnt/huge/ is ready to be used.
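As a minimal sketch of what using the mount point can look like, the code below creates a file under the hugetlbfs mount and maps it into memory. It assumes a 2 MB huge page size, enough free pages in the pool, and write permission on the mount point; the function names are made up for this example. Files on hugetlbfs must be sized in multiples of the huge page size, which is why the length is rounded up first.

```python
import mmap
import os

HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumed: matches Hugepagesize in /proc/meminfo


def round_up(length, page_size=HUGE_PAGE_SIZE):
    """Round a length up to the next multiple of the huge page size."""
    return -(-length // page_size) * page_size


def map_huge_file(path, length, page_size=HUGE_PAGE_SIZE):
    """Create (or open) a file on a hugetlbfs mount and map it read/write."""
    size = round_up(length, page_size)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.ftruncate(fd, size)      # hugetlbfs only accepts page-size multiples
        return mmap.mmap(fd, size)  # shared, readable and writable by default
    finally:
        os.close(fd)                # the mapping keeps its own reference
```

For example, map_huge_file("/mnt/huge/example", 3 * 1024 * 1024) would back a 4 MB mapping with two huge pages, where "example" is an arbitrary file name. The pages stay assigned to the file until it is removed, so delete it when done.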

See these examples (paths are relative to the Linux kernel source tree):

1) map_hugetlb: see tools/testing/selftests/vm/map_hugetlb.c

2) hugepage-shm: see tools/testing/selftests/vm/hugepage-shm.c

3) hugepage-mmap: see tools/testing/selftests/vm/hugepage-mmap.c

4) The libhugetlbfs (https://github.com/libhugetlbfs/libhugetlbfs) library provides a wide range of userspace tools to help with huge page usability, environment setup, and control.