In the post on Linux memory management, it was discussed that Linux uses available DRAM as buffer/cache to improve overall system performance. That is certainly a very good thing. However, it can have side effects.
What is the issue?
Last week I spent a lot of time figuring out why the kernel locks up during IR829 bundle image installation. During the installation, there is a step that extracts the Guest OS disk image from the bundle image and writes it to the Guest OS disk using the Linux utility “dd”.
The kernel lockup occurs from time to time. During the process, I ran “top” to monitor the overall state of the box and could see free memory drop sharply while the memory consumed by buffers and cache increased dramatically, which is expected because dd is writing the disk image to the disk /dev/sda. The Linux kernel uses available DRAM to cache/buffer the disk image. The cache/buffer space is reclaimed by the kernel thread “kswapd” when the kernel realizes that other processes need the memory. That all sounds very nice, but the reality is not so nice, especially for a kernel as old as version 2.6.35. Sometimes “kswapd” does not do its job properly and locks up the box.
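The same numbers that “top” shows can also be sampled from /proc/meminfo, which makes it easier to log them while dd is running. The following is only a minimal sketch: the helper names parse_meminfo and cache_snapshot are made up for this illustration, and it assumes a Linux box with Python available.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their numeric values.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_snapshot(path="/proc/meminfo"):
    """Read the free, buffer and page-cache sizes from the running kernel."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    # A missing field is reported as None rather than raising an error.
    return {key: info.get(key) for key in ("MemFree", "Buffers", "Cached")}
```

Calling cache_snapshot() once per second during the dd step should show MemFree shrinking while Cached grows, which is the pattern described above.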
What is the solution?
The solution is to avoid using cache memory when the Guest OS disk image is copied to the disk. The latest version of the dd manual indicates that there are options such as iflag and oflag, but it does not say what the possible flag values are. I then chased down the GNU coreutils documentation for dd: https://www.gnu.org/software/coreutils/manual/html_node/dd-invocation.html, which mentions some very interesting flags such as “nocache”, “dsync” and “direct”. This got me really excited. My first try was “dd iflag=nocache oflag=nocache”, but it turned out that the flag was not accepted. When I checked the version with “dd --version”, I found that the running dd was version 8.5, which is really old compared to the latest version, 8.25, released on January 20, 2016. See http://ftp.gnu.org/gnu/coreutils/.
So I ended up downloading the latest version of coreutils, compiling it, and packaging it into my Linux ramdisk.
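A small aside on the version numbers: 8.5 is older than 8.25 because coreutils versions compare component by component, not as decimal numbers. The sketch below illustrates this by parsing the first line of “dd --version” output into a tuple. The function name parse_version is made up for this example, and the sample strings assume the usual GNU “dd (coreutils) X.Y” format.

```python
import re


def parse_version(text):
    """Extract the dotted version from the first line of `dd --version` output."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+(?:\.\d+)+)", first_line)
    if match is None:
        raise ValueError("no version number found in: %r" % first_line)
    # Compare as integer tuples so that (8, 5) sorts before (8, 25).
    return tuple(int(part) for part in match.group(1).split("."))


old = parse_version("dd (coreutils) 8.5")
new = parse_version("dd (coreutils) 8.25")
print(old < new)  # True
```

Comparing the same two versions as floats would give the opposite, wrong answer, since 8.5 > 8.25 numerically.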
I followed the usage shown in the dd nocache unit test case, but it did not work for me. It only worked well after I added the “dsync” flag to oflag, that is, oflag=nocache,dsync.
With these new options, there is no noticeable increase in cache/buffer memory consumption during the installation. The price paid is a longer time to complete the whole operation, which is expected.
How does it work?
I further checked the source code to see how it works. It turns out that dd.c implements a function, invalidate_cache, which tells the kernel that the cached pages for a range of the file are no longer needed through posix_fadvise(fd, …, POSIX_FADV_DONTNEED).
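To make the mechanism concrete, here is a rough sketch of the same idea in Python. It is not dd's implementation, only an approximation under a few assumptions: a Linux system (os.posix_fadvise and os.O_DSYNC are not available everywhere), a made-up function name copy_without_caching, and an arbitrary 1 MiB chunk size. It also hints at one plausible reason why “dsync” made the difference for me: according to the posix_fadvise(2) man page, POSIX_FADV_DONTNEED leaves pages that have not yet been written out untouched, so the data has to reach the disk before the cache can actually be dropped.

```python
import os

CHUNK = 1024 * 1024  # 1 MiB per write; an arbitrary choice for this sketch


def copy_without_caching(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst, syncing each write and then dropping its cached pages.

    Returns the number of bytes copied.
    """
    src = os.open(src_path, os.O_RDONLY)
    try:
        # O_DSYNC makes each write return only after the data has been
        # written out, similar in spirit to dd's oflag=dsync.
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
        dst = os.open(dst_path, flags, 0o600)
        try:
            offset = 0
            while True:
                data = os.read(src, chunk)
                if not data:
                    break
                view = memoryview(data)
                while view:  # os.write may write fewer bytes than requested
                    written = os.write(dst, view)
                    view = view[written:]
                # The pages for this range are clean now, so the kernel is
                # free to discard them when told they are no longer needed.
                os.posix_fadvise(dst, offset, len(data), os.POSIX_FADV_DONTNEED)
                os.posix_fadvise(src, offset, len(data), os.POSIX_FADV_DONTNEED)
                offset += len(data)
            return offset
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

A call such as copy_without_caching("guest-os.img", "/dev/sda") would roughly correspond to the dd step above; the image file name here is hypothetical. As with dd, syncing every chunk trades speed for a flat cache footprint.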