Yanyg - Software Engineer

Linux Cache Readahead

1 Introduction

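Cache readahead is the process of speculatively reading file data into the cache ahead of time, based on observed I/O patterns, in the hope that the data will later be useful to the application. When readahead works well, it can significantly improve application performance by eliminating the time spent waiting for data and by increasing the I/O transfer size (a readahead I/O is generally larger than the application's own I/O). On the other hand, readahead can also make performance worse: if it guesses wrong and the prefetched data is never used, scarce memory and I/O bandwidth are wasted on that data. So, as with memory management in general, readahead algorithms are performance-critical and rely heavily on heuristics.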

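The implementation mainly involves three aspects: identifying multiple concurrent sequential streams, adjusting the readahead window, and swapping pages in and out.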
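To make sequential-stream detection and window adjustment more concrete, here is a minimal toy model in Python. It is only a sketch under loud assumptions: the class `ToyReadahead`, its fields, the initial window of 4 pages, the doubling rule, and the 32-page cap are all hypothetical choices made for illustration, not the kernel's data structures or policy. It tracks a single stream and never evicts pages, so it does not cover the multi-stream or swap-out aspects.

```python
class ToyReadahead:
    """Toy single-stream readahead model (hypothetical, not the kernel's code).

    Page numbers stand in for file offsets. Pages are never evicted.
    """

    def __init__(self, initial_window=4, max_window=32):
        self.initial_window = initial_window  # assumed starting window, in pages
        self.max_window = max_window          # assumed upper bound, in pages
        self.window = 0          # current readahead window (0 = readahead off)
        self.next_expected = 0   # page that would continue the sequential stream
        self.ahead_end = 0       # first page that has NOT been prefetched yet
        self.cache = set()       # pages currently held in the simulated cache
        self.hits = 0
        self.misses = 0
        self.prefetched = 0

    def read(self, page):
        # Account for the application's own request.
        if page in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache.add(page)

        # Sequential detection: does this read continue the previous one?
        sequential = page == self.next_expected
        self.next_expected = page + 1

        if not sequential:
            # Random access: turn readahead off until a new stream shows up.
            self.window = 0
            self.ahead_end = page + 1
            return

        # Window adjustment: once the reader has consumed everything that was
        # prefetched, grow the window (doubling, capped) and prefetch again.
        if page + 1 >= self.ahead_end:
            if self.window == 0:
                self.window = self.initial_window
            else:
                self.window = min(self.window * 2, self.max_window)
            start = page + 1
            for p in range(start, start + self.window):
                if p not in self.cache:
                    self.cache.add(p)
                    self.prefetched += 1
            self.ahead_end = start + self.window


if __name__ == "__main__":
    seq = ToyReadahead()
    for p in range(100):
        seq.read(p)
    print("sequential:", seq.misses, "misses,", seq.hits, "hits")

    rnd = ToyReadahead()
    for p in [50, 3, 77, 20, 91]:
        rnd.read(p)
    print("random:", rnd.misses, "misses,", rnd.prefetched, "pages prefetched")
```

In this model a front-to-back read of 100 pages misses only on the first page, while the scattered reads never trigger a prefetch. A failed guess would show up as prefetched pages that are never read, which is the wasted memory and I/O bandwidth described above.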

Readahead is a technique employed by the kernel in an attempt to improve file reading performance. If the kernel has reason to believe that a particular file is being read sequentially, it will attempt to read blocks from the file into memory before the application requests them. When readahead works, it speeds up the system's throughput, since the reading application does not have to wait for its requests. When readahead fails, instead, it generates useless I/O and occupies memory pages which are needed for some other purpose.
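Besides relying on the kernel's own guess, an application can declare its access pattern. The sketch below uses Python's `os.posix_fadvise` with `POSIX_FADV_SEQUENTIAL`; the helper name `read_sequentially` and the 64 KiB chunk size are assumptions made for illustration. The call is purely advisory, so the sketch ignores failures and skips the hint on platforms where the function is not available.

```python
import os


def read_sequentially(path, chunk_size=64 * 1024):
    """Read a file front to back after hinting sequential access.

    Hypothetical helper for illustration; returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.posix_fadvise only exists on some Unix platforms.
        if hasattr(os, "posix_fadvise"):
            try:
                # offset=0, len=0 means "from the start to the end of the file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # the hint is advisory; carry on without it
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

According to the Linux `posix_fadvise(2)` manual page, `POSIX_FADV_SEQUENTIAL` enlarges the readahead window for the file and `POSIX_FADV_RANDOM` disables readahead for it. Whether the hint helps in practice depends on the workload and the storage device.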

Readahead is the process of speculatively reading file data into the page cache in the hope that it will be useful to an application in the near future. When readahead works well, it can significantly improve the performance of I/O bound applications by avoiding the need for those applications to wait for data and by increasing I/O transfer size. On the other hand, readahead risks making performance worse as well: if it guesses wrong, scarce memory and I/O bandwidth will be wasted on data which will never be used. So, as is the case with memory management in general, readahead algorithms are both performance-critical and heavily based on heuristics.

"Readahead" is the act of speculatively reading a portion of a file's contents into memory in the expectation that a process working with that file will soon want that data. When readahead works well, a data-consuming process will find that the information it needs is available to it when it asks, and that waiting for disk I/O is not necessary. The Linux kernel has done readahead for a long time, but that does not mean that it cannot be done better. To that end, Fengguang Wu has been working on a set of "adaptive readahead" patches for a couple of years.

2 Linux Implementation

3 References

512K readahead size with thrashing safe readahead
https://lwn.net/Articles/372281/
Improving readahead
https://lwn.net/Articles/372384/
Adaptive file readahead
https://lwn.net/Articles/155510/
Huge pages in the ext4 filesystem
https://lwn.net/Articles/718102/
zswap: compressed swap caching
https://lwn.net/Articles/528817/