Diff for "AdvancedPageReplacement"

Differences between revisions 18 and 20 (spanning 2 versions)

The basic assumptions behind LRU have been invalidated by streaming media, garbage collection and other things possible in large address spaces, but until 2002 there weren't many replacements available that are suitable to be implemented in a general purpose OS. However, with the advent of LIRS, ARC, Clock-pro and CAR/CART algorithms, it looks like there could be a benefit to Linux in implementing something better than LRU or the unbalanced use-once that is in use currently.

The only problem is, the advanced page replacement algorithms need to keep a history of recently evicted pages, and we don't want to spend too much memory or cpu on that. This page is a template for brainstorming on how we can implement such a framework, and on which of the advanced page replacement algorithms we should experiment with.

Please feel free to edit this page, after having created an account.

Goals

More robust in corner cases where LRU / use-once fail.
Able to deal with long inter-reference distances, as a second level cache. Eg. caching for a database or being a file server.
Clean implementation.
SMP scalable.
Still good performance as a generic VM.

The replacement algorithms

[http://www.almaden.ibm.com/StorageSystems/autonomic_storage/ARC/index.shtml ARC] Adaptive Replacement Cache.
[http://www.cs.wm.edu/~sjiang/lirs.htm LIRS] Low Inter-Reference Recency Set.
[http://www.cs.wm.edu/hpcs/WWW/HTML/publications/abs05-3.html CLOCK-Pro] an effective improvement of the CLOCK replacement, and the ClockProApproximation that Rik van Riel is planning to implement.
[http://www.almaden.ibm.com/cs/people/dmodha/clockfast.pdf CAR] Clock with Adaptive Replacement.

Proposals for dealing with non-resident pages

Rik's interface (for implementation, see NonResidentPages, code and patches on [http://surriel.com/patches/nonresident my home page]):

extern int recently_evicted(struct address_space * mapping, unsigned long offset);
extern int remember_page(struct address_space * mapping, unsigned long offset);

The recently_evicted function is queried by the pagein or page cache allocation code, to determine whether the data at the offset offset from the page cache or process object mapping was recently evicted. The function returns 0 if the page was not found, 1 if the page was found.

The remember_page function is called by the pageout code, telling the non-resident page management code to remember that a page at offset offset from mapping was just paged out.

We use a hash of mapping->host->i_ino and mapping->host->i_sb (and/or possibly other fields) to keep things unique.

Advanced Replacement in other systems

Cache replacement has been studied, implemented and tested in others systems where data access happens very frequently. Databases is one of that systems. ARC has recently been implemented in the [http://www.postgresql.org postgresql DB]. After an initial problem with software patents (ARC is patented by IBM) it was finally included in the postgresql. You can see a patch that was send to the pgsql-patchs mail-list where the ARC implementation is replaced back to LRU in the middle of the discussion about the patents of ARC [http://archives.postgresql.org/pgsql-patches/2005-01/msg00253.php here].

After some discuss, the ARC implementation was finally accepted and merged into the source tree, and is available in the current postgresql sources.

Status of CART implementation

I'm in the last phase of debugging. I seem to have a few errors with the page out code that i need to fix. Should be out soon... (Rahul Iyer, May-7-2005)

CART update

I'd like to say that as of now, CART on Linux works... almost. There is some odd case where some pages get pinned in memory and cannot be freed, because of which we OOM every now and again if we malloc() enough. I presume this is a flags issue.

Also, I need to fix a few other things: 1. I'm kmalloc()'ing the memory for the non resident nodes. This is proving to be painful for speed. I'm thinking of using the slab allocator for these purposes. 2. Currently, there is a non resident node "list". Lookups of this will get painful, so maybe a hash table is in order.

All in all, if i'm not mistaken, on every page fault CART involves an overhead larger than the current system because of the requirement to update the non resident list. So, I guess, this will be faster only if the difference in the number of page faults that a workload generates on CART as opposed to the current scheme can mask the extra overhead. Also, i guess, that lock contention on the nonresident list will also be an issue on multiprocessor machines.

I plan to release the code by Thursday or Friday (worst case by the weekend) without fix #2. I have to fix the "glued" pages issue and do some code cleanup. Hoping for the best

-  ⇤ ← Revision 18 as of 2005-05-09 16:33:19 → 
  Size: 2917
  Editor: fangorn
  Comment: fix up first sentence (thank you Scott Kaplan)
+   ← Revision 20 as of 2005-06-09 03:13:27 → ⇥
  Size: 4975
  Editor: 201-1-161-252
  Comment: Added reference about the recently ARC implementation in postgresql db.
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-Tbe basic assumptions behind LRU have been invalidated by streaming media, garbage collection and other things possible in large address spaces, but until 2002 there weren't many replacements available that are suitable to be implemented in a general purpose OS.  However, with the advent of LIRS, ARC, Clock-pro and CAR/CART algorithms, it looks like there could be a benefit to Linux in implementing something better than LRU or the unbalanced use-once that is in use currently.
+The basic assumptions behind LRU have been invalidated by streaming media, garbage collection and other things possible in large address spaces, but until 2002 there weren't many replacements available that are suitable to be implemented in a general purpose OS.  However, with the advent of LIRS, ARC, Clock-pro and CAR/CART algorithms, it looks like there could be a benefit to Linux in implementing something better than LRU or the unbalanced use-once that is in use currently.
 Line 37:
+== Advanced Replacement in other systems ==

Cache replacement has been studied, implemented and tested in others systems where data access happens very frequently. Databases is one of that systems. ARC has recently been implemented in the [http://www.postgresql.org postgresql DB]. After an initial problem with software patents (ARC is patented by IBM) it was finally included in the postgresql. You can see a patch that was send to the ''pgsql-patchs'' mail-list where the ARC implementation is replaced back to LRU in the middle of       the discussion about the patents of ARC [http://archives.postgresql.org/pgsql-patches/2005-01/msg00253.php here]. 

After some discuss, the ARC implementation was finally accepted and merged into the source tree, and is available in the current postgresql sources.
-Line 39:
+Line 46:
+== CART update ==
I'd like to say that as of now, CART on Linux works... almost. There is some odd case where some pages get pinned in memory and cannot be freed, because of which we OOM every now and again if we malloc() enough. I presume this is a flags issue.

Also, I need to fix a few other things:
1. I'm kmalloc()'ing the memory for the non resident nodes. This is proving to be painful for speed. I'm thinking of using the slab allocator for these purposes.
2. Currently, there is a non resident node "list". Lookups of this will get painful, so maybe a hash table is in order.

All in all, if i'm not mistaken, on every page fault CART involves an overhead larger than the current system because of the requirement to update the non resident list. So, I guess, this will be faster only if the  difference in the number of page faults that a workload generates on CART as opposed to the current scheme can mask the extra overhead. Also, i guess, that lock contention on the nonresident list will also be an issue on multiprocessor machines.

I plan to release the code by Thursday or Friday (worst case by the weekend) without fix #2. I have to fix the "glued" pages issue and do some code cleanup. Hoping for the best