I have used GNU/Linux on systems from 4 MB RAM to 512 GB RAM. When they start swapping, most of the time you can still log in and kill off the offending process - you just have to be 100-1000 times more patient.
On my new 32 GB system that has changed: It blocks when it starts swapping. Sometimes with full disk activity but other times with no disk activity.
To examine what might be the issue I have written this program. The idea is:
1. grab 3% of the memory free right now
2. if that caused swap to increase: stop
3. keep the chunk used for 30 seconds by forking off
4. goto 1
    #!/usr/bin/perl

    # Free memory in kB, counting buffers/cache as free
    sub freekb {
        my $free = `free|grep buffers/cache`;
        my @a=split / +/,$free;
        return $a[3];
    }

    # Swap in use, in kB
    sub swapkb {
        my $swap = `free|grep Swap:`;
        my @a=split / +/,$swap;
        return $a[2];
    }

    my $swap = swapkb();
    my $lastswap = $swap;
    my $free;
    # Loop until swap usage has increased
    while($lastswap >= $swap) {
        print "$swap $free";
        $lastswap = $swap;
        $swap = swapkb();
        $free = freekb();
        # Grab 3% of the currently free memory
        my $used_mem = "x"x(1024 * $free * 0.03);
        # Fork a child that holds the chunk for 30 seconds, then exits
        if(not fork()) {
            sleep 30;
            exit();
        }
    }
    print "Swap increased $swap $lastswap\n";

Running the program forever ought to keep the system at the limit of swapping, while grabbing only a minimal amount of swap and doing that very slowly (i.e. a few MB at a time at most).
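The freekb and swapkb helpers in the script above depend on the layout of the free output; as far as I know, newer procps releases no longer print the buffers/cache line, so freekb would come back empty there. A minimal sketch of the same two readings taken from /proc/meminfo instead, assuming the usual MemFree, Buffers, Cached, SwapTotal and SwapFree fields are present. The optional file argument is only there so the functions can be tried on a saved copy; it is not part of the original script.

```shell
#!/bin/sh
# Both functions read /proc/meminfo by default; values are in kB.

# Free memory counting buffers and page cache as free,
# roughly what the "-/+ buffers/cache" line of free reports.
freekb() {
    awk '/^(MemFree|Buffers|Cached):/ { sum += $2 } END { print sum + 0 }' \
        "${1:-/proc/meminfo}"
}

# Swap currently in use: SwapTotal - SwapFree.
swapkb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { unused = $2 }
         END           { print total - unused }' "${1:-/proc/meminfo}"
}
```

Calling freekb and swapkb without arguments prints the live values for the running system.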
If I run:
    forever free | stdbuf -o0 timestamp > freelog

I ought to see swap slowly rising every second. (forever and timestamp from https://github.com/ole-tange/tangetools).
But that is not the behaviour I see: swap increases in jumps, and the system is completely blocked during these jumps. Here the system is blocked for 30 seconds while swap usage increases by 1 GB:
    secs
    169.527 Swap:  18440184   154184  18286000
    170.531 Swap:  18440184   154184  18286000
    200.630 Swap:  18440184  1134240  17305944
    210.259 Swap:  18440184  1076228  17363956

Blocked: 21 secs. Swap increase 2000 MB:

    307.773 Swap:  18440184   581324  17858860
    308.799 Swap:  18440184   597676  17842508
    330.103 Swap:  18440184  2503020  15937164
    331.106 Swap:  18440184  2502936  15937248

Blocked: 20 secs. Swap increase 2200 MB:

    751.283 Swap:  18440184   885288  17554896
    752.286 Swap:  18440184   911676  17528508
    772.331 Swap:  18440184  3193532  15246652
    773.333 Swap:  18440184  1404540  17035644

Blocked: 37 secs. Swap increase 2400 MB:

    904.068 Swap:  18440184   613108  17827076
    905.072 Swap:  18440184   610368  17829816
    942.424 Swap:  18440184  3014668  15425516
    942.610 Swap:  18440184  2073580  16366604

This is bad enough, but what is even worse is that the system sometimes stops responding at all - even if I wait for hours. I have the feeling it is related to the swapping issue, but I cannot tell for sure.
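The blocked intervals in the log excerpts above can also be picked out of freelog mechanically instead of by eye. A minimal sketch, assuming every sample line has the form "seconds Swap: total used free" as in the excerpts. The function name find_stalls and the 10 second default threshold are my own choices, not something from the tools used above.

```shell
#!/bin/sh
# Print every interval between two consecutive "Swap:" samples that is
# longer than max_gap seconds, with the change in used swap over it.
# Usage: find_stalls LOGFILE [MAX_GAP_SECONDS]
find_stalls() {
    awk -v max_gap="${2:-10}" '
        $2 == "Swap:" {
            if (seen && $1 - prev_t > max_gap)
                printf "%.3f -> %.3f: blocked %d secs, swap change %d MB\n",
                       prev_t, $1, $1 - prev_t, ($4 - prev_used) / 1024
            prev_t = $1      # timestamp of this sample
            prev_used = $4   # used swap in kB
            seen = 1
        }' "$1"
}
```

Running find_stalls freelog lists only the jumps, which makes it easier to compare runs with different settings.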
My first idea was to tweak /proc/sys/vm/swappiness from 60 to 0 or 100, just to see if that had any effect at all. 0 did not have an effect, but 100 did cause the problem to arise less often.
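For reference, a small sketch of how the swappiness value can be read and changed on a running system. The two function names are hypothetical wrappers of mine; the underlying commands are plain cat and sysctl -w. Changing the value needs root and lasts only until the next reboot.

```shell
#!/bin/sh
# Print the current value. The optional path argument is only for
# trying the function on a copy of the file.
get_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# Set a new value (0-100) for the running kernel. Needs root.
set_swappiness() {
    sysctl -w "vm.swappiness=$1"
}
```

For example, set_swappiness 100 followed by get_swappiness should print 100.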
How can I prevent the system from blocking for such a long time?
Why does it decide to swap out 1-3 GB when less than 10 MB would suffice?
System info:
    $ uname -a
    Linux aspire 3.8.0-32-generic #47-Ubuntu SMP Tue Oct 1 22:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Edit:
I tested if the problem is due to 32 GB RAM by removing 24 GB and trying with only 8 GB - I see the same behaviour.
I can also reproduce the swapping behaviour (though not the freezing) by installing GNU/Linux Mint 15 in VirtualBox.
I cannot reproduce the problem on my 8 GB laptop: The script above runs beautifully for hours and hours - swapping out a few megabytes, but never a full gigabyte. So I compared all the variables in /proc/sys/vm/* on both systems: They are exactly the same. This leads me to believe the problem is elsewhere. The laptop runs a different kernel:
    Linux hk 3.2.0-55-generic #85-Ubuntu SMP Wed Oct 2 12:29:27 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Maybe something in the VM system changed from 3.2.0 to 3.8.0?
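The comparison of /proc/sys/vm/* between the two machines can be repeated along these lines. This is a sketch, not the exact commands used for the comparison above: dump_vm is a hypothetical helper, and the output file names are only examples built from the two hostnames.

```shell
#!/bin/sh
# Print "name value" for every readable file in a sysctl directory,
# one per line, sorted by name. Entries that cannot be read
# (some are write-only) are skipped.
# Usage: dump_vm [DIRECTORY]
dump_vm() {
    dir="${1:-/proc/sys/vm}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        val=$(cat "$f" 2>/dev/null) || continue
        # Collapse tabs and newlines so each setting stays on one line
        printf '%s %s\n' "${f##*/}" "$(printf '%s' "$val" | tr -s '[:space:]' ' ')"
    done | LC_ALL=C sort
}
```

Run dump_vm > vm.aspire on one machine and dump_vm > vm.hk on the other, then diff vm.aspire vm.hk; no output means the settings are identical.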