[Bugs] [Bug 1379769] New: GlusterFS fails to build on old Linux distros with linux/oom.h missing
bugzilla at redhat.com
Tue Sep 27 15:38:54 UTC 2016
Bug ID: 1379769
Summary: GlusterFS fails to build on old Linux distros with linux/oom.h missing
Assignee: bugs at gluster.org
Reporter: oleksandr at natalenko.name
CC: bugs at gluster.org
Created attachment 1205274
Milind Changire has reported that GlusterFS fails to build under RHEL5 because
RHEL5 does not ship the linux/oom.h header.
This header is used purely to obtain OOM-related constants.
This issue also raises the question of how OOM should be managed under old
kernels. From man 5 proc we see this:
/proc/[pid]/oom_adj (since Linux 2.6.11)
This file can be used to adjust the score used to select which
process should be killed in an out-of-memory (OOM) situation. The kernel uses
this value for a bit-shift operation of the process's oom_score value:
valid values are in the range -16 to +15, plus the special value -17, which
disables OOM-killing altogether for this process. A positive score increases
the likelihood of this process being killed by the OOM-killer; a negative score
decreases the likelihood.
The default value for this file is 0; a new process inherits its
parent's oom_adj setting. A process must be privileged (CAP_SYS_RESOURCE) to
update this file.
Since Linux 2.6.36, use of this file is deprecated in favor of
/proc/[pid]/oom_score_adj.
/proc/[pid]/oom_score_adj (since Linux 2.6.36)
This file can be used to adjust the badness heuristic used to
select which process gets killed in out-of-memory conditions.
The badness heuristic assigns a value to each candidate task
ranging from 0 (never kill) to 1000 (always kill) to determine which process
is targeted.
The units are roughly a proportion along that range of allowed
memory the process may allocate from, based on an estimation of its current
memory and swap use. For example, if a task is using all allowed memory, its
badness score will be 1000. If it is using half of its allowed memory, its
score will be 500.
There is an additional factor included in the badness score: root
processes are given 3% extra memory over other tasks.
The amount of "allowed" memory depends on the context in which
the OOM-killer was called. If it is due to the memory assigned to the
allocating task's cpuset being exhausted, the allowed memory represents the
set of mems assigned to that cpuset (see cpuset(7)). If it is due to a
mempolicy's node(s) being exhausted, the allowed memory represents the set of
mempolicy nodes. If it is due to a memory limit (or swap limit) being reached,
the allowed memory is that configured limit. Finally, if it is due to the
entire system being out of memory, the allowed memory represents all
allocatable resources.
The value of oom_score_adj is added to the badness score before
it is used to determine which task to kill. Acceptable values range
from -1000 (OOM_SCORE_ADJ_MIN) to +1000 (OOM_SCORE_ADJ_MAX). This allows
user space to control the preference for OOM-killing, ranging from always
preferring a certain task or completely disabling it from OOM killing. The
lowest possible value, -1000, is equivalent to disabling OOM-killing entirely
for that task, since it will always report a badness score of 0.
Consequently, it is very simple for user space to define the
amount of memory to consider for each task. Setting a oom_score_adj value of
+500, for example, is roughly equivalent to allowing the remainder of tasks
sharing the same system, cpuset, mempolicy, or memory controller resources to
use at least 50% more memory. A value of -500, on the other hand, would be
roughly equivalent to discounting 50% of the task's allowed memory from being
considered as scoring against the task.
For backward compatibility with previous kernels,
/proc/[pid]/oom_adj can still be used to tune the badness score. Its value
is scaled linearly with oom_score_adj.
Writing to /proc/[pid]/oom_score_adj or /proc/[pid]/oom_adj will
change the other with its scaled value.
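To illustrate the linear scaling between the two files, here is a small sketch
in C. It assumes the kernel converts an oom_adj value by multiplying it by
OOM_SCORE_ADJ_MAX and dividing by -OOM_DISABLE, with +15 mapped straight to
the maximum; that is my reading of the kernel sources and should be verified
against the target kernel. The helper name oom_adj_to_score_adj is
hypothetical and not part of GlusterFS.

```c
/* Fallback constants, same values as quoted later in this report. */
#ifndef OOM_DISABLE
#define OOM_DISABLE (-17)
#endif
#ifndef OOM_ADJUST_MAX
#define OOM_ADJUST_MAX 15
#endif
#ifndef OOM_SCORE_ADJ_MAX
#define OOM_SCORE_ADJ_MAX 1000
#endif

/*
 * Hypothetical helper: map an old-style oom_adj value (-17..+15) to the
 * new-style oom_score_adj range (-1000..+1000).
 * Assumption: linear scaling by OOM_SCORE_ADJ_MAX / -OOM_DISABLE, with
 * OOM_ADJUST_MAX treated as a special case that maps to the maximum.
 */
static int
oom_adj_to_score_adj (int oom_adj)
{
        if (oom_adj == OOM_ADJUST_MAX)
                return OOM_SCORE_ADJ_MAX;

        /* Integer division truncates toward zero. */
        return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

Under that assumption, -17 maps to -1000 and 0 maps to 0, so "never kill" and
"no adjustment" mean the same thing in both interfaces.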
In summary, for kernels older than 2.6.11 we must disable OOM-related code
completely, for kernels from 2.6.11 to 2.6.35 inclusive we must use the old
interface (/proc/[pid]/oom_adj), and starting from 2.6.36 we must use
/proc/[pid]/oom_score_adj.
It is obviously not that simple. For example, RHEL6, while having a 2.6.32
kernel, provides the /proc/[pid]/oom_score_adj interface. So, I guess, we must
do this:
1) if there is no /proc/self/oom_adj and no /proc/self/oom_score_adj, consider
this kernel to be too old and disable OOM-related code (i.e. do not define
HAVE_LINUX_OOM_PROC);
2) if there is /proc/self/oom_adj, but no /proc/self/oom_score_adj, consider
this kernel to be old and use old OOM /proc interface (define
HAVE_LINUX_OOM_PROC and HAVE_LINUX_OOM_PROC_V1, for example);
3) if there is /proc/self/oom_score_adj, work as we do now (and define
HAVE_LINUX_OOM_PROC_V2 or so);
4) if there is linux/oom.h, use it for constants (define HAVE_LINUX_OOM_H),
otherwise define necessary constants manually.
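The decision order in steps 1) to 3) could be sketched roughly like this. Note
that the proposal is about build-time detection, while this sketch probes at
run time with access(2) purely to show the logic; the enum and both function
names are hypothetical.

```c
#include <unistd.h>

/* Hypothetical labels for the three cases in the list above. */
enum oom_iface {
        OOM_IFACE_NONE, /* step 1: neither file exists, disable OOM code */
        OOM_IFACE_V1,   /* step 2: only /proc/self/oom_adj exists        */
        OOM_IFACE_V2    /* step 3: /proc/self/oom_score_adj exists       */
};

/* Pure decision logic: the newer interface wins whenever it is present. */
static enum oom_iface
oom_classify (int have_adj, int have_score_adj)
{
        if (have_score_adj)
                return OOM_IFACE_V2;
        if (have_adj)
                return OOM_IFACE_V1;
        return OOM_IFACE_NONE;
}

/* Probe the running kernel by checking which /proc files exist. */
static enum oom_iface
oom_probe (void)
{
        int have_adj       = (access ("/proc/self/oom_adj", F_OK) == 0);
        int have_score_adj = (access ("/proc/self/oom_score_adj", F_OK) == 0);

        return oom_classify (have_adj, have_score_adj);
}
```

Checking for the file rather than the kernel version is what makes the RHEL6
case work: a 2.6.32 kernel with /proc/self/oom_score_adj is classified as V2.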
Not defining HAVE_LINUX_OOM_PROC will throw away OOM-related code completely.
The HAVE_LINUX_OOM_PROC_V1/HAVE_LINUX_OOM_PROC_V2 option will select which
/proc file the code writes to as well as which constants it uses. In case we
have HAVE_LINUX_OOM_PROC (V1 or V2), but do not have HAVE_LINUX_OOM_H, we
might end up doing this:
#define OOM_DISABLE (-17)
#define OOM_ADJUST_MIN (-16)
#define OOM_ADJUST_MAX 15
#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000
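Putting the pieces together, the macros might be wired up along these lines.
This is only a sketch: the GF_OOM_* names and both helper functions are
hypothetical, and the HAVE_* macros are assumed to come from the build system
as proposed above.

```c
#include <stdio.h>

#ifdef HAVE_LINUX_OOM_H
#include <linux/oom.h>
#else
/* Header missing: define the constants manually. */
#define OOM_DISABLE (-17)
#define OOM_ADJUST_MIN (-16)
#define OOM_ADJUST_MAX 15
#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000
#endif

/* Pick the /proc file and valid range for the detected interface. */
#if defined(HAVE_LINUX_OOM_PROC_V2)
#define GF_OOM_FILE "/proc/self/oom_score_adj"
#define GF_OOM_MIN OOM_SCORE_ADJ_MIN
#define GF_OOM_MAX OOM_SCORE_ADJ_MAX
#elif defined(HAVE_LINUX_OOM_PROC_V1)
#define GF_OOM_FILE "/proc/self/oom_adj"
#define GF_OOM_MIN OOM_DISABLE /* -17 is valid and below OOM_ADJUST_MIN */
#define GF_OOM_MAX OOM_ADJUST_MAX
#endif

/* Clamp value into [min, max]. */
static int
oom_clamp (int value, int min, int max)
{
        if (value < min)
                return min;
        if (value > max)
                return max;
        return value;
}

/* Write the clamped value to path; return 0 on success, -1 on failure. */
static int
oom_write_value (const char *path, int value, int min, int max)
{
        FILE *f   = fopen (path, "w");
        int   ret = 0;

        if (!f)
                return -1;
        if (fprintf (f, "%d\n", oom_clamp (value, min, max)) < 0)
                ret = -1;
        if (fclose (f) != 0)
                ret = -1;
        return ret;
}
```

A caller would then wrap oom_write_value (GF_OOM_FILE, value, GF_OOM_MIN,
GF_OOM_MAX) in #ifdef GF_OOM_FILE, so that nothing OOM-related is compiled
when neither interface was detected.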
With these changes we'll cover all the possibilities one may face while
compiling GlusterFS against a relatively old kernel.
Attaching Milind's initial patch as a proof-of-concept, but I will take care of
implementing everything written above if there are no objections.