some ideas for kernel programming

introduction

Someone asked online how to put his kernel programming skills to use and get paid work of that kind. Here are some ideas for what could be programmed. If a programmer can show that tasks like these are within his reach, and that the results are worth having, someone might pay for them as small contracts. In some cases that would mean new kernel modules. The customer may or may not want the code upstreamed.

pick tasks from an existing project

An example project is Capsicum [1], where the status description includes:
"Kernel capability development - While the basic Capsicum kernel framework is now complete, maintaining and refining the current implementation is an on-going task. We anticipate that future kernel features may be required, such as a more formal notion of groupings of related sandboxed processes, in order to make garbage-collecting them on application exit easier." [2]
It may be harder to get paid for something other people are doing anyway, unless you can make it happen faster.

hunt and fix bugs

The Kernel Self Protection Project [3] tackles security problems; BadUSB [4] is an example. The Fuzzing Project [5] could also be used to find kernel bugs to fix, with a caveat about which software fits:
Certain software projects don't fit at all. That includes software that does no file format parsing (USB/network/other input fuzzing is excluded for now).

extend resource limits

Resource limits [6] are a standard feature for controlling programs and here is a suggestion for extending them.
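
For orientation, existing limits are read and changed with getrlimit() and setrlimit(). A process may lower a limit freely, but raising a hard limit needs privilege. The sketch below uses Python's resource module on a Unix system; the helper name lower_soft_limit is made up for illustration, and the core file size limit is chosen only because lowering it briefly is harmless.

```python
import resource


def lower_soft_limit(which, new_soft):
    """Lower the soft limit, read back what the kernel stored, then restore it.

    Hypothetical helper for illustration. Returns (limits_while_lowered, hard).
    """
    old_soft, hard = resource.getrlimit(which)
    # Lowering the soft limit needs no privilege.
    resource.setrlimit(which, (new_soft, hard))
    seen = resource.getrlimit(which)
    # Raising the soft limit back is allowed as long as it stays within the
    # hard limit. Raising the hard limit itself would need privilege.
    resource.setrlimit(which, (old_soft, hard))
    return seen, hard


if __name__ == "__main__":
    seen, hard = lower_soft_limit(resource.RLIMIT_CORE, 0)
    print("soft and hard limit while lowered:", seen)
```

A one-way limit, like the one proposed in this section, behaves like a hard limit that an unprivileged process has lowered: it cannot be put back.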

The idea for a new limit occurred to me several years ago, when sudo [7] introduced the NOEXEC flag, which used a C library to prevent unintended execution of programs. The aim is that a program run by sudo can be prevented from executing any further programs. For comments on sudo configuration see my earlier article [8]. sudo has since changed its approach and uses seccomp [9] where possible to implement NOEXEC.

An approach to restricting execution based on resource limits could be as follows:

  1. a new limit (call it exec_depth) applying to each process is an unsigned integer
  2. the maximum value of this integer results in traditional unrestricted behaviour
  3. lower values result in every successful exec*() call decrementing the integer
  4. when the limit is 0 no further exec*() calls are permitted
  5. there is no means to raise the number, at all, ever
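
The five rules can be modelled in user space to check that they behave as intended. This is only a simulation, not kernel code. The class and function names, the 32-bit maximum, and the assumption that a forked child inherits the limit (as with existing resource limits) are mine.

```python
UNLIMITED = 2**32 - 1  # assumed maximum of the unsigned integer (rule 2)


class Process:
    """Toy model of a process carrying the proposed exec_depth limit (rule 1)."""

    def __init__(self, name, exec_depth=UNLIMITED):
        self.name = name
        self.exec_depth = exec_depth

    def set_exec_depth(self, value):
        # Rule 5: the limit may be lowered but never raised.
        if value > self.exec_depth:
            raise PermissionError("exec_depth cannot be raised")
        self.exec_depth = value

    def fork(self):
        # Assumption: the child inherits the parent's limit.
        return Process(self.name, self.exec_depth)

    def exec(self, program):
        # Rule 4: at 0 no further exec*() calls are permitted.
        if self.exec_depth == 0:
            raise PermissionError(f"exec of {program} refused")
        # Rule 2: the maximum value means unrestricted, so it is not counted down.
        # Rule 3: any lower value is decremented by each successful exec.
        if self.exec_depth != UNLIMITED:
            self.exec_depth -= 1
        self.name = program


def run_chain(limit, programs):
    """Start each program from the previous one; return those that got to run."""
    proc = Process("sudo")
    proc.set_exec_depth(limit)
    started = []
    for program in programs:
        proc = proc.fork()
        try:
            proc.exec(program)
        except PermissionError:
            break
        started.append(program)
    return started


if __name__ == "__main__":
    print(run_chain(1, ["man", "less", "sh"]))
    print(run_chain(2, ["man", "less", "sh"]))
```

In this model a limit of 1 lets only the first program start, which matches the NOEXEC behaviour, while the maximum value leaves the whole chain unrestricted.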

sudo could set the limit to 1 just before calling the program it wants to confine. Because the limit is a counter, it could also set a higher number such as 2. An example is the following, where you want less to be executed but not sh:

demo$ sudo man ls # shell escape performed as root tested on Fedora 26 (alpha)
sshd sshd sshd bash sudo man less sh pstree
F S   UID    PID   PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
4 S     0   1178   1131  0  80   0 - 70274 poll_s pts/0    00:00:00 sudo
4 S     0   1179   1178  0  80   0 - 31492 wait   pts/0    00:00:00 man
4 S     0   1191   1179  0  80   0 - 29242 wait   pts/0    00:00:00 less
4 S     0   1199   1191  0  80   0 - 33138 wait   pts/0    00:00:00 sh
This suggests that sudo should set exec_depth to 2. But closer examination shows that won't work, because nroff calls locale at the same depth as the sh you aim to prevent. So until man is reorganised to reach groff without going through nroff, this doesn't make a good example. The two call chains are:
sudo-/bin/man-/bin/nroff-/bin/locale
sudo-/bin/man-/bin/less-/bin/sh
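
The depth clash can be checked with a little arithmetic. The sketch below counts the exec*() calls each chain needs and lists the finite exec_depth values that would let a wanted chain finish while stopping an unwanted one. The function names are made up for illustration, and the chains are the two shown above.

```python
WANTED = "sudo-/bin/man-/bin/nroff-/bin/locale"
UNWANTED = "sudo-/bin/man-/bin/less-/bin/sh"


def execs_needed(chain):
    """Number of exec*() calls after sudo needed to reach the last program."""
    return len(chain.split("-")) - 1


def usable_limits(wanted, unwanted):
    """Finite exec_depth values that allow `wanted` and still stop `unwanted`.

    A limit allows a chain when it is at least the number of execs the chain
    needs, and stops a chain when it is smaller than that number.
    """
    return list(range(execs_needed(wanted), execs_needed(unwanted)))


if __name__ == "__main__":
    # Ignoring nroff, a limit of 2 would separate less from sh.
    print(usable_limits("sudo-/bin/man-/bin/less", UNWANTED))
    # With nroff calling locale, no value separates the two chains.
    print(usable_limits(WANTED, UNWANTED))
```

Both chains need three calls, so any limit high enough for locale is also high enough for sh.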
Another use case is setting exec_depth to 0 at the start of every program that knows it isn't meant to call other programs. A vendor shipping this kernel feature might change their package builder to add such a call to a long list of programs whose original source does not have it.

Comparing exec_depth resource limit to other techniques:

  1. More robust than the C library approach first used by sudo
  2. Granting even a single level too many bypasses the constraint (no use giving 3 for sudo man ls)
  3. Independent from other tools: can be used in combination with other tools
  4. Portable and lightweight
  5. Some might consider it unwanted duplication of existing confinement techniques
  6. Confined programs should in any case behave safely after a failed exec*() - if they do not, this could provide a new attack vector

alter discretionary access control

Probably everyone has at times got tired of users (or even vendors) setting crazy filemodes. I have seen interest from someone in limiting discretionary access control (the ability to chmod your own files) to prevent some of the worst choices, especially world write.

As applied to local filesystems that could mean a new mount option to apply policy choices such as:

  1. treat everything as having no world write regardless of actual filemode
  2. when creating or changing modes restrict them to no world write
  3. when creating or changing modes allow only selected users to define write to certain groups (details maybe in a config file or LDAP)
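
The three policy choices come down to masking mode bits at different points. The sketch below shows the bit arithmetic only, in user space. The function names and the GROUP_WRITE_GRANTS table, which stands in for the config file or LDAP lookup, are hypothetical.

```python
import stat

# Hypothetical stand-in for the config file or LDAP lookup in policy 3:
# which users may grant write access to which groups.
GROUP_WRITE_GRANTS = {"alice": {"staff"}}


def effective_mode(stored_mode):
    """Policy 1: treat the file as having no world write, whatever is stored."""
    return stored_mode & ~stat.S_IWOTH


def restrict_requested_mode(requested, user, group, grants=GROUP_WRITE_GRANTS):
    """Policies 2 and 3: the mode actually stored on create or chmod."""
    mode = requested & ~stat.S_IWOTH  # policy 2: never store world write
    if group not in grants.get(user, set()):
        mode &= ~stat.S_IWGRP  # policy 3: group write only for selected users
    return mode


if __name__ == "__main__":
    print(oct(effective_mode(0o777)))
    print(oct(restrict_requested_mode(0o666, "alice", "staff")))
    print(oct(restrict_requested_mode(0o666, "bob", "staff")))
```

Policy 1 leaves the stored mode alone and masks it at access time, so it also covers files created before the option was set. Policies 2 and 3 change what gets stored.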

conclusion

Once some development ideas have been sketched out, along with the effort required and the usefulness of the work, people might be found who are willing to pay for the features or improvements.

footnotes

  1. Capsicum roadmap
  2. Capsicum project info
  3. Kernel Self Protection Project
  4. USB srlabs and USB arstechnica
  5. Fuzzing Project
  6. resource limits
  7. sudo
  8. sudo policy basics
  9. seccomp

Written by Peter M Allan. 2017