Linux 6.19 Merge Window

Linux v6.18 was released on Sunday, November 30th, with the Linux v6.19 merge window opening immediately afterwards. Below are the highlights of the LSM, SELinux, and audit pull requests which have been merged into Linus’ tree.

LSM

  • The LSM initialization code was heavily reworked to improve code quality, avoid unnecessary work for LSMs that are disabled at boot time, and support a new LSM notification, LSM_STARTED_ALL, which indicates that all enabled LSMs have been fully initialized. The notification is currently unused, but work is in progress to use it to measure the IPE boot policy once all of the LSMs have been fully initialized and started.

  • The device_cgroup code was updated to make better use of the seq_put*() helper functions. This is purely a code quality improvement; there should be no visible user impact.

SELinux

  • Traditionally memfd files were labeled as either tmpfs or hugetlbfs files depending on the system’s configuration. While this was simple, and aligned well with the memfd implementation, it made it difficult to differentiate between memfd files and other tmpfs/hugetlbfs files. To resolve this, a new policy capability was created, “memfd_class”, which, when enabled, adds a new object class for memfd files, memfd_file. The new object class enables policy developers to write policy specifically for memfd files without impacting other tmpfs or hugetlbfs files. As the patch developer, Thiébaud Weksteen, pointed out in the commit description, this is of particular interest when execution of a memfd is attempted:

    The ability to limit fexecve on memfd has been of interest to avoid potential pitfalls where /proc/self/exe or similar would be executed (see ChromeOS Issue and memfd exec protections). Reuse the “execute_no_trans” and “entrypoint” access vectors, similarly to the file class. These access vectors may not make sense for the existing “anon_inode” class. Therefore, define and assign a new class “memfd_file” to support such access vectors.

  • A new build time configuration has been introduced, CONFIG_SECURITY_SELINUX_AVC_HASH_BITS, which allows adjustment of the number of hash buckets in the SELinux Access Vector Cache (AVC). The default value is 9 bits, resulting in 512 (2^9) hash buckets. Users with unusual workloads or non-typical SELinux policies may want to experiment with this value.

  • The SELinux Access Vector Cache (AVC) moved from a custom hash function to the MurmurHash3 hash, resulting in improvements in hash distribution and latency.
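As a rough illustration of the two AVC items above, here is a minimal Python sketch of how a hash-bits setting and a MurmurHash3-style hash could combine to pick a cache bucket. It uses the textbook MurmurHash3 x86_32 routine over three 32-bit words; it is not the kernel's code, and the names murmur3_words(), num_buckets(), and avc_bucket() are hypothetical. It also assumes the table holds 2^bits buckets and that the key is the (source SID, target SID, object class) triple.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_words(words, seed=0):
    """Textbook MurmurHash3 x86_32 over a sequence of 32-bit words."""
    h = seed & MASK32
    for k in words:
        k &= MASK32
        k = (k * 0xCC9E2D51) & MASK32
        k = rotl32(k, 15)
        k = (k * 0x1B873593) & MASK32
        h ^= k
        h = rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    # Mix in the input length in bytes, then apply the final avalanche.
    h ^= (4 * len(words)) & MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def num_buckets(hash_bits):
    """Assumption: the table has 2^hash_bits buckets."""
    return 1 << hash_bits


def avc_bucket(ssid, tsid, tclass, hash_bits=9):
    """Hypothetical helper: map an (ssid, tsid, tclass) key to a bucket index."""
    return murmur3_words((ssid, tsid, tclass)) & (num_buckets(hash_bits) - 1)


if __name__ == "__main__":
    print(num_buckets(9))          # bucket count for the default of 9 bits
    print(avc_bucket(1, 2, 3))     # bucket index for a sample key
```

Raising the bit count in this sketch doubles the bucket count per bit, which shortens the average chain per bucket at the cost of a larger table; that trade-off is the reason one might experiment with the setting.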

Audit

  • The __audit_inode_child() function loops over the list of logged inodes twice, first to search for a parent inode, and then again to search for a potential match for the child inode. Linux v6.19 will consolidate these two loops into a single loop that searches for a matching parent and child inode at the same time, resulting in approximately a 50% reduction in audit overhead.
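The loop consolidation described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the kernel implementation: the record layout, the predicates, and the function names find_two_pass() and find_single_pass() are all hypothetical.

```python
def find_two_pass(names, is_parent, is_child):
    """Walk the list twice: once for the parent, once for the child."""
    parent = next((n for n in names if is_parent(n)), None)
    child = next((n for n in names if is_child(n)), None)
    return parent, child


def find_single_pass(names, is_parent, is_child):
    """Walk the list once, checking both conditions on every entry."""
    parent = child = None
    for n in names:
        if parent is None and is_parent(n):
            parent = n
        if child is None and is_child(n):
            child = n
        if parent is not None and child is not None:
            break  # both found, no need to look at the rest of the list
    return parent, child


if __name__ == "__main__":
    # Hypothetical records standing in for the list of logged inodes.
    names = [
        {"ino": 10, "name": "/tmp"},
        {"ino": 11, "name": "other"},
        {"ino": 12, "name": "file.txt"},
    ]
    is_parent = lambda n: n["ino"] == 10
    is_child = lambda n: n["name"] == "file.txt"
    print(find_single_pass(names, is_parent, is_child))
```

Both functions return the same pair for the same input; the single-pass version simply touches each list entry at most once.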

Linux 6.18 Released

Linux v6.18 was released on Sunday, November 30th. I already wrote up a post highlighting the LSM, SELinux, and audit changes that were submitted during the merge window. However, there were additional changes that went into Linux v6.18 that are described below.

SELinux

  • Fix a problem where the per-task directory access cache introduced in Linux v6.16 was tied to a credential and not a task. An odd problem, largely caused by changes over time and a failure to properly update the SELinux object security structure names due to those changes. The fix for this particular problem is to reintroduce a proper per-task security structure for SELinux and rename the existing per-credential security structure to better reflect its nature.

Linux 6.18 Merge Window

Linux v6.17 was released on Sunday, September 28th, with the Linux v6.18 merge window opening immediately afterwards. Below are the highlights of the LSM, SELinux, and audit pull requests which have been merged into Linus’ tree.

LSM

  • Management of the BPF LSM security blobs was moved into the LSM framework. Previously these blobs were managed by SELinux as it was the only LSM with BPF access controls. Moving the blob lifecycle management to the LSM framework enables other LSMs to implement their own BPF access controls or observation implementations.

  • Convert the LSM block device security blob allocator to use the existing allocator helper function. This should have no effect on users, but it reduces code duplication and eases maintenance of the code moving forward.

  • Update the Rust credentials code to use sync::aref. This is part of a larger effort to move the Rust kernel code over to the sync module.

SELinux

  • Support per-file labeling on functionfs, a pseudo-filesystem that can be used to implement USB gadget drivers.

  • Convert sel_read_bool() to use a small stack buffer instead of a memory page allocated via get_zeroed_page(). There are a limited number of pages available via get_zeroed_page(); migrating SELinux away from these pages helps ensure that the system does not exhaust this limited resource.

  • Make better use of the network helper functions to retrieve the sock associated with a network packet. While this has no real effect on the code, it does make it cleaner and easier to maintain.

  • Remove some unused and redundant code.

Audit

  • Create a new AUDIT_MAC_TASK_CONTEXTS audit record to log all of the LSM labels associated with a task on a system with multiple LSMs enabled. Casey Schaufler, the patch’s author, provides an example and an explanation of when the record may be generated in the patch’s description:

    Create a new audit record AUDIT_MAC_TASK_CONTEXTS. An example of the MAC_TASK_CONTEXTS record is:

    type=MAC_TASK_CONTEXTS
      msg=audit(1600880931.832:113)
      subj_apparmor=unconfined
      subj_smack=_

    When an audit event includes a AUDIT_MAC_TASK_CONTEXTS record the “subj=” field in other records in the event will be “subj=?”. An AUDIT_MAC_TASK_CONTEXTS record is supplied when the system has multiple security modules that may make access decisions based on a subject security context.

  • Similar to the new AUDIT_MAC_TASK_CONTEXTS record, create a new AUDIT_MAC_OBJ_CONTEXTS audit record to log all of the LSM labels associated with an object on a system with multiple LSMs enabled. Casey Schaufler, the patch’s author, describes the work in the patch description:

    Create a new audit record AUDIT_MAC_OBJ_CONTEXTS. An example of the MAC_OBJ_CONTEXTS record is:

    type=MAC_OBJ_CONTEXTS
      msg=audit(1601152467.009:1050):
      obj_selinux=unconfined_u:object_r:user_home_t:s0

    When an audit event includes a AUDIT_MAC_OBJ_CONTEXTS record the “obj=” field in other records in the event will be “obj=?”. An AUDIT_MAC_OBJ_CONTEXTS record is supplied when the system has multiple security modules that may make access decisions based on an object security context.

  • Ensure that audit records for fanotify events are always generated. Previously these events were only logged when audit was explicitly configured, in contrast to the Linux audit convention where security-relevant events are always logged.

  • Minor comment and coding style fixes.
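For anyone post-processing audit logs, the MAC_TASK_CONTEXTS and MAC_OBJ_CONTEXTS examples quoted above suggest a simple way to pull out the per-LSM labels. The following Python sketch is an illustration only: it assumes the record is available as whitespace-separated key=value fields, as in the quoted examples, and parse_contexts_record() is a hypothetical helper rather than part of any audit tooling.

```python
def parse_contexts_record(text):
    """Split a MAC_*_CONTEXTS record into (record type, {lsm: label})."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore anything that is not a key=value pair
            fields[key] = value

    record_type = fields.get("type")
    labels = {}
    for key, value in fields.items():
        # subj_<lsm> fields carry task labels, obj_<lsm> fields object labels.
        if key.startswith(("subj_", "obj_")):
            lsm = key.split("_", 1)[1]
            labels[lsm] = value
    return record_type, labels


# The two sample records from the patch descriptions, joined onto one line.
TASK_RECORD = (
    "type=MAC_TASK_CONTEXTS msg=audit(1600880931.832:113) "
    "subj_apparmor=unconfined subj_smack=_"
)
OBJ_RECORD = (
    "type=MAC_OBJ_CONTEXTS msg=audit(1601152467.009:1050): "
    "obj_selinux=unconfined_u:object_r:user_home_t:s0"
)

if __name__ == "__main__":
    print(parse_contexts_record(TASK_RECORD))
    print(parse_contexts_record(OBJ_RECORD))
```

A consumer that sees “subj=?” or “obj=?” in another record of the same event could use a helper like this to look up the real labels from the accompanying contexts record.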