VFS vn_lock function makes system unresponsive when calling vn_fullpath

Wed May 31 18:31:30 UTC 2017

Hi Mateusz, 

Thank you for your response. 

>> Hard to say off hand, but so far it looks like the vnode is already exclusively locked 

It seems so. I have added "error = vn_lock(dvp, LK_SHARED |
LK_SLEEPFAIL);" to the code of the kernel module before vn_gullpath()
and vn_lock has returned error code: EDEADLK    11    /* Resource
deadlock avoided */. 

>> Are you running the kernel with DEBUG_VFS_LOCKS? 

At the moment no, because I was only suspecting that something wrong
with the vnode, but I hoped that the problem was located somewhere in
the namecache lookup. Later, I will run kernel with DEBUG_VFS_LOCKS. 

>> What is the purpose of this module in the first place? 

It is a clone of the Apple's Sandbox (seatbelts or how it was called).
The module performs: mandatory (by choice) integrity control of the
executable files and option to apply additional constraints on the
executable. The test version of the userland software is almost ready.
The tinyscheme scripts are converted to the binary schemes (like apple
sandbox does, but in another way to avoid problems with Apple). The
rules can be loaded to the module, and testing against the
mpo_kld_check_load was successful. 

This is more like a research and at the same time my personal "security"
tool which ideally, should apply on some programs additional constraints
and warn the admin/security team in real time if the application is
attempting to leave the sandbox (is doing what it was prohibited to do)
(i.e like access /etc/rc.conf, /dev/kmem even as root, execute shell),
for instance php suddenly performs kldload from /tmp. (I know that this
is more "WinNT" approach (Agnitum Outpost  with proactive security), for
this reason I am not advertising it a lot and probably I will keep
source code available online) 

I need this kernel module because I am running some closed source
programs on the server which are "black boxes" and can't be trusted from
any point. And for other reasons... Three developers (inc me) are
working on the development of the project in their spare time. 

>> regular vnode -> path resolution is not guaranteed to work 

Yes, I remember we  discussed a bit this problem when I was studying in
university. But for development it was more than enough. 

>>. For instance someone else can be modifying the directory tree as you translate 

Yes, this is also possible. The plan is to implement other MAC
procedures and then decide what to do with vn_fullpath and vnodes in
general. 

My intention was to confirm that vn_fullpath() is a dead end and that
the FreeBSD developers know of the potential deadlock problem. To let
others know where to look for solution should they experience the same
problem. 

--

Kind Regards,
Alexander Morozov 

> Hi Mateusz, 
> 
> Thank you for your response. 
> 
>>> Hard to say off hand, but so far it looks like the vnode is already exclusively locked 
> 
> It seems so. I have added "error = vn_lock(dvp, LK_SHARED | LK_SLEEPFAIL);" to the code of the kernel module before vn_gullpath() and vn_lock has returned error code: EDEADLK    11    /* Resource deadlock avoided */. 
> 
>>> Are you running the kernel with DEBUG_VFS_LOCKS? 
> 
> At the moment no, because I was only suspecting that something wrong with the vnode, but I hoped that the problem was located somewhere in the namecache lookup. Later, I will run kernel with DEBUG_VFS_LOCKS. 
> 
>>> What is the purpose of this module in the first place? 
> 
> It is a clone of the Apple's Sandbox (seatbelts or how it was called). The module performs: mandatory (by choice) integrity control of the executable files and option to apply additional constraints on the executable. The test version of the userland software is almost ready. The tinyscheme scripts are converted to the binary schemes (like apple sandbox does, but in another way to avoid problems with Apple). The rules can be loaded to the module, and testing against the mpo_kld_check_load was successful. 
> 
> This is more like a research and at the same time my personal "security" tool which ideally, should apply on some programs additional constraints and warn the admin/security team in real time if the application is attempting to leave the sandbox (is doing what it was prohibited to do) (i.e like access /etc/rc.conf, /dev/kmem even as root, execute shell), for instance php suddenly performs kldload from /tmp. (I know that this is more "WinNT" approach (Agnitum Outpost  with proactive security), for this reason I am not advertising it a lot and probably I will keep source code available online) 
> 
> I need this kernel module because I am running some closed source programs on the server which are "black boxes" and can't be trusted from any point. And for other reasons... Three developers (inc me) are working on the development of the project in their spare time. 
> 
>>> regular vnode -> path resolution is not guaranteed to work 
> 
> Yes, I remember we  discussed a bit this problem when I was studying in university. But for development it was more than enough. 
> 
>>> . For instance someone else can be modifying the directory tree as you translate 
> 
> Yes, this is also possible. The plan is to implement other MAC procedures and then decide what to do with vn_fullpath and vnodes in general. 
> 
> My intention was to confirm that vn_fullpath() is a dead end and that the FreeBSD developers know of the potential deadlock problem. To let others know where to look for solution should they experience the same problem. 
> 
> --
> 
> Kind Regards,
> Alexander Morozov 
> 
> Mateusz Guzik писал 2017-05-31 00:27: 
> On Tue, May 30, 2017 at 4:42 PM, Alexander Morozov <alex at nixd.org> wrote:
> 
> Hi,
> 
> At the moment I am developing a kernel module based on MAC Framework which is invoking a vn_fullpath() from vfs_cache.c.
> 
> Here is the thing:
> When the MAC Framework mpo_vnode_check_write procedure is called, the kernel module is trying to retrieve: the path on the disk for the curthread and the path of the file in which the curthread is attempting to write. At the point when the program execution reaches the vn_fullpath() for the resolution of the file's path, the terminal window 'freezes' (actually the whole system is not responding, the rest ttys stop responding after entering login credentials).
> 
> For instance, after loading and initializing a MAC kernel module, I am opening a existing text file using vi  to edit it. After inserting some random text I press ESC key on keyboard  and terminal window 'freezes' (at that moment the  mpo_vnode_check_write is called) ( the struct vnode * vp which is passed to the MAC procedure is valid (not NULL) and type (enum vtype) is VREG).
> 
> I have investigated this issue and found out the following:
> The get_fullpath() is calling the get_fullpath1() where later  the vn_vptocnp() is invoked.
> Retrieving the location for the curthread is successful (the full path returned).
> But when the kernel module is making attempt to retrieve the path for the vp (argument of the mpo_vnode_check_write) the function vn_lock(*vp, LK_SHARED | LK_RETRY) (line 2191) located in function vn_vptocnp() is grabbing control forever. 
> 
> Hard to say off hand, but so far it looks like the vnode is already exclusively locked and now the kernel deadlocks itself by locking it in shared mode. You can easily inspect the state in ddb with show lockedvnods.
> 
> Are you running the kernel with DEBUG_VFS_LOCKS?
> 
> What is the purpose of this module in the first place?
> 
> regular vnode -> path resolution is not guaranteed to work. While it will work most of the time, it is inherently racy problematic in presence of multiple hardlinks. For instance someone else can be modifying the directory tree as you translate back and trick you into thinking the vnode represents a different file. Even if this was not the case, the translation of the sort on each write would be a performance killer.
> 
> The only possibly working approach I see would attach metadata to the vnode after lookup and then use it.  
> Below I have copied and pasted the code which performers the path resolution and the MAC procedure handler:
> static int
> rw_retreive_data(struct thread * td, struct vnode *dvp, char ** rpath, char ** curpath, struct sandbox_rule_app ** rule_ptr)
> { 
> [snip]         error = vn_fullpath(td, td->td_proc->p_textvp, curpath, &curfreepath); 
> [snip] }
> 
> int sandbox_vnode_check_write(struct ucred *active_cred,
> struct ucred *file_cred,
> struct vnode *vp,
> struct label *vplabel)
> {
> IS_MODULE_INITED(0)
> ASSERT_NULL_R(vp, 0);
> 
> struct sandbox_rule_app * rule_ptr = NULL;
> char * rpath = "-";
> char * curpath = "-";
> int error = 0;
> 
> RWLOCK_BLOCK(&sandbox_rules_lock, RWLOCK_READ)
> {
> 
> error = rw_retreive_data(curthread, vp, &rpath, &curpath, &rule_ptr); 
> 
> If this is using rwlock the code is additionally wrong as vn_fullpath can induce unbound sleep, while the lock at hand only supports bound sleep. 
> -- 
> Mateusz Guzik <mjguzik gmail.com [1]>

Links:
------
[1] http://gmail.com