Linux Landlock is a kernel-native security module that lets unprivileged processes sandbox themselves - but nobody uses it because the API is ... hard!
I built `landrun`, a small CLI tool in Go, to make it practical to sandbox any command with fine-grained filesystem and network access controls. No root. No containers. No SELinux/AppArmor configs.
It's lightweight, auditable, and wraps Landlock v5 features (file access + TCP restrictions).
I’d recommend adding your first (and maybe second) paragraphs directly to your readme - this is a much clearer description if you don’t know what landlock is already!
I agree. The first section of the README leaves the impression that Landrun comes with a kernel module -- that would be a red flag for me. The fact that it uses an existing kernel module that is in the mainline is going to be critical to anyone using Landrun.
Looks very interesting. I'm achieving something somewhat similar by running some processes under docker and mounting volumes ro, but could definitely see a use case for adding landlock to more server processes.
yeah you are missing --exec there, which feels a bit useless that you have to mention it, but I prefer things explicit and use all LSM can provide, I can imagine cases where --exec isn't really required. like `cat`.
either case have a look at latest release, it's a bit cleaner.
Could you please help me understand why exec is required for this touch example? Is it necessary to actually launch the touch binary? Or touch itself exec()s something else?
seems to indicate that `--exec` is only required if the command you're executing then uses an `exec`-call internally, which `bash` would need to be able to fork.
So `touch` should not need `--exec`, while `bash` should be able to run anything it can read (including that whitelisted `/tmp`).
The former does not work for me, I have to add --exec. I can only assume it's because touch is in /usr/bin and so it needs permission to execute it from there.
It seems that using --ro or --rw at all makes --exec also mandatory.
well yeah you'll need --exec when you want to run binaries (unlike... cat?)
I hope landlock adds support to bind --exec to actual directories, that'll be fun!
As a workaround you could create a tmpfs device like /tmp_noexec with noexec flag, and mount it instead of the normal /tmp. But landrun does not (yet?) allow changing the name in directory options :(
For added security, I'd create an ephemeral tmpfs disk for each landlocked invocation: obviously the program we're running has no business seeing what other processes may have put to /tmp.
thanks for the link, Sydbox seems like a super cool project, but there's something weird about it: too many links in the README. not on GitHub, and the project that's on GitHub with a similar name hasn't had a commit in 16 years, is it by the same person?
if they can polish up the public facing side of the project, it would instill more confidence.
I don't need a link to Wikipedia every time "PoC" is used. Or to an online man page every time strace(1) is mentioned.
I get it that a documentation can have more than one "entry point", and hyperlinking all occurrences solves that.
But I think assuming certain audience leads to a document that is more effective. You don't explain addition in university-level textbooks, to make it easier to children from primary school.
This product is simply not for people who hear of strace for the first time.
Some Wikipedia articles themselves do this, linking every common word in the article, which makes trying to simply highlight a section of text a fun adventure. I ended up at one point making a userscript to strip all internally-pointing links just to make an article more readable (as an addition to an existing script that stripped all the "[citation needed]" and other noise).
Wikipedia needs some notion of "suggested links" that don't become links unless the text is selected or they're toggled globally or some other explicit action. With those, authors could go and link every last word if they like.
I always seem to end up with duplicate profiles, or `about:profiles` refuses to open ("Another copy of Firefox has made changes to profiles.") and on and on with various hiccups and speedbumps. Small annoyances, but profiles on Chrome always Just Work, and the half-dozen times I tried it on FF was always death by a thousand cuts.
It's been a few years, so I'll give profiles another try I guess. Containers likely won't do it since multiple profiles all use the same domain (console.aws.amazon.com being the obvious one).
Eh. Personally I find it refreshing to see a page err on the side of too many links instead of too few. No need to explain addition in any book if you can just link to the best explanation available.
The bigger issue IMO is that the links seem to be automatically-generated, and the generation is a bit sloppy; for example, the "Syd" links should probably link to the sandboxing technology instead of Pink Floyd's original frontman.
that looks really cool, but unfortunately without any obvious examples or even a link to documentation, I'm closing the tab and likely forgetting it exists... I would assume many others would feel the same way.
> Read the fine manuals of syd,
libsyd, gosyd, plsyd, pysyd, rbsyd, syd.el and watch the asciicasts Memory Sandboxing, PID Sandboxing, Network Sandboxing, and Sandboxing Emacs with syd.
I do agree, though, that the docs could be improved.
This seems pretty nice, as it directly uses the Landlock API from the Linux kernel (like pledge from OpenBSD). One feature I would like to have is something like a YAML description for a set of configurations, rather than passing all these arguments. So we could have preconfigured commands and just execute them. But I think it is just a matter of taste. I will try the tool. Thanks for it.
We are working to make it part of the OCI runtime specification too.
Using existing configuration format would not work because Landlock has its own unique properties: unprivileged, nested sandboxes, dedicated Linux syscalls, and a good compatibility story with opt-in and incremental features.
Awesome! I'm happy to hear that you and others are interested in the configuration language. We should probably coordinate that on the Landlock mailing list when the time comes, so that we don't duplicate that work. We are open to outside contributions :)
Would be cool to see integration of landlock with configuration file in a way that a service launched by systemd can apply the configuration to the executable.
Bubblewrap is very limited, for example it doesn't allow to grant access to /proc/self/exe without giving access to whole /proc subsystem. So I had to write an emulation of /proc in Python and mount it with FUSE to work around this. I wonder if this issue is fixed in landlock, firejail and others.
Also bubblewrap cannot ask for a decision in runtime: you must set up the rules beforehand.
If I understand it correctly, landlock is an API used by an app to sandbox itself. The app itself controls the sandboxing. Bubble wrap is user space tooling external to the app, so the app had no direct awareness or control of its sandboxing. The scenarios each is intended for are orthogonal to one another.
Landlock can be used to sandbox a launched sub process, as it is here, just as the Kernel APIs used by Bubblewrap could be (and sometimes are!) used by programs to sandbox themselves.
not exactly correct. bubblewrap, firejail, and I'm not sure, but maybe even AppArmor, all remove capabilities and create+join restricted fs/net namespaces, and then fork the actual thing you want to execute. so it's exactly the same concept, but those use the cap and cgroups.
Same question. One thing I really dislike in Bubblewrap is that I must share the whole net user namespace even if all I want to do is use UNIX domain sockets.
Since I only see net options specifying ports, does this handle this use case?
OpenBSD did get it right, but they also have a more relaxed scheme for backwards compatibility across releases. Linux's strict ABI compatibility guarantees complicate matters slightly, but with the right supporting library it becomes tolerable.
(Full disclosure, I am the author of that library)
FWIW, I do hope that we can motivate people to use Landlock in the same way as people use pledge on OpenBSD, as a lightweight self-sandboxing mechanism that requires fewer architectural changes to your program and results in more constrained sandboxes than Linux namespaces and other mechanisms do.
As far as I know the ABI for pledge and unveil really haven’t changed since release? What is stopping linux from creating NEW security primitives which are easy to use? We have wireguard in the linux kernel as a recent addition. Wireguard shows that new simple primitives can be added to the kernel, it requires someone with “good taste” to do the implementation without sacrificing usability.
BSD systems ship a kernel and user space, which simplifies a lot of things. Linux is more flexible but it comes at a cost. Adding new security features can also be challenging for other reasons. Anyway, Landlock is one of these new security primitives, and it is gaining new features over time.
The Landlock interface must not change the underlying semantics of what is allowed or denied, otherwise it could break apps built for an older or a newer kernel. However, these apps should still use all the available security features. This is challenging.
Landlock provides a way to define fine-grained security policies. I would not say the kernel interface is complex (rather flexible), but what really matters is the user space library interfaces and how they can safely abstract complexity.
I know how linux and bsd work. I still have yet to find a satisfactory answer to why linux cannot create security primitives which are useful, like wireguard. I understand that landlock tries to abstract complexity, but why do we need to design complex user interfaces? Pledge and unveil are just simple syscalls, there is no magic secret sauce on BSDs which enable these syscalls. It is true that bsd userspace has been compiled to bake in pledge and unveil syscalls, but that is totally separate from the usability of the interfaces.
For instance, with Pledge, the "dns" promise is implemented with hardcoded path in the kernel. Linux is complex because it is versatile and flexible. Controlling access to such features requires some complexity and the kernel might not be enough.
About interfaces, another example is that Unveil is configured with path names but Landlock uses file descriptors instead (more flexible).
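To make the path-name versus file-descriptor point concrete, here is a minimal Python sketch. It does not call Landlock or unveil at all; it only shows the general property that a descriptor keeps referring to the same directory after a rename, while the old path name stops resolving. The helper name `demo_fd_vs_path` and the directory names are made up for illustration, and `O_PATH` is assumed to be available only on Linux (hence the fallback).

```python
import os
import tempfile

# O_PATH is Linux-specific; fall back to a plain read-only open elsewhere.
OPEN_FLAGS = getattr(os, "O_PATH", os.O_RDONLY) | os.O_DIRECTORY


def demo_fd_vs_path():
    """Return (fd_follows_rename, old_path_exists) after renaming a directory."""
    base = tempfile.mkdtemp()
    old_path = os.path.join(base, "data")
    new_path = os.path.join(base, "data-renamed")
    os.mkdir(old_path)

    fd = os.open(old_path, OPEN_FLAGS)  # handle to the directory object itself
    try:
        os.rename(old_path, new_path)
        # The descriptor still identifies the same inode under its new name.
        fd_follows_rename = os.fstat(fd).st_ino == os.stat(new_path).st_ino
        # The old path name no longer resolves to anything.
        old_path_exists = os.path.exists(old_path)
    finally:
        os.close(fd)
        os.rmdir(new_path)
        os.rmdir(base)
    return fd_follows_rename, old_path_exists


if __name__ == "__main__":
    print(demo_fd_vs_path())
```

A rule keyed on the descriptor is tied to the object that was opened, which is one reading of why the descriptor-based interface is described as more flexible.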
Also, these OpenBSD primitives only apply to the currently executed binary; there are no nested sandboxes because the goal is not to create this kind of secure environment but mainly to secure a trusted binary.
For a given linux libc function (what a program calls), the underlying kernel syscall might change over time or vary for other reasons. Since the landlock/seccomp filters are at the kernel level, that breaks programs which only interact with libc calls and don't expect different behaviour.
The underlying kernel syscall should never change, though, right? Pretty sure that's the sort of userspace-backwards-compatibility-breaking change that would result in one of Linus' famous angry emails.
Things like clock_gettime64() to handle dates past 2038.
Calling clock_gettime() in libc will call the newer syscall (assuming _TIME_BITS=64 is set). But Linux has kept backwards compat, old programs can still call the old syscall.
If you wrote your seccomp rule for your program before clock_gettime64 existed, it'd break when glibc switched. I guess that implies each language stdlib should have their own seccomp etc wrappers.
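A toy model of that failure mode, in Python. This is not seccomp and installs no filter: the allowlist is just a set of syscall names, and `libc_clock_gettime` is a hypothetical stand-in for "whichever syscall the libc wrapper ends up issuing".

```python
# Toy model only: a seccomp-style allowlist reduced to a set of syscall names.
# This list stands for a filter written before clock_gettime64 existed.
ALLOWLIST_WRITTEN_IN_THE_PAST = {"read", "write", "exit_group", "clock_gettime"}


def libc_clock_gettime(time_bits):
    """Hypothetical stand-in: which syscall the libc wrapper issues."""
    return "clock_gettime64" if time_bits == 64 else "clock_gettime"


def filter_allows(syscall, allowlist):
    """A name-based filter can only know the syscalls it was written against."""
    return syscall in allowlist


# Same source code, same libc function call, different underlying syscall.
old_build_ok = filter_allows(libc_clock_gettime(32), ALLOWLIST_WRITTEN_IN_THE_PAST)
new_build_ok = filter_allows(libc_clock_gettime(64), ALLOWLIST_WRITTEN_IN_THE_PAST)

if __name__ == "__main__":
    print("old build allowed:", old_build_ok)
    print("new build allowed:", new_build_ok)
```

The program never changed its call to `clock_gettime()`; only the mapping underneath it did, and that is the part a syscall-name filter is coupled to.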
For landlock, the equivalent is that glibc reads various files in /etc varying per libc version or system settings, so landlock rules need to account for that.
> Since the landlock/seccomp filters are at the kernel level
That arguably shows that seccomp is operating at the wrong abstraction level, or the kernel needs another higher level api. With pledge, you operate on capabilities and as new functionality is added to the kernel it is categorized under existing capabilities (for example, if your program pledges not to use networking you can assume that it should not be able to use new networking syscalls added to the kernel in the future).
Seccomp is not an access control system, but Landlock is. Seccomp limits the kernel attack surface and Landlock enforces an access control. They are complementary.
With Landlock, the access control is at the right layer, and the semantic is guaranteed to be the same even if the kernel gets new syscalls. Landlock is the closest thing to Pledge/Unveil we can get with the Linux constraints (and it is gaining new features).
Which also points to landlock-make[0] or vice-versa (the original project that made me aware of the kernel functionality, although I didn't realize it also isolated the network, which is great).
I have been using https://github.com/marty1885/landlock-unveil on Linux for about two years now on my stock Ubuntu kernel. I am not sure, why this hasn't become more popular. It's also rootless sandboxing (and it does `unveil` like OpenBSD I guess). I use it to confine builds of third party software with success.
I disagree. Android's model of starting with a strong sandbox and having apps request permission to access things outside of it has been much more successful in getting apps to be sandboxed.
I think that isn't good enough either (but at least they tried).
My operating system design is: programs start with nothing other than the ability to perform deterministic computation and to send/receive messages with the capabilities it receives in the initial message. It is not allowed to know what these capabilities refer to; they may be proxies set up by the user, network resources, or something else, and is not necessarily what it asked for. All I/O including the ability to determine the current date/time or how much time has passed, requires the use of capabilities. (Due to this, a program with no capabilities left can be terminated automatically by the operating system (unless a debugger is attached; it is also necessary that the program cannot notice the debugger attached to it), since it is no longer capable of any I/O.)
I'd argue it is good enough, but yes it could have gone farther. But ultimately permissions for things like audio would be automatically granted so in the end you end up around the same place.
I'm trying to run a self-contained webserver executable without any external dependency. It starts but daemon <-> workers communication doesn't seem to work (it is done via unix socket)
It works fine with bubblewrap or inside a scratch docker container.
aren't abstract sockets un-jailable unless using network namespaces?
or in the other direction, to truly prevent e.g. xorg socket from being accessed by a bubblejailed application, it should exclude --share-net, regardless if you bind the actual path to the socket (since abstract permeates beyond that)
Would that make feasible (in the long term) to have macOS permission manager like « do you want terminal to access documents folder ? » on Linux ?
As a very average user, that’s the kind of thing I miss on windows and Linux.
Because I installed Google chrome, it doesn’t mean I want it to be able to scan every single file I have on my computer yet there is no way to prevent it and I feel it’s a big security and privacy issue that no one speak about !
You might find Flatpak interesting if you're not already familiar with it. Properly packaged applications start with limited file system access—for example, when you browse file:/// in Firefox, it can't see all your files. However, using the "Open File" menu acts as a file system portal, granting access to selected files on demand. While this isn't exactly how macOS handles permissions, it does prevent the unrestricted system access you're concerned about.
Yeah I knew about flatpak but it also has its downside.
When I used it, it broke many things. Some apps would have weird behavior, theming would break, apps wouldn’t open.
Then you get, for those peasants like me who have very slow internet, a 1 hour download for an app that would otherwise take 30 seconds, because flatpak downloads lots of other stuff.
I get why flatpak is great, it’s like docker or python environment, but as usual with Linux it’s more like a developer thing and a recipe for headache and frustration to the average computer user.
My biggest problem with Linux is that there are no per-process firewall settings. I think one can get around this by using AppArmor or using an user per app and assigning rules to a user.
I've used Linux for over a decade now, but there are still many things I haven't learned, so maybe I'm missing something in this regard.
The GitHub page says
- TCP network access control (binding and connecting)
and
- Support for UDP and other network protocol restrictions (when supported by Linux kernel)
so maybe this can be used to firewall processes in an easy way (assuming that it is easy to set up landrun)?
Why not use linux network namespaces to run your processes in different network stack? nftables rules are per network namespaces so you can get all sorts of sophisticated and achieve essentially per process firewalling. The pattern is to create a network namespace, create a veth pair and move one end of the pair into the namespace. Then you could set up rules to route traffic from default namespace to the process namespace via veth device.
Systemd has `NetworkNamespacePath` directive which can spin up services in new namespaces as well. See `man 5 systemd.exec`
I'm not sure about the other commenter's intentions, but on desktop, I wish every program started in a restricted network namespace. Instead of blocking all incoming and outgoing connections by default, it would request user permission interactively and adjust access accordingly.
On Linux you can do the next best thing which is to move out all the interfaces from the default network namespace and use iptables rules for it which block everything just in case.
Then you have to explicitly launch applications in a desired network namespace such as physical (eth0, wlan0 etc) or vpn (wg0).
Accidentally launched applications, or something like the desktop environment have no network connectivity.
That depends on how you set it up, it doesn't have to bypass the "main host" firewall. Consider the following example:
0. If you set up no additional network namespaces, there is still one present, this is called the "default" or "root" network namespace. It is what you referred to as "main host".
1. Say the default net ns has device eth0 that your server receives traffic on.
2. You create a veth pair in the default net ns, veth0 and veth1.
3. You create a new net ns and move veth1 into new net ns. Only veth0 and eth0 remain in default net ns.
4. You set up routes and nftables rules in default net ns as you would normally. Certain traffic you want to route to your new net ns, so you have a next hop veth0 (note, you have to route through to the IP of veth1, using veth0 as next hop)
5. You set up additional nftables rules and whatever you want in the new net ns and this is isolated from default net ns.
End-to-end flow: packet arrives on eth0, traverses netfilter (nftables/iptables) and route lookup to route to "new network" via veth0. Packet is sent "out" the default net stack via veth0 and arrives on veth1 (since they are a pair) in new net ns network stack. There, the packet traverses an isolated netfilter and routing table and a socket can be listening for your service or whatever. Replies would follow the same in reverse. Sent out veth1 in new net ns, arrive on veth0 in default net ns, and exit that stack via eth0
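Steps 1 to 3 above (plus basic addressing) can be sketched as a list of `ip` commands. The Python helper below only builds and prints the commands; it does not run anything, since they need root. The namespace name `svc`, the 10.200.0.0/24 addresses, and the default route inside the new namespace are assumptions chosen for illustration, not something from the comment above.

```python
import shlex


def netns_setup_commands(ns="svc", host_if="veth0", ns_if="veth1",
                         host_addr="10.200.0.1/24", ns_addr="10.200.0.2/24"):
    """Build (not run) iproute2 commands for a veth pair into a new net ns.

    All names and addresses are illustrative assumptions.
    """
    in_ns = ["ip", "netns", "exec", ns]
    host_ip = host_addr.split("/")[0]
    return [
        # Create the new network namespace.
        ["ip", "netns", "add", ns],
        # Create the veth pair in the default namespace.
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        # Move one end into the new namespace.
        ["ip", "link", "set", ns_if, "netns", ns],
        # Address and bring up the end that stays in the default namespace.
        ["ip", "addr", "add", host_addr, "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
        # Address and bring up the end inside the new namespace.
        in_ns + ["ip", "addr", "add", ns_addr, "dev", ns_if],
        in_ns + ["ip", "link", "set", ns_if, "up"],
        in_ns + ["ip", "link", "set", "lo", "up"],
        # Replies leave the new namespace via the host end of the pair.
        in_ns + ["ip", "route", "add", "default", "via", host_ip],
    ]


if __name__ == "__main__":
    for cmd in netns_setup_commands():
        print(shlex.join(cmd))
```

The nftables rules from steps 4 and 5 are left out on purpose; they depend entirely on what traffic you want to let through.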
You can use firejail for network isolation, it can run applications in a new network namespace [1]. I'm using this to run applications over tor to make sure that nothing leaks.
I saw there's an option to match on a cgroup among nft meta expressions (but I've never tried it). It could be enough if you just want to add per-process firewall rules, but not configure an additional namespace with its associated interfaces, routing/NATing.
Yes. You could match packets based on username or even SELinux labels.
You could also set a special mark on a packet for each container and then filter based on that. The Internet is surprisingly thin on nft resources. I spent a few weeks learning how to write them. Definitely not for the average consumer.
Attaching a separate firewall rules to every process would be a bit heavyweight. What we do have is network namespaces that let you have networking rules (incl firewall) per a group of processes.
that's what all firewall apps on Android (bastardized Linux) do.
well, they already have a user namespace per app which they can match on the firewall rule, but a per "main" program pid net namespace would be pretty much the same. i guess this can be a cool patch to this plus a one weekend qt+rust gui to manage the firewall (or a patch to firewalld gui)... only if i ever had a weekend.
Thank you all for your support, I really didn't expect this to take off like this! given that project is roughly two days old (:D) it's still fair to expect some issues all around, please report them on GH if you found one.
How does the Landlock API compare to mount/network namespaces, as used in Docker containers? As I understand it, namespaces are for isolation, and Landlock would be more like access permissions, is that correct?
Could it be possible for the system to use the Landlock api to catch unauthorized net/fs access by an app and display a popup to ask for authorization, like macOS does?
Namespaces can also be used for sandboxing, but they have a series of problems. Most importantly, they require more substantial changes to your program that wants to sandbox itself, and the program has to jump through a series of hoops to get everything into the right state. It is possible, but the resulting program environment is in the end more unusual and the mechanisms for enabling unprivileged namespaces are making it difficult to use it for smaller use cases. (It involves re-execution of the program that wants to sandbox itself, whereas with Landlock, a small program can just install a Landlock policy during an early startup phase and continue with that.)
Controlling the rules through a separate process is not currently possible, but it was proposed earlier this month on the kernel mailing lists:
I think in the upstream kernel LSMs are also still the only way to prevent a process from creating child namespaces where it has privileges?
E.g. if you can get CAP_NET_ADMIN even within a restricted namespace, you have access to huge amounts of horribly broken kernel code. It's easy (for people who know how to exploit kernel bugs) to escalate privileges from there.
Distros have their own fixes for this issue so namespaces definitely aren't useless in practice for sandboxing. But the basic mechanism just isn't that well suited to it.
Ah I didn't know about that. So you can block the child from creating a userns completely... That seems like an unnecessarily big hammer, but also probably 95% of cases works fine?
I think probably we want an inherited mask of what capabilities you can get in child namespaces. I think I heard someone proposed that upstream but I haven't seen the patches.
NO_NEW_PRIVS is quite irritating in a lot of contexts, since it breaks distant dependencies. For example, you can't run `ping`, so good luck debugging your networking!
> For example, you can't run `ping`, so good luck debugging your networking!
Sending ICMP Echo in userspace (over UDP) is a thing on Linux. From experience, for public Internet, where possible, it is always better to rely on TLS connects (then TCP or UDP, and then ICMP) to ascertain connectivity (lest some middleware meddle with IP or Transport replies).
Namespaces (used by containers) are very powerful but they are also a door to a large attack surface: https://lwn.net/Articles/673597/
Landlock is (only) an access control system, but it's designed to let any process use it, including potentially untrusted ones, which makes it suitable for any apps. It's close and complementary to seccomp.
I get that the "o" in "--ro" is supposed to stand for "only", but this feels clunky to me (especially if there's also a "--rox", which is self-contradictory). I like my long options to be, well, long (complete English words), and backed up by short options. In this case, I'd propose having "-r, --read, -w, --write, -x, --exec", and allowing the short options to be combined as flags (i.e. -rwx).
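For what it's worth, the proposed scheme is easy to prototype. Below is a minimal Python `argparse` sketch of it. This is not landrun's actual CLI: the program name `sandbox-sketch`, the single positional `path`, and the `parse` helper are assumptions made up for illustration.

```python
import argparse


def build_parser():
    """Hypothetical CLI with long options backed by combinable short flags."""
    parser = argparse.ArgumentParser(prog="sandbox-sketch")
    parser.add_argument("-r", "--read", action="store_true",
                        help="allow reading below PATH")
    parser.add_argument("-w", "--write", action="store_true",
                        help="allow writing below PATH")
    parser.add_argument("-x", "--exec", dest="execute", action="store_true",
                        help="allow executing files below PATH")
    parser.add_argument("path", help="directory the rights apply to")
    return parser


def parse(argv):
    """Parse argv and return the requested rights as a plain dict."""
    args = build_parser().parse_args(argv)
    return {
        "path": args.path,
        "read": args.read,
        "write": args.write,
        "execute": args.execute,
    }


if __name__ == "__main__":
    # Short flags combine, so "-rx" means read + exec without write.
    print(parse(["-rx", "/usr"]))
    print(parse(["--read", "--write", "/tmp"]))
```

With this shape, "read + exec" is spelled `-rx` and nothing has to be called "read-only" while also granting exec.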
ROX isn't self-contradictory. Allowing read() and execve() but denying write() and truncate() is totally valid and common in secure execution contexts, although things get worse with directory traversal.
So yeah, --rox is fine semantically, just ugly. :D
I think the parent poster was not arguing that allowing this combination of accesses is invalid, just that it can't be called read-ONLY if it's not ONLY read.
"Any color the customer wants, as long as it's black"
Seems pretty cool, but I would probably object to `--best-effort` being enabled by default. This is a sandbox and a security boundary, and degrading security should probably be opt-in rather than opt-out.
I made shell-container for myself which works fine for me (link below). I just run shell and I’m in a new/stateful container with only that for mounted. Works pretty well, but has some quirks here and there
This is great, I run a hobby project, vimgolf.ai, to get my friends to learn vim and had to do a lot with firejail to sandbox the neovim instances correctly. This looks to be a lot easier to set up
Very cool project! I was curious if this was possible with util-linux (provider of the unshare command that provides namespace management, the underlying feature behind containers), and it is indeed possible:
Some systems or admins may not trust unprivileged namespacing (thus disabling and its use requiring root), while Landlock may be enabled (and is specifically designed to be used by unprivileged processes).
Is it just me or Linux seems to have too many non-orthogonal ways to restrict processes? Like why Landlock does TCP filtering based on port only? What about non-TCP traffic and maybe IP based restrictions is more useful? How does it interact with Netfilter? Puzzling.
It takes time to develop these features, but Landlock is gaining new network filtering features. We are working on a way to control socket creation according to their protocols, and also a way to filter UDP (which makes sense to developers and users).
From the point of view of an app developer, it might not make sense to filter peers but services (ports) instead, and filtering peers without their names would not be ideal (the kernel doesn't know about DNS, only IPs). Anyway, this feature might come one day if someone wants to work on it, but we follow well-tested incremental development.
Netfilter is a privileged network feature that allows doing almost anything with the network, which makes it unsuitable for (app/unprivileged) sandboxing.
A rough description of upcoming network restriction features in Landlock and how they map to the BSD socket API is in the talk at https://youtu.be/K2onopkMhuM?start=2025 starting around 33:45
I really hope we can get back to these features soon :) I think these would be very useful.
Technically IP doesn't have ports. TCP and UDP (and others) individually have the concept of port. So it makes sense if you want a port filter it is a TCP specific rule.
...of course it is common enough that it would make sense to abstract over the different protocols that have more or less the same concept of ports.
Linux Landlock is a kernel-native security module that lets unprivileged processes sandbox themselves - but nobody uses it because the API is ... hard!
I built `landrun`, a small CLI tool in Go, to make it practical to sandbox any command with fine-grained filesystem and network access controls. No root. No containers. No SELinux/AppArmor configs.
It's lightweight, auditable, and wraps Landlock v5 features (file access + TCP restrictions).
Demo + usage examples in the README.
Would love feedback from the HN crowd!
I’d recommend adding your first (and maybe second) paragraphs directly to your readme - this is a much clearer description if you don’t know what landlock is already!
I agree. The first section of the README leaves the impression that Landrun comes with a kernel module -- that would be a red flag for me. The fact that it uses an existing kernel module that is in the mainline is going to be critical to anyone using Landrun.
I didn't have much luck with one of the readme examples:
Looks very interesting. I'm achieving something somewhat similar by running some processes under docker and mounting volumes ro, but could definitely see a use case for adding landlock to more server processes.
yeah you are missing --exec there, which feels a bit useless that you have to mention it, but I prefer things explicit and use all LSM can provide, I can imagine cases where --exec isn't really required. like `cat`.
either case have a look at latest release, it's a bit cleaner.
Could you please help me understand why exec is required for this touch example? Is it necessary to actually launch the touch binary? Or touch itself exec()s something else?
This might be related to needing execute permissions (filesystem x bit) on the directory to modify files within.
Got it. I thought it had to do with execve() syscall.
This is the minimum options I needed to get it to work:
landrun --log-level debug --exec --ro /usr/bin --ro /usr/lib --rw /tmp touch /tmp/foo
Personally I don't like that --exec would allow binaries in /tmp to be executed as well...
But
`landrun --ro /usr/bin --ro /lib --ro /lib64 --rw /path/to/dir touch /path/to/dir/newfile`
vs
`landrun --ro /usr/bin --ro /lib --ro /lib64 --exec /usr/bin/bash`
seems to indicate that `--exec` is only required if the command you're executing then uses an `exec`-call internally, which `bash` would need to be able to fork.
So `touch` should not need `--exec`, while `bash` should be able to run anything it can read (including that whitelisted `/tmp`).
The former does not work for me, I have to add --exec. I can only assume it's because touch is in /usr/bin and so it needs permission to execute it from there.
It seems that using --ro or --rw at all makes --exec also mandatory.
well yeah you'll need --exec when you want to run binaries (unlike... cat?)
I hope landlock adds support to bind --exec to actual directories, that'll be fun!
> you'll need --exec when you want to run binaries
well when wouldn't it do that? in what scenario could you even use this tool without needing to execute a binary?
running cat isn't a --exec for one :)
how so?
$ landrun --ro /usr/bin cat a
[landrun:error] 2025/03/22 23:50:16 permission denied
in this case doesn't have access to "a" wherever it is...
$ landrun --ro /usr cat /usr/bin/ls | wc -l
400
executing ls (as in actual binary execution) will require --exec
$ landrun --ro /usr ls /usr/bin/
ls: cannot open directory '/usr/bin/': Permission denied
$ landrun --ro /usr --exec ls /usr/bin/
list of billions of files
note that I don't really love the --exec thingy, if it's not "on" by default it's just for sake of being explicit.
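The read-versus-execute distinction in the examples above can be modelled in a few lines of Python. This is a toy model, not landrun's implementation and not a call into the kernel: the bit values match the `LANDLOCK_ACCESS_FS_*` constants in `linux/landlock.h`, but the mapping of `--ro` and `--rox` onto them, and the `allowed` helper, are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Bit values as defined for LANDLOCK_ACCESS_FS_* in linux/landlock.h.
EXECUTE = 1 << 0
WRITE_FILE = 1 << 1
READ_FILE = 1 << 2
READ_DIR = 1 << 3

# Assumed flag-to-rights mapping, for illustration only.
RO = READ_FILE | READ_DIR
ROX = RO | EXECUTE


def allowed(rules, path, wanted):
    """Toy check: union the rights of every rule whose directory contains path."""
    target = PurePosixPath(path)
    granted = 0
    for root, rights in rules:
        root_path = PurePosixPath(root)
        if target == root_path or root_path in target.parents:
            granted |= rights
    return (granted & wanted) == wanted


if __name__ == "__main__":
    ro_usr = [("/usr", RO)]
    rox_usr = [("/usr", ROX)]
    # Reading the bytes of /usr/bin/ls (what cat does) only needs READ_FILE.
    print(allowed(ro_usr, "/usr/bin/ls", READ_FILE))
    # Executing it needs EXECUTE, which a read-only rule does not grant.
    print(allowed(ro_usr, "/usr/bin/ls", EXECUTE))
    print(allowed(rox_usr, "/usr/bin/ls", EXECUTE))
```

In this model a rule on `/usr` covers everything beneath it, and `cat /usr/bin/ls` is just a file read, which is why it does not need the execute right on `ls` itself.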
Update: there's a bug to limit "file access", which I'll fix asap.
Update2: Adding a --exec-path instead to limit executable, it wasn't the best idea to have a global --exec anyway
Update3: Have a look at V0.1.4, I think it's far cleaner now.
--ro /usr does not apply to /usr/bin. change it to --ro /usr/bin and then cat will refuse to run.
it's recursive by default
well it's not working. please try it
give it a try with v0.10: landrun --rox /usr/ --ro /usr/lib ls /usr/bin/
As a workaround you could create a tmpfs device like /tmp_noexec with noexec flag, and mount it instead of the normal /tmp. But landrun does not (yet?) allow changing the name in directory options :(
For added security, I'd create an ephemeral tmpfs disk for each landlocked invocation: obviously the program we're running has no business seeing what other processes may have put to /tmp.
> I'd create an ephemeral tmpfs disk for each landlocked invocation
And now you've just invented firejail.
UX-wise, yes. Internally firejail and landrun use different isolation APIs.
Firejail supports Landlock though: https://github.com/netblue30/firejail/pull/6078
Would it be possible, or make sense, to use Landlock in OCI/container land?
Syd[0] uses landlock (among many other mechanisms) to containerize applications and provides an OCI-compatible interface.
[0]: https://gitlab.exherbo.org/sydbox/sydbox
thanks for the link, Sydbox seems like a super cool project, but there's something weird about it: too many links in the README. not on GitHub, and the project that's on GitHub with a similar name hasn't had a commit in 16 years, is it by the same person?
if they can polish up the public facing side of the project, it would instill more confidence.
> too many links in the README
In other documents too. And very repetitive.
I don't need a link to Wikipedia every time "PoC" is used. Or to an online man page every time strace(1) is mentioned.
I get that documentation can have more than one "entry point", and hyperlinking all occurrences solves that.
But I think assuming a certain audience leads to a document that is more effective. You don't explain addition in university-level textbooks to make them easier for children from primary school.
This product is simply not for people who hear of strace for the first time.
Some Wikipedia articles themselves do this, linking every common word in the article, which makes trying to simply highlight a section of text a fun adventure. I ended up at one point making a userscript to strip all internally-pointing links just to make an article more readable (as an addition to an existing script that stripped all the "[citation needed]" and other noise).
Wikipedia needs some notion of "suggested links" that don't become links unless the text is selected or they're toggled globally or some other explicit action. With those, authors could go and link every last word if they like.
> which makes trying to simply highlight a section of text a fun adventure
Tip: in Firefox, you can hold Alt to drag and select text without triggering links.
TIL. I wish I knew years ago, I've never been happier to have switched.
Still using Chrome for work stuff, since profile management in FF is still pure hate.
what's wrong with `firefox -ProfileManager` and the like or alternatively containers?
I always seem to end up with duplicate profiles, or `about:profiles` refuses to open ("Another copy of Firefox has made changes to profiles.") and on and on with various hiccups and speedbumps. Small annoyances, but profiles on Chrome always Just Work, and the half-dozen times I tried it on FF was always death by a thousand cuts.
It's been a few years, so I'll give profiles another try I guess. Containers likely won't do it since multiple profiles all use the same domain (console.aws.amazon.com being the obvious one).
Containers don't need to auto-open domains. You can simply use them to "color" tabs manually. That should cover your needs!
I thought Wikipedia recommended against overlinking, and on looking it up, they do:
https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Link...
Eh. Personally I find it refreshing to see a page err on the side of too many links instead of too few. No need to explain addition in any book if you can just link to the best explanation available.
The bigger issue IMO is that the links seem to be automatically-generated, and the generation is a bit sloppy; for example, the "Syd" links should probably link to the sandboxing technology instead of Pink Floyd's original frontman.
> the links seem to be automatically-generated, and the generation is a bit sloppy; for example, the "Syd" links
I dare you, check the git history! (if you care anyway)
It's all manually crafted, with love. From the Shine On You Crazy Diamond badge at the top down to the very last link.
Fair enough lol
I agree regarding polishing the public-facing side of the project, though I don't find it particularly problematic that it's not on Github.
that looks really cool, but unfortunately without any obvious examples or even a link to documentation, I'm closing the tab and likely forgetting it exists... I would assume many others would feel the same way.
From the README:
> Read the fine manuals of syd, libsyd, gosyd, plsyd, pysyd, rbsyd, syd.el and watch the asciicasts Memory Sandboxing, PID Sandboxing, Network Sandboxing, and Sandboxing Emacs with syd.
I do agree, though, that the docs could be improved.
True! I had the same feeling.
this looks cool, thanks for sharing. they have linked a ctf event as an interactive example, what? XD
I would have suggested support for more fine-grained file/directory permissions—good to see that’s already planned.
Yeah I agree with that, just released a new version that does that.
Does Linux 6.8 in fact ship ABI v5? At least it’s not guaranteed (Ubuntu 24.04, 6.8.0-55-generic). This post suggests 6.10: https://lore.kernel.org/landlock/20240716.yui4Iezai8ae@digik...
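One way to settle this on a given machine is to ask the kernel directly. Below is a minimal Python sketch, assuming a Linux system where `landlock_create_ruleset` is syscall number 444 (true on x86_64 and arm64 as far as I know; other architectures may differ). The helper name `landlock_abi_version` is made up for illustration.

```python
import ctypes

# Assumption: syscall number for landlock_create_ruleset on x86_64/arm64.
SYS_LANDLOCK_CREATE_RULESET = 444
# Flag from <linux/landlock.h>: return the ABI version instead of a ruleset fd.
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0


def landlock_abi_version():
    """Return the kernel's Landlock ABI version, or 0 if the query fails."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    res = libc.syscall(
        ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
        ctypes.c_void_p(None),   # attr must be NULL for a version query
        ctypes.c_size_t(0),      # size must be 0 for a version query
        ctypes.c_uint32(LANDLOCK_CREATE_RULESET_VERSION),
    )
    # Any failure (no Landlock in the kernel, disabled at boot, syscall
    # blocked by a seccomp profile) is reported as "version 0" here.
    return res if res > 0 else 0


if __name__ == "__main__":
    print("Landlock ABI version:", landlock_abi_version())
```

A result below 5 would mean the kernel cannot enforce everything the thread calls "Landlock v5 features", regardless of what the kernel version string says.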
good catch, fixed.
This seems pretty nice, as it directly uses the Landlock API from the Linux kernel (like pledge from OpenBSD). One feature I would like to have is something like a YAML description for a set of configuration, rather than using all these arguments. So we could have preconfigured commands and just execute them. But I think it is just a matter of taste. I will try the tool. Thanks for it.
If you want a file format, I'd lobby for one of the existing ones rather than some random yaml one
- sandbox-exec's scheme one https://github.com/BrianSwift/macOSSandboxBuild/blob/main/co...
- AppArmor https://wiki.apparmor.net/ (although I'm cognizant that tries to address way more than just filesystem access)
- Java's permission one https://docs.oracle.com/javase/8/docs/technotes/guides/secur...
Likely tens more
I agree that reusing an existing file format could be a good option. BTW the go-landlock library used here has a sort of example https://github.com/landlock-lsm/go-landlock/blob/main/exampl...
We are working on a JSON/TOML format for Landlock, with the related library, and bindings for several languages: https://github.com/landlock-lsm/landlockconfig
We are working to make it part of the OCI runtime specification too.
Using existing configuration format would not work because Landlock has its own unique properties: unprivileged, nested sandboxes, dedicated Linux syscalls, and a good compatibility story with opt-in and incremental features.
Still early but Mickaël Salaün, the author of landlock, is working on this.
https://github.com/landlock-lsm/landlockconfig
I'm going to write up some Go bindings for this when it becomes relevant.
(Author of go-Landlock here)
Awesome! I'm happy to hear that you and others are interested in the configuration language. We should probably coordinate that on the Landlock mailing list when the time comes, so that we don't duplicate that work. We are open to outside contributions :)
Would be cool to see integration of landlock with configuration file in a way that a service launched by systemd can apply the configuration to the executable.
Akin to systemd SystemCallFilter directive for no-code application of seccomp filters to the sandboxed process https://www.freedesktop.org/software/systemd/man/latest/syst...
That could be a separate wrapper, like bubblejail is for bubblewrap. Landjail?
I’ll try it, but just off the bat, how does this compare to bubblewrap?
Bubblewrap is very limited, for example it doesn't allow to grant access to /proc/self/exe without giving access to whole /proc subsystem. So I had to write an emulation of /proc in Python and mount it with FUSE to work around this. I wonder if this issue is fixed in landlock, firejail and others.
Also bubblewrap cannot ask for a decision in runtime: you must set up the rules beforehand.
Emulating /proc isn't super interesting when you can simply enter a new process namespace.
This doesn't allow hiding things like /proc/cpuinfo or /proc/cmdline or /proc/modules etc.
If I understand it correctly, landlock is an API used by an app to sandbox itself. The app itself controls the sandboxing. Bubble wrap is user space tooling external to the app, so the app had no direct awareness or control of its sandboxing. The scenarios each is intended for are orthogonal to one another.
Landlock can be used to sandbox a launched sub process, as it is here, just as the Kernel APIs used by Bubblewrap could be (and sometimes are!) used by programs to sandbox themselves.
not exactly correct. bubblewrap, firejail, and, I'm not sure, but maybe even AppArmor, all remove capabilities and create+join restricted fs/net namespaces, and then fork the actual thing you want to execute. so it's exactly the same concept, but those use capabilities and cgroups.
I also would like to understand the differences relative to bubblewrap
Same question. One thing I really dislike in Bubblewrap is that I must share the whole net user namespace even if all I want to do is use UNIX domain sockets.
Since I only see net options specifying ports, does this handle this use case?
> if all I want to do is use UNIX domain sockets
I routinely --unshare-net with UDS ro-binds.
You may be using abstract sockets (@/path/uds.sock) and those do require the same netns I think.
Landlock supports scoped abstract UNIX socket: https://docs.kernel.org/userspace-api/landlock.html#ipc-scop...
Landlock doesn't use namespaces, they are orthogonal.
> but nobody uses it because the API is ... hard!
OpenBSD really got it right with pledge and unveil.
OpenBSD did get it right, but they also have a more relaxed scheme for backwards compatibility across releases. Linux's strict ABI compatibility guarantees complicate matters slightly, but with the right supporting library it becomes tolerable.
See the example at the top of the Readme at https://github.com/landlock-lsm/go-landlock
(Full disclosure, I am the author of that library)
FWIW, I do hope that we can motivate people to use Landlock in the same way as people use pledge on OpenBSD, as a lightweight self-sandboxing mechanism that requires fewer architectural changes to your program and results in more constrained sandboxes than Linux namespaces and other mechanisms do.
As far as I know the ABI for pledge and unveil really haven’t changed since release? What is stopping linux from creating NEW security primitives which are easy to use? We have wireguard in the linux kernel as a recent addition. Wireguard shows that new simple primitives can be added to the kernel, it requires someone with “good taste” to do the implementation without sacrificing usability.
BSD systems ship a kernel and user space, which simplifies a lot of things. Linux is more flexible but it comes at a cost. Adding new security features can also be challenging for other reasons. Anyway, Landlock is one of these new security primitives, and it is gaining new features over time.
The Landlock interface must not change the underlying semantic of what is allowed or denied, otherwise it could break apps built for an older or a newer kernel. However, these apps should still use all the available security features. This is challenging.
Landlock provides a way to define fine-grained security policies. I would not say the kernel interface is complex (rather flexible), but what really matters is the user space library interfaces and how they can safely abstract complexity.
I know how linux and bsd work. I still have yet to find a satisfactory answer to why linux cannot create security primitives which are useful — like wireguard. I understand that landlock tries to abstract complexity, but why do we need to design complex user interfaces? Pledge and unveil are just simple syscalls, there is no magic secret sauce on BSDs which enable these syscalls. It is true that bsd userspace has been compiled to bake in pledge and unveil syscalls, but that is totally separate from the usability of the interfaces.
For instance, with Pledge, the "dns" promise is implemented with hardcoded path in the kernel. Linux is complex because it is versatile and flexible. Controlling access to such features requires some complexity and the kernel might not be enough.
About interfaces, another example is that Unveil is configured with path names but Landlock uses file descriptors instead (more flexible).
Also, these OpenBSD primitives only apply to the currently executed binary; there are no nested sandboxes because the goal is not to create this kind of secure environment but mainly to secure a trusted binary.
For a given linux libc function (what a program calls), the underlying kernel syscall might change over time or vary for other reasons. Since the landlock/seccomp filters are at the kernel level, that breaks programs which only interact with libc calls and don't expect different behaviour.
The underlying kernel syscall should never change, though, right? Pretty sure that's the sort of userspace-backwards-compatibility-breaking change that would result in one of Linus' famous angry emails.
Things like clock_gettime64() to handle dates past 2038.
Calling clock_gettime() in libc will call the newer syscall (assuming _TIME_BITS=64 is set). But Linux has kept backwards compat, old programs can still call the old syscall.
If you wrote your seccomp rule for your program before clock_gettime64 existed, it'd break when glibc switched. I guess that implies each language stdlib should have their own seccomp etc wrappers.
For landlock, the equivalent is that glibc reads various files in /etc varying per libc version or system settings, so landlock rules need to account for that.
> Since the landlock/seccomp filters are at the kernel level
That arguably shows that seccomp is operating at the wrong abstraction level, or the kernel needs another higher level api. With pledge, you operate on capabilities and as new functionality is added to the kernel it is categorized under existing capabilities (for example, if your program pledges not to use networking you can assume that it should not be able to use new networking syscalls added to the kernel in the future).
Seccomp is not an access control system, but Landlock is. Seccomp limits the kernel attack surface and Landlock enforces an access control. They are complementary.
With Landlock, the access control is at the right layer, and the semantic is guaranteed to be the same even if the kernel gets new syscalls. Landlock is the closest thing to Pledge/Unveil we can get with the Linux constraints (and it is gaining new features).
This is where I need to shout out to everyone's favorite developer Justine for keeping Linux cool:
https://justine.lol/pledge/
Which also points to landlock-make[0] or vice-versa (the original project that made me aware of the kernel functionality, although I didn't realize it also isolated network, which is great).
[0]https://justine.lol/make/
I have been using https://github.com/marty1885/landlock-unveil on Linux for about two years now on my stock Ubuntu kernel. I am not sure, why this hasn't become more popular. It's also rootless sandboxing (and it does `unveil` like OpenBSD I guess). I use it to confine builds of third party software with success.
I disagree. Android's model of starting with a strong sandbox and having apps request permission to access things outside of it has been much more successful in getting apps to be sandboxed.
Defaults are important.
I think that isn't good enough either (but at least they tried).
My operating system design is: programs start with nothing other than the ability to perform deterministic computation and to send/receive messages with the capabilities it receives in the initial message. It is not allowed to know what these capabilities refer to; they may be proxies set up by the user, network resources, or something else, and is not necessarily what it asked for. All I/O including the ability to determine the current date/time or how much time has passed, requires the use of capabilities. (Due to this, a program with no capabilities left can be terminated automatically by the operating system (unless a debugger is attached; it is also necessary that the program cannot notice the debugger attached to it), since it is no longer capable of any I/O.)
I'd argue it is good enough, but yes it could have gone farther. But ultimately permissions for things like audio would be automatically granted so in the end you end up around the same place.
Are (abstract) unix sockets supported?
I'm trying to run a self-contained webserver executable without any external dependency. It starts but daemon <-> workers communication doesn't seem to be working (it is done via unix socket)
It works fine with bubblewrap or inside a scratch docker container.
aren't abstract sockets un-jailable unless using network namespaces?
or in the other direction, to truly prevent e.g. xorg socket from being accessed by a bubblejailed application, it should exclude --share-net, regardless if you bind the actual path to the socket (since abstract permeates beyond that)
Well, so should it work?
You're telling me there's another reason, then... Can't guess which one.
Hmmm...
they can be jailed by landlock, we don't have support in go-landlock tho afaik, @Gnoack
It's tracked in https://github.com/landlock-lsm/go-landlock/issues/35 - signals and abstract Unix sockets do unfortunately not interact well with the inherently multithreaded Go runtime. We are working on a fix in https://github.com/landlock-lsm/go-landlock/issues/36 but this needs to be on the kernel side and this is delaying this feature in Go, unfortunately. It is usable from (single threaded) C programs though.
Thanks!
Similarly to the bubblewrap comment, I'd also like to know how it compares to nsjail.
I think nsjail uses mount namespaces (CLONE_NEWNS) instead of landlock for filesystem sandboxing, but what would the practical differences be?
There's conflicting information in the readme about whether --best-effort is enabled or disabled by default.
V0.1.3 is out now!
How does Landrun compare to Firejail?
Would that make it feasible (in the long term) to have a macOS-like permission manager ("do you want Terminal to access the Documents folder?") on Linux?
As a very average user, that’s the kind of thing I miss on windows and Linux.
Just because I installed Google Chrome, it doesn’t mean I want it to be able to scan every single file I have on my computer, yet there is no way to prevent it, and I feel it’s a big security and privacy issue that no one speaks about!
You might find Flatpak interesting if you're not already familiar with it. Properly packaged applications start with limited file system access—for example, when you browse file:/// in Firefox, it can't see all your files. However, using the "Open File" menu acts as a file system portal, granting access to selected files on demand. While this isn't exactly how macOS handles permissions, it does prevent the unrestricted system access you're concerned about.
Yeah I knew about flatpak but it also has its downside.
When I used it, it broke many things. Some apps would have weird behavior, theming would break, apps wouldn’t open.
Then, for peasants like me who have very slow internet, you get a 1-hour download for an app that would otherwise take 30 seconds, because flatpak downloads lots of other stuff.
I get why flatpak is great, it’s like docker or python environment, but as usual with Linux it’s more like a developer thing and a recipe for headache and frustration to the average computer user.
That is xdg-portals, and it works. It needs apps to support it though, which slows adoption.
My biggest problem with Linux is that there are no per-process firewall settings. I think one can get around this by using AppArmor or using an user per app and assigning rules to a user.
I've used Linux for over a decade now, but there are still many things I haven't learned, so maybe I'm missing something in this regard.
The GitHub page says
- TCP network access control (binding and connecting)
and
- Support for UDP and other network protocol restrictions (when supported by Linux kernel)
so maybe this can be used to firewall processes in an easy way (assuming that it is easy to set up landrun)?
Why not use linux network namespaces to run your processes in different network stack? nftables rules are per network namespaces so you can get all sorts of sophisticated and achieve essentially per process firewalling. The pattern is to create a network namespace, create a veth pair and move one end of the pair into the namespace. Then you could set up rules to route traffic from default namespace to the process namespace via veth device.
Systemd has `NetworkNamespacePath` directive which can spin up services in new namespaces as well. See `man 5 systemd.exec`
I'm not sure about the other commenter's intentions, but on desktop, I wish every program started in a restricted network namespace. Instead of blocking all incoming and outgoing connections by default, it would request user permission interactively and adjust access accordingly.
On Linux you can do the next best thing which is to move out all the interfaces from the default network namespace and use iptables rules for it which block everything just in case.
Then you have to explicitly launch applications in a desired network namespace such as physical (eth0, wlan0 etc) or vpn (wg0).
Accidentally launched applications, or something like the desktop environment have no network connectivity.
opensnitch does this
Are you sure? Because last I checked OpenSnitch used different techniques from namespaces, that seemed more brittle to me.
I was referring to "request user permission interactively and adjust access accordingly"... it can do that. It uses eBPF though
My biggest issue with using namespaces is that it bypasses the main host firewall entirely.
That depends on how you set it up, it doesn't have to bypass the "main host" firewall. Consider the following example:
0. If you set up no additional network namespaces, there is still one present, this is called the "default" or "root" network namespace. It is what you referred to as "main host".
1. Say the default net ns has device eth0 that your server receives traffic on.
2. You create a veth pair in the default net ns, veth0 and veth1.
3. You create a new net ns and move veth1 into new net ns. Only veth0 and eth0 remain in default net ns.
4. You set up routes and nftables rules in default net ns as you would normally. Certain traffic you want to route to your new net ns, so you have a next hop veth0 (note, you have to route through to the IP of veth1, using veth0 as next hop)
5. You set up additional nftable rules and whatever you want in the new net ns and this is isolated from default net ns.
End-to-end flow: packet arrives on eth0, traverses netfilter (nftables/iptables) and route lookup to route to "new network" via veth0. Packet is sent "out" the default net stack via veth0 and arrives on veth1 (since they are a pair) in new net ns network stack. There, the packet traverses an isolated netfilter and routing table and a socket can be listening for your service or whatever. Replies would follow the same in reverse. Sent out veth1 in new net ns, arrive on veth0 in default net ns, and exit that stack via eth0
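The steps above can be sketched as a list of iproute2 commands. This is a rough illustration in Python rather than a tested recipe: the namespace name `app`, the 10.200.0.0/24 addresses, and both helper names are assumptions of mine, and the nftables rules and IP forwarding setup from steps 4 and 5 are deliberately left out.

```python
import subprocess


def netns_setup_commands(ns="app", host_if="veth0", ns_if="veth1",
                         host_ip="10.200.0.1", ns_ip="10.200.0.2", prefix=24):
    """Build the `ip` invocations for a veth pair split across two net namespaces."""
    in_ns = ["ip", "netns", "exec", ns]
    return [
        # Step 2: create the veth pair in the default net ns.
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        # Step 3: create the new net ns and move one end into it.
        ["ip", "netns", "add", ns],
        ["ip", "link", "set", ns_if, "netns", ns],
        # Step 4 (addressing only): give the host end an IP and bring it up.
        ["ip", "addr", "add", f"{host_ip}/{prefix}", "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
        # Step 5 (addressing only): configure the end inside the new net ns.
        in_ns + ["ip", "addr", "add", f"{ns_ip}/{prefix}", "dev", ns_if],
        in_ns + ["ip", "link", "set", "lo", "up"],
        in_ns + ["ip", "link", "set", ns_if, "up"],
        in_ns + ["ip", "route", "add", "default", "via", host_ip],
    ]


def apply(commands, dry_run=True):
    """Print the commands, or run them for real (which needs root)."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    apply(netns_setup_commands())
```

With `dry_run=True` it only prints the commands, so you can review them before running anything as root. A service would then be started with `ip netns exec app ...` so that it only sees veth1.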
I was speaking from a defensive security point of view and not a server trying to route traffic.
The fact that local users can simply create namespaces that bypass the host's firewall is extremely dangerous in my opinion.
You can use firejail for network isolation, it can run applications in a new network namespace [1]. I'm using this to run applications over tor to make sure that nothing leaks.
[1] https://firejail.wordpress.com/documentation-2/basic-usage/#... "A network namespace is a new, independent TCP/IP stack attached to the sandbox. The stack has its own routing table, firewall and set of interfaces."
I saw there's an option to match on a cgroup among nft meta expressions (but I've never tried it). It could be enough if you just want to add per-process firewall rules, but not configure an additional namespace with its associated interfaces, routing/NATing.
Yes. You could match packets based on username or even SELinux labels.
You could also set a special mark on a packet for each container and then filter based on that. The Internet is surprisingly thin on nft resources. I spent a few weeks learning how to write them. Definitely not for the average consumer.
> My biggest problem with Linux is that there are no per-process firewall settings.
There is, with cgroups: https://www.kernel.org/doc/Documentation/cgroup-v1/net_cls.t...
Is there an example of this that uses cgroup2?
https://www.freedesktop.org/software/systemd/man/latest/syst...
Attaching separate firewall rules to every process would be a bit heavyweight. What we do have is network namespaces that let you have networking rules (incl firewall) per group of processes.
that's what all firewall apps on Android (bastardized Linux) do.
well, they already have a user namespace per app which they can match on the firewall rule, but a per "main" program pid net namespace would be pretty much the same. i guess this can be a cool patch to this plus a one weekend qt+rust gui to manage the firewall (or a patch to firewalld gui)... only if i ever had a weekend.
Thank you all for your support, I really didn't expect this to take off like this! given that project is roughly two days old (:D) it's still fair to expect some issues all around, please report them on GH if you found one.
There's a very nice presentation on Landlock from last year's Open Source Summit Europe [1].
[1] Linux Sandboxing with Landlock - Mickaël Salaün, Microsoft [video]:
https://youtu.be/d85TDpv8L9U
Super cool project. Justine Tunney released the `pledge` cli [1] a couple years ago that does the same thing, wrapping Landlock.
[1]: https://justine.lol/pledge/
Seems like Nix could take good advantage of Landlock, as it already (kind of) knows all the paths processes need access to.
How does the Landlock API compare to mount/network namespaces, as used in Docker containers? As I understand it, namespaces are for isolation, and Landlock would be more like access permissions, is that correct?
Could it be possible for the system to use the Landlock api to catch unauthorized net/fs access by an app and display a popup to ask for authorization, like macOS does?
(Landlock reviewer here)
Namespaces can also be used for sandboxing, but they have a series of problems. Most importantly, they require more substantial changes to your program that wants to sandbox itself, and the program has to jump through a series of hoops to get everything into the right state. It is possible, but the resulting program environment is in the end more unusual and the mechanisms for enabling unprivileged namespaces are making it difficult to use it for smaller use cases. (It involves re-execution of the program that wants to sandbox itself, whereas with Landlock, a small program can just install a Landlock policy during an early startup phase and continue with that.)
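To make the "install a policy during early startup" flow concrete, here is a minimal sketch in Python using raw syscalls via ctypes. Assumptions, loudly: the syscall numbers are the x86_64/arm64 ones, the helper name `restrict_writes_to` is invented, and only a handful of write-type rights from ABI v1 are handled, so reads and execution stay unrestricted. A real program would more likely use a library such as go-landlock.

```python
import ctypes
import os
import struct

# Assumption: syscall numbers as on x86_64/arm64.
SYS_LANDLOCK_CREATE_RULESET = 444
SYS_LANDLOCK_ADD_RULE = 445
SYS_LANDLOCK_RESTRICT_SELF = 446
LANDLOCK_RULE_PATH_BENEATH = 1
PR_SET_NO_NEW_PRIVS = 38

# A few write-type filesystem rights from Landlock ABI v1 (<linux/landlock.h>).
WRITE_FILE = 1 << 1
REMOVE_DIR = 1 << 4
REMOVE_FILE = 1 << 5
MAKE_DIR = 1 << 7
MAKE_REG = 1 << 8
WRITE_RIGHTS = WRITE_FILE | REMOVE_DIR | REMOVE_FILE | MAKE_DIR | MAKE_REG

_libc = ctypes.CDLL(None, use_errno=True)
_libc.syscall.restype = ctypes.c_long


def _check(res):
    """Turn a -1 return value into an OSError carrying errno."""
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return res


def restrict_writes_to(directory):
    """Deny the handled write accesses everywhere except beneath `directory`.

    Irreversible for this process and its future children.
    """
    # struct landlock_ruleset_attr, ABI v1 layout: one u64 of handled rights.
    attr = struct.pack("=Q", WRITE_RIGHTS)
    ruleset_fd = _check(_libc.syscall(
        ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
        ctypes.c_char_p(attr), ctypes.c_size_t(len(attr)), ctypes.c_uint32(0)))
    try:
        # Rules refer to directories by file descriptor, not by path name.
        dir_fd = os.open(directory, os.O_PATH | os.O_CLOEXEC)
        try:
            # struct landlock_path_beneath_attr is packed: u64 rights + s32 fd.
            rule = struct.pack("=Qi", WRITE_RIGHTS, dir_fd)
            _check(_libc.syscall(
                ctypes.c_long(SYS_LANDLOCK_ADD_RULE), ctypes.c_int(ruleset_fd),
                ctypes.c_int(LANDLOCK_RULE_PATH_BENEATH),
                ctypes.c_char_p(rule), ctypes.c_uint32(0)))
        finally:
            os.close(dir_fd)
        # Required for unprivileged callers before restricting themselves.
        _check(_libc.prctl(ctypes.c_int(PR_SET_NO_NEW_PRIVS), ctypes.c_ulong(1),
                           ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0)))
        _check(_libc.syscall(
            ctypes.c_long(SYS_LANDLOCK_RESTRICT_SELF), ctypes.c_int(ruleset_fd),
            ctypes.c_uint32(0)))
    finally:
        os.close(ruleset_fd)
```

Called early in `main`, e.g. `restrict_writes_to("/tmp/scratch")` with a directory of your choosing, the rest of the program continues normally but gets a permission error when it tries to create or write files elsewhere. No re-execution or namespace setup is involved.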
Controlling the rules through a separate process is not currently possible, but it was proposed earlier this month on the kernel mailing lists:
https://lore.kernel.org/all/cover.1741047969.git.m@maowtm.or...
I think in the upstream kernel LSMs are also still the only way to prevent a process from creating child namespaces where it has privileges?
E.g. if you can get CAP_NET_ADMIN even within a restricted namespace, you have access to huge amounts of horribly broken kernel code. It's easy (for people who know how to exploit kernel bugs) to escalate privileges from there.
Distros have their own fixes for this issue so namespaces definitely aren't useless in practice for sandboxing. But the basic mechanism just isn't that well suited to it.
The user.max_user_namespaces sysctl itself is namespace aware and is used by bubblewrap's --disable-userns option.
But a prctl like NO_NEW_PRIVS would be better, since it could avoid an intermediary namespace that is needed for the namespace-aware sysctl.
Ah I didn't know about that. So you can block the child from creating a userns completely... That seems like an unnecessarily big hammer, but also probably 95% of cases works fine?
I think probably we want an inherited mask of what capabilities you can get in child namespaces. I think I heard someone proposed that upstream but I haven't seen the patches.
NO_NEW_PRIVS is quite irritating in a lot of contexts, since it breaks distant dependencies. For example, you can't run `ping`, so good luck debugging your networking!
> For example, you can't run `ping`, so good luck debugging your networking!
Sending ICMP Echo in userspace (over UDP) is a thing on Linux. From experience, for public Internet, where possible, it is always better to rely on TLS connects (then TCP or UDP, and then ICMP) to ascertain connectivity (lest some middleware meddle with IP or Transport replies).
Great answer, thanks!
Namespaces (used by containers) are very powerful but they are also a door to a large attack surface: https://lwn.net/Articles/673597/
Landlock is (only) an access control system, but it's designed to let any process use it, including potentially untrusted ones, which makes it suitable for any apps. It's close and complementary to seccomp.
I get that the "o" in "--ro" is supposed to stand for "only", but this feels clunky to me (especially if there's also a "--rox", which is self-contradictory). I like my long options to be, well, long (complete English words), and backed up by short options. In this case, I'd propose having "-r, --read, -w, --write, -x, --exec", and allowing the short options to be combined as flags (i.e. -rwx).
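For illustration, here is roughly how such combinable flags could be parsed. This is a hypothetical sketch in Python, not how landrun parses its arguments: the function name `parse_grants`, the rule that each flag group is followed by exactly one path, and the error handling are all assumptions of mine.

```python
RIGHTS = {"r": "read", "w": "write", "x": "exec"}
LONG_OPTIONS = {"--read": "r", "--write": "w", "--exec": "x"}


def parse_grants(argv):
    """Parse `-rx PATH --write PATH ... COMMAND...` into (grants, command).

    `grants` maps each path to the set of rights granted on it.
    """
    grants = {}
    i = 0
    while i < len(argv):
        tok = argv[i]
        if tok == "--":                      # explicit end of options
            i += 1
            break
        if tok in LONG_OPTIONS:
            letters = LONG_OPTIONS[tok]
        elif tok.startswith("--"):
            raise ValueError(f"unknown option: {tok}")
        elif tok.startswith("-") and len(tok) > 1:
            letters = tok[1:]                # combined short flags, e.g. -rwx
        else:
            break                            # first non-option starts the command
        unknown = set(letters) - set(RIGHTS)
        if unknown:
            raise ValueError(f"unknown flag(s) in {tok}: {sorted(unknown)}")
        if i + 1 >= len(argv):
            raise ValueError(f"{tok} needs a path")
        path = argv[i + 1]
        grants.setdefault(path, set()).update(RIGHTS[c] for c in letters)
        i += 2
    return grants, argv[i:]
```

With this scheme `-rx /usr -rw /tmp ls /usr/bin` grants read+exec on /usr and read+write on /tmp, and nothing has to be called "read-only" while also being executable.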
ROX isn't self-contradictory. Allowing read() and execve() while denying write() and truncate() is totally valid and common in secure execution contexts, although things get worse with directory traversal.
So yeah, --rox is fine semantically, just ugly. :D
I think the parent poster was not arguing that allowing this combination of accesses is invalid, just that it can't be called read-ONLY if it's not ONLY read.
"Any color the customer wants, as long as it's black"
I mean that it is not "read-only" if it is also executable.
I don't quite understand what --exec does. If I leave out --exec from example 3, is it supposed to prevent bash from executing other programs?
yeah it wasn't the best call, have a look at v0.1.4, I think it's better now!
Seems pretty cool, but I would probably object to `--best-effort` being enabled by default. This is a sandbox and a security boundary, and degrading security should probably be opt-in rather than opt-out.
> --best-effort: Use best effort mode, falling back to less restrictive sandbox if necessary [default: enabled]
Enabled by default: this strikes me as a particularly poor design choice
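To show why the default matters, here is a toy model of best-effort degradation in Python. It is a sketch under assumptions: the names are invented, it only covers filesystem rights up to ABI v5, and it does not claim to mirror landrun's or go-landlock's actual implementation.

```python
# Filesystem access rights in the order they are defined in <linux/landlock.h>.
FS_RIGHTS = [
    "execute", "write_file", "read_file", "read_dir", "remove_dir",
    "remove_file", "make_char", "make_dir", "make_reg", "make_sock",
    "make_fifo", "make_block", "make_sym",   # ABI v1
    "refer",                                 # added in ABI v2
    "truncate",                              # added in ABI v3
    "ioctl_dev",                             # added in ABI v5
]
# How many of the rights above each ABI version knows about
# (ABI v4 added TCP rules, not filesystem rights).
FS_RIGHT_COUNT_BY_ABI = {0: 0, 1: 13, 2: 14, 3: 15, 4: 15, 5: 16}


def supported_fs_rights(abi):
    """Rights the kernel can enforce; ABI 0 means no Landlock at all."""
    return set(FS_RIGHTS[:FS_RIGHT_COUNT_BY_ABI[max(0, min(abi, 5))]])


def plan_restrictions(requested, abi, best_effort):
    """Return the rights that will actually be restricted on this kernel."""
    requested = set(requested)
    missing = requested - supported_fs_rights(abi)
    if missing and not best_effort:
        raise RuntimeError(f"kernel cannot enforce: {sorted(missing)}")
    # Best effort: silently drop what the kernel does not know about.
    return requested - missing
```

In this model, `plan_restrictions({"write_file", "truncate"}, abi=2, best_effort=True)` quietly leaves truncation unrestricted, and on a kernel without Landlock (`abi=0`) the result is an empty set, i.e. no sandbox at all. With `best_effort=False` both cases fail loudly instead, which is the opt-in behavior being asked for here.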
How does one do resource control with Landrun, e.g., CPU, memory, I/O..?
Not directly, but I think you can run it with systemd:
systemd-run --user --scope -p MemoryMax=1G,IOReadIOPSMax=8000,CPUQuota=20%,<...> landrun ...
You can't. It's only for filesystem and TCP sandboxing.
Exactly, for resource limits you can use setrlimit(2) or cgroups if needed.
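As a rough illustration of the setrlimit(2) route, here is a Python sketch that lowers two limits in the child just before it execs the target command. The function name and the default values are placeholders of mine; note that rlimits are per-process ceilings and do not cover things like I/O bandwidth, for which cgroups are the tool.

```python
import resource
import subprocess


def run_with_limits(argv, max_open_files=64, cpu_seconds=10):
    """Run `argv` with lowered RLIMIT_NOFILE and RLIMIT_CPU."""

    def apply_limits():
        # Runs in the child between fork() and exec(); limits are inherited
        # by whatever the command spawns afterwards.
        for res, wanted in ((resource.RLIMIT_NOFILE, max_open_files),
                            (resource.RLIMIT_CPU, cpu_seconds)):
            _, hard = resource.getrlimit(res)
            if hard != resource.RLIM_INFINITY:
                wanted = min(wanted, hard)   # an unprivileged process cannot raise it
            resource.setrlimit(res, (wanted, wanted))

    # preexec_fn is not safe in multi-threaded programs; fine for a small launcher.
    return subprocess.run(argv, preexec_fn=apply_limits,
                          capture_output=True, text=True)
```

The `argv` passed in could itself be a landrun command line, so the limits and the Landlock restrictions end up applying to the same process tree.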
Imo, (almost) every directory should be treated as a new sandbox
Pretty much how Plan 9 works IIRC. I think Fuchsia might have a similar idea.
I made shell-container for myself which works fine for me (link below). I just run shell and I’m in a new/stateful container with only that folder mounted. Works pretty well, but has some quirks here and there
https://github.com/jrz/container-shell
Not directory but maybe processes with namespaces. rfork controls that, and then you have bind.
I will just leave this here: https://man.archlinux.org/man/firejail.1
And, as someone else also said, Firejail supports Landlock too: https://github.com/netblue30/firejail/pull/6078.
This is great. I run a hobby project, vimgolf.ai, to get my friends to learn vim, and had to do a lot with firejail to sandbox the neovim instances correctly. This looks to be a lot easier to set up.
Very cool project! I was curious if this was possible with util-linux (provider of the unshare command that provides namespace management, the underlying feature behind containers), and it is indeed possible:
setpriv --landlock-access 'fs:remove-file,remove-dir,write-file,make-reg' touch /tmp/foo # Permission denied
setpriv --landlock-access 'fs:remove-file,remove-dir,write-file,make-reg' --landlock-rule "path-beneath:make-reg:/tmp" touch /tmp/foo # Allowed
Very verbose unlike unshare and really deals with internal details, so I'd find it hard to use setpriv in practice.
Apparmor, systemd, containers, lxc… landlock.
Hard to choose! One thing I don’t run anymore is docker.
This looks nice, but I fail to see any use cases that cannot be handled with bwrap and mount namespaces.
Some systems or admins may not trust unprivileged namespacing (and so disable it, meaning its use requires root), while Landlock may be enabled (and is specifically designed to be used by unprivileged processes).
Namespaces, especially user and net, are terrible to set up and use.
I'm not sure this is better, but I'm assuming it is, going by the author's intro.
Weird question, but would this work inside docker as "extra protection"?
well yeah maybe, if you like.
v0.1.11 is out, with env support and a bunch of other fixes. Update!
Is this comparable to systemd-nspawn / systemd-run?
Any resource to get started on applying RO/RW and networking restrictions on a systemd unit?
I'm not sure I understood...
Here's a pretty good overview article: https://benjamintoll.com/2022/02/04/on-running-systemd-nspaw...
Is it just me, or does Linux seem to have too many non-orthogonal ways to restrict processes? Like, why does Landlock do TCP filtering based on port only? What about non-TCP traffic, and wouldn't IP-based restrictions be more useful? How does it interact with Netfilter? Puzzling.
It takes time to develop these features, but Landlock is gaining new network filtering features. We are working on a way to control socket creation according to protocol, and also a way to filter UDP (in a way that makes sense to developers and users).
From the point of view of an app developer, it might make more sense to filter services (ports) rather than peers, and filtering peers without their names would not be ideal (the kernel doesn't know about DNS, only IPs). Anyway, this feature might come one day if someone wants to work on it, but we follow well-tested incremental development.
Netfilter is a privileged network feature that allows doing almost anything with the network, which makes it unsuitable for (app/unprivileged) sandboxing.
+1
A rough description of upcoming network restriction features in Landlock and how they map to the BSD socket API is in the talk at https://youtu.be/K2onopkMhuM?start=2025 starting around 33:45
I really hope we can get back to these features soon :) I think these would be very useful.
What about restricting UDP, or only allowing connections to some IPs?
Technically IP doesn't have ports. TCP and UDP (and others) individually have the concept of port. So it makes sense if you want a port filter it is a TCP specific rule.
...of course it is common enough that it would make sense to abstract over the different protocols that have more or less the same concept of ports.
nice! it would be cool (since it's in Go) to be able to use it like a library, sandboxing some exec directly from your code.
(Author of that library here)
It is a library, as already linked in the other comment: https://github.com/landlock-lsm/go-landlock
The landrun tool is built on the same library. We also provide an official library for Rust, and obviously you can do it from C as well.
I also collected some libraries for other languages at https://wiki.gnoack.org/SoftwareUsingLandlock (but I can not vouch for their quality in detail)
Great job on the lib, thank you!
https://pkg.go.dev/github.com/landlock-lsm/go-landlock/landl...
Nice work! Too bad it's GPL v2 :(
When would that matter?
The underlying library that does most of the work is MIT.
https://github.com/landlock-lsm/go-landlock
haha, why!