Show HN: Continue? Y/N: A 60-second game about AI agent permission fatigue

(llmgame.scalex.dev)

24 points | by Wirbelwind 3 hours ago

17 comments

zackify 23 minutes ago
I vibe coded a TUI that just shows running lxd containers
I hit 'n' to toggle all network access minus anthropic and openai URLs.
I use pi (sometimes claude, always on bypass) and I auto allow everything. I only toggle manual approval in rare cases like running a script or command that needs to touch a production system and I need to validate everything.
Normally my container has full write access to staging so it can debug and validate everything on its own
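The allow-list idea in the comment above (block all network access except the model providers) comes down to a host-matching check. A minimal sketch in bash, where `ALLOWED_SUFFIXES` and `host_allowed` are hypothetical names and the two domains are assumptions standing in for "anthropic and openai URLs". The TUI itself and whatever LXD rules it applies are not shown in the thread, so none of that is reproduced here.

```shell
# Assumed allow-list: domain suffixes that stay reachable when the network is toggled off.
ALLOWED_SUFFIXES="anthropic.com openai.com"

# Hypothetical helper: returns 0 if the host is an allowed domain or a subdomain of one.
host_allowed() {
  local host suffix
  # Hostnames are case-insensitive, so normalize before comparing.
  host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for suffix in $ALLOWED_SUFFIXES; do
    case "$host" in
      # Exact match, or a match on a dot boundary (a real subdomain).
      "$suffix" | *."$suffix") return 0 ;;
    esac
  done
  return 1
}
```

Matching on a dot boundary means `host_allowed api.anthropic.com` succeeds while a look-alike such as `evilanthropic.com` is rejected.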
cobbal 24 minutes ago
That's funny. It told me that blocking "npm run build" was the wrong answer. Maybe it doesn't really understand the threat model.
soanvig 8 minutes ago
Fun game. Can somebody run an agent against those questions to see how it performs? :)
Liftyee 32 minutes ago
I haven't used local agentic AI yet for programming projects. Hence, -187 score
The filters for "commands I would run myself" and "commands I would let an agent run" are very different, it seems.
ghrl 34 minutes ago
I am mostly using OpenCode and barely ever see a permission prompt. While they do enforce it for read/write outside the workspace, the agent can just bypass that with the bash tool. I'm not quite sure why it is that way, and it certainly isn't a very good solution, but it's likely no worse than asking for everything, which just trains the user to always accept and then provides a false sense of security.
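To make the "outside workspace" boundary concrete, here is a minimal sketch of the kind of path check a file tool could run before reading or writing. It is not OpenCode's implementation; `is_inside_workspace` is a hypothetical helper, and it assumes GNU coreutils `realpath` (the `-m` flag resolves paths that do not exist yet).

```shell
# Hypothetical helper: returns 0 only if path $2 resolves to a location inside workspace $1.
is_inside_workspace() {
  local workspace target
  # Resolve symlinks and ".." segments so "ws/../etc/passwd" cannot sneak through.
  workspace=$(realpath -m -- "$1") || return 2
  target=$(realpath -m -- "$2") || return 2
  case "$target/" in
    # Compare on a "/" boundary so "/ws-evil" does not count as inside "/ws".
    "${workspace%/}"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A check like this only guards tools that receive an explicit path argument. An arbitrary shell command can touch any file the process user can reach, which is the bypass described in the comment above.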
sevenseacat 42 minutes ago
Continue? Y/N ── SCORE: 2,343 Security-Conscious Engineer
Caught 8/8 threats "Not a single secret leaked"
→ llmgame.scalex.dev
MeetingsBrowser an hour ago
It would be cool to see the distribution of all player scores.
- Wirbelwind 3 minutes ago
  That's a great idea, stay tuned
carterschonwald an hour ago
some of the sandboxing ive been playing with gives me the best of both yolo and like logic programming tier perms on llm actions in env. still not ready for prime time though ;)
cadwell an hour ago
1,640 points on my first try—I fell into a few traps, but it was really interesting. Thanks for the little game! I'm sharing it with my coworkers :)
nardib 3 hours ago
Use this and save yourself:
claude --dangerously-skip-permissions
- tasuki 44 minutes ago
  Just make sure to run it in an isolated environment where it's ok to mess things up, and make sure it doesn't have access to any secrets.
- wildpeaks an hour ago
  This is why having a human in the loop isn't enough because they will cut corners and skip reviewing what they should review.
  - chuckadams an hour ago
    A tool that pushes people into permissions fatigue is in fact the proper recipient of the blame. The tool in question here is the entire system though, including the OS with insufficient permission boundaries in userspace, not just the agent.
- qsxfthnkp2322 an hour ago
  I love it when Claude is dangerous
- dheera 26 minutes ago
  I got tired of typing that and just do
```
    alias claude="claude --dangerously-skip-permissions"
```
  I do have a separate "claude" user on my system without sudo access and without access to my main user home dir
  And yeah I know that's not perfect but I'm trying to get shit done
  - franze a few seconds ago
    alias claude+="claude --dangerously-skip-permissions"
    alias claude++="claude --dangerously-skip-permissions --continue"
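The separate-user setup described in this subthread depends on the agent's account being unable to read the main user's home directory. One way to sanity-check that, sketched in shell with a hypothetical helper named `dir_is_private`: it reports whether a directory grants any permission bits to group or others. It looks only at the classic mode bits printed by `ls -ld`, so ACLs and shared group membership are outside what it can see.

```shell
# Hypothetical helper: returns 0 if the directory has no group/other permission bits,
# 1 if it grants any, and 2 if it cannot be inspected.
dir_is_private() {
  local perms
  perms=$(ls -ld -- "$1") || return 2
  case "$perms" in
    # "d", three owner bits, then six dashes: nothing for group or others.
    d???------*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `dir_is_private "$HOME" || chmod 700 "$HOME"` would tighten a home directory that other local users could still list or read.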